C1orf141

Chromosome 1 open reading frame 141, or C1orf141, is a protein which, in humans, is encoded by the gene C1orf141.^[1] It is a precursor protein that becomes active after cleavage.^[2] Its function is not yet well understood, but it is suggested to be active during development.^[3]

Gene

Locus

This gene is located on chromosome 1 at position 1p31.3. It is encoded on the antisense strand of DNA, spanning from 67,092,176 to 67,141,646, and has 10 total exons. It overlaps slightly with the gene IL23R, which is encoded on the sense strand.^[1]

Chromosome 1 spanning from 66,924,895 to 67,267,726.^[1]

Transcription regulation

A specific promoter region has not been predicted for C1orf141, so the 1000 base pairs upstream of the transcription start site were analyzed for transcription factor binding sites.^[4] The transcription factors below are a subset of those with predicted binding sites in this region and give an idea of the kinds of factors that could bind to the promoter.^[4]

Vertebrate TATA binding protein factor
CCAAT binding factor
Lim homeodomain factor
Cart-1
Homeodomain transcription factor
Fork head domain factor
Nuclear receptor subfamily
Brn POU domain
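
The 1000 base pair window described above can be located from the gene coordinates given in the Locus section. The following is a minimal sketch, not the method used by the cited tool. It assumes the coordinates are 1-based and inclusive, and that transcription of an antisense-strand gene starts at the highest coordinate of the gene span. The function name `upstream_window` is hypothetical.

```python
# Gene span from the Locus section (assumed 1-based, inclusive coordinates).
GENE_START = 67_092_176
GENE_END = 67_141_646


def upstream_window(start, end, strand, size=1000):
    """Return (low, high) coordinates of the `size` bp upstream of the gene."""
    if strand == "-":
        # Antisense gene: transcription begins at the highest coordinate,
        # so "upstream" lies at larger coordinates.
        return end + 1, end + size
    if strand == "+":
        # Sense gene: "upstream" lies at smaller coordinates.
        return start - size, start - 1
    raise ValueError("strand must be '+' or '-'")


window = upstream_window(GENE_START, GENE_END, "-")
print(window)  # (67141647, 67142646)
```

Because the gene lies on the antisense strand, the window falls just past the end of the quoted span rather than before its start.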

mRNA

Alternative splicing

The C1orf141 gene appears to have two common isoforms and seven less common transcript variants.^[1]

C1orf141 Isoforms
Name	mRNA Length (base pairs)	Protein Length (amino acids)
C1orf141 Isoform 1	2177	400
C1orf141 Isoform 2	2203	217
C1orf141 Isoform X1	2348	471
C1orf141 Isoform X2	2265	458
C1orf141 Isoform X3	1875	333
C1orf141 Isoform X4	920	243
C1orf141 Isoform X5	612	154
C1orf141 Isoform X6	639	146
C1orf141 Isoform X7	514	138
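
The two length columns can be related by a short calculation: a protein of N residues needs 3 × (N + 1) nucleotides of coding sequence once the stop codon is counted, and the rest of the mRNA is untranslated. The sketch below uses the values from the table. The helper names are hypothetical, and the calculation assumes each listed mRNA length includes both untranslated regions.

```python
# (mRNA length in bp, protein length in aa), copied from the isoform table.
ISOFORMS = {
    "Isoform 1": (2177, 400),
    "Isoform 2": (2203, 217),
    "Isoform X1": (2348, 471),
    "Isoform X2": (2265, 458),
    "Isoform X3": (1875, 333),
    "Isoform X4": (920, 243),
    "Isoform X5": (612, 154),
    "Isoform X6": (639, 146),
    "Isoform X7": (514, 138),
}


def cds_length(protein_aa):
    """Coding sequence length in nucleotides, including the stop codon."""
    return 3 * (protein_aa + 1)


def coding_fraction(mrna_bp, protein_aa):
    """Fraction of the mRNA occupied by the coding sequence."""
    return cds_length(protein_aa) / mrna_bp


for name, (mrna_bp, protein_aa) in ISOFORMS.items():
    cds = cds_length(protein_aa)
    frac = coding_fraction(mrna_bp, protein_aa)
    print(f"{name}: CDS {cds} nt of {mrna_bp} nt ({frac:.0%} coding)")
```

For Isoform 1, for example, the coding sequence comes to 1203 of the 2177 nucleotides.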

Protein

The primary encoded precursor protein (C1orf141 Isoform 1) consists of 400 amino acid residues and is encoded by an mRNA 2177 base pairs long. Its transcript consists of 7 exons, and the protein contains a domain of unknown function, DUF4545.^[5] Its predicted molecular mass is 54.4 kDa and its predicted isoelectric point is 9.63.^[6]

Composition

The C1orf141 precursor protein has more lysine residues and fewer glycine residues than expected when compared to other human proteins. The sequence is 11.7% lysine and only 2.1% glycine.^[6]
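
Percentages like these are obtained by counting each residue and dividing by the sequence length. The sketch below illustrates that calculation only. The function name is hypothetical, and the ten-residue peptide is an invented toy sequence, not C1orf141 and not the output of the cited tool.

```python
from collections import Counter


def residue_percentages(sequence):
    """Return {one-letter residue code: percent of the sequence}."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    return {aa: 100 * n / len(seq) for aa, n in counts.items()}


# Hypothetical toy peptide used only to demonstrate the calculation.
toy = "MKKLSKEGKK"
pct = residue_percentages(toy)
print(pct["K"], pct["G"])  # 50.0 10.0
```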

Post-translational modifications

C1orf141 is modified post-translationally to form a mature protein product. It is predicted to undergo O-linked glycosylation, sumoylation, glycation, and phosphorylation.^[7]^[8]^[9]^[10] One N-terminal cleavage occurs, followed by acetylation. Propeptide cleavage occurs at the start site of the final exon.^[2]

Structure

The secondary structure for uncleaved C1orf141 consists primarily of alpha helices with a few small segments of beta sheets. These helices can be seen in the model of the tertiary structure predicted by the I-TASSER program.^[11] The program Phyre2 also predicts the protein to be made up primarily of alpha helices.^[12] After propeptide cleavage of C1orf141, I-TASSER predicts that only alpha helices remain.

Interactions

There are currently no experimentally confirmed interactions for C1orf141. The STRING database for protein interactions identified ten proteins that potentially interact with C1orf141 through text mining.^[13] These are SALT1, C8orf74, SHCBP1L, ACTL9, RBM44, CCDC116, ADO, WDR78, ZNF365, and SPATA45.^[14]^[15]^[16]^[17] A review of the papers from which these predictions were mined did not reveal a clear link for any of the identified proteins.

Expression

C1orf141 is expressed in 30 different tissues but primarily in the testes.^[1] Other tissues where expression is above baseline levels are the brain, lungs, and ovaries.^[3]

Localization

C1orf141 is predicted to localize to the nucleus. There are two nuclear localization signals within the protein sequence, one of which remains after propeptide cleavage.^[18]

Function

The function of C1orf141 is not yet fully understood and has not been experimentally confirmed. However, expression data show that the gene is active in some developmental stages. RNA-Seq data taken at different stages of development show expression at varying levels throughout.^[3] The gene's EST profile shows higher expression in the fetal developmental stage than in the adult.^[19] Microarray data for cumulus cells during natural and stimulated in vitro fertilization show relatively high levels of expression.^[20] There is no significant change in expression in adult tissue disease states.^[19]

Homology

Paralogs

There are no known paralogs of C1orf141.^[21]

Orthologs

Orthologous sequences are seen primarily in other mammalian species. The most distant ortholog identified through an NCBI BLAST search is from a reptilian species, which is the only non-mammalian species found.^[21] The list below is a subset of the species identified as having orthologs, chosen to show the diversity of species in which orthologs can be found. Each sequence was compared to human C1orf141 isoform X1, the isoform that includes every coding exon.^[1]

C1orf141 Orthologs
Genus and Species	Common Name	Taxonomic Group	Accession Number	Date of Divergence (millions of years)	Sequence Length (amino acids)	Sequence Identity	Sequence Similarity
Homo sapiens	Human	Primate	XP_011539768.1	0	471	100%	100%
Gorilla gorilla gorilla	Western Lowland Gorilla	Primate	XP_018892062.1	8.61	469	97%	98%
Otolemur garnettii	Northern Greater Galago	Primate	XP_023365656.1	84	457	59%	70%
Tupaia chinensis	Northern Treeshrew	Scandentia	XP_006171456.1	88	468	62%	74%
Oryctolagus cuniculus	European Rabbit	Lagomorpha	XP_017201685.1	88	470	56%	68%
Fukomys damarensis	Damaraland Mole Rat	Rodentia	XP_010603404.1	88	479	54%	66%
Chinchilla lanigera	Long-tailed Chinchilla	Rodentia	XP_013369940.1	94	476	50%	65%
Ochotona princeps	American Pika	Lagomorpha	XP_012783463.1	94	450	50%	67%
Miniopterus natalensis	Natal long-fingered bat	Chiroptera	XP_016064273.1	94	390	63%	72%
Panthera pardus	Leopard	Carnivora	XP_019304485.1	94	450	62%	74%
Enhydra lutris kenyoni	Sea Otter	Carnivora	XP_022351992.1	94	451	62%	74%
Balaenoptera acutorostrata scammoni	Minke Whale	Cetacea	XP_007164359.1	94	432	60%	60%
Delphinapterus leucas	Beluga Whale	Cetacea	XP_022436606.1	94	432	59%	72%
Sus scrofa	Wild Boar	Cetartiodactyla	XP_005656203.1	94	442	56%	70%
Pteropus vampyrus	Large Flying Fox	Chiroptera	XP_011367916.1	94	470	56%	68%
Ovis aries	Sheep	Cetartiodactyla	XP_012026840.1	94	431	55%	69%
Bos taurus	Cattle	Cetartiodactyla	NP_001070559.1	94	430	54%	69%
Condylura cristata	Star-nosed Mole	Eulipotyphla	XP_012577585.1	94	432	52%	64%
Desmodus rotundus	Common Vampire Bat	Chiroptera	XP_024421106.1	94	398	48%	59%
Sarcophilus harrisii	Tasmanian Devil	Marsupialia	XP_012405605.1	160	356	43%	63%
Phascolarctos cinereus	Koala	Marsupialia	XP_020848724.1	160	204	29%	50%
Monodelphis domestica	Gray Short-tailed Opossum	Marsupialia	XP_007480481.1	160	524	25%	48%
Pogona vitticeps	Central Bearded Dragon	Reptilia	XP_020661721.1	320	501	28%	54%

Evolutionary history

Under the molecular clock hypothesis, the m value (the number of corrected amino acid changes per 100 residues) was calculated for C1orf141 and plotted against the date of divergence of each species. Compared with the same plot for hemoglobin, fibrinogen alpha chain, and cytochrome c, the plot for C1orf141 indicates that it is evolving at a faster rate than all three.
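
The article does not state which formula was used for the correction. One common form, assumed here, is m = −100 · ln(1 − n/100), where n is the observed number of differing residues per 100 (that is, 100 minus the percent identity). The sketch below applies it to a few identity values from the ortholog table. The function name is hypothetical, and the results illustrate the calculation rather than reproduce the original plot.

```python
import math


def corrected_changes(identity_percent):
    """Corrected amino acid changes per 100 residues (m) from percent identity."""
    if not 0 < identity_percent <= 100:
        raise ValueError("identity must be in the range (0, 100]")
    # n = 100 - identity is the observed difference per 100 residues;
    # m = -100 * ln(1 - n/100) simplifies to 100 * ln(100 / identity).
    return 100.0 * math.log(100.0 / identity_percent)


# (species, divergence in millions of years, percent identity) from the table.
ORTHOLOGS = [
    ("Gorilla gorilla gorilla", 8.61, 97),
    ("Otolemur garnettii", 84, 59),
    ("Bos taurus", 94, 54),
    ("Sarcophilus harrisii", 160, 43),
    ("Pogona vitticeps", 320, 28),
]

for species, mya, identity in ORTHOLOGS:
    m = corrected_changes(identity)
    print(f"{species}: diverged {mya} MYA, m = {m:.1f}")
```

The correction accounts for multiple substitutions at the same site, so m grows faster than the raw difference as identity falls. Plotting m against divergence date gives a slope that can be compared between proteins.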

References

[1] "C1orf141 chromosome 1 open reading frame 141 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2019-05-03.
[2] "ProP 1.0 Server". www.cbs.dtu.dk. Retrieved 2019-05-03.
[3] "C1orf141 Gene Expression - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2019-05-03.
[4] "Genomatix: Gene2Promoter Subtasks". www.genomatix.de. Retrieved 2019-05-03.
[5] "C1orf141 Gene (Protein Coding)". www.genecards.org. Retrieved 2019-05-03.
[6] "SAPS < Sequence Statistics < EMBL-EBI". www.ebi.ac.uk. Retrieved 2019-05-03.
[7] "NetOGlyc 4.0 Server". www.cbs.dtu.dk. Retrieved 2019-05-05.
[8] "SUMOplot™ Analysis Program | Abgent". www.abgent.com. Archived from the original on 2005-01-03. Retrieved 2019-05-05.
[9] "NetGlycate 1.0 Server". www.cbs.dtu.dk. Retrieved 2019-05-05.
[10] "NetPhos 3.1 Server". www.cbs.dtu.dk. Retrieved 2019-05-05.
[11] "I-TASSER results". zhanglab.ccmb.med.umich.edu. Archived from the original on 2019-05-03. Retrieved 2019-05-03.
[12] "PHYRE2 Protein Fold Recognition Server". www.sbg.bio.ic.ac.uk. Retrieved 2019-05-03.
[13] "C1orf141 protein (human) - STRING interaction network". string-db.org. Retrieved 2019-05-03.
[14] Sammut, Stephen J.; Feichtinger, Julia; Stuart, Nicholas; Wakeman, Jane A.; Larcombe, Lee; McFarlane, Ramsay J. (2014-05-06). "A novel cohort of cancer-testis biomarker genes revealed through meta-analysis of clinical data sets". Oncoscience. 1 (5): 349–359. doi:10.18632/oncoscience.37. ISSN 2331-4737. PMC 4278308. PMID 25594029.
[15] Swami, Meera (2014). "Genome-wide association study identifies three new melanoma susceptibility loci". Nature Medicine. 17 (11): 1357. doi:10.1038/nm.2568. hdl:2445/128818. ISSN 1078-8956. S2CID 42251944.
[16] Lu, Weining; Quintero-Rivera, Fabiola; Fan, Yanli; Alkuraya, Fowzan S.; Donovan, Diana J.; Xi, Qiongchao; Turbe-Doan, Annick; Li, Qing-Gang; Campbell, Craig G. (2007). "NFIA Haploinsufficiency Is Associated with a CNS Malformation Syndrome and Urinary Tract Defects". PLOS Genetics. 3 (5): e80. doi:10.1371/journal.pgen.0030080. ISSN 1553-7390. PMC 1877820. PMID 17530927.
[17] Yao, Fang; Zhang, Chi; Du, Wei; Liu, Chao; Xu, Ying (2015-09-16). "Identification of Gene-Expression Signatures and Protein Markers for Breast Cancer Grading and Staging". PLOS ONE. 10 (9): e0138213. Bibcode:2015PLoSO..1038213Y. doi:10.1371/journal.pone.0138213. ISSN 1932-6203. PMC 4573873. PMID 26375396.
[18] "Welcome to psort.org!!". www.psort.org. Retrieved 2019-05-03.
[19] "EST Profile - Hs.666621". www.ncbi.nlm.nih.gov. Retrieved 2019-05-03.
[20] "Modified natural and stimulated in vitro fertilization cycles: cumulus cells - - GEO DataSets - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2019-05-03.
[21] "BLAST: Basic Local Alignment Search Tool". blast.ncbi.nlm.nih.gov. Retrieved 2019-05-03.
