Taxonomic lineage: cellular organisms > Eukaryota > Opisthokonta > Metazoa > Eumetazoa > Bilateria > Deuterostomia > Chordata > Craniata > Vertebrata > Gnathostomata > Teleostomi > Euteleostomi > Sarcopterygii > Dipnotetrapodomorpha > Tetrapoda > Amniota > Mammalia > Theria > Eutheria > Boreoeutheria > Euarchontoglires > Primates > Haplorrhini > Simiiformes > Catarrhini > Hominoidea > Hominidae > Homininae > Homo > Homo sapiens
Legend: this sequence has been compared to the family alignment (MSA). Red = minority amino acid; blue = majority amino acid; color intensity = conservation rate; title = sequence position (MSA position), amino acid, rate. Also marked: catalytic site; catalytic site in the MSA.
AFLQAKEELKLLKLPGFMYSEVPLLASSVPYFSVEEEDGSEDGVHLIVCV
HGLDGNSADLRLVKTYIELGLPGGRIDFLMSERNQNDTFADFDSMTDRLL
DEIIQYIQIYSLTVSKISFIGHSLGNLIIRSVLTRPRFKYYLNKLHTFLS
LSGPHLGTLYNSSALVNTGLWFMQKWKKSGSLLQLTCRDHSDPRQTFLYK
LSNKAGLHYFKNVVLVGSLQDRYVPYHSARIEMCKTALKDKQSGQIYSEM
IHNLLRPVLQSKDCNLVRYNVINALPNTADSLIGRAAHIAVLDSEIFLEK
FFLVAALKYFQ
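The legend marks a catalytic site, but the text does not say which residues form it. As a minimal sketch, the snippet below scans the displayed sequence for the G-x-S-x-G "nucleophile elbow" pattern that is common around the catalytic serine of alpha/beta hydrolases. The choice of this motif is an assumption, not something the entry states, and `find_motifs` is a hypothetical helper name. Positions are counted within the displayed sequence only, which may be a fragment of the full protein.

```python
import re

# Sequence copied from this entry, with the spaces between 50-residue blocks removed.
SEQUENCE = (
    "AFLQAKEELKLLKLPGFMYSEVPLLASSVPYFSVEEEDGSEDGVHLIVCV"
    "HGLDGNSADLRLVKTYIELGLPGGRIDFLMSERNQNDTFADFDSMTDRLL"
    "DEIIQYIQIYSLTVSKISFIGHSLGNLIIRSVLTRPRFKYYLNKLHTFLS"
    "LSGPHLGTLYNSSALVNTGLWFMQKWKKSGSLLQLTCRDHSDPRQTFLYK"
    "LSNKAGLHYFKNVVLVGSLQDRYVPYHSARIEMCKTALKDKQSGQIYSEM"
    "IHNLLRPVLQSKDCNLVRYNVINALPNTADSLIGRAAHIAVLDSEIFLEK"
    "FFLVAALKYFQ"
)

# ASSUMPTION: G-x-S-x-G is used as a generic hydrolase motif; the entry itself
# does not name the catalytic residues. The lookahead allows overlapping hits.
MOTIF = re.compile(r"(?=(G.S.G))")


def find_motifs(seq):
    """Return (1-based position of the central serine, matched text) pairs."""
    return [(m.start(1) + 3, m.group(1)) for m in MOTIF.finditer(seq)]


if __name__ == "__main__":
    print("length:", len(SEQUENCE))
    for position, text in find_motifs(SEQUENCE):
        print(f"motif {text} with central serine at position {position}")
```

A hit from this scan is only a candidate: confirming a catalytic residue would require the family alignment or experimental evidence referenced by the database.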
Chromosome 6 is a metacentric chromosome that constitutes about 6% of the human genome. The finished sequence comprises 166,880,988 base pairs, representing the largest chromosome sequenced so far. The entire sequence has been subjected to high-quality manual annotation, resulting in the evidence-supported identification of 1,557 genes and 633 pseudogenes. Here we report that at least 96% of the protein-coding genes have been identified, as assessed by multi-species comparative sequence analysis, and provide evidence for the presence of further, otherwise unsupported exons/genes. Among these are genes directly implicated in cancer, schizophrenia, autoimmunity and many other diseases. Chromosome 6 harbours the largest transfer RNA gene cluster in the genome; we show that this cluster co-localizes with a region of high transcriptional activity. Within the essential immune loci of the major histocompatibility complex, we find HLA-B to be the most polymorphic gene on chromosome 6 and in the human genome.
With the complete human genomic sequence being unraveled, the focus will shift to gene identification and to the functional analysis of gene products. The generation of a set of cDNAs, both sequences and physical clones, which contains the complete and noninterrupted protein coding regions of all human genes will provide the indispensable tools for the systematic and comprehensive analysis of protein function to eventually understand the molecular basis of man. Here we report the sequencing and analysis of 500 novel human cDNAs containing the complete protein coding frame. Assignment to functional categories was possible for 52% (259) of the encoded proteins, the remaining fraction having no similarities with known proteins. By aligning the cDNA sequences with the sequences of the finished chromosomes 21 and 22 we identified a number of genes that either had been completely missed in the analysis of the genomic sequences or had been wrongly predicted. Three of these genes appear to be present in several copies. We conclude that full-length cDNA sequencing continues to be crucial also for the accurate identification of genes. The set of 500 novel cDNAs, and another 1000 full-coding cDNAs of known transcripts we have identified, adds up to cDNA representations covering 2%-5% of all human genes. We thus substantially contribute to the generation of a gene catalog, consisting of both full-coding cDNA sequences and clones, which should be made freely available and will become an invaluable tool for detailed functional studies.
        
Title: Prediction of the coding sequences of unidentified human genes. XVI. The complete sequences of 150 new cDNA clones from brain which code for large proteins in vitro
Nagase T, Kikuno R, Ishikawa KI, Hirosawa M, Ohara O
Ref: DNA Research, 7:65, 2000
We have carried out a human cDNA sequencing project to accumulate information regarding the coding sequences of unidentified human genes. As an extension of the preceding reports, we herein present the entire sequences of 150 cDNA clones of unknown human genes, named KIAA1294 to KIAA1443, from two sets of size-fractionated human adult and fetal brain cDNA libraries. The average sizes of the inserts and corresponding open reading frames of cDNA clones analyzed here reached 4.8 kb and 2.7 kb (910 amino acid residues), respectively. From sequence similarities and protein motifs, 73 predicted gene products were functionally annotated and 97% of them were classified into the following four functional categories: cell signaling/communication, nucleic acid management, cell structure/motility and protein management. Additionally, the chromosomal loci of the genes were assigned by using human-rodent hybrid panels for those genes whose mapping data were not available in the public databases. The expression profiles of the genes were also studied in 10 human tissues, 8 brain regions, spinal cord, fetal brain and fetal liver by reverse transcription-coupled polymerase chain reaction, products of which were quantified by enzyme-linked immunosorbent assay.
        
Title: Identification of candidate genes that underlie the QTL on chromosome 1 that mediates genetic differences in stress-ethanol interactions
Cook MN, Baker JA, Heldt SA, Williams RW, Hamre KM, Lu L
Ref: Physiol Genomics, 47:308, 2015
Alcoholism, stress, and anxiety are strongly interacting heritable, polygenetic traits. In a previous study, we identified a quantitative trait locus (QTL) on murine chromosome (Chr) 1 between 23.0 and 31.5 Mb that modulates genetic differences in the effects of ethanol on anxiety-related phenotypes. The goal of the present study was to extend the analysis of this locus with a focus on identifying candidate genes using newly available data and tools. Anxiety-like behavior was evaluated with an elevated zero maze following saline or ethanol injections (1.8 g/kg) in C57BL/6J, DBA/2J, and 72 BXD strains. We detected significant effects of strain and treatment and their interaction on anxiety-related behaviors, although surprisingly, sex was not a significant factor. The Chr1 QTL is specific to the ethanol-treated cohort. Candidate genes in this locus were evaluated using now standard bioinformatic criteria. Collagen 19a1 (Col19a1) and family with sequence similarity 135a (Fam135a) met most criteria but had lower expression levels and lacked biological verification and, therefore, were considered less likely candidates. In contrast, two other genes, the prenylated protein tyrosine phosphatase family member Ptp4a1 (protein tyrosine phosphatase 4a1) and the zinc finger protein Phf3 (plant homeodomain finger protein 3), met each of our bioinformatic criteria and are thus strong candidates. These findings are also of translational relevance because both Ptp4a1 and Phf3 have been nominated as candidate genes for alcohol dependence in a human genome-wide association study. Our findings support the hypothesis that variants in one or both of these genes modulate heritable differences in the effects of ethanol on anxiety-related behaviors.
To identify the disease gene in 6 Spanish families with autosomal recessive retinitis pigmentosa linked to the RP25 locus, mutation screening of 4 candidate genes, KHDRBS2, PTP4A1, KIAA1411 and OGFRL1, was undertaken based on their expression or functional relevance to the retina. Twenty-six single nucleotide polymorphisms were identified, of which 14 were novel. Although no pathological mutations were detected, these genes remain good candidates for other retinal degenerations mapping to the same chromosomal region.