Title: Transcription profiling of Prss16 (Tssp) can be used to find additional peptidase genes that are candidates for self-peptide generation in the thymus
Fornari TA, Marques MM, Nguyen C, Carrier A, Passos GA
Ref: Mol Biol Rep, 39:4051, 2012
Positive selection (PS) in the thymus involves the presentation of self-peptides that are bound to MHC class II on the surface of cortical thymus epithelial cells (cTECs). The Prss16 gene, an important element regulating the PS of CD4(+) T lymphocytes, encodes thymus-specific serine protease (Tssp), a cTEC serine-type peptidase involved in the proteolytic generation of self-peptides. Nevertheless, additional peptidase genes participating in the generation of self-peptides need to be found. Because of its role in the mechanism of PS and its expression in cTECs, the Prss16 gene might be used as a transcriptional marker to identify new genes that share the same expression profile and that encode peptidases in the thymus. To test this hypothesis, we used microarrays to compare the thymic expression of 4,500 mRNAs in wild-type (WT) C57BL/6 mice with that in their respective Prss16-knockout (KO) mutants. Of these, 223 genes were differentially expressed, of which 115 had known molecular/biological functions. Four endopeptidase genes (Casp1, Casp2, Psmb3 and Tpp2) shared the same expression profile as the Prss16 gene (i.e., induced in WT and repressed in KO), whereas one endopeptidase gene, Capns1, showed the opposite expression profile. The Tpp2 gene is highlighted because it encodes a serine-type endopeptidase functionally similar to the Tssp enzyme. Profiling of the KO mice showed down-regulation of Prss16, as expected, along with the genes mentioned above. Considering that the Prss16-KO mice showed impaired PS, the shared regulation of the four endopeptidase genes suggests their participation in the mechanism of self-peptide generation and PS.
The International Human Genome Sequencing Consortium (IHGSC) recently completed a sequence of the human genome. As part of this project, we have focused on chromosome 8. Although some chromosomes exhibit extreme characteristics in terms of length, gene content, repeat content and fraction segmentally duplicated, chromosome 8 is distinctly typical in character, being very close to the genome median in each of these aspects. This work describes a finished sequence and gene catalogue for the chromosome, which represents just over 5% of the euchromatic human genome. A unique feature of the chromosome is a vast region of approximately 15 megabases on distal 8p that appears to have a strikingly high mutation rate, which has accelerated in the hominids relative to other sequenced mammals. This fast-evolving region contains a number of genes related to innate immunity and the nervous system, including loci that appear to be under positive selection; these include the major defensin (DEF) gene cluster and MCPH1, a gene that may have contributed to the evolution of expanded brain size in the great apes. The data from chromosome 8 should allow a better understanding of both normal and disease biology and genome evolution.
Chromosome 17 is unusual among the human chromosomes in many respects. It is the largest human autosome with orthology to only a single mouse chromosome, mapping entirely to the distal half of mouse chromosome 11. Chromosome 17 is rich in protein-coding genes, having the second highest gene density in the genome. It is also enriched in segmental duplications, ranking third in density among the autosomes. Here we report a finished sequence for human chromosome 17, as well as a structural comparison with the finished sequence for mouse chromosome 11, the first finished mouse chromosome. Comparison of the orthologous regions reveals striking differences. In contrast to the typical pattern seen in mammalian evolution, the human sequence has undergone extensive intrachromosomal rearrangement, whereas the mouse sequence has been remarkably stable. Moreover, although the human sequence has a high density of segmental duplication, the mouse sequence has a very low density. Notably, these segmental duplications correspond closely to the sites of structural rearrangement, demonstrating a link between duplication and rearrangement. Examination of the main classes of duplicated segments provides insight into the dynamics underlying expansion of chromosome-specific, low-copy repeats in the human genome.
Human chromosome 2 is unique to the human lineage in being the product of a head-to-head fusion of two intermediate-sized ancestral chromosomes. Chromosome 4 has received attention primarily related to the search for the Huntington's disease gene, but also for genes associated with Wolf-Hirschhorn syndrome, polycystic kidney disease and a form of muscular dystrophy. Here we present approximately 237 million base pairs of sequence for chromosome 2, and 186 million base pairs for chromosome 4, representing more than 99.6% of their euchromatic sequences. Our initial analyses have identified 1,346 protein-coding genes and 1,239 pseudogenes on chromosome 2, and 796 protein-coding genes and 778 pseudogenes on chromosome 4. Extensive analyses confirm the underlying construction of the sequence, and expand our understanding of the structure and evolution of mammalian chromosomes, including gene deserts, segmental duplications and highly variant regions.
Here we report a high-quality draft genome sequence of the domestic dog (Canis familiaris), together with a dense map of single nucleotide polymorphisms (SNPs) across breeds. The dog is of particular interest because it provides important evolutionary information and because existing breeds show great phenotypic diversity for morphological, physiological and behavioural traits. We use sequence comparison with the primate and rodent lineages to shed light on the structure and evolution of genomes and genes. Notably, the majority of the most highly conserved non-coding sequences in mammalian genomes are clustered near a small subset of genes with important roles in development. Analysis of SNPs reveals long-range haplotypes across the entire dog genome, and defines the nature of genetic diversity within and across breeds. The current SNP map now makes it possible for genome-wide association studies to identify genes responsible for diseases and traits, with important consequences for human and companion animal health.
Chromosome 18 appears to have the lowest gene density of any human chromosome and is one of only three chromosomes for which trisomic individuals survive to term. There are also a number of genetic disorders stemming from chromosome 18 trisomy and aneuploidy. Here we report the finished sequence and gene annotation of human chromosome 18, which will allow a better understanding of the normal and disease biology of this chromosome. Despite the low density of protein-coding genes on chromosome 18, we find that the proportion of non-protein-coding sequences evolutionarily conserved among mammals is close to the genome-wide average. Extending this analysis to the entire human genome, we find that the density of conserved non-protein-coding sequences is largely uncorrelated with gene density. This has important implications for the nature and roles of non-protein-coding sequence elements.
Salmonella enterica serovars often have a broad host range, and some cause both gastrointestinal and systemic disease. But the serovars Paratyphi A and Typhi are restricted to humans and cause only systemic disease. It has been estimated that Typhi arose in the last few thousand years. The sequence and microarray analysis of the Paratyphi A genome indicates that it is similar to the Typhi genome but suggests that it has a more recent evolutionary origin. Both genomes have independently accumulated many pseudogenes among their approximately 4,400 protein coding sequences: 173 in Paratyphi A and approximately 210 in Typhi. The recent convergence of these two similar genomes on a similar phenotype is subtly reflected in their genotypes: only 30 genes are degraded in both serovars. Nevertheless, these 30 genes include three known to be important in gastroenteritis, which does not occur in these serovars, and four for Salmonella-translocated effectors, which are normally secreted into host cells to subvert host functions. Loss of function also occurs by mutation in different genes in the same pathway (e.g., in chemotaxis and in the production of fimbriae).
Human chromosome 7 has historically received prominent attention in the human genetics community, primarily related to the search for the cystic fibrosis gene and the frequent cytogenetic changes associated with various forms of cancer. Here we present more than 153 million base pairs representing 99.4% of the euchromatic sequence of chromosome 7, the first metacentric chromosome completed so far. The sequence has excellent concordance with previously established physical and genetic maps, and it exhibits an unusual amount of segmentally duplicated sequence (8.2%), with marked differences between the two arms. Our initial analyses have identified 1,150 protein-coding genes, 605 of which have been confirmed by complementary DNA sequences, and an additional 941 pseudogenes. Of genes confirmed by transcript sequences, some are polymorphic for mutations that disrupt the reading frame.
The male-specific region of the Y chromosome, the MSY, differentiates the sexes and comprises 95% of the chromosome's length. Here, we report that the MSY is a mosaic of heterochromatic sequences and three classes of euchromatic sequences: X-transposed, X-degenerate and ampliconic. These classes contain all 156 known transcription units, which include 78 protein-coding genes that collectively encode 27 distinct proteins. The X-transposed sequences exhibit 99% identity to the X chromosome. The X-degenerate sequences are remnants of ancient autosomes from which the modern X and Y chromosomes evolved. The ampliconic class includes large regions (about 30% of the MSY euchromatin) where sequence pairs show greater than 99.9% identity, which is maintained by frequent gene conversion (non-reciprocal transfer). The most prominent features here are eight massive palindromes, at least six of which contain testis genes.
Salmonella enterica subspecies I, serovar Typhimurium (S. typhimurium), is a leading cause of human gastroenteritis, and is used as a mouse model of human typhoid fever. The incidence of non-typhoid salmonellosis is increasing worldwide, causing millions of infections and many deaths in the human population each year. Here we sequenced the 4,857-kilobase (kb) chromosome and 94-kb virulence plasmid of S. typhimurium strain LT2. The distribution of close homologues of S. typhimurium LT2 genes in eight related enterobacteria was determined using previously completed genomes of three related bacteria, sample sequencing of both S. enterica serovar Paratyphi A (S. paratyphi A) and Klebsiella pneumoniae, and hybridization of three unsequenced genomes to a microarray of S. typhimurium LT2 genes. Lateral transfer of genes is frequent, with 11% of the S. typhimurium LT2 genes missing from S. enterica serovar Typhi (S. typhi), and 29% missing from Escherichia coli K12. The 352 gene homologues of S. typhimurium LT2 confined to subspecies I of S. enterica (which contains most mammalian and bird pathogens) are useful for studies of epidemiology, host specificity and pathogenesis. Most of these homologues were previously unknown, and 50 may be exported to the periplasm or outer membrane, rendering them accessible as therapeutic or vaccine targets.
A set of 3000 mouse thymus cDNAs was analyzed by extensive measurement of expression using complex-probe hybridization of DNA arrays ("quantitative differential screening"). The complex probes were initially prepared using total thymus RNA isolated from C57BL/6 wild-type (WT), CD3epsilon- and RAG1-deficient mice. Over 100 clones displaying over- or under-expression by at least a factor of two between WT and knockout (KO) thymuses were further analyzed by measuring hybridization signatures with probes from a wide range of KO thymuses, cell types, organs, and embryonic thymuses. A restricted set of clones was selected by virtue of their expression spectra (modulation in KO thymuses and thymocytes, lymphoid cell specificity, and differential expression during embryonic thymus development), sequenced at one extremity, and compared to sequences in databases. Clones corresponding to previously identified genes (e.g., Tcrbeta, Tcf1 or CD25) showed expression patterns that were consistent with existing data. Ten distinct clones corresponding to new genes were subjected to further study: Northern blot hybridization, in situ hybridization on thymus sections, and partial or complete mRNA sequence determination. Among these genes, we report a new serine peptidase highly expressed in cortical epithelial cells that we have named thymus-specific serine peptidase (TSSP), and an acidic protein expressed in thymocytes and of unknown function that we have named thymus-expressed acidic protein (TEAP). This approach identifies new molecules likely to be involved in thymocyte differentiation and function.
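The twofold selection criterion described above can be illustrated with a minimal Python sketch. It is not the authors' analysis pipeline: the function name, the clone labels and the signal values are hypothetical, and the only element taken from the abstract is the cutoff of a factor of two between WT and KO signals.

```python
def select_modulated(signals, min_fold=2.0):
    """Return clones whose KO/WT signal ratio differs by at least min_fold.

    signals maps a clone name to a (wt_signal, ko_signal) pair.
    All names and values used with this function are hypothetical.
    """
    selected = {}
    for clone, (wt, ko) in signals.items():
        if wt <= 0 or ko <= 0:
            continue  # a ratio cannot be formed from non-positive signals
        ratio = ko / wt
        if ratio >= min_fold:
            selected[clone] = "over-expressed in KO"
        elif ratio <= 1.0 / min_fold:
            selected[clone] = "under-expressed in KO"
    return selected


# Hypothetical hybridization signals: (WT, KO)
example = {
    "clone_A": (100.0, 250.0),  # ratio 2.5, passes the twofold cutoff
    "clone_B": (80.0, 30.0),    # ratio 0.375, passes in the other direction
    "clone_C": (50.0, 60.0),    # ratio 1.2, not selected
}

if __name__ == "__main__":
    for clone, call in select_modulated(example).items():
        print(clone, call)
```

A symmetric cutoff (ratio at least 2 or at most 0.5) treats over- and under-expression alike, which matches the "over- or under-expression by at least a factor of two" wording; normalization and replicate handling are left out of this sketch.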
Dipeptidyl peptidase IV-beta (DPP IV-beta) is a novel protein which shows a peptidase activity similar to that of the T-cell-activation antigen CD26. To further characterize DPP IV-beta and confirm its cell surface expression, we have developed a purification strategy using the CD26-negative cell line C8166. The purification process includes biotinylation of cell surface proteins before preparation of cell extracts and processing by gel-filtration, ion-exchange and lectin chromatographies. Consistent with the molecular mass of DPP IV-beta estimated by gel-filtration chromatography, the final purified fraction, manifesting a typical DPP IV activity, showed a major biotinylated 75-80-kDa band in SDS/PAGE, thus suggesting the monomeric nature of this enzyme. Kinetic parameters of DPP IV-beta and its sensitivity to a new family of irreversible DPP IV inhibitors were studied in comparison to CD26. Both enzymes followed Michaelis-Menten kinetics with different Km values for Gly-Pro-NH-Np (NH-Np, para-nitroanilide) hydrolysis (0.28 ± 0.05 mM and 0.12 ± 0.02 mM). More significant differences were observed in the sensitivity to inhibitors, which exerted a much higher activity on CD26 than on DPP IV-beta. These differences permitted us to study DPP IV-beta expression in CD26-expressing cells, showing the expression of this new enzyme in all lymphoid cells tested, and a rapid enhancement in phytohemagglutinin-stimulated or protein-A-stimulated peripheral blood mononuclear cells. Our results indicate that, although DPP IV-beta and CD26 are coexpressed and manifest a typical DPP IV activity, there are distinct features in their catalytic activities that may confer on each enzyme a complementary role in peptide processing.
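To make the reported Michaelis-type kinetics concrete, the following Python sketch evaluates the standard rate law v = Vmax [S] / (Km + [S]) at the two Km values quoted above. It rests on assumptions: Vmax is normalized to 1 because the abstract gives no Vmax, the function and constant names are hypothetical, and the sketch does not say which Km belongs to which enzyme.

```python
def michaelis_menten_rate(substrate_mM, km_mM, vmax=1.0):
    """Initial rate under the Michaelis-Menten rate law.

    vmax defaults to 1.0 (an assumption), so the result is a fraction of Vmax.
    """
    if substrate_mM < 0 or km_mM <= 0:
        raise ValueError("substrate must be >= 0 and Km must be > 0")
    return vmax * substrate_mM / (km_mM + substrate_mM)


# Km values for Gly-Pro-NH-Np hydrolysis quoted in the abstract (mM).
KM_VALUES_MM = (0.28, 0.12)

if __name__ == "__main__":
    for substrate in (0.05, 0.12, 0.28, 1.0):
        rates = [michaelis_menten_rate(substrate, km) for km in KM_VALUES_MM]
        print(substrate, [round(r, 3) for r in rates])
```

At a substrate concentration equal to Km the rate is half of Vmax, so at any given substrate concentration the enzyme with the lower Km runs closer to saturation.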
The dipeptidyl peptidase IV (DPP IV) activity of CD26 is characterized by its post-proline-cleaving capacity that plays an important but not yet understood role in biological processes. Here we describe a new family of specific and irreversible inhibitors of this enzyme. Taking into account the substrate specificity of DPP IV for P2-P1><-P1' cleavage, we have designed and synthesized cyclopeptides c[(alphaH2N+)-Lys-Pro-Aba-(6-CH2-S+R2)-Glyn] 2TFA- (Aba = 3-aminobenzoic acid, R = alkyl) possessing a proline at the P1 position and a lysine in the P2 position, which allows the closing of the cycle on its side chain. These molecules show a free N-terminus, necessary for binding to the CD26 catalytic site, and a latent quinoniminium methide electrophile, responsible for inactivation. Treatment of c[alphaZ-Lys-Pro-Aba-(6-CH2-OC6H5)-Glyn], obtained by peptide synthesis in solution, with R2S/TFA simultaneously cleaved the Z protecting group and the phenyl ether function and led to a series of cyclopeptide sulfonium salts. These cyclopeptides rapidly and irreversibly inhibited the DPP IV activity of CD26, with IC50 values in the nanomolar range. Further studies were carried out to investigate the effect of modifying the ring size (n = 2 or 4) and the nature of the sulfur substituents (R = Me, Bu, Oct). Cycle enlargement improved the inhibitory activity of the methylsulfonio cyclopeptide, whereas increasing the alkyl chain length on the sulfur atom had no apparent effect. Other aminopeptidases were not inhibited, and a much weaker activity was observed on a novel isoform of DPP IV referred to as DPP IV-beta. Thus, this new family of irreversible inhibitors of DPP IV is highly specific to the peptidase activity of CD26.