BACKGROUND: Adult house flies, Musca domestica L., are mechanical vectors of more than 100 devastating diseases that have severe consequences for human and animal health. House fly larvae play a vital role as decomposers of animal wastes, and thus live in intimate association with many animal pathogens. RESULTS: We have sequenced and analyzed the genome of the house fly using DNA from female flies. The sequenced genome is 691 Mb. Compared with Drosophila melanogaster, the genome contains a rich resource of shared and novel protein coding genes, a significantly higher amount of repetitive elements, and substantial increases in copy number and diversity of both the recognition and effector components of the immune system, consistent with life in a pathogen-rich environment. There are 146 P450 genes, plus 11 pseudogenes, in M. domestica, representing a significant increase relative to D. melanogaster and suggesting the presence of enhanced detoxification in house flies. Relative to D. melanogaster, M. domestica has also evolved an expanded repertoire of chemoreceptors and odorant binding proteins, many associated with gustation. CONCLUSIONS: This represents the first genome sequence of an insect that lives in intimate association with abundant animal pathogens. The house fly genome provides a rich resource for enabling work on innovative methods of insect control, for understanding the mechanisms of insecticide resistance, genetic adaptation to high pathogen loads, and for exploring the basic biology of this important pest. The genome of this species will also serve as a close out-group to Drosophila in comparative genomic studies.
        
Title: Adult Drosophila melanogaster glutathione S-transferases: Effects of acute treatment with methyl parathion Alias Z, Clark AG Ref: Pesticide Biochemistry and Physiology, 98:94, 2010 : PubMed
The effect of acute treatment with methyl parathion (MP) on the expression of BSP/GSH-agarose purified glutathione S-transferases (GSTs) in Drosophila melanogaster was investigated. In 2-D gel analysis of the identified Epsilon-class GSTs, only DmGSTE6 (100%) and DmGSTE7 (72%) demonstrated significant increases in expression, suggesting the possibility that both may be involved in MP metabolism. A smaller percentage increase was also observed in DmGSTD1 (18%), a known DDT dehydrochlorinase. DmGSTE3, DmGSTE9, and a putative Epsilon-class GST, CG16936, were shown not to be responsive to the challenge. Our findings demonstrate that not all members of the Epsilon-class GSTs, which are known for their role in insecticide resistance, are immediately responsive to this toxic challenge.
Comparative analysis of multiple genomes in a phylogenetic framework dramatically improves the precision and sensitivity of evolutionary inference, producing more robust results than single-genome analyses can provide. The genomes of 12 Drosophila species, ten of which are presented here for the first time (sechellia, simulans, yakuba, erecta, ananassae, persimilis, willistoni, mojavensis, virilis and grimshawi), illustrate how rates and patterns of sequence divergence across taxa can illuminate evolutionary processes on a genomic scale. These genome sequences augment the formidable genetic tools that have made Drosophila melanogaster a pre-eminent model for animal genetics, and will further catalyse fundamental research on mechanisms of development, cell biology, genetics, disease, neurobiology, behaviour, physiology and evolution. Despite remarkable similarities among these Drosophila species, we identified many putatively non-neutral changes in protein-coding genes, non-coding RNA genes, and cis-regulatory regions. These may prove to underlie differences in the ecology and behaviour of these diverse species.
The rhesus macaque (Macaca mulatta) is an abundant primate species that diverged from the ancestors of Homo sapiens about 25 million years ago. Because they are genetically and physiologically similar to humans, rhesus monkeys are the most widely used nonhuman primate in basic and applied biomedical research. We determined the genome sequence of an Indian-origin Macaca mulatta female and compared the data with chimpanzees and humans to reveal the structure of ancestral primate genomes and to identify evidence for positive selection and lineage-specific expansions and contractions of gene families. A comparison of sequences from individual animals was used to investigate their underlying genetic diversity. The complete description of the macaque genome blueprint enhances the utility of this animal model for biomedical research and improves our understanding of the basic biology of the species.
Drosophila melanogaster males transfer seminal fluid proteins along with sperm during mating. Among these proteins, ACPs (Accessory gland proteins) from the male's accessory gland induce behavioral, physiological, and life span reduction in mated females and mediate sperm storage and utilization. A previous evolutionary EST screen in D. simulans identified partial cDNAs for 57 new candidate ACPs. Here we report the annotation and confirmation of the corresponding Acp genes in D. melanogaster. Of 57 new candidate Acp genes previously reported in D. melanogaster, 34 conform to our more stringent criteria for encoding putative male accessory gland extracellular proteins, thus bringing the total number of ACPs identified to 52 (34 plus 18 previously identified). This comprehensive set of Acp genes allows us to dissect the patterns of evolutionary change in a suite of proteins from a single male-specific reproductive tissue. We used sequence-based analysis to examine codon bias, gene duplications, and levels of divergence (via dN/dS values and ortholog detection) of the 52 D. melanogaster ACPs in D. simulans, D. yakuba, and D. pseudoobscura. We show that 58% of the 52 D. melanogaster Acp genes are detectable in D. pseudoobscura. Sequence comparisons of ACPs shared and not shared between D. melanogaster and D. pseudoobscura show that there are separate classes undergoing distinctly dissimilar evolutionary dynamics.
We have sequenced the genome of a second Drosophila species, Drosophila pseudoobscura, and compared this to the genome sequence of Drosophila melanogaster, a primary model organism. Throughout evolution the vast majority of Drosophila genes have remained on the same chromosome arm, but within each arm gene order has been extensively reshuffled, leading to a minimum of 921 syntenic blocks shared between the species. A repetitive sequence is found in the D. pseudoobscura genome at many junctions between adjacent syntenic blocks. Analysis of this novel repetitive element family suggests that recombination between offset elements may have given rise to many paracentric inversions, thereby contributing to the shuffling of gene order in the D. pseudoobscura lineage. Based on sequence similarity and synteny, 10,516 putative orthologs have been identified as a core gene set conserved over 25-55 million years (Myr) since the pseudoobscura/melanogaster divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome-wide average, consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than random and nearby sequences between the species--but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a pattern of repeat-mediated chromosomal rearrangement, and high coadaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence between these species of Drosophila.
Anopheles gambiae is the principal vector of malaria, a disease that afflicts more than 500 million people and causes more than 1 million deaths each year. Tenfold shotgun sequence coverage was obtained from the PEST strain of A. gambiae and assembled into scaffolds that span 278 million base pairs. A total of 91% of the genome was organized in 303 scaffolds; the largest scaffold was 23.1 million base pairs. There was substantial genetic variation within this strain, and the apparent existence of two haplotypes of approximately equal frequency ("dual haplotypes") in a substantial fraction of the genome likely reflects the outbred nature of the PEST strain. The sequence produced a conservative inference of more than 400,000 single-nucleotide polymorphisms that showed a markedly bimodal density distribution. Analysis of the genome sequence revealed strong evidence for about 14,000 protein-encoding transcripts. Prominent expansions in specific families of proteins likely involved in cell adhesion and immunity were noted. An expressed sequence tag analysis of genes regulated by blood feeding provided insights into the physiological adaptations of a hematophagous insect.
A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies (a whole-genome assembly and a regional chromosome assembly) were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional approximately 12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.
        
Title: Identification and cloning of a key insecticide-metabolizing glutathione S-transferase (MdGST-6A) from a hyper insecticide-resistant strain of the housefly Musca domestica Wei SH, Clark AG, Syvanen M Ref: Insect Biochemistry & Molecular Biology, 31:1145, 2001 : PubMed
Some strains of the housefly, Musca domestica, are known to be highly resistant to organophosphate (OP) and other insecticides because they overproduce glutathione S-transferases (GSTs). Previous work has shown that overproduction in these strains involved numerous isozymes with glutathione conjugating activities (Pesticide Biochem. Physiol., 25 (1986) 169; Mol. General Genetics, 227 (1991) 355; J. Biol. Chem., 267 (1992) 1840; Mol. General Genetics, 245 (1994) 236; J. Mol. Evol., 43 (1996) 236). The current work describes the purification and identification of a M. domestica GST isozyme (pI 7.1) with broad substrate specificity from a housefly strain, Cornell-HR, that is highly resistant to OP insecticides, and the isolation of two new MdGST genes using the antibody made against it. This isozyme, which was identified from amongst more than 20 isoelectric forms of GSTs of the same subunit size, was highly active in conjugating GSH to the model substrate 3,4-dichloronitrobenzene (DCNB). When expressed in Escherichia coli, one of the cloned GSTs, MdGST-6A, produces an enzyme that conjugates glutathione to the insecticides methyl parathion and lindane. Because it was the most active isozyme toward several xenobiotics among the MdGSTs tested, we propose that MdGST-6A probably plays an important role in the resistance of M. domestica Cornell-HR to OP insecticides. MdGST-6A and a second, closely related enzyme found in this work, MdGST-6B, are members of the traditional insect class I family (theta-class) and share the greatest homologies with a cluster of Drosophila GSTs on locus 55. In addition to having unusually broad substrate specificity, the new group of enzymes has a highly diverged hydrophobic motif in its active site compared with other class I GSTs from insects.
Allelic variation in 9.7 kb of genomic DNA sequence from the human lipoprotein lipase gene (LPL) was scored in 71 healthy individuals (142 chromosomes) from three populations: African Americans (24) from Jackson, MS; Finns (24) from North Karelia, Finland; and non-Hispanic Whites (23) from Rochester, MN. The sequences had a total of 88 variable sites, with a nucleotide diversity (site-specific heterozygosity) of 0.002 ± 0.001 across this 9.7-kb region. The frequency spectrum of nucleotide variation exhibited a slight excess of heterozygosity, but, in general, the data fit expectations of the infinite-sites model of mutation and genetic drift. Allele-specific PCR helped resolve linkage phases, and a total of 88 distinct haplotypes were identified. For 1,410 (64%) of the 2,211 site pairs, all four possible gametes were present in these haplotypes, reflecting a rich history of past recombination. Despite the strong evidence for recombination, extensive linkage disequilibrium was observed. The number of haplotypes generally is much greater than the number expected under the infinite-sites model, but there was sufficient multisite linkage disequilibrium to reveal two major clades, which appear to be very old. Variation in this region of LPL may depart from the variation expected under a simple, neutral model, owing to complex historical patterns of population founding, drift, selection, and recombination. These data suggest that the design and interpretation of disease-association studies may not be as straightforward as often is assumed.
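The site-pair count above comes from what is commonly called the four-gamete test: for two biallelic sites, observing all four allele combinations among the sampled haplotypes is taken as evidence of recombination (or recurrent mutation) between them. The following is a minimal sketch of that count, not the authors' procedure; the function name and the toy haplotypes are hypothetical. Note that 2,211 equals the number of unordered pairs among 67 sites, which suggests that a subset of the 88 variable sites entered the pairwise analysis; that reading is an inference from the arithmetic, not something the abstract states.

```python
from itertools import combinations
from math import comb


def four_gamete_summary(haplotypes):
    """Count site pairs at which all four two-site gametes are observed.

    `haplotypes` is a list of equal-length strings over two allele codes
    (here '0' and '1'), one string per chromosome.
    Returns (pairs_with_all_four_gametes, total_pairs).
    """
    n_sites = len(haplotypes[0])
    total_pairs = 0
    four_gamete_pairs = 0
    for i, j in combinations(range(n_sites), 2):
        # Distinct two-site allele combinations seen in the sample.
        gametes = {(h[i], h[j]) for h in haplotypes}
        total_pairs += 1
        if len(gametes) == 4:
            four_gamete_pairs += 1
    return four_gamete_pairs, total_pairs


# Hypothetical toy haplotypes (not data from the study).
toy_haplotypes = ["000", "010", "100", "110"]
print(four_gamete_summary(toy_haplotypes))  # (1, 3)

# Arithmetic check on the figures quoted in the abstract.
print(comb(67, 2))               # 2211
print(round(100 * 1410 / 2211))  # 64
```

In the toy sample only the first two sites show all four combinations, so one of the three pairs passes the test.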
Lipoprotein lipase plays a central role in lipid metabolism and the gene that encodes this enzyme (LPL) is a candidate susceptibility gene for cardiovascular disease. Here we report the complete sequence of a fraction of the LPL gene for 71 individuals (142 chromosomes) from three populations that may have different histories affecting the organization of the sequence variation. Eighty-eight sites in this 9.7 kb vary among individuals from these three populations. Of these, 79 were single nucleotide substitutions and 9 sites involved insertion-deletion variations. The average nucleotide diversity across the region was 0.2% (or on average 1 variable site every 500 bp). At 34 of these sites, the variation was found in only one of the populations, reflecting the differing population and mutational histories. If LPL is a typical human gene, the pattern of sequence variation that exists in introns as well as exons, even for the small number of samples considered here, will present challenges for the identification of sites, or combinations of sites, that influence variation in risk of disease in the population at large.
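Nucleotide diversity is conventionally computed as the average number of differences per site between all pairs of sampled sequences, so a value of 0.2% corresponds to roughly one difference every 500 bp between two randomly chosen chromosomes. The sketch below illustrates that definition on hypothetical aligned sequences; it is not the estimator or data used in the study, and the function name is an assumption made for illustration.

```python
from itertools import combinations


def nucleotide_diversity(sequences):
    """Average per-site pairwise difference among aligned sequences.

    `sequences` is a list of equal-length strings, one per chromosome.
    """
    length = len(sequences[0])
    pairs = list(combinations(sequences, 2))
    # Total number of mismatched positions summed over all sequence pairs.
    total_differences = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_differences / (len(pairs) * length)


# Hypothetical toy alignment (not data from the study).
toy_sequences = ["AAAAAAAAAA", "AAAAAAAAAT", "AAAAAAAATT"]
pi = nucleotide_diversity(toy_sequences)
print(pi)

# A diversity of 0.002 implies about one difference per 500 bp.
print(round(1 / 0.002))  # 500
```

For the toy alignment, the three pairs differ at 1, 2, and 1 positions, giving 4 differences over 3 pairs of 10 sites each.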
        
Title: The Role of E3 Esterase, Glutathione S-Transferases and Other Nonoxidative Mechanisms in Resistance to Diazinon and Other Organophosphate Insecticides in Lucilia cuprina Wilson JA, Clark AG Ref: Pesticide Biochemistry and Physiology, 54:85, 1996 : PubMed
Seven possible nonoxidative mechanisms that might contribute to resistance to organophosphate insecticides have been examined in larvae and adults from field isolates of the sheep blow fly, Lucilia cuprina, selected for resistance to diazinon. The basal resistance mechanism appears to be an altered esterase (the E3 esterase). In addition, in adults, there is a strong correlation between resistance to diazinon and glutathione S-transferase activity with 3,4-dichloronitrobenzene as substrate. This correlation is marginal in the larvae of this species. There was also a marginal negative correlation between resistance to diazinon and the rate of inactivation of acetylcholine esterase by tetrachlorvinphos. No significant correlation was found between resistance to diazinon and total esterase activity with either alpha- or beta-naphthyl acetate as substrate, total acetylcholine esterase activity, or glutathione S-transferase activity with 1-chloro-2,4-dinitrobenzene as substrate. Significant correlations were found between resistance to diazinon and resistance to other organophosphate insecticides, including chlorpyrifos, propetamphos, and dichlofenthion. In light of these correlations, it is concluded that the same basic mechanisms of resistance are operating with respect to all of these insecticides. Since the field isolates were sampled over a wide geographical range and over several years, it appears that the mechanisms of resistance are stable and widely distributed.