free full text journal articles: genetics and proteomics

Recent Articles in Nucleic Acids Research

Kikin O, Zappala Z, D'Antonio L, Bagga PS
GRSDB2 and GRS_UTRdb: databases of quadruplex forming G-rich sequences in pre-mRNAs and mRNAs.
Nucleic Acids Res. 2007 Nov 27;
G-quadruplex motifs in the RNA play significant roles in key cellular processes and human disease. While sequences capable of forming G-quadruplexes in the pre-mRNA are involved in regulation of polyadenylation and splicing events in mammalian transcripts, the G-quadruplex motifs in the UTRs may help regulate mRNA expression. GRSDB2 is a second-generation database containing information on the composition and distribution of putative Quadruplex-forming G-Rich Sequences (QGRS) mapped in approximately 29 000 eukaryotic pre-mRNA sequences, many of which are alternatively processed. The data stored in the GRSDB2 is based on computational analysis of NCBI Entrez Gene entries with the help of an improved version of the QGRS Mapper program. The database allows complex queries with a wide variety of parameters, including Gene Ontology terms. The data is displayed in a variety of formats with several additional computational capabilities. We have also developed a new database, GRS_UTRdb, containing information on the composition and distribution patterns of putative QGRS in the 5'- and 3'-UTRs of eukaryotic mRNA sequences. The goal of these experiments has been to build freely accessible resources for exploring the role of G-quadruplex structure in regulation of gene expression at post-transcriptional level. The databases can be accessed at the G-Quadruplex Resource Site. [Abstract/Link to Full Text]
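
As a rough illustration of what a putative QGRS looks like at the sequence level, the sketch below scans a sequence for a commonly used G-quadruplex motif: four G-tracts separated by short loops. The tract length (at least 3 G), the loop length (1 to 7 nt), the function name and the example sequence are all assumptions made for illustration. This is not the QGRS Mapper algorithm, whose parameters and scoring are not given in the abstract.

```python
import re

# Assumed folding rule (not taken from the abstract): four runs of >= 3 G,
# separated by loops of 1-7 nucleotides. The lookahead lets overlapping
# candidates be reported, one per start position.
QGRS_PATTERN = re.compile(
    r"(?=(G{3,}[ACGTU]{1,7}?G{3,}[ACGTU]{1,7}?G{3,}[ACGTU]{1,7}?G{3,}))"
)

def find_putative_qgrs(sequence):
    """Return (start, motif) pairs for candidate quadruplex-forming motifs."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in QGRS_PATTERN.finditer(seq)]

# Hypothetical toy sequence containing one candidate motif.
hits = find_putative_qgrs("aaGGGtGGGaGGGcGGGtt")
for start, motif in hits:
    print(start, motif)
```

A real mapper would also rank candidates, for example by tract length and loop symmetry; that step is omitted here.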

Rossignol T, Lechat P, Cuomo C, Zeng Q, Moszer I, d'Enfert C
CandidaDB: a multi-genome database for Candida species and related Saccharomycotina.
Nucleic Acids Res. 2007 Nov 26;
CandidaDB was established in 2002 to provide the first genomic database for the human fungal pathogen Candida albicans. The availability of an increasing number of fully or partially completed genome sequences of related fungal species has opened the path for comparative genomics and prompted us to migrate CandidaDB into a multi-genome database. The new version of CandidaDB houses the latest versions of the genomes of C. albicans strains SC5314 and WO-1 along with six genome sequences from species closely related to C. albicans that all belong to the CTG clade of Saccharomycotina: Candida tropicalis, Candida (Clavispora) lusitaniae, Candida (Pichia) guilliermondii, Lodderomyces elongisporus, Debaryomyces hansenii and Pichia stipitis. It also houses the reference Saccharomyces cerevisiae genome. CandidaDB includes sequences coding for 54 170 proteins with annotations collected from other databases, enriched with illustrations of structural features and functional domains and data of comparative analyses. In order to take advantage of the integration of multiple genomes in a unique database, new tools using pre-calculated or user-defined comparisons have been implemented that allow rapid access to comparative analysis at the genomic scale. [Abstract/Link to Full Text]

Cochrane G, Akhtar R, Aldebert P, Althorpe N, Baldwin A, Bates K, Bhattacharyya S, Bonfield J, Bower L, Browne P, Castro M, Cox T, Demiralp F, Eberhardt R, Faruque N, Hoad G, Jang M, Kulikova T, Labarga A, Leinonen R, Leonard S, Lin Q, Lopez R, Lorenc D, McWilliam H, Mukherjee G, Nardone F, Plaister S, Robinson S, Sobhany S, Vaughan R, Wu D, Zhu W, Apweiler R, Hubbard T, Birney E
Priorities for nucleotide trace, sequence and annotation data capture at the Ensembl Trace Archive and the EMBL Nucleotide Sequence Database.
Nucleic Acids Res. 2007 Nov 26;
The Ensembl Trace Archive and the EMBL Nucleotide Sequence Database, known together as the European Nucleotide Archive, continue to see growth in data volume and diversity. Selected major developments of 2007 are presented briefly, along with data submission and retrieval information. In the face of increasing requirements for nucleotide trace, sequence and annotation data archiving, data capture priority decisions have been taken at the European Nucleotide Archive. Priorities are discussed in terms of how reliably information can be captured, the long-term benefits of its capture and the ease with which it can be captured. [Abstract/Link to Full Text]

Sindelka R, Jonák J, Hands R, Bustin SA, Kubista M
Intracellular expression profiles measured by real-time PCR tomography in the Xenopus laevis oocyte.
Nucleic Acids Res. 2007 Nov 26;
Real-time PCR tomography is a novel, quantitative method for measuring localized RNA expression profiles within single cells. We demonstrate its usefulness by dissecting an oocyte from Xenopus laevis into slices along its animal-vegetal axis, extracting its RNA and measuring the levels of 18 selected mRNAs by real-time RT-PCR. This identified two classes of mRNA, one preferentially located towards the animal, the other towards the vegetal pole. mRNAs within each group show comparable intracellular gradients, suggesting they are produced by similar mechanisms. The polarization is substantial, though not extreme, with around 5% of vegetal gene mRNA molecules detected at the animal pole, and around 50% of the molecules in the far most vegetal section. Most animal pole mRNAs were found in the second section from the animal pole and in the central section, which is where the nucleus is located. mRNA expression profiles did not change following in vitro fertilization and we conclude that the cortical rotation that follows fertilization has no detectable effect on intracellular mRNA gradients. [Abstract/Link to Full Text]

Farge G, Holmlund T, Khvorostova J, Rofougaran R, Hofer A, Falkenberg M
The N-terminal domain of TWINKLE contributes to single-stranded DNA binding and DNA helicase activities.
Nucleic Acids Res. 2007 Nov 26;
The TWINKLE protein is a hexameric DNA helicase required for replication of mitochondrial DNA. TWINKLE displays striking sequence similarity to the bacteriophage T7 gene 4 protein (gp4), which is a bi-functional primase-helicase required at the phage DNA replication fork. The N-terminal domain of human TWINKLE contains some of the characteristic sequence motifs found in the N-terminal primase domain of the T7 gp4, but other important motifs are missing. TWINKLE is not an active primase in vitro and the functional role of the N-terminal region has remained elusive. In this report, we demonstrate that the N-terminal part of TWINKLE is required for efficient binding to single-stranded DNA. Truncations of this region reduce DNA helicase activity and mitochondrial DNA replisome processivity. We also find that the gp4 and TWINKLE are functionally distinct. In contrast to the phage protein, TWINKLE binds to double-stranded DNA. Moreover, TWINKLE forms stable hexamers even in the absence of Mg(2+) or NTPs, which suggests that an accessory protein, a helicase loader, is needed for loading of TWINKLE onto the circular mtDNA genome. [Abstract/Link to Full Text]

Rodgers ME, Schleif R
DNA tape measurements of AraC.
Nucleic Acids Res. 2007 Nov 26;
A new method for measuring distances between points in the AraC-DNA complex was developed and applied. It utilizes variable lengths of single-stranded DNA that connect double-stranded regions containing the two half-site binding sequences of AraC. These distances plus the protein interdomain linker distances are compatible with two classes of structure for the dimeric AraC gene regulatory protein. In one class, the N-terminal regulatory arm of one dimerization domain is capable of interacting with the DNA-binding domain on the same polypeptide chain for a cis interaction. In the other class, the possible arm-DNA-binding domain interaction is trans, where it adds to the dimerization interface. [Abstract/Link to Full Text]

Zemla A, Geisbrecht B, Smith J, Lam M, Kirkpatrick B, Wagner M, Slezak T, Zhou CE
STRALCP: structure alignment-based clustering of proteins.
Nucleic Acids Res. 2007 Nov 26;
Protein structural annotation and classification is an important and challenging problem in bioinformatics. Research towards analysis of sequence-structure correspondences is critical for better understanding of a protein's structure, function, and its interaction with other molecules. Clustering of protein domains based on their structural similarities provides valuable information for protein classification schemes. In this article, we attempt to determine whether structure information alone is sufficient to adequately classify protein structures. We present an algorithm that identifies regions of structural similarity within a given set of protein structures, and uses those regions for clustering. In our approach, called STRALCP (STRucture ALignment-based Clustering of Proteins), we generate detailed information about global and local similarities between pairs of protein structures, identify fragments (spans) that are structurally conserved among proteins, and use these spans to group the structures accordingly. We also provide a web server for selecting protein structures, calculating structurally conserved regions and performing automated clustering. [Abstract/Link to Full Text]

Dalal S, Chikova A, Jaeger J, Sweasy JB
The Leu22Pro tumor-associated variant of DNA polymerase beta is dRP lyase deficient.
Nucleic Acids Res. 2007 Nov 26;
Approximately 30% of human tumors characterized to date express DNA polymerase beta (pol beta) variant proteins. Two of the polymerase beta cancer-associated variants are sequence-specific mutators, and one of them binds to DNA but has no polymerase activity. The Leu22Pro (L22P) DNA polymerase beta variant was identified in a gastric carcinoma. Leu22 resides within the 8 kDa amino terminal domain of DNA polymerase beta, which exhibits dRP lyase activity. This domain catalyzes the removal of deoxyribose phosphate during short patch base excision repair. We show that this cancer-associated variant has very little dRP lyase activity but retains its polymerase activity. Although residue 22 has no direct contact with the DNA, we report here that the L22P variant has reduced DNA-binding affinity. The L22P variant protein is deficient in base excision repair. Molecular dynamics calculations suggest that alteration of Leu22 to Pro changes the local packing, the loop connecting helices 1 and 2 and the overall juxtaposition of the helices within the N-terminal domain. This in turn affects the shape of the binding pocket that is required for efficient dRP lyase catalysis. [Abstract/Link to Full Text]

Valgardsdottir R, Chiodi I, Giordano M, Rossi A, Bazzini S, Ghigna C, Riva S, Biamonti G
Transcription of Satellite III non-coding RNAs is a general stress response in human cells.
Nucleic Acids Res. 2007 Dec 11;
In heat-shocked human cells, heat shock factor 1 activates transcription of tandem arrays of repetitive Satellite III (SatIII) DNA in pericentromeric heterochromatin. Satellite III RNAs remain associated with sites of transcription in nuclear stress bodies (nSBs). Here we use real-time RT-PCR to study the expression of these genomic regions. Transcription is highly asymmetrical and most of the transcripts contain the G-rich strand of the repeat. A low level of G-rich RNAs is detectable in unstressed cells and a 10(4)-fold induction occurs after heat shock. G-rich RNAs are induced by a wide range of stress treatments including heavy metals, UV-C, oxidative and hyper-osmotic stress. Differences exist among stressing agents both for the kinetics and the extent of induction (>100- to 80,000-fold). In all cases, G-rich transcripts are associated with nSBs. On the contrary, C-rich transcripts are almost undetectable in unstressed cells and modestly increase after stress. Production of SatIII RNAs after hyper-osmotic stress depends on the Tonicity Element Binding Protein indicating that activation of the arrays is triggered by different transcription factors. This is the first example of a non-coding RNA whose transcription is controlled by different transcription factors under different growth conditions. [Abstract/Link to Full Text]

Witcher M, Pettersson F, Dupéré-Richer D, Padovani A, Summers-Deluca L, Baldwin AS, Miller WH
Retinoic acid modulates chromatin to potentiate tumor necrosis factor alpha signaling on the DIF2 promoter.
Nucleic Acids Res. 2007 Nov 26;
Transcriptional activation by nuclear hormone receptors is well characterized, but their cooperation with other signaling pathways to activate transcription remains poorly understood. Tumor necrosis factor alpha (TNFalpha) and all-trans retinoic acid (RA) induce monocytic differentiation of acute promyelocytic leukemia (APL) cells in a synergistic manner. We used the promoter of DIF2, a gene involved in monocytic differentiation, to model the mechanism underlying the cooperative induction of target genes by RA and TNFalpha. We show a functional RA response element in the DIF2 promoter, which is constitutively bound by PML/RARalpha in APL cells. RA stimulates release of corepressors and recruitment of chromatin modifying proteins and additional transcription factors to the promoter, but these changes cause only a modest induction of DIF2 mRNA. Co-stimulation with RA plus TNFalpha facilitates binding of NF-kappaB to the promoter, which is crucial for full induction of transcription. Furthermore, RA plus TNFalpha greatly enhanced the level of RNA Pol II phosphorylation on the DIF2 promoter, via synergistic recruitment of TFIIH. We propose that RA mediates remodeling of chromatin to facilitate binding of transcription factors, which cooperate to enhance Pol II phosphorylation, providing a mechanism whereby nuclear receptors interact with other signaling pathways on the level of transcription. [Abstract/Link to Full Text]

Hines JC, Ray DS
Structure of discontinuities in kinetoplast DNA-associated minicircles during S phase in Crithidia fasciculata.
Nucleic Acids Res. 2007 Nov 26;
Kinetoplast DNA (kDNA) is a novel form of mitochondrial DNA consisting of thousands of interlocked minicircles and 20-30 maxicircles. The minicircles replicate free of the kDNA network but nicks and gaps in the newly synthesized strands remain at the time of reattachment to the kDNA network. We show here that the steady-state population of replicated, network-associated minicircles only becomes repaired to the point of having nicks with a 3'OH and 5'deoxyribonucleoside monophosphate during S phase. These nicks represent the origin/terminus of the strand and occur within the replication origins (oriA and oriB) located 180 degrees apart on the minicircle. Minicircles containing a new L strand have a single nick within either oriA or oriB but not in both origins in the same molecule. The discontinuously synthesized H strand contains single nicks within both oriA and oriB in the same molecule implying that discontinuities between the H-strand Okazaki fragments become repaired except for the fragments initiated within the two origins. Nicks in L and H strands at the origins persist throughout S phase and only become ligated as a prelude to network division. The failure to ligate these nicks until just prior to network division is not due to inappropriate termini for ligation. [Abstract/Link to Full Text]

Soulière MF, Perreault JP, Bisaillon M
Magnesium-binding studies reveal fundamental differences between closely related RNA triphosphatases.
Nucleic Acids Res. 2007 Nov 26;
The Chlorella virus RNA triphosphatase (cvRTPase) is involved in the formation of the RNA cap structure found at the 5'-end of the viral mRNAs and requires magnesium ions to mediate its catalytic activity. To extend our studies on the role of metal ions in phosphohydrolysis, we have used a combination of fluorescence spectroscopy, circular dichroism, denaturation studies and thermodynamic analyses to monitor the binding of magnesium ions to the cvRTPase. Using these techniques, the thermodynamic forces responsible for the interaction of metal ions with an RNA triphosphatase were also evaluated for the first time. Our thermodynamic analyses indicate that the initial association of magnesium with the cvRTPase is dominated by a favorable entropic effect and is accompanied by the release of eight water molecules from the enzyme. Moreover, both fluorescence spectroscopy and circular dichroism assays indicated that minor conformational changes were occurring upon magnesium binding. Mutational studies were also performed and confirmed the importance of three specific glutamate residues located in the active site of the enzyme for the binding of magnesium ions. Finally, in contrast to the yeast RNA triphosphatase, we demonstrate that the binding of magnesium ions to the cvRTPase does not lead to the stabilization of the ground state binding of the RNA substrate. Based on the results of the present study, we hypothesize that the binding of magnesium ions induces local conformational perturbations in the active site residues that ultimately positions the lateral chains of critical amino acids involved in catalysis. Our results highlight fundamental differences in the role of magnesium ions in the phosphohydrolase reactions catalyzed by the cvRTPase and the closely related yeast RNA triphosphatase. [Abstract/Link to Full Text]

Halfon MS, Gallo SM, Bergman CM
REDfly 2.0: an integrated database of cis-regulatory modules and transcription factor binding sites in Drosophila.
Nucleic Acids Res. 2007 Nov 26;
The identification and study of the cis-regulatory elements that control gene expression are important areas of biological research, but few resources exist to facilitate large-scale bioinformatics studies of cis-regulation in metazoan species. Drosophila melanogaster, with its well-annotated genome, exceptional resources for comparative genomics and long history of experimental studies of transcriptional regulation, represents the ideal system for regulatory bioinformatics. We have merged two existing Drosophila resources, the REDfly database of cis-regulatory modules and the FlyReg database of transcription factor binding sites (TFBSs), into a single integrated database containing extensive annotation of empirically validated cis-regulatory modules and their constituent binding sites. With the enhanced functionality made possible through this integration of TFBS data into REDfly, together with additional improvements to the REDfly infrastructure, we have constructed a one-stop portal for Drosophila cis-regulatory data that will serve as a powerful resource for both computational and experimental studies of transcriptional regulation. REDfly is freely accessible. [Abstract/Link to Full Text]

Araúzo-Bravo MJ, Sarai A
Indirect readout in drug-DNA recognition: role of sequence-dependent DNA conformation.
Nucleic Acids Res. 2007 Nov 26;
DNA-binding drugs have numerous applications in the engineered gene regulation. However, the drug-DNA recognition mechanism is poorly understood. Drugs can recognize specific DNA sequences not only through direct contacts but also indirectly through sequence-dependent conformation, in a similar manner to the indirect readout mechanism in protein-DNA recognition. We used a knowledge-based technique that takes advantage of known DNA structures to evaluate the conformational energies. We built a dataset of non-redundant free B-DNA crystal structures to calculate the distributions of adjacent base-step and base-pair conformations, and estimated the effective harmonic potentials of mean force (PMF). These PMFs were used to calculate the conformational energy of drug-DNA complexes, and the Z-score as a measure of the binding specificity. Comparing the Z-scores for drug-DNA complexes with those for free DNA structures with the same sequence, we observed that in several cases the Z-scores became more negative upon drug binding. Furthermore, the specificity is position-dependent within the drug-bound region of DNA. These results suggest that DNA conformation plays an important role in the drug-DNA recognition. The presented method provides a tool for the analysis of drug-DNA recognition and can facilitate the development of drugs for targeting a specific DNA sequence. [Abstract/Link to Full Text]
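
The two quantities named in this abstract, a harmonic conformational energy and a Z-score, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the one-dimensional harmonic form per parameter, the choice of reference ensemble (for example, energies of other sequences threaded onto the same conformation), the function names and all numbers are hypothetical.

```python
import statistics

def harmonic_energy(params, means, force_constants):
    """Sum of 0.5 * k * (x - x0)^2 over conformational parameters.

    Assumes independent 1-D harmonic terms; a full treatment would
    typically use a coupled (matrix) force constant per base step.
    """
    return sum(
        0.5 * k * (x - x0) ** 2
        for x, x0, k in zip(params, means, force_constants)
    )

def z_score(target_energy, reference_energies):
    """Z = (E_target - mean(E_ref)) / std(E_ref).

    A more negative Z means the target is lower in energy than the
    reference ensemble, read here as higher specificity.
    """
    mu = statistics.mean(reference_energies)
    sigma = statistics.pstdev(reference_energies)
    return (target_energy - mu) / sigma

# Hypothetical numbers, for illustration only.
energy = harmonic_energy([1.0, 2.0], [0.0, 0.0], [2.0, 1.0])
z = z_score(1.0, [2.0, 4.0, 6.0])
print(energy, round(z, 3))
```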

Finn RD, Tate J, Mistry J, Coggill PC, Sammut SJ, Hotz HR, Ceric G, Forslund K, Eddy SR, Sonnhammer EL, Bateman A
The Pfam protein families database.
Nucleic Acids Res. 2007 Nov 26;
Pfam is a comprehensive collection of protein domains and families, represented as multiple sequence alignments and as profile hidden Markov models. The current release of Pfam (22.0) contains 9318 protein families. Pfam is now based not only on the UniProtKB sequence database, but also on NCBI GenPept and on sequences from selected metagenomics projects. Pfam is available on the web from the consortium members using a new, consistent and improved website design in the UK, the USA and Sweden, as well as from mirror sites in France and South Korea. [Abstract/Link to Full Text]

Rattei T, Tischler P, Arnold R, Hamberger F, Krebs J, Krumsiek J, Wachinger B, Stümpflen V, Mewes W
SIMAP: structuring the network of protein similarities.
Nucleic Acids Res. 2007 Nov 23;
Protein sequences are the most important source of evolutionary and functional information for new proteins. In order to facilitate the computationally intensive tasks of sequence analysis, the Similarity Matrix of Proteins (SIMAP) database aims to provide a comprehensive and up-to-date dataset of the pre-calculated sequence similarity matrix and sequence-based features like InterPro domains for all proteins contained in the major public sequence databases. As of September 2007, SIMAP covers approximately 17 million proteins and more than 6 million non-redundant sequences and provides a complete annotation based on InterPro 16. Novel features of SIMAP include a new, portlet-based web portal providing multiple, structured views on retrieved proteins and integration of protein clusters and a unique search method for similar domain architectures. Access to SIMAP is freely provided for academic use through the web portal for individuals and through Web Services for programmatic access. [Abstract/Link to Full Text]

Jones P, Côté RG, Cho SY, Klie S, Martens L, Quinn AF, Thorneycroft D, Hermjakob H
PRIDE: new developments and new datasets.
Nucleic Acids Res. 2007 Nov 22;
The PRIDE database of protein and peptide identifications was previously described in the NAR Database Special Edition in 2006. Since this publication, the volume of public data in the PRIDE relational database has increased by more than an order of magnitude. Several significant public datasets have been added, including identifications and processed mass spectra generated by the HUPO Brain Proteome Project and the HUPO Liver Proteome Project. The PRIDE software development team has made several significant changes and additions to the user interface and tool set associated with PRIDE. The focus of these changes has been to facilitate the submission process and to improve the mechanisms by which PRIDE can be queried. The PRIDE team has developed a Microsoft Excel workbook that allows the required data to be collated in a series of relatively simple spreadsheets, with automatic generation of PRIDE XML at the end of the process. The ability to query PRIDE has been augmented by the addition of a BioMart interface allowing complex queries to be constructed. Collaboration with groups outside the EBI has been fruitful in extending PRIDE, including an approach to encode iTRAQ quantitative data in PRIDE XML. [Abstract/Link to Full Text]

Li D, Da L, Tang H, Li T, Zhao M
CpG methylation plays a vital role in determining tissue- and cell-specific expression of the human cell-death-inducing DFF45-like effector A gene through the regulation of Sp1/Sp3 binding.
Nucleic Acids Res. 2007 Nov 22;
Cell-death-inducing DFF45-like effector A (CIDE-A) belongs to a family of proapoptotic proteins, the expression of which is highly restricted in human tissues and cells. Here, the core region of the human CIDE-A promoter was characterized. Surprisingly, two Sp1/Sp3-binding sites, rather than tissue-specific transcription factors, were found to be required for the promoter activity. Although the ubiquitously expressed Sp1 and Sp3 were crucial, they alone could not adequately regulate the specific expression of CIDE-A. We found that the expression of CIDE-A was further regulated by CpG methylation of the promoter region. By performing bisulfite sequencing, we observed dense CpG methylation of the promoter region in tissues and cells with low or no expression of CIDE-A but not in tissues with high level of CIDE-A expression. In vitro methylation of this region showed significantly reduced transcriptional activity. Treatment of CIDE-A-negative cells with 5-aza-2'-deoxycytidine demethylated the CpG sites; this opened the closed chromatin conformation and markedly enhanced the binding affinity of Sp1/Sp3 to the promoter in vivo, thereby restoring CIDE-A expression. These data indicated that CpG methylation plays a crucial role in establishing and maintaining tissue- and cell-specific transcription of the CIDE-A gene through the regulation of Sp1/Sp3 binding. [Abstract/Link to Full Text]

Choi SW, Kano A, Maruyama A
Activation of DNA strand exchange by cationic comb-type copolymers: effect of cationic moieties of the copolymers.
Nucleic Acids Res. 2007 Nov 22;
We have previously reported that poly(l-lysine)-graft-dextran cationic comb-type copolymers accelerate strand exchange reaction between duplex DNA and its complementary single strand by >4 orders of magnitude, while stabilizing duplex. However, the stabilization of the duplex is considered principally unfavourable for the accelerating activity since the strand exchange reaction requires, at least, partial melting of the initial duplex. Here we report the effects of different cationic moieties of cationic comb-type copolymers on the accelerating activity. The copolymer having guanidino groups exhibited markedly higher accelerating effect on strand exchange reactions than that having primary amino groups. The high accelerating effect of the former is considered to be due to its lower stabilizing effect on duplex DNA, resulting from its increased affinity to single-stranded DNA. The difference in affinity was clearly demonstrated by a fluorescence correlation spectroscopy study; the interaction of the former with single-stranded DNA still remained high even at 1 M NaCl, while that of the latter completely disappeared. These results suggest that some modes of interactions, such as hydrogen bonding, other than electrostatic interactions between the copolymers having guanidino groups and DNAs may be involved in strand exchange activation. [Abstract/Link to Full Text]

Ivanyi-Nagy R, Lavergne JP, Gabus C, Ficheux D, Darlix JL
RNA chaperoning and intrinsic disorder in the core proteins of Flaviviridae.
Nucleic Acids Res. 2007 Nov 22;
RNA chaperone proteins are essential partners of RNA in living organisms and viruses. They are thought to assist in the correct folding and structural rearrangements of RNA molecules by resolving misfolded RNA species in an ATP-independent manner. RNA chaperoning is probably an entropy-driven process, mediated by the coupled binding and folding of intrinsically disordered protein regions and the kinetically trapped RNA. Previously, we have shown that the core protein of hepatitis C virus (HCV) is a potent RNA chaperone that can drive profound structural modifications of HCV RNA in vitro. We now examined the RNA chaperone activity and the disordered nature of core proteins from different Flaviviridae genera, namely that of HCV, GBV-B (GB virus B), WNV (West Nile virus) and BVDV (bovine viral diarrhoea virus). Despite low-sequence similarities, all four proteins demonstrated general nucleic acid annealing and RNA chaperone activities. Furthermore, heat resistance of core proteins, as well as far-UV circular dichroism spectroscopy suggested that a well-defined 3D protein structure is not necessary for core-induced RNA structural rearrangements. These data provide evidence that RNA chaperoning-possibly mediated by intrinsically disordered protein segments-is conserved in Flaviviridae core proteins. Thus, besides nucleocapsid formation, core proteins may function in RNA structural rearrangements taking place during virus replication. [Abstract/Link to Full Text]

Holbein S, Freimoser FM, Werner TP, Wengi A, Dichtl B
Cordycepin-hypersensitive growth links elevated polyphosphate levels to inhibition of poly(A) polymerase in Saccharomyces cerevisiae.
Nucleic Acids Res. 2007 Nov 22;
To identify genes involved in poly(A) metabolism, we screened the yeast gene deletion collection for growth defects in the presence of cordycepin (3'-deoxyadenosine), a precursor to the RNA chain terminating ATP analog cordycepin triphosphate. Deltapho80 and Deltapho85 strains, which have a constitutively active phosphate-response pathway, were identified as cordycepin hypersensitive. We show that inorganic polyphosphate (poly P) accumulated in these strains and that poly P is a potent inhibitor of poly(A) polymerase activity in vitro. Binding analyses of poly P and yeast Pap1p revealed an interaction with a k(D) in the low nanomolar range. Poly P also bound mammalian poly(A) polymerase, however, with a 10-fold higher k(D) compared to yeast Pap1p. Genetic tests with double mutants of Deltapho80 and other genes involved in phosphate homeostasis and poly P accumulation suggest that poly P contributed to cordycepin hypersensitivity. Synergistic inhibition of mRNA synthesis through poly P-mediated inhibition of Pap1p and through cordycepin-mediated RNA chain termination may thus account for hypersensitive growth of Deltapho80 and Deltapho85 strains in the presence of the chain terminator. Consistent with this, a mutation in the 3'-end formation component rna14 was synthetic lethal in combination with Deltapho80. Based on these observations, we suggest that binding of poly P to poly(A) polymerase negatively regulates its activity. [Abstract/Link to Full Text]

Mohanty BK, Kushner SR
Rho-independent transcription terminators inhibit RNase P processing of the secG leuU and metT tRNA polycistronic transcripts in Escherichia coli.
Nucleic Acids Res. 2007 Nov 22;
The widely accepted model for the processing of tRNAs in Escherichia coli involves essential initial cleavages by RNase E within polycistronic transcripts to generate pre-tRNAs that subsequently become substrates for RNase P. However, recently we identified two polycistronic tRNA transcripts whose endonucleolytic processing was solely dependent on RNase P. Here we show that the processing of the secG leuU and metT leuW glnU glnW metU glnV glnX polycistronic transcripts takes place through a different type of maturation pathway. Specifically, RNase P separates the tRNA units within each operon following the endonucleolytic removal of the distal Rho-independent transcription terminator, primarily by RNase E. Failure to remove the Rho-independent transcription terminator inhibits RNase P processing of both transcripts leading to a decrease in mature tRNA levels and dramatically increased levels of full-length transcripts in an RNase E deletion strain. Furthermore, we show for the first time that RNase G also removes the Rho-independent transcription terminator associated with the secG leuU operon. Our data also demonstrate that the Rne-1 protein retains significant activity on tRNA substrates at the non-permissive temperature. Taken together it is clear that there are multiple pathways involved in the maturation of tRNAs in E. coli. [Abstract/Link to Full Text]

Hernandez-Boussard T, Whirl-Carrillo M, Hebert JM, Gong L, Owen R, Gong M, Gor W, Liu F, Truong C, Whaley R, Woon M, Zhou T, Altman RB, Klein TE
The pharmacogenetics and pharmacogenomics knowledge base: accentuating the knowledge.
Nucleic Acids Res. 2007 Nov 21;
PharmGKB is a knowledge base that captures the relationships between drugs, diseases/phenotypes and genes involved in pharmacokinetics (PK) and pharmacodynamics (PD). This information includes literature annotations, primary data sets, PK and PD pathways, and expert-generated summaries of PK/PD relationships between drugs, diseases/phenotypes and genes. PharmGKB's website is designed to effectively disseminate knowledge to meet the needs of our users. PharmGKB currently has literature annotations documenting the relationship of over 500 drugs, 450 diseases and 600 variant genes. In order to meet the needs of whole genome studies, PharmGKB has added new functionalities, including browsing the variant display by chromosome and cytogenetic locations, allowing the user to view variants not located within a gene. We have developed new infrastructure for handling whole genome data, including increased methods for quality control and tools for comparison across other data sources, such as dbSNP, JSNP and HapMap data. PharmGKB has also added functionality to accept, store, display and query high throughput SNP array data. These changes allow us to capture more structured information on phenotypes for better cataloging and comparison of data. PharmGKB is available at [Abstract/Link to Full Text]

Tremblay S, Wagner JR
Dehydration, deamination and enzymatic repair of cytosine glycols from oxidized poly(dG-dC) and poly(dI-dC).
Nucleic Acids Res. 2007 Nov 21;
Cytosine glycols (5,6-dihydroxy-5,6-dihydrocytosine) are initial products of cytosine oxidation. Because these products are not stable, virtually all biological studies have focused on the stable oxidation products of cytosine, including 5-hydroxycytosine, uracil glycols and 5-hydroxyuracil. Previously, we reported that the lifetime of cytosine glycols was greatly enhanced in double-stranded DNA, thus implicating these products in DNA repair and mutagenesis. In the present work, cytosine and uracil glycols were generated in double-stranded alternating co-polymers by oxidation with KMnO(4). The half-life of cytosine glycols in poly(dG-dC) was 6.5 h, giving a ratio of dehydration to deamination of 5:1. At high substrate concentrations, the excision of cytosine glycols from poly(dG-dC) by purified endonuclease III was comparable to that of uracil glycols, whereas the excision of these substrates was 5-fold greater than that of 5-hydroxycytosine. Kinetic studies revealed that the V(max) was several fold higher for the excision of cytosine glycols compared to 5-hydroxycytosine. In contrast to cytosine glycols, uracil glycols did not undergo detectable dehydration to 5-hydroxyuracil. Replacing poly(dG-dC) with poly(dI-dC) gave similar results with respect to the lifetime and excision of cytosine glycols. This work demonstrates the formation of cytosine glycols in DNA and their removal by base excision repair. [Abstract/Link to Full Text]
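
The kinetic figures quoted in this abstract (a 6.5 h half-life and a 5:1 ratio of dehydration to deamination) can be converted into approximate rate constants. The sketch below assumes, purely for illustration, that cytosine glycols decay by two parallel first-order reactions and that the product ratio equals the ratio of the two rate constants. The abstract does not state this model, and the function name partition_rate_constants is hypothetical.

```python
import math


def partition_rate_constants(half_life_h, ratio_a_to_b):
    """Split an overall first-order decay constant between two parallel pathways.

    Assumption (not from the abstract): both pathways are first order, so the
    overall constant is ln(2) / half-life and the pathway constants are in the
    same ratio as the observed products.
    """
    k_total = math.log(2) / half_life_h
    k_a = k_total * ratio_a_to_b / (ratio_a_to_b + 1.0)
    k_b = k_total / (ratio_a_to_b + 1.0)
    return k_total, k_a, k_b


# Values taken from the abstract: half-life 6.5 h, dehydration:deamination = 5:1.
k_total, k_dehydration, k_deamination = partition_rate_constants(6.5, 5.0)

print(f"overall decay constant: {k_total:.4f} per hour")
print(f"dehydration constant:   {k_dehydration:.4f} per hour")
print(f"deamination constant:   {k_deamination:.4f} per hour")
```

Under these assumptions about five sixths of the decay flux goes through dehydration and one sixth through deamination; a different kinetic model would give different constants.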

Hatch K, Danilowicz C, Coljee V, Prentiss M
Measurement of the salt-dependent stabilization of partially open DNA by Escherichia coli SSB protein.
Nucleic Acids Res. 2007 Nov 21;
The rezipping force of two complementary DNA strands under tension has been measured in the presence of Escherichia coli single-stranded-binding proteins under salt conditions ranging from 10 to 400 mM NaCl. The effectiveness of the binding protein in preventing rezipping is strongly dependent on salt concentration and is compared with the salt dependence in the absence of the protein. At concentrations less than 50 mM NaCl, the protein prevents complete rezipping of lambda-phage on the 2-s timescale of the experiment, when the ssDNA is under tensions as low as 3.5 ± 1 pN. For salt concentrations greater than 200 mM NaCl, the protein inhibits rezipping but cannot block rezipping when the tension is reduced below 6 ± 1.8 pN. This change in effectiveness as a function of salt concentration may correspond to salt-dependent changes in binding modes that were previously observed in bulk assays. [Abstract/Link to Full Text]

Nord D, Sjöberg BM
Unconventional GIY-YIG homing endonuclease encoded in group I introns in closely related strains of the Bacillus cereus group.
Nucleic Acids Res. 2007 Nov 21;
Several group I introns have been previously found in strains of the Bacillus cereus group at three different insertion sites in the nrdE gene of the essential nrdIEF operon coding for ribonucleotide reductase. Here, we identify an uncharacterized group IA intron in the nrdF gene in 12 strains of the B. cereus group and show that the pre-mRNA is efficiently spliced. The Bacillus thuringiensis ssp. pakistani nrdF intron encodes a homing endonuclease, denoted I-BthII, with an unconventional GIY-(X)(8)-YIG motif that cleaves an intronless nrdF gene 7 nt upstream of the intron insertion site, producing 2-nt 3' extensions. We also found four additional occurrences of two of the previously reported group I introns in the nrdE gene of 25 sequenced B. thuringiensis and one B. cereus strains, and one non-annotated group I intron at a fourth nrdE insertion site in the B. thuringiensis ssp. Al Hakam sequenced genome. Two strains contain introns in both the nrdE and the nrdF genes. Phylogenetic studies of the nrdIEF operon from 39 strains of the B. cereus group suggest several events of horizontal gene transfer for two of the introns found in this operon. [Abstract/Link to Full Text]

Yeats C, Lees J, Reid A, Kellam P, Martin N, Liu X, Orengo C
Gene3D: comprehensive structural and functional annotation of genomes.
Nucleic Acids Res. 2007 Nov 21;
Gene3D provides comprehensive structural and functional annotation of most available protein sequences, including the UniProt, RefSeq and Integr8 resources. The main structural annotation is generated through scanning these sequences against the CATH structural domain database profile-HMM library. CATH is a database of manually derived PDB-based structural domains, placed within a hierarchy reflecting topology, homology and conservation and is able to infer more ancient and divergent homology relationships than sequence-based approaches. This data is supplemented with Pfam-A, other non-domain structural predictions (i.e. coiled coils) and experimental data from UniProt. In order to enhance the investigations possible with this data, we have also incorporated a variety of protein annotation resources, including protein-protein interaction data, GO functional assignments, KEGG pathways, FUNCAT functional descriptions and links to microarray expression data. All of this data can be accessed through a newly re-designed website that has a focus on flexibility and clarity, with searches that can be restricted to a single genome or across the entire sequence database. Currently Gene3D contains over 3.5 million domain assignments for nearly 5 million proteins including 527 completed genomes. This is available at: [Abstract/Link to Full Text]

Wardle J, Burgers PM, Cann IK, Darley K, Heslop P, Johansson E, Lin LJ, McGlynn P, Sanvoisin J, Stith CM, Connolly BA
Uracil recognition by replicative DNA polymerases is limited to the archaea, not occurring with bacteria and eukarya.
Nucleic Acids Res. 2007 Nov 21;
Family B DNA polymerases from archaea such as Pyrococcus furiosus, which live at temperatures of approximately 100 °C, specifically recognize uracil in DNA templates and stall replication in response to this base. Here it is demonstrated that interaction with uracil is not restricted to hyperthermophilic archaea and that the polymerase from mesophilic Methanosarcina acetivorans shows identical behaviour. The family B DNA polymerases replicate the genomes of archaea, one of the three fundamental domains of life. This publication further shows that the DNA replicating polymerases from the other two domains, bacteria (polymerase III) and eukaryotes (polymerases delta and epsilon for nuclear DNA and polymerase gamma for mitochondrial), are unable to recognize uracil. Uracil occurs in DNA as a result of deamination of cytosine, either in G:C base-pairs or, more rapidly, in single stranded regions produced, for example, during replication. The resulting G:U mis-pairs/single stranded uracils are promutagenic and, unless repaired, give rise to G:C to A:T transitions in 50% of the progeny. The confinement of uracil recognition to polymerases of the archaeal domain is discussed in terms of the DNA repair pathways necessary for the elimination of uracil. [Abstract/Link to Full Text]

Kobayashi Y, Matsuo M, Sakamoto K, Wakasugi T, Yamada K, Obokata J
Two RNA editing sites with cis-acting elements of moderate sequence identity are recognized by an identical site-recognition protein in tobacco chloroplasts.
Nucleic Acids Res. 2007 Nov 21;
The chloroplast genome of higher plants contains 20-40 C-to-U RNA editing sites, whose number and locations vary among plant species. Biochemical analyses using in vitro RNA editing systems with chloroplast extracts have suggested that there is one-to-one recognition between proteinaceous site recognition factors and their respective RNA editing sites, but the strictness and generality of this recognition are still unsettled. In this study, we addressed this question with the aid of an in vitro RNA editing system from tobacco chloroplast extracts and with UV-crosslinking experiments. We found that the ndhB-9 and ndhF-1 editing sites of tobacco chloroplast transcripts are both bound by site-specific trans-acting factors of 95 kDa. Cross-competition experiments between ndhB-9 and ndhF-1 RNAs demonstrated that the 95 kDa proteins specifically binding to the ndhB-9 and ndhF-1 sites are the same protein. The binding regions of the 95 kDa protein on the ndhB-9 and ndhF-1 transcripts showed 60% identity in nucleotide sequence. This is the first biochemical demonstration that a site recognition factor of chloroplast RNA editing recognizes multiple sites. On the basis of this finding, we discuss how plant organellar RNA editing sites have diverged during evolution. [Abstract/Link to Full Text]

Lechat P, Hummel L, Rousseau S, Moszer I
GenoList: an integrated environment for comparative analysis of microbial genomes.
Nucleic Acids Res. 2007 Nov 21;
The multitude of bacterial genome sequences being determined has generated new requirements regarding the development of databases and graphical interfaces: these are needed to organize and retrieve biological information from the comparison of large sets of genomes. GenoList is an integrated environment dedicated to querying and analyzing genome data from bacterial species. GenoList inherits from the SubtiList database and web server, the reference data resource for the Bacillus subtilis genome. The data model was extended to hold information about relationships between genomes (e.g. protein families). The web user interface was designed to primarily take into account biologists' needs and modes of operation. Along with standard query and browsing capabilities, comparative genomics facilities are available, including subtractive proteome analysis. One key feature is the integration of the many tools accessible in the environment. As an example, it is straightforward to identify the genes that are specific to a group of bacteria, export them as a tab-separated list, get their protein sequences and run a multiple alignment on a subset of these sequences. [Abstract/Link to Full Text]
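
The idea of identifying genes specific to a group of bacteria and exporting them as a tab-separated list can be sketched as a set operation over protein family membership. This is only an illustration of the general idea, not GenoList's implementation: the genome names, family identifiers and the function group_specific_families are all hypothetical.

```python
def group_specific_families(families_by_genome, include, exclude):
    """Return families present in every 'include' genome and in no 'exclude' genome."""
    shared = set.intersection(*(families_by_genome[g] for g in include))
    unwanted = set().union(*(families_by_genome[g] for g in exclude))
    return sorted(shared - unwanted)


# Toy data: protein family identifiers detected in each (hypothetical) genome.
families_by_genome = {
    "genome_A": {"fam1", "fam2", "fam3"},
    "genome_B": {"fam1", "fam2", "fam4"},
    "genome_C": {"fam1", "fam5"},
}

specific = group_specific_families(
    families_by_genome,
    include=["genome_A", "genome_B"],
    exclude=["genome_C"],
)

# Export as a single tab-separated line, mirroring the export step in the abstract.
tsv_line = "\t".join(specific)
print(tsv_line)
```

A real subtractive analysis would also need a criterion for deciding when two proteins belong to the same family, which this sketch takes as given.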

Recent Articles in Genome Research

Dennis JH, Fan HY, Reynolds SM, Yuan G, Meldrim JC, Richter DJ, Peterson DG, Rando OJ, Noble WS, Kingston RE
Independent and complementary methods for large-scale structural analysis of mammalian chromatin.
Genome Res. 2007 Jun;17(6):928-39.
The fundamental building block of chromatin, the nucleosome, occupies 150 bp of DNA in a spaced arrangement that is a primary determinant in regulation of the genome. The nucleosomal organization of some regions of the human genome has been described, but mapping of these regions has been limited to a few kilobases. We have explored two independent and complementary methods for the high-throughput analysis of mammalian chromatin structure. Through adaptations to a protocol used to map yeast chromatin structure, we determined sites of nucleosomal protection over large regions of the mammalian genome using a tiling microarray. By modifying classical primer extension methods, we localized specific internucleosomally cleaved mammalian genomic sequences using a capillary electrophoresis sequencer in a manner that allows high-throughput nucleotide-resolution characterization of nucleosome protection patterns. We developed algorithms for the automated and unbiased analysis of the resulting data, a necessary step toward large-scale analysis. We validated these assays using the known positions of nucleosomes on the mouse mammary tumor virus LTR, and additionally, we characterized the previously unreported chromatin structure of the LCMT2 gene. These results demonstrate the effectiveness of the combined methods for reliable analysis of mammalian chromatin structure in a high-throughput manner. [Abstract/Link to Full Text]

Thurman RE, Day N, Noble WS, Stamatoyannopoulos JA
Identification of higher-order functional domains in the human ENCODE regions.
Genome Res. 2007 Jun;17(6):917-27.
It has long been posited that human and other large genomes are organized into higher-order (i.e., greater than gene-sized) functional domains. We hypothesized that diverse experimental data types generated by The ENCODE Project Consortium could be combined to delineate active and quiescent or repressed functional domains and thereby illuminate the higher-order functional architecture of the genome. To address this, we coupled wavelet analysis with hidden Markov models for unbiased discovery of "domain-level" behavior in high-resolution functional genomic data, including activating and repressive histone modifications, RNA output, and DNA replication timing. We find that higher-order patterns in these data types are largely concordant and may be analyzed collectively in the context of HeLa cells to delineate 53 active and 62 repressed functional domains within the ENCODE regions. Active domains comprise approximately 44% of the ENCODE regions but contain approximately 75%-80% of annotated genes, transcripts, and CpG islands. Repressed domains are enriched in certain classes of repetitive elements and, surprisingly, in evolutionarily conserved nonexonic sequences. The functional domain structure of the ENCODE regions appears to be largely stable across different cell types. Taken together, our results suggest that higher-order functional domains represent a fundamental organizing principle of human genome architecture. [Abstract/Link to Full Text]

Bhinge AA, Kim J, Euskirchen GM, Snyder M, Iyer VR
Mapping the chromosomal targets of STAT1 by Sequence Tag Analysis of Genomic Enrichment (STAGE).
Genome Res. 2007 Jun;17(6):910-6.
Identifying the genome-wide binding sites of transcription factors is important in deciphering transcriptional regulatory networks. ChIP-chip (Chromatin immunoprecipitation combined with microarrays) has been widely used to map transcription factor binding sites in the human genome. However, whole genome ChIP-chip analysis is still technically challenging in vertebrates. We recently developed STAGE as an unbiased method for identifying transcription factor binding sites in the genome. STAGE is conceptually based on SAGE, except that the input is ChIP-enriched DNA. In this study, we implemented an improved sequencing strategy and analysis methods and applied STAGE to map the genomic binding profile of the transcription factor STAT1 after interferon treatment. STAT1 is mainly responsible for mediating the cellular responses to interferons, such as cell proliferation, apoptosis, immune surveillance, and immune responses. We present novel algorithms for STAGE tag analysis to identify enriched loci with high specificity, as verified by quantitative ChIP. STAGE identified several previously unknown STAT1 target genes, many of which are involved in mediating the response to interferon-gamma signaling. STAGE is thus a viable method for identifying the chromosomal targets of transcription factors and generating meaningful biological hypotheses that further our understanding of transcriptional regulatory networks. [Abstract/Link to Full Text]

Euskirchen GM, Rozowsky JS, Wei CL, Lee WH, Zhang ZD, Hartman S, Emanuelsson O, Stolc V, Weissman S, Gerstein MB, Ruan Y, Snyder M
Mapping of transcription factor binding regions in mammalian cells by ChIP: comparison of array- and sequencing-based technologies.
Genome Res. 2007 Jun;17(6):898-909.
Recent progress in mapping transcription factor (TF) binding regions can largely be credited to chromatin immunoprecipitation (ChIP) technologies. We compared strategies for mapping TF binding regions in mammalian cells using two different ChIP schemes: ChIP with DNA microarray analysis (ChIP-chip) and ChIP with DNA sequencing (ChIP-PET). We first investigated parameters central to obtaining robust ChIP-chip data sets by analyzing STAT1 targets in the ENCODE regions of the human genome, and then compared ChIP-chip to ChIP-PET. We devised methods for scoring and comparing results among various tiling arrays and examined parameters such as DNA microarray format, oligonucleotide length, hybridization conditions, and the use of competitor Cot-1 DNA. The best performance was achieved with high-density oligonucleotide arrays, oligonucleotides ≥50 bases (b), the presence of competitor Cot-1 DNA and hybridizations conducted in microfluidics stations. When target identification was evaluated as a function of array number, 80%-86% of targets were identified with three or more arrays. Comparison of ChIP-chip with ChIP-PET revealed strong agreement for the highest ranked targets with less overlap for the low ranked targets. With advantages and disadvantages unique to each approach, we found that ChIP-chip and ChIP-PET are frequently complementary in their relative abilities to detect STAT1 targets for the lower ranked targets; each method detected validated targets that were missed by the other method. The most comprehensive list of STAT1 binding regions is obtained by merging results from ChIP-chip and ChIP-sequencing. Overall, this study provides information for robust identification, scoring, and validation of TF targets using ChIP-based technologies. [Abstract/Link to Full Text]

Karnani N, Taylor C, Malhotra A, Dutta A
Pan-S replication patterns and chromosomal domains defined by genome-tiling arrays of ENCODE genomic areas.
Genome Res. 2007 Jun;17(6):865-76.
In eukaryotes, accurate control of replication time is required for the efficient completion of S phase and maintenance of genome stability. We present a high-resolution genome-tiling array-based profile of replication timing for approximately 1% of the human genome studied by The ENCODE Project Consortium. Twenty percent of the investigated segments replicate asynchronously (pan-S). These areas are rich in genes and CpG islands, features they share with early-replicating loci. Interphase FISH showed that pan-S replication is a consequence of interallelic variation in replication time and is not an artifact derived from a specific cell cycle synchronization method or from aneuploidy. The interallelic variation in replication time is likely due to interallelic variation in chromatin environment, because while the early- or late-replicating areas were exclusively enriched in activating or repressing histone modifications, respectively, the pan-S areas had both types of histone modification. The replication profile of the chromosomes identified contiguous chromosomal segments of hundreds of kilobases separated by smaller segments where the replication time underwent an acute transition. Close examination of one such segment demonstrated that the delay of replication time was accompanied by a decrease in level of gene expression and appearance of repressive chromatin marks, suggesting that the transition segments are boundary elements separating chromosomal domains with different chromatin environments. [Abstract/Link to Full Text]

Washietl S, Pedersen JS, Korbel JO, Stocsits C, Gruber AR, Hackermüller J, Hertel J, Lindemeyer M, Reiche K, Tanzer A, Ucla C, Wyss C, Antonarakis SE, Denoeud F, Lagarde J, Drenkow J, Kapranov P, Gingeras TR, Guigó R, Snyder M, Gerstein MB, Reymond A, Hofacker IL, Stadler PF
Structured RNAs in the ENCODE selected regions of the human genome.
Genome Res. 2007 Jun;17(6):852-64.
Functional RNA structures play an important role both in noncoding RNA transcripts and as regulatory elements in mRNAs. Here we present a computational study to detect functional RNA structures within the ENCODE regions of the human genome. Since structural RNAs in general lack characteristic signals in primary sequence, comparative approaches evaluating evolutionary conservation of structures are most promising. We have used three recently introduced programs based on either phylogenetic-stochastic context-free grammar (EvoFold) or energy directed folding (RNAz and AlifoldZ), yielding several thousand candidate structures (corresponding to approximately 2.7% of the ENCODE regions). EvoFold has its highest sensitivity in highly conserved and relatively AU-rich regions, while RNAz favors slightly GC-rich regions, resulting in a relatively small overlap between methods. Comparison with the GENCODE annotation points to functional RNAs in all genomic contexts, with a slightly increased density in 3'-UTRs. While we estimate a significant false discovery rate of approximately 50%-70%, many of the predictions can be further substantiated by additional criteria: 248 loci are predicted by both RNAz and EvoFold, and an additional 239 RNAz or EvoFold predictions are supported by the (more stringent) AlifoldZ algorithm. Five hundred seventy RNAz structure predictions fall into regions that show signs of selection pressure also on the sequence level (i.e., conserved elements). More than 700 predictions overlap with noncoding transcripts detected by oligonucleotide tiling arrays. One hundred seventy-five selected candidates were tested by RT-PCR in six tissues, and expression could be verified in 43 cases (24.6%). [Abstract/Link to Full Text]

Zheng D, Frankish A, Baertsch R, Kapranov P, Reymond A, Choo SW, Lu Y, Denoeud F, Antonarakis SE, Snyder M, Ruan Y, Wei CL, Gingeras TR, Guigó R, Harrow J, Gerstein MB
Pseudogenes in the ENCODE regions: consensus annotation, analysis of transcription, and evolution.
Genome Res. 2007 Jun;17(6):839-51.
Arising from either retrotransposition or genomic duplication of functional genes, pseudogenes are "genomic fossils" valuable for exploring the dynamics and evolution of genes and genomes. Pseudogene identification is an important problem in computational genomics, and is also critical for obtaining an accurate picture of a genome's structure and function. However, no consensus computational scheme for defining and detecting pseudogenes has been developed thus far. As part of the ENCyclopedia Of DNA Elements (ENCODE) project, we have compared several distinct pseudogene annotation strategies and found that different approaches and parameters often resulted in rather distinct sets of pseudogenes. We subsequently developed a consensus approach for annotating pseudogenes (derived from protein coding genes) in the ENCODE regions, resulting in 201 pseudogenes, two-thirds of which originated from retrotransposition. A survey of orthologs for these pseudogenes in 28 vertebrate genomes showed that a significant fraction (approximately 80%) of the processed pseudogenes are primate-specific sequences, highlighting the increasing retrotransposition activity in primates. Analysis of sequence conservation and variation also demonstrated that most pseudogenes evolve neutrally, and processed pseudogenes appear to have lost their coding potential immediately or soon after their emergence. In order to explore the functional implication of pseudogene prevalence, we have extensively examined the transcriptional activity of the ENCODE pseudogenes. We performed a systematic series of pseudogene-specific RACE analyses. These, together with complementary evidence derived from tiling microarrays and high throughput sequencing, demonstrated that at least a fifth of the 201 pseudogenes are transcribed in one or more cell lines or tissues. [Abstract/Link to Full Text]

Ruan Y, Ooi HS, Choo SW, Chiu KP, Zhao XD, Srinivasan KG, Yao F, Choo CY, Liu J, Ariyaratne P, Bin WG, Kuznetsov VA, Shahab A, Sung WK, Bourque G, Palanisamy N, Wei CL
Fusion transcripts and transcribed retrotransposed loci discovered through comprehensive transcriptome analysis using Paired-End diTags (PETs).
Genome Res. 2007 Jun;17(6):828-38.
Identification of unconventional functional features such as fusion transcripts is a challenging task in the effort to annotate all functional DNA elements in the human genome. Paired-End diTag (PET) analysis possesses a unique capability to accurately and efficiently characterize the two ends of DNA fragments, which may have either normal or unusual compositions. This unique nature of PET analysis makes it an ideal tool for uncovering unconventional features residing in the human genome. Using the PET approach for comprehensive transcriptome analysis, we were able to identify fusion transcripts derived from genome rearrangements and actively expressed retrotransposed pseudogenes, which would be difficult to capture by other means. Here, we demonstrate this unique capability through the analysis of 865,000 individual transcripts in two types of cancer cells. In addition to the characterization of a large number of differentially expressed alternative 5' and 3' transcript variants and novel transcriptional units, we identified 70 fusion transcript candidates in this study. One was validated as the product of a fusion gene between BCAS4 and BCAS3 resulting from an amplification followed by a translocation event between the two loci, chr20q13 and chr17q23. Through an examination of PETs that mapped to multiple genomic locations, we identified 4055 retrotransposed loci in the human genome, of which at least three were found to be transcriptionally active. The PET mapping strategy presented here promises to be a useful tool in annotating the human genome, especially aberrations in human cancer genomes. [Abstract/Link to Full Text]

Lin JM, Collins PJ, Trinklein ND, Fu Y, Xi H, Myers RM, Weng Z
Transcription factor binding and modified histones in human bidirectional promoters.
Genome Res. 2007 Jun;17(6):818-27.
Bidirectional promoters have received considerable attention because of their ability to regulate two downstream genes (divergent genes). They are also highly abundant, directing the transcription of approximately 11% of genes in the human genome. We categorized the presence of DNA sequence motifs, binding of transcription factors, and modified histones as overrepresented, shared, or underrepresented in bidirectional promoters with respect to unidirectional promoters. We found that a small set of motifs, including GABPA, MYC, E2F1, E2F4, NRF-1, CCAAT, YY1, and ACTACAnnTCC are overrepresented in bidirectional promoters, while the majority (73%) of known vertebrate motifs are underrepresented. We performed chromatin-immunoprecipitation (ChIP), followed by quantitative PCR for GABPA, on 118 regions in the human genome and showed that it binds to bidirectional promoters more frequently than unidirectional promoters, and its position-specific scoring matrix is highly predictive of binding. Signatures of active transcription, such as occupancy of RNA polymerase II and the modified histones H3K4me2, H3K4me3, and H3ac, are overrepresented in regions around bidirectional promoters, suggesting that a higher fraction of divergent genes are transcribed in a given cell than the fraction of other genes. Accordingly, analysis of whole-genome microarray data indicates that 68% of divergent genes are transcribed compared with 44% of all human genes. By combining the analysis of publicly available ENCODE data and a detailed study of GABPA, we survey bidirectional promoters with breadth and depth, leading to biological insights concerning their motif composition and bidirectional regulatory mode. [Abstract/Link to Full Text]

Jin VX, O'Geen H, Iyengar S, Green R, Farnham PJ
Identification of an OCT4 and SRY regulatory module using integrated computational and experimental genomics approaches.
Genome Res. 2007 Jun;17(6):807-17.
ChIP-chip studies have revealed that many in vivo binding sites have a weak match to the consensus sequence for the transcription factor being analyzed. Possible explanations for these observations include (1) the in vitro-derived consensus site does not represent the in vivo binding site and/or (2) the factor is recruited to a weak binding site via interaction with another protein. To address these possibilities, we developed an approach (ChIPMotifs) that incorporates a bootstrap resampling method to statistically infer the optimal cutoff threshold for a position weight matrix (PWM) of a motif identified from ChIP-chip data by ab initio motif discovery programs. Using OCT4 ChIP-chip data and the ChIPMotifs approach, we first developed a refined OCT4 PWM. We then used the refined PWM and a ChIPModules approach to identify transcription factors colocalizing with OCT4 in Ntera2 testicular embryonal carcinoma cells. We found that the consensus binding site for SRY, a transcription factor critical for testis development, colocalizes with the OCT4 PWM. To further characterize the relationship between OCT4 and SRY, we performed ChIP-chip experiments with human promoter microarrays, and found that 49% of the top approximately 1000 OCT4 target promoters were also bound by SRY. This analysis represents the first identification of SRY target promoters. Interestingly, we determined that promoters bound by OCT4 and SRY, but not those bound by SRY alone, were also bound by the transcriptional repressor KAP1. Our studies not only validate the ChIPMotifs and ChIPModules combinatorial approach but also identify a possible new regulatory partner of OCT4. [Abstract/Link to Full Text]
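
The general idea of using bootstrap resampling to choose a score cutoff for a position weight matrix can be sketched as follows. This is a minimal illustration under stated assumptions and not the ChIPMotifs procedure itself, whose details the abstract does not give: the toy PWM, the example sequences, the choice of a low quantile as the cutoff statistic, and the function names best_pwm_score and bootstrap_cutoff are all hypothetical.

```python
import random

# Toy log-odds style PWM for a 4-bp motif (hypothetical values favoring "ATGC").
TOY_PWM = [
    {"A": 1, "C": -1, "G": -1, "T": -1},
    {"A": -1, "C": -1, "G": -1, "T": 1},
    {"A": -1, "C": -1, "G": 1, "T": -1},
    {"A": -1, "C": 1, "G": -1, "T": -1},
]


def best_pwm_score(seq, pwm):
    """Score every window of the sequence against the PWM and keep the best score."""
    width = len(pwm)
    return max(
        sum(pwm[i][seq[start + i]] for i in range(width))
        for start in range(len(seq) - width + 1)
    )


def bootstrap_cutoff(scores, n_resamples=1000, quantile=0.05, seed=0):
    """Average a low quantile of the scores over bootstrap resamples.

    Assumption: the cutoff should admit most bound sequences, so a low quantile
    of their best scores is used as the statistic being bootstrapped.
    """
    rng = random.Random(seed)
    n = len(scores)
    estimates = []
    for _ in range(n_resamples):
        resample = sorted(rng.choice(scores) for _ in range(n))
        estimates.append(resample[int(quantile * (n - 1))])
    return sum(estimates) / len(estimates)


# Hypothetical ChIP-enriched sequences.
bound_sequences = ["AAATGCAAA", "CCATGAGG", "TTATGCTT", "GGGGGGGG", "ATGCATGC", "CATGCCAT"]
scores = [best_pwm_score(s, TOY_PWM) for s in bound_sequences]
cutoff = bootstrap_cutoff(scores)
print("best scores:", scores)
print("bootstrap cutoff:", cutoff)
```

Averaging over resamples makes the cutoff less sensitive to a single unusual sequence than taking the quantile of the original scores once; a production method would also compare against background sequences.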

Xi H, Yu Y, Fu Y, Foley J, Halees A, Weng Z
Analysis of overrepresented motifs in human core promoters reveals dual regulatory roles of YY1.
Genome Res. 2007 Jun;17(6):798-806.
A set of 723 high-quality human core promoter sequences was compiled and analyzed for overrepresented motifs. Besides the two well-characterized core promoter motifs (TATA and Inr), several known motifs (YY1, Sp1, NRF-1, NRF-2, CAAT, and CREB) and one potentially new motif (motif8) were found. Interestingly, YY1 and motif8 mostly reside immediately downstream from the TSS. In particular, the YY1 motif occurs primarily in genes with 5'-UTRs shorter than 40 base pairs (bp) and its locations coincide with the translation start site. We verified that the YY1 motif is bound by YY1 in vitro. We then performed detailed analysis on YY1 chromatin immunoprecipitation data with a whole-genome human promoter microarray (ChIP-chip) and revealed that the promoters thus identified in HeLa cells were highly enriched with the YY1 motif. Moreover, the motif overlapped with the translation start sites on the plus strand of a group of genes, many with short 5'-UTRs, and with the transcription start sites on the minus strand of another distinct group of genes; together, the two groups of genes accounted for the majority of the YY1-bound promoters in the ChIP-chip data. Furthermore, the first group of genes was highly enriched in the functional categories of ribosomal proteins and nuclear-encoded mitochondrial proteins. We suggest that the YY1 motif plays a dual role in both transcription and translation initiation of these genes. We also discuss the evolutionary advantages of housing a transcriptional element inside the transcript in terms of the migration of these genes in the human genome. [Abstract/Link to Full Text]

Zhang ZD, Paccanaro A, Fu Y, Weissman S, Weng Z, Chang J, Snyder M, Gerstein MB
Statistical analysis of the genomic distribution and correlation of regulatory elements in the ENCODE regions.
Genome Res. 2007 Jun;17(6):787-97.
The comprehensive inventory of functional elements in 44 human genomic regions carried out by the ENCODE Project Consortium enables for the first time a global analysis of the genomic distribution of transcriptional regulatory elements. In this study we developed an intuitive and yet powerful approach to analyze the distribution of regulatory elements found in many different ChIP-chip experiments on a 10-100-kb scale. First, we focus on the overall chromosomal distribution of regulatory elements in the ENCODE regions and show that it is highly nonuniform. We demonstrate, in fact, that regulatory elements are associated with the location of known genes. Further examination on a local, single-gene scale shows an enrichment of regulatory elements near both transcription start and end sites. Our results indicate that overall these elements are clustered into regulatory rich "islands" and poor "deserts." Next, we examine how consistent the nonuniform distribution is between different transcription factors. We perform on all the factors a multivariate analysis in the framework of a biplot, which enhances biological signals in the experiments. This groups transcription factors into sequence-specific and sequence-nonspecific clusters. Moreover, with experimental variation carefully controlled, detailed correlations show that the distribution of sites was generally reproducible for a specific factor between different laboratories and microarray platforms. Data sets associated with histone modifications have particularly strong correlations. Finally, we show how the correlations between factors change when only regulatory elements far from the transcription start sites are considered. [Abstract/Link to Full Text]

King DC, Taylor J, Zhang Y, Cheng Y, Lawson HA, Martin J, Chiaromonte F, Miller W, Hardison RC
Finding cis-regulatory elements using comparative genomics: some lessons from ENCODE data.
Genome Res. 2007 Jun;17(6):775-86.
Identification of functional genomic regions using interspecies comparison will be most effective when the full span of relationships between genomic function and evolutionary constraint are utilized. We find that sets of putative transcriptional regulatory sequences, defined by ENCODE experimental data, have a wide span of evolutionary histories, ranging from stringent constraint shown by deep phylogenetic comparisons to recent selection on lineage-specific elements. This diversity of evolutionary histories can be captured, at least in part, by the suite of available comparative genomics tools, especially after correction for regional differences in the neutral substitution rate. Putative transcriptional regulatory regions show alignability in different clades, and the genes associated with them are enriched for distinct functions. Some of the putative regulatory regions show evidence for recent selection, including a primate-specific, distal promoter that may play a novel role in regulation. [Abstract/Link to Full Text]

Margulies EH, Cooper GM, Asimenos G, Thomas DJ, Dewey CN, Siepel A, Birney E, Keefe D, Schwartz AS, Hou M, Taylor J, Nikolaev S, Montoya-Burgos JI, Löytynoja A, Whelan S, Pardi F, Massingham T, Brown JB, Bickel P, Holmes I, Mullikin JC, Ureta-Vidal A, Paten B, Stone EA, Rosenbloom KR, Kent WJ, Bouffard GG, Guan X, Hansen NF, Idol JR, Maduro VV, Maskeri B, McDowell JC, Park M, Thomas PJ, Young AC, Blakesley RW, Muzny DM, Sodergren E, Wheeler DA, Worley KC, Jiang H, Weinstock GM, Gibbs RA, Graves T, Fulton R, Mardis ER, Wilson RK, Clamp M, Cuff J, Gnerre S, Jaffe DB, Chang JL, Lindblad-Toh K, Lander ES, Hinrichs A, Trumbower H, Clawson H, Zweig A, Kuhn RM, Barber G, Harte R, Karolchik D, Field MA, Moore RA, Matthewson CA, Schein JE, Marra MA, Antonarakis SE, Batzoglou S, Goldman N, Hardison R, Haussler D, Miller W, Pachter L, Green ED, Sidow A
Analyses of deep mammalian sequence alignments and constraint predictions for 1% of the human genome.
Genome Res. 2007 Jun;17(6):760-74.
A key component of the ongoing ENCODE project involves rigorous comparative sequence analyses for the initially targeted 1% of the human genome. Here, we present orthologous sequence generation, alignment, and evolutionary constraint analyses of 23 mammalian species for all ENCODE targets. Alignments were generated using four different methods; comparisons of these methods reveal large-scale consistency but substantial differences in terms of small genomic rearrangements, sensitivity (sequence coverage), and specificity (alignment accuracy). We describe the quantitative and qualitative trade-offs concomitant with alignment method choice and the levels of technical error that need to be accounted for in applications that require multisequence alignments. Using the generated alignments, we identified constrained regions using three different methods. While the different constraint-detecting methods are in general agreement, there are important discrepancies relating to both the underlying alignments and the specific algorithms. However, by integrating the results across the alignments and constraint-detecting methods, we produced constraint annotations that were found to be robust based on multiple independent measures. Analyses of these annotations illustrate that most classes of experimentally annotated functional elements are enriched for constrained sequences; however, large portions of each class (with the exception of protein-coding sequences) do not overlap constrained regions. The latter elements might not be under primary sequence constraint, might not be constrained across all mammals, or might have expendable molecular functions. Conversely, 40% of the constrained sequences do not overlap any of the functional elements that have been experimentally identified. Together, these findings demonstrate and quantify how many genomic functional elements await basic molecular characterization. [Abstract/Link to Full Text]

Denoeud F, Kapranov P, Ucla C, Frankish A, Castelo R, Drenkow J, Lagarde J, Alioto T, Manzano C, Chrast J, Dike S, Wyss C, Henrichsen CN, Holroyd N, Dickson MC, Taylor R, Hance Z, Foissac S, Myers RM, Rogers J, Hubbard T, Harrow J, Guigó R, Gingeras TR, Antonarakis SE, Reymond A
Prominent use of distal 5' transcription start sites and discovery of a large number of additional exons in ENCODE regions.
Genome Res. 2007 Jun;17(6):746-59.
This report presents systematic empirical annotation of transcript products from 399 annotated protein-coding loci across the 1% of the human genome targeted by the Encyclopedia of DNA elements (ENCODE) pilot project using a combination of 5' rapid amplification of cDNA ends (RACE) and high-density resolution tiling arrays. We identified previously unannotated and often tissue- or cell-line-specific transcribed fragments (RACEfrags), both 5' distal to the annotated 5' terminus and internal to the annotated gene bounds for the vast majority (81.5%) of the tested genes. Half of the distal RACEfrags span large segments of genomic sequences away from the main portion of the coding transcript and often overlap with the upstream-annotated gene(s). Notably, at least 20% of the resultant novel transcripts have changes in their open reading frames (ORFs), most of them fusing ORFs of adjacent transcripts. A significant fraction of distal RACEfrags show expression levels comparable to those of known exons of the same locus, suggesting that they are not part of very minor splice forms. These results have significant implications concerning (1) our current understanding of the architecture of protein-coding genes; (2) our views on locations of regulatory regions in the genome; and (3) the interpretation of sequence polymorphisms mapping to regions hitherto considered to be "noncoding," ultimately relating to the identification of disease-related sequence alterations. [Abstract/Link to Full Text]

Rozowsky JS, Newburger D, Sayward F, Wu J, Jordan G, Korbel JO, Nagalakshmi U, Yang J, Zheng D, Guigó R, Gingeras TR, Weissman S, Miller P, Snyder M, Gerstein MB
The DART classification of unannotated transcription within the ENCODE regions: associating transcription with known and novel loci.
Genome Res. 2007 Jun;17(6):732-45.
For the approximately 1% of the human genome in the ENCODE regions, only about half of the transcriptionally active regions (TARs) identified with tiling microarrays correspond to annotated exons. Here we categorize this large amount of "unannotated transcription." We use a number of disparate features to classify the 6988 novel TARs: array expression profiles across cell lines and conditions, sequence composition, phylogenetic profiles (presence/absence of syntenic conservation across 17 species), and locations relative to genes. In the classification, we first filter out TARs with unusual sequence composition and those likely resulting from cross-hybridization. We then associate some of those remaining with proximal exons having correlated expression profiles. Finally, we cluster unclassified TARs into putative novel loci, based on similar expression and phylogenetic profiles. To encapsulate our classification, we construct a Database of Active Regions and Tools. DART has special facilities for rapidly handling and comparing many sets of TARs and their heterogeneous features, synchronizing across builds, and interfacing with other resources. Overall, we find that approximately 14% of the novel TARs can be associated with known genes, while approximately 21% can be clustered into approximately 200 novel loci. We observe that TARs associated with genes are enriched in the potential to form structural RNAs and many novel TAR clusters are associated with nearby promoters. To benchmark our classification, we design a set of experiments for testing the connectivity of novel TARs. Overall, we find that 18 of the 46 connections tested validate by RT-PCR and four of five sequenced PCR products confirm connectivity unambiguously. [Abstract/Link to Full Text]

Trinklein ND, Karaöz U, Wu J, Halees A, Force Aldred S, Collins PJ, Zheng D, Zhang ZD, Gerstein MB, Snyder M, Myers RM, Weng Z
Integrated analysis of experimental data sets reveals many novel promoters in 1% of the human genome.
Genome Res. 2007 Jun;17(6):720-31.
The regulation of transcriptional initiation in the human genome is a critical component of global gene regulation, but a complete catalog of human promoters currently does not exist. In order to identify regulatory regions, we developed four computational methods to integrate 129 sets of ENCODE-wide chromatin immunoprecipitation data. They collectively predicted 1393 regions. Roughly 47% of the regions were unique to one method, as each method makes different assumptions about the data. Overall, predicted regions tend to localize to highly conserved, DNase I hypersensitive, and actively transcribed regions in the genome. Interestingly, a significant portion of the regions overlaps with annotated 3'-UTRs, suggesting that some of them might regulate anti-sense transcription. The majority of the predicted regions are >2 kb away from the 5'-ends of previously annotated human cDNAs and hence are novel. These novel regions may regulate unannotated transcripts or may represent new alternative transcription start sites of known genes. We tested 163 such regions for promoter activity in four cell lines using transient transfection assays, and 25% of them showed transcriptional activity above background in at least one cell line. We also performed 5'-RACE experiments on 62 novel regions, and 76% of the regions were associated with the 5'-ends of at least two RACE products. Our results suggest that there are at least 35% more functional promoters in the human genome than currently annotated. [Abstract/Link to Full Text]

Rada-Iglesias A, Enroth S, Ameur A, Koch CM, Clelland GK, Respuela-Alonso P, Wilcox S, Dovey OM, Ellis PD, Langford CF, Dunham I, Komorowski J, Wadelius C
Butyrate mediates decrease of histone acetylation centered on transcription start sites and down-regulation of associated genes.
Genome Res. 2007 Jun;17(6):708-19.
Butyrate is a histone deacetylase inhibitor (HDACi) with anti-neoplastic properties, which theoretically reactivates epigenetically silenced genes by increasing global histone acetylation. However, recent studies indicate that a similar number or even more genes are down-regulated than up-regulated by this drug. We treated hepatocarcinoma HepG2 cells with butyrate and characterized the levels of acetylation at DNA-bound histones H3 and H4 by ChIP-chip along the ENCODE regions. In contrast to the global increases of histone acetylation, many genomic regions close to transcription start sites were deacetylated after butyrate exposure. In validating these findings, we found that both butyrate and trichostatin A treatment resulted in histone deacetylation at selected regions, while nucleosome loss or changes in histone H3 lysine 4 trimethylation (H3K4me3) did not occur in such locations. Furthermore, similar histone deacetylation events were observed when colon adenocarcinoma HT-29 cells were treated with butyrate. In addition, genes with deacetylated promoters were down-regulated by butyrate, and this was mediated at the transcriptional level by affecting RNA polymerase II (POLR2A) initiation/elongation. Finally, the global increase in acetylated histones was preferentially localized to the nuclear periphery, indicating that it might not be associated with euchromatin. Our results are significant for the evaluation of HDACi as anti-tumorigenic drugs, suggesting that previous models of action might need to be revised, and provide an explanation for the frequently observed repression of many genes during HDACi treatment. [Abstract/Link to Full Text]

Koch CM, Andrews RM, Flicek P, Dillon SC, Karaöz U, Clelland GK, Wilcox S, Beare DM, Fowler JC, Couttet P, James KD, Lefebvre GC, Bruce AW, Dovey OM, Ellis PD, Dhami P, Langford CF, Weng Z, Birney E, Carter NP, Vetrie D, Dunham I
The landscape of histone modifications across 1% of the human genome in five human cell lines.
Genome Res. 2007 Jun;17(6):691-707.
We generated high-resolution maps of histone H3 lysine 9/14 acetylation (H3ac), histone H4 lysine 5/8/12/16 acetylation (H4ac), and histone H3 at lysine 4 mono-, di-, and trimethylation (H3K4me1, H3K4me2, H3K4me3, respectively) across the ENCODE regions. Studying each modification in five human cell lines including the ENCODE Consortium common cell lines GM06990 (lymphoblastoid) and HeLa-S3, as well as K562, HFL-1, and MOLT4, we identified clear patterns of histone modification profiles with respect to genomic features. H3K4me3, H3K4me2, and H3ac modifications are tightly associated with the transcriptional start sites (TSSs) of genes, while H3K4me1 and H4ac have more widespread distributions. TSSs reveal characteristic patterns of both types of modification present and the position relative to TSSs. These patterns differ between active and inactive genes and in particular the state of H3K4me3 and H3ac modifications is highly predictive of gene activity. Away from TSSs, modification sites are enriched in H3K4me1 and relatively depleted in H3K4me3 and H3ac. Comparison between cell lines identified differences in the histone modification profiles associated with transcriptional differences between the cell lines. These results provide an overview of the functional relationship among histone modifications and gene expression in human cells. [Abstract/Link to Full Text]

Gingeras TR
Origin of phenotypes: genes and transcripts.
Genome Res. 2007 Jun;17(6):682-90.
While the concept of a gene has been helpful in defining the relationship of a portion of a genome to a phenotype, this traditional term may not be as useful as it once was. Currently, "gene" has come to refer principally to a genomic region producing a polyadenylated mRNA that encodes a protein. However, the recent emergence of a large collection of unannotated transcripts with apparently little protein coding capacity, collectively called transcripts of unknown function (TUFs), has begun to blur the physical boundaries and genomic organization of genic regions with noncoding transcripts often overlapping protein-coding genes on the same (sense) and opposite strand (antisense). Moreover, they are often located in intergenic regions, making the genic portions of the human genome an interleaved network of both annotated polyadenylated and nonpolyadenylated transcripts, including splice variants with novel 5' ends extending hundreds of kilobases. This complex transcriptional organization and other recently observed features of genomes argue for the reconsideration of the term "gene" and suggest that transcripts may be used to define the operational unit of a genome. [Abstract/Link to Full Text]

Gerstein MB, Bruce C, Rozowsky JS, Zheng D, Du J, Korbel JO, Emanuelsson O, Zhang ZD, Weissman S, Snyder M
What is a gene, post-ENCODE? History and updated definition.
Genome Res. 2007 Jun;17(6):669-81.
While sequencing of the human genome surprised us with how many protein-coding genes there are, it did not fundamentally change our perspective on what a gene is. In contrast, the complex patterns of dispersed regulation and pervasive transcription uncovered by the ENCODE project, together with non-genic conservation and the abundance of noncoding RNA genes, have challenged the notion of the gene. To illustrate this, we review the evolution of operational definitions of a gene over the past century--from the abstract elements of heredity of Mendel and Morgan to the present-day ORFs enumerated in the sequence databanks. We then summarize the current ENCODE findings and provide a computational metaphor for the complexity. Finally, we propose a tentative update to the definition of a gene: A gene is a union of genomic sequences encoding a coherent set of potentially overlapping functional products. Our definition side-steps the complexities of regulation and transcription by removing the former altogether from the definition and arguing that final, functional gene products (rather than intermediate transcripts) should be used to group together entities associated with a single gene. It also manifests how integral the concept of biological function is in defining genes. [Abstract/Link to Full Text]

Weinstock GM
ENCODE: more genomic empowerment.
Genome Res. 2007 Jun;17(6):667-8. [Abstract/Link to Full Text]

Kim JH, Waterman MS, Li LM
Diploid genome reconstruction of Ciona intestinalis and comparative analysis with Ciona savignyi.
Genome Res. 2007 Jul;17(7):1101-10.
One of the main goals in genome sequencing projects is to determine a haploid consensus sequence even when clone libraries are constructed from homologous chromosomes. However, it has been noticed that haplotypes can be inferred from genome assemblies by investigating phase conservation in sequenced reads. In this study, we seek to infer haplotypes, a diploid consensus sequence, from the genome assembly of an organism, Ciona intestinalis. The Ciona intestinalis genome is an ideal resource from which haplotypes can be inferred because of the high polymorphism rate (1.2%). The haplotype estimation scheme consists of polymorphism detection and phase estimation. The core step of our method is a Gibbs sampling procedure. The mate-pair information from two-end sequenced clone inserts is exploited to provide long-range continuity. We estimate the polymorphism rate of Ciona intestinalis to be 1.2% and 1.5%, according to two different polymorphism counting schemes. The distribution of heterozygosity number is well fit by a compound Poisson distribution. The N50 length of haplotype segments is 37.9 kb in our assembly, while the N50 scaffold length of the Ciona intestinalis assembly is 190 kb. We also infer diploid gene sequences from haplotype segments. According to our reconstruction, 85.4% of predicted gene sequences are continuously covered by single haplotype segments. Our results indicate 97% accuracy in haplotype estimation, based on a simulated data set. We conduct a comparative analysis with Ciona savignyi, and discover interesting patterns of conserved DNA elements in chordates. [Abstract/Link to Full Text]
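The N50 figures quoted above (37.9 kb for haplotype segments, 190 kb for scaffolds) follow the usual assembly-statistics definition: the length L such that segments of length L or longer together cover at least half of the total assembled length. A minimal sketch of that standard calculation is shown here; the function name and the toy lengths are illustrative and are not taken from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of segment lengths.

    Segments are visited from longest to shortest; the N50 is the length of
    the segment at which the running total first reaches half of the sum.
    """
    if not lengths:
        raise ValueError("need at least one segment length")
    half = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half:
            return length

# Toy example: total is 54, and 10 + 9 + 8 = 27 reaches half of it.
print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))  # prints 8
```

Because N50 weights by length rather than by count, a few long segments can dominate it, which is why it is usually reported instead of the mean segment length.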

Faux NG, Huttley GA, Mahmood K, Webb GI, de la Banda MG, Whisstock JC
RCPdb: An evolutionary classification and codon usage database for repeat-containing proteins.
Genome Res. 2007 Jul;17(7):1118-27.
Over 3% of human proteins contain single amino acid repeats (repeat-containing proteins, RCPs). Many repeats (homopeptides) localize to important proteins involved in transcription, and the expansion of certain repeats, in particular poly-Q and poly-A tracts, can also lead to the development of neurological diseases. Previous studies have suggested that the homopeptide makeup is a result of the presence of G+C-rich tracts in the encoding genes and that expansion occurs via replication slippage. Here, we have performed a large-scale genomic analysis of the variation of the genes encoding RCPs in 13 species and present these data in an online database. This resource allows rapid comparison and analysis of RCPs, homopeptides, and their underlying genetic tracts across the eukaryotic species considered. We report three major findings. First, there is a bias for a small subset of codons being reiterated within homopeptides, and there is no G+C or A+T bias relative to the organism's transcriptome. Second, single base pair transversions from the homocodon are unusually common and may represent a mechanism of reducing the rate of homopeptide mutations. Third, homopeptides that are conserved across different species lie within regions that are under stronger purifying selection in contrast to nonconserved homopeptides. [Abstract/Link to Full Text]

Garg K, Green P
Differing patterns of selection in alternative and constitutive splice sites.
Genome Res. 2007 Jul;17(7):1015-22.
In addition to allowing identification of putative functional elements as regions having reduced substitution rates, comparison of genome sequences can also provide insights into these elements at the nucleotide level, by indicating the pattern of tolerated substitutions. We created data sets of orthologous alternative and constitutive splice sites in mouse, rat, and human and analyzed the substitutions occurring within them. Our results illuminate differences between alternative and constitutive sites and, in particular, strongly support the idea that alternative sites are under selection to be weak. [Abstract/Link to Full Text]

Sebaihia M, Peck MW, Minton NP, Thomson NR, Holden MT, Mitchell WJ, Carter AT, Bentley SD, Mason DR, Crossman L, Paul CJ, Ivens A, Wells-Bennik MH, Davis IJ, Cerdeño-Tárraga AM, Churcher C, Quail MA, Chillingworth T, Feltwell T, Fraser A, Goodhead I, Hance Z, Jagels K, Larke N, Maddison M, Moule S, Mungall K, Norbertczak H, Rabbinowitsch E, Sanders M, Simmonds M, White B, Whithead S, Parkhill J
Genome sequence of a proteolytic (Group I) Clostridium botulinum strain Hall A and comparative analysis of the clostridial genomes.
Genome Res. 2007 Jul;17(7):1082-92.
Clostridium botulinum is a heterogeneous Gram-positive species that comprises four genetically and physiologically distinct groups of bacteria that share the ability to produce botulinum neurotoxin, the most poisonous toxin known to man, and the causative agent of botulism, a severe disease of humans and animals. We report here the complete genome sequence of a representative of Group I (proteolytic) C. botulinum (strain Hall A, ATCC 3502). The genome consists of a chromosome (3,886,916 bp) and a plasmid (16,344 bp), which carry 3650 and 19 predicted genes, respectively. Consistent with the proteolytic phenotype of this strain, the genome harbors a large number of genes encoding secreted proteases and enzymes involved in uptake and metabolism of amino acids. The genome also reveals a hitherto unknown ability of C. botulinum to degrade chitin. There is a significant lack of recently acquired DNA, indicating a stable genomic content, in strong contrast to the fluid genome of Clostridium difficile, which can form longer-term relationships with its host. Overall, the genome indicates that C. botulinum is adapted to a saprophytic lifestyle both in soil and aquatic environments. This pathogen relies on its toxin to rapidly kill a wide range of prey species, and to gain access to nutrient sources, it releases a large number of extracellular enzymes to soften and destroy rotting or decayed tissues. [Abstract/Link to Full Text]

Belov K, Sanderson CE, Deakin JE, Wong ES, Assange D, McColl KA, Gout A, de Bono B, Barrow AD, Speed TP, Trowsdale J, Papenfuss AT
Characterization of the opossum immune genome provides insights into the evolution of the mammalian immune system.
Genome Res. 2007 Jul;17(7):982-91.
The availability of the first marsupial genome sequence has allowed us to characterize the immunome of the gray short-tailed opossum (Monodelphis domestica). Here we report the identification of key immune genes, including the highly divergent chemokines, defensins, cathelicidins, and Natural Killer cell receptors. It appears that the increase in complexity of the mammalian immune system occurred prior to the divergence of the marsupial and eutherian lineages approximately 180 million years ago. Genomes of ancestral mammals most likely contained all of the key mammalian immune gene families, with evolution on different continents, in the presence of different pathogens leading to lineage specific expansions and contractions, resulting in some minor differences in gene number and composition between different mammalian lineages. Gene expansion and extensive heterogeneity in opossum antimicrobial peptide genes may have evolved as a consequence of the newborn young needing to survive without an adaptive immune system in a pathogen laden environment. Given the similarities in the genomic architecture of the marsupial and eutherian immune systems, we propose that marsupials are ideal model organisms for the study of developmental immunology. [Abstract/Link to Full Text]

Carmel L, Rogozin IB, Wolf YI, Koonin EV
Evolutionarily conserved genes preferentially accumulate introns.
Genome Res. 2007 Jul;17(7):1045-50.
Introns that interrupt eukaryotic protein-coding sequences are generally thought to be nonfunctional. However, for reasons still poorly understood, positions of many introns are highly conserved in evolution. Previous reconstructions of intron gain and loss events during eukaryotic evolution used a variety of simplified evolutionary models that yielded contradicting conclusions and are not suited to reveal some of the key underlying processes. We combine a comprehensive probabilistic model and an extended data set, including 391 conserved genes from 19 eukaryotes, to uncover previously unnoticed aspects of intron evolution--in particular, to assign intron gain and loss rates to individual genes. The rates of intron gain and loss in a gene show moderate positive correlation. A gene's intron gain rate shows a highly significant negative correlation with the coding-sequence evolution rate; intron loss rate also significantly, but positively, correlates with the sequence evolution rate. Correlations of the opposite signs, albeit less significant ones, are observed between intron gain and loss rates and gene expression level. It is proposed that intron evolution includes a neutral component, which is manifest in the positive correlation between the gain and loss rates and a selection-driven component as reflected in the links between intron gain and loss and sequence evolution. The increased intron gain and decreased intron loss in evolutionarily conserved genes indicate that intron insertion often might be adaptive, whereas some of the intron losses might be deleterious. This apparent functional importance of introns is likely to be due, at least in part, to their multiple effects on gene expression. [Abstract/Link to Full Text]

Carmel L, Wolf YI, Rogozin IB, Koonin EV
Three distinct modes of intron dynamics in the evolution of eukaryotes.
Genome Res. 2007 Jul;17(7):1034-44.
Several contrasting scenarios have been proposed for the origin and evolution of spliceosomal introns, a hallmark of eukaryotic genes. A comprehensive probabilistic model to obtain a definitive reconstruction of intron evolution was developed and applied to 391 sets of conserved genes from 19 eukaryotic species. It is inferred that a relatively high intron density was reached early, i.e., the last common ancestor of eukaryotes contained >2.15 introns/kilobase, and the last common ancestor of multicellular life forms harbored approximately 3.4 introns/kilobase, a greater intron density than in most of the extant fungi and in some animals. The rates of intron gain and intron loss appear to have been dropping during the last approximately 1.3 billion years, with the decline in the gain rate being much steeper. Eukaryotic lineages exhibit three distinct modes of evolution of the intron-exon structure. The primary, balanced mode, apparently, operates in all lineages. In this mode, intron gain and loss are strongly and positively correlated, in contrast to previous reports on inverse correlation between these processes. The second mode involves an elevated rate of intron loss and is prevalent in several lineages, such as fungi and insects. The third mode, characterized by elevated rate of intron gain, is seen only in deep branches of the tree, indicating that bursts of intron invasion occurred at key points in eukaryotic evolution, such as the origin of animals. Intron dynamics could depend on multiple mechanisms, and in the balanced mode, gain and loss of introns might share common mechanistic features. [Abstract/Link to Full Text]

Oliver MJ, Petrov D, Ackerly D, Falkowski P, Schofield OM
The mode and tempo of genome size evolution in eukaryotes.
Genome Res. 2007 May;17(5):594-601.
Eukaryotic genome size varies over five orders of magnitude; however, the distribution is strongly skewed toward small values. Genome size is highly correlated to a number of phenotypic traits, suggesting that the relative lack of large genomes in eukaryotes is due to selective removal. Using phylogenetic contrasts, we show that the rate of genome size evolution is proportional to genome size, with the fastest rates occurring in the largest genomes. This trend is evident across the 20 major eukaryotic clades analyzed, indicating that over long time scales, proportional change is the dominant and universal mode of genome-size evolution in eukaryotes. Our results reveal that the evolution of eukaryotic genome size can be described by a simple proportional model of evolution. This model explains the skewed distribution of eukaryotic genome sizes without invoking strong selection against large genomes. [Abstract/Link to Full Text]
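The claim that proportional change alone can produce a distribution skewed toward small values can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the authors' phylogenetic-contrast analysis: each lineage is modeled as an independent multiplicative random walk, and the number of lineages, number of steps, step size `sigma`, and starting size are arbitrary illustrative values.

```python
import math
import random
from statistics import mean, median

def simulate_sizes(n_lineages=2000, n_steps=100, sigma=0.1, start=1.0, seed=0):
    """Simulate genome sizes when each change is proportional to current size."""
    rng = random.Random(seed)
    sizes = []
    for _ in range(n_lineages):
        size = start
        for _ in range(n_steps):
            # Multiplicative step: larger genomes change by larger absolute amounts.
            size *= math.exp(rng.gauss(0.0, sigma))
        sizes.append(size)
    return sizes

sizes = simulate_sizes()
log_sizes = [math.log(s) for s in sizes]
print(f"mean = {mean(sizes):.2f}, median = {median(sizes):.2f}")
```

Under these assumptions the logarithm of size performs an ordinary random walk, so sizes are roughly log-normal: most lineages stay small while a few become very large, and the mean exceeds the median without any selection term in the model.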

Recent Articles in Journal of Applied Genetics

Bobkowski W, Sobieszczańska M, Turska-Kmieć A, Nowak A, Jagielski J, Gonerska M, Lebioda A, Siwińska A
Mutation of the MYH7 gene in a child with hypertrophic cardiomyopathy and Wolff-Parkinson-White syndrome.
J Appl Genet. 2007;48(2):185-8.
Familial hypertrophic cardiomyopathy (HCM) displays autosomal dominant inheritance with incomplete penetrance of defective genes. Data concerning the familial occurrence of ventricular preexcitation, i.e. Wolff-Parkinson-White (WPW) syndrome, also indicate autosomal dominant inheritance. In the literature, only a gene mutation on chromosome 7q3 has been described in familial HCM coexisting with WPW syndrome to date. The present paper describes the case of a 7-year-old boy with HCM and coexisting WPW syndrome. On his chromosome 14, molecular diagnostics revealed a C 9123 mutation (arginine changed into cysteine in position 453) in exon 14 in a copy of the gene for beta-myosin heavy chain (MYH7). It is the first known case of mutation of the MYH7 gene in a child with both HCM and WPW. Since no linkage between MYH7 mutation and HCM with WPW syndrome has been reported to date, we cannot conclude whether the observed mutation is a common cause for both diseases, or this patient presents an incidental co-occurrence of HCM (caused by MYH7 mutation) and WPW syndrome. [Abstract/Link to Full Text]

Borkowska E, Binka-Kowalska A, Constantinou M, Nawrocka A, Matych J, Kałuzewski B
P53 mutations in urinary bladder cancer patients from Central Poland.
J Appl Genet. 2007;48(2):177-83.
The present study aimed at detection of P53 gene mutations in cells of urinary bladder neoplasms, as the mutations may be regarded as an independent prognostic factor for progression and recurrence of tumours. In the study, 82 patients with clinically diagnosed urinary bladder tumour were included. The control was composed of DNA samples from urine and blood of 202 healthy patients. Exons 5-8 of the P53 gene were screened for mutations by using multitemperature single-strand conformational polymorphism (MSSCP) analysis. Samples with abnormal MSSCP patterns were subjected to direct sequencing. The frequency of mutations in exons 5-8 of the P53 gene in patients with bladder cancer was lower (3.3% in grade G1, 24% in G2, and 39% in G3) than the data reported in the literature. We found a higher percentage of polymorphism at codon 213 of the P53 gene in bladder cancer patients (6%), compared with the values in the reference group (2.5%). These results were matched with those of the loss of heterozygosity (LOH) analysis. In conclusion, mutations were found mainly in more advanced histopathological and clinical stages of the disease and at the CIS stage (carcinoma in situ). It cannot be excluded that the observed polymorphism at codon 213 may be a predisposing factor for urinary bladder carcinoma development. [Abstract/Link to Full Text]

Pietrzak J, Mrasek K, Obersztyn E, Stankiewicz P, Kosyakova N, Weise A, Cheung SW, Cai WW, von Eggeling F, Mazurczak T, Bocian E, Liehr T
Molecular cytogenetic characterization of eight small supernumerary marker chromosomes originating from chromosomes 2, 4, 8, 18, and 21 in three patients.
J Appl Genet. 2007;48(2):167-75.
Small supernumerary marker chromosomes (sSMCs) are a morphologically heterogeneous group of additional structurally abnormal chromosomes that cannot be identified unambiguously by conventional banding techniques alone. Molecular cytogenetic methods enable detailed characterization of sSMCs; however, in many cases interpretation of their clinical significance is problematic. The aim of our study was to characterize precisely sSMCs identified in three patients with dysmorphic features, psychomotor retardation and multiple congenital anomalies. We also attempted to correlate the patients' genotypes with phenotypes by inclusion of data from the literature. The sSMCs were initially detected by G-banding analysis in peripheral blood lymphocytes in these patients and were subsequently characterized using multicolor fluorescence in situ hybridization (M-FISH), (sub)centromere-specific multicolor FISH (cenM-FISH, subcenM-FISH), and multicolor banding (MCB) techniques. Additionally, the sSMCs in two patients were also studied by hybridization to whole-genome bacterial artificial chromosome (BAC) arrays (array-CGH) to map the breakpoints on a single BAC clone level. In all three patients, the chromosome origin, structure, and euchromatin content of the sSMCs were determined. In patient RS, only a neocentric r(2)(q35q36) was identified. It is a second neocentric sSMC(2) in the literature and the first marker chromosome derived from the terminal part of 2q. In the other two patients, two sSMCs were found, as M-FISH detected additional sSMCs that could not be characterized in G-banding analysis. In patient MK, each of four cell lines contained der(4)(:p11.1-->q12:) accompanied by a sSMC(18): r(18)(:p11.2-->q11.1::p11.2-->q11.1:), inv dup(18)(:p11.1-->q11.1::q11.1-->p11.1:), or der(18) (:p11.2-->q11.1::q11.1-->p11.1:). 
In patient NP, with clinical features of trisomy 8p, three sSMCs were characterized: r(8)(:p12-->q11.1::q11.1-->p21:) der(8) (:p11.22-->q11.1::q11.1-->p21::p21-->p11.22:) and der(21)(:p11.1-->q21.3:). The BAC array results confirmed the molecular cytogenetic results and refined the breakpoints to the single BAC clone resolution. However, the complex mosaic structure of the marker chromosomes derived from chromosomes 8 and 18 could only be identified by molecular cytogenetic methods. This study confirms the usefulness of multicolor FISH combined with whole-genome arrays for comprehensive analyses of marker chromosomes. [Abstract/Link to Full Text]

Kowalczyk M, Srebniak M, Tomaszewska A
Chromosome abnormalities without phenotypic consequences.
J Appl Genet. 2007;48(2):157-66.
Some changes in chromosome morphology, detected during cytogenetic analysis, are not associated with clinical defects. Therefore a proper discrimination of harmless variants from true abnormalities, especially during prenatal diagnosis, is crucial to allow precise counseling. In this review we described chromosome variants and examples of chromosome anomalies that are considered to be unrelated to phenotypic consequences. The correlation between the presence of marker chromosomes and a risk of clinical signs is also discussed. Structural rearrangements of heterochromatic material, satellite polymorphism, or fragile sites, are well-known examples of common chromosome variation. However, the absence of clinical effects has also been reported in some cases of chromosome abnormalities concerning euchromatin. Such euchromatic anomalies were divided into 2 categories: unbalanced chromosome abnormalities (UBCAs), such as deletions or duplications, and euchromatic variants (EVs). Recently so-called molecular karyotyping, especially whole-genome screening by the use of high-resolution array-CGH technique, contributed to revealing a high number of previously unknown small genomic variations, which seem to be asymptomatic, as they are present in phenotypically normal individuals. [Abstract/Link to Full Text]

Patel RK, Singh KM, Soni KJ, Chauhan JB, Sambasiva Rao KR
Low incidence of bovine leukocyte adhesion deficiency (BLAD) carriers in Indian cattle and buffalo breeds.
J Appl Genet. 2007;48(2):153-5.
BLAD is an autosomal recessive genetic disease that affects Holstein-Friesian (HF) cattle worldwide. It is a disease characterized by a reduced expression of the adhesion molecules on neutrophils. The disease is caused by a mutation that replaces adenine at 383 with guanine, which causes an amino acid change from aspartic acid to glycine. Blood samples and a few semen samples were collected from 1250 phenotypically normal individuals, including HF (N=377), HF crossbred (N=334), Jersey (105), other breeds of cattle (N=160) and water buffalo Bubalus bubalis (N=274) belonging to various artificial insemination stations, bull mother farms (BMFs) and embryo transfer (ET) centres across the country. PCR-RFLP was performed to detect a point mutation in CD18, surface molecules of neutrophils. The results indicate that out of 1250 cattle and buffaloes tested for BLAD, 13 HF purebreds out of 377 and 10 HF crossbreds out of 334 appear to be BLAD carriers. In the HF and HF crossbred population, the percentage of BLAD carriers was estimated as 3.23%. The condition is alarming as the mutant gene has already entered the HF crossbred cattle population and therefore, the population of HF and its crossbreds needs regular screening to avoid the risk of spreading BLAD in the breeding cattle population of India. [Abstract/Link to Full Text]
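The carrier percentage quoted above can be checked directly from the counts given in the abstract. The short Python sketch below does that arithmetic; the mutant allele frequency at the end rests on the added assumption that every detected carrier is heterozygous, which the abstract implies (all animals were phenotypically normal) but does not state as a number.

```python
# Counts reported in the abstract: (carriers, animals tested)
groups = {
    "HF purebred": (13, 377),
    "HF crossbred": (10, 334),
}


def carrier_percentage(carriers, tested):
    """Percentage of tested animals identified as carriers."""
    return 100.0 * carriers / tested


total_carriers = sum(c for c, _ in groups.values())
total_tested = sum(n for _, n in groups.values())
combined_pct = carrier_percentage(total_carriers, total_tested)

# Assumption: each carrier is heterozygous, contributing one mutant allele
# out of the two alleles it carries.
mutant_allele_freq = total_carriers / (2 * total_tested)

for name, (c, n) in groups.items():
    print(f"{name}: {carrier_percentage(c, n):.2f}%")
print(f"Combined HF and HF crossbred: {combined_pct:.2f}%")
print(f"Mutant allele frequency (assumed heterozygous carriers): {mutant_allele_freq:.4f}")
```

The combined figure reproduces the 3.23% reported for the HF and HF crossbred population (23 carriers among 711 animals).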

Corrêa MJ, da Mota MD
Genetic evaluation of performance traits in Brazilian Quarter Horse.
J Appl Genet. 2007;48(2):145-51.
The aim of this study was to estimate genetic parameters for racing performance traits in Quarter Horses in Brazil. The data (provided by the Sorocaba Jockey Club) came from 3 Brazilian hippodromes in 1994-2003, with 11 875 observations of race time and 7775 of the speed index (SI), distributed in 2403 and 2169 races, respectively. The variance components were estimated by the MTGSAM program, under animal models including the random additive genetic effect, random permanent environmental effect, and the fixed effects of sex, age and race. Heritabilities for race time and the SI, for the 3 distances studied (301, 365 and 402 m), varied from 0.26 to 0.41 and from 0.14 to 0.19, respectively, whereas repeatabilities varied from 0.36 to 0.68 (time) and from 0.27 to 0.42 (SI) and the genetic correlations from 0.90 to 0.97 (time) and from 0.67 to 0.73 (SI). [Abstract/Link to Full Text]
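For readers less familiar with how heritability and repeatability follow from the variance components of such an animal model, here is a minimal Python sketch. It assumes the usual definitions for a repeatability model (additive genetic, permanent environmental, and residual variances); the numeric variances are hypothetical, scaled so that they reproduce the lower end of the ranges reported for race time, and are not values from the study or from MTGSAM output.

```python
def heritability(va, vpe, ve):
    """Additive genetic variance as a share of total phenotypic variance."""
    return va / (va + vpe + ve)


def repeatability(va, vpe, ve):
    """Share of phenotypic variance due to permanent (genetic + environmental) effects."""
    return (va + vpe) / (va + vpe + ve)


# Hypothetical variance components (arbitrary units), for illustration only.
va = 0.26   # additive genetic variance
vpe = 0.10  # permanent environmental variance
ve = 0.64   # residual (temporary environmental) variance

h2 = heritability(va, vpe, ve)
r = repeatability(va, vpe, ve)

print(f"heritability  = {h2:.2f}")
print(f"repeatability = {r:.2f}")
```

Because the permanent environmental variance is added to the numerator, repeatability under these definitions can never be lower than heritability, which is consistent with the ranges in the abstract.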

Aksu S, Koczan D, Renne U, Thiesen HJ, Brockmann GA
Differentially expressed genes in adipose tissues of high body weight-selected (obese) and unselected (lean) mouse lines.
J Appl Genet. 2007;48(2):133-43.
Recently, quantitative trait loci (QTLs) for body weight and obesity have been mapped in an intercross population between the high body weight-selected mouse line DU6i and the inbred line DBA/2. Most QTLs were highly significant but had only small effects. Under the hypothesis that small-effect QTLs might result from changes in gene activity, our strategy to identify candidate genes for the observed effects was directed towards the identification of differentially expressed genes. Therefore, here we compare the transcription profile of about 11 000 genes in epididymal fat tissues of males of two high body weight-selected (DU6 and DU6i) and two unselected mouse lines (DUKs and DBA/2). For the hybridisation of GeneChips, we used pooled samples of 20 individual mice. By pair-wise comparisons between selected and unselected mouse lines, a set of 77 genes was identified whose level of expression differed between obese and lean mouse strains. According to the functional classification of genes, 69 differentially expressed genes were involved in regulatory and metabolic pathways, cell division, cell stability, or immune response, and thus might have an effect on body weight and fat accumulation. Fourteen of these genes occur in QTL regions for body weight or abdominal fat weight. Further analyses are necessary to discriminate between genes directly causing QTL effects and indirectly regulated differentially expressed genes. [Abstract/Link to Full Text]

Gebler P, Wolko Ł, Knaflewski M
Identification of molecular markers for selection of supermale (YY) asparagus plants.
J Appl Genet. 2007;48(2):129-31.
The research aimed to develop a method for distinguishing male plants (XY, YY) from female ones (XX), as well as for identifying supermale genotypes (YY) among male phenotypes. The population obtained by self-pollination of andromonoecious plants was analysed. In order to identify the bands differentiating the male from the female genotypes, Bulk Segregant Analysis (BSA) was carried out. Primers identified by BSA were used for RAPD amplification on the template of the male and female individuals. Among the products obtained by the use of primer OPB-20, some bands were linked with sex. A band of about 700 bp was found in all female plants, and in 4 phenotypically male specimens. In the male plants, the band showed a much lower intensity, compared with the female specimens. It seems that this fragment can be linked to the X chromosome in the investigated specimens. In the female specimens with XX karyotype, the template is present in two copies and hence the band intensity is twice as high as in the XY karyotype. Three male plants did not include the OPB-20-700 fragment, so they could potentially have the supermale (YY) karyotype. If the obtained marker proves useful for identification of supermale plants, it could become a valuable tool facilitating breeding work. [Abstract/Link to Full Text]

Ariyarathna C, Gunasekare K
Genetic base of tea (Camellia sinensis L.) cultivars in Sri Lanka as revealed by pedigree analysis.
J Appl Genet. 2007;48(2):125-8.
An understanding of genetic diversity and relationships among breeding materials is a prerequisite for crop improvement. Coefficient of parentage (COP) can be used to measure the genetic diversity among genotypes on the basis of pedigree information. In the present study, COP was estimated for 56 cultivars, including commercial tea cultivars developed by the Tea Research Institute of Sri Lanka and their parental lines. Mean COP of the 56 accessions studied was 0.097 and the value was raised up to 0.272 when non-related pair-wise comparisons were excluded. A single cultivar (Assam/Cambod introduction) was the nucleus of the commercial cultivars. Group mean COP of the cultivars derived from Assam/Cambod parentage was 0.17. Thirty-three percent of the pair-wise comparisons had 0.00 COP, highlighting that many cultivars were unrelated. Within the pedigree, 2 major COP clusters were identified: Assam/Cambod open-pollinated half-sib progenies, and full-sib progenies derived from crosses between Assam/Cambod and other parental lines. The elite groups within the pedigree, where Assam/Cambod parentage was concentrated, were also identified. Information generated in this study should be useful for effective utilization of available diversity in future breeding programmes as well as for proper conservation of genetic diversity in the adapted germplasm. This is the first report on estimates of genetic diversity based on COP in a woody perennial crop, such as tea. [Abstract/Link to Full Text]
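To make the pedigree-based calculation more concrete, the following Python sketch computes pairwise coefficients of parentage with the standard recursive kinship rule. The pedigree is entirely hypothetical (a founder "A" standing in for an introduction, an unrelated founder "B", two open-pollinated progenies and two full sibs), and the sketch assumes non-inbred, unrelated founders and unknown pollen parents that are unrelated to everything else. The study's own assumptions and software are not described in the abstract, so this should be read as one possible formulation only.

```python
from functools import lru_cache
from itertools import combinations

# Hypothetical pedigree: name -> (seed parent, pollen parent).
# None marks an unknown parent, assumed unrelated to all other individuals.
# Parents must be listed before their progeny.
PEDIGREE = {
    "A": (None, None),    # founder
    "B": (None, None),    # unrelated founder
    "HS1": ("A", None),   # open-pollinated progeny of A
    "HS2": ("A", None),   # open-pollinated progeny of A
    "FS1": ("A", "B"),    # controlled cross A x B
    "FS2": ("A", "B"),    # controlled cross A x B
}
ORDER = {name: i for i, name in enumerate(PEDIGREE)}


@lru_cache(maxsize=None)
def cop(x, y):
    """Coefficient of parentage (kinship) between individuals x and y."""
    if x is None or y is None:
        return 0.0
    if x == y:
        p, q = PEDIGREE[x]
        return 0.5 * (1.0 + cop(p, q))
    # Always expand the parents of the younger individual.
    if ORDER[x] < ORDER[y]:
        x, y = y, x
    p, q = PEDIGREE[x]
    return 0.5 * (cop(p, y) + cop(q, y))


pairs = list(combinations(PEDIGREE, 2))
values = {pair: cop(*pair) for pair in pairs}
mean_cop = sum(values.values()) / len(values)
related = [v for v in values.values() if v > 0]
mean_cop_related = sum(related) / len(related)

print(f"half sibs HS1-HS2: {cop('HS1', 'HS2')}")
print(f"full sibs FS1-FS2: {cop('FS1', 'FS2')}")
print(f"mean COP, all pairs: {mean_cop:.3f}")
print(f"mean COP, related pairs only: {mean_cop_related:.3f}")
```

Excluding unrelated pairs raises the mean, which mirrors the direction of the change reported in the abstract when non-related pair-wise comparisons were left out.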

Escandón AS, Zelener N, de la Torre MP, Soto S
Molecular identification of new varieties of Nierembergia linariaefolia (Graham), a native Argentinean ornamental plant.
J Appl Genet. 2007;48(2):115-23.
Six Nierembergia linariaefolia clones were selected for ornamental traits during a native germplasm development program. For fingerprinting diagnosis, 13 anchored inter-simple sequence repeat (ISSR) primers and 6 amplified fragment length polymorphism (AFLP) primer-enzyme combinations were used. Both markers revealed high levels of polymorphism, enabling genetic discrimination of the accessions analyzed by using 443 informative ISSRs and 541 AFLP markers. Both molecular techniques are suitable for monitoring genetic diversity in Nierembergia linariaefolia and, under our experimental conditions, they showed correlation coefficients of 0.629 for similarity matrices and of 0.649 in the cophenetic matrices. These results suggest that ISSRs are a good choice for DNA analysis in N. linariaefolia when simple manipulation and a low budget are required. [Abstract/Link to Full Text]

Branco CJ, Vieira EA, Malone G, Kopp MM, Malone E, Bernardes A, Mistura CC, Carvalho FI, Oliveira CA
IRAP and REMAP assessments of genetic similarity in rice.
J Appl Genet. 2007;48(2):107-13.
Rice is a model genome for cereal research, providing important information about genome structure and evolution. Retrotransposons are common components of grass genomes, showing activity at transcription, translation and integration levels. Their abundance and ability to transpose make them good potential markers. In this study, we used 2 multilocus PCR-based techniques that detect retrotransposon integration events in the genome: IRAP (inter-retrotransposon amplified polymorphism) and REMAP (retrotransposon-microsatellite amplified polymorphism). Markers derived from Tos17, a copia-like endogenous retrotransposon of rice, were used to identify genetic similarity among 51 rice cultivars (Oryza sativa L.). Genetic similarity analysis was performed by means of the Dice coefficient, and dendrograms were developed by using the average linkage distance method. A cophenetic correlation coefficient was also calculated. The clustering techniques revealed a good adjustment between matrices, with correlation coefficients of 0.74 and 0.80, or lower (0.21) but still significant, between IRAP and REMAP-based techniques. Consistent clusters were found for Japanese genotypes, while a subgroup clustered the irrigated Brazilian genotypes. [Abstract/Link to Full Text]
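The Dice coefficient used for the similarity analysis can be sketched in a few lines of Python. Band profiles are coded as 1 (band present) or 0 (band absent); the cultivar names and profiles below are hypothetical and serve only to show the calculation, not to reproduce any IRAP or REMAP data from the study.

```python
def dice(a, b):
    """Dice similarity between two binary band profiles of equal length."""
    if len(a) != len(b):
        raise ValueError("profiles must score the same set of bands")
    shared = sum(1 for x, y in zip(a, b) if x and y)  # bands present in both
    total = sum(a) + sum(b)                           # bands present in each, summed
    return 2 * shared / total if total else 0.0


# Hypothetical presence/absence profiles for three cultivars.
profiles = {
    "cultivar_1": [1, 1, 0, 1, 0, 1],
    "cultivar_2": [1, 0, 0, 1, 0, 1],
    "cultivar_3": [0, 1, 1, 0, 1, 0],
}

names = list(profiles)
similarity = {
    (m, n): dice(profiles[m], profiles[n])
    for i, m in enumerate(names)
    for n in names[i + 1:]
}

for (m, n), s in similarity.items():
    print(f"{m} vs {n}: {s:.3f}")
```

A matrix of such pairwise values is what an average-linkage clustering step would take as input; shared absences do not contribute to the Dice coefficient, which is one reason it is often preferred for dominant markers.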

Juchimiuk J, Hering B, Maluszynska J
Multicolour FISH in an analysis of chromosome aberrations induced by N-nitroso-N-methylurea and maleic hydrazide in barley cells.
J Appl Genet. 2007;48(2):99-106.
The present study is a rare example of a detailed characterization of chromosomal aberrations by identification of individual chromosomes (or chromosome arms) involved in their formation in plant cells by using fluorescent in situ hybridization (FISH). In addition, the first application of more than 2 DNA probes in FISH experiments in order to analyse chromosomal aberrations in plant cells is presented. Simultaneous FISH with 5S and 25S rDNA and, after reprobing of preparations, telomeric and centromeric DNA sequences as probes, were used to compare the cytogenetic effects of 2 chemical mutagens: N-nitroso-N-methylurea (MNU) and maleic hydrazide (MH) on root tip meristem cells of Hordeum vulgare (2n=14). The micronucleus (MN) test combined with FISH allowed the quantitative analysis of the involvement of specific chromosome fragments in micronuclei formation and thus enabled the possible origin of mutagen-induced micronuclei to be explained. Terminal deletions were most frequently caused by MH and MNU. The analysis of the frequency of micronuclei with signals of the investigated DNA probes showed differences between the frequency of MH- and MNU-induced micronuclei with specific signals. The micronuclei with 2 signals, telomeric DNA and rDNA (5S and/or 25S rDNA), were the most frequently observed in the case of both mutagens, but with a higher frequency after treatment with MH (46%) than MNU (37%). Also, 10% of MH-induced micronuclei were characterized by the presence of only telomere DNA sequences, whereas there were almost 3-fold more in the case of MNU-induced micronuclei (28%). Additionally, by using FISH with the same probes, an attempt was made to identify the origin of chromosome fragments in mitotic anaphase. [Abstract/Link to Full Text]

Rivera H, Vásquez-Velásquez AI, Ramirez-Duenas Mde L, Becerra-Solano LE
A 9p13-->p24 duplication coupled with a whole 22q translocation onto 9p24.
J Appl Genet. 2007;48(1):95-8.
We report on a 3-year-old girl with a typical 9p trisomy syndrome, whose 45-chromosome karyotype includes a 9p+. As assessed by G, C and Ag-NOR bands, the rearranged chromosome resulted from a 9p13-->p24 direct duplication coupled with a translocation of the whole 22q onto 9pter, had heterochromatin at the junction site, lacked both nucleolar organizing regions (NORs) and centromere dots at the unconstricted fusion point, and was present in all metaphases scored. FISH results: a 9p subtelomere probe gave a diminished signal on the 9p+ precisely at the duplication junction 9p24::9p13, but no labeling was observed at the 9;22 translocation site; a pancentromeric alphoid probe labeled all centromeres, and gave a distinct signal at the 9pter;22cen junction. Hence, her karyotype was 45,XX,rea(9;22)(9qter-->9p24::9p13-->9p24::22p10-->22qter).ish rea(9;22) (9psubtel+dim,pancen+). Parental chromosomes were normal. The distinctiveness of the present centromere-telomere fusion rests on the coupling of an intrachromosomal distal duplication with a whole-arm translocation including alphoid DNA onto the duplicated segment. The centromeric inertia of the residual alphoid DNA in the present case compares with the variable functional status of the chromosome 22 centromere in true heterodicentrics involving such a chromosome. [Abstract/Link to Full Text]

Salahshourifar I, Gilani MA, Vosough A, Tavakolzadeh T, Tahsili M, Mansori Z, Karimi H, Totonchi M, Gourabi H
De novo complex chromosomal rearrangement of 46, XY, t (3; 16; 8) (p26; q13; q21.2) in a non-obstructive azoospermic male.
J Appl Genet. 2007;48(1):93-4.
Complex chromosomal rearrangements (CCRs) are rare structural abnormalities that are usually associated with infertility or subfertility in male carriers. We describe the clinical and chromosomal features of a non-obstructive azoospermic male who was referred for infertility. Cytogenetic analysis showed a rearrangement involving three chromosomes, i.e. 3, 8 and 16, which caused spermatogenesis failure. [Abstract/Link to Full Text]

Morkūnienė A, Steponavičiūtė D, Utkus A, Kučinskas V
Few associations of candidate genes with nonsyndromic orofacial clefts in the population of Lithuania.
J Appl Genet. 2007;48(1):89-91.
Nonsyndromic orofacial clefting (NS-OFC) is a common complex multifactorial trait with a considerable genetic component and a number of candidate genes suggested by various approaches. Twenty biallelic and microsatellite DNA markers in the strong candidate loci TGFA, TGFB3, GABRB3, RARA, and BCL3 were analysed for allelic association with the NS-OFC phenotype in 112 nuclear families (proband + both parents) from Lithuania by using the transmission disequilibrium test (TDT). Associations were found between the TGFA gene marker rs2166975 and nonsyndromic cleft palate only (CPO) phenotype (p = 0.045, df 1) as well as between the D2S292 marker and the cleft lip with or without cleft palate (CL/CP) phenotype in allele-wise TDT (P = 0.005, df 9) and genotype-wise TDT (P = 0.021, df 24). A weak association (P = 0.085, df 3) of the BCL3 marker (BCL3 gene) with the risk of CPO was also found. Thus our initial results support the contribution of allelic variation in the TGFA locus to the aetiology of CL/CP in the population of Lithuania but they do not point to TGFA as a major causal gene. Different roles of the TGFA and BCL3 genes in the susceptibility to NS-OFC phenotypes are suggested. [Abstract/Link to Full Text]
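For a biallelic marker, the transmission disequilibrium test reduces to a McNemar-type statistic on transmissions from heterozygous parents, with 1 degree of freedom. The Python sketch below shows that basic form; the transmission counts are hypothetical and are not data from the study, and the multiallelic allele-wise and genotype-wise tests reported with df 9 and df 24 are extensions that are not covered here.

```python
import math


def tdt(transmitted, untransmitted):
    """Biallelic TDT: chi-square statistic (1 df) and its p-value.

    transmitted   -- times heterozygous parents passed the test allele to the proband
    untransmitted -- times heterozygous parents passed the other allele instead
    """
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative (heterozygous) parents")
    chi2 = (transmitted - untransmitted) ** 2 / total
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts from heterozygous parents, for illustration only.
chi2, p_value = tdt(transmitted=30, untransmitted=16)
print(f"chi-square = {chi2:.2f}, p = {p_value:.3f}")
```

Only heterozygous parents are informative, which is why the trio design (proband plus both parents) described in the abstract is needed.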

Skrzypczak U, Rutkiewicz E, Pogorzelski A, Witt M, Zietkiewicz E
Carrier status for 3 most frequent CFTR mutations in Polish PCD/KS patients: lack of association with the primary ciliary dyskinesia phenotype.
J Appl Genet. 2007;48(1):85-8.
We screened a large group of primary ciliary dyskinesia/Kartagener syndrome (PCD/KS) patients and their siblings (148 patients from 126 unrelated families) for the presence of the CFTR mutations that are most frequently found in the Polish population: the severe F508del and 2,3del21kb, and the mild 3849+10kbC > T. No statistically significant increase in the frequency of these mutations was found in the studied group, as compared with the general population. This is consistent with an earlier observation in another population and indicates that the status of being a carrier of any of these CFTR mutations should not be considered as an important risk factor in PCD/KS pathogenesis. [Abstract/Link to Full Text]

Jółkowska J, Derwich K, Dawidowska M
Methods of minimal residual disease (MRD) detection in childhood haematological malignancies.
J Appl Genet. 2007;48(1):77-83.
The appropriate management of haematological disorders must rely on precise, long-term monitoring of the patient's response to chemotherapy and radiotherapy. Clinical data alone are not sufficient, which is why in the last decade it has become essential to improve the knowledge of haematological diseases on the basis of molecular techniques and molecular markers. The presence of residual malignant cells among normal cells is termed minimal residual disease (MRD). Great progress has been made in the treatment of malignant diseases and in the development of reliable molecular techniques, which are characterised by high sensitivity (10^-3 to 10^-6) and the ability to distinguish between normal and malignant cells at diagnosis and during follow-up. In particular, MRD data based on quantitative analysis (RQ-PCR, RT-RQ-PCR) appear to be crucial for appropriate evaluation of treatment response in many haematological malignancies. Implementation of standardized approaches for MRD assessment in routine molecular diagnostics, available in all oncohaematological centres, should now be regarded as a crucial step in the further development of MRD studies. [Abstract/Link to Full Text]

Szczerbal I, Lin L, Stachowiak M, Chmurzynska A, Mackowski M, Winter A, Flisikowski K, Fries R, Switonski M
Cytogenetic mapping of DGAT1, PPARA, ADIPOR1 and CREB genes in the pig.
J Appl Genet. 2007;48(1):73-6.
In the present study we show FISH localization of 4 porcine BAC clones harbouring potential candidate genes for fatness traits: DGAT1 (SSC4p15), PPARA (SSC5p15), ADIPOR1 (SSC10p13) and CREB (SSC15q24). To date, the CREB and ADIPOR1 genes have been considered monomorphic and DGAT1 highly polymorphic, while only 1 SNP has been identified in the PPARA gene. The assignment of the studied genes in relation to QTL chromosome regions for meat quality on pig chromosomes SSC4, SSC5, SSC10 and SSC15 is discussed. [Abstract/Link to Full Text]

Czarnik U, Zabolewicz T, Strychalski J, Grzybowski G, Bogusz M, Walawski K
Deletion/insertion polymorphism of the prion protein gene (PRNP) in Polish Holstein-Friesian cattle.
J Appl Genet. 2007;48(1):69-71.
The aim of the present study was to identify the deletion/insertion polymorphism of the bovine prion protein gene (PRNP) within the promoter sequence (23 bp), intron 1 (12 bp) and 3' untranslated region (14 bp). DNA was isolated from blood of 234 randomly tested Polish Holstein-Friesian cows and from semen of 47 sires used for artificial insemination (AI) in 2004. No statistically significant differences were found in the frequency of genotypes and alleles between cows and breeding bulls in the 3 analysed polymorphic sites within the PRNP gene. Only 3 haplotypes were identified in sires and 4 haplotypes in cows. [Abstract/Link to Full Text]

Chung H, Choi B, Jang G, Lee K, Kim H, Yoon S, Im S, Davis M, Hines H
Effect of variants in the ovine skeletal-muscle-specific calpain gene on body weight.
J Appl Genet. 2007;48(1):61-8.
The ovine skeletal-muscle-specific calpain gene (p94), which is known also as the n-calpain or calpain 3 gene (CAPN3), was screened with primers. Selection of the PCR primers was based on the ovine cDNA sequence (GenBank accession No. AF087570). After sequence alignment between the ovine and human (AY902237) genes, exon and intron boundaries were determined. Polymorphisms were observed in the intron region for the CAPN31112 and CAPN31213 segments, and the sequences for these segments were submitted to the GenBank (AF309635 and AY102617, respectively). Body weight was recorded at birth, weaning and post-weaning. Calpain 3 genotypes of the CAPN31112 segment were associated with birth weight (P < 0.01), and a dominant gene effect was observed. Breeding group, birth type, and rearing type were significantly associated with weight traits. Allele frequencies were similar in purebred and crossbred animals. [Abstract/Link to Full Text]

Melo EO, Canavessi AM, Franco MM, Rumpf R
Animal transgenesis: state of the art and applications.
J Appl Genet. 2007;48(1):47-61.
There is a constant expectation of rapid improvement in livestock production and human health care products. The advent of recombinant DNA technology and the possibility of gene transfer between organisms of distinct species, or even distinct phylogenetic kingdoms, has opened a wide range of possibilities. Nowadays we can produce human insulin in bacteria or human coagulation factors in cattle milk. The recent advances in gene transfer, animal cloning, and assisted reproductive techniques have partly fulfilled the expectation in the field of livestock transgenesis. This paper reviews the recent advances and applications of transgenesis in livestock and their derivative products. First, the state of the art and the techniques that enhance the efficiency of livestock transgenesis are presented. The consequent reduction in the cost and time necessary to reach a final product has enabled the multiplication of transgenic prototypes around the world. We also analyze some emerging applications of livestock transgenesis in the fields of pharmacology, the meat and dairy industry, xenotransplantation, and human disease modeling. Finally, some bioethical and commercial concerns raised by the applications of transgenesis are discussed. [Abstract/Link to Full Text]

Feng J, Zhang Z, Li G, Zhou Y, Wang H, Guo Q, Sun J
Inheritance of resistance to stripe rust in winter wheat cultivars Aquileja and Xian Nong 4.
J Appl Genet. 2007;48(1):43-6.
Winter wheat cultivars Aquileja (AQ) and Xian Nong 4 (XN) were previously reported to possess durable, quantitative resistance to stripe rust disease. In the present study, AQ, XN and a susceptible wheat cultivar were reciprocally crossed in all 6 combinations. Parents, F1, F2, F3, BCP1 and BCP2 were used to determine quantitative genetic parameters for infection type and disease severity. The results showed that fixable genetic components predominated in the inheritance of the resistance in AQ and XN for both infection type and disease severity, while a dominant component could be detected in some cases. The resistance was conditioned by oligogenes. Heritability of the resistance ranged from 50 to 79% in most cases. [Abstract/Link to Full Text]

Wang HY, Wei YM, Yan ZH, Zheng YL
EST-SSR DNA polymorphism in durum wheat (Triticum durum L.) collections.
J Appl Genet. 2007;48(1):35-42.
SSRs derived from ESTs are molecular markers located in the transcribed region of the genome. Therefore, polymorphism detected using EST-SSRs might better reflect the relationships among species or varieties. Using wheat EST-SSR markers, 60 durum wheat (Triticum durum L.) accessions from seven countries were investigated. Twenty-five primer pairs amplified successfully in the 60 durum wheat accessions; tri-nucleotide repeats were the dominant type among them, and the primer pairs revealed 26 loci on all seven wheat homologous chromosome groups. A total of 87 eSSR alleles were detected, and the number of alleles detected by a single pair of primers ranged from 1 to 11, with an average of 3.3 alleles per locus. Higher numbers of alleles and higher PIC values were identified on the B genome than on the A genome. [Abstract/Link to Full Text]
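The average number of alleles per locus follows from the totals in the abstract (87 alleles over 26 loci), and a per-locus PIC value can be computed from allele frequencies. The Python sketch below assumes the commonly used Botstein-style PIC formula; the abstract does not say which formula the authors applied, and the allele frequencies shown are hypothetical.

```python
from itertools import combinations


def pic(freqs):
    """Polymorphism information content of one locus from its allele frequencies.

    Assumes the formula PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2.
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


# Totals taken from the abstract.
total_alleles = 87
total_loci = 26
mean_alleles_per_locus = total_alleles / total_loci

# Hypothetical allele frequencies at a single three-allele locus.
example_pic = pic([0.6, 0.3, 0.1])

print(f"mean alleles per locus: {mean_alleles_per_locus:.1f}")
print(f"PIC for the hypothetical locus: {example_pic:.3f}")
```

A monomorphic locus gives a PIC of 0 under this formula, so loci where a primer pair detected only one allele contribute nothing to genome-level PIC comparisons.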

Akond MA, Watanabe N, Furuta Y
Exploration of genetic diversity among Xinjiang Triticum and Triticum polonicum by AFLP markers.
J Appl Genet. 2007;48(1):25-33.
Seventy-two Xinjiang Triticum and Triticum polonicum accessions were subjected to AFLP analyses to discuss the origin of Triticum petropavlovskyi. A total of 91 putative loci were produced by four primer combinations. Among them 56 loci were polymorphic, which is equivalent to 61.53 % of the total number of putative loci. Genetic diversity among 11 T. petropavlovskyi accessions was narrow due to the lowest number (32) of polymorphic loci among the wheat species. Forty four polymorphic loci were found in T. aestivum and T. compactum, whereas the highest polymorphism was observed in T. polonicum. On the basis of the UPGMA clustering and PCO grouping and genetic similarity estimates from the AFLPs, we noted that T. petropavlovskyi was more closely related to the Chinese accessions of T. polonicum than to T. polonicum from other countries. Two accessions of T. aestivum were grouped with T. petropavlovskyi in the UPGMA clustering. Both of them were similar to T. petropavlovskyi in respect of spike structure, i.e. the presence of awn, glume awn and also the presence of leaf pubescence. Six loci, which were commonly absent in Chinese T. polonicum, were also absent in almost all of the T. petropavlovskyi accessions. Findings of this study reduced the probability of an independent allopolyploidization event in the origin of T. petropavlovskyi and indicated a greater degree of gene flow between T. aestivum and T. polonicum leading to T. petropavlovskyi. It is most likely that the P-gene of T. petropavlovskyi hexaploid wheat was introduced from T. polonicum to T. aestivum via a spontaneous introgression or breeding effort. [Abstract/Link to Full Text]

Milczarski P, Banek-Tabor A, Lebiecka K, Stojałowski S, Myśków B, Masojć P
New genetic map of rye composed of PCR-based molecular markers and its alignment with the reference map of the DS2 x RXL10 intercross.
J Appl Genet. 2007;48(1):11-24.
A new genetic map of rye, developed by using the 541 x Ot1-3 F2 intercross, consists of 148 marker loci, including 99 RAPDs, 18 SSRs, 14 STSs, 9 SCARs and 7 ISSRs, and spans the distance of 1401.4 cM. To the 7 rye chromosomes, 8 linkage groups were assigned and compared with the reference map of the DS2 x RXL10 F2 intercross by using 24 common markers. The 2 combined maps contain altogether 611 marker loci (70-109 per chromosome) and constitute a substantial source of information useful for further genomic studies in rye. From 21 to 37 RAPD marker loci are distributed randomly along each chromosome length and their total number for all 7 rye chromosomes is 177. This abundance of RAPD marker loci in the rye genetic map can be exploited for development of SCARs in regions containing important genes or QTL. [Abstract/Link to Full Text]

Bartoszewski G, Havey MJ, Ziółkowska A, Długosz M, Malepszy S
The selection of mosaic (MSC) phenotype after passage of cucumber (Cucumis sativus L.) through cell culture - a method to obtain plant mitochondrial mutants.
J Appl Genet. 2007;48(1):1-9.
Mosaic (MSC) mutants of cucumber (Cucumis sativus L.) appear after passage through cell cultures. The MSC phenotype shows paternal transmission and is associated with mitochondrial DNA rearrangements. This review describes the origins and phenotypes of independently produced MSC mutants of cucumber, including current knowledge on their mitochondrial DNA rearrangements, and similarities of MSC with other plant mitochondrial mutants. Finally we propose that passage of cucumber through cell culture can be used as a unique and efficient method to generate mitochondrial mutants of a higher plant in a highly homozygous nuclear background. [Abstract/Link to Full Text]

Ramegowda S, Gawde HM, Hyderi A, Savitha MR, Patel ZM, Krishnamurthy B, Ramachandra NB
De novo isochromosome 18p in a female dysmorphic child.
J Appl Genet. 2006;47(4):397-401.
Isochromosome 18p results in tetrasomy 18p. Most of the i(18p) cases reported so far in the literature are sporadic due to de novo formation, while familial and mosaic cases are infrequent. It is a rare chromosomal abnormality, occurring once in every 140,000 livebirths, affecting males and females equally. In the present investigation, we report a de novo i(18p) in a female dysmorphic child. The small metacentric marker chromosome was confirmed as i(18p) in the proband by cytogenetic and FISH analysis [47,XX+i(18p)]. Cytogenetic investigations in the family members revealed normal chromosome numbers, indicating the case as a de novo event of i(18p) formation. It could be due to the somewhat advanced maternal age (32 years) and/or expression of recessive genes in the proband, who is the progeny of consanguineous marriage, which could have led to misdivision and nondisjunction of chromosome 18 in meiosis I, followed by failure in the chromatid separation of 18p in meiosis II and by inverted duplication. [Abstract/Link to Full Text]

Sankar VH, Arya V, Tewari D, Gupta UR, Pradhan M, Agarwal S
Genotyping of alpha-thalassemia in microcytic hypochromic anemia patients from North India.
J Appl Genet. 2006;47(4):391-5.
Microcytic hypochromic anemia is a common condition in clinical practice and alpha-thalassemia has to be considered as a differential diagnosis. Molecular diagnosis of alpha-thalassemia is possible by polymerase chain reaction. The aim of this study was to evaluate the frequency of alpha-gene numbers in subjects with microcytosis. In total, 276 subjects with microcytic hypochromic anemia [MCV < 80 fL; MCH < 27 pg] were studied. These include 125 with thalassemia trait, 48 with thalassemia major, 26 with sickle-cell thalassemia, 15 with E beta-thalassemia, 40 with iron-deficiency anemia, 8 with another hemolytic anemia, and 14 patients with no definite diagnosis. Genotyping for -alpha3.7 deletion, -alpha4.2 deletion, Hb Constant Spring, and alpha-triplications was done with polymerase chain reaction. The overall frequency of -alpha3.7 deletion in 276 individuals is 12.7%. The calculated allele frequency for alpha-thalassemia is 0.09. The subgroup analysis showed that co-inheritance of alpha-deletion is more frequent with the sickle-cell mutation than in other groups. We were able to diagnose 1/3 of unexplained cases of microcytosis as alpha-thalassemia carriers. The alpha-gene mutation is quite common in the Indian subcontinent. Molecular genotyping of alpha-thalassemia helps to diagnose unexplained microcytosis, and thus prevents unnecessary iron supplementation. [Abstract/Link to Full Text]

Binczak-Kuleta A, Rozanski J, Domanski L, Myslak M, Ciechanowski K, Ciechanowicz A
DNA microsatellite analysis in families with autosomal dominant polycystic kidney disease (ADPKD): the first Polish study.
J Appl Genet. 2006;47(4):383-9.
BACKGROUND: Autosomal dominant polycystic kidney disease (ADPKD) is one of the most common inherited renal disorders with genetic heterogeneity. Mutations of two known genes are responsible for this disease: PKD1 at 16p13.3 and PKD2 at 4q21-23. A majority of cases (85%) are caused by mutations in PKD1. Because direct mutation screening remains complex, we describe here the application of an efficient approach to studies based on highly informative dinucleotide and tetranucleotide repeats flanking genes PKD1 and PKD2. METHODS: For this study a series of microsatellites closely linked to locus PKD1 (D16S291, D16S663, D16S665, D16S283, D16S407, D16S475) and to locus PKD2 (D4S1563, D4S2929, D4S414, D4S1534, D4S423) were selected. Short (81-242 bp) DNA fragments containing the tandem repeats were amplified by polymerase chain reaction (PCR). The number of repeat units of microsatellite markers was determined by fluorescent capillary electrophoresis. RESULTS: DNA microsatellite analysis was performed in 25 Polish ADPKD families and established the type of disease (21 families PKD1-type, 1 family PKD2-type). CONCLUSIONS: While a disease-causing mutation in the PKD1 and PKD2 genes cannot be identified, DNA microsatellite analysis provided an early diagnosis and may be considered in ADPKD families. [Abstract/Link to Full Text]

Urbina-Cano P, Bobadilla-Morales L, Ramírez-Herrera MA, Corona-Rivera JR, Mendoza-Magaña ML, Troyo-Sanromán R, Corona-Rivera A
DNA damage in mouse lymphocytes exposed to curcumin and copper.
J Appl Genet. 2006;47(4):377-82.
Dietary polyphenolics, such as curcumin, have shown antioxidant and anti-inflammatory effects. Some antioxidants cause DNA strand breaks in excess of transition metal ions, such as copper. The aim of this study was to evaluate the in vitro effect of curcumin in the presence of increasing concentrations of copper to induce DNA damage in murine leukocytes by the comet assay. Balb-C mouse lymphocytes were exposed to 50 microM curcumin and various concentrations of copper (10 microM, 100 microM and 200 microM). Cellular DNA damage was detected by means of the alkaline comet assay. Our results show that 50 microM curcumin in the presence of 100-200 microM copper induced DNA damage in murine lymphocytes. Curcumin did not inhibit the oxidative DNA damage caused by 50 microM H2O2 in mouse lymphocytes. Moreover, 50 microM curcumin alone was capable of inducing DNA strand breaks under the tested conditions. The increased DNA damage by 50 microM curcumin was observed in the presence of various concentrations of copper, as detected by the alkaline comet assay. [Abstract/Link to Full Text]

Recent Articles in Genetics and Molecular Research

Arruda JT, Bordin BM, Santos PR, Mesquita WE, Silva RC, Maia MC, Approbato MS, Florêncio RS, Amaral WN, Rocha Filho MA, Moura KK
Y chromosome microdeletions in Brazilian fertility clinic patients.
Genet Mol Res. 2007;6(2):461-9.
Microdeletions in Yq are associated with defects in spermatogenesis, while those in the AZF region are considered critical for germ cell development. We examined microdeletions in the Y chromosomes of patients attended at the Laboratory of Human Reproduction of the Clinical Hospital of the Federal University of Goiás as part of a screening of patients who plan to undergo assisted reproduction. Analysis was made of the AZF region of the Y chromosome in men who had altered spermograms to detect possible microdeletions in Yq. Twenty-three patients with azoospermia and 40 with severe oligozoospermia were analyzed by PCR for the detection of six sequence-tagged sites: sY84 and sY86 for AZFa, sY127 and sY134 for AZFb, and sY254 and sY255 for AZFc. Microdeletions were detected in 28 patients, including 10 azoospermics and 18 severe oligozoospermics. The patients with azoospermia had 43.4% of their microdeletions in the AZFa region, 8.6% in the AZFb region and 17.4% in the AZFc region. In the severe oligozoospermics, 40% were in the AZFa region, 5% in the AZFb region and 5% in the AZFc region. We conclude that microdeletions can be the cause of idiopathic male infertility, supporting conclusions from previous studies. [Abstract/Link to Full Text]

Ondei LS, Zamaro PJ, Mangonaro PH, Valêncio CR, Bonini-Domingos CR
HPLC determination of hemoglobins to establish reference values with the aid of statistics and informatics.
Genet Mol Res. 2007;6(2):453-60.
The purpose of the present study was to establish reference values for hemoglobins (Hb) using HPLC, in samples containing normal Hb (AA), sickle cell trait without alpha-thalassemia (AS), sickle cell trait with alpha-thalassemia (ASH), sickle cell anemia (SS), and Hb SC disease (SC). The blood samples were analyzed by electrophoresis, HPLC and molecular procedures. The Hb A2 mean was 4.30 +/- 0.44% in AS, 4.18 +/- 0.42% in ASH, 3.90 +/- 1.14% in SS, and 4.39 +/- 0.35% in SC. They were similar, but above the normal range. Between the AS and ASH groups, only the amount of Hb S was higher in the AS group. The Hb S mean in the AS group was 38.54 +/- 3.01% and in the ASH it was 36.54 +/- 3.76%. In the qualitative analysis, using FastMap, distinct groups were seen: AA and SS located at opposite extremes, AS and ASH with overlapping values and intermediate distribution, SC between heterozygotes and the SS group. Hb S was confirmed by allele-specific polymerase chain reaction. The Hb values established will be available for use as a reference for the Brazilian population, drawing attention to the increased levels of Hb A2, which should be considered with caution to prevent incorrect diagnoses. [Abstract/Link to Full Text]

Abud S, de Souza PI, Vianna GR, Leonardecz E, Moreira CT, Faleiro FG, Júnior JN, Monteiro PM, Rech EL, Aragão FJ
Gene flow from transgenic to nontransgenic soybean plants in the Cerrado region of Brazil.
Genet Mol Res. 2007;6(2):445-52.
Evaluation of transgenic crops under field conditions is a fundamental step for the production of genetically engineered varieties. In order to determine if there is pollen dispersal from transgenic to nontransgenic soybean plants, a field release experiment was conducted in the Cerrado region of Brazil. Nontransgenic plants were cultivated in plots surrounding Roundup Ready transgenic plants carrying the cp4 epsps gene, which confers herbicide tolerance against glyphosate herbicide, and pollen dispersal was evaluated by checking for the dominant gene. The percentage of cross-pollination was calculated as a fraction of herbicide-tolerant and -nontolerant plants. The greatest amount of transgenic pollen dispersion was observed in the first row, located at one meter from the central (transgenic) plot, with a 0.52% average frequency. The frequency of pollen dispersion decreased to 0.12% in row 2, reaching 0% when the plants were up to 10 m distance from the central plot. Under these conditions pollen flow was higher for a short distance. This fact suggests that the management necessary to avoid cross-pollination from transgenic to nontransgenic plants in the seed production fields should be similar to the procedures currently utilized to produce commercial seeds. [Abstract/Link to Full Text]

DeGroot BJ, Keown JF, Van Vleck LD, Kachman SD
Estimates of genetic parameters for Holstein cows for test-day yield traits with a random regression cubic spline model.
Genet Mol Res. 2007;6(2):434-44.
Genetic parameters were estimated with restricted maximum likelihood for individual test-day milk, fat, and protein yields and somatic cell scores with a random regression cubic spline model. Test-day records of Holstein cows that calved from 1994 through early 1999 were obtained from Dairy Records Management Systems in Raleigh, North Carolina, for the analysis. Estimates of heritability for individual test-days and estimates of genetic and phenotypic correlations between test-days were obtained from estimates of variances and covariances from the cubic spline analysis. Estimates were calculated of genetic parameters for the averages of the test days within each of the ten 30-day test intervals. The model included herd test-day, age at first calving, and bovine somatotropin treatment as fixed factors. Cubic splines were fitted for the overall lactation curve and for random additive genetic and permanent environmental effects, with five predetermined knots or four intervals between days 0, 50, 135, 220, and 305. Estimates of heritability for lactation one ranged from 0.10 to 0.15, 0.06 to 0.10, 0.09 to 0.15, and 0.02 to 0.06 for test-day one to test-day 10 for milk, fat, and protein yields and somatic cell scores, respectively. Estimates of heritability were greater in lactations two and three. Estimates of heritability increased over the course of the lactation. Estimates of genetic and phenotypic correlations were smaller for test-days further apart. [Abstract/Link to Full Text]

Mazzé FM, Fuzo CA, Ciancaglini P, Degrève L
Recognition of alpha-helix transmembrane domains with an amphipathy scale generated by molecular dynamics using only the primary sequence of proteins.
Genet Mol Res. 2007;6(2):422-33.
We recently developed an amphipathy scale, elaborated from molecular dynamics data that can be used for the identification of hydrophobic or hydrophilic regions in proteins. This amphipathy scale reflects side chain/water molecule interaction energies. We have now used this amphipathy scale to find candidates for transmembrane segments, by examining a large sample of membrane proteins with alpha-helix segments. The candidates were selected based on an amphipathy coefficient value range and the minimum number of residues in a segment. We compared our results with the transmembrane segments previously identified in the PDB_TM database by the TMDET algorithm. We expected that the hydrophobic segments would be identified using only the primary structures of the proteins and the amphipathy scale. However, some of these hydrophobic segments may pertain to hydrophobic pockets not included in transmembrane regions. We found that our amphipathy scale could identify alpha-helix transmembrane regions with a probability of success of 76% when all segments were included and 90% when all membrane proteins were included. [Abstract/Link to Full Text]
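
Selecting candidates by a coefficient range and a minimum segment length, as described above, amounts to a sliding-window scan over the primary sequence. The sketch below shows that general idea only. The authors' amphipathy scale is not given in the abstract, so the Kyte-Doolittle hydropathy values are swapped in as a stand-in, and the window length, threshold and test sequence are illustrative assumptions rather than the published parameters.

```python
# Kyte-Doolittle hydropathy values, used here only as a stand-in scale.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_means(seq, window=19):
    """Mean scale value for every full-length window along the sequence."""
    scores = [KD[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def candidate_segments(seq, window=19, threshold=1.6):
    """Merge overlapping windows whose mean reaches the threshold.

    Returns (start, end) pairs, 0-based with end exclusive.
    """
    segments = []
    for i, mean in enumerate(window_means(seq, window)):
        if mean < threshold:
            continue
        start, end = i, i + window
        if segments and start <= segments[-1][1]:
            segments[-1] = (segments[-1][0], end)  # extend previous segment
        else:
            segments.append((start, end))
    return segments


# Hypothetical sequence: a 21-residue leucine core between charged flanks.
flank = "KDEKDEKDEK"
toy = flank + "L" * 21 + flank
segs = candidate_segments(toy)
print(segs)
```

A real comparison against PDB_TM annotations would also need the per-protein segment boundaries, which are not part of this sketch.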

Bonini-Domingos CR, Silva MB, Romero RM, Zamaro PJ, Ondei LS, Zago CE, Moreira SB, Salgado CG
Description of electrophoretic and chromatographic hemoglobin profile of Rhinoclemmys punctularia.
Genet Mol Res. 2007;6(2):415-21.
Studies of the hemoglobin pattern in Brazilian reptiles are important for determining ecological and phylogenetic relationships, but they are scarce. Peripheral blood samples were obtained from 7 males and 18 females of Rhinoclemmys punctularia. The hematological profile was based on the total hemoglobin and hematocrit values. The hemoglobin profile was obtained using electrophoretic procedures at different pH, isoelectric focusing, globin chain electrophoresis, and HPLC. The hematocrit (31 +/- 2%) and total hemoglobin (7.5 +/- 0.2 g/dL) values did not indicate gender variations. Alkaline pH electrophoresis of the total blood samples treated with 1% saponin demonstrated the presence of four well-defined hemoglobin fractions, one major component (fraction I), showing cathodic migration and three others faster than fraction I with anodic migration. When the samples were precipitated with chloroform, only two hemoglobin fractions were observed, similar to fractions I and III from the first procedure. Isoelectric focusing and HPLC showed the same pattern. With acid and neutral pH electrophoresis, two fractions with anodic migration were observed. The globin chain identification at alkaline pH showed two fractions, but four fractions were observed at acidic pH, suggesting that different polypeptide chains are involved in the hemoglobin molecule. The chromatographic separation of the total blood sample demonstrated that the major fraction comprised 81.9% and the minor 18.1%. The results obtained demonstrated a similarity between these hemoglobin components and those of some Chelidae reported in the literature for both land and aquatic animals, reflecting the adaptation to environmental conditions. [Abstract/Link to Full Text]

Ferreira RC, Bosco F, Paiva PB, Briones MR
Minimization of transcriptional temporal noise and scale invariance in the yeast genome.
Genet Mol Res. 2007;6(2):297-314.
The analysis of transcriptional temporal noise could be an interesting means to study gene expression dynamics and stochasticity in eukaryotes. To study the statistical distributions of temporal noise in the eukaryotic model system Saccharomyces cerevisiae, we analyzed microarray data corresponding to one cell cycle for 6200 genes. We found that the temporal noise follows a lognormal distribution with scale invariance at the genome, chromosomal and sub-chromosomal levels. Correlation of temporal noise with the codon adaptation index suggests that at least 70% of all protein-coding genes are a noise minimization core of the genome. Accordingly, a mathematical model of individual gene expression dynamics was proposed, using an operator theoretical approach, which reveals strict conditions for noise variability and a possible global noise minimization/optimization strategy at the genome level. Our model and data show that minimal noise does not correspond to genes obeying a strictly deterministic dynamics. The natural strategy of minimization consists in equating the mean of the absolute value of the relative variation of the expression level (alpha) with noise (eta). We hypothesize that the temporal noise pattern is an emergent property of the genome and shows how the dynamics of gene expression could be related to chromosomal organization. [Abstract/Link to Full Text]

Tannure-Nascimento IC, Nascimento FS, Turatti IC, Lopes NP, Trigo JR, Zucchi R
Colony membership is reflected by variations in cuticular hydrocarbon profile in a Neotropical paper wasp, Polistes satan (Hymenoptera, Vespidae).
Genet Mol Res. 2007;6(2):290-6.
Nestmate recognition is one of the most important features in social insect colonies. Although epicuticular lipids or cuticular hydrocarbons have both structural and defensive functions in insects, they also seem to be involved in several aspects of communication in wasps, bees and ants. We analyzed and described for the first time the cuticular hydrocarbons of a Neotropical paper wasp, Polistes satan, and found that variation in hydrocarbon profile was sufficiently strong to discriminate individuals according to their colony membership. Therefore, it seems that small differences in the proportion of these compounds can be detected and used as a chemical-based cue by nestmates to detect invaders and avoid usurpation. [Abstract/Link to Full Text]

Grisolia AB, Moreno VR, Campagnari F, Milazzotto MP, Garcia JF, Adania CH, Souza EB
Genetic diversity of microsatellite loci in Leopardus pardalis, Leopardus wiedii and Leopardus tigrinus.
Genet Mol Res. 2007;6(2):282-9.
The microsatellite loci FCA045, FCA077, FCA008, and FCA096 are highly variable molecular markers which were used to determine the genetic diversity in 148 captive Leopardus sp. The PCR-amplified products of microsatellite loci were characterized in ABI Prism 310 Genetic Analyzer. Allele numbers, heterozygosity, polymorphism information content, exclusive allele number, and shared alleles were calculated. Sixty-five alleles were found and their sizes ranged from 116 to 216 bp in four microsatellite loci. The heterozygosity ranged from 0.36 to 0.81 in Leopardus pardalis, 0.57 to 0.67 in L. tigrinus and 0.80 to 0.92 in L. wiedii. The polymorphism information content was from 0.80 to 0.88 in L. pardalis, 0.76 to 0.88 in L. tigrinus and 0.77 to 0.90 in L. wiedii. The margay (L. wiedii) showed the highest index of polymorphism among the three species in this study. These results imply that microsatellite DNA markers can help in the study of the genetic diversity of Leopardus specimens. [Abstract/Link to Full Text]
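
Heterozygosity and polymorphism information content (PIC) are both functions of the allele frequencies at a locus. The sketch below uses the standard textbook definitions, expected heterozygosity 1 - sum(p_i^2) and PIC 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2. The abstract does not state which estimators or software the authors used, and the allele frequencies here are hypothetical.

```python
def _check(freqs):
    """Reject frequency lists that do not sum to 1."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")


def expected_heterozygosity(freqs):
    """Expected heterozygosity: 1 - sum(p_i^2)."""
    _check(freqs)
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphism information content for one locus."""
    _check(freqs)
    homo = sum(p * p for p in freqs)
    cross = sum(
        2.0 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return 1.0 - homo - cross


# Hypothetical locus with two equally common alleles (not study data).
two_alleles = [0.5, 0.5]
he = expected_heterozygosity(two_alleles)
p = pic(two_alleles)
print(he, p)
```

PIC is never larger than the expected heterozygosity for the same frequencies, and both rise as more alleles occur at similar frequencies.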

Basualdo M, Rodríguez EM, Bedascarrasbure E, De Jong D
Selection and estimation of the heritability of sunflower (Helianthus annuus) pollen collection behavior in Apis mellifera colonies.
Genet Mol Res. 2007;6(2):274-81.
We selected honey bee colonies (Apis mellifera L.) with a high tendency to collect sunflower pollen and estimated the heritability of this trait. The percentage of sunflower pollen collected by 74 colonies was evaluated. Five colonies that collected the highest percentages of sunflower pollen were selected. Nineteen colonies headed by daughters of these selected queens were evaluated for this characteristic in comparison with 20 control (unselected) colonies. The variation for the proportion of sunflower pollen was greater among colonies of the control group than among these selected daughter colonies. The estimated heritability was 0.26 +/- 0.23, demonstrating that selection to increase sunflower pollen collection is feasible. Such selected colonies could be used to improve sunflower pollination in commercial fields. [Abstract/Link to Full Text]

Hoenigsberg HF
From geochemistry and biochemistry to prebiotic evolution...we necessarily enter into Gánti's fluid automata.
Genet Mol Res. 2007;6(2):258-73.
The present study is just an overview of the opening of the geochemical stage for the appearance of life. But that opening would not have been sufficient for the intellectual discovery of the origin of life! The excellent works and many commendable efforts that advance this explanation have not shown the fundamental elements that participate in the theoretical frame of biological evolution. The latter imply the existence of evolutionary transitions and the production of new levels of organization. In this brief analysis we do not intend to introduce the audience to the philosophy of biology. But we do expect to provide a modest overview, in which the geochemical chemolithoautotrophic opening of the stage should be seen, at most, as the initial metabolism that enabled organic compounds to follow the road where a chemical fluid machinery was thus able to undertake the more "sublime" course of organic biological evolution. We think that Tibor Gánti's chemoton is the most significant contribution to theoretical biology, and the only course now available to comprehend the unit of evolution problem without the structuralist and functionalist conflict prevalent in theoretical biology. In our opinion Gánti's chemoton theory travels to the "locus" where evolutionary theory dares to extend itself to entities at many levels of structural organization, beyond the gene or the group above. Therefore, in this and subsequent papers on the prebiotic conditions for the eventual appearance of the genetic code, we explore the formation and the presence of metal sulfide minerals, from the assembly of metal sulfide clusters through the precipitation of nanocrystals and the further reactions resulting in bulk metal sulfide phases. We endeavor to characterize pristine reactions and the modern surfaces, utilizing traditional surface science techniques and computational methods. Moreover, mechanistic details of the overall oxidation of metal sulfide minerals are set forth. 
We hope that this paper will lead our audience to accept that in a chemically oscillating system the chemoton is a model fluid state automaton capable of growth and self-reproduction. This is not simply a matter of transmitting a pattern, as in inorganic crystals; such self-reproduction must be more complex than crystal growth. Indeed that is what Gánti's theoretical and abstract model offers to us all: we finally have a philosophy of evolutionary units in theoretical biology. [Abstract/Link to Full Text]

Salim DC, Akimoto AA, Carvalho CB, Oliveira SF, Grisolia CK, Moreira JR, Klautau-Guimarães MN
Genetic variability in maned wolf based on heterologous short-tandem repeat markers from domestic dog.
Genet Mol Res. 2007;6(2):248-57.
The maned wolf (Chrysocyon brachyurus) is the largest South American canid. Habitat loss and fragmentation, due to agricultural expansion and predatory hunting, are the main threats to this species. It is included in the official list of threatened wildlife species in Brazil, and is also protected by IUCN and CITES. Highly variable genetic markers such as microsatellites have the potential to resolve genetic relationships at all levels of the population structure (among individuals, demes or metapopulations) and also to identify the evolutionary unit for strategies for the conservation of the species. Tests were carried out to verify whether a class of highly polymorphic tetranucleotide repeats described for the domestic dog effectively amplifies DNA in the maned wolf. All five loci studied were amplified; however, one of these was shown to be monomorphic in 69 maned wolf samples. The average allele number and estimated heterozygosity per polymorphic locus were 4.3 and 67%, respectively. The genetic variability found for this species, which is considered threatened with extinction, showed similar results when compared to studies of other canids. [Abstract/Link to Full Text]

Miño CI, Del Lama SN
Genetic structure in Brazilian breeding colonies of the Roseate Spoonbill (Platalea ajaja, Aves: Threskiornithidae).
Genet Mol Res. 2007;6(2):238-47.
Roseate Spoonbills (Platalea ajaja, Linnaeus) are wading birds present in two of the most important Brazilian wetlands: the Pantanal wetlands and Rio Grande do Sul marshes. Natural populations of this species have not been previously studied with variable nuclear molecular markers. In order to support decision making regarding the management and conservation of these populations, we estimated and characterized the distribution of genetic variability among five Brazilian breeding colonies. The average observed heterozygosity in Brazilian Roseate Spoonbill populations (Ho = 0.575) did not differ significantly from the value determined in a U.S. wild-caught sample of 15 individuals, using data generated by the same set of microsatellite loci. Considering that the U.S. population underwent a recent reduction in size, we discuss this result supposing that the U.S. population was not genetically affected or that both populations had suffered a bottleneck. Global F(ST) indicated the lack of genetic differentiation among colonies, indicating the occurrence of past and/or present gene flow among them. Analysis of molecular variance revealed that most of the genetic variation is distributed within the colonies. Results are explained by a recent origin of colonies or by high levels of gene flow. Management decisions should take into consideration the fact that, even in the presence of high genetic exchange, ecological adaptations to different environments are important for species survival. [Abstract/Link to Full Text]
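
A global F(ST) near zero means that almost all heterozygosity is found within colonies rather than between them. The sketch below shows one simple single-locus form, F(ST) = (H_T - H_S) / H_T, with equal weight per colony. It is an assumption made for illustration: the abstract does not specify the estimator used, and the allele frequencies are hypothetical.

```python
def fst(subpop_freqs):
    """Single-locus F_ST = (H_T - H_S) / H_T with equal subpopulation weights.

    subpop_freqs: one list of allele frequencies per subpopulation,
    with alleles in the same order in every list.
    """
    n = len(subpop_freqs)
    # H_S: mean expected heterozygosity within subpopulations.
    h_s = sum(1.0 - sum(p * p for p in freqs) for freqs in subpop_freqs) / n
    # H_T: expected heterozygosity of the pooled (mean) allele frequencies.
    mean_freqs = [sum(col) / n for col in zip(*subpop_freqs)]
    h_t = 1.0 - sum(p * p for p in mean_freqs)
    return (h_t - h_s) / h_t if h_t > 0 else 0.0


# Hypothetical colonies (not study data).
same = fst([[0.6, 0.4], [0.6, 0.4]])      # identical frequencies
fixed = fst([[1.0, 0.0], [0.0, 1.0]])     # fixed for different alleles
print(same, fixed)
```

Identical colonies give a value of 0 and colonies fixed for different alleles give 1, which brackets the "lack of differentiation" result reported above.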

Stehling EG, Campos TA, Azevedo V, Brocchi M, Silveira WD
DNA sequencing of a pathogenicity-related plasmid of an avian septicemic Escherichia coli strain.
Genet Mol Res. 2007;6(2):231-7.
A 43-MDa conjugative plasmid isolated from an avian septicemic Escherichia coli (APEC) strain possessing genes related to the adhesion and invasion capacities of in vitro-cultured cells was sequenced. The results demonstrated that the 43-MDa plasmid harbors bacterial pathogenicity-related sequences which probably allow the wild-type pathogenic strain to adhere to and invade tissues and to cause septicemia in poultry. The existence of sequences homologous to those of other human pathogenic Enterobacteriaceae, such as Escherichia coli O157:H7, Shigella and Salmonella, was also observed. The presence of these sequences in this plasmid could indicate that there is horizontal genetic transfer between bacterial strains isolated from different host species. In conclusion, the present study suggests that APEC strains harbor high-molecular weight plasmids that present pathogenicity-related sequences and that these are probably responsible for the pathogenicity exhibited by these strains. The presence of human pathogenicity-associated sequences in APEC conjugative plasmids suggests that these strains could represent a zoonotic risk. [Abstract/Link to Full Text]

Leite KC, Collevatti RG, Menegasso TR, Tomas WM, Duarte JM
Transferability of microsatellite loci from Cervidae species to the endangered Brazilian marsh deer, Blastocerus dichotomus.
Genet Mol Res. 2007;6(2):225-30.
Blastocerus dichotomus, the marsh deer, is the largest Brazilian Cervidae species. The species is endangered because of hunting and loss of its natural habitat, i.e., flood plain areas, because of hydroelectric power station construction and agricultural land expansion. In the present study, we tested 38 microsatellite loci from four Cervidae species: Odocoileus virginianus (7), Rangifer tarandus (17), Capreolus capreolus (7), and Mazama bororo (7). Eleven loci showed clear amplification, opening a new perspective for the generation of fundamental population genetic data for devising conservation strategies for B. dichotomus. [Abstract/Link to Full Text]

Lins TC, Nogueira LR, Lima RM, Gentil P, Oliveira RJ, Pereira RW
A multiplex single-base extension protocol for genotyping Cdx2, FokI, BsmI, ApaI, and TaqI polymorphisms of the vitamin D receptor gene.
Genet Mol Res. 2007;6(2):216-24.
The well-described role of the vitamin D endocrine system in bone metabolism makes its receptor a widely investigated candidate gene in association studies looking for the genetic basis of complex bone-related phenotypes. Most association studies genotype five polymorphic sites along the gene using PCR-RFLP and allele-specific amplification methods, which may not be the best choice in large case/control or cross-sectional studies. In this case, genotyping SNPs in parallel and using automated allele-calling methods are important to decrease genotyping errors due to manual data handling and to conserve sample in cases where the amount of DNA is limited. The aim of this study was to present a straightforward method based on multiplex PCR amplification followed by multiplex single-base extension as a simple way to genotype five vitamin D receptor gene polymorphisms in parallel, which may be implemented in medium- to large-scale case/control or cross-sectional studies. The results regarding method feasibility and optimization are presented by genotyping eight paternity trios and seven samples of Brazilian postmenopausal women who took part in an ongoing association study carried out by members of our group. [Abstract/Link to Full Text]

Fuzinatto VA, Pagliarini MS, Valle CB
Evidence of programmed cell death during microsporogenesis in an interspecific Brachiaria (Poaceae: Panicoideae: Paniceae) hybrid.
Genet Mol Res. 2007;6(2):208-15.
Morphological changes have been investigated during plant programmed cell death (PCD) in the last few years due to the new interest in a possible apoptotic-like phenomenon existing in plants. Although PCD has been reported in several tissues and specialized cells in plants, there have been few reports of its occurrence during microsporogenesis. The present study reports a typical process of PCD during meiosis in an interspecific Brachiaria hybrid leading to male sterility. In this hybrid, some inflorescences initiated meiosis but it was arrested in zygotene/pachytene. From this stage, meiocytes underwent a severe alteration in shape showing substantial membrane blebbing; the cytoplasm became denser at the periphery; the cell nucleus entered a progressive stage of chromatin disintegration, and then the nucleolus disintegrated, and the cytoplasm condensed and shrunk. The oldest flowers of the raceme showed only the callose wall in the anthers showing obvious signs of complete sterility. [Abstract/Link to Full Text]

Pérez IA, Santana SP, Argudin TD, Gardon DO
Analysis of blood processing conditions to obtain high-quality total RNA from human leukocyte concentrate.
Genet Mol Res. 2007;6(2):198-207.
Blood samples are used as a biological source to discover biomarkers of hematological and non-hematological disorders. The present study shows the impact of different experimental conditions associated with cell lysis buffer, TRI-reagent protocol and blood cell storage buffer and their correlation with the quantity, quality and Adrenomedullin gene expression levels of total RNA when the RT-PCR technique is used. A leukocyte cell bank protocol is also proposed for further mRNA expression analysis using RNAlater as storage buffer. There is evidence that total RNA isolated from leukocyte concentrate stored for 1 month at -70 °C did not show significant differences concerning quality, purity and Adrenomedullin gene expression compared with the freshly processed leukocyte sample. [Abstract/Link to Full Text]

Nassar NM, Sousa MV
Amino acid profile in cassava and its interspecific hybrid.
Genet Mol Res. 2007;6(2):192-7.
Cassava roots have a low protein content (0.7-2%). Amino acids such as lysine and methionine are also low, and some research reports have indicated the absence of methionine. The amino acid profiles of a common cassava cultivar and an interspecific hybrid, namely ICB 300, were determined using the computerized amino acid analyzer Hitachi L-8500. The interspecific hybrid has 10 times more lysine and 3 times more methionine than the common cassava cultivar: lysine content was 0.010 g per 100 g in the common cassava cultivar while it reached 0.098 g per 100 g in the interspecific hybrid. Methionine in the common cassava cultivar was 0.014 g per 100 g whereas it reached 0.041 g per 100 g in the interspecific hybrid. Total amino acid content in the common cassava cultivar was 0.254 g per 100 g versus 1.664 g per 100 g in the interspecific hybrid. The genetic variability of the profile and quantity of amino acids indicates the feasibility of selecting interspecific hybrids that are rich in both crude protein and amino acids. This is the first report of high true protein in cassava root. [Abstract/Link to Full Text]
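
The fold differences quoted in this abstract can be checked directly from the reported contents. The short Python sketch below only repeats that arithmetic; the dictionary and function names are illustrative and do not come from the paper.

```python
# Amino acid contents quoted in the abstract, in g per 100 g.
common = {"lysine": 0.010, "methionine": 0.014, "total": 0.254}
hybrid = {"lysine": 0.098, "methionine": 0.041, "total": 1.664}


def fold_change(hybrid_value, common_value):
    """Return how many times larger the hybrid value is than the cultivar value."""
    return hybrid_value / common_value


# Ratio of hybrid to common cultivar for each reported quantity.
ratios = {name: fold_change(hybrid[name], common[name]) for name in common}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x")
```

The ratios come out close to 10 for lysine and 3 for methionine, which matches the rounded figures given in the abstract.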

Das JK, Khuda-Bukhsh AR
Preponderance of GC-rich sites in silver-stained nucleolus organizing regions of Rita rita (Hamilton) and Mystus gulio (Hamilton) (Bagridae, Pisces), as revealed by chromomycin A3-staining technique and scanning electron microscopic studies.
Genet Mol Res. 2007;6(2):184-91.
The karyotypes of two species of catfish, Rita rita (Hamilton) (2n = 54; 14m + 34sm + 6st; NF = 102) and Mystus gulio (Hamilton) (2n = 58; 30m + 12sm + 2st + 14t; NF = 100) were studied through Giemsa-, silver- and chromomycin A3-staining techniques. The silver-stained karyotypes in both sexes of R. rita and M. gulio revealed that the nucleolus organizing regions were located terminally at the shorter arms (Tp) of one pair of submetacentric chromosomes, placed at positions Nos. 2 and 1, respectively, which was confirmed by scanning electron microscopy. Staining with a GC-specific fluorochrome, chromomycin A3, produced bright fluorescence in the Ag-positive nucleolus organizer regions, suggesting thereby that nucleolus organizing regions actually included GC-rich sites of active rRNA genes in metaphase chromosomes of these two bagrids. Further such studies are needed due to the extreme paucity of data on fish. [Abstract/Link to Full Text]
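
The karyotype formulas above are internally consistent if metacentric (m) and submetacentric (sm) chromosomes are counted as two-armed and subtelocentric (st) and telocentric (t) chromosomes as one-armed. That counting convention is an assumption inferred from the reported numbers, not something the abstract states. A minimal Python check under that assumption, with illustrative function names:

```python
def diploid_number(m, sm, st, t=0):
    """Total chromosome count (2n) from the karyotype formula."""
    return m + sm + st + t


def fundamental_number(m, sm, st, t=0):
    """Arm count (NF), assuming m and sm are two-armed and st and t are one-armed."""
    return 2 * (m + sm) + st + t


# Karyotype formulas quoted in the abstract.
rita_rita = {"m": 14, "sm": 34, "st": 6}
mystus_gulio = {"m": 30, "sm": 12, "st": 2, "t": 14}

for name, formula in (("Rita rita", rita_rita), ("Mystus gulio", mystus_gulio)):
    print(name, "2n =", diploid_number(**formula), "NF =", fundamental_number(**formula))
```

Under this convention both species reproduce the 2n and NF values given in the abstract.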

Calliari LE, Longui CA, Rocha MN, Faria CD, Kochi C, Melo MR, Melo MB, Monte O
A novel mutation in DAX1 gene causing different phenotypes in three siblings with adrenal hypoplasia congenita.
Genet Mol Res. 2007;6(2):177-83.
Adrenal hypoplasia congenita (AHC) is a rare disease that can be caused by many abnormalities, including an X-linked form. Mutations in the DAX1 gene have been assigned as the genetic cause of AHC. We describe here three siblings with AHC, clinically presented at different ages, two in the neonatal period and one oligosymptomatic during infancy. Molecular analysis was able to detect a novel mutation in exon 1 of the DAX1 gene, consisting of a transition of C to T at position 359, determining a stop codon at position 359 (Q359X). The mutated gene encodes a truncated protein missing a large portion of the ligand-binding domain (C-terminal domain). The recognition of the disease in the index case suggested the diagnosis in the other siblings. Interestingly, the same mutation is presented with different phenotypes, suggesting that first-degree family members of patients with DAX1 mutations should be carefully evaluated routinely. [Abstract/Link to Full Text]

Campos SR, Rieger TT, Santos JF
Homology of polytene elements between Drosophila and Zaprionus determined by in situ hybridization in Zaprionus indianus.
Genet Mol Res. 2007;6(2):162-76.
The drosophilid Zaprionus indianus, due to its economic importance as an insect pest in Brazil, deserves more investigation into its genetics. Its mitotic karyotype and a line-drawing map of its polytene chromosomes are already available. This paper presents a photomap of Z. indianus polytene chromosomes, which was used as the reference map for identification of sections marked by in situ hybridization with gene probes. Hybridization signals for Hsp70 and Hsr-omega were detected, respectively, in sections 34B and 32C of chromosome V of Z. indianus, which indicates its homology to the chromosomal arm 3R of Drosophila melanogaster and, therefore, to Muller's element E. The main signal for Hsp83 gene probe hybridization was in section 17C of Z. indianus chromosome III, suggesting its homology to arm 3L of D. melanogaster and to element D of Muller. The Ubi probe hybridized in sections 10C of chromosome II and 17A of chromosome III. Section 17A is probably the polyubiquitin locus, with homology to arm 3L of D. melanogaster and to the Mullerian D element, as also suggested by the Hsp83 gene location. The Br-C gene was mapped in section 1D, near the tip of the X chromosome, indicating its homology to the X chromosome of D. melanogaster and to Mullerian element A. The Dpp gene probe hybridized mainly in section 32A of chromosome V and, at lower frequencies, to other sections, although no signal was observed, as would be expected, in the corresponding Mullerian B element. This result led to the suggestion of a rearrangement including the Dpp locus in Z. indianus, the secondary signals possibly pointing to related genes of the TGF-beta family. In conclusion, the results indicate that chromosomes X, III, and V of Z. indianus correspond, respectively, to elements A, D, and E of Muller. At least chromosome V of Z. indianus seems to share synteny with the 3R arm of D. melanogaster, as indicated by the relative positions of Hsp70 and Hsr-omega, although the Dpp gene indicates a disruption of synteny in its distal region. [Abstract/Link to Full Text]

Gonçalves VF, Prosdocimi F, Santos LS, Ortega JM, Pena SD
Sex-biased gene flow in African Americans but not in American Caucasians.
Genet Mol Res. 2007;6(2):156-61.
We have previously shown evidence of strong sex-biased genetic blending in the founding and ongoing history of the Brazilian population, with the African and Amerindian contribution being highest from maternal lineages (as measured by mitochondrial DNA) and the European contribution foremost from paternal lineages (estimated from Y-chromosome haplogroups). The same phenomenon has been observed in several other Latin American countries, suggesting that it might constitute a universal characteristic of the Iberian colonization of the Americas. However, it has also recently been detected in the Black population of the United States. We thus wondered if the same could be observed in American Caucasians. To answer that question, we retrieved 1387 hypervariable I Caucasian mitochondrial DNA sequences from the FBI population database and established their haplogroups and continental geographical sources. In sharp contrast with the situation of the Caucasian population of Latin American countries, only 3.1% of the American Caucasian sequences had African and/or Amerindian origin. To explain this discrepancy we propose that the finding of elevated genomic contributions from European males and Amerindian or African females depends not only on the occurrence of directional mating, but also on the "racial" categorization of the children born from these relations. In this respect, social practices in Latin America and in the United States diverge considerably; in the former socially significant "races" are normally designated according to physical appearance, while in the latter descent appears to be the most important factor. [Abstract/Link to Full Text]

Lopes DO, Regis-da-Silva CG, Machado-Silva A, Macedo AM, Franco GR, Hoffmann JS, Cazaux C, Pena SD, Teixeira SM, Machado CR
Analysis of DNA polymerase activity in vitro using non-radioactive primer extension assay in an automated DNA sequencer.
Genet Mol Res. 2007;6(2):150-5.
Although different DNA polymerases have distinct functions and substrate affinities, their general mechanism of action is similar. Thus, they can all be studied using the same technical principle, the primer extension assay employing radioactive tags. Even though fluorescence has been used routinely for many years for DNA sequencing, it has not been used in the in vitro primer extension assay. The use of fluorescence labels has obvious advantages over radioactivity, including safety, speed and ease of manipulation. In the present study, we demonstrated the potential of non-radioactive in vitro primer extension for DNA polymerase studies. By using an M13 tag in the substrate, we can use the same fluorescent M13 primer to study different substrate sequences. This technique allows quantification of the DNA polymerase activity of the Klenow fragment using different templates and under different conditions with similar sensitivity to the radioactive assay. [Abstract/Link to Full Text]

Thomas MG, Enns RM, Shirley KL, Garcia MD, Garrett AJ, Silver GA
Associations of DNA polymorphisms in growth hormone and its transcriptional regulators with growth and carcass traits in two populations of Brangus bulls.
Genet Mol Res. 2007;6(1):222-37.
Sequence polymorphisms in the growth hormone (GH) gene and its transcriptional regulators, Pit-1 and Prop-1, were evaluated for associations with growth and carcass traits in two populations of Brangus bulls: the Chihuahuan Desert Rangeland Research Center (CDRRC, N = 248 from 14 sires) and a cooperating breeding program (COOP, N = 186 from 34 sires). Polymorphisms were SNP mutations in intron 4 (C/T) and exon V (C/G) in GH, A/G in exon VI in Pit-1, and A/G in exon III in Prop-1. In the COOP population, bulls of Pit-1 GG genotype had a significantly greater percentage of intramuscular fat than bulls of the AA or AG genotype, and bulls of the Prop-1 AA genotype had significantly greater scrotal circumference than bulls of AG or GG genotypes at ~365 days of age. Also, heterozygous genotypes for the two GH polymorphisms appeared advantageous for traits of muscularity and adiposity in the COOP population. The heterozygous genotype of GH intron 4 SNP was associated with advantages in weight gain, scrotal circumference, and fat thickness in the CDRRC population. The two GH polymorphisms accounted for ≥27.7% of the variation in these traits in the CDRRC population; however, R² was <5% in the COOP population. Based on haplotype analyses, the two GH SNPs appeared to be in phase; the haplotype analyses also paralleled the genotype analyses. Polymorphisms in GH and its transcriptional regulators appear to be predictors of growth and carcass traits in Brangus bulls, particularly those with heterozygous GH genotypes. [Abstract/Link to Full Text]

José AA, Gama MA, Urban A, Merighe GK, Meirelles FV, Etchegaray MA, Lanna DP
Evaluation of polyvinyl alcohol for fatty acid supplementation in adipose tissue explant culture.
Genet Mol Res. 2007;6(1):214-21.
Cultures of adipose tissue explants are a valuable tool for studying the intracellular mechanisms involving hormones and nutrients. However, testing how fatty acids affect cells requires a carrier molecule; bovine serum albumin (BSA) has been used for this purpose. However, contaminants can alter the cellular response. Our objectives were to: 1) test BSA as a fatty acid carrier and 2) evaluate polyvinyl alcohol (PVA) as a replacement for BSA. Adipose tissue explants from nine pigs were cultured in medium 199 for 4, 12, 24, and 48 h, with the following treatments: control, PVA (100 mM PVA added) and PVA + pGH (100 mM PVA plus 0.1 mg/mL porcine growth hormone). After each culture period, explants were collected and assayed for lipogenesis. After 48 h in culture, explants were assayed for lipolysis. A preliminary study with different commercial sources and high concentrations showed that BSA affected lipogenic rates. On the other hand, there were no effects of PVA on lipid synthesis, while pGH (positive control) reduced glucose incorporation into lipids (P < 0.01) when compared to both control and PVA (P < 0.05). There was no difference between control and PVA for lipolysis rates. However, pGH increased lipolysis when compared to control (P < 0.01) and PVA (P < 0.05). We demonstrated that BSA can alter lipogenesis, which precludes its use as a carrier molecule. On the other hand, addition of PVA had no effect on lipolysis or lipogenesis. We suggest the use of PVA instead of BSA for adding bioactive fatty acids to cultures of adipose tissue. [Abstract/Link to Full Text]

Allen ML
Expressed sequenced tags from Lygus lineolaris (Hemiptera: Miridae), the tarnished plant bug.
Genet Mol Res. 2007;6(1):206-13.
Expressed sequenced tags (ESTs) were prepared to establish a baseline for molecular genetic studies of the tarnished plant bug, Lygus lineolaris (Palisot de Beauvois). The largest class of identifiable ESTs (15.2%) was from genes involved in cellular metabolic functions, including physiological processes. Twenty-seven ESTs (9.8%) were from genes associated with transcription and translation, including ribosomal genes. One hundred and forty-two of the 276 unique ESTs were from genes not previously identified from any organism. Twelve sequences appear to be associated with feeding and digestion and may be targets for pest control studies. [Abstract/Link to Full Text]

Anhê AC, Lima-Oliveira AP, Azeredo-Oliveira MT
Acid phosphatase activity distribution in salivary glands of triatomines (Heteroptera, Reduviidae, Triatominae).
Genet Mol Res. 2007;6(1):197-205.
Acid phosphatase activity (Gömori technique) in salivary gland cells was investigated in adult insects (males and females) of four species of triatomines: Triatoma infestans, Panstrongylus megistus, Rhodnius neglectus, and Rhodnius prolixus. Binucleated cells with bulky and polyploid nuclei were detected, with acid phosphatase activity in the heterochromatin and nucleolus, which showed the most intense response. Thus, the activity of these phosphatases during rRNA molecule transcription, possibly in the nucleolar fibrillar center, is suggested. The difference in reactivity found among salivary glands is associated with the cellular metabolism of these regions and, probably, with the biosynthesis of their different secretions. This must be essential in maintaining the hematophagy of triatomines. [Abstract/Link to Full Text]

Horimoto AR, Ferraz JB, Balieiro JC, Eler JP
Phenotypic and genetic correlations for body structure scores (frame) with productive traits and index for CEIP classification in Nellore beef cattle.
Genet Mol Res. 2007;6(1):188-96.
The present study was carried out to estimate both (co)variance components and genetic parameters for frame scores obtained using two methods (FRAME_GMA and FRAME_BIF) as well as phenotypic and genetic correlations with traits such as weaning weight, weight gain from weaning to yearling, scrotal circumference, muscle score, and an empiric index for animal classification for the Special Certificate of Identification and Production (CEIP). Data on 12,728 animals, raised in Southeastern Brazil, with ages from 490 to 610 days were analyzed. Estimates of heritability for FRAME_GMA and FRAME_BIF in multi-trait analysis were 0.28 and 0.24, respectively. Genetic correlation coefficients between frame scores and the growth trait were of medium magnitude, which indicates that genetic selection for weight resulted in undesirable responses, increasing the animals' frames. Small changes should be expected in the frame of animals that have been submitted to a genetic selection regarding muscle score and scrotal circumference. The low magnitude of phenotypic and genetic correlation between frame scores and the empirical selection index that classifies animals for CEIP, a Brazilian official certificate that recognizes the value of seedstock that is not registered at breeders associations, but is genetically evaluated, does not indicate important responses in giving a CEIP to animals that have been directly or indirectly selected for frame. Other studies must be performed to determine estimates of the genetic parameters for frame scores in other beef cattle populations. [Abstract/Link to Full Text]

Ribeiro RA, Lovato MB
Comparative analysis of different DNA extraction protocols in fresh and herbarium specimens of the genus Dalbergia.
Genet Mol Res. 2007;6(1):173-87.
Five published DNA extraction protocols were compared for their ability to produce good quality DNA from fresh and herbarium leaves of several species of the genus Dalbergia. The leaves of these species contain high amounts of secondary metabolites, which make it difficult to perform a clean DNA extraction and thereby interfere with subsequent PCR amplification. The protocol that produced the best DNA quality in most of the Dalbergia species analyzed utilizes polyvinylpyrrolidone to bind the phenolic compounds, a high molar concentration of NaCl to inhibit co-precipitation of polysaccharides and DNA, and LiCl for removing RNA by selective precipitation. The DNA quality of herbarium specimens was worse than that of fresh leaves, due to collecting conditions and preservation of samples. We analyzed 54 herbarium specimens, but the recovered DNA allowed successful PCR amplification in only eight. For the genus Dalbergia, the herbarium is an important source of material for phylogenetic and evolutionary studies; because the different species occur in various geographical regions in Brazil, it is difficult to obtain fresh material in nature. Our results demonstrated that for Dalbergia species the methods used for the collection and preservation of herbarium specimens have a major influence on DNA quality and on the success of phylogenetic studies of the species. [Abstract/Link to Full Text]

Recent Articles in BMC Genetics

Moen T, Sonesson AK, Hayes B, Lien S, Munck H, Meuwissen TH
Mapping of a quantitative trait locus for resistance against infectious salmon anaemia in Atlantic salmon (Salmo salar): comparing survival analysis with analysis on affected/resistant data.
BMC Genet. 2007;8:53.
BACKGROUND: Infectious Salmon Anaemia (ISA) is a viral disease affecting farmed Atlantic salmon (Salmo salar) worldwide. The identification of Quantitative Trait Loci (QTL) affecting resistance to the disease could improve our understanding of the genetics underlying the trait and provide a means for Marker-Assisted Selection. We previously performed a genome scan on commercial Atlantic salmon families challenge tested for ISA resistance, identifying several putative QTL. In the present study, we set out to validate the strongest of these QTL in a larger family material coming from the same challenge test, and to determine the position of the QTL by interval mapping. We also wanted to explore different ways of performing QTL analysis within a survival analysis framework (i.e. using time-to-event data), and to compare results using survival analysis with results from analysis on the dichotomous trait 'affected/resistant'. RESULTS: The QTL, located on Atlantic salmon linkage group 8 (following SALMAP notation), was confirmed in the new data set. Its most likely position was at a marker cluster containing markers BHMS130, BHMS170 and BHMS553. Significant segregation distortion was observed in the same region, but was shown to be unrelated to the QTL. A maximum likelihood procedure for identifying QTL, based on the Cox proportional hazard model, was developed. QTL mapping was also done using the Haley-Knott method (affected/resistant data), and within a variance-component framework (affected/resistant data and time-to-event data). In all cases, analysis using affected/resistant data gave stronger evidence for a QTL than did analysis using time-to-event data. CONCLUSION: A QTL for resistance to Infectious Salmon Anaemia in Atlantic salmon was validated in this study, and its more precise location on linkage group eight was determined. The QTL explained 6% of the phenotypic variation in resistance to the disease. The linkage group also displayed significant segregation distortion. Survival models proved in this case not to be more suitable than models based on the dichotomous trait 'affected/resistant' for analysing the data. [Abstract/Link to Full Text]

Hanchard N, Elzein A, Trafford C, Rockett K, Pinder M, Jallow M, Harding R, Kwiatkowski D, McKenzie C
Classical sickle beta-globin haplotypes exhibit a high degree of long-range haplotype similarity in African and Afro-Caribbean populations.
BMC Genet. 2007;8:52.
BACKGROUND: The sickle (βS) mutation in the beta-globin gene (HBB) occurs on five "classical" βS haplotype backgrounds in ethnic groups of African ancestry. Strong selection in favour of the βS allele - a consequence of protection from severe malarial infection afforded by heterozygotes - has been associated with a high degree of extended haplotype similarity. The relationship between classical βS haplotypes and long-range haplotype similarity may have both anthropological and clinical implications, but to date has not been explored. Here we evaluate the haplotype similarity of classical βS haplotypes over 400 kb in population samples from Jamaica, The Gambia, and among the Yoruba of Nigeria (HapMap YRI). RESULTS: The most common βS sub-haplotype among Jamaicans and the Yoruba was the Benin haplotype, while in The Gambia the Senegal haplotype was observed most commonly. Both subtypes exhibited a high degree of long-range haplotype similarity extending across approximately 400 kb in all three populations. This long-range similarity was significantly greater than that seen for other haplotypes sampled in these populations (P < 0.001), and was independent of marker choice and marker density. Among the Yoruba, Benin haplotypes were highly conserved, with very strong linkage disequilibrium (LD) extending a megabase across the βS mutation. CONCLUSION: Two different classical βS haplotypes, sampled from different populations, exhibit comparable and extensive long-range haplotype similarity and strong LD. This LD extends across the adjacent recombination hotspot, and is discernible at distances in excess of 400 kb. Although the multi-centric geographic distribution of βS haplotypes indicates strong subdivision among early Holocene sub-Saharan populations, we find no evidence that selective pressures imposed by falciparum malaria varied in intensity or timing between these subpopulations. Our observations also suggest that cis-acting loci, which may influence outcomes in sickle cell disease, could lie considerable distances away from beta-globin. [Abstract/Link to Full Text]

Timpson NJ, Heron J, Day IN, Ring SM, Bartoshuk LM, Horwood J, Emmett P, Davey-Smith G
Refining associations between TAS2R38 diplotypes and the 6-n-propylthiouracil (PROP) taste test: findings from the Avon Longitudinal Study of Parents and Children.
BMC Genet. 2007;8:51.
BACKGROUND: Previous investigations have highlighted the importance of genetic variation in the determination of bitter tasting ability, but have left unaddressed questions about within-group variation in tasting ability and the possibility of genetic prescription of intermediate tasting ability. Our aim was to examine the relationships between bitter tasting ability and variation at the TAS2R38 locus and to assess the role of psychosocial factors in explaining residual, within-group variation in tasting ability. RESULTS: In a large sample of children from the Avon Longitudinal Study of Parents and Children, we confirmed an association between bitter compound tasting ability and TAS2R38 variation and found evidence of a genetic association with intermediate tasting ability. Antisocial behaviour, social class and depression showed no consistent relationship with the distribution of taste test scores. CONCLUSION: Factors other than taste receptor variation that could influence a child's chosen taste score appeared not to show relationships with test score. The observed spread in the distribution of taste test scores within hypothesised taster groups is likely to be due, at least in part, to physiological differentiation regulated by other genetic contributors. Results confirm relationships between genetic variation and bitter compound tasting ability in a large sample, and suggest that TAS2R38 variation may also be associated with intermediate tasting ability. [Abstract/Link to Full Text]

Svischeva GR
Quantitative trait locus analysis of hybrid pedigrees: variance-components model, inbreeding parameter, and power.
BMC Genet. 2007;8:50.
BACKGROUND: In recent years, reliable mapping of quantitative trait loci (QTLs) has become feasible through linkage analysis based on the variance-components method. There are now many approaches to the QTL analysis of various types of crosses within one population (breed) as well as crosses between divergent populations (breeds). However, to analyse a complex pedigree with dominance and inbreeding, when the pedigree's founders have an inter-population (hybrid) origin, it is necessary to develop a high-powered method taking into account these features of the pedigree. RESULTS: We offer a universal approach to QTL analysis of complex pedigrees descended from crosses between outbred parental lines with different QTL allele frequencies. This approach improves the established variance-components method by considering the genetic effect conditioned by inter-population origin and inbreeding of individuals. To estimate model parameters, namely additive and dominance effects and the allelic frequencies of the QTL analysed, and also to define the QTL positions on a chromosome with respect to genotyped markers, we used the maximum-likelihood method. To detect linkage between the QTL and the markers we propose statistics with a non-central chi-square distribution, which makes it possible to derive analytical expressions for the power of the method and therefore to estimate the pedigree size required for 80% power. The method works for arbitrarily structured pedigrees with dominance and inbreeding. CONCLUSION: Our method uses the phenotypic values and the marker information for each individual of the pedigree under observation as initial data and can be valuable for fine mapping purposes. The power of the method is increased if the QTL effects conditioned by inter-population origin and inbreeding are enhanced. Several improvements can be developed to take into account fixed factors affecting trait formation, such as age and sex. [Abstract/Link to Full Text]

Curtis D
Comparison of artificial neural network analysis with other multimarker methods for detecting genetic association.
BMC Genet. 2007;8:49.
BACKGROUND: Debate remains as to the optimal method for utilising genotype data obtained from multiple markers in case-control association studies. My colleagues and I have previously described a method of association analysis using artificial neural networks (ANNs), whose performance compared favourably to single-marker methods. Here, the performance of ANN analysis is compared with other multi-marker methods, comprising different haplotype-based analyses and locus-based analyses. RESULTS: Of the several methods applied to simulated SNP datasets, heterogeneity testing of estimated haplotype frequencies using asymptotic p values rather than permutation testing had the lowest power and ANN analysis had the highest power. The difference in power to detect association between these two methods was statistically significant (p = 0.001) but other comparisons between methods were not significant. The raw t statistic obtained from ANN analysis correlated highly with the empirical statistical significance obtained from permutation testing of the ANN results and with the p value obtained from the heterogeneity test. CONCLUSION: Although ANN analysis was more powerful than the standard haplotype-based test, it is unlikely to be taken up widely. The permutation testing necessary to obtain a valid p value makes it slow to perform and it is not underpinned by a theoretical model relating marker genotypes to disease phenotype. Nevertheless, the superior performance of this method does imply that the widely used haplotype-based methods for detecting association with multiple markers are not optimal and efforts could be made to improve upon them. The fact that the t statistic obtained from ANN analysis is highly correlated with the statistical significance does suggest a possibility to use ANN analysis in situations where large numbers of markers have been genotyped, since the t value could be used as a proxy for the p value in preliminary analyses. [Abstract/Link to Full Text]

Kullo IJ, Ding K
Patterns of population differentiation of candidate genes for cardiovascular disease.
BMC Genet. 2007;8:48.
BACKGROUND: The basis for ethnic differences in cardiovascular disease (CVD) susceptibility is not fully understood. We investigated patterns of population differentiation (FST) of a set of genes in etiologic pathways of CVD among 3 ethnic groups: Yoruba in Nigeria (YRI), Utah residents with European ancestry (CEU), and Han Chinese (CHB) + Japanese (JPT). We identified 37 pathways implicated in CVD based on the PANTHER classification and 416 genes in these pathways were further studied; these genes belonged to 6 biological processes (apoptosis, blood circulation and gas exchange, blood clotting, homeostasis, immune response, and lipoprotein metabolism). Genotype data were obtained from the HapMap database. RESULTS: We calculated FST for 15,559 common SNPs (minor allele frequency ≥ 0.10 in at least one population) in genes that co-segregated among the populations, as well as an average-weighted FST for each gene. SNPs were classified as putatively functional (non-synonymous and untranslated regions) or non-functional (intronic and synonymous sites). Mean FST values for common putatively functional variants were significantly higher than FST values for non-functional variants. A significant variation in FST was also seen based on biological processes; the processes of 'apoptosis' and 'lipoprotein metabolism' showed an excess of genes with high FST. Thus, putative functional SNPs in genes in etiologic pathways for CVD show greater population differentiation than non-functional SNPs and a significant variance of FST values was noted among pairwise population comparisons for different biological processes. CONCLUSION: These results suggest a possible basis for varying susceptibility to CVD among ethnic groups. [Abstract/Link to Full Text]
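
To make the FST statistic in this abstract concrete, the Python sketch below computes a simple heterozygosity-based FST, (HT - HS) / HT, for one biallelic SNP from per-population allele frequencies, together with the "common SNP" filter described above. The abstract does not say which FST estimator the authors used, so this formula, the equal population weights, the function names, and the example frequencies are all assumptions made for illustration.

```python
def is_common(freqs, threshold=0.10):
    """True if the minor allele frequency reaches the threshold in at least one population."""
    return any(min(p, 1.0 - p) >= threshold for p in freqs)


def fst_biallelic(freqs, weights=None):
    """Heterozygosity-based FST = (HT - HS) / HT for one biallelic SNP.

    freqs   -- frequency of one allele in each population
    weights -- relative population weights (equal weights if omitted)
    """
    n = len(freqs)
    if weights is None:
        weights = [1.0 / n] * n
    # Expected heterozygosity of the pooled population.
    p_bar = sum(w * p for w, p in zip(weights, freqs))
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    if h_total == 0.0:
        return 0.0  # monomorphic site: no differentiation to measure
    # Weighted mean of within-population expected heterozygosity.
    h_within = sum(w * 2.0 * p * (1.0 - p) for w, p in zip(weights, freqs))
    return (h_total - h_within) / h_total


# Hypothetical allele frequencies in three populations (not data from the study).
example = [0.15, 0.40, 0.70]
if is_common(example):
    print(round(fst_biallelic(example), 3))
```

Identical frequencies in every population give an FST of 0, and populations fixed for different alleles give an FST of 1, which is the behaviour expected of a differentiation measure.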

Yang R, Fang M
Mapping quantitative trait loci in line cross with repeat records.
BMC Genet. 2007;8:47.
BACKGROUND: Phenotypes with repeat records from one individual or multiple individuals are often encountered when mapping QTL in line crosses. The current genetic mapping method for a trait with repeat records simply replaces the phenotype with the average value of the repeat records. This simple treatment does not fully use the information from the replication and ignores the impact of permanent environmental effects on the accuracy of the estimated QTL. RESULTS: We propose to map QTL by using the repeatability model to analyze the repeat records directly rather than analyzing the mean phenotype, which improves the efficiency of QTL detection by making fuller use of the data and allowing for permanent environmental effects. A maximum likelihood method implemented via the expectation-maximization (EM) algorithm is applied to perform the parameter estimation of the repeatability model. The superiority of the mapping method based on the repeatability model over simple analysis using the mean phenotype was demonstrated by a series of simulations. CONCLUSION: Our results suggest that the proposed method can serve as a powerful alternative to existing methods. By means of the repeatability model, utilizing the repeat records on individuals may improve the efficiency of QTL detection in line crosses. [Abstract/Link to Full Text]

Anney RJ, Lotfi-Miri M, Olsson CA, Reid SC, Hemphill SA, Patton GC
Variation in the gene coding for the M5 muscarinic receptor (CHRM5) influences cigarette dose but is not associated with dependence to drugs of addiction: evidence from a prospective population based cohort study of young adults.
BMC Genet. 2007;846.
BACKGROUND: The mesolimbic structures of the brain are important in the anticipation and perception of reward. Moreover, many drugs of addiction elicit their response in these structures. The M5 muscarinic receptor (M5R) is expressed in dopamine-containing neurones of the substantia nigra pars compacta and ventral tegmental area, and regulates the release of mesolimbic dopamine. Mice lacking M5R show a substantial reduction in both reward and withdrawal responses to morphine and cocaine. The CHRM5, the gene that codes for the M5R, is a strong biological candidate for a role in human addiction. We screened the coding and core promoter sequences of CHRM5 using denaturing high performance liquid chromatography to identify common polymorphisms. Additional polymorphisms within the coding and core promoter regions that were identified through dbSNP were validated in the test population. We investigated whether these polymorphisms influence substance dependence and dose in a cohort of 1947 young Australians. RESULTS: Analysis was performed on 815 participants of European ancestry who were interviewed at wave 8 of the cohort study and provided DNA. We observed a 26.8% increase in cigarette consumption in carriers of the rs7162140 T-allele, equating to 20.1 cigarettes per week (p=0.01). Carriers of the rs7162140 T-allele were also found to have nearly a 3-fold increased risk of developing cannabis dependence (OR=2.9 (95%CI 1.1-7.4); p=0.03). CONCLUSION: Our data suggest that variation within the CHRM5 locus may play an important role in tobacco and cannabis but not alcohol addiction in European ancestry populations. This is the first study to show an association between CHRM5 and substance use in humans. These data support the further investigation of this gene as a risk factor in substance use and dependence. [Abstract/Link to Full Text]

Howell GR, Libby RT, Marchant JK, Wilson LA, Cosma IM, Smith RS, Anderson MG, John SW
Absence of glaucoma in DBA/2J mice homozygous for wild-type versions of Gpnmb and Tyrp1.
BMC Genet. 2007;845.
BACKGROUND: The glaucomas are a common but incompletely understood group of diseases. DBA/2J mice develop a pigment liberating iris disease that ultimately causes elevated intraocular pressure (IOP) and glaucoma. We have shown previously that mutations in two genes, Gpnmb and Tyrp1, initiate the iris disease. However, mechanisms involved in the subsequent IOP elevation and optic nerve degeneration remain unclear. RESULTS: Here we present new mouse strains with Gpnmb and/or Tyrp1 genes of normal function and with a DBA/2J genetic background. These strains do not develop elevated IOP or glaucoma with age. CONCLUSION: These strains provide much needed controls for studying pathogenic mechanisms of glaucoma using DBA/2J mice. Given the involvement of Gpnmb and/or Tyrp1 in areas such as immunology and tumor development and progression, these strains are also important in other research fields. [Abstract/Link to Full Text]

Rosenberger A, Sharma M, Müller-Myhsok B, Gasser T, Bickeböller H
Meta analysis of whole-genome linkage scans with data uncertainty: an application to Parkinson's disease.
BMC Genet. 2007;844.
BACKGROUND: Genome wide linkage scans have often been successful in the identification of genetic regions containing susceptibility genes for a disease. Meta analysis is used to synthesize information and can even deliver evidence for findings missed by original studies. If researchers are not contributing their data, extracting valid information from publications is technically challenging, but worth the effort. We propose an approach to include data extracted from published figures of genome wide linkage scans. The validity of the extraction was examined on the basis of those 25 markers, for which sufficient information was reported. Monte Carlo simulations were used to take into account the uncertainty in marker position and in linkage test statistic. For the final meta analysis we compared the Genome Search Meta Analysis method (GSMA) and the Corrected p-value Meta analysis Method (CPMM). An application to Parkinson's disease is given. Because we had to use secondary data a meta analysis based on original summary values would be desirable. RESULTS: Data uncertainty by replicated extraction of marker position is shown to be much smaller than 30 cM, a distance up to which a maximum LOD score may usually be found away from the true locus. The main findings are not impaired by data uncertainty. CONCLUSION: Applying the proposed method a novel linked region for Parkinson's disease was identified on chromosome 14 (p = 0.036). Comparing the two meta analysis methods we found in this analysis more regions of interest being identified by GSMA, whereas CPMM provides stronger evidence for linkage. For further validation of the extraction method comparisons with raw data would be required. [Abstract/Link to Full Text]

Joy N, Abraham Z, Soniya EV
A preliminary assessment of genetic relationships among agronomically important cultivars of black pepper.
BMC Genet. 2007;842.
BACKGROUND: The impact of diseases such as Phytophthora foot rot and the replacement of unproductive cultivars by high yielding ones have brought about the disappearance of varieties in Piper species, as in any other crop. Black pepper (the King of Spices) is a major spice crop consumed throughout the world. It is widely cultivated across various parts of the world apart from India. The different cultivars may be genetically related and could be a source of valuable genes for disease resistance and an increase in quantity and quality. Even though the Western Ghats in India are believed to be the site of origin of this crop, numerous accessions from the NBPGR have not yet been evaluated. Our study aims to investigate the genetic relatedness in major cultivars of black pepper using Amplified Fragment Length Polymorphism. RESULTS: Amplified Fragment Length Polymorphic (AFLP) DNA analysis was performed in thirty popular cultivars of black pepper from the National Bureau of Plant Genetic Resources (NBPGR), India. Fingerprint profiles were generated initially with five different primer combinations, from which three primer pair combinations (EAGC/MCAA, EAGG/MCTA and EAGC/MCTG) gave consistent and scorable banding patterns. From 173 scorable markers, 158 (> 90%) were polymorphic, which shows that there is considerable variation in the available germplasm. The dendrogram derived by unweighted pair group method analysis (UPGMA) grouped the accessions into three major clusters and four diverse cultivars with only 30% similarity. Karimunda, a widely grown and popular cultivar, was unique in the fingerprint profiles obtained. CONCLUSION: There are currently few fingerprinting studies using the valuable spice crop black pepper. We found considerable genetic variability among cultivars of black pepper. Fingerprinting analysis with AFLP proved to be an ideal tool for cultivar identification and phylogenetic studies.
It showed a high level of polymorphism and uniquely characterized the major cultivars. A wide range of similarity values between the cultivars was noted (6.01 to 98.13). Further screening of more cultivars will provide valuable information for current breeding programmes. [Abstract/Link to Full Text]

Harris SE, Fox H, Wright AF, Hayward C, Starr JM, Whalley LJ, Deary IJ
A genetic association analysis of cognitive ability and cognitive ageing using 325 markers for 109 genes associated with oxidative stress or cognition.
BMC Genet. 2007;843.
BACKGROUND: Non-pathological cognitive ageing is a distressing condition affecting an increasing number of people in our 'ageing society'. Oxidative stress is hypothesised to have a major role in cellular ageing, including brain ageing. RESULTS: Associations between cognitive ageing and 325 single nucleotide polymorphisms (SNPs), located in 109 genes implicated in oxidative stress and/or cognition, were examined in a unique cohort of relatively healthy older people, on whom we have cognitive ability scores at ages 11 and 79 years (LBC1921). SNPs showing a significant positive association were then genotyped in a second cohort for whom we have cognitive ability scores at the ages of 11 and 64 years (ABC1936). An intronic SNP in the APP gene (rs2830102) was significantly associated with cognitive ageing in both LBC1921 and a combined LBC1921/ABC1936 analysis (p < 0.01), but not in ABC1936 alone. CONCLUSION: This study suggests a possible role for APP in normal cognitive ageing, in addition to its role in Alzheimer's disease. [Abstract/Link to Full Text]

Wagner K, Grzybowska E, Butkiewicz D, Pamula-Pilat J, Pekala W, Tecza K, Hemminki K, Försti A
High-throughput genotyping of a common deletion polymorphism disrupting the TRY6 gene and its association with breast cancer risk.
BMC Genet. 2007;841.
BACKGROUND: Copy number polymorphisms caused by genomic rearrangements like deletions, make a significant contribution to the genomic differences between two individuals and may add to disease predisposition. Therefore, genotyping of such deletion polymorphisms in case-control studies could give important insights into risk associations. RESULTS: We mapped the breakpoints and developed a fluorescent fragment analysis for a deletion disrupting the TRY6 gene to exemplify a quick and cheap genotyping approach for such structural variants. We showed that the deletion is larger than predicted and encompasses also the pseudogene TRY5. We performed a case-control study to test an association of the TRY6 deletion polymorphism with breast cancer using a single nucleotide polymorphism which is in 100% linkage disequilibrium with the deletion. We did not observe an effect of the deletion on breast cancer risk (OR 1.05, 95% CI 0.71-1.56). CONCLUSION: Although we did not observe an association between the TRY6 deletion polymorphism and breast cancer risk, the identification and investigation of further deletions using the present approach may help to elucidate their effect on disease susceptibility. [Abstract/Link to Full Text]

Zhang Y, Ni Z, Yao Y, Nie X, Sun Q
Gibberellins and heterosis of plant height in wheat (Triticum aestivum L.).
BMC Genet. 2007;840.
BACKGROUND: Heterosis in internode elongation and plant height is commonly observed in hybrid plants, and higher GA contents were found to be correlated with the heterosis in plant height. However, the molecular basis for the increased internode elongation in hybrids is unknown. RESULTS: In this study, heterosis in plant height was determined in two wheat hybrids, and it was found that the increased elongation of the uppermost internode contributed most to the heterosis in plant height. A higher GA4 level was also observed in a wheat hybrid. By using the uppermost internode tissues of wheat, we examined expression patterns of genes participating in both GA biosynthesis and GA response pathways between a hybrid and its parental inbreds. Our results indicated that among the 18 genes analyzed, genes encoding enzymes that promote synthesis of bioactive GAs, and genes that act as positive components in the GA response pathways were up-regulated in the hybrid, whereas genes encoding enzymes that deactivate bioactive GAs, and genes that act as negative components of GA response pathways were down-regulated in the hybrid. Moreover, the putative wheat GA receptor gene TaGID1, and two GA responsive genes participating in internode elongation, GIP and XET, were also up-regulated in the hybrid. A model for GA and heterosis in wheat plant height was proposed. CONCLUSION: Our results provided molecular evidence not only for the higher GA levels and more active GA biosynthesis in the hybrid, but also for the heterosis in plant height of wheat and possibly other cereal crops. [Abstract/Link to Full Text]

Fujiwara K, Igarashi J, Irahara N, Kimura M, Nagase H
New chemically induced skin tumour susceptibility loci identified in a mouse backcross between FVB and dominant resistant PWK.
BMC Genet. 2007;839.
BACKGROUND: Variation in skin cancer susceptibility among mouse strains has allowed identification of genes responsible for skin cancer development. Fifteen Skts loci for skin tumour susceptibility have been mapped so far by using the two-stage skin carcinogenesis model [induced by 7,12-dimethylbenz(a)anthracene (DMBA)/12-O-tetradecanoylphorbol-13-acetate (TPA)]. A few responsible genes have been identified using wild-derived dominant resistant Mus spretus mice, and one has been confirmed as a low penetrance cancer susceptibility gene in a variety of human cancers. RESULTS: In the present study, we found that wild-derived PWK mice developed no tumours when treated with the two-stage skin carcinogenesis protocol. This phenotype is dominant resistant when crossed with the highly susceptible strain FVB. By analyzing the F1 backcross generation between PWK and FVB, we found empirical evidence of significant linkage at the new locus Skts-fp1 on chromosome 4 and suggestive linkage on chromosomes 1, 3, 11, 12 and 14 for skin tumour susceptibility. Skts-fp1 includes the Skts7 interval, which was previously mapped by a Mus spretus and NIH backcross. We also observed suggestive linkage on chromosomes 1 and 2 in the female population only, while suggestive linkage on chromosomes 14 and 15 was observed in the male population only. A significant genetic interaction was seen between markers D11Mit339 and D16Mit14. CONCLUSION: Analysis of this new cross may facilitate the identification of genes responsible for mouse skin cancer susceptibility and may reveal their biological interactions. [Abstract/Link to Full Text]

Curtis D, Xu K
Minor differences in haplotype frequency estimates can produce very large differences in heterogeneity test statistics.
BMC Genet. 2007;838.
BACKGROUND: Tests for association between a haplotype and disease are commonly performed using a likelihood ratio test for heterogeneity between case and control haplotype frequencies. Using data from a study of association between heroin dependence and the DRD2 gene, we obtained estimated haplotype frequencies and the associated likelihood ratio statistic using two different computer programs, MLOCUS and GENECOUNTING. We also carried out permutation testing to assess the empirical significance of the results obtained. RESULTS: Both programs yielded similar, though not identical, estimates for the haplotype frequencies. MLOCUS produced a p value of 1.8 × 10^-15 and GENECOUNTING produced a p value of 5.4 × 10^-4. Permutation testing produced a p value of 2.8 × 10^-4. CONCLUSION: The fact that very large differences occur between the likelihood ratio statistics from the two programs may reflect the fact that the haplotype frequencies for the combined group are not constrained to be equal to the weighted averages of the frequencies for the cases and controls, as they would be if they were directly observed rather than being estimated. Minor differences in haplotype frequency estimates can result in very large differences in the likelihood ratio statistic and associated p value. [Abstract/Link to Full Text]
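
To make the heterogeneity test described above concrete, here is a minimal sketch of the likelihood ratio statistic for the simple case in which haplotypes are directly observed, so that the combined frequencies are exactly the weighted average of the case and control frequencies. It does not reproduce MLOCUS or GENECOUNTING, which estimate haplotype frequencies from unphased genotypes; the function names and the counts are hypothetical.

```python
import math


def multinomial_log_lik(counts, freqs):
    """Log-likelihood of haplotype counts under given frequencies (constant term omitted)."""
    return sum(n * math.log(f) for n, f in zip(counts, freqs) if n > 0)


def heterogeneity_lrt(case_counts, control_counts):
    """Likelihood ratio statistic for case/control heterogeneity with observed haplotypes."""
    if len(case_counts) != len(control_counts):
        raise ValueError("cases and controls must list the same haplotypes")
    n_case, n_ctrl = sum(case_counts), sum(control_counts)
    case_freqs = [c / n_case for c in case_counts]
    ctrl_freqs = [c / n_ctrl for c in control_counts]
    # Pooled counts: their frequencies are the weighted average of the two groups
    pooled = [a + b for a, b in zip(case_counts, control_counts)]
    pooled_freqs = [c / (n_case + n_ctrl) for c in pooled]
    separate = (multinomial_log_lik(case_counts, case_freqs)
                + multinomial_log_lik(control_counts, ctrl_freqs))
    combined = multinomial_log_lik(pooled, pooled_freqs)
    return 2 * (separate - combined)


# Hypothetical counts for two haplotypes in cases and controls
lrt_stat = heterogeneity_lrt([30, 10], [10, 30])
print(round(lrt_stat, 2))
```

With directly observed counts the statistic cannot be negative, because the pooled frequencies are the constrained maximum. When frequencies are instead estimated separately for cases, controls, and the combined group, that constraint need not hold, which is the situation the abstract identifies as a possible source of discrepancy.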

Calderón R, Lodeiro R, Varela TA, Fariña J, Ambrosio B, Guitard E, González-Martín A, Dugoujon JM
GM and KM immunoglobulin allotypes in the Galician population: new insights into the peopling of the Iberian Peninsula.
BMC Genet. 2007;837.
BACKGROUND: The current genetic structure of Iberian populations has presumably been affected by the complex orography of its territory, the different people and civilizations that settled there, its ancient and complex history, the diverse and persistent sociocultural patterns in its different regions, and also by the effects of the Iberian Peninsula representing a refugium area after the last glacial maximum. This paper presents the first data on GM and KM immunoglobulin allotypes in the Galician population and, thus, provides further insights into the extent of genetic diversity in populations settled in the geographic extremes of the Cantabrian region of northern Spain. Furthermore, the genetic relationships of Galicians with other European populations have been investigated. RESULTS: The Galician population shows a genetic profile for GM haplotypes that is defined by the high presence of the European Mediterranean GM*3 23 5* haplotype, and the relatively high incidence of the African marker GM*1,17 23' 5*. Data based on comparisons between Galician and other Spanish populations (mainly from the north of the peninsula) reveal a poor correlation between geographic and genetic distances (r = 0.30, P = 0.105), noticeable but variable genetic distances between Galician and Basque subpopulations, and a rather close genetic affinity between Galicia and Valencia, populations which are geographically separated by a long distance and have quite dissimilar cultures and histories. Interestingly, Galicia occupies a central position in the European genetic map, despite being geographically placed at one extreme of the European continent, while displaying a close genetic proximity to Portugal, a finding that is consistent with their shared histories over centuries. CONCLUSION: These findings suggest that the population of Galicia is the result of a relatively balanced mixture of European populations or of the ancestral populations that gave rise to them.
This would support the importance of the migratory movements that have taken place in Europe over the course of recent human history and their effects on the European genetic landscape. [Abstract/Link to Full Text]

Sebastiani P, Abad-Grau MM
Bayesian estimates of linkage disequilibrium.
BMC Genet. 2007;836.
BACKGROUND: The maximum likelihood estimator of D'--a standard measure of linkage disequilibrium--is biased toward disequilibrium, and the bias is particularly evident in small samples and rare haplotypes. RESULTS: This paper proposes a Bayesian estimation of D' to address this problem. The reduction of the bias is achieved by using a prior distribution on the pair-wise associations between single nucleotide polymorphisms (SNPs) that increases the likelihood of equilibrium with increasing physical distances between pairs of SNPs. We show how to compute the Bayesian estimate using a stochastic estimation based on MCMC methods, and also propose a numerical approximation to the Bayesian estimates that can be used to estimate patterns of LD in large datasets of SNPs. CONCLUSION: Our Bayesian estimator of D' corrects the bias toward disequilibrium that affects the maximum likelihood estimator. A consequence of this feature is a more objective view about the extent of linkage disequilibrium in the human genome, and a more realistic number of tagging SNPs to fully exploit the power of genome wide association studies. [Abstract/Link to Full Text]
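
For readers unfamiliar with D', the sketch below computes the plug-in estimate from four observed haplotype counts for two biallelic SNPs. This is the ordinary frequency-based estimate whose bias the abstract discusses, not the Bayesian estimator the authors propose; the function name `d_prime` and the counts are hypothetical.

```python
def d_prime(n11, n10, n01, n00):
    """Plug-in D' from counts of the four haplotypes at two biallelic SNPs.

    n11 = haplotypes carrying allele 1 at both SNPs, n10 = allele 1 at the
    first SNP only, n01 = allele 1 at the second SNP only, n00 = neither.
    """
    n = n11 + n10 + n01 + n00
    if n == 0:
        raise ValueError("no haplotypes observed")
    p11 = n11 / n
    p_a = (n11 + n10) / n  # frequency of allele 1 at the first SNP
    p_b = (n11 + n01) / n  # frequency of allele 1 at the second SNP
    d = p11 - p_a * p_b    # raw disequilibrium coefficient D
    # Largest magnitude D can take given the allele frequencies
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        return 0.0  # at least one SNP is monomorphic in the sample
    return d / d_max


# Hypothetical small sample in which one rare haplotype happens to be unobserved
small_sample = d_prime(18, 0, 1, 1)
print(round(small_sample, 3))
```

Whenever one of the four haplotypes is absent from the sample, the estimate reaches its extreme value of 1 in magnitude, which happens easily in small samples with rare haplotypes. That behaviour is one way to see the bias toward disequilibrium described in the abstract.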

Shehata MF, Leenen FH, Tesson F
Sequence analysis of coding and 3' and 5' flanking regions of the epithelial sodium channel alpha, beta, and gamma genes in Dahl S versus R rats.
BMC Genet. 2007;835.
BACKGROUND: To test whether epithelial sodium channel (ENaC) genes' variants contribute to salt sensitive hypertension in Dahl rats, we screened ENaC alpha, beta, and gamma genes entire coding regions, intron-exon junctions, and the 3' and 5' flanking regions in Dahl S, R and Wistar rats using both Denaturing High Performance Liquid Chromatography (DHPLC) and sequencing. RESULTS: Our analysis revealed no sequence variability in the three genes encoding ENaC in Dahl S versus R rats. One homozygous sequence variation predicted to result in a D75E substitution was identified in Dahl and Wistar rat ENaC alpha compared to Brown Norway. Six and two previously reported polymorphic sites in Brown Norway sequences were lost in Dahl and Wistar rats, respectively. In the 5' flanking regions, we found a deletion of 5GCTs in Dahl and Wistar rat ENaC alpha gene, five new polymorphic sites in ENaC beta and gamma genes, one homozygous sequence variation in Dahl and Wistar rat ENaC gamma gene, as well as one Dahl rat specific homozygous insertion of -1118CCCCCA in ENaC gamma gene. This insertion created additional binding sites for Sp1 and Oct-1. Five and three Brown Norway polymorphic sites were lost in Dahl and Wistar rats, respectively. No sequence variability in ENaC 3' flanking regions was identified in Dahl compared to Brown Norway rats. CONCLUSION: The first comprehensive sequence analysis of ENaC genes did not reveal any differences between Dahl S and R rats that were isogenic in the regions screened. Mutations in ENaC genes intronic sequence or in ENaC-regulatory genes might possibly account for increased ENaC activity in Dahl S versus R rats. [Abstract/Link to Full Text]

Gao X, Starmer J
Human population structure detection via multilocus genotype clustering.
BMC Genet. 2007;834.
BACKGROUND: We describe a hierarchical clustering algorithm for using Single Nucleotide Polymorphism (SNP) genetic data to assign individuals to populations. The method does not assume Hardy-Weinberg equilibrium and linkage equilibrium among loci in sample population individuals. RESULTS: We show that the algorithm can assign sample individuals highly accurately to their corresponding ethnic groups in our tests using HapMap SNP data and it is also robust to admixed populations when tested with Perlegen SNP data. Moreover, it can detect fine-scale population structure as subtle as that between Chinese and Japanese by using genome-wide high-diversity SNP loci. CONCLUSION: The algorithm provides an alternative approach to the popular STRUCTURE program, especially for fine-scale population structure detection in genome-wide association studies. This is the first successful separation of Chinese and Japanese samples using random SNP loci with high statistical support. [Abstract/Link to Full Text]

Kahlmann D, Davalos-Misslitz AC, Ohl L, Stanke F, Witte T, Förster R
Genetic variants of chemokine receptor CCR7 in patients with systemic lupus erythematosus, Sjogren's syndrome and systemic sclerosis.
BMC Genet. 2007;833.
BACKGROUND: The chemokine receptor CCR7 is a key organizer of the immune system. Gene targeting in mice revealed that Ccr7-deficient animals are severely impaired in the induction of central and peripheral tolerance. Due to these defects, Ccr7-deficient mice spontaneously develop multi-organ autoimmunity showing symptoms similar to those observed in humans suffering from connective tissue autoimmune diseases. However, it is unknown whether mutations of CCR7 are linked to autoimmunity in humans. RESULTS: DNA samples were collected from 160 patients suffering from connective tissue autoimmune disease (Sjogren's syndrome, n = 40; systemic lupus erythematosus, SLE, n = 20 and systemic sclerosis, n = 100) and 40 healthy subjects. All participants in this study were of German descent. Samples were screened for single nucleotide polymorphisms (SNPs) by sequencing the coding region of the CCR7 gene as well as the exon-flanking intron sites and parts of the regions encoding the 5'- and 3'-UTR. CCR7 variants were rare. We identified six different sequence variants, which occurred in heterozygosis. The identified SNPs were observed at position -60 C/T (observed 1x), +6,476 A/G (7x), +6,555 C/T (15x), +6,560 C/T (6x), +10,440 A/G (3x) and +11,475 C/A (1x). Four of these variants (+6,476 A/G, +6,555 C/T, +6,560 C/T and +10,440 A/G) display allelic frequencies between 1% and 5% and were present in both patient and control groups. The variants +6,476 A/G, +6,555 C/T and +6,560 C/T are located in intron 2, while the +10,440 A/G variant corresponds to a silent mutation in exon 3. The variants -60 C/T and +11,475 C/A, which are located in the 5'-UTR and 3'-UTR respectively, display allelic frequencies below 1%. No correlation between these variants and the autoimmune diseases investigated could be observed. However, a reporter gene expression assay demonstrated that the mutation at the -60 C/T position in homozygosis leads to reduced luciferase activity.
CONCLUSION: These results suggest that variants of CCR7 gene occur at an extremely low frequency in the German population and that neither Sjogren's syndrome, systemic lupus erythematosus, nor systemic sclerosis are associated with these variants. Nevertheless, the decreased luciferase activity observed in cells transfected with the promoter region bearing the -60 C/T mutation suggests that this CCR7 variant could potentially lead to increased susceptibility to autoimmunity. [Abstract/Link to Full Text]

Olsen HG, Nilsen H, Hayes B, Berg PR, Svendsen M, Lien S, Meuwissen T
Genetic support for a quantitative trait nucleotide in the ABCG2 gene affecting milk composition of dairy cattle.
BMC Genet. 2007;832.
BACKGROUND: Our group has previously identified a quantitative trait locus (QTL) affecting fat and protein percentages on bovine chromosome 6, and refined the QTL position to a 420-kb interval containing six genes. Studies performed in other cattle populations have proposed polymorphisms in two different genes (ABCG2 and OPN) as the underlying functional QTL nucleotide. Due to these conflicting results, we have included these QTNs, together with a large collection of new SNPs produced from PCR sequencing, in a dense marker map spanning the QTL region, and reanalyzed the data using a combined linkage and linkage disequilibrium approach. RESULTS: Our results clearly exclude the OPN SNP (OPN_3907) as causal site for the QTL. Among 91 SNPs included in the study, the ABCG2 SNP (ABCG2_49) is clearly the best QTN candidate. The analyses revealed the presence of only one QTL for the percentage traits in the tested region. This QTL was completely removed by correcting the analysis for ABCG2_49. Concordance between the sires' marker genotypes and segregation status for the QTL was found for ABCG2_49 only. The C allele of ABCG2_49 is found in a marker haplotype that has an extremely negative effect on fat and protein percentages and positive effect on milk yield. Of the 91 SNPs, ABCG2_49 was the only marker in perfect linkage disequilibrium with the QTL. CONCLUSION: Based on our results, OPN_3907 can be excluded as the polymorphism underlying the QTL. The results of this and other papers strongly suggest the [A/C] mutation in ABCG2_49 as the causal mutation, although the possibility that ABCG2_49 is only a marker in perfect LD with the true mutation can not be completely ruled out. [Abstract/Link to Full Text]

Berquist BR, DasSarma P, DasSarma S
Essential and non-essential DNA replication genes in the model halophilic Archaeon, Halobacterium sp. NRC-1.
BMC Genet. 2007;831.
BACKGROUND: Information transfer systems in Archaea, including many components of the DNA replication machinery, are similar to those found in eukaryotes. Functional assignments of archaeal DNA replication genes have been primarily based upon sequence homology and biochemical studies of replisome components, but few genetic studies have been conducted thus far. We have developed a tractable genetic system for knockout analysis of genes in the model halophilic archaeon, Halobacterium sp. NRC-1, and used it to determine which DNA replication genes are essential. RESULTS: Using a directed in-frame gene knockout method in Halobacterium sp. NRC-1, we examined nineteen genes predicted to be involved in DNA replication. Preliminary bioinformatic analysis of the large haloarchaeal Orc/Cdc6 family, related to eukaryotic Orc1 and Cdc6, showed five distinct clades of Orc/Cdc6 proteins conserved in all sequenced haloarchaea. Of ten orc/cdc6 genes in Halobacterium sp. NRC-1, only two were found to be essential, orc10, on the large chromosome, and orc2, on the minichromosome, pNRC200. Of the three replicative-type DNA polymerase genes, two were essential: the chromosomally encoded B family, polB1, and the chromosomally encoded euryarchaeal-specific D family, polD1/D2 (formerly called polA1/polA2 in the Halobacterium sp. NRC-1 genome sequence). The pNRC200-encoded B family polymerase, polB2, was non-essential. Accessory genes for DNA replication initiation and elongation factors, including the putative replicative helicase, mcm, the eukaryotic-type DNA primase, pri1/pri2, the DNA polymerase sliding clamp, pcn, and the flap endonuclease, rad2, were all essential. Targeted genes were classified as non-essential if knockouts were obtained and essential based on statistical analysis and/or by demonstrating the inability to isolate chromosomal knockouts except in the presence of a complementing plasmid copy of the gene. 
CONCLUSION: The results showed that ten out of nineteen eukaryotic-type DNA replication genes are essential for Halobacterium sp. NRC-1, consistent with their requirement for DNA replication. The essential genes code for two of ten Orc/Cdc6 proteins, two out of three DNA polymerases, the MCM helicase, two DNA primase subunits, the DNA polymerase sliding clamp, and the flap endonuclease. [Abstract/Link to Full Text]

Curtis D
Allelic association studies of genome wide association data can reveal errors in marker position assignments.
BMC Genet. 2007;830.
BACKGROUND: Genome wide association (GWA) studies provide the opportunity to develop new kinds of analysis. Analysing pairs of markers from separate regions might lead to the detection of allelic association which might indicate an interaction between nearby genes. METHODS: 396,591 markers typed in 541 subjects were studied. 7.8 × 10^10 pairs of markers were screened and those showing initial evidence for allelic association were subjected to more thorough investigation along with 10 flanking markers on either side. RESULTS: No evidence was detected for interaction. However, 6 markers appeared to have an incorrect map position according to NCBI Build 35. One of these was corrected in Build 36 and 2 were dropped. The remaining 3 were left with map positions inconsistent with their allelic association relationships. DISCUSSION: Although no interaction effects were detected, the method was successful in identifying markers with probably incorrect map positions. CONCLUSION: The study of allelic association can supplement other methods for assigning markers to particular map positions. Analyses of this type may usefully be applied to data from future GWA studies. [Abstract/Link to Full Text]
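
The scale of the pairwise screen can be checked with a short calculation. The sketch below counts every unordered pair that can be formed from the 396,591 markers; it assumes all pairs are counted, whereas the study may have excluded pairs within the same region, so the result is best read as an upper bound that agrees with the reported order of magnitude.

```python
n_markers = 396_591  # number of markers reported in the abstract

# Number of unordered pairs: n * (n - 1) / 2
n_pairs = n_markers * (n_markers - 1) // 2

print(f"{n_pairs:,} pairs, about {n_pairs:.1e}")
```

A screen of this size is why only pairs showing initial evidence of association were followed up in more detail.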

Zhu L, Ruan XD, Ge YF, Wan QH, Fang SG
Low major histocompatibility complex class II DQA diversity in the Giant Panda (Ailuropoda melanoleuca).
BMC Genet. 2007;829.
BACKGROUND: The giant panda (Ailuropoda melanoleuca) is one of the most endangered animals due to habitat fragmentation and loss. Although the captive breeding program for this species is now nearly two decades old, research on the genetic background of such captive populations, especially on adaptive molecular polymorphism of the major histocompatibility complex (MHC), is still limited. In this study, we characterized adaptive variation of the giant panda's MHC DQA gene by PCR amplification of its antigen-recognizing region (i.e. exon 2) and subsequent single-strand conformational polymorphism (SSCP) and sequence analyses. RESULTS: The results revealed a low level of DQA exon 2 diversity in this rare animal, presenting 6 alleles from 61 giant panda individuals. The observed polymorphism was restricted to 9 amino acid substitutions, all of which occurred at and adjacent to positions forming the functionally important antigen-binding sites. All the samples were in Hardy-Weinberg proportions. A significantly higher rate of non-synonymous than synonymous substitutions at the antigen-binding sites indicated positive selection for diversity in the locus. CONCLUSION: The DQA allelic diversity of giant pandas was low relative to other vertebrates. Nonetheless, the pandas exhibited more alleles in DQA than in DRB, suggesting that the alpha chain genes would play a leading role when coping with certain pathogens and thus should be included in conservation genetic investigation. The microsatellite and MHC loci might predict long-term persistence potential and short-term survival ability, respectively. Consequently, it is recommended to utilize multiple suites of microsatellite markers and multiple MHC loci to detect overall genetic variation in order to design unbiased conservation strategies. [Abstract/Link to Full Text]
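
The statement that samples were in Hardy-Weinberg proportions can be illustrated with a simplified check. The sketch below computes a chi-square goodness-of-fit statistic for a single locus with two alleles; the panda DQA locus has six alleles and the abstract does not say which test was used, so the biallelic simplification, the function name `hwe_chi_square`, and the genotype counts are all assumptions.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic comparing genotype counts with Hardy-Weinberg expectations."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes observed")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele a
    q = 1 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    # Skip genotype classes with zero expectation (a monomorphic locus)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


# Hypothetical genotype counts for 61 individuals
chi2 = hwe_chi_square(20, 30, 11)
print(round(chi2, 3))
```

For two alleles the statistic has one degree of freedom, so values below about 3.84 are consistent with Hardy-Weinberg proportions at the 5% level. With small samples or many alleles, an exact test is usually preferred over this approximation.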

Zaki M, King J, Fütterer K, Insall RH
Replacement of the essential Dictyostelium Arp2 gene by its Entamoeba homologue using parasexual genetics.
BMC Genet. 2007;8:28.
BACKGROUND: Cell motility is an essential feature of the pathogenesis and morbidity of amoebiasis caused by Entamoeba histolytica. As motility depends on cytoskeletal organisation and regulation, a study of the molecular components involved is key to a better understanding of amoebic pathogenesis. However, little is known about the physiological roles, interactions and regulation of the proteins of the Entamoeba cytoskeleton. RESULTS: We have established a genetic strategy that uses parasexual genetics to allow essential Dictyostelium discoideum genes to be manipulated and replaced with modified or tagged homologues. Our results show that actin related protein 2 (Arp2) is essential for survival, but that the Dictyostelium protein can be complemented by E. histolytica Arp2, despite the presence of an insertion of 16 amino acids in an otherwise highly conserved protein. Replacement of endogenous Arp2 with myc-tagged Entamoeba or Dictyostelium Arp2 has no obvious effects on growth and the protein incorporates effectively into the Arp2/3 complex. CONCLUSION: We have established an effective two-step method for replacing genes that are required for survival. Our protocol will allow such genes to be studied far more easily, and also allows an unambiguous demonstration that particular genes are truly essential. In addition, cells in which the Dictyostelium Arp2 has been replaced by the Entamoeba protein are potential targets for drug screens. [Abstract/Link to Full Text]

Bighignoli B, Niini T, Grahn RA, Pedersen NC, Millon LV, Polli M, Longeri M, Lyons LA
Cytidine monophospho-N-acetylneuraminic acid hydroxylase (CMAH) mutations associated with the domestic cat AB blood group.
BMC Genet. 2007;8:27.
BACKGROUND: The cat has one common blood group with two major serotypes, blood type A that is dominant to type B. A rare type AB may also be allelic and is suspected to be recessive to A and dominant to B. Cat blood type antigens are defined: N-glycolylneuraminic acid (NeuGc) is associated with type A and N-acetylneuraminic acid (NeuAc) with type B. The enzyme cytidine monophospho-N-acetylneuraminic acid hydroxylase (CMAH) determines the sugar bound to the red cell by converting NeuAc to NeuGc. Thus, mutations in CMAH may cause the A and B blood types. RESULTS: Genomic sequence of CMAH from eight cats and the cDNA of four cats representing all blood types were analyzed to identify causative mutations. DNA variants consistent with the blood types were genotyped in over 200 cats. Five SNPs and an indel formed haplotypes that were consistent with each blood type. CONCLUSION: Mutations in type B cats likely disrupt the gene function of CMAH, leading to a predominance of NeuAc. Type AB concordant variants were not identified; however, cDNA species suggest an alternative allele that activates a downstream start site, leading to a CMAH protein that would be altered at the 5' region. The cat AB blood group system is proposed to be designated by three alleles, A > aab > b. The A and b CMAH alleles described herein can distinguish type A and type B cats without blood sample collections. CMAH represents the first blood group gene identified outside of non-human primates and humans. [Abstract/Link to Full Text]

Chan WM, Andrews C, Dragan L, Fredrick D, Armstrong L, Lyons C, Geraghty MT, Hunter DG, Yazdani A, Traboulsi EI, Pott JW, Gutowski NJ, Ellard S, Young E, Hanisch F, Koc F, Schnall B, Engle EC
Three novel mutations in KIF21A highlight the importance of the third coiled-coil stalk domain in the etiology of CFEOM1.
BMC Genet. 2007;8:26.
BACKGROUND: Congenital fibrosis of the extraocular muscles types 1 and 3 (CFEOM1/CFEOM3) are autosomal dominant strabismus disorders that appear to result from maldevelopment of ocular nuclei and nerves. We previously reported that most individuals with CFEOM1 and rare individuals with CFEOM3 harbor heterozygous mutations in KIF21A. KIF21A encodes a kinesin motor involved in anterograde axonal transport, and the familial and de novo mutations reported to date predictably alter one of only a few KIF21A amino acids--three within the third coiled-coil region of the stalk and one in the distal motor domain, suggesting they result in altered KIF21A function. To further define the spectrum of KIF21A mutations in CFEOM we have now identified all CFEOM probands newly enrolled in our study and determined if they harbor mutations in KIF21A. RESULTS: Sixteen CFEOM1 and 29 CFEOM3 probands were studied. Three previously unreported de novo KIF21A mutations were identified in three CFEOM1 probands, all located in the same coiled-coil region of the stalk that contains all but one of the previously reported mutations. Eight additional CFEOM1 probands harbored three of the mutations previously reported in KIF21A; seven had one of the two most common mutations, while one harbored the mutation in the distal motor domain. No mutation was detected in 5 CFEOM1 or any CFEOM3 probands. CONCLUSION: Analysis of sixteen CFEOM1 probands revealed three novel KIF21A mutations and confirmed three reported mutations, bringing the total number of reported KIF21A mutations in CFEOM1 to 11 mutations among 70 mutation positive probands. All three new mutations alter amino acids in heptad repeats within the third coiled-coil region of the KIF21A stalk, further highlighting the importance of alterations in this domain in the etiology of CFEOM1. [Abstract/Link to Full Text]

Marrosu MG, Murru R, Costa G, Melis MC, Rolesu M, Schirru L, Solla E, Cuccu S, Secci MA, Whalen MB, Cocco E, Pugliatti M, Sotgiu S, Rosati G, Cucca F
Variation of the myelin oligodendrocyte glycoprotein gene is not primarily associated with multiple sclerosis in the Sardinian population.
BMC Genet. 2007;8:25.
BACKGROUND: Multiple sclerosis (MS) is consistently associated with particular HLA-DRB1-DQB1 haplotypes. However, existing evidence suggests that variation at these loci does not entirely explain association of the HLA region with the disease. The MOG locus is a prime positional and functional candidate for such additional predisposing effects but the analysis is complicated by the strong, albeit labyrinthine pattern of linkage disequilibrium in the region. Here we have assessed the association of MOG variation with MS in the Sardinian population to see if it represents an independent contributor to MS predisposition. RESULTS: After re-sequencing the MOG gene in 21 healthy parents of MS patients we detected 134 variants, 33 of which were novel. A set of 40 informative SNPs was then selected and assessed for disease association together with 1 intragenic microsatellite in an initial data set of 239 MS families. This microsatellite and 11 SNPs were found to be positively associated with MS, using the transmission disequilibrium test, and were followed up in an additional 158 families (total families analysed = 397). While in these 397 families, 8 markers showed significant association with MS, through conditional tests we determined that these MOG variants were not associated with MS independently of the main DRB1-DQB1 disease associations. CONCLUSION: These results indicate that variation within the MOG gene is not an important independent determinant of MS-inherited risk in the Sardinian population. [Abstract/Link to Full Text]
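The transmission disequilibrium test mentioned in the abstract above can be illustrated with a minimal sketch. It uses the standard McNemar-type form of the statistic for a biallelic marker, (b - c)^2 / (b + c), where b and c are the numbers of times heterozygous parents transmitted or did not transmit the allele of interest. The function names and the counts are hypothetical and are not taken from the study.

```python
import math

def tdt_statistic(transmitted, untransmitted):
    """McNemar-type TDT statistic from heterozygous-parent transmission counts."""
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative transmissions")
    return (transmitted - untransmitted) ** 2 / total

def chi2_1df_pvalue(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))

# Hypothetical counts, for illustration only.
stat = tdt_statistic(transmitted=60, untransmitted=40)
p_value = chi2_1df_pvalue(stat)
print(stat, round(p_value, 4))
```

The conditional tests described in the abstract (association independent of DRB1-DQB1) go beyond this basic statistic and are not covered by the sketch.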

Nejentsev S, Smink LJ, Smyth D, Bailey R, Lowe CE, Payne F, Masters J, Godfrey L, Lam A, Burren O, Stevens H, Nutland S, Walker NM, Smith A, Twells R, Barratt BJ, Wright C, French L, Chen Y, Deloukas P, Rogers J, Dunham I, Todd JA
Sequencing and association analysis of the type 1 diabetes-linked region on chromosome 10p12-q11.
BMC Genet. 2007;8:24.
BACKGROUND: In an effort to locate susceptibility genes for type 1 diabetes (T1D) several genome-wide linkage scans have been undertaken. A chromosomal region designated IDDM10 retained genome-wide significance in a combined analysis of the main linkage scans. Here, we studied sequence polymorphisms in 23 Mb on chromosome 10p12-q11, including the putative IDDM10 region, to identify genes associated with T1D. RESULTS: Initially, we resequenced the functional candidate genes, CREM and SDF1, located in this region, genotyped 13 tag single nucleotide polymorphisms (SNPs) and found no association with T1D. We then undertook analysis of the whole 23 Mb region. We constructed and sequenced a contig tile path from two bacterial artificial clone libraries. By comparison with a clone library from an unrelated person used in the Human Genome Project, we identified 12,058 SNPs. We genotyped 303 SNPs and 25 polymorphic microsatellite markers in 765 multiplex T1D families and followed up 22 associated polymorphisms in up to 2,857 families. We found nominal evidence of association in six loci (P = 0.05 - 0.0026), located near the PAPD1 gene. Therefore, we resequenced 38.8 kb in this region, found 147 SNPs and genotyped 84 of them in the T1D families. We also tested 13 polymorphisms in the PAPD1 gene and in five other loci in 1,612 T1D patients and 1,828 controls from the UK. Overall, only the D10S193 microsatellite marker located 28 kb downstream of PAPD1 showed nominal evidence of association in both T1D families and in the case-control sample (P = 0.037 and 0.03, respectively). CONCLUSION: We conclude that polymorphisms in the CREM and SDF1 genes have no major effect on T1D. The weak T1D association that we detected in the association scan near the PAPD1 gene may be either false or due to a small genuine effect, and cannot explain linkage at the IDDM10 region. [Abstract/Link to Full Text]

Recent Articles in BMC Genomics

Hedeler C, Wong HM, Cornell MJ, Alam I, Soanes DM, Rattray M, Hubbard SJ, Talbot NJ, Oliver SG, Paton NW
e-Fungi: a data resource for comparative analysis of fungal genomes.
BMC Genomics. 2007 Nov 20;8(1):426.
ABSTRACT: BACKGROUND: The number of sequenced fungal genomes is ever increasing, with about 200 genomes already fully sequenced or in progress. Only a small percentage of those genomes have been comprehensively studied, for example using techniques from functional genomics. Comparative analysis has proven to be a useful strategy for enhancing our understanding of evolutionary biology and of the less well understood genomes. However, the data required for these analyses tend to be distributed in various heterogeneous data sources, making systematic comparative studies a cumbersome task. Furthermore, comparative analyses benefit from close integration of derived data sets that cluster genes or organisms in a way that eases the expression of requests that clarify points of similarity or difference between species. DESCRIPTION: To support systematic comparative analyses of fungal genomes we have developed the e-Fungi database, which integrates a variety of data for more than 30 fungal genomes. Publicly available genome data, functional annotations, and pathway information have been integrated into a single data repository and complemented with results of comparative analyses, such as MCL and OrthoMCL cluster analysis, and predictions of signaling proteins and the sub-cellular localisation of proteins. To access the data, a library of analysis tasks is available through a web interface. The analysis tasks are motivated by recent comparative genomics studies, and aim to support the study of evolutionary biology as well as community efforts for improving the annotation of genomes. Web services for each query are also available, enabling the tasks to be incorporated into workflows. CONCLUSIONS: The e-Fungi database provides fungal biologists with a resource for comparative studies of a large range of fungal genomes. Its analysis library supports the comparative study of genome data, functional annotation, and results of large scale analyses over all the genomes stored in the database. The database is accessible at, as is the WSDL for the web services. [Abstract/Link to Full Text]

Buza TJ, McCarthy FM, Burgess SC
Experimental-confirmation and functional-annotation of predicted proteins in the chicken genome.
BMC Genomics. 2007 Nov 19;8(1):425.
ABSTRACT: BACKGROUND: The chicken genome was sequenced because of its phylogenetic position as a non-mammalian vertebrate, its use as a biomedical model especially to study embryology and development, its role as a source of human disease organisms and its importance as the major source of animal derived food protein. However, genomic sequence data is, in itself, of limited value; generally it is not equivalent to understanding biological function. The benefit of having a genome sequence is that it provides a basis for functional genomics. However, the sequence data currently available is poorly structurally and functionally annotated and many genes do not have standard nomenclature assigned. RESULTS: We analysed eight chicken tissues and improved the chicken genome structural annotation by providing experimental support for the in vivo expression of 7,809 computationally predicted proteins, including 30 chicken proteins that were only electronically predicted or hypothetical translations in human. To improve functional annotation (based on Gene Ontology), we mapped these identified proteins to their human and mouse orthologs and used this orthology to transfer Gene Ontology (GO) functional annotations to the chicken proteins. The 8,213 orthology-based GO annotations that we produced represent an 8% increase in currently available chicken GO annotations. Orthologous chicken products were also assigned standardized nomenclature based on current chicken nomenclature guidelines. CONCLUSIONS: We demonstrate the utility of high-throughput expression proteomics for rapid experimental structural annotation of a newly sequenced eukaryote genome. These experimentally-supported predicted proteins were further annotated by assigning the proteins with standardized nomenclature and functional annotation. This method is widely applicable to a diverse range of species. Moreover, information from one genome can be used to improve the annotation of other genomes and inform gene prediction algorithms. [Abstract/Link to Full Text]

Lijavetzky D, Cabezas JA, Ibanez A, Rodriguez V, Martinez-Zapater JM
High throughput SNP discovery and genotyping in grapevine (Vitis vinifera L.) by combining a re-sequencing approach and SNPlex technology.
BMC Genomics. 2007 Nov 19;8(1):424.
ABSTRACT: BACKGROUND: Single-nucleotide polymorphisms (SNPs) are the most abundant type of DNA sequence polymorphisms. Their higher availability and stability when compared to simple sequence repeats (SSRs) provide enhanced possibilities for genetic and breeding applications such as the construction of genetic maps, the assessment of genetic diversity, the detection of genotype/phenotype associations or marker-assisted breeding. In addition, the efficiency of these activities can be improved thanks to the ease with which SNP genotyping can be automated. Expressed sequence tags (EST) sequencing projects in grapevine are allowing for the in silico detection of multiple putative sequence polymorphisms within and among a reduced number of cultivars. In parallel, the sequence of Pinot Noir is also providing thousands of polymorphisms present in this highly heterozygous genome. Still, the general application of those SNPs requires further validation since their use could be restricted to those specific genotypes. RESULTS: In order to develop a large SNP set of wide application in grapevine we followed a systematic re-sequencing approach in a group of 11 grape genotypes selected to better represent the existent genetic variation. Using this approach, we have sequenced 230 gene fragments, which represents the analysis of over 1 Mb of grape DNA sequence. This analysis has allowed the discovery of ~1700 SNPs with an average of 60 bp/SNP (43 bp/SNP in non-coding regions and 67 bp/SNP in coding regions). Nucleotide diversity in grape (Pi=0.0066) was found to be similar to values observed in highly polymorphic plant species such as maize. The average number of haplotypes per gene was estimated as six with three haplotypes representing over 83% of the analyzed sequences. Short-range linkage disequilibrium (LD) studies within the analyzed sequences indicate the existence of a rapid decay of LD in the grapevine genome. To validate the use of the detected polymorphisms in genetic mapping, cultivar identification and genetic diversity studies we have used the SNPlex genotyping technology in a sample of grapevine genotypes and segregating progenies. CONCLUSIONS: These results allow accurate values to be generated for nucleotide diversity in coding sequences and provide a first estimate of short-range LD in grapevine. Using SNPlex genotyping we have validated the utilization of the discovered SNPs as molecular markers for linkage mapping, cultivar identification and genetic diversity studies. Thus, the combination of a highly efficient re-sequencing approach and the SNPlex high throughput genotyping technology provides a powerful tool for grapevine genetic analysis. [Abstract/Link to Full Text]

Ramsey JS, Wilson AC, de Vos M, Sun Q, Tamborindeguy C, Winfield A, Malloch G, Smith DM, Fenton B, Gray SM, Jander G
Genomic resources for Myzus persicae: EST sequencing, SNP identification, and microarray design.
BMC Genomics. 2007 Nov 16;8(1):423.
ABSTRACT: BACKGROUND: The green peach aphid, Myzus persicae (Sulzer), is a world-wide insect pest capable of infesting more than 40 plant families, including many crop species. However, despite the significant damage inflicted by M. persicae in agricultural systems, primarily through its ability to transmit plant viruses, limited genomic information is available for this species. RESULTS: Sequencing of 16 M. persicae cDNA libraries generated 26,669 expressed sequence tags (ESTs). Aphids for library construction were raised on Arabidopsis thaliana, Nicotiana benthamiana, Brassica oleracea, B. napus, and Physalis floridana (with and without potato leafroll virus infection). The M. persicae cDNA libraries include ones made from sexual and asexual whole aphids, guts, heads, and salivary glands. In silico comparison of cDNA libraries identified aphid genes with tissue-specific expression patterns, and gene expression that is induced by feeding on Nicotiana benthamiana. Furthermore, 2423 genes that are novel to science and potentially aphid-specific were identified. Comparison of cDNA data from three aphid lineages identified single nucleotide polymorphisms that can be used as genetic markers and, in some cases, may represent functional differences in the protein products. In particular, non-conservative amino acid substitutions in a highly expressed gut protease may be of adaptive significance for M. persicae feeding on different host plants. The Agilent eArray platform was used to design an M. persicae oligonucleotide microarray representing over 10,000 unique genes. CONCLUSIONS: New genomic resources have been developed for M. persicae, an agriculturally important insect pest. These include previously unknown sequence data, a collection of expressed genes, molecular markers, and a DNA microarray that can be used to study aphid gene expression. These resources will help elucidate the adaptations that allow M. persicae to develop compatible interactions with its host plants, complementing ongoing work illuminating plant molecular responses to phloem-feeding insects. [Abstract/Link to Full Text]

de Boer JG, Yazawa R, Davidson WS, Koop BF
Bursts and horizontal evolution of DNA transposons in the speciation of pseudotetraploid salmonids.
BMC Genomics. 2007 Nov 16;8(1):422.
ABSTRACT: BACKGROUND: Several genome duplications have occurred in the evolutionary history of teleost fish. In returning to a stable diploid state, the polyploid genome was reorganized and large portions were lost, while the fish lineages evolved into numerous species. Large scale transposon movement has been postulated to play an important role in the genome reorganization process. We analyzed the DNA sequence of several large loci in Salmo salar and other species for the presence of DNA transposon families. RESULTS: We have identified bursts of activity of 14 families of DNA transposons (12 Tc1-like and 2 piggyBac-like families, including 11 novel ones) in genome sequences of Salmo salar. Several of these families have similar sequences in a number of closely and distantly related fish, lamprey, and frog species as well as in the parasite Schistosoma japonicum. Analysis of sequence similarities between copies within the families of these bursts demonstrates several waves of transposition activities coinciding with salmonid species divergence. Tc1-like families show a master gene-like copying process, illustrated by extensive but short bursts of copying activity, while the piggyBac-like families show a more random copying pattern. Recent families may include copies with an open reading frame for an active transposase enzyme. CONCLUSIONS: We have identified defined bursts of transposon activity that make use of master-slave and random mechanisms. The bursts occur well after hypothesized polyploidy events and coincide with speciation events. Parasite-mediated lateral transfer of transposons is implicated. [Abstract/Link to Full Text]

Pant SD, Schenkel FS, Leyva-Baca I, Sharma BS, Karrow NA
Identification of single nucleotide polymorphisms in bovine CARD15 and their associations with health and production traits in Canadian Holsteins.
BMC Genomics. 2007 Nov 15;8(1):421.
ABSTRACT: BACKGROUND: Toll-like receptor-2 (TLR2) and Caspase Recruitment Domain 15 (CARD15) are important pattern recognition receptors that play a role in the initiation of the inflammatory and subsequent immune response. They have been previously identified as susceptibility loci for inflammatory bowel diseases in humans and are, therefore, suitable candidate genes for inflammatory disease resistance in cattle. The objective of this study was to identify single nucleotide polymorphisms (SNPs) in the bovine TLR2 and CARD15 and evaluate the association of these SNPs with health and production traits in a population of Canadian Holstein bulls. RESULTS: A selective DNA pool was constructed based on the estimated breeding values (EBVs) for SCS. Gene segments were amplified from this pool by PCR and the amplicons sequenced to reveal polymorphisms. A total of four SNPs, including one in intron 10 (c.2886-14A>G) and three in exon 12 (c.3020A>T, c.4500A>C and c.4950C>T), were identified in CARD15; none were identified in TLR2. Canadian Holstein bulls (n=338) were genotyped and haplotypes were reconstructed. Two SNPs, c.3020A>T and c.4500A>C, were associated with EBVs for health and production traits. The SNP c.3020A>T, for example, was associated with SCS EBVs (p=0.0097) with an allele substitution effect of 0.07 score. When compared to the most frequent haplotype Hap12(AC), Hap22(TC) was associated with increased milk (p < 0.0001) and protein (p = 0.0007) yield EBVs, and Hap21(TA) was significantly associated with increased SCS EBV (p = 0.0120). All significant comparison-wise associations retained significance at 8% experimental-wise level by permutation test. CONCLUSIONS: This study indicates that SNP c.3020A>T might play a role in the host response against mastitis and further detailed studies are needed to understand its functional mechanisms. [Abstract/Link to Full Text]

Collins JF, Hu Z
Promoter analysis of intestinal genes induced during iron-deprivation reveals enrichment of conserved SP1-like binding sites.
BMC Genomics. 2007 Nov 15;8(1):420.
ABSTRACT: BACKGROUND: Iron-deficiency leads to the induction of genes related to intestinal iron absorption and homeostasis. By analyzing a large GeneChip dataset from the rat intestine, we identified a large cluster of 228 genes that was induced by iron-deprivation. Only 2 of these genes contained 3' iron-response elements, suggesting that other modes of regulation, including transcriptional regulation, may be involved. We therefore utilized computational methods to test the hypothesis that some of the genes within this large up-regulated cluster are co-ordinately regulated by common transcriptional mechanisms. METHODS: We identified promoters from the up-regulated gene cluster from rat, mouse and human, and performed enrichment analyses with the Clover program and the TRANSFAC database. RESULTS: Surprisingly, we found a strong statistical enrichment for SP1 binding sites in our experimental promoters as compared to background sequences. As the TRANSFAC database cannot distinguish among SP/KLF family members, many of which bind similar GC-rich DNA sequences, we surmise that SP1 or an SP1-like factor could be involved in this response. In fact, we detected induction of SP6/KLF14 in the GeneChip studies, and confirmed it by real-time PCR analyses. Additional computational analyses suggested that an SP1-like factor may function synergistically with a FOX TF to regulate a subset of these genes. Furthermore, analysis of promoter sequences identified many genes with multiple, conserved SP1 and FOX binding sites, the relative location of which within orthologous promoters was highly conserved. CONCLUSIONS: SP1 or a closely related factor may play a primary role in the genetic response to iron-deficiency in the mammalian intestine. [Abstract/Link to Full Text]

Wilson BJ, Giguere V
Identification of novel pathway partners of p68 and p72 RNA helicases through Oncomine meta-analysis.
BMC Genomics. 2007 Nov 15;8(1):419.
ABSTRACT: BACKGROUND: The Oncomine database is an online collection of microarrays from various sources, usually cancer-related, and contains many "multi-arrays" (collections of analyzed microarrays, in a single study). As there are often many hundreds of tumour samples/microarrays within a single multi-array, results from coexpressed genes can be analyzed, and are fully searchable. This gives a potentially significant list of coexpressed genes, which is important to define pathways in which the gene of interest is involved. However, to increase the likelihood of revealing truly significant coexpressed genes we have analyzed their frequency of occurrence over multiple studies (meta-analysis), greatly increasing the significance of results compared to those of a single study. RESULTS: We have used the DEAD-box proteins p68 (Ddx5) and p72 (Ddx17) as models for this coexpression frequency analysis as there are defined functions for these proteins in splicing and transcription (known functions which we could use as a basis for quality control). Furthermore, as these proteins are highly similar, interact together, and may be to some degree functionally redundant, we then analyzed the overlap between coexpressed genes of p68 and p72. This final analysis gave us a highly significant list of coexpressed genes, clustering mainly in splicing and transcription (recapitulating their published roles), but also revealing new pathways such as cytoskeleton remodelling and protein folding. We have further tested a predicted pathway partner, RNA helicase A (Dhx9), in a reciprocal meta-analysis that identified p68 and p72 as being coexpressed, and further show a direct interaction of Dhx9 with p68 and p72, attesting to the predictive nature of this technique. CONCLUSIONS: In summary we have extended the capabilities of Oncomine by analyzing the frequency of coexpressed genes over multiple studies, and furthermore assessing the overlap with a known pathway partner (in this case p68 with p72). We have shown our predictions corroborate previously published studies on p68 and p72, and that novel predictions can be easily tested. These techniques are widely applicable and should increase the quality of data from future meta-analysis studies. [Abstract/Link to Full Text]
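The coexpression frequency analysis described in the abstract above can be sketched roughly as follows. This is only an illustration of the general idea (count how many studies list each gene as coexpressed, keep the recurrent ones, then intersect the lists for two query genes). The function names, the threshold and the gene labels are hypothetical placeholders, not the authors' procedure or data.

```python
from collections import Counter

def coexpression_frequency(studies):
    """Count in how many studies each gene is listed as coexpressed."""
    counts = Counter()
    for genes in studies:
        counts.update(set(genes))  # count a gene at most once per study
    return counts

def recurrent_genes(studies, min_studies):
    """Genes reported as coexpressed in at least min_studies studies."""
    counts = coexpression_frequency(studies)
    return {gene for gene, n in counts.items() if n >= min_studies}

# Hypothetical per-study coexpression lists for two query genes.
query1_studies = [["geneA", "geneB", "geneC"], ["geneA", "geneB"], ["geneA", "geneD"]]
query2_studies = [["geneA", "geneB"], ["geneA", "geneE"], ["geneB", "geneA"]]

hits1 = recurrent_genes(query1_studies, min_studies=2)
hits2 = recurrent_genes(query2_studies, min_studies=2)
shared = sorted(hits1 & hits2)  # overlap between the two recurrent lists
print(shared)
```

A fixed study-count threshold is an assumption made here for simplicity; the abstract does not state how frequency was converted into significance.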

Anthony A, Blaxter M
Association of the Matrix Attachment Region Recognition Signature with coding regions in Caenorhabditis elegans.
BMC Genomics. 2007 Nov 15;8(1):418.
ABSTRACT: BACKGROUND: Matrix attachment regions (MAR) are the sites on genomic DNA that interact with the nuclear matrix. There is increasing evidence for the involvement of MAR in regulation of gene expression. The unsuitability of experimental detection of MAR for genome-wide analyses has led to the development of computational methods of detecting MAR. The MAR recognition signature (MRS) has been reported to be associated with a significant fraction of MAR in C. elegans and has also been found in MAR from a wide range of other eukaryotes. However, the effectiveness of the MRS in specifically and sensitively identifying MAR remains unresolved. RESULTS: Using custom software, we have mapped the occurrence of MRS across the entire C. elegans genome. We find that MRS have a distinctive chromosomal distribution, in which they appear more frequently in the gene-rich chromosome centres than in arms. Comparison to distributions of MRS estimated from chromosomal sequences randomised using mono-, di-, tri- and tetra-nucleotide frequency patterns showed that, while MRS are less common in real sequence than would be expected from nucleotide content alone, they are more frequent than would be predicted from short-range nucleotide structure. In comparison to the rest of the genome, MRS frequency was elevated in 5' and 3' UTRs, and striking peaks of average MRS frequency flanked C. elegans coding sequence (CDS). Genes associated with MRS were significantly enriched for receptor activity annotations, but not for expression level or other features. CONCLUSION: Through a genome-wide analysis of the distribution of MRS in C. elegans we have shown that they have a distinctive distribution, particularly in relation to genes. Due to their association with untranslated regions, it is possible that MRS could have a post-transcriptional role in the control of gene expression. A role for MRS in nuclear scaffold attachment is not supported by these analyses. [Abstract/Link to Full Text]
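The comparison against randomised sequence described in the abstract above can be illustrated with a minimal sketch of the simplest case, a shuffle that preserves mononucleotide composition. The motif used here is a placeholder regular expression, not the actual MRS definition, and the function names are hypothetical. Randomisations that preserve di-, tri- or tetra-nucleotide frequencies need a different shuffling algorithm and are not shown.

```python
import random
import re

def count_motif(seq, pattern):
    """Count matches of a regex motif in seq, including overlapping ones."""
    return len(re.findall(f"(?=({pattern}))", seq))

def shuffled_counts(seq, pattern, n_shuffles, seed=0):
    """Motif counts in shuffles of seq that keep its base composition."""
    rng = random.Random(seed)
    bases = list(seq)
    counts = []
    for _ in range(n_shuffles):
        rng.shuffle(bases)  # preserves mononucleotide frequencies only
        counts.append(count_motif("".join(bases), pattern))
    return counts

def empirical_p_depletion(observed, null_counts):
    """Fraction of null counts at or below the observed count (add-one corrected)."""
    hits = sum(1 for c in null_counts if c <= observed)
    return (hits + 1) / (len(null_counts) + 1)

# Placeholder sequence and motif, for illustration only.
sequence = "ACGTTAATAAGCGC" * 20
observed = count_motif(sequence, "AATAA")
null = shuffled_counts(sequence, "AATAA", n_shuffles=100)
print(observed, empirical_p_depletion(observed, null))
```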

Lefevre CM, Digby MR, Whitley JC, Strahm Y, Nicholas KR
Lactation transcriptomics in the Australian marsupial, Macropus eugenii: transcript sequencing and quantification.
BMC Genomics. 2007 Nov 13;8(1):417.
ABSTRACT: BACKGROUND: Lactation is an important aspect of mammalian biology and, amongst mammals, marsupials show one of the most complex lactation cycles. Marsupials, such as the tammar wallaby (Macropus eugenii), give birth to a relatively immature newborn and progressive changes in milk composition and milk production regulate early stage development of the young. RESULTS: In order to investigate gene expression in the marsupial mammary gland during lactation, a comprehensive set of cDNA libraries was derived from lactating tissues throughout the lactation cycle of the tammar. A total of 14,837 expressed sequence tags were produced by cDNA sequencing. Sequence analysis and sequence assembly were used to construct a comprehensive catalogue of mammary transcripts. cDNA libraries from pregnant, and early or late lactating tissues and massively parallel sequencing data from early or late lactation were combined to analyse the variation of milk protein gene expression during the lactation cycle. CONCLUSION: Results show a steady increase in expression of genes coding for secreted protein during the lactation cycle that is associated with high proportions of transcripts coding for milk proteins. In addition, genes involved in immune function, translation and energy or anabolic metabolism are expressed across the lactation cycle. A number of potential new milk proteins or mammary gland remodelling markers, including non-coding RNAs, have been identified. [Abstract/Link to Full Text]

Chen YA, Lin CC, Wang CD, Wu HB, Hwang PI
An optimized procedure greatly improves EST vector contamination removal.
BMC Genomics. 2007 Nov 13;8(1):416.
ABSTRACT: BACKGROUND: The enormous amount of sequence data available in the public domain database has been a gold mine for researchers exploring various themes in life sciences, and hence the quality of such data is of serious concern to researchers. Removal of vector contamination is one of the most significant operations to obtain accurate sequence data containing only a cDNA insert from the basecalls output by an automatic DNA sequencer. Popular bioinformatics programs to accomplish vector trimming include LUCY, cross_match and SeqClean. RESULTS: In a recent study in which the program SeqClean was used to remove vector contamination from our test set of EST data compiled through various library construction systems, a significant number of errors remained after preliminary trimming. These errors were later almost completely corrected by simply using a re-linearized form of the cloning vector to compare against the target ESTs. The modified trimming procedure for SeqClean was also compared with the trimming efficiency of the other two popular programs, LUCY2 and cross_match. Using SeqClean with a re-linearized form of the cloning vector significantly surpassed the other two programs in all tested conditions, while the performance of the other two programs was not influenced by the modified procedure. Vector contamination in dbEST was also investigated in this study: 2203 out of the 48212 ESTs sampled from dbEST (2007-04-18 freeze) were found to match sequences in UNIVEC. CONCLUSIONS: Vector contamination remains a serious concern for data quality in the public sequence database. Based on the results presented here, we feel that our modified procedure with SeqClean should be recommended to all researchers for the task of vector removal from EST or genomic sequences. [Abstract/Link to Full Text]

Lange C, Zaigler A, Hammelmann M, Twellmeyer J, Raddatz G, Schuster SC, Oesterhelt D, Soppa J
Genome-wide analysis of growth phase-dependent translational and transcriptional regulation in halophilic archaea.
BMC Genomics. 2007 Nov 12;8(1):415.
ABSTRACT: BACKGROUND: Differential expression of genes can be regulated on many different levels. Most global studies of gene regulation concentrate on transcript level regulation, and very few global analyses of differential translational efficiencies exist. The studies have revealed that in Saccharomyces cerevisiae, Arabidopsis thaliana, and human cell lines translational regulation plays a significant role. Additional species have not been investigated yet. Particularly, until now no global study of translational control with any prokaryotic species was available. RESULTS: A global analysis of translational control was performed with two haloarchaeal model species, Halobacterium salinarum and Haloferax volcanii. To identify differentially regulated genes, exponentially growing and stationary phase cells were compared. More than 20% of H. salinarum transcripts are translated with non-average efficiencies. By far the largest group is comprised of genes that are translated with above-average efficiency specifically in exponential phase, including genes for many ribosomal proteins, RNA polymerase subunits, enzymes, and chemotaxis proteins. Translation of 1% of all genes is specifically repressed in either of the two growth phases. For comparison, DNA microarrays were also used to identify differential transcriptional regulation in H. salinarum, and 17% of all genes were found to have non-average transcript levels in exponential versus stationary phase. In H. volcanii, 12% of all genes are translated with non-average efficiencies. The overlap with H. salinarum is negligible. In contrast to H. salinarum, 4.6% of genes have non-average translational efficiency in both growth phases, and thus they might be regulated by other stimuli than growth phase. CONCLUSIONS: For the first time in any prokaryotic species it was shown that a significant fraction of genes is under differential translational control. Groups of genes with different regulatory patterns were discovered. However, neither the fractions nor the identity of regulated genes are conserved between H. salinarum and H. volcanii, indicating that prokaryotes as well as eukaryotes use differential translational control for the regulation of gene expression, but that the identity of regulated genes is not conserved. For 70 H. salinarum genes potentiation of regulation was observed, but for the majority of regulated genes either transcriptional or translational regulation is employed. [Abstract/Link to Full Text]

Chen J, Agrawal V, Rattray M, West MA, St Clair DA, Michelmore RW, Coughlan SJ, Meyers BC
A comparison of microarray and MPSS technology platforms for expression analysis of Arabidopsis.
BMC Genomics. 2007 Nov 12;8(1):414.
ABSTRACT: BACKGROUND: Several high-throughput technologies can measure in parallel the abundance of many mRNA transcripts within a sample. These include the widely-used microarray as well as the more recently developed methods based on sequence tag abundances such as the Massively Parallel Signature Sequencing (MPSS) technology. A comparison of microarray and MPSS technologies can help to establish the metrics for data comparisons across these technology platforms and determine some of the factors affecting the measurement of mRNA abundances using different platforms. RESULTS: We compared transcript abundance (gene expression) measurement data obtained using Affymetrix and Agilent microarrays with MPSS data. All three technologies were used to analyze the same set of mRNA samples; these samples were extracted from various wild type Arabidopsis thaliana tissues and floral mutants. We calculated correlations and used clustering methodology to compare the normalized expression data and expression ratios across samples and technologies. Absolute abundance expression measurements were more similar between different samples measured by the same technology than between the same sample measured by different technologies. However, when expression ratios were employed, samples measured by different technologies were found to cluster together more frequently than with absolute abundance expression levels. Furthermore, the two microarray technologies were more consistent with each other than with MPSS. We also investigated probe-position effects on Affymetrix data and tag-position effects in MPSS. We found a similar impact on Affymetrix and MPSS measurements, which suggests that these effects were more likely a characteristic of the RNA sample rather than technology-specific biases. CONCLUSIONS: Comparisons of transcript expression ratios showed greater consistency across platforms than absolute measurements of transcript abundance. In addition, for measurements based on absolute abundances, technology differences can mask the impact of biological differences between samples and tissues. [Abstract/Link to Full Text]

Kim S, Choi KH, Baykiz AF, Gershenfeld HK
Suicide candidate genes associated with bipolar disorder and schizophrenia: An exploratory gene expression profiling analysis of post-mortem prefrontal cortex.
BMC Genomics. 2007 Nov 12;8(1):413.
ABSTRACT: BACKGROUND: Suicide is an important and potentially preventable consequence of serious mental disorders of unknown etiology. Gene expression profiling technology provides an unbiased approach to identifying candidate genes for mental disorders. Microarray studies with post-mortem prefrontal cortex (Brodmann's Area 46/10) tissue require larger sample sizes due to the small magnitude of differentially expressed genes, the genetic heterogeneity of mental disorders, and the mixed cellularity of brain tissue. This study poses the question: to what extent are differentially expressed genes for suicide a diagnostic-specific set of genes (bipolar disorder vs. schizophrenia) vs. a shared common pathway? RESULTS: In a reanalysis of a large set of Affymetrix Human Genome U133A microarray data, gene expression levels were compared between suicide completers vs. non-suicide groups within a diagnostic group, namely Bipolar disorder (N=45; 22 suicide completers; 23 non-suicide) or Schizophrenia (N=45; 10 suicide completers; 35 non-suicide). Among bipolar samples, 13 genes were found and among schizophrenia samples, 70 genes were found as differentially expressed. Two genes, PLSCR4 (phospholipid scramblase 4) and EMX2 (empty spiracles homolog 2 (Drosophila)) were differentially expressed in suicide groups of both diagnostic groups by microarray analysis. CONCLUSIONS: This molecular level analysis suggests that diagnostic-specific genes predominate over genes shared in common among suicide vs. non-suicide groups. These differentially expressed, candidate genes are neural correlates of suicide, not necessarily causal. While suicide is a complex endpoint with many pathways, these candidate genes provide entry points for future studies of molecular mechanisms and genetic association studies to test causality. [Abstract/Link to Full Text]

Dvorakova L, Cvrckova F, Fischer L
Analysis of the hybrid proline-rich protein families from seven plant species suggests rapid diversification of their sequences and expression patterns.
BMC Genomics. 2007 Nov 12;8(1):412.
ABSTRACT: BACKGROUND: Plant hybrid proline-rich proteins (HyPRPs) are putative cell wall proteins consisting, usually, of a repetitive proline-rich (PR) N-terminal domain and a conserved eight-cysteine motif (8CM) C-terminal domain. Understanding the evolutionary dynamics of HyPRPs might provide not only insight into their so far elusive function, but also a model for other large protein families in plants. RESULTS: We have performed a phylogenetic analysis of HyPRPs from seven plant species, including representatives of gymnosperms and both monocot and dicot angiosperms. Every species studied possesses a large family of 14-52 HyPRPs. Angiosperm HyPRPs exhibit signs of recent major diversification involving, at least in Arabidopsis and rice, several independent tandem gene multiplications. A distinct subfamily of relatively well-conserved C-type HyPRPs, often with long hydrophobic PR domains, has been identified. In most gymnosperm (pine) HyPRPs, diversity appears within the C-type group while angiosperms have only a few well-conserved C-type representatives. Atypical (glycine-rich or extremely short) N-terminal domains apparently evolved independently in multiple lineages of the HyPRP family, possibly via inversion or loss of sequences encoding proline-rich domains. Expression profiles of potato and Arabidopsis HyPRP genes exhibit instances of both overlapping and complementary organ distribution. The diversified non-C-type HyPRP genes from recently amplified chromosomal clusters in Arabidopsis often share their specialized expression profiles. C-type genes have broader expression patterns in both species (potato and Arabidopsis), although orthologous genes exhibit some differences. CONCLUSIONS: HyPRPs represent a dynamically evolving protein family apparently unique to seed plants. We suggest that ancestral HyPRPs with long proline-rich domains produced the current diversity through ongoing gene duplications accompanied by shortening, modification or loss of the proline-rich domains. Most of the diversity in gymnosperms and angiosperms originates from different branches of the HyPRP family. Rapid sequence diversification is consistent with only limited requirements for structure conservation and, together with high variability of gene expression patterns, limits the interpretation of any functional study focused on a single HyPRP gene or a couple of HyPRP genes in a single plant species. [Abstract/Link to Full Text]

Cheng KC, Huang HC, Chen JH, Hsu JW, Cheng HC, Ou CH, Yang WB, Chen ST, Wong CH, Juan HF
Ganoderma lucidum polysaccharides in human monocytic leukemia cells: from gene expression to network construction.
BMC Genomics. 2007 Nov 9;8(1):411.
ABSTRACT: BACKGROUND: Ganoderma lucidum has been widely used as a herbal medicine for promoting health and longevity in China and other Asian countries. Polysaccharide extracts from Ganoderma lucidum have been reported to exhibit immuno-modulating and anti-tumor activities. In previous studies, F3, the active component of the polysaccharide extract, was found to activate various cytokines such as IL-1, IL-6, IL-12, and TNF-alpha. This gave rise to our investigation on how F3 stimulates immuno-modulating or anti-tumor effects in human leukemia THP-1 cells. RESULTS: Here, we integrated time-course DNA microarray analysis, quantitative PCR assays, and bioinformatics methods to study the F3-induced effects in THP-1 cells. Significantly disturbed pathways induced by F3 were identified with statistical analysis on microarray data. The apoptosis induction through the DR3 and DR4/5 death receptors was found to be one of the most significant pathways and to play a key role in THP-1 cells after F3 treatment. Based on time-course gene expression measurements of the identified pathway, we reconstructed a plausible regulatory network of the involved genes using a reverse-engineering computational approach. CONCLUSIONS: Our results showed that F3 may induce death receptor ligands to initiate signaling via receptor oligomerization, recruitment of specialized adaptor proteins and activation of caspase cascades. [Abstract/Link to Full Text]

Pagaling E, Haigh RD, Grant WD, Cowan DA, Jones BE, Ma Y, Ventosa A, Heaphy S
Sequence analysis of an Archaeal Virus isolated from a hypersaline lake in Inner Mongolia, China.
BMC Genomics. 2007 Nov 9;8(1):410.
ABSTRACT: BACKGROUND: We are profoundly ignorant about the diversity of viruses that infect the domain Archaea. Fewer than 100 have been identified and described and very few of these have had their genomic sequences determined. Here we report the genomic sequence of a previously undescribed archaeal virus. RESULTS: Haloarchaeal strains with 16S rRNA gene sequences 98% identical to Halorubrum saccharovorum were isolated from a hypersaline lake in Inner Mongolia. Two lytic viruses infecting these were isolated from the lake water. The BJ1 virus is described in this paper. It has an icosahedral head and tail morphology and most likely a linear double stranded DNA genome exhibiting terminal redundancy. Its genome sequence has 42,271 base pairs with a GC content of ~65 mol%. The genome of BJ1 is predicted to encode 70 ORFs, including one for a tRNA. Fifty of the seventy ORFs had no identity to database entries; twenty showed sequence identity matches to archaeal viruses and to haloarchaea. ORFs possibly coding for an origin of replication complex, integrase, helicase and structural capsid proteins were identified. Evidence for viral integration was obtained. CONCLUSIONS: The virus described here has a very low sequence identity to any previously described virus. Fifty of the seventy ORFs could not be annotated in any way based on amino acid identities with sequences already present in the databases. Determining functions for ORFs such as these is probably easier using a simple virus as a model system. [Abstract/Link to Full Text]

Grzebelus D, Lasota S, Gambin T, Kucherov G, Gambin A
Diversity and structure of PIF/Harbinger-like elements in the genome of Medicago truncatula.
BMC Genomics. 2007 Nov 9;8(1):409.
ABSTRACT: BACKGROUND: Transposable elements constitute a significant fraction of plant genomes. The PIF/Harbinger superfamily includes DNA transposons (class II elements) carrying terminal inverted repeats and producing a 3 bp target site duplication upon insertion. The presence of an ORF coding for the DDE/DDD transposase, required for transposition, is characteristic of the autonomous PIF/Harbinger-like elements. Based on the above features, PIF/Harbinger-like elements were identified in several plant genomes and divided into several evolutionary lineages. Availability of a significant portion of Medicago truncatula genomic sequence allowed for mining PIF/Harbinger-like elements, starting from a single previously described element MtMaster. RESULTS: Twenty-two putative autonomous, i.e. carrying an ORF coding for TPase and complete terminal inverted repeats, and 67 non-autonomous PIF/Harbinger-like elements were found in the genome of M. truncatula. They were divided into five families, MtPH-A5, MtPH-A6, MtPH-D, MtPH-E, and MtPH-M, corresponding to three previously identified and two new lineages. The largest families, MtPH-A6 and MtPH-M, were further divided into four and three subfamilies, respectively. Non-autonomous elements were usually direct deletion derivatives of the putative autonomous element; however, other types of rearrangements, including inversions and nested insertions, were also observed. An interesting structural characteristic - the presence of 60 bp tandem repeats - was observed in a group of elements of subfamily MtPH-A6-4. Some families could be related to miniature inverted repeat elements (MITEs). The presence of empty loci (RESites), paralogous to those flanking 11 of the identified transposable elements, both autonomous and non-autonomous, confirmed that these elements were capable of transposition. CONCLUSIONS: The population of PIF/Harbinger-like elements in the genome of M. truncatula is diverse. A detailed intra-family comparison of the elements' structure proved that they proliferated in the genome generally following the model of abortive gap repair. However, the presence of tandem repeats facilitated more pronounced rearrangements of the element internal regions. A potential insertional polymorphism of the MtPH elements and related MITE families in different populations of M. truncatula, if confirmed experimentally, could be used as a source of molecular markers complementary to other marker systems. [Abstract/Link to Full Text]

Gallach M, Arnau V, Marin I
Global patterns of sequence evolution in Drosophila.
BMC Genomics. 2007 Nov 9;8(1):408.
ABSTRACT: BACKGROUND: Sequencing of the genomes of several Drosophila allows for the first precise analyses of how global sequence patterns change among multiple, closely related animal species. A basic question is whether there are characteristic features that differentiate chromosomes within a species or between different species. RESULTS: We explored the euchromatin of the chromosomes of seven Drosophila species to establish their global patterns of DNA sequence diversity. Between species, differences in the types and amounts of simple sequence repeats were found. Within each species, the autosomes have almost identical oligonucleotide profiles. However, X chromosomes and autosomes have, in all species, a qualitatively different composition. The X chromosomes are less complex than the autosomes, containing both a higher amount of simple DNA sequences and, in several cases, chromosome-specific repetitive sequences. Moreover, we show that the right arm of the X chromosome of Drosophila pseudoobscura, which evolved from an autosome 10-18 million years ago, has a composition which is identical to that of the original, left arm of the X chromosome. CONCLUSIONS: The consistent differences among species, differences among X chromosomes and autosomes and the convergent evolution of X and neo-X chromosomes demonstrate that strong forces are acting on drosophilid genomes to generate peculiar chromosomal landscapes. We discuss the relationships of the patterns observed with differential recombination and mutation rates and with the process of dosage compensation. [Abstract/Link to Full Text]

Venancio TM, Demarco R, Almeida GT, Oliveira KC, Setubal JC, Verjovski-Almeida S
Analysis of Schistosoma mansoni genes shared with Deuterostomia and with possible roles in host interactions.
BMC Genomics. 2007 Nov 8;8(1):407.
ABSTRACT: BACKGROUND: Schistosoma mansoni is a blood helminth parasite that causes schistosomiasis, a disease that affects 200 million people in the world. Many orthologs of known mammalian genes have been discovered in this parasite and evidence is accumulating that some of these genes encode proteins linked to signaling pathways in the parasite that appear to be involved with growth or development, suggesting a complex co-evolutionary process. RESULTS: In this work we found 427 genes conserved in the Deuterostomia group that have orthologs in S. mansoni and no members in any nematodes and insects so far sequenced. Among these genes we have identified Insulin Induced Gene (INSIG), Interferon Regulatory Factor (IRF) and vasohibin orthologs, known to be involved in mammals in mevalonate metabolism, immune response and angiogenesis control, respectively. We have chosen these three genes for a more detailed characterization, which included extension of their cloned messages to obtain full-length sequences. Interestingly, SmINSIG showed a 10-fold higher expression in adult females as opposed to males, in accordance with its possible role in regulating egg production. SmIRF has a DNA binding domain, a tryptophan-rich N-terminal region and several predicted phosphorylation sites, usually important for IRF activity. Fourteen different alternatively spliced forms of the S. mansoni vasohibin (SmVASL) gene were detected that encode seven different protein isoforms including one with a complete C-terminal end, and other isoforms with shorter C-terminal portions. Using S. mansoni homologs, we have employed a parsimonious rationale to compute the total gene losses/gains in nematodes, arthropods and deuterostomes under either the Coelomata or the Ecdysozoa evolutionary hypotheses; our results show a lower losses/gains number under the latter hypothesis. CONCLUSIONS: The genes discussed, which are conserved between S. mansoni and deuterostomes, probably have an ancient origin and were lost in Ecdysozoa, being still present in Lophotrochozoa. Given their known functions in Deuterostomia, it is possible that some of them have been co-opted to perform functions related (directly or indirectly) to host adaptation or interaction with host signaling processes. [Abstract/Link to Full Text]

Rose D, Hackermueller J, Washietl S, Reiche K, Hertel J, Findeiss S, Stadler PF, Prohaska SJ
Computational RNomics of Drosophilids.
BMC Genomics. 2007 Nov 8;8(1):406.
ABSTRACT: BACKGROUND: Recent experimental and computational studies have provided overwhelming evidence for a plethora of diverse transcripts that are unrelated to protein-coding genes. One subclass consists of those RNAs that require distinctive secondary structure motifs to exert their biological function and hence exhibit distinctive patterns of sequence conservation characteristic for positive selection on RNA secondary structure. The deep-sequencing of 12 drosophilid species coordinated by the NHGRI provides an ideal data set of comparative computational approaches to determine those genomic loci that code for evolutionarily conserved RNA motifs. This class of loci includes the majority of the known small ncRNAs as well as structured RNA motifs in mRNAs. We report here on a genome-wide survey using RNAz. RESULTS: We obtain 16 000 high quality predictions among which we recover the majority of the known ncRNAs. Taking a pessimistically estimated false discovery rate of 40% into account, this implies that at least some ten thousand loci in the Drosophila genome show the hallmarks of stabilizing selection action of RNA structure, and hence are most likely functional at the RNA level. A subset of RNAz predictions overlapping with TRF1 and BRF binding sites [Isogai et al., EMBO J. 26: 79-89 (2007)], which are plausible candidates of Pol III transcripts, have been studied in more detail. Among these sequences we identify several "clusters" of ncRNA candidates with striking structural similarities. CONCLUSIONS: The statistical evaluation of the RNAz predictions in comparison with a similar analysis of vertebrate genomes [Washietl et al., Nat. Biotech. 23: 1383-1390 (2005)] shows that qualitatively similar fractions of structured RNAs are found in introns, UTRs, and intergenic regions. The intergenic RNA structures, however, are concentrated much more closely around known protein-coding loci, suggesting that flies have a significantly smaller complement of independent structured ncRNAs compared to mammals. [Abstract/Link to Full Text]
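The "some ten thousand loci" figure in the abstract above follows from simple arithmetic on the two quoted numbers (16 000 predictions, 40% false discovery rate). A minimal sketch, with a hypothetical function name, assuming the false discovery rate applies uniformly to all predictions:

```python
def expected_true_predictions(n_predictions, false_discovery_rate):
    """Expected number of correct predictions given a false discovery rate."""
    # Under the FDR definition, a fraction (1 - FDR) of calls are true positives.
    return round(n_predictions * (1.0 - false_discovery_rate))


if __name__ == "__main__":
    # Figures quoted in the abstract: 16 000 predictions, FDR of 40%.
    print(expected_true_predictions(16000, 0.40))  # 9600
```

The result, 9600, is consistent with the abstract's rounded statement of roughly ten thousand loci.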

Moe M, Meuwissen T, Lien S, Bendixen C, Wang X, Conley LN, Berget I, Tajet H, Grindflek E
Gene expression profiles in testis of pigs with extreme high and low levels of androstenone.
BMC Genomics. 2007 Nov 7;8(1):405.
ABSTRACT: BACKGROUND: Boar taint is a major obstacle when using uncastrated male pigs for swine production. One of the main compounds causing this taint is androstenone, a pheromone produced in porcine testis. Here we use microarrays to study the expression of thousands of genes simultaneously in testis of high and low androstenone boars. The study allows identification of genes and pathways associated with elevated androstenone levels. RESULTS: Testicular tissue was collected from 60 boars, 30 with extreme high and 30 with extreme low levels of androstenone, from each of the two breeds Duroc and Norwegian Landrace. The samples were hybridised to porcine arrays containing 26,877 cDNA clones, detecting 563 and 160 genes that were differentially expressed (p < 0.01) in Duroc and Norwegian Landrace, respectively. Of these significantly up- and down-regulated clones, 72 were found to be common for the two breeds, suggesting both general and breed specific mechanisms in regulation of, or response to androstenone levels in boars. Ten of the most significant genes were chosen for verification of expression patterns by quantitative real competitive PCR and real-time PCR. As expected, our results point towards steroid hormone metabolism and biosynthesis as important biological processes for the androstenone levels, but other potential pathways were identified as well. Among these were oxidoreductase activity, ferric iron binding, iron ion binding and electron transport activities. Genes belonging to the cytochrome P450 and hydroxysteroid dehydrogenase families were highly up-regulated, in addition to several genes encoding different families of conjugation enzymes. Furthermore, a number of genes encoding transcription factors were found both up- and down-regulated. The high number of clones belonging to ferric iron and iron ion binding suggests an importance of these genes, and the association between these pathways and androstenone levels has not previously been described. CONCLUSIONS: This study contributes to the understanding of the complex genetic system controlling and responding to androstenone levels in pig testis. The identification of new pathways and genes involved in the biosynthesis and metabolism of androstenone is an important first step towards finding molecular markers to reduce boar taint. [Abstract/Link to Full Text]

Grayson TH, Ohms SJ, Brackenbury TD, Meaney KR, Peng K, Pittelkow YE, Wilson SR, Sandow SL, Hill CE
Vascular microarray profiling in two models of hypertension identifies Cav-1, Rgs2 and Rgs5 as antihypertensive targets.
BMC Genomics. 2007 Nov 7;8(1):404.
ABSTRACT: BACKGROUND: Hypertension is a complex disease with many contributory genetic and environmental factors. We aimed to identify common targets for therapy by gene expression profiling of a resistance artery taken from animals representing two different models of hypertension. We studied gene expression and morphology of a saphenous artery branch in normotensive WKY rats, spontaneously hypertensive rats (SHR) and adrenocorticotropic hormone (ACTH)-induced hypertensive rats. RESULTS: Differential remodeling of arteries occurred in SHR and ACTH-treated rats, involving changes in both smooth muscle and endothelium. Increased expression of smooth muscle cell growth promoters and decreased expression of growth suppressors confirmed smooth muscle cell proliferation in SHR but not in ACTH. Differential gene expression between arteries from the two hypertensive models extended to the renin-angiotensin system, MAP kinase pathways, mitochondrial activity, lipid metabolism, extracellular matrix and calcium handling. In contrast, arteries from both hypertensive models exhibited significant increases in caveolin-1 expression and decreases in the regulators of G-protein signalling, Rgs2 and Rgs5. Increased protein expression of caveolin-1 and increased incidence of caveolae was found in both smooth muscle and endothelial cells of arteries from both hypertensive models. CONCLUSIONS: We conclude that the majority of differences in gene expression found in the saphenous artery taken from rats with two different forms of hypertension reflect distinctive morphological and physiological alterations. However, changes in common to caveolin-1 expression and G protein signalling, through attenuation of Rgs2 and Rgs5, may contribute to hypertension through augmentation of vasoconstrictor pathways and provide potential targets for common drug development. [Abstract/Link to Full Text]

Zhao Y, O'Neil NJ, Rose AM
Poly-G/poly-C tracts in the genomes of Caenorhabditis.
BMC Genomics. 2007 Nov 7;8(1):403.
ABSTRACT: BACKGROUND: In the genome of Caenorhabditis elegans, homopolymeric poly-G/poly-C tracts (G/C tracts) exist at high frequency and are maintained by the activity of the DOG-1 protein. The frequency and distribution of G/C tracts in the genomes of C. elegans and the related nematode, C. briggsae, were analyzed to investigate possible biological roles for G/C tracts. RESULTS: In C. elegans, G/C tracts are distributed along every chromosome in a non-random pattern. Most G/C tracts are within introns or are close to genes. Analysis of SAGE data showed that G/C tracts correlate with the levels of regional gene expression in C. elegans. G/C tracts are over-represented and dispersed across all chromosomes in another Caenorhabditis species, C. briggsae. However, the positions and distribution of G/C tracts in C. briggsae differ from those in C. elegans. Furthermore, the C. briggsae dog-1 ortholog CBG19723 can rescue the mutator phenotype of C. elegans dog-1 mutants. CONCLUSIONS: The abundance and genomic distribution of G/C tracts in C. elegans, the effect of G/C tracts on regional transcription levels, and the lack of positional conservation of G/C tracts in C. briggsae suggest a role for G/C tracts in chromatin structure but not in the transcriptional regulation of specific genes. [Abstract/Link to Full Text]
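Locating homopolymeric G/C tracts of the kind analysed in the abstract above can be sketched with a regular expression. This is an illustrative sketch only: the function name and the `min_len` threshold are assumptions, since the abstract does not state the minimum tract length or the software the authors used.

```python
import re


def find_gc_tracts(seq, min_len=10):
    """Return (start, end, base) for each run of G or of C at least min_len long.

    Coordinates are 0-based, with end exclusive. min_len is a hypothetical
    threshold chosen for illustration.
    """
    # Match a run of only G or only C; mixed G/C stretches are not tracts.
    pattern = re.compile(r"G{%d,}|C{%d,}" % (min_len, min_len))
    return [(m.start(), m.end(), m.group()[0])
            for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    example = "ATGGGGGTTCCCCCCA"
    print(find_gc_tracts(example, min_len=5))
    # [(2, 7, 'G'), (9, 15, 'C')]
```

Because a poly-C tract on one strand is a poly-G tract on the other, scanning a single strand for both bases covers both orientations.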

Jones MR, Maydan JS, Flibotte S, Moerman DG, Baillie DL
Oligonucleotide Array Comparative Genomic Hybridization (oaCGH) based characterization of genetic deficiencies as an aid to gene mapping in Caenorhabditis elegans.
BMC Genomics. 2007 Nov 7;8(1):402.
ABSTRACT: BACKGROUND: A collection of genetic deficiencies covering over 70% of the C. elegans genome exists; however, the application of these valuable biological tools has been limited due to the incomplete correlation between their genetic and physical characterization. RESULTS: We have applied oligonucleotide array Comparative Genomic Hybridization (oaCGH) to the high resolution, molecular characterization of several genetic deficiency and duplication strains in a 5 Mb region of Chromosome III. We incorporate this data into a deficiency map which is subsequently used to direct the positional cloning of essential genes in the region. From this analysis we are able to quickly determine the molecular identity of several previously unidentified mutations. CONCLUSION: We have applied accurate, high resolution molecular analysis to the characterization of genetic mapping tools in C. elegans. Consequently we have generated a valuable physical mapping resource, which we have demonstrated can aid in the rapid molecular identification of mutations of interest. [Abstract/Link to Full Text]

Aubourg S, Martin-Magniette ML, Brunaud V, Taconnat L, Bitton F, Balzergue S, Jullien PE, Ingouff M, Thareau V, Schiex T, Lecharny A, Renou JP
Analysis of CATMA transcriptome data identifies hundreds of novel functional genes and improves gene models in the Arabidopsis genome.
BMC Genomics. 2007 Nov 2;8(1):401.
ABSTRACT: BACKGROUND: Since the finishing of the sequencing of the Arabidopsis thaliana genome, the Arabidopsis community and the annotator centers have been working on the improvement of gene annotation at the structural and functional levels. In this context, we have used the large CATMA resource on the Arabidopsis transcriptome to search for genes missed by different annotation processes. Probes on the CATMA microarrays are specific gene sequence tags (GSTs) based on the CDS models predicted by the Eugene software. Among the 24 576 CATMA v2 GSTs, 677 are in regions considered as intergenic by the TAIR annotation. We analyzed the cognate transcriptome data in the CATMA resource and carried out data-mining to characterize novel genes and improve gene models. RESULTS: The statistical analysis of the results of more than 500 hybridized samples distributed among 12 organs provides an experimental validation for 465 novel genes. The hybridization evidence was confirmed by RT-PCR approaches for 88% of the 465 novel genes. Comparisons with the current annotation show that these novel genes often encode small proteins, with an average size of 137 aa. Our approach has also led to the improvement of pre-existing gene models through both the extension of 16 CDS and the identification of 13 gene models erroneously constituted of two merged CDS. CONCLUSIONS: This work is a noticeable step forward in the improvement of the Arabidopsis genome annotation. We increased the number of Arabidopsis validated genes by 465 novel transcribed genes to which we associated several functional annotations such as expression profiles, sequence conservation in plants, cognate transcripts and protein motifs. [Abstract/Link to Full Text]

Meade KG, Gormley E, Doyle MB, Fitzsimons T, O' Farrelly C, Costello E, Keane J, Zhao Y, Machugh DE
Innate gene repression associated with Mycobacterium bovis infection in cattle: toward a gene signature of disease.
BMC Genomics. 2007 Oct 31;8(1):400.
ABSTRACT: BACKGROUND: Bovine tuberculosis is an enduring disease of cattle that has significant repercussions for human health. The advent of high-throughput functional genomics technologies has facilitated large-scale analyses of the immune response to this disease that may ultimately lead to novel diagnostics and therapeutic targets. Analysis of mRNA abundance in peripheral blood mononuclear cells (PBMC) from six Mycobacterium bovis infected cattle and six non-infected controls was performed. A targeted immunospecific bovine cDNA microarray with duplicated spot features representing 1,391 genes was used to test the hypothesis that a distinct gene expression profile may exist in M. bovis infected animals in vivo. RESULTS: In total, 378 gene features were differentially expressed at the P ≤ 0.05 level in bovine tuberculosis (BTB)-infected and control animals, of which 244 were expressed at lower levels (65%) in the infected group. Lower relative expression of key innate immune genes, including the Toll-like receptor 2 (TLR2) and TLR4 genes, lack of differential expression of indicator adaptive immune gene transcripts (IFNG, IL2, IL4), and lower BOLA major histocompatibility complex - class I (BOLA) and class II (BOLA-DRA) gene expression was consistent with innate immune gene repression in the BTB-infected animals. Supervised hierarchical cluster analysis and class prediction validation identified a panel of 15 genes predictive of disease status and selected gene transcripts were validated (n = 8 per group) by real time quantitative reverse transcription PCR. CONCLUSION: These results suggest that large-scale expression profiling can identify gene signatures of disease in peripheral blood that can be used to classify animals on the basis of in vivo infection, in the absence of exogenous antigenic stimulation. [Abstract/Link to Full Text]

Bechtel S, Rosenfelder H, Duda A, Schmidt CP, Ernst U, Wellenreuther R, Mehrle A, Schuster C, Bahr A, Blocker H, Heubner D, Hoerlein A, Michel G, Wedler H, Kohrer K, Ottenwalder B, Poustka A, Wiemann S, Schupp I
The full-ORF clone resource of the German cDNA Consortium.
BMC Genomics. 2007 Oct 31;8(1):399.
ABSTRACT: BACKGROUND: With the completion of the human genome sequence the functional analysis and characterization of the encoded proteins has become the next urgent challenge in the post-genome era. The lack of comprehensive ORFeome resources has thus far hampered systematic applications by protein gain-of-function analysis. Gene and ORF coverage with full-length ORF clones thus needs to be extended. In combination with a unique and versatile cloning system, these will provide the tools for genome-wide systematic functional analyses, to achieve a deeper insight into complex biological processes. RESULTS: Here we describe the generation of a full-ORF clone resource of human genes applying the Gateway cloning technology (Invitrogen). A pipeline for efficient cloning and sequencing was developed and a sample tracking database was implemented to streamline the clone production process targeting more than 2,200 different ORFs. In addition, a robust cloning strategy was established, permitting the simultaneous generation of two clone variants that contain a particular ORF with as well as without a stop codon by the implementation of only one additional working step into the cloning procedure. Up to 92% of the targeted ORFs were successfully amplified by PCR and more than 93% of the amplicons successfully cloned. CONCLUSIONS: The German cDNA Consortium ORFeome resource currently consists of more than 3,800 sequence-verified entry clones representing ORFs, cloned with and without stop codon, for about 1,700 different gene loci. 177 splice variants were cloned representing 121 of these genes. The entry clones have been used to generate over 5,000 different expression constructs, providing the basis for functional profiling applications. As a member of the recently formed international ORFeome collaboration we substantially contribute to generating and providing a whole genome human ORFeome collection in a unique cloning system that is made freely available in the community. [Abstract/Link to Full Text]

Retelska D, Beaudoing E, Notredame C, Jongeneel CV, Bucher P
Vertebrate conserved non coding DNA regions have a high persistence length and a short persistence time.
BMC Genomics. 2007 Oct 31;8(1):398.
ABSTRACT: BACKGROUND: The comparison of complete genomes has revealed surprisingly large numbers of conserved non-protein-coding (CNC) DNA regions. However, the biological function of CNC remains elusive. CNC differ in two aspects from conserved protein-coding regions. They are not conserved across phylum boundaries, and they do not contain readily detectable sub-domains. Here we characterize the persistence length and time of CNC and conserved protein-coding regions in the vertebrate and insect lineages. RESULTS: The persistence length is the length of a genome region over which a certain level of sequence identity is consistently maintained. The persistence time is the evolutionary period during which a conserved region evolves under the same selective constraints. Our main findings are: (i) Insect genomes contain 1.60 times less conserved information than vertebrates; (ii) Vertebrate CNC have a higher persistence length than conserved coding regions or insect CNC; (iii) CNC have shorter persistence times as compared to conserved coding regions in both lineages. CONCLUSIONS: Higher persistence length of vertebrate CNC indicates that the conserved information in vertebrates and insects is organized in functional elements of different lengths. These findings might be related to the higher morphological complexity of vertebrates and give clues about the structure of active CNC elements. Shorter persistence time explains the previously puzzling observations of highly conserved CNC within each phylum, and of a lack of conservation between phyla. It suggests that CNC divergence might be a key factor in vertebrate evolution. Further evolutionary studies will help to relate individual CNC to specific developmental processes. [Abstract/Link to Full Text]

Lavin JL, Kiil K, Resano O, Ussery DW, Oguiza JA
Comparative genomic analysis of two-component regulatory proteins in Pseudomonas syringae.
BMC Genomics. 2007 Oct 31;8(1):397.
ABSTRACT: BACKGROUND: Pseudomonas syringae is a widespread bacterial plant pathogen, and strains of P. syringae may be assigned to different pathovars based on host specificity among different plant species. The genomes of P. syringae pv. syringae (Psy) B728a, pv. tomato (Pto) DC3000 and pv. phaseolicola (Pph) 1448A have been recently sequenced providing a major resource for comparative genomic analysis. A mechanism commonly found in bacteria for signal transduction is the two-component system (TCS), which typically consists of a sensor histidine kinase (HK) and a response regulator (RR). P. syringae requires a complex array of TCS proteins to cope with diverse plant hosts, host responses, and environmental conditions. RESULTS: Based on the genomic data, pattern searches with Hidden Markov Model (HMM) profiles have been used to identify putative HKs and RRs. The genomes of Psy B728a, Pto DC3000 and Pph 1448A were found to contain a large number of genes encoding TCS proteins, and a core of complete TCS proteins were shared between these genomes: 30 putative TCS clusters, 11 orphan HKs, 33 orphan RRs, and 16 hybrid HKs. A close analysis of the distribution of genes encoding TCS proteins revealed important differences in TCS proteins among the three P. syringae pathovars. CONCLUSIONS: In this article we present a thorough analysis of the identification and distribution of TCS proteins among the sequenced genomes of P. syringae. We have identified differences in TCS proteins among the three P. syringae pathovars that may contribute to their diverse host ranges and association with plant hosts. The identification and analysis of the repertoire of TCS proteins in the genomes of P. syringae pathovars constitute a basis for future functional genomic studies of the signal transduction pathways in this important bacterial phytopathogen. [Abstract/Link to Full Text]

Recent Articles in BMC Medical Genetics

Jaquish CE
The Framingham Heart Study, on its way to becoming the gold standard for Cardiovascular Genetic Epidemiology?
BMC Med Genet. 2007 Oct 4;8(1):63.
ABSTRACT: The Framingham Heart Study, founded in 1948 to examine the epidemiology of cardiovascular disease in a small town outside of Boston, has become the worldwide standard for cardiovascular epidemiology. It is among the longest running, most comprehensively characterized multi-generational studies in the world. Such seminal findings as the effects of smoking and high cholesterol on heart disease came from the Framingham Heart Study. At the time of publication these were novel cardiovascular disease (CVD) risk factors; now they are the basis of treatment and prevention in the US. Is the Framingham study now on its way to becoming the gold standard for genetic epidemiology of CVD? Will the novel genetic findings of today become the health care standards of tomorrow? The accompanying articles summarizing the results of genome-wide association studies (GWAS) give the reader a first glimpse into the possibilities. [Abstract/Link to Full Text]

Cupples LA, Arruda HT, Benjamin EJ, D'Agostino RB, Demissie S, DeStefano AL, Dupuis J, Falls KM, Fox CS, Gottlieb DJ, Govindaraju DR, Guo CY, Heard-Costa NL, Hwang SJ, Kathiresan S, Kiel DP, Laramie JM, Larson MG, Levy D, Liu CY, Lunetta KL, Mailman MD, Manning AK, Meigs JB, Murabito JM, Newton-Cheh C, O'Connor GT, O'Donnell CJ, Pandey M, Seshadri S, Vasan RS, Wang ZY, Wilk JB, Wolf PA, Yang Q, Atwood LD
The Framingham Heart Study 100K SNP genome-wide association study resource: overview of 17 phenotype working group reports.
BMC Med Genet. 2007;8 Suppl 1:S1.
BACKGROUND: The Framingham Heart Study (FHS), founded in 1948 to examine the epidemiology of cardiovascular disease, is among the most comprehensively characterized multi-generational studies in the world. Many collected phenotypes have substantial genetic contributors; yet most genetic determinants remain to be identified. Using single nucleotide polymorphisms (SNPs) from a 100K genome-wide scan, we examine the associations of common polymorphisms with phenotypic variation in this community-based cohort and provide a full-disclosure, web-based resource of results for future replication studies. METHODS: Adult participants (n = 1345) of the largest 310 pedigrees in the FHS, many biologically related, were genotyped with the 100K Affymetrix GeneChip. These genotypes were used to assess their contribution to 987 phenotypes collected in FHS over 56 years of follow up, including: cardiovascular risk factors and biomarkers; subclinical and clinical cardiovascular disease; cancer and longevity traits; and traits in pulmonary, sleep, neurology, renal, and bone domains. We conducted genome-wide variance components linkage and population-based and family-based association tests. RESULTS: The participants were white of European descent and from the FHS Original and Offspring Cohorts (examination 1 Offspring mean age 32 +/- 9 years, 54% women). This overview summarizes the methods, selected findings and limitations of the results presented in the accompanying series of 17 manuscripts. The presented association results are based on 70,897 autosomal SNPs meeting the following criteria: minor allele frequency ≥ 10%, genotype call rate ≥ 80%, Hardy-Weinberg equilibrium p-value ≥ 0.001, and satisfying Mendelian consistency. Linkage analyses are based on 11,200 SNPs and short-tandem repeats. Results of phenotype-genotype linkages and associations for all autosomal SNPs are posted on the NCBI dbGaP website. CONCLUSION: We have created a full-disclosure resource of results, posted on the dbGaP website, from a genome-wide association study in the FHS. Because we used three analytical approaches to examine the association and linkage of 987 phenotypes with thousands of SNPs, our results must be considered hypothesis-generating and need to be replicated. Results from the FHS 100K project with NCBI web posting provide a resource for investigators to identify high priority findings for replication. [Abstract/Link to Full Text]
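
The SNP inclusion criteria listed in the abstract above can be read as a simple per-SNP filter. The following is a minimal sketch of that idea, not the authors' pipeline: the `SnpStats` record, the `passes_qc` function and the example SNP identifiers are hypothetical, and the thresholds are taken from the abstract on the assumption that each is an "at least" cutoff.

```python
from dataclasses import dataclass


@dataclass
class SnpStats:
    """Hypothetical per-SNP summary record (not from the original study)."""
    snp_id: str
    maf: float               # minor allele frequency, as a fraction
    call_rate: float         # genotype call rate, as a fraction
    hwe_p: float             # Hardy-Weinberg equilibrium p-value
    mendel_consistent: bool  # True if no Mendelian inconsistency was flagged


def passes_qc(s: SnpStats,
              min_maf: float = 0.10,
              min_call_rate: float = 0.80,
              min_hwe_p: float = 0.001) -> bool:
    """Return True if the SNP meets all four criteria named in the abstract."""
    return (s.maf >= min_maf
            and s.call_rate >= min_call_rate
            and s.hwe_p >= min_hwe_p
            and s.mendel_consistent)


# Illustrative, made-up SNP summaries; each failing row violates one criterion.
snps = [
    SnpStats("snp_a", 0.25, 0.98, 0.40, True),    # passes
    SnpStats("snp_b", 0.05, 0.99, 0.50, True),    # fails MAF
    SnpStats("snp_c", 0.30, 0.75, 0.20, True),    # fails call rate
    SnpStats("snp_d", 0.30, 0.95, 0.0001, True),  # fails HWE
    SnpStats("snp_e", 0.10, 0.80, 0.001, False),  # fails Mendelian consistency
]
kept = [s.snp_id for s in snps if passes_qc(s)]
```

Keeping the thresholds as keyword arguments makes it easy to see how the retained SNP count would change under stricter cutoffs.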

Meng W, Hughes A, Patterson CC, Belton C, Kamaruddin MS, Horan PG, Kee F, McKeown PP
Genetic variants of complement factor H gene are not associated with premature coronary heart disease: a family-based study in the Irish population.
BMC Med Genet. 2007;8:62.
BACKGROUND: The complement factor H (CFH) gene has been recently confirmed to play an essential role in the development of age-related macular degeneration (AMD). There are conflicting reports of its role in coronary heart disease. This study was designed to investigate if, using a family-based approach, there was an association between genetic variants of the CFH gene and risk of early-onset coronary heart disease. METHODS: We evaluated 6 SNPs and 5 common haplotypes in the CFH gene amongst 1494 individuals in 580 Irish families with at least one member prematurely affected with coronary heart disease. Genotypes were determined by multiplex SNaPshot technology. RESULTS: Using the TDT/S-TDT test, we did not find an association between any of the individual SNPs or any of the 5 haplotypes and early-onset coronary heart disease. CONCLUSION: In this family-based study, we found no association between the CFH gene and early-onset coronary heart disease. [Abstract/Link to Full Text]
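
For readers unfamiliar with the transmission disequilibrium test (TDT) named in the abstract above, the following minimal sketch shows the classic single-allele TDT statistic, a McNemar-type chi-square on transmissions from heterozygous parents. It does not cover the sibship-based S-TDT, the function name `tdt` is hypothetical, and the counts are illustrative only, not data from the study.

```python
import math


def tdt(transmitted: int, untransmitted: int) -> tuple[float, float]:
    """Classic TDT for one allele.

    transmitted / untransmitted: number of times heterozygous parents did or
    did not pass the allele to an affected child.
    Returns (chi-square statistic with 1 degree of freedom, p-value).
    """
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative transmissions")
    chi2 = (transmitted - untransmitted) ** 2 / total
    # Survival function of a 1-df chi-square: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Illustrative counts only.
chi2_equal, p_equal = tdt(50, 50)    # no transmission distortion
chi2_skewed, p_skewed = tdt(60, 40)  # modest over-transmission
```

Under the null hypothesis of no linkage or association, the allele is transmitted about half the time, so the statistic stays near zero.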

Kardia SL, Sun YV, Hamon SC, Barkley RA, Boerwinkle E, Turner ST
Interactions between the adducin 2 gene and antihypertensive drug therapies in determining blood pressure in people with hypertension.
BMC Med Genet. 2007;8:61.
BACKGROUND: As part of the NHLBI Family Blood Pressure Program, the Genetic Epidemiology Network of Arteriopathy (GENOA) recruited 575 sibships (n = 1583 individuals) from Rochester, MN who had at least two hypertensive siblings diagnosed before age 60. Linkage analysis identified a region on chromosome 2 that was investigated using 70 single nucleotide polymorphisms (SNPs) typed in 7 positional candidate genes, including adducin 2 (ADD2). METHOD: To investigate whether blood pressure (BP) levels in these hypertensives (n = 1133) were influenced by gene-by-drug interactions, we used cross-validation statistical methods (i.e., estimating a model for predicting BP levels in one subgroup and testing it in a different subgroup). These methods greatly reduced the chance of false positive findings. RESULTS: Eight SNPs in ADD2 were significantly associated with systolic BP in untreated hypertensives (p-value < 0.05). Moreover, we also identified SNPs associated with gene-by-drug interactions on systolic BP in drug-treated hypertensives. The TT genotype at SNP rs1541582 was associated with an average systolic BP of 133 mmHg in the beta-blocker subgroup and 148 mmHg in the diuretic subgroup after adjusting for overall mean differences among drug classes. CONCLUSION: Our findings suggest that hypertension candidate gene variation may influence BP responses to specific antihypertensive drug therapies and measurement of genetic variation may assist in identifying subgroups of hypertensive patients who will benefit most from particular antihypertensive drug therapies. [Abstract/Link to Full Text]

Tang X, Hu Y, Chen D, Zhan S, Zhang Z, Dou H
The Fangshan/Family-based Ischemic Stroke Study In China (FISSIC) protocol.
BMC Med Genet. 2007;8:60.
BACKGROUND: The exact etiology of ischemic stroke remains unclear, because multiple genetic predispositions and environmental risk factors may be involved, and their interactions dictate the complexity. Family-based studies provide unique features in design, while they are currently underrepresented for studies of ischemic stroke in developing countries. The Fangshan/Family-based Ischemic Stroke Study In China (FISSIC) program aims to conduct a genetic pedigree study of ischemic stroke in rural communities of China. METHODS/DESIGN: The pedigrees of ischemic stroke with clear documentation are recruited by using the proband-initiated contact method, based on the stroke registry in hospital and communities. Blood samples and detailed information of pedigrees are collected through the health care network in the rural area, and prospective follow-up of the pedigrees cohort is scheduled. Complementary strategies of both family-based design and matched case-spousal control design are used, and comprehensive statistical methods will be implemented to ascertain potential complex genetic and environmental factors and their interactions as well. DISCUSSION: This study is complementary to other genetic pedigree studies of ischemic stroke, such as the Siblings With Ischemic Stroke Study (SWISS), which are established in developed countries. We describe the protocol of this family-based genetic epidemiological study that may be used as a new practical guideline and research paradigm in developing countries and facilitate initiatives of stroke study for international collaborations. [Abstract/Link to Full Text]

Cavallari U, Trabetti E, Malerba G, Biscuola M, Girelli D, Olivieri O, Martinelli N, Angiolillo DJ, Corrocher R, Pignatti PF
Gene sequence variations of the platelet P2Y12 receptor are associated with coronary artery disease.
BMC Med Genet. 2007;8:59.
BACKGROUND: The platelet P2Y12 receptor plays a key role in platelet activation. The H2 haplotype of the P2Y12 receptor gene (P2RY12) has been found to be associated with maximal aggregation response to adenosine diphosphate (ADP) and with increased risk for peripheral arterial disease. No data are available on its association with coronary artery disease (CAD). METHODS: The H2 haplotype of the P2RY12 was determined in 1378 unrelated patients of both sexes selected according to the presence of significant coronary artery disease (CAD group) or having normal coronary angiogram at cardiac catheterization (CAD-free group). Significant coronary artery disease was angiographically determined, and was defined as a greater than 50% visually estimated luminal diameter stenosis in at least one major epicardial coronary artery. RESULTS: In the studied population 71.9% had CAD (n = 991) and 28.1% had normal coronary angiogram (n = 387). H2 haplotype carriers were more frequent in the CAD group (p = 0.03, OR = 1.36, 95%CI = 1.02-1.82). The H2 haplotype was significantly associated with CAD in non-smokers (p = 0.007, OR = 1.83, 95%CI = 1.17-2.87), but not in smokers. The association remained significant after adjustment for other covariates (age, triglycerides, HDL, hypertension, diabetes) by multivariate logistic regression (p = 0.004, OR = 2.32, 95%CI = 1.30-4.15). CONCLUSION: Gene sequence variations of the P2Y12 receptor gene are associated with the presence of significant CAD, particularly in non-smoking individuals. [Abstract/Link to Full Text]
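
The abstract above reports odds ratios with 95% confidence intervals but not the underlying carrier counts. As a reminder of how an unadjusted odds ratio and its Wald (Woolf) interval are usually obtained from a 2x2 table, here is a minimal sketch. The function name `odds_ratio_ci` is hypothetical, the counts are made up for illustration and are not the study's data, and the authors' adjusted estimates came from logistic regression, which this sketch does not reproduce.

```python
import math


def odds_ratio_ci(exposed_cases: int, unexposed_cases: int,
                  exposed_controls: int, unexposed_controls: int,
                  z: float = 1.96) -> tuple[float, float, float]:
    """Unadjusted odds ratio with a Wald (Woolf) confidence interval.

    z = 1.96 gives an approximate 95% interval.
    Returns (odds_ratio, lower_bound, upper_bound).
    """
    a, b, c, d = exposed_cases, unexposed_cases, exposed_controls, unexposed_controls
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this estimator")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Made-up counts: 30 of 100 cases and 20 of 100 controls carry the haplotype.
or_est, ci_low, ci_high = odds_ratio_ci(30, 70, 20, 80)
```

An interval that excludes 1 corresponds to a nominally significant association at the matching alpha level.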

Hassan MJ, Khurshid M, Azeem Z, John P, Ali G, Chishti MS, Ahmad W
Previously described sequence variant in CDK5RAP2 gene in a Pakistani family with autosomal recessive primary microcephaly.
BMC Med Genet. 2007;8:58.
BACKGROUND: Autosomal Recessive Primary Microcephaly (MCPH) is a disorder of neurogenic mitosis. MCPH leads to reduced cerebral cortical volume and hence, reduced head circumference associated with mental retardation of variable degree. Genetic heterogeneity is well documented in patients with MCPH with six loci known, while pathogenic sequence variants in four respective genes have been identified so far. Mutations in CDK5RAP2 gene at MCPH3 locus have been least involved in causing MCPH phenotype. METHODS: All coding exons and exon/intron splice junctions of CDK5RAP2 gene were sequenced in affected and normal individuals of Pakistani MCPH family of Kashmiri origin, which showed linkage to MCPH3 locus on chromosome 9q33.2. RESULTS: A previously described nonsense mutation [243 T>A (S81X)] in exon 4 of CDK5RAP2 gene has been identified in the Pakistani family, presented here, with MCPH Phenotype. Genomic and cDNA sequence comparison revealed that the exact nomenclature for this mutation is 246 T>A (Y82X). CONCLUSION: Recurrent observation of Y82X mutation in CDK5RAP2 gene in this Pakistani family may be a sign of confinement of a rare ancestral haplotype carrying this pathogenic variant within Northern Pakistani population, as this has not been reported in any other population. [Abstract/Link to Full Text]

Fernandez F, Curtain RP, Colson NJ, Ovcaric M, MacMillan J, Griffiths LR
Association analysis of chromosome 1 migraine candidate genes.
BMC Med Genet. 2007;8:57.
BACKGROUND: Migraine with aura (MA) is a subtype of typical migraine. MA also encompasses a rare severe subtype, Familial Hemiplegic Migraine (FHM), with several known genetic loci. The type 2 FHM (FHM-2) susceptibility locus maps to chromosome 1q23 and mutations in the ATP1A2 gene at this site have recently been implicated. We have previously provided evidence of linkage of typical migraine (predominantly MA) to microsatellite markers on chromosome 1, in the 1q31 and 1q23 regions. In this study, we have undertaken a large genomic investigation involving candidate genes that lie within the chromosome 1q23 and 1q31 regions using an association analysis approach. METHODS: We have genotyped a large case-control population (243 unrelated Caucasian migraineurs versus 243 controls), examining a set of 5 single nucleotide polymorphisms (SNPs) and the Fas Ligand dinucleotide repeat marker, located within the chromosome 1q23 and 1q31 regions. RESULTS: Several genes have been studied including membrane protein (ATP 1 subtype A4 and FasL), cytoplasmic glycoprotein (CASQ 1) genes and potassium (KCN J9 and KCN J10) and calcium (CACNA1E) channel genes in 243 migraineurs (including 85% MA and 15% of migraine without aura (MO)) and 243 matched controls. After correction for multiple testing, chi-square results showed non-significant P values (P > 0.008) across all SNPs (and a CA repeat) tested in these different genes; however, the KCN J10 marker gave a result (P = 0.02) that may be worth exploring further in other populations. CONCLUSION: These results do not show a significant role for the tested candidate gene variants and also do not support the hypothesis that a common chromosome 1 defective gene influences both FHM and the more common forms of migraine. [Abstract/Link to Full Text]
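
The abstract above does not name its multiple-testing correction, but the stated threshold of 0.008 is consistent with a Bonferroni correction of alpha = 0.05 over six markers (5 SNPs plus one repeat marker). The sketch below works through that arithmetic under this assumption; the function name `bonferroni_threshold` is hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Assumption: six tests (5 SNPs + 1 dinucleotide repeat) at a family-wise alpha of 0.05.
threshold = bonferroni_threshold(0.05, 6)

# The KCN J10 P value reported in the abstract, compared against that threshold.
kcnj10_p = 0.02
kcnj10_significant = kcnj10_p < threshold
```

Under this reading, P = 0.02 would pass an uncorrected 0.05 cutoff but not the corrected one, which matches the abstract's description of the result as non-significant.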

Evans D, Beil FU
The D9N, N291S and S447X variants in the lipoprotein lipase (LPL) gene are not associated with Type III hyperlipidemia.
BMC Med Genet. 2007;8:56.
BACKGROUND: Type III hyperlipidemia (Type III HLP) is associated with homozygosity for the epsilon2 allele of the APOE gene. However, only about 10% of epsilon2 homozygotes develop Type III HLP and it is assumed that additional genetic and/or environmental factors are required for its development. Common variants in the LPL gene have been proposed as likely genetic co-factors. METHODS: The frequency of the LPL SNPs D9N, N291S and S447X in 100 patients with hyperlipidemia and APOE2/2 genotype has been determined and compared to that in healthy blood donors and patients with hyperlipidemia. RESULTS: There was no statistically significant difference in the frequencies of the variants between APOE2/2 patients and controls. CONCLUSION: It is unlikely that the D9N, N291S or S447X variants in the LPL gene play an important role in the development of Type III HLP. [Abstract/Link to Full Text]

Uthurralt J, Gordish-Dressman H, Bradbury M, Tesi-Rocha C, Devaney J, Harmon B, Reeves EK, Brandoli C, Hansen BC, Seip RL, Thompson PD, Price TB, Angelopoulos TJ, Clarkson PM, Moyna NM, Pescatello LS, Visich PS, Zoeller RF, Gordon PM, Hoffman EP
PPARalpha L162V underlies variation in serum triglycerides and subcutaneous fat volume in young males.
BMC Med Genet. 2007;8:55.
BACKGROUND: Of the five sub-phenotypes defining metabolic syndrome, all are known to have strong genetic components (typically 50-80% of population variation). Studies defining genetic predispositions have typically focused on older populations with metabolic syndrome and/or type 2 diabetes. We hypothesized that the study of younger populations would mitigate many confounding variables, and allow us to better define genetic predisposition loci for metabolic syndrome. METHODS: We studied 610 young adult volunteers (average age 24 yrs) for metabolic syndrome markers, and volumetric MRI of upper arm muscle, bone, and fat pre- and post-unilateral resistance training. RESULTS: We found the PPARalpha L162V polymorphism to be a strong determinant of serum triglyceride levels in young White males, where carriers of the V allele showed a 78% increase in triglycerides relative to L homozygotes (LL = 116 +/- 11 mg/dL, LV = 208 +/- 30 mg/dL; p = 0.004). Men with the V allele showed lower HDL (LL = 42 +/- 1 mg/dL, LV = 34 +/- 2 mg/dL; p = 0.001), but women did not. Subcutaneous fat volume was higher in males carrying the V allele; however, exercise training increased fat volume of the untrained arm in V carriers, while LL genotypes significantly decreased in fat volume (LL = -1,707 +/- 21 mm3, LV = 17,617 +/- 58 mm3; p = 0.002), indicating a systemic effect of the V allele on adiposity after unilateral training. Our study suggests that the primary effect of PPARalpha L162V is on serum triglycerides, with downstream effects on adiposity and response to training. CONCLUSION: Our results on association of PPARalpha and triglycerides in males showed a much larger effect of the V allele than previously reported in older and less healthy populations. Specifically, we showed the V allele to increase triglycerides by 78% (p = 0.004), and this single polymorphism accounted for 3.8% of all variation in serum triglycerides in males (p = 0.0037). [Abstract/Link to Full Text]

Santiago JL, Martínez A, de la Calle H, Fernández-Arquero M, Figueredo MA, de la Concha EG, Urcelay E
Susceptibility to type 1 diabetes conferred by the PTPN22 C1858T polymorphism in the Spanish population.
BMC Med Genet. 2007;8:54.
BACKGROUND: The protein tyrosine phosphatase N22 gene (PTPN22) encodes a lymphoid-specific phosphatase (LYP) which is an important downregulator of T cell activation. A PTPN22 polymorphism, C1858T, was found associated with type 1 diabetes (T1D) in different Caucasian populations. In this study, we aimed at confirming the role of this variant in T1D predisposition in the Spanish population. METHODS: A case-control study was performed with 316 Spanish white T1D patients consecutively recruited and 554 healthy controls, all of them from the Madrid area. The PTPN22 C1858T SNP was genotyped in both patients and controls using a TaqMan Assay in a 7900 HT Fast Real-Time PCR System. RESULTS: We replicated for the first time in a Spanish population the association of the 1858T allele with an increased risk for developing T1D [carriers of allele T vs. CC: OR (95% CI) = 1.73 (1.17-2.54); p = 0.004]. Furthermore, this allele showed a significant association in female patients with diabetes onset before age 16 years [carriers of allele T vs. CC: OR (95% CI) = 2.95 (1.45-6.01), female patients vs. female controls p = 0.0009]. No other associations in specific subgroups stratified for gender, HLA susceptibility or age at onset were observed. CONCLUSION: Our results provide evidence that the PTPN22 1858T allele is a T1D susceptibility factor also in the Spanish population and it might play a different role in susceptibility to T1D according to gender in early-onset T1D patients. [Abstract/Link to Full Text]

Gutiérrez-Aguilar R, Benmezroua Y, Vaillant E, Balkau B, Marre M, Charpentier G, Sladek R, Froguel P, Neve B
Analysis of KLF transcription factor family gene variants in type 2 diabetes.
BMC Med Genet. 2007;8:53.
BACKGROUND: The Krüppel-like factor (KLF) family consists of transcription factors that can activate or repress different genes implicated in processes such as differentiation, development, and cell cycle progression. Moreover, several of these proteins have been implicated in glucose homeostasis, making them candidate genes for involvement in type 2 diabetes (T2D). METHODS: Variants of nine KLF genes were genotyped in T2D cases and controls and analysed in a two-stage study. The first case-control set included 365 T2D patients with a strong family history of T2D and 363 normoglycemic individuals and the second set, 750 T2D patients and 741 normoglycemic individuals, all of French origin. The SNPs of six KLF genes were genotyped by Taqman SNP Genotyping Assays. The other three KLF genes (KLF2, -15 and -16) were screened and the identified frequent variants of these genes were analysed in the case-control studies. RESULTS: Three of the 28 SNPs showed a trend to be associated with T2D in our first case-control set (P < 0.10). These SNPs, located in the KLF2, KLF4 and KLF5 genes, were then analysed in our second replication set, but analysis of this set and the combined analysis of the three variants in all 2,219 individuals did not show an association with T2D in this French population. As the KLF2, -15 and -16 variants were representative of the genetic variability in these genes, we conclude they do not contribute to genetic susceptibility for T2D. CONCLUSION: It is unlikely that variants in different members of the KLF gene family play a major role in T2D in the French population. [Abstract/Link to Full Text]

Zhang X, Chen L, Liu J, Zhao Z, Qu E, Wang X, Chang W, Xu C, Wang QK, Liu M
A novel DSPP mutation is associated with type II dentinogenesis imperfecta in a Chinese family.
BMC Med Genet. 2007;8:52.
BACKGROUND: Hereditary defects of tooth dentin are classified into two main groups: dentin dysplasia (DD) (types I and II) and dentinogenesis imperfecta (DGI) (types I, II, and III). Type II DGI is one of the most common tooth defects with an autosomal dominant mode of inheritance. One disease-causing gene, the dentin sialophosphoprotein (DSPP) gene, has been reported for type II DGI. METHODS: In this study, we characterized a four-generation Chinese family with type II DGI that consists of 18 living family members, including 8 affected individuals. Linkage analysis with polymorphic markers D4S1534 and D4S414, which span the DSPP gene, showed that the family is linked to DSPP. All five exons and exon-intron boundaries of DSPP were sequenced in members of the type II DGI family. RESULTS: Direct DNA sequence analysis identified a novel mutation (c.49C>T, p.Pro17Ser) in exon 1 of the DSPP gene. The mutated residue, Pro17, is the second amino acid of the mature DSP protein and is highly conserved during evolution. The mutation was identified in all affected individuals, but not in unaffected family members or in 100 controls. CONCLUSION: These results suggest that mutation p.Pro17Ser causes type II DGI in the Chinese family. This study identifies a novel mutation in the DSPP gene, and expands the spectrum of mutations that cause DGI. [Abstract/Link to Full Text]

Qu HQ, Polychronakos C
The TCF7L2 locus and type 1 diabetes.
BMC Med Genet. 2007;8:51.
BACKGROUND: TCF7L2 belongs to a subfamily of TCF7-like HMG box-containing transcription factors, and maps to human chromosome 10q25.3. A recent study identified genetic association of type 2 diabetes (T2D) with this gene, correlated with diminished insulin secretion. This study aimed to investigate the possibility of genetic association between TCF7L2 and type 1 diabetes (T1D). METHODS: The SNP most significantly associated with T2D, rs7903146, was genotyped in 886 T1D nuclear family trios of mixed European descent. RESULTS: This study found no T1D association with, and no age-of-onset effect from, rs7903146. CONCLUSION: This study suggests that a T2D mechanism mediated by TCF7L2 does not participate in the etiology of T1D. [Abstract/Link to Full Text]

Chu MW, Siegmund KD, Eckstam CL, Kim JY, Yang AS, Kanel GC, Tavaré S, Shibata D
Lack of increases in methylation at three CpG-rich genomic loci in non-mitotic adult tissues during aging.
BMC Med Genet. 2007;8:50.
BACKGROUND: Cell division occurs during normal human development and aging. Despite the likely importance of cell division to human pathology, it has been difficult to infer somatic cell mitotic ages (total numbers of divisions since the zygote) because direct counting of lifetime numbers of divisions is currently impractical. Here we attempt to infer relative mitotic ages with a molecular clock hypothesis. Somatic genomes may record their mitotic ages because greater numbers of replication errors should accumulate after greater numbers of divisions. Mitotic ages will vary between cell types if they divide at different times and rates. METHODS: Age-related increases in DNA methylation at specific CpG sites (termed "epigenetic molecular clocks") have been previously observed in mitotic human epithelium such as the intestines and endometrium. These CpG-rich sequences or "tags" start unmethylated, and changes in their methylation during development and aging potentially represent replication errors. To help distinguish between mitotic and time-associated changes, DNA methylation tag patterns at 8-20 CpGs within three different genes, two on autosomes and one on the X-chromosome, were measured by bisulfite sequencing in heart, brain, kidney and liver from autopsies of 21 individuals of different ages. RESULTS: Levels of DNA methylation were significantly greater in adult compared to fetal or newborn tissues for two of the three examined tags. Consistent with the relative absence of cell division in these adult tissues, there were no significant increases in tag methylation after infancy. CONCLUSION: Many somatic methylation changes at certain CpG-rich regions or tags appear to represent replication errors because this methylation increases with chronological age in mitotic epithelium but not in non-mitotic organs. Tag methylation accumulates differently in different tissues, consistent with their expected genealogies and mitotic ages. Although further studies are necessary, these results suggest numbers of divisions and ancestry are at least partially recorded by epigenetic replication errors within somatic cell genomes. [Abstract/Link to Full Text]

Kaushal R, Woo D, Pal P, Haverbusch M, Xi H, Moomaw C, Sekar P, Kissela B, Kleindorfer D, Flaherty M, Sauerbeck L, Chakraborty R, Broderick J, Deka R
Subarachnoid hemorrhage: tests of association with apolipoprotein E and elastin genes.
BMC Med Genet. 2007;8:49.
BACKGROUND: Apolipoprotein E (APOE) and elastin (ELN) are plausible candidate genes involved in the pathogenesis of stroke. We tested for association of variants in APOE and ELN with subarachnoid hemorrhage (SAH) in a population-based study. We genotyped 12 single nucleotide polymorphisms (SNPs) in APOE and 10 SNPs in ELN in a sample of 309 Caucasian individuals, of whom 107 are SAH cases and 202 are age-, race-, and gender-matched controls from the Greater Cincinnati/Northern Kentucky region. Associations were tested at genotype, allele, and haplotype levels. A genomic control analysis was performed to check for spurious associations resulting from population substructure. RESULTS: At the APOE locus, no individual SNP was associated with SAH after correction for multiple comparisons. Haplotype analysis revealed significant association of the major haplotype (Hap1) in APOE with SAH (p = 0.001). The association stemmed from both the 5' promoter and the 3' region of the APOE gene. APOE epsilon2 and epsilon4 were not significantly associated with SAH. No association was observed for ELN at the genotype, allele, or haplotype level, and our study failed to confirm previous reports of ELN association with aneurysmal SAH. CONCLUSION: This study suggests a role of the APOE gene in the etiology of aneurysmal SAH. [Abstract/Link to Full Text]

Kumar RA, Everman DB, Morgan CT, Slavotinek A, Schwartz CE, Simpson EM
Absence of mutations in NR2E1 and SNX3 in five patients with MMEP (microcephaly, microphthalmia, ectrodactyly, and prognathism) and related phenotypes.
BMC Med Genet. 2007;8:48.
BACKGROUND: A disruption of sorting nexin 3 (SNX3) on 6q21 was previously reported in a patient with MMEP (microcephaly, microphthalmia, ectrodactyly, and prognathism) and t(6;13)(q21;q12), but no SNX3 mutations were identified in another sporadic case of MMEP, suggesting involvement of another gene. In this work, SNX3 was sequenced in three patients not previously studied for this gene. In addition, we test the hypothesis that mutations in the neighbouring gene NR2E1 may underlie MMEP and related phenotypes. METHODS: Mutation screening was performed in five patients: the t(6;13)(q21;q12) MMEP patient, three additional patients with possible MMEP or a related phenotype, and one patient with oligodactyly, ulnar aplasia, and a t(6;7)(q21;q31.2) translocation. We used sequencing to exclude SNX3 coding mutations in three patients not previously studied for this gene. To test the hypothesis that mutations in NR2E1 may contribute to MMEP or related phenotypes, we sequenced the entire coding region, complete 5' and 3' untranslated regions, consensus splice-sites, and evolutionarily conserved regions including core and proximal promoter in all five patients. Two hundred and fifty control subjects were genotyped for any candidate mutation. RESULTS: We did not detect any synonymous or nonsynonymous coding mutations of NR2E1 or SNX3. In one patient with possible MMEP, we identified a candidate regulatory mutation that has been reported previously in a patient with microcephaly but was not found in the 250 control subjects examined here. CONCLUSION: Our results do not support involvement of coding mutations in NR2E1 or SNX3 in MMEP or related phenotypes; however, we cannot exclude the possibility that regulatory NR2E1 or SNX3 mutations or deletions at this locus may underlie abnormal human cortical development in some patients. [Abstract/Link to Full Text]

Kepp K, Juhanson P, Kozich V, Ots M, Viigimaa M, Laan M
Resequencing PNMT in European hypertensive and normotensive individuals: no common susceptibility variants for hypertension and purifying selection on intron 1.
BMC Med Genet. 2007;8:47.
BACKGROUND: Human linkage and animal QTL studies have indicated the contribution of genes on Chr17 to blood pressure regulation. One candidate gene is PNMT, coding for phenylethanolamine-N-methyltransferase, which catalyzes the synthesis of epinephrine from norepinephrine. METHODS: Fine-scale variation of PNMT was screened by resequencing hypertensive (n = 50) and normotensive (n = 50) individuals from two European populations (Estonians and Czechs). The resulting polymorphism data were analyzed by statistical genetics methods using Genepop 3.4, PHASE 2.1 and DnaSP 4.0 software programs. In silico prediction of transcription factor binding sites for intron 1 was performed with MatInspector 2.2 software. RESULTS: PNMT was characterized by minimal variation and an excess of rare SNPs in both normo- and hypertensive individuals. None of the SNPs showed significant differences in allelic frequencies among population samples, or between screened hypertensives and normotensives. In the joint case-control analysis of the Estonian and the Czech samples, hypertension patients had a significant excess of heterozygotes for two promoter region polymorphisms (SNP-184; SNP-390). The identified variation pattern of PNMT reflects the effect of purifying selection, consistent with an important role of PNMT-synthesized epinephrine in the regulation of cardiovascular and metabolic functions, and as a CNS neurotransmitter. A striking feature is the lack of intronic variation. In silico analysis of PNMT intron 1 confirmed the presence of a human-specific putative Glucocorticoid Responsive Element (GRE), inserted by Alu-mediated transfer. Further analysis of intron 1 supported the possible existence of a full Glucocorticoid Responsive Unit (GRU) predicted to consist of multiple gene regulatory elements known to cooperate with GRE in driving transcription. The role of these elements in regulating PNMT expression patterns and thus determining the dynamics of the synthesis of epinephrine is still to be studied. CONCLUSION: We suggest that the differences in PNMT expression between normotensives and hypertensives are not determined by the polymorphisms in this gene, but rather by the interplay of gene expression regulators, which may vary among individuals. Understanding the determinants of PNMT expression may assist in developing PNMT inhibitors as potential novel therapeutics. [Abstract/Link to Full Text]

Del Mastro RG, Turenne L, Giese H, Keith TP, Van Eerdewegh P, May KJ, Little RD
Mechanistic role of a disease-associated genetic variant within the ADAM33 asthma susceptibility gene.
BMC Med Genet. 2007;8:46.
BACKGROUND: ADAM33 has been identified as an asthma-associated gene in an out-bred population. Genetic studies suggested that the functional role of this metalloprotease was in airway remodeling. However, the mechanistic roles of the disease-associated SNPs have yet to be elucidated, especially in the context of the pathophysiology of asthma. One disease-associated SNP, BC+1, which resides in intron BC toward the 5' end of ADAM33, is highly associated with the disease. METHODS: The region surrounding this genetic variant was cloned into a model system to determine if there is a regulatory element within this intron that influences transcription. RESULTS: The BC+1 protective allele did not have any effect on the transcription of the reporter gene. However, the at-risk allele had such a repressive effect on the promoter that no protein product from the reporter gene was detected. These results indicated that there exists within intron BC a regulatory element that acts as a repressor of gene expression. Moreover, since SNP BC+1 is a common genetic variant, this region may interact with other undefined regulatory elements within ADAM33 to provide a rheostat effect, which modulates pre-mRNA processing. Thus, SNP BC+1 may have an important role in the modulation of ADAM33 gene expression. CONCLUSION: These data provide for the first time a functional role for a disease-associated SNP in ADAM33 and begin to shed light on the deregulation of this gene in the pathophysiology of asthma. [Abstract/Link to Full Text]

Ruixing Y, Guangqin C, Yong W, Weixiong L, Dezhai Y, Shangling P
Effect of the 3'APOB-VNTR polymorphism on the lipid profiles in the Guangxi Hei Yi Zhuang and Han populations.
BMC Med Genet. 2007;8:45.
BACKGROUND: Apolipoprotein (Apo) B is the major component of low-density lipoprotein (LDL), very low-density lipoprotein (VLDL) and chylomicrons. Many genetic polymorphisms of Apo B have been described and associated with variation in lipid levels. However, very few studies have evaluated the effect of the polymorphism in the variable number of tandem repeats region 3' of the Apo B gene (3'APOB-VNTR) on the lipid profiles of minority subgroups in China. Thus, the present study was undertaken to study the effect of the 3'APOB-VNTR polymorphism on the serum lipid levels in the Guangxi Hei Yi Zhuang and Han populations. METHODS: A total of 548 Hei Yi Zhuang people were surveyed by stratified randomized cluster sampling. The epidemiological survey was performed using internationally standardized methods. Serum lipid and apolipoprotein levels were measured. The 3'APOB-VNTR alleles were determined by polymerase chain reaction (PCR) followed by electrophoresis in polyacrylamide gels, and classified according to the number of repeats of a 15-bp hypervariable element (HVE). The sequence of the most common allele was determined using PCR and direct sequencing. The possible association between alleles of the 3'APOB-VNTR and lipid variables was examined. The results were compared with those in 496 Han people who also live in that district. RESULTS: Nineteen alleles ranging from 24 to 64 repeats were detected in both Hei Yi Zhuang and Han. HVE56 and HVE58 were not detected in Hei Yi Zhuang, whereas HVE48 and HVE62 were totally absent in Han. The frequencies of HVE26, HVE30, HVE46, heterozygotes, and short alleles (< 38 repeats) were higher in Hei Yi Zhuang than in Han, whereas the frequencies of HVE34, HVE38, HVE40, homozygotes, and long alleles (≥ 38 repeats) were lower in Hei Yi Zhuang than in Han (P < 0.05-0.01). The levels of total cholesterol (TC), high-density lipoprotein cholesterol (HDL-C) and Apo B in Hei Yi Zhuang, but not in Han, were higher in VNTR-LS (carriers of one long and one short allele) than in VNTR-LL (carriers of two long alleles) genotypes. The levels of TC, triglycerides (TG), LDL cholesterol, and Apo B in Hei Yi Zhuang were higher in carriers of the HVE34 and HVE36 alleles than in carriers of the HVE32 allele. The levels of TC, TG, HDL-C and Apo B in Hei Yi Zhuang were also higher in homozygotes than in heterozygotes. There were no significant differences in the detected lipid parameters between the VNTR-SS (carriers of two short alleles) and VNTR-LS or VNTR-LL genotypes in either ethnic group. CONCLUSION: There were significant differences in the 3'APOB-VNTR polymorphism between the Hei Yi Zhuang and Han populations. An association between the 3'APOB-VNTR polymorphism and serum lipid levels was observed in the Hei Yi Zhuang but not in the Han population. [Abstract/Link to Full Text]

Bouatia-Naji N, Vatin V, Lecoeur C, Heude B, Proença C, Veslot J, Jouret B, Tichet J, Charpentier G, Marre M, Balkau B, Froguel P, Meyre D
Secretory granule neuroendocrine protein 1 (SGNE1) genetic variation and glucose intolerance in severe childhood and adult obesity.
BMC Med Genet. 2007;8:44.
BACKGROUND: 7B2 is a regulator/activator of the prohormone convertase 2, which is involved in the processing of numerous neuropeptides, including insulin, glucagon and pro-opiomelanocortin. We have previously described a suggestive genetic linkage peak with childhood obesity on chr15q12-q14, where the 7B2-encoding gene, SGNE1, is located. The aim of this study is to analyze associations of SGNE1 genetic variation with obesity and metabolism-related quantitative traits. METHODS: We screened SGNE1 for genetic variants in obese children and genotyped 12 frequent single nucleotide polymorphisms (SNPs). Case-control analyses were performed in 1,229 obese individuals (534 children and 695 adults), 1,535 individuals with type 2 diabetes and 1,363 controls, all French Caucasians. We also studied 4,922 participants from the D.E.S.I.R prospective population-based cohort. RESULTS: We did not find any association between SGNE1 SNPs and childhood or adult obesity. However, the 5' region SNP -1,701A>G was associated with higher area under the glucose curve after an oral glucose tolerance test (p = 0.0005), higher HOMA-IR (p = 0.005) and lower insulinogenic index (p = 0.0003) in obese children. Similar trends were found in obese adults. SNP -1,701A>G was not associated with risk of T2D in the case-control analysis but tended to be associated with incidence of type 2 diabetes (HR = 0.75 95%CI [0.55-1.01]; p = 0.06) in the prospective cohort. CONCLUSION: SGNE1 genetic variation does not contribute to obesity and common forms of T2D but may worsen glucose intolerance and insulin resistance, especially in the background of severe and early onset obesity. Further molecular studies are required to understand the molecular bases involved in this process. [Abstract/Link to Full Text]

Aartsma-Rus A, Janson AA, van Ommen GJ, van Deutekom JC
Antisense-induced exon skipping for duplications in Duchenne muscular dystrophy.
BMC Med Genet. 2007;8:43.
BACKGROUND: Antisense-mediated exon skipping is currently one of the most promising therapeutic approaches for Duchenne muscular dystrophy (DMD). Using antisense oligonucleotides (AONs) targeting specific exons the DMD reading frame is restored and partially functional dystrophins are produced. Following proof of concept in cultured muscle cells from patients with various deletions and point mutations, we now focus on single and multiple exon duplications. These mutations are in principle ideal targets for this approach since the specific skipping of duplicated exons would generate original, full-length transcripts. METHODS: Cultured muscle cells from DMD patients carrying duplications were transfected with AONs targeting the duplicated exons, and the dystrophin RNA and protein were analyzed. RESULTS: For two brothers with an exon 44 duplication, skipping was, even at suboptimal transfection conditions, so efficient that both exons 44 were skipped, thus generating, once more, an out-of-frame transcript. In such cases, one may resort to multi-exon skipping to restore the reading frame, as is shown here by inducing skipping of exon 43 and both exons 44. By contrast, in cells from a patient with an exon 45 duplication we were able to induce single exon 45 skipping, which allowed restoration of wild type dystrophin. The correction of a larger duplication (involving exons 52 to 62), by combinations of AONs targeting the outer exons, appeared problematic due to inefficient skipping and mistargeting of original instead of duplicated exons. CONCLUSION: The correction of DMD duplications by exon skipping depends on the specific exons targeted. Its options vary from the ideal one, restoring for the first time the true, wild type dystrophin, to requiring more 'classical' skipping strategies, while the correction of multi-exon deletions may need the design of tailored approaches. [Abstract/Link to Full Text]
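
The reading-frame reasoning in the abstract above can be illustrated with a minimal Python sketch: a transcript stays in frame when the net change in length relative to wild type is a multiple of three. The exon lengths used here (exon 43 = 173 bp, exon 44 = 148 bp, exon 45 = 176 bp) are assumptions taken from commonly cited DMD exon sizes and are not stated in the abstract; the function names are hypothetical.

```python
# Exon lengths in base pairs. These values are assumptions (commonly cited
# DMD exon sizes); the abstract itself does not state them.
EXON_LENGTH_BP = {43: 173, 44: 148, 45: 176}


def net_length_change(duplicated, skipped):
    """Return the net transcript length change (bp) relative to wild type.

    duplicated: exon numbers present as one extra copy in the patient.
    skipped: exon numbers removed by AONs, one entry per copy skipped.
    """
    gained = sum(EXON_LENGTH_BP[exon] for exon in duplicated)
    lost = sum(EXON_LENGTH_BP[exon] for exon in skipped)
    return gained - lost


def is_in_frame(duplicated, skipped):
    """The reading frame is kept when the net change is a multiple of 3."""
    return net_length_change(duplicated, skipped) % 3 == 0


# Exon 44 duplication, untreated: +148 bp, out of frame.
print(is_in_frame([44], []))
# Both copies of exon 44 skipped: -148 bp, still out of frame.
print(is_in_frame([44], [44, 44]))
# Exon 43 plus both copies of exon 44 skipped: -321 bp, in frame.
print(is_in_frame([44], [43, 44, 44]))
# Exon 45 duplication with one copy skipped: 0 bp, wild-type length.
print(is_in_frame([45], [45]))
```

Under these assumed lengths the sketch reproduces the pattern reported for the exon 44 and exon 45 duplications; it says nothing about skipping efficiency, which the abstract identifies as the practical obstacle for the larger exon 52 to 62 duplication.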

Nielsen M, Hes FJ, Vasen HF, van den Hout WB
Cost-utility analysis of genetic screening in families of patients with germline MUTYH mutations.
BMC Med Genet. 2007;8:42.
BACKGROUND: MUTYH-associated polyposis (MAP) is an autosomal recessive inherited disorder. Carriers of bi-allelic MUTYH germline mutations have a risk of approximately 60% of developing colorectal carcinoma (CRC). In the general population, about 1.5% of individuals are heterozygous MUTYH mutation carriers. Children of MAP patients have an increased risk of inheriting two MUTYH mutations compared to the general population, implying an increased risk of developing CRC. METHODS: Using data from the literature and Dutch MAP patients (n = 40), we constructed a Markov model to perform a societal cost-utility analysis of genetic screening in MAP families. Genetic screening was done by testing the spouse first and, in the case of a heterozygous spouse, also testing the children. RESULTS: The cost of genetic screening of families of MAP patients, when compared to no genetic screening, was estimated at 25,000 euros per quality-adjusted life year (QALY). The presence of Fecal Occult Blood testing (FOBT) population screening only slightly increased this cost-utility ratio to 25,500 euros per QALY. For a MUTYH heterozygote index-patient, the ratio was 51,500 euros per QALY. The results of our analysis were sensitive to several of the parameters in the model, including the cost assumed for molecular genetic testing. CONCLUSION: The costs per QALY of genetic screening in families of MAP patients are acceptable according to international standards. Therefore, genetic testing of spouses and/or children should be discussed with and offered to counselees. [Abstract/Link to Full Text]
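
A cost-utility ratio such as the euros-per-QALY figures quoted above is conventionally the difference in cost between two strategies divided by the difference in QALYs. The Python sketch below shows only that final arithmetic step; it does not reproduce the authors' Markov model, the function name is hypothetical, and the input numbers are invented for illustration and are not taken from the study.

```python
def cost_per_qaly(cost_screening, cost_no_screening,
                  qaly_screening, qaly_no_screening):
    """Incremental cost per QALY gained for screening versus no screening."""
    delta_cost = cost_screening - cost_no_screening
    delta_qaly = qaly_screening - qaly_no_screening
    if delta_qaly == 0:
        # The ratio is undefined when the strategies yield identical QALYs.
        raise ValueError("no QALY difference between strategies")
    return delta_cost / delta_qaly


# Hypothetical per-person averages, chosen only to make the arithmetic clear:
# 2,000 euros of extra cost buys 0.5 extra QALYs.
print(cost_per_qaly(3000.0, 1000.0, 10.5, 10.0))  # 4000.0 euros per QALY
```

In a full analysis the four inputs would come from the model's expected lifetime costs and QALYs for each strategy, which is why the ratio is sensitive to parameters such as the assumed cost of molecular testing.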

Bi LL, Pan G, Atkinson TP, Zheng L, Dale JK, Makris C, Reddy V, McDonald JM, Siegel RM, Puck JM, Lenardo MJ, Straus SE
Dominant inhibition of Fas ligand-mediated apoptosis due to a heterozygous mutation associated with autoimmune lymphoproliferative syndrome (ALPS) Type Ib.
BMC Med Genet. 2007;8:41.
BACKGROUND: Autoimmune lymphoproliferative syndrome (ALPS) is a disorder of lymphocyte homeostasis and immunological tolerance due primarily to genetic defects in Fas (CD95/APO-1; TNFRSF6), a cell surface receptor that regulates apoptosis, and its signaling apparatus. METHODS: Fas ligand gene mutations from ALPS patients were identified through cDNA and genomic DNA sequencing. Molecular and biochemical assessment of these mutant Fas ligand proteins was carried out by expressing the mutant FasL cDNA in mammalian cells and analyzing its effects on Fas-mediated programmed cell death. RESULTS: We found an ALPS patient who harbored a heterozygous A530G mutation in the FasL gene that replaced Arg with Gly at position 156 in the protein's extracellular Fas-binding region. This produced a dominant-interfering FasL protein that bound to the wild-type FasL protein and prevented it from effectively inducing apoptosis. CONCLUSION: Our data explain how a naturally occurring heterozygous human FasL mutation can dominantly interfere with normal FasL apoptotic function and lead to an ALPS phenotype, designated Type Ib. [Abstract/Link to Full Text]

Filippini S, Blanco A, Fernández-Marmiesse A, Alvarez-Iglesias V, Ruíz-Ponte C, Carracedo A, Vega A
Multiplex SNaPshot for detection of BRCA1/2 common mutations in Spanish and Spanish related breast/ovarian cancer families.
BMC Med Genet. 2007;8:40.
BACKGROUND: It is estimated that 5-10% of all breast cancers are hereditary and attributable to mutations in the highly penetrant susceptibility genes BRCA1 and BRCA2. The genetic analysis of these genes is complex and expensive, essentially because of their length. Nevertheless, the presence of recurrent and founder mutations allows a pre-screening for the most frequent mutations found in each geographical region. In Spain, five mutations in BRCA1 and another five in BRCA2 account for approximately 50% of the mutations detected in Spanish families. METHODS: We have developed a novel PCR multiplex SNaPshot reaction that targets all ten recurrent and founder mutations identified in BRCA1 and BRCA2 in Spain to date. RESULTS: The SNaPshot reaction was performed on samples previously analyzed by direct sequencing and all mutations were concordant. This strategy permits the analysis of approximately 50% of all mutations observed to be responsible for breast/ovarian cancer in Spanish families using a single reaction per patient sample. CONCLUSION: The SNaPshot assay developed is sensitive and rapid, has a minimal cost per sample, and can additionally be automated for high-throughput genotyping. The SNaPshot assay outlined here is useful not only for analysis of Spanish breast/ovarian cancer families, but also for populations with Spanish ancestry, such as those in Latin America. [Abstract/Link to Full Text]

Gomez LC, Real SM, Ojeda MS, Gimenez S, Mayorga LS, Roqué M
Polymorphism of the FABP2 gene: a population frequency analysis and an association study with cardiovascular risk markers in Argentina.
BMC Med Genet. 2007;8:39.
BACKGROUND: The FABP2 gene encodes the intestinal FABP (IFABP) protein, which is expressed only in intestinal enterocytes. A polymorphism at codon 54 in exon 2 of the FABP2 gene exchanges an Alanine (Ala), in the small helical region of the protein, for Threonine (Thr). Given the potential physiological role of the Ala54Thr FABP2 polymorphism, we assess in this study the local population frequency and analyze possible associations with five selected markers, i.e. glycemia, total cholesterol, body mass index (BMI), hypertension, and high Cardiovascular Risk Index (CVR index). METHODS: We studied 86 men and 116 women. DNA was extracted from a blood drop for genotype analysis. Allele frequencies were calculated by direct counting. Hardy-Weinberg equilibrium was evaluated using a Chi-square goodness-of-fit test. For the polymorphism association analysis, five markers were selected, i.e. blood pressure, Framingham Risk Index, total cholesterol, BMI, and glycemia. For each marker, the Odds Ratio (OR) was calculated with an online statistics tool. RESULTS: Our results reveal a population polymorphism frequency similar to that in previous European studies, with q = 0.277 (95% confidence limits 0.234-0.323). No significant association was found with any of the tested markers in the context of our Argentine nutritional and cultural habits. We did, however, observe a tendency for increased cholesterol and high BMI in Thr54 carriers. CONCLUSION: This is the first study to look at the population frequency of the Thr54 allele in Argentina. The obtained result does not differ from previously reported frequencies in European populations. Moreover, we found no association between the Thr54 allele and any of the five selected markers. The observed tendency towards increased total cholesterol and elevated BMI in Thr54 carriers, although not significant at p < 0.1, may be worth further investigation to establish whether the Thr54 variant should be taken into consideration in cardiovascular prevention strategies. [Abstract/Link to Full Text]
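
The two calculations named in the methods above, allele frequency by direct counting and a chi-square goodness-of-fit test for Hardy-Weinberg equilibrium, can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis: the function names are hypothetical and the genotype counts are invented, since the abstract does not report them.

```python
def allele_frequencies(n_ala_ala, n_ala_thr, n_thr_thr):
    """Return (p, q): Ala54 and Thr54 allele frequencies by direct counting."""
    total_alleles = 2 * (n_ala_ala + n_ala_thr + n_thr_thr)
    # Each Thr/Thr homozygote carries two Thr alleles, each heterozygote one.
    q = (2 * n_thr_thr + n_ala_thr) / total_alleles
    return 1.0 - q, q


def hwe_chi_square(n_ala_ala, n_ala_thr, n_thr_thr):
    """Chi-square goodness-of-fit statistic against Hardy-Weinberg proportions.

    Assumes both alleles are present, so every expected count is non-zero.
    """
    n = n_ala_ala + n_ala_thr + n_thr_thr
    p, q = allele_frequencies(n_ala_ala, n_ala_thr, n_thr_thr)
    observed = (n_ala_ala, n_ala_thr, n_thr_thr)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical genotype counts (not the study's data).
p, q = allele_frequencies(100, 80, 20)
chi2 = hwe_chi_square(100, 80, 20)
print(round(q, 3))     # 0.3
print(round(chi2, 3))  # 0.454
```

With two alleles the test has one degree of freedom, so a statistic below 3.841 gives no evidence of departure from Hardy-Weinberg equilibrium at the 0.05 level.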

Cauchi S, Meyre D, Choquet H, Deghmoun S, Durand E, Gaget S, Lecoeur C, Froguel P, Levy-Marchal C
TCF7L2 rs7903146 variant does not associate with smallness for gestational age in the French population.
BMC Med Genet. 2007;8:37.
BACKGROUND: In adults, the TCF7L2 rs7903146 T allele, commonly associated with type 2 diabetes (T2D), has also been associated with a lower body mass index (BMI) in T2D individuals and with a smaller waist circumference in subjects with impaired glucose tolerance. METHODS: The present association study aimed at analyzing the contribution of the rs7903146 SNP to smallness for gestational age (SGA) and metabolic profiles in subjects with SGA or appropriate for gestational age birth weight (AGA). Two groups of French Caucasian subjects were selected on birth data: SGA (birth weight < 10th percentile; n = 764), and AGA (25th < birth weight < 75th percentile; n = 627). Family-based association tests were also performed in 3,012 subjects from 628 SGA and AGA pedigrees. RESULTS: The rs7903146 genotypic distributions between AGA (30.7%) and SGA (29.0%) were not statistically different (allelic OR = 0.92 [0.78-1.09], p = 0.34). Family-based association studies did not show a distortion of T allele transmission in SGA subjects (p = 0.52). No significant effect of the T allele was detected on any of the metabolic parameters in the SGA group. However, in the AGA group, trends towards a lower insulin secretion (p = 0.03) and a higher fasting glycaemia (p = 0.002) were detected in carriers of the T allele. CONCLUSION: The TCF7L2 rs7903146 variant neither increases the risk for SGA nor modulates birth weight and young adulthood glucose homeostasis in French Caucasian subjects born with SGA. [Abstract/Link to Full Text]
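
An allelic odds ratio with a bracketed interval, as reported above, is typically computed from a 2x2 table of allele counts. The Python sketch below uses the standard log-odds (Woolf) approximation for the 95% confidence interval; whether the authors used this exact method is an assumption, the function name is hypothetical, and the counts are invented for illustration rather than taken from the study.

```python
import math


def allelic_odds_ratio(case_t, case_c, control_t, control_c, z=1.96):
    """Return (odds ratio, lower, upper) for T versus C allele counts.

    case_t, case_c: T and C allele counts in cases (e.g. SGA subjects).
    control_t, control_c: T and C allele counts in controls (e.g. AGA).
    The interval uses the Woolf approximation on the log odds ratio and
    requires all four counts to be non-zero.
    """
    odds_ratio = (case_t * control_c) / (case_c * control_t)
    se_log_or = math.sqrt(1 / case_t + 1 / case_c
                          + 1 / control_t + 1 / control_c)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical allele counts (not the study's data).
or_value, lower, upper = allelic_odds_ratio(20, 80, 10, 90)
print(round(or_value, 2), round(lower, 2), round(upper, 2))
```

An interval that spans 1, like the 0.78-1.09 interval in the abstract, indicates that the data are compatible with no difference in allele frequency between the groups.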

Reamon-Buettner SM, Cho SH, Borlak J
Mutations in the 3'-untranslated region of GATA4 as molecular hotspots for congenital heart disease (CHD).
BMC Med Genet. 2007;8:38.
BACKGROUND: The 3'-untranslated region (3'-UTR) of mRNA contains regulatory elements that are essential for the appropriate expression of many genes. These regulatory elements are involved in the control of nuclear transport, polyadenylation status, subcellular targeting as well as rates of translation and degradation of mRNA. Indeed, 3'-UTR mutations have been associated with disease, but frequently this region is not analyzed. To gain insights into congenital heart disease (CHD), we have been analyzing cardiac-specific transcription factor genes, including GATA4, which encodes a zinc finger transcription factor. Germline mutations in the coding region of GATA4 have been associated with septation defects of the human heart, but mutations are rather rare. Previously, we identified 19 somatically-derived zinc finger mutations in diseased tissues of malformed hearts. We have now continued our search in the 609 bp 3'-UTR of GATA4 to explore further molecular avenues leading to CHD. METHODS: By direct sequencing, we analyzed the 3'-UTR of GATA4 in DNA isolated from 68 formalin-fixed explanted hearts with complex cardiac malformations encompassing ventricular, atrial, and atrioventricular septal defects. We also analyzed blood samples of 12 patients with CHD and 100 unrelated healthy individuals. RESULTS: We identified germline and somatic mutations in the 3'-UTR of GATA4. In the malformed hearts, we found nine frequently occurring sequence alterations and six dbSNPs in the 3'-UTR of GATA4. Seven of these mutations are predicted to affect RNA folding. We also found five further nonsynonymous mutations in exons 6 and 7 of GATA4. Except for the dbSNPs, analysis of tissue distal to the septation defect failed to detect sequence variations in the same donor, thus suggesting somatic origin and mosaicism of mutations. In a family, we observed c.+119A > T in the 3'-UTR associated with ASD type II. CONCLUSION: Our results suggest that somatic GATA4 mutations in the 3'-UTR may provide an additional molecular rationale for CHD. [Abstract/Link to Full Text]

Jordan C, Li HH, Kwan HC, Francke U
Cerebellar gene expression profiles of mouse models for Rett syndrome reveal novel MeCP2 targets.
BMC Med Genet. 2007;8:36.
BACKGROUND: MeCP2, methyl-CpG-binding protein 2, binds to methylated cytosines at CpG dinucleotides, as well as to unmethylated DNA, and affects chromatin condensation. MECP2 mutations in females lead to Rett syndrome, a neurological disorder characterized by developmental stagnation and regression, loss of purposeful hand movements and speech, stereotypic hand movements, deceleration of brain growth, autonomic dysfunction and seizures. Most mutations occur de novo during spermatogenesis. Located at Xq28, MECP2 is subject to X inactivation, and affected females are mosaic. Rare hemizygous males suffer from a severe congenital encephalopathy. METHODS: To identify the pathways misregulated by MeCP2 deficiency, microarray-based global gene expression studies were carried out in cerebellum of Mecp2 mutant mice. We compared transcript levels in mutant/wildtype male sibs of two different MeCP2-deficient mouse models at 2, 4 and 8 weeks of age. Increased transcript levels were evaluated by real-time quantitative RT-PCR. Chromatin immunoprecipitation assays were used to document in vivo MeCP2 binding to promoter regions of candidate target genes. RESULTS: Of several hundred genes with altered expression levels in the mutants, twice as many were increased as decreased, and only 27 were differentially expressed at more than one time point. The number of misregulated genes was 30% lower in mice with the exon 3 deletion (Mecp2tm1.1Jae) than in mice with the larger deletion (Mecp2tm1.1Bird). Between the mutants, few genes overlapped at each time point. Real-time quantitative RT-PCR assays validated increased transcript levels for four genes: Irak1, interleukin-1 receptor-associated kinase 1; Fxyd1, phospholemman, associated with Na, K-ATPase; Reln, encoding an extracellular signaling molecule essential for neuronal lamination and synaptic plasticity; and Gtl2/Meg3, an imprinted maternally expressed non-translated RNA that serves as a host gene for C/D box snoRNAs and microRNAs. Chromatin immunoprecipitation assays documented in vivo MeCP2 binding to promoter regions of Fxyd1, Reln, and Gtl2. CONCLUSION: Transcriptional profiling of cerebellum failed to detect significant global changes in Mecp2-mutant mice. Increased transcript levels of Irak1, Fxyd1, Reln, and Gtl2 may contribute to the neuronal dysfunction in MeCP2-deficient mice and individuals with Rett syndrome. Our data provide testable hypotheses for future studies of the regulatory or signaling pathways that these genes act on. [Abstract/Link to Full Text]

Abu-Amero KK, Al-Mohanna F, Al-Boudari OM, Mohamed GH, Dzimiri N
The interactive role of type 2 diabetes mellitus and E-selectin S128R mutation on susceptibility to coronary heart disease.
BMC Med Genet. 2007;8:35.
BACKGROUND: The role of gene-environment interactions as risk factors for coronary heart disease (CAD) remains largely undefined. Such interactions may involve gene mutations and disease conditions such as type 2 diabetes mellitus (DM2) predisposing individuals to acquiring the disease. METHODS: In the present study, we assessed the possible interactive effect of DM2 and the E-selectin S128R polymorphism on predisposition to CAD, using as a study model a population of 1,112 patients and 427 angiographed controls of Saudi origin. E-selectin genotyping was accomplished by polymerase chain reaction (PCR) amplification followed by PstI restriction enzyme digestion. RESULTS: The results show that DM2 is an independent risk factor for CAD. In the absence of DM2, the presence of the R mutant allele alone is not significantly associated with CAD (p = 0.431, OR 1.28). In contrast, in the presence of DM2 and the S allele, the likelihood of an individual acquiring CAD is significant (odds ratio = 5.44; p < 0.001). This effect of DM2 becomes remarkably greater in the presence of the mutant 128R allele, as can be observed from the odds ratio of their interaction term (odds ratio = 6.11; p < 0.001). CONCLUSION: Our findings indicate therefore that the risk of acquiring CAD in patients with DM2 increases significantly in the presence of the 128R mutant allele of the E-selectin gene. [Abstract/Link to Full Text]

Recent Articles in American Journal of Human Genetics

Heinzen EL, Yoon W, Tate SK, Sen A, Wood NW, Sisodiya SM, Goldstein DB
Nova2 interacts with a cis-acting polymorphism to influence the proportions of drug-responsive splice variants of SCN1A.
Am J Hum Genet. 2007 May;80(5):876-83.
An intronic polymorphism in the SCN1A gene, which encodes a neuronal sodium-channel alpha subunit, has been previously associated with the dosing of two commonly used antiepileptic drugs that elicit their pharmacologic action primarily at this ion-channel subunit. This study sought to characterize the functional effects of this polymorphism on alternative splicing of SCN1A and to explore the potential for modulating the drug response in the pharmacologically unfavorable genotype by identification of a splice modifier acting on SCN1A. The effects of the genotype at the SCN1A IVS5N+5 G-->A polymorphism on SCN1A splice-variant proportions and the consequences of increased expression of splice modifiers were investigated both in human temporal neocortex tissue and in a cellular minigene expression system. Quantitative real-time polymerase chain reaction was used to quantify the amounts of SCN1A transcript forms. We show that the polymorphism has a dramatic effect on the proportions of neonate and adult alternative transcripts of SCN1A in adult brain tissue and that the effect of the polymorphism also appears to be modified by Nova2 expression levels. A minigene expression system confirms both the effect of the polymorphism on transcript proportions and the role of Nova2 in the regulation of splicing, with higher Nova2 expression increasing the proportion of the neonate form. A larger Nova2-mediated effect was detected in the AA genotype that is associated with increased dose requirements. The effects of Nova2 on modulation of the alternative splicing of 17 other neuronally expressed genes were investigated, and no effect was observed. These findings emphasize the emerging role of genetic polymorphisms in modulation of drug effect and illustrate both alternative splicing as a potential therapeutic target and the importance of considering the activity of compounds at alternative splice forms of drug targets in screening programs. [Abstract/Link to Full Text]

Kallberg H, Padyukov L, Plenge RM, Ronnelid J, Gregersen PK, van der Helm-van Mil AH, Toes RE, Huizinga TW, Klareskog L, Alfredsson L
Gene-gene and gene-environment interactions involving HLA-DRB1, PTPN22, and smoking in two subsets of rheumatoid arthritis.
Am J Hum Genet. 2007 May;80(5):867-75.
Gene-gene and gene-environment interactions are key features in the development of rheumatoid arthritis (RA) and other complex diseases. The aim of this study was to use and compare three different definitions of interaction between the two major genetic risk factors of RA--the HLA-DRB1 shared epitope (SE) alleles and the PTPN22 R620W allele--in three large case-control studies: the Swedish Epidemiological Investigation of Rheumatoid Arthritis (EIRA) study, the North American RA Consortium (NARAC) study, and the Dutch Leiden Early Arthritis Clinic study (in total, 1,977 cases and 2,405 controls). The EIRA study was also used to analyze interactions between smoking and the two genes. "Interaction" was defined either as a departure from additivity, as interaction in a multiplicative model, or in terms of linkage disequilibrium--for example, deviation from independence of penetrance of two unlinked loci. Consistent interaction, defined as departure from additivity, between HLA-DRB1 SE alleles and the A allele of PTPN22 R620W was seen in all three studies regarding anti-CCP-positive RA. Testing for multiplicative interactions demonstrated an interaction between the two genes only when the three studies were pooled. The linkage disequilibrium approach indicated a gene-gene interaction in EIRA and NARAC, as well as in the pooled analysis. No interaction was seen between smoking and PTPN22 R620W. A new pattern of interactions is described between the two major known genetic risk factors and the major environmental risk factor concerning the risk of developing anti-CCP-positive RA. The data extend the basis for a pathogenetic hypothesis for RA involving genetic and environmental factors. The study also raises and illustrates principal questions concerning ways to define interactions in complex diseases. [Abstract/Link to Full Text]

Saccone SF, Pergadia ML, Loukola A, Broms U, Montgomery GW, Wang JC, Agrawal A, Dick DM, Heath AC, Todorov AA, Maunu H, Heikkila K, Morley KI, Rice JP, Todd RD, Kaprio J, Peltonen L, Martin NG, Goate AM, Madden PA
Genetic linkage to chromosome 22q12 for a heavy-smoking quantitative trait in two independent samples.
Am J Hum Genet. 2007 May;80(5):856-66.
We conducted a genomewide linkage screen of a simple heavy-smoking quantitative trait, the maximum number of cigarettes smoked in a 24-h period, using two independent samples: 289 Australian and 155 Finnish nuclear multiplex families, all of which were of European ancestry and were targeted for DNA analysis by use of probands with a heavy-smoking phenotype. We analyzed the trait, using a regression of identity-by-descent allele sharing on the sum and difference of the trait values for relative pairs. Suggestive linkage was detected on chromosome 22 at 27-29 cM in each sample, with a LOD score of 5.98 at 26.96 cM in the combined sample. After additional markers were used to localize the signal, the LOD score was 5.21 at 25.46 cM. To assess the statistical significance of the LOD score in the combined sample, 1,000 simulated genomewide screens were conducted, resulting in an empirical P value of .006 for the LOD score of 5.21. This linkage signal is driven mainly by the microsatellite marker D22S315 (22.59 cM), which had a single-point LOD score of 5.41 in the combined sample and an empirical P value <.001 from 1,000 simulated genomewide screens. This marker is located within an intron of the gene ADRBK2, encoding the beta-adrenergic receptor kinase 2. Fine mapping of this linkage region may reveal variants contributing to heaviness of smoking, which will lead to a better understanding of the genetic mechanisms underlying nicotine dependence. [Abstract/Link to Full Text]

Hustad S, Midttun Ø, Schneede J, Vollset SE, Grotmol T, Ueland PM
The methylenetetrahydrofolate reductase 677C-->T polymorphism as a modulator of a B vitamin network with major effects on homocysteine metabolism.
Am J Hum Genet. 2007 May;80(5):846-55.
Folates are carriers of one-carbon units and are metabolized by 5,10-methylenetetrahydrofolate reductase (MTHFR) and other enzymes that use riboflavin, cobalamin, or vitamin B6 as cofactors. These B vitamins are essential for the remethylation and transsulfuration of homocysteine, which is an important intermediate in one-carbon metabolism. We studied the MTHFR 677C-->T polymorphism and B vitamins as modulators of one-carbon metabolism in 10,601 adults from the Norwegian Colorectal Cancer Prevention (NORCCAP) cohort, using plasma total homocysteine (tHcy) as the main outcome measure. Mean concentrations of plasma tHcy were 10.4 micromol/liter, 10.9 micromol/liter, and 13.3 micromol/liter in subjects with the CC (51%), CT (41%), and TT (8%) genotypes, respectively. The MTHFR 677C-->T polymorphism, folate, riboflavin, cobalamin, and vitamin B6 were independent predictors of tHcy in multivariate models (P<.001), and genotype effects were strongest when B vitamins were low (P≤.006). Conversely, the MTHFR polymorphism influenced B vitamin effects, which were strongest in the TT group, in which the estimated tHcy difference between subjects with vitamin concentrations in the lowest compared with the highest quartile was 5.4 micromol/liter for folate, 4.1 micromol/liter for riboflavin, 3.2 micromol/liter for cobalamin, and 2.1 micromol/liter for vitamin B6. Furthermore, interactions between B vitamins were observed, and B vitamins were more strongly related to plasma tHcy when concentrations of other B vitamins were low. The study provides comprehensive data on the MTHFR-B vitamin network, which has major effects on the transfer of one-carbon units. Individuals with the TT genotype were particularly sensitive to the status of several B vitamins and might be candidates for personalized nutritional recommendations. [Abstract/Link to Full Text]

Holder AM, Klaassens M, Tibboel D, de Klein A, Lee B, Scott DA
Genetic factors in congenital diaphragmatic hernia.
Am J Hum Genet. 2007 May;80(5):825-45.
Congenital diaphragmatic hernia (CDH) is a relatively common birth defect associated with high mortality and morbidity. Although the exact etiology of most cases of CDH remains unknown, there is a growing body of evidence that genetic factors play an important role in the development of CDH. In this review, we examine key findings that are likely to form the basis for future research in this field. Specific topics include a short overview of normal and abnormal diaphragm development, a discussion of syndromic forms of CDH, a detailed review of chromosomal regions recurrently altered in CDH, a description of the retinoid hypothesis of CDH, and evidence of the roles of specific genes in the development of CDH. [Abstract/Link to Full Text]

Lin DY, Huang BE
The use of inferred haplotypes in downstream analyses.
Am J Hum Genet. 2007 Mar;80(3):577-9. [Abstract/Link to Full Text]

Marchini J, Cutler D, Patterson N, Stephens M, Eskin E, Halperin E, Lin S, Qin ZS, Munro HM, Abecasis GR, Donnelly P
A comparison of phasing algorithms for trios and unrelated individuals.
Am J Hum Genet. 2006 Mar;78(3):437-50.
Knowledge of haplotype phase is valuable for many analysis methods in the study of disease, population, and evolutionary genetics. Considerable research effort has been devoted to the development of statistical and computational methods that infer haplotype phase from genotype data. Although a substantial number of such methods have been developed, they have focused principally on inference from unrelated individuals, and comparisons between methods have been rather limited. Here, we describe the extension of five leading algorithms for phase inference for handling father-mother-child trios. We performed a comprehensive assessment of the methods applied to both trios and to unrelated individuals, with a focus on genomic-scale problems, using both simulated data and data from the HapMap project. The most accurate algorithm was PHASE (v2.1). For this method, the percentages of genotypes whose phase was incorrectly inferred were 0.12%, 0.05%, and 0.16% for trios from simulated data, HapMap Centre d'Etude du Polymorphisme Humain (CEPH) trios, and HapMap Yoruban trios, respectively, and 5.2% and 5.9% for unrelated individuals in simulated data and the HapMap CEPH data, respectively. The other methods considered in this work had comparable but slightly worse error rates. The error rates for trios are similar to the levels of genotyping error and missing data expected. We thus conclude that all the methods considered will provide highly accurate estimates of haplotypes when applied to trio data sets. Running times differ substantially between methods. Although it is one of the slowest methods, PHASE (v2.1) was used to infer haplotypes for the 1 million-SNP HapMap data set. Finally, we evaluated methods of estimating the value of r^2 between a pair of SNPs and concluded that all methods estimated r^2 well when the estimated value was ≥0.8. [Abstract/Link to Full Text]
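
For readers unfamiliar with the r^2 statistic mentioned at the end of this abstract, the following is a minimal sketch of the standard textbook definition computed from phased haplotypes at two biallelic SNPs. It is not the estimation procedure evaluated in the article; the function name `r_squared` and the 0/1 allele coding are assumptions made for illustration.

```python
def r_squared(haplotypes):
    """Squared allelic correlation between two biallelic SNPs.

    `haplotypes` is a list of (a, b) tuples, one per phased chromosome,
    where each allele is coded 0 or 1 (hypothetical coding).
    """
    n = len(haplotypes)
    p_a = sum(h[0] for h in haplotypes) / n                # frequency of allele 1 at SNP A
    p_b = sum(h[1] for h in haplotypes) / n                # frequency of allele 1 at SNP B
    p_ab = sum(1 for h in haplotypes if h == (1, 1)) / n   # frequency of the 1-1 haplotype
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    d = p_ab - p_a * p_b                                   # linkage disequilibrium coefficient D
    return d * d / denom


# Two SNPs whose alleles always travel together, and two that are independent.
linked = [(1, 1), (1, 1), (0, 0), (0, 0)]
independent = [(1, 1), (1, 0), (0, 1), (0, 0)]
print(r_squared(linked), r_squared(independent))
```

With unphased genotype data the haplotype frequencies themselves must first be inferred, which is where the phasing methods compared in the article come in.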

Kraft P, Stram DO
Re: the use of inferred haplotypes in downstream analysis.
Am J Hum Genet. 2007 Oct;81(4):863-5; author reply 865-6. [Abstract/Link to Full Text]

Rice G, Newman WG, Dean J, Patrick T, Parmar R, Flintoff K, Robins P, Harvey S, Hollis T, O'Hara A, Herrick AL, Bowden AP, Perrino FW, Lindahl T, Barnes DE, Crow YJ
Heterozygous mutations in TREX1 cause familial chilblain lupus and dominant Aicardi-Goutieres syndrome.
Am J Hum Genet. 2007 Apr;80(4):811-5.
TREX1 constitutes the major 3'-->5' DNA exonuclease activity measured in mammalian cells. Recently, biallelic mutations in TREX1 have been shown to cause Aicardi-Goutieres syndrome at the AGS1 locus. Interestingly, Aicardi-Goutieres syndrome shows overlap with systemic lupus erythematosus at both clinical and pathological levels. Here, we report a heterozygous TREX1 mutation causing familial chilblain lupus. Additionally, we describe a de novo heterozygous mutation, affecting a critical catalytic residue in TREX1, that results in typical Aicardi-Goutieres syndrome. [Abstract/Link to Full Text]

Hulsebos TJ, Plomp AS, Wolterman RA, Robanus-Maandag EC, Baas F, Wesseling P
Germline mutation of INI1/SMARCB1 in familial schwannomatosis.
Am J Hum Genet. 2007 Apr;80(4):805-10.
Patients with schwannomatosis develop multiple schwannomas but no vestibular schwannomas diagnostic of neurofibromatosis type 2. We report an inactivating germline mutation in exon 1 of the tumor-suppressor gene INI1 in a father and daughter who both had schwannomatosis. Inactivation of the wild-type INI1 allele, by a second mutation in exon 5 or by clear loss, was found in two of four investigated schwannomas from these patients. All four schwannomas displayed complete loss of nuclear INI1 protein expression in part of the cells. Although the exact oncogenetic mechanism in these schwannomas remains to be elucidated, our findings suggest that INI1 is the predisposing gene in familial schwannomatosis. [Abstract/Link to Full Text]

Hoskins BE, Cramer CH, Silvius D, Zou D, Raymond RM, Orten DJ, Kimberling WJ, Smith RJ, Weil D, Petit C, Otto EA, Xu PX, Hildebrandt F
Transcription factor SIX5 is mutated in patients with branchio-oto-renal syndrome.
Am J Hum Genet. 2007 Apr;80(4):800-4.
Branchio-oto-renal syndrome (BOR) is an autosomal dominant developmental disorder characterized by the association of branchial arch defects, hearing loss, and renal anomalies. Mutations in EYA1 are known to cause BOR. More recently, mutations in SIX1, which interacts with EYA1, were identified as an additional cause of BOR. A second member of the SIX family of proteins, unc-39 (SIX5), has also been reported to directly interact with eya-1 in Caenorhabditis elegans. We hypothesized that this interaction would be conserved in humans and that interactors of EYA1 represent good candidate genes for BOR. We therefore screened a cohort of 95 patients with BOR for mutations in SIX5. Four different heterozygous missense mutations were identified in five individuals. Functional analyses of these mutations demonstrated that two mutations affect EYA1-SIX5 binding and the ability of SIX5 or the EYA1-SIX5 complex to activate gene transcription. We thereby identified heterozygous mutations in SIX5 as a novel cause of BOR. [Abstract/Link to Full Text]

Leach NT, Sun Y, Michaud S, Zheng Y, Ligon KL, Ligon AH, Sander T, Korf BR, Lu W, Harris DJ, Gusella JF, Maas RL, Quade BJ, Cole AJ, Kelz MB, Morton CC
Disruption of diacylglycerol kinase delta (DGKD) associated with seizures in humans and mice.
Am J Hum Genet. 2007 Apr;80(4):792-9.
We report a female patient with a de novo balanced translocation, 46,X,t(X;2)(p11.2;q37)dn, who exhibits seizures, capillary abnormality, developmental delay, infantile hypotonia, and obesity. The 2q37 breakpoint observed in association with the seizure phenotype is of particular interest, because it lies near loci implicated in epilepsy in humans and mice. Fluorescence in situ hybridization mapping of the translocation breakpoints showed that no known genes are disrupted at Xp11.2, whereas diacylglycerol kinase delta (DGKD) is disrupted at 2q37. Expression studies in Drosophila and mouse suggest that DGKD is involved in central nervous system development and function. Electroencephalographic assessment of Dgkd mutant mice revealed abnormal epileptic discharges and electrographic seizures in three of six homozygotes. These findings implicate DGKD disruption by the t(X;2)(p11.2;q37)dn in the observed phenotype and support a more general role for DGKD in the etiology of seizures. [Abstract/Link to Full Text]

Ahituv N, Kavaslar N, Schackwitz W, Ustaszewska A, Martin J, Hebert S, Doelle H, Ersoy B, Kryukov G, Schmidt S, Yosef N, Ruppin E, Sharan R, Vaisse C, Sunyaev S, Dent R, Cohen J, McPherson R, Pennacchio LA
Medical sequencing at the extremes of human body mass.
Am J Hum Genet. 2007 Apr;80(4):779-91.
Body weight is a quantitative trait with significant heritability in humans. To identify potential genetic contributors to this phenotype, we resequenced the coding exons and splice junctions of 58 genes in 379 obese and 378 lean individuals. Our 96-Mb survey included 21 genes associated with monogenic forms of obesity in humans or mice, as well as 37 genes that function in body weight-related pathways. We found that the monogenic obesity-associated gene group was enriched for rare nonsynonymous variants unique to the obese population compared with the lean population. In addition, computational analysis predicted a greater fraction of deleterious variants within the obese cohort. Together, these data suggest that multiple rare alleles contribute to obesity in the population and provide a medical sequencing-based approach to detect them. [Abstract/Link to Full Text]

Melquist S, Craig DW, Huentelman MJ, Crook R, Pearson JV, Baker M, Zismann VL, Gass J, Adamson J, Szelinger S, Corneveaux J, Cannon A, Coon KD, Lincoln S, Adler C, Tuite P, Calne DB, Bigio EH, Uitti RJ, Wszolek ZK, Golbe LI, Caselli RJ, Graff-Radford N, Litvan I, Farrer MJ, Dickson DW, Hutton M, Stephan DA
Identification of a novel risk locus for progressive supranuclear palsy by a pooled genomewide scan of 500,288 single-nucleotide polymorphisms.
Am J Hum Genet. 2007 Apr;80(4):769-78.
To date, only the H1 MAPT haplotype has been consistently associated with risk of developing the neurodegenerative disease progressive supranuclear palsy (PSP). We hypothesized that additional genetic loci may be involved in conferring risk of PSP that could be identified through a pooling-based genomewide association study of >500,000 SNPs. Candidate SNPs with large differences in allelic frequency were identified by ranking all SNPs by their probe-intensity difference between cohorts. The MAPT H1 haplotype was strongly detected by this methodology, as was a second major locus on chromosome 11p12-p11 that showed evidence of association at allelic (P<.001), genotypic (P<.001), and haplotypic (P<.001) levels and was narrowed to a single haplotype block containing the DNA damage-binding protein 2 (DDB2) and lysosomal acid phosphatase 2 (ACP2) genes. Since DNA damage and lysosomal dysfunction have been implicated in aging and neurodegenerative processes, both genes are viable candidates for conferring risk of disease. [Abstract/Link to Full Text]

Achilli A, Olivieri A, Pala M, Metspalu E, Fornarino S, Battaglia V, Accetturo M, Kutuev I, Khusnutdinova E, Pennarun E, Cerutti N, Di Gaetano C, Crobu F, Palli D, Matullo G, Santachiara-Benerecetti AS, Cavalli-Sforza LL, Semino O, Villems R, Bandelt HJ, Piazza A, Torroni A
Mitochondrial DNA variation of modern Tuscans supports the near eastern origin of Etruscans.
Am J Hum Genet. 2007 Apr;80(4):759-68.
The origin of the Etruscan people has been a source of major controversy for the past 2,500 years, and several hypotheses have been proposed to explain their language and sophisticated culture, including an Aegean/Anatolian origin. To address this issue, we analyzed the mitochondrial DNA (mtDNA) of 322 subjects from three well-defined areas of Tuscany and compared their sequence variation with that of 55 western Eurasian populations. Interpopulation comparisons reveal that the modern population of Murlo, a small town of Etruscan origin, is characterized by an unusually high frequency (17.5%) of Near Eastern mtDNA haplogroups. Each of these haplogroups is represented by different haplotypes, thus dismissing the possibility that the genetic allocation of the Murlo people is due to drift. Other Tuscan populations do not show the same striking feature; however, overall, ~5% of mtDNA haplotypes in Tuscany are shared exclusively between Tuscans and Near Easterners and occupy terminal positions in the phylogeny. These findings support a direct and rather recent genetic input from the Near East--a scenario in agreement with the Lydian origin of Etruscans. Such a genetic contribution has been extensively diluted by admixture, but it appears that there are still locations in Tuscany, such as Murlo, where traces of its arrival are easily detectable. [Abstract/Link to Full Text]

Gargiulo A, Auricchio R, Barone MV, Cotugno G, Reardon W, Milla PJ, Ballabio A, Ciccodicola A, Auricchio A
Filamin A is mutated in X-linked chronic idiopathic intestinal pseudo-obstruction with central nervous system involvement.
Am J Hum Genet. 2007 Apr;80(4):751-8.
We have previously reported that an X-linked recessive form of chronic idiopathic intestinal pseudo-obstruction (CIIPX) maps to Xq28. To select candidate genes for the disease, we analyzed the expression in murine fetal brain and intestine of 56 genes from the critical region. We selected and sequenced seven genes and found that one affected male from a large CIIPX-affected kindred bears a 2-bp deletion in exon 2 of the FLNA gene that is present at the heterozygous state in the carrier females of the family. The frameshift mutation is located between two close methionines at the filamin N terminus and is predicted to produce a protein truncated shortly after the first predicted methionine. Loss-of-function FLNA mutations have been associated with X-linked dominant periventricular nodular heterotopia (PVNH), a central nervous system (CNS) migration defect that presents with seizures in females and lethality in males. Notably, the affected male bearing the FLNA deletion had signs of CNS involvement and potentially has PVNH. To understand how the severe frameshift mutation we found can explain the CIIPX phenotype and its X-linked recessive inheritance, we transiently expressed both the wild-type and mutant filamin in cell culture and found that filamin translation can start from either of the two initial methionines in these conditions. Therefore, translation of a normal shorter filamin can occur in vitro from the second methionine downstream of the 2-bp deletion we found. We confirmed this, demonstrating that the filamin protein is present in the patient's lymphoblastoid cell line that shows abnormal cytoskeletal actin organization compared with normal lymphoblasts. We conclude that the filamin N terminal region between the initial two methionines is crucial for proper enteric neuron development. [Abstract/Link to Full Text]

Eeds AM, Mortlock D, Wade-Martins R, Summar ML
Assessing the functional characteristics of synonymous and nonsynonymous mutation candidates by use of large DNA constructs.
Am J Hum Genet. 2007 Apr;80(4):740-50.
As we identify more and more genetic changes, either through mutation studies or population screens, we need powerful tools to study their potential molecular effects. With these tools, we can begin to understand the contributions of genetic variations to the wide range of human phenotypes. We used our catalogue of molecular changes in patients with carbamyl phosphate synthetase I (CPSI) deficiency to develop such a system for use in eukaryotic cells. We developed the tools and methods for rapidly modifying bacterial artificial chromosomes (BACs) for eukaryotic episomal replication, marker expression, and selection and then applied this protocol to a BAC containing the entire CPSI gene. Although this CPSI BAC construct was suitable for studying nonsynonymous mutations, potential splicing defects, and promoter variations, our focus was on studying potential splicing and RNA-processing defects to validate this system. In this article, we describe the construction of this system and subsequently examine the mechanism of four putative splicing mutations in patients deficient in CPSI. Using this model, we also demonstrate the reversible role of nonsense-mediated decay in all four mutations, using small interfering RNA knockdown of hUPF2. Furthermore, we were able to locate cryptic splicing sites for the two intronic mutations. This BAC-based system permits expression studies in the absence of patient RNA or tissues with relevant gene expression and provides experimental flexibility not available in genomic DNA or plasmid constructs. Our splicing and RNA degradation data demonstrate the advantages of using whole-gene constructs to study the effects of sequence variation on gene expression and function. [Abstract/Link to Full Text]

Kryukov GV, Pennacchio LA, Sunyaev SR
Most rare missense alleles are deleterious in humans: implications for complex disease and association studies.
Am J Hum Genet. 2007 Apr;80(4):727-39.
The accumulation of mildly deleterious missense mutations in individual human genomes has been proposed to be a genetic basis for complex diseases. The plausibility of this hypothesis depends on quantitative estimates of the prevalence of mildly deleterious de novo mutations and polymorphic variants in humans and on the intensity of selective pressure against them. We combined analysis of mutations causing human Mendelian diseases, of human-chimpanzee divergence, and of systematic data on human genetic variation and found that ~20% of new missense mutations in humans result in a loss of function, whereas ~27% are effectively neutral. Thus, the remaining 53% of new missense mutations have mildly deleterious effects. These mutations give rise to many low-frequency deleterious allelic variants in the human population, as is evident from a new data set of 37 genes sequenced in >1,500 individual human chromosomes. Surprisingly, up to 70% of low-frequency missense alleles are mildly deleterious and are associated with a heterozygous fitness loss in the range 0.001-0.003. Thus, the low allele frequency of an amino acid variant can, by itself, serve as a predictor of its functional significance. Several recent studies have reported a significant excess of rare missense variants in candidate genes or pathways in individuals with extreme values of quantitative phenotypes. These studies would be unlikely to yield results if most rare variants were neutral or if rare variants were not a significant contributor to the genetic component of phenotypic inheritance. Our results provide a justification for these types of candidate-gene (pathway) association studies and imply that mutation-selection balance may be a feasible evolutionary mechanism underlying some common diseases. [Abstract/Link to Full Text]

Reich D, Patterson N, Ramesh V, De Jager PL, McDonald GJ, Tandon A, Choy E, Hu D, Tamraz B, Pawlikowska L, Wassel-Fyr C, Huntsman S, Waliszewska A, Rossin E, Li R, Garcia M, Reiner A, Ferrell R, Cummings S, Kwok PY, Harris T, Zmuda JM, Ziv E
Admixture mapping of an allele affecting interleukin 6 soluble receptor and interleukin 6 levels.
Am J Hum Genet. 2007 Apr;80(4):716-26.
Circulating levels of inflammatory markers can predict cardiovascular disease risk. To identify genes influencing the levels of these markers, we genotyped 1,343 single-nucleotide polymorphisms (SNPs) in 1,184 African Americans from the Health, Aging and Body Composition (Health ABC) Study. Using admixture mapping, we found a significant association of interleukin 6 soluble receptor (IL-6 SR) with European ancestry on chromosome 1 (LOD 4.59), in a region that includes the gene for this receptor (IL-6R). Genotyping 19 SNPs showed that the effect is largely explained by an allele at 4% frequency in West Africans and at 35% frequency in European Americans, first described as associated with IL-6 SR in a Japanese cohort. We replicate this association (P<1.0x10^-12) and also demonstrate a new association with circulating levels of a different molecule, IL-6 (P<3.4x10^-5). After replication in 1,674 European Americans from Health ABC, the combined result is even more significant: P<1.0x10^-12 for IL-6 SR, and P<2.0x10^-9 for IL-6. These results also serve as an important proof of principle, showing that admixture mapping can not only coarsely localize but can also fine map a phenotypically important variant. [Abstract/Link to Full Text]

Li Y, Sung WK, Liu JJ
Association mapping via regularized regression analysis of single-nucleotide-polymorphism haplotypes in variable-sized sliding windows.
Am J Hum Genet. 2007 Apr;80(4):705-15.
Large-scale haplotype association analysis, especially at the whole-genome level, is still a very challenging task without an optimal solution. In this study, we propose a new approach for haplotype association analysis that is based on a variable-sized sliding-window framework and employs regularized regression analysis to tackle the problem of multiple degrees of freedom in the haplotype test. Our method can handle a large number of haplotypes in association analyses more efficiently and effectively than do currently available approaches. We implement a procedure in which the maximum size of a sliding window is determined by local haplotype diversity and sample size, an attractive feature for large-scale haplotype analyses, such as a whole-genome scan, in which linkage disequilibrium patterns are expected to vary widely. We compare the performance of our method with that of three other methods--a test based on a single-nucleotide polymorphism, a cladistic analysis of haplotypes, and variable-length Markov chains--with use of both simulated and experimental data. By analyzing data sets simulated under different disease models, we demonstrate that our method consistently outperforms the other three methods, especially when the region under study has high haplotype diversity. Built on the regression analysis framework, our method can incorporate other risk-factor information into haplotype-based association analysis, which is becoming an increasingly necessary step for studying common disorders to which both genetic and environmental risk factors contribute. [Abstract/Link to Full Text]

Chen CT, Wang JC, Cohen BA
The strength of selection on ultraconserved elements in the human genome.
Am J Hum Genet. 2007 Apr;80(4):692-704.
Ultraconserved elements are stretches of consecutive nucleotides that are perfectly conserved in multiple mammalian genomes. Although these sequences are identical in the reference human, mouse, and rat genomes, we identified numerous polymorphisms within these regions in the human population. To determine whether polymorphisms in ultraconserved elements affect fitness, we genotyped unrelated human DNA samples at loci within these sequences. For all single-nucleotide polymorphisms tested in ultraconserved regions, individuals homozygous for derived alleles (alleles that differ from the rodent reference genomes) were present, viable, and healthy. The distribution of allele frequencies in these samples argues against strong, ongoing selection as the force maintaining the conservation of these sequences. We then used two methods to determine the minimum level of selection required to generate these sequences. Despite the lack of fixed differences in these sequences between humans and rodents, the average level of selection on ultraconserved elements is less than that on essential genes. The strength of selection associated with ultraconserved elements suggests that mutations in these regions may have subtle phenotypic consequences that are not easily detected in the laboratory. [Abstract/Link to Full Text]

Zaitlen N, Kang HM, Eskin E, Halperin E
Leveraging the HapMap correlation structure in association studies.
Am J Hum Genet. 2007 Apr;80(4):683-91.
Recent high-throughput genotyping technologies, such as the Affymetrix 500k array and the Illumina HumanHap 550 beadchip, have driven down the costs of association studies and have enabled the measurement of single-nucleotide polymorphism (SNP) allele frequency differences between case and control populations on a genomewide scale. A key aspect in the efficiency of association studies is the notion of "indirect association," where only a subset of SNPs are collected to serve as proxies for the uncollected SNPs, taking advantage of the correlation structure between SNPs. Recently, a new class of methods for indirect association, multimarker methods, has been proposed. Although the multimarker methods are a considerable advancement, current methods do not fully take advantage of the correlation structure between SNPs and their multimarker proxies. In this article, we propose a novel multimarker indirect-association method, WHAP, that is based on a weighted sum of the haplotype frequency differences. In contrast to traditional indirect-association methods, we show analytically that there is a considerable gain in power achieved by our method compared with both single-marker and multimarker tests, as well as traditional haplotype-based tests. Our results are supported by empirical evaluation across the HapMap reference panel data sets, and a software implementation for the Affymetrix 500k and Illumina HumanHap 550 chips is available for download. [Abstract/Link to Full Text]

Pare G, Serre D, Brisson D, Anand SS, Montpetit A, Tremblay G, Engert JC, Hudson TJ, Gaudet D
Genetic analysis of 103 candidate genes for coronary artery disease and associated phenotypes in a founder population reveals a new association between endothelin-1 and high-density lipoprotein cholesterol.
Am J Hum Genet. 2007 Apr;80(4):673-82.
Coronary artery disease (CAD) is a major health concern in both developed and developing countries. With a heritability estimated at ~50%, there is a strong rationale to better define the genetic contribution to CAD. This project involves the analysis of 884 individuals from 142 families (with average sibships of 5.7) as well as 558 case and control subjects from the Saguenay Lac St-Jean region of northeastern Quebec, with the use of 1,536 single-nucleotide polymorphisms (SNPs) in 103 candidate genes for CAD. By use of clusters of SNPs to generate multiallelic haplotypes at candidate loci for segregation studies within families, suggestive linkage for high-density lipoprotein (HDL) cholesterol is observed on chromosome 1p36.22. Furthermore, several associations that remain significant after Bonferroni correction are observed with lipoprotein-related traits as well as plasma concentrations of adiponectin. Of note, HDL cholesterol levels are associated with an amino acid substitution (lysine/asparagine) at codon 198 (rs5370) of endothelin-1 (EDN1) in a sex-specific manner, as well as with a SNP (rs2292318) located 7.7 kb upstream of lecithin cholesterol acyl-transferase (LCAT). Whereas the other observed associations are described in the current literature, these two are new. Using an independent validation sample of 806 individuals, we confirm the EDN1 association (P<.005), whereas the LCAT association was nonsignificant (P=.12). [Abstract/Link to Full Text]

Choudhury K, McQuillin A, Puri V, Pimm J, Datta S, Thirumalai S, Krasucki R, Lawrence J, Bass NJ, Quested D, Crombie C, Fraser G, Walker N, Nadeem H, Johnson S, Curtis D, St Clair D, Gurling HM
A genetic association study of chromosome 11q22-24 in two different samples implicates the FXYD6 gene, encoding phosphohippolin, in susceptibility to schizophrenia.
Am J Hum Genet. 2007 Apr;80(4):664-72.
Previous linkage analyses of families with multiple cases of schizophrenia by us and others have confirmed the involvement of the chromosome 11q22-24 region in the etiology of schizophrenia, with LOD scores of 3.4 and 3.1. We now report fine mapping of a susceptibility gene in the 11q22-24 region, determined on the basis of a University College London (UCL) sample of 496 cases and 488 supernormal controls. Confirmation was then performed by the study of an Aberdeen sample consisting of 858 cases and 591 controls (for a total of 2,433 individuals: 1,354 with schizophrenia and 1,079 controls). Seven microsatellite or single-nucleotide polymorphism (SNP) markers localized within or near the FXYD6 gene showed empirically significant allelic associations with schizophrenia in the UCL sample (for D11S1998, P=.021; for rs3168238, P=.009; for TTTC20.2, P=.048; for rs1815774, P=.049; for rs4938445, P=.010; for rs4938446, P=.025; for rs497768, P=.023). Several haplotypes were also found to be associated with schizophrenia; for example, haplotype Hap-F21 comprising markers rs10790212-rs4938445-rs497768 was found to be associated with schizophrenia, by a global permutation test (P=.002). Positive markers in the UCL sample were then genotyped in the Aberdeen sample. Two of these SNPs were found to be associated with schizophrenia in the Scottish sample (for rs4938445, P=.044; for rs497768, P=.037). The Hap-F21 haplotype also showed significant association with schizophrenia in the Aberdeen sample, with the same alleles being associated (P=.013). The FXYD6 gene encodes a protein called "phosphohippolin" that is highly expressed in regions of the brain thought to be involved in schizophrenia. The protein functions by modulating the kinetic properties of Na,K-ATPase to the specific physiological requirements of the tissue. Etiological base-pair changes in FXYD6 or in associated promoter/control regions are likely to cause abnormal function or expression of phosphohippolin and to increase genetic susceptibility to schizophrenia. [Abstract/Link to Full Text]

Wang L, Hauser ER, Shah SH, Pericak-Vance MA, Haynes C, Crosslin D, Harris M, Nelson S, Hale AB, Granger CB, Haines JL, Jones CJ, Crossman D, Seo D, Gregory SG, Kraus WE, Goldschmidt-Clermont PJ, Vance JM
Peakwide mapping on chromosome 3q13 identifies the kalirin gene as a novel candidate gene for coronary artery disease.
Am J Hum Genet. 2007 Apr;80(4):650-63.
A susceptibility locus for coronary artery disease (CAD) has been mapped to chromosome 3q13-21 in a linkage study of early-onset CAD. We completed an association-mapping study across the 1-LOD-unit-down supporting interval, using two independent white case-control data sets (CATHGEN, initial and validation) to evaluate association under the peak. Single-nucleotide polymorphisms (SNPs) evenly spaced at 100-kb intervals were screened in the initial data set (N=468). Promising SNPs (P<.1) were then examined in the validation data set (N=514). Significant findings (P<.05) in the combined initial and validation data sets were further evaluated in multiple independent data sets, including a family-based data set (N=2,954), an African American case-control data set (N=190), and an additional white control data set (N=255). The association between genotype and aortic atherosclerosis was examined in 145 human aortas. The peakwide survey found evidence of association in SNPs from multiple genes. The strongest associations were found in three SNPs from the kalirin (KALRN) gene, especially in patients with early-onset CAD (P=.00001-.00028 in the combined CATHGEN data sets). In-depth investigation of the gene found that an intronic SNP, rs9289231, was associated with early-onset CAD in all white data sets examined (P<.05). In the joint analysis of all white early-onset CAD cases (N=332) and controls (N=546), rs9289231 was highly significant (P=.00008), with an odds-ratio estimate of 2.1. Furthermore, the risk allele of this SNP was associated with atherosclerosis burden (P=.03) in 145 human aortas. KALRN is a protein with many functions, including the inhibition of inducible nitric oxide synthase and guanine-exchange-factor activity. KALRN and two other associated genes identified in this study (CDGAP and MYLK) belong to the Rho GTPase-signaling pathway. Our data suggest the importance of the KALRN gene and the Rho GTPase-signaling pathway in the pathogenesis of CAD. [Abstract/Link to Full Text]

Potocki L, Bi W, Treadwell-Deering D, Carvalho CM, Eifert A, Friedman EM, Glaze D, Krull K, Lee JA, Lewis RA, Mendoza-Londono R, Robbins-Furman P, Shaw C, Shi X, Weissenberger G, Withers M, Yatsenko SA, Zackai EH, Stankiewicz P, Lupski JR
Characterization of Potocki-Lupski syndrome (dup(17)(p11.2p11.2)) and delineation of a dosage-sensitive critical interval that can convey an autism phenotype.
Am J Hum Genet. 2007 Apr;80(4):633-49.
The duplication 17p11.2 syndrome, associated with dup(17)(p11.2p11.2), is a recently recognized syndrome of multiple congenital anomalies and mental retardation and is the first predicted reciprocal microduplication syndrome described--the homologous recombination reciprocal of the Smith-Magenis syndrome (SMS) microdeletion (del(17)(p11.2p11.2)). We previously described seven subjects with dup(17)(p11.2p11.2) and noted their relatively mild phenotype compared with that of individuals with SMS. Here, we molecularly analyzed 28 additional patients, using multiple independent assays, and also report the phenotypic characteristics obtained from extensive multidisciplinary clinical study of a subset of these patients. Whereas the majority of subjects (22 of 35) harbor the homologous recombination reciprocal product of the common SMS microdeletion (~3.7 Mb), 13 subjects (~37%) have nonrecurrent duplications ranging in size from 1.3 to 15.2 Mb. Molecular studies suggest potential mechanistic differences between nonrecurrent duplications and nonrecurrent genomic deletions. Clinical features observed in patients with the common dup(17)(p11.2p11.2) are distinct from those seen with SMS and include infantile hypotonia, failure to thrive, mental retardation, autistic features, sleep apnea, and structural cardiovascular anomalies. We narrow the critical region to a 1.3-Mb genomic interval that contains the dosage-sensitive RAI1 gene. Our results refine the critical region for Potocki-Lupski syndrome, provide information to assist in clinical diagnosis and management, and lend further support for the concept that genomic architecture incites genomic instability. [Abstract/Link to Full Text]

Lu W, van Eerde AM, Fan X, Quintero-Rivera F, Kulkarni S, Ferguson H, Kim HG, Fan Y, Xi Q, Li QG, Sanlaville D, Andrews W, Sundaresan V, Bi W, Yan J, Giltay JC, Wijmenga C, de Jong TP, Feather SA, Woolf AS, Rao Y, Lupski JR, Eccles MR, Quade BJ, Gusella JF, Morton CC, Maas RL
Disruption of ROBO2 is associated with urinary tract anomalies and confers risk of vesicoureteral reflux.
Am J Hum Genet. 2007 Apr;80(4):616-32.
Congenital anomalies of the kidney and urinary tract (CAKUT) include vesicoureteral reflux (VUR). VUR is a complex, genetically heterogeneous developmental disorder characterized by the retrograde flow of urine from the bladder into the ureter and is associated with reflux nephropathy, the cause of 15% of end-stage renal disease in children and young adults. We investigated a man with a de novo translocation, 46,X,t(Y;3)(p11;p12)dn, who exhibits multiple congenital abnormalities, including severe bilateral VUR with ureterovesical junction defects. This translocation disrupts ROBO2, which encodes a transmembrane receptor for SLIT ligand, and produces dominant-negative ROBO2 proteins that abrogate SLIT-ROBO signaling in vitro. In addition, we identified two novel ROBO2 intracellular missense variants that segregate with CAKUT and VUR in two unrelated families. Adult heterozygous and mosaic mutant mice with reduced Robo2 gene dosage also exhibit striking CAKUT-VUR phenotypes. Collectively, these results implicate the SLIT-ROBO signaling pathway in the pathogenesis of a subset of human VUR. [Abstract/Link to Full Text]

Zollner S, Pritchard JK
Overcoming the winner's curse: estimating penetrance parameters from case-control data.
Am J Hum Genet. 2007 Apr;80(4):605-15.
Genomewide association studies are now a widely used approach in the search for loci that affect complex traits. After detection of significant association, estimates of penetrance and allele-frequency parameters for the associated variant indicate the importance of that variant and facilitate the planning of replication studies. However, when these estimates are based on the original data used to detect the variant, the results are affected by an ascertainment bias known as the "winner's curse." The actual genetic effect is typically smaller than its estimate. This overestimation of the genetic effect may cause replication studies to fail because the necessary sample size is underestimated. Here, we present an approach that corrects for the ascertainment bias and generates an estimate of the frequency of a variant and its penetrance parameters. The method produces a point estimate and confidence region for the parameter estimates. We study the performance of this method using simulated data sets and show that it is possible to greatly reduce the bias in the parameter estimates, even when the original association study had low power. The uncertainty of the estimate decreases with increasing sample size, independent of the power of the original test for association. Finally, we show that application of the method to case-control data can improve the design of replication studies considerably. [Abstract/Link to Full Text]
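
The ascertainment bias described in this abstract can be illustrated with a small simulation. The sketch below is not the authors' correction method; it only shows, under assumed values (a true effect of 0.1, 200 observations with unit variance, a one-sided z threshold of 1.96), that effect estimates kept only when they reach significance are larger on average than the true effect. All names and parameter values are hypothetical.

```python
import random
import statistics


def simulate_winners_curse(true_effect=0.1, n=200, trials=5000,
                           z_threshold=1.96, seed=0):
    """Compare the mean estimate over all studies with the mean over
    only those studies that reach significance (illustrative values)."""
    rng = random.Random(seed)
    # Standard error of the mean of n unit-variance observations.
    se = 1.0 / n ** 0.5
    all_estimates = []
    significant = []
    for _ in range(trials):
        # The sample mean is normally distributed around the true effect,
        # so each simulated study's estimate is drawn directly.
        estimate = rng.gauss(true_effect, se)
        all_estimates.append(estimate)
        if estimate / se > z_threshold:
            significant.append(estimate)
    mean_all = statistics.mean(all_estimates)
    mean_significant = statistics.mean(significant)
    power = len(significant) / trials
    return mean_all, mean_significant, power


mean_all, mean_significant, power = simulate_winners_curse()
print(f"mean of all estimates:         {mean_all:.3f}")
print(f"mean of significant estimates: {mean_significant:.3f}")
print(f"fraction significant (power):  {power:.2f}")
```

With these settings the mean over all simulated studies stays close to the true effect, while the mean over the significant subset must exceed the significance cutoff (1.96 standard errors) and therefore overstates it. This is why a replication study sized from the original estimate can be underpowered.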

McKusick VA
Mendelian Inheritance in Man and its online version, OMIM.
Am J Hum Genet. 2007 Apr;80(4):588-604. [Abstract/Link to Full Text]

Loughlin J, Meulenbelt I, Min J, Mustafa Z, Sinsheimer JS, Carr A, Slagboom PE
Genetic association analysis of RHOB and TXNDC3 in osteoarthritis.
Am J Hum Genet. 2007 Feb;80(2):383-6; author reply 386-7. [Abstract/Link to Full Text]

Mahr S, Burmester GR, Hilke D, Göbel U, Grützkau A, Häupl T, Hauschild M, Koczan D, Krenn V, Neidel J, Perka C, Radbruch A, Thiesen HJ, Müller B
Cis- and trans-acting gene regulation is associated with osteoarthritis.
Am J Hum Genet. 2006 May;78(5):793-803.
Osteoarthritis (OA) is a complex disease of the skeleton and is associated with aging. Both environmental and genetic factors contribute to its pathogenesis. We set out to identify novel genes associated with OA, concentrating on regulatory polymorphisms allowing for differential expression. Our strategy to identify differentially expressed genes included an initial transcriptome analysis of the peripheral blood mononuclear cells of six patients with OA and six age-matched healthy controls. These were screened for allelic expression imbalances and potentially regulatory single-nucleotide polymorphisms (SNPs) in the 5' regions of the genes. To establish disease association, disparate promoter SNP distributions correlating with the differential expression were tested on larger cohorts. Our approach yielded 26 candidate genes differentially expressed between patients and controls. Whereas BLP2 and CIAS1 seem to be trans-regulated, as the absence of allelic expression imbalances suggests, the presence of allelic imbalances confirms cis-regulatory mechanisms for RHOB and TXNDC3. Interestingly, on/off-switching suggests additional trans-regulation for TXNDC3. Moreover, we demonstrate for RHOB and TXNDC3 statistically significant associations between 5' SNPs and the disease that hint at regulatory functions. Investigating the respective genes functionally will not only shed light on the disease association but will also add to the understanding of the pathogenic processes involved in OA and may point out novel therapeutic approaches. [Abstract/Link to Full Text]

Elson JL, Majamaa K, Howell N, Chinnery PF
Associating mitochondrial DNA variation with complex traits.
Am J Hum Genet. 2007 Feb;80(2):378-82; author reply 382-3. [Abstract/Link to Full Text]

Saxena R, de Bakker PI, Singer K, Mootha V, Burtt N, Hirschhorn JN, Gaudet D, Isomaa B, Daly MJ, Groop L, Ardlie KG, Altshuler D
Comprehensive association testing of common mitochondrial DNA variation in metabolic disease.
Am J Hum Genet. 2006 Jul;79(1):54-61.
Many lines of evidence implicate mitochondria in phenotypic variation: (a) rare mutations in mitochondrial proteins cause metabolic, neurological, and muscular disorders; (b) alterations in oxidative phosphorylation are characteristic of type 2 diabetes, Parkinson disease, Huntington disease, and other diseases; and (c) common missense variants in the mitochondrial genome (mtDNA) have been implicated as having been subject to natural selection for adaptation to cold climates and contributing to "energy deficiency" diseases today. To test the hypothesis that common mtDNA variation influences human physiology and disease, we identified all 144 variants with frequency >1% in Europeans from >900 publicly available European mtDNA sequences and selected 64 tagging single-nucleotide polymorphisms that efficiently capture all common variation (except the hypervariable D-loop). Next, we evaluated the complete set of common mtDNA variants for association with type 2 diabetes in a sample of 3,304 diabetics and 3,304 matched nondiabetic individuals. Association of mtDNA variants with other metabolic traits (body mass index, measures of insulin secretion and action, blood pressure, and cholesterol) was also tested in subsets of this sample. We did not find a significant association of common mtDNA variants with these metabolic phenotypes. Moreover, we failed to identify any physiological effect of alleles that were previously proposed to have been adaptive for energy metabolism in human evolution. More generally, this comprehensive association-testing framework can readily be applied to other diseases for which mitochondrial dysfunction has been implicated. [Abstract/Link to Full Text]

Genome scan for Tourette disorder in affected-sibling-pair and multigenerational families.
Am J Hum Genet. 2007 Feb;80(2):265-72.
Tourette disorder (TD) is a neuropsychiatric disorder with a complex mode of inheritance and is characterized by multiple waxing and waning motor and phonic tics. This article reports the results of the largest genetic linkage study yet undertaken for TD. The sample analyzed includes 238 nuclear families yielding 304 "independent" sibling pairs and 18 separate multigenerational families, for a total of 2,040 individuals. A whole-genome screen with the use of 390 microsatellite markers was completed. Analyses were completed using two diagnostic classifications: (1) only individuals with TD were included as affected and (2) individuals with either TD or chronic-tic (CT) disorder were included as affected. Strong evidence of linkage was observed for a region on chromosome 2p (-log P = 4.42, P = 3.8 x 10(-5)) in the analyses that included individuals with TD or CT disorder as affected. Results in several other regions also provide moderate evidence (-log P >2.0) of additional susceptibility loci for TD. [Abstract/Link to Full Text]
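
The two linkage statistics quoted for chromosome 2p are the same quantity on different scales. A minimal check, assuming the logarithm is base 10 (the abstract does not state the base; the function name is hypothetical):

```python
import math


def neg_log10_p(p_value):
    """Return -log10 of a P value (base 10 is an assumption here)."""
    return -math.log10(p_value)


# P value reported in the abstract for the chromosome 2p region.
print(round(neg_log10_p(3.8e-5), 2))  # 4.42
```

On the same scale, the "moderate evidence" cutoff of -log P > 2.0 corresponds to P < 0.01.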

Recent Articles in Genetics

Issue highlights.
Genetics. 2007 Sep;177(1):NP. [Abstract/Link to Full Text]

Cao Y, Ding X, Cai M, Zhao J, Lin Y, Li X, Xu C, Wang S
The expression pattern of a rice disease resistance gene xa3/xa26 is differentially regulated by the genetic backgrounds and developmental stages that influence its function.
Genetics. 2007 Sep;177(1):523-33.
Genetic background and developmental stage influence the function of some disease resistance (R) genes. The molecular mechanisms of these modifications remain elusive. Our results show that the two factors are associated with the expression of the R gene in rice Xa3 (also known as Xa26)-mediated resistance to Xanthomonas oryzae pv. oryzae (Xoo), which in turn influences the expression of defense-responsive genes. The background of japonica rice, one of the two major subspecies of Asian cultivated rice, facilitates the function of Xa3 more than the background of indica rice, another rice subspecies. Xa3 expression gradually increases from early seedling stage to adult stage. Japonica plants carrying Xa3 regulated by the native promoter showed an enlarged resistance spectrum (i.e., resistance to more Xoo races), an increased resistance level (i.e., further reduced lesion length), and whole-growth-stage resistance compared to the indica rice; this enhanced resistance was associated with an increased expression of Xa3 throughout the growth stages in the japonica plants, which resulted in enhanced expression of defense-responsive genes. Overexpressing Xa3 with a constitutive strong promoter further enhanced rice resistance due to further increased Xa3 transcripts in both indica and japonica backgrounds, whereas regulating Xa3 with a pathogen-induced weak promoter impaired rice resistance. [Abstract/Link to Full Text]

Dolezelova E, Dolezel D, Hall JC
Rhythm defects caused by newly engineered null mutations in Drosophila's cryptochrome gene.
Genetics. 2007 Sep;177(1):329-45.
Much of the knowledge about cryptochrome function in Drosophila stems from analyzing the cryb mutant. Several features of this variant's light responsiveness imply either that CRYb retains circadian-photoreceptive capacities or that additional CRY-independent light-input routes subserve these processes. Potentially to resolve these issues, we generated cry knock-out mutants (cry0's) by gene replacement. They behaved in an anomalously rhythmic manner in constant light (LL). However, cry0 flies frequently exhibited two separate circadian components in LL, not observed in most previous cryb analyses. Temperature-dependent circadian phenotypes exhibited by cry0 flies suggest that CRY is involved in core pacemaking. Further locomotor experiments combined cry0 with an externally blinding mutation (norpAP24), which caused the most severe decrements of circadian photoreception observed so far. cryb cultures were shown previously to exhibit either aperiodic or rhythmic eclosion in separate studies. We found cry0 to eclose in a solidly periodic manner in light:dark cycles or constant darkness. Furthermore, both cry0 and cryb eclosed rhythmically in LL. These findings indicate that the novel cry0 type causes more profound defects than does the cryb mutation, implying that CRYb retains residual activity. Because some norpAP24 cry0 individuals can resynchronize to novel photic regimes, an as-yet undetermined light-input route exists in Drosophila. [Abstract/Link to Full Text]

Tang Z, Wang X, Hu Z, Yang Z, Xu C
Genetic dissection of cytonuclear epistasis in line crosses.
Genetics. 2007 Sep;177(1):669-72.
Dissection of cytonuclear interactions is fundamentally important for understanding the genetic architecture of complex traits. Here we propose a mating design based on reciprocal crosses and extend the existing QTL mapping method to evaluate the contribution of cytoplasm and QTL x cytoplasm interactions to the phenotypic variation. Efficiency of the design and method is demonstrated via simulated data. [Abstract/Link to Full Text]

Ryder E, Ashburner M, Bautista-Llacer R, Drummond J, Webster J, Johnson G, Morley T, Chan YS, Blows F, Coulson D, Reuter G, Baisch H, Apelt C, Kauk A, Rudolph T, Kube M, Klimm M, Nickel C, Szidonya J, Maróy P, Pal M, Rasmuson-Lestander A, Ekström K, Stocker H, Hugentobler C, Hafen E, Gubb D, Pflugfelder G, Dorner C, Mechler B, Schenkel H, Marhold J, Serras F, Corominas M, Punset A, Roote J, Russell S
The DrosDel deletion collection: a Drosophila genomewide chromosomal deficiency resource.
Genetics. 2007 Sep;177(1):615-29.
We describe a second-generation deficiency kit for Drosophila melanogaster composed of molecularly mapped deletions on an isogenic background, covering approximately 77% of the Release 5.1 genome. Using a previously reported collection of FRT-bearing P-element insertions, we have generated 655 new deletions and verified a set of 209 deletion-bearing fly stocks. In addition to deletions, we demonstrate how the P elements may also be used to generate a set of custom inversions and duplications, particularly useful for balancing difficult regions of the genome carrying haplo-insufficient loci. We describe a simple computational resource that facilitates selection of appropriate elements for generating custom deletions. Finally, we provide a computational resource that facilitates selection of other mapped FRT-bearing elements that, when combined with the DrosDel collection, can theoretically generate over half a million precisely mapped deletions. [Abstract/Link to Full Text]

Han Y, Gasic K, Korban SS
Multiple-copy cluster-type organization and evolution of genes encoding O-methyltransferases in the apple.
Genetics. 2007 Aug;176(4):2625-35.
Plant O-methyltransferases (OMTs) play important roles in secondary metabolism. Two clusters of genes coding for caffeic acid OMT (COMT) have been identified in the apple genome. Three genes from one cluster and two genes from another cluster were isolated. These five genes encoding COMT, designated Mdomt1-Mdomt5 (GenBank accession nos. DQ886018-DQ886022), were distinguished by a (CT)(n) microsatellite in the 5'-UTR and two transposon-like sequences present in the promoter region and intron 1, respectively. The transposon-like sequence in intron 1 unambiguously traced the five Mdomt genes in the apple to a common ancestor. The ancestor must have undergone an initial duplication generating two progenitors, and this was followed by further duplication of these progenitors resulting in the two clusters identified in this study. The distal regions of the transposon-like sequences in promoter regions of Mdomt genes are capable of forming palindromic hairpin-like structures. The hairpin formation is likely responsible for nucleotide sequence differences observed in the promoter regions of these genes as it plays a destabilizing role in eukaryotic chromosomes. In addition, the possible mechanism of amplification of Mdomt genes in the apple genome is also discussed. [Abstract/Link to Full Text]

Bailey MF, Delph LF
Sex-ratio evolution in nuclear-cytoplasmic gynodioecy when restoration is a threshold trait.
Genetics. 2007 Aug;176(4):2465-76.
Gynodioecious plant species, which have populations consisting of female and hermaphrodite individuals, usually have complex sex determination involving cytoplasmic male sterility (CMS) alleles interacting with nuclear restorers of fertility. In response to recent evidence, we present a model of sex-ratio evolution in which restoration of male fertility is a threshold trait. We find that females are maintained at low frequencies for all biologically relevant parameter values. Furthermore, this model predicts periodically high female frequencies (>50%) under conditions of lower female seed fecundity advantages (compensation, x = 5%) and pleiotropic fitness effects associated with restorers of fertility (costs of restoration, y = 20%) than in other models. This model explains the maintenance of females in species that have previously experienced invasions of CMS alleles and the evolution of multiple restorers. Sensitivity of the model to small changes in cost and compensation values and to initial conditions may explain why populations of the same species vary widely for sex ratio. [Abstract/Link to Full Text]

Zhang XX, Rainey PB
Genetic analysis of the histidine utilization (hut) genes in Pseudomonas fluorescens SBW25.
Genetics. 2007 Aug;176(4):2165-76.
The histidine utilization (hut) locus of Pseudomonas fluorescens SBW25 confers the ability to utilize histidine as a sole carbon and nitrogen source. Genetic analysis using a combination of site-directed mutagenesis and chromosomally integrated lacZ fusions showed the hut locus to be composed of 13 genes organized in 3 transcriptional units: hutF, hutCD, and 10 genes from hutU to hutG (which includes 2 copies of hutH, 1 of which is nonfunctional). Inactivation of hutF eliminated the ability to grow on histidine, indicating that SBW25 degrades histidine by the five-step enzymatic pathway. The 3 hut operons are negatively regulated by the HutC repressor with urocanate (the first intermediate of the histidine degradation pathway) as the physiological inducer. 5'-RACE analysis of transcriptional start sites revealed involvement of both sigma(54) (for the hutU-G operon) and sigma(70) (for hutF); the involvement of sigma(54) was experimentally demonstrated. CbrB (an enhancer binding protein for sigma(54) recruitment) was required for bacterial growth on histidine, indicating positive control of hut gene expression by CbrB. Recognition that a gene (named hutD) encoding a widely distributed conserved hypothetical protein is transcribed along with hutC led to analysis of its role. Mutational and gene fusion studies showed that HutD functions independently of HutC. Growth and fitness assays in laboratory media and on sugar beet seedlings suggest that HutD acts as a governor that sets an upper bound to the level of hut activity. [Abstract/Link to Full Text]

Sarin S, O'Meara MM, Flowers EB, Antonio C, Poole RJ, Didiano D, Johnston RJ, Chang S, Narula S, Hobert O
Genetic screens for Caenorhabditis elegans mutants defective in left/right asymmetric neuronal fate specification.
Genetics. 2007 Aug;176(4):2109-30.
We describe here the results of genetic screens for Caenorhabditis elegans mutants in which a single neuronal fate decision is inappropriately executed. In wild-type animals, the two morphologically bilaterally symmetric gustatory neurons ASE left (ASEL) and ASE right (ASER) undergo a left/right asymmetric diversification in cell fate, manifested by the differential expression of a class of putative chemoreceptors and neuropeptides. Using single cell-specific gfp reporters and screening through a total of almost 120,000 haploid genomes, we isolated 161 mutants that define at least six different classes of mutant phenotypes in which ASEL/R fate is disrupted. Each mutant phenotypic class encompasses one to nine different complementation groups. Besides many alleles of 10 previously described genes, we have identified at least 16 novel "lsy" genes ("laterally symmetric"). Among mutations in known genes, we retrieved four alleles of the miRNA lsy-6 and a gain-of-function mutation in the 3'-UTR of a target of lsy-6, the cog-1 homeobox gene. Using newly found temperature-sensitive alleles of cog-1, we determined that a bistable feedback loop controlling ASEL vs. ASER fate, of which cog-1 is a component, is only transiently required to initiate but not to maintain ASEL and ASER fate. Taken together, our mutant screens identified a broad catalog of genes whose molecular characterization is expected to provide more insight into the complex genetic architecture of a left/right asymmetric neuronal cell fate decision. [Abstract/Link to Full Text]

Gaytán de Ayala Alonso A, Gutiérrez L, Fritsch C, Papp B, Beuchle D, Müller J
A genetic screen identifies novel polycomb group genes in Drosophila.
Genetics. 2007 Aug;176(4):2099-108.
Polycomb group (PcG) genes encode evolutionarily conserved transcriptional repressors that are required for the long-term silencing of particular developmental control genes in animals and plants. PcG genes were first identified in Drosophila as regulators that keep HOX genes inactive in cells where these genes must remain silent during development. Here, we report the results of a genetic screen aimed at isolating novel PcG mutants in Drosophila. In an EMS mutagenesis, we isolated 82 mutants that show Polycomb-like phenotypes in clones in the adult epidermis and misexpression of the HOX gene Ubx in clones in the imaginal wing disc. Analysis of these mutants revealed that we isolated multiple new alleles in most of the already-known PcG genes. In addition, we isolated multiple mutant alleles in each of ten different genes that previously had not been known to function in PcG repression. We show that the newly identified PcG gene calypso is required for the long-term repression of multiple HOX genes in embryos and larvae. In addition, our studies reveal that the Kto/Med12 and Skd/Med13 subunits of the Med12.Med13.Cdk8.CycC repressor subcomplex of Mediator are needed for repression of the HOX gene Ubx. The results of the mutant screen reported here suggest that the majority of nonredundant Drosophila genes with strong classic PcG phenotypes have been identified. [Abstract/Link to Full Text]

Hegreness M, Meselson M
What did Sutton see? Thirty years of confusion over the chromosomal basis of Mendelism.
Genetics. 2007 Aug;176(4):1939-44. [Abstract/Link to Full Text]

Reynolds RM, Temiyasathit S, Reedy MM, Ruedi EA, Drnevich JM, Leips J, Hughes KA
Age specificity of inbreeding load in Drosophila melanogaster and implications for the evolution of late-life mortality plateaus.
Genetics. 2007 Sep;177(1):587-95.
Current evolutionary theories explain the origin of aging as a byproduct of the decline in the force of natural selection with age. These theories seem inconsistent with the well-documented occurrence of late-life mortality plateaus, since under traditional evolutionary models mortality rates should increase monotonically after sexual maturity. However, the equilibrium frequencies of deleterious alleles affecting late life are lower than predicted under traditional models, and thus evolutionary models can accommodate mortality plateaus if deleterious alleles are allowed to have effects spanning a range of neighboring age classes. Here we test the degree of age specificity of segregating alleles affecting fitness in Drosophila melanogaster. We assessed age specificity by measuring the homozygous fitness effects of segregating alleles across the adult life span and calculated genetic correlations of these effects across age classes. For both males and females, we found that allelic effects are age specific with effects extending over 1-2 weeks across all age classes, consistent with modified mutation-accumulation theory. These results indicate that a modified mutation-accumulation theory can both explain the origin of senescence and predict late-life mortality plateaus. [Abstract/Link to Full Text]

Skøt L, Humphreys J, Humphreys MO, Thorogood D, Gallagher J, Sanderson R, Armstead IP, Thomas ID
Association of candidate genes with flowering time and water-soluble carbohydrate content in Lolium perenne (L.).
Genetics. 2007 Sep;177(1):535-47.
We describe a candidate gene approach for associating SNPs with variation in flowering time and water-soluble carbohydrate (WSC) content and other quality traits in the temperate forage grass species Lolium perenne. Three analysis methods were used, which took the significant population structure into account. First, a linear mixed model was used enabling a structured association analysis to be incorporated with the nine populations identified in the structure analysis as random variables. Second, a within-population analysis of variance was performed. Third, a tree-scanning method was used, in which haplotype trees were associated with phenotypes on the basis of inferred haplotypes. Analysis of variance within populations identified several associations between WSC, nitrogen (N), and dry matter digestibility with allelic variants within an alkaline invertase candidate gene LpcAI. These associations were only detected in material harvested in one of the two years. By contrast, consistent associations between the L. perenne homolog (LpHD1) of the rice photoperiod control gene HD1 and flowering time were identified. One SNP, in the immediate upstream region of the LpHD1 coding sequence (C-4443-A), was significant in the linear mixed model. Within-population analysis of variance and tree-scanning analysis confirmed and extended this result to the 2118 polymorphisms in some of the populations. The merits of the tree-scanning method are compared to the single SNP analysis. The potential usefulness of the 4443 SNP in marker-assisted selection is currently being evaluated in test crosses of genotypes from this work with turf-grass varieties. [Abstract/Link to Full Text]

Smith JJ, Voss SR
Bird and mammal sex-chromosome orthologs map to the same autosomal region in a salamander (Ambystoma).
Genetics. 2007 Sep;177(1):607-13.
We tested hypotheses concerning the origin of bird and mammal sex chromosomes by mapping the location of amniote sex-chromosome loci in a salamander amphibian (Ambystoma). We found that ambystomatid orthologs of human X and chicken Z sex chromosomes map to neighboring regions of a common Ambystoma linkage group 2 (ALG2). We show statistically that the proportion of human X and chicken Z orthologs observed on ALG2 is significantly different from the proportion that would be expected by chance. We further show that conserved syntenies between ALG2 and amniote chromosomes are identified as overlapping conserved syntenies when all available chicken (N = 3120) and human (N = 14,922) RefSeq orthologs are reciprocally compared. In particular, the data suggest that chromosomal regions from chicken chromosomes (GGA) Z and 4 and from human chromosomes (HSA) 9, 4, X, 5, and 8 were linked ancestrally. A more distant outgroup comparison with the pufferfish Tetraodon nigroviridis reveals ALG2/GGAZ/HSAX syntenies among three pairs of ancestral chromosome duplicates. Overall, our results suggest that sex chromosomal regions of birds and mammals were recruited from a common ancestral chromosome, and thus our findings conflict with the currently accepted hypothesis of separate autosomal origins. We note that our results were obtained using the most immediate outgroup to the amniote clade (mammals, birds, and other reptiles) while the currently accepted hypothesis is primarily based upon conserved syntenies between in-group taxa (birds and mammals). Our study illustrates the importance of an amphibian outgroup perspective in identifying ancestral amniote gene orders and in reconstructing patterns of vertebrate sex-chromosome evolution. [Abstract/Link to Full Text]

Acevedo SF, Tsigkari KK, Grammenoudi S, Skoulakis EM
In vivo functional specificity and homeostasis of Drosophila 14-3-3 proteins.
Genetics. 2007 Sep;177(1):239-53.
The functional specialization or redundancy of the ubiquitous 14-3-3 proteins constitutes a fundamental question in their biology and stems from their highly conserved structure and multiplicity of coexpressed isotypes. We address this question in vivo using mutations in the two Drosophila 14-3-3 genes, leonardo (14-3-3zeta) and D14-3-3epsilon. We demonstrate that D14-3-3epsilon is essential for embryonic hatching. Nevertheless, D14-3-3epsilon null homozygotes survive because they upregulate transcripts encoding the LEOII isoform at the time of hatching, compensating for D14-3-3epsilon loss. This novel homeostatic response explains the reported functional redundancy of the Drosophila 14-3-3 isotypes and survival of D14-3-3epsilon mutants. The response appears unidirectional, as D14-3-3epsilon elevation upon LEO loss was not observed and elevation of leo transcripts was stage and tissue specific. In contrast, LEO levels are not changed in the wing disks, resulting in the aberrant wing veins characterizing D14-3-3epsilon mutants. Nevertheless, conditional overexpression of LEOI, but not of LEOII, in the wing disk can partially rescue the venation deficits. Thus, excess of a particular LEO isoform can functionally compensate for D14-3-3epsilon loss in a cellular-context-specific manner. These results demonstrate functional differences both among Drosophila 14-3-3 proteins and between the two LEO isoforms in vivo, which likely underlie differential dimer affinities toward 14-3-3 targets. [Abstract/Link to Full Text]

Fearnhead P
On the choice of genetic distance in spatial-genetic studies.
Genetics. 2007 Sep;177(1):427-34.
We look at how to choose genetic distance so as to maximize the power of detecting spatial structure. We answer this question through analyzing two population genetic models that allow for a spatially structured population in a continuous habitat. These models, like most that incorporate spatial structure, can be characterized by a separation of timescales: the history of the sample can be split into a scattering and a collecting phase, and it is only during the scattering phase that the spatial locations of the sample affect the coalescence times. Our results suggest that the optimal choice of genetic distance is based upon splitting a DNA sequence into segments and counting the number of segments at which two sequences differ. The size of these segments depends on the length of the scattering phase for the population genetic model. [Abstract/Link to Full Text]
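
The segment-based distance described in this abstract can be illustrated with a minimal Python sketch. The function name `segment_distance` and the parameter `segment_size` are hypothetical and are not taken from the paper. The abstract states only that the appropriate segment size depends on the length of the scattering phase, so here it is simply supplied by the caller.

```python
def segment_distance(seq_a: str, seq_b: str, segment_size: int) -> int:
    """Count the segments at which two aligned sequences differ.

    Illustrative only: segment_size is an assumed, caller-chosen value.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if segment_size < 1:
        raise ValueError("segment_size must be positive")

    differing = 0
    # Walk the alignment in consecutive, non-overlapping windows.
    for start in range(0, len(seq_a), segment_size):
        end = start + segment_size
        # A segment counts once, however many sites differ inside it.
        if seq_a[start:end] != seq_b[start:end]:
            differing += 1
    return differing
```

With `segment_size=1` this reduces to the number of differing sites; with a segment as long as the whole sequence it becomes a simple identical/non-identical indicator.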

Mena-Ali JI, Stephenson AG
Segregation analyses of partial self-incompatibility in self and cross progeny of Solanum carolinense reveal a leaky S-allele.
Genetics. 2007 Sep;177(1):501-10.
Natural populations of self-incompatible species often exhibit marked phenotypic variation among individuals in the strength of self-incompatibility (SI). In previous studies, we found that the strength of the SI response in Solanum carolinense, a weedy invasive with RNase-mediated SI, is a plastic trait. Selfing can be particularly important for weeds and other successional species that typically undergo repeated colonization and local extinction events and whose population sizes are often small. We applied a PCR-based protocol to identify the S-alleles present in 16 maternal genotypes and their offspring and performed a two-generation greenhouse study to determine whether variation in the strength of SI is due to the existence of weak and strong S-alleles differing in their ability to recognize and reject self-pollen. We found that allele S9 sets significantly more self seed than the other S-alleles in the population we sampled and that its ability to self is not dependent on interactions with other S-alleles. Our data suggest that the observed variations in self-fertility are likely due to factors that directly influence the expression of SI by altering the translation, turnover, or activity of the S-RNase. The variability in the strength of SI among individuals that we have observed in this and our previous studies raises the possibility that plasticity in the strength of SI in S. carolinense may play a role in the colonization and establishment of this weedy species. [Abstract/Link to Full Text]

Kolkman JM, Berry ST, Leon AJ, Slabaugh MB, Tang S, Gao W, Shintani DK, Burke JM, Knapp SJ
Single nucleotide polymorphisms and linkage disequilibrium in sunflower.
Genetics. 2007 Sep;177(1):457-68.
Genetic diversity in modern sunflower (Helianthus annuus L.) cultivars (elite oilseed inbred lines) has been shaped by domestication and breeding bottlenecks and wild and exotic allele introgression, the former narrowing and the latter broadening genetic diversity. To assess single nucleotide polymorphism (SNP) frequencies, nucleotide diversity, and linkage disequilibrium (LD) in modern cultivars, alleles were resequenced from 81 genic loci distributed throughout the sunflower genome. DNA polymorphisms were abundant; 1078 SNPs (1/45.7 bp) and 178 insertions-deletions (INDELs) (1/277.0 bp) were identified in 49.4 kbp of DNA/genotype. SNPs were twofold more frequent in noncoding (1/32.1 bp) than coding (1/62.8 bp) sequences. Nucleotide diversity was only slightly lower in inbred lines (0.0094) than wild populations (0.0128). Mean haplotype diversity was 0.74. When extrapolated across the genome (approximately 3500 Mbp), sunflower was predicted to harbor at least 76.4 million common SNPs among modern cultivar alleles. LD decayed more slowly in inbred lines than wild populations (mean LD declined to 0.32 by 5.5 kbp in the former, the maximum physical distance surveyed), a difference attributed to domestication and breeding bottlenecks. SNP frequencies and LD decay are sufficient in modern sunflower cultivars for very high-density genetic mapping and high-resolution association mapping. [Abstract/Link to Full Text]

Mutiu AI, Hoke SM, Genereaux J, Hannam C, MacKenzie K, Jobin-Robitaille O, Guzzo J, Côté J, Andrews B, Haniford DB, Brandl CJ
Structure/function analysis of the phosphatidylinositol-3-kinase domain of yeast Tra1.
Genetics. 2007 Sep;177(1):151-66.
Tra1 is an essential component of the Saccharomyces cerevisiae SAGA and NuA4 complexes. Using targeted mutagenesis, we identified residues within its C-terminal phosphatidylinositol-3-kinase (PI3K) domain that are required for function. The phenotypes of tra1-P3408A, S3463A, and SRR3413-3415AAA included temperature sensitivity and reduced growth in media containing 6% ethanol or calcofluor white or depleted of phosphate. These alleles resulted in a twofold or greater change in expression of approximately 7% of yeast genes in rich media and reduced activation of PHO5 and ADH2 promoters. Tra1-SRR3413 associated with components of both the NuA4 and SAGA complexes and with the Gal4 transcriptional activation domain similar to wild-type protein. Tra1-SRR3413 was recruited to the PHO5 promoter in vivo but gave rise to decreased relative amounts of acetylated histone H3 and histone H4 at SAGA and NuA4 regulated promoters. Distinct from other components of these complexes, tra1-SRR3413 resulted in generation-dependent telomere shortening and synthetic slow growth in combination with deletions of a number of genes with roles in membrane-related processes. While the tra1 alleles have some phenotypic similarities with deletions of SAGA and NuA4 components, their distinct nature may arise from the simultaneous alteration of SAGA and NuA4 functions. [Abstract/Link to Full Text]

Gallardo TD, John GB, Shirley L, Contreras CM, Akbay EA, Haynie JM, Ward SE, Shidler MJ, Castrillon DH
Genomewide discovery and classification of candidate ovarian fertility genes in the mouse.
Genetics. 2007 Sep;177(1):179-94.
Female infertility syndromes are among the most prevalent chronic health disorders in women, but their genetic basis remains unknown because of uncertainty regarding the number and identity of ovarian factors controlling the assembly, preservation, and maturation of ovarian follicles. To systematically discover ovarian fertility genes en masse, we employed a mouse model (Foxo3) in which follicles are assembled normally but then undergo synchronous activation. We developed a microarray-based approach for the systematic discovery of tissue-specific genes and, by applying it to Foxo3 ovaries and other samples, defined a surprisingly large set of ovarian factors (n = 348, approximately 1% of the mouse genome). This set included the vast majority of known ovarian factors, 44% of which when mutated produce female sterility phenotypes, but most were novel. Comparative profiling of other tissues, including microdissected oocytes and somatic cells, revealed distinct gene classes and provided new insights into oogenesis and ovarian function, demonstrating the utility of our approach for tissue-specific gene discovery. This study will thus facilitate comprehensive analyses of follicle development, ovarian function, and female infertility. [Abstract/Link to Full Text]

Hutter S, Li H, Beisswanger S, De Lorenzo D, Stephan W
Distinctly different sex ratios in African and European populations of Drosophila melanogaster inferred from chromosomewide single nucleotide polymorphism data.
Genetics. 2007 Sep;177(1):469-80.
It has been hypothesized that the ratio of X-linked to autosomal sequence diversity is influenced by unequal sex ratios in Drosophila melanogaster populations. We conducted a genome scan of single nucleotide polymorphism (SNP) of 378 autosomal loci in a derived European population and of a subset of 53 loci in an ancestral African population. On the basis of these data and our already available X-linked data, we used a coalescent-based maximum-likelihood method to estimate sex ratios and demographic histories simultaneously for both populations. We confirm our previous findings that the African population experienced a population size expansion while the European population suffered a population size bottleneck. Our analysis also indicates that the female population size in Africa is larger than or equal to the male population size. In contrast, the European population shows a huge excess of males. This unequal sex ratio and the bottleneck alone, however, cannot account for the overly strong decrease of X-linked diversity in the European population (compared to the reduction on the autosome). The patterns of the frequency spectrum and the levels of linkage disequilibrium observed in Europe suggest that, in addition, positive selection must have acted in the derived population. [Abstract/Link to Full Text]

Hoppins SC, Go NE, Klein A, Schmitt S, Neupert W, Rapaport D, Nargang FE
Alternative splicing gives rise to different isoforms of the Neurospora crassa Tob55 protein that vary in their ability to insert beta-barrel proteins into the outer mitochondrial membrane.
Genetics. 2007 Sep;177(1):137-49.
Tob55 is the major component of the TOB complex, which is found in the outer membrane of mitochondria. A sheltered knockout of the tob55 gene was developed in Neurospora crassa. When grown under conditions that reduce the levels of the Tob55 protein, the strain exhibited a reduced growth rate and mitochondria isolated from these cells were deficient in their ability to import beta-barrel proteins. Surprisingly, Western blots of wild-type mitochondrial proteins revealed two bands for Tob55 that differed by approximately 4 kDa in their apparent molecular masses. Sequence analysis of cDNAs revealed that the tob55 mRNA is alternatively spliced and encodes three isoforms of the protein, which are predicted to contain 521, 516, or 483 amino acid residues. Mass spectrometry of proteins isolated from purified outer membrane vesicles confirmed the existence of each isoform in mitochondria. Strains that expressed each isoform of the protein individually were constructed. When cells expressing only the longest form of the protein were grown at elevated temperature, their growth rate was reduced and mitochondria isolated from these cells were deficient in their ability to assemble beta-barrel proteins. [Abstract/Link to Full Text]

Deng H, Bao X, Zhang W, Girton J, Johansen J, Johansen KM
Reduced levels of Su(var)3-9 but not Su(var)2-5 (HP1) counteract the effects on chromatin structure and viability in loss-of-function mutants of the JIL-1 histone H3S10 kinase.
Genetics. 2007 Sep;177(1):79-87.
It has recently been demonstrated that activity of the essential JIL-1 histone H3S10 kinase is a major regulator of chromatin structure and that it functions to maintain euchromatic domains while counteracting heterochromatization and gene silencing. In the absence of JIL-1 kinase activity, the major heterochromatin markers histone H3K9me2 and HP1 spread in tandem to ectopic locations on the chromosome arms. In this study, we show that the lethality as well as some of the chromosome morphology defects associated with the null JIL-1 phenotype to a large degree can be rescued by reducing the dose of the Su(var)3-9 gene. This effect was observed with three different alleles of Su(var)3-9, strongly suggesting it is specific to Su(var)3-9 and not to second site modifiers. This is in contrast to similar experiments performed with alleles of the Su(var)2-5 gene that codes for HP1 in Drosophila where no genetic interactions were detectable between JIL-1 and Su(var)2-5. Taken together, these findings indicate that while Su(var)3-9 histone methyltransferase activity is a major factor in the lethality and chromatin structure perturbations associated with loss of the JIL-1 histone H3S10 kinase, these effects are likely to be uncoupled from HP1. [Abstract/Link to Full Text]

Zhong S, Jannink JL
Using quantitative trait loci results to discriminate among crosses on the basis of their progeny mean and variance.
Genetics. 2007 Sep;177(1):567-76.
To develop inbred lines, parents are crossed to generate segregating populations from which superior inbred progeny are selected. The value of a particular cross thus depends on the expected performance of its best progeny, which we call the superior progeny value. Superior progeny value is a linear combination of the mean of the cross's progeny and their standard deviation. In this study we specify theory to predict a cross's progeny standard deviation from QTL results and explore analytically and by simulation the variance of that standard deviation under different genetic models. We then study the impact of different QTL analysis methods on the prediction accuracy of a cross's superior progeny value. We show that including all markers, rather than only markers with significant effects, improves the prediction. Methods that account for the uncertainty of the QTL analysis by integrating over the posterior distributions of effect estimates also produce better predictions than methods that retain only point estimates from the QTL analysis. The utility of including estimates of a cross's among-progeny standard deviation in the prediction increases with increasing heritability and marker density but decreasing genome size and QTL number. This utility is also higher if crosses are envisioned only among the best parents rather than among all parents. Nevertheless, we show that among crosses the variance of progeny means is generally much greater than the variance of progeny standard deviations, restricting the utility of estimates of progeny standard deviations to a relatively small parameter space. [Abstract/Link to Full Text]
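
As a rough illustration of the "linear combination" mentioned in this abstract, the sketch below ranks two crosses by mean plus a weighted standard deviation. The weight `intensity`, the function name, and the example numbers are assumptions made for illustration only; the abstract does not state the coefficient the authors use, and in practice the mean and standard deviation would be predicted from QTL results rather than given directly.

```python
def superior_progeny_value(progeny_mean: float, progeny_sd: float,
                           intensity: float) -> float:
    """Mean of the cross's progeny plus a weighted standard deviation.

    `intensity` is a hypothetical weight standing in for how strongly
    the best progeny are selected.
    """
    return progeny_mean + intensity * progeny_sd


# Hypothetical crosses given as (predicted progeny mean, predicted progeny SD).
crosses = {"cross_1": (10.0, 1.0), "cross_2": (9.5, 2.0)}

# Rank crosses from highest to lowest superior progeny value.
ranked = sorted(
    crosses,
    key=lambda name: superior_progeny_value(*crosses[name], intensity=1.0),
    reverse=True,
)
```

In this made-up example the cross with the lower mean ranks first because its larger progeny standard deviation outweighs the difference in means, which is the situation in which estimating the standard deviation is useful.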

Aulchenko YS, de Koning DJ, Haley C
Genomewide rapid association using mixed model and regression: a fast and simple method for genomewide pedigree-based quantitative trait loci association analysis.
Genetics. 2007 Sep;177(1):577-85.
For pedigree-based quantitative trait loci (QTL) association analysis, a range of methods utilizing within-family variation such as transmission-disequilibrium test (TDT)-based methods have been developed. In scenarios where stratification is not a concern, methods exploiting between-family variation in addition to within-family variation, such as the measured genotype (MG) approach, have greater power. Application of MG methods can be computationally demanding (especially for large pedigrees), making genomewide scans practically infeasible. Here we suggest a novel approach for genomewide pedigree-based quantitative trait loci (QTL) association analysis: genomewide rapid association using mixed model and regression (GRAMMAR). The method first obtains residuals adjusted for family effects and subsequently analyzes the association between these residuals and genetic polymorphisms using rapid least-squares methods. At the final step, the selected polymorphisms may be followed up with the full measured genotype (MG) analysis. In a simulation study, we compared type 1 error, power, and operational characteristics of the proposed method with those of MG and TDT-based approaches. For moderately heritable (30%) traits in human pedigrees the power of the GRAMMAR and the MG approaches is similar and is much higher than that of TDT-based approaches. When using tabulated thresholds, the proposed method is less powerful than MG for very high heritabilities and pedigrees including large sibships like those observed in livestock pedigrees. However, there is little or no difference in empirical power of MG and the proposed method. In any scenario, GRAMMAR is much faster than MG and enables rapid analysis of hundreds of thousands of markers. [Abstract/Link to Full Text]
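
The two-step structure described in this abstract (residuals adjusted for family effects, followed by a fast least-squares scan) can be sketched as follows. This is a deliberately simplified stand-in and not the published GRAMMAR procedure: family means replace the mixed-model adjustment, and all function names and data layouts are hypothetical.

```python
from collections import defaultdict


def family_adjusted_residuals(phenotypes, families):
    """Step 1 (simplified): subtract each family's mean phenotype.

    Assumption: the published method fits a mixed model; family means
    are used here only as an easy-to-follow substitute.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, family in zip(phenotypes, families):
        sums[family] += value
        counts[family] += 1
    return [value - sums[family] / counts[family]
            for value, family in zip(phenotypes, families)]


def least_squares_slope(x, y):
    """Slope of the simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        return 0.0  # monomorphic marker: no variation to regress on
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def scan_markers(phenotypes, families, genotypes_by_marker):
    """Step 2: regress the residuals on each marker's genotype codes."""
    residuals = family_adjusted_residuals(phenotypes, families)
    return {marker: least_squares_slope(codes, residuals)
            for marker, codes in genotypes_by_marker.items()}
```

The speed advantage arises because the family adjustment is computed once and only the cheap regression is repeated per marker; markers that stand out could then be passed to a full measured genotype analysis, as the abstract suggests.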

Hadfield JD, Wilson AJ
Multilevel selection 3: modeling the effects of interacting individuals as a function of group size.
Genetics. 2007 Sep;177(1):667-8.
Bijma et al. (2007a,b) presented a quantitative genetic theory of multilevel selection and showed how to estimate the relevant parameters using standard restricted maximum-likelihood (REML) methodology. Extending their results we develop a wider class of models that provide a more realistic framework for capturing the effects of interacting individuals. These models also make use of standard REML techniques and include the original model as a special case. [Abstract/Link to Full Text]

Jang JK, Rahman T, Kober VS, Cesario J, McKim KS
Misregulation of the kinesin-like protein Subito induces meiotic spindle formation in the absence of chromosomes and centrosomes.
Genetics. 2007 Sep;177(1):267-80.
Bipolar spindles assemble in the absence of centrosomes in the oocytes of many species. In Drosophila melanogaster oocytes, the chromosomes have been proposed to initiate spindle assembly by nucleating or capturing microtubules, although the mechanism is not understood. An important contributor to this process is Subito, which is a kinesin-6 protein that is required for bundling interpolar microtubules located within the central spindle at metaphase I. We have characterized the domains of Subito that regulate its activity and its specificity for antiparallel microtubules. This analysis has revealed that the C-terminal domain may interact independently with microtubules while the motor domain is required for maintaining the interaction with the antiparallel microtubules. Surprisingly, deletion of the N-terminal domain resulted in a Subito protein capable of promoting the assembly of bipolar spindles that do not include centrosomes or chromosomes. Bipolar acentrosomal spindle formation during meiosis in oocytes may be driven by the bundling of antiparallel microtubules. Furthermore, these experiments have revealed evidence of a nuclear- or chromosome-based signal that acts at a distance to activate Subito. Instead of the chromosomes directly capturing microtubules, signals released upon nuclear envelope breakdown may activate proteins like Subito, which in turn bundles together microtubules. [Abstract/Link to Full Text]

Pablo-Hernando ME, Arnaiz-Pita Y, Nakanishi H, Dawson D, del Rey F, Neiman AM, Vázquez de Aldana CR
Cdc15 is required for spore morphogenesis independently of Cdc14 in Saccharomyces cerevisiae.
Genetics. 2007 Sep;177(1):281-93.
In Saccharomyces cerevisiae exit from mitosis requires the Cdc14 phosphatase to reverse CDK-mediated phosphorylation. Cdc14 is released from the nucleolus by the Cdc14 early anaphase release (FEAR) and mitotic exit network (MEN) pathways. In meiosis, the FEAR pathway is essential for exit from anaphase I. The MEN component Cdc15 is required for the formation of mature spores. To analyze the role of Cdc15 during sporulation, a conditional mutant in which CDC15 expression was controlled by the CLB2 promoter was used. Cdc15-depleted cells proceeded normally through the meiotic divisions but were unable to properly disassemble meiosis II spindles. The morphology of the prospore membrane was aberrant and failed to capture the nuclear lobes. Cdc15 was not required for Cdc14 release from the nucleoli, but it was essential to maintain Cdc14 released and for its nucleo-cytoplasmic transport. However, cells carrying a CDC14 allele with defects in nuclear export (Cdc14-DeltaNES) were able to disassemble the spindle and to complete spore formation, suggesting that the Cdc14 nuclear export defect was not the cause of the phenotypes observed in cdc15 mutants. [Abstract/Link to Full Text]

Youderian P, Hartzell PL
Triple mutants uncover three new genes required for social motility in Myxococcus xanthus.
Genetics. 2007 Sep;177(1):557-66.
The bacterium Myxococcus xanthus glides over surfaces using two different locomotive mechanisms, called S (social) and A (adventurous) motility, that enable cells to move both as groups and as individuals. Neither mechanism involves flagella. The functions of these two motors are coordinated by the activity of a small Ras-like protein, encoded by the mglA gene. The results of previous studies of a second-site suppressor of the mglA-8 missense mutation, masK-815, indicate that MglA interacts with a protein tyrosine kinase, MasK, to control social motility. Sequence analysis of the sites of 12 independent insertions of the transposon magellan-4 that result in the loss of motility in an M. xanthus mglA-8 masK-815 double mutant shows that nine of these 12 insertions are in genes known to be required for S gliding motility. This result confirms that the masK-815 suppressor restores S but not A motility. Three of the 12 insertions define three new genes required for S motility and show that the attachment of heptose to the lipopolysaccharide inner core, an ortholog of the CheR methyltransferase, and a large protein with YD repeat motifs are required for S motility. When these three insertions are backcrossed into an otherwise wild-type genetic background, their recombinants are found to have defects in S, but not A, motility. The spectrum of magellan-4 insertions that lead to the loss of S motility in the mglA-8 masK-815 double mutant background is different than that resulting from a previous mutant hunt starting with a different (A mutant) genetic background, suggesting that the number of genes required for S motility in M. xanthus is quite large. [Abstract/Link to Full Text]

James N, Landrieux E, Collart MA
A SAGA-independent function of SPT3 mediates transcriptional deregulation in a mutant of the Ccr4-Not complex in Saccharomyces cerevisiae.
Genetics. 2007 Sep;177(1):123-35.
The conserved multi-subunit Ccr4-Not complex regulates gene expression in diverse ways. In this work, we characterize the suppression of temperature sensitivity associated with a mutation in the gene encoding the scaffold subunit of the Ccr4-Not complex, NOT1, by the deletion of SPT3. We determine that the deletion of SPT3, but not the deletion of genes encoding other subunits of the SAGA complex, globally suppresses transcriptional defects of not1-2. We find that transcriptional activation in not1-2 is associated with increased binding of TFIID and SAGA at promoters of upregulated genes, and this is suppressed by the deletion of SPT3. Interestingly, Spt3p-dependent activation of transcription occurs in not1-2 even if the SAGA complex is disrupted by the deletion of SPT7 that encodes a subunit of SAGA required for its integrity. Consistent with a SAGA-independent function of Spt3p, the deletion of SPT3 displays synthetic phenotypes when combined with a deletion of SPT7. Taken together, our results provide a new view of the Spt3 protein by identifying a SAGA-independent function of this protein that is functionally linked to the Ccr4-Not complex. [Abstract/Link to Full Text]

Recent Articles in Cell & Chromosome

No recent articles are currently available.

Recent Articles in Genome Biology

Zhang ZD, Rozowsky J, Lam HY, Du J, Snyder M, Gerstein M
Tilescope: online analysis pipeline for high-density tiling microarray data.
Genome Biol. 2007;8(5):R81.
We developed Tilescope, a fully integrated data processing pipeline for analyzing high-density tiling-array data. In a completely automated fashion, Tilescope will normalize signals between channels and across arrays, combine replicate experiments, score each array element, and identify genomic features. The program is designed with a modular, three-tiered architecture, facilitating parallelism, and a graphic user-friendly interface, presenting results in an organized web page, downloadable for further analysis. [Abstract/Link to Full Text]

Bonhomme F, Rivals E, Orth A, Grant GR, Jeffreys AJ, Bois PR
Species-wide distribution of highly polymorphic minisatellite markers suggests past and present genetic exchanges among house mouse subspecies.
Genome Biol. 2007;8(5):R80.
BACKGROUND: Four hypervariable minisatellite loci were scored on a panel of 116 individuals of various geographical origins representing a large part of the diversity present in house mouse subspecies. Internal structures of alleles were determined by minisatellite variant repeat mapping PCR to produce maps of intermingled patterns of variant repeats along the repeat array. To reconstruct the genealogy of these arrays of variable length, the specifically designed software MS_Align was used to estimate molecular divergences, graphically represented as neighbor-joining trees. RESULTS: Given the high haplotypic diversity detected (mean He = 0.962), these minisatellite trees proved to be highly informative for tracing past and present genetic exchanges. Examples of identical or nearly identical alleles were found across subspecies and in geographically very distant locations, together with poor lineage sorting among subspecies except for the X-chromosome locus MMS30 in Mus mus musculus. Given the high mutation rate of mouse minisatellite loci, this picture cannot be interpreted only with simple splitting events followed by retention of polymorphism, but implies recurrent gene flow between already differentiated entities. CONCLUSION: This strongly suggests that, at least for the chromosomal regions under scrutiny, wild house mouse subspecies constitute a set of interrelated gene pools still connected through long range gene flow or genetic exchanges occurring in the various contact zones existing nowadays or that have existed in the past. Identifying genomic regions that do not follow this pattern will be a challenging task for pinpointing genes important for speciation. [Abstract/Link to Full Text]

Okoniewski MJ, Yates T, Dibben S, Miller CJ
An annotation infrastructure for the analysis and interpretation of Affymetrix exon array data.
Genome Biol. 2007;8(5):R79.
Affymetrix exon arrays contain probesets intended to target every known and predicted exon in the entire genome, posing significant challenges for high-throughput genome-wide data analysis. X:MAP, an annotation database, and exonmap, a BioConductor/R package, are designed to support fine-grained analysis of exon array data. The system supports the application of standard statistical techniques, prior to the use of genome scale annotation to provide gene-, transcript- and exon-level summaries and visualization tools. [Abstract/Link to Full Text]

Yan B, Yang X, Lee TL, Friedman J, Tang J, Van Waes C, Chen Z
Genome-wide identification of novel expression signatures reveal distinct patterns and prevalence of binding motifs for p53, nuclear factor-kappaB and other signal transcription factors in head and neck squamous cell carcinoma.
Genome Biol. 2007;8(5):R78.
BACKGROUND: Differentially expressed gene profiles have previously been observed among pathologically defined cancers by microarray technologies, including head and neck squamous cell carcinomas (HNSCCs). However, the molecular expression signatures and transcriptional regulatory controls that underlie the heterogeneity in HNSCCs are not well defined. RESULTS: Genome-wide cDNA microarray profiling of ten HNSCC cell lines revealed novel gene expression signatures that distinguished cancer cell subsets associated with p53 status. Three major clusters of over-expressed genes (A to C) were defined through hierarchical clustering, Gene Ontology, and statistical modeling. The promoters of genes in these clusters exhibited different patterns and prevalence of transcription factor binding sites for p53, nuclear factor-kappaB (NF-kappaB), activator protein (AP)-1, signal transducer and activator of transcription (STAT)3 and early growth response (EGR)1, as compared with the frequency in vertebrate promoters. Cluster A genes involved in chromatin structure and function exhibited enrichment for p53 and decreased AP-1 binding sites, whereas clusters B and C, containing cytokine and antiapoptotic genes, exhibited a significant increase in prevalence of NF-kappaB binding sites. An increase in STAT3 and EGR1 binding sites was distributed among the over-expressed clusters. Novel regulatory modules containing p53 or NF-kappaB concomitant with other transcription factor binding motifs were identified, and experimental data supported the predicted transcriptional regulation and binding activity. CONCLUSION: The transcription factors p53, NF-kappaB, and AP-1 may be important determinants of the heterogeneous pattern of gene expression, whereas STAT3 and EGR1 may broadly enhance gene expression in HNSCCs. Defining these novel gene signatures and regulatory mechanisms will be important for establishing new molecular classifications and subtyping, which in turn will promote development of targeted therapeutics for HNSCC. [Abstract/Link to Full Text]

Liu Y, Ringnér M
Revealing signaling pathway deregulation by using gene expression signatures and regulatory motif analysis.
Genome Biol. 2007;8(5):R77.
Gene expression signatures consisting of tens to hundreds of genes have been found to be informative for different biological states. Recently, many computational methods have been proposed for biological interpretation of such signatures. However, there is a lack of methods for identifying cell signaling pathways whose deregulation results in an observed expression signature. We present a strategy for identifying such signaling pathways and evaluate the strategy using six human and mouse gene expression signatures. [Abstract/Link to Full Text]

Cokol M, Rodriguez-Esteban R, Rzhetsky A
A recipe for high impact.
Genome Biol. 2007;8(5):406.
Our analysis highlights common statistical features of high-impact articles; we also show how information flows among various publication types. [Abstract/Link to Full Text]

Herschkowitz JI, Simin K, Weigman VJ, Mikaelian I, Usary J, Hu Z, Rasmussen KE, Jones LP, Assefnia S, Chandrasekharan S, Backlund MG, Yin Y, Khramtsov AI, Bastein R, Quackenbush J, Glazer RI, Brown PH, Green JE, Kopelovich L, Furth PA, Palazzo JP, Olopade OI, Bernard PS, Churchill GA, Van Dyke T, Perou CM
Identification of conserved gene expression features between murine mammary carcinoma models and human breast tumors.
Genome Biol. 2007;8(5):R76.
BACKGROUND: Although numerous mouse models of breast carcinomas have been developed, we do not know the extent to which any faithfully represent clinically significant human phenotypes. To address this need, we characterized mammary tumor gene expression profiles from 13 different murine models using DNA microarrays and compared the resulting data to those from human breast tumors. RESULTS: Unsupervised hierarchical clustering analysis showed that six models (TgWAP-Myc, TgMMTV-Neu, TgMMTV-PyMT, TgWAP-Int3, TgWAP-Tag, and TgC3(1)-Tag) yielded tumors with distinctive and homogeneous expression patterns within each strain. However, in each of four other models (TgWAP-T121, TgMMTV-Wnt1, Brca1Co/Co;TgMMTV-Cre;p53+/- and DMBA-induced), tumors with a variety of histologies and expression profiles developed. In many models, similarities to human breast tumors were recognized, including proliferation and human breast tumor subtype signatures. Significantly, tumors of several models displayed characteristics of human basal-like breast tumors, including two models with induced Brca1 deficiencies. Tumors of other murine models shared features and trended towards significance of gene enrichment with human luminal tumors; however, these murine tumors lacked expression of estrogen receptor (ER) and ER-regulated genes. TgMMTV-Neu tumors did not have a significant gene overlap with the human HER2+/ER- subtype and were more similar to human luminal tumors. CONCLUSION: Many of the defining characteristics of human subtypes were conserved among the mouse models. Although no single mouse model recapitulated all the expression features of a given human subtype, these shared expression features provide a common framework for an improved integration of murine mammary tumor models with human breast tumors. [Abstract/Link to Full Text]

Brody T, Rasband W, Baler K, Kuzin A, Kundu M, Odenwald WF
cis-Decoder discovers constellations of conserved DNA sequences shared among tissue-specific enhancers.
Genome Biol. 2007;8(5):R75.
A systematic approach is described for analysis of evolutionarily conserved cis-regulatory DNA using cis-Decoder, a tool for discovery of conserved sequence elements that are shared between similarly regulated enhancers. Analysis of 2,086 conserved sequence blocks (CSBs), identified from 135 characterized enhancers, reveals most CSBs consist of shorter overlapping/adjacent elements that are either enhancer type-specific or common to enhancers with divergent regulatory behaviors. Our findings suggest that enhancers employ overlapping repertoires of highly conserved core elements. [Abstract/Link to Full Text]

Nilsson B, Håkansson P, Johansson M, Nelander S, Fioretos T
Threshold-free high-power methods for the ontological analysis of genome-wide gene-expression studies.
Genome Biol. 2007;8(5):R74.
Ontological analysis facilitates the interpretation of microarray data. Here we describe new ontological analysis methods which, unlike existing approaches, are threshold-free and statistically powerful. We perform extensive evaluations and introduce a new concept, detection spectra, to characterize methods. We show that different ontological analysis methods exhibit distinct detection spectra, and that it is critical to account for this diversity. Our results argue strongly against the continued use of existing methods, and provide directions towards an enhanced approach. [Abstract/Link to Full Text]

Takahashi R, Miller JH
Conversion of amino-acid sequence in proteins to classical music: search for auditory patterns.
Genome Biol. 2007;8(5):405.
We have converted genome-encoded protein sequences into musical notes to reveal auditory patterns without compromising musicality. We derived a reduced range of 13 base notes by pairing similar amino acids and distinguishing them using variations of three-note chords and codon distribution to dictate rhythm. The conversion will help make genomic coding sequences more approachable for the general public, young children, and vision-impaired scientists. [Abstract/Link to Full Text]

Rustici G, van Bakel H, Lackner DH, Holstege FC, Wijmenga C, Bähler J, Brazma A
Global transcriptional responses of fission and budding yeast to changes in copper and iron levels: a comparative study.
Genome Biol. 2007;8(5):R73.
BACKGROUND: Recent studies in comparative genomics demonstrate that interspecies comparison represents a powerful tool for identifying both conserved and specialized biologic processes across large evolutionary distances. All cells must adjust to environmental fluctuations in metal levels, because levels that are too low or too high can be detrimental. Here we explore the conservation of metal homoeostasis in two distantly related yeasts. RESULTS: We examined genome-wide gene expression responses to changing copper and iron levels in budding and fission yeast using DNA microarrays. The comparison reveals conservation of only a small core set of genes, defining the copper and iron regulons, with a larger number of additional genes being specific for each species. Novel regulatory targets were identified in Schizosaccharomyces pombe for Cuf1p (pex7 and SPAC3G6.05) and Fep1p (srx1, sib1, sib2, rds1, isu1, SPBC27B12.03c, SPAC1F8.02c, and SPBC947.05c). We also present evidence refuting a direct role of Cuf1p in the repression of genes involved in iron uptake. Remarkable differences were detected in responses of the two yeasts to excess copper, probably reflecting evolutionary adaptation to different environments. CONCLUSION: The considerable evolutionary distance between budding and fission yeast resulted in substantial diversion in the regulation of copper and iron homeostasis. Despite these differences, the conserved regulation of a core set of genes involved in the uptake of these metals provides valuable clues to key features of metal metabolism. [Abstract/Link to Full Text]

Wong YW, Schulze C, Streichert T, Gronostajski RM, Schachner M, Tilling T
Gene expression analysis of nuclear factor I-A deficient mice indicates delayed brain maturation.
Genome Biol. 2007;8(5):R72.
BACKGROUND: Nuclear factor I-A (NFI-A), a phylogenetically conserved transcription/replication protein, plays a crucial role in mouse brain development. Previous studies have shown that disruption of the Nfia gene in mice leads to perinatal lethality, corpus callosum agenesis, and hydrocephalus. RESULTS: To identify potential NFI-A target genes involved in the observed tissue malformations, we analyzed gene expression in brains from Nfia-/- and Nfia+/+ littermate mice at the mRNA level using oligonucleotide microarrays. In young postnatal animals (postnatal day 16), 356 genes were identified as being differentially regulated, whereas at the late embryonic stage (embryonic day 18) only five dysregulated genes were found. An in silico analysis identified phylogenetically conserved NFI binding sites in at least 70 of the differentially regulated genes. Moreover, assignment of gene function showed that marker genes for immature neural cells and neural precursors were expressed at elevated levels in young postnatal Nfia-/- mice. In contrast, marker genes for differentiated neural cells were downregulated at this stage. In particular, genes relevant for oligodendrocyte differentiation were affected. CONCLUSION: Our findings suggest that brain development, especially oligodendrocyte maturation, is delayed in Nfia-/- mice during the early postnatal period, which at least partly accounts for their phenotype. The identification of potential NFI-A target genes in our study should help to elucidate NFI-A dependent transcriptional pathways and contribute to enhanced understanding of this period of brain formation, especially with regard to the function of NFI-A. [Abstract/Link to Full Text]

Lefébure T, Stanhope MJ
Evolution of the core and pan-genome of Streptococcus: positive selection, recombination, and genome composition.
Genome Biol. 2007;8(5):R71.
BACKGROUND: The genus Streptococcus is one of the most diverse and important human and agricultural pathogens. This study employs comparative evolutionary analyses of 26 Streptococcus genomes to yield an improved understanding of the relative roles of recombination and positive selection in pathogen adaptation to their hosts. RESULTS: Streptococcus genomes exhibit extreme levels of evolutionary plasticity, with high levels of gene gain and loss during species and strain evolution. S. agalactiae has a large pan-genome, with little recombination in its core-genome, while S. pyogenes has a smaller pan-genome and much more recombination of its core-genome, perhaps reflecting the greater habitat, and gene pool, diversity for S. agalactiae compared to S. pyogenes. Core-genome recombination was evident in all lineages (18% to 37% of the core-genome judged to be recombinant), while positive selection was mainly observed during species differentiation (from 11% to 34% of the core-genome). Positive selection pressure was unevenly distributed across lineages and biochemical main role categories. S. suis was the lineage with the greatest level of positive selection pressure, the largest number of unique loci selected, and the largest amount of gene gain and loss. CONCLUSION: Recombination is an important evolutionary force in shaping Streptococcus genomes, not only in the acquisition of significant portions of the genome as lineage specific loci, but also in facilitating rapid evolution of the core-genome. Positive selection, although undoubtedly a slower process, has nonetheless played an important role in adaptation of the core-genome of different Streptococcus species to different hosts. [Abstract/Link to Full Text]

Meyer JN, Boyd WA, Azzam GA, Haugen AC, Freedman JH, Van Houten B
Decline of nucleotide excision repair capacity in aging Caenorhabditis elegans.
Genome Biol. 2007;8(5):R70.
BACKGROUND: Caenorhabditis elegans is an important model for the study of DNA damage and repair related processes such as aging, neurodegeneration, and carcinogenesis. However, DNA repair is poorly characterized in this organism. We adapted a quantitative polymerase chain reaction assay to characterize repair of DNA damage induced by ultraviolet type C (UVC) radiation in C. elegans, and then tested whether DNA repair rates were affected by age in adults. RESULTS: UVC radiation induced lesions in young adult C. elegans, with a slope of 0.4 to 0.5 lesions per 10 kilobases of DNA per 100 J/m2, in both nuclear and mitochondrial targets. L1 and dauer larvae were more than fivefold more sensitive to lesion formation than were young adults. Nuclear repair kinetics in a well expressed nuclear gene were biphasic in nongravid adult nematodes: a faster, first order (half-life about 16 hours) phase lasting approximately 24 hours and resulting in removal of about 60% of the photoproducts was followed by a much slower phase. Repair in ten nuclear DNA regions was 15% and 50% higher in more actively transcribed regions in young and aging adults, respectively. Finally, repair was reduced by 30% to 50% in each of the ten nuclear regions in aging adults. However, this decrease in repair could not be explained by a reduction in expression of nucleotide excision repair genes, and we present a plausible mechanism, based on gene expression data, to account for this decrease. CONCLUSION: Repair of UVC-induced DNA damage in C. elegans is similar kinetically and genetically to repair in humans. Furthermore, this important repair process slows significantly in aging C. elegans, the first whole organism in which this question has been addressed. [Abstract/Link to Full Text]

Fodor AA, Tickle TL, Richardson C
Towards the uniform distribution of null P values on Affymetrix microarrays.
Genome Biol. 2007;8(5):R69.
Methods to control false-positive rates require that P values of genes that are not differentially expressed follow a uniform distribution. Commonly used microarray statistics can generate P values that do not meet this assumption. We show that poorly characterized variance, imperfect normalization, and cross-hybridization are among the many causes of this non-uniform distribution. We demonstrate a simple technique that produces P values that are close to uniform for nondifferentially expressed genes in control datasets. [Abstract/Link to Full Text]
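
The uniformity requirement described above can be checked with a simple diagnostic. The sketch below is not the authors' technique; it only illustrates one common way to measure how far a set of null P values departs from a uniform distribution, using the Kolmogorov-Smirnov distance. The function name `ks_uniform_statistic` is hypothetical.

```python
def ks_uniform_statistic(p_values):
    """Kolmogorov-Smirnov distance between the empirical CDF of the
    given P values and the Uniform(0, 1) CDF (illustrative diagnostic only)."""
    ps = sorted(p_values)
    n = len(ps)
    if n == 0:
        raise ValueError("need at least one P value")
    d = 0.0
    for i, p in enumerate(ps):
        # The empirical CDF jumps from i/n to (i+1)/n at p; the uniform CDF equals p.
        d = max(d, (i + 1) / n - p, p - i / n)
    return d


# Hypothetical inputs: an evenly spread set of P values, and a set skewed toward zero.
uniform_like = [(i + 0.5) / 100 for i in range(100)]
skewed = [p ** 3 for p in uniform_like]

print(ks_uniform_statistic(uniform_like))
print(ks_uniform_statistic(skewed))
```

A small distance suggests the null P values are close to uniform; a large one suggests that false-positive control based on those P values may be unreliable.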

King BR, Guda C
ngLOC: an n-gram-based Bayesian method for estimating the subcellular proteomes of eukaryotes.
Genome Biol. 2007;8(5):R68.
We present a method called ngLOC, an n-gram-based Bayesian classifier that predicts the localization of a protein sequence over ten distinct subcellular organelles. A tenfold cross-validation result shows an accuracy of 89% for sequences localized to a single organelle, and 82% for those localized to multiple organelles. An enhanced version of ngLOC was developed to estimate the subcellular proteomes of eight eukaryotic organisms: yeast, nematode, fruitfly, mosquito, zebrafish, chicken, mouse, and human. [Abstract/Link to Full Text]
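
To make the idea of an n-gram-based Bayesian classifier concrete, here is a minimal sketch of a naive Bayes model over overlapping sequence n-grams. It is not the ngLOC implementation: the class name `NGramBayes`, the Laplace smoothing, the n-gram length, and the toy training sequences are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def ngrams(seq, n=3):
    """Return all overlapping n-grams of a sequence."""
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]


class NGramBayes:
    """Naive Bayes over n-gram counts with Laplace smoothing (illustrative only)."""

    def __init__(self, n=3, alpha=1.0):
        self.n = n
        self.alpha = alpha
        self.counts = defaultdict(Counter)  # label -> n-gram counts
        self.class_totals = Counter()       # label -> number of training sequences
        self.vocab = set()

    def fit(self, sequences, labels):
        for seq, label in zip(sequences, labels):
            grams = ngrams(seq, self.n)
            self.counts[label].update(grams)
            self.class_totals[label] += 1
            self.vocab.update(grams)
        return self

    def log_scores(self, seq):
        n_seqs = sum(self.class_totals.values())
        v = len(self.vocab) + 1  # one extra slot for unseen n-grams
        scores = {}
        for label, n_label in self.class_totals.items():
            total = sum(self.counts[label].values())
            score = math.log(n_label / n_seqs)  # log prior
            for g in ngrams(seq, self.n):
                score += math.log(
                    (self.counts[label][g] + self.alpha) / (total + self.alpha * v)
                )
            scores[label] = score
        return scores

    def predict(self, seq):
        scores = self.log_scores(seq)
        return max(scores, key=scores.get)


# Toy, made-up training data: lysine/arginine-rich vs hydrophobic sequences.
model = NGramBayes(n=3).fit(
    ["KKRKKKRK", "RKKRKKKR", "LLVLLIVL", "VLLILLVL"],
    ["nucleus", "nucleus", "membrane", "membrane"],
)
print(model.predict("KRKKRK"))
```

Working in log space avoids numerical underflow for long sequences, and the smoothing term keeps an n-gram that was never seen for a class from driving that class's probability to zero.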

Su AI, Hogenesch JB
Power-law-like distributions in biomedical publications and research funding.
Genome Biol. 2007;8(4):404.
Gene annotation, as measured by links to the biomedical literature and funded grants, is governed by a power law, indicating that researchers favor the extensive study of relatively few genes. This emphasizes the need for data-driven science to accomplish genome-wide gene annotation. [Abstract/Link to Full Text]

Beltran S, Angulo M, Pignatelli M, Serras F, Corominas M
Functional dissection of the ash2 and ash1 transcriptomes provides insights into the transcriptional basis of wing phenotypes and reveals conserved protein interactions.
Genome Biol. 2007;8(4):R67.
BACKGROUND: The trithorax group (trxG) genes absent, small or homeotic discs 1 (ash1) and 2 (ash2) were isolated in a screen for mutants with abnormal imaginal discs. Mutations in either gene cause homeotic transformations but Hox genes are not their only targets. Although analysis of double mutants revealed that ash2 and ash1 mutations enhance each other's phenotypes, suggesting they are functionally related, it was shown that these proteins are subunits of distinct complexes. RESULTS: The analysis of wing imaginal disc transcriptomes from ash2 and ash1 mutants showed that they are highly similar. Functional annotation of regulated genes using Gene Ontology allowed identification of severely affected groups of genes that could be correlated to the wing phenotypes observed. Comparison of the differentially expressed genes with those from other genome-wide analyses revealed similarities between ASH2 and Sin3A, suggesting a putative functional relationship. Coimmunoprecipitation studies and immunolocalization on polytene chromosomes demonstrated that ASH2 and Sin3A interact with HCF (host-cell factor). The results of nucleosome western blots and clonal analysis indicated that ASH2 is necessary for trimethylation of the Lys4 on histone 3 (H3K4). CONCLUSION: The similarity between the transcriptomes of ash2 and ash1 mutants supports a model in which the two genes act together to maintain stable states of transcription. Like in humans, both ASH2 and Sin3A bind HCF. Finally, the reduction of H3K4 trimethylation in ash2 mutants is the first evidence in Drosophila regarding the molecular function of this trxG gene. [Abstract/Link to Full Text]

Solignac M, Mougel F, Vautrin D, Monnerot M, Cornuet JM
A third-generation microsatellite-based linkage map of the honey bee, Apis mellifera, and its comparison with the sequence-based physical map.
Genome Biol. 2007;8(4):R66.
BACKGROUND: The honey bee is a key model for social behavior and this feature led to the selection of the species for genome sequencing. A genetic map is a necessary companion to the sequence. In addition, because there was originally no physical map for the honey bee genome project, a meiotic map was the only resource for organizing the sequence assembly on the chromosomes. RESULTS: We present the genetic (meiotic) map here and describe the main features that emerged from comparison with the sequence-based physical map. The genetic map of the honey bee is saturated and the chromosomes are oriented from the centromeric to the telomeric regions. The map is based on 2,008 markers and is about 40 Morgans (M) long, resulting in a marker density of one every 2.05 centiMorgans (cM). For the 186 megabases (Mb) of the genome mapped and assembled, this corresponds to a very high average recombination rate of 22.04 cM/Mb. Honey bee meiosis shows a relatively homogeneous recombination rate along and across chromosomes, as well as within and between individuals. Interference is higher than inferred from the Kosambi function of distance. In addition, numerous recombination hotspots are dispersed over the genome. CONCLUSION: The very large genetic length of the honey bee genome, its small physical size and an almost complete genome sequence with a relatively low number of genes suggest a very promising future for association mapping in the honey bee, particularly as the existence of haploid males allows easy bulk segregant analysis. [Abstract/Link to Full Text]
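
The map statistics quoted above can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures given in the abstract (2,008 markers, 186 Mb, 22.04 cM/Mb); the variable names are illustrative, and small differences from the quoted values are presumably due to rounding.

```python
markers = 2008            # markers on the genetic map
physical_mb = 186         # megabases mapped and assembled
rate_cm_per_mb = 22.04    # average recombination rate

# Genetic length implied by the recombination rate and the physical size.
map_cm = rate_cm_per_mb * physical_mb
map_morgans = map_cm / 100

# Average spacing if the markers were spread evenly over that length.
spacing_cm = map_cm / markers

print(f"map length: {map_cm:.2f} cM ({map_morgans:.1f} M)")
print(f"average marker spacing: {spacing_cm:.2f} cM")
```

The implied length of roughly 41 Morgans and spacing of roughly 2 cM agree with the abstract's "about 40 Morgans" and one marker every 2.05 cM.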

Jolly ER, Chin CS, Miller S, Bahgat MM, Lim KC, DeRisi J, McKerrow JH
Gene expression patterns during adaptation of a helminth parasite to different environmental niches.
Genome Biol. 2007;8(4):R65.
BACKGROUND: Schistosome bloodflukes are complex trematodes responsible for 200 million cases of schistosomiasis worldwide. Their life cycle is characterized by a series of remarkable morphological and biochemical transitions between an invertebrate host, an aquatic environment, and a mammalian host. We report a global transcriptional analysis of how this parasite alters gene regulation to adapt to three distinct environments. RESULTS: Utilizing a genomic microarray made of 12,000 45-50-mer oligonucleotides based on expressed sequence tags, three different developmental stages of the schistosome parasite were analyzed by pair-wise comparisons of transcript hybridization signals. This analysis resulted in the identification of 1,154 developmentally enriched transcripts. CONCLUSION: This study expands the repertoire of schistosome genes analyzed for stage-specific expression to over 70% of the predicted genome. Among the new associations identified are the roles of robust protein synthesis and programmed cell death in development of cercariae in the sporocyst stages, the relative paucity of cercarial gene expression outside of energy production, and the remarkable diversity of adult gene expression programs that reflect adaptation to the host bloodstream and an average lifespan that may approach 10 years. [Abstract/Link to Full Text]

Clark TA, Schweitzer AC, Chen TX, Staples MK, Lu G, Wang H, Williams A, Blume JE
Discovery of tissue-specific exons using comprehensive human exon microarrays.
Genome Biol. 2007;8(4):R64.
BACKGROUND: Higher eukaryotes express a diverse population of messenger RNAs generated by alternative splicing. Large-scale methods for monitoring gene expression must adapt in order to accurately detect the transcript variation generated by this splicing. RESULTS: We have designed a high-density oligonucleotide microarray with probesets for more than one million annotated and predicted exons in the human genome. Using these arrays and a simple algorithm that normalizes exon signal to signal from the gene as a whole, we have identified tissue-specific exons from a panel of 16 different normal adult tissues. RT-PCR validation confirms approximately 86% of the predicted tissue-enriched probesets. Pair-wise comparisons between the tissues suggest that as many as 73% of detected genes are differentially alternatively spliced. We also demonstrate how an inclusive exon microarray can be used to discover novel alternative splicing events. As examples, 17 new tissue-specific exons from 11 genes were validated by RT-PCR and sequencing. CONCLUSION: In conjunction with a conceptually simple algorithm, comprehensive exon microarrays can detect tissue-specific alternative splicing events. Our data suggest significant expression outside of known exons and well annotated genes and a high frequency of alternative splicing events. In addition, we identified and validated a number of novel exons with tissue-specific splicing patterns. The tissue map data will likely serve as a valuable source of information on the regulation of alternative splicing. [Abstract/Link to Full Text]
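
The abstract describes the algorithm only as normalizing exon signal to the signal from the gene as a whole. A minimal sketch of that idea follows. The function names, the use of a median over the other tissues as the comparison baseline, and the example intensities are all assumptions, not details taken from the paper.

```python
from statistics import median


def normalized_intensity(exon_signal, gene_signal):
    """Exon signal divided by gene-level signal, per tissue.
    Tissues with no positive gene-level signal are skipped."""
    return {
        tissue: exon_signal[tissue] / gene_signal[tissue]
        for tissue in exon_signal
        if gene_signal.get(tissue, 0) > 0
    }


def tissue_enrichment(ni):
    """Ratio of each tissue's normalized intensity to the median of the
    other tissues (an assumed scoring rule, for illustration only)."""
    scores = {}
    for tissue, value in ni.items():
        others = [v for t, v in ni.items() if t != tissue]
        if others and median(others) > 0:
            scores[tissue] = value / median(others)
    return scores


# Made-up signals: the gene is expressed at similar levels everywhere,
# but the exon is included mainly in brain.
exon = {"brain": 900.0, "liver": 100.0, "heart": 120.0}
gene = {"brain": 1000.0, "liver": 1000.0, "heart": 1200.0}

ni = normalized_intensity(exon, gene)
scores = tissue_enrichment(ni)
print(max(scores, key=scores.get))
```

Dividing by the gene-level signal separates changes in exon inclusion from changes in overall gene expression, which is why an exon can score as tissue-enriched even when the gene itself is expressed at similar levels across tissues.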

Qin X, Ahn S, Speed TP, Rubin GM
Global analyses of mRNA translational control during early Drosophila embryogenesis.
Genome Biol. 2007;8(4):R63.
BACKGROUND: In many animals, the first few hours of life proceed with little or no transcription, and developmental regulation at these early stages is dependent on maternal cytoplasm rather than the zygotic nucleus. Translational control is critical for early Drosophila embryogenesis and is exerted mainly at the gene level. To understand post-transcriptional regulation during Drosophila early embryonic development, we used sucrose polysomal gradient analyses and GeneChip analysis to illustrate the translation profile of individual mRNAs. RESULTS: We determined ribosomal density and ribosomal occupancy of over 10,000 transcripts during the first ten hours after egg laying. CONCLUSION: We report the extent and general nature of gene regulation at the translational level during early Drosophila embryogenesis on a genome-wide basis. The diversity of the translation profiles indicates multiple mechanisms modulating transcript-specific translation. Cluster analyses suggest that the genes involved in some biological processes are co-regulated at the translational level at certain developmental stages. [Abstract/Link to Full Text]

Rossi L, Salvetti A, Marincola FM, Lena A, Deri P, Mannini L, Batistoni R, Wang E, Gremigni V
Deciphering the molecular machinery of stem cells: a look at the neoblast gene expression profile.
Genome Biol. 2007;8(4):R62.
BACKGROUND: Mammalian stem cells are difficult to access experimentally; model systems that can regenerate offer an alternative way to characterize stem cell related genes. Planarian regeneration depends on adult pluripotent stem cells--the neoblasts. These cells can be selectively destroyed using X-rays, enabling comparison of organisms lacking stem cells with wild-type worms. RESULTS: Using a genomic approach we produced an oligonucleotide microarray chip (the Dj600 chip), which was designed using selected planarian gene sequences. Using this chip, we compared planarians treated with high doses of X-rays (which eliminates all neoblasts) with wild-type worms, which led to identification of a set of putatively neoblast-restricted genes. Most of these genes are involved in chromatin modeling and RNA metabolism, suggesting that epigenetic modifications and post-transcriptional regulation are pivotal in neoblast regulation. Comparing planarians treated with low doses of X-rays (after which some radiotolerant neoblasts re-populate the planarian body) with specimens irradiated with high doses and unirradiated control worms, we identified a group of genes that were upregulated as a consequence of low-dose X-ray treatment. Most of these genes encode proteins that are known to regulate the balance between death and survival of the cell; our results thus suggest that genetic programs that control neoblast cytoprotection, proliferation, and migration are activated by low-dose X-rays. CONCLUSION: The broad differentiation potential of planarian neoblasts is unparalleled by any adult stem cells in the animal kingdom. In addition to our validation of the Dj600 chip as a valuable platform, our work contributes to elucidating the molecular mechanisms that regulate the self-renewal and differentiation of neoblasts. [Abstract/Link to Full Text]

Kunin V, Sorek R, Hugenholtz P
Evolutionary conservation of sequence and secondary structures in CRISPR repeats.
Genome Biol. 2007;8(4):R61.
BACKGROUND: Clustered regularly interspaced short palindromic repeats (CRISPRs) are a novel class of direct repeats, separated by unique spacer sequences of similar length, that are present in approximately 40% of bacterial and most archaeal genomes analyzed to date. More than 40 gene families, called CRISPR-associated sequences (CASs), appear in conjunction with these repeats and are thought to be involved in the propagation and functioning of CRISPRs. It has been recently shown that CRISPR provides acquired resistance against viruses in prokaryotes. RESULTS: Here we analyze CRISPR repeats identified in 195 microbial genomes and show that they can be organized into multiple clusters based on sequence similarity. Some of the clusters present stable, highly conserved RNA secondary structures, while others lack detectable structures. Stable secondary structures exhibit multiple compensatory base changes in the stem region, indicating evolutionary and functional conservation. CONCLUSION: We show that the repeat-based classification corresponds to, and expands upon, a previously reported CAS gene-based classification, including specific relationships between CRISPR and CAS subtypes. [Abstract/Link to Full Text]

Kim BH, Cai X, Vaughn JN, von Arnim AG
On the functions of the h subunit of eukaryotic initiation factor 3 in late stages of translation initiation.
Genome Biol. 2007;8(4):R60.
BACKGROUND: The eukaryotic translation initiation factor 3 (eIF3) has multiple roles during the initiation of translation of cytoplasmic mRNAs. How individual subunits of eIF3 contribute to the translation of specific mRNAs remains poorly understood, however. This is true in particular for those subunits that are not conserved in budding yeast, such as eIF3h. RESULTS: Working with stable reporter transgenes in Arabidopsis thaliana mutants, it was demonstrated that the h subunit of eIF3 contributes to the efficient translation initiation of mRNAs harboring upstream open reading frames (uORFs) in their 5' leader sequence. uORFs, which can function as devices for translational regulation, are present in over 30% of Arabidopsis mRNAs, and are enriched among mRNAs for transcriptional regulators and protein modifying enzymes. Microarray comparisons of polysome loading in wild-type and eif3h mutant seedlings revealed that eIF3h generally helps to maintain efficient polysome loading of mRNAs harboring multiple uORFs. In addition, however, eIF3h also boosted the polysome loading of mRNAs with long leaders or coding sequences. Moreover, the relative polysome loading of certain functional groups of mRNAs, including ribosomal proteins, was actually increased in the eif3h mutant, suggesting that regulons of translational control can be revealed by mutations in generic translation initiation factors. CONCLUSION: The intact eIF3h protein contributes to efficient translation initiation on 5' leader sequences harboring multiple uORFs, although mRNA features independent of uORFs are also implicated. [Abstract/Link to Full Text]

Miller DJ, Hemmrich G, Ball EE, Hayward DC, Khalturin K, Funayama N, Agata K, Bosch TC
The innate immune repertoire in cnidaria--ancestral complexity and stochastic gene loss.
Genome Biol. 2007;8(4):R59.
BACKGROUND: Characterization of the innate immune repertoire of extant cnidarians is of both fundamental and applied interest--it not only provides insights into the basic immunological 'tool kit' of the common ancestor of all animals, but is also likely to be important in understanding the global decline of coral reefs that is presently occurring. Recently, whole genome sequences became available for two cnidarians, Hydra magnipapillata and Nematostella vectensis, and large expressed sequence tag (EST) datasets are available for these and for the coral Acropora millepora. RESULTS: To better understand the basis of innate immunity in cnidarians, we scanned the available EST and genomic resources for some of the key components of the vertebrate innate immune repertoire, focusing on the Toll/Toll-like receptor (TLR) and complement pathways. A canonical Toll/TLR pathway is present in representatives of the basal cnidarian class Anthozoa, but neither a classic Toll/TLR receptor nor a conventional nuclear factor (NF)-kappaB could be identified in the hydrozoan Hydra. Moreover, the detection of complement C3 and several membrane attack complex/perforin domain (MAC/PF) proteins suggests that a prototypic complement effector pathway may exist in anthozoans, but not in hydrozoans. Together with data for several other gene families, this implies that Hydra may have undergone substantial secondary gene loss during evolution. Such losses are not confined to Hydra, however, and at least one MAC/PF gene appears to have been lost from Nematostella. CONCLUSION: Consideration of these patterns of gene distribution underscores the likely significance of gene loss during animal evolution whilst indicating ancient origins for many components of the vertebrate innate immune system. [Abstract/Link to Full Text]

Miller DL, Myers CL, Rickards B, Coller HA, Flint SJ
Adenovirus type 5 exerts genome-wide control over cellular programs governing proliferation, quiescence, and survival.
Genome Biol. 2007;8(4):R58.
BACKGROUND: Human adenoviruses, such as serotype 5 (Ad5), encode several proteins that can perturb cellular mechanisms that regulate cell cycle progression and apoptosis, as well as those that mediate mRNA production and translation. However, a global view of the effects of Ad5 infection on such programs in normal human cells is not available, despite widespread efforts to develop adenoviruses for therapeutic applications. RESULTS: We used two-color hybridization and oligonucleotide microarrays to monitor changes in cellular RNA concentrations as a function of time after Ad5 infection of quiescent, normal human fibroblasts. We observed that the expression of some 2,000 genes, about 10% of those examined, increased or decreased by a factor of two or greater following Ad5 infection, but were not altered in mock-infected cells. Consensus k-means clustering established that the temporal patterns of these changes were unexpectedly complex. Gene Ontology terms associated with cell proliferation were significantly over-represented in several clusters. The results of comparative analyses demonstrate that Ad5 infection induces reversal of the quiescence program and recapitulation of the core serum response, and that only a small subset of the observed changes in cellular gene expression can be ascribed to well characterized functions of the viral E1A and E1B proteins. CONCLUSION: These findings establish that the impact of adenovirus infection on host cell programs is far greater than appreciated hitherto. Furthermore, they provide a new framework for investigating the molecular functions of viral early proteins and information relevant to the design of conditionally replicating adenoviral vectors. [Abstract/Link to Full Text]

Rector A, Lemey P, Tachezy R, Mostmans S, Ghim SJ, Van Doorslaer K, Roelke M, Bush M, Montali RJ, Joslin J, Burk RD, Jenson AB, Sundberg JP, Shapiro B, Van Ranst M
Ancient papillomavirus-host co-speciation in Felidae.
Genome Biol. 2007;8(4):R57.
BACKGROUND: Estimating evolutionary rates for slowly evolving viruses such as papillomaviruses (PVs) is not possible using fossil calibrations directly or sequences sampled over a time-scale of decades. An ability to correlate their divergence with a host species, however, can provide a means to estimate evolutionary rates for these viruses accurately. To determine whether such an approach is feasible, we sequenced complete feline PV genomes, previously available only for the domestic cat (Felis domesticus, FdPV1), from four additional, globally distributed feline species: Lynx rufus PV type 1, Puma concolor PV type 1, Panthera leo persica PV type 1, and Uncia uncia PV type 1. RESULTS: The feline PVs all belong to the Lambdapapillomavirus genus, and contain an unusual second noncoding region between the early and late protein region, which is only present in members of this genus. Our maximum likelihood and Bayesian phylogenetic analyses demonstrate that the evolutionary relationships between feline PVs perfectly mirror those of their feline hosts, despite a complex and dynamic phylogeographic history. By applying host species divergence times, we provide the first precise estimates for the rate of evolution for each PV gene, with an overall evolutionary rate of 1.95 x 10(-8) (95% confidence interval 1.32 x 10(-8) to 2.47 x 10(-8)) nucleotide substitutions per site per year for the viral coding genome. CONCLUSION: Our work provides evidence for long-term virus-host co-speciation of feline PVs, indicating that viral diversity in slowly evolving viruses can be used to investigate host species evolution. These findings, however, should not be extrapolated to other viral lineages without prior confirmation of virus-host co-divergence. [Abstract/Link to Full Text]

Doss MX, Winkler J, Chen S, Hippler-Altenburg R, Sotiriadou I, Halbach M, Pfannkuche K, Liang H, Schulz H, Hummel O, Hübner N, Rottscheidt R, Hescheler J, Sachinidis A
Global transcriptome analysis of murine embryonic stem cell-derived cardiomyocytes.
Genome Biol. 2007;8(4):R56.
BACKGROUND: Characterization of gene expression signatures for cardiomyocytes derived from embryonic stem cells will help to define their early biologic processes. RESULTS: A transgenic alpha-myosin heavy chain (MHC) embryonic stem cell lineage was generated, exhibiting puromycin resistance and expressing enhanced green fluorescent protein (EGFP) under the control of the alpha-MHC promoter. A puromycin-resistant, EGFP-positive, alpha-MHC-positive cardiomyocyte population was isolated with over 92% purity. RNA was isolated after electrophysiological characterization of the cardiomyocytes. Comprehensive transcriptome analysis of alpha-MHC-positive cardiomyocytes in comparison with undifferentiated alpha-MHC embryonic stem cells and the control population from 15-day-old embryoid bodies led to identification of 884 upregulated probe sets and 951 downregulated probe sets in alpha-MHC-positive cardiomyocytes. A subset of upregulated genes encodes cytoskeletal and voltage-dependent channel proteins, and proteins that participate in aerobic energy metabolism. Interestingly, mitosis, apoptosis, and Wnt signaling-associated genes were downregulated in the cardiomyocytes. In contrast, annotations for genes upregulated in the alpha-MHC-positive cardiomyocytes are enriched for the following Gene Ontology (GO) categories: enzyme-linked receptor protein signaling pathway (GO:0007167), protein kinase activity (GO:0004672), negative regulation of Wnt receptor signaling pathway (GO:0030178), and regulation of cell size (GO:0008361). They were also enriched for the Biocarta p38 mitogen-activated protein kinase signaling pathway and Kyoto Encyclopedia of Genes and Genomes (KEGG) calcium signaling pathway. CONCLUSION: The specific pattern of gene expression in the cardiomyocytes derived from embryonic stem cells reflects the biologic, physiologic, and functional processes that take place in mature cardiomyocytes. Identification of cardiomyocyte-specific gene expression patterns and signaling pathways will contribute toward elucidating their roles in intact cardiac function. [Abstract/Link to Full Text]

Bradley KM, Elmore JB, Breyer JP, Yaspan BL, Jessen JR, Knapik EW, Smith JR
A major zebrafish polymorphism resource for genetic mapping.
Genome Biol. 2007;8(4):R55.
We have identified 645,088 candidate polymorphisms in zebrafish and observe a single nucleotide polymorphism (SNP) validation rate of 71% to 86%, improving with polymorphism confidence score. Variant sites are non-random, with an excess of specific novel T- and A-rich motifs. We positioned half of the polymorphisms on zebrafish genetic and physical maps as a resource for positional cloning. We further demonstrate bulked segregant analysis using the anchored SNPs as a method for high-throughput genetic mapping in zebrafish. [Abstract/Link to Full Text]

Recent Articles in Nuclear Receptor

No recent articles are currently available.

Recent Articles in Molecular & Cellular Proteomics

Matic I, van Hagen M, Schimmel J, Macek B, Ogg SC, Tatham MH, Hay RT, Lamond AI, Mann M, Vertegaal AC
In vivo identification of human SUMO polymerization sites by high accuracy mass spectrometry and an in vitro to in vivo strategy.
Mol Cell Proteomics. 2007 Oct 15;
The length and precise linkage of polyubiquitin chains are important for their biological activity. While other ubiquitin-like proteins have the potential to form polymeric chains, their identification in vivo is challenging and their functional role is unclear. Vertebrates express three Small Ubiquitin-like MOdifiers, SUMO-1, SUMO-2 and SUMO-3. Mature SUMO-2 and SUMO-3 are nearly identical and contain an internal consensus site for sumoylation that is missing in SUMO-1. Combining state-of-the-art mass spectrometry with an "in vitro to in vivo" strategy for post-translational modifications, we provide direct evidence that SUMO-1, SUMO-2 and SUMO-3 form mixed chains in cells via the internal consensus sites for sumoylation in SUMO-2 and SUMO-3. In vitro, the chain-length of SUMO polymers could be influenced by changing the relative amounts of SUMO-1 and SUMO-2. The developed methodology is generic and can be adapted for the identification of other sumoylation sites in complex samples. [Abstract/Link to Full Text]

Miller ML, Hanke S, Hinsby AM, Friis C, Brunak S, Mann M, Blom N
Motif decomposition of the phosphotyrosine proteome reveals a new N-terminal binding motif for SHIP2.
Mol Cell Proteomics. 2007 Oct 15;
Advances in mass spectrometry-based proteomics have yielded a substantial mapping of the tyrosine phosphoproteome and thus provided an important step towards a systematic analysis of intracellular signaling networks in higher eukaryotes. In this study we decompose an uncharacterized proteomic data set of 481 unique phosphotyrosine (pY) peptides by sequence similarity to known ligands of the Src Homology 2 (SH2) and the phosphotyrosine binding (PTB) domains. From 20 clusters we extract 16 known and four new interaction motifs. Using quantitative mass spectrometry we pull down pY-specific binding partners for peptides corresponding to the extracted motifs. We confirm numerous previously known interaction motifs and find 15 new interactions mediated by phosphosites not previously known to bind SH2 or PTB domains. Remarkably, a novel hydrophobic N-terminal motif ([LVI]-[LVI]-pY) is identified and validated as a binding motif for the SH2 domain containing inositol phosphatase SHIP2. Our decomposition of the in vivo pY proteome furthermore suggests that two thirds of the pY sites mediate interaction while the remaining third governs processes such as enzyme activation and nucleic acid binding. [Abstract/Link to Full Text]
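
The [LVI]-[LVI]-pY motif reported above can be read as a simple sequence pattern: two residues, each leucine, valine or isoleucine, immediately N-terminal to the phosphotyrosine. The following is a minimal sketch of screening peptides for that pattern, not the authors' decomposition method. It assumes one-letter amino acid codes with "pY" marking the phosphotyrosine; the function name and the example peptides are hypothetical and made up for illustration.

```python
import re

# Assumed notation: one-letter amino acid codes, "pY" marks phosphotyrosine.
# The pattern encodes [LVI]-[LVI]-pY: two L/V/I residues directly before pY.
HYDROPHOBIC_N_TERMINAL_MOTIF = re.compile(r"[LVI][LVI]pY")


def has_motif(peptide: str) -> bool:
    """Return True if the peptide contains the [LVI]-[LVI]-pY pattern."""
    return HYDROPHOBIC_N_TERMINAL_MOTIF.search(peptide) is not None


# Hypothetical peptides, invented for this example (not from the data set).
peptides = ["AELVpYDEV", "GSDEpYVNL", "TILIpYSEP"]
matches = [p for p in peptides if has_motif(p)]
print(matches)
```

A pattern match of this kind only flags candidate sites; binding to SHIP2 would still need experimental validation, as in the study.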

Macek B, Gnad F, Soufi B, Kumar C, Olsen JV, Mijakovic I, Mann M
Phosphoproteome analysis of E. coli reveals evolutionary conservation of bacterial Ser/Thr/Tyr phosphorylation.
Mol Cell Proteomics. 2007 Oct 15;
Protein phosphorylation on serine, threonine and tyrosine (Ser/Thr/Tyr) is generally considered the major regulatory posttranslational modification (PTM) in eukaryotic cells. Increasing evidence at the genome and proteome level shows that this modification is also present and functional in prokaryotes. We have recently reported the first in-depth phosphorylation site-resolved dataset from the model Gram-positive bacterium, Bacillus subtilis, showing that Ser/Thr/Tyr phosphorylation is also present on many essential bacterial proteins. To test whether this modification is common in Eubacteria, here we use a recently developed proteomics approach based on phosphopeptide enrichment and high accuracy mass spectrometry to analyze the phosphoproteome of the model Gram-negative bacterium Escherichia coli. We report 81 phosphorylation sites on 79 E. coli proteins, with distribution of Ser/Thr/Tyr phosphorylation sites 68/23/9%. Despite their phylogenetic distance, phosphoproteomes of E. coli and B. subtilis show striking similarity in size, classes of phosphorylated proteins and distribution of Ser/Thr/Tyr phosphorylation sites. By combining the two datasets we created the largest phosphorylation site-resolved database of bacterial phosphoproteins to date (available online) and used it to study evolutionary conservation of bacterial phosphoproteins and phosphorylation sites across the phylogenetic tree. We demonstrate that bacterial phosphoproteins and phosphorylated residues are significantly more conserved than their non-phosphorylated counterparts, with a number of potential phosphorylation sites conserved from Archaea to humans. Our results establish Ser/Thr/Tyr phosphorylation as a common PTM in Eubacteria, present since the onset of cellular life. [Abstract/Link to Full Text]

Luque-Garcia JL, Zhou G, Spellman DS, Sun TT, Neubert TA
Analysis of electroblotted proteins by mass spectrometry: Protein identification after western blotting.
Mol Cell Proteomics. 2007 Oct 15;
We describe a new approach for the identification and characterization by mass spectrometry of proteins that have been electroblotted onto nitrocellulose. Using this method (Blotting And Removal of Nitrocellulose, or BARN), proteins can be analyzed either as intact proteins for molecular weight determination or as peptides generated by on-membrane proteolysis. Acetone is used to dissolve the nitrocellulose and to precipitate the adsorbed proteins/peptides, thus removing the nitrocellulose which can interfere with mass spectrometry analysis. This method offers improved protein coverage, especially for membrane proteins such as uroplakins, since the extraction step after in-gel digestion is avoided. Moreover, removal of nitrocellulose from the sample solution allows sample analysis by both MALDI- and (LC) ESI-based mass spectrometers. Finally, we demonstrate the utility of BARN for the direct identification of soluble and membrane proteins after Western blotting, obtaining comparable or better results than with in-gel digestion. [Abstract/Link to Full Text]

Yi X, Luk JM, Lee NP, Peng J, Leng X, Guan XY, Lau GK, Beretta L, Fan ST
Association of mortalin (HSPA9) with liver cancer metastasis and prediction for early tumor recurrence.
Mol Cell Proteomics. 2007 Oct 14;
Hepatocellular carcinoma (HCC) is well known for its poor prognosis and short survival owing to a high recurrence rate even after curative surgery. Today there is no available biomarker or biochemical test to indicate HCC recurrence, and this study aims to identify protein markers that can discriminate post-operative patients with early recurrence (ER), defined as disease relapse within the first year. In this study, 103 hepatitis B-related HCC patients were recruited and 68 of them were used for ER-related biomarker discovery study. Proteomic expression patterns of matched tumor and adjacent non-tumor tissues from these patients plus 16 normal liver tissues were delineated by 2-DE differential profiling method. Significant protein spots were evaluated by hierarchical clustering analysis. SSP4612 that yielded the highest ROC curve value for the ER subgroup of HCC was subsequently identified by tandem mass spectrometry, and the corresponding expression patterns were further confirmed by quantitative PCR, western blot and immunohistochemistry. Correlation analysis with clinicopathological data was also performed. Proteomic profiling analysis revealed overexpression of mortalin (gene HSPA9) in HCC when compared to the non-tumor and normal liver tissues (AUC=0.821). Furthermore, elevated mortalin level was also detected in the ER subgroup of HCC versus the recurrence free (RF) state (where no cancer recurs for >1 year) (AUC=0.833, sensitivity=90.9% and specificity=71.4%). Metastatic HCC cell lines also exhibited higher levels of mortalin and HSPA9 mRNA. Clinically, mortalin overexpression in HCC was closely associated with advanced tumor stages and venous infiltration, having implications for increased malignancy and aggressive behavior. Mortalin (HSPA9) is associated with HCC metastasis and is thus suggested as a tumor marker for predicting early recurrence, which may have immediate clinical applications for cancer surveillance after curative surgery. [Abstract/Link to Full Text]

Carroll AJ, Heazlewood JL, Ito J, Millar AH
Analysis of the Arabidopsis cytosolic ribosome proteome provides detailed insights into its components and their post-translational modification.
Mol Cell Proteomics. 2007 Oct 13;
Finding gene-specific peptides by mass spectrometry analysis to pinpoint gene loci responsible for particular protein products is a major challenge in proteomics, especially in highly conserved gene families in higher eukaryotes. We have used a combination of in silico approaches coupled to mass spectrometry analysis to advance the proteomic insight into Arabidopsis cytosolic ribosomal composition and its post-translational modifications. In silico digestion of all 409 ribosomal protein sequences in Arabidopsis defined the proportion of theoretical gene-specific peptides for each gene family and highlighted the need for low m/z cutoffs of MS ion selection for MS/MS to characterize low MW, highly basic ribosomal proteins. We undertook an extensive MS/MS survey of the cytosolic ribosome, using trypsin, and when required, chymotrypsin and pepsin. We then used custom software to extract and filter peptide match information from Mascot result files and implement high confidence criteria for calling gene-specific identifications based on the highest quality unambiguous spectra matching exclusively to certain in silico predicted gene- or gene family-specific peptides. This has provided an in-depth analysis of the protein composition based on 1446 high quality MS/MS spectra matching to 795 peptide sequences from ribosomal proteins. These have identified peptides from 5 gene families of r-proteins not identified previously, providing experimental data on 79 of the 80 different types of ribosomal subunits. We provide strong evidence for gene-specific identification of 87 different ribosomal proteins from these 79 families. We also provided new information on 30 specific sites of co- and post-translational modification of r-proteins in Arabidopsis by initiator methionine removal, N-terminal acetylation, N-terminal methylation, lysine N-methylation and phosphorylation. This site-specific modification data provides a wealth of resources for further assessment of the role of ribosome modification in influencing translation in Arabidopsis. [Abstract/Link to Full Text]
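
The idea of in silico digestion followed by a search for gene-specific peptides can be illustrated with a small sketch. This is not the authors' custom software. It assumes the commonly used simplified trypsin rule (cleave C-terminal to K or R, except when the next residue is P) and ignores missed cleavages, chymotrypsin and pepsin; the gene names and sequences are hypothetical toy data.

```python
from collections import defaultdict


def tryptic_digest(sequence):
    """Cleave after K or R unless the next residue is P (simplified rule)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal remainder
    return peptides


def gene_specific_peptides(proteins):
    """Map each gene to the peptides produced by that gene and no other."""
    owners = defaultdict(set)
    for gene, sequence in proteins.items():
        for peptide in tryptic_digest(sequence):
            owners[peptide].add(gene)
    specific = defaultdict(set)
    for peptide, genes in owners.items():
        if len(genes) == 1:
            specific[next(iter(genes))].add(peptide)
    return dict(specific)


# Hypothetical two-member gene family differing at a single residue.
toy_family = {"geneA": "MKAAGRLLK", "geneB": "MKAAGRLVK"}
print(gene_specific_peptides(toy_family))
```

In this toy family only the C-terminal peptide distinguishes the two genes, which mirrors why highly conserved families yield few gene-specific peptides.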

Bai Y, Markham K, Chen F, Weerasekera R, Watts J, Horne P, Wakutani Y, Bagshaw R, Mathews PM, Fraser PE, Westaway D, St George-Hyslop P, Schmitt-Ulms G
The in vivo brain interactome of the amyloid precursor protein.
Mol Cell Proteomics. 2007 Nov 19;
Despite intense research efforts, the physiological function and molecular environment of the amyloid precursor protein has remained enigmatic. Here we describe the application of time-controlled transcardiac perfusion crosslinking (tcTPC), a method for the in vivo mapping of protein interactions in intact tissue, to study the interactome of the amyloid precursor protein (APP). To gain insights into the specificity of reported protein interactions the study was extended to the mammalian amyloid precursor like proteins (APLP1 and APLP2). To rule out sampling bias as an explanation for differences in the individual data sets, a small-scale quantitative iTRAQ-based comparison of APP, APLP1 and APLP2 interactomes was carried out. An interactome map was derived which confirmed 8 previously reported interactions of APP and revealed the identity of more than 30 additional proteins which reside in spatial proximity to APP in the brain. Subsequent validation studies confirmed a physiological interaction between APP and leucine rich repeat and Ig domain containing protein 1 (LINGO-1), demonstrated a strong influence of LINGO-1 on the proteolytic processing of APP, and consolidated similarities in the biology of APP and p75. [Abstract/Link to Full Text]

Wang Y, Klemke RL
Phosphoblast: A computational tool for comparing phosphoprotein signatures among large datasets.
Mol Cell Proteomics. 2007 Oct 13;
Identification of specific protein phosphorylation sites provides predictive signatures of cellular activity and specific disease states such as cancer, diabetes, Alzheimer's, and rheumatoid arthritis. Recent progress in phosphopeptide isolation technology and tandem mass spectrometry has provided the means to identify thousands of phosphorylation sites from a single biological sample. These advances now make it possible to profile global changes in the phosphoproteome at an unprecedented level. However, while this technology is generating a wealth of information, there is currently no efficient means to identify phosphoprotein signatures shared among large phosphoprotein databases. Identification of common phosphoprotein signatures found in biologically relevant systems and their conservation throughout evolution would provide valuable insight into mechanisms of signal transduction and cell function. Here we describe the development of a computational program (PhosphoBlast) that can rapidly match thousands of phosphopeptides that share phosphorylation sites within and across species. PhosphoBlast analysis of several large phosphoprotein datasets from the literature revealed common phosphorylation signatures shared across diverse experimental platforms and species. Moreover, PhosphoBlast is a powerful analysis tool to identify specific phosphosite mutations. Comparison of the mouse and human phosphoproteomes revealed more than 130 specific phosphoamino acid mutations, some of which are predicted to alter protein function. Further analysis revealed that known phosphorylated amino acids are more evolutionarily conserved than the S/T/Y amino acids not known to be phosphorylated. Together our results demonstrate that PhosphoBlast is a versatile mining tool capable of identifying related phosphorylation signatures and phosphoamino acid mutations among complex proteomic datasets in a highly efficient and accurate manner. PhosphoBlast will aid in the informatic analysis of the phosphoproteome and the identification of phosphoprotein biomarkers of disease. [Abstract/Link to Full Text]

Cragnolini JJ, López de Castro JA
Identification of endogenously presented peptides from Chlamydia trachomatis with high homology to human proteins and to a natural self-ligand of HLA-B27.
Mol Cell Proteomics. 2007 Oct 13;
A strategy for the stable expression of proteins, or large protein fragments, from Chlamydia trachomatis into human cells was designed to identify bacterial epitopes endogenously processed and presented by HLA-B27. Fusion protein constructs in which the green fluorescent protein gene was placed at the 5'-end of the bacterial DNA primase gene or some of its fragments were transfected into B*2705-C1R cells. One of these constructs, including residues 90-450 of the bacterial protein, was stably and efficiently expressed. Mass spectrometry-based comparative analysis of HLA-B27-bound peptide pools led to identification of 3 HLA-B27 ligands differentially presented in the transfectant cells. Sequencing of these peptides confirmed that they derived from the bacterial DNA primase. One of them, spanning residues 211-221, showed 55% sequence identity with a known self-ligand of HLA-B27 derived from its own molecule. The other two bacterial ligands, P(112-121) and P(112-122), derived from the same region and differed in length by one residue at the C-terminus. Both peptides showed >50% identity with multiple human protein sequences that possessed the optimal peptide motifs for HLA-B27 binding. Thus, expression of proteins from arthritogenic bacteria in HLA-B27-positive human cells allows identifying bacterial peptides that are endogenously processed and presented by HLA-B27 and show molecular mimicry with known self-ligands of this molecule and human proteins. [Abstract/Link to Full Text]

Song XC, Fu G, Yang X, Jiang Z, Wang Y, Zhou GW
Protein expression profiling of breast cancer cells by dissociable antibody microarray (DAMA) staining.
Mol Cell Proteomics. 2007 Oct 13;
Dissociable Antibody MicroArray (DAMA) staining is a technology that combines protein microarrays with traditional immunostaining techniques. It can simultaneously determine the expression and subcellular location of hundreds of proteins in cultured cells and tissue samples. We have developed this technology and demonstrated its application in identifying potential biomarkers for breast cancer. We have compared the expression profiles of 312 proteins among three normal breast cell lines and seven breast cancer cell lines, and have identified ten differentially expressed proteins by the data analysis program DAMAPEP. Among those proteins, RAIDD, Rb p107, Rb p130, SRF and Tyk2 have been confirmed by Western blot and statistical analysis to have higher expression levels in breast cancer cells than in normal breast cells. These proteins could be potential biomarkers for the diagnosis of breast cancer. [Abstract/Link to Full Text]

Wang X, Huang L
Identifying dynamic interactors of protein complexes by quantitative mass spectrometry.
Mol Cell Proteomics. 2007 Oct 12;
Dynamically interacting proteins associate and dissociate with their binding partners at high on/off rates. Although their identification is of great significance to proteomics research, lack of an efficient strategy to distinguish stable and dynamic interactors has hampered the efforts towards this goal. In this work, we have developed a new method, MAP (mixing after purification)-SILAC, to quantitatively investigate the interactions of protein complexes by mass spectrometry. In combination with the original SILAC approach, stable and dynamic components are effectively distinguished by the differences in their relative abundance ratio changes. We applied the newly developed strategies to decipher the dynamics of the human 26S proteasome interacting proteins (PIPs). A total of 67 putative human PIPs are identified by the MAP-SILAC method, among which 14 proteins would have been misidentified as background proteins due to low relative abundance ratios in standard SILAC experiments and 57 proteins have not been previously reported. In addition, 35 of the 67 proteins are classified as stable interactors of the proteasome complex, whereas 16 of them are identified as dynamic ones. The methods reported here provide a valuable expansion of proteomics technologies for identification of important but previously unidentifiable interacting proteins. [Abstract/Link to Full Text]

Pandey A, Chakraborty S, Datta A, Chakraborty N
Proteomics approach to identify dehydration responsive nuclear proteins from chickpea (Cicer arietinum L.).
Mol Cell Proteomics. 2007 Oct 6;
Dehydration or water-deficit is one of the most important environmental stress factors that greatly influences plant growth and development, and limits crop productivity. Plants respond and adapt to such stress by altering their cellular metabolism and activating various defense machineries. Mechanisms that operate signal perception, transduction and downstream regulatory events provide valuable information about the underlying pathways involved in environmental stress responses. The nuclear proteins constitute a highly organized, complex network that plays diverse roles during cellular development and other physiological processes. In order to gain a better understanding of dehydration response in plants, we have developed a comparative nuclear proteome in a food legume, chickpea (Cicer arietinum L.). Three-week-old chickpea seedlings were subjected to progressive dehydration by withdrawing water and the changes in the nuclear proteome were examined using two-dimensional gel electrophoresis. Approximately 205 protein spots were found to be differentially regulated under dehydration. Mass spectrometry analysis allowed the identification of 147 differentially expressed proteins, presumably involved in a variety of functions including gene transcription and replication, molecular chaperones, cell signaling and chromatin remodeling. The dehydration responsive nuclear proteome of chickpea revealed a coordinated response, which involves both the regulatory as well as the functional proteins. This study, for the first time, provides an insight into the complex metabolic network operating in the nucleus during dehydration. [Abstract/Link to Full Text]

Madoz-Gurpide J, Kuick R, Wang H, Misek DE, Hanash SM
Integral protein microarrays for the identification of lung cancer antigens in sera that induce a humoral immune response.
Mol Cell Proteomics. 2007 Oct 4;
The identification of biomarkers in patient sera is of great interest for the diagnosis of cancers. In this context, the detection of antibodies to tumor cell autologous antigens has great potential. The humoral immune response represents a form of biological amplification of signals that are otherwise weak due to very low concentrations of antigen, especially in early stages of cancers. Herein we present the use of integral microarrays spotted with tumor-derived proteins to investigate the antibody repertoire in the sera of lung cancer patients and controls. The use of two-dimensional liquid chromatography allowed us to separate proteins from the lung adenocarcinoma cell line A549 into 1760 fractions, which were printed onto nitrocellulose-coated slides. The sensitivity and specificity of the microarrays to detect singular antibodies in fluids was first validated through the recognition of fractions containing a lung marker antigen by antibody probing. Twenty fractions were initially selected as highly reactive against the anti-PGP9.5 antibody, and subsequent mass spectrometry analyses confirmed the identity of PGP9.5 protein in 4 of them. As a result, the importance of neighboring fractions in microarray detection was revealed, due to the spreading of proteins during the separation process. Next, the microarrays were individually incubated with 14 serum samples from lung cancer patients, 14 sera from colon cancer patients, and 14 control sera from healthy subjects. The reactivity of the selected fractions was analyzed, and the level of immunoglobulin bound to each fraction by each serum sample was quantified. Eight out of the 20 fractions offered P values < 0.01, and were recognized by an average of 4 reacting patients, while no serum from normal individuals was positive for those fractions. Protein microarrays from tumor-derived fractions hold the diagnostic potential of uncovering antigens that induce an immune response in patients with certain types of cancers. [Abstract/Link to Full Text]

Björling E, Lindskog C, Oksvold P, Linné J, Kampf C, Hober S, Uhlen M, Ponten F
A web-based tool for in silico biomarker discovery based on tissue-specific protein profiles in normal and cancer tissues.
Mol Cell Proteomics. 2007 Oct 3;
Here we report the development of a publicly available web-based analysis tool for exploring proteins expressed in a tissue- or cancer-specific manner. The search queries are based on the human tissue profiles in normal and cancer cells in the Human Protein Atlas portal and rely on the individual annotation performed by pathologists of images representing immunohistochemically stained tissue sections. Approximately 1.8 million images representing more than 3000 antibodies directed towards human proteins were used in the study. The search tool allows for the systematic exploration of the protein atlas, to discover potential protein biomarkers. Such biomarkers include tissue specific markers, cell type specific markers, tumor type specific markers, markers of malignancy and prognostic or predictive markers of cancers. Here we show examples of database queries to generate sets of candidate biomarker proteins for several of these different categories. Expression profiles of candidate proteins can then subsequently be validated by examination of the underlying high-resolution images. The present study shows examples of search strategies revealing several potential protein biomarkers, including proteins specifically expressed in normal cells and in cancer cells from specified tumor types. The lists of candidate proteins can be used as a starting point for further validation in larger patient cohorts using both immunological approaches and technologies employing more classical proteomics tools. [Abstract/Link to Full Text]

Højlund K, Yi Z, Hwang H, Bowen B, Lefort N, Flynn CR, Langlais P, Weintraub ST, Mandarino LJ
Characterization of the human skeletal muscle proteome by one-dimensional gel electrophoresis and HPLC-ESI-MS/MS.
Mol Cell Proteomics. 2007 Oct 1;
Changes in protein abundance in skeletal muscle are central to a large number of metabolic and other disorders, including, and perhaps most commonly, insulin resistance. Proteomic analysis of human muscle is an important approach for gaining insight into the biochemical basis for normal and pathophysiological conditions. However, to date, the number of proteins identified by this approach has been limited, with 107 different proteins being the maximum reported so far. Using a combination of one-dimensional-gel electrophoresis and HPLC-ESI-MS/MS, we identified 954 different proteins in human vastus lateralis muscle obtained from 3 healthy, non-obese subjects. In addition to a large number of isoforms of contractile proteins, we detected all proteins involved in the major pathways of glucose and lipid metabolism in skeletal muscle. Mitochondrial proteins accounted for 22% of all proteins identified, including 55 subunits of the respiratory complexes I-V. Moreover, a number of enzymes involved in endocrine and metabolic signaling pathways as well as calcium homeostasis were identified. These results provide the most comprehensive characterization of the human skeletal muscle proteome to date. These data hold promise for future global assessment of quantitative changes in the muscle proteome of patients affected by disorders involving skeletal muscle. [Abstract/Link to Full Text]

Patel SS, Rexach M
Discovering novel interactions at the nuclear pore complex using Bead Halo: A rapid method for detecting molecular interactions of high and low affinity at equilibrium.
Mol Cell Proteomics. 2007 Sep 26;
A highly sensitive, equilibrium-based binding assay termed "Bead Halo" was used here to identify and characterize interactions involving components of the nucleocytoplasmic transport machinery in eukaryotes. Bead Halo uncovered novel interactions between the importin Kap95 and the nucleoporins (nups) Nic96, Pom34, Gle1, Ndc1, Nup84 and Seh1, which likely occur during nuclear pore complex (NPC) biogenesis. Bead Halo was also used to characterize the molecular determinants for binding between Kap95 and the family of nups that feature multiple phenylalanine-glycine motifs (FG nups). Binding was sensitive to the number of FG motifs present and to amino-acid (AA) residues immediately flanking the FG motifs. Also, binding was reduced, but not abolished when phenylalanine residues in all FG motifs were replaced by tyrosine or tryptophan. These results suggest flexibility in the binding pockets of Kap95 and synergism in binding FG motifs. The hypothesis that Nup53 and Nup59 bind directly to membranes through a C-terminal amphipathic alpha helix, and to DNA via a RRM domain, was also tested and validated using Bead Halo. The results support a role for these nups in nuclear pore membrane biogenesis and in gene expression. Finally, Bead Halo detected binding of the nups Gle1, Nup60 and Nsp1 to phospholipid bilayers. This may reflect the known interaction between Gle1 and phosphoinositides and suggests similar interactions for Nup60 and Nsp1. As the Bead Halo assay detected molecular interactions in cell lysates, as well as between purified components, it can be adapted for large-scale proteomic studies using automated robotics and microscopy. [Abstract/Link to Full Text]

Bastas G, Sompuram SR, Pierce B, Vani K, Bogen SA
Bioinformatic requirements for protein database searching using predicted epitopes from disease-associated antibodies.
Mol Cell Proteomics. 2007 Sep 25;
We describe a new approach to identify proteins involved in disease pathogenesis. The technology, Epitope-Mediated Antigen Prediction (E-MAP), leverages the specificity of patients' immune responses to disease-relevant targets and requires no prior knowledge about the protein. E-MAP links pathologic antibodies of unknown specificity, isolated from patient sera, to their cognate antigens in the protein database. The E-MAP process first involves reconstruction of a predicted epitope using a peptide combinatorial library. We then search the protein database for closely matching amino acid sequences. Previously published attempts to identify unknown antibody targets in this manner have largely been unsuccessful, for two reasons: short predicted epitopes yield too many irrelevant matches from a database search and the epitopes may not accurately represent the native antigen with sufficient fidelity. Using an in silico model, we demonstrate the critical threshold requirements for epitope length and epitope fidelity. We find that epitopes generally need to have at least seven amino acids, with an overall accuracy of >70% to the native protein, in order to correctly identify the protein in a non-redundant protein database search. We then confirmed these findings experimentally, using the predicted epitopes for four monoclonal antibodies. Since many predicted epitopes often fail to achieve the seven amino acid threshold, we demonstrate the efficacy of paired epitope searches. This is the first systematic analysis of the computational framework to make this approach viable, coupled with experimental validation. [Abstract/Link to Full Text]

Kim YS, Hwang SY, Kang HY, Sohn H, Oh S, Kim JY, Yoo JS, Kim YH, Kim CH, Jeon JH, Lee JM, Kang HA, Miyoshi E, Taniguchi N, Yoo HS, Ko JH
Functional proteomic study reveals that N-acetylglucosaminyltransferase V reinforces the invasive/metastatic potential of colon cancer through aberrant glycosylation on TIMP-1.
Mol Cell Proteomics. 2007 Sep 18;
N-acetylglucosaminyltransferase-V (GnT-V) has been reported to be up-regulated in invasive/metastatic cancer cells, but a comprehensive understanding of how the transferase correlates with the invasive/metastatic potential is not currently available. Through a glycomic approach, we identified 30 proteins including tissue inhibitor of metalloproteinase-1 (TIMP-1) as a target protein for GnT-V in human colon cancer cell WiDr. TIMP-1 was aberrantly glycosylated as characterized by the addition of β1,6-N-acetylglucosamine, polylactosaminylation, and sialylation in GnT-V-overexpressing WiDr cells. Compared to normal TIMP-1, the aberrantly glycosylated TIMP-1 showed weaker inhibition of both matrix metalloproteinase (MMP)-2 and MMP-9, and this aberrancy was closely associated with cancer cell invasion and metastasis in vivo as well as in vitro. Integrated data on both TIMP-1 expression level and aberrant glycosylation could provide important information to help improve the clinical outcome of colon cancer patients. [Abstract/Link to Full Text]

Hebeler R, Oeljeklaus S, Reidegeld KA, Eisenacher M, Stephan C, Sitek B, Stühler K, Meyer HE, Sturre MJ, Dijkwel PP, Warscheid B
Study of early leaf senescence in Arabidopsis thaliana by quantitative proteomics using reciprocal 14N/15N-labeling and difference gel electrophoresis.
Mol Cell Proteomics. 2007 Sep 18;
Leaf senescence represents the final stage of leaf development and is associated with fundamental changes on the level of the proteome. For the quantitative analysis of changes in protein abundance related to early leaf senescence, we designed an elaborate double and reverse labeling strategy simultaneously employing fluorescent two-dimensional DIGE as well as metabolic (15)N-labeling followed by MS. Reciprocal (14)N/(15)N-labeling of entire Arabidopsis thaliana plants showed that full incorporation of (15)N into the plant's proteins did not cause any adverse effects on development and protein expression. A direct comparison of DIGE and (15)N-labeling combined with MS showed that results obtained by both quantification methods correlated well for proteins showing low to moderate regulation factors. Nano HPLC/ESI-MS/MS analysis of 21 protein spots that consistently exhibited abundance differences in nine biological replicates based on both DIGE and MS resulted in the identification of 13 distinct proteins and protein subunits that showed significant regulation in Arabidopsis mutant plants displaying advanced leaf senescence. Ribulose 1,5-bisphosphate carboxylase/oxygenase large and three of its four small subunits were found to be down-regulated, which reflects the degradation of the photosynthetic machinery during leaf senescence. Among the proteins showing higher abundance in mutant plants were several members of the glutathione S-transferase (GST) family class phi and quinone reductase. Up-regulation of these proteins fits well into the context of leaf senescence since they are generally involved in the protection of plant cells against reactive oxygen species which are increasingly generated by lipid degradation during leaf senescence. With the exception of one GST isoform, none of these proteins has been linked to leaf senescence before. [Abstract/Link to Full Text]

Ulintz PJ, Bodenmiller B, Andrews PC, Aebersold R, Nesvizhskii AI
Investigating MS2-MS3 matching statistics: A model for coupling consecutive stage mass spectrometry data for increased peptide identification confidence.
Mol Cell Proteomics. 2007 Sep 13;
Improvements in ion trap instrumentation have made n-dimensional mass spectrometry more practical. The overall goal of the study is to describe a model for making use of MS2 and MS3 information in mass spectrometry experiments. We present a statistical model for adjusting peptide identification probabilities based on the combined information obtained by coupling peptide assignments of consecutive MS2 and MS3 spectra. Using two data sets, a mixture of known proteins and a complex phosphopeptide-enriched sample, we demonstrate an increase in discriminating power of the adjusted probabilities, compared to models using MS2 or MS3 data only. This work also addresses the overall value of generating MS3 data as compared to an MS2-only approach, with a focus on the analysis of phosphopeptide data. [Abstract/Link to Full Text]

Fry BG, Scheib H, van der Weerd L, Young B, McNaughtan J, Ryan Ramjan SF, Vidal N, Poelmann RE, Norman JA
Evolution of an arsenal: Structural and functional diversification of the venom system in the advanced snakes (Caenophidia).
Mol Cell Proteomics. 2007 Sep 17;
Venom is a key innovation underlying the evolution of advanced snakes (Caenophidia). Despite this, very little is known about venom system structural diversification, toxin recruitment event timings or toxin molecular evolution. A multidisciplinary approach was employed to examine the diversification of the venom system and associated toxins across the full range of the ~100 million year old advanced snake clade, with a particular emphasis upon families that have not secondarily evolved a front-fanged venom system (~80% of the 2500 species). Analysis of cDNA libraries revealed complex venom transcriptomes containing multiple toxin types including three finger toxins, cobra venom factor, CRISP, hyaluronidase, kallikrein, kunitz, lectin, matrix metalloprotease, phospholipase A2, snake venom metalloprotease/ADAM and waprin. High levels of sequence diversity were observed, including mutations in structural and functional residues, changes in cysteine spacing, and major deletions/truncations. Morphological analysis comprising gross dissection, histology and magnetic resonance imaging also demonstrated extensive modification of the venom system architecture in non-front-fanged snakes, in contrast to the conserved structure of the venom system within the independently evolved front-fanged elapid or viperid snakes. Further, a reduction in the size and complexity of the venom system was observed in species in which constriction has been secondarily evolved as the preferred method of prey capture or dietary preference has switched from live prey to eggs or to slugs/snails. Investigation of the timing of toxin recruitment events across the entire advanced snake radiation indicates that the evolution of advanced venom systems in three front-fanged lineages is associated with recruitment of new toxin types or explosive diversification of existing toxin types.
These results support the role of venom as a key evolutionary innovation in the diversification of advanced snakes and identify a potential role for non-front-fanged venom toxins as a rich source for lead compounds for drug design and development. [Abstract/Link to Full Text]

Qian M, Sleat DE, Zheng H, Moore D, Lobel P
Proteomic analysis of serum from mutant mice reveals lysosomal proteins selectively transported by each of the two mannose 6-phosphate receptors.
Mol Cell Proteomics. 2007 Sep 11;
Most mammalian cells contain two types of mannose 6-phosphate (Man6-P) receptors (MPRs), the 300 kDa cation-independent (CI) MPR and 46 kDa cation-dependent (CD) MPR. The two MPRs have overlapping function in intracellular targeting of newly synthesized lysosomal proteins but both are required for efficient targeting. Despite extensive investigation, the relative roles and specialized functions of each MPR in targeting of specific proteins remain questions of fundamental interest. One possibility is that most Man6-P glycoproteins are transported by both MPRs but there may be subsets that are preferentially transported by each. To investigate this, we have conducted a proteomic analysis of serum from mice lacking either MPR with the reasoning that lysosomal proteins that are selectively transported by a given MPR should be preferentially secreted into the bloodstream in its absence. We purified and identified Man6-P glycoproteins and glycopeptides from wild-type, CDMPR deficient and CIMPR deficient mouse serum and found both lysosomal proteins and proteins not currently thought to have lysosomal function. Different mass spectrometric approaches (spectral count analysis of nanospray LC-MS/MS experiments on unlabeled samples combined with LC-MALDI/TOF/TOF experiments on iTRAQ-labeled samples) revealed a number of proteins that appear specifically elevated in serum from each MPR-deficient mouse. Man6-P glycoforms of cellular repressor of E1A-stimulated genes 1, tripeptidyl peptidase I, and heparanase were elevated in the absence of the CDMPR and Man6-P glycoforms of alpha-mannosidase B1, cathepsin D and prosaposin were elevated in the absence of the CIMPR. Results were confirmed by western blot analyses for select proteins. This study provides a comparison of different quantitative mass spectrometric approaches and provides the first report of proteins whose cellular targeting appears to be MPR-selective under physiological conditions. [Abstract/Link to Full Text]

Kasthuri RS, Harvey SB, Stone MD, Homoncik M, Jilma B, Nelsestuen GL
WITHDRAWN: Extraordinary apolipoprotein oxidation in chronic hepatitis C and liver cirrhosis.
Mol Cell Proteomics. 2007 Sep 5;
Withdrawn by Author. [Abstract/Link to Full Text]

Dosemeci A, Makusky AJ, Jankowska-Stephens E, Yang X, Slotta DJ, Markey SP
Composition of the synaptic PSD-95 complex.
Mol Cell Proteomics. 2007 Oct;6(10):1749-60.
Postsynaptic density protein 95 (PSD-95), a specialized scaffold protein with multiple protein interaction domains, forms the backbone of an extensive postsynaptic protein complex that organizes receptors and signal transduction molecules at the synaptic contact zone. Large, detergent-insoluble PSD-95-based postsynaptic complexes can be affinity-purified from conventional PSD fractions using magnetic beads coated with a PSD-95 antibody. In the present study purified PSD-95 complexes were analyzed by LC/MS/MS. A semiquantitative measure of the relative abundances of proteins in the purified PSD-95 complexes and the parent PSD fraction was estimated based on the cumulative ion current intensities of corresponding peptides. The affinity-purified preparation was largely depleted of presynaptic proteins, spectrin, intermediate filaments, and other contaminants prominent in the parent PSD fraction. We identified 525 of the proteins previously reported in parent PSD fractions, but only 288 of these were detected after affinity purification. We discuss 26 proteins that are major components in the PSD-95 complex based upon abundance ranking and affinity co-purification with PSD-95. This subset represents a minimal list of constituent proteins of the PSD-95 complex and includes, in addition to the specialized scaffolds and N-methyl-d-aspartate (NMDA) receptors, an abundance of alpha-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid (AMPA) receptors, small G-protein regulators, cell adhesion molecules, and hypothetical proteins. The identification of two Arf regulators, BRAG1 and BRAG2b, as co-purifying components of the complex implies pivotal functions in spine plasticity such as the reorganization of the actin cytoskeleton and insertion and retrieval of proteins to and from the plasma membrane. Another co-purifying protein (Q8BZM2) with two sterile alpha motif domains may represent a novel structural core element of the PSD. [Abstract/Link to Full Text]

Senis YA, Tomlinson MG, García A, Dumon S, Heath VL, Herbert J, Cobbold SP, Spalton JC, Ayman S, Antrobus R, Zitzmann N, Bicknell R, Frampton J, Authi KS, Martin A, Wakelam MJ, Watson SP
A comprehensive proteomics and genomics analysis reveals novel transmembrane proteins in human platelets and mouse megakaryocytes including G6b-B, a novel immunoreceptor tyrosine-based inhibitory motif protein.
Mol Cell Proteomics. 2007 Mar;6(3):548-64.
The platelet surface is poorly characterized due to the low abundance of many membrane proteins and the lack of specialist tools for their investigation. In this study we identified novel human platelet and mouse megakaryocyte membrane proteins using specialist proteomics and genomics approaches. Three separate methods were used to enrich platelet surface proteins prior to identification by liquid chromatography and tandem mass spectrometry: lectin affinity chromatography, biotin/NeutrAvidin affinity chromatography, and free flow electrophoresis. Many known, abundant platelet surface transmembrane proteins and several novel proteins were identified using each receptor enrichment strategy. In total, two or more unique peptides were identified for 46, 68, and 22 surface membrane, intracellular membrane, and membrane proteins of unknown subcellular localization, respectively. The majority of these were single transmembrane proteins. To complement the proteomics studies, we analyzed the transcriptome of a highly purified preparation of mature primary mouse megakaryocytes using serial analysis of gene expression in view of the increasing importance of mutant mouse models in establishing protein function in platelets. This approach identified all of the major classes of platelet transmembrane receptors, including multitransmembrane proteins. Strikingly 17 of the 25 most megakaryocyte-specific genes (relative to 30 other serial analysis of gene expression libraries) were transmembrane proteins, illustrating the unique nature of the megakaryocyte/platelet surface. The list of novel plasma membrane proteins identified using proteomics includes the immunoglobulin superfamily member G6b, which undergoes extensive alternate splicing. Specific antibodies were used to demonstrate expression of the G6b-B isoform, which contains an immunoreceptor tyrosine-based inhibition motif. 
G6b-B undergoes tyrosine phosphorylation and association with the SH2 domain-containing phosphatase, SHP-1, in stimulated platelets suggesting that it may play a novel role in limiting platelet activation. [Abstract/Link to Full Text]

Nielsen ML, Savitski MM, Zubarev RA
Extent of modifications in human proteome samples and their effect on dynamic range of analysis in shotgun proteomics.
Mol Cell Proteomics. 2006 Dec;5(12):2384-91.
The complexity of the human proteome, already enormous at the organism level, increases further in the course of the proteome analysis due to in vitro sample evolution. Most in vitro alterations can also occur in vivo as post-translational modifications. These two types of modifications can only be distinguished a posteriori but not in the process of analysis, thus rendering necessary the analysis of every molecule in the sample. With the new software tool ModifiComb applied to MS/MS data, the extent of modifications was measured in tryptic mixtures representing the full proteome of human cells. The estimated level of 8-12 modified peptides per each unmodified tryptic peptide present at ≥1% level is approaching one modification per amino acid on average. This is a higher modification rate than was previously thought, posing an additional challenge to analytical techniques. The solution to the problem is seen in improving sample preparation routines, introducing dynamic range-adjusted thresholds for database searches, using more specific MS/MS analysis using high mass accuracy and complementary fragmentation techniques, and revealing peptide families with identification of additional proteins only by unfamiliar peptides. Extensive protein separation prior to analysis reduces the requirements on speed and dynamic range of a tandem mass spectrometer and can be a viable alternative to the shotgun approach. [Abstract/Link to Full Text]

Brizard JP, Carapito C, Delalande F, Van Dorsselaer A, Brugidou C
Proteome analysis of plant-virus interactome: comprehensive data for virus multiplication inside their hosts.
Mol Cell Proteomics. 2006 Dec;5(12):2279-97.
Known host-parasite molecular interactions are widespread among parasite families, but these interactions have to be particularly large considering that viruses generally encode few proteins. Although some particular virus-host interactions are well described, no global study has yet shown multiple and simultaneous interactions in a host-parasite biological system. To prove that these multiple interactions occur in biological conditions, the complexes formed by a plant virus (rice yellow mottle virus) and the proteins of its natural host (rice) were extracted and purified from infected tissue sample. Remarkably mass spectrometry permitted the identification of a large number of proteins from the complexes that are involved in different functions not encoded by the virus but probably essential for its biological life cycle. This recruiting of proteins was strongly confirmed by the repetition of experiments using different pairs of virus-host and the use of high salt concentration to extract the complexes. We mainly identified proteins involved in plant defense, metabolism, translation, and protein synthesis and some proteins involved in transport. This study demonstrates that viruses are able to recruit many proteins from their hosts to ensure their development. Among different pairs of virus-host, similar protein functions were identified suggesting a particular importance of these proteins for viruses. The identification of particular paralog proteins among multigenic families suggests the high specificity of the recruiting for some protein functions. [Abstract/Link to Full Text]

Vertegaal AC, Andersen JS, Ogg SC, Hay RT, Mann M, Lamond AI
Distinct and overlapping sets of SUMO-1 and SUMO-2 target proteins revealed by quantitative proteomics.
Mol Cell Proteomics. 2006 Dec;5(12):2298-310.
The small ubiquitin-like modifier (SUMO) family in vertebrates includes three different family members that are conjugated as post-translational modifications to target proteins. SUMO-2 and -3 are nearly identical but differ substantially from SUMO-1. We used quantitative proteomics to investigate the target protein preferences of SUMO-1 and SUMO-2. HeLa cells were established that stably express His6-SUMO-1 or His6-SUMO-2. These cell lines and control HeLa cells were labeled with stable arginine isotopes, and His6-SUMOs were enriched from lysates using immobilized metal affinity chromatography. 53 SUMO-conjugated proteins were identified, including 44 novel SUMO targets. 25 proteins were preferentially conjugated to SUMO-1, 19 were preferentially conjugated to SUMO-2, and nine proteins were conjugated to both SUMO-1 and SUMO-2. SART1 was confirmed by immunoblotting to have both SUMO-1- and SUMO-2-linked forms at similar levels. SUMO-1 and SUMO-2 are thus shown to have distinct and overlapping sets of target proteins, indicating that SUMO-1 and SUMO-2 may have both redundant and non-redundant cellular functions. Interestingly, 14 of the 25 SUMO-1-conjugated proteins contain zinc fingers. Although both SUMO family members play roles in many cellular processes, our data show that sumoylation is strongly associated with transcription because nearly one-third of the identified target proteins are putative transcriptional regulators. [Abstract/Link to Full Text]

McDonald T, Sheng S, Stanley B, Chen D, Ko Y, Cole RN, Pedersen P, Van Eyk JE
Expanding the subproteome of the inner mitochondria using protein separation technologies: one- and two-dimensional liquid chromatography and two-dimensional gel electrophoresis.
Mol Cell Proteomics. 2006 Dec;5(12):2392-411.
Currently no single proteomics technology has sufficient analytical power to allow for the detection of an entire proteome of an organelle, cell, or tissue. One approach that can be used to expand proteome coverage is the use of multiple separation technologies especially if there is minimal overlap in the proteins observed by the different methods. Using the inner mitochondrial membrane subproteome as a model proteome, we compared for the first time the ability of three protein separation methods (two-dimensional liquid chromatography using the ProteomeLab PF 2D Protein Fractionation System from Beckman Coulter, one-dimensional reversed phase high performance liquid chromatography, and two-dimensional gel electrophoresis) to determine the relative overlap in protein separation for these technologies. Data from these different methods indicated that a strikingly low number of proteins overlapped with less than 24% of proteins common between any two technologies and only 7% common among all three methods. Utilizing the three technologies allowed the creation of a composite database totaling 348 non-redundant proteins. 82% of these proteins had not been observed previously in proteomics studies of this subproteome, whereas 44% had not been identified in proteomics studies of intact mitochondria. Each protein separation method was found to successfully resolve a unique subset of proteins with the liquid chromatography methods being more suited for the analysis of transmembrane domain proteins and novel protein discovery. We also demonstrated that both the one- and two-dimensional LC allowed for the separation of the alpha-subunit of F1F0 ATP synthase that differed due to a change in pI or hydrophobicity. [Abstract/Link to Full Text]

Xicohtencatl-Cortés J, Lyons S, Chaparro AP, Hernández DR, Saldaña Z, Ledesma MA, Rendón MA, Gewirtz AT, Klose KE, Girón JA
Identification of proinflammatory flagellin proteins in supernatants of Vibrio cholerae O1 by proteomics analysis.
Mol Cell Proteomics. 2006 Dec;5(12):2374-83.
The genome of Vibrio cholerae contains five flagellin genes that encode proteins (FlaA-E) of 39-41 kDa with 61-82% identity among them. Although the existing live oral attenuated vaccine strains against cholera are protective in humans, there is an intrinsic residual cytotoxic and inflammatory component associated with these candidate vaccine strains. Bacterial flagellins are known to be potent inducers of proinflammatory molecules via activation of Toll-like receptor 5. Here we found that purified flagella from wild type V. cholerae 395 induced significant release of interleukin (IL)-8 from cultured HT-29 human colonic epithelial cells. Furthermore we found that filtered supernatants of KKV90, a DeltaflaA isogenic strain unable to produce flagella, were still able to activate production of IL-8 albeit to significantly lower levels than the wild type, suggesting that other activators of proinflammatory molecules were still present in these supernatants. A comparative proteomics analysis of secreted proteins of V. cholerae 395 and KKV90 identified additional proteins with potential to induce IL-8 release in HT-29 cells. Secreted proteins in the range of 30-45 kDa identified by two-dimensional electrophoresis and mass spectrometry revealed the presence of two additional flagellins, FlaC and FlaD, that appeared to be secreted 3- and 6-fold more, respectively, in the mutant compared with the wild type. Double isogenic mutants flaAC and flaAD were unable to trigger IL-8 release from HT-29 cells. In sum, we have shown that purified flagella and secreted flagellin proteins (FlaC and FlaD) are inducers of IL-8 release from epithelial cells via Toll-like receptor 5. This observation may explain, in part, the observed reactogenicity of cholera vaccine strains in humans. [Abstract/Link to Full Text]

Recent Articles in Proteome Science

Braitbard O, Glickstein H, Bishara-Shieban J, Pace U, Stein WD
Competition between bound and free peptides in an ELISA-based procedure that assays peptides derived from protein digests.
Proteome Sci. 2006;4:12.
BACKGROUND: We describe an ELISA-based method that can be used to identify and quantitate proteins in biological samples. In this method, peptides in solution, derived from proteolytic digests of the sample, compete with substrate-attached synthetic peptides for antibodies, also in solution, generated against the chosen peptides. The peptides used for the ELISA are chosen on the basis of their being (i) products of the proteolytic (e.g. tryptic) digestion of the protein to be identified and (ii) unique to the target protein, as far as one can know from the published sequences. RESULTS: In this paper we describe the competition assay and we define the optimal conditions for the most effective assay. We have performed an analysis of the kinetics of interaction between the four components of the assay: the plastic substratum to which the peptide is bound, the bound peptide itself, the competing added peptide, and the antibody that is specific for the peptide and we compare the results of theoretical simulations to the actual data in some model systems. CONCLUSION: The data suggest that the peptides bind to the plastic substratum in more than one conformation and that, once bound, the peptide displays different affinities for the antibody, depending on how it has bound to the plate. [Abstract/Link to Full Text]

Guércio RA, Shevchenko A, Shevchenko A, López-Lozano JL, Paba J, Sousa MV, Ricart CA
Ontogenetic variations in the venom proteome of the Amazonian snake Bothrops atrox.
Proteome Sci. 2006;4:11.
BACKGROUND: Bothrops atrox is responsible for the majority of snakebite accidents in the Brazilian Amazon region. Previous studies have demonstrated that the biological and pharmacological activities of B. atrox venom alter with the age of the animal. Here, we present a comparative proteome analysis of B. atrox venom collected from specimens of three different stages of maturation: juveniles, sub-adults and adults. RESULTS: Optimized conditions for two-dimensional gel electrophoresis (2-DE) of pooled venom samples were achieved using immobilized pH gradient (IPG) gels of non-linear 3-10 pH range during the isoelectric focusing step and 10-20% gradient polyacrylamide gels in the second dimension. Software-assisted analysis of the 2-DE gel images demonstrated differences in the number and intensity of spots in juvenile, sub-adult and adult venoms. Although peptide mass fingerprinting (PMF) failed to identify even a minor fraction of spots, it allowed us to group spots that displayed similar peptide maps. The spots were subjected to a combination of tandem mass spectrometry and Mascot and MS BLAST database searches that identified several classes of proteins, including metalloproteinases, serine proteinases, lectins, phospholipases A2, L-amino acid oxidases, nerve growth factors, vascular endothelial growth factors and cysteine-rich secretory proteins. CONCLUSION: The analysis of B. atrox samples from specimens of different ages by 2-DE and mass spectrometry suggested that the venom proteome alters upon ontogenetic development. We identified stage specific and differentially expressed polypeptides that may be responsible for the activities of the venom in each developmental stage. The results provide insight into the molecular basis of the relation between symptomatology of snakebite accidents in humans and the venom composition.
Our findings underscore the importance of using venoms from individual specimens at various stages of maturation for the production of antivenoms. [Abstract/Link to Full Text]

Hughes SR, Riedmuller SB, Mertens JA, Li XL, Bischoff KM, Qureshi N, Cotta MA, Farrelly PJ
High-throughput screening of cellulase F mutants from multiplexed plasmid sets using an automated plate assay on a functional proteomic robotic workcell.
Proteome Sci. 2006;4:10.
BACKGROUND: The field of plasmid-based functional proteomics requires the rapid assay of proteins expressed from plasmid libraries. Automation is essential since large sets of mutant open reading frames are being cloned for evaluation. To date no integrated automated platform is available to carry out the entire process including production of plasmid libraries, expression of cloned genes, and functional testing of expressed proteins. RESULTS: We used a functional proteomic assay in a multiplexed setting on an integrated plasmid-based robotic workcell for high-throughput screening of mutants of cellulase F, an endoglucanase from the anaerobic fungus Orpinomyces PC-2. This allowed us to identify plasmids containing optimized clones expressing mutants with improved activity at lower pH. A plasmid library of mutagenized clones of the celF gene with targeted variations in the last four codons was constructed by site-directed PCR mutagenesis and transformed into Escherichia coli. A robotic picker integrated into the workcell was used to inoculate medium in a 96-well deep well plate, combining the transformants into a multiplexed set in each well, and the plate was incubated on the workcell. Plasmids were prepared from the multiplexed culture on the liquid handler component of the workcell and used for in vitro transcription/translation. The multiplexed expressed recombinant proteins were screened for improved activity and stability in an azo-carboxymethylcellulose plate assay. The multiplexed wells containing mutants with improved activity were identified and linked back to the corresponding multiplexed cultures stored in glycerol. Spread plates were prepared from the glycerol stocks and the workcell was used to pick single colonies from the spread plates, prepare plasmid, produce recombinant protein, and assay for activity. 
The screening assay and subsequent deconvolution of the multiplexed wells resulted in identification of improved CelF mutants and corresponding optimized clones in expression-ready plasmids. CONCLUSION: The multiplex method using an integrated automated platform for high-throughput screening in a functional proteomic assay allows rapid identification of plasmids containing optimized clones ready for use in subsequent applications including transformations to produce improved strains or cell lines. [Abstract/Link to Full Text]

Walker MJ, Rylett CM, Keen JN, Audsley N, Sajid M, Shirras AD, Isaac RE
Proteomic identification of Drosophila melanogaster male accessory gland proteins, including a pro-cathepsin and a soluble gamma-glutamyl transpeptidase.
Proteome Sci. 2006;4:9.
BACKGROUND: In Drosophila melanogaster, the male seminal fluid contains proteins that are important for reproductive success. Many of these proteins are synthesised by the male accessory glands and are secreted into the accessory gland lumen, where they are stored until required. Previous studies on the identification of Drosophila accessory gland products have largely focused on characterisation of male-specific accessory gland cDNAs from D. melanogaster and, more recently, Drosophila simulans. In the present study, we have used a proteomics approach without any sex bias to identify proteins in D. melanogaster accessory gland secretions. RESULTS: Thirteen secreted accessory gland proteins, including seven new accessory gland proteins, were identified by 2D-gel electrophoresis combined with mass spectrometry of tryptic fragments. They included protein-folding and stress-response proteins, a hormone, a lipase, a serpin, a cysteine-rich protein and two peptidases, a pro-enzyme form of a cathepsin K-like cysteine peptidase and a gamma-glutamyl transpeptidase. Enzymatic studies established that accessory gland secretions contain a cysteine peptidase zymogen that can be activated at low pH. This peptidase may have a role in the processing of female and other male-derived proteins, but is unlikely to be involved in the processing of the sex peptide. gamma-Glutamyl transpeptidases are type II integral membrane proteins; however, the identified AG gamma-glutamyl transpeptidase (GGT-1) is unusual in that it is predicted to be a soluble secreted protein, a prediction that is supported by biochemical evidence. GGT-1 is possibly involved in maintaining a protective redox environment for sperm. The strong gamma-glutamyl transpeptidase activity found in the secretions provides an explanation for the observation that glutamic acid is the most abundant free amino acid in accessory gland secretions of D. melanogaster. 
CONCLUSION: We have applied biochemical approaches, not used previously, to characterise prominent D. melanogaster accessory gland products. Of the thirteen accessory gland secreted proteins reported in this study, six were represented in a D. simulans male accessory gland EST library that was biased for male-specific genes. Therefore, the present study has identified seven new secreted accessory gland proteins, including GGT-1, which was not recognised previously as a secreted accessory gland product. [Abstract/Link to Full Text]

Latterich M
Publishing proteomic data.
Proteome Sci. 2006;4:8.
Scientific publications should provide sufficient detail in terms of methodology and presented data to enable the community to reproduce the methodology, generate similar data and arrive at the same conclusion, if an identical sample is provided for analysis. The advent of high-throughput methods in biological experimentation imposes some unique challenges, both in presenting data in classical print format and in describing methodology and data analysis in sufficient detail to conform to good publication practice. To facilitate this process, Proteome Science is adopting a set of methodology and data presentation guidelines to enable both peer reviewers and the scientific community to better evaluate high-throughput proteomic studies. [Abstract/Link to Full Text]

Guerreiro N, Gomez-Mancilla B, Charmont S
Optimization and evaluation of surface-enhanced laser-desorption/ionization time-of-flight mass spectrometry for protein profiling of cerebrospinal fluid.
Proteome Sci. 2006;4:7.
Cerebrospinal fluid (CSF) potentially carries an archive of peptides and small proteins relevant to pathological processes in the central nervous system (CNS) and surrounding brain tissue. Proteomics is especially well suited for the discovery of biomarkers of diagnostic potential in CSF for early diagnosis and discrimination of several neurodegenerative diseases. ProteinChip surface-enhanced laser-desorption/ionization time-of-flight mass spectrometry (SELDI-TOF-MS) is one such approach which offers a unique platform for high throughput profiling of peptides and small proteins in CSF. In this study, we evaluated methodologies for the retention of CSF proteins < 20 kDa in size, and identified a strategy for screening small proteins and peptides in CSF. ProteinChip array types, along with sample and binding buffer conditions, and matrices were investigated. By coupling the processing of arrays to a liquid handler, reproducible and reliable profiles, with mean peak coefficients of variation < 20%, were achieved for intra- and inter-assays under selected conditions. Based on peak m/z we found a high degree of overlap between the tested array surfaces. The combination of CM10 and IMAC30 arrays was sufficient to represent 80-90% of all assigned peaks when using either sinapinic acid (SPA) or alpha-cyano-4-hydroxycinnamic acid as the energy absorbing matrix. Moreover, arrays processed with SPA consistently showed better peak resolution and higher peak number across all surfaces within the measured mass range. We intend to use CM10 and IMAC30 arrays prepared in sinapinic acid as a fast and cost-effective approach to drive decisions on sample selection prior to more in-depth discovery of diagnostic biomarkers in CSF using alternative but complementary proteomic strategies. [Abstract/Link to Full Text]

Shukla HD
Proteomic analysis of acidic chaperones, and stress proteins in extreme halophile Halobacterium NRC-1: a comparative proteomic approach to study heat shock response.
Proteome Sci. 2006;4:6.
BACKGROUND: Halobacterium sp. NRC-1 is an extremely halophilic archaeon and has adapted to optimal growth under conditions of extremely high salinity. Its proteome is highly acidic with a median pI of 4.9, a unique characteristic which helps the organism to adapt to a highly saline environment. In the natural growth environment, Halobacterium NRC-1 encounters a number of stressful conditions including high temperature, intense solar radiation, and oxidative and cold stress. Heat shock proteins and chaperones play indispensable roles in an organism's survival under many stress conditions. The aim of this study was to develop an improved method of 2-D gel electrophoresis with enhanced resolution of the acidic proteome, and to identify proteins with diverse cellular functions using in-gel digestion combined with LC-MS/MS and MALDI-TOF approaches. RESULTS: A modified 2-D gel electrophoretic procedure, employing IPG strips in the range of pH 3-6, enabled improved separation of acidic proteins relative to previous techniques. Combining experimental data from 2-D gel electrophoresis with available genomic information allowed the identification of at least 30 cellular proteins involved in many cellular functions: stress response and protein folding (CctB, PpiA, DpsA, and MsrA), DNA replication and repair (DNA polymerase A alpha subunit, Orc4/CDC6, and UvrC), transcriptional regulation (Trh5 and ElfA), translation (ribosomal proteins Rps27ae and Rphs6 of the 30 S ribosomal subunit; Rpl31e and Rpl18e of the 50 S ribosomal subunit), transport (YufN), chemotaxis (CheC2), and housekeeping (ThiC, ThiD, FumC, ImD2, GapB, TpiA, and PurE). In addition, four gene products with undetermined function were also identified: Vng1807H, Vng0683C, Vng1300H, and Vng6254. To study the heat shock response of Halobacterium NRC-1, growth conditions for heat shock were determined and the proteomic profiles under normal (42 °C) and heat shock (49 °C) conditions were compared.
Using a differential proteomic approach in combination with available genomic information, bioinformatic analysis revealed five putative heat shock proteins that were upregulated in cells subjected to heat stress at 49 °C, namely DnaJ, GrpE, sHsp-1, Hsp-5 and sHsp-2. CONCLUSION: The modified 2-D gel electrophoresis markedly enhanced the resolution of the extremely acidic proteome of Halobacterium NRC-1. Constitutive expression of stress proteins and chaperones helps the organism to adapt and survive under extreme salinity and other stress conditions. The upregulated expression pattern of putative chaperones DnaJ, GrpE, sHsp-1, Hsp-5 and sHsp-2 under elevated temperature clearly suggests that Halobacterium NRC-1 has a sophisticated defense mechanism to survive in extreme environments. [Abstract/Link to Full Text]

Abramovitz M, Leyland-Jones B
A systems approach to clinical oncology: focus on breast cancer.
Proteome Sci. 2006;4:5.
During the past decade, genomic microarrays have been applied with some success to the molecular profiling of breast tumours, which has resulted in a much more detailed classification scheme as well as in the identification of potential gene signature sets. These gene sets have been applied to both the prognosis and prediction of outcome to treatment and have performed better than the current clinical criteria. One of the main limitations of microarray analysis, however, is that frozen tumour samples are required for the assay. This imposes severe limitations on access to samples and precludes large scale validation studies from being conducted. Quantitative reverse transcriptase polymerase chain reaction (qRT-PCR), on the other hand, can be used with degraded RNAs derived from formalin-fixed paraffin-embedded (FFPE) tumour samples, the most important and abundant source of clinical material available. More recently, the novel DASL (cDNA-mediated Annealing, Selection, extension and Ligation) assay has been developed as a high throughput gene expression profiling system specifically designed for use with FFPE tumour tissue samples. However, we do not believe that genomics is adequate as a sole prognostic and predictive platform in breast cancer. The key proteins driving oncogenesis, for example, can undergo post-translational modifications; moreover, if we are ever to move individualization of therapy into the practical world of blood-based assays, serum proteomics becomes critical.
Proteomic platforms, including tissue micro-arrays (TMA) and protein chip arrays, in conjunction with surface-enhanced laser desorption ionization time-of-flight mass spectrometry (SELDI-TOF/MS), have been the technologies most widely applied to the characterization of tumours and serum from breast cancer patients, with still limited but encouraging results. This review will focus on these genomic and proteomic platforms, with an emphasis placed on the utilization of FFPE tumour tissue samples and serum, as they have been applied to the study of breast cancer for the discovery of gene signatures and biomarkers for the early diagnosis, prognosis and prediction of treatment outcome. The ultimate goal is to be able to apply a systems biology approach to the information gleaned from the combination of these techniques in order to select the best treatment strategy, monitor its effectiveness and make changes as rapidly as possible where needed to achieve the optimal therapeutic results for the patient. [Abstract/Link to Full Text]

D'Andrea G, Lizzi AR, Venditti S, Di Francesco L, Giorgi A, Mignogna G, Oratore A, Bozzi A
Proteins pattern alteration in AZT-treated K562 cells detected by two-dimensional gel electrophoresis and peptide mass fingerprinting.
Proteome Sci. 2006;4:4.
In this study we report the effect of AZT on the whole protein expression profile of K562 cells, comparing control and AZT-treated cells by two-dimensional gel electrophoresis and peptide mass fingerprinting analysis. Computer-assisted digital image analysis of the two-dimensional gels showed two spots that appeared up-regulated in AZT-treated cells and one spot present only in the drug-exposed samples. Upon extraction and analysis by peptide mass fingerprinting, the first two spots were identified as PDI-A3 and stathmin, while the third one proved to be NDPK-A. Conversely, two protein spots were present only in the untreated K562 cells, and were identified as SOD1 and HSP-60, respectively. [Abstract/Link to Full Text]

Man TK, Li Y, Dang TA, Shen J, Perlaky L, Lau CC
Optimising the use of TRIzol-extracted proteins in surface enhanced laser desorption/ionization (SELDI) analysis.
Proteome Sci. 2006;4:3.
BACKGROUND: Research with clinical specimens is always hampered by the limited availability of relevant samples, necessitating the use of a single sample for multiple assays. TRIzol is a common reagent for RNA extraction, but DNA and protein fractions can also be used for other studies. However, little is known about using TRIzol-extracted proteins in proteomic research, partly because proteins extracted from TRIzol are very resistant to solubilization. RESULTS: To facilitate the use of TRIzol-extracted proteins, we first compared the ability of four different common solubilizing reagents to solubilize the TRIzol-extracted proteins from an osteosarcoma cell line, U2-OS. Then we analyzed the solubilized proteins by Surface Enhanced Laser Desorption/Ionization technique (SELDI). The results showed that solubilization of TRIzol-extracted proteins with 9.5 M Urea and 2% CHAPS ([3-[(3-cholamidopropyl)-dimethylammonio]propanesulfonate]) (UREA-CHAPS) was significantly better than the standard 1% SDS in terms of solubilization efficiency and the number of detectable ion peaks. Using three different types of SELDI arrays (CM10, H50, and IMAC-Cu), we demonstrated that peak detection with proteins solubilized by UREA-CHAPS was reproducible (r > 0.9). Further SELDI analysis indicated that the number of ion peaks detected in TRIzol-extracted proteins was comparable to a direct extraction method, suggesting many proteins still remain in the TRIzol protein fraction. CONCLUSION: Our results suggest that UREA-CHAPS performed very well in solubilizing TRIzol-extracted proteins for SELDI applications. Protein fractions left over after TRIzol RNA extraction could be a valuable but neglected source for proteomic or biochemical analysis when additional samples are not available. [Abstract/Link to Full Text]

Fischer F, Poetsch A
Protein cleavage strategies for an improved analysis of the membrane proteome.
Proteome Sci. 2006;4:2.
BACKGROUND: Membrane proteins still remain elusive in proteomic studies. This is in part due to the distribution of the amino acids lysine and arginine, which are less frequent in integral membrane proteins and almost absent in transmembrane helices. As these amino acids are cleavage targets for the commonly used protease trypsin, alternative cleavage conditions, which should improve membrane protein analysis, were tested by in silico digestion for the three organisms Saccharomyces cerevisiae, Halobacterium sp. NRC-1, and Corynebacterium glutamicum as representatives of eukaryotes, archaea and eubacteria. RESULTS: For the membrane proteomes from all three analyzed organisms, we identified cleavage conditions that achieve better sequence and proteome coverage than trypsin. Greater improvement was obtained for bacteria than for yeast, which was attributed to differences in protein size and GRAVY. It was demonstrated for bacteriorhodopsin that the in silico predictions agree well with the experimental observations. CONCLUSION: For all three examined organisms, it was found that a combination of chymotrypsin and staphylococcal peptidase I gave significantly better results than trypsin. As some of the improved cleavage conditions are not more elaborate than trypsin digestion and have been proven useful in practice, we suppose that cleavage at both hydrophilic and hydrophobic amino acids should in general facilitate the analysis of membrane proteins for all organisms. [Abstract/Link to Full Text]
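
As a rough illustration of the in silico digestion described in the abstract above, the Python sketch below cleaves a sequence under simplified specificity rules and reports the fraction of residues that end up in peptides of an analyzable length. The cleavage rules, the 6-30 residue window, the function names and the toy sequence are all assumptions chosen for illustration; they are not the parameters or software used by the authors.

```python
# Simplified cleavage rules: each enzyme is assumed to cut on the C-terminal
# side of the listed residues. Real specificities have exceptions and depend
# on buffer conditions (for example Asp cleavage by staphylococcal
# peptidase I); those details are ignored in this sketch.
CLEAVAGE_RULES = {
    "trypsin": "KR",
    "chymotrypsin": "FWY",
    "staphylococcal_peptidase_I": "DE",
}


def digest(sequence, residues):
    """Return half-open (start, end) spans of peptides after complete cleavage."""
    spans, start = [], 0
    for i, aa in enumerate(sequence):
        if aa in residues:
            spans.append((start, i + 1))
            start = i + 1
    if start < len(sequence):
        spans.append((start, len(sequence)))
    return spans


def coverage(sequence, residues, min_len=6, max_len=30):
    """Fraction of residues in peptides whose length lies in [min_len, max_len]."""
    covered = sum(
        end - start
        for start, end in digest(sequence, residues)
        if min_len <= end - start <= max_len
    )
    return covered / len(sequence)


if __name__ == "__main__":
    # Hypothetical sequence: a short soluble stretch followed by a long
    # hydrophobic run that contains no Lys or Arg.
    toy = "MKTEAR" + "LLAVFLGAILYLLAVGLIWAAFLGVLLAVFLGE"
    combined = (CLEAVAGE_RULES["chymotrypsin"]
                + CLEAVAGE_RULES["staphylococcal_peptidase_I"])
    print("trypsin:", coverage(toy, CLEAVAGE_RULES["trypsin"]))
    print("chymotrypsin + peptidase I:", coverage(toy, combined))
```

Passing the concatenated residue sets of two enzymes models a combined digest; missed cleavages and mass-range limits are deliberately left out to keep the sketch short.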

Anderson KK, Monroe ME, Daly DS
Estimating probabilities of peptide database identifications to LC-FTICR-MS observations.
Proteome Sci. 2006;4:1.
BACKGROUND: The field of proteomics involves the characterization of the peptides and proteins expressed in a cell under specific conditions. Proteomics has made rapid advances in recent years following the sequencing of the genomes of an increasing number of organisms. A prominent technology for high throughput proteomics analysis is the use of liquid chromatography coupled to Fourier transform ion cyclotron resonance mass spectrometry (LC-FTICR-MS). Meaningful biological conclusions can best be made when the peptide identities returned by this technique are accompanied by measures of accuracy and confidence. METHODS: After a tryptically digested protein mixture is analyzed by LC-FTICR-MS, the observed masses and normalized elution times of the detected features are statistically matched to the theoretical masses and elution times of known peptides listed in a large database. The probability of matching is estimated for each peptide in the reference database using statistical classification methods assuming bivariate Gaussian probability distributions on the uncertainties in the masses and the normalized elution times. RESULTS: A database of 69,220 features from 32 LC-FTICR-MS analyses of a tryptically digested bovine serum albumin (BSA) sample was matched to a database populated with 97% false positive peptides. The percentage of high confidence identifications was found to be consistent with other database search procedures. BSA database peptides were identified with high confidence on average in 14.1 of the 32 analyses. False positives were identified on average in just 2.7 analyses. CONCLUSION: Using a priori probabilities that contrast peptides from expected and unexpected proteins was shown to perform better in identifying target peptides than using equally likely a priori probabilities. This is because a large percentage of the target peptides were similar to unexpected peptides which were included to be false positives. 
The use of triplicate analyses with a "2 out of 3" reporting rule was shown to have excellent rejection of false positives. [Abstract/Link to Full Text]
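
To see why a "2 out of 3" reporting rule rejects rarely observed matches much more strongly than frequently observed ones, the Python sketch below computes a binomial tail probability. It assumes that replicate analyses are independent and, as a crude stand-in, treats the average identification frequencies quoted in the abstract (14.1 of 32 analyses for BSA peptides, 2.7 of 32 for false positives) as per-analysis probabilities. Both assumptions are ours, not the authors', and the function name is hypothetical.

```python
from math import comb


def prob_at_least(k, n, p):
    """Probability of at least k hits in n independent analyses with hit probability p."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


# Per-analysis frequencies taken from the abstract's averages (assumed to be
# usable as independent per-run probabilities, which is a simplification).
p_target = 14.1 / 32
p_false = 2.7 / 32

if __name__ == "__main__":
    print("target peptide passes 2-of-3:", prob_at_least(2, 3, p_target))
    print("false positive passes 2-of-3:", prob_at_least(2, 3, p_false))
```

Under these assumptions the pass probability falls roughly with the square of a small per-run probability, which is the qualitative reason the rule suppresses sporadic false matches.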

Ramaswamy A, Lin E, Chen I, Mitra R, Morrisett J, Coombes K, Ju Z, Kapoor M
Application of protein lysate microarrays to molecular marker verification and quantification.
Proteome Sci. 2005;3:9.
This study presents the development and application of protein lysate microarray (LMA) technology for verifying the presence of protein biomarkers in human tissue samples and quantifying them. Sub-picogram range sensitivity has been achieved on LMA using a non-enzymatic protein detection methodology. Results from a set of quality control experiments are presented and demonstrate the high sensitivity and reproducibility of the LMA methodology. The optimized LMA methodology has been applied for verification of the presence and quantification of disease markers for atherosclerosis. LMA were used to measure lipoprotein [a] and apolipoprotein B100 in 52 carotid endarterectomy samples. The data generated by LMA were validated by ELISA using the same protein lysates. The correlations of protein amounts estimated by LMA and ELISA were highly significant, with r² ≥ 0.98 (p ≤ 0.001) for lipoprotein [a] and with r² ≥ 0.94 (p ≤ 0.001) for apolipoprotein B100. This is the first report to compare data generated using protein microarrays with ELISA, a standard technology for the verification of the presence of protein biomarkers. The sensitivity, reproducibility, and high-throughput quality of LMA technology make it a potentially powerful technology for profiling disease specific protein markers in clinical samples. [Abstract/Link to Full Text]

Latterich M
Molecular systems biology at the crossroads: to know less about more, or to know more about less?
Proteome Sci. 2005 Oct 28;3:8.
Systems biology is a rapidly evolving discipline that endeavours to understand the detailed coordinated workings of entire organisms, with the ultimate goal to detect differences between health and disease, or to understand how cells or entire organisms react to the environment. The editorial provides a critical evaluation of what molecular systems analysis can and cannot accomplish with existing methodologies, and how systems biology needs to merge with reductionism to yield a more comprehensive and mechanistically insightful model of a cell or organism. [Abstract/Link to Full Text]

Stratmann T, Kang AS
Cognate peptide-receptor ligand mapping by directed phage display.
Proteome Sci. 2005 Jun 17;3:7.
BACKGROUND: A rapid phage display method for the elucidation of cognate peptide-specific ligands for receptors is described. The approach may be readily integrated into the interface of genomic and proteomic studies to identify biologically relevant ligands. METHODS: A gene fragment library from the influenza coat protein haemagglutinin (HA) gene was constructed by treating HA cDNA with DNase I to create 50-100 bp fragments. These fragments were cloned into plasmid pORFES IV and in-frame inserts were selected. These in-frame fragment inserts were subsequently cloned into a filamentous phage display vector JC-M13-88 for surface display as fusions to a synthetic copy of gene VIII. Two well characterized antibodies, mAb 12CA5 and pAb 07431, directed against distinct known regions of HA were used to pan the library. RESULTS: Two linear epitopes, HA peptide 112-126 and 162-173, recognized by mAb 12CA5 and pAb 07431, respectively, were identified as the cognate epitopes. CONCLUSION: This approach is a useful alternative to conventional methods such as screening of overlapping synthetic peptide libraries or gene fragment expression libraries when searching for precise peptide-protein interactions, and may be applied to functional proteomics. [Abstract/Link to Full Text]

Ahmad QR, Nguyen DH, Wingerd MA, Church GM, Steffen MA
Molecular weight assessment of proteins in total proteome profiles using 1D-PAGE and LC/MS/MS.
Proteome Sci. 2005 Jun 8;3(1):6.
BACKGROUND: The observed molecular weight of a protein on a 1D polyacrylamide gel can provide meaningful insight into its biological function. Differences between a protein's observed molecular weight and that predicted by its full length amino acid sequence can be the result of different types of post-translational events, such as alternative splicing (AS), endoproteolytic processing (EPP), and post-translational modifications (PTMs). The characterization of these events is one of the important goals of total proteome profiling (TPP). LC/MS/MS has emerged as one of the primary tools for TPP, but since this method identifies tryptic fragments of proteins, it has not generally been used for large-scale determination of the molecular weight of intact proteins in complex mixtures. RESULTS: We have developed a set of computational tools for extracting molecular weight information of intact proteins from total proteome profiles in a high throughput manner using 1D-PAGE and LC/MS/MS. We have applied this technology to the proteome profile of a human lymphoblastoid cell line under standard culture conditions. From a total of 1 × 10^7 cells, we identified 821 proteins by at least two tryptic peptides. Additionally, these 821 proteins are well-localized on the 1D-SDS gel. 656 proteins (80%) occur in gel slices in which the observed molecular weight of the protein is consistent with its predicted full-length sequence. A total of 165 proteins (20%) are observed to have molecular weights that differ from their predicted full-length sequence. We explore these molecular-weight differences based on existing protein annotation. CONCLUSION: We demonstrate that the determination of intact protein molecular weight can be achieved in a high-throughput manner using 1D-PAGE and LC/MS/MS. The ability to determine the molecular weight of intact proteins represents a further step in our ability to characterize gene expression at the protein level.
The identification of 165 proteins whose observed molecular weight differs from the molecular weight of the predicted full-length sequence provides another entry point into the high-throughput characterization of protein modification. [Abstract/Link to Full Text]
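
The comparison at the heart of this abstract, predicted full-length molecular weight versus the weight range of the gel slice in which a protein was found, can be sketched in a few lines of Python. The residue masses are standard average values; the 20% relative tolerance, the function names and the example slice boundaries are assumptions for illustration only, since the abstract does not state how consistency was defined.

```python
# Standard average residue masses in daltons; one water is added per chain.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528


def predicted_mass(sequence):
    """Average mass in daltons of an unmodified polypeptide chain."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def is_consistent(predicted_da, slice_low_da, slice_high_da, tolerance=0.2):
    """True if the predicted mass lies within the gel slice range, widened by
    a relative tolerance (the tolerance value is an assumption)."""
    return slice_low_da * (1 - tolerance) <= predicted_da <= slice_high_da * (1 + tolerance)


if __name__ == "__main__":
    # Hypothetical case: a protein predicted at 50 kDa found in a 20-25 kDa slice
    # would be flagged as a candidate for processing or alternative splicing.
    print(is_consistent(50_000, 20_000, 25_000))
```

Proteins flagged as inconsistent by such a check are only candidates for events like endoproteolytic processing; anomalous gel migration can produce the same signal.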

Churchward MA, Butt RH, Lang JC, Hsu KK, Coorssen JR
Enhanced detergent extraction for analysis of membrane proteomes by two-dimensional gel electrophoresis.
Proteome Sci. 2005 Jun 7;3(1):5.
BACKGROUND: The analysis of hydrophobic membrane proteins by two-dimensional gel electrophoresis has long been hampered by the concept of inherent difficulty due to solubility issues. We have optimized extraction protocols by varying the detergent composition of the solubilization buffer with a variety of commercially available non-ionic and zwitterionic detergents and detergent-like phospholipids. RESULTS: After initial analyses by one-dimensional SDS-PAGE, quantitative two-dimensional analyses of human erythrocyte membranes, mouse liver membranes, and mouse brain membranes, extracted with buffers that included the zwitterionic detergent MEGA 10 (decanoyl-N-methylglucamide) and the zwitterionic lipid LPC (1-lauroyl lysophosphatidylcholine), showed selective improvement over extraction with the common 2-DE detergent CHAPS (3-[(3-cholamidopropyl)dimethylammonio]-1-propanesulfonate). Mixtures of the three detergents showed additive improvements in spot number, density, and resolution. Substantial improvements in the analysis of a brain membrane proteome were observed. CONCLUSION: This study demonstrates that an optimized detergent mix, coupled with rigorous sample handling and electrophoretic protocols, enables simple and effective analysis of membrane proteomes using two-dimensional electrophoresis. [Abstract/Link to Full Text]

Wang W, Hollmann R, Fürch T, Nimtz M, Malten M, Jahn D, Deckwer WD
Proteome analysis of a recombinant Bacillus megaterium strain during heterologous production of a glucosyltransferase.
Proteome Sci. 2005 May 31;3:4.
A recombinant B. megaterium strain was used for the heterologous production of a glucosyltransferase (dextransucrase). To better understand the physiological and metabolic responses of the host cell to cultivation and induction conditions, proteomic analysis was carried out by combined use of two-dimensional gel electrophoresis and mass spectrometry (2-DE/MS) for protein separation and identification. The 2-DE method was optimized for the separation of intracellular proteins. Since the genome of B. megaterium is not yet available, peptide sequencing using peptide fragment information obtained from nanoelectrospray ionization quadrupole-time-of-flight tandem mass spectrometry (ESI-QqTOF MS/MS) was applied for protein identification. 167 protein spots were identified as 149 individual proteins, including most enzymes involved in the central carbon metabolic pathways and many enzymes related to amino acid synthesis and protein synthesis. Based on the results, a 2-DE reference map and a corresponding protein database were constructed for further proteomic approaches on B. megaterium. For the first time it became possible to perform comparative proteomic analysis on B. megaterium in a batch culture grown on glucose with xylose induction for dextransucrase production. No significant differences were observed in the expression changes of enzymes of the glycolysis and TCA cycle, indicating that dextransucrase production, which amounted to only 2% of the entire protein production, did not impose notable metabolic or energetic burdens on the central carbon metabolic pathway of the cells. However, a short-term up-regulation of aspartate aminotransferase, an enzyme closely related to dextransucrase production, in the induced culture demonstrated the feasibility of using the 2-DE method for monitoring dextransucrase production. It was also observed that under the cultivation conditions used in this study B. megaterium tended to channel acetyl-CoA into pathways of polyhydroxybutyrate production. No expression increases were found with cytosolic chaperones such as GroEL and DnaK during dextransucrase production and secretion, whereas a strong up-regulation of the oligopeptide-binding protein OppA was observed in correlation with an increased secretion of dextransucrase into the culture medium. [Abstract/Link to Full Text]

Shen W, Yun S, Tam B, Dalal K, Pio FF
Target selection of soluble protein complexes for structural proteomics studies.
Proteome Sci. 2005 May 18;3(1):3.
BACKGROUND: Protein expression in E. coli is the most commonly used system to produce protein for structural studies, because it is fast and inexpensive and can produce large quantities of protein. However, when proteins from other species, such as mammals, are produced in this system, problems of protein expression and solubility arise. Structural genomics projects are currently investigating proteomics pipelines that would produce sufficient quantities of recombinant proteins for structural studies of protein complexes. To investigate how the E. coli protein expression system could be used for this purpose, we purified apoptotic binary protein complexes formed between members of the Caspase Associated Recruitment Domain (CARD) family. RESULTS: A combinatorial approach to the generation of protein complexes was applied to members of the CARD domain protein family that can form hetero-dimers with each other. In our method, each gene coding for a specific protein partner is cloned in pET-28b (Novagen) and PGEX2T (Amersham) expression vectors. All combinations of protein complexes are then obtained by reconstituting complexes from purified components in native conditions, after denaturation-renaturation or co-expression. Our study, applied to 14 soluble CARD domain proteins, revealed that co-expression performs better than the native and denaturation-renaturation methods. In this study, we confirm existing interactions obtained in vivo in mammalian cells and also predict new interactions. CONCLUSION: Owing to its simplicity, this screening method could easily be scaled up to identify soluble protein complexes for structural genomics projects. This study reports informative statistics on the solubility of protein complexes from the human CARD protein family expressed in E. coli. [Abstract/Link to Full Text]
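
The combinatorial screen described in this abstract can be pictured as an enumeration of binary pairings, each tried under the three reconstitution routes. The Python sketch below is only an illustration: the protein labels are placeholders (the abstract does not name the 14 proteins), and counting only pairs of distinct proteins is an assumption, not a statement of the authors' actual design.

```python
from itertools import combinations

# Placeholder labels: the abstract does not name the 14 soluble CARD proteins.
proteins = [f"CARD_{i:02d}" for i in range(1, 15)]

# The three reconstitution routes named in the abstract.
methods = ["native", "denaturation-renaturation", "co-expression"]


def binary_pairs(names):
    """Return every unordered pair of distinct proteins (hetero-pairs only)."""
    return list(combinations(names, 2))


pairs = binary_pairs(proteins)

# One experiment per (pair, method) combination.
experiments = [(a, b, m) for (a, b) in pairs for m in methods]

print(len(pairs))        # 91 unordered pairs from 14 proteins
print(len(experiments))  # 273 pair/method combinations
```

Under these assumptions the screen grows quadratically with the number of proteins, which is why a simple, scalable reconstitution route matters.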

Casado B, Pannell LK, Whalen G, Clauw DJ, Baraniuk JN
Human neuroglobin protein in cerebrospinal fluid.
Proteome Sci. 2005 Feb 25;3(1):2.
BACKGROUND: Neuroglobin is a hexacoordinated member of the globin family of proteins. It is predominantly localized to various brain regions and retina where it may play a role in protection against ischemia and nitric oxide-induced neural injury. Cerebrospinal fluid was collected from 12 subjects with chronic regional or systemic pain and 5 control subjects. Proteins were precipitated by addition of 50% 0.2 N acetic acid, 50% ethanol, 0.02% sodium bisulfite. The pellet was extensively digested with trypsin. Peptides were separated by capillary liquid chromatography using a gradient from 95% water to 95% acetonitrile in 0.2% formic acid, and eluted through a nanoelectrospray ionization interface into a quadrupole-time-of-flight dual mass spectrometer (QToF2, Waters, Milford, MA). Peptides were sequenced (PepSeq, MassLynx v3.5) and proteins identified using MASCOT (R). RESULTS: Six different neuroglobin peptides were identified in various combinations in 3 of 9 female pain subjects, but none in male pain subjects or in female or male control subjects. CONCLUSION: This is the first description of neuroglobin in cerebrospinal fluid. The mechanism(s) leading to its release in chronic pain states remain to be defined. [Abstract/Link to Full Text]

Hepner F, Cszasar E, Roitinger E, Lubec G
Mass spectrometrical analysis of recombinant human growth hormone (Genotropin(R)) reveals amino acid substitutions in 2% of the expressed protein.
Proteome Sci. 2005 Feb 11;3(1):1.
BACKGROUND: The structural integrity of recombinant proteins is of critical importance to their application as clinical treatments. Recombinant growth hormone preparations have been examined by several methodologies. In this study recombinant human growth hormone (rhGH; Genotropin(R)), expressed in E. coli K12, was structurally analyzed by two-dimensional gel electrophoresis and MALDI-TOF-TOF, LC-MS and LC-MS/MS sequencing of the resolved peptides. RESULTS: Electrospray LC-MS analysis revealed one major protein with an average molecular mass of 22126.8 Da and some additional minor components. Electrospray LC-MS/MS evaluation of the enzymatically digested Genotropin(R) sample resulted in the identification of amino acid substitutions at the residues M14, M125, and M170; di-methylation of K70 (or exchange to arginine); deamidation of N149 and N152; and oxidation of M140, M125 and M170. Peak area comparison of the modified and parental peptides indicates that these changes were present in ~2% of the recombinant preparation. CONCLUSION: Modifications of the recombinant human growth hormone may lead to structural or conformational changes, modification of antigenicity and antibody formation in treated subjects. Amino acid exchanges may be caused by differences between human and E. coli codon usage and/or unknown copy editing mechanisms. While deamidation and oxidation can be assigned to processing events, the mechanism for possible di-methylation of K70 remains unclear. [Abstract/Link to Full Text]

Perlee L, Christiansen J, Dondero R, Grimwade B, Lejnine S, Mullenix M, Shao W, Sorette M, Tchernev V, Patel D, Kingsmore S
Development and standardization of multiplexed antibody microarrays for use in quantitative proteomics.
Proteome Sci. 2004 Dec 15;2(1):9.
BACKGROUND: Quantitative proteomics is an emerging field that encompasses multiplexed measurement of many known proteins in groups of experimental samples in order to identify differences between groups. Antibody arrays are a novel technology that is increasingly being used for quantitative proteomics studies due to highly multiplexed content, scalability, matrix flexibility and economy of sample consumption. Key applications of antibody arrays in quantitative proteomics studies are identification of novel diagnostic assays, biomarker discovery in trials of new drugs, and validation of qualitative proteomics discoveries. These applications require performance benchmarking, standardization and specification. RESULTS: Six dual-antibody, sandwich immunoassay arrays that measure 170 serum or plasma proteins were developed and experimental procedures refined in more than thirty quantitative proteomics studies. This report provides detailed information and specification for manufacture, qualification, assay automation, performance, assay validation and data processing for antibody arrays in large scale quantitative proteomics studies. CONCLUSION: The present report describes development of first generation standards for antibody arrays in quantitative proteomics. Specifically, it describes the requirements of a comprehensive validation program to identify and minimize antibody cross reaction under highly multiplexed conditions; provides the rationale for the application of standardized statistical approaches to manage the data output of highly replicated assays; defines design requirements for controls to normalize sample replicate measurements; emphasizes the importance of stringent quality control testing of reagents and antibody microarrays; recommends the use of real-time monitors to evaluate sensitivity, dynamic range and platform precision; and presents survey procedures to reveal the significance of biomarker findings. [Abstract/Link to Full Text]

Myung JK, Afjehi-Sadat L, Felizardo-Cabatic M, Slavc I, Lubec G
Expressional patterns of chaperones in ten human tumor cell lines.
Proteome Sci. 2004 Dec 14;2(1):8.
BACKGROUND: Chaperones (CH) play an important role in tumor biology but no systematic work on expressional patterns has been reported so far. The aim of the study was therefore to present an analytical method for the concomitant determination of several CH in human tumor cell lines, to generate expressional patterns in the individual cell lines and to search for tumor and non-tumor cell line specific CH expression. Human tumor cell lines of neuroblastoma, colorectal and adenocarcinoma of the ovary, osteosarcoma, rhabdomyosarcoma, malignant melanoma, lung, cervical and breast cancer, and promyelocytic leukaemia were homogenised, proteins were separated by two-dimensional gel electrophoresis with in-gel digestion of proteins, and MALDI-TOF/TOF analysis was carried out for the identification of CH. RESULTS: A series of CH was identified, including the main CH groups HSP90/HATPas_C, HSP70, Cpn60_TCP1, DnaJ, Thioredoxin, TPR, Pro_isomerase, HSP20, ERP29_C, KE2, Prefoldin, DUF704, BAG, GrpE and DcpS. CONCLUSIONS: The ten individual tumor cell lines showed different expression patterns, which are important for the design of CH studies in tumor cell lines. The results can serve as a reference map and form the basis of a concomitant determination of CH by a protein chemical rather than an immunochemical method, independent of antibody availability or specificity. [Abstract/Link to Full Text]

Hansson SF, Puchades M, Blennow K, Sjögren M, Davidsson P
Validation of a prefractionation method followed by two-dimensional electrophoresis - Applied to cerebrospinal fluid proteins from frontotemporal dementia patients.
Proteome Sci. 2004 Nov 18;2(1):7.
BACKGROUND: The aim of this study was first, to improve and validate a cerebrospinal fluid (CSF) prefractionation method followed by two-dimensional electrophoresis (2-DE), and second, to use this strategy to investigate differences between the CSF proteome of frontotemporal dementia (FTD) patients and controls. From each subject, 3 ml of CSF was prefractionated using liquid phase isoelectric focusing prior to 2-DE. RESULTS: After several sample preparation methods were tested, ethanol precipitation of the prefractionated CSF sample was found to be superior with respect to protein recovery and purification potential. The reproducibility of prefractionated CSF analyzed on 2-D gels was comparable to direct 2-DE analysis of CSF. The protein spots on the prefractionated 2-D gels had an increased intensity, indicating a higher protein concentration, compared to direct 2-D gels. Prefractionated 2-DE analysis of FTD and control CSF showed that 26 protein spots were changed at least two-fold. Using mass spectrometry, 13 of these protein spots were identified, including retinol-binding protein, Zn-alpha-2-glycoprotein, proapolipoproteinA1, beta-2-microglobulin, transthyretin, albumin and alloalbumin. CONCLUSION: The results suggest that the prefractionated 2-DE method can be useful for enrichment of CSF proteins and may provide a new tool to investigate the pathology of neurodegenerative diseases. This study confirmed reduced levels of retinol-binding protein and revealed some new biomarker candidates for FTD. [Abstract/Link to Full Text]

Khoudoli GA, Porter IM, Blow JJ, Swedlow JR
Optimisation of the two-dimensional gel electrophoresis protocol using the Taguchi approach.
Proteome Sci. 2004 Sep 9;2(1):6.
BACKGROUND: Quantitative proteomic analyses have traditionally used two-dimensional gel electrophoresis (2DE) for separation and characterisation of complex protein mixtures. Among the difficulties associated with this approach is the solubilisation of protein mixtures for isoelectric focusing (IEF). To find the optimal formulation of the multi-component IEF rehydration buffer (RB) we applied the Taguchi method, a widely used approach for the robust optimisation of complex industrial processes, to determine optimal concentrations for the detergents, carrier ampholytes and reducing agents in RB for 2DE using commercially supplied immobilised pH gradient (IPG) gel strips. RESULTS: Our optimisation resulted in increased protein solubility, improved resolution and reproducibility of 2D gels, using a wide variety of samples. With the updated protocol we routinely detected approximately 4-fold more polypeptides on samples containing complex protein mixtures resolved on small format 2D gels. In addition, the pI and size ranges over which proteins could be resolved were substantially improved. Moreover, with improved sample loading and resolution, analysis of individual spots by immunoblotting and mass spectrometry revealed previously uncharacterised posttranscriptional modifications in a variety of chromatin proteins. CONCLUSIONS: While the optimised RB (oRB) is specific to the gels and analysis approach we use, our use of the Taguchi method should be generally applicable to a broad range of electrophoresis and analysis systems. [Abstract/Link to Full Text]
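
For readers unfamiliar with the Taguchi method named in this abstract, the Python sketch below shows its general shape: a small orthogonal array of runs, a "larger is better" signal-to-noise (S/N) ratio per run, and a per-factor choice of the level with the highest mean S/N. It is a sketch under stated assumptions, not the authors' design: the use of three columns of the standard L9 array, the three factor names, the three levels per factor and the spot counts are all hypothetical.

```python
import math
from itertools import combinations

# Three columns of the standard L9 orthogonal array, levels coded 0..2.
# Assumption: one column per hypothetical buffer component.
L9 = [
    (0, 0, 0), (0, 1, 1), (0, 2, 2),
    (1, 0, 1), (1, 1, 2), (1, 2, 0),
    (2, 0, 2), (2, 1, 0), (2, 2, 1),
]
FACTORS = ["detergent", "ampholyte", "reducing_agent"]


def is_orthogonal(array, levels=3):
    """Check that every pair of columns holds each level pair equally often."""
    for c1, c2 in combinations(range(len(array[0])), 2):
        counts = {}
        for row in array:
            key = (row[c1], row[c2])
            counts[key] = counts.get(key, 0) + 1
        if len(counts) != levels * levels or len(set(counts.values())) != 1:
            return False
    return True


def sn_larger_is_better(replicates):
    """Taguchi 'larger is better' S/N ratio in dB for one run."""
    mean_inverse_square = sum(1.0 / y ** 2 for y in replicates) / len(replicates)
    return -10.0 * math.log10(mean_inverse_square)


def best_levels(array, responses):
    """For each factor, pick the level with the highest mean S/N ratio."""
    sn = [sn_larger_is_better(r) for r in responses]
    best = {}
    for col, name in enumerate(FACTORS):
        means = []
        for level in range(3):
            values = [s for row, s in zip(array, sn) if row[col] == level]
            means.append(sum(values) / len(values))
        best[name] = max(range(3), key=lambda lv: means[lv])
    return best


# Invented spot counts (two replicate gels per run), for illustration only.
spot_counts = [
    [410, 395], [520, 540], [480, 470],
    [610, 590], [700, 720], [560, 575],
    [650, 640], [630, 655], [690, 670],
]

print(is_orthogonal(L9))
print(best_levels(L9, spot_counts))
```

The appeal of the design is economy: nine runs stand in for the 27 combinations of a full three-factor, three-level grid, at the cost of assuming that factor effects are roughly additive.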

Schweigert FJ, Wirth K, Raila J
Characterization of the microheterogeneity of transthyretin in plasma and urine using SELDI-TOF-MS immunoassay.
Proteome Sci. 2004 Sep 1;2(1):5.
BACKGROUND: It has been shown that transthyretin (TTR) exists in different molecular variants. Besides point mutations associated with different diseases such as amyloidosis, other posttranslational modifications occur that might be of diagnostic interest. RESULTS: TTR levels as determined by ELISA in plasma and urine of healthy individuals were 489 +/- 155 microg/ml plasma and 46 +/- 24 ng/g creatinine, respectively. Average levels in urine of pregnant women were 45 +/- 65 microg/g creatinine. The molecular heterogeneity of TTR was analyzed using a high-throughput mass spectrometric immunoassay system. TTR was extracted from plasma or urine onto an antibody-coated (via protein A) affinity chip surface (PS20) using the surface-enhanced laser desorption/ionization (SELDI) technique. Subsequently samples were subjected to time-of-flight mass spectrometry (TOF-MS). In healthy individuals, TTR in plasma occurred rather consistently in two variants of 13732 +/- 12 and 13851 +/- 9 Da for the native and S-cysteinylated forms and at a smaller signal of 14043 +/- 17 Da for the S-glutathionylated form. In urine of pregnant women, various signals were observed with a dominant signal at 13736 +/- 10 Da and a varying number of smaller immunoreactive fragments. These fragments are possibly the consequence of metabolism in plasma or kidney. CONCLUSION: This chip-based approach represents a rapid and accurate method to characterize the molecular variants of TTR including protein or peptide fragments which are either related to TTR or have resulted from its catabolism. These molecular variants may be of diagnostic importance as alternative or novel biomarkers due to their predominant relation to the TTR metabolism both in healthy and diseased individuals. [Abstract/Link to Full Text]
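
The plasma masses reported in this abstract can be checked against the adducts they are assigned to. The Python sketch below computes each variant's shift from the native signal and compares it with a nominal adduct mass. The expected shifts (about 119.1 Da for S-cysteinylation and 305.3 Da for S-glutathionylation) are assumed textbook average values, not figures from the abstract, and combining the two standard deviations in quadrature assumes the measurements are independent.

```python
import math

# Mean mass and standard deviation (Da) of plasma TTR signals, from the abstract.
observed = {
    "native": (13732.0, 12.0),
    "S-cysteinylated": (13851.0, 9.0),
    "S-glutathionylated": (14043.0, 17.0),
}

# Assumed nominal average mass additions (Da); not stated in the abstract.
expected_shift = {
    "S-cysteinylated": 119.1,
    "S-glutathionylated": 305.3,
}


def shift_from_native(form):
    """Return (observed mass shift, combined SD) relative to the native signal."""
    native_mass, native_sd = observed["native"]
    mass, sd = observed[form]
    return mass - native_mass, math.hypot(sd, native_sd)


for form, expected in expected_shift.items():
    shift, combined_sd = shift_from_native(form)
    consistent = abs(shift - expected) <= combined_sd
    print(f"{form}: observed shift {shift:.0f} Da, "
          f"assumed adduct {expected} Da, within one combined SD: {consistent}")
```

With these assumptions, both observed shifts (119 Da and 311 Da) fall within one combined standard deviation of the assumed adduct masses.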

Raymond S, O'Toole N, Cygler M
A data management system for structural genomics.
Proteome Sci. 2004 Jun 21;2(1):4.
BACKGROUND: Structural genomics (SG) projects aim to determine thousands of protein structures by the development of high-throughput techniques for all steps of the experimental structure determination pipeline. Crucial to the success of such endeavours is the careful tracking and archiving of experimental and external data on protein targets. RESULTS: We have developed a sophisticated data management system for structural genomics. Central to the system is an Oracle-based, SQL-interfaced database. The database schema deals with all facets of the structure determination process, from target selection to data deposition. Users access the database via any web browser. Experimental data is input by users with pre-defined web forms. Data can be displayed according to numerous criteria. A list of all current target proteins can be viewed, with links for each target to associated entries in external databases. To avoid unnecessary work on targets, our data management system matches protein sequences weekly using BLAST to entries in the Protein Data Bank and to targets of other SG centers worldwide. CONCLUSION: Our system is a working, effective and user-friendly data management tool for structural genomics projects. In this report we present a detailed summary of the various capabilities of the system, using real target data as examples, and indicate our plans for future enhancements. [Abstract/Link to Full Text]

Höglund A, Kohlbacher O
From sequence to structure and back again: approaches for predicting protein-DNA binding.
Proteome Sci. 2004 Jun 17;2(1):3.
Gene regulation in higher organisms is achieved by a complex network of transcription factors (TFs). Modulating gene expression and exploring gene function are major aims in molecular biology. Furthermore, the identification of putative target genes for a certain TF serves as a powerful tool for specific targeting of rational drugs. Detecting the short and variable transcription factor binding sites (TFBSs) in genomic DNA is an intriguing challenge for computational and structural biologists. Fast and reliable computational methods for predicting TFBSs on a whole-genome scale offer several advantages compared to the current experimental methods, which are rather laborious and slow. Two main approaches are being explored: advanced sequence-based algorithms and structure-based methods. The aim of this review is to outline the computational and experimental methods currently being applied in the field of protein-DNA interactions. With a focus on the former, the current state of the art in modeling these interactions is discussed. Surveying sequence and structure-based methods for predicting TFBSs, we conclude that in order to achieve a sound and specific method applicable on genomic sequences it is desirable and important to bring these two approaches together. [Abstract/Link to Full Text]
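
As one concrete picture of what a sequence-based TFBS predictor does, the Python sketch below builds a log-odds position weight matrix (PWM) from aligned binding sites and slides it along a sequence. The abstract does not name a specific algorithm, so the choice of a PWM is an assumption made here for illustration; the aligned sites, the uniform background, the pseudocount and the score threshold are all invented.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM from aligned, equal-length binding sites."""
    total = len(sites) + pseudocount * len(BASES)
    pwm = []
    for pos in range(len(sites[0])):
        column = [site[pos] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in BASES
        })
    return pwm


def scan(sequence, pwm, threshold):
    """Yield (offset, score) for each window scoring at or above threshold."""
    width = len(pwm)
    for i in range(len(sequence) - width + 1):
        window = sequence[i:i + width]
        score = sum(pwm[j][base] for j, base in enumerate(window))
        if score >= threshold:
            yield i, score


# Toy aligned sites and target sequence, invented for illustration only.
sites = ["TATAAT", "TATAAT", "TACAAT", "TATGAT"]
pwm = build_pwm(sites)
hits = list(scan("GGCTATAATGCC", pwm, threshold=5.0))
print(hits)
```

This sketch accepts only the letters A, C, G and T. Because binding sites are short and variable, a permissive threshold produces many false positives on a whole genome, which is part of the case the review makes for combining sequence-based scores with structural information.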

Shin JH, Yang JW, Le Pecheur M, London J, Hoeger H, Lubec G
Altered expression of hypothetical proteins in hippocampus of transgenic mice overexpressing human Cu/Zn-superoxide dismutase 1.
Proteome Sci. 2004 Jun 11;2(1):2.
BACKGROUND: Cu/Zn-superoxide dismutase 1 (SOD1), encoded on chromosome 21, is a key enzyme in the metabolism of reactive oxygen species (ROS) and pathogenetically relevant for several disease states including Down syndrome (DS; trisomy 21). Systematically studying protein expression in human brain and animal models of DS, we decided to carry out "protein hunting" for hypothetical proteins, i.e. proteins that have been predicted based upon nucleic acid sequences only, in a transgenic mouse model overexpressing human SOD1. RESULTS: We applied a proteomics approach using two-dimensional electrophoresis (2-DE) with in-gel digestion of spots followed by mass spectrometric (matrix-assisted laser desorption/ionization-time of flight) identification and quantification of hypothetical proteins using specific software. Hippocampi of wild type, hemizygous and homozygous SOD1 transgenic mice (SOD1-TGs) were analysed. We identified fourteen hypothetical proteins in mouse hippocampus. Of these, expression levels of 2610008O03Rik protein (Q9D0K2) and 4632432E04Rik protein (Q9D358) were significantly decreased (P < 0.05 and 0.001) and hypothetical protein (Q99KP6) was significantly increased (P < 0.05) in hippocampus of SOD1-TGs as compared with non-transgenic mice. CONCLUSIONS: The biological meaning of aberrant expression of these proteins may be impairment of metabolism, signaling and transcription machinery in SOD1-TGs brain that in turn may help to explain deterioration of these systems in DS brain. [Abstract/Link to Full Text]

Shin JH, Yang JW, Juranville JF, Fountoulakis M, Lubec G
Evidence for existence of thirty hypothetical proteins in rat brain.
Proteome Sci. 2004 Jan 30;2(1):1.
BACKGROUND: The rapid completion of genome sequences has created an infrastructure of biological information and provided essential information to link genes to gene products, proteins, the building blocks for cellular functions. In addition, genome/cDNA sequences make it possible to predict proteins for which there is no experimental evidence. Clues for function of hypothetical proteins are provided by sequence similarity with proteins of known function in model organisms. RESULTS: We constructed a two-dimensional protein map and searched for expression of hypothetical proteins in rat brain. Two-dimensional electrophoresis (2-DE) with subsequent in-gel digestion of spots and matrix-assisted laser desorption/ionization (MALDI) spectrometric identification were applied. In total about 3700 spots were analysed, which resulted in the identification of about 1700 polypeptides that were the products of 190 different genes. A number of hypothetical gene products were detected (30 of 190, 15.8%) and are considered brain proteins. CONCLUSIONS: A major finding of this study is the demonstration, by a protein chemical method that is independent of antibody availability and specificity and identifies proteins unambiguously, of the existence of putative proteins that had so far only been deduced from their nucleic acid structure. [Abstract/Link to Full Text]