Recent Articles in Nucleic Acids Research

Flicek P, Aken BL, Beal K, Ballester B, Caccamo M, Chen Y, Clarke L, Coates G, Cunningham F, Cutts T, Down T, Dyer SC, Eyre T, Fitzgerald S, Fernandez-Banet J, Gräf S, Haider S, Hammond M, Holland R, Howe KL, Howe K, Johnson N, Jenkinson A, Kähäri A, Keefe D, Kokocinski F, Kulesha E, Lawson D, Longden I, Megy K, Meidl P, Overduin B, Parker A, Pritchard B, Prlic A, Rice S, Rios D, Schuster M, Sealy I, Slater G, Smedley D, Spudich G, Trevanion S, Vilella AJ, Vogel J, White S, Wood M, Birney E, Cox T, Curwen V, Durbin R, Fernandez-Suarez XM, Herrero J, Hubbard TJ, Kasprzyk A, Proctor G, Smith J, Ureta-Vidal A, Searle S
Ensembl 2008.
Nucleic Acids Res. 2007 Nov 13;
The Ensembl project ( is a comprehensive genome information system featuring an integrated set of genome annotation, databases and other information for chordate and selected model organism and disease vector genomes. As of release 47 (October 2007), Ensembl fully supports 35 species, with preliminary support for six additional species. New species in the past year include platypus and horse. Major additions and improvements to Ensembl since our previous report include extensive support for functional genomics data in the form of a specialized functional genomics database, genome-wide maps of protein-DNA interactions and the Ensembl regulatory build; support for customization of the Ensembl web interface through the addition of user accounts and user groups; and increased support for genome resequencing. We have also introduced new comparative genomics-based data mining options and report on the continued development of our software infrastructure.

Leparc GG, Mitra RD
A sensitive procedure to detect alternatively spliced mRNA in pooled-tissue samples.
Nucleic Acids Res. 2007 Nov 13;
One important goal of genomics is to explore the extent of alternative splicing in the transcriptome and generate a comprehensive catalog of splice forms. New computational and experimental approaches have led to an increase in the number of predicted alternatively spliced transcripts; however, validation of these predictions has not kept pace. In this work, we systematically explore different methods for the validation of cassette exons predicted by computational methods or tiling microarrays. Our goal was to find a procedure that is cost effective, sensitive and specific. We examined three ways of priming the reverse transcription (RT) reaction: poly-dT priming, random priming and pooled exon-specific priming. We also examined two strategies for PCR amplification: flanking PCR, which uses primers that hybridize to the constitutive exons flanking the predicted exon, and a semi-nested PCR with a primer that targets the predicted exon. We found that the combination of RT using a pool of gene-specific primers followed by semi-nested PCR resulted in a significant increase in sensitivity over the most commonly used methodology (97% of the test set was detected versus 14%). Our method was also highly specific: no false positives were detected using a test set of true negatives. Finally, we demonstrate that this method is able to detect alternative exons with a high sensitivity from whole-organism RNA, allowing all tissues to be sampled in a single experiment. The protocol developed here is an accurate and cost-effective way to validate predictions of alternative splicing.

Andreeva A, Howorth D, Chandonia JM, Brenner SE, Hubbard TJ, Chothia C, Murzin AG
Data growth and its impact on the SCOP database: new developments.
Nucleic Acids Res. 2007 Nov 13;
The Structural Classification of Proteins (SCOP) database is a comprehensive ordering of all proteins of known structure, according to their evolutionary and structural relationships. The SCOP hierarchy comprises the following levels: Species, Protein, Family, Superfamily, Fold and Class. While keeping the original classification scheme intact, we have changed the production of SCOP in order to cope with a rapid growth of new structural data and to facilitate the discovery of new protein relationships. We describe ongoing developments and new features implemented in SCOP. A new update protocol supports batch classification of new protein structures by their detected relationships at Family and Superfamily levels in contrast to our previous sequential handling of new structural data by release date. We introduce pre-SCOP, a preview of the SCOP developmental version that enables earlier access to the information on new relationships. We also discuss the impact of worldwide Structural Genomics initiatives, which are producing new protein structures at an increasing rate, on the rates of discovery and growth of protein families and superfamilies. SCOP can be accessed at

Yang L, Wang K, Tan W, Li H, Yang X, Ma C, Tang H
Using force spectroscopy analysis to improve the properties of the hairpin probe.
Nucleic Acids Res. 2007 Nov 13;
The sensitivity of hairpin-probe-based fluorescence resonance energy transfer (FRET) analysis was sequence-dependent in detecting single base mismatches with different positions and identities. In this paper, the relationship between the sequence-dependent effect and the discrimination sensitivity of a single base mismatch was systematically investigated by fluorescence analysis and force spectroscopy analysis. The same hairpin probe was used. The uneven fluorescence analysis sensitivity was obviously influenced by the guanine-cytosine (GC) contents as well as the location of the mismatched base. However, we found that force spectroscopy analysis distinguished itself, displaying a high and even sensitivity in detecting differently mismatched targets. This could therefore be an alternative and novel way to minimize the sequence-dependent effect of the hairpin probe. The advantage offered by force spectroscopy analysis could mainly be attributed to the percentage of rupture force reduction, which could be directly and dramatically influenced by the percentage of secondary structure disruption contributed by each mismatched base pair, regardless of its location and identity. This yes-or-no detection mechanism should both contribute to a comprehensive understanding of the sensitivity source of different mutation analyses and extend the application range of hairpin probes.

Breitkreutz BJ, Stark C, Reguly T, Boucher L, Breitkreutz A, Livstone M, Oughtred R, Lackner DH, Bähler J, Wood V, Dolinski K, Tyers M
The BioGRID Interaction Database: 2008 update.
Nucleic Acids Res. 2007 Nov 13;
The Biological General Repository for Interaction Datasets (BioGRID) database ( was developed to house and distribute collections of protein and genetic interactions from major model organism species. BioGRID currently contains over 198 000 interactions from six different species, as derived from both high-throughput studies and conventional focused studies. Through comprehensive curation efforts, BioGRID now includes a virtually complete set of interactions reported to date in the primary literature for both the budding yeast Saccharomyces cerevisiae and the fission yeast Schizosaccharomyces pombe. A number of new features have been added to the BioGRID including an improved user interface to display interactions based on different attributes, a mirror site and a dedicated interaction management system to coordinate curation across different locations. The BioGRID provides interaction data with monthly updates to Saccharomyces Genome Database, Flybase and Entrez Gene. Source code for the BioGRID and the linked Osprey network visualization system is now freely available without restriction.

Inoue J, Honda M, Ikawa S, Shibata T, Mikawa T
The process of displacing the single-stranded DNA-binding protein from single-stranded DNA by RecO and RecR proteins.
Nucleic Acids Res. 2007 Nov 13;
The regions of single-stranded (ss) DNA that result from DNA damage are immediately coated by the ssDNA-binding protein (SSB). RecF pathway proteins facilitate the displacement of SSB from ssDNA, allowing the RecA protein to form protein filaments on the ssDNA region, which facilitates the process of recombinational DNA repair. In this study, we examined the mechanism of SSB displacement from ssDNA using purified Thermus thermophilus RecF pathway proteins. To date, RecO and RecR are thought to act as the RecOR complex. However, our results indicate that RecO and RecR have distinct functions. We found that RecR binds both RecF and RecO, and that RecO binds RecR, SSB and ssDNA. The electron microscopic studies indicated that SSB is displaced from ssDNA by RecO. In addition, pull-down assays indicated that the displaced SSB still remains indirectly attached to ssDNA through its interaction with RecO in the RecO-ssDNA complex. In the presence of both SSB and RecO, the ssDNA-dependent ATPase activity of RecA was inhibited, but was restored by the addition of RecR. Interestingly, the interaction of RecR with RecO affected the ssDNA-binding properties of RecO. These results suggest a model of SSB displacement from the ssDNA by RecF pathway proteins.

He S, Liu C, Skogerbø G, Zhao H, Wang J, Liu T, Bai B, Zhao Y, Chen R
NONCODE v2.0: decoding the non-coding.
Nucleic Acids Res. 2007 Nov 13;
The NONCODE database is an integrated knowledge database designed for the analysis of non-coding RNAs (ncRNAs). Since NONCODE was first released 3 years ago, the number of known ncRNAs has grown rapidly, and there is growing recognition that ncRNAs play important regulatory roles in most organisms. In the updated version of NONCODE (NONCODE v2.0), the number of collected ncRNAs has reached 206 226, including a wide range of microRNAs, Piwi-interacting RNAs and mRNA-like ncRNAs. The improvements brought to the database include not only new and updated ncRNA data sets, but also an incorporation of BLAST alignment search service and access through our custom UCSC Genome Browser. NONCODE can be found under or

Sanchez H, Cardenas PP, Yoshimura SH, Takeyasu K, Alonso JC
Dynamic structures of Bacillus subtilis RecN DNA complexes.
Nucleic Acids Res. 2007 Nov 13;
Genetic and cytological evidence suggests that Bacillus subtilis RecN acts prior to and after end-processing of DNA double-strand ends via homologous recombination, appears to participate in the assembly of a DNA repair centre and interacts with incoming single-stranded (ss) DNA during natural transformation. We have determined the architecture of RecN-ssDNA complexes by atomic force microscopy (AFM). ATP induces changes in the architecture of the RecN-ssDNA complexes and stimulates inter-complex assembly, thereby increasing the local concentration of DNA ends. The large CII and CIII complexes formed are insensitive to SsbA (counterpart of Escherichia coli SSB or eukaryotic RPA protein) addition, but RecA induces dislodging of RecN from the overhangs of duplex DNA molecules. Reciprocally, in the presence of RecN, RecA does not form large RecA-DNA networks. Based on these results, we hypothesize that in the presence of ATP, RecN tethers the 3'-ssDNA ends, and facilitates the access of RecA to the high local concentration of DNA ends. Then, the resulting RecA nucleoprotein filaments, on different ssDNA segments, might promote the simultaneous genome-wide homology search.

Malinen M, Saramäki A, Ropponen A, Degenhardt T, Väisänen S, Carlberg C
Distinct HDACs regulate the transcriptional response of human cyclin-dependent kinase inhibitor genes to trichostatin A and 1alpha,25-dihydroxyvitamin D3.
Nucleic Acids Res. 2007 Nov 13;
The anti-proliferative effects of histone deacetylase (HDAC) inhibitors and 1alpha,25-dihydroxyvitamin D(3) [1alpha,25(OH)(2)D(3)] converge via the interaction of un-liganded vitamin D receptor (VDR) with co-repressors recruiting multiprotein complexes containing HDACs and via the induction of cyclin-dependent kinase inhibitor (CDKI) genes of the INK4 and Cip/Kip family. We investigated the effects of the HDAC inhibitor Trichostatin A (TSA) and 1alpha,25(OH)(2)D(3) on the proliferation and CDKI gene expression in malignant and non-malignant mammary epithelial cell lines. TSA induced the INK4-family genes p18 and p19, whereas the Cip/Kip family gene p21 was stimulated by 1alpha,25(OH)(2)D(3). Chromatin immunoprecipitation and RNA inhibition assays showed that the co-repressor NCoR1 and some HDAC family members complexed un-liganded VDR and repressed the basal level of CDKI genes, but their roles in regulating CDKI gene expression by TSA and 1alpha,25(OH)(2)D(3) were contrary. HDAC3 and HDAC7 attenuated 1alpha,25(OH)(2)D(3)-dependent induction of the p21 gene, for which NCoR1 is essential. In contrast, TSA-mediated induction of the p18 gene was dependent on HDAC3 and HDAC4, but was opposed by NCoR1 and un-liganded VDR. This suggests that the attenuation of the response to TSA by NCoR1 or that to 1alpha,25(OH)(2)D(3) by HDACs can be overcome by their combined application achieving maximal induction of anti-proliferative target genes.

Glasner JD, Plunkett G, Anderson BD, Baumler DJ, Biehl BS, Burland V, Cabot EL, Darling AE, Mau B, Neeno-Eckwall EC, Pot D, Qiu Y, Rissman AI, Worzella S, Zaremba S, Fedorko J, Hampton T, Liss P, Rusch M, Shaker M, Shaull L, Shetty P, Thotakura S, Whitmore J, Blattner FR, Greene JM, Perna NT
Enteropathogen Resource Integration Center (ERIC): bioinformatics support for research on biodefense-relevant enterobacteria.
Nucleic Acids Res. 2007 Nov 13;
ERIC, the Enteropathogen Resource Integration Center (, is a new web portal serving as a rich source of information about enterobacteria on the NIAID established list of Select Agents related to biodefense: diarrheagenic Escherichia coli, Shigella spp., Salmonella spp., Yersinia enterocolitica and Yersinia pestis. More than 30 genomes have been completely sequenced, many more exist in draft form and additional projects are underway. These organisms are increasingly the focus of studies using high-throughput experimental technologies and computational approaches. This wealth of data provides unprecedented opportunities for understanding the workings of basic biological systems and discovery of novel targets for development of vaccines, diagnostics and therapeutics. ERIC brings information together from disparate sources and supports data comparison across different organisms, analysis of varying data types and visualization of analyses in human and computer-readable formats.

Hershman SG, Chen Q, Lee JY, Kozak ML, Yue P, Wang LS, Brad Johnson F
Genomic distribution and functional analyses of potential G-quadruplex-forming sequences in Saccharomyces cerevisiae.
Nucleic Acids Res. 2007 Nov 13;
Although well studied in vitro, the in vivo functions of G-quadruplexes (G4-DNA and G4-RNA) are only beginning to be defined. Recent studies have demonstrated enrichment for sequences with intramolecular G-quadruplex forming potential (QFP) in transcriptional promoters of humans, chickens and bacteria. Here we survey the yeast genome for QFP sequences and similarly find strong enrichment for these sequences in upstream promoter regions, as well as weaker but significant enrichment in open reading frames (ORFs). Further, four findings are consistent with roles for QFP sequences in transcriptional regulation. First, QFP is correlated with upstream promoter regions with low histone occupancy. Second, treatment of cells with N-methyl mesoporphyrin IX (NMM), which binds G-quadruplexes selectively in vitro, causes significant upregulation of loci with QFP-possessing promoters or ORFs. NMM also causes downregulation of loci connected with the function of the ribosomal DNA (rDNA), which itself has high QFP. Third, ORFs with QFP are selectively downregulated in sgs1 mutants that lack the G4-DNA-unwinding helicase Sgs1p. Fourth, a screen for yeast mutants that enhance or suppress growth inhibition by NMM revealed enrichment for chromatin and transcriptional regulators, as well as telomere maintenance factors. These findings raise the possibility that QFP sequences form bona fide G-quadruplexes in vivo and thus regulate transcription.
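
A genome survey of this kind can be illustrated with a small pattern scan. The sketch below is only an assumption about how such a search might look: the abstract does not state the motif definition the authors used, so the pattern here (four runs of at least three guanines separated by loops of 1 to 7 bases) is a commonly used rule of thumb, and the names `qfp_pattern`, `find_qfp`, `min_run` and `max_loop` are hypothetical.

```python
import re

def qfp_pattern(min_run=3, max_loop=7):
    """Build a regex for four G-runs separated by short loops.

    ASSUMPTION: this motif definition is a common heuristic, not the
    definition used in the abstract above. Both parameters are adjustable.
    """
    run = "G{%d,}" % min_run
    loop = "[ACGT]{1,%d}" % max_loop
    # One G-run followed by three (loop, G-run) units gives four runs in total.
    return re.compile(run + (loop + run) * 3)

def find_qfp(sequence, min_run=3, max_loop=7):
    """Return (start, end, motif) for each non-overlapping hit on the given strand."""
    pattern = qfp_pattern(min_run, max_loop)
    return [(m.start(), m.end(), m.group()) for m in pattern.finditer(sequence.upper())]

hits = find_qfp("ATGGGTTGGGTTGGGTTGGGAT")
print(hits)  # [(2, 20, 'GGGTTGGGTTGGGTTGGG')]
```

Only the strand passed in is scanned; hits on the complementary strand would appear as runs of C and would need the reverse complement to be searched as well.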

Pagel P, Oesterheld M, Tovstukhina O, Strack N, Stümpflen V, Frishman D
DIMA 2.0: predicted and known domain interactions.
Nucleic Acids Res. 2007 Nov 13;
DIMA, the domain interaction map, has evolved from a simple web server for domain phylogenetic profiling into an integrative prediction resource combining both experimental data on domain-domain interactions and predictions from two different algorithms. With this update, DIMA obtains greatly improved coverage at the level of genomes and domains as well as with respect to available prediction approaches. The domain phylogenetic profiling method now uses SIMAP as its backend for exhaustive domain hit coverage: 7038 Pfam domains were profiled over 460 completely sequenced genomes. Domain pair exclusion predictions were produced from 83 969 distinct protein-protein interactions obtained from IntAct resulting in 21 513 domain pairs with significant domain pair exclusion algorithm scores. Additional predictions applying the same algorithm to predicted protein interactions from STRING yielded 2378 high-confidence pairs. Experimental data comes from iPfam (3074) and 3did (3034 pairs), two databases identifying domain contacts in solved protein structures. Taken together, these two resources yielded 3653 distinct interacting domain pairs. DIMA is available at
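
The three experimental counts quoted above imply how much the two structure-derived resources overlap. As a rough worked example (the function names `shared_count` and `merge_pairs` and the toy domain identifiers are hypothetical, and nothing here reflects how DIMA itself merges data), inclusion-exclusion gives the number of pairs reported by both resources:

```python
def shared_count(size_a, size_b, size_union):
    """Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|."""
    return size_a + size_b - size_union

# Counts taken from the abstract above.
ipfam_pairs = 3074
did3_pairs = 3034
distinct_pairs = 3653

shared = shared_count(ipfam_pairs, did3_pairs, distinct_pairs)
print(shared)  # 2455

def merge_pairs(*sources):
    """Merge domain-pair lists, treating (A, B) and (B, A) as the same pair."""
    merged = set()
    for source in sources:
        merged.update(frozenset(pair) for pair in source)
    return merged

# Toy data with made-up identifiers, only to show the order-insensitive merge.
toy = merge_pairs([("DomA", "DomB")], [("DomB", "DomA"), ("DomA", "DomC")])
print(len(toy))  # 2
```

Under these figures, roughly two thirds of the distinct pairs (2455 of 3653) would be present in both resources.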

Backman TW, Sullivan CM, Cumbie JS, Miller ZA, Chapman EJ, Fahlgren N, Givan SA, Carrington JC, Kasschau KD
Update of ASRP: the Arabidopsis Small RNA Project database.
Nucleic Acids Res. 2007 Nov 13;
Development of the Arabidopsis Small RNA Project (ASRP) Database, which provides information and tools for the analysis of microRNA, endogenous siRNA and other small RNA-related features, has been driven by the introduction of high-throughput sequencing technology. To accommodate the demands of increased data, numerous improvements and updates have been made to ASRP, including new ways to access data, more efficient algorithms for handling data, and increased integration with community-wide resources. New search and visualization tools have also been developed to improve access to small RNA classes and their targets. ASRP is publicly available through a web interface at

Dahl C, Guldberg P
A ligation assay for multiplex analysis of CpG methylation using bisulfite-treated DNA.
Nucleic Acids Res. 2007 Nov 12;
Aberrant methylation of promoter CpG islands is causally linked with a number of inherited syndromes and most sporadic cancers, and may provide valuable diagnostic and prognostic biomarkers. In this report, we describe an approach to simultaneous analysis of multiple CpG islands, where methylation-specific oligonucleotide probes are joined by ligation and subsequently amplified by polymerase chain reaction (PCR) when hybridized in juxtaposition on bisulfite-treated DNA. Specificity of the ligation reaction is achieved by (i) using probes containing CpGpCpG (for methylated sequences) or CpApCpA (for unmethylated sequences) at the 3' ends, (ii) including three or more probes for each target, and (iii) using a thermostable DNA ligase. The external probes carry universal tails to allow amplification of multiple ligation products using a common primer pair. As proof-of-principle applications, we established duplex assays to examine the FMR1 promoter in individuals with fragile-X syndrome and the SNRPN promoter in individuals with Prader-Willi syndrome or Angelman syndrome, and a multiplex assay to simultaneously detect hypermethylation of seven genes (ID4, APC, RASSF1A, CDH1, ESR1, HIN1 and TWIST1) in breast cancer cell lines and tissues. These data show that ligation of oligonucleotide probes hybridized to bisulfite-treated DNA is a simple and cost-effective approach to analysis of CpG methylation.

Kawashima S, Pokarowski P, Pokarowska M, Kolinski A, Katayama T, Kanehisa M
AAindex: amino acid index database, progress report 2008.
Nucleic Acids Res. 2007 Nov 12;
AAindex is a database of numerical indices representing various physicochemical and biochemical properties of amino acids and pairs of amino acids. We have added a collection of protein contact potentials to the AAindex as a new section. Accordingly AAindex consists of three sections now: AAindex1 for the amino acid index of 20 numerical values, AAindex2 for the amino acid substitution matrix and AAindex3 for the statistical protein contact potentials. All data are derived from published literature. The database can be accessed through the DBGET/LinkDB system at GenomeNet ( or downloaded by anonymous FTP (
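
An amino acid index of 20 numerical values, as described for AAindex1, maps naturally onto a dictionary keyed by one-letter residue codes. The following is a minimal sketch under stated assumptions: the values are the widely used Kyte-Doolittle hydropathy scale, typed in by hand rather than read from AAindex, and `mean_index` is a hypothetical helper, not part of any AAindex tooling.

```python
# ASSUMPTION: Kyte-Doolittle hydropathy values entered manually for illustration;
# for real work, load the index from the database files instead.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def mean_index(sequence, index):
    """Average an amino acid index over a protein sequence.

    Raises KeyError for residues that are not among the 20 standard codes
    and ValueError for an empty sequence.
    """
    if not sequence:
        raise ValueError("sequence must not be empty")
    values = [index[residue] for residue in sequence.upper()]
    return sum(values) / len(values)

# A short hydrophobic stretch scores higher than a charged one.
print(mean_index("LIVAL", KYTE_DOOLITTLE) > mean_index("KDERK", KYTE_DOOLITTLE))  # True
```

The same function works with any other 20-value index, since the index is passed in as a plain mapping.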

Rawlings ND, Morton FR, Kok CY, Kong J, Barrett AJ
MEROPS: the peptidase database.
Nucleic Acids Res. 2007 Nov 8;
Peptidases (proteolytic enzymes or proteases), their substrates and inhibitors are of great relevance to biology, medicine and biotechnology. The MEROPS database ( aims to fulfil the need for an integrated source of information about these. The organizational principle of the database is a hierarchical classification in which homologous sets of peptidases and protein inhibitors are grouped into protein species, which are grouped into families and in turn grouped into clans. Important additions to the database include newly written, concise text annotations for peptidase clans and the small molecule inhibitors that are outside the scope of the standard classification; displays to show peptidase specificity compiled from our collection of known substrate cleavages; tables of peptidase-inhibitor interactions; and dynamically generated alignments of representatives of each protein species at the family level. New ways to compare peptidase and inhibitor complements between any two organisms whose genomes have been completely sequenced, or between different strains or subspecies of the same organism, have been devised.

Fahrer J, Kranaster R, Altmeyer M, Marx A, Bürkle A
Quantitative analysis of the binding affinity of poly(ADP-ribose) to specific binding proteins as a function of chain length.
Nucleic Acids Res. 2007 Nov 8;
Poly(ADP-ribose) (PAR) is synthesized by poly(ADP-ribose) polymerases in response to genotoxic stress and interacts non-covalently with DNA damage checkpoint and repair proteins. Here, we present a variety of techniques to analyze this interaction in terms of selectivity and affinity. In vitro synthesized PAR was end-labeled using a carbonyl-reactive biotin analog. Binding of HPLC-fractionated PAR chains to the tumor suppressor protein p53 and to the nucleotide excision repair protein XPA was assessed using a novel electrophoretic mobility shift assay (EMSA). Long ADP-ribose chains (55-mer) promoted the formation of three specific complexes with p53. Short PAR chains (16-mer) were also able to bind p53, yet forming only one defined complex. In contrast, XPA did not interact with short polymer, but produced a single complex with long PAR chains (55-mer). In addition, we performed surface plasmon resonance with immobilized PAR chains, which allowed establishing binding constants and confirmed the results obtained by EMSA. Taken together, we developed several new protocols permitting the quantitative characterization of PAR-protein binding. Furthermore, we demonstrated that the affinity of the non-covalent PAR interactions with specific binding proteins (XPA, p53) can be very high (nanomolar range) and depends both on the PAR chain length and on the binding protein.

Griffiths-Jones S, Saini HK, Dongen SV, Enright AJ
miRBase: tools for microRNA genomics.
Nucleic Acids Res. 2007 Nov 8;
miRBase is the central online repository for microRNA (miRNA) nomenclature, sequence data, annotation and target prediction. The current release (10.0) contains 5071 miRNA loci from 58 species, expressing 5922 distinct mature miRNA sequences: a growth of over 2000 sequences in the past 2 years. miRBase provides a range of data to facilitate studies of miRNA genomics: all miRNAs are mapped to their genomic coordinates. Clusters of miRNA sequences in the genome are highlighted, and can be defined and retrieved with any inter-miRNA distance. The overlap of miRNA sequences with annotated transcripts, both protein- and non-coding, is described. Finally, graphical views of the locations of a wide range of genomic features in model organisms allow for the first time the prediction of the likely boundaries of many miRNA primary transcripts. miRBase is available at
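
Defining clusters by a chosen inter-miRNA distance amounts to grouping loci on the same chromosome whose gaps do not exceed a threshold. The sketch below is one possible way to do this and is not taken from miRBase: the function `cluster_loci`, the tuple layout `(name, chrom, start, end)` and the example loci are all hypothetical, and strand is ignored for brevity.

```python
from collections import defaultdict

def cluster_loci(loci, max_gap):
    """Group loci into clusters by genomic distance.

    loci: iterable of (name, chrom, start, end) tuples (hypothetical layout).
    max_gap: largest allowed distance between the end of a cluster and the
             start of the next locus for that locus to join the cluster.
    Returns a list of clusters, each a list of locus names.
    """
    by_chrom = defaultdict(list)
    for name, chrom, start, end in loci:
        by_chrom[chrom].append((start, end, name))

    clusters = []
    for chrom in sorted(by_chrom):
        current, current_end = [], None
        for start, end, name in sorted(by_chrom[chrom]):
            # Start a new cluster when the gap to the running cluster end is too large.
            if current and start - current_end > max_gap:
                clusters.append(current)
                current, current_end = [], None
            current.append(name)
            current_end = end if current_end is None else max(current_end, end)
        if current:
            clusters.append(current)
    return clusters

# Made-up coordinates for illustration only.
example = [
    ("mir-a", "chr1", 100, 180),
    ("mir-b", "chr1", 500, 580),
    ("mir-c", "chr1", 20000, 20080),
    ("mir-d", "chr2", 50, 130),
]
print(cluster_loci(example, 1000))  # [['mir-a', 'mir-b'], ['mir-c'], ['mir-d']]
```

Isolated loci are returned as single-member groups here; a caller interested only in true clusters could filter out groups of length one.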

Sprague J, Bayraktaroglu L, Bradford Y, Conlin T, Dunn N, Fashena D, Frazer K, Haendel M, Howe DG, Knight J, Mani P, Moxon SA, Pich C, Ramachandran S, Schaper K, Segerdell E, Shao X, Singer A, Song P, Sprunger B, Van Slyke CE, Westerfield M
The Zebrafish Information Network: the zebrafish model organism database provides expanded support for genotypes and phenotypes.
Nucleic Acids Res. 2007 Nov 8;
The Zebrafish Information Network (ZFIN,, the model organism database for zebrafish, provides the central location for curated zebrafish genetic, genomic and developmental data. Extensive data integration of mutant phenotypes, genes, expression patterns, sequences, genetic markers, morpholinos, map positions, publications and community resources facilitates the use of the zebrafish as a model for studying gene function, development, behavior and disease. Access to ZFIN data is provided via web-based query forms and through bulk data files. ZFIN is the definitive source for zebrafish gene and allele nomenclature, the zebrafish anatomical ontology (AO) and for zebrafish gene ontology (GO) annotations. ZFIN plays an active role in the development of cross-species ontologies such as the phenotypic quality ontology (PATO) and the gene ontology (GO). Recent enhancements to ZFIN include (i) a new home page and navigation bar, (ii) expanded support for genotypes and phenotypes, (iii) comprehensive phenotype annotations based on anatomical, phenotypic quality and gene ontologies, (iv) a BLAST server tightly integrated with the ZFIN database via ZFIN-specific datasets, (v) a global site search and (vi) help with hands-on resources.

Rogers A, Antoshechkin I, Bieri T, Blasiar D, Bastiani C, Canaran P, Chan J, Chen WJ, Davis P, Fernandes J, Fiedler TJ, Han M, Harris TW, Kishore R, Lee R, McKay S, Müller HM, Nakamura C, Ozersky P, Petcherski A, Schindelman G, Schwarz EM, Spooner W, Tuli MA, Auken KV, Wang D, Wang X, Williams G, Yook K, Durbin R, Stein LD, Spieth J, Sternberg PW
WormBase 2007.
Nucleic Acids Res. 2007 Nov 8;
WormBase ( is the major publicly available database of information about Caenorhabditis elegans, an important system for basic biological and biomedical research. Derived from the initial ACeDB database of C. elegans genetic and sequence information, WormBase now includes the genomic, anatomical and functional information about C. elegans, other Caenorhabditis species and other nematodes. As such, it is a crucial resource not only for C. elegans biologists but also for the larger biomedical and bioinformatics communities. Coverage of core areas of C. elegans biology will allow the biomedical community to make full use of the results of intensive molecular genetic analysis and functional genomic studies of this organism. Improved search and display tools, wider cross-species comparisons and extended ontologies are some of the features that will help scientists extend their research and take advantage of other nematode species genome sequences.

Marinelli RJ, Montgomery K, Liu CL, Shah NH, Prapong W, Nitzberg M, Zachariah ZK, Sherlock GJ, Natkunam Y, West RB, van de Rijn M, Brown PO, Ball CA
The Stanford Tissue Microarray Database.
Nucleic Acids Res. 2007 Nov 22;
The Stanford Tissue Microarray Database (TMAD; is a public resource for disseminating annotated tissue images and associated expression data. Stanford University pathologists, researchers and their collaborators worldwide use TMAD for designing, viewing, scoring and analyzing their tissue microarrays. The use of tissue microarrays allows hundreds of human tissue cores to be simultaneously probed by antibodies to detect protein abundance (Immunohistochemistry; IHC), or by labeled nucleic acids (in situ hybridization; ISH) to detect transcript abundance. TMAD archives multi-wavelength fluorescence and bright-field images of tissue microarrays for scoring and analysis. As of July 2007, TMAD contained 205 161 images archiving 349 distinct probes on 1488 tissue microarray slides. Of these, 31 306 images for 68 probes on 125 slides have been released to the public. To date, 12 publications have been based on these raw public data. TMAD incorporates the NCI Thesaurus ontology for searching tissues in the cancer domain. Image processing researchers can extract images and scores for training and testing classification algorithms. The production server uses the Apache HTTP Server, Oracle Database and Perl application code. Source code is available to interested researchers under a no-cost license.

Keppetipola N, Shuman S
Characterization of the 2',3' cyclic phosphodiesterase activities of Clostridium thermocellum polynucleotide kinase-phosphatase and bacteriophage lambda phosphatase.
Nucleic Acids Res. 2007 Nov 5;
Clostridium thermocellum polynucleotide kinase-phosphatase (CthPnkp) catalyzes 5' and 3' end-healing reactions that prepare broken RNA termini for sealing by RNA ligase. The central phosphatase domain of CthPnkp belongs to the dinuclear metallophosphoesterase superfamily exemplified by bacteriophage lambda phosphatase (lambda-Pase). CthPnkp is a Ni(2+)/Mn(2+)-dependent phosphodiesterase-monoesterase, active on nucleotide and non-nucleotide substrates, that can be transformed toward narrower metal and substrate specificities via mutations of the active site. Here we characterize the Mn(2+)-dependent 2',3' cyclic nucleotide phosphodiesterase activity of CthPnkp, the reaction most relevant to RNA repair pathways. We find that CthPnkp prefers a 2',3' cyclic phosphate to a 3',5' cyclic phosphate. A single H189D mutation imposes strict specificity for a 2',3' cyclic phosphate, which is cleaved to form a single 2'-NMP product. Analysis of the cyclic phosphodiesterase activities of mutated CthPnkp enzymes illuminates the active site and the structural features that affect substrate affinity and k(cat). We also characterize a previously unrecognized phosphodiesterase activity of lambda-Pase, which catalyzes hydrolysis of bis-p-nitrophenyl phosphate. lambda-Pase also has cyclic phosphodiesterase activity with nucleoside 2',3' cyclic phosphates, which it hydrolyzes to yield a mixture of 2'-NMP and 3'-NMP products. We discuss our results in light of available structural and functional data for other phosphodiesterase members of the binuclear metallophosphoesterase family and draw inferences about how differences in active site composition influence catalytic repertoire.

Heger A, Korpelainen E, Hupponen T, Mattila K, Ollikainen V, Holm L
PairsDB atlas of protein sequence space.
Nucleic Acids Res. 2007 Nov 5;
Sequence similarity/database searching is a cornerstone of molecular biology. PairsDB is a database intended to make exploring protein sequences and their similarity relationships quick and easy. Behind PairsDB is a comprehensive collection of protein sequences and BLAST and PSI-BLAST alignments between them. Instead of running BLAST or PSI-BLAST individually on each request, results are retrieved instantaneously from a database of pre-computed alignments. Filtering options allow you to find a set of sequences satisfying a set of criteria, for example, all human proteins with solved structure and without transmembrane segments. PairsDB is continually updated and covers all sequences in Uniprot. The data is stored in a MySQL relational database. Data files will be made available for download at PairsDB can also be accessed interactively at PairsDB data is a valuable platform to build various downstream automated analysis pipelines. For example, the graph of all-against-all similarity relationships is the starting point for clustering protein families, delineating domains, improving alignment accuracy by consistency measures, and defining orthologous genes. Moreover, query-anchored stacked sequence alignments, profiles and consensus sequences are useful in studies of sequence conservation patterns for clues about possible functional sites.

Ma B, Levine AJ
Probing potential binding modes of the p53 tetramer to DNA based on the symmetries encoded in p53 response elements.
Nucleic Acids Res. 2007 Nov 5;
Symmetries in the p53 response element (p53RE) encode binding modes by which the p53 tetramer recognizes DNA. We investigated the molecular mechanisms and biological implications of the possible binding modes. The probabilities evaluated with molecular dynamics simulations and DNA sequence analyses were found to be correlated, indicating that the p53 tetramer models studied here are able to read DNA sequence information. The traditionally accepted mode, with four p53 monomers binding at all four DNA quarter-sites, does not cause linear DNA to bend. Alternatively, the p53 tetramer can use only two monomers to recognize DNA sequence and induce DNA bending. With an arrangement of dimer of AB dimer observed in p53 trimer-DNA complex crystal, p53 can recognize supercoiled DNA sequence-specifically by binding to quarter-sites one and four (H14 mode) and recognize Holliday junction geometry-specifically. Examining the R273H mutation and p53-DNA interactions, we found that at least three R273H monomers are needed to disable the p53 tetramer, consistent with experiments. However, just one R273H monomer may greatly shift the binding mode probabilities. Our work suggests that p53 needs balanced binding modes to maintain genome stability. Inverse repeat p53REs favor the H14 mode, whereas direct repeat p53REs may have high probabilities of other modes. [Abstract/Link to Full Text]

Passos JF, Saretzki G, von Zglinicki T
DNA damage in telomeres and mitochondria during cellular senescence: is there a connection?
Nucleic Acids Res. 2007 Nov 5;
Cellular senescence is the ultimate and irreversible loss of replicative capacity occurring in primary somatic cell culture. It is triggered as a stereotypic response to unrepaired nuclear DNA damage or to uncapped telomeres. In addition to a direct role of nuclear DNA double-strand breaks as inducers of a DNA damage response, two more subtle types of DNA damage induced by physiological levels of reactive oxygen species (ROS) can have a significant impact on cellular senescence. Firstly, it has been established that telomere shortening, which is the major contributor to telomere uncapping, is stress dependent and largely caused by a telomere-specific DNA single-strand break repair inefficiency. Secondly, mitochondrial DNA (mtDNA) damage is closely interrelated with mitochondrial ROS production, and this might also play a causal role in cellular senescence. Improvement of mitochondrial function results in less telomeric damage and slower telomere shortening, while telomere-dependent growth arrest is associated with increased mitochondrial dysfunction. Moreover, telomerase, the enzyme complex that is known to re-elongate shortened telomeres, also appears to have functions independent of telomeres that protect against oxidative stress. Together, these data suggest a self-amplifying cycle between mitochondrial and telomeric DNA damage during cellular senescence. [Abstract/Link to Full Text]

Hastie AR, Pruitt SC
Yeast two-hybrid interaction partner screening through in vivo Cre-mediated Binary Interaction Tag generation.
Nucleic Acids Res. 2007 Nov 5;
Yeast two-hybrid (Y2H) has been successfully used for genome-wide screens to identify protein-protein interactions for several model organisms. Nonetheless, the logistics of pair-wise screening has resulted in a cumbersome and incomplete application of this method to complex genomes. Here, we develop a modification of Y2H that eliminates the requirement for pair-wise screening. This is accomplished by incorporating lox sequences into Y2H vectors such that cDNAs encoding interacting partners become physically linked in the presence of Cre recombinase in vivo. Once linked, DNA from complex pools of clones can be processed without losing the identity of the interacting partners. Short linked sequence tags from each pair of interacting partner (binary interaction Tags or BI-Tags) are then recovered and sequenced. To validate the approach, comparisons between interactions found using traditional Y2H and the BI-Tag method were made, which demonstrate that the BI-Tag technology accurately represents the complexity of the interaction partners found in the screens. The technology described here sufficiently improves the throughput of the Y2H approach to make feasible the generation of near comprehensive interaction maps for complex organisms. [Abstract/Link to Full Text]

Lee PH, Shatkay H
F-SNP: computationally predicted functional SNPs for disease association studies.
Nucleic Acids Res. 2007 Nov 5;
The Functional Single Nucleotide Polymorphism (F-SNP) database integrates information obtained from 16 bioinformatics tools and databases about the functional effects of SNPs. These effects are predicted and indicated at the splicing, transcriptional, translational and post-translational level. As such, the database helps identify and focus on SNPs with potential deleterious effects on human health. In particular, users can retrieve SNPs that disrupt genomic regions known to be functional, including splice sites and transcriptional regulatory regions. Users can also identify non-synonymous SNPs that may have deleterious effects on protein structure or function, interfere with protein translation or impede post-translational modification. A web interface enables easy navigation for obtaining information through multiple starting points and exploration routes (e.g. starting from SNP identifier, genomic region, gene or target disease). The F-SNP database is available online. [Abstract/Link to Full Text]

Elles LM, Uhlenbeck OC
Mutation of the arginine finger in the active site of Escherichia coli DbpA abolishes ATPase and helicase activity and confers a dominant slow growth phenotype.
Nucleic Acids Res. 2007 Nov 5;
Escherichia coli DEAD-box protein A (DbpA) is an ATP-dependent RNA helicase with specificity for 23S ribosomal RNA. Although DbpA has been extensively characterized biochemically, its biological function remains unknown. Previous work has shown that a DbpA deletion strain is viable with little or no effect on growth rate. In an attempt to elucidate a phenotype for DbpA, point mutations were made at eleven conserved residues in the ATPase active site, which have exhibited dominant-negative phenotypes in other DExD/H proteins. Biochemical analysis of these DbpA mutants shows the expected decrease in RNA-dependent ATPase activity and helix unwinding activity. Only the least biochemically active mutation, R331A, produces a small colony phenotype and a reduced growth rate. This dominant slow growth mutant will be valuable for determining the cellular function of DbpA. [Abstract/Link to Full Text]

Bex C, Knauth K, Dambacher S, Buchberger A
A yeast two-hybrid system reconstituting substrate recognition of the von Hippel-Lindau tumor suppressor protein.
Nucleic Acids Res. 2007 Nov 5;
The von Hippel-Lindau tumor suppressor protein (pVHL) is inactivated in the hereditary cancer syndrome von Hippel-Lindau disease and in the majority of sporadic renal carcinomas. pVHL is the substrate-binding subunit of the CBC(VHL) ubiquitin ligase complex that negatively regulates cell growth by promoting the degradation of hypoxia-inducible transcription factor subunits (HIF1/2alpha). Proteomics-based identification of novel pVHL substrates is hampered by their short half-life and low abundance in mammalian cells. The usefulness of yeast two-hybrid (Y2H) approaches, on the other hand, has been limited by the failure of pVHL to adopt its native structure and by the absence of prolylhydroxylase activity critical for pVHL substrate recognition. Therefore, we modified the Y2H system to faithfully reconstitute the physical interaction between pVHL and its substrates. Our approach relies on the coexpression of pVHL with the cofactors Elongin B and Elongin C and with HIF1/2alpha prolylhydroxylases. In a proof-of-principle Y2H screen, we identified the known substrates HIF1/2alpha and new candidate substrates including diacylglycerol kinase iota, demonstrating that our strategy allows detection of stable interactions between pVHL and otherwise elusive cellular targets. Additional future applications may include structure/function analyses of pVHL-HIF1/2alpha binding and screens for therapeutically relevant compounds that either stabilize or disrupt this interaction. [Abstract/Link to Full Text]

Conte MG, Gaillard S, Lanau N, Rouard M, Périn C
GreenPhylDB: a database for plant comparative genomics.
Nucleic Acids Res. 2007 Nov 5;
GreenPhylDB is a comprehensive platform designed to facilitate comparative functional genomics in Oryza sativa and Arabidopsis thaliana genomes. The main functions of GreenPhylDB are to assign O. sativa and A. thaliana sequences to gene families using a semi-automatic clustering procedure and to create 'orthologous' groups using a phylogenomic approach. To date, GreenPhylDB comprises the most complete list of plant gene families, which have been manually curated (6421 families). GreenPhylDB also contains all of the phylogenomic relationships computed for 4375 families. A total of 492 TAIR, 1903 InterPro and 981 KEGG families and subfamilies were manually curated using the clusters created with the TribeMCL software. GreenPhylDB integrates information from several other databases including UniProt, KEGG, InterPro, TAIR and TIGR. Several entry points can be used to display phylogenomic relationships for A. thaliana or O. sativa sequences, using TAIR, TIGR gene ID, family name, InterPro, gene alias, UniProt or protein/nucleic sequence. Finally, a powerful phylogenomics tool, GreenPhyl Ortholog Search Tool (GOST), was incorporated into GreenPhylDB to predict orthologous relationships between O. sativa/A. thaliana protein(s) and sequences from other plant species. [Abstract/Link to Full Text]

Recent Articles in Genome Research

Kukekova AV, Trut LN, Oskina IN, Johnson JL, Temnykh SV, Kharlamova AV, Shepeleva DV, Gulievich RG, Shikhevich SG, Graphodatsky AS, Aguirre GD, Acland GM
A meiotic linkage map of the silver fox, aligned and compared to the canine genome.
Genome Res. 2007 Mar;17(3):387-99.
A meiotic linkage map is essential for mapping traits of interest and is often the first step toward understanding a cryptic genome. Specific strains of silver fox (a variant of the red fox, Vulpes vulpes), which segregate behavioral and morphological phenotypes, create a need for such a map. One such strain, selected for docility, exhibits friendly dog-like responses to humans, in contrast to another strain selected for aggression. Development of a fox map is facilitated by the known cytogenetic homologies between the dog and fox, and by the availability of high resolution canine genome maps and sequence data. Furthermore, the high genomic sequence identity between dog and fox allows adaptation of canine microsatellites for genotyping and meiotic mapping in foxes. Using 320 such markers, we have constructed the first meiotic linkage map of the fox genome. The resulting sex-averaged map covers 16 fox autosomes and the X chromosome with an average inter-marker distance of 7.5 cM. The total map length corresponds to 1480.2 cM. From comparison of sex-averaged meiotic linkage maps of the fox and dog genomes, suppression of recombination in pericentromeric regions of the metacentric fox chromosomes was apparent, relative to the corresponding segments of acrocentric dog chromosomes. Alignment of the fox meiotic map against the 7.6x canine genome sequence revealed high conservation of marker order between homologous regions of the two species. The fox meiotic map provides a critical tool for genetic studies in foxes and identification of genetic loci and genes implicated in fox domestication. [Abstract/Link to Full Text]

Goios A, Pereira L, Bogue M, Macaulay V, Amorim A
mtDNA phylogeny and evolution of laboratory mouse strains.
Genome Res. 2007 Mar;17(3):293-8.
Inbred mouse strains have been maintained for more than 100 years, and they are thought to be a mixture of four different mouse subspecies. Although genealogies have been established, female inbred mouse phylogenies remain unexplored. By a phylogenetic analysis of newly generated complete mitochondrial DNA sequence data in 16 strains, we show here that all common inbred strains descend from the same Mus musculus domesticus female wild ancestor, and suggest that they present a different mitochondrial evolutionary process than their wild relatives with a faster accumulation of replacement substitutions. Our data complement forthcoming results on resequencing of a group of priority strains, and they follow recent efforts of the Mouse Phenome Project to collect and make publicly available information on various strains. [Abstract/Link to Full Text]

Zhao G, Schriefer LA, Stormo GD
Identification of muscle-specific regulatory modules in Caenorhabditis elegans.
Genome Res. 2007 Mar;17(3):348-57.
Transcriptional regulation is the major regulatory mechanism that controls the spatial and temporal expression of genes during development. This is carried out by transcription factors (TFs), which recognize and bind to their cognate binding sites. Recent studies suggest a modular organization of TF-binding sites, in which clusters of transcription-factor binding sites cooperate in the regulation of downstream gene expression. In this study, we report our computational identification and experimental verification of muscle-specific cis-regulatory modules in Caenorhabditis elegans. We first identified a set of motifs that are correlated with muscle-specific gene expression. We then predicted muscle-specific regulatory modules based on clusters of those motifs with characteristics similar to a collection of well-studied modules in other species. The method correctly identifies 88% of the experimentally characterized modules with a positive predictive value of at least 65%. The prediction accuracy of muscle-specific expression on an independent test set is highly significant (P<0.0001). We performed in vivo experimental tests of 12 predicted modules, and 10 of those drive muscle-specific gene expression. These results suggest that our method is highly accurate in identifying functional sequences important for muscle-specific gene expression and is a valuable tool for guiding experimental designs. [Abstract/Link to Full Text]
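
The two accuracy figures quoted in the abstract above (the fraction of known modules recovered, and the positive predictive value) follow the standard definitions sketched below. This is only an illustration, not the authors' code: the function names are hypothetical, and reading the in vivo result (10 of 12 tested predictions driving muscle-specific expression) as an empirical positive predictive value is an interpretation made here.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of real modules that the method recovers: TP / (TP + FN)."""
    return true_positives / (true_positives + false_negatives)


def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """Fraction of predictions that turn out to be real: TP / (TP + FP)."""
    return true_positives / (true_positives + false_positives)


# In vivo test from the abstract: 12 predicted modules were tested and
# 10 of them drove muscle-specific expression (so 2 did not).
in_vivo_ppv = positive_predictive_value(true_positives=10, false_positives=2)
print(round(in_vivo_ppv, 3))  # 0.833
```

Under that reading, the experimental rate of about 83% is consistent with the stated positive predictive value of at least 65%.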

Spady TC, Ostrander EA
Canid genomics: mapping genes for behavior in the silver fox.
Genome Res. 2007 Mar;17(3):259-63. [Abstract/Link to Full Text]

Babcock M, Yatsenko S, Stankiewicz P, Lupski JR, Morrow BE
AT-rich repeats associated with chromosome 22q11.2 rearrangement disorders shape human genome architecture on Yq12.
Genome Res. 2007 Apr;17(4):451-60.
Low copy repeats (LCRs; segmental duplications) constitute approximately 5% of the sequenced human genome. Nonallelic homologous recombination events between LCRs during meiosis can lead to chromosomal rearrangements responsible for many genomic disorders. The 22q11.2 region is susceptible to recurrent and nonrecurrent deletions and duplications, as well as translocations, that are mediated by LCRs termed LCR22s. One particular DNA structural element, a palindromic AT-rich repeat (PATRR) present within LCR22-3a, is responsible for translocations. Similar AT-rich repeats are present within the two largest LCR22s, LCR22-2 and LCR22-4. We provide direct sequence evidence that the AT-rich repeats have altered LCR22 organization during primate evolution. The AT-rich repeats are surrounded by a subtype of human satellite I (HSAT I) and an AluSc element, forming a 2.4-kb tripartite structure. Besides 22q11.2, FISH and PCR mapping localized the tripartite repeat within heterochromatic, unsequenced regions of the genome, including the pericentromeric regions of the acrocentric chromosomes and the heterochromatic portion of Yq12 in humans. The repeat is also present on autosomes but not on chromosome Y in other hominoid species, suggesting that it duplicated on Yq12 after speciation of humans from their common ancestor. This demonstrates that AT-rich repeats have shaped or altered the structure of the genome during evolution. [Abstract/Link to Full Text]

Kumar S, Filipski A
Multiple sequence alignment: in pursuit of homologous DNA positions.
Genome Res. 2007 Feb;17(2):127-35.
DNA sequence alignment is a prerequisite to virtually all comparative genomic analyses, including the identification of conserved sequence motifs, estimation of evolutionary divergence between sequences, and inference of historical relationships among genes and species. While it is mere common sense that inaccuracies in multiple sequence alignments can have detrimental effects on downstream analyses, it is important to know the extent to which the inferences drawn from these alignments are robust to errors and biases inherent in all sequence alignments. A survey of investigations into strengths and weaknesses of sequence alignments reveals, as expected, that alignment quality is generally poor for two distantly related sequences and can often be improved by adding additional sequences as stepping stones between distantly related species. Errors in sequence alignment are also found to have a significant negative effect on subsequent inference of sequence divergence, phylogenetic trees, and conserved motifs. However, our understanding of alignment biases remains rudimentary, and sequence alignment procedures continue to be used somewhat like benign formatting operations to make sequences equal in length. Because of the central role these alignments now play in our endeavors to establish the tree of life and to identify important parts of genomes through evolutionary functional genomics, we see a need for increased community effort to investigate influences of alignment bias on the accuracy of large-scale comparative genomics. [Abstract/Link to Full Text]

Kurahashi H, Inagaki H, Hosoba E, Kato T, Ohye T, Kogo H, Emanuel BS
Molecular cloning of a translocation breakpoint hotspot in 22q11.
Genome Res. 2007 Apr;17(4):461-9.
It has been well documented that 22q11 contains one of the most rearrangement-prone sites in the human genome, where the breakpoints of a number of constitutional translocations cluster. This breakage-sensitive region is located within one of the remaining unclonable gaps from the human genome project, suggestive of a specific sequence recalcitrant to cloning. In this study, we cloned a part of this gap and identified a novel 595-bp palindromic AT-rich repeat (PATRR). To date we have identified three translocation-associated PATRRs. They have common characteristics: (1) they are AT-rich nearly perfect palindromes, which are several hundred base pairs in length; (2) they possess non-AT-rich regions at both ends of the PATRR; (3) they display another nearby AT-rich region on one side of the PATRR. All of these features imply a potential for DNA secondary structure. Sequence analysis of unrelated individuals indicates no major size polymorphism, but shows minor nucleotide polymorphisms among individuals and cis-morphisms between the proximal and distal arms. Breakpoint analysis of various translocations indicates that double-strand-breakage (DSB) occurs at the center of the palindrome, often accompanied by a small symmetric deletion at the center. The breakpoints share only a small number of identical nucleotides between partner chromosomes. Taken together, these features imply that the DSBs are repaired through nonhomologous end joining or single-strand annealing rather than a homologous recombination pathway. All of these results support a previously proposed paradigm that unusual DNA secondary structure plays a role in the mechanism by which palindrome-mediated translocations occur. [Abstract/Link to Full Text]

Peters BA, St Croix B, Sjöblom T, Cummins JM, Silliman N, Ptak J, Saha S, Kinzler KW, Hatzis C, Velculescu VE
Large-scale identification of novel transcripts in the human genome.
Genome Res. 2007 Mar;17(3):287-92.
Although the sequencing of the human genome has been completed, the number and identity of genes contained within it remains to be fully determined. We used LongSAGE to analyze 660,357 human transcripts from human brain mRNA and identified expression of 17,409 known genes and >15,000 different transcripts that were not annotated in genome databases. Analysis of a subset of these unannotated transcripts suggests that 85% were differentially expressed in various tissue types and that fewer than 20% would have been detected by ab initio gene predictions. These studies suggest that the human genome contains on the order of twice as many transcribed regions as are currently annotated and that experimental approaches will be required to fully elucidate the novel genes corresponding to these transcripts. [Abstract/Link to Full Text]

Oosting J, Lips EH, van Eijk R, Eilers PH, Szuhai K, Wijmenga C, Morreau H, van Wezel T
High-resolution copy number analysis of paraffin-embedded archival tissue using SNP BeadArrays.
Genome Res. 2007 Mar;17(3):368-76.
High-density SNP microarrays provide insight into the genomic events that occur in diseases like cancer through their capability to measure both LOH and genomic copy numbers. Where currently available methods are restricted to the use of fresh frozen tissue, we now describe the design and validation of copy number measurements using the Illumina BeadArray platform and the application of this technique to formalin-fixed, paraffin-embedded (FFPE) tissue. In fresh frozen tissue from a set of colorectal tumors with numerous chromosomal aberrations, our method measures copy number patterns that are comparable to values from established platforms, like Affymetrix GeneChip and BAC array-CGH. Moreover, paired comparisons of fresh frozen and FFPE tissues showed nearly identical patterns of genomic change. We conclude that this method enables the use of paraffin-embedded material for research into both LOH and numerical chromosomal abnormalities. These findings make the large pathological archives available for genomic analysis, which could be especially relevant for hereditary disease where fresh material from affected relatives is rarely available. [Abstract/Link to Full Text]

Maydan JS, Flibotte S, Edgley ML, Lau J, Selzer RR, Richmond TA, Pofahl NJ, Thomas JH, Moerman DG
Efficient high-resolution deletion discovery in Caenorhabditis elegans by array comparative genomic hybridization.
Genome Res. 2007 Mar;17(3):337-47.
We have developed array Comparative Genomic Hybridization for Caenorhabditis elegans as a means of screening for novel induced deletions in this organism. We designed three microarrays consisting of overlapping 50-mer probes to annotated exons and micro-RNAs, the first with probes to chromosomes X and II, the second with probes to chromosome II alone, and the third with probes to the entire genome. These arrays were used to reliably detect both a large (50 kb) multigene deletion and a small (1 kb) single-gene deletion in homozygous and heterozygous samples. In one case, a deletion breakpoint was resolved to fewer than 50 bp. In an experiment designed to identify new mutations, we used the X:II and II arrays to detect deletions associated with lethal mutants on chromosome II. One is an 8-kb deletion targeting the ast-1 gene on chromosome II and another is a 141-bp deletion in the gene C06A8.1. Others span large sections of the chromosome, up to >750 kb. As a further application of array Comparative Genomic Hybridization in C. elegans, we used the whole-genome array to detect the extensive natural gene content variation (almost 2%) between the N2 Bristol strain and two other strains: CB4856, isolated in Hawaii, and JU258, isolated in Madeira. [Abstract/Link to Full Text]

Gat-Viks I, Shamir R
Refinement and expansion of signaling pathways: the osmotic response network in yeast.
Genome Res. 2007 Mar;17(3):358-67.
The analysis of large-scale genome-wide experiments carries the promise of dramatically broadening our understanding of biological networks. The challenge of systematic integration of experimental results with established biological knowledge of a pathway is still unanswered. Here we present a methodology that attempts to answer this challenge when investigating signaling pathways. We formalize existing qualitative knowledge as a probabilistic model that depicts known interactions between molecules (genes, proteins, etc.) as a network and known regulatory relations as logics. We present algorithms that analyze experimental results (e.g., transcription profiles) vis-à-vis the model and propose improvements to the model based on the fit to the experimental data. These algorithms refine the relations between model components, as well as expand the model to include new components that are regulated by components of the original network. Using our methodology, we have modeled together the knowledge of four established signaling pathways related to osmotic shock response in Saccharomyces cerevisiae. Using over 100 published transcription profiles, our refinement methodology revealed three cross talks in the network. The expansion procedure identified with high confidence large groups of genes that are coregulated by transcription factors from the original network via a common logic. The results reveal a novel delicate repressive effect of the HOG pathway on many transcriptional target genes and suggest an unexpected alternative functional mode of the MAP kinase Hog1. These results demonstrate that, by integrated analysis of data and of well-defined knowledge, one can generate concrete biological hypotheses about signaling cascades and their downstream regulatory programs. [Abstract/Link to Full Text]

Hurle B, Swanson W, Green ED
Comparative sequence analyses reveal rapid and divergent evolutionary changes of the WFDC locus in the primate lineage.
Genome Res. 2007 Mar;17(3):276-86.
The initial comparison of the human and chimpanzee genome sequences revealed 16 genomic regions with an unusually high density of rapidly evolving genes. One such region is the whey acidic protein (WAP) four-disulfide core domain locus (or WFDC locus), which contains 14 WFDC genes organized in two subloci on human chromosome 20q13. WAP protease inhibitors have roles in innate immunity and/or the regulation of a group of endogenous proteolytic enzymes called kallikreins. In human, the centromeric WFDC sublocus also contains the rapidly evolving seminal genes, semenogelin 1 and 2 (SEMG1 and SEMG2). The rate of SEMG2 evolution in primates has been proposed to correlate with female promiscuity and semen coagulation, perhaps related to post-copulatory sperm competition. We mapped and sequenced the centromeric WFDC sublocus in 12 primate species that collectively represent four different mating systems. Our analyses reveal a 130-kb region with a notably complex evolutionary history that has included nested duplications, deletions, and significant interspecies divergence of both coding and noncoding sequences; together, this has led to striking differences of this region among primates and between primates and rodents. Further, this region contains six closely linked genes (WFDC12, PI3, SEMG1, SEMG2, SLPI, and MATN4) that show strong patterns of adaptive selection, although an unambiguous correlation between gene mutation rates and mating systems could not be established. [Abstract/Link to Full Text]

Springer NM, Stupar RM
Allelic variation and heterosis in maize: how do two halves make more than a whole?
Genome Res. 2007 Mar;17(3):264-75.
In this review, we discuss the recent research on allelic variation in maize and possible implications of this work toward our understanding of heterosis. Heterosis, or hybrid vigor, is the increased performance of a hybrid relative to the parents, and is a result of the variation that is present within a species. Intraspecific comparisons of sequence and expression levels in maize have documented a surprisingly high level of allelic variation, which includes variation for the content of genic fragments, variation in repetitive elements surrounding genes, and variation in gene expression levels. There is evidence that transposons and repetitive DNA play a major role in the generation of this allelic diversity. The combination of allelic variants provides a more comprehensive suite of alleles in the hybrid that may be involved in novel allelic interactions. A major unresolved question is how the combined allelic variation and interactions in a hybrid give rise to heterotic phenotypes. An understanding of allelic variation present in maize provides an opportunity to speculate on mechanisms that might lead to heterosis. Variation for the presence of genes, the presence of novel beneficial alleles, and modified levels of gene expression in hybrids may all contribute to the heterotic phenotypes. [Abstract/Link to Full Text]

Petyuk VA, Qian WJ, Chin MH, Wang H, Livesay EA, Monroe ME, Adkins JN, Jaitly N, Anderson DJ, Camp DG, Smith DJ, Smith RD
Spatial mapping of protein abundances in the mouse brain by voxelation integrated with high-throughput liquid chromatography-mass spectrometry.
Genome Res. 2007 Mar;17(3):328-36.
Temporally and spatially resolved mapping of protein abundance patterns within the mammalian brain is of significant interest for understanding brain function and molecular etiologies of neurodegenerative diseases; however, such imaging efforts have been greatly challenged by complexity of the proteome, throughput and sensitivity of applied analytical methodologies, and accurate quantitation of protein abundances across the brain. Here, we describe a methodology for comprehensive spatial proteome mapping that addresses these challenges by employing voxelation integrated with automated microscale sample processing, high-throughput liquid chromatography (LC) system coupled with high-resolution Fourier transform ion cyclotron resonance (FTICR) mass spectrometer, and a "universal" stable isotope labeled reference sample approach for robust quantitation. We applied this methodology as a proof-of-concept trial for the analysis of protein distribution within a single coronal slice of a C57BL/6J mouse brain. For relative quantitation of the protein abundances across the slice, an 18O-isotopically labeled reference sample, derived from a whole control coronal slice from another mouse, was spiked into each voxel sample, and stable isotopic intensity ratios were used to obtain measures of relative protein abundances. In total, we generated maps of protein abundance patterns for 1028 proteins. The significant agreement of the protein distributions with previously reported data supports the validity of this methodology, which opens new opportunities for studying the spatial brain proteome and its dynamics during the course of disease progression and other important biological and associated health aspects in a discovery-driven fashion. [Abstract/Link to Full Text]

Huson DH, Auch AF, Qi J, Schuster SC
MEGAN analysis of metagenomic data.
Genome Res. 2007 Mar;17(3):377-86.
Metagenomics is the study of the genomic content of a sample of organisms obtained from a common habitat using targeted or random sequencing. Goals include understanding the extent and role of microbial diversity. The taxonomical content of such a sample is usually estimated by comparison against sequence databases of known sequences. Most published studies use the analysis of paired-end reads, complete sequences of environmental fosmid and BAC clones, or environmental assemblies. Emerging sequencing-by-synthesis technologies with very high throughput are paving the way to low-cost random "shotgun" approaches. This paper introduces MEGAN, a new computer program that allows laptop analysis of large metagenomic data sets. In a preprocessing step, the set of DNA sequences is compared against databases of known sequences using BLAST or another comparison tool. MEGAN is then used to compute and explore the taxonomical content of the data set, employing the NCBI taxonomy to summarize and order the results. A simple lowest common ancestor algorithm assigns reads to taxa such that the taxonomical level of the assigned taxon reflects the level of conservation of the sequence. The software allows large data sets to be dissected without the need for assembly or the targeting of specific phylogenetic markers. It provides graphical and statistical output for comparing different data sets. The approach is applied to several data sets, including the Sargasso Sea data set, a recently published metagenomic data set sampled from a mammoth bone, and several complete microbial genomes. Also, simulations that evaluate the performance of the approach for different read lengths are presented. [Abstract/Link to Full Text]
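
The lowest common ancestor assignment described in the abstract above can be illustrated with a short sketch. This is not MEGAN's implementation: the toy taxonomy, the function names and the assumption that each read arrives with an already filtered list of hit taxa are all hypothetical. The idea is that a read matching several taxa is placed at the deepest node shared by all of them, so widely conserved sequences land at higher taxonomic levels.

```python
from typing import Dict, List, Optional

# Hypothetical toy taxonomy: child -> parent (the root has parent None).
PARENT: Dict[str, Optional[str]] = {
    "root": None,
    "Bacteria": "root",
    "Proteobacteria": "Bacteria",
    "Escherichia coli": "Proteobacteria",
    "Salmonella enterica": "Proteobacteria",
    "Firmicutes": "Bacteria",
    "Bacillus subtilis": "Firmicutes",
}


def path_from_root(taxon: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the lineage of a taxon, ordered from the root down to the taxon."""
    path = []
    node: Optional[str] = taxon
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def lowest_common_ancestor(taxa: List[str], parent: Dict[str, Optional[str]]) -> str:
    """Return the deepest taxon shared by the lineages of all given taxa."""
    if not taxa:
        raise ValueError("a read needs at least one hit to be assigned")
    paths = [path_from_root(t, parent) for t in taxa]
    lca = paths[0][0]  # every lineage starts at the root
    # Walk down the lineages level by level until they disagree.
    for level in zip(*paths):
        if all(name == level[0] for name in level):
            lca = level[0]
        else:
            break
    return lca


# A read hitting two closely related species stays at a low level;
# a read hitting distant species is pushed up the tree.
print(lowest_common_ancestor(["Escherichia coli", "Salmonella enterica"], PARENT))
print(lowest_common_ancestor(["Escherichia coli", "Bacillus subtilis"], PARENT))
```

A read with a single hit is assigned to that taxon itself, which matches the intuition that a species-specific sequence should be placed at species level.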

Itoh T, Tanaka T, Barrero RA, Yamasaki C, Fujii Y, Hilton PB, Antonio BA, Aono H, Apweiler R, Bruskiewich R, Bureau T, Burr F, Costa de Oliveira A, Fuks G, Habara T, Haberer G, Han B, Harada E, Hiraki AT, Hirochika H, Hoen D, Hokari H, Hosokawa S, Hsing YI, Ikawa H, Ikeo K, Imanishi T, Ito Y, Jaiswal P, Kanno M, Kawahara Y, Kawamura T, Kawashima H, Khurana JP, Kikuchi S, Komatsu S, Koyanagi KO, Kubooka H, Lieberherr D, Lin YC, Lonsdale D, Matsumoto T, Matsuya A, McCombie WR, Messing J, Miyao A, Mulder N, Nagamura Y, Nam J, Namiki N, Numa H, Nurimoto S, O'Donovan C, Ohyanagi H, Okido T, Oota S, Osato N, Palmer LE, Quetier F, Raghuvanshi S, Saichi N, Sakai H, Sakai Y, Sakata K, Sakurai T, Sato F, Sato Y, Schoof H, Seki M, Shibata M, Shimizu Y, Shinozaki K, Shinso Y, Singh NK, Smith-White B, Takeda J, Tanino M, Tatusova T, Thongjuea S, Todokoro F, Tsugane M, Tyagi AK, Vanavichit A, Wang A, Wing RA, Yamaguchi K, Yamamoto M, Yamamoto N, Yu Y, Zhang H, Zhao Q, Higo K, Burr B, Gojobori T, Sasaki T
Curated genome annotation of Oryza sativa ssp. japonica and comparative genome analysis with Arabidopsis thaliana.
Genome Res. 2007 Feb;17(2):175-83.
We present here the annotation of the complete genome of rice Oryza sativa L. ssp. japonica cultivar Nipponbare. All functional annotations for proteins and non-protein-coding RNA (npRNA) candidates were manually curated. Functions were identified or inferred in 19,969 (70%) of the proteins, and 131 possible npRNAs (including 58 antisense transcripts) were found. Almost 5000 annotated protein-coding genes were found to be disrupted in insertional mutant lines, which will accelerate future experimental validation of the annotations. The rice loci were determined by using cDNA sequences obtained from rice and other representative cereals. Our conservative estimate based on these loci and an extrapolation suggested that the gene number of rice is approximately 32,000, which is smaller than previous estimates. We conducted comparative analyses between rice and Arabidopsis thaliana and found that both genomes possessed several lineage-specific genes, which might account for the observed differences between these species, while they had similar sets of predicted functional domains among the protein sequences. A system to control translational efficiency seems to be conserved across large evolutionary distances. Moreover, the evolutionary process of protein-coding genes was examined. Our results suggest that natural selection may have played a role for duplicated genes in both species, so that duplication was suppressed or favored in a manner that depended on the function of a gene. [Abstract/Link to Full Text]

Tian B, Pan Z, Lee JY
Widespread mRNA polyadenylation events in introns indicate dynamic interplay between polyadenylation and splicing.
Genome Res. 2007 Feb;17(2):156-65.
mRNA polyadenylation and pre-mRNA splicing are two essential steps for the maturation of most human mRNAs. Studies have shown that some genes generate mRNA variants involving both alternative polyadenylation and alternative splicing. Polyadenylation in introns can lead to conversion of an internal exon to a 3' terminal exon, which is termed a composite terminal exon, or usage of a 3' terminal exon that is otherwise skipped, which is termed a skipped terminal exon. Using cDNA/EST and genome sequences, we identified polyadenylation sites in introns for all currently known human genes. We found that approximately 20% of human genes have at least one intronic polyadenylation event that can potentially lead to mRNA variants, most of which encode different protein products. The conservation of human intronic poly(A) sites in mouse and rat genomes is lower than that of poly(A) sites in 3'-most exons. Quantitative analysis of a number of mRNA variants generated by intronic poly(A) sites suggests that the intronic polyadenylation activity can vary under different cellular conditions for most genes. Furthermore, we found that a weak 5' splice site and large intron size are the determining factors controlling the usage of composite terminal exon poly(A) sites, whereas skipped terminal exon poly(A) sites tend to be associated with strong polyadenylation signals. Thus, our data indicate that dynamic interplay between polyadenylation and splicing leads to widespread polyadenylation in introns and contributes to the complexity of the transcriptome in the cell. [Abstract/Link to Full Text]

Tenney AE, Wu JQ, Langton L, Klueh P, Quatrano R, Brent MR
A tale of two templates: automatically resolving double traces has many applications, including efficient PCR-based elucidation of alternative splices.
Genome Res. 2007 Feb;17(2):212-8.
Trace Recalling is a novel method for deconvoluting double traces that result from simultaneously sequencing two DNA templates. Trace Recalling identifies up to two bases at each position of such a trace. The resulting ambiguity sequence is aligned to the genome, identifying one template sequence. A second template sequence is then inferred from this alignment. This technique makes possible many exciting biological applications. Here we present two such applications, alternate splice finding and elucidation of multiple insertion sites in a random insertional mutagenesis library. Our results demonstrate that RT-PCR followed by Trace Recalling is a more efficient and cost effective way to find alternate splices than traditional methods. We also present a method for mapping double-insertion events in a random insertional-mutagenesis library. [Abstract/Link to Full Text]
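
The step in which a second template is inferred from the ambiguity sequence can be sketched in its simplest form. This is a hypothetical illustration, not the authors' implementation: it assumes the ambiguity sequence uses standard IUPAC two-base codes, that the first template has already been aligned position by position, and that the two templates have equal length with no offset or indels.

```python
# Standard IUPAC nucleotide codes for one base or a pair of bases.
IUPAC = {
    "A": {"A"}, "C": {"C"}, "G": {"G"}, "T": {"T"},
    "R": {"A", "G"}, "Y": {"C", "T"}, "S": {"C", "G"},
    "W": {"A", "T"}, "K": {"G", "T"}, "M": {"A", "C"},
}


def infer_second_template(ambiguity: str, first: str) -> str:
    """Subtract the known first template from a two-base ambiguity sequence.

    Simplifying assumption: both sequences are already aligned,
    have the same length, and contain no gaps.
    """
    if len(ambiguity) != len(first):
        raise ValueError("sequences must have the same length")
    second = []
    for code, base in zip(ambiguity, first):
        bases = IUPAC[code]
        if base not in bases:
            raise ValueError(f"{base} is not compatible with code {code}")
        remaining = bases - {base}
        # If only one base was called, both templates share it.
        second.append(remaining.pop() if remaining else base)
    return "".join(second)


print(infer_second_template("ARYT", "AACT"))
```

Real double traces from alternative splice forms are typically shifted relative to each other, so a practical version would need to handle the alignment that this sketch takes as given.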

Baek D, Davis C, Ewing B, Gordon D, Green P
Characterization and predictive discovery of evolutionarily conserved mammalian alternative promoters.
Genome Res. 2007 Feb;17(2):145-55.
Recent studies suggest that surprisingly many mammalian genes have alternative promoters (APs); however, their biological roles, and the characteristics that distinguish them from single promoters (SPs), remain poorly understood. We constructed a large data set of evolutionarily conserved promoters, and used it to identify sequence features, functional associations, and expression patterns that differ by promoter type. The four promoter categories CpG-rich APs, CpG-poor APs, CpG-rich SPs, and CpG-poor SPs each show characteristic strengths and patterns of sequence conservation, frequencies of putative transcription-related motifs, and tissue and developmental stage expression preferences. APs display substantially higher sequence conservation than SPs and CpG-poor promoters than CpG-rich promoters. Among CpG-poor promoters, APs and SPs show sharply contrasting developmental stage preferences and TATA box frequencies. We developed a discriminator to computationally predict promoter type, verified its accuracy through experimental tests that incorporate a novel method for deconvolving mixed sequence traces, and used it to find several new APs. The discriminator predicts that almost half of all mammalian genes have evolutionarily conserved APs. This high frequency of APs, together with the strong purifying selection maintaining them, implies a crucial role in expanding the expression diversity of the mammalian genome. [Abstract/Link to Full Text]

Stinear TP, Seemann T, Pidot S, Frigui W, Reysset G, Garnier T, Meurice G, Simon D, Bouchier C, Ma L, Tichit M, Porter JL, Ryan J, Johnson PD, Davies JK, Jenkin GA, Small PL, Jones LM, Tekaia F, Laval F, Daffé M, Parkhill J, Cole ST
Reductive evolution and niche adaptation inferred from the genome of Mycobacterium ulcerans, the causative agent of Buruli ulcer.
Genome Res. 2007 Feb;17(2):192-200.
Mycobacterium ulcerans is found in aquatic ecosystems and causes Buruli ulcer in humans, a neglected but devastating necrotic disease of subcutaneous tissue that is rampant throughout West and Central Africa. Here, we report the complete 5.8-Mb genome sequence of M. ulcerans and show that it comprises two circular replicons, a chromosome of 5632 kb and a virulence plasmid of 174 kb. The plasmid is required for production of the polyketide toxin mycolactone, which provokes necrosis. Comparisons with the recently completed 6.6-Mb genome of Mycobacterium marinum revealed >98% nucleotide sequence identity and genome-wide synteny. However, as well as the plasmid, M. ulcerans has accumulated 213 copies of the insertion sequence IS2404, 91 copies of IS2606, 771 pseudogenes, two bacteriophages, and multiple DNA deletions and rearrangements. These data indicate that M. ulcerans has recently evolved via lateral gene transfer and reductive evolution from the generalist, more rapid-growing environmental species M. marinum to become a niche-adapted specialist. Predictions based on genome inspection for the production of modified mycobacterial virulence factors, such as the highly abundant phthiodiolone lipids, were confirmed by structural analyses. Similarly, 11 protein-coding sequences identified as M. ulcerans-specific by comparative genomics were verified as such by PCR screening a diverse collection of 33 strains of M. ulcerans and M. marinum. This work offers significant insight into the biology and evolution of mycobacterial pathogens and is an important component of international efforts to counter Buruli ulcer. [Abstract/Link to Full Text]

Pennacchio LA, Loots GG, Nobrega MA, Ovcharenko I
Predicting tissue-specific enhancers in the human genome.
Genome Res. 2007 Feb;17(2):201-11.
Determining how transcriptional regulatory signals are encoded in vertebrate genomes is essential for understanding the origins of multicellular complexity; yet the genetic code of vertebrate gene regulation remains poorly understood. In an attempt to elucidate this code, we synergistically combined genome-wide gene-expression profiling, vertebrate genome comparisons, and transcription factor binding-site analysis to define sequence signatures characteristic of candidate tissue-specific enhancers in the human genome. We applied this strategy to microarray-based gene expression profiles from 79 human tissues and identified 7187 candidate enhancers that defined their flanking gene expression, the majority of which were located outside of known promoters. We cross-validated this method for its ability to de novo predict tissue-specific gene expression and confirmed its reliability in 57 of the 79 available human tissues, with an average precision in enhancer recognition ranging from 32% to 63% and a sensitivity of 47%. We used the sequence signatures identified by this approach to successfully assign tissue-specific predictions to approximately 328,000 human-mouse conserved noncoding elements in the human genome. By overlapping these genome-wide predictions with a data set of enhancers validated in vivo, in transgenic mice, we were able to confirm our results with a 28% sensitivity and 50% precision. These results indicate the power of combining complementary genomic data sets as an initial computational foray into a global view of tissue-specific gene regulation in vertebrates. [Abstract/Link to Full Text]

Shi P, Zhang J
Comparative genomic analysis identifies an evolutionary shift of vomeronasal receptor gene repertoires in the vertebrate transition from water to land.
Genome Res. 2007 Feb;17(2):166-74.
Two evolutionarily unrelated superfamilies of G-protein coupled receptors, V1Rs and V2Rs, bind pheromones and "ordinary" odorants to initiate vomeronasal chemical senses in vertebrates, which play important roles in many aspects of an organism's daily life such as mating, territoriality, and foraging. To study the macroevolution of vomeronasal sensitivity, we identified all V1R and V2R genes from the genome sequences of 11 vertebrates. Our analysis suggests the presence of multiple V1R and V2R genes in the common ancestor of teleost fish and tetrapods and reveals an exceptionally large among-species variation in the sizes of these gene repertoires. Interestingly, the ratio of the number of intact V1R genes to that of V2R genes increased by approximately 50-fold as land vertebrates evolved from aquatic vertebrates. A similar increase was found for the ratio of the number of class II odorant receptor (OR) genes to that of class I genes, but not in other vertebrate gene families. Because V1Rs and class II ORs have been suggested to bind to small airborne chemicals, whereas V2Rs and class I ORs recognize water-soluble molecules, these increases reflect a rare case of adaptation to terrestrial life at the gene family level. Several gene families known to function in concert with V2Rs in the mouse are absent outside rodents, indicating rapid changes of interactions between vomeronasal receptors and their molecular partners. Taken together, our results demonstrate the exceptional evolutionary fluidity of vomeronasal receptors, making them excellent targets for studying the molecular basis of physiological and behavioral diversity and adaptation. [Abstract/Link to Full Text]

Ganley AR, Kobayashi T
Highly efficient concerted evolution in the ribosomal DNA repeats: total rDNA repeat variation revealed by whole-genome shotgun sequence data.
Genome Res. 2007 Feb;17(2):184-91.
Repeat families within genomes are often maintained with similar sequences. Traditionally, this has been explained by concerted evolution, where repeats in an array evolve "in concert" with the same sequence via continual turnover of repeats by recombination. Another form of evolution, birth-and-death evolution, can also explain this pattern, although in this case selection is the critical force maintaining the repeats. The level of intragenomic variation is the key difference between these two forms of evolution. The prohibitive size and repetitive nature of large repeat arrays have made determination of the absolute level of intragenomic repeat variability difficult, thus there is little evidence to support concerted evolution over birth-and-death evolution for many large repeat arrays. Here we use whole-genome shotgun sequence data from the genome projects of five fungal species to reveal absolute levels of sequence variation within the ribosomal RNA gene repeats (rDNA). The level of sequence variation is remarkably low. Furthermore, the polymorphisms that are detected are not functionally constrained and seem to exist beneath the level of selection. These results suggest the rDNA is evolving via concerted evolution. Comparisons with a repeat array undergoing birth-and-death evolution provide a clear contrast in the level of repeat array variation between these two forms of evolution, confirming that the rDNA indeed does evolve via concerted evolution. These low levels of intra-genomic variation are consistent with a model of concerted evolution in which homogenization is very rapid and efficiently maintains highly similar repeat arrays. [Abstract/Link to Full Text]

Cooper SJ, Trinklein ND, Nguyen L, Myers RM
Serum response factor binding sites differ in three human cell types.
Genome Res. 2007 Feb;17(2):136-44.
The serum response factor (SRF) is essential for embryonic development and maintenance of muscle cells and neurons. The mechanism by which this factor controls these divergent pathways is unclear. Here we present a genome-wide view of occupancy of SRF at its binding sites with a focus on those that vary with cell type. We used chromatin immunoprecipitation (ChIP) in combination with human promoter microarrays to identify 216 putative SRF binding sites in the human genome. We performed independent quantitative PCR validation at over half of these sites that resulted in 146 sites we assert to be true binding sites at over 90% confidence. Nearly half of the sites are bound by SRF in only one of the three cell types we tested, providing strong evidence for the diverse roles for SRF in different cell types. We also explore possible mechanisms controlling differential binding of SRF in these cell types by assaying cofactor binding, DNA methylation, histone methylation, and histone acetylation at a subset of sites bound preferentially in smooth muscle cells. Although we did not see a strong correlation between SRF binding and epigenetic modifications at these sites, we propose that SRF cofactors may play an important role in determining cell-dependent SRF binding sites. ELK4 (previously known as SAP-1 [SRF-associated protein-1]) is ubiquitously expressed. Therefore, we expected it to occupy sites where SRF binding is common in all cell types. Indeed, 90% of SRF sites also bound by ELK4 were common to all three cell types. Together, our data provide a more complete understanding of the regulatory network controlled by SRF. [Abstract/Link to Full Text]

Tanner S, Shen Z, Ng J, Florea L, Guigó R, Briggs SP, Bafna V
Improving gene annotation using peptide mass spectrometry.
Genome Res. 2007 Feb;17(2):231-9.
Annotation of protein-coding genes is a key goal of genome sequencing projects. In spite of tremendous recent advances in computational gene finding, comprehensive annotation remains a challenge. Peptide mass spectrometry is a powerful tool for researching the dynamic proteome and suggests an attractive approach to discover and validate protein-coding genes. We present algorithms to construct and efficiently search spectra against a genomic database, with no prior knowledge of encoded proteins. By searching a corpus of 18.5 million tandem mass spectra (MS/MS) from human proteomic samples, we validate 39,000 exons and 11,000 introns at the level of translation. We present translation-level evidence for novel or extended exons in 16 genes, confirm translation of 224 hypothetical proteins, and discover or confirm over 40 alternative splicing events. Polymorphisms are efficiently encoded in our database, allowing us to observe variant alleles for 308 coding SNPs. Finally, we demonstrate the use of mass spectrometry to improve automated gene prediction, adding 800 correct exons to our predictions using a simple rescoring strategy. Our results demonstrate that proteomic profiling should play a role in any genome sequencing project. [Abstract/Link to Full Text]

Miller MR, Dunham JP, Amores A, Cresko WA, Johnson EA
Rapid and cost-effective polymorphism identification and genotyping using restriction site associated DNA (RAD) markers.
Genome Res. 2007 Feb;17(2):240-8.
Restriction site associated DNA (RAD) tags are a genome-wide representation of every site of a particular restriction enzyme by short DNA tags. Most organisms segregate large numbers of DNA sequence polymorphisms that disrupt restriction sites, which allows RAD tags to serve as genetic markers spread at a high density throughout the genome. Here, we demonstrate the applicability of RAD markers for both individual and bulk-segregant genotyping. First, we show that these markers can be identified and typed on pre-existing microarray formats. Second, we present a method that uses RAD marker DNA to rapidly produce a low-cost microarray genotyping resource that can be used to efficiently identify and type thousands of RAD markers. We demonstrate the utility of the former approach by using a tiling path array for the fruit fly to map a recombination breakpoint, and the latter approach by creating and using an enriched RAD marker array for the threespine stickleback. The high number of RAD markers enabled localization of a previously identified region, as well as a second region also associated with the lateral plate phenotype. Taken together, our results demonstrate that RAD markers, and the method to develop a RAD marker microarray resource, allow high-throughput, high-resolution genotyping in both model and nonmodel systems. [Abstract/Link to Full Text]

Bhowmick BK, Satta Y, Takahata N
The origin and evolution of human ampliconic gene families and ampliconic structure.
Genome Res. 2007 Apr;17(4):441-50.
Out of the nine male-specific gene families in the human Y chromosome amplicons, we investigate the origin and evolution of seven families for which gametologous and orthologous sequences are available. Proto-X/Y gene pairs in the original mammalian sex chromosomes played major roles in origins and gave rise to five gene families: XKRY, VCY, HSFY, RBMY, and TSPY. The divergence times between gametologous X- and Y-linked copies in these families are well correlated with the former X-chromosomal locations. The CDY and DAZ families originated exceptionally by retroposition and transposition of autosomal copies, respectively, but CDY possesses an X-linked copy of enigmatic origin. We also investigate the evolutionary relatedness among Y-linked copies of a gene family in light of their ampliconic locations (palindromes, inverted repeats, and the TSPY array). Although any pair of copies located at the same arm positions within a palindrome is identical or nearly so by frequent gene conversion, copies located at different arm positions are distinctively different. Since these and other distinct copies in various gene families were amplified almost simultaneously in the stem lineage of Catarrhini, we take these simultaneous amplifications as evidence for the elaborate formation of Y ampliconic structure. Curiously, some copies in a gene family located at different palindromes exhibit high sequence similarity, and in most cases, such similarity greatly extends to repeat units that harbor these copies. It appears that such palindromic repeat units have evolved by and large en bloc, but they have undergone frequent exchanges between palindromes. [Abstract/Link to Full Text]

Bansal V, Bashir A, Bafna V
Evidence for large inversion polymorphisms in the human genome from HapMap data.
Genome Res. 2007 Feb;17(2):219-30.
Knowledge about structural variation in the human genome has grown tremendously in the past few years. However, inversions represent a class of structural variation that remains difficult to detect. We present a statistical method to identify large inversion polymorphisms using unusual Linkage Disequilibrium (LD) patterns from high-density SNP data. The method is designed to detect chromosomal segments that are inverted (in a majority of the chromosomes) in a population with respect to the reference human genome sequence. We demonstrate the power of this method to detect such inversion polymorphisms through simulations done using the HapMap data. Application of this method to the data from the first phase of the International HapMap project resulted in 176 candidate inversions ranging from 200 kb to several megabases in length. Our predicted inversions include an 800-kb polymorphic inversion at 7p22, a 1.1-Mb inversion at 16p12, and a novel 1.2-Mb inversion on chromosome 10 that is supported by the presence of two discordant fosmids. Analysis of the genomic sequence around inversion breakpoints showed that 11 predicted inversions are flanked by pairs of highly homologous repeats in the inverted orientation. In addition, for three candidate inversions, the inverted orientation is represented in the Celera genome assembly. Although the power of our method to detect inversions is restricted because of inherently noisy LD patterns in population data, inversions predicted by our method represent strong candidates for experimental validation and analysis. [Abstract/Link to Full Text]

Roberto R, Capozzi O, Wilson RK, Mardis ER, Lomiento M, Tuzun E, Cheng Z, Mootnick AR, Archidiacono N, Rocchi M, Eichler EE
Molecular refinement of gibbon genome rearrangements.
Genome Res. 2007 Feb;17(2):249-57.
The gibbon karyotype is known to be extensively rearranged when compared to the human and to the ancestral primate karyotype. By combining a bioinformatics (paired-end sequence analysis) approach and a molecular cytogenetics approach, we have refined the synteny block arrangement of the white-cheeked gibbon (Nomascus leucogenys, NLE) with respect to the human genome. We provide the first detailed clone framework map of the gibbon genome and refine the location of 86 evolutionary breakpoints to <1 Mb resolution. An additional 12 breakpoints, mapping primarily to centromeric and telomeric regions, were mapped to approximately 5 Mb resolution. Our combined FISH and BES analysis indicates that we have effectively subcloned 49 of these breakpoints within NLE gibbon BAC clones, mapped to a median resolution of 79.7 kb. Interestingly, many of the intervals associated with translocations were gene-rich, including some genes associated with normal skeletal development. Comparisons of NLE breakpoints with those of other gibbon species reveal variability in the position, suggesting that chromosomal rearrangement has been a longstanding property of this particular ape lineage. Our data emphasize the synergistic effect of combining computational genomics and cytogenetics and provide a framework for ultimate sequence and assembly of the gibbon genome. [Abstract/Link to Full Text]

Giresi PG, Kim J, McDaniell RM, Iyer VR, Lieb JD
FAIRE (Formaldehyde-Assisted Isolation of Regulatory Elements) isolates active regulatory elements from human chromatin.
Genome Res. 2007 Jun;17(6):877-85.
DNA segments that actively regulate transcription in vivo are typically characterized by eviction of nucleosomes from chromatin and are experimentally identified by their hypersensitivity to nucleases. Here we demonstrate a simple procedure for the isolation of nucleosome-depleted DNA from human chromatin, termed FAIRE (Formaldehyde-Assisted Isolation of Regulatory Elements). To perform FAIRE, chromatin is crosslinked with formaldehyde in vivo, sheared by sonication, and phenol-chloroform extracted. The DNA recovered in the aqueous phase is fluorescently labeled and hybridized to a DNA microarray. FAIRE performed in human cells strongly enriches DNA coincident with the location of DNaseI hypersensitive sites, transcriptional start sites, and active promoters. Evidence for cell-type-specific patterns of FAIRE enrichment is also presented. FAIRE has utility as a positive selection for genomic regions associated with regulatory activity, including regions traditionally detected by nuclease hypersensitivity assays. [Abstract/Link to Full Text]

Ichiyanagi K, Nakajima R, Kajikawa M, Okada N
Novel retrotransposon analysis reveals multiple mobility pathways dictated by hosts.
Genome Res. 2007 Jan;17(1):33-41.
Autonomous non-long-terminal-repeat retrotransposons (NLRs) proliferate by retrotransposition via coordinated reactions of target DNA cleavage and reverse transcription by a mechanism called target-primed reverse transcription (TPRT). Whereas this mechanism guarantees the covalent attachment of the NLR and its target site at the 3' junction, mechanisms for the joining at the 5' junction have been conjectural. To better understand the retrotransposition pathways, we analyzed target-NLR junctions of zebrafish NLRs with a new method of identifying genomic copies that reside within other transposons, termed "target analysis of nested transposons" (TANT). Application of the TANT method revealed various features of the zebrafish NLR integrants; for example, half of the integrants carry extra nucleotides at the 5' junction, which is in stark contrast to the major human NLR, LINE-1. Interestingly, in a cell culture assay, retrotransposition of the zebrafish NLR in heterologous human cells did not bear extra 5' nucleotides, indicating that the choice of the 5' joining pathway is affected by the host. Our results suggest that several pathways exist for NLR retrotransposition and argue in favor of host protein involvement. With genomic sequence information accumulating exponentially, our data demonstrate the general applicability of the TANT method for the analysis of a wide variety of retrotransposons. [Abstract/Link to Full Text]

Paschou P, Mahoney MW, Javed A, Kidd JR, Pakstis AJ, Gu S, Kidd KK, Drineas P
Intra- and interpopulation genotype reconstruction from tagging SNPs.
Genome Res. 2007 Jan;17(1):96-107.
The optimal method to be used for tSNP selection, the applicability of a reference LD map to unassayed populations, and the scalability of these methods to genome-wide analysis, all remain subjects of debate. We propose novel, scalable matrix algorithms that address these issues and we evaluate them on genotypic data from 38 populations and four genomic regions (248 SNPs typed for approximately 2000 individuals). We also evaluate these algorithms on a second data set consisting of genotypes available from the HapMap database (1336 SNPs for four populations) over the same genomic regions. Furthermore, we test these methods in the setting of a real association study using a publicly available family data set. The algorithms we use for tSNP selection and unassayed SNP reconstruction do not require haplotype inference and they are, in principle, scalable even to genome-wide analysis. Moreover, they are greedy variants of recently developed matrix algorithms with provable performance guarantees. Using a small set of carefully selected tSNPs, we achieve very good reconstruction accuracy of "untyped" genotypes for most of the populations studied. Additionally, we demonstrate in a quantitative manner that the chosen tSNPs exhibit substantial transferability, both within and across different geographic regions. Finally, we show that reconstruction can be applied to retrieve significant SNP associations with disease, with important genotyping savings. [Abstract/Link to Full Text]

Recent Articles in Journal of Applied Genetics

Mostowska A, Biedziak B, Trzeciak WH
A novel c.581C>T transition localized in a highly conserved homeobox sequence of MSX1: is it responsible for oligodontia?
J Appl Genet. 2006;47(2):159-64.
Even though selective tooth agenesis is the most common developmental anomaly of human dentition, its genetic background still remains poorly understood. To date, familial as well as sporadic forms of both hypodontia and oligodontia have been associated with mutations or polymorphisms of MSX1, PAX9, AXIN2 and TGFa, whose protein products play a crucial role in odontogenesis. In the present report we describe a novel mutation of MSX1, which might be responsible for the lack of 14 permanent teeth in our proband. However, this c.581C>T transition, localized in a highly conserved homeobox sequence of MSX1, was also identified in 2 healthy individuals from the proband's family. Our finding suggests that this transition might be the first described mutation of MSX1 that is responsible for oligodontia and shows incomplete penetrance. It may also support the view that this common anomaly of human dentition might be an oligogenic trait caused by simultaneous mutations of different genes. [Abstract/Link to Full Text]

Van Allen MI, Boyle E, Thiessen P, McFadden D, Cochrane D, Chambers GK, Langlois S, Stathers P, Irwin B, Cairns E, MacLeod P, Delisle MF, Uh SH
The impact of prenatal diagnosis on neural tube defect (NTD) pregnancy versus birth incidence in British Columbia.
J Appl Genet. 2006;47(2):151-8.
The birth incidence of neural tube defect (NTD) cases in British Columbia (B.C.), and elsewhere in North America, is reported to be declining. This decline is being attributed to folic acid (FA) supplementation and food fortification, but 2nd trimester prenatal screening of pregnancies for NTDs and other congenital anomalies has increased during this timeframe, as well. This descriptive, population-based study evaluates the impact of prenatal screening of NTD-affected pregnancies on (1) pregnancy outcome and (2) reporting of NTD births to the provincial Health Status Registry (B.C.H.S.R.); and it assesses (3) the use of periconceptional FA supplementation. NTD cases were ascertained from medical records of health centres providing care to families with NTD-affected pregnancies and newborns; and from NTD cases reported to the B.C.H.S.R. In 1997-1999, the B.C.H.S.R. published an NTD incidence of 0.77/1000. In this study, 151 NTD-affected pregnancies were identified, with an incidence of 1.16/1000. Partial reporting of induced abortions resulted in an NTD incidence 45.5% lower than the actual incidence. Medical records were available for review on 144/151 pregnancies. Prenatal screening identified 86.1% (124/144) of NTD-affected pregnancies, with 72.6% (90/124) resulting in pregnancy termination, and 27.4% (34/124) continuing to term. Use of FA supplementation in the periconceptional period was recorded in 36.4% of pregnancies (39/107). Thus in B.C. the decline in the NTD incidence is due predominantly to pregnancy terminations following prenatal diagnosis, which reduces the NTD incidence by 60%, from 1.16/1000 to 0.47/1000. Continued efforts for primary and the option of secondary prevention of NTDs are recommended in order to improve newborn health in B.C. and elsewhere. These interventions need to be monitored, however, for optimal health care planning. [Abstract/Link to Full Text]
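
The proportions and the relative reduction quoted in this abstract follow from simple ratios, which the short sketch below reproduces. The helper names are illustrative and the counts are taken directly from the abstract.

```python
def percent(part: float, whole: float) -> float:
    """Share of `part` in `whole`, as a percentage rounded to one decimal."""
    return round(100.0 * part / whole, 1)


def relative_reduction(before: float, after: float) -> float:
    """Relative drop from `before` to `after`, as a rounded percentage."""
    return round(100.0 * (before - after) / before, 1)


# Counts and incidences as stated in the abstract.
screen_detected = percent(124, 144)   # pregnancies identified by prenatal screening
terminated = percent(90, 124)         # screen-detected pregnancies terminated
continued = percent(34, 124)          # screen-detected pregnancies continued to term
fa_use = percent(39, 107)             # recorded periconceptional FA supplementation
incidence_drop = relative_reduction(1.16, 0.47)  # per-1000 incidences

print(screen_detected, terminated, continued, fa_use, incidence_drop)
```

The computed reduction is slightly under 60%, which matches the rounded figure given in the abstract.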

Wertelecki W
Birth defects surveillance in Ukraine: a process.
J Appl Genet. 2006;47(2):143-9.
Birth defects (BD) surveillance using international standards was introduced in Ukraine by a network of five BD centers located in northwestern, central and southern regions. BD centers provide resources to access current and comprehensive information and to nurture partnerships with physicians, administrators, parental support groups, educators, and humanitarian assistance organizations. One outcome was the vigorous and popular website International BD Information Systems (IBIS). The network is now incorporated as OMNI-Net Ukraine. The program has documented high prevalence rates of neural tube defects (NTD); fetal alcohol effects (FAE); and idiopathic developmental retardation among orphans that prompted prevention and amelioration initiatives. Further program objectives include: universal folic acid flour fortification, as recommended by the Ukrainian Academy of Medicine; continued research on methods to reduce FAE in collaboration with partners from California; opening other early infant stimulation centers funded by local authorities, modeled on those in Rivne and Lutsk; and linking BD prevention with bioethical considerations, which is a topic of interest in Ukraine in part enhanced by the effects of Chornobyl. [Abstract/Link to Full Text]

Kmieć M, Terman A
Associations between the prolactin receptor gene polymorphism and reproductive traits of boars.
J Appl Genet. 2006;47(2):139-41.
The prolactin receptor gene (PRLR), located on chromosome 16 in pigs, is a candidate gene for reproductive traits. The experiment aimed to detect DNA mutations in this gene and to find probable relations between the genotype and some reproductive traits in boars. The polymorphism in the PRLR gene was identified by the PCR-RFLP method using specific primers and the restriction enzyme AluI. In total 229 boars of various breeds were genotyped. The frequency of allele A was estimated at 0.62 and allele B at 0.38. Genotype AA was found at a frequency of 0.45, AB at 0.35 and BB at 0.20. We found associations between PRLR genotype and ejaculate volume, sperm concentration, percentage of live sperm, and number of live sperm in the ejaculate (P < 0.01). [Abstract/Link to Full Text]
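
The allele frequencies reported here can be recovered from the genotype frequencies by simple gene counting. The following Python sketch shows that calculation. The Hardy-Weinberg expectations at the end are an added illustration, not a result from the abstract, and all names are hypothetical.

```python
# Genotype frequencies as reported in the abstract.
genotype_freq = {"AA": 0.45, "AB": 0.35, "BB": 0.20}

# Gene counting: each heterozygote contributes half of its frequency
# to each allele.
freq_A = genotype_freq["AA"] + genotype_freq["AB"] / 2
freq_B = genotype_freq["BB"] + genotype_freq["AB"] / 2

# Illustration only (not reported in the abstract): genotype proportions
# expected under Hardy-Weinberg equilibrium for these allele frequencies.
expected_hw = {
    "AA": freq_A ** 2,
    "AB": 2 * freq_A * freq_B,
    "BB": freq_B ** 2,
}

print(f"A = {freq_A:.3f}, B = {freq_B:.3f}")
for genotype, expected in expected_hw.items():
    observed = genotype_freq[genotype]
    print(f"{genotype}: observed {observed:.3f}, expected {expected:.3f}")
```

Gene counting gives 0.625 for allele A and 0.375 for allele B, consistent with the reported values of 0.62 and 0.38 once rounding is allowed for.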

Wyszyńska-Koko J, Pierzchała M, Flisikowski K, Kamyczek M, Rózycki M, Kurył J
Polymorphisms in coding and regulatory regions of the porcine MYF6 and MYOG genes and expression of the MYF6 gene in m. longissimus dorsi versus productive traits in pigs.
J Appl Genet. 2006;47(2):131-8.
MYOG and MYF6 belong to the MyoD gene family. They code for the bHLH transcription factors playing a key role in later stages of myogenesis: differentiation and maturation of myotubes. Three SNPs in porcine MYF6 and two in porcine MYOG were analysed in order to establish associations with chosen carcass quality and growth rate traits in Polish Landrace, Polish Large White and line 990 sows. No statistically significant effect of the SNP in the promoter region of the MYF6 gene on its expression measured at the mRNA level was found. Associations between the genotype at the MYF6 locus and carcass quality traits appeared to be breed-dependent. The C allele in the case of the SNP in the promoter region and the GC haplotype in exon 1 were advantageous for right carcass side weight in Polish Landrace sows and disadvantageous for this trait in Polish Large White sows. These gene variants were also the most advantageous for loin and ham weight in sows of line 990. The mutation in exon 1 of the MYOG gene had no statistically significant association with carcass quality traits, and the mutation in the 3'-flanking region had a breed-dependent effect as well. These results suggest that the SNPs analysed in this study are not causative mutations, but can be considered as markers of some other, still unrevealed genetic polymorphism that influences the physiological processes and phenotypic traits considered in this study. [Abstract/Link to Full Text]

Strabel T, Jankowski T, Jamrozik J
Adjustments for heterogeneous herd-year variances in a random regression model for genetic evaluations of Polish Black-and-White cattle.
J Appl Genet. 2006;47(2):125-30.
The study investigated the existence of heterogeneous variance in first-lactation daily milk yield of Polish Black-and-White cows across herds in different years. Bayesian Information Criterion was used to show that the model with unequal residual variances for different herd-years was more plausible than the model assuming equal variances. A method of adjusting phenotypic records was developed to account for unequal variability in herd-years. Factors used for the data adjustment considered variation of general residuals and residuals for specific herd-years. The size of herd-year was also taken into account. Varied power of corrections was used to analyze the effect of adjustment on estimated breeding values. The method was applied to daily milk records of 817,165 primiparous cows. The effectiveness of the data adjustment was evaluated by the analysis of differences between each bull's breeding value and its parental index. Data correction reduced the average difference and variance of differences between breeding values and parental indices. Accounting for the size of herd-year classes in correction factors improved the efficiency of heterogeneous variance adjustment. [Abstract/Link to Full Text]

Aranishi F
A novel mitochondrial intergenic spacer reflecting population structure of Pacific oyster.
J Appl Genet. 2006;47(2):119-23.
Nucleotide sequence divergence in a novel major mitochondrial DNA intergenic spacer (IGS) of Pacific oyster Crassostrea gigas was analyzed for 29 cultured individuals within the Goseong population (Korea). A total of 7 variable sites were detected within the IGS, and the relative frequency of nucleotide alteration was determined to be 1.16%. All alterations were due to a single nucleotide substitution, and 5 transitions and 2 transversions were observed. Among 29 specimens, only 8 haplotypes could be identified, and 6 of the haplotypes were unique to particular specimens. Pairwise genetic diversity of all 8 haplotypes was calculated to be 0.412+/-0.134 from multiple sequence substitutions based on the two-parameter model. The phylogenetic tree obtained for these haplotypes according to the neighbor-joining method illustrated a single cluster of linkages, which comprised 5 haplotypes associated with 23 specimens, while the other 3 haplotypes associated with 6 specimens were scattered. The results indicate that the IGS is more polymorphic and thus more suitable as a genetic marker for population structure analysis of Pacific oyster than the mtDNA coding regions, such as cytochrome c oxidase I and 16S ribosomal RNA genes. [Abstract/Link to Full Text]

Nowaczyk P, Kisiała A
Effect of selected factors on the effectiveness of Capsicum annuum L. anther culture.
J Appl Genet. 2006;47(2):113-7.
The primary aim of the study was to establish the effectiveness of induced androgenesis in in vitro anther culture of two pepper (Capsicum annuum L.) breeding lines--ATZ1 and PO, and a hybrid between these two lines (ATZ1 x PO)F1. Anther culture was maintained according to the method developed by Dumas de Vaulx et al. (1981) with some modifications. The experiment revealed that the effectiveness of androgenesis ranged from 4% for the ATZ1 line to 1.5% for the (ATZ1 x PO)F1 and strongly depended on the developmental stage of flower buds, as well as the conditions for anther culture maintenance. The development of androgenic embryos was successfully induced only in anthers which originated from the flower buds with petals equal or slightly longer than sepals and there was a clear relationship between the length of the period of anther induction on CP medium and the level of kinetin in R1 regeneration medium. [Abstract/Link to Full Text]

Perrella G, Cremona G, Consiglio F, Errico A, Bressan RA, Conicella C
Screening for mutations affecting sexual reproduction after activation tagging in Arabidopsis thaliana.
J Appl Genet. 2006;47(2):109-11.
In this work, a seed-set-based screening was performed on 70 lines of Arabidopsis thaliana after activation tagging mutagenesis to identify mutations in reproductive mechanisms. Five mutants showed significantly lower seed set than the wild type and confirmed the phenotype in the progeny. This phenotype was linked with the marker gene bar carried by T-DNA conferring glufosinate resistance. Genetic analysis revealed that the mutation inheritance was sporophytic in 3 mutants and gametophytic in 2 mutants. In addition, 2 mutants had an extra T-DNA copy. Thus activation tagging can be an effective strategy to identify new mutations affecting sporogenesis or gametogenesis. [Abstract/Link to Full Text]

Prus-Głowacki W, Chudzińska E, Wojnicka-Półtorak A, Kozacki L, Fagiewicz K
Effects of heavy metal pollution on genetic variation and cytological disturbances in the Pinus sylvestris L. population.
J Appl Genet. 2006;47(2):99-108.
This isoenzymatic and cytogenetic study has shown significant differences in genetic composition between two groups of Pinus sylvestris trees: tolerant and sensitive to heavy metal pollution. Total and mean numbers of alleles and genotypes per locus were higher in the pollution-sensitive group of trees, but heterozygosity (Ho) was lower in this group. Fixation index (F) indicates that trees tolerant for pollution were in the Hardy-Weinberg equilibrium, while the sensitive group had a significant excess of homozygosity. Cytological analyses demonstrated numerous aberrations of chromosomes in meristematic root tissue of seedlings developed from seeds collected from trees in the polluted area. The aberrations included chromosome bridges and stickiness, laggards, retarded and forward chromosomes, and their fragments. The mitotic index was markedly lower in this group of seedlings, as compared to the control. Both isoenzymatic and cytological analyses showed a significant influence of heavy metal ions on the genetic structure of the Pinus sylvestris population. [Abstract/Link to Full Text]
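
The fixation index mentioned in this abstract is commonly computed as Wright's F = 1 - Ho/He, where Ho is the observed and He the expected heterozygosity. The abstract does not report the underlying values, so the Python sketch below uses hypothetical numbers purely to show how F near zero corresponds to Hardy-Weinberg equilibrium and a positive F to an excess of homozygotes.

```python
def fixation_index(h_obs, h_exp):
    """Wright's fixation index F = 1 - Ho/He.

    F close to 0 is consistent with Hardy-Weinberg proportions;
    F > 0 indicates a deficit of heterozygotes (excess homozygosity).
    """
    if h_exp <= 0:
        raise ValueError("expected heterozygosity must be positive")
    return 1.0 - h_obs / h_exp


# Hypothetical heterozygosities, NOT taken from the study.
tolerant_F = fixation_index(h_obs=0.30, h_exp=0.30)   # equilibrium case
sensitive_F = fixation_index(h_obs=0.20, h_exp=0.30)  # heterozygote deficit

print(f"tolerant group:  F = {tolerant_F:.3f}")
print(f"sensitive group: F = {sensitive_F:.3f}")
```

Whether a given F differs significantly from zero needs a statistical test on the genotype counts, which this sketch does not attempt.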

Watanabe N, Fujii Y, Kato N, Ban T, Martinek P
Microsatellite mapping of the genes for brittle rachis on homoeologous group 3 chromosomes in tetraploid and hexaploid wheats.
J Appl Genet. 2006;47(2):93-8.
The brittle rachis character, which causes spontaneous shattering of spikelets, has an adaptive value in wild grass species. The loci Br1 and Br2 in durum wheat (Triticum durum Desf.) and Br3 in hexaploid wheat (T. aestivum L.) determine disarticulation of rachides above the junction of the rachilla with the rachis such that a fragment of rachis is attached below each spikelet. Using microsatellite markers, the loci Br1, Br2 and Br3 were mapped on the homoeologous group 3 chromosomes. The Br2 locus was located on the short arm of chromosome 3A and linked with the centromeric marker, Xgwm32, at a distance of 13.3 cM. The Br3 locus was located on the short arm of chromosome 3B and linked with the centromeric marker, Xgwm72, at a distance of 14.2 cM. The Br1 locus was located on the short arm of chromosome 3D. The distance of Br1 from the centromeric marker Xgdm72 was 25.3 cM. Mapping the Br1, Br2 and Br3 loci of the brittle rachis suggests the homoeologous origin of these 3 loci for brittle rachides. Since the genes for brittle rachis have been retained in the gene pool of durum wheat, markers more closely linked to the brittle rachis loci are required to select against brittle rachis genotypes and thus avoid yield loss in improved cultivars. [Abstract/Link to Full Text]

Vulcani-Freitas TM, Gil-da-Silva-Lopes VL, Varella-Garcia M, Maciel-Guerra AT
Infertility and marker chromosomes: application of molecular cytogenetic techniques in a case of inv dup(15).
J Appl Genet. 2006;47(1):89-91.
We report on a phenotypically normal man with infertility, whose 47,XY,+mar karyotype was studied by spectral karyotyping (SKY) and fluorescence in situ hybridization (FISH) using a chromosome-15-specific probe (LSI SNRPN). By these techniques, the marker chromosome was identified as a small inv dup (15). Possible causes for male infertility in this case are discussed. [Abstract/Link to Full Text]

Ulanowska K, Wegrzyn G
Mutagenic activity of 1-methyl-4-phenyl-1,2,3,6-tetrahydropyridine.
J Appl Genet. 2006;47(1):85-7.
MPTP (1-methyl-4-phenyl-1,2,3,6-tetrahydropyridine) is a neurotoxin, which can damage dopaminergic neurons. It causes symptoms resembling those observed in patients suffering from Parkinson's disease, and hence this toxin is widely used in studies on animal models of this disorder. Mutagenicity of MPTP was also reported by some authors, but results obtained by others suggested that this compound is not mutagenic. Interestingly, those contrasting results were based on the same assay (the Ames test). Therefore, we aimed to test MPTP mutagenicity by employing a recently developed Vibrio harveyi assay, which was demonstrated previously to be more sensitive than the Ames test, at least for some mutagens. We found that MPTP showed a significant mutagenic activity. Moreover, MPTP mutagenicity was attenuated by methylxanthines, compounds that are known to form complexes with aromatic mutagens. [Abstract/Link to Full Text]

Vallian S, Moeini H
A quantitative bacterial micro-assay for rapid detection of serum phenylalanine in dry blood-spots: application in phenylketonuria screening.
J Appl Genet. 2006;47(1):79-83.
Phenylketonuria is an inherited metabolic disease, which is characterized by increased level of serum phenylalanine (Phe). The quantitative measurement of Phe in the serum is necessary to confirm the disease, and to distinguish phenylketonuria from other forms of hyperphenylalaninemia. In this study, we report a rapid and inexpensive micro-assay for simultaneous detection and quantitative measurement of serum Phe in dry blood-spots. Analysis of the standard curve showed a broad linear Phe range of 120-1800 micromol L(-1). Application of this method in conjunction with the standard Guthrie bacterial inhibition assay and high-pressure liquid chromatography in analyzing 34 samples from phenylketonuria patients and control samples produced comparable results, with the regression equation of Y= 0.994X + 0.996. The advantage of this method over the Guthrie bacterial inhibition assay is its ability to measure the serum Phe quantitatively without false positive results. The method was successfully applied to dried blood-spots as well as serum and whole blood samples. The cost per sample is about 20-50 US cents, which is much less than those of high-pressure liquid chromatography and enzymatic commercial kits. The method can be automated, which is suitable for neonatal and mass phenylketonuria screening, especially in developing countries, where funding is a limiting factor. [Abstract/Link to Full Text]

Czarnecka AM, Golik P, Bartnik E
Mitochondrial DNA mutations in human neoplasia.
J Appl Genet. 2006;47(1):67-78.
Many models of tumour formation have been put forth so far. In general they involve mutations in at least three elements within the cell: oncogenes, tumour suppressors and regulators of telomere replication. Recently numerous mutations in mitochondria have been found in many tumours, whereas they were absent in normal tissues from the same individual. The presence of mutations, of course, does not prove that they play a causative role in development of neoplastic lesions and progression; however, the key role played by mitochondria in both apoptosis and generation of DNA-damaging reactive oxygen species might indicate that the observed mutations contribute to tumour development. Recent experiments with nude mice have proven that mtDNA mutations are indeed responsible for tumour growth and exacerbated ROS production. This review describes mtDNA mutations in main types of human neoplasia. [Abstract/Link to Full Text]

Dybus A, Pijanka J, Cheng YH, Sheen F, Grzesiak W, Muszyńska M
Polymorphism within the LDHA gene in the homing and non-homing pigeons.
J Appl Genet. 2006;47(1):63-6.
A total of 445 domestic pigeons were genotyped for the lactate dehydrogenase (LDHA) gene. Crude DNA was isolated from blood samples and feathers. Two polymorphic sites were identified in intron 6: one near the splice donor site GT is called site H and the other near the splice acceptor site is called site B. Interestingly, the nucleotide changes of both these sites associate perfectly with the A and B alleles of HaeIII polymorphism: the A allele with nucleotide A of site H and nucleotide T of site B; while the B allele with nucleotide G of site H and nucleotide G of site B. In this study, we have identified the molecular difference between alleles A and B of the pigeon LDHA gene. The difference at site H in intron 6 explains the HaeIII polymorphism. The frequencies of LDHA(AB) and LDHA(BB) genotypes between the analysed groups differ significantly (P < 0.001); the LDHA(A) allele was more frequent in the groups of pigeons with elevated homing performance (P < 0.001). The functional difference may be due to the change at site B, the potential splice branch site. Since LDHA activity is associated with the homing ability, it is possible that during the process of selection for the homing ability, the LDHA(A) allele has been selected, and is more prevalent in the top-racing groups. [Abstract/Link to Full Text]

Urbański P, Flisikowski K, Starzyński RR, Kurył J, Kamyczek M
A new SNP in the promoter region of the porcine MYF5 gene has no effect on its transcript level in m. longissimus dorsi.
J Appl Genet. 2006;47(1):59-61.
Myogenic factor 5 (myf-5) is the product of the MYF5 gene, belonging to the MyoD family. This transcription factor participates in the control of myogenesis. We identified 3 new mutations in the promoter region of the gene: A65C, C580T and C613T. The aim of this study was to evaluate the influence of the A65C transversion on gene expression. The analysis was conducted on 15 Polish Large White gilts. The relative content of MYF5 mRNA in m. longissimus dorsi did not differ significantly across MYF5 genotypes (AA, AC, CC). This result suggests that the A65C transversion may not play an important role in the expression of the MYF5 gene in analysed adult muscle but it abolishes a putative binding site for two transcription factors (CDP and HSF1) and creates such a site for Sp1. [Abstract/Link to Full Text]

De la Rosa-Reyna XF, Rodríguez Pérez MA, Sifuentes-Rincón AM
Microsatellite polymorphism in intron 1 of the bovine myostatin gene.
J Appl Genet. 2006;47(1):55-7.
Microsatellites within genes have become important because of their association with genetic diseases in humans. A novel microsatellite was identified in the first intron of the bovine myostatin gene. It is characterized by a mononucleotide core motif that exhibits polymorphic sequence variants (from 12 to 21 repeats) within and between some bovine breeds. Structural analysis of the microsatellite region in bovines as well as in closely related species made it possible to trace the possible mechanisms for its structural evolution across the order Artiodactyla. [Abstract/Link to Full Text]

Kumar A, Rout PK, Roy R
Polymorphism of beta-lactoglobulin gene in Indian goats and its effect on milk yield.
J Appl Genet. 2006;47(1):49-53.
Polymorphism in the beta-LG gene in the Indian goat was investigated by the SDS-PAGE and PCR-RFLP method. SDS-PAGE was carried out in 1098 samples belonging to 8 different breeds of Indian goats. The electrophoretic pattern in the beta-LG locus showed the presence of AA and AB genotypes with frequencies of 0.81 and 0.19, 0.89 and 0.11, 0.50 and 0.50, 0.80 and 0.20, 0.84 and 0.16, 1.00 and 0.00, 0.98 and 0.02 and 0.950 and 0.050 in Jamunapari, Barbari, Marwari, Sirohi, Jakhrana, Beetal, Local UP and Local MP goats, respectively. A total of 358 individuals belonging to 13 different genetic groups were analyzed by the PCR-RFLP method. The amplified product was observed as 426 bp and the restriction digestion with SacII revealed three genotypes, namely S1S1, S1S2 and S2S2 at the beta-LG locus. The frequency of the S2S2 genotype ranged from 0.42 to 1.00 in the population. An analysis was carried out in Jamunapari and Barbari goats to observe the effect of the beta-LG genotype on 90-day milk yield. Least squares analysis of data showed that beta-LG AA animals had higher milk yield than the beta-LG AB genotype in both breeds (P < 0.01). [Abstract/Link to Full Text]

Chmurzyńska A
The multigene family of fatty acid-binding proteins (FABPs): function, structure and polymorphism.
J Appl Genet. 2006;47(1):39-48.
Fatty acid-binding proteins (FABPs) are members of the superfamily of lipid-binding proteins (LBP). So far 9 different FABPs, with tissue-specific distribution, have been identified: L (liver), I (intestinal), H (muscle and heart), A (adipocyte), E (epidermal), Il (ileal), B (brain), M (myelin) and T (testis). The primary role of all the FABP family members is regulation of fatty acid uptake and intracellular transport. The structure of all FABPs is similar - the basic motif characterizing these proteins is beta-barrel, and a single ligand (e.g. a fatty acid, cholesterol, or retinoid) is bound in its internal water-filled cavity. Despite the wide variance in the protein sequence, the gene structure is identical. The FABP genes consist of 4 exons and 3 introns and a few of them are located in the same chromosomal region. For example, A-FABP, E-FABP and M-FABP create a gene cluster. Because of their physiological properties some FABP genes were tested in order to identify mutations altering lipid metabolism. Furthermore, the porcine A-FABP and H-FABP were studied as candidate genes with major effect on fatness traits. [Abstract/Link to Full Text]

Trivedi M, Tiwari RK, Dhawan OP
Genetic parameters and correlations of collar rot resistance with important biochemical and yield traits in opium poppy (Papaver somniferum L.).
J Appl Genet. 2006;47(1):29-38.
Collar rot, caused by Rhizoctonia solani Kühn, is one of the most severe fungal diseases of opium poppy. In this study, heritability, genetic advance and correlation for 10 agronomic, 1 physiological, 3 biochemical and 1 chemical traits with disease severity index (DSI) for collar rot were assessed in 35 accessions of opium poppy. Most of the economically important characters, like seed and capsule straw yield per plant, oil and protein content of seeds, peroxidase activity in leaves, morphine content of capsule straw and DSI for collar rot showed high heritability as well as genetic advance. A highly significant negative correlation between DSI and seed yield clearly shows that as the disease progresses in plants, seed yield declines, chiefly due to premature death of infected plants as well as low seed and capsule setting in the surviving population of susceptible plants. Similarly, a highly significant negative correlation between peroxidase activity and DSI indicated that marker-assisted selection of disease-resistant plants based on high peroxidase activity would be effective and that surviving susceptible plants could be removed from the population to stop further spread. [Abstract/Link to Full Text]

Siviero A, Cristofani M, Furtado EL, Garcia AA, Coelho AS, Machado MA
Identification of QTLs associated with citrus resistance to Phytophthora gummosis.
J Appl Genet. 2006;47(1):23-8.
Citrus gummosis, caused by Phytophthora spp., is an important citrus disease in Brazil. Almost all citrus rootstock varieties are susceptible to it to some degree, whereas resistance is present in Poncirus trifoliata, a closely related species. The objective of this study was to detect QTLs linked to citrus Phytophthora gummosis resistance. Eighty individuals of the F1 progeny, obtained by controlled crosses between Sunki mandarin Citrus sunki (susceptible) and Poncirus trifoliata cv. Rubidoux (resistant), were evaluated. Resistance to Phytophthora parasitica was evaluated by inoculating stems of young plants with a disc of fungal mycelia and measuring lesion lengths a month later. Two QTLs linked to gummosis resistance were detected in linkage groups 1 and 5 of the P. trifoliata map, and one QTL in linkage group 2 of the C. sunki map. The phenotypic variation explained by individual QTLs was 14% for C. sunki and ranged from 16 to 24% for P. trifoliata. The low character heritability (h2 = 18.7%) and the detection of more than one QTL associated with citrus Phytophthora gummosis resistance showed that inheritance of the resistance is quantitative. [Abstract/Link to Full Text]

Filipecki M, Yin Z, Wiśniewska A, Smiech M, Malinowski R, Malepszy S
Tissue-culture-responsive and autotetraploidy-responsive changes in metabolic profiles of cucumber (Cucumis sativus L.).
J Appl Genet. 2006;47(1):17-21.
Somaclonal variation commonly occurs during in vitro plant regeneration and may introduce unintended changes in numerous plant characters. In order to assess the range of tissue-culture-responsive changes on the biochemical level, the metabolic profiles of diploid and tetraploid cucumber R1 plants regenerated from leaf-derived callus were determined. Gas chromatography and mass spectrometry were used for monitoring of 48 metabolites and many significant changes were found in metabolic profiles of these plants as compared to a seed-derived control. Most of the changes were common to diploids and tetraploids and were effects of tissue culture. However, tetraploids showed quantitative changes in 14 metabolites, as compared to regenerated diploids. These changes include increases in serine, glucose-6P, fructose-6P, oleic acid and shikimic acid levels. Based on this study we conclude that the variation in metabolic profiles does not correlate directly with the range of genome changes in tetraploids. [Abstract/Link to Full Text]

Zhang F, Wan XQ, Pan GT
QTL mapping of Fusarium moniliforme ear rot resistance in maize. 1. Map construction with microsatellite and AFLP markers.
J Appl Genet. 2006;47(1):9-15.
To map the QTLs of Fusarium moniliforme ear rot resistance in Zea mays L., a total of 230 F2 individuals, derived from a single cross between inbred maize lines R15 (resistant) and Ye478 (susceptible), were genotyped for genetic map construction using simple sequence repeat (SSR) markers and amplified fragment length polymorphism (AFLP) markers. We used 778 pairs of SSR primers and 63 combinations of AFLP primers to detect the polymorphisms between parents, R15 and Ye478. From the polymorphic 30 AFLP primer combinations and 159 SSR primers, we scored 260 loci in the F2 population, among which 8 SSR and 13 AFLP loci could not be assigned to any of the linkage groups. An integrated molecular genetic linkage map was constructed by the remaining 151 SSR and 88 AFLP markers, which distributed throughout the 10 linkage groups of maize and spanned the genome of about 3463.5 cM with an average of 14.5 cM between two markers. On 4 chromosomes, we detected 5 putative segregation distortion regions (SDRs), including 2 new ones (SDR2 and SDR7). The other 3 SDRs were located near the regions where gametophyte genes were mapped, indicating that segregation distortion could be partially caused by gametophytic factors. [Abstract/Link to Full Text]
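
The marker bookkeeping and the average marker spacing in this abstract can be checked with a few lines of arithmetic. The Python sketch below uses only the numbers reported above, with illustrative variable names. The quoted 14.5 cM appears to be the map length divided by the number of mapped markers. Dividing by the number of intervals (markers minus linkage groups) is an alternative convention shown for comparison, not something the abstract states.

```python
# Figures reported in the abstract; names are illustrative only.
loci_scored = 260
unassigned_ssr = 8
unassigned_aflp = 13
mapped_ssr = 151
mapped_aflp = 88
map_length_cm = 3463.5
linkage_groups = 10

# Loci left after removing those not assigned to any linkage group.
mapped_total = loci_scored - (unassigned_ssr + unassigned_aflp)
# This should agree with the mapped SSR and AFLP markers added together.
mapped_by_type = mapped_ssr + mapped_aflp

# Average spacing as map length per mapped marker.
spacing_per_marker = map_length_cm / mapped_total

# Alternative convention (an assumption, for comparison): map length per
# interval, where each linkage group of n markers has n - 1 intervals.
spacing_per_interval = map_length_cm / (mapped_total - linkage_groups)

print(mapped_total, mapped_by_type)
print(round(spacing_per_marker, 1), round(spacing_per_interval, 1))
```

The two marker totals agree, and the per-marker spacing rounds to the 14.5 cM given in the abstract.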

Hayhurst R, Cassiman JJ
EuroGentest standing up to scrutiny--first year demonstrates good progress harmonizing community approaches.
J Appl Genet. 2006;47(1):5-7. [Abstract/Link to Full Text]

Weglewska A, Jakóbkiewicz-Banecka J, Wegrzyn G
A modified procedure for quantitative analysis of mtDNA, detecting mtDNA depletion.
J Appl Genet. 2005;46(4):423-8.
Quantitative analysis of mitochondrial DNA (mtDNA) is crucial for proper diagnosis of diseases that are caused by or associated with mtDNA depletion. However, such a quantitative characterization of mtDNA is not a simple procedure and requires several laboratory steps at which potential errors can accumulate. Here, we describe a modified procedure for quantitative human mtDNA analysis. The procedure is based on using two PCR-amplified, fluorescein-labeled DNA probes, complementary to mtDNA (detection probe) and chromosomal 18S rDNA (reference probe), both of similar length. Thus, equal amounts of these probes can be used and, contrary to previously published procedures, no mtDNA purification (apart from total DNA isolation) or 18S rDNA cloning is necessary for probe preparation. Two separate hybridizations (each with one probe) are suggested instead of one hybridization with both probes; this decreases background signals and enables adjustment of the strength of specific signals from both probes, which is useful in the subsequent densitometric analysis after superimposing of both pictures. Using different DNA amounts for reactions, we have proved that the procedure is quantitative in a broad range of sample DNA concentrations. Moreover, we were able to detect mtDNA depletion unambiguously in tissue samples from patients suffering from diseases caused by dysfunction of mtDNA. [Abstract/Link to Full Text]

Lassota M, Przełozna B, Płodzien M, Bugno M, Wnuk M, Kotylak Z, Słota E
Additional chromosome in a child as a result of a balanced reciprocal translocation t(12;18)(p13;q12) in his mother's karyotype.
J Appl Genet. 2005;46(4):419-21.
In this case report we present a child with an additional chromosome in the karyotype. The karyotypes of the boy and his parents were analyzed by use of a conventional banding technique (GTG) and fluorescence in situ hybridization (FISH). Probes painting whole chromosomes 12 and 18 were used in FISH. Cytogenetic examination of the parents revealed that his mother was carrying balanced reciprocal translocation between chromosomes 12 and 18. Her karyotype was described as 46,XX,t(12;18)(p13;q12). Father's karyotype was normal, described as 46,XY. The boy's karyotype was defined as 47,XY,+der(18)t(12;18)(p13;q12). The additional chromosome appeared probably due to 3:1 meiotic disjunction of the maternal balanced translocation, known as tertiary trisomy. The mother displayed a normal phenotype and delivered earlier a healthy child. However, the boy with the unbalanced karyotype shows multiple congenital abnormalities. [Abstract/Link to Full Text]

Vásquez-Velásquez AI, Arnaud-López L, Figuera LE, Padilla-Gutiérrez JR, Rivas F, Rivera H
Ambiguous genitalia by 9p deletion inherent to a dic(Y;9)(q12;p24).
J Appl Genet. 2005;46(4):415-8.
We describe here a 3-month-old male infant with brachy-plagiocephaly, short neck, widely spaced nipples, mild hypertonia, and ambiguous external genitalia but with both testes in the scrotum and no Müllerian derivatives. His karyotype was 45,X,der(Y;9)(q12;p24).ish der(Y;9)(DYZ3+,SRY+,9ptel-) de novo. This patient's impaired sex differentiation is consistent with gonadal dysgenesis and compares with the male-to-female sex reversal secondary to a partial 9p deletion in spite of an intact Yp or SRY locus documented in 24 patients including a sex-reversed girl with a (Y;9) dicentric derivative. As for the cytogenetic findings, this case represents the second instance of a de novo pseudodicentric (Y;9) chromosome with loss of both distal 9p and Yq12 regions, apparent intactness of SRY, and consistent or preferential inactivation of the Y centromere. In addition, the possible 9p23p-p22 duplication observed in this case evokes the concomitant 9p22-p21 duplication documented in the previous girl with a (Y;9) derivative. Hence, these striking similarities point to a nonrandom Y;9 rearrangement in patients with either sex reversal or gonadal dysgenesis. Even if the present pseudodicentric derivative had inactivated the Y centromere, the existence of some variant cells points to functional dicentricity as it has been documented in other Y;autosome dicentric derivatives. [Abstract/Link to Full Text]

Schlade-Bartusiak K, Stembalska A, Ramsey D
Significant involvement of chromosome 13q deletions in progression of larynx cancer, detected by comparative genomic hybridization.
J Appl Genet. 2005;46(4):407-13.
Head and neck squamous cell carcinoma (HNSCC) is a heterogeneous group of tumours with various clinical characteristics. These tumours generally exhibit complex karyotypes. Few studies of genomic imbalances have been performed exclusively in subgroups of larynx cancer samples at different stages of the disease. In the present study, chromosomal gains and losses were investigated in 52 larynx tumours, by using comparative genomic hybridization (CGH). The mean number of observed alterations was 37.7 per tumour. The most common sites of losses were 1p, 13q, Xp, and the most common gains were located in 1p, 9q, 16q. The overall number of gains was negatively associated with cancer grading. G1 tumours were also characterized by a higher frequency of deletions in 13q32 and amplifications in 1q23 than tumours of other grades (p < 0.05). The frequency of losses of 13q22 was also positively associated with tumour size. There was no association between the frequency of losses in 13q and the presence of lymph node metastases at the time of diagnosis. Another statistically significant association was observed for gains at 1q22-23 and tumour size (p < 0.01). No statistically significant difference in the frequency of most common imbalances was detected between primary tumours with lymph node metastases and those without metastases. In conclusion, we discovered a significant involvement of 13q deletions in the progression of larynx cancer. All the other significant changes observed in the present study were reported previously as being important for HNSCC progression. It seems that multiple genes are disrupted in the process of neoplastic transformation in the larynx, and the networks of events remain to be elucidated. [Abstract/Link to Full Text]

Polasik D, Kmieć M, Liefers S, Terman A
Single nucleotide polymorphisms in exon 10 of the chinchilla growth hormone receptor (GHR) gene.
J Appl Genet. 2005;46(4):403-6.
The aim of this study was to detect SNPs in exon 10 of the chinchilla growth hormone receptor gene (GHR) by comparative sequencing. Sixty females of the same breed (Standard) were analysed. Four new SNPs were identified, which cause 3 amino acid substitutions in the intracellular domain of the receptor: G/C at position 135 bp (in relation to the total sequence of exon 10) (gln/his), CAG/AAA at 352 bp and 354 bp (gln/lys), and C/A at 641 bp (thr/asn). [Abstract/Link to Full Text]

Recent Articles in Genetics and Molecular Research

Batista JS, Alves-Gomes JA
Phylogeography of Brachyplatystoma rousseauxii (Siluriformes - Pimelodidae) in the Amazon Basin offers preliminary evidence for the first case of "homing" for an Amazonian migratory catfish.
Genet Mol Res. 2006;5(4):723-40.
The large pimelodid, Brachyplatystoma rousseauxii, is one of the two most important catfish species for the fisheries in the Amazon. It is captured by commercial and artisanal fishing fleets in at least five Amazonian countries, at fishing grounds more than 5000 km apart. Current evidence suggests a complex life cycle that includes the longest reproductive migration known for a freshwater fish species. Experimental fisheries have pointed to a decrease in yield in the Western Amazon. However, reliable information about the capture and status of this fishery resource is still nonexistent, and no study has ever addressed its genetic diversity. We sequenced the entire D-loop of 45 individuals of B. rousseauxii, fifteen from each of three different fishing locations along the main channel of the Solimões-Amazonas System covering a distance of around 2200 km. Results of phylogenetic analyses, molecular diversity estimations, analysis of molecular variance, and nested clade analysis, together show that there is no genetic segregation associated with location in the main channel, as one would expect for a migratory species. However, the significant decrease found in genetic diversity towards the western part of the Amazon could be explained by a non-random choice of tributary to spawn. It is possible that the genetic diversity of the migrating schools decreases towards the west because portions of the species' genetic diversity are being "captured" by the different effluents, as the fish migrates to spawn in the headwaters. Like the salmon in North America, B. rousseauxii may be returning to their home tributary to spawn. [Abstract/Link to Full Text]

Neshich G, Mazoni I, Oliveira SR, Yamagishi ME, Kuser-Falcão PR, Borro LC, Morita DU, Souza KR, Almeida GV, Rodrigues DN, Jardine JG, Togawa RC, Mancini AL, Higa RH, Cruz SA, Vieira FD, Santos EH, Melo RC, Santoro MM
The Star STING server: a multiplatform environment for protein structure analysis.
Genet Mol Res. 2006;5(4):717-22.
Star STING is the latest version of the STING suite of programs and corresponding database. We report on five important aspects of this package that have acquired some new characteristics, designed to add key advantages to the whole suite: 1) availability for most popular platforms and browsers, 2) introduction of the STING_DB quality assessment, 3) improvement in algorithms for calculation of three STING parameters, 4) introduction of five new STING modules, and 5) expansion of the existing modules. Star STING is freely accessible online. [Abstract/Link to Full Text]

Torres FR, Ondei LS, Zamaro PJ, Silva RU, Cavasini CE, Rossit AR, Machado RL, Bonini-Domingos CR
Hemoglobin I-Philadelphia [alpha 16 (A14) LYS-->GLU] heterozygote among blood donors from Brazil.
Genet Mol Res. 2006;5(4):713-6.
We describe a heterozygous case of Hb I-Philadelphia [alpha 16 (A14) LYS-->GLU] in a blood donor from the Acre State Blood Bank, in the Brazilian Amazon region. We confirmed the mutation by electrophoretic and chromatographic methods and by DNA sequencing. A literature search showed that this is the first description of this alpha globin mutant in a Brazilian Caucasian group. We also emphasize the importance of the hemoglobin study in blood donors for the purpose of the genetic counseling and quality assurance of the blood to be transfused. Screening tests for hemoglobin mutants are also important for gathering anthropological information about the Brazilian population. [Abstract/Link to Full Text]

Bhowmick BK, Takahata N, Watanabe M, Satta Y
Comparative analysis of human masculinity.
Genet Mol Res. 2006;5(4):696-712.
To study rapidly evolving male specific Y (MSY) genes we retrieved and analyzed nine such genes. VCY, HSFY and RBMY were found to have functional X gametologs, but the rest did not. Using chimpanzee orthologs for XKRY, CDY, HSFY, PRY, and TSPY, the average silent substitution is estimated as 0.017 +/- 0.006/site and the substitution rate is 1.42 x 10(-9)/site/year. Except for VCY, all other loci possess two or more pseudogenes on the Y chromosome. Sequence differences from functional genes show that BPY2, DAZ, XKRY, and RBMY each have one pseudogene for each one that is human specific, while others were generated well before the human-chimpanzee split, by means of duplication, retro-transposition or translocation. Some functional MSY gene duplication of VCY, CDY and HSFY, as well as X-linked VCX and HSFX duplication, occurred in the lineage leading to humans; these duplicates have accumulated nucleotide substitutions that permit their identification. [Abstract/Link to Full Text]
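
The two figures reported above can be tied together with a small worked calculation. The following is a minimal sketch, assuming the conventional relation K = 2rT between silent divergence K, per-lineage substitution rate r, and divergence time T. The abstract does not state which divergence time was used, so the value computed here is only what the reported numbers imply under that assumption; the function name is hypothetical.

```python
def implied_divergence_time(k_silent: float, rate_per_site_year: float) -> float:
    """Return T in years from K = 2 * r * T, i.e. T = K / (2 * r).

    Assumes both lineages accumulate substitutions at the same rate.
    """
    return k_silent / (2.0 * rate_per_site_year)


# Values quoted in the abstract.
K_SILENT = 0.017   # average silent substitutions per site (human vs. chimpanzee)
RATE = 1.42e-9     # substitutions per site per year

t_years = implied_divergence_time(K_SILENT, RATE)
print(f"Implied divergence time: {t_years / 1e6:.2f} million years")
```

Under this assumption the reported rate corresponds to a human-chimpanzee split of roughly 6 million years.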

Nassar NM
Cassava in South America, Brazil's contribution and the lesson to be learned from India.
Genet Mol Res. 2006;5(4):688-95.
South America is responsible for about half of the cassava world production. In the 1970's productivity of the crop on the continent was about 15 ton/ha, and dropped continuously until reaching 12 ton/ha in 2004. India's productivity of cassava increased from 10 ton/ha in the 1970's to 28 ton/ha in 2004. Brazil contributed significantly to improving cassava crops through the Instituto Agronômico de Campinas in the 1960's and 1970's. The Universidade de Brasília released high-protein content hybrids, apomictic clones and explored the potential of indigenous landraces. [Abstract/Link to Full Text]

Melo AS, Padovan AC, Serafim RC, Puzer L, Carmona AK, Juliano Neto L, Brunstein A, Briones MR
The Candida albicans AAA ATPase homologue of Saccharomyces cerevisiae Rix7p (YLL034c) is essential for proper morphology, biofilm formation and activity of secreted aspartyl proteinases.
Genet Mol Res. 2006;5(4):664-87.
Proper morphology is essential for the ability of Candida albicans to switch between yeast and hyphae and thereby sustain its virulence. Here we identified, by differential screening, a novel C. albicans AAA ATPase encoding gene, CaYLL34 (RIX7), with enhanced expression in hyphae. Phylogenetic analysis suggests that CaYLL34 belongs to a "VCP-like" subgroup of AAA ATPases essential for yeast viability and contains a bipartite nuclear localization signal. Inactivation of one copy of CaYLL34, by the URA-Blaster method, generated the heterozygous mutant strain M61. This strain has severe phenotypic alterations, such as a highly increased vacuole, abnormal cell shape and reduced growth in different conditions. Also, major pathogenicity factors are affected in M61, for instance, a significant decrease of hypha formation (>90%), surface biofilm adhesion (86%) and secreted aspartyl proteinase activity (76.5%). Our results show that the partial impairment of CaYll34p cellular levels is sufficient to affect the proper cellular morphology and pathogenicity factors and suggest that this protein is required for biogenesis of ribosomal subunits. Accordingly, we propose that the product of CaYLL34 could be tested as a novel target for antifungal drugs. [Abstract/Link to Full Text]

Dorella FA, Fachin MS, Billault A, Dias Neto E, Soravito C, Oliveira SC, Meyer R, Miyoshi A, Azevedo V
Construction and partial characterization of a Corynebacterium pseudotuberculosis bacterial artificial chromosome library through genomic survey sequencing.
Genet Mol Res. 2006;5(4):653-63.
Corynebacterium pseudotuberculosis is a gram-positive bacterium that causes caseous lymphadenitis in sheep and goats. However, despite the economic losses caused by caseous lymphadenitis, there is little information about the molecular mechanisms of pathogenesis of this bacterium. Genomic libraries constructed in bacterial artificial chromosome (BAC) vectors have become the method of choice for clone development in high-throughput genomic-sequencing projects. Large-insert DNA libraries are useful for isolation and characterization of important genomic regions and genes. In order to identify targets that might be useful for genome sequencing, we constructed a C. pseudotuberculosis BAC library in the vector pBeloBAC11. This library contains about 18,000 BAC clones, with inserts ranging in size from 25 to 120 kb, theoretically representing a 390-fold coverage of the C. pseudotuberculosis genome (estimated to be 2.5-3.1 Mb). Many genomic survey sequences (GSSs) with homology to C. diphtheriae, C. glutamicum, C. efficiens, and C. jeikeium proteins were observed within a sample of 215 sequenced clones, confirming their close phylogenetic relationship. Computer analyses of GSSs did not detect chimeric, deleted, or rearranged BAC clones, showing that this library has low redundancy. This GSSs collection is now available for further genetic and physical analysis of the C. pseudotuberculosis genome. The GSS strategy that we used to develop our library proved to be efficient for the identification of genes and will be an important tool for mapping, assembly, comparative, and functional genomic studies in a C. pseudotuberculosis genome sequencing project that will begin this year. [Abstract/Link to Full Text]
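
The theoretical coverage quoted above follows from simple arithmetic: number of clones times mean insert size, divided by genome size. The sketch below illustrates that calculation. The mean insert size is not given in the abstract, so the 60 kb used here is a hypothetical value chosen from within the reported 25 to 120 kb range, and the function name is an assumption for illustration.

```python
def library_coverage(n_clones: int, mean_insert_bp: float, genome_bp: float) -> float:
    """Theoretical fold coverage = clones * mean insert size / genome size."""
    return n_clones * mean_insert_bp / genome_bp


N_CLONES = 18_000                 # "about 18,000 BAC clones" (from the abstract)
ASSUMED_MEAN_INSERT_BP = 60_000   # hypothetical; abstract gives only a 25-120 kb range

# Genome size is estimated at 2.5-3.1 Mb, so report coverage at both ends.
for genome_bp in (2_500_000, 3_100_000):
    fold = library_coverage(N_CLONES, ASSUMED_MEAN_INSERT_BP, genome_bp)
    print(f"genome {genome_bp / 1e6:.1f} Mb -> about {fold:.0f}-fold coverage")
```

With this assumed insert size the result brackets the reported 390-fold figure, which suggests a mean insert in that neighbourhood.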

Brahmane MP, Das MK, Sinha MR, Sugunan VV, Mukherjee A, Singh SN, Prakash S, Maurye P, Hajra A
Use of RAPD fingerprinting for delineating populations of hilsa shad Tenualosa ilisha (Hamilton, 1822).
Genet Mol Res. 2006;5(4):643-52.
RAPD was used to delineate the hilsa populations sampled from the Ganga, Yamuna, Hooghly, and Narmada Rivers at six different locations. Six degenerate primers were used to generate the fragment patterns from the samples collected. All primers were highly polymorphic and generated high numbers of amplification products. Nei's genetic distances were calculated between locations. The overall average genetic distance among all the six locations was 0.295. The Fst value within the Ganga was 0.469 and within the Hooghly it was 0.546. The overall Fst value for the six populations analyzed was 0.590. The UPGMA dendrogram clustered the hilsa into two distinct clusters: Ganga and Yamuna populations and the Hooghly and Narmada populations. [Abstract/Link to Full Text]
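
For readers unfamiliar with the distance measure named above, here is a minimal sketch of Nei's standard genetic distance, D = -ln(Jxy / sqrt(Jx * Jy)), computed from per-locus allele frequencies. The frequencies below are invented for illustration and are not data from the study. RAPD markers are dominant, so real analyses must first estimate allele frequencies from band presence or absence; that step is omitted here.

```python
import math


def nei_standard_distance(pop_x, pop_y):
    """Nei's standard genetic distance between two populations.

    pop_x, pop_y: lists of loci; each locus is a list of allele
    frequencies in the same allele order for both populations.
    """
    n_loci = len(pop_x)
    # Average homozygosity within each population and identity between them.
    jx = sum(sum(p * p for p in locus) for locus in pop_x) / n_loci
    jy = sum(sum(q * q for q in locus) for locus in pop_y) / n_loci
    jxy = sum(
        sum(p * q for p, q in zip(lx, ly)) for lx, ly in zip(pop_x, pop_y)
    ) / n_loci
    identity = jxy / math.sqrt(jx * jy)
    return -math.log(identity)


# Hypothetical allele frequencies at two loci for two populations.
pop_a = [[0.5, 0.5], [0.9, 0.1]]
pop_b = [[0.2, 0.8], [0.6, 0.4]]
print(f"Nei's D = {nei_standard_distance(pop_a, pop_b):.3f}")
```

Identical populations give a distance of zero, and the value grows as allele frequencies diverge.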

Rousso I, Iliopoulos D, Athanasiadou F, Zavopoulou L, Vassiliou G, Voyiatzis N
Congenital bilateral anorchia: hormonal, molecular and imaging study of a case.
Genet Mol Res. 2006;5(4):638-42.
The aetiology of congenital bilateral anorchia is unknown. For many years there was speculation of an association between genetic factors and anorchia. We performed different tests in an anorchid boy, 2.5 years old, presented to us with micropenis and absence of both testes, in order to determine any possible factors contributing to the anorchia. Physical examination and hormonal, imaging, chromosomal, and molecular analyses of this case were performed. The basal FSH and LH levels were increased, and their increase in response to gonadotrophin-releasing hormone test was prolonged, while testosterone levels failed to increase after hCG administration. Ultrasonography of the pelvis and magnetic resonance of the abdomen were performed and failed to show any testicular tissue. Lastly, surgical exploration confirmed the absence of testicular structure. Chromosomal analysis revealed a normal male karyotype and molecular analysis did not reveal mutations or polymorphisms in the open reading frame of the SRY gene. Diagnostically, the lack of testosterone response to hCG stimulation is the hormonal hallmark of bilateral congenital anorchia. In addition, according to our case and previous studies, there is lack of association between genetic factors necessary for correct testicular descent and anorchia. [Abstract/Link to Full Text]

Oliveira CI, Bicudo HE, Itoyama MM
New evidence for nucleolar dominance in hybrids of Drosophila arizonae and Drosophila mulleri.
Genet Mol Res. 2006;5(4):632-7.
Drosophila mulleri (MU) and D. arizonae (AR) are cryptic species of the mulleri complex, mulleri subgroup, repleta group. Earlier cytogenetic studies revealed that these species have different regulatory mechanisms of nucleolar organizing activity. In these species, nucleolar organizing regions are found in both the X chromosome and the microchromosome. In the salivary glands of hybrids between MU females and AR males, there is an interspecific dominance of the regulatory system of the D. arizonae nucleolar organizer involving, in males, amplification and activation of the nucleolar organizer from the microchromosome. The authors who reported these findings obtained hybrids only in that cross-direction. More recently, hybrids in the opposite direction, i.e., between MU males and AR females, have been obtained. The purpose of the present study was to evaluate, in these hybrids, the association of the nucleoli with the chromosomes inherited from parental species in order to cytogenetically confirm the dominance patterns previously described. Our results support the proposed dominance of the AR nucleolar organizer activity over that of MU, regardless of cross-direction. [Abstract/Link to Full Text]

Pereira CA, Nakano F, Stern JM, Whittle MR
Genuine Bayesian multiallelic significance test for the Hardy-Weinberg equilibrium law.
Genet Mol Res. 2006;5(4):619-31.
Statistical tests that detect and measure deviation from the Hardy-Weinberg equilibrium (HWE) have been devised but are limited when testing for deviation at multiallelic DNA loci is attempted. Here we present the full Bayesian significance test (FBST) for the HWE. This test depends neither on asymptotic results nor on the number of possible alleles for the particular locus being evaluated. The FBST is based on the computation of an evidence index in favor of the HWE hypothesis. A great deal of forensic inference based on DNA evidence assumes that the HWE is valid for the genetic loci being used. We applied the FBST to genotypes obtained at several multiallelic short tandem repeat loci during routine parentage testing; the locus Penta E exemplifies those clearly in HWE while others such as D10S1214 and D19S253 do not appear to show this. [Abstract/Link to Full Text]
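
As background to the hypothesis being tested, the sketch below computes the genotype frequencies expected under Hardy-Weinberg equilibrium at a multiallelic locus: p_i^2 for homozygotes and 2 * p_i * p_j for heterozygotes. This is not the FBST itself, whose evidence index requires an optimisation and integration over the posterior that is beyond a short example. Allele names and frequencies are hypothetical.

```python
from itertools import combinations_with_replacement


def hwe_expected_genotype_freqs(allele_freqs):
    """Expected genotype frequencies under HWE.

    allele_freqs: dict mapping allele name -> frequency (summing to 1).
    Returns a dict mapping (allele_a, allele_b) -> expected frequency.
    """
    expected = {}
    alleles = sorted(allele_freqs.items())
    for (a, pa), (b, pb) in combinations_with_replacement(alleles, 2):
        # Homozygotes: p^2; heterozygotes: 2pq.
        expected[(a, b)] = pa * pb if a == b else 2 * pa * pb
    return expected


# Hypothetical three-allele locus.
freqs = {"A": 0.5, "B": 0.3, "C": 0.2}
for genotype, freq in hwe_expected_genotype_freqs(freqs).items():
    print(genotype, round(freq, 3))
```

Any test for deviation from HWE, Bayesian or otherwise, compares observed genotype counts against expectations of this form.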

Cardoso FC, Pinho JM, Azevedo V, Oliveira SC
Identification of a new Schistosoma mansoni membrane-bound protein through bioinformatic analysis.
Genet Mol Res. 2006;5(4):609-18.
Progress in schistosome genome research has enabled investigators to move rapidly from genome sequences to vaccine development. Proteins bound to the surface of parasites are potential vaccine candidates, or they can be used for diagnosis. We analyzed 4342 proteins deduced from the Schistosoma mansoni transcriptome with bioinformatic computer programs. Thirty-four proteins had membrane-bound motifs. Within this group, we selected the Sm29 protein to be further characterized by in silico analysis. Sm29 was found to have a signal peptide made up of 26 amino acids, with a cleavage site between Ser26 and Val27. The glycosylation site search revealed three threonines (39, 132 and 133) with high probability of O-glycosylation and two asparagines (58 and 115) with high probability of N-glycosylation. Only one transmembrane helix was found in the C-terminal region of the protein from Leu169 to Lys191. The search for similarities and conserved motifs shows that Sm29 is a protein with high identity to proteins present in S. japonicum (53, 52, 49, and 37% identity) and it possesses disulfide-rich conserved domains. Apparently, Sm29 is a membrane-bound protein, and it may be an important molecule in host-parasite interactions. [Abstract/Link to Full Text]

Dukkipati VS, Blair HT, Garrick DJ, Murray A
'Ovar-Mhc' - ovine major histocompatibility complex: structure and gene polymorphisms.
Genet Mol Res. 2006;5(4):581-608.
The major histocompatibility complex (MHC) in sheep, Ovar-Mhc, is poorly characterised, when compared to other domestic animals. However, its basic structure is similar to that of other mammals, comprising class I, II and III regions. Currently, there is evidence for the existence of four class I loci. The class II region is better characterised, with evidence of one DRA, four DRB (one coding and three non-coding), one DQA1, two DQA2, and one each of the DQB1, DQB2, DNA, DOB, DYA, DYB, DMA, and DMB genes in the region. The class III region is the least characterised, with the known presence of complement cascade (C4, C2 and Bf), TNFalpha and CYP21 genes. Products of the class I and II genes, MHC molecules, play a pivotal role in antigen presentation required for eliciting immune responses against invading pathogens. Several studies have focused on polymorphisms of Ovar-Mhc genes and their association with disease resistance. However, more research emphasis is needed on characterising the remaining Ovar-Mhc genes and developing simplified and cost-effective methods to score gene polymorphisms. Haplotype screening, employing multiple markers rather than single genes, would be more meaningful in MHC-disease association studies, as it is well known that most of the MHC loci are tightly linked, exhibiting very little recombination. This review summarises the current knowledge of the structure of Ovar-Mhc and polymorphisms of genes located in the complex. [Abstract/Link to Full Text]

Eler JP, Ferraz JB, Balieiro JC, Mattos EC, Mourão GB
Genetic correlation between heifer pregnancy and scrotal circumference measured at 15 and 18 months of age in Nellore cattle.
Genet Mol Res. 2006;5(4):569-80.
Data of pregnancy diagnosis from 24,945 Nellore heifers, raised under tropical conditions in Brazil and exposed to breeding at about 14 months of age, were analyzed simultaneously with 13,742 (analysis 1), 36,091 (analysis 2), 8,405 (analysis 3), and 8,405 (analysis 4) scrotal circumference (SC) records of contemporary young bulls in order to estimate heritability (h(2)) for yearling heifer pregnancy (HP) and for SC measured at around 15 (SC15) and 18 (SC18) months of age and to estimate genetic correlation between HP and SC15 (SC18). Heifer pregnancy was considered as a categorical trait, with the value 1 (success) assigned to heifers that were detected as pregnant by rectal palpation approximately 60 days after the end of a 90-day breeding season and the value 0 (failure) otherwise. In analyses 1 and 3, SC was measured at around 15 months of age and in analysis 2 and 4 it was measured at around 18 months of age. Only 8,848 animals from datasets 1 and 2 were common in both files, which means the same animals measured at different ages. Datasets used in analyses 3 and 4 included the same animals, measured at 15 and at 18 months of age, respectively. Heritability estimates for HP were similar in all analyses, with values ranging from 0.66 +/- 0.08 to 0.67 +/- 0.008. For SC15, the estimates were 0.57 +/- 0.05 in analysis 1 and 0.60 +/- 0.07 in analysis 3. For SC18, the estimates were 0.53 +/- 0.03 in analysis 2 and 0.64 +/- 0.06 in analysis 4. The estimates of genetic correlation between HP and SC15 were 0.15 +/- 0.10 in analysis 1 and 0.11 +/- 0.11 in analysis 3. For the correlation between HP and SC18, the values were 0.27 +/- 0.10 in analysis 2 and 0.16 +/- 0.11 in analysis 4. Based on standard errors and confidence intervals, the best heritability and genetic correlation estimates were obtained from analysis 2, which included more data and a better pedigree structure. 
Pearson correlation between HP and SC breeding values was similar to the genetic correlation estimates obtained from two-trait models, when all animals in the pedigree file were considered for its calculation. If only sires were considered for the calculation, Pearson correlation was higher but the pattern was the same as from two-trait analyses. The high heritability estimates obtained in the present study confirm that expected progeny difference (EPD) for HP can be used to select bulls for the production of precocious daughters, and the low genetic correlation between SC and HP indicates a greater efficacy of selection based on heifer pregnancy EPD than of selection based on scrotal circumference EPD. The results of the present study, although not conclusive, indicate that SC measured at around 18 months of age (SC18) would have a higher genetic correlation with HP than would SC measured at around 15 months of age (SC15). [Abstract/Link to Full Text]

Grossi SF, Lui JF, Garcia JE, Meirelles FV
Genetic diversity in wild (Sus scrofa scrofa) and domestic (Sus scrofa domestica) pigs and their hybrids based on polymorphism of a fragment of the D-loop region in the mitochondrial DNA.
Genet Mol Res. 2006;5(4):564-8.
We examined the variation in mitochondrial DNA by sequencing the D-loop region in wild and domestic (large-white breed) pigs, in hybrids between domestic and wild pigs, and in Monteiro pigs. A D-loop fragment of approximately 330 bp was amplified by PCR. Sequencing of DNA amplicons identified haplotypes previously described as European and Asian types. Monteiro pigs and wild pigs had European haplotypes and domestic pigs had both European and Asian haplotypes. [Abstract/Link to Full Text]

Ericsson AO, Faria LO, Cruz WB, Martins de Sá C, Lima BD
TcZFP8, a novel member of the Trypanosoma cruzi CCHC zinc finger protein family with nuclear localization.
Genet Mol Res. 2006;5(3):553-63.
In a 17-kb genomic fragment of Trypanosoma cruzi chromosome XX, we identified three tandemly linked genes coding for CX(2)CX(4)HX(4)C zinc finger proteins. We also showed that similar genes are present in T. brucei and Leishmania major, sharing three monophyletic groups among these trypanosomatids. In T. cruzi, TcZFP8 corresponds to a novel gene coding for a protein containing eight zinc finger motifs. Molecular cloning of this gene and heterologous expression as a fusion with a His-tag were performed in Escherichia coli. The purified recombinant protein was used to produce antibody in rabbits. Using Western blot analysis, we observed the presence of this protein in all three forms of the parasite: amastigote, trypomastigote and epimastigote. An analysis of cytoplasmic and nuclear cell extracts showed that this protein is present in nuclear extracts, and indirect immunofluorescence microscopy confirmed the nuclear localization of TcZFP8. Homologues of TcZFP8 in T. brucei are apparently absent, while one candidate in L. major was identified. [Abstract/Link to Full Text]

Roratto PA, Bartholomei-Santos ML, Gutierrez AM, Kamenetzky L, Rosenzvit MC, Zaha A
Detection of genetic polymorphism among and within Echinococcus granulosus strains by heteroduplex analysis of a microsatellite from the U1 snRNA genes.
Genet Mol Res. 2006;5(3):542-52.
Polymerase chain reaction of a pentanucleotide microsatellite in the U1 snRNA gene complex generated a multiple band pattern due to the priming of paralogous sequences. Denaturation and slow renaturation of polymerase chain reaction products allow the formation of heteroduplex DNA that can be detected by its differential mobility in polyacrylamide gel electrophoresis. Heteroduplex analysis was used to determine if the U1 snRNA microsatellite could be a useful genetic marker in Echinococcus granulosus. A U1 snRNA microsatellite fragment from E. granulosus was isolated and characterized by Southern blot and sequencing. Four E. granulosus strains were analyzed: sheep, Tasmanian sheep, cattle, and camel strains. The former two showed polymorphism and shared three of the six patterns found for sheep strain. The cattle strain displayed two patterns, and the camel strain was monomorphic. The electrophoretic profiles were used for statistical analysis in order to determine genetic distance and the relationship among strains. Heteroduplex analysis can be helpful in genotyping E. granulosus strains and is useful in detecting polymorphism within strains. [Abstract/Link to Full Text]

Nassar NM
The synthesis of a new cassava-derived species, Manihot vieiri Nassar.
Genet Mol Res. 2006;5(3):536-41.
A new species was synthesized artificially by chromosome doubling in a hybrid. The ensuing polyploid type exhibits an apomictic nature and maintains its morphological characteristics in the progeny. It showed a frequency of multiembryonic sacs of 29% in the ovules examined, whereas sacs were absent in the diploid type. [Abstract/Link to Full Text]

Silva AS, Yunes JA
Conservation of glycolytic oscillations in Saccharomyces cerevisiae and human pancreatic beta-cells: a study of metabolic robustness.
Genet Mol Res. 2006;5(3):525-35.
The present study compares two computer models of the first part of glucose catabolism in different organisms in search of evolutionarily conserved characteristics of the glycolysis cycle and proposes the main parameters that define the stable steady-state or oscillatory behavior of the glycolytic system. It is suggested that in both human pancreatic beta-cells and Saccharomyces cerevisiae there are oscillations that, despite differences in wave form and period of oscillation, share the same robustness strategy: the oscillation is not controlled by only one but by at least two parameters that will have more or less control over the pathway flux depending on the initial state of the system as well as on extra-cellular conditions. This observation leads to two important interpretations: the first is that in both S. cerevisiae and human beta-cells, despite differences in enzyme kinetics and mechanism of feedback control, evolution seems to have kept an oscillatory behavior coupled to the glucose concentration outside the cytoplasm, and the second is that the development of drugs to regulate metabolic dysfunctions in more complex systems may require further study, not only determining which enzyme is controlling the flux of the system but also under which conditions and how its control is maintained by the enzyme or transferred to other enzymes in the pathway as the drug starts acting. [Abstract/Link to Full Text]

Martinez ML, Machado MA, Nascimento CS, Silva MV, Teodoro RL, Furlong J, Prata MC, Campos AL, Guimarães MF, Azevedo AL, Pires MF, Verneque RS
Association of BoLA-DRB3.2 alleles with tick (Boophilus microplus) resistance in cattle.
Genet Mol Res. 2006;5(3):513-24.
Losses caused by bovine tick burdens in tropical countries have a tremendous economic impact on production systems. Besides reducing production, this parasite can cause death in the most susceptible animals. The use of commercial acaricides has been the major method of control, but their misuse has led to tick resistance to many chemicals. More recently, vaccines have been used in some countries without solving the problem completely. An alternative could be the development of resistant animals and the use of genetic markers and candidate genes that could help with the enormous task of selecting resistant animals. The bovine lymphocyte antigen genes (BoLA) have been shown to be associated with some parasitic infestations and disease incidence. Thus, the objective of the present study was to determine the association of BoLA-DRB3.2 alleles with tick resistance in cattle. The study was conducted on 231 F2 (Gyr x Holstein) animals that were artificially infested with 10,000 tick larvae. Log of tick count +1 was used as the dependent variable in a mixed animal model with allele substitution effects in addition to fixed effects of year and season at tick count, sex of calves, age of animal at tick count, hair type (short-straight, short-curl, long-straight, and long-curl), coat color (white, >75% white, 50-75% white, and 25-50% white), and additive genetic, permanent environmental and residual effects as random. Females showed fewer ticks than males. Animals with short-straight hair were more resistant to tick infestation than animals with long-curl hair, and animals with whiter coat color also had fewer ticks. An association between BoLA alleles and lower tick number was found for alleles DRB3.2 *18, *20 and *27 at the 5% significance level. Also, one allele (DRB3.2*16) showed an association at the 10% level. Allele *27 was the most frequent in the population (30.7%), followed by alleles *16 (10.8%), *20 (8.7%) and *18 (2.4%).
These results suggest that BoLA-DRB3.2 alleles could be used to help in the selection of animals resistant to tick infestation. However, further studies involving a larger population of cattle in combination with other BoLA genes may help to understand the mechanisms of resistance to parasites. [Abstract/Link to Full Text]
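
The dependent variable described above, the log of tick count plus one, is a common transformation for count data that may include zeros. The sketch below shows the transformation only, not the mixed animal model. The abstract does not state the base of the logarithm, so the natural log is an assumption here, and the counts are invented for illustration.

```python
import math


def log_tick_count(count: int) -> float:
    """Return ln(count + 1); defined for zero counts, unlike ln(count).

    The natural logarithm is an assumption; the abstract does not give the base.
    """
    if count < 0:
        raise ValueError("tick count cannot be negative")
    return math.log1p(count)


# Hypothetical tick counts for a few animals.
counts = [0, 9, 45, 210]
print([round(log_tick_count(c), 3) for c in counts])
```

Adding one before taking the log keeps animals with no ticks in the analysis and compresses the long right tail typical of parasite counts.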

Roth DM, Senna JP, Machado DC
Evaluation of the humoral immune response in BALB/c mice immunized with a naked DNA vaccine anti-methicillin-resistant Staphylococcus aureus.
Genet Mol Res. 2006;5(3):503-12.
Methicillin-resistant Staphylococcus aureus (MRSA) is the major pathogen involved in nosocomial infections, leading to high rates of morbidity and mortality in hospitals worldwide. The methicillin resistance occurs due to the presence of an additional penicillin-binding protein, PBP2a, which has low affinity for beta-lactam antibiotics. In the past few years, vancomycin has been the only antibiotic option for treatment of infections caused by multiresistant MRSA; however, reports of vancomycin-resistant strains have generated great concerns regarding the treatment to overcome these infections. In the present study, we report preliminary results regarding the humoral immune response generated in BALB/c mice by two different doses of naked DNA vaccine containing an internal region, comprising the serine-protease domain, of the PBP2a of MRSA. The immunization procedure consisted of four immunizations given intramuscularly at 15-day intervals. Blood was collected weekly and anti-PBP2a-specific antibodies were screened by ELISA. BALB/c mice immunized with DNA vaccine anti-PBP2a have shown higher antibody titers mainly after the fourth immunization, and intriguingly, no correlation between the humoral immune response and DNA dose was observed. Our results suggest that the DNA vaccine anti-PBP2a induced an immune response by production of specific antibodies anti-MRSA in a non-dose-dependent manner, and it could represent a new and valuable approach to produce specific antibodies for passive immunization to overcome MRSA infections. [Abstract/Link to Full Text]

Scarpassa VM, Conn JE
Molecular differentiation in natural populations of Anopheles oswaldoi sensu lato (Diptera: Culicidae) from the Brazilian Amazon, using sequences of the COI gene from mitochondrial DNA.
Genet Mol Res. 2006;5(3):493-502.
Anopheles (Nyssorhynchus) oswaldoi (Peryassú, 1922) s. l., which has been incriminated as a potential human malaria vector in Western Brazilian Amazon, may constitute a cryptic species complex. However, the most recent study with isozymes indicated high similarity among samples from the States of Acre, Amazonas and Rondônia in the Brazilian Amazon. In the present study, 45 individuals were sequenced from Sena Madureira (State of Acre), Coari (State of Amazonas), São Miguel (State of Rondônia), and Moju (State of Pará), using the cytochrome oxidase I gene from mitochondrial DNA. Twenty-five haplotypes were identified in the four localities, and no haplotype was shared among them. The lowest haplotype number was detected in the Coari sample. The dendrogram based on maximum parsimony analysis yielded four groups: I) haplotypes 1, 2, 3, 4, and 5 from Sena Madureira and haplotypes 17 and 18 from São Miguel; II) haplotypes 13 to 16 and 19 to 22 from São Miguel; III) haplotypes 23 to 25 from Moju, and IV) haplotypes 6 to 9 from Sena Madureira and haplotypes 10 to 12 from Coari. The genetic distance (uncorrected p) obtained among the four groups ranged from 0.08 to 5.3%, with the highest values (4.97 to 5.3%) found between groups I (Sena Madureira) and III (Moju). Based on male genitalia identification, it was suggested that group I may be A. oswaldoi s. s. whereas group IV may be A. konderi. Groups II and III could constitute other lineages or species within A. oswaldoi s. l., whose taxonomic status remains to be clarified. These results suggest that additional studies are necessary using samples of A. oswaldoi s. l. from a larger geographic area. [Abstract/Link to Full Text]
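
The "uncorrected p" distance mentioned in this abstract is, in its usual definition, the proportion of aligned sites at which two sequences differ. The sketch below illustrates that definition only; the function name, the handling of gaps and ambiguous bases, and the toy sequences are assumptions for illustration and are not taken from the study.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected p-distance: differing sites divided by compared sites.

    Assumes the two sequences are already aligned to equal length.
    Sites with a gap ('-') or an ambiguous base ('N') in either
    sequence are skipped (an illustrative choice, not the paper's).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    diffs = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue  # skip sites that cannot be compared
        compared += 1
        if a != b:
            diffs += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return diffs / compared


# Toy example: one difference in ten compared sites.
print(p_distance("ACGTACGTAC", "ACGTACGTAT"))
```

Multiplying the result by 100 gives a percentage on the same scale as the values quoted in the abstract.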

Nassar NM, Kalkmann DC, Collevatti R
Molecular analysis of apomixis in cassava.
Genet Mol Res. 2006;5(3):487-92.
Cassava is the main staple for more than 800 million people in the tropics. It is propagated vegetatively by stem cuttings, which maintains superior genotypes but favors disease accumulation and spread. In this report, we present the results of the screening of the progeny and the second generation of the clone UnB 307 for apomixis using microsatellites. A total of 29 plants were screened, representing the maternal plant, its first and second generations that were left to open pollination. About 20% of the offspring were rated as genetically identical plants. This result confirms the facultative apomictic nature of cassava, with high environmental effect. [Abstract/Link to Full Text]

Lacorte GA, Machado MA, Martinez ML, Campos AL, Maciel RP, Verneque RS, Teodoro RL, Peixoto MG, Carvalho MR, Fonseca CG
DGAT1 K232A polymorphism in Brazilian cattle breeds.
Genet Mol Res. 2006;5(3):475-82.
Recent reports identified DGAT1 (EC harboring a lysine to alanine substitution (K232A) as a candidate gene with a strong effect on milk production traits. Our objective was to estimate the frequency of the DGAT1 K232A polymorphism in the main Zebu and Taurine breeds in Brazil as well as in Zebu x Taurine crossbreds as a potential QTL for marker-assisted selection. Samples of 331 animals from the main Brazilian breeds, Nellore, Guzerat, Red Sindhi, Gyr, Holstein, and Gyr x Holstein F1 were genotyped for DGAT1 K232A polymorphism (A and K alleles) using the PCR-RFLP technique. The highest frequency of the A allele was found in the Holstein sample (73%) followed by Gyr x Holstein F1 (39%). Gyr and Red Sindhi showed low frequencies of A alleles (4 and 2.5%, respectively). The A allele was not found in the Nellore and Guzerat samples. Our results could be used to guide association studies between this locus and milk traits in these breeds. [Abstract/Link to Full Text]
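
Allele frequencies such as those reported here are conventionally obtained by counting alleles across genotyped animals. The sketch below shows that arithmetic for a biallelic locus with alleles A and K. The function name and the genotype counts are hypothetical and do not come from the study.

```python
def allele_frequencies(n_aa: int, n_ak: int, n_kk: int) -> dict:
    """Allele frequencies at a biallelic locus from genotype counts.

    Each animal carries two alleles: AA contributes two A alleles,
    AK contributes one A and one K, KK contributes two K alleles.
    """
    n_animals = n_aa + n_ak + n_kk
    if n_animals == 0:
        raise ValueError("at least one genotyped animal is required")
    total_alleles = 2 * n_animals
    freq_a = (2 * n_aa + n_ak) / total_alleles
    return {"A": freq_a, "K": 1.0 - freq_a}


# Hypothetical counts, for illustration only.
print(allele_frequencies(n_aa=10, n_ak=20, n_kk=20))
```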

Jesus FF, Wilkins JF, Solferini VN, Wakeley J
Expected coalescence times and segregating sites in a model of glacial cycles.
Genet Mol Res. 2006;5(3):466-74.
The climatic fluctuations of the Quaternary have influenced the distribution of numerous plant and animal species. Several species suffer population reduction and fragmentation, becoming restricted to refugia during glacial periods and expanding again during interglacials. The reduction in population size may reduce the effective population size, mean coalescence time and genetic variation, whereas an increased subdivision may have the opposite effect. To investigate these two opposing forces, we proposed a model in which a panmictic and a structured phase alternate, corresponding to interglacial and glacial periods. From this model, we derived an expression for the expected coalescence time and number of segregating sites for a pair of genes. We observed that increasing the number of demes or the duration of the structured phases causes an increase in coalescence time and expected levels of genetic variation. We compared numerical results with the ones expected for a panmictic population of constant size, and showed that the mean number of segregating sites can be greater in our model even when population size is much smaller in the structured phases. This points to the importance of population structure in the history of species subject to climatic fluctuations, and helps explain the long gene genealogies observed in several organisms. [Abstract/Link to Full Text]

Peixoto MG, Verneque RS, Teodoro RL, Penna VM, Martinez ML
Genetic trend for milk yield in Guzerat herds participating in progeny testing and MOET nucleus schemes.
Genet Mol Res. 2006;5(3):454-65.
Genetic trends for 305-day milk yield (P305) in Brazilian Guzerat herds under selection were compared. Data from 4898 lactations of 3179 purebred and crossbred cows from various regions of Brazil were used. Milk yield was adjusted for mature age and the contemporary groups were defined as herd and calving year. Genetic parameters were estimated using the MTDFREML program. The model included the random effects of animals and permanent environment, and herd-calving year, calving season and genetic composition as fixed effects. Genetic trends were estimated by linear regression of weighted average estimated breeding values as a function of calving year. The average P305 was 2065 +/- 922 kg and the heritability was 0.23 +/- 0.03. The annual genetic trend in estimated breeding values of cows for P305 was 7.09 +/- 0.71 kg between 1987 and 2004, and 6.47 +/- 2.35 kg between 1997 and 2004. For cows born and raised in the multiple ovulation and embryo transfer (MOET) nucleus, this trend was 36.46 +/- 24.54 kg/year between 1997 and 2004, 183.14 +/- 47.94 kg/year between 1997 and 2000, and 9.13 +/- 19.19 kg/year between 2001 and 2004. An average inbreeding coefficient of 0.04 was found for inbred MOET cows in 2004. Increasing the size of the family and introducing new progenies changed reliabilities and predicted transmitting ability estimates of MOET sires. In conclusion, there was a positive genetic trend for milk yield in the MOET nucleus at low inbreeding coefficients due to the increased accuracy and estimated genetic merit, but no changes in the average milk yield were observed. [Abstract/Link to Full Text]

Christofolini DM, Lipay MV, Ramos MA, Brunoni D, Melaragno MI
Screening for fragile X syndrome among Brazilian mentally retarded male patients using PCR from buccal cell DNA.
Genet Mol Res. 2006;5(3):448-53.
Fragile X syndrome is one of the most frequent causes of mental retardation. Since the phenotype in this syndrome is quite variable, clinical diagnosis is not easy and molecular laboratory diagnosis is necessary. Usually DNA from blood cells is used in molecular tests to detect the fragile X mutation which is characterized by an unstable expansion of a CGG repeat in the fragile X mental retardation gene (FMR1). In the present study, blood and buccal cells of 53 mentally retarded patients were molecularly analyzed for FMR1 mutation by PCR. Our data revealed that DNA extraction from buccal cells is a useful noninvasive alternative in the screening of the FMR1 mutation among mentally retarded males. [Abstract/Link to Full Text]

Beauchemin VR, Thomas MG, Franke DE, Silver GA
Evaluation of DNA polymorphisms involving growth hormone relative to growth and carcass characteristics in Brahman steers.
Genet Mol Res. 2006;5(3):438-47.
Associations of DNA polymorphisms in growth hormone (GH) relative to growth and carcass characteristics in growing Brahman steers (N = 324 from 68 sires) were evaluated. Polymorphisms were an Msp-I RFLP and a leucine/valine SNP in the GH gene as well as a Hinf-I RFLP and a histidine/arginine SNP in transcriptional regulators of the GH gene, Pit-1 and Prop-1. Genotypic frequencies of the GH SNP, Pit-1 RFLP, and Prop-1 SNP were greater than 88% for one of the bi-allelic homozygous genotypes. Genotypic frequencies for the GH Msp-I RFLP genotypes were more evenly distributed with frequencies of 0.43, 0.42, and 0.15 for the genotypes of +/+, +/-, and -/-, respectively. Mixed model analyses of growth and carcass traits with genotype and contemporary group serving as fixed effects and sire fitted as a random effect suggested that sire was a significant source of variation (P < 0.05) in average daily gain, carcass yield, and marbling score. However, measures of growth and carcass traits were similar across GH Msp-I genotypes as steers were slaughtered when fat thickness was estimated to be approximately 1.0 cm. These polymorphisms within the GH gene and/or its transcriptional regulators do not appear to be informative predictors of growth and carcass characteristics in Brahman steers. This is partly due to the high level of homozygosity of genotypes. These findings do not eliminate the potential importance of these polymorphisms as predictors of growth and carcass traits in Bos taurus or Bos taurus x Bos indicus composite cattle. [Abstract/Link to Full Text]

Bicalho HM, Pimenta CG, Mendes IK, Pena HB, Queiroz EM, Pena SD
Determination of ancestral proportions in synthetic bovine breeds using commonly employed microsatellite markers.
Genet Mol Res. 2006;5(3):432-7.
The International Society of Animal Genetics (ISAG) has chosen nine microsatellites (international marker set) as a standard that should be included in all cattle parentage studies. They are BM1824, BM2113, INRA023, SPS115, TGLA122, TGLA126, TGLA227, ETH10, and ETH225. We decided to ascertain whether this microsatellite set could be used to determine ancestral proportions in individual animals of synthetic breeds produced by crossing zebu and taurine cattle. Since the genotypes of these markers are routinely available, this would constitute a practical and cost-free method to estimate the ancestry of synthetic breed animals. Genotypes of 100 Gir and 100 Holstein animals were examined for this ISAG marker set. As expected, there were very significant allele frequency differences between the two breeds at most loci. We also typed 20 Girolando animals for which there was complete genealogical information. "Structure" software easily distinguished Holstein and Gir animals based on their microsatellite genotypes; it also attributed the genomic proportion of zebu and taurine of each of the 20 Girolando animals. The proportion of Holstein ancestry was then regressed on the genealogical data; there was a highly significant correlation (r = 0.84, P < 0.0001). The nine microsatellites that compose the ISAG international marker set were capable of estimating the ancestral Gir and Holstein genomic proportions in individual Girolando animals within narrow confidence limits. This microsatellite set might also be useful for estimating the proportions of taurine and zebu origins in commercial meat products. [Abstract/Link to Full Text]

Martínez-Agüero M, Flores-Ramírez S, Ruiz-García M
First report of major histocompatibility complex class II loci from the Amazon pink river dolphin (genus Inia).
Genet Mol Res. 2006;5(3):421-31.
We report the first major histocompatibility complex (MHC) DQB1 sequences for the two species of pink river dolphins (Inia geoffrensis and Inia boliviensis) inhabiting the Amazon and Orinoco River basins. These sequences were found to be polymorphic within the Inia genus and showed shared homology with cetacean DQB-1 sequences, especially, those of the Monodontidae and Phocoenidae. On the other hand, these sequences were shown to be divergent from those described for other riverine dolphin species, such as Lipotes vexillifer, the Chinese river dolphin. Two main conclusions can be drawn from our results: 1) the Mhc DQB1 sequences seem to evolve more rapidly than other nuclear sequences in cetaceans, and 2) differential positive selective pressures acting on these genes cause concomitant divergent evolutionary histories that derive phylogenetic reconstructions that could be inconsistent with widely accepted intertaxa evolutionary relationships elucidated with other molecular markers subjected to a neutral dynamics. [Abstract/Link to Full Text]

Zaitoun I, Khatib H
Assessment of genomic imprinting of SLC38A4, NNAT, NAP1L5, and H19 in cattle.
BMC Genet. 2006;7:49.
BACKGROUND: At present, few imprinted genes have been reported in cattle compared to human and mouse. Comparative expression analysis and imprinting status are powerful tools for investigating the biological significance of genomic imprinting and studying the regulation mechanisms of imprinted genes. The objective of this study was to assess the imprinting status and pattern of expression of the SLC38A4, NNAT, NAP1L5, and H19 genes in bovine tissues. RESULTS: A polymorphism-based approach was used to assess the imprinting status of four bovine genes in a total of 75 tissue types obtained from 12 fetuses and their dams. In contrast to mouse Slc38a4, which is imprinted in a tissue-specific manner, we found that SLC38A4 is not imprinted in cattle, and we found it expressed in all adult tissues examined. Two single nucleotide polymorphisms (SNPs) were identified in NNAT and used to distinguish between monoallelic and biallelic expression in fetal and adult tissues. The two transcripts of NNAT showed paternal expression like their orthologues in human and mouse. However, in contrast to human and mouse, NNAT was expressed in a wide range of tissues, both fetal and adult. Expression analysis of NAP1L5 in five heterozygous fetuses showed that the gene was paternally expressed in all examined tissues, in contrast to mouse where imprinting is tissue-specific. H19 was found to be maternally expressed like its orthologues in human, sheep, and mouse. CONCLUSION: This is the first report on the imprinting status of SLC38A4, NAP1L5, and on the expression patterns of the two transcripts of NNAT in cattle. It is of interest that the imprinting of NAP1L5, NNAT, and H19 appears to be conserved between mouse and cow, although the tissue distribution of expression differs. In contrast, the imprinting of SLC38A4 appears to be species-specific. [Abstract/Link to Full Text]

Bhagavatula J, Singh L
Genotyping faecal samples of Bengal tiger Panthera tigris tigris for population estimation: a pilot study.
BMC Genet. 2006;7:48.
BACKGROUND: Bengal tiger Panthera tigris tigris, the National Animal of India, is an endangered species. Estimating populations for such species is the main objective for designing conservation measures and for evaluating those that are already in place. Due to the tiger's cryptic and secretive behaviour, it is not possible to enumerate and monitor its populations through direct observations; instead indirect methods have always been used for studying tigers in the wild. DNA methods based on non-invasive sampling have not been attempted so far for tiger population studies in India. We describe here a pilot study using DNA extracted from faecal samples of tigers for the purpose of population estimation. RESULTS: In this study, PCR primers were developed based on tiger-specific variations in the mitochondrial cytochrome b for reliably identifying tiger faecal samples from those of sympatric carnivores. Microsatellite markers were developed for the identification of individual tigers with a sibling Probability of Identity of 0.005 that can distinguish even closely related individuals with 99.9% certainty. The effectiveness of using field-collected tiger faecal samples for DNA analysis was evaluated by sampling, identification and subsequently genotyping samples from two protected areas in southern India. CONCLUSION: Our results demonstrate the feasibility of using tiger faecal matter as a potential source of DNA for population estimation of tigers in protected areas in India in addition to the methods currently in use. [Abstract/Link to Full Text]
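
The abstract does not say how the sibling Probability of Identity was computed. A widely used single-locus expression is PI_sibs = 0.25 + 0.5*sum(p_i^2) + 0.5*(sum(p_i^2))^2 - 0.25*sum(p_i^4), multiplied across independent loci. The sketch below assumes that formula; the function names and allele frequencies are illustrative and are not the study's markers or data.

```python
def pi_sibs_locus(allele_freqs):
    """Single-locus sibling probability of identity.

    Assumes the common formula
    0.25 + 0.5*S2 + 0.5*S2**2 - 0.25*S4,
    where S2 and S4 are the sums of squared and fourth-power
    allele frequencies at the locus.
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    s2 = sum(p ** 2 for p in allele_freqs)
    s4 = sum(p ** 4 for p in allele_freqs)
    return 0.25 + 0.5 * s2 + 0.5 * s2 ** 2 - 0.25 * s4


def pi_sibs_multilocus(loci):
    """Product of single-locus values, assuming independent loci."""
    result = 1.0
    for freqs in loci:
        result *= pi_sibs_locus(freqs)
    return result


# Hypothetical panel: three loci with four equally frequent alleles each.
panel = [[0.25, 0.25, 0.25, 0.25]] * 3
print(pi_sibs_multilocus(panel))
```

Adding more, or more variable, loci drives the product down, which is how a marker panel can reach a small combined value such as the one quoted above.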

Andrenacci D, Grimaldi MR, Panetta V, Riano E, Rugarli EI, Graziani F
Functional dissection of the Drosophila Kallmann's syndrome protein DmKal-1.
BMC Genet. 2006;7:47.
BACKGROUND: Anosmin-1, the protein implicated in the X-linked Kallmann's syndrome, plays a role in axon outgrowth and branching but also in epithelial morphogenesis. The molecular mechanism of its action is, however, widely unknown. Anosmin-1 is an extracellular protein which contains a cysteine-rich region, a whey acidic protein (WAP) domain homologous to some serine protease inhibitors, and four fibronectin-like type III (FnIII) repeats. Drosophila melanogaster Kal-1 (DmKal-1) has the same protein structure with minor differences, the most important of which is the presence of only two FnIII repeats and a C-terminal region showing a low similarity with the third and the fourth human FnIII repeats. We present a structure-function analysis of the different DmKal-1 domains, including a predicted heparan-sulfate binding site. RESULTS: This study was performed overexpressing wild type DmKal-1 and a series of deletion and point mutation proteins in two different tissues: the cephalopharyngeal skeleton of the embryo and the wing disc. The overexpression of DmKal-1 in the cephalopharyngeal skeleton induced dosage-sensitive structural defects, and we used these phenotypes to perform a structure-function dissection of the protein domains. The reproduction of two deletions found in Kallmann's Syndrome patients determined a complete loss of function, whereas point mutations induced only minor alterations in the activity of the protein. Overexpression of the mutant proteins in the wing disc reveals that the functional relevance of the different DmKal-1 domains is dependent on the extracellular context. CONCLUSION: We suggest that the role played by the various protein domains differs in different extracellular contexts. This might explain why the same mutation analyzed in different tissues or in different cell culture lines often gives opposite phenotypes. 
These analyses also suggest that the FnIII repeats have a main and specific role, while the WAP domain might have only a modulator role, strictly connected to that of the fibronectins. [Abstract/Link to Full Text]

Brunberg E, Andersson L, Cothran G, Sandberg K, Mikko S, Lindgren G
A missense mutation in PMEL17 is associated with the Silver coat color in the horse.
BMC Genet. 2006;7:46.
BACKGROUND: The Silver coat color, also called Silver dapple, in the horse is characterized by dilution of the black pigment in the hair. This phenotype shows an autosomal dominant inheritance. The effect of the mutation is most visible in the long hairs of the mane and tail, which are diluted to a mixture of white and gray hairs. Herein we describe the identification of the responsible gene and a missense mutation associated with the Silver phenotype. RESULTS: Segregation data on the Silver locus (Z) were obtained within one half-sib family that consisted of a heterozygous Silver colored stallion with 34 offspring and their 29 non-Silver dams. We typed 41 genetic markers well spread over the horse genome, including one single microsatellite marker (TKY284) close to the candidate gene PMEL17 on horse chromosome 6 (ECA6q23). Significant linkage was found between the Silver phenotype and TKY284 (theta = 0, z = 9.0). DNA sequencing of PMEL17 in Silver and non-Silver horses revealed a missense mutation in exon 11 changing the second amino acid in the cytoplasmic region from arginine to cysteine (Arg618Cys). This mutation showed complete association with the Silver phenotype across multiple horse breeds, and was not found among non-Silver horses with one clear exception; a chestnut colored individual that had several Silver offspring when mated to different non-Silver stallions also carried the exon 11 mutation. In total, 64 Silver horses from six breeds and 85 non-Silver horses from 14 breeds were tested for the exon 11 mutation. One additional mutation located in intron 9, only 759 bases from the missense mutation, also showed complete association with the Silver phenotype. However, as one could expect to find several non-causative mutations completely associated with the Silver mutation, we argue that the missense mutation is more likely to be causative. 
CONCLUSION: The present study shows that PMEL17 causes the Silver coat color in the horse and enables genetic testing for this trait. [Abstract/Link to Full Text]
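
A linkage result such as "theta = 0, z = 9.0" can be read through the textbook two-point LOD score. For n phase-known informative meioses with k recombinants, LOD(theta) = log10(theta^k * (1 - theta)^(n - k) / 0.5^n). The sketch below assumes that simplified phase-known case; the abstract does not state the number of informative meioses or the phase assumptions used, so the function name and the value of 30 meioses are purely illustrative.

```python
import math


def lod_phase_known(theta: float, n_meioses: int, n_recombinants: int) -> float:
    """Two-point LOD score for phase-known meioses (simplified sketch)."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    if not 0 <= n_recombinants <= n_meioses:
        raise ValueError("recombinants must be between 0 and n_meioses")
    k, n = n_recombinants, n_meioses
    likelihood = (theta ** k) * ((1.0 - theta) ** (n - k))
    if likelihood == 0.0:
        # theta = 0 is impossible once any recombinant is observed
        return float("-inf")
    # Dividing by 0.5**n is the same as adding n*log10(2)
    return math.log10(likelihood) + n * math.log10(2.0)


# With no recombinants, each informative meiosis adds log10(2), about 0.30.
print(round(lod_phase_known(0.0, 30, 0), 2))
```

Under these assumptions, a score near 9 at theta = 0 corresponds to roughly 30 informative meioses without a recombinant.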

Natanaelsson C, Oskarsson MC, Angleby H, Lundeberg J, Kirkness E, Savolainen P
Dog Y chromosomal DNA sequence: identification, sequencing and SNP discovery.
BMC Genet. 2006;7:45.
BACKGROUND: Population genetic studies of dogs have so far mainly been based on analysis of mitochondrial DNA, describing only the history of female dogs. To get a picture of the male history, as well as a second independent marker, there is a need for studies of biallelic Y-chromosome polymorphisms. However, there are no biallelic polymorphisms reported, and only 3200 bp of non-repetitive dog Y-chromosome sequence deposited in GenBank, necessitating the identification of dog Y chromosome sequence and the search for polymorphisms therein. The genome has been only partially sequenced for one male dog, disallowing mapping of the sequence into specific chromosomes. However, by comparing the male genome sequence to the complete female dog genome sequence, candidate Y-chromosome sequence may be identified by exclusion. RESULTS: The male dog genome sequence was analysed by Blast search against the human genome to identify sequences with a best match to the human Y chromosome and to the female dog genome to identify those absent in the female genome. Candidate sequences were then tested for male specificity by PCR of five male and five female dogs. 32 sequences from the male genome, with a total length of 24 kbp, were identified as male specific, based on a match to the human Y chromosome, absence in the female dog genome and male specific PCR results. 14437 bp were then sequenced for 10 male dogs originating from Europe, Southwest Asia, Siberia, East Asia, Africa and America. Nine haplotypes were found, which were defined by 14 substitutions. The genetic distance between the haplotypes indicates that they originate from at least five wolf haplotypes. There was no obvious trend in the geographic distribution of the haplotypes. CONCLUSION: We have identified 24159 bp of dog Y-chromosome sequence to be used for population genetic studies. 
We sequenced 14437 bp in a worldwide collection of dogs, identifying 14 SNPs for future SNP analyses, and giving a first description of the dog Y-chromosome phylogeny. [Abstract/Link to Full Text]

Steshina EY, Carr MS, Glick EA, Yevtodiyenko A, Appelbe OK, Schmidt JV
Loss of imprinting at the Dlk1-Gtl2 locus caused by insertional mutagenesis in the Gtl2 5' region.
BMC Genet. 2006;7:44.
BACKGROUND: The Dlk1 and Gtl2 genes define a region of mouse chromosome 12 that is subject to genomic imprinting, the parental allele-specific expression of a gene. Although imprinted genes play important roles in growth and development, the mechanisms by which imprinting is established and maintained are poorly understood. Differentially methylated regions (DMRs), which carry methylation on only one parental allele, are involved in imprinting control at many loci. The Dlk1-Gtl2 region contains three known DMRs, the Dlk1 DMR in the 3' region of Dlk1, the intergenic DMR 15 kb upstream of Gtl2, and the Gtl2 DMR at the Gtl2 promoter. Three mouse models are analyzed here that provide new information about the regulation of Dlk1-Gtl2 imprinting. RESULTS: A previously existing insertional mutation (Gtl2lacZ), and a targeted deletion in which the Gtl2 upstream region was replaced by a Neo cassette (Gtl2Delta5'Neo), display partial lethality and dwarfism upon paternal inheritance. Molecular characterization shows that both mutations cause loss of imprinting and changes in expression of the Dlk1, Gtl2 and Meg8/Rian genes. Dlk1 levels are decreased upon paternal inheritance of either mutation, suggesting Dlk1 may be causative for the lethality and dwarfism. Loss of imprinting on the paternal chromosome in both Gtl2lacZ and Gtl2Delta5'Neo mice is accompanied by the loss of paternal-specific Gtl2 DMR methylation, while maternal loss of imprinting suggests a previously unknown regulatory role for the maternal Gtl2 DMR. Unexpectedly, when the Neo gene is excised, Gtl2Delta5' animals are of normal size, imprinting is unchanged and the Gtl2 DMR is properly methylated. The exogenous DNA sequences integrated upstream of Gtl2 are therefore responsible for the growth and imprinting effects. CONCLUSION: These data provide further evidence for the coregulation of the imprinted Dlk1 and Gtl2 genes, and support a role for Dlk1 as an important neonatal growth factor. 
The ability of the Gtl2lacZ and Gtl2Delta5'Neo mutations to cause long-range changes in imprinting and gene expression suggests that regional imprinting regulatory elements may lie in proximity to the integration site. [Abstract/Link to Full Text]

Chen YH, Kao JT
Multinomial logistic regression approach to haplotype association analysis in population-based case-control studies.
BMC Genet. 2006;7:43.
BACKGROUND: The genetic association analysis using haplotypes as basic genetic units is anticipated to be a powerful strategy towards the discovery of genes predisposing human complex diseases. In particular, the increasing availability of high-resolution genetic markers such as the single-nucleotide polymorphisms (SNPs) has made haplotype-based association analysis an attractive alternative to single marker analysis. RESULTS: We consider haplotype association analysis under the population-based case-control study design. A multinomial logistic model is proposed for haplotype analysis with unphased genotype data, which can be decomposed into a prospective logistic model for disease risk as well as a model for the haplotype-pair distribution in the control population. Environmental factors can be readily incorporated and hence the haplotype-environment interaction can be assessed in the proposed model. The maximum likelihood estimation with unphased genotype data can be conveniently implemented in the proposed model by applying the EM algorithm to a prospective multinomial logistic regression model and ignoring the case-control design. We apply the proposed method to the hypertriglyceridemia study and identify 3 haplotypes in the apolipoprotein A5 gene that are associated with increased risk for hypertriglyceridemia. A haplotype-age interaction effect is also identified. Simulation studies show that the proposed estimator has satisfactory finite-sample performance. CONCLUSION: Our results suggest that the proposed method can serve as a useful alternative to existing methods and a reliable tool for the case-control haplotype-based association analysis. [Abstract/Link to Full Text]
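
The role of the EM algorithm with unphased genotypes can be illustrated with a much simpler problem than the multinomial logistic model proposed in this abstract: estimating haplotype frequencies at two biallelic SNPs, where only double heterozygotes have ambiguous phase. The sketch below is that classic frequency-only EM, not the authors' method. It includes no disease model or covariates, and the function name, genotype coding and toy data are assumptions for illustration.

```python
from collections import Counter

HAPS = ("00", "01", "10", "11")


def em_haplotype_freqs(genotypes, n_iter=100, tol=1e-10):
    """EM estimate of two-SNP haplotype frequencies from unphased genotypes.

    Each genotype is (g1, g2), where g counts copies of allele '1'
    (0, 1 or 2) at that SNP. Only (1, 1) is phase-ambiguous: it is
    either the pair 00/11 or the pair 01/10.
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("no genotypes supplied")
    fixed = Counter()  # haplotype counts from unambiguous individuals
    n_double_het = 0
    for g1, g2 in genotypes:
        if g1 not in (0, 1, 2) or g2 not in (0, 1, 2):
            raise ValueError("genotypes must be coded 0, 1 or 2")
        if g1 == 1 and g2 == 1:
            n_double_het += 1
            continue
        # Alleles carried on the two chromosomes at each SNP; at most one
        # SNP is heterozygous here, so the haplotype pair is determined.
        a = (0, 0) if g1 == 0 else (1, 1) if g1 == 2 else (0, 1)
        b = (0, 0) if g2 == 0 else (1, 1) if g2 == 2 else (0, 1)
        fixed[f"{a[0]}{b[0]}"] += 1
        fixed[f"{a[1]}{b[1]}"] += 1

    freqs = {h: 0.25 for h in HAPS}  # uniform starting values
    for _ in range(n_iter):
        # E-step: posterior probability that a double het is 00/11
        cis = freqs["00"] * freqs["11"]
        trans = freqs["01"] * freqs["10"]
        w = cis / (cis + trans) if (cis + trans) > 0 else 0.5
        counts = {h: float(fixed[h]) for h in HAPS}
        counts["00"] += n_double_het * w
        counts["11"] += n_double_het * w
        counts["01"] += n_double_het * (1.0 - w)
        counts["10"] += n_double_het * (1.0 - w)
        # M-step: expected counts divided by the number of chromosomes
        new = {h: counts[h] / (2 * n) for h in HAPS}
        delta = max(abs(new[h] - freqs[h]) for h in HAPS)
        freqs = new
        if delta < tol:
            break
    return freqs


# Toy data: eight unambiguous individuals and two double heterozygotes.
toy = [(0, 0)] * 4 + [(2, 2)] * 4 + [(1, 1)] * 2
print(em_haplotype_freqs(toy))
```

In the toy data, the unambiguous individuals carry only haplotypes 00 and 11, so the iterations assign the double heterozygotes almost entirely to the 00/11 configuration.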

Thanseem I, Thangaraj K, Chaubey G, Singh VK, Bhaskar LV, Reddy BM, Reddy AG, Singh L
Genetic affinities among the lower castes and tribal groups of India: inference from Y chromosome and mitochondrial DNA.
BMC Genet. 2006;7:42.
BACKGROUND: India is a country with enormous social and cultural diversity due to its positioning on the crossroads of many historic and pre-historic human migrations. The hierarchical caste system in the Hindu society dominates the social structure of the Indian populations. The origin of the caste system in India is a matter of debate with many linguists and anthropologists suggesting that it began with the arrival of Indo-European speakers from Central Asia about 3500 years ago. Previous genetic studies based on Indian populations failed to achieve a consensus in this regard. We analysed the Y-chromosome and mitochondrial DNA of three tribal populations of southern India, compared the results with available data from the Indian subcontinent and tried to reconstruct the evolutionary history of Indian caste and tribal populations. RESULTS: No significant difference was observed in the mitochondrial DNA between Indian tribal and caste populations, except for the presence of a higher frequency of west Eurasian-specific haplogroups in the higher castes, mostly in the north western part of India. On the other hand, the study of the Indian Y lineages revealed distinct distribution patterns among caste and tribal populations. The paternal lineages of Indian lower castes showed significantly closer affinity to the tribal populations than to the upper castes. The frequencies of deep-rooted Y haplogroups such as M89, M52, and M95 were higher in the lower castes and tribes, compared to the upper castes. CONCLUSION: The present study suggests that the vast majority (> 98%) of the Indian maternal gene pool, consisting of Indo-European and Dravidian speakers, is genetically more or less uniform. Invasions after the late Pleistocene settlement might have been mostly male-mediated. However, Y-SNP data provides compelling genetic evidence for a tribal origin of the lower caste populations in the subcontinent. 
Lower caste groups might have originated with the hierarchical divisions that arose within the tribal groups with the spread of Neolithic agriculturalists, much earlier than the arrival of Aryan speakers. The Indo-Europeans established themselves as upper castes among this already developed caste-like class structure within the tribes. [Abstract/Link to Full Text]

Gartler SM, Varadarajan KR, Luo P, Norwood TH, Canfield TK, Hansen RS
Abnormal X: autosome ratio, but normal X chromosome inactivation in human triploid cultures.
BMC Genet. 2006;7:41.
BACKGROUND: X chromosome inactivation (XCI) is that aspect of mammalian dosage compensation that brings about equivalence of X-linked gene expression between females and males by inactivating one of the two X chromosomes (Xi) in normal female cells, leaving them with a single active X (Xa) as in male cells. In cells with more than two X's, but a diploid autosomal complement, all X's but one, Xa, are inactivated. This phenomenon is commonly thought to suggest 1) that normal development requires a ratio of one Xa per diploid autosomal set, and 2) that an early event in XCI is the marking of one X to be active, with remaining X's becoming inactivated by default. RESULTS: Triploids provide a test of these ideas because the ratio of one Xa per diploid autosomal set cannot be achieved, yet this abnormal ratio should not necessarily affect the one-Xa choice mechanism for XCI. Previous studies of XCI patterns in murine triploids support the single-Xa model, but human triploids mostly have two-Xa cells, whether they are XXX or XXY. The XCI patterns we observe in fibroblast cultures from different XXX human triploids suggest that the two-Xa pattern of XCI is selected for, and may have resulted from rare segregation errors or Xi reactivation. CONCLUSION: The initial X inactivation pattern in human triploids, therefore, is likely to resemble the pattern that predominates in murine triploids, i.e., a single Xa, with the remaining X's inactive. Furthermore, our studies of XIST RNA accumulation and promoter methylation suggest that the basic features of XCI are normal in triploids despite the abnormal X:autosome ratio. [Abstract/Link to Full Text]

Lemire M
SUP: an extension to SLINK to allow a larger number of marker loci to be simulated in pedigrees conditional on trait values.
BMC Genet. 2006;740.
BACKGROUND: With the recent advances in high-throughput genotyping technologies that allow for large-scale association mapping of human complex traits, promising statistical designs and methods have been emerging. Efficient simulation software is a key element in evaluating the properties of new statistical tests. SLINK is a flexible simulation tool that has been widely used to generate the segregation and recombination processes of markers linked to, and possibly associated with, a trait locus, conditional on trait values in arbitrary pedigrees. In practice, its most serious limitation is the small number of loci that can be simulated, since the complexity of the algorithm scales exponentially with this number. RESULTS: I describe the implementation of a two-step algorithm to be used in conjunction with SLINK to enable the simulation of a large number of marker loci linked to a trait locus and conditional on trait values in families, with the possibility for the loci to be in linkage disequilibrium. SLINK is used in the first step to simulate genotypes at the trait locus conditional on the observed trait values, and also to generate an indicator of the descent path of the simulated alleles. In the second step, marker alleles or haplotypes are generated in the founders, conditional on the trait locus genotypes simulated in the first step. Then the recombination process between the marker loci takes place conditionally on the descent path and on the trait locus genotypes. This two-step implementation is often computationally faster than other software designed to generate marker data linked to, and possibly associated with, a trait locus.
CONCLUSION: Because the proposed method uses SLINK to simulate the segregation process, it benefits from its flexibility: the trait may be qualitative with the possibility of defining different liability classes (which allows for the simulation of gene-environment interactions or even the simulation of multi-locus effects between unlinked susceptibility regions) or it may be quantitative and normally distributed. In particular, this implementation is the only one available that can generate a large number of marker loci conditional on the set of observed quantitative trait values in pedigrees. [Abstract/Link to Full Text]

Ionita I, Man M
Optimal two-stage strategy for detecting interacting genes in complex diseases.
BMC Genet. 2006;739.
BACKGROUND: The mapping of complex diseases is one of the most important problems in human genetics today. The rapid development of technology for genetic research has led to the discovery of millions of polymorphisms across the human genome, making it possible to conduct genome-wide association studies with hundreds of thousands of markers. Given the large number of markers to be tested in such studies, a two-stage strategy may be a reasonable and powerful approach: in the first stage, a small subset of promising loci is identified using single-locus testing, and, in the second stage, multi-locus methods are used while taking into account the loci selected in the first stage. In this report, we investigate and compare two possible two-stage strategies for genome-wide association studies: a conditional approach and a simultaneous approach. RESULTS: We investigate the power of both the conditional and the simultaneous approach to detect the disease loci for a range of two-locus disease models in a case-control study design. Our results suggest that, overall, the conditional approach is more robust and more powerful than the simultaneous approach; the conditional approach can greatly outperform the simultaneous approach when one of the two disease loci has weak marginal effect, but interacts strongly with the other, stronger locus (easily detectable using single-locus methods in the first stage). CONCLUSION: Genome-wide association studies hold the promise of finding new genes implicated in complex diseases. Two-stage strategies are likely to be employed in these large-scale studies. Therefore we compared two natural two-stage approaches: the conditional approach and the simultaneous approach. Our power studies suggest that, when doing genome-wide association studies, a two-stage conditional approach is likely to be more powerful than a two-stage simultaneous approach. [Abstract/Link to Full Text]
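
The two-stage idea can be illustrated by counting which locus pairs reach the second stage. The abstract does not define the two approaches in detail, so the sketch below rests on an assumption: "simultaneous" is taken to mean testing only pairs in which both loci passed stage one, and "conditional" to mean testing pairs in which at least one locus passed. The function names and the toy marker indices are hypothetical.

```python
from itertools import combinations


def simultaneous_pairs(selected):
    """Pairs in which BOTH loci passed stage one (assumed definition)."""
    return list(combinations(sorted(selected), 2))


def conditional_pairs(selected, n_markers):
    """Pairs in which AT LEAST ONE locus passed stage one (assumed definition)."""
    pairs = set()
    for i in selected:
        for j in range(n_markers):
            if i != j:
                pairs.add((min(i, j), max(i, j)))
    return sorted(pairs)


# Toy example: 10 markers, of which markers 0, 1 and 2 passed single-locus testing.
selected = {0, 1, 2}
n_markers = 10
sim = simultaneous_pairs(selected)
cond = conditional_pairs(selected, n_markers)
print(len(sim), len(cond))
```

Under these assumed definitions the conditional set still contains pairs whose second locus has a weak marginal effect, which is the situation the abstract reports as favouring the conditional approach.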

Mathias RA, Gao P, Goldstein JL, Wilson AF, Pugh EW, Furbert-Harris P, Dunston GM, Malveaux FJ, Togias A, Barnes KC, Beaty TH, Huang SK
A graphical assessment of p-values from sliding window haplotype tests of association to identify asthma susceptibility loci on chromosome 11q.
BMC Genet. 2006;738.
BACKGROUND: Past work on asthmatic African American families revealed a strong linkage peak with modest evidence of association on chromosome 11q. Here, we perform tests of association for asthma and a panel of 609 SNPs in African American subjects using a sliding window approach. While efficient in screening a region of dense genotyping, this approach does create some problems: high numbers of tests, assimilating thousands of results, and questions about setting priorities on regions with association signals. RESULTS: We present a newly developed tool, Graphical Assessment of Sliding P-values or GrASP, which uses color display to indicate the width of the sliding windows, the significance of individual tests, the density of SNP coverage and the location of known genes, thereby simplifying some of these issues, and we use it to identify regions of interest in these data. CONCLUSION: We demonstrate that GrASP makes it easier to visualize, summarize and prioritize regions of interest from sliding window haplotype analysis, based jointly on the p-value from all the tests from these windows and the building of haplotypes of significance in the region. Using this approach, five regions yielded strong evidence for linkage and association with asthma, including the prior peak linkage region. [Abstract/Link to Full Text]
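
To see why a sliding window scan produces thousands of results, the following minimal sketch enumerates the windows over a panel of 609 SNPs. The window widths (2 to 6 adjacent SNPs) are an assumption made for illustration; the abstract does not state which widths were used, and `sliding_windows` is a hypothetical helper, not part of GrASP.

```python
def sliding_windows(n_snps, widths):
    """Yield (start, end) index pairs, end exclusive, for every window width."""
    for width in widths:
        # A window of the given width can start at positions 0 .. n_snps - width.
        for start in range(n_snps - width + 1):
            yield start, start + width


# 609 SNPs as in the abstract; widths 2..6 are an illustrative assumption.
windows = list(sliding_windows(609, range(2, 7)))
print(len(windows))
```

Each window corresponds to one haplotype test, so even this modest range of widths yields a few thousand p-values to summarize.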

Zhang H, Zhong X
Linkage analysis of longitudinal data and design consideration.
BMC Genet. 2006;737.
BACKGROUND: Statistical methods have been proposed recently to analyze longitudinal data in genetic studies. So far, little attention has been paid to examining the relationships among key factors in genetic longitudinal studies, including power, the number of families or sibships, and the number of repeated measures per individual subject. RESULTS: We proposed a variance component model that extends classic variance component models for a single quantitative trait to mapping longitudinal traits. Our model includes covariate effects and allows genetic effects to vary over time. Using our proposed model, we examined the power, pedigree structures, and sample size through simulation experiments. CONCLUSION: Our simulation results provide useful insights into the study design for genetic, longitudinal studies. For example, collecting a small number of large sibships is much more powerful than collecting a large number of small sibships or increasing the number of repeated measures, when the total number of measurements is comparable. [Abstract/Link to Full Text]

Gonzalez-Serricchio AS, Sternberg PW
Visualization of C. elegans transgenic arrays by GFP.
BMC Genet. 2006;736.
BACKGROUND: Targeting the green fluorescent protein (GFP) via the E. coli lac repressor (LacI) to a specific DNA sequence, the lac operator (lacO), allows visualization of chromosomes in yeast and mammalian cells. In principle this method of visualization could be used for genetic mosaic analysis, which requires cell-autonomous markers that can be scored easily and at single cell resolution. The C. elegans lin-3 gene encodes an epidermal growth factor (EGF) family growth factor. lin-3 is expressed in the gonadal anchor cell and acts through LET-23 (transmembrane protein tyrosine kinase and ortholog of EGF receptor) to signal the vulval precursor cells to generate vulval tissue. lin-3 is expressed in the vulval cells later, and recent evidence raises the possibility that lin-3 acts in the vulval cells as a relay signal during vulval induction. It is thus of interest to test the site of action of lin-3 by mosaic analysis. RESULTS: We visualized transgenes in living C. elegans by targeting the green fluorescent protein (GFP) via the E. coli lac repressor (LacI) to a specific 256 sequence repeat of the lac operator (lacO) incorporated into transgenes. We engineered animals to express a nuclear-localized GFP-LacI fusion protein. C. elegans cells having a lacO transgene result in nuclear-localized bright spots (i.e., GFP-LacI bound to lacO). Cells with diffuse nuclear fluorescence correspond to unbound nuclear localized GFP-LacI. We detected chromosomes in living animals by chromosomally integrating the array of the lacO repeat sequence and visualizing the integrated transgene with GFP-LacI. This detection system can be applied to determining polyploidy as well as investigating chromosome segregation. To assess the GFP-LacI*lacO system as a marker for mosaic analysis, we conducted genetic mosaic analysis of the epidermal growth factor lin-3, expressed in the anchor cell.
We establish that lin-3 acts in the anchor cell to induce vulva development, demonstrating this method's utility in detecting the presence of a transgene. CONCLUSION: The GFP-LacI*lacO transgene detection system works in C. elegans for visualization of chromosomes and extrachromosomal transgenes. It can be used as a marker for genetic mosaic analysis. The lacO repeat sequence as an extrachromosomal array becomes a valuable technique allowing rapid, accurate determination of spontaneous loss of the array, thereby allowing high-resolution mosaic analysis. The lin-3 gene is required in the anchor cell to induce the epidermal vulval precursors cells to undergo vulval development. [Abstract/Link to Full Text]

Cheong HS, Yoon DH, Kim LH, Park BL, Choi YH, Chung ER, Cho YM, Park EW, Cheong IC, Oh SJ, Yi SG, Park T, Shin HD
Growth hormone-releasing hormone (GHRH) polymorphisms associated with carcass traits of meat in Korean cattle.
BMC Genet. 2006;735.
BACKGROUND: Cold carcass weight (CW) and longissimus muscle area (EMA) are the major quantitative traits in beef cattle. In this study, we found several polymorphisms of growth hormone-releasing hormone (GHRH) gene and examined the association of polymorphisms with carcass traits (CW and EMA) in Korean native cattle (Hanwoo). RESULTS: By direct DNA sequencing in 24 unrelated Korean cattle, we identified 12 single nucleotide polymorphisms within the 9 kb full gene region, including the 1.5 kb promoter region. Among them, six polymorphic sites were selected for genotyping in our beef cattle (n = 428) and five marker haplotypes (frequency > 0.1) were identified. Statistical analysis revealed that -4241A>T showed significant associations with CW and EMA. CONCLUSION: Our findings suggest that polymorphisms in GHRH might be one of the important genetic factors that influence carcass yield in beef cattle. Sequence variation/haplotype information identified in this study would provide valuable information for the production of a commercial line of beef cattle. [Abstract/Link to Full Text]

Sengupta S, Xiong L, Fathalli F, Benkelfat C, Tabbane K, Danics Z, Labelle A, Lal S, Krebs MO, Rouleau G, Joober R
Association study of the trinucleotide repeat polymorphism within SMARCA2 and schizophrenia.
BMC Genet. 2006;734.
BACKGROUND: Brahma (BRM) is a key component of the multisubunit SWI/SNF complex, a complex which uses the energy of ATP hydrolysis to remodel chromatin. BRM contains an N-terminal polyglutamine domain, encoded by a polymorphic trinucleotide (CAA/CAG) repeat, the only known polymorphism in the coding region of the gene (SMARCA2). We have examined the association of this polymorphism with schizophrenia in a family-based and case/control study. SMARCA2 was chosen as a candidate gene because of its specific role in developmental pathways, its high expression level in the brain and some evidence of its association with schizophrenia spectrum disorder from genome-wide linkage analysis. RESULTS: Family-based analysis with 281 complete and incomplete triads showed that there is no significant preferential transmission of any of the alleles to the affected offspring. Also, in the case/control analysis, similar allele and genotype distributions were observed between affected cases (n = 289) and unaffected controls (n = 273) in each of three Caucasian populations studied: French Canadian, Tunisian and other Caucasians of European origin. CONCLUSION: Results from our family-based and case-control association study suggest that there is no association between the trinucleotide repeat polymorphism within SMARCA2 and schizophrenia. [Abstract/Link to Full Text]

Slack C, Somers WG, Sousa-Nunes R, Chia W, Overton PM
A mosaic genetic screen for novel mutations affecting Drosophila neuroblast divisions.
BMC Genet. 2006;733.
BACKGROUND: The asymmetric segregation of determinants during cell division is a fundamental mechanism for generating cell fate diversity during development. In Drosophila, neural precursors (neuroblasts) divide in a stem cell-like manner generating a larger apical neuroblast and a smaller basal ganglion mother cell. The cell fate determinant Prospero and its adapter protein Miranda are asymmetrically localized to the basal cortex of the dividing neuroblast and segregated into the GMC upon cytokinesis. Previous screens to identify components of the asymmetric division machinery have concentrated on embryonic phenotypes. However, such screens are reaching saturation and are limited in that the maternal contribution of many genes can mask the effects of zygotic loss of function, and other approaches will be necessary to identify further genes involved in neuroblast asymmetric division. RESULTS: We have performed a genetic screen in the third instar larval brain using the basal localization of Miranda as a marker for neuroblast asymmetry. In addition to the examination of pupal lethal mutations, we have employed the MARCM (Mosaic Analysis with a Repressible Cell Marker) system to generate postembryonic clones of mutations with an early lethal phase. We have screened a total of 2,300 mutagenized chromosomes and isolated alleles affecting cell fate, the localization of basal determinants or the orientation of the mitotic spindle. We have also identified a number of complementation groups exhibiting defects in cell cycle progression and cytokinesis, including both novel genes and new alleles of known components of these processes. CONCLUSION: We have identified four mutations which affect the process of neuroblast asymmetric division. One of these, mapping to the imaginal discs arrested locus, suggests a novel role for the anaphase promoting complex/cyclosome (APC/C) in the targeting of determinants to the basal cortex. 
The identification and analysis of the remaining mutations will further advance our understanding of the process of asymmetric cell division. We have also isolated a number of mutations affecting cell division which will complement the functional genomics approaches to this process being employed by other laboratories. Taken together, these results demonstrate the value of mosaic screens in the identification of genes involved in neuroblast division. [Abstract/Link to Full Text]

Campanella JJ, Smalley JV
A minimally invasive method of piscine tissue collection and an analysis of long-term field-storage conditions for samples.
BMC Genet. 2006;732.
BACKGROUND: The acquisition of high-quality DNA for use in phylogenetic and molecular population genetic studies is a primary concern for evolutionary and genetic researchers. Many non-destructive DNA sampling methods have been developed and are used with a variety of taxa in applications ranging from genetic stock assessment to molecular forensics. RESULTS: The authors have developed a field sampling method for obtaining high-quality DNA from sunfish (Lepomis) and other freshwater fish that employs a variation on the buccal swab method and results in the collection of DNA suitable for PCR amplification and polymorphism analysis. Additionally, since the circumstances of storage are always a concern for field biologists, the authors have tested the potential storage conditions of swabbed samples and whether those conditions affect DNA extraction and PCR amplification. It was found that samples stored at room temperature in the dark for over 200 days could still yield DNA suitable for PCR amplification and polymorphism detection. CONCLUSION: These findings suggest that valuable molecular genetic data may be obtained from tissues that have not been treated or stored under optimal field conditions. Furthermore, it is clear that the lack of adequately low temperatures during transport and long term storage should not be a barrier to anyone wishing to engage in field-based molecular genetic research. [Abstract/Link to Full Text]

Rachagani S, Gupta ID, Gupta N, Gupta SC
Genotyping of beta-lactoglobulin gene by PCR-RFLP in Sahiwal and Tharparkar cattle breeds.
BMC Genet. 2006;731.
BACKGROUND: Improvement of efficiency and economic returns is an important goal in dairy farming, as in any agricultural enterprise. The primary goal of the dairy industry has been to identify an efficient and economical way of increasing milk production and its constituents without increasing the size of the dairy herd. Selection of animals with desirable genotypes and mating them to produce the next generation has been the basis of livestock improvement and this would continue to remain the same in the coming years. The use of polymorphic genes as detectable molecular markers is a promising alternative to the current methods of trait selection once these genes are proven to be associated with traits of interest in animals. The point mutations in exon IV of bovine beta-Lactoglobulin gene determine two allelic variants A and B. These variants were distinguished by Polymerase Chain Reaction and Restriction Fragment Length Polymorphism (PCR-RFLP) analysis in two indigenous Bos indicus breeds viz. Sahiwal and Tharparkar cattle. DNA samples (228 in Sahiwal and 86 in Tharparkar) were analyzed for allelic variants of beta-Lactoglobulin gene. Polymorphism was detected by digestion of PCR amplified products with Hae III enzyme, and separation on 12% non-denaturing gels and resolved by silver staining. RESULTS: The allele B of beta-Lactoglobulin occurred at a higher frequency than the allele A in both Sahiwal and Tharparkar breeds. The genotypic frequencies of AA, AB, and BB in Sahiwal and Tharparkar breeds were 0.031, 0.276, 0.693 and 0.023, 0.733, 0.244 respectively. Frequencies of A and B alleles were 0.17 and 0.83, and 0.39 and 0.61 in Sahiwal and Tharparkar breeds respectively. The Chi-square test results (at one degree of freedom at one per cent level) revealed that the Tharparkar population was not in Hardy-Weinberg equilibrium as there was a continuous migration of animals in the herd studied, whereas the results are not significant for the Sahiwal population.
CONCLUSION: The AA genotype was the least frequent and the BB genotype the most frequent in Sahiwal cattle, while the AB genotype was the most frequent in Tharparkar cattle. The frequency of the A allele was found to be lower than that of the B allele in both the breeds studied. These results further confirm that Bos indicus cattle are more predominantly of the beta-Lactoglobulin B type than Bos taurus breeds. [Abstract/Link to Full Text]
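
The allele frequencies and the Hardy-Weinberg comparison reported above can be reproduced from the published genotype frequencies and sample sizes. This is a minimal sketch of the standard one-degree-of-freedom chi-square goodness-of-fit test, not the authors' own script; `hwe_chi_square` is a hypothetical helper, and the genotype counts are reconstructed from rounded frequencies, so the statistics are approximate.

```python
def hwe_chi_square(n, f_aa, f_ab, f_bb):
    """Return (freq_A, freq_B, chi_square) for a biallelic locus."""
    p = f_aa + f_ab / 2          # frequency of allele A
    q = 1 - p                    # frequency of allele B
    observed = [n * f_aa, n * f_ab, n * f_bb]
    # Expected counts under Hardy-Weinberg proportions p^2, 2pq, q^2.
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return p, q, chi2


CRITICAL_1DF_1PCT = 6.635  # chi-square critical value, 1 df, 1% level

sahiwal = hwe_chi_square(228, 0.031, 0.276, 0.693)
tharparkar = hwe_chi_square(86, 0.023, 0.733, 0.244)

for name, (p, q, chi2) in (("Sahiwal", sahiwal), ("Tharparkar", tharparkar)):
    verdict = "deviates from" if chi2 > CRITICAL_1DF_1PCT else "is consistent with"
    print(f"{name}: A={p:.2f} B={q:.2f} chi2={chi2:.2f} ({verdict} HWE)")
```

The computed allele frequencies round to the values given in the abstract (0.17/0.83 and 0.39/0.61), and only the Tharparkar statistic exceeds the 1% critical value, driven by the excess of AB heterozygotes.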

Levy D, DePalma SR, Benjamin EJ, O'Donnell CJ, Parise H, Hirschhorn JN, Vasan RS, Izumo S, Larson MG
Phenotype-genotype association grid: a convenient method for summarizing multiple association analyses.
BMC Genet. 2006;730.
BACKGROUND: High-throughput genotyping generates vast amounts of data for analysis; results can be difficult to summarize succinctly. A single project may involve genotyping many genes with multiple variants per gene and analyzing each variant in relation to numerous phenotypes, using several genetic models and population subgroups. Hundreds of statistical tests may be performed for a single SNP, thereby complicating interpretation of results and inhibiting identification of patterns of association. RESULTS: To facilitate visual display and summary of large numbers of association tests of genetic loci with multiple phenotypes, we developed a Phenotype-Genotype Association (PGA) grid display. A database-backed web server was used to create PGA grids from phenotypic and genotypic data (sample sizes, means and standard errors, P-value for association). HTML pages were generated using Tcl scripts on an AOLserver platform, using an Oracle database, and the ArsDigita Community System web toolkit. The grids are interactive and permit display of summary data for individual cells by a mouse click (i.e. least squares means for a given SNP and phenotype, specified genetic model and study sample). PGA grids can be used to visually summarize results of individual SNP associations, gene-environment associations, or haplotype associations. CONCLUSION: The PGA grid, which permits interactive exploration of large numbers of association test results, can serve as an easily adapted common and useful display format for large-scale genetic studies. Doing so would reduce the problem of publication bias, and would simplify the task of summarizing large-scale association studies. [Abstract/Link to Full Text]

Nakamoto K, Wang S, Jenison RD, Guo GL, Klaassen CD, Wan YJ, Zhong XB
Linkage disequilibrium blocks, haplotype structure, and htSNPs of human CYP7A1 gene.
BMC Genet. 2006;729.
BACKGROUND: Cholesterol 7-alpha-hydroxylase (CYP7A1) is the rate limiting enzyme for converting cholesterol into bile acids. Genetic variations in the CYP7A1 gene have been associated with metabolic disorders of cholesterol and bile acids, including hypercholesterolemia, hypertriglyceridemia, arteriosclerosis, and gallstone disease. Current genetic studies are focused mainly on analysis of a single nucleotide polymorphism (SNP) at A-278C in the promoter region of the CYP7A1 gene. Here we report a genetic approach for an extensive analysis of linkage disequilibrium (LD) blocks and haplotype structures of the entire CYP7A1 gene and its surrounding sequences in Africans, Caucasians, Asians, Mexican-Americans, and African-Americans. RESULT: The LD patterns and haplotype blocks of the CYP7A1 gene were defined in Africans, Caucasians, and Asians using genotyping data downloaded from the HapMap database to select a set of haplotype-tagging SNPs (htSNP). A low cost, microarray-based platform on thin-film biosensor chips was then developed for high-throughput genotyping to study transferability of the HapMap htSNPs to Mexican-American and African-American populations. Comparative LD patterns and haplotype block structures were defined across all test populations. CONCLUSION: A constant genetic structure in the CYP7A1 gene and its surrounding sequences was found that may lead to a better design for association studies of genetic variations in the CYP7A1 gene with cholesterol and bile acid metabolism. [Abstract/Link to Full Text]

Kashyap VK, Guha S, Sitalaximi T, Bindu GH, Hasnain SE, Trivedi R
Genetic structure of Indian populations based on fifteen autosomal microsatellite loci.
BMC Genet. 2006;728.
BACKGROUND: Indian populations, endowed with unparalleled genetic complexity, have received a great deal of attention from scientists the world over. However, the fundamental question over their ancestry, whether they are all genetically similar or do exhibit differences attributable to ethnicity, language, geography or socio-cultural affiliation, is still unresolved. In order to decipher their underlying genetic structure, we undertook a study on 3522 individuals belonging to 54 endogamous Indian populations representing all major ethnic, linguistic and geographic groups and assessed the genetic variation using autosomal microsatellite markers. RESULTS: The distribution of the most frequent allele was uniform across populations, revealing an underlying genetic similarity. Patterns of allele distribution suggestive of ethnic or geographic propinquity were discernible only in a few of the populations and were not applicable to the entire dataset, while a number of the populations exhibited distinct identities evident from the occurrence of unique alleles in them. Genetic substructuring was detected among populations originating from northeastern and southern India, reflective of their migrational histories and genetic isolation respectively. CONCLUSION: Our analyses based on autosomal microsatellite markers detected no evidence of general clustering of population groups based on ethnic, linguistic, geographic or socio-cultural affiliations. The existence of substructuring in populations from northeastern and southern India has notable implications for population genetic studies and forensic databases where broad grouping of populations based on such affiliations is frequently employed. [Abstract/Link to Full Text]

Barral S, Haynes C, Stone M, Gordon D
LRTae: improving statistical power for genetic association with case/control data when phenotype and/or genotype misclassification errors are present.
BMC Genet. 2006;724.
BACKGROUND: In the field of statistical genetics, phenotype and genotype misclassification errors can substantially reduce power to detect association with genetic case/control studies. Misclassification also can bias population frequency parameters such as genotype, haplotype, or multi-locus genotype frequencies. These problems are of particular concern in case/control designs because, short of repeated sampling, there is no way to detect misclassification errors. We developed a double-sampling procedure for case/control genetic association using a likelihood ratio test framework. Different approaches have been proposed to deal with misclassification errors. We have chosen the likelihood framework because of the ease with which misclassification probabilities may be incorporated into the statistical framework and hypothesis testing. The statistic is called the Likelihood Ratio Test allowing for errors (LRTae) and is freely available via software download. RESULTS: We applied our procedure to 10,000 replicates of simulated case/control data in which we introduced phenotype misclassification errors. The phenotype considered is Ankylosing Spondylitis (AS). The LRTae method power was always greater than LRTstd power for the significance levels considered (5%, 1%, 0.1%, 0.01%). Power gains for the LRTae method over the LRTstd method increased as the significance level became more stringent. Multi-locus genotype frequency estimates using the LRTae method were more accurate than estimates using the LRTstd method. CONCLUSION: The LRTae method can be applied to single-locus genotypes, multi-locus genotypes, or multi-locus haplotypes in a case/control framework and can be more powerful to detect association in case/control studies when genotype and/or phenotype errors are present. Furthermore, the LRTae method provides asymptotically unbiased estimates of case and control genotype frequencies, as well as estimates of phenotype and/or genotype misclassification rates.
[Abstract/Link to Full Text]

Taylor J, Provart NJ
CapsID: a web-based tool for developing parsimonious sets of CAPS molecular markers for genotyping.
BMC Genet. 2006;727.
BACKGROUND: Genotyping may be carried out by a number of different methods including direct sequencing and polymorphism analysis. For a number of reasons, PCR-based polymorphism analysis may be desirable, owing to the fact that only small amounts of genetic material are required, and that the costs are low. One popular and cheap method for detecting polymorphisms is by using cleaved amplified polymorphic sequence, or CAPS, molecular markers. These are also known as PCR-RFLP markers. RESULTS: We have developed a program, called CapsID, that identifies snip-SNPs (single nucleotide polymorphisms that alter restriction endonuclease cut sites) within a set or sets of reference sequences, designs PCR primers around these, and then suggests the most parsimonious combination of markers for genotyping any individual who is not a member of the reference set. The output page includes biologist-friendly features, such as images of virtual gels to assist in genotyping efforts. CapsID is freely available at CONCLUSION: CapsID is a tool that can rapidly provide minimal sets of CAPS markers for molecular identification purposes for any biologist working in genetics, community genetics, plant and animal breeding, forensics and other fields. [Abstract/Link to Full Text]
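
The notion of a snip-SNP, a SNP that creates or destroys a restriction endonuclease cut site, can be shown in a few lines. This is a minimal sketch of the concept only and is not CapsID's implementation; the helper names and the two example alleles are hypothetical. HaeIII (recognition site GGCC) is used as the example enzyme.

```python
def has_site(sequence, site):
    """True if the recognition site occurs in the sequence.

    The site used here is palindromic, so scanning one strand is sufficient.
    """
    return site in sequence.upper()


def is_snip_snp(allele_a, allele_b, site):
    """True if the two alleles differ in whether the enzyme can cut."""
    return has_site(allele_a, site) != has_site(allele_b, site)


HAEIII_SITE = "GGCC"

# Hypothetical alleles differing at a single position (G vs A).
allele_ref = "ATTAGGCCTTAC"
allele_alt = "ATTAGACCTTAC"

print(is_snip_snp(allele_ref, allele_alt, HAEIII_SITE))
```

In a CAPS assay the amplified fragment from the first allele would be cut by the enzyme and the second would not, so the two genotypes give different band patterns on a gel.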

Minvielle F, Kayang BB, Inoue-Murayama M, Miwa M, Vignal A, Gourichon D, Neau A, Monvoisin JL, Ito S
Search for QTL affecting the shape of the egg laying curve of the Japanese quail.
BMC Genet. 2006;726.
BACKGROUND: Egg production is of critical importance in birds not only for their reproduction but also for human consumption as the egg is a highly nutritive and balanced food. Consequently, laying in poultry has been improved through selection to increase the total number of eggs laid per hen. This number is the cumulative result of the oviposition, a cyclic and repeated process which leads to a pattern over time (the egg laying curve) which can be modelled and described individually. Unlike the total egg number which compounds all variations, the shape of the curve gives information on the different phases of egg laying, and its genetic analysis using molecular markers might contribute to a better understanding of the underlying mechanisms. The purpose of this study was to perform the first QTL search for traits involved in shaping the egg laying curve, in an F2 experiment with 359 female Japanese quail. RESULTS: Eight QTL were found on five autosomes, and six of them could be directly associated with egg production traits, although none was significant at the genome-wide level. One of them (on CJA13) had an effect on the first part of the laying curve, before the production peak. Another one (on CJA06) was related to the central part of the curve when laying is maintained at a high level, and the four others (on CJA05, CJA10 and CJA14) acted on the last part of the curve where persistency is determinant. The QTL for the central part of the curve was mapped at the same position on CJA06 as a genome-wide significant QTL for total egg number detected previously in the same F2. CONCLUSION: Despite its limited scope (number of microsatellites, size of the phenotypic data set), this work has shown that it was possible to use the individual egg laying data collected daily to find new QTL which affect the shape of the egg laying curve.
Beyond the present results, this new approach could also be applied to longitudinal traits in other species, like growth and lactation in ruminants, for which good marker coverage of the genome and theoretical models with a biological significance are available. [Abstract/Link to Full Text]

Grindflek E, Moe M, Taubert H, Simianer H, Lien S, Moen T
Genome-wide linkage analysis of inguinal hernia in pigs using affected sib pairs.
BMC Genet. 2006;725.
BACKGROUND: Inguinal and scrotal hernias are of great concern to pig producers, and lead to poor animal welfare and severe economic loss. Selection against these conditions is highly preferable, but at this time no gene, Quantitative Trait Loci (QTL), or mode of inheritance has been identified in pigs or in any other species. Therefore, a complete genome scan was performed in order to identify genomic regions affecting inguinal and scrotal hernias in pigs. Records from seedstock breeding farms were collected. No clinical examinations were executed on the pigs and there was therefore no distinction between inguinal and scrotal hernias. The genome scan utilised affected sib pairs (ASP), and the data was analysed using both an ASP test based on Non-parametric Linkage (NPL) analysis, and a Transmission Disequilibrium Test (TDT). RESULTS: Significant QTLs (p < 0.01) were detected on 8 out of 19 porcine chromosomes. The most promising QTLs, however, were detected in SSC1, SSC2, SSC5, SSC6, SSC15, SSC17 and SSCX; all of these regions showed either statistical significance with both statistical methods, or convincing significance with one of the methods. Haplotypes from these suggestive QTL regions were constructed and analysed with TDT. Of these, six different haplotypes were found to be differently transmitted (p < 0.01) to healthy and affected pigs. The most interesting result was one haplotype on SSC5 that was found to be transmitted to hernia pigs with four times higher frequency than to healthy pigs (p < 0.00005). CONCLUSION: For the first time in any species, a genome scan has revealed suggestive QTLs for inguinal and scrotal hernias. While this study permitted the detection of chromosomal regions only, it is interesting to note that several promising candidate genes, including INSL3, MIS, and CGRP, are located within the highly significant QTL regions. 
Further studies are required in order to narrow down the suggestive QTL regions, investigate the candidate genes, and to confirm the suggestive QTLs in other populations. The haplotype associated with inguinal and scrotal hernias may help in achieving selection against the disorder. [Abstract/Link to Full Text]
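
The Transmission Disequilibrium Test mentioned in this abstract can be illustrated with its basic biallelic form. The sketch below is only a minimal illustration and is not the authors' analysis pipeline: the function name `tdt_statistic` and the transmission counts are hypothetical, and the p-value comes from the standard chi-square distribution with one degree of freedom.

```python
import math


def tdt_statistic(transmitted: int, not_transmitted: int) -> tuple[float, float]:
    """Basic (McNemar-type) TDT for one allele or haplotype.

    transmitted / not_transmitted: how often heterozygous parents did or
    did not pass the candidate allele to an affected offspring.
    Returns the chi-square statistic (1 df) and its p-value.
    """
    total = transmitted + not_transmitted
    if total == 0:
        raise ValueError("no informative transmissions")
    chi2 = (transmitted - not_transmitted) ** 2 / total
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: 40 transmissions versus 10 non-transmissions.
chi2, p = tdt_statistic(40, 10)
print(f"chi2 = {chi2:.1f}, p = {p:.2e}")
```

Under the null hypothesis of no linkage or association, the two counts are expected to be equal, so a strongly skewed ratio such as the one above yields a large statistic and a small p-value.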

Heidema AG, Boer JM, Nagelkerke N, Mariman EC, van der A DL, Feskens EJ
The challenge for genetic epidemiologists: how to analyze large numbers of SNPs in relation to complex diseases.
BMC Genet. 2006;7:23.
Genetic epidemiologists have taken the challenge to identify genetic polymorphisms involved in the development of diseases. Many have collected data on large numbers of genetic markers but are not familiar with available methods to assess their association with complex diseases. Statistical methods have been developed for analyzing the relation between large numbers of genetic and environmental predictors to disease or disease-related variables in genetic association studies. In this commentary we discuss logistic regression analysis, neural networks, including the parameter decreasing method (PDM) and genetic programming optimized neural networks (GPNN) and several non-parametric methods, which include the set association approach, combinatorial partitioning method (CPM), restricted partitioning method (RPM), multifactor dimensionality reduction (MDR) method and the random forests approach. The relative strengths and weaknesses of these methods are highlighted. Logistic regression and neural networks can handle only a limited number of predictor variables, depending on the number of observations in the dataset. Therefore, they are less useful than the non-parametric methods to approach association studies with large numbers of predictor variables. GPNN on the other hand may be a useful approach to select and model important predictors, but its performance to select the important effects in the presence of large numbers of predictors needs to be examined. Both the set association approach and random forests approach are able to handle a large number of predictors and are useful in reducing these predictors to a subset of predictors with an important contribution to disease. The combinatorial methods give more insight in combination patterns for sets of genetic and/or environmental predictor variables that may be related to the outcome variable. 
As the non-parametric methods have different strengths and weaknesses we conclude that to approach genetic association studies using the case-control design, the application of a combination of several methods, including the set association approach, MDR and the random forests approach, will likely be a useful strategy to find the important genes and interaction patterns involved in complex diseases. [Abstract/Link to Full Text]

Taniguchi H, Lowe CE, Cooper JD, Smyth DJ, Bailey R, Nutland S, Healy BC, Lam AC, Burren O, Walker NM, Smink LJ, Wicker LS, Todd JA
Discovery, linkage disequilibrium and association analyses of polymorphisms of the immune complement inhibitor, decay-accelerating factor gene (DAF/CD55) in type 1 diabetes.
BMC Genet. 2006;7:22.
BACKGROUND: Type 1 diabetes (T1D) is a common autoimmune disease resulting from T-cell mediated destruction of pancreatic beta cells. Decay accelerating factor (DAF, CD55), a glycosylphosphatidylinositol-anchored membrane protein, is a candidate for autoimmune disease susceptibility based on its role in restricting complement activation and evidence that DAF expression modulates the phenotype of mice models for autoimmune disease. In this study, we adopt a linkage disequilibrium (LD) mapping approach to test for an association between the DAF gene and T1D. RESULTS: Initially, we used HapMap II genotype data to examine LD across the DAF region. Additional resequencing was required, identifying 16 novel polymorphisms. Combining both datasets, a LD mapping approach was adopted to test for association with T1D. Seven tag SNPs were selected and genotyped in case-control (3,523 cases and 3,817 controls) and family (725 families) collections. CONCLUSION: We obtained no evidence of association between T1D and the DAF region in two independent collections. In addition, we assessed the impact of using only HapMap II genotypes for the selection of tag SNPs and, based on this study, found that HapMap II genotypes may require additional SNP discovery for comprehensive LD mapping of some genes in common disease. [Abstract/Link to Full Text]

Mandal DM, Sorant AJ, Atwood LD, Wilson AF, Bailey-Wilson JE
Allele frequency misspecification: effect on power and Type I error of model-dependent linkage analysis of quantitative traits under random ascertainment.
BMC Genet. 2006;7:21.
BACKGROUND: Studies of model-based linkage analysis show that trait or marker model misspecification leads to decreasing power or increasing Type I error rate. An increase in Type I error rate is seen when marker related parameters (e.g., allele frequencies) are misspecified and ascertainment is through the trait, but lod-score methods are expected to be robust when ascertainment is random (as is often the case in linkage studies of quantitative traits). In previous studies, the power of lod-score linkage analysis using the "correct" generating model for the trait was found to increase when the marker allele frequencies were misspecified and parental data were missing. An investigation of Type I error rates, conducted in the absence of parental genotype data and with misspecification of marker allele frequencies, showed that an inflation in Type I error rate was the cause of at least part of this apparent increased power. To investigate whether the observed inflation in Type I error rate in model-based LOD score linkage was due to sampling variation, the trait model was estimated from each sample using REGCHUNT, an automated segregation analysis program used to fit models by maximum likelihood using many different sets of initial parameter estimates. RESULTS: The Type I error rates observed using the trait models generated by REGCHUNT were usually closer to the nominal levels than those obtained when assuming the generating trait model. CONCLUSION: This suggests that the observed inflation of Type I error upon misspecification of marker allele frequencies is at least partially due to sampling variation. Thus, with missing parental genotype data, lod-score linkage is not as robust to misspecification of marker allele frequencies as has been commonly thought. [Abstract/Link to Full Text]

Cousin E, Deleuze JF, Genin E
Selection of SNP subsets for association studies in candidate genes: comparison of the power of different strategies to detect single disease susceptibility locus effects.
BMC Genet. 2006;7:20.
BACKGROUND: The recent advances in genotyping and molecular techniques have greatly increased the knowledge of the human genome structure. Millions of polymorphisms are reported and freely available in public databases. As a result, there is now a need to identify, among all these data, the relevant markers for genetic association studies. Recently, several methods have been published to select subsets of markers, usually Single Nucleotide Polymorphisms (SNPs), that best represent genetic polymorphisms in the studied candidate gene or region. RESULTS: In this paper, we compared four of these selection methods, two based on haplotype information and two based on pairwise linkage disequilibrium (LD). The methods were applied to the genotype data on twenty genes with different patterns of LD and different numbers of SNPs. A measure of the efficiency of the different methods to select SNPs was obtained by comparing, for each gene and under several single disease susceptibility models, the power to detect an association that will be achieved with the selected SNP subsets. CONCLUSION: None of the four selection methods stands out systematically from the others. Methods based on pairwise LD information turn out to be the most interesting methods in the context of association studies in candidate genes. In a context where the number of SNPs to be tested in a given region needs to be more limited, as in large-scale studies or genome-wide scans, one of the two methods based on haplotype information would be more suitable. [Abstract/Link to Full Text]
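
As a rough illustration of what selection based on pairwise LD can look like, the sketch below computes r² between biallelic SNPs from phased haplotypes and then picks tag SNPs greedily. It is not one of the four methods compared in the paper, which the abstract does not name: the function names, the 0.8 threshold, the toy haplotypes and the greedy rule are all assumptions made for the example.

```python
from itertools import combinations


def r_squared(snp_a, snp_b):
    """r^2 between two biallelic SNPs coded 0/1 on the same phased haplotypes."""
    n = len(snp_a)
    p_a = sum(snp_a) / n
    p_b = sum(snp_b) / n
    p_ab = sum(1 for a, b in zip(snp_a, snp_b) if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    d = p_ab - p_a * p_b  # classical disequilibrium coefficient D
    return d * d / denom


def greedy_tags(haplotypes, threshold=0.8):
    """Greedy tagging: repeatedly pick the SNP that captures the most
    still-untagged SNPs at r^2 >= threshold (hypothetical selection rule)."""
    names = list(haplotypes)
    ld = {
        frozenset((a, b)): r_squared(haplotypes[a], haplotypes[b])
        for a, b in combinations(names, 2)
    }
    untagged = set(names)
    tags = []
    while untagged:
        def covered(snp):
            return {t for t in untagged
                    if t == snp or ld[frozenset((snp, t))] >= threshold}
        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(untagged), key=lambda s: len(covered(s)))
        tags.append(best)
        untagged -= covered(best)
    return tags


# Toy data: eight haplotypes, three SNPs (values are invented).
toy = {
    "snp1": [0, 0, 1, 1, 0, 1, 0, 1],
    "snp2": [0, 0, 1, 1, 0, 1, 0, 1],  # identical to snp1, so r^2 = 1
    "snp3": [0, 1, 0, 1, 0, 1, 0, 1],
}
print(greedy_tags(toy))
```

In this toy example snp1 and snp2 are in complete LD, so one of them is enough to represent both, while snp3 must be kept as its own tag.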

Guo FB, Yu XJ
Separate base usages of genes located on the leading and lagging strands in Chlamydia muridarum revealed by the Z curve method.
BMC Genomics. 2007;8:366.
BACKGROUND: The nucleotide compositional asymmetry between the leading and lagging strands in bacterial genomes has been the subject of intensive study in the past few years. It is interesting to mention that almost all bacterial genomes exhibit the same kind of base asymmetry. This work aims to investigate the strand biases in the Chlamydia muridarum genome and show the potential of the Z curve method for quantitatively differentiating genes on the leading and lagging strands. RESULTS: The occurrence frequencies of bases of protein-coding genes in the C. muridarum genome were analyzed by the Z curve method. It was found that genes located on the two strands of replication have distinct base usages in the C. muridarum genome. According to their positions in the 9-D space spanned by the variables u1 - u9 of the Z curve method, the K-means clustering algorithm can assign about 94% of genes to the correct strands, which is a few percent higher than the proportion correctly classified by K-means based on the RSCU. The base usage and codon usage analyses show that genes on the leading strand have more G than C and more T than A, particularly at the third codon position. For genes on the lagging strand the biases are reversed. The y component of the Z curves for the complete chromosome sequences shows that the excesses of G over C and T over A are more remarkable in the C. muridarum genome than in other bacterial genomes without separate base and/or codon usages. Furthermore, for the genomes of Borrelia burgdorferi, Treponema pallidum, Chlamydia muridarum and Chlamydia trachomatis, in which distinct base and/or codon usages have been observed, closer phylogenetic distance is found compared with other bacterial genomes. CONCLUSION: The nature of the strand biases of base composition in C. muridarum is similar to that in most other bacterial genomes. However, the base composition asymmetry between the leading and lagging strands in C. muridarum is more significant than that in other bacteria. 
It is supposed that the remarkable strand biases of G/C and T/A are responsible for the appearance of separate base or codon usages in C. muridarum. On the other hand, the closer phylogenetic distance among the four bacterial genomes with separate base and/or codon usages is necessary rather than coincidental. It is also shown that the Z curve method may be more sensitive than RSCU when used to quantitatively analyze DNA sequences. [Abstract/Link to Full Text]
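
The nine variables u1 - u9 referred to in this abstract can be sketched as follows, assuming the usual Z curve definitions applied separately to each codon position: a purine/pyrimidine component x, an amino/keto component y, and a weak/strong hydrogen-bond component z. The function name, the ordering of the nine values and the toy sequence are assumptions made for the example; this is not the authors' code.

```python
def z_curve_codon_features(cds: str) -> list[float]:
    """Return nine Z curve variables for a coding sequence.

    For each codon position i (1, 2, 3) with base frequencies a, c, g, t:
        x_i = (a + g) - (c + t)   purine versus pyrimidine
        y_i = (a + c) - (g + t)   amino versus keto
        z_i = (a + t) - (g + c)   weak versus strong hydrogen bonding
    The output order (x1, y1, z1, x2, ...) is an assumed convention.
    """
    cds = cds.upper()
    features: list[float] = []
    for pos in range(3):
        bases = cds[pos::3]  # every base at this codon position
        n = len(bases)
        if n == 0:
            raise ValueError("sequence is shorter than one codon")
        a, c, g, t = (bases.count(b) / n for b in "ACGT")
        features += [(a + g) - (c + t), (a + c) - (g + t), (a + t) - (g + c)]
    return features


# Toy sequence of two ATG codons (not real data).
print(z_curve_codon_features("ATGATG"))
```

A negative y value at a codon position means G and T together outnumber A and C there, which is the direction of bias the abstract reports for leading-strand genes. Feature vectors of this kind, one per gene, are what a clustering algorithm such as K-means would then partition into two groups.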

Bruce SJ, Gardiner BB, Burke LJ, Gongora MM, Grimmond SM, Perkins AC
Dynamic transcription programs during ES cell differentiation towards mesoderm in serum versus serum-free (BMP4) culture.
BMC Genomics. 2007 Oct 10;8(1):365.
ABSTRACT: BACKGROUND: Expression profiling of embryonic stem (ES) cell differentiation in the presence of serum has been performed previously. It remains unclear if transcriptional activation is dependent on complex growth factor mixtures in serum or whether this process is intrinsic to ES cells once the stem cell program has been inactivated. The aims of this study were to determine the transcriptional programs associated with the stem cell state and to characterize mesoderm differentiation between serum and serum-free culture. RESULTS: ES cells were differentiated as embryoid bodies in 10% FBS or serum-free media containing BMP4 (2 ng/ml), and expression profiled using 47K Illumina® Sentrix arrays. Statistical methods were employed to define gene sets characteristic of stem cell, epiblast and primitive streak programs. Although the initial differentiation profile was similar between the two culture conditions, cardiac gene expression was inhibited in serum whereas blood gene expression was enhanced. Also, expression of many members of the Kruppel-like factor (KLF) family of transcription factors changed dramatically during the first few days of differentiation. KLF2 and KLF4 co-localized with OCT4 in a sub-nuclear compartment of ES cells, dynamic changes in KLF-DNA binding activities occurred upon differentiation, and strong bio-informatic evidence for direct regulation of many stem cell genes by KLFs was found. CONCLUSIONS: Down regulation of stem cell genes and activation of epiblast/primitive streak genes is similar in serum and defined media, but subsequent mesoderm differentiation is strongly influenced by the composition of the media. In addition, KLF family members are likely to be important regulators of many stem cell genes. [Abstract/Link to Full Text]

Srivastava S, Li Z, Yang X, Yedwabnick M, Shaw S, Chan C
Identification of genes that regulate multiple cellular processes/responses in the context of lipotoxicity to hepatoma cells.
BMC Genomics. 2007;8:364.
BACKGROUND: In order to devise efficient treatments for complex, multi-factorial diseases, it is important to identify the genes which regulate multiple cellular processes. Exposure to elevated levels of free fatty acids (FFAs) and tumor necrosis factor alpha (TNF-alpha) alters multiple cellular processes, causing lipotoxicity. Intracellular lipid accumulation has been shown to reduce the lipotoxicity of saturated FFA. We hypothesized that the genes which simultaneously regulate lipid accumulation as well as cytotoxicity may provide better targets to counter lipotoxicity of saturated FFA. RESULTS: As a model system to test this hypothesis, human hepatoblastoma cells (HepG2) were exposed to elevated physiological levels of FFAs and TNF-alpha. Triglyceride (TG) accumulation, toxicity and the genomic responses to the treatments were measured. Here, we present a framework to identify such genes in the context of lipotoxicity. The aim of the current study is to identify the genes that could be altered to treat or ameliorate the cellular responses affected by a complex disease rather than to identify the causal genes. Genes that regulate the TG accumulation, cytotoxicity or both were identified by a modified genetic algorithm partial least squares (GA/PLS) analysis. The analyses identified NADH dehydrogenase and mitogen activated protein kinases (MAPKs) as important regulators of both cytotoxicity and lipid accumulation in response to FFA and TNF-alpha exposure. In agreement with the predictions, inhibiting NADH dehydrogenase and c-Jun N-terminal kinase (JNK) reduced cytotoxicity significantly and increased intracellular TG accumulation. Inhibiting another MAPK pathway, the extracellular signal regulated kinase (ERK), on the other hand, improved the cytotoxicity without changing TG accumulation. 
Much greater reduction in the toxicity was observed upon inhibiting the NADH dehydrogenase and MAPK (which were identified by the dual-response analysis), than for the stearoyl-CoA desaturase (SCD) activation (which was identified for the TG-alone analysis). CONCLUSION: These results demonstrate the applicability of GA/PLS in identifying the genes that regulate multiple cellular responses of interest and that genes regulating multiple cellular responses may be better candidates for countering complex diseases. [Abstract/Link to Full Text]

Davis C, Barvish Z, Gitelman I
A method for the construction of equalized directional cDNA libraries from hydrolyzed total RNA.
BMC Genomics. 2007 Oct 9;8(1):363.
ABSTRACT: BACKGROUND: The transcribed sequences of a cell, the transcriptome, represent the trans-acting fraction of the genetic information, yet eukaryotic cDNA libraries are typically made from only the poly-adenylated fraction. The non-coding or translated but non-polyadenylated RNAs are therefore not represented. The goal of this study was to develop a method that would more completely represent the transcriptome in a useful format, avoiding over-representation of some of the abundant, but low-complexity non-translated transcripts. RESULTS: We developed a combination of self-subtraction and directional cloning procedures for this purpose. Libraries were prepared from partially degraded (hydrolyzed) total RNA from three different species. A restriction endonuclease site was added to the 3' end during first-strand synthesis using a directional random-priming technique. The abundant non-polyadenylated rRNA and tRNA sequences were largely removed by using self-subtraction to equalize the representation of the various RNA species. Sequencing random clones from the libraries showed that 87% of clones were in the forward orientation with respect to known or predicted transcripts. 70% matched identified or predicted translated RNAs in the sequence databases. Abundant mRNAs were less frequent in the self-subtracted libraries compared to a non-subtracted mRNA library. 3% of the sequences were from known or hypothesized ncRNA loci, including five matches to miRNA loci. CONCLUSIONS: We describe a simple method for making high-quality, directional, random-primed, cDNA libraries from small amounts of degraded total RNA. This technique is advantageous in situations where a cDNA library with complete but equalized representation of transcribed sequences, whether polyadenylated or not, is desired. [Abstract/Link to Full Text]

Schroeder TM, Nair AK, Staggs R, Lamblin AF, Westendorf JJ
Gene profile analysis of osteoblast genes differentially regulated by histone deacetylase inhibitors.
BMC Genomics. 2007 Oct 9;8(1):362.
ABSTRACT: BACKGROUND: Osteoblast differentiation requires the coordinated stepwise expression of multiple genes. Histone deacetylase inhibitors (HDIs) accelerate the osteoblast differentiation process by blocking the activity of histone deacetylases (HDACs), which alter gene expression by modifying chromatin structure. We previously demonstrated that HDIs and HDAC3 shRNAs accelerate matrix mineralization and the expression of osteoblast maturation genes (e.g. alkaline phosphatase, osteocalcin). Identifying other genes that are differentially regulated by HDIs might identify new pathways that contribute to osteoblast differentiation. RESULTS: To identify other osteoblast genes that are altered early by HDIs, we incubated MC3T3-E1 preosteoblasts with HDIs (trichostatin A, MS-275, or valproic acid) for 18 hours in osteogenic conditions. The promotion of osteoblast differentiation by HDIs in this experiment was confirmed by osteogenic assays. Gene expression profiles relative to vehicle-treated cells were assessed by microarray analysis with Affymetrix GeneChip 430 2.0 arrays. The regulation of several genes by HDIs in MC3T3-E1 cells and primary osteoblasts was verified by quantitative real-time PCR. Nine genes were differentially regulated by at least two-fold after exposure to each of the three HDIs and six were verified by PCR in osteoblasts. Four of the verified genes (solute carrier family 9 isoform 3 regulator 1 (Slc9a3r1), sorbitol dehydrogenase 1, a kinase anchor protein, and glutathione S-transferase alpha 4) were induced. Two genes (proteasome subunit, beta type 10 and adaptor-related protein complex AP-4 sigma 1) were suppressed. We also identified eight growth factors and growth factor receptor genes that are significantly altered by each of the HDIs, including Frizzled related proteins 1 and 4, which modulate the Wnt signaling pathway. 
CONCLUSIONS: This study identifies osteoblast genes that are regulated early by HDIs and indicates pathways that might promote osteoblast maturation following HDI exposure. One gene whose upregulation following HDI treatment is consistent with this notion is Slc9a3r1. Also known as NHERF1, Slc9a3r1 is required for optimal bone density. Similarly, the regulation of Wnt receptor genes indicates that this crucial pathway in osteoblast development is also affected by HDIs. These data support the hypothesis that HDIs regulate the expression of genes that promote osteoblast differentiation and maturation. [Abstract/Link to Full Text]
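
The selection criterion described in this abstract, a change of at least two-fold after exposure to each of the three HDIs, can be sketched as a simple filter. This is a hypothetical illustration and not the authors' microarray workflow: the function name, the gene labels and all ratios are invented, and requiring the same direction of change under every treatment is an added assumption.

```python
import math


def consistently_regulated(fold_changes, min_fold=2.0):
    """Keep genes changed by at least `min_fold` under every treatment.

    fold_changes: dict mapping gene -> dict of treatment -> ratio
    (treated / vehicle). Returns gene -> "induced" or "suppressed".
    Assumption: the direction must be the same for all treatments.
    """
    cutoff = math.log2(min_fold)
    hits = {}
    for gene, ratios in fold_changes.items():
        logs = [math.log2(r) for r in ratios.values()]
        if all(value >= cutoff for value in logs):
            hits[gene] = "induced"
        elif all(value <= -cutoff for value in logs):
            hits[gene] = "suppressed"
    return hits


# Invented ratios for three hypothetical genes under the three HDIs.
example = {
    "gene_a": {"TSA": 2.0, "MS-275": 4.0, "VPA": 3.0},
    "gene_b": {"TSA": 0.5, "MS-275": 0.25, "VPA": 0.4},
    "gene_c": {"TSA": 2.5, "MS-275": 1.5, "VPA": 3.0},  # fails under MS-275
}
print(consistently_regulated(example))
```

Working on the log2 scale treats induction and suppression symmetrically, so a ratio of 0.5 counts as a two-fold change just as a ratio of 2.0 does.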

Sokolovic M, Wehkamp D, Sokolovic A, Vermeulen J, Gilhuijs-Pederson LA, van Haaften RI, Nikolsky Y, Evelo CT, van Kampen AH, Hakvoort TB, Lamers WH
Fasting induces a biphasic adaptive metabolic response in murine small intestine.
BMC Genomics. 2007 Oct 9;8(1):361.
ABSTRACT: BACKGROUND: The gut is a major energy consumer, but a comprehensive overview of the adaptive response to fasting is lacking. Gene-expression profiling, pathway analysis, and immunohistochemistry were therefore carried out on mouse small intestine after 0, 12, 24, and 72 hours of fasting. RESULTS: Intestinal weight declined to 50% of control, but this loss of tissue mass was distributed proportionally among the gut's structural components, so that the microarrays' tissue base remained unaffected. Unsupervised hierarchical clustering of the microarrays revealed that the successive time points separated into distinct branches. Pathway analysis depicted a pronounced, but transient early response that peaked at 12 hours, and a late response that became progressively more pronounced with continued fasting. Early changes in gene expression were compatible with a cellular deficiency in glutamine and metabolic adaptations directed at glutamine conservation, inhibition of pyruvate oxidation, stimulation of glutamate catabolism via aspartate and phosphoenolpyruvate to lactate, and enhanced fatty-acid oxidation and ketone-body synthesis. In addition, the expression of key genes involved in cell cycling and apoptosis was suppressed. At 24 hours of fasting, many of the early adaptive changes abated. Major changes upon continued fasting implied the production of glucose rather than lactate from carbohydrate backbones, a downregulation of fatty-acid oxidation and a very strong downregulation of the electron-transport chain. Cell cycling and apoptosis remained suppressed. CONCLUSION: The changes in gene expression indicate that the small intestine rapidly loses mass during fasting to generate lactate or glucose and ketone bodies. Meanwhile, intestinal architecture is maintained by downregulation of cell turnover. [Abstract/Link to Full Text]

Gogvadze E, Barbisan C, Lebrun MH, Buzdin A
Tripartite chimeric pseudogene from the genome of rice blast fungus Magnaporthe grisea suggests double template jumps during long interspersed nuclear element (LINE) reverse transcription.
BMC Genomics. 2007;8:360.
BACKGROUND: A systematic survey of loci carrying retrotransposons in the genome of the rice blast fungus Magnaporthe grisea allowed the identification of novel non-canonical retropseudogenes. These elements are chimeric retrogenes composed of DNA copies from different cellular transcripts directly fused to each other. Their components are copies of a non protein-coding highly expressed RNA of unknown function termed WEIRD and of two fungal retrotransposons: MGL and Mg-SINE. Many of these chimeras are transcribed in various M. grisea tissues and during plant infection. Chimeric retroelements with a similar structure were recently found in three mammalian genomes. All these chimeras are likely formed by RNA template switches during the reverse transcription of diverse LINE elements. RESULTS: We have shown that in M. grisea template switching occurs at specific sites within the initial template RNA which contains a characteristic consensus sequence. We also provide evidence that both single and double template switches may occur during LINE retrotransposition, resulting in the fusion of three different transcript copies. In addition to the 33 bipartite elements, one tripartite chimera corresponding to the fusion of three retrotranscripts (WEIRD, Mg-SINE, MGL-LINE) was identified in the M. grisea genome. Unlike the previously reported two human tripartite elements, this fungal retroelement is flanked by identical 14 bp-long direct repeats. The presence of these short terminal direct repeats demonstrates that the LINE enzymatic machinery was involved in the formation of this chimera and its integration in the M. grisea genome. CONCLUSION: A survey of mammalian genomic databases also revealed two novel tripartite chimeric retroelements, suggesting that double template switches occur during reverse transcription of LINE retrotransposons in different eukaryotic organisms. [Abstract/Link to Full Text]

Taboada EN, van Belkum AF, Yuki N, Acedillo RR, Godschalk PC, Koga M, Endtz HP, Gilbert M, Nash JH
Comparative genomic analysis of Campylobacter jejuni associated with Guillain-Barre and Miller Fisher syndromes: neuropathogenic and enteritis-associated isolates can share high levels of genomic similarity.
BMC Genomics. 2007 Oct 5;8(1):359.
ABSTRACT: BACKGROUND: Campylobacter jejuni infection represents the most frequent antecedent infection triggering the onset of the neuropathic disorders Guillain-Barre syndrome (GBS) and Miller Fisher syndrome (MFS). Although sialylated ganglioside-mimicking lipo-oligosaccharide (LOS) structures are the strongest neuropathogenic determinants in C. jejuni, they do not appear to be the only requirement for a neuropathic outcome since strains capable of their production have been isolated from patients with uncomplicated cases of enteritis. Consequently, other pathogen and/or host-related factors contribute to the onset of neurological complications. We have used comparative genomic hybridization to perform a detailed genomic comparison of strains isolated from GBS/MFS and enteritis-only patients. Our dataset, in which the gene conservation profile for 1712 genes was assayed in 102 strains, including 56 neuropathogenic isolates, represents the largest systematic search for C. jejuni factors associated with GBS/MFS to date and has allowed us to analyze the genetic background of neuropathogenic C. jejuni strains with an unprecedented level of resolution. RESULTS: The majority of GBS/MFS strains can be assigned to one of six major lineages, suggesting that several genetic backgrounds can result in a neuropathogenic phenotype. A statistical analysis of gene conservation rates revealed that although genes involved in the sialylation of LOS structures were significantly associated with neuropathogenic strains, still many enteritis-control strains both bear these genes and share remarkable levels of genomic similarity with their neuropathogenic counterparts. Two capsule biosynthesis genes (Cj1421c and Cj1428c) showed higher conservation rates among neuropathogenic strains compared to enteritis-control strains. Any potential involvement of these genes in neuropathogenesis must be assessed. 
A single gene (HS:3 Cj1135) had a higher conservation rate among enteritis-control strains. This gene encodes a glucosyltransferase that is found in some of the LOS classes that do not express ganglioside mimics. CONCLUSIONS: Our findings corroborate that neuropathogenic factors may be transferred between unrelated strains of different genetic background. Our results also suggest that the failure of some strains isolated from uncomplicated cases of enteritis to elicit a neuropathic clinical outcome may be due to subtle genetic differences that silence their neuropathogenic potential and/or due to host-related factors. The microarray data have been deposited in NCBI's Gene Expression Omnibus under accession number GSE3579. [Abstract/Link to Full Text]

Kassahn KS, Crozier RH, Ward AC, Stone G, Caley MJ
From transcriptome to biological function: environmental stress in an ectothermic vertebrate, the coral reef fish Pomacentrus moluccensis.
BMC Genomics. 2007 Oct 5;8(1):358.
ABSTRACT: BACKGROUND: Our understanding of the importance of transcriptional regulation for biological function is continuously improving. We still know, however, comparatively little about how environmentally induced stress affects gene expression in vertebrates, and the consistency of transcriptional stress responses to different types of environmental stress. In this study, we used a multi-stressor approach to identify components of a common stress response as well as components unique to different types of environmental stress. We exposed individuals of the coral reef fish Pomacentrus moluccensis to hypoxic, hyposmotic, cold and heat shock and measured the responses of approximately 16,000 genes in liver. We also compared winter and summer responses to heat shock to examine the capacity for such responses to vary with acclimation to different ambient temperatures. RESULTS: We identified a series of gene functions that were involved in all stress responses examined here, suggesting some common effects of stress on biological function. These common responses were achieved by the regulation of largely independent sets of genes; the responses of individual genes varied greatly across different stress types. In response to heat exposure over five days, a total of 324 gene loci were differentially expressed. Many heat-responsive genes had functions associated with protein turnover, metabolism, and the response to oxidative stress. We were also able to identify groups of co-regulated genes, the genes within which shared similar functions. CONCLUSION: This is the first environmental genomic study to measure gene regulation in response to different environmental stressors in a natural population of a warm-adapted ectothermic vertebrate. We have shown that different types of environmental stress induce expression changes in genes with similar gene functions, but that the responses of individual genes vary between stress types. 
The functions of heat-responsive genes suggest that prolonged heat exposure leads to oxidative stress and protein damage, a challenge of the immune system, and the re-allocation of energy sources. This study hence offers insight into the effects of environmental stress on biological function and sheds light on the expected sensitivity of coral reef fishes to elevated temperatures in the future. [Abstract/Link to Full Text]

Baron D, Montfort J, Houlgatte R, Fostier A, Guiguen Y
Androgen-induced masculinization in rainbow trout results in a marked dysregulation of early gonadal gene expression profiles.
BMC Genomics. 2007;8:357.
BACKGROUND: Fish gonadal sex differentiation is affected by sex steroids treatments providing an efficient strategy to control the sexual phenotype of fish for aquaculture purposes. However, the biological effects of such treatments are poorly understood. The aim of this study was to identify the main effects of an androgen masculinizing treatment (11beta-hydroxyandrostenedione, 11betaOHDelta4, 10 mg/kg of food for 3 months) on gonadal gene expression profiles of an all-female genetic population of trout. To characterize the most important molecular features of this process, we used a large scale gene expression profiling approach using rainbow trout DNA microarrays combined with a detailed gene ontology (GO) analysis. RESULTS: 2,474 genes were characterized as up-regulated or down-regulated in trout female gonads masculinized by androgen in comparison with control male or female gonads from untreated all-male and all-female genetic populations. These genes were classified in 13 k-means clusters of temporally correlated expression profiles. Gene ontology (GO) data mining revealed that androgen treatment triggers a marked down-regulation of genes potentially involved in early oogenesis processes (GO 'mitotic cell cycle', 'nucleolus'), an up-regulation of the translation machinery (GO 'ribosome') along with a down-regulation of proteolysis (GO 'proteolysis', 'peptidase' and 'metallopeptidase activity'). Genes considered as muscle fibres markers (GO 'muscle contraction') and genes annotated as structural constituents of the extracellular matrix (GO 'extracellular matrix') or related to meiosis (GO 'chromosome' and 'meiosis') were found significantly enriched in the two clusters of genes specifically up-regulated in androgen-treated female gonads. GO annotations 'Sex differentiation' and 'steroid biosynthesis' were enriched in a cluster of genes with high expression levels only in control males. 
Interestingly none of these genes were stimulated by the masculinizing androgen treatment. CONCLUSION: This study provides evidence that androgen masculinization results in a marked dysregulation of early gene expression profiles when compared to natural testicular or ovarian differentiation. Based on these results we suggest that, in our experimental conditions, androgen masculinization proceeds mainly through an early inhibition of female development. [Abstract/Link to Full Text]

Hoogewijs D, Geuens E, Dewilde S, Vierstraete A, Moens L, Vinogradov S, Vanfleteren JR
Wide diversity in structure and expression profiles among members of the Caenorhabditis elegans globin protein family.
BMC Genomics. 2007 Oct 4;8(1):356.
ABSTRACT: BACKGROUND: The emergence of high throughput genome sequencing facilities and powerful high performance bioinformatic tools has highlighted hitherto unexpected wide occurrence of globins in the three kingdoms of life. In silico analysis of the genome of C. elegans identified 33 putative globin genes. It remains a mystery why this tiny animal might need so many globins. As an inroad to understanding this complexity we initiated a structural and functional analysis of the globin family in C. elegans. RESULTS: All 33 C. elegans putative globin genes are transcribed. The translated sequences have the essential signatures of single domain bona fide globins, or they contain a distinct globin domain that is part of a larger protein. All globin domains can be aligned so as to fit the globin fold, but internal interhelical and N- and C-terminal extensions and a variety of amino acid substitutions generate much structural diversity among the globins of C. elegans. Likewise, the encoding genes lack a conserved pattern of intron insertion positioning. We analyze the expression profiles of the globins during the progression of the life cycle, and we find that distinct subsets of globins are induced, or repressed, in wild-type dauers and in daf-2(e1370)/ insulin-receptor mutant adults, although these animals share several physiological features including resistance to elevated temperature, oxidative stress and hypoxic death. Several globin genes are upregulated following oxygen deprivation and we find that HIF-1 and DAF-2 each are required for this response. Our data indicate that the DAF-2 regulated transcription factor DAF-16/FOXO positively modulates hif-1 transcription under anoxia but opposes expression of the HIF-1 responsive globin genes itself. In contrast, the canonical globin of C. elegans, ZK637.13, is not responsive to anoxia. Reduced DAF-2 signaling leads to enhanced transcription of this globin and DAF-16 is required for this effect. 
CONCLUSION: We found that all 33 putative globins are expressed, albeit at low or very low levels, perhaps indicating cell-specific expression. They show wide diversity in gene structure and amino acid sequence, suggesting a long evolutionary history. Ten globins are responsive to oxygen deprivation in an interacting HIF-1 and DAF-16 dependent manner. Globin ZK637.13 is not responsive to oxygen deprivation and regulated by the Ins/IGF pathway only suggesting that this globin may contribute to the life maintenance program. [Abstract/Link to Full Text]

Rattei T, Ott S, Gutacker M, Rupp J, Maass M, Schreiber S, Solbach W, Wirth T, Gieffers J
Genetic diversity of the obligate intracellular bacterium Chlamydophila pneumoniae by genome-wide analysis of single nucleotide polymorphisms: evidence for highly clonal population structure.
BMC Genomics. 2007;8:355.
BACKGROUND: Chlamydophila pneumoniae is an obligate intracellular bacterium that replicates in a biphasic life cycle within eukaryotic host cells. Four published genomes revealed an identity of > 99%. This remarkable finding raised questions about the existence of distinguishable genotypes in correlation with geographical and anatomical origin. RESULTS: We studied the genetic diversity of C. pneumoniae by analysing synonymous single nucleotide polymorphisms (sSNPs) that are under reduced selection pressure. We conducted an in silico analysis of the four sequenced genomes, chose 232 representative sSNPs and analysed the loci in 38 C. pneumoniae isolates. We identified 15 different genotypes that were separated in four major clusters. Clusters were not associated with anatomical or geographical origin. However, animal lineages are basal on the C. pneumoniae phylogeny, suggesting a recent transmission to humans through successive bottlenecks some 150,000 years ago. A lack of detectable variation in 17 isolates emphasizes the extraordinary genetic conservation of this species and the high clonality of the population. Moreover, the largest cluster, which encompasses 80% of all analysed strains, is an extremely young clade that went through an important population expansion some 3,300 years ago. CONCLUSION: sSNPs have proven useful as a sensitive marker to gain new insights into genetic diversity, population structure and evolutionary history of C. pneumoniae. [Abstract/Link to Full Text]

Liao BK, Deng AN, Chen SC, Chou MY, Hwang PP
Expression and water calcium dependence of calcium transporter isoforms in zebrafish gill mitochondrion-rich cells.
BMC Genomics. 2007 Oct 4;8(1):354.
ABSTRACT: BACKGROUND: Freshwater fish absorb Ca2+ predominantly from ambient water, and more than 97% of Ca2+ uptake is achieved by active transport through gill mitochondrion-rich (MR) cells. In the current model for Ca2+ uptake in gill MR cells, Ca2+ passively enters the cytosol via the epithelium Ca2+ channel (ECaC), and then is extruded into the plasma through the basolateral Na+/Ca2+ exchanger (NCX) and plasma membrane Ca2+-ATPase (PMCA). However, no convincing molecular or cellular evidence has been available to support the role of specific PMCA and/or NCX isoforms in this model. Zebrafish (Danio rerio) is a good model for analyzing isoforms of a gene because of the plentiful genomic databases and expression sequence tag (EST) data. RESULTS: Using a strategy of BLAST from the zebrafish genome database (Sanger Institute), 6 isoforms of PMCAs (PMCA1a, PMCA1b, PMCA2, PMCA3a, PMCA3b, and PMCA4) and 7 isoforms of NCXs (NCX1a, NCX1b, NCX2a, NCX2b, NCX3, NCX4a, and NCX4b) were identified. In the reverse-transcriptase polymerase chain reaction (RT-PCR) analysis, 5 PMCAs and 2 NCXs were ubiquitously expressed in various tissues including gills. Triple fluorescence in situ hybridization and immunocytochemistry showed the colocalization of zecac, zpmca2, and zncx1b mRNAs in a portion of gill MR cells (using Na+-K+-ATPase as the marker), implying a subset of ionocytes specifically responsible for the transepithelial Ca2+ uptake in zebrafish gills. The gene expressions in gills of high- or low-Ca2+-acclimated zebrafish by quantitative real-time PCR analysis showed that zecac was the only gene regulated in response to environmental Ca2+ levels, while zpmcas and zncxs remained steady. CONCLUSIONS: The present study provides molecular evidence for the specific isoforms of Ca2+ transporters, zECaC, zPMCA2, and zNCX1b, supporting the current Ca2+ uptake model, in which ECaC may play a role as the major regulatory target for this mechanism during environmental challenge. 
[Abstract/Link to Full Text]

Swindell WR
Gene expression profiling of long-lived dwarf mice: longevity-associated genes and relationships with diet, gender and aging.
BMC Genomics. 2007;8:353.
BACKGROUND: Long-lived strains of dwarf mice carry mutations that suppress growth hormone (GH) and insulin-like growth factor I (IGF-I) signaling. The downstream effects of these endocrine abnormalities, however, are not well understood and it is unclear how these processes interact with aging mechanisms. This study presents a comparative analysis of microarray experiments that have measured hepatic gene expression levels in long-lived strains carrying one of four mutations (Prop1(df/df), Pit1(dw/dw), Ghrhr(lit/lit), GHR-KO) and describes how the effects of these mutations relate to one another at the transcriptional level. Points of overlap with the effects of calorie restriction (CR), CR mimetic compounds, low fat diets, gender dimorphism and aging were also examined. RESULTS: All dwarf mutations had larger and more consistent effects on IGF-I expression than dietary treatments. In comparison to dwarf mutations, however, the transcriptional effects of CR (and some CR mimetics) overlapped more strongly with those of aging. Surprisingly, the Ghrhr(lit/lit) mutation had much larger effects on gene expression than the GHR-KO mutation, even though both mutations affect the same endocrine pathway. Several genes potentially regulated or co-regulated with the IGF-I transcript in liver tissue were identified, including a DNA repair gene (Snm1) that is upregulated in proportion to IGF-I inhibition. A total of 13 genes exhibiting parallel differential expression patterns among all four strains of long-lived dwarf mice were identified, in addition to 30 genes with matching differential expression patterns in multiple long-lived dwarf strains and under CR. CONCLUSION: Comparative analysis of microarray datasets can identify patterns and consistencies not discernable from any one dataset individually. This study implements new analytical approaches to provide a detailed comparison among the effects of life-extending mutations, dietary treatments, gender and aging. 
This comparison provides insight into a broad range of issues relevant to the study of mammalian aging. In this context, 43 longevity-associated genes are identified and individual genes with the highest level of support among all microarray experiments are highlighted. These results provide promising targets for future experimental investigation as well as potential clues for understanding the functional basis of lifespan extension in mammalian systems. [Abstract/Link to Full Text]

Tsai HK, Su CP, Lu MY, Shih CH, Wang D
Co-expression of adjacent genes in yeast cannot be simply attributed to shared regulatory system.
BMC Genomics. 2007;8:352.
BACKGROUND: Adjacent gene pairs in the yeast genome have a tendency to express concurrently. Sharing of regulatory elements within the intergenic region of those adjacent gene pairs was often considered the major mechanism responsible for such co-expression. However, it is still debated to what extent common transcription factors (TFs) contribute to the co-expression of adjacent genes. In order to resolve the evolutionary aspect of this issue, we investigated the conservation of adjacent pairs in five yeast species. By using the information for TF binding sites in promoter regions available from the MYBS database, the ratios of TF-sharing pairs among all the adjacent pairs in yeast genomes were analyzed. The levels of co-expression in different adjacent patterns were also compared. RESULTS: Our analyses showed that the proportion of adjacent pairs conserved in five yeast species is relatively low compared to that in the mammalian lineage. The proportion was also low for adjacent gene pairs with shared TFs. Particularly, the statistical analysis suggested that co-expression of adjacent gene pairs was not noticeably associated with the sharing of TFs in these pairs. We further proposed a case of the PAC (polymerase A and C) and RRPE (rRNA processing element) motifs which co-regulate divergent/bidirectional pairs, and found that the shared TFs were not significantly relevant to co-expression of divergent promoters among adjacent genes. CONCLUSION: Our findings suggested that the commonly shared cis-regulatory system does not solely contribute to the co-expression of adjacent gene pairs in the yeast genome. Therefore we believe that during evolution yeasts have developed a sophisticated regulatory system that integrates both TF-based and non-TF based mechanism(s) for concurrent regulation of neighboring genes in response to various environmental changes. [Abstract/Link to Full Text]

Nagarajan V, Elasri MO
SAMMD: Staphylococcus aureus microarray meta-database.
BMC Genomics. 2007;8:351.
BACKGROUND: Staphylococcus aureus is an important human pathogen, causing a wide variety of diseases ranging from superficial skin infections to severe life threatening infections. S. aureus is one of the leading causes of nosocomial infections. Its ability to resist multiple antibiotics poses a growing public health problem. In order to understand the mechanism of pathogenesis of S. aureus, several global expression profiles have been developed. These transcriptional profiles included regulatory mutants of S. aureus and growth of wild type under different growth conditions. The abundance of these profiles has generated a large amount of data without a uniform annotation system to comprehensively examine them. We report the development of the Staphylococcus aureus Microarray meta-database (SAMMD) which includes data from all the published transcriptional profiles. SAMMD is a web-accessible database that helps users to perform a variety of analyses against and within the existing transcriptional profiles. DESCRIPTION: SAMMD is a relational database that uses MySQL as the back end and PHP/JavaScript/DHTML as the front end. The database is normalized and consists of five tables, which hold information about gene annotations, regulated gene lists, experimental details, references, and other details. SAMMD data is collected from peer-reviewed published articles. Data extraction and conversion was done using Perl scripts while data entry was done through the phpMyAdmin tool. The database is accessible via a web interface that contains several features such as a simple search by ORF ID, gene name, gene product name, advanced search using gene lists, comparing among datasets, browsing, downloading, statistics, and help. The database is licensed under General Public License (GPL). CONCLUSION: SAMMD is hosted and available at Currently there are over 9500 entries for regulated genes, from 67 microarray experiments. SAMMD will help staphylococcal scientists to analyze their expression data and understand it at global level. It will also allow scientists to compare and contrast their transcriptome to that of the other published transcriptomes. [Abstract/Link to Full Text]

Houot L, Floutier M, Marteyn B, Michaut M, Picciocchi A, Legrain P, Aude JC, Cassier-Chauvat C, Chauvat F
Cadmium triggers an integrated reprogramming of the metabolism of Synechocystis PCC6803, under the control of the Slr1738 regulator.
BMC Genomics. 2007 Oct 2;8(1):350.
ABSTRACT: BACKGROUND: Cadmium is a persistent pollutant that threatens most biological organisms, including cyanobacteria that support a large part of the biosphere. Using a multifaceted approach, we have investigated the global responses to Cd and other relevant stresses (H2O2 and Fe) in the model cyanobacterium Synechocystis PCC6803. RESULTS: We found that cells respond to the Cd stress in a process with two main temporal phases. In the "early" phase cells mainly limit Cd entry through the negative and positive regulation of numerous genes operating in metal uptake and export, respectively. As time proceeds, the number of responsive genes increases. In this "massive" phase, Cd downregulates most genes operating in (i) photosynthesis (PS) that normally provides ATP and NADPH; (ii) assimilation of carbon, nitrogen and sulfur that requires ATP and NAD(P)H; and (iii) translation machinery, a major consumer of ATP and nutrients. Simultaneously, many genes are upregulated, such as those involved in Fe acquisition, stress tolerance, and protein degradation (crucial to nutrient recycling). The most striking common effect of Cd and H2O2 is the disturbance of both light tolerance and Fe homeostasis, which appeared to be interdependent. Our results indicate that cells challenged with H2O2 or Cd use different strategies for the same purpose of supplying Fe atoms to Fe-requiring metalloenzymes and the SUF machinery, which synthesizes or repairs Fe-S centers. Cd-stressed cells preferentially break down their Fe-rich PS machinery, whereas H2O2-challenged cells preferentially accelerate the intake of Fe atoms from the medium. CONCLUSIONS: We view the responses to Cd as an integrated "Yin Yang" reprogramming of the whole metabolism, which we found to be controlled by the Slr1738 regulator. As the Yin process, the ATP- and nutrients-sparing downregulation of anabolism limits the poisoning incorporation of Cd into metalloenzymes. As the compensatory Yang process, the PS breakdown liberates nutrient assimilates for the synthesis of Cd-tolerance proteins, among which we found the Slr0946 arsenate reductase enzyme. [Abstract/Link to Full Text]

La MV, Crapoulet N, Barbry P, Raoult D, Renesto P
Comparative genomic analysis of Tropheryma whipplei strains reveals that diversity among clinical isolates is mainly related to the WiSP proteins.
BMC Genomics. 2007;8:349.
BACKGROUND: The aim of this study was to analyze the genomic diversity of several Tropheryma whipplei strains by microarray-based comparative genomic hybridization. Fifteen clinical isolates originating from biopsy samples recovered from different countries were compared with the T. whipplei Twist strain. For each isolate, the genes were defined as either present or absent/divergent using the GACK analysis software. Genomic changes were then further characterized by PCR and sequencing. RESULTS: The results revealed a limited genetic variation among the T. whipplei isolates, with at most 2.24% of the probes exhibiting differential hybridization against the Twist strain. The main variation was found in genes encoding the WiSP membrane protein family. This work also demonstrated a 19.2 kb-pair deletion within the T. whipplei DIG15 strain. This deletion occurs in the same region as the previously described large genomic rearrangement between Twist and TW08/27. Thus, this can be considered as a major hot-spot for intra-specific T. whipplei differentiation. Analysis of this deleted region confirmed the role of WND domains in generating T. whipplei diversity. CONCLUSION: This work provides the first comprehensive genomic comparison of several T. whipplei isolates. It reveals that clinical isolates originating from various geographic and biological sources exhibit a high conservation rate, indicating that T. whipplei rarely interacts with exogenous DNA. Remarkably, frequent inter-strain variations were discovered that affected members of the WiSP family. [Abstract/Link to Full Text]

Wang K, Ubriaco G, Sutherland LC
RBM6-RBM5 transcription-induced chimeras are differentially expressed in tumours.
BMC Genomics. 2007 Oct 1;8(1):348.
ABSTRACT: BACKGROUND: Transcription-induced chimerism, a mechanism involving the transcription and intergenic splicing of two consecutive genes, has recently been estimated to account for ~5% of the human transcriptome. Despite this prevalence, the regulation and function of these fused transcripts remains largely uncharacterised. RESULTS: We identified three novel transcription-induced chimeras resulting from the intergenic splicing of a single RNA transcript incorporating the two neighbouring 3p21.3 tumour suppressor locus genes, RBM6 and RBM5, which encode the RNA Binding Motif protein 6 and RNA Binding Motif protein 5, respectively. Each of the three novel chimeric transcripts lacked exons 3, 6, 20 and 21 of RBM6 and exon 1 of RBM5. Differences between the transcripts were associated with the presence or absence of exon 4, exon 5 and a 17 nucleotide (nt) sequence from intron 10 of RBM6. All three chimeric transcripts incorporated the canonical splice sites from both genes (excluding the 17 nt intron 10 insertion). Differential expression was observed in tumour tissue compared to non-tumour tissue, and amongst tumour types. In breast tumour tissue, chimeric expression was associated with elevated levels of RBM6 and RBM5 mRNA, and increased tumour size. No protein expression was detected by in vitro transcription/translation. CONCLUSIONS: These results suggest that RBM6 mRNA experiences altered co-transcriptional gene regulation in certain cancers. The results also suggest that RBM6-RBM5 transcription-induced chimerism might be a process that is linked to the tumour-associated increased transcriptional activity of the RBM6 gene. It appears that none of the transcription-induced chimeras generates a protein product; however, the novel alternative splicing, which affects putative functional domains within exons 3, 6 and 11 of RBM6, does suggest that the generation of these chimeric transcripts has functional relevance. 
Finally, the association of chimeric expression with breast tumour size suggests that RBM6-RBM5 chimeric expression may be a potential tumour differentiation marker. [Abstract/Link to Full Text]

Kazanov MD, Vitreschak AG, Gelfand MS
Abundance and functional diversity of riboswitches in microbial communities.
BMC Genomics. 2007 Oct 1;8(1):347.
ABSTRACT: BACKGROUND: Several recently completed large-scale environmental sequencing projects produced a large amount of genetic information about microbial communities ('metagenomes') which is not biased towards cultured organisms. It is a good source for estimation of the abundance of genes and regulatory structures in both known and unknown members of microbial communities. In this study we consider the distribution of RNA regulatory structures, riboswitches, in the Sargasso Sea, Minnesota Soil and Whale Falls metagenomes. RESULTS: Over three hundred riboswitches were found in about 2 Gbp of metagenome DNA sequences. The abundance of riboswitches in metagenomes was highest for the TPP, B12 and GCVT riboswitches; the S-box, RFN, YKKC/YXKD, YYBP/YKOY regulatory elements showed lower but significant abundance, while the LYS, G-box, GLMS and YKOK riboswitches were rare. Regions downstream of identified riboswitches were scanned for open reading frames. Comparative analysis of identified ORFs revealed new riboswitch-regulated functions for several classes of riboswitches. In particular, we have observed phosphoserine aminotransferase serC (COG1932) and malate synthase glcB (COG2225) to be regulated by the glycine (GCVT) riboswitch; fatty acid desaturase ole1 (COG1398), by the cobalamin (B12) riboswitch; 5-methylthioribose-1-phosphate isomerase ykrS (COG0182), by the SAM-riboswitch. We also identified conserved riboswitches upstream of genes of unknown function: thiamine (TPP), cobalamin (B12), and glycine (GCVT, upstream of genes from COG4198). CONCLUSIONS: This study demonstrates applicability of bioinformatics to the analysis of RNA regulatory structures in metagenomes. [Abstract/Link to Full Text]
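
The downstream ORF scan mentioned in the abstract above can be illustrated with a minimal Python sketch. The abstract does not say which ORF finder or length threshold was used, so the function name `find_orfs`, the `min_codons` parameter and the forward-strand-only search are assumptions made purely for illustration.

```python
# Hypothetical sketch: scan a DNA region (e.g. the sequence downstream of a
# predicted riboswitch) for forward-strand open reading frames.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=30):
    """Return sorted (start, end) pairs of ATG...stop ORFs, end exclusive.

    min_codons counts the coding codons (start codon included, stop excluded).
    Only the three forward reading frames are examined in this sketch.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(dna):
            if dna[i:i + 3] != "ATG":
                i += 3
                continue
            # Walk codon by codon until an in-frame stop codon is reached.
            j = i + 3
            while j + 3 <= len(dna) and dna[j:j + 3] not in STOP_CODONS:
                j += 3
            if j + 3 > len(dna):
                break  # no in-frame stop codon before the end of the region
            if (j - i) // 3 >= min_codons:
                orfs.append((i, j + 3))
            i = j + 3  # resume the scan after the stop codon
    return sorted(orfs)
```

A call such as `find_orfs(region, min_codons=50)` would return candidate coding intervals that could then be compared against a protein database; a real analysis would also examine the reverse strand and alternative start codons.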

Bresell A, Persson B
Characterization of oligopeptide patterns in large protein sets.
BMC Genomics. 2007 Oct 1;8(1):346.
ABSTRACT: BACKGROUND: Recent sequencing projects and the growth of sequence data banks enable oligopeptide patterns to be characterized on a genome or kingdom level. Several studies have focused on kingdom or habitat classifications based on the abundance of short peptide patterns. There have also been efforts at local structural prediction based on short sequence motifs. Oligopeptide patterns undoubtedly carry valuable information content. Therefore, it is important to characterize these informational peptide patterns to shed light on possible new applications and the pitfalls implicit in neglecting bias in peptide patterns. RESULTS: We have studied four classes of pentapeptide patterns (designated POP, NEP, ORP and URP) in the kingdoms archaea, bacteria and eukaryotes. POP are highly abundant patterns statistically not expected to exist; NEP are patterns that do not exist but are statistically expected to; ORP are patterns unique to a kingdom; and URP are patterns excluded from a kingdom. We used two data sources: the de facto standard of protein knowledge Swissprot, and a set of 386 completely sequenced genomes. For each class of peptides we looked at the 100 most extreme and found both known and unknown sequence features. Most of the known sequence motifs can be explained on the basis of the protein families from which they originate. CONCLUSIONS: We find an inherent bias of certain oligopeptide patterns in naturally occurring proteins that cannot be explained solely on the basis of residue distribution in single proteins, kingdoms or databases. We see three predominant categories of patterns: (i) patterns widespread in a kingdom such as those originating from respiratory chain-associated proteins and translation machinery; (ii) proteins with structurally and/or functionally favored patterns, which have not yet been ascribed this role; (iii) multicopy species-specific retrotransposons, only found in the genome set. 
These categories will affect the accuracy of sequence pattern algorithms that rely mainly on amino acid residue usage. Methods presented in this paper may be used to discover targets for antibiotics, as we identify numerous examples of kingdom-specific antigens among our peptide classes. The methods may also be useful for detecting coding regions of genes. [Abstract/Link to Full Text]
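
As a rough illustration of how observed pentapeptide counts might be compared with a statistical expectation, the sketch below counts k-mers in a protein set and derives expected counts from independent residue frequencies. The abstract does not specify the authors' statistical model, so this background model, the function names and the parameters are assumptions, not the published method.

```python
# Hypothetical sketch: observed vs. expected oligopeptide counts under an
# independent-residue background model (an assumed model, for illustration).
from collections import Counter
from math import prod

def kmer_counts(seqs, k=5):
    """Count every overlapping k-residue window in the sequences."""
    counts = Counter()
    for s in seqs:
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    return counts

def residue_freqs(seqs):
    """Relative frequency of each residue across all sequences."""
    counts = Counter("".join(seqs))
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}

def observed_expected(seqs, k=5):
    """Map each observed k-mer to (observed count, expected count)."""
    obs = kmer_counts(seqs, k)
    freqs = residue_freqs(seqs)
    n_windows = sum(max(len(s) - k + 1, 0) for s in seqs)
    return {
        kmer: (n, n_windows * prod(freqs[aa] for aa in kmer))
        for kmer, n in obs.items()
    }
```

Patterns whose observed count far exceeds the expected count would correspond loosely to the POP class, while NEP-like patterns would have to be found by enumerating k-mers with a high expected count that never occur in the data.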

Buffart TE, Carvalho B, Mons T, Reis RM, Moutinho C, Silva P, van Grieken NC, Vieth M, Stolte M, van de Velde CJ, Schrock E, Matthaei A, Ylstra B, Carneiro F, Meijer GA
DNA copy number profiles of gastric cancer precursor lesions.
BMC Genomics. 2007 Oct 1;8(1):345.
ABSTRACT: BACKGROUND: Chromosomal instability (CIN) is the most prevalent type of genomic instability in gastric tumours, but its role in malignant transformation of the gastric mucosa is still obscure. In the present study, we set out to study whether two morphologically distinct categories of gastric cancer precursor lesions, i.e. intestinal-type and pyloric gland adenomas, would carry different patterns of DNA copy number changes, possibly reflecting distinct genetic pathways of gastric carcinogenesis in these two adenoma types. RESULTS: Using a 5K BAC array CGH platform, we showed that the most common aberrations shared by the 11 intestinal-type and 10 pyloric gland adenomas were gains of chromosomes 9 (29%), 11q (29%) and 20 (33%), and losses of chromosomes 13q (48%), 6(48%), 5(43%) and 10 (33%). The most frequent aberrations in intestinal-type gastric adenoma were gains on 11q, 9q and 8, and losses on chromosomes 5q, 6, 10 and 13, whereas in pyloric gland gastric adenomas these were gains on chromosome 20 and losses on 5q and 6. However, no significant differences were observed between the two adenoma types. CONCLUSIONS: The results suggest that gains on chromosomes 8, 9q, 11q and 20, and losses on chromosomes 5q, 6, 10 and 13, likely represent early events in gastric carcinogenesis. The phenotypical entities, intestinal-type and pyloric gland adenomas, however, do not differ significantly (P = 0.8) at the level of DNA copy number changes. [Abstract/Link to Full Text]

Graham NS, Broadley MR, Hammond JP, White PJ, May ST
Optimising the analysis of transcript data using high density oligonucleotide arrays and genomic DNA-based probe selection.
BMC Genomics. 2007;8:344.
BACKGROUND: Affymetrix GeneChip arrays are widely used for transcriptomic studies in a diverse range of species. Each gene is represented on a GeneChip array by a probe-set, consisting of up to 16 probe-pairs. Signal intensities across probe-pairs within a probe-set vary in part due to different physical hybridisation characteristics of individual probes with their target labelled transcripts. We have previously developed a technique to study the transcriptomes of heterologous species based on hybridising genomic DNA (gDNA) to a GeneChip array designed for a different species, and subsequently using only those probes with good homology. RESULTS: Here we have investigated the effects of hybridising homologous species gDNA to study the transcriptomes of species for which the arrays have been designed. Genomic DNA from Arabidopsis thaliana and rice (Oryza sativa) were hybridised to the Affymetrix Arabidopsis ATH1 and Rice Genome GeneChip arrays respectively. Probe selection based on gDNA hybridisation intensity increased the number of genes identified as significantly differentially expressed in two published studies of Arabidopsis development, and optimised the analysis of technical replicates obtained from pooled samples of RNA from rice. CONCLUSION: This mixed physical and bioinformatics approach can be used to optimise estimates of gene expression when using GeneChip arrays. [Abstract/Link to Full Text]

Park SJ, Lee YS, Hwang UW
The complete mitochondrial genome of the sea spider Achelia bituberculata (Pycnogonida, Ammotheidae): arthropod ground pattern of gene arrangement.
BMC Genomics. 2007 Oct 1;8(1):343.
ABSTRACT: BACKGROUND: The phylogenetic position of pycnogonids is a long-standing and controversial issue in arthropod phylogeny. This controversy has recently been rekindled by differences in the conclusions based on neuroanatomical data concerning the chelifore and the patterns of Hox expression. The mitochondrial genome of a sea spider, Nymphon gracile (Pycnogonida, Nymphonidae), was recently reported in an attempt to address this issue. However, N. gracile appears to be a long-branch taxon on the phylogenetic tree and exhibits a number of peculiar features, such as 10 tRNA translocations and even an inversion of several protein-coding genes. Sequences of other pycnogonid mitochondrial genomes are needed if the position of pycnogonids is to be elucidated on this basis. RESULTS: The complete mitochondrial genome (15,474 bp) of a sea spider (Achelia bituberculata) belonging to the family Ammotheidae, which combines a number of anatomical features considered plesiomorphic with respect to other pycnogonids, was sequenced and characterized. The genome organization shows the features typical of most metazoan animal genomes (37 tightly-packed genes). The overall gene arrangement is completely identical to the arthropod ground pattern, with one exception: the position of the trnQ gene between the rrnS gene and the control region. Maximum likelihood and Bayesian inference trees inferred from the amino acid sequences of mitochondrial protein-coding genes consistently indicate that the pycnogonids (A. bituberculata and N. gracile) may be closely related to the clade of Acari and Araneae. CONCLUSIONS: The complete mitochondrial genome sequence of A. bituberculata (Family Ammotheidae) and the previously-reported partial sequence of Endeis spinosa show the gene arrangement patterns typical of arthropods (Limulus-like), but they differ markedly from that of N. gracile. 
Phylogenetic analyses based on mitochondrial protein-coding genes showed that Pycnogonida may be authentic arachnids (= aquatic arachnids) within Chelicerata sensu lato, as indicated by the name 'sea spider,' and suggest that the Cormogonida theory - that the pycnogonids are a sister group of all other arthropods - should be rejected. However, in view of the relatively weak node confidence, strand-biased nucleotide composition and long-branch attraction artifact, further more intensive studies seem necessary to resolve the exact position of the pycnogonids. [Abstract/Link to Full Text]

Navarro-Costa P, Pereira L, Alves C, Gusmao L, Proenca C, Marques-Vidal P, Rocha T, Correia SC, Jorge S, Neves A, Soares AP, Nunes J, Calhaz-Jorge C, Amorim A, Plancha CE, Goncalves J
Characterizing partial AZFc deletions of the Y chromosome with amplicon-specific sequence markers.
BMC Genomics. 2007 Sep 28;8(1):342.
ABSTRACT: BACKGROUND: The AZFc region of the human Y chromosome is a highly recombinogenic locus containing multi-copy male fertility genes located in repeated DNA blocks (amplicons). These AZFc gene families exhibit slight sequence variations between copies which are considered to have functional relevance. Yet, partial AZFc deletions yield phenotypes ranging from normospermia to azoospermia, thwarting definite conclusions on their real impact on fertility. RESULTS: The amplicon content of partial AZFc deletion products was characterized with novel amplicon-specific sequence markers. Data indicate that partial AZFc deletions are a male infertility risk [odds ratio: 5.6 (95% CI: 1.6-30.1)] and although high diversity of partial deletion products and sequence conversion profiles were recorded, the AZFc marker profiles detected in fertile men were also observed in infertile men. Additionally, the assessment of rearrangement recurrence by Y-lineage analysis indicated that while partial AZFc deletions occurred in highly diverse samples, haplotype diversity was minimal in fertile men sharing identical marker profiles. CONCLUSIONS: Although partial AZFc deletion products are highly heterogeneous in terms of amplicon content, this plasticity is not sufficient to account for the observed phenotypical variance. The lack of causative association between the deletion of specific gene copies and infertility suggests that AZFc gene content might be part of a multifactorial network, with Y-lineage evolution emerging as a possible phenotype modulator. [Abstract/Link to Full Text]
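
For readers unfamiliar with the statistic reported above, an odds ratio and an approximate 95% confidence interval can be computed from a 2x2 table as in the following sketch. The abstract gives no raw counts and does not state how its interval was obtained, so the Woolf (log-based) interval and the function name `odds_ratio_ci` are assumptions chosen for illustration.

```python
# Hypothetical sketch: odds ratio with a Woolf (log) confidence interval.
from math import exp, log, sqrt

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Return (odds ratio, lower bound, upper bound).

    a: cases with the deletion      b: cases without the deletion
    c: controls with the deletion   d: controls without the deletion
    All four counts must be non-zero for this approximation.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(odds ratio) under the Woolf approximation.
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se)
    upper = exp(log(odds_ratio) + z * se)
    return odds_ratio, lower, upper
```

With small cell counts, as is common for rare deletions, an exact method would usually be preferred over this approximation, which may explain a wide and asymmetric interval such as the one reported.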

Timmermans MJ, de Boer ME, Nota B, de Boer TE, Marien J, Klein-Lankhorst RM, van Straalen NM, Roelofs D
Collembase: a repository for springtail genomics and soil quality assessment.
BMC Genomics. 2007 Sep 27;8(1):341.
ABSTRACT: BACKGROUND: Environmental quality assessment is traditionally based on responses of reproduction and survival of indicator organisms. For soil assessment the springtail Folsomia candida (Collembola) is an accepted standard test organism. We argue that environmental quality assessment using gene expression profiles of indicator organisms exposed to test substrates is more sensitive, more toxicant specific and significantly faster than current risk assessment methods. To apply this species as a genomic model for soil quality testing we conducted an EST sequencing project and developed an online database. Description: Collembase is a web-accessible database comprising springtail (F. candida) genomic data. Presently, the database contains information on 8686 ESTs that are assembled into 5952 unique gene objects. Of those gene objects ~40% showed homology to other protein sequences available in GenBank (blastx analysis; non-redundant (nr) database; expect-value < 10^-5). Software was applied to infer protein sequences. The putative peptides, which had an average length of 115 amino-acids (ranging between 23 and 440) were annotated with Gene Ontology (GO) terms. In total 1025 peptides (~17% of the gene objects) were assigned at least one GO term (expect-value < 10^-25). Within Collembase searches can be conducted based on BLAST and GO annotation, cluster name or using a BLAST server. The system furthermore enables easy sequence retrieval for functional genomic and Quantitative-PCR experiments. Sequences are submitted to GenBank (Accession numbers: EV473060 - EV481745). CONCLUSION: Collembase ( is a resource of sequence data on the springtail F. candida. The information within the database will be linked to a custom made microarray, based on the Agilent platform, which can be applied for soil quality testing.
In addition, Collembase supplies information that is valuable for related scientific disciplines such as molecular ecology, ecogenomics, molecular evolution and phylogenetics. [Abstract/Link to Full Text]

Barthelson RA, Lambert GM, Vanier C, Lynch RM, Galbraith DW
Comparison of the contributions of the nuclear and cytoplasmic compartments to global gene expression in human cells.
BMC Genomics. 2007;8:340.
BACKGROUND: In the most general sense, studies involving global analysis of gene expression aim to provide a comprehensive catalog of the components involved in the production of recognizable cellular phenotypes. These studies are often limited by the available technologies. One technology, based on microarrays, categorizes gene expression in terms of the abundance of RNA transcripts, and typically employs RNA prepared from whole cells, where cytoplasmic RNA predominates. RESULTS: Using microarrays comprising oligonucleotide probes that represent either protein-coding transcripts or microRNAs (miRNA), we have studied global transcript accumulation patterns for the HepG2 (human hepatoma) cell line. Through subdividing the total pool of RNA transcripts into samples from nuclei, the cytoplasm, and whole cells, we determined the degree of correlation of these patterns across these different subcellular locations. The transcript and miRNA abundance patterns for the three RNA fractions were largely similar, but with some exceptions: nuclear RNA samples were enriched with respect to the cytoplasm in transcripts encoding proteins associated with specific nuclear functions, such as the cell cycle, mitosis, and transcription. The cytoplasmic RNA fraction also was enriched, when compared to the nucleus, in transcripts for proteins related to specific nuclear functions, including the cell cycle, DNA replication, and DNA repair. Some transcripts related to the ubiquitin cycle, and transcripts for various membrane proteins were sorted into either the nuclear or cytoplasmic fractions. CONCLUSION: Enrichment or compartmentalization of cell cycle and ubiquitin cycle transcripts within the nucleus may be related to the regulation of their expression, by preventing their translation to proteins. In this way, these cellular functions may be tightly controlled by regulating the release of mRNA from the nucleus and thereby the expression of key rate limiting steps in these pathways. 
Many miRNA precursors were also enriched in the nuclear samples, with significantly fewer being enriched in the cytoplasm. Studies of mRNA localization will help to clarify the roles RNA processing and transport play in the regulation of cellular function. [Abstract/Link to Full Text]

Flynn SM, Carr SM
Interspecies hybridization on DNA resequencing microarrays: efficiency of sequence recovery and accuracy of SNP detection in human, ape, and codfish mitochondrial DNA genomes on a human-specific MitoChip.
BMC Genomics. 2007 Sep 25;8(1):339.
ABSTRACT: BACKGROUND: Iterative DNA "resequencing" on oligonucleotide microarrays offers a high-throughput method to measure intraspecific biodiversity, one that is especially suited to SNP-dense gene regions such as vertebrate mitochondrial (mtDNA) genomes. However, costs of single-species design and microarray fabrication are prohibitive. A cost-effective, multi-species strategy is to hybridize experimental DNAs from diverse species to a common microarray that is tiled with oligonucleotide sets from multiple, homologous reference genomes. Such a strategy requires that cross-hybridization between the experimental DNAs and reference oligos from the different species not interfere with the accurate recovery of species-specific data. To determine the pattern and limits of such interspecific hybridization, we compared the efficiency of sequence recovery and accuracy of SNP identification by a 15,452-base human-specific microarray challenged with human, chimpanzee, gorilla, and codfish mtDNA genomes. RESULTS: In the human genome, 99.67% of the sequence was recovered with 100.0% accuracy of SNP identification. This accuracy declines log-linearly with sequence divergence from the reference, from 0.067 to 0.247 errors per SNP in the chimpanzee and gorilla genomes, respectively. Efficiency of sequence recovery declines with the increase of the number of interspecific SNPs in the 25b interval tiled by the reference oligonucleotides. In the gorilla genome, which differs from the human reference by 10%, and in which 46% of these 25b regions contain 3 or more SNP differences from the reference, only 88% of the sequence is recoverable. In the codfish genome, which differs from the reference by >30%, less than 4% of the sequence is recoverable, in short islands ≥12b that are conserved between primates and fish.
CONCLUSION: Experimental DNAs bind inefficiently to homologous reference oligonucleotide sets on a re-sequencing microarray when their sequences differ by more than a few percent. The data suggest that interspecific cross-hybridization will not interfere with the accurate recovery of species-specific data from multispecies microarrays, provided that the reference DNA sequences differ by >20% (mean of 5b differences per 25b oligo). Recovery of DNA sequence data from multiple, distantly-related species on a single multiplex gene chip should be a practical, highly-parallel method for investigating genomic biodiversity. [Abstract/Link to Full Text]
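
The per-oligo divergence figures quoted in this abstract (SNP differences per 25b interval, and "5b differences per 25b oligo" as roughly 20%) can be illustrated with a minimal Python sketch. It is not the authors' analysis: it assumes two pre-aligned, gap-free sequences of equal length and uses consecutive, non-overlapping 25-base windows, and the function names `window_mismatches` and `fraction_at_or_above` are hypothetical.

```python
def window_mismatches(reference: str, sample: str, window: int = 25) -> list[int]:
    """Count mismatches in consecutive, non-overlapping windows.

    Assumes both sequences are already aligned, gap-free and of equal length
    (an assumption for illustration; a real chip tiles overlapping probes).
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    counts = []
    # Trailing bases that do not fill a whole window are ignored.
    for start in range(0, len(reference) - window + 1, window):
        ref_win = reference[start:start + window]
        smp_win = sample[start:start + window]
        counts.append(sum(r != s for r, s in zip(ref_win, smp_win)))
    return counts


def fraction_at_or_above(counts: list[int], threshold: int = 3) -> float:
    """Fraction of windows carrying at least `threshold` mismatches."""
    if not counts:
        return 0.0
    return sum(c >= threshold for c in counts) / len(counts)


# Toy sequences (invented for the example, not real mtDNA).
reference = "A" * 50
sample = "A" * 25 + "C" * 5 + "A" * 20
counts = window_mismatches(reference, sample)
print(counts)                        # mismatches per 25-base window
print(fraction_at_or_above(counts))  # share of windows with 3 or more mismatches
print(counts[1] / 25)                # 5 differences in a 25-base window
```

With real data the two inputs would come from an alignment of the experimental genome against the reference tiled on the chip; the threshold of 3 mirrors the "3 or more SNP differences" figure quoted for gorilla.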

Gloriam DE, Fredriksson R, Schiöth HB
The G protein-coupled receptor subset of the rat genome.
BMC Genomics. 2007;8:338.
BACKGROUND: The superfamily of G protein-coupled receptors (GPCRs) is one of the largest within most mammals. GPCRs are important targets for pharmaceuticals and the rat is one of the most widely used model organisms in biological research. Accurate comparisons of protein families in rat, mice and human are thus important for interpretation of many physiological and pharmacological studies. However, current automated protein predictions and annotations are limited and error prone. RESULTS: We searched the rat genome for GPCRs and obtained 1867 full-length genes and 739 pseudogenes. We identified 1277 new full-length rat GPCRs, whereof 1235 belong to the large group of olfactory receptors. Moreover, we updated the datasets of GPCRs from the human and mouse genomes with 1 and 43 new genes, respectively. The total numbers of full-length genes (and pseudogenes) identified were 799 (583) for human and 1783 (702) for mouse. The rat, human and mouse GPCRs were classified into 7 families named the Glutamate, Rhodopsin, Adhesion, Frizzled, Secretin, Taste2 and Vomeronasal1 families. We performed comprehensive phylogenetic analyses of these families and provide detailed information about orthologues and species-specific receptors. We found that 65 human Rhodopsin family GPCRs are orphans and 56 of these have an orthologue in rat. CONCLUSION: Interestingly, we found that the proportion of one-to-one GPCR orthologues was only 58% between rats and humans and only 70% between the rat and mouse, which is much lower than stated for the entire set of all genes. This is mainly related to the sensory GPCRs. The average protein sequence identities of the GPCR orthologue pairs are also lower than for the whole genomes. We found these to be 80% for the rat and human pairs and 90% for the rat and mouse pairs. However, the proportions of orthologous and species-specific genes vary significantly between the different GPCR families.
The largest diversification is seen for GPCRs that respond to exogenous stimuli indicating that the variation in their repertoires reflects to a large extent the adaptation of the species to their environment. This report provides the first overall roadmap of the GPCR repertoire in rat and detailed comparisons with the mouse and human repertoires. [Abstract/Link to Full Text]

Sanchez S, Hourdez S, Lallier FH
Identification of proteins involved in the functioning of Riftia pachyptila symbiosis by Subtractive Suppression Hybridization.
BMC Genomics. 2007 Sep 24;8(1):337.
ABSTRACT: BACKGROUND: Since its discovery around deep sea hydrothermal vents of the Galapagos Rift about 30 years ago, the chemoautotrophic symbiosis between the vestimentiferan tubeworm Riftia pachyptila and its symbiotic sulfide-oxidizing gamma-proteobacteria has been extensively studied. However, studies on the tubeworm host were essentially targeted, biochemical approaches. We decided to use a global molecular approach to identify new proteins involved in metabolite exchanges and assimilation by the host. We used a Subtractive Suppression Hybridization approach (SSH) in an unusual way, by comparing pairs of tissues from a single individual. We chose to identify the sequences preferentially expressed in the branchial plume tissue (the only organ in contact with the sea water) and in the trophosome (the organ housing the symbiotic bacteria) using the body wall as a reference tissue because it is supposedly not involved in metabolite exchanges in this species. RESULTS: We produced four cDNA libraries: i) body wall-subtracted branchial plume library (BR-BW), ii) and its reverse library, branchial plume-subtracted body wall library (BW-BR), iii) body wall-subtracted trophosome library (TR-BW), iv) and its reverse library, trophosome-subtracted body wall library (BW-TR). For each library, we sequenced about 200 clones resulting in 45 different sequences on average in each library (58 and 59 cDNAs for BR-BW and TR-BW libraries respectively). Overall, half of the contigs matched records found in the databases with good E-values. After quantitative PCR analysis, it resulted that 16S, Major Vault Protein, carbonic anhydrase (RpCAbr), cathepsin and chitinase precursor transcripts were highly represented in the branchial plume tissue compared to the trophosome and the body wall tissues, whereas carbonic anhydrase (RpCAtr), myohemerythrin, a putative T-Cell receptor and one non identified transcript were highly specific of the trophosome tissue. 
CONCLUSIONS: Quantitative PCR analyses were congruent with our libraries results thereby confirming the existence of tissue-specific transcripts identified by SSH. We focused our study on the transcripts we identified as the most interesting ones based on the BLAST results. Some of the keys to understanding metabolite exchanges may remain in the sequences we could not identify (hypothetical proteins and no similarity found). These sequences will have to be better studied by a longer -or complete- sequencing to check their identity, and then by verifying the expression level of the transcripts in different parts of the worm. [Abstract/Link to Full Text]

Bottillo I, De Luca A, Schirinzi A, Guida V, Torrente I, Calvieri S, Gervasini C, Larizza L, Pizzuti A, Dallapiccola B
Functional analysis of splicing mutations in exon 7 of NF1 gene.
BMC Med Genet. 2007;8:4.
BACKGROUND: Neurofibromatosis type 1 is one of the most common autosomal dominant disorders, affecting about 1:3,500 individuals. NF1 exon 7 displays weakly defined exon-intron boundaries, and is particularly prone to missplicing. METHODS: In this study we investigated the expression of exon 7 transcripts using bioinformatic identification of splicing regulatory sequences, and functional minigene analysis of four sequence changes [c.910C>T (R304X), c.945G>A/c.946C>A (Q315Q/L316M), c.1005T>C (N335N)] identified in exon 7 of three different NF1 patients. RESULTS: Our results detected the presence of three exonic splicing enhancers (ESEs) and one putative exonic splicing silencer (ESS) element. The wild type minigene assay resulted in three alternative isoforms, including a transcript lacking NF1 exon 7 (NF1DeltaE7). Both the wild type and the mutated constructs shared NF1DeltaE7 in addition to the complete messenger, but displayed a different ratio between the two transcripts. In the presence of R304X and Q315Q/L316M mutations, the relative proportion between the different isoforms is shifted toward the expression of NF1DeltaE7, while in the presence of N335N variant, the NF1DeltaE7 expression is abolished. CONCLUSION: In conclusion, it appears mandatory to investigate the role of each nucleotide change within the NF1 coding sequence, since a significant proportion of NF1 exon 7 mutations affects pre-mRNA splicing, by disrupting exonic splicing motifs and modifying the delicate balance between aberrantly and correctly spliced transcripts. [Abstract/Link to Full Text]

Mayans S, Lackovic K, Nyholm C, Lindgren P, Ruikka K, Eliasson M, Cilio CM, Holmberg D
CT60 genotype does not affect CTLA-4 isoform expression despite association to T1D and AITD in northern Sweden.
BMC Med Genet. 2007;8:3.
BACKGROUND: Polymorphisms in and around the CTLA-4 gene have previously been associated to T1D and AITD in several populations. One such single nucleotide polymorphism (SNP), CT60, has been reported to affect the expression level ratio of the soluble (sCTLA-4) to full length CTLA-4 (flCTLA-4) isoforms. The aims of our study were to replicate the association previously published by Ueda et al. of polymorphisms in the CTLA-4 region to T1D and AITD and to determine whether the CT60 polymorphism affects the expression level ratio of sCTLA-4/flCTLA-4 in our population. METHODS: Three SNPs were genotyped in 253 cases (104 AITD cases and 149 T1D cases) and 865 ethnically matched controls. Blood from 23 healthy individuals was used to quantify mRNA expression of CTLA-4 isoforms in CD4+ cells using real-time PCR. Serum from 102 cases and 59 healthy individuals was used to determine the level of sCTLA-4 protein. RESULTS: Here we show association of the MH30, CT60 and JO31 polymorphisms to T1D and AITD in northern Sweden. We also observed a higher frequency of the CT60 disease susceptible allele in our controls compared to the British, Italian and Dutch populations, which might contribute to the high frequency of T1D in Sweden. In contrast to previously published findings, however, we were unable to find differences in the sCTLA-4/flCTLA-4 expression ratio based on the CT60 genotype in 23 healthy volunteers, also from northern Sweden. Analysis of sCTLA-4 protein levels in serum showed no correlation between sCTLA-4 protein levels and disease status or CT60 genotype. CONCLUSION: Association was found between T1D/AITD and all three polymorphisms investigated. However, in contrast to previous investigations, sCTLA-4 RNA and protein expression levels did not differ based on CT60 genotype. 
Our results do not rule out the CT60 SNP as an important polymorphism in the development of T1D or AITD, but suggest that further investigations are necessary to elucidate the effect of the CTLA-4 region on the development of T1D and AITD. [Abstract/Link to Full Text]

Shimizu M, Cashman JR, Yamazaki H
Transient trimethylaminuria related to menstruation.
BMC Med Genet. 2007;8:2.
BACKGROUND: Trimethylaminuria, or fish odor syndrome, includes a transient or mild malodor caused by an excessive amount of malodorous trimethylamine as a result of body secretions. Herein, we describe data to support the proposal that menses can be an additional factor causing transient trimethylaminuria in self-reported subjects suffering from malodor and even in healthy women harboring functionally active flavin-containing monooxygenase 3 (FMO3). METHODS: FMO3 metabolic capacity (conversion of trimethylamine to trimethylamine N-oxide) was defined as the urinary ratio of trimethylamine N-oxide to total trimethylamine. RESULTS: Self-reported Case (A) that was homozygous for inactive Arg500stop FMO3, showed decreased metabolic capacity of FMO3 (i.e., approximately 10% the unaffected metabolic capacity) during 120 days of observation. For Case (B) that was homozygous for common [Glu158Lys; Glu308Gly] FMO3 polymorphisms, metabolic capacity of FMO3 was approximately 90%, except for a few days surrounding menstruation showing < 40% metabolic capacity. In comparison, three healthy control subjects that harbored heterozygous polymorphisms for [Glu158Lys; Glu308Gly] FMO3 or homozygous for wild FMO3 showed normal (> 90%) metabolic capacity, however, on days around menstruation the FMO3 metabolic capacity was decreased to ~60-70%. CONCLUSION: Together, these results indicate that abnormal FMO3 capacity is caused by menstruation particularly in the presence, in homozygous form, of mild genetic variants such as [Glu158Lys; Glu308Gly] that cause a reduced FMO3 function. [Abstract/Link to Full Text]
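
The metabolic-capacity definition given in METHODS above can be written out as a small Python sketch. It rests on one assumption the abstract does not spell out: that "total trimethylamine" means free trimethylamine plus trimethylamine N-oxide, measured in the same units. The function name `fmo3_metabolic_capacity` and the input values are hypothetical, not data from the study.

```python
def fmo3_metabolic_capacity(tma_n_oxide: float, free_tma: float) -> float:
    """Return FMO3 metabolic capacity as a percentage.

    Assumption (not stated in the abstract): total trimethylamine is the sum
    of free trimethylamine and trimethylamine N-oxide in the same units.
    """
    total = tma_n_oxide + free_tma
    if total <= 0:
        raise ValueError("total trimethylamine must be positive")
    return 100.0 * tma_n_oxide / total


# Invented urinary values, chosen only to land in the ranges the abstract reports.
print(fmo3_metabolic_capacity(tma_n_oxide=90.0, free_tma=10.0))  # 90.0
print(fmo3_metabolic_capacity(tma_n_oxide=35.0, free_tma=65.0))  # 35.0
```

Under this reading, a capacity above 90% corresponds to nearly all urinary trimethylamine being present as the N-oxide, and the values below 40% reported around menstruation for Case (B) correspond to most of it being excreted unoxidized.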

Sáez ME, Martínez-Larrad MT, Ramírez-Lorca R, González-Sánchez JL, Zabena C, Martinez-Calatrava MJ, González A, Morón FJ, Ruiz A, Serrano-Ríos M
Calpain-5 gene variants are associated with diastolic blood pressure and cholesterol levels.
BMC Med Genet. 2007;8:1.
BACKGROUND: Genes implicated in common complex disorders such as obesity, type 2 diabetes mellitus (T2DM) or cardiovascular diseases are not disease specific, since clinically related disorders also share genetic components. Cysteine protease Calpain 10 (CAPN10) has been associated with T2DM, hypertension, hypercholesterolemia, increased body mass index (BMI) and polycystic ovary syndrome (PCOS), a reproductive disorder of women in which insulin resistance seems to play a pathogenic role. The calpain 5 gene (CAPN5) encodes a protein homologue of CAPN10. CAPN5 has been previously associated with PCOS by our group. In this new study, we have analysed the association of four CAPN5 gene variants (rs948976A>G, rs4945140G>A, rs2233546C>T and rs2233549G>A) with several cardiovascular risk factors related to metabolic syndrome in the general population. METHODS: Anthropometric measurements, blood pressure, insulin, glucose and lipid profiles were determined in 606 individuals randomly chosen from a cross-sectional population-based epidemiological survey in the province of Segovia in Central Spain (Castille), recruited to investigate the prevalence of anthropometric and physiological parameters related to obesity and other components of the metabolic syndrome. Genotypes at the four polymorphic loci in CAPN5 gene were detected by polymerase chain reaction (PCR). RESULTS: Genotype association analysis was significant for BMI (p ≤ 0.041), diastolic blood pressure (p = 0.015) and HDL-cholesterol levels (p = 0.025). Different CAPN5 haplotypes were also associated with diastolic blood pressure (DBP) (0.0005 ≤ p ≤ 0.006) and total cholesterol levels (0.001 ≤ p ≤ 0.029). In addition, the AACA haplotype, over-represented in obese individuals, is also more frequent in individuals with metabolic syndrome defined by ATPIII criteria (p = 0.029). CONCLUSION: Like its homologue CAPN10, CAPN5 seems to influence traits related to increased risk for cardiovascular diseases.
Our results also may suggest CAPN5 as a candidate gene for metabolic syndrome. [Abstract/Link to Full Text]

Al-Sayed M, Imtiaz F, Alsmadi OA, Rashed MS, Meyer BF
Mutations underlying 3-hydroxy-3-methylglutaryl CoA lyase deficiency in the Saudi population.
BMC Med Genet. 2006;7:86.
BACKGROUND: 3-hydroxy-3-methylglutaric aciduria (3HMG, McKusick: 246450) is an autosomal recessive branched chain organic aciduria caused by deficiency of the enzyme 3-Hydroxy-3-Methylglutaryl CoA lyase (HL, HMGCL, EC HL is encoded by HMGCL gene and many mutations have been reported. 3HMG is commonly observed in Saudi Arabia. METHODS: We utilized Whole Genome Amplification (WGA), PCR and direct sequencing to identify mutations underlying 3HMG in the Saudi population. Two patients from two unrelated families and thirty-four 3HMG positive dried blood spots (DBS) were included. RESULTS: We detected the common missense mutation R41Q in 89% of the tested alleles (64 alleles). 2 alleles carried the frame shift mutation F305fs (-2) and the last two alleles had a novel splice site donor IVS6+1G>A mutation which was confirmed by its absence in more than 100 chromosomes from the normal population. All mutations were present in a homozygous state, reflecting extensive consanguinity. The high frequency of R41Q is consistent with a founder effect. Together the three mutations described account for >94% of the pathogenic mutations underlying 3HMG in Saudi Arabia. CONCLUSION: Our study provides the most extensive genotype analysis on 3HMG patients from Saudi Arabia. Our findings have direct implications on rapid molecular diagnosis, prenatal and pre-implantation diagnosis and population based prevention programs directed towards 3HMG. [Abstract/Link to Full Text]
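
The allele percentages in RESULTS above can be checked with a short Python sketch. The counts are those quoted in the abstract; the total of 72 alleles is an assumption (two typed alleles for each of the 2 patients and 34 dried blood spots), since the abstract does not state the denominator. The helper `allele_fractions` is a hypothetical name.

```python
# Counts as quoted in the abstract.
INDIVIDUALS = 2 + 34  # two patients plus thirty-four dried blood spots
ALLELE_COUNTS = {"R41Q": 64, "F305fs(-2)": 2, "IVS6+1G>A": 2}

# Assumption: every individual contributes two typed alleles
# (the abstract does not state the total number of alleles tested).
TOTAL_ALLELES = 2 * INDIVIDUALS


def allele_fractions(counts: dict[str, int], total: int) -> dict[str, float]:
    """Return each mutation's share of the tested alleles."""
    if total <= 0:
        raise ValueError("total allele count must be positive")
    return {name: n / total for name, n in counts.items()}


FRACTIONS = allele_fractions(ALLELE_COUNTS, TOTAL_ALLELES)
COMBINED = sum(ALLELE_COUNTS.values()) / TOTAL_ALLELES

print(TOTAL_ALLELES)                     # 72
print(round(100 * FRACTIONS["R41Q"]))    # 89
print(round(100 * COMBINED, 1))          # 94.4
```

Under that assumption the arithmetic agrees with the reported figures: 64 of 72 alleles is about 89% for R41Q, and the three mutations together cover 68 of 72 alleles, consistent with the ">94%" stated in the abstract.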

Pinelli M, Giacchetti M, Acquaviva F, Cocozza S, Donnarumma G, Lapice E, Riccardi G, Romano G, Vaccaro O, Monticelli A
Beta2-adrenergic receptor and UCP3 variants modulate the relationship between age and type 2 diabetes mellitus.
BMC Med Genet. 2006;7:85.
BACKGROUND: It is widely accepted that Type 2 Diabetes Mellitus (T2DM) and other complex diseases are the product of complex interplay between genetic susceptibility and environmental causes. To cope with such a complexity, all the statistical and conceptual strategies available should be used. The working hypothesis of this study was that two well-known T2DM risk factors could have diverse effect in individuals carrying different genotypes. In particular, our effort was to investigate if a well-defined group of genes, involved in peripheral energy expenditure, could modify the impact of two environmental factors like age and obesity on the risk to develop diabetes. To achieve this aim we exploited a multianalytical approach also using dimensionality reduction strategy and conservative significance correction strategies. METHODS: We collected clinical data and characterised five genetic variants and 2 environmental factors of 342 ambulatory T2DM patients and 305 unrelated non-diabetic controls. To take into account the role of one of the major co-morbidity conditions we stratified the whole sample according to the presence of obesity, over and above the 30 Kg/m2 BMI threshold. RESULTS: By monofactorial analyses the ADRB2-27 Glu27 homozygotes had a lower frequency of diabetes when compared with Gln27 carriers (Odds Ratio (OR) 0.56, 95% Confidence Interval (CI) 0.36 - 0.91). This difference was even more marked in the obese subsample. Multifactor Dimensionality Reduction method in the non-obese subsample showed an interaction among age, ADRB2-16 and UCP3 polymorphisms. In individuals that were UCP3 T-carriers and ADRB2-16 Arg-carriers the OR increased from 1 in the youngest to 10.84 (95% CI 4.54-25.85) in the oldest. On the contrary, in the ADRB2-16 GlyGly and UCP3 CC double homozygote subjects, the OR for the disease was 1.10 (95% CI 0.53-2.27) in the youngest and 1.61 (95% CI 0.55-4.71) in the oldest.
CONCLUSION: Although our results should be confirmed by further studies, our data suggest that, when properly evaluated, it is possible to identify genetic factors that could influence the effect of common risk factors. [Abstract/Link to Full Text]

Li CY, Yu SF
A novel mutation in the SH3BP2 gene causes cherubism: case report.
BMC Med Genet. 2006;7:84.
BACKGROUND: Cherubism is a rare hereditary multi-cystic disease of the jaws, characterized by its typical appearance in early childhood, and stabilization and remission after puberty. It is genetically transmitted in an autosomal dominant fashion and the gene coding for SH3-binding protein 2 (SH3BP2) may be involved. CASE PRESENTATION: We investigated a family consisting of 21 members with 3 female affected individuals with cherubism from Northern China. Of these 21 family members, 17 were recruited for the genetic analysis. We conducted the direct sequence analysis of the SH3BP2 gene among these 17 family members. A disease-causing mutation was identified in exon 9 of the gene. It was an A1517G base change, which leads to a D419G amino acid substitution. CONCLUSION: To our knowledge, the A1517G mutation has not been reported previously in cherubism. This finding is novel. [Abstract/Link to Full Text]

Hall DH, Rahman T, Avery PJ, Keavney B
INSIG-2 promoter polymorphism and obesity related phenotypes: association study in 1428 members of 248 families.
BMC Med Genet. 2006;7:83.
BACKGROUND: Obesity is a major public health problem. Body mass index (BMI) is a highly heritable phenotype but robust associations of genetic polymorphisms to BMI or other obesity-related phenotypes have been difficult to establish. Recently a large genetic association study showed evidence for association of the single nucleotide polymorphism (SNP) rs7566605, which lies 10 Kb 5' to the first exon of the insulin-induced gene 2 (INSIG-2), with obesity in several cohorts. We tested this polymorphism for association with body mass related phenotypes in a large family study whose mean BMI was consistent with moderate overweight. METHODS: We studied 1428 members of 248 British Caucasian families who had been ascertained through a proband with hypertension. We measured BMI, waist and hip circumference, and plasma levels of leptin. We genotyped the rs7566605 SNP using a restriction fragment length polymorphism assay, and carried out a family-based association test for quantitative traits related to obesity using the statistical programs MERLIN and QTDT. RESULTS: We observed no significant association between genotype at rs7566605 and covariate-adjusted (for age, sex, alcohol consumption, smoking and exercise habit) log-transformed BMI, waist measurement, hip measurement, waist-to-hip ratio, or plasma levels of leptin. CONCLUSION: There was no association between genotype at rs7566605 and obesity-related phenotypes in this British Caucasian population. These families were in general moderately overweight, few members being severely obese. Our result indicates that this polymorphism has little if any effect on BMI within the normal to moderately overweight range. The effects of this polymorphism on body mass may be restricted to those already predisposed to at least moderate obesity as a result of environmental factors and other predisposing genotypes. [Abstract/Link to Full Text]

de la Houssaye G, Bieche I, Roche O, Vieira V, Laurendeau I, Arbogast L, Zeghidi H, Rapp P, Halimi P, Vidaud M, Dufier JL, Menasche M, Abitbol M
Identification of the first intragenic deletion of the PITX2 gene causing an Axenfeld-Rieger Syndrome: case report.
BMC Med Genet. 2006;7:82.
BACKGROUND: Axenfeld-Rieger syndrome (ARS) is characterized by bilateral congenital abnormalities of the anterior segment of the eye associated with abnormalities of the teeth, midface, and umbilicus. Most cases of ARS are caused by mutations in the genes encoding PITX2 or FOXC1. Here we describe a family affected by a severe form of ARS. CASE PRESENTATION: Two members of this family (father and daughter) presented with typical ARS and developed severe glaucoma. The ocular phenotype was much more severe in the daughter than in the father. Magnetic resonance imaging (MRI) detected an aggressive form of meningioma in the father. There was no mutation in the PITX2 gene, determined by exon screening. We identified an intragenic deletion by quantitative genomic PCR analysis and characterized this deletion in detail. CONCLUSION: Our findings implicate the first intragenic deletion of the PITX2 gene in the pathogenesis of a severe form of ARS in an affected family. This study stresses the importance of a systematic search for intragenic deletions in families affected by ARS and in sporadic cases for which no mutations in the exons or introns of PITX2 have been found. The molecular genetics of some ARS pedigrees should be re-examined with enzymes that can amplify medium and large genomic fragments. [Abstract/Link to Full Text]

King C, Barton DE
Best practice guidelines for the molecular genetic diagnosis of Type 1 (HFE-related) hereditary haemochromatosis.
BMC Med Genet. 2006;7:81.
BACKGROUND: Hereditary haemochromatosis (HH) is a recessively-inherited disorder of iron over-absorption prevalent in Caucasian populations. Affected individuals for Type 1 HH are usually either homozygous for a cysteine to tyrosine amino acid substitution at position 282 (C282Y) of the HFE gene, or compound heterozygotes for C282Y and for a histidine to aspartic acid change at position 63 (H63D). Molecular genetic testing for these two mutations has become widespread in recent years. With diverse testing methods and reporting practices in use, there was a clear need for agreed guidelines for haemochromatosis genetic testing. The UK Clinical Molecular Genetics Society has elaborated a consensus process for the development of disease-specific best practice guidelines for genetic testing. METHODS: A survey of current practice in the molecular diagnosis of haemochromatosis was conducted. Based on the results of this survey, draft guidelines were prepared using the template developed by UK Clinical Molecular Genetics Society. A workshop was held to develop the draft into a consensus document. The consensus document was then posted on the Clinical Molecular Genetics Society website for broader consultation and amendment. RESULTS: Consensus or near-consensus was achieved on all points in the draft guidelines. The consensus and consultation processes worked well, and outstanding issues were documented in an appendix to the guidelines. CONCLUSION: An agreed set of best practice guidelines were developed for diagnostic, predictive and carrier testing for hereditary haemochromatosis and for reporting the results of such testing. [Abstract/Link to Full Text]

Lind LK, Stecksén-Blicks C, Lejon K, Schmitt-Egenolf M
EDAR mutation in autosomal dominant hypohidrotic ectodermal dysplasia in two Swedish families.
BMC Med Genet. 2006;7:80.
BACKGROUND: Hypohidrotic ectodermal dysplasia (HED) is a genetic disorder characterized by defective development of teeth, hair, nails and eccrine sweat glands. Both autosomal dominant and autosomal recessive forms of HED have previously been linked to mutations in the ectodysplasin 1 anhidrotic receptor (EDAR) protein that plays an important role during embryogenesis. METHODS: The coding DNA sequence of the EDAR gene was analyzed in two large Swedish three-generational families with autosomal dominant HED. RESULTS: A non-sense C to T mutation in exon 12 was identified in both families. This disease-specific mutation changes an arginine amino acid in position 358 of the EDAR protein into a stop codon (p.Arg358X), thereby truncating the protein. In addition to the causative mutation two polymorphisms, not associated with the HED disorder, were also found in the EDAR gene. CONCLUSION: The finding of the p.Arg358X mutation in the Swedish families is the first corroboration of a previously described observation in an American family. Thus, our study strengthens the role of this particular mutation in the aetiology of autosomal dominant HED and confirms the importance of EDAR for the development of HED. [Abstract/Link to Full Text]

Mai M, Akkad AD, Wieczorek S, Saft C, Andrich J, Kraus PH, Epplen JT, Arning L
No association between polymorphisms in the BDNF gene and age at onset in Huntington disease.
BMC Med Genet. 2006;7:79.
BACKGROUND: Recent evidence suggests that brain-derived neurotrophic factor (BDNF) is an attractive candidate for modifying age at onset (AO) in Huntington disease (HD). In particular, the functional Val66Met polymorphism appeared to exert a significant effect. Here we evaluate BDNF variability with respect to AO of HD using markers that represent the entire locus. METHODS: Five selected tagging polymorphisms were genotyped across a 65 kb region comprising the BDNF gene in a well-established cohort of 250 unrelated German HD patients. RESULTS: Addition of BDNF genotype variations or one of the marker haplotypes to the effect of CAG repeat lengths did not affect the variance of the AO. CONCLUSION: We were unable to verify a recently reported association between the functional Val66Met polymorphism in the BDNF gene and AO in HD. From our findings, we conclude that sequence variations neither in nor near the gene contribute significantly to the variance of AO. [Abstract/Link to Full Text]

Dutra AV, Lin HF, Juo SH, Mohrenweiser H, Sen S, Grewal RP
Analysis of the XRCC1 gene as a modifier of the cerebral response in ischemic stroke.
BMC Med Genet. 2006;7:78.
BACKGROUND: Although there have been studies of the genetic risk factors in the development of stroke, there have been few investigations of the role of genes in the cerebral response to ischemia. The brain responds to ischemia in a series of reactions that ultimately influence the volume of a stroke that, in general, correlates with disability. We hypothesize that polymorphisms in genes encoding proteins involved in these reactions could act as modifiers of this response and impact stroke volume. One of the pathways participating in the cerebral ischemic response involves reactive oxygen species which can cause oxidative damage to nucleic acids. DNA repair mechanisms protect against such damage, which implies a role for DNA repair genes in the response of the brain to ischemia and makes them candidate genes for further investigation. METHODS: We studied two common polymorphisms in the DNA repair gene, XRCC1, C26304T and G28152A, in 134 well-characterized patients with non-lacunar ischemic strokes. We also performed a case-control association study with 113 control patients to assess whether these variants represent risk factors in the development of ischemic stroke. RESULTS: Independent of etiology, the "T" allele of the C26304T polymorphism is significantly associated with larger stroke volumes (T-test analysis, p < 0.044; multivariate regression analysis, beta = 0.23, p < 0.008). In the case-control association study, we found that neither of these polymorphisms represented a risk factor for the development of stroke. CONCLUSION: Our study suggests a major gene effect of the "T" allele of the C26304T polymorphism modulating the cerebral response to ischemia in non-lacunar ischemic stroke. [Abstract/Link to Full Text]

Bentivegna A, Milani D, Gervasini C, Castronovo P, Mottadelli F, Manzini S, Colapietro P, Giordano L, Atzeri F, Divizia MT, Uzielli ML, Neri G, Bedeschi MF, Faravelli F, Selicorni A, Larizza L
Rubinstein-Taybi Syndrome: spectrum of CREBBP mutations in Italian patients.
BMC Med Genet. 2006;7:77.
BACKGROUND: Rubinstein-Taybi Syndrome (RSTS, MIM 180849) is a rare congenital disorder characterized by mental and growth retardation, broad and duplicated distal phalanges of thumbs and halluces, facial dysmorphisms and increased risk of tumors. RSTS is caused by chromosomal rearrangements and point mutations in one copy of the CREB-binding protein gene (CREBBP or CBP) in 16p13.3. To date mutations in CREBBP have been reported in 56.6% of RSTS patients and an average figure of 10% has been ascribed to deletions. METHODS: Our study is based on the mutation analysis of CREBBP in 31 Italian RSTS patients using segregation analysis of intragenic microsatellites, BAC FISH and direct sequencing of PCR and RT-PCR fragments. RESULTS: We identified a total of five deletions, two of the entire gene and three, all in a mosaic condition, involving either the 5' or the 3' region. By direct sequencing a total of 14 de novo mutations were identified: 10 truncating (5 frameshift and 5 nonsense), one splice site, and three novel missense mutations. Two of the latter affect the HAT domain, while one maps within the conserved nuclear receptor binding domain (aa 1-170) and will probably destroy a Nuclear Localization Signal. Identification of p.Asn1978Ser in the healthy mother of a patient also carrying a de novo frameshift mutation calls into question the pathogenetic significance of this missense change, which has been reported as a recurrent mutation. Thirteen additional polymorphisms, three as yet unreported, were also detected. CONCLUSION: A high detection rate (61.3%) of mutations is confirmed by this Italian study, which also attests one of the highest microdeletion rates (16%) documented so far. [Abstract/Link to Full Text]
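
The two rates quoted in the conclusion above appear to follow directly from the counts in the results: 5 deletions plus 14 sequencing-detected mutations among 31 patients. A minimal Python sketch of that arithmetic follows; the variable and function names are illustrative, and it assumes each counted lesion belongs to a different patient, which the abstract implies but does not state.

```python
# Counts taken from the abstract; all names here are illustrative.
patients = 31
deletions = 5          # two whole-gene and three mosaic partial deletions
point_mutations = 14   # 10 truncating, 1 splice site, 3 missense

def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)

# Assumption: one causative lesion per patient, so counts can be summed.
detection_rate = percent(deletions + point_mutations, patients)
microdeletion_rate = percent(deletions, patients)
print(detection_rate, microdeletion_rate)
```

Under that assumption the sketch reproduces the 61.3% detection rate and a microdeletion rate that rounds to the quoted 16%.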

Chandak GR, Ward KJ, Yajnik CS, Pandit AN, Bavdekar A, Joglekar CV, Fall CH, Mohankrishna P, Wilkin TJ, Metcalf BS, Weedon MN, Frayling TM, Hattersley AT
Triglyceride associated polymorphisms of the APOA5 gene have very different allele frequencies in Pune, India compared to Europeans.
BMC Med Genet. 2006;7:76.
BACKGROUND: The APOA5 gene variants, -1131T>C and S19W, are associated with altered triglyceride concentrations in studies of subjects of Caucasian and East Asian descent. There are few studies of these variants in South Asians. We investigated whether the two APOA5 variants also show similar association with various lipid parameters in an Indian population as in UK white subjects. METHODS: We genotyped 557 Indian adults from Pune, India, and 237 UK white adults for the -1131T>C and S19W variants in the APOA5 gene, compared their allelic and genotype frequencies and determined their association with fasting serum triglycerides, total cholesterol, HDL and LDL cholesterol levels using univariate general linear analysis. The APOC3 SstI polymorphism was also analyzed in 175 Pune Indian subjects for analysis of linkage disequilibrium with the APOA5 variants. RESULTS: The APOA5 -1131C allele was more prevalent in Indians from Pune (Pune Indians) compared to UK white subjects (allele frequency 20% vs. 4%, p = 0.00001), whereas the 19W allele was less prevalent (3% vs. 6%, p = 0.0015). Patterns of linkage disequilibrium between the two variants were similar between the two populations and confirmed that they occur on two different haplotypes. In Pune Indians, the presence of the -1131C allele and the 19W allele was associated with a 19% and 15% increase respectively in triglyceride concentrations, although only -1131C was significant (p = 0.0003). This effect size was similar to that seen in the UK white subjects. Analysis of the APOC3 SstI polymorphism in 175 Pune Indian subjects showed that this variant is not in appreciable linkage disequilibrium with the APOA5 -1131T>C variant (r² = 0.07). CONCLUSION: This is the first study to look at the role of APOA5 in Asian Indian subjects that reside in India. The -1131C allele is more prevalent and the 19W allele is less prevalent in Pune Indians compared to UK Caucasians.
We confirm that the APOA5 variants are associated with triglyceride levels independent of ethnicity and that this association is similar in magnitude in Asian Indians and Caucasians. The -1131C allele is present in 36% of the Pune Indian population making it a powerful marker for looking at the role of elevated triglycerides in important conditions such as pancreatitis, diabetes and coronary heart disease. [Abstract/Link to Full Text]
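
The 36% carrier figure above is consistent with the reported 20% allele frequency if genotypes follow Hardy-Weinberg proportions. The abstract does not state that assumption, so the following is only a sketch of how the two numbers may be related; the function name is illustrative.

```python
def carrier_fraction(allele_freq):
    """Fraction of people carrying at least one copy of an allele.

    Assumes Hardy-Weinberg proportions (an assumption made here,
    not a statement from the abstract).
    """
    non_carriers = (1.0 - allele_freq) ** 2  # homozygous for the other allele
    return 1.0 - non_carriers

pune = carrier_fraction(0.20)  # -1131C allele frequency in Pune Indians
uk = carrier_fraction(0.04)    # -1131C allele frequency in UK white subjects
print(f"Pune: {pune:.0%}, UK: {uk:.0%}")
```

With a 20% allele frequency, 64% of people carry no copy, leaving 36% as carriers.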

Saxena S, Chakraborty A, Kaushal M, Kotwal S, Bhatanager D, Mohil RS, Chintamani C, Aggarwal AK, Sharma VK, Sharma PC, Lenoir G, Goldgar DE, Szabo CI
Contribution of germline BRCA1 and BRCA2 sequence alterations to breast cancer in Northern India.
BMC Med Genet. 2006;7:75.
BACKGROUND: A large number of distinct mutations in the BRCA1 and BRCA2 genes have been reported worldwide, but little is known regarding the role of these inherited susceptibility genes in breast cancer risk among Indian women. We investigated the distribution and the nature of BRCA1 and BRCA2 germline mutations and polymorphisms in a cohort of 204 Indian breast cancer patients and 140 age-matched controls. METHOD: Cases were selected with regard to early onset disease (≤40 years) and family history of breast and ovarian cancer. Two hundred four breast cancer cases along with 140 age-matched controls were analyzed for mutations. All coding regions and exon-intron boundaries of the BRCA1 and BRCA2 genes were screened by heteroduplex analysis followed by direct sequencing of detected variants. RESULTS: In total, 18 genetic alterations were identified. Three deleterious frame-shift mutations (185delAG in exon 2; 4184del4 and 3596del4 in exon 11) were identified in BRCA1, along with one missense mutation (K1667R), one 5'UTR alteration (22C>G), three intronic variants (IVS10-12delG, IVS13+2T>C, IVS7+38T>C) and one silent substitution (5154C>T). Similarly three pathogenic protein-truncating mutations (6376insAA in exon 11, 8576insC in exon 19, and 9999delA in exon 27) along with one missense mutation (A2951T), four intronic alterations (IVS2+90T>A, IVS7+75A>T, IVS8+56C>T, IVS25+58insG) and one silent substitution (1593A>G) were identified in BRCA2. Four previously reported polymorphisms (K1183R, S1613G, and M1652I in BRCA1, and 7470A>G in BRCA2) were detected in both controls and breast cancer patients. Rare BRCA1/2 sequence alterations were observed in 15 out of 105 (14.2%) early-onset cases without family history and 11.7% (4/34) breast cancer cases with family history. Of these, six were pathogenic protein truncating mutations. In addition, several variants of uncertain clinical significance were identified.
Among these are two missense variants, one alteration of a consensus splice donor sequence, and a variant that potentially disrupts translational initiation. CONCLUSION: BRCA1 and BRCA2 mutations appear to account for a lower proportion of breast cancer patients at increased risk of harboring such mutations in Northern India (6/204, 2.9%) than has been reported in other populations. However, given the limited extent of reported family history among these patients, the observed mutation frequency is not dissimilar from that reported in other cohorts of early onset breast cancer patients. Several of the identified mutations are unique and novel to Indian patients. [Abstract/Link to Full Text]

Smith BH, Campbell H, Blackwood D, Connell J, Connor M, Deary IJ, Dominiczak AF, Fitzpatrick B, Ford I, Jackson C, Haddow G, Kerr S, Lindsay R, McGilchrist M, Morton R, Murray G, Palmer CN, Pell JP, Ralston SH, St Clair D, Sullivan F, Watt G, Wolf R, Wright A, Porteous D, Morris AD
Generation Scotland: the Scottish Family Health Study; a new resource for researching genes and heritability.
BMC Med Genet. 2006;7:74.
BACKGROUND: Generation Scotland: the Scottish Family Health Study aims to identify genetic variants accounting for variation in levels of quantitative traits underlying the major common complex diseases (such as cardiovascular disease, cognitive decline, mental illness) in Scotland. METHODS/DESIGN: Generation Scotland will recruit a family-based cohort of up to 50,000 individuals (comprising siblings and parent-offspring groups) across Scotland. It will be a six-year programme, beginning in Glasgow and Tayside in the first two years (Phase 1) before extending to other parts of Scotland in the remaining four years (Phase 2). In Phase 1, individuals aged between 35 and 55 years, living in the East and West of Scotland will be invited to participate, along with at least one (and preferably more) siblings and any other first degree relatives aged 18 or over. The total initial sample size will be 15,000 and it is planned that this will increase to 50,000 in Phase 2. All participants will be asked to contribute blood samples from which DNA will be extracted and stored for future investigation. The information from the DNA, along with answers to a life-style and medical history questionnaire, clinical and biochemical measurements taken at the time of donation, and subsequent health developments over the life course (traced through electronic health records) will be stored and used for research purposes. In addition, a detailed public consultation process will begin that will allow respondents' views to shape and develop the study. This is an important aspect to the research, and forms the continuation of a long-term parallel engagement process. DISCUSSION: As well as gene identification, the family-based study design will allow measurement of the heritability and familial aggregation of relevant quantitative traits, and the study of how genetic effects may vary by parent-of-origin. 
Long-term potential outcomes of this research include the targeting of disease prevention and treatment, and the development of screening tools based on the new genetic information. This study approach is complementary to other population-based genetic epidemiology studies, such as UK Biobank, which are established primarily to characterise genes and genetic risk in the population. [Abstract/Link to Full Text]

Thakur N, Reddy DN, Rao GV, Mohankrishna P, Singh L, Chandak GR
A novel mutation in STK11 gene is associated with Peutz-Jeghers Syndrome in Indian patients.
BMC Med Genet. 2006;7:73.
BACKGROUND: Peutz-Jeghers syndrome (PJS) is a rare multi-organ cancer syndrome and understanding its genetic basis may help comprehend the molecular mechanism of familial cancer. A number of germ line mutations in the STK11 gene, encoding a serine threonine kinase, have been reported in these patients. However, STK11 mutations do not explain all PJS cases. An earlier study reported absence of STK11 mutations in two Indian families and suggested another potential locus on 19q13.4 in one of them. METHODS: We sequenced the promoter and the coding region including the splice-site junctions of the STK11 gene in 16 affected members from ten well-characterized Indian PJS families with a positive family history. RESULTS: We did not observe any of the reported mutations in the STK11 gene in the index patients from these families. We identified a novel pathogenic mutation (c.790_793delTTTG) in the STK11 gene in one index patient (10%) and three members of his family. The mutation resulted in a frame-shift leading to premature termination of the STK11 protein at the 286th codon, disruption of the kinase domain and complete loss of the C-terminal regulatory domain. Based on these results, we could offer predictive genetic testing, prenatal diagnosis and genetic counselling to other members of the family. CONCLUSION: Ours is the first study reporting the presence of an STK11 mutation in Indian PJS patients. It also suggests that previously reported STK11 mutations are not responsible for the disease in these families and that novel mutations likewise do not account for many Indian PJS cases. Large-scale genomic deletions in the STK11 gene or another locus may be associated with the PJS phenotype in India and are worth future investigation. [Abstract/Link to Full Text]

Hung CC, Su YN, Chien SC, Liou HH, Chen CC, Chen PC, Hsieh CJ, Chen CP, Lee WT, Lin WL, Lee CN
Molecular and clinical analyses of 84 patients with tuberous sclerosis complex.
BMC Med Genet. 2006;7:72.
BACKGROUND: Tuberous sclerosis complex (TSC) is an autosomal dominant disease characterized by the development of multiple hamartomas in many internal organs. The development of TSC has been attributed to mutations in either of two genes, TSC1 and TSC2. More than two-thirds of TSC patients are sporadic cases, and a wide variety of mutations in the coding region of the TSC1 and TSC2 genes have been reported. METHODS: Mutational analysis of TSC1 and TSC2 genes was performed in 84 Taiwanese TSC families using denaturing high-performance liquid chromatography (DHPLC) and direct sequencing. RESULTS: Mutations were identified in a total of 64 (76%) cases, including 9 TSC1 mutations (7 sporadic and 2 familial cases) and 55 TSC2 mutations (47 sporadic and 8 familial cases). Thirty-one of the 64 mutations found have not been described previously. The phenotype association is consistent with findings from other large studies, showing that disease resulting from mutations to TSC1 is less severe than disease due to TSC2 mutation. CONCLUSION: This study provides a representative picture of the distribution of mutations of the TSC1 and TSC2 genes in clinically ascertained TSC cases in the Taiwanese population. Although nearly half of the mutations identified were novel, the kinds and distribution of mutations in this population did not differ from those seen in larger European and American studies. [Abstract/Link to Full Text]

Li JL, Hayden MR, Warby SC, Durr A, Morrison PJ, Nance M, Ross CA, Margolis RL, Rosenblatt A, Squitieri F, Frati L, Gómez-Tortosa E, García CA, Suchowersky O, Klimek ML, Trent RJ, McCusker E, Novelletto A, Frontali M, Paulsen JS, Jones R, Ashizawa T, Lazzarini A, Wheeler VC, Prakash R, Xu G, Djoussé L, Mysore JS, Gillis T, Hakky M, Cupples LA, Saint-Hilaire MH, Cha JH, Hersch SM, Penney JB, Harrison MB, Perlman SL, Zanko A, Abramson RK, Lechich AJ, Duckett A, Marder K, Conneally PM, Gusella JF, MacDonald ME, Myers RH
Genome-wide significance for a modifier of age at neurological onset in Huntington's disease at 6q23-24: the HD MAPS study.
BMC Med Genet. 2006;7:71.
BACKGROUND: Age at onset of Huntington's disease (HD) is correlated with the size of the abnormal CAG repeat expansion in the HD gene; however, several studies have indicated that other genetic factors also contribute to the variability in HD age at onset. To identify modifier genes, we recently reported a whole-genome scan in a sample of 629 affected sibling pairs from 295 pedigrees, in which six genomic regions provided suggestive evidence for quantitative trait loci (QTL), modifying age at onset in HD. METHODS: In order to test the replication of this finding, eighteen microsatellite markers, three from each of the six genomic regions, were genotyped in 102 newly recruited sibling pairs from 69 pedigrees, and data were analyzed, using a multipoint linkage variance component method, in the follow-up sample and the combined sample of 352 pedigrees with 753 sibling pairs. RESULTS: Suggestive evidence for linkage at 6q23-24 in the follow-up sample (LOD = 1.87, p = 0.002) increased to genome-wide significance for linkage in the combined sample (LOD = 4.05, p = 0.00001), while suggestive evidence for linkage was observed at 18q22, in both the follow-up sample (LOD = 0.79, p = 0.03) and the combined sample (LOD = 1.78, p = 0.002). Epistatic analysis indicated that there is no interaction between 6q23-24 and other loci. CONCLUSION: In this replication study, linkage for modifier of age at onset in HD was confirmed at 6q23-24. Evidence for linkage was also found at 18q22. The demonstration of statistically significant linkage to a potential modifier locus opens the path to location cloning of a gene capable of altering HD pathogenesis, which could provide a validated target for therapeutic development in the human patient. [Abstract/Link to Full Text]
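
The LOD scores and p-values reported above can be related through a common large-sample approximation for variance component linkage tests, in which 2 ln(10) × LOD is treated as a 50:50 mixture of zero and a chi-square with 1 degree of freedom. The abstract does not say how its p-values were derived, so the sketch below is an illustration under that assumption; `lod_to_p` is a hypothetical helper name.

```python
import math

def lod_to_p(lod):
    """Approximate one-sided p-value for a non-negative LOD score.

    Assumes 2*ln(10)*LOD follows a 50:50 mixture of 0 and a chi-square
    with 1 degree of freedom (an assumption, not taken from the abstract).
    """
    if lod < 0:
        raise ValueError("LOD score must be non-negative")
    chi2 = 2.0 * math.log(10.0) * lod
    # Upper tail of chi-square(1 df) equals erfc(sqrt(x / 2));
    # halving it accounts for the mixture.
    return 0.5 * math.erfc(math.sqrt(chi2 / 2.0))

# LOD scores quoted in the abstract.
for lod in (1.87, 4.05, 0.79, 1.78):
    print(lod, f"{lod_to_p(lod):.1g}")
```

Under this approximation the four LOD scores map to p-values of the same order as those quoted in the results.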

Ellinor PT, Petrov-Kondratov VI, Zakharova E, Nam EG, MacRae CA
Potassium channel gene mutations rarely cause atrial fibrillation.
BMC Med Genet. 2006;7:70.
BACKGROUND: Mutations in several potassium channel subunits have been associated with rare forms of atrial fibrillation. In order to explore the role of potassium channels in inherited typical forms of the arrhythmia, we have screened a cohort of patients from a referral clinic for mutations in the channel subunit genes implicated in the arrhythmia. We sought to determine if mutations in KCNJ2 and KCNE1-5 are a common cause of atrial fibrillation. METHODS: Serial patients with lone atrial fibrillation or atrial fibrillation with hypertension were enrolled between June 1, 2001 and January 6, 2005. Each patient underwent a standardized interview and physical examination. An electrocardiogram, echocardiogram and blood sample for genetic analysis were also obtained. Patients with a family history of AF were screened for mutations in KCNJ2 and KCNE1-5 using automated sequencing. RESULTS: 96 patients with familial atrial fibrillation were enrolled. Eighty-three patients had lone atrial fibrillation and 13 had atrial fibrillation and hypertension. Patients had a mean age of 56 years at enrollment and 46 years at onset of atrial fibrillation. Eighty-one percent of patients had paroxysmal atrial fibrillation at enrollment. Unlike patients with an activating mutation in KCNQ1, the patients had a normal QTc interval with a mean of 412 +/- 42 ms. Echocardiography revealed a normal mean ejection fraction of 62.0 +/- 7.2 % and mean left atrial dimension of 39.9 +/- 7.0 mm. A number of common polymorphisms in KCNJ2 and KCNE1-5 were identified, but no mutations were detected. CONCLUSION: Mutations in KCNJ2 and KCNE1-5 rarely cause typical atrial fibrillation in a referral clinic population. [Abstract/Link to Full Text]

Brown JT, Lahey C, Laosinchai-Wolf W, Hadd AG
Polymorphisms in the glucocerebrosidase gene and pseudogene urge caution in clinical analysis of Gaucher disease allele c.1448T>C (L444P).
BMC Med Genet. 2006;7:69.
BACKGROUND: Gaucher disease is a potentially severe lysosomal storage disorder caused by mutations in the human glucocerebrosidase gene (GBA). We have developed a multiplexed genetic assay for eight diseases prevalent in the Ashkenazi population: Tay-Sachs, Gaucher type I, Niemann-Pick types A and B, mucolipidosis type IV, familial dysautonomia, Canavan, Bloom syndrome, and Fanconi anemia type C. This assay includes an allelic determination for GBA allele c.1448T>C (L444P). The goal of this study was to clinically evaluate this assay. METHODS: Biotinylated, multiplex PCR products were directly hybridized to capture probes immobilized on fluorescently addressed microspheres. After incubation with streptavidin-conjugated fluorophore, the reactions were analyzed by Luminex IS100. Clinical evaluations were conducted using de-identified patient DNA samples. RESULTS: We evaluated a multiplexed suspension array assay that includes wild-type and mutant genetic determinations for Gaucher disease allele c.1448T>C. Two percent of samples reported to be wild-type by conventional methods were observed to be c.1448T>C heterozygous using our assay. Sequence analysis suggested that this phenomenon was due to co-amplification of the functional gene and a paralogous pseudogene (PsiGBA) due to a polymorphism in the primer-binding site of the latter. Primers for the amplification of this allele were then repositioned to span an upstream deletion in the pseudogene, yielding a much longer amplicon. Although it is widely reported that long amplicons negatively impact amplification or detection efficiency in recently adopted multiplex techniques, this assay design functioned properly and resolved the occurrence of false heterozygosity. 
CONCLUSION: Although previously available sequence information suggested GBA gene/pseudogene discrimination capabilities with a short amplified product, we identified common single-nucleotide polymorphisms in the pseudogene that required amplification of a larger region for effective discrimination. [Abstract/Link to Full Text]

Pettigrew MM, Gent JF, Zhu Y, Triche EW, Belanger KD, Holford TR, Bracken MB, Leaderer BP
Association of surfactant protein A polymorphisms with otitis media in infants at risk for asthma.
BMC Med Genet. 2006;7:68.
BACKGROUND: Otitis media is one of the most common infections of early childhood. Surfactant protein A functions as part of the innate immune response, which plays an important role in preventing infections early in life. This prospective study utilized a candidate gene approach to evaluate the association between polymorphisms in loci encoding SP-A and risk of otitis media during the first year of life among a cohort of infants at risk for developing asthma. METHODS: Between September 1996 and December 1998, women were invited to participate if they had at least one other child with physician-diagnosed asthma. Each mother was given a standardized questionnaire within 4 months of her infant's birth. Infant respiratory symptoms were collected during quarterly telephone interviews at 6, 9 and 12 months of age. Genotyping was done on 355 infants for whom whole blood and complete otitis media data were available. RESULTS: Polymorphisms at codons 19, 62, and 133 in SP-A1, and 223 in SP-A2 were associated with race/ethnicity. In logistic regression models incorporating estimates of uncertainty in haplotype assignment, the 6A4/1A5 haplotype was protective for otitis media among white infants in our study population (OR 0.23; 95% CI 0.07, 0.73). CONCLUSION: These results indicate that polymorphisms within SP-A loci may be associated with otitis media in white infants. Larger confirmatory studies in all ethnic groups are warranted. [Abstract/Link to Full Text]

Ogata T, Gregoire L, Goddard KA, Skunca M, Tromp G, Lancaster WD, Parrado AR, Lu Q, Shibamura H, Sakalihasan N, Limet R, MacKean GL, Arthur C, Sueda T, Kuivaniemi H
Evidence for association between the HLA-DQA locus and abdominal aortic aneurysms in the Belgian population: a case control study.
BMC Med Genet. 2006;7:67.
BACKGROUND: Chronic inflammation and autoimmunity likely contribute to the pathogenesis of abdominal aortic aneurysms (AAAs). The aim of this study was to investigate the role of autoimmunity in the etiology of AAAs using a genetic association study approach with HLA polymorphisms. METHODS: HLA-DQA1, -DQB1, -DRB1 and -DRB3-5 alleles were determined in 387 AAA cases (180 Belgian and 207 Canadian) and 426 controls (269 Belgian and 157 Canadian) by a PCR and single-strand oligonucleotide probe hybridization assay. RESULTS: We observed a potential association with the HLA-DQA1 locus among Belgian males (empirical p = 0.027, asymptotic p = 0.071). Specifically, there was a significant difference in the HLA-DQA1*0102 allele frequencies between AAA cases (67/322 alleles, 20.8%) and controls (44/356 alleles, 12.4%) in Belgian males (empirical p = 0.019, asymptotic p = 0.003). In haplotype analyses, marginally significant association was found between AAA and haplotype HLA-DQA1-DRB1 (p = 0.049 with global score statistics and p = 0.002 with haplotype-specific score statistics). CONCLUSION: This study showed potential evidence that the HLA-DQA1 locus harbors a genetic risk factor for AAAs suggesting that autoimmunity plays a role in the pathogenesis of AAAs. [Abstract/Link to Full Text]
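
The allele counts given above for HLA-DQA1*0102 in Belgian males (67/322 in cases, 44/356 in controls) are enough to build a 2x2 table. The abstract does not name the test behind its asymptotic p-value, so the sketch below simply assumes a Pearson chi-square without continuity correction; the function name is illustrative.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) and its 1-df p-value
    for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(chi2 / 2.0))  # upper tail of chi-square, 1 df
    return chi2, p

# Allele counts from the abstract: *0102 alleles versus all other alleles.
cases_0102, cases_total = 67, 322
controls_0102, controls_total = 44, 356
chi2, p = chi_square_2x2(cases_0102, cases_total - cases_0102,
                         controls_0102, controls_total - controls_0102)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

Under that assumption the p-value lands close to the reported asymptotic p = 0.003. The empirical p-value in the abstract presumably comes from a permutation procedure that this sketch does not attempt to reproduce.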

Evans D, Beil FU
The E670G SNP in the PCSK9 gene is associated with polygenic hypercholesterolemia in men but not in women.
BMC Med Genet. 2006;7:66.
BACKGROUND: Common genetic variants in the PCSK9 gene have been reported to be associated with both elevated and exceptionally low LDL levels. The association of a common haplotype, encompassing the E670G single nucleotide polymorphism, with LDL levels reported by Chen et al (J Am Coll Cardiol 2005; 45: 1644) was not confirmed by Kotowski et al (Am J Hum Genet 2006; 78:410-422). METHODS: The incidence of the E670G SNP was determined in 506 patients attending the lipid clinic, University Hospital, Hamburg. RESULTS: The allele frequency in men with polygenic hypercholesterolemia (0.11) was significantly higher than in men with LDL below the 50th percentile (0.03; p = 0.01). In women there was no difference in the allele frequencies between the two groups. CONCLUSION: In a European population the E670G SNP in the PCSK9 gene is associated with increased LDL in men but not in women. [Abstract/Link to Full Text]

Horan PG, Allen AR, Hughes AE, Patterson CC, Spence M, McGlinchey PG, Belton C, Jardine TC, McKeown PP
Lack of MEF2A Delta7aa mutation in Irish families with early onset ischaemic heart disease, a family based study.
BMC Med Genet. 2006;7:65.
BACKGROUND: Ischaemic heart disease (IHD) is a complex disease due to the combination of environmental and genetic factors. Mutations in the MEF2A gene have recently been reported in patients with IHD. In particular, a 21 base pair deletion (Delta7aa) in the MEF2A gene was identified in a family with an autosomal dominant pattern of inheritance of IHD. We investigated this region of the MEF2A gene using an Irish family-based study, where affected individuals had early-onset IHD. METHODS: A total of 1494 individuals from 580 families were included (800 discordant sib-pairs and 64 parent-child trios). The Delta7aa region of the MEF2A gene was investigated based on amplicon size. RESULTS: The Delta7aa mutation was not detected in any individual. Variation in the number of CAG (glutamine) and CCG (proline) repeats was detected in a nearby region. However, this was not found to be associated with IHD. CONCLUSION: The Delta7aa mutation was not detected in any individual within the study population and is unlikely to play a significant role in the development of IHD in Ireland. Using family-based tests of association, the number of tri-nucleotide repeats in a nearby region of the MEF2A gene was not associated with IHD in our study group. [Abstract/Link to Full Text]

Bugeja MJ, Booth D, Bennetts B, Heard R, Rubio J, Stewart G
An investigation of polymorphisms in the 17q11.2-12 CC chemokine gene cluster for association with multiple sclerosis in Australians.
BMC Med Genet. 2006;7:64.
BACKGROUND: Multiple sclerosis (MS) is a disorder of the central nervous system (CNS) characterised by inflammation and neuronal degeneration. It is believed to result from the complex interaction of a number of genes, each with modest effect. Chemokines are vital to the migration of cells to sites of inflammation, including the CNS, and many are implicated in MS pathogenesis. Most of the CC chemokine genes are encoded in a cluster on chromosome 17q11.2-12, which has been identified in a number of genome wide screens as being potentially associated with MS. METHODS: We conducted a two-stage analysis to investigate the chemokine gene cluster for association with MS. After sequencing the chemokine genes in several DNA pools to identify common polymorphisms, 12 candidate single-nucleotide polymorphisms (SNPs) were genotyped in a cohort of Australian MS trio families. RESULTS: Marginally significant (uncorrected) transmission distortion was identified for four of the SNPs after stratification for several factors. We also identified marginally significant (uncorrected) transmission distortion for haplotypes encompassing the CCL2 and CCL11 genes, using two independent cohorts, which was consistent with recent reports from another group. CONCLUSION: Our results implicate several chemokines as possibly being associated with MS susceptibility, and given that chemokines and their receptors are suitable targets for therapeutic agents, further investigation is warranted in this region. [Abstract/Link to Full Text]

Favorova OO, Favorov AV, Boiko AN, Andreewski TV, Sudomoina MA, Alekseenkov AD, Kulakova OG, Gusev EI, Parmigiani G, Ochs MF
Three allele combinations associated with multiple sclerosis.
BMC Med Genet. 2006;7:63.
BACKGROUND: Multiple sclerosis (MS) is an immune-mediated disease of polygenic etiology. Dissection of its genetic background is a complex problem, because of the combinatorial possibilities of gene-gene interactions. As genotyping methods improve throughput, approaches that can explore multigene interactions appropriately should lead to improved understanding of MS. METHODS: 286 unrelated patients with definite MS and 362 unrelated healthy controls of Russian descent were genotyped at polymorphic loci (including SNPs, repeat polymorphisms, and an insertion/deletion) of the DRB1, TNF, LT, TGFbeta1, CCR5 and CTLA4 genes and TNFa and TNFb microsatellites. Carriership of each allele in patients and controls was compared by Fisher's exact test, and disease-associated combinations of alleles in the data set were sought using a Bayesian Markov chain Monte Carlo-based method recently developed by our group. RESULTS: We identified two previously unknown MS-associated tri-allelic combinations: -509TGFbeta1*C, DRB1*18(3), CTLA4*G and -238TNF*B1, -308TNF*A2, CTLA4*G, which perfectly separate MS cases from controls, at least in the present sample. The previously described DRB1*15(2) allele, the microsatellite TNFa9 allele and the biallelic combination CCR5Delta32, DRB1*04 were also reidentified as MS-associated. CONCLUSION: These results represent an independent validation of MS association with DRB1*15(2) and TNFa9 in Russians and are the first to find the interplay of three loci in conferring susceptibility to MS. They demonstrate the efficacy of our approach for the identification of complex-disease-associated combinations of alleles. [Abstract/Link to Full Text]
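
The methods above compare allele carriership between patients and controls with Fisher's exact test. The sketch below shows one standard two-sided form of that test for a 2x2 table, written with the Python standard library only. The group sizes (286 patients, 362 controls) come from the abstract, but the carrier count of 12 is HYPOTHETICAL and chosen only to mimic a combination seen in patients and never in controls. The authors' Bayesian Markov chain Monte Carlo search is not reproduced here.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total

    observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= observed * (1 + 1e-9))

# HYPOTHETICAL carrier count for illustration only (not from the study):
# a combination carried by 12 of 286 patients and 0 of 362 controls.
p_value = fisher_exact_two_sided(12, 286 - 12, 0, 362 - 0)
print(f"p = {p_value:.2g}")
```

Even a combination that perfectly separates carriers into the patient group yields a finite p-value whose size depends on how many carriers there are, which is why the abstract qualifies the finding with "at least in the present sample".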

Elbein SC, Wang X, Karim MA, Chu WS, Silver KD
Analysis of coding variants in the betacellulin gene in type 2 diabetes and insulin secretion in African American subjects.
BMC Med Genet. 2006;7:62.
BACKGROUND: Betacellulin is a member of the epidermal growth factor family, expressed at the highest levels predominantly in the pancreas and thought to be involved in islet neogenesis and regeneration. Nonsynonymous coding variants were reported to be associated with type 2 diabetes in African American subjects. We tested the hypotheses that these previously identified variants were associated with type 2 diabetes in African Americans ascertained in Arkansas and that they altered insulin secretion in glucose tolerant African American subjects. METHODS: We typed three variants, exon 1 Cys7Gly (C7G), exon 2 Leu44Phe (L44F), and exon 4 Leu124Met (L124M), in 188 control subjects and 364 subjects with type 2 diabetes. We tested for altered insulin secretion in 107 subjects who had undergone intravenous glucose tolerance tests to assess insulin sensitivity and insulin secretion. RESULTS: No variant was associated with type 2 diabetes, and no variant altered insulin secretion or insulin sensitivity. However, an effect on lipids was observed for all 3 variants, and variant L124M was associated with obesity measures. CONCLUSION: We were unable to confirm a role for nonsynonymous variants of betacellulin in the propensity to type 2 diabetes or to impaired insulin secretion. [Abstract/Link to Full Text]

Delgado IJ, Kim DS, Thatcher KN, LaSalle JM, Van den Veyver IB
Expression profiling of clonal lymphocyte cell cultures from Rett syndrome patients.
BMC Med Genet. 2006;7:61.
BACKGROUND: More than 85% of Rett syndrome (RTT) patients have heterozygous mutations in the X-linked MECP2 gene which encodes methyl-CpG-binding protein 2, a transcriptional repressor that binds methylated CpG sites. Because MECP2 is subject to X chromosome inactivation (XCI), girls with RTT express either the wild type or mutant MECP2 in each of their cells. To test the hypothesis that MECP2 mutations result in genome-wide transcriptional deregulation and identify its target genes in a system that circumvents the functional mosaicism resulting from XCI, we performed gene expression profiling of pure populations of untransformed T-lymphocytes that express either a mutant or a wild-type allele. METHODS: Single T lymphocytes from a patient with a c.473C>T (p.T158M) mutation and one with a c.1308-1309delTC mutation were subcloned and subjected to short term culture. Gene expression profiles of wild-type and mutant clones were compared by oligonucleotide expression microarray analysis. RESULTS: Expression profiling yielded 44 upregulated genes and 77 downregulated genes. We compared this gene list with expression profiles of independent microarray experiments in cells and tissues of RTT patients and mouse models with Mecp2 mutations. These comparisons identified a candidate MeCP2 target gene, SPOCK1, downregulated in two independent microarray experiments, but its expression was not altered by quantitative RT-PCR analysis on brain tissues from a RTT mouse model. CONCLUSION: Initial expression profiling from T-cell clones of RTT patients identified a list of potential MeCP2 target genes. Further detailed analysis and comparison to independent microarray experiments did not confirm significantly altered expression of most candidate genes. These results are consistent with other reported data. [Abstract/Link to Full Text]

Recent Articles in American Journal of Human Genetics

Martinez-Mir A, Zlotogorski A, Gordon D, Petukhova L, Mo J, Gilliam TC, Londono D, Haynes C, Ott J, Hordinsky M, Nanova K, Norris D, Price V, Duvic M, Christiano AM
Genomewide scan for linkage reveals evidence of several susceptibility loci for alopecia areata.
Am J Hum Genet. 2007 Feb;80(2):316-28.
Alopecia areata (AA) is a genetically determined, immune-mediated disorder of the hair follicle that affects 1%-2% of the U.S. population. It is defined by a spectrum of severity that ranges from patchy localized hair loss on the scalp to the complete absence of hair everywhere on the body. In an effort to define the genetic basis of AA, we performed a genomewide search for linkage in 20 families with AA consisting of 102 affected and 118 unaffected individuals from the United States and Israel. Our analysis revealed evidence of at least four susceptibility loci on chromosomes 6, 10, 16 and 18, by use of several different statistical approaches. Fine-mapping analysis with additional families yielded a maximum multipoint LOD score of 3.93 on chromosome 18, a two-point affected sib pair (ASP) LOD score of 3.11 on chromosome 16, several ASP LOD scores >2.00 on chromosome 6q, and a haplotype-based relative risk LOD of 2.00 on chromosome 6p (in the major histocompatibility complex locus). Our findings confirm previous studies of association of the human leukocyte antigen locus with human AA, as well as the C3H-HeJ mouse model for AA. Interestingly, the major loci on chromosomes 16 and 18 coincide with loci for psoriasis reported elsewhere. These results suggest that these regions may harbor gene(s) involved in a number of different skin and hair disorders. [Abstract/Link to Full Text]

Asai-Coakwell M, French CR, Berry KM, Ye M, Koss R, Somerville M, Mueller R, van Heyningen V, Waskiewicz AJ, Lehmann OJ
GDF6, a novel locus for a spectrum of ocular developmental anomalies.
Am J Hum Genet. 2007 Feb;80(2):306-15.
Colobomata represent visually impairing ocular closure defects that are associated with a diverse range of developmental anomalies. Characterization of a chromosome 8q21.2-q22.1 segmental deletion in a patient with chorioretinal coloboma revealed elements of nonallelic homologous recombination and nonhomologous end joining. This genomic architecture extends the range of chromosomal rearrangements associated with human disease and indicates that a broader spectrum of human chromosomal rearrangements may use coupled homologous and nonhomologous mechanisms. We also demonstrate that the segmental deletion encompasses GDF6, encoding a member of the bone-morphogenetic protein family, and that inhibition of gdf6a in a model organism accurately recapitulates the proband's phenotype. The spectrum of disorders generated by morpholino inhibition and the more severe defects (microphthalmia and anophthalmia) observed at higher doses illustrate the key role of GDF6 in ocular development. These results underscore the value of integrated clinical and molecular investigation of patients with chromosomal anomalies. [Abstract/Link to Full Text]

Zsurka G, Hampel KG, Kudina T, Kornblum C, Kraytsberg Y, Elger CE, Khrapko K, Kunz WS
Inheritance of mitochondrial DNA recombinants in double-heteroplasmic families: potential implications for phylogenetic analysis.
Am J Hum Genet. 2007 Feb;80(2):298-305.
Recently, somatic recombination of human mitochondrial DNA (mtDNA) was discovered in skeletal muscle. To determine whether recombinant mtDNA molecules can be transmitted through the germ line, we investigated two families, each harboring two inherited heteroplasmic mtDNA mutations. Using allele-specific polymerase chain reaction and single-cell and single-molecule mutational analyses, we discovered, in both families, all four possible allelic combinations of the two heteroplasmic mutations (tetraplasmy), the hallmark of mtDNA recombination. We strongly suggest that these recombinant mtDNA molecules were inherited rather than de novo generated somatically, because they (1) are highly abundant and (2) are present in different tissues of maternally related family members, including young individuals. Moreover, the comparison of the complete mtDNA sequence of one of the families with database sequences revealed an irregular, nontreelike pattern of mutations, reminiscent of a reticulation. We therefore propose that certain reticulations of the human mtDNA phylogenetic tree might be explained by recombination of coexisting mtDNA molecules harboring multiple mutations. [Abstract/Link to Full Text]

Kügler S, Hahnewald R, Garrido M, Reiss J
Long-term rescue of a lethal inherited disease by adeno-associated virus-mediated gene transfer in a mouse model of molybdenum-cofactor deficiency.
Am J Hum Genet. 2007 Feb;80(2):291-7.
Molybdenum cofactor (MoCo) deficiency is a progressive neurological disorder that inevitably leads to early childhood death because of the lack of any effective therapy. In a mouse model of MoCo deficiency type A, the most frequent form of this autosomal recessively inherited disease, the affected animals show the biochemical characteristics of sulphite and xanthine intoxication and do not survive >2 wk after birth. We have constructed a recombinant-expression cassette for the gene MOCS1, which, via alternative splicing, facilitates the expression of the proteins MOCS1A and MOCS1B, both of which are necessary for the formation of a first intermediate, cyclic pyranopterin monophosphate (cPMP), within the biosynthetic pathway leading to active MoCo. A recombinant adeno-associated virus (AAV) vector was used to express the artificial MOCS1 minigene, in an attempt to cure the lethal MOCS1-deficient phenotype. The vector was used to transduce Mocs1-deficient mice at both 1 and 4 d after birth or, after a pretreatment with purified cPMP, at 40 d after birth. We report here that all Mocs1-deficient animals injected with a control AAV-enhanced green fluorescent protein vector died approximately 8 d after birth or after withdrawal of cPMP supplementation, whereas AAV-MOCS1-transduced animals show significantly increased longevity. A single intrahepatic injection of AAV-MOCS1 resulted in fertile adult animals without any pathological phenotypes. [Abstract/Link to Full Text]

Cargill M, Schrodi SJ, Chang M, Garcia VE, Brandon R, Callis KP, Matsunami N, Ardlie KG, Civello D, Catanese JJ, Leong DU, Panko JM, McAllister LB, Hansen CB, Papenfuss J, Prescott SM, White TJ, Leppert MF, Krueger GG, Begovich AB
A large-scale genetic association study confirms IL12B and leads to the identification of IL23R as psoriasis-risk genes.
Am J Hum Genet. 2007 Feb;80(2):273-90.
We performed a multitiered, case-control association study of psoriasis in three independent sample sets of white North American individuals (1,446 cases and 1,432 controls) with 25,215 genecentric single-nucleotide polymorphisms (SNPs) and found a highly significant association with an IL12B 3'-untranslated-region SNP (rs3212227), confirming the results of a small Japanese study. This SNP was significant in all three sample sets (odds ratio [OR](common) 0.64, combined P [Pcomb]=7.85x10(-10)). A Monte Carlo simulation to address multiple testing suggests that this association is not a type I error. The coding regions of IL12B were resequenced in 96 individuals with psoriasis, and 30 additional IL12B-region SNPs were genotyped. Haplotypes were estimated, and genotype-conditioned analyses identified a second risk allele (rs6887695) located approximately 60 kb upstream of the IL12B coding region that exhibited association with psoriasis after adjustment for rs3212227. Together, these two SNPs mark a common IL12B risk haplotype (OR(common) 1.40, Pcomb=8.11x10(-9)) and a less frequent protective haplotype (OR(common) 0.58, Pcomb=5.65x10(-12)), which were statistically significant in all three studies. Since IL12B encodes the common IL-12p40 subunit of IL-12 and IL-23, we individually genotyped 17 SNPs in the genes encoding the other chains of these cytokines (IL12A and IL23A) and their receptors (IL12RB1, IL12RB2, and IL23R). Haplotype analyses identified two IL23R missense SNPs that together mark a common psoriasis-associated haplotype in all three studies (OR(common) 1.44, Pcomb=3.13x10(-6)). Individuals homozygous for both the IL12B and the IL23R predisposing haplotypes have an increased risk of disease (OR(common) 1.66, Pcomb=1.33x10(-8)). These data, and the previous observation that administration of an antibody specific for the IL-12p40 subunit to patients with psoriasis is highly efficacious, suggest that these genes play a fundamental role in psoriasis pathogenesis. [Abstract/Link to Full Text]
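
The odds ratios quoted in this abstract are derived from case-control allele or haplotype counts. A minimal sketch of the calculation for a single 2x2 table, with a Woolf (log-scale) 95% confidence interval, is given below. The function name and all counts are hypothetical, since the abstract does not report the underlying tables, and the combined OR(common) across the three sample sets would need a stratified estimator that is not shown here.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf confidence interval for the table [[a, b], [c, d]].

    a, b: cases with / without the allele
    c, d: controls with / without the allele
    z:    normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this estimator")
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical allele counts, chosen only to illustrate the arithmetic.
or_, lo, hi = odds_ratio_ci(400, 2492, 580, 2284)
print(f"OR = {or_:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

An odds ratio below 1, such as the 0.64 reported for rs3212227, indicates that the counted allele is less frequent in cases than in controls.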

Chang YP, Liu X, Kim JD, Ikeda MA, Layton MR, Weder AB, Cooper RS, Kardia SL, Rao DC, Hunt SC, Luke A, Boerwinkle E, Chakravarti A
Multiple genes for essential-hypertension susceptibility on chromosome 1q.
Am J Hum Genet. 2007 Feb;80(2):253-64.
Essential hypertension, defined as elevated levels of blood pressure (BP) without any obvious cause, is a major risk factor for coronary heart disease, stroke, and renal disease. BP levels and susceptibility to development of essential hypertension are partially determined by genetic factors that are poorly understood. Similar to other efforts to understand complex, non-Mendelian phenotypes, genetic dissection of hypertension-related traits employs genomewide linkage analyses of families and association studies of patient cohorts, to uncover rare and common disease alleles, respectively. Family-based mapping studies of elevated BP cover the large intermediate ground for identification of genes with common variants of significant effect. Our genomewide linkage and candidate-gene-based association studies demonstrate that a replicated linkage peak for BP regulation on human chromosome 1q, homologous to mouse and rat quantitative trait loci for BP, contains at least three genes associated with BP levels in multiple samples: ATP1B1, RGS5, and SELE. Individual variants in these three genes account for 2-5-mm Hg differences in mean systolic BP levels, and the cumulative effect reaches 8-10 mm Hg. Because the associated alleles in these genes are relatively common (frequency >5%), these three genes are important contributors to elevated BP in the population at large. [Abstract/Link to Full Text]

Duffy DL, Montgomery GW, Chen W, Zhao ZZ, Le L, James MR, Hayward NK, Martin NG, Sturm RA
A three-single-nucleotide polymorphism haplotype in intron 1 of OCA2 explains most human eye-color variation.
Am J Hum Genet. 2007 Feb;80(2):241-52.
We have previously shown that a quantitative-trait locus linked to the OCA2 region of 15q accounts for 74% of variation in human eye color. We conducted additional genotyping to clarify the role of the OCA2 locus in the inheritance of eye color and other pigmentary traits associated with skin-cancer risk in white populations. Fifty-eight synonymous and nonsynonymous exonic single-nucleotide polymorphisms (SNPs) and tagging SNPs were typed in a collection of 3,839 adolescent twins, their siblings, and their parents. The highest association for blue/nonblue eye color was found with three OCA2 SNPs: rs7495174 T/C, rs6497268 G/T, and rs11855019 T/C (P values of 1.02x10(-61), 1.57x10(-96), and 4.45x10(-54), respectively) in intron 1. These three SNPs are in one major haplotype block, with TGT representing 78.4% of alleles. The TGT/TGT diplotype found in 62.2% of samples was the major genotype seen to modify eye color, with a frequency of 0.905 in blue or green compared with only 0.095 in brown eye color. This genotype was also at highest frequency in subjects with light brown hair and was more frequent in fair and medium skin types, consistent with the TGT haplotype acting as a recessive modifier of lighter pigmentary phenotypes. Homozygotes for rs11855019 C/C were predominantly without freckles and had lower mole counts. The minor population impact of the nonsynonymous coding-region polymorphisms Arg305Trp and Arg419Gln associated with nonblue eyes and the tight linkage of the major TGT haplotype within the intron 1 of OCA2 with blue eye color and lighter hair and skin tones suggest that differences within the 5' proximal regulatory control region of the OCA2 gene alter expression or messenger RNA-transcript levels and may be responsible for these associations. [Abstract/Link to Full Text]

Klopocki E, Schulze H, Strauss G, Ott CE, Hall J, Trotier F, Fleischhauer S, Greenhalgh L, Newbury-Ecob RA, Neumann LM, Habenicht R, König R, Seemanova E, Megarbane A, Ropers HH, Ullmann R, Horn D, Mundlos S
Complex inheritance pattern resembling autosomal recessive inheritance involving a microdeletion in thrombocytopenia-absent radius syndrome.
Am J Hum Genet. 2007 Feb;80(2):232-40.
Thrombocytopenia-absent radius (TAR) syndrome is characterized by hypomegakaryocytic thrombocytopenia and bilateral radial aplasia in the presence of both thumbs. Other frequent associations are congenital heart disease and a high incidence of cow's milk intolerance. Evidence for autosomal recessive inheritance comes from families with several affected individuals born to unaffected parents, but several other observations argue for a more complex pattern of inheritance. In this study, we describe a common interstitial microdeletion of 200 kb on chromosome 1q21.1 in all 30 investigated patients with TAR syndrome, detected by microarray-based comparative genomic hybridization. Analysis of the parents revealed that this deletion occurred de novo in 25% of affected individuals. Intriguingly, inheritance of the deletion along the maternal line as well as the paternal line was observed. The absence of this deletion in a cohort of control individuals argues for a specific role played by the microdeletion in the pathogenesis of TAR syndrome. We hypothesize that TAR syndrome is associated with a deletion on chromosome 1q21.1 but that the phenotype develops only in the presence of an additional as-yet-unknown modifier (mTAR). [Abstract/Link to Full Text]

Winnepenninckx B, Debacker K, Ramsay J, Smeets D, Smits A, FitzPatrick DR, Kooy RF
CGG-repeat expansion in the DIP2B gene is associated with the fragile site FRA12A on chromosome 12q13.1.
Am J Hum Genet. 2007 Feb;80(2):221-31.
A high level of cytogenetic expression of the rare folate-sensitive fragile site FRA12A is significantly associated with mental retardation. Here, we identify an elongated polymorphic CGG repeat as the molecular basis of FRA12A. This repeat is in the 5' untranslated region of the gene DIP2B, which encodes a protein with a DMAP1-binding domain, which suggests a role in DNA methylation machinery. DIP2B mRNA levels were halved in two subjects with FRA12A with mental retardation in whom the repeat expansion was methylated. In two individuals without mental retardation but with an expanded and methylated repeat, DIP2B expression was reduced to approximately two-thirds of the values observed in controls. Interestingly, a carrier of an unmethylated CGG-repeat expansion showed increased levels of DIP2B mRNA, which suggests that the repeat elongation increases gene expression, as previously described for the fragile X-associated tremor/ataxia syndrome. These data suggest that deficiency of DIP2B, a brain-expressed gene, may mediate the neurocognitive problems associated with FRA12A. [Abstract/Link to Full Text]

Laumonnier F, Cuthbert PC, Grant SG
The role of neuronal complexes in human X-linked brain diseases.
Am J Hum Genet. 2007 Feb;80(2):205-20.
Beyond finding individual genes that are involved in medical disorders, an important challenge is the integration of sets of disease genes with the complexities of basic biological processes. We examine this issue by focusing on neuronal multiprotein complexes and their components encoded on the human X chromosome. Multiprotein signaling complexes in the postsynaptic terminal of central nervous system synapses are essential for the induction of neuronal plasticity and cognitive processes in animals. The prototype complex is the N-methyl-D-aspartate receptor complex/membrane-associated guanylate kinase-associated signaling complex (NRC/MASC) comprising 185 proteins and embedded within the postsynaptic density (PSD), which is a set of complexes totaling approximately 1,100 proteins. It is striking that 86% (6 of 7) of X-linked NRC/MASC genes and 49% (19 of 39) of X-chromosomal PSD genes are already known to be involved in human psychiatric disorders. Moreover, of the 69 known proteins mutated in X-linked mental retardation, 19 (28%) encode postsynaptic proteins. The high incidence of involvement in cognitive disorders is also found in mouse mutants and indicates that the complexes are functioning as integrated entities or molecular machines and that disruption of different components impairs their overall role in cognitive processes. We also noticed that NRC/MASC genes appear to be more strongly associated with mental retardation and autism spectrum disorders. We propose that systematic studies of PSD and NRC/MASC genes in mice and humans will give a high yield of novel genes important for human disease and new mechanistic insights into higher cognitive functions. [Abstract/Link to Full Text]

Bisceglia L, Cerullo G, Forabosco P, Torres DD, Scolari F, Di Perna M, Foramitti M, Amoroso A, Bertok S, Floege J, Mertens PR, Zerres K, Alexopoulos E, Kirmizis D, Ermelinda M, Zelante L, Schena FP
Genetic heterogeneity in Italian families with IgA nephropathy: suggestive linkage for two novel IgA nephropathy loci.
Am J Hum Genet. 2006 Dec;79(6):1130-4.
IgA nephropathy (IgAN) is the most common glomerulonephritis worldwide, but its etiologic mechanisms are still poorly understood. Different prevalences among ethnic groups and familial aggregation, together with an increased familial risk, suggest important genetic influences on its pathogenesis. A locus for familial IgAN, called "IGAN1," on chromosome 6q22-23 has been described, without the identification of any responsible gene. The partners of the European IgAN Consortium organized a second genomewide scan in 22 new informative Italian multiplex families. A total of 186 subjects (59 affected and 127 unaffected) were genotyped and were included in a two-stage genomewide linkage analysis. The regions 4q26-31 and 17q12-22 exhibited the strongest evidence of linkage by nonparametric analysis (best P=.0025 and .0045, respectively). These localizations were also supported by multipoint parametric analysis, in which peak LOD scores of 1.83 (α=0.50) and 2.56 (α=0.65) were obtained using the affected-only dominant model, and by allowance for the presence of genetic heterogeneity. Our results provide further evidence for genetic heterogeneity among families with IgAN. Evidence of linkage to multiple chromosomal regions is consistent with both an oligo/polygenic and a multiple-susceptibility-gene model for familial IgAN, with small or moderate effects in determining the pathological phenotype. Although we identified new candidate regions, replication studies are required to confirm the genetic contribution to familial IgAN. [Abstract/Link to Full Text]

López LC, Schuelke M, Quinzii CM, Kanki T, Rodenburg RJ, Naini A, Dimauro S, Hirano M
Leigh syndrome with nephropathy and CoQ10 deficiency due to decaprenyl diphosphate synthase subunit 2 (PDSS2) mutations.
Am J Hum Genet. 2006 Dec;79(6):1125-9.
Coenzyme Q(10) (CoQ(10)) is a vital lipophilic molecule that transfers electrons from mitochondrial respiratory chain complexes I and II to complex III. Deficiency of CoQ(10) has been associated with diverse clinical phenotypes, but, in most patients, the molecular cause is unknown. The first defect in a CoQ(10) biosynthetic gene, COQ2, was identified in a child with encephalomyopathy and nephrotic syndrome and in a younger sibling with only nephropathy. Here, we describe an infant with severe Leigh syndrome, nephrotic syndrome, and CoQ(10) deficiency in muscle and fibroblasts and compound heterozygous mutations in the PDSS2 gene, which encodes a subunit of decaprenyl diphosphate synthase, the first enzyme of the CoQ(10) biosynthetic pathway. Biochemical assays with radiolabeled substrates indicated a severe defect in decaprenyl diphosphate synthase in the patient's fibroblasts. This is the first description of pathogenic mutations in PDSS2 and confirms the molecular and clinical heterogeneity of primary CoQ(10) deficiency. [Abstract/Link to Full Text]

Tarpey PS, Stevens C, Teague J, Edkins S, O'Meara S, Avis T, Barthorpe S, Buck G, Butler A, Cole J, Dicks E, Gray K, Halliday K, Harrison R, Hills K, Hinton J, Jones D, Menzies A, Mironenko T, Perry J, Raine K, Richardson D, Shepherd R, Small A, Tofts C, Varian J, West S, Widaa S, Yates A, Catford R, Butler J, Mallya U, Moon J, Luo Y, Dorkins H, Thompson D, Easton DF, Wooster R, Bobrow M, Carpenter N, Simensen RJ, Schwartz CE, Stevenson RE, Turner G, Partington M, Gecz J, Stratton MR, Futreal PA, Raymond FL
Mutations in the gene encoding the Sigma 2 subunit of the adaptor protein 1 complex, AP1S2, cause X-linked mental retardation.
Am J Hum Genet. 2006 Dec;79(6):1119-24.
In a systematic sequencing screen of the coding exons of the X chromosome in 250 families with X-linked mental retardation (XLMR), we identified two nonsense mutations and one consensus splice-site mutation in the AP1S2 gene on Xp22 in three families. Affected individuals in these families showed mild-to-profound mental retardation. Other features included hypotonia early in life and delay in walking. AP1S2 encodes an adaptin protein that constitutes part of the adaptor protein complex found at the cytoplasmic face of coated vesicles located at the Golgi complex. The complex mediates the recruitment of clathrin to the vesicle membrane. Aberrant endocytic processing through disruption of adaptor protein complexes is likely to result from the AP1S2 mutations identified in the three XLMR-affected families, and such defects may plausibly cause abnormal synaptic development and function. AP1S2 is the first reported XLMR gene that encodes a protein directly involved in the assembly of endocytic vesicles. [Abstract/Link to Full Text]

Gazda HT, Grabowska A, Merida-Long LB, Latawiec E, Schneider HE, Lipton JM, Vlachos A, Atsidaftos E, Ball SE, Orfali KA, Niewiadomska E, Da Costa L, Tchernia G, Niemeyer C, Meerpohl JJ, Stahl J, Schratt G, Glader B, Backer K, Wong C, Nathan DG, Beggs AH, Sieff CA
Ribosomal protein S24 gene is mutated in Diamond-Blackfan anemia.
Am J Hum Genet. 2006 Dec;79(6):1110-8.
Diamond-Blackfan anemia (DBA) is a rare congenital red-cell aplasia characterized by anemia, bone-marrow erythroblastopenia, and congenital anomalies and is associated with heterozygous mutations in the ribosomal protein (RP) S19 gene (RPS19) in approximately 25% of probands. We report identification of de novo nonsense and splice-site mutations in another RP, RPS24 (encoded by RPS24 [10q22-q23]) in approximately 2% of RPS19 mutation-negative probands. This finding strongly suggests that DBA is a disorder of ribosome synthesis and that mutations in other RP or associated genes that lead to disrupted ribosomal biogenesis and/or function may also cause DBA. [Abstract/Link to Full Text]

Bergmann C, Senderek J, Anhuf D, Thiel CT, Ekici AB, Poblete-Gutierrez P, van Steensel M, Seelow D, Nürnberg G, Schild HH, Nürnberg P, Reis A, Frank J, Zerres K
Mutations in the gene encoding the Wnt-signaling component R-spondin 4 (RSPO4) cause autosomal recessive anonychia.
Am J Hum Genet. 2006 Dec;79(6):1105-9.
Anonychia is an autosomal recessive disorder characterized by the congenital absence of finger- and toenails. In a large German nonconsanguineous family with four affected and five unaffected siblings with isolated total congenital anonychia, we performed genomewide mapping and showed linkage to 20p13. Analysis of the RSPO4 gene within this interval revealed a frameshift and a nonconservative missense mutation in exon 2 affecting the highly conserved first furin-like cysteine-rich domain. Neither mutation was present among controls, and both were shown to segregate with the disease phenotype. RSPO4 is a member of the recently described R-spondin family of secreted proteins that play a major role in activating the Wnt/β-catenin signaling pathway. Wnt signaling is evolutionarily conserved and plays a pivotal role in embryonic development, growth regulation of multiple tissues, and cancer development. Our findings add to the increasing body of evidence indicating that mesenchymal-epithelial interactions are crucial in nail development and put anonychia on the growing list of congenital malformation syndromes caused by Wnt-signaling-pathway defects. To the best of our knowledge, this is the first gene known to be responsible for an isolated, nonsyndromic nail disorder. [Abstract/Link to Full Text]

Cichon S, Martin L, Hennies HC, Müller F, Van Driessche K, Karpushova A, Stevens W, Colombo R, Renné T, Drouet C, Bork K, Nöthen MM
Increased activity of coagulation factor XII (Hageman factor) causes hereditary angioedema type III.
Am J Hum Genet. 2006 Dec;79(6):1098-104.
Hereditary angioedema (HAE) is characterized clinically by recurrent acute skin swelling, abdominal pain, and potentially life-threatening laryngeal edema. Three forms of HAE have been described. The classic forms, HAE types I and II, occur as a consequence of mutations in the C1-inhibitor gene. In contrast to HAE types I and II, HAE type III has been observed exclusively in women, where it appears to be correlated with conditions of high estrogen levels--for example, pregnancy or the use of oral contraceptives. A recent report proposed two missense mutations (c.1032C-->A and c.1032C-->G) in F12, the gene encoding human coagulation factor XII (FXII, or Hageman factor) as a possible cause of HAE type III. Here, we report the occurrence of the c.1032C-->A (p.Thr328Lys) mutation in an HAE type III-affected family of French origin. Investigation of the F12 gene in a large German family did not reveal a coding mutation. Haplotype analysis with use of microsatellite markers is compatible with locus heterogeneity in HAE type III. To shed more light on the pathogenic relevance of the HAE type III-associated p.Thr328Lys mutation, we compared FXII activity and plasma levels in patients carrying the mutation with those of healthy control individuals. Our data strongly suggest that p.Thr328Lys is a gain-of-function mutation that markedly increases FXII amidolytic activity but that does not alter FXII plasma levels. We conclude that enhanced FXII enzymatic plasma activity in female mutation carriers leads to enhanced kinin production, which results in angioedema. Transcription of F12 is positively regulated by estrogens, which may explain why only women are affected with HAE type III. The results of our study represent an important step toward an understanding of the molecular processes involved in HAE type III and provide diagnostic and possibly new therapeutic opportunities. [Abstract/Link to Full Text]

Saunders MA, Good JM, Lawrence EC, Ferrell RE, Li WH, Nachman MW
Human adaptive evolution at Myostatin (GDF8), a regulator of muscle growth.
Am J Hum Genet. 2006 Dec;79(6):1089-97.
Myostatin (GDF8) is a negative regulator of muscle growth in mammals, and loss-of-function mutations are associated with increased skeletal-muscle mass in mice, cattle, and humans. Here, we show that positive natural selection has acted on human nucleotide variation at GDF8, since the observed ratio of nonsynonymous:synonymous changes among humans is significantly greater than expected under the neutral model and is strikingly different from patterns observed across mammalian orders. Furthermore, extended haplotypes around GDF8 suggest that two amino acid variants have been subject to recent positive selection. Both mutations are rare among non-Africans yet are at frequencies of up to 31% in sub-Saharan Africans. These signatures of selection at the molecular level suggest that human variation at GDF8 is associated with functional differences. [Abstract/Link to Full Text]

Heuser A, Plovie ER, Ellinor PT, Grossmann KS, Shin JT, Wichter T, Basson CT, Lerman BB, Sasse-Klaassen S, Thierfelder L, MacRae CA, Gerull B
Mutant desmocollin-2 causes arrhythmogenic right ventricular cardiomyopathy.
Am J Hum Genet. 2006 Dec;79(6):1081-8.
Arrhythmogenic right ventricular cardiomyopathy (ARVC) is a genetically heterogeneous heart-muscle disorder characterized by progressive fibrofatty replacement of right ventricular myocardium and an increased risk of sudden cardiac death. Mutations in desmosomal proteins that cause ARVC have been previously described; therefore, we investigated 88 unrelated patients with the disorder for mutations in human desmosomal cadherin desmocollin-2 (DSC2). We identified a heterozygous splice-acceptor-site mutation in intron 5 (c.631-2A-->G) of the DSC2 gene, which led to the use of a cryptic splice-acceptor site and the creation of a downstream premature termination codon. Quantitative analysis of cardiac DSC2 expression in patient specimens revealed a marked reduction in the abundance of the mutant transcript. Morpholino knockdown in zebrafish embryos revealed a requirement for dsc2 in the establishment of the normal myocardial structure and function, with reduced desmosomal plaque area, loss of the desmosome extracellular electron-dense midlines, and associated myocardial contractility defects. These data identify DSC2 mutations as a cause of ARVC in humans and demonstrate that physiologic levels of DSC2 are crucial for normal cardiac desmosome formation, early cardiac morphogenesis, and cardiac function. [Abstract/Link to Full Text]

Schaid DJ, Batzler AJ, Jenkins GD, Hildebrandt MA
Exact tests of Hardy-Weinberg equilibrium and homogeneity of disequilibrium across strata.
Am J Hum Genet. 2006 Dec;79(6):1071-80.
Detecting departures from Hardy-Weinberg equilibrium (HWE) of marker-genotype frequencies is a crucial first step in almost all human genetic analyses. When a sample is stratified by multiple ethnic groups, it is important to allow the marker-allele frequencies to differ over the strata. In this situation, it is common to test for HWE by using an exact test within each stratum and then using the minimum P value as a global test. This approach does not account for multiple testing, and, because it does not combine information over strata, it does not have optimal power. Several approximate methods to combine information over strata have been proposed, but most of them sum over strata a measure of departure from HWE; if the departures are in different directions, then summing can diminish the overall evidence of departure from HWE. An exact stratified test is more appealing because it uses the probability of genotype configurations across the strata as evidence for global departures from HWE. We developed an exact stratified test for HWE for diallelic markers, such as single-nucleotide polymorphisms (SNPs), and an exact test for homogeneity of Hardy-Weinberg disequilibrium. By applying our methods to data from Perlegen and HapMap--a combined total of more than five million SNP genotypes, with three to four strata and strata sizes ranging from 23 to 60 subjects--we illustrate that the exact stratified test provides more-robust and more-powerful results than those obtained by either the minimum of exact test P values over strata or approximate stratified tests that sum measures of departure from HWE. Hence, our new methods should be useful for samples composed of multiple ethnic groups. [Abstract/Link to Full Text]
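
For orientation, the within-stratum exact test that the abstract uses as its baseline can be sketched as follows. This is a minimal single-stratum illustration only, not the authors' stratified test; the function name `hwe_exact_pvalue` and the convention of summing all configurations no more probable than the observed one are assumptions made for this sketch.

```python
from math import lgamma, log, exp


def _log_fact(k):
    """Natural log of k! computed via the log-gamma function."""
    return lgamma(k + 1)


def hwe_exact_pvalue(n_aa, n_ab, n_bb):
    """Exact HWE p-value for one diallelic marker in one stratum.

    Conditions on the observed allele counts and sums the probabilities
    of every heterozygote count that is no more likely than the observed one.
    """
    n = n_aa + n_ab + n_bb      # number of genotyped subjects
    n_a = 2 * n_aa + n_ab       # copies of allele A
    n_b = 2 * n_bb + n_ab       # copies of allele B

    def log_prob(het):
        hom_a = (n_a - het) // 2
        hom_b = (n_b - het) // 2
        return (het * log(2)
                + _log_fact(n) - _log_fact(hom_a) - _log_fact(het) - _log_fact(hom_b)
                + _log_fact(n_a) + _log_fact(n_b) - _log_fact(2 * n))

    # The heterozygote count must have the same parity as the allele count.
    probs = {het: exp(log_prob(het))
             for het in range(n_a % 2, min(n_a, n_b) + 1, 2)}
    p_obs = probs[n_ab]
    # A small relative tolerance guards against floating-point ties.
    return min(1.0, sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9)))
```

Calling this once per stratum and taking the smallest p-value reproduces the "minimum P value" approach that the abstract criticizes for ignoring multiple testing and for not pooling information across strata.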

Friedman JS, Chang B, Kannabiran C, Chakarova C, Singh HP, Jalali S, Hawes NL, Branham K, Othman M, Filippova E, Thompson DA, Webster AR, Andréasson S, Jacobson SG, Bhattacharya SS, Heckenlively JR, Swaroop A
Premature truncation of a novel protein, RD3, exhibiting subnuclear localization is associated with retinal degeneration.
Am J Hum Genet. 2006 Dec;79(6):1059-70.
The rd3 mouse is one of the oldest identified models of early-onset retinal degeneration. Using the positional candidate approach, we have identified a C-->T substitution in a novel gene, Rd3, that encodes an evolutionarily conserved protein of 195 amino acids. The rd3 mutation results in a predicted stop codon after residue 106. This change is observed in four rd3 lines derived from the original collected mice but not in the nine wild-type mouse strains that were examined. Rd3 is preferentially expressed in the retina and exhibits increasing expression through early postnatal development. In transiently transfected COS-1 cells, the RD3-fusion protein shows subnuclear localization adjacent to promyelocytic leukemia-gene-product bodies. The truncated mutant RD3 protein is detectable in COS-1 cells but appears to get degraded rapidly. To explore potential association of the human RD3 gene at chromosome 1q32 with retinopathies, we performed a mutation screen of 881 probands from North America, India, and Europe. In addition to several alterations of uncertain significance, we identified a homozygous alteration in the invariant G nucleotide of the RD3 exon 2 donor splice site in two siblings with Leber congenital amaurosis. This mutation is predicted to result in premature truncation of the RD3 protein, segregates with the disease, and is not detected in 121 ethnically matched control individuals. We suggest that the retinopathy-associated RD3 protein is part of subnuclear protein complexes involved in diverse processes, such as transcription and splicing. [Abstract/Link to Full Text]

Montgomery GW, Zhu G, Hottenga JJ, Duffy DL, Heath AC, Boomsma DI, Martin NG, Visscher PM
HLA and genomewide allele sharing in dizygotic twins.
Am J Hum Genet. 2006 Dec;79(6):1052-8.
Gametic selection during fertilization or the effects of specific genotypes on the viability of embryos may cause a skewed transmission of chromosomes to surviving offspring. A recent analysis of transmission distortion in humans reported significant excess sharing among full siblings. Dizygotic (DZ) twin pairs are a special case of the simultaneous survival of two genotypes, and there have been reports of DZ pairs with excess allele sharing around the HLA locus, a candidate locus for embryo survival. We performed an allele-sharing study of 1,592 DZ twin pairs from two independent Australian cohorts, of which 1,561 pairs were informative for linkage on chromosome 6. We also analyzed allele sharing in 336 DZ twin pairs from The Netherlands. We found no evidence of excess allele sharing, either at the HLA locus or in the rest of the genome. In contrast, we found evidence of a small but significant (P=.003 for the Australian sample) genomewide deficit in the proportion of two alleles shared identical by descent among DZ twin pairs. We reconciled conflicting evidence in the literature for excess genomewide allele sharing by performing a simulation study that shows how undetected genotyping errors can lead to an apparent deficit or excess of allele sharing among sibling pairs, dependent on whether parental genotypes are known. Our results imply that gene-mapping studies based on affected sibling pairs that include DZ pairs will not suffer from false-positive results due to loci involved in embryo survival. [Abstract/Link to Full Text]

Riazuddin S, Ahmed ZM, Fanning AS, Lagziel A, Kitajiri S, Ramzan K, Khan SN, Chattaraj P, Friedman PL, Anderson JM, Belyantseva IA, Forge A, Riazuddin S, Friedman TB
Tricellulin is a tight-junction protein necessary for hearing.
Am J Hum Genet. 2006 Dec;79(6):1040-51.
The inner ear has fluid-filled compartments of different ionic compositions, including the endolymphatic and perilymphatic spaces of the organ of Corti; their separation from one another by epithelial barriers is required for normal hearing. TRIC encodes tricellulin, a recently discovered tight-junction (TJ) protein that contributes to the structure and function of tricellular contacts of neighboring cells in many epithelial tissues. We show that, in humans, four different recessive mutations of TRIC cause nonsyndromic deafness (DFNB49), a surprisingly limited phenotype, given the widespread tissue distribution of tricellulin in epithelial cells. In the inner ear, tricellulin is concentrated at the tricellular TJs in cochlear and vestibular epithelia, including the structurally complex and extensive junctions between supporting and hair cells. We also demonstrate that there are multiple alternatively spliced isoforms of TRIC in various tissues and that mutations of TRIC associated with hearing loss remove all or most of a conserved region in the cytosolic domain that binds to the cytosolic scaffolding protein ZO-1. A wild-type isoform of tricellulin, which lacks this conserved region, is unaffected by the mutant alleles and is hypothesized to be sufficient for structural and functional integrity of epithelial barriers outside the inner ear. [Abstract/Link to Full Text]

Li D, Parks SB, Kushner JD, Nauman D, Burgess D, Ludwigsen S, Partain J, Nixon RR, Allen CN, Irwin RP, Jakobs PM, Litt M, Hershberger RE
Mutations of presenilin genes in dilated cardiomyopathy and heart failure.
Am J Hum Genet. 2006 Dec;79(6):1030-9.
Two common disorders of the elderly are heart failure and Alzheimer disease (AD). Heart failure usually results from dilated cardiomyopathy (DCM). DCM of unknown cause in families has recently been shown to result from genetic disease, highlighting newly discovered disease mechanisms. AD is the most frequent neurodegenerative disease of older Americans. Familial AD is caused most commonly by presenilin 1 (PSEN1) or presenilin 2 (PSEN2) mutations, a discovery that has greatly advanced the field. The presenilins are also expressed in the heart and are critical to cardiac development. We hypothesized that mutations in presenilins may also be associated with DCM and that their discovery could provide new insight into the pathogenesis of DCM and heart failure. A total of 315 index patients with DCM were evaluated for sequence variation in PSEN1 and PSEN2. Families positive for mutations underwent additional clinical, genetic, and functional studies. A novel PSEN1 missense mutation (Asp333Gly) was identified in one family, and a single PSEN2 missense mutation (Ser130Leu) was found in two other families. Both mutations segregated with DCM and heart failure. The PSEN1 mutation was associated with complete penetrance and progressive disease that resulted in the necessity of cardiac transplantation or in death. The PSEN2 mutation showed partial penetrance, milder disease, and a more favorable prognosis. Calcium signaling was altered in cultured skin fibroblasts from PSEN1 and PSEN2 mutation carriers. These data indicate that PSEN1 and PSEN2 mutations are associated with DCM and heart failure and implicate novel mechanisms of myocardial disease. [Abstract/Link to Full Text]

Gurley KA, Reimer RJ, Kingsley DM
Biochemical and genetic analysis of ANK in arthritis and bone disease.
Am J Hum Genet. 2006 Dec;79(6):1017-29.
Mutations in the progressive ankylosis gene (Ank/ANKH) cause surprisingly different skeletal phenotypes in mice and humans. In mice, recessive loss-of-function mutations cause arthritis, ectopic crystal formation, and joint fusion throughout the body. In humans, some dominant mutations cause chondrocalcinosis, an adult-onset disease characterized by the deposition of ectopic joint crystals. Other dominant mutations cause craniometaphyseal dysplasia, a childhood disease characterized by sclerosis of the skull and abnormal modeling of the long bones, with little or no joint pathology. Ank encodes a multiple-pass transmembrane protein that regulates pyrophosphate levels inside and outside tissue culture cells in vitro, but its mechanism of action is not yet clear, and conflicting models have been proposed to explain the effects of the human mutations. Here, we test wild-type and mutant forms of ANK for radiolabeled pyrophosphate-transport activity in frog oocytes. We also reconstruct two human mutations in a bacterial artificial chromosome and test them in transgenic mice for rescue of the Ank null phenotype and for induction of new skeletal phenotypes. Wild-type ANK stimulates saturable transport of pyrophosphate ions across the plasma membrane, with half maximal rates attained at physiological levels of pyrophosphate. Chondrocalcinosis mutations retain apparently wild-type transport activity and can rescue the joint-fusion phenotype of Ank null mice. Craniometaphyseal dysplasia mutations do not transport pyrophosphate and cannot rescue the defects of Ank null mice. Furthermore, microcomputed tomography revealed previously unappreciated phenotypes in Ank null mice that are reminiscent of craniometaphyseal dysplasia. The combination of biochemical and genetic analyses presented here provides insight into how mutations in ANKH cause human skeletal disease. [Abstract/Link to Full Text]

Chatterjee N, Kalaylioglu Z, Moslehi R, Peters U, Wacholder S
Powerful multilocus tests of genetic association in the presence of gene-gene and gene-environment interactions.
Am J Hum Genet. 2006 Dec;79(6):1002-16.
In modern genetic epidemiology studies, the association between the disease and a genomic region, such as a candidate gene, is often investigated using multiple SNPs. We propose a multilocus test of genetic association that can account for genetic effects that might be modified by variants in other genes or by environmental factors. We consider use of the venerable and parsimonious Tukey's 1-degree-of-freedom model of interaction, which is natural when individual SNPs within a gene are associated with disease through a common biological mechanism; in contrast, many standard regression models are designed as if each SNP has unique functional significance. On the basis of Tukey's model, we propose a novel but computationally simple generalized test of association that can simultaneously capture both the main effects of the variants within a genomic region and their interactions with the variants in another region or with an environmental exposure. We compared performance of our method with that of two standard tests of association, one ignoring gene-gene/gene-environment interactions and the other based on a saturated model of interactions. We demonstrate major power advantages of our method both in analysis of data from a case-control study of the association between colorectal adenoma and DNA variants in the NAT2 genomic region, which are well known to be related to a common biological phenotype, and under different models of gene-gene interactions with use of simulated data. [Abstract/Link to Full Text]
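
As a reading aid, one common way to write a Tukey-style 1-degree-of-freedom interaction model for two sets of SNPs is given below. The notation is an assumption for illustration and is not taken from the abstract: D is disease status, S_{1j} and S_{2k} are genotype codes for SNPs in the two regions, and theta is the single interaction parameter.

```latex
\operatorname{logit} \Pr(D = 1 \mid S_1, S_2)
  = \alpha
  + \sum_{j} \beta_{1j} S_{1j}
  + \sum_{k} \beta_{2k} S_{2k}
  + \theta \Bigl( \sum_{j} \beta_{1j} S_{1j} \Bigr)
           \Bigl( \sum_{k} \beta_{2k} S_{2k} \Bigr)
```

Under this form, all pairwise interactions are governed by one parameter, theta, whereas a saturated model would need a separate coefficient for every SNP pair.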

Weedon MN, Clark VJ, Qian Y, Ben-Shlomo Y, Timpson N, Ebrahim S, Lawlor DA, Pembrey ME, Ring S, Wilkin TJ, Voss LD, Jeffery AN, Metcalf B, Ferrucci L, Corsi AM, Murray A, Melzer D, Knight B, Shields B, Smith GD, Hattersley AT, Di Rienzo A, Frayling TM
A common haplotype of the glucokinase gene alters fasting glucose and birth weight: association in six studies and population-genetics analyses.
Am J Hum Genet. 2006 Dec;79(6):991-1001.
Fasting glucose is associated with future risk of type 2 diabetes and ischemic heart disease and is tightly regulated despite considerable variation in quantity, type, and timing of food intake. In pregnancy, maternal fasting glucose concentration is an important determinant of offspring birth weight. The key determinant of fasting glucose is the enzyme glucokinase (GCK). Rare mutations of GCK cause fasting hyperglycemia and alter birth weight. The extent to which common variation of GCK explains normal variation of fasting glucose and birth weight is not known. We aimed to comprehensively define the role of variation of GCK in determination of fasting glucose and birth weight, using a tagging SNP (tSNP) approach and studying 19,806 subjects from six population-based studies. Using 22 tSNPs, we showed that the variant rs1799884 is associated with fasting glucose at all ages in the normal population and exceeded genomewide levels of significance (P=10(-9)). rs3757840 was also highly significantly associated with fasting glucose (P=8x10(-7)), but haplotype analysis revealed that this is explained by linkage disequilibrium (r(2)=0.2) with rs1799884. A maternal A allele at rs1799884 was associated with a 32-g (95% confidence interval 11-53 g) increase in offspring birth weight (P=.002). Genetic variation influencing birth weight may have conferred a selective advantage in human populations. We performed extensive population-genetics analyses to look for evidence of recent positive natural selection on patterns of GCK variation. However, we found no strong signature of positive selection. In conclusion, a comprehensive analysis of common variation of the glucokinase gene shows that this is the first gene to be reproducibly associated with fasting glucose and fetal growth. [Abstract/Link to Full Text]

Loupatty FJ, Clayton PT, Ruiter JP, Ofman R, Ijlst L, Brown GK, Thorburn DR, Harris RA, Duran M, Desousa C, Krywawych S, Heales SJ, Wanders RJ
Mutations in the gene encoding 3-hydroxyisobutyryl-CoA hydrolase results in progressive infantile neurodegeneration.
Am J Hum Genet. 2007 Jan;80(1):195-9.
Only a single patient with 3-hydroxyisobutyryl-CoA hydrolase deficiency has been described in the literature, and the molecular basis of this inborn error of valine catabolism has remained unknown until now. Here, we present a second patient with 3-hydroxyisobutyryl-CoA hydrolase deficiency, who was identified through blood spot acylcarnitine analysis showing persistently increased levels of hydroxy-C(4)-carnitine. Both patients manifested hypotonia, poor feeding, motor delay, and subsequent neurological regression in infancy. Additional features in the newly identified patient included episodes of ketoacidosis and Leigh-like changes in the basal ganglia on a magnetic resonance imaging scan. In cultured skin fibroblasts from both patients, the 3-hydroxyisobutyryl-CoA hydrolase activity was deficient, and virtually no 3-hydroxyisobutyryl-CoA hydrolase protein could be detected by western blotting. Molecular analysis in both patients uncovered mutations in the HIBCH gene, including one missense mutation in a conserved part of the protein and two mutations affecting splicing. A carefully interpreted acylcarnitine profile will allow more patients with 3-hydroxyisobutyryl-CoA hydrolase deficiency to be diagnosed. [Abstract/Link to Full Text]

Baala L, Romano S, Khaddour R, Saunier S, Smith UM, Audollent S, Ozilou C, Faivre L, Laurent N, Foliguet B, Munnich A, Lyonnet S, Salomon R, Encha-Razavi F, Gubler MC, Boddaert N, de Lonlay P, Johnson CA, Vekemans M, Antignac C, Attie-Bitach T
The Meckel-Gruber syndrome gene, MKS3, is mutated in Joubert syndrome.
Am J Hum Genet. 2007 Jan;80(1):186-94.
Joubert syndrome (JS) is an autosomal recessive disorder characterized by cerebellar vermis hypoplasia associated with hypotonia, developmental delay, abnormal respiratory patterns, and abnormal eye movements. The association of retinal dystrophy and renal anomalies defines JS type B. JS is a genetically heterogeneous condition with mutations in two genes, AHI1 and CEP290, identified to date. In addition, NPHP1 deletions identical to those that cause juvenile nephronophthisis have been identified in a subset of patients with a mild form of cerebellar and brainstem anomaly. Occipital encephalocele and/or polydactyly have occasionally been reported in some patients with JS, and these phenotypic features can also be observed in Meckel-Gruber syndrome (MKS). MKS is a rare, autosomal recessive lethal condition characterized by central nervous system malformations (typically, occipital meningoencephalocele), postaxial polydactyly, multicystic kidney dysplasia, and ductal proliferation in the portal area of the liver. Since there is obvious phenotypic overlap between JS and MKS, we hypothesized that mutations in the recently identified MKS genes, MKS1 on chromosome 17q and MKS3 on 8q, may be a cause of JS. After mutation analysis of MKS1 and MKS3 in a series of patients with JS (n=22), we identified MKS3 mutations in four patients with JS, thus defining MKS3 as the sixth JS locus (JBTS6). No MKS1 mutations were identified in this series, suggesting that the allelism is restricted to MKS3. [Abstract/Link to Full Text]

Nicodemus KK, Luna A, Shugart YY
An evaluation of power and type I error of single-nucleotide polymorphism transmission/disequilibrium-based statistical methods under different family structures, missing parental data, and population stratification.
Am J Hum Genet. 2007 Jan;80(1):178-85.
Researchers conducting family-based association studies have a wide variety of transmission/disequilibrium (TD)-based methods to choose from, but few guidelines exist in the selection of a particular method to apply to available data. Using a simulation study design, we compared the power and type I error of eight popular TD-based methods under different family structures, frequencies of missing parental data, genetic models, and population stratifications. No method was uniformly most powerful under all conditions, but type I error was appropriate for nearly every test statistic under all conditions. Power varied widely across methods, with a 46.5% difference in power observed between the most powerful and the least powerful method when 50% of families consisted of an affected sib pair and one parent genotyped under an additive genetic model and a 35.2% difference when 50% of families consisted of a single affection-discordant sibling pair without parental genotypes available under an additive genetic model. Methods were generally robust to population stratification, although some slightly less so than others. The choice of a TD-based test statistic should be dependent on the predominant family structure ascertained, the frequency of missing parental genotypes, and the assumed genetic model. [Abstract/Link to Full Text]
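
The abstract does not name the eight methods compared. As a point of reference, the classic transmission/disequilibrium test statistic (a McNemar-type chi-square on transmissions from heterozygous parents) can be sketched as below. The function name `tdt_statistic` and its argument names are hypothetical, and the sketch is not claimed to match any particular method evaluated in the study.

```python
from math import erfc, sqrt


def tdt_statistic(transmitted, untransmitted):
    """Classic TDT for one diallelic marker.

    transmitted:   times heterozygous parents passed the test allele
                   to an affected child
    untransmitted: times heterozygous parents passed the other allele
    Returns (chi-square statistic with 1 df, two-sided p-value).
    """
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative (heterozygous-parent) transmissions")
    chi2 = (transmitted - untransmitted) ** 2 / total
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


chi2, p = tdt_statistic(60, 40)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

Because only heterozygous parents contribute, this statistic needs parental genotypes, which is one reason the abstract stresses missing parental data and family structure when choosing among TD-based methods.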

Diego VP, Rainwater DL, Wang XL, Cole SA, Curran JE, Johnson MP, Jowett JB, Dyer TD, Williams JT, Moses EK, Comuzzie AG, Maccluer JW, Mahaney MC, Blangero J
Genotype x adiposity interaction linkage analyses reveal a locus on chromosome 1 for lipoprotein-associated phospholipase A2, a marker of inflammation and oxidative stress.
Am J Hum Genet. 2007 Jan;80(1):168-77.
Because obesity leads to a state of chronic, low-grade inflammation and oxidative stress, we hypothesized that the contribution of genes to variation in a biomarker of these two processes may be influenced by the degree of adiposity. We tested this hypothesis using samples from the San Antonio Family Heart Study that were assayed for activity of lipoprotein-associated phospholipase A(2) (Lp-PLA(2)), a marker of inflammation and oxidative stress. Using an approach to model discrete genotype x environment (GxE) interaction, we assigned individuals to one of two discrete diagnostic states (or "adiposity environments"): nonobese or obese, according to criteria suggested by the World Health Organization. We found a genomewide maximum LOD of 3.39 at 153 cM on chromosome 1 for Lp-PLA(2). Significant GxE interaction for Lp-PLA(2) at the genomewide maximum (P=1.16 x 10(-4)) was also found. Microarray gene-expression data were analyzed within the 1-LOD interval of the linkage signal on chromosome 1. We found two transcripts--namely, for Fc gamma receptor IIA and heat-shock protein (70 kDa)--that were significantly associated with Lp-PLA(2) (P<.001 for both) and showed evidence of cis-regulation with nominal LOD scores of 2.75 and 13.82, respectively. It would seem that there is a significant genetic response to the adiposity environment in this marker of inflammation and oxidative stress. Additionally, we conclude that GxE interaction analyses can improve our ability to identify and localize quantitative-trait loci. [Abstract/Link to Full Text]

Agrawal PB, Greenleaf RS, Tomczak KK, Lehtokari VL, Wallgren-Pettersson C, Wallefeld W, Laing NG, Darras BT, Maciver SK, Dormitzer PR, Beggs AH
Nemaline myopathy with minicores caused by mutation of the CFL2 gene encoding the skeletal muscle actin-binding protein, cofilin-2.
Am J Hum Genet. 2007 Jan;80(1):162-7.
Nemaline myopathy (NM) is a congenital myopathy characterized by muscle weakness and nemaline bodies in affected myofibers. Five NM genes, all encoding components of the sarcomeric thin filament, are known. We report identification of a sixth gene, CFL2, encoding the actin-binding protein muscle cofilin-2, which is mutated in two siblings with congenital myopathy. The proband's muscle contained characteristic nemaline bodies, as well as occasional fibers with minicores, concentric laminated bodies, and areas of F-actin accumulation. Her affected sister's muscle was reported to exhibit nonspecific myopathic changes. Cofilin-2 levels were significantly lower in the proband's muscle, and the mutant protein was less soluble when expressed in Escherichia coli, suggesting that deficiency of cofilin-2 may result in reduced depolymerization of actin filaments, causing their accumulation in nemaline bodies, minicores, and, possibly, concentric laminated bodies. [Abstract/Link to Full Text]

Valdmanis PN, Meijer IA, Reynolds A, Lei A, MacLeod P, Schlesinger D, Zatz M, Reid E, Dion PA, Drapeau P, Rouleau GA
Mutations in the KIAA0196 gene at the SPG8 locus cause hereditary spastic paraplegia.
Am J Hum Genet. 2007 Jan;80(1):152-61.
Hereditary spastic paraplegia (HSP) is a progressive upper-motor neurodegenerative disease. The eighth HSP locus, SPG8, is on chromosome 8p24.13. The three families previously linked to the SPG8 locus present with relatively severe, pure spastic paraplegia. We have identified three mutations in the KIAA0196 gene in six families that map to the SPG8 locus. One mutation, V626F, segregated in three large North American families with European ancestry and in one British family. An L619F mutation was found in a Brazilian family. The third mutation, N471D, was identified in a smaller family of European origin and lies in a spectrin domain. None of these mutations were identified in 500 control individuals. Both the L619 and V626 residues are strictly conserved across species and likely have a notable effect on the structure of the protein product strumpellin. Rescue studies with human mRNA injected in zebrafish treated with morpholino oligonucleotides to knock down the endogenous protein showed that mutations at these two residues impaired the normal function of the KIAA0196 gene. However, the function of the 1,159-aa strumpellin protein is relatively unknown. The identification and characterization of the KIAA0196 gene will enable further insight into the pathogenesis of HSP. [Abstract/Link to Full Text]

Upadhyaya M, Huson SM, Davies M, Thomas N, Chuzhanova N, Giovannini S, Evans DG, Howard E, Kerr B, Griffiths S, Consoli C, Side L, Adams D, Pierpont M, Hachen R, Barnicoat A, Li H, Wallace P, Van Biervliet JP, Stevenson D, Viskochil D, Baralle D, Haan E, Riccardi V, Turnpenny P, Lazaro C, Messiaen L
An absence of cutaneous neurofibromas associated with a 3-bp inframe deletion in exon 17 of the NF1 gene (c.2970-2972 delAAT): evidence of a clinically significant NF1 genotype-phenotype correlation.
Am J Hum Genet. 2007 Jan;80(1):140-51.
Neurofibromatosis type 1 (NF1) is characterized by cafe-au-lait spots, skinfold freckling, and cutaneous neurofibromas. No obvious relationships between small mutations (<20 bp) of the NF1 gene and a specific phenotype have previously been demonstrated, which suggests that interaction with either unlinked modifying genes and/or the normal NF1 allele may be involved in the development of the particular clinical features associated with NF1. We identified 21 unrelated probands with NF1 (14 familial and 7 sporadic cases) who were all found to have the same c.2970-2972 delAAT (p.990delM) mutation but no cutaneous neurofibromas or clinically obvious plexiform neurofibromas. Molecular analysis identified the same 3-bp inframe deletion (c.2970-2972 delAAT) in exon 17 of the NF1 gene in all affected subjects. The Delta AAT mutation is predicted to result in the loss of one of two adjacent methionines (codon 991 or 992) (Delta Met991), in conjunction with a silent ACA-->ACG change of codon 990. These two methionine residues are located in a highly conserved region of neurofibromin and are expected, therefore, to have a functional role in the protein. Our data represent results from the first study to correlate a specific small mutation of the NF1 gene to the expression of a particular clinical phenotype. The biological mechanism that relates this specific mutation to the suppression of cutaneous neurofibroma development is unknown. [Abstract/Link to Full Text]

Pearson JV, Huentelman MJ, Halperin RF, Tembe WD, Melquist S, Homer N, Brun M, Szelinger S, Coon KD, Zismann VL, Webster JA, Beach T, Sando SB, Aasly JO, Heun R, Jessen F, Kolsch H, Tsolaki M, Daniilidou M, Reiman EM, Papassotiropoulos A, Hutton ML, Stephan DA, Craig DW
Identification of the genetic basis for complex disorders by use of pooling-based genomewide single-nucleotide-polymorphism association studies.
Am J Hum Genet. 2007 Jan;80(1):126-39.
We report the development and validation of experimental methods, study designs, and analysis software for pooling-based genomewide association (GWA) studies that use high-throughput single-nucleotide-polymorphism (SNP) genotyping microarrays. We first describe a theoretical framework for establishing the effectiveness of pooling genomic DNA as a low-cost alternative to individually genotyping thousands of samples on high-density SNP microarrays. Next, we describe software called "GenePool," which directly analyzes SNP microarray probe intensity data and ranks SNPs by increased likelihood of being genetically associated with a trait or disorder. Finally, we apply these methods to experimental case-control data and demonstrate successful identification of published genetic susceptibility loci for a rare monogenic disease (sudden infant death with dysgenesis of the testes syndrome), a rare complex disease (progressive supranuclear palsy), and a common complex disease (Alzheimer disease) across multiple SNP genotyping platforms. On the basis of these theoretical calculations and their experimental validation, our results suggest that pooling-based GWA studies are a logical first step for determining whether major genetic associations exist in diseases with high heritability. [Abstract/Link to Full Text]

Zheng M, McPeek MS
Multipoint linkage-disequilibrium mapping with haplotype-block structure.
Am J Hum Genet. 2007 Jan;80(1):112-25.
The HapMap Project is providing a great deal of new information on high-resolution haplotype structure in various human populations. This information has the potential to greatly increase the power of association mapping for a fixed amount of genotyping. A number of methods have been proposed for the identification of haplotype blocks, common haplotypes, and tagging single-nucleotide polymorphisms. Here, we build on this work by developing novel methods for case-control multipoint linkage-disequilibrium (LD) mapping that gain power and speed by making explicit use of the inferred block structure. Specifically, we developed a virtual-variant approach that uses the haplotype-block information to greatly increase power for detection of untyped common variants associated with a trait. Because full multipoint LD mapping can be slow, we exploited the haplotype-block information to develop a fast single-block multipoint mapping method. Our methods are appropriate for genotype data and take into account the uncertainty in phase. We describe the methods in the context of case-parents trios, although they are also applicable to unrelated cases and controls. Our simulations indicate that the most important gains from taking into account the haplotype-block structure at the analysis stage of multipoint LD mapping come from (1) greatly increased power to detect association with untyped variants and (2) greatly improved localization of untyped variants associated with the trait. More-modest gains are obtained in improving power to detect association with a variant that is typed with a moderate amount of missing data. The methods are applied to a Crohn disease data set. [Abstract/Link to Full Text]

Baer MM, Bilstein A, Leptin M
A clonal genetic screen for mutants causing defects in larval tracheal morphogenesis in Drosophila.
Genetics. 2007 Aug;176(4):2279-91.
The initial establishment of the tracheal network in the Drosophila embryo is beginning to be understood in great detail, both in its genetic control cascades and in its cell biological events. By contrast, the vast expansion of the system during larval growth, with its extensive ramification of preexisting tracheal branches, has been analyzed less well. The mutant phenotypes of many genes involved in this process are probably not easy to reveal, as these genes may be required for other functions at earlier developmental stages. We therefore conducted a screen for defects in individual clonal homozygous mutant cells in the tracheal network of heterozygous larvae using the mosaic analysis with a repressible cell marker (MARCM) system to generate marked, recombinant mitotic clones. We describe the identification of a set of mutants with distinct phenotypic effects. In particular we found a range of defects in terminal cells, including failure in lumen formation and reduced or extensive branching. Other mutations affect cell growth, cell shape, and cell migration. [Abstract/Link to Full Text]

Mullen GP, Mathews EA, Vu MH, Hunter JW, Frisby DL, Duke A, Grundahl K, Osborne JD, Crowell JA, Rand JB
Choline transport and de novo choline synthesis support acetylcholine biosynthesis in Caenorhabditis elegans cholinergic neurons.
Genetics. 2007 Sep;177(1):195-204.
The cho-1 gene in Caenorhabditis elegans encodes a high-affinity plasma-membrane choline transporter believed to be rate limiting for acetylcholine (ACh) synthesis in cholinergic nerve terminals. We found that CHO-1 is expressed in most, but not all, cholinergic neurons in C. elegans. cho-1 null mutants are viable and exhibit mild deficits in cholinergic behavior; they are slightly resistant to the acetylcholinesterase inhibitor aldicarb, and they exhibit reduced swimming rates in liquid. cho-1 mutants also fail to sustain swimming behavior; over a 33-min time course, cho-1 mutants slow down or stop swimming, whereas wild-type animals sustain the initial rate of swimming over the duration of the experiment. A functional CHO-1::GFP fusion protein rescues these cho-1 mutant phenotypes and is enriched at cholinergic synapses. Although cho-1 mutants clearly exhibit defects in cholinergic behaviors, the loss of cho-1 function has surprisingly mild effects on cholinergic neurotransmission. However, reducing endogenous choline synthesis strongly enhances the phenotype of cho-1 mutants, giving rise to a synthetic uncoordinated phenotype. Our results indicate that both choline transport and de novo synthesis provide choline for ACh synthesis in C. elegans cholinergic neurons. [Abstract/Link to Full Text]

Cortés-Ortiz L, Duda TF, Canales-Espinosa D, García-Orduña F, Rodríguez-Luna E, Bermingham E
Hybridization in large-bodied New World primates.
Genetics. 2007 Aug;176(4):2421-5.
Well-documented cases of natural hybridization among primates are not common. In New World primates, natural hybridization has been reported only for small-bodied species, but no genotypic data have ever been gathered that confirm these reports. Here we present genetic evidence of hybridization of two large-bodied species of neotropical primates that diverged approximately 3 MYA. We used species-diagnostic mitochondrial and microsatellite loci and the Y chromosome Sry gene to determine the hybrid status of 36 individuals collected from an area of sympatry in Tabasco, Mexico. Thirteen individuals were hybrids. We show that hybridization and subsequent backcrosses are directionally biased and that the only likely cross between parental species produces fertile hybrid females, but fails to produce viable or fertile males. This system can be used as a model to study gene interchange between primate species that have not achieved complete reproductive isolation. [Abstract/Link to Full Text]

Barendse W, Harrison BE, Hawken RJ, Ferguson DM, Thompson JM, Thomas MB, Bunch RJ
Epistasis between calpain 1 and its inhibitor calpastatin within breeds of cattle.
Genetics. 2007 Aug;176(4):2601-10.
The calpain gene family and its inhibitors have diverse effects, many related to protein turnover, which appear to affect a range of phenotypes such as diabetes, exercise-induced muscle injury, and pathological events associated with degenerative neural diseases in humans, fertility, longevity, and postmortem effects on meat tenderness in livestock species. The calpains are inhibited by calpastatin, which binds directly to calpain. Here we report the direct measurement of epistatic interactions of causative mutations for quantitative trait loci (QTL) at calpain 1 (CAPN1), located on chromosome 29, with causative mutations for QTL variation at calpastatin (CAST), located on chromosome 7, in cattle. First we identified potential causative mutations at CAST and then genotyped these along with putative causative mutations at CAPN1 in >1500 cattle of seven breeds. The maximum allele substitution effect on the phenotype of the CAPN1:c.947G>C single nucleotide polymorphism (SNP) was 0.14 sigma(p) (P = 0.0003) and of the CAST:c.155C>T SNP was also 0.14 sigma(p) (P = 0.0011) when measured across breeds. We found significant epistasis between SNPs at CAPN1 and CAST in both taurine and zebu derived breeds. There were more additive x dominance components of epistasis than additive x additive and dominance x dominance components combined. A minority of breed comparisons did not show epistasis, suggesting that genetic variation at other genes may influence the degree of epistasis found in this system. [Abstract/Link to Full Text]

Kennington WJ, Hoffmann AA, Partridge L
Mapping regions within cosmopolitan inversion In(3R)Payne associated with natural variation in body size in Drosophila melanogaster.
Genetics. 2007 Sep;177(1):549-56.
Associations between genotypes for inversions and quantitative traits have been reported in several organisms, but little has been done to localize regions within inversions controlling variation in these traits. Here, we use an association mapping technique to identify genomic regions controlling variation in wing size within the cosmopolitan inversion In(3R)Payne in Drosophila melanogaster. Previous studies have shown that this inversion strongly influences variation in wing size, a trait highly correlated with body size. We found three alleles from two separate regions within In(3R)Payne with significant additive effects on wing size after the additional effect of the inversion itself had been taken into account. There were also several alleles with significant genotype-by-inversion interaction effects on wing size. None of the alleles tested had a significant additive effect on development time, suggesting different genes control these traits and that clinal patterns in them have therefore arisen independently. The presence of multiple regions within In(3R)Payne controlling size is consistent with the idea that inversions persist in populations because they contain multiple sets of locally adapted alleles, but more work needs to be done to test if they are indeed coadapted. [Abstract/Link to Full Text]

Bacaj T, Shaham S
Temporal control of cell-specific transgene expression in Caenorhabditis elegans.
Genetics. 2007 Aug;176(4):2651-5.
Cell-specific promoters allow only spatial control of transgene expression in Caenorhabditis elegans. We describe a method, using cell-specific rescue of heat-shock factor-1 (hsf-1) mutants, that allows spatial and temporal regulation of transgene expression. We demonstrate the utility of this method for timed reporter gene expression and for temporal studies of gene function. [Abstract/Link to Full Text]

Chen XL, Silver HR, Xiong L, Belichenko I, Adegite C, Johnson ES
Topoisomerase I-dependent viability loss in Saccharomyces cerevisiae mutants defective in both SUMO conjugation and DNA repair.
Genetics. 2007 Sep;177(1):17-30.
Siz1 and Siz2/Nfi1 are the two Siz/PIAS SUMO E3 ligases in Saccharomyces cerevisiae. Here we show that siz1Delta siz2Delta mutants fail to grow in the absence of the homologous recombination pathway or the Fen1 ortholog RAD27. Remarkably, the growth defects of mutants such as siz1Delta siz2Delta rad52Delta are suppressed by mutations in TOP1, suggesting that these growth defects are caused by topoisomerase I activity. Other mutants that affect SUMO conjugation, including a ulp1 mutant and the nuclear pore mutants nup60Delta and nup133Delta, show similar top1-suppressible synthetic defects with DNA repair mutants, suggesting that these phenotypes also result from reduced SUMO conjugation. siz1Delta siz2Delta mutants also display TOP1-independent genome instability phenotypes, including increased mitotic recombination and elongated telomeres. We also show that SUMO conjugation, TOP1, and RAD27 have overlapping roles in telomere maintenance. Top1 is sumoylated, but Top1 does not appear to be the SUMO substrate involved in the synthetic growth defects. However, sumoylation of certain substrates, including Top1 itself and Tri1 (YMR233W), is enhanced in the absence of Top1 activity. Sumoylation is also required for growth of top1Delta cells. These results suggest that the SUMO pathway has a complex effect on genome stability that involves several mechanistically distinct processes. [Abstract/Link to Full Text]

Wagner A
Rapid detection of positive selection in genes and genomes through variation clusters.
Genetics. 2007 Aug;176(4):2451-63.
Positive selection in genes and genomes can point to the evolutionary basis for differences among species and among races within a species. The detection of positive selection can also help identify functionally important protein regions and thus guide protein engineering. Many existing tests for positive selection are excessively conservative, vulnerable to artifacts caused by demographic population history, or computationally very intensive. I here propose a simple and rapid test that is complementary to existing tests and that overcomes some of these problems. It relies on the null hypothesis that neutrally evolving DNA regions should show a Poisson distribution of nucleotide substitutions. The test detects significant deviations from this expectation in the form of variation clusters, highly localized groups of amino acid changes in a coding region. In applying this test to several thousand human-chimpanzee gene orthologs, I show that such variation clusters are not generally caused by relaxed selection. They occur in well-defined domains of a protein's tertiary structure and show a large excess of amino acid replacement over silent substitutions. I also identify multiple new human-chimpanzee orthologs subject to positive selection, among them genes that are involved in reproductive functions, immune defense, and the nervous system. [Abstract/Link to Full Text]
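The Poisson null hypothesis described above can be illustrated with a minimal sketch. It is not the author's published procedure: the function names, the fixed-width window scan, and the absence of any multiple-testing correction are all assumptions made for illustration. Given the positions of substitutions along a region, the sketch finds the densest window and asks how surprising its count would be if substitutions were spread uniformly at the region-wide rate.

```python
import math

def poisson_upper_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)      # P(X = 0)
    cdf = term
    for i in range(1, k):      # accumulate P(X <= k - 1)
        term *= lam / i
        cdf += term
    return max(0.0, 1.0 - cdf)

def densest_window(positions, seq_len, window):
    """Return (start, count, p_value) for the window with the most substitutions.

    Hypothetical helper: each window starts at an observed substitution and
    spans [start, start + window). The p-value is the uncorrected Poisson
    upper tail under a uniform rate of len(positions) / seq_len per site.
    """
    positions = sorted(positions)
    lam = len(positions) * window / seq_len   # expected count per window
    best = (0, 0, 1.0)
    j = 0
    for i, start in enumerate(positions):
        # Move j to the first substitution outside the current window.
        while j < len(positions) and positions[j] < start + window:
            j += 1
        count = j - i
        if count > best[1]:
            best = (start, count, poisson_upper_tail(count, lam))
    return best
```

Because the densest window is chosen after looking at the data, the returned p-value is optimistic; a real analysis would need to correct for the number of windows examined.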

Ilmonen P, Penn DJ, Damjanovich K, Morrison L, Ghotbi L, Potts WK
Major histocompatibility complex heterozygosity reduces fitness in experimentally infected mice.
Genetics. 2007 Aug;176(4):2501-8.
It is often suggested that heterozygosity at major histocompatibility complex (MHC) loci confers enhanced resistance to infectious diseases (heterozygote advantage, HA, hypothesis), and overdominant selection should contribute to the evolution of these highly polymorphic genes. The evidence for the HA hypothesis is mixed and mainly from laboratory studies on inbred congenic mice, leaving the importance of MHC heterozygosity for natural populations unclear. We tested the HA hypothesis by infecting mice, produced by crossbreeding congenic C57BL/10 with wild ones, with different strains of Salmonella, both in laboratory and in large population enclosures. In the laboratory, we found that MHC influenced resistance, despite interacting wild-derived background loci. Surprisingly, resistance was mostly recessive rather than dominant, unlike in most inbred mouse strains, and it was never overdominant. In the enclosures, heterozygotes did not show better resistance, survival, or reproductive success compared to homozygotes. On the contrary, infected heterozygous females produced significantly fewer pups than homozygotes. Our results show that MHC effects are not masked on an outbred genetic background, and that MHC heterozygosity provides no immunological benefits when resistance is recessive, and can actually reduce fitness. These findings challenge the HA hypothesis and emphasize the need for studies on wild, genetically diverse species. [Abstract/Link to Full Text]

Boban M, Ljungdahl PO
Dal81 enhances Stp1- and Stp2-dependent transcription necessitating negative modulation by inner nuclear membrane protein Asi1 in Saccharomyces cerevisiae.
Genetics. 2007 Aug;176(4):2087-97.
The yeast transcription factors Stp1 and Stp2 are synthesized as latent cytoplasmic precursors. In response to extracellular amino acids, the plasma membrane SPS sensor endoproteolytically excises the N-terminal domains that mediate cytoplasmic retention, enabling the processed forms to efficiently enter the nucleus and induce gene expression. Cytoplasmic retention is not absolute: low levels of full-length Stp1 and Stp2 "leak" into the nucleus, and the concerted action of inner nuclear membrane proteins Asi1, Asi2, and Asi3 restricts their promoter access. In cells lacking Asi function, the precursor forms bind promoters and constitutively induce gene expression. To understand the requirement of Asi-dependent repression, spontaneous mutations in Required for Latent Stp1/2-mediated transcription (RLS) genes that abolish the constitutive expression of SPS sensor-regulated genes in an asi1Delta strain were selected. A single gene, allelic with DAL81, was identified. We show that Dal81 indiscriminately amplifies the transactivation potential of both full-length and processed Stp1 and Stp2 by facilitating promoter binding. In dal81Delta mutants, the repressing activity of the Asi proteins is dispensable, demonstrating that without amplification, the levels of full-length Stp1 and Stp2 that escape cytoplasmic retention are insufficient to activate transcription. Conversely, the high levels of processed Stp1 and Stp2 that accumulate in the nucleus of induced cells activate transcription in the absence of Dal81. [Abstract/Link to Full Text]

Missirlis F, Kosmidis S, Brody T, Mavrakis M, Holmberg S, Odenwald WF, Skoulakis EM, Rouault TA
Homeostatic mechanisms for iron storage revealed by genetic manipulations and live imaging of Drosophila ferritin.
Genetics. 2007 Sep;177(1):89-100.
Ferritin is a symmetric, 24-subunit iron-storage complex assembled of H and L chains. It is found in bacteria, plants, and animals; in humans, two classes of mutations in the L-chain gene result in hereditary hyperferritinemia cataract syndrome or in neuroferritinopathy. Here, we examined systemic and cellular ferritin regulation and trafficking in the model organism Drosophila melanogaster. We showed that ferritin H and L transcripts are coexpressed during embryogenesis and that both subunits are essential for embryonic development. Ferritin overexpression impaired the survival of iron-deprived flies. In vivo expression of GFP-tagged holoferritin confirmed that iron-loaded ferritin molecules traffic through the Golgi organelle and are secreted into hemolymph. A constant ratio of ferritin H and L subunits, secured via tight post-transcriptional regulation, is characteristic of the secreted ferritin in flies. Differential cellular expression, conserved post-transcriptional regulation via the iron regulatory element, and distinct subcellular localization of the ferritin subunits prior to the assembly of holoferritin are all important steps mediating iron homeostasis. Our study revealed both conserved features and insect-specific adaptations of ferritin nanocages and provides novel imaging possibilities for their in vivo characterization. [Abstract/Link to Full Text]

McDaniel SF, Willis JH, Shaw AJ
A linkage map reveals a complex basis for segregation distortion in an interpopulation cross in the moss Ceratodon purpureus.
Genetics. 2007 Aug;176(4):2489-500.
We report the construction of a linkage map for the moss Ceratodon purpureus (n = 13), based on a cross between geographically distant populations, and provide the first experimental confirmation of maternal chloroplast inheritance in bryophytes. From a mapping population of 288 recombinant haploid gametophytes, genotyped at 121 polymorphic AFLP loci, three gene-based nuclear loci, one chloroplast marker, and sex, we resolved 15 linkage groups resulting in a map length of approximately 730 cM. We estimate that the map covers more than three-quarters of the C. purpureus genome. Approximately 35% of the loci were sex linked, not including those in recombining pseudoautosomal regions. Nearly 45% of the loci exhibited significant segregation distortion (alpha = 0.05). Several pairs of unlinked distorted loci showed significant deviations from multiplicative genotypic frequencies, suggesting that distortion arises from genetic interactions among loci. The distorted autosomal loci all exhibited an excess of the maternal allele, suggesting that these interactions may involve nuclear-cytoplasmic factors. The sex ratio of the progeny was significantly male biased, and the pattern of nonrandom associations among loci indicates that this results from interactions between the sex chromosomes. These results suggest that even in interpopulation crosses, multiple mechanisms act to influence segregation ratios. [Abstract/Link to Full Text]
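As a rough illustration of what testing a locus for segregation distortion at alpha = 0.05 can look like, the sketch below applies a chi-square goodness-of-fit test to a 1:1 expectation, which is the Mendelian ratio for two alleles among haploid progeny. The abstract does not state which test the authors used, so the choice of test, the function name, and the example counts are assumptions.

```python
import math

def segregation_distortion_test(n_maternal, n_paternal):
    """Chi-square test (1 d.f.) of a 1:1 allele ratio among haploid progeny.

    Returns (chi2, p_value). Hypothetical helper for illustration only.
    """
    n = n_maternal + n_paternal
    expected = n / 2
    chi2 = ((n_maternal - expected) ** 2 + (n_paternal - expected) ** 2) / expected
    # For 1 degree of freedom the chi-square survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Made-up counts that sum to the 288 gametophytes of the mapping population.
chi2, p = segregation_distortion_test(170, 118)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

With these invented counts the statistic is about 9.39 and the locus would be called distorted at alpha = 0.05. Testing many loci at once would call for a multiple-comparison adjustment, which this sketch omits.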

King J, Armstead IP, Donnison SI, Roberts LA, Harper JA, Skøt K, Elborough K, King IP
Comparative analyses between Lolium/Festuca introgression lines and rice reveal the major fraction of functionally annotated gene models is located in recombination-poor/very recombination-poor regions of the genome.
Genetics. 2007 Sep;177(1):597-606.
Publication of the rice genome sequence has allowed an in-depth analysis of genome organization in a model monocot plant species. This has provided a powerful tool for genome analysis in large-genome unsequenced agriculturally important monocot species such as wheat, barley, rye, Lolium, etc. Previous data have indicated that the majority of genes in large-genome monocots are located toward the ends of chromosomes in gene-rich regions that undergo high frequencies of recombination. Here we demonstrate that a substantial component of the coding sequences in monocots is localized proximally in regions of very low and even negligible recombination frequencies. The implications of our findings are that during domestication of monocot plant species selection has concentrated on genes located in the terminal regions of chromosomes within areas of high recombination frequency. Thus a large proportion of the genetic variation available for selection of superior plant genotypes has not been exploited. In addition our findings raise the possibility of the evolutionary development of large supergene complexes that confer a selective advantage to the individual. [Abstract/Link to Full Text]

Roychoudhury A, Stephens M
Fast and accurate estimation of the population-scaled mutation rate, theta, from microsatellite genotype data.
Genetics. 2007 Jun;176(2):1363-6.
We present a new approach for estimation of the population-scaled mutation rate, theta, from microsatellite genotype data, using the recently introduced "product of approximate conditionals" framework. Comparisons with other methods on simulated data demonstrate that this new approach is attractive in terms of both accuracy and speed of computation. Our simulation experiments also demonstrate that, despite the theoretical advantages of full-likelihood-based methods, methods based on certain summary statistics (specifically, the sample homozygosity) can perform very competitively in practice. [Abstract/Link to Full Text]
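A summary-statistic estimator based on sample homozygosity can be sketched as follows. The abstract does not give the estimator's form, so this sketch assumes the classical moment estimator under a strict single-step stepwise mutation model, where expected homozygosity F satisfies F = 1/sqrt(1 + 2*theta) and hence theta = (1/F^2 - 1)/2. The function names and the example alleles are hypothetical.

```python
from collections import Counter

def sample_homozygosity(alleles):
    """Unbiased estimate of the chance that two distinct sampled copies match."""
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two allele copies")
    counts = Counter(alleles)
    identical_pairs = sum(c * (c - 1) for c in counts.values())
    return identical_pairs / (n * (n - 1))

def theta_from_homozygosity(alleles):
    """Moment estimate of theta, assuming a strict stepwise mutation model."""
    f = sample_homozygosity(alleles)
    if f <= 0:
        raise ValueError("no identical allele pairs; estimate is undefined")
    return (1.0 / f ** 2 - 1.0) / 2.0

# Repeat counts at one microsatellite locus (made-up data).
alleles = [10] * 6 + [11] * 4
print(theta_from_homozygosity(alleles))
```

The estimator uses only allele counts at a locus, which is why it is fast; it ignores the information in allele-size differences that likelihood-based methods exploit.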

Schmitz RJ, Hong L, Fitzpatrick KE, Amasino RM
DICER-LIKE 1 and DICER-LIKE 3 redundantly act to promote flowering via repression of FLOWERING LOCUS C in Arabidopsis thaliana.
Genetics. 2007 Jun;176(2):1359-62.
In Arabidopsis thaliana, DICER-LIKE 1 and DICER-LIKE 3 are involved in the generation of small RNAs. Double mutants between dicer-like 1 and dicer-like 3 exhibit a delay in flowering that is caused by increased expression of the floral repressor FLOWERING LOCUS C. This delayed-flowering phenotype is similar to that of autonomous-pathway mutants, and the flowering delay can be overcome by vernalization. [Abstract/Link to Full Text]

Beckwith J
What lies beyond Uranus? Preconceptions, ignorance, serendipity and suppressors in the search for biology's secrets.
Genetics. 2007 Jun;176(2):733-40. [Abstract/Link to Full Text]

Crow JF
Haldane, Bailey, Taylor and recombinant-inbred lines.
Genetics. 2007 Jun;176(2):729-32. [Abstract/Link to Full Text]

Bhat KM, Gaziova I, Krishnan S
Regulation of axon guidance by slit and netrin signaling in the Drosophila ventral nerve cord.
Genetics. 2007 Aug;176(4):2235-46.
Netrin and Slit signaling systems play opposing roles during the positioning of longitudinal tracts along the midline in the ventral nerve cord of the Drosophila embryo. It has been hypothesized that a gradient of Slit from the midline interacts with three different Robo receptors to specify the axon tract positioning. However, no such gradient has been detected. Moreover, overexpression of Slit at the midline has no effect on the positioning of these lateral tracts. In this article, we show that Slit is present outside of the midline along the longitudinal and commissural tracts. Slit from the midline, in a Robo-independent manner, is initially taken up by the commissural axon tracts when they cross the midline and is transported along the commissural tracts into the longitudinal connectives. These results are not consistent with a Slit gradient model. We also find that sli mRNA is maternally deposited and embryos that are genetically null for sli can have weaker guidance defects. Moreover, in robo or robo3 mutants, embryos with normal axon tracts are found and such robo embryos reach pupal stages and die, while robo3 mutant embryos develop into normal individuals and produce eggs. Interestingly, embryos from robo3 homozygous individuals fail to develop but have axon tracts ranging from normal to various defects: robo3 phenotype, robo phenotype, and slit-like phenotype, suggesting a more complex functional role for these genes than what has been proposed. Finally, our previous results indicated that netrin phenotype is epistatic to sli or robo phenotypes. However, it seems likely that this previously reported epistatic relationship might be due to the partial penetrance of the sli, robo, robo3 (or robo2) phenotypes. Our results argue that double mutant epistasis is most definitive only if the penetrance of the phenotypes of the mutants involved is complete. [Abstract/Link to Full Text]

Wills DM, Burke JM
Quantitative trait locus analysis of the early domestication of sunflower.
Genetics. 2007 Aug;176(4):2589-99.
Genetic analyses of the domestication syndrome have revealed that domestication-related traits typically have a very similar genetic architecture across most crops, being conditioned by a small number of quantitative trait loci (QTL), each with a relatively large effect on the phenotype. To date, the domestication of sunflower (Helianthus annuus L.) stands as the only counterexample to this pattern. In previous work involving a cross between wild sunflower (also H. annuus) and a highly improved oilseed cultivar, we found that domestication-related traits in sunflower are controlled by numerous QTL, typically of small effect. To provide insight into the minimum genetic changes required to transform the weedy common sunflower into a useful crop plant, we mapped QTL underlying domestication-related traits in a cross between a wild sunflower and a primitive Native American landrace that has not been the target of modern breeding programs. Consistent with the results of the previous study, our data indicate that the domestication of sunflower was driven by selection on a large number of loci, most of which had small to moderate phenotypic effects. Unlike the results of the previous study, however, nearly all of the QTL identified herein had phenotypic effects in the expected direction, with the domesticated allele producing a more crop-like phenotype and the wild allele producing a more wild-like phenotype. Taken together, these results are consistent with the hypothesis that selection during the post-domestication era has resulted in the introduction of apparently maladaptive alleles into the modern sunflower gene pool. [Abstract/Link to Full Text]

Lee K, Lee SE
Saccharomyces cerevisiae Sae2- and Tel1-dependent single-strand DNA formation at DNA break promotes microhomology-mediated end joining.
Genetics. 2007 Aug;176(4):2003-14.
Microhomology-mediated end joining (MMEJ) joins DNA ends via short stretches [5-20 nucleotides (nt)] of direct repeat sequences, yielding deletions of intervening sequences. Non-homologous end joining (NHEJ) and single-strand annealing (SSA) are other error prone processes that anneal single-stranded DNA (ssDNA) via a few bases (<5 nt) or extensive direct repeat homologies (>20 nt). Although the genetic components involved in MMEJ are largely unknown, those in NHEJ and SSA are characterized in some detail. Here, we surveyed the role of NHEJ or SSA factors in joining of double-strand breaks (DSBs) with no complementary DNA ends that rely primarily on MMEJ repair. We found that MMEJ requires the nuclease activity of Mre11/Rad50/Xrs2, 3' flap removal by Rad1/Rad10, Nej1, and DNA synthesis by multiple polymerases including Pol4, Rad30, Rev3, and Pol32. The mismatch repair proteins, Rad52 group genes, and Rad27 are dispensable for MMEJ. Sae2 and Tel1 promote MMEJ but inhibit NHEJ, likely by regulating Mre11-dependent ssDNA accumulation at DNA break. Our data support the role of Sae2 and Tel1 in MMEJ and genome integrity. [Abstract/Link to Full Text]

Smolikov S, Eizinger A, Hurlburt A, Rogers E, Villeneuve AM, Colaiácovo MP
Synapsis-defective mutants reveal a correlation between chromosome conformation and the mode of double-strand break repair during Caenorhabditis elegans meiosis.
Genetics. 2007 Aug;176(4):2027-33.
SYP-3 is a new structural component of the synaptonemal complex (SC) required for the regulation of chromosome synapsis. Both chromosome morphogenesis and nuclear organization are altered throughout the germlines of syp-3 mutants. Here, our analysis of syp-3 mutants provides insights into the relationship between chromosome conformation and the repair of meiotic double-strand breaks (DSBs). Although crossover recombination is severely reduced in syp-3 mutants, the production of viable offspring accompanied by the disappearance of RAD-51 foci suggests that DSBs are being repaired in these synapsis-defective mutants. Our studies indicate that once interhomolog recombination is impaired, both intersister recombination and nonhomologous end-joining pathways may contribute to repair during germline meiosis. Moreover, our studies suggest that the conformation of chromosomes may influence the mode of DSB repair employed during meiosis. [Abstract/Link to Full Text]

Knudsen B, Miyamoto MM
Incorporating experimental design and error into coalescent/mutation models of population history.
Genetics. 2007 Aug;176(4):2335-42.
Coalescent theory provides a powerful framework for estimating the evolutionary, demographic, and genetic parameters of a population from a small sample of individuals. Current coalescent models have largely focused on population genetic factors (e.g., mutation, population growth, and migration) rather than on the effects of experimental design and error. This study develops a new coalescent/mutation model that accounts for unobserved polymorphisms due to missing data, sequence errors, and multiple reads for diploid individuals. The importance of accommodating these effects of experimental design and error is illustrated with evolutionary simulations and a real data set from a population of the California sea hare. In particular, a failure to account for sequence errors can lead to overestimated mutation rates, inflated coalescent times, and inappropriate conclusions about the population. This current model can now serve as a starting point for the development of newer models with additional experimental and population genetic factors. It is currently implemented as a maximum-likelihood method, but this model may also serve as the basis for the development of Bayesian approaches that incorporate experimental design and error. [Abstract/Link to Full Text]

Simmons MJ, Niemi JB, Ryzek DF, Lamour C, Goodman JW, Kraszkiewicz W, Wolff R
Cytotype regulation by telomeric P elements in Drosophila melanogaster: interactions with P elements from M' strains.
Genetics. 2007 Aug;176(4):1957-66.
P strains of Drosophila are distinguished from M strains by having P elements in their genomes and also by having the P cytotype, a maternally inherited condition that strongly represses P-element-induced hybrid dysgenesis. The P cytotype is associated with P elements inserted near the left telomere of the X chromosome. Repression by the telomeric P elements TP5 and TP6 is significantly enhanced when these elements are crossed into M' strains, which, like P strains, carry P elements, but have little or no ability to repress dysgenesis. The telomeric and M' P elements must coexist in females for this enhanced repression ability to develop. However, once established, it is transmitted maternally to the immediate offspring independently of the telomeric P elements themselves. Females that carry a telomeric P element but that do not carry M' P elements may also transmit an ability to repress dysgenesis to their offspring independently of the telomeric P element. Cytotype regulation therefore involves a maternally transmissible product of telomeric P elements that can interact synergistically with products from paternally inherited M' P elements. This synergism between TP and M' P elements also appears to persist for at least one generation after the TP has been removed from the genotype. [Abstract/Link to Full Text]

Jones B, Grossman GD, Walsh DC, Porter BA, Avise JC, Fiumera AC
Estimating differential reproductive success from nests of related individuals, with application to a study of the mottled sculpin, Cottus bairdi.
Genetics. 2007 Aug;176(4):2427-39.
Understanding how variation in reproductive success is related to demography is a critical component in understanding the life history of an organism. Parentage analysis using molecular markers can be used to estimate the reproductive success of different groups of individuals in natural populations. Previous models have been developed for cases where offspring are random samples from the population but these models do not account for the presence of full- and half-sibs commonly found in large clutches of many organisms. Here we develop a model for comparing reproductive success among different groups of individuals that explicitly incorporates within-nest relatedness. Inference for the parameters of the model is done in a Bayesian framework, where we sample from the joint posterior of parental assignments and fertility parameters. We use computer simulations to determine how well our model recovers known parameters and investigate how various data collection scenarios (varying the number of nests or the number of offspring) affect the estimates. We then apply our model to compare reproductive success among different age groups of mottled sculpin, Cottus bairdi, from a natural population. We demonstrate that older adults are more likely to contribute to a nest and that females in the older age groups contribute more eggs to a nest than younger individuals. [Abstract/Link to Full Text]

Graze RM, Barmina O, Tufts D, Naderi E, Harmon KL, Persianinova M, Nuzhdin SV
New candidate genes for sex-comb divergence between Drosophila mauritiana and Drosophila simulans.
Genetics. 2007 Aug;176(4):2561-76.
A large-effect QTL for divergence in sex-comb tooth number between Drosophila simulans and D. mauritiana was previously mapped to 73A-84AB. Here we identify genes that are likely contributors to this divergence. We first improved the mapping resolution in the 73A-84AB region using 12 introgression lines and 62 recombinant nearly isogenic lines. To further narrow the list of candidate genes, we assayed leg-specific expression and identified genes with transcript-level evolution consistent with a potential role in sex-comb divergence. Sex combs are formed on the prothoracic (front) legs, but not on the mesothoracic (middle) legs of Drosophila males. We extracted RNA from the prothoracic and mesothoracic pupal legs of two species to determine which of the genes expressed differently between leg types were also divergent for gene expression. Two good functional candidate genes, Scr and dsx, are located in one of our fine-scale QTL regions. In addition, three previously uncharacterized genes (CG15186, CG2016, and CG2791) emerged as new candidates. These genes are located in regions strongly associated with sex-comb tooth number differences and are expressed differently between leg tissues and between species. Further supporting the potential involvement of these genes in sex-comb divergence, we found a significant difference in sex-comb tooth number between co-isogenic D. melanogaster lines with and without P-element insertions at CG2791. [Abstract/Link to Full Text]

Beisel CJ, Rokyta DR, Wichman HA, Joyce P
Testing the extreme value domain of attraction for distributions of beneficial fitness effects.
Genetics. 2007 Aug;176(4):2441-9.
In modeling evolutionary genetics, it is often assumed that mutational effects are assigned according to a continuous probability distribution, and multiple distributions have been used with varying degrees of justification. For mutations with beneficial effects, the distribution currently favored is the exponential distribution, in part because it can be justified in terms of extreme value theory, since beneficial mutations should have fitnesses in the extreme right tail of the fitness distribution. While the appeal to extreme value theory seems justified, the exponential distribution is but one of three possible limiting forms for tail distributions, with the other two loosely corresponding to distributions with right-truncated tails and those with heavy tails. We describe a likelihood-ratio framework for analyzing the fitness effects of beneficial mutations, focusing on testing the null hypothesis that the distribution is exponential. We also describe how to account for missing the smallest-effect mutations, which are often difficult to identify experimentally. This technique makes it possible to apply the test to gain-of-function mutations, where the ancestral genotype is unable to grow under the selective conditions. We also describe how to pool data across experiments, since we expect few possible beneficial mutations in any particular experiment. [Abstract/Link to Full Text]

Li Y, Li Y, Wu S, Han K, Wang Z, Hou W, Zeng Y, Wu R
Estimation of multilocus linkage disequilibria in diploid populations with dominant markers.
Genetics. 2007 Jul;176(3):1811-21.
Analysis of population structure and organization with DNA-based markers can provide important information regarding the history and evolution of a species. Linkage disequilibrium (LD) analysis based on allelic associations between different loci is emerging as a viable tool to unravel the genetic basis of population differentiation. In this article, we derive the EM algorithm to obtain the maximum-likelihood estimates of the linkage disequilibria between dominant markers, to study the patterns of genetic diversity for a diploid species. The algorithm was expanded to estimate and test linkage disequilibria of different orders among three dominant markers and can be technically extended to manipulate an arbitrary number of dominant markers. The feasibility of the proposed algorithm is validated by an example of population genetic studies of hickory trees, native to southeastern China, using dominant random amplified polymorphic DNA markers. Extensive simulation studies were performed to investigate the statistical properties of this algorithm. The precision of the estimates of linkage disequilibrium between dominant markers was compared with that between codominant markers. Results from simulation studies suggest that three-locus LD analysis displays increased power of LD detection relative to two-locus LD analysis. This algorithm is useful for studying the pattern and amount of genetic variation within and among populations. [Abstract/Link to Full Text]

Marciano DC, Karkouti OY, Palzkill T
A fitness cost associated with the antibiotic resistance enzyme SME-1 beta-lactamase.
Genetics. 2007 Aug;176(4):2381-92.
The bla(TEM-1) beta-lactamase gene has become widespread due to the selective pressure of beta-lactam use and its stable maintenance on transferable DNA elements. In contrast, bla(SME-1) is rarely isolated and is confined to the chromosome of carbapenem-resistant Serratia marcescens strains. Dissemination of bla(SME-1) via transfer to a mobile DNA element could hinder the use of carbapenems. In this study, bla(SME-1) was determined to impart a fitness cost upon Escherichia coli in multiple genetic contexts and assays. Genetic screens and designed SME-1 mutants were utilized to identify the source of this fitness cost. These experiments established that the SME-1 protein was required for the fitness cost but also that the enzyme activity of SME-1 was not associated with the fitness cost. The genetic screens suggested that the SME-1 signal sequence was involved in the fitness cost. Consistent with these findings, exchange of the SME-1 signal sequence for the TEM-1 signal sequence alleviated the fitness cost while replacing the TEM-1 signal sequence with the SME-1 signal sequence imparted a fitness cost to TEM-1 beta-lactamase. Taken together, these results suggest that fitness costs associated with some beta-lactamases may limit their dissemination. [Abstract/Link to Full Text]

Jensen JD, Thornton KR, Bustamante CD, Aquadro CF
On the utility of linkage disequilibrium as a statistic for identifying targets of positive selection in nonequilibrium populations.
Genetics. 2007 Aug;176(4):2371-9.
A critically important challenge in empirical population genetics is distinguishing neutral nonequilibrium processes from selective forces that produce similar patterns of variation. We here examine the extent to which linkage disequilibrium (i.e., nonrandom associations between markers) improves this discrimination. We show that patterns of linkage disequilibrium recently proposed to be unique to hitchhiking models are replicated under nonequilibrium neutral models. We also demonstrate that jointly considering spatial patterns of association among variants alongside the site-frequency spectrum is nonetheless of value. Through a comparison of models of equilibrium neutrality, nonequilibrium neutrality, equilibrium hitchhiking, nonequilibrium hitchhiking, and recurrent hitchhiking, we evaluate a linkage disequilibrium (LD) statistic (omega(max)) that appears to have power to identify regions recently shaped by positive selection. Most notably, for demographic parameters relevant to non-African populations of Drosophila melanogaster, we demonstrate that selected loci are distinguishable from neutral loci using this statistic. [Abstract/Link to Full Text]

Orgil U, Araki H, Tangchaiburana S, Berkey R, Xiao S
Intraspecific genetic variations, fitness cost and benefit of RPW8, a disease resistance locus in Arabidopsis thaliana.
Genetics. 2007 Aug;176(4):2317-33.
The RPW8 locus of Arabidopsis thaliana confers broad-spectrum resistance to powdery mildew pathogens. In many A. thaliana accessions, this locus contains two homologous genes, RPW8.1 and RPW8.2. In some susceptible accessions, however, these two genes are replaced by HR4, a homolog of RPW8.1. Here, we show that RPW8.2 from A. lyrata conferred powdery mildew resistance in A. thaliana, suggesting that RPW8.2 might have gained the resistance function before the speciation of A. thaliana and A. lyrata. To investigate how RPW8 has been maintained in A. thaliana, we examined the nucleotide sequence polymorphisms in RPW8 from 51 A. thaliana accessions, related disease reaction phenotypes to the evolutionary history of RPW8.1 and RPW8.2, and identified mutations that confer phenotypic variations. The average nucleotide diversities were high at RPW8.1 and RPW8.2, showing no sign of selective sweep. Moreover, we found that expression of RPW8 incurs fitness benefits and costs on A. thaliana in the presence and absence of the pathogens, respectively. Our results suggest that polymorphisms at the RPW8 locus in A. thaliana may have been maintained by complex selective forces, including those from the fitness benefits and costs both associated with RPW8. [Abstract/Link to Full Text]

Díaz-Mejía JJ, Pérez-Rueda E, Segovia L
A network perspective on the evolution of metabolism by gene duplication.
Genome Biol. 2007;8(2):R26.
BACKGROUND: Gene duplication followed by divergence is one of the main sources of metabolic versatility. The patchwork and stepwise models of metabolic evolution help us to understand these processes, but their assumptions are relatively simplistic. We used a network-based approach to determine the influence of metabolic constraints on the retention of duplicated genes. RESULTS: We detected duplicated genes by looking for enzymes sharing homologous domains and uncovered an increased retention of duplicates for enzymes catalyzing consecutive reactions, as illustrated by the ligases acting in the biosynthesis of peptidoglycan. As a consequence, metabolic networks show a high retention of duplicates within functional modules, and we found a preferential biochemical coupling of reactions that partially explains this bias. A similar situation was found in enzyme-enzyme interaction networks, but not in interaction networks of non-enzymatic proteins or gene transcriptional regulatory networks, suggesting that the retention of duplicates results from the biochemical rules governing substrate-enzyme-product relationships. We confirmed a high retention of duplicates between chemically similar reactions, as illustrated by fatty-acid metabolism. The retention of duplicates between chemically dissimilar reactions is, however, also greater than expected by chance. Finally, we detected a significant retention of duplicates as groups, instead of single pairs. CONCLUSION: Our results indicate that in silico modeling of the origin and evolution of metabolism is improved by the inclusion of specific functional constraints, such as the preferential biochemical coupling of reactions. We suggest that the stepwise and patchwork models are not independent of each other: in fact, the network perspective enables us to reconcile and combine these models. [Abstract/Link to Full Text]

Hovatta I, Zapala MA, Broide RS, Schadt EE, Libiger O, Schork NJ, Lockhart DJ, Barlow C
DNA variation and brain region-specific expression profiles exhibit different relationships between inbred mouse strains: implications for eQTL mapping studies.
Genome Biol. 2007;8(2):R25.
BACKGROUND: Expression quantitative trait locus (eQTL) mapping is used to find loci that are responsible for the transcriptional activity of a particular gene. In recent eQTL studies, expression profiles were derived from either homogenized whole brain or collections of large brain regions. However, the brain is a very heterogeneous organ, and expression profiles of different brain regions vary significantly. Because of the importance and potential power of eQTL studies in identifying regulatory networks, we analyzed gene expression patterns in different brain regions from multiple inbred mouse strains and investigated the implications for the design and analysis of eQTL studies. RESULTS: Gene expression profiles of five brain regions in six inbred mouse strains were studied. Few genes exhibited a significant strain-specific expression pattern, whereas a large number of genes exhibited brain region-specific patterns. We constructed phylogenetic trees based on the expression relationships between the strains and compared them with a DNA-level relationship tree. The trees based on the expression of strain-specific genes were constant across brain regions and mirrored DNA-level variation. However, the trees based on region-specific genes exhibited a different set of strain relationships, depending on the brain region. An eQTL analysis showed enrichment of cis-acting regulators among strain-specific genes, whereas brain region-specific genes appear to be mainly regulated by trans-acting elements. CONCLUSION: Our results suggest that many regulatory networks are highly brain region specific and indicate the importance of conducting eQTL mapping studies using data from brain regions or tissues that are physiologically and phenotypically relevant to the trait of interest. [Abstract/Link to Full Text]

Gupta S, Stamatoyannopoulos JA, Bailey TL, Noble WS
Quantifying similarity between motifs.
Genome Biol. 2007;8(2):R24.
A common question within the context of de novo motif discovery is whether a newly discovered, putative motif resembles any previously discovered motif in an existing database. To answer this question, we define a statistical measure of motif-motif similarity, and we describe an algorithm, called Tomtom, for searching a database of motifs with a given query motif. Experimental simulations demonstrate the accuracy of Tomtom's E values and its effectiveness in finding similar motifs. [Abstract/Link to Full Text]
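The idea of comparing a query motif against a database motif can be illustrated with a minimal sketch. It slides one position-frequency matrix along another without gaps and reports the offset with the smallest mean column-wise Euclidean distance. This is not Tomtom's statistic: the abstract describes a statistical measure with E values, whereas the function names, the distance choice, and the `min_overlap` parameter here are illustrative assumptions only.

```python
import math

def column_distance(a, b):
    """Euclidean distance between two motif columns (A, C, G, T frequencies)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def best_alignment(query, target, min_overlap=2):
    """Return (mean_distance, offset) for the best ungapped alignment.

    offset is the target column aligned with query column 0.
    Hypothetical helper for illustration; raw distances are not p-values.
    """
    best = None
    for offset in range(-(len(query) - min_overlap), len(target) - min_overlap + 1):
        # Collect the column pairs that overlap at this offset.
        pairs = [(query[i], target[i + offset])
                 for i in range(len(query)) if 0 <= i + offset < len(target)]
        score = sum(column_distance(a, b) for a, b in pairs) / len(pairs)
        if best is None or score < best[0]:
            best = (score, offset)
    return best
```

A raw mean distance favors short overlaps and ignores column information content, which is one reason a calibrated statistical measure is preferable in practice.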

Moses AM, Hériché JK, Durbin R
Clustering of phosphorylation site recognition motifs can be exploited to predict the targets of cyclin-dependent kinase.
Genome Biol. 2007;8(2):R23.
Protein kinases are critical to cellular signalling and post-translational gene regulation, but their biological substrates are difficult to identify. We show that cyclin-dependent kinase (CDK) consensus motifs are frequently clustered in CDK substrate proteins. Based on this, we introduce a new computational strategy to predict the targets of CDKs and use it to identify new biologically interesting candidates. Our data suggest that regulatory modules may exist in protein sequence as clusters of short sequence motifs. [Abstract/Link to Full Text]
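The clustering idea can be sketched as follows, assuming the commonly cited full CDK consensus S/T-P-x-K/R and a simple sliding window over site positions. The abstract does not give the authors' scoring scheme, so the function names and the window parameter below are hypothetical and serve only to illustrate counting clustered motifs in a protein sequence.

```python
import re

# Full CDK consensus [ST]-P-x-[KR]; the lookahead lets overlapping sites be found.
CDK_FULL = re.compile(r"(?=[ST]P.[KR])")

def motif_positions(seq):
    """Start indices (0-based) of every consensus match in a protein sequence."""
    return [m.start() for m in CDK_FULL.finditer(seq)]

def max_cluster(positions, window):
    """Largest number of sites whose start positions span fewer than `window` residues."""
    best = 0
    left = 0
    for right in range(len(positions)):
        # Shrink the window from the left until it spans fewer than `window` residues.
        while positions[right] - positions[left] >= window:
            left += 1
        best = max(best, right - left + 1)
    return best
```

A protein whose `max_cluster` value is high relative to its length would be a candidate under this simplified view; a real predictor would also need a background model for how often such clusters arise by chance.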

Kingsford CL, Ayanbule K, Salzberg SL
Rapid, accurate, computational discovery of Rho-independent transcription terminators illuminates their relationship to DNA uptake.
Genome Biol. 2007;8(2):R22.
BACKGROUND: In many prokaryotes, transcription of DNA to RNA is terminated by a thymine-rich stretch of DNA following a hairpin loop. Detecting such Rho-independent transcription terminators can shed light on the organization of bacterial genomes and can improve genome annotation. Previous computational methods to predict Rho-independent terminators have been slow or limited in the organisms they consider. RESULTS: We describe TransTermHP, a new computational method to rapidly and accurately detect Rho-independent transcription terminators. We predict the locations of terminators in 343 prokaryotic genomes, representing the largest collection of predictions available. In Bacillus subtilis, we can detect 93% of known terminators with a false positive rate of just 6%, comparable to the best-known methods. Outside the Firmicutes division, we find that Rho-independent termination plays a large role in the Neisseria and Vibrio genera, the Pasteurellaceae (including the Haemophilus genus) and several other species. In Neisseria and Pasteurellaceae, terminator hairpins are frequently formed by closely spaced, complementary instances of exogenous DNA uptake signal sequences. We quantify the propensity for terminators to include these sequences. In the process, we provide the first discussion of potential uptake signals in Haemophilus ducreyi and Mannheimia succiniciproducens, and we discuss the preference for a particular configuration of uptake signal sequences within terminators. CONCLUSION: Our new fast and accurate method for detecting transcription terminators has allowed us to identify and analyze terminators in many new genomes and to identify DNA uptake signal sequences in several species where they have not been previously reported. Our software and predictions are freely available. [Abstract/Link to Full Text]
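The signal described in the background (a hairpin followed by a thymine-rich stretch) can be illustrated with a deliberately naive scan. This is not the TransTermHP algorithm, whose scoring is not given in the abstract. The exact-match stem, the loop range, and the thresholds below are assumptions chosen only to make the example concrete.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(s):
    """Reverse complement of a DNA string."""
    return s.translate(COMPLEMENT)[::-1]

def candidate_terminators(seq, stem=5, loop_range=(3, 8), tail=8, min_t=5):
    """Return (hairpin_start, hairpin_end) pairs for perfect-stem hairpins
    that are followed by at least `min_t` thymines in the next `tail` bases.
    All parameters are illustrative assumptions.
    """
    hits = []
    for i in range(len(seq) - stem):
        left = seq[i:i + stem]
        for loop in range(loop_range[0], loop_range[1] + 1):
            j = i + stem + loop      # start of the right stem arm
            end = j + stem           # first base after the hairpin
            if end + tail > len(seq):
                break
            if seq[j:end] == revcomp(left) and seq[end:end + tail].count("T") >= min_t:
                hits.append((i, end))
    return hits
```

Real terminator hairpins tolerate mismatches and bulges, so a practical method scores stem stability rather than requiring an exact reverse complement.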

Gazave E, Marqués-Bonet T, Fernando O, Charlesworth B, Navarro A
Patterns and rates of intron divergence between humans and chimpanzees.
Genome Biol. 2007;8(2):R21.
BACKGROUND: Introns, which constitute the largest fraction of eukaryotic genes and which had been considered to be neutral sequences, are increasingly acknowledged as having important functions. Several studies have investigated levels of evolutionary constraint along introns and across classes of introns of different length and location within genes. However, thus far these studies have yielded contradictory results. RESULTS: We present the first analysis of human-chimpanzee intron divergence, in which differences in the number of substitutions per intronic site (Ki) can be interpreted as the footprint of different intensities and directions of the pressures of natural selection. Our main findings are as follows: there was a strong positive correlation between intron length and divergence; there was a strong negative correlation between intron length and GC content; and divergence rates vary along introns and depending on their ordinal position within genes (for instance, first introns are more GC rich, longer and more divergent, and divergence is lower at the 3' and 5' ends of all types of introns). CONCLUSION: We show that the higher divergence of first introns is related to their larger size. Also, the lower divergence of short introns suggests that they may harbor a relatively greater proportion of regulatory elements than long introns. Moreover, our results are consistent with the presence of functionally relevant sequences near the 5' and 3' ends of introns. Finally, our findings suggest that other parts of introns may also be under selective constraints. [Abstract/Link to Full Text]

Deshayes C, Perrodou E, Gallien S, Euphrasie D, Schaeffer C, Van-Dorsselaer A, Poch O, Lecompte O, Reyrat JM
Interrupted coding sequences in Mycobacterium smegmatis: authentic mutations or sequencing errors?
Genome Biol. 2007;8(2):R20.
BACKGROUND: In silico analysis has shown that all bacterial genomes contain a low percentage of ORFs with undetected frameshifts and in-frame stop codons. These interrupted coding sequences (ICDSs) may really be present in the organism or may result from misannotation based on sequencing errors. The reality or otherwise of these sequences has major implications for all subsequent functional characterization steps, including module prediction, comparative genomics and high-throughput proteomic projects. RESULTS: We show here, using Mycobacterium smegmatis as a model species, that a significant proportion of these ICDSs result from sequencing errors. We used a resequencing procedure and mass spectrometry analysis to determine the nature of a number of ICDSs in this organism. We found that 28 of the 73 ICDSs investigated correspond to sequencing errors. CONCLUSION: The correction of these errors results in modification of the predicted amino acid sequences of the corresponding proteins and changes in annotation. We suggest that each bacterial ICDS should be investigated individually, to determine its true status and to ensure that the genome sequence is appropriate for comparative genomics analyses. [Abstract/Link to Full Text]

Hellemans J, Mortier G, De Paepe A, Speleman F, Vandesompele J
qBase relative quantification framework and software for management and automated analysis of real-time quantitative PCR data.
Genome Biol. 2007;8(2):R19.
Although quantitative PCR (qPCR) is becoming the method of choice for expression profiling of selected genes, accurate and straightforward processing of the raw measurements remains a major hurdle. Here we outline advanced and universally applicable models for relative quantification and inter-run calibration with proper error propagation along the entire calculation track. These models and algorithms are implemented in qBase, a free program for the management and automated analysis of qPCR data. [Abstract/Link to Full Text]
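As a rough illustration of relative quantification, the sketch below applies the widely used efficiency-corrected delta-Ct calculation and normalizes by the geometric mean of reference-gene quantities. It omits inter-run calibration and error propagation, which the abstract names as part of the framework. The function names are hypothetical, and it is an assumption that this matches the models implemented in qBase.

```python
import math

def relative_quantity(efficiency, ct_calibrator, ct_sample):
    """Efficiency-corrected relative quantity.

    `efficiency` is the amplification base per cycle (2.0 means perfect doubling).
    """
    return efficiency ** (ct_calibrator - ct_sample)

def normalized_relative_quantity(target_rq, reference_rqs):
    """Divide the target quantity by the geometric mean of the reference-gene quantities."""
    log_mean = sum(math.log(r) for r in reference_rqs) / len(reference_rqs)
    return target_rq / math.exp(log_mean)
```

For example, a sample that crosses threshold two cycles earlier than the calibrator at perfect efficiency has a relative quantity of 4 before normalization.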

Haddrill PR, Halligan DL, Tomaras D, Charlesworth B
Reduced efficacy of selection in regions of the Drosophila genome that lack crossing over.
Genome Biol. 2007;8(2):R18.
BACKGROUND: The recombinational environment is predicted to influence patterns of protein sequence evolution through the effects of Hill-Robertson interference among linked sites subject to selection. In freely recombining regions of the genome, selection should more effectively incorporate new beneficial mutations, and eliminate deleterious ones, than in regions with low rates of genetic recombination. RESULTS: We examined the effects of recombinational environment on patterns of evolution using a genome-wide comparison of Drosophila melanogaster and D. yakuba. In regions of the genome with no crossing over, we find elevated divergence at nonsynonymous sites and in long introns, a virtual absence of codon usage bias, and an increase in gene length. However, we find little evidence for differences in patterns of evolution between regions with high, intermediate, and low crossover frequencies. In addition, genes on the fourth chromosome exhibit more extreme deviations from regions with crossing over than do other, no crossover genes outside the fourth chromosome. CONCLUSION: All of the patterns observed are consistent with a severe reduction in the efficacy of selection in the absence of crossing over, resulting in the accumulation of deleterious mutations in these regions. Our results also suggest that even a very low frequency of crossing over may be enough to maintain the efficacy of selection. [Abstract/Link to Full Text]

Zhao X, Xuan Z, Zhang MQ
Boosting with stumps for predicting transcription start sites.
Genome Biol. 2007;8(2):R17.
Promoter prediction is a difficult but important problem in gene finding, and it is critical for elucidating the regulation of gene expression. We introduce a new promoter prediction program, CoreBoost, which applies a boosting technique with stumps to select important small-scale as well as large-scale features. CoreBoost improves greatly on locating transcription start sites. We also demonstrate that by further utilizing some tissue-specific information, better accuracy can be achieved. [Abstract/Link to Full Text]
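Boosting with stumps can be sketched in a few lines. The example uses discrete AdaBoost with one-feature threshold classifiers, which is an assumption: the abstract does not say which boosting variant or which features CoreBoost uses. All names are hypothetical and labels are taken to be +1 or -1.

```python
import math

def stump_predict(row, feature, threshold, polarity):
    """A decision stump: one feature, one threshold, one sign."""
    return polarity if row[feature] >= threshold else -polarity

def fit_stump(X, y, w):
    """Return (weighted_error, feature, threshold, polarity) of the best stump."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({row[f] for row in X}):
            for pol in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict(row, f, thr, pol) != yi)
                if best is None or err < best[0]:
                    best = (err, f, thr, pol)
    return best

def adaboost(X, y, rounds=10):
    """Train a weighted ensemble of stumps; returns a list of (alpha, f, thr, pol)."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        err, f, thr, pol = fit_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)   # avoid division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, f, thr, pol))
        # Increase the weight of misclassified examples, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(row, f, thr, pol))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, row):
    score = sum(a * stump_predict(row, f, thr, pol) for a, f, thr, pol in model)
    return 1 if score >= 0 else -1
```

Because each round selects a single feature, the trained ensemble also acts as a feature selector, which fits the abstract's description of selecting important small-scale and large-scale features.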

Podell S, Gaasterland T
DarkHorse: a method for genome-wide prediction of horizontal gene transfer.
Genome Biol. 2007;8(2):R16.
A new approach to rapid, genome-wide identification and ranking of horizontal transfer candidate proteins is presented. The method is quantitative, reproducible, and computationally undemanding. It can be combined with genomic signature and/or phylogenetic tree-building procedures to improve accuracy and efficiency. The method is also useful for retrospective assessments of horizontal transfer prediction reliability, recognizing orthologous sequences that may have been previously overlooked or unavailable. These features are demonstrated in bacterial, archaeal, and eukaryotic examples. [Abstract/Link to Full Text]

Vavouri T, Walter K, Gilks WR, Lehner B, Elgar G
Parallel evolution of conserved non-coding elements that target a common set of developmental regulatory genes from worms to humans.
Genome Biol. 2007;8(2):R15.
BACKGROUND: The human genome contains thousands of non-coding sequences that are often more conserved between vertebrate species than protein-coding exons. These highly conserved non-coding elements (CNEs) are associated with genes that coordinate development, and have been proposed to act as transcriptional enhancers. Despite their extreme sequence conservation in vertebrates, sequences homologous to CNEs have not been identified in invertebrates. RESULTS: Here we report that nematode genomes contain an alternative set of CNEs that share sequence characteristics, but not identity, with their vertebrate counterparts. CNEs thus represent a very unusual class of sequences that are extremely conserved within specific animal lineages yet are highly divergent between lineages. Nematode CNEs are also associated with developmental regulatory genes, and include well-characterized enhancers and transcription factor binding sites, supporting the proposed function of CNEs as cis-regulatory elements. Most remarkably, 40 of 156 human CNE-associated genes with invertebrate orthologs are also associated with CNEs in both worms and flies. CONCLUSION: A core set of genes that regulate development is associated with CNEs across three animal groups (worms, flies and vertebrates). We propose that these CNEs reflect the parallel evolution of alternative enhancers for a common set of developmental regulatory genes in different animal groups. This 're-wiring' of gene regulatory networks containing key developmental coordinators was probably a driving force during the evolution of animal body plans. CNEs may, therefore, represent the genomic traces of these 'hard-wired' core gene regulatory networks that specify the development of each alternative animal body plan. [Abstract/Link to Full Text]

Toscano CD, Prabhu VV, Langenbach R, Becker KG, Bosetti F
Differential gene expression patterns in cyclooxygenase-1 and cyclooxygenase-2 deficient mouse brain.
Genome Biol. 2007;8(1):R14.
BACKGROUND: Cyclooxygenase (COX)-1 and COX-2 produce prostanoids from arachidonic acid and are thought to have important yet distinct roles in normal brain function. Deletion of COX-1 or COX-2 results in profound differences both in brain levels of prostaglandin E2 and in activation of the transcription factor nuclear factor-kappaB, suggesting that COX-1 and COX-2 play distinct roles in brain arachidonic acid metabolism and regulation of gene expression. To further elucidate the role of COX isoforms in the regulation of the brain transcriptome, microarray analysis of gene expression in the cerebral cortex and hippocampus of mice deficient in COX-1 (COX-1-/-) or COX-2 (COX-2-/-) was performed. RESULTS: A majority (>93%) of the differentially expressed genes in both the cortex and hippocampus were altered in one COX isoform knockout mouse but not the other. The major gene function affected in all genotype comparisons was 'transcriptional regulation'. Distinct biologic and metabolic pathways that were altered in COX-/- mice included beta oxidation, methionine metabolism, janus kinase signaling, and GABAergic neurotransmission. CONCLUSION: Our findings suggest that COX-1 and COX-2 differentially modulate brain gene expression. Because certain anti-inflammatory and analgesic treatments are based on inhibition of COX activity, the specific alterations observed in this study further our understanding of the relationship of COX-1 and COX-2 with signaling pathways in brain and of the therapeutic and toxicologic consequences of COX inhibition. [Abstract/Link to Full Text]

Elsik CG, Mackey AJ, Reese JT, Milshina NV, Roos DS, Weinstock GM
Creating a honey bee consensus gene set.
Genome Biol. 2007;8(1):R13.
BACKGROUND: We wished to produce a single reference gene set for honey bee (Apis mellifera). Our motivation was twofold. First, we wished to obtain an improved set of gene models with increased coverage of known genes, while maintaining gene model quality. Second, we wished to provide a single official gene list that the research community could further utilize for consistent and comparable analyses and functional annotation. RESULTS: We created a consensus gene set for honey bee (Apis mellifera) using GLEAN, a new algorithm that uses latent class analysis to automatically combine disparate gene prediction evidence in the absence of known genes. The consensus gene models had increased representation of honey bee genes without sacrificing quality compared with any one of the input gene predictions. When compared with manually annotated gold standards, the consensus set of gene models was similar or superior in quality to each of the input sets. CONCLUSION: Most eukaryotic genome projects produce multiple gene sets because of the variety of gene prediction programs. Each of the gene prediction programs has strengths and weaknesses, and so the multiplicity of gene sets offers users a more comprehensive collection of genes to use than is available from a single program. On the other hand, the availability of multiple gene sets is also a cause for uncertainty among users as regards which set they should use. GLEAN proved to be an effective method to combine gene lists into a single reference set. [Abstract/Link to Full Text]

Liston A, Hardy K, Pittelkow Y, Wilson SR, Makaroff LE, Fahrer AM, Goodnow CC
Impairment of organ-specific T cell negative selection by diabetes susceptibility genes: genomic analysis by mRNA profiling.
Genome Biol. 2007;8(1):R12.
BACKGROUND: T cells in the thymus undergo opposing positive and negative selection processes so that the only T cells entering circulation are those bearing a T cell receptor (TCR) with a low affinity for self. The mechanism differentiating negative from positive selection is poorly understood, despite the fact that inherited defects in negative selection underlie organ-specific autoimmune disease in AIRE-deficient people and the non-obese diabetic (NOD) mouse strain. RESULTS: Here we use homogeneous populations of T cells undergoing either positive or negative selection in vivo together with genome-wide transcription profiling on microarrays to identify the gene expression differences underlying negative selection to an Aire-dependent organ-specific antigen, including the upregulation of a genomic cluster in the cytogenetic band 2F. Analysis of defective negative selection in the autoimmune-prone NOD strain demonstrates a global impairment in the induction of the negative selection response gene set, but little difference in positive selection response genes. Combining expression differences with genetic linkage data, we identify differentially expressed candidate genes, including Bim, Bnip3, Smox, Pdrg1, Id1, Pdcd1, Ly6c, Pdia3, Trim30 and Trim12. CONCLUSION: The data provide a molecular map of the negative selection response in vivo and, by analysis of deviations from this pathway in the autoimmune susceptible NOD strain, suggest that susceptibility arises from small expression differences in genes acting at multiple points in the pathway between the TCR and cell death. [Abstract/Link to Full Text]

Bai Y, Casola C, Feschotte C, Betrán E
Comparative genomics reveals a constant rate of origination and convergent acquisition of functional retrogenes in Drosophila.
Genome Biol. 2007;8(1):R11.
BACKGROUND: Processed copies of genes (retrogenes) are duplicate genes that originated through the reverse-transcription of a host transcript and insertion in the genome. This type of gene duplication, as any other, could be a source of new genes and functions. Using whole genome sequence data for 12 Drosophila species, we dated the origin of 94 retroposition events that gave rise to candidate functional genes in D. melanogaster. RESULTS: Based on this analysis, we infer that functional retrogenes have emerged at a fairly constant rate of 0.5 genes per million years per lineage over the last approximately 63 million years of Drosophila evolution. The number of functional retrogenes and the rate at which they are recruited in the D. melanogaster lineage are of the same order of magnitude as those estimated in the human lineage, despite the higher deletion bias in the Drosophila genome. However, unlike primates, the rate of retroposition in Drosophila seems to be fairly constant and no burst of retroposition can be inferred from our analyses. In addition, our data also support an important role for retrogenes as a source of lineage-specific male functions, in agreement with previous hypotheses. Finally, we identified three cases of functional retrogenes in D. melanogaster that have been independently retroposed and recruited in parallel as new genes in other Drosophila lineages. CONCLUSION: Together, these results indicate that retroposition is a persistent mechanism and a recurrent pathway for the emergence of new genes in Drosophila. [Abstract/Link to Full Text]

Raes J, Korbel JO, Lercher MJ, von Mering C, Bork P
Prediction of effective genome size in metagenomic samples.
Genome Biol. 2007;8(1):R10.
We introduce a novel computational approach to predict effective genome size (EGS; a measure that includes multiple plasmid copies, inserted sequences, and associated phages and viruses) from short sequencing reads of environmental genomics (or metagenomics) projects. We observe considerable EGS differences between environments and link this with ecologic complexity as well as species composition (for instance, the presence of eukaryotes). For example, we estimate EGS in a complex, organism-dense farm soil sample at about 6.3 megabases (Mb) whereas that of the bacteria therein is only 4.7 Mb; for bacteria in a nutrient-poor, organism-sparse ocean surface water sample, EGS is as low as 1.6 Mb. The method also permits evaluation of completion status and assembly bias in single-genome sequencing projects. [Abstract/Link to Full Text]

Wang J, Jemielity S, Uva P, Wurm Y, Gräff J, Keller L
An annotated cDNA library and microarray for large-scale gene-expression studies in the ant Solenopsis invicta.
Genome Biol. 2007;8(1):R9.
Ants display a range of fascinating behaviors, a remarkable level of intra-species phenotypic plasticity and many other interesting characteristics. Here we present a new tool to study the molecular mechanisms underlying these traits: a tentatively annotated expressed sequence tag (EST) resource for the fire ant Solenopsis invicta. From a normalized cDNA library we obtained 21,715 ESTs, which represent 11,864 putatively different transcripts with very diverse molecular functions. All ESTs were used to construct a cDNA microarray. [Abstract/Link to Full Text]

Panelli MC, Stashower ME, Slade HB, Smith K, Norwood C, Abati A, Fetsch P, Filie A, Walters SA, Astry C, Aricó E, Zhao Y, Selleri S, Wang E, Marincola FM
Sequential gene profiling of basal cell carcinomas treated with imiquimod in a placebo-controlled study defines the requirements for tissue rejection.
Genome Biol. 2007;8(1):R8.
BACKGROUND: Imiquimod is a Toll-like receptor-7 agonist capable of inducing complete clearance of basal cell carcinoma (BCC) and other cutaneous malignancies. We hypothesized that the characterization of the early transcriptional events induced by imiquimod may provide insights about immunological events preceding acute tissue and/or tumor rejection. RESULTS: We report a paired analysis of adjacent punch biopsies obtained pre- and post-treatment from 36 patients with BCC subjected to local application of imiquimod (n = 22) or vehicle cream (n = 14) in a blinded, randomized protocol. Four treatments were assessed (q12 applications for 2 or 4 days, or q24 hours for 4 or 8 days). RNA was amplified and hybridized to 17.5 K cDNA arrays. All treatment schedules similarly affected the transcriptional profile of BCC; however, the q12 x 4 days regimen, associated with highest effectiveness, induced the most changes, with 637 genes unequivocally stimulated by imiquimod. A minority of transcripts (98 genes) confirmed previous reports of interferon-alpha involvement. The remaining 539 genes portrayed additional immunological functions predominantly involving the activation of cellular innate and adaptive immune-effector mechanisms. Importantly, these effector signatures recapitulate previous observations of tissue rejection in the context of cancer immunotherapy, acute allograft rejection and autoimmunity. CONCLUSION: This study, based on a powerful and reproducible model of cancer eradication by innate immune mechanisms, provides the first insights in humans into the early transcriptional events associated with immune rejection. This model is likely representative of constant immunological pathways through which innate and adaptive immune responses combine to induce tissue destruction. [Abstract/Link to Full Text]

Gutiérrez RA, Lejay LV, Dean A, Chiaromonte F, Shasha DE, Coruzzi GM
Qualitative network models and genome-wide expression data define carbon/nitrogen-responsive molecular machines in Arabidopsis.
Genome Biol. 2007;8(1):R7.
BACKGROUND: Carbon (C) and nitrogen (N) metabolites can regulate gene expression in Arabidopsis thaliana. Here, we use multi-network analysis of microarray data to identify molecular networks regulated by C and N in the Arabidopsis root system. RESULTS: We used the Arabidopsis whole genome Affymetrix gene chip to explore global gene expression responses in plants exposed transiently to a matrix of C and N treatments. We used ANOVA to define quantitative models of regulation for all detected genes. Our results suggest that about half of the Arabidopsis transcriptome is regulated by C, N or CN interactions. We found ample evidence for interactions between C and N that include genes involved in metabolic pathways, protein degradation and auxin signaling. To provide a global, yet detailed, view of how the cell molecular network is adjusted in response to the CN treatments, we constructed a qualitative multi-network model of the Arabidopsis metabolic and regulatory molecular network, including 6,176 genes, 1,459 metabolites and 230,900 interactions among them. We integrated the quantitative models of CN gene regulation with the wiring diagram in the multi-network, and identified specific interacting genes in biological modules that respond to C, N or CN treatments. CONCLUSION: Our results indicate that CN regulation occurs at multiple levels, including potential post-transcriptional control by microRNAs. The network analysis of our systematic dataset of CN treatments indicates that CN sensing is a mechanism that coordinates the global regulation of specific sets of molecular machines in the plant cell. [Abstract/Link to Full Text]

Robertson N, Oveisi-Fordorei M, Zuyderduyn SD, Varhol RJ, Fjell C, Marra M, Jones S, Siddiqui A
DiscoverySpace: an interactive data analysis application.
Genome Biol. 2007;8(1):R6.
DiscoverySpace is a graphical application for bioinformatics data analysis. Users can seamlessly traverse references between biological databases and draw together annotations in an intuitive tabular interface. Datasets can be compared using a suite of novel tools to aid in the identification of significant patterns. DiscoverySpace is of broad utility and its particular strength is in the analysis of serial analysis of gene expression (SAGE) data. The application is freely available online. [Abstract/Link to Full Text]

Sharakhova MV, Hammond MP, Lobo NF, Krzywinski J, Unger MF, Hillenmeyer ME, Bruggner RV, Birney E, Collins FH
Update of the Anopheles gambiae PEST genome assembly.
Genome Biol. 2007;8(1):R5.
BACKGROUND: The genome of Anopheles gambiae, the major vector of malaria, was sequenced and assembled in 2002. This initial genome assembly and analysis made available to the scientific community was complicated by the presence of assembly issues, such as scaffolds with no chromosomal location, no sequence data for the Y chromosome, haplotype polymorphisms resulting in two different genome assemblies in limited regions and contaminating bacterial DNA. RESULTS: Polytene chromosome in situ hybridization with cDNA clones was used to place 15 unmapped scaffolds (sizes totaling 5.34 Mbp) in the pericentromeric regions of the chromosomes and to orient a further 9 scaffolds. Additional analysis by in situ hybridization of bacterial artificial chromosome (BAC) clones placed 1.32 Mbp (5 scaffolds) in the physical gaps between scaffolds on euchromatic parts of the chromosomes. The Y chromosome sequence information (0.18 Mbp) remains highly incomplete and fragmented among 55 short scaffolds. Analysis of BAC end sequences showed that 22 inter-scaffold gaps were spanned by BAC clones. Unmapped scaffolds were also aligned to the chromosome assemblies in silico, identifying regions totaling 8.18 Mbp (144 scaffolds) that are probably represented in the genome project by two alternative assemblies. An additional 3.53 Mbp of alternative assembly was identified within mapped scaffolds. Scaffolds comprising 1.97 Mbp (679 small scaffolds) were identified as probably derived from contaminating bacterial DNA. In total, about 33% of previously unmapped sequences were placed on the chromosomes. CONCLUSION: This study has used new approaches to improve the physical map and assembly of the A. gambiae genome. [Abstract/Link to Full Text]

Chen G, Jensen ST, Stoeckert CJ
Clustering of genes into regulons using integrated modeling-COGRIM.
Genome Biol. 2007;8(1):R4.
We present a Bayesian hierarchical model and Gibbs Sampling implementation that integrates gene expression, ChIP binding, and transcription factor motif data in a principled and robust fashion. COGRIM was applied to both unicellular and mammalian organisms under different scenarios of available data. In these applications, we demonstrate the ability to predict gene-transcription factor interactions with reduced numbers of false-positive findings and to make predictions beyond what is obtained when single types of data are considered. [Abstract/Link to Full Text]

Carmona-Saez P, Chagoyen M, Tirado F, Carazo JM, Pascual-Montano A
GENECODIS: a web-based tool for finding significant concurrent annotations in gene lists.
Genome Biol. 2007;8(1):R3.
We present GENECODIS, a web-based tool that integrates different sources of information to search for annotations that frequently co-occur in a set of genes and rank them by statistical significance. The analysis of concurrent annotations provides significant information for the biologic interpretation of high-throughput experiments and may outperform the results of standard methods for the functional analysis of gene lists. GENECODIS is publicly available at [Abstract/Link to Full Text]
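
The idea of ranking co-occurring annotations by statistical significance can be sketched as follows. This is a minimal illustration only: the abstract does not say which test GENECODIS uses, so the hypergeometric tail probability here is an assumption, and the function names and annotation labels are hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, of which K carry the annotation set."""
    upper = min(K, n)
    total = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return total / comb(N, n)


def cooccurrence_pvalue(gene_annots, query, combo):
    """Enrichment p-value for a set of annotations that must all co-occur.

    gene_annots: dict mapping every gene in the reference set to its annotations
    query:       list of genes of interest (all present in gene_annots)
    combo:       set of annotations required to appear together on a gene
    """
    N = len(gene_annots)
    # Genes in the reference set carrying every annotation in the combination.
    K = sum(1 for annots in gene_annots.values() if combo <= annots)
    n = len(query)
    # Genes in the query list carrying every annotation in the combination.
    k = sum(1 for gene in query if combo <= gene_annots[gene])
    return hypergeom_sf(k, N, K, n)


# Toy reference set: 10 genes, 3 of which carry both (hypothetical) labels.
toy = {f"g{i}": {"GO:A"} for i in range(10)}
for g in ("g0", "g1", "g2"):
    toy[g] = {"GO:A", "KEGG:B"}

p_value = cooccurrence_pvalue(toy, ["g0", "g1", "g2"], {"GO:A", "KEGG:B"})
```

In practice many annotation combinations would be tested at once, so some multiple-testing correction would be needed before ranking; that step is omitted here.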

Oshlack A, Emslie D, Corcoran LM, Smyth GK
Normalization of boutique two-color microarrays with a high proportion of differentially expressed probes.
Genome Biol. 2007;8(1):R2.
Normalization is critical for removing systematic variation from microarray data. For two-color microarray platforms, intensity-dependent lowess normalization is commonly used to correct relative gene expression values for biases. Here we outline a normalization method for use when the assumptions of lowess normalization fail. Specifically, this can occur when specialized boutique arrays are constructed that contain a subset of genes selected to test particular biological functions. [Abstract/Link to Full Text]
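
To make the failure mode concrete, the sketch below computes the usual log-ratio (M) and mean log-intensity (A) values for a two-color array and compares centering on all probes with centering on a set of control probes. It is not the authors' method, which the abstract does not describe; median centering stands in for a lowess fit to keep the example short, and the choice of control probes is an assumption.

```python
from math import log2
from statistics import median


def ma_values(red, green):
    """Return (M, A) per probe: M = log2(R/G), A = mean of log2(R) and log2(G)."""
    return [(log2(r / g), 0.5 * log2(r * g)) for r, g in zip(red, green)]


def median_center(m_values, control_idx=None):
    """Subtract the median M of the chosen probes (all probes if control_idx is None)."""
    idx = range(len(m_values)) if control_idx is None else control_idx
    shift = median(m_values[i] for i in idx)
    return [m - shift for m in m_values]


# Toy boutique array: three of five probes are truly up-regulated four-fold.
red = [4.0, 4.0, 4.0, 1.0, 1.0]
green = [1.0, 1.0, 1.0, 1.0, 1.0]
m = [pair[0] for pair in ma_values(red, green)]

# Centering on all probes wrongly pulls the changed probes to zero.
global_centered = median_center(m)
# Centering on probes assumed unchanged (indices 3 and 4) keeps the signal.
control_centered = median_center(m, control_idx=[3, 4])
```

With most probes changing in the same direction, the global estimate absorbs the real signal, which is the situation the abstract describes for boutique arrays.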

Wang QF, Prabhakar S, Chanan S, Cheng JF, Rubin EM, Boffelli D
Detection of weakly conserved ancestral mammalian regulatory sequences by primate comparisons.
Genome Biol. 2007;8(1):R1.
BACKGROUND: Genomic comparisons between human and distant, non-primate mammals are commonly used to identify cis-regulatory elements based on constrained sequence evolution. However, these methods fail to detect functional elements that are too weakly conserved among mammals to distinguish them from non-functional DNA. RESULTS: To evaluate a strategy for large scale genome annotation that is complementary to the commonly used distal species comparisons, we explored the potential of deep intra-primate sequence comparisons. We sequenced the orthologs of 558 kb of human genomic sequence, covering multiple loci involved in cholesterol homeostasis, in 6 non-human primates. Our analysis identified six non-coding DNA elements displaying significant conservation among primates but undetectable in more distant comparisons. In vitro and in vivo tests revealed that at least three of these six elements have regulatory function. Notably, the mouse orthologs of these three functional human sequences had regulatory activity despite their lack of significant sequence conservation, indicating that they are ancestral mammalian cis-regulatory elements. These regulatory elements could be detected even in a smaller set of three primate species including human, rhesus and marmoset. CONCLUSION: We have demonstrated that intra-primate sequence comparisons can be used to identify functional modules in large genomic regions, including cis-regulatory elements that are not detectable through comparison with non-mammalian genomes. With the available human and rhesus genomes and that of marmoset, which is being actively sequenced, this strategy can be extended to the whole genome in the near future. [Abstract/Link to Full Text]

Chen N, Mah A, Blacque OE, Chu J, Phgora K, Bakhoum MW, Newbury CR, Khattra J, Chan S, Go A, Efimenko E, Johnsen R, Phirke P, Swoboda P, Marra M, Moerman DG, Leroux MR, Baillie DL, Stein LD
Identification of ciliary and ciliopathy genes in Caenorhabditis elegans through comparative genomics.
Genome Biol. 2006;7(12):R126.
BACKGROUND: The recent availability of genome sequences of multiple related Caenorhabditis species has made it possible to identify, using comparative genomics, similarly transcribed genes in Caenorhabditis elegans and its sister species. Taking this approach, we have identified numerous novel ciliary genes in C. elegans, some of which may be orthologs of unidentified human ciliopathy genes. RESULTS: By screening for genes possessing canonical X-box sequences in promoters of three Caenorhabditis species, namely C. elegans, C. briggsae and C. remanei, we identified 93 genes (including known X-box regulated genes) that encode putative components of ciliated neurons in C. elegans and are subject to the same regulatory control. For many of these genes, restricted anatomical expression in ciliated cells was confirmed, and control of transcription by the ciliogenic DAF-19 RFX transcription factor was demonstrated by comparative transcriptional profiling of different tissue types and of daf-19(+) and daf-19(-) animals. Finally, we demonstrate that the dye-filling defect of dyf-5(mn400) animals, which is indicative of compromised exposure of cilia to the environment, is caused by a nonsense mutation in the serine/threonine protein kinase gene M04C9.5. CONCLUSION: Our comparative genomics-based predictions may be useful for identifying genes involved in human ciliopathies, including Bardet-Biedl Syndrome (BBS), since the C. elegans orthologs of known human BBS genes contain X-box motifs and are required for normal dye filling in C. elegans ciliated neurons. [Abstract/Link to Full Text]

Itzhaki Z, Akiva E, Altuvia Y, Margalit H
Evolutionary conservation of domain-domain interactions.
Genome Biol. 2006;7(12):R125.
BACKGROUND: Recently, there has been much interest in relating domain-domain interactions (DDIs) to protein-protein interactions (PPIs) and vice versa, in an attempt to understand the molecular basis of PPIs. RESULTS: Here we map structurally derived DDIs onto the cellular PPI networks of different organisms and demonstrate that there is a catalog of domain pairs that is used to mediate various interactions in the cell. We show that these DDIs occur frequently in protein complexes and that homotypic interactions (of a domain with itself) are abundant. A comparison of the repertoires of DDIs in the networks of Escherichia coli, Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila melanogaster, and Homo sapiens shows that many DDIs are evolutionarily conserved. CONCLUSION: Our results indicate that different organisms use the same 'building blocks' for PPIs, suggesting that the functionality of many domain pairs in mediating protein interactions is maintained in evolution. [Abstract/Link to Full Text]
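
Mapping DDIs onto a PPI network, and then comparing repertoires across organisms, can be sketched as below. This is an illustration under stated assumptions rather than the authors' pipeline: the data structures, function names, and the protein and domain identifiers are all hypothetical.

```python
from itertools import product


def ddis_supporting_ppi(ppi, protein_domains, known_ddis):
    """Return the known domain pairs that could mediate one protein-protein interaction.

    ppi:             tuple of two protein identifiers
    protein_domains: dict mapping a protein to the set of domains it contains
    known_ddis:      set of frozensets, each a structurally derived domain pair
                     (a homotypic pair is a frozenset with a single element)
    """
    a, b = ppi
    found = set()
    for da, db in product(protein_domains.get(a, ()), protein_domains.get(b, ())):
        pair = frozenset((da, db))
        if pair in known_ddis:
            found.add(pair)
    return found


def network_ddis(ppis, protein_domains, known_ddis):
    """Union of the DDIs found across every interaction in one organism's network."""
    used = set()
    for ppi in ppis:
        used |= ddis_supporting_ppi(ppi, protein_domains, known_ddis)
    return used


def conserved_ddis(per_organism):
    """DDIs present in the networks of all organisms supplied."""
    return set.intersection(*per_organism)


# Hypothetical toy data for two organisms.
known = {frozenset(("SH3", "PRM")), frozenset(("KIN",))}
org1 = network_ddis([("p1", "p2")], {"p1": {"SH3"}, "p2": {"PRM", "KIN"}}, known)
org2 = network_ddis(
    [("q1", "q2"), ("q3", "q3")],
    {"q1": {"PRM"}, "q2": {"SH3"}, "q3": {"KIN"}},
    known,
)
shared = conserved_ddis([org1, org2])
```

Storing each domain pair as a `frozenset` makes the pair unordered, so a homotypic interaction collapses to a single-element set and needs no special case.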

Keränen SV, Fowlkes CC, Luengo Hendriks CL, Sudar D, Knowles DW, Malik J, Biggin MD
Three-dimensional morphology and gene expression in the Drosophila blastoderm at cellular resolution II: dynamics.
Genome Biol. 2006;7(12):R124.
BACKGROUND: To accurately describe gene expression and computationally model animal transcriptional networks, it is essential to determine the changing locations of cells in developing embryos. RESULTS: Using automated image analysis methods, we provide the first quantitative description of temporal changes in morphology and gene expression at cellular resolution in whole embryos, using the Drosophila blastoderm as a model. Analyses based on both fixed and live embryos reveal complex, previously undetected three-dimensional changes in nuclear density patterns caused by nuclear movements prior to gastrulation. Gene expression patterns move, in part, with these changes in morphology, but additional spatial shifts in expression patterns are also seen, supporting a previously proposed model of pattern dynamics based on the induction and inhibition of gene expression. We show that mutations that disrupt either the anterior/posterior (a/p) or the dorsal/ventral (d/v) transcriptional cascades alter morphology and gene expression along both the a/p and d/v axes in a way suggesting that these two patterning systems interact via both transcriptional and morphological mechanisms. CONCLUSION: Our work establishes a new strategy for measuring temporal changes in the locations of cells and gene expression patterns that uses fixed cell material and computational modeling. It also provides a coordinate framework for the blastoderm embryo that will allow increasingly accurate spatio-temporal modeling of both the transcriptional control network and morphogenesis. [Abstract/Link to Full Text]

Luengo Hendriks CL, Keränen SV, Fowlkes CC, Simirenko L, Weber GH, DePace AH, Henriquez C, Kaszuba DW, Hamann B, Eisen MB, Malik J, Sudar D, Biggin MD, Knowles DW
Three-dimensional morphology and gene expression in the Drosophila blastoderm at cellular resolution I: data acquisition pipeline.
Genome Biol. 2006;7(12):R123.
BACKGROUND: To model and thoroughly understand animal transcription networks, it is essential to derive accurate spatial and temporal descriptions of developing gene expression patterns with cellular resolution. RESULTS: Here we describe a suite of methods that provide the first quantitative three-dimensional description of gene expression and morphology at cellular resolution in whole embryos. A database containing information derived from 1,282 embryos is released that describes the mRNA expression of 22 genes at multiple time points in the Drosophila blastoderm. We demonstrate that our methods are sufficiently accurate to detect previously undescribed features of morphology and gene expression. The cellular blastoderm is shown to have an intricate morphology of nuclear density patterns and apical/basal displacements that correlate with later well-known morphological features. Pair rule gene expression stripes, generally considered to specify patterning only along the anterior/posterior body axis, are shown to have complex changes in stripe location, stripe curvature, and expression level along the dorsal/ventral axis. Pair rule genes are also found to not always maintain the same register to each other. CONCLUSION: The application of these quantitative methods to other developmental systems will likely reveal many other previously unknown features and provide a more rigorous understanding of developmental regulatory networks. [Abstract/Link to Full Text]

Cornett DS, Mobley JA, Dias EC, Andersson M, Arteaga CL, Sanders ME, Caprioli RM
A novel histology-directed strategy for MALDI-MS tissue profiling that improves throughput and cellular specificity in human breast cancer.
Mol Cell Proteomics. 2006 Oct;5(10):1975-83.
We describe a novel tissue profiling strategy that improves the cellular specificity and analysis throughput of protein profiles obtained by direct MALDI analysis. The new approach integrates the cellular specificity of histology, the accuracy and reproducibility of robotic liquid dispensing, and the speed and objectivity of automated spectra acquisition. Traditional methodologies for preparing and analyzing tissue samples rely heavily on manual procedures, which, for various reasons discussed, restrict cellular specificity and sample throughput. Here, a robotic spotter deposits micron-sized droplets of matrix precisely onto foci of normal mammary epithelium, ductal carcinoma in situ, invasive mammary cancer, and peritumoral stroma selected by a pathologist from high resolution histological images of sectioned human breast cancer samples. The location of each matrix spot was then determined and uploaded into the instrument to facilitate automated profile acquisition by MALDI-TOF. In the example shown, the different lesions were clearly differentiated using mass profiling. Further, the workflow permits a visual projection of any information produced from the profile analyses directly on the histological image for a unique combination of proteomic and histological assessment of sample regions. The higher performance characteristics offered by the new workflow promise to be a significant advancement toward the next generation of tissue profiling studies. [Abstract/Link to Full Text]

Hörth P, Miller CA, Preckel T, Wenz C
Efficient fractionation and improved protein identification by peptide OFFGEL electrophoresis.
Mol Cell Proteomics. 2006 Oct;5(10):1968-74.
The sample fractionation steps conducted prior to mass detection are critically important for the comprehensive analysis of complex protein mixtures. This paper illustrates the effectiveness of OFFGEL electrophoresis with the Agilent 3100 OFFGEL Fractionator for the fractionation of peptides. An Escherichia coli tryptic digest was separated in 24 fractions, and peptides were identified by reversed-phase liquid chromatography on a microfluidic device with mass spectrometric detection. About 90% of the identified individual peptides were found in only one or two fractions. The distribution of the calculated isoelectric points for the peptides identified in each fraction was especially narrow in the acidic pH range. Standard deviations approached the size of the pH segment covered by the respective fraction. The experimental peptide isoelectric point measured by OFFGEL electrophoresis was used as an additional filter for validation of peptide identifications. [Abstract/Link to Full Text]
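
Using the fraction a peptide was found in as a filter on its identification can be sketched as follows. The abstract gives only the number of fractions (24); the pH 3 to 10 range, the linear gradient, the tolerance value, and the function names are assumptions made for illustration.

```python
def fraction_ph_window(fraction, n_fractions=24, ph_min=3.0, ph_max=10.0):
    """Return the (low, high) pH bounds of a 1-based fraction on a linear gradient.

    ph_min and ph_max are assumed values, not taken from the abstract.
    """
    width = (ph_max - ph_min) / n_fractions
    low = ph_min + (fraction - 1) * width
    return low, low + width


def passes_pi_filter(calc_pi, fraction, tolerance=0.5, **gradient):
    """Accept an identification if its calculated pI lies within the fraction's
    pH window, widened by `tolerance` pH units on each side."""
    low, high = fraction_ph_window(fraction, **gradient)
    return low - tolerance <= calc_pi <= high + tolerance


# A peptide with calculated pI 4.0 found in fraction 4 is plausible;
# one with calculated pI 9.0 found in fraction 1 is flagged.
plausible = passes_pi_filter(4.0, fraction=4)
suspect = passes_pi_filter(9.0, fraction=1)
```

The tolerance would in practice be tuned to the observed spread of calculated pI values per fraction, which the abstract reports as approaching the width of the pH segment itself.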

Stead JA, Keen JN, McDowall KJ
The identification of nucleic acid-interacting proteins using a simple proteomics-based approach that directly incorporates the electrophoretic mobility shift assay.
Mol Cell Proteomics. 2006 Sep;5(9):1697-702.
Proteins that interact with nucleic acids are central to numerous cellular processes, and their continuing characterization represents one of the foremost challenges in the postgenomic era. Here we describe a simple proteomics-based approach for the identification by mass spectrometry of proteins in crude extracts that interact with nucleic acids. It incorporates the electrophoretic mobility shift assay and is based on the finding that when a protein forms a complex with nucleic acid its electrophoretic mobility is affected as well as that of the nucleic acid. Our method should greatly reduce and in some cases may even eliminate the need for extensive protein purification and as such should contribute significantly to the functional annotation of the proteome. Furthermore it requires no prior knowledge of the molecular mass, quaternary structure, or pI of the interacting protein. Proof of principle is demonstrated using a recently discovered transcription factor; however, the approach should also have application in the identification of proteins that interact with RNA. [Abstract/Link to Full Text]

Krueger KE, Srivastava S
Posttranslational protein modifications: current implications for cancer detection, prevention, and therapeutics.
Mol Cell Proteomics. 2006 Oct;5(10):1799-810. [Abstract/Link to Full Text]

Ellmark P, Ingvarsson J, Carlsson A, Lundin BS, Wingren C, Borrebaeck CA
Identification of protein expression signatures associated with Helicobacter pylori infection and gastric adenocarcinoma using recombinant antibody microarrays.
Mol Cell Proteomics. 2006 Sep;5(9):1638-46.
Antibody microarray based technology is a powerful emerging tool in proteomics, target discovery, and differential analysis. Here, we report the first study where recombinant antibody fragments have been used to construct large scale antibody microarrays, composed of 127 different antibodies against mostly immunoregulatory antigens. The arrays were based on single framework recombinant antibody fragments (SinFabs) designed for high on-chip stability and functionality and were used for the analysis of malignant and normal stomach tissue samples from Helicobacter pylori-positive and -negative patients. Our results demonstrate that distinct tumor- as well as infection-associated protein expression signatures could be identified from these complex tissue proteomes, as well as biomarkers such as IL-9, IL-11, and MCP-4, previously not found in these diseases. In a longer perspective, this study may improve the understanding of H. pylori-induced stomach cancer and lead to development of improved diagnostics. [Abstract/Link to Full Text]

Cayatte C, Pons C, Guigonis JM, Pizzol J, Elies L, Kennel P, Rouquié D, Bars R, Rossi B, Samson M
Protein profiling of rat ventral prostate following chronic finasteride administration: identification and localization of a novel putative androgen-regulated protein.
Mol Cell Proteomics. 2006 Nov;5(11):2031-43.
To better understand the effects of antiandrogens on the prostate, we investigated the changes in the proteome of rat ventral prostate (VP) following treatment with a well characterized 5alpha-reductase inhibitor, finasteride. Sprague-Dawley rats were treated daily by gavage with finasteride at 0, 1, 5, 25, and 125 mg/kg/day. Changes in plasma hormone levels as well as the weight and histology of sex accessory tissues were determined after 28 days of treatment and showed a dose-related decrease of VP weights together with a marked atrophy of the tissue visible at the macroscopic and microscopic levels. In addition, significant reductions in seminal vesicle and epididymis weights were noted. VP proteins were analyzed by two-dimensional gel electrophoresis: 37 proteins, mainly involved in protein synthesis, processing, and cellular trafficking and in metabolism, detoxification, and oxidative stress, were identified as modulated by finasteride. The prominent feature of this study is the demonstration of finasteride dose-dependent up-regulation of a protein similar to l-amino-acid oxidase 1 (Lao1). An up-regulation of this protein was also observed with the antiandrogen flutamide. Lao1 expression occurred as early as 48 h after antiandrogen administration and persisted throughout the treatment duration. Immunohistochemistry showed that this protein was only detectable in epithelial cells and secretory vesicles. Altogether these data point to a potential use of Lao1 to reveal antiandrogen-induced prostate injury. [Abstract/Link to Full Text]

Pisitkun T, Johnstone R, Knepper MA
Discovery of urinary biomarkers.
Mol Cell Proteomics. 2006 Oct;5(10):1760-71.
A myriad of proteins and peptides can be identified in normal human urine. These are derived from a variety of sources including glomerular filtration of blood plasma, cell sloughing, apoptosis, proteolytic cleavage of cell surface glycosylphosphatidylinositol-linked proteins, and secretion of exosomes by epithelial cells. Mass spectrometry-based approaches to urinary protein and peptide profiling can, in principle, reveal changes in excretion rates of specific proteins/peptides that can have predictive value in the clinical arena, e.g. in the early diagnosis of disease, in classification of disease with regard to likely therapeutic responses, in assessment of prognosis, and in monitoring response to therapy. These approaches have potential value, not only in diseases of the kidney and urinary tract but also in systemic diseases that are associated with circulating small protein and peptide markers that can pass the glomerular filter. Most large scale biomarker discovery studies reported thus far have used one of two approaches to identify proteins and peptides whose excretion in urine changes in specific disease states: 1) two-dimensional electrophoresis with mass spectrometric and/or immunochemical identification of proteins and 2) top-down mass spectrometric methods (SELDI-TOF-MS and capillary electrophoresis-MS). These studies have been chiefly in the areas of nephrology, urology, and oncology. We review these applications, focusing on two areas of progress, viz. in bladder cancer and in acute rejection of renal transplants. Progress has been limited so far. However, with the advent of powerful LC-MS/MS methods along with methods for quantifying LC-MS/MS output, there is hope for an accelerated discovery and validation of disease biomarkers in urine. [Abstract/Link to Full Text]

Garcia BA, Joshi S, Thomas CE, Chitta RK, Diaz RL, Busby SA, Andrews PC, Ogorzalek Loo RR, Shabanowitz J, Kelleher NL, Mizzen CA, Allis CD, Hunt DF
Comprehensive phosphoprotein analysis of linker histone H1 from Tetrahymena thermophila.
Mol Cell Proteomics. 2006 Sep;5(9):1593-609.
Linker histone H1 is highly phosphorylated in normal growing Tetrahymena thermophila but becomes noticeably dephosphorylated in response to certain conditions such as prolonged starvation. Because phosphorylation of H1 has been associated with the regulation of gene expression, DNA repair, and other critical processes, we sought to use mass spectrometry-based approaches to obtain an in depth phosphorylation "signature" for this linker histone. Histone H1 from both growing and starved Tetrahymena was analyzed by nanoflow reversed-phase HPLC MS/MS following enzymatic digestions, propionic anhydride derivatization, and phosphopeptide enrichment via IMAC. We confirmed five phosphorylation sites identified previously and detected two novel sites of phosphorylation and two novel minor sites of acetylation. The sequential order of phosphorylation on H1 was deduced by using mass spectrometry to define the modified sites on phosphorylated H1 isoforms separated by cation-exchange chromatography. Relative levels of site-specific phosphorylation on H1 isolated from growing and starved Tetrahymena were obtained using a combination of stable isotopic labeling, IMAC, and tandem mass spectrometry. [Abstract/Link to Full Text]

Pollard HB, Eidelman O, Jozwik C, Huang W, Srivastava M, Ji XD, McGowan B, Norris CF, Todo T, Darling T, Mogayzel PJ, Zeitlin PL, Wright J, Guggino WB, Metcalf E, Driscoll WJ, Mueller G, Paweletz C, Jacobowitz DM
De novo biosynthetic profiling of high abundance proteins in cystic fibrosis lung epithelial cells.
Mol Cell Proteomics. 2006 Sep;5(9):1628-37.
In previous studies with cystic fibrosis (CF) IB3-1 lung epithelial cells in culture, we identified 194 unique high abundance proteins by conventional two-dimensional gel electrophoresis and mass spectrometry (Pollard, H. B., Ji, X.-D., Jozwik, C. J., and Jacobowitz, D. M. (2005) High abundance protein profiling of cystic fibrosis lung epithelial cells. Proteomics 5, 2210-2226). In the present work we compared the IB3-1 cells with IB3-1/S9 daughter cells repaired by gene transfer with AAV-(wild type)CFTR. We report that gene transfer resulted in significant changes in silver stain intensity of only 20 of the 194 proteins. However, simultaneous measurement of de novo biosynthetic rates with [(35)S]methionine of all 194 proteins in both cell types resulted in the identification of an additional 31 CF-specific proteins. Of the 51 proteins identified by this hybrid approach, only six proteins changed similarly in both the mass and kinetics categories. This kinetic portion of the high abundance CF proteome, hidden from direct analysis of abundance, included proteins from transcription and signaling pathways such as NFkappaB, chaperones such as HSC70, cytoskeletal proteins, and others. Connectivity analysis indicated that approximately 30% of the 51-member hybrid high abundance CF proteome interacts with the NFkappaB signaling pathway. In conclusion, measurement of biosynthetic rates on a global scale can be used to identify disease-specific differences within the high abundance cystic fibrosis proteome. Most of these kinetically defined proteins are unaffected in expression level when using conventional silver stain analysis. We anticipate that this novel hybrid approach to discovery of the high abundance CF proteome will find general application to other proteomic problems in biology and medicine. [Abstract/Link to Full Text]

Synowsky SA, van den Heuvel RH, Mohammed S, Pijnappel PW, Heck AJ
Probing genuine strong interactions and post-translational modifications in the heterogeneous yeast exosome protein complex.
Mol Cell Proteomics. 2006 Sep;5(9):1581-92.
The characterization of heterogeneous multicomponent protein complexes, which goes beyond identification of protein subunits, is a challenging task. Here we describe and apply a comprehensive method that combines a mild affinity purification procedure with a multiplexed mass spectrometry approach for the in-depth characterization of the exosome complex from Saccharomyces cerevisiae expressed at physiologically relevant levels. The exosome is an ensemble of primarily 3'→5' exoribonucleases and plays a major role in RNA metabolism. The complex has been reported to consist of 11 proteins in molecular mass ranging from 20 to 120 kDa. By using native macromolecular mass spectrometry we measured accurate masses (around 400 kDa) of several (sub)exosome complexes. Combination of these data with proteolytic peptide LC tandem mass spectrometry using a linear ion trap coupled to a FT-ICR mass spectrometer and intact protein LC mass spectrometry provided us with the identity of the different exosome components and (sub)complexes, including the subunit stoichiometry. We hypothesize that the observed complexes provide information about strongly and weakly interacting exosome-associated proteins. In our analysis we also identified for the first time phosphorylation sites in seven different exosome subunits. The phosphorylation site in the Rrp4 subunit is fully conserved in the human homologue of Rrp4, which is the only previously reported phosphorylation site in any of the human exosome proteins. The described multiplexed mass spectrometry-based procedure is generic and thus applicable to many different types of cellular molecular machineries even if they are expressed at endogenous levels. [Abstract/Link to Full Text]

Zappacosta F, Collingwood TS, Huddleston MJ, Annan RS
A quantitative results-driven approach to analyzing multisite protein phosphorylation: the phosphate-dependent phosphorylation profile of the transcription factor Pho4.
Mol Cell Proteomics. 2006 Nov;5(11):2019-30.
Multisite protein phosphorylation appears to be quite common. Nevertheless our understanding of how multiple phosphorylation events regulate the function of a protein is limited in many cases. The ability to measure temporal changes in the site-specific phosphorylation profile of a protein in response to a given stimulus or cellular activity would provide an immediate indication of the functional significance of any phosphorylation site to a given process. Here we describe a mass spectrometry-based method to identify functionally relevant phosphorylation sites on a protein. It combines stable isotope labeling with a highly selective mass spectrometry analysis to detect and quantitate phosphorylation sites in response to a cellular signal. This approach requires no a priori knowledge of the phosphorylation state of the protein, does not require purification of phosphopeptides, and reliably detects substoichiometric levels of phosphorylation. Following a review of the quantitative results, only those phosphorylation sites that show a change in relative abundance are selected for identification and further study. We used this results-driven approach to study phosphorylation of the budding yeast transcription factor Pho4 in response to phosphate starvation. Phosphorylation of Pho4 on five cyclin-dependent kinase (Cdk) consensus sites has been shown to regulate the transcriptional activity of Pho4 in response to changes in environmental phosphate levels. Here we show that in phosphate-rich medium Pho4 is phosphorylated on at least 15 distinct sites including the five Cdk sites described previously. In excellent agreement with the known mechanism for regulation of Pho4 we found that phosphorylation at all five of the Cdk sites was repressed in phosphate-depleted medium. In addition to these five sites, we identified four novel phosphorylation sites that were also responsive to changes in phosphate availability. Selecting a limited number of Pho4 phosphorylation sites, we performed a more detailed kinetic analysis using an isotope-free strategy. We used LC-MS with selected reaction monitoring to greatly improve the accuracy, sensitivity, and dynamic range of the subsequent experiments. A detailed analysis of the cell-based phosphorylation at the selected Pho4 sites confirmed an apparent site preference for the Pho80-Pho85 cyclin-cyclin-dependent kinase complex. [Abstract/Link to Full Text]

Agrawal GK, Thelen JJ
Large scale identification and quantitative profiling of phosphoproteins expressed during seed filling in oilseed rape.
Mol Cell Proteomics. 2006 Nov;5(11):2044-59.
Seed filling is a dynamic, temporally regulated phase of seed development that determines the composition of storage reserves in mature seeds. Although the metabolic pathways responsible for storage reserve synthesis such as carbohydrates, oils, and proteins are known, little is known about their regulation. Protein phosphorylation is a ubiquitous form of regulation that influences many aspects of dynamic cellular behavior in plant biology. Here a systematic study has been conducted on five sequential stages (2, 3, 4, 5, and 6 weeks after flowering) of seed development in oilseed rape (Brassica napus L. Reston) to survey the presence and dynamics of phosphoproteins. High resolution two-dimensional gel electrophoresis in combination with a phosphoprotein-specific Pro-Q Diamond phosphoprotein fluorescence stain revealed approximately 300 phosphoprotein spots. Of these, quantitative expression profiles for 234 high quality spots were established, and hierarchical cluster analyses revealed the occurrence of six principal expression trends during seed filling. The identity of 103 spots was determined using LC-MS/MS. The identified spots represented 70 non-redundant phosphoproteins belonging to 10 major functional categories including energy, metabolism, protein destination, and signal transduction. Furthermore phosphorylation within 16 non-redundant phosphoproteins was verified by mapping the phosphorylation sites by LC-MS/MS. Although one of these sites was postulated previously, the remaining sites have not yet been reported in plants. Phosphoprotein data were assembled into a web database. Together this study provides evidence for the presence of a large number of functionally diverse phosphoproteins, including global regulatory factors like 14-3-3 proteins, within developing B. napus seed. [Abstract/Link to Full Text]

Angenendt P, Kreutzberger J, Glökler J, Hoheisel JD
Generation of high density protein microarrays by cell-free in situ expression of unpurified PCR products.
Mol Cell Proteomics. 2006 Sep;5(9):1658-66.
Due to the success of DNA microarrays and the growing numbers of available protein expression clones, protein microarrays have become more and more popular for the high throughput screening of protein interactions. However, the widespread applicability of protein microarrays is currently hampered by the large effort associated with their production. Apart from the requirement for a protein expression library, expression and purification of the proteins themselves and the limited stability of many proteins remain the bottleneck. Here we present an approach that allows the generation of high density protein microarrays from unbound DNA template molecules on the chip. It is based on the multiple spotting technique and comprises the deposition of a DNA template in a first spotting step and the transfer of a cell-free transcription and translation mixture on top of the same spot in a second spotting step. Using wild-type green fluorescent protein as a model protein, we demonstrated the time and template dependence of this coupled transcription and translation and showed that enough protein was produced to yield signals that were comparable to 300 µg/ml spotted protein. Plasmids as well as unpurified PCR products can be used as templates, and as little as 35 fg of PCR product (approximately 22,500 molecules) was sufficient for the detectable expression of full-length wild-type green fluorescent protein in subnanoliter volumes. We showed that both aminopropyltrimethoxysilane and nickel chelate surfaces can be used for capture of the newly synthesized proteins. Surprisingly, we observed that nickel chelate-coated slides bound the newly synthesized proteins in an unspecific manner. Finally, we adapted the system to the high throughput expression of libraries by designing a single primer pair for the introduction of the required T7 promoter and demonstrated the in situ expression using 384 randomly chosen clones. [Abstract/Link to Full Text]

Benita Y, Wise MJ, Lok MC, Humphery-Smith I, Oosting RS
Analysis of high throughput protein expression in Escherichia coli.
Mol Cell Proteomics. 2006 Sep;5(9):1567-80.
The ability to efficiently produce hundreds of proteins in parallel is the most basic requirement of many aspects of proteomics. Overcoming the technical and financial barriers associated with high throughput protein production is essential for the development of an experimental platform to query and browse the protein content of a cell (e.g. protein and antibody arrays). Proteins are inherently different one from another in their physicochemical properties; therefore, no single protocol can be expected to successfully express most of the proteins. Instead of optimizing a protocol to express a specific protein, we used sequence analysis tools to estimate the probability of a specific protein to be expressed successfully using a given protocol, thereby avoiding a priori proteins with a low success probability. A set of 547 proteins, to be used for antibody production and selection, was expressed in Escherichia coli using a high throughput protein production pipeline. Protein properties derived from sequence alone were correlated to successful expression, and general guidelines are given to increase the efficiency of similar pipelines. A second set of 68 proteins was expressed to investigate the link between successful protein expression and inclusion body formation. More proteins were expressed in inclusion bodies; however, the formation of inclusion bodies was not a requirement for successful expression. [Abstract/Link to Full Text]
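The idea of deriving protein properties from sequence alone can be illustrated with a minimal sketch. The abstract does not say which properties were used, so the two below (GRAVY hydropathy on the Kyte-Doolittle scale and the fraction of charged residues) are assumptions chosen only as common examples; the function names are hypothetical and this is not the authors' pipeline.

```python
# Kyte-Doolittle hydropathy value for each standard amino acid.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

CHARGED = set("DEKR")  # acidic and basic side chains


def gravy(sequence: str) -> float:
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    residues = [aa for aa in sequence.upper() if aa in KYTE_DOOLITTLE]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    return sum(KYTE_DOOLITTLE[aa] for aa in residues) / len(residues)


def charged_fraction(sequence: str) -> float:
    """Fraction of residues that are D, E, K, or R."""
    residues = [aa for aa in sequence.upper() if aa in KYTE_DOOLITTLE]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    return sum(aa in CHARGED for aa in residues) / len(residues)
```

Features like these could then be correlated with an observed expressed/not-expressed outcome for each protein; which features actually carry signal is an empirical question that the sketch does not answer.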

Ying W, Jiang Y, Guo L, Hao Y, Zhang Y, Wu S, Zhong F, Wang J, Shi R, Li D, Wan P, Li X, Wei H, Li J, Wang Z, Xue X, Cai Y, Zhu Y, Qian X, He F
A dataset of human fetal liver proteome identified by subcellular fractionation and multiple protein separation and identification technology.
Mol Cell Proteomics. 2006 Sep;5(9):1703-7.
A high throughput process including subcellular fractionation and multiple protein separation and identification technology allowed us to establish the protein expression profile of human fetal liver, which was composed of at least 2,495 distinct proteins and 568 non-isoform groups identified from 64,960 peptides and 24,454 distinct peptides. In addition to the basic protein identification mentioned above, the MS data were used for complementary identification and novel protein mining. By doing the analysis with integrated protein, expressed sequence tag, and genome datasets, 223 proteins and 15 peptides were complementarily identified with high quality MS/MS data. [Abstract/Link to Full Text]

Chae JI, Cho SK, Seo JW, Yoon TS, Lee KS, Kim JH, Lee KK, Han YM, Yu K
Proteomic analysis of the extraembryonic tissue from cloned porcine embryos.
Mol Cell Proteomics. 2006 Sep;5(9):1559-66.
Cloned animals developed from somatic cell nuclear transfer (SCNT) embryos are useful resources for agricultural and medical applications. However, the birth rate in the cloned animals is very low, and the cloned animals that have survived show various developmental defects. In this report, we present the morphology and differentially regulated proteins in the extraembryonic tissue from SCNT embryos to understand the molecular nature of the tissue. We examined 26-day-old SCNT porcine embryos, the age at which a sonogram can first detect pregnancy. The extraembryonic tissue from SCNT embryos was abnormally small compared with the control. In the proteomic analysis of the SCNT extraembryonic tissue, 39 proteins were identified as differentially regulated. Among the up-regulated proteins were Annexins and Hsp27, which are closely related to the processes of apoptosis. Among the down-regulated proteins, Peroxiredoxins and anaerobic glycolytic enzymes were identified. In the Western blot analysis, antioxidant enzymes and the antiapoptotic Bcl-2 protein were down-regulated, and caspases were up-regulated. In the terminal deoxynucleotidyltransferase-mediated dUTP nick end labeling (TUNEL) assay with the placenta from SCNT embryos, apoptotic trophoblasts were observed. These results demonstrate that a major reason for the low birth rate of cloned animals is abnormal apoptosis in the extraembryonic tissue during early pregnancy. [Abstract/Link to Full Text]

Bisle B, Schmidt A, Scheibe B, Klein C, Tebbe A, Kellermann J, Siedler F, Pfeiffer F, Lottspeich F, Oesterhelt D
Quantitative profiling of the membrane proteome in a halophilic archaeon.
Mol Cell Proteomics. 2006 Sep;5(9):1543-58.
We present a large scale quantitation study of the membrane proteome from Halobacterium salinarum. To overcome problems generally encountered with membrane proteins, we established a membrane preparation protocol that allows the application of most proteomic techniques originally developed for soluble proteins. Proteins were quantified using two complementary approaches. For gel-based quantitation, DIGE labeling was combined with two-dimensional gel electrophoresis on an improved 16-benzyldimethyl-n-hexadecylammonium chloride/SDS system. MS-based quantitation was carried out by combining gel-free separation with the recently developed isotope-coded protein labeling technique. Good correlations between these two independent quantitation strategies were obtained. From computational analysis we conclude that labeling of free amino groups by isotope-coded protein labeling (Lys and free N termini) is better suited for membrane proteins than Cys-based labeling strategies but that quantitation of integral membrane proteins remains cumbersome compared with soluble proteins. Nevertheless, we could quantify 155 membrane proteins; 101 of these had transmembrane domains. We compared two growth states that strongly affect the energy supply of the cells: aerobic versus anaerobic/phototrophic conditions. The photosynthetic protein bacteriorhodopsin is the most highly regulated protein. As expected, several other membrane proteins involved in aerobic or anaerobic energy metabolism were found to be regulated, but in total the number of regulated proteins is rather small. [Abstract/Link to Full Text]

Kobeissy FH, Ottens AK, Zhang Z, Liu MC, Denslow ND, Dave JR, Tortella FC, Hayes RL, Wang KK
Novel differential neuroproteomics analysis of traumatic brain injury in rats.
Mol Cell Proteomics. 2006 Oct;5(10):1887-98.
Approximately two million traumatic brain injury (TBI) incidents occur annually in the United States, yet there are no specific therapeutic treatments. The absence of brain injury diagnostic endpoints was identified as a significant roadblock to TBI therapeutic development. To this end, our laboratory has studied mechanisms of cellular injury for biomarker discovery and possible therapeutic strategies. In this study, pooled naïve and injured cortical samples (48 h postinjury; rat controlled cortical impact model) were processed and analyzed using a differential neuroproteomics platform. Protein separation was performed using combined cation/anion exchange chromatography-PAGE. Differential proteins were then trypsinized and analyzed with reversed-phase LC-MSMS for protein identification and quantitative confirmation. The results included 59 differential protein components of which 21 decreased and 38 increased in abundance after TBI. Proteins with decreased abundance included collapsin response mediator protein 2 (CRMP-2), glyceraldehyde-3-phosphate dehydrogenase, microtubule-associated proteins MAP2A/2B, and hexokinase. Conversely C-reactive protein, transferrin, and breakdown products of CRMP-2, synaptotagmin, and alphaII-spectrin were found to be elevated after TBI. Differential changes in the above mentioned proteins were confirmed by quantitative immunoblotting. Results from this work provide insight into mechanisms of traumatic brain injury and yield putative biochemical markers to potentially facilitate patient management by monitoring the severity, progression, and treatment of injury. [Abstract/Link to Full Text]

Wu SL, Kim J, Bandle RW, Liotta L, Petricoin E, Karger BL
Dynamic profiling of the post-translational modifications and interaction partners of epidermal growth factor receptor signaling after stimulation by epidermal growth factor using Extended Range Proteomic Analysis (ERPA).
Mol Cell Proteomics. 2006 Sep;5(9):1610-27.
In a recent report, we introduced Extended Range Proteomic Analysis (ERPA), an intermediate approach between top-down and bottom-up proteomics, for the comprehensive characterization at the trace level (fmol level) of large and complex proteins. In this study, we extended ERPA to determine quantitatively the temporal changes that occur in the tyrosine kinase receptor, epidermal growth factor receptor (EGFR), upon stimulation. Specifically, A431 cells were stimulated with epidermal growth factor, after which EGFR was immunoprecipitated at stimulation times of 0, 0.5, 2, and 10 min as well as 4 h. High sequence coverage was obtained (96%), and methods were developed for label-free quantitation of phosphorylation and glycosylation. A total of 13 phosphorylation sites were identified, and the estimated stoichiometry was determined over the stimulation time points, including Thr(P) and Ser(P) sites in addition to Tyr(P) sites. A total of 10 extracellular domain N-glycan sites were also identified, and major glycoforms at each site were quantitated. As expected, no change in the extent of glycosylation with stimulation was observed. Finally, potential binding partners to EGFR were identified based on changes in the amount of protein pulled down with EGFR as a function of time of stimulation. Many of the 19 proteins identified are known binding partners of EGFR. This work demonstrates that comprehensive characterization provides a powerful tool to aid in the study of important therapeutic targets. The detailed molecular information will prove useful in future studies in tissue. [Abstract/Link to Full Text]

Wilkie GS, Schirmer EC
Guilt by association: the nuclear envelope proteome and disease.
Mol Cell Proteomics. 2006 Oct;5(10):1865-75.
The discovery that many inherited diseases are linked to interacting nuclear envelope proteins has raised the possibility that human genetic studies could be assisted by a fusion with proteomics. Two principles could be applied. In the first, the proteome of an organelle associated with a genetically variable disease is determined. The chromosomal locations of the genes encoding the organellar proteins are then determined. If a related disease is linked to a large chromosomal region that includes a gene identified in the organelle, then that gene has an increased likelihood of causing the disease. Directly sequencing this allele from patient samples might speed identification compared with further genetic linkage studies as has been demonstrated for multiple diseases associated with the nuclear envelope. The second principle is that if an organelle has been implicated in the pathology of a particular disorder, then comparison of the organelle proteome from control and patient cells might highlight differences that could indicate the causative protein. The distinct, tissue-specific pathologies associated with nuclear envelope diseases suggest that many tissues will have a set of disorders linked to this organelle, and there are numerous as yet unmapped or partially mapped syndromes that could benefit from such an approach. [Abstract/Link to Full Text]
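The first principle described above (flagging organelle proteome genes that fall inside a disease-linked chromosomal region) amounts to an interval overlap query. The sketch below assumes each gene is described by a chromosome name and start/end coordinates; the `Gene` type, the function name, and the example gene names and coordinates are all hypothetical placeholders, not data from the review.

```python
from typing import List, NamedTuple


class Gene(NamedTuple):
    name: str
    chromosome: str
    start: int  # inclusive coordinate, in base pairs
    end: int    # inclusive coordinate, in base pairs


def candidates_in_interval(
    organelle_genes: List[Gene], chromosome: str, start: int, end: int
) -> List[Gene]:
    """Return organelle genes that overlap the linked region."""
    return [
        g
        for g in organelle_genes
        if g.chromosome == chromosome and g.start <= end and g.end >= start
    ]


# Hypothetical organelle proteome with made-up coordinates.
proteome = [
    Gene("GENE_A", "chr1", 1_000_000, 1_050_000),
    Gene("GENE_B", "chr1", 9_000_000, 9_020_000),
    Gene("GENE_C", "chr2", 1_010_000, 1_030_000),
]

# Hypothetical linkage interval on chr1.
hits = candidates_in_interval(proteome, "chr1", 900_000, 2_000_000)
print([g.name for g in hits])
```

Genes returned by such a query would only be prioritized for sequencing; the overlap by itself says nothing about causation.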

Lim MS, Elenitoba-Johnson KS
Mass spectrometry-based proteomic studies of human anaplastic large cell lymphoma.
Mol Cell Proteomics. 2006 Oct;5(10):1787-98.
Malignant lymphomas are a diverse group of malignant neoplasms that arise as a result of a complex interplay of multiple factors including genetic aberrations, immunosuppression, and exposure to noxious agents such as ionizing radiation and chemical agents. Anaplastic large cell lymphoma (ALCL) is an aggressive T-lineage lymphoma harboring chromosomal translocations involving the anaplastic lymphoma kinase (ALK) tyrosine kinase. The most common translocation in ALCL is the t(2;5)(p23;q35). This results in the formation of a chimeric fusion kinase, nucleophosmin/ALK. Nucleophosmin/ALK activates numerous downstream signaling pathways resulting in enhanced survival and proliferation. Using a variety of mass spectrometry-driven proteomic strategies, we have studied several aspects of the ALCL proteome. In this review, we provide a summary of mass spectrometry-based proteomic studies that expands the current understanding of the molecular pathogenesis of ALCL and provides the basis for the identification of biomarkers and targets for novel therapeutic agents. [Abstract/Link to Full Text]

Ditzen C, Jastorff AM, Kessler MS, Bunck M, Teplytska L, Erhardt A, Krömer SA, Varadarajulu J, Targosz BS, Sayan-Ayata EF, Holsboer F, Landgraf R, Turck CW
Protein biomarkers in a mouse model of extremes in trait anxiety.
Mol Cell Proteomics. 2006 Oct;5(10):1914-20.
Brain proteome analysis of mice selectively bred for either high or low anxiety-related behavior revealed quantitative and qualitative protein expression differences. The enzyme glyoxalase-I was consistently expressed to a higher extent in low anxiety as compared with high anxiety mice in several brain areas. The same phenotype-dependent difference was also found in red blood cells with normal and cross-mated animals showing intermediate expression profiles of glyoxalase-I. Another protein that showed a different mobility during two-dimensional gel electrophoresis was identified as enolase phosphatase. The presence of both protein markers in red or white blood cells, respectively, creates the opportunity to screen for their expression in clinical blood specimens from patients suffering from anxiety. [Abstract/Link to Full Text]

Lin YF, Wu MS, Chang CC, Lin SW, Lin JT, Sun YJ, Chen DS, Chow LP
Comparative immunoproteomics of identification and characterization of virulence factors from Helicobacter pylori related to gastric cancer.
Mol Cell Proteomics. 2006 Aug;5(8):1484-96.
Helicobacter pylori is an important risk factor of gastric cancer (GC). Although many H. pylori virulence factors have been reported, the pathogenic mechanism by which H. pylori infection causes GC remains unclear. The aims of this study were to identify GC-related antigens from H. pylori and characterize their roles in the development of GC. As GC and duodenal ulcer (DU) are considered clinically divergent, we compared two-dimensional immunoblots of an acid-glycine extract of H. pylori probed with serum samples from 15 patients with GC and 15 with DU to find GC-related antigens, which were subsequently identified by mass spectrometry. Many protein spots were recognized by more than one serum, and 24 of these were better recognized by GC sera. The proteins showing higher frequency of recognition in GC group are threonine synthase, rod shape-determining protein, S-adenosylmethionine synthetase, peptide chain release factor 1, DNA-directed RNA polymerase alpha subunit, co-chaperonin GroES (monomeric and dimeric forms), response regulator OmpR, and membrane fusion protein. Of these proteins, GroES was identified as a dominant GC-related antigen with a much higher seropositivity of GC samples (64.2%, n = 95) compared with 30.9% for gastritis (n = 94) and 35.5% for DU (n = 124). GroES seropositivity was more commonly associated with antral GC than with non-antral GC (odds ratio = 2.7; 95% confidence interval, 1.1-6.7). In peripheral blood mononuclear cells, GroES stimulated production of interleukin (IL)-8, IL-6, granulocyte macrophage colony-stimulating factor, IL-1beta, tumor necrosis factor-alpha, cyclooxygenase-2, and prostaglandin E(2). Moreover when incubated with gastric epithelial cells, GroES induced expression of IL-8, cell proliferation, and up-regulation of c-jun, c-fos, and cyclin D1 but caused down-regulation of p27(Kip1). We conclude that GroES of H. pylori is a novel GC-associated virulence factor and may contribute to gastric carcinogenesis via induction of inflammation and promotion of cell proliferation. [Abstract/Link to Full Text]

Widmann J, Hamady M, Knight R
DivergentSet, a tool for picking non-redundant sequences from large sequence collections.
Mol Cell Proteomics. 2006 Aug;5(8):1520-32.
DivergentSet addresses the important but so far neglected bioinformatics task of choosing a representative set of sequences from a larger collection. We found that using a phylogenetic tree to guide the construction of divergent sets of sequences can be up to 2 orders of magnitude faster than the naive method of using a full distance matrix. By providing a user-friendly interface (available online) that integrates the tasks of finding additional sequences, building and refining the divergent set, producing random divergent sets from the same sequences, and exporting identifiers, this software facilitates a wide range of bioinformatics analyses including finding significant motifs and covariations. As an example application of DivergentSet, we demonstrate that the motifs identified by the motif-finding package MEME (Motif Elicitation by Maximum Entropy) are highly unstable with respect to the specific choice of sequences. This instability suggests that the types of sensitivity analysis enabled by DivergentSet may be widely useful for identifying the motifs of biological significance. [Abstract/Link to Full Text]
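For orientation, the naive full-distance-matrix baseline that the abstract compares against might look like the sketch below. It is not DivergentSet's tree-guided algorithm: the greedy max-min selection rule, the Hamming-fraction distance on pre-aligned equal-length sequences, and the function names are all assumptions made for illustration.

```python
from typing import List


def hamming_fraction(a: str, b: str) -> float:
    """Fraction of mismatched positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def greedy_divergent_set(seqs: List[str], k: int) -> List[int]:
    """Pick k sequence indices by greedy max-min distance selection.

    Builds the full O(n^2) distance matrix, seeds with the most distant
    pair, then repeatedly adds the sequence whose nearest already-chosen
    neighbour is farthest away.
    """
    n = len(seqs)
    if k <= 0:
        return []
    if k >= n:
        return list(range(n))
    dist = [[hamming_fraction(seqs[i], seqs[j]) for j in range(n)] for i in range(n)]
    pairs = ((i, j) for i in range(n) for j in range(i + 1, n))
    i0, j0 = max(pairs, key=lambda p: dist[p[0]][p[1]])
    chosen = [i0, j0][:k]
    while len(chosen) < k:
        remaining = (i for i in range(n) if i not in chosen)
        best = max(remaining, key=lambda i: min(dist[i][c] for c in chosen))
        chosen.append(best)
    return chosen


# Toy aligned sequences (illustrative only).
toy = ["AAAA", "AAAT", "TTTT", "AATT"]
print(greedy_divergent_set(toy, 3))  # indices of the selected sequences
```

The quadratic distance matrix is what makes this approach slow for large collections, which is the cost the tree-guided method is reported to avoid.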

Drake RR, Schwegler EE, Malik G, Diaz J, Block T, Mehta A, Semmes OJ
Lectin capture strategies combined with mass spectrometry for the discovery of serum glycoprotein biomarkers.
Mol Cell Proteomics. 2006 Oct;5(10):1957-67.
The application of mass spectrometry to identify disease biomarkers in clinical fluids like serum using high throughput protein expression profiling continues to evolve as technology development, clinical study design, and bioinformatics improve. Previous protein expression profiling studies have offered needed insight into issues of technical reproducibility, instrument calibration, sample preparation, study design, and supervised bioinformatic data analysis. In this overview, new strategies to increase the utility of protein expression profiling for clinical biomarker assay development are discussed with an emphasis on utilizing differential lectin-based glycoprotein capture and targeted immunoassays. The carbohydrate binding specificities of different lectins offer a biological affinity approach that complements existing mass spectrometer capabilities and retains automated throughput options. Specific examples using serum samples from prostate cancer and hepatocellular carcinoma subjects are provided along with suggested experimental strategies for integration of lectin-based methods into clinical fluid expression profiling strategies. Our example workflow incorporates the necessity of early validation in biomarker discovery using an immunoaffinity-based targeted analytical approach that integrates well with upstream discovery technologies. [Abstract/Link to Full Text]

Kudva IT, Krastins B, Sheng H, Griffin RW, Sarracino DA, Tarr PI, Hovde CJ, Calderwood SB, John M
Proteomics-based expression library screening (PELS): a novel method for rapidly defining microbial immunoproteomes.
Mol Cell Proteomics. 2006 Aug;5(8):1514-9.
Current methodologies for global identification of microbial proteins that elicit host humoral immune responses have several limitations and are not ideally suited for use in the postgenomic era. Here we describe a novel application of proteomics, proteomics-based expression library screening, to rapidly define microbial immunoproteomes. Proteomics-based expression library screening is broadly applicable to any cultivable, sequenced pathogen eliciting host antibody responses and hence is ideal for rapidly mining microbial proteomes for targets with diagnostic, prophylactic, and therapeutic potential. In this report, we demonstrate "proof-of-principle" by identifying 207 proteins of the Escherichia coli O157:H7 immunome in bovine reservoirs in only 3 weeks. [Abstract/Link to Full Text]

Fang Z, Miao Y, Ding X, Deng H, Liu S, Wang F, Zhou R, Watson C, Fu C, Hu Q, Lillard JW, Powell M, Chen Y, Forte JG, Yao X
Proteomic identification and functional characterization of a novel ARF6 GTPase-activating protein, ACAP4.
Mol Cell Proteomics. 2006 Aug;5(8):1437-49.
ARF6 GTPase is a conserved regulator of membrane trafficking and actin-based cytoskeleton dynamics at the leading edge of migrating cells. A key determinant of ARF6 function is the lifetime of the GTP-bound active state, which is orchestrated by GTPase-activating protein (GAP) and GTP-GDP exchanging factor. However, very little is known about the molecular mechanisms underlying ARF6-mediated cell migration. To systematically analyze proteins that regulate ARF6 activity during cell migration, we performed a proteomic analysis of proteins selectively bound to active ARF6 using mass spectrometry and identified a novel ARF6-specific GAP, ACAP4. ACAP4 encodes 903 amino acids and contains two coiled coils, one pleckstrin homology domain, one GAP motif, and two ankyrin repeats. Our biochemical characterization demonstrated that ACAP4 has a phosphatidylinositol 4,5-bisphosphate-dependent GAP activity specific for ARF6. The co-localization of ACAP4 with ARF6 occurred in ruffling membranes formed upon AIF(4) and epidermal growth factor stimulation. ACAP4 overexpression limited the recruitment of ARF6 to the membrane ruffles in the absence of epidermal growth factor stimulation. Expression of GTP hydrolysis-resistant ARF6(Q67L) resulted in accumulations of ACAP4 and ARF6 in the cytoplasmic membrane, suggesting that GTP hydrolysis is required for the ARF6-dependent membrane remodeling. Significantly the depletion of ACAP4 by small interfering RNA or inhibition of ARF6 GTP hydrolysis by overexpressing GAP-deficient ACAP4 suppressed ARF6-dependent cell migration in wound healing, demonstrating the importance of ACAP4 in cell migration. Thus, our study sheds new light on the biological function of ARF6-mediated cell migration. [Abstract/Link to Full Text]

Azad NS, Rasool N, Annunziata CM, Minasian L, Whiteley G, Kohn EC
Proteomics in clinical trials and practice: present uses and future promise.
Mol Cell Proteomics. 2006 Oct;5(10):1819-29.
The study of clinical proteomics is a promising new field that has the potential to have many applications, including the identification of biomarkers and monitoring of disease, especially in the field of oncology. Expression proteomics evaluates the cellular production of proteins encoded by a particular gene and exploits the differential expression and post-translational modifications of proteins between healthy and diseased states. These biomarkers may be applied towards early diagnosis, prognosis, and prediction of response to therapy. Functional proteomics seeks to decipher protein-protein interactions and biochemical pathways involved in disease biology and targeted by newer molecular therapeutics. Advanced spectrometry technologies and new protein array formats have improved these analyses and are now being applied prospectively in clinical trials. Further advancement of proteomics technology could usher in an era of personalized molecular medicine, where diseases are diagnosed at earlier stages and where therapies are more effective because they are tailored to the protein expression of a patient's malignancy. [Abstract/Link to Full Text]

Nedelkov D, Kiernan UA, Niederkofler EE, Tubbs KA, Nelson RW
Population proteomics: the concept, attributes, and potential for cancer biomarker research.
Mol Cell Proteomics. 2006 Oct;5(10):1811-8.
This review outlines the concept of population proteomics and its implication in the discovery and validation of cancer-specific protein modulations. Population proteomics is an applied subdiscipline of proteomics engaging in the investigation of human proteins across and within populations to define and better understand protein diversity. Population proteomics focuses on interrogation of specific proteins from a large number of individuals, utilizing top-down, targeted affinity mass spectrometry approaches to probe protein modifications. Deglycosylation, sequence truncations, side-chain residue modifications, and other modifications have been reported for a myriad of proteins, yet little is known about their incidence rate in the general population. Such information can be gathered via population proteomics and would greatly aid biomarker discovery efforts. Discovery of novel protein modifications is also expected from such large-scale population proteomics, expanding the protein knowledge database. In regard to cancer protein biomarkers, their validation via population proteomics-based approaches is advantageous as mass spectrometry detection is used both in the discovery and validation process, which is essential for the detection of those structurally modified protein biomarkers.

Roman I, Figys J, Steurs G, Zizi M
Hunting interactomes of a membrane protein: obtaining the largest set of voltage-dependent anion channel-interacting protein epitopes.
Mol Cell Proteomics. 2006 Sep;5(9):1667-80.
The identification of epitopes involved in protein-protein interactions is essential for understanding protein structure and function. Large-scale efforts, although identifying the interactions, did not always yield these epitopes, could not confirm most of the known interactions, and seemed particularly unsuccessful for native intrinsic membrane proteins. We have developed a fluidics-based approach (non-steady-state kinetics) to obtain the broadest set of the epitopes interacting with a given target and applied it to a phage display methodology optimized for membrane proteins. Phages expressing a liver cDNA library were screened against a membrane protein (voltage-dependent anion channel) reconstituted into liposomes and captured on a chip surface. The controlled fluidics was obtained by a surface plasmon resonance (SPR) device that combined the advantages of working with minute reaction volumes and non-equilibrium conditions. We demonstrated selective enrichment of binders and could even select for different binding affinities by fractionation of the selected outputs at various elution times. With voltage-dependent anion channel as bait (a mitochondrial channel critical for cellular metabolism and apoptosis) we found at least 40% of its already reported ligands and independently confirmed 55 novel functional interactions, some of which fully blocked the channel. This highly efficient approach is generally applicable to any protein and could be automated and scaled up even without the use of an SPR device. The epitopes directly identified by this method are useful not only for unraveling interactomes but also for drug design and therapeutics.

