QUERY RUN: 18 Aug 2019 at 01:30
HITS: 899

Bibliography on: Pangenome

Robert J. Robbins is a biologist, an educator, a science administrator, a publisher, an information technologist, and an IT leader and manager who specializes in advancing biomedical knowledge and supporting education through the application of information technology.

RJR: Recommended Bibliography, created 18 Aug 2019 at 01:30

Pangenome

Although the enforced stability of genomic content is ubiquitous among multicellular eukaryotes (MCEs), the opposite is proving to be the case among prokaryotes, which exhibit remarkable and adaptive plasticity of genomic content. Early bacterial whole-genome sequencing efforts discovered that whenever a particular "species" was re-sequenced, new genes were found that had not been detected earlier: entirely new genes, not merely new alleles. This led to the concepts of the bacterial core-genome, the set of genes found in all members of a particular "species", and the flex-genome, the set of genes found in some, but not all, members of the "species". Together these make up the species' pan-genome.
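The definitions above map directly onto set operations. The following is a minimal sketch, assuming each genome has already been reduced to a set of gene-family identifiers; the strain names and gene lists are hypothetical toy data, and the function name `partition_pan_genome` is introduced here purely for illustration.

```python
from typing import Dict, Set, Tuple


def partition_pan_genome(
    genomes: Dict[str, Set[str]]
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return (core, flex, pan) gene sets for a collection of genomes.

    core: genes present in every genome
    flex: genes present in some, but not all, genomes
    pan:  union of all genes (core plus flex)
    """
    if not genomes:
        return set(), set(), set()
    gene_sets = list(genomes.values())
    pan = set().union(*gene_sets)
    core = set.intersection(*gene_sets)
    flex = pan - core
    return core, flex, pan


# Hypothetical toy data: three strains of one "species".
strains = {
    "strain_A": {"gyrB", "recA", "rpoB", "tetM"},
    "strain_B": {"gyrB", "recA", "rpoB", "vanA"},
    "strain_C": {"gyrB", "recA", "rpoB", "tetM", "blaZ"},
}

core, flex, pan = partition_pan_genome(strains)
print(sorted(core))  # ['gyrB', 'recA', 'rpoB']
print(sorted(flex))  # ['blaZ', 'tetM', 'vanA']
print(len(pan))      # 6
```

Real pipelines first cluster genes into orthologous families, which is the hard part; the set arithmetic only applies once that clustering exists.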

Created with PubMed® Query: pangenome or "pan-genome" or "pan genome" NOT pmcbook NOT ispreviousversion
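For readers who want to re-run the query programmatically, the sketch below builds a request URL for the NCBI E-utilities ESearch endpoint. How this page actually retrieved its results is not stated, so the use of E-utilities is an assumption; the query string is copied verbatim from the line above, and the helper name `build_esearch_url` is hypothetical. The code only constructs the URL and makes no network call.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint (assumed retrieval route, see lead-in).
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Query string copied verbatim from the bibliography page.
QUERY = 'pangenome or "pan-genome" or "pan genome" NOT pmcbook NOT ispreviousversion'


def build_esearch_url(term: str, retmax: int = 100, retstart: int = 0) -> str:
    """Build an ESearch URL that asks PubMed for one page of matching IDs."""
    params = {
        "db": "pubmed",        # search the PubMed database
        "term": term,          # the query text, URL-encoded by urlencode
        "retmax": retmax,      # number of IDs to return per request
        "retstart": retstart,  # offset of the first ID to return
        "retmode": "json",     # ask for a JSON response
    }
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


url = build_esearch_url(QUERY)
print(url)
```

Raising `retstart` in steps of `retmax` pages through the full result set.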

Citations: The Papers (from PubMed®)

RevDate: 2019-08-14

Dar HA, Zaheer T, Shehroz M, et al (2019)

Immunoinformatics-Aided Design and Evaluation of a Potential Multi-Epitope Vaccine against Klebsiella Pneumoniae.

Vaccines, 7(3): pii:vaccines7030088.

Klebsiella pneumoniae is an opportunistic gram-negative bacterium that causes nosocomial infection in healthcare settings. Despite the high morbidity and mortality rate associated with these bacterial infections, no effective vaccine is available to counter the pathogen. In this study, the pangenome of a total of 222 available complete genomes of K. pneumoniae was explored to obtain the core proteome. A reverse vaccinology strategy was applied to the core proteins to identify four antigenic proteins. These proteins were then subjected to epitope mapping and prioritization steps to shortlist nine B-cell derived T-cell epitopes which were linked together using GPGPG linkers. An adjuvant (Cholera Toxin B) was also added at the N-terminal of the vaccine construct to improve its immunogenicity and a stabilized multi-epitope protein structure was obtained using molecular dynamics simulation. The designed vaccine exhibited sustainable and strong bonding interactions with Toll-like receptor 2 and Toll-like receptor 4. In silico reverse translation and codon optimization also confirmed its high expression in E. coli K12 strain. The computer-aided analyses performed in this study imply that the designed multi-epitope vaccine can elicit specific immune responses against K. pneumoniae. However, wet lab validation is necessary to further verify the effectiveness of this proposed vaccine candidate.

RevDate: 2019-08-10

Xing J, Li X, Sun Y, et al (2019)

Comparative genomic and functional analysis of Akkermansia muciniphila and closely related species.

Genes & genomics pii:10.1007/s13258-019-00855-1 [Epub ahead of print].

BACKGROUND: Akkermansia muciniphila is an important bacterium that resides on the mucus layer of the intestinal tract. Akkermansia muciniphila has a high abundance in human feces and plays an important role in human health.

OBJECTIVE: In this article, 23 whole genome sequences of the Akkermansia genus were comparatively studied.

METHODS: Phylogenetic trees were constructed with three methods: All amino acid sequences of each strain were used to construct the first phylogenetic tree using the web server of Composition Vector Tree Version 3. The matrix of Genome-to-Genome Distances which were obtained from GGDC 2.0 was used to construct the second phylogenetic tree using FastME. The concatenated single-copy core gene-based phylogenetic tree was generated through MEGA. The single-copy genes were obtained using OrthoMCL. Population structure was assessed by STRUCTURE 2.3.4 using the SNPs in core genes. PROKKA and Roary were used to do pan-genome analyses. The biosynthetic gene clusters were predicted using antiSMASH 4.0. IslandViewer 4 was used to detect the genomic islands.

RESULTS: The results of comparative genomic analysis revealed that: (1) The 23 Akkermansia strains formed 4 clades in phylogenetic trees. The A. muciniphila strains isolated from different geographic regions and ecological niches, formed a closely related clade. (2) The 23 Akkermansia strains were divided into 4 species based on digital DNA-DNA hybridization (dDDH) values. (3) Pan-genome of A. muciniphila is in an open state and increases with addition of new sequenced genomes. (4) SNPs were not evenly distributed throughout the A. muciniphila genomes. The genes in regions with high SNP density are related to metabolism and cell wall/membrane envelope biogenesis. (5) The thermostable outer-membrane protein, Amuc_1100, was conserved in the Akkermansia genus, except for Akkermansia glycaniphila PytT.

CONCLUSION: Overall, applying comparative genomic and pan-genomic analyses, we classified and illuminated the phylogenetic relationship of the 23 Akkermansia strains. Insights of the evolutionary, population structure, gene clusters and genome islands of Akkermansia provided more information about the possible physiological and probiotic mechanisms of the Akkermansia strains, and gave some instructions for the in-depth researches about the use of Akkermansia as a gut probiotic in the future.

RevDate: 2019-08-08

Khan AMAM, Mendoza C, Hauk VJ, et al (2019)

Genomic and physiological analyses reveal that extremely thermophilic Caldicellulosiruptor changbaiensis deploys uncommon cellulose attachment mechanisms.

Journal of industrial microbiology & biotechnology pii:10.1007/s10295-019-02222-1 [Epub ahead of print].

The genus Caldicellulosiruptor is comprised of extremely thermophilic, heterotrophic anaerobes that degrade plant biomass using modular, multifunctional enzymes. Prior pangenome analyses determined that this genus is genetically diverse, with the current pangenome remaining open, meaning that new genes are expected with each additional genome sequence added. Given the high biodiversity observed among the genus Caldicellulosiruptor, we have sequenced and added a 14th species, Caldicellulosiruptor changbaiensis, to the pangenome. The pangenome now includes 3791 ortholog clusters, 120 of which are unique to C. changbaiensis and may be involved in plant biomass degradation. Comparisons between C. changbaiensis and Caldicellulosiruptor bescii on the basis of growth kinetics, cellulose solubilization and cell attachment to polysaccharides highlighted physiological differences between the two species which are supported by their respective gene inventories. Most significantly, these comparisons indicated that C. changbaiensis possesses uncommon cellulose attachment mechanisms not observed among the other strongly cellulolytic members of the genus Caldicellulosiruptor.
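The notion of an "open" pangenome used in this abstract can be illustrated with a gene accumulation curve: the pangenome size is tracked as genomes are added one at a time, and the curve keeps rising while new genes keep appearing. The sketch below is a simplified illustration on hypothetical toy data, not the method used in the cited study. Published analyses typically average over many random genome orderings and fit a power law rather than inspecting a single ordering.

```python
from typing import List, Set


def accumulation_curve(genomes: List[Set[str]]) -> List[int]:
    """Pangenome size after adding each genome, in the given order."""
    seen: Set[str] = set()
    sizes: List[int] = []
    for genes in genomes:
        seen |= genes            # add this genome's genes to the pangenome
        sizes.append(len(seen))  # record cumulative pangenome size
    return sizes


def new_genes_per_genome(sizes: List[int]) -> List[int]:
    """Number of previously unseen genes contributed by each genome."""
    if not sizes:
        return []
    return [sizes[0]] + [b - a for a, b in zip(sizes, sizes[1:])]


# Hypothetical toy data: c* are shared genes, a* are accessory genes.
toy_genomes = [
    {"c1", "c2", "c3", "a1"},
    {"c1", "c2", "c3", "a2"},
    {"c1", "c2", "c3", "a1", "a3"},
    {"c1", "c2", "c3", "a4", "a5"},
]

sizes = accumulation_curve(toy_genomes)
gains = new_genes_per_genome(sizes)
print(sizes)  # [4, 5, 6, 8]
print(gains)  # [4, 1, 1, 2]
```

In this toy set every added genome still contributes new genes, which is the pattern an open pangenome shows; a closed pangenome would see the gains fall to zero.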

RevDate: 2019-08-03

Chapeton-Montes D, Plourde L, Bouchier C, et al (2019)

The population structure of Clostridium tetani deduced from its pan-genome.

Scientific reports, 9(1):11220 pii:10.1038/s41598-019-47551-4.

Clostridium tetani produces a potent neurotoxin, the tetanus neurotoxin (TeNT) that is responsible for the worldwide neurological disease tetanus, but which can be efficiently prevented by vaccination with tetanus toxoid. Until now only one type of TeNT has been characterized and very little information exists about the heterogeneity among C. tetani strains. We report here the genome sequences of 26 C. tetani strains, isolated between 1949 and 2017 and obtained from different locations. Genome analyses revealed that the C. tetani population is distributed in two phylogenetic clades, a major and a minor one, with no evidence for clade separation based on geographical origin or time of isolation. The chromosome of C. tetani is highly conserved; in contrast, the TeNT-encoding plasmid shows substantial heterogeneity. TeNT itself is highly conserved among all strains; the most relevant difference is an insertion of four amino acids in the C-terminal receptor-binding domain in four strains that might impact on receptor-binding properties. Other putative virulence factors, including tetanolysin and collagenase, are encoded in all genomes. This study highlights the population structure of C. tetani and suggests that tetanus-causing strains did not undergo extensive evolutionary diversification, as judged from the high conservation of its main virulence factors.

RevDate: 2019-08-02

Saad J, Phelippeau M, Khoder M, et al (2019)

"Mycobacterium mephinesia", a Mycobacterium terrae complex species of clinical interest isolated in French Polynesia.

Scientific reports, 9(1):11169 pii:10.1038/s41598-019-47674-8.

A 59-year-old male tobacco smoker with chronic bronchitis living in Taravao, French Polynesia, Pacific, presented with a two-year growing nodule in the middle lobe of the right lung. A guided bronchoalveolar lavage inoculated onto Löwenstein-Jensen medium yielded colonies of a rapidly-growing non-chromogenic mycobacterium designated as isolate P7213. The isolate could not be identified using routine matrix-assisted laser desorption ionization-time of flight-mass spectrometry and phenotypic and probe-hybridization techniques and yielded 100% and 97% sequence similarity with the respective 16S rRNA and rpoB gene sequences of Mycobacterium virginiense in the Mycobacterium terrae complex. Electron microscopy showed a 1.15 µm long and 0.38 µm wide bacillus which was in vitro susceptible to rifampicin, rifabutin, ethambutol, isoniazid, doxycycline and kanamycin. Its 4,511,948-bp draft genome exhibited a 67.6% G + C content with 4,153 protein-coding genes and 87 predicted RNA genes. Genome sequence-derived DNA-DNA hybridization, OrthoANI and pangenome analysis confirmed isolate P7213 was representative of a new species in the M. terrae complex. We named this species "Mycobacterium mephinesia".

RevDate: 2019-08-02

O'Connor E, McGowan J, McCarthy CGP, et al (2019)

Whole Genome Sequence of the Commercially Relevant Mushroom Strain Agaricus bisporus var. bisporus ARP23.

G3 (Bethesda, Md.) pii:g3.119.400563 [Epub ahead of print].

Agaricus bisporus is an extensively cultivated edible mushroom. Demand for cultivation is continuously growing and difficulties associated with breeding programmes now mean strains are effectively considered monoculture. While commercial growing practices are highly efficient and tightly controlled, the over-use of a single strain has led to a variety of disease outbreaks from a range of pathogens including bacteria, fungi and viruses. To address this, the Agaricus Resource Program (ARP) was set up to collect wild isolates from diverse geographical locations through a bounty-driven scheme to create a repository of wild Agaricus germplasm. One of the strains collected, Agaricus bisporus var. bisporus ARP23, has been crossed extensively with white commercial varieties leading to the generation of a novel hybrid with a dark brown pileus commonly referred to as 'Heirloom'. Heirloom has been successfully implemented into commercial mushroom cultivation. In this study the whole genome of Agaricus bisporus var. bisporus ARP23 was sequenced and assembled with Illumina and PacBio sequencing technology. The final genome was found to be 33.49 Mb in length and have significant levels of synteny to other sequenced Agaricus bisporus strains. Overall, 13,030 putative protein coding genes were located and annotated. Relative to the other A. bisporus genomes that are currently available, Agaricus bisporus var. bisporus ARP23 is the largest A. bisporus strain in terms of gene number and genetic content sequenced to date. Comparative genomic analysis shows that the A. bisporus mating loci are unifactorial and, unsurprisingly, highly conserved between strains. The lignocellulolytic gene content of all A. bisporus strains compared is also very similar. Our results show that the pangenome structure of A. bisporus is quite diverse with between 60-70% of the total protein coding genes per strain considered as being orthologous and syntenically conserved. These analyses and the genome sequence described herein are the starting point for more detailed molecular analyses into the growth and phenotypical responses of Agaricus bisporus var. bisporus ARP23 when challenged with economically important mycoviruses.

RevDate: 2019-08-01

Duan Z, Qiao Y, Lu J, et al (2019)

HUPAN: a pan-genome analysis pipeline for human genomes.

Genome biology, 20(1):149 pii:10.1186/s13059-019-1751-y.

The human reference genome is still incomplete, especially for those population-specific or individual-specific regions, which may have important functions. Here, we developed a HUman Pan-genome ANalysis (HUPAN) system to build the human pan-genome. We applied it to 185 deep sequencing and 90 assembled Han Chinese genomes and detected 29.5 Mb novel genomic sequences and at least 188 novel protein-coding genes missing in the human reference genome (GRCh38). It can be an important resource for the human genome-related biomedical studies, such as cancer genome analysis. HUPAN is freely available at http://cgm.sjtu.edu.cn/hupan/ and https://github.com/SJTU-CGM/HUPAN .

RevDate: 2019-07-27

Richards VP, Velsko IM, Alam T, et al (2019)

Population gene introgression and high genome plasticity for the zoonotic pathogen Streptococcus agalactiae.

Molecular biology and evolution pii:5539754 [Epub ahead of print].

The influence that bacterial adaptation (or niche partitioning) within species has on gene spillover and transmission among bacteria populations occupying different niches is not well understood. Streptococcus agalactiae is an important bacterial pathogen that has a taxonomically diverse host range making it an excellent model system to study these processes. Here we analyze a global set of 901 genome sequences from nine diverse host species to advance our understanding of these processes. Bayesian clustering analysis delineated twelve major populations that closely aligned with niches. Comparative genomics revealed extensive gene gain/loss among populations and a large pan-genome of 9,527 genes, which remained open and was strongly partitioned among niches. As a result, the biochemical characteristics of eleven populations were highly distinctive (significantly enriched). Positive selection was detected and biochemical characteristics of the dispensable genes under selection were enriched in ten populations. Despite the strong gene partitioning, phylogenomics detected gene spillover. In particular, tetracycline resistance (which likely evolved in the human-associated population) spilled over from humans to bovines, canines, seals, and fish, demonstrating how a gene selected in one host can ultimately be transmitted into another, and biased transmission from humans to bovines was confirmed with a Bayesian migration analysis. Our findings show high bacterial genome plasticity acting in balance with selection pressure from distinct functional requirements of niches that is associated with an extensive and highly partitioned dispensable genome, likely facilitating continued and expansive adaptation.

RevDate: 2019-07-23

Naidenov B, Lim A, Willyerd K, et al (2019)

Pan-Genomic and Polymorphic Driven Prediction of Antibiotic Resistance in Elizabethkingia.

Frontiers in microbiology, 10:1446.

The Elizabethkingia are a genetically diverse genus of emerging pathogens that exhibit multidrug resistance to a range of common antibiotics. Two representative species, Elizabethkingia bruuniana and E. meningoseptica, were phenotypically tested to determine minimum inhibitory concentrations (MICs) for five antibiotics. Ultra-long read sequencing with Oxford Nanopore Technologies (ONT) and subsequent de novo assembly produced complete, gapless circular genomes for each strain. Alignment based annotation with Prokka identified 5,480 features in E. bruuniana and 5,203 features in E. meningoseptica, where none of these identified genes or gene combinations corresponded to observed phenotypic resistance values. Pan-genomic analysis, performed with an additional 19 Elizabethkingia strains, identified a core-genome size of 2,658,537 bp, 32 uniquely identifiable intrinsic chromosomal antibiotic resistance core-genes and 77 antibiotic resistance pan-genes. Using core-SNPs and pan-genes in combination with six machine learning (ML) algorithms, binary classification of clindamycin and vancomycin resistance achieved f1 scores of 0.94 and 0.84, respectively. Performance on the more challenging multiclass problem for fusidic acid, rifampin and ciprofloxacin resulted in f1 scores of 0.70, 0.75, and 0.54, respectively. By producing two sets of quality biological predictors, pan-genome genes and core-genome SNPs, from long-read sequence data and applying an ensemble of ML techniques, our results demonstrated that accurate phenotypic inference, at multiple AMR resolutions, can be achieved.

RevDate: 2019-07-18

Passarelli-Araujo H, Palmeiro JK, Moharana KC, et al (2019)

Genomic analysis unveils important aspects of population structure, virulence, and antimicrobial resistance in Klebsiella aerogenes.

The FEBS journal [Epub ahead of print].

Klebsiella aerogenes is an important pathogen in healthcare-associated infections. Nevertheless, in comparison to other clinically important pathogens, K. aerogenes population structure, genetic diversity, and pathogenicity remain poorly understood. Here, we elucidate K. aerogenes clonal complexes (CCs) and genomic features associated with resistance and virulence. We present a detailed description of the population structure of K. aerogenes based on 97 publicly available genomes by using both multilocus sequence typing and single nucleotide polymorphisms extracted from the core genome. We also assessed virulence and resistance profiles using VFDB and CARD, respectively. We show that K. aerogenes has an open pangenome and a large effective population size, which account for its high genomic diversity and support that negative selection prevents fixation of most deleterious alleles. The population is structured in at least ten CCs, including two novel ones identified here, CC9 and CC10. The repertoires of resistance genes comprise a high number of antibiotic efflux proteins as well as narrow and extended spectrum β-lactamases. Regarding the population structure, we identified two clusters based on virulence profiles because of the presence of the toxin-encoding clb operon and the siderophore production genes, irp and ybt. Notably, CC3 comprises the majority of K. aerogenes isolates associated with hospital outbreaks, emphasizing the importance of constant monitoring of this pathogen. Collectively, our results may provide a foundation for the development of new therapeutic and surveillance strategies worldwide. This article is protected by copyright. All rights reserved.

RevDate: 2019-07-18

Chen SL (2019)

Genomic Insights Into the Distribution and Evolution of Group B Streptococcus.

Frontiers in microbiology, 10:1447.

Streptococcus agalactiae, also known as Group B Streptococcus (GBS), is a bacterium with truly protean biology. It infects a variety of hosts, among which the most commonly studied are humans, cattle, and fish. GBS holds a singular position in the history of bacterial genomics, as it was the substrate used to describe one of the first major conceptual advances of comparative genomics, the idea of the pan-genome. In this review, I describe a brief history of GBS and the major contributions of genomics to understanding its genome plasticity and evolution as well as its molecular epidemiology, focusing on the three hosts mentioned above. I also discuss one of the major recent paradigm shifts in our understanding of GBS evolution and disease burden: foodborne GBS can cause invasive infections in humans.

RevDate: 2019-07-18

Xia Q, Pan L, Zhang R, et al (2019)

The genome assembly of asparagus bean, Vigna unguiculata ssp. sesquipedialis.

Scientific data, 6(1):124 pii:10.1038/s41597-019-0130-6.

Asparagus bean (Vigna unguiculata ssp. sesquipedialis), known for its very long and tender green pods, is an important vegetable crop broadly grown in the developing Asian countries. In this study, we reported a 632.8 Mb assembly (549.81 Mb non-N size) of asparagus bean based on the whole genome shotgun sequencing strategy. We also generated a linkage map for asparagus bean, which helped anchor 94.42% of the scaffolds into 11 pseudo-chromosomes. A total of 42,609 protein-coding genes and 3,579 non-protein-coding genes were predicted from the assembly. Taken together, these genomic resources of asparagus bean will help develop a pan-genome of V. unguiculata and facilitate the investigation of economically valuable traits in this species, so that the cultivation of this plant would help combat the protein and energy malnutrition in the developing world.

RevDate: 2019-07-16

Yahara K, Lehours P, Vale FF (2019)

Analysis of genetic recombination and the pan-genome of a highly recombinogenic bacteriophage species.

Microbial genomics [Epub ahead of print].

Bacteriophages are the most prevalent biological entities impacting on the ecosystem and are characterized by their extensive diversity. However, there are two aspects of phages that have remained largely unexplored: genetic flux by recombination between phage populations and characterization of specific phages in terms of the pan-genome. Here, we examined the recombination and pan-genome in Helicobacter pylori prophages at both the genome and gene level. In the genome-level analysis, we applied, for the first time, chromosome painting and fineSTRUCTURE algorithms to a phage species, and showed novel trends in inter-population genetic flux. Notably, hpEastAsia is a phage population that imported a higher proportion of DNA fragments from other phages, whereas the hpSWEurope phages showed weaker signatures of inter-population recombination, suggesting genetic isolation. The gene-level analysis showed that, after parameter tuning of the prokaryote pan-genome analysis program, H. pylori phages have a pan-genome consisting of 75 genes and a soft-core genome of 10 genes, which includes genes involved in the lytic and lysogenic life cycles. Quantitative analysis of recombination events of the soft-core genes showed no substantial variation in the intensity of recombination across the genes, but rather equally frequent recombination among housekeeping genes that were previously reported to be less prone to recombination. The signature of frequent recombination appears to reflect the host-phage evolutionary arms race, either by contributing to escape from bacterial immunity or by protecting the host by producing defective phages.

RevDate: 2019-07-14

Paterson ML, Ranasinghe D, Blom J, et al (2019)

Genomic analysis of a novel Rhodococcus (Prescottella) equi isolate from a bovine host.

Archives of microbiology pii:10.1007/s00203-019-01695-z [Epub ahead of print].

Rhodococcus (Prescottella) equi causes pneumonia-like infections in foals with high mortality rates and can also infect a number of other animals. R. equi is also emerging as an opportunistic human pathogen. In this study, we have sequenced the genome of a novel R. equi isolate, B0269, isolated from the faeces of a bovine host. Comparative genomic analyses with seven other published R. equi genomes, including those from equine or human sources, revealed a pangenome comprising 6876 genes with 4141 genes in the core genome. Two hundred and seventy-five genes were specific to the bovine isolate, mostly encoding hypothetical proteins of unknown function. However, these genes include four copies of terA and five copies of terD genes that may be involved in responding to chemical stress. Virulence characteristics in R. equi are associated with the presence of large plasmids carrying a pathogenicity island, including genes from the vap multigene family. A BLAST search of the protein sequences from known virulence-associated plasmids (pVAPA, pVAPB and pVAPN) revealed a similar plasmid backbone on two contigs in bovine isolate B0269; however, no homologues of the main virulence-associated genes, vapA, vapB or vapN, were identified. In summary, this study confirms that R. equi genomes are highly conserved and reports the presence of an apparently novel plasmid in the bovine isolate B0269 that needs further characterisation to understand its potential involvement in virulence properties.

RevDate: 2019-07-13

Kingstad-Bakke BA, Chandrasekar SS, Phanse Y, et al (2019)

Effective mosaic-based nanovaccines against avian influenza in poultry.

Vaccine pii:S0264-410X(19)30854-0 [Epub ahead of print].

Avian influenza virus (AIV) is an extraordinarily diverse pathogen that causes significant morbidity in domesticated poultry populations and threatens human life with looming pandemic potential. Controlling avian influenza in susceptible populations requires highly effective, economical and broadly reactive vaccines. Several AIV vaccines have proven insufficient despite their wide use, and better technologies are needed to improve their immunogenicity and broaden effectiveness. Previously, we developed a "mosaic" H5 subtype hemagglutinin (HA) AIV vaccine and demonstrated its broad protection against diverse highly pathogenic H5N1 and seasonal H1N1 virus strains in mouse and non-human primate models. There is a significant interest in developing effective and safe vaccines against AIV that cannot contribute to the emergence of new strains of the virus once circulating in poultry. Here, we report on the development of an H5 mosaic (H5M) vaccine antigen formulated with polyanhydride nanoparticles (PAN) that provide sustained release of encapsulated antigens. H5M vaccine constructs were immunogenic whether delivered by the modified virus Ankara (MVA) strain or encapsulated within PAN. Both humoral and cellular immune responses were generated in both specific-pathogen free (SPF) and commercial chicks. Importantly, chicks vaccinated by H5M constructs were protected in terms of viral shedding from divergent challenge with a low pathogenicity avian influenza (LPAI) strain at 8 weeks post-vaccination. In addition, protective levels of humoral immunity were generated against highly pathogenic avian influenza (HPAI) of the similar H5N1 and genetically dissimilar H5N2 viruses. Overall, the developed platform technologies (MVA vector and PAN encapsulation) were safe and provided high levels of sustained protection against AIV in chickens. Such approaches could be used to design more efficacious vaccines against other important poultry infections.

RevDate: 2019-07-12

McCarthy CGP, Fitzpatrick DA (2019)

Pangloss: A Tool for Pan-Genome Analysis of Microbial Eukaryotes.

Genes, 10(7): pii:genes10070521.

Although the pan-genome concept originated in prokaryote genomics, an increasing number of eukaryote species pan-genomes have also been analysed. However, there is a relative lack of software intended for eukaryote pan-genome analysis compared to that available for prokaryotes. In a previous study, we analysed the pan-genomes of four model fungi with a computational pipeline that constructed pan-genomes using the synteny-dependent Pan-genome Ortholog Clustering Tool (PanOCT) approach. Here, we present a modified and improved version of that pipeline which we have called Pangloss. Pangloss can perform gene prediction for a set of genomes from a given species that the user provides, constructs and optionally refines a species pan-genome from that set using PanOCT, and can perform various functional characterisation and visualisation analyses of species pan-genome data. To demonstrate Pangloss's capabilities, we constructed and analysed a species pan-genome for the oleaginous yeast Yarrowia lipolytica and also reconstructed a previously-published species pan-genome for the opportunistic respiratory pathogen Aspergillus fumigatus. Pangloss is implemented in Python, Perl and R and is freely available under an open source GPLv3 licence via GitHub.

RevDate: 2019-07-11

Passera A, Compant S, Casati P, et al (2019)

Not Just a Pathogen? Description of a Plant-Beneficial Pseudomonas syringae Strain.

Frontiers in microbiology, 10:1409.

Plants develop in a microbe-rich environment and must interact with a plethora of microorganisms, both pathogenic and beneficial. Indeed, such is the case of Pseudomonas, and its model organisms P. fluorescens and P. syringae, a bacterial genus that has received particular attention because of its beneficial effect on plants and its pathogenic strains. The present study aims to compare plant-beneficial and pathogenic strains belonging to the P. syringae species to get new insights into the distinction between the two types of plant-microbe interactions. In assays carried out under greenhouse conditions, P. syringae pv. syringae strain 260-02 was shown to promote plant-growth and to exert biocontrol of P. syringae pv. tomato strain DC3000, against the Botrytis cinerea fungus and the Cymbidium Ringspot Virus. This P. syringae strain also had a distinct volatile emission profile, as well as a different plant-colonization pattern, visualized by confocal microscopy and gfp labeled strains, compared to strain DC3000. Despite the different behavior, the P. syringae strain 260-02 showed great similarity to pathogenic strains at a genomic level. However, genome analyses highlighted a few differences that form the basis for the following hypotheses regarding strain 260-02. P. syringae strain 260-02: (i) possesses non-functional virulence genes, like the mangotoxin-producing operon Mbo; (ii) has different regulation pathways, suggested by the difference in the autoinducer system and the lack of a virulence activator gene; (iii) has genes encoding DNA methylases different from those found in other P. syringae strains, suggested by the presence of horizontal-gene-transfer-obtained methylases that could affect gene expression.

RevDate: 2019-07-11

Fontana A, Falasconi I, Molinari P, et al (2019)

Genomic Comparison of Lactobacillus helveticus Strains Highlights Probiotic Potential.

Frontiers in microbiology, 10:1380.

Lactobacillus helveticus belongs to the large group of lactic acid bacteria (LAB), which are the major players in the fermentation of a wide range of foods. LAB are also present in the human gut, which has often been exploited as a reservoir of potential novel probiotic strains, but several parameters need to be assessed before establishing their safety and potential use for human consumption. In the present study, six L. helveticus strains isolated from natural whey cultures were analyzed for their phenotype and genotype in exopolysaccharide (EPS) production, low pH and bile salt tolerance, bile salt hydrolase (BSH) activity, and antibiotic resistance profile. In addition, a comparative genomic investigation was performed between the six newly sequenced strains and the 51 publicly available genomes of L. helveticus to define the pangenome structure. The results indicate that the newly sequenced strain UC1267 and the deposited strain DSM 20075 can be considered good candidates for gut-adapted strains due to their ability to survive in the presence of 0.2% glycocholic acid (GCA) and 1% taurocholic and taurodeoxycholic acid (TDCA). Moreover, these strains had the highest bile salt deconjugation activity among the tested L. helveticus strains. Considering the safety profile, none of these strains presented antibiotic resistance phenotypically and/or at the genome level. The pangenome analysis revealed genes specific to the new isolates, such as enzymes related to folate biosynthesis in strains UC1266 and UC1267 and an integrated phage in strain UC1035. Finally, the presence of maltose-degrading enzymes and multiple copies of 6-phospho-β-glucosidase genes in our strains indicates the capability to metabolize sugars other than lactose, which is related solely to dairy niches.

RevDate: 2019-07-10

Tian X, Li R, Fu W, et al (2019)

Building a sequence map of the pig pan-genome from multiple de novo assemblies and Hi-C data.

Science China. Life sciences pii:10.1007/s11427-019-9551-7 [Epub ahead of print].

Pigs were domesticated independently in the Near East and China, indicating that a single reference genome from one individual is unable to represent the full spectrum of divergent sequences in pigs worldwide. Therefore, 12 de novo pig assemblies from Eurasia were compared in this study to identify the missing sequences from the reference genome. As a result, 72.5 Mb of non-redundant sequences (∼3% of the genome) were found to be absent from the reference genome (Sscrofa11.1) and were defined as pan-sequences. Of the pan-sequences, 9.0 Mb were dominant in Chinese pigs, in contrast with their low frequency in European pigs. One sequence dominant in Chinese pigs contained the complete genic region of the tazarotene-induced gene 3 (TIG3) gene which is involved in fatty acid metabolism. Using flanking sequences and Hi-C based methods, 27.7% of the sequences could be anchored to the reference genome. The supplementation of these sequences could contribute to the accurate interpretation of the 3D chromatin structure. A web-based pan-genome database was further provided to serve as a primary resource for exploration of genetic diversity and promote pig breeding and biomedical research.

RevDate: 2019-07-09

Piligrimova EG, Kazantseva OA, Nikulin NA, et al (2019)

Bacillus Phage vB_BtS_B83 Previously Designated as a Plasmid May Represent a New Siphoviridae Genus.

Viruses, 11(7): pii:v11070624.

The Bacillus cereus group of bacteria includes, inter alia, the species known to be associated with human diseases and food poisoning. Here, we describe the Bacillus phage vB_BtS_B83 (abbreviated as B83) infecting the species of this group. Transmission electron microscopy (TEM) micrographs indicate that B83 belongs to the Siphoviridae family. B83 is a temperate phage using an arbitrium system for the regulation of the lysis-lysogeny switch, and is probably capable of forming a circular plasmid prophage. Comparative analysis shows that it has been previously sequenced, but was mistaken for a plasmid. B83 shares common genome organization and >46% of proteins with another Bacillus phage, BMBtp14. Phylograms constructed using large terminase subunits and a pan-genome presence-absence matrix show that these phages form a clade distinct from the closest viruses. Based on the above, we propose the creation of a new genus named Bembunaquatrovirus that includes B83 and BMBtp14.

RevDate: 2019-07-08

Machado KCT, Fortuin S, Tomazella GG, et al (2019)

On the Impact of the Pangenome and Annotation Discrepancies While Building Protein Sequence Databases for Bacteria Proteogenomics.

Frontiers in microbiology, 10:1410.

In proteomics, peptide information within mass spectrometry (MS) data from a specific organism sample is routinely matched against a protein sequence database that best represents that organism. However, if the species/strain in the sample is unknown or genetically poorly characterized, it becomes challenging to determine a database that can represent the sample. Building customized protein sequence databases merging multiple strains for a given species has become a strategy to overcome such restrictions. However, as more genetic information is publicly available and interesting genetic features such as the existence of pan- and core genes within a species are revealed, we questioned how efficient such merging strategies are to report relevant information. To test this assumption, we constructed databases containing conserved and unique sequences for 10 different species. Features that are relevant for probabilistic-based protein identification by proteomics were then monitored. As expected, increase in database complexity correlates with pangenomic complexity. However, Mycobacterium tuberculosis and Bordetella pertussis generated very complex databases despite having low pangenomic complexity. We further tested database performance by using MS data from eight clinical strains from M. tuberculosis, and from two published datasets from Staphylococcus aureus. We show that by using an approach where database size is controlled by removing repeated identical tryptic sequences across strains/species, computational time can be reduced drastically as database complexity increases.
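The idea of controlling database size by removing repeated identical tryptic sequences can be illustrated with a small sketch. This is not the authors' pipeline: it assumes a simplified trypsin rule (cleave after K or R unless the next residue is P, no missed cleavages), and the function names, the minimum peptide length, and the toy sequences are all hypothetical.

```python
def tryptic_peptides(protein, min_len=6):
    """In silico digest: cleave after K/R unless followed by P (simplified rule)."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal remainder
    # Very short peptides are rarely informative; min_len is an assumed cutoff.
    return [p for p in peptides if len(p) >= min_len]


def nonredundant_peptides(strains):
    """Return (total peptide count, set of unique peptides) across all strains."""
    total, unique = 0, set()
    for proteins in strains.values():
        for protein in proteins:
            for peptide in tryptic_peptides(protein):
                total += 1
                unique.add(peptide)
    return total, unique


# Toy input: two hypothetical strains whose proteins differ by one residue.
strains = {
    "strainA": ["MKTAYIAKQRPEPTIDEK"],
    "strainB": ["MKTAYIAKQRPEPTIDER"],
}
total, unique = nonredundant_peptides(strains)
print(total, len(unique))
```

In this toy case the shared peptide is stored once, so the merged database holds three entries instead of four; the saving grows with the number of closely related strains.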

RevDate: 2019-07-07

Rank Nielsen M, Dam Wollenberg R, Ringsborg Westphal K, et al (2019)

Heterologous Expression of Intact Biosynthetic Gene Clusters in Fusarium graminearum.

Fungal genetics and biology : FG & B pii:S1087-1845(19)30046-5 [Epub ahead of print].

Filamentous fungi such as species from the genus Fusarium are capable of producing a wide palette of interesting metabolites relevant to health, agriculture and biotechnology. Secondary metabolites are formed from large synthase/synthetase enzymes often encoded in gene clusters containing additional enzymes cooperating in the metabolite's biosynthesis. The true potential of fungal metabolomes remains untapped as the majority of secondary metabolite gene clusters are silent under standard laboratory growth conditions. One way to achieve expression of biosynthetic pathways is to clone the responsible genes and express them in a well-suited heterologous host, which poses a challenge since Fusarium polyketide synthase and non-ribosomal peptide synthetase gene clusters can be large (e.g. as large as 80 kb) and comprise several genes necessary for product formation. The major challenge associated with heterologous expression of fungal biosynthesis pathways is thus handling and cloning large DNA sequences. In this paper we present the successful workflow for cloning, reconstruction and heterologous production of two previously characterized Fusarium pseudograminearum natural product pathways in Fusarium graminearum. In vivo yeast recombination enabled rapid assembly of the W493 (NRPS32-PKS40) and the Fusarium Cytokinin gene clusters. F. graminearum transformants were obtained through protoplast-mediated and Agrobacterium tumefaciens-mediated transformation. Whole genome sequencing revealed that isolation of transformants carrying intact copies of the gene clusters was possible. Known Fusarium cytokinin metabolites (fusatin, 8-oxo-fusatin, 8-oxo-isopentenyladenine, fusatinic acid) together with cis- and trans-zeatin were detected by liquid chromatography and mass spectrometry, which confirmed gene functionality in F. graminearum. In addition, the non-ribosomal lipopeptide products W493 A and B were heterologously produced in similar amounts to those observed in the F. pseudograminearum donor. The Fusarium pan-genome comprises more than 60 uncharacterized putative secondary metabolite gene clusters. We nominate the well-characterized F. graminearum as a heterologous expression platform for Fusarium secondary metabolite gene clusters, and present our experience cloning and introducing gene clusters into this species. We expect the presented methods will inspire future endeavours in heterologous production of Fusarium metabolites and potentially aid the production and characterization of novel natural products.

RevDate: 2019-07-07

Matteoli FP, Passarelli-Araujo H, Pedrosa-Silva F, et al (2019)

Population structure and pangenome analysis of Enterobacter bugandensis uncover the presence of blaCTX-M-55, blaNDM-5 and blaIMI-1, along with sophisticated iron acquisition strategies.

Genomics pii:S0888-7543(19)30319-2 [Epub ahead of print].

Enterobacter bugandensis is a recently described species that has been largely associated with nosocomial infections. We report the genome of a non-clinical E. bugandensis strain, which was integrated with publicly available genomes to study the pangenome and general population structure of E. bugandensis. Core- and whole-genome multilocus sequence typing allowed the detection of five E. bugandensis phylogroups (PG-A to E), which contain important antimicrobial resistance and virulence determinants. We uncovered several extended-spectrum β-lactamases, including blaCTX-M-55 and blaNDM-5, present in an IncX replicon type plasmid, described here for the first time in E. bugandensis. Genetic context analysis of blaNDM-5 revealed the resemblance of this plasmid with other IncX plasmids from other bacteria from the same country. Three distinctive siderophore producing operons were found in E. bugandensis: enterobactin (ent), aerobactin (iuc/iut), and salmochelin (iro). Our findings provide novel insights on the lifestyle, physiology, antimicrobial, and virulence profiles of E. bugandensis.

RevDate: 2019-07-05

Kopejtka K, Lin Y, Jakubovičová M, et al (2019)

Clustered core- and pan-genome content on Rhodobacteraceae chromosomes.

Genome biology and evolution pii:5527758 [Epub ahead of print].

In Bacteria, chromosome replication starts at a single origin of replication and proceeds on both replichores. Due to its asymmetric nature, replication influences chromosome structure and gene organization, mutation rate and expression. To date, little is known about the distribution of highly conserved genes over the bacterial chromosome. Here, we used a set of 101 fully-sequenced Rhodobacteraceae representatives to analyze the relationship between conservation of genes within this family and their distance from the origin of replication. Twenty-two of the analyzed species had core genes clustered significantly closer to the origin of replication with representatives of the genus Celeribacter being the most apparent example. Interestingly, there were also eight species with the opposite organization. In particular Rhodobaca barguzinensis and Loktanella vestfoldensis showed a significant increase of core genes with distance from the origin of replication. The uneven distribution of low-conserved regions is in particular pronounced for genomes in which the halves of one replichore differ in their conserved gene content. Phage integration and horizontal gene transfer partially explain the scattered nature of Rhodobacteraceae genomes. Our findings lay the foundation for a better understanding of bacterial genome evolution and the role of replication therein.
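The relationship between gene conservation and distance from the origin of replication can be sketched as follows. This is only an illustration of the distance measure on a circular chromosome; the statistical test used in the study is not described in the abstract, and the function names and toy coordinates are hypothetical.

```python
def distance_from_origin(position, origin, chromosome_length):
    """Shortest distance (bp) between a gene and the origin on a circular chromosome."""
    offset = abs(position - origin) % chromosome_length
    # On a circle the gene can be reached along either replichore.
    return min(offset, chromosome_length - offset)


def mean_relative_distance(positions, origin, chromosome_length):
    """Mean distance scaled to [0, 1], where 1 corresponds to the terminus side."""
    half = chromosome_length / 2
    scaled = [
        distance_from_origin(p, origin, chromosome_length) / half
        for p in positions
    ]
    return sum(scaled) / len(scaled)


# Toy example: origin at 100 bp on a 1,000 bp chromosome (assumed values).
core_positions = [100, 600]
print(mean_relative_distance(core_positions, origin=100, chromosome_length=1000))
```

Comparing this value for core genes against the value for all genes gives a rough sense of whether conserved genes sit closer to the origin; a real analysis would add a significance test.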

RevDate: 2019-06-27

Levesque S, de Melo AG, Labrie SJ, et al (2019)

Mobilome of Brevibacterium aurantiacum Sheds Light on Its Genetic Diversity and Its Adaptation to Smear-Ripened Cheeses.

Frontiers in microbiology, 10:1270.

Brevibacterium aurantiacum is an actinobacterium that confers key organoleptic properties to washed-rind cheeses during the ripening process. Although this industrially relevant species has been gaining increasing attention in the past years, its genome plasticity is still understudied due to the unavailability of complete genomic sequences. To add insights on the mobilome of this group, we sequenced the complete genomes of five dairy Brevibacterium strains and one non-dairy strain using PacBio RSII. We performed phylogenetic and pan-genome analyses, including comparisons with other publicly available Brevibacterium genomic sequences. Our phylogenetic analysis revealed that these five dairy strains, previously identified as Brevibacterium linens, belong instead to the B. aurantiacum species. A high number of transposases and integrases were observed in the Brevibacterium spp. strains. In addition, we identified 14 and 12 new insertion sequences (IS) in B. aurantiacum and B. linens genomes, respectively. Several stretches of homologous DNA sequences were also found between B. aurantiacum and other cheese rind actinobacteria, suggesting horizontal gene transfer (HGT). A HGT region from an iRon Uptake/Siderophore Transport Island (RUSTI) and an iron uptake composite transposon were found in five B. aurantiacum genomes. These findings suggest that low iron availability in milk is a driving force in the adaptation of this bacterial species to this niche. Moreover, the exchange of iron uptake systems suggests cooperative evolution between cheese rind actinobacteria. We also demonstrated that the integrative and conjugative element BreLI (Brevibacterium Lanthipeptide Island) can excise from the B. aurantiacum SMQ-1417 chromosome. Our comparative genomic analysis suggests that mobile genetic elements played an important role in the adaptation of B. aurantiacum to cheese ecosystems.

RevDate: 2019-06-26

Zhang B, Zhu W, Diao S, et al (2019)

The poplar pangenome provides insights into the evolutionary history of the genus.

Communications biology, 2:215 pii:474.

The genus Populus comprises a complex amalgam of ancient and modern species that has become a prime model for evolutionary and taxonomic studies. Here we sequenced the genomes of 10 species from five sections of the genus Populus, identified 71 million genomic variations, and observed new correlations between the single-nucleotide polymorphism-structural variation (SNP-SV) density and indel-SV density to complement the SNP-indel density correlation reported in mammals. Disease resistance genes (R genes) with heterozygous loss-of-function (LOF) were significantly enriched in the 10 species, which increased the diversity of poplar R genes during evolution. Heterozygous LOF mutations in the self-incompatibility genes were closely related to the self-fertilization of poplar, suggestive of genomic control of self-fertilization in dioecious plants. The phylogenetic genome-wide SNPs tree also showed possible ancient hybridization among species in sections Tacamahaca, Aigeiros, and Leucoides. The pangenome resource also provided information for poplar genetics and breeding.

RevDate: 2019-06-26

Zhang AN, Mao Y, Wang Y, et al (2019)

Mining traits for the enrichment and isolation of not-yet-cultured populations.

Microbiome, 7(1):96 pii:10.1186/s40168-019-0708-4.

BACKGROUND: The lack of pure cultures limits our understanding of 99% of bacteria. Proper interpretation of the genetic and the transcriptional datasets can reveal clues for the enrichment and even isolation of the not-yet-cultured populations. Unraveling such information requires a proper mining method.

RESULTS: Here, we present a method to infer the hidden traits for the enrichment of not-yet-cultured populations. We demonstrate this method using Candidatus Accumulibacter. Our method constructs a whole picture of the carbon, electron, and energy flows in the not-yet-cultured populations from the genomic datasets. Then, it decodes the coordination across the three flows from the transcriptional datasets. On this basis, our method diagnoses the status of the not-yet-cultured populations and provides a strategy to optimize the enrichment systems.

CONCLUSION: Our method could shed light on the exploration of bacterial dark matter in the environment.

RevDate: 2019-06-24

Québatte M, C Dehio (2019)

Bartonella Gene Transfer Agent: Evolution, Function, and Proposed Role in Host Adaptation.

Cellular microbiology [Epub ahead of print].

The processes underlying host-adaptation by bacterial pathogens remain a fundamental question with relevant clinical, ecological and evolutionary implications. Zoonotic pathogens of the genus Bartonella constitute an exceptional model to study these aspects. Bartonellae have undergone a spectacular diversification into multiple species resulting from adaptive radiation. Specific adaptations of a complex facultative intracellular lifestyle have enabled the colonization of distinct mammalian reservoir hosts. This remarkable host adaptability has a multifactorial basis and is thought to be driven by horizontal gene transfer (HGT) and recombination among a limited genus-specific pan-genome. Recent functional and evolutionary studies revealed that the conserved Bartonella gene transfer agent (BaGTA) mediates highly efficient HGT and could thus drive this evolution. Here we review the recent progress made towards understanding BaGTA evolution, function, and its role in the evolution and pathogenesis of Bartonella spp. We notably discuss how BaGTA could have contributed to genome diversification through recombination of beneficial traits that underlie host adaptability. We further address how BaGTA may counter the accumulation of deleterious mutations in clonal populations (Muller's Ratchet), which are expected to occur through the recurrent transmission bottlenecks during the complex infection cycle of these pathogens in their mammalian reservoir hosts and arthropod vectors.

RevDate: 2019-06-24

Minnullina L, Pudova D, Shagimardanova E, et al (2019)

Comparative Genome Analysis of Uropathogenic Morganella morganii Strains.

Frontiers in cellular and infection microbiology, 9:167.

Morganella morganii is an opportunistic bacterial pathogen shown to cause a wide range of clinical and community-acquired infections. This study was aimed at sequencing and comparing the genomes of three M. morganii strains isolated from the urine samples of patients with community-acquired urinary tract infections. Draft genome sequencing was conducted using the Illumina HiSeq platform. The genomes of MM 1, MM 4, and MM 190 strains have a size of 3.82-3.97 Mb and a GC content of 50.9-51%. Protein-coding sequences (CDS) represent 96.1% of the genomes, RNAs are encoded by 2.7% of genes and pseudogenes account for 1.2% of the genomes. The pan-genome contains 4,038 CDS, of which 3,279 represent core genes. Six to ten prophages and 21-33 genomic islands were identified in the genomes of MM 1, MM 4, and MM 190. More than 30 genes encode capsular biosynthesis proteins, an average of 60 genes encode motility and chemotaxis proteins, and about 70 genes are associated with fimbrial biogenesis and adhesion. We determined that all strains contained the urease gene cluster ureABCEFGD and had urease activity. Both MM 4 and MM 190 strains are capable of hemolysis and their activity correlates well with a cytotoxicity level on T-24 bladder carcinoma cells. These activities were associated with expression of RTX toxin gene hlyA, which was introduced into the genomes by a phage similar to Salmonella phage 118970_sal4.
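The pan-genome and core-genome counts reported above (4,038 CDS in total, 3,279 of them core, leaving 759 non-core) follow from simple set operations once genes have been grouped into ortholog clusters. The sketch below assumes that clustering has already been done; the function name and the toy cluster labels are hypothetical and do not come from the study.

```python
def pan_and_core(gene_sets):
    """Return (pan, core, accessory) gene sets from a strain -> gene-cluster mapping."""
    sets = [set(genes) for genes in gene_sets.values()]
    if not sets:
        return set(), set(), set()
    pan = set().union(*sets)         # clusters found in at least one strain
    core = set.intersection(*sets)   # clusters found in every strain
    return pan, core, pan - core     # accessory = pan minus core


# Toy presence data for three strains (cluster labels are made up).
gene_sets = {
    "MM 1": {"ureA", "ureB", "hlyA", "c1"},
    "MM 4": {"ureA", "ureB", "hlyA", "c2"},
    "MM 190": {"ureA", "ureB", "c3"},
}
pan, core, accessory = pan_and_core(gene_sets)
print(len(pan), len(core), len(accessory))
```

The same three sets are what a presence-absence matrix encodes: core clusters are rows with no absences, accessory clusters are rows with at least one.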

RevDate: 2019-06-18

Blake VC, Woodhouse MR, Lazo GR, et al (2019)

GrainGenes: centralized small grain resources and digital platform for geneticists and breeders.

Database : the journal of biological databases and curation, 2019:.

GrainGenes (https://wheat.pw.usda.gov or https://graingenes.org) is an international centralized repository for curated, peer-reviewed datasets useful to researchers working on wheat, barley, rye and oat. GrainGenes manages genomic, genetic, germplasm and phenotypic datasets through a dynamically generated web interface for facilitated data discovery. Since 1992, GrainGenes has served geneticists and breeders in both the public and private sectors on six continents. Recently, several new datasets were curated into the database along with new tools for analysis. The GrainGenes homepage was enhanced by making it more visually intuitive and by adding links to commonly used pages. Several genome assemblies and genomic tracks are displayed through the genome browsers at GrainGenes, including the Triticum aestivum (bread wheat) cv. 'Chinese Spring' IWGSC RefSeq v1.0 genome assembly, the Aegilops tauschii (D genome progenitor) Aet v4.0 genome assembly, the Triticum turgidum ssp. dicoccoides (wild emmer wheat) cv. 'Zavitan' WEWSeq v.1.0 genome assembly, a T. aestivum (bread wheat) pangenome, the Hordeum vulgare (barley) cv. 'Morex' IBSC genome assembly, the Secale cereale (rye) select 'Lo7' assembly, a partial hexaploid Avena sativa (oat) assembly and the Triticum durum cv. 'Svevo' (durum wheat) RefSeq Release 1.0 assembly. New genetic maps and markers were added and can be displayed through CMAP. Quantitative trait loci, genetic maps and genes from the Wheat Gene Catalogue are indexed and linked through the Wheat Information System (WheatIS) portal. Training videos were created to help users query and reach the data they need. GSP (Genome Specific Primers) and PIECE2 (Plant Intron Exon Comparison and Evolution) tools were implemented and are available to use. As more small grains reference sequences become available, GrainGenes will play an increasingly vital role in helping researchers improve crops.

RevDate: 2019-06-16

Chun BH, Han DM, Kim KH, et al (2019)

Genomic and metabolic features of Tetragenococcus halophilus as revealed by pan-genome and transcriptome analyses.

Food microbiology, 83:36-47.

The genomic and metabolic diversity and features of Tetragenococcus halophilus, a moderately halophilic lactic acid bacterium, were investigated by pan-genome, transcriptome, and metabolite analyses. Phylogenetic analyses based on the 16S rRNA gene and genome sequences of 15 T. halophilus strains revealed their phylogenetic distinctness from other Tetragenococcus species. Pan-genome analysis of the T. halophilus strains showed that their carbohydrate metabolic capabilities were diverse and strain dependent. Aside from one histidine decarboxylase gene in one strain, no decarboxylase gene associated with biogenic amine production was identified from the genomes. However, T. halophilus DSM 20339T produced tyramine without a biogenic amine-producing decarboxylase gene, suggesting the presence of an unidentified tyramine-producing gene. Our reconstruction of the metabolic pathways of these strains showed that T. halophilus harbors a facultative lactic acid fermentation pathway to produce l-lactate, ethanol, acetate, and CO2 from various carbohydrates. The transcriptomic analysis of strain DSM 20339T suggested that T. halophilus may produce more acetate via the heterolactic pathway (including d-ribose metabolism) at high salt conditions. Although genes associated with the metabolism of glycine betaine, proline, glutamate, glutamine, choline, and citrulline were identified from the T. halophilus genomes, the transcriptome and metabolite analyses suggested that glycine betaine was the main compatible solute responding to high salt concentration and that citrulline may play an important role in the coping mechanism against high salinity-induced osmotic stresses. Our results will provide a better understanding of the genome and metabolic features of T. halophilus, which has implications for the food fermentation industry.

RevDate: 2019-06-13

Kröber E, H Schäfer (2019)

Identification of Proteins and Genes Expressed by Methylophaga thiooxydans During Growth on Dimethylsulfide and Their Presence in Other Members of the Genus.

Frontiers in microbiology, 10:1132.

Dimethylsulfide is a volatile organic sulfur compound that provides the largest input of biogenic sulfur from the oceans to the atmosphere, and thence back to land, constituting an important link in the global sulfur cycle. Microorganisms degrading DMS affect fluxes of DMS in the environment, but the underlying metabolic pathways are still poorly understood. Methylophaga thiooxydans is a marine methylotrophic bacterium capable of growth on DMS as sole source of carbon and energy. Using proteomics and transcriptomics we identified genes expressed during growth on dimethylsulfide and methanol to refine our knowledge of the metabolic pathways that are involved in DMS and methanol degradation in this strain. Amongst the most highly expressed genes on DMS were the two methanethiol oxidases driving the oxidation of this reactive and toxic intermediate of DMS metabolism. Growth on DMS also increased expression of the enzymes of the tetrahydrofolate linked pathway of formaldehyde oxidation, in addition to the tetrahydromethanopterin linked pathway. Key enzymes of the inorganic sulfur oxidation pathway included flavocytochrome c sulfide dehydrogenase, sulfide quinone oxidoreductase, and persulfide dioxygenases. A sulP permease was also expressed during growth on DMS. Proteomics and transcriptomics also identified a number of highly expressed proteins and gene products whose function is currently not understood. As the identity of some enzymes of organic and inorganic sulfur metabolism previously detected in Methylophaga has not been characterized at the genetic level yet, highly expressed uncharacterized genes provide new targets for further biochemical and genetic analysis. A pan-genome analysis of six available Methylophaga genomes showed that only two of the six investigated strains, M. thiooxydans and M. sulfidovorans, have the gene encoding methanethiol oxidase, suggesting that growth on methylated sulfur compounds of M. aminisulfidivorans is likely to involve different enzymes and metabolic intermediates. Hence, the pathways of DMS-utilization and subsequent C1 and sulfur oxidation are not conserved across Methylophaga isolates that degrade methylated sulfur compounds.

RevDate: 2019-06-12

Guyeux C, Charr JC, Tran HTM, et al (2019)

Evaluation of chloroplast genome annotation tools and application to analysis of the evolution of coffee species.

PloS one, 14(6):e0216347 pii:PONE-D-18-32695.

Chloroplast sequences are widely used for phylogenetic analysis due to their high degree of conservation in plants. Whole chloroplast genomes can now be readily obtained for plant species using new sequencing methods, giving invaluable data for plant evolution. However, new annotation methods are required for the efficient analysis of this data to deliver high quality phylogenetic analyses. In this study, the two main tools for chloroplast genome annotation were compared. More consistent detection and annotation of genes were produced with GeSeq when compared to the currently used Dogma. This suggests that the annotation of most of the previously annotated chloroplast genomes should now be updated. GeSeq was applied to species related to coffee, including 16 species of the Coffea and Psilanthus genera to reconstruct the ancestral chloroplast genomes and to evaluate their phylogenetic relationships. Eight genes in the plant chloroplast pan genome (consisting of 92 genes) were always absent in the coffee species analyzed. Notably, the two main cultivated coffee species (i.e. Arabica and Robusta) did not group into the same clade and differ in their pattern of gene evolution. While Arabica coffee (Coffea arabica) belongs to the Coffea genus, Robusta coffee (Coffea canephora) is associated with the Psilanthus genus. A more extensive survey of related species is required to determine if this is a unique attribute of Robusta coffee or a more widespread feature of coffee tree species.

RevDate: 2019-06-06

Hsu T, Gemmell MR, Franzosa EA, et al (2019)

Comparative genomics and genome biology of Campylobacter showae.

Emerging microbes & infections, 8(1):827-840.

Campylobacter showae, a bacterium historically linked to gingivitis and periodontitis, has recently been associated with inflammatory bowel disease and colorectal cancer. Our aim was to generate genome sequences for new clinical C. showae strains and identify functional properties explaining their pathogenic potential. Eight C. showae genomes were assessed, four strains isolated from inflamed gut tissues from paediatric Crohn's disease patients, three strains from colonic adenomas, and one from a gastroenteritis patient stool. Genome assemblies were analyzed alongside the only 3 deposited C. showae genomes. The pangenome from these 11 strains consisted of 4686 unique protein families, and the core genome size was estimated at 1050 ± 15 genes with each new genome contributing an additional 206 ± 16 genes. Functional assays indicated that colonic strains segregated into 2 groups: adherent/invasive vs. non-adherent/non-invasive strains. The former possessed Type IV secretion machinery and S-layer proteins, while the latter contained Cas genes and other CRISPR associated proteins. Comparison of gene profiles with strains in Human Microbiome Project metagenomes showed that gut-derived isolates share genes specific to tongue dorsum and supragingival plaque counterparts. Our findings indicate that C. showae strains are phenotypically and genetically diverse and suggest that secretion systems may play an important role in virulence potential.

RevDate: 2019-06-10

Hemsley CM, O'Neill PA, Essex-Lopresti A, et al (2019)

Extensive genome analysis of Coxiella burnetii reveals limited evolution within genomic groups.

BMC genomics, 20(1):441 pii:10.1186/s12864-019-5833-8.

BACKGROUND: Coxiella burnetii is a zoonotic pathogen that resides in wild and domesticated animals across the globe and causes a febrile illness, Q fever, in humans. An improved understanding of the genetic diversity of C. burnetii is essential for the development of diagnostics, vaccines and therapeutics, but genotyping data is lacking from many parts of the world. Sporadic outbreaks of Q fever have occurred in the United Kingdom, but the local genetic make-up of C. burnetii has not been studied in detail.

RESULTS: Here, we report whole genome data for nine C. burnetii sequences obtained in the UK. All four genomes of C. burnetii from cattle, as well as one sheep sample, belonged to Multi-spacer sequence type (MST) 20, whereas the goat samples were MST33 (three genomes) and MST32 (one genome), two genotypes that have not been described to be present in the UK to date. We established the phylogenetic relationship between the UK genomes and 67 publicly available genomes based on single nucleotide polymorphisms (SNPs) in the core genome, which confirmed tight clustering of strains within genomic groups, but also indicated that sub-groups exist within those groups. Variation is mainly achieved through SNPs, many of which are non-synonymous, thereby confirming that evolution of C. burnetii is based on modification of existing genes. Finally, we discovered genomic-group specific genome content, which supports a model of clonal expansion of previously established genotypes, with large scale dissemination of some of these genotypes across continents being observed.

CONCLUSIONS: The genetic make-up of C. burnetii in the UK is similar to the one in neighboring European countries. As a species, C. burnetii has been considered a clonal pathogen with low genetic diversity at the nucleotide level. Here, we present evidence for significant variation at the protein level between isolates of different genomic groups, which mainly affects secreted and membrane-associated proteins. Our results thereby increase our understanding of the global genetic diversity of C. burnetii and provide new insights into the evolution of this emerging zoonotic pathogen.

RevDate: 2019-06-04

León-Sampedro R, Del Campo R, Rodriguez-Baños M, et al (2019)

Phylogenomics of Enterococcus faecalis from wild birds: new insights into host-associated differences in core and accessory genomes of the species.

Environmental microbiology [Epub ahead of print].

Wild birds have been suggested to be reservoirs of antimicrobial resistant and/or pathogenic E. faecalis (Efs) strains, but the scarcity of studies and available sequences limits our understanding of the population structure of the species in these hosts. Here, we analyzed the clonal and plasmid diversity of 97 Efs isolates from wild migratory birds. We found a high diversity, with most sequence types (STs) being first described here, while others were found in other hosts including some predominant in poultry. We found that pheromone-responsive plasmids predominate in wild bird Efs while 35% of the isolates entirely lack plasmids. Then, to better understand the ecology of the species, five strains of known STs (ST82, ST170, ST16, ST55) were sequenced and compared with all the Efs genomes available in public databases. Using several methods to analyze core and accessory genomes (AccNET, PLACNET, hierBAPS, PANINI), we detected differences in the accessory genome of some lineages (e.g. ST82) demonstrating specific associations with birds. Conversely, the genomes of other Efs lineages exhibited divergence in core and accessory genomes, reflecting different adaptive trajectories in various hosts. This pangenome divergence, horizontal gene transfer events and occasional epidemic peaks could explain the population structure of the species.

RevDate: 2019-05-31

Rossoni AW, Price DC, Seger M, et al (2019)

The genomes of polyextremophilic Cyanidiales contain 1% horizontally transferred genes with diverse adaptive functions.

eLife, 8: pii:45017 [Epub ahead of print].

The role and extent of horizontal gene transfer (HGT) in eukaryotes are hotly disputed topics that impact our understanding of the origin of metabolic processes and the role of organelles in cellular evolution. We addressed this issue by analyzing 10 novel Cyanidiales genomes and determined that 1% of their gene inventory is HGT-derived. Numerous HGT candidates share a close phylogenetic relationship with prokaryotes that live in similar habitats as the Cyanidiales and encode functions related to polyextremophily. HGT candidates differ from native genes in GC-content, number of splice sites, and gene expression. HGT candidates are more prone to loss, which may explain the absence of a eukaryotic pan-genome. Therefore, the lack of a pan-genome and cumulative effects fail to provide substantive arguments against our hypothesis of recurring HGT followed by differential loss in eukaryotes. The maintenance of 1% HGTs, even under selection for genome reduction, underlines the importance of non-endosymbiosis related foreign gene acquisition.

RevDate: 2019-05-30

Lyu J (2019)

Tomato pan-genome.

Nature plants pii:10.1038/s41477-019-0453-5 [Epub ahead of print].

RevDate: 2019-06-10

Singh PK, Mahato AK, Jain P, et al (2019)

Comparative Genomics Reveals the High Copy Number Variation of a Retro Transposon in Different Magnaporthe Isolates.

Frontiers in microbiology, 10:966.

Magnaporthe oryzae is one of the fungal pathogens of rice which results in heavy yield losses worldwide. Understanding the genomic structure of M. oryzae is essential for appropriate deployment of the blast resistance in rice crop improvement programs. In this study we sequenced two M. oryzae isolates, RML-29 (avirulent) and RP-2421 (highly virulent), and performed a comparative study along with three publicly available genomes of 70-15, P131, and Y34. We identified several candidate effectors (>600) and isolate specific sequences from RML-29 and RP-2421, while a core set of 10013 single copy orthologs were found among the isolates. Pan-genome analysis showed extensive presence and absence variations (PAVs). We identified isolate-specific genes across 12 isolates using the pan-genome information. Repeat analysis was separately performed for each of the 15 isolates. This analysis revealed ∼25 times higher copy number of short interspersed nuclear elements (SINE) in virulent than avirulent isolate. We conclude that the extensive PAVs and occurrence of SINE throughout the genome could be one of the major mechanisms by which pathogenic variability is emerging in M. oryzae isolates. The knowledge gained in this comparative genome study can provide understandings about the fungal genome variations in different hosts and environmental conditions, and it will provide resources to effectively manage this important disease of rice.

RevDate: 2019-06-10

Norri T, Cazaux B, Kosolobov D, et al (2019)

Linear time minimum segmentation enables scalable founder reconstruction.

Algorithms for molecular biology : AMB, 14:12 pii:147.

Background: We study a preprocessing routine relevant in pan-genomic analyses: consider a set of aligned haplotype sequences of complete human chromosomes. Due to the enormous size of such data, one would like to represent this input set with a few founder sequences that retain as well as possible the contiguities of the original sequences. Such a smaller set gives a scalable way to exploit pan-genomic information in further analyses (e.g. read alignment and variant calling). Optimizing the founder set is an NP-hard problem, but there is a segmentation formulation that can be solved in polynomial time, defined as follows. Given a threshold $L$ and a set $R = \{R_1, \ldots, R_m\}$ of $m$ strings (haplotype sequences), each having length $n$, the minimum segmentation problem for founder reconstruction is to partition $[1, n]$ into set $P$ of disjoint segments such that each segment $[a, b] \in P$ has length at least $L$ and the number $d(a, b) = |\{R_i[a, b] : 1 \le i \le m\}|$ of distinct substrings at segment $[a, b]$ is minimized over $[a, b] \in P$. The distinct substrings in the segments represent founder blocks that can be concatenated to form $\max\{d(a, b) : [a, b] \in P\}$ founder sequences representing the original $R$ such that crossovers happen only at segment boundaries.

Results: We give an $O(mn)$ time (i.e. linear time in the input size) algorithm to solve the minimum segmentation problem for founder reconstruction, improving over an earlier $O(mn^2)$.

Conclusions: Our improvement makes it possible to apply the formulation to an input of thousands of complete human chromosomes. We implemented the new algorithm and give experimental evidence on its practicality. The implementation is available at https://github.com/tsnorri/founder-sequences.
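
To make the objective in the abstract above concrete, the following Python sketch solves the minimum segmentation problem with a naive dynamic program. It is not the authors' linear-time algorithm (their repository holds that implementation). It is a slow baseline, quadratic in the sequence length, intended only to illustrate what is being minimized. The function name min_segmentation, the 0-based half-open segment convention, and the toy haplotypes are assumptions introduced here for illustration.

```python
def min_segmentation(rows, min_len):
    """Naive DP for the minimum segmentation problem (illustrative only).

    rows    : list of equal-length strings (aligned haplotypes)
    min_len : the threshold L, the minimum allowed segment length
    Returns (cost, segments), where cost is the smallest achievable value of
    max d(a, b) and segments is a list of half-open (start, end) pairs.
    """
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    n = len(rows[0])
    if any(len(r) != n for r in rows):
        raise ValueError("all rows must have the same length")

    INF = float("inf")
    best = [INF] * (n + 1)   # best[j]: optimal cost for the prefix [0, j)
    back = [-1] * (n + 1)    # back[j]: start of the last segment in that optimum
    best[0] = 0

    for j in range(min_len, n + 1):
        for i in range(0, j - min_len + 1):
            if best[i] == INF:
                continue  # prefix [0, i) cannot be segmented legally
            # d(i, j): number of distinct substrings in columns [i, j)
            distinct = len({r[i:j] for r in rows})
            cost = max(best[i], distinct)
            if cost < best[j]:
                best[j], back[j] = cost, i

    if best[n] == INF:
        raise ValueError("no segmentation satisfies the length threshold")

    # Walk the back-pointers to recover the segments.
    segments = []
    j = n
    while j > 0:
        i = back[j]
        segments.append((i, j))
        j = i
    segments.reverse()
    return best[n], segments


# Toy input (hypothetical haplotypes, not data from the paper).
toy = ["AAAACC", "AAAAGG", "TTTTCC", "TTTTGG"]
cost, segs = min_segmentation(toy, 2)
print(cost, segs)
```

On the toy input the whole range contains four distinct rows, while splitting after the fourth column leaves two distinct blocks per segment, so two founder sequences suffice.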

RevDate: 2019-06-10

Yang X, Lee WP, Ye K, et al (2019)

One reference genome is not enough.

Genome biology, 20(1):104 pii:10.1186/s13059-019-1717-0.

A recent study on human structural variation indicates insufficiencies and errors in the human reference genome, GRCh38, and argues for the construction of a human pan-genome.

RevDate: 2019-06-10

Feyereisen M, Mahony J, Kelleher P, et al (2019)

Comparative genome analysis of the Lactobacillus brevis species.

BMC genomics, 20(1):416 pii:10.1186/s12864-019-5783-1.

BACKGROUND: Lactobacillus brevis is a member of the lactic acid bacteria (LAB), and strains of L. brevis have been isolated from silage, as well as from fermented cabbage and other fermented foods. However, this bacterium is also commonly associated with bacterial spoilage of beer.

RESULTS: In the current study, complete genome sequences of six isolated L. brevis strains were determined. Five of these L. brevis strains were isolated from beer (three isolates) or the brewing environment (two isolates), and were characterized as beer-spoilers or non-beer spoilers, respectively, while the sixth isolate had previously been isolated from silage. The genomic features of 19 L. brevis strains, encompassing the six L. brevis strains described in this study and thirteen L. brevis strains for which complete genome sequences were available in public databases, were analyzed with particular attention to evolutionary aspects and adaptation to beer.

CONCLUSIONS: Comparative genomic analysis highlighted evolution of the taxon allowing niche colonization, notably adaptation to the beer environment, with approximately 50 chromosomal genes acquired by L. brevis beer-spoiler strains representing approximately 2% of their total chromosomal genetic content. These genes primarily encode proteins that are putatively involved in oxidation-reduction reactions, transcription regulation or membrane transport, functions that may be crucial to survive the harsh conditions associated with beer. The study emphasized the role of plasmids in beer spoilage with a number of unique genes identified among L. brevis beer-spoiler strains.

RevDate: 2019-06-10

Vincent AT, Schiettekatte O, Goarant C, et al (2019)

Revisiting the taxonomy and evolution of pathogenicity of the genus Leptospira through the prism of genomics.

PLoS neglected tropical diseases, 13(5):e0007270 pii:PNTD-D-18-01947.

The causative agents of leptospirosis are responsible for an emerging zoonotic disease worldwide. One of the major routes of transmission for leptospirosis is the natural environment contaminated with the urine of a wide range of reservoir animals. Soils and surface waters also host a high diversity of non-pathogenic Leptospira and species for which the virulence status is not clearly established. The genus Leptospira is currently divided into 35 species classified into three phylogenetic clusters, which supposedly correlate with the virulence of the bacteria. In this study, a total of 90 Leptospira strains isolated from different environments worldwide including Japan, Malaysia, New Caledonia, Algeria, mainland France, and the island of Mayotte in the Indian Ocean were sequenced. A comparison of average nucleotide identity (ANI) values of genomes of the 90 isolates and representative genomes of known species revealed 30 new Leptospira species. These data also supported the existence of two clades and 4 subclades. To avoid classification that strongly implies assumption on the virulence status of the lineages, we called them P1, P2, S1, S2. One of these subclades has not yet been described and is composed of Leptospira idonii and 4 novel species that are phylogenetically related to the saprophytes. We then investigated genome diversity and evolutionary relationships among members of the genus Leptospira by studying the pangenome and core gene sets. Our data enable the identification of genome features, genes and domains that are important for each subclade, thereby laying the foundation for refining the classification of this complex bacterial genus. We also shed light on atypical genomic features of a group of species that includes the species often associated with human infection, suggesting a specific and ongoing evolution of this group of species that will require more attention. 
In conclusion, we have uncovered a massive species diversity and revealed a novel subclade in environmental samples collected worldwide and we have redefined the classification of species in the genus. The implication of several new potentially infectious Leptospira species for human and animal health remains to be determined but our data also provide new insights into the emergence of virulence in the pathogenic species.

RevDate: 2019-06-10

González V, Santamaría RI, Bustos P, et al (2019)

Phylogenomic Rhizobium Species Are Structured by a Continuum of Diversity and Genomic Clusters.

Frontiers in microbiology, 10:910.

The bacterial genus Rhizobium comprises diverse symbiotic nitrogen-fixing species associated with the roots of plants in the Leguminosae family. Multiple genomic clusters defined by whole genome comparisons occur within Rhizobium, but their equivalence to species is controversial. In this study we investigated such genomic clusters to ascertain their significance in a species phylogeny context. Phylogenomic inferences based on complete sets of ribosomal proteins and stringent core genome markers revealed the main lineages of Rhizobium. The clades corresponding to R. etli and R. leguminosarum species show several genomic clusters with average genomic nucleotide identities (ANI > 95%), and a continuum of divergent strains, respectively. They were found to be inversely correlated with the genetic distance estimated from concatenated ribosomal proteins. We uncovered evidence of a Rhizobium pangenome that was greatly expanded, both in its chromosomes and plasmids. Despite the variability of extra-chromosomal elements, our genomic comparisons revealed only a few chromid and plasmid families. The presence/absence profile of genes in the complete Rhizobium genomes agreed with the phylogenomic pattern of species divergence. Symbiotic genes were distributed according to the principal phylogenomic Rhizobium clades but did not resolve genome clusters within the clades. We distinguished some types of symbiotic plasmids within Rhizobium that displayed different rates of synonymous nucleotide substitutions in comparison to chromosomal genes. Symbiotic plasmids may have been repeatedly transferred horizontally between strains and species, in the process displacing and substituting pre-existing symbiotic plasmids. In summary, the results indicate that Rhizobium genomic clusters, as defined by whole genomic identities, might be part of a continuous process of evolutionary divergence that includes the core and the extrachromosomal elements leading to species formation.

RevDate: 2019-06-10

Pucker B, Holtgräwe D, Stadermann KB, et al (2019)

A chromosome-level sequence assembly reveals the structure of the Arabidopsis thaliana Nd-1 genome and its gene set.

PloS one, 14(5):e0216233 pii:PONE-D-18-35533.

In addition to the BAC-based reference sequence of the accession Columbia-0 from the year 2000, several short read assemblies of THE plant model organism Arabidopsis thaliana were published during the last years. Also, a SMRT-based assembly of Landsberg erecta has been generated that identified translocation and inversion polymorphisms between two genotypes of the species. Here we provide a chromosome-arm level assembly of the A. thaliana accession Niederzenz-1 (AthNd-1_v2c) based on SMRT sequencing data. The best assembly comprises 69 nucleome sequences and displays a contig length of up to 16 Mbp. Compared to an earlier Illumina short read-based NGS assembly (AthNd-1_v1), a 75 fold increase in contiguity was observed for AthNd-1_v2c. To assign contig locations independent from the Col-0 gold standard reference sequence, we used genetic anchoring to generate a de novo assembly. In addition, we assembled the chondrome and plastome sequences. Detailed analyses of AthNd-1_v2c allowed reliable identification of large genomic rearrangements between A. thaliana accessions contributing to differences in the gene sets that distinguish the genotypes. One of the differences detected identified a gene that is lacking from the Col-0 gold standard sequence. This de novo assembly extends the known proportion of the A. thaliana pan-genome.

RevDate: 2019-05-31

Galata V, Laczny CC, Backes C, et al (2019)

Integrating Culture-based Antibiotic Resistance Profiles with Whole-genome Sequencing Data for 11,087 Clinical Isolates.

Genomics, proteomics & bioinformatics pii:S1672-0229(19)30092-0 [Epub ahead of print].

Emerging antibiotic resistance is a major global health threat. The analysis of nucleic acid sequences linked to susceptibility phenotypes facilitates the study of genetic antibiotic resistance determinants to inform molecular diagnostics and drug development. We collected genetic data (11,087 newly-sequenced whole genomes) and culture-based resistance profiles (10,991 out of the 11,087 isolates comprehensively tested against 22 antibiotics in total) of clinical isolates including 18 main species spanning a time period of 30 years. Species and drug specific resistance patterns were observed including increased resistance rates for Acinetobacter baumannii to carbapenems and for Escherichia coli to fluoroquinolones. Species-level pan-genomes were constructed to reflect the genetic repertoire of the respective species, including conserved essential genes and known resistance factors. Integrating phenotypes and genotypes through species-level pan-genomes made it possible to infer gene-drug resistance associations using statistical testing. The isolate collection and the analysis results have been integrated into GEAR-base, a resource available for academic research use free of charge at https://gear-base.com.

RevDate: 2019-06-01

Gao L, Gonda I, Sun H, et al (2019)

The tomato pan-genome uncovers new genes and a rare allele regulating fruit flavor.

Nature genetics, 51(6):1044-1051.

Modern tomatoes have narrow genetic diversity limiting their improvement potential. We present a tomato pan-genome constructed using genome sequences of 725 phylogenetically and geographically representative accessions, revealing 4,873 genes absent from the reference genome. Presence/absence variation analyses reveal substantial gene loss and intense negative selection of genes and promoters during tomato domestication and improvement. Lost or negatively selected genes are enriched for important traits, especially disease resistance. We identify a rare allele in the TomLoxC promoter selected against during domestication. Quantitative trait locus mapping and analysis of transgenic plants reveal a role for TomLoxC in apocarotenoid production, which contributes to desirable tomato flavor. In orange-stage fruit, accessions harboring both the rare and common TomLoxC alleles (heterozygotes) have higher TomLoxC expression than those homozygous for either and are resurgent in modern tomatoes. The tomato pan-genome adds depth and completeness to the reference genome, and is useful for future biological discovery and breeding.

RevDate: 2019-05-11

Cruz-Morales P, Orellana CA, Moutafis G, et al (2019)

Revisiting the evolution and taxonomy of Clostridia, a phylogenomic update.

Genome biology and evolution pii:5487998 [Epub ahead of print].

Clostridium is a large genus of obligate anaerobes belonging to the Firmicutes phylum of bacteria, most of which have a Gram-positive cell wall structure. The genus includes significant human and animal pathogens, causative of potentially deadly diseases such as tetanus and botulism. Despite their relevance and many studies suggesting that they are not a monophyletic group, the taxonomy of the group has largely been neglected. Currently, species belonging to the genus are placed in the unnatural order defined as Clostridiales, which belongs to the class Clostridia. Here we used genomic data from 779 strains to study the taxonomy and evolution of the group. This analysis allowed us to: (i) confirm that the group is composed of more than one genus, (ii) detect major differences between pathogens classified as a single species within the group of authentic Clostridium spp. (sensu stricto), (iii) identify inconsistencies between taxonomy and toxin evolution that reflect on the pervasive misclassification of strains, and (iv) identify differential traits within central metabolism of members of what has been defined earlier and confirmed by us as cluster I. Our analysis shows that the current taxonomic classification of Clostridium species hinders the prediction of functions and traits, suggests a new classification for this fascinating class of bacteria and highlights the importance of phylogenomics for taxonomic studies.

RevDate: 2019-05-12

Park SC, Lee K, Kim YO, et al (2019)

Large-Scale Genomics Reveals the Genetic Characteristics of Seven Species and Importance of Phylogenetic Distance for Estimating Pan-Genome Size.

Frontiers in microbiology, 10:834.

For more than a decade, pan-genome analysis has been applied as an effective method for explaining the variation in genetic content of prokaryotic species. However, genomic characteristics and detailed structures of gene pools have not been fully clarified, because most studies have used a small number of genomes. Here, we constructed pan-genomes of seven species in order to elucidate variations in the genetic contents of >27,000 genomes belonging to Streptococcus pneumoniae, Staphylococcus aureus subsp. aureus, Salmonella enterica subsp. enterica, Escherichia coli and Shigella spp., Mycobacterium tuberculosis complex, Pseudomonas aeruginosa, and Acinetobacter baumannii. This work showed that the pan-genomes of all seven species are open. Additionally, systematic evaluation of the characteristics of their pan-genomes revealed that phylogenetic distance provided valuable information for estimating the parameters for pan-genome size among several models including Heaps' law. Our results provide a better understanding of the species and a solution to minimize sampling biases associated with genome-sequencing preferences for pathogenic strains.
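
The abstract above mentions Heaps' law as one model for pan-genome size. As a minimal sketch, and not the procedure used in the paper, the power-law form P(N) = kappa * N^gamma can be fitted by ordinary least squares on log-transformed values. The function name fit_heaps and the input values below are assumptions made for illustration. The sizes are synthetic, generated from the law itself with kappa = 2000 and gamma = 0.3, so they are not measurements from any study.

```python
import math


def fit_heaps(genome_counts, pan_sizes):
    """Fit P(N) = kappa * N**gamma by least squares in log-log space.

    genome_counts : numbers of genomes sampled (N), all > 0
    pan_sizes     : corresponding pan-genome sizes P(N), all > 0
    Returns (kappa, gamma).
    """
    xs = [math.log(n) for n in genome_counts]
    ys = [math.log(p) for p in pan_sizes]
    k = len(xs)
    mean_x = sum(xs) / k
    mean_y = sum(ys) / k
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    gamma = sxy / sxx                           # slope of the log-log line
    kappa = math.exp(mean_y - gamma * mean_x)   # intercept, back-transformed
    return kappa, gamma


# Synthetic example: sizes are computed from the model, not observed data.
counts = [1, 2, 5, 10, 50, 100]
sizes = [2000.0 * n ** 0.3 for n in counts]
kappa, gamma = fit_heaps(counts, sizes)
print(round(kappa, 3), round(gamma, 3))
```

Under this parameterization a positive fitted exponent means the estimated size keeps growing as genomes are added, which is the usual reading of an "open" pan-genome. Real curves are normally averaged over many random orderings of the genomes before fitting, a step this sketch omits.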

RevDate: 2019-05-13

van der Nest MA, Steenkamp ET, Roodt D, et al (2019)

Genomic analysis of the aggressive tree pathogen Ceratocystis albifundus.

Fungal biology, 123(5):351-363.

The overall goal of this study was to determine whether the genome of an important plant pathogen in Africa, Ceratocystis albifundus, is structured into subgenomic compartments, and if so, to establish how these compartments are distributed across the genome. For this purpose, the publicly available genome of C. albifundus was complemented with the genome sequences for four additional isolates using the Illumina HiSeq platform. In addition, a reference genome for one of the individuals was assembled using both PacBio and Illumina HiSeq technologies. Our results showed a high degree of synteny between the five genomes, although several regions lacked detectable long-range synteny. These regions were associated with the presence of accessory genes, lower genetic similarity, variation in read-map depth, as well as transposable elements and genes associated with host-pathogen interactions (e.g. effectors and CAZymes). Such patterns are regarded as hallmarks of accelerated evolution, particularly of accessory subgenomic compartments in fungal pathogens. Our findings thus showed that the genome of C. albifundus is made up of core and accessory subgenomic compartments, which is an important step towards characterizing its pangenome. This study also highlights the value of comparative genomics for understanding mechanisms that may underlie and influence the biology and evolution of pathogens.

RevDate: 2019-05-10

Lorentzen MP, Campbell-Sills H, Jorgensen TS, et al (2019)

Expanding the biodiversity of Oenococcus oeni through comparative genomics of apple cider and kombucha strains.

BMC genomics, 20(1):330 pii:10.1186/s12864-019-5692-3.

BACKGROUND: Oenococcus oeni is a lactic acid bacteria species adapted to the low pH, ethanol-rich environments of wine and cider fermentation, where it performs the crucial role of malolactic fermentation. It has a small genome and has lost the mutS-mutL DNA mismatch repair genes, making it a hypermutable and highly specialized species. Two main lineages of strains, named groups A and B, have been described to date, as well as other subgroups correlated to different types of wines or regions. A third group "C" has also been hypothesized based on sequence analysis, but it remains controversial. In this study we have elucidated the species population structure by sequencing 14 genomes of new strains isolated from cider and kombucha and performing comparative genomics analyses.

RESULTS: Sequence-based phylogenetic trees confirmed a population structure of 4 clades: the previously identified A and B, a third group "C" consisting of the new cider strains and a small subgroup of wine strains previously attributed to group B, and a fourth group "D" exclusively represented by kombucha strains. A pair of complete genomes from groups C and D was compared to the circularized O. oeni PSU-1 strain reference genome and no genomic rearrangements were found. Phylogenetic trees, K-means clustering and pangenome gene clusters evidenced the existence of smaller, specialized subgroups of strains. Using the pangenome, genomic differences in stress resistance and biosynthetic pathways were found to uniquely distinguish the C and D clades.

CONCLUSIONS: The obtained results, including the additional cider and kombucha strains, firmly established the O. oeni population structure. Group C does not appear as fully domesticated as group A to wine, but showed several unique patterns which may be due to ongoing specialization to the cider environment. Group D was shown to be the most divergent member of O. oeni to date, appearing as the closest to a pre-domestication state of the species.

RevDate: 2019-05-28

Wang J, Xing J, Lu J, et al (2019)

Complete Genome Sequencing of Bacillus velezensis WRN014, and Comparison with Genome Sequences of other Bacillus velezensis Strains.

Journal of microbiology and biotechnology, 29(5):794-808.

Bacillus velezensis strain WRN014 was isolated from banana fields in Hainan, China. Bacillus velezensis is an important member of the plant growth-promoting rhizobacteria (PGPR) which can enhance plant growth and control soil-borne disease. The complete genome of Bacillus velezensis WRN014 was sequenced by combining the Illumina HiSeq 2500 system and Pacific Biosciences SMRT high-throughput sequencing technologies. Then, the genome of Bacillus velezensis WRN014, together with 45 other completed genome sequences of the Bacillus velezensis strains, was comparatively studied. The genome of Bacillus velezensis WRN014 was 4,063,541 bp in length and contained 4,062 coding sequences, 9 genomic islands and 13 gene clusters. The results of comparative genomic analysis provide evidence that (i) The 46 Bacillus velezensis strains formed 2 obviously closely related clades in phylogenetic trees. (ii) The pangenome in this study is open and is increasing with the addition of new sequenced genomes. (iii) Analysis of single nucleotide polymorphisms (SNPs) revealed local diversification of the 46 Bacillus velezensis genomes. Surprisingly, SNPs were not evenly distributed throughout the whole genome. (iv) Analysis of gene clusters revealed that rich gene clusters spread over Bacillus velezensis strains and some gene clusters are conserved in different strains. This study reveals that the strain WRN014 and other Bacillus velezensis strains have potential to be used as PGPR and biopesticide.

RevDate: 2019-04-28

Dillon MM, Almeida RND, Laflamme B, et al (2019)

Molecular Evolution of Pseudomonas syringae Type III Secreted Effector Proteins.

Frontiers in plant science, 10:418.

Diverse Gram-negative pathogens like Pseudomonas syringae employ type III secreted effector (T3SE) proteins as primary virulence factors that combat host immunity and promote disease. T3SEs can also be recognized by plant hosts and activate an effector triggered immune (ETI) response that shifts the interaction back toward plant immunity. Consequently, T3SEs are pivotal in determining the virulence potential of individual P. syringae strains, and ultimately help to restrict P. syringae pathogens to a subset of potential hosts that are unable to recognize their repertoires of T3SEs. While a number of effector families are known to be present in the P. syringae species complex, one of the most persistent challenges has been documenting the complex variation in T3SE contents across a diverse collection of strains. Using the entire pan-genome of 494 P. syringae strains isolated from more than 100 hosts, we conducted a global analysis of all known and putative T3SEs. We identified a total of 14,613 putative T3SEs, 4,636 of which were unique at the amino acid level, and show that T3SE repertoires of different P. syringae strains vary dramatically, even among strains isolated from the same hosts. We also find substantial diversification within many T3SE families, and in many cases find strong signatures of positive selection. Furthermore, we identify multiple gene gain and loss events for several families, demonstrating an important role of horizontal gene transfer (HGT) in the evolution of P. syringae T3SEs. These analyses provide insight into the evolutionary history of P. syringae T3SEs as they co-evolve with the host immune system, and dramatically expand the database of P. syringae T3SE alleles.

RevDate: 2019-05-03

Roach R, Mann R, Gambley CG, et al (2019)

Genomic sequence analysis reveals diversity of Australian Xanthomonas species associated with bacterial leaf spot of tomato, capsicum and chilli.

BMC genomics, 20(1):310 pii:10.1186/s12864-019-5600-x.

BACKGROUND: The genetic diversity in Australian populations of Xanthomonas species associated with bacterial leaf spot in tomato, capsicum and chilli was compared to worldwide bacterial populations. The aim of this study was to confirm the identities of these Australian Xanthomonas species and classify them in comparison to overseas isolates. Analysis of whole genome sequence allows for the investigation of bacterial population structure, pathogenicity and gene exchange, resulting in better management strategies and biosecurity.

RESULTS: Phylogenetic analysis of the core genome alignments and SNP data grouped strains in distinct clades. Patterns observed in average nucleotide identity, pan genome structure, effector and carbohydrate active enzyme profiles reflected the whole genome phylogeny and highlight taxonomic issues in X. perforans and X. euvesicatoria. Circular sequences with similarity to previously characterised plasmids were identified, and plasmids of similar sizes were isolated. Potential false positive and false negative plasmid assemblies were discussed. Effector patterns that may influence virulence on host plant species were analysed in pathogenic and non-pathogenic xanthomonads.

CONCLUSIONS: The phylogeny presented here confirmed X. vesicatoria, X. arboricola, X. euvesicatoria and X. perforans and a clade of an uncharacterised Xanthomonas species shown to be genetically distinct from all other strains of this study. The taxonomic status of X. perforans and X. euvesicatoria as one species is discussed in relation to whole genome phylogeny and phenotypic traits. The patterns evident in enzyme and plasmid profiles indicate worldwide exchange of genetic material with the potential to introduce new virulence elements into local bacterial populations.

RevDate: 2019-04-22

Rao RT, Sivakumar N, K Jayakumar (2019)

Analyses of Livestock-Associated Staphylococcus aureus Pan-Genomes Suggest Virulence Is Not Primary Interest in Evolution of Its Genome.

Omics : a journal of integrative biology, 23(4):224-236.

Staphylococcus aureus is not only part of normal flora but also an opportunistic pathogen relevant to microbial genomics, public health, and veterinary medicine. In addition to being a well-known human pathogen, S. aureus causes various infections in economically important livestock animals such as cows, sheep, goats, and pigs. There are very few studies that have examined the pan-genome of S. aureus or the host-specific strains' pan-genomes. We report on the livestock-associated S. aureus (LA-SA) pan-genome and suggest that virulence is not the primary interest in the evolution of its genome. LA-SA complete genomes were retrieved from the NCBI and the pan-genome was constructed using the high-speed Roary pipeline. The pan-genome size was 4637 clusters, whereas 42.46% of the pan-genome was associated with the core genome. We found 1268 genes were associated with the strain-unique genome, and the remaining 1432 clusters were associated with the accessory genome. COG (clusters of orthologous group of proteins) analysis of the core genes revealed 34% of clusters related to metabolism responsible for amino acid and inorganic ion transport (COG categories E and P), followed by carbohydrate metabolism (category G). Virulent gene analysis revealed the core genes responsible for antiphagocytosis and iron uptake. The fluidity of the pan-genome was calculated as 0.082 ± 0.025. Importantly, the positive selection analysis suggested a slower rate of evolution among the LA-SA genomes. We call for comparative microbial and pan-genome research between human and LA-SA that can help further understand the evolution of virulence and thus inform future microbial diagnostics and drug discovery.
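
The abstract above reports a pan-genome fluidity value without stating how it was computed. As a hedged illustration, genomic fluidity is commonly defined as the average, over all pairs of genomes, of the number of gene families found in only one genome of the pair divided by the total number of gene families in both. The Python sketch below follows that definition. The function name genomic_fluidity and the toy gene-family sets are assumptions introduced here, and the sketch does not estimate the variance that accompanies the reported value.

```python
from itertools import combinations


def genomic_fluidity(gene_sets):
    """Mean pairwise ratio of unshared gene families to total gene families.

    gene_sets : iterable of collections, one per genome, each holding the
                identifiers of the gene families present in that genome.
    """
    genomes = [set(g) for g in gene_sets]
    if len(genomes) < 2:
        raise ValueError("at least two genomes are required")

    ratios = []
    for a, b in combinations(genomes, 2):
        total = len(a) + len(b)      # M_k + M_l
        if total == 0:
            raise ValueError("a pair of genomes has no gene families")
        unique = len(a ^ b)          # U_k + U_l: families in exactly one genome
        ratios.append(unique / total)
    return sum(ratios) / len(ratios)


# Toy gene-family sets (hypothetical, for illustration only).
toy_genomes = [
    {"famA", "famB", "famC"},
    {"famA", "famB", "famD"},
    {"famA", "famB", "famC"},
]
phi = genomic_fluidity(toy_genomes)
print(phi)
```

A value near 0 indicates that random pairs of genomes share almost all of their gene families, while larger values indicate more strain-specific content.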

RevDate: 2019-05-08

Liu J, Zeng Q, Wang M, et al (2019)

Comparative genome-scale modelling of the pathogenic Flavobacteriaceae species Riemerella anatipestifer in China.

Environmental microbiology [Epub ahead of print].

Riemerella anatipestifer (RA) is a gram-negative bacterium that has a high potential to infect waterfowl. Although more and more genomes of RA have been generated, comparative genomic analysis of RA still remains at the level of individual species. In this study, we analysed the pan-genome of 27 RA virulent isolates to reveal the intraspecies genomic diversity from various aspects. The multi-locus sequence typing (MLST) analysis suggests that the geographic origin of R. anatipestifer is Guangdong province, China. Results of pan-genome analysis revealed an open pan-genome for all 27 isolates with a size of 2967 genes. We identified 387 of the 555 unique genes as having originated by horizontal gene transfer. Further studies showed 204 strain-specific HGT genes were predicted as virulent proteins. Screening the 1113 core genes in RA through subtractive genomic approach, 70 putative vaccine targets out of 125 non-cytoplasmic proteins have been predicted. Further analysis of these non A. platyrhynchos homologous proteins predicted that 56 essential proteins, as drug targets with more interaction partners, were involved in unique metabolic pathways of RA. In conclusion, the present study indicated the essence and the diversity of RA and also provides useful information for identification of vaccine and drug candidates in the future.

RevDate: 2019-05-01

Knight DR, Kullin B, Androga GO, et al (2019)

Evolutionary and Genomic Insights into Clostridioides difficile Sequence Type 11: a Diverse Zoonotic and Antimicrobial-Resistant Lineage of Global One Health Importance.

mBio, 10(2): pii:mBio.00446-19.

Clostridioides difficile (Clostridium difficile) sequence type 11 (ST11) is well established in production animal populations worldwide and contributes considerably to the global burden of C. difficile infection (CDI) in humans. Increasing evidence of shared ancestry and genetic overlap of PCR ribotype 078 (RT078), the most common ST11 sublineage, between human and animal populations suggests that CDI may be a zoonosis. We performed whole-genome sequencing (WGS) on a collection of 207 ST11 and closely related ST258 isolates of human and veterinary/environmental origin, comprising 16 RTs collected from Australia, Asia, Europe, and North America. Core genome single nucleotide variant (SNV) analysis identified multiple intraspecies and interspecies clonal groups (isolates separated by ≤2 core genome SNVs) in all the major RT sublineages: 078, 126, 127, 033, and 288. Clonal groups comprised isolates spread across different states, countries, and continents, indicative of reciprocal long-range dissemination and possible zoonotic/anthroponotic transmission. Antimicrobial resistance genotypes and phenotypes varied across host species, geographic regions, and RTs and included macrolide/lincosamide resistance (Tn6194 [ermB]), tetracycline resistance (Tn6190 [tetM] and Tn6164 [tet44]), and fluoroquinolone resistance (gyrA/B mutations), as well as numerous aminoglycoside resistance cassettes. The population was defined by a large "open" pan-genome (10,378 genes), a remarkably small core genome of 2,058 genes (only 19.8% of the gene pool), and an accessory genome containing a large and diverse collection of important prophages of the Siphoviridae and Myoviridae. This study provides novel insights into strain relatedness and genetic variability of C. difficile ST11, a lineage of global One Health importance.

IMPORTANCE: Historically, Clostridioides difficile (Clostridium difficile) has been associated with life-threatening diarrhea in hospitalized patients. Increasing rates of C. difficile infection (CDI) in the community suggest exposure to C. difficile reservoirs outside the hospital, including animals, the environment, or food. C. difficile sequence type 11 (ST11) is known to infect/colonize livestock worldwide and comprises multiple ribotypes, many of which cause disease in humans, suggesting CDI may be a zoonosis. Using high-resolution genomics, we investigated the evolution and zoonotic potential of ST11 and a new closely related ST258 lineage sourced from diverse origins. We found multiple intra- and interspecies clonal transmission events in all ribotype sublineages. Clones were spread across multiple continents, often without any health care association, indicative of zoonotic/anthroponotic long-range dissemination in the community. ST11 possesses a massive pan-genome and numerous clinically important antimicrobial resistance elements and prophages, which likely contribute to the success of this globally disseminated lineage of One Health importance.

RevDate: 2019-05-10
CmpDate: 2019-05-08

Wyres KL, Wick RR, Judd LM, et al (2019)

Distinct evolutionary dynamics of horizontal gene transfer in drug resistant and virulent clones of Klebsiella pneumoniae.

PLoS genetics, 15(4):e1008114 pii:PGENETICS-D-18-02153.

Klebsiella pneumoniae has emerged as an important cause of two distinct public health threats: multi-drug resistant (MDR) healthcare-associated infections and drug susceptible community-acquired invasive infections. These pathotypes are generally associated with two distinct subsets of K. pneumoniae lineages or 'clones' that are distinguished by the presence of acquired resistance genes and several key virulence loci. Genomic evolutionary analyses of the most notorious MDR and invasive community-associated ('hypervirulent') clones indicate differences in terms of chromosomal recombination dynamics and capsule polysaccharide diversity, but it remains unclear if these differences represent generalised trends. Here we leverage a collection of >2200 K. pneumoniae genomes to identify 28 common clones (n ≥ 10 genomes each), and perform the first genomic evolutionary comparison. Eight MDR and 6 hypervirulent clones were identified on the basis of acquired resistance and virulence gene prevalence. Chromosomal recombination, surface polysaccharide locus diversity, pan-genome, plasmid and phage dynamics were characterised and compared. The data showed that MDR clones were highly diverse, with frequent chromosomal recombination generating extensive surface polysaccharide locus diversity. Additional pan-genome diversity was driven by frequent acquisition/loss of both plasmids and phage. In contrast, chromosomal recombination was rare in the hypervirulent clones, which also showed a significant reduction in pan-genome diversity, largely driven by a reduction in plasmid diversity. Hence the data indicate that hypervirulent clones may be subject to some sort of constraint for horizontal gene transfer that does not apply to the MDR clones. Our findings are relevant for understanding the risk of emergence of individual K. pneumoniae strains carrying both virulence and acquired resistance genes, which have been increasingly reported and cause highly virulent infections that are extremely difficult to treat. Specifically, our data indicate that MDR clones pose the greatest risk, because they are more likely to acquire virulence genes than hypervirulent clones are to acquire resistance genes.

RevDate: 2019-04-21

Du Y, Ma J, Yin Z, et al (2019)

Comparative genomic analysis of Bacillus paralicheniformis MDJK30 with its closely related species reveals an evolutionary relationship between B. paralicheniformis and B. licheniformis.

BMC genomics, 20(1):283 pii:10.1186/s12864-019-5646-9.

BACKGROUND: Members of the genus Bacillus are important plant growth-promoting rhizobacteria that serve as biocontrol agents. Bacillus paralicheniformis MDJK30 is a PGPR isolated from the peony rhizosphere and can suppress plant-pathogenic bacteria and fungi. To further uncover the genetic mechanism of the plant growth-promoting traits of MDJK30 and its closely related strains, we used comparative genomics to provide insights into the genetic diversity and evolutionary relationship between B. paralicheniformis and B. licheniformis.

RESULTS: A comparative genomics analysis based on B. paralicheniformis MDJK30 and 55 other previously reported Bacillus strains was performed. The evolutionary position of MDJK30 and the evolutionary relationship between B. paralicheniformis and B. licheniformis were evaluated by studying the phylogeny of the core genomes, a population structure analysis and ANI results. Comparative genomic analysis revealed various features of B. paralicheniformis that contribute to its commensal lifestyle in the rhizosphere, including an open pan-genome and a diversity of genes for the transport and metabolism of carbohydrates and amino acids. There are notable differences in the numbers and locations of the insertion sequences, prophages, genomic islands and secondary metabolic synthase operons between B. paralicheniformis and B. licheniformis. In particular, we found that most gene clusters of Fengycin, Bacitracin and Lantipeptide were present only in B. paralicheniformis and were obtained by horizontal gene transfer (HGT), and these clusters may be used as genetic markers for distinguishing B. paralicheniformis and B. licheniformis.

CONCLUSIONS: This study reveals that MDJK30 and the other strains of lineage paralicheniformis present plant growth-promoting traits at the genetic level and can be developed and commercially formulated in agriculture as PGPR. Core genome phylogenies and population structure analysis have proven to be powerful tools for differentiating B. paralicheniformis and B. licheniformis. Comparative genomic analyses illustrate the genetic differences within the paralicheniformis-licheniformis group with respect to rhizosphere adaptation.

RevDate: 2019-04-09

Vakirlis N, Monerawela C, McManus G, et al (2019)

Evolutionary Journey and Characterisation of a Novel Pan-Gene Associated with Beer Strains of S. cerevisiae.

Yeast (Chichester, England) [Epub ahead of print].

The sequencing of over a thousand Saccharomyces cerevisiae genomes revealed a complex pangenome. Over one-third of the discovered genes are not present in the S. cerevisiae core genome but instead are often restricted to a sub-set of yeast isolates and thus may be important for adaptation to specific environmental niches. We refer to these genes as "pan-genes", being part of the pangenome but not the core genome. Here we describe the evolutionary journey and characterisation of a novel pan-gene, originally named HYPO (Hypothetical Open Reading Frame). Phylogenetic analysis reveals that HYPO has been predominantly retained in S. cerevisiae strains associated with brewing but has been repeatedly lost in most other fungal species during evolution. There is also evidence that HYPO was horizontally transferred at least once, from S. cerevisiae to S. paradoxus. The phylogenetic analysis of HYPO exemplifies the complexity and intricacy of evolutionary trajectories of genes within the S. cerevisiae pangenome. To examine possible functions for Hypo, we overexpressed a HYPO-GFP fusion protein in both S. cerevisiae and S. pastorianus. The protein localised to the plasma membrane where it accumulated initially in distinct foci. Time-lapse fluorescent imaging revealed that when cells are grown in wort, Hypo-gfp fluorescence spreads throughout the membrane during cell growth. The over-expression of Hypo-gfp in S. cerevisiae or S. pastorianus strains did not significantly alter cell growth in medium containing glucose, maltose, maltotriose or wort at different concentrations.

RevDate: 2019-04-08

Quijada NM, Rodríguez-Lázaro D, M Hernández (2019)

TORMES: an automated pipeline for whole bacterial genome analysis.

Bioinformatics (Oxford, England) pii:5430930 [Epub ahead of print].

MOTIVATION: The progress of High Throughput Sequencing (HTS) technologies and the reduction in the sequencing costs are such that Whole Genome Sequencing (WGS) could replace many traditional laboratory assays and procedures. Exploiting the volume of data produced by HTS platforms requires substantial computing skills and this is the main bottleneck in the implementation of WGS as a routine laboratory technique. The way in which the vast amount of results are presented to researchers and clinicians with no specialist knowledge of genome sequencing is also a significant issue.

RESULTS: Here we present TORMES, a user-friendly pipeline for WGS analysis of bacteria from any origin generated by HTS on Illumina platforms. TORMES is designed for non-bioinformatician users, and automates the steps required for WGS analysis directly from the raw sequence data: sequence quality filtering, de novo assembly, draft genome ordering against a reference, genome annotation, multi-locus sequence typing (MLST), searching for antibiotic resistance and virulence genes, and pangenome comparisons. Once the analysis is finished, TORMES generates an interactive web-like report that can be opened in any web browser and shared and revised by researchers in a simple manner. TORMES can be run by using very simple commands and represents a quick and easy way to perform WGS analysis.

AVAILABILITY: TORMES is freely available at https://github.com/nmquijada/tormes.

SUPPLEMENTARY INFORMATION: Supplementary data, manual and examples are available at Bioinformatics online and at the TORMES main page.

RevDate: 2019-04-30

Raymond F, Boissinot M, Ouameur AA, et al (2019)

Culture-enriched human gut microbiomes reveal core and accessory resistance genes.

Microbiome, 7(1):56 pii:10.1186/s40168-019-0669-7.

BACKGROUND: Low-abundance microorganisms of the gut microbiome are often referred to as a reservoir for antibiotic resistance genes. Unfortunately, these less-abundant bacteria can be overlooked by deep shotgun sequencing. In addition, it is a challenge to associate the presence of resistance genes with their risk of acquisition by pathogens. In this study, we used liquid culture enrichment of stools to assemble the genome of lower-abundance bacteria from fecal samples. We then investigated the gene content recovered from these culture-enriched and culture-independent metagenomes in relation with their taxonomic origin, specifically antibiotic resistance genes. We finally used a pangenome approach to associate resistance genes with the core or accessory genome of Enterobacteriaceae and inferred their propensity to horizontal gene transfer.

RESULTS: Using culture-enrichment approaches with stools allowed assembly of 187 bacterial species with an assembly size greater than 1 million nucleotides. Of these, 67 were found only in culture-enriched conditions, and 22 only in culture-independent microbiomes. These assembled metagenomes allowed the evaluation of the gene content of specific subcommunities of the gut microbiome. We observed that differentially distributed metabolic enzymes were associated with specific culture conditions and, for the most part, with specific taxa. Gene content differences between microbiomes, for example, antibiotic resistance, were for the most part not associated with metabolic enzymes, but with other functions. We used a pangenome approach to determine if the resistance genes found in Enterobacteriaceae, specifically E. cloacae or E. coli, were part of the core genome or of the accessory genome of this species. In our healthy volunteer cohort, we found that E. cloacae contigs harbored resistance genes that were part of the core genome of the species, while E. coli had a large accessory resistome proximal to mobile elements.

CONCLUSION: Liquid culture of stools contributed to an improved functional and comparative genomics study of less-abundant gut bacteria, specifically those associated with antibiotic resistance. Defining whether a gene is part of the core genome of a species helped in interpreting the genomes recovered from culture-independent or culture-enriched microbiomes.

RevDate: 2019-04-07

Park CJ, CP Andam (2019)

Within-Species Genomic Variation and Variable Patterns of Recombination in the Tetracycline Producer Streptomyces rimosus.

Frontiers in microbiology, 10:552.

Streptomyces rimosus is best known as the primary source of the tetracycline class of antibiotics, most notably oxytetracycline, which have been widely used against many gram-positive and gram-negative pathogens and protozoan parasites. However, despite the medical and agricultural importance of S. rimosus, little is known of its evolutionary history and genome dynamics. In this study, we aim to elucidate the pan-genome characteristics and phylogenetic relationships of 32 S. rimosus genomes. The S. rimosus pan-genome contains more than 22,000 orthologous gene clusters, and approximately 8.8% of these genes constitute the core genome. A large part of the accessory genome is composed of 9,646 strain-specific genes. S. rimosus exhibits an open pan-genome (decay parameter α = 0.83) and high gene diversity between strains (genomic fluidity φ = 0.12). We also observed strain-level variation in the distribution and abundance of biosynthetic gene clusters (BGCs) and that each individual S. rimosus genome has a unique repertoire of BGCs. Lastly, we observed variation in recombination, with some strains donating or receiving DNA more often than others, strains that tend to frequently recombine with specific partners, genes that experience recombination more often than others, and variable sizes of recombined DNA sequences. We conclude that the high levels of inter-strain genomic variation in S. rimosus are partly explained by differences in recombination among strains. These results have important implications for current efforts in natural drug discovery, the ecological role of strain-level variation in microbial populations, and addressing the fundamental question of why microbes have pan-genomes.
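
Editorial illustration (not part of the cited abstract): genomic fluidity φ, reported above, is commonly defined as the average, over all pairs of genomes, of the number of gene families found in only one genome of the pair divided by the total number of gene families in both. The sketch below assumes that definition and uses hypothetical toy gene sets; the function name and data are illustrative and do not reproduce the authors' pipeline.

```python
from itertools import combinations


def genomic_fluidity(genomes):
    """Mean pairwise ratio of unshared gene families to total gene families.

    genomes: dict mapping a strain name to a set of gene-family identifiers.
    """
    if len(genomes) < 2:
        raise ValueError("need at least two genomes")
    ratios = []
    for a, b in combinations(genomes, 2):
        genes_a, genes_b = genomes[a], genomes[b]
        total = len(genes_a) + len(genes_b)
        if total == 0:
            raise ValueError("both genomes in a pair are empty")
        # Symmetric difference: families present in exactly one genome of the pair.
        unshared = len(genes_a ^ genes_b)
        ratios.append(unshared / total)
    return sum(ratios) / len(ratios)


# Hypothetical toy data: three strains with four gene families each.
toy = {
    "s1": {"a", "b", "c", "d"},
    "s2": {"a", "b", "c", "e"},
    "s3": {"a", "b", "f", "g"},
}
print(round(genomic_fluidity(toy), 4))
```

Values near 0 indicate strains with nearly identical gene content; larger values indicate that a typical pair of strains differs in a larger share of its genes.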

RevDate: 2019-04-03

Cho H, Song ES, Heu S, et al (2019)

Prediction of Host-Specific Genes by Pan-Genome Analyses of the Korean Ralstonia solanacearum Species Complex.

Frontiers in microbiology, 10:506.

The soil-borne pathogenic Ralstonia solanacearum species complex (RSSC) is a group of plant pathogens that is economically destructive worldwide and has a broad host range, including various solanaceae plants, banana, ginger, sesame, and clove. Previously, Korean RSSC strains isolated from samples of potato bacterial wilt were grouped into four pathotypes based on virulence tests against potato, tomato, eggplant, and pepper. In this study, we sequenced the genomes of 25 Korean RSSC strains selected based on these pathotypes. The newly sequenced genomes were analyzed to determine the phylogenetic relationships between the strains with average nucleotide identity values, and structurally compared via multiple genome alignment using Mauve software. To identify candidate genes responsible for the host specificity of the pathotypes, functional genome comparisons were conducted by analyzing pan-genome orthologous groups (POGs) and type III secretion system effectors (T3es). POG analyses revealed that a total of 128 genes were shared only in tomato-non-pathogenic strains, 8 genes in tomato-pathogenic strains, 5 genes in eggplant-non-pathogenic strains, 7 genes in eggplant-pathogenic strains, 1 gene in pepper-non-pathogenic strains, and 34 genes in pepper-pathogenic strains. When we analyzed T3es, three host-specific effectors were predicted: RipS3 (SKWP3) and RipH3 (HLK3) were found only in tomato-pathogenic strains, and RipAC (PopC) was found only in eggplant-pathogenic strains. Overall, we identified, in silico, host-specific genes and effectors that may be responsible for virulence functions in RSSC. The expected characters of those genes suggest that the host range of RSSC is determined by the comprehensive actions of various virulence factors, including effectors, secretion systems, and metabolic enzymes.

RevDate: 2019-04-01

Bedoya-Correa CM, Rincón Rodríguez RJ, MT Parada-Sanchez (2019)

Genomic and phenotypic diversity of Streptococcus mutans.

Journal of oral biosciences, 61(1):22-31.

BACKGROUND: Streptococcus mutans (S. mutans) is a commensal microorganism found in the human oral cavity. However, due to environmental changes, selective pressures, and the presence of a variable genome, it adapts and may acquire new physiological and metabolic properties that alter dental biofilm homeostasis, promoting the development of dental caries. Although the plasticity and heterogeneity of S. mutans is widely recognized, very little is known about the mechanisms for the expression of pathogenic properties in specific genotypes.

HIGHLIGHT: The implementation of molecular biology techniques in the study of S. mutans has provided information on the genomic diversity of this species. This variability is generated by genome rearrangements, natural genetic transformation, and horizontal gene transfer, and continues to grow due to an open pan-genome. The main virulence factors associated with the cariogenic potential of S. mutans include adhesion, acid production (acidogenicity), and acid tolerance (aciduricity), and also show variability. These factors coordinate the modification of the physicochemical properties of the biofilm, which results in the accumulation of S. mutans and other acidogenic and aciduric species in the oral cavity.

CONCLUSION: We review the current literature on the main processes that generate S. mutans genomic diversity, as well as the phenotypic variability of its main virulence factors. S. mutans achieves its pathogenesis by sensing the intra- and extracellular environments and regulating gene transcription according to perceived environmental modifications. Consequently, this regulation gives rise to differential synthesis of proteins, allowing this species to potentially express virulence factors.

RevDate: 2019-03-31

Pinholt M, Bayliss SC, Gumpert H, et al (2019)

WGS of 1058 Enterococcus faecium from Copenhagen, Denmark, reveals rapid clonal expansion of vancomycin-resistant clone ST80 combined with widespread dissemination of a vanA-containing plasmid and acquisition of a heterogeneous accessory genome.

The Journal of antimicrobial chemotherapy pii:5423852 [Epub ahead of print].

OBJECTIVES: From 2012 to 2015, a sudden significant increase in vancomycin-resistant (vanA) Enterococcus faecium (VREfm) was observed in the Capital Region of Denmark. Clonal relatedness of VREfm and vancomycin-susceptible E. faecium (VSEfm) was investigated, transmission events between hospitals were identified and the pan-genome and plasmids from the largest VREfm clonal group were characterized.

METHODS: WGS of 1058 E. faecium isolates was carried out on the Illumina platform to perform SNP analysis and to identify the pan-genome. One isolate was also sequenced on the PacBio platform to close the genome. Epidemiological data were collected from laboratory information systems.

RESULTS: Phylogeny of 892 VREfm and 166 VSEfm revealed a polyclonal structure, with a single clonal group (ST80) accounting for 40% of the VREfm isolates. VREfm and VSEfm co-occurred within many clonal groups; however, no VSEfm were related to the dominant VREfm group. A similar vanA plasmid was identified in ≥99% of isolates belonging to the dominant group and 69% of the remaining VREfm. Ten plasmids were identified in the completed genome, and ∼29% of this genome consisted of dispensable accessory genes. The size of the pan-genome among isolates in the dominant group was 5905 genes.

CONCLUSIONS: Most probably, VREfm emerged owing to importation of a successful VREfm clone which rapidly transmitted to the majority of hospitals in the region whilst simultaneously disseminating a vanA plasmid to pre-existing VSEfm. Acquisition of a heterogeneous accessory genome may account for the success of this clone by facilitating adaptation to new environmental challenges.

RevDate: 2019-04-30

Smith BA, Leligdon C, DA Baltrus (2019)

Just the Two of Us? A Family of Pseudomonas Megaplasmids Offers a Rare Glimpse into the Evolution of Large Mobile Elements.

Genome biology and evolution, 11(4):1192-1206.

Pseudomonads are a ubiquitous group of environmental proteobacteria, well known for their roles in biogeochemical cycling, in the breakdown of xenobiotic materials, as plant growth promoters, and as pathogens of a variety of host organisms. We have previously identified a large megaplasmid present within one isolate of the plant pathogen Pseudomonas syringae, and here we report that a second member of this megaplasmid family is found within an environmental Pseudomonad isolate most closely related to Pseudomonas putida. Many of the shared genes are involved in critical cellular processes like replication, transcription, translation, and DNA repair. We argue that the presence of these shared pathways sheds new light on discussions about the types of genes that undergo horizontal gene transfer (i.e., the complexity hypothesis) as well as the evolution of pangenomes. Furthermore, although both megaplasmids display a high level of synteny, genes that are shared differ by over 50% on average at the amino acid level. This combination of conservation in gene order despite divergence in gene sequence suggests that this Pseudomonad megaplasmid family is relatively old, that gene order is under strong selection within this family, and that there are likely many more members of this megaplasmid family waiting to be found in nature.

RevDate: 2019-05-09

Shelyakin PV, Bochkareva OO, Karan AA, et al (2019)

Micro-evolution of three Streptococcus species: selection, antigenic variation, and horizontal gene inflow.

BMC evolutionary biology, 19(1):83 pii:10.1186/s12862-019-1403-6.

BACKGROUND: The genus Streptococcus comprises pathogens that strongly influence the health of humans and animals. Genome sequencing of multiple Streptococcus strains demonstrated high variability in gene content and order even in closely related strains of the same species and created a newly emerged object for genomic analysis, the pan-genome. Here we analysed the genome evolution of 25 strains of Streptococcus suis, 50 strains of Streptococcus pyogenes and 28 strains of Streptococcus pneumoniae.

RESULTS: Fractions of the pan-genome, unique, periphery, and universal genes differ in size, functional composition, the level of nucleotide substitutions, and predisposition to horizontal gene transfer and genomic rearrangements. The density of substitutions in intergenic regions appears to be correlated with selection acting on adjacent genes, implying that more conserved genes tend to have more conserved regulatory regions. The total pan-genome of the genus is open, but only due to strain-specific genes, whereas other pan-genome fractions reach saturation. We have identified the set of genes with phylogenies inconsistent with species and non-conserved location in the chromosome; these genes are rare in at least one species and have likely experienced recent horizontal transfer between species. The strain-specific fraction is enriched with mobile elements and hypothetical proteins, but also contains a number of candidate virulence-related genes, so it may have a strong impact on adaptability and pathogenicity. Mapping the rearrangements to the phylogenetic tree revealed large parallel inversions in all species. A parallel inversion of length 15 kB with breakpoints formed by genes encoding surface antigen proteins PhtD and PhtB in S. pneumoniae leads to replacement of gene fragments that likely indicates the action of an antigen variation mechanism.

CONCLUSIONS: Members of the genus Streptococcus have a highly dynamic, open pan-genome, which potentially gives them the ability to adapt to changing environmental conditions, e.g. through antibiotic resistance or transmission between different hosts. Hence, integrated analysis of all aspects of genome evolution is important for the identification of potential pathogens and the design of drugs and vaccines.

RevDate: 2019-03-29

Legendre M, Alempic JM, Philippe N, et al (2019)

Pandoravirus Celtis Illustrates the Microevolution Processes at Work in the Giant Pandoraviridae Genomes.

Frontiers in microbiology, 10:430.

With genomes of up to 2.7 Mb propagated in μm-long oblong particles and initially predicted to encode more than 2000 proteins, members of the Pandoraviridae family display the most extreme features of the known viral world. The mere existence of such giant viruses raises fundamental questions about their origin and the processes governing their evolution. A previous analysis of six newly available isolates, independently confirmed by a study including three others, established that the Pandoraviridae pan-genome is open, meaning that each new strain exhibits protein-coding genes not previously identified in other family members. With an average increment of about 60 proteins, the gene repertoire shows no sign of reaching a limit and remains largely coding for proteins without recognizable homologs in other viruses or cells (ORFans). To explain these results, we proposed that most new protein-coding genes were created de novo, from pre-existing non-coding regions of the G+C rich pandoravirus genomes. The comparison of the gene content of a new isolate, pandoravirus celtis, closely related (96% identical genome) to the previously described p. quercus is now used to test this hypothesis by studying genomic changes in a microevolution range. Our results confirm that the differences between these two similar gene contents mostly consist of protein-coding genes without known homologs, with statistical signatures close to that of intergenic regions. These newborn proteins are under slight negative selection, perhaps to maintain stable folds and prevent protein aggregation pending the eventual emergence of fitness-increasing functions. Our study also unraveled several insertion events mediated by a transposase of the hAT family, 3 copies of which are found in p. celtis and are presumably active. Members of the Pandoraviridae are presently the first viruses known to encode this type of transposase.
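
Editorial illustration (not part of the cited abstract): an "open" pan-genome, as described above, is one in which each newly added genome keeps contributing genes not seen before. A minimal sketch of that bookkeeping follows, assuming hypothetical toy gene sets and a fixed order of addition; real analyses typically average over many random orderings, which this sketch does not do.

```python
def new_genes_per_genome(ordered_genomes):
    """Count previously unseen gene families contributed by each genome in turn.

    ordered_genomes: iterable of sets of gene-family identifiers, in the order
    in which the genomes are added to the analysis.
    """
    seen = set()
    increments = []
    for genes in ordered_genomes:
        fresh = set(genes) - seen  # families not found in any earlier genome
        increments.append(len(fresh))
        seen |= fresh
    return increments


# Hypothetical toy data: three isolates added one after another.
toy_isolates = [
    {"a", "b", "c", "d"},
    {"a", "b", "c", "e"},
    {"a", "b", "f", "g"},
]
increments = new_genes_per_genome(toy_isolates)
print(increments)
print(sum(increments))  # total pan-genome size after all additions
```

If the increments stay well above zero as more genomes are added, the gene repertoire shows no sign of reaching a limit; increments that shrink toward zero suggest a closed pan-genome.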

RevDate: 2019-03-26

Banerjee R, Shine O, Rajachandran V, et al (2019)

Gene duplication and deletion, not horizontal transfer, drove intra-species mosaicism of Bartonella henselae.

Genomics pii:S0888-7543(18)30730-4 [Epub ahead of print].

Bartonella henselae is a facultative intracellular pathogen that occurs worldwide and is responsible primarily for cat-scratch disease in young people and bacillary angiomatosis in immunocompromised patients. The principal source of genome-level diversity that contributes to B. henselae's host-adaptive features is thought to be horizontal gene transfer events. However, our analyses did not reveal the acquisition of horizontally-transferred islands in B. henselae after its divergence from other Bartonella. Rather, diversity in gene content and genome size was apparently acquired through two alternative mechanisms, including deletion and, more predominantly, duplication of genes. Interestingly, a majority of these events occurred in regions that were horizontally transferred long before B. henselae's divergence from other Bartonella species. Our study indicates the possibility that gene duplication, in response to positive selection pressures in specific clones of B. henselae, might be linked to the pathogen's adaptation to arthropod vectors, the cat reservoir, or humans as incidental host-species.

RevDate: 2019-05-14
CmpDate: 2019-05-14

de Carvalho SP, de Almeida JB, de Freitas LM, et al (2019)

Genomic profile of Brazilian methicillin-resistant Staphylococcus aureus resembles clones dispersed worldwide.

Journal of medical microbiology, 68(5):693-702.

PURPOSE: Comparative genomic analysis of strains may help us to better understand the wide diversity of their genetic profiles. The aim of this study was to analyse the genomic features of the resistome and virulome of the first Brazilian methicillin-resistant Staphylococcus aureus (MRSA) isolates and their relationship to other Brazilian and international MRSA strains.

METHODOLOGY: The whole genomes of three MRSA strains previously isolated in Vitória da Conquista were sequenced, assembled, annotated and compared with other MRSA genomes. A phylogenetic tree was constructed and the pan-genome and accessory and core genomes were constructed. The resistomes and virulomes of all strains were identified.

RESULTS/KEY FINDINGS: Phylogenetic analysis of all 49 strains indicated different clones showing high similarity. The pan-genome of the analysed strains consisted of 4484 genes, with 31 % comprising the gene portion of the core genome, 47 % comprising the accessory genome and 22 % being singletons. Most strains showed at least one gene related to virulence factors associated with immune system evasion, followed by enterotoxins. The strains showed multiresistance, with the most recurrent genes conferring resistance to beta-lactams, fluoroquinolones, aminoglycosides and macrolides.

CONCLUSIONS: Our comparative genomic analysis showed that there is no pattern of virulence gene distribution among the clones analysed in the different regions. The Brazilian strains showed similarity with clones from several continents.
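
Editorial illustration (not part of the cited abstract): the core / accessory / singleton split reported in this abstract can be derived from a gene presence/absence table. The sketch below assumes the strictest convention (core = present in every strain, singleton = present in exactly one strain, accessory = everything in between) and hypothetical toy data; published pipelines often relax the core threshold (for example to 99% of strains), so this is not necessarily the authors' criterion.

```python
from collections import Counter


def partition_pangenome(genomes):
    """Split gene families into core, accessory and singleton sets.

    genomes: dict mapping a strain name to a set of gene-family identifiers.
    """
    n = len(genomes)
    if n < 2:
        raise ValueError("need at least two genomes to separate core from singletons")
    # Number of strains carrying each gene family (each strain counted once).
    counts = Counter(g for genes in genomes.values() for g in set(genes))
    core = {g for g, c in counts.items() if c == n}
    singletons = {g for g, c in counts.items() if c == 1}
    accessory = set(counts) - core - singletons
    return core, accessory, singletons


# Hypothetical toy data: three strains with four gene families each.
toy_strains = {
    "s1": {"a", "b", "c", "d"},
    "s2": {"a", "b", "c", "e"},
    "s3": {"a", "b", "f", "g"},
}
core, accessory, singletons = partition_pangenome(toy_strains)
pan_size = len(core) + len(accessory) + len(singletons)
for label, part in (("core", core), ("accessory", accessory), ("singletons", singletons)):
    print(f"{label}: {len(part)} genes ({100 * len(part) / pan_size:.0f} %)")
```

The three sets are disjoint and together make up the pan-genome, so their percentages sum to 100 up to rounding.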

RevDate: 2019-04-04

Correia K, Yu SM, R Mahadevan (2019)

AYbRAH: a curated ortholog database for yeasts and fungi spanning 600 million years of evolution.

Database : the journal of biological databases and curation, 2019:.

Budding yeasts inhabit a range of environments by exploiting various metabolic traits. The genetic bases for these traits are mostly unknown, preventing their addition or removal in a chassis organism for metabolic engineering. Insight into the evolution of orthologs, paralogs and xenologs in the yeast pan-genome can help bridge these genotypes; however, existing phylogenomic databases do not span diverse yeasts, and sometimes cannot distinguish between these homologs. To help understand the molecular evolution of these traits in yeasts, we created Analyzing Yeasts by Reconstructing Ancestry of Homologs (AYbRAH), an open-source database of predicted and manually curated ortholog groups for 33 diverse fungi and yeasts in Dikarya, spanning 600 million years of evolution. OrthoMCL and OrthoDB were used to cluster protein sequence into ortholog and homolog groups, respectively; MAFFT and PhyML reconstructed the phylogeny of all homolog groups. Ortholog assignments for enzymes and small metabolite transporters were compared to their phylogenetic reconstruction, and curated to resolve any discrepancies. Information on homolog and ortholog groups can be viewed in the AYbRAH web portal (https://lmse.github.io/aybrah/), including functional annotations, predictions for mitochondrial localization and transmembrane domains, literature references and phylogenetic reconstructions. Ortholog assignments in AYbRAH were compared to HOGENOM, KEGG Orthology, OMA, eggNOG and PANTHER. PANTHER and OMA had the most congruent ortholog groups with AYbRAH, while the other phylogenomic databases had greater amounts of under-clustering, over-clustering or no ortholog annotations for proteins. Future plans are discussed for AYbRAH, and recommendations are made for other research communities seeking to create curated ortholog databases.

RevDate: 2019-05-02
CmpDate: 2019-05-02

Naz K, Naz A, Ashraf ST, et al (2019)

PanRV: Pangenome-reverse vaccinology approach for identifications of potential vaccine candidates in microbial pangenome.

BMC bioinformatics, 20(1):123 pii:10.1186/s12859-019-2713-9.

BACKGROUND: The last decade has seen a marked shift from classical vaccinology to the reverse vaccinology approach. Ever-increasing genomic and proteomic data have greatly facilitated vaccine design and development. Reverse vaccinology is considered a cost-effective and efficient approach for screening the entire pathogen genome. To find broad-spectrum immunogenic targets and to analyze closely related bacterial species, the pangenome concept needs to be integrated into the reverse vaccinology approach. The categories of a species pangenome, such as the core, accessory, and unique gene sets, can be analyzed to identify vaccine candidates through reverse vaccinology.

RESULTS: We have designed an integrative computational pipeline, termed "PanRV", that employs both the pangenome and reverse vaccinology approaches. PanRV comprises four functional modules: i) Pangenome Estimation Module (PGM), ii) Reverse Vaccinology Module (RVM), iii) Functional Annotation Module (FAM) and iv) Antibiotic Resistance Association Module (ARM). The pipeline was tested using genomic data from 301 genomes of Staphylococcus aureus, and the results were verified against experimentally known antigenic data.

CONCLUSION: The proposed pipeline has proved to be the first comprehensive automated pipeline that can precisely identify putative vaccine candidates exploiting the microbial pangenome. PanRV is a Linux-based package developed in Java. An executable installer is provided for ease of installation along with a user manual at https://sourceforge.net/projects/panrv2/ .

RevDate: 2019-03-13

Li H, Ding X, Chen C, et al (2019)

Enrichment of phosphate solubilizing bacteria during late developmental stages of eggplant (Solanum melongena L.).

FEMS microbiology ecology, 95(3):.

Understanding the ecology of phosphate solubilizing bacteria (PSBs) is critical for developing better strategies to increase crop productivity. In this study, the diversity of PSBs and of the total bacteria in the rhizosphere of eggplant (Solanum melongena L.) cultivated in organic, integrated and conventional farming systems was compared at four developmental stages of its lifecycle. Both selective culture and high-throughput sequencing analysis of 16S rRNA amplicons indicated that Enterobacter with strong or very strong in vivo phosphate solubilization activities was enriched in the rhizosphere during the fruiting stage. The high-throughput sequencing analysis results demonstrated that farming systems explained 23% of total bacterial community variation. Plant development and farming systems synergistically shaped the rhizospheric bacterial community, in which the degree of variation influenced by farming systems decreased over the plant development phase from 56% to 26.3% to 16.3%, and finally to no significant effect as the plant reached the fruiting stage. Pangenome analysis indicated that two-component and transporter systems varied between the rhizosphere and soil PSBs. This study elucidated the complex interactions among farming systems, plant development and rhizosphere microbiomes.

RevDate: 2019-03-12

Burgueño-Roman A, Castañeda-Ruelas GM, Pacheco-Arjona R, et al (2019)

Pathogenic potential of non-typhoidal Salmonella serovars isolated from aquatic environments in Mexico.

Genes & genomics pii:10.1007/s13258-019-00798-7 [Epub ahead of print].

BACKGROUND: River water has been implicated as a source of non-typhoidal Salmonella (NTS) serovars in Mexico.

OBJECTIVE: To dissect the molecular pathogenesis and defense strategies of seven NTS strains isolated from river water in Mexico.

METHODS: The genomes of Salmonella serovars Give, Pomona, Kedougou, Stanley, Oranienburg, Sandiego, and Muenchen were sequenced using the whole-genome shotgun methodology on the Illumina MiSeq platform. Genome annotation and evolutionary analyses were conducted in the RAST and FigTree servers, respectively. MLST was performed using the SRST2 tool, and the comparisons between strains were clustered and visualized using the Gview server. An experimental virulence assay was included to evaluate the pathogenic potential of the strains.

RESULTS: We report seven high-quality draft genomes, ranging from ~ 4.61 to ~ 5.12 Mb, with a median G + C value, coding DNA sequence, and protein values of 52.1%, 4697 bp, and 4,589 bp, respectively. The NTS serovars presented with an open pan-genome, offering novel genetic content. Each NTS serovar had an indistinguishable virulotype with a core genome (352 virulence genes) closely associated with Salmonella pathogenicity; 13 genes were characterized as serotype specific, which could explain differences in pathogenicity. All strains maintained highly conserved genetic content regarding the Salmonella pathogenicity islands (1-5) (86.9-100%), fimbriae (84.6%), and hypermutation (100%) genes. Adherence and invasion capacity were confirmed among NTS strains in Caco-2 cells.

CONCLUSION: Our results demonstrated the arsenal of virulence and defense molecular factors harbored on NTS serovars and highlight that environmental NTS strains are waterborne pathogens worthy of attention.

RevDate: 2019-03-29

van Tonder AJ, Bray JE, Jolley KA, et al (2019)

Genomic Analyses of >3,100 Nasopharyngeal Pneumococci Revealed Significant Differences Between Pneumococci Recovered in Four Different Geographical Regions.

Frontiers in microbiology, 10:317.

Understanding the structure of a bacterial population is essential in order to understand bacterial evolution. Estimating the core genome (those genes common to all, or nearly all, strains of a species) is a key component of such analyses. The size and composition of the core genome varies by dataset, but we hypothesized that the variation between different collections of the same bacterial species would be minimal. To investigate this, we analyzed the genome sequences of 3,118 pneumococci recovered from healthy individuals in Reykjavik (Iceland), Southampton (United Kingdom), Boston (United States), and Maela (Thailand). The analyses revealed a "supercore" genome (genes shared by all 3,118 pneumococci) of 558 genes, although an additional 354 core genes were shared by pneumococci from Reykjavik, Southampton, and Boston. Overall, the size and composition of the core and pan-genomes among pneumococci recovered in Reykjavik, Southampton, and Boston were similar. Maela pneumococci were distinctly different in that they had a smaller core genome and larger pan-genome. The pan-genome of Maela pneumococci contained several >25 Kb sequence regions (flanked by pneumococcal genes) that were homologous to genomic regions found in other bacterial species. Overall, our work revealed that some subsets of the global pneumococcal population are highly heterogeneous, and our hypothesis was rejected. This is an important finding in terms of understanding genetic variation among pneumococci and is also an essential point of consideration before generalizing the findings from a single dataset to the wider pneumococcal population.
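
The "supercore" idea in the abstract above (genes shared by every genome in every collection) amounts to intersecting per-collection core genomes. The sketch below assumes each genome is given as a set of gene identifiers and uses a strict "present in all genomes" definition of core. The abstract itself allows "all, or nearly all", so the strict cutoff, the function names and the toy site labels are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple


def core_genome(genomes: List[Set[str]]) -> Set[str]:
    """Genes present in every genome of one collection (strict definition)."""
    if not genomes:
        return set()
    return set.intersection(*genomes)


def supercore(collections: Dict[str, List[Set[str]]]) -> Tuple[Set[str], Dict[str, Set[str]]]:
    """Return (supercore, per-collection cores).

    The supercore is the intersection of the cores of all collections,
    which equals the set of genes present in every genome overall.
    """
    cores = {name: core_genome(genomes) for name, genomes in collections.items()}
    if not cores:
        return set(), cores
    return set.intersection(*cores.values()), cores


# Toy data (hypothetical gene labels and sites).
toy_collections = {
    "site_1": [{"a", "b", "c"}, {"a", "b", "c", "d"}],
    "site_2": [{"a", "b"}, {"a", "b", "e"}],
}
toy_supercore, toy_cores = supercore(toy_collections)
# Genes that are core in site_1 but fall outside the supercore:
site_1_only = toy_cores["site_1"] - toy_supercore
```

Genes such as those in `site_1_only` play the role of the additional core genes that the abstract reports as shared by some collections but not all.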

RevDate: 2019-03-29

Obolski U, Gori A, Lourenço J, et al (2019)

Identifying genes associated with invasive disease in S. pneumoniae by applying a machine learning approach to whole genome sequence typing data.

Scientific reports, 9(1):4049 pii:10.1038/s41598-019-40346-7.

Streptococcus pneumoniae, a normal commensal of the upper respiratory tract, is a major public health concern, responsible for substantial global morbidity and mortality due to pneumonia, meningitis and sepsis. Why some pneumococci invade the bloodstream or CSF (so-called invasive pneumococcal disease; IPD) is uncertain. In this study we identify genes associated with IPD. We transform whole genome sequence (WGS) data into a sequence typing scheme, while avoiding the caveat of using an arbitrary genome as a reference by substituting it with a constructed pangenome. We then employ a random forest machine-learning algorithm on the transformed data, and find 43 genes consistently associated with IPD across three geographically distinct WGS data sets of pneumococcal carriage isolates. Of the genes we identified as associated with IPD, we find 23 genes previously shown to be directly relevant to IPD, as well as 18 uncharacterized genes. We suggest that these uncharacterized genes identified by us are also likely to be relevant for IPD.

RevDate: 2019-03-29

Khaleque HN, González C, Shafique R, et al (2019)

Uncovering the Mechanisms of Halotolerance in the Extremely Acidophilic Members of the Acidihalobacter Genus Through Comparative Genome Analysis.

Frontiers in microbiology, 10:155.

There are few naturally occurring environments where both acid and salinity stress exist together; consequently, there has been little evolutionary pressure for microorganisms to develop systems that enable them to deal with both stresses simultaneously. Members of the genus Acidihalobacter are iron- and sulfur-oxidizing, halotolerant acidophiles that have developed the ability to tolerate acid and saline stress and, therefore, have the potential to bioleach ores with brackish or saline process waters under acidic conditions. The genus consists of four members, A. prosperus DSM 5130T, A. prosperus DSM 14174, A. prosperus F5 and "A. ferrooxidans" DSM 14175. An in-depth genome comparison was undertaken in order to provide a more comprehensive description of the mechanisms of halotolerance used by the different members of this genus. Pangenome analysis identified 29, 3 and 9 protein families related to halotolerance in the core, dispensable and unique genomes, respectively. The genes for halotolerance showed Ka/Ks ratios between 0 and 0.2, confirming that they are conserved and stabilized. All the Acidihalobacter genomes contained similar genes for the synthesis and transport of ectoine, which was recently found to be the dominant osmoprotectant in A. prosperus DSM 14174 and A. prosperus DSM 5130T. Similarities also existed in genes encoding low affinity potassium pumps; however, A. prosperus DSM 14174 was also found to contain genes encoding high affinity potassium pumps. Furthermore, only A. prosperus DSM 5130T and "A. ferrooxidans" DSM 14175 contained genes allowing the uptake of taurine as an osmoprotectant. Variations were also seen in genes encoding proteins involved in the synthesis and/or transport of periplasmic glucans, sucrose, proline, and glycine betaine. This suggests that versatility exists in the Acidihalobacter genus in terms of the mechanisms they can use for halotolerance. This information is useful for developing hypotheses for the search for life on exoplanets and moons.

RevDate: 2019-03-09

Tahir Ul Qamar M, Zhu X, Xing F, et al (2019)

ppsPCP: A Plant Presence/absence Variants Scanner and Pan-genome Construction Pipeline.

Bioinformatics (Oxford, England) pii:5372683 [Epub ahead of print].

SUMMARY: Since the idea of pan-genomics emerged, several tools and pipelines have been introduced for prokaryotic pan-genomics. However, not a single comprehensive pipeline has been reported that could overcome the multiple challenges associated with eukaryotic pan-genomics. To aid eukaryotic pan-genomic studies, here we present the ppsPCP pipeline, which is designed for eukaryotes, especially plants. It is capable of scanning presence/absence variants (PAVs) and constructing a fully annotated pan-genome. We believe that with these unique features of PAV scanning and building a pan-genome together with its annotation, ppsPCP will be useful for plant pan-genomic studies and will help researchers study genetic/phenotypic variations and genomic diversity.

The ppsPCP is freely available at github DOI: https://doi.org/10.5281/zenodo.2567390 and webpage http://cbi.hzau.edu.cn/ppsPCP/.

RevDate: 2019-03-09

Rautiainen M, Mäkinen V, T Marschall (2019)

Bit-parallel sequence-to-graph alignment.

Bioinformatics (Oxford, England) pii:5372677 [Epub ahead of print].

MOTIVATION: Graphs are commonly used to represent sets of sequences. Either edges or nodes can be labeled by sequences, so that each path in the graph spells a concatenated sequence. Examples include graphs to represent genome assemblies, such as string graphs and de Bruijn graphs, and graphs to represent a pan-genome and hence the genetic variation present in a population. Being able to align sequencing reads to such graphs is a key step for many analyses and its applications include genome assembly, read error correction, and variant calling with respect to a variation graph.

RESULTS: We generalize two linear sequence-to-sequence algorithms to graphs: the Shift-And algorithm for exact matching and Myers' bitvector algorithm for semi-global alignment. These linear algorithms are both based on processing w sequence characters with a constant number of operations, where w is the word size of the machine (commonly 64), and achieve a speedup of up to w over naive algorithms. For a graph with |V| nodes and |E| edges and a sequence of length m, our bitvector-based graph alignment algorithm reaches a worst case runtime of O(|V| + ⌈m/w⌉|E| log w) for acyclic graphs and O(|V| + m|E| log w) for arbitrary cyclic graphs. We apply it to five different types of graphs and observe a speedup between 3-fold and 20-fold compared to a previous (asymptotically optimal) alignment algorithm by Navarro (2000).

AVAILABILITY: https://github.com/maickrau/GraphAligner.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
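
For orientation, the classic linear Shift-And algorithm that the abstract above generalizes to graphs can be sketched as follows. This is the textbook sequence-to-sequence version only, not the authors' graph algorithm and not GraphAligner code. Python integers stand in for machine words, so the word-size limit w is not modelled, and the function name is hypothetical.

```python
from typing import Dict, List


def shift_and(pattern: str, text: str) -> List[int]:
    """Return the start positions of all exact occurrences of pattern in text.

    Bit i of the state is set when pattern[0..i] matches the text
    ending at the current position.
    """
    m = len(pattern)
    if m == 0:
        return []

    # Precompute, for each character, a mask of the pattern positions holding it.
    masks: Dict[str, int] = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << i)

    accept = 1 << (m - 1)  # bit that signals a full match
    state = 0
    hits: List[int] = []
    for j, ch in enumerate(text):
        # Extend every partial match by one character and start a new one.
        state = ((state << 1) | 1) & masks.get(ch, 0)
        if state & accept:
            hits.append(j - m + 1)
    return hits


positions = shift_and("ACA", "GACACA")  # overlapping matches at 1 and 3
```

Each text character costs a constant number of word operations regardless of pattern length (up to the word size), which is the source of the speedup of up to w mentioned in the abstract.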

RevDate: 2019-04-18

Velsko IM, Perez MS, VP Richards (2019)

Resolving Phylogenetic Relationships for Streptococcus mitis and Streptococcus oralis through Core- and Pan-Genome Analyses.

Genome biology and evolution, 11(4):1077-1087.

Taxonomic and phylogenetic relationships of Streptococcus mitis and Streptococcus oralis have been difficult to establish biochemically and genetically. We used core-genome analyses of S. mitis and S. oralis, as well as the closely related species Streptococcus pneumoniae and Streptococcus parasanguinis, to clarify the phylogenetic relationships between S. mitis and S. oralis, as well as within subclades of S. oralis. All S. mitis (n = 67), S. oralis (n = 89), S. parasanguinis (n = 27), and 27 S. pneumoniae genome assemblies were downloaded from NCBI and reannotated. All genes were delineated into homologous clusters and maximum-likelihood phylogenies built from putatively nonrecombinant core gene sets. Population structure was determined using Bayesian genome clustering, and patristic distance was calculated between populations. Population-specific gene content was assessed using a phylogenetic-based genome-wide association approach. Streptococcus mitis and S. oralis formed distinct clades, but species mixing suggests taxonomic misassignment. Patristic distance between populations suggests that S. oralis subsp. dentisani is a distinct species, whereas S. oralis subsp. tigurinus and subsp. oralis are supported as subspecies, and that S. mitis comprises two subspecies. None of the genes within the pan-genomes of S. mitis and S. oralis could be statistically correlated with either, and the dispensable genomes showed extensive variation among isolates. These are likely important factors contributing to established overlap in biochemical characteristics for these taxa. Based on core-genome analysis, the substructure of S. oralis and S. mitis should be redefined, and species assignments within S. oralis and S. mitis should be made based on whole-genome analysis to be robust to misassignment.

RevDate: 2019-03-20
CmpDate: 2019-03-20

Eisenbach L, Geissler AJ, Ehrmann MA, et al (2019)

Comparative genomics of Lactobacillus sakei supports the development of starter strain combinations.

Microbiological research, 221:1-9.

Strains of Lactobacillus sakei can be isolated from a variety of sources including meat, fermented sausages, sake, sourdough, sauerkraut or kimchi. Selected strains are widely used as starter cultures for sausage fermentation. Recently we have demonstrated that control over the lactic microbiota in fermenting sausages is achieved by pairs or sets of strains rather than by single strains. In this work we characterized the pan genome of L. sakei to enable exploitation of the genomic diversity of L. sakei for the establishment of assertive starter strain sets. We have established the full genome sequences of nine L. sakei strains from different sources of isolation and included in the analysis the genome of L. sakei 23K. Comparative genomics revealed an accessory genome comprising about 50% of the pan genome and different lineages of strains with no relation to their source of isolation. Group- and strain-specific differences were found, notably in agmatine and citrate metabolism. The presence of genes encoding metabolic pathways for fructose, sucrose and trehalose as well as gluconate in all strains suggests a general adaptation to plant/sugary environments and a life in communities with other genera. Analysis of the plasmidome did not reveal any specific mechanisms of adaptation to a habitat. The predicted differences of metabolic settings enable prediction of partner strains, which can occupy the meat environment to a large extent and establish competitive exclusion of autochthonous microbiota. This may assist the development of a new generation of meat starter cultures containing L. sakei strains.

RevDate: 2019-04-08
CmpDate: 2019-04-08

Raphael BH, Huynh T, Brown E, et al (2019)

Culture of Clinical Specimens Reveals Extensive Diversity of Legionella pneumophila Strains in Arizona.

mSphere, 4(1): pii:4/1/e00649-18.

Between 2000 and 2017, a total of 236 Legionella species isolates from Arizona were submitted to the CDC for reference testing. Most of these isolates were recovered from bronchoalveolar lavage specimens. Although the incidence of legionellosis in Arizona is less than the overall U.S. incidence, Arizona submits the largest number of isolates to the CDC for testing compared to those from other states. In addition to a higher proportion of culture confirmation of legionellosis cases in Arizona than in other states, all Legionella pneumophila isolates are forwarded to the CDC for confirmatory testing. Compared to that from other states, a higher proportion of isolates from Arizona were identified as belonging to L. pneumophila serogroups 6 (28.2%) and 8 (8.9%). Genome sequencing was conducted on 113 L. pneumophila clinical isolates not known to be associated with outbreaks in order to understand the genomic diversity of strains causing legionellosis in Arizona. Whole-genome multilocus sequence typing (wgMLST) revealed 17 clusters of isolates sharing at least 99% identical allele content. Only two of these clusters contained isolates from more than one individual with exposure at the same facility. Additionally, wgMLST analysis revealed a group of 31 isolates predominantly belonging to serogroup 6 and containing isolates from three separate clusters. Single nucleotide polymorphism (SNP) and pangenome analysis were used to further resolve genome sequences belonging to a subset of isolates. This study demonstrates that culture of clinical specimens for Legionella spp. reveals a highly diverse population of strains causing legionellosis in Arizona which could be underappreciated using other diagnostic approaches.

IMPORTANCE: Culture of clinical specimens from patients with Legionnaires' disease is rarely performed, restricting our understanding of the diversity and ecology of Legionella. Culture of Legionella from patient specimens in Arizona revealed a greater proportion of non-serogroup 1 Legionella pneumophila isolates than in other U.S. isolates examined. Disease caused by such isolates may go undetected using other diagnostic methods. Moreover, genome sequence analysis revealed that these isolates were genetically diverse, and understanding these populations may help in future environmental source attribution studies.

RevDate: 2019-03-07
CmpDate: 2019-03-07

Gabbett MT, Laporte J, Sekar R, et al (2019)

Molecular Support for Heterogonesis Resulting in Sesquizygotic Twinning.

The New England journal of medicine, 380(9):842-849.

Sesquizygotic multiple pregnancy is an exceptional intermediate between monozygotic and dizygotic twinning. We report a monochorionic twin pregnancy with fetal sex discordance. Genotyping of amniotic fluid from each sac showed that the twins were maternally identical but chimerically shared 78% of their paternal genome, which makes them genetically in between monozygotic and dizygotic; they are sesquizygotic. We observed no evidence of sesquizygosis in 968 dizygotic twin pairs whom we screened by means of pangenome single-nucleotide polymorphism genotyping. Data from published repositories also show that sesquizygosis is a rare event. Detailed genotyping implicates chimerism arising at the juncture of zygotic division, termed heterogonesis, as the likely initial step in the causation of sesquizygosis.

RevDate: 2019-04-16

Caputo A, Fournier PE, D Raoult (2019)

Genome and pan-genome analysis to classify emerging bacteria.

Biology direct, 14(1):5 pii:10.1186/s13062-019-0234-0.

BACKGROUND: In recent years, genomic and pan-genomic studies have become increasingly important. Culturomics allows the study of the human microbiota through the use of different culture conditions, coupled with rapid identification by MALDI-TOF or 16S rRNA. Bacterial taxonomy is undergoing many changes as a consequence. With the help of pan-genomic analyses, species can be redefined, and new species definitions generated.

RESULTS: Genomics, coupled with culturomics, has led to the discovery of many novel bacterial species or genera, including Akkermansia muciniphila and Microvirga massiliensis. Using the genome to define species has been applied within the genus Klebsiella. A discontinuity or an abrupt break in the core/pan-genome ratio can uncover novel species.

CONCLUSIONS: Applying genomic and pan-genomic analyses to the reclassification of other bacterial species or genera will be important in the future of medical microbiology. The pan-genome is one of many new innovative tools in bacterial taxonomy.

REVIEWERS: This article was reviewed by William Martin, Eric Bapteste and James Mcinerney.

RevDate: 2019-02-28

Entwistle S, Li X, Y Yin (2019)

Orphan Genes Shared by Pathogenic Genomes Are More Associated with Bacterial Pathogenicity.

mSystems, 4(1): pii:mSystems00290-18.

Orphan genes (also known as ORFans [i.e., orphan open reading frames]) are new genes that enable an organism to adapt to its specific living environment. Our focus in this study is to compare ORFans between pathogens (P) and nonpathogens (NP) of the same genus. Using the pangenome idea, we have identified 130,169 ORFans in nine bacterial genera (505 genomes) and classified these ORFans into four groups: (i) SS-ORFans (P), which are only found in a single pathogenic genome; (ii) SS-ORFans (NP), which are only found in a single nonpathogenic genome; (iii) PS-ORFans (P), which are found in multiple pathogenic genomes; and (iv) NS-ORFans (NP), which are found in multiple nonpathogenic genomes. Within the same genus, pathogens do not always have more genes, more ORFans, or more pathogenicity-related genes (PRGs), including prophages, pathogenicity islands (PAIs), virulence factors (VFs), and horizontal gene transfers (HGTs), than nonpathogens. Interestingly, in pathogens of the nine genera, the percentages of PS-ORFans are consistently higher than those of SS-ORFans, which is not true in nonpathogens. Similarly, in pathogens of the nine genera, the percentages of PS-ORFans matching the four types of PRGs are also always higher than those of SS-ORFans, but this is not true in nonpathogens. All of these findings suggest the greater importance of PS-ORFans for bacterial pathogenicity.

IMPORTANCE: Recent pangenome analyses of numerous bacterial species have suggested that each genome of a single species may have a significant fraction of its gene content unique or shared by a very few genomes (i.e., ORFans). We selected nine bacterial genera, each containing at least five pathogenic and five nonpathogenic genomes, to compare their ORFans in relation to pathogenicity-related genes. Pathogens in these genera are known to cause a number of common and devastating human diseases such as pneumonia, diphtheria, melioidosis, and tuberculosis. Thus, they are worthy of in-depth systems microbiology investigations, including the comparative study of ORFans between pathogens and nonpathogens. We provide direct evidence to suggest that ORFans shared by more pathogens are more associated with pathogenicity-related genes and thus are more important targets for development of new diagnostic markers or therapeutic drugs for bacterial infectious diseases.
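
The four ORFan groups defined in the abstract above can be expressed as a small decision rule. The sketch below assumes that, for each candidate gene, the set of genomes carrying it within the genus is already known. How the authors detect ORFans in the first place is not covered, and the function name, the "shared" and "absent" fallback labels and the toy genome IDs are hypothetical additions.

```python
from typing import Set


def classify_orfan(genomes_with_gene: Set[str], pathogenic: Set[str]) -> str:
    """Assign a gene to one of the four ORFan groups named in the abstract.

    genomes_with_gene: IDs of genomes (within one genus) carrying the gene.
    pathogenic: IDs of the genomes labelled as pathogens; all others are
    treated as nonpathogens.
    """
    in_p = genomes_with_gene & pathogenic
    in_np = genomes_with_gene - pathogenic
    if not genomes_with_gene:
        return "absent"  # assumption: no carrier genome, nothing to classify
    if in_p and in_np:
        return "shared"  # assumption: genes found in both classes fit no group
    if in_p:
        return "SS-ORFan (P)" if len(in_p) == 1 else "PS-ORFan (P)"
    return "SS-ORFan (NP)" if len(in_np) == 1 else "NS-ORFan (NP)"


# Toy example (hypothetical genome IDs).
pathogens = {"p1", "p2", "p3"}
labels = {
    "geneA": classify_orfan({"p1"}, pathogens),
    "geneB": classify_orfan({"p1", "p3"}, pathogens),
    "geneC": classify_orfan({"n1"}, pathogens),
    "geneD": classify_orfan({"n1", "n2"}, pathogens),
}
```

Counting the labels per genus would give the PS-ORFan versus SS-ORFan percentages that the abstract compares.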

RevDate: 2019-02-24

Lin JN, Lai CH, Yang CH, et al (2019)

Genomic Features, Comparative Genomics, and Antimicrobial Susceptibility Patterns of Elizabethkingia bruuniana.

Scientific reports, 9(1):2267 pii:10.1038/s41598-019-38998-6.

Elizabethkingia bruuniana is a novel species of the Elizabethkingia genus. There is scant information on this microorganism. Here, we report the whole-genome features and antimicrobial susceptibility patterns of E. bruuniana strain EM798-26. Elizabethkingia strain EM798-26 was initially identified as E. miricola. This isolate contained a circular genome of 4,393,011 bp. The whole-genome sequence-based phylogeny revealed that Elizabethkingia strain EM798-26 was in the same group as the type strain E. bruuniana G0146T. Both in silico DNA-DNA hybridization and average nucleotide identity analysis clearly demonstrated that Elizabethkingia strain EM798-26 belonged to the species E. bruuniana. The pan-genome analysis identified 2,875 gene families in the core genome and 5,199 gene families in the pan genome of eight publicly available E. bruuniana genome sequences. The unique genes accounted for 0.2-12.1% of the pan genome in each E. bruuniana. A total of 59 potential virulence factor homologs were predicted in the whole genome of E. bruuniana strain EM798-26. This isolate was nonsusceptible to multiple antibiotics, but susceptible to aminoglycosides, minocycline, and levofloxacin. The whole-genome sequence analysis of E. bruuniana EM798-26 revealed 29 homologs of antibiotic resistance-related genes. This study presents the genomic features of E. bruuniana. Knowledge of the genomic characteristics provides valuable insights into a novel species.

RevDate: 2019-03-25
CmpDate: 2019-03-25

López-Pérez M, Jayakumar JM, Haro-Moreno JM, et al (2019)

Evolutionary Model of Cluster Divergence of the Emergent Marine Pathogen Vibrio vulnificus: From Genotype to Ecotype.

mBio, 10(1): pii:mBio.02852-18.

Vibrio vulnificus, an opportunistic pathogen, is the causative agent of a life-threatening septicemia and a rising problem for aquaculture worldwide. The genetic factors that differentiate its clinical and environmental strains remain enigmatic. Furthermore, clinical strains have emerged from every clade of V. vulnificus. In this work, we investigated the underlying genomic properties and population dynamics of the V. vulnificus species from an evolutionary and ecological point of view. Genome comparisons and bioinformatic analyses of 113 V. vulnificus isolates indicate that the population of V. vulnificus is made up of four different clusters. We found evidence that recombination and gene flow between the two largest clusters (cluster 1 [C1] and C2) have drastically decreased to the point where they are diverging independently. Pangenome and phenotypic analyses showed two markedly different lifestyles for these two clusters, indicating commensal (C2) and bloomer (C1) ecotypes, with differences in carbohydrate utilization, defense systems, and chemotaxis, among other characteristics. Nonetheless, we identified frequent intra- and interspecies exchange of mobile genetic elements (e.g., antibiotic resistance plasmids, novel "chromids," or two different and concurrent type VI secretion systems) that provide high levels of genetic diversity in the population. Surprisingly, we identified strains from both clusters in the mucosa of aquaculture species, indicating that manmade niches are bringing strains from the two clusters together. We propose an evolutionary model of V. vulnificus that could be broadly applicable to other pathogenic vibrios and facultative bacterial pathogens to pursue strategies to prevent their infections and emergence.

IMPORTANCE: Vibrio vulnificus is an emergent marine pathogen and is the cause of a deadly septicemia. However, the genetic factors that differentiate its clinical and environmental strains and its several biotypes remain mostly enigmatic. In this work, we investigated the underlying genomic properties and population dynamics of the V. vulnificus species to elucidate the traits that make these strains emerge as a human pathogen. The acquisition of different ecological determinants could have allowed the development of highly divergent clusters with different lifestyles within the same environment. However, we identified strains from both clusters in the mucosa of aquaculture species, indicating that manmade niches are bringing strains from the two clusters together, posing a potential risk of recombination and of emergence of novel variants. We propose a new evolutionary model that provides a perspective that could be broadly applicable to other pathogenic vibrios and facultative bacterial pathogens to pursue strategies to prevent their infections.

RevDate: 2019-04-28

Issa E, Salloum T, Panossian B, et al (2019)

Genome Mining and Comparative Analysis of Streptococcus intermedius Causing Brain Abscess in a Child.

Pathogens (Basel, Switzerland), 8(1): pii:pathogens8010022.

Streptococcus intermedius (SI) is associated with prolonged hospitalization and low survival rates. The genetic mechanisms involved in brain abscess development and genome evolution in comparison to other members of the Streptococcus anginosus group are understudied. We performed a whole-genome comparative analysis of an SI isolate, LAU_SINT, associated with brain abscess following sinusitis with all SI genomes in addition to S. constellatus and S. anginosus. Selective pressure on virulence factors, phages, pan-genome evolution and single-nucleotide polymorphism analysis were assessed. The structural details of the type seven secretion system (T7SS) were elucidated and compared with different organisms. ily and nanA were both abundant and conserved. Nisin resistance determinants were found in 47% of the isolates. Pan-genome and SNP-based analyses did not reveal significant geo-patterns. Our results showed that two SC isolates were misidentified as SI. We propose the presence of four T7SS modules (I-IV) located on various genomic islands. We detected a variety of factors linked to metal ion binding on the GIs carrying T7SS. This is the first detailed report characterizing the T7SS and its link to nisin resistance and metal ion binding in SI. These and yet uncharacterized T7SS transmembrane proteins merit further studies and could represent potential therapeutic targets.

RevDate: 2019-03-12
CmpDate: 2019-03-12

Grytten I, Rand KD, Nederbragt AJ, et al (2019)

Graph Peak Caller: Calling ChIP-seq peaks on graph-based reference genomes.

PLoS computational biology, 15(2):e1006731 pii:PCOMPBIOL-D-18-00533.

Graph-based representations are considered to be the future for reference genomes, as they allow integrated representation of the steadily increasing data on individual variation. Currently available tools allow de novo assembly of graph-based reference genomes, alignment of new read sets to the graph representation as well as certain analyses like variant calling and haplotyping. We here present a first method for calling ChIP-Seq peaks on read data aligned to a graph-based reference genome. The method is a graph generalization of the peak caller MACS2, and is implemented in an open source tool, Graph Peak Caller. By using the existing tool vg to build a pan-genome of Arabidopsis thaliana, we validate our approach by showing that Graph Peak Caller with a pan-genome reference graph can trace variants within peaks that are not part of the linear reference genome, and find peaks that in general are more motif-enriched than those found by MACS2.

RevDate: 2019-02-20

Lu QF, Cao DM, Su LL, et al (2019)

Genus-Wide Comparative Genomics Analysis of Neisseria to Identify New Genes Associated with Pathogenicity and Niche Adaptation of Neisseria Pathogens.

International journal of genomics, 2019:6015730.

N. gonorrhoeae and N. meningitidis, the only two human pathogens of Neisseria, are closely related species, but the niches they occupy and their pathogenic characteristics are distinctly different. The genetic basis of these differences has not yet been fully elucidated. In this study, comparative genomics analysis was performed based on 15 N. gonorrhoeae, 75 N. meningitidis, and 7 nonpathogenic Neisseria genomes. Core-pangenome analysis found 1111 conserved gene families among them, and each of these species groups had an open pangenome. We found that 452, 78, and 319 gene families were unique in N. gonorrhoeae, N. meningitidis, and both of them, respectively. Those unique gene families were regarded as candidates related to their pathogenicity and niche adaptation. The relationships among them have been partly verified by functional annotation analysis. However, at least one-third of the genes in each gene set have no definite functional annotation. Simple sequence repeats (SSRs), the basis of gene phase variation, were found to be abundant in the membrane or related genes of each unique gene set, which may facilitate their adaptation to variable host environments. Protein-protein interaction (PPI) analysis found at least five distinct PPI clusters in N. gonorrhoeae and four in N. meningitidis, and 167 and 52 proteins with unknown function were contained within them, respectively.

RevDate: 2019-02-20

Stevens MJA, Tasara T, Klumpp J, et al (2019)

Whole-genome-based phylogeny of Bacillus cytotoxicus reveals different clades within the species and provides clues on ecology and evolution.

Scientific reports, 9(1):1984 pii:10.1038/s41598-018-36254-x.

Bacillus cytotoxicus is a member of the Bacillus cereus group linked to fatal cases of diarrheal disease. Information on B. cytotoxicus is very limited; in particular comprehensive genomic data is lacking. Thus, we applied a genomic approach to characterize B. cytotoxicus and decipher its population structure. To this end, complete genomes of ten B. cytotoxicus were sequenced and compared to the four publicly available full B. cytotoxicus genomes and genomes of other B. cereus group members. Average nucleotide identity, core genome, and pan genome clustering resulted in clear distinction of B. cytotoxicus strains from other strains of the B. cereus group. Genomic content analyses showed that a hydroxyphenylalanine operon is present in B. cytotoxicus, but absent in all other members of the B. cereus group. It enables degradation of aromatic compounds to succinate and pyruvate and was likely acquired from another Bacillus species. It allows for utilization of tyrosine and might have given a B. cytotoxicus ancestor an evolutionary advantage resulting in species differentiation. Plasmid content showed that B. cytotoxicus is flexible in exchanging genes, allowing for quick adaptation to the environment. Genome-based phylogenetic analyses divided the B. cytotoxicus strains into four clades that also differed in virulence gene content.

RevDate: 2019-03-24

Lye ZN, MD Purugganan (2019)

Copy Number Variation in Domestication.

Trends in plant science, 24(4):352-365.

Domesticated plants have long served as excellent models for studying evolution. Many genes and mutations underlying important domestication traits have been identified, and most causal mutations appear to be SNPs. Copy number variation (CNV) is an important source of genetic variation that has been largely neglected in studies of domestication. Ongoing work demonstrates the importance of CNVs as a source of genetic variation during domestication, and during the diversification of domesticated taxa. Here, we review how CNVs contribute to evolutionary processes underlying domestication, and review examples of domestication traits caused by CNVs. We draw from examples in plant species, but also highlight cases in animal systems that could illuminate the roles of CNVs in the domestication process.

RevDate: 2019-04-12
CmpDate: 2019-04-12

Arora S, Steuernagel B, Gaurav K, et al (2019)

Resistance gene cloning from a wild crop relative by sequence capture and association genetics.

Nature biotechnology, 37(2):139-143.

Disease resistance (R) genes from wild relatives could be used to engineer broad-spectrum resistance in domesticated crops. We combined association genetics with R gene enrichment sequencing (AgRenSeq) to exploit pan-genome variation in wild diploid wheat and rapidly clone four stem rust resistance genes. AgRenSeq enables R gene cloning in any crop that has a diverse germplasm panel.

RevDate: 2019-04-12
CmpDate: 2019-04-12

Zou Y, Xue W, Luo G, et al (2019)

1,520 reference genomes from cultivated human gut bacteria enable functional microbiome analyses.

Nature biotechnology, 37(2):179-185.

Reference genomes are essential for metagenomic analyses and functional characterization of the human gut microbiota. We present the Culturable Genome Reference (CGR), a collection of 1,520 nonredundant, high-quality draft genomes generated from >6,000 bacteria cultivated from fecal samples of healthy humans. Of the 1,520 genomes, which were chosen to cover all major bacterial phyla and genera in the human gut, 264 are not represented in existing reference genome catalogs. We show that this increase in the number of reference bacterial genomes improves the rate of mapping metagenomic sequencing reads from 50% to >70%, enabling higher-resolution descriptions of the human gut microbiome. We use the CGR genomes to annotate functions of 338 bacterial species, showing the utility of this resource for functional studies. We also carry out a pan-genome analysis of 38 important human gut species, which reveals the diversity and specificity of functional enrichment between their core and dispensable genomes.

RevDate: 2019-05-01

McCarthy CGP, DA Fitzpatrick (2019)

Pan-genome analyses of model fungal species.

Microbial genomics, 5(2):.

The concept of the species 'pan-genome', the union of 'core' conserved genes and all 'accessory' non-conserved genes across all strains of a species, was first proposed in prokaryotes to account for intraspecific variability. Species pan-genomes have been extensively studied in prokaryotes, but evidence of species pan-genomes has also been demonstrated in eukaryotes such as plants and fungi. Using a previously published methodology based on sequence homology and conserved microsynteny, in addition to bespoke pipelines, we have investigated the pan-genomes of four model fungal species: Saccharomyces cerevisiae, Candida albicans, Cryptococcus neoformans var. grubii and Aspergillus fumigatus. Between 80 and 90 % of gene models per strain in each of these species are core genes that are highly conserved across all strains of that species, many of which are involved in housekeeping and conserved survival processes. In many of these species, the remaining 'accessory' gene models are clustered within subterminal regions and may be involved in pathogenesis and antimicrobial resistance. Analysis of the ancestry of species core and accessory genomes suggests that fungal pan-genomes evolve by strain-level innovations such as gene duplication as opposed to wide-scale horizontal gene transfer. Our findings lend further supporting evidence to the existence of species pan-genomes in eukaryote taxa.
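The definition in the abstract above (pan-genome as the union of core and accessory genes) can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which relies on sequence homology and conserved microsynteny; it assumes gene families have already been assigned, and the strain and gene names are hypothetical toy data.

```python
from typing import Dict, Set, Tuple


def partition_pangenome(
    strain_genes: Dict[str, Set[str]]
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return (pan, core, accessory) gene-family sets for a species.

    pan       = genes present in at least one strain (union)
    core      = genes present in every strain (intersection)
    accessory = pan genes that are missing from at least one strain
    """
    if not strain_genes:
        return set(), set(), set()
    gene_sets = list(strain_genes.values())
    pan = set().union(*gene_sets)
    core = set.intersection(*gene_sets)
    accessory = pan - core
    return pan, core, accessory


# Hypothetical toy input: strain name -> gene families found in that strain.
strains = {
    "strain_A": {"geneA", "geneB", "geneC"},
    "strain_B": {"geneA", "geneB", "geneD"},
    "strain_C": {"geneA", "geneB", "geneC", "geneE"},
}

pan, core, accessory = partition_pangenome(strains)
print("pan:", sorted(pan))
print("core:", sorted(core))
print("accessory:", sorted(accessory))

# Share of each strain's genes that are core (the abstract reports
# 80 to 90% for the fungal species studied; the toy data will differ).
for name, genes in strains.items():
    print(name, round(100 * len(core) / len(genes), 1), "% core")
```

Real analyses first cluster genes into families, so the set operations here correspond only to the final bookkeeping step.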

RevDate: 2019-03-23

Lugli GA, Mancino W, Milani C, et al (2019)

Dissecting the Evolutionary Development of the Species Bifidobacterium animalis through Comparative Genomics Analyses.

Applied and environmental microbiology, 85(7): pii:AEM.02806-18.

Bifidobacteria are members of the gut microbiota of animals, including mammals, birds, and social insects. In this study, we analyzed and determined the pangenome of Bifidobacterium animalis species, encompassing B. animalis subsp. animalis and the B. animalis subsp. lactis taxon, which is one of the most intensely exploited probiotic bifidobacterial species. In order to reveal differences within the B. animalis species, detailed comparative genomics and phylogenomics analyses were performed, indicating that these two subspecies recently arose through divergent evolutionary events. A subspecies-specific core genome was identified for both B. animalis subspecies, revealing the existence of subspecies-defining genes involved in carbohydrate metabolism. Notably, these in silico analyses coupled with carbohydrate profiling assays suggest genetic adaptations toward a distinct glycan milieu for each member of the B. animalis subspecies, resulting in a divergent evolutionary development of the two subspecies.

IMPORTANCE: The majority of characterized B. animalis strains have been isolated from human fecal samples. In order to explore genome variability within this species, we isolated 15 novel strains from the gastrointestinal tracts of different animals, including mammals and birds. The present study allowed us to reconstruct the pangenome of this taxon, including the genome contents of 56 B. animalis strains. Through careful assessment of subspecies-specific core genes of the B. animalis subsp. animalis/lactis taxon, we identified genes encoding enzymes involved in carbohydrate transport and metabolism, while unveiling specific gene acquisition and loss events that caused the evolutionary emergence of these two subspecies.

RevDate: 2019-02-08

Nono AD, Chen K, X Liu (2019)

Comparison of different functional prediction scores using a gene-based permutation model for identifying cancer driver genes.

BMC medical genomics, 12(Suppl 1):22 pii:10.1186/s12920-018-0452-9.

BACKGROUND: Identifying cancer driver genes (CDG) is a crucial step in cancer genomics toward the advancement of precision medicine. However, driver gene discovery is a very challenging task because we are not only dealing with a huge amount of data but are also faced with the complexity of the disease, including the heterogeneity of the background somatic mutation rate in each cancer patient. It is generally accepted that CDG harbor variants that confer a growth advantage on the malignant cell, are positively selected, and are critical to cancer development, whereas non-driver genes harbor random mutations with no functional consequence on cancer. Based on this fact, function prediction based approaches for identifying CDG have been proposed to interrogate the distribution of functional predictions among mutations in cancer genomes (eLS 1-16, 2016). Assuming most of the observed mutations are passenger mutations and given the quantitative predictions for the functional impact of the mutations, genes enriched for functional or deleterious mutations are more likely to be drivers. These methods have been continually refined and can therefore be applied to increase accuracy in detecting new candidate CDGs. However, current function prediction based approaches only focus on coding mutations and lack a systematic way to pick the best mutation deleteriousness prediction algorithms for usage.

RESULTS: In this study, we propose a new function prediction based approach to discover CDGs through a gene-based permutation approach. Our method not only covers both coding and non-coding regions of the genes but also accounts for the heterogeneous mutational context in a cohort of cancer patients. The permutation model was implemented independently using seven popular deleteriousness prediction scores covering splicing regions (SPIDEX), coding regions (MetaLR and VEST3) and pan-genome (CADD, DANN, Fathmm-MKL coding and Fathmm-MKL noncoding). We applied this new approach to somatic single nucleotide variants (SNVs) from whole-genome sequences of 119 breast and 24 lung cancer patients and compared the seven deleteriousness prediction scores for their performance in this study.

CONCLUSION: The new function prediction based approach not only predicted known cancer genes listed in the Cancer Gene Census (CGC), but also new candidate CDGs that are worth further investigation. The results showed the advantage of utilizing pan-genome deleteriousness prediction scores in function prediction based methods. Although the VEST3 score, a deleteriousness prediction score for missense mutations, had the best performance in breast cancer, it was topped by CADD and Fathmm-MKL coding, two pan-genome deleteriousness prediction scores, in lung cancer.
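The general idea of a gene-based permutation test on deleteriousness scores can be sketched as follows. This is an illustrative simplification, not the authors' model: it assumes one score per candidate position in a gene, places the same number of mutations uniformly at random to build the null distribution, and ignores the heterogeneous mutational context that the abstract says the real method accounts for. The function name, parameters, and toy scores are all hypothetical.

```python
import random
from typing import Sequence


def gene_permutation_pvalue(
    position_scores: Sequence[float],
    mutated_idx: Sequence[int],
    n_perm: int = 10000,
    seed: int = 0,
) -> float:
    """Empirical p-value that observed mutations are unusually deleterious.

    position_scores: one deleteriousness score per candidate position in
                     the gene (hypothetical input; higher = more damaging).
    mutated_idx:     indices of the positions actually mutated.
    The null places the same number of mutations uniformly at random,
    which is a simplifying assumption made only for this sketch.
    """
    k = len(mutated_idx)
    if k == 0:
        raise ValueError("at least one observed mutation is required")
    rng = random.Random(seed)
    n = len(position_scores)
    observed = sum(position_scores[i] for i in mutated_idx) / k

    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(range(n), k)
        null_mean = sum(position_scores[i] for i in sample) / k
        if null_mean >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Toy gene: 95 low-impact positions and 5 high-impact positions.
scores = [0.1] * 95 + [0.9] * 5
p_driver_like = gene_permutation_pvalue(scores, [95, 96, 97, 98, 99])
p_passenger_like = gene_permutation_pvalue(scores, [0, 1, 2, 3, 4])
print(p_driver_like, p_passenger_like)
```

A gene whose observed mutations fall on high-scoring positions gets a small p-value, while one whose mutations fall on low-scoring positions does not.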

RevDate: 2019-02-03

Leviatan S, E Segal (2019)

A Significant Expansion of Our Understanding of the Composition of the Human Microbiome.

mSystems, 4(1): pii:mSystems00010-19.

Shotgun sequencing of samples taken from the human microbiome often reveals only partial mapping of the sequenced metagenomic reads to existing reference genomes. Such partial mappability indicates that many genomes are missing in our reference genome set. This is particularly true for non-Western populations and for samples that do not originate from the gut. Pasolli et al. (E. Pasolli, F. Asnicar, S. Manara, M. Zolfo, et al., Cell, 2019, https://doi.org/10.1016/j.cell.2019.01.001) perform a grand effort to expand the reference set, and to better classify its members, revealing a wider pangenome of existing species as well as identifying new species of previously unknown taxonomic branches.

RevDate: 2019-01-30

Sánchez-Osuna M, Cortés P, Barbé J, et al (2018)

Origin of the Mobile Di-Hydro-Pteroate Synthase Gene Determining Sulfonamide Resistance in Clinical Isolates.

Frontiers in microbiology, 9:3332.

Sulfonamides are synthetic chemotherapeutic agents that work as competitive inhibitors of the di-hydro-pteroate synthase (DHPS) enzyme, encoded by the folP gene. Resistance to sulfonamides is widespread in the clinical setting and predominantly mediated by plasmid- and integron-borne sul1-3 genes encoding mutant DHPS enzymes that do not bind sulfonamides. In spite of their clinical importance, the genetic origin of sul1-3 genes remains unknown. Here we analyze sul genes and their genetic neighborhoods to uncover sul signature elements that enable the elucidation of their genetic origin. We identify a protein sequence Sul motif associated with sul-encoded proteins, as well as consistent association of a phosphoglucosamine mutase gene (glmM) with the sul2 gene. We identify chromosomal folP genes bearing these genetic markers in two bacterial families: the Rhodobiaceae and the Leptospiraceae. Bayesian phylogenetic inference of FolP/Sul and GlmM protein sequences clearly establishes that sul1-2 and sul3 genes originated as a mobilization of folP genes present in, respectively, the Rhodobiaceae and the Leptospiraceae, and indicates that the Rhodobiaceae folP gene was transferred from the Leptospiraceae. Analysis of %GC content in folP/sul gene sequences supports the phylogenetic inference results and indicates that the emergence of the Sul motif in chromosomally encoded FolP proteins is ancient and considerably predates the clinical introduction of sulfonamides. In vitro assays reveal that both the Rhodobiaceae and the Leptospiraceae, but not other related chromosomally encoded FolP proteins, confer resistance in a sulfonamide-sensitive Escherichia coli background, indicating that the Sul motif is associated with sulfonamide resistance. Given the absence of any known natural sulfonamides targeting DHPS, these results provide a novel perspective on the emergence of resistance to synthetic chemotherapeutic agents, whereby preexisting resistant variants in the vast bacterial pangenome may be rapidly selected for and disseminated upon the clinical introduction of novel chemotherapeuticals.


RJR Experience and Expertise

Researcher

Robbins holds BS, MS, and PhD degrees in the life sciences. He served as a tenured faculty member in the Zoology and Biological Science departments at Michigan State University. He is currently exploring the intersection between genomics, microbial ecology, and biodiversity — an area that promises to transform our understanding of the biosphere.

Educator

Robbins has extensive experience in college-level education: At MSU he taught introductory biology, genetics, and population genetics. At JHU, he was an instructor for a special course on biological database design. At FHCRC, he team-taught a graduate-level course on the history of genetics. At Bellevue College he taught medical informatics.

Administrator

Robbins has been involved in science administration at both the federal and the institutional levels. At NSF he was a program officer for database activities in the life sciences; at DOE he was a program officer for information infrastructure in the human genome project. At the Fred Hutchinson Cancer Research Center, he served as a vice president for fifteen years.

Technologist

Robbins has been involved with information technology since writing his first Fortran program as a college student. At NSF he was the first program officer for database activities in the life sciences. At JHU he held an appointment in the CS department and served as director of the informatics core for the Genome Data Base. At the FHCRC he was VP for Information Technology.

Publisher

While still at Michigan State, Robbins started his first publishing venture, founding a small company that addressed the short-run publishing needs of instructors in very large undergraduate classes. For more than 20 years, Robbins has been operating The Electronic Scholarly Publishing Project, a web site dedicated to the digital publishing of critical works in science, especially classical genetics.

Speaker

Robbins is well-known for his speaking abilities and is often called upon to provide keynote or plenary addresses at international meetings. For example, in July, 2012, he gave a well-received keynote address at the Global Biodiversity Informatics Congress, sponsored by GBIF and held in Copenhagen.

Facilitator

Robbins is a skilled meeting facilitator. He prefers a participatory approach, with part of the meeting involving dynamic breakout groups, created by the participants in real time: (1) individuals propose breakout groups; (2) everyone signs up for one (or more) groups; (3) the groups with the most interested parties then meet, with reports from each group presented and discussed in a subsequent plenary session.

Designer

Robbins has been engaged with photography and design since the 1960s, when he worked for a professional photography laboratory. He now prefers digital photography and tools for their precision and reproducibility. He designed his first web site more than 20 years ago and he personally designed and implemented this web site. He engages in graphic design as a hobby.

963 Red Tail Lane
Bellingham, WA 98226

206-300-3443

E-mail: RJR8222@gmail.com

Collection of publications by R J Robbins

Reprints and preprints of publications, slide presentations, instructional materials, and data compilations written or prepared by Robert Robbins. Most papers deal with computational biology, genome informatics, using information technology to support biomedical research, and related matters.

Research Gate page for R J Robbins

ResearchGate is a social networking site for scientists and researchers to share papers, ask and answer questions, and find collaborators. According to a study by Nature and an article in Times Higher Education, it is the largest academic social network in terms of active users.

Curriculum Vitae for R J Robbins

short personal version

Curriculum Vitae for R J Robbins

long standard version

RJR Picks from Around the Web (updated 11 MAY 2018)