


27 Sep 2020 at 01:30

Bibliography on: Pangenome


Robert J. Robbins is a biologist, an educator, a science administrator, a publisher, an information technologist, and an IT leader and manager who specializes in advancing biomedical knowledge and supporting education through the application of information technology.

RJR: Recommended Bibliography 27 Sep 2020 at 01:30


Although the enforced stability of genomic content is ubiquitous among multicellular eukaryotes (MCEs), the opposite is proving to be the case among prokaryotes, which exhibit remarkable and adaptive plasticity of genomic content. Early bacterial whole-genome sequencing efforts discovered that whenever a particular "species" was re-sequenced, new genes were found that had not been detected earlier: entirely new genes, not merely new alleles. This led to the concepts of the bacterial core-genome, the set of genes found in all members of a particular "species", and the flex-genome, the set of genes found in some, but not all, members of the "species". Together these make up the species' pan-genome.
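As a rough illustration of these definitions, the sketch below partitions a toy collection of strains into pan-, core- and flex-genome using plain Python sets. The strain and gene names are hypothetical placeholders, and the sketch assumes that genes have already been grouped into comparable families, which is the hard part of a real analysis.

```python
# Hypothetical gene content for three strains of one "species".
# All strain and gene names are illustrative, not real data.
strains = {
    "strain_A": {"geneA", "geneB", "geneC", "geneD"},
    "strain_B": {"geneA", "geneB", "geneC", "geneE"},
    "strain_C": {"geneA", "geneB", "geneF"},
}


def partition_pangenome(gene_sets):
    """Return (pan, core, flex) gene sets for a dict of strain -> gene set."""
    sets = list(gene_sets.values())
    if not sets:
        return set(), set(), set()
    pan = set().union(*sets)        # genes found in at least one strain
    core = set.intersection(*sets)  # genes found in every strain
    flex = pan - core               # genes found in some, but not all, strains
    return pan, core, flex


pan, core, flex = partition_pangenome(strains)
print("pan-genome size:", len(pan))
print("core-genome:", sorted(core))
print("flex-genome:", sorted(flex))
```

Each strain's gene set is a union of the shared core and its own slice of the flex-genome, so re-sequencing another strain can only keep the pan-genome the same size or enlarge it.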

Created with PubMed® Query: pangenome or "pan-genome" or "pan genome" NOT pmcbook NOT ispreviousversion
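For readers who want to rerun a search like this programmatically, the following sketch builds a request URL for the NCBI E-utilities `esearch` endpoint using the query above. It only constructs the URL and makes no network call. The `retmax` and `retstart` values are illustrative assumptions, and nothing here is meant to describe how this page itself was generated.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Query string copied from the line above.
QUERY = 'pangenome or "pan-genome" or "pan genome" NOT pmcbook NOT ispreviousversion'


def build_esearch_url(term, retmax=100, retstart=0):
    """Build an esearch URL for PubMed; retmax/retstart are assumed paging values."""
    params = {
        "db": "pubmed",        # search the PubMed database
        "term": term,          # the query, URL-encoded by urlencode
        "retmax": retmax,      # number of IDs to return per request
        "retstart": retstart,  # offset of the first ID to return
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


print(build_esearch_url(QUERY))
```

Fetching the resulting URL would return a list of PubMed IDs rather than the formatted citations shown below; turning IDs into citations is a separate step not sketched here.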

Citations The Papers (from PubMed®)


RevDate: 2020-09-25

Feng Y, Fan X, Zhu L, et al (2020)

Phylogenetic and genomic analysis reveals high genomic openness and genetic diversity of Clostridium perfringens.

Microbial genomics [Epub ahead of print].

Clostridium perfringens is associated with a variety of diseases in both humans and animals. Recent advances in genomic sequencing make it timely to re-visit this important pathogen. Although the genome sequence of C. perfringens was first determined in 2002, large-scale comparative genomics with isolates of different origins is still lacking. In this study, we used whole-genome sequencing of 45 C. perfringens isolates with isolation time spanning an 80-year period and performed comparative analysis of 173 genomes from worldwide strains. We also conducted phylogenetic lineage analysis and introduced an openness index (OI) to evaluate the openness of bacterial genomes. We classified all these genomes into five lineages and hypothesized that the origin of C. perfringens dates back to ~80 000 years ago. We showed that the pangenome of the 173 C. perfringens strains contained a total of 26 954 genes, while the core genome comprised 1020 genes, accounting for about a third of the genome of each isolate. We demonstrated that C. perfringens had the highest OI compared with 51 other bacterial species. Intact prophage sequences were found in nearly 70.0 % of C. perfringens genomes, while CRISPR sequences were found only in ~40.0 %. Plasmids were prevalent in C. perfringens isolates, and half of the virulence genes and antibiotic resistance genes (ARGs) identified in all the isolates could be found in plasmids. ARG-sharing network analysis showed that C. perfringens shared its 11 ARGs with 55 different bacterial species, and a high frequency of ARG transfer may have occurred between C. perfringens and species in the genera Streptococcus and Staphylococcus. Correlation analysis showed that the ARG number in C. perfringens strains increased with time, while the virulence gene number was relatively stable. Our results, taken together with previous studies, revealed the high genome openness and genetic diversity of C. perfringens and provide a comprehensive view of the phylogeny, genomic features, virulence gene and ARG profiles of worldwide strains.

RevDate: 2020-09-25

Rautiainen M, T Marschall (2020)

GraphAligner: rapid and versatile sequence-to-graph alignment.

Genome biology, 21(1):253 pii:10.1186/s13059-020-02157-2.

Genome graphs can represent genetic variation and sequence uncertainty. Aligning sequences to genome graphs is key to many applications, including error correction, genome assembly, and genotyping of variants in a pangenome graph. Yet, so far, this step is often prohibitively slow. We present GraphAligner, a tool for aligning long reads to genome graphs. Compared to the state-of-the-art tools, GraphAligner is 13x faster and uses 3x less memory. When employing GraphAligner for error correction, we find it to be more than twice as accurate and over 12x faster than extant tools. Availability: Package manager: https://anaconda.org/bioconda/graphaligner and source code: https://github.com/maickrau/GraphAligner.

RevDate: 2020-09-24

Sánchez-Osuna M, Cortés P, Llagostera M, et al (2020)

Exploration into the origins and mobilization of di-hydrofolate reductase genes and the emergence of clinical resistance to trimethoprim.

Microbial genomics [Epub ahead of print].

Trimethoprim is a synthetic antibacterial agent that targets folate biosynthesis by competitively binding to the di-hydrofolate reductase enzyme (DHFR). Trimethoprim is often administered synergistically with sulfonamide, another chemotherapeutic agent targeting the di-hydropteroate synthase (DHPS) enzyme in the same pathway. Clinical resistance to both drugs is widespread and mediated by enzyme variants capable of performing their biological function without binding to these drugs. These mutant enzymes were assumed to have arisen after the discovery of these synthetic drugs, but recent work has shown that genes conferring resistance to sulfonamide were present in the bacterial pangenome millions of years ago. Here, we apply phylogenetics and comparative genomics methods to study the largest family of mobile trimethoprim-resistance genes (dfrA). We show that most of the dfrA genes identified to date map to two large clades that likely arose from independent mobilization events. In contrast to sulfonamide resistance (sul) genes, we find evidence of recurrent mobilization in dfrA genes. Phylogenetic evidence allows us to identify novel dfrA genes in the emerging pathogen Acinetobacter baumannii, and we confirm their resistance phenotype in vitro. We also identify a cluster of dfrA homologues in cryptic plasmid and phage genomes, but we show that these enzymes do not confer resistance to trimethoprim. Our methods also allow us to pinpoint the chromosomal origin of previously reported dfrA genes, and we show that many of these ancient chromosomal genes also confer resistance to trimethoprim. Our work reveals that trimethoprim resistance predated the clinical use of this chemotherapeutic agent, but that novel mutations have likely also arisen and become mobilized following its widespread use within and outside the clinic. Hence, this work confirms that resistance to novel drugs may already be present in the bacterial pangenome, and stresses the importance of rapid mobilization as a fundamental element in the emergence and global spread of resistance determinants.

RevDate: 2020-09-24

Jin L, Chen Y, Yang W, et al (2020)

Complete genome sequence of fish-pathogenic Aeromonas hydrophila HX-3 and a comparative analysis: insights into virulence factors and quorum sensing.

Scientific reports, 10(1):15479 pii:10.1038/s41598-020-72484-8.

The gram-negative, aerobic, rod-shaped bacterium Aeromonas hydrophila, the causative agent of motile aeromonad septicaemia, has attracted increasing attention due to its high pathogenicity. Here, we constructed the complete genome sequence of a virulent strain, A. hydrophila HX-3 isolated from Pseudosciaena crocea and performed comparative genomics to investigate its virulence factors and quorum sensing features in comparison with those of other Aeromonas isolates. HX-3 has a circular chromosome of 4,941,513 bp with a 61.0% G + C content encoding 4483 genes, including 4318 protein-coding genes, and 31 rRNA, 127 tRNA and 7 ncRNA operons. Seventy interspersed repeat and 153 tandem repeat sequences, 7 transposons, 8 clustered regularly interspaced short palindromic repeats, and 39 genomic islands were predicted in the A. hydrophila HX-3 genome. Phylogeny and pan-genome were also analyzed herein to confirm the evolutionary relationships on the basis of comparisons with other fully sequenced Aeromonas genomes. In addition, the assembled HX-3 genome was successfully annotated against the Cluster of Orthologous Groups of proteins database (76.03%), Gene Ontology database (18.13%), and Kyoto Encyclopedia of Genes and Genome pathway database (59.68%). Two-component regulatory systems in the HX-3 genome and virulence factors profiles through comparative analysis were predicted, providing insights into pathogenicity. A large number of genes related to the AHL-type 1 (ahyI, ahyR), LuxS-type 2 (luxS, pfs, metEHK, litR, luxOQU) and QseBC-type 3 (qseB, qseC) autoinducer systems were also identified. As a result of the expression of the ahyI gene in Escherichia coli BL21 (DE3), combined UPLC-MS/MS profiling led to the identification of several new N-acyl-homoserine lactone compounds synthesized by AhyI. This genomic analysis determined the comprehensive QS systems of A. hydrophila, which might provide novel information regarding the mechanisms of virulence signatures correlated with QS.

RevDate: 2020-09-22

Fang X, Lloyd CJ, BO Palsson (2020)

Reconstructing organisms in silico: genome-scale models and their emerging applications.

Nature reviews. Microbiology pii:10.1038/s41579-020-00440-4 [Epub ahead of print].

Escherichia coli is considered to be the best-known microorganism given the large number of published studies detailing its genes, its genome and the biochemical functions of its molecular components. This vast literature has been systematically assembled into a reconstruction of the biochemical reaction networks that underlie E. coli's functions, a process which is now being applied to an increasing number of microorganisms. Genome-scale reconstructed networks are organized and systematized knowledge bases that have multiple uses, including conversion into computational models that interpret and predict phenotypic states and the consequences of environmental and genetic perturbations. These genome-scale models (GEMs) now enable us to develop pan-genome analyses that provide mechanistic insights, detail the selection pressures on proteome allocation and address stress phenotypes. In this Review, we first discuss the overall development of GEMs and their applications. Next, we review the evolution of the most complete GEM that has been developed to date: the E. coli GEM. Finally, we explore three emerging areas in genome-scale modelling of microbial phenotypes: collections of strain-specific models, metabolic and macromolecular expression models, and simulation of stress responses.

RevDate: 2020-09-22

Phanse Y, Wu CW, Venturino AJ, et al (2020)

A Protective Vaccine against Johne's Disease in Cattle.

Microorganisms, 8(9): pii:microorganisms8091427.

Johne's disease (JD) caused by Mycobacterium avium subsp. paratuberculosis (M. paratuberculosis) is a chronic infection characterized by the development of granulomatous enteritis in wild and domesticated ruminants. It is one of the most significant livestock diseases not only in the USA but also globally, accounting for USD 200-500 million losses annually for the USA alone, with a potential link to cases of Crohn's disease in humans. Developing safe and protective vaccines is of paramount importance for JD control in dairy cows. The current study evaluated the safety, immunity and protective efficacy of a novel live attenuated vaccine (LAV) candidate with and without an adjuvant in comparison to an inactivated vaccine. Results indicated that the LAV, irrespective of the adjuvant presence, induced robust T cell immune responses indicated by proinflammatory cytokine production such as IFN-γ, IFN-α, TNF-α and IL-17 as well as strong response to intradermal skin test against M. paratuberculosis antigens. Furthermore, the LAV was safe with minimal tissue pathology. Finally, calves vaccinated with adjuvanted LAV did not shed M. paratuberculosis post-challenge, a much-desired characteristic of an effective vaccine against JD. Together, this data suggests a strong potential of testing LAV in field trials to curb JD in dairy herds.

RevDate: 2020-09-17

Zhong C, Wang L, K Ning (2020)

Pan-genome study of Thermococcales reveals extensive genetic diversity and genetic evidence of thermophilic adaption.

Environmental microbiology [Epub ahead of print].

Thermococcales has a strong adaptability to extreme environments, which is of profound interest in explaining how complex life forms emerge on earth. However, their gene composition, thermal stability and evolution in hyperthermal environments are still little known. Here, we characterized the pan-genome architecture of 30 Thermococcales species to gain insight into their genetic properties, evolutionary patterns, and specific metabolisms adapted to niches. We revealed an open pan-genome of Thermococcales comprising 6,070 gene families that tends to increase with the availability of additional genomes. The genome contents of Thermococcales were flexible, with a series of genes that experienced gene duplication, progressive divergence, or gene gain and loss events exhibiting distinct functional features. These archaea had concise types of heat shock proteins, such as HSP20, HSP60 and prefoldin, which were constrained by strong purifying selection that governed their conservative evolution. Furthermore, purifying selection forced genes involved in enzyme, motility, secretion system, defense system and chaperones to differ in functional constraints and their disparity in the rate of evolution may be related to adaptation to specific niche. These results deepened our understanding of genetic diversity and adaptation patterns of Thermococcales, and provided valuable research models for studying the metabolic traits of early life forms.
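The idea of an "open" pan-genome, one whose gene-family count keeps rising as genomes are added, can be pictured with a simple accumulation curve. The sketch below is not the authors' pipeline: the gene-family labels are made up, and it walks through the genomes in a single fixed order, whereas a real analysis would typically average over many orderings.

```python
def accumulation_curve(genomes):
    """Return [(pan_size, core_size), ...] after adding each genome in order.

    `genomes` is a list of sets of gene-family labels (hypothetical data).
    """
    pan = set()
    core = None
    curve = []
    for families in genomes:
        pan |= families  # the pan-genome can only grow or stay the same
        # the core can only shrink or stay the same
        core = set(families) if core is None else core & families
        curve.append((len(pan), len(core)))
    return curve


# Toy input: three genomes, each a set of illustrative gene-family labels.
toy_genomes = [
    {"f1", "f2", "f3"},
    {"f1", "f2", "f4"},
    {"f1", "f3", "f5", "f6"},
]

for n, (pan_size, core_size) in enumerate(accumulation_curve(toy_genomes), start=1):
    print(f"{n} genome(s): pan={pan_size} core={core_size}")
```

In this toy run the pan-genome size is still climbing at the last genome, which is the qualitative pattern described as "open"; a curve that flattens out would instead suggest a closed pan-genome.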

RevDate: 2020-09-17

Khan M, Stapleton F, Summers S, et al (2020)

Antibiotic Resistance Characteristics of Pseudomonas aeruginosa Isolated from Keratitis in Australia and India.

Antibiotics (Basel, Switzerland), 9(9): pii:antibiotics9090600.

This study investigated genomic differences in Australian and Indian Pseudomonas aeruginosa isolates from keratitis (infection of the cornea). Overall, the Indian isolates were resistant to more antibiotics, with some of those isolates being multi-drug resistant. Acquired genes were related to resistance to fluoroquinolones, aminoglycosides, beta-lactams, macrolides, sulphonamides, and tetracycline and were more frequent in Indian (96%) than in Australian (35%) isolates (p = 0.02). Indian isolates had large numbers of gene variations (median 50,006, IQR = 26,967-50,600) compared to Australian isolates (median 26,317, IQR = 25,681-33,780). There were a larger number of mutations in the mutL and uvrD genes associated with the mismatch repair (MMR) system in Indian isolates, which may result in strains losing their efficacy for DNA repair. The number of gene variations was greater in isolates carrying MMR system genes or exoU. In the phylogenetic division, the number of core genes was similar in both groups, but Indian isolates had larger numbers of pan genes (median 6518, IQR = 6040-6935). Clones related to three different sequence types (ST308, ST316, and ST491) were found among Indian isolates. Only one clone, ST233, containing two strains was present in Australian isolates. The most striking differences between Australian and Indian isolates were carriage of exoU (that encodes a cytolytic phospholipase) in Indian isolates and exoS (that encodes for GTPase activator activity) in Australian isolates, large number of acquired resistance genes, greater changes to MMR genes, and a larger pan genome as well as increased overall genetic variation in the Indian isolates.

RevDate: 2020-09-16

Yin Z, Zhang S, Wei Y, et al (2020)

Horizontal Gene Transfer Clarifies Taxonomic Confusion and Promotes the Genetic Diversity and Pathogenicity of Plesiomonas shigelloides.

mSystems, 5(5): pii:5/5/e00448-20.

Plesiomonas shigelloides is an emerging pathogen that has been shown to be involved in gastrointestinal diseases and extraintestinal infections in humans. However, the taxonomic position, evolutionary dynamics, and pathogenesis of P. shigelloides remain unclear. We reported the draft genome sequences of 12 P. shigelloides strains representing different serogroups. We were able to determine a clear distinction between P. shigelloides and other members of Enterobacterales via core genome phylogeny, Neighbor-Net network, and average genome identity analysis. The pan-genome analysis of P. shigelloides revealed extensive genetic diversity and presented large flexible gene repertoires, while the core genome phylogeny exhibited a low level of clonality. The discordance between the core genome phylogeny and the pan-genome phylogeny indicated that flexible accessory genomes account for an important proportion of the evolution of P. shigelloides, which was subsequently characterized by determinations of hundreds of horizontally transferred genes (horizontal genes), massive gene expansions and contractions, and diverse mobile genetic elements (MGEs). The apparently high levels of horizontal gene transfer (HGT) in P. shigelloides were conferred from bacteria with novel properties from other taxa (mainly Vibrionaceae and Aeromonadaceae), which caused the historical taxonomic confusion and shaped the virulence gene pools. Furthermore, P. shigelloides genomes contain many macromolecular secretion system genes, virulence factor genes, and resistance genes, indicating its potential to cause intestinal and invasive infections. Collectively, our work provides insights into the phylogenetic position, evolutionary dynamic, and pathogenesis of P. shigelloides at the genomic level, which could facilitate the observation and research of this important pathogen.

IMPORTANCE: The taxonomic position of P. shigelloides has been the subject of debate for a long time, and until now, the evolutionary dynamics and pathogenesis of P. shigelloides were unclear. In this study, pan-genome analysis indicated extensive genetic diversity and the presence of large and variable gene repertoires. Our results revealed that horizontal gene transfer was the focal driving force for the genetic diversity of the P. shigelloides pan-genome and might have contributed to the emergence of novel properties. Vibrionaceae and Aeromonadaceae were found to be the predominant donor taxa for horizontal genes, which might have caused the taxonomic confusion historically. Comparative genomic analysis revealed the potential of P. shigelloides to cause intestinal and invasive diseases. Our results could advance the understanding of the evolution and pathogenesis of P. shigelloides, particularly in elucidating the role of horizontal gene transfer and investigating virulence-related elements.

RevDate: 2020-09-16

Ross DE, Marshall CW, Gulliver D, et al (2020)

Defining Genomic and Predicted Metabolic Features of the Acetobacterium Genus.

mSystems, 5(5): pii:5/5/e00277-20.

Acetogens are anaerobic bacteria capable of fixing CO2 or CO to produce acetyl coenzyme A (acetyl-CoA) and ultimately acetate using the Wood-Ljungdahl pathway (WLP). Acetobacterium woodii is the type strain of the Acetobacterium genus and has been critical for understanding the biochemistry and energy conservation in acetogens. Members of the Acetobacterium genus have been isolated from a variety of environments or have had genomes recovered from metagenome data, but no systematic investigation has been done on the unique and various metabolisms of the genus. To gain a better appreciation for the metabolic breadth of the genus, we sequenced the genomes of 4 isolates (A. fimetarium, A. malicum, A. paludosum, and A. tundrae) and conducted a comparative genome analysis (pan-genome) of 11 different Acetobacterium genomes. A unifying feature of the Acetobacterium genus is the carbon-fixing WLP. The methyl (cluster II) and carbonyl (cluster III) branches of the Wood-Ljungdahl pathway are highly conserved across all sequenced Acetobacterium genomes, but cluster I encoding the formate dehydrogenase is not. In contrast to A. woodii, all but four strains encode two distinct Rnf clusters, Rnf being the primary respiratory enzyme complex. Metabolism of fructose, lactate, and H2:CO2 was conserved across the genus, but metabolism of ethanol, methanol, caffeate, and 2,3-butanediol varied. Additionally, clade-specific metabolic potential was observed, such as amino acid transport and metabolism in the psychrophilic species, and biofilm formation in the A. wieringae clade, which may afford these groups an advantage in low-temperature growth or attachment to solid surfaces, respectively.

IMPORTANCE: Acetogens are anaerobic bacteria capable of fixing CO2 or CO to produce acetyl-CoA and ultimately acetate using the Wood-Ljungdahl pathway (WLP). This autotrophic metabolism plays a major role in the global carbon cycle and, if harnessed, can help reduce greenhouse gas emissions. Overall, the data presented here provide a framework for examining the ecology and evolution of the Acetobacterium genus and highlight the potential of these species as a source for production of fuels and chemicals from CO2 feedstocks.

RevDate: 2020-09-15

Chen Z, Erickson DL, J Meng (2020)

Benchmarking hybrid assembly approaches for genomic analyses of bacterial pathogens using Illumina and Oxford Nanopore sequencing.

BMC genomics, 21(1):631 pii:10.1186/s12864-020-07041-8.

BACKGROUND: We benchmarked the hybrid assembly approaches of MaSuRCA, SPAdes, and Unicycler for bacterial pathogens using Illumina and Oxford Nanopore sequencing by determining genome completeness and accuracy, antimicrobial resistance (AMR), virulence potential, multilocus sequence typing (MLST), phylogeny, and pan genome. Ten bacterial species (10 strains) were tested for simulated reads of both mediocre- and low-quality, whereas 11 bacterial species (12 strains) were tested for real reads.

RESULTS: Unicycler performed the best for achieving contiguous genomes, closely followed by MaSuRCA, while all SPAdes assemblies were incomplete. MaSuRCA was less tolerant of low-quality long reads than SPAdes and Unicycler. The hybrid assemblies of five antimicrobial-resistant strains with simulated reads provided consistent AMR genotypes with the reference genomes. The MaSuRCA assembly of Staphylococcus aureus with real reads contained msr(A) and tet(K), while the reference genome and SPAdes and Unicycler assemblies harbored blaZ. The AMR genotypes of the reference genomes and hybrid assemblies were consistent for the other five antimicrobial-resistant strains with real reads. The numbers of virulence genes in all hybrid assemblies were similar to those of the reference genomes, irrespective of simulated or real reads. Only one exception existed that the reference genome and hybrid assemblies of Pseudomonas aeruginosa with mediocre-quality long reads carried 241 virulence genes, whereas 184 virulence genes were identified in the hybrid assemblies of low-quality long reads. The MaSuRCA assemblies of Escherichia coli O157:H7 and Salmonella Typhimurium with mediocre-quality long reads contained 126 and 118 virulence genes, respectively, while 110 and 107 virulence genes were detected in their MaSuRCA assemblies of low-quality long reads, respectively. All approaches performed well in our MLST and phylogenetic analyses. The pan genomes of the hybrid assemblies of S. Typhimurium with mediocre-quality long reads were similar to that of the reference genome, while SPAdes and Unicycler were more tolerant of low-quality long reads than MaSuRCA for the pan-genome analysis. All approaches functioned well in the pan-genome analysis of Campylobacter jejuni with real reads.

CONCLUSIONS: Our research demonstrates the hybrid assembly pipeline of Unicycler as a superior approach for genomic analyses of bacterial pathogens using Illumina and Oxford Nanopore sequencing.

RevDate: 2020-09-14

Psomopoulos FE, van Helden J, Médigue C, et al (2020)

Ancestral state reconstruction of metabolic pathways across pangenome ensembles.

Microbial genomics [Epub ahead of print].

As genome sequencing efforts are unveiling the genetic diversity of the biosphere with an unprecedented speed, there is a need to accurately describe the structural and functional properties of groups of extant species whose genomes have been sequenced, as well as their inferred ancestors, at any given taxonomic level of their phylogeny. Elaborate approaches for the reconstruction of ancestral states at the sequence level have been developed, subsequently augmented by methods based on gene content. While these approaches of sequence or gene-content reconstruction have been successfully deployed, there has been less progress on the explicit inference of functional properties of ancestral genomes, in terms of metabolic pathways and other cellular processes. Herein, we describe PathTrace, an efficient algorithm for parsimony-based reconstructions of the evolutionary history of individual metabolic pathways, pivotal representations of key functional modules of cellular function. The algorithm is implemented as a five-step process through which pathways are represented as fuzzy vectors, where each enzyme is associated with a taxonomic conservation value derived from the phylogenetic profile of its protein sequence. The method is evaluated with a selected benchmark set of pathways against collections of genome sequences from key data resources. By deploying a pangenome-driven approach for pathway sets, we demonstrate that the inferred patterns are largely insensitive to noise, as opposed to gene-content reconstruction methods. In addition, the resulting reconstructions are closely correlated with the evolutionary distance of the taxa under study, suggesting that a diligent selection of target pangenomes is essential for maintaining cohesiveness of the method and consistency of the inference, serving as an internal control for an arbitrary selection of queries. The PathTrace method is a first step towards the large-scale analysis of metabolic pathway evolution and our deeper understanding of functional relationships reflected in emerging pangenome collections.
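To give a feel for what a parsimony-based ancestral reconstruction does, the sketch below applies the classic Fitch small-parsimony pass to a single presence/absence character on a toy binary tree. This is a generic textbook technique and is not the PathTrace algorithm, which the abstract describes as working on fuzzy vectors of enzyme conservation values. The tree shape, taxon names and states are all hypothetical.

```python
def fitch(tree, leaf_states):
    """Bottom-up Fitch pass on a binary tree given as nested 2-tuples of leaf names.

    Returns (candidate_states_at_this_node, minimum_number_of_state_changes).
    """
    if isinstance(tree, str):  # a leaf: its state is observed
        return {leaf_states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, leaf_states)
    right_set, right_cost = fitch(right, leaf_states)
    shared = left_set & right_set
    if shared:
        # children agree on at least one state: no extra change needed here
        return shared, left_cost + right_cost
    # children disagree: keep both options and count one change
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical example: 1 = pathway enzyme present, 0 = absent.
toy_tree = (("taxon1", "taxon2"), ("taxon3", "taxon4"))
toy_states = {"taxon1": 1, "taxon2": 1, "taxon3": 0, "taxon4": 1}

root_states, changes = fitch(toy_tree, toy_states)
print("candidate root states:", root_states)
print("minimum changes:", changes)
```

For this toy input the most parsimonious explanation is that the ancestor carried the enzyme and one lineage lost it, which is the kind of inference a pathway-level method would then have to aggregate across many enzymes.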

RevDate: 2020-09-13

Gardon H, Biderre-Petit C, Jouan-Dufournel I, et al (2020)

A drift-barrier model drives the genomic landscape of a structured bacterial population.

Molecular ecology [Epub ahead of print].

Bacterial populations differentiate over time and space to form distinct genetic units. The mechanisms governing this diversification are presumed to result from the ecological context of living units to adapt to specific niches. Recently, a model assuming the acquisition of advantageous genes among populations rather than whole genome sweeps has emerged to explain population differentiation. However, the characteristics of these exchanged, or flexible, genes and whether their evolution is driven by adaptive or neutral processes remain controversial. By analysing the flexible genome of single-amplified genomes of co-occurring populations of the marine Prochlorococcus HLII ecotype, we highlight that genomic compartments - rather than population units - are characterized by different evolutionary trajectories. The dynamics of gene fluxes vary across genomic compartments and therefore the effectiveness of selection depends on the fluctuation of the effective population size along the genome. Taken together, these results support the drift-barrier model of bacterial evolution.

RevDate: 2020-09-11

Christian RW, Hewitt SL, Nelson G, et al (2020)

Plastid transit peptides-where do they come from and where do they all belong? Multi-genome and pan-genomic assessment of chloroplast transit peptide evolution.

PeerJ, 8:e9772 pii:9772.

Subcellular relocalization of proteins determines an organism's metabolic repertoire and thereby its survival in unique evolutionary niches. In plants, the plastid and its various morphotypes import a large and varied number of nuclear-encoded proteins to orchestrate vital biochemical reactions in a spatiotemporal context. Recent comparative genomics analysis and high-throughput shotgun proteomics data indicate that there are a large number of plastid-targeted proteins that are either semi-conserved or non-conserved across different lineages. This implies that homologs are differentially targeted across different species, which is feasible only if proteins have gained or lost plastid targeting peptides during evolution. In this study, a broad, multi-genome analysis of 15 phylogenetically diverse genera and in-depth analyses of pangenomes from Arabidopsis and Brachypodium were performed to address the question of how proteins acquire or lose plastid targeting peptides. The analysis revealed that random insertions or deletions were the dominant mechanism by which novel transit peptides are gained by proteins. While gene duplication was not a strict requirement for the acquisition of novel subcellular targeting, 40% of novel plastid-targeted genes were found to be most closely related to a sequence within the same genome, and of these, 30.5% resulted from alternative transcription or translation initiation sites. Interestingly, analysis of the distribution of amino acids in the transit peptides of known and predicted chloroplast-targeted proteins revealed monocot and eudicot-specific preferences in residue distribution.

RevDate: 2020-09-09

Zhang X, Li F, Cui S, et al (2020)

Prevalence and Distribution Characteristics of blaKPC-2 and blaNDM-1 Genes in Klebsiella pneumoniae.

Infection and drug resistance, 13:2901-2910 pii:253631.

Background: Carbapenem-resistant Klebsiella pneumoniae infections have caused major concern and posed a global threat to public health. As blaKPC-2 and blaNDM-1 genes are the most widely reported carbapenem resistant genes in K. pneumoniae, it is crucial to study the prevalence and geographical distribution of these two genes for further understanding of their transmission mode and mechanism.

Purpose: Here, we investigated the prevalence and distribution of blaKPC-2 and blaNDM-1 genes in carbapenem-resistant K. pneumoniae strains from a tertiary hospital and from 1579 genomes available in the NCBI database, and further analyzed the possible core structure of blaKPC-2 or blaNDM-1 genes among global genome data.

Materials and Methods: K. pneumoniae strains from a tertiary hospital in China during 2013-2018 were collected and their antimicrobial susceptibility testing for 28 antibiotics was determined. Whole-genome sequencing of carbapenem-resistant K. pneumoniae strains was used to investigate the genetic characterization. The phylogenetic relationships of these strains were investigated through pan-genome analysis. The epidemiology and distribution of blaKPC-2 and blaNDM-1 genes in K. pneumoniae based on 1579 global genomes and carbapenem-resistant K. pneumoniae strains from hospital were analyzed using bioinformatics. The possible core structure carrying blaKPC-2 or blaNDM-1 genes was investigated among global data.

Results: A total of 19 carbapenem-resistant K. pneumoniae were isolated in a tertiary hospital. All isolates had a multi-resistant pattern and eight kinds of resistance genes. The phylogenetic analysis showed all isolates in the hospital were dominated by two lineages composed of ST11 and ST25, respectively. ST11 and ST25 were the major ST type carrying blaKPC-2 and blaNDM-1 genes, respectively. Among 1579 global genomes data, 147 known ST types (1195 genomes) have been identified, while ST258 (23.6%) and ST11 (22.1%) were the globally prevalent clones among the known ST types. Genetic environment analysis showed that the ISKpn7-dnaA/ISKpn27-blaKPC-2-ISkpn6 and blaNDM-1-ble-trpf-nagA may be the core structure in the horizontal transfer of blaKPC-2 and blaNDM-1, respectively. In addition, DNA transferase (hin) may be involved in the horizontal transfer or the expression of blaNDM-1.

Conclusion: There was clonal transmission of carbapenem-resistant K. pneumoniae in the tertiary hospital in China. The prevalence and distribution of blaKPC-2 and blaNDM-1 varied by country and were driven by different transposons carrying the core structure. This study sheds light on the genetic environment of blaKPC-2 and blaNDM-1 and offers basic information about the mechanism of carbapenem-resistant K. pneumoniae dissemination.

RevDate: 2020-09-09

Liu Y, Z Tian (2020)

From one linear genome to a graph-based pan-genome: a new era for genomics.

Science China. Life sciences pii:10.1007/s11427-020-1808-0 [Epub ahead of print].

RevDate: 2020-09-09

González-Dominici LI, Saati-Santamaría Z, P García-Fraile (2020)

Genome Analysis and Genomic Comparison of the Novel Species Arthrobacter ipsi Reveal Its Potential Protective Role in Its Bark Beetle Host.

Microbial ecology pii:10.1007/s00248-020-01593-8 [Epub ahead of print].

The pine engraver beetle, Ips acuminatus Gyll, is a bark beetle that causes significant damage in Scots pine (Pinus sylvestris) forests and plantations. Like almost all higher organisms, Ips acuminatus harbours a microbiome, although the role of most members of its microbiome is not well understood. As part of a study in which we analysed the bacterial diversity associated with Ips acuminatus, we isolated the strain Arthrobacter sp. IA7. In order to study its potential role within the bark beetle holobiont, we sequenced and explored its genome and performed a pan-genome analysis of the genus Arthrobacter, which revealed genes specific to strain IA7 that might be related to its particular role in its niche. Based on these investigations, we suggest several potential roles of the bacterium within the beetle. Analysis of genes related to secondary metabolism indicated potential antifungal capability, confirmed by the inhibition of several entomopathogenic fungal strains (Metarhizium anisopliae CCF0966, Lecanicillium muscarium CCF6041, L. muscarium CCF3297, Isaria fumosorosea CCF4401, I. farinosa CCF4808, Beauveria bassiana CCF4422 and B. brongniartii CCF1547). Phylogenetic analyses of the 16S rRNA gene, six concatenated housekeeping genes (tuf-secY-rpoB-recA-fusA-atpD) and genome sequences indicated that strain IA7 is closely related to A. globiformis NBRC 12137T but forms a new species within the genus Arthrobacter; this was confirmed by digital DNA-DNA hybridization (37.10%) and average nucleotide identity (ANIb) (88.9%). Based on phenotypic and genotypic features, we propose strain IA7T as the novel species Arthrobacter ipsi sp. nov. (type strain IA7T = CECT 30100T = LMG 31782T) and suggest its protective role for its host.

RevDate: 2020-09-08

Boisen N, Østerlund MT, Joensen KG, et al (2020)

Redefining enteroaggregative Escherichia coli (EAEC): Genomic characterization of epidemiological EAEC strains.

PLoS neglected tropical diseases, 14(9):e0008613 pii:PNTD-D-20-00385 [Epub ahead of print].

Although enteroaggregative E. coli (EAEC) has been implicated as a common cause of diarrhea in multiple settings, neither its essential genomic nature nor its role as an enteric pathogen are fully understood. The current definition of this pathotype requires demonstration of cellular adherence; a working molecular definition encompasses E. coli which do not harbor the heat-stable or heat-labile toxins of enterotoxigenic E. coli (ETEC) and harbor the genes aaiC, aggR, and/or aatA. In an effort to improve the definition of this pathotype, we report the most definitive characterization of the pan-genome of EAEC to date, applying comparative genomics and functional characterization on a collection of 97 EAEC strains isolated in the course of a multicenter case-control diarrhea study (Global Enteric Multi-Center Study, GEMS). Genomic analysis revealed that the EAEC strains mapped to all phylogenomic groups of E. coli. Circa 70% of strains harbored one of the five described AAF variants; there were no additional AAF variants identified, and strains that lacked an identifiable AAF generally did not have an otherwise complete AggR regulon. An exception was strains that harbored an ETEC colonization factor (CF) CS22, like AAF a member of the chaperone-usher family of adhesins, but not phylogenetically related to the AAF family. Of all genes scored, sepA yielded the strongest association with diarrhea (P = 0.002) followed by the increased serum survival gene, iss (p = 0.026), and the outer membrane protease gene ompT (p = 0.046). Notably, the EAEC genomes harbored several genes characteristically associated with other E. coli pathotypes. Our data suggest that a molecular definition of EAEC could comprise E. coli strains harboring AggR and a complete AAF(I-V) or CS22 gene cluster. Further, it is possible that strains meeting this definition could be both enteric bacteria and urinary/systemic pathogens.

RevDate: 2020-09-07

Bonnici V, Maresi E, R Giugno (2020)

Challenges in gene-oriented approaches for pangenome content discovery.

Briefings in bioinformatics pii:5901976 [Epub ahead of print].

Given a group of genomes, represented as the sets of genes that belong to them, the discovery of the pangenomic content is based on searching for homology among the genes in order to cluster them into families. Pangenomic analyses then investigate the membership of the families in the given genomes. This approach is referred to as the gene-oriented approach, in contrast to other definitions of the problem that take into account different genomic features. In the past years, several tools have been developed to discover and analyse pangenomic contents. Because of the hardness of the problem, each tool applies a different strategy for discovering the pangenomic content. As a result, the performance of each tool differs depending on the composition of the input genomes. This review reports the main analysis instruments provided by the current state-of-the-art tools for the discovery of pangenomic contents. Moreover, unlike previous works, the presented study compares pangenomic tools from a methodological perspective, analysing the causes that lead a given methodology to outperform other tools. The analysis is performed by taking into account different bacterial populations, which are synthetically generated by changing evolutionary parameters. The benchmarks used to compare the pangenomic tools, in addition to the computational pipeline developed for this purpose, are available at https://github.com/InfOmics/pangenes-review. Contact: V. Bonnici, R. Giugno. Supplementary information: Supplementary data are available at Briefings in Bioinformatics online.
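As an illustration of the gene-oriented view described in this abstract, the following minimal Python sketch maps each gene family to the set of genomes in which it occurs. The input layout, the function name `family_membership`, and the toy identifiers are assumptions made for this example; they are not taken from the review or from any of the tools it benchmarks, and the homology clustering itself is assumed to have been done already.

```python
from collections import defaultdict


def family_membership(genomes, gene_to_family):
    """Return {family id: set of genomes containing at least one member gene}.

    genomes        -- dict: genome name -> iterable of gene ids (hypothetical layout)
    gene_to_family -- dict: gene id -> family id, i.e. the result of some
                      homology clustering step that is not shown here
    """
    membership = defaultdict(set)
    for genome, genes in genomes.items():
        for gene in genes:
            membership[gene_to_family[gene]].add(genome)
    return dict(membership)


# Toy input: three genomes and four gene families (all names are made up).
genomes = {
    "g1": ["a1", "b1", "c1"],
    "g2": ["a2", "b2"],
    "g3": ["a3", "d3"],
}
gene_to_family = {
    "a1": "A", "a2": "A", "a3": "A",
    "b1": "B", "b2": "B",
    "c1": "C",
    "d3": "D",
}

membership = family_membership(genomes, gene_to_family)
```

Here family "A" is present in all three genomes, "B" in two, and "C" and "D" in one each; this presence/absence table is the object that gene-oriented pangenome tools analyse.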

RevDate: 2020-09-03

Zhu Z, Wang L, Qian H, et al (2020)

Comparative genome analysis of 12 Shigella sonnei strains: virulence, resistance, and their interactions.

International microbiology : the official journal of the Spanish Society for Microbiology pii:10.1007/s10123-020-00145-x [Epub ahead of print].

Shigellosis is a highly infectious disease that is mainly transmitted via the fecal-oral route by bacteria of the genus Shigella. Four species have been identified in the genus, among which Shigella flexneri used to be the most prevalent species globally and is commonly isolated in developing countries. However, it is being replaced by Shigella sonnei, which is currently the main causative agent of the dysentery pandemic in many emerging industrialized countries in regions such as Asia and the Middle East. For a better understanding of S. sonnei virulence and antibiotic resistance, we sequenced 12 clinical S. sonnei strains with varied antibiotic-resistance profiles collected from four cities in Jiangsu Province, China. Phylogenomic analysis clustered antibiotic-sensitive and resistant S. sonnei into two distinct groups, while pan-genome analysis revealed the presence and absence of unique genes in each group. Screening of 31 classes of virulence factors found that the type 2 secretion system is doubled in resistant strains. Further principal component analysis based on the interactions between virulence and resistance indicated that abundant virulence factors are associated with higher levels of antibiotic resistance. The results presented here are based on statistical analysis of a small sample and serve mainly as guidance for further experimental and theoretical studies.

RevDate: 2020-09-03

Muñoz-Ramirez ZY, Pascoe B, Mendez-Tenorio A, et al (2020)

A 500-year tale of co-evolution, adaptation, and virulence: Helicobacter pylori in the Americas.

The ISME journal pii:10.1038/s41396-020-00758-0 [Epub ahead of print].

Helicobacter pylori is a common component of the human stomach microbiota, possibly dating back to the speciation of Homo sapiens. A history of pathogen evolution in allopatry has led to the development of genetically distinct H. pylori subpopulations, associated with different human populations, and more recent admixture among H. pylori subpopulations can provide information about human migrations. However, little is known about the degree to which some H. pylori genes are conserved in the face of admixture, potentially indicating host adaptation, or how virulence genes spread among different populations. We analyzed H. pylori genomes from 14 countries in the Americas, strains from the Iberian Peninsula, and public genomes from Europe, Africa, and Asia, to investigate how admixture varies across different regions and gene families. Whole-genome analyses of 723 H. pylori strains from around the world showed evidence of frequent admixture in the American strains with a complex mosaic of contributions from H. pylori populations originating in the Americas as well as other continents. Despite the complex admixture, distinctive genomic fingerprints were identified for each region, revealing novel American H. pylori subpopulations. A pan-genome Fst analysis showed that variation in virulence genes had the strongest fixation in America, compared with non-American populations, and that much of the variation constituted non-synonymous substitutions in functional domains. Network analyses suggest that these virulence genes have followed unique evolutionary paths in the American populations, spreading into different genetic backgrounds, potentially contributing to the high risk of gastric cancer in the region.
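The fixation index (Fst) mentioned in this abstract can be illustrated with a short worked example. The sketch below uses the classical heterozygosity-based definition, Fst = (HT - HS) / HT, for a single biallelic variant in two equally weighted populations. This is an assumption chosen for illustration: the abstract does not state which Fst estimator the authors used, and the function name `fst_two_pops` is hypothetical.

```python
def fst_two_pops(p1, p2):
    """Wright-style Fst for one biallelic variant in two populations.

    p1, p2 -- frequency of the same allele in population 1 and population 2.
    Both populations are weighted equally (a simplifying assumption).
    """
    p_mean = (p1 + p2) / 2
    h_total = 2 * p_mean * (1 - p_mean)                  # expected heterozygosity, pooled
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population value
    if h_total == 0:
        return 0.0  # variant is monomorphic overall, so there is no differentiation
    return (h_total - h_within) / h_total


# An allele at 90% in one population and 10% in the other is strongly differentiated.
high = fst_two_pops(0.9, 0.1)   # approximately 0.64
none = fst_two_pops(0.5, 0.5)   # 0.0, identical frequencies
```

Values near 0 indicate that the populations share similar allele frequencies; values approaching 1 indicate that different alleles are close to fixation in each population.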

RevDate: 2020-09-03

Carroll LM, Huisman JS, M Wiedmann (2020)

Twentieth-century emergence of antimicrobial resistant human- and bovine-associated Salmonella enterica serotype Typhimurium lineages in New York State.

Scientific reports, 10(1):14428 pii:10.1038/s41598-020-71344-9.

Salmonella enterica serotype Typhimurium (S. Typhimurium) boasts a broad host range and can be transmitted between livestock and humans. While members of this serotype can acquire resistance to antimicrobials, the temporal dynamics of this acquisition is not well understood. Using New York State (NYS) and its dairy cattle farms as a model system, 87 S. Typhimurium strains isolated from 1999 to 2016 from either human clinical or bovine-associated sources in NYS were characterized using whole-genome sequencing. More than 91% of isolates were classified into one of four major lineages, two of which were largely susceptible to antimicrobials but showed sporadic antimicrobial resistance (AMR) gene acquisition, and two that were largely multidrug-resistant (MDR). All four lineages clustered by presence and absence of elements in the pan-genome. The two MDR lineages, one of which resembled S. Typhimurium DT104, were predicted to have emerged circa 1960 and 1972. The two largely susceptible lineages emerged earlier, but showcased sporadic AMR determinant acquisition largely after 1960, including acquisition of cephalosporin resistance-conferring genes after 1985. These results confine the majority of AMR acquisition events in NYS S. Typhimurium to the twentieth century, largely within the era of antibiotic usage.

RevDate: 2020-09-03

Bellas CM, Schroeder DC, Edwards A, et al (2020)

Flexible genes establish widespread bacteriophage pan-genomes in cryoconite hole ecosystems.

Nature communications, 11(1):4403 pii:10.1038/s41467-020-18236-8.

Bacteriophage genomes rapidly evolve via mutation and horizontal gene transfer to counter evolving bacterial host defenses; such arms race dynamics should lead to divergence between phages from similar, geographically isolated ecosystems. However, near-identical phage genomes can reoccur over large geographical distances and several years apart, conversely suggesting many are stably maintained. Here, we show that phages with near-identical core genomes in distant, discrete aquatic ecosystems maintain diversity by possession of numerous flexible gene modules, where homologous genes present in the pan-genome interchange to create new phage variants. By repeatedly reconstructing the core and flexible regions of phage genomes from different metagenomes, we show a pool of homologous gene variants co-exist for each module in each location, however, the dominant variant shuffles independently in each module. These results suggest that in a natural community, recombination is the largest contributor to phage diversity, allowing a variety of host recognition receptors and genes to counter bacterial defenses to co-exist for each phage.

RevDate: 2020-08-27

Alam I, Kamau AA, Kulmanov M, et al (2020)

Functional Pangenome Analysis Shows Key Features of E Protein Are Preserved in SARS and SARS-CoV-2.

Frontiers in cellular and infection microbiology, 10:405.

The spread of the novel coronavirus (SARS-CoV-2) has triggered a global emergency, that demands urgent solutions for detection and therapy to prevent escalating health, social, and economic impacts. The spike protein (S) of this virus enables binding to the human receptor ACE2, and hence presents a prime target for vaccines preventing viral entry into host cells. The S proteins from SARS and SARS-CoV-2 are similar, but structural differences in the receptor binding domain (RBD) preclude the use of SARS-specific neutralizing antibodies to inhibit SARS-CoV-2. Here we used comparative pangenomic analysis of all sequenced reference Betacoronaviruses, complemented with functional and structural analyses. This analysis reveals that, among all core gene clusters present in these viruses, the envelope protein E shows a variant cluster shared by SARS and SARS-CoV-2 with two completely-conserved key functional features, namely an ion-channel, and a PDZ-binding motif (PBM). These features play a key role in the activation of the inflammasome causing the acute respiratory distress syndrome, the leading cause of death in SARS and SARS-CoV-2 infections. Together with functional pangenomic analysis, mutation tracking, and previous evidence, on E protein as a determinant of pathogenicity in SARS, we suggest E protein as an alternative therapeutic target to be considered for further studies to reduce complications of SARS-CoV-2 infections in COVID-19.

RevDate: 2020-08-27

Kumar R, Bröms JE, A Sjöstedt (2020)

Exploring the Diversity Within the Genus Francisella - An Integrated Pan-Genome and Genome-Mining Approach.

Frontiers in microbiology, 11:1928.

Pan-genome analysis is a powerful method to explore genomic heterogeneity and diversity of bacterial species. Here we present a pan-genome analysis of the genus Francisella, comprising a dataset of 63 genomes and encompassing clinical as well as environmental isolates from distinct geographic locations. To determine the evolutionary relationship within the genus, we performed phylogenetic whole-genome studies utilizing the average nucleotide identity, average amino acid identity, core genes and non-recombinant loci markers. Based on the analyses, the phylogenetic trees obtained identified two distinct clades, A and B and a diverse cluster designated C. The sizes of the pan-, core-, cloud-, and shell-genomes of Francisella were estimated and compared to those of two other facultative intracellular pathogens, Legionella and Piscirickettsia. Francisella had the smallest core-genome, 692 genes, compared to 886 and 1,732 genes for Legionella and Piscirickettsia respectively, while the pan-genome of Legionella was more than twice the size of that of the other two genera. Also, the composition of the Francisella Type VI secretion system (T6SS) was analyzed. Distinct differences in the gene content of the T6SS were identified. In silico approaches performed to identify putative substrates of these systems revealed potential effectors targeting the cell wall, inner membrane, cellular nucleic acids as well as proteins, thus constituting attractive targets for site-directed mutagenesis. The comparative analysis performed here provides a comprehensive basis for the assessment of the phylogenomic relationship of members of the genus Francisella and for the identification of putative T6SS virulence traits.

RevDate: 2020-08-27

Bannantine JP, Conde C, Bayles DO, et al (2020)

Genetic Diversity Among Mycobacterium avium Subspecies Revealed by Analysis of Complete Genome Sequences.

Frontiers in microbiology, 11:1701.

Mycobacterium avium comprises four subspecies that contain both human and veterinary pathogens. At the inception of this study, twenty-eight M. avium genomes had been annotated as RefSeq genomes, facilitating direct comparisons. These genomes represent strains from around the world and provided a unique opportunity to examine genome dynamics in this species. Each genome was confirmed to be classified correctly based on SNP genotyping, nucleotide identity and presence/absence of repetitive elements or other typing methods. The Mycobacterium avium subspecies paratuberculosis (Map) genome size and organization was remarkably consistent, averaging 4.8 Mb with a variance of only 29.6 kb among the 13 strains. Comparing recombination events along with the larger genome size and variance observed among Mycobacterium avium subspecies avium (Maa) and Mycobacterium avium subspecies hominissuis (Mah) strains (collectively termed non-Map) suggests horizontal gene transfer occurs in non-Map, but not in Map strains. Overall, M. avium subspecies could be divided into two major sub-divisions, with the Map type II (bovine strains) clustering tightly on one end of a phylogenetic spectrum and Mah strains clustering more loosely together on the other end. The most evolutionarily distinct Map strain was an ovine strain, designated Telford, which had >1,000 SNPs and showed large rearrangements compared to the bovine type II strains. The Telford strain clustered with Maa strains as an intermediate between Map type II and Mah. SNP analysis and genome organization analyses repeatedly demonstrated the conserved nature of Map versus the mosaic nature of non-Map M. avium strains. Finally, core and pangenomes were developed for Map and non-Map strains. A total of 80% Map genes belonged to the Map core genome, while only 40% of non-Map genes belonged to the non-Map core genome. These genomes provide a more complete and detailed comparison of these subspecies strains as well as a blueprint for how genetic diversity originated.

RevDate: 2020-08-27

Costa SS, Guimarães LC, Silva A, et al (2020)

First Steps in the Analysis of Prokaryotic Pan-Genomes.

Bioinformatics and biology insights, 14:1177932220938064.

A pan-genome is defined as the set of orthologous and unique genes of a specific group of organisms. The pan-genome is composed of the core genome, the accessory genome, and species- or strain-specific genes. The pan-genome is considered open or closed based on the alpha value of Heaps' law. In an open pan-genome, the number of gene families will continuously increase with the addition of new genomes to the analysis, while in a closed pan-genome, the number of gene families will not increase considerably. The first step of a pan-genome analysis is the homogenization of genome annotation: the same software, such as GeneMark or RAST, should be used to annotate all genomes. Subsequently, software such as BPGA, GET_HOMOLOGUES, or PGAP is used to calculate the pan-genome. This review presents all these initial steps for those who want to perform a pan-genome analysis, explaining key concepts of the area. Furthermore, we present the pan-genomic analysis of 9 bacterial species. These are the species with the highest number of genomes deposited in GenBank. We also show the influence of the identity and coverage parameters on the prediction of orthologous and paralogous genes. Finally, we discuss the perspectives of several research areas where pan-genome analysis can be used to answer important questions.
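The open/closed distinction in this abstract can be made concrete with a small numerical sketch. The code below assumes the commonly used power-law form in which the number of new genes found when the N-th genome is added follows n(N) = kappa * N^(-alpha), with the pan-genome read as open when alpha <= 1 and closed when alpha > 1. That parameterisation, the function names, and the synthetic data are assumptions for illustration; tools such as BPGA may fit a differently parameterised curve, so this is not a reproduction of any specific pipeline.

```python
import math


def fit_power_law(genome_counts, new_genes):
    """Least-squares fit of log(n) = log(kappa) - alpha * log(N).

    genome_counts -- N values (number of genomes sampled so far)
    new_genes     -- average number of new genes found at each N
    Returns (kappa, alpha).
    """
    xs = [math.log(n) for n in genome_counts]
    ys = [math.log(g) for g in new_genes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    kappa = math.exp(mean_y - slope * mean_x)
    return kappa, -slope


def is_open(alpha):
    """Assumed convention: alpha <= 1 means an open pan-genome."""
    return alpha <= 1.0


# Synthetic data generated from kappa = 500, alpha = 0.6 (not real measurements).
counts = list(range(2, 11))
observed = [500 * n ** -0.6 for n in counts]

kappa, alpha = fit_power_law(counts, observed)
```

With this synthetic input the fit recovers alpha of about 0.6, which the assumed convention classifies as open: each added genome keeps contributing new gene families at a slowly decaying rate.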

RevDate: 2020-08-20

Zhou L, Zhang T, Tang S, et al (2020)

Pan-genome analysis of Paenibacillus polymyxa strains reveals the mechanism of plant growth promotion and biocontrol.

Antonie van Leeuwenhoek pii:10.1007/s10482-020-01461-y [Epub ahead of print].

Rapid development of gene sequencing technologies has led to an exponential increase in microbial sequencing data. Genome research on a single organism does not capture the variation in genetic information within a species. Pan-genome analysis gives us a broader perspective from which to study the complete genetic information of a species. Paenibacillus polymyxa is a Gram-positive bacterium and an important plant growth-promoting rhizobacterium with the ability to produce multiple antibiotics, such as fusaricidin, lantibiotic, paenilan, and polymyxin. Our study explores the pan-genome of 14 representative P. polymyxa strains isolated from around the world. A Heaps' law model and curve fitting confirmed an open pan-genome of P. polymyxa. The phylogenetic and collinearity analyses showed that the evolutionary classification of P. polymyxa strains is not associated with geographical area or ecological niche. A few genes related to phytohormone synthesis and phosphate solubilization were conserved; however, the nif gene cluster associated with nitrogen fixation exists only in some strains. This finding indicates that nitrogen-fixing ability is not stable in P. polymyxa. Analysis of antibiotic gene clusters in P. polymyxa revealed the presence of these genes in both core and accessory genomes. This observation indicates that differences in living environment led to loss of the ability to synthesize antibiotics in some strains. The current pan-genomic analysis of P. polymyxa will help us understand the mechanisms of biological control and plant growth promotion. It will also promote the use of P. polymyxa in agriculture.

RevDate: 2020-08-17

Ouyabe M, Tanaka N, Shiwa Y, et al (2020)

Rhizobium dioscoreae sp. nov., a plant growth-promoting bacterium isolated from yam (Dioscorea species).

International journal of systematic and evolutionary microbiology [Epub ahead of print].

This study investigated endophytic nitrogen-fixing bacteria isolated from two species of yam (water yam, Dioscorea alata L.; lesser yam, Dioscorea esculenta L.) grown in nutrient-poor alkaline soil conditions on Miyako Island, Okinawa, Japan. Two bacterial strains of the genus Rhizobium, S-93T and S-62, were isolated. The phylogenetic tree, based on the almost-complete 16S rRNA gene sequences (1476 bp for each strain), placed them in a distinct clade, with Rhizobium miluonense CCBAU 41251T, Rhizobium hainanense I66T, Rhizobium multihospitium HAMBI 2975T, Rhizobium freirei PRF 81T and Rhizobium tropici CIAT 899T being their closest species. Their bacterial fatty acid profile, with major components of C19 : 0 cyclo ω8c and summed feature 8, as well as other phenotypic characteristics and DNA G+C content (59.65 mol%) indicated that the novel strains belong to the genus Rhizobium. Pairwise average nucleotide identity analyses separated the novel strains from their most closely related species with similarity values of 90.5, 88.9, 88.5, 84.5 and 84.4 % for R. multihospitium HAMBI 2975T, R. tropici CIAT 899T, R. hainanense CCBAU 57015T, R. miluonense HAMBI 2971T and R. freirei PRF 81T, respectively; digital DNA-DNA hybridization values were in the range of 26-42 %. Considering the phenotypic characteristics as well as the genomic data, it is suggested that strains S-93T and S-62 represent a new species, for which the name Rhizobium dioscoreae is proposed. The type strain is S-93T (=NRIC 0988T=NBRC 114257T=DSM 110498T).

RevDate: 2020-08-13

Clawson ML, Schuller G, Dickey AM, et al (2020)

Differences between predicted outer membrane proteins of genotype 1 and 2 Mannheimia haemolytica.

BMC microbiology, 20(1):250 pii:10.1186/s12866-020-01932-2.

BACKGROUND: Mannheimia haemolytica strains isolated from North American cattle have been classified into two genotypes (1 and 2). Although members of both genotypes have been isolated from the upper and lower respiratory tracts of cattle with or without bovine respiratory disease (BRD), genotype 2 strains are much more frequently isolated from diseased lungs than genotype 1 strains. The mechanisms behind the increased association of genotype 2 M. haemolytica with BRD are not fully understood. To address that, and to search for interventions against genotype 2 M. haemolytica, complete, closed chromosome assemblies for 35 genotype 1 and 34 genotype 2 strains were generated and compared. Searches were conducted for the pan genome, core genes shared between the genotypes, and for genes specific to either genotype. Additionally, genes encoding outer membrane proteins (OMPs) specific to genotype 2 M. haemolytica were identified, and the diversity of their protein isoforms was characterized with predominantly unassembled, short-read genomic sequences for up to 1075 additional strains.

RESULTS: The pan genome of the 69 sequenced M. haemolytica strains consisted of 3111 genes, of which 1880 comprised a shared core between the genotypes. A core of 112 and 179 genes or gene variants were specific to genotype 1 and 2, respectively. Seven genes encoding predicted OMPs; a peptidase S6, a ligand-gated channel, an autotransporter outer membrane beta-barrel domain-containing protein (AOMB-BD-CP), a porin, and three different trimeric autotransporter adhesins were specific to genotype 2 as their genotype 1 homologs were either pseudogenes, or not detected. The AOMB-BD-CP gene, however, appeared to be truncated across all examined genotype 2 strains and to likely encode dysfunctional protein. Homologous gene sequences from additional M. haemolytica strains confirmed the specificity of the remaining six genotype 2 OMP genes and revealed they encoded low isoform diversity at the population level.

CONCLUSION: Genotype 2 M. haemolytica possess genes encoding conserved OMPs not found intact in more commensally prone genotype 1 strains. Some of the genotype 2 specific genes identified in this study are likely to have important biological roles in the pathogenicity of genotype 2 M. haemolytica, which is the primary bacterial cause of BRD.

RevDate: 2020-08-12

Xu S, Cheng J, Meng X, et al (2020)

Complete Genome and Comparative Genome Analysis of Lactobacillus reuteri YSJL-12, a Potential Probiotics Strain Isolated From Healthy Sow Fresh Feces.

Evolutionary bioinformatics online, 16:1176934320942192 pii:10.1177_1176934320942192.

Lactobacillus reuteri YSJL-12 was isolated from the fresh feces of healthy sows and has previously been used as a probiotic additive. To investigate the genetic basis of its probiotic potential and identify the genes in the strain, the complete genome of YSJL-12 was sequenced. A comparative genome analysis of 9 strains of Lactobacillus reuteri was then performed. The genome of YSJL-12 consisted of a circular 2,084,748 bp chromosome and 2 circular plasmids (51,906 and 15,134 bp). Among the 2065 protein-coding sequences (CDSs), genes conferring resistance to environmental stress were identified. The functions of COG (Clusters of Orthologous Groups) protein genes were predicted, and the KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways were analyzed. The comparative genome analysis indicated that the pan-genome contained a core genome of 1257 orthologous gene clusters, an accessory genome of 1064 orthologous gene clusters, and 1148 strain-specific genes, and that the antibacterial mechanisms of Lactobacillus reuteri strains might differ. The phylogenetic analysis and genomic collinearity revealed that the phylogenetic relationship among the 9 strains of Lactobacillus reuteri was connected with host species and showed host specificity. This research could help us to better predict gene function and understand the genetic basis of adaptation to the host gut in Lactobacillus reuteri YSJL-12.
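The three-way split reported in this abstract (core, accessory, strain-specific) follows from counting how many genomes carry each gene cluster. A minimal Python sketch under that assumption is shown below; the function name `partition_pangenome`, the input layout, and the toy clusters are hypothetical, and real pipelines often relax the core definition (for example, presence in 95% or 99% of genomes) rather than requiring presence in every genome.

```python
def partition_pangenome(membership, n_genomes):
    """Split gene clusters into core, accessory, and strain-specific sets.

    membership -- dict: cluster id -> set of genomes carrying the cluster
    n_genomes  -- total number of genomes compared
    Strict definitions are assumed: core = present in all genomes,
    strain-specific = present in exactly one, accessory = everything else.
    """
    core, accessory, specific = set(), set(), set()
    for cluster, present_in in membership.items():
        count = len(present_in)
        if count == n_genomes:
            core.add(cluster)
        elif count == 1:
            specific.add(cluster)
        else:
            accessory.add(cluster)
    return core, accessory, specific


# Toy presence/absence data for three genomes (not data from the study).
toy_membership = {
    "A": {"g1", "g2", "g3"},
    "B": {"g1", "g2"},
    "C": {"g1"},
    "D": {"g3"},
}

core, accessory, specific = partition_pangenome(toy_membership, n_genomes=3)
pan_size = len(core) + len(accessory) + len(specific)
```

The three sets are disjoint and their sizes sum to the pan-genome size, which is the same bookkeeping behind the 1257 + 1064 + 1148 breakdown quoted above.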

RevDate: 2020-08-11

Bernardes JS, Eberle RJ, Vieira FRJ, et al (2020)

A comparative pan-genomic analysis of 53 C. pseudotuberculosis strains based on functional domains.

Journal of biomolecular structure & dynamics [Epub ahead of print].

Corynebacterium pseudotuberculosis is a pathogenic bacterium of great veterinary and economic importance. It is classified into two biovars: ovis, which is nitrate-negative and causes lymphadenitis in small ruminants, and equi, which is nitrate-positive and causes ulcerative lymphangitis in equines. With the explosive growth of available genomes of several strains, pan-genome analysis has opened new opportunities for understanding the dynamics and evolution of C. pseudotuberculosis. However, few pan-genomic studies have compared biovars equi and ovis. Such studies have considered a reduced number of strains and compared entire genomes. Here we conducted an original pan-genome analysis based on protein sequences and their functional domains. We considered 53 C. pseudotuberculosis strains from both biovars isolated from different hosts and countries. We analysed conserved domains, common domains more frequently found in each biovar, and biovar-specific (unique) domains. Our results demonstrated that biovar equi is more variable; there is a significant difference in the number of proteins per strain, probably indicating the occurrence of more gene loss/gain events. Moreover, strains of biovar equi presented a higher number of biovar-specific domains, 77 against only eight in biovar ovis, most of which are associated with virulence mechanisms. With this domain analysis, we have identified functional differences among strains of biovars ovis and equi that could be related to niche adaptation and will probably help to better understand the mechanisms of virulence and pathogenesis. The distribution patterns of functional domains identified in this work might have impacts on bacterial physiology and lifestyle, encouraging the development of new diagnoses, vaccines, and treatments for C. pseudotuberculosis diseases. Communicated by Ramaswamy H. Sarma.

RevDate: 2020-08-08

Pan Y, Awan F, Zhenbao M, et al (2020)

Preliminary view of the global distribution and spread of the tet(X) family of tigecycline resistance genes.

The Journal of antimicrobial chemotherapy pii:5885053 [Epub ahead of print].

BACKGROUND: The emergence of plasmid-mediated tet(X3)/tet(X4) genes is threatening the role of tigecycline as a last-resort antibiotic to treat clinical infections caused by XDR bacteria. Considering the possible public health threat posed by tet(X) and its variants [which we collectively call 'tet(X) genes' in this study], global monitoring and surveillance are urgently required.

OBJECTIVES: Here we conducted a worldwide survey of the global distribution and spread of tet(X) genes.

METHODS: We analysed a comprehensive dataset of bacterial genomes in conjunction with surveillance data from our laboratory and the NCBI database, as well as sufficient metadata to characterize the results.

RESULTS: The global distribution features of tet(X) genes were revealed. We clustered three types of genetic backbones of tet(X) genes embedded or transferred in bacterial genomes. Our pan-genome analyses revealed a large genetic pool composed of tet(X)-carrying sequences. Moreover, phylogenetic trees of tet(X) genes and tet(X)-like proteins were built.

CONCLUSIONS: To the best of our knowledge, our results provide the first view of the global distribution of tet(X) genes, demonstrate the features of tet(X)-carrying fragments and highlight the possible evolution of tigecycline-inactivation enzymes in diverse bacterial species and habitats.

RevDate: 2020-08-08

Santos DDS, Calaça PRA, Porto ALF, et al (2020)

What Differentiates Probiotic from Pathogenic Bacteria? The Genetic Mobility of Enterococcus faecium Offers New Molecular Insights.

Omics : a journal of integrative biology [Epub ahead of print].

Enterococcus faecium is a lactic acid bacterium with applications in food engineering and nutrigenomics, including as starter cultures in fermented foods. To differentiate the E. faecium probiotic from pathogenic bacteria, physiological analyses are often used but they do not guarantee that a bacterial strain is not pathogenic. We report here new findings and an approach based on comparison of the genetic mobility of (1) probiotic, (2) pathogenic, and (3) nonpathogenic and non-probiotic strains, so as to differentiate probiotics, and inform their safe use. The region of the 16S ribosomal DNA (rDNA) genes of different E. faecium strains native to Pernambuco-Brazil was used with the GenBank query sequence. Complete genomes were selected and divided into three groups as noted above to identify the mobile genetic elements (MGEs) (transposase, integrase, conjugative transposon protein and phage) and antibiotic resistance genes (ARGs), and to undertake pan-genome analysis and multiple genome alignment. Differences in the number of MGEs were found in ARGs, in the presence and absence of the genes that differentiate E. faecium probiotics and pathogenic bacteria genetically. Our data suggest that genetic mobility appears to be informative in differentiating between probiotic and pathogenic strains. While the present findings are not necessarily applicable to all probiotics, they offer novel molecular insights to guide future research in nutrigenomics, clinical medicine, and food engineering on new ways to differentiate pathogenic from probiotic bacteria.

RevDate: 2020-08-07

Son S, Oh JD, Lee SH, et al (2020)

Comparative genomics of canine Lactobacillus reuteri reveals adaptation to a shared environment with humans.

Genes & genomics pii:10.1007/s13258-020-00978-w [Epub ahead of print].

BACKGROUND: Lactobacillus reuteri is a gram-positive, non-motile bacterial species that has been used as a representative microorganism model to describe the ecology and evolution of vertebrate gut symbionts.

OBJECTIVE: Because the genetic features and evolutionary strategies of L. reuteri from the gastrointestinal tract of canines remain unknown, we set out to construct a draft genome of canine L. reuteri and to investigate modified, acquired, or lost genetic features that have facilitated the evolution and adaptation of strains to specific environmental niches.

METHODS: To examine canine L. reuteri, we sequenced an L. reuteri strain isolated from a dog in Korea. A comparative genomic approach was used to assess genetic diversity and gain insight into the distinguishing features related to different hosts based on 27 published genomic sequences.

RESULTS: The pan-genome of 28 L. reuteri strains contained 7,369 gene families, and the core genome contained 1070 gene families. The ANI tree based on the core genes in the canine L. reuteri strain (C1) was very close to those for three strains (IRT, DSM20016, JCM1112) from humans. Evolutionarily, these four strains formed one clade, which we regarded as C1-clade in this study. We could investigate a total of 32,050 amino acid substitutions among the 28 L. reuteri strain genomes. In this comparison, 283 amino acid substitutions were specific to strain C1 and four strains in C1-clade shared most of these 283 C1-strain specific amino acid substitutions, suggesting strongly similar selective pressure. In accessory genes, we could identify 127 C1-clade host-specific genes and found that several genes were closely related to replication, recombination, and repair.

CONCLUSION: This study provides new insights into the adaptation of L. reuteri to the canine intestinal habitat, and suggests that the genome of L. reuteri from canines is closely associated with their living and shared environment with humans.

RevDate: 2020-08-07

Botelho J, Grosso F, L Peixe (2020)

ICEs Are the Main Reservoirs of the Ciprofloxacin-Modifying crpP Gene in Pseudomonas aeruginosa.

Genes, 11(8): pii:genes11080889.

The ciprofloxacin-modifying crpP gene was recently identified in a plasmid isolated from a Pseudomonas aeruginosa clinical isolate. Homologues of this gene were also identified in Escherichia coli, Klebsiella pneumoniae and Acinetobacter baumannii. We set out to explore the mobile elements involved in the acquisition and spread of this gene in publicly available and complete genomes of Pseudomonas spp. All Pseudomonas complete genomes were downloaded from NCBI's Refseq library and were inspected for the presence of the crpP gene. The mobile elements carrying this gene were further characterized. The crpP gene was identified only in P. aeruginosa, in more than half of the complete chromosomes (61.9%, n = 133/215) belonging to 52 sequence types, of which the high-risk clone ST111 was the most frequent. We identified 136 crpP-harboring integrative and conjugative elements (ICEs), with 93.4% belonging to the mating-pair formation G (MPFG) family. The ICEs were integrated at the end of a tRNALys gene and were all flanked by highly conserved 45-bp direct repeats. The crpP-carrying ICEs contain 26 core genes (2.2% of all 1193 genes found in all the ICEs together), which are present in 99% or more of the crpP-harboring ICEs. The most frequently encoded traits on these ICEs include replication, transcription, intracellular trafficking and cell motility. Our work suggests that ICEs are the main vectors promoting the dissemination of the ciprofloxacin-modifying crpP gene in P. aeruginosa.

RevDate: 2020-08-05

Petit RA, TD Read (2020)

Bactopia: a Flexible Pipeline for Complete Analysis of Bacterial Genomes.

mSystems, 5(4): pii:5/4/e00190-20.

Sequencing of bacterial genomes using Illumina technology has become such a standard procedure that often data are generated faster than can be conveniently analyzed. We created a new series of pipelines called Bactopia, built using Nextflow workflow software, to provide efficient comparative genomic analyses for bacterial species or genera. Bactopia consists of a data set setup step (Bactopia Data Sets [BaDs]), which creates a series of customizable data sets for the species of interest, the Bactopia Analysis Pipeline (BaAP), which performs quality control, genome assembly, and several other functions based on the available data sets and outputs the processed data to a structured directory format, and a series of Bactopia Tools (BaTs) that perform specific postprocessing on some or all of the processed data. BaTs include pan-genome analysis, computing average nucleotide identity between samples, extracting and profiling the 16S genes, and taxonomic classification using highly conserved genes. It is expected that the number of BaTs will increase to fill specific applications in the future. As a demonstration, we performed an analysis of 1,664 public Lactobacillus genomes, focusing on Lactobacillus crispatus, a species that is a common part of the human vaginal microbiome. Bactopia is an open source system that can scale from projects as small as one bacterial genome to ones including thousands of genomes and that allows for great flexibility in choosing comparison data sets and options for downstream analysis. Bactopia code can be accessed at https://www.github.com/bactopia/bactopia.

IMPORTANCE: It is now relatively easy to obtain a high-quality draft genome sequence of a bacterium, but bioinformatic analysis requires organization and optimization of multiple open source software tools. We present Bactopia, a pipeline for bacterial genome analysis, as an option for processing bacterial genome data. Bactopia also automates downloading of data from multiple public sources and species-specific customization. Because the pipeline is written in the Nextflow language, analyses can be scaled from individual genomes on a local computer to thousands of genomes using cloud resources. As a usage example, we processed 1,664 Lactobacillus genomes from public sources and used comparative analysis workflows (Bactopia Tools) to identify and analyze members of the L. crispatus species.

RevDate: 2020-08-03

Tao Y, Jordan DR, ES Mace (2020)

A graph-based pan-genome guides biological discovery.

Molecular plant pii:S1674-2052(20)30256-2 [Epub ahead of print].

RevDate: 2020-08-03

Correia K, R Mahadevan (2020)

Pan-Genome-Scale Network Reconstruction: Harnessing Phylogenomics Increases the Quantity and Quality of Metabolic Models.

Biotechnology journal [Epub ahead of print].

BACKGROUND: A genome-scale network reconstruction (GENRE) represents a compendium of knowledge for an organism and is used in a variety of applications. Current practices limit the quantity and quality of GENREs. First, falling genome sequencing costs over the last decade have led to an exponentially growing number of genome sequences, but the number of curated GENREs has not kept pace; this gap hinders our ability to study physiology throughout the tree of life. Second, the central metabolisms of existing yeast GENREs contain significant commission and omission errors; these inaccuracies limit the validity of metabolic simulations.

METHODS AND RESULTS: We outline an open and transparent framework to increase the quantity and quality of GENREs with phylogenomics. In this framework, research communities curate the pan-genome, pan-reactome, pan-metabolome, and pan-phenome for a group of organisms in a taxon, rather than for a single strain. We demonstrate our approach with 33 yeasts and fungi spanning 600 million years of evolution in the Dikarya subkingdom. We created a pan-fungal metabolic network called FYRMENT (Fungal and Yeast Metabolic Network) (https://github.com/LMSE/FYRMENT), and annotated reactions with ortholog groups from AYbRAH (https://github.com/LMSE/AYbRAH). We created metabolic models for every taxonomic level from subkingdom to strain using FYRMENT and AYbRAH. The fungal pan-GENRE contains 1553 orthologs, 2759 reactions, 2251 metabolites, and ten compartments. The strain-level GENREs have higher genomic and metabolic coverage than existing yeast and fungal GENREs created with other methods. Metabolic simulations show that the maximum amino acid yields from glucose differ between yeast lineages, indicating that metabolic networks have evolved in yeasts.

CONCLUSIONS: Curating ortholog and reaction databases at higher taxonomic levels increases the quantity and quality of strain GENREs relative to the common practice of using model organism GENREs as templates. This pan-GENRE framework provides the ability to scale high-quality GENREs to more branches in the tree of life. This article is protected by copyright. All rights reserved.

RevDate: 2020-08-03

Parlikar A, Kalia K, Sinha S, et al (2020)

Understanding genomic diversity, pan-genome, and evolution of SARS-CoV-2.

PeerJ, 8:e9576 pii:9576.

Coronavirus disease 2019 (COVID-19) infection, which originated from Wuhan, China, has seized the whole world in its grasp and created a huge pandemic situation before humanity. Since December 2019, genomes of numerous isolates have been sequenced and analyzed for testing confirmation, epidemiology, and evolutionary studies. In the first half of this article, we provide a detailed review of the history and origin of COVID-19, followed by the taxonomy, nomenclature and genome organization of its causative agent Severe Acute Respiratory Syndrome-related Coronavirus-2 (SARS-CoV-2). In the latter half, we analyze subgenus Sarbecovirus (167 SARS-CoV-2, 312 SARS-CoV, and 5 Pangolin CoV) genomes to understand their diversity, origin, and evolution, along with pan-genome analysis of genus Betacoronavirus members. Whole-genome sequence-based phylogeny of subgenus Sarbecovirus genomes reasserted the fact that SARS-CoV-2 strains evolved from their common ancestors putatively residing in bat or pangolin hosts. We predicted a few country-specific patterns of relatedness and identified mutational hotspots with high, medium and low probability based on genome alignment of 167 SARS-CoV-2 strains. A total of 100-nucleotide segment-based homology studies revealed that the majority of the SARS-CoV-2 genome segments are close to Bat CoV, followed by some to Pangolin CoV, and some are unique ones. Open pan-genome of genus Betacoronavirus members indicates the diversity contributed by the novel viruses emerging in this group. Overall, the exploration of the diversity of these isolates, mutational hotspots and pan-genome will shed light on the evolution and pathogenicity of SARS-CoV-2 and help in developing putative methods of diagnosis and treatment.

RevDate: 2020-07-31

Söderlund R, Formenti N, Caló S, et al (2020)

Comparative genome analysis of Erysipelothrix rhusiopathiae isolated from domestic pigs and wild boars suggests host adaptation and selective pressure from the use of antibiotics.

Microbial genomics [Epub ahead of print].

The disease erysipelas caused by Erysipelothrix rhusiopathiae (ER) is a major concern in pig production. In the present study the genomes of ER from pigs (n=87), wild boars (n=71) and other sources (n=85) were compared in terms of whole-genome SNP variation, accessory genome content and the presence of genetic antibiotic resistance determinants. The aim was to investigate if genetic features among ER were associated with isolate origin in order to better estimate the risk of transmission of porcine-adapted strains from wild boars to free-range pigs and to increase our understanding of the evolution of ER. Pigs and wild boars carried isolates representing all ER clades, but clade one only occurred in healthy wild boars and healthy pigs. Several accessory genes or gene variants were found to be significantly associated with the pig and wild boar hosts, with genes predicted to encode cell wall-associated or extracellular proteins overrepresented. Gene variants associated with serovar determination and capsule production in serovars known to be pathogenic for pigs were found to be significantly associated with pigs as hosts. In total, 30 % of investigated pig isolates but only 6 % of wild boar isolates carried resistance genes, most commonly tetM (tetracycline) and lsa(E) together with lnu(B) (lincosamides, pleuromutilin and streptogramin A). The incidence of variably present genes including resistance determinants was weakly linked to phylogeny, indicating that host adaptation in ER has evolved multiple times in diverse lineages mediated by recombination and the acquisition of mobile genetic elements. The presented results support the occurrence of host-adapted ER strains, but they do not indicate frequent transmission between wild boars and domestic pigs. This article contains data hosted by Microreact.

RevDate: 2020-07-30

Derakhshani H, Bernier SP, Marko VA, et al (2020)

Completion of draft bacterial genomes by long-read sequencing of synthetic genomic pools.

BMC genomics, 21(1):519 pii:10.1186/s12864-020-06910-6.

BACKGROUND: Illumina technology currently dominates bacterial genomics due to its high read accuracy and low sequencing cost. However, the incompleteness of draft genomes generated by Illumina reads limits their application in comprehensive genomics analyses. Alternatively, hybrid assembly using both Illumina short reads and long reads generated by single molecule sequencing technologies can enable assembly of complete bacterial genomes, yet the high per-genome cost of long-read sequencing limits the widespread use of this approach in bacterial genomics. Here we developed a protocol for hybrid assembly of complete bacterial genomes using miniaturized multiplexed Illumina sequencing and non-barcoded PacBio sequencing of a synthetic genomic pool (SGP), thus significantly decreasing the overall per-genome cost of sequencing.

RESULTS: We evaluated the performance of SGP hybrid assembly on the genomes of 20 bacterial isolates with different genome sizes, a wide range of GC contents, and varying levels of phylogenetic relatedness. By improving the contiguity of Illumina assemblies, SGP hybrid assembly generated 17 complete and 3 nearly complete bacterial genomes. Increased contiguity of SGP hybrid assemblies resulted in considerable improvement in gene prediction and annotation. In addition, SGP hybrid assembly was able to resolve repeat elements and identify intragenomic heterogeneities, e.g. different copies of 16S rRNA genes, that would otherwise go undetected by short-read-only assembly. Comprehensive comparison of SGP hybrid assemblies with those generated using multiplexed PacBio long reads (long-read-only assembly) also revealed the relative advantage of SGP hybrid assembly in terms of assembly quality. In particular, we observed that SGP hybrid assemblies were completely devoid of both small (i.e. single base substitutions) and large assembly errors. Finally, we show the ability of SGP hybrid assembly to differentiate genomes of closely related bacterial isolates, suggesting its potential application in comparative genomics and pangenome analysis.

CONCLUSION: Our results indicate the superiority of SGP hybrid assembly over both short-read and long-read assemblies with respect to completeness, contiguity, accuracy, and recovery of small replicons. By lowering the per-genome cost of sequencing, our parallel sequencing and hybrid assembly pipeline could serve as a cost effective and high throughput approach for completing high-quality bacterial genomes.

RevDate: 2020-07-28

Haberer G, Kamal N, Bauer E, et al (2020)

European maize genomes highlight intraspecies variation in repeat and gene content.

Nature genetics pii:10.1038/s41588-020-0671-9 [Epub ahead of print].

The diversity of maize (Zea mays) is the backbone of modern heterotic patterns and hybrid breeding. Historically, US farmers exploited this variability to establish today's highly productive Corn Belt inbred lines from blends of dent and flint germplasm pools. Here, we report de novo genome sequences of four European flint lines assembled to pseudomolecules with scaffold N50 ranging from 6.1 to 10.4 Mb. Comparative analyses with two US Corn Belt lines explain the pronounced differences between the two germplasms. While overall syntenic order and consolidated gene annotations reveal only moderate pangenomic differences, whole-genome alignments delineating the core and dispensable genome, and the analysis of heterochromatic knobs and orthologous long terminal repeat retrotransposons unveil the dynamics of the maize genome. The high-quality genome sequences of the flint pool complement the maize pangenome and provide an important tool to study maize improvement at a genome scale and to enhance modern hybrid breeding.

RevDate: 2020-07-28

Muqaddasi QH, Brassac J, Ebmeyer E, et al (2020)

Prospects of GWAS and predictive breeding for European winter wheat's grain protein content, grain starch content, and grain hardness.

Scientific reports, 10(1):12541 pii:10.1038/s41598-020-69381-5.

Grain quality traits determine the classification of registered wheat (Triticum aestivum L.) varieties. Although environmental factors and crop management practices exert a considerable influence on wheat quality traits, a significant proportion of the variance is attributed to the genetic factors. To identify the underlying genetic factors of wheat quality parameters viz., grain protein content (GPC), grain starch content (GSC), and grain hardness (GH), we evaluated 372 diverse European wheat varieties in replicated field trials in up to eight environments. We observed that all of the investigated traits hold a wide and significant genetic variation, and a significant negative correlation exists between GPC and GSC plus grain yield. Our association analyses based on 26,694 high-quality single nucleotide polymorphic markers revealed a strong quantitative genetic nature of GPC and GSC with associations on groups 2, 3, and 6 chromosomes. The identification of known Puroindoline-b gene for GH provided a positive analytic proof for our studies. We report that a locus QGpc.ipk-6A controls both GPC and GSC with opposite allelic effects. Based on wheat's reference and pan-genome sequences, the physical characterization of two loci viz., QGpc.ipk-2B and QGpc.ipk-6A facilitated the identification of the candidate genes for GPC. Furthermore, by exploiting additive and epistatic interactions of loci, we evaluated the prospects of predictive breeding for the investigated traits that suggested its efficient use in the breeding programs.

RevDate: 2020-07-28

Flament-Simon SC, de Toro M, Chuprikova L, et al (2020)

High diversity and variability of pipolins among a wide range of pathogenic Escherichia coli strains.

Scientific reports, 10(1):12452 pii:10.1038/s41598-020-69356-6.

Self-synthesizing transposons are integrative mobile genetic elements (MGEs) that encode their own B-family DNA polymerase (PolB). Discovered a few years ago, they are proposed as key players in the evolution of several groups of DNA viruses and virus-host interaction machinery. Pipolins are the most recent addition to the group, are integrated in the genomes of bacteria from diverse phyla and also present as circular plasmids in mitochondria. Remarkably, pipolin-encoded PolBs are proficient DNA polymerases endowed with DNA priming capacity, hence the name, primer-independent PolB (piPolB). We have now surveyed the presence of pipolins in a collection of 2,238 human and animal pathogenic Escherichia coli strains and found that, although detected in only 25 positive isolates (1.1%), they are present in E. coli strains from a wide variety of pathotypes, serotypes, phylogenetic groups and sequence types. Overall, the pangenome of strains carrying pipolins is highly diverse, despite the fact that a considerable number of strains belong to only three clonal complexes (CC10, CC23 and CC32). Comparative analysis with a set of 67 additional pipolin-harboring genomes from the GenBank database, spanning strains of diverse origins, further confirmed these results. The genetic structure of pipolins shows great flexibility and variability, with the piPolB gene and the attachment sites being the only common features. Most pipolins contain one or more recombinases that would be involved in excision/integration of the element in the same conserved tRNA gene. This mobilization mechanism might explain the apparent incompatibility of pipolins with other integrative MGEs such as integrons. In addition, analysis of cophylogeny between pipolins and pipolin-harboring strains showed a lack of congruence between several pipolins and their host strains, in agreement with horizontal transfer between hosts. Overall, these results indicate that pipolins can serve as a vehicle for genetic transfer among circulating E. coli and possibly also among other pathogenic bacteria.

RevDate: 2020-07-28

Crysnanto D, H Pausch (2020)

Bovine breed-specific augmented reference graphs facilitate accurate sequence read mapping and unbiased variant discovery.

Genome biology, 21(1):184 pii:10.1186/s13059-020-02105-0.

BACKGROUND: The current bovine genomic reference sequence was assembled from a Hereford cow. The resulting linear assembly lacks diversity because it does not contain allelic variation, a drawback of linear references that causes reference allele bias. High nucleotide diversity and the separation of individuals by hundreds of breeds make cattle ideally suited to investigate the optimal composition of variation-aware references.

RESULTS: We augment the bovine linear reference sequence (ARS-UCD1.2) with variants filtered for allele frequency in dairy (Brown Swiss, Holstein) and dual-purpose (Fleckvieh, Original Braunvieh) cattle breeds to construct either breed-specific or pan-genome reference graphs using the vg toolkit. We find that read mapping is more accurate to variation-aware than linear references if pre-selected variants are used to construct the genome graphs. Graphs that contain random variants do not improve read mapping over the linear reference sequence. Breed-specific augmented and pan-genome graphs enable almost similar mapping accuracy improvements over the linear reference. We construct a whole-genome graph that contains the Hereford-based reference sequence and 14 million alleles that have alternate allele frequency greater than 0.03 in the Brown Swiss cattle breed. Our novel variation-aware reference facilitates accurate read mapping and unbiased sequence variant genotyping for SNPs and Indels.

CONCLUSIONS: We develop the first variation-aware reference graph for an agricultural animal (https://doi.org/10.5281/zenodo.3759712). Our novel reference structure improves sequence read mapping and variant genotyping over the linear reference. Our work is a first step towards the transition from linear to variation-aware reference structures in species with high genetic diversity and many sub-populations.

RevDate: 2020-07-28

Yin Z, Liu J, Du B, et al (2020)

Whole-Genome-Based Survey for Polyphyletic Serovars of Salmonella enterica subsp. enterica Provides New Insights into Public Health Surveillance.

International journal of molecular sciences, 21(15): pii:ijms21155226.

Serotyping has traditionally been considered the basis for surveillance of Salmonella, but it cannot distinguish distinct lineages sharing the same serovar that vary in host range, pathogenicity and epidemiology. However, polyphyletic serovars have not been extensively investigated. Public health microbiology is currently being transformed by whole-genome sequencing (WGS) data, which promote the lineage determination using a more powerful and accurate technique than serotyping. The focus in this study is to survey and analyze putative polyphyletic serovars. The multi-locus sequence typing (MLST) phylogenetic analysis identified four putative polyphyletic serovars, namely, Montevideo, Bareilly, Saintpaul, and Muenchen. Whole-genome-based phylogeny and population structure highlighted the polyphyletic nature of Bareilly and Saintpaul and the multi-lineage nature of Montevideo and Muenchen. The population of these serovars was defined by extensive genetic diversity, the open pan genome and the small core genome. Source niche metadata revealed putative existence of lineage-specific niche adaptation (host-preference and environmental-preference), exhibited by lineage-specific genomic contents associated with metabolism and transport. Meanwhile, differences in genetic profiles relating to virulence and antimicrobial resistance within each lineage may contribute to pathogenicity and epidemiology. The results also showed that recombination events occurring at the H1-antigen loci may be an important reason for polyphyly. The results presented here provide the genomic basis of simple, rapid, and accurate identification of phylogenetic lineages of these serovars, which could have important implications for public health.

RevDate: 2020-07-27

Fang H, Xu JB, Nie Y, et al (2020)

Pan-genomic analysis reveals that the evolution of Dietzia species depends on their living habitats.

Environmental microbiology [Epub ahead of print].

The bacterial genus Dietzia is widely distributed in various environments. The genomes of 26 diverse strains of Dietzia, including almost all the type strains, were analyzed in this study. This analysis revealed a lipid metabolism gene richness, which could explain the ability of Dietzia to live in oil related environments. The pan-genome consists of 83,976 genes assigned into 10,327 gene families, 792 of which are shared by all the genomes of Dietzia. Mathematical extrapolation of the data suggests that the Dietzia pan-genome is open. Both gene duplication and gene loss contributed to the open pan-genome, while horizontal gene transfer was limited. Dietzia strains primarily gained their diverse metabolic capacity through more ancient gene duplications. Phylogenetic analysis of Dietzia isolated from aquatic and terrestrial environments showed two distinct clades from the same ancestor. The genome sizes of Dietzia strains from aquatic environments were significantly larger than those from terrestrial environments, which was mainly due to the occurrence of more gene loss events during the evolutionary progress of the strains from terrestrial environments. The evolutionary history of Dietzia was tightly coupled to environmental conditions, and iron concentrations should be one of the key factors shaping the genomes of the Dietzia lineages. This article is protected by copyright. All rights reserved.
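The "mathematical extrapolation" mentioned in the abstract above is often done by fitting a power law to pan-genome size as genomes are added. The following is a minimal sketch of that general idea, not necessarily the method used in this study: it assumes the model P(N) = kappa * N**gamma and fits it by least squares on log-transformed values. The function name and the synthetic data are hypothetical and chosen only for illustration.

```python
import math
from typing import Sequence, Tuple


def fit_power_law(n_genomes: Sequence[int],
                  pan_sizes: Sequence[float]) -> Tuple[float, float]:
    """Fit pan_size ~ kappa * N**gamma by least squares in log-log space.

    Returns (kappa, gamma). Hypothetical helper for illustration only.
    """
    xs = [math.log(n) for n in n_genomes]
    ys = [math.log(p) for p in pan_sizes]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    # Ordinary least-squares slope and intercept on the log-transformed data.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    gamma = sxy / sxx
    kappa = math.exp(mean_y - gamma * mean_x)
    return kappa, gamma


# Synthetic data (not from the study), generated with kappa=2000, gamma=0.5.
counts = [1, 4, 9, 16, 25]
sizes = [2000 * n ** 0.5 for n in counts]
kappa, gamma = fit_power_law(counts, sizes)
```

Under this assumed model, a fitted exponent clearly above zero means the pan-genome keeps growing as genomes are added (open), while an exponent near zero suggests it is levelling off (closed). Real analyses usually average over many random genome orderings before fitting.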

RevDate: 2020-07-27

Moreno-Pérez A, Pintado A, Murillo J, et al (2020)

Host Range Determinants of Pseudomonas savastanoi Pathovars of Woody Hosts Revealed by Comparative Genomics and Cross-Pathogenicity Tests.

Frontiers in plant science, 11:973.

The study of host range determinants within the Pseudomonas syringae complex is gaining renewed attention due to its widespread distribution in non-agricultural environments, evidence of large variability in intra-pathovar host range, and the emergence of new epidemic diseases. This requires the establishment of appropriate model pathosystems facilitating integration of phenotypic, genomic and evolutionary data. Pseudomonas savastanoi pv. savastanoi is a model pathogen of the olive tree, and here we report a closed genome of strain NCPPB 3335, plus draft genome sequences of three strains isolated from oleander (pv. nerii), ash (pv. fraxini) and broom plants (pv. retacarpa). We then conducted a comparative genomic analysis of these four new genomes plus 16 publicly available genomes, representing 20 strains of these four P. savastanoi pathovars of woody hosts. Despite overlapping host ranges, cross-pathogenicity tests using four plant hosts clearly separated these pathovars and led to pathovar reassignment of two strains. Critically, these functional assays were pivotal to reconcile phylogeny with host range and to define pathovar-specific gene repertoires. We report a pan-genome of 7,953 ortholog gene families and a total of 45 type III secretion system effector genes, including 24 core genes, four genes exclusive to pv. retacarpa and several genes encoding pathovar-specific truncations. Noticeably, the four pathovars corresponded with well-defined genetic lineages, with core genome phylogeny and hierarchical clustering of effector genes closely correlating with pathogenic specialization. Knot-inducing pathovars encode genes absent in the canker-inducing pv. fraxini, such as those related to indole acetic acid, cytokinins, rhizobitoxine, and a bacteriophytochrome. Other pathovar-exclusive genes encode type I, type II, type IV, and type VI secretion system proteins, the phytotoxin phevamine A, a siderophore, c-di-GMP-related proteins, methyl chemotaxis proteins, and a broad collection of transcriptional regulators and transporters of eight different superfamilies. Our combination of pathogenicity analyses and genomics tools allowed us to correctly assign strains to pathovars and to propose a repertoire of host range-related genes in the P. syringae complex.

RevDate: 2020-07-24

Kc R, Leong KWC, Harkness NM, et al (2020)

Whole-genome analyses reveal gene content differences between nontypeable Haemophilus influenzae isolates from chronic obstructive pulmonary disease compared to other clinical phenotypes.

Microbial genomics [Epub ahead of print].

Nontypeable Haemophilus influenzae (NTHi) colonizes human upper respiratory airways and plays a key role in the course and pathogenesis of acute exacerbations of chronic obstructive pulmonary disease (COPD). Currently, it is not possible to distinguish COPD isolates of NTHi from other clinical isolates of NTHi using conventional genotyping methods. Here, we analysed the core and accessory genome of 568 NTHi isolates, including 40 newly sequenced isolates, to look for genetic distinctions between NTHi isolates from COPD with respect to other illnesses, including otitis media, meningitis and pneumonia. Phylogenies based on polymorphic sites in the core-genome did not show discrimination between NTHi strains collected from different clinical phenotypes. However, pan-genome-wide association studies identified 79 unique NTHi accessory genes that were significantly associated with COPD. Furthermore, many of the COPD-related NTHi genes have known or predicted roles in virulence, transmembrane transport of metal ions and nutrients, cellular respiration and maintenance of redox homeostasis. This indicates that specific genes may be required by NTHi for its survival or virulence in the COPD lung. These results advance our understanding of the pathogenesis of NTHi infection in COPD lungs.

RevDate: 2020-07-23

Tonkin-Hill G, MacAlasdair N, Ruis C, et al (2020)

Producing polished prokaryotic pangenomes with the Panaroo pipeline.

Genome biology, 21(1):180 pii:10.1186/s13059-020-02090-4.

Population-level comparisons of prokaryotic genomes must take into account the substantial differences in gene content resulting from horizontal gene transfer, gene duplication and gene loss. However, the automated annotation of prokaryotic genomes is imperfect, and errors due to fragmented assemblies, contamination, diverse gene families and mis-assemblies accumulate over the population, leading to profound consequences when analysing the set of all genes found in a species. Here, we introduce Panaroo, a graph-based pangenome clustering tool that is able to account for many of the sources of error introduced during the annotation of prokaryotic genome assemblies. Panaroo is available at https://github.com/gtonkinhill/panaroo .

RevDate: 2020-07-21

Bayer PE, Golicz AA, Scheben A, et al (2020)

Plant pan-genomes are the new reference.

Nature plants pii:10.1038/s41477-020-0733-0 [Epub ahead of print].

Recent years have seen a surge in plant genome sequencing projects and the comparison of multiple related individuals. The high degree of genomic variation observed led to the realization that single reference genomes do not represent the diversity within a species, and led to the expansion of the pan-genome concept. Pan-genomes represent the genomic diversity of a species and include core genes, found in all individuals, as well as variable genes, which are absent in some individuals. Variable gene annotations often show similarities across plant species, with genes for biotic and abiotic stress commonly enriched within variable gene groups. Here we review the growth of pan-genomics in plants, explore the origins of gene presence and absence variation, and show how pan-genomes can support plant breeding and evolution studies.

RevDate: 2020-07-20

Yang LL, Jiang Z, Li Y, et al (2020)

Plasmids related to the symbiotic nitrogen fixation are not only cooperated functionally, but also may have evolved over a time span in family Rhizobiaceae.

Genome biology and evolution pii:5873871 [Epub ahead of print].

Rhizobia are soil bacteria capable of forming symbiotic nitrogen-fixing nodules associated with leguminous plants. In fast-growing legume-nodulating rhizobia, such as the species in the family Rhizobiaceae, the symbiotic plasmid is the main genetic basis for nitrogen-fixing symbiosis, and is susceptible to horizontal gene transfer. To further understand the evolution of symbiosis in Rhizobiaceae, we analysed the pan-genome of this family based on 92 genomes of type/reference strains and reconstructed its phylogeny using a phylogenomics approach. Intriguingly, although the genetic expansion that occurred in chromosomal regions was the main reason for the high proportion of low-frequency flexible gene families in the pan-genome, gene gain events associated with accessory plasmids introduced more genes into the genomes of nitrogen-fixing species. For symbiotic plasmids, although horizontal gene transfer frequently occurred, transfer may be impeded by factors such as the host's physical isolation and soil conditions, even among phylogenetically close species. During coevolution with leguminous hosts, the plasmid system, including accessory and symbiotic plasmids, may have evolved over a time span, and provided rhizobial species with the ability to adapt to various environmental conditions and helped them achieve nitrogen fixation. These findings provide new insights into the phylogeny of Rhizobiaceae and advance our understanding of the evolution of symbiotic nitrogen fixation.

RevDate: 2020-07-17

Coulton A, KJ Edwards (2020)

AutoCloner: automatic homologue-specific primer design for full-gene cloning in polyploids.

BMC bioinformatics, 21(1):311 pii:10.1186/s12859-020-03601-7.

BACKGROUND: Polyploid organisms such as wheat complicate even the simplest of procedures in molecular biology. Whilst knowledge of genomic sequences in crops is increasing rapidly, the scientific community is still a long way from producing a full pan-genome for every species. Polymerase chain reaction and Sanger sequencing therefore remain widely used as methods for characterizing gene sequences in many varieties of crops. High sequence similarity between genomes in polyploids means that if primers are not homeologue-specific via the incorporation of a SNP at the 3' tail, sequences other than the target sequence will also be amplified. Current consensus for gene cloning in wheat is to manually perform many steps in a long bioinformatics pipeline.

RESULTS: Here we present AutoCloner (www.autocloner.com), a fully automated pipeline for crop gene cloning that includes a free-to-use web interface for users. AutoCloner takes a sequence of interest from the user and performs a basic local alignment search tool (BLAST) search against the genome assembly for their particular polyploid crop. Homologous sequences are then compiled with the input sequence into a multiple sequence alignment which is mined for single-nucleotide polymorphisms (SNPs). Various combinations of potential primers that cover the entire gene of interest are then created and evaluated by Primer3; the set of primers with the highest score, as well as all possible primers at every SNP location, are then returned to the user for polymerase chain reaction (PCR). We have successfully used AutoCloner to clone various genes of interest in the Apogee wheat variety, which has no current genome sequence. In addition, we have successfully run the pipeline on ~ 80,000 high-confidence gene models from a wheat genome assembly.

CONCLUSION: AutoCloner is the first tool to fully automate primer design for gene cloning in polyploids, where previously the consensus within the wheat community was to perform this process manually. The web interface for AutoCloner provides a simple and effective polyploid primer-design method for gene cloning, with no need for researchers to download software or input any details other than their sequence of interest.
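
The homeologue-specific primer idea described in this abstract can be illustrated with a minimal Python sketch. This is not AutoCloner's code: the function names, the toy sequences and the 8-base primer length are assumptions made purely for illustration, and real primer design (delegated to Primer3 in the pipeline) also weighs melting temperature, GC content and secondary structure.

```python
def homeologue_specific_columns(target, homeologues):
    """Return alignment columns where the target base differs from every homeologue.

    All sequences are assumed to be rows of one gap-padded alignment of equal length.
    """
    columns = []
    for i, base in enumerate(target):
        if base == "-":
            continue  # a gap in the target cannot anchor a primer
        if all(h[i] != base and h[i] != "-" for h in homeologues):
            columns.append(i)
    return columns


def forward_primer_ending_at(target, column, length=8):
    """Take `length` target bases whose 3' end sits on the given column."""
    start = column - length + 1
    if start < 0:
        return None  # not enough upstream sequence
    primer = target[start:column + 1]
    return None if "-" in primer else primer


# Toy alignment (hypothetical sequences, not wheat data)
target = "ATGGCTAGCTAGGATCCA"
homeologues = ["ATGGCTAGCTTGGATCCA", "ATGGCAAGCTCGGATCCA"]

snp_columns = homeologue_specific_columns(target, homeologues)
primers = [forward_primer_ending_at(target, c) for c in snp_columns]
print(snp_columns, primers)
```

Only columns where the target differs from all homeologues are kept, because a 3' mismatch against just one homeologue would still allow amplification of the others.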

RevDate: 2020-07-17

Castro-Jaimes S, Bello-López E, Velázquez-Acosta C, et al (2020)

Chromosome Architecture and Gene Content of the Emergent Pathogen Acinetobacter haemolyticus.

Frontiers in microbiology, 11:926.

Acinetobacter haemolyticus is a Gammaproteobacterium that has been involved in serious diseases frequently linked to the nosocomial environment. Most of the strains causing such infections are sensitive to a wide variety of antibiotics, but recent reports indicate that this pathogen is very efficiently acquiring carbapenem-resistance determinants, such as the blaNDM-1 gene, all over the world. With this work we contribute a collection of 31 newly sequenced nosocomial A. haemolyticus isolates. Genome analysis of these sequences and others collected from RefSeq indicates that their chromosomes are organized in 12 syntenic blocks that contain most of the core genome genes. These blocks are separated by hypervariable regions that are rich in unique gene families, but also have signals of horizontal gene transfer. Genes involved in virulence or encoding different secretion systems are located inside syntenic regions and have recombination signals. The relative order of the syntenic blocks along the A. haemolyticus chromosome can change, indicating that they have been subject to several kinds of inversions. Genomes of this microorganism show large differences in gene content even if they are in the same clade. Here we also show that A. haemolyticus has an open pan-genome.

RevDate: 2020-07-16

Chandrasekar SS, Kingstad-Bakke BA, Wu CW, et al (2020)

A Novel Mucosal Adjuvant System for the Immunization Against Avian Coronavirus Causing Infectious Bronchitis.

Journal of virology pii:JVI.01016-20 [Epub ahead of print].

Infectious Bronchitis (IB) caused by Infectious Bronchitis Virus (IBV) is currently a major threat to chicken health with multiple outbreaks being reported in the US over the past decade. Modified live virus (MLV) vaccines used in the field can persist and provide the genetic material needed for recombination and emergence of novel IBV serotypes. Inactivated and subunit vaccines overcome some of the limitations of MLV with no risk of virulence reversion and emergence of new virulent serotypes. However, these vaccines are weakly immunogenic and poorly protective. There is an urgent need to develop more effective vaccines that can elicit a robust, long-lasting immune response. In this study, we evaluate a novel adjuvant system developed from Quil-A and chitosan (QAC) for the intranasal delivery of nucleic acid immunogens to improve protective efficacy. The QAC adjuvant system forms nanocarriers (<100 nm) that efficiently encapsulate nucleic acid cargo, exhibit sustained release of payload and can stably transfect cells. Encapsulation of plasmid DNA vaccine expressing IBV nucleocapsid (N) protein by the QAC adjuvant system (pQAC-N) enhanced immunogenicity as evidenced by robust induction of adaptive humoral and cellular immune responses post vaccination and challenge. Birds immunized with pQAC-N showed reduced clinical severity and viral shedding post challenge on par with protection observed with current commercial vaccines without the associated safety concerns. Presented results indicate that the QAC adjuvant system can offer a safer alternative to the use of live vaccines against avian and other emerging coronaviruses.

IMPORTANCE: According to the 2017 US agriculture statistics, the combined value of production and sales from broilers, eggs, turkeys, and chicks was $42.8 billion. Of this number, broiler sales comprised 67 percent of the industry value with the production of > 50 billion pounds of chicken meat.
The economic success of the poultry industry in the USA hinges on the extensive use of vaccines to control Infectious Bronchitis Virus (IBV) and other poultry pathogens. The majority of vaccines currently licensed for poultry health include both modified live vaccine and inactivated pathogens. Despite their proven efficacy, modified live vaccine constructs take time to produce and could potentially revert to virulence, which limits their safety. The significance of our research stems from the development of a safer and potent alternative mucosal vaccine to replace live vaccines against IBV and other emerging coronaviruses.

RevDate: 2020-07-13

Bohr LL, Mortimer TD, CS Pepperell (2020)

Lateral Gene Transfer Shapes Diversity of Gardnerella spp.

Frontiers in cellular and infection microbiology, 10:293.

Gardnerella spp. are pathognomonic for bacterial vaginosis, which increases the risk of preterm birth and the transmission of sexually transmitted infections. Gardnerella spp. are genetically diverse, comprising what have recently been defined as distinct species with differing functional capacities. Disease associations with Gardnerella spp. are not straightforward: patients with BV are usually infected with multiple species, and Gardnerella spp. are also found in the vaginal microbiome of healthy women. Genome comparisons of Gardnerella spp. show evidence of lateral gene transfer (LGT), but patterns of LGT have not been characterized in detail. Here we sought to define the role of LGT in shaping the genetic structure of Gardnerella spp. We analyzed whole genome sequencing data for 106 Gardnerella strains and used these data for pan genome analysis and to characterize LGT in the core and accessory genomes, over recent and remote timescales. In our diverse sample of Gardnerella strains, we found that both the core and accessory genomes are clearly differentiated in accordance with newly defined species designations. We identified putative competence and pilus assembly genes across most species; we also found them to be differentiated between species. Competence machinery has diverged in parallel with the core genome, with selection against deleterious mutations as a predominant influence on their evolution. By contrast, the virulence factor vaginolysin, which encodes a toxin, appears to be readily exchanged among species. We identified five distinct prophage clusters in Gardnerella genomes, two of which appear to be exchanged between Gardnerella species. Differences among species are apparent in their patterns of LGT, including their exchange with diverse gene pools. Despite frequent LGT and co-localization in the same niche, our results show that Gardnerella spp. are clearly genetically differentiated and yet capable of exchanging specific genetic material. 
This likely reflects complex interactions within bacterial communities associated with the vaginal microbiome. Our results provide insight into how such interactions evolve and are maintained, allowing these multi-species communities to colonize and invade human tissues and adapt to antibiotics and other stressors.

RevDate: 2020-07-13

Han M, Liu G, Chen Y, et al (2020)

Comparative Genomics Uncovers the Genetic Diversity and Characters of Veillonella atypica and Provides Insights Into Its Potential Applications.

Frontiers in microbiology, 11:1219.

Veillonella atypica is a bacterium that is present in the gut and the oral cavity of mammals and plays diverse roles in different niches. A recent study demonstrated that Veillonella is highly associated with marathon running and showed that V. atypica gavage improves treadmill run time in mice, revealing that V. atypica has a high biotechnological potential in improving athlete performance. However, a comprehensive analysis of the genetic diversity, functional traits, and genome editing methods of V. atypica remains elusive. In the present study, we conducted a systematic comparative analysis of the genetic datasets of nine V. atypica strains. The pan-genome of V. atypica consisted of 2,065 homologous clusters and exhibited an open pan-genome structure. A phylogenetic analysis of V. atypica with two different categories revealed that V. atypica OK5 was the most distant from the other eight V. atypica strains. A total of 43 orthologous genes were identified as CAZyme genes and grouped into 23 CAZyme families. The CAZyme components derived from accessory clusters contributed to the differences in the ability of the nine V. atypica strains to utilize carbohydrates. An integrated analysis of the metabolic pathways of V. atypica suggested that V. atypica strains harbored vancomycin resistance and were involved in several biosynthesis pathways of secondary metabolites. The V. atypica strains harbored four main Cas proteins, namely, CAS-Type IIIA, CAS-Type IIA, CAS-Type IIC, and CAS-Type IIID. This pilot study provides an in-depth understanding of and a fundamental knowledge about the biology of V. atypica that could help increase the biotechnological potential of this bacterium.

RevDate: 2020-07-12

Guo G, Du D, Yu Y, et al (2020)

Pan-genome analysis of Streptococcus suis serotype 2 revealed genomic diversity among strains of different virulence.

Transboundary and emerging diseases [Epub ahead of print].

Streptococcus suis (SS) is an emerging zoonotic pathogen that causes severe infections in swine and humans. Among the 33 known serotypes, serotype 2 is most frequently associated with infections in pigs and humans. To better understand the virulence of S. suis serotype 2 (SS2) and to discriminate between virulent and avirulent SS2 strains, characterization of the genomic features of strains with different virulence is required. The results showed that S. suis has an open pan-genome. The pan-genome shared by the 19 S. suis serotype 2 strains was composed of 1239 core genes and 2436 accessory genes. COG analysis indicated that core genes are involved in basic physiological functions, whereas accessory genes are related to tachytelic evolution. Comparative analysis between the core genomes of virulent strains and 9 avirulent strains suggested that the srtBCD pilus cluster was a significant difference between virulent and avirulent strains. Analysis between high virulent and group B low virulent strains showed 53 and 58 genes specific to each group, respectively. Moreover, genomes of avirulent strains tend to be larger than those of virulent strains, and avirulent strains tend to possess more prophage sequences than virulent strains. Our findings could contribute to a better understanding of the genomics of S. suis serotype 2.

RevDate: 2020-07-08

Lees JA, Mai TT, Galardini M, et al (2020)

Improved Prediction of Bacterial Genotype-Phenotype Associations Using Interpretable Pangenome-Spanning Regressions.

mBio, 11(4): pii:mBio.01344-20.

Discovery of genetic variants underlying bacterial phenotypes and the prediction of phenotypes such as antibiotic resistance are fundamental tasks in bacterial genomics. Genome-wide association study (GWAS) methods have been applied to study these relations, but the plastic nature of bacterial genomes and the clonal structure of bacterial populations creates challenges. We introduce an alignment-free method which finds sets of loci associated with bacterial phenotypes, quantifies the total effect of genetics on the phenotype, and allows accurate phenotype prediction, all within a single computationally scalable joint modeling framework. Genetic variants covering the entire pangenome are compactly represented by extended DNA sequence words known as unitigs, and model fitting is achieved using elastic net penalization, an extension of standard multiple regression. Using an extensive set of state-of-the-art bacterial population genomic data sets, we demonstrate that our approach performs accurate phenotype prediction, comparable to popular machine learning methods, while retaining both interpretability and computational efficiency. Compared to those of previous approaches, which test each genotype-phenotype association separately for each variant and apply a significance threshold, the variants selected by our joint modeling approach overlap substantially.

IMPORTANCE: Being able to identify the genetic variants responsible for specific bacterial phenotypes has been the goal of bacterial genetics since its inception and is fundamental to our current level of understanding of bacteria. This identification has been based primarily on painstaking experimentation, but the availability of large data sets of whole genomes with associated phenotype metadata promises to revolutionize this approach, not least for important clinical phenotypes that are not amenable to laboratory analysis.
These models of phenotype-genotype association can in the future be used for rapid prediction of clinically important phenotypes such as antibiotic resistance and virulence by rapid-turnaround or point-of-care tests. However, despite much effort being put into adapting genome-wide association study (GWAS) approaches to cope with bacterium-specific problems, such as strong population structure and horizontal gene exchange, current approaches are not yet optimal. We describe a method that advances methodology for both association and generation of portable prediction models.
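
As a rough illustration of the elastic net penalization mentioned in this abstract, the following Python sketch fits an elastic-net regression by cyclic coordinate descent on a tiny presence/absence matrix. It is not the authors' software: the function names, the penalty settings `lam` and `alpha`, and the invented data (three variants standing in for unitigs, eight isolates) are all assumptions chosen so the example stays small.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 part of the penalty)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net(X, y, lam=0.1, alpha=0.5, n_iter=100):
    """Fit elastic-net coefficients by cyclic coordinate descent.

    Minimizes (1/2n)*||y - Xb||^2 + lam*(alpha*|b|_1 + (1-alpha)/2*|b|_2^2)
    after centering y and every column of X (so no intercept is returned).
    """
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]
    y_mean = sum(y) / n
    yc = [v - y_mean for v in y]
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of variant j with the partial residual
            z = 0.0    # mean squared value of variant j
            for i in range(n):
                # Fitted value from every variant except j
                fitted_others = sum(Xc[i][k] * beta[k] for k in range(p) if k != j)
                rho += Xc[i][j] * (yc[i] - fitted_others)
                z += Xc[i][j] ** 2
            rho /= n
            z /= n
            denom = z + lam * (1 - alpha)
            beta[j] = soft_threshold(rho, lam * alpha) / denom if denom > 0 else 0.0
    return beta


# Invented data: rows are isolates, columns are variant presence (1) or absence (0)
X = [[1, 1, 0], [1, 0, 0], [1, 1, 1], [1, 0, 1],
     [0, 1, 0], [0, 0, 0], [0, 1, 1], [0, 0, 1]]
y = [1, 1, 1, 1, 0, 0, 0, 0]  # phenotype tracks the first variant only

coefficients = elastic_net(X, y)
print([round(b, 3) for b in coefficients])
```

In this toy case the L1 part of the penalty sets the two uninformative variants to exactly zero while the L2 part shrinks the informative coefficient, which is the joint selection behaviour the abstract contrasts with testing each variant separately.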

RevDate: 2020-07-07

Shahi N, SK Mallik (2020)

Emerging bacterial fish pathogen Lactococcus garvieae RTCLI04, isolated from rainbow trout (Oncorhynchus mykiss): Genomic features and comparative genomics.

Microbial pathogenesis pii:S0882-4010(20)30734-8 [Epub ahead of print].

Lactococcus garvieae is an emerging zoonotic bacterial pathogen that causes fatal hemorrhagic septicemia in cultured fish species, animals and humans worldwide. Here, we report the genomic features of the whole-genome sequence (WGS) of L. garvieae strain RTCLI04, recovered from the lower intestine of farmed rainbow trout, Oncorhynchus mykiss, in the northwest Himalayan region of India. The genome of L. garvieae RTCLI04 is a single circular chromosome of 2,054,885 base pairs (bp), which encodes 1,993 proteins and has a G + C content of 39%. The bioinformatics analysis of the WGS of RTCLI04 confirmed the presence of 51 tRNA genes (including two pseudogenes), six rRNA genes (four genes for 5S rRNA; one gene for 16S rRNA and one gene for 23S rRNA), five virulent domains, and twenty eight different genetic pathways. A Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) finder tool indicates that three different CRISPR and one cas system with common spacer were present in the genome of L. garvieae RTCLI04. Pan-genome analysis of RTCLI04 and all the other reference L. garvieae strains shows that the pan-genome of this bacterium consisted of 2,239 putative protein-coding genes, of which 1,850 are core genes, 389 are dispensable genes, and 221 are unique to RTCLI04. L. garvieae RTCLI04 lacks the genomic island of the 16.5 Kb capsule gene cluster. In addition, 39 virulence-associated genes (VAGs) including hly1,-2,-3; PavA, PsaA; eno; LPxTG containing surface proteins 1, 2, 3 and 4; pgm, sod and 29 antimicrobial resistant genes (ARGs) including mefE (clindamycin), srmB (lincomycin), dfrA26 (trimethoprim), gyrB (nalidixic acid), arr-3 (rifampin), otrB (tetracycline), aac(6)-Ic (tobramycin), IrgB (penicillin), mecA (oxacillin), vanRB (vancomycin) and mfpA (fluoroquinolone) were also predicted in the genome of L. garvieae RTCLI04.
Our study provides new insight into understanding the virulence mechanism, antimicrobial resistance, and development of effective therapeutic measures against L. garvieae during a disease outbreak in aquaculture.
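
The core/dispensable split reported in this abstract (1,850 core + 389 dispensable = 2,239 genes in the pan-genome) follows from simple set operations on per-strain gene content. The Python sketch below shows that bookkeeping on invented data. The strain names, gene-family IDs and function names are hypothetical, and the hard step of clustering genes into families is assumed to have been done already.

```python
def partition_pangenome(gene_sets):
    """Split gene families into pan, core and dispensable sets.

    gene_sets maps a strain name to the set of gene-family IDs it carries.
    """
    all_sets = list(gene_sets.values())
    pan = set().union(*all_sets)          # found in at least one strain
    core = set.intersection(*all_sets)    # found in every strain
    dispensable = pan - core              # found in some but not all strains
    return pan, core, dispensable


def strain_specific(gene_sets, strain):
    """Gene families found in `strain` and in no other strain."""
    others = set().union(*(g for s, g in gene_sets.items() if s != strain))
    return gene_sets[strain] - others


# Hypothetical presence data for three strains
gene_sets = {
    "strainA": {"g1", "g2", "g3", "g5"},
    "strainB": {"g1", "g2", "g4"},
    "strainC": {"g1", "g2", "g3"},
}

pan, core, dispensable = partition_pangenome(gene_sets)
unique_a = strain_specific(gene_sets, "strainA")
print(len(pan), sorted(core), sorted(dispensable), sorted(unique_a))
```

When more than one strain is compared, strain-specific genes are by definition a subset of the dispensable set, so they are not added on top of the core and dispensable counts.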

RevDate: 2020-07-07

Lyu J (2020)

Pan-genome upgrade.

Nature plants pii:10.1038/s41477-020-0731-2 [Epub ahead of print].

RevDate: 2020-07-07

Hurel J, Schbath S, Bougeard S, et al (2020)

DUGMO: tool for the detection of unknown genetically modified organisms with high-throughput sequencing data for pure bacterial samples.

BMC bioinformatics, 21(1):284 pii:10.1186/s12859-020-03611-5.

BACKGROUND: The European Community has adopted very restrictive policies regarding the dissemination and use of genetically modified organisms (GMOs). In fact, a maximum threshold of 0.9% of contaminating GMOs is tolerated for a "GMO-free" label. In recent years, imports of undescribed GMOs have been detected. Their sequences are not described and therefore not detectable by conventional approaches, such as PCR.

RESULTS: We developed DUGMO, a bioinformatics pipeline for the detection of genetically modified (GM) bacteria, including unknown GM bacteria, based on Illumina paired-end sequencing data. The method is currently focused on the detection of GM bacteria with - possibly partial - transgenes in pure bacterial samples. In the preliminary steps, coding sequences (CDSs) are aligned through two successive BLASTN against the host pangenome with relevant tuned parameters to discriminate CDSs belonging to the wild type genome (wgCDS) from potential GM coding sequences (pgmCDSs). Then, Bray-Curtis distances are calculated between the wgCDS and each pgmCDS, based on the difference of genomic vocabulary. Finally, two machine learning methods, namely the Random Forest and Generalized Linear Model, are carried out to target true GM CDS(s), based on six variables including Bray-Curtis distances and GC content. Tests carried out on a GM Bacillus subtilis showed 25 positive CDSs corresponding to the chloramphenicol resistance gene and CDSs of the inserted plasmids. On a wild type B. subtilis, no false positive sequences were detected.

CONCLUSION: DUGMO detects exogenous CDS, truncated, fused or highly mutated wild CDSs in high-throughput sequencing data, and was shown to be efficient at detecting GM sequences, but it might also be employed for the identification of recent horizontal gene transfers.
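
The Bray-Curtis step in this abstract compares sequences by their "genomic vocabulary". A minimal Python sketch of that kind of comparison follows, under the assumption that the vocabulary can be approximated by overlapping k-mer counts. The function names, the choice of k and the toy sequences are illustrative only and are not taken from DUGMO.

```python
from collections import Counter


def kmer_counts(seq, k=3):
    """Count overlapping k-mers in a DNA sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def bray_curtis(counts_a, counts_b):
    """Bray-Curtis dissimilarity: sum|a - b| / sum(a + b), in [0, 1]."""
    keys = set(counts_a) | set(counts_b)
    diff = sum(abs(counts_a[key] - counts_b[key]) for key in keys)
    total = sum(counts_a[key] + counts_b[key] for key in keys)
    return diff / total if total else 0.0


# Toy sequences (hypothetical): identical, partly shared and disjoint vocabularies
d_same = bray_curtis(kmer_counts("AAAA", 2), kmer_counts("AAAA", 2))
d_partial = bray_curtis(kmer_counts("AAAC", 2), kmer_counts("AAAA", 2))
d_disjoint = bray_curtis(kmer_counts("AAAA", 2), kmer_counts("CCCC", 2))
print(d_same, d_partial, d_disjoint)
```

A value near 0 means the two sequences use the same k-mers at similar frequencies, while a value near 1 means they share almost none, which is the kind of signal a downstream classifier could use to flag candidate foreign coding sequences.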

RevDate: 2020-07-03

Kaushal G, SP Singh (2020)

Comparative genome analysis provides shreds of molecular evidence for reclassification of Leuconostoc mesenteroides MTCC10508 as a strain of Leu. suionicum.

Genomics pii:S0888-7543(20)30015-X [Epub ahead of print].

This study presents the whole-genome comparative analysis of a Leuconostoc sp. strain, previously documented as Leu. mesenteroides MTCC 10508. The ANI, dDDH, dot plot, and MAUVE analyses suggested its reclassification as a strain of Leu. suionicum. Functional annotation identified a total of 1971 genes, out of which, 265 genes were mapped to CAZymes, evincing its carbohydrate transforming capability. The genome comparison with 59 Leu. mesenteroides, and Leu. suionicum strains generated the core and pan-genome profiles, divulging the unique genes in Leuconostoc sp. MTCC 10508. For the first time, this study reports the genes encoding alpha-xylosidase and copper oxidase in a strain of Leu. suionicum. The genetic information for any possible allergenic molecule could not be detected in the genome, advocating the safety of the strain. The present investigation provides the genomic evidence for reclassification of the Leuconostoc sp. strain and also promulgates the molecular insights into its metabolic potential.

RevDate: 2020-07-03

Duru IC, Andreevskaya M, Laine P, et al (2020)

Genomic characterization of the most barotolerant Listeria monocytogenes RO15 strain compared to reference strains used to evaluate food high pressure processing.

BMC genomics, 21(1):455 pii:10.1186/s12864-020-06819-0.

BACKGROUND: High pressure processing (HPP; i.e. 100-600 MPa pressure depending on product) is a non-thermal preservation technique adopted by the food industry to significantly reduce foodborne pathogens, including Listeria monocytogenes, in food. However, susceptibility towards pressure differs among diverse strains of L. monocytogenes and it is unclear if this is due to their intrinsic characteristics related to genomic content. Here, we tested the barotolerance of 10 different L. monocytogenes strains, from food and food processing environments and widely used reference strains including a clinical isolate, to pressure treatments with 400 and 600 MPa. Genome sequencing and genome comparison of the tested L. monocytogenes strains were performed to investigate the relation between genomic profile and pressure tolerance.

RESULTS: None of the tested strains were tolerant to 600 MPa. A reduction of more than 5 log10 was observed for all strains after a 1 min 600 MPa pressure treatment. L. monocytogenes strain RO15 showed no significant reduction in viable cell counts after 400 MPa for 1 min and was therefore defined as barotolerant. Genome analysis of the so far unsequenced L. monocytogenes strains RO15, 2HF33, MB5, AB199, AB120, C7, and RO4 allowed us to compare the gene content of all strains tested. This revealed that the three most pressure tolerant strains had more than one CRISPR system with self-targeting spacers. Furthermore, several anti-CRISPR genes were detected in these strains. Pan-genome analysis showed that 10 prophage genes were significantly associated with the three most barotolerant strains.

CONCLUSIONS: L. monocytogenes strain RO15 was the most pressure tolerant among the selected strains. Genome comparison suggests that there might be a relationship between prophages and pressure tolerance in L. monocytogenes.

RevDate: 2020-07-02

Steinbrenner AD (2020)

The evolving landscape of cell surface pattern recognition across plant immune networks.

Current opinion in plant biology, 56:135-146 pii:S1369-5266(20)30053-4 [Epub ahead of print].

To recognize diverse threats, plants monitor extracellular molecular patterns and transduce intracellular immune signaling through receptor complexes at the plasma membrane. Pattern recognition occurs through a prototypical network of interacting proteins, comprising A) receptors that recognize inputs associated with a growing number of pest and pathogen classes (bacteria, fungi, oomycetes, caterpillars), B) co-receptor kinases that participate in binding and signaling, and C) cytoplasmic kinases that mediate first stages of immune output. While this framework has been elucidated in reference accessions of model organisms, network components are part of gene families with widespread variation, potentially tuning immunocompetence for specific contexts. Most dramatically, variation in receptor repertoires determines the range of ligands acting as immunogenic inputs for a given plant. Diversification of receptor kinase (RK) and related receptor-like protein (RLP) repertoires may tune responses even within a species. Comparative genomics at pangenome scale will reveal patterns and features of immune network variation.

RevDate: 2020-07-02

Chen Z, Kuang D, Xu X, et al (2020)

Genomic analyses of multidrug-resistant Salmonella Indiana, Typhimurium, and Enteritidis isolates using MinION and MiSeq sequencing technologies.

PloS one, 15(7):e0235641 pii:PONE-D-20-08707.

We sequenced 25 isolates of phenotypically multidrug-resistant Salmonella Indiana (n = 11), Typhimurium (n = 8), and Enteritidis (n = 6) using both MinION long-read [SQK-LSK109 and flow cell (R9.4.1)] and MiSeq short-read (Nextera XT and MiSeq Reagent Kit v2) sequencing technologies to determine the advantages of each approach in terms of the characteristics of genome structure, antimicrobial resistance (AMR), virulence potential, whole-genome phylogeny, and pan-genome. The MinION reads were base-called in real-time using MinKnow 3.4.8 integrated with Guppy 3.0.7. The long-read-only assembly, Illumina-only assembly, and hybrid assembly pipelines of Unicycler 0.4.8 were used to generate the MinION, MiSeq, and hybrid assemblies, respectively. The MinION assemblies were highly contiguous compared to the MiSeq assemblies but lacked accuracy, a deficiency that was mitigated by adding the MiSeq short reads through the Unicycler hybrid assembly, which corrected erroneous single nucleotide polymorphisms (SNPs). The MinION assemblies provided similar predictions of AMR and virulence potential compared to the MiSeq and hybrid assemblies, although they produced more total false negatives of AMR genotypes, primarily due to failure in identifying tetracycline resistance genes in 11 of the 19 MinION assemblies of tetracycline-resistant isolates. The MinION assemblies displayed a large genetic distance from their corresponding MiSeq and hybrid assemblies on the whole-genome phylogenetic tree, indicating that the lower read accuracy of MinION sequencing caused incorrect clustering. The pan-genome of the MinION assemblies contained significantly more accessory genes and fewer core genes compared to the MiSeq and hybrid assemblies, suggesting that although these assemblies were more contiguous, their sequencing errors reduced the accuracy of genome annotations.
Our research demonstrates that MinION sequencing by itself provides an efficient assessment of the genome structure, antimicrobial resistance, and virulence potential of Salmonella; however, it is not sufficient for whole-genome phylogenetic and pan-genome analyses. MinION in combination with MiSeq facilitated the most accurate genomic analyses.

RevDate: 2020-07-02

Wang B, Cheng H, Qian W, et al (2020)

Comparative genome analysis and mining of secondary metabolites of Paenibacillus polymyxa.

Genes & genetic systems [Epub ahead of print].

Paenibacillus polymyxa is a well-known Gram-positive biocontrol bacterium. It has been reported that many P. polymyxa strains can inhibit bacteria, fungi and other plant pathogens. Paenibacillus polymyxa employs a variety of mechanisms to promote plant growth, so it is necessary to understand the biocontrol ability of bacteria at the genome level. In the present study, thanks to the widespread availability of Paenibacillus genome data and the development of bioinformatics tools, we were able to analyze and mine the genomes of 43 P. polymyxa strains. The strain NCTC4744 was determined not to be P. polymyxa according to digital DNA-DNA hybridization and average nucleotide identity. By analysis of the pan-genome and the core genome, we found that the pan-genome of P. polymyxa was open and that there were 3,192 core genes. In a gene cluster analysis of secondary metabolites, 797 secondary metabolite gene clusters were found, of which 343 are not similar to known clusters and are expected to reveal a large number of new secondary metabolites. We also analyzed the plant growth-promoting genes that were mined and found, surprisingly, that these genes are highly conserved. The results of the present study not only reveal a large number of unknown potential secondary metabolite gene clusters in P. polymyxa, but also suggest that plant growth promotion characteristics are evolutionary adaptations of P. polymyxa to plant-related habitats.

RevDate: 2020-07-02

Fodor A, Abate BA, Deák P, et al (2020)

Multidrug Resistance (MDR) and Collateral Sensitivity in Bacteria, with Special Attention to Genetic and Evolutionary Aspects and to the Perspectives of Antimicrobial Peptides-A Review.

Pathogens (Basel, Switzerland), 9(7): pii:pathogens9070522.

Antibiotic poly-resistance (multidrug-, extreme-, and pan-drug resistance) is controlled by adaptive evolution. Darwinian and Lamarckian interpretations of resistance evolution are discussed. Arguments for, and against, pessimistic forecasts on a fatal "post-antibiotic era" are evaluated. In commensal niches, the appearance of a new antibiotic resistance often reduces fitness, but compensatory mutations may counteract this tendency. The appearance of new antibiotic resistance is frequently accompanied by a collateral sensitivity to other resistances. Organisms with an expanding open pan-genome, such as Acinetobacter baumannii, Pseudomonas aeruginosa, and Klebsiella pneumoniae, can withstand an increased number of resistances by exploiting their evolutionary plasticity and disseminating clonally or poly-clonally. Multidrug-resistant pathogen clones can become predominant under antibiotic stress conditions but, under the influence of negative frequency-dependent selection, are prevented from rising to dominance in a population in a commensal niche. Antimicrobial peptides have a great potential to combat multidrug resistance, since antibiotic-resistant bacteria have shown a high frequency of collateral sensitivity to antimicrobial peptides. In addition, the mobility patterns of antibiotic resistance, and antimicrobial peptide resistance, genes are completely different. The integron trade in commensal niches is fortunately limited by the species-specificity of resistance genes. Hence, we theorize that the suggested post-antibiotic era has not yet come, and indeed might never come.

RevDate: 2020-07-01

Roder T, Wüthrich D, Bär C, et al (2020)

In Silico Comparison Shows that the Pan-Genome of a Dairy-Related Bacterial Culture Collection Covers Most Reactions Annotated to Human Microbiomes.

Microorganisms, 8(7): pii:microorganisms8070966.

The diversity of the human microbiome is positively associated with human health. However, this diversity is endangered by Westernized dietary patterns that are characterized by a decreased nutrient variety. Diversity might potentially be improved by promoting dietary patterns rich in microbial strains. Various collections of bacterial cultures resulting from a century of dairy research are readily available worldwide, and could be exploited to contribute towards this end. We have conducted a functional in silico analysis of the metagenome of 24 strains, each representing one of the species in a bacterial culture collection composed of 626 sequenced strains, and compared the pathways potentially covered by this metagenome to the intestinal metagenome of four healthy, although overweight, humans. Remarkably, the pan-genome of the 24 strains covers 89% of the human gut microbiome's annotated enzymatic reactions. Furthermore, the dairy microbial collection covers biological pathways, such as methylglyoxal degradation, sulfate reduction, γ-aminobutyric acid (GABA) degradation and salicylate degradation, which are differently covered among the four subjects and are involved in a range of cardiometabolic, intestinal, and neurological disorders. We conclude that microbial culture collections derived from dairy research have the genomic potential to complement and restore functional redundancy in human microbiomes.

RevDate: 2020-06-30

Motyka-Pomagruk A, Zoledowska S, Misztak AE, et al (2020)

Comparative genomics and pangenome-oriented studies reveal high homogeneity of the agronomically relevant enterobacterial plant pathogen Dickeya solani.

BMC genomics, 21(1):449 pii:10.1186/s12864-020-06863-w.

BACKGROUND: Dickeya solani is an important plant pathogenic bacterium causing severe losses in European potato production. This species draws a lot of attention due to its remarkable virulence, great devastating potential and easier spread in contrast to other Dickeya spp. In view of a high need for extensive studies on economically important soft rot Pectobacteriaceae, we performed a comparative genomics analysis on D. solani strains to search for genetic foundations that would explain the differences in the observed virulence levels within the D. solani population.

RESULTS: High quality assemblies of 8 de novo sequenced D. solani genomes have been obtained. Whole-sequence comparison, ANIb, ANIm, Tetra and pangenome-oriented analyses performed on these genomes and the sequences of 14 additional strains revealed an exceptionally high level of homogeneity among the studied genetic material of D. solani strains. With the use of 22 genomes, the pangenome of D. solani, comprising 84.7% core, 7.2% accessory and 8.1% unique genes, has been almost completely determined, suggesting the presence of a nearly closed pangenome structure. Attribution of the genes included in the D. solani pangenome fractions to functional COG categories showed that higher percentages of accessory and unique pangenome parts in contrast to the core section are encountered in phage/mobile elements- and transcription- associated groups with the genome of RNS 05.1.2A strain having the most significant impact. Also, the first D. solani large-scale genome-wide phylogeny computed on concatenated core gene alignments is herein reported.

CONCLUSIONS: The almost closed status of D. solani pangenome achieved in this work points to the fact that the unique gene pool of this species should no longer expand. Such a feature is characteristic of taxa whose representatives either occupy isolated ecological niches or lack efficient mechanisms for gene exchange and recombination, which seems rational concerning a strictly pathogenic species with clonal population structure. Finally, no obvious correlations between the geographical origin of D. solani strains and their phylogeny were found, which might reflect the specificity of the international seed potato market.
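
The core/accessory/unique split reported in this abstract can be illustrated with a minimal Python sketch. It assumes a hypothetical presence/absence table (gene family mapped to the set of strains carrying it) and the common convention that core genes occur in every genome, unique genes in exactly one, and accessory genes in the rest. The function names, strain labels, and toy gene families are illustrative assumptions, not the authors' pipeline or data.

```python
from typing import Dict, Set


def partition_pangenome(presence: Dict[str, Set[str]], n_genomes: int) -> Dict[str, Set[str]]:
    """Split gene families into core, accessory, and unique fractions.

    Assumed convention: core = present in all genomes, unique = present in
    exactly one genome, accessory = everything in between.
    """
    fractions: Dict[str, Set[str]] = {"core": set(), "accessory": set(), "unique": set()}
    for gene, strains in presence.items():
        count = len(strains)
        if count == n_genomes:
            fractions["core"].add(gene)
        elif count == 1:
            fractions["unique"].add(gene)
        elif count > 1:
            fractions["accessory"].add(gene)
    return fractions


def fraction_percentages(fractions: Dict[str, Set[str]]) -> Dict[str, float]:
    """Express each fraction as a percentage of the whole pangenome."""
    total = sum(len(genes) for genes in fractions.values())
    if total == 0:
        return {name: 0.0 for name in fractions}
    return {name: 100.0 * len(genes) / total for name, genes in fractions.items()}


# Toy table for three hypothetical strains (s1, s2, s3); not real D. solani data.
toy_presence = {
    "famA": {"s1", "s2", "s3"},
    "famB": {"s1", "s2", "s3"},
    "famC": {"s1", "s2"},
    "famD": {"s3"},
}
toy_fractions = partition_pangenome(toy_presence, n_genomes=3)
toy_percentages = fraction_percentages(toy_fractions)
```

In this toy table two of the four families are core, one is accessory, and one is unique. Real analyses usually derive the presence/absence table from orthology clustering and may use relaxed thresholds (for example, treating genes present in nearly all genomes as core), which this sketch does not attempt.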

RevDate: 2020-06-30

Yang F, Feng H, Massey IY, et al (2020)

Genome-Wide Analysis Reveals Genetic Potential for Aromatic Compounds Biodegradation of Sphingopyxis.

BioMed research international, 2020:5849123.

Members of the genus Sphingopyxis are frequently found in diverse eco-environments worldwide and have been traditionally considered to play vital roles in the degradation of aromatic compounds. Over recent decades, many aromatic-degrading Sphingopyxis strains have been isolated and recorded, but little is known about their genetic nature related to the biodegradation of aromatic compounds. In this study, bacterial genomes of 19 Sphingopyxis strains were used for comparative analyses. Phylogeny showed an ambiguous relatedness between bacterial strains and their habitat specificity, while clustering based on Cluster of Orthologous Groups suggested the potential link of functional profile with substrate-specific traits. Pan-genome analysis revealed that 19 individuals were predicted to share 1,066 orthologous genes, indicating a high genetic homogeneity among Sphingopyxis strains. Notably, KEGG Automatic Annotation Server results suggested that most genes pertaining to the biodegradation of aromatic compounds were predicted to be involved in benzoate, phenylalanine, and aminobenzoate metabolism. Among them, β-ketoadipate biodegradation might be the main pathway in Sphingopyxis strains. Further inspection showed that a number of mobile genetic elements varied in Sphingopyxis genomes, and plasmid-mediated gene transfer coupled with prophage- and transposon-mediated rearrangements might play prominent roles in the evolution of bacterial genomes. Collectively, our findings indicated that Sphingopyxis isolates might be promising candidates for biodegradation of aromatic compounds in polluted sites.

RevDate: 2020-06-30

Sun Z, Zhou D, Zhang X, et al (2020)

Determining the Genetic Characteristics of Resistance and Virulence of the "Epidermidis Cluster Group" Through Pan-Genome Analysis.

Frontiers in cellular and infection microbiology, 10:274.

Staphylococcus caprae, Staphylococcus capitis, and Staphylococcus epidermidis belong to the "Epidermidis Cluster Group" (ECG) and are generally opportunistic pathogens. In this work, whole genome sequencing, molecular cloning and pan-genome analysis were performed to investigate the genetic characteristics of the resistance, virulence and genome structures of 69 ECG strains, including a clinical isolate (S. caprae SY333) obtained in this work. Two resistance genes (blaZ and aadD2) encoded on the plasmids pSY333-41 and pSY333-45 of S. caprae SY333 were confirmed to be functional. The bla region in ECG exhibited three distinct structures, and these chromosome- and plasmid-encoded bla operons seemed to follow two different evolutionary paths. Pan-genome analysis revealed that their pan-genomes tend to be "open." For the virulence-related factors, the genes involved in primary attachment were observed almost exclusively in S. epidermidis, while the genes associated with intercellular aggregation were observed more frequently in S. caprae and S. capitis. The type VII secretion system was present in all strains of S. caprae and some of S. epidermidis but not in S. capitis. Moreover, the isd locus (iron regulated surface determinant) was first found to be encoded on the genomes of S. caprae and S. capitis. These findings suggested that the plasmid- and chromosome-encoded bla operons of ECG species underwent different evolutionary paths, and that the species differed in the abundance of virulence genes associated with adherence, invasion, secretion system and immune evasion. Identification of isd loci in S. caprae and S. capitis indicated their ability to acquire heme as nutrient iron during infection.

RevDate: 2020-06-26

Nishitsuji K, Arimoto A, Yonashiro Y, et al (2020)

Comparative genomics of four strains of the edible brown alga, Cladosiphon okamuranus.

BMC genomics, 21(1):422 pii:10.1186/s12864-020-06792-8.

BACKGROUND: The brown alga, Cladosiphon okamuranus (Okinawa mozuku), is one of the most important edible seaweeds, and it is cultivated for market primarily in Okinawa, Japan. Four strains, denominated S, K, O, and C, with distinctively different morphologies, have been cultivated commercially since the early 2000s. We previously reported a draft genome of the S-strain. To facilitate studies of seaweed biology for future aquaculture, we here decoded and analyzed genomes of the other three strains (K, O, and C).

RESULTS: Here we improved the genome of the S-strain (ver. 2, 130 Mbp, 12,999 genes), and decoded the K-strain (135 Mbp, 12,511 genes), the O-strain (140 Mbp, 12,548 genes), and the C-strain (143 Mbp, 12,182 genes). Molecular phylogenies, using mitochondrial and nuclear genes, showed that the S-strain diverged first, followed by the K-strain, and most recently the C- and O-strains. Comparisons of genome architecture among the four strains document the frequent occurrence of inversions. In addition to gene acquisitions and losses, the S-, K-, O-, and C-strains possess 457, 344, 367, and 262 gene families unique to each strain, respectively. Comprehensive Blast searches showed that most genes have no sequence similarity to any entries in the non-redundant protein sequence database, although GO annotation suggested that they likely function in relation to molecular and biological processes and cellular components.

CONCLUSIONS: Our study compares the genomes of four strains of C. okamuranus and examines their phylogenetic relationships. Due to global environmental changes, including temperature increases, acidification, and pollution, brown algal aquaculture is facing critical challenges. Genomic and phylogenetic information reported by the present research provides useful tools for isolation of novel strains.

RevDate: 2020-06-25

Collis RM, Biggs PJ, Midwinter AC, et al (2020)

Genomic epidemiology and carbon metabolism of Escherichia coli serogroup O145 reflect contrasting phylogenies.

PloS one, 15(6):e0235066 pii:PONE-D-20-02830.

Shiga toxin-producing Escherichia coli (STEC) are a leading cause of foodborne outbreaks of human disease, but they reside harmlessly as an asymptomatic commensal in the ruminant gut. STEC serogroup O145 are difficult to isolate as routine diagnostic methods are unable to distinguish non-O157 serogroups due to their heterogeneous metabolic characteristics, resulting in under-reporting which is likely to conceal their true prevalence. In light of these deficiencies, the purpose of this study was a twofold approach to investigate enhanced STEC O145 diagnostic culture-based methods: firstly, to use a genomic epidemiology approach to understand the genetic diversity and population structure of serogroup O145 at both a local (New Zealand) (n = 47) and global scale (n = 75) and, secondly, to identify metabolic characteristics that will help the development of a differential medium for this serogroup. Analysis of a subset of E. coli serogroup O145 strains demonstrated considerable diversity in carbon utilisation, which varied in association with eae subtype and sequence type. Several carbon substrates, such as D-serine and D-malic acid, were utilised by the majority of serogroup O145 strains, which, when coupled with current molecular and culture-based methods, could aid in the identification of presumptive E. coli serogroup O145 isolates. These carbon substrates warrant subsequent testing with additional serogroup O145 strains and non-O145 strains. Serogroup O145 strains displayed extensive genetic heterogeneity that was correlated with sequence type and eae subtype, suggesting these genetic markers are good indicators for distinct E. coli phylogenetic lineages. Pangenome analysis identified a core of 3,036 genes and an open pangenome of >14,000 genes, which is consistent with the identification of distinct phylogenetic lineages. Overall, this study highlighted the phenotypic and genotypic heterogeneity within E. coli serogroup O145, suggesting that the development of a differential medium targeting this serogroup will be challenging.

RevDate: 2020-06-23

Vázquez-Rosas-Landa M, Ponce-Soto GY, Aguirre-Liguori JA, et al (2020)

Population genomics of Vibrionaceae isolated from an endangered oasis reveals local adaptation after an environmental perturbation.

BMC genomics, 21(1):418 pii:10.1186/s12864-020-06829-y.

BACKGROUND: In bacteria, pan-genomes are the result of an evolutionary "tug of war" between selection and horizontal gene transfer (HGT). High rates of HGT increase the genetic pool and the effective population size (Ne), resulting in open pan-genomes. In contrast, selective pressures can lead to local adaptation by purging the variation introduced by HGT and mutation, resulting in closed pan-genomes and clonal lineages. In this study, we explored both hypotheses, elucidating the pan-genome of Vibrionaceae isolates after a perturbation event in the endangered oasis of Cuatro Ciénegas Basin (CCB), Mexico, and looking for signals of adaptation to the environments in their genomes.

RESULTS: We obtained 42 genomes of Vibrionaceae distributed in six lineages, two of which did not show any close reference strain in databases. Five of the lineages showed closed pan-genomes and were associated with either water or sediment environments; their high Ne estimates suggest that these lineages are not of recent origin. The only clade with an open pan-genome was found in both environments and was formed by ten genetic groups with low Ne, suggesting a recent origin. The recombination and mutation estimators (r/m) ranged from 0.005 to 2.725, which are similar to oceanic Vibrionaceae estimations. However, we identified 367 gene families with signals of positive selection, most of them found in the core genome, suggesting that despite recombination, natural selection moves the Vibrionaceae CCB lineages to local adaptation, purging the genomes and keeping closed pan-genome patterns. Moreover, we identified 598 SNPs associated with an unstructured environment; some of the genes associated with these SNPs were related to sodium transport.

CONCLUSIONS: Different lines of evidence suggest that the sampled Vibrionaceae are part of the rare biosphere usually living under famine conditions. Two of these lineages were reported for the first time. Most Vibrionaceae lineages of CCB are adapted to their micro-habitats rather than to the sampled environments. This pattern of adaptation is concordant with the association of closed pan-genomes and local adaptation.
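
The open versus closed distinction used in this abstract is often judged from a gene accumulation curve: if the pan-genome keeps growing as genomes are added it looks open, and if it plateaus it looks closed. The Python sketch below illustrates that heuristic under stated assumptions. The function names and the toy gene sets are hypothetical and are not taken from the study, and practical analyses typically average over many random genome orderings rather than using a single fixed order.

```python
from typing import List, Set


def accumulation_curve(genomes: List[Set[str]]) -> List[int]:
    """Cumulative pan-genome size after adding each genome in the given order."""
    seen: Set[str] = set()
    sizes: List[int] = []
    for genes in genomes:
        seen |= genes  # add this genome's gene families to the running pan-genome
        sizes.append(len(seen))
    return sizes


def new_genes_per_genome(sizes: List[int]) -> List[int]:
    """Number of previously unseen gene families contributed by each genome."""
    if not sizes:
        return []
    return [sizes[0]] + [later - earlier for earlier, later in zip(sizes, sizes[1:])]


# Toy lineage whose gene content barely changes: the curve plateaus (closed-like).
closed_like = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "b", "c", "d"}, {"a", "b", "c", "d"}]
# Toy lineage where every genome brings new families: the curve keeps rising (open-like).
open_like = [{"a", "b"}, {"a", "c", "d"}, {"a", "e", "f"}, {"a", "g", "h"}]

closed_sizes = accumulation_curve(closed_like)
open_sizes = accumulation_curve(open_like)
closed_new = new_genes_per_genome(closed_sizes)
open_new = new_genes_per_genome(open_sizes)
```

In the closed-like toy set the later genomes contribute almost no new families, whereas in the open-like set each genome adds two. Whether a real data set is called open or closed depends on the fitted model and sampling depth, which this sketch does not address.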

RevDate: 2020-06-20

Anani H, Zgheib R, Hasni I, et al (2020)

Interest of bacterial pangenome analyses in clinical microbiology.

Microbial pathogenesis pii:S0882-4010(20)30641-0 [Epub ahead of print].

Thanks to the progress and decreasing costs in genome sequencing technologies, more than 250,000 bacterial genomes are currently available in public databases, covering most, if not all, of the major human-associated phylogenetic groups of these microorganisms, pathogenic or not. In addition, for many of them, sequences from several strains of a given species are available, thus making it possible to evaluate their genetic diversity and study their evolution. In addition, the significant cost reduction of bacterial whole genome sequencing as well as the rapid increase in the number of available bacterial genomes have prompted the development of pangenomic software tools. The study of the bacterial pangenome has many applications in clinical microbiology. It can unveil the pathogenic potential and ability of bacteria to resist antimicrobials as well as identify specific sequences and predict antigenic epitopes that allow molecular or serologic assays and vaccines to be designed. Bacterial pangenome analysis constitutes a powerful method for understanding the history of human bacteria and relating these findings to diagnosis in clinical microbiology laboratories in order to optimize patient management.

RevDate: 2020-06-19

Liu Y, Du H, Li P, et al (2020)

Pan-Genome of Wild and Cultivated Soybeans.

Cell pii:S0092-8674(20)30618-8 [Epub ahead of print].

Soybean is one of the most important vegetable oil and protein feed crops. To capture the entire genomic diversity, a complete high-quality pan-genome needs to be constructed from diverse soybean accessions. In this study, we performed individual de novo genome assemblies for 26 representative soybeans that were selected from 2,898 deeply sequenced accessions. Using these assembled genomes together with three previously reported genomes, we constructed a graph-based genome and performed pan-genome analysis, which identified numerous genetic variations that cannot be detected by direct mapping of short sequence reads onto a single reference genome. The structural variations from the 2,898 accessions that were genotyped based on the graph-based genome and the RNA sequencing (RNA-seq) data from the representative 26 accessions helped to link genetic variations to candidate genes that are responsible for important traits. This pan-genome resource will promote evolutionary and functional genomics studies in soybean.

RevDate: 2020-06-12

Ellegaard KM, Suenami S, Miyazaki R, et al (2020)

Vast Differences in Strain-Level Diversity in the Gut Microbiota of Two Closely Related Honey Bee Species.

Current biology : CB pii:S0960-9822(20)30586-8 [Epub ahead of print].

Most bacterial species encompass strains with vastly different gene content. Strain diversity in microbial communities is therefore considered to be of functional importance. Yet little is known about the extent to which related microbial communities differ in diversity at this level and which underlying mechanisms may constrain and maintain strain-level diversity. Here, we used shotgun metagenomics to characterize and compare the gut microbiota of two honey bee species, Apis mellifera and Apis cerana, which diverged about 6 mya. Although the host species are colonized largely by the same bacterial 16S rRNA phylotypes, we find that their communities are host specific when analyzed with genomic resolution. Moreover, despite their similar ecology, A. mellifera displayed a much higher diversity of strains and functional gene content in the microbiota compared to A. cerana, both per colony and per individual bee. In particular, the gene repertoire for polysaccharide degradation was massively expanded in the microbiota of A. mellifera relative to A. cerana. Bee management practices, divergent ecological adaptation, or habitat size may have contributed to the observed differences in microbiota genomic diversity of these key pollinator species. Our results illustrate that the gut microbiota of closely related animal hosts can differ vastly in genomic diversity while displaying similar levels of diversity based on the 16S rRNA gene. Such differences are likely to have consequences for gut microbiota functioning and host-symbiont interactions, highlighting the need for metagenomic studies to understand the ecology and evolution of microbial communities.

RevDate: 2020-06-11

Crouse A, Schramm C, Emond-Rheault JG, et al (2020)

Combining Whole-Genome Sequencing and Multimodel Phenotyping To Identify Genetic Predictors of Salmonella Virulence.

mSphere, 5(3): pii:5/3/e00293-20.

Salmonella comprises more than 2,600 serovars. Very few environmental and uncommon serovars have been characterized for their potential role in virulence and human infections. A complementary in vitro and in vivo systematic high-throughput analysis of virulence was used to elucidate the association between genetic and phenotypic variations across Salmonella isolates. The goal was to develop a strategy for the classification of isolates as a benchmark and predict virulence levels of isolates. Thirty-five phylogenetically distant strains of unknown virulence were selected from the Salmonella Foodborne Syst-OMICS (SalFoS) collection, representing 34 different serovars isolated from various sources. Isolates were evaluated for virulence in 4 complementary models of infection to compare virulence traits with the genomics data, including interactions with human intestinal epithelial cells, human macrophages, and amoeba. In vivo testing was conducted using the mouse model of Salmonella systemic infection. Significant correlations were identified between the different models. We identified a collection of novel hypothetical and conserved proteins associated with isolates that generate a high burden. We also showed that blind prediction of virulence of 33 additional strains based on the pan-genome was high in the mouse model of systemic infection (82% agreement) and in the human epithelial cell model (74% agreement). These complementary approaches enabled us to define virulence potential in different isolates and present a novel strategy for risk assessment of specific strains and for better monitoring and source tracking during outbreaks. IMPORTANCE: Salmonella species are bacteria that are a major source of foodborne disease through contamination of a diversity of foods, including meat, eggs, fruits, nuts, and vegetables. More than 2,600 different Salmonella enterica serovars have been identified, and only a few of them are associated with illness in humans. Despite the fact that they are genetically closely related, there is enormous variation in the virulence of different isolates of Salmonella enterica. Identification of foodborne pathogens is a lengthy process based on microbiological, biochemical, and immunological methods. Here, we worked toward new ways of integrating whole-genome sequencing (WGS) approaches into food safety practices. We used WGS to build associations between virulence and genetic diversity within 83 Salmonella isolates representing 77 different Salmonella serovars. Our work demonstrates the potential of combining a genomics approach and virulence tests to improve the diagnostics and assess risk of human illness associated with specific Salmonella isolates.

RevDate: 2020-06-10

Gori A, Harrison OB, Mlia E, et al (2020)

Pan-GWAS of Streptococcus agalactiae Highlights Lineage-Specific Genes Associated with Virulence and Niche Adaptation.

mBio, 11(3): pii:mBio.00728-20.

Streptococcus agalactiae (group B streptococcus; GBS) is a colonizer of the gastrointestinal and urogenital tracts, and an opportunistic pathogen of infants and adults. The worldwide population of GBS is characterized by clonal complexes (CCs) with different invasive potentials. CC17, for example, is a hypervirulent lineage commonly associated with neonatal sepsis and meningitis, while CC1 is less invasive in neonates and more commonly causes invasive disease in adults with comorbidities. The genetic basis of GBS virulence and the extent to which different CCs have adapted to different host environments remain uncertain. We have therefore applied a pan-genome-wide association study (GWAS) approach to 1,988 GBS strains isolated from different hosts and countries. Our analysis identified 279 CC-specific genes associated with virulence, disease, metabolism, and regulation of cellular mechanisms that may explain the differential virulence potential of particular CCs. In CC17 and CC23, for example, we have identified genes encoding pilus, quorum-sensing proteins, and proteins for the uptake of ions and micronutrients which are absent in less invasive lineages. Moreover, in CC17, carriage and disease strains were distinguished by the allelic variants of 21 of these CC-specific genes. Together our data highlight the lineage-specific basis of GBS niche adaptation and virulence. IMPORTANCE: GBS is a leading cause of mortality in newborn babies in high- and low-income countries worldwide. Different strains of GBS are characterized by different degrees of virulence, where some are harmlessly carried by humans or animals and others are much more likely to cause disease. The genome sequences of almost 2,000 GBS samples isolated from both animals and humans in high- and low-income countries were analyzed using a pan-genome-wide association study approach. This allowed us to identify 279 genes which are associated with different lineages of GBS, characterized by a different virulence and preferred host. Additionally, we propose that the GBS now carried in humans may have first evolved in animals before expanding clonally once adapted to the human host. These findings are essential to help understand what is causing GBS disease and how the bacteria have evolved and are transmitted.

RevDate: 2020-06-08

Gonzales-Siles L, Karlsson R, Schmidt P, et al (2020)

A Pangenome Approach for Discerning Species-Unique Gene Markers for Identifications of Streptococcus pneumoniae and Streptococcus pseudopneumoniae.

Frontiers in cellular and infection microbiology, 10:222.

Correct identifications of isolates and strains of the Mitis-Group of the genus Streptococcus are particularly difficult, due to high genetic similarity, resulting from horizontal gene transfer and homologous recombination, and unreliable phenotypic and genotypic biomarkers for differentiating the species. Streptococcus pneumoniae and Streptococcus pseudopneumoniae are the most closely related species of the clade. In this study, publicly-available genome sequences for Streptococcus pneumoniae and S. pseudopneumoniae were analyzed, using a pangenomic approach, to find candidates for species-unique gene markers; ten species-unique genes for S. pneumoniae and nine for S. pseudopneumoniae were identified. These species-unique gene marker candidates were verified by PCR assays for identifying S. pneumoniae and S. pseudopneumoniae strains isolated from clinical samples. All determined species-level unique gene markers for S. pneumoniae were detected in all S. pneumoniae clinical isolates, whereas fewer of the unique S. pseudopneumoniae gene markers were present in more than 95% of the clinical isolates. In parallel, taxonomic identifications of the clinical isolates were confirmed, using conventional optochin sensitivity testing, targeted PCR-detection for the "Xisco" gene, as well as genomic ANIb similarity analyses for the genome sequences of selected strains. Using mass spectrometry-proteomics, species-specific peptide matches were observed for four of the S. pneumoniae gene markers and for three of the S. pseudopneumoniae gene markers. Application of multiple species-level unique biomarkers of S. pneumoniae and S. pseudopneumoniae, is proposed as a protocol for the routine clinical laboratory for improved, reliable differentiation, and identification of these pathogenic and commensal species.

RevDate: 2020-06-08

Nasr Azadani D, Zhang D, Hatherill JR, et al (2020)

Isolation, characterization, and comparative genomic analysis of a phage infecting high-level aminoglycoside-resistant (HLAR) Enterococcus faecalis.

PeerJ, 8:e9171 pii:9171.

Enterococcus is a genus of Gram-positive bacteria that are commensal to the gastrointestinal tracts of humans but some species have been increasingly implicated as agents of nosocomial infections. The increase in infections and the spread of antibiotic-resistant strains have contributed to renewed interest in the discovery of Enterococcus phages. The aims of this study were (1) the isolation, characterization, and genome sequencing of a phage capable of infecting an antibiotic-resistant E. faecalis strain, and (2) the comparative genomic analysis of publicly-available Enterococcus phages. For this purpose, multiple phages were isolated from wastewater treatment plant (WWTP) influent using a high-level aminoglycoside-resistant (HLAR) E. faecalis strain as the host. One phage, phiNASRA1, demonstrated a high lytic efficiency (∼97.52%). Transmission electron microscopy (TEM) and whole-genome sequencing (WGS) showed that phiNASRA1 belongs to the Siphoviridae family of double-stranded DNA viruses. The phage was approximately 250 nm in length and its complete genome (40,139 bp, 34.7% GC) contained 62 open reading frames (ORFs). Phylogenetic comparisons of phiNASRA1 and 31 publicly-available Enterococcus phages, based on the large subunit terminase and portal proteins, grouped phage by provenance, size, and GC content. In particular, both phylogenies grouped phages larger than 100 kbp into distinct clades. A phylogeny based on a pangenome analysis of the same 32 phages also grouped phages by provenance, size, and GC content although agreement between the two single-locus phylogenies was higher. Per the pangenome phylogeny, phiNASRA1 was most closely related to phage LY0322 that was similar in size, GC content, and number of ORFs (40,139 and 40,934 bp, 34.77 and 34.80%, and 60 and 64 ORFs, respectively). The pangenome analysis did illustrate the high degree of sequence diversity and genome plasticity as no coding sequence was homologous across all 32 phages, and even 'conserved' structural proteins (e.g., the large subunit terminase and portal proteins) were homologous in no more than half of the 32 phage genomes. These findings contribute to a growing body of literature devoted to understanding phage biology and diversity. We propose that this high degree of diversity limited the value of the single-locus and pangenome phylogenies. By contrast, the high degree of homology between phages larger than 100 kbp suggests that pangenome analyses of more similar phages is a viable method for assessing subclade diversity. Future work is focused on validating phiNASRA1 as a potential therapeutic agent to eradicate antibiotic-resistant E. faecalis infections in an animal model.

RevDate: 2020-06-04

Wang LYR, Jokinen CC, Laing CR, et al (2020)

Assessing the genomic relatedness and evolutionary rates of persistent verotoxigenic Escherichia coli serotypes within a closed beef herd in Canada.

Microbial genomics [Epub ahead of print].

Verotoxigenic Escherichia coli (VTEC) are food- and water-borne pathogens associated with both sporadic illness and outbreaks of enteric disease. While it is known that cattle are reservoirs of VTEC, little is known about the genomic variation of VTEC in cattle, and whether the variation in genomes reported for human outbreak strains is consistent with individual animal or group/herd sources of infection. A previous study of VTEC prevalence identified serotypes carried persistently by three consecutive cohorts of heifers within a closed herd of cattle. This present study aimed to: (i) determine whether the genomic relatedness of bovine isolates is similar to that reported for human strains associated with single source outbreaks, (ii) estimate the rates of genome change among dominant serotypes over time within a cattle herd, and (iii) identify genomic features of serotypes associated with persistence in cattle. Illumina MiSeq genome sequencing and genotyping based on allelic and single nucleotide variations were completed, while genome change over time was measured using Bayesian evolutionary analysis sampling trees. The accessory genome, including the non-protein-encoding intergenic regions (IGRs), virulence factors, antimicrobial-resistance genes and plasmid gene content of representative persistent and sporadic cattle strains were compared using Fisher's exact test corrected for multiple comparisons. Herd strains from serotypes O6:H34 (n=22), O22:H8 (n=30), O108:H8 (n=39), O139:H19 (n=44) and O157:H7 (n=106) were readily distinguishable from epidemiologically unrelated strains of the same serotype using a similarity threshold of 10 or fewer allele differences between adjacent nodes. Temporal-cohort clustering within each serotype was supported by date randomization analysis. Substitutions per site per year were consistent with previously reported values for E. coli; however, there was low branch support for these values. 
Acquisition of the phage-encoded Shiga toxin 2 gene in serotype O22:H8 was observed. Pan-genome analyses identified accessory regions that were more prevalent in persistent serotypes (P≤0.05) than in sporadic serotypes. These results suggest that VTEC serotypes from a specific cattle population are highly clonal with a similar level of relatedness as human single-source outbreak-associated strains, but changes in the genome occur gradually over time. Additionally, elements in the accessory genomes may provide a selective advantage for persistence of VTEC within cattle herds.

RevDate: 2020-06-04

Fan X, Qiu H, Han W, et al (2020)

Phytoplankton pangenome reveals extensive prokaryotic horizontal gene transfer of diverse functions.

Science advances, 6(18):eaba0111 pii:aba0111.

The extent and role of horizontal gene transfer (HGT) in phytoplankton and, more broadly, eukaryotic evolution remain controversial topics. Recent studies substantiate the importance of HGT in modifying or expanding functions such as metal or reactive species detoxification and buttressing halotolerance. Yet, the potential of HGT to significantly alter the fate of species in a major eukaryotic assemblage remains to be established. We provide such an example for the ecologically important lineages encompassed by cryptophytes, rhizarians, alveolates, stramenopiles, and haptophytes ("CRASH" taxa). We describe robust evidence of prokaryotic HGTs in these taxa affecting functions such as polysaccharide biosynthesis. Numbers of HGTs range from 0.16 to 1.44% of CRASH species gene inventories, comparable to the ca. 1% prokaryote-derived HGTs found in the genomes of extremophilic red algae. Our results substantially expand the impact of HGT in eukaryotes and define a set of general principles for prokaryotic gene fixation in phytoplankton genomes.

RevDate: 2020-06-04

Wesevich A, Sutton G, Ruffin F, et al (2020)

Newly-named Klebsiella aerogenes (formerly Enterobacter aerogenes) is Associated with Poor Clinical Outcomes Relative to other Enterobacter Species in Patients with Bloodstream Infection.

Journal of clinical microbiology pii:JCM.00582-20 [Epub ahead of print].

Objectives: Enterobacter aerogenes was recently renamed Klebsiella aerogenes. This study aimed to identify differences in clinical characteristics, outcomes, and bacterial genetics among patients with K. aerogenes versus Enterobacter species bloodstream infections (BSI). Methods: We prospectively enrolled patients with K. aerogenes or Enterobacter cloacae complex (Ecc) BSI from 2002-2015. We performed whole genome sequencing (WGS) and pan-genome analysis on all bacteria. Results: Overall, 150 patients with K. aerogenes (46/150 [31%]) or Ecc (104/150 [69%]) BSI were enrolled. The two groups had similar baseline characteristics. Neither total in-hospital mortality (13/46 [28%] versus 22/104 [21%]; p=0.3) nor attributable in-hospital mortality (9/46 [20%] versus 13/104 [12%]; p=0.3) differed between patients with K. aerogenes versus Ecc BSI, respectively. However, poor clinical outcome (death before discharge, recurrent BSI, and/or BSI complication) was higher for K. aerogenes than Ecc BSI (32/46 [70%] versus 42/104 [40%]; p=0.001). In a multivariable regression model, K. aerogenes BSI, relative to Ecc BSI, was predictive of poor clinical outcome (odds ratio 3.3; 95% confidence interval 1.4-8.1; p=0.008). Pan-genome analysis revealed 983 genes in 323 genomic islands unique to K. aerogenes isolates, including putative virulence genes involved in iron acquisition (n=67), fimbriae/pili/flagella production (n=117), and metal homeostasis (n=34). Antibiotic resistance was largely found in Ecc lineage 1, which had a higher rate of multidrug resistant phenotype (23/54 [43%]) relative to all other bacterial isolates (23/96 [24%]; p=0.03). Conclusions: K. aerogenes BSI was associated with poor clinical outcomes relative to Ecc BSI. Putative virulence factors in K. aerogenes may account for these differences.
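
As a rough arithmetic check on the outcome counts quoted in the abstract above, the crude (unadjusted) odds ratio for poor clinical outcome can be computed directly from the two proportions, 32/46 versus 42/104. This is only a sketch: it is not the authors' multivariable regression model, the function name `odds_ratio` is hypothetical, and the crude value should not be expected to match the adjusted estimate of 3.3 exactly.

```python
def odds_ratio(events_a: int, total_a: int, events_b: int, total_b: int) -> float:
    """Unadjusted odds ratio of group A versus group B from event counts."""
    non_events_a = total_a - events_a
    non_events_b = total_b - events_b
    # The ratio is undefined if either denominator cell is zero.
    if non_events_a == 0 or events_b == 0:
        raise ValueError("odds ratio undefined when a denominator cell is zero")
    return (events_a * non_events_b) / (non_events_a * events_b)


# Counts quoted in the abstract: poor clinical outcome in
# K. aerogenes BSI (32 of 46) versus Ecc BSI (42 of 104).
crude_or = odds_ratio(32, 46, 42, 104)
print(f"crude odds ratio: {crude_or:.2f}")
```

The crude figure uses only the 2x2 counts, so any difference from the reported odds ratio reflects the covariates included in the authors' model.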

RevDate: 2020-06-01

Badet T, D Croll (2020)

The rise and fall of genes: origins and functions of plant pathogen pangenomes.

Current opinion in plant biology, 56:65-73 pii:S1369-5266(20)30049-2 [Epub ahead of print].

Plant pathogens can rapidly overcome resistance of their hosts by mutating key pathogenicity genes encoding effectors. Pathogen adaptation is fuelled by extensive genetic variability in populations and different strains may not share the same set of genes. Recently, such intra-specific variation in gene content became formalized as pangenomes distinguishing core genes (i.e. shared) and accessory genes (i.e. lineage or strain-specific). Across pathogen species, key effectors tend to be part of the rapidly evolving accessory genome. Here, we show how the construction and analysis of pathogen pangenomes provide deep insights into the dynamic host adaptation process. We also discuss how pangenomes should ideally be built and how geography, niche and lifestyle likely determine pangenome sizes.
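
The core/accessory distinction described in the abstract above can be illustrated with plain set operations. The sketch below is a minimal example under stated assumptions: genes have already been assigned to comparable gene families, the strain and gene names are hypothetical toy data, and `partition_pangenome` is an illustrative helper rather than any published tool's API.

```python
from typing import Dict, Set, Tuple


def partition_pangenome(
    gene_sets: Dict[str, Set[str]]
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return (pan, core, accessory) gene sets for a collection of strains."""
    if not gene_sets:
        raise ValueError("need at least one strain")
    all_sets = list(gene_sets.values())
    pan = set().union(*all_sets)        # genes seen in any strain
    core = set.intersection(*all_sets)  # genes shared by every strain
    accessory = pan - core              # genes missing from at least one strain
    return pan, core, accessory


# Hypothetical toy data: three strains and their gene families.
strains = {
    "strain_A": {"geneA", "geneB", "effector1"},
    "strain_B": {"geneA", "geneB", "effector2"},
    "strain_C": {"geneA", "geneB", "effector1", "effector3"},
}
pan, core, accessory = partition_pangenome(strains)
print("core:", sorted(core))
print("accessory:", sorted(accessory))
```

Real pangenome analyses usually cluster sequences into orthologous groups first and often relax the core definition to tolerate incomplete assemblies; this sketch uses the strict "present in every strain" definition only.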

RevDate: 2020-05-30

Pilar AVC, Petronella N, Dussault FM, et al (2020)

Similar yet different: phylogenomic analysis to delineate Salmonella and Citrobacter species boundaries.

BMC genomics, 21(1):377 pii:10.1186/s12864-020-06780-y.

BACKGROUND: Salmonella enterica is a leading cause of foodborne illness worldwide resulting in considerable public health and economic costs. Testing for the presence of this pathogen in food is often hampered by the presence of background microflora that may present as Salmonella (false positives). False positive isolates belonging to the genus Citrobacter can be difficult to distinguish from Salmonella due to similarities in their genetics, cell surface antigens, and other phenotypes. In order to understand the genetic basis of these similarities, a comparative genomic approach was used to define the pan-, core, accessory, and unique coding sequences of a representative population of Salmonella and Citrobacter strains.

RESULTS: Analysis of the genomic content of 58 S. enterica strains and 37 Citrobacter strains revealed the presence of 31,130 and 1540 coding sequences within the pan- and core genome of this population. Amino acid sequences unique to either Salmonella (n = 1112) or Citrobacter (n = 195) were identified and revealed potential niche-specific adaptations. Phylogenetic network analysis of the protein families encoded by the pan-genome indicated that genetic exchange between Salmonella and Citrobacter may have led to the acquisition of similar traits and also diversification within the genera.

CONCLUSIONS: Core genome analysis suggests that the Salmonella enterica and Citrobacter populations investigated here share a common evolutionary history. Comparative analysis of the core and pan-genomes was able to define the genetic features that distinguish Salmonella from Citrobacter and highlight niche specific adaptations.

RevDate: 2020-05-29

Li M, Aye SM, Ahmed MU, et al (2020)

Pan-transcriptomic analysis identified common differentially expressed genes of Acinetobacter baumannii in response to polymyxin treatments.

Molecular omics [Epub ahead of print].

Multidrug-resistant Acinetobacter baumannii is a top-priority Gram-negative pathogen and polymyxins are a last-line therapeutic option. Previous systems pharmacological studies examining polymyxin killing and resistance usually focused on individual strains, and the derived knowledge could be limited by strain-specific genomic context. In this study, we examined the gene expression of five A. baumannii strains (34654, 1207552, 1428368, 1457504 and ATCC 19606) to determine the common differentially expressed genes in response to polymyxin treatments. A pan-genome containing 6061 genes was identified for 89 A. baumannii genomes from RefSeq database which included the five strains examined in this study; 2822 of the 6061 genes constituted the core genome. After 2 mg L-1 or 0.75 × MIC polymyxin treatments for 15 min, 41 genes were commonly up-regulated, including those involved in membrane biogenesis and homeostasis, lipoprotein and phospholipid trafficking, efflux pump and poly-N-acetylglucosamine biosynthesis; six genes were commonly down-regulated, three of which were related to fatty acid biosynthesis. Additionally, comparison of the gene expression at 15 and 60 min in ATCC 19606 revealed that polymyxin treatment resulted in a rapid change in amino acid metabolism at 15 min and perturbations on envelope biogenesis at both time points. This is the first pan-transcriptomic study for polymyxin-treated A. baumannii and our results identified that the remodelled outer membrane, up-regulated efflux pumps and down-regulated fatty acid biosynthesis might be essential for early responses to polymyxins in A. baumannii. Our findings provide important mechanistic insights into bacterial responses to polymyxin killing and may facilitate the optimisation of polymyxin therapy against this problematic 'superbug'.

RevDate: 2020-05-29

Tschoeke D, Salazar VW, Vidal L, et al (2020)

Unlocking the Genomic Taxonomy of the Prochlorococcus Collective.

Microbial ecology pii:10.1007/s00248-020-01526-5 [Epub ahead of print].

Prochlorococcus is the most abundant photosynthetic prokaryote on our planet. The extensive ecological literature on the Prochlorococcus collective (PC) is based on the assumption that it comprises one single genus comprising the species Prochlorococcus marinus, containing itself a collective of ecotypes. Ecologists adopt the distributed genome hypothesis of an open pan-genome to explain the observed genomic diversity and evolution patterns of the ecotypes within PC. Novel genomic data for the PC prompted us to revisit this group, applying the current methods used in genomic taxonomy. As a result, we were able to distinguish the five genera: Prochlorococcus, Eurycolium, Prolificoccus, Thaumococcus, and Riococcus. The novel genera have distinct genomic and ecological attributes.

RevDate: 2020-05-29

Sharma P, Gupta SK, Barrett JB, et al (2020)

Comparison of Antimicrobial Resistance and Pan-Genome of Clinical and Non-Clinical Enterococcus cecorum from Poultry Using Whole-Genome Sequencing.

Foods (Basel, Switzerland), 9(6): pii:foods9060686.

Enterococcus cecorum is an emerging avian pathogen, particularly in chickens, but can be found in both diseased (clinical) and healthy (non-clinical) poultry. To better define differences between E. cecorum from the two groups, whole-genome sequencing (WGS) was used to identify and compare antimicrobial resistance genes as well as the pan-genome among the isolates. Eighteen strains selected from our previous study were subjected to WGS using Illumina MiSeq and comparatively analyzed. Assembled contigs were analyzed for resistance genes using ARG-ANNOT. Resistance to erythromycin was mediated by ermB, ermG, and mefA in clinical isolates and by ermB and mefA in non-clinical isolates. Lincomycin resistance genes were identified as linB, lnuB, lnuC, and lnuD with lnuD found only in non-clinical E. cecorum; however, lnuB and linB were found in only one clinical isolate. For both groups of isolates, kanamycin resistance was mediated by aph3-III, while tetracycline resistance was conferred by tetM, tetO, and tetL. No mutations or known resistance genes were found for isolates resistant to either linezolid or chloramphenicol, suggesting possible new mechanisms of resistance to these drugs. A comparison of WGS results confirmed that non-clinical isolates contained more resistance genes than clinical isolates. The pan-genome of clinical and non-clinical isolates resulted in 3651 and 4950 gene families, respectively, whereas the core gene sets were comprised of 1559 and 1534 gene families in clinical and non-clinical isolates, respectively. Unique genes were found more frequently in non-clinical isolates than clinical. Phylogenetic analysis of the isolates and all the available complete and draft genomes showed no correlation between healthy and diseased poultry. Additional genomic comparison is required to elucidate genetic factors in E. cecorum that contribute to disease in poultry.

RevDate: 2020-05-28

Liu YH, Xie YG, Li L, et al (2020)

Cyclobacterium salsum sp. nov. and Cyclobacterium roseum sp. nov., isolated from a saline lake.

International journal of systematic and evolutionary microbiology [Epub ahead of print].

Two novel strains, designated SYSU L10167T and SYSU L10180T, were isolated from sediment sampled at Dabancheng saline lake in Xinjiang, PR China. A polyphasic approach was used to clarify the taxonomic positions of the two strains. Cells of the isolates were curved ring-like, horseshoe-shaped or rod-shaped, non-motile and non-spore-forming. Cells were Gram-stain-negative, aerobic, heterotrophic and rose-pigmented. The phylogenetic trees based on 16S rRNA gene sequences showed that strains SYSU L10167T and SYSU L10180T formed a distinct lineage within the genus Cyclobacterium. Strains SYSU L10167T and SYSU L10180T showed highest similarities to Cyclobacterium jeungdonense KCTC 23150T (98.0 and 97.4%, respectively). Results of genomic analyses (including average nucleotide identity, digital DNA-DNA hybridization and the marker gene tree) and pan-genome analysis further confirmed that strains SYSU L10167T and SYSU L10180T were separate from each other and other species of the genus Cyclobacterium. The draft genomes of the isolates had sizes of 5.5-5.7 Mb and reflected their major physiological capabilities. Based on phenotypic, physiological, chemotaxonomic and genotypic characterization, we propose that the isolates represent two novel species, for which the names Cyclobacterium salsum sp. nov. and Cyclobacterium roseum sp. nov. are proposed. The type strains of the species are SYSU L10167T (=KCTC 72390T=CGMCC 1.17521T) and SYSU L10180T (=KCTC 72391T=CGMCC 1.17278T).

RevDate: 2020-05-27

Garrido-Sanz D, Redondo-Nieto M, Martín M, et al (2020)

Comparative Genomics of the Rhodococcus Genus Shows Wide Distribution of Biodegradation Traits.

Microorganisms, 8(5): pii:microorganisms8050774.

The genus Rhodococcus exhibits great potential for bioremediation applications due to its huge metabolic diversity, including biotransformation of aromatic and aliphatic compounds. Comparative genomic studies of this genus are limited to a small number of genomes, while the high number of sequenced strains to date could provide more information about the Rhodococcus diversity. Phylogenomic analysis of 327 Rhodococcus genomes and clustering of intergenomic distances identified 42 phylogenomic groups and 83 species-level clusters. Rarefaction models show that these numbers are likely to increase as new Rhodococcus strains are sequenced. The Rhodococcus genus possesses a small "hard" core genome consisting of 381 orthologous groups (OGs), while a "soft" core genome of 1253 OGs is reached with 99.16% of the genomes. Models of sequentially randomly added genomes show that a small number of genomes are enough to explain most of the shared diversity of the Rhodococcus strains, while the "open" pangenome and strain-specific genome evidence that the diversity of the genus will increase, as new genomes still add more OGs to the whole genomic set. Most rhodococci possess genes involved in the degradation of aliphatic and aromatic compounds, while short-chain alkane degradation is restricted to a certain number of groups, among which a specific particulate methane monooxygenase (pMMO) is only found in Rhodococcus sp. WAY2. The analysis of Rieske 2Fe-2S dioxygenases among rhodococci genomes revealed that most of these enzymes remain uncharacterized.

RevDate: 2020-05-26

Eizenga JM, Novak AM, Sibbesen JA, et al (2020)

Pangenome Graphs.

Annual review of genomics and human genetics [Epub ahead of print].

Low-cost whole-genome assembly has enabled the collection of haplotype-resolved pangenomes for numerous organisms. In turn, this technological change is encouraging the development of methods that can precisely address the sequence and variation described in large collections of related genomes. These approaches often use graphical models of the pangenome to support algorithms for sequence alignment, visualization, functional genomics, and association studies. The additional information provided to these methods by the pangenome allows them to achieve superior performance on a variety of bioinformatic tasks, including read alignment, variant calling, and genotyping. Pangenome graphs stand to become a ubiquitous tool in genomics. Although it is unclear whether they will replace linear reference genomes, their ability to harmoniously relate multiple sequence and coordinate systems will make them useful irrespective of which pangenomic models become most common in the future. Expected final online publication date for the Annual Review of Genomics and Human Genetics, Volume 21 is August 31, 2020. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.

RevDate: 2020-05-26

Kelly LJ, Plumb WJ, Carey DW, et al (2020)

Convergent molecular evolution among ash species resistant to the emerald ash borer.

Nature ecology & evolution pii:10.1038/s41559-020-1209-3 [Epub ahead of print].

Recent studies show that molecular convergence plays an unexpectedly common role in the evolution of convergent phenotypes. We exploited this phenomenon to find candidate loci underlying resistance to the emerald ash borer (EAB, Agrilus planipennis), the United States' most costly invasive forest insect to date, within the pan-genome of ash trees (the genus Fraxinus). We show that EAB-resistant taxa occur within three independent phylogenetic lineages. In genomes from these resistant lineages, we detect 53 genes with evidence of convergent amino acid evolution. Gene-tree reconstruction indicates that, for 48 of these candidates, the convergent amino acids are more likely to have arisen via independent evolution than by another process such as hybridization or incomplete lineage sorting. Seven of the candidate genes have putative roles connected to the phenylpropanoid biosynthesis pathway and 17 relate to herbivore recognition, defence signalling or programmed cell death. Evidence for loss-of-function mutations among these candidates is more frequent in susceptible species than in resistant ones. Our results on evolutionary relationships, variability in resistance, and candidate genes for defence response within the ash genus could inform breeding for EAB resistance, facilitating ecological restoration in areas invaded by this beetle.

RevDate: 2020-05-25

Gao S, Wu J, Stiller J, et al (2020)

Identifying barley pan-genome sequence anchors using genetic mapping and machine learning.

TAG. Theoretical and applied genetics. Theoretische und angewandte Genetik pii:10.1007/s00122-020-03615-y [Epub ahead of print].

KEY MESSAGE: We identified 1.844 million barley pan-genome sequence anchors from 12,306 genotypes using genetic mapping and machine learning. There is increasing evidence that genes from a given crop genotype are far from covering all genes in that species; thus, building more comprehensive pan-genomes is of great importance in genetic research and breeding. Obtaining a thousand-genotype scale pan-genome using deep-sequencing data is currently impractical for species like barley which has a huge and highly repetitive genome. To this end, we attempted to identify barley pan-genome sequence anchors from a large quantity of genotyping-by-sequencing (GBS) datasets by combining genetic mapping and machine learning algorithms. Based on the GBS sequences from 11,166 domesticated and 1140 wild barley genotypes, we identified 1.844 million pan-genome sequence anchors. Of them, 532,253 were identified as presence/absence variation (PAV) tags. Through aligning these PAV tags to the genome of hulless barley genotype Zangqing320, our analysis resulted in a validation of 83.6% of them from the domesticated genotypes and 88.6% from the wild barley genotypes. Association analyses against flowering time, plant height and kernel size showed that the relative importance of the PAV and non-PAV tags varied for different traits. The pan-genome sequence anchors based on GBS tags can facilitate the construction of a comprehensive pan-genome and greatly assist various genetic studies including identification of structural variation, genetic mapping and breeding in barley.

RevDate: 2020-05-23

Oshkin IY, Miroshnikov KK, Grouzdev DS, et al (2020)

Pan-Genome-Based Analysis as a Framework for Demarcating Two Closely Related Methanotroph Genera Methylocystis and Methylosinus.

Microorganisms, 8(5): pii:microorganisms8050768.

Methylocystis and Methylosinus are two of the five genera that were included in the first taxonomic framework of methanotrophic bacteria created half a century ago. Members of both genera are widely distributed in various environments and play a key role in reducing methane fluxes from soils and wetlands. The original separation of these methanotrophs in two distinct genera was based mainly on their differences in cell morphology. Further comparative studies that explored various single-gene-based phylogenies suggested the monophyletic nature of each of these genera. Current availability of genome sequences from members of the Methylocystis/Methylosinus clade opens the possibility for in-depth comparison of the genomic potentials of these methanotrophs. Here, we report the finished genome sequence of Methylocystis heyeri H2T and compare it to 23 currently available genomes of Methylocystis and Methylosinus species. The phylogenomic analysis confirmed that members of these genera form two separate clades. The Methylocystis/Methylosinus pan-genome core comprised 1,173 genes, with the accessory genome containing 4,941 and 11,192 genes in the shell and the cloud, respectively. Major differences between the genome-encoded environmental traits of these methanotrophs include a variety of enzymes for methane oxidation and dinitrogen fixation as well as genomic determinants for cell motility and photosynthesis.

RevDate: 2020-05-21

Castillo AI, Chacón-Díaz C, Rodríguez-Murillo N, et al (2020)

Impacts of local population history and ecology on the evolution of a globally dispersed pathogen.

BMC genomics, 21(1):369 pii:10.1186/s12864-020-06778-6.

BACKGROUND: Pathogens with a global distribution face diverse biotic and abiotic conditions across populations. Moreover, the ecological and evolutionary history of each population is unique. Xylella fastidiosa is a xylem-dwelling bacterium infecting multiple plant hosts, often with detrimental effects. As a group, X. fastidiosa is divided into distinct subspecies with allopatric historical distributions and patterns of multiple introductions from numerous source populations. The capacity of X. fastidiosa to successfully colonize and cause disease in naïve plant hosts varies among subspecies, and potentially, among populations. Within Central America (i.e. Costa Rica) two X. fastidiosa subspecies coexist: the native subsp. fastidiosa and the introduced subsp. pauca. Using whole genome sequences, the patterns of gene gain/loss, genomic introgression, and genetic diversity were characterized within Costa Rica and contrasted to other X. fastidiosa populations.

RESULTS: Within Costa Rica, accessory and core genome analyses showed a highly malleable genome with numerous intra- and inter-subspecific gain/loss events. Likewise, variable levels of inter-subspecific introgression were found within and between both coexisting subspecies; nonetheless, the direction of donor/recipient subspecies to the recombinant segments varied. Some strains appeared to recombine more frequently than others; however, no group of genes or gene functions were overrepresented within recombinant segments. Finally, the patterns of genetic diversity of subsp. fastidiosa in Costa Rica were consistent with those of other native populations (i.e. subsp. pauca in Brazil).

CONCLUSIONS: Overall, this study shows the importance of characterizing local evolutionary and ecological history in the context of world-wide pathogen distribution.

RevDate: 2020-05-20

Fiuza TS, Lima JPMS, GA de Souza (2020)

EpitoCore: Mining Conserved Epitope Vaccine Candidates in the Core Proteome of Multiple Bacteria Strains.

Frontiers in immunology, 11:816.

In reverse vaccinology approaches, complete proteomes of bacteria are submitted to multiple computational prediction steps in order to filter proteins that are possible vaccine candidates. Most available tools perform such analysis only in a single strain, or a very limited number of strains. But the vast amount of genomic data has shown that most bacteria contain pangenomes, i.e., their genomic information contains core, conserved genes, and random accessory genes specific to each strain. Therefore, in reverse vaccinology methods it is of the utmost importance to define core proteins and core epitopes. EpitoCore is a decision-tree pipeline developed to fulfill that need. It provides surfaceome prediction of proteins from related strains, defines core proteins within those, calculates their immunogenicity, predicts epitopes for a given set of MHC alleles defined by the user, and then reports if epitopes are located extracellularly and if they are conserved among the core homologs. Pipeline performance is illustrated by mining peptide vaccine candidates in Mycobacterium avium hominissuis strains. From a total proteome of ~4,800 proteins per strain, EpitoCore predicted 103 highly immunogenic core homologs located at cell surface, many of those related to virulence and drug resistance. Conserved epitopes identified among these homologs allow the users to define sets of peptides with potential to immunize the largest coverage of tested HLA alleles using peptide-based vaccines. Therefore, EpitoCore is able to provide automated identification of conserved epitopes in bacterial pangenomic datasets.

RevDate: 2020-05-19

Gohil K, Rajput V, M Dharne (2020)

Pan-genomics of Ochrobactrum species from clinical and environmental origins reveals distinct populations and possible links.

Genomics pii:S0888-7543(19)30993-0 [Epub ahead of print].

The genus Ochrobactrum comprises soil-dwelling Gram-negative bacteria mainly reported for bioremediation of toxic compounds. In the last few years, two species of this genus in particular, O. intermedium and O. anthropi, have been documented as causing infections, mostly in immunocompromised patients. Despite such ubiquitous presence, study of adaptation in various niches is still lacking. Thus, to gain insights into the niche adaptation strategies, pan-genome analysis was carried out by comparing 67 genome sequences belonging to Ochrobactrum species. Pan-genome analysis revealed an open pan-genome, indicative of the continuously evolving nature of the genus. The presence/absence of gene clusters also illustrated the unique presence of antibiotic efflux transporter genes and type IV secretion system genes in the clinical strains, while genes for solvent resistance and exporter pumps were unique to the environmental strains. A phylogenomic investigation based on 75 core genes gave better and more robust phylogenetic resolution and topology than the 16S rRNA gene. To support the pan-genome analysis, individual genomes were also investigated for mobile genetic elements (MGE), antibiotic resistance genes (ARG), metal resistance genes (MRG) and virulence factors (VF). The analysis revealed the presence of MGE, ARG, and MRG in all the strains; these elements play an important role in species evolution, which is in agreement with the pan-genome analysis. The average nucleotide identity (ANI) based on the genetic relatedness between the Ochrobactrum species indicated a distinction between individual species. Interestingly, the ANI tool was able to classify to the species level Ochrobactrum genomes that had been assigned only to the genus level in the NCBI database.

