22 Oct 2020 at 01:32
Bibliography on: Pangenome


Robert J. Robbins is a biologist, an educator, a science administrator, a publisher, an information technologist, and an IT leader and manager who specializes in advancing biomedical knowledge and supporting education through the application of information technology.

RJR: Recommended Bibliography. Created: 22 Oct 2020 at 01:32


Although the enforced stability of genomic content is ubiquitous among multicellular eukaryotes (MCEs), the opposite is proving to be the case among prokaryotes, which exhibit remarkable and adaptive plasticity of genomic content. Early bacterial whole-genome sequencing efforts discovered that whenever a particular "species" was re-sequenced, new genes were found that had not been detected earlier: entirely new genes, not merely new alleles. This led to the concepts of the bacterial core-genome, the set of genes found in all members of a particular "species", and the flex-genome, the set of genes found in some, but not all, members of the "species". Together these make up the species' pan-genome.
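The three definitions above map directly onto set operations. The following is a minimal sketch, assuming each genome has already been reduced to a set of gene-family labels; the function name, strain names, and gene labels are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Set, Tuple


def partition_pangenome(
    genomes: Dict[str, Set[str]]
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Split the genes of a collection of genomes into core, flex, and pan sets."""
    if not genomes:
        return set(), set(), set()
    gene_sets = list(genomes.values())
    pan = set().union(*gene_sets)        # genes found in at least one genome
    core = set.intersection(*gene_sets)  # genes found in every genome
    flex = pan - core                    # genes found in some, but not all, genomes
    return core, flex, pan


# Hypothetical toy data: strain names and gene labels are invented.
strains = {
    "strain_A": {"gyrB", "recA", "rpoB", "toxX"},
    "strain_B": {"gyrB", "recA", "rpoB", "pilY"},
    "strain_C": {"gyrB", "recA", "rpoB", "toxX", "phgZ"},
}
core, flex, pan = partition_pangenome(strains)
print(sorted(core))  # ['gyrB', 'recA', 'rpoB']
print(sorted(flex))  # ['phgZ', 'pilY', 'toxX']
```

Real pan-genome pipelines spend most of their effort on the step this sketch skips: deciding which genes in different genomes count as "the same" gene family.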

Created with PubMed® Query: pangenome or "pan-genome" or "pan genome" NOT pmcbook NOT ispreviousversion

Citations: The Papers (from PubMed®)


RevDate: 2020-10-17

Song JM, Liu DX, Xie WZ, et al (2020)

BnPIR: Brassica napus Pan-genome Information Resource for 1,689 accessions.

Plant biotechnology journal [Epub ahead of print].

Brassica napus (B. napus), which supplies approximately 13%-16% of the vegetable oil globally, was originally formed ~7,500 years ago by interspecific hybridization between B. rapa and B. oleracea (Chalhoub et al., 2014). B. napus serves as an excellent model for polyploid genomics and evolutionary research in plants. The Brassica database (BRAD), which provides a genome browser and syntenic relationships for multiple Brassicaceae genomes, has long been used for rapeseed genomic research (Wang et al., 2015).

RevDate: 2020-10-19

Li H, Feng X, C Chu (2020)

The design and construction of reference pangenome graphs with minigraph.

Genome biology, 21(1):265 pii:10.1186/s13059-020-02168-z.

The recent advances in sequencing technologies enable the assembly of individual genomes to the quality of the reference genome. How to integrate multiple genomes from the same species and make the integrated representation accessible to biologists remains an open challenge. Here, we propose a graph-based data model and associated formats to represent multiple genomes while preserving the coordinate of the linear reference genome. We implement our ideas in the minigraph toolkit and demonstrate that we can efficiently construct a pangenome graph and compactly encode tens of thousands of structural variants missing from the current reference genome.

RevDate: 2020-10-16

De Filippis F, Pasolli E, D Ercolini (2020)

Newly Explored Faecalibacterium Diversity Is Connected to Age, Lifestyle, Geography, and Disease.

Current biology : CB pii:S0960-9822(20)31433-0 [Epub ahead of print].

Faecalibacterium is prevalent in the human gut and a promising microbe for the development of next-generation probiotics (NGPs) or biotherapeutics. Analyzing reference Faecalibacterium genomes and almost 3,000 Faecalibacterium-like metagenome-assembled genomes (MAGs) reconstructed from 7,907 human and 203 non-human primate gut metagenomes, we identified the presence of 22 different Faecalibacterium-like species-level genome bins (SGBs), some further divided into different strains according to the subject's geographical origin. Twelve SGBs are globally spread in the human gut and show different genomic potential in the utilization of complex polysaccharides, suggesting that higher SGB diversity may be related to increased utilization of plant-based foods. Moreover, up to 11 different species may co-occur in the same subject, with lower diversity in Western populations, as well as intestinal inflammatory states and obesity. The newly explored Faecalibacterium diversity will be able to support the choice of strains suitable as NGPs, guided by the consideration of the differences existing in their functional potential.

RevDate: 2020-10-15

Zhou Z, Charlesworth J, M Achtman (2020)

Accurate reconstruction of bacterial pan- and core genomes with PEPPAN.

Genome research pii:gr.260828.120 [Epub ahead of print].

Bacterial genomes can contain traces of a complex evolutionary history, including extensive homologous recombination, gene loss, gene duplications, and horizontal gene transfer. To reconstruct the phylogenetic and population history of a set of multiple bacteria, it is necessary to examine their pangenome, the composite of all the genes in the set. Here we introduce PEPPAN, a novel pipeline that can reliably construct pangenomes from thousands of genetically diverse bacterial genomes that represent the diversity of an entire genus. PEPPAN outperforms existing pangenome methods by providing consistent gene and pseudogene annotations extended by similarity-based gene predictions, and identifying and excluding paralogs by combining tree- and synteny-based approaches. The PEPPAN package additionally includes PEPPAN_parser, which implements additional downstream analyses, including the calculation of trees based on accessory gene content or allelic differences between core genes. To test the accuracy of PEPPAN, we implemented SimPan, a novel pipeline for simulating the evolution of bacterial pangenomes. We compared the accuracy and speed of PEPPAN with four state-of-the-art pangenome pipelines using both empirical and simulated data sets. PEPPAN was more accurate and more specific than any of the other pipelines and was almost as fast as any of them. As a case study, we used PEPPAN to construct a pangenome of approximately 40,000 genes from 3052 representative genomes spanning at least 80 species of Streptococcus. The resulting gene and allelic trees provide an unprecedented overview of the genomic diversity of the entire Streptococcus genus.

RevDate: 2020-10-14

Kumar R, Register K, Christopher-Hennings J, et al (2020)

Population Genomic Analysis of Mycoplasma bovis Elucidates Geographical Variations and Genes associated with Host-Types.

Microorganisms, 8(10): pii:microorganisms8101561.

Among more than twenty species belonging to the class Mollicutes, Mycoplasma bovis is the most common cause of bovine mycoplasmosis in North America and Europe. Bovine mycoplasmosis causes significant economic loss in the cattle industry. The number of M. bovis positive herds recently has increased in North America and Europe. Since antibiotic treatment is ineffective and no efficient vaccine is available, M. bovis induced mycoplasmosis is primarily controlled by herd management measures such as the restriction of moving infected animals out of the herds and culling of infected animals or shedders of M. bovis. To better understand the population structure and genomic factors that may contribute to its transmission, we sequenced 147 M. bovis strains isolated from four different countries viz. USA (n = 121), Canada (n = 22), Israel (n = 3) and Lithuania (n = 1). All except two of the isolates (KRB1 and KRB8) were isolated from two host types i.e., bovine (n = 75) and bison (n = 70). We performed a large-scale comparative analysis of M. bovis genomes by integrating 103 publicly available genomes and our dataset (250 total genomes). Whole genome single nucleotide polymorphism (SNP) based phylogeny using M. agalactiae as an outgroup revealed that M. bovis population structure is composed of five different clades. USA isolates showed a high degree of genomic divergence in comparison to the Australian isolates. Based on host of origin, all the isolates in clade IV were of bovine origin, whereas the majority of the isolates in clades III and V were of bison origin. Our comparative genome analysis also revealed that M. bovis has an open pangenome with a large breadth of unexplored diversity of genes. The function based analysis of autogenous vaccine candidates (n = 10) included in this study revealed that their functional diversity does not span the genomic diversity observed in all five clades identified in this study. Our study also found that the M. bovis genome harbors a large number of IS elements and their number increases significantly (p = 7.8 × 10⁻⁶) as the genome size increases. Collectively, the genome data and the whole genome-based population analysis in this study may help to develop better understanding of M. bovis induced mycoplasmosis in cattle.

RevDate: 2020-10-11

Eizenga JM, Novak AM, Kobayashi E, et al (2020)

Efficient dynamic variation graphs.

Bioinformatics (Oxford, England) pii:5872523 [Epub ahead of print].

MOTIVATION: Pangenomics is a growing field within computational genomics. Many pangenomic analyses use bidirected sequence graphs as their core data model. However, implementing and correctly using this data model can be difficult, and the scale of pangenomic datasets can be challenging to work at. These challenges have impeded progress in this field.

RESULTS: Here, we present a stack of two C++ libraries, libbdsg and libhandlegraph, which use a simple, field-proven interface, designed to expose elementary features of these graphs while preventing common graph manipulation mistakes. The libraries also provide a Python binding. Using a diverse collection of pangenome graphs, we demonstrate that these tools allow for efficient construction and manipulation of large genome graphs with dense variation. For instance, the speed and memory usage are up to an order of magnitude better than the prior graph implementation in the VG toolkit, which has now transitioned to using libbdsg's implementations.

AVAILABILITY: libhandlegraph and libbdsg are available under an MIT License from https://github.com/vgteam/libhandlegraph and https://github.com/vgteam/libbdsg.

RevDate: 2020-10-10

Kumar J, D Sen Gupta (2020)

Prospects of next generation sequencing in lentil breeding.

Molecular biology reports pii:10.1007/s11033-020-05891-9 [Epub ahead of print].

Lentil is an important food legume crop that has a large and complex genome. During past years, considerable attention has been given to the use of next generation sequencing for enriching the genomic resources, including identification of SSR and SNP markers, development of unigenes, transcripts, and identification of candidate genes for biotic and abiotic stresses, analysis of genetic diversity and identification of genes/QTLs for agronomically important traits. However, in other crops including pulses, next generation sequencing has revolutionized the genomic research and helped in genomic assisted breeding rapidly and cost effectively. The present review discusses the current status and future prospects of the use of NGS-based breeding in lentil.

RevDate: 2020-10-08

Muthuirulandi Sethuvel DP, Mutreja A, Pragasam AK, et al (2020)

Phylogenetic and Evolutionary Analysis Reveals the Recent Dominance of Ciprofloxacin-Resistant Shigella sonnei and Local Persistence of S. flexneri Clones in India.

mSphere, 5(5):.

Shigella is the second leading cause of bacterial diarrhea worldwide. Recently, Shigella sonnei seems to be replacing Shigella flexneri in low- and middle-income countries undergoing economic development. Despite this, these species remain largely unexplored at the genomic level. Here, we compared the genome sequences of S. flexneri and S. sonnei isolates from India with the publicly available genomes of global strains. Our analysis provides evidence for the long-term persistence of all phylogenetic groups (PGs) of S. flexneri and the recent dominance of the ciprofloxacin-resistant S. sonnei lineage in India. Within S. flexneri PGs, the majority of the study isolates belonged to PG3, with a predominance of serotype 2. For S. sonnei, the current pandemic involves globally distributed multidrug-resistant (MDR) clones that belong to Central Asia lineage III. The presence of such epidemiologically dominant lineages in association with stable antimicrobial resistance (AMR) determinants results in successful survival in the community.

IMPORTANCE: Shigella is the second leading cause of bacterial diarrhea worldwide. This has been categorized as a priority pathogen among enteric bacteria by the Global Antimicrobial Resistance Surveillance System (GLASS) of the World Health Organization (WHO). Recently, S. sonnei seems to be replacing S. flexneri in low- and middle-income countries undergoing economic development. Antimicrobial resistance in S. flexneri and S. sonnei is a growing international concern, specifically with the international dominance of the multidrug-resistant (MDR) lineage. S. flexneri and S. sonnei in India remain largely unexplored at the genomic level. This study provides information on the introduction and expansion of drug-resistant Shigella strains in India for the first time by comparing the genome sequences of S. flexneri and S. sonnei isolates from India with the publicly available genomes of global strains. The study discusses the key differences between the two dominant species of Shigella at the genomic level to understand the evolutionary trends and genome dynamics of emerging and existing resistance clones. The present work demonstrates evidence for the long-term persistence of all PGs of S. flexneri and the recent dominance of a ciprofloxacin-resistant S. sonnei lineage in India.

RevDate: 2020-10-07

Khilyas IV, Sorokina AV, Markelova MI, et al (2020)

Genomic and phenotypic analysis of siderophore-producing Rhodococcus qingshengii strain S10 isolated from an arid weathered serpentine rock environment.

Archives of microbiology pii:10.1007/s00203-020-02057-w [Epub ahead of print].

The success of members of the genus Rhodococcus in colonizing arid rocky environments is owed in part to desiccation tolerance and an ability to extract iron through the secretion and uptake of siderophores. Here, we report a comprehensive genomic and taxonomic analysis of Rhodococcus qingshengii strain S10 isolated from weathered serpentine rock at the arid Khalilovsky massif, Russia. Sequence comparisons of whole genomes and of selected marker genes clearly showed strain S10 to belong to the R. qingshengii species. Four prophage sequences within the R. qingshengii S10 genome were identified, one of which encodes a putative siderophore-interacting protein. Among the ten non-ribosomal peptide synthetase (NRPS) clusters identified in the strain S10 genome, two show high homology to those responsible for siderophore synthesis. Phenotypic analyses demonstrated that R. qingshengii S10 secretes siderophores and possesses adaptive features (tolerance of up to 8% NaCl and pH 9) that should enable survival in its native habitat within dry serpentine rock.

RevDate: 2020-10-07

Sonnenberg CB, Kahlke T, P Haugen (2020)

Vibrionaceae core, shell and cloud genes are non-randomly distributed on Chr 1: An hypothesis that links the genomic location of genes with their intracellular placement.

BMC genomics, 21(1):695 pii:10.1186/s12864-020-07117-5.

BACKGROUND: The genome of Vibrionaceae bacteria, which consists of two circular chromosomes, is replicated in a highly ordered fashion. In fast-growing bacteria, multifork replication results in higher gene copy numbers and increased expression of genes located close to the origin of replication of Chr 1 (ori1). This is believed to be a growth optimization strategy to satisfy the high demand of essential growth factors during fast growth. The relationship between ori1-proximate growth-related genes and gene expression during fast growth has been investigated by many researchers. However, it remains unclear which other gene categories are present close to ori1, and whether expression of all ori1-proximate genes is increased during fast growth or is selectively elevated for certain gene categories.

RESULTS: We calculated the pangenome of all complete genomes from the Vibrionaceae family and mapped the four pangene categories, core, softcore, shell and cloud, to their chromosomal positions. This revealed that core and softcore genes were found heavily biased towards ori1, while shell genes were overrepresented at the opposite part of Chr 1 (i.e., close to ter1). RNA-seq of Aliivibrio salmonicida and Vibrio natriegens showed global gene expression patterns that consistently correlated with chromosomal distance to ori1. Despite a biased gene distribution pattern, all pangene categories contributed to a skewed expression pattern at fast-growing conditions, whereas at slow-growing conditions, softcore, shell and cloud genes were responsible for elevated expression.

CONCLUSION: The pangene categories were non-randomly organized on Chr 1, with an overrepresentation of core and softcore genes around ori1, and overrepresentation of shell and cloud genes around ter1. Furthermore, we mapped our gene distribution data on to the intracellular positioning of chromatin described for V. cholerae, and found that core/softcore and shell/cloud genes appear enriched at two spatially separated intracellular regions. Based on these observations, we hypothesize that there is a link between the genomic location of genes and their cellular placement.
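The four pangene categories named in this abstract are usually assigned from how many genomes carry each gene family. The sketch below illustrates that idea only. The abstract does not state its cutoffs, so the thresholds used here (softcore at 95% or more of genomes, shell at 15% or more, cloud below that), the function name, and the choice to treat the four categories as mutually exclusive are all assumptions for illustration, not the authors' method.

```python
from collections import Counter
from typing import Dict, Set


def classify_pangenes(
    genomes: Dict[str, Set[str]],
    softcore_min: float = 0.95,  # assumed cutoff, not taken from the paper
    shell_min: float = 0.15,     # assumed cutoff, not taken from the paper
) -> Dict[str, str]:
    """Label each gene family as core, softcore, shell, or cloud by frequency."""
    n = len(genomes)
    # Count, for every gene family, the number of genomes that contain it.
    counts = Counter(gene for genes in genomes.values() for gene in genes)
    categories = {}
    for gene, k in counts.items():
        freq = k / n
        if k == n:
            categories[gene] = "core"       # present in every genome
        elif freq >= softcore_min:
            categories[gene] = "softcore"   # present in nearly every genome
        elif freq >= shell_min:
            categories[gene] = "shell"      # present in a moderate fraction
        else:
            categories[gene] = "cloud"      # present in only a few genomes
    return categories


# Hypothetical toy data: 40 genomes with four invented gene families.
toy = {f"g{i}": {"a"} for i in range(40)}   # "a" in all 40 genomes
for i in range(39):
    toy[f"g{i}"].add("b")                   # "b" in 39 of 40 (0.975)
for i in range(20):
    toy[f"g{i}"].add("c")                   # "c" in 20 of 40 (0.5)
for i in range(2):
    toy[f"g{i}"].add("d")                   # "d" in 2 of 40 (0.05)

labels = classify_pangenes(toy)
print(labels["a"], labels["b"], labels["c"], labels["d"])
```

Different tools draw these boundaries differently, so the cutoffs should be read as parameters to set from the relevant tool's documentation rather than as fixed values.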

RevDate: 2020-10-07

Malik A, Kim YR, SB Kim (2020)

Genome Mining of the Genus Streptacidiphilus for Biosynthetic and Biodegradation Potential.

Genes, 11(10): pii:genes11101166.

The genus Streptacidiphilus represents a group of acidophilic actinobacteria within the family Streptomycetaceae, and currently encompasses 15 validly named species, which include five recent additions within the last two years. Considering the potential of the related genera within the family, namely Streptomyces and Kitasatospora, these relatively new members of the family can also be a promising source for novel secondary metabolites. At present, 15 genomes from 11 species of this genus are available, which can provide valuable information on their biology, including the potential for metabolite production as well as enzymatic activities in comparison to the neighboring taxa. In this study, the genome sequences of 11 Streptacidiphilus species were subjected to comparative analysis together with selected Streptomyces and Kitasatospora genomes. This study represents the first comprehensive comparative genomic analysis of the genus Streptacidiphilus. The results indicate that the genomes of Streptacidiphilus contained various secondary metabolite (SM) producing biosynthetic gene clusters (BGCs), some of them identified exclusively in Streptacidiphilus. Several of these clusters may potentially code for SMs that may have a broad range of bioactivities, such as antibacterial, antifungal, antimalarial and antitumor activities. The biodegradation capabilities of Streptacidiphilus were also explored by investigating the hydrolytic enzymes for complex carbohydrates. Although all genomes were enriched with carbohydrate-active enzymes (CAZymes), their numbers in the genomes of some strains such as Streptacidiphilus carbonis NBRC 100919T were higher as compared to well-known carbohydrate degrading organisms. These distinctive features of each Streptacidiphilus species make them interesting candidates for future studies with respect to their potential for SM production and enzymatic activities.

RevDate: 2020-10-06

Chambers J, Sparks N, Sydney N, et al (2020)

Comparative genomics and pan-genomics of the Myxococcaceae, including a description of five novel species: Myxococcus eversor sp. nov., Myxococcus llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogochensis sp. nov., Myxococcus vastator sp. nov., Pyxidicoccus caerfyrddinensis sp. nov. and Pyxidicoccus trucidator sp. nov.

Genome biology and evolution pii:5918458 [Epub ahead of print].

Members of the predatory Myxococcales (myxobacteria) possess large genomes, undergo multicellular development and produce diverse secondary metabolites, which are being actively prospected for novel drug discovery. To direct such efforts, it is important to understand the relationships between myxobacterial ecology, evolution, taxonomy and genomic variation. This study investigated the genomes and pan-genomes of organisms within the Myxococcaceae, including the genera Myxococcus and Corallococcus, the most abundant myxobacteria isolated from soils. Previously, ten species of Corallococcus were known, while six species of Myxococcus phylogenetically surrounded a third genus (Pyxidicoccus) composed of a single species. Here, we describe draft genome sequences of five novel species within the Myxococcaceae (Myxococcus eversor, Myxococcus llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogochensis, Myxococcus vastator, Pyxidicoccus caerfyrddinensis and Pyxidicoccus trucidator), and for the Pyxidicoccus type species strain, Pyxidicoccus fallax DSM 14698T. Genomic and physiological comparisons demonstrated clear differences between the five novel species and every other Myxococcus or Pyxidicoccus spp. type strain. Subsequent analyses of type strain genomes showed that both the Corallococcus pan-genome and the combined Myxococcus and Pyxidicoccus (Myxococcus/Pyxidicoccus) pan-genome are large and open, but with clear differences. Genomes of Corallococcus spp. are generally smaller than those of Myxococcus/Pyxidicoccus spp., but have core genomes three times larger. Myxococcus/Pyxidicoccus spp. genomes are more variable in size, with larger and more unique sets of accessory genes than those of Corallococcus species. In both genera, biosynthetic gene clusters are relatively enriched in the shell pan-genomes, implying they grant a greater evolutionary benefit than other shell genes, presumably by conferring selective advantages during predation.

RevDate: 2020-10-05

Jensen SE, Charles JR, Muleta K, et al (2020)

A sorghum practical haplotype graph facilitates genome-wide imputation and cost-effective genomic prediction.

The plant genome, 13(1):e20009.

Successful management and utilization of increasingly large genomic datasets is essential for breeding programs to accelerate cultivar development. To help with this, we developed a Sorghum bicolor Practical Haplotype Graph (PHG) pangenome database that stores haplotypes and variant information. We developed two PHGs in sorghum that were used to identify genome-wide variants for 24 founders of the Chibas sorghum breeding program from 0.01x sequence coverage. The PHG called single nucleotide polymorphisms (SNPs) with 5.9% error at 0.01x coverage, only 3% higher than PHG error when calling SNPs from 8x coverage sequence. Additionally, 207 progenies from the Chibas genomic selection (GS) training population were sequenced and processed through the PHG. Missing genotypes were imputed from PHG parental haplotypes and used for genomic prediction. Mean prediction accuracies with PHG SNP calls range from 0.57 to 0.73 and are similar to prediction accuracies obtained with genotyping-by-sequencing or targeted amplicon sequencing (rhAmpSeq) markers. This study demonstrates the use of a sorghum PHG to impute SNPs from low-coverage sequence data and shows that the PHG can unify genotype calls across multiple sequencing platforms. By reducing input sequence requirements, the PHG can decrease the cost of genotyping, make GS more feasible, and facilitate larger breeding populations. Our results demonstrate that the PHG is a useful research and breeding tool that maintains variant information from a diverse group of taxa, stores sequence data in a condensed but readily accessible format, unifies genotypes across genotyping platforms, and provides a cost-effective option for genomic selection.

RevDate: 2020-10-05

Roe C, Williamson CHD, Vazquez AJ, et al (2020)

Bacterial Genome Wide Association Studies (bGWAS) and Transcriptomics Identifies Cryptic Antimicrobial Resistance Mechanisms in Acinetobacter baumannii.

Frontiers in public health, 8:451.

Antimicrobial resistance (AMR) in the nosocomial pathogen, Acinetobacter baumannii, is becoming a serious public health threat. While some mechanisms of AMR have been reported, understanding novel mechanisms of resistance is critical for identifying emerging resistance. One of the first steps in identifying novel AMR mechanisms is performing genotype/phenotype association studies; however, performing these studies is complicated by the plastic nature of the A. baumannii pan-genome. In this study, we compared the antibiograms of 12 antimicrobials associated with multiple drug families for 84 A. baumannii isolates, many isolated in Arizona, USA. In silico screening of these genomes for known AMR mechanisms failed to identify clear correlations for most drugs. We then performed a bacterial genome wide association study (bGWAS) looking for associations between all possible 21-mers; this approach generally failed to identify mechanisms that explained the resistance phenotype. In order to decrease the genomic noise associated with population stratification, we compared four phylogenetically-related pairs of isolates with differing susceptibility profiles. RNA-Sequencing (RNA-Seq) was performed on paired isolates and differentially-expressed genes were identified. In these isolate pairs, five different potential mechanisms were identified, highlighting the difficulty of broad AMR surveillance in this species. To verify and validate differential expression, amplicon sequencing was performed. These results suggest that a diagnostic platform based on gene expression rather than genomics alone may be beneficial in certain surveillance efforts. The implementation of such advanced diagnostics coupled with increased AMR surveillance will potentially improve A. baumannii infection treatment and patient outcomes.

RevDate: 2020-10-05

Yang Y, Zhang Y, Cápiro NL, et al (2020)

Genomic Characteristics Distinguish Geographically Distributed Dehalococcoidia.

Frontiers in microbiology, 11:546063.

Dehalococcoidia (Dia) class microorganisms are frequently found in various pristine and contaminated environments. Metagenome-assembled genomes (MAGs) and single-cell amplified genomes (SAGs) studies have substantially improved the understanding of Dia microbial ecology and evolution; however, an updated thorough investigation on the genomic and evolutionary characteristics of Dia microorganisms distributed in geographically distinct environments has not been implemented. In this study, we analyzed available genomic data to unravel Dia evolutionary and metabolic traits. Based on the phylogeny of 16S rRNA genes retrieved from sixty-seven genomes, Dia microorganisms can be categorized into three groups, the terrestrial cluster that contains all Dehalococcoides and Dehalogenimonas strains, the marine cluster I, and the marine cluster II. These results reveal that a higher ratio of horizontally transferred genetic materials was found in the Dia marine clusters compared to that of the Dia terrestrial cluster. Pangenome analysis further suggests that Dia microorganisms have evolved cluster-specific enzymes (e.g., dehalogenase in terrestrial Dia, sulfite reductase in marine Dia) and biosynthesis capabilities (e.g., siroheme biosynthesis in marine Dia). Marine Dia microorganisms are likely adapted to versatile metabolisms for energy conservation besides organohalide respiration. The genomic differences between marine and terrestrial Dia may suggest distinct functions and roles in element cycling (e.g., carbon, sulfur, chlorine), which require interdisciplinary approaches to unravel the physiology and evolution of Dia in various environments.

RevDate: 2020-10-05

Kim HB, Kim E, Yang SM, et al (2020)

Development of Real-Time PCR Assay to Specifically Detect 22 Bifidobacterium Species and Subspecies Using Comparative Genomics.

Frontiers in microbiology, 11:2087.

Bifidobacterium species are used as probiotics to provide beneficial effects to humans. These effects are specific to some species or subspecies of Bifidobacterium. However, some Bifidobacterium species or subspecies are not distinguished because similarity of 16S rRNA and housekeeping gene sequences within Bifidobacterium species is very high. In this study, we developed a real-time polymerase chain reaction (PCR) assay to rapidly and accurately detect 22 Bifidobacterium species by selecting genetic markers using comparative genomic analysis. A total of 210 Bifidobacterium genome sequences were compared to select species- or subspecies-specific genetic markers. A phylogenetic tree based on pan-genomes generated clusters according to Bifidobacterium species or subspecies except that two strains were not grouped with their subspecies. Based on pan-genomes constructed, species- or subspecies-specific genetic markers were selected. The specificity of these markers was confirmed by aligning these genes against 210 genome sequences. Real-time PCR could detect 22 Bifidobacterium specifically. We constructed the criterion for quantification by standard curves. To further test the developed assay for commercial food products, we monitored 26 probiotic products and 7 dairy products. Real-time PCR results and labeling data were then compared. Most of these products (21/33, 63.6%) were consistent with their label claims. Some products labeled at species level only can be detected up to subspecies level through our developed assay.

RevDate: 2020-10-03

Harris LG, Bodger O, Post V, et al (2020)

Temporal Changes in Patient-Matched Staphylococcus epidermidis Isolates from Infections: towards Defining a 'True' Persistent Infection.

Microorganisms, 8(10): pii:microorganisms8101508.

Staphylococcus epidermidis is found naturally on the skin but is a common cause of persistent orthopaedic device-related infections (ODRIs). This study used a pan-genome and gene-by-gene approach to analyse the clonality of whole genome sequences (WGS) of 115 S. epidermidis isolates from 55 patients with persistent ODRIs. Analysis of the 522 gene core genome revealed that the isolates clustered into three clades, and MLST analysis showed that 83% of the isolates belonged to clonal complex 2 (CC2). Analysis also found 13 isolate pairs had different MLST types and less than 70% similarity within the genes; hence, these were defined as re-infection by a different S. epidermidis strain. Comparison of allelic diversity in the remaining 102 isolates (49 patients) revealed that 6 patients had microevolved infections (>7 allele differences), and only 37 patients (77 isolates) had a 'true' persistent infection. Analysis of the core genomes of isolate pairs from 37 patients found 110/841 genes had variations; mainly in metabolism associated genes. The accessory genome consisted of 2936 genes; with an average size of 1515 genes. To conclude, this study demonstrates the advantage of using WGS for identifying the accuracy of a persistent infection diagnosis. Hence, persistent infections can be defined as 'true' persistent infections if the core genome of paired isolates has ≤7 allele differences; microevolved persistent infection if the paired isolates have >7 allele differences but same MLST type; and polyclonal if they are the same species but a different MLST type.
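The three-way definition in the closing sentence of this abstract can be written as a small decision rule. This is a minimal sketch of that rule as stated in the abstract, not the authors' code: the function and parameter names are hypothetical, and checking the MLST type before the allele count is an assumption about how the two criteria combine.

```python
def classify_infection(
    allele_differences: int,
    same_mlst_type: bool,
    max_true_differences: int = 7,  # the <=7 allele cutoff quoted in the abstract
) -> str:
    """Classify a pair of same-species isolates from one patient."""
    if not same_mlst_type:
        # Same species but a different MLST type.
        return "polyclonal"
    if allele_differences <= max_true_differences:
        # Core genomes of the paired isolates differ by at most 7 alleles.
        return "true persistent"
    # Same MLST type but more than 7 allele differences.
    return "microevolved persistent"


print(classify_infection(3, same_mlst_type=True))    # true persistent
print(classify_infection(12, same_mlst_type=True))   # microevolved persistent
print(classify_infection(40, same_mlst_type=False))  # polyclonal
```

The cutoff is kept as a parameter because the value of 7 comes from this one study of S. epidermidis and may not transfer to other species or typing schemes.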

RevDate: 2020-09-30

Srivastava AK, Srivastava R, Sharma A, et al (2020)

Pan-genome analysis of Exiguobacterium reveals species delineation and genomic similarity with Exiguobacterium profundum PHM 11.

Environmental microbiology reports [Epub ahead of print].

The stint of the bacterial species is convoluting, but the new algorithms to calculate genome-to-genome distance (GGD) and DNA-DNA hybridization (DDH) for comparative genome analysis have rejuvenated the exploration of species and sub-species characterization. The present study reports the first whole genome sequence of Exiguobacterium profundum PHM11. The PHM11 genome consists of ~2.92 Mb in 48 contigs, with 47.93% G+C content. Functional annotations revealed a total of 3033 protein coding genes and 33 non-protein coding genes. Out of these, only 2316 could be characterized and the others were reported as hypothetical proteins. The comparative analysis of the predicted proteome of PHM11 with five other Exiguobacterium sp. identified 3806 clusters, out of which PHM11 shared a total of 2723 clusters having 1664 common clusters, 131 singletons and 928 distributed between five species. The pan-genome analysis of seventy different genomic sequences of Exiguobacterium strains devoid of a species taxon was done on the basis of GGD and DDH, which identified eight genomes analogous to PHM11 at species level that may be characterized as E. profundum. The ANI value and phylogenetic tree analysis also support the same. The results regarding pan-genome analysis provide a convincing insight for delineation of these 8 strains to species.

RevDate: 2020-09-28

Patel M, Patel HM, Vohra N, et al (2020)

Complete genome sequencing and comparative genome characterization of the lignocellulosic biomass degrading bacterium Pseudomonas stutzeri MP4687 from cattle rumen.

Biotechnology reports (Amsterdam, Netherlands), 28:e00530 pii:S2215-017X(20)30076-X.

We report the complete genome sequence of a novel Pseudomonas stutzeri strain, MP4687, isolated from cattle rumen. Various strains of P. stutzeri have been reported from different environmental samples including oil-contaminated sites, crop roots, air, and human clinical samples, but this is the first report from a rumen sample. The genome of P. stutzeri MP4687 has a single replicon, a 4.75 Mb chromosome with a G + C content of 63.45%. The genome encodes 4,790 protein-coding genes, including 164 CAZymes and 345 carbohydrate-processing genes. The isolate MP4687 harbors lignocellulosic biomass (LCB) hydrolyzing potential through endoglucanase (4.5 U/mL), xylanase (3.1 U/mL), β-glucosidase (3.3 U/mL) and β-xylosidase (1.9 U/mL) activities. The pangenome analysis further revealed that MP4687 has a very high number of unique genes (>2100) compared to other P. stutzeri genomes, which might have an important role in rumen functioning.

RevDate: 2020-09-28

Verma DK, Vasudeva G, Sidhu C, et al (2020)

Biochemical and Taxonomic Characterization of Novel Haloarchaeal Strains and Purification of the Recombinant Halotolerant α-Amylase Discovered in the Isolate.

Frontiers in microbiology, 11:2082.

Haloarchaea are salt-loving archaea and potential source of industrially relevant halotolerant enzymes. In the present study, three reddish-pink, extremely halophilic archaeal strains, namely wsp1 (wsp-water sample Pondicherry), wsp3, and wsp4, were isolated from the Indian Solar saltern. The phylogenetic analysis based on 16S rRNA gene sequences suggests that both wsp3 and wsp4 strains belong to Halogeometricum borinquense while wsp1 is closely related to Haloferax volcanii species. The comparative genomics revealed an open pangenome for both genera investigated here. Whole-genome sequence analysis revealed that these isolates have multiple copies of industrially/biotechnologically important unique genes and enzymes. Among these unique enzymes, for recombinant expression and purification, we selected four putative α-amylases identified in these three isolates. We successfully purified functional halotolerant recombinant Amy2, from wsp1 using pelB signal sequence-based secretion strategy using Escherichia coli as an expression host. This method may prove useful to produce functional haloarchaeal secretory recombinant proteins suitable for commercial or research applications. Biochemical analysis of Amy2 suggests the halotolerant nature of the enzyme having maximum enzymatic activity observed at 1 M NaCl. We also report the isolation and characterization of carotenoids purified from these isolates. This study highlights the presence of several industrially important enzymes in the haloarchaeal strains which may potentially have improved features like stability and salt tolerance suitable for industrial applications.

RevDate: 2020-09-26

Chen Y, Song W, Xie X, et al (2020)

A Collinearity-incorporating Homology Inference Strategy for Connecting Emerging Assemblies in Triticeae Tribe as a Pilot Practice in the Plant Pangenomic Era.

Molecular plant pii:S1674-2052(20)30314-2 [Epub ahead of print].

Plant genome sequencing has dramatically increased, and some species even have multiple high-quality reference versions. Demands for clade-specific homology inference and analysis have increased in pangenomic era. We proposed a novel method, GeneTribe (https://chenym1.github.io/genetribe/), for homology inference among genetically similar genomes that incorporates gene collinearity and shows better performance than traditional sequence-similarity-based methods in terms of accuracy and scalability. The Triticeae tribe is a typical allopolyploid-rich clade with complex species relationships that includes many important crops such as wheat, barley, and rye. We then built Triticeae-GeneTribe (http://wheat.cau.edu.cn/TGT/) as a homology database, by integrating 12 Triticeae genomes and 3 outgroup model genomes and implemented versatile analysis and visualization functions. With macrocollinearity analysis, we were able to construct a refined model illustrating the structural rearrangements of the 4A-5A-7B chromosomes in wheat as two major translocation events. With collinearity analysis at both the macro- and microscale, we illustrated the complex evolutionary history of homologs of the wheat vernalization gene Vrn2 as a combined result of genome translocation, duplication, and polyploidization and gene loss events. Our work provides a useful practice for connecting emerging genome assemblies, with awareness of the extensive polyploidy in plants, and will help researchers efficiently exploit genome sequence resources.

RevDate: 2020-09-28

McCubbin T, Gonzalez-Garcia RA, Palfreyman RW, et al (2020)

A Pan-Genome Guided Metabolic Network Reconstruction of Five Propionibacterium Species Reveals Extensive Metabolic Diversity.

Genes, 11(10): pii:genes11101115.

Propionibacteria have been studied extensively since the early 1930s due to their relevance to industry and importance as human pathogens. Still, their unique metabolism is far from fully understood. This is partly due to their signature high GC content, which has previously hampered the acquisition of quality sequence data, the accurate annotation of the available genomes, and the functional characterization of genes. The recent completion of the genome sequences for several species has led researchers to reassess the taxonomical classification of the genus Propionibacterium, which has been divided into several new genera. Such data also enable a comparative genomic approach to annotation and provide a new opportunity to revisit our understanding of their metabolism. Using pan-genome analysis combined with the reconstruction of the first high-quality Propionibacterium genome-scale metabolic model and a pan-metabolic model of current and former members of the genus Propionibacterium, we demonstrate that despite sharing unique metabolic traits, these organisms have an unexpected diversity in central carbon metabolism and a hidden layer of metabolic complexity. This combined approach gave us new insights into the evolution of Propionibacterium metabolism and led us to propose a novel, putative ferredoxin-linked energy conservation strategy. The pan-genomic approach highlighted key differences in Propionibacterium metabolism that reflect adaptation to their environment. Results were mathematically captured in genome-scale metabolic reconstructions that can be used to further explore metabolism using metabolic modeling techniques. Overall, the data provide a platform to explore Propionibacterium metabolism and a tool for the rational design of strains.

RevDate: 2020-09-25

Feng Y, Fan X, Zhu L, et al (2020)

Phylogenetic and genomic analysis reveals high genomic openness and genetic diversity of Clostridium perfringens.

Microbial genomics [Epub ahead of print].

Clostridium perfringens is associated with a variety of diseases in both humans and animals. Recent advances in genomic sequencing make it timely to re-visit this important pathogen. Although the genome sequence of C. perfringens was first determined in 2002, large-scale comparative genomics with isolates of different origins is still lacking. In this study, we used whole-genome sequencing of 45 C. perfringens isolates with isolation time spanning an 80-year period and performed comparative analysis of 173 genomes from worldwide strains. We also conducted phylogenetic lineage analysis and introduced an openness index (OI) to evaluate the openness of bacterial genomes. We classified all these genomes into five lineages and hypothesized that the origin of C. perfringens dates back to ~80 000 years ago. We showed that the pangenome of the 173 C. perfringens strains contained a total of 26 954 genes, while the core genome comprised 1020 genes, accounting for about a third of the genome of each isolate. We demonstrated that C. perfringens had the highest OI compared with 51 other bacterial species. Intact prophage sequences were found in nearly 70.0 % of C. perfringens genomes, while CRISPR sequences were found only in ~40.0 %. Plasmids were prevalent in C. perfringens isolates, and half of the virulence genes and antibiotic resistance genes (ARGs) identified in all the isolates could be found in plasmids. ARG-sharing network analysis showed that C. perfringens shared its 11 ARGs with 55 different bacterial species, and a high frequency of ARG transfer may have occurred between C. perfringens and species in the genera Streptococcus and Staphylococcus. Correlation analysis showed that the ARG number in C. perfringens strains increased with time, while the virulence gene number was relatively stable. Our results, taken together with previous studies, revealed the high genome openness and genetic diversity of C. perfringens and provide a comprehensive view of the phylogeny, genomic features, virulence gene and ARG profiles of worldwide strains.
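The pangenome and core-genome counts reported above follow from a gene presence/absence comparison across strains. A minimal sketch of that bookkeeping is given below, assuming each strain has already been reduced to a set of gene-family identifiers; the function name and the toy identifiers are hypothetical, and the study's own clustering pipeline and its openness index are not reproduced here.

```python
from collections import Counter


def pan_core_accessory(gene_sets):
    """Split gene families into pan, core and accessory sets.

    gene_sets: dict mapping a strain name to a set of gene-family ids
    (hypothetical input format chosen for this sketch).
    """
    n_strains = len(gene_sets)
    # Count in how many strains each gene family occurs.
    counts = Counter(gene for genes in gene_sets.values() for gene in genes)
    pan = set(counts)                                   # found in at least one strain
    core = {g for g, c in counts.items() if c == n_strains}  # found in every strain
    accessory = pan - core                              # found in some, not all
    return pan, core, accessory


# Toy data, not taken from the study.
toy = {
    "strain_1": {"g1", "g2", "g3"},
    "strain_2": {"g1", "g2", "g4"},
    "strain_3": {"g1", "g2", "g3", "g5"},
}
pan, core, accessory = pan_core_accessory(toy)
print(len(pan), len(core), len(accessory))  # 5 2 3
```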

RevDate: 2020-09-25

Rautiainen M, T Marschall (2020)

GraphAligner: rapid and versatile sequence-to-graph alignment.

Genome biology, 21(1):253 pii:10.1186/s13059-020-02157-2.

Genome graphs can represent genetic variation and sequence uncertainty. Aligning sequences to genome graphs is key to many applications, including error correction, genome assembly, and genotyping of variants in a pangenome graph. Yet, so far, this step is often prohibitively slow. We present GraphAligner, a tool for aligning long reads to genome graphs. Compared to the state-of-the-art tools, GraphAligner is 13x faster and uses 3x less memory. When employing GraphAligner for error correction, we find it to be more than twice as accurate and over 12x faster than extant tools. Availability: package manager: https://anaconda.org/bioconda/graphaligner; source code: https://github.com/maickrau/GraphAligner.

RevDate: 2020-09-24

Sánchez-Osuna M, Cortés P, Llagostera M, et al (2020)

Exploration into the origins and mobilization of di-hydrofolate reductase genes and the emergence of clinical resistance to trimethoprim.

Microbial genomics [Epub ahead of print].

Trimethoprim is a synthetic antibacterial agent that targets folate biosynthesis by competitively binding to the di-hydrofolate reductase enzyme (DHFR). Trimethoprim is often administered synergistically with sulfonamide, another chemotherapeutic agent targeting the di-hydropteroate synthase (DHPS) enzyme in the same pathway. Clinical resistance to both drugs is widespread and mediated by enzyme variants capable of performing their biological function without binding to these drugs. These mutant enzymes were assumed to have arisen after the discovery of these synthetic drugs, but recent work has shown that genes conferring resistance to sulfonamide were present in the bacterial pangenome millions of years ago. Here, we apply phylogenetics and comparative genomics methods to study the largest family of mobile trimethoprim-resistance genes (dfrA). We show that most of the dfrA genes identified to date map to two large clades that likely arose from independent mobilization events. In contrast to sulfonamide resistance (sul) genes, we find evidence of recurrent mobilization in dfrA genes. Phylogenetic evidence allows us to identify novel dfrA genes in the emerging pathogen Acinetobacter baumannii, and we confirm their resistance phenotype in vitro. We also identify a cluster of dfrA homologues in cryptic plasmid and phage genomes, but we show that these enzymes do not confer resistance to trimethoprim. Our methods also allow us to pinpoint the chromosomal origin of previously reported dfrA genes, and we show that many of these ancient chromosomal genes also confer resistance to trimethoprim. Our work reveals that trimethoprim resistance predated the clinical use of this chemotherapeutic agent, but that novel mutations have likely also arisen and become mobilized following its widespread use within and outside the clinic. Hence, this work confirms that resistance to novel drugs may already be present in the bacterial pangenome, and stresses the importance of rapid mobilization as a fundamental element in the emergence and global spread of resistance determinants.

RevDate: 2020-09-24

Jin L, Chen Y, Yang W, et al (2020)

Complete genome sequence of fish-pathogenic Aeromonas hydrophila HX-3 and a comparative analysis: insights into virulence factors and quorum sensing.

Scientific reports, 10(1):15479 pii:10.1038/s41598-020-72484-8.

The gram-negative, aerobic, rod-shaped bacterium Aeromonas hydrophila, the causative agent of motile aeromonad septicaemia, has attracted increasing attention due to its high pathogenicity. Here, we constructed the complete genome sequence of a virulent strain, A. hydrophila HX-3 isolated from Pseudosciaena crocea and performed comparative genomics to investigate its virulence factors and quorum sensing features in comparison with those of other Aeromonas isolates. HX-3 has a circular chromosome of 4,941,513 bp with a 61.0% G + C content encoding 4483 genes, including 4318 protein-coding genes, and 31 rRNA, 127 tRNA and 7 ncRNA operons. Seventy interspersed repeat and 153 tandem repeat sequences, 7 transposons, 8 clustered regularly interspaced short palindromic repeats, and 39 genomic islands were predicted in the A. hydrophila HX-3 genome. Phylogeny and pan-genome were also analyzed herein to confirm the evolutionary relationships on the basis of comparisons with other fully sequenced Aeromonas genomes. In addition, the assembled HX-3 genome was successfully annotated against the Cluster of Orthologous Groups of proteins database (76.03%), Gene Ontology database (18.13%), and Kyoto Encyclopedia of Genes and Genome pathway database (59.68%). Two-component regulatory systems in the HX-3 genome and virulence factors profiles through comparative analysis were predicted, providing insights into pathogenicity. A large number of genes related to the AHL-type 1 (ahyI, ahyR), LuxS-type 2 (luxS, pfs, metEHK, litR, luxOQU) and QseBC-type 3 (qseB, qseC) autoinducer systems were also identified. As a result of the expression of the ahyI gene in Escherichia coli BL21 (DE3), combined UPLC-MS/MS profiling led to the identification of several new N-acyl-homoserine lactone compounds synthesized by AhyI. This genomic analysis determined the comprehensive QS systems of A. hydrophila, which might provide novel information regarding the mechanisms of virulence signatures correlated with QS.

RevDate: 2020-09-22

Fang X, Lloyd CJ, BO Palsson (2020)

Reconstructing organisms in silico: genome-scale models and their emerging applications.

Nature reviews. Microbiology pii:10.1038/s41579-020-00440-4 [Epub ahead of print].

Escherichia coli is considered to be the best-known microorganism given the large number of published studies detailing its genes, its genome and the biochemical functions of its molecular components. This vast literature has been systematically assembled into a reconstruction of the biochemical reaction networks that underlie E. coli's functions, a process which is now being applied to an increasing number of microorganisms. Genome-scale reconstructed networks are organized and systematized knowledge bases that have multiple uses, including conversion into computational models that interpret and predict phenotypic states and the consequences of environmental and genetic perturbations. These genome-scale models (GEMs) now enable us to develop pan-genome analyses that provide mechanistic insights, detail the selection pressures on proteome allocation and address stress phenotypes. In this Review, we first discuss the overall development of GEMs and their applications. Next, we review the evolution of the most complete GEM that has been developed to date: the E. coli GEM. Finally, we explore three emerging areas in genome-scale modelling of microbial phenotypes: collections of strain-specific models, metabolic and macromolecular expression models, and simulation of stress responses.

RevDate: 2020-09-22

Phanse Y, Wu CW, Venturino AJ, et al (2020)

A Protective Vaccine against Johne's Disease in Cattle.

Microorganisms, 8(9): pii:microorganisms8091427.

Johne's disease (JD) caused by Mycobacterium avium subsp. paratuberculosis (M. paratuberculosis) is a chronic infection characterized by the development of granulomatous enteritis in wild and domesticated ruminants. It is one of the most significant livestock diseases not only in the USA but also globally, accounting for USD 200-500 million losses annually for the USA alone with potential link to cases of Crohn's disease in humans. Developing safe and protective vaccines is of a paramount importance for JD control in dairy cows. The current study evaluated the safety, immunity and protective efficacy of a novel live attenuated vaccine (LAV) candidate with and without an adjuvant in comparison to an inactivated vaccine. Results indicated that the LAV, irrespective of the adjuvant presence, induced robust T cell immune responses indicated by proinflammatory cytokine production such as IFN-γ, IFN-α, TNF-α and IL-17 as well as strong response to intradermal skin test against M. paratuberculosis antigens. Furthermore, the LAV was safe with minimal tissue pathology. Finally, calves vaccinated with adjuvanted LAV did not shed M. paratuberculosis post-challenge, a much-desired characteristic of an effective vaccine against JD. Together, this data suggests a strong potential of testing LAV in field trials to curb JD in dairy herds.

RevDate: 2020-09-17

Zhong C, Wang L, K Ning (2020)

Pan-genome study of Thermococcales reveals extensive genetic diversity and genetic evidence of thermophilic adaption.

Environmental microbiology [Epub ahead of print].

Thermococcales has a strong adaptability to extreme environments, which is of profound interest in explaining how complex life forms emerge on earth. However, their gene composition, thermal stability and evolution in hyperthermal environments are still little known. Here, we characterized the pan-genome architecture of 30 Thermococcales species to gain insight into their genetic properties, evolutionary patterns, and specific metabolisms adapted to niches. We revealed an open pan-genome of Thermococcales comprising 6,070 gene families that tends to increase with the availability of additional genomes. The genome contents of Thermococcales were flexible, with a series of genes experienced gene duplication, progressive divergence, or gene gain and loss events exhibiting distinct functional features. These archaea had concise types of heat shock proteins, such as HSP20, HSP60 and prefoldin, which were constrained by strong purifying selection that governed their conservative evolution. Furthermore, purifying selection forced genes involved in enzyme, motility, secretion system, defense system and chaperones to differ in functional constraints and their disparity in the rate of evolution may be related to adaptation to specific niche. These results deepened our understanding of genetic diversity and adaptation patterns of Thermococcales, and provided valuable research models for studying the metabolic traits of early life forms. This article is protected by copyright. All rights reserved.
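The statement that an open pan-genome "tends to increase with the availability of additional genomes" can be illustrated with a gene-family accumulation curve. The sketch below assumes each genome is available as a set of gene-family identifiers and simply records the cumulative number of distinct families as genomes are added in a fixed order; the function name and toy data are hypothetical, and published analyses usually average over many genome orderings and fit a growth model, which is not shown here.

```python
def pan_genome_curve(genomes):
    """Return cumulative pan-genome sizes as genomes are added in order.

    genomes: list of sets of gene-family ids (hypothetical input format).
    """
    seen = set()
    sizes = []
    for gene_families in genomes:
        seen |= gene_families      # add any families not seen before
        sizes.append(len(seen))    # pan-genome size after this genome
    return sizes


# Toy data, not taken from the study.
genomes = [{"a", "b", "c"}, {"a", "b", "d"}, {"a", "c", "e", "f"}]
print(pan_genome_curve(genomes))  # [3, 4, 6]
```

A curve that keeps rising as genomes are added is the pattern described as "open"; a curve that flattens would suggest a closed pan-genome.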

RevDate: 2020-09-17

Khan M, Stapleton F, Summers S, et al (2020)

Antibiotic Resistance Characteristics of Pseudomonas aeruginosa Isolated from Keratitis in Australia and India.

Antibiotics (Basel, Switzerland), 9(9): pii:antibiotics9090600.

This study investigated genomic differences in Australian and Indian Pseudomonas aeruginosa isolates from keratitis (infection of the cornea). Overall, the Indian isolates were resistant to more antibiotics, with some of those isolates being multi-drug resistant. Acquired genes were related to resistance to fluoroquinolones, aminoglycosides, beta-lactams, macrolides, sulphonamides, and tetracycline and were more frequent in Indian (96%) than in Australian (35%) isolates (p = 0.02). Indian isolates had large numbers of gene variations (median 50,006, IQR = 26,967-50,600) compared to Australian isolates (median 26,317, IQR = 25,681-33,780). There were a larger number of mutations in the mutL and uvrD genes associated with the mismatch repair (MMR) system in Indian isolates, which may result in strains losing their efficacy for DNA repair. The number of gene variations were greater in isolates carrying MMR system genes or exoU. In the phylogenetic division, the number of core genes were similar in both groups, but Indian isolates had larger numbers of pan genes (median 6518, IQR = 6040-6935). Clones related to three different sequence types-ST308, ST316, and ST491-were found among Indian isolates. Only one clone, ST233, containing two strains was present in Australian isolates. The most striking differences between Australian and Indian isolates were carriage of exoU (that encodes a cytolytic phospholipase) in Indian isolates and exoS (that encodes for GTPase activator activity) in Australian isolates, large number of acquired resistance genes, greater changes to MMR genes, and a larger pan genome as well as increased overall genetic variation in the Indian isolates.

RevDate: 2020-09-16

Yin Z, Zhang S, Wei Y, et al (2020)

Horizontal Gene Transfer Clarifies Taxonomic Confusion and Promotes the Genetic Diversity and Pathogenicity of Plesiomonas shigelloides.

mSystems, 5(5): pii:5/5/e00448-20.

Plesiomonas shigelloides is an emerging pathogen that has been shown to be involved in gastrointestinal diseases and extraintestinal infections in humans. However, the taxonomic position, evolutionary dynamics, and pathogenesis of P. shigelloides remain unclear. We reported the draft genome sequences of 12 P. shigelloides strains representing different serogroups. We were able to determine a clear distinction between P. shigelloides and other members of Enterobacterales via core genome phylogeny, Neighbor-Net network, and average genome identity analysis. The pan-genome analysis of P. shigelloides revealed extensive genetic diversity and presented large flexible gene repertoires, while the core genome phylogeny exhibited a low level of clonality. The discordance between the core genome phylogeny and the pan-genome phylogeny indicated that flexible accessory genomes account for an important proportion of the evolution of P. shigelloides, which was subsequently characterized by determinations of hundreds of horizontally transferred genes (horizontal genes), massive gene expansions and contractions, and diverse mobile genetic elements (MGEs). The apparently high levels of horizontal gene transfer (HGT) in P. shigelloides were conferred from bacteria with novel properties from other taxa (mainly Vibrionaceae and Aeromonadaceae), which caused the historical taxonomic confusion and shaped the virulence gene pools. Furthermore, P. shigelloides genomes contain many macromolecular secretion system genes, virulence factor genes, and resistance genes, indicating its potential to cause intestinal and invasive infections. Collectively, our work provides insights into the phylogenetic position, evolutionary dynamic, and pathogenesis of P. shigelloides at the genomic level, which could facilitate the observation and research of this important pathogen.

IMPORTANCE: The taxonomic position of P. shigelloides has been the subject of debate for a long time, and until now, the evolutionary dynamics and pathogenesis of P. shigelloides were unclear. In this study, pan-genome analysis indicated extensive genetic diversity and the presence of large and variable gene repertoires. Our results revealed that horizontal gene transfer was the focal driving force for the genetic diversity of the P. shigelloides pan-genome and might have contributed to the emergence of novel properties. Vibrionaceae and Aeromonadaceae were found to be the predominant donor taxa for horizontal genes, which might have caused the taxonomic confusion historically. Comparative genomic analysis revealed the potential of P. shigelloides to cause intestinal and invasive diseases. Our results could advance the understanding of the evolution and pathogenesis of P. shigelloides, particularly in elucidating the role of horizontal gene transfer and investigating virulence-related elements.

RevDate: 2020-09-16

Ross DE, Marshall CW, Gulliver D, et al (2020)

Defining Genomic and Predicted Metabolic Features of the Acetobacterium Genus.

mSystems, 5(5): pii:5/5/e00277-20.

Acetogens are anaerobic bacteria capable of fixing CO2 or CO to produce acetyl coenzyme A (acetyl-CoA) and ultimately acetate using the Wood-Ljungdahl pathway (WLP). Acetobacterium woodii is the type strain of the Acetobacterium genus and has been critical for understanding the biochemistry and energy conservation in acetogens. Members of the Acetobacterium genus have been isolated from a variety of environments or have had genomes recovered from metagenome data, but no systematic investigation has been done on the unique and various metabolisms of the genus. To gain a better appreciation for the metabolic breadth of the genus, we sequenced the genomes of 4 isolates (A. fimetarium, A. malicum, A. paludosum, and A. tundrae) and conducted a comparative genome analysis (pan-genome) of 11 different Acetobacterium genomes. A unifying feature of the Acetobacterium genus is the carbon-fixing WLP. The methyl (cluster II) and carbonyl (cluster III) branches of the Wood-Ljungdahl pathway are highly conserved across all sequenced Acetobacterium genomes, but cluster I encoding the formate dehydrogenase is not. In contrast to A. woodii, all but four strains encode two distinct Rnf clusters, Rnf being the primary respiratory enzyme complex. Metabolism of fructose, lactate, and H2:CO2 was conserved across the genus, but metabolism of ethanol, methanol, caffeate, and 2,3-butanediol varied. Additionally, clade-specific metabolic potential was observed, such as amino acid transport and metabolism in the psychrophilic species, and biofilm formation in the A. wieringae clade, which may afford these groups an advantage in low-temperature growth or attachment to solid surfaces, respectively.

IMPORTANCE: Acetogens are anaerobic bacteria capable of fixing CO2 or CO to produce acetyl-CoA and ultimately acetate using the Wood-Ljungdahl pathway (WLP). This autotrophic metabolism plays a major role in the global carbon cycle and, if harnessed, can help reduce greenhouse gas emissions. Overall, the data presented here provide a framework for examining the ecology and evolution of the Acetobacterium genus and highlight the potential of these species as a source for production of fuels and chemicals from CO2 feedstocks.

RevDate: 2020-09-15

Chen Z, Erickson DL, J Meng (2020)

Benchmarking hybrid assembly approaches for genomic analyses of bacterial pathogens using Illumina and Oxford Nanopore sequencing.

BMC genomics, 21(1):631 pii:10.1186/s12864-020-07041-8.

BACKGROUND: We benchmarked the hybrid assembly approaches of MaSuRCA, SPAdes, and Unicycler for bacterial pathogens using Illumina and Oxford Nanopore sequencing by determining genome completeness and accuracy, antimicrobial resistance (AMR), virulence potential, multilocus sequence typing (MLST), phylogeny, and pan genome. Ten bacterial species (10 strains) were tested for simulated reads of both mediocre- and low-quality, whereas 11 bacterial species (12 strains) were tested for real reads.

RESULTS: Unicycler performed the best for achieving contiguous genomes, closely followed by MaSuRCA, while all SPAdes assemblies were incomplete. MaSuRCA was less tolerant of low-quality long reads than SPAdes and Unicycler. The hybrid assemblies of five antimicrobial-resistant strains with simulated reads provided consistent AMR genotypes with the reference genomes. The MaSuRCA assembly of Staphylococcus aureus with real reads contained msr(A) and tet(K), while the reference genome and SPAdes and Unicycler assemblies harbored blaZ. The AMR genotypes of the reference genomes and hybrid assemblies were consistent for the other five antimicrobial-resistant strains with real reads. The numbers of virulence genes in all hybrid assemblies were similar to those of the reference genomes, irrespective of simulated or real reads. Only one exception existed that the reference genome and hybrid assemblies of Pseudomonas aeruginosa with mediocre-quality long reads carried 241 virulence genes, whereas 184 virulence genes were identified in the hybrid assemblies of low-quality long reads. The MaSuRCA assemblies of Escherichia coli O157:H7 and Salmonella Typhimurium with mediocre-quality long reads contained 126 and 118 virulence genes, respectively, while 110 and 107 virulence genes were detected in their MaSuRCA assemblies of low-quality long reads, respectively. All approaches performed well in our MLST and phylogenetic analyses. The pan genomes of the hybrid assemblies of S. Typhimurium with mediocre-quality long reads were similar to that of the reference genome, while SPAdes and Unicycler were more tolerant of low-quality long reads than MaSuRCA for the pan-genome analysis. All approaches functioned well in the pan-genome analysis of Campylobacter jejuni with real reads.

CONCLUSIONS: Our research demonstrates the hybrid assembly pipeline of Unicycler as a superior approach for genomic analyses of bacterial pathogens using Illumina and Oxford Nanopore sequencing.

RevDate: 2020-09-14

Psomopoulos FE, van Helden J, Médigue C, et al (2020)

Ancestral state reconstruction of metabolic pathways across pangenome ensembles.

Microbial genomics [Epub ahead of print].

As genome sequencing efforts are unveiling the genetic diversity of the biosphere with an unprecedented speed, there is a need to accurately describe the structural and functional properties of groups of extant species whose genomes have been sequenced, as well as their inferred ancestors, at any given taxonomic level of their phylogeny. Elaborate approaches for the reconstruction of ancestral states at the sequence level have been developed, subsequently augmented by methods based on gene content. While these approaches of sequence or gene-content reconstruction have been successfully deployed, there has been less progress on the explicit inference of functional properties of ancestral genomes, in terms of metabolic pathways and other cellular processes. Herein, we describe PathTrace, an efficient algorithm for parsimony-based reconstructions of the evolutionary history of individual metabolic pathways, pivotal representations of key functional modules of cellular function. The algorithm is implemented as a five-step process through which pathways are represented as fuzzy vectors, where each enzyme is associated with a taxonomic conservation value derived from the phylogenetic profile of its protein sequence. The method is evaluated with a selected benchmark set of pathways against collections of genome sequences from key data resources. By deploying a pangenome-driven approach for pathway sets, we demonstrate that the inferred patterns are largely insensitive to noise, as opposed to gene-content reconstruction methods. In addition, the resulting reconstructions are closely correlated with the evolutionary distance of the taxa under study, suggesting that a diligent selection of target pangenomes is essential for maintaining cohesiveness of the method and consistency of the inference, serving as an internal control for an arbitrary selection of queries. The PathTrace method is a first step towards the large-scale analysis of metabolic pathway evolution and our deeper understanding of functional relationships reflected in emerging pangenome collections.
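To give a feel for what a parsimony-based ancestral reconstruction does, the sketch below applies the classic Fitch bottom-up pass to a single presence/absence character on a rooted binary tree. It is not PathTrace: the five-step process and the fuzzy pathway vectors described in the abstract are not reproduced, and the tree encoding, function name, and taxon labels are assumptions made only for this illustration.

```python
def fitch(tree, leaf_states):
    """Fitch parsimony, bottom-up pass, for one discrete character.

    tree: nested 2-tuples with leaf names as strings, e.g. (("A", "B"), "C")
    leaf_states: dict mapping leaf name -> observed state (e.g. 0 or 1)
    Returns (set of most-parsimonious root states, minimum number of changes).
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):           # leaf: its observed state
            return {leaf_states[node]}
        left, right = node
        a, b = visit(left), visit(right)
        common = a & b
        if common:                          # children agree: keep intersection
            return common
        changes += 1                        # children disagree: one change
        return a | b

    return visit(tree), changes


# Hypothetical taxa; 1 = pathway present, 0 = pathway absent.
tree = ((("A", "B"), "C"), ("D", "E"))
states = {"A": 1, "B": 1, "C": 0, "D": 0, "E": 0}
print(fitch(tree, states))  # ({0}, 1)
```

Here the root is inferred to lack the character, with a single gain on the branch leading to A and B.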

RevDate: 2020-09-13

Gardon H, Biderre-Petit C, Jouan-Dufournel I, et al (2020)

A drift-barrier model drives the genomic landscape of a structured bacterial population.

Molecular ecology [Epub ahead of print].

Bacterial populations differentiate over time and space to form distinct genetic units. The mechanisms governing this diversification are presumed to result from the ecological context of living units to adapt to specific niches. Recently, a model assuming the acquisition of advantageous genes among populations rather than whole genome sweeps has emerged to explain population differentiation. However, the characteristics of these exchanged, or flexible, genes and whether their evolution is driven by adaptive or neutral processes remain controversial. By analysing the flexible genome of single-amplified genomes of co-occurring populations of the marine Prochlorococcus HLII ecotype, we highlight that genomic compartments - rather than population units - are characterized by different evolutionary trajectories. The dynamics of gene fluxes vary across genomic compartments and therefore the effectiveness of selection depends on the fluctuation of the effective population size along the genome. Taken together, these results support the drift-barrier model of bacterial evolution.

RevDate: 2020-09-11

Christian RW, Hewitt SL, Nelson G, et al (2020)

Plastid transit peptides-where do they come from and where do they all belong? Multi-genome and pan-genomic assessment of chloroplast transit peptide evolution.

PeerJ, 8:e9772 pii:9772.

Subcellular relocalization of proteins determines an organism's metabolic repertoire and thereby its survival in unique evolutionary niches. In plants, the plastid and its various morphotypes import a large and varied number of nuclear-encoded proteins to orchestrate vital biochemical reactions in a spatiotemporal context. Recent comparative genomics analysis and high-throughput shotgun proteomics data indicate that there are a large number of plastid-targeted proteins that are either semi-conserved or non-conserved across different lineages. This implies that homologs are differentially targeted across different species, which is feasible only if proteins have gained or lost plastid targeting peptides during evolution. In this study, a broad, multi-genome analysis of 15 phylogenetically diverse genera and in-depth analyses of pangenomes from Arabidopsis and Brachypodium were performed to address the question of how proteins acquire or lose plastid targeting peptides. The analysis revealed that random insertions or deletions were the dominant mechanism by which novel transit peptides are gained by proteins. While gene duplication was not a strict requirement for the acquisition of novel subcellular targeting, 40% of novel plastid-targeted genes were found to be most closely related to a sequence within the same genome, and of these, 30.5% resulted from alternative transcription or translation initiation sites. Interestingly, analysis of the distribution of amino acids in the transit peptides of known and predicted chloroplast-targeted proteins revealed monocot and eudicot-specific preferences in residue distribution.

RevDate: 2020-09-09

Zhang X, Li F, Cui S, et al (2020)

Prevalence and Distribution Characteristics of blaKPC-2 and blaNDM-1 Genes in Klebsiella pneumoniae.

Infection and drug resistance, 13:2901-2910 pii:253631.

Background: Carbapenem-resistant Klebsiella pneumoniae infections have caused major concern and posed a global threat to public health. As blaKPC-2 and blaNDM-1 genes are the most widely reported carbapenem resistance genes in K. pneumoniae, it is crucial to study the prevalence and geographical distribution of these two genes for further understanding of their transmission mode and mechanism.

Purpose: Here, we investigated the prevalence and distribution of blaKPC-2 and blaNDM-1 genes in carbapenem-resistant K. pneumoniae strains from a tertiary hospital and from 1579 genomes available in the NCBI database, and further analyzed the possible core structure of blaKPC-2 or blaNDM-1 genes among global genome data.

Materials and Methods: K. pneumoniae strains from a tertiary hospital in China during 2013-2018 were collected and their antimicrobial susceptibility to 28 antibiotics was determined. Whole-genome sequencing of carbapenem-resistant K. pneumoniae strains was used to investigate their genetic characteristics. The phylogenetic relationships of these strains were investigated through pan-genome analysis. The epidemiology and distribution of blaKPC-2 and blaNDM-1 genes in K. pneumoniae, based on 1579 global genomes and the carbapenem-resistant K. pneumoniae strains from the hospital, were analyzed using bioinformatics. The possible core structure carrying blaKPC-2 or blaNDM-1 genes was investigated among global data.

Results: A total of 19 carbapenem-resistant K. pneumoniae were isolated in a tertiary hospital. All isolates had a multi-resistant pattern and eight kinds of resistance genes. The phylogenetic analysis showed all isolates in the hospital were dominated by two lineages composed of ST11 and ST25, respectively. ST11 and ST25 were the major ST types carrying blaKPC-2 and blaNDM-1 genes, respectively. Among the 1579 global genomes, 147 known ST types (1195 genomes) have been identified, while ST258 (23.6%) and ST11 (22.1%) were the globally prevalent clones among the known ST types. Genetic environment analysis showed that ISKpn7-dnaA/ISKpn27-blaKPC-2-ISKpn6 and blaNDM-1-ble-trpf-nagA may be the core structures in the horizontal transfer of blaKPC-2 and blaNDM-1, respectively. In addition, DNA transferase (hin) may be involved in the horizontal transfer or the expression of blaNDM-1.

Conclusion: There was clonal transmission of carbapenem-resistant K. pneumoniae in the tertiary hospital in China. The prevalence and distribution of blaKPC-2 and blaNDM-1 varied by country and were driven by different transposons carrying the core structure. This study sheds light on the genetic environment of blaKPC-2 and blaNDM-1 and offers basic information about the mechanism of carbapenem-resistant K. pneumoniae dissemination.

RevDate: 2020-09-09

Liu Y, Z Tian (2020)

From one linear genome to a graph-based pan-genome: a new era for genomics.

Science China. Life sciences pii:10.1007/s11427-020-1808-0 [Epub ahead of print].

RevDate: 2020-09-09

González-Dominici LI, Saati-Santamaría Z, P García-Fraile (2020)

Genome Analysis and Genomic Comparison of the Novel Species Arthrobacter ipsi Reveal Its Potential Protective Role in Its Bark Beetle Host.

Microbial ecology pii:10.1007/s00248-020-01593-8 [Epub ahead of print].

The pine engraver beetle, Ips acuminatus Gyll, is a bark beetle that causes significant damage in Scots pine (Pinus sylvestris) forests and plantations. Like almost all higher organisms, Ips acuminatus harbours a microbiome, although the role of most members of its microbiome is not well understood. As part of a study in which we analysed the bacterial diversity associated with Ips acuminatus, we isolated the strain Arthrobacter sp. IA7. In order to study its potential role within the bark beetle holobiont, we sequenced and explored its genome and performed a pan-genome analysis of the genus Arthrobacter, showing specific genes of strain IA7 that might be related to its particular role in its niche. Based on these investigations, we suggest several potential roles of the bacterium within the beetle. Analysis of genes related to secondary metabolism indicated potential antifungal capability, confirmed by the inhibition of several entomopathogenic fungal strains (Metarhizium anisopliae CCF0966, Lecanicillium muscarium CCF6041, L. muscarium CCF3297, Isaria fumosorosea CCF4401, I. farinosa CCF4808, Beauveria bassiana CCF4422 and B. brongniartii CCF1547). Phylogenetic analyses of the 16S rRNA gene, six concatenated housekeeping genes (tuf-secY-rpoB-recA-fusA-atpD) and genome sequences indicated that strain IA7 is closely related to A. globiformis NBRC 12137T but forms a new species within the genus Arthrobacter; this was confirmed by digital DNA-DNA hybridization (37.10%) and average nucleotide identity (ANIb) (88.9%). Based on phenotypic and genotypic features, we propose strain IA7T as the novel species Arthrobacter ipsi sp. nov. (type strain IA7T = CECT 30100T = LMG 31782T) and suggest its protective role for its host.
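To make the species-delineation step in the abstract above concrete, here is a minimal sketch of a threshold check in Python. The cutoffs (about 95% ANI and 70% digital DNA-DNA hybridization) are widely used conventions and are an assumption here; the abstract does not state which cutoffs the authors applied. The function name and constants are hypothetical.

```python
# Commonly cited species-boundary cutoffs (assumptions, not taken from the abstract).
ANI_SPECIES_THRESHOLD = 95.0   # percent average nucleotide identity
DDDH_SPECIES_THRESHOLD = 70.0  # percent digital DNA-DNA hybridization


def is_same_species(ani_percent, dddh_percent):
    """Return True only if both values reach the conventional cutoffs."""
    return (ani_percent >= ANI_SPECIES_THRESHOLD
            and dddh_percent >= DDDH_SPECIES_THRESHOLD)


# Values reported in the abstract for strain IA7 versus its closest relative.
ia7_vs_closest = is_same_species(ani_percent=88.9, dddh_percent=37.10)
```

Under these assumed cutoffs, both reported values fall well below the species boundary, which is consistent with the proposal of a novel species.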

RevDate: 2020-09-08

Boisen N, Østerlund MT, Joensen KG, et al (2020)

Redefining enteroaggregative Escherichia coli (EAEC): Genomic characterization of epidemiological EAEC strains.

PLoS neglected tropical diseases, 14(9):e0008613 pii:PNTD-D-20-00385 [Epub ahead of print].

Although enteroaggregative E. coli (EAEC) has been implicated as a common cause of diarrhea in multiple settings, neither its essential genomic nature nor its role as an enteric pathogen is fully understood. The current definition of this pathotype requires demonstration of cellular adherence; a working molecular definition encompasses E. coli which do not harbor the heat-stable or heat-labile toxins of enterotoxigenic E. coli (ETEC) and harbor the genes aaiC, aggR, and/or aatA. In an effort to improve the definition of this pathotype, we report the most definitive characterization of the pan-genome of EAEC to date, applying comparative genomics and functional characterization on a collection of 97 EAEC strains isolated in the course of a multicenter case-control diarrhea study (Global Enteric Multi-Center Study, GEMS). Genomic analysis revealed that the EAEC strains mapped to all phylogenomic groups of E. coli. Circa 70% of strains harbored one of the five described AAF variants; there were no additional AAF variants identified, and strains that lacked an identifiable AAF generally did not have an otherwise complete AggR regulon. An exception was strains that harbored an ETEC colonization factor (CF) CS22, like AAF a member of the chaperone-usher family of adhesins, but not phylogenetically related to the AAF family. Of all genes scored, sepA yielded the strongest association with diarrhea (p = 0.002) followed by the increased serum survival gene, iss (p = 0.026), and the outer membrane protease gene ompT (p = 0.046). Notably, the EAEC genomes harbored several genes characteristically associated with other E. coli pathotypes. Our data suggest that a molecular definition of EAEC could comprise E. coli strains harboring AggR and a complete AAF(I-V) or CS22 gene cluster. Further, it is possible that strains meeting this definition could be both enteric bacteria and urinary/systemic pathogens.

RevDate: 2020-09-07

Bonnici V, Maresi E, R Giugno (2020)

Challenges in gene-oriented approaches for pangenome content discovery.

Briefings in bioinformatics pii:5901976 [Epub ahead of print].

Given a group of genomes, represented as the sets of genes that belong to them, the discovery of the pangenomic content is based on the search for genetic homology among the genes in order to cluster them into families. Thus, pangenomic analyses investigate the membership of the families in the given genomes. This approach is referred to as the gene-oriented approach, in contrast to other definitions of the problem that take into account different genomic features. In the past years, several tools have been developed to discover and analyse pangenomic contents. Because of the hardness of the problem, each tool applies a different strategy for discovering the pangenomic content. As a result, the performance of each tool differs depending on the composition of the input genomes. This review reports the main analysis instruments provided by the current state-of-the-art tools for the discovery of pangenomic contents. Moreover, unlike previous works, the presented study compares pangenomic tools from a methodological perspective, analysing the causes that lead a given methodology to outperform other tools. The analysis is performed by taking into account different bacterial populations, which are synthetically generated by changing evolutionary parameters. The benchmarks used to compare the pangenomic tools, in addition to the computational pipeline developed for this purpose, are available at https://github.com/InfOmics/pangenes-review. Contact: V. Bonnici, R. Giugno. Supplementary information: Supplementary data are available at Briefings in Bioinformatics online.
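The gene-oriented view described in the abstract above can be illustrated with a small Python sketch: once genes have been clustered into families, each genome is a set of family IDs, and families are classified by how many genomes contain them. This is only an illustration with hypothetical genome and family names, not the method of any tool reviewed in the paper, and it assumes the homology clustering has already been done.

```python
from collections import Counter


def partition_families(genomes):
    """Split gene families into core, accessory, and unique sets.

    `genomes` maps a genome name to the set of gene-family IDs it contains
    (the output of a homology-clustering step, assumed to exist already).
    """
    n_genomes = len(genomes)
    # Count in how many genomes each family occurs.
    counts = Counter(fam for families in genomes.values() for fam in families)
    core = {fam for fam, c in counts.items() if c == n_genomes}
    unique = {fam for fam, c in counts.items() if c == 1 and n_genomes > 1}
    accessory = set(counts) - core - unique
    return core, accessory, unique


# Hypothetical toy input: three genomes, five gene families.
toy_genomes = {
    "genome1": {"famA", "famB", "famC"},
    "genome2": {"famA", "famB", "famD"},
    "genome3": {"famA", "famC", "famE"},
}
core, accessory, unique = partition_families(toy_genomes)
```

In this toy input, `famA` is present in every genome (core), `famB` and `famC` occur in two of three genomes (accessory), and `famD` and `famE` each occur in a single genome (unique). Real tools differ mainly in how they build the families, which this sketch takes as given.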

RevDate: 2020-09-03

Zhu Z, Wang L, Qian H, et al (2020)

Comparative genome analysis of 12 Shigella sonnei strains: virulence, resistance, and their interactions.

International microbiology : the official journal of the Spanish Society for Microbiology pii:10.1007/s10123-020-00145-x [Epub ahead of print].

Shigellosis is a highly infectious disease that is mainly transmitted via fecal-oral contact with Shigella bacteria. Four species have been identified in the genus Shigella, among which Shigella flexneri used to be the most prevalent species globally and is commonly isolated in developing countries. However, it is being replaced by Shigella sonnei, which is currently the main causative agent of the dysentery pandemic in many emerging industrialized countries, such as those in Asia and the Middle East. For a better understanding of S. sonnei virulence and antibiotic resistance, we sequenced 12 clinical S. sonnei strains with varied antibiotic-resistance profiles collected from four cities in Jiangsu Province, China. Phylogenomic analysis clustered antibiotic-sensitive and resistant S. sonnei into two distinct groups, while pan-genome analysis revealed the presence and absence of unique genes in each group. Screening of 31 classes of virulence factors found that the type 2 secretion system is doubled in resistant strains. Further principal component analysis based on the interactions between virulence and resistance indicated that abundant virulence factors are associated with higher levels of antibiotic resistance. The results presented here are based on statistical analysis of a small sample size and serve basically as guidance for further experimental and theoretical studies.

RevDate: 2020-09-03

Muñoz-Ramirez ZY, Pascoe B, Mendez-Tenorio A, et al (2020)

A 500-year tale of co-evolution, adaptation, and virulence: Helicobacter pylori in the Americas.

The ISME journal pii:10.1038/s41396-020-00758-0 [Epub ahead of print].

Helicobacter pylori is a common component of the human stomach microbiota, possibly dating back to the speciation of Homo sapiens. A history of pathogen evolution in allopatry has led to the development of genetically distinct H. pylori subpopulations, associated with different human populations, and more recent admixture among H. pylori subpopulations can provide information about human migrations. However, little is known about the degree to which some H. pylori genes are conserved in the face of admixture, potentially indicating host adaptation, or how virulence genes spread among different populations. We analyzed H. pylori genomes from 14 countries in the Americas, strains from the Iberian Peninsula, and public genomes from Europe, Africa, and Asia, to investigate how admixture varies across different regions and gene families. Whole-genome analyses of 723 H. pylori strains from around the world showed evidence of frequent admixture in the American strains with a complex mosaic of contributions from H. pylori populations originating in the Americas as well as other continents. Despite the complex admixture, distinctive genomic fingerprints were identified for each region, revealing novel American H. pylori subpopulations. A pan-genome Fst analysis showed that variation in virulence genes had the strongest fixation in America, compared with non-American populations, and that much of the variation constituted non-synonymous substitutions in functional domains. Network analyses suggest that these virulence genes have followed unique evolutionary paths in the American populations, spreading into different genetic backgrounds, potentially contributing to the high risk of gastric cancer in the region.

RevDate: 2020-09-03

Carroll LM, Huisman JS, M Wiedmann (2020)

Twentieth-century emergence of antimicrobial resistant human- and bovine-associated Salmonella enterica serotype Typhimurium lineages in New York State.

Scientific reports, 10(1):14428 pii:10.1038/s41598-020-71344-9.

Salmonella enterica serotype Typhimurium (S. Typhimurium) boasts a broad host range and can be transmitted between livestock and humans. While members of this serotype can acquire resistance to antimicrobials, the temporal dynamics of this acquisition is not well understood. Using New York State (NYS) and its dairy cattle farms as a model system, 87 S. Typhimurium strains isolated from 1999 to 2016 from either human clinical or bovine-associated sources in NYS were characterized using whole-genome sequencing. More than 91% of isolates were classified into one of four major lineages, two of which were largely susceptible to antimicrobials but showed sporadic antimicrobial resistance (AMR) gene acquisition, and two that were largely multidrug-resistant (MDR). All four lineages clustered by presence and absence of elements in the pan-genome. The two MDR lineages, one of which resembled S. Typhimurium DT104, were predicted to have emerged circa 1960 and 1972. The two largely susceptible lineages emerged earlier, but showcased sporadic AMR determinant acquisition largely after 1960, including acquisition of cephalosporin resistance-conferring genes after 1985. These results confine the majority of AMR acquisition events in NYS S. Typhimurium to the twentieth century, largely within the era of antibiotic usage.

RevDate: 2020-09-03

Bellas CM, Schroeder DC, Edwards A, et al (2020)

Flexible genes establish widespread bacteriophage pan-genomes in cryoconite hole ecosystems.

Nature communications, 11(1):4403 pii:10.1038/s41467-020-18236-8.

Bacteriophage genomes rapidly evolve via mutation and horizontal gene transfer to counter evolving bacterial host defenses; such arms race dynamics should lead to divergence between phages from similar, geographically isolated ecosystems. However, near-identical phage genomes can reoccur over large geographical distances and several years apart, conversely suggesting many are stably maintained. Here, we show that phages with near-identical core genomes in distant, discrete aquatic ecosystems maintain diversity by possession of numerous flexible gene modules, where homologous genes present in the pan-genome interchange to create new phage variants. By repeatedly reconstructing the core and flexible regions of phage genomes from different metagenomes, we show that a pool of homologous gene variants co-exists for each module in each location; however, the dominant variant shuffles independently in each module. These results suggest that in a natural community, recombination is the largest contributor to phage diversity, allowing a variety of host recognition receptors and genes to counter bacterial defenses to co-exist for each phage.

RevDate: 2020-08-27

Alam I, Kamau AA, Kulmanov M, et al (2020)

Functional Pangenome Analysis Shows Key Features of E Protein Are Preserved in SARS and SARS-CoV-2.

Frontiers in cellular and infection microbiology, 10:405.

The spread of the novel coronavirus (SARS-CoV-2) has triggered a global emergency that demands urgent solutions for detection and therapy to prevent escalating health, social, and economic impacts. The spike protein (S) of this virus enables binding to the human receptor ACE2, and hence presents a prime target for vaccines preventing viral entry into host cells. The S proteins from SARS and SARS-CoV-2 are similar, but structural differences in the receptor binding domain (RBD) preclude the use of SARS-specific neutralizing antibodies to inhibit SARS-CoV-2. Here we used comparative pangenomic analysis of all sequenced reference Betacoronaviruses, complemented with functional and structural analyses. This analysis reveals that, among all core gene clusters present in these viruses, the envelope protein E shows a variant cluster shared by SARS and SARS-CoV-2 with two completely conserved key functional features, namely an ion channel and a PDZ-binding motif (PBM). These features play a key role in the activation of the inflammasome causing the acute respiratory distress syndrome, the leading cause of death in SARS and SARS-CoV-2 infections. Together with functional pangenomic analysis, mutation tracking, and previous evidence on E protein as a determinant of pathogenicity in SARS, we suggest E protein as an alternative therapeutic target to be considered for further studies to reduce complications of SARS-CoV-2 infections in COVID-19.

RevDate: 2020-08-27

Kumar R, Bröms JE, A Sjöstedt (2020)

Exploring the Diversity Within the Genus Francisella - An Integrated Pan-Genome and Genome-Mining Approach.

Frontiers in microbiology, 11:1928.

Pan-genome analysis is a powerful method to explore genomic heterogeneity and diversity of bacterial species. Here we present a pan-genome analysis of the genus Francisella, comprising a dataset of 63 genomes and encompassing clinical as well as environmental isolates from distinct geographic locations. To determine the evolutionary relationship within the genus, we performed phylogenetic whole-genome studies utilizing the average nucleotide identity, average amino acid identity, core genes and non-recombinant loci markers. Based on the analyses, the phylogenetic trees obtained identified two distinct clades, A and B and a diverse cluster designated C. The sizes of the pan-, core-, cloud-, and shell-genomes of Francisella were estimated and compared to those of two other facultative intracellular pathogens, Legionella and Piscirickettsia. Francisella had the smallest core-genome, 692 genes, compared to 886 and 1,732 genes for Legionella and Piscirickettsia respectively, while the pan-genome of Legionella was more than twice the size of that of the other two genera. Also, the composition of the Francisella Type VI secretion system (T6SS) was analyzed. Distinct differences in the gene content of the T6SS were identified. In silico approaches performed to identify putative substrates of these systems revealed potential effectors targeting the cell wall, inner membrane, cellular nucleic acids as well as proteins, thus constituting attractive targets for site-directed mutagenesis. The comparative analysis performed here provides a comprehensive basis for the assessment of the phylogenomic relationship of members of the genus Francisella and for the identification of putative T6SS virulence traits.

RevDate: 2020-08-27

Bannantine JP, Conde C, Bayles DO, et al (2020)

Genetic Diversity Among Mycobacterium avium Subspecies Revealed by Analysis of Complete Genome Sequences.

Frontiers in microbiology, 11:1701.

Mycobacterium avium comprises four subspecies that contain both human and veterinary pathogens. At the inception of this study, twenty-eight M. avium genomes had been annotated as RefSeq genomes, facilitating direct comparisons. These genomes represent strains from around the world and provided a unique opportunity to examine genome dynamics in this species. Each genome was confirmed to be classified correctly based on SNP genotyping, nucleotide identity and presence/absence of repetitive elements or other typing methods. The Mycobacterium avium subspecies paratuberculosis (Map) genome size and organization were remarkably consistent, averaging 4.8 Mb with a variance of only 29.6 kb among the 13 strains. Comparing recombination events along with the larger genome size and variance observed among Mycobacterium avium subspecies avium (Maa) and Mycobacterium avium subspecies hominissuis (Mah) strains (collectively termed non-Map) suggests horizontal gene transfer occurs in non-Map, but not in Map strains. Overall, M. avium subspecies could be divided into two major sub-divisions, with the Map type II (bovine strains) clustering tightly on one end of a phylogenetic spectrum and Mah strains clustering more loosely together on the other end. The most evolutionarily distinct Map strain was an ovine strain, designated Telford, which had >1,000 SNPs and showed large rearrangements compared to the bovine type II strains. The Telford strain clustered with Maa strains as an intermediate between Map type II and Mah. SNP analysis and genome organization analyses repeatedly demonstrated the conserved nature of Map versus the mosaic nature of non-Map M. avium strains. Finally, core and pangenomes were developed for Map and non-Map strains. A total of 80% of Map genes belonged to the Map core genome, while only 40% of non-Map genes belonged to the non-Map core genome.
These genomes provide a more complete and detailed comparison of these subspecies strains as well as a blueprint for how genetic diversity originated.

RevDate: 2020-08-27

Costa SS, Guimarães LC, Silva A, et al (2020)

First Steps in the Analysis of Prokaryotic Pan-Genomes.

Bioinformatics and biology insights, 14:1177932220938064.

The pan-genome is defined as the set of orthologous and unique genes of a specific group of organisms. It is composed of the core genome, accessory genome, and species- or strain-specific genes. The pan-genome is considered open or closed based on the alpha value of Heaps' law. In an open pan-genome, the number of gene families will continuously increase with the addition of new genomes to the analysis, while in a closed pan-genome, the number of gene families will not increase considerably. The first step of a pan-genome analysis is the homogenization of genome annotation. The same software, such as GeneMark or RAST, should be used to annotate all genomes. Subsequently, several software tools, such as BPGA, GET_HOMOLOGUES, and PGAP, among others, are used to calculate the pan-genome. This review presents all these initial steps for those who want to perform a pan-genome analysis, explaining key concepts of the area. Furthermore, we present the pan-genomic analysis of 9 bacterial species. These are the species with the highest number of genomes deposited in GenBank. We also show the influence of the identity and coverage parameters on the prediction of orthologous and paralogous genes. Finally, we cite the perspectives of several research areas where pan-genome analysis can be used to answer important issues.
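The open/closed distinction mentioned in the abstract above can be sketched numerically. The Python example below assumes the commonly used power-law form n(N) = kappa * N^(-alpha), where n is the number of new gene families found when the N-th genome is added, and treats alpha <= 1 as open and alpha > 1 as closed. Both the functional form and the cutoff are assumptions taken from common usage in the pan-genome literature, not from the abstract, and the function names and synthetic data are hypothetical.

```python
import math


def fit_heaps_alpha(points):
    """Least-squares fit of n = kappa * N**(-alpha) in log-log space.

    `points` is a list of (N, n) pairs with N >= 1 and n > 0, where n is the
    number of new gene families observed when the N-th genome is added.
    Returns (kappa, alpha).
    """
    xs = [math.log(n_genomes) for n_genomes, _ in points]
    ys = [math.log(new_genes) for _, new_genes in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # slope equals -alpha
    kappa = math.exp(mean_y - slope * mean_x)
    return kappa, -slope


def classify_pangenome(alpha):
    """Assumed convention: alpha <= 1 is open, alpha > 1 is closed."""
    return "open" if alpha <= 1.0 else "closed"


# Synthetic, noise-free data generated from the model itself (not real counts).
slow_decay = [(n, 100.0 * n ** -0.5) for n in range(2, 21)]
fast_decay = [(n, 100.0 * n ** -1.5) for n in range(2, 21)]
_, alpha_slow = fit_heaps_alpha(slow_decay)
_, alpha_fast = fit_heaps_alpha(fast_decay)
```

Here `classify_pangenome(alpha_slow)` gives "open" and `classify_pangenome(alpha_fast)` gives "closed". With real data the counts are noisy and depend on the order in which genomes are added, so published analyses typically average over many random orderings before fitting.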

RevDate: 2020-08-20

Zhou L, Zhang T, Tang S, et al (2020)

Pan-genome analysis of Paenibacillus polymyxa strains reveals the mechanism of plant growth promotion and biocontrol.

Antonie van Leeuwenhoek pii:10.1007/s10482-020-01461-y [Epub ahead of print].

Rapid development of gene sequencing technologies has led to an exponential increase in microbial sequencing data. Genome research of a single organism does not capture the changes in the characteristics of genetic information within a species. Pan-genome analysis gives us a broader perspective to study the complete genetic information of a species. Paenibacillus polymyxa is a Gram-positive bacterium and an important plant growth-promoting rhizobacterium with the ability to produce multiple antibiotics, such as fusaricidin, lantibiotic, paenilan, and polymyxin. Our study explores the pan-genome of 14 representative P. polymyxa strains isolated from around the world. Heaps' law model and curve fitting confirmed an open pan-genome of P. polymyxa. The phylogenetic and collinearity analyses showed that the evolutionary classification of P. polymyxa strains is not associated with geographical area or ecological niche. Few genes related to phytohormone synthesis and phosphate solubilization were conserved; however, the nif gene cluster associated with nitrogen fixation exists only in some strains. This finding indicates that nitrogen-fixing ability is not stable in P. polymyxa. Analysis of antibiotic gene clusters in P. polymyxa revealed the presence of these genes in both core and accessory genomes. This observation indicates that differences in living environment led to loss of the ability to synthesize antibiotics in some strains. The current pan-genomic analysis of P. polymyxa will help us understand the mechanisms of biological control and plant growth promotion. It will also promote the use of P. polymyxa in agriculture.

RevDate: 2020-08-17

Ouyabe M, Tanaka N, Shiwa Y, et al (2020)

Rhizobium dioscoreae sp. nov., a plant growth-promoting bacterium isolated from yam (Dioscorea species).

International journal of systematic and evolutionary microbiology [Epub ahead of print].

This study investigated endophytic nitrogen-fixing bacteria isolated from two species of yam (water yam, Dioscorea alata L.; lesser yam, Dioscorea esculenta L.) grown in nutrient-poor alkaline soil conditions on Miyako Island, Okinawa, Japan. Two bacterial strains of the genus Rhizobium, S-93T and S-62, were isolated. The phylogenetic tree, based on the almost-complete 16S rRNA gene sequences (1476 bp for each strain), placed them in a distinct clade, with Rhizobium miluonense CCBAU 41251T, Rhizobium hainanense I66T, Rhizobium multihospitium HAMBI 2975T, Rhizobium freirei PRF 81T and Rhizobium tropici CIAT 899T being their closest species. Their bacterial fatty acid profile, with major components of C19 : 0 cyclo ω8c and summed feature 8, as well as other phenotypic characteristics and DNA G+C content (59.65 mol%) indicated that the novel strains belong to the genus Rhizobium. Pairwise average nucleotide identity analyses separated the novel strains from their most closely related species with similarity values of 90.5, 88.9, 88.5, 84.5 and 84.4 % for R. multihospitium HAMBI 2975T, R. tropici CIAT 899T, R. hainanense CCBAU 57015T, R. miluonense HAMBI 2971T and R. freirei PRF 81T, respectively; digital DNA-DNA hybridization values were in the range of 26-42 %. Considering the phenotypic characteristics as well as the genomic data, it is suggested that strains S-93T and S-62 represent a new species, for which the name Rhizobium dioscoreae is proposed. The type strain is S-93T (=NRIC 0988T=NBRC 114257T=DSM 110498T).

RevDate: 2020-08-13

Clawson ML, Schuller G, Dickey AM, et al (2020)

Differences between predicted outer membrane proteins of genotype 1 and 2 Mannheimia haemolytica.

BMC microbiology, 20(1):250 pii:10.1186/s12866-020-01932-2.

BACKGROUND: Mannheimia haemolytica strains isolated from North American cattle have been classified into two genotypes (1 and 2). Although members of both genotypes have been isolated from the upper and lower respiratory tracts of cattle with or without bovine respiratory disease (BRD), genotype 2 strains are much more frequently isolated from diseased lungs than genotype 1 strains. The mechanisms behind the increased association of genotype 2 M. haemolytica with BRD are not fully understood. To address that, and to search for interventions against genotype 2 M. haemolytica, complete, closed chromosome assemblies for 35 genotype 1 and 34 genotype 2 strains were generated and compared. Searches were conducted for the pan genome, core genes shared between the genotypes, and for genes specific to either genotype. Additionally, genes encoding outer membrane proteins (OMPs) specific to genotype 2 M. haemolytica were identified, and the diversity of their protein isoforms was characterized with predominantly unassembled, short-read genomic sequences for up to 1075 additional strains.

RESULTS: The pan genome of the 69 sequenced M. haemolytica strains consisted of 3111 genes, of which 1880 comprised a shared core between the genotypes. A core of 112 and 179 genes or gene variants were specific to genotype 1 and 2, respectively. Seven genes encoding predicted OMPs (a peptidase S6, a ligand-gated channel, an autotransporter outer membrane beta-barrel domain-containing protein (AOMB-BD-CP), a porin, and three different trimeric autotransporter adhesins) were specific to genotype 2, as their genotype 1 homologs were either pseudogenes or not detected. The AOMB-BD-CP gene, however, appeared to be truncated across all examined genotype 2 strains and likely to encode a dysfunctional protein. Homologous gene sequences from additional M. haemolytica strains confirmed the specificity of the remaining six genotype 2 OMP genes and revealed they encoded low isoform diversity at the population level.

CONCLUSION: Genotype 2 M. haemolytica possess genes encoding conserved OMPs not found intact in more commensally prone genotype 1 strains. Some of the genotype 2 specific genes identified in this study are likely to have important biological roles in the pathogenicity of genotype 2 M. haemolytica, which is the primary bacterial cause of BRD.
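
The pan genome, shared core, and genotype-specific counts reported above can be illustrated with plain set operations. The following is a minimal sketch, not the authors' pipeline: the function name, the toy gene labels, and the definition of "genotype-specific" (present in every strain of one genotype and in no strain of the other) are assumptions made for illustration.

```python
from typing import Dict, Set, Tuple


def summarize_pangenome(
    genes_by_strain: Dict[str, Set[str]],
    group_of: Dict[str, str],
) -> Tuple[Set[str], Set[str], Dict[str, Set[str]]]:
    """Return (pan genome, shared core, group-specific genes per group)."""
    # Pan genome: every gene seen in at least one strain.
    pan = set().union(*genes_by_strain.values())
    # Shared core: genes present in every strain, regardless of group.
    core = set.intersection(*genes_by_strain.values())

    specific: Dict[str, Set[str]] = {}
    for group in set(group_of.values()):
        inside = [g for s, g in genes_by_strain.items() if group_of[s] == group]
        outside = [g for s, g in genes_by_strain.items() if group_of[s] != group]
        in_every_member = set.intersection(*inside)
        in_any_other = set().union(*outside)
        # Assumed definition: in all strains of this group, in none of the others.
        specific[group] = in_every_member - in_any_other
    return pan, core, specific


# Hypothetical toy data: four strains, two genotypes.
toy_genes = {
    "g1_a": {"A", "B", "C"},
    "g1_b": {"A", "B", "C", "E"},
    "g2_a": {"A", "B", "D"},
    "g2_b": {"A", "B", "D", "F"},
}
toy_groups = {"g1_a": "1", "g1_b": "1", "g2_a": "2", "g2_b": "2"}

pan, core, specific = summarize_pangenome(toy_genes, toy_groups)
print(len(pan), sorted(core), {k: sorted(v) for k, v in specific.items()})
```

In real analyses the gene sets would come from an ortholog clustering step, which this sketch leaves out.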

RevDate: 2020-08-12

Xu S, Cheng J, Meng X, et al (2020)

Complete Genome and Comparative Genome Analysis of Lactobacillus reuteri YSJL-12, a Potential Probiotics Strain Isolated From Healthy Sow Fresh Feces.

Evolutionary bioinformatics online, 16:1176934320942192 pii:10.1177_1176934320942192.

Lactobacillus reuteri YSJL-12 was isolated from healthy sow fresh feces and used as probiotics additives previously. To investigate the genetic basis on probiotic potential and identify the genes in the strain, the complete genome of YSJL-12 was sequenced. Then comparative genome analysis on 9 strains of Lactobacillus reuteri was performed. The genome of YSJL-12 consisted of a circular 2,084,748 bp chromosome and 2 circular plasmids (51,906 and 15,134 bp). From among the 2065 protein-coding sequences (CDSs), the genes resistant to the environmental stress were identified. The function of COG (Clusters of Orthologous Group) protein genes was predicted, and the KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways were analyzed. The comparative genome analysis indicated that the pan-genome contained a core genome of 1257 orthologous gene clusters, an accessory genome of 1064 orthologous gene clusters, and 1148 strain-specific genes, and the antibacterial mechanism among Lactobacillus reuteri strains might be different. The phylogenetic analysis and genomic collinearity revealed that the phylogenetic relationship among 9 strains of Lactobacillus reuteri was connected with host species and showed host specificity. The research could help us to better predict genes function and understand genetic basis on adapting to host gut in Lactobacillus reuteri YSJL-12.

RevDate: 2020-08-11

Bernardes JS, Eberle RJ, Vieira FRJ, et al (2020)

A comparative pan-genomic analysis of 53 C. pseudotuberculosis strains based on functional domains.

Journal of biomolecular structure & dynamics [Epub ahead of print].

Corynebacterium pseudotuberculosis is a pathogenic bacterium with great veterinary and economic importance. It is classified into two biovars: ovis, nitrate-negative, that causes lymphadenitis in small ruminants and equi, nitrate-positive, causing ulcerative lymphangitis in equines. With the explosive growth of available genomes of several strains, pan-genome analysis has opened new opportunities for understanding the dynamics and evolution of C. pseudotuberculosis. However, few pan-genomic studies have compared biovars equi and ovis. Such studies have considered a reduced number of strains and compared entire genomes. Here we conducted an original pan-genome analysis based on protein sequences and their functional domains. We considered 53 C. pseudotuberculosis strains from both biovars isolated from different hosts and countries. We have analysed conserved domains, common domains more frequently found in each biovar and biovar-specific (unique) domains. Our results demonstrated that biovar equi is more variable; there is a significant difference in the number of proteins per strains, probably indicating the occurrence of more gene loss/gain events. Moreover, strains of biovar equi presented a higher number of biovar-specific domains, 77 against only eight in biovar ovis, most of them are associated with virulence mechanisms. With this domain analysis, we have identified functional differences among strains of biovars ovis and equi that could be related to niche-adaptation and probably help to better understanding mechanisms of virulence and pathogenesis. The distribution patterns of functional domains identified in this work might have impacts on bacterial physiology and lifestyle, encouraging the development of new diagnoses, vaccines, and treatments for C. pseudotuberculosis diseases. Communicated by Ramaswamy H. Sarma.

RevDate: 2020-08-08

Pan Y, Awan F, Zhenbao M, et al (2020)

Preliminary view of the global distribution and spread of the tet(X) family of tigecycline resistance genes.

The Journal of antimicrobial chemotherapy pii:5885053 [Epub ahead of print].

BACKGROUND: The emergence of plasmid-mediated tet(X3)/tet(X4) genes is threatening the role of tigecycline as a last-resort antibiotic to treat clinical infections caused by XDR bacteria. Considering the possible public health threat posed by tet(X) and its variants [which we collectively call 'tet(X) genes' in this study], global monitoring and surveillance are urgently required.

OBJECTIVES: Here we conducted a worldwide survey of the global distribution and spread of tet(X) genes.

METHODS: We analysed a comprehensive dataset of bacterial genomes in conjunction with surveillance data from our laboratory and the NCBI database, as well as sufficient metadata to characterize the results.

RESULTS: The global distribution features of tet(X) genes were revealed. We clustered three types of genetic backbones of tet(X) genes embedded or transferred in bacterial genomes. Our pan-genome analyses revealed a large genetic pool composed of tet(X)-carrying sequences. Moreover, phylogenetic trees of tet(X) genes and tet(X)-like proteins were built.

CONCLUSIONS: To the best of our knowledge, our results provide the first view of the global distribution of tet(X) genes, demonstrate the features of tet(X)-carrying fragments and highlight the possible evolution of tigecycline-inactivation enzymes in diverse bacterial species and habitats.

RevDate: 2020-08-08

Santos DDS, Calaça PRA, Porto ALF, et al (2020)

What Differentiates Probiotic from Pathogenic Bacteria? The Genetic Mobility of Enterococcus faecium Offers New Molecular Insights.

Omics : a journal of integrative biology [Epub ahead of print].

Enterococcus faecium is a lactic acid bacterium with applications in food engineering and nutrigenomics, including as starter cultures in fermented foods. To differentiate the E. faecium probiotic from pathogenic bacteria, physiological analyses are often used but they do not guarantee that a bacterial strain is not pathogenic. We report here new findings and an approach based on comparison of the genetic mobility of (1) probiotic, (2) pathogenic, and (3) nonpathogenic and non-probiotic strains, so as to differentiate probiotics, and inform their safe use. The region of the 16S ribosomal DNA (rDNA) genes of different E. faecium strains native to Pernambuco-Brazil was used with the GenBank query sequence. Complete genomes were selected and divided into three groups as noted above to identify the mobile genetic elements (MGEs) (transposase, integrase, conjugative transposon protein and phage) and antibiotic resistance genes (ARGs), and to undertake pan-genome analysis and multiple genome alignment. Differences in the number of MGEs were found in ARGs, in the presence and absence of the genes that differentiate E. faecium probiotics and pathogenic bacteria genetically. Our data suggest that genetic mobility appears to be informative in differentiating between probiotic and pathogenic strains. While the present findings are not necessarily applicable to all probiotics, they offer novel molecular insights to guide future research in nutrigenomics, clinical medicine, and food engineering on new ways to differentiate pathogenic from probiotic bacteria.

RevDate: 2020-08-07

Son S, Oh JD, Lee SH, et al (2020)

Comparative genomics of canine Lactobacillus reuteri reveals adaptation to a shared environment with humans.

Genes & genomics pii:10.1007/s13258-020-00978-w [Epub ahead of print].

BACKGROUND: Lactobacillus reuteri is a gram-positive, non-motile bacterial species that has been used as a representative microorganism model to describe the ecology and evolution of vertebrate gut symbionts.

OBJECTIVE: Because the genetic features and evolutionary strategies of L. reuteri from the gastrointestinal tract of canines remain unknown, we aimed to construct a draft genome of canine L. reuteri and to investigate modified, acquired, or lost genetic features that have facilitated the evolution and adaptation of strains to specific environmental niches.

METHODS: To examine canine L. reuteri, we sequenced an L. reuteri strain isolated from a dog in Korea. A comparative genomic approach was used to assess genetic diversity and gain insight into the distinguishing features related to different hosts based on 27 published genomic sequences.

RESULTS: The pan-genome of 28 L. reuteri strains contained 7,369 gene families, and the core genome contained 1070 gene families. The ANI tree based on the core genes in the canine L. reuteri strain (C1) was very close to those for three strains (IRT, DSM20016, JCM1112) from humans. Evolutionarily, these four strains formed one clade, which we regarded as C1-clade in this study. We could investigate a total of 32,050 amino acid substitutions among the 28 L. reuteri strain genomes. In this comparison, 283 amino acid substitutions were specific to strain C1 and four strains in C1-clade shared most of these 283 C1-strain specific amino acid substitutions, suggesting strongly similar selective pressure. In accessory genes, we could identify 127 C1-clade host-specific genes and found that several genes were closely related to replication, recombination, and repair.

CONCLUSION: This study provides new insights into the adaptation of L. reuteri to the canine intestinal habitat, and suggests that the genome of L. reuteri from canines is closely associated with their living and shared environment with humans.

RevDate: 2020-08-07

Botelho J, Grosso F, L Peixe (2020)

ICEs Are the Main Reservoirs of the Ciprofloxacin-Modifying crpP Gene in Pseudomonas aeruginosa.

Genes, 11(8): pii:genes11080889.

The ciprofloxacin-modifying crpP gene was recently identified in a plasmid isolated from a Pseudomonas aeruginosa clinical isolate. Homologues of this gene were also identified in Escherichia coli, Klebsiella pneumoniae and Acinetobacter baumannii. We set out to explore the mobile elements involved in the acquisition and spread of this gene in publicly available and complete genomes of Pseudomonas spp. All Pseudomonas complete genomes were downloaded from NCBI's Refseq library and were inspected for the presence of the crpP gene. The mobile elements carrying this gene were further characterized. The crpP gene was identified only in P. aeruginosa, in more than half of the complete chromosomes (61.9%, n = 133/215) belonging to 52 sequence types, of which the high-risk clone ST111 was the most frequent. We identified 136 crpP-harboring integrative and conjugative elements (ICEs), with 93.4% belonging to the mating-pair formation G (MPFG) family. The ICEs were integrated at the end of a tRNALys gene and were all flanked by highly conserved 45-bp direct repeats. The crpP-carrying ICEs contain 26 core genes (2.2% of all 1193 genes found in all the ICEs together), which are present in 99% or more of the crpP-harboring ICEs. The most frequently encoded traits on these ICEs include replication, transcription, intracellular trafficking and cell motility. Our work suggests that ICEs are the main vectors promoting the dissemination of the ciprofloxacin-modifying crpP gene in P. aeruginosa.
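
The abstract defines ICE core genes by a prevalence threshold (present in 99% or more of the crpP-harboring ICEs). A minimal sketch of that kind of threshold, together with a check of the two percentages quoted above, might look as follows. The function name and the toy gene sets are hypothetical, and the original study may have computed prevalence differently.

```python
from collections import Counter
from typing import List, Set


def genes_at_or_above(elements: List[Set[str]], min_fraction: float) -> Set[str]:
    """Genes present in at least `min_fraction` of the given elements."""
    counts = Counter(gene for element in elements for gene in element)
    needed = min_fraction * len(elements)
    return {gene for gene, seen in counts.items() if seen >= needed}


# Hypothetical toy data: four elements, threshold of 75% for readability.
toy_elements = [
    {"int", "rep", "crpP"},
    {"int", "rep", "crpP", "x1"},
    {"int", "crpP", "x2"},
    {"int", "rep", "crpP"},
]
print(sorted(genes_at_or_above(toy_elements, 0.75)))

# Percentages quoted in the abstract, recomputed from the stated counts.
print(round(100 * 133 / 215, 1))   # share of complete chromosomes with crpP
print(round(100 * 26 / 1193, 1))   # share of ICE genes that are core
```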

RevDate: 2020-08-05

Petit RA, TD Read (2020)

Bactopia: a Flexible Pipeline for Complete Analysis of Bacterial Genomes.

mSystems, 5(4): pii:5/4/e00190-20.

Sequencing of bacterial genomes using Illumina technology has become such a standard procedure that often data are generated faster than can be conveniently analyzed. We created a new series of pipelines called Bactopia, built using Nextflow workflow software, to provide efficient comparative genomic analyses for bacterial species or genera. Bactopia consists of a data set setup step (Bactopia Data Sets [BaDs]), which creates a series of customizable data sets for the species of interest, the Bactopia Analysis Pipeline (BaAP), which performs quality control, genome assembly, and several other functions based on the available data sets and outputs the processed data to a structured directory format, and a series of Bactopia Tools (BaTs) that perform specific postprocessing on some or all of the processed data. BaTs include pan-genome analysis, computing average nucleotide identity between samples, extracting and profiling the 16S genes, and taxonomic classification using highly conserved genes. It is expected that the number of BaTs will increase to fill specific applications in the future. As a demonstration, we performed an analysis of 1,664 public Lactobacillus genomes, focusing on Lactobacillus crispatus, a species that is a common part of the human vaginal microbiome. Bactopia is an open source system that can scale from projects as small as one bacterial genome to ones including thousands of genomes and that allows for great flexibility in choosing comparison data sets and options for downstream analysis. Bactopia code can be accessed at https://www.github.com/bactopia/bactopia.

IMPORTANCE: It is now relatively easy to obtain a high-quality draft genome sequence of a bacterium, but bioinformatic analysis requires organization and optimization of multiple open source software tools. We present Bactopia, a pipeline for bacterial genome analysis, as an option for processing bacterial genome data. Bactopia also automates downloading of data from multiple public sources and species-specific customization. Because the pipeline is written in the Nextflow language, analyses can be scaled from individual genomes on a local computer to thousands of genomes using cloud resources. As a usage example, we processed 1,664 Lactobacillus genomes from public sources and used comparative analysis workflows (Bactopia Tools) to identify and analyze members of the L. crispatus species.

RevDate: 2020-08-03

Tao Y, Jordan DR, ES Mace (2020)

A graph-based pan-genome guides biological discovery.

Molecular plant pii:S1674-2052(20)30256-2 [Epub ahead of print].

RevDate: 2020-08-03

Correia K, R Mahadevan (2020)

Pan-Genome-Scale Network Reconstruction: Harnessing Phylogenomics Increases the Quantity and Quality of Metabolic Models.

Biotechnology journal [Epub ahead of print].

BACKGROUND: A genome-scale network reconstruction (GENRE) represents a compendium of knowledge for an organism and is used in a variety of applications. Current practices limit the quantity and quality of GENREs. First, falling genome sequencing costs over the last decade has led to exponentially growing genome sequences, but the number of curated GENREs has not kept pace; this gap hinders our ability to study physiology throughout the tree of life. Second, the central metabolisms of existing yeast GENREs contain significant commission and omission errors; these inaccuracies limit the validity of metabolic simulations.

METHODS AND RESULTS: We outline an open and transparent framework to increase the quantity and quality of GENREs with phylogenomics. In this framework, research communities curate the pan-genome, pan-reactome, pan-metabolome, and pan-phenome for a group of organisms in a taxon, rather than for a single strain. We demonstrate our approach with 33 yeasts and fungi spanning 600 million years of evolution in the Dikarya subkingdom. We created a pan-fungal metabolic network called FYRMENT (Fungal and Yeast Metabolic Network) (https://github.com/LMSE/FYRMENT), and annotated reactions with ortholog groups from AYbRAH (https://github.com/LMSE/AYbRAH). We created metabolic models for every taxonomic level from subkingdom to strain using FYRMENT and AYbRAH. The fungal pan-GENRE contains 1553 orthologs, 2759 reactions, 2251 metabolites, and ten compartments. The strain-level GENREs have higher genomic and metabolic coverage than existing yeast and fungal GENREs created with other methods. Metabolic simulations show the maximum amino acid yields from glucose differs between yeast lineages, indicating metabolic networks have evolved in yeasts.

CONCLUSIONS: Curating ortholog and reaction databases at higher taxonomic levels increases the quantity and quality of strain GENREs relative to the common practice of using model organism GENREs as templates. This pan-GENRE framework provides the ability to scale high-quality GENREs to more branches in the tree of life.

RevDate: 2020-08-03

Parlikar A, Kalia K, Sinha S, et al (2020)

Understanding genomic diversity, pan-genome, and evolution of SARS-CoV-2.

PeerJ, 8:e9576 pii:9576.

Coronavirus disease 2019 (COVID-19) infection, which originated from Wuhan, China, has seized the whole world in its grasp and created a huge pandemic situation before humanity. Since December 2019, genomes of numerous isolates have been sequenced and analyzed for testing confirmation, epidemiology, and evolutionary studies. In the first half of this article, we provide a detailed review of the history and origin of COVID-19, followed by the taxonomy, nomenclature and genome organization of its causative agent Severe Acute Respiratory Syndrome-related Coronavirus-2 (SARS-CoV-2). In the latter half, we analyze subgenus Sarbecovirus (167 SARS-CoV-2, 312 SARS-CoV, and 5 Pangolin CoV) genomes to understand their diversity, origin, and evolution, along with pan-genome analysis of genus Betacoronavirus members. Whole-genome sequence-based phylogeny of subgenus Sarbecovirus genomes reasserted the fact that SARS-CoV-2 strains evolved from their common ancestors putatively residing in bat or pangolin hosts. We predicted a few country-specific patterns of relatedness and identified mutational hotspots with high, medium and low probability based on genome alignment of 167 SARS-CoV-2 strains. A total of 100-nucleotide segment-based homology studies revealed that the majority of the SARS-CoV-2 genome segments are close to Bat CoV, followed by some to Pangolin CoV, and some are unique ones. Open pan-genome of genus Betacoronavirus members indicates the diversity contributed by the novel viruses emerging in this group. Overall, the exploration of the diversity of these isolates, mutational hotspots and pan-genome will shed light on the evolution and pathogenicity of SARS-CoV-2 and help in developing putative methods of diagnosis and treatment.

RevDate: 2020-07-31

Söderlund R, Formenti N, Caló S, et al (2020)

Comparative genome analysis of Erysipelothrix rhusiopathiae isolated from domestic pigs and wild boars suggests host adaptation and selective pressure from the use of antibiotics.

Microbial genomics [Epub ahead of print].

The disease erysipelas caused by Erysipelothrix rhusiopathiae (ER) is a major concern in pig production. In the present study the genomes of ER from pigs (n=87), wild boars (n=71) and other sources (n=85) were compared in terms of whole-genome SNP variation, accessory genome content and the presence of genetic antibiotic resistance determinants. The aim was to investigate if genetic features among ER were associated with isolate origin in order to better estimate the risk of transmission of porcine-adapted strains from wild boars to free-range pigs and to increase our understanding of the evolution of ER. Pigs and wild boars carried isolates representing all ER clades, but clade one only occurred in healthy wild boars and healthy pigs. Several accessory genes or gene variants were found to be significantly associated with the pig and wild boar hosts, with genes predicted to encode cell wall-associated or extracellular proteins overrepresented. Gene variants associated with serovar determination and capsule production in serovars known to be pathogenic for pigs were found to be significantly associated with pigs as hosts. In total, 30 % of investigated pig isolates but only 6 % of wild boar isolates carried resistance genes, most commonly tetM (tetracycline) and lsa(E) together with lnu(B) (lincosamides, pleuromutilin and streptogramin A). The incidence of variably present genes including resistance determinants was weakly linked to phylogeny, indicating that host adaptation in ER has evolved multiple times in diverse lineages mediated by recombination and the acquisition of mobile genetic elements. The presented results support the occurrence of host-adapted ER strains, but they do not indicate frequent transmission between wild boars and domestic pigs. This article contains data hosted by Microreact.

RevDate: 2020-07-30

Derakhshani H, Bernier SP, Marko VA, et al (2020)

Completion of draft bacterial genomes by long-read sequencing of synthetic genomic pools.

BMC genomics, 21(1):519 pii:10.1186/s12864-020-06910-6.

BACKGROUND: Illumina technology currently dominates bacterial genomics due to its high read accuracy and low sequencing cost. However, the incompleteness of draft genomes generated by Illumina reads limits their application in comprehensive genomics analyses. Alternatively, hybrid assembly using both Illumina short reads and long reads generated by single molecule sequencing technologies can enable assembly of complete bacterial genomes, yet the high per-genome cost of long-read sequencing limits the widespread use of this approach in bacterial genomics. Here we developed a protocol for hybrid assembly of complete bacterial genomes using miniaturized multiplexed Illumina sequencing and non-barcoded PacBio sequencing of a synthetic genomic pool (SGP), thus significantly decreasing the overall per-genome cost of sequencing.

RESULTS: We evaluated the performance of SGP hybrid assembly on the genomes of 20 bacterial isolates with different genome sizes, a wide range of GC contents, and varying levels of phylogenetic relatedness. By improving the contiguity of Illumina assemblies, SGP hybrid assembly generated 17 complete and 3 nearly complete bacterial genomes. Increased contiguity of SGP hybrid assemblies resulted in considerable improvement in gene prediction and annotation. In addition, SGP hybrid assembly was able to resolve repeat elements and identify intragenomic heterogeneities, e.g. different copies of 16S rRNA genes, that would otherwise go undetected by short-read-only assembly. Comprehensive comparison of SGP hybrid assemblies with those generated using multiplexed PacBio long reads (long-read-only assembly) also revealed the relative advantage of SGP hybrid assembly in terms of assembly quality. In particular, we observed that SGP hybrid assemblies were completely devoid of both small (i.e. single base substitutions) and large assembly errors. Finally, we show the ability of SGP hybrid assembly to differentiate genomes of closely related bacterial isolates, suggesting its potential application in comparative genomics and pangenome analysis.

CONCLUSION: Our results indicate the superiority of SGP hybrid assembly over both short-read and long-read assemblies with respect to completeness, contiguity, accuracy, and recovery of small replicons. By lowering the per-genome cost of sequencing, our parallel sequencing and hybrid assembly pipeline could serve as a cost effective and high throughput approach for completing high-quality bacterial genomes.

RevDate: 2020-07-28

Haberer G, Kamal N, Bauer E, et al (2020)

European maize genomes highlight intraspecies variation in repeat and gene content.

Nature genetics pii:10.1038/s41588-020-0671-9 [Epub ahead of print].

The diversity of maize (Zea mays) is the backbone of modern heterotic patterns and hybrid breeding. Historically, US farmers exploited this variability to establish today's highly productive Corn Belt inbred lines from blends of dent and flint germplasm pools. Here, we report de novo genome sequences of four European flint lines assembled to pseudomolecules with scaffold N50 ranging from 6.1 to 10.4 Mb. Comparative analyses with two US Corn Belt lines explains the pronounced differences between both germplasms. While overall syntenic order and consolidated gene annotations reveal only moderate pangenomic differences, whole-genome alignments delineating the core and dispensable genome, and the analysis of heterochromatic knobs and orthologous long terminal repeat retrotransposons unveil the dynamics of the maize genome. The high-quality genome sequences of the flint pool complement the maize pangenome and provide an important tool to study maize improvement at a genome scale and to enhance modern hybrid breeding.
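
The scaffold N50 values quoted above follow the usual assembly-statistics definition: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. A minimal sketch of that calculation is given below. The function name and the toy lengths are illustrative assumptions and are unrelated to the maize assemblies themselves.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop at the first scaffold where the running sum reaches half the total.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Hypothetical toy lengths (arbitrary units).
print(n50([2, 3, 4, 5, 6]))
```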

RevDate: 2020-07-28

Muqaddasi QH, Brassac J, Ebmeyer E, et al (2020)

Prospects of GWAS and predictive breeding for European winter wheat's grain protein content, grain starch content, and grain hardness.

Scientific reports, 10(1):12541 pii:10.1038/s41598-020-69381-5.

Grain quality traits determine the classification of registered wheat (Triticum aestivum L.) varieties. Although environmental factors and crop management practices exert a considerable influence on wheat quality traits, a significant proportion of the variance is attributed to the genetic factors. To identify the underlying genetic factors of wheat quality parameters viz., grain protein content (GPC), grain starch content (GSC), and grain hardness (GH), we evaluated 372 diverse European wheat varieties in replicated field trials in up to eight environments. We observed that all of the investigated traits hold a wide and significant genetic variation, and a significant negative correlation exists between GPC and GSC plus grain yield. Our association analyses based on 26,694 high-quality single nucleotide polymorphic markers revealed a strong quantitative genetic nature of GPC and GSC with associations on groups 2, 3, and 6 chromosomes. The identification of known Puroindoline-b gene for GH provided a positive analytic proof for our studies. We report that a locus QGpc.ipk-6A controls both GPC and GSC with opposite allelic effects. Based on wheat's reference and pan-genome sequences, the physical characterization of two loci viz., QGpc.ipk-2B and QGpc.ipk-6A facilitated the identification of the candidate genes for GPC. Furthermore, by exploiting additive and epistatic interactions of loci, we evaluated the prospects of predictive breeding for the investigated traits that suggested its efficient use in the breeding programs.

RevDate: 2020-07-28

Flament-Simon SC, de Toro M, Chuprikova L, et al (2020)

High diversity and variability of pipolins among a wide range of pathogenic Escherichia coli strains.

Scientific reports, 10(1):12452 pii:10.1038/s41598-020-69356-6.

Self-synthesizing transposons are integrative mobile genetic elements (MGEs) that encode their own B-family DNA polymerase (PolB). Discovered a few years ago, they are proposed as key players in the evolution of several groups of DNA viruses and virus-host interaction machinery. Pipolins are the most recent addition to the group, are integrated in the genomes of bacteria from diverse phyla and also present as circular plasmids in mitochondria. Remarkably, pipolins-encoded PolBs are proficient DNA polymerases endowed with DNA priming capacity, hence the name, primer-independent PolB (piPolB). We have now surveyed the presence of pipolins in a collection of 2,238 human and animal pathogenic Escherichia coli strains and found that, although detected in only 25 positive isolates (1.1%), they are present in E. coli strains from a wide variety of pathotypes, serotypes, phylogenetic groups and sequence types. Overall, the pangenome of strains carrying pipolins is highly diverse, despite the fact that a considerable number of strains belong to only three clonal complexes (CC10, CC23 and CC32). Comparative analysis with a set of 67 additional pipolin-harboring genomes from GenBank database spanning strains from diverse origin, further confirmed these results. The genetic structure of pipolins shows great flexibility and variability, with the piPolB gene and the attachment sites being the only common features. Most pipolins contain one or more recombinases that would be involved in excision/integration of the element in the same conserved tRNA gene. This mobilization mechanism might explain the apparent incompatibility of pipolins with other integrative MGEs such as integrons. In addition, analysis of cophylogeny between pipolins and pipolin-harboring strains showed a lack of congruence between several pipolins and their host strains, in agreement with horizontal transfer between hosts. Overall, these results indicate that pipolins can serve as a vehicle for genetic transfer among circulating E. coli and possibly also among other pathogenic bacteria.

RevDate: 2020-07-28

Crysnanto D, H Pausch (2020)

Bovine breed-specific augmented reference graphs facilitate accurate sequence read mapping and unbiased variant discovery.

Genome biology, 21(1):184 pii:10.1186/s13059-020-02105-0.

BACKGROUND: The current bovine genomic reference sequence was assembled from a Hereford cow. The resulting linear assembly lacks diversity because it does not contain allelic variation, a drawback of linear references that causes reference allele bias. High nucleotide diversity and the separation of individuals by hundreds of breeds make cattle ideally suited to investigate the optimal composition of variation-aware references.

RESULTS: We augment the bovine linear reference sequence (ARS-UCD1.2) with variants filtered for allele frequency in dairy (Brown Swiss, Holstein) and dual-purpose (Fleckvieh, Original Braunvieh) cattle breeds to construct either breed-specific or pan-genome reference graphs using the vg toolkit. We find that read mapping is more accurate to variation-aware than linear references if pre-selected variants are used to construct the genome graphs. Graphs that contain random variants do not improve read mapping over the linear reference sequence. Breed-specific augmented and pan-genome graphs enable almost similar mapping accuracy improvements over the linear reference. We construct a whole-genome graph that contains the Hereford-based reference sequence and 14 million alleles that have alternate allele frequency greater than 0.03 in the Brown Swiss cattle breed. Our novel variation-aware reference facilitates accurate read mapping and unbiased sequence variant genotyping for SNPs and Indels.

CONCLUSIONS: We develop the first variation-aware reference graph for an agricultural animal (https://doi.org/10.5281/zenodo.3759712). Our novel reference structure improves sequence read mapping and variant genotyping over the linear reference. Our work is a first step towards the transition from linear to variation-aware reference structures in species with high genetic diversity and many sub-populations.
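
The variant pre-selection step described above (keeping alleles whose alternate allele frequency exceeds 0.03 in a breed before graph construction) can be sketched as a simple filter. This is an illustration only: the `Variant` record, its field names, and `select_for_graph` are hypothetical, and the sketch does not reproduce the vg toolkit's interface or the authors' actual filtering code.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Variant:
    chrom: str
    pos: int
    ref: str
    alt: str
    alt_count: int       # alternate alleles observed in the breed (assumed field)
    total_alleles: int   # called alleles in the breed (assumed field)


def select_for_graph(variants: Iterable[Variant], min_af: float = 0.03) -> List[Variant]:
    """Keep variants whose alternate allele frequency is strictly above min_af."""
    kept = []
    for variant in variants:
        if variant.total_alleles == 0:
            continue  # no genotype calls, so the frequency is undefined
        if variant.alt_count / variant.total_alleles > min_af:
            kept.append(variant)
    return kept


# Hypothetical toy records.
toy_variants = [
    Variant("1", 100, "A", "G", alt_count=50, total_alleles=100),
    Variant("1", 200, "C", "T", alt_count=2, total_alleles=100),
    Variant("2", 300, "G", "GA", alt_count=0, total_alleles=0),
]
print([v.pos for v in select_for_graph(toy_variants)])
```

The retained variants would then be passed, together with the linear reference, to whatever graph construction tool is in use.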

RevDate: 2020-07-28

Yin Z, Liu J, Du B, et al (2020)

Whole-Genome-Based Survey for Polyphyletic Serovars of Salmonella enterica subsp. enterica Provides New Insights into Public Health Surveillance.

International journal of molecular sciences, 21(15): pii:ijms21155226.

Serotyping has traditionally been considered the basis for surveillance of Salmonella, but it cannot distinguish distinct lineages sharing the same serovar that vary in host range, pathogenicity and epidemiology. However, polyphyletic serovars have not been extensively investigated. Public health microbiology is currently being transformed by whole-genome sequencing (WGS) data, which promote the lineage determination using a more powerful and accurate technique than serotyping. The focus in this study is to survey and analyze putative polyphyletic serovars. The multi-locus sequence typing (MLST) phylogenetic analysis identified four putative polyphyletic serovars, namely, Montevideo, Bareilly, Saintpaul, and Muenchen. Whole-genome-based phylogeny and population structure highlighted the polyphyletic nature of Bareilly and Saintpaul and the multi-lineage nature of Montevideo and Muenchen. The population of these serovars was defined by extensive genetic diversity, the open pan genome and the small core genome. Source niche metadata revealed putative existence of lineage-specific niche adaptation (host-preference and environmental-preference), exhibited by lineage-specific genomic contents associated with metabolism and transport. Meanwhile, differences in genetic profiles relating to virulence and antimicrobial resistance within each lineage may contribute to pathogenicity and epidemiology. The results also showed that recombination events occurring at the H1-antigen loci may be an important reason for polyphyly. The results presented here provide the genomic basis of simple, rapid, and accurate identification of phylogenetic lineages of these serovars, which could have important implications for public health.

RevDate: 2020-07-27

Fang H, Xu JB, Nie Y, et al (2020)

Pan-genomic analysis reveals that the evolution of Dietzia species depends on their living habitats.

Environmental microbiology [Epub ahead of print].

The bacterial genus Dietzia is widely distributed in various environments. The genomes of 26 diverse strains of Dietzia, including almost all the type strains, were analyzed in this study. This analysis revealed a lipid metabolism gene richness, which could explain the ability of Dietzia to live in oil related environments. The pan-genome consists of 83,976 genes assigned into 10,327 gene families, 792 of which are shared by all the genomes of Dietzia. Mathematical extrapolation of the data suggests that the Dietzia pan-genome is open. Both gene duplication and gene loss contributed to the open pan-genome, while horizontal gene transfer was limited. Dietzia strains primarily gained their diverse metabolic capacity through more ancient gene duplications. Phylogenetic analysis of Dietzia isolated from aquatic and terrestrial environments showed two distinct clades from the same ancestor. The genome sizes of Dietzia strains from aquatic environments were significantly larger than those from terrestrial environments, which was mainly due to the occurrence of more gene loss events during the evolutionary progress of the strains from terrestrial environments. The evolutionary history of Dietzia was tightly coupled to environmental conditions, and iron concentrations should be one of the key factors shaping the genomes of the Dietzia lineages. This article is protected by copyright. All rights reserved.
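
The gene-family counts quoted above (families in the pan-genome versus families shared by every genome) can be illustrated with a small sketch. This is not the authors' pipeline: the function name, the toy strain names and the family IDs are hypothetical, and it assumes genes have already been clustered into families by some upstream tool.

```python
from collections import Counter


def partition_pangenome(genomes):
    """Split gene families into pan, core, accessory and unique sets.

    genomes: dict mapping a genome name to the set of gene-family IDs
    it carries (hypothetical input; clustering is assumed done upstream).
    """
    n_genomes = len(genomes)
    # Count in how many genomes each family occurs.
    counts = Counter(fam for fams in genomes.values() for fam in set(fams))
    core = {fam for fam, c in counts.items() if c == n_genomes}
    # A family is "unique" only when there is more than one genome to compare.
    unique = {fam for fam, c in counts.items() if c == 1 and n_genomes > 1}
    accessory = set(counts) - core - unique
    return {"pan": set(counts), "core": core,
            "accessory": accessory, "unique": unique}


# Toy data, invented for illustration only.
toy_genomes = {
    "strain_A": {"fam1", "fam2", "fam3"},
    "strain_B": {"fam1", "fam2", "fam4"},
    "strain_C": {"fam1", "fam5"},
}
parts = partition_pangenome(toy_genomes)
for name in ("pan", "core", "accessory", "unique"):
    print(name, len(parts[name]))
```

In this sketch "accessory" means families present in more than one but not all genomes; other studies fold the unique families into the accessory set, so the definitions should be checked before comparing numbers across papers.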

RevDate: 2020-07-27

Moreno-Pérez A, Pintado A, Murillo J, et al (2020)

Host Range Determinants of Pseudomonas savastanoi Pathovars of Woody Hosts Revealed by Comparative Genomics and Cross-Pathogenicity Tests.

Frontiers in plant science, 11:973.

The study of host range determinants within the Pseudomonas syringae complex is gaining renewed attention due to its widespread distribution in non-agricultural environments, evidence of large variability in intra-pathovar host range, and the emergence of new epidemic diseases. This requires the establishment of appropriate model pathosystems facilitating integration of phenotypic, genomic and evolutionary data. Pseudomonas savastanoi pv. savastanoi is a model pathogen of the olive tree, and here we report a closed genome of strain NCPPB 3335, plus draft genome sequences of three strains isolated from oleander (pv. nerii), ash (pv. fraxini) and broom plants (pv. retacarpa). We then conducted a comparative genomic analysis of these four new genomes plus 16 publicly available genomes, representing 20 strains of these four P. savastanoi pathovars of woody hosts. Despite overlapping host ranges, cross-pathogenicity tests using four plant hosts clearly separated these pathovars and led to pathovar reassignment of two strains. Critically, these functional assays were pivotal to reconcile phylogeny with host range and to define pathovar-specific gene repertoires. We report a pan-genome of 7,953 ortholog gene families and a total of 45 type III secretion system effector genes, including 24 core genes, four genes exclusive to pv. retacarpa and several genes encoding pathovar-specific truncations. Notably, the four pathovars corresponded with well-defined genetic lineages, with core genome phylogeny and hierarchical clustering of effector genes closely correlating with pathogenic specialization. Knot-inducing pathovars encode genes absent in the canker-inducing pv. fraxini, such as those related to indole acetic acid, cytokinins, rhizobitoxine, and a bacteriophytochrome. Other pathovar-exclusive genes encode type I, type II, type IV, and type VI secretion system proteins, the phytotoxin phevamine A, a siderophore, c-di-GMP-related proteins, methyl chemotaxis proteins, and a broad collection of transcriptional regulators and transporters of eight different superfamilies. Our combination of pathogenicity analyses and genomics tools allowed us to correctly assign strains to pathovars and to propose a repertoire of host range-related genes in the P. syringae complex.

RevDate: 2020-07-24

Kc R, Leong KWC, Harkness NM, et al (2020)

Whole-genome analyses reveal gene content differences between nontypeable Haemophilus influenzae isolates from chronic obstructive pulmonary disease compared to other clinical phenotypes.

Microbial genomics [Epub ahead of print].

Nontypeable Haemophilus influenzae (NTHi) colonizes human upper respiratory airways and plays a key role in the course and pathogenesis of acute exacerbations of chronic obstructive pulmonary disease (COPD). Currently, it is not possible to distinguish COPD isolates of NTHi from other clinical isolates of NTHi using conventional genotyping methods. Here, we analysed the core and accessory genome of 568 NTHi isolates, including 40 newly sequenced isolates, to look for genetic distinctions between NTHi isolates from COPD with respect to other illnesses, including otitis media, meningitis and pneumonia. Phylogenies based on polymorphic sites in the core-genome did not show discrimination between NTHi strains collected from different clinical phenotypes. However, pan-genome-wide association studies identified 79 unique NTHi accessory genes that were significantly associated with COPD. Furthermore, many of the COPD-related NTHi genes have known or predicted roles in virulence, transmembrane transport of metal ions and nutrients, cellular respiration and maintenance of redox homeostasis. This indicates that specific genes may be required by NTHi for its survival or virulence in the COPD lung. These results advance our understanding of the pathogenesis of NTHi infection in COPD lungs.

RevDate: 2020-07-23

Tonkin-Hill G, MacAlasdair N, Ruis C, et al (2020)

Producing polished prokaryotic pangenomes with the Panaroo pipeline.

Genome biology, 21(1):180 pii:10.1186/s13059-020-02090-4.

Population-level comparisons of prokaryotic genomes must take into account the substantial differences in gene content resulting from horizontal gene transfer, gene duplication and gene loss. However, the automated annotation of prokaryotic genomes is imperfect, and errors due to fragmented assemblies, contamination, diverse gene families and mis-assemblies accumulate over the population, leading to profound consequences when analysing the set of all genes found in a species. Here, we introduce Panaroo, a graph-based pangenome clustering tool that is able to account for many of the sources of error introduced during the annotation of prokaryotic genome assemblies. Panaroo is available at https://github.com/gtonkinhill/panaroo .

RevDate: 2020-07-21

Bayer PE, Golicz AA, Scheben A, et al (2020)

Plant pan-genomes are the new reference.

Nature plants pii:10.1038/s41477-020-0733-0 [Epub ahead of print].

Recent years have seen a surge in plant genome sequencing projects and the comparison of multiple related individuals. The high degree of genomic variation observed led to the realization that single reference genomes do not represent the diversity within a species, and led to the expansion of the pan-genome concept. Pan-genomes represent the genomic diversity of a species and include core genes, found in all individuals, as well as variable genes, which are absent in some individuals. Variable gene annotations often show similarities across plant species, with genes for biotic and abiotic stress commonly enriched within variable gene groups. Here we review the growth of pan-genomics in plants, explore the origins of gene presence and absence variation, and show how pan-genomes can support plant breeding and evolution studies.

RevDate: 2020-07-20

Yang LL, Jiang Z, Li Y, et al (2020)

Plasmids related to the symbiotic nitrogen fixation are not only cooperated functionally, but also may have evolved over a time span in family Rhizobiaceae.

Genome biology and evolution pii:5873871 [Epub ahead of print].

Rhizobia are soil bacteria capable of forming symbiotic nitrogen-fixing nodules associated with leguminous plants. In fast-growing legume-nodulating rhizobia, such as the species in the family Rhizobiaceae, the symbiotic plasmid is the main genetic basis for nitrogen-fixing symbiosis, and is susceptible to horizontal gene transfer. To further understand the evolution of symbiosis in Rhizobiaceae, we analysed the pan-genome of this family based on 92 genomes of type/reference strains and reconstructed its phylogeny using a phylogenomics approach. Intriguingly, although the genetic expansion that occurred in chromosomal regions was the main reason for the high proportion of low-frequency flexible gene families in the pan-genome, gene gain events associated with accessory plasmids introduced more genes into the genomes of nitrogen-fixing species. For symbiotic plasmids, although horizontal gene transfer frequently occurred, transfer may be impeded by factors such as the host's physical isolation and soil conditions, even among phylogenetically close species. During coevolution with leguminous hosts, the plasmid system, including accessory and symbiotic plasmids, may have evolved over a time span, and provided rhizobial species with the ability to adapt to various environmental conditions and helped them achieve nitrogen fixation. These findings provide new insights into the phylogeny of Rhizobiaceae and advance our understanding of the evolution of symbiotic nitrogen fixation.

RevDate: 2020-07-17

Coulton A, KJ Edwards (2020)

AutoCloner: automatic homologue-specific primer design for full-gene cloning in polyploids.

BMC bioinformatics, 21(1):311 pii:10.1186/s12859-020-03601-7.

BACKGROUND: Polyploid organisms such as wheat complicate even the simplest of procedures in molecular biology. Whilst knowledge of genomic sequences in crops is increasing rapidly, the scientific community is still a long way from producing a full pan-genome for every species. Polymerase chain reaction and Sanger sequencing therefore remain widely used as methods for characterizing gene sequences in many varieties of crops. High sequence similarity between genomes in polyploids means that if primers are not homeologue-specific via the incorporation of a SNP at the 3' tail, sequences other than the target sequence will also be amplified. Current consensus for gene cloning in wheat is to manually perform many steps in a long bioinformatics pipeline.

RESULTS: Here we present AutoCloner (www.autocloner.com), a fully automated pipeline for crop gene cloning that includes a free-to-use web interface for users. AutoCloner takes a sequence of interest from the user and performs a basic local alignment search tool (BLAST) search against the genome assembly for their particular polyploid crop. Homologous sequences are then compiled with the input sequence into a multiple sequence alignment which is mined for single-nucleotide polymorphisms (SNPs). Various combinations of potential primers that cover the entire gene of interest are then created and evaluated by Primer3; the set of primers with the highest score, as well as all possible primers at every SNP location, are then returned to the user for polymerase chain reaction (PCR). We have successfully used AutoCloner to clone various genes of interest in the Apogee wheat variety, which has no current genome sequence. In addition, we have successfully run the pipeline on ~ 80,000 high-confidence gene models from a wheat genome assembly.

CONCLUSION: AutoCloner is the first tool to fully automate primer design for gene cloning in polyploids, where previously the consensus within the wheat community was to perform this process manually. The web interface for AutoCloner provides a simple and effective polyploid primer-design method for gene cloning, with no need for researchers to download software or input any details other than their sequence of interest.
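
The SNP-mining step described in the RESULTS paragraph can be sketched as follows. This is a minimal illustration under stated assumptions, not AutoCloner's code: the function names and toy sequences are hypothetical, the input is assumed to be a pre-computed gapped alignment of equal-length strings, and no Primer3 scoring is performed.

```python
def target_specific_snps(target, homologues):
    """Return 0-based alignment columns where the target base differs
    from the base in every homologue (gap columns are skipped)."""
    if any(len(h) != len(target) for h in homologues):
        raise ValueError("all aligned sequences must have the same length")
    positions = []
    for i, base in enumerate(target):
        if base == "-":
            continue
        column = [h[i] for h in homologues]
        # Keep the column only if no homologue shares the target base
        # and none has a gap there.
        if all(b != base and b != "-" for b in column):
            positions.append(i)
    return positions


def forward_primer_ending_at(target, pos, length=20):
    """Candidate forward primer whose 3' base sits on alignment column pos."""
    ungapped = target[:pos + 1].replace("-", "")
    return ungapped[-length:]


# Toy alignment, invented for illustration only.
target = "ATGGCTAGCTAGGATCCA"
homologues = ["ATGGCTAGTTAGGATCCA", "ATGGCTAGTTAGGACCCA"]
snps = target_specific_snps(target, homologues)
print(snps)
print([forward_primer_ending_at(target, p, length=6) for p in snps])
```

Placing the discriminating base at the 3' end is the design idea stated in the BACKGROUND paragraph; a real workflow would still need melting-temperature, length and secondary-structure checks, which the abstract says are handled by Primer3.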

RevDate: 2020-07-17

Castro-Jaimes S, Bello-López E, Velázquez-Acosta C, et al (2020)

Chromosome Architecture and Gene Content of the Emergent Pathogen Acinetobacter haemolyticus.

Frontiers in microbiology, 11:926.

Acinetobacter haemolyticus is a Gammaproteobacterium that has been involved in serious diseases frequently linked to the nosocomial environment. Most of the strains causing such infections are sensitive to a wide variety of antibiotics, but recent reports indicate that this pathogen is very efficiently acquiring carbapenem-resistance determinants, like the blaNDM-1 gene, all over the world. With this work we contribute a collection of 31 newly sequenced nosocomial A. haemolyticus isolates. Genome analysis of these sequences and others collected from RefSeq indicates that their chromosomes are organized in 12 syntenic blocks that contain most of the core genome genes. These blocks are separated by hypervariable regions that are rich in unique gene families, but also have signals of horizontal gene transfer. Genes involved in virulence or encoding different secretion systems are located inside syntenic regions and have recombination signals. The relative order of the syntenic blocks along the A. haemolyticus chromosome can change, indicating that they have been subject to several kinds of inversions. Genomes of this microorganism show large differences in gene content even if they are in the same clade. Here we also show that A. haemolyticus has an open pan-genome.

RevDate: 2020-07-16

Chandrasekar SS, Kingstad-Bakke BA, Wu CW, et al (2020)

A Novel Mucosal Adjuvant System for the Immunization Against Avian Coronavirus Causing Infectious Bronchitis.

Journal of virology pii:JVI.01016-20 [Epub ahead of print].

Infectious Bronchitis (IB) caused by Infectious Bronchitis Virus (IBV) is currently a major threat to chicken health with multiple outbreaks being reported in the US over the past decade. Modified live virus (MLV) vaccines used in the field can persist and provide the genetic material needed for recombination and emergence of novel IBV serotypes. Inactivated and subunit vaccines overcome some of the limitations of MLV with no risk of virulence reversion and emergence of new virulent serotypes. However, these vaccines are weakly immunogenic and poorly protective. There is an urgent need to develop more effective vaccines that can elicit a robust, long-lasting immune response. In this study, we evaluate a novel adjuvant system developed from Quil-A and chitosan (QAC) for the intranasal delivery of nucleic acid immunogens to improve protective efficacy. The QAC adjuvant system forms nanocarriers (<100 nm) that efficiently encapsulate nucleic acid cargo, exhibit sustained release of payload and can stably transfect cells. Encapsulation of plasmid DNA vaccine expressing IBV nucleocapsid (N) protein by the QAC adjuvant system (pQAC-N) enhanced immunogenicity as evidenced by robust induction of adaptive humoral and cellular immune responses post vaccination and challenge. Birds immunized with pQAC-N showed reduced clinical severity and viral shedding post challenge on par with protection observed with current commercial vaccines without the associated safety concerns. The results presented indicate that the QAC adjuvant system can offer a safer alternative to the use of live vaccines against avian and other emerging coronaviruses.

IMPORTANCE: According to the 2017 US agriculture statistics, the combined value of production and sales from broilers, eggs, turkeys, and chicks was $42.8 billion. Of this number, broiler sales comprised 67 percent of the industry value with the production of > 50 billion pounds of chicken meat. The economic success of the poultry industry in the USA hinges on the extensive use of vaccines to control Infectious Bronchitis Virus (IBV) and other poultry pathogens. The majority of vaccines currently licensed for poultry health include both modified live vaccine and inactivated pathogens. Despite their proven efficacy, modified live vaccine constructs take time to produce and could potentially revert to virulence, which limits their safety. The significance of our research stems from the development of a safer and potent alternative mucosal vaccine to replace live vaccines against IBV and other emerging coronaviruses.

RevDate: 2020-07-13

Bohr LL, Mortimer TD, CS Pepperell (2020)

Lateral Gene Transfer Shapes Diversity of Gardnerella spp.

Frontiers in cellular and infection microbiology, 10:293.

Gardnerella spp. are pathognomonic for bacterial vaginosis, which increases the risk of preterm birth and the transmission of sexually transmitted infections. Gardnerella spp. are genetically diverse, comprising what have recently been defined as distinct species with differing functional capacities. Disease associations with Gardnerella spp. are not straightforward: patients with BV are usually infected with multiple species, and Gardnerella spp. are also found in the vaginal microbiome of healthy women. Genome comparisons of Gardnerella spp. show evidence of lateral gene transfer (LGT), but patterns of LGT have not been characterized in detail. Here we sought to define the role of LGT in shaping the genetic structure of Gardnerella spp. We analyzed whole genome sequencing data for 106 Gardnerella strains and used these data for pan genome analysis and to characterize LGT in the core and accessory genomes, over recent and remote timescales. In our diverse sample of Gardnerella strains, we found that both the core and accessory genomes are clearly differentiated in accordance with newly defined species designations. We identified putative competence and pilus assembly genes across most species; we also found them to be differentiated between species. Competence machinery has diverged in parallel with the core genome, with selection against deleterious mutations as a predominant influence on their evolution. By contrast, the virulence factor vaginolysin, which encodes a toxin, appears to be readily exchanged among species. We identified five distinct prophage clusters in Gardnerella genomes, two of which appear to be exchanged between Gardnerella species. Differences among species are apparent in their patterns of LGT, including their exchange with diverse gene pools. Despite frequent LGT and co-localization in the same niche, our results show that Gardnerella spp. are clearly genetically differentiated and yet capable of exchanging specific genetic material. This likely reflects complex interactions within bacterial communities associated with the vaginal microbiome. Our results provide insight into how such interactions evolve and are maintained, allowing these multi-species communities to colonize and invade human tissues and adapt to antibiotics and other stressors.

RevDate: 2020-07-13

Han M, Liu G, Chen Y, et al (2020)

Comparative Genomics Uncovers the Genetic Diversity and Characters of Veillonella atypica and Provides Insights Into Its Potential Applications.

Frontiers in microbiology, 11:1219.

Veillonella atypica is a bacterium that is present in the gut and the oral cavity of mammals and plays diverse roles in different niches. A recent study demonstrated that Veillonella is highly associated with marathon running and showed that V. atypica gavage improves treadmill run time in mice, revealing that V. atypica has a high biotechnological potential in improving athlete performance. However, a comprehensive analysis of the genetic diversity, functional traits, and genome editing methods of V. atypica is still lacking. In the present study, we conducted a systematic comparative analysis of the genetic datasets of nine V. atypica strains. The pan-genome of V. atypica consisted of 2,065 homologous clusters and exhibited an open pan-genome structure. A phylogenetic analysis of V. atypica with two different categories revealed that V. atypica OK5 was the most distant from the other eight V. atypica strains. A total of 43 orthologous genes were identified as CAZyme genes and grouped into 23 CAZyme families. The CAZyme components derived from accessory clusters contributed to the differences in the ability of the nine V. atypica strains to utilize carbohydrates. An integrated analysis of the metabolic pathways of V. atypica suggested that V. atypica strains harbored vancomycin resistance and were involved in several biosynthesis pathways of secondary metabolites. The V. atypica strains harbored four main Cas proteins, namely, CAS-Type IIIA, CAS-Type IIA, CAS-Type IIC, and CAS-Type IIID. This pilot study provides an in-depth understanding of, and fundamental knowledge about, the biology of V. atypica, which may help increase the biotechnological potential of this bacterium.

RevDate: 2020-07-12

Guo G, Du D, Yu Y, et al (2020)

Pan-genome analysis of Streptococcus suis serotype 2 revealed genomic diversity among strains of different virulence.

Transboundary and emerging diseases [Epub ahead of print].

Streptococcus suis (SS) is an emerging zoonotic pathogen that causes severe infections in swine and humans. Among the 33 known serotypes, serotype 2 is most frequently associated with infections in pigs and humans. To better understand the virulence of S. suis serotype 2 (SS2) and to discriminate between virulent and avirulent SS2 strains, characterization of the genomic features of strains with different virulence is required. The results showed that Streptococcus suis has an open pan-genome. The pan-genome shared by the 19 S. suis serotype 2 strains was composed of 1239 core genes and 2436 accessory genes. COG analysis indicated that core genes are involved in basic physiological functions, whereas accessory genes are related to tachytelic evolution. Comparative analysis between core genomes of virulent strains and 9 avirulent strains suggested that the srtBCD pilus cluster was a significant difference between virulent and avirulent strains. Analysis between high virulent and group B low virulent strains showed 53 and 58 genes specific to each other. Moreover, genomes of avirulent strains tend to be larger than those of virulent strains; avirulent strains tend to possess more prophage sequences than virulent strains. Our findings could contribute to a better understanding of the genomics of S. suis serotype 2.

RevDate: 2020-07-08

Lees JA, Mai TT, Galardini M, et al (2020)

Improved Prediction of Bacterial Genotype-Phenotype Associations Using Interpretable Pangenome-Spanning Regressions.

mBio, 11(4): pii:mBio.01344-20.

Discovery of genetic variants underlying bacterial phenotypes and the prediction of phenotypes such as antibiotic resistance are fundamental tasks in bacterial genomics. Genome-wide association study (GWAS) methods have been applied to study these relations, but the plastic nature of bacterial genomes and the clonal structure of bacterial populations create challenges. We introduce an alignment-free method which finds sets of loci associated with bacterial phenotypes, quantifies the total effect of genetics on the phenotype, and allows accurate phenotype prediction, all within a single computationally scalable joint modeling framework. Genetic variants covering the entire pangenome are compactly represented by extended DNA sequence words known as unitigs, and model fitting is achieved using elastic net penalization, an extension of standard multiple regression. Using an extensive set of state-of-the-art bacterial population genomic data sets, we demonstrate that our approach performs accurate phenotype prediction, comparable to popular machine learning methods, while retaining both interpretability and computational efficiency. Compared to those of previous approaches, which test each genotype-phenotype association separately for each variant and apply a significance threshold, the variants selected by our joint modeling approach overlap substantially.

IMPORTANCE: Being able to identify the genetic variants responsible for specific bacterial phenotypes has been the goal of bacterial genetics since its inception and is fundamental to our current level of understanding of bacteria. This identification has been based primarily on painstaking experimentation, but the availability of large data sets of whole genomes with associated phenotype metadata promises to revolutionize this approach, not least for important clinical phenotypes that are not amenable to laboratory analysis. These models of phenotype-genotype association can in the future be used for rapid prediction of clinically important phenotypes such as antibiotic resistance and virulence by rapid-turnaround or point-of-care tests. However, despite much effort being put into adapting genome-wide association study (GWAS) approaches to cope with bacterium-specific problems, such as strong population structure and horizontal gene exchange, current approaches are not yet optimal. We describe a method that advances methodology for both association and generation of portable prediction models.

RevDate: 2020-07-07

Shahi N, SK Mallik (2020)

Emerging bacterial fish pathogen Lactococcus garvieae RTCLI04, isolated from rainbow trout (Oncorhynchus mykiss): Genomic features and comparative genomics.

Microbial pathogenesis pii:S0882-4010(20)30734-8 [Epub ahead of print].

Lactococcus garvieae is an emerging zoonotic bacterial pathogen that causes fatal hemorrhagic septicemia in cultured fish species, animals and humans worldwide. Here, we report the genomic features of the whole-genome sequence (WGS) of L. garvieae strain RTCLI04, recovered from the lower intestine of farmed rainbow trout, Oncorhynchus mykiss, in the northwest Himalayan region of India. The genome of L. garvieae RTCLI04 is a single circular chromosome of 2,054,885 base pairs (bp), which encodes 1,993 proteins and has a G + C content of 39%. The bioinformatics analysis of the WGS of RTCLI04 confirmed the presence of 51 tRNA genes (including two pseudogenes), six rRNA genes (four genes for 5S rRNA; one gene for 16S rRNA and one gene for 23S rRNA), five virulent domains, and twenty-eight different genetic pathways. A Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) finder tool indicates that three different CRISPR and one cas system with a common spacer were present in the genome of L. garvieae RTCLI04. Pan-genome analysis of RTCLI04 and all the other reference L. garvieae strains shows that the pan-genome of this bacterium consisted of 2,239 putative protein-coding genes, of which 1,850 are core genes, 389 are dispensable genes, and 221 are unique to RTCLI04. L. garvieae RTCLI04 lacks the genomic island of the 16.5 Kb capsule gene cluster. In addition, 39 virulence-associated genes (VAGs) including hly1,-2,-3; PavA, PsaA; eno; LPxTG containing surface proteins 1, 2, 3 and 4; pgm, sod and 29 antimicrobial resistant genes (ARGs) including mefE (clindamycin), srmB (lincomycin), dfrA26 (trimethoprim), gyrB (nalidixic acid), arr-3 (rifampin), otrB (tetracycline), aac(6)-Ic (tobramycin), IrgB (penicillin), mecA (oxacillin), vanRB (vancomycin) and mfpA (fluoroquinolone) were also predicted in the genome of L. garvieae RTCLI04. Our study provides new insight into understanding the virulence mechanism, antimicrobial resistance, and development of effective therapeutic measures against L. garvieae during a disease outbreak in aquaculture.

RevDate: 2020-07-07

Lyu J (2020)

Pan-genome upgrade.

Nature plants pii:10.1038/s41477-020-0731-2 [Epub ahead of print].

RevDate: 2020-07-07

Hurel J, Schbath S, Bougeard S, et al (2020)

DUGMO: tool for the detection of unknown genetically modified organisms with high-throughput sequencing data for pure bacterial samples.

BMC bioinformatics, 21(1):284 pii:10.1186/s12859-020-03611-5.

BACKGROUND: The European Community has adopted very restrictive policies regarding the dissemination and use of genetically modified organisms (GMOs). In fact, a maximum threshold of 0.9% of contaminating GMOs is tolerated for a "GMO-free" label. In recent years, imports of undescribed GMOs have been detected. Their sequences are not described and therefore not detectable by conventional approaches, such as PCR.

RESULTS: We developed DUGMO, a bioinformatics pipeline for the detection of genetically modified (GM) bacteria, including unknown GM bacteria, based on Illumina paired-end sequencing data. The method is currently focused on the detection of GM bacteria with - possibly partial - transgenes in pure bacterial samples. In the preliminary steps, coding sequences (CDSs) are aligned through two successive BLASTN against the host pangenome with relevant tuned parameters to discriminate CDSs belonging to the wild type genome (wgCDS) from potential GM coding sequences (pgmCDSs). Then, Bray-Curtis distances are calculated between the wgCDS and each pgmCDS, based on the difference of genomic vocabulary. Finally, two machine learning methods, namely the Random Forest and Generalized Linear Model, are carried out to target true GM CDS(s), based on six variables including Bray-Curtis distances and GC content. Tests carried out on a GM Bacillus subtilis showed 25 positive CDSs corresponding to the chloramphenicol resistance gene and CDSs of the inserted plasmids. On a wild type B. subtilis, no false positive sequences were detected.

CONCLUSION: DUGMO detects exogenous CDS, truncated, fused or highly mutated wild CDSs in high-throughput sequencing data, and was shown to be efficient at detecting GM sequences, but it might also be employed for the identification of recent horizontal gene transfers.
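
The Bray-Curtis step mentioned in the RESULTS paragraph can be illustrated with a short sketch. It is not DUGMO's implementation: it assumes that "genomic vocabulary" can be approximated by k-mer counts, and the function names, the choice of k and the toy sequences are hypothetical.

```python
from collections import Counter


def kmer_counts(seq, k=3):
    """Count overlapping k-mers in a sequence (assumed vocabulary model)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two count profiles.

    0.0 means identical profiles, 1.0 means no shared k-mers.
    """
    total = sum(a.values()) + sum(b.values())
    if total == 0:
        return 0.0
    shared = sum(min(a[kmer], b[kmer]) for kmer in a.keys() & b.keys())
    return 1.0 - 2.0 * shared / total


# Toy sequences, invented for illustration only.
wild_type = kmer_counts("ATGGCTAGCTAGGATCCATGGCTAGC")
candidate = kmer_counts("ATGCGCGCGCGCGGCCGCGCGCATGC")
print(round(bray_curtis(wild_type, wild_type), 3))
print(round(bray_curtis(wild_type, candidate), 3))
```

A candidate coding sequence whose profile sits far from the wild-type profile would be one input among several; the abstract states that the final call combines six variables, including GC content, in Random Forest and Generalized Linear Model classifiers.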

RevDate: 2020-07-03

Kaushal G, SP Singh (2020)

Comparative genome analysis provides shreds of molecular evidence for reclassification of Leuconostoc mesenteroides MTCC10508 as a strain of Leu. suionicum.

Genomics pii:S0888-7543(20)30015-X [Epub ahead of print].

This study presents the whole-genome comparative analysis of a Leuconostoc sp. strain, previously documented as Leu. mesenteroides MTCC 10508. The ANI, dDDH, dot plot, and MAUVE analyses suggested its reclassification as a strain of Leu. suionicum. Functional annotation identified a total of 1971 genes, out of which, 265 genes were mapped to CAZymes, evincing its carbohydrate transforming capability. The genome comparison with 59 Leu. mesenteroides, and Leu. suionicum strains generated the core and pan-genome profiles, divulging the unique genes in Leuconostoc sp. MTCC 10508. For the first time, this study reports the genes encoding alpha-xylosidase and copper oxidase in a strain of Leu. suionicum. The genetic information for any possible allergenic molecule could not be detected in the genome, advocating the safety of the strain. The present investigation provides the genomic evidence for reclassification of the Leuconostoc sp. strain and also promulgates the molecular insights into its metabolic potential.

RevDate: 2020-07-03

Duru IC, Andreevskaya M, Laine P, et al (2020)

Genomic characterization of the most barotolerant Listeria monocytogenes RO15 strain compared to reference strains used to evaluate food high pressure processing.

BMC genomics, 21(1):455 pii:10.1186/s12864-020-06819-0.

BACKGROUND: High pressure processing (HPP; i.e. 100-600 MPa pressure depending on product) is a non-thermal preservation technique adopted by the food industry to significantly reduce foodborne pathogens, including Listeria monocytogenes, in food. However, susceptibility towards pressure differs among diverse strains of L. monocytogenes and it is unclear if this is due to their intrinsic characteristics related to genomic content. Here, we tested the barotolerance of 10 different L. monocytogenes strains, from food and food processing environments and widely used reference strains including a clinical isolate, to pressure treatments with 400 and 600 MPa. Genome sequencing and genome comparison of the tested L. monocytogenes strains were performed to investigate the relation between genomic profile and pressure tolerance.

RESULTS: None of the tested strains were tolerant to 600 MPa. A reduction of more than 5 log10 was observed for all strains after 1 min of 600 MPa pressure treatment. L. monocytogenes strain RO15 showed no significant reduction in viable cell counts after 400 MPa for 1 min and was therefore defined as barotolerant. Genome analysis of the previously unsequenced L. monocytogenes strains RO15, 2HF33, MB5, AB199, AB120, C7, and RO4 allowed us to compare the gene content of all strains tested. This revealed that the three most pressure tolerant strains had more than one CRISPR system with self-targeting spacers. Furthermore, several anti-CRISPR genes were detected in these strains. Pan-genome analysis showed that 10 prophage genes were significantly associated with the three most barotolerant strains.

CONCLUSIONS: L. monocytogenes strain RO15 was the most pressure tolerant among the selected strains. Genome comparison suggests that there might be a relationship between prophages and pressure tolerance in L. monocytogenes.

RevDate: 2020-07-02

Steinbrenner AD (2020)

The evolving landscape of cell surface pattern recognition across plant immune networks.

Current opinion in plant biology, 56:135-146 pii:S1369-5266(20)30053-4 [Epub ahead of print].

To recognize diverse threats, plants monitor extracellular molecular patterns and transduce intracellular immune signaling through receptor complexes at the plasma membrane. Pattern recognition occurs through a prototypical network of interacting proteins, comprising A) receptors that recognize inputs associated with a growing number of pest and pathogen classes (bacteria, fungi, oomycetes, caterpillars), B) co-receptor kinases that participate in binding and signaling, and C) cytoplasmic kinases that mediate first stages of immune output. While this framework has been elucidated in reference accessions of model organisms, network components are part of gene families with widespread variation, potentially tuning immunocompetence for specific contexts. Most dramatically, variation in receptor repertoires determines the range of ligands acting as immunogenic inputs for a given plant. Diversification of receptor kinase (RK) and related receptor-like protein (RLP) repertoires may tune responses even within a species. Comparative genomics at pangenome scale will reveal patterns and features of immune network variation.

RevDate: 2020-07-02

Chen Z, Kuang D, Xu X, et al (2020)

Genomic analyses of multidrug-resistant Salmonella Indiana, Typhimurium, and Enteritidis isolates using MinION and MiSeq sequencing technologies.

PloS one, 15(7):e0235641 pii:PONE-D-20-08707.

We sequenced 25 isolates of phenotypically multidrug-resistant Salmonella Indiana (n = 11), Typhimurium (n = 8), and Enteritidis (n = 6) using both MinION long-read [SQK-LSK109 and flow cell (R9.4.1)] and MiSeq short-read (Nextera XT and MiSeq Reagent Kit v2) sequencing technologies to determine the advantages of each approach in terms of the characteristics of genome structure, antimicrobial resistance (AMR), virulence potential, whole-genome phylogeny, and pan-genome. The MinION reads were base-called in real-time using MinKnow 3.4.8 integrated with Guppy 3.0.7. The long-read-only assembly, Illumina-only assembly, and hybrid assembly pipelines of Unicycler 0.4.8 were used to generate the MinION, MiSeq, and hybrid assemblies, respectively. The MinION assemblies were highly contiguous compared to the MiSeq assemblies but lacked accuracy, a deficiency that was mitigated by adding the MiSeq short reads through the Unicycler hybrid assembly which corrected erroneous single nucleotide polymorphisms (SNPs). The MinION assemblies provided similar predictions of AMR and virulence potential compared to the MiSeq and hybrid assemblies, although they produced more total false negatives of AMR genotypes, primarily due to failure in identifying tetracycline resistance genes in 11 of the 19 MinION assemblies of tetracycline-resistant isolates. The MinION assemblies displayed a large genetic distance from their corresponding MiSeq and hybrid assemblies on the whole-genome phylogenetic tree, indicating that the lower read accuracy of MinION sequencing caused incorrect clustering. The pan-genome of the MinION assemblies contained significantly more accessory genes and less core genes compared to the MiSeq and hybrid assemblies, suggesting that although these assemblies were more contiguous, their sequencing errors reduced accurate genome annotations. 
Our research demonstrates that MinION sequencing by itself provides an efficient assessment of the genome structure, antimicrobial resistance, and virulence potential of Salmonella; however, it is not sufficient for whole-genome phylogenetic and pan-genome analyses. MinION in combination with MiSeq facilitated the most accurate genomic analyses.

RevDate: 2020-07-02

Wang B, Cheng H, Qian W, et al (2020)

Comparative genome analysis and mining of secondary metabolites of Paenibacillus polymyxa.

Genes & genetic systems [Epub ahead of print].

Paenibacillus polymyxa is a well-known Gram-positive biocontrol bacterium. It has been reported that many P. polymyxa strains can inhibit bacteria, fungi and other plant pathogens. Paenibacillus polymyxa employs a variety of mechanisms to promote plant growth, so it is necessary to understand the biocontrol ability of bacteria at the genome level. In the present study, thanks to the widespread availability of Paenibacillus genome data and the development of bioinformatics tools, we were able to analyze and mine the genomes of 43 P. polymyxa strains. The strain NCTC4744 was determined not to be P. polymyxa according to digital DNA-DNA hybridization and average nucleotide identity. By analysis of the pan-genome and the core genome, we found that the pan-genome of P. polymyxa was open and that there were 3,192 core genes. In a gene cluster analysis of secondary metabolites, 797 secondary metabolite gene clusters were found, of which 343 are not similar to known clusters and are expected to reveal a large number of new secondary metabolites. We also analyzed the plant growth-promoting genes that were mined and found, surprisingly, that these genes are highly conserved. The results of the present study not only reveal a large number of unknown potential secondary metabolite gene clusters in P. polymyxa, but also suggest that plant growth promotion characteristics are evolutionary adaptations of P. polymyxa to plant-related habitats.

RevDate: 2020-07-02

Fodor A, Abate BA, Deák P, et al (2020)

Multidrug Resistance (MDR) and Collateral Sensitivity in Bacteria, with Special Attention to Genetic and Evolutionary Aspects and to the Perspectives of Antimicrobial Peptides-A Review.

Pathogens (Basel, Switzerland), 9(7): pii:pathogens9070522.

Antibiotic poly-resistance (multidrug-, extreme-, and pan-drug resistance) is controlled by adaptive evolution. Darwinian and Lamarckian interpretations of resistance evolution are discussed. Arguments for, and against, pessimistic forecasts on a fatal "post-antibiotic era" are evaluated. In commensal niches, the appearance of a new antibiotic resistance often reduces fitness, but compensatory mutations may counteract this tendency. The appearance of new antibiotic resistance is frequently accompanied by a collateral sensitivity to other resistances. Organisms with an expanding open pan-genome, such as Acinetobacter baumannii, Pseudomonas aeruginosa, and Klebsiella pneumoniae, can withstand an increased number of resistances by exploiting their evolutionary plasticity and disseminating clonally or poly-clonally. Multidrug-resistant pathogen clones can become predominant under antibiotic stress conditions but, under the influence of negative frequency-dependent selection, are prevented from rising to dominance in a population in a commensal niche. Antimicrobial peptides have a great potential to combat multidrug resistance, since antibiotic-resistant bacteria have shown a high frequency of collateral sensitivity to antimicrobial peptides. In addition, the mobility patterns of antibiotic resistance, and antimicrobial peptide resistance, genes are completely different. The integron trade in commensal niches is fortunately limited by the species-specificity of resistance genes. Hence, we theorize that the suggested post-antibiotic era has not yet come, and indeed might never come.

RevDate: 2020-07-01

Roder T, Wüthrich D, Bär C, et al (2020)

In Silico Comparison Shows that the Pan-Genome of a Dairy-Related Bacterial Culture Collection Covers Most Reactions Annotated to Human Microbiomes.

Microorganisms, 8(7): pii:microorganisms8070966.

The diversity of the human microbiome is positively associated with human health. However, this diversity is endangered by Westernized dietary patterns that are characterized by a decreased nutrient variety. Diversity might potentially be improved by promoting dietary patterns rich in microbial strains. Various collections of bacterial cultures resulting from a century of dairy research are readily available worldwide, and could be exploited to contribute towards this end. We have conducted a functional in silico analysis of the metagenome of 24 strains, each representing one of the species in a bacterial culture collection composed of 626 sequenced strains, and compared the pathways potentially covered by this metagenome to the intestinal metagenome of four healthy, although overweight, humans. Remarkably, the pan-genome of the 24 strains covers 89% of the human gut microbiome's annotated enzymatic reactions. Furthermore, the dairy microbial collection covers biological pathways, such as methylglyoxal degradation, sulfate reduction, γ-aminobutyric acid (GABA) degradation and salicylate degradation, which are differently covered among the four subjects and are involved in a range of cardiometabolic, intestinal, and neurological disorders. We conclude that microbial culture collections derived from dairy research have the genomic potential to complement and restore functional redundancy in human microbiomes.

RevDate: 2020-06-30

Motyka-Pomagruk A, Zoledowska S, Misztak AE, et al (2020)

Comparative genomics and pangenome-oriented studies reveal high homogeneity of the agronomically relevant enterobacterial plant pathogen Dickeya solani.

BMC genomics, 21(1):449 pii:10.1186/s12864-020-06863-w.

BACKGROUND: Dickeya solani is an important plant pathogenic bacterium causing severe losses in European potato production. This species draws a lot of attention due to its remarkable virulence, great devastating potential and easier spread in contrast to other Dickeya spp. In view of a high need for extensive studies on economically important soft rot Pectobacteriaceae, we performed a comparative genomics analysis on D. solani strains to search for genetic foundations that would explain the differences in the observed virulence levels within the D. solani population.

RESULTS: High quality assemblies of 8 de novo sequenced D. solani genomes have been obtained. Whole-sequence comparison, ANIb, ANIm, Tetra and pangenome-oriented analyses performed on these genomes and the sequences of 14 additional strains revealed an exceptionally high level of homogeneity among the studied genetic material of D. solani strains. With the use of 22 genomes, the pangenome of D. solani, comprising 84.7% core, 7.2% accessory and 8.1% unique genes, has been almost completely determined, suggesting the presence of a nearly closed pangenome structure. Attribution of the genes included in the D. solani pangenome fractions to functional COG categories showed that higher percentages of accessory and unique pangenome parts in contrast to the core section are encountered in phage/mobile elements- and transcription- associated groups with the genome of RNS 05.1.2A strain having the most significant impact. Also, the first D. solani large-scale genome-wide phylogeny computed on concatenated core gene alignments is herein reported.

CONCLUSIONS: The almost closed status of D. solani pangenome achieved in this work points to the fact that the unique gene pool of this species should no longer expand. Such a feature is characteristic of taxa whose representatives either occupy isolated ecological niches or lack efficient mechanisms for gene exchange and recombination, which seems rational concerning a strictly pathogenic species with clonal population structure. Finally, no obvious correlations between the geographical origin of D. solani strains and their phylogeny were found, which might reflect the specificity of the international seed potato market.
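
The core, accessory and unique fractions reported in the abstract above can be illustrated with a minimal sketch. It assumes each genome has already been reduced to a set of gene-family identifiers; the strain names and gene labels are hypothetical toy data, and this is not the pipeline used by the authors.

```python
from collections import Counter


def partition_pangenome(genomes):
    """Split gene families into core, accessory, and unique sets.

    genomes: dict mapping a strain name to the set of gene families it carries.
    core      = families present in every genome
    unique    = families present in exactly one genome
    accessory = everything in between
    """
    n = len(genomes)
    counts = Counter(gene for genes in genomes.values() for gene in genes)
    core = {g for g, c in counts.items() if c == n}
    # With a single genome every family is core, so nothing is called unique.
    unique = {g for g, c in counts.items() if c == 1 and n > 1}
    accessory = set(counts) - core - unique
    return core, accessory, unique


# Hypothetical toy data: three strains, five gene families.
toy_genomes = {
    "strain_1": {"A", "B", "C", "D"},
    "strain_2": {"A", "B", "C"},
    "strain_3": {"A", "B", "E"},
}

core, accessory, unique = partition_pangenome(toy_genomes)
total = len(core) + len(accessory) + len(unique)
for label, genes in (("core", core), ("accessory", accessory), ("unique", unique)):
    print(f"{label}: {sorted(genes)} ({100 * len(genes) / total:.1f}%)")
```

In real analyses the gene families come from an orthology clustering step, and the thresholds for "core" are sometimes relaxed (for example, present in nearly all genomes) to tolerate assembly gaps.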

RevDate: 2020-06-30

Yang F, Feng H, Massey IY, et al (2020)

Genome-Wide Analysis Reveals Genetic Potential for Aromatic Compounds Biodegradation of Sphingopyxis.

BioMed research international, 2020:5849123.

Members of genus Sphingopyxis are frequently found in diverse eco-environments worldwide and have been traditionally considered to play vital roles in the degradation of aromatic compounds. Over recent decades, many aromatic-degrading Sphingopyxis strains have been isolated and recorded, but little is known about their genetic nature related to aromatic compounds biodegradation. In this study, bacterial genomes of 19 Sphingopyxis strains were used for comparative analyses. Phylogeny showed an ambiguous relatedness between bacterial strains and their habitat specificity, while clustering based on Cluster of Orthologous Groups suggested the potential link of functional profile with substrate-specific traits. Pan-genome analysis revealed that 19 individuals were predicted to share 1,066 orthologous genes, indicating a high genetic homogeneity among Sphingopyxis strains. Notably, KEGG Automatic Annotation Server results suggested that most genes pertaining to aromatic compounds biodegradation were predicted to be involved in benzoate, phenylalanine, and aminobenzoate metabolism. Among them, β-ketoadipate biodegradation might be the main pathway in Sphingopyxis strains. Further inspection showed that a number of mobile genetic elements varied in Sphingopyxis genomes, and plasmid-mediated gene transfer coupled with prophage- and transposon-mediated rearrangements might play prominent roles in the evolution of bacterial genomes. Collectively, our findings suggest that Sphingopyxis isolates might be promising candidates for biodegradation of aromatic compounds at polluted sites.

RevDate: 2020-06-30

Sun Z, Zhou D, Zhang X, et al (2020)

Determining the Genetic Characteristics of Resistance and Virulence of the "Epidermidis Cluster Group" Through Pan-Genome Analysis.

Frontiers in cellular and infection microbiology, 10:274.

Staphylococcus caprae, Staphylococcus capitis, and Staphylococcus epidermidis belong to the "Epidermidis Cluster Group" (ECG) and are generally opportunistic pathogens. In this work, whole genome sequencing, molecular cloning and pan-genome analysis were performed to investigate the genetic characteristics of the resistance, virulence and genome structures of 69 ECG strains, including a clinical isolate (S. caprae SY333) obtained in this work. Two resistance genes (blaZ and aadD2) encoded on the plasmids pSY333-41 and pSY333-45 of S. caprae SY333 were confirmed to be functional. The bla region in ECG exhibited three distinct structures, and these chromosome- and plasmid-encoded bla operons seemed to follow two different evolutionary paths. Pan-genome analysis revealed their pan-genomes tend to be "open." For the virulence-related factors, the genes involved in primary attachment were observed almost exclusively in S. epidermidis, while the genes associated with intercellular aggregation were observed more frequently in S. caprae and S. capitis. The type VII secretion system was present in all strains of S. caprae and some of S. epidermidis but not in S. capitis. Moreover, the isd locus (iron regulated surface determinant) was first found to be encoded on the genomes of S. caprae and S. capitis. These findings suggested that the plasmid- and chromosome-encoded bla operons of ECG species followed different evolutionary paths, and that the species differed in the abundance of virulence genes associated with adherence, invasion, secretion system and immune evasion. Identification of isd loci in S. caprae and S. capitis indicated their ability to acquire heme as nutrient iron during infection.

RevDate: 2020-06-26

Nishitsuji K, Arimoto A, Yonashiro Y, et al (2020)

Comparative genomics of four strains of the edible brown alga, Cladosiphon okamuranus.

BMC genomics, 21(1):422 pii:10.1186/s12864-020-06792-8.

BACKGROUND: The brown alga, Cladosiphon okamuranus (Okinawa mozuku), is one of the most important edible seaweeds, and it is cultivated for market primarily in Okinawa, Japan. Four strains, denominated S, K, O, and C, with distinctively different morphologies, have been cultivated commercially since the early 2000s. We previously reported a draft genome of the S-strain. To facilitate studies of seaweed biology for future aquaculture, we here decoded and analyzed genomes of the other three strains (K, O, and C).

RESULTS: Here we improved the genome of the S-strain (ver. 2, 130 Mbp, 12,999 genes), and decoded the K-strain (135 Mbp, 12,511 genes), the O-strain (140 Mbp, 12,548 genes), and the C-strain (143 Mbp, 12,182 genes). Molecular phylogenies, using mitochondrial and nuclear genes, showed that the S-strain diverged first, followed by the K-strain, and most recently the C- and O-strains. Comparisons of genome architecture among the four strains document the frequent occurrence of inversions. In addition to gene acquisitions and losses, the S-, K-, O-, and C-strains possess 457, 344, 367, and 262 gene families unique to each strain, respectively. Comprehensive Blast searches showed that most genes have no sequence similarity to any entries in the non-redundant protein sequence database, although GO annotation suggested that they likely function in relation to molecular and biological processes and cellular components.

CONCLUSIONS: Our study compares the genomes of four strains of C. okamuranus and examines their phylogenetic relationships. Due to global environmental changes, including temperature increases, acidification, and pollution, brown algal aquaculture is facing critical challenges. Genomic and phylogenetic information reported by the present research provides useful tools for isolation of novel strains.

RevDate: 2020-06-25

Collis RM, Biggs PJ, Midwinter AC, et al (2020)

Genomic epidemiology and carbon metabolism of Escherichia coli serogroup O145 reflect contrasting phylogenies.

PloS one, 15(6):e0235066 pii:PONE-D-20-02830.

Shiga toxin-producing Escherichia coli (STEC) are a leading cause of foodborne outbreaks of human disease, but they reside harmlessly as an asymptomatic commensal in the ruminant gut. STEC serogroup O145 are difficult to isolate as routine diagnostic methods are unable to distinguish non-O157 serogroups due to their heterogeneous metabolic characteristics, resulting in under-reporting which is likely to conceal their true prevalence. In light of these deficiencies, the purpose of this study was a twofold approach to investigate enhanced STEC O145 diagnostic culture-based methods: firstly, to use a genomic epidemiology approach to understand the genetic diversity and population structure of serogroup O145 at both a local (New Zealand) (n = 47) and global scale (n = 75) and, secondly, to identify metabolic characteristics that will help the development of a differential media for this serogroup. Analysis of a subset of E. coli serogroup O145 strains demonstrated considerable diversity in carbon utilisation, which varied in association with eae subtype and sequence type. Several carbon substrates, such as D-serine and D-malic acid, were utilised by the majority of serogroup O145 strains, which, when coupled with current molecular and culture-based methods, could aid in the identification of presumptive E. coli serogroup O145 isolates. These carbon substrates warrant subsequent testing with additional serogroup O145 strains and non-O145 strains. Serogroup O145 strains displayed extensive genetic heterogeneity that was correlated with sequence type and eae subtype, suggesting these genetic markers are good indicators for distinct E. coli phylogenetic lineages. Pangenome analysis identified a core of 3,036 genes and an open pangenome of >14,000 genes, which is consistent with the identification of distinct phylogenetic lineages. Overall, this study highlighted the phenotypic and genotypic heterogeneity within E. coli serogroup O145, suggesting that the development of a differential media targeting this serogroup will be challenging.

RevDate: 2020-06-23

Vázquez-Rosas-Landa M, Ponce-Soto GY, Aguirre-Liguori JA, et al (2020)

Population genomics of Vibrionaceae isolated from an endangered oasis reveals local adaptation after an environmental perturbation.

BMC genomics, 21(1):418 pii:10.1186/s12864-020-06829-y.

BACKGROUND: In bacteria, pan-genomes are the result of an evolutionary "tug of war" between selection and horizontal gene transfer (HGT). High rates of HGT increase the genetic pool and the effective population size (Ne), resulting in open pan-genomes. In contrast, selective pressures can lead to local adaptation by purging the variation introduced by HGT and mutation, resulting in closed pan-genomes and clonal lineages. In this study, we explored both hypotheses, elucidating the pan-genome of Vibrionaceae isolates after a perturbation event in the endangered oasis of Cuatro Ciénegas Basin (CCB), Mexico, and looking for signals of adaptation to the environments in their genomes.

RESULTS: We obtained 42 genomes of Vibrionaceae distributed in six lineages, two of which did not show any close reference strain in databases. Five of the lineages showed closed pan-genomes and were associated with either water or sediment environment; their high Ne estimates suggest that these lineages are not of recent origin. The only clade with an open pan-genome was found in both environments and was formed by ten genetic groups with low Ne, suggesting a recent origin. The recombination and mutation estimators (r/m) ranged from 0.005 to 2.725, which are similar to oceanic Vibrionaceae estimations. However, we identified 367 gene families with signals of positive selection, most of them found in the core genome, suggesting that despite recombination, natural selection moves the Vibrionaceae CCB lineages to local adaptation, purging the genomes and keeping closed pan-genome patterns. Moreover, we identified 598 SNPs associated with an unstructured environment; some of the genes associated with these SNPs were related to sodium transport.

CONCLUSIONS: Different lines of evidence suggest that the sampled Vibrionaceae are part of the rare biosphere usually living under famine conditions. Two of these lineages were reported for the first time. Most Vibrionaceae lineages of CCB are adapted to their micro-habitats rather than to the sampled environments. This pattern of adaptation is concordant with the association of closed pan-genomes and local adaptation.
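
The open versus closed distinction discussed in the abstract above is commonly explored with a gene accumulation curve. The sketch below is a simplified illustration, not the authors' method: it assumes each genome is given as a set of gene-family identifiers, uses hypothetical toy data, and follows a single fixed ordering of genomes.

```python
def accumulation_curve(genomes):
    """Track pan-genome and core-genome size as genomes are added in order.

    genomes: list of sets of gene-family identifiers.
    Returns (pan_sizes, core_sizes), one value per genome added.
    """
    pan, core = set(), None
    pan_sizes, core_sizes = [], []
    for genes in genomes:
        pan |= genes  # union grows with every newly seen family
        core = set(genes) if core is None else core & genes  # intersection shrinks
        pan_sizes.append(len(pan))
        core_sizes.append(len(core))
    return pan_sizes, core_sizes


# Hypothetical toy data: four genomes sharing families A and B.
toy = [
    {"A", "B", "C"},
    {"A", "B", "D"},
    {"A", "B", "C", "E"},
    {"A", "B", "F"},
]

pan_sizes, core_sizes = accumulation_curve(toy)
# Number of previously unseen families contributed by each genome.
new_per_genome = [pan_sizes[0]] + [b - a for a, b in zip(pan_sizes, pan_sizes[1:])]
print("pan-genome size:", pan_sizes)
print("core-genome size:", core_sizes)
print("new families per genome:", new_per_genome)
```

A pan-genome curve that keeps rising as genomes are added is read as open, while a curve that plateaus is read as closed. Published analyses typically average over many random orderings of the genomes and fit a power law to the result; neither step is shown here.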

RevDate: 2020-06-20

Anani H, Zgheib R, Hasni I, et al (2020)

Interest of bacterial pangenome analyses in clinical microbiology.

Microbial pathogenesis pii:S0882-4010(20)30641-0 [Epub ahead of print].

Thanks to the progress and decreasing costs in genome sequencing technologies, more than 250,000 bacterial genomes are currently available in public databases, covering most, if not all, of the major human-associated phylogenetic groups of these microorganisms, pathogenic or not. In addition, for many of them, sequences from several strains of a given species are available, thus making it possible to evaluate their genetic diversity and study their evolution. In addition, the significant cost reduction of bacterial whole genome sequencing as well as the rapid increase in the number of available bacterial genomes have prompted the development of pangenomic software tools. The study of bacterial pangenome has many applications in clinical microbiology. It can unveil the pathogenic potential and ability of bacteria to resist antimicrobials as well as identify specific sequences and predict antigenic epitopes that allow molecular or serologic assays and vaccines to be designed. Bacterial pangenome constitutes a powerful method for understanding the history of human bacteria and relating these findings to diagnosis in clinical microbiology laboratories in order to optimize patient management.

RevDate: 2020-06-19

Liu Y, Du H, Li P, et al (2020)

Pan-Genome of Wild and Cultivated Soybeans.

Cell pii:S0092-8674(20)30618-8 [Epub ahead of print].

Soybean is one of the most important vegetable oil and protein feed crops. To capture the entire genomic diversity, it is necessary to construct a complete high-quality pan-genome from diverse soybean accessions. In this study, we performed individual de novo genome assemblies for 26 representative soybeans that were selected from 2,898 deeply sequenced accessions. Using these assembled genomes together with three previously reported genomes, we constructed a graph-based genome and performed pan-genome analysis, which identified numerous genetic variations that cannot be detected by direct mapping of short sequence reads onto a single reference genome. The structural variations from the 2,898 accessions that were genotyped based on the graph-based genome and the RNA sequencing (RNA-seq) data from the representative 26 accessions helped to link genetic variations to candidate genes that are responsible for important traits. This pan-genome resource will promote evolutionary and functional genomics studies in soybean.


RJR Experience and Expertise


Robbins holds BS, MS, and PhD degrees in the life sciences. He served as a tenured faculty member in the Zoology and Biological Science departments at Michigan State University. He is currently exploring the intersection between genomics, microbial ecology, and biodiversity — an area that promises to transform our understanding of the biosphere.


Robbins has extensive experience in college-level education: At MSU he taught introductory biology, genetics, and population genetics. At JHU, he was an instructor for a special course on biological database design. At FHCRC, he team-taught a graduate-level course on the history of genetics. At Bellevue College he taught medical informatics.


Robbins has been involved in science administration at both the federal and the institutional levels. At NSF he was a program officer for database activities in the life sciences, at DOE he was a program officer for information infrastructure in the human genome project. At the Fred Hutchinson Cancer Research Center, he served as a vice president for fifteen years.


Robbins has been involved with information technology since writing his first Fortran program as a college student. At NSF he was the first program officer for database activities in the life sciences. At JHU he held an appointment in the CS department and served as director of the informatics core for the Genome Data Base. At the FHCRC he was VP for Information Technology.


While still at Michigan State, Robbins started his first publishing venture, founding a small company that addressed the short-run publishing needs of instructors in very large undergraduate classes. For more than 20 years, Robbins has been operating The Electronic Scholarly Publishing Project, a web site dedicated to the digital publishing of critical works in science, especially classical genetics.


Robbins is well-known for his speaking abilities and is often called upon to provide keynote or plenary addresses at international meetings. For example, in July, 2012, he gave a well-received keynote address at the Global Biodiversity Informatics Congress, sponsored by GBIF and held in Copenhagen. The slides from that talk can be seen HERE.


Robbins is a skilled meeting facilitator. He prefers a participatory approach, with part of the meeting involving dynamic breakout groups, created by the participants in real time: (1) individuals propose breakout groups; (2) everyone signs up for one (or more) groups; (3) the groups with the most interested parties then meet, with reports from each group presented and discussed in a subsequent plenary session.


Robbins has been engaged with photography and design since the 1960s, when he worked for a professional photography laboratory. He now prefers digital photography and tools for their precision and reproducibility. He designed his first web site more than 20 years ago and he personally designed and implemented this web site. He engages in graphic design as a hobby.

963 Red Tail Lane
Bellingham, WA 98226


E-mail: RJR8222@gmail.com

Collection of publications by R J Robbins

Reprints and preprints of publications, slide presentations, instructional materials, and data compilations written or prepared by Robert Robbins. Most papers deal with computational biology, genome informatics, using information technology to support biomedical research, and related matters.

Research Gate page for R J Robbins

ResearchGate is a social networking site for scientists and researchers to share papers, ask and answer questions, and find collaborators. According to a study by Nature and an article in Times Higher Education, it is the largest academic social network in terms of active users.

Curriculum Vitae for R J Robbins

short personal version

Curriculum Vitae for R J Robbins

long standard version

RJR Picks from Around the Web (updated 11 MAY 2018)