SymGenDB lets users find the symbiotic relationships recorded in the literature up to the last update (June 2018), specifically those involving organisms whose complete genomes are included in the KEGG catalog. It consists of four tools, each in its own tab and each designed for a specific kind of search.
THE BASICS:
In the first tab, ORGANISMS, it is possible for users to search and get an overview of specific associations between symbiotic organisms.
In the second tab, GENOMES, a list of the metrics and metadata of the genome(s) involved in a specific symbiotic relationship of interest is displayed.
In the third tab, GENES, users can search for orthologous genes of interest in the symbionts of one or more organisms included in our catalog.
In the last tab, Meta-DAGs, users can search for the reduced metabolic networks of their symbiotic organisms of interest, as well as the genus-level metabolism of those organisms, summarized as their core (intersection) and pan (union) metabolic capabilities.
Each tab is fully explained below. For examples, please check out the quick tour.
UPDATE: Every tab now accepts multiple organisms in a single search; simply separate them with commas.
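As an illustration of the comma-separated input described above, here is a minimal Python sketch of how such a query string could be split into individual organism terms. How SymGenDB handles this internally is not documented here; the function name `split_query`, the whitespace trimming, and the case-insensitive removal of duplicates are all assumptions made for the example.

```python
def split_query(query: str) -> list[str]:
    """Split a comma-separated search string into individual organism terms.

    Assumptions (not taken from SymGenDB itself): surrounding whitespace is
    ignored, empty entries are dropped, and duplicates are removed without
    regard to letter case, keeping the first spelling seen.
    """
    terms = []
    seen = set()
    for part in query.split(","):
        term = part.strip()
        key = term.lower()
        if term and key not in seen:
            seen.add(key)
            terms.append(term)
    return terms


# Example query with stray spaces, an empty entry and a repeated name.
print(split_query("Buchnera, Wolbachia , ,buchnera"))
```

Cleaning a list of names this way before pasting it into the search bar avoids empty or repeated terms in the query.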
In the ORGANISMS tab, users can search for a specific symbiotic association of interest by entering the common name, the scientific species name, the genus, the class, or a keyword from any other level of NCBI’s taxonomy, for a symbiont and/or a host. Users are also encouraged to enter any organism they can think of and to move between taxonomic levels, both to get a better understanding of the searches SymbioGenomesDB performs and because this is a useful way to start any type of analysis of these organisms. Searches can range from broad taxonomic levels, such as kingdom or family, down to narrow ones at the level of species and strains.
It is important to indicate which partner of the symbiosis is being searched for: either the symbionts of a host of interest or the hosts of a symbiont of interest. This is done by selecting the corresponding circle (radio button).
The taxonomic level of the host(s) can be selected in the scrolling menu below the search bar (the default ‘all ranks’ searches every taxonomic level and shows the level at which the match is found). The taxonomic level of the symbiont(s) can be selected in the same way (the default ‘all ranks’ behaves identically). Results are typically shown within milliseconds. Searching for an organism with a large number of associated symbionts (for example, human or pig, since many pathogenic bacteria are associated with these species) can take a few seconds.
Both the resulting plot and the resulting list of organisms can be downloaded. The list of organisms also includes links to the corresponding species entries at NCBI Taxonomy.
The GENOMES search retrieves the genomes associated with a specific host, their metadata and their sequences/orthologs. In the search box, users can enter the common name, the scientific species name, the class, the genus, etc., of their genome(s) of choice, according to the taxonomic level of interest (the default ‘all ranks’ searches every taxonomic level).
In this case, the search can be done for a specific symbiont, or a specific host. If the search is for a host, another menu listing all its symbionts’ genomes will become available, where the genomes of interest can be selected.
Within seconds, tables are shown with the metrics of each selected genome (genome size, NCBI taxon ID, CDSs, GC content, etc.), the symbionts’ isolation source, their plasmids, and all chromosomes in cases where more than one chromosome is present.
An important feature of this search is that users can download a table with all of the orthologous genes from the genomes they have searched for (at least two genomes are required) with just the click of a button. This table can be easily parsed for further analysis. The table(s), the FASTA file(s) and/or the GFF file(s) of the organisms resulting from the search are also available for download.
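To give an idea of what parsing the downloaded ortholog table might look like, here is a short Python sketch. The actual layout of the file is not described here, so the example assumes a tab-separated table with one ortholog identifier column followed by one column per genome, where an empty cell means that the genome has no gene for that ortholog. The column names, identifiers and the function `shared_orthologs` are hypothetical.

```python
import csv
import io

# Hypothetical sample data: the real column names and layout may differ.
SAMPLE = (
    "ortholog\tgenome_A\tgenome_B\n"
    "ORTH_1\tgeneA1\tgeneB7\n"
    "ORTH_2\tgeneA2\t\n"
    "ORTH_3\t\tgeneB9\n"
)


def shared_orthologs(handle):
    """Return the ortholog IDs that have a gene in every genome column."""
    reader = csv.DictReader(handle, delimiter="\t")
    id_column = reader.fieldnames[0]        # assumed: first column holds the ID
    genome_columns = reader.fieldnames[1:]  # assumed: one column per genome
    shared = []
    for row in reader:
        if all((row[col] or "").strip() for col in genome_columns):
            shared.append(row[id_column])
    return shared


# io.StringIO stands in for an opened file downloaded from the GENOMES tab.
print(shared_orthologs(io.StringIO(SAMPLE)))
```

With a real download, the `io.StringIO(SAMPLE)` argument would be replaced by an opened file, and the column handling adjusted to the actual header.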
In the GENES tab, users can search for genes in any symbiont present in our catalog, or retrieve a table of orthologous genes found in two or more genomes of interest that are involved in symbiosis.
Users need to enter the gene of interest in the first search bar. If more than one gene is searched for, the genes must be separated with commas. In the second search bar, users enter the common name, the scientific name, the class, the genus, etc., of the genome(s) of interest, according to the taxonomic level of choice (the default ‘all ranks’ searches every taxonomic level). As a result, two menus become available for selecting the gene(s) and the genome(s) of interest, respectively. The search can be done for a specific symbiont or for a specific host. If searching for a host, another menu listing all its symbionts’ genomes becomes available.
The resulting data are presented as a table containing the genes and the genomes selected in the menus. Every output, along with the corresponding sequences, is available for download as flat files for easy parsing and further analysis.
The Meta-DAGs search retrieves the metabolic information available in the KEGG catalog for every organism included in the database, presented as Meta-DAGs. Meta-DAGs are a new methodology that aims to reduce the size of metabolic networks by calculating 'Metabolic Building Blocks' (MBBs) based on the cyclic or acyclic properties of the pathways included in an organism's metabolic network, for easier understanding, analysis and interpretation (for a complete and thorough explanation, please refer to Alberich et al. 2017).
In this case, the search can be done for specific organisms or for organisms at the genus hierarchical level. If the search is for specific organisms, the user gets their metabolic network(s) as a metabolic DAG in interactive form. The user can zoom in and out of any part of the network for more general or more specific details, and can click on any of the MBBs that compose the Meta-DAG to get a full description of the reaction(s) included in that MBB and the link to its KEGG web page. Furthermore, a visualization of the MBB highlighted in the global metabolic pathways map from the KEGG web page is shown.
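The exact construction of MBBs is given in Alberich et al. 2017 and is not reproduced here. As a rough illustration only, one standard way to reduce a directed network that contains cycles to a directed acyclic graph is to collapse each strongly connected component (each group of nodes that can all reach one another) into a single block. The Python sketch below does this for a made-up reaction graph. Whether this matches the MBB definition exactly should be checked against the paper, and the reaction names and function names are hypothetical.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm; graph maps each node to a list of successor nodes."""
    index, lowlink = {}, {}
    stack, on_stack = [], set()
    components = []

    def visit(node):
        index[node] = lowlink[node] = len(index)
        stack.append(node)
        on_stack.add(node)
        for succ in graph.get(node, []):
            if succ not in index:
                visit(succ)
                lowlink[node] = min(lowlink[node], lowlink[succ])
            elif succ in on_stack:
                lowlink[node] = min(lowlink[node], index[succ])
        if lowlink[node] == index[node]:
            # node is the root of a component: pop its members off the stack
            component = set()
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.add(member)
                if member == node:
                    break
            components.append(frozenset(component))

    for node in graph:
        if node not in index:
            visit(node)
    return components


def condense(graph):
    """Collapse each component into one block; return blocks and block edges."""
    components = strongly_connected_components(graph)
    block_of = {node: comp for comp in components for node in comp}
    edges = set()
    for node, successors in graph.items():
        for succ in successors:
            if block_of[node] != block_of[succ]:
                edges.add((block_of[node], block_of[succ]))
    return components, edges


# Hypothetical toy network: R2 and R3 form a cycle, R1 and R4 do not.
toy = {"R1": ["R2"], "R2": ["R3"], "R3": ["R2", "R4"], "R4": []}
blocks, block_edges = condense(toy)
for block in blocks:
    print(sorted(block))
```

In this toy example the cycle between R2 and R3 becomes one block, so the four reactions reduce to three blocks connected without cycles. The recursive traversal is fine for small examples; very large networks would need an iterative version.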
Lastly, with every search the user gets the "pan" and "core" metabolisms, in the same output format. The pan-metabolism is the union of the metabolic networks of every organism in the genus. The core-metabolism is the metabolism common to all organisms in the genus (the intersection of all metabolic networks).
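The union and intersection behind the pan- and core-metabolism can be illustrated with plain set operations. The Python sketch below treats each organism's metabolic network simply as a set of reaction identifiers, which is a simplification made for this example. The organism and reaction names are hypothetical, and this is not the code SymGenDB runs.

```python
# Hypothetical reaction sets for three organisms of the same genus.
genus_networks = {
    "organism_1": {"rxn_a", "rxn_b", "rxn_c"},
    "organism_2": {"rxn_b", "rxn_c", "rxn_d"},
    "organism_3": {"rxn_b", "rxn_c"},
}


def pan_metabolism(networks):
    """Union: every reaction present in at least one organism."""
    return set().union(*networks.values())


def core_metabolism(networks):
    """Intersection: only the reactions present in every organism."""
    sets = list(networks.values())
    return set.intersection(*sets) if sets else set()


print("pan: ", sorted(pan_metabolism(genus_networks)))
print("core:", sorted(core_metabolism(genus_networks)))
```

Here the pan-metabolism contains all four reactions, while the core-metabolism contains only `rxn_b` and `rxn_c`, the two reactions shared by all three organisms.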