BASys

Content
Description: For automated bacterial genome annotation and chromosomal map generation
Data types captured: Data input: raw genome sequence (FASTA format), labeled genome sequence (FASTA format) or predicted/labeled proteome sequence (FASTA format). Data output: fully annotated genome along with an interactive, annotated genome map
Contact
Research center: University of Alberta
Laboratory: Dr. David Wishart
Primary citation: [1]
Access
Website: https://www.basys.ca/
Download URL: https://www.basys.ca/
Miscellaneous
Data release frequency: Last update 2012
Curation policy: Manually curated

BASys (Bacterial Annotation System) is a freely available web server for automated, comprehensive annotation of bacterial genomes.[2] With the advent of next-generation DNA sequencing, it is now possible to sequence the complete genome of a bacterium (typically ~4 million bases) within a single day. This has led to an explosion in the number of fully sequenced microbes: as of 2013, more than 2700 fully sequenced bacterial genomes had been deposited with GenBank. A continuing challenge in microbial genomics, however, is finding the resources or tools to annotate the large number of newly sequenced genomes. BASys was developed in 2005 in anticipation of these needs and was the world's first publicly accessible microbial genome annotation web server. Because of its widespread popularity, the BASys server was updated in 2011 with multiple additional server nodes to handle the large number of queries it was receiving.

The BASys server is designed to accept either assembled genome data (raw DNA sequence data) or complete proteome assignments as input. If raw DNA sequence is provided, BASys employs Glimmer (version 2.1.3) to identify the genes.[1] The output from BASys is a comprehensive genome-wide annotation (with ~60 annotation subfields for each gene) and a zoomable, hyperlinked genome map of the query genome. BASys uses nearly 30 different programs to determine and annotate gene/protein names, GO functions, COG functions, possible paralogues and orthologues, molecular weight, isoelectric point, operon structure, subcellular localization, signal peptides, transmembrane regions, secondary structure, 3D structure, reactions and pathways. The full list of programs used by BASys is given below:

| Name | Method |
| --- | --- |
| Glimmer 2.1.3 | Glimmer is a popular and very accurate ab initio gene finding program for microbial DNA. In a study of 31 complete bacterial and archaeal genomes, Glimmer achieved an average gene prediction accuracy of 99.36%. Glimmer uses Interpolated Markov Models to distinguish coding regions from noncoding DNA. Glimmer's performance decreases with increasing GC content. For genomes with high GC content (>60%), Glimmer may generate a high number of false positive predictions and should therefore be used with caution. |
| HMMER 2.3.2 | Used for local Pfam searches. |
| Homodeller 2.0 | Locally developed homology modelling program. |
| SignalP 3.0 | Signal peptide prediction. |
| TMHMM 2.0 | Prediction of transmembrane helices in proteins. |
| PSIPRED 2.45 | Secondary structure prediction. PSIPRED achieves an average Q3 score of 80.6% for secondary structure prediction. |
| PS_scan | Tool for local PROSITE scans. |
| VADAR 1.4 | Locally developed protein structure analysis tool. BASys uses VADAR to analyze protein structures for secondary structural information. |
| PSORT-B 2.0.4 | Used to predict subcellular location. PSORT-B attains a precision of 96% for Gram-positive and Gram-negative bacteria. |
| ProteinNameExtractor 1.0 | BASys function prediction module. This module was validated against a set of expertly annotated proteins from C. trachomatis. |
| FindParalogs 1.0 | BASys module for paralog identification. The paralogs database is created from the conceptual translations of the identified coding regions supplied to BASys by Glimmer or by the submitter. |
| FindHomologs 1.0 | BASys module for homolog identification. Searches model organism databases for possible homologs. |
| GOSearch 1.0 | BASys module for extracting Gene Ontology information from various sources. |
| OperonFinder 1.0 | BASys module for identifying operons. |
| StructureManager 1.0 | BASys module for manipulating protein structure files. |
| StructureClassifier 1.0 | BASys module for determining structure class from secondary structure information. |
| Structure Finder 1.0 | BASys module for generating protein structures from various sources. |
| COG_Finder 1.0 | BASys module for identifying COG functional categories and accessions. |
| Secondary Structure Manager 1.0 | BASys module for generating secondary structure information from various sources. |
| ECNumber_Finder | BASys module for mapping EC_number to and from various sources. |
| SwissProt Annotation Manager 1.0 | BASys module for comparing and transitively applying annotations from SwissProt records. |
| CCDB Annotation Manager 1.0 | BASys module for comparing and transitively applying annotations from CCDB records. |
| Gene Identifier 1.0 | BASys module for coordinating gene identification information from Glimmer or user submissions. |
| BASys Annotation Manager 1.0 | The BASys pipeline manager. |
| KEGG Search Manager | BASys module for searching and extracting metabolic information from KEGG. |
| SubCellLocalization Manager 1.0 | BASys module for generating subcellular location annotation from various sources. |
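
The Glimmer entry in the list above advises caution for genomes with GC content above 60%. As a minimal sketch (not part of BASys or Glimmer), the following Python snippet shows how a submitter might parse a FASTA-formatted sequence and compute its GC content to see whether that caveat applies. The function names `parse_fasta` and `gc_content`, the constant `HIGH_GC_THRESHOLD` and the toy sequence are all hypothetical and used only for illustration.

```python
from typing import Dict

# Cut-off taken from the Glimmer caveat quoted above (>60% GC).
HIGH_GC_THRESHOLD = 0.60


def parse_fasta(text: str) -> Dict[str, str]:
    """Return a {record id: sequence} mapping for FASTA-formatted text."""
    records: Dict[str, str] = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the id.
            parts = line[1:].split()
            current = parts[0] if parts else ""
            records[current] = ""
        elif current is not None:
            # Sequence lines are concatenated and upper-cased.
            records[current] += line.upper()
    return records


def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases A, C, G and T."""
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / counted


if __name__ == "__main__":
    # Toy record, not a real genome.
    example = ">toy_genome hypothetical\nGGCCGCGCAT\nGCGCGGCCTA\n"
    for name, seq in parse_fasta(example).items():
        gc = gc_content(seq)
        note = "above threshold" if gc > HIGH_GC_THRESHOLD else "within threshold"
        print(f"{name}: GC = {gc:.1%} ({note})")
```

Ambiguous bases such as N are excluded from the denominator here; that is a design choice of this sketch, and other tools may count them differently.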

In addition to its extensive annotation for each gene/protein in the query genome, BASys also generates colorful, clickable and fully zoomable circular maps of each input chromosome. These bacterial genome maps are generated using a program called CGView (Circular Genome Viewer), which was developed in 2004.[3] The genome maps are designed to allow rapid navigation and detailed visualization of all the BASys-generated gene annotations. A complete BASys run takes approximately 16 hours for an average bacterial chromosome (approximately 4 megabases). BASys annotations may be viewed and downloaded anonymously or through a password-protected access system. BASys stores its bacterial genome annotations on the server for a maximum of 180 days and handles approximately 1000 submissions a year. BASys is accessible at https://www.basys.ca/
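
To illustrate the idea behind a circular chromosome map, the following Python sketch converts a base-pair position into an angle and a point on a circle. It is not CGView's implementation: the function names, the 1-based coordinate convention and the choice of placing position 1 at 12 o'clock with positions increasing clockwise are all assumptions made for this example.

```python
import math
from typing import Tuple


def position_to_angle(position: int, genome_length: int) -> float:
    """Angle in degrees, clockwise from 12 o'clock, for a 1-based position."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    if not 1 <= position <= genome_length:
        raise ValueError("position must lie within the genome")
    return 360.0 * (position - 1) / genome_length


def angle_to_xy(angle_deg: float, radius: float) -> Tuple[float, float]:
    """Cartesian point (x to the right, y upward) for a clockwise angle."""
    theta = math.radians(angle_deg)
    # sin/cos are swapped relative to the usual convention because the
    # angle is measured from the top of the circle, not from the x-axis.
    return radius * math.sin(theta), radius * math.cos(theta)


if __name__ == "__main__":
    genome_length = 4_000_000  # roughly the average size quoted above
    for pos in (1, 1_000_001, 2_000_001):
        angle = position_to_angle(pos, genome_length)
        x, y = angle_to_xy(angle, radius=100.0)
        print(f"position {pos}: {angle:.1f} degrees, x={x:.1f}, y={y:.1f}")
```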

Scope and Access

All data in BacMap is non-proprietary or is derived from a non-proprietary source. It is freely accessible and available to anyone. In addition, nearly every data item is fully traceable and explicitly referenced to the original source. BacMap data is available through a public web interface and downloads.

References

  1. Van Domselaar GH, Stothard P, Shrivastava S, Cruz JA, Guo A, Dong X, Lu P, Szafron D, Greiner R, Wishart DS (July 2005). "BASys: a web server for automated bacterial genome annotation". Nucleic Acids Res. 33 (Web Server issue): W455–9. doi:10.1093/nar/gki593. PMC 1160269. PMID 15980511.
  2. Stothard P, Van Domselaar G, Shrivastava S, Guo A, O'Neill B, Cruz J, Ellison M, Wishart DS (2005). "BacMap: an interactive picture atlas of annotated bacterial genomes". Nucleic Acids Res. 33 (Database issue): D317–20. doi:10.1093/nar/gki075. PMC 540029. PMID 15608206.
  3. Stothard P, Wishart DS (2005). "Circular genome visualization and exploration using CGView". Bioinformatics. 21 (4): 537–9. doi:10.1093/bioinformatics/bti054. PMID 15479716.
This article is issued from Wikipedia - version of the 4/27/2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.