Usage

BAR 3.0 accepts several kinds of query: a protein sequence (given as a UniProtKB accession or a FASTA file), a GO Term, PFAM or PDB identifier, a ligand code, or an NCBI Taxonomy identifier.

To try some possible searches, click on the suggested queries below the input box.
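As a rough illustration of how the query types listed above differ in shape, the following sketch guesses the type of a query string from its format. The patterns are assumptions based on the public identifier formats (the UniProtKB pattern follows the accession format published by UniProt); they do not describe how BAR 3.0 itself parses the input box, and the function name classify_query is hypothetical.

```python
import re

# Assumed identifier shapes, checked in order. These are NOT taken from
# BAR 3.0 itself; they only reflect the public format of each identifier.
PATTERNS = [
    ("go", re.compile(r"GO:\d{7}")),
    ("pfam", re.compile(r"PF\d{5}")),
    ("taxonomy", re.compile(r"\d+")),
    ("pdb", re.compile(r"[0-9][A-Za-z0-9]{3}")),
    ("ligand", re.compile(r"[A-Za-z0-9]{1,3}")),
    ("uniprot", re.compile(
        r"[OPQ][0-9][A-Z0-9]{3}[0-9]"
        r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}")),
]


def classify_query(text):
    """Return the name of the first pattern matching the whole query, or None."""
    query = text.strip()
    for name, pattern in PATTERNS:
        if pattern.fullmatch(query):
            return name
    return None


for example in ["GO:0004889", "PF02932", "1kc4", "MG", "8713"]:
    print(example, classify_query(example))
```

The order of the checks matters: a purely numeric string is treated as a taxonomy identifier first, so an all-digit PDB or ligand code would be misclassified by this sketch.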

Searching by sequence

Enter a UniProtKB accession in the Query field and press Submit. If the accession is present in BAR 3.0, the annotation page of its cluster is displayed.

GO Term and PFAM

Enter a GO Term identifier (e.g. GO:0004889) or a PFAM code (e.g. PF02932) and click Submit.

If the term is statistically validated in at least one BAR 3.0 cluster, the output will show the list of matching clusters. Each item in the list is linked to the annotation page for that cluster.

The result page has an additional filter field: enter an NCBI Taxonomy identifier and click Submit to show only the clusters that contain sequences from that organism.

PDB

Enter a PDB identifier (e.g. 1kc4) and click Submit.

If the PDB entry is associated with a sequence in a BAR 3.0 cluster, the output will show the list of matching clusters. Each item in the list is linked to the annotation page for that cluster.

Ligand

Enter a ligand code (e.g. MG) and click Submit.

If the ligand is associated with a PDB entry in a BAR 3.0 cluster, the output will show the list of matching clusters. Each item in the list is linked to the annotation page for that cluster.

Organism

Enter an NCBI Taxonomy identifier for the desired organism (e.g. 8713) and click Submit.

If the organism is associated with a sequence in a BAR 3.0 cluster, the output will show the list of matching clusters. Each item in the list is linked to the annotation page for that cluster.

Output

The cluster annotation page shows cluster statistics at the top.

The button Download cluster data in JSON format lets you download all cluster information in a machine-readable format. The schema for the JSON file is available here.
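A downloaded file can be inspected with any JSON library. The sketch below makes no assumption about the BAR 3.0 schema: it only loads the file and reports the type of each top-level entry, which is a reasonable first step before consulting the schema. The function name top_level_summary is hypothetical.

```python
import json


def top_level_summary(path):
    """Load a downloaded cluster JSON file and describe its top-level layout."""
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)
    if isinstance(data, dict):
        # Map each top-level key to the type name of its value.
        return {key: type(value).__name__ for key, value in data.items()}
    # The root is not an object: report only the type of the root value.
    return {"<root>": type(data).__name__}
```

Calling top_level_summary on a downloaded file returns a dictionary such as key name to "list", "dict" or "str", depending on what the file actually contains.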

The Query box shows information about the query sequence, if one was provided. This section also reports the sequence identity (SI) and coverage (COV) of the best alignment against the cluster.

If the cluster contains structural information, the PDB section contains a table listing all the PDB chains associated with it, along with UniProtKB accessions and ligands. Each row links to the PDB files associated with sequences in the cluster. At the end of this section there are links to the cluster HMM profile and to the downloadable alignment of the query sequence against it.

The PDB Complexes section lists clusters containing sequences in complex with members of the result cluster. For each PDB complex, the table shows the linked cluster, the matching UniProtKB accessions, the PDB entry and its chains.

The Protein-protein interactions section lists clusters containing sequences that IntAct associates with members of the result cluster. For each matching cluster, the table displays the number of interacting pairs, whether the query sequence takes part in an interaction and with which partners, and whether the matching cluster contains sequences from the same organism as the query sequence.

GO Term and PFAM annotations are listed next. The heading of each annotation type (Biological Process, Molecular Function, etc.) reports the number of statistically validated terms over the total number of annotation terms of the same type in the cluster.

Each term links to the corresponding entry in its database of origin. Statistically validated terms are marked as "yes" in the "Validated" column. Terms are sorted by P-value, from lowest to highest, and by term depth in the ontology (from deeper to shallower).
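The heading counts and the ordering described above can be reproduced on a downloaded table with a few lines of code. The sketch below assumes that P-value is the primary sort key and depth the tie-breaker, which is one reading of the description. The Term record, its field names and the TERM_A to TERM_D identifiers are hypothetical placeholders, not BAR 3.0 data.

```python
from collections import namedtuple

# Hypothetical record for one annotation row; field names are assumptions.
Term = namedtuple("Term", "identifier p_value depth validated")


def sort_terms(terms):
    """Order by ascending P-value, then by descending ontology depth."""
    return sorted(terms, key=lambda t: (t.p_value, -t.depth))


def validated_ratio(terms):
    """Return (validated terms, total terms), as shown in a section heading."""
    validated = sum(1 for t in terms if t.validated)
    return validated, len(terms)


terms = [
    Term("TERM_A", 1e-5, 3, True),
    Term("TERM_B", 1e-8, 2, True),
    Term("TERM_C", 1e-5, 6, True),
    Term("TERM_D", 0.2, 4, False),
]
print([t.identifier for t in sort_terms(terms)])
# ['TERM_B', 'TERM_C', 'TERM_A', 'TERM_D']
print(validated_ratio(terms))
# (3, 4)
```

TERM_C precedes TERM_A because both share the same P-value and TERM_C is deeper in the ontology.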

A statistically validated PFAM may bring its associated GO terms to validated status. Such GO terms are marked by an icon in the PFAM column, which also shows the associated PFAM.

If a GO term may be associated only with sequences from a specific taxonomy, or must never be associated with sequences from a specific taxonomy, these limitations are shown in the Constraints column.

The KEGG Pathways section lists KEGG pathways associated with sequences from the result cluster.

Each table can be downloaded in text, Excel and PDF format, and printed using the appropriate button below its heading.

General Information

BAR 3.0 is a web server for the functional annotation of protein sequences. The annotation process relies on a non-hierarchical clustering procedure applied to a BLAST all-against-all comparison of the entire UniProtKB (SwissProt + TrEMBL), excluding fragment sequences. A graph scheme is adopted in which each protein is a node. An edge is established between two nodes if the two corresponding sequences share a BLAST hit that satisfies the following constraints:

Sequence identity ≥ 40%

Coverage of the alignment ≥ 90%

The coverage is defined as the ratio between the length of the intersection of the aligned regions on the two sequences and the overall length of the alignment (namely, the sum of the lengths of the two sequences minus the intersection length).
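Written as a formula, the definition above is COV = I / (L1 + L2 - I), where L1 and L2 are the two sequence lengths and I is the intersection length. The sketch below evaluates it; how I is measured from a BLAST hit is not specified here, so the function takes it as an input, and the function and parameter names are hypothetical.

```python
def coverage(len_a, len_b, intersection):
    """Coverage = intersection / (len_a + len_b - intersection)."""
    if len_a <= 0 or len_b <= 0:
        raise ValueError("sequence lengths must be positive")
    if intersection < 0 or intersection > min(len_a, len_b):
        raise ValueError("intersection must lie between 0 and the shorter length")
    return intersection / (len_a + len_b - intersection)


# Two sequences of 100 residues each.
print(round(coverage(100, 100, 95), 3))  # 0.905, passes the 90% threshold
print(round(coverage(100, 100, 94), 3))  # 0.887, fails the 90% threshold
```

Under this definition the threshold is stricter than it may look: two 100-residue sequences need an intersection of at least 95 residues, and when the shorter sequence is entirely covered the coverage reduces to the ratio of the shorter length to the longer one.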

Clusters are the connected components of the graph and are disjoint (each sequence belongs to exactly one cluster).
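The graph construction can be sketched with a union-find structure: every hit that passes both thresholds joins two nodes, and the resulting groups are the connected components. This is an illustrative sketch, not the BAR 3.0 implementation. The hit tuple layout (sequence, sequence, identity in percent, coverage in percent), the function name and the toy sequence names are assumptions, and every sequence named in a hit is assumed to appear in the sequence list.

```python
def cluster(sequences, hits, min_identity=40.0, min_coverage=90.0):
    """Group sequences into the connected components of the hit graph."""
    parent = {name: name for name in sequences}

    def find(node):
        # Follow parent links to the root, halving the path on the way.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for seq_a, seq_b, identity, cov in hits:
        # An edge exists only if both constraints are satisfied.
        if identity >= min_identity and cov >= min_coverage:
            root_a, root_b = find(seq_a), find(seq_b)
            if root_a != root_b:
                parent[root_b] = root_a

    groups = {}
    for name in sequences:
        groups.setdefault(find(name), set()).add(name)
    return sorted(sorted(members) for members in groups.values())


sequences = ["A", "B", "C", "D", "E"]
hits = [
    ("A", "B", 55.0, 95.0),  # edge
    ("B", "C", 42.0, 91.0),  # edge
    ("C", "D", 80.0, 70.0),  # coverage too low
    ("D", "E", 39.0, 99.0),  # identity too low
]
print(cluster(sequences, hits))
# [['A', 'B', 'C'], ['D'], ['E']]
```

A and C end up in the same cluster even though no hit links them directly, because membership follows from connectivity rather than from pairwise similarity. Sequences without any qualifying hit form singleton clusters.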

References

  1. Bartoli, L., Montanucci, L., Fronza, R., Martelli, P.L., Fariselli, P., Carota, L., Donvito, G., Maggi, G.P., and Casadio, R. (2009) The Bologna annotation resource: a non hierarchical method for the functional and structural annotation of protein sequences relying on a comparative large-scale genome analysis. J Proteome Res., 8, 4362-4371. 10.1021/pr900204r. PMID: 19552451
  2. Piovesan, D., Martelli, P.L., Fariselli, P., Zauli, A., Rossi, I., and Casadio, R. (2011) BAR-PLUS: the Bologna Annotation Resource Plus for functional and structural annotation of protein sequences. Nucleic Acids Res., 39, W197-202. 10.1093/nar/gkr292. PMID: 21622657
  3. Piovesan, D., Martelli, P.L., Fariselli, P., Profiti, G., Zauli, A., Rossi, I., and Casadio, R. (2013) How to inherit statistically validated annotation within BAR+ protein clusters. BMC Bioinformatics, 14 (Suppl 3), S4. 10.1186/1471-2105-14-S3-S4. PMID: 23514411