SPARCLE, the Subfamily Protein Architecture Labeling Engine, is a resource for the functional characterization and labeling of protein sequences that have been grouped by their characteristic conserved domain architecture. A domain architecture is defined as the sequential order of conserved domains in a protein sequence.
To use SPARCLE, you can either enter a protein query sequence into CD-Search or search the SPARCLE database by keyword.
With either approach, the corresponding SPARCLE record(s) will display the name and functional label of the architecture, supporting evidence, and links to other proteins with the same architecture. Illustrated examples for each approach are below.
The most common way to access SPARCLE is to enter a query sequence into CD-Search. The search results will include a "Protein Classification" section if the query protein has a hit to a curated domain architecture in the SPARCLE database. The illustration below provides an example, using NP_387887, DNA gyrase subunit B, as the protein query sequence.
Click on the individual panels of the illustration to open the corresponding live web pages. The first panel of the illustration opens a blank CD-Search page, into which you can enter the accession number of the sample query sequence (NP_387887). Please note that the second and third panels of the illustration reflect the search results as of January 2017. The corresponding live web pages will show a slightly different result, because the annotation of domain architectures on proteins continues to evolve as new data and publications become available. (See the note about ongoing research beneath the illustration.)
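The CD-Search step described above can also be prepared from a script. The following is a minimal sketch in Python that only builds a link to CD-Search for the sample accession; it does not contact NCBI. The endpoint path and the `seqinput` parameter name are assumptions that are not stated on this page, and the helper `cd_search_url` is hypothetical, so check them against the live CD-Search page before relying on them.

```python
from urllib.parse import urlencode

# Assumed address of the CD-Search web form; verify against the live page.
CD_SEARCH_URL = "https://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi"


def cd_search_url(query: str) -> str:
    """Return a CD-Search link pre-filled with an accession or sequence."""
    query = query.strip()
    if not query:
        raise ValueError("query must be a non-empty accession or sequence")
    # "seqinput" is an assumed form-field name, not taken from this page.
    return f"{CD_SEARCH_URL}?{urlencode({'seqinput': query})}"


if __name__ == "__main__":
    # Sample query from the text: DNA gyrase subunit B.
    print(cd_search_url("NP_387887"))
```

Opening the printed link in a browser should be equivalent to typing the accession into a blank CD-Search page, if the assumed parameter name is correct.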
Ongoing research: The Conserved Domain Database (CDD) and the conserved domain architectures annotated on proteins by SPARCLE continue to evolve as new data become available and as research progresses. Therefore, the live web page views might differ from the illustration above.
For example, in January 2017, the protein sequence NP_387887 was initially annotated with architecture ID 10647733 (as shown in the illustration). That architecture is named "DNA gyrase subunit B" and includes four distinct conserved domains.
In March 2017, when a new build of CDD/SPARCLE was released, the conserved domain architecture annotation for NP_387887 was revised to architecture ID 11481348. This architecture is a multi-domain that encompasses the four original conserved domains, and it can be seen in the current CD-Search results for NP_387887. It has a more specific name, "type IIA DNA topoisomerase subunit B," and reflects the full-length protein model.
To see the four distinct conserved domains that compose the multi-domain, change the display option on the live CD-Search results for NP_387887 from "Concise Results" to "Full Results" (using the "View" menu near the upper right-hand corner of the CD-Search results page).
As the available data and understanding of conserved domain architectures continue to evolve, the domain architectures that are annotated on proteins may evolve as well, as shown in this example. Comments about the data are welcome and can be sent to the NCBI Support Center/Help Desk, which is accessible as a link in the footer of NCBI web pages.
It is also possible to search the SPARCLE database by keyword to retrieve domain architectures that contain the term(s) of interest in their descriptions.
The illustration below provides an example. It searches the SPARCLE database for conserved domain architecture records that contain the terms "chloride" and "channel", and limits the results to curated domain architecture records by adding curated[ReviewLevel] to the search.
Click on the individual panels of the illustration below to open the corresponding live web pages. Please note that the second panel of the illustration shows the search results as of March 2, 2017; the corresponding live web page will retrieve a larger number of records, as the SPARCLE database continues to grow.
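The keyword search described above can be sketched as a small Python helper that assembles the query string and a link to the SPARCLE search page, without contacting NCBI. The `curated[ReviewLevel]` filter comes from the text; joining the terms with an explicit `AND`, the `term` URL parameter, and the base address are assumptions based on common Entrez conventions, and the function names are hypothetical.

```python
from urllib.parse import urlencode

# Assumed address of the SPARCLE search page; verify against the live site.
SPARCLE_URL = "https://www.ncbi.nlm.nih.gov/sparcle/"


def sparcle_term(keywords, curated_only=True):
    """Combine keywords into one query, optionally limited to curated records."""
    terms = [k.strip() for k in keywords if k.strip()]
    if not terms:
        raise ValueError("at least one non-empty keyword is required")
    if curated_only:
        # Filter named in the text: restricts results to curated architectures.
        terms.append("curated[ReviewLevel]")
    # Explicit AND between terms is an assumption (standard Entrez Boolean syntax).
    return " AND ".join(terms)


def sparcle_search_url(keywords, curated_only=True):
    """Return a link that runs the query on the SPARCLE web page."""
    # "term" is an assumed query-parameter name.
    query = urlencode({"term": sparcle_term(keywords, curated_only)})
    return f"{SPARCLE_URL}?{query}"


if __name__ == "__main__":
    print(sparcle_term(["chloride", "channel"]))
    print(sparcle_search_url(["chloride", "channel"]))
```

Passing `curated_only=False` drops the review-level filter, which would be expected to return a broader set of architecture records than the curated-only search.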