Conserved Domains and Protein Classification
 
 
 
 
How to find proteins with similar domain architectures
 
 

Each of the methods below retrieves proteins with similar domain architectures, as identified by the Conserved Domain Architecture Retrieval Tool (CDART). A domain architecture is defined as the sequential order of conserved domains in a protein sequence.
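To make the definition concrete, here is a minimal sketch of how a domain architecture could be derived from a set of domain hits on a protein. The `DomainHit` type, the `architecture` function, and the domain names and coordinates are hypothetical and used only for illustration; they are not part of CDART or CD-Search.

```python
from typing import Iterable, NamedTuple, Tuple


class DomainHit(NamedTuple):
    """One conserved-domain hit on a protein (hypothetical structure)."""
    name: str   # conserved domain name (made up for this example)
    start: int  # first residue covered by the hit
    end: int    # last residue covered by the hit


def architecture(hits: Iterable[DomainHit]) -> Tuple[str, ...]:
    """Return the domain names in N- to C-terminal order."""
    return tuple(hit.name for hit in sorted(hits, key=lambda hit: hit.start))


# Hits are listed out of order on purpose; the architecture is their
# sequential order along the protein, not the order they were reported in.
hits = [
    DomainHit("DomB", 210, 330),
    DomainHit("DomA", 5, 180),
    DomainHit("DomC", 500, 640),
]
print(architecture(hits))
```

Under this representation, two proteins have the same architecture when their tuples are equal, regardless of the exact residue positions of each domain.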

 
  Method 1 (illustrated below):   Enter a query (a protein sequence, a set of conserved domains, or multiple queries) directly into the Conserved Domain Architecture Retrieval Tool (CDART) page. (See first frame of illustration below.)  
 
  Method 2 (illustrated below):   Retrieve a sequence record of interest from the Entrez Protein database, scroll to the "Related information" menu in the right margin, and select "Domain Relatives." (See second frame of illustration below.)  
 
  Method 3:   Use the CD-Search tool to compare a protein query sequence (either as raw sequence data in FASTA format, or as a GI or Accession) against the desired conserved domain data set, in order to identify functional units within the query sequence. Then press the "Search for similar domain architectures" button on the search results page.  
 

Regardless of which method you use, the results will display a list of similar domain architectures, ranked by the number of domains they share with the query protein's domain architecture. The results display also provides links to the proteins that have each architecture.
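The ranking idea can be sketched as follows. This is only an illustration of ordering by the number of shared domains; the text above does not specify CDART's actual scoring or tie-breaking, so the counting rule used here (a multiset intersection that respects repeated domains) is an assumption, and all function and domain names are hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Architecture = Tuple[str, ...]


def shared_domain_count(query: Architecture, candidate: Architecture) -> int:
    """Count domains present in both architectures, respecting repeats."""
    return sum((Counter(query) & Counter(candidate)).values())


def rank_architectures(
    query: Architecture, candidates: Sequence[Architecture]
) -> List[Architecture]:
    """Order candidates from most to fewest domains shared with the query."""
    return sorted(
        candidates,
        key=lambda arch: shared_domain_count(query, arch),
        reverse=True,
    )


query = ("DomA", "DomB", "DomC")
candidates = [
    ("DomA",),                        # shares 1 domain with the query
    ("DomA", "DomB", "DomC", "DomD"), # shares 3 domains with the query
    ("DomB", "DomC"),                 # shares 2 domains with the query
]
for arch in rank_architectures(query, candidates):
    print(shared_domain_count(query, arch), arch)
```

Note that this sketch ignores domain order when counting, so an architecture with the same domains in a different order would score the same as an exact match.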

The illustration below shows Methods 1 and 2 for using the Conserved Domain Architecture Retrieval Tool (CDART). Click on any frame of the image below to link to corresponding sections in the CDART help document, which provide additional details about the input options and output display.



Illustration of the CDART home page, where you can input a query as a protein sequence, a set of conserved domains, or multiple queries. Click on this image for details and examples.
Illustration of a sample protein sequence record (mouse DNA mismatch repair protein Mlh1, NP_081086) from the Entrez Protein database, where you can follow the link for Domain Relatives to view a list of proteins with similar domain architectures. Click on this graphic to read more about the various links that exist from a protein record to conserved domains.
Illustration of CDART search results, which list proteins that have domain architectures similar to your query protein sequence (NP_081086, mouse DNA mismatch repair protein, in this example). Click on this graphic to read more about the output display.
The expanded view of a domain architecture displays a list of representative, non-redundant protein sequences which have that architecture. Click on this graphic to read more about the information provided for each domain architecture.


If you would like to try this example yourself, open the CDART home page and enter NP_081086 (mouse DNA mismatch repair protein Mlh1) as the query, or retrieve the sequence record from the Entrez Protein database and then follow the link for "Domain Relatives" that appears under "Related Information" in the right margin. Click on any frame of the image above to link to subsequent sections in this help document, which provide additional details about the input options and output display.


 
 
 
Revised 26 September 2016