Logo Help
What is a logo?
Logos are commonly used in molecular biology to provide a compact graphical representation of the conservation pattern of a set of sequences. They render the information contained in sequence alignments or profile hidden Markov models by drawing a stack of letters for each position, where
- the height of the stack typically corresponds to the conservation at that position, and
- the height of each letter within a stack depends on the frequency of that letter at that position.
What is shown in a logo?
Letter conservation
The per-position probabilities underlying the letter stack calculations are estimates of the per-position letter distributions across the family of homologous sequences. When rendering a logo for a profile HMM, those estimates are taken directly from the HMM. When rendering a logo for a sequence alignment, Skylign can calculate these estimates (a) directly from frequencies observed in the alignment, (b) from observed counts after assigning sequence weights to account for sequence redundancy, or (c) after assigning weights and combining with a Dirichlet mixture prior. These options are implemented using HMMER (http://hmmer.org).
Skylign offers three ways of calculating letter and stack heights for each column of the underlying alignment:
- Information content - all : Stack height is the information content (aka relative entropy) of the position, and letters divide that height according to their estimated probability. This is the default, and is the way logo letter stacks are typically computed in other tools.
- Information content - above background : Stack height is the information content of the position, and only positive-scoring letters are included in the stack. The height of the stack is subdivided according to the relative probabilities of those positive-scoring letters. This reduces the noisy mash of letters at the bottom of logos when strong priors are mixed with observed counts (as with HMMER 3.1 profile HMMs).
- Score : The height of each letter depends on that letter’s score, and only positive-scoring letters are included in the stack. The height of a stack does not have any inherent meaning in this case.
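The three modes can be illustrated for a single column with a minimal sketch. This is not Skylign's own code: the function name `columnHeights`, the mode names, and the choice to use the log-odds score in bits as the letter height in Score mode are assumptions made for illustration, and the background distribution is supplied by the caller.

```javascript
// Sketch of the three height modes for one column.
// probs: estimated letter probabilities for the column (summing to 1).
// background: background letter probabilities (an input assumed here).
function columnHeights(probs, background, mode) {
  const letters = Object.keys(probs);
  // Per-letter log-odds score in bits; zero-probability letters score -Infinity.
  const score = (a) => Math.log2(probs[a] / background[a]);
  // Information content (relative entropy) in bits, taking 0 * log(0) as 0.
  const info = letters.reduce(
    (sum, a) => (probs[a] > 0 ? sum + probs[a] * score(a) : sum), 0);

  const heights = {};
  if (mode === 'info_all') {
    // Every letter gets a share of the stack proportional to its probability.
    for (const a of letters) heights[a] = info * probs[a];
  } else if (mode === 'info_above') {
    // Only positive-scoring letters share the stack height.
    const positive = letters.filter((a) => probs[a] > 0 && score(a) > 0);
    const total = positive.reduce((sum, a) => sum + probs[a], 0);
    for (const a of positive) heights[a] = (info * probs[a]) / total;
  } else if (mode === 'score') {
    // Assumption: letter height equals the positive log-odds score.
    for (const a of letters) {
      if (probs[a] > 0 && score(a) > 0) heights[a] = score(a);
    }
  } else {
    throw new Error('unknown mode: ' + mode);
  }
  return { info, heights };
}
```

With a uniform nucleotide background, a column that is always A has an information content of 2 bits, and in all three modes the A letter fills the whole stack.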
Gap parameters
In addition to representing the per-position letter distribution, Skylign renders position-specific gap parameters. It does this by presenting three values for each position k:
- Insert probability: the probability of observing one or more letters inserted between the letter corresponding to position k and the letter corresponding to position (k+1).
- Insert length: the expected length of an insertion following position k, if one is observed.
- Occupancy: the probability of observing a letter at position k. If we call this value occ(k), the probability of observing a gap character (part of a deletion relative to the model) is 1 - occ(k).
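As a small worked example of how these three values combine, the sketch below derives two quantities from them: the gap probability 1 - occ(k), and the expected number of inserted letters after position k averaged over all sequences (insert probability times insert length, by the law of total expectation). The helper name `gapSummary` is hypothetical and not part of Skylign.

```javascript
// Derive two summary quantities from the per-position gap values.
function gapSummary(insertProb, insertLength, occupancy) {
  return {
    // Probability of a gap character (deletion) at position k.
    deleteProb: 1 - occupancy,
    // Expected number of inserted letters after position k, averaged over
    // sequences with and without an insertion.
    expectedInserted: insertProb * insertLength,
  };
}

// Example: a 25% chance of an insertion averaging 4 letters, occupancy 0.75.
console.log(gapSummary(0.25, 4, 0.75));
```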
Scale
Skylign allows you to scale the logo so that the maximum value on the Y-axis corresponds to:
- Maximum Observed: largest observed information content (in bits)
- Maximum Theoretical: largest possible information content (in bits)
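For reference, the largest possible information content of a column follows from the background distribution alone: relative entropy is maximized when all probability sits on the rarest background letter, giving -log2 of the smallest background probability. The sketch below computes that bound. Whether Skylign derives its theoretical maximum in exactly this way is an assumption, and `maxTheoreticalBits` is a hypothetical name.

```javascript
// Upper bound on relative entropy (in bits) for a given background:
// reached when the column contains only the rarest background letter.
function maxTheoreticalBits(background) {
  const smallest = Math.min(...Object.values(background));
  return -Math.log2(smallest);
}

// A uniform nucleotide background gives a maximum of 2 bits.
console.log(maxTheoreticalBits({ A: 0.25, C: 0.25, G: 0.25, T: 0.25 }));
```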
Color Schemes
- Default : A unique color is assigned to each amino acid or nucleotide residue.
- Consensus Colors : The residues are colored according to the ClustalX coloring scheme:
- Glycine (G)
- Proline (P)
- Small or hydrophobic (A,V,L,I,M,F,W)
- Hydroxyl or amine amino acids (S,T,N,Q)
- Charged amino-acids (D,E,R,K)
- Histidine or tyrosine (H,Y)
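The grouping above can be written as a simple lookup table. This sketch covers only the residue groups listed here and leaves out the colors themselves, which are not stated in this text; the names `CONSENSUS_GROUPS` and `consensusGroup` are illustrative.

```javascript
// Residue groups as listed above, keyed by a short group label.
const CONSENSUS_GROUPS = {
  'Glycine': 'G',
  'Proline': 'P',
  'Small or hydrophobic': 'AVLIMFW',
  'Hydroxyl or amine': 'STNQ',
  'Charged': 'DERK',
  'Histidine or tyrosine': 'HY',
};

// Return the group label for a one-letter residue code, or null if the
// residue is not in any of the groups listed above.
function consensusGroup(residue) {
  const code = residue.toUpperCase();
  for (const [name, members] of Object.entries(CONSENSUS_GROUPS)) {
    if (members.includes(code)) return name;
  }
  return null;
}
```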
Coordinates
When building a logo for an input alignment, Skylign produces an intermediate HMM. If this is done with the 'remove mostly-empty columns' setting, or if an HMM is uploaded directly, positions in the HMM may differ from the corresponding columns in the associated alignment. In this case, choose from the following:
- Model: The coordinates along the top of the plot show the model position.
- Alignment: The coordinates along the top of the plot show the column in the alignment associated with the corresponding position in the model.
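To make the difference between the two coordinate systems concrete, the sketch below drops mostly-empty columns from a toy alignment and records which alignment column each remaining model position came from. It is a simplified illustration: sequences are unweighted, the 0.5 occupancy threshold is an assumption, and `modelToAlignmentColumns` is a hypothetical helper rather than anything Skylign or HMMER exposes.

```javascript
// rows: aligned sequences of equal length; '-' or '.' marks a gap.
// minOccupancy: fraction of non-gap characters a column needs to be kept
// (assumed threshold for this sketch).
function modelToAlignmentColumns(rows, minOccupancy = 0.5) {
  const nCols = rows[0].length;
  const kept = [];
  for (let col = 0; col < nCols; col++) {
    const residues = rows.filter((r) => r[col] !== '-' && r[col] !== '.').length;
    if (residues / rows.length >= minOccupancy) kept.push(col + 1); // 1-based
  }
  // kept[k - 1] is the alignment column shown for model position k.
  return kept;
}

const columns = modelToAlignmentColumns(['AC-GT', 'A--GT', 'ACTGT', 'A--GA']);
console.log(columns); // model position 3 maps to alignment column 4
```

In this toy alignment the third column is mostly gaps, so model coordinates run 1 to 4 while the matching alignment coordinates are 1, 2, 4, 5.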
More details
This description leaves many details unexplained. Please see the Skylign paper for details.
Uploading a file
Skylign will compute a logo for a profile HMM or for a sequence alignment.
Profile HMM
Skylign accepts HMMER-formatted profile HMM files. Producing a profile logo is straightforward: Skylign reads the estimated per-position distributions from the HMM file and uses them to compute stack heights and gap values. The submission form requires selection of the preferred method for computing letter and stack heights.
Alignment
Skylign accepts sequence alignments in any format accepted by HMMER (this includes Stockholm and aligned FASTA format). Producing an alignment logo requires selection of the preferred method of estimating per-column parameters from the observed frequencies in the alignment (Alignment Processing):
- Observed counts: For each column, use the maximum-likelihood estimate - the observed counts.
- Weighted counts: For each column, use the maximum-likelihood estimate after applying weights to each sequence to account for high similarity in a subset of sequences.
- Convert to profile HMM - keep all columns: Apply sequence weights, and also combine with a Dirichlet mixture prior (incorporating absolute weighting to control relative contribution of observed counts and prior). Keeping all columns means that even columns in the alignment that mostly consist of gap characters will be represented by positions in the logo.
- Convert to profile HMM - remove mostly-empty columns: Apply sequence weights, and also combine with a Dirichlet mixture prior (incorporating absolute weighting to control relative contribution of observed counts and prior). Removing mostly-empty columns means that only columns in the alignment that mostly (after weighting) consist of non-gap characters will be represented by positions in the logo.
- Alignment sequences are full length: Count all terminal gaps as deletions.
- Some sequences are fragments: When a sequence is a fragment (less than half the length of the alignment), its terminal gaps are not considered when counting a column's deletions.
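The effect of the fragment setting on deletion counts can be sketched as follows. This is an illustration, not HMMER's implementation: it treats a sequence as a fragment when its residue count is below half the alignment length, ignores sequence weights, and uses a hypothetical function name `countDeletions`.

```javascript
// Count gap characters per column, skipping terminal gaps of fragments.
// fragThreshold: a sequence with fewer residues than fragThreshold * alignment
// length is treated as a fragment; 0 treats every sequence as full length.
function countDeletions(rows, fragThreshold = 0.5) {
  const nCols = rows[0].length;
  const isGap = (c) => c === '-' || c === '.';
  const deletions = new Array(nCols).fill(0);
  for (const row of rows) {
    const cells = [...row];
    const residues = cells.filter((c) => !isGap(c)).length;
    // Index of the first and last residue (both -1 for an all-gap row).
    const first = cells.findIndex((c) => !isGap(c));
    let last = cells.length - 1;
    while (last >= 0 && isGap(cells[last])) last--;
    const fragment = residues < fragThreshold * nCols;
    for (let col = 0; col < nCols; col++) {
      if (!isGap(cells[col])) continue;
      const terminal = col < first || col > last;
      if (fragment && terminal) continue; // not counted as a deletion
      deletions[col]++;
    }
  }
  return deletions;
}

const rows = ['ACGTACGT', '--GT----', 'AC-TACG-'];
console.log(countDeletions(rows));    // fragments allowed
console.log(countDeletions(rows, 0)); // all sequences treated as full length
```

In the example, the second sequence has only two residues in an eight-column alignment, so with fragments allowed its leading and trailing gaps add nothing to the per-column deletion counts.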
Note: in older browsers, it is not possible to discern an HMM file from an alignment file, so the "Alignment Processing" option will be shown even for an uploaded profile HMM. Simply select "Convert to an HMM".
API Documentation
All documentation for the resources made available by the REST API can be found on the REST API Documentation pages.
Using the API
Creating a logo
The first thing you will need to do is upload your alignment or HMM file to our server.
curl -H 'Accept:application/json' -F file='@hmm' -F processing=hmm http://skylign.org
If something went wrong then a 400 response will be returned.
# Request missing the file upload.
curl -H 'Accept:application/json' -F processing=hmm http://skylign.org
# Response
{
"error" : {
"upload" : "Please choose an alignment or HMM file to upload."
}
}
If the upload was successful, you will receive an HTTP 200 response with the location of your logo in the payload:
{
"url":"http://skylign.org/logo/6BBFEB96-E7E0-11E2-A243-DF86A4A34227",
"uuid":"6BBFEB96-E7E0-11E2-A243-DF86A4A34227",
"message":"Logo generated successfully"
}
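The same upload can be scripted instead of run through curl. The sketch below is one possible client, not an official one: it assumes a runtime with global `fetch` and `FormData` (a modern browser or Node 18+), and the helper names `readUploadResponse` and `uploadHmm` are hypothetical. The response handling relies only on the two payload shapes shown above.

```javascript
// Turn an HTTP status and JSON body into a logo location or an Error.
function readUploadResponse(status, bodyText) {
  const body = JSON.parse(bodyText);
  if (status === 200) return { url: body.url, uuid: body.uuid };
  const errors = body.error || {};
  const details = Object.entries(errors).map(([field, msg]) => `${field}: ${msg}`);
  throw new Error(`Upload failed (${status}): ${details.join('; ')}`);
}

// Mirror of the curl command: multipart form with "file" and "processing".
// fileBlob is a Blob or File holding the HMM text.
async function uploadHmm(fileBlob) {
  const form = new FormData();
  form.append('file', fileBlob, 'hmm');
  form.append('processing', 'hmm');
  const response = await fetch('http://skylign.org', {
    method: 'POST',
    headers: { Accept: 'application/json' },
    body: form,
  });
  return readUploadResponse(response.status, await response.text());
}
```

Keeping the response parsing separate from the network call makes it easy to check the error handling against the example payloads without contacting the server.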
Retrieving the logo
With the response in hand you can use the returned url to fetch your logo.
# Request
curl -H 'Accept:image/png' http://skylign.org/logo/6BBFEB96-E7E0-11E2-A243-DF86A4A34227 > 6BBFEB96-E7E0-11E2-A243-DF86A4A34227.png
Retrieving the raw data
If you would like the JSON used by the JavaScript logo, you can get it by requesting a JSON response. Go to the GET /logo/:uuid page to see the other download formats that are available.
# Request
curl -H 'Accept:application/json' http://skylign.org/logo/6BBFEB96-E7E0-11E2-A243-DF86A4A34227
# Response
{
"mmline":[0,0,0,0,0,0,0,0],
"max_height_obs": 5.337,
"max_height_theory": 6.45311498641968,
"delete_probs":[0,0,0,0,0,0,0,0],
"height_arr":[
[
"V:0.001","M:0.001","C:0.001","W:0.001","I:0.001",
"S:0.002","T:0.002","N:0.002","K:0.002","Q:0.002",
"F:0.003","Y:0.003","E:0.003","L:0.003","A:0.003",
"P:0.003","D:0.003","R:0.003","G:0.006","H:5.292"
],
[
"Q:0.001","W:0.001","H:0.001","N:0.002","K:0.002",
"Y:0.002","E:0.002","C:0.002","P:0.002","D:0.002",
"R:0.002","S:0.003","F:0.004","T:0.004","M:0.004",
"G:0.004","A:0.006","L:0.016","V:0.027","I:3.757"
],
[
"Y:0.001","M:0.001","W:0.001","F:0.002","Q:0.002",
"H:0.002","I:0.002","K:0.003","E:0.003","C:0.003",
"L:0.003","D:0.003","R:0.003","N:0.004","V:0.005",
"P:0.005","T:0.01","G:0.011","A:0.018","S:3.562"
],
[
"Y:0.001","M:0.001","W:0.001","F:0.002","Q:0.002",
"H:0.002","I:0.002","K:0.003","E:0.003","C:0.003",
"L:0.003","D:0.003","R:0.003","N:0.004","V:0.005",
"P:0.005","T:0.01","G:0.011","A:0.018","S:3.562"
],
[
"Y:0.001","W:0.001","F:0.002","M:0.002","H:0.002",
"K:0.003","Q:0.003","I:0.003","E:0.004","C:0.004",
"L:0.004","D:0.004","R:0.004","N:0.006","V:0.007",
"G:0.015","S:0.023","A:0.028","P:0.416","T:2.819"
],
[
"N:0.002","W:0.002","H:0.002","D:0.002","K:0.003",
"E:0.003","Q:0.003","P:0.003","R:0.003","G:0.003",
"C:0.004","S:0.005","Y:0.005","T:0.01","A:0.013",
"F:0.021","V:0.069","L:0.126","I:0.915","M:2.357"
],
[
"F:0.001","Y:0.001","M:0.001","C:0.001","W:0.001",
"H:0.001","I:0.001","K:0.002","V:0.002","Q:0.002",
"L:0.002","R:0.002","T:0.003","P:0.003","S:0.004",
"E:0.004","A:0.005","D:0.005","G:0.006","N:4.409"
],
[
"M:0","F:0.001","N:0.001","K:0.001","Y:0.001",
"V:0.001","Q:0.001","C:0.001","W:0.001","H:0.001",
"I:0.001","S:0.002","T:0.002","E:0.002","L:0.002",
"P:0.002","D:0.002","R:0.002","A:0.003","G:3.733"
]
],
"insert_lengths": [2,2,2,2,2,2,2,0],
"insert_probs":[0,0,0,0,0,0,0,0]
}
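Each entry in `height_arr` is a "LETTER:height" string, with one inner array per logo position. As a rough sketch of how a client might consume this structure (the function name `summarizeColumns` is hypothetical, and the field meanings are inferred from the example response rather than from a formal schema), the code below splits the entries, sums them into a stack height, and picks the tallest letter for each position.

```javascript
// Summarize each position of a logo JSON object: the tallest letter and the
// total stack height (sum of all letter heights in the column).
function summarizeColumns(logo) {
  return logo.height_arr.map((column, index) => {
    const letters = column.map((entry) => {
      const [letter, height] = entry.split(':');
      return { letter, height: Number(height) };
    });
    const stackHeight = letters.reduce((sum, l) => sum + l.height, 0);
    const top = letters.reduce((best, l) => (l.height > best.height ? l : best));
    return { position: index + 1, consensus: top.letter, stackHeight };
  });
}

// Usage with a cut-down logo object (two positions, two letters each).
const sample = { height_arr: [['G:0.006', 'H:5.292'], ['V:0.027', 'I:3.757']] };
console.log(summarizeColumns(sample));
```

For the first position of the full example response, the twenty letter heights sum to 5.337, the same value reported in `max_height_obs`.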
Using the logo elsewhere
- Step 1: Download the JavaScript and CSS.
- Step 2: Get a copy of jQuery (version 1.8+). The best place to get this is from the jQuery website. We have tested with version 1.8.2, but newer versions should also work.
- Step 3: Get the JSON for your logo. If you have already created your logo, then you should be able to download the JSON from the Download section above.
- Step 4: Insert the following markup into your page.
<!-- set CSS rules for styling the logo -->
<link rel="stylesheet" type="text/css" href="hmm_logo.min.css">
<!-- make sure jQuery has been loaded -->
<script src="jquery-1.8.2.min.js"></script>
<!-- load in the hmm_logo code -->
<script src="hmm_logo.min.js"></script>
<!-- at the location where you want the logo to appear in your page -->
<div id="logo" class="logo" data-logo="YOUR LOGO JSON"></div>
<!-- finally, run the call to render the logo on page load -->
<script>
$(document).ready(function () {
  $('#logo').hmm_logo();
});
</script>
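One practical detail when filling in the `data-logo` attribute: the logo JSON contains double quotes, which would end the attribute early if pasted in unescaped. A minimal sketch of generating the element with the JSON safely escaped is shown below. The helper name `logoDivMarkup` is hypothetical; the browser decodes the entities when it parses the page, so the attribute value seen by scripts is the original JSON text.

```javascript
// Build the logo <div> with the JSON escaped for use inside a
// double-quoted HTML attribute.
function logoDivMarkup(logoJson) {
  const text = typeof logoJson === 'string' ? logoJson : JSON.stringify(logoJson);
  const escaped = text
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
  return `<div id="logo" class="logo" data-logo="${escaped}"></div>`;
}

console.log(logoDivMarkup({ insert_probs: [0, 0], height_arr: [['H:5.292']] }));
```

An alternative that avoids escaping altogether is to wrap the attribute in single quotes, provided the JSON itself contains no single quotes.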
Options
There are a number of ways to change the initial display of the logo. These options are passed to the logo code when calling the hmm_logo() function.
Name | Description |
---|---|
column_width | The maximum width of each column in the logo when zoom is set to 1. Example: $('#logo').hmm_logo({column_width: '30px'}); Default: 34px. Range: 30px - 40px. |
height_toggle | Enables or disables the "Toggle Scale" button, which allows one to toggle between maximum observed and maximum theoretical scales. Example: $('#logo').hmm_logo({height_toggle: true}); Default: null. |
zoom | Sets the initial zoom level that the logo will use. Example: $('#logo').hmm_logo({zoom: 1}); Default: 0.4. Range: 0.1 - 1. |
column_info | Passing this option enables the onclick display of information for individual columns. If specified, this will tell the code where to display a table of the scores for each residue in a selected column. The value passed should be the id of the element where the table is to be inserted. Example: $('#logo').hmm_logo({column_info: "#col_info"}); Default: null. |
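
Several options can be passed together in a single options object. The sketch below builds such an object and checks the documented ranges before it is handed to `hmm_logo()`. The helper `buildLogoOptions` and its camelCase parameter names are hypothetical conveniences; only the resulting keys (`column_width`, `zoom`, `height_toggle`, `column_info`) come from the table above.

```javascript
// Build an options object for hmm_logo(), enforcing the documented ranges.
function buildLogoOptions({ columnWidth = 34, zoom = 0.4, heightToggle = null, columnInfo = null } = {}) {
  if (columnWidth < 30 || columnWidth > 40) {
    throw new RangeError('column_width must be between 30px and 40px');
  }
  if (zoom < 0.1 || zoom > 1) {
    throw new RangeError('zoom must be between 0.1 and 1');
  }
  const options = { column_width: `${columnWidth}px`, zoom };
  // Leave out options that keep their default of null.
  if (heightToggle !== null) options.height_toggle = heightToggle;
  if (columnInfo !== null) options.column_info = columnInfo;
  return options;
}

// In a page where jQuery and hmm_logo are loaded, this would be used as:
// $('#logo').hmm_logo(buildLogoOptions({ zoom: 1, columnInfo: '#col_info' }));
console.log(buildLogoOptions({ zoom: 1, columnInfo: '#col_info' }));
```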
Installing the Server
Skylign can be downloaded and installed as a standalone web server. Details on obtaining the source code and dependencies can be found on the install page.
Citing Skylign
Skylign: a tool for creating informative, interactive logos representing sequence alignments and profile hidden Markov models.
Wheeler, T.J., Clements, J., Finn, R.D.
BMC Bioinformatics Volume 15 (2014) p.7 DOI: 10.1186/1471-2105-15-7