Logo Help

Logos are commonly used in molecular biology to provide a compact graphical representation of the conservation pattern of a set of sequences. They render the information contained in sequence alignments or profile hidden Markov models by drawing a stack of letters for each position, where

  • the height of the stack typically corresponds to the conservation at that position, and
  • the height of each letter within a stack depends on the frequency of that letter at that position.

What is shown in a logo?

Letter conservation

The per-position probabilities underlying the letter stack calculations are estimates of the per-position letter distributions across the family of homologous sequences. When rendering a logo for a profile HMM, these estimates are taken directly from the HMM. When rendering a logo for a sequence alignment, Skylign can calculate these estimates (a) directly from frequencies observed in the alignment, (b) from observed counts after assigning sequence weights to account for sequence redundancy, or (c) after assigning weights and combining with a Dirichlet mixture prior. These options are implemented using HMMER (http://hmmer.org).

Skylign offers three ways of calculating letter and stack heights for each column of the underlying alignment:

  • Information content - all : Stack height is the information content (aka relative entropy) of the position, and letters divide that height according to their estimated probability. This is the default, and is the way logo letter stacks are typically computed in other tools.
  • Information content - above background : Stack height is the information content of the position, and only positive-scoring letters are included in the stack. The height of the stack is subdivided according to the relative probabilities of those positive-scoring letters. This reduces the noisy mash of letters at the bottom of logos when strong priors are mixed with observed counts (as with HMMER 3.1 profile HMMs).
  • Score : The height of each letter depends on that letter’s score, and only positive-scoring letters are included in the stack. The height of a stack does not have any inherent meaning in this case.
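
The three height methods above can be sketched in a few lines of Python. This is a minimal illustration, not Skylign's implementation: the function name, the method labels, and the use of a log-odds score (log2 of probability over background) as the "score" are assumptions made for the example.

```python
import math

METHODS = ("info_content_all", "info_content_above", "score")

def stack_heights(probs, background, method="info_content_all"):
    """Return {letter: height in bits} for one logo position.

    probs and background map each letter to a probability; both sum to 1.
    The log-odds score used here is an assumption of this sketch.
    """
    if method not in METHODS:
        raise ValueError(f"unknown method: {method}")

    # Per-letter log-odds score; positive means "more common than background".
    scores = {a: math.log2(p / background[a]) for a, p in probs.items() if p > 0}
    # Information content (relative entropy) of the position, in bits.
    info = sum(probs[a] * s for a, s in scores.items())

    if method == "info_content_all":
        # Every letter gets a share of the stack proportional to its probability.
        return {a: info * p for a, p in probs.items()}

    positive = {a: probs[a] for a, s in scores.items() if s > 0}
    if not positive:
        return {}
    if method == "info_content_above":
        # Only positive-scoring letters share the stack height.
        total = sum(positive.values())
        return {a: info * p / total for a, p in positive.items()}
    # method == "score": each letter's height is its own score.
    return {a: scores[a] for a in positive}
```

For a DNA position with probabilities A=0.7, C=G=T=0.1 and a uniform background, only A scores above background, so the "above background" stack consists of a single A whose height equals the full information content of the position.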

Gap parameters

In addition to representing the per-position letter distribution, Skylign renders position-specific gap parameters. It does this by presenting three values for each position k:

  • Insert probability: the probability of observing one or more letters inserted between the letter corresponding to position k and the letter corresponding to position (k+1).
  • Insert length: the expected length of an insertion following position k, if one is observed.
  • Occupancy: the probability of observing a letter at position k. If we call this value occ(k), the probability of observing a gap character (part of a deletion relative to the model) is 1 - occ(k).

These parameters are represented by three rows of numerical values placed below the letter stacks of the logo, with a heat map laid over the top of each value to provide a visual aid.
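
As a rough illustration of how the three values relate to each other, the sketch below assumes a simplified model in which an insertion is opened with probability t_mi and extended with probability t_ii, so that insertion lengths follow a geometric distribution. The names t_mi, t_ii and gap_summary are hypothetical, and Skylign may derive its values differently.

```python
def gap_summary(t_mi, t_ii, occupancy):
    """Summarize the three gap values for one position k.

    t_mi: assumed probability of opening an insertion after position k.
    t_ii: assumed probability of extending an insertion by one more letter.
    occupancy: probability of observing a letter at position k, occ(k).
    """
    if not 0.0 <= t_ii < 1.0:
        raise ValueError("t_ii must be in [0, 1)")
    return {
        "insert_probability": t_mi,
        # Mean of a geometric length distribution that starts at 1.
        "insert_length": 1.0 / (1.0 - t_ii),
        # Probability of a gap character at position k.
        "gap_probability": 1.0 - occupancy,
    }
```

Under these assumptions, an extension probability of 0.5 gives an expected insert length of 2, and an occupancy of 0.9 gives a gap probability of 0.1.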

Scale

Skylign allows you to scale the logo so that the maximum value on the Y-axis corresponds to:
  • Maximum Observed: largest observed information content (in bits)
  • Maximum Theoretical: largest possible information content (in bits)
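
The two scale limits can be sketched as follows. The relative entropy of a position is largest when all probability sits on the letter that is rarest in the background, which gives -log2 of the smallest background probability. Whether Skylign computes its limit exactly this way is an assumption of this example, and both function names are hypothetical.

```python
import math

def max_theoretical_bits(background):
    """Largest possible information content for a given background.

    Reached when a position puts all probability on the rarest letter.
    """
    return -math.log2(min(background.values()))

def max_observed_bits(column_heights):
    """Largest stack height among columns given as {letter: height} dicts."""
    return max(sum(column.values()) for column in column_heights)
```

With a uniform four-letter background the theoretical maximum is 2 bits. A non-uniform protein background gives a value above log2(20), because the rarest residue is less frequent than 1/20.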

Color Schemes

  • Default : A unique color is assigned to each amino acid or nucleotide residue.
  • Consensus Colors : The residues are colored according to the ClustalX coloring scheme, with one color per group:
      • Glycine (G)
      • Proline (P)
      • Small or hydrophobic (A,V,L,I,M,F,W)
      • Hydroxyl or amine amino acids (S,T,N,Q)
      • Charged amino acids (D,E,R,K)
      • Histidine or tyrosine (H,Y)

Coordinates

When building a logo for an input alignment, Skylign produces an intermediate HMM. If this is done with the 'remove mostly-empty columns' setting, or if an HMM is uploaded directly, positions in the HMM may differ from the corresponding columns in the associated alignment. In this case, choose from the following:

  • Model: The coordinates along the top of the plot show the model position.
  • Alignment: The coordinates along the top of the plot show the column in the alignment associated with the corresponding position in the model.
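
The relationship between the two coordinate systems can be sketched like this, assuming that a column is kept as a model position when its (weighted) occupancy reaches a threshold. The threshold of 0.5 and the function name are assumptions for illustration only.

```python
def model_to_alignment_columns(occupancy, threshold=0.5):
    """Return the 1-based alignment column for each model position.

    occupancy: per-column fraction of non-gap characters, in column order.
    Index 0 of the result is model position 1, index 1 is position 2, etc.
    """
    return [col for col, occ in enumerate(occupancy, start=1) if occ >= threshold]

columns = model_to_alignment_columns([0.9, 0.2, 0.8, 1.0, 0.1])
# Model positions 1, 2, 3 correspond to alignment columns 1, 3, 4.
```

In the "Model" setting the plot would label these positions 1, 2, 3; in the "Alignment" setting it would label them 1, 3, 4.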

More details

This description leaves many details unexplained. Please see the Skylign paper for more.

Uploading a file

Skylign will compute a logo for a profile HMM or for a sequence alignment.

Profile HMM

Skylign accepts HMMER-formatted profile HMM files. Producing a profile logo is straightforward: Skylign reads the estimated per-position distributions from the HMM file and uses them to compute stack heights and gap values. The submission form requires selection of the preferred method for computing letter and stack heights.

Alignment

Skylign accepts sequence alignments in any format accepted by HMMER (this includes Stockholm and aligned FASTA format). Producing an alignment logo requires selection of the preferred method of estimating per-column parameters from the observed frequencies in the alignment (Alignment Processing):

  • Observed counts: For each column, use the maximum-likelihood estimate - the observed counts.
  • Weighted counts: For each column, use the maximum-likelihood estimate after applying weights to each sequence to account for high similarity in a subset of sequences.
  • Convert to profile HMM - keep all columns: Apply sequence weights, and also combine with a Dirichlet mixture prior (incorporating absolute weighting to control relative contribution of observed counts and prior). Keeping all columns means that even columns in the alignment that mostly consist of gap characters will be represented by positions in the logo.
  • Convert to profile HMM - remove mostly-empty columns: Apply sequence weights, and also combine with a Dirichlet mixture prior (incorporating absolute weighting to control relative contribution of observed counts and prior). Removing mostly-empty columns means that only columns in the alignment that mostly (after weighting) consist of non-gap characters will be represented by positions in the logo.
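
The difference between observed and weighted counts can be sketched for a single column. The weights here are supplied by the caller and are purely hypothetical; Skylign obtains its weights from HMMER, and this example does not attempt the Dirichlet mixture prior step.

```python
from collections import defaultdict

def column_frequencies(column, weights=None):
    """Estimate letter probabilities for one alignment column.

    column: one character per sequence; '-' and '.' are treated as gaps.
    weights: optional per-sequence weights (hypothetical values here).
    """
    if weights is None:
        weights = [1.0] * len(column)  # observed counts: every sequence counts once
    totals = defaultdict(float)
    for letter, weight in zip(column, weights):
        if letter not in "-.":
            totals[letter.upper()] += weight
    norm = sum(totals.values())
    return {a: c / norm for a, c in totals.items()} if norm else {}

observed = column_frequencies("AAAC")
weighted = column_frequencies("AAAC", weights=[0.5, 0.5, 0.5, 1.5])
```

Here the three sequences carrying A are treated as near-duplicates and down-weighted, so the weighted estimate moves from 0.75/0.25 toward an even split between A and C.
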
In some cases, an alignment may consist of fragmentary sequences (for instance, because of short sequencing reads). In this case, you may prefer not to count terminal gaps as deletions, instead simply ignoring terminal gaps for short (fragmentary) sequences. These options control this behavior:
  • Alignment sequences are full length: Count all terminal gaps as deletions.
  • Some sequences are fragments: When a sequence is a fragment (less than half the length of the alignment), its terminal gaps are not considered when counting a column's deletions.
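
A minimal sketch of the two fragment settings follows. It assumes that a sequence's length is its number of non-gap characters and that '-' is the only gap character; both are assumptions of the example, and the function name is hypothetical.

```python
def deletion_counts(alignment, fragments=False):
    """Count deletions (gap characters) per column of an alignment.

    alignment: list of equal-length strings using '-' for gaps.
    fragments: if True, terminal gaps of sequences shorter than half the
               alignment length are not counted as deletions.
    """
    ncols = len(alignment[0])
    counts = [0] * ncols
    for seq in alignment:
        residues = [i for i, ch in enumerate(seq) if ch != "-"]
        if not residues:
            continue  # all-gap rows are skipped in this sketch
        first, last = residues[0], residues[-1]
        is_fragment = fragments and len(residues) < ncols / 2
        for i, ch in enumerate(seq):
            if ch != "-":
                continue
            if is_fragment and (i < first or i > last):
                continue  # terminal gap of a fragment: ignored
            counts[i] += 1
    return counts

aln = ["ACDEFG", "--DE--", "AC-EFG"]
full_length = deletion_counts(aln)
with_fragments = deletion_counts(aln, fragments=True)
```

With the full-length setting, the short second sequence contributes a deletion to four columns. With the fragment setting, only the internal gap in the third sequence is counted.
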
In addition to estimation method, the stack/letter height method must also be chosen, as with profile logos.

Note: in older browsers, it is not possible to discern an HMM file from an alignment file, so the "Alignment Processing" option will be shown even for an uploaded profile HMM. Simply select "Convert to an HMM".

API Documentation

All documentation for the resources made available by the REST API can be found on the REST API Documentation pages.

Using the API

Creating a logo

The first thing you will need to do is upload your alignment or HMM file to our server.

curl -H 'Accept:application/json' -F file='@hmm' -F processing=hmm http://skylign.org

If something went wrong then a 400 response will be returned.

# Request missing the file upload.
curl -H 'Accept:application/json' -F processing=hmm http://skylign.org

# Response
{
  "error" : {
    "upload" : "Please choose an alignment or HMM file to upload."
  }
}
      

If the upload was successful, you will receive an HTTP 200 response with the location of your logo in the payload:

{
  "url":"http://skylign.org/logo/6BBFEB96-E7E0-11E2-A243-DF86A4A34227",
  "uuid":"6BBFEB96-E7E0-11E2-A243-DF86A4A34227",
  "message":"Logo generated successfully"
}
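
A client will usually want to handle both kinds of response in one place. The sketch below works on a status code and response body that have already been fetched, and relies only on the two payload shapes shown above; the function name is hypothetical.

```python
import json

def parse_upload_response(status, body):
    """Return the logo URL from a successful upload, or raise ValueError.

    status: HTTP status code of the upload response.
    body: response body as a JSON string.
    """
    payload = json.loads(body)
    if status != 200 or "error" in payload:
        errors = payload.get("error", {})
        if isinstance(errors, dict):
            detail = "; ".join(f"{field}: {msg}" for field, msg in errors.items())
        else:
            detail = str(errors)
        raise ValueError(f"upload failed (HTTP {status}): {detail}")
    return payload["url"]
```

The returned URL can then be requested with the Accept header of the format you want.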
      

Retrieving the logo

With the response in hand you can use the returned url to fetch your logo.

# Request
curl -H 'Accept:image/png' http://skylign.org/logo/6BBFEB96-E7E0-11E2-A243-DF86A4A34227 > 6BBFEB96-E7E0-11E2-A243-DF86A4A34227.png

# Response
[PNG image of the HMM logo]
      

Retrieving the raw data

If you would like the JSON used by the JavaScript logo, request a JSON response. See the GET /logo/:uuid page for the other download formats that are available.

# Request
curl -H 'Accept:application/json' http://skylign.org/logo/6BBFEB96-E7E0-11E2-A243-DF86A4A34227

# Response
{
  "mmline":[0,0,0,0,0,0,0,0],
  "max_height_obs": 5.337,
  "max_height_theory": 6.45311498641968,
  "delete_probs":[0,0,0,0,0,0,0,0],
  "height_arr":[
    [
      "V:0.001","M:0.001","C:0.001","W:0.001","I:0.001",
      "S:0.002","T:0.002","N:0.002","K:0.002","Q:0.002",
      "F:0.003","Y:0.003","E:0.003","L:0.003","A:0.003",
      "P:0.003","D:0.003","R:0.003","G:0.006","H:5.292"
    ],
    [
      "Q:0.001","W:0.001","H:0.001","N:0.002","K:0.002",
      "Y:0.002","E:0.002","C:0.002","P:0.002","D:0.002",
      "R:0.002","S:0.003","F:0.004","T:0.004","M:0.004",
      "G:0.004","A:0.006","L:0.016","V:0.027","I:3.757"
    ],
    [
      "Y:0.001","M:0.001","W:0.001","F:0.002","Q:0.002",
      "H:0.002","I:0.002","K:0.003","E:0.003","C:0.003",
      "L:0.003","D:0.003","R:0.003","N:0.004","V:0.005",
      "P:0.005","T:0.01","G:0.011","A:0.018","S:3.562"
    ],
    [
      "Y:0.001","M:0.001","W:0.001","F:0.002","Q:0.002",
      "H:0.002","I:0.002","K:0.003","E:0.003","C:0.003",
      "L:0.003","D:0.003","R:0.003","N:0.004","V:0.005",
      "P:0.005","T:0.01","G:0.011","A:0.018","S:3.562"
    ],
    [
      "Y:0.001","W:0.001","F:0.002","M:0.002","H:0.002",
      "K:0.003","Q:0.003","I:0.003","E:0.004","C:0.004",
      "L:0.004","D:0.004","R:0.004","N:0.006","V:0.007",
      "G:0.015","S:0.023","A:0.028","P:0.416","T:2.819"
    ],

    [
      "N:0.002","W:0.002","H:0.002","D:0.002","K:0.003",
      "E:0.003","Q:0.003","P:0.003","R:0.003","G:0.003",
      "C:0.004","S:0.005","Y:0.005","T:0.01","A:0.013",
      "F:0.021","V:0.069","L:0.126","I:0.915","M:2.357"
    ],

    [
      "F:0.001","Y:0.001","M:0.001","C:0.001","W:0.001",
      "H:0.001","I:0.001","K:0.002","V:0.002","Q:0.002",
      "L:0.002","R:0.002","T:0.003","P:0.003","S:0.004",
      "E:0.004","A:0.005","D:0.005","G:0.006","N:4.409"
    ],

    [
      "M:0","F:0.001","N:0.001","K:0.001","Y:0.001",
      "V:0.001","Q:0.001","C:0.001","W:0.001","H:0.001",
      "I:0.001","S:0.002","T:0.002","E:0.002","L:0.002",
      "P:0.002","D:0.002","R:0.002","A:0.003","G:3.733"
    ]
  ],
  "insert_lengths": [2,2,2,2,2,2,2,0],
  "insert_probs":[0,0,0,0,0,0,0,0]
}
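
Each entry in height_arr appears to be a "letter:height" string, with one inner list per logo column. The sketch below parses a column under that reading of the sample response; the function names are hypothetical.

```python
def parse_column(entries):
    """Turn ["V:0.001", ..., "H:5.292"] into [("V", 0.001), ..., ("H", 5.292)]."""
    pairs = []
    for entry in entries:
        letter, _, height = entry.partition(":")
        pairs.append((letter, float(height)))
    return pairs

def stack_height(entries):
    """Total height of one column: the sum of its letter heights."""
    return sum(height for _, height in parse_column(entries))

def top_letter(entries):
    """Letter with the largest height in one column."""
    return max(parse_column(entries), key=lambda pair: pair[1])[0]

first_column = [
    "V:0.001", "M:0.001", "C:0.001", "W:0.001", "I:0.001",
    "S:0.002", "T:0.002", "N:0.002", "K:0.002", "Q:0.002",
    "F:0.003", "Y:0.003", "E:0.003", "L:0.003", "A:0.003",
    "P:0.003", "D:0.003", "R:0.003", "G:0.006", "H:5.292",
]
```

For the first column of the sample response, the letter heights add up to 5.337, the same value reported as max_height_obs, and the tallest letter is H.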
      

Using the logo elsewhere

  • Step 1 Download the JavaScript and CSS.

  • Step 2 Get a copy of jQuery (version 1.8+)

    The best place to get this is from the jQuery website. We have tested with version 1.8.2 but newer versions should also work.

  • Step 3 Get the JSON for your logo.

    If you have already created a logo, you can download its JSON as described under "Retrieving the raw data" above.

  • Step 4 Insert the following markup into your page.

    <!-- set css rules for styling the logo -->
    <link rel="stylesheet" type="text/css" href="hmm_logo.min.css">
    
    <!-- make sure jQuery has been loaded -->
    <script src="jquery-1.8.2.min.js"></script>
    
    <!-- load in the hmm_logo code -->
    <script src="hmm_logo.min.js"></script>
    
    <!-- At the location where you want the logo to appear in your page -->
    <div id="logo" class="logo" data-logo="YOUR LOGO JSON"></div>
    
    <!-- finally run the call to render the logo on page load -->
    <script>
      $(document).ready(function () {
        $('#logo').hmm_logo();
      });
    </script>
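
Because the logo JSON contains double quotes, it has to be escaped before it is placed inside a double-quoted data-logo attribute. If the page is generated server-side, that step could look like the sketch below; the function name is hypothetical, and the browser unescapes the attribute value before any script reads it.

```python
import html
import json

def logo_div(logo_data, element_id="logo"):
    """Return a <div> whose data-logo attribute holds logo_data as escaped JSON."""
    payload = html.escape(json.dumps(logo_data), quote=True)
    return f'<div id="{element_id}" class="logo" data-logo="{payload}"></div>'

div = logo_div({"max_height_obs": 5.337, "height_arr": [["H:5.292"]]})
```

The resulting markup replaces the placeholder div in the snippet above.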
    

Options

There are a number of ways to change the initial display of the logo. These options are passed to the logo code when calling the hmm_logo() function.

  • column_width : The maximum width of each column in the logo when zoom is set to 1.

    $('#logo').hmm_logo({column_width: '30px'});

    Default: 34px
    Range: 30px - 40px

  • height_toggle : Enables or disables the "Toggle Scale" button, which allows one to toggle between maximum observed and maximum theoretical scales.

    $('#logo').hmm_logo({height_toggle: true});

    Default: null

  • zoom : Sets the initial zoom level that the logo will use.

    $('#logo').hmm_logo({zoom: 1});

    Default: 0.4
    Range: 0.1 - 1

  • column_info : Passing this option enables the onclick display of information for individual columns. If specified, this will tell the code where to display a table of the scores for each residue in a selected column. The value passed should be the id of the element where the table is to be inserted.

    $('#logo').hmm_logo({column_info: "#col_info"});

    Default: null

Installing the Server

Skylign can be downloaded and installed as a standalone web server. Details on obtaining the source code and dependencies can be found on the install page.

Citing Skylign

Skylign: a tool for creating informative, interactive logos representing sequence alignments and profile hidden Markov models.
Wheeler, T.J., Clements, J., Finn, R.D.
BMC Bioinformatics Volume 15 (2014) p.7 DOI: 10.1186/1471-2105-15-7