This tutorial walks you through the features WebGBrowse provides for configuring your GFF3 dataset for GBrowse. It assumes that you already know how to store your data in a standard GFF3 (General Feature Format version 3) file. Remember that WebGBrowse is designed for GBrowse versions 1.69 and above, which use GFF3 files (.gff3), not the GFF version 2 (.gff) files used by earlier GBrowse versions. The GBrowse Administration Tutorial (Lincoln Stein, 2008) has a section introducing the GFF3 file format for first-time users.
You can run WebGBrowse either directly from our server (http://webgbrowse.cgb.indiana.edu) or download and install it locally. For details on downloading the standalone version and for installation instructions, please visit the software page. The online version of WebGBrowse requires no software on the client side other than a web browser. WebGBrowse has been tested with IE 7.0 and above, Mozilla Firefox and Safari on Windows, Mac and Linux platforms.
In this tutorial, we will use a simulated Volvox genome annotation GFF3 dataset derived from the GBrowse Administration Tutorial on the GMOD website. The dataset contains different feature types (column 3) that can be configured to illustrate the default generic display, multi-segmented features, protein-coding genes, reading frames, grouped features and quantitative data display. This tutorial shows you how to achieve each of these configurations using WebGBrowse. The dataset also includes the genome sequence in FASTA format at the end of the file; GBrowse associates this sequence with the corresponding features being displayed. Download and save the sample dataset (volvox.gff3) to your hard drive. Then open a browser window and go to the WebGBrowse homepage (http://webgbrowse.cgb.indiana.edu/), where you can upload your dataset (Figure 1).
Figure 1: WebGBrowse Input Form
Click the "Browse..." button in the "GFF3 File" section of the WebGBrowse Input Form and select the volvox.gff3 file you downloaded earlier. The input can also be a compressed file in .zip or .gz format, provided it contains a single .gff3 file and no sub-directories. If you have an existing configuration file, you can use it as a template instead of starting from scratch. We recommend using only configuration files generated by WebGBrowse as templates; this tutorial includes a section on using a pre-existing configuration file as a template. For now, we will build the configuration from scratch. You can also enter your email address, which is optional. Providing an email address lets you perform the configuration over multiple sessions, receive the configuration results by email and keep track of all your previous submissions. To learn more, please see the section on Tracking the Previous Submissions to WebGBrowse. After entering the required input, click submit. Once it has validated the input, WebGBrowse navigates to the "Configuration Panel".
The configuration panel is where you add, edit or delete feature tracks for the GBrowse display of your dataset. If you provided your email address in the WebGBrowse input, you will see a "Save Progress" button at the top right corner of the configuration panel, which lets you perform the configuration over multiple sessions. You can save your progress at any stage by clicking that button; WebGBrowse will then email you a link that you can click to resume your work. You can provide a short description of your dataset in the "Description" field. For now, enter "Sample Volvox Dataset for WebGBrowse" as the description.
WebGBrowse populates the "Feature" listbox in the "Add New Track" section (Figure 2) with the unique features found in your dataset, each of which can be configured into an individual GBrowse track. The list is derived from the feature type column (column 3) and the source column (column 2) of the data. It contains one item for each unique feature type (column 3) and, for every type that occurs with more than one source, an additional type:source (column3:column2) item per source. For example, in the sample Volvox genome data, the feature type "remark" makes a single list item (since all the corresponding rows have the same source), whereas the feature type "CDS" makes four list items: "CDS", "CDS:example", "CDS:predicted" and "CDS:exonerate".
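The constraint on compressed input (a single .gff3 file, no sub-directories) can be checked locally before uploading. The following is a minimal Python sketch for the .zip case only; the function names are hypothetical and this is not WebGBrowse's own validation code, which may apply different or additional checks.

```python
import io
import zipfile


def make_zip(files):
    """Build an in-memory .zip from a {name: text} mapping (demo helper only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, text in files.items():
            archive.writestr(name, text)
    return buffer.getvalue()


def single_gff3_member(zip_bytes):
    """Return the name of the archive's only .gff3 file, or raise ValueError."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        names = archive.namelist()
    if len(names) != 1:
        raise ValueError(f"expected exactly one file, found {len(names)}")
    name = names[0]
    if "/" in name:
        # A slash in the member name means the file sits inside a sub-directory.
        raise ValueError(f"file is inside a sub-directory: {name}")
    if not name.lower().endswith(".gff3"):
        raise ValueError(f"not a .gff3 file: {name}")
    return name


archive_bytes = make_zip({"volvox.gff3": "##gff-version 3\n"})
print(single_gff3_member(archive_bytes))
```

To check a file on disk, read it in binary mode and pass the bytes to single_gff3_member.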
Figure 2: "Add New Track" Section in the Configuration Panel
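The way the feature list is derived from columns 2 and 3 can be sketched in a few lines of Python. This is an illustration of the rule described above, not WebGBrowse's actual code; the function name and the sorted ordering of the items are assumptions, and the sample rows are abbreviated from the Volvox dataset with tab-separated columns.

```python
from collections import defaultdict


def unique_features(gff3_lines):
    """List each feature type, plus type:source items for types with several sources."""
    sources_by_type = defaultdict(set)
    for line in gff3_lines:
        if line.startswith("##FASTA"):
            break  # sequence data follows; no more feature rows
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a complete nine-column feature row
        source, feature_type = cols[1], cols[2]
        sources_by_type[feature_type].add(source)

    items = []
    for feature_type in sorted(sources_by_type):
        items.append(feature_type)
        sources = sources_by_type[feature_type]
        if len(sources) > 1:
            items.extend(f"{feature_type}:{source}" for source in sorted(sources))
    return items


sample = [
    "ctgA\texample\tremark\t1659\t1984\t.\t+\t.\tName=f07",
    "ctgA\texample\tremark\t3014\t6130\t.\t+\t.\tName=f06",
    "ctgA\texample\tCDS\t1201\t1500\t.\t+\t0\tParent=EDEN.1",
    "ctgA\tpredicted\tCDS\t10000\t11500\t.\t+\t0\tName=Apple1",
    "ctgA\texonerate\tCDS\t18000\t18800\t.\t+\t0\tParent=rna-Apple3",
]
print(unique_features(sample))
# ['CDS', 'CDS:example', 'CDS:exonerate', 'CDS:predicted', 'remark']
```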
Next to the "Feature" list box, you will see a "Glyph" dropdown list (Figure 2) which contains the names of various glyphs available with WebGBrowse. A glyph determines the shape of each of the genomic features associated with a specific GBrowse track in the GBrowse display. It presents a configurable set of parameters specific to its type allowing further control over the display features such as height, color, width etc. The configuration panel is powered by a "Glyph Library" which stores the complete configuration details of the various glyphs available. For more details on the glyph library, please visit the glyph library page. A short description and a sample image for the selected glyph will be displayed next to the glyph dropdown list.
You define GBrowse tracks by associating a feature to be displayed with a glyph from the glyph library. If you open the volvox.gff3 file, you will notice a series of features of type "remark" as shown below.
ctgA example remark 1659 1984 . + . Name=f07;Note=This is an example
ctgA example remark 3014 6130 . + . Name=f06;Note=This is another example
ctgA example remark 4715 5968 . - . Name=f05;Note=Ok! Ok! I get the message.
ctgA example remark 13280 16394 . + . Name=f08
ctgA example remark 15329 15533 . + . Name=f10
ctgA example remark 19157 22915 . - . Name=f13
...
Let's add a new track based on the feature "remark" using the "generic" glyph. In the configuration panel, select "remark" from the feature list box and "generic" from the glyph list box, then click "Add Track". This opens the "Generic Glyph Parameters" form (Figure 3).
Figure 3: Glyph Parameters Form
Change the "Key" to "ExampleFeatures" and "Glyph Background Color" to "navy". You may want to experiment with the other parameter values, including those available in the "Advanced Section". "Save and Continue" and "Cancel Changes" buttons are available at both the top and the bottom of the form. Once you are done setting the parameter values, click "Save and Continue" to add the new track and go back to the Configuration Panel. Since we now have a track added, you will notice a new section, "Tracks Added", extending the configuration panel (Figure 4). A Configured Tracks listbox shows the tracks added so far (currently only the "remark" track). Selecting a track displays its configuration in a Configuration Settings box on the right.
Figure 4: "Tracks Added" Section in the Configuration Panel
Click the "Display in GBrowse 1.70" button or the "Display in GBrowse 2.0" button to display the configured dataset in your desired version of GBrowse (Figure 5). A WebGBrowse Control Panel with "Edit Configuration" and "Download Configuration" buttons is provided at the top of the GBrowse display. These buttons let you go back to the Configuration panel to make further changes or download the generated configuration file. You may use the downloaded configuration file as a template when configuring other similar datasets.
Figure 5: GBrowse Display with WebGBrowse Control Panel
You can edit or delete the added tracks by clicking the "Edit Track" or "Delete Track" button, respectively, in the configuration panel (Figure 4). Clicking the "Edit Track" button opens the configuration of the selected track in the Glyph Parameters form (Figure 3) for editing. There is no direct way to change the glyph associated with a track; to change it, delete the track by clicking the "Delete Track" button and add it again with the new glyph.
Let's examine a few more features by adding more tracks. From the configuration panel add a new track based on the feature "polypeptide_domain" and the glyph "span".
ctgA example polypeptide_domain 11911 15561 . + . Name=m11;Note=kinase
ctgA example polypeptide_domain 13801 14007 . - . Name=m05;Note=helix loop helix
ctgA example polypeptide_domain 14731 17239 . - . Name=m14;Note=kinase
ctgA example polypeptide_domain 15396 16159 . + . Name=m03;Note=zinc finger
ctgA example polypeptide_domain 17023 17675 . + . Name=m08;Note=7-transmembrane
ctgA example polypeptide_domain 17667 17690 . + . Name=m13;Note=DEAD box
...
In the "Span Glyph Parameters" form, change the "Key" to "Example Motifs" and "Glyph Height" to "5". The default value for "Link" is "AUTO", which makes GBrowse generate an automatic link to a helper script named "gbrowse_details". You can change it to any URL you wish, and the URL can include any of the built-in recognized variables such as $name and $description. For the complete list of recognized variables, please see the sub-section "link" in the GBrowse Configuration HOWTO. For now, change the link to "http://www.google.com/search?q=$description" and click "Next". This makes GBrowse generate a link to a Google search page with the feature's description as the search term.
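To make the effect of the link template concrete, the sketch below expands "$name" and "$description" placeholders for one feature. It only imitates the substitution in Python; GBrowse performs the real expansion itself, and the expand_link function, the URL-encoding step and the use of the Note value as the description are assumptions made for this illustration.

```python
from string import Template
from urllib.parse import quote_plus


def expand_link(link_template, feature):
    """Fill $name-style placeholders with URL-encoded values from a feature dict."""
    encoded = {key: quote_plus(str(value)) for key, value in feature.items()}
    # safe_substitute leaves any placeholder without a matching key untouched.
    return Template(link_template).safe_substitute(encoded)


# Values taken from the m03 row of the sample data (Note used as the description).
feature = {"name": "m03", "description": "zinc finger"}
print(expand_link("http://www.google.com/search?q=$description", feature))
# http://www.google.com/search?q=zinc+finger
```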
The sample dataset contains features of type "match" as shown below. You will notice that multiple lines carry the same ID representing the subparts of the same feature. Such features can be best displayed using "segments" or "transcript" or "transcript2" glyphs.
ctgA example match 32329 32359 . + . ID=match-seg01;Name=seg01;Note=This is a segment
ctgA example match 26122 26126 . + . ID=match-seg02;Name=seg02
ctgA example match 26497 26869 . + . ID=match-seg02;Name=seg02
ctgA example match 27201 27325 . + . ID=match-seg02;Name=seg02
ctgA example match 27372 27433 . + . ID=match-seg02;Name=seg02
ctgA example match 27565 27565 . + . ID=match-seg02;Name=seg02
...
Add a track using the feature "match" and the glyph "segments". Change the "Key" to "Example Alignments", "Glyph Background Color" to "lightgrey" and "Connector Type" to "Solid" (Figure 6). You can also experiment with the glyphs "transcript" and "transcript2" to display features with subparts. For more description on these glyphs, please visit the glyph library page.
Figure 6: Segments Glyph Displaying Features with Multiple Subparts
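The idea that rows sharing an ID are subparts of one feature can be sketched as follows. This Python snippet is only an illustration of the grouping, not the code GBrowse uses to load GFF3; the function names are hypothetical and the rows are tab-separated copies of the sample lines above.

```python
def parse_attributes(column9):
    """Turn 'ID=x;Name=y' into {'ID': 'x', 'Name': 'y'}."""
    pairs = (item.split("=", 1) for item in column9.split(";") if "=" in item)
    return {key: value for key, value in pairs}


def group_segments(gff3_lines):
    """Map each ID to the sorted (start, end) segments of its rows."""
    features = {}
    for line in gff3_lines:
        cols = line.rstrip("\n").split("\t")
        feature_id = parse_attributes(cols[8]).get("ID")
        if feature_id is None:
            continue  # rows without an ID are not grouped in this sketch
        features.setdefault(feature_id, []).append((int(cols[3]), int(cols[4])))
    return {feature_id: sorted(segs) for feature_id, segs in features.items()}


rows = [
    "ctgA\texample\tmatch\t32329\t32359\t.\t+\t.\tID=match-seg01;Name=seg01;Note=This is a segment",
    "ctgA\texample\tmatch\t26122\t26126\t.\t+\t.\tID=match-seg02;Name=seg02",
    "ctgA\texample\tmatch\t26497\t26869\t.\t+\t.\tID=match-seg02;Name=seg02",
    "ctgA\texample\tmatch\t27201\t27325\t.\t+\t.\tID=match-seg02;Name=seg02",
]
for feature_id, segments in group_segments(rows).items():
    print(feature_id, segments)
```

Each key of the result corresponds to one multi-segment feature that a glyph such as "segments" would draw as connected boxes.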
The sample dataset we downloaded earlier contains a "gene" feature, which uses a three-tiered structure to represent the gene, descending from gene to mRNA to CDS and UTR features. The various parts fit together through the ID and Parent attributes in column 9. The gene in our dataset is named EDEN, with three spliced forms named EDEN.1, EDEN.2 and EDEN.3.
ctgA example gene 1050 9000 . + . ID=EDEN;Name=EDEN;Note=protein kinase
ctgA example mRNA 1050 9000 . + . ID=EDEN.1;Parent=EDEN;Name=EDEN.1;Note=Eden splice form 1;Index=1
ctgA example five_prime_UTR 1050 1200 . + . Parent=EDEN.1
ctgA example CDS 1201 1500 . + 0 Parent=EDEN.1
ctgA example CDS 3000 3902 . + 0 Parent=EDEN.1
ctgA example CDS 5000 5500 . + 0 Parent=EDEN.1
ctgA example CDS 7000 7608 . + 0 Parent=EDEN.1
ctgA example three_prime_UTR 7609 9000 . + . Parent=EDEN.1
ctgA example mRNA 1050 9000 . + . ID=EDEN.2;Parent=EDEN;Name=EDEN.2;Note=Eden splice form 2;Index=1
ctgA example five_prime_UTR 1050 1200 . + . Parent=EDEN.2
ctgA example CDS 1201 1500 . + 0 Parent=EDEN.2
ctgA example CDS 5000 5500 . + 0 Parent=EDEN.2
ctgA example CDS 7000 7608 . + 0 Parent=EDEN.2
ctgA example three_prime_UTR 7609 9000 . + . Parent=EDEN.2
ctgA example mRNA 1300 9000 . + . ID=EDEN.3;Parent=EDEN;Name=EDEN.3;Note=Eden splice form 3;Index=1
ctgA example five_prime_UTR 1300 1500 . + . Parent=EDEN.3
ctgA example five_prime_UTR 3000 3300 . + . Parent=EDEN.3
ctgA example CDS 3301 3902 . + 0 Parent=EDEN.3
ctgA example CDS 5000 5500 . + 1 Parent=EDEN.3
ctgA example CDS 7000 7600 . + 1 Parent=EDEN.3
ctgA example three_prime_UTR 7601 9000 . + . Parent=EDEN.3
The "gene" glyph can be used to depict alternatively spliced protein-coding genes. If you want to display the individual transcripts, use the "processed_transcript" glyph (see below). Let's add a track for the feature "gene" and the glyph "gene" (Figure 7). Change the "Key" to "Protein-coding Genes", "Glyph Background color" to "peachpuff" and "Category" to "Genes", and check the option "Show Label Transcripts". Changing the category places the track under a different group in the GBrowse display. If you want the track to look like the UCSC Genome Browser, set "Glyph Background color" to "black", and in the "Transcript" section change "UTR Color" to "black" and check the options "Make UTRs thinner" and "Decorate Introns".
Figure 7: Gene Glyph Displaying Alternatively Spliced Protein-Coding Genes
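The three-tiered gene structure is held together by the ID and Parent attributes in column 9. As a rough illustration (not GBrowse's loader), the Python sketch below rebuilds the parent-to-child relationships from a few tab-separated EDEN rows; the function names and the "type:start-end" labels for rows without an ID are choices made for this example.

```python
from collections import defaultdict


def parse_attributes(column9):
    """Turn 'ID=x;Parent=y' into {'ID': 'x', 'Parent': 'y'}."""
    pairs = (item.split("=", 1) for item in column9.split(";") if "=" in item)
    return {key: value for key, value in pairs}


def build_children(gff3_lines):
    """Map each parent ID to the labels of the rows that name it as Parent."""
    children = defaultdict(list)
    for line in gff3_lines:
        cols = line.rstrip("\n").split("\t")
        attrs = parse_attributes(cols[8])
        # Rows without an ID (UTR and CDS rows here) get a type:start-end label.
        label = attrs.get("ID") or f"{cols[2]}:{cols[3]}-{cols[4]}"
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                children[parent].append(label)
    return dict(children)


rows = [
    "ctgA\texample\tgene\t1050\t9000\t.\t+\t.\tID=EDEN;Name=EDEN;Note=protein kinase",
    "ctgA\texample\tmRNA\t1050\t9000\t.\t+\t.\tID=EDEN.1;Parent=EDEN;Name=EDEN.1",
    "ctgA\texample\tfive_prime_UTR\t1050\t1200\t.\t+\t.\tParent=EDEN.1",
    "ctgA\texample\tCDS\t1201\t1500\t.\t+\t0\tParent=EDEN.1",
    "ctgA\texample\tmRNA\t1050\t9000\t.\t+\t.\tID=EDEN.2;Parent=EDEN;Name=EDEN.2",
]
tree = build_children(rows)
print(tree["EDEN"])    # the gene's transcripts
print(tree["EDEN.1"])  # the parts of the first transcript
```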
The "gene" glyph can also be used to illustrate simpler genes. For example, choose the feature "CDS:predicted" and the glyph "gene". Change the "Key" to "Predicted genes", "Glyph Background Color" to "white" and "Category" to "Genes" (Figure 8).
ctgA predicted CDS 10000 11500 . + 0 Name=Apple1;Note=A for Apple
ctgA predicted CDS 13000 13800 . + 0 ID=cds-Apple2;Name=Apple2;Note=AnotherApple
ctgA predicted CDS 15000 15500 . + 1 ID=cds-Apple2;Name=Apple2
ctgA predicted CDS 17000 17200 . + 2 ID=cds-Apple2;Name=Apple2
Figure 8: Gene Glyph Displaying Simpler Genes
Our sample dataset also contains a set of features as shown below which represent a single transcript that has both coding and non-coding regions. This transcript can be displayed using the "processed_transcript" glyph (also known as "so_transcript").
ctgA exonerate mRNA 17400 23000 . + . ID=rna-Apple3;Name=Apple3;Note=Predicted
ctgA exonerate UTR 17400 17999 . + . Parent=rna-Apple3
ctgA exonerate CDS 18000 18800 . + 0 Parent=rna-Apple3
ctgA exonerate CDS 19000 19500 . + 1 Parent=rna-Apple3
ctgA exonerate CDS 21000 21200 . + 2 Parent=rna-Apple3
ctgA exonerate UTR 21201 23000 . + . Parent=rna-Apple3
Choose the feature "mRNA:exonerate" and the glyph "processed_transcript". Change the "Key" to "Exonerate Predictions", "Glyph Background Color" to "beige" and "Category" to "Genes" (Figure 9).
Figure 9: processed_transcript Glyph Displaying Simpler Genes
For the example shown in Figure 7, you can use the "cds" glyph to show whether the reading frame of the third exon is preserved between EDEN.1 and EDEN.3. Add a track using the feature "mRNA" and the glyph "cds". Change the "Key" to "Frame usage" and "Category" to "Genes", and check the option "Ignore features with empty phase value" in the "Advanced" section (Figure 10). You can also change the colors of the different frames in the "Frames" section.
Figure 10: cds Glyph Displaying Reading Frames
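The frame information comes from the phase value in column 8 of each CDS row. In the sample data, the exon at 5000-5500 has phase 0 in EDEN.1 but phase 1 in EDEN.3, so its reading frame differs between the two splice forms. The Python sketch below recomputes the phases from the CDS segment lengths, following the GFF3 definition of phase as the number of bases to skip to reach the next codon. It assumes a plus-strand transcript whose first CDS segment starts a codon (phase 0); the function name is hypothetical and this is not how the "cds" glyph itself is implemented.

```python
def expected_phases(cds_segments):
    """Phase of each CDS segment, given (start, end) pairs in 5'-to-3' order."""
    phases = []
    phase = 0  # assumption: the first segment begins on a codon boundary
    for start, end in cds_segments:
        phases.append(phase)
        length = end - start + 1
        # Bases left over after whole codons decide how many bases the next
        # segment must skip before a new codon starts.
        phase = (3 - (length - phase) % 3) % 3
    return phases


eden1 = [(1201, 1500), (3000, 3902), (5000, 5500), (7000, 7608)]
eden3 = [(3301, 3902), (5000, 5500), (7000, 7600)]
print(expected_phases(eden1))  # [0, 0, 0, 0]
print(expected_phases(eden3))  # [0, 1, 1]
```

Both results agree with the phase column of the corresponding CDS rows in the sample dataset.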
You can plot microarray or tiling array data using the "xyplot" glyph. The sample dataset has lines with feature type "microarray_oligo" representing microarray data.
ctgA affy microarray_oligo 1 100 281 . . Name=Expt1
ctgA affy microarray_oligo 101 200 183 . . Name=Expt1
ctgA affy microarray_oligo 201 300 213 . . Name=Expt1
ctgA affy microarray_oligo 301 400 191 . . Name=Expt1
ctgA affy microarray_oligo 401 500 288 . . Name=Expt1
ctgA affy microarray_oligo 501 600 184 . . Name=Expt1
...
Add a track with the feature "microarray_oligo" and the glyph "xyplot". Change the "Key" to "Transcriptional Profile", "Graph Type" to "Boxes" and "Scale Position" to "right" (Figure 11).
Figure 11: xyplot Glyph Displaying Microarray Data
Finally, let us see how to group 5' and 3' EST reads. The sample dataset contains lines with feature type "EST_match" as shown below. The name of the feature indicates the read to which it belongs. For example, the EST_match agt830.5 is a 5' read, whereas agt830.3 is a 3' read.
ctgA est EST_match 1050 1500 . + . ID=Match1;Name=agt830.5;Target=agt830.5 1 451
ctgA est EST_match 3000 3202 . + . ID=Match1;Name=agt830.5;Target=agt830.5 452 654
ctgA est EST_match 5410 5500 . - . ID=Match2;Name=agt830.3;Target=agt830.3 505 595
ctgA est EST_match 7000 7503 . - . ID=Match2;Name=agt830.3;Target=agt830.3 1 504
ctgA est EST_match 1050 1500 . + . ID=Match3;Name=agt221.5;Target=agt221.5 1 451
ctgA est EST_match 5000 5500 . + . ID=Match3;Name=agt221.5;Target=agt221.5 452 952
ctgA est EST_match 7000 7300 . + . ID=Match3;Name=agt221.5;Target=agt221.5 953 1253
...
Add a track with the feature "EST_match" and the glyph "segments". Change the "Key" to "ESTs", "Glyph Background color" to "orange", "Glyph Height" to "6" and "Connector Type" to "Solid", then proceed to the GBrowse display by clicking the "Display in GBrowse 1.70" or "Display in GBrowse 2.0" button. You will notice that the segments of the individual features are joined together, but the 5' and 3' reads of the same EST are not yet grouped (Figure 12).
Figure 12: EST Match Display - Ungrouped
Let us group the ESTs with the same name together. From the WebGBrowse Control Panel, click the "Edit Configuration" button to return to the configuration panel. In the "Tracks Added" section, select the "EST_match" track from the "Configured Tracks" list and click the "Edit Track" button. In the "Segments Glyph Parameters" form, change the value of "Group Pattern" in the "Advanced" section to "/\.[53]$/". This value is a regular expression that makes ESTs whose names differ only in the ending ".3" or ".5" be grouped together (Figure 13). Save and proceed to the GBrowse display.
Figure 13: EST Match Display - Grouped Features
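The grouping can be pictured as stripping the read suffix from each name and collecting the names that end up identical. The Python sketch below does this with a pattern that matches a trailing ".5" or ".3", written here as \.[53]$. It illustrates the idea only; the assumption that the pattern is removed from the feature name to find the group, and the function name, are not taken from the GBrowse source.

```python
import re
from collections import defaultdict

# Matches a literal dot followed by 5 or 3 at the end of the name.
GROUP_PATTERN = re.compile(r"\.[53]$")


def group_reads(names):
    """Group read names by the base name left after removing the suffix."""
    groups = defaultdict(list)
    for name in names:
        groups[GROUP_PATTERN.sub("", name)].append(name)
    return dict(groups)


print(group_reads(["agt830.5", "agt830.3", "agt221.5"]))
# {'agt830': ['agt830.5', 'agt830.3'], 'agt221': ['agt221.5']}
```

A pattern without the [53] character class, such as \.$, would only match names that end in a dot and would leave these reads ungrouped.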
Instead of starting from scratch every time, you can use a pre-existing configuration file as a template when configuring similar datasets. Download your own configuration file by clicking the "Download Configuration" button in the WebGBrowse Control Panel (Figure 5), or download the sample configuration file (volvox.conf) provided in the WebGBrowse Input form (see Figure 1) on the WebGBrowse homepage (http://webgbrowse.cgb.indiana.edu/). Then go to the WebGBrowse homepage and upload the volvox.gff3 and volvox.conf files in the "GFF3 File" and "Configuration file to be used as template" fields respectively. In the "Tracks Added" section of the "Configuration Panel", you will see the tracks pre-configured from the template. WebGBrowse ignores any tracks in the template that are associated with features missing from the uploaded dataset. You can add more tracks or modify the existing tracks as described earlier.
We recommend using only configuration files generated by WebGBrowse as templates. Advanced users who want to use GBrowse configuration files from other sources may try renaming the track stanzas to the pattern "Track" followed by a four-digit number, for example "[Track9999]". Renaming the track stanzas, however, does not guarantee that those files will work correctly as templates.
This tutorial is intended to get you started with WebGBrowse. It is worth exploring the other glyphs and parameters that are not discussed above. Note also that the role of WebGBrowse is limited to generating the configuration information and hosting it on the GBrowse server; the actual display is generated by the GBrowse software itself. For details on the features available in GBrowse, please refer to the GBrowse user tutorial from OpenHelix. For questions, bug reports, support requests and suggestions, please visit our support page.
Page last updated by Ram Podicheti on January 25, 2010 11:14 AM EST