FlyBase:GBrowse Tracks

From FlyBase Wiki
Revision as of 12:54, 5 August 2014 by Susan St.Pierre (talk | contribs)

Reference Genome Annotations (Iso-1)

Gene Span Shows the total extent of the transcribed region of an annotated gene (including non-coding genes), with direction of transcription indicated. No transcript substructure is shown; corresponds to the entire extent defined by all annotated transcripts. Hyperlinked to Gene Report; label shows FlyBase gene symbol.
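
The relationship between transcripts and the Gene Span glyph can be sketched in a few lines of Python. This is only an illustration of the rule stated above (the span runs from the leftmost to the rightmost annotated transcript coordinate); the `Transcript` type, the `gene_span` helper, and the coordinates are hypothetical and are not part of FlyBase or GBrowse.

```python
from typing import Iterable, NamedTuple, Tuple


class Transcript(NamedTuple):
    """Hypothetical record for one annotated transcript."""
    start: int   # leftmost coordinate (inclusive)
    end: int     # rightmost coordinate (inclusive)
    strand: str  # "+" or "-"


def gene_span(transcripts: Iterable[Transcript]) -> Tuple[int, int, str]:
    """Return (start, end, strand) covering every transcript of one gene."""
    transcripts = list(transcripts)
    if not transcripts:
        raise ValueError("a gene needs at least one annotated transcript")
    strands = {t.strand for t in transcripts}
    if len(strands) != 1:
        raise ValueError("all transcripts of a gene are assumed to share a strand")
    # No exon/intron substructure is kept: only the outer extent survives.
    return (min(t.start for t in transcripts),
            max(t.end for t in transcripts),
            strands.pop())


# Made-up coordinates for two isoforms of the same gene.
isoforms = [Transcript(1000, 5200, "+"), Transcript(1500, 6100, "+")]
span = gene_span(isoforms)
print(span)
```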

Transcript SO:0000673 Shows the exon (wider bars) and intron (black line) structure of each annotated coding transcript, with direction of transcription indicated. Hyperlinked to Transcript Report; if label option is on, shows FlyBase symbol.

CDS Shows extent of sequence encoding each specific polypeptide, with direction of transcription indicated; introns indicated as narrow lines. Hyperlinked to Polypeptide Report; if label option is on, shows FlyBase symbol.

Natural TE SO:0000101 Shows the extent of a natural transposable element in the sequenced strain (at the time it was sequenced). Hyperlinked to Natural Transposon report; if label option is on, shows FlyBase symbol.

Repeat region Regions of genomic repeats and low complexity DNA sequences, as computed using RepeatMasker and RepeatRunner (Smith, et al., 2007).

General

Estimated Cytological band Approximate extent of the classical cytological chromosome bands described by Bridges. See Computed cytological data in FlyBase for a detailed description of how this computed cytological location is calculated. See Sorsa polytene maps for a collection of EM micrographs of chromosome regions aligned to Bridges' maps.

3-frame translation (forward) If zoomed out (greater than 400bp), shows ticks at sites of stop codons for the three frames on the forward strand. If zoomed in (400bp or less), shows predicted translations for each frame, using single-letter amino acid code.

3-frame translation (reverse) If zoomed out (greater than 400bp), shows ticks at sites of stop codons for the three frames on the reverse strand. If zoomed in (400bp or less), shows predicted translations for each frame, using single-letter amino acid code.
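
As a rough illustration of what the two 3-frame translation tracks display, the sketch below translates a sequence in three frames and lists stop-codon positions. It assumes the standard genetic code and mimics the 400bp switch described above; the function names are hypothetical and this is not the GBrowse implementation. For the reverse track, the same functions would be applied to the reverse complement, and positions are then relative to that reverse-complemented string.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_frame(seq, frame):
    """Single-letter translation of one frame (0, 1 or 2); '*' marks a stop."""
    seq = seq.upper()
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(frame, len(seq) - 2, 3))


def stop_positions(seq, frame):
    """0-based start positions of stop codons in one frame."""
    seq = seq.upper()
    return [i for i in range(frame, len(seq) - 2, 3)
            if CODON_TABLE.get(seq[i:i + 3]) == "*"]


def reverse_complement(seq):
    """Reverse complement, used as input for the reverse-strand frames."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def three_frame_view(seq, zoom_threshold=400):
    """Stop-codon ticks when zoomed out, translations when zoomed in."""
    if len(seq) > zoom_threshold:
        return {frame: stop_positions(seq, frame) for frame in range(3)}
    return {frame: translate_frame(seq, frame) for frame in range(3)}


dna = "ATGGCCTAAGGATGA"
forward = three_frame_view(dna)
reverse = three_frame_view(reverse_complement(dna))
print(forward)
print(reverse)
```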

DNA/GC Content If zoomed out (greater than 100bp) shows a graphic of GC content calculated over 10bp intervals. If zoomed in (100bp or less), shows the double-stranded DNA sequence.
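
The GC content graphic can be approximated with a short calculation. The sketch below assumes consecutive, non-overlapping 10bp windows; the description above does not say whether the track uses non-overlapping or sliding intervals, so that choice and the `gc_windows` name are assumptions.

```python
def gc_windows(seq, window=10):
    """GC fraction for consecutive, non-overlapping windows of `window` bp.

    The last window may be shorter than `window`; its fraction is computed
    over the bases actually present.
    """
    seq = seq.upper()
    fractions = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        gc = sum(base in "GC" for base in chunk)
        fractions.append(gc / len(chunk))
    return fractions


profile = gc_windows("GGGGGCCCCCAAAAATTTTTGCGCATATAT")
print(profile)
```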

Aligned Evidence

cDNA D. melanogaster cDNA sequences submitted to the sequence databases; shows the exon (wider bars) and intron (black line) structure, and direction of transcription. Sequences submitted prior to 2003 aligned using sim4 (Florea, et al., 1998, Genome Res. 8:967-74) or sim4tandem (sim4 modified by S. Shu, BDGP) to Release 3 by the BDGP and promoted to Releases 4 and 5 by FlyBase. Sequences submitted since 2003 aligned to Releases 4 and 5 by NCBI and submitted to FlyBase. Some genomic DNA submissions, including TPA submissions, are included in this tier.

EST ("expressed sequence tag") Indicated in light green. Partial sequence of a cDNA; shows the exon (wider bars) and intron (narrow bars) structure, and direction of transcription. D. melanogaster EST sequences submitted to the sequence databases prior to 2003 aligned using sim4 (Florea, et al., 1998, Genome Res. 8:967-74) to Release 3 by the BDGP and promoted to Releases 4 and 5 by FlyBase. D. melanogaster EST sequences submitted since 2003 aligned to Releases 4 and 5 by NCBI and submitted to FlyBase.

mRNA D. melanogaster mRNA sequences submitted to the sequence databases;  shows the exon (wider bars) and intron (narrow bars) structure, and direction of transcription.  Sequences are aligned to Release 5 by NCBI and submitted to FlyBase.  These mRNA sequences frequently reflect composite assemblies of a conceptual transcript and therefore may not be available as reagents.

other aligned sequences  D. melanogaster aligned nucleotides submitted to the sequence databases.  These aligned sequences were generally submitted prior to 2003 and aligned using sim4 (Florea, et al., 1998, Genome Res. 8:967-74) or sim4tandem (sim4 modified by S. Shu, BDGP) to Release 3 by the BDGP and promoted to Releases 4 and 5 by FlyBase.  Some genomic DNA submissions, including TPA submissions, are included in this tier.

RNA-Seq based exon junctions Indicated in blue. Orientation of the junction is indicated by an arrowhead in the center of the junction. Mousing over the glyph for an RNA-Seq junction activates a pop-up indicating the total read counts for the junction. The read counts are separated based upon which dataset the information is from (modENCODE or Baylor, see below for Dataset Report links).

BCM_1_RNAseq_junctions Dataset Report
modENCODE_mRNA-Seq_U_junctions Dataset Report

PeptideAtlas peptides Indicated in yellow. Alignment of peptide sequences determined by mass spectrometry, derived from polypeptides isolated from the sequenced strain at various developmental stages. Contributed by the Center for Model Organism Proteomes, SystemsX and Research Priority Project of the University of Zurich, Switzerland. For more information, see Peptide Atlas.

Transcription Start Sites (embryonic) Horizontal extent of the glyph indicates the range over which 90 percent of the signal is located. Red indicates signal along the genome as a histogram. Clicking on the feature will link you to the relevant Sequence Feature report. Genomic sequences identified by integrative analysis of ESTs, CAGE or RLM-RACE. Note: data for embryonic stages only.

mE_Transcription_Start_Sites Dataset Report

Mapped Mutations

Transgene insertion site SO:0000368 Indicated by blue triangles with an arrow or diamonds with no arrow. An insertion indicated by a downward-pointing triangle and an arrow pointing to the right is oriented with its conventional 5' terminus to the left (assuming view is in conventional orientation of the Drosophila chromosome); this is described as being in the "plus" orientation. An insertion indicated by an upward-pointing triangle and an arrow pointing to the left is oriented with its conventional 5' terminus to the right (assuming view is in conventional orientation of the Drosophila chromosome); this is described as being in the "minus" orientation. An insertion indicated with a blue diamond has an unknown orientation. For all insertions, if the estimated insertion site is larger than 10 nucleotides, there are short dashes on either side of the triangle or diamond. In those cases, see the Insertion Report for more information about localization. Insertions are hyperlinked to their respective Insertion Report. NOTE: If the "Flip" option is selected, the orientations of the insertion triangles do not change and thus will be incorrect. For a guide on how to work out the orientation of an insertion relative to a transcript of interest, see this short PowerPoint presentation (download).
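
The glyph rules for insertions can be summarized as a small lookup. The sketch below only restates the description above in code form; the `Glyph` type and `insertion_glyph` function are hypothetical, and treating the site coordinates as inclusive when measuring its length is an assumption.

```python
from typing import NamedTuple, Optional


class Glyph(NamedTuple):
    shape: str            # "downward triangle", "upward triangle" or "diamond"
    arrow: Optional[str]  # "right", "left", or None when orientation is unknown
    dashes: bool          # True when the estimated site is larger than 10 nt


def insertion_glyph(orientation: Optional[str], start: int, end: int) -> Glyph:
    """Map an insertion's orientation and estimated site to its glyph.

    orientation: "plus", "minus", or None for unknown.
    start, end:  estimated insertion site (assumed inclusive coordinates).
    """
    if orientation == "plus":
        shape, arrow = "downward triangle", "right"
    elif orientation == "minus":
        shape, arrow = "upward triangle", "left"
    elif orientation is None:
        shape, arrow = "diamond", None
    else:
        raise ValueError(f"unexpected orientation: {orientation!r}")
    site_length = end - start + 1
    return Glyph(shape, arrow, site_length > 10)


precise_plus = insertion_glyph("plus", 100, 100)
vague_unknown = insertion_glyph(None, 100, 150)
print(precise_plus)
print(vague_unknown)
```

Note that this mapping describes the unflipped view only; as stated above, the triangles are not redrawn when "Flip" is selected.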

Point mutation SO:1000008 A single nucleotide has been changed into another nucleotide. Location of mutation is indicated with a grey bar and labeled with the FlyBase allele symbol. The feature glyph is hyperlinked to the related Allele Report.

Sequence variant SO:0000109 A region of sequence where variation has been observed. Often these refer to natural variants of a protein that lead to two different functions. The location of the mutation is indicated with a grey bar and labeled with the FlyBase allele symbol. The feature glyph is hyperlinked to the related Allele Report.

Uncharacterized change in nucleotide sequence SO:1000007 The nature of the nucleotide substitution is either uncharacterized or only partially characterized. The location of the mutation is indicated with a grey bar and labeled with the FlyBase allele symbol. The feature glyph is hyperlinked to the related Allele Report.

Aberration junction SO:0000687 Location of aberration breakpoint reported in the literature. Labeled with FlyBase aberration symbol designation and the numerical designation of the breakpoint mapped (where known). Often the exact breakpoint location is unknown and the feature indicates a range within which the breakpoint has been mapped. Genetic data are available in the Aberration Report, which can be accessed directly by clicking on the feature.

Complex substitution SO:1000005 The mutation occurred from a mutation event that cannot be determined from the observed DNA change. Location of mutation is indicated with a grey bar and labeled with the FlyBase allele symbol. Feature is hyperlinked to the related Allele Report.

Indels SO:1000032 The junction where an insertion or deletion of one or more nucleotides occurred. Location of mutation is indicated with a grey bar and labeled with the FlyBase allele symbol. In the case of deletions, the extent of the deletion is indicated by a grey bar. In the case of nucleotide insertions, the location of the nucleotide(s) insertion is indicated by a vertical grey line. Features are hyperlinked to the related Allele Report.

Rescue fragment SO:0000411 Locations of transgenic rescue fragments reported in the literature. Labeled with FlyBase allele symbol designation. Features are hyperlinked to the related Allele Reports, which contain genetic data.

Gene Predictions

NCBI gnomon, 2014 Gene model prediction, transcript prediction. A description of the annotation pipeline can be found at: http://www.ncbi.nlm.nih.gov/genome/annotation_euk/process/

NCBI gnomon, 2006 (coding region of transcript) generated via a hidden Markov model using transcript alignment constraints and protein hit information, if available; allows prediction of alternatively spliced isoforms (Souvorov, et al., 2006, NCBI); submitted by J. Ostell.

CONTRAST (coding region of transcript) using a semi-Markov conditional random field (Gross, Do, and Batzoglou, 2005, BCATS 2005 Symposium Proceedings, p. 82); submitted by S. Batzoglou.

PhyloCSF (CONGO) Exon prediction. Region of sequence conservation across multiple Drosophila species, with a pattern of conservation indicative of a protein-coding extent and termini consistent with exon structure (start, splice or stop); submitted by M. Lin and M. Kellis. Related publications: Lin MF, Carlson, JW et al. (2007), Lin MF, Jungreis I, and Kellis M. (2011)

Similarity: Synteny features

Proteins (aligned by BLAST)

Dmel proteins D. melanogaster protein sequences submitted to the sequence databases. Sequences submitted prior to 2003 aligned by BLASTX to Release 3 by the BDGP and promoted to Releases 4 and 5 by FlyBase. Sequences submitted since 2003 aligned by TBLASTN to Releases 4 and 5 by NCBI and submitted to FlyBase.

Other proteins Protein sequences submitted to the sequence databases since 2003, from species other than D. melanogaster; aligned by TBLASTN to Releases 4 and 5 by NCBI and submitted to FlyBase.

Synteny Features

Orthologs. Indicated by yellow bar covering extent of the orthologous region. Mousing over this bar will cause a pop-up box to appear which will contain links for all of the orthologs of the gene. The ortholog box will contain the drosophilid orthologs on the left and other orthologs on the right.

Blast Hit

Blast HSP If entry into GBrowse was via a linked hit using the FlyBase Blast tool, the extent of the aligned region is shown as a vertical grey bar that extends through all the GBrowse tiers.

Noncoding Features

Insulators class I Insulator_Class_I.mE01 Dataset Report

Insulators class II Insulator_Class_II.mE01 Dataset Report

Protein binding site. Indicated by grey bar. Locations of protein binding site reported in the literature, as compiled by FlyBase and/or FlyReg (formerly RedFly). Reference and supporting information available on related Sequence Feature reports (click feature glyph to access report).

Enhancers mE1_CBP_Enhancers Dataset Report. Genomic sequences identified as putative embryo-only enhancers by virtue of embryo-specific CBP-binding in ChIP assays.

Silencers mE1_HDAC_PRE Dataset Report. Genomic sequences identified as putative polycomb response elements (silencers) in embryos.

Regulatory region. Indicated by grey bar. Location of regulatory feature reported in the literature. Reference and supporting information available on related Sequence Feature reports (click feature glyph to access report).

TFBS – HOT spot analysis mE1_TFBS_HSA Dataset Report. Genomic sequences identified as unique regions of transcription factor (TF) binding using HOT spot analysis (HSA); one or many TFs may bind in a given region. A synthesis of ChIP data sets for 41 different transcription factors. TF binding profiles used in this analysis were assayed at early embryo stages. Mousing over the feature pops up a box that lists the transcription factor genes that bind within the region. Clicking on the feature links to the related Sequence Feature report.

TFBS – zinc finger domain Binding sites for transcription factors that contain one or more zinc finger domains. Clicking on the feature links to the related Sequence Feature report. The following Dataset Reports comprise the data found in this track.

mE1_TFBS_disco
mE1_TFBS_ftz-f1
mE1_TFBS_GATAe
BDTNP1_TFBS_hb
mE1_TFBS_hkb
BDTNP1_TFBS_kni
mE1_TFBS_Kr
mE1_TFBS_sbb
mE1_TFBS_sens
BDTNP1_TFBS_shn
BDTNP1_TFBS_sna
BDTNP1_TFBS_tll
mE1_TFBS_zfh1

TFBS – homeodomain Binding sites for transcription factors that contain one or more homeodomains. The following Dataset Reports comprise the data found in this track.

BDTNP1_TFBS_bcd
mE1_TFBS_cad
mE1_TFBS_Dll
mE1_TFBS_en
mE1_TFBS_eve
BDTNP1_TFBS_ftz
mE1_TFBS_inv
BDTNP1_TFBS_prd
mE1_TFBS_Ubx
BDTNP1_TFBS_z

TFBS – helix-loop-helix domain Binding sites for transcription factors that contain one or more helix-loop-helix domains. The following Dataset Reports comprise the data found in this track.

BDTNP1_TFBS_da
mE1_TFBS_h
mE1_TFBS_kn
BDTNP1_TFBS_twi

TFBS – BTB/POZ domain Binding sites for transcription factors that contain one or more BTB/POZ domains. The following Dataset Reports comprise the data found in this track.

mE1_TFBS_bab1
mE1_TFBS_chinmo
mE1_TFBS_Trl
mE1_TFBS_ttk

TFBS – other Binding sites for transcription factors that do not fall into one of the other categories. The following Dataset Reports comprise the data found in this track.

mE1_TFBS_cnc
mE1_TFBS_D
BDTNP1_TFBS_dl
BDTNP1_TFBS_gt
mE1_TFBS_jumu
BDTNP1_TFBS_Mad
BDTNP1_TFBS_Med
mE1_TFBS_run
BDTNP1_TFBS_slp1
mE1_TFBS_Stat92E

Chromatin domains 5-state model, Kc cells Chromatin_types_NKI.Kc167 Dataset Report. Whole-genome DamID binding profiles of 53 chromatin proteins in Drosophila Kc167 cells were generated and/or analyzed. In the same array platform, ChIP-on-chip profiles of histone H3, H1, H3K9me2, H3K27me3, H3K4me2, and H3K79me3 were obtained. These were correlated with gene expression, which was measured by RNA-tag profiling.

Chromatin domains 9-state model, S2 cells Chromatin_types_mE1.S2 Dataset Report. Demarcation of chromatin domains of nine major types based on analysis of 18 histone modification profiles.

Chromatin domains 9-state model, BG3 cells Chromatin_types_mE2.BG3 Dataset Report. Demarcation of chromatin domains of nine major types based on analysis of 18 histone modification profiles.

Origins of replication mE_Early_Replication_Origins_cells Dataset Report. Genome profile of early activating origins of replication, BrdU label, Kc, BG3 and S2 cell lines.

Microarray Features

Affymetrix v1 Affymetrix_GeneChip_v1 Dataset Report. Oligonucleotides (25-mers) designed by Affymetrix to correspond to annotated transcripts in D. melanogaster. Used for the Affymetrix GeneChip Drosophila Genome Array DrosGenome1 microarray, release date February 19, 2002.

Affymetrix v2 Affymetrix_GeneChip_v2 Dataset Report. Oligonucleotides (25-mers) designed by Affymetrix to correspond to annotated transcripts in D. melanogaster. Used for the Affymetrix GeneChip Drosophila Genome 2.0 Array, release date July 1, 2004.

Expression Levels

Expression Levels: RNA-Seq by Tissue

To aid in visualization, we have divided the RNA-Seq by Tissue data into five categories listed below. The datasets included within each category are also listed. Mousing over the Topoview glyph in GBrowse will cause a color-key to pop up so you can tell which dataset you are looking at.

Digestive system
mE_mRNA_L3_Wand_dig_sys
mE_mRNA_A_1d_dig_sys
mE_mRNA_A_4d_dig_sys
mE_mRNA_A_20d_dig_sys

Fat body and salivary glands
mE_mRNA_L3_Wand_fat
mE_mRNA_WPP_fat
mE_mRNA_P8_fat
mE_mRNA_L3_Wand_saliv
mE_mRNA_WPP_saliv

Imaginal disc and other carcass
mE_mRNA_L3_Wand_imag_disc
mE_mRNA_L3_Wand_carcass
mE_mRNA_A_1d_carcass
mE_mRNA_A_4d_carcass
mE_mRNA_A_20d_carcass

CNS and adult head
mE_mRNA_L3_CNS
mE_mRNA_P8_CNS
mE_mRNA_A_MateM_1d_head
mE_mRNA_A_MateM_4d_head
mE_mRNA_A_MateM_20d_head
mE_mRNA_A_VirF_1d_head
mE_mRNA_A_VirF_4d_head
mE_mRNA_A_VirF_20d_head
mE_mRNA_A_MateF_1d_head
mE_mRNA_A_MateF_4d_head
mE_mRNA_A_MateF_20d_head

Gonads and male accessory glands
mE_mRNA_A_MateM_4d_testis
mE_mRNA_A_MateM_4d_acc_gland
mE_mRNA_A_VirF_4d_ovary
mE_mRNA_A_MateF_4d_ovary

Expression Levels: RNA-Seq

These tracks contain RNA-Seq expression data for several different stages of development, types of tissue culture cells, or treatment conditions. Mousing over the track in GBrowse will cause a key to pop up. The key indicates the meaning of the different colors.

Developmental stage subsets (Baylor) BCM_1_RNAseq Dataset report.

Developmental stage subsets, unique reads (modENCODE) modENCODE_mRNA-Seq_U Dataset report.

Tissue culture cells, by strand (modENCODE Transcription Group) modENCODE_mRNA-Seq_cell.B Dataset report.

Treatments/Conditions modENCODE_mRNA-Seq_treatments Dataset report.

Aberrations

Deleted segment. Indicated in red. When one or more aberrations overlap the region being viewed, a darker red bar labeled "Spanning aberration(s)" will be seen. When moused-over, a pop-up box containing all the aberrations that span the region being viewed will appear. Click one of the aberration symbols to go to the Aberration Report. When mousing over a lighter red bar labeled with a deficiency symbol, a list of genes within the aberration extents pops up. Clicking on one of the gene symbols within the pop up will link to the Gene Report. Clicking on the bar itself links to the Aberration Report.

Duplicated segment. Indicated in blue. When one or more aberrations cover the region being viewed, a darker blue bar labeled "Spanning aberration(s)" will be seen. When moused-over, a pop-up box containing all the aberrations that cover the region being viewed will appear. Click one of the aberration symbols to go to the Aberration Report. When mousing over a lighter blue bar labeled with a duplication symbol, a list of genes within the aberration extents pops up. Clicking on one of the gene symbols within the pop up will link to the Gene Report. Clicking on the bar itself links to the Aberration Report.

Stock center aberration: deleted segment. This track functions just as the "Deleted Segment" track described above. This track contains only deficiencies available from the Bloomington Stock Center. Links to the relevant Stock Report can be found at the bottom of the Aberration Report, which is linked from the feature glyph.

Stock center aberration: duplicated segment This track functions just as the "Duplicated Segment" track described above. This track contains only duplications available from the Bloomington Stock Center. Links to the relevant Stock Report can be found at the bottom of the Aberration Report, which is linked from the feature glyph.

The Bloomington Deficiency Kit:

The Bloomington Deficiency Kit is a set of stocks defined by the Bloomington Drosophila Stock Center (BDSC) to provide maximal coverage of the genome with the minimal number of deficiencies having molecularly mapped breakpoints. The BDSC Deficiency Kit also includes deficiencies with breakpoints that have not been mapped molecularly, primarily to provide coverage of gaps between the molecularly defined deficiencies. Since the ends of cytologically characterized deficiencies cannot be placed on the genome map with certainty, the BDSC has defined segments of these deficiencies that fill gaps in molecularly defined coverage for GBrowse display. The endpoints of gap filling segments are derived primarily from overlapping deficiency endpoints and complementation with annotated genes.

BDSC Deficiency Kit: deleted segment Molecularly defined deficiencies are indicated in red. Click the deficiency to go to the Aberration Report. Links to the relevant Stock Report can be found at the bottom of the Aberration Report.

BDSC Deficiency Kit: gap filling or haploinsufficiency flanking segment. Segments of cytologically defined deficiencies that fill gaps between molecularly defined deficiencies or flank haploinsufficient loci are indicated in yellow. Click the segment icon to go to the Aberration Report for the full deficiency. Links to the relevant Stock Report can be found at the bottom of the Aberration Report.

RNAi Reagents and Data

DRSC RNAi amplicons
DRSC dsRNA amplicon platform Dataset Report. DNA fragments amplified from D. melanogaster genomic DNA (OregonR) by the Drosophila Genomics Resource Center (DGRC), using gene-specific primers made by Incyte and designed to target transcribed regions with minimal sequence similarity to other genes. Used for the DGRC-D.melanogaster-DGRC1-15552-v5 amplicon microarray, release date June 2, 2006 (original release of v1, May 2004). For further information see the DGRC-1 Dataset report.

VDRC RNAi reagent
Segment used to create inverted repeat in RNAi construct from the Vienna Drosophila RNAi Center (Dickson B. et al. 2007.7.18). "GD" identifier also used for allele and construct symbols.

VDRC-1 Dataset Report
VDRC-2 Dataset Report.


TRiP RNAi amplicons

FBlc0000048 TRiP-1 Dataset Report
FBlc0000153 TRiP-2 Dataset Report
FBlc0000185 TRiP-3 Dataset Report
FBlc0000186 TRiP-4 Dataset Report
FBlc0000416 TRiP-5 Dataset Report

BKN Amplicons
Extents of the amplicons are indicated with an orange bar.
HD2 Dataset Report (dsRNA amplicon platform) RNAi amplicons from the GenomeRNAi database.


HFA Amplicons
Extents of the amplicons are indicated with an orange bar.
HFA RNAi amplicons from the GenomeRNAi database.

Other Reagents

Putative brain enhancers (Pfeiffer et al.) GMR_Brain_exp_1 Dataset Report. Glyphs represent putative enhancers used to generate fly stocks carrying GAL4 transgenic constructs designed to be expressed in adult brain. Stocks are available from the Bloomington Stock Center. Clicking the glyph brings up the associated Sequence Feature report (e.g. http://flybase.org/reports/FBsf0000162377.html). On the Sequence Feature report, under "associated information" there is a construct listed. Clicking this construct link brings up the associated Construct Report (e.g. http://flybase.org/reports/FBtp0058072.html) on which you can find a link to the Stock Report (e.g. http://flybase.org/reports/FBst0045107.html). Sorry it's so roundabout!

Tiling BACs Indicated by narrow gray bar. BAC genomic clones used by the Berkeley Drosophila Genome Project as part of the minimal tiling path used for determination of the D. melanogaster genomic sequence (Celniker SE. et al. 2002).

Analysis

Restriction Sites Clicking here marks cut sites for the common restriction enzymes PvuII, ClaI, BamHI, EcoRI.
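
For readers who want to reproduce this kind of annotation outside GBrowse, the sketch below locates the recognition sequences of the four enzymes in a DNA string. The recognition sequences are the standard ones for these enzymes; the `find_sites` helper is hypothetical, and it reports the 0-based start of each recognition sequence rather than the exact cut position within it. All four sites are palindromic, so scanning one strand is sufficient.

```python
import re

# Standard recognition sequences (5' to 3') for the enzymes named above.
RECOGNITION_SITES = {
    "PvuII": "CAGCTG",
    "ClaI": "ATCGAT",
    "BamHI": "GGATCC",
    "EcoRI": "GAATTC",
}


def find_sites(seq):
    """Return {enzyme: [0-based start positions of its recognition sequence]}."""
    seq = seq.upper()
    hits = {}
    for enzyme, site in RECOGNITION_SITES.items():
        # A lookahead pattern reports every occurrence, including overlaps.
        hits[enzyme] = [m.start() for m in re.finditer(f"(?={site})", seq)]
    return hits


# Made-up test sequence containing one EcoRI, one BamHI and one PvuII site.
sites = find_sites("AAGAATTCTTGGATCCAACAGCTGTT")
print(sites)
```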