Difference between revisions of "FlyBase:GBrowse Tracks"

Revision as of 15:59, 31 July 2014

Reference Genome Annotations (Iso-1)

Gene Span Shows the total extent of the transcribed region of an annotated gene (including non-coding genes), with direction of transcription indicated. No transcript substructure is shown; corresponds to the entire extent defined by all annotated transcripts. Hyperlinked to Gene Report; label shows FlyBase gene symbol.

Transcript SO:0000673 Shows the exon (wider bars) and intron (black line) structure of each annotated coding transcript, with direction of transcription indicated. Hyperlinked to Transcript Report; if label option is on, shows FlyBase symbol.

CDS Shows extent of sequence encoding each specific polypeptide, with direction of transcription indicated; introns indicated as narrow lines. Hyperlinked to Polypeptide Report; if label option is on, shows FlyBase symbol.

Natural TE SO:0000101 Shows the extent of a natural transposable element in the sequenced strain (at the time it was sequenced). Hyperlinked to Natural Transposon report; if label option is on, shows FlyBase symbol.

Repeat region Regions of genomic repeats and low complexity DNA sequences, as computed using RepeatMasker and RepeatRunner (Smith, et al., 2007).

General

Estimated Cytological band Approximate extent of the classical cytological chromosome bands described by Bridges. See Computed cytological data in FlyBase for a detailed description of how this computed cytological location is calculated. See Sorsa polytene maps for a collection of EM micrographs of chromosome regions aligned to Bridges' maps.

3-frame translation (forward) If zoomed out (greater than 400bp), shows ticks at sites of stop codons for the three frames on the forward strand. If zoomed in (400bp or less), shows predicted translations for each frame, using single-letter amino acid code.

3-frame translation (reverse) If zoomed out (greater than 400bp), shows ticks at sites of stop codons for the three frames on the reverse strand. If zoomed in (400bp or less), shows predicted translations for each frame, using single-letter amino acid code.
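
The zoomed-out behaviour of the two 3-frame translation tracks can be sketched as follows. This is an illustration only, not the GBrowse implementation: the function names are hypothetical, the standard stop codons (TAA, TAG, TGA) are assumed, and offsets are 0-based relative to the start of the sequence passed in.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard genetic code

def reverse_complement(seq):
    """Reverse-complement a DNA string, for scanning the reverse strand."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def stop_codon_positions(seq):
    """Map each frame (0, 1, 2) to the 0-based start offsets of its stop codons."""
    seq = seq.upper()
    return {
        frame: [i for i in range(frame, len(seq) - 2, 3)
                if seq[i:i + 3] in STOP_CODONS]
        for frame in range(3)
    }

def display_mode(window_bp, threshold_bp=400):
    """Choose what the track shows, using the 400bp threshold described above."""
    return "translation" if window_bp <= threshold_bp else "stop_ticks"
```

For the reverse-strand track, the same scan would be applied to reverse_complement(seq); the offsets returned are then relative to the reverse-complemented sequence.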

DNA/GC Content If zoomed out (greater than 100bp) shows a graphic of GC content calculated over 10bp intervals. If zoomed in (100bp or less), shows the double-stranded DNA sequence.
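
A GC content calculation over 10bp intervals might look like the sketch below. The track description does not say whether the intervals overlap; this example assumes consecutive, non-overlapping windows, and the function name is hypothetical.

```python
def gc_content(seq, window=10):
    """Return the GC fraction of each consecutive window of the sequence.

    Assumption: windows do not overlap; the last window may be shorter.
    """
    seq = seq.upper()
    fractions = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        fractions.append(sum(base in "GC" for base in chunk) / len(chunk))
    return fractions
```

Each value returned corresponds to one point on the GC content graphic for the viewed region.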

Aligned Evidence

cDNA D. melanogaster cDNA sequences submitted to the sequence databases; shows the exon (wider bars) and intron (black line) structure, and direction of transcription. Sequences submitted prior to 2003 aligned using sim4 (Florea, et al., 1998, Genome Res. 8:967-74) or sim4tandem (sim4 modified by S. Shu, BDGP) to Release 3 by the BDGP and promoted to Releases 4 and 5 by FlyBase. Sequences submitted since 2003 aligned to Releases 4 and 5 by NCBI and submitted to FlyBase. Some genomic DNA submissions, including TPA submissions, are included in this tier.

EST ("expressed sequence tag") Indicated in light green. Partial sequence of a cDNA; shows the exon (wider bars) and intron (narrow bars) structure, and direction of transcription. D. melanogaster EST sequences submitted to the sequence databases prior to 2003 aligned using sim4 (Florea, et al., 1998, Genome Res. 8:967-74) to Release 3 by the BDGP and promoted to Releases 4 and 5 by FlyBase. D. melanogaster EST sequences submitted since 2003 aligned to Releases 4 and 5 by NCBI and submitted to FlyBase.

mRNA D. melanogaster mRNA sequences submitted to the sequence databases; shows the exon (wider bars) and intron (narrow bars) structure, and direction of transcription. Sequences are aligned to Release 5 by NCBI and submitted to FlyBase. These mRNA sequences frequently reflect composite assemblies of a conceptual transcript and therefore may not be available as reagents.

other aligned sequences D. melanogaster aligned nucleotides submitted to the sequence databases. These aligned sequences were generally submitted prior to 2003 and aligned using sim4 (Florea, et al., 1998, Genome Res. 8:967-74) or sim4tandem (sim4 modified by S. Shu, BDGP) to Release 3 by the BDGP and promoted to Releases 4 and 5 by FlyBase. Some genomic DNA submissions, including TPA submissions, are included in this tier.

RNA-Seq based exon junctions Indicated in blue. Orientation of the junction is indicated by an arrowhead in the center of the junction. Mousing over the glyph for an RNA-Seq junction activates a pop-up showing the total read counts for the junction. The read counts are listed separately for each source dataset (modENCODE or Baylor; see below for Dataset Report links).

BCM_1_RNAseq_junctions Dataset Report
modENCODE_mRNA-Seq_U_junctions Dataset Report

PeptideAtlas peptides Indicated in yellow. Alignment of peptide sequences determined by mass spectrometry, derived from polypeptides isolated from the sequenced strain at various developmental stages. Contributed by the Center for Model Organism Proteomes, SystemsX and Research Priority Project of the University of Zurich, Switzerland. For more information, see Peptide Atlas.

Transcription Start Sites (embryonic) Horizontal extent of the glyph indicates the range over which 90 percent of the signal is located. Red indicates signal along the genome as a histogram. Clicking on the feature will link you to the relevant Sequence Feature report. Genomic sequences identified by integrative analysis of ESTs, CAGE or RLM-RACE. Note: data for embryonic stages only.

mE_Transcription_Start_Sites

Mapped Mutations

Transgene insertion site SO:0000368 Indicated by blue triangles (which indicate orientation of the transgene insertion site) or diamonds (if orientation is not known). An insertion indicated by a downward-pointing triangle is oriented with its conventional 5' terminus to the left (assuming view is in conventional orientation of the Drosophila chromosome); this is described as being in the "plus" orientation. An insertion indicated by an upward-pointing triangle is oriented with its conventional 5' terminus to the right (assuming view is in conventional orientation of the Drosophila chromosome); this is described as being in the "minus" orientation. Hyperlinked to an Insertion Report; if label option is on, shows FlyBase symbol. NOTE: If the "Flip" option is selected, the orientations of the insertion triangles do not change and thus will be incorrect. For a guide on how to work out the orientation of an insertion relative to a transcript of interest, see this short PowerPoint presentation.
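
The glyph conventions for this track can be summarised in a small lookup, sketched below. The names are hypothetical, and the comparison with a transcript assumes that the "plus" orientation corresponds to the "+" strand of the genome; that mapping is an assumption of this sketch and is not stated in the track description.

```python
# Glyph conventions as described for the Transgene insertion site track.
# Keys and function names are hypothetical, chosen for illustration.
GLYPHS = {
    "plus": ("downward triangle", "left"),   # conventional 5' terminus to the left
    "minus": ("upward triangle", "right"),   # conventional 5' terminus to the right
}

def insertion_glyph(orientation=None):
    """Return (glyph shape, side of the conventional 5' terminus)."""
    return GLYPHS.get(orientation, ("diamond", None))

def relative_orientation(insertion, transcript_strand):
    """Compare an insertion's orientation with a transcript strand ("+" or "-").

    Assumption: "plus" insertions lie on the "+" strand, "minus" on the "-" strand.
    """
    if insertion not in GLYPHS:
        return "unknown"
    insertion_strand = "+" if insertion == "plus" else "-"
    return "same" if insertion_strand == transcript_strand else "opposite"
```

Because the triangles are not redrawn when "Flip" is selected, a lookup like this only matches the display in the conventional chromosome orientation.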

Point mutation SO:1000008 A single nucleotide has been changed into another nucleotide. Location of mutation is indicated with a grey bar and labeled with the FlyBase allele symbol.

Sequence variant SO:0000109 A region of sequence where variation has been observed. Often these refer to natural variants of a protein that lead to two different functions. The location of the mutation is indicated with a grey bar and labeled with the FlyBase allele symbol.

Uncharacterized change in nucleotide sequence SO:1000007 The nature of the nucleotide substitution is either uncharacterized or only partially characterized. The location of the mutation is indicated with a grey bar and labeled with the FlyBase allele symbol.

Aberration junction SO:0000687 Location of aberration breakpoint reported in the literature. Labeled with FlyBase aberration symbol designation and the numerical designation of the breakpoint mapped (where known). Generally the exact breakpoint location is unknown and the feature indicates a range within which the breakpoint has been mapped. References and supporting information available in the "Mapped features and mutations" section of the Gene Report for a nearby gene; genetic data available in the Aberration Report.

Complex substitution SO:1000005 The mutation arose from an event that cannot be determined from the observed DNA change. Location of mutation is indicated with a grey bar and labeled with the FlyBase allele symbol.

Indels SO:1000032 The junction where an insertion or deletion of one or more nucleotides occurred. Location of mutation is indicated with a grey bar and labeled with the FlyBase allele symbol. In the case of deletions, the extent of the deletion is indicated by a grey bar. In the case of insertions, the location of the nucleotide(s) insertion is indicated by a vertical grey line.

Rescue fragment SO:0000411 Locations of transgenic rescue fragment reported in the literature. Labeled with FlyBase allele symbol designation. Reference and supporting information available in the "Mapped features and mutations" section of the Gene Report; genetic data available in the Allele Report.

Gene Predictions

NCBI gnomon, 2014 Gene model prediction, transcript prediction. A description of the annotation pipeline can be found at: http://www.ncbi.nlm.nih.gov/genome/annotation_euk/process/

NCBI gnomon, 2006 (coding region of transcript) generated via a hidden Markov model using transcript alignment constraints and protein hit information, if available; allows prediction of alternatively spliced isoforms (Souvorov, et al., 2006, NCBI); submitted by J. Ostell.

CONTRAST (coding region of transcript) using a semi-Markov conditional random field (Gross, Do, and Batzoglou, 2005, BCATS 2005 Symposium Proceedings, p. 82); submitted by S. Batzoglou.

PhyloCSF (CONGO) Exon prediction. Region of sequence conservation across multiple Drosophila species, with a pattern of conservation indicative of a protein-coding extent and termini consistent with exon structure (start, splice or stop); submitted by M. Lin and M. Kellis.

Similarity: Synteny features

Proteins (aligned by BLAST)

Dmel proteins D. melanogaster protein sequences submitted to the sequence databases. Sequences submitted prior to 2003 aligned by BLASTX to Release 3 by the BDGP and promoted to Releases 4 and 5 by FlyBase. Sequences submitted since 2003 aligned by TBLASTN to Releases 4 and 5 by NCBI and submitted to FlyBase.

Other proteins Protein sequences submitted to the sequence databases since 2003, from species other than D. melanogaster; aligned by TBLASTN to Releases 4 and 5 by NCBI and submitted to FlyBase.

Synteny Features

Orthologs. Indicated by yellow bar covering extent of the orthologous region. Mousing over this bar will cause a pop-up box to appear which will contain links for all of the orthologs of the gene. The ortholog box will contain the drosophilid orthologs on the left and other orthologs on the right.

Blast Hit

Blast HSP If entry into GBrowse was via a linked hit using the FlyBase Blast tool, the extent of the aligned region is shown as a vertical grey bar that extends through all the GBrowse tiers.

Noncoding Features

Insulators class I Insulator_Class_I.mE01 Dataset Report

Insulators class II Insulator_Class_II.mE01 Dataset Report

Protein binding site. Indicated by grey bar. Locations of protein binding sites reported in the literature, as compiled by FlyBase and/or FlyReg. Reference and supporting information available on related Sequence Feature reports (click feature glyph to access report).

Enhancers mE1_CBP_Enhancers Dataset Report

Silencers mE1_HDAC_PRE Dataset Report

Regulatory region. Indicated by grey bar. Location of regulatory feature reported in the literature. Reference and supporting information available on related Sequence Feature reports (click feature glyph to access report).

TFBS – HOT spot analysis mE1_TFBS_HSA Dataset Report Genomic sequences identified as unique regions of transcription factor (TF) binding using HOT spot analysis (HSA); one or many TFs may bind in a given region. A synthesis of ChIP data sets for 41 different transcription factors. TF binding profiles used in this analysis were assayed at early embryo stages.

TFBS – zinc finger domain Binding sites for transcription factors that contain one or more zinc finger domains. The following Dataset Reports comprise the data found in this track.

mE1_TFBS_disco
mE1_TFBS_ftz-f1
mE1_TFBS_GATAe
BDTNP1_TFBS_hb
mE1_TFBS_hkb
BDTNP1_TFBS_kni
mE1_TFBS_Kr
mE1_TFBS_sbb
mE1_TFBS_sens
BDTNP1_TFBS_shn
BDTNP1_TFBS_sna
BDTNP1_TFBS_tll
mE1_TFBS_zfh1

TFBS – homeodomain Binding sites for transcription factors that contain one or more homeodomains. The following Dataset Reports comprise the data found in this track.

BDTNP1_TFBS_bcd
mE1_TFBS_cad
mE1_TFBS_Dll
mE1_TFBS_en
mE1_TFBS_eve
BDTNP1_TFBS_ftz
mE1_TFBS_inv
BDTNP1_TFBS_prd
mE1_TFBS_Ubx
BDTNP1_TFBS_z

TFBS – helix-loop-helix domain Binding sites for transcription factors that contain one or more helix-loop-helix domains. The following Dataset Reports comprise the data found in this track.

BDTNP1_TFBS_da
mE1_TFBS_h
mE1_TFBS_kn
BDTNP1_TFBS_twi

TFBS – BTB/POZ domain Binding sites for transcription factors that contain one or more BTB/POZ domains. The following Dataset Reports comprise the data found in this track.

mE1_TFBS_bab1
mE1_TFBS_chinmo
mE1_TFBS_Trl
mE1_TFBS_ttk

TFBS – other Binding sites for transcription factors that do not fall into one of the other categories. The following Dataset Reports comprise the data found in this track.

mE1_TFBS_cnc
mE1_TFBS_D
BDTNP1_TFBS_dl
BDTNP1_TFBS_gt
mE1_TFBS_jumu
BDTNP1_TFBS_Mad
BDTNP1_TFBS_Med
mE1_TFBS_run
BDTNP1_TFBS_slp1
mE1_TFBS_Stat92E

Chromatin domains 5-state model, Kc cells Chromatin_types_NKI.Kc167 Dataset Report Whole-genome DamID binding profiles of 53 chromatin proteins in Drosophila Kc167 cells were generated and/or analyzed. In the same array platform, ChIP-on-chip profiles of histone H3, H1, H3K9me2, H3K27me3, H3K4me2, and H3K79me3 were obtained. These were correlated with gene expression, which was measured by RNA-tag profiling.

Chromatin domains 9-state model, S2 cells Chromatin_types_mE1.S2 Dataset Report Demarcation of chromatin domains of nine major types based on analysis of 18 histone modification profiles.

Chromatin domains 9-state model, BG3 cells Chromatin_types_mE2.BG3 Dataset Report Demarcation of chromatin domains of nine major types based on analysis of 18 histone modification profiles.

Origins of replication mE_Early_Replication_Origins_cells Dataset Report Genome profile of early activating origins of replication, BrdU label, Kc, BG3 and S2 cell lines.

Microarray Features

Affymetrix v1 Affymetrix_GeneChip_v1 Dataset Report Oligonucleotides (25-mers) designed by Affymetrix to correspond to annotated transcripts in D. melanogaster. Used for the Affymetrix GeneChip Drosophila Genome Array DrosGenome1 microarray, release date February 19, 2002.

Affymetrix v2 Affymetrix_GeneChip_v2 Dataset Report Oligonucleotides (25-mers) designed by Affymetrix to correspond to annotated transcripts in D. melanogaster. Used for the Affymetrix GeneChip Drosophila Genome 2.0 Array, release date July 1, 2004.

Expression Levels

Expression Levels: RNA-Seq by Tissue

Digestive system mE_mRNA_L3_Wand_dig_sys FBlc0000227 mE_mRNA_A_1d_dig_sys FBlc0000219 mE_mRNA_A_4d_dig_sys FBlc0000223 mE_mRNA_A_20d_dig_sys FBlc0000221

Fat body and salivary glands mE_mRNA_L3_Wand_fat FBlc0000228 mE_mRNA_WPP_fat FBlc0000233 mE_mRNA_P8_fat FBlc0000235 mE_mRNA_L3_Wand_saliv FBlc0000230 mE_mRNA_WPP_saliv FBlc0000234

Imaginal disc and other carcass mE_mRNA_L3_Wand_imag_disc FBlc0000229 mE_mRNA_L3_Wand_carcass FBlc0000226 mE_mRNA_A_1d_carcass FBlc0000218 mE_mRNA_A_4d_carcass FBlc0000222 mE_mRNA_A_20d_carcass FBlc0000220

CNS and adult head mE_mRNA_L3_CNS FBlc0000225 mE_mRNA_P8_CNS FBlc0000224 mE_mRNA_A_MateM_1d_head FBlc0000209 mE_mRNA_A_MateM_4d_head FBlc0000216 mE_mRNA_A_MateM_20d_head FBlc0000214 mE_mRNA_A_VirF_1d_head FBlc0000210 mE_mRNA_A_VirF_4d_head FBlc0000211 mE_mRNA_A_VirF_20d_head FBlc0000231 mE_mRNA_A_MateF_1d_head FBlc0000207 mE_mRNA_A_MateF_4d_head FBlc0000213 mE_mRNA_A_MateF_20d_head FBlc0000212

Gonads and male accessory glands mE_mRNA_A_MateM_4d_testis FBlc0000217 mE_mRNA_A_MateM_4d_acc_gland FBlc0000215 mE_mRNA_A_VirF_4d_ovary FBlc0000232 mE_mRNA_A_MateF_4d_ovary FBlc0000208

Expression Levels: RNA-Seq

Developmental stage subsets (Baylor) FBlc0000060

Developmental stage subsets, unique reads (modENCODE) FBlc0000085

Tissue culture cells (modENCODE Transcription Group) FBlc0000116

Tissue culture cells, by strand (modENCODE Transcription Group) FBlc0000260

Treatments/Conditions FBlc0000236


Aberrations

Deleted segment. Indicated in red. When one or more aberrations overlap the region being viewed, a darker red bar labeled "Spanning aberration(s)" will be seen. When moused-over, a pop-up box containing all the aberrations that span the region being viewed will appear. Click one of the aberration symbols to go to the Aberration Report.

Duplicated segment. Indicated in blue. When one or more aberrations cover the region being viewed, a darker blue bar labeled "Spanning aberration(s)" will be seen. When moused-over, a pop-up box containing all the aberrations that cover the region being viewed will appear. Click one of the aberration symbols to go to the Aberration Report.

Stock center aberration: deleted segment. Indicated in red. When one or more aberrations overlap the region being viewed, a darker red bar will be seen. When moused-over, a pop-up box containing all the aberrations that span the region being viewed will appear. Click one of the aberration symbols to go to the Aberration Report.

Stock center aberration: duplicated segment. Indicated in blue. When one or more aberrations cover the region being viewed, a darker blue bar labeled "Spanning aberration(s)" will be seen. When moused-over, a pop-up box containing all the aberrations that cover the region being viewed will appear. Click one of the aberration symbols to go to the Aberration Report.

The Bloomington Deficiency Kit:

The Bloomington Deficiency Kit is a set of stocks defined by the Bloomington Drosophila Stock Center (BDSC) to provide maximal coverage of the genome with the minimal number of deficiencies having molecularly mapped breakpoints. The BDSC Deficiency Kit also includes deficiencies with breakpoints that have not been mapped molecularly, primarily to provide coverage of gaps between the molecularly defined deficiencies. Since the ends of cytologically characterized deficiencies cannot be placed on the genome map with certainty, the BDSC has defined segments of these deficiencies that fill gaps in molecularly defined coverage for GBrowse display. The endpoints of gap filling segments are derived primarily from overlapping deficiency endpoints and complementation with annotated genes.

BDSC Deficiency Kit deleted segment Molecularly defined deficiencies are indicated in red. Click the deficiency to go to the Aberration Report.

BDSC Deficiency Kit gap filling or haploinsufficiency flanking segment Segments of cytologically defined deficiencies that fill gaps between molecularly defined deficiencies or flank haploinsufficient loci are indicated in yellow. Click the segment icon to go to the Aberration Report for the full deficiency.

RNAi Reagents and Data

DRSC RNAi amplicons FBlc0000026 DNA fragments amplified from D. melanogaster genomic DNA (OregonR) by the Drosophila Genomics Resource Center (DGRC), using gene-specific primers made by Incyte and designed to target transcribed regions with minimal sequence similarity to other genes. Used for the DGRC-D.melanogaster-DGRC1-15552-v5 amplicon microarray, release date June 2, 2006 (original release of v1, May 2004). For further information, see the DGRC-1 library collection report.

VDRC RNAi reagent FBlc0000041, FBlc0000055 Segment used to create inverted repeat in RNAi construct from the Vienna Drosophila RNAi Center (FBrf0200327). "GD" identifier also used for allele and construct symbols.

TRiP RNAi amplicons FBlc0000048, FBlc0000153, FBlc0000185, FBlc0000186, FBlc0000416

BKNAmplicons FBlc0000030 RNAi amplicons from the GenomeRNAi database. Extents of the amplicons are indicated with an orange bar.

HFAAmplicons FBlc0000031 RNAi amplicons from the GenomeRNAi database. Extents of the amplicons are indicated with an orange bar.

Other Reagents

Putative brain enhancers (Pfeiffer et al.) FBlc0000201

Tiling BACs Tiling BAC Indicated by narrow gray bar. BAC genomic clones used by the Berkeley Drosophila Genome Project as part of the minimal tiling path used for determination of the D. melanogaster genomic sequence (FBrf0155823).