FlyBase:Transcript Report

From FlyBase Wiki
Revision as of 13:56, 16 December 2024 by Steven marygold (talk | contribs)

The Transcript Report provides information on individual annotated transcripts. Annotated D. melanogaster transcripts are transcripts identified by FlyBase annotators based on a variety of evidence including cDNA/ESTs, RNA-seq expression and exon junction data, transcription start site data, protein alignment, and gene predictions. Transcripts from other sequenced Drosophila species are based on NCBI Gnomon predictions. Transcripts described in the literature may or may not correspond exactly to annotated transcripts. Curated transcript data is found on the Gene Report, subsection Transcript Data, field Reported transcript sizes.

This is a field-by-field guide to the information provided in the Transcript Report.

General Information

Genomic Location

Genomic Maps - Links in the left-hand panel take you to the corresponding region in the JBrowse genome viewer. On the right is a JBrowse snapshot showing the gene region plus 2 kb on either side of the gene. The snapshot includes transcripts of the gene of interest, plus transcripts of neighboring genes. Clicking on the genes or transcripts in the snapshot takes you to the associated gene or transcript report.

cDNA Clones Consistent with the Transcript (Num)

cDNA Clones, Fully Sequenced

A list of cDNA clones that have been fully sequenced and that support the transcript.

Symbol The valid symbol that is used in FlyBase for the transcript.

The first part of the symbol (before the '\') is the standard prefix for the species (from the Species Abbreviations list). For species other than D. melanogaster, the species prefix is displayed wherever the transcript symbol is used throughout FlyBase. For D. melanogaster transcripts, the species prefix is only displayed in the General Information section at the top of a Report.
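As an illustration of the symbol layout described above, the following minimal Python sketch splits a transcript symbol into its species prefix and the remainder. The function name split_species_prefix and the example symbols are hypothetical and are not part of any FlyBase tool; the sketch only assumes that the prefix, when present, is everything before the first backslash.

```python
def split_species_prefix(symbol):
    """Split a symbol such as 'Dsim\\example-RA' into (prefix, rest).

    Returns (None, symbol) when no backslash is present, as is the case
    for D. melanogaster symbols displayed without their prefix.
    """
    prefix, separator, rest = symbol.partition("\\")
    if not separator:
        return None, symbol
    return prefix, rest


# Illustrative symbols only; they are not taken from the database.
print(split_species_prefix("Dsim\\example-RA"))
print(split_species_prefix("example-RA"))
```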

Annotation symbol The current symbol for the annotation that represents the transcript.
Associated gene The gene that encodes the transcript.

Clicking on the gene symbol will take you to the relevant Gene Report.

Feature type The Sequence Ontology (SO) designation for the RNA type, linked to the associated “CV Term Report”.
Species The organism that the transcript originates from, with the initial letter of the genus and the full species name listed.
FlyBase ID The FlyBase identifier number of the transcript, used to uniquely identify the transcript in the database.
Length (nt) The length in nucleotides of the transcript.
Evidence Rank A transcript is given a rank based on its Annotation Evidence Score. Transcripts with a score less than 5 are classified as 'Weakly Supported', those with a score between 5 and 8 as 'Moderately Supported', and those with a score of 9 and above as 'Strongly Supported'.

These classifications should be considered indicative, not definitive, of support for a transcript annotation. Only the classes of evidence described in the 'Evidence Score' section are considered, and there may be additional support for a transcript annotation that has not been assessed by these methods. For example, a transcript may be supported by RNA-seq expression or exon junction data, or its structure may have been reported in the literature without a corresponding sequence record in the public databases.
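The thresholds given for the Evidence Rank can be expressed as a small lookup. This is only a sketch of the rule as stated in this guide, not FlyBase's own code; the function name evidence_rank is hypothetical, and the sketch assumes the score is a whole number from 0 to 15.

```python
def evidence_rank(score):
    """Map an Annotation Evidence Score to its rank label."""
    if not 0 <= score <= 15:
        raise ValueError("score is assumed to lie between 0 and 15")
    if score < 5:
        return "Weakly Supported"
    if score <= 8:
        return "Moderately Supported"
    return "Strongly Supported"


for example_score in (4, 5, 8, 9):
    print(example_score, evidence_rank(example_score))
```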

Evidence Score Each transcript gets a score that is based on the sum of the following categories:


1 point if one or more aligned EST sequences are fully consistent with the annotated transcript.

2 points if an annotated exon intersects a region of aligned protein similarity (note that similarity to self is excluded).

4 points if there is any gene prediction that is fully consistent with the annotated transcript.

8 points if one or more aligned cDNAs are fully consistent with the annotated transcript.


The points assigned for each type of evidence allow one to easily and unambiguously determine what types of evidence exist that support a particular transcript annotation as each possible combination of supporting types receives a unique score.
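Because the point values are 1, 2, 4 and 8, each score corresponds to exactly one combination of evidence types, much like bits in a binary number. The Python sketch below shows how a score could be built from, and decoded back into, its evidence types. The names EVIDENCE_POINTS, evidence_score and decode_score, and the short labels used for the evidence types, are assumptions made for illustration and do not come from FlyBase.

```python
# Point values as listed in this guide; the key names are illustrative labels.
EVIDENCE_POINTS = {
    "EST": 1,
    "protein_similarity": 2,
    "gene_prediction": 4,
    "cDNA": 8,
}


def evidence_score(evidence_types):
    """Sum the points for each distinct type of supporting evidence."""
    return sum(EVIDENCE_POINTS[name] for name in set(evidence_types))


def decode_score(score):
    """Recover the evidence types that a score implies."""
    if not 0 <= score <= 15:
        raise ValueError("score is assumed to lie between 0 and 15")
    return [name for name, points in EVIDENCE_POINTS.items() if score & points]


print(evidence_score(["EST", "cDNA"]))  # 1 + 8 = 9
print(decode_score(6))                  # protein similarity and gene prediction
```

For instance, a score of 13 can only arise from EST (1), gene prediction (4) and cDNA (8) support together, with no protein similarity evidence.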

Exact Match Clones are an exact match at the nucleotide level to the annotated transcript. This includes the 5 prime and 3 prime ends as well as all splice junctions.
Contained within the annotated transcript, internally consistent Clones that support the transcript, but do not extend the full length of the transcript.
End(s) extend beyond the annotated transcript, internally consistent Clones may be full length or partial, and support the internal structure of the transcript, but extend beyond the transcript at the 5 prime and/or the 3 prime end.

cDNA Clones, End Sequence Only (ESTs)

cDNA clones which have not yet been fully sequenced. The clones may or may not be full length.

Contained within the annotated transcript, internally consistent Clones that support the annotated transcript, but whose sequence begins and ends within the transcript.
End(s) extend beyond the annotated transcript, internally consistent Clones that are consistent with the annotated transcript, but extend beyond it at either the 3' or 5' end.

Sequence

The nucleotide sequence of the transcript.

Sequence Downloader

Sequence Downloader - Click on the link to the left to open the Sequence Downloader tool with the transcript ID entered in the ID box at the top and the 'Transcripts' mode selected. The sequence of the transcript appears below. The 'Type' menu allows you to switch between transcripts, CDS, exons, introns, translations, 5' UTR, and 3' UTR. To change the mode, choose a sequence type and click on “View Sequence”. (Note: additional sequence region options are available if you access the Sequence Downloader tool from the gene page or from the “Tools” dropdown menu.)

Find the symbol, ID, genome coordinates, length and entity coordinates for your selected sequence region below.

Options

  • To retrieve the FASTA sequence, click on the icon to the right of the ID.
  • To find the coordinates of a particular span, mouse over a region of the sequence. The coordinates (in nucleotides or amino acids as appropriate) for the selected region will be displayed in the “Selected region” field.
  • Search for a sequence of interest in the Search box. For an explanation of the regular expressions that can be used to search, click on the icon to the right of the “Search in sequence” box.
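
For readers who prefer to search a downloaded sequence offline, the sketch below shows a comparable regular-expression search using Python's standard re module. It is not the Sequence Downloader's own search, and the regular-expression syntax accepted by the tool may differ from Python's; the function name find_matches and the example sequence are made up for illustration.

```python
import re


def find_matches(sequence, pattern):
    """Return (start, end, matched_text) for each non-overlapping match.

    Coordinates are 1-based and inclusive, and matching ignores case.
    """
    return [
        (match.start() + 1, match.end(), match.group())
        for match in re.finditer(pattern, sequence, flags=re.IGNORECASE)
    ]


# Made-up sequence: look for ATG, any three bases, then TAA.
print(find_matches("GGATGAAATAAGG", "ATG[ACGT]{3}TAA"))
```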

Other Products of this Gene

Other Transcripts

A table of other transcripts derived from the same gene, which lists each transcript symbol, its FlyBase identifier number and its length in nucleotides.

Clicking on a transcript symbol will take you to the relevant Transcript Report.

Polypeptides

A table of polypeptides encoded by the same gene, which lists each polypeptide symbol, its FlyBase identifier number and its length in amino acid residues.

The table is subdivided into the polypeptides derived from this transcript and polypeptides derived from other transcripts.

Clicking on a polypeptide symbol will take you to the relevant Polypeptide Report.

Comments

Comments regarding the annotation of the transcript, including whether the transcript contains any unconventional features.

This section is only displayed in an individual Transcript Report if it contains data.

External Crossreferences

A table of RefSeq and DDBJ/EMBL/GenBank sequence accession numbers corresponding to the transcript.

Clicking on an accession number will take you to the appropriate entry in the GenBank database.

Synonyms

A list of symbols that have been used in the literature, or by FlyBase, to describe the transcript.

Secondary IDs

A list of secondary FlyBase identifier numbers of the transcript.

References

A list of publications that discuss the transcript, subdivided into fields by type of publication. Publications which discuss the associated gene but not this particular transcript can be found on the Gene Report. Only those fields containing data are displayed in an individual Transcript Report.