FlyBase:Gene Report
Revision as of 19:05, 26 April 2017
Last Updated: 16 September 2015
The Gene Report provides information on individual genes. The report also contains summaries and links to individual reports relating to objects associated with the gene, such as mutant alleles of the gene (including both classical alleles and alleles carried within transgenic constructs), transcripts and proteins encoded by the gene and their expression pattern data, and transgenic constructs and insertions. A link to search for stocks carrying these objects is also provided.
A locus may have been mapped to the genome and have an annotation, or it may only have been identified genetically via its mutant phenotype. Both types of locus are given gene reports in FlyBase.
As well as including gene reports for genes derived from species within the family Drosophilidae, FlyBase also includes gene reports for non-drosophilid genes ("foreign genes") that have been introduced into Drosophila via transgenic constructs, and for engineered objects such as a fusion gene between two D. melanogaster genes.
FlyBase attributes data to the publication that reported it, so that users can easily refer back to the original publication if they wish. Thus, where possible in the fields below, the publication(s) that are the source of the information are listed, typically in parentheses to the right of the data. The exceptions in the Gene Report are the General Information section, which contains a summary of the identity of the gene; the Genomic Location and Detailed Mapping Data section, which contains a summary of location data; and the summary tables in the Gene Model and Products and Alleles and Phenotypes sections, which provide links to objects associated with the gene.
This is a field-by-field guide to the information provided in the Gene Report.
General Information
Symbol | The valid symbol that is used in FlyBase for the gene.
The first part of the symbol (before the '\') is the standard prefix for the species (from the Species Abbreviations list). For species other than D. melanogaster, the species prefix is displayed wherever the gene symbol is used throughout FlyBase. For D. melanogaster genes, the species prefix is only displayed in the General Information section at the top of a Report. |
Name | The valid full name that is used in FlyBase for the gene. |
Feature type | A single controlled vocabulary term from the Sequence Ontology (SO), which aims to describe the key type of the gene.
The single term in this field is computed by FlyBase from the full list of SO terms listed in the Sequence Ontology: Class of Gene section of the Gene Report. See Computed Feature type of genes for a detailed description of how the term is selected. |
Species | The organism that the gene originates from, with the initial letter of the genus and the full species name listed. |
Annotation symbol | The current symbol for the annotation that represents the gene (if applicable). |
FlyBase ID | The Primary FlyBase identifier number of the gene, used to uniquely identify the gene in the database.
A gene may also have any number of Secondary FlyBase identifier numbers, which are listed in the SECONDARY FLYBASE IDs section of the Gene Report. |
Gene Snapshot | A short summary designed to provide a quick overview of the function of the gene's products.
It is based on key points solicited from expert researchers, and revised by FlyBase curators. The review date is stated, and cases that are in progress or are deemed to have insufficient data to summarize are stated as such. |
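The species-prefix convention described for the Symbol field can be illustrated with a minimal sketch. The function name `split_symbol`, the example symbols, and the choice of "Dmel" as the implied prefix when none is displayed are assumptions made for illustration; the Species Abbreviations list remains the authority on valid prefixes.

```python
from typing import Tuple

# Assumption: "Dmel" is the prefix implied when a symbol is shown without one.
DEFAULT_SPECIES = "Dmel"


def split_symbol(symbol: str) -> Tuple[str, str]:
    """Split a gene symbol into (species prefix, bare symbol).

    The text before the first backslash is treated as the species prefix.
    Symbols without a backslash are assumed to be D. melanogaster genes.
    """
    prefix, sep, rest = symbol.partition("\\")
    if sep and prefix and rest:
        return prefix, rest
    return DEFAULT_SPECIES, symbol


if __name__ == "__main__":
    # Hypothetical symbols, used only to show the two cases.
    print(split_symbol("Dsim\\w"))
    print(split_symbol("w"))
```

Splitting on the first backslash only keeps any later backslashes in the bare symbol intact.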
Genomic Location
Cytogenetic Map | A computed cytological location, based on the position on the genome to which the gene maps. This field is only filled in for genes with annotations.
See Computed cytological data in FlyBase for a detailed description of how this computed cytological location is calculated. |
Sequence location | The extent of the transcription unit on the genome, prefixed with the chromosome arm that the gene is located on. The strand to which the gene maps is indicated in square brackets after the sequence coordinates.
This field is only filled in for genes with annotations. |
Genomic Maps | A GBrowse thumbnail showing the location of the gene (highlighted in yellow) in the genome. Clicking on any of the genes visible in the thumbnail will take you to the corresponding gene report.
The top menu allows you to download the genomic region containing the gene in Decorated FASTA format. The bottom menu allows you to download some of the components of the annotation, in FASTA format.
Menu options: Gene region, Extended Gene region, CDS, Introns, Exons, Translations, Transcripts, 3' UTR, 5' UTR
|
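The Sequence location field described above (chromosome arm, coordinates, then strand in square brackets) lends itself to simple parsing. The sketch below assumes a layout of the form `2L:1,000..2,000 [+]`, with a colon after the arm and two dots between the coordinates. Those separators, the name `parse_location`, and the example coordinates are assumptions, not values taken from a real gene report.

```python
import re
from typing import NamedTuple


class SequenceLocation(NamedTuple):
    arm: str      # chromosome arm prefix, e.g. "2L"
    start: int    # first coordinate of the transcription unit
    end: int      # last coordinate of the transcription unit
    strand: str   # "+" or "-", taken from the square brackets


# Assumed layout: ARM:START..END [STRAND], with optional thousands separators.
_LOCATION = re.compile(
    r"^\s*(?P<arm>[^:\s]+):(?P<start>[\d,]+)\.\.(?P<end>[\d,]+)"
    r"\s*\[(?P<strand>[+-])\]\s*$"
)


def parse_location(text: str) -> SequenceLocation:
    """Parse a sequence location string into its four parts."""
    match = _LOCATION.match(text)
    if match is None:
        raise ValueError(f"unrecognized sequence location: {text!r}")
    start = int(match["start"].replace(",", ""))
    end = int(match["end"].replace(",", ""))
    return SequenceLocation(match["arm"], start, end, match["strand"])


if __name__ == "__main__":
    loc = parse_location("2L:1,000..2,000 [+]")  # made-up coordinates
    print(loc, "extent:", loc.end - loc.start + 1)
```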
Tag or Foreign Gene Data
This section is only displayed in an individual Gene Report if the gene is a "tag" or a "foreign gene". (A "tag" is a DNA fragment that is used to confer a novel property to a gene product, and may be artificial or derived from a drosophilid or non-drosophilid gene. A "foreign gene" is any non-drosophilid gene that has been introduced into Drosophila, for example via a transgenic construct.) For tags, this section shows the species from which the gene is derived, the nature of the tag, and either a sequence accession number (if available) or a brief description of the tag. For foreign genes, this section shows the species from which the gene is derived, the gene symbol, and where possible an accession number for the gene (either an identifier number from a model organism database, or a sequence accession number).
Families, Domains and Molecular Function
Gene Group Membership (FlyBase) | Lists any FlyBase Gene Groups the gene is a member of, with a hyperlink to the Gene Group Report. |
Protein Family (UniProt, Sequence Similarities) | Lists any UniProt Protein Families shown in their 'Sequence similarities' section, with the evidence code and a hyperlink to corresponding UniProt page. |
Protein Domains/Motifs | Interpro Lists Protein Domains or motifs found in the protein encoded by the gene with hyperlinks to the corresponding Interpro page. |
Molecular Function (see GO section for details) | A summary of the Gene Ontology Molecular Function terms associated with the gene, divided into terms with Experimental Evidence and Predictions/Assertions. Each term is hyperlinked to the appropriate Term Report.
For the full Gene Ontology (GO) data, including evidence and attributions, see the Gene Ontology (GO) section. |
Gene Ontology: Function, Process & Cellular Component
Gene Ontology (GO) controlled vocabulary (CV) terms are assigned to genes based on evidence in external publications and internal analysis. Note that terms are assigned to genes but apply to the product of that gene. The number in parentheses shows the total number of different terms assigned to a gene.
More information about how GO terms are assigned can be found under the Classification of Gene Products using Gene Ontology (GO) terms.
The current release of GO data for all FlyBase genes can be found in the gene_association.fb file. The latest version of this data is available for download from the Gene Ontology consortium site. Details about the file format and comments on version differences can be found in the documentation on the Classification of Gene Products using Gene Ontology (GO) terms.
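As a rough illustration of working with the gene_association.fb file, the sketch below reads tab-separated annotation lines and groups terms by aspect and by evidence type. It assumes the GO Annotation File (GAF) 2.x column order, with the qualifier in column 4, GO ID in column 5, reference in column 6, evidence code in column 7, and aspect in column 9. The set of codes counted as experimental, the function names, and every identifier in the sample rows are placeholders chosen for this example, so check the file format documentation before relying on it.

```python
import io
from collections import defaultdict

# Assumption: these codes correspond to "Experimental Evidence".
EXPERIMENTAL_CODES = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}
ASPECTS = {
    "F": "Molecular Function",
    "P": "Biological Process",
    "C": "Cellular Component",
}


def read_annotations(handle):
    """Yield one dict per annotation line, skipping '!' comment lines."""
    for line in handle:
        if line.startswith("!") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # too short to hold the columns used below
        yield {
            "gene_id": cols[1],
            "symbol": cols[2],
            "qualifiers": [q for q in cols[3].split("|") if q],
            "go_id": cols[4],
            "reference": cols[5],
            "evidence": cols[6],
            "aspect": cols[8],
        }


def group_terms(records):
    """Group terms by aspect, then by experimental versus predicted."""
    grouped = defaultdict(lambda: {"experimental": [], "predicted": []})
    for rec in records:
        experimental = rec["evidence"] in EXPERIMENTAL_CODES
        kind = "experimental" if experimental else "predicted"
        label = " ".join(rec["qualifiers"] + [rec["go_id"]])
        grouped[ASPECTS.get(rec["aspect"], rec["aspect"])][kind].append(label)
    return grouped


def _row(*cols):
    return "\t".join(cols)


# Placeholder rows; none of these identifiers refer to real records.
SAMPLE = "\n".join([
    "!gaf-version: 2.1",
    _row("FB", "FBgn0000001", "geneA", "", "GO:0003677",
         "FB:FBrf0000001", "IDA", "", "F"),
    _row("FB", "FBgn0000001", "geneA", "contributes_to", "GO:0003700",
         "FB:FBrf0000002", "ISS", "UniProtKB:PLACEHOLDER", "F"),
    _row("FB", "FBgn0000001", "geneA", "NOT", "GO:0005634",
         "FB:FBrf0000003", "IMP", "", "C"),
])

if __name__ == "__main__":
    grouped = group_terms(read_annotations(io.StringIO(SAMPLE)))
    for aspect, kinds in grouped.items():
        print(aspect, kinds)
```

Keeping the qualifier attached to the term in the output avoids silently presenting a NOT annotation as a positive one.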
Molecular Function
A list of GO molecular function terms associated with the product(s) of the gene. For each term, the type of evidence used to assign the term and the reference containing the evidence are provided. The terms are divided into 'Terms Based on Experimental Evidence' and 'Terms Based on Predictions or Assertions'.
CV term
Clicking on the term in the CV term column will take you to the relevant GO CV Term Report, which includes a definition of the GO term (where available).
GO CV terms may be preceded by qualifiers. The qualifier NOT is used to make an explicit note that the gene product is not associated with the GO term. The qualifier contributes_to is used when an individual gene product that is part of a complex can be annotated to terms that describe the function of the complex. contributes_to can only be used to qualify GO molecular function terms.
Evidence
The type of evidence used to support the GO term is indicated in the evidence column. The evidence code documentation contains details of the evidence codes used by FlyBase. More information is available in the Gene Ontology Guide to GO Evidence Codes.
Some types of evidence can be supported by other database objects. These objects are identified by their database abbreviation followed by a colon and the unique identifier for the object in that database. A list of current database abbreviations can be found in the GO.xrf_abbs file. The evidence code documentation explains which codes can be used with other database objects and the special meaning attached to using multiple objects in evidence.
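The "database abbreviation, colon, identifier" convention for supporting objects can be split with a small helper. This is only a sketch: the name `parse_xref` is hypothetical, the example identifiers are placeholders, and no check is made against the abbreviations listed in the GO.xrf_abbs file.

```python
from typing import Tuple


def parse_xref(text: str) -> Tuple[str, str]:
    """Split 'DB:identifier' at the first colon into (DB, identifier)."""
    db, sep, identifier = text.strip().partition(":")
    if not sep or not db or not identifier:
        raise ValueError(f"not a DB:identifier reference: {text!r}")
    return db, identifier


if __name__ == "__main__":
    # Placeholder identifiers, used only to show the split.
    print(parse_xref("FLYBASE:FBgn0000001"))
    print(parse_xref("UniProtKB:PLACEHOLDER"))
```

Only the first colon is treated as the separator, so identifiers that themselves contain a colon stay whole.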
Biological Process
A list of GO biological process terms associated with the product(s) of the gene. For each term, the type of evidence used to assign the term and the reference containing the evidence are provided. The terms are divided into 'Terms Based on Experimental Evidence' and 'Terms Based on Predictions or Assertions'.
CV term
Clicking on the term in the CV term column will take you to the relevant GO CV Term Report, which includes a definition of the term (where available).
GO CV terms may be preceded by qualifiers. The qualifier NOT is used to make an explicit note that the gene product is not associated with the GO term.
Evidence
The type of evidence used to support the GO term is indicated in the evidence column. The evidence code documentation contains details of the evidence codes used by FlyBase. More information is available in the Gene Ontology Guide to GO Evidence Codes.
Some types of evidence can be supported by other database objects. These objects are identified by their database abbreviation followed by a colon and the unique identifier for the object in that database. A list of current database abbreviations can be found in the GO.xrf_abbs file. The evidence code documentation explains which codes can be used with other database objects and the special meaning attached to using multiple objects in evidence.
Cellular Component
A list of GO cellular component terms associated with the product(s) of the gene. For each term, the type of evidence used to assign the term and the reference containing the evidence are provided. The terms are divided into 'Terms Based on Experimental Evidence' and 'Terms Based on Predictions or Assertions'.
CV term
Clicking on the term in the CV term column will take you to the relevant GO CV Term Report, which includes a definition of the term (where available).
GO CV terms may be preceded by qualifiers. The qualifier NOT is used to make an explicit note that the gene product is not associated with the GO term. The qualifier colocalizes_with is used when a gene product is transiently or peripherally associated with an organelle or complex. It is also used in cases where the resolution of an assay is not accurate enough to say that the gene product is a bona fide component member. colocalizes_with can only be used to qualify GO cellular component CV terms.
Evidence
The type of evidence used to support the GO term is indicated in the evidence column. The evidence code documentation contains details of the evidence codes used by FlyBase.
Some types of evidence can be supported by other database objects. These objects are identified by their database abbreviation followed by a colon and the unique identifier for the object in that database. A list of current database abbreviations can be found in the GO.xrf_abbs file. The evidence code documentation explains which codes can be used with other database objects and the special meaning attached to using multiple objects in evidence.
Summaries
The Automatically generated summary is compiled from information in the gene report and related allele reports. For more information about each statement in the summary, see the relevant section in the gene/allele report.
Gene Group Membership
Each FlyBase Gene Group that the gene is a member of is listed with a description of that group. The Gene Group name is hyperlinked to the corresponding Gene Group Report.
UniProt Contributed Data
Shows the summary shown in the UniProt 'Function' section, with the evidence code and a hyperlink to corresponding UniProt page.
User Contributed Data
A link to the FlyGene Wiki page for the gene. The FlyGene Wiki allows you to write your own summary for the gene which is visible to the public and editable by others.
External Summaries
Links to any external summaries for the gene.
Interactions & Pathways
Summary of Physical Interactions
Information regarding curated physical interactions for the protein and RNA products of the gene. Currently, only interactions involving two D. melanogaster genes are reported. Only pair-wise interactions are reported.
esyN Network Diagram - A graphical representation of FlyBase-curated protein-protein and RNA-protein interactions. Linkouts to the esyN tool allow users to edit the network display by adding/deleting additional factors and genetic or physical interactions to the diagram.
protein-protein - A table containing a summary of "protein-protein" physical interactions between the protein products of two genes. The table lists each pairwise protein-protein interaction on a separate line of the table. The two interacting genes are indicated in the first column. Each interaction is represented as a group of reported interaction instances between the two proteins. All assays used in support of the interaction are listed in column 2. All references reporting the pairwise interaction are listed in column 3. Clicking on an 'Interacting group' symbol will take you to the relevant Interaction Report. Clicking on a reference will take you to the relevant Reference Report.
RNA-protein - This table reports physical interactions between the protein of one gene and the RNA of another gene. The table lists each pairwise RNA-protein interaction on a separate line of the table. The two interacting genes are indicated in the first column. Each interaction is represented as a group of reported interaction instances. All assays used in support of the interaction are listed in column 2. All references reporting the pairwise interaction are listed in column 3. Clicking on an 'Interacting group' symbol will take you to the relevant Interaction Report. Clicking on a reference will take you to the relevant Reference Report.
Summary of Genetic Interactions
Interactions browser - To see a graphical representation of genetic interactions involving this gene click the Interactions Browser button.
A table containing a summary of genetic interactions between this and other genes. The table lists the other genes that this gene has been shown to interact with genetically, as well as the type of interaction (suppressible, enhanceable) and the reference(s) in which the interaction is described.
Clicking on a gene symbol will take you to the relevant Gene Report for the interacting gene.
Clicking on a reference will take you to the relevant Reference Report.
To see the full details of the genetic interactions (including the alleles used and the phenotypes seen), see the Alleles and Phenotypes section.
External Data
Subunit Structure | Shows data from the UniProt 'Subunit Structure' section, with a hyperlink to the corresponding UniProt page. |
Linkouts | A list of additional links to external databases that are relevant to interactions data is also displayed. Clicking on the linkout identifier will take you to the appropriate entry in the external database from which they are derived.
These links are also displayed together with all other external data links for the gene in the External Crossreferences and Linkouts section of the Gene Report. |
Expression Data
Transcript Expression
A three column table with the headings Stage, Tissue/Position, and Reference. Each row in the table represents one distinct expression pattern defined by time of expression (Stage) and by location of expression (Tissue/Position). Each distinct expression pattern is attributed to the reference that reported it. Developmental stages and anatomical parts are described using controlled vocabulary (CV) terms (for valid CV terms, see the Vocabularies search page).
The table is subdivided by the assay used, when appropriate.
Additional Descriptive Data
Free-text curated descriptions of expression patterns, sorted by reference.
Marker for
The expression of some genes is used as a marker for a particular tissue or developmental state. If that data has been curated for a gene, the name of the body part is listed here.
Subcellular Localization
CV term | A list of Gene Ontology (GO) cellular component terms that describe the subcellular location of the transcript or transcripts of this gene (for valid CV terms, see the Vocabularies search page). |
Polypeptide Expression
A three column table with the headings Stage, Tissue/Position, and Reference. Each row in the table represents one distinct expression pattern defined by time of expression (Stage) and by location of expression (Tissue/Position). Each distinct expression pattern is attributed to the reference that reported it. Developmental stages and anatomical parts are described using controlled vocabulary (CV) terms (for valid CV terms, see the Vocabularies search page).
The table is subdivided by the assay used, when appropriate.
Additional Descriptive Data
Free-text curated descriptions of expression patterns, sorted by reference.
Marker for
The expression of some genes is used as a marker for a particular tissue or developmental state. If that data has been curated for a gene, the name of the body part is listed here.
Subcellular Localization
CV term | A list of Gene Ontology (GO) cellular component terms that describe the subcellular location of the polypeptide or polypeptides of this gene (for valid CV terms, see the Vocabularies search page). |
Expression Deduced from Reporters
High-Throughput Expression Data
Associated Tools
Links to associated tools that enable visualization and searching of High-Throughput Expression Data:
GBrowse2 - Visual display of genome-wide RNA-Seq coverage data.
RNA-Seq Profile Search - Search for genes by their RNA-Seq expression profile.
RNA-Seq by Region - Search RNA-Seq expression levels for all exons of a gene, or for specified genomic regions.
Microarray and RNA-Seq Expression Data
In the subsequent sections, gene expression levels are presented for a number of high-throughput microarray and RNA-Seq datasets. Microarray expression values are provided by FlyAtlas and genes are sorted into one of five gene expression level bins by FlyBase (from "No expression" to "Very high expression"). RNA-Seq expression values are calculated by FlyBase from coverage data (provided to us by the modENCODE project), as described by Gelbart and Emmert, 2013; genes are sorted into one of eight gene expression level bins (from "No/Extremely low expression" to "Extremely high expression"). Expression values are presented as user-configurable histograms/heatmaps, color-coded by expression level bin. Expression values can be downloaded as a tab-separated values (.tsv) file. Links to the relevant dataset reports are at the top left of each section.
FlyAtlas Anatomy Microarray - see the FlyAtlas-RNA.adult and FlyAtlas-RNA.larva dataset reports for details.
modENCODE Anatomy RNA-Seq - see the modENCODE_mRNA-Seq_tissues dataset report for details.
modENCODE Development RNA-Seq - see the modENCODE_mRNA-Seq_U dataset report for details.
modENCODE Cell Lines RNA-Seq - see the modENCODE_mRNA-Seq_cell.B dataset report for details.
modENCODE Treatments RNA-Seq - see the modENCODE_mRNA-Seq_treatments dataset report for details.
NIH Anatomical Expression RNA-Seq - expression levels for D. pseudoobscura genes from modENCODE RNA-Seq datasets: NIH_mRNA_Dpse_gonads_carcasses, NIH-Dpse_1_adult_heads and NIH_mRNA_Dpse_2_adults.
Search for similarly expressed genes
This button will return genes with similar RNA-Seq expression patterns. It is currently offered only for the modENCODE_mRNA-Seq_U developmental profile. Expression data comparisons were not generated by the modENCODE consortium but by FlyBase software. The 'search for similarly expressed genes' button on a gene report initiates a 'fixed' variant of a comparative search (using some fixed predefined search settings specific to expression).
Searching is performed by comparing changes in gene expression level across a set of similarly ordered 'experiments', which represent known expression values under different experimental conditions (for staging data, the experiments are the different developmental stages; for FlyAtlas data, they are expression levels in different body parts/tissues; and so on). The purpose of the search is to compare sets of expression values from different genes and determine the tendency of two datasets to vary together, that is, how well they are correlated. The correlation coefficient is the statistic used for this purpose.
Predefined search settings currently include all developmental stages from the modENCODE staging data. The sets of expression values from all genes are compared to that of the 'test' gene, resulting in an output of top matches. The output takes the form of a custom HitList, which includes a graphical representation of the expression profiles. Visual inspection of these profiles may help the user interpret the reported correlation. (Note that the 'test' gene profile is located at the top of the HitList in almost all cases because, being compared to itself, it gives a perfect 100% correlation match.)
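The comparison described above can be sketched as follows. This is not the FlyBase implementation: it assumes the correlation coefficient is Pearson's r, the function names are hypothetical, and the gene names and expression values are invented for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between two equally long expression profiles."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # flat profile: correlation is undefined, treated as 0
    return cov / math.sqrt(var_x * var_y)


def rank_similar(
    test_gene: str, profiles: Dict[str, List[float]], top: int = 5
) -> List[Tuple[str, float]]:
    """Return the `top` genes best correlated with the test gene's profile."""
    target = profiles[test_gene]
    scored = [(gene, pearson(target, prof)) for gene, prof in profiles.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top]


if __name__ == "__main__":
    # Invented expression values across four ordered 'experiments'.
    profiles = {
        "geneA": [1.0, 4.0, 9.0, 3.0],
        "geneB": [2.0, 8.0, 18.0, 6.0],
        "geneC": [9.0, 4.0, 1.0, 7.0],
    }
    for gene, r in rank_similar("geneA", profiles, top=3):
        print(f"{gene}\t{r:.3f}")
```

Because the test gene is compared with itself, it scores a correlation of 1; a gene whose profile is exactly proportional scores the same, which is one reason the test gene is not guaranteed to appear first.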
FlyBase provides a simple comparison tool for the modENCODE expression data. We hope this will aid the user in interpreting the expression patterns but we also appreciate that there are more complicated ways of comparing and interpreting the data.
Expression Clusters
This section reports co-expression clusters of which the gene is a member, based on RNA-Seq profiles of 30 developmental stages (see the modENCODE_mRNA-Seq_U dataset report for details). These clusters were generated by two alternative clustering analyses from the modENCODE consortium. The first analysis by The modENCODE Consortium et al., 2010 generated 34 clusters of 10,482 genes, as described in the mE2_34_mRNA_expression_clusters dataset report. The second analysis by Graveley et al., 2011 generated 20 clusters of 11,391 genes, as described in the mE1_20_mRNA_expression_clusters dataset report. Within a given analysis, genes could only be assigned to a single cluster (or none at all), but a gene may be part of one cluster from each of the two analyses. The report for each expression cluster lists all genes in that cluster (see the "Member" section).
External Data and Images
A list of links to external databases that are relevant to expression data is also displayed. The links are Linkouts, which are indicated by a "LinkOut" label in parentheses after the field label. Clicking on the linkout identifier will take you to the appropriate entry in the external database from which they are derived.
The external databases currently displayed in this section are:
BDGP expression data - Patterns of gene expression in Drosophila embryogenesis
FLIGHT - Integrating Genomic and High-Throughput data
FlyAtlas - the Drosophila adult expression atlas
Fly-FISH - Expression patterns of Drosophila mRNAs at the subcellular level during embryogenesis and third instar larval tissues.
FlyGut - An atlas of the Drosophila adult midgut.
FlyExpress - A Drosophila melanogaster expression pattern search engine
SliceSeq - Genome-wide anterior-posterior spatial patterns of gene expression from transverse slices of Drosophila embryos
These links are also displayed together with all other external data links for the gene in the EXTERNAL CROSSREFERENCES & LINKOUTS section of the Gene Report.
Alleles & Phenotypes
Summary of Allele Phenotypes
A table containing a summary of the phenotypes of alleles of the gene. The table lists controlled vocabulary terms describing the phenotype, together with the allele(s) of the gene which have been reported to show this phenotype.
To see the detail of the phenotype (including publications where the phenotype has been reported and additional free text comments), click on the allele symbol to go to the relevant Allele Report and look at the Phenotypic Data section.
Clicking on the controlled vocabulary term will take you to the relevant CV Term Report, which includes a definition of the term (where available).
The table is divided into 4 sections.
Lethality - a list of reported lethal/viable phenotypes.
Sterility - a list of reported sterile/fertile phenotypes.
Other Phenotypes - a list of other reported phenotypic classes.
Phenotype manifest in - a list of the parts of the animal reported to be affected in the mutant alleles.
Phenotypic Description from the Red Book (Lindsley and Zimm 1992)
Phenotypic descriptions of mutants of this gene from the Red Book (Lindsley and Zimm 1992). Gene/Allele symbols may differ from current usage.
Classical Alleles
The number in parentheses shows the total number of classical alleles that exist in FlyBase for this gene.
A table listing all the classical alleles of the gene together with summary information, including the allele class, mutagen used to generate the allele, stock availability and whether or not the lesion that causes the allele is known.
To see the full allele information, click on the allele symbol to go to the relevant Allele Report.
Click on the buttons at the top of this section to get either Pre-selected data for all classical alleles, or to choose specific fields using the Batch Download tool.
Alleles Carried on Transgenic Constructs
The number in parentheses shows the total number of alleles carried on transgenic constructs that exist in FlyBase for this gene.
A table listing all the alleles carried on transgenic constructs of the gene together with summary information, including the allele class, mutagen used to generate the allele, stock availability and whether or not the lesion that causes the allele is known.
To see the full allele information, click on the allele symbol to go to the relevant Allele Report.
Click on the buttons at the top of this section to get either Pre-selected data for all alleles carried on transgenic constructs, or to choose specific fields using the Batch Download tool.
Deletions and Duplications
Disrupted in | A list of aberrations that have been reported to completely delete/disrupt the gene, determined by either complementation or molecular analysis.
Clicking on an aberration symbol will take you to the relevant Aberration Report. This field is only displayed in an individual Gene Report if it contains data. |
Partially disrupted in | A list of aberrations that have been reported to partially disrupt the gene, determined by either complementation or molecular analysis.
Clicking on an aberration symbol will take you to the relevant Aberration Report. This field is only displayed in an individual Gene Report if it contains data. |
Not Disrupted in | A list of aberrations that have been reported not to disrupt the gene, determined by either complementation or molecular analysis.
Clicking on an aberration symbol will take you to the relevant Aberration Report. This field is only displayed in an individual Gene Report if it contains data. |
Duplicated in | A list of aberrations that have been reported to completely duplicate the gene, determined by either complementation or molecular analysis.
Clicking on an aberration symbol will take you to the relevant Aberration Report. This field is only displayed in an individual Gene Report if it contains data. |
Partially duplicated in | A list of aberrations that have been reported to partially duplicate the gene, determined by either complementation or molecular analysis.
Clicking on an aberration symbol will take you to the relevant Aberration Report. This field is only displayed in an individual Gene Report if it contains data. |
Not duplicated in | A list of aberrations that have been reported not to duplicate the gene, determined by either complementation or molecular analysis.
Clicking on an aberration symbol will take you to the relevant Aberration Report. This field is only displayed in an individual Gene Report if it contains data. |
Transgenic Constructs and Insertions
Transgenic Constructs
A table showing transgenic constructs that have been generated for the gene, subdivided by the type of construct.
Types of constructs shown are:
GAL4 construct
reporter construct
UAS construct
heat-shock construct
characterization construct
For GAL4 constructs and reporter constructs the table indicates whether expression data is available.
Clicking on the transgenic construct symbol will take you to the relevant Recombinant Construct Report.
Insertions
A table showing insertions that have been generated for the gene, subdivided by the type of insertion.
Types of insertions shown are:
insertion of enhancer trap
miscellaneous insertions
insertion of mobile activating element
insertion of enhancer trap binary system
For enhancer trap insertions the table indicates whether expression data is available.
Clicking on the insertion symbol will take you to the relevant Insertion Report.
Orthologs
Human Orthologs (via DIOPT)
This section presents orthology calls between D. melanogaster and human, as provided by the DRSC Integrative Ortholog Prediction Tool (DIOPT), together with links to relevant gene and phenotype reports at the Online Mendelian Inheritance in Man (OMIM) database. The ortholog data shown here repeats that in the ‘Model Organism Orthologs (via DIOPT)’ section, described below - more details on the DIOPT approach are also given in that section.
The columns are:
- Gene Name: the official human gene symbol and name, as used at (and linked to) the HGNC
- Score: the number of tools that support a given orthologous gene-pair relationship compared to the total number of tools that compute orthology relationships for those two species (expressed as "X of Y")
- OMIM ID: the OMIM number for this human gene, linked to the corresponding OMIM gene report
- OMIM Phenotype: any phenotypes associated with the human gene, as reported in OMIM, and linked to the corresponding OMIM phenotype report
- Transgene in Fly: Where applicable, a link to a FlyBase Gene Report for the human gene, indicating that it has been expressed transgenically in Drosophila
Model Organism Orthologs (via DIOPT)
This section presents orthology calls between D. melanogaster and 8 other organisms (Homo sapiens (human), Rattus norvegicus (Norway rat), Mus musculus (laboratory mouse), Xenopus tropicalis (Western clawed frog), Danio rerio (zebrafish), Caenorhabditis elegans (nematode, roundworm), Saccharomyces cerevisiae (brewer's yeast) and Schizosaccharomyces pombe (fission yeast)), as provided by the DRSC Integrative Ortholog Prediction Tool (DIOPT). The DIOPT approach integrates ortholog predictions from multiple tools, thereby giving a balanced overview of potential orthologs derived from different algorithms. (Further documentation is here.)
The list of ortholog predictions is arranged by species, in order of increasing phylogenetic distance relative to human.
The columns are:
- Gene Symbol: official gene symbol, as used in (and linked to) the relevant model organism database, prefixed with the 4-letter species abbreviation used in FlyBase
- NCBI Gene ID: links to the gene report page at NCBI
- Score: the number of tools that support a given orthologous gene-pair relationship compared to the total number of tools that compute orthology relationships for those two species (expressed as "X of Y")
- Best Score: either ‘yes’ or ‘no’ to indicate whether the given ortholog has the highest score for the query gene
- Best Rev Score: either ‘yes’ or ‘no’ to indicate whether the query gene has the highest score for the given ortholog in the reciprocal search
- Source: check-boxes indicating which individual ortholog prediction tools support a given orthologous gene-pair relationship
- Align: link to an alignment between the given orthologous gene-pairs on the DIOPT site
- Transgene in Fly: Where applicable, a link to a FlyBase Gene Report for a non-Drosophila gene, indicating that that gene has been expressed transgenically in Drosophila
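The relationship between the Best Score and Best Rev Score columns can be illustrated with a short sketch. It assumes a set of scores keyed by (query gene, candidate ortholog) pairs and treats ties as 'yes'; both the function name `best_score_flags` and the tie handling are assumptions made for illustration, not a description of how DIOPT computes these flags.

```python
from typing import Dict, Tuple

Pair = Tuple[str, str]  # (query fly gene, candidate ortholog in the other species)


def best_score_flags(scores: Dict[Pair, int]) -> Dict[Pair, Tuple[str, str]]:
    """Return ('Best Score', 'Best Rev Score') as 'yes'/'no' for every pair."""
    best_for_query: Dict[str, int] = {}
    best_for_ortholog: Dict[str, int] = {}

    # First pass: highest score seen per query gene and per ortholog.
    for (query, ortholog), score in scores.items():
        best_for_query[query] = max(score, best_for_query.get(query, 0))
        best_for_ortholog[ortholog] = max(score, best_for_ortholog.get(ortholog, 0))

    # Second pass: a pair is 'best' if it reaches the query gene's maximum,
    # and 'best rev' if it reaches the ortholog's maximum (reciprocal search).
    flags: Dict[Pair, Tuple[str, str]] = {}
    for (query, ortholog), score in scores.items():
        best = "yes" if score == best_for_query[query] else "no"
        best_rev = "yes" if score == best_for_ortholog[ortholog] else "no"
        flags[(query, ortholog)] = (best, best_rev)
    return flags
```

A pair flagged 'yes' in both columns is a reciprocal best match. A pair that is 'yes' for Best Score but 'no' for Best Rev Score indicates that the ortholog scores more highly against a different query gene.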
Orthologs (via OrthoDB)
This section presents a uniform set of OrthoDB-derived orthology calls between D. melanogaster and around 40 other species, biased towards those closely related to D. melanogaster. Orthology calls are arranged into 5 ‘orthology groups’: Drosophila species, non-Drosophila Dipterans, non-Dipteran Insects, non-Insect Arthropods, and non-Arthropod Metazoa.
At the top of this section is the “OrthoDB Ortholog Groups” subsection that contains links to OrthoDB for each of the 5 groups mentioned above. Below this are separate sections for each orthology group listing the relevant predicted orthologs. The groups, and the species within them, are listed in order of increasing phylogenetic distance relative to D. melanogaster.
The columns in each section are:
- Organism: the Latin species designation
- Common Name: the common species name
- Gene: symbol/ID of the orthologous gene, linked to an appropriate species database or Ensembl, prefixed with the 4-letter species abbreviation used in FlyBase
- AAA Syntenic Ortholog: (only shown for the ‘Drosophila species’ group)
- Multiple Dmel Genes in this Orthologous Group: either blank or ‘Y’, as appropriate
For non-D. melanogaster FlyBase gene reports, the Orthologs section includes links to the OrthoDB Ortholog groups (if any are identified) as well as a link to the orthologous D. melanogaster gene (where one has been identified).
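Gene symbols in the ortholog tables carry a 4-letter species abbreviation as a prefix. The sketch below separates the prefix from the rest of the symbol. It assumes the prefix is joined to the symbol by a backslash, as in the "Avic\GFP" example given elsewhere on this page; the function name `split_species_prefix` is hypothetical.

```python
from typing import Optional, Tuple


def split_species_prefix(symbol: str) -> Tuple[Optional[str], str]:
    """Split 'Abcd\\symbol' into ('Abcd', 'symbol').

    Symbols without a recognisable 4-letter prefix are returned unchanged,
    with None in place of the prefix.
    """
    prefix, separator, rest = symbol.partition("\\")
    if separator and len(prefix) == 4 and prefix.isalpha() and rest:
        return prefix, rest
    return None, symbol
```

For example, `split_species_prefix("Avic\\GFP")` separates the species abbreviation `Avic` from the symbol `GFP`, while a symbol with no prefix is passed through as is.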
Human Disease Model Data
FlyBase Human Disease Model Reports
Links to Human Disease Model Reports associated with this gene.
Alleles Reported to Model Human Disease (Disease Ontology)
Human Disease model data curated using disease terms from the Disease Ontology.
Models | A table showing any human disease(s) that are being modelled by a given mutant or transgenic allele. The phenotype(s) being studied must recapitulate some aspect of the human disease for the allele to be considered a model. In some cases, an allele may be expected to produce a disease phenotype but does not. These unexpectedly negative results are also shown.
The table consists of four columns:
- Allele: the allele of the gene that is being used as a model.
- Disease: indicates whether or not the allele is modeling the disease ('model of' or 'DOES NOT model'), followed by the name of the disease (from the Disease Ontology).
- Evidence: if the allele models the phenotype on its own, 'Inferred from mutant phenotype' is displayed. If the mutant phenotype is only observed in combination with another allele, 'In combination with' is displayed, followed by the symbols of the other allele(s), with hyperlinks to the relevant allele report pages. Note that any drivers that are required for the expression of transgenic alleles are not listed in this section.
- References: lists the reference(s) that describe the model.
This information is also shown on the individual allele reports. |
Interactions | A table showing interactions of this allele with other disease-causing allele(s).
The table consists of four columns:
- Allele: the allele of the gene that is being used as a model.
- Disease: the nature of the interaction ('exacerbates' or 'ameliorates') and the name of the disease (from the Disease Ontology). This column also indicates if the allele is not modifying a disease (prefixed with DOES NOT).
- Interaction: 'modeled by' followed by the symbol of the allele that is modeling the disease, with hyperlinks to the relevant allele report pages.
- References: lists the reference(s) that describe the model.
This information is also shown on the individual allele reports. |
Comments | Curated comments that highlight certain features of a model, for example, when particular aspects of a disease phenotype are modeled while others are not. These comments are used sparingly to avoid duplication with phenotype information.
This information is also shown on the individual allele reports. |
Gene Model & Products
A GBrowse graphic showing the gene model. The extent of the transcription unit on the genome (highlighted in pink), its transcripts, CDS and any transgene insertion sites in the region are shown. For a more detailed view showing more features, a link to open GBrowse in a separate window is provided above the graphic.
Comments on the Gene Model
Comments regarding the annotation of the gene, including whether the gene model contains any unconventional features.
Sequence Ontology: Class of Gene
A list of controlled vocabulary terms from the Sequence Ontology (SO), which describe the class of the gene. Aspects of the gene class which are described by the SO terms include (but are not limited to):
- the type of molecule (e.g. protein, tRNA, rRNA) encoded by the gene.
- where the gene is encoded (e.g. nucleus, mitochondrion).
Transcript Data
Annotated Transcripts
A table listing the annotated transcripts of the gene, their FlyBase identifier number, length in nucleotides and length in amino acid residues of the associated coding sequence (CDS). Clicking on a transcript symbol will take you to the relevant Transcript Report.
Additional Transcript Data & Comments
Reported transcript sizes (kB) | A list of transcript sizes that have been reported in the literature along with the associated reference. |
Comments | Miscellaneous free text comments related to reported transcripts along with the associated reference. |
External Data
A list of accession numbers from external databases that are relevant to the transcripts of the gene. The accession numbers are FlyBase curated links.
Accession numbers from the following databases are currently displayed in this section:
MIR - miRBase, microRNA data
Rfam - RNA families database of alignments and CMs
These links are also displayed together with all other external data links for the gene in the EXTERNAL CROSSREFERENCES & LINKOUTS section of the Gene Report.
Polypeptide Data
Annotated Polypeptides
A table listing the annotated polypeptides of the gene, their FlyBase identifier number, predicted molecular weight (kD), length in amino acid residues, theoretical pI, and GenBank protein accession numbers. Clicking on a polypeptide symbol will take you to the relevant Polypeptide Report.
Additional Polypeptide Data & Comments
Reported protein sizes (kD) | A list of protein sizes that have been reported in the literature along with the associated reference. |
Comments | Miscellaneous free text comments related to reported proteins along with the associated reference. |
External Data
A list of links to external databases that are relevant to the polypeptides encoded by the gene. The links are either Linkouts, which are indicated by a "LinkOut" label in parentheses after the field label, or are accession numbers which are FlyBase curated links. Clicking on the accession number or linkout identifier will take you to the appropriate entry in the external database from which they are derived.
The external databases currently displayed in this section are:
FlyBase-curated links
GCR - The G protein-coupled receptor database
InterPro - a database of protein families, domains and functional sites
MEROPS - Protease database
MITODROME - The MitoDrome database
NRL_3D - NRL_3D database
PDB - Protein Data Bank (Brookhaven)
TransFac - The TRANSFAC database of transcription factors and their binding sites
Linkouts
PANTHER - Protein Classification System
These links are also displayed together with all other external data links for the gene in the EXTERNAL CROSSREFERENCES & LINKOUTS section of the Gene Report.
Sequences Consistent with the Gene Model
A table of sequence accession numbers that support the gene model. Clicking on the accession number will take you to the appropriate entry in the external database from which they are derived. Sequences associated with genes that do not yet have an annotation are also listed in this section.
The table contains accession numbers from the following databases:
DDBJ/EMBL/GenBank - nucleic acid accession number and associated protein ID number (if applicable).
UniProtKB/Swiss-Prot - protein accession number.
UniProtKB/TrEMBL - protein accession number.
These accessions are both FlyBase curated links and associations made by automated assessment of aligned cDNA and EST nucleotide sequences.
Mapped Features & Mutations
A table listing features associated with the gene, additional to the basic transcript and polypeptide structure, that have been mapped to the genome and form part of the annotation of the gene. The table lists the type of feature in the first column. The second column contains the feature's name as well as its location on the genome. The third column lists the evidence type, any associated mapping information, and any comments related to mapping the feature. The fourth column lists the reference used to map the feature.
External Data
A list of links to external databases that are relevant to the gene model as a whole, but not specifically to individual gene products.
The links are either Linkouts, which are indicated by a LinkOut label in parentheses after the field label, or are accession numbers which are FlyBase curated links. Clicking on the accession number or linkout identifier will take you to the appropriate entry in the external database from which they are derived.
The external databases currently displayed in this section are:
FlyBase-curated links
EPD - Eukaryotic Promoter Database (Bucher)
Linkouts
DEDB - Drosophila Exon Database
These links are also displayed together with all other external data links for the gene in the EXTERNAL CROSSREFERENCES & LINKOUTS section of the Gene Report.
Genomic Location and Detailed Mapping Data
Chromosome (arm) | The chromosome arm that the gene is located on. This field is only filled in for genes with annotations. |
Cytogenetic Map | A computed cytological location, based on the position on the genome to which the gene maps. This field is only filled in for genes with annotations.
See Computed cytological data in FlyBase for a detailed description of how this computed cytological location is calculated. |
Recombination map | A computed genetic map position for the gene, derived from the full list of terms listed in the Experimentally Determined Recombination Data subsection.
The genetic map position is given as chromosome number-map position, or simply chromosome number-, if the gene has not been mapped within the chromosome. |
Sequence location | The extent of the transcription unit on the genome, prefixed with the chromosome arm that the gene is located on. The strand to which the gene maps is indicated in square brackets after the sequence coordinates.
This field is only filled in for genes with annotations. |
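The Sequence location field combines three pieces of information: the chromosome arm, the coordinates, and the strand in square brackets. The following is a minimal parsing sketch. The exact textual layout is an assumption (arm, a colon, start and end separated by "..", optional thousands separators, then "[+]" or "[-]"), and the names `SequenceLocation` and `parse_sequence_location` are hypothetical.

```python
import re
from typing import NamedTuple


class SequenceLocation(NamedTuple):
    arm: str     # chromosome arm prefix, e.g. "2L"
    start: int   # first coordinate of the transcription unit
    end: int     # last coordinate of the transcription unit
    strand: str  # "+" or "-", taken from the square brackets


# Assumed layout: "<arm>:<start>..<end> [<strand>]"; commas in numbers are allowed.
_LOCATION_RE = re.compile(
    r"^\s*(?P<arm>[^:\s]+):(?P<start>[\d,]+)\.\.(?P<end>[\d,]+)\s*"
    r"\[(?P<strand>[+-])\]\s*$"
)


def parse_sequence_location(text: str) -> SequenceLocation:
    """Parse a sequence location string into arm, coordinates and strand."""
    match = _LOCATION_RE.match(text)
    if match is None:
        raise ValueError(f"unrecognised sequence location: {text!r}")
    start = int(match.group("start").replace(",", ""))
    end = int(match.group("end").replace(",", ""))
    return SequenceLocation(match.group("arm"), start, end, match.group("strand"))
```

The coordinates used to try this out are arbitrary, for instance `parse_sequence_location("2L:1,000..2,500 [+]")`; they do not refer to a real gene.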
FlyBase Computed Cytological Location
An inferred cytological location, based on the position on the genome to which the gene maps, or on polytene localization, recombination, complementation or molecular information for the gene. The evidence that was used to derive the inferred location is also given.
See Computed cytological data in FlyBase for a detailed description of how this computed cytological location is calculated.
Experimentally Determined Cytological Location
A table of cytological locations that have been reported for the gene in the literature. The table also contains additional notes on the reported location, such as a comment that the cytological location is inferred from the location of a transposable element insertion causing an allele of a gene.
Experimentally Determined Recombination Data
A table of genetic map data that have been reported for the gene in the literature. The table contains reported genetic map positions, given as chromosome number-map position, or simply chromosome number-, if the gene has not been mapped within the chromosome.
The table also lists other genes relative to which the gene has been mapped (to the left of or to the right of), together with additional notes on the reported location.
If a gene has been mapped cytogenetically but not genetically, an estimated genetic position may be given in square brackets, inferred from a standard Map Conversion Table.
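The genetic map notation described above has three variants: a full "chromosome number-map position", a bare "chromosome number-" for genes not mapped within the chromosome, and a bracketed estimate. The sketch below distinguishes them. It assumes that the square brackets enclose the whole position; the names `MapPosition` and `parse_map_position` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class MapPosition(NamedTuple):
    chromosome: str
    position: Optional[float]  # None if not mapped within the chromosome
    estimated: bool            # True if the value was given in square brackets


# "<chromosome>-<position>" where the position may be absent.
_MAP_RE = re.compile(r"^(?P<chrom>[^-\s]+)-(?P<pos>\d+(?:\.\d+)?)?$")


def parse_map_position(text: str) -> MapPosition:
    """Parse a genetic map position such as '2-55.3', '3-' or '[1-36]'."""
    stripped = text.strip()
    estimated = stripped.startswith("[") and stripped.endswith("]")
    if estimated:
        stripped = stripped[1:-1].strip()
    match = _MAP_RE.match(stripped)
    if match is None:
        raise ValueError(f"unrecognised map position: {text!r}")
    pos = match.group("pos")
    return MapPosition(
        match.group("chrom"),
        float(pos) if pos is not None else None,
        estimated,
    )
```

Keeping the `estimated` flag separate from the numeric value makes it straightforward to exclude inferred positions when only experimentally reported ones are wanted.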
Stocks & Reagents
Stocks Listed in FlyBase
A list of the stocks from the public stock centers that contain this gene; genotypes may be provided. If there are a large number of such stocks, the list may be truncated. For participating stock centers, clicking on the stock number will take you to the appropriate stock report.
Genomic Clones
A list of genomic clones that are associated with this gene.
cDNA Clones
This section lists cDNA clones that have been associated with this gene. It is subdivided to distinguish fully sequenced clones from those which have only been partially characterized. Clones that are part of the BDGP DGC collection are listed separately from the other clones.
These associations are primarily automated and at this time include clones containing inserts that support the gene model as well as clones containing inserts that overlap the gene model but do not support any of the currently annotated transcripts. Efforts have been made to eliminate from this list clones that overlap the gene model but support another gene model in the same genomic region (i.e. a gene nested within an intron of another gene). In addition, clones containing sequences that overlap transposable elements have been excluded from this list.
In a future iteration of the report, we plan to subdivide clones into those that completely support a gene model and those that overlap but do not support an annotation. At this time, care should be taken to assess the sequences linked to a clone before using it as a reagent.
RNAi & Array Information
Linkouts to RNAi and array resources. Clicking on the linkout identifier will take you to the appropriate entry in the external database from which they are derived.
Resources currently listed include:
DRSC
GenomeRNAi
Antibody Information
A list of publications in which the generation of antibodies against the subject polypeptide has been reported. Each entry is categorized as "polyclonal" or "monoclonal" to indicate the nature of the antibodies. In cases where a monoclonal antibody is available from the Developmental Studies Hybridoma Bank in Iowa, a link to the DSHB website is provided.
Other Information
Relationship to Other Genes
Source for database identity of | A statement indicating that the valid symbol used in FlyBase for the gene has been changed, because it previously had an arbitrary symbol, for example a gene that was identified by genomic sequencing projects e.g. CG8896, or was a lethal locus named solely by its cytogenetic location, e.g. l(3)64Aj. The publication that was the source of the change is listed.
The statement consists of the new symbol and previously used symbol of the gene at the time of the rename, prefixed with "Source for identity of: ". See Gene symbols and names in the nomenclature document for cases when renaming may occur. |
Source for database merge of | A statement indicating that two (or more) gene records have been merged into a single gene record in the database, together with the publication that was the source of the merge.
The statement consists of the valid symbols of the gene at the time of the merge, prefixed with "Source for merge of: " |
Member gene of | If a gene is present as a cluster in the genome, where each member of the cluster is so similar that they are traditionally referred to by a single name, then FlyBase creates a gene report for each individual member of the cluster, and also a "generic" report representing the cluster as a whole. For example, a generic gene record exists for "5SrRNA" and also for individual members of the cluster, such as "5SrRNA:CR33353".
This field is displayed in the Gene Report for the individual member of the cluster and lists the gene symbol of the generic record, representing the whole cluster, that the gene is a member of. Clicking on the gene symbol will take you to the Gene Report for the generic record representing the whole cluster. This field is only displayed in an individual Gene Report if it contains data. |
Component gene(s) | If a gene is present as a cluster in the genome, where each member of the cluster is so similar that they are traditionally referred to by a single name, then FlyBase creates a gene report for each individual member of the cluster, and also a "generic" report representing the cluster as a whole. For example, a generic gene record exists for "5SrRNA" and also for individual members of the cluster, such as "5SrRNA:CR33353".
This field is displayed in the Gene Report for the generic record, representing the whole cluster, and lists the gene symbols of the individual members of the cluster. Clicking on a gene symbol will take you to the Gene Report for the individual member of the cluster. This field is only displayed in an individual Gene Report if it contains data. |
Encoded by | This field is typically displayed in a Gene Report for genes encoded by natural transposons, and lists the symbol of the natural transposon that encodes the gene.
Clicking on the symbol will take you to the Report for the natural transposon. This field is only displayed in an individual Gene Report if it contains data. |
Tags | This field is displayed in a Gene Report that represents an engineered tag that has been used in a transgenic construct to mark genes or their products. This includes epitope tags such as "Avic\GFP" and function tags such as nuclear localization signals.
The field lists the alleles that contain the tag. Clicking on the symbol will take you to the relevant Allele Report. This field is only displayed in an individual Gene Report if it contains data. |
Additional comments | Free text comments about the relationship of the gene to other genes or groups of alleles. |
Other Comments
Miscellaneous free text comments about the gene.
Origin and Etymology
Discoverer
A list of the individuals who identified the gene.
Etymology
The explanation behind the gene name as reported by the authors.
Identification
Details of how the gene was identified.
External Crossreferences & Linkouts
A complete list of links to external databases for the gene. These links may also be present in other sections of the Gene Report, where appropriate. The SEQUENCE CROSSREFERENCES and OTHER CROSSREFERENCES sections contain accession numbers which are FlyBase curated links. The LINKOUTS section contains Linkouts. Clicking on the accession number or linkout identifier will take you to the appropriate entry in the external database from which they are derived.
The SEQUENCE CROSSREFERENCES section contains accession numbers from the following databases:
DDBJ/EMBL/GenBank - the nucleic acid sequence databases of Japan, Europe, and the U.S.
UniProtKB/Swiss-Prot - UniProt Knowledgebase, Swiss-Prot section
UniProtKB/TrEMBL - UniProt Knowledgebase, TrEMBL section
The OTHER CROSSREFERENCES section contains accession numbers from the following databases:
EPD - Eukaryotic Promoter Database (Bucher)
GCR - The G protein-coupled receptor database
InterPro - a database of protein families, domains and functional sites
MEROPS - Protease database
MIR - miRBase, microRNA data
MITODROME - The MitoDrome database
NRL_3D - NRL_3D database
PDB - Protein Data Bank (Brookhaven)
Rfam - RNA families database of alignments and CMs
TransFac - The TRANSFAC database of transcription factors and their binding sites
The LINKOUTS section contains linkout identifiers from the following databases:
BDGP in situ Gene Expression Database - Patterns of gene expression in Drosophila embryogenesis
BioGRID - General Repository for Interaction Datasets
DEDB - Drosophila Exon Database
Drosophila PIMRider - The Drosophila Protein Interaction map
DRSC - Drosophila RNAi Screening Center
FLIGHT - Integrating Genomic and High-Throughput data
Flyatlas - the Drosophila adult expression atlas
FlyMine - An integrated database for Drosophila and Anopheles genomics
FlyView - A Drosophila Image Database
Genome RNAi, Heidelberg - a database of phenotypes from systematic RNA interference (RNAi) screens in cultured Drosophila cells
Inparanoid - Eukaryotic Ortholog Groups
Interactive Fly - A cyberspace guide to Drosophila development and metazoan evolution
GEO - NCBI's Gene Expression Omnibus
Reactome - Curated Database of Biological Pathways
PANTHER - Protein Classification System
Yale Developmental Gene Expression
Synonyms & Secondary IDs
Reported As
Symbol Synonym | A list of symbols that have been used in the literature, or by FlyBase, to describe the gene. |
Name Synonym | A list of names that have been used in the literature, or by FlyBase, to describe the gene. |
Secondary FlyBase IDs
A list of Secondary FlyBase identifier numbers of the gene.
If a gene has a secondary identifier number, it generally indicates that at some point it has been merged with or split from other entries in the database. See the FlyBase identifier numbers section for some examples of the cases where identifier numbers are made secondary.
References
A list of publications that discuss the gene, subdivided into fields by type of publication. Only those fields containing data are displayed in an individual Gene Report.