Difference between revisions of "FlyBase:QuickSearch"

From FlyBase Wiki
 
 
(187 intermediate revisions by 9 users not shown)
Line 3: Line 3:
 
===Overview===
 
===Overview===
  
The QuickSearch tool on the FlyBase home page allows searching across all FlyBase reports. Forms for searching specific types of data have been separated into ‘tabs’, arrayed at the top of the QuickSearch window. Several of the tabs contain entirely new search tools, such as a new ‘Simple’ search form, an easy–to–use tool with access to all the data types in FlyBase. More information on how to use QuickSearch can be found below.
+
The QuickSearch tool on the FlyBase home page allows searching across all FlyBase reports. Forms for searching specific types of data have been separated into ‘tabs’, arrayed at the top of the QuickSearch window. Information on how to use each of the QuickSearch tabs can be found below.
  
 
Links to specific help for each tab:
 
Links to specific help for each tab:
  
[[FlyBase:QuickSearch#Simple tab| Simple tab]] | [[FlyBase:QuickSearch#Data Type tab| Data Type tab]] | [[FlyBase:QuickSearch#Expression tab| Expression tab]] | [[FlyBase:QuickSearch#Phenotype tab| Phenotype tab]] | [[FlyBase:QuickSearch#References tab| References tab]] | [[FlyBase:QuickSearch#GO tab| GO tab]] | [[FlyBase:QuickSearch#Protein Domain tab| Protein Domain tab]] | [[FlyBase:QuickSearch#Gene Groups tab| Gene Groups tab]] | [[FlyBase:QuickSearch#Human Disease tab| Human Disease tab]] | [[FlyBase:QuickSearch#Orthology tab| Orthology tab]]
+
*[[FlyBase:QuickSearch#Search FlyBase tab| Search FlyBase tab]]  
 +
*[[FlyBase:QuickSearch#Data Class tab| Data Class tab]]
 +
*[[FlyBase:QuickSearch#Expression tab| Expression tab]]
 +
*[[FlyBase:QuickSearch#GAL4 etc tab| GAL4 etc tab]]
 +
*[[FlyBase:QuickSearch#Gene Groups tab| Gene Groups tab]]  
 +
*[[FlyBase:QuickSearch#Pathways tab| Pathways tab]]
 +
*[[FlyBase:QuickSearch#GO tab| GO tab]]
 +
*[[FlyBase:QuickSearch#Human Disease tab| Human Disease tab]]
 +
*[[FlyBase:QuickSearch#Homologs tab| Homologs tab]]
 +
*[[FlyBase:QuickSearch#Phenotype tab| Phenotype tab]]
 +
*[[FlyBase:QuickSearch#Protein Domains tab| Protein Domains tab]]
 +
*[[FlyBase:QuickSearch#References tab| References tab]]
 +
 
 +
Also see this publication:
 +
:Marygold SJ and the FlyBase Consortium (2023)
 +
:    '''Exploring FlyBase Data Using QuickSearch (Updated protocol)'''
 +
:    Current Protocols 3:e731. [https://doi.org/10.1002/cpz1.731 DOI:10.1002/cpz1.731]
  
 
===Species Searched===
 
===Species Searched===
Several tabs search data that may be species-specific. In these tabs, a Species checkbox appears giving you the option to ‘include non-Dmel species’ in your search results. The default behavior is to return only Drosophila melanogaster (Dmel) data.
 
  
In the [[FlyBase:QuickSearch#Data Type tab| Data Type tab]], an override behavior is available. To search for data in a non-Dmel species you can add a 4-letter species prefix to the symbol you are using to search, separated by a backslash (‘\’). For example, if you type Dvir\dpp, the search results for the gene symbol dpp will be filtered for those associated with D. virilis only.
+
All tabs search data for all species included in FlyBase. An option to filter by species is provided in the resulting hit-list.
  
 
===Controlled Vocabularies===
 
===Controlled Vocabularies===
  
Several QuickSearch tabs search FlyBase data by making use of controlled vocabulary (CV) terms. These tabs provide intuitive domain-specific searches of FlyBase reports based on the Gene Ontology (GO) controlled vocabulary, on anatomical, developmental-stage-specific or phenotypic class terms used to annotate phenotypes, and on anatomical and/or developmental-stage-specific terms used to annotate gene expression. Combinations of CV terms can be searched using the forms in these tabs. An auto-completion feature is active wherever a search term should come from a CV, to assist you in choosing terms that will match records in FlyBase.
+
Several QuickSearch tabs search FlyBase data by making use of controlled vocabulary (CV) terms. These tabs provide intuitive domain-specific searches of FlyBase reports based on the Gene Ontology (GO) controlled vocabulary, on anatomical, developmental-stage-specific or phenotypic class terms used to annotate phenotypes, and on anatomical and/or developmental-stage-specific terms used to annotate gene expression. Combinations of CV terms can be searched using the forms in these tabs. An auto-completion feature is active wherever a search term should come from a CV, to assist you in choosing terms that will match records in FlyBase. The various controlled vocabularies used in FlyBase can also be searched or browsed by clicking on the "[http://{{flybaseorg}}/vocabularies Vocabularies]" button above the QuickSearch box on the home page.
  
 
===Auto-completion===
 
===Auto-completion===
  
The QuickSearch auto-completion feature is active in tabs that search FlyBase using [[FlyBase:QuickSearch#Controlled Vocabularies| controlled Vocabulary]] terms. Since only terms that are in the [[FlyBase:QuickSearch#Controlled Vocabularies| controlled Vocabulary]] will match records in FlyBase, the auto-completion feature suggests CV terms that are compatible with what you have typed. Selecting a term from the suggestion list reduces the possibility of a search returning nothing because the search term is not one that is used by FlyBase curators.
+
The QuickSearch auto-completion feature is active in tabs that search FlyBase using [[FlyBase:QuickSearch#Controlled Vocabularies| controlled Vocabulary]] terms. Since only terms that are in the [[FlyBase:QuickSearch#Controlled Vocabularies| controlled Vocabulary]] will match records in FlyBase, the auto-completion feature suggests CV terms that are compatible with what you have typed. Selecting a term from the suggestion list reduces the possibility of a search returning nothing because the search term is not one that is used by FlyBase curators. The various controlled vocabularies used in FlyBase can also be searched or browsed by clicking on the "Vocabularies" button above the QuickSearch box on the home page.
  
Some tabs for non-CV-based searches also use the auto-complete feature. Several of the searchable fields available in the [[FlyBase:QuickSearch#References tab| References tab]] are enhanced with auto-completion, which helps prevent searches that fail due to mis-spelled names or mis-remembered journal titles. Most of the data classes searchable under the [[FlyBase:QuickSearch#Data Type tab| Data Type tab]] have auto-completion associated with them as well.
+
Some tabs for non-CV-based searches also use the auto-complete feature. Several of the searchable fields available in the [[FlyBase:QuickSearch#References tab| References tab]] are enhanced with auto-completion, which helps prevent searches that fail due to misspelled names or misremembered journal titles. Most of the data classes searchable under the [[FlyBase:QuickSearch#Data Class tab| Data Class tab]] have auto-completion associated with them as well.
  
The QuickSearch auto-completion feature overrides your browser’s auto-completion function.
+
The QuickSearch auto-completion feature overrides your browser’s auto-completion function. '''Important Note''' -- In the Data Class and References tabs, the auto-complete function must be selected. If the auto-complete button is not checked, your browser's auto-complete function may operate and will offer options based on your history rather than valid FlyBase terms.
  
'''Coordinated Auto-completion'''
+
===Coordinated Auto-completion===
  
 
The coordinated auto-completion feature is active for tabs in which several search terms may be used simultaneously for a search. When a term has been entered in one of these fields, the coordinated auto-completion for the other fields is aware of the term already typed, and suggests only terms that actually occur in combination with the first term in FlyBase reports. Here is an example of how it works in the [[FlyBase:QuickSearch#Expression tab| Expression tab]]:
 
The coordinated auto-completion feature is active for tabs in which several search terms may be used simultaneously for a search. When a term has been entered in one of these fields, the coordinated auto-completion for the other fields is aware of the term already typed, and suggests only terms that actually occur in combination with the first term in FlyBase reports. Here is an example of how it works in the [[FlyBase:QuickSearch#Expression tab| Expression tab]]:
  
When the '''expression pattern (lit. curated)''' data class is selected, text box fields for '''Stage, Tissue''', and '''Cell Loc.''' '''(cell location)''' are displayed. The auto-completion for these three fields is coordinated in the following sense: Suppose you enter "fertilized egg stage" in the '''Stage''' text box. When you move your focus to the Tissue text box, auto-complete there will show only four options; "egg", "female pronucleus", "fertilized egg", and "male pronucleus". This is because, out of the multitude of CV terms available for the '''Tissue''' field, only these four terms have actually been used in combination with "fertilized egg stage" by curators in an annotation captured in the FlyBase database. If you enter any other term in the '''Tissue''' text box, even though it may be a valid CV term for that field, your search would return zero hits, because there are no FlyBase reports containing that combination of CV terms.
+
In the '''Expression''' tab, text box fields for '''Stage''', '''Tissue''', and '''Cell Loc.''' (cell location) are displayed. The auto-completion for these three fields is coordinated in the following sense: Suppose you enter "fertilized egg stage" in the '''Stage''' text box. When you move your focus to the '''Tissue''' text box, auto-complete there will show only four options: "egg", "female pronucleus", "fertilized egg", and "male pronucleus". This is because, out of the multitude of CV terms available for the '''Tissue''' field, only these four terms have actually been used in combination with "fertilized egg stage" by curators in an annotation captured in the FlyBase database. If you enter any other term in the '''Tissue''' text box, even though it may be a valid CV term for that field, your search will return zero hits, because there are no FlyBase reports containing that combination of CV terms.
  
 
Using the terms suggested by the auto-completion feature ensures that you do not enter terms that would be mutually exclusive (or are simply not used by curators) in FlyBase reports. Terms suggested by the auto-completion should always return results. If the coordinated auto-completion does not offer a term you wish to enter in a field, it is because this term does not appear in combination with some other term you have entered elsewhere on the form. In this case you should try another combination.
 
Using the terms suggested by the auto-completion feature ensures that you do not enter terms that would be mutually exclusive (or are simply not used by curators) in FlyBase reports. Terms suggested by the auto-completion should always return results. If the coordinated auto-completion does not offer a term you wish to enter in a field, it is because this term does not appear in combination with some other term you have entered elsewhere on the form. In this case you should try another combination.
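The coordinated behavior described above can be pictured as a filter over curated term combinations. The following is a minimal sketch only, not FlyBase's implementation: the function name <code>suggest_tissues</code> and the in-memory <code>ANNOTATIONS</code> list are assumptions made for illustration, and the last row is a deliberately hypothetical filler entry.

```python
# Toy set of curated (stage, tissue) combinations. The first four rows mirror
# the "fertilized egg stage" example in the text; the last row is hypothetical.
ANNOTATIONS = [
    ("fertilized egg stage", "egg"),
    ("fertilized egg stage", "female pronucleus"),
    ("fertilized egg stage", "fertilized egg"),
    ("fertilized egg stage", "male pronucleus"),
    ("hypothetical stage X", "hypothetical tissue Y"),
]


def suggest_tissues(stage, prefix="", annotations=ANNOTATIONS):
    """Suggest only tissue terms that co-occur with the chosen stage.

    `prefix` stands in for whatever the user has typed so far.
    """
    return sorted(
        {tissue for s, tissue in annotations
         if s == stage and tissue.startswith(prefix)}
    )


print(suggest_tissues("fertilized egg stage"))
print(suggest_tissues("fertilized egg stage", prefix="f"))
```

Because suggestions are drawn only from combinations that already exist, any suggested term is guaranteed to have at least one matching annotation in this toy model.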
Line 36: Line 51:
 
===Wild Cards===
 
===Wild Cards===
  
When you use QuickSearch you can add the asterisk character ( * ) to the beginning or the end of a search term. This is recognized as a “wild card” and will find all terms that contain your search term at the end or beginning of a phrase, respectively. You can also flank your search term with wild card characters to find all phrases containing your search term. For example, you can find the genes that start with 'ft' by entering 'ft*'. (Search the Genes data class either under the [[FlyBase:QuickSearch#Simple tab| Simple tab]] by selecting the 'Genes' data class from the result summary table, or under the [[FlyBase:QuickSearch#Data Type tab| Data Type tab]] by selecting 'genes' from the '''Data Class''' drop-down menu.) The result of this search lists fat (ft) and fushi tarazu (ftz), as you would expect, and also fruitless (fru), because it has the synonym 'fty'.
+
When you use QuickSearch you can add the asterisk character ( * ) to the beginning or the end of a search term. This is recognized as a “wild card”: an asterisk at the beginning finds all terms that end with your search term, and an asterisk at the end finds all terms that begin with it. You can also flank your search term with wild card characters to find all phrases containing your search term. For example, you can find the genes that start with 'ft' by entering 'ft*'. (Search the Genes data class either under the [[FlyBase:QuickSearch#Search FlyBase tab| Search FlyBase tab]] by selecting the 'Genes' data class from the result summary table, or under the [[FlyBase:QuickSearch#Data Class tab| Data Class tab]] by selecting 'genes' from the '''Data Class''' drop-down menu.) The result of this search lists fat (ft) and fushi tarazu (ftz), as you would expect, and also fruitless (fru), because it has the synonym 'fty'.
 +
 
 +
'''Please note''' that wild cards cannot be used in numeric fields (year, ''etc'').
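As a rough illustration of the wild card behavior described in this section, the sketch below matches a pattern against gene symbols and their synonyms using Python's standard <code>fnmatch</code> module. It is not how QuickSearch is implemented; the <code>GENES</code> table is a toy stand-in built from the 'ft*' example, and the function name <code>wildcard_search</code> is an assumption. Note that <code>fnmatchcase</code> also treats <code>?</code> and <code>[...]</code> as special, which QuickSearch may not.

```python
from fnmatch import fnmatchcase

# Toy records: gene symbol -> other names/synonyms (illustrative only).
GENES = {
    "ft": ["fat"],
    "ftz": ["fushi tarazu"],
    "fru": ["fruitless", "fty"],
    "dpp": ["decapentaplegic"],
}


def wildcard_search(pattern, genes=GENES):
    """Return symbols whose symbol or any synonym matches the '*' pattern."""
    hits = []
    for symbol, synonyms in genes.items():
        if any(fnmatchcase(term, pattern) for term in [symbol, *synonyms]):
            hits.append(symbol)
    return hits


print(wildcard_search("ft*"))   # trailing wild card: terms beginning with 'ft'
print(wildcard_search("*z"))    # leading wild card: terms ending with 'z'
```

In this toy table, 'ft*' picks up fru only through its synonym 'fty', which mirrors the fruitless example in the text.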
  
 
==Tab Descriptions==
 
==Tab Descriptions==
  
===Simple tab===
+
===Search FlyBase tab===
 +
 
 +
This tab performs a comprehensive search of text-searchable FlyBase data. This includes most fields from all data classes of reports.
 +
 
 +
Enter one or more search terms in the box. The search term box of the Search FlyBase tab supports a number of additional features that can be used to narrow or broaden the query. A wildcard character (*) can be appended to, prepended to, or placed within a search term to broaden the query. When specifying multiple terms, a Boolean ‘AND’ is used by default and does not require any special notation (e.g. a search for ‘neurogenesis microtubule polymerization’ will return only hits that have all three of those terms somewhere in the record). A Boolean ‘OR’ can be added to find records that have one or another of a list of specified terms (e.g. ‘cnn OR cbs’). To exclude certain terms from the results, prefix the term(s) to be excluded with a ‘-’ character (e.g. ‘Parkinson -CG5680’). Finally, results can be required to contain an exact phrase by surrounding the search term with double quotes (e.g. “SH3 domain”).
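To make the query syntax described above concrete, here is a small sketch of how such a query could be evaluated against a single record's text. It is an illustration under stated assumptions, not the QuickSearch implementation: the function name <code>matches</code> is hypothetical, matching is simplified to case-insensitive substring tests, wild cards are not handled, and 'OR' is assumed to bind only the two terms on either side of it.

```python
import shlex


def matches(query, text):
    """Evaluate a simplified query: implicit AND, 'OR', '-term', "exact phrase"."""
    tokens = shlex.split(query)      # keeps "quoted phrases" together as one token
    text_l = text.lower()
    groups = []                      # each group is a list of OR alternatives
    excluded = []
    pending_or = False
    for tok in tokens:
        if tok == "OR":
            pending_or = True
            continue
        if tok.startswith("-") and len(tok) > 1:
            excluded.append(tok[1:].lower())
        elif pending_or and groups:
            groups[-1].append(tok.lower())
        else:
            groups.append([tok.lower()])
        pending_or = False
    if any(term in text_l for term in excluded):
        return False
    # Every group must be satisfied (AND); within a group any term will do (OR).
    return all(any(term in text_l for term in group) for group in groups)


print(matches("cnn OR cbs", "cbs mutant phenotype"))
print(matches("Parkinson -CG5680", "Parkinson disease model, CG5680"))
print(matches('"SH3 domain"', "protein with an SH3 domain"))
```

The sketch uses straight double quotes for phrases, and <code>shlex.split</code> will raise an error on an unbalanced quote or apostrophe; a real query parser would need to handle those cases.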
  
This tab performs a comprehensive search of text-searchable FlyBase data. This includes most fields from sixteen data classes of reports. The search returns a result page summarizing the matching records by data type. Clicking on one of these data types takes you to a secondary result page containing a table of individual matches within that data type. QuickSearch also places your query text in a resubmission form on the result summary page, where you can edit or refine the phrase directly and search again, without having to start over.
+
Click on the 'Search' button or press 'enter'. The search returns a result page summarizing the matching records by data type. Clicking on one of these data types takes you to a secondary result page containing a table of individual matches within that data type. Click on any of these to view its report page. QuickSearch also places your query text in a resubmission form on the result summary page, where you can edit or refine the phrase directly and search again, without having to start over.
  
 
The QuickSearch [[FlyBase:QuickSearch#Auto-completion| auto-completion]] feature is not active in this tab.
 
The QuickSearch [[FlyBase:QuickSearch#Auto-completion| auto-completion]] feature is not active in this tab.
Line 48: Line 69:
 
===Data Class tab===
 
===Data Class tab===
  
This tab contains a subset of the previous version of QuickSearch, and is laid out with very little change from that version. The Data Class drop-down menu restricts searches to only the single data type chosen, as before.
+
The Data Class tab allows searches that are restricted to only a single chosen data type.
 +
 
 +
Choose from among the data types offered in the Data Class dropdown menu. There is also an "All data classes" option.
 +
 
 +
Choose to search just "ID/Symbol/Name" or search "All text" by clicking the appropriate box.
 +
 
 +
Enter a symbol appropriate to the selected data type in the "Enter text" box.
 +
Clicking the "QuickSearch autocomplete" option will enable the autocomplete function, which allows you to choose from among valid symbols for the selected data type. The QuickSearch [[FlyBase:QuickSearch#Auto-completion| auto-completion]] feature is active for most of the data classes in this tab.
  
The QuickSearch [[FlyBase:QuickSearch#Auto-completion| auto-completion]] feature is active for most of the data classes in this tab.
+
'''Important Note''' -- The auto-complete function must be selected. If the auto-complete button is not checked, your browser's auto-complete function may operate and will offer options based on your history rather than valid FlyBase terms.
  
 
===Expression tab===
 
===Expression tab===
Line 56: Line 84:
 
Search for genes according to expression patterns:
 
Search for genes according to expression patterns:
  
At the top of this tab is a link to the [http://dev.flybase.org/static_pages/rna-seq/rna-seq_profile_search.html RNA-Seq Profile Search] tool. This tool can be used to search for genes by specifying a pattern of expression, as evidenced by high-throughput RNA-seq experiments.
+
The top part of the tab contains a form for searching curated statements that describe published accounts of transcript and polypeptide expression, as well as expression associated with reporter constructs and insertions. Choose the expression pattern you wish to search for. The form has input boxes for '''Developmental Stage''', '''Anatomy/Cell Type''', and '''Cellular Component'''. The [[FlyBase:QuickSearch#Coordinated Auto-completion| coordinated auto-completion]] feature will assist you in finding appropriate [[FlyBase:QuickSearch#Controlled Vocabularies| controlled vocabulary]] (CV) terms, or combinations of terms, that have been used during the curation of each descriptor. Once you fill one or more of the three boxes, the auto-complete options offered in the remaining boxes include only terms that have been used together with the terms you have already entered. You do not need to fill every box to search.
  
Use the form below this link to search curated statements that describe published accounts of transcript and polypeptide expression. The input form has input boxes for developmental '''Stage''', body part or '''Tissue''', and subcellular localization '''(Cell Loc.'''). The coordinated [[FlyBase:QuickSearch#Auto-completion| auto-completion]] feature will assist you in finding the appropriate [[FlyBase:QuickSearch#Controlled vocabularies| controlled vocabulary]] (CV) terms that have been used during the curation of each descriptor.
+
You can refine this search further by choosing to add [http://{{flybaseorg}}/cgi-bin/cvreport.pl?rel=is_a&id=FBcv:0000005 qualifier] terms. Click the "'''+'''" sign above the search boxes to bring up additional search boxes for entering qualifier terms. The [[FlyBase:QuickSearch#Coordinated Auto-completion| coordinated auto-completion]] feature will provide you with a list of CV terms that have been used by curators to modify or limit the associated main term. The auto-completion for the qualifier terms is fully coordinated across all of these fields, in the sense that choosing a term for (e.g.) the '''Developmental Stage''' input will affect which qualifier terms are suggested for the '''Anatomy/Cell Type''' or '''Cellular Component''' qualifier fields. Please note: many embryonic expression patterns, such as "pair rule expression pattern", are spatial qualifiers, not anatomy terms; you can search for such terms by filling the '''Anatomy/Cell Type qualifier''' box while leaving the '''Anatomy/Cell Type''' box unfilled.
  
You can refine this search further by choosing to add '''qualifier terms'''. The coordinated [[FlyBase:QuickSearch#Auto-completion| auto-completion]] feature will provide you with a list of CV terms that have been used by curators to modify or limit the associated main term. The auto-completion for the qualifier terms is fully coordinated across all of these fields, in the sense that choosing a term for (e.g.) the Stage input will affect which qualifier terms are suggested for the Tissue or Cell Loc qualifier fields.
+
Running the search will take you to a hit list of genes to which the search terms have been associated. To see other classes of hits (Alleles, Insertions, Recombinant Constructs), click on the associated green box at the top of the results page. To see the curated expression patterns, click on an item in the hit list and open the "Expression Data" section of the corresponding report.
 +
If you are looking for expression patterns of GAL4 or other binary drivers, or of lacZ or GFP reporters, use the new [[FlyBase:QuickSearch#GAL4 etc tab| GAL4 etc tab]]. Alternatively, choose the green box labeled "Alleles", click one of the blue arrows in the Symbol column to sort alphabetically, then scroll until you reach symbols that start with the text "Ecol\lacZ" (lacZ reporters) or "Scer\GAL4" (GAL4 drivers). Expression data is attached to the insertion or transgenic construct associated with the allele; the associated insertion or construct can be found in the General Information section at the top of the allele report, under the fields "Associated Insertion(s)" and "Carried in Constructions". You can also select all the alleles of interest in the Alleles hitlist, then select the Batch Download option from the HitList Conversion Tools button. In the Select Fields menu, choose "Associated insertion(s)" and "Carried in construct", found under the "Nature of the Allele" heading. This will generate a hitlist of the insertions and constructs to which the desired driver or reporter expression pattern has been curated.
  
===Phenotype tab===
+
The bottom section of this tab contains a dropdown menu with links to a variety of RNA-Seq Search tools. JBrowse allows you to visually examine RNA-Seq expression levels in particular regions of the genome. The RNA-Seq Profile Search can be used to search for genes that have a specific expression pattern of interest. RNA-Seq Similarity Search allows you to search for genes with a similar pattern of expression to the input gene. RNA-Seq by Region allows you to compare RNA-Seq signals in a given region across samples, or to compare signal between two regions in a single sample. You can find a more detailed description of these RNA-Seq search options on the [[FlyBase:RNA-Seq Overview | RNA-Seq overview page]]. There are '''video tutorials''' available for [https://www.youtube.com/watch?v=Ho_PZ4XB8y8&t=2s RNA-Seq Profile Search], [https://www.youtube.com/watch?v=mcMpRxeX-KY RNA-Seq Similarity Search], and [https://www.youtube.com/watch?v=KpJVkopUBDM&t=107s RNA-Seq in GBrowse]  -- <span style="color:red"> Warning: The GBrowse genome viewer has been updated to JBrowse, but techniques shown here are still useful in JBrowse.</span>
 +
 
 +
===GAL4 etc tab===
 +
 
 +
The GAL4 etc tab allows searches for GAL4 and other binary system drivers, and for non-binary reporters, by temporal-spatial expression pattern, or gene promoter expression pattern, as curated from the literature.
 +
 
 +
'''Search by curated expression pattern''' Click the option "'''by curated expression pattern'''". Choose the expression pattern you wish to search for. The form has input boxes for '''Developmental Stage''', '''Anatomy or Cell Type''', and '''Cellular Component'''. The [[FlyBase:QuickSearch#Coordinated Auto-completion| coordinated auto-completion]] feature will assist you in finding appropriate [[FlyBase:QuickSearch#Controlled Vocabularies| controlled vocabulary]] (CV) terms, or combinations of terms, that have been used during the curation of each descriptor. Once you fill one or more of the three boxes, the auto-complete options offered in the remaining boxes include only terms that have been used together with the terms you have already entered. You must fill either the '''Developmental Stage''' or the '''Anatomy or Cell Type''' box to use this search option; you cannot fill only the '''Cellular Component''' box. You must also use a valid controlled vocabulary term, and cannot search using a synonym. Synonyms will fail to autocomplete; you will instead see text reading "...no matching text suggestions...", and your query will produce no matches. You can use the [http://{{flybaseorg}}/vocabularies Vocabularies] tool to search with your synonym to find a valid Controlled Vocabulary term.
 +
 
 +
'''Search with qualifier terms ''' You can refine this search further by choosing to add [http://{{flybaseorg}}/cgi-bin/cvreport.pl?rel=is_a&id=FBcv:0000005 qualifier] terms. Click the “'''+'''” sign above the search boxes to bring up additional search boxes for entering qualifier terms. The coordinated auto-completion feature will provide you with a list of CV terms that have been used by curators to modify or limit the associated main term. The auto-completion for the qualifier terms is fully coordinated across all of these fields, in the sense that choosing a term for (e.g.) the '''Developmental Stage''' input will affect which qualifier terms are suggested for the '''Anatomy/Cell Type''' or '''Cellular Component qualifier''' fields. Please note: many embryonic expression patterns, such as [http://{{flybaseorg}}/cgi-bin/cvreport.pl?rel=is_a&id=FBcv:0000322 pair rule expression pattern] are spatial qualifiers, not anatomy terms; you can search for such terms by filling the '''Anatomy/Cell Type qualifier''' box while leaving the '''Anatomy/Cell Type''' box unfilled; you do, however, need to also fill the '''Developmental Stage''' box in this case. Sex-specific queries use the '''qualifier''' box for '''Developmental Stage'''. You must fill the '''Developmental Stage''' box to use a [http://{{flybaseorg}}/cgi-bin/cvreport.pl?cvterm=FBcv:0000332 sex qualifier], even if you have filled the '''Anatomy/Cell Type''' box.
 +
 
 +
'''Choosing the correct search stringency''' It is important to note that you should fill only as many fields as you need; for this tab, you should usually leave the '''Cellular Component''' field unfilled, as only a small subset of expression pattern curation for driver or reporter alleles includes a GO cellular component term, such as [http://{{flybaseorg}}/cgi-bin/cvreport.pl?rel=&id=GO:0043195 terminal bouton]. Also note that this tool supports searching a set of hierarchical controlled vocabularies. For example, if you search '''by curated expression pattern''' for the anatomy term "imaginal disc", you will get not only the list of all drivers annotated with the term [http://{{flybaseorg}}/cgi-bin/cvreport.pl?rel=is_a&id=FBbt:00001761 imaginal disc], but also those annotated with any term that has an is_a relationship (e.g., [http://{{flybaseorg}}/cgi-bin/cvreport.pl?rel=is_a&id=FBbt:00001778 wing disc], [http://{{flybaseorg}}/cgi-bin/cvreport.pl?rel=is_a&id=FBbt:00001768 eye disc]) or part_of relationship (e.g., [http://{{flybaseorg}}/cgi-bin/cvreport.pl?rel=is_a&id=FBbt:00007111 imaginal disc posterior compartment], [http://{{flybaseorg}}/cgi-bin/cvreport.pl?rel=is_a&id=FBbt:00006029 wing pouch]) to "imaginal disc". The terms that appear in the '''Expression terms''' column of the '''GAL4 etc''' hitlist can assist you in choosing a more specific search term; you can also use the [http://{{flybaseorg}}/vocabularies Vocabularies] tool to find controlled vocabulary terms.
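The hierarchical search described above can be thought of as expanding the query term to every term linked to it by is_a or part_of before matching annotations. The sketch below shows that idea with a toy ontology taken from the "imaginal disc" example. It is not FlyBase code: the <code>EDGES</code> table is a simplified stand-in for the anatomy vocabulary, the driver names in <code>DRIVERS</code> are hypothetical, and the functions <code>expand</code> and <code>search</code> are assumptions made for illustration.

```python
from collections import deque

# Toy ontology: child term -> list of (relationship, parent term),
# following the relationships stated in the text.
EDGES = {
    "wing disc": [("is_a", "imaginal disc")],
    "eye disc": [("is_a", "imaginal disc")],
    "imaginal disc posterior compartment": [("part_of", "imaginal disc")],
    "wing pouch": [("part_of", "imaginal disc")],
}

# Hypothetical drivers and the anatomy terms they are annotated with.
DRIVERS = {
    "driver-1": {"wing pouch"},
    "driver-2": {"eye disc"},
    "driver-3": {"brain"},
}


def expand(term, edges=EDGES):
    """Return the term plus everything reachable via is_a/part_of (transitively)."""
    found = {term}
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for child, links in edges.items():
            if child not in found and any(parent == current for _rel, parent in links):
                found.add(child)
                queue.append(child)
    return found


def search(term, drivers=DRIVERS):
    """Return drivers annotated with the term or any of its descendant terms."""
    wanted = expand(term)
    return sorted(name for name, terms in drivers.items() if terms & wanted)


print(search("imaginal disc"))
print(search("eye disc"))
```

A broad term therefore returns a superset of the hits for any of its more specific terms, which is why choosing a more specific term narrows the hitlist.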
 +
 
 +
'''Search by expression pattern of a particular gene''' Click the option "'''reflecting expression pattern of a particular gene'''". The form has one input box, '''Gene'''. You can enter either a valid FlyBase gene symbol, such as [http://{{flybaseorg}}/reports/FBgn0000490 dpp], or a valid FlyBase gene ID number, such as FBgn0000490; synonyms or full gene names will not work in this search. Please note that this search is only for drivers and/or reporters that have been curated as reflecting the expression pattern of a specific gene; this search will not find drivers/reporters with an expression pattern similar to that of the gene you have entered in the search box, nor will it find drivers/reporters that are associated with a specific gene, but have not been curated as reflecting expression of that gene. Please note: this search option is an alternative to the "'''by curated expression pattern'''" search option; you can choose only one of these two options.
 +
 
 +
'''Output format options''' (IN PROGRESS) Examples of how to manipulate the results page can be found in this [http://{{flybaseorg}}/commentaries/2019_08/GAL4TabUpgrade.html FlyBase commentary]. Choose the '''Output format''': Integrated Table or List. The List output provides a faceted hitlist consisting of alleles, insertions, transgenic constructs, and stocks. The Integrated Table output provides a customized table view of the hitlist. This format shows the connection between particular alleles, insertions, constructs, and stocks; additionally, the '''Relevant Expression Statements''' column lists the anatomy, stage, and/or GO cellular component [[FlyBase:QuickSearch#Controlled Vocabularies| controlled vocabulary]] term that triggered the search result. The Integrated Table view also pre-sorts drivers/reporters with a publicly available stock to the top of the hitlist.
 +
 
 +
'''Accessing the full expression pattern''' Note: Many drivers and reporters may be expressed at other developmental stages and/or in other tissues or cell types than the pattern you searched for. Clicking through to the '''Allele Reports''', '''Insertion Reports''' or '''Construct Reports''' of your hits will allow you to see the complete curated expression pattern of the drivers or reporters in your hitlist.
 +
 
 +
===Gene Groups tab===
  
Search for alleles that have particular phenotypes. The form is divided into two portions, which may be used independently or in combination. The coordinated [[FlyBase:QuickSearch#Auto-completion| auto-completion]] feature will assist you in finding the appropriate [[FlyBase:QuickSearch#Controlled Vocabularies| controlled vocabulary]] (CV) terms that have been used during the curation of each phenotype.
+
This tab searches FlyBase-curated [https://wiki.flybase.org/wiki/FlyBase:Gene_Group_Report 'Gene Groups'] -  sets of genes/gene products that are acknowledged to form a biological group, such as members of a gene family (e.g. Actins, Wnts), subunits of a protein complex (e.g. proteasome, ribosome), or some other functional grouping (e.g. cadherins or caspases).
  
The top section searches for alleles with a particular [http://{{SERVERNAME}}/cgi-bin/cvreport.html?id=FBcv:0000347 phenotypic class], e.g. "lethal" or "behavior defective". You can refine this search further using the refinement boxes, searching for a phenotype that occurs at a particular [http://{{SERVERNAME}}/cgi-bin/cvreport.html?id=FBdv:00007008 developmental stage], e.g. an embryonic stage and/or under particular conditions, e.g. "recessive" or "heat sensitive".
+
To use, either start typing and select the appropriate Gene Group name from the [http://flybase.org/wiki/FlyBase:QuickSearch#Auto-completion auto-complete] suggestions, or enter your own text, using [http://flybase.org/wiki/FlyBase:QuickSearch#Wild_Cards wildcard(s)] (*) if desired. Then click the 'Search' button or press 'enter'. The resulting hits will be Gene Groups that wholly or partially match your search term. Alternatively, enter the symbol, name or ID of a gene in the search box to retrieve those Gene Groups to which that gene belongs.
  
The bottom section searches for alleles that show a phenotype in a particular tissue or cell type, e.g. "wing" or "RP2 neuron". In this case, terms from the [http://{{SERVERNAME}}/cgi-bin/cvreport.html?id=FBbt:00000001 Anatomy controlled vocabulary] or cellular component terms from the [http://geneontology.org/ Gene Ontology controlled vocabulary] are used. Again, you can refine this search further using the refinement boxes.
+
Click the 'browse' link at the bottom of the panel to see a full [http://flybase.org/lists/FBgg/ list of Gene Groups].
  
Please note that the coordinated [[FlyBase:QuickSearch#Auto-completion| auto-completion]] works within the two sections, but not between them. This means it is possible, even when using auto-completion suggestions, to search on a combination of terms entered in both sections of this form that returns zero hits.
+
===Pathways tab===
  
===References tab===
+
This tab searches FlyBase-curated [https://wiki.flybase.org/wiki/FlyBase:Pathway_Report 'Pathway Reports'] - sets of genes/gene products that have been experimentally shown to act within a pathway or to regulate a pathway.
  
This tab searches the extensive FlyBase bibliography. Searches can be filtered by title/abstract text, journal name, publication type, and reference IDs (PubMed or FlyBase), in addition to the author and date filters. Appropriate fields also allow the use of Boolean operators, so you can search for papers authored by e.g. “Smith NOT Johnson”. In addition to Boolean operators the year field supports mathematical comparison symbols (>,>=,<,<=) and range indicators (-,--,..). For example,
+
To use, either start typing and select the appropriate Pathway Report name from the [http://flybase.org/wiki/FlyBase:QuickSearch#Auto-completion auto-complete] suggestions, or enter your own text, using [http://flybase.org/wiki/FlyBase:QuickSearch#Wild_Cards wildcard(s)] (*) if desired. Then click the 'Search' button or press 'enter'. The resulting hits will be Pathway Reports that wholly or partially match your search term. Alternatively, enter the symbol, name or ID of a gene in the search box to retrieve those Pathway Reports to which that gene belongs.
  
>2003
+
Click the 'browse' link at the bottom of the panel to see a full [http://flybase.org/lists/FBgg/pathways list of Pathway Reports].
<=1945
 
1999-2003
 
1970-1990 NOT 1976
 
1992 OR 1995 OR 1998
 
The QuickSearch auto-completion feature is active for the fields in this tab where it will be helpful, such as the journal name field. These fields are indicated with a superscript.
 
  
 
===GO tab===
 
===GO tab===
  
Search the [[FlyBase:Gene Ontology (GO) Annotation| Gene Ontology]] (GO) [[FlyBase:Controlled Vocabularies| controlled vocabulary]] directly. Results are a CV term report, or list of reports. Once you are looking at the term report, you can then get a list of genes that are annotated with that GO term (look in the '''Records annotated with this exact term''' section), among other things. Please see the [[FlyBase:Vocabularies| CV term report help page]] for more information.
+
Search the [[FlyBase:Gene Ontology (GO) Annotation| Gene Ontology]] (GO) [[FlyBase:QuickSearch#Controlled Vocabularies| controlled vocabulary]] directly. You can search all GO terms or limit your search to the molecular function, biological process, or cellular component GO vocabularies by selecting from the "Data Field" dropdown menu.
 +
 
 +
Results are in the form of a hit list of matching GO terms. Clicking on the term of interest takes you to the term report from which, among other things, you can get a list of genes that are annotated with that GO term; look in the '''Records annotated with this exact term''' section. Note, as the GO is an ontology, all child terms will possess the property of the parent, therefore consider using '''Records annotated with this term OR any of its CHILDREN TERMS''' to get a complete gene list.
 +
 
 +
Please see the [[FlyBase:Vocabularies| Vocabularies]] help page for more information.
  
 
The QuickSearch  [[FlyBase:QuickSearch#Auto-completion| auto-completion]] feature is active in this tab.
 
The QuickSearch  [[FlyBase:QuickSearch#Auto-completion| auto-completion]] feature is active in this tab.
  
===Protein Domains tab===
+
===Human Disease tab===
 
 
Search using InterPro IDs or signatures, including protein domains, families, repeats, and sites.
 
  
Either start typing and select a term from the drop-down menu, or enter your own search term using wildcard(s) (*) if desired. Resulting hits will be genes whose protein products are annotated with an InterPro signature that wholly or partially matches your term. N.B. Search with an InterPro ID (e.g. 'IPR019956') if you wish to retrieve hits annotated with a specific InterPro signature.
+
Search [https://wiki.flybase.org/wiki/FlyBase:Human_Disease_Model_Report Human Disease Model Reports] and the [http://disease-ontology.org/ Disease Ontology (DO)] by entering a disease, human disease-associated gene, or ''Drosophila melanogaster'' gene, into the search box. This search supports a great deal of flexibility in search text.  
  
See the InterPro [http://www.ebi.ac.uk/interpro/faqs.html FAQ] page for an explanation of these different signature types.
+
'''You can search by disease using:'''
 +
*Disease Ontology (DO) term (e.g., autosomal dominant Parkinson Disease 1)
 +
*DOID (e.g., DOID:0060367 '''or''' 0060367)
 +
*Human Disease Model name (e.g., Parkinson disease 1)
 +
*Human Disease Model ID (e.g., FBhh0000006)
 +
*[https://omim.org/ OMIM] phenotype term (e.g., PARKINSON DISEASE 1, AUTOSOMAL DOMINANT)
 +
*OMIM phenotype ID (e.g., 168601)
 +
*disease synonym (e.g., PD1).  
  
The QuickSearch [[FlyBase:QuickSearch#Auto-completion| auto-completion]] feature is active in this tab.
+
'''You can search by human disease-associated gene using:'''
 +
*[https://www.genenames.org HGNC (Human Gene Nomenclature Committee)] gene symbol (e.g., SNCA)
 +
*HGNC ID (e.g., 11138)
 +
*OMIM genotype symbol (e.g. ALSIN)
 +
*OMIM genotype ID (e.g., 606352)
  
===Gene Groups tab===
+
'''You can search by ''Drosophila melanogaster'' gene using:'''
 +
*FlyBase gene symbol (e.g., Sod1)
 +
*FlyBase gene name (e.g., superoxide dismutase 1)
 +
*FlyBase gene ID (e.g., FBgn0003462)
  
Search the FlyBase-curated Gene Group data class using a gene or Gene Group symbol, name, synonym or ID.
+
Please note that for HGNC and OMIM IDs, you can search only using the digits of the ID number; so 11138 or 606352 work as search terms, but HGNC:11138 or OMIM:606352 do not.
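
The digits-only rule above can be applied to a list of identifiers before pasting them into the Human Disease tab. The following is a minimal Python sketch; the helper name `to_digits` is hypothetical and is not part of FlyBase. It only removes an 'HGNC:' or 'OMIM:' prefix and leaves everything else untouched.

```python
import re

def to_digits(identifier):
    """Strip a leading 'HGNC:' or 'OMIM:' prefix, keeping the digits-only form."""
    # Assumption: only these two prefixes need removing for this tab.
    return re.sub(r"^(HGNC|OMIM):", "", identifier.strip())

print(to_digits("HGNC:11138"))
print(to_digits("OMIM:606352"))
print(to_digits("11138"))
```

This applies only to the Human Disease tab; other tabs may expect the prefixed form.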
  
The QuickSearch [[FlyBase:QuickSearch#Auto-completion| auto-completion]] feature is active in this tab.
+
Please note that disease synonyms work only for exact synonyms that have been attached to a Human Disease Model or to a DO term. So, the search string "ALS" fails to find the disease "amyotrophic lateral sclerosis 4", as "ALS4" is a synonym for that disease, but "ALS" is not. Addition of a wild card (e.g., ALS*) will, in many cases, get around this issue.
  
Click the 'Browse' button to see a full list of Gene Groups.
+
'''The QuickSearch auto-completion feature is active in this tab; auto-completion works for'''
 +
*Disease Ontology terms and DOIDs (digits only)
 +
*Human Disease Model names and IDs
 +
*OMIM phenotype terms
 +
*HGNC gene symbols and IDs
 +
*FlyBase gene symbols, names, and IDs
  
===Human Disease tab===
+
The results are a hit list of matching Disease Ontology CV terms, Human Disease Model Reports, ''Drosophila melanogaster'' genes associated with a Human Disease Model, and ''Drosophila melanogaster'' alleles with a models_of relationship to a Disease Ontology term. Clicking on the Disease Ontology term of interest takes you to a term report from which, among other things, you can get a list of all genes or alleles that have been used to model, or interact with a model, of that disease in flies; you can also get to Human Disease Model Reports, which compile all of the disease model-related information on that disease in FlyBase. Please see the  [[FlyBase:Vocabularies| Vocabularies]] help page for more information. Human Disease Model, Gene, and Allele hits take you to the corresponding report.
  
Search the [http://disease-ontology.org/ Disease Ontology (DO)] controlled vocabulary to find alleles that have been used as disease models. Enter the name of a disease into the search box. The results are a list of Disease Ontology CV reports that match the inputted term. From the CV report you can get a list of all genes or alleles that have been used to model, or interact with a model, of that disease in flies. Please see the  [[FlyBase:Vocabularies| CV term report help page]] for more information.
+
Click the 'browse' button to see a full [http://{{flybaseorg}}/lists/FBhh list of Human Disease Model Reports]. This list has been organized as an index, so that you can easily browse to your disease; for example, Machado-Joseph disease is redundantly listed as Machado-Joseph disease, under polyglutamine diseases, and under spinocerebellar ataxia.
  
The QuickSearch auto-completion feature is active in this tab.
+
===Homologs tab===
  
Click the 'Browse' button to see a full list of Human Disease Model Reports.
+
This tab can be used to quickly search for orthologs of ''D. melanogaster'', human or other model organism genes, as provided by the [http://www.flyrnai.org/cgi-bin/DRSC_orthologs.pl DRSC Integrative Ortholog Prediction Tool (DIOPT)]. It can also be used to find paralogs within ''D. melanogaster''. The DIOPT dataset integrates ortholog/paralog predictions for humans and model organisms from multiple tools and algorithms. (Further documentation is [http://www.flyrnai.org/DRSC-OPT.html here].) A related '''video tutorial''' can be found at [https://www.youtube.com/watch?v=JsNyrNG28M0 Using the Orthology search tool].
  
===Orthology tab===
+
To search for orthologs, first select the input species by clicking on the '''Species''' dropdown menu. Next, enter one or more gene symbols/IDs in the adjacent '''Gene(s)''' box - multiple entries are accepted and need to be separated by spaces. (Response time will be proportional to the number of entries.) Then, select one or more output species using the '''check-boxes'''.
  
This tab can be used to quickly search for orthologs of ''D. melanogaster'', human or other model organism genes, as provided by the [http://www.flyrnai.org/cgi-bin/DRSC_orthologs.pl DRSC Integrative Ortholog Prediction Tool (DIOPT)] or [http://orthodb.org/ OrthoDB]. The DIOPT dataset integrates ortholog predictions for 8 model organisms from multiple tools and algorithms. (Further documentation is [http://www.flyrnai.org/DRSC-OPT.html here].) The OrthoDB dataset (as implemented in FlyBase) comprises ~40 species, biased towards those that are closely related to ''D. melanogaster'', and arranged into 5 ‘orthology groups’: Drosophila species, non-Drosophila Dipterans, non-Dipteran Insects, non-Insect Arthropods, non-Arthropod Metazoa.
+
To search for paralogs within ''D. melanogaster'', select ''D. melanogaster'' as '''both''' the input and output species.
  
To use, first select the input species by clicking on the '''Species''' drop-down menu. Next, enter one or more gene symbols/IDs in the adjacent '''Gene(s)''' box - multiple entries are accepted and need to be separated by spaces. (Response time will be proportional to the number of entries.) Then, select one or more output species using the '''check-boxes''' - where the input species is ''D. melanogaster'', there is a choice between searching the DIOPT or OrthoDB datasets. Finally, click the green '''Search''' button or press ‘enter’.
+
Finally, click the '''Search''' button or press ‘enter’.
  
 
The symbols/IDs that may be entered in the '''Gene(s)''' box depend on the 'input species', as follows:
 
The symbols/IDs that may be entered in the '''Gene(s)''' box depend on the 'input species', as follows:
Line 127: Line 190:
 
|-
 
|-
 
| ''H. sapiens'' || [http://www.genenames.org/ HGNC] gene symbol (e.g. CDK1) or gene ID (e.g. HGNC:1722); [http://omim.org OMIM] ID (e.g. OMIM_GENE:116940); [http://www.ncbi.nlm.nih.gov/gene NCBI Gene] ID (e.g. 983); [http://www.ensembl.org/ Ensembl] ID (e.g. ENSG00000170312)
 
| ''H. sapiens'' || [http://www.genenames.org/ HGNC] gene symbol (e.g. CDK1) or gene ID (e.g. HGNC:1722); [http://omim.org OMIM] ID (e.g. OMIM_GENE:116940); [http://www.ncbi.nlm.nih.gov/gene NCBI Gene] ID (e.g. 983); [http://www.ensembl.org/ Ensembl] ID (e.g. ENSG00000170312)
 +
|-
 +
| ''R. norvegicus'' || [http://rgd.mcw.edu RGD] gene symbol (e.g. Cdk1) or gene ID (e.g. 2319); [http://www.ncbi.nlm.nih.gov/gene NCBI Gene] ID (e.g. 54237)
 
|-
 
|-
 
| ''M. musculus'' || [http://www.informatics.jax.org MGI] gene symbol (e.g. Cdk1) or gene ID (e.g. MGI:88351); [http://www.ncbi.nlm.nih.gov/gene NCBI Gene] ID (e.g. 12534)
 
| ''M. musculus'' || [http://www.informatics.jax.org MGI] gene symbol (e.g. Cdk1) or gene ID (e.g. MGI:88351); [http://www.ncbi.nlm.nih.gov/gene NCBI Gene] ID (e.g. 12534)
Line 135: Line 200:
 
|-
 
|-
 
| ''D. melanogaster'' || [http://www.flybase.org FlyBase] gene symbol (e.g. Cdk1), annotation symbol (e.g. CG5363), or gene ID (e.g. FBgn0004106); [http://www.ncbi.nlm.nih.gov/gene NCBI Gene] ID (e.g. 34411)
 
| ''D. melanogaster'' || [http://www.flybase.org FlyBase] gene symbol (e.g. Cdk1), annotation symbol (e.g. CG5363), or gene ID (e.g. FBgn0004106); [http://www.ncbi.nlm.nih.gov/gene NCBI Gene] ID (e.g. 34411)
 +
|-
 +
| ''A. gambiae'' || [https://vectorbase.org/vectorbase/app/ VectorBase] gene ID (e.g. AGAP007642); [http://www.ncbi.nlm.nih.gov/gene NCBI Gene] ID (e.g. 1269584)
 
|-
 
|-
 
| ''C. elegans'' || [http://www.wormbase.org WormBase] gene symbol (e.g. cdk-1) or gene ID (e.g. WBGene00000405); [http://www.ncbi.nlm.nih.gov/gene NCBI Gene] ID (e.g. 176374)
 
| ''C. elegans'' || [http://www.wormbase.org WormBase] gene symbol (e.g. cdk-1) or gene ID (e.g. WBGene00000405); [http://www.ncbi.nlm.nih.gov/gene NCBI Gene] ID (e.g. 176374)
 +
|-
 +
| ''A. thaliana'' || [https://apps.araport.org/thalemine/keywordSearchResults.do?searchTerm= Araport] gene symbol (e.g. CDC2) or gene ID (e.g. AT3G48750); [http://www.ncbi.nlm.nih.gov/gene NCBI Gene] ID (e.g. 824036)
 
|-
 
|-
 
| ''S. cerevisiae'' || [http://www.yeastgenome.org SGD] gene symbol (e.g. CDC28) or gene ID (e.g. S000000364); [http://www.ncbi.nlm.nih.gov/gene NCBI Gene] ID (e.g. 852457)
 
| ''S. cerevisiae'' || [http://www.yeastgenome.org SGD] gene symbol (e.g. CDC28) or gene ID (e.g. S000000364); [http://www.ncbi.nlm.nih.gov/gene NCBI Gene] ID (e.g. 852457)
 
|-
 
|-
 
| ''S. pombe'' || [http://www.pombase.org PomBase] gene symbol (e.g. cdc2); [http://www.ncbi.nlm.nih.gov/gene NCBI Gene] ID (e.g. 2539869)
 
| ''S. pombe'' || [http://www.pombase.org PomBase] gene symbol (e.g. cdc2); [http://www.ncbi.nlm.nih.gov/gene NCBI Gene] ID (e.g. 2539869)
 +
|-
 +
| ''E. coli'' || [https://ecocyc.org ECOCYC] gene symbol (e.g. pfkA); [http://www.ncbi.nlm.nih.gov/gene NCBI Gene] ID (e.g. 948412)
 
|-
 
|-
 
|}
 
|}
Line 147: Line 218:
  
  
On the results page, the top row shows the search term, species, the matched gene symbol, and any relevant links to Gene Reports. Below this are the column headers, followed by the list of ortholog predictions arranged by species.
+
On the results page, the top row shows the search term, species, the matched gene symbol, and any relevant links to Gene Reports. Below this are the column headers, followed by the list of ortholog/paralog predictions arranged by species.
For DIOPT-based searches, the columns are:
+
The columns are:
* '''Ortholog Gene''': official gene symbol, as used in the relevant model organism database
+
* '''Gene''': official gene symbol, as used in the relevant model organism database
* '''Ortholog Gene Reports''': links to report pages at model organism databases, NCBI, Ensembl and/or OMIM
+
* '''Gene Reports''': links to report pages at model organism databases, NCBI, Ensembl and/or OMIM
* '''Score''': a simple score indicating the number of tools that support a given orthologous gene-pair relationship
+
* '''Score''': the number of tools that support a given gene-pair relationship compared to the total number of tools that compute relationships for those two species (expressed as "X of Y")
* '''Best Score''': either ‘yes’ or ‘no’ to indicate whether the given ortholog has the highest score for the query gene
+
* '''Best Score''': either ‘yes’ or ‘no’, indicating whether the given gene has the highest score for the query gene within that species
* '''Best Rev Score''': either ‘yes’ or ‘no’ to indicate whether the query gene has the highest score for the given ortholog in the reciprocal search; also includes a link to show the full results of performing the reciprocal search (among those species selected in the original query)
+
* '''Best Rev Score''': either ‘yes’ or ‘no’, indicating whether the query gene has the highest score for the given gene in the reciprocal search; also includes a link to show the full results of performing the reciprocal search (among those species selected in the original query)
* '''Source''': list of individual ortholog prediction tools that support a given orthologous gene-pair relationship
+
* '''Source''': list of individual prediction tools that support a given gene-pair relationship
* '''Align''': link to an alignment between the given orthologous gene-pairs on the DIOPT site
+
* '''Align''': link to an alignment between the given gene-pairs on the DIOPT site
 
* '''Transgene in Fly''': link to a FlyBase Gene Report for a non-Drosophila gene, indicating that that gene has been expressed transgenically in Drosophila
 
* '''Transgene in Fly''': link to a FlyBase Gene Report for a non-Drosophila gene, indicating that that gene has been expressed transgenically in Drosophila
The results page for OrthoDB-based searches is similar, except that the DIOPT-specific columns are absent and the ‘Source’ column lists only ‘OrthoDB’. The 'orthology group' to which the species belongs is shown on the right side of each species line.
 
  
In cases where there are multiple hits to a single search term (as may happen when a numerical ID is entered), then all hits together with their predicted orthologs are shown in the results table.
+
In cases where there are multiple hits to a single search term (as may happen when a numerical ID is entered), then all hits together with their predicted orthologs/paralogs are shown in the results table.
  
 
Clicking on the '''Save results as tsv file''' text at the top of the results page will download all the results shown on that page to a file in tab-separated values (TSV) format, with one orthologous gene-pair per line. It has the following columns:
 
Clicking on the '''Save results as tsv file''' text at the top of the results page will download all the results shown on that page to a file in tab-separated values (TSV) format, with one orthologous gene-pair per line. It has the following columns:
* query_context: the entered search term
+
* '''query_context''': the entered search term
* query_species: the selected input species
+
* '''query_species''': the selected input species
* query_gene: the matched input gene symbol
+
* '''query_gene''': the matched input gene symbol
* target_species: the selected output species
+
* '''target_species''': the selected output species
* ortholog_gene: official gene symbol, as used in the relevant model organism database
+
* '''gene''': official gene symbol, as used in the relevant model organism database
* ortholog_gene_reports: gene IDs at model organism databases, NCBI, Ensembl and/or OMIM
+
* '''gene_reports''': gene IDs at model organism databases, NCBI, Ensembl and/or OMIM
* source: list of individual ortholog prediction tools that support a given orthologous gene-pair relationship
+
* '''source''': list of individual prediction tools that support a given gene-pair relationship
* score: a simple score indicating the number of tools that support a given orthologous gene-pair relationship
+
* '''score''': a simple score indicating the number of tools that support a given gene-pair relationship
* best_score: either ‘yes’ or ‘no’ to indicate whether the given ortholog has the highest score for the query gene
+
* '''best_score''': either ‘yes’ or ‘no’ to indicate whether the given gene has the highest score for the query gene
* best_reverse_score: either ‘yes’ or ‘no’ to indicate whether the query gene has the highest score for the given ortholog in the reciprocal search
+
* '''best_reverse_score''': either ‘yes’ or ‘no’ to indicate whether the query gene has the highest score for the given gene in the reciprocal search
* transgene_in_fly: Where applicable, the FlyBase gene ID and symbol for a non-Drosophila gene where that gene has been expressed transgenically in Drosophila
+
* '''transgene_in_fly''': Where applicable, the FlyBase gene ID and symbol for a non-Drosophila gene where that gene has been expressed transgenically in Drosophila
The columns for OrthoDB-based searches are similar, except that the DIOPT-specific columns are absent and the ‘Source’ column lists only ‘OrthoDB’.
+
 
 +
Clicking on the '''Exclude scores <3''' text at the top of the results page will remove any pairwise calls with a DIOPT score less than 3 (i.e. only 1 or 2 individual prediction tools support that particular call), which is useful if the unfiltered list is long owing to many low scoring calls. The text switches to '''Show scores <3''', and clicking again reverts back to the full list.
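
The same score filter can be reproduced offline on a downloaded results file. The following is a minimal Python sketch under stated assumptions: that the file begins with a header row using the column names listed above, and that each score cell starts with the number of supporting tools (for example "7" or "7 of 12"). The sample rows, gene names and scores are invented for illustration only.

```python
import csv
import io

def read_pairs(tsv_text):
    """Parse the file into one dict per gene pair, keyed by column name."""
    return list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))

def leading_int(score):
    # Assumption: the cell starts with the supporting-tool count ("7" or "7 of 12").
    return int(score.split()[0])

def exclude_low_scores(rows, minimum=3):
    """Keep only the gene pairs whose score is at least `minimum`."""
    return [r for r in rows if leading_int(r["score"]) >= minimum]

# Hypothetical file content using a subset of the documented columns.
sample = (
    "query_gene\ttarget_species\tgene\tscore\tbest_score\n"
    "Cdk1\tH. sapiens\tGENE_A\t12\tyes\n"
    "Cdk1\tH. sapiens\tGENE_B\t2\tno\n"
)

rows = read_pairs(sample)
kept = exclude_low_scores(rows)
print([r["gene"] for r in kept])
```

If the real file uses different headers or score formatting, adjust the `"score"` key and `leading_int` accordingly.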
 +
 
 +
===Phenotype tab===
 +
 
 +
This tool allows searching for alleles that have particular phenotypes. The form is divided into two portions, which may be used independently or in combination.
 +
 
 +
The top section searches for alleles with a particular [http://{{flybaseorg}}/cgi-bin/cvreport.html?id=FBcv:0000347 phenotypic class] (e.g. "lethal" or "behavior defective"). You can refine this search further using the refinement boxes, searching for a phenotype that occurs at a particular [http://{{flybaseorg}}/cgi-bin/cvreport.html?id=FBdv:00007008 developmental stage] (e.g. an embryonic stage) and/or under particular conditions (e.g. "recessive" or "heat sensitive").
 +
 
 +
The bottom section searches for alleles that show a phenotype in a particular tissue or cell type (e.g. "wing" or "RP2 neuron"). This uses terms from the [http://{{flybaseorg}}/cgi-bin/cvreport.html?id=FBbt:00000001 Anatomy controlled vocabulary] or cellular component terms from the [http://geneontology.org/ Gene Ontology controlled vocabulary]. Again, you can refine this search further using the refinement boxes.
 +
 
 +
A coordinated [[FlyBase:QuickSearch#Auto-completion| auto-completion]] feature will assist you in finding the appropriate [[FlyBase:QuickSearch#Controlled Vocabularies| controlled vocabulary]] (CV) terms that have been used during the curation of each phenotype. The refinement boxes will only suggest terms that have been curated in combination with the main search term. Please note that this auto-completion works within the two sections, but not between them. This means it is possible, even when using auto-completion suggestions, to search on a combination of terms entered in both sections of this form that returns zero hits.
 +
 
 +
Please see the related Video Tutorial [https://www.youtube.com/watch?v=hZgsDPypZvk 'Finding genes with similar phenotypes'].
 +
 
 +
===Protein Domains tab===
 +
 
 +
Search for genes whose product(s) have a specified domain, repeat or site, or belong to a particular protein family, as defined by [https://www.ebi.ac.uk/interpro/ InterPro]. (See the InterPro [http://www.ebi.ac.uk/interpro/faqs.html FAQs] page for an explanation of different signature types.)
 +
 
 +
To use, either start typing and select an InterPro term from the [http://flybase.org/wiki/FlyBase:QuickSearch#Auto-completion auto-complete] suggestions (recommended) or enter your own term, using [http://flybase.org/wiki/FlyBase:QuickSearch#Wild_Cards wildcard(s)] (*) if desired. Then click the 'Search' button or press 'enter'. Resulting hits will be genes whose protein products are associated with an InterPro signature that wholly or partially matches your search term. E.g. A search for 'Ubiquitin' will retrieve hits to the InterPro family 'Ubiquitin', as well as other InterPro signatures that contain that word ('Ubiquitin domain', 'Ubiquitin conserved site' etc.). If you wish to retrieve hits annotated with a specific InterPro signature, then the InterPro ID (e.g. 'IPR019956') should be used.
 +
 
 +
===References tab===
 +
 
 +
This tab searches the extensive FlyBase bibliography. To use, first select which fields you wish to search by checking the appropriate box(es) at the top, then enter one or more search term(s) as appropriate.  Note that certain fields allow the use of Boolean operators (AND, OR, NOT), the year field supports mathematical comparison symbols (>,>=,<,<=) and range indicators (-,--,..), and the FlyBase [http://flybase.org/wiki/FlyBase:QuickSearch#Auto-completion auto-complete] feature is active in applicable fields. [http://flybase.org/wiki/FlyBase:QuickSearch#Wild_Cards wildcard(s)] (*) can be added to any search term except for 'Year'.
 +
This functionality is summarized in the following table:
 +
 
 +
{| class="wikitable"
 +
|-
 +
! Field !! Boolean terms accepted? !! Auto-complete active? !! Wild cards accepted? !! Example
 +
|-
 +
| '''Author''' || Yes || Yes || Yes || Smith NOT Johnson
 +
|-
 +
| '''Year''' || Yes || No || No || 2004-2008
 +
|-
 +
| '''Title/Abstract''' || No || No || Yes || metabolomics
 +
|-
 +
| '''Journal''' || Yes || Yes || Yes || Dev. Biol.
 +
|-
 +
| '''FBrf/PMID/PMCID/DOI''' || No || No || Yes || FBrf0126983
 +
|-
 +
| '''Publication type''' || Yes || Yes || Yes || paper
 +
|-
 +
| '''All report fields''' || No || No || Yes || dpp
 +
|-
 +
|}
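
The year-field syntax summarized above can be illustrated with a small evaluator. The following is a minimal Python sketch, not FlyBase's implementation: the function names are hypothetical, ranges are assumed to be inclusive, and only OR and NOT are handled (AND is omitted for brevity).

```python
import re

def _atom(expr, year):
    """Test one term: a comparison (>2003), a range (1999-2003) or a single year."""
    m = re.fullmatch(r"(>=|<=|>|<)\s*(\d{4})", expr)
    if m:
        op, y = m.group(1), int(m.group(2))
        return {">": year > y, ">=": year >= y, "<": year < y, "<=": year <= y}[op]
    m = re.fullmatch(r"(\d{4})\s*(?:--|\.\.|-)\s*(\d{4})", expr)
    if m:
        # Assumption: both ends of a range are included.
        return int(m.group(1)) <= year <= int(m.group(2))
    return year == int(expr)

def year_matches(query, year):
    """Return True if `year` satisfies a query such as '1970-1990 NOT 1976'."""
    # OR separates alternatives; within one alternative, NOT removes years.
    for alternative in query.split(" OR "):
        wanted, *excluded = alternative.split(" NOT ")
        if _atom(wanted.strip(), year) and not any(
            _atom(e.strip(), year) for e in excluded
        ):
            return True
    return False

print(year_matches(">2003", 2004))
print(year_matches("1970-1990 NOT 1976", 1976))
print(year_matches("1992 OR 1995 OR 1998", 1995))
```

The sketch is only meant to make the meaning of each operator concrete; the live search form may treat edge cases differently.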

Latest revision as of 13:51, 15 June 2023

General help

Overview

The QuickSearch tool on the FlyBase home page allows searching across all FlyBase reports. Forms for searching specific types of data have been separated into ‘tabs’, arrayed at the top of the QuickSearch window. Information on how to use each of the QuickSearch tabs can be found below.

Links to specific help for each tab:

Also see this publication:

Marygold SJ and the FlyBase Consortium (2023)
Exploring FlyBase Data Using QuickSearch (Updated protocol)
Current Protocols 3:e731. DOI:10.1002/cpz1.731

Species Searched

All tabs search data for all species included in FlyBase. An option to filter by species is provided in the resulting hit-list.

Controlled Vocabularies

Several QuickSearch tabs search FlyBase data by making use of controlled vocabulary (CV) terms. These tabs provide intuitive domain-specific searches of FlyBase reports based on the Gene Ontology (GO) controlled vocabulary, on anatomical, developmental-stage-specific or phenotypic class terms used to annotate phenotypes, and on anatomical and/or developmental-stage-specific terms used to annotate gene expression. Combinations of CV terms can be searched using the forms in these tabs. An auto-completion feature is active wherever a search term should come from a CV, to assist you in choosing terms that will match records in FlyBase. The various controlled vocabularies used in FlyBase can also be searched or browsed by clicking on the "Vocabularies" button above the QuickSearch box on the home page.

Auto-completion

The QuickSearch auto-completion feature is active in tabs that search FlyBase using controlled vocabulary (CV) terms. Since only terms that are in the controlled vocabulary will match records in FlyBase, the auto-completion feature suggests CV terms that are compatible with what you have typed. Selecting a term from the suggestion list reduces the possibility of a search returning nothing because the search term is not one that is used by FlyBase curators. The various controlled vocabularies used in FlyBase can also be searched or browsed by clicking on the "Vocabularies" button above the QuickSearch box on the home page.

Some tabs for non-CV-based searches also use the auto-complete feature. Several of the searchable fields available in the References tab are enhanced with auto-completion, which helps prevent searches that fail due to mis-spelled names or mis-remembered journal titles. Most of the data classes searchable under the Data Class tab have auto-completion associated with them as well.

The QuickSearch auto-completion feature overrides your browser’s auto-completion function. Important Note -- In the Data Class and References tabs, the auto-complete function must be selected. If the auto-complete button is not checked, your browser's auto-complete function may operate and will offer options based on your history rather than valid FlyBase terms.

Coordinated Auto-completion

The coordinated auto-completion feature is active for tabs in which several search terms may be used simultaneously for a search. When a term has been entered in one of these fields, the coordinated auto-completion for the other fields is aware of the term already typed, and suggests only terms that actually occur in combination with the first term in FlyBase reports. Here is an example of how it works in the Expression tab:

In the Expression tab, text box fields for Stage, Tissue, and Cell Loc. (cell location) are displayed. The auto-completion for these three fields is coordinated in the following sense: Suppose you enter "fertilized egg stage" in the Stage text box. When you move your focus to the Tissue text box, auto-complete there will show only four options: "egg", "female pronucleus", "fertilized egg", and "male pronucleus". This is because, out of the multitude of CV terms available for the Tissue field, only these four terms have actually been used in combination with "fertilized egg stage" by curators in an annotation captured in the FlyBase database. If you enter any other term in the Tissue text box, even though it may be a valid CV term for that field, your search will return zero hits, because there are no FlyBase reports containing that combination of CV terms.

Using the terms suggested by the auto-completion feature ensures that you do not enter terms that would be mutually exclusive (or are simply not used by curators) in FlyBase reports. Terms suggested by the auto-completion should always return results. If the coordinated auto-completion does not offer a term you wish to enter in a field, it is because this term does not appear in combination with some other term you have entered elsewhere on the form. In this case you should try another combination.
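
The idea behind coordinated auto-completion can be sketched in a few lines of Python. This is an illustration only, not FlyBase's implementation: the function name and the list-of-dicts data layout are assumptions, and the "adult stage"/"wing" record is a made-up annotation added so that the filter has something to exclude.

```python
def coordinated_suggestions(annotations, field, entered):
    """Suggest values for `field` that co-occur with every already-entered term."""
    hits = {
        a[field]
        for a in annotations
        if field in a and all(a.get(k) == v for k, v in entered.items())
    }
    return sorted(hits)

# Toy annotation set; the first four rows mirror the example in the text above.
annotations = [
    {"stage": "fertilized egg stage", "tissue": "egg"},
    {"stage": "fertilized egg stage", "tissue": "female pronucleus"},
    {"stage": "fertilized egg stage", "tissue": "fertilized egg"},
    {"stage": "fertilized egg stage", "tissue": "male pronucleus"},
    {"stage": "adult stage", "tissue": "wing"},  # hypothetical record
]

print(coordinated_suggestions(annotations, "tissue", {"stage": "fertilized egg stage"}))
```

Because suggestions are drawn only from term combinations that already exist in the annotation set, any suggested combination is guaranteed to match at least one record.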

Wild Cards

When you use QuickSearch you can add the asterisk character ( * ) to the beginning or the end of a search term. This is recognized as a “wild card” and will find all terms that contain your search term at the end or beginning of a phrase, respectively. You can also flank your search term with wild card characters to find all phrases containing your search term. For example, you can find the genes that start with 'ft' by entering 'ft*'. (Search the Genes data class either under the Search FlyBase tab by selecting the 'Genes' data class from the result summary table, or under the Data Class tab by selecting 'genes' from the Data Class drop-down menu.) The result of this search lists fat (ft) and fushi tarazu (ftz), as you would expect, and also fruitless (fru), because it has the synonym 'fty'.

Please note that wild cards cannot be used in numeric fields (year, etc).
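
The wild card behaviour described above resembles shell-style pattern matching. The following is a minimal Python sketch using the standard library's `fnmatchcase`; it is an analogy rather than FlyBase's actual matching code. Matching here is case-sensitive, which is an assumption, and `fnmatchcase` also treats `?` and `[...]` as special characters, which QuickSearch may not.

```python
from fnmatch import fnmatchcase

def wildcard_hits(pattern, terms):
    """Return the terms matched by a pattern that uses * as a wild card."""
    return [t for t in terms if fnmatchcase(t, pattern)]

# Symbols and synonyms taken from the example above, plus two unrelated symbols.
symbols = ["ft", "ftz", "fty", "fru", "dpp", "Sod1"]

print(wildcard_hits("ft*", symbols))   # trailing wild card: terms starting with 'ft'
print(wildcard_hits("*z", symbols))    # leading wild card: terms ending with 'z'
print(wildcard_hits("*t*", symbols))   # flanking wild cards: terms containing 't'
```

Note that 'fty' is matched as a synonym string here; in FlyBase the corresponding hit is reported as the gene fruitless (fru).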

Tab Descriptions

Search FlyBase tab

This tab performs a comprehensive search of text-searchable FlyBase data. This includes most fields from all data classes of reports.

Enter one or more search terms in the box. The search term box of this tab supports a number of additional features that can be used to narrow or broaden the query. A wildcard character (*) can be appended, prepended or added within a search term to broaden the query. When specifying multiple terms, a Boolean ‘AND’ is used for searches by default and does not require any special notation (e.g. a search for ‘neurogenesis microtubule polymerization’ will return only hits that have all three of those terms somewhere in the record). A Boolean ‘OR’ can be added to find records that have one or another of a list of specified terms (e.g. ‘cnn OR cbs’). To exclude certain terms from the results, prefix the term(s) to be excluded with a ‘-’ character (e.g. ‘Parkinson -CG5680’). Finally, results can be specified to contain an exact phrase by surrounding the search term with double quotes (e.g. “SH3 domain”).
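
The query rules above (implicit AND, explicit OR, '-' exclusion, quoted phrases) can be made concrete with a toy matcher. The following is a minimal Python sketch under stated assumptions: it is not FlyBase's search engine, it uses case-insensitive substring matching, it expects straight double quotes, it treats OR as separating groups of ANDed terms, and it does not handle wild cards.

```python
import shlex

def matches(query, record_text):
    """Toy matcher: AND by default, OR between groups, -term excludes, "..." is a phrase."""
    text = record_text.lower()
    tokens = shlex.split(query)  # keeps a quoted phrase together as one token
    # Split the tokens into OR-separated groups; terms within a group are ANDed.
    groups, current = [], []
    for tok in tokens:
        if tok == "OR":
            groups.append(current)
            current = []
        else:
            current.append(tok)
    groups.append(current)

    def group_ok(group):
        for tok in group:
            if tok.startswith("-") and len(tok) > 1:
                if tok[1:].lower() in text:
                    return False  # an excluded term is present
            elif tok.lower() not in text:
                return False      # a required term is missing
        return True

    return any(group_ok(g) for g in groups if g)

print(matches("neurogenesis microtubule polymerization",
              "Microtubule polymerization during neurogenesis"))
print(matches("Parkinson -CG5680", "Parkinson model involving CG5680"))
print(matches('"SH3 domain"', "protein with an SH3 domain"))
```

The record texts are invented strings; the sketch only shows how the four notations combine.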

Click on the 'Search' button or press 'enter'. The search returns a result page summarizing the matching records by data type. Clicking on one of these data types takes you to a secondary result page containing a table of individual matches within that data type. Click on any of these to view its report page. QuickSearch also places your query text in a resubmission form on the result summary page, where you can edit or refine the phrase directly and search again, without having to start over.

The QuickSearch auto-completion feature is not active in this tab.

Data Class tab

The Data Class tab allows searches that are restricted to only a single chosen data type.

Choose from among the data types offered in the Data Class dropdown menu. There is also an "All data classes" option.

Choose to search just "ID/Symbol/Name" or search "All text" by clicking the appropriate box.

Enter a symbol appropriate to the selected data type in the "Enter text" box. Clicking the "QuickSearch autocomplete" option will enable the autocomplete function, which allows you to choose from among valid symbols for the selected data type. The QuickSearch auto-completion feature is active for most of the data classes in this tab.

Important Note -- The auto-complete function must be selected. If the auto-complete button is not checked, your browser's auto-complete function may operate and will offer options based on your history rather than valid FlyBase terms.

Expression tab

Search for genes according to expression patterns:

The top part of the tab contains a form for searching curated statements that describe published accounts of transcript and polypeptide expression, as well as expression associated with reporter constructs and insertions. Choose the expression pattern you wish to search for. The form has input boxes for Developmental Stage, Anatomy/Cell Type, and Cellular Component. The coordinated auto-completion feature will assist you in finding appropriate controlled vocabulary (CV) terms, or combinations of terms, that have been used during the curation of each descriptor. Please note that if you fill one of the three boxes, the auto-completed options offered in the other two boxes will include only terms that have been used with the term or terms you have already entered. You do not need to fill every box to search.
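The idea behind coordinated auto-completion can be sketched in a few lines of Python. This is a conceptual illustration under stated assumptions, not FlyBase code: the `CURATED` records and the `suggestions` function are hypothetical, and each record stands for one curated expression statement.

```python
FIELDS = ("stage", "anatomy", "component")

# Hypothetical curated statements: (stage, anatomy, component); None = not curated.
CURATED = [
    ("larval stage", "wing disc", None),
    ("larval stage", "eye disc", None),
    ("adult stage", "neuromuscular junction", "terminal bouton"),
]

def suggestions(field, records=CURATED, **entered):
    """Terms for `field` that co-occur with every term already entered elsewhere."""
    offered = set()
    for record in records:
        row = dict(zip(FIELDS, record))
        if row[field] is None:
            continue
        if all(row[name] == value for name, value in entered.items()):
            offered.add(row[field])
    return sorted(offered)

# With 'larval stage' entered as the stage, only anatomy terms curated
# together with that stage are offered.
print(suggestions("anatomy", stage="larval stage"))
# A combination that was never curated yields no suggestions at all.
print(suggestions("component", stage="larval stage"))
```

An empty suggestion list corresponds to the case where the form offers no term because that combination does not occur in curated data.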

You can refine this search further by choosing to add qualifier terms. Click the "+" sign above the search boxes to bring up additional search boxes for entering qualifier terms. The coordinated auto-completion feature will provide you with a list of CV terms that have been used by curators to modify or limit the associated main term. The auto-completion for the qualifier terms is fully coordinated across all of these fields, in the sense that choosing a term for (e.g.) the Developmental Stage input will affect which qualifier terms are suggested for the Anatomy/Cell Type or Cellular Component qualifier fields. Please note: many embryonic expression patterns, such as "pair rule expression pattern", are spatial qualifiers, not anatomy terms; you can search for such terms by filling the Anatomy/Cell Type qualifier box while leaving the Anatomy/Cell Type box unfilled.

Running the search will take you to a hit list of genes with which the search terms have been associated. To see other classes of hits (Alleles, Insertions, Recombinant Constructs), click on the associated green box at the top of the results page. To see the curated expression patterns, click on an item in the hit list and open the "Expression Data" section of the corresponding report. If you are looking for expression patterns of GAL4 or other binary drivers or lacZ or GFP reporters, use the new GAL4 etc tab. Alternatively, choose the green box labeled "Alleles", click one of the blue arrows in the Symbol column to sort alphabetically, then scroll until you reach symbols that start with the text "Ecol\lacZ" (lacZ reporters) or "Scer\GAL4" (GAL4 drivers). Expression data is associated with the insertion or transgenic construct associated with the allele; the associated insertion or construct can be found on the allele report under the General Information section at the top of the report, under the fields "Associated Insertion(s)" and "Carried in Constructions". As a further alternative, you can select all the alleles of interest in the Alleles hitlist, then select the Batch Download option from the HitList Conversion Tools button. In the Select Fields menu, choose "Associated insertion(s)" and "Carried in construct", found under the "Nature of the Allele" heading. This will generate a hitlist of the insertions and constructs to which the desired driver or reporter expression pattern has been curated.

The bottom section of this tab contains a dropdown menu with links to a variety of RNA-Seq Search tools. JBrowse allows you to visually examine RNA-Seq expression levels in particular regions of the genome. The RNA-Seq Profile Search can be used to search for genes that have a specific expression pattern of interest. RNA-Seq Similarity Search allows you to search for genes with a similar pattern of expression to the input gene. RNA-Seq by Region allows you to compare RNA-Seq signals in a given region across samples, or to compare signal between two regions in a single sample. You can find a more detailed description of these RNA-Seq search options on the RNA-Seq overview page. Video tutorials are available for RNA-Seq Profile Search, RNA-Seq Similarity Search, and RNA-Seq in GBrowse. Warning: the GBrowse genome viewer has been updated to JBrowse, but the techniques shown in the GBrowse tutorial are still useful in JBrowse.

GAL4 etc tab

The GAL4 etc tab allows searches for GAL4 and other binary system drivers, and for non-binary reporters, by temporal-spatial expression pattern, or gene promoter expression pattern, as curated from the literature.

Search by curated expression pattern:

Click the option "by curated expression pattern". Choose the expression pattern you wish to search for. The form has input boxes for Developmental Stage, Anatomy or Cell Type, and Cellular Component. The coordinated auto-completion feature will assist you in finding appropriate controlled vocabulary (CV) terms, or combinations of terms, that have been used during the curation of each descriptor. If you fill one of the three boxes, the auto-completed options offered in the other two boxes will include only terms that have been used with the term or terms you have already entered. You must fill either the Developmental Stage box or the Anatomy or Cell Type box to use this search option; you cannot fill only the Cellular Component box. You must also use a valid controlled vocabulary term, and cannot search using a synonym. Synonyms will fail to autocomplete; you will instead see text reading "...no matching text suggestions...", and your query will produce no matches. You can use the Vocabularies tool to search with your synonym to find a valid controlled vocabulary term.

Search with qualifier terms:

You can refine this search further by choosing to add qualifier terms. Click the “+” sign above the search boxes to bring up additional search boxes for entering qualifier terms. The coordinated auto-completion feature will provide you with a list of CV terms that have been used by curators to modify or limit the associated main term. The auto-completion for the qualifier terms is fully coordinated across all of these fields, in the sense that choosing a term for (e.g.) the Developmental Stage input will affect which qualifier terms are suggested for the Anatomy/Cell Type or Cellular Component qualifier fields. Please note: many embryonic expression patterns, such as "pair rule expression pattern", are spatial qualifiers, not anatomy terms; you can search for such terms by filling the Anatomy/Cell Type qualifier box while leaving the Anatomy/Cell Type box unfilled; you do, however, also need to fill the Developmental Stage box in this case. Sex-specific queries use the qualifier box for Developmental Stage. You must fill the Developmental Stage box to use a sex qualifier, even if you have filled the Anatomy/Cell Type box.

Choosing the correct search stringency:

Fill only as many fields as you need. For this tab, you should usually leave the Cellular Component field unfilled, as only a small subset of expression pattern curation for driver or reporter alleles includes a GO cellular component term, such as "terminal bouton". Also note that this tool supports searching a set of hierarchical controlled vocabularies. For example, if you search by curated expression pattern for the anatomy term "imaginal disc", you will get not only the drivers annotated with the term "imaginal disc" itself, but also those annotated with terms that have an is_a relationship (e.g., wing disc, eye disc) or part_of relationship (e.g., imaginal disc posterior compartment, wing pouch) to "imaginal disc". The terms that appear in the Expression terms column of the GAL4 etc hitlist can assist you in choosing a more specific search term; you can also use the Vocabularies tool to find controlled vocabulary terms.
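The effect of searching a hierarchical vocabulary can be sketched as follows. This is a minimal Python illustration, not the FlyBase implementation: the `EDGES` list is a toy ontology built only from the example terms in this section, and the driver names in `ANNOTATIONS` are hypothetical placeholders.

```python
from collections import defaultdict

# Toy ontology: (child term, relationship, parent term), from the examples in the text.
EDGES = [
    ("wing disc", "is_a", "imaginal disc"),
    ("eye disc", "is_a", "imaginal disc"),
    ("imaginal disc posterior compartment", "part_of", "imaginal disc"),
    ("wing pouch", "part_of", "imaginal disc"),
]

# Hypothetical drivers and the anatomy term each one is annotated with.
ANNOTATIONS = {
    "driver-A": "wing disc",
    "driver-B": "wing pouch",
    "driver-C": "terminal bouton",
}

def term_and_descendants(term, edges=EDGES):
    """Return the term plus every term reachable through is_a/part_of edges."""
    children = defaultdict(set)
    for child, _relationship, parent in edges:
        children[parent].add(child)
    found, stack = {term}, [term]
    while stack:
        for child in children[stack.pop()]:
            if child not in found:
                found.add(child)
                stack.append(child)
    return found

def search(term):
    """Drivers annotated with the term itself or with any of its descendants."""
    terms = term_and_descendants(term)
    return sorted(d for d, annotated in ANNOTATIONS.items() if annotated in terms)

print(search("imaginal disc"))  # broad term: also picks up wing disc and wing pouch hits
print(search("wing disc"))      # narrower term: fewer hits
```

Choosing the narrower term shortens the hit list, which is why the Expression terms column is a useful guide to a more specific query.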

Search by expression pattern of a particular gene:

Click the option "reflecting expression pattern of a particular gene". The form has one input box, Gene. You can enter either a valid FlyBase gene symbol, such as dpp, or a valid FlyBase gene ID number, such as FBgn0000490; synonyms or full gene names will not work in this search. Please note that this search is only for drivers and/or reporters that have been curated as reflecting the expression pattern of a specific gene; this search will not find drivers/reporters with an expression pattern similar to that of the gene you have entered in the search box, nor will it find drivers/reporters that are associated with a specific gene, but have not been curated as reflecting expression of that gene. Please note: this search option is an alternative to the "by curated expression pattern" search option; you can choose only one of these two options.

Output format options (IN PROGRESS):

Examples of how to manipulate the results page can be found in this FlyBase commentary. Choose the Output format: Integrated Table or List. The List output provides a faceted hitlist consisting of alleles, insertions, transgenic constructs, and stocks. The Integrated Table output provides a customized table view of the hitlist. This format shows the connection between particular alleles, insertions, constructs, and stocks; additionally, the Relevant Expression Statements column lists the anatomy, stage, and/or GO cellular component controlled vocabulary term that triggered the search result. The Integrated Table view also pre-sorts drivers/reporters with a publicly available stock to the top of the hitlist.

Accessing the full expression pattern:

Note: Many drivers and reporters may be expressed at other developmental stages and/or in other tissues or cell types than the pattern you searched for. Clicking through to the Allele Reports, Insertion Reports or Construct Reports of your hits will allow you to see the complete curated expression pattern of the drivers or reporters in your hitlist.

Gene Groups tab

This tab searches FlyBase-curated 'Gene Groups' - sets of genes/gene products that are acknowledged to form a biological group, such as members of a gene family (e.g. Actins, Wnts), subunits of a protein complex (e.g. proteasome, ribosome), or some other functional grouping (e.g. cadherins or caspases).

To use, either start typing and select the appropriate Gene Group name from the auto-complete suggestions, or enter your own text, using wildcard(s) (*) if desired. Then click the 'Search' button or press 'enter'. The resulting hits will be Gene Groups that wholly or partially match your search term. Alternatively, enter the symbol, name or ID of a gene in the search box to retrieve those Gene Groups to which that gene belongs.

Click the 'browse' link at the bottom of the panel to see a full list of Gene Groups.

Pathways tab

This tab searches FlyBase-curated 'Pathway Reports' - sets of genes/gene products that have been experimentally shown to act within a pathway or to regulate a pathway.

To use, either start typing and select the appropriate Pathway Report name from the auto-complete suggestions, or enter your own text, using wildcard(s) (*) if desired. Then click the 'Search' button or press 'enter'. The resulting hits will be Pathway Reports that wholly or partially match your search term. Alternatively, enter the symbol, name or ID of a gene in the search box to retrieve those Pathway Reports to which that gene belongs.

Click the 'browse' link at the bottom of the panel to see a full list of Pathway Reports.

GO tab

Search the Gene Ontology (GO) controlled vocabulary directly. You can search all GO terms or limit your search to the molecular function, biological process, or cellular component GO vocabularies by selecting from the "Data Field" dropdown menu.

Results are in the form of a hit list of matching GO terms. Clicking on the term of interest takes you to the term report from which, among other things, you can get a list of genes that are annotated with that GO term; look in the 'Records annotated with this exact term' section. Note that, as the GO is an ontology, all child terms possess the properties of the parent; consider using 'Records annotated with this term OR any of its CHILDREN TERMS' to get a complete gene list.

Please see the Vocabularies help page for more information.

The QuickSearch auto-completion feature is active in this tab.

Human Disease tab

Search Human Disease Model Reports and the Disease Ontology (DO) by entering a disease, human disease-associated gene, or Drosophila melanogaster gene, into the search box. This search supports a great deal of flexibility in search text.

You can search by disease using:

  • Disease Ontology (DO) term (e.g., autosomal dominant Parkinson Disease 1)
  • DOID (e.g., DOID:0060367 or 0060367)
  • Human Disease Model name (e.g., Parkinson disease 1)
  • Human Disease Model ID (e.g., FBhh0000006)
  • OMIM phenotype term (e.g., PARKINSON DISEASE 1, AUTOSOMAL DOMINANT)
  • OMIM phenotype ID (e.g., 168601)
  • disease synonym (e.g., PD1).

You can search by human disease-associated gene using:

You can search by Drosophila melanogaster gene using:

  • FlyBase gene symbol (e.g., Sod1)
  • FlyBase gene name (e.g., superoxide dismutase 1)
  • FlyBase gene ID (e.g., FBgn0003462)

Please note that for HGNC and OMIM IDs, you can search only using the digits of the ID number; so 11138 or 606352 work as search terms, but HGNC:11138 or OMIM:606352 do not.
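If your identifiers come from a list in which the prefixes are still attached, the digits-only rule can be applied before you paste them into the search box. The following is a minimal Python sketch; the helper name `digits_only` is hypothetical and is not part of FlyBase.

```python
import re

def digits_only(identifier):
    """Strip an 'HGNC:' or 'OMIM:' prefix, leaving the numeric part of the ID.

    Anything that is not a prefixed HGNC/OMIM ID is returned unchanged
    (apart from surrounding whitespace).
    """
    cleaned = identifier.strip()
    match = re.fullmatch(r"(?:HGNC|OMIM):(\d+)", cleaned)
    return match.group(1) if match else cleaned

for raw in ["HGNC:11138", "OMIM:606352", "Sod1"]:
    print(raw, "->", digits_only(raw))
```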

Please note that disease synonyms work only for exact synonyms that have been attached to a Human Disease Model or to a DO term. So, the search string "ALS" fails to find the disease "amyotrophic lateral sclerosis 4", as "ALS4" is a synonym for that disease, but "ALS" is not. Addition of a wild card (e.g., ALS*) will, in many cases, get around this issue.

The QuickSearch auto-completion feature is active in this tab; auto-completion works for:

  • Disease Ontology terms and DOIDs (digits only)
  • Human Disease Model names and IDs
  • OMIM phenotype terms
  • HGNC gene symbols and IDs
  • FlyBase gene symbols, names, and IDs

The results are a hit list of matching Disease Ontology CV terms, Human Disease Model Reports, Drosophila melanogaster genes associated with a Human Disease Model, and Drosophila melanogaster alleles with a models_of relationship to a Disease Ontology term. Clicking on the Disease Ontology term of interest takes you to a term report from which, among other things, you can get a list of all genes or alleles that have been used to model, or interact with a model, of that disease in flies; you can also get to Human Disease Model Reports, which compile all of the disease model-related information on that disease in FlyBase. Please see the Vocabularies help page for more information. Human Disease Model, Gene, and Allele hits take you to the corresponding report.

Click the 'browse' button to see a full list of Human Disease Model Reports. This list has been organized as an index, so that you can easily browse to your disease; for example, Machado-Joseph disease is redundantly listed as Machado-Joseph disease, under polyglutamine diseases, and under spinocerebellar ataxia.

Homologs tab

This tab can be used to quickly search for orthologs of D. melanogaster, human or other model organism genes, as provided by the DRSC Integrative Ortholog Prediction Tool (DIOPT). It can also be used to find paralogs within D. melanogaster. The DIOPT dataset integrates ortholog/paralog predictions for humans and model organisms from multiple tools and algorithms. (Further documentation is here.) A related video tutorial can be found at Using the Orthology search tool.

To search for orthologs, first select the input species by clicking on the Species dropdown menu. Next, enter one or more gene symbols/IDs in the adjacent Gene(s) box - multiple entries are accepted and need to be separated by spaces. (Response time will be proportional to the number of entries.) Then, select one or more output species using the check-boxes.

To search for paralogs within D. melanogaster, select D. melanogaster as both the input and output species.

Finally, click the Search button or press ‘enter’.

The symbols/IDs that may be entered in the Gene(s) box depend on the 'input species', as follows:

Input species     Allowable symbols/IDs (example)
H. sapiens        HGNC gene symbol (e.g. CDK1) or gene ID (e.g. HGNC:1722); OMIM ID (e.g. OMIM_GENE:116940); NCBI Gene ID (e.g. 983); Ensembl ID (e.g. ENSG00000170312)
R. norvegicus     RGD gene symbol (e.g. Cdk1) or gene ID (e.g. 2319); NCBI Gene ID (e.g. 54237)
M. musculus       MGI gene symbol (e.g. Cdk1) or gene ID (e.g. MGI:88351); NCBI Gene ID (e.g. 12534)
X. tropicalis     XenBase gene symbol (e.g. cdk1) or gene ID (e.g. XB-GENE-482750); NCBI Gene ID (e.g. 394503)
D. rerio          ZFIN gene symbol (e.g. cdk1) or gene ID (e.g. ZDB-GENE-010320-1); NCBI Gene ID (e.g. 80973)
D. melanogaster   FlyBase gene symbol (e.g. Cdk1), annotation symbol (e.g. CG5363), or gene ID (e.g. FBgn0004106); NCBI Gene ID (e.g. 34411)
A. gambiae        VectorBase gene ID (e.g. AGAP007642); NCBI Gene ID (e.g. 1269584)
C. elegans        WormBase gene symbol (e.g. cdk-1) or gene ID (e.g. WBGene00000405); NCBI Gene ID (e.g. 176374)
A. thaliana       Araport gene symbol (e.g. CDC2) or gene ID (e.g. AT3G48750); NCBI Gene ID (e.g. 824036)
S. cerevisiae     SGD gene symbol (e.g. CDC28) or gene ID (e.g. S000000364); NCBI Gene ID (e.g. 852457)
S. pombe          PomBase gene symbol (e.g. cdc2); NCBI Gene ID (e.g. 2539869)
E. coli           ECOCYC gene symbol (e.g. pfkA); NCBI Gene ID (e.g. 948412)

Note that symbol-based searches are case-sensitive. To ensure validity, select a gene symbol from the auto-suggest list that appears when typing. (Auto-suggest works only for the first entered symbol.) Also note that this tool does not support searching by full gene name.


On the results page, the top row shows the search term, species, the matched gene symbol, and any relevant links to Gene Reports. Below this are the column headers, followed by the list of ortholog/paralog predictions arranged by species. The columns are:

  • Gene: official gene symbol, as used in the relevant model organism database
  • Gene Reports: links to report pages at model organism databases, NCBI, Ensembl and/or OMIM
  • Score: the number of tools that support a given gene-pair relationship compared to the total number of tools that compute relationships for those two species (expressed as "X of Y")
  • Best Score: either ‘yes’ or ‘no’, indicating whether the given gene has the highest score for the query gene within that species
  • Best Rev Score: either ‘yes’ or ‘no’, indicating whether the query gene has the highest score for the given gene in the reciprocal search; also includes a link to show the full results of performing the reciprocal search (among those species selected in the original query)
  • Source: list of individual prediction tools that support a given gene-pair relationship
  • Align: link to an alignment between the given gene-pairs on the DIOPT site
  • Transgene in Fly: link to a FlyBase Gene Report for a non-Drosophila gene, indicating that that gene has been expressed transgenically in Drosophila

In cases where there are multiple hits to a single search term (as may happen when a numerical ID is entered), then all hits together with their predicted orthologs/paralogs are shown in the results table.

Clicking on the 'Save results as tsv file' text at the top of the results page will download all the results shown on that page to a file in tab-separated value (TSV) format, with one orthologous gene-pair per line. It has the following columns:

  • query_context: the entered search term
  • query_species: the selected input species
  • query_gene: the matched input gene symbol
  • target_species: the selected output species
  • gene: official gene symbol, as used in the relevant model organism database
  • gene_reports: gene IDs at model organism databases, NCBI, Ensembl and/or OMIM
  • source: list of individual prediction tools that support a given gene-pair relationship
  • score: a simple score indicating the number of tools that support a given gene-pair relationship
  • best_score: either ‘yes’ or ‘no’ to indicate whether the given gene has the highest score for the query gene
  • best_reverse_score: either ‘yes’ or ‘no’ to indicate whether the query gene has the highest score for the given gene in the reciprocal search
  • transgene_in_fly: Where applicable, the FlyBase gene ID and symbol for a non-Drosophila gene where that gene has been expressed transgenically in Drosophila

Clicking on the 'Exclude scores <3' text at the top of the results page removes any pairwise calls with a DIOPT score less than 3 (i.e. calls supported by only 1 or 2 individual prediction tools), which is useful if the unfiltered list is long owing to many low-scoring calls. The text then switches to 'Show scores <3'; clicking it again restores the full list.
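The same score filter can be applied offline to a downloaded TSV file. The Python sketch below is written under stated assumptions: the file is assumed to begin with a header row using the column names listed above, and the `score` column is assumed to hold a plain integer. The sample rows are placeholders, not real DIOPT output.

```python
import csv
import io

COLUMNS = [
    "query_context", "query_species", "query_gene", "target_species", "gene",
    "gene_reports", "source", "score", "best_score", "best_reverse_score",
    "transgene_in_fly",
]

# Placeholder rows for illustration only (hypothetical genes and tools).
ROWS = [
    ["geneX", "D. melanogaster", "geneX", "H. sapiens", "GENE1", "",
     "toolA,toolB,toolC,toolD", "4", "yes", "yes", ""],
    ["geneX", "D. melanogaster", "geneX", "H. sapiens", "GENE2", "",
     "toolA", "1", "no", "no", ""],
]
SAMPLE_TSV = "\n".join("\t".join(row) for row in [COLUMNS] + ROWS)

def filter_by_score(tsv_text, min_score=3):
    """Keep only gene pairs whose score is at least min_score."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [row for row in reader if int(row["score"]) >= min_score]

for row in filter_by_score(SAMPLE_TSV):
    print(row["query_gene"], row["gene"], row["score"])
```

To use this on a real download, read the saved file's contents into a string and pass it to `filter_by_score`; adjust the parsing if the score turns out to be formatted differently in your file.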

Phenotype tab

This tool allows searching for alleles that have particular phenotypes. The form is divided into two portions, which may be used independently or in combination.

The top section searches for alleles with a particular phenotypic class (e.g. "lethal" or "behavior defective"). You can refine this search further using the refinement boxes, searching for a phenotype that occurs at a particular developmental stage (e.g. an embryonic stage) and/or under particular conditions (e.g. "recessive" or "heat sensitive").

The bottom section searches for alleles that show a phenotype in a particular tissue or cell type (e.g. "wing" or "RP2 neuron"). This uses terms from the anatomy controlled vocabulary or cellular component terms from the Gene Ontology controlled vocabulary. Again, you can refine this search further using the refinement boxes.

A coordinated auto-completion feature will assist you in finding the appropriate controlled vocabulary (CV) terms that have been used during the curation of each phenotype. The refinement boxes will only suggest terms that have been curated in combination with the main search term. Please note that this auto-completion works within each of the two sections, but not between them. This means that, even when using auto-completion suggestions, it is possible to enter a combination of terms across both sections of the form that returns zero hits.

Please see the related Video Tutorial 'Finding genes with similar phenotypes'.

Protein Domains tab

Search for genes whose product(s) have a specified domain, repeat or site, or belong to a particular protein family, as defined by InterPro. (See the InterPro FAQs page for an explanation of different signature types.)

To use, either start typing and select an InterPro term from the auto-complete suggestions (recommended) or enter your own term, using wildcard(s) (*) if desired. Then click the 'Search' button or press 'enter'. Resulting hits will be genes whose protein products are associated with an InterPro signature that wholly or partially matches your search term. For example, a search for 'Ubiquitin' will retrieve hits to the InterPro family 'Ubiquitin', as well as other InterPro signatures that contain that word ('Ubiquitin domain', 'Ubiquitin conserved site' etc.). If you wish to retrieve only hits annotated with a specific InterPro signature, use the InterPro ID (e.g. 'IPR019956').

References tab

This tab searches the extensive FlyBase bibliography. To use, first select which fields you wish to search by checking the appropriate box(es) at the top, then enter one or more search term(s) as appropriate. Note that certain fields allow the use of Boolean operators (AND, OR, NOT), the year field supports mathematical comparison symbols (>,>=,<,<=) and range indicators (-,--,..), and the FlyBase auto-complete feature is active in applicable fields. Wildcards (*) can be added to any search term except for 'Year'. This functionality is summarized in the following table:

Field                 Boolean terms accepted?   Auto-complete active?   Wild cards accepted?   Example
Author                Yes                       Yes                     Yes                    Smith NOT Johnson
Year                  Yes                       No                      No                     2004-2008
Title/Abstract        No                        No                      Yes                    metabolomics
Journal               Yes                       Yes                     Yes                    Dev. Biol.
FBrf/PMID/PMCID/DOI   No                        No                      Yes                    FBrf0126983
Publication type      Yes                       Yes                     Yes                    paper
All report fields     No                        No                      Yes                    dpp
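
The comparison symbols and range indicators accepted by the Year field can be illustrated with a small Python sketch. It is an interpretation of the notation for illustration, not FlyBase code: the function name is hypothetical, and it assumes four-digit years and that ranges include both end years.

```python
import operator
import re

COMPARISONS = {
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
}

def year_matches(expression, year):
    """True if `year` satisfies a Year-field expression such as '>=2004' or '2004-2008'."""
    expr = expression.strip()

    # Range forms: 2004-2008, 2004--2008, 2004..2008 (assumed inclusive).
    in_range = re.fullmatch(r"(\d{4})\s*(?:\.\.|--|-)\s*(\d{4})", expr)
    if in_range:
        low, high = sorted(int(value) for value in in_range.groups())
        return low <= year <= high

    # Comparison forms: >2004, >=2004, <2008, <=2008.
    comparison = re.fullmatch(r"(>=|<=|>|<)\s*(\d{4})", expr)
    if comparison:
        return COMPARISONS[comparison.group(1)](year, int(comparison.group(2)))

    # A bare year matches only itself.
    return re.fullmatch(r"\d{4}", expr) is not None and int(expr) == year

for expression in ["2004-2008", "2004..2008", ">=2004", ">2004"]:
    print(expression, year_matches(expression, 2004))
```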