Difference between revisions of "FlyBase:FAQ"

(format is changed, new FAQ's added)

Revision as of 09:35, 11 November 2022


FlyBase FAQ
1. Bulk data retrieval
1.1. What sort of bulk data files do you offer?
The data sets available are described on the Downloads Overview page.
2. Fast-Track Your Paper
2.1. I was contacted by FlyBase about my publication. I don't think it is relevant since I didn't investigate any Drosophila melanogaster genes.
It is helpful to FlyBase curators to know that your paper has no FlyBase-curatable data; this gives us more time to devote to data-rich papers. The Fast-Track Your Paper tool allows you to quickly indicate that your paper has no curatable data types or investigates no Drosophila melanogaster genes.
2.2. For "new transgene" under "Drosophila reagents", I constructed a plasmid containing an existing known D. melanogaster gene. Is this considered as a "new transgene"?
If you went on to use the plasmid in vivo (e.g. injected a UAS-geneX construct into Drosophila melanogaster) then this would be considered a new transgene. If you used the plasmid only in cultured cells or in vitro, we do not consider it a new transgene. Please find further information about the FTYP tool here.
2.3. I just went to Fast-Track a recent publication but it was not found by the FlyBase "Fast-Track Your Paper" tool.
If it is a very recent paper, it may not have entered our bibliography. This means the paper is not available to the Fast Track Your Paper tool yet. Please try again in one or two weeks.
2.4. I was asked to complete the FTYP form for my recently published paper. The paper describes many genes, but it's actually a review article and there is no original data. Should I still list the genes described in this paper as genes that were 'studied'?
Any gene you associate with your review will result in your review appearing on the reference list of that gene report. So, we encourage you to add any gene(s) that are the focus of your review.
2.5. I submitted an FTYP form but my data is not visible on FlyBase. When will it show on the relevant gene reports?
We only update the public web pages every two months, so depending on when you submit the Fast-Track information, it can take between 3 and 12 weeks for the gene information to appear on gene reports.
2.6. I received a link to Fast-Track our paper but when I go to the link it shows that the paper has been already curated. What should I do?
That link is no longer available because your paper has already been curated by a FlyBase curator or by another Fast-Track contributor. You don't need to take any further action.
2.7. I have used some genetic reagents but I have not investigated their roles in my paper. I just used them as tools (e.g. selectable marker in crosses, GAL4 driver, FLP to induce clones). Shall I mention these genes in the field "gene studied" of FTYP?
No, we are only interested in the genes that you studied (e.g. to analyse their role in development, to determine the mutant phenotype, or to determine where they are expressed), so there is no need to add genes for genetic reagents used as tools.
2.8. I submitted the information to Fast-Track our recent manuscript in which we made a new transgene expressing a human gene. However, when selecting genes studied, the system did not give me the option to select the corresponding human gene. How shall I proceed?
If you have made the first reported transgenic construct of a human gene in flies, there will not yet be a FlyBase gene report for that gene. In the Data section (before gene selection), make sure the box for 'New transgene' has been selected, and a FlyBase curator will attach that gene (and the construct) to your paper, after making the appropriate gene report.
2.9. I would like to use Fast-Track for our paper but the email request has gone to another author. Can I still fill in the FTYP form?
Yes, you can use the original link that was provided in the email and then change the email address to your own manually in the 'Contact' step.
3. FlyBase fee
3.1. How can I pay the FlyBase fee? Can I pay the fee for my entire lab/institution/company?
You can pay the FlyBase fee at this link. We also have a dedicated FlyBase Fees FAQ. Please contact us to discuss institutional/departmental/corporate fees.
4. FlyBase people database
4.1. Is the FlyBase people database still active?
No, the FlyBase people database was retired.
5. Genome browser (JBrowse and GBrowse)
5.1. Is GBrowse being discontinued?
Yes, GBrowse is no longer being maintained, and GBrowse access will be discontinued in FlyBase release FB2022_06. All GBrowse tracks and features are now also available in JBrowse. Please use JBrowse for your queries.
5.2. How can I download FASTA sequence from JBrowse?
Please find instructions in this FlyBase tweetorial.
6. Gene data
6.1. How can I find transgenic constructs with particular characteristics?
Please see the commentary on FlyBase Experimental Tool Reports.
6.2. How can I obtain flanking gene sequence?
To obtain flanking sequence for a gene of interest, go to the gene page for that gene. Go to the "Genomic Location" section and click on the "Get Decorated FASTA" link. You can specify the amount of flanking sequence that you wish to have displayed in the "Additional upstream / downstream bases" box.

You can get a multiple FASTA file containing the gene region sequences, including 2000 bp of flanking sequence on either end, for all Dmel genes from the Current Release page in the Genomes: Annotation and Sequence section.

The file: dmel-all-gene_extended2000-r6.nn.fasta contains these sequences. There is also a file with all intergenic sequences.
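Once such a multiple FASTA file has been downloaded, its records can be read with a few lines of code. The following is a minimal sketch of a generic FASTA reader, not a FlyBase tool; the function name `read_fasta` and the two demo records are made up for illustration, and nothing is assumed about how FlyBase lays out its header lines beyond the leading ">".

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line:
            # Sequence lines may be wrapped; collect and join them later.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Made-up records standing in for a real download.
demo = io.StringIO(">geneA example header\nACGT\nTTGA\n>geneB\nGGCC\n")
for header, sequence in read_fasta(demo):
    print(header, len(sequence))
```

For a real file, pass an open file object instead, e.g. `open(path)` where `path` points to your downloaded copy.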

7. Gene model and genome annotation
7.1. What happened to my gene? Its gene report says it is withdrawn.
Genes are occasionally withdrawn because of a lack of supporting evidence. More commonly, a gene model might be merged with another gene, or split into two or more genes. In such cases, the original gene(s) are withdrawn. You can find the fate and/or history of the withdrawn gene in the Relationship to Other Genes subheading (under Other Information in Gene Reports) in both the withdrawn gene's report and in the report(s) of gene(s) resulting from the merge or split.
7.2. I am trying to compare coordinates between the R6 assembly of the sequenced genome and the dm3 assembly, but the Sequence Coordinate Converter tool does not support Release 3 as an Output Assembly option.
The Sequence Coordinate Converter tool does allow this option. The UCSC dm3 reference sequence corresponds to the release_5 (R5) assembly of the D. melanogaster genome, not the R3 assembly.
7.3. I translated a transcript for a protein, and that protein is different from the one FlyBase displays. Why is there this inconsistency?
You have found a gene for which there is a mutation in the sequenced strain. FlyBase has so far identified 64 genes in which a mutation disrupts the coding sequence; we provide a corrected CDS for affected transcripts. Such genes will have a comment about the mutation in the 'Comments on Gene Model' subsection of the 'Gene Model and Products' section of the relevant gene report. You can find the list of affected genes in the Release Notes, TABLE 5: Genes with Known Disruptive Mutations in the iso-1 Reference Sequence. Please also see this FlyBase paper.
7.4. These two genes map to the same genomic coordinates, and have the same transcript, but FlyBase says they are different genes. How can this be?
You have found a pair of dicistronic genes, in which there are two distinct protein coding regions with overlapping transcripts. Dicistronic gene pairs can include both dicistronic transcripts that encode the CDS of both genes, as well as monocistronic transcripts that encode the CDS of only one of the affected genes. Such genes will have a gene_with_dicistronic_mRNA SO entry in the 'Sequence Ontology: Class of Gene' subsection of the 'Gene Model and Products' section of the relevant gene report. You can find more information about dicistronic genes and other exceptional annotation cases in the FlyBase paper "Gene Model Annotations for Drosophila melanogaster: The Rule Benders".
7.5. Part of this transcript is on the positive strand, and part is on the negative strand. Is there a mistake?
You have found a gene with a trans-spliced transcript. Such genes will have a gene_with_trans_spliced_transcript SO entry in the 'Sequence Ontology: Class of Gene' subsection of the 'Gene Model and Products' section of the relevant gene report. You can find more information about trans-spliced genes and other exceptional annotation cases in the FlyBase paper "Gene Model Annotations for Drosophila melanogaster: The Rule Benders".
7.6. The protein encoded by this transcript does not include the first ATG in the ORF/extends beyond a stop codon/does not start with an AUG codon.
There are a number of exceptional gene model annotations documented in FlyBase. These exceptional cases have strong support from multispecies conservation, protein prediction algorithms, and/or published experimental evidence. Each exceptional case will have a clarifying comment in the 'Comments on Gene Model' subsection of the 'Gene Model and Products' section of the relevant gene report, and/or an informative SO term in the 'Sequence Ontology: Class of Gene' subsection of the 'Gene Model and Products' section of the relevant gene report. You can find more information about exceptional annotation cases in the FlyBase paper "Gene Model Annotations for Drosophila melanogaster: The Rule Benders".
7.7. What does the annotation release number mean?
D. melanogaster annotation numbers combine the BDGP reference genome assembly release number and the FlyBase annotation release number for that reference genome assembly, separated by a decimal. For example, the 2022_05 release of FlyBase included the r6.48 D. melanogaster annotation set (the 48th annotation set for Release 6 of the reference genome assembly). You can find the annotation number by selecting "Release Notes" from the "About" section of the blue NavBar at the top of every FlyBase page.

NB - the FlyBase annotation number is not associated with the NCBI release number in any way.
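As an illustration of the numbering scheme, a label such as r6.48 can be split into its two parts. This is only a sketch; the function name `parse_annotation_release` is hypothetical, and the accepted format (an "r" followed by two integers separated by a period) is an assumption based on the r6.48 example above.

```python
import re


def parse_annotation_release(label):
    """Split a label such as 'r6.48' into (assembly_release, annotation_set)."""
    match = re.fullmatch(r"[rR](\d+)\.(\d+)", label.strip())
    if match is None:
        raise ValueError(f"not an annotation release label: {label!r}")
    return int(match.group(1)), int(match.group(2))


assembly, annotation = parse_annotation_release("r6.48")
print(f"assembly Release {assembly}, annotation set {annotation}")
```

Because the part after the period is a counter rather than a decimal fraction, labels are best compared as integer pairs: (6, 5) sorts before (6, 48), whereas the numbers 6.5 and 6.48 would sort the other way round.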

7.8. What reference genome assemblies have been produced for D. melanogaster?
D. melanogaster has had 6 reference genome assemblies produced by BDGP. Release_1 and Release_2 were Celera whole genome shotgun (WGS) assemblies, whereas Releases_3, _4, _5 and _6 were all BDGP BAC tiling array assemblies taken to a high degree of finishing (except for the uncooperative parts of the centric heterochromatin and a small number of other sizeable regions containing highly repetitive sequences such as the histone gene cluster). Because of the WGS nature of Release_1 and Release_2, producing a coordinate converter for these releases to Releases_3, _4, _5, or _6 is not practically possible.
7.9. How do FlyBase/GenBank/BDGP assembly coordinates relate to UCSC coordinates? And where can I find the dm3 genome assembly for D. melanogaster?
An alternative "dm" numbering system from UCSC has sometimes been used to refer to the BDGP reference genome assembly releases, often causing confusion.

The correspondence of "dm" numbers to BDGP releases is as follows:
dm1 = Release 3
dm2 = Release 4
dm3 = Release 5
dm6 = Release 6
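The correspondence can be captured in a small lookup table, which helps avoid mixing up "dm3" with Release 3. This is a sketch only; the names `UCSC_TO_BDGP` and `bdgp_release_for` are hypothetical, and the table contains just the four pairs listed above.

```python
# UCSC "dm" assembly name -> BDGP reference genome assembly release number.
UCSC_TO_BDGP = {"dm1": 3, "dm2": 4, "dm3": 5, "dm6": 6}


def bdgp_release_for(ucsc_name):
    """Return the BDGP release number for a UCSC 'dm' assembly name."""
    try:
        return UCSC_TO_BDGP[ucsc_name.strip().lower()]
    except KeyError:
        raise ValueError(f"no BDGP release listed for {ucsc_name!r}") from None


print(bdgp_release_for("dm3"))
```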

7.10. How can I submit a correction to the genomic sequence of D. melanogaster?
The D. melanogaster reference genome assembly was generated by the BDGP and represents a specific assembly product using the sequenced strain. It is not within the purview of FlyBase to make corrections to that reference assembly. However, FlyBase can make necessary corrections to gene and transcript annotations - simply contact us.
7.11. How can I submit a correction to a gene model of D. melanogaster?
If the correction (e.g., new splice isoform, new TSS) is associated with a published paper, please use the Fast-Track Your Paper tool, and under "Genome Annotation Data", tick the box relating to transcript/polypeptide structure changes. Otherwise, please contact us via the FlyBase contact form - we can incorporate the data and attribute it to a personal communication.
7.12. How do FlyBase transcript annotations compare to NCBI RefSeq annotations for D. melanogaster?
NCBI RefSeq transcript annotations for D. melanogaster are taken directly from the annual FlyBase submission to GenBank. NCBI RefSeq annotations (from FlyBase) are in turn used by the UCSC genome browser. EBI Ensembl D. melanogaster annotations are obtained directly from FlyBase. As such, FlyBase is the definitive and original source for D. melanogaster transcript annotations available from most sites.
7.13. Why are all theoretically possible transcripts of a gene not annotated? How does FlyBase decide which transcripts to annotate?
For an explanation of the FlyBase gene annotation process, please see the FlyBase paper "Gene Model Annotations for Drosophila melanogaster: Impact of High-Throughput Data".
7.14. What does the annotation release number mean?
BDGP genome assemblies are referred to using a 'Release_#' notation. The decimal in FlyBase release notation refers to an annotation set produced by FlyBase that is attached to that assembly; e.g., the release for April 2019 was Release_6.27. Note that feature coordinates will only change with new release assemblies. If you want information on the current assembly and current annotation set, you should look at FlyBase's Release Notes, opening up the subsection entitled "Drosophila melanogaster (R#.##)".

D. melanogaster has had 6 genome assemblies produced by BDGP. Release_1 and Release_2 were Celera whole genome shotgun (WGS) assemblies, whereas Releases_3, _4, _5 and _6 were all BDGP BAC tiling array assemblies taken to a high degree of finishing (except for the uncooperative parts of the centric heterochromatin and a small number of other sizeable regions containing highly repetitive sequences such as the histone gene cluster). Because of the WGS nature of Release_1 and Release_2, producing a coordinate converter for these releases to Releases_3, _4, _5, or _6 is not practically possible.

The D. pseudoobscura release 3 assembly was produced by the Baylor HGSC as an improvement to its release 1 and 2 assemblies. The assemblies in FlyBase for the other sequenced Drosophila species are the initial frozen CAF1 assemblies that were produced as part of the same project. More information can be found at the Drosophila pseudoobscura Genome Project site.

7.15. How do I convert coordinates from one genome release to another?
You can use the FlyBase Sequence Coordinates Converter tool to forward-migrate coordinates from Releases 3, 4 or 5 to Release 4, 5, or 6. There is a link on that page for a tool that offers back conversion from Release 6 to Release 5. No tool exists for migrating from BDGP/FlyBase/GenBank Releases 1 or 2.
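The supported directions can be summarised as a simple check. This is a sketch of the rules as stated above, not of the converter itself; the function name `conversion_supported` is hypothetical, and it assumes that "forward-migrate" means the target release is higher than the source release.

```python
FORWARD_SOURCES = {3, 4, 5}
FORWARD_TARGETS = {4, 5, 6}


def conversion_supported(source_release, target_release):
    """Return True if the FAQ describes a tool for this release pair."""
    forward = (
        source_release in FORWARD_SOURCES
        and target_release in FORWARD_TARGETS
        and target_release > source_release  # assumption: forward = to a later release
    )
    back = (source_release, target_release) == (6, 5)
    return forward or back


for pair in [(3, 6), (6, 5), (1, 3), (6, 3)]:
    print(pair, conversion_supported(*pair))
```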
7.16. How can I submit a correction to the genomic sequence or gene model of a Drosophila species other than D. melanogaster?
New releases of genomic sequences and gene model annotations of the "other" Drosophila species (other than D. melanogaster) are now managed by NCBI (https://www.ncbi.nlm.nih.gov/genome/browse#!/overview/Drosophila, https://www.ncbi.nlm.nih.gov/genome/annotation_euk/). If you have new or more reliable sequence data, we suggest that you submit your sequence to GenBank. You can then send us a personal communication with the accession number and a description. We will add that communication and the accession number to the gene records affected, so that other users of FlyBase may benefit from your improved data.

For gene models, if you have sequenced a cDNA, you should submit that sequence to GenBank. We have an automatic pipeline from NCBI for handling cDNA data. The accession number will be incorporated into the gene report and an alignment shown on JBrowse.

If you do not have a sequenced cDNA, but have a proposed correction determined by other means, you may be able to submit it to GenBank as a TPA (third party annotation). Again, in this case, we would appreciate a personal communication that includes the accession number and a description of the error.

If your correction cannot be appropriately submitted to GenBank by any of these routes, we will accept it as a personal communication, but this is not preferable. The sequence would not be available via BLAST or other queries, nor would it appear as an alignment in JBrowse.

8. Gene name/rename
8.1. When (under what circumstances) does FlyBase consider changing a designated gene name?
We’re usually quite conservative about making changes, certainly when a symbol/name has been well-used in the literature, so decisions are made on a case-by-case basis. If it’s clear the relevant community prefers to use a different symbol/name than the current one in FlyBase, we will consider changing it. That decision can be prompted by preferred usage of an alternative in the published literature, or by the interested parties publishing a report or contacting us directly with the intended change. We will also consider a change where a symbol is inaccurate/misleading, is too similar to the symbol of a different gene, is offensive in some way, or to rationalize nomenclature within a gene family, etc. Please see our nomenclature guidelines here.
9. Job postings
9.1. I want to post a job on FlyBase; how can I do that?
You can post a position in the FlyBase Forum, which you can reach by clicking the 'Positions Available' icon under the 'Meetings/Courses/Jobs' tab of the External Resources sidebar on the FlyBase homepage.
10. Meetings or courses
10.1. I am organizing a Drosophila meeting. How can I add this meeting to the Drosophila Meetings page?
Meeting announcements are now hosted at the Fly Research Portal, which you can reach via the 'Meetings Courses' icon in the left sidebar of the FlyBase homepage. Please submit your event using the link at the bottom of the Fly Research Portal Events page.
11. Nomenclature
11.1. I identified a gene and would like to name it. Are there any FlyBase guidelines for this?
Yes, there are. Please see our nomenclature guidelines here. We recommend mentioning a unique FlyBase identifier (FBgn#) along with the new name in your paper.
12. Non-melanogaster species
12.1. I can't find gene model or genome assembly information for my favorite non-melanogaster Drosophila species. What happened to that information?
FlyBase stopped supporting gene model annotation and genome assembly information for species other than Drosophila melanogaster in FlyBase release FB2018_06. You can find the last NCBI annotation update for these species in this FlyBase archived release, and current annotation information at the NCBI page Eukaryotic genomes annotated at NCBI.
13. Referencing FlyBase
13.1. Can I use the FlyBase logo in my publication/poster?
The use of the FlyBase logo is permitted unless it is for commercial gain. Please use this guideline about citing FlyBase.
13.2. I retrieved some information from FlyBase for a publication and am wondering how to cite FlyBase in my publication. Are there any guidelines that I can follow?
Yes, there are. Please use this guideline about citing FlyBase.
14. Stocks
14.1. I would like to donate some flies to Bloomington Drosophila Stock Center. What is the procedure for this?
Please contact BDSC directly at (flystock AT indiana DOT edu).
14.2. How can I obtain a stock listed in FlyBase? Can I order that stock from FlyBase?
FlyBase does not distribute fly stocks. Please contact the stock center that provides the fly lines you are interested in; FlyBase stock reports include a link to the relevant stock center's order form at the bottom of the report. You can also find a list of stock centers here.
15. Submitting data before publication
15.1. I have a preprint. Should I wait for the paper to be published in a peer-reviewed journal, or can I submit some information to FlyBase?
If you intend to submit your data for peer-review and publish in a journal, then yes, it would be best to wait for that to happen before adding any information to FlyBase. That way, we know we’re curating the final dataset (following peer-review and any additions/corrections etc) and can attribute the information to the final published account.
15.2. I have some experimental data that I am not planning on publishing. Can I submit those data to FlyBase?
Yes, we have a way to incorporate unpublished data. Please send us an email with the information so that we can decide whether it is appropriate to add it to FlyBase as a 'personal communication'. You can see the kinds of data appropriate for a personal communication with a QuickSearch References tab query: set the Year field to a range of the last 3 years and Publication type to "personal communication to FlyBase". Additional information about submitting a personal communication can be found here.

You might also consider if your data is appropriate for a micropublication in https://www.micropublication.org.

16. Web server
16.1. How do I access data from an older release that is not available via an active archive server?
All data from every release of FlyBase is available via our FTP server (https://ftp.flybase.org/releases ; https://ftp.flybase.org/genomes). Please keep in mind that access to old servers is limited to those that are currently running.

1. Bulk data retrieval

1.1. What sort of bulk data files do you offer?

The data sets available are described on the Downloads Overview page.

2. Fast-Track Your Paper

2.1. I was contacted by FlyBase about my publication. I don't think it is relevant since I didn't investigate any Drosophila melanogaster genes.

It is helpful to FlyBase curators to know that your paper has no FlyBase-curatable data; this gives us more time to devote to data-rich papers. The Fast-Track Your Paper tool allows you to quickly indicate that your paper has no curatable data types or investigates no Drosophila melanogaster genes.

2.2. For "new transgene" under "Drosophila reagents", I constructed a plasmid containing an existing known D. melanogaster gene. Is this considered as a "new transgene"?

If you went on to use the plasmid in vivo (e.g. injected a UAS-geneX construct into Drosophila melanogaster) then this would be considered a new transgene. If you used the plasmid only in cultured cells or in vitro, we do not consider it a new transgene. Please find futher information about FTYP tool here.

2.3. I just went to Fast-Track a recent publication but it was not found by the FlyBase "Fast-Track Your Paper" tool.

If it is a very recent paper, it may not have entered our bibliography. This means the paper is not available to the Fast Track Your Paper tool yet. Please try again in one or two weeks.

2.4. I was asked to complete the FTYP form for my paper published recently. The paper describes many genes, but it's actually a review article and there is no original data. Should I still list up genes described in this paper as genes that were 'studied'?

Any gene you associate with your review will result in your review appearing on the reference list of that gene report. So, we encourage you add any gene(s) that are the focus of your review.

2.5. I submitted an FTYP form but my data is not visible on FlyBase. When will it show on the relevant gene reports?

We only update the public web pages every two months, so depending on when you submit the Fast-Track information, it can take between 3-12 weeks for the gene information to appear on gene reports.

2.6. I received a link to Fast-Track our paper but when I go to the link it shows that the paper has been already curated. What should I do?

The reason that that link is no longer available is because your paper has already been curated by a FlyBase curator or by another Fast-Track contributor. You don't need to take any further action.

2.7. I have used some genetic reagents but I have not investigated their roles in my paper. I just used them as tools (e.g. selectable marker in crosses, GAL4 driver, FLP to induce clones). Shall I mention these genes in the field "gene studied" of FTYP?

No, we are only interested in the genes that you studied (e.g. to analyse their role in development, to determine the mutant phenotype, or to determine where they are expressed), so there is no need to add genes for genetic reagents used as tools.

2.8. I submitted the information to Fast-Track our recent manuscript in which we made a new transgene expressing a human gene. However, when selecting genes studied, the system did not give me the option to select the corresponding human gene. How shall I proceed?

If you have made the first reported transgenic construct of a human gene in flies, there will not yet be a FlyBase gene report for that gene. In the Data section (before gene selection), make sure the box for 'New transgene' has been selected, and a FlyBase curator will attach that gene (and the construct) to your paper, after making the appropriate gene report.

2.9. I would like to use Fast-Track for our paper but the email request has gone to another author. Can I still fill in the FTYP from?

Yes, you can use the original link that was provided in the email and then change the email address to your own manually in the 'Contact' step.

3. FlyBase fee

3.1. How can I pay the FlyBase fee? Can I pay the fee for my entire lab/institution/company?

You can pay the FlyBase fee at this link. We also have a dedicated FlyBase Fees FAQ. Please contact us to discuss institutional/departmental/corporate fees.

4. FlyBase people database

4.1. Is FlyBase people database still active?

No, the FlyBase people database was retired.

5. Genome browser (JBrowse and GBrowse)

5.1. Is GBrowse being discontinued?

Yes, GBrowse is no longer being maintained, and GBrowse access will be discontinued in FlyBase release FB2202_06. All GBrowse tracks and features are now also available in JBrowse. Please use JBrowse for your queries.

5.2. How can I download FASTA sequence from JBrowse?

Please find instructions in this FlyBase tweetorial.

6. Gene data

6.1. How can I find transgenic constructs with particular characteristics?

Please see the commentary on FlyBase Experimental Tool Reports.

6.2. How can I obtain flanking gene sequence?

To obtain flanking sequence for a gene of interest, go to the gene page for that gene. Go to the "Genomic Location" section and click on the "Get Decorated FASTA" link. You can specify the amount of flanking sequence that you wish to have displayed in the "Additional upstream / downstream bases" box.

You can get a multiple FASTA file containing the gene region sequences including 2000 bp flanking on either end for all Dmel genes from the Current Relase page in the Genomes: Annotation and Sequence section.

The file: dmel-all-gene_extended2000-r6.nn.fasta contains these sequences. There is also a file with all intergenic sequences.

7. Gene model and genome annotation

7.1. What happened to my gene? Its gene report says it is withdrawn.

Genes are occasionally withdrawn because of a lack of supporting evidence. More commonly, a gene model might be merged with another gene, or split into two or more genes. In such cases, the original gene(s) are withdrawn. You can find the fate and/or history of the withdrawn gene in the Relationship to Other Genes subheading (under Other Information in Gene Reports) in both the withdrawn gene's report and in the report(s) of gene(s) resulting from the merge or split.

7.2. I am trying to compare coordinates between the R6 assembly of the sequenced genome and the dm3 assembly, but the Sequence Coordinate Converter tool does not support Release 3 as an Output Assembly option.

The Sequence Coordinate Converter tool does allow this option. The UCSC dm3 reference sequence corresponds to the release_5 (R5) assembly of the D. melanogaster genome, not the R3 assembly.

7.3. I translated a transcript for a protein, and that protein is different from the one FlyBase displays. Why is there this inconsistency?

You have found a gene for which there is a mutation in the sequenced strain. FlyBase has so far identified 64 genes in which a mutation disrupts the coding sequence; we provide a corrected CDS for affected transcripts. Such genes will have a comment about the mutation in the 'Comments on Gene Model' subsection of the 'Gene Model and Products' section of the relevant gene report. You can find the list of affected genes in the Release Notes, TABLE 5: Genes with Known Disruptive Mutations in the iso-1 Reference Sequence. Please also see this FlyBase paper.

7.4. These two genes map to the same genomic coordinates, and have the same transcript, but FlyBase says they are different genes. How can this be?

You have found a pair of dicistronic genes, in which there are two distinct protein coding regions with overlapping transcripts. Dicistronic gene pairs can include both dicistronic transcripts that encode the CDS of both genes and monocistronic transcripts that encode the CDS of only one of the affected genes. Such genes will have a gene_with_dicistronic_mRNA SO entry in the 'Sequence Ontology: Class of Gene' subsection of the 'Gene Model and Products' section of the relevant gene report. You can find more information about dicistronic genes and other exceptional annotation cases in the FlyBase paper "Gene Model Annotations for Drosophila melanogaster: The Rule Benders".

7.5. Part of this transcript is on the positive strand, and part is on the negative strand. Is there a mistake?

You have found a gene with a trans-spliced transcript. Such genes will have a gene_with_trans_spliced_transcript SO entry in the 'Sequence Ontology: Class of Gene' subsection of the 'Gene Model and Products' section of the relevant gene report. You can find more information about trans-spliced genes and other exceptional annotation cases in the FlyBase paper "Gene Model Annotations for Drosophila melanogaster: The Rule Benders".

7.6. The protein encoded by this transcript does not include the first ATG in the ORF/extends beyond a stop codon/does not start with an AUG codon.

There are a number of exceptional gene model annotations documented in FlyBase. These exceptional cases have strong support from multispecies conservation, protein prediction algorithms, and/or published experimental evidence. Each exceptional case will have a clarifying comment in the 'Comments on Gene Model' subsection of the 'Gene Model and Products' section of the relevant gene report, and/or an informative SO term in the 'Sequence Ontology: Class of Gene' subsection of the 'Gene Model and Products' section of the relevant gene report. You can find more information about exceptional annotation cases in the FlyBase paper "Gene Model Annotations for Drosophila melanogaster: The Rule Benders".

7.7. What does the annotation release number mean?

D. melanogaster annotation numbers combine the BDGP reference genome assembly release number and the FlyBase annotation release number for that reference genome assembly, separated by a decimal. For example, the 2022_05 release of FlyBase included the r6.48 D. melanogaster annotation set (the 48th annotation set for Release 6 of the reference genome assembly). You can find the annotation number by selecting "Release Notes" from the "About" section of the blue NavBar at the top of every FlyBase page. NB - the FlyBase annotation number is not associated with the NCBI release number in any way.
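Because the annotation number is two integers joined by a decimal point rather than a true decimal number, it is safer to split it into its parts than to treat it as a float. The Python sketch below does this for labels such as "r6.48" or "Release_6.27". The function name and the set of accepted prefixes are assumptions made for illustration, not a FlyBase-defined format specification.

```python
import re
from typing import NamedTuple


class AnnotationRelease(NamedTuple):
    assembly: int    # BDGP reference genome assembly release, e.g. 6
    annotation: int  # FlyBase annotation set for that assembly, e.g. 48


# Accepts an optional "r", "R" or "Release_" prefix (an assumption of
# this sketch), followed by <assembly>.<annotation>.
_LABEL = re.compile(r"^(?:Release_|r|R)?(\d+)\.(\d+)$")


def parse_annotation_release(label: str) -> AnnotationRelease:
    """Split a label such as 'r6.48' into (assembly, annotation)."""
    match = _LABEL.match(label.strip())
    if match is None:
        raise ValueError(f"not an annotation release label: {label!r}")
    return AnnotationRelease(int(match.group(1)), int(match.group(2)))
```

Parsed releases compare as integer pairs, so two annotation sets on the same assembly can be ordered without floating-point surprises.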

7.8. What reference genome assemblies have been produced for D. melanogaster?

D. melanogaster has had 6 reference genome assemblies produced by BDGP. Release_1 and Release_2 were Celera whole genome shotgun (WGS) assemblies, whereas Releases_3, _4, _5 and _6 were all BDGP BAC tiling array assemblies taken to a high degree of finishing (except for the uncooperative parts of the centric heterochromatin and a small number of other sizeable regions containing highly repetitive sequences such as the histone gene cluster). Because of the WGS nature of Release_1 and Release_2, producing a coordinate converter for these releases to Releases_3, _4, _5, or _6 is not practically possible.

7.9. How do FlyBase/GenBank/BDGP assembly coordinates relate to UCSC coordinates? And where can I find the dm3 genome assembly for D. melanogaster?

An alternative "dm" numbering system from UCSC has sometimes been used to refer to the BDGP reference genome assembly releases, often causing confusion. The correspondence of "dm" numbers to BDGP releases is as follows:

dm1 = Release 3
dm2 = Release 4
dm3 = Release 5
dm6 = Release 6
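For scripts that handle data labeled with either naming scheme, the correspondence can be kept as a small lookup table. This Python sketch encodes only the four pairs listed above; the function names are illustrative.

```python
# UCSC "dm" assembly name -> BDGP release number, as listed in this FAQ.
UCSC_TO_BDGP = {"dm1": 3, "dm2": 4, "dm3": 5, "dm6": 6}

# Reverse lookup: BDGP release number -> UCSC name.
BDGP_TO_UCSC = {release: name for name, release in UCSC_TO_BDGP.items()}


def bdgp_release_for(ucsc_name: str) -> int:
    """Return the BDGP release number for a UCSC assembly name."""
    key = ucsc_name.strip().lower()
    if key not in UCSC_TO_BDGP:
        raise KeyError(f"no BDGP release listed for {ucsc_name!r}")
    return UCSC_TO_BDGP[key]


def ucsc_name_for(bdgp_release: int) -> str:
    """Return the UCSC assembly name for a BDGP release number."""
    if bdgp_release not in BDGP_TO_UCSC:
        raise KeyError(f"no UCSC name listed for Release {bdgp_release}")
    return BDGP_TO_UCSC[bdgp_release]
```

Note that the numbers do not line up one-to-one: dm3 is Release 5, not Release 3, which is the usual source of confusion.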

7.10. How can I submit a correction to the genomic sequence of D. melanogaster?

The D. melanogaster reference genome assembly was generated by the BDGP and represents a specific assembly product using the sequenced strain. It is not within the purview of FlyBase to make corrections to that reference assembly. However, FlyBase can make necessary corrections to gene and transcript annotations - simply contact us.

7.11. How can I submit a correction to a gene model of D. melanogaster?

If the correction (e.g., new splice isoform, new TSS) is associated with a published paper, please use the Fast-Track Your Paper tool, and under "Genome Annotation Data", tick the box relating to transcript/polypeptide structure changes. Otherwise, please contact us via the FlyBase contact form - we can incorporate the data and attribute it to a personal communication.

7.12. How do FlyBase transcript annotations compare to NCBI RefSeq annotations for D. melanogaster?

NCBI RefSeq transcript annotations for D. melanogaster are taken directly from the annual FlyBase submission to GenBank. NCBI RefSeq annotations (from FlyBase) are in turn used by the UCSC genome browser. EBI Ensembl D. melanogaster annotations are obtained directly from FlyBase. As such, FlyBase is the definitive and original source for D. melanogaster transcript annotations available from most sites.

7.13. Why are not all theoretically possible transcripts of a gene annotated? How does FlyBase decide which transcripts to annotate?

For an explanation of the FlyBase gene annotation process, please see the FlyBase paper "Gene Model Annotations for Drosophila melanogaster: Impact of High-Throughput Data".

7.14. What does the annotation release number mean?

BDGP genome assemblies are referred to using a 'Release_#' notation. The decimal in FlyBase release notation refers to an annotation set produced by FlyBase that is attached to that assembly; e.g., the release for April 2019 was Release_6.27. Note that feature coordinates will only change with new release assemblies. If you want information on the current assembly and current annotation set, you should look at FlyBase's Release Notes, opening up the subsection entitled "Drosophila melanogaster (R#.##)".

D. melanogaster has had 6 genome assemblies produced by BDGP. Release_1 and Release_2 were Celera whole genome shotgun (WGS) assemblies, whereas Releases_3, _4, _5 and _6 were all BDGP BAC tiling array assemblies taken to a high degree of finishing (except for the uncooperative parts of the centric heterochromatin and a small number of other sizeable regions containing highly repetitive sequences such as the histone gene cluster). Because of the WGS nature of Release_1 and Release_2, producing a coordinate converter for these releases to Releases_3, _4, _5, or _6 is not practically possible.

The D. pseudoobscura release 3 assembly was produced by the Baylor HGSC as an improvement to its release 1 and 2 assemblies. The assemblies in FlyBase for the other sequenced Drosophila species are the initial frozen CAF1 assemblies that were produced as part of the same project. More information can be found at the Drosophila pseudoobscura Genome Project site.

7.15. How do I convert coordinates from one genome release to another?

You can use the FlyBase Sequence Coordinates Converter tool to forward-migrate coordinates from Releases 3, 4 or 5 to Release 4, 5, or 6. There is a link on that page for a tool that offers back conversion from Release 6 to Release 5. No tool exists for migrating from BDGP/FlyBase/GenBank Releases 1 or 2.
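The supported conversion routes can be summarized as a small decision function, which may be useful as a sanity check before planning a coordinate migration. This Python sketch encodes only the rules stated above; it assumes that "forward-migrate" means the output release is later than the input release, and the function name and return labels are illustrative. It does not call the converter itself.

```python
def conversion_route(source: int, target: int) -> str:
    """Classify a requested D. melanogaster release-to-release conversion.

    Returns "forward" for forward migration with the Sequence Coordinates
    Converter, "back" for the Release 6 to Release 5 back-conversion tool,
    and "unsupported" otherwise.
    """
    for release in (source, target):
        if release not in range(1, 7):
            raise ValueError(f"unknown BDGP release: {release}")
    # Releases 1 and 2 were WGS assemblies; no migration tool exists.
    if source in (1, 2) or target in (1, 2):
        return "unsupported"
    # Forward migration: from Releases 3, 4 or 5 to a later release.
    if source in (3, 4, 5) and target in (4, 5, 6) and target > source:
        return "forward"
    # The only back conversion mentioned is Release 6 to Release 5.
    if source == 6 and target == 5:
        return "back"
    return "unsupported"
```

For example, data in dm3 coordinates correspond to Release 5, so moving them to the current assembly is a Release 5 to Release 6 forward migration.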

7.16. How can I submit a correction to the genomic sequence or gene model of a Drosophila species other than D. melanogaster?

New releases of genomic sequences and gene model annotations of the "other" Drosophila species (other than D. melanogaster) are now managed by NCBI (https://www.ncbi.nlm.nih.gov/genome/browse#!/overview/Drosophila, https://www.ncbi.nlm.nih.gov/genome/annotation_euk/). If you have new or more reliable sequence data, we suggest that you submit your sequence to GenBank. You can then send us a personal communication with the accession number and a description. We will add that communication and the accession number to the gene records affected, so that other users of FlyBase may benefit from your improved data.

For gene models, if you have sequenced a cDNA, you should submit that sequence to GenBank. We have an automatic pipeline from NCBI for handling cDNA data. The accession number will be incorporated into the gene report and an alignment shown on JBrowse.

If you do not have a sequenced cDNA, but have a proposed correction determined by other means, you may be able to submit it to GenBank as a TPA (third party annotation). Again, in this case, we would appreciate a personal communication that includes the accession number and a description of the error.

If your correction cannot be appropriately submitted to GenBank by any of these routes, we will accept it as a personal communication, but this is not preferable. The sequence would not be available via BLAST or other queries, nor would it appear as an alignment in JBrowse.

8. Gene name/rename

8.1. When (under what circumstances) does FlyBase consider changing a designated gene name?

We’re usually quite conservative about making changes, certainly when a symbol/name has been well-used in the literature, so decisions are made on a case-by-case basis. If it’s clear the relevant community prefers to use a different symbol/name than the current one in FlyBase, we will consider changing it. That decision can be prompted by preferred usage of an alternative in the published literature, or by the interested parties publishing a report or contacting us directly with the intended change. We will also consider a change where a symbol is inaccurate/misleading, is too similar to the symbol of a different gene, is offensive in some way, or to rationalize nomenclature within a gene family, etc. Please see our nomenclature guidelines here.

9. Job postings

9.1. I want to post a job on FlyBase; how can I do that?

You can post a position in the FlyBase Forum, which you can reach by clicking the 'Positions Available' icon under the 'Meetings/Courses/Jobs' tab of the External Resources sidebar on the FlyBase homepage.

10. Meetings and courses

10.1. I am organizing a Drosophila meeting. How can I add this meeting to the Drosophila Meetings page?

Meeting announcements are now hosted at the Fly Research Portal, which you can reach via the 'Meetings Courses' icon in the left sidebar of the FlyBase homepage. Please submit your event using the link at the bottom of the Fly Research Portal Events page.

11. Nomenclature

11.1. I identified a gene and would like to name it. Are there any FlyBase guidelines for this?

Yes, there are. Please see our nomenclature guidelines here. We recommend mentioning a unique FlyBase identifier (FBgn#) along with the new name in your paper.

12. Non-melanogaster species

12.1. I can't find gene model or genome assembly information for my favorite non-melanogaster Drosophila species. What happened to that information?

FlyBase stopped supporting gene model annotation and genome assembly information for species other than Drosophila melanogaster in FlyBase release FB2018_06. You can find the last NCBI annotation update for these species in this FlyBase archived release, and current annotation information at the NCBI page Eukaryotic genomes annotated at NCBI.

13. Referencing FlyBase

13.1. Can I use the FlyBase logo in my publication/poster?

The use of the FlyBase logo is permitted unless it is for commercial gain. Please use this guideline about citing FlyBase.

13.2. I retrieved some information from FlyBase for a publication and am wondering how to cite FlyBase in my publication. Are there any guidelines that I can follow?

Yes, there are. Please use this guideline about citing FlyBase.

14. Stocks

14.1. I would like to donate some flies to Bloomington Drosophila Stock Center. What is the procedure for this?

Please contact BDSC directly at (flystock AT indiana DOT edu).

14.2. How can I obtain a stock listed in FlyBase? Can I order that stock from FlyBase?

FlyBase does not distribute fly stocks. Please contact the stock center that provides the fly lines you are interested in; FlyBase stock reports include a link to the relevant stock center's order form at the bottom of the report. You can also find a list of stock centers here.

15. Submitting data before publication

15.1. I have a preprint. Should I wait for the paper to be published in a peer-reviewed journal, or can I submit some information to FlyBase?

If you intend to submit your data for peer-review and publish in a journal, then yes, it would be best to wait for that to happen before adding any information to FlyBase. That way, we know we’re curating the final dataset (following peer-review and any additions/corrections etc) and can attribute the information to the final published account.

15.2. I have some experimental data that I am not planning on publishing. Can I submit those data to FlyBase?

Yes, we have a way to incorporate unpublished data. Please send us an email with the information so that we can decide whether it is appropriate to add it to FlyBase as a 'personal communication'. You can see the kinds of data appropriate for a personal communication with a QuickSearch References tab query: set the Year field to a range of the last 3 years and Publication type to "personal communication to FlyBase". Additional information about submitting a personal communication can be found here. You might also consider if your data is appropriate for a micropublication in https://www.micropublication.org.

16. Web server

16.1. How do I access data from an older release that is not available via an active archive server?

All data from every release of FlyBase is available via our FTP server (https://ftp.flybase.org/releases ; https://ftp.flybase.org/genomes). Please keep in mind that access to old web servers is limited to those that are currently running.


View FlyBase Help Index Do you still have a question? Contact us.