Difference between revisions of "FlyBase:FAQ"
Revision as of 17:08, 13 August 2019
How can I obtain a stock listed in FlyBase?
Questions about individual stocks listed in FlyBase should be directed to the collection that holds the stock. For collections with web sites, the name in the "Collection" field of the FlyBase Stock Report links to the collection's web site, where contact information can be found. For laboratory collections that do not have web sites, an e-mail address is included in the "To Request Stock" field of the FlyBase Stock Report. For lists of stock centers and stock collections, see the Stocks page in the External Resources section.
How can I submit a correction to the genomic sequence or gene model of a Drosophila species other than D. melanogaster?
New releases of genomic sequences and gene model annotations for Drosophila species other than D. melanogaster are now managed by NCBI (https://www.ncbi.nlm.nih.gov/genome/browse#!/overview/Drosophila, https://www.ncbi.nlm.nih.gov/genome/annotation_euk/). If you have new or more reliable sequence data, we suggest that you submit your sequence to GenBank. You can then send us a personal communication with the accession number and a description. We will add that communication and the accession number to the affected gene records, so that other users of FlyBase may benefit from your improved data.
For gene models, if you have sequenced a cDNA, you should submit that sequence to GenBank. We have an automatic pipeline from NCBI for handling cDNA data. The accession number will be incorporated into the gene report and an alignment shown on GBrowse.
If you do not have a sequenced cDNA, but have a proposed correction determined by other means, you may be able to submit it to GenBank as a TPA (third party annotation). Again, in this case, we would appreciate a personal communication that includes the accession number and a description of the error.
If your correction cannot be appropriately submitted to GenBank by any of these routes, we will accept it as a personal communication, but this is not the preferred route: the sequence would not be available via BLAST or other queries, nor would it appear as an alignment in GBrowse.
Where can I find information on BDGP Drosophila Gold Collection (DGC) and other cDNA clones?
If there is a full-length DGC clone associated with a gene you will find it in the gene report under the STOCKS and REAGENTS section in its own field. For example: lark lists LD40792 as a BDGP DGC clone. If the "BDGP DGC clones" field does not appear in a report it means that a DGC clone has not been identified for the gene.
Immediately below that section is the section "Drosophila Genomics Resource Center cDNA clones". This includes a link to the relevant DGRC gene page, with a listing of all cDNA clones and cDNA derivatives available from the DGRC for that gene.
Exelixis clones are not available. The decision was made not to save their libraries as they felt the clones were largely overlapping with existing ones.
How can I obtain flanking gene sequence?
To obtain flanking sequence for a gene of interest, go to the gene page for that gene. Go to the "Genomic Location/Sequence" section and click on the "Get Decorated FASTA" link. You can specify the amount of flanking sequence that you wish to have displayed in the "Additional upstream / downstream bases" box.
You can get a multiple FASTA file containing the gene region sequences, including 2000 bp of flanking sequence on either end, for all Dmel genes from the Current Release page in the Genomes: Annotation and Sequence section.
The file: dmel-all-gene_extended2000-r6.nn.fasta contains these sequences. There is also a file with all intergenic sequences.
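Once downloaded, a multiple FASTA file like this can be read with a few lines of code. The following is a minimal sketch in Python, assuming only the generic FASTA layout (a header line starting with ">" followed by one or more sequence lines). The function names read_fasta and sequences_by_id, and the example identifiers geneA and geneB, are hypothetical; the exact header fields in the FlyBase file are not assumed here.

```python
from typing import Dict, Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from the lines of a multi-FASTA file."""
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts; emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def sequences_by_id(lines: Iterable[str]) -> Dict[str, str]:
    """Index records by the first whitespace-delimited token of each header.

    Assumption: the first token of the header identifies the record.
    """
    index = {}
    for header, sequence in read_fasta(lines):
        tokens = header.split()
        index[tokens[0] if tokens else ""] = sequence
    return index


# Made-up records for illustration only (not real FlyBase data).
example = [
    ">geneA hypothetical description",
    "ACGT",
    "ACGT",
    "",
    ">geneB",
    "TTGA",
]
index = sequences_by_id(example)
print(index["geneA"])
```

To read a downloaded file instead of the in-memory example, pass an open file handle, for example sequences_by_id(open(path)), where path points to your local copy of the file.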
What sort of bulk data files do you offer?
The data sets available are described on the Downloads Overview page.
How do I cite the Drosophila phylogeny?
The Drosophila phylogeny shown on FlyBase is a compilation of information that can be found in:
Powell, J.R. (1997) Progress and Prospects in Evolutionary Biology: The Drosophila Model.
Oxford University Press, Inc., New York, BIOSIS ID: 1102964 (FBrf0085765)
An updated discussion of Drosophila phylogeny can be found here:
O'Grady, P.M., DeSalle, R. (2018) Phylogeny of the Genus Drosophila.
Genetics 209(1): 1-25. PMID:29716983 (FBrf0238822)
How can I search the abstracts of the Annual Drosophila Research Conference?
The Annual Drosophila Research Conference is a meeting of the Genetics Society of America, so you can reach abstracts from the Drosophila meetings (back to 2000) on the GSA website:
From there, you can go to the abstract search page to find the abstract of interest, for example:
2019 Abstract Search and Program Planner
What does the annotation release number mean?
BDGP genome assemblies are referred to using a 'Release_#' notation. The decimal in FlyBase release notation refers to an annotation set produced by FlyBase that is attached to that assembly; e.g., the release for April 2019 was Release_6.27. Note that feature coordinates will only change with new release assemblies. If you want information on the current assembly and current annotation set, you should look at FlyBase's Release Notes, opening up the subsection entitled "Drosophila melanogaster (R#.##)".
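The two-part notation can be illustrated with a short Python sketch that splits a label such as Release_6.27 into its assembly number and annotation number. The names Release, parse_release and same_coordinate_system are hypothetical, and the accepted prefixes ("Release_", "R", "r") are an assumption based on the notations that appear in this FAQ.

```python
import re
from typing import NamedTuple


class Release(NamedTuple):
    assembly: int    # BDGP genome assembly, e.g. 6
    annotation: int  # FlyBase annotation set on that assembly, e.g. 27


# Assumed label forms: "Release_6.27", "R6.27", "r6.27" or plain "6.27".
_RELEASE_PATTERN = re.compile(r"^(?:Release_|R|r)?(\d+)\.(\d+)$")


def parse_release(label: str) -> Release:
    """Split a release label into assembly and annotation numbers."""
    match = _RELEASE_PATTERN.match(label.strip())
    if match is None:
        raise ValueError(f"Unrecognized release label: {label!r}")
    return Release(int(match.group(1)), int(match.group(2)))


def same_coordinate_system(a: str, b: str) -> bool:
    """Two releases share feature coordinates when their assemblies match."""
    return parse_release(a).assembly == parse_release(b).assembly


april_2019 = parse_release("Release_6.27")
print(april_2019.assembly, april_2019.annotation)
```

Keeping the two parts as separate integers avoids treating the label as a decimal number, which it is not.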
D. melanogaster has had 6 genome assemblies produced by BDGP. Release_1 and Release_2 were Celera whole genome shotgun (WGS) assemblies, whereas Releases_3, _4, _5 and _6 were all BDGP BAC tiling array assemblies taken to a high degree of finishing (except for the uncooperative parts of the centric heterochromatin and a small number of other sizeable regions containing highly repetitive sequences such as the histone gene cluster). Because of the WGS nature of Release_1 and Release_2, producing a coordinate converter for these releases to Releases_3, _4, _5, or _6 is not practically possible.
The D. pseudoobscura release 3 assembly was produced by the Baylor HGSC as an improvement to its release 1 and 2 assemblies. The assemblies in FlyBase for the other sequenced Drosophila species are the initial frozen CAF1 assemblies that were produced as part of the same project. More information can be found at the Drosophila pseudoobscura Genome Project site.
How do FlyBase/GenBank/BDGP assembly coordinates relate to UCSC coordinates?
The UCSC assembly numbers dm1 through dm3 correspond to BDGP/FlyBase/GenBank assemblies Release 3 through Release 5, respectively. The UCSC assembly dm6 corresponds to BDGP/FlyBase/GenBank assembly Release 6.
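For scripts that mix data from both sources, the correspondence described above can be written as a small lookup table. This is only a sketch in Python; the names UCSC_TO_BDGP_RELEASE and bdgp_release_for are hypothetical, and the table contains only the assemblies mentioned in this answer.

```python
# UCSC assembly name -> BDGP/FlyBase/GenBank release number,
# as stated in the answer above (dm1-dm3 -> Releases 3-5, dm6 -> Release 6).
UCSC_TO_BDGP_RELEASE = {
    "dm1": 3,
    "dm2": 4,
    "dm3": 5,
    "dm6": 6,
}


def bdgp_release_for(ucsc_name: str) -> int:
    """Return the BDGP release number for a UCSC assembly name."""
    key = ucsc_name.strip().lower()
    if key not in UCSC_TO_BDGP_RELEASE:
        raise KeyError(f"No known BDGP release for UCSC assembly {ucsc_name!r}")
    return UCSC_TO_BDGP_RELEASE[key]


print(bdgp_release_for("dm3"))
```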
How do I convert coordinates from one genome release to another?
You can use the FlyBase Sequence Coordinates Converter tool to forward-migrate coordinates from Release 3, 4, or 5 to Release 4, 5, or 6. There is a link on that page to a tool that offers back-conversion from Release 6 to Release 5. No tool exists for migrating from BDGP/FlyBase/GenBank Releases 1 or 2.
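The supported conversions can be summarized as a small check, sketched here in Python. It does not call the converter tool or perform any conversion; it only encodes the rules stated above. The function name conversion_route is hypothetical, and reading "forward-migrate" as "the target release is later than the source release" is an assumption.

```python
FORWARD_SOURCES = {3, 4, 5}  # releases that can be migrated forward
FORWARD_TARGETS = {4, 5, 6}  # releases that can be migrated to


def conversion_route(source: int, target: int) -> str:
    """Classify a release-to-release conversion as described in this FAQ.

    Returns "forward", "back", or "unsupported".
    """
    if (
        source in FORWARD_SOURCES
        and target in FORWARD_TARGETS
        and target > source  # assumption: forward means a later release
    ):
        return "forward"
    if (source, target) == (6, 5):
        return "back"  # the only back-conversion mentioned above
    return "unsupported"


for pair in [(3, 6), (6, 5), (1, 6)]:
    print(pair, conversion_route(*pair))
```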
How do I print FlyBase pages accurately?
We realize that many people like to print out the information provided by FlyBase; however, some current browsers do not render the new FlyBase pages particularly well. If you routinely print out a significant amount of information from FlyBase, we recommend that you use Opera as your browser. While many other browsers will print the information included in a report, at this time only Opera will print the page format properly (including colors and shading).
If your work requires that you print the page exactly as it appears on your monitor and you do not wish to install Opera, we suggest that you print a screen shot of the page. Instructions on how to take a screen shot on different platforms are given below.
PC
- Open the application "Microsoft Word"
- Click on the web browser and arrange the information you would like to print within the window
- Press the key combination "Alt-PrintScreen"
- Click on the "Microsoft Word" window to activate it and press the key combination "Ctrl-V" to paste the image into the window
- Press the key combination "Ctrl-P" to enter the print dialog of the application. Under the Zoom section of the dialog box set the "Scale to paper size" option to the size of the paper used in your printer, then press "OK".
Macintosh
Method One
- Open the application "Grab" in the "Utilities" folder.
- From the Capture menu select "Window" (or press the key combination "Shift-Command-W")
- Select "Choose Window", and click on the browser window you wish to capture.
- An image of the browser will appear in a new window.
- To print the image, click on "Page Setup..." under the File menu, set "Scale" to 50%, and press "OK". Then enter the print dialog by pressing the key combination "Command-P".
Method Two
- Open the application "Preview"
- Press the key combination "Control-Command-Shift-4", then press the space bar.
- Move the pointer over the browser window so that it is highlighted, then click.
(NB: You can cancel the operation by pressing the escape key)
- Activate Preview by clicking on its icon
- Press the key combination "Command-N" to create an image of the web browser.
- Press the key combination "Command-P" to enter the print dialog
(In this case the image is scaled to fit the page by default)
Method Three
If you use Safari, consider installing the free program Netfixer, which allows you to capture an image of an entire webpage regardless of its length.