FlyBase:Batch Download

From FlyBase Wiki
Revision as of 19:49, 19 January 2018 by Bmatthew (talk | contribs)

Batch Download Overview

The Batch Download tool provides access to data from a variety of specific fields in FlyBase reports for a specified list of symbols or IDs. The data from multiple fields can be accessed at once. The specified list of IDs can be large (e.g. all genes) or small (e.g. one gene). Results can be downloaded or viewed online.

Validating FlyBase symbols and IDs

To ensure that the symbols or IDs submitted to Batch Download are currently valid in FlyBase, the input list is first run through the ID validator/converter. Once any invalid IDs have been resolved, the list is ready to submit. See ID Converter for more information on this step.

Quick Instructions

  1. The Batch Download tool is organized into three main parts: a dropdown menu for the selection of Output Format, a dropdown menu for selection of Output Options, and a box for inputting symbols or IDs.
  2. Select the output format, which can be either an HTML table or a tab-separated file. There is also an option called “From precomputed files”, which allows you to select fields from the precomputed files in FlyBase.
  3. Choose to display the results in a browser or to save them as a tab-delimited file to your computer or device.
  4. Enter your list of IDs in the box. IDs can be separated by commas, tabs, return characters, or spaces, and can be entered manually in the box or uploaded from a file on your computer. To upload a file of IDs, click “Browse” and select the file. You can also populate Batch Download with IDs by choosing “Batch Download” from the “Export” menu in any hit list.
  5. Click on “Continue to Select Fields”. A new window will appear with fields specific to the data type of the IDs you entered. Select your field(s) of interest.
  6. Click on one of the two “Get Field Data” buttons located at the top right and bottom right corners of the form.
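
Before pasting or uploading a list, it can help to normalize it locally. The following is a minimal Python sketch, not part of FlyBase, that splits raw text on the separators named in step 4 (commas, tabs, return characters, spaces). The function name `parse_id_list` and the removal of duplicates are assumptions made for illustration, and the IDs shown are placeholders.

```python
import re


def parse_id_list(raw):
    """Split raw text into IDs on commas, tabs, newlines, or spaces.

    Empty tokens are dropped. Duplicates are also dropped (keeping the
    first occurrence); that is a convenience assumed here, not a
    documented Batch Download requirement.
    """
    tokens = re.split(r"[,\s]+", raw.strip())
    seen = set()
    ids = []
    for token in tokens:
        if token and token not in seen:
            seen.add(token)
            ids.append(token)
    return ids


if __name__ == "__main__":
    # Placeholder IDs mixing all four separator styles.
    text = "FBgn0000001, FBgn0000002\tFBgn0000003\nFBgn0000001 FBgn0000004"
    # One ID per line is a convenient form for a file to upload.
    print("\n".join(parse_id_list(text)))
```

Writing the cleaned list out one ID per line gives a file that can be uploaded with “Browse”.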

Additional Information and More Detailed Instructions

  • "Field Data" includes all the data that is presented in various FlyBase reports. For example, for a list of genes, you can retrieve transcript expression data, GO: Molecular Function terms, lists of cDNA clones, etc. For a list of alleles, you can retrieve phenotypic class information, lists of stocks, etc. Note that the Human Disease Model, Gene Groups, Physical Interaction, and Strain reports are not supported by Batch Download. To retrieve cell line data, enter Cell line IDs, not symbols.
  • There are three output options: As HTML Table, As Tab Separated File, and From Precomputed Files. Use As HTML Table if you are going to view your results in the browser window or print them directly, since the output is formatted for reading. If you'd like to save your output to a file or open it in a spreadsheet, choose As Tab Separated File. Finally, From Precomputed Files retrieves the data available in the FlyBase precomputed files for your list of IDs. For example, if you have a list of genes, you can get the overlapping affy oligos from the "Genes: fbgn_exons2affy1_overlaps.tsv" file. Many other options are available from precomputed files, so it's worth taking a look at the list. See the Downloads Overview page for more information on the precomputed files.
  • Batch Download can only retrieve information for a single data type at a time. For example, you cannot simultaneously retrieve field data for genes and alleles. If you input a list that contains a mix of gene and allele symbols, whichever data type has more entries will be recognized and the other data type will be ignored. If your list contains equal numbers of each, the data type of the first entry on the list will be recognized.
  • Once you have entered your list of IDs, click "Continue to Select Fields". This opens a new window that includes a menu of all the available fields for your dataset of interest. The fields are organized in a similar fashion to the report pages. Multiple fields can be selected at once by checking more than one check box, or by using the Check Section or Check All buttons.
  • Click "Get Field Data" to obtain your results.
  • Important tip: If you entered Batch Download via a hit list (i.e. you exported your hits from the hit list to Batch Download), your data will be entered as FlyBase IDs (e.g. FBgn, FBal, etc.). If you want to see the associated symbols in your output, be sure to select the field "Symbol: symbol".
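
Because only one data type is processed per submission, it can be useful to check a mixed list before submitting it. The sketch below is an illustration only and does not reproduce the tool's internals: it assumes the list holds FlyBase IDs (not symbols) and that the first four characters of an ID (e.g. FBgn, FBal) identify its data type. It applies the rule described above: the most frequent type wins, and a tie goes to the type of the earliest entry. The function names `dominant_id_type` and `keep_type` are hypothetical, and the IDs are placeholders.

```python
from collections import Counter


def dominant_id_type(ids):
    """Return the most common 4-character ID prefix, or None if no IDs.

    Ties are resolved in favor of the prefix that appears first in the
    list, which matches the first-entry rule for a two-type list.
    """
    prefixes = [item[:4] for item in ids if item.startswith("FB")]
    if not prefixes:
        return None
    counts = Counter(prefixes)
    top = max(counts.values())
    # Walk the prefixes in input order so the earliest tied type wins.
    for prefix in prefixes:
        if counts[prefix] == top:
            return prefix


def keep_type(ids, prefix):
    """Keep only the IDs that start with the given prefix."""
    return [item for item in ids if item.startswith(prefix)]


if __name__ == "__main__":
    mixed = ["FBal0000001", "FBgn0000001", "FBgn0000002"]
    chosen = dominant_id_type(mixed)
    print(chosen, keep_type(mixed, chosen))
```

Splitting a mixed list into one list per data type and submitting each separately avoids having entries silently ignored.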