Help:State of the Wiki

From FlyBase Wiki
Revision as of 17:41, 12 September 2012 by Ggrumbli FlyBase (talk | contribs)

The overall goal of this wiki is to enable the Drosophila community to contribute their expertise to FlyBase. The initial goal is to encourage FlyBase users to contribute expert, free-text gene summaries. A future goal is to extend support to categories of information beyond gene summaries.

The general plan to build a usable wiki focused on the summarization of gene information has been to:

  • Use regular wiki technology to seed one wiki article per Drosophila melanogaster gene, with the current automated summary serving as a placeholder until it is replaced by human-edited free text.
  • Use semantic wiki technology to convert most of the information represented in the automated summary to tables of structured data surrounding the free text for use as reference by writers.
  • Use forms based on semantic wiki technology to enable users to edit limited subsets of the tabular structured data in order to make new connections between a given gene and other types of data.

Progress made towards implementing this plan is outlined below.

Done

Operating system

  • The Debian operating system has been installed on hardware good enough for group testing and initial deployment.
  • All dependencies necessary for the wiki software to function have been installed and configured.

Wiki software

  • The MediaWiki free and open source wiki software, the same software behind Wikipedia, has been installed and configured.
  • Several important extensions to the base MediaWiki software have been installed and configured to enable greater functionality in the wiki:
    OpenID
    This extension allows account creation and login via OpenID.
    WikiEditor
    This extension improves the user experience by adding a toolbar to help in editing wiki markup (wikitext) in the free text.
    Cite
    This extension allows users to cite references and create a list of references.
    PubMed
    This extension pulls in literature data for scientific articles stored in PubMed.
    Semantic MediaWiki (SMW)
    This extension allows for the storage and querying of structured data within the pages of the wiki, thus allowing the wiki to be a "collaborative database" in addition to a "collaborative book".
    Semantic Forms
    This extension allows for the building of forms for adding, editing and querying data on the wiki, without any programming.

Seeding data

  • Scripts have been written for initially seeding data into the wiki in three steps:
    1. Relevant data from Chado XML for the current release is converted to a data structure and stored for later use.
    2. This data is combined with the precomputed gene summaries for the current release and written to wikitext files for each page in the wiki.
    3. Each wikitext file is then uploaded by a bot (a program that automatically retrieves or updates wiki pages), overwriting any page that already exists.
  • A sample set of 15 gene wiki pages has been seeded.
  • 3,803 allele wiki pages associated with the gene sample set have been seeded.
  • 8,742 reference wiki pages associated with the gene sample set have been seeded.
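
The three seeding steps above can be sketched roughly as follows. This is a minimal illustration, not the actual FlyBase scripts: the XML layout is a heavily simplified stand-in for Chado XML, and the identifiers, the Stub template name, and the property names ("FlyBase ID", "Has allele") are all assumptions made for the example. The last function only builds the parameters for a MediaWiki API edit request; obtaining a token and sending the request are left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified stand-in for a Chado XML export.
SAMPLE_XML = """
<genes>
  <gene id="FBgn9999999" symbol="example-gene">
    <allele id="FBal9999999" symbol="example-allele"/>
  </gene>
</genes>
"""

# Hypothetical precomputed summaries, keyed by gene identifier.
SUMMARIES = {"FBgn9999999": "Automated summary placeholder text."}


def parse_genes(xml_text):
    """Step 1: convert the XML into a plain data structure."""
    root = ET.fromstring(xml_text)
    genes = {}
    for gene in root.findall("gene"):
        genes[gene.get("id")] = {
            "symbol": gene.get("symbol"),
            "alleles": [a.get("symbol") for a in gene.findall("allele")],
        }
    return genes


def build_wikitext(gene_id, gene, summaries):
    """Step 2: combine gene data with its summary into page wikitext."""
    lines = [
        "{{Stub}}",                       # assumed template name
        summaries.get(gene_id, ""),       # automated summary as the lead
        "",
        "[[FlyBase ID::%s]]" % gene_id,   # assumed semantic property
    ]
    for allele in gene["alleles"]:
        lines.append("[[Has allele::%s]]" % allele)  # assumed property
    return "\n".join(lines)


def build_edit_request(title, text, token):
    """Step 3: parameters for a MediaWiki API edit (not sent here)."""
    return {
        "action": "edit",
        "title": title,
        "text": text,   # replaces the whole page, overwriting existing content
        "token": token,
        "format": "json",
    }


genes = parse_genes(SAMPLE_XML)
pages = {
    gene["symbol"]: build_wikitext(gene_id, gene, SUMMARIES)
    for gene_id, gene in genes.items()
}
```

Keeping the parsed data (step 1) separate from page generation (step 2) means the wikitext layout can be changed and the pages regenerated without re-reading the source XML.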

Redirects

  • Redirects, pages that point to other pages, have been added for FBids and PMIDs.
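
As a rough illustration of how such redirect pages could be generated, the sketch below produces standard MediaWiki redirect wikitext for a mapping of identifiers to page titles. The identifiers and titles shown are hypothetical placeholders, and the actual title format used for FBid and PMID redirect pages on this wiki is an assumption.

```python
def redirect_wikitext(target_title):
    """Return the wikitext of a page that redirects to target_title."""
    return "#REDIRECT [[%s]]" % target_title


def build_redirects(id_to_title):
    """Map each identifier page title to its redirect wikitext."""
    return {
        ident: redirect_wikitext(title)
        for ident, title in id_to_title.items()
    }


# Hypothetical identifiers and target page titles.
redirects = build_redirects({
    "FBgn9999999": "example-gene",
    "PMID00000000": "example-reference",
})
```

Each entry can then be uploaded like any other page, so that visiting the identifier leads to the corresponding gene or reference page.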

Gene wiki page

  • An initial layout for the above gene wiki pages has been made with these features:
    Stub
    A stub is an article deemed too short.
    Each seeded gene page has been tagged as a stub and had an explanatory banner added to the top of the page.
    Lead
    The lead section of a Wikipedia article is the section before the table of contents and the first heading.
    This is where the automated gene summary has been placed on each page with the expectation it will be replaced.
    Infobox
    An infobox is a fixed-format table in the top right-hand corner of articles.
    This is where some identifying information for each gene has been placed.
    TOC
    The table of contents is automatically generated and shows any section headings that follow.
    Publications
    Tables of recent reviews and papers that have data on each gene have been automatically generated in place using semantic wiki technology for use as reference while writing a summary and while citing assertions made in the free text.
    Alleles
    A table of alleles of each gene has been automatically generated in place using semantic wiki technology for use as reference while writing a summary.
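
    Tables like these are typically produced with a Semantic MediaWiki inline query (#ask) placed in the page. The sketch below assembles such a query as wikitext; the category and property names ("Allele", "Allele of", "Symbol") are assumptions for illustration and may not match the ones used on this wiki.

```python
def ask_table(conditions, printouts, limit=None):
    """Build a Semantic MediaWiki #ask query rendered as a table.

    conditions: query conditions, e.g. "Category:Allele"
    printouts:  property names to show as table columns
    limit:      optional maximum number of rows
    """
    parts = ["".join("[[%s]]" % c for c in conditions)]
    parts += ["?%s" % p for p in printouts]
    parts.append("format=table")
    if limit is not None:
        parts.append("limit=%d" % limit)
    return "{{#ask: " + " | ".join(parts) + " }}"


# Alleles whose (assumed) "Allele of" property points at the current page.
allele_table = ask_table(
    ["Category:Allele", "Allele of::{{PAGENAME}}"],
    ["Symbol"],
)
```

    Because the query refers to {{PAGENAME}}, the same wikitext can be seeded onto every gene page and each page will list only its own alleles.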

Edit with form

  • An "edit with form" tab has been enabled using semantic wiki technology to hide the complexity of editing raw wikitext.
  • A Summary tab has been enabled on the gene form to give users a dedicated place to replace the automated gene summary with user-contributed free text, with the WikiEditor toolbar for guidance.

In progress

Gene wiki page

  • More identifying data is being added to the infobox on gene pages.
  • Links to the References tab of the gene form, requesting the addition of missing recent reviews and missing PMIDs, are being added to the gene pages.
  • A link to the Alleles tab of the gene form, requesting help identifying e.g. the "best null", is being added to the gene pages.

Allele wiki page

  • Just enough information is being added to be useful on the gene summary page and its form.

Reference wiki page

  • Just enough information is being added to be useful on the gene summary page and its form.

Edit with form

  • References tab
    Functionality is being added to enable wiki users to add more recent reviews for a gene and any missing PMIDs.
  • Alleles tab
    Functionality is being added to enable wiki users to contribute data we do not curate that is highly valuable to our community, e.g. "best null".

To-dos

FlyBase gene page

  • Add reciprocal links between FlyBase gene pages and wiki gene pages.

Access control

  • Add different levels of access control to wiki:
    • Administrators
    • Editors
    • Writers
    • Readers

Human disease

  • Add functionality to allow capture of free text or structured data for human disease.

Gene Ontology

  • Add gene ontology data to structured gene summary tables around free text.
  • Add functionality to allow users to contribute new gene ontology structured data.

Documentation

  • Write example page for what a finished human-edited gene summary should look like.
  • Make visual tour showing how to accomplish certain tasks to help users.

Data flow

  • Work out how to synchronize with FlyBase after initial, one-time, one-way seeding is done.

MediaWiki extensions

  • Add extensions to put spam protections in place on wiki.

Legal issues

  • What license are users contributing their work under?
  • What warnings about submitting copyrighted work should be added?

Look and feel

  • Make wiki appearance more in line with FlyBase website.

  • Crop fly from FlyBase logo and write FlyBase Wiki underneath in same typeface at smaller size?