Difference between revisions of "Help:State of the Wiki"

From FlyBase Wiki
(Undo revision 137556 by Spammer (talk))
 
(14 intermediate revisions by 4 users not shown)
Line 1: Line 1:
 +
The overall goal of this wiki is to enable the ''Drosophila'' community to contribute their expertise to FlyBase. The initial goal is to encourage FlyBase users to contribute expert, free-text gene summaries. A future goal can be to extend support to other categories of information beyond the initial solicitation of gene summary information.
 +
 +
The general plan to build a usable wiki focused on the summarization of gene information has been to:
 +
 +
* Use regular wiki technology to seed one wiki article per ''Drosophila melanogaster'' gene with the current automated summary serving as a placeholder until its replacement with human edited '''free text'''.
 +
* Use semantic wiki technology to convert most of the information represented in the automated summary to tables of '''structured data''' surrounding the '''free text''' for use as reference by writers.
 +
* Use forms based on semantic wiki technology to enable users to edit limited subsets of the tabular '''structured data''' in order to make new connections between a given gene and other types of data.
 +
 +
Progress made towards implementing this plan is outlined below. 
 +
 
== Done ==
 
== Done ==
  
Line 4: Line 14:
  
 
* The Debian operating system has been installed on hardware good enough for group testing and initial deployment.
 
* The Debian operating system has been installed on hardware good enough for group testing and initial deployment.
* All dependencies necessary for the wiki software have been installed and configured (e.g.  Apache HTTP server, MySQL database software, PHP scripting language, etc.).
+
* All dependencies necessary for the wiki software to function have been installed and configured.
  
 
=== Wiki software ===
 
=== Wiki software ===
  
* The MediaWiki free open source wiki software, the same software behind Wikipedia, has been installed and configured.
+
* The MediaWiki free and open source wiki software, the same software behind Wikipedia, has been installed and configured.
* Several important extensions to the base MediaWiki software have been installed and configured to enable greater functionality for wiki users.  
+
* Several important extensions to the base MediaWiki software have been installed and configured to enable greater functionality in the wiki:
 +
*; OpenID
 +
*: This extension allows the creating of accounts and login via [http://openid.net/ OpenID].
 
*; WikiEditor
 
*; WikiEditor
*: This extension improves the user experience by adding a toolbar to help in editing wiki markup or wikitext.  
+
*: This extension improves the user experience by adding a toolbar to help in editing wiki markup (wikitext) in the free text.  
 
*; Cite
 
*; Cite
 
*: This extension allows users to cite references and create a list of references.
 
*: This extension allows users to cite references and create a list of references.
Line 17: Line 29:
 
*: This extension pulls in literature data from scientific articles stored in PubMed
 
*: This extension pulls in literature data from scientific articles stored in PubMed
 
*; Semantic MediaWiki (SMW)
 
*; Semantic MediaWiki (SMW)
*: This extension lets you store and query structured data, not just free text, within the pages of the wiki making the wiki a "collaborative database" in addition to a "collaborative book".
+
*: This extension allows for the storage and querying of structured data within the pages of the wiki, thus allowing the wiki to be a "collaborative database" in addition to a "collaborative book".
 
*; Semantic Forms
 
*; Semantic Forms
*: This extension allows you to have forms for adding, editing and querying data on your wiki, without any programming.
+
*: This extension allows for the building of forms for adding, editing and querying data on the wiki, without any programming.
  
 
=== Seeding data ===
 
=== Seeding data ===
  
* Scripts have been written in the Python programming language for initially seeding data into the wiki in three steps:
+
* Scripts have been written for initially seeding data into the wiki in three steps:
*# Relevant data from Chado XML for the current release is converted to a Python data structure and stored for later use.
+
*# Relevant data from Chado XML for the current release is converted to a data structure and stored for later use.
*# This Chado XML data is combined with the precomputed gene summaries for the current release and written to wikitext files for each page in the wiki.
+
*# This data is combined with the precomputed gene summaries for the current release and written to wikitext files for each page in the wiki.
*# Each wikitext file is then uploaded by a bot, which is a program that automatically retrieves or updates wiki pages, overwriting any pages that exist.  
+
*# Each wikitext file is then uploaded by a bot, which is a program that automatically retrieves or updates wiki pages, overwriting any pages that already exist.  
* A 15 [[:Category:Genes|gene]] wiki page sample set has been seeded.
+
* A 15 [[:Category:Genes|gene]] wiki page sample set has been seeded:
 
** [[Dmel\Adh]]
 
** [[Dmel\Adh]]
 
** [[Dmel\Antp]]
 
** [[Dmel\Antp]]
Line 43: Line 55:
 
** [[Dmel\αTub67C]]
 
** [[Dmel\αTub67C]]
 
** [[Dmel\γTub23C]]
 
** [[Dmel\γTub23C]]
* 3,737 [[:Category:Alleles|allele]] wiki pages associated to the gene sample set have been seeded.
+
* 3,803 [[:Category:Alleles|allele]] wiki pages associated to the gene sample set have been seeded.
* 8,742 [[:Category:References|reference]] wiki pages associated to the gene sample set have been seeded.
+
* 8,803 [[:Category:References|reference]] wiki pages associated to the gene sample set have been seeded.
 +
 
 +
=== Redirects ===
 +
 
 +
* Redirects, pages that point to other pages, have been added for FBids and PMIDs.
  
 
=== Gene wiki page ===
 
=== Gene wiki page ===
  
* An initial layout of gene wiki pages has been made.
+
* An initial layout for the above gene wiki pages has been made with these features:
 
*; Stub
 
*; Stub
*: A stub is an article deemed too short and each initially seeded gene page has been tagged as one and an explanatory banner added to the top of the page.
+
*: A stub is an article deemed too short.
 +
*: Each seeded gene page has been tagged as a stub and had an explanatory banner added to the top of the page.
 
*; Lead
 
*; Lead
*: The lead section of a Wikipedia article is the section before the table of contents and the first heading and where the automated gene summary has been placed on each page.
+
*: The lead section of a Wikipedia article is the section before the table of contents and the first heading.
 +
*: This is where the automated gene summary has been placed on each page with the expectation it will be replaced.
 
*; Infobox
 
*; Infobox
*: An infobox is a fixed-format table in the top right-hand corner of articles and where some identifying information for each gene has been placed.
+
*: An infobox is a fixed-format table in the top right-hand corner of articles.
 +
*: This is where some identifying information for each gene has been placed.
 
*; TOC
 
*; TOC
*: The tabel of contents is auto-generated and ...
+
*: The table of contents is automatically generated and shows any section headings that follow.
 
*; Publications
 
*; Publications
*:  
+
*: Tables of recent reviews and papers that have data on each gene have been automatically generated in place using semantic wiki technology for use as reference while writing a summary and while citing assertions made in the free text.
 
*; Alleles
 
*; Alleles
*:
+
*: A table of alleles of each gene has been automatically generated in place using semantic wiki technology for use as reference while writing a summary.
* The overall idea being: <p>p</p>
 
  
=== Edit gene wiki page with form ===
+
=== Edit with form ===
  
* Summary tab
+
* An edit with form tab has been enabled using semantic wiki technology to hide the complexity of editing raw wikitext.
 +
* A Summary tab of the gene form has been enabled to separate the place where the automated gene summary is to be replaced with user contributed free text using the WikiEditor toolbar for guidance.
  
 
== In progress ==
 
== In progress ==
Line 71: Line 90:
 
=== Gene wiki page ===
 
=== Gene wiki page ===
  
* Infobox
+
* More identifying data is being added to the infobox on gene pages.
 +
* Links to references tab of gene form requesting the addition of missing recent reviews and missing PMIDs are being added to the gene pages.
 +
* Link to alleles tab of gene form requesting help identifying e.g. "best null" is being added to the gene pages.
 +
 
 +
=== Allele wiki page ===
 +
 
 +
* Just enough information is being added to be useful on gene summary page and its form.
  
=== Edit gene wiki page with form ===
+
=== Reference wiki page ===
  
* Alleles Tab
+
* Just enough information is being added to be useful on gene summary page and its form.
  
* References Tab
+
=== Edit with form ===
  
=== Documentaion ===
+
*; References tab
 +
*: Functionality is being added to enable wiki users to add more recent reviews for a gene and any missing PMIDs.
  
* Tour/walkthrough
+
*; Alleles tab
 +
*: Functionality is being added to enable wiki users to contribute data we do not curate that is highly valuable to our community, e.g. "best null".
  
 
== To-dos ==
 
== To-dos ==
 +
 +
=== FlyBase gene page ===
 +
 +
* Add reciprocal links between FlyBase gene pages and wiki gene pages.
 +
 +
=== Access control ===
 +
 +
* Add different levels of access control to wiki:
 +
** Administrators
 +
** Editors
 +
** Writers
 +
** Readers
  
 
=== Human disease ===
 
=== Human disease ===
  
=== Look and feel ===
+
* Add functionality to allow capture of free text or structured data for human disease.
 +
 
 +
=== Gene Ontology ===
 +
 
 +
* Add gene ontology data to structured gene summary tables around free text.
 +
* Add functionality to allow users to contribute new gene ontology structured data.
 +
 
 +
=== Documentation ===
 +
 
 +
* Write example page for what a finished human-edited gene summary should look like.
 +
* Make visual tour showing how to accomplish certain tasks to help users.
 +
 
 +
===  Data flow ===
 +
 
 +
* Work out how to synchronize with FlyBase after initial, one-time, one-way seeding is done.
  
=== Access control ===
+
=== MediaWiki extensions ===
  
* Currently doing work as 'FlyBase Bot' and 'FlyBase Administrator'.
+
* Add extensions to put spam protections in place on wiki.
* Readers
 
* writers
 
* editors
 
* administrators
 
  
=== Ontologies ===
+
=== Legal issues ===
  
=== Logo ===
+
* What license are users contributing their work under?
 +
* What warnings about submitting copyrighted work should be added?
  
=== Sync with FB after seeding / data flow ===
+
=== Look and feel ===
  
=== SPAM ===
+
* Make wiki appearance more in line with FlyBase website.
  
=== Open ID ===
+
=== Logo ===
  
=== Legal ===
+
* Crop fly from FlyBase logo and write FlyBase Wiki underneath in same typeface at smaller size?

Latest revision as of 19:22, 7 October 2016

The overall goal of this wiki is to enable the Drosophila community to contribute their expertise to FlyBase. The initial goal is to encourage FlyBase users to contribute expert, free-text gene summaries. A future goal is to extend support to other categories of information beyond gene summaries.

The general plan to build a usable wiki focused on the summarization of gene information has been to:

  • Use regular wiki technology to seed one wiki article per Drosophila melanogaster gene, with the current automated summary serving as a placeholder until it is replaced with human-edited free text.
  • Use semantic wiki technology to convert most of the information represented in the automated summary to tables of structured data surrounding the free text for use as reference by writers.
  • Use forms based on semantic wiki technology to enable users to edit limited subsets of the tabular structured data in order to make new connections between a given gene and other types of data.

Progress made towards implementing this plan is outlined below.

Done

Operating system

  • The Debian operating system has been installed on hardware good enough for group testing and initial deployment.
  • All dependencies necessary for the wiki software to function have been installed and configured.

Wiki software

  • The MediaWiki free and open source wiki software, the same software behind Wikipedia, has been installed and configured.
  • Several important extensions to the base MediaWiki software have been installed and configured to enable greater functionality in the wiki:
    OpenID
    This extension allows account creation and login via OpenID.
    WikiEditor
    This extension improves the user experience by adding a toolbar to help in editing wiki markup (wikitext) in the free text.
    Cite
    This extension allows users to cite references and create a list of references.
    PubMed
    This extension pulls in literature data for scientific articles stored in PubMed.
    Semantic MediaWiki (SMW)
    This extension allows for the storage and querying of structured data within the pages of the wiki, thus allowing the wiki to be a "collaborative database" in addition to a "collaborative book".
    Semantic Forms
    This extension allows for the building of forms for adding, editing and querying data on the wiki, without any programming.
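
As an illustration of the "collaborative database" idea behind Semantic MediaWiki, the sketch below builds the two kinds of wikitext involved: a property annotation that stores a fact on a page, and an inline #ask query that retrieves pages by those facts. The property names ("Allele of", "Symbol") and the helper functions are assumptions made for this example, not names taken from the FlyBase Wiki.

```python
def smw_annotation(prop, value):
    """Return wikitext that stores one property/value pair on a page."""
    return f"[[{prop}::{value}]]"


def smw_ask(conditions, printouts, fmt="table"):
    """Return an inline #ask query: conditions select pages, printouts pick columns."""
    parts = ["".join(conditions)]
    parts += [f"?{name}" for name in printouts]
    parts.append(f"format={fmt}")
    return "{{#ask: " + " | ".join(parts) + " }}"


# Hypothetical property name; the gene title comes from the sample set.
stored_fact = smw_annotation("Allele of", "Dmel\\Adh")
allele_table = smw_ask(["[[Category:Alleles]]", stored_fact], ["Symbol"])
```

Placing the annotation on each allele page and the query on the gene page is one way a table of alleles could be generated in place, as the gene page layout describes.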

Seeding data

  • Scripts have been written for initially seeding data into the wiki in three steps:
    1. Relevant data from Chado XML for the current release is converted to a data structure and stored for later use.
    2. This data is combined with the precomputed gene summaries for the current release and written to wikitext files for each page in the wiki.
    3. Each wikitext file is then uploaded by a bot, which is a program that automatically retrieves or updates wiki pages, overwriting any pages that already exist.
  • A sample set of 15 gene wiki pages has been seeded, including Dmel\Adh, Dmel\Antp, Dmel\αTub67C and Dmel\γTub23C.
  • 3,803 allele wiki pages associated with the gene sample set have been seeded.
  • 8,803 reference wiki pages associated with the gene sample set have been seeded.
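
The three seeding steps above can be sketched roughly as follows. This is a minimal outline under stated assumptions, not the actual FlyBase scripts: the record fields ("id", "title"), the JSON storage format, the "{{Stub}}" template name and the upload callable are all hypothetical, and the Chado XML parsing itself is left out because its structure is not described here.

```python
import json
import tempfile
from pathlib import Path


def store_genes(genes, path):
    """Step 1 (storage half): persist the extracted gene records for later use."""
    Path(path).write_text(json.dumps(genes), encoding="utf-8")


def build_wikitext(summary):
    """Hypothetical page layout: a stub banner followed by the automated summary."""
    return "{{Stub}}\n" + summary.strip() + "\n"


def write_pages(genes_path, summaries, out_dir):
    """Step 2: combine stored records with summaries, one wikitext file per page."""
    genes = json.loads(Path(genes_path).read_text(encoding="utf-8"))
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    pages = {}
    for gene in genes:
        summary = summaries.get(gene["id"])
        if summary is None:
            continue  # no precomputed summary for this record, so no page
        path = out_dir / (gene["id"] + ".wikitext")
        path.write_text(build_wikitext(summary), encoding="utf-8")
        pages[gene["title"]] = path
    return pages


def upload_pages(pages, upload):
    """Step 3: hand each file to a bot callable; existing pages are overwritten."""
    for title, path in pages.items():
        upload(title, path.read_text(encoding="utf-8"))


def demo():
    """Run all three steps against an in-memory stand-in for the wiki."""
    genes = [{"id": "GENE_A", "title": "Dmel\\Adh"},
             {"id": "GENE_B", "title": "Dmel\\Antp"}]
    summaries = {"GENE_A": "Placeholder automated summary.  "}
    wiki = {}  # stand-in for the wiki: page title -> wikitext
    with tempfile.TemporaryDirectory() as tmp:
        store_genes(genes, Path(tmp) / "genes.json")
        pages = write_pages(Path(tmp) / "genes.json", summaries, Path(tmp) / "pages")
        upload_pages(pages, wiki.__setitem__)
    return wiki
```

Keeping the upload behind a callable means the file-writing steps can be checked without touching a live wiki; a real bot would replace the dictionary stand-in with calls to the wiki's editing interface.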

Redirects

  • Redirects, pages that point to other pages, have been added for FBids and PMIDs.
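
A redirect page in MediaWiki consists of a single #REDIRECT line naming the target page. The sketch below shows how redirect pages for identifiers might be generated; the identifier values are placeholders and the helper names are assumptions made for this example.

```python
def redirect_wikitext(target_title):
    """Return the wikitext of a page that redirects to target_title."""
    return f"#REDIRECT [[{target_title}]]"


def build_redirects(id_to_title):
    """Map each identifier (used as a page title) to its redirect wikitext."""
    return {ident: redirect_wikitext(title) for ident, title in id_to_title.items()}


# Placeholder identifier; a real run would use actual FBids and PMIDs.
redirects = build_redirects({"FBID_EXAMPLE": "Dmel\\Adh"})
```

With pages like these in place, a user who searches for an identifier lands on the corresponding gene, allele or reference page.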

Gene wiki page

  • An initial layout for the above gene wiki pages has been made with these features:
    Stub
    A stub is an article deemed too short.
    Each seeded gene page has been tagged as a stub, and an explanatory banner has been added to the top of the page.
    Lead
    The lead section of a Wikipedia article is the section before the table of contents and the first heading.
    This is where the automated gene summary has been placed on each page, with the expectation that it will be replaced.
    Infobox
    An infobox is a fixed-format table in the top right-hand corner of articles.
    This is where some identifying information for each gene has been placed.
    TOC
    The table of contents is automatically generated and shows any section headings that follow.
    Publications
    Tables of recent reviews and papers that have data on each gene have been automatically generated in place using semantic wiki technology for use as reference while writing a summary and while citing assertions made in the free text.
    Alleles
    A table of alleles of each gene has been automatically generated in place using semantic wiki technology for use as reference while writing a summary.

Edit with form

  • An "Edit with form" tab has been enabled using semantic wiki technology to hide the complexity of editing raw wikitext.
  • A Summary tab has been enabled on the gene form. It isolates the place where the automated gene summary is to be replaced with user-contributed free text, with the WikiEditor toolbar available for guidance.

In progress

Gene wiki page

  • More identifying data is being added to the infobox on gene pages.
  • Links to the References tab of the gene form, requesting the addition of missing recent reviews and missing PMIDs, are being added to the gene pages.
  • A link to the Alleles tab of the gene form, requesting help identifying e.g. "best null", is being added to the gene pages.

Allele wiki page

  • Just enough information is being added to be useful on the gene summary page and its form.

Reference wiki page

  • Just enough information is being added to be useful on the gene summary page and its form.

Edit with form

  • References tab
    Functionality is being added to enable wiki users to add more recent reviews for a gene and any missing PMIDs.
  • Alleles tab
    Functionality is being added to enable wiki users to contribute data that we do not curate but that is highly valuable to our community, e.g. "best null".

To-dos

FlyBase gene page

  • Add reciprocal links between FlyBase gene pages and wiki gene pages.

Access control

  • Add different levels of access control to the wiki:
    • Administrators
    • Editors
    • Writers
    • Readers

Human disease

  • Add functionality to allow capture of free text or structured data for human disease.

Gene Ontology

  • Add Gene Ontology data to the structured gene summary tables around the free text.
  • Add functionality to allow users to contribute new structured Gene Ontology data.

Documentation

  • Write an example page showing what a finished human-edited gene summary should look like.
  • Make a visual tour showing users how to accomplish certain tasks.

Data flow

  • Work out how to synchronize with FlyBase after the initial, one-time, one-way seeding is done.

MediaWiki extensions

  • Add extensions to put spam protections in place on the wiki.

Legal issues

  • What license are users contributing their work under?
  • What warnings about submitting copyrighted work should be added?

Look and feel

  • Make the wiki appearance more in line with the FlyBase website.

Logo

  • Crop the fly from the FlyBase logo and write "FlyBase Wiki" underneath in the same typeface at a smaller size?