RDoQ Tutorial



RDoQ is an extended interface for using PubMed/MEDLINE, a vast digital library for the biomedical literature. RDoQ allows creation of literature maps, 2-dimensional visual summaries of associations present in this literature.

RDoQ is a high-level interface, permitting people to ask shotgun-style queries that look for associations between two sets of terms. These terms can be any PubMed query, and the two sets can be fairly large; RDoQ can handle term sets that include several hundred queries. Of course, as the number of terms grows, the visibility of the results declines, so it should be thought of mainly as a tool for exploring modest sets of terms.

For people familiar with bioinformatics, it may help to think of RDoQ as a kind of BLAST for PubMed, performing simultaneous searches and reporting the results in a coherent way.

Overview of RDoQ presented at AMIA STB'09




A Quick Walkthrough of RDoQ


RDoQ is a program that acts as a front-end to PubMed. The initial RDoQ query form looks like this:
figures/queryform.jpg
As input, RDoQ takes two sets of "terms", which are text lines having one of the two formats
PubMed Query
concept phrase : PubMed Query
and analyzes them, producing a table of associations. Any PubMed query is valid here; more sophisticated queries are better, since their results are typically more precise.

Comments are also permitted. They start with the # character and can appear on any line, anywhere on the line. Blank lines (or comment-only lines) are fine too; they cause borders to be produced in the output association table.

An example could be:
#  Project Direction 
Bob Bilder:                 Bilder Robert  [FAU]  OR  Bilder R   [AU]  # Director of CNP
Roberto Peccei:             Peccei Roberto [FAU]  OR  Peccei R   [AU]
Leonard Rome:               Rome Leonard   [FAU]  OR  Rome LH    [AU]
Fred Sabb:                  Sabb Fred      [FAU]  OR  Sabb FW    [AU]

#  Cognitive Neuroscience
Carrie Bearden:             Bearden Carrie [FAU]  OR  Bearden CE [AU]
Bob Bilder:                 Bilder Robert  [FAU]  OR  Bilder R   [AU]
Ty Cannon:                  Cannon Tyrone  [FAU]  OR  Cannon TD  [AU]
...

If we do not give a second set of terms, RDoQ assumes the two sets are the same, and looks for associations between all terms in this set.
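
The input format described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not RDoQ's actual parser: the names parse_terms and resolve_sets are hypothetical, a # is assumed to always start a comment, and the first : on a line is assumed to separate the concept phrase from the query (so a label-free query that itself contains a colon would be misread by this sketch).

```python
from typing import List, Optional, Tuple

# One parsed term: (label, query). A None entry marks a border between groups.
Term = Tuple[str, str]


def parse_terms(text: str) -> List[Optional[Term]]:
    """Parse a term set into (label, query) pairs, with None marking a border."""
    terms: List[Optional[Term]] = []
    for raw in text.splitlines():
        # Assumption: '#' always starts a comment, wherever it appears.
        line = raw.split("#", 1)[0].strip()
        if not line:
            # Blank or comment-only line: record one border, never at the start
            # and never two in a row.
            if terms and terms[-1] is not None:
                terms.append(None)
            continue
        # Assumption: the first ':' separates the concept phrase from the query.
        label, sep, query = line.partition(":")
        if sep and label.strip() and query.strip():
            # Collapse the runs of spaces used for alignment in the input.
            terms.append((label.strip(), " ".join(query.split())))
        else:
            # No concept phrase: the query doubles as its own label.
            q = " ".join(line.split())
            terms.append((q, q))
    # Drop a trailing border, if any.
    if terms and terms[-1] is None:
        terms.pop()
    return terms


def resolve_sets(
    first: str, second: str = ""
) -> Tuple[List[Optional[Term]], List[Optional[Term]]]:
    """Return both parsed sets; an empty second set defaults to the first."""
    a = parse_terms(first)
    b = parse_terms(second) or a
    return a, b
```

Calling resolve_sets with only the first argument mirrors the default described above: the same set is used on both sides of the table.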

If you look carefully, you can see that the figure above actually selects a predefined term set -- a list of people in the CNP (Consortium for Neuropsychiatric Phenomics). Using the predefined term set achieves the same effect as typing all this into the web form.

It also shows us setting the relevance level to 5, asking that only people who have 5 or more publications with another person be included in the result. (This kind of thresholding is useful — the screen is filled more densely with relevant information.) If we click on submit, we get a result page that has results from PubMed for everyone in the CNP list:
figures/evaluations.jpg

Scrolling down further, we can explore the resulting table:
figures/results.jpg

We can mouse over table entries and get a breakdown of the corresponding co-occurrences of publications:
figures/popup.jpg

Also, clicking on the table entry for Bob Bilder and Arthur Toga that is highlighted above produces a PubMed summary of these publications:
figures/pubmed_link.jpg

By moving the sliders on the left, we can change the size of the display — and for example go back in time to see what the associations in PubMed between these people looked like at the end of 2000:
figures/exploring.jpg
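
The "go back in time" view can be pictured as counting only the shared publications that had appeared by a cutoff year. The sketch below is a guess at the idea, not a description of how RDoQ implements its sliders; the function name shared_up_to and the year_of lookup are hypothetical, and publications with an unknown year are simply left out.

```python
from typing import Dict, Set


def shared_up_to(
    ids_a: Set[str],
    ids_b: Set[str],
    year_of: Dict[str, int],
    cutoff_year: int,
) -> int:
    """Count publications matched by both terms and published by cutoff_year."""
    count = 0
    for doc_id in ids_a & ids_b:
        year = year_of.get(doc_id)
        # Assumption: a publication without a known year is not counted.
        if year is not None and year <= cutoff_year:
            count += 1
    return count
```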

At the very bottom of the page is a Revise input button:
figures/revision.jpg

Clicking on this button produces the RDoQ query form, permitting us to revise our query:
figures/query_revision.jpg

By choosing one of the predefined lists of genes for our second set of terms, we obtain a table of associations between people and genes in the literature:
figures/blast_results.jpg

This is just a quick sketch! Have fun!

A 2-Minute Tour: Some natural uses for RDoQ



Learning about People



First, RDoQ lets us learn things about people at CNP. Here, we find associations between the predefined set of people at CNP (grouped by research field) and itself (that is, associations among people at CNP). The output, much like what was shown earlier, spotlights two surprising patterns of research interests:

CNP_people_grouped_by_field.jpg


To make the output easier to digest, this query asked only for associations between investigators who had at least 10 publications with another investigator here. By varying the query we could drill down and better understand how people interact within the CNP.

Who Works on What



Who knows about the gene I am interested in? Questions like this can be answered quickly with RDoQ, simply by finding associations between CNP people and the predefined set of genes, available in RDoQ, that may be relevant to projects at CNP:

CNP_People_vs_Genes.jpg


RDoQ also includes a list of faculty in the UCLA Neuroscience Interdisciplinary program. This makes it easy to use RDoQ to find interdisciplinary research connections between other people at UCLA and CNP investigators:

UCLA_Neuroscience_People_vs_CNP_People.jpg


Automatically generating Reviews of the Literature



Meta-analyses are formal systematic reviews that aggregate individual statistical results into combined results with greater significance. These systematic reviews can be better sources of evidence than the open literature, so early in a study it can help to see what is available in them. PubMed has features for specific retrieval of meta-analyses, and RDoQ can take advantage of this.

RDoQ provides the ability to set a context on any search. If we set the context to "Meta-Analyses", limiting all retrievals from PubMed to be of this kind, then we can get a detailed summary of available reviews for all terms of interest to us. For example, we can do this for the relationships between established PubMed Interfaces and interface features:

figures/PubMedExtensions_vs_Features.png
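
One way to picture a context is as an extra filter combined with every query before it is sent to PubMed. The sketch below assumes this is done by AND-ing the query with a PubMed publication-type filter; with_context and META_ANALYSIS are hypothetical names, and RDoQ may build its queries differently.

```python
# PubMed's publication-type tag [pt] restricts results to a kind of article.
META_ANALYSIS = "Meta-Analysis[pt]"


def with_context(query: str, context_filter: str) -> str:
    """Restrict a PubMed query to a context by AND-ing it with a filter."""
    # Parentheses keep any OR inside the original query from escaping the filter.
    return f"({query}) AND ({context_filter})"
```

For example, with_context("Bilder Robert [FAU] OR Bilder R [AU]", META_ANALYSIS) limits that author query to meta-analyses.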

Another feature RDoQ provides is the ability to download any table like this as either a spreadsheet or a graph in VUE format. VUE is a graph editor and information visualization tool; it can be either downloaded or run as an applet in a new browser window.
figures/PubMedExtensions_vs_Features_Graph.png

In the downloaded graph, all links are active and can be used to retrieve PubMed records.
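
As a rough picture of the spreadsheet download, an association table can be written out as comma-separated values that any spreadsheet program opens. This is a sketch only: table_to_csv is a hypothetical name, and the layout (column labels in the first row, row labels in the first column, 0 for missing cells) is an assumption rather than RDoQ's actual export format.

```python
import csv
import io
from typing import Dict, Sequence, Tuple


def table_to_csv(
    row_labels: Sequence[str],
    col_labels: Sequence[str],
    counts: Dict[Tuple[str, str], int],
) -> str:
    """Render an association table as CSV text, one line per row label."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    # Header row: an empty corner cell, then the column labels.
    writer.writerow([""] + list(col_labels))
    for r in row_labels:
        # Assumption: a pair with no recorded count is written as 0.
        writer.writerow([r] + [counts.get((r, c), 0) for c in col_labels])
    return buf.getvalue()
```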

Exploring Hierarchies



RDoQ can handle not just sets of terms, but hierarchies of terms. This permits it to analyze vocabularies, lexicons, taxonomies (is-a hierarchies), part-of hierarchies, and even some kinds of ontologies. For example, one of the predefined sets of terms is a hierarchy of 330 anatomical regions in the human brain used by PubBrain.

RDoQ also includes many hierarchies from MeSH, the Medical Subject Headings "ontology" of keywords used by PubMed. For example, if we use the MeSH hierarchy of neurotransmitter receptor terms we can find a comprehensive picture of associations in the literature between neuroanatomical regions and neurotransmitter receptors. RDoQ can adjust the viewing scale also, so we can get two images that show associations between regions and receptors: an overall picture (from 30,000 feet, in 1-point font) and an up-close picture (from 3 feet, with 14-point font):

PubBrain_Anatomical_region_hierarchy_vs_MeSH_Neurotransmitter_receptor_hierarchy.jpg
PubBrain_Anatomy_vs_MeSH_Neurotransmitter_receptor_zoom_in.jpg


Sophisticated use: Knowing how to use PubMed and MeSH



To use RDoQ, it helps to be familiar with querying PubMed. There is an extensive online tutorial/reference/help system for PubMed, and the RDoQ query page has links to information about important features. This knowledge is important; being aware of how PubMed interprets queries permits you to find associations (and relevant publications) that otherwise are likely to be missed.

It also helps to know MeSH, the Medical Subject Headings "ontology" used by PubMed. MeSH is a kind of "keyword" index, or set of "indexing terms" or "topics": as articles are entered into PubMed, they are tagged by curators as being relevant to certain MeSH terms, so that one can search articles by topic.

How RDoQ Works



RDoQ asks PubMed for documents matching the queries in your sets of terms, and then analyzes the extent to which these sets intersect. The size of the intersection is used as a measure of association.
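
The intersection idea can be sketched as follows, assuming the document identifiers matching each term have already been retrieved. This is a minimal illustration, not RDoQ's implementation: association_table is a hypothetical name, the min_shared parameter stands in for the relevance level, and skipping cells where a term meets itself is an assumption.

```python
from typing import Dict, Hashable, Set, Tuple


def association_table(
    rows: Dict[str, Set[Hashable]],
    cols: Dict[str, Set[Hashable]],
    min_shared: int = 1,
) -> Dict[Tuple[str, str], int]:
    """Map (row term, column term) to the number of documents they share."""
    table: Dict[Tuple[str, str], int] = {}
    for r_label, r_ids in rows.items():
        for c_label, c_ids in cols.items():
            # Assumption: a term paired with itself is not an association.
            if r_label == c_label:
                continue
            shared = len(r_ids & c_ids)  # size of the intersection
            # Keep only cells that reach the relevance threshold.
            if shared >= min_shared:
                table[(r_label, c_label)] = shared
    return table
```

Passing the same dictionary for rows and cols corresponds to the case where no second set of terms is given.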

Some caveats:
So: be thoughtful in your choice of queries, and avoid loading down both RDoQ and PubMed with work that isn't really useful. Do unto PubMed as you would have it do unto you.

Acknowledgements




Output information here is provided "as is", and with no representation or warranty expressed or implied by any parties.



To print in color (even to PDF), your browser must be set properly:
Firefox: Page Setup → Print Background (Colors & Images)
Safari: Print; then, in print options: Safari → Print backgrounds
IE: Tools → Internet Options → Advanced → Printing → Print Background Colors


NIH_logos

RDoQ

Copyright © 2007–2015 D.S. Parker    All Rights Reserved.