LA2K Explorer

HypWeb is a system for developing hypotheses within the CNP (Consortium for Neuropsychiatric Phenomics). HypWeb currently allows specification of hypotheses involving the LA2K Database.

Click here to run the LA2K Hypothesis Space Explorer (in another window)

The LA2K Database

The LA2K Database comes from a study underway in the CNP. It includes results from a broad variety of psychiatric tests or tasks taken by a large set of subjects at UCLA. Although there are many tables in this database, about 50 of them hold the results of a task (or a small set of related tasks). Key tasks among these, each with a corresponding LA2K table, include the following:
LA2K Database Tables
ASRS (Adult Self Report Scale) V1.1 Screener
ACDS (Adult ADHD Clinical Diagnostic Scale) V1.2
BIS (Barratt Impulsiveness Scale)
BPRS (Brief Psychiatric Rating Scale)
Chapman Scales: Infrequency
Chapman Scales: Perceptual Aberrations
Chapman Scales: Physical Anhedonia
Chapman Scales: Social Anhedonia
D-KEFS Verbal Fluency Test -- English
D-KEFS Verbal Fluency Test -- Spanish
Dickman Scale of Functional vs Dysfunctional Impulsivity
HPRS (Eckblad and Chapman Hypomanic Personality Scale)
IVE-R (Eysenck's Impulsivity, Venturesome and Empathy Inventory)
Golden and Meehl's 7-Item Schizoid Scale
Hamilton Psychiatric Rating Scale for Depression
Hopkins Symptom Checklist
LA2K Health Questionnaire
MPQ (Control-Impulsivity items)
Modified Edinburgh Handedness Inventory
MCTQ (Munich ChronoType Questionnaire)
Neurocognitive Measures
SANS (Scale for Assessment of Negative Symptoms)
SAPS (Scale for Assessment of Positive Symptoms)
Scale for Traits that increase risk for Bipolar II Disorder
Spanish Vocabulary
TBI (Traumatic Brain Injury)
TCI Version 9 (Temperament and Character Inventory)
YMRS-C (Young Mania Rating Scale)

Each table has tens of attributes giving values for indicator variables (measures, phenotypes) on these tasks, with the net effect that the database has several thousand measures for each subject, on this battery of about 50 tasks. The intent of this study was to record these measures for two thousand subjects, which was the basis for the name "LA2K".

Groups of Subjects in the LA2K Population

LA2K is a large database with detailed phenotype information about its subject population. It highlights ethnicity as an important factor; although its subject population includes five races, the study was designed to understand Hispanic and non-Hispanic differences. More specifically, the Demographics table includes a number of background variables, such as Gender and Ethnicity.

These variables can be used to subdivide this population into more specific groups. For example, Gender allows us to define two groups of subjects: Female and Male. By also considering Ethnicity, we can define 4 different groups of subjects: {Female,Male} x {Hispanic,non-Hispanic}. If we also want these to be Control subjects (healthy control, with complete data), we can define 4 different groups of control subjects in the same way. We can then study similarities and differences between these groups on the various tests and tasks.
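The way groups multiply as background variables are combined can be sketched in a few lines of Python. This is only an illustration, not the LA2K Explorer implementation: the variable names, the value labels, the `Control` flag, and the sample records are all assumptions made for the example.

```python
from itertools import product

# Hypothetical value sets; the actual coding in the Demographics table may differ.
GENDER = ["Female", "Male"]
ETHNICITY = ["Hispanic", "non-Hispanic"]

def define_groups(**factors):
    """Return one group definition (variable -> value) per combination of factor levels."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*(factors[n] for n in names))]

def members(subjects, group):
    """Select the subjects whose values match every condition in the group."""
    return [s for s in subjects if all(s.get(k) == v for k, v in group.items())]

by_gender = define_groups(Gender=GENDER)                                 # 2 groups
by_gender_ethnicity = define_groups(Gender=GENDER, Ethnicity=ETHNICITY)  # 4 groups
# Restricting to controls adds one fixed condition to each of the 4 groups.
control_groups = [dict(g, Control=True) for g in by_gender_ethnicity]

# Made-up records, for illustration only.
subjects = [
    {"id": 1, "Gender": "Female", "Ethnicity": "Hispanic", "Control": True},
    {"id": 2, "Gender": "Male", "Ethnicity": "non-Hispanic", "Control": False},
    {"id": 3, "Gender": "Female", "Ethnicity": "Hispanic", "Control": False},
]

for g in control_groups:
    print(g, [s["id"] for s in members(subjects, g)])
```

Adding the Control condition keeps the count at four because it fixes a single value rather than introducing a new factor with several levels.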

It is difficult to get perspective on a large database like LA2K without some kind of automated system that helps organize the options, and permits exploratory analysis of the data to get intuition about what it holds.

LA2K Explorer

LA2K Explorer is a service that uses a battery of data exploration tools to provide intuition about rough hypotheses about LA2K data. The LA2K Explorer form looks like this:
As this page indicates, an input query to HypWeb is a hypothesis space -- a rough hypothesis that includes three things: definitions of groups of subjects, hypothesized effects over these groups, and hypothesis exploration methods. From these three things, LA2K Explorer generates a hypothesis web -- a web site that integrates relevant information about the space of hypotheses. The resulting web site is a kind of review or report that includes automatically generated data visualizations.

Sample LA2K Explorer Results

The result of running LA2K Explorer is a set of visualizations -- using standard methods of exploratory data analysis.
For example, we can look at the correlation matrix for the selected variables as a heatmap, with rows and columns clustered by similarity:
Clicking on any image like this opens its PDF (which also includes results of related analyses).

Clicking obtains the PDF behind this image, which also includes correlation matrices for each of the two groups.
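For readers who want to see what sits behind such a heatmap, the following Python sketch computes a Pearson correlation matrix separately for each group. It is an assumption-laden illustration rather than the code LA2K Explorer runs (the service uses an R script): the variable names and data values are invented, and the clustering step that reorders rows and columns is not shown.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (nan if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / sqrt(sxx * syy)

def correlation_matrix(rows, variables):
    """Matrix of pairwise correlations, in the order given by `variables`."""
    cols = {v: [r[v] for r in rows] for v in variables}
    return [[pearson(cols[a], cols[b]) for b in variables] for a in variables]

# Hypothetical variable names and made-up values, for illustration only.
VARIABLES = ["rt", "accuracy", "age"]
group_rows = {
    "Female": [
        {"rt": 400.0, "accuracy": 0.95, "age": 21.0},
        {"rt": 450.0, "accuracy": 0.90, "age": 30.0},
        {"rt": 500.0, "accuracy": 0.85, "age": 25.0},
        {"rt": 550.0, "accuracy": 0.80, "age": 40.0},
    ],
    "Male": [
        {"rt": 420.0, "accuracy": 0.88, "age": 35.0},
        {"rt": 470.0, "accuracy": 0.93, "age": 22.0},
        {"rt": 510.0, "accuracy": 0.81, "age": 28.0},
        {"rt": 530.0, "accuracy": 0.86, "age": 31.0},
    ],
}

for name, rows in group_rows.items():
    print(name)
    for var, line in zip(VARIABLES, correlation_matrix(rows, VARIABLES)):
        print(f"  {var:>8}", " ".join(f"{value:6.2f}" for value in line))
```

Computing one matrix per group mirrors the per-group correlation matrices mentioned for the PDF; a pooled matrix would simply pass all rows at once.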

We can also do the same analysis using the four groups defined above:

Results for the Demographics table on the groups {Female,Male} x {Hispanic,non-Hispanic}

There are about 60 tables we can analyze, however. For example, we can consider the results of the MPQ test, or define more complex groups, such as by mixing Gender and Smoking variables:

Results for MPQ test on the groups {Female,Male} x {Hispanic,non-Hispanic}

Results for the Eysenck test on the groups {Female,Male} x {Non-Smoker (0/day), Light Smoker (1-4/day), and Heavy Smoker (more than 4/day)}
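The smoking categories in this caption amount to a simple binning rule, sketched below in Python. The function name and the idea that the input is a cigarettes-per-day count are assumptions for illustration; how LA2K actually stores smoking data is not stated here.

```python
from itertools import product

SMOKING_LEVELS = ["Non-Smoker", "Light Smoker", "Heavy Smoker"]

def smoking_category(cigarettes_per_day):
    """Bin a daily cigarette count using the thresholds quoted in the caption above."""
    if cigarettes_per_day < 0:
        raise ValueError("cigarettes_per_day must be non-negative")
    if cigarettes_per_day == 0:
        return "Non-Smoker"        # 0/day
    if cigarettes_per_day <= 4:
        return "Light Smoker"      # 1-4/day
    return "Heavy Smoker"          # more than 4/day

# Crossing Gender with the three smoking levels gives 2 x 3 = 6 groups.
groups = list(product(["Female", "Male"], SMOKING_LEVELS))

for count in (0, 3, 10):
    print(count, smoking_category(count))
print(len(groups), "groups")
```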

LA2K Explorer allows definition of still more complex groups, in terms of Age, Height/Weight, School Years, and so on.

Sample Exploration Session

Suppose we want to study the role of ethnicity on response time and accuracy of LA2K subjects. If we want to focus on female subjects, and are also interested in effects of ethnicity, we can use the LA2K Explorer LA2K GROUP DEFINITIONS menu to define the four groups discussed earlier. This menu interface allows us to define 4 columns of features, so the four groups above, {Male,Female} x {Hispanic,non-Hispanic}, correspond to the four columns of checked boxes shown:

In other words, the checked boxes specify the four groups we want: the first column specifies Group 1 (Male, Hispanic) and the last column specifies Group 4 (Female, non-Hispanic). This interface is extremely flexible, and permits very general definition of groups.

With these group definitions, we then specify which effect variables we are interested in. These variables are indicators (field names) in the LA2K database, and they can be chosen from the HYPOTHESIZED EFFECTS OVER THESE GROUPS menu:

Finally we can select exploratory data analysis schemes of interest from the HYPOTHESIS EXPLORATION METHODS menu:

The selections (Histograms, Scatterplots, Parallel Coordinates, Correlation Heatmap, Correlation Ellipses, and Principal Components Analysis) are used in generating results.
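Taken together, the three menus describe one query. A minimal Python sketch of such a query as a data structure follows; the class name, field names, and the effect variable names are hypothetical, and only the method names are taken from the list above.

```python
from dataclasses import dataclass, field

# Method names as listed in the HYPOTHESIS EXPLORATION METHODS menu.
METHODS = (
    "Histograms",
    "Scatterplots",
    "Parallel Coordinates",
    "Correlation Heatmap",
    "Correlation Ellipses",
    "Principal Components Analysis",
)

@dataclass
class HypothesisSpace:
    """One query: group definitions, effect variables, and exploration methods."""
    groups: list                      # each group is a dict of variable -> value
    effects: list                     # field names in the database (hypothetical here)
    methods: list = field(default_factory=lambda: list(METHODS))

    def problems(self):
        """Return a list of human-readable problems; empty means the query is usable."""
        issues = []
        if not self.groups:
            issues.append("no groups defined")
        if not self.effects:
            issues.append("no effect variables selected")
        unknown = [m for m in self.methods if m not in METHODS]
        if unknown:
            issues.append(f"unknown methods: {unknown}")
        return issues

space = HypothesisSpace(
    groups=[
        {"Gender": "Male", "Ethnicity": "Hispanic"},
        {"Gender": "Male", "Ethnicity": "non-Hispanic"},
        {"Gender": "Female", "Ethnicity": "Hispanic"},
        {"Gender": "Female", "Ethnicity": "non-Hispanic"},
    ],
    effects=["response_time", "accuracy"],  # assumed names, not actual LA2K fields
)
print(space.problems())
```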

The page generated by compiling this information starts with a summary of our specification of the space of hypotheses here.

Things are pretty self-explanatory after this.

By clicking on any image you can get the PDF for it, and for related images that are not shown.

For example, histograms showing results for each of the six groups can be obtained by clicking on the histogram result image:

All results are color-coded by group, and the interface lets you choose the colors. LA2K Explorer offers you many ways to explore LA2K data.

Exploration Performance

LA2K Explorer first extracts the relevant information from the LA2K database, based on the specifications for groups and effect variables. It then executes a script in R that performs all the requested tests.
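The extraction step can be pictured with the Python sketch below, which pulls the effect variables for one group out of a small in-memory SQLite database and writes them as CSV for a downstream analysis script. Everything about the schema is an assumption for illustration (the table names `demographics` and `task`, the column names, and the sample rows); the actual LA2K schema and the R script are not described here, so the R step is not shown.

```python
import csv
import io
import sqlite3

def extract(conn, group, effects, task_table, allowed):
    """Return (subject_id, effect...) rows for the subjects matching one group."""
    # Identifiers cannot be bound as SQL parameters, so check them against a whitelist.
    for name in [task_table, *effects, *group]:
        if name not in allowed:
            raise ValueError(f"unknown identifier: {name}")
    select = ", ".join(f"t.{e}" for e in effects)
    where = " AND ".join(f"d.{k} = ?" for k in group) or "1 = 1"
    sql = (
        f"SELECT d.subject_id, {select} FROM demographics d "
        f"JOIN {task_table} t ON t.subject_id = d.subject_id "
        f"WHERE {where} ORDER BY d.subject_id"
    )
    return conn.execute(sql, list(group.values())).fetchall()

def to_csv(label, effects, rows):
    """Serialize one group's rows, tagged with the group label, as CSV text."""
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["group", "subject_id", *effects])
    for row in rows:
        writer.writerow([label, *row])
    return out.getvalue()

# Hypothetical schema and made-up rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE demographics (subject_id INTEGER PRIMARY KEY, gender TEXT, ethnicity TEXT);
CREATE TABLE task (subject_id INTEGER, rt REAL, accuracy REAL);
INSERT INTO demographics VALUES (1, 'Female', 'Hispanic');
INSERT INTO demographics VALUES (2, 'Male', 'Hispanic');
INSERT INTO demographics VALUES (3, 'Female', 'non-Hispanic');
INSERT INTO task VALUES (1, 512.0, 0.91);
INSERT INTO task VALUES (2, 480.0, 0.88);
INSERT INTO task VALUES (3, 530.0, 0.95);
""")

ALLOWED = {"task", "rt", "accuracy", "gender", "ethnicity"}
group = {"gender": "Female", "ethnicity": "Hispanic"}
effects = ["rt", "accuracy"]
rows = extract(conn, group, effects, "task", ALLOWED)
print(to_csv("Female Hispanic", effects, rows))
```

Values are passed as bound parameters while table and column names are checked against a fixed set, since SQL placeholders cannot stand in for identifiers.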

Some caveats:

HypWeb and the HypSpace Hypothesis Space manager

HypSpace is a related system; together HypSpace and LA2K Explorer are the components of HypWeb, the CNP Hypothesis Web system. HypSpace permits definition and editing of hypothesis spaces -- rough descriptions of hypotheses that assert some experimental measure yields different values on different groups of people. Teams of researchers conjecture "effects" -- rough hypotheses about several groups of LA2K subjects (such as Female Hispanic subjects) on the LA2K database -- and can use HypSpace to manage information about hypotheses concerning these groups.

Currently HypWeb can both store this information and perform automated data exploration using LA2K Explorer. The visual presentations of this data described above are linked to a browsable report with references to relevant literature. This report is actually a web site -- a hypothesis web -- that can subsequently serve as a hub for collaborative hypothesis development.


Output information here is provided "as is", and with no representation or warranty expressed or implied by any parties.

To print in color (even to PDF), your browser must be set properly:
Firefox: Page Setup → Print Background (Colors & Images)
Safari: Print; then, in print options: Safari → Print backgrounds
IE: Tools → Internet Options → Advanced → Printing → Print Background Colors


Consortium for Neuropsychiatric Phenomics

UCLA Semel Institute, Room C8-849    760 Westwood Plaza, CA 90095
Phone: 310-825-9474    Fax: 310-825-2850    E-mail:

Copyright © 2007–2009 Consortium for Neuropsychiatric Phenomics    All Rights Reserved.