Shared capabilities and resources within MHRN improve the speed, efficiency, and generalizability of mental health research. MHRN has built upon the well-established infrastructure of the HMO Research Network to speed and streamline multi-site studies in mental health clinical and health services research.

The Virtual Data Warehouse (VDW) facilitates multi-site research while protecting patient privacy and proprietary health practice information. Originally developed by the HMO Cancer Research Network, the VDW now supports studies of cancer, drug safety, cardiovascular disease, mental health, and more. Member sites agree on which data to make available for research and derive standard definitions and formats, but each site maintains control of its own data via a “distributed” or “federated” model. That is, the VDW is not a central database. Instead, administrative, clinical, and claims data are translated to a common set of data standards at each site.
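
As a rough illustration of what "translated to a common set of data standards at each site" can mean in practice, the sketch below maps two differently shaped local records onto one shared structure. It is written in Python for readability only; the field names, code values, and the CommonDemographics structure are hypothetical and are not taken from the actual VDW specifications.

```python
# Illustrative sketch only: field names and code mappings are hypothetical,
# not the actual VDW specification.
from dataclasses import dataclass


@dataclass(frozen=True)
class CommonDemographics:
    """A made-up shared record layout that every site maps into."""
    person_id: str
    birth_year: int
    sex: str  # "F", "M", or "U" (unknown) in this made-up standard


# Each site keeps its own lookup from local codes to the shared codes.
SITE_A_SEX_CODES = {"1": "M", "2": "F"}
SITE_B_SEX_CODES = {"male": "M", "female": "F"}


def site_a_to_common(row: dict) -> CommonDemographics:
    """Site A stores a full date of birth as 'YYYY-MM-DD' and numeric sex codes."""
    return CommonDemographics(
        person_id=row["mrn"],
        birth_year=int(row["dob"][:4]),
        sex=SITE_A_SEX_CODES.get(row["gender_cd"], "U"),
    )


def site_b_to_common(row: dict) -> CommonDemographics:
    """Site B stores the birth year directly and spells out sex as text."""
    return CommonDemographics(
        person_id=row["member_id"],
        birth_year=int(row["birth_year"]),
        sex=SITE_B_SEX_CODES.get(row["sex"].lower(), "U"),
    )


# The local records look different, but the mapped records share one layout.
a = site_a_to_common({"mrn": "A001", "dob": "1970-05-02", "gender_cd": "2"})
b = site_b_to_common({"member_id": "B17", "birth_year": 1985, "sex": "Male"})
```

Because each mapping function lives at its own site, the local source data never have to leave that site for the shared layout to exist.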

Building upon the HMORN distributed data model, MHRN has developed a comprehensive distributed warehouse of computerized records data on mental health diagnoses and treatments at our participating sites. This data resource increases the efficiency of ongoing and future research while protecting patient privacy. Go to MHRN Resources to learn more about the specific resources developed by MHRN.

The VDW is a cornerstone of our collaborative research, protecting privacy and fostering standardization.

How It Works

Governance and Management

  • HMORN Board: Provides overall policy and direction setting about content, resources, and access.
  • VDW Operations Committee (VOC): Manages development activities across the HMORN and provides technical input.
  • VDW Data Area Workgroups: Define, maintain, and interpret data file specifications, propose new variables, identify site-specific issues with data standards, and provide scientific input for each data area.
  • VDW Implementation Group (VIG): Site data managers and others who extract data from local systems, convert it to standard VDW structures, ratify the data specifications, and share best practices.

Financing and Staff

  • VOC staff support is funded by a modest annual contribution that HMORN members make to a common operations fund. Each member site also supports the analysts who map local data to the common format, as well as the PIs and analysts who serve on Data Area Workgroups and the VIG.


  • We use published data standards where available (e.g., CPT-4) and create our own when necessary.

Hardware and Software

  • Each site needs hardware and software (mainly SAS®) to store, retrieve, process, and manage datasets. VDW files are kept separate from health plan files. We have found that research activities are better supported when a specialized research data warehouse is maintained at each member's center.


  • Sites contribute limited data documentation (e.g., the source of variables and any variation) to a password-protected Web portal.

Quality Control

  • Periodic checks look at ranges, cross-field agreement, implausible data patterns, and cross-site comparisons.
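
The kinds of checks listed above can be sketched roughly as follows. This is a minimal Python illustration, not the network's actual quality-control programs (which the source does not describe in detail); the record fields, the birth-year bounds, and the tolerance used for the cross-site comparison are all assumptions made for the example.

```python
# Illustrative sketch only: records, field names, and thresholds are hypothetical.
from datetime import date
from statistics import median


def range_errors(rows, field, low, high):
    """Range check: return rows whose value for `field` is outside [low, high]."""
    return [r for r in rows if not (low <= r[field] <= high)]


def cross_field_errors(rows):
    """Cross-field agreement: return rows where enrollment ends before it starts."""
    return [r for r in rows if r["enr_end"] < r["enr_start"]]


def cross_site_outliers(site_rates, tolerance):
    """Cross-site comparison: return sites whose rate is far from the median rate."""
    center = median(site_rates.values())
    return sorted(s for s, rate in site_rates.items() if abs(rate - center) > tolerance)


rows = [
    {"id": 1, "birth_year": 1975, "enr_start": date(2010, 1, 1), "enr_end": date(2012, 12, 31)},
    {"id": 2, "birth_year": 1875, "enr_start": date(2011, 1, 1), "enr_end": date(2011, 6, 30)},
    {"id": 3, "birth_year": 1990, "enr_start": date(2015, 6, 1), "enr_end": date(2014, 6, 1)},
]

bad_range = range_errors(rows, "birth_year", 1900, 2025)   # flags record 2
bad_dates = cross_field_errors(rows)                        # flags record 3
outliers = cross_site_outliers({"A": 0.10, "B": 0.12, "C": 0.45}, tolerance=0.2)
```

A flagged record or site is a prompt for review, not proof of an error; a site may differ from the others for legitimate reasons such as its patient population.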

Specification Update Procedures

  • Specification and standards updates are developed by Data Area Workgroups. The VOC and VIG review them and set implementation milestones and target dates.

Available Data

  • Each institution’s VDW data remain at its own site until a study-specific need arises. Only the minimum necessary data are extracted, and only after ethical, contractual, and HIPAA requirements are met.

Data Domains

  • Laboratory Results
  • Enrollment and Demographics
  • Cancer
  • Pharmacy
  • Utilization
  • Vital Signs and Social History
  • Census

How Researchers Use the VDW

  1. Work with collaborators to develop the study protocol.
  2. Obtain relevant regulatory, contractual and ethical approvals.
  3. Develop a SAS analytic program at the lead study site.
  4. Run the program at each project site against the local VDW and return project-specific datasets (often aggregate) to the lead site for data pooling. This process may be iterative, depending on the complexity and availability of the data within the VDW.
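
Steps 3 and 4 can be pictured with the short sketch below: one program is written once, run separately against each site's local data, and only the aggregate results are sent back for pooling. The actual VDW programs are written in SAS; Python is used here purely for illustration, and the record layout, the dx_group field, and the function names are hypothetical.

```python
# Illustrative sketch only: the real analytic programs are SAS, and the
# record layout and diagnosis groups below are hypothetical.
from collections import Counter


def site_program(local_rows):
    """Runs at each site against its local data; returns aggregate counts only."""
    return dict(Counter(r["dx_group"] for r in local_rows))


def pool(site_results):
    """Runs at the lead site; combines the aggregates returned by each site."""
    total = Counter()
    for result in site_results.values():
        total.update(result)  # adds counts group by group
    return dict(total)


# Person-level rows stay at each site; only the counts leave.
site_a_rows = [
    {"person_id": "A1", "dx_group": "depression"},
    {"person_id": "A2", "dx_group": "depression"},
    {"person_id": "A3", "dx_group": "anxiety"},
]
site_b_rows = [
    {"person_id": "B1", "dx_group": "depression"},
    {"person_id": "B2", "dx_group": "bipolar"},
]

returned = {"A": site_program(site_a_rows), "B": site_program(site_b_rows)}
pooled = pool(returned)
```

In this sketch the lead site never sees a person_id, which mirrors the idea of returning aggregate, project-specific datasets instead of the underlying records.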



Challenges

  • Underlying data are collected for treatment, payment, and operations, not for research.
  • Source data vary substantially within and across sites.
  • Health plans continually change their information systems, often requiring adaptation or re-implementation at sites.
  • Sharing data beyond project collaborators is complicated for technical, regulatory, and political reasons.
  • Maintaining these processes takes money. Project-specific grant funding does not support the level of cross-site and cross-project upkeep and knowledge sharing that is needed.

It takes time to:

  • Agree on the need for a new variable or data area.
  • Develop clear specifications to guide implementers and end-users.
  • Implement new variables at each site.
  • Verify and document the implementations.
  • Consult with users throughout.



The VDW is an example of a distributed (or federated) data-sharing model based on electronic clinical, claims, and administrative health care data. It is applicable to multi-site health services and population health research. With planning and ongoing funding, it yields data across institutions and over time. Benefits for multi-site analysis include:

  • Improved data efficiency, accuracy, and completeness.
  • Analytical precision plus patient and institutional protections.
  • More generalizable results.