Big Data in Diagnostics

Big Data is top of mind in many sectors of healthcare. Vast computing power is enabling us to analyze diverse data as never before. This analysis can drive innovations in the design of new technologies and assays for diagnostics, which, in turn, could have a big impact on health outcomes, cost of care, and quality of care.

Molecular Diagnostics

The molecular diagnostics segment remains the fastest-growing category within the in vitro diagnostics industry, driven by technology advancements and personalized medicine initiatives.

The Next Molecular Generation

Technologies based on gene expression (mRNA), single nucleotide polymorphisms (SNPs), DNA methylation, microRNA and next-generation sequencing (NGS) hold substantial promise. Molecular assays are used to diagnose genetic abnormalities, are increasingly effective in diagnosing infectious diseases, and play an important role in precision medicine, particularly in oncology. Companies active in this area include Genomic Health, Myriad Genetics, Foundation Medicine, Veracyte and Exact Sciences.

Proteomics: The “Other ‘omics”

Genes are the blueprint and give clues to possible future diseases. Proteins give insights about real-time health status. A caterpillar and a butterfly have the same genes, but different protein expression. Advances in proteomics have resulted in several commercial multiplex assays (e.g., the Biodesix assay in lung cancer). Innovations on the horizon have the potential to revolutionize diagnostics. For example, SomaLogic plans to use its biomarker discovery platform to develop a menu of health insights.

The Role of Big Data in Medicine

The essence of “Big Data” is sophisticated statistical techniques applied to giant data sets. The resulting algorithms have led to objective health profiles and better predictive models. These models have been applied in precision medicine initiatives that personalize selection and/or implementation of a care pathway for an individual patient, based on a more accurate diagnosis.

Outcomes and Financial Impact

Data is central to healthcare, whether it’s used by payors, hospitals, physicians, clinical trials or patients themselves. Hosting, transfer, storage and security of private health information remain key concerns. Fueled by CMS-mandated use of electronic medical records (EMRs) and the need to understand cost drivers to cope with value-based payor contracting, the volume and value of health data are booming. Between imaging and EMR data, a single patient generates ~80 megabytes of data each year. The average hospital generates 665 terabytes of data. To put that number into perspective, 1 terabyte of data approximates 2,767 copies of the 32-volume Encyclopedia Britannica.
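To make the scale comparison concrete, the figures quoted above can be combined in a quick calculation. This is only a back-of-the-envelope sketch: the constant names are ours, and the conversion of 1 terabyte to 1,000,000 megabytes (decimal units) is an assumption, since the unit convention behind the quoted figures is not stated.

```python
# Figures quoted in the text above.
PATIENT_MB_PER_YEAR = 80            # ~80 megabytes per patient per year
HOSPITAL_TB = 665                   # data generated by the average hospital
BRITANNICA_COPIES_PER_TB = 2_767    # copies of the 32-volume set per terabyte
VOLUMES_PER_COPY = 32

# Assumption (not stated in the text): decimal units, 1 TB = 1,000,000 MB.
MB_PER_TB = 1_000_000

# Hospital-level data expressed as encyclopedia copies and volumes.
hospital_copies = HOSPITAL_TB * BRITANNICA_COPIES_PER_TB
hospital_volumes = hospital_copies * VOLUMES_PER_COPY

# One patient's yearly data expressed the same way.
patient_copies = PATIENT_MB_PER_YEAR / MB_PER_TB * BRITANNICA_COPIES_PER_TB
patient_volumes = patient_copies * VOLUMES_PER_COPY

print(f"Hospital: {hospital_copies:,} copies ({hospital_volumes:,} volumes)")
# Hospital: 1,840,055 copies (58,881,760 volumes)
print(f"Patient-year: {patient_copies:.2f} copies ({patient_volumes:.1f} volumes)")
# Patient-year: 0.22 copies (7.1 volumes)
```

Under these assumptions, the average hospital's data corresponds to roughly 1.8 million copies of the encyclopedia, while one patient's yearly data amounts to about seven volumes.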

This data has significant clinical, financial, and operational value for the healthcare industry. In an April 2013 report, McKinsey estimated that new value pathways enabled by Big Data analysis could be worth more than $300 billion annually in reduced costs alone. Add to that value equation such possible benefits as:

  • Gains in cost efficiency and fighting fraud
  • Reduced mortality, thanks to the ability to continuously monitor data dashboards and intervene in a crisis
  • Disease interception, thanks to wellness promotion and efficient medical follow-up
  • Personalized medicine directed by precision diagnostics

Big Data in the Innovation Process

When we think about biomarkers, we generally think of genes, proteins, or some other substance present in blood or other biological samples. Big Data can help in new biomarker discovery. This is important because simple biomarkers have already been found and developed into lab tests. Advances in computing power enable analysis of genetics, proteomics, digital data from electronic medical records and claims data to create sophisticated decision support models. The expectation is to improve, perhaps dramatically, on the performance of existing tests. That could mean earlier detection of cancer, better risk stratification of people on the brink of diabetes, or prediction of the effectiveness of expensive specialty drugs.

Gap and Opportunity Analysis

A new product initiative starts with an estimate of market size, investigation of clinical relevance, and identification of market needs. The need could be a gap in a care pathway, excessive cost, or high morbidity/mortality. The test result should drive action by the treating physician. It is also important to define the appropriate patient population and where in the clinical pathway a test should be used.

Big Data in Biomarker Discovery

Big Data is most applicable in biomarker identification and algorithm development. Biomarker discovery starts with a clearly defined research question.

Specimens often come from public biobanks or stored clinical trial samples where the patient outcome is known and carefully documented.

Samples are run in the lab to quantify biomarkers. The goal of the bioinformatics analysis is to apply sophisticated statistical techniques to a Test Set to find a combination of biomarker and clinical data, expressed as an algorithm, that can predict the outcome. That algorithm is then tested against a Validation Set to ensure that it delivers good results with an independent set of samples.
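The Test Set / Validation Set workflow can be illustrated with a deliberately small sketch. Everything here is hypothetical: the data are synthetic, there is a single made-up biomarker, and the "algorithm" is just a cutoff value chosen to maximize accuracy. Real discovery work combines many biomarkers and clinical variables with far more sophisticated statistics. Note that the set the text calls the Test Set is often called the training set in machine learning.

```python
import random


def make_synthetic_samples(n, seed):
    """Hypothetical samples: (biomarker level, known outcome), 1 = disease."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        outcome = rng.randint(0, 1)
        # Assumption: diseased samples have a higher average biomarker level.
        level = rng.gauss(2.0 if outcome else 0.0, 1.0)
        samples.append((level, outcome))
    return samples


def accuracy(samples, threshold):
    """Fraction of samples where 'level > threshold' matches the outcome."""
    correct = sum((level > threshold) == bool(outcome)
                  for level, outcome in samples)
    return correct / len(samples)


def fit_threshold(samples):
    """Choose the cutoff that maximizes accuracy on the given samples."""
    candidates = sorted(level for level, _ in samples)
    return max(candidates, key=lambda t: accuracy(samples, t))


samples = make_synthetic_samples(400, seed=0)

# Split once, before any fitting, so the Validation Set stays independent.
test_set, validation_set = samples[:200], samples[200:]

threshold = fit_threshold(test_set)                  # fit on the Test Set only
fit_accuracy = accuracy(test_set, threshold)
validation_accuracy = accuracy(validation_set, threshold)

print(f"cutoff={threshold:.2f}  "
      f"test-set accuracy={fit_accuracy:.2f}  "
      f"validation accuracy={validation_accuracy:.2f}")
```

The key design choice is that the cutoff is chosen using the Test Set alone. Accuracy on the Validation Set is the more honest estimate of how the algorithm would perform on new patients, and it is typically somewhat lower than the accuracy seen during fitting.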

Assay Development

Assay development focuses on standardization, reproducibility, dynamic range, limit of detection and other aspects of analytical validation.

Clinical Utility Validation

Prospective trials are generally used to test clinical utility of a new assay. This is the gold standard that payors, health systems, and physicians have come to expect, if not require. These trials should be designed to reveal how physicians would use the results and evaluate the impact on the care pathway and treatment costs.


Key issues for commercial success include obtaining regulatory approval, securing coding, winning positive coverage decisions and reasonable reimbursement rates, publishing in peer-reviewed journals, and being incorporated into clinical guidelines. Once robust evidence is published, adoption by health systems and physicians accelerates.

Bringing It All Together

  • Cost is the major driver for new product and technology adoption. Economic buyers demand evidence of cost-savings or cost-avoidance in addition to health outcomes data.
  • Big Data makes sense out of data from many sources. The resulting multiplex tests often offer more complete information and better performance characteristics than traditional assays.
  • Actionable insights should inform better treatment decisions and offer the promise of better health outcomes at lower cost.

About the Authors

Carrie Mulherin is CEO of Focus Marketing, offering consulting services in planning, market research, commercial execution and training to growing companies.

Emery Stephans is Founder and CEO of Enterprise Analysis Corporation (EAC), a broad-based consulting practice providing strategic consulting, business development and research services to medical, life-science and animal health organizations worldwide.
