Twenty years ago, Doug Laney, the author of Infonomics, was the first to observe that data had taken on three characteristics: 1) Volume: a vast amount of data was being generated, whether via social media, uploaded photographs, device logs, or, in the case of healthcare, hospital records, IoT, or biomedical research; 2) Velocity: data was being generated with ever-increasing speed; and 3) Variety: as the sources listed above suggest, data was coming from everywhere and encompassed everything. These three V’s came to define the concept we know as Big Data. Today, Big Data refers to the almost unfathomably large amount of information being collected, an amount that has become unmanageable using traditional software or internet-based platforms.
Big Data in Healthcare
The healthcare space is no stranger to the problems created by Big Data, as data-driven and information-based systems have quickly become the new paradigm in American healthcare. Electronic Health Records (EHR), Electronic Medical Records (EMR), Personal Health Records (PHR), insurance records, prescription information, data from genomic experiments, IoT data (including data collected by wearable technology), and research data all help drive an industry on the cusp of explosive growth in data volume.
By 2025, healthcare-related data is projected to have far surpassed media and entertainment, and to have equaled the financial services sector, in growth, at a 36% compound annual growth rate. Adding to the volume of healthcare data are efforts like the National Institutes of Health (NIH) “All of Us” initiative, which aims to collect data from one million or more patients over the next few years in order to build the most diverse health database, in hopes of one day finding information to prevent and treat disease. Given that a single patient generates close to 80MB of imaging and electronic medical record data each year, we can see why such growth is expected.
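To make those figures concrete, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and applying the 80MB-per-patient figure to a one-million-person cohort is an assumption made for the example, not a published estimate for any specific program.

```python
def growth_factor(annual_rate, years):
    """Multiplier on data volume after compounding annual_rate for the given years."""
    return (1 + annual_rate) ** years


def cohort_volume_tb(patients, mb_per_patient_per_year):
    """Yearly data volume for a cohort, in decimal terabytes (1 TB = 1,000,000 MB)."""
    return patients * mb_per_patient_per_year / 1_000_000


# Figures taken from the text above: 36% compound annual growth, ~80MB per patient per year.
CAGR = 0.36
MB_PER_PATIENT = 80

# Assumption for illustration only: a cohort of one million patients.
five_year_multiplier = growth_factor(CAGR, 5)
yearly_tb = cohort_volume_tb(1_000_000, MB_PER_PATIENT)

print(f"Volume multiplier after 5 years at 36% CAGR: {five_year_multiplier:.2f}x")
print(f"One million patients at 80MB each: {yearly_tb:.0f} TB per year")
```

At 36% compounded annually, data volume more than quadruples in five years, which is why storage and analysis approaches sized for today's volume tend to fall behind quickly.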
What good is all that data, though, if we have no way to take action on it? Without a plan to extract meaning from data and turn insights into action, data is nothing but noise. Traditional data collection and analysis tools, however, aren’t effective at organizing, manipulating, and extracting insight from that much information. Healthcare organizations must adopt a data-driven mindset that begins by defining a clear set of goals and objectives to help analysts focus on what data should be mined and what data can be eliminated.
Four Strategies to Adopt a Data-Driven Mindset
Clean data leads to accelerated decision making. Data collection can be messy; information is compiled from multiple outlets, and not all of the data collected is normalized consistently before analysis. Data cleaning removes observations that aren’t relevant, unifies and standardizes formats, and validates the data. Though general-purpose tools such as Excel and scripting languages such as Python can be invaluable, there are also tools on the market today dedicated specifically to this task.
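As a rough illustration of those three cleaning steps (removing irrelevant observations, standardizing formats, and validating values), here is a minimal Python sketch. The field names, accepted date formats, and the plausible heart-rate range are all assumptions chosen for the example, not a clinical standard.

```python
from datetime import datetime

# Assumed input date formats; real feeds may need more.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def parse_date(text):
    """Return the date as ISO 8601 (YYYY-MM-DD), or None if no format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def clean_records(records):
    """Drop unusable rows, standardize formats, validate values, and de-duplicate."""
    cleaned = []
    seen = set()
    for rec in records:
        # Remove observations that cannot be tied to a patient.
        patient_id = str(rec.get("patient_id") or "").strip().upper()
        if not patient_id:
            continue
        # Standardize dates to a single format.
        visit_date = parse_date(str(rec.get("visit_date") or ""))
        if visit_date is None:
            continue
        # Validate: heart rate must be an integer in an assumed plausible range.
        try:
            heart_rate = int(rec.get("heart_rate"))
        except (TypeError, ValueError):
            continue
        if not 20 <= heart_rate <= 250:
            continue
        # Skip rows that become duplicates once formats are unified.
        key = (patient_id, visit_date)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(
            {"patient_id": patient_id, "visit_date": visit_date, "heart_rate": heart_rate}
        )
    return cleaned


raw = [
    {"patient_id": " p001 ", "visit_date": "2023-01-15", "heart_rate": "72"},
    {"patient_id": "P001", "visit_date": "01/15/2023", "heart_rate": 72},    # duplicate
    {"patient_id": "", "visit_date": "2023-02-01", "heart_rate": 80},        # no patient
    {"patient_id": "P002", "visit_date": "02/03/2023", "heart_rate": 900},   # implausible
    {"patient_id": "P003", "visit_date": "not a date", "heart_rate": 65},    # bad date
    {"patient_id": "p004", "visit_date": "03/09/2023", "heart_rate": "88"},
]

for row in clean_records(raw):
    print(row)
```

Of the six sample rows, only two survive. Note that the second row only becomes recognizable as a duplicate after the ID and date formats are unified, which is why standardization runs before de-duplication.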
Understand the scope of work ahead of you. Gaining valuable insight from collected data is tantalizing and can lead organizations to run before they are ready to walk. Without the right tool sets in place, this can not only hinder progress but also deliver faulty analysis.
Employ automation to simplify data analysis. Automation in data analytics can provide insights that might otherwise be unavailable to an organization. Automation tools vary in complexity, from simple scripts that deliver records into a pre-defined data model to more full-service tools that perform exploratory data analysis, feature discovery, model selection, and statistical significance tests. In every case, automation makes data available for analysis quickly.
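At the simple end of that range, a script that delivers records into a pre-defined data model might look something like the following Python sketch. The Encounter model, its fields, and the sample CSV are hypothetical and exist only to illustrate the idea.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class Encounter:
    """Hypothetical pre-defined data model for one hospital encounter."""
    patient_id: str
    department: str
    length_of_stay_days: int


def load_encounters(csv_text):
    """Parse CSV text into Encounter records, normalizing fields on the way in."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        Encounter(
            patient_id=row["patient_id"].strip(),
            department=row["department"].strip().lower(),
            length_of_stay_days=int(row["length_of_stay_days"]),
        )
        for row in reader
    ]


def mean_stay_by_department(encounters):
    """Average length of stay per department, computed from modeled records."""
    totals = {}
    for enc in encounters:
        days, count = totals.get(enc.department, (0, 0))
        totals[enc.department] = (days + enc.length_of_stay_days, count + 1)
    return {dept: days / count for dept, (days, count) in totals.items()}


# Made-up sample export; a real script would read from a file or feed.
SAMPLE_CSV = (
    "patient_id,department,length_of_stay_days\n"
    "P001,Cardiology,3\n"
    "P002,cardiology,5\n"
    "P003,Oncology,7\n"
)

encounters = load_encounters(SAMPLE_CSV)
print(mean_stay_by_department(encounters))
```

Because every record passes through the same model on the way in, downstream analysis can rely on consistent field names and types instead of re-checking each source.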
Visualize data to reduce complexities. Raw data, no matter the source, does not by itself give us the information we seek. Because of how the brain processes information, we must present data visually to gain true insight. Whether through charts, graphs, metrics, KPIs, or dashboards, doing so not only makes understanding and interpreting data faster and easier, it may also surface information and observations that would be nearly impossible to spot in other formats. From cloud provider tools to dashboarding and business intelligence platforms, the market offers organizations numerous options for cutting-edge analysis.
Since Mr. Laney’s original observation many years ago, two more “V’s” have been added to the Big Data equation: Veracity and Value. The final “V”, Value, has become critical for every industry, but even more so for today’s healthcare industry, where Big Data represents a launching pad to significantly transform the landscape. At its core, the healthcare industry exists to keep people healthy. Through intelligent use of collected information, clinicians can make more cost-effective, high-quality decisions, medical researchers can gain previously unknown insight into the causes of diseases, hospitals can optimize operating room staffing based on expected demand, and patients can make more and better-informed decisions about their treatment.
Team RIVA is proud to support the modernization of healthcare systems and to empower agencies like NIH and the Agency for Healthcare Research and Quality (AHRQ) to collect, analyze, and act on healthcare data and trends in support of a healthier tomorrow for all.