I have enjoyed the sport of running for many years, but I am a virtual neophyte when it comes to the analytics of the activity. Several years ago, I invested in a simple, wrist-worn GPS device that doubles as a heart-rate monitor, and I have been routinely logging my runs via a web-based application. It is only recently, however, that I have taken to analyzing my progress, and doing so has allowed me to set goals to try to improve my outcomes. As I set about monitoring my PRs, it occurred to me that I had been contributing to the proliferation of the internet of things: the connection of a myriad of physical, smart objects to the internet. This particular, personal connection allows for access to remote sensor data. The promise of this type of analytics has led to a vision of a global infrastructure of networked physical objects with unprecedented connectivity and the gathering of massive amounts of data. The internet of things is ubiquitous, and its implications for healthcare data, with which I have been closely involved over the past several years, have not escaped me. Consider the types of data the medical community is on the cusp of collecting now, from infant monitors to insulin injection and prescription pill trackers, and what will one day be collected outside the hospital on a personal level via the internet of things. This technology represents a large and highly diverse volume of data, driven by a large number of potential participants and a wide range of measured variables. Digitized medical data has become so voluminous that it threatens to become unmanageable. As with my wrist-worn running gadget, the consequence of a wearable, consumer-level, health-data-collecting device is the propagation of a large amount of “big data” that is complex, diverse, and timely. Debate concerning big data itself has become the dernier cri, and the subject has drawn both sanguineness and condemnation.
But will we ever really be able to do anything with the data, particularly from a healthcare standpoint, and do we really understand how it might change the world?

Understanding begins with a definition. Big data is data that exceeds the storage and processing capacity of conventional database systems. Think in these terms: the total accumulation of data over the past two years, a zettabyte, dwarfs the prior record of human civilization (Shaw, 2014). Because it is so unwieldy, new procedures must be found to process it (Dumbill, 2012). An important distinction is that big data not only exists in massive amounts but is also highly dynamic: it comes in many different forms (structured, unstructured, and semi-structured), its content is constantly changing, it exists in many locations throughout electronic space, and it is stored in perpetuity. It is not intended to answer a single question; rather, the queries against it are protean. The term “big data” was coined by NASA researchers Michael Cox and David Ellsworth, who wrote in “Application-Controlled Demand Paging for Out-of-Core Visualization,” published in the proceedings of the IEEE Visualization ’97 conference, that “data sets are generally quite large, taxing the capacities of main memory, local disk, and even remote disk. We call this the problem of ‘big data’” (Cox & Ellsworth, 1997). Fast-forward to 2008, when researchers Bryant, Katz, and Lazowska estimated that big data’s revolutionary effect would equal that of the advent of search engines and the manner in which technology has transformed how we access information. “Big data computing,” they write, “can and will transform the activities of companies, scientific researchers, medical practitioners, and our nation’s defense and intelligence operations” (Bryant, Katz, & Lazowska, 2008). Considering the size and breadth of the healthcare sector, one realizes that it is an industry that has historically generated large amounts of data. 
In the last five years alone, it has witnessed an explosion of information from sources as diverse as decision support systems, electronic health records, sensor and monitoring systems, and social media. “Data from the U.S. healthcare system alone reached, in 2011, 150 exabytes,” report Raghupathi and Raghupathi. “At this rate of growth, big data for U.S. healthcare will soon reach the zettabyte (10^21 gigabytes) scale and, not long after, the yottabyte (10^24 gigabytes)” (Raghupathi & Raghupathi, 2014). It will also increase costs: Burke reports that 2013 spending on big data was projected to top U.S. $34 billion (Burke, 2013). However, this cost is expected to be offset by the significant benefits that healthcare organizations stand to realize. It is estimated that big data analytics could enable more than $300 billion in savings per year in U.S. healthcare, “two thirds of that through reductions of approximately 8% in national healthcare expenditures” (Raghupathi & Raghupathi, 2014, p. 2). The majority of the savings is projected to come in the areas of clinical operations, R&D, public health, decision support, and device/remote monitoring.
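
To put these units in perspective, a small sketch can help. The Python below is only an illustration under stated assumptions: it uses the decimal (SI) definitions, in which a zettabyte is 10^21 bytes and a yottabyte is 10^24 bytes, and the function names and the idea of counting doublings are my own devices rather than anything taken from the cited sources.

```python
# Illustrative sketch only: the helper names are hypothetical, and the
# unit sizes follow the decimal (SI) definitions, expressed in bytes.
UNIT_BYTES = {
    "gigabyte": 10**9,
    "terabyte": 10**12,
    "petabyte": 10**15,
    "exabyte": 10**18,
    "zettabyte": 10**21,
    "yottabyte": 10**24,
}


def convert(value, from_unit, to_unit):
    """Convert a quantity of data from one decimal unit to another."""
    return value * UNIT_BYTES[from_unit] / UNIT_BYTES[to_unit]


def doublings_to_reach(start_bytes, target_bytes):
    """Count how many times start_bytes must double to reach target_bytes."""
    count = 0
    size = start_bytes
    while size < target_bytes:
        size *= 2
        count += 1
    return count


# Figure quoted in the text: 150 exabytes of U.S. healthcare data in 2011.
us_healthcare_2011 = 150 * UNIT_BYTES["exabyte"]

share_of_zettabyte = convert(150, "exabyte", "zettabyte")
doublings = doublings_to_reach(us_healthcare_2011, UNIT_BYTES["zettabyte"])

# Figure quoted in the text: two thirds of an estimated $300 billion in savings.
estimated_savings = 300_000_000_000
two_thirds_of_savings = estimated_savings * 2 // 3

print(f"150 exabytes is {share_of_zettabyte} of a zettabyte")
print(f"Doublings needed to reach a zettabyte: {doublings}")
print(f"Two thirds of the savings estimate: ${two_thirds_of_savings:,}")
```

By this arithmetic, 150 exabytes is 0.15 of a zettabyte, so the 2011 volume would need to double only three times to cross the zettabyte mark, and two thirds of the savings estimate comes to $200 billion per year.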

The potential savings associated with healthcare data analytics make it an attractive pursuit. However, in order to realize these benefits, organizations seeking to leverage data provisioning must overcome a number of challenges. An introduction to big data considerations in healthcare can only touch upon some of these demands, but specific solutions must be sought in order to realize value from the data. It is important for organizations to know the history of a given set of data and, ultimately, how trustworthy it is. Data is useless if its validity and reliability cannot be confirmed, so it is important to understand the background of the information and the conditions under which it was collected. Organizations must also be prepared to allocate the necessary resources and expertise to manage big data analytic initiatives. Solutions must be scalable, so that they can handle massive growth in data, and interoperable with other systems, so that they can exchange and interpret shared information. Addressing these and other considerations is the point of departure for changing the way healthcare decisions are made and, potentially, redefining the industry.

 

 

Bryant, R., Katz, R. H., & Lazowska, E. D. (2008). Big-Data Computing: Creating Revolutionary Breakthroughs in Commerce, Science and Society.

Burke, J. (2013). Health analytics: Gaining the insights to transform health care (Vol. 69). John Wiley & Sons.

Cox, M., & Ellsworth, D. (1997, October). Application-controlled demand paging for out-of-core visualization. In Proceedings of the 8th Conference on Visualization ’97 (pp. 235-ff). IEEE Computer Society Press.

Dumbill, E. (2012). Big data now: Current perspectives from O’Reilly Media (2012 ed.). Sebastopol, CA: O’Reilly Media.

Raghupathi, W., & Raghupathi, V. (2014). Big data analytics in healthcare: Promise and potential. Health Information Science and Systems, 2(1), 3.

Shaw, J. (2014, March). Why “Big Data” Is a Big Deal. Understanding big data leads to insights, efficiencies, and saved lives. Retrieved from http://harvardmagazine.com/2014/03/why-big-data-is-a-big-deal