In the BC MHA program, we want to understand how to unleash the potential of big data; how organizations can manipulate and interpret data to improve healthcare service and delivery, including patient experience and the bottom line.
The term “big data” is thrown around more and more frequently.
If you happen to Google the term, you will find over 275,000,000 results. According to SAS.com: “Big data is a term that describes the large volume of data – both structured and unstructured – that inundates a business on a day-to-day basis.”
To better understand how healthcare organizations leverage massive quantities of data, which are often housed in disparate systems, and how these same systems transform these data into something meaningful, I recently spoke with Dr. Larry Fulton, BC MHA Faculty, and Professor of Health Analytics and Management Theory and Organizational Behavior.
Here, Dr. Fulton explains how leading healthcare organizations are tapping into big data in order to drive evidence-based decision-making, and to move towards greater efficiency, as well as how BC MHA students are gaining experience with this important topic.
Emily Raviola, MPH
BC MHA Interim Program Director
Raviola: We hear a lot about big data. How should we understand the importance of this topic in the context of healthcare? How are healthcare organizations able to ‘get a handle’ on all of this data?
Dr. Fulton: Data is inundating all industries. In healthcare, data are so voluminous that getting a handle on them using conventional techniques becomes more than problematic.
Ever since organizations started processing manual records, we have had large amounts of data. It’s just that now, since most of our records are captured in electronic systems, we have the ability to process that data in an automated fashion. We can use supervised and unsupervised methods of data mining to learn from the data and to convert it to intelligence so it can be used to make informed decisions, build predictive models, generate efficiencies, and reduce waste.
We have this ability because machine learning has really come into its own, largely due to the increased availability of computing power. Our laptops now have more power than the early supercomputers did. And we continuously find ways to squeeze more juice out of our machines.
For example, the laptop I have here right now has an NVIDIA graphics processing unit (GPU) that can do matrix mathematics orders of magnitude faster than the core processors traditionally tasked with these operations. Even on individual computers like this laptop, we have learned how to send processing instructions to both the GPU and the separate cores of the processor in ways that drastically reduce processing time for complex operations. And if that additional computing power is insufficient for the task (large-scale image recognition problems, for example), then there is distributed processing, where we can access multiple cores from multiple computers across the cloud. We have the ability to process data on laptops and desktops in ways that would not have even been fathomable just ten years ago.
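The idea of splitting one computation across several cores can be sketched in a few lines. This is a minimal illustration using only Python's standard library (real GPU offloading would instead rely on a library such as CuPy or PyTorch, which is not shown here); the matrix-vector product and the worker count are made up for the example.

```python
# Sketch: distributing the rows of a matrix-vector product across CPU cores.
from concurrent.futures import ProcessPoolExecutor

def row_dot(args):
    """Dot product of one matrix row with a vector."""
    row, vector = args
    return sum(a * b for a, b in zip(row, vector))

def parallel_matvec(matrix, vector, workers=4):
    """Matrix-vector product with each row handled by a pooled process."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(row_dot, ((row, vector) for row in matrix)))

if __name__ == "__main__":
    m = [[1, 2], [3, 4]]
    v = [10, 20]
    print(parallel_matvec(m, v))  # [50, 110]
```

The same pattern scales up: distributed frameworks simply replace the local process pool with cores spread across many machines in the cloud.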
Raviola: Do you have an example of recent work you’ve done where you’ve been able to leverage big data to deliver intelligence and value to a healthcare organization?
Dr. Fulton: One of the things that I have worked on with colleagues has been Medicare fraud detection. Unfortunately, there tends to be a lot of fraud associated with Medicare reimbursements. Medicare has millions of records and millions of variables associated with reimbursement. What is very interesting is that we can leverage big data to find patterns within the records that are highly suggestive of fraud.
And we can do so in different ways. For example, we can just look at patterns or clusters in the data without having any knowledge of what is or what is not fraud. Alternatively, we can train models to learn from known fraud cases to identify other potentially fraudulent cases. Simply put, we can use unsupervised and supervised machine learning methods. With these methods, we are able to forecast other potentially fraudulent claims.
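To make the unsupervised side of this concrete, here is an illustrative sketch (not Dr. Fulton's actual method) that flags unusual reimbursement amounts with a simple z-score rule. The claim amounts are made-up toy values; a production system would use richer features and methods such as clustering or density-based anomaly detection.

```python
# Toy unsupervised anomaly detection on (fictional) claim amounts.
from statistics import mean, stdev

def flag_outliers(amounts, threshold=2.0):
    """Return indices of claims more than `threshold` standard deviations
    from the mean -- a crude anomaly signal standing in for the clustering
    and supervised methods described in the interview."""
    mu, sigma = mean(amounts), stdev(amounts)
    return [i for i, a in enumerate(amounts)
            if abs(a - mu) / sigma > threshold]

claims = [120, 135, 110, 128, 140, 9800, 125, 131]  # one suspicious claim
print(flag_outliers(claims))  # [5]
```

A supervised approach would instead train a classifier on claims already labeled as fraudulent or legitimate, then score new claims with it.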
When we think broadly about healthcare settings, we can use fairly sophisticated modeling to identify patterns in the treatment and administration of health services. We can data mine the Electronic Health Record as well as other systems within our organizations. Doing so allows us to turn data into intelligence and use that intelligence for performance improvement (e.g., development of clinical pathways). All aspects of healthcare are going to be impacted by big data with the mandate of the Electronic Health Record. Healthcare organizations have the data sources, but fundamentally, analyzing this data requires the application of some fairly sophisticated models.
Raviola: Is it safe to say that without the ability to rely on these methods or tools to process and analyze big data, an organization would be overwhelmed?
Dr. Fulton: Absolutely. In fact, using traditional organizational computing power sometimes isn’t sufficient for big data problems. If you have a large enough problem then you may have to leverage distributed processing, using a number of different cores and different machines often in the cloud.
Amazon, for example, offers a web service through which you can buy processing time and send instructions to their core processors, and they will process the data for you. Essentially, they’ll run your algorithms on their machines. Google is another company that provides this service.
Raviola: Clearly it’s a field that is highly quantitative and technological. How should a healthcare professional who is considering earning an MHA think about what they would learn in an MHA program about big data? In terms of this topic, what competencies do you personally think are critical for a healthcare leader to have these days?
Dr. Fulton: It’s essential that healthcare leaders appreciate the significance of big data in evidence-based decision-making. We have these resources, and we need to use them.
In terms of student exposure to big data, it isn’t required for an incoming MHA student to be highly mathematical or technical. When students complete our coursework, they will not only be exposed to many, many tools, but they will also be able to apply them or (most importantly) to know when they should be applied. I personally try to expose students to as many analytical tools as possible within the given time constraints. I provide students with an overview of the available tools, show them how to use them, and, finally, teach the students how to attack problems with those tools.
For example, using Kaggle.com, I typically start MHA students down the analytical path by engaging them in a competition to predict survival on the Titanic. While the competition is certainly not big data, it does allow for students to use an algorithm to support data mining (beginning with basic exploratory data analysis), and it does support the use of methods such as logistic regression, random forests, neural networks, boosting / bagging, etc.
Going into the competition, students already know who survived or perished for a training set of data; however, the test set provided to the students does not contain the survival outcome variable. That survival variable is withheld from the students. After cleaning the data, handling outliers, and creating features, students build models on the training dataset and evaluate the performance of these models internally using methods such as leave-one-out cross-validation. Then students forecast survival on the test set, and the accuracy of their forecasts is assessed. One of the lessons students learn from this process is that overfitting, or making a model excessively complex, can destroy its predictive power.
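The train-and-cross-validate workflow described above can be sketched briefly. This example assumes scikit-learn is installed and uses a tiny made-up dataset rather than the actual Kaggle Titanic data; the features and labels are purely illustrative.

```python
# Toy version of the classroom workflow: fit a logistic regression and
# estimate its accuracy with leave-one-out cross-validation.
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut, cross_val_score

# Hypothetical features: [fare, age]; label: survived (1) or not (0).
X = [[70, 4], [80, 30], [8, 22], [7, 35], [65, 28],
     [9, 40], [72, 19], [8, 54], [60, 2], [7, 31]]
y = [1, 1, 0, 0, 1, 0, 1, 0, 1, 0]

model = LogisticRegression()
# Leave-one-out: fit on n-1 rows, test on the single held-out row, repeat.
scores = cross_val_score(model, X, y, cv=LeaveOneOut())
print(f"LOOCV accuracy: {scores.mean():.2f}")
```

Swapping `LogisticRegression` for a deliberately over-complex model and watching the cross-validated accuracy drop is one direct way to demonstrate the overfitting lesson mentioned above.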
So at the end of the day a student will be able to conduct exploratory data analysis, build models, and use these models to address healthcare challenges. And they will be familiar with a large set of analytical tools and under which circumstances they might be applied.
A lot of students coming out of programs, in my experience, have a limited set of skills to analyze and interpret data. If all you have is a hammer, everything is a nail. Students need more tools for their toolboxes just to build a base level of competency.
Our graduates must be able to understand the utility of big data and the importance of converting this data into intelligence to support evidence-based decision-making. Our MHA students will develop the skills and take away tools useful for converting big data to intelligence, whether they enter the MHA program with an undergraduate degree in mathematics, liberal arts, or another discipline. What is important to know is when it’s appropriate to analyze big data, what tools are available to convert it to intelligence, and what methods are appropriate for the problem.
Raviola: Looking towards the future, how do you see big data impacting healthcare?
Dr. Fulton: Fundamentally, my belief is that organizations that don’t convert big data to intelligence are the organizations that will fail. They will lack the competitive advantage to thrive. Period.
Each element of health care—from the administration side to the clinical side—has large amounts of data, and much of it is never converted to intelligence. Organizations that have leaders who understand how to leverage big data will be the ones to watch.