In 2008, engineers at Google launched a web tool known as Google Flu Trends, after observing that searches on flu symptoms spiked as people tried to diagnose themselves before an outbreak was officially registered. User location was determined by IP address, and search volumes were compared to a baseline so that spikes, which could potentially predict an outbreak, could be detected. The project was later expanded to 25 countries, from Argentina to Uruguay.
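To make the idea of comparing search volumes to a baseline concrete, here is a minimal sketch in Python. It is an illustration only and not Google's actual method: the function name `detect_spikes`, the eight-week trailing baseline, the threshold of two standard deviations and the weekly counts are all assumptions made for this example.

```python
from statistics import mean, stdev


def detect_spikes(weekly_counts, baseline_weeks=8, threshold=2.0):
    """Return the indices of weeks whose count stands out from the baseline.

    A week is flagged when its count lies more than `threshold` standard
    deviations above the mean of the preceding `baseline_weeks` weeks.
    """
    spikes = []
    for i in range(baseline_weeks, len(weekly_counts)):
        window = weekly_counts[i - baseline_weeks:i]
        mu, sigma = mean(window), stdev(window)
        if sigma == 0:
            # Perfectly flat baseline: any increase counts as a spike.
            if weekly_counts[i] > mu:
                spikes.append(i)
        elif (weekly_counts[i] - mu) / sigma > threshold:
            spikes.append(i)
    return spikes


# Hypothetical weekly counts of flu-related searches in one region.
counts = [100, 104, 98, 101, 99, 103, 97, 102, 100, 180, 240]
spike_weeks = detect_spikes(counts)
print(spike_weeks)
```

A trailing window is used so that the baseline follows slow seasonal drift, but it also means that a sustained surge quickly becomes part of the baseline itself.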

Initially, there appeared to be a correlation between search spikes and actual outbreaks as recorded by the Centers for Disease Control and Prevention (CDC) through laboratories monitoring the flu. The signal was strengthened by including more search terms that followed the same pattern. Unfortunately, as the project progressed, this association proved increasingly inaccurate, and Google Flu Trends started to overestimate flu prevalence.1 Eventually, the project was shut down in 2015 because of its poor results when compared to CDC data.2

Inaccuracy caused by changes in people’s search habits

A post-mortem report of the project identified some reasons for its failure. A key reason was the change in search habits: searches by people merely interested in flu outbreaks were indistinguishable from those by people trying to diagnose themselves, causing the web tool to overestimate the number of cases. Additionally, Google’s predictive search, based on both the entered text and the user’s internet history, added to the confusion, as users’ search terms were no longer entirely their own. Finally, the results generated by Google’s search algorithm were continuously modified to generate more revenue rather than always giving users the information they sought, thereby further modifying user search behaviour.

However, scientists have persisted and come up with a new, more accurate model called ARGO (AutoRegression with Google Search Data). ARGO reduces the noise in the data by self-correcting for changes in how people search and by excluding terms that only incidentally correlate with flu surges, amongst other modifications.3 It was reported that the estimates generated by ARGO were more accurate than those of Google Flu Trends and of other models based on the CDC’s past and current data, and the scientists are optimistic about the use of Big Data in disease tracking.4
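The general idea of an autoregression that also uses search data can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the published ARGO model: it uses a single lagged flu value, a single search term and ordinary least squares, and it refits on a short rolling window as one possible way of adapting to changing search behaviour. The names `predict_next`, `fit_least_squares` and `solve_linear_system`, the window length and all the numbers are assumptions made for this example.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def fit_least_squares(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) beta = X^T y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict_next(flu, searches, window=6):
    """Estimate this week's flu level before official figures arrive.

    `flu` holds the official figures up to last week; `searches` holds the
    search volume for the same weeks plus the current one. The model
    flu[t] = a + b * flu[t-1] + c * searches[t] is refitted on the most
    recent `window` weeks only, so older search habits drop out of the fit.
    """
    n = len(flu)
    if len(searches) != n + 1:
        raise ValueError("searches must cover one more week than flu")
    start = max(1, n - window)
    rows = [[1.0, flu[t - 1], searches[t]] for t in range(start, n)]
    targets = [flu[t] for t in range(start, n)]
    a, b, c = fit_least_squares(rows, targets)
    return a + b * flu[n - 1] + c * searches[n]


# Hypothetical data: official flu activity and search volume by week.
flu = [2.0, 4.1, 4.4, 6.3, 6.6, 8.4, 8.6]
searches = [10, 20, 15, 30, 25, 40, 35, 50]
estimate = predict_next(flu, searches)
print(round(estimate, 2))
```

Anchoring each estimate to the most recent official figure keeps it from drifting far from reality when search volume rises for reasons unrelated to illness, which is the failure described above for Google Flu Trends.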

Rise in use of big data analytics in other areas of healthcare

Beyond such real-time tracking projects, Big Data analytics is also increasingly being employed in other areas of healthcare, such as the monitoring and analysis of vital signs. For example, Google-owned DeepMind intends to create an app that draws on patients’ bloodwork history and alerts healthcare providers when patients are at risk of acute kidney injury.5 Currently, DeepMind is working with three hospitals in the UK, which will provide the data needed to develop the app.

Another project by DeepMind is fitting children at Birmingham Children’s Hospital with the wireless body sensors worn by McLaren F1 drivers. This allows the children to be monitored continuously and automatically, rather than being checked manually by staff and tethered by physical wires in daily life. It also facilitates the rapid detection of a deteriorating condition.6

With big data being used by private companies in such a manner, concerns about user privacy have naturally arisen. The data might not be sufficiently secured or anonymized by companies, leaving it vulnerable to hackers and exposing individuals to invasions of privacy.

Furthermore, even when data is anonymized, groups with undesirable characteristics may face discrimination once they are identified. Another problem is that data protection laws might not be able to keep up with the pace of innovation, leaving companies free to do whatever they see fit with healthcare data for commercial purposes.7 Regardless, the use of big data analytics in healthcare will certainly be an exciting field to watch. MIMS

1. Lazer, David, Ryan Kennedy, Gary King, and Alessandro Vespignani. 2014. “The Parable of Google Flu: Traps in Big Data Analysis.” Science 343 (6167): 1203–5. doi:10.1126/science.1248506.
2. Timmer, John. 2014. “Researchers Warn against the Rise of ‘Big Data Hubris.’” Ars Technica, March 13.
3. Yang, Shihao, Mauricio Santillana, and S C Kou. 2015. “Accurate Estimation of Influenza Epidemics Using Google Search Data via ARGO.” Proceedings of the National Academy of Sciences of the United States of America 112 (47): 14473–78. doi:10.1073/pnas.1515373112.
4. Mole, Beth. 2015. “New Flu Tracker Uses Google Search Data Better than Google.” Ars Technica, October 11.
5. Wakefield, Jane. 2016. “Google given Access to London Patient Records for Research.” BBC, May 3.
6. Birmingham Children’s Hospital. 2015. “Ground-Breaking £1.8million Formula 1 Inspired Research at Birmingham Children’s Hospital.”
7. Mittelstadt, Brent Daniel, and Luciano Floridi. 2016. “The Ethics of Big Data: Current and Foreseeable Issues in Biomedical Contexts.” Science and Engineering Ethics. doi:10.1007/s11948-015-9652-2.