- by Gary Finnegan

Building the ultimate dengue surveillance tool

Image courtesy of HealthMap

Today’s increasingly connected world offers an opportunity to use digital surveillance tools to gain a true, timely picture of dengue activity around the globe. Data gleaned from social media, online searches, mobile phone tracking, crowdsourcing and more have the potential to fill gaps in epidemiological surveillance information.

“We now have access to more data than ever before,” confirms Dr. Mauricio Santillana from the HealthMap team at Boston Children’s Hospital. “Transforming today’s many diverse data sources into meaningful information and knowledge will give public health officials and the public a complete picture of the current state of dengue activity.”

HealthMap aims to do just that: in addition to showing dengue-related news reports around the globe in near real-time (as it does now), it will soon centralize dengue observations from diverse sources to show multiple aspects of the disease. More than that, the recent Big Data methodological advances made by the HealthMap research team and Harvard researchers on flu activity forecasting will soon allow the system to dynamically learn about the significance of diverse data sources and take that into account in the picture it provides. That training will help ensure it offers the best possible near real-time intelligence on an outbreak.

Machine learning improves intelligence

HealthMap was originally created around a decade ago by a team of researchers, epidemiologists and software developers at Boston Children’s Hospital as a tool for tracking disease outbreaks across the U.S. Unlike similar disease tracking tools, the HealthMap Flu Trends tool, developed during the past two years, now incorporates new sources of insight (including data from Google, Twitter, Wikipedia, crowd-sourced participatory disease surveillance tools and clinicians’ databases) along with simulation techniques from weather forecasting and other fields such as finance.

These simulation techniques use machine learning approaches to train the system to deliver the best results. “Information from traditional flu surveillance systems, such as the Centers for Disease Control and Prevention’s ILINet flu reports, is dynamically incorporated into the flu tracking model every week,” reveals Dr. Santillana. “This allows our models to reassess the role of its various variables along with the predictive power of each data source – or even each individual input variable. It then takes that into account going forward.”
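
As a rough illustration of that weekly reassessment, the sketch below re-weights each data source whenever an official figure arrives. It is not HealthMap's actual model: the inverse-error weighting, the class name DynamicSourceWeights, the source names, the window length and every number in it are assumptions chosen to keep the example small.

```python
from collections import deque


class DynamicSourceWeights:
    """Toy ensemble that re-weights data sources as official reports arrive.

    Hypothetical illustration only; not the HealthMap implementation.
    """

    def __init__(self, sources, window=8):
        # Keep only the most recent `window` absolute errors per source.
        self.errors = {name: deque(maxlen=window) for name in sources}

    def weights(self):
        # Sources with a smaller recent mean error receive a larger weight.
        inverse = {}
        for name, errs in self.errors.items():
            mean_err = sum(errs) / len(errs) if errs else 1.0
            inverse[name] = 1.0 / (mean_err + 1e-9)
        total = sum(inverse.values())
        return {name: value / total for name, value in inverse.items()}

    def estimate(self, signals):
        # Weighted average of the sources' current estimates.
        # `signals` must contain one value for every known source.
        w = self.weights()
        return sum(w[name] * signals[name] for name in self.errors)

    def update(self, signals, official_value):
        # Called once the official figure for that week is published.
        for name in self.errors:
            self.errors[name].append(abs(signals[name] - official_value))


# Made-up weekly values: (per-source estimates, official figure).
model = DynamicSourceWeights(["search", "twitter"])
history = [
    ({"search": 2.1, "twitter": 3.0}, 2.0),
    ({"search": 2.4, "twitter": 1.5}, 2.5),
]
for signals, official in history:
    model.update(signals, official)

current_weights = model.weights()
this_week = model.estimate({"search": 3.0, "twitter": 4.0})
```

In this toy run the search signal has been closer to the official figures than the Twitter signal, so it ends up with the larger weight and pulls this week's estimate toward its own value.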

The team soon realized that HealthMap Flu Trends could be extended to other diseases, such as dengue, and to other geographic locations, as a new tool to help individuals and organizations engaged in the fight against these diseases.

Google used to map dengue too

From 2008 to 2015, Google leveraged its technical know-how to help track infectious diseases. The technology giant explored whether data sources not designed for disease tracking could be used to map a disease. “Google introduced the notion that the search activity of flu-related terms within their search engine could be used to produce flu estimates in near real-time. Later on, and in collaboration with our team, it was shown that this approach could be extended to track Dengue. This led to the creation of Google Flu Trends and Google Dengue Trends,” adds Dr. Santillana.
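
In its simplest form, the idea can be sketched as fitting a straight line between past search activity and reported cases, then applying that line to the latest search figure. This is a deliberately simplified illustration, not Google's or HealthMap's method; the function names and all the numbers are made up for the example.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def nowcast(search_volume, slope, intercept):
    """Estimate current cases from the latest search volume."""
    return slope * search_volume + intercept


# Made-up history: relative search volume and officially reported cases.
past_search = [1.0, 2.0, 3.0, 4.0]
past_cases = [12.0, 22.0, 32.0, 42.0]

slope, intercept = fit_line(past_search, past_cases)

# Official counts for the current week are not yet available,
# but the search volume is, so it can stand in as an early estimate.
estimated_cases = nowcast(5.0, slope, intercept)
```

Because search data is available almost immediately while official case counts lag, even a crude fit like this shows how an estimate can be produced before the official figure arrives.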

Public officials used the Google Flu Trends and Google Dengue Trends tools to assess the gravity of the situation. The tools didn’t give precise values, but would show whether dengue was peaking or increasing. However, Google’s tools were far from ideal. In some regions the tracking was reasonably accurate; in others, it often over-predicted or failed to capture outbreaks. “While many lost faith in this use of this wealth of new information to monitor and track infectious diseases, our research team has shown in multiple peer-reviewed publications that there is still promise in these novel data sources,” says Dr. Santillana.

Google discontinued its Flu and Dengue Trends tools in the summer of 2015 and opened up its data to researchers, including the HealthMap team. HealthMap is now studying how to extend its recently developed method, which uses Google searches to predict flu, to predict dengue activity as well.

Expanding sources of insight

HealthMap and Break Dengue are partnering to expand the HealthMap platform to incorporate additional sources, including traditional clinical reporting systems, crowdsourced disease surveillance tools, internet-based services such as Twitter and Wikipedia, and even precipitation values from weather systems, to take into account how flooding might increase the severity of an outbreak. Break Dengue will provide some of that social media data.

To add to the picture, the HealthMap Flu and Dengue Trends tools may also one day incorporate flight frequencies between countries or mobile phone location data, for instance. The more sources the tools can combine, the more closely their estimates should correlate with actual outbreaks.

After all, HealthMap’s ultimate goal is to transform any relevant information that’s already been collected for some other purpose into meaningful information for public health officials and the public in general at a local level. The team aims to provide a centralized system that delivers a single web page with electronic records that show what diverse sources of information are saying – what Twitter is saying or what Google is saying, for instance – about dengue in any specific location.

Dr. Santillana describes how, by exposing some aspects of the complex underlying dynamics, the project is contributing to efforts to contain diseases such as dengue and limit their transmission: “With HealthMap we can also assess how certain activities may be linked to transmission.”

Understanding the burden of dengue

As well as giving a near real-time picture of the current situation, the team hopes the soon-to-come HealthMap Dengue Trends tool will also act as an early warning system – estimating what might happen over the next week or two in most cases or the next month in certain countries. “We’re hoping that when we combine our alternative sources with modeling techniques well-studied in epidemiology we’ll gain a better understanding of how a dengue season is developing – whether or not it’s going to be a bad season,” Dr. Santillana says.

More than that, HealthMap also aims to give meaningful intelligence on the actual burden of the disease, as Dr. Santillana reveals: “Our ongoing research is also looking into how best to turn qualitative results that come from news reports to quantitative estimates of disease burden or disease incidence.”

Testing the dengue model

The HealthMap team will use data-rich environments to appraise the viability of its dengue model before evaluating whether the same model applies to regions where data is poorer. Ensuring the model is beneficial to local communities will require a fair amount of research, as Dr. Santillana comments: “We’ve learned a lot from tracking flu in the U.S., which is a data-rich environment. Now we need to explore how our knowledge and models extend to monitor Dengue in places where epidemiologic or weather information, for instance, or internet penetration is limited.”

That research has already begun. Models are being built ready for pilots in at least four countries: Brazil, Mexico, Singapore and possibly Taiwan and/or Thailand, where Google search data will be used to track dengue activity.

By centralizing information from different data sources, dynamically learning from it and presenting multiple aspects of the disease in an easy-to-understand format, the HealthMap Dengue Trends tool may prove to be the optimum dengue surveillance tool.