Predicting dengue: public health takes notes from sports and political forecasting
Will Donald Trump become the US President? Could Real Madrid win the Champions League? What is the risk that Greece could leave the Eurozone by 2020?
These questions – ranging from the serious to the trivial – are what occupy the minds of professional and amateur forecasters alike. Now the forecasters have dengue in their sights.
Information is power. The outcome of world events can have considerable economic and political ramifications, which is why mathematicians and data scientists are building models designed to estimate the risk of war, famine, public health crises, climate change and political upheaval.
What are the odds?
Public health may be on the cusp of a revolution in epidemic forecasting that would profoundly improve our ability to deal with infectious diseases. A more informed and strategic approach could be just over the horizon.
Knowing how climate, for example, affects the risk of a dengue fever outbreak in a particular area gives public health officials a huge advantage as it allows authorities to step up vector control and vaccination efforts in at-risk regions.
Predicting the future can be hugely complex, particularly in locations where there are multiple factors to consider, and it requires increasingly sophisticated modeling.
Fortunately, the kinds of algorithms health forecasters are dealing with are primarily the same as those used by people predicting election outcomes and sports results – and the scope for mutual learning is enormous.
Add in the exponential improvement in computer power and access to an ever-increasing pool of data, and it is not difficult to see why the field is generating such excitement.
Indeed, the potential of this area has been recognized at the highest levels, prompting the launch of the White House Dengue Prediction Challenge last year. The competition challenged participants to come up with a formula for predicting outbreaks of dengue fever in San Juan, Puerto Rico and Iquitos, Peru. More about that in a moment, but first: how did mathematical modeling become the next big thing?
With the notable exception of weather forecasting, predictive modeling has traditionally had low public visibility. But that began to change with an explosion of interest in using data science.
One of the chief exponents of this young field was Nate Silver, a statistician who combined his love of maths with a passion for baseball and politics. He made a name for himself first by developing an algorithm called PECOTA which accurately forecast the performance of baseball players.
Shortly afterward, his blog FiveThirtyEight.com made him a household name in the US. The website, which takes its name from the number of electors in the US Electoral College, became a reference point for journalists, analysts, and politicians during the 2008 election of Barack Obama. Suddenly data science was sexy.
“Sports is the obvious place that has seen an analytics revolution in recent years, but also finance, politics, and public health,” says Dr. Nicholas Reich of the Department of Biostatistics and Epidemiology at the University of Massachusetts School of Public Health. “I credit people like Nate Silver for taking terms like ‘data journalism’ and ‘predictive analytics’ into the mainstream media vocabulary.”
Dr. Reich’s team has been working to turn the power of this analytics revolution on public health problems such as flu and dengue fever. Among the big questions to be answered in translating this statistical know-how to disease modeling is whether there are near-universal approaches to forecasting all complex phenomena, or whether each disease requires a new algorithm to be built almost from scratch.
He says the prevailing sentiment in the infectious disease community is that models customized to predict disease should perform as well as or better than models not customized for disease prediction. “But the jury is still out on this. I don’t think we have enough data to support the widespread use of one type of model over another in general, yet.”
Dr. Reich has been using a general approach. A team from his lab entered the White House Dengue Prediction Challenge with a model adapted from a similar formula for predicting flu outbreaks and the trajectory of professional basketball players’ careers.
“One of the things that we learned from using this very general prediction algorithm – that has strong methodological similarities to what Nate Silver and his team at FiveThirtyEight use to project the career trajectories of NBA players – to predict dengue is that it did roughly as well as many other highly customized methods,” says Dr. Reich.
Dengue’s next top model
The White House Dengue Prediction Challenge attracted ideas from over a dozen universities as well as several companies in the US and Europe. Participants were given access to dengue case counts and environmental, climatological, demographic and vegetation data.
Their challenge was to predict the timing of peak incidence, maximum weekly incidence and the total number of cases in the transmission season. In short: when will it happen, how bad will it be at its worst, and how severe will it be overall?
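To make those three targets concrete, here is a minimal Python sketch of how they could be read off a season of weekly case counts once the season is over. The function name, the result type and the example numbers are all invented for illustration; they are not taken from the challenge data or its scoring rules.

```python
from typing import NamedTuple, Sequence


class SeasonTargets(NamedTuple):
    peak_week: int        # 1-based week in which incidence was highest
    peak_incidence: int   # number of cases reported in that week
    total_cases: int      # cases summed over the whole season


def season_targets(weekly_cases: Sequence[int]) -> SeasonTargets:
    """Summarize one transmission season (hypothetical helper)."""
    if not weekly_cases:
        raise ValueError("weekly_cases must not be empty")
    peak_incidence = max(weekly_cases)
    # If several weeks tie for the maximum, report the earliest one.
    peak_week = weekly_cases.index(peak_incidence) + 1
    return SeasonTargets(peak_week, peak_incidence, sum(weekly_cases))


# Made-up weekly counts for a short ten-week season.
example_season = [3, 5, 9, 14, 22, 31, 27, 18, 10, 6]
print(season_targets(example_season))
```

A forecaster's job is to estimate these three numbers before the season has played out; the sketch only shows what is being estimated.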
The winning entry came from the Johns Hopkins University Applied Physics Laboratory (APL).
“The method we developed for predictions for both locations was the same,” says Anna Buczak, who led the APL project team. “Iquitos was much more challenging to predict than San Juan because the time series were much shorter, and the data were substantially noisier. We developed an ensemble method, in which each ensemble was made from 300 best models.”
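The article does not describe how the APL team selected or combined its models, so the following Python sketch shows only the general idea of an ensemble built from the best-performing candidates. Ranking by mean absolute error on held-out data and combining by a simple average are assumptions made for illustration, as are all the names and the three toy models.

```python
from statistics import mean
from typing import Callable, List, Sequence, Tuple

# A "model" here is any function mapping recent case counts to one forecast.
Model = Callable[[Sequence[float]], float]
Validation = List[Tuple[Sequence[float], float]]  # (history, actual next value)


def mean_abs_error(model: Model, validation: Validation) -> float:
    """Average absolute forecast error over the validation examples."""
    return mean(abs(model(history) - actual) for history, actual in validation)


def build_ensemble(candidates: Sequence[Model],
                   validation: Validation,
                   keep: int) -> Model:
    """Keep the `keep` lowest-error candidates and average their forecasts."""
    ranked = sorted(candidates, key=lambda m: mean_abs_error(m, validation))
    best = ranked[:keep]

    def ensemble(history: Sequence[float]) -> float:
        return mean(m(history) for m in best)

    return ensemble


# Three toy candidates (illustrative only).
def persistence(history):   # next week looks like this week
    return history[-1]

def drift(history):         # continue the most recent week-to-week change
    return history[-1] + (history[-1] - history[-2])

def always_zero(history):   # deliberately poor baseline
    return 0.0


validation_data = [([1, 2, 3], 4), ([2, 4, 6], 8)]
ensemble = build_ensemble([persistence, drift, always_zero], validation_data, keep=2)
print(ensemble([3, 5, 7]))  # average of drift (9) and persistence (7)
```

In this toy setup the weakest candidate is dropped and the remaining two are averaged; a real system would choose from a far larger pool, as the 300 models mentioned in the quote suggest.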
The White House-sponsored competition built on earlier US initiatives in the fields of health, the environment, and defense. For example, the US Health and Human Services ‘idea lab’ runs an Epidemic Prediction Initiative, the National Oceanic and Atmospheric Administration runs a Dengue Forecasting project and the National Science and Technology Council hosts an interagency working group dedicated to infectious disease forecasting.
The US government plans to publish all the models submitted as part of the contest with all team leaders included as co-authors of a peer-reviewed paper on the topic.
Room to improve
Dr. Reich believes that while the contest did not produce a perfect dengue prediction model, it helped to move the field forward. His team is currently working to refine their models in a real-world project in Thailand. The key, he says, to optimizing prediction is to get accurate, up-to-date data.
“One very complicated factor that none of the infectious disease prediction competitions have taken into account yet is complications arising from delays in case reporting,” says Dr. Reich. “We have seen that this plays a significant role in our real-time predictions of dengue in Thailand, and making adjustments for delays in case reporting makes a huge improvement in our predictive performance.”
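Dr. Reich does not spell out how his team makes these adjustments, so the Python sketch below illustrates just one simple possibility: scaling up the most recent weekly counts by the share of cases that has typically been reported by that point. The function name, the completeness fractions and the case counts are hypothetical.

```python
from typing import List, Sequence


def adjust_for_reporting_delay(reported: Sequence[float],
                               completeness: Sequence[float]) -> List[float]:
    """Scale up recent counts that are probably still incomplete.

    reported     -- weekly case counts, oldest week first
    completeness -- completeness[d] is the assumed fraction of a week's cases
                    already reported d weeks later (d = 0 is the latest week);
                    weeks older than len(completeness) are treated as complete
    """
    adjusted = []
    latest = len(reported) - 1
    for week, count in enumerate(reported):
        lag = latest - week
        fraction = completeness[lag] if lag < len(completeness) else 1.0
        if not 0 < fraction <= 1:
            raise ValueError("completeness fractions must be in (0, 1]")
        adjusted.append(count / fraction)
    return adjusted


# Made-up numbers: half of cases are known in the week they occur,
# three quarters one week later, and everything after two weeks.
reported_counts = [40, 38, 30, 12]
assumed_completeness = [0.5, 0.75, 1.0]
print(adjust_for_reporting_delay(reported_counts, assumed_completeness))
# -> [40.0, 38.0, 40.0, 24.0]
```

Without the correction, the drop from 30 to 12 in the latest weeks looks like an outbreak fading; with it, the most recent figures are treated as incomplete rather than as evidence of decline.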
With organizations like HealthMap, a Break Dengue partner, building real-time surveillance systems that pull data from a range of sources including social media, the prospect of real-time forecasting based on live surveillance data is coming into focus.
Of course, it’s not just university labs and government experts that want to build better disease forecasting models. Private sector disease forecasters have been working on this problem too. Take Ascel Bio, for example. They began forecasting dengue in Asia in 2011 and have since branched out to include Brazil and Mexico.
The company works with private sector and defense clients concerned by spikes in demand for health services, as well as other potential disruption, caused by unusual outbreaks of infectious diseases.
“Our tools create opportunities to save lives and reduce the costs of care through better planning and operational improvements,” says Patrick Wedlock, a Senior Analyst at Ascel Bio. “Disease forecasting has become more accurate with increased computational power and a better understanding of the factors that drive disease transmission.”
He says the best way to improve forecasting is to develop the baseline data. “This means more granular and routine disease surveillance worldwide. Individual nations would reap tremendous benefits by making disease data more transparent to forecasters,” says Wedlock.
As demand for accurate forecasts rises and our ability to generate real-time surveillance data grows, we can safely predict a healthy future for the field of dengue forecasting.