The models are all wrong!

“The models were all wrong and can’t be trusted!”

I have been seeing this argument a lot online, sometimes on Twitter but more often on Facebook, as a claim meant to support the idea we shouldn’t be sheltering in place at the expense of the economy.

This is dangerous and wrong thinking.

I am not an epidemiologist and don’t work with those particular types of models. But I have studied and worked with models enough to know how the underlying statistics and assumptions work.

The problem isn’t that the models are wrong, but rather the public’s understanding of them (and quite possibly how reporters cover model projection claims).

Models provide a range of projections based on certain inputs we call variables. They are built from factors we know about, and each factor is given a numerical value based on the data we have. So right there, there are two important things to understand:

  1. How robust are the imagined inputs?

Inputs can be things like demographics, population density, underlying population health, etc. It’s a mix of social science and science, and it is hard. Epidemiologists spend years trying to figure out the most critical factors when building models, and weight them properly. Some factors are more important (i.e. better predictors of spread) than others.

  2. How good is the data?

But even if your model is good, bad data screws it up. Witness what’s happened in Georgia with messed up reporting. Or Florida, with claims the state is encouraging tainted data reporting in order to make the political argument for opening up look stronger. This is why we want science to be independent of politics, left to scientists and not managed and run by those with a political stake in a certain outcome. Politicians should listen to scientists, not influence science.

But even if the data is good, here’s the thing: any change to inputs affects a model. People saying “We were told 6 million deaths, then 2 million deaths, then 500,000 deaths! They were all wrong!” miss the point. We changed the inputs. We shut things down and that changes the model. Any change at all to behavior or those major factors that predict the spread of a disease is going to change the model’s numbers.
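To make that concrete, here is a minimal sketch of how changing one input changes what a model returns. It is a toy, textbook-style susceptible/infected/recovered calculation, not any model epidemiologists actually used for COVID-19, and every name and number in it (`project_deaths`, the contact rates, the fatality rate, the population) is a hypothetical value chosen only for illustration.

```python
def project_deaths(contact_rate, recovery_rate=0.1, fatality_rate=0.01,
                   population=1_000_000, initial_infected=10, days=365):
    """Toy discrete-time SIR projection. All parameter values are hypothetical."""
    susceptible = float(population - initial_infected)
    infected = float(initial_infected)
    recovered = 0.0
    for _ in range(days):
        # New infections scale with how often infected and susceptible people meet.
        new_infections = contact_rate * susceptible * infected / population
        new_recoveries = recovery_rate * infected
        susceptible -= new_infections
        infected += new_infections - new_recoveries
        recovered += new_recoveries
    # Crude simplification: deaths are a fixed share of everyone no longer infected.
    return fatality_rate * recovered


# Same model, same code. Only the behavior input (contact rate) changes.
no_change = project_deaths(contact_rate=0.30)
distancing = project_deaths(contact_rate=0.15)
print(f"Projection with no behavior change: {no_change:,.0f}")
print(f"Projection with reduced contact:    {distancing:,.0f}")
```

Neither number is wrong. They answer two different "what if" questions, and which one ends up closer to reality depends on what people actually do.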

So the lesson here is we don’t judge a model by the accuracy of the numbers it’s returning, because the factors can and do change. Instead, we make sure the model itself is accurate with regard to inputs and data. Pointing to a death toll that comes in lower than a February projection is not how you judge a model.

My statistics profs in grad school said several times, “All models are wrong, but some are useful.” The point was, we change the model as we learn more. We understand factors better, and which factors matter most. We get better data to understand the impact.

The model being wrong is a feature, not a bug. There. Is. No. Such. Thing. As. A. Static. Model.

For instance, we are learning more about which subsets of the population are more vulnerable to COVID-19. We are learning what prevention practices are helping most, and how to make better use of medical gear to prevent spread. We can change our approach to account for new knowledge, and then turn that knowledge into factors; if states adopt new guidelines and use new knowledge, can we further lower the mortality and transmission rates?

We want this. We want our models to be wrong, so we can make better ones. Knowledge builds on knowledge.

So how can the public better understand models through better reporting? Here is what I’d suggest:

  1. Models have ranges. The death toll projection is not the headline. They give a high/low range because models aren’t certain. The public needs to understand this, and that it’s a best guess based on understood factors.
  2. Models represent what we know right now, when the numbers are spit out. The minute we change something, we change the number in some direction (lower or higher).
  3. Thus (and this is super critical) models are NOT predictions. They exist to give you a range of odds to assess a likelihood of outcomes as a factor in decision-making. People want precision from models, but instead we need to see this as a projection based on what we know. Models exist to give us knowledge about what we risk by doing nothing, or by changing a few things up here and there. They inform decision-making.
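The three points above can be sketched in a few lines. This is a toy compound-growth calculation, not a real epidemiological model: the function names, the growth rates, and the 5th/95th percentile cutoffs are all assumptions made for illustration. The idea it shows is that when an input is uncertain, the honest output is a range, and the range moves when the inputs move.

```python
import random
import statistics


def projected_cases(daily_growth, initial_cases=100, days=30):
    """Toy compound-growth projection. All values are hypothetical."""
    return initial_cases * (1 + daily_growth) ** days


def projection_range(growth_low, growth_high, runs=10_000, seed=0):
    """Run the toy model many times with the uncertain input drawn from a range."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    results = sorted(
        projected_cases(rng.uniform(growth_low, growth_high)) for _ in range(runs)
    )
    low = results[int(0.05 * runs)]   # roughly the 5th percentile
    high = results[int(0.95 * runs)]  # roughly the 95th percentile
    return low, statistics.median(results), high


# What we know right now: growth is somewhere between 2% and 10% a day.
low, middle, high = projection_range(0.02, 0.10)
print(f"Today's range: {low:,.0f} to {high:,.0f} (middle {middle:,.0f})")

# Behavior changes, so the inputs change, so the range changes.
low, middle, high = projection_range(0.01, 0.04)
print(f"After changes: {low:,.0f} to {high:,.0f} (middle {middle:,.0f})")
```

The middle number is the one that tends to end up in a headline, but the low and high ends are just as much a part of what the model is saying.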

Again, not an epidemiologist, but I have enough training to understand what I’ve seen in the news. The model ranges keep getting revised downwards. I look at that and see us changing the trajectory with behavior change as well as getting better data about cases and deaths. That the models are wrong is a good thing. It means we’ve changed the outcomes for the better. It also means science isn’t so stubborn that it sticks to the number but rather adapts to the current state of play.

We want both of these things.

Data reporting is hard. Many who do it have some training, but the current news environment is asking reporters with a lot less training to make sense of models and stats and figure out how to explain them to the public.

My advice is to always provide context, to note that science and social science data is always a snapshot in time based on what we know right now. But also that our choices matter, that they change models and outcomes. That is empowering.

But we can’t as news folks let this devolve into couch criticism about models in general. We use models all the time in society, from weather to economics, and they’re not 100% accurate. But they help us make better choices than we’d make if we were flying completely blind.

What we should be demanding is better data. We don’t have it. Again, are states testing enough, and are they faithfully reporting numbers? If you want precision from a model, you should be demanding those things because those things will make predictions marginally better. If your goal is opening up, demand better, independent data gathering and reporting.

Point being: it’s good to say a model’s numbers are not destiny, to understand that we can change behavior to change the numbers. But to bash epidemiologists who have years of training and who I guarantee are smarter than all of us about this? It’s absurd and ignorant.

The people citing models as wrong, they are spreading a dangerous argument based on the simple thinking that models predicted X and we got Y. The goal is to kill confidence in the model. It never comes with a demand for better models. You know why? Because they don’t want a better model. They want to do what they want to do, and the use of modeling is in their way.

And so we have to push back and call it out when we see it.

Look, those making and running models will agree the models are all wrong. But they also understand what they don’t know and know how to interpret their work accordingly, to advise while also avoiding a sense of certainty. People who think a model has to make accurate predictions to be valid are arguing from ignorance, and they are pushing a dangerous line of reasoning that will lead to more pain.

We will open up again eventually. One of the great things about models as we revise them is we can learn what habit changes best prevent spread. Modeling will help us perfect a set of best practices to help us open up more safely than we would have before.

Jeremy Littau is an associate professor of journalism and communication at Lehigh University. He studies activism and journalism in the context of social networks and digital culture.
