Is it possible to predict the course of the coronavirus pandemic? The example of Russia

Authorities and ordinary citizens alike are wondering when the coronavirus pandemic will end and quarantines can finally be lifted. Scientists could provide an answer: they have mathematical models that allow them to make predictions. But how effective are these models, and can they be applied in real life?

Because of the coronavirus pandemic, many people around the world have begun to understand a little more about how infectious diseases spread. Phrases like “reaching a plateau,” which means stabilizing the epidemic, and “flattening the curve,” which means slowing the spread of the virus, have become commonplace. At the same time, there are many predictions about how quickly a given country will reach this plateau. And the results vary widely. Some politicians blame scientists for failing to predict the pandemic. Scientists respond by saying that modeling must take into account many indicators, all of which are constantly being refined because the new coronavirus is not well understood.

The BBC Russian service tried to understand how scientists predict the development of the pandemic and what they can say about the situation in Russia, given the problems with the quality of the statistical data. One of the most common models is the SIR model, which was developed in the late 1920s. Its name is made up of the initials of the words susceptible, infected, and recovered.

The essence of the SIR model is that the entire population is divided into three groups: susceptible individuals (those who can be infected), the infected themselves, and those who have recovered from the disease. Several variations of this model have been developed, including SEIR, where E stands for exposed: individuals who have already been infected and are in the incubation period, but have not yet shown symptoms of the disease. According to the SEIR model, susceptible individuals first become infected, then go through an incubation period, after which symptoms appear, and finally they either recover or die. In the case of Covid-19, infected individuals can transmit the virus to others even before showing symptoms, which makes the virus particularly contagious. According to the World Health Organization, the average incubation period for the coronavirus is 5-6 days.
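The compartment scheme described above can be sketched in a few lines of Python. This is a minimal illustration, not the model used by any of the groups mentioned in this article. It assumes a closed, evenly mixed population, treats people in the incubation stage as not yet infectious (a simplification, given what is said above about Covid-19), and lumps recoveries and deaths into one "removed" group. The function name and the 5-day infectious period are assumptions made for the example; the R0 of 3 and the incubation period of about 5.5 days follow the figures quoted in the text.

```python
def simulate_seir(population, r0, incubation_days, infectious_days,
                  initial_infected=1.0, days=365, dt=0.1):
    """Integrate a basic SEIR model with fixed-step Euler updates.

    Returns a list of (day, S, E, I, R) tuples, one per whole day.
    R here means "removed": recovered or died.
    """
    beta = r0 / infectious_days      # transmission rate per day
    sigma = 1.0 / incubation_days    # rate of leaving the incubation stage
    gamma = 1.0 / infectious_days    # rate of leaving the infectious stage

    s = population - initial_infected
    e, i, r = 0.0, initial_infected, 0.0
    steps_per_day = round(1 / dt)
    history = [(0, s, e, i, r)]
    for day in range(1, days + 1):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / population * dt
            new_infectious = sigma * e * dt
            new_removed = gamma * i * dt
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_removed
            r += new_removed
        history.append((day, s, e, i, r))
    return history


if __name__ == "__main__":
    # Hypothetical city of one million people; parameters are illustrative.
    run = simulate_seir(population=1_000_000, r0=3.0,
                        incubation_days=5.5, infectious_days=5.0)
    peak_day = max(run, key=lambda row: row[3])[0]
    print("infectious compartment peaks on day", peak_day)
    print("share of population never infected:", run[-1][1] / 1_000_000)
```

Because there are no restrictive measures and no behavior change in this sketch, it corresponds to a "no quarantine" scenario in a single well-mixed city.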

The contagiousness of the virus is measured by the basic reproduction number (also called the reproduction index or coefficient), R0. If it is equal to 1, each infected person infects one other person, who then transmits the virus to one more person, and so on. If we imagine, for example, ten such “jumps” of the virus from person to person, each jump will infect only one new person. Obviously, at such a rate, the virus will spread very slowly. Increasing R0 by one greatly speeds up the process. If R0 is 2, the first infected person will spread the virus to two more people, then each of them will spread the infection to two more people, and so on. The reproduction index of the current coronavirus is about three, but estimates vary. The initial value in a particular area can change depending on how often people interact with each other. As the authorities restrict people’s contacts through various measures, the reproduction index decreases. This can be seen in the example of Great Britain, where the authorities announced on April 30 that the country had passed the peak of the epidemic.
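The arithmetic behind these “jumps” is simple enough to check directly. The sketch below assumes, purely for illustration, that every infected person passes the virus to exactly the same number of people and that nobody runs out of susceptible contacts; the function names are made up for this example.

```python
def cases_by_generation(r, generations):
    """New infections in each generation, starting from a single case."""
    return [r ** n for n in range(generations + 1)]


def cumulative_cases(r, generations):
    """Total infections after the given number of jumps, first case included."""
    return sum(cases_by_generation(r, generations))


for r in (1, 2, 3):
    newest = cases_by_generation(r, 10)[-1]
    total = cumulative_cases(r, 10)
    print(f"R0={r}: {newest} new cases at the tenth jump, {total} in total")
# R0=1: 1 new cases at the tenth jump, 11 in total
# R0=2: 1024 new cases at the tenth jump, 2047 in total
# R0=3: 59049 new cases at the tenth jump, 88573 in total
```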

Now, all countries are trying to reduce the reproduction rate below one, hoping that with slow rates of infection spread it will be possible to relax restrictive measures, allow people to return to offices, open restaurants and other service businesses. Achieving such a reduction in the index does not mean that the virus has been defeated, but with a low index, the spread of infection can be controlled with less harsh measures on the economy – masks, mass testing, contact tracing, and timely isolation of people with symptoms. At least that is what scientists and officials hope. The efforts of many specialists around the world are now focused precisely on this – to understand how effective the restrictive measures are, i.e. how much the reproduction rate has decreased in a given area, and whether it is already possible to reopen the economy.

To model the spread of the coronavirus, you can use an interactive model developed by the University of Basel (Switzerland) based on SEIR. When a specific country is selected, a large amount of data is automatically loaded into the model (population, age structure of the population, number of confirmed cases of coronavirus infection, etc.). R0 is assumed to be between 2.64 and 3.23. If we choose Russia and consider what would have happened if the authorities had not taken restrictive measures, the model shows that by the beginning of July, 1.3-1.36 million people would have died in Russia. The number of infected people at the peak of the epidemic would have reached 14-22 million.

“There is some overestimation here, maybe about 30%, because the model assumes that the population is homogeneous, meaning that all people are roughly the same and that they have roughly the same chances of infecting each other,” says Mikhail Tamm, associate professor at Moscow State University and PhD in physics and mathematics. “Russia, from the point of view of virus spread, is a collection of separate cities, separate reservoirs, within which everything mixes relatively quickly. There are small flows between them, which are not sufficient for what happens in Moscow to spread quickly to Novosibirsk, for example,” the scientist explains. In addition, a quarter of Russia’s population lives in rural areas. There is a possibility that the virus will not reach isolated places, but the model does not take this into account, Tamm adds. “The second consideration that can make these models overestimate is that if people see that an epidemic is raging, that many people they know are getting sick, and some are getting very sick, then people will start observing a kind of quarantine themselves, without waiting for instructions from the authorities. This will eventually reduce the mortality rate, but not by half, of course. I think 1.3 million is probably an exaggerated number, but 0.8 million is absolutely realistic [in a scenario without quarantine],” says Tamm.

That the model can overstate the number of infections and deaths can be seen in the case of Sweden, which did not impose strict quarantine measures like many other European countries. In Sweden, residents have voluntarily switched to social distancing, gatherings of more than 50 people are banned, and universities and upper secondary schools are closed. Many people are working from home, but shops and restaurants remain open. As of May 21, approximately 3,800 deaths had been registered in Sweden. However, the University of Basel model shows that in the “no quarantine” scenario, the number of deaths in Sweden by the same day should have been much higher: 88-110 thousand people. In any case, the authorities in Russia took restrictive measures, and this scenario did not materialize. In Moscow, where about half of all Russian SARS-CoV-2 cases have been reported, mass gatherings have been banned since March 16, and schools have been closed since March 21. On March 23, Mayor Sobyanin ordered elderly Moscow residents not to leave their homes unless absolutely necessary, and in the following days libraries, entertainment venues, restaurants, and parks were closed. On March 30, Putin announced the introduction of non-working days, which were subsequently extended several times. On April 11, digital passes were introduced in Moscow for movement around the city. The regime of non-working days ended on May 11. Restrictive measures kept many people at home for several weeks.

The tool developed by the University of Basel allows the impact of restrictive measures to be taken into account. To do so, however, it is necessary to estimate their effectiveness as a percentage and enter this figure into the model. The effectiveness of quarantine measures can only be assessed after the fact, and experts do this by fitting the model to actual data on new cases or deaths. By observing the change in trend, it is possible to understand the extent to which the quarantine has reduced the reproduction coefficient. “You look for the break point where the quarantine had an effect on this curve. And you start to calculate how much the reproduction coefficient needs to be adjusted at that point to account for the effect of quarantine. And you get an estimate,” says Mikhail Tamm, describing the fitting process.
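The fitting idea described above can be illustrated with a toy example. The sketch below is not the Basel tool and not Tamm's procedure; it is a deliberately crude SIR model with one update per day, and every name and number in it (the function names, the population of 12 million, the 5-day infectious period, the day the measures start) is an assumption made for illustration. The “observed” curve is synthetic: it is generated by the same model with a known effectiveness of 60%, so that we can check whether a simple grid search recovers it.

```python
import math


def daily_new_cases(r0, effectiveness, start_day, days=60,
                    population=12_000_000, infectious_days=5.0,
                    seed_cases=10.0):
    """Daily new infections from a crude SIR model (one Euler step per day)
    in which transmission drops by `effectiveness` from `start_day` onward."""
    gamma = 1.0 / infectious_days
    s, i = population - seed_cases, seed_cases
    series = []
    for day in range(days):
        r_t = r0 * (1.0 - effectiveness) if day >= start_day else r0
        # Cap at s so the susceptible pool can never go negative.
        new_cases = min(s, r_t * gamma * s * i / population)
        s -= new_cases
        i += new_cases - gamma * i
        series.append(new_cases)
    return series


def fit_effectiveness(observed, r0, start_day):
    """Grid-search the effectiveness (0.00 to 1.00) whose modeled curve
    is closest to the observed one on a logarithmic scale."""
    def error(eff):
        model = daily_new_cases(r0, eff, start_day, days=len(observed))
        return sum((math.log1p(m) - math.log1p(o)) ** 2
                   for m, o in zip(model, observed))
    candidates = [k / 100 for k in range(101)]
    return min(candidates, key=error)


if __name__ == "__main__":
    # Synthetic data: R0 = 3, measures cut transmission by 60% from day 20.
    observed = daily_new_cases(r0=3.0, effectiveness=0.6, start_day=20)
    best = fit_effectiveness(observed, r0=3.0, start_day=20)
    print("estimated effectiveness:", best)
    print("implied reproduction coefficient:", 3.0 * (1.0 - best))
```

With real data the fit is far less clean: reporting delays, changes in testing, and unreliable statistics all blur the break point, which is exactly the difficulty discussed further on.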

For example, the change in trend is clearly seen in the graph of new Covid-19 cases in Russia. On April 20, a slowdown began: apparently the effect of the non-working days began to be felt. From May 2 to 5, there was a new increase, which the authorities explained by the increase in the number of tests. From May 6, a new slowdown in growth can be observed, and even a decrease in the number of new cases in the last three days.

It is difficult to model the situation in Russia as a whole because of the different population densities: there are dozens of large cities, thousands of small towns, and more than 130,000 villages in Russia. However, it is possible to model the situation in Moscow. Before the implementation of restrictive measures in the Russian capital, the reproduction coefficient was about three, but after the closure of schools it decreased to 1.9-2, says Mikhail Tamm, who estimated this indicator. “When we went into quarantine in the first half of April, it was probably just over 1. Then there was a spike in new cases that we saw in the first few days of May. After that spike, the reproduction coefficient seemed to be around 1 or a little lower. In the last few days, it’s gone down sharply, and now it’s probably 0.8-0.85. But it is unclear whether the recent successes are real or are problems with the statistics,” says Tamm.

To understand how the situation with the coronavirus in Russia will develop in the coming weeks, you can also look at the modeling of the Epidemiology Center of Imperial College London. Specialists from this center model the development of the epidemic in many countries. If the restrictions in Russia are relaxed so that the number of contacts between people increases by 50%, then the number of new cases will begin to rise again, according to data from Imperial College.

Models improve as data accumulates. Over time, the understanding of how people in a given area respond to government restrictions also improves. In the United States, for example, where the situation is being modeled by several scientific centers, the range of forecasts has narrowed considerably. On April 12, four models showed that the number of deaths could range from a few dozen to 5,000 per day by early June. On May 12, the models showed that the number of deaths could reach 700-1,800 per day by early June.

Collecting the most accurate data possible is one of the most important aspects of epidemic modeling. “You determine the parameters from the statistics. You have no other data. Some parameters can be extracted from the analysis of specific cases, where epidemiologists write in detail about how the disease develops and how long it lasts, and use examples to determine how long it takes for people to infect each other,” explains Associate Professor Tamm from MSU. In particular, as mentioned above, experts assess the impact of quarantine on the reproduction coefficient from data on new cases or deaths. If these data are incorrect, it will not be possible to accurately assess how much the coefficient has fallen. Accordingly, it will not be possible to understand whether quarantine is having the desired effect. Tamm does not trust the statistics published by the Russian regions. And he is not alone. Many mathematicians, demographers, and statisticians are analyzing discrepancies in infection rates and mortality data. Several experts expressed a similar opinion in a conversation with the BBC. Economist Tatiana Mikhailova pointed out on Facebook that the number of infections detected in some regions changes by only one or two cases from day to day. At the time of writing, there were only three such “super-stable” regions: Krasnodar Krai and Lipetsk and Kursk oblasts. In recent days, the number of identified cases in these regions has been gradually decreasing.

“Among people who are interested in the reliability of official statistics, there is a fairly unanimous opinion that the level of falsification is growing and has been steadily increasing since around April 20,” says Boris Ovchinnikov, Director of Research at Data Insight. He emphasizes that the anomalies are spreading geographically: “suspicious data is coming from more and more regions”. The experts’ doubts are based primarily on numerical coincidences. Ovchinnikov gives the example of reports from the regions for May 16: Kaluga, Tula, and Yaroslavl oblasts and Chuvashia each reported 97 newly identified cases, while Chelyabinsk and Bryansk oblasts, Bashkortostan, and Stavropol Krai each reported 98 new cases. “According to my very rough estimate, the probability of such a coincidence being random is about 1 in 3 million,” says Ovchinnikov.

“In Krasnodar Krai, for 11 consecutive days, numbers between 96 and 99 were reported, with a difference of 1-2 cases per day. The probability that such an event could happen by chance is less than a billionth,” says Tamm, giving another example.
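A back-of-the-envelope check shows why such a run looks suspicious. The sketch below is not Tamm's calculation, which the article does not describe. It assumes that daily counts are independent and follow a Poisson distribution with a mean of 97.5 (the middle of the reported band), and it ignores the extra condition on day-to-day differences, which would make the run even less likely. Real epidemic counts usually scatter more widely than a Poisson distribution, which again would push the probability down. The function names are made up for this example.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k events for a Poisson distribution."""
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))


def prob_in_band(low, high, mean):
    """Probability that a single daily count falls in [low, high]."""
    return sum(poisson_pmf(k, mean) for k in range(low, high + 1))


def prob_run(low, high, mean, days):
    """Probability that `days` independent daily counts all fall in the band."""
    return prob_in_band(low, high, mean) ** days


if __name__ == "__main__":
    single_day = prob_in_band(96, 99, mean=97.5)
    run = prob_run(96, 99, mean=97.5, days=11)
    print(f"one day inside 96-99: {single_day:.3f}")
    print(f"eleven days in a row: {run:.2e}")
```

Even under these generous assumptions, the chance of eleven such days in a row comes out at a few in a billion.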

Another example of statistics that raise questions among experts is the mismatch between the dynamics of infections and specific events. On April 28, North Ossetia reported a sharp increase in the number of detected cases, up to 148. This happened eight days after a rally against self-isolation in Vladikavkaz. One might assume that these 148 people would infect others and that the number of infected would continue to rise. However, Ossetian statistics show that, on the contrary, the rate of infection began to decrease.

Data Insight’s Ovchinnikov tries to evaluate indirect data, for example, by comparing official statistics on Covid-19 incidence with information on the increase in community-acquired pneumonia, the increase in all-cause mortality, the number of health care workers from the region on the memorial list, and other data. Based on the analysis of these data, he constructs an index of underestimation of actual coronavirus mortality in different regions. According to Ovchinnikov’s estimate, this index for Moscow is about 3, meaning that mortality there is underestimated roughly threefold. In St. Petersburg, according to his rough estimate, it is underestimated by a factor of 10, and in Dagestan by a factor of 20.

“This is a very bad trend. This [alleged distortion of statistics] makes forecasting meaningless. Forecasting is the least of our worries; there is a more fundamental problem: we do not know what is happening in these regions,” says Mikhail Tamm.