COVID-19 and states of emergency in Tokyo:
Analysis and forecast with data assimilation

Q. Sun (a,b), S. Richard (a,b), C. Sun (a,c), L. Zhang (c), T. Miyoshi (a,d,e)

(a) Data Assimilation Research Team, RIKEN Center for Computational Science (R-CCS), Japan
(b) Graduate School of Mathematics, Nagoya University, Chikusa-ku, Nagoya 464-8602, Japan
(c) School of Science, Nagoya University, Chikusa-ku, Nagoya 464-8602, Japan
(d) Prediction Science Laboratory, RIKEN Cluster for Pioneering Research, Japan
(e) RIKEN interdisciplinary Theoretical and Mathematical Sciences Program (iTHEMS), Japan
August 2, 2021


The last week of July 2021 saw a sharp rise in the number of COVID-19 cases in Tokyo. We provide an analysis of the propagation of the epidemic from March 6th, 2020, to the end of July 2021, with particular attention to the effective reproduction number. Our approach is based on data assimilation, a technique which has proven very successful, for example, in accurate weather prediction. This analysis is then correlated with the four states of emergency successively declared in Tokyo, and different scenarios are proposed for the evolution of the epidemic in the coming weeks. Each scenario takes into account the experience gained during the first three states of emergency. The dependence on uncertain parameters is also discussed.

1 Introduction

In this note we provide a short analysis and several forecasts for the propagation of COVID-19 in Tokyo. These investigations are based on data assimilation, and the real data are collected up to the end of July 2021. Data assimilation is a fundamental method developed originally for numerical weather prediction, and recently applied successfully to different fields of research. Our data assimilation approach consists of two main steps, repeated on a regular basis: 1) a model for the evolution of the epidemic provides forecasts; 2) these forecasts are compared to real data, and an optimization process is applied to improve the following forecasts. Since the data about COVID-19 are provided on a daily basis, the system is updated every day. Several independent systems are considered simultaneously, and consequently some quantities of interest can be estimated together with their probability distributions. Examples of such quantities are unknown parameters or unknown populations.
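The forecast-then-update cycle described above can be illustrated with a minimal stochastic ensemble Kalman filter analysis step for one scalar observation. This is only a sketch under stated assumptions, not the implementation described in [1, 2]: the function name `enkf_update`, the two-component state, and all numerical values are hypothetical, and only the ensemble size of 15 is taken from the text.

```python
import numpy as np

def enkf_update(ensemble, obs, obs_var, obs_index, rng):
    """One stochastic EnKF analysis step for a single scalar observation.

    ensemble  : array (n_members, n_state) of forecast states
    obs       : observed value of state component `obs_index`
    obs_var   : observation error variance (assumed known)
    """
    n_members = ensemble.shape[0]
    anomalies = ensemble - ensemble.mean(axis=0)
    hx = ensemble[:, obs_index]                 # forecasts in observation space
    hx_anom = hx - hx.mean()
    # Sample covariance between every state component and the observed one
    cov_xy = anomalies.T @ hx_anom / (n_members - 1)
    var_y = hx_anom @ hx_anom / (n_members - 1)
    gain = cov_xy / (var_y + obs_var)           # Kalman gain, shape (n_state,)
    # Each member is compared to its own perturbed copy of the observation
    perturbed = obs + rng.normal(0.0, np.sqrt(obs_var), size=n_members)
    return ensemble + np.outer(perturbed - hx, gain)

rng = np.random.default_rng(0)
# Hypothetical state: (agents in H, transmission parameter), 15 members
forecast = np.column_stack(
    [rng.normal(1000.0, 100.0, 15), rng.normal(0.3, 0.05, 15)]
)
analysis = enkf_update(forecast, obs=1200.0, obs_var=50.0**2, obs_index=0, rng=rng)
```

The unobserved component (here the hypothetical transmission parameter) is corrected only through its sample covariance with the observed one, which is how an ensemble can yield estimates of unknown parameters and populations.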

For the sake of completeness, we perform our analysis with two different approaches: a continuous approach using an ensemble Kalman filter, and an agent-based (discrete) approach using a particle filter. Their main difference is that the continuous approach involves only 15 independent simulations of the propagation of the epidemic in Tokyo, while the discrete approach involves 50,000 to 100,000 such realizations. We introduce the common structure of these approaches in the next section, and refer to [1, 2] for precise descriptions and additional information.

The results of our investigations are represented by different scenarios mimicking the evolution of the epidemic during the first three states of emergency. Starting from the very beginning of August 2021, a behavior similar to the one observed during the first state of emergency would lead back to a reasonable situation within a few weeks, while for a behavior similar to the one observed during the third state of emergency, several months would be necessary. In all scenarios, the number of infected persons reaches unprecedented values.

2 The model

Both approaches are based on an extension of the SEIR compartmental model commonly used in epidemiology. The compartment I, standing for infectious, is subdivided into two compartments, one with infectious agents who do not show any symptoms, and one with infectious agents presenting symptoms. The transfer diagram for this model is introduced in Figure 1.


Figure 1: The transfer diagram for the extended SEIR model, with the three compartments with observations indicated in yellow. The red arrows indicate the generation of secondary cases (new infected agents).

In Figure 1, the different compartments correspond to:

S: the compartment of all susceptible agents,
E: the exposed agents, infected but not infectious yet,
Ia: the pre-symptomatic agents (who will develop symptoms later) or asymptomatic agents (who will never show any symptoms). These agents are infectious; part of them will become symptomatic and move to Is, while some will recover without ever being recorded (downwards arrow),
Is: the symptomatic agents, also infectious. The majority of these agents will look for medical assistance and move to H, while some will quarantine at home without being recorded (downwards arrow),
H: the agents undergoing treatment (in hospital, in hotels, or at home), recorded by the medical systems,
D: the deceased agents,
R: the recovered agents who were recorded while in H.

In such a diagram, some constants (time durations, probabilities) are attached to each compartment and to each arrow. Some of these quantities are well documented in the literature, while others are quite uncertain. In [1, 2] we provide the precise values used in our simulations and the references. It turns out that our model is stable under a change of the main uncertain parameters: the relative infectivity k of the asymptomatic agents compared to the symptomatic agents, and the ratio of symptomatic to asymptomatic agents. This stability is further discussed in Section 5.
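To make the structure of the transfer diagram concrete, the following sketch advances a deterministic version of the compartments by daily Euler steps. It is an illustration only and not the model of [1, 2]: the values k = 0.58 and 83% are the ones quoted in Section 5, while every other constant, the initial populations, and the extra bookkeeping compartment `u` (agents leaving Ia or Is without being recorded) are hypothetical choices made here.

```python
from dataclasses import dataclass

@dataclass
class Params:
    beta: float = 0.4    # transmission coefficient (hypothetical)
    k: float = 0.58      # relative infectivity of Ia vs Is (quoted in Section 5)
    t_e: float = 3.0     # mean days spent in E (hypothetical)
    t_ia: float = 3.0    # mean days spent in Ia (hypothetical)
    t_is: float = 2.0    # mean days spent in Is (hypothetical)
    t_h: float = 14.0    # mean days spent in H (hypothetical)
    p_sym: float = 0.83  # share of Ia moving to Is (proportion quoted in Section 5)
    p_h: float = 0.9     # share of Is recorded and moving to H (hypothetical)
    p_d: float = 0.02    # share of H moving to D (hypothetical)

def step(state, p, dt=1.0):
    """Advance all compartments by one explicit Euler step of dt days."""
    s, e, ia, i_s, h, d, r, u = state   # u: left Ia or Is without being recorded
    n_alive = s + e + ia + i_s + h + r + u
    # Secondary cases (red arrows): Ia contributes with reduced weight k
    new_inf = p.beta * s * (p.k * ia + i_s) / n_alive
    out_e, out_ia = e / p.t_e, ia / p.t_ia
    out_is, out_h = i_s / p.t_is, h / p.t_h
    return (
        s - dt * new_inf,
        e + dt * (new_inf - out_e),
        ia + dt * (out_e - out_ia),
        i_s + dt * (p.p_sym * out_ia - out_is),
        h + dt * (p.p_h * out_is - out_h),
        d + dt * p.p_d * out_h,
        r + dt * (1.0 - p.p_d) * out_h,
        u + dt * ((1.0 - p.p_sym) * out_ia + (1.0 - p.p_h) * out_is),
    )

# Hypothetical initial populations (S, E, Ia, Is, H, D, R, u)
state = (13_999_000.0, 500.0, 300.0, 200.0, 0.0, 0.0, 0.0, 0.0)
total0 = sum(state)
for _ in range(60):
    state = step(state, Params())
```

Every outflow in `step` reappears as an inflow elsewhere, so the total number of agents is conserved up to rounding; this is a quick consistency check on any implementation of such a diagram.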

3 The effective reproduction number

The effective reproduction number Rt is defined as the average number of secondary cases generated by one primary case. It is one of the main indicators of the speed of propagation of a disease. If Rt is smaller than 1, then the propagation of the epidemic is slowing down, while if Rt is bigger than 1, the epidemic is expanding. Various techniques exist for evaluating the effective reproduction number, see for example [3], and our approach with data assimilation leads to a new and rather natural one. We provide in Figure 2 the evaluation of Rt (mean and confidence intervals) in the discrete and in the continuous approach.

Figure 2: Effective reproduction number: (a) discrete approach, (b) continuous approach. The black curve indicates the mean value; the dark and light colored regions refer to the 68% and 95% confidence intervals, respectively. States of emergency correspond to grey regions. In (b) the jump at the end of May 2020 is due to an abrupt change in one medical constant [2].

In the graphs of Figure 2, the states of emergency (SoE) are indicated by the shaded regions. More precisely, the SoE took place on:

2020/4/7 - 2020/5/25,  2021/1/8 - 2021/3/21,  2021/4/25 - 2021/5/31,  2021/7/12 - ...

During the first three SoE, the mean effective reproduction number decreases, but the average decay of the mean depends on the period under consideration. If one evaluates this decay (taking an optimistic perspective, and accepting an effect lasting even after the official end of the SoE), one gets the average daily decay of Rt provided in Table 1 for the continuous approach. These values will be the basis for the predictions in the subsequent section. Note that our approach for estimating these values is very simple; a more precise and systematic technique is in preparation. Note also that the biggest average decay of Rt is observed during the first SoE, while the smallest one took place during the third SoE. In simpler terms, the first SoE has been the most effective one, while the third one has been the least effective one.

State of emergency    Average daily decay of Rt    Initial and final days considered
1                     0.0563                       2020/4/8 - 2020/5/31
2                     0.0147                       2021/1/15 - 2021/3/3
3                     0.0085                       2021/4/29 - 2021/6/16

Table 1: Average daily decays of Rt during the first three states of emergency in Tokyo, evaluated from Figure 2(b).
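To get a feeling for the orders of magnitude in Table 1, the following sketch converts each average daily decay into the total drop of Rt it implies over the period considered. It assumes that the average is a plain difference quotient (total drop divided by the number of days), which is an interpretation made here and not a statement about the exact procedure used for the table; the names `table1` and `implied_drop` are illustrative.

```python
from datetime import date

# (average daily decay of Rt, initial day, final day), copied from Table 1
table1 = {
    1: (0.0563, date(2020, 4, 8), date(2020, 5, 31)),
    2: (0.0147, date(2021, 1, 15), date(2021, 3, 3)),
    3: (0.0085, date(2021, 4, 29), date(2021, 6, 16)),
}

implied_drop = {}
for soe, (decay, start, end) in table1.items():
    days = (end - start).days
    # Assumption: linear decay, so total drop = daily decay * number of days
    implied_drop[soe] = decay * days
    print(f"SoE {soe}: {days} days, implied total drop of Rt = {implied_drop[soe]:.2f}")
```

Under this assumption the three periods last 53, 47, and 48 days, and the implied total drops of Rt are about 2.98, 0.69, and 0.41, respectively. The daily decay during the first SoE is thus roughly 6.6 times the one of the third SoE.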

Let us mention that this model and these two approaches can also be used for estimating the populations E, Ia, or Is, as well as the population of recovered agents. Additional unknown parameters can also be estimated with these approaches, and as for Rt it turns out that the two approaches lead to similar analysis results.

4 Forecasts

For this section and for simplicity, we shall use only the continuous approach. For the forecasts, we shall mimic the effects of the previous three SoE. Indeed, as shown in Table 1, each of them has produced a different decay of the effective reproduction number Rt. Since the fourth SoE (declared on July 12th, 2021) has not produced any noticeable effect as of July 31st, we have opted for two scenarios: (A) impose a decay of Rt starting on August 1st, (B) impose a decay of Rt starting on August 4th. From July 27th to the initial day of the decay, Rt is kept increasing at the rate observed in the last week of July. For the daily decay rate of Rt, we have used the three average daily decays measured during the previous SoE. For each of them, the number of agents in H (in hospital, in hotels, or at home) and the accumulated deceased agents D are shown below. As a means of comparison, we also recall the observed values of H and D for the last 17 months provided by [3]. At the technical level, let us mention that the death rate and the average time in H during the forecasts have been kept constant at their values of July 27th, 2021.
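The construction of the six scenarios can be sketched as follows: Rt rises linearly until the day before the decay starts, then decreases by the Table 1 rate each day. This is a simplified illustration, not the forecast itself, which propagates Rt through the full model to obtain H and D. The starting value `RT0`, the growth rate `GROWTH`, and the convention that the first reduced value of Rt occurs on the starting day are hypothetical choices made here.

```python
DECAYS = {1: 0.0563, 2: 0.0147, 3: 0.0085}   # average daily decays from Table 1
START_OFFSET = {"A": 5, "B": 8}              # days from July 27th to August 1st / 4th

def rt_path(rt0, growth, decay, start_offset, horizon):
    """Rt on days 0..horizon counted from July 27th: linear rise, then linear decay."""
    path = [rt0]
    for day in range(1, horizon + 1):
        # Rt keeps rising until the day before the decay starts
        change = growth if day < start_offset else -decay
        path.append(max(path[-1] + change, 0.0))
    return path

def first_day_below_one(path):
    """Index of the first day with Rt < 1, or None if this never happens."""
    return next((day for day, rt in enumerate(path) if rt < 1.0), None)

RT0, GROWTH = 1.4, 0.02   # hypothetical values, NOT taken from the analysis
results = {
    f"{name}{soe}": first_day_below_one(rt_path(RT0, GROWTH, decay, offset, 365))
    for name, offset in START_OFFSET.items()
    for soe, decay in DECAYS.items()
}
for label, day in sorted(results.items()):
    print(f"Scenario {label}: Rt < 1 from day {day} after July 27th")
```

Whatever positive values are chosen for `RT0 > 1` and `GROWTH`, the ordering is the same: a later start and a smaller decay rate both delay the day on which Rt falls below 1, and the number of agents in H only starts to decrease some time after that day.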

In Figure 3 we provide the real data and our analysis results from March 6th, 2020, to July 27th, 2021. From July 27th to August 3rd, the effective reproduction number is kept increasing at the rate observed during the few days before July 27th, and from August 4th a decay similar to the one observed during the first SoE is imposed.

Figure 3: (a) Agents in H, (b) accumulated deceased agents D. The red curve indicates the real values and the black curve shows the mean value of our analysis; the dark and light colored regions refer to the 68% and 95% confidence intervals, respectively. From July 27th onward, the curves show our forecast (scenario B) with a decay of Rt similar to the one of the first state of emergency.

Figure 4: Scenario A: (a) agents in H, (b) accumulated deceased agents D. In Ax, the index x corresponds to the average decay rate of Rt observed during the xth SoE. The light colored region refers to the 68% confidence interval.

In Figures 4 and 5, we provide the outcomes of the different scenarios, with a presentation starting on July 12th, 2021. Scenarios A (decay from August 1st) and B (decay from August 4th) are presented separately. The decays of the effective reproduction number observed during the first three SoE are implemented independently, and the numbering corresponds: 1 for the first SoE, 2 for the second one, and 3 for the third one. Clearly, scenario A1 (decay starting on August 1st with a rate similar to the first SoE) is the most optimistic one, while scenario B3 (decay starting on August 4th with a rate similar to the third SoE) is the most pessimistic one.

Figure 5: Scenario B: (a) agents in H, (b) accumulated deceased agents D. In Bx, the index x corresponds to the average decay rate of Rt observed during the xth SoE. The light colored region refers to the 68% confidence interval.

5 Dependence on some parameters

As mentioned in Section 2, the relative infectivity k of the asymptomatic agents compared to the symptomatic agents is a very uncertain parameter. For most of our simulations we have used k = 0.58, based on the information provided by the literature, see for example [4]. Since this ratio is quite uncertain, we performed similar investigations with ratios varying from 0.1 to 1.0. The corresponding mean values for Rt are shown in Figure 6. The patterns are similar, especially after May 31st, 2020. We conclude that, except for the first three months, the different values of the relative infectivity do not generate any noticeable difference in the effective reproduction number.

Figure 6: Mean Rt for different values of k: (a) discrete approach, (b) continuous approach.

Similarly, the ratio between symptomatic and asymptomatic agents is also a somewhat controversial parameter. For our simulations, we have used the respective proportions of 83% and 17%, see [4]. However, since asymptomatic cases are very difficult to detect, and since we cannot be fully confident in this ratio, a sensitivity test has been performed. To do this, we have increased the proportion of asymptomatic agents, but kept the parameter k = 0.58 constant. Compared to the original setting, in the new scenarios more agents become asymptomatic and recover without showing any symptoms. On the other hand, the number of symptomatic agents is more or less constant, since it is compared to the real data on a daily basis. In Figure 7 the different curves for Rt look very similar. However, we observe that a bigger proportion of asymptomatic agents leads to a slightly bigger value of Rt. Indeed, since asymptomatic agents are less infectious by the factor 0.58, the transmission coefficient has to be bigger to create enough secondary cases to fit the observations. As a consequence, Rt also becomes slightly bigger.
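The direction of this effect can be checked with a back-of-the-envelope computation. The sketch below is a strong simplification and not the sensitivity test itself: it assumes that all infectious agents stay infectious for the same time, so that the mean infectivity per agent is (1 - p) + k p for an asymptomatic proportion p, and it asks by which factor the transmission coefficient must grow to keep the number of secondary cases unchanged. The function name and the tested proportions 0.30 and 0.50 are hypothetical.

```python
def relative_transmission(p_asym, k=0.58, p_asym_ref=0.17):
    """Factor by which the transmission coefficient must grow, relative to the
    reference setting (17% asymptomatic), so that a fixed infectious population
    generates the same number of secondary cases (simplified weighting)."""
    weight = (1.0 - p_asym) + k * p_asym              # mean infectivity per agent
    weight_ref = (1.0 - p_asym_ref) + k * p_asym_ref  # same, reference setting
    return weight_ref / weight

# 0.17 is the proportion used in the text; 0.30 and 0.50 are hypothetical
factors = {p: relative_transmission(p) for p in (0.17, 0.30, 0.50)}
for p, factor in factors.items():
    print(f"asymptomatic proportion {p:.2f}: transmission factor {factor:.3f}")
```

In this simplified setting the factor equals 1 for the reference proportion and grows as the asymptomatic proportion grows, in line with the slightly bigger values of Rt observed in Figure 7.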

Figure 7: Mean Rt for different ratios between symptomatic and asymptomatic agents: (a) discrete approach, (b) continuous approach.

6 Conclusion

The mean decay of the effective reproduction number Rt during the first state of emergency in Tokyo is much higher than the ones observed during the second and the third states of emergency. Psycho-socio-economic factors can certainly explain this evolution, but it is not our purpose to discuss this aspect. On the other hand, by implementing these decay rates in our model of the evolution of the epidemic in Tokyo, we forecast that the sudden rise in the number of COVID-19 infections of the last week of July 2021 will either be contained within a few weeks (scenarios A1 and B1) or last several months (scenarios A2, A3, B2, and B3). In all of these scenarios we expect that the number of agents undergoing treatment (compartment H) will reach a maximum value bigger than any value observed during the last 17 months in Tokyo, unless the current SoE quickly becomes more effective than the first one.


[1] Memo1.pdf

[2] Memo2.pdf


[4] O. Byambasuren et al., J. Association of Medical Microbiology and Infectious Disease Canada, Vol. 5, no. 4, Dec. 2020, 223–234.