Frequentist Parameter Estimation (OLS)

We are given the SIR model

\( \begin{array}{lcl} \frac{d S}{d t} = \mu (N – S) – \beta IS & , & S(0) = 900, \\[10pt] \frac{d I}{d t} = \beta IS – (\gamma + \mu) I & , & I(0) = 100, \\[10pt] \frac{d R}{d t} = \gamma I – \mu R & , & R(0) = 0 \end{array} \)

where at time \(t\), \(S(t)\) is the number of susceptible individuals, \(I(t)\) is the number of infected individuals, \(R(t)\) is the number of recovered individuals, \(N=S(t)+I(t)+R(t)\) is the population size, \(\mu\) is the birth and death rate, \(\beta\) is the transmission rate, and \(\gamma\) is the recovery rate. Using a response based on the number of individuals infected at 150 evenly spaced time points, we aim to deduce \(\theta = [\beta, \gamma, \mu]^T\) from the frequentist perspective.
Using MATLAB’s ode45.m to approximate the solution to the above system, and then using MATLAB’s fminsearch.m to optimize our parameters (starting values of \(\beta=\gamma=\mu=0.1\)), we get our estimates: \(\widehat{\beta}=0.004027\), \(\widehat{\gamma}=0.348751\), and \(\widehat{\mu}=0.078226\).
Notice that our estimated parameters perform well in fitting the infection curve:

The entire SIR model is shown below. While the susceptible population begins to rise part way through the time period, the general shape looks approximately right.

Finally, we note that we decompose the transmission rate as the product of the contact rate \(k\) and infection coefficient \(\eta\) so that all parameters are identifiable; \(\beta=k\eta\). This is shown in pairs plot below, which was derived from taking a Bayesian approach to the same problem.



%%%1. Load data from the SIR_integer.txt file%%%
data = load('SIR_integer.txt');
time_data = data(:, 1);  %Time points (t_j)%
I_data = data(:, 2);     %Corresponding I(t_j) values (infected population)%

%%%2. Function defining the SIR model as a system of ODEs%%%
function dSIRdt=SIR_model(t, y, beta, gamma, mu)
    S = y(1);  %Susceptible population%
    I = y(2);  %Infected population%
    R = y(3);  %Recovered population%
    N = 1000;  %Total population size (S+I+R)%
    
    %ODEs for SIR model%
    dSdt= mu * (N - S) - beta * S * I;
    dIdt=beta * S * I - (gamma + mu) * I;
    dRdt=gamma * I - mu * R;
    
    %Return the derivatives as a column vector%
    dSIRdt = [dSdt; dIdt; dRdt];
end

%%%3.Objective function to compute error%%%
function error = objective_function(theta, time_data, I_data)
    %3a. Initial conditions%
    S0 = 900;
    I0 = 100;
    R0 = 0;
    y0 = [S0; I0; R0];  % Initial state vector [S(0), I(0), R(0)]
    
    %3b. Solve the ODE system using ode15s for the given parameters%
    [t, y] = ode15s(@(t, y) SIR_model(t, y, theta(1), theta(2), theta(3)), time_data, y0);
    
    %3c. Extract the simulated I values from the solution%
    I_sim = y(:, 2);
    
    %3d. Compute the sum of squared errors between the simulated and observed I values%
    error = sum((I_sim - I_data).^2) + 1e-6 * sum(theta.^2);
end

%%%4. Initial guesses and bounds for the parameters [beta, gamma, mu]%
theta0 = [0.01, 0.01, 0.01];
lb = [0, 0, 0];        %Lower bounds%
ub = [0.5, 0.5, 0.5];  %Upper bounds%

%%%5. Use fmincon to minimize the objective function and estimate parameters%%%
options = optimoptions('fmincon', 'Display', 'iter', 'Algorithm', 'sqp');
[theta_est, fval] = fmincon(@(theta) objective_function(theta, time_data, I_data), theta0, [], [], [], [], lb, ub, [], options);
disp('Estimated parameters:');
disp(['Beta: ', num2str(theta_est(1))]);
disp(['Gamma: ', num2str(theta_est(2))]);
disp(['Mu: ', num2str(theta_est(3))]);
disp('Final sum of squared errors:');
disp(fval);

%%%6. After optimization and plotting%%%
[t, y] = ode45(@(t, y) SIR_model(t, y, theta_est(1), theta_est(2), theta_est(3)), time_data, [900; 100; 0]);

plot(time_data, I_data, 'o', t, y(:,2), '-');
legend('Observed', 'Fitted');
xlabel('Time');
ylabel('Infected Population');
title('SIR Model Fit');