by Veronica Scerra
A compartmental epidemiological model analyzing HIV/AIDS transmission dynamics and contact tracing effectiveness in Cuba (1986-2000).
How do we measure the impact of a public health intervention that was never absent? This project tackles that question by building a mathematical model of HIV/AIDS epidemic dynamics in Cuba, explicitly incorporating the country's aggressive contact tracing program. By fitting the model to 15 years of surveillance data, I quantified the historical effectiveness of Cuba's program and projected the potential impact of enhanced interventions.
Cuba implemented one of the world's most comprehensive HIV/AIDS control programs starting in 1986, featuring mandatory testing for certain populations, extensive contact tracing, and universal free treatment. This created a unique natural experiment and raises a question: can mathematical modeling retrospectively quantify intervention effectiveness and guide future policy?
The data comes from De Arazoza & Lounes (2002), covering annual HIV cases, AIDS cases, and deaths from 1986-2000. The challenge was to build a model that could capture both disease biology (transmission, progression) and public health intervention (contact tracing) in a way that was computationally tractable yet policy-relevant.
Compartmental Design. I developed a compartmental model in the SEIR family, without an exposed class, with five compartments: Susceptible (S), HIV-positive untreated (H), HIV-positive treated (T), AIDS (A), and Deaths (D). The key innovation was splitting HIV-positive individuals into treated and untreated compartments, which allows contact tracing to be modeled explicitly as a transfer rate (τ) from H to T.
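The write-up does not list the model equations, so the following is only a minimal sketch of one plausible form of the five-compartment system. It assumes that only untreated individuals (H) transmit, that births balance natural deaths at rate μ, and that transmission is frequency-dependent; the function names, the fixed-step RK4 integrator, and the initial state are all hypothetical choices for illustration, not the project's code.

```python
# Fitted values from the parameter table; the equations below are an ASSUMED form.
PARAMS = {"beta": 0.932, "alpha": 0.235, "alpha_T": 0.085,
          "tau": 0.500, "delta": 0.132, "mu": 0.020}

def derivatives(state, p):
    """Right-hand side of the assumed S-H-T-A-D system."""
    S, H, T, A, D = state
    N = S + H + T + A                        # living population
    infections = p["beta"] * S * H / N       # assumption: only untreated (H) transmit
    dS = p["mu"] * N - infections - p["mu"] * S
    dH = infections - (p["tau"] + p["alpha"] + p["mu"]) * H
    dT = p["tau"] * H - (p["alpha_T"] + p["mu"]) * T   # contact tracing moves H -> T
    dA = p["alpha"] * H + p["alpha_T"] * T - (p["delta"] + p["mu"]) * A
    dD = p["delta"] * A                      # cumulative AIDS deaths
    return [dS, dH, dT, dA, dD]

def rk4_step(state, p, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, h):
        return [b + h * s for b, s in zip(base, slope)]
    k1 = derivatives(state, p)
    k2 = derivatives(shifted(state, k1, dt / 2), p)
    k3 = derivatives(shifted(state, k2, dt / 2), p)
    k4 = derivatives(shifted(state, k3, dt), p)
    return [x + dt / 6 * (a + 2 * b + 2 * c + d)
            for x, a, b, c, d in zip(state, k1, k2, k3, k4)]

def simulate(initial, p, years, steps_per_year=20):
    """Return the state at the start and at the end of each simulated year."""
    dt = 1.0 / steps_per_year
    state = list(initial)
    trajectory = [state]
    for _ in range(years):
        for _ in range(steps_per_year):
            state = rk4_step(state, p, dt)
        trajectory.append(state)
    return trajectory

# Hypothetical initial state (S, H, T, A, D); not taken from the Cuban data.
initial_state = [1_000_000.0, 100.0, 0.0, 0.0, 0.0]
trajectory = simulate(initial_state, PARAMS, years=15)
for year, (S, H, T, A, D) in zip(range(1986, 2002), trajectory):
    print(f"{year}: H={H:8.1f} T={T:8.1f} A={A:8.1f} D={D:8.1f}")
```

In this assumed form the five compartments sum to a constant, which gives a quick sanity check on any implementation.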
The model captures three critical mechanisms:
Why This Approach? Compartmental ODE models occupy a "sweet spot" for epidemic modeling: complex enough to capture key biological and intervention mechanisms, simple enough for robust parameter estimation from aggregate data, and fast enough for interactive exploration. Alternative approaches (agent-based models, statistical regression) would either require unavailable individual-level data or sacrifice mechanistic insight.
Omission of "Exposed" (E) compartment. Classic SEIR models have "exposed" compartments for diseases like measles (10-14 day incubation), tuberculosis (weeks-months of latent infection), and even COVID-19 (2-5 day incubation). Unlike those diseases, however, HIV doesn't have a meaningful non-infectious incubation period. HIV-infected individuals become infectious almost immediately after infection, even before they develop symptoms or test positive. Adding an E compartment would increase model complexity and require data on acute vs. chronic transmission rates (not available in the dataset I used), and the annual resolution of this model wouldn't capture week-scale dynamics. The cost-benefit verdict in this case is that it wouldn't be worth the additional model complexity to answer the general policy question of contact tracing effectiveness over years.
Optimization Challenge. Estimating 6 parameters (β, α, αT, τ, δ, μ) from epidemiological data is non-trivial. The objective function—matching model predictions to observed HIV cases, AIDS cases, and deaths—involves solving a system of coupled ODEs at each evaluation, creating a highly non-linear, non-convex optimization landscape with multiple local minima.
Differential Evolution Solution. I used differential evolution, a population-based global optimizer that explores parameter space by evolving a population of candidate solutions through mutation, crossover, and greedy selection. Unlike gradient-based methods, it doesn't require derivatives (which would necessitate expensive adjoint equations) and is robust to local minima. The algorithm ran for 300 generations with a population of 90 individuals (~27,000 ODE solves), completing in 2-3 minutes on standard hardware.
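To make the mutation, crossover, and greedy-selection loop concrete, here is a minimal pure-Python sketch of the common "rand/1/bin" variant of differential evolution. It is an illustration rather than the project's optimizer: the toy objective fits a two-parameter exponential curve to synthetic data instead of solving the HIV model's ODEs, and the population size, generation count, and control constants are assumptions chosen to keep the example fast.

```python
import math
import random

def differential_evolution(objective, bounds, pop_size=30, generations=200,
                           mutation=0.7, crossover=0.9, seed=0):
    """Minimize `objective` over box `bounds` with DE/rand/1/bin."""
    rng = random.Random(seed)
    dim = len(bounds)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    scores = [objective(ind) for ind in population]
    for _ in range(generations):
        for i in range(pop_size):
            # Mutation: pick three distinct individuals other than i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one gene comes from the mutant
            trial = []
            for k, (lo, hi) in enumerate(bounds):
                if k == forced or rng.random() < crossover:
                    value = population[a][k] + mutation * (
                        population[b][k] - population[c][k])
                    value = min(max(value, lo), hi)  # clip to the bounds
                else:
                    value = population[i][k]
                trial.append(value)
            # Greedy selection: keep the trial only if it is at least as good.
            trial_score = objective(trial)
            if trial_score <= scores[i]:
                population[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=scores.__getitem__)
    return population[best], scores[best]

# Synthetic data from y = y0 * exp(r * t); the "true" values are hypothetical.
TIMES = list(range(15))
TRUE_Y0, TRUE_R = 50.0, 0.25
DATA = [TRUE_Y0 * math.exp(TRUE_R * t) for t in TIMES]

def sse(params):
    """Sum of squared errors between the toy curve and the synthetic data."""
    y0, r = params
    return sum((y0 * math.exp(r * t) - obs) ** 2 for t, obs in zip(TIMES, DATA))

best_params, best_score = differential_evolution(sse, [(1.0, 200.0), (0.0, 1.0)])
print("estimated y0, r:", best_params)
```

For the real problem, the objective would run the ODE solver once per candidate parameter vector, which is where the roughly 27,000 solves come from. Library implementations such as SciPy's `scipy.optimize.differential_evolution` offer the same idea with more options.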
| Parameter | Value | Interpretation |
|---|---|---|
| β | 0.932 | Transmission rate (per year) |
| α | 0.235 | HIV→AIDS progression, untreated (mean: 4.3 years) |
| αT | 0.085 | HIV→AIDS progression, treated (mean: 11.8 years) |
| τ | 0.500 | Contact tracing rate (per year); probability of being traced before progression to AIDS or natural death: τ/(τ+α+μ) ≈ 66% coverage |
| δ | 0.132 | AIDS mortality rate (mean survival: 7.6 years) |
| μ | 0.020 | Natural birth/death rate |
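The interpretation column follows from simple arithmetic on the fitted rates: a mean duration is the reciprocal of the corresponding rate, and coverage is the share of exits from H that go through tracing. The short check below reproduces those figures from the rounded table values; the variable names are illustrative.

```python
# Fitted rates (per year) copied from the table above.
alpha, alpha_T, tau, delta, mu = 0.235, 0.085, 0.500, 0.132, 0.020

# Mean waiting time for a constant-rate process is 1 / rate.
mean_years_to_aids_untreated = 1 / alpha
mean_years_to_aids_treated = 1 / alpha_T
mean_years_aids_survival = 1 / delta

# Probability that tracing happens before progression to AIDS or natural death.
coverage = tau / (tau + alpha + mu)

print(f"untreated time to AIDS: {mean_years_to_aids_untreated:.1f} years")
print(f"treated time to AIDS:   {mean_years_to_aids_treated:.1f} years")
print(f"AIDS survival:          {mean_years_aids_survival:.1f} years")
print(f"tracing coverage:       {coverage:.0%}")
```

Note that 1/α is the mean time to progression in the absence of competing exits; the mean time actually spent in H is shorter, 1/(τ+α+μ), because tracing and natural death also remove people from that compartment.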
Key Epidemiological Insight: The fitted basic reproduction number R₀ = 1.24 indicates slow but sustained epidemic growth—close enough to 1 that enhanced interventions could realistically push it below the epidemic threshold. Treatment extends healthy life by 2.7×, and Cuba's contact tracing achieved 66% coverage, far exceeding most countries during this era.
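The write-up does not state which expression was used for R₀. As a rough cross-check, assume that only untreated individuals transmit and that nearly everyone is susceptible early on; then each infected person generates β new infections per year for an average of 1/(τ+α+μ) years, so R₀ ≈ β/(τ+α+μ). With the rounded table values this gives about 1.23, close to the reported 1.24; the small gap could come from parameter rounding or from a different R₀ formula in the original model. Under the same assumption, the tracing rate needed to reach R₀ = 1 is β − α − μ.

```python
# Fitted rates (per year) from the parameter table.
beta, alpha, tau, mu = 0.932, 0.235, 0.500, 0.020

# ASSUMPTION: only untreated individuals (H) transmit, and S is close to N.
exit_rate = tau + alpha + mu          # total rate of leaving H
r0 = beta / exit_rate                 # infections per year times mean years in H

# Tracing rate at which R0 would equal exactly 1 under this assumption.
tau_critical = beta - alpha - mu

print(f"R0 under the stated assumption: {r0:.2f}")
print(f"tau needed for R0 = 1:          {tau_critical:.3f} per year")
print(f"ratio to fitted tau:            {tau_critical / tau:.2f}x")
```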
The model achieved excellent fit to all three data streams:
| Outcome | R² Value | Assessment |
|---|---|---|
| HIV cases | 0.871 | 87.1% of variance explained |
| AIDS cases | 0.948 | 94.8% of variance explained |
| Deaths | 0.929 | 92.9% of variance explained |
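For reference, R² here is presumably the usual coefficient of determination, one minus the ratio of residual to total sum of squares. A minimal sketch of that calculation follows; the observed and predicted series are made-up numbers for illustration, not the Cuban surveillance data.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot

# Hypothetical annual counts and model predictions, for illustration only.
observed = [10, 14, 21, 30, 44, 61]
predicted = [11, 15, 20, 29, 42, 64]
print(f"R^2 = {r_squared(observed, predicted):.3f}")
```

A perfect fit gives 1.0, and a model that only predicts the mean of the observations gives 0.0.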
Visual inspection confirmed the model captures both long-term trends (increasing cases) and relative timing (AIDS lags behind HIV, deaths lag behind AIDS). The fitted compartment dynamics reveal an interesting phenomenon: the treated (T) compartment grows fastest, eventually exceeding all other disease states. This isn't a model artifact—it reflects the real-world impact of successful treatment programs that convert HIV from a rapidly fatal disease into a chronic manageable condition.
Scenario Design. I projected epidemic trajectories under four contact tracing intensities starting in 1997: baseline (continue existing program), moderate (2× enhancement), strong (4× enhancement), and very strong (6× enhancement). The analysis quantifies both absolute impact (cases averted) and relative effectiveness (percent reduction).
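The scenario mechanics can be sketched by scaling τ and re-running the model from a common starting state. The code below is a simplified illustration only: it reuses an assumed form of the equations (only untreated individuals transmit), a hypothetical 1997 starting state, and plain Euler stepping, so its numbers will not match the reported results.

```python
# Fitted rates (per year) from the parameter table.
BASE = {"beta": 0.932, "alpha": 0.235, "alpha_T": 0.085,
        "tau": 0.500, "delta": 0.132, "mu": 0.020}

def cumulative_aids_cases(tau_multiplier, years=13, steps_per_year=100):
    """New AIDS cases accumulated over `years` (1997 to 2010) for a scaled tau."""
    p = dict(BASE, tau=BASE["tau"] * tau_multiplier)
    # HYPOTHETICAL starting state, not the fitted 1997 state.
    S, H, T, A = 11_000_000.0, 1_000.0, 500.0, 200.0
    dt = 1.0 / steps_per_year
    new_aids = 0.0
    for _ in range(years * steps_per_year):
        N = S + H + T + A
        infections = p["beta"] * S * H / N        # assumption: only H transmits
        progression = p["alpha"] * H + p["alpha_T"] * T
        dS = p["mu"] * N - infections - p["mu"] * S
        dH = infections - (p["tau"] + p["alpha"] + p["mu"]) * H
        dT = p["tau"] * H - (p["alpha_T"] + p["mu"]) * T
        dA = progression - (p["delta"] + p["mu"]) * A
        S, H, T, A = S + dt * dS, H + dt * dH, T + dt * dT, A + dt * dA
        new_aids += dt * progression              # running total of AIDS incidence
    return new_aids

cases = {m: cumulative_aids_cases(m) for m in (1, 2, 4, 6)}
for m in (2, 4, 6):
    averted = cases[1] - cases[m]
    print(f"{m}x tracing: {averted:8.0f} averted ({averted / cases[1]:.1%} reduction)")
```

Cases averted are measured against the baseline run (multiplier 1), which is how both the absolute and the percentage figures are defined.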
| Intervention | AIDS Cases Averted by 2010 | Deaths Averted by 2010 | % Reduction in AIDS Cases |
|---|---|---|---|
| Moderate (2×) | ~7210 | ~3595 | 89.3% |
| Strong (4×) | ~7541 | ~4032 | 93.4% |
| Very Strong (6×) | ~7583 | ~4109 | 93.9% |
Diminishing Returns. While all scenarios show substantial benefit, returns diminish with intensity. Doubling contact tracing (2×) averts 7210 AIDS cases; doubling again (4×) adds only 331 more cases averted. This suggests an optimal balance around 3-4× baseline capacity, where impact remains high without hitting logistical constraints.
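The diminishing-returns pattern can be read directly off the scenario table. The snippet below computes the marginal AIDS cases averted at each step up in intensity, plus the baseline case count implied by each row's percentage; only the table values are used, and the variable names are illustrative.

```python
# Values copied from the scenario table (AIDS cases averted, fractional reduction).
scenarios = [("2x", 7210, 0.893), ("4x", 7541, 0.934), ("6x", 7583, 0.939)]

# Marginal gain: additional cases averted relative to the previous intensity.
marginal = []
previous = 0
for label, averted, _ in scenarios:
    marginal.append(averted - previous)
    previous = averted

# Baseline AIDS cases implied by each row: averted / fractional reduction.
implied_baseline = [averted / reduction for _, averted, reduction in scenarios]

for (label, _, _), gain, base in zip(scenarios, marginal, implied_baseline):
    print(f"{label}: +{gain} cases averted, implied baseline ~{base:.0f}")
```

All three rows imply a baseline of roughly 8,070 to 8,080 AIDS cases by 2010, so the percentages are consistent with the absolute figures.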
Practical Feasibility. The moderate scenario (2× tracing) is highly feasible with modest investment—approximately $10-20M/year for a population of 11 million, yielding a cost per life-year saved of ~$25-35K, which is highly cost-effective by WHO standards. The strong scenario (4×) is achievable but requires major investment and multi-year ramp-up. The very strong scenario (6×) likely hits practical ceilings around staffing, infrastructure, and the hardest-to-reach populations.
This analysis demonstrates that mathematical modeling can quantify intervention effectiveness even for programs that were never absent. Key findings:
Methodological Decisions and Reasoning
Current Limitations. The model assumes homogeneous mixing (no risk groups, networks, or geographic structure), constant parameters over time (ignoring behavioral changes and treatment improvements post-1996), and perfect adherence once treated. No uncertainty quantification means we have point estimates without confidence intervals.
When This Model Is/Isn't Appropriate. This approach excels at population-level policy planning, medium-term projections (5-15 years), and comparative scenario analysis. It's inappropriate for individual-level predictions, short-term outbreak response requiring stochasticity, or situations demanding detailed population stratification.
Extensions Worth Pursuing. Priority improvements include Bayesian inference for uncertainty quantification, stratification by risk group (MSM, heterosexual, IDU), time-varying parameters to capture HAART introduction, and cost-effectiveness analysis integrating economic data. The modular codebase makes these additions straightforward.
This work sits within a 40-year tradition of HIV/AIDS modeling, from early SIR models in the 1980s through treatment-era models in the 1990s to modern combination prevention frameworks. The contribution here is demonstrating how established methodology can be applied to unique datasets (Cuba's program) to answer policy-relevant questions (intervention effectiveness) using modern computational tools (differential evolution, modular Python).
The project validates that compartmental models remain valuable despite growing interest in agent-based and machine learning approaches. When the question is "how effective is this intervention?" rather than "what will happen to this individual?", mechanistic population models provide interpretable, policy-actionable insights that more complex approaches often obscure.