
HIV/AIDS Epidemic Modeling in Cuba: Quantifying Contact Tracing Impact

by Veronica Scerra

A compartmental epidemiological model analyzing HIV/AIDS transmission dynamics and contact tracing effectiveness in Cuba (1986-2000).

How do we measure the impact of a public health intervention that was never absent? This project tackles that question by building a mathematical model of HIV/AIDS epidemic dynamics in Cuba, explicitly incorporating the country's aggressive contact tracing program. By fitting the model to 15 years of surveillance data, I both quantified the historical effectiveness of Cuba's program and projected the potential impact of enhanced interventions.

Background and Motivation

Cuba implemented one of the world's most comprehensive HIV/AIDS control programs starting in 1986, featuring mandatory testing for certain populations, extensive contact tracing, and universal free treatment. This created a unique natural experiment: could mathematical modeling retrospectively quantify intervention effectiveness and guide future policy?

The data comes from De Arazoza & Lounes (2002), covering annual HIV cases, AIDS cases, and deaths from 1986-2000. The challenge was to build a model that could capture both disease biology (transmission, progression) and public health intervention (contact tracing) in a way that was computationally tractable yet policy-relevant.

Model Structure

Compartmental Design. I developed a modified SEIR-type model with 5 compartments: Susceptible (S), HIV-positive untreated (H), HIV-positive treated (T), AIDS (A), and Deaths (D). The key innovation was splitting HIV-positive individuals into treated vs. untreated compartments, allowing explicit modeling of contact tracing as a transfer rate (τ) from H to T.
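A minimal sketch of how such a five-compartment system could be written down and integrated. The exact equations are not given in this write-up, so everything here is an assumed form: only untreated individuals (H) transmit, births balance natural deaths at rate μ, and the names `hiv_rhs`, `rk4_step`, and `simulate` plus the initial conditions are hypothetical. The parameter values are the fitted ones reported in the Parameter Estimation section.

```python
# Illustrative sketch only: an assumed S-H-T-A-D system, not the author's code.

def hiv_rhs(state, beta, alpha, alpha_t, tau, delta, mu):
    """Right-hand side of the assumed system (all rates per year)."""
    S, H, T, A, D = state
    N = S + H + T + A                      # living population
    new_infections = beta * S * H / N      # assumption: only untreated H transmit
    dS = mu * N - new_infections - mu * S  # births balance natural deaths
    dH = new_infections - (tau + alpha + mu) * H
    dT = tau * H - (alpha_t + mu) * T      # tau moves traced cases into treatment
    dA = alpha * H + alpha_t * T - (delta + mu) * A
    dD = delta * A                         # cumulative AIDS deaths
    return [dS, dH, dT, dA, dD]


def rk4_step(f, state, dt, *params):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(state, *params)
    k2 = f([s + 0.5 * dt * k for s, k in zip(state, k1)], *params)
    k3 = f([s + 0.5 * dt * k for s, k in zip(state, k2)], *params)
    k4 = f([s + dt * k for s, k in zip(state, k3)], *params)
    return [s + dt / 6.0 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


def simulate(state0, params, years, dt=0.01):
    """Integrate the system forward with fixed-step RK4 and return the final state."""
    state = list(state0)
    for _ in range(int(round(years / dt))):
        state = rk4_step(hiv_rhs, state, dt, *params)
    return state


# (beta, alpha, alpha_t, tau, delta, mu): fitted values from the article.
params = (0.932, 0.235, 0.085, 0.500, 0.132, 0.020)
# Hypothetical initial conditions, chosen only to make the sketch runnable.
state0 = [1_000_000.0, 100.0, 0.0, 0.0, 0.0]
final = simulate(state0, params, years=15)
print([round(x, 1) for x in final])
```

In this assumed form every outflow from one compartment is an inflow to another, so S + H + T + A + D stays constant, which makes a convenient sanity check on any implementation.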

The model captures three critical mechanisms:

Why This Approach? Compartmental ODE models occupy a "sweet spot" for epidemic modeling: complex enough to capture key biological and intervention mechanisms, simple enough for robust parameter estimation from aggregate data, and fast enough for interactive exploration. Alternative approaches (agent-based models, statistical regression) would either require unavailable individual-level data or sacrifice mechanistic insight.

Omission of "Exposed" (E) compartment. Classic SEIR models have "exposed" compartments for diseases like measles (10-14 day incubation), tuberculosis (weeks-months of latent infection), and even COVID-19 (2-5 day incubation). Unlike those diseases, however, HIV doesn't have a meaningful non-infectious incubation period. HIV-infected individuals become infectious almost immediately after infection, even before they develop symptoms or test positive. Adding an E compartment would increase model complexity and require data on acute vs. chronic transmission rates (not available in the dataset I used), and the annual resolution of this model wouldn't capture week-scale dynamics. The cost-benefit verdict in this case is that it wouldn't be worth the additional model complexity to answer the general policy question of contact tracing effectiveness over years.

Parameter Estimation

Optimization Challenge. Estimating 6 parameters (β, α, αT, τ, δ, μ) from epidemiological data is non-trivial. The objective function—matching model predictions to observed HIV cases, AIDS cases, and deaths—involves solving a system of coupled ODEs at each evaluation, creating a highly non-linear, non-convex optimization landscape with multiple local minima.

Differential Evolution Solution. I used differential evolution, a population-based global optimizer that explores parameter space by evolving a population of candidate solutions through mutation, crossover, and greedy selection. Unlike gradient-based methods, it doesn't require derivatives (which would necessitate expensive adjoint equations) and is robust to local minima. The algorithm ran for 300 generations with a population of 90 individuals (~27,000 ODE solves), completing in 2-3 minutes on standard hardware.
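The mutation, crossover, and greedy-selection loop described above can be sketched in a few lines. This is a bare-bones rand/1/bin variant written for illustration, not the optimizer used in the project; the function name, the settings `F` and `CR`, and the toy objective (a quadratic bowl standing in for the ODE-based misfit) are all assumptions.

```python
import random


def differential_evolution(objective, bounds, pop_size=20, generations=100,
                           F=0.8, CR=0.9, seed=0):
    """Minimal rand/1/bin differential evolution for minimization."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    cost = [objective(x) for x in pop]

    for _ in range(generations):
        for i in range(pop_size):
            # Mutation: pick three distinct members other than i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one gene comes from the mutant
            trial = []
            for d, (lo, hi) in enumerate(bounds):
                if d == forced or rng.random() < CR:
                    v = pop[a][d] + F * (pop[b][d] - pop[c][d])
                    v = min(max(v, lo), hi)  # clip to the search bounds
                else:
                    v = pop[i][d]            # crossover keeps the parent's gene
                trial.append(v)
            # Greedy selection: the trial replaces the parent only if no worse.
            trial_cost = objective(trial)
            if trial_cost <= cost[i]:
                pop[i], cost[i] = trial, trial_cost

    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]


def toy_loss(x):
    """Hypothetical stand-in for the misfit between model output and data."""
    return (x[0] - 0.5) ** 2 + (x[1] - 0.2) ** 2


best_x, best_cost = differential_evolution(toy_loss, [(0.0, 2.0), (0.0, 1.0)])
print(best_x, best_cost)
```

In a real fit, `toy_loss` would be replaced by a function that solves the ODE system for a candidate parameter vector and returns the summed error against HIV cases, AIDS cases, and deaths. If SciPy's `scipy.optimize.differential_evolution` were used instead (the article does not say which implementation was run), note that its `popsize` argument is a multiplier on the number of parameters, so the default of 15 with six parameters gives a population of 90.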

| Parameter | Value | Interpretation |
|---|---|---|
| β | 0.932 | Transmission rate (per year) |
| α | 0.235 | HIV→AIDS progression rate, untreated (mean: 4.3 years) |
| αT | 0.085 | HIV→AIDS progression rate, treated (mean: 11.8 years) |
| τ | 0.500 | Contact tracing rate (66% coverage, i.e. probability of being traced before AIDS: τ/(τ+α+μ)) |
| δ | 0.132 | AIDS mortality rate (mean survival: 7.6 years) |
| μ | 0.020 | Natural birth/death rate |

Key Epidemiological Insight: The fitted basic reproduction number R₀ = 1.24 indicates slow but sustained epidemic growth, close enough to 1 that enhanced interventions could realistically push it below the epidemic threshold. Treatment lengthens the mean time to AIDS about 2.7-fold (11.8 vs. 4.3 years), and Cuba's contact tracing achieved 66% coverage, far exceeding most countries during this era.
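The derived quantities quoted with the fitted parameters follow from simple arithmetic on the rates, which is worth checking directly. The mean durations are reciprocals of the rates, and coverage is τ/(τ+α+μ) as stated in the table. The R₀ expression below is an assumption (it treats only untreated individuals as infectious, so R₀ ≈ β/(τ+α+μ)); the article does not state which formula was used.

```python
# Fitted parameter values from the article (rates per year).
beta, alpha, alpha_t, tau, delta, mu = 0.932, 0.235, 0.085, 0.500, 0.132, 0.020

mean_untreated = 1 / alpha             # mean years from HIV to AIDS, untreated
mean_treated = 1 / alpha_t             # mean years from HIV to AIDS, treated
mean_aids_survival = 1 / delta         # mean years of survival with AIDS
coverage = tau / (tau + alpha + mu)    # probability traced before AIDS or death
r0_assumed = beta / (tau + alpha + mu) # assumed form: only untreated H transmit

print(f"{mean_untreated:.1f} {mean_treated:.1f} {mean_aids_survival:.1f}")
# prints: 4.3 11.8 7.6
print(f"{coverage:.0%} {r0_assumed:.2f}")
# prints: 66% 1.23
```

Under that assumption R₀ comes out at about 1.23, close to the reported 1.24; the small gap could reflect rounding in the published parameter values or a slightly different R₀ expression in the fitted model.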

Model Validation

The model achieved excellent fit to all three data streams:

| Outcome | R² Value | Assessment |
|---|---|---|
| HIV cases | 0.871 | 87.1% of variance explained |
| AIDS cases | 0.948 | 94.8% of variance explained |
| Deaths | 0.929 | 92.9% of variance explained |
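For reference, a short sketch of how an R² value like those in the table is typically computed, assuming the standard coefficient of determination (one minus the ratio of residual to total sum of squares). The function name and the four data points are hypothetical and are not the Cuban surveillance series.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical yearly counts and model predictions, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
r2 = r_squared(observed, predicted)
print(round(r2, 3))  # 0.964
```

Here the residual sum of squares is 18 and the total sum of squares is 500, so R² = 1 − 18/500 = 0.964.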

Visual inspection confirmed the model captures both long-term trends (increasing cases) and relative timing (AIDS lags behind HIV, deaths lag behind AIDS). The fitted compartment dynamics reveal an interesting phenomenon: the treated (T) compartment grows fastest, eventually exceeding all other disease states. This isn't a model artifact—it reflects the real-world impact of successful treatment programs that convert HIV from a rapidly fatal disease into a chronic manageable condition.

Intervention Analysis

Scenario Design. I projected epidemic trajectories under four contact tracing intensities starting in 1997: baseline (continue existing program), moderate (2× enhancement), strong (4× enhancement), and very strong (6× enhancement). The analysis quantifies both absolute impact (cases averted) and relative effectiveness (percent reduction).

| Intervention | AIDS Cases Averted by 2010 | Deaths Averted | % Reduction |
|---|---|---|---|
| Moderate (2×) | ~7210 | ~3595 | 89.3% |
| Strong (4×) | ~7541 | ~4032 | 93.4% |
| Very Strong (6×) | ~7583 | ~4109 | 93.9% |

Diminishing Returns. While all scenarios show substantial benefit, returns diminish with intensity. Doubling contact tracing (2×) averts 7210 AIDS cases; doubling again (4×) adds only 331 more cases averted. This suggests an optimal balance around 3-4× baseline capacity, where impact remains high without hitting logistical constraints.
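The diminishing returns can be read straight off the scenario table by differencing successive rows. The snippet below does only that arithmetic on the (approximate) figures reported in the article; the helper name `marginal_gains` is hypothetical.

```python
# Approximate totals averted by 2010, keyed by tracing multiplier (from the scenario table).
aids_averted = {2: 7210, 4: 7541, 6: 7583}
deaths_averted = {2: 3595, 4: 4032, 6: 4109}


def marginal_gains(averted):
    """Additional cases averted when stepping up to the next tracing intensity."""
    levels = sorted(averted)
    return {f"{lo}x->{hi}x": averted[hi] - averted[lo]
            for lo, hi in zip(levels, levels[1:])}


aids_gains = marginal_gains(aids_averted)
death_gains = marginal_gains(deaths_averted)
print(aids_gains)   # {'2x->4x': 331, '4x->6x': 42}
print(death_gains)  # {'2x->4x': 437, '4x->6x': 77}
```

Going from 4× to 6× adds only about 42 AIDS cases averted, roughly an eighth of the gain from 2× to 4×.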

Practical Feasibility. The moderate scenario (2× tracing) is highly feasible with modest investment—approximately $10-20M/year for a population of 11 million, yielding a cost per life-year saved of ~$25-35K, which is highly cost-effective by WHO standards. The strong scenario (4×) is achievable but requires major investment and multi-year ramp-up. The very strong scenario (6×) likely hits practical ceilings around staffing, infrastructure, and the hardest-to-reach populations.

Key Insights

This analysis demonstrates that mathematical modeling can quantify intervention effectiveness even for programs that were never absent. Key findings:

Methodological Decisions and Reasoning

Limitations and Future Directions

Current Limitations. The model assumes homogeneous mixing (no risk groups, networks, or geographic structure), constant parameters over time (ignoring behavioral changes and treatment improvements post-1996), and perfect adherence once treated. No uncertainty quantification means we have point estimates without confidence intervals.

When This Model Is/Isn't Appropriate. This approach excels at population-level policy planning, medium-term projections (5-15 years), and comparative scenario analysis. It's inappropriate for individual-level predictions, short-term outbreak response requiring stochasticity, or situations demanding detailed population stratification.

Extensions Worth Pursuing. Priority improvements include Bayesian inference for uncertainty quantification, stratification by risk group (MSM, heterosexual, IDU), time-varying parameters to capture HAART introduction, and cost-effectiveness analysis integrating economic data. The modular codebase makes these additions straightforward.

Broader Context

This work sits within a 40-year tradition of HIV/AIDS modeling, from early SIR models in the 1980s through treatment-era models in the 1990s to modern combination prevention frameworks. The contribution here is demonstrating how established methodology can be applied to unique datasets (Cuba's program) to answer policy-relevant questions (intervention effectiveness) using modern computational tools (differential evolution, modular Python).

The project validates that compartmental models remain valuable despite growing interest in agent-based and machine learning approaches. When the question is "how effective is this intervention?" rather than "what will happen to this individual?", mechanistic population models provide interpretable, policy-actionable insights that more complex approaches often obscure.
