MILP-Based Virtual Population Calibration

Matching Virtual Population to Clinical Trial Characteristics

Authors

Vijay Ivaturi

David Müller-Widmann

HBV Tutorial Series

This is Tutorial 5 of 6 in the HBV series.

#	Tutorial	Status
1	Model Introduction	✓ Complete
2	Global Sensitivity Analysis	✓ Complete
3	Structural Identifiability	✓ Complete
4	Copula Vpop Generation	✓ Complete
5	MILP Calibration (current)	← You are here
6	VCT Simulation

1 Learning Outcomes

Calibration ensures the virtual population matches clinical trial baseline characteristics. For HBV, baseline HBsAg is a critical stratification variable that strongly predicts treatment response (Cortés-Ríos et al., 2025).

In this tutorial, you will learn:

Why baseline HBsAg calibration is critical for HBV VCTs
How to apply ILP calibration (from TB-05) to HBV
How to define target distributions from clinical trial data
How to validate calibration results

Prerequisites

This tutorial builds on:

HBV-04: HBV vpop generation
TB-05: Detailed ILP theory

We’ll apply ILP concepts without repeating the full theory.

2 Libraries

using Pumas
using StatsBase
using JuMP
using HiGHS
import MultiObjectiveAlgorithms as MOA
using Copulas
using AlgebraOfGraphics
using CairoMakie
using DataFramesMeta
using SummaryTables

using Statistics
using Random

# Set seed for reproducibility
Random.seed!(1234)

3 Part 1: Why Baseline HBsAg Calibration?

3.1 Clinical Significance

Baseline HBsAg level is the most important predictor of treatment response in HBV:

Low HBsAg (<100 IU/mL): Higher likelihood of functional cure
High HBsAg (>1000 IU/mL): Lower response, longer treatment needed

3.2 Everest Trial Distribution

The Everest trial is a key reference for HBV treatment. Here’s their baseline HBsAg distribution:

# Everest trial baseline HBsAg distribution
everest_target = DataFrame(
    hbsag_range = ["<100", "[100, 200)", "[200, 500)", "[500, 1000)", "≥1000"],
    log10_lower = [-Inf, log10(100), log10(200), log10(500), log10(1000)],
    log10_upper = [log10(100), log10(200), log10(500), log10(1000), Inf],
    pct = [37.5, 10.9, 19.1, 21.5, 11.0]
)

simple_table(everest_target, [:hbsag_range => "HBsAg", :pct => "Target (%)"])


HBsAg	Target (%)

<100	37.5
[100, 200)	10.9
[200, 500)	19.1
[500, 1000)	21.5
≥1000	11

4 Part 2: Generate HBV Virtual Population

We generate a virtual population using the Gaussian copula approach from HBV-04, then simulate 5 years of untreated chronic infection followed by 4 years of NA suppression to obtain each patient’s baseline HBsAg.

4.1 Model and Parameters

hbv_model = @model begin
    @param begin
        # Estimated parameters (log10 scale)
        "Production/synthesis rate of HBsAg subviral particles from infected hepatocytes [molec./cell/d]"
        tv_log10_p_S ∈ RealDomain(; lower = 2, upper = 12)
        "HBV infectivity [mL/virus/d]"
        tv_log10_β ∈ RealDomain(; lower = -10, upper = 0)
        "Killing rate of infected hepatocytes by cellular immunity [mL/cells/d]"
        tv_log10_m ∈ RealDomain(; lower = -3, upper = 5)
        "Synthesis rate of antigen presenting cells [cells/d]"
        tv_log10_k_P ∈ RealDomain()
        "Synthesis rate of ALT from debris hepatocytes [10^3 U/cells/d]"
        tv_log10_k_A ∈ RealDomain()
        "PD efficacy of NA on HBV production []"
        tv_log10_ϵ_NA ∈ RealDomain(; lower = -5, upper = 0)
        "PD efficacy of IFNα on HBV production []"
        tv_log10_ϵ_IFN ∈ RealDomain(; lower = -5, upper = 0)
        "PD efficacy of IFNα on activation of cellular immunity []"
        tv_log10_r_E_IFN ∈ RealDomain(; lower = 0, upper = 4)
        "Scaling factor for observed effector CD4/8 T cell numbers []"
        tv_log10_conv_E ∈ RealDomain()

        # IIV of estimated parameters in log10 scale
        Ω ∈ PDiagDomain(9)

        # Fixed parameters
        "Baseline of HBV [copies/mL]"
        ini_V ∈ RealDomain(; lower = 0.0)
        "Baseline of ALT [U/L]"
        ini_A ∈ RealDomain(; lower = 0.0)
        "Max. proliferation rate of uninfected hepatocytes [1/d]"
        r_T ∈ RealDomain(; lower = 0.0)
        "Max. proliferation rate of cellular immunity [1/d]"
        r_E ∈ RealDomain(; lower = 0.0)
        "Max. proliferation rate of innate immunity [1/d]"
        r_X ∈ RealDomain(; lower = 0.0)
        "Fraction of cellular immunity that killing uninfected and refractory cells []"
        n ∈ RealDomain(; lower = 0.0)
        "Max. rate of conversion of infected cells to refractory cells due to the action of innate immunity [1/d]"
        ρ ∈ RealDomain(; lower = 0.0)
        "Max. rate of conversion of refractory cells back to target cells in the absence of innate immune responses []"
        ρ_o_ρ_r ∈ RealDomain(; lower = 0.0)
        "Production/synthesis rate of HBV from infected hepatocytes [virus/cell/d]"
        p_V ∈ RealDomain(; lower = 0.0)
        "Death rate of infected hepatocytes [1/d]"
        d_TI ∈ RealDomain(; lower = 0.0)
        "Degradation/clearance of HBV HBsAg subviral particles [1/d]"
        d_V ∈ RealDomain(; lower = 0.0)
        "Rate of systemic induction/recovery from exhaustion [1/d]"
        d_Q ∈ RealDomain(; lower = 0.0)
        "Degradation/clearance of innate immunity [1/d]"
        d_X ∈ RealDomain(; lower = 0.0)
        "Degradation/clearance of cellular immunity [1/d]"
        d_E ∈ RealDomain(; lower = 0.0)
        "Degradation/clearance of antigen presenting cells [1/d]"
        d_P ∈ RealDomain(; lower = 0.0)
        "Degradation/clearance of debris [1/d]"
        d_D ∈ RealDomain(; lower = 0.0)
        "Degradation/clearance of ALT [1/d]"
        d_A ∈ RealDomain(; lower = 0.0)
        "Max. rate of exhaustion [1/d]"
        d_E_Q ∈ RealDomain(; lower = 0.0)
        "Hepatocyte carrying capacity in the liver [cells/mL]"
        T_max ∈ RealDomain(; lower = 0.0)
        "Half-maximal (saturation) parameter for the effect of systemic exhaustion []"
        ϕ_Q ∈ RealDomain(; lower = 0.0)
        "Half-maximal (saturation) parameter for the activation of cellular immunity [cells/mL]"
        ϕ_E ∈ RealDomain(; lower = 0.0)
        "Max. inhibitory effect of HBsAg subviral particles on innate immunity []"
        S_max ∈ RealDomain(; lower = 0.0)
        "Half-maximal (saturation) parameter for inhibitory effect of HBsAg subviral particles on the generation of innate immunity [μg/mL]"
        ϕ_S ∈ RealDomain(; lower = 0.0)
        "Hill coefficient for describing HBsAg-induced inhibition of innate immunity []"
        nS ∈ RealDomain()
        "Threshold of infected hepatocytes to produce dynamical switch from progressive disease to long-lasting control of HBV infection [cells/mL]"
        I_threshold ∈ RealDomain(; lower = 0.0)

        # Residual errors
        σ_add_log10_HBsAg ∈ RealDomain(lower = 0.0)
        σ_prop_log10_HBsAg ∈ RealDomain(lower = 0.0)
        σ_add_log10_HBV ∈ RealDomain(lower = 0.0)
        σ_prop_log10_HBV ∈ RealDomain(lower = 0.0)
        σ_prop_log10_ALT ∈ RealDomain(lower = 0.0)
        σ_prop_log10_CD8 ∈ RealDomain(lower = 0.0)
    end

    @random begin
        η ~ MvNormal(Ω)
    end

    # Time-varying covariates for treatment phases
    @covariates NA IFN

    @pre begin
        # Individual parameters
        # Logit-normal distribution of log10(p_S), scaled to range [2, 12]
        p_S = exp10(10 * logistic(logit((tv_log10_p_S - 2) / 10) + η[1]) + 2)
        # Logit-normal distribution of log10(β), scaled to range [-10, 0]
        β = exp10(10 * (logistic(logit(tv_log10_β / 10 + 1) + η[2]) - 1))
        # Logit-normal distribution of log10(m), scaled to range [-3, 5]
        m = exp10(8 * logistic(logit((tv_log10_m + 3) / 8) + η[3]) - 3)
        # Normal distribution of log10(k_P)
        k_P = exp10(tv_log10_k_P + η[4])
        # Normal distribution of log10(k_A)
        k_A = exp10(tv_log10_k_A + η[5])
        # Logit-normal distribution of log10(ϵ_NA), scaled to range [-5, 0]
        ϵ_NA = exp10(NA * 5 * (logistic(logit(tv_log10_ϵ_NA / 5 + 1) + η[6]) - 1))
        # Logit-normal distribution of log10(ϵ_IFN), scaled to range [-5, 0]
        ϵ_IFN = exp10(IFN * 5 * (logistic(logit(tv_log10_ϵ_IFN / 5 + 1) + η[7]) - 1))
        # Logit-normal distribution of log10(r_E_IFN), scaled to range [0, 4]
        r_E_IFN = exp10(IFN * 4 * logistic(logit(tv_log10_r_E_IFN / 4) + η[8]))
        # Normal distribution
        conv_E = exp10(tv_log10_conv_E + η[9])
    end

    # Initial conditions
    @init begin
        V = ini_V
        I = d_V * V / p_V
        T = T_max - I
        S = p_S * I / d_V
        A = ini_A
    end

    @dynamics begin
        # Target uninfected hepatocytes
        T' = -β * V * T + r_T * T * (1 - (T + Ieff + R) / T_max) -
            d_TI * T + ρ_o_ρ_r * R - (m / n) * r_E_IFN * E * T

        # Infected hepatocytes
        I' = β * V * T + r_T * I * (1 - (T + I + R) / T_max) -
            ρ * X * I - d_TI * I - m * r_E_IFN * E * I

        # Resistant hepatocytes
        R' = ρ * X * Ieff - ρ_o_ρ_r * R + r_T * R * (1 - (T + Ieff + R) / T_max) -
            d_TI * R - (m / n) * r_E_IFN * E * R

        # Dead cell marker
        D' = m * r_E_IFN * E * Ieff + (m / n) * r_E_IFN * E * (T + R) - d_D * D

        # Immune response (ALT proxy)
        A' = d_A * (ini_A - A) + k_A * D

        # HBsAg dynamics
        S' = p_S * Ieff - d_V * S

        # Viral load (NA and IFN reduce production)
        V' = p_V * Ieff * ϵ_NA * ϵ_IFN - d_V * V

        # Dendritic cell activation
        P' = k_P * (Ieff / (ϕ_E + Ieff)) - d_P * P

        # Effector T cells
        E' = (d_E + r_E * E) * Peff - d_E_Q * E * (Q^4 / (ϕ_Q^4 + Q^4)) - d_E * E

        # Delayed effector signal
        Q' = d_Q * (Peff - Q)

        # Cytotoxic effect
        X' = r_X * (1 - X) * (Ieff / (ϕ_E + Ieff)) *
            (1 - S_max * (HBsAg_ug_mL^nS / (ϕ_S^nS + HBsAg_ug_mL^nS))) - d_X * X
    end

    @vars begin
        # Switch between progressive infection (I ≥ I_threshold) and clearance of infection (I < I_threshold)
        Ieff = ifelse(I < I_threshold, zero(I), I)
        Peff = ifelse(I < I_threshold, zero(P), P)

        # HBsAg [μg/mL]: Each particle of V and S is assumed to harbor 96 molecules of S envelope protein
        f_molec2ug := 24000 * 1.0e6 / 6.022e23
        HBsAg_ug_mL := 96 * (V + S) * f_molec2ug
    end

    # Error models: Values are log10-transformed
    @derived begin
        "HBsAg [IU/mL]"
        log10_HBsAg ~ @. CombinedNormal(log10(HBsAg_ug_mL / 0.98e-3), σ_add_log10_HBsAg, σ_prop_log10_HBsAg)

        "Viral load [copies/mL]"
        log10_HBV ~ @. CombinedNormal(log10(V), σ_add_log10_HBV, σ_prop_log10_HBV)

        "ALT [U/L]"
        log10_ALT ~ @. ProportionalNormal(log10(A), σ_prop_log10_ALT)

        "CD8+ T cell counts [cells/mL]"
        log10_CD8 ~ @. ProportionalNormal(log10(conv_E * E), σ_prop_log10_CD8)
    end
end

hbv_fixed_params = (
    ini_V = exp10(-0.481),
    ini_A = exp10(1.25),
    r_T = 1.0,
    r_E = 0.1,
    r_X = 1.0,
    n = 100,
    ρ = 0.001,
    ρ_o_ρ_r = 1.0e-5,
    p_V = 100.0,
    d_TI = 0.0039,
    d_V = 0.67,
    d_Q = 0.00385,
    d_X = 0.2,
    d_E = 0.01,
    d_P = 0.239,
    d_D = 1.66,
    d_A = 0.47,
    d_E_Q = 0.3,
    T_max = 1.36e7,
    ϕ_Q = 0.8,
    ϕ_E = 100.0,
    S_max = 1.0,
    ϕ_S = 0.1473,
    nS = 0.486,
    I_threshold = 1.0e-4,
)

# IIV standard deviations (on logit/log10 scale)
ω_p_S = 0.632
ω_β = 2.489
ω_m = 0.827
ω_k_P = 1.292
ω_k_A = 1.156
ω_ϵ_NA = 1.583
ω_ϵ_IFN = 1.626
ω_r_E_IFN = 2.739
ω_conv_E = 1.248

hbv_estimated_params = (;
    tv_log10_p_S = 7.643,
    tv_log10_β = -4.358,
    tv_log10_m = 1.618,
    tv_log10_k_P = -3.968,
    tv_log10_k_A = -4.869,
    tv_log10_ϵ_NA = -1.485,
    tv_log10_ϵ_IFN = -1.466,
    tv_log10_r_E_IFN = 0.313,
    tv_log10_conv_E = 2.745,

    Ω = Diagonal([ω_p_S^2, ω_β^2, ω_m^2, ω_k_P^2, ω_k_A^2, ω_ϵ_NA^2, ω_ϵ_IFN^2, ω_r_E_IFN^2, ω_conv_E^2]),

    σ_add_log10_HBsAg = 0.0972,
    σ_prop_log10_HBsAg = 2.2204e-16,
    σ_add_log10_HBV = 0.3206,
    σ_prop_log10_HBV = 0.0838,
    σ_prop_log10_ALT = 0.0478,
    σ_prop_log10_CD8 = 0.4149,
)

hbv_params = merge(hbv_fixed_params, hbv_estimated_params)

4.2 Copula Sampling and Simulation

We sample correlated random effects using the Gaussian copula with the Spearman correlation matrix from Cortés‐Ríos et al. (2025) (see HBV-04 for details):

# Spearman correlation matrix (9×9) from @cortes2025HBV Figure S5B
Rho = [
    # p_S     β      m     k_P    k_A    ϵ_NA   ϵ_IFN  r_E_IFN conv_E
    1.0    0.43   0.46   0.51  -0.7   0.58  -0.14  -0.38    0.0;  # p_S
    0.43   1.0    0.3  -0.01  -0.5   0.38  -0.52  -0.21    0.0;  # β
    0.46   0.3   1.0   -0.16  -0.5   0.29   0.07  -0.33    0.0;  # m
    0.51  -0.01  -0.16   1.0   -0.45   0.25   0.14  -0.1    0.0;  # k_P
    -0.7  -0.5  -0.5  -0.45   1.0   -0.41   0.01   0.05    0.0;  # k_A
    0.58   0.38   0.29   0.25  -0.41   1.0   -0.61  -0.36    0.0;  # ϵ_NA
    -0.14  -0.52   0.07   0.14   0.01  -0.61   1.0    0.26    0.0;  # ϵ_IFN
    -0.38  -0.21  -0.33  -0.1   0.05  -0.36   0.26   1.0     0.0;  # r_E_IFN
    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0      1.0;  # conv_E
]

copula = GaussianCopula(Rho)
correlated_randeffs_dist = SklarDist(
    copula, (
        Normal(0, ω_p_S),
        Normal(0, ω_β),
        Normal(0, ω_m),
        Normal(0, ω_k_P),
        Normal(0, ω_k_A),
        Normal(0, ω_ϵ_NA),
        Normal(0, ω_ϵ_IFN),
        Normal(0, ω_r_E_IFN),
        Normal(0, ω_conv_E),
    )
)

The Everest trial enrolls NA-suppressed patients, so baseline HBsAg is the level reached after 5 years of untreated chronic infection followed by 4 years of NA suppression therapy (Cortés‐Ríos et al., 2025). To reduce computation time in this tutorial, we only generate 500 virtual patients. Some random effect combinations lead to integration errors, so we resample until each simulation succeeds:

t_chronic = 5 * 365
t_na_suppression = 4 * 365

# Sample `nvps` virtual patients
nvps = 500
baseline_pop = map(1:nvps) do id
    Subject(;
        id,
        covariates = (; NA = [false, true], IFN = [false, false]),
        covariates_time = [0, t_chronic],
    )
end
baseline_vrandeffs = map(baseline_pop) do _
    return (; η = rand(correlated_randeffs_dist))
end

# Compute baseline HBsAg (after 5 yr chronic infection + 4 year NA suppression)
baseline_obstimes = [t_chronic + t_na_suppression]
baseline_sims = simobs(hbv_model, baseline_pop, hbv_params, baseline_vrandeffs; simulate_error = false, obstimes = baseline_obstimes)

while true
    # Check whether all simulations succeeded
    baseline_invalid = findall(!isvalid, baseline_sims)
    isempty(baseline_invalid) && break

    # Resample random effects for failed simulation
    pop = @view(baseline_pop[baseline_invalid])
    vrandeffs = @view(baseline_vrandeffs[baseline_invalid])
    map!(vrandeffs, pop) do _
        return (; η = rand(correlated_randeffs_dist))
    end
    baseline_sims[baseline_invalid] = simobs(hbv_model, pop, hbv_params, vrandeffs; simulate_error = false, obstimes = baseline_obstimes)
end

# Extract vector of baseline HBsAg
baseline_log10_HBsAg = postprocess((gen, _) -> only(gen.log10_HBsAg), baseline_sims)

5 Part 3: ILP Calibration

5.1 Comparison With Target Distribution

Check current distribution vs target:

current_counts = zeros(Int, nrow(everest_target))
for val in baseline_log10_HBsAg
    for b in 1:nrow(everest_target)
        if everest_target[b, :log10_lower] <= val < everest_target[b, :log10_upper]
            current_counts[b] += 1
            break
        end
    end
end
current_pcts = 100 .* current_counts ./ length(baseline_log10_HBsAg)

comparison = DataFrame(
    range = everest_target.hbsag_range,
    target_pct = everest_target.pct,
    current_pct = round.(current_pcts, digits = 1),
    difference = round.(current_pcts .- everest_target.pct, digits = 1)
)
simple_table(
    comparison,
    [
        :range => "HBsAg",
        :target_pct => "Target (%)",
        :current_pct => "VPop (%)",
        :difference => "Difference (%)",
    ],
)


HBsAg	Target (%)	VPop (%)	Difference (%)

<100	37.5	16.6	-20.9
[100, 200)	10.9	5	-5.9
[200, 500)	19.1	10	-9.1
[500, 1000)	21.5	7.8	-13.7
≥1000	11	60.6	49.6

5.2 Build and Solve ILP Model

We implement the optimization problem using a maximum absolute error of 5% for each category:

# Calibration tolerance
epsilon = 0.05

# Create optimization model
model = Model(HiGHS.Optimizer)
set_silent(model)
set_attribute(model, "presolve", "on")
set_time_limit_sec(model, 60.0)

# Decision variables
@variable(model, x[1:length(baseline_log10_HBsAg)], Bin)

# Objective: maximize number of selected VPs
ntotal = sum(x)
@objective(model, Max, ntotal)

# Constraints: distribution matching for each category
for row in eachrow(everest_target)
    # Target probability
    pc = row.pct / 100

    # Count of selected VPs in this category
    nc = sum(x[row.log10_lower .<= baseline_log10_HBsAg .< row.log10_upper])

    # Upper bound
    @constraint(model, nc <= (pc + epsilon) * ntotal)

    # Lower bound
    @constraint(model, nc >= (pc - epsilon) * ntotal)
end

Solve the optimization problem:

optimize!(model)

We ensure the problem was solved successfully:

assert_is_solved_and_feasible(model)

Inspect the results:

solution_summary(model)

solution_summary(; result = 1, verbose = false)
├ solver_name          : HiGHS
├ Termination
│ ├ termination_status : OPTIMAL
│ ├ result_count       : 1
│ ├ raw_status         : kHighsModelStatusOptimal
│ └ objective_bound    : 2.34000e+02
├ Solution (result = 1)
│ ├ primal_status        : FEASIBLE_POINT
│ ├ dual_status          : NO_SOLUTION
│ ├ objective_value      : 2.34000e+02
│ ├ dual_objective_value : NaN
│ └ relative_gap         : 0.00000e+00
└ Work counters
  ├ solve_time (sec)   : 3.89358e-03
  ├ simplex_iterations : 0
  ├ barrier_iterations : -1
  └ node_count         : 1

We extract the selected virtual patients:

selected_vps = findall(xi -> value(xi) == 1, x)
calibrated_log10_HBsAg = baseline_log10_HBsAg[selected_vps]

6 Part 4: Validate Calibration

6.1 Compare Distributions

Check calibrated distributions vs target:

calibrated_counts = zeros(Int, nrow(everest_target))
for val in calibrated_log10_HBsAg
    for b in 1:nrow(everest_target)
        if everest_target[b, :log10_lower] <= val < everest_target[b, :log10_upper]
            calibrated_counts[b] += 1
            break
        end
    end
end
calibrated_pcts = 100 .* calibrated_counts ./ length(calibrated_log10_HBsAg)

comparison = DataFrame(
    range = everest_target.hbsag_range,
    target_pct = everest_target.pct,
    calibrated_pct = round.(calibrated_pcts, digits = 1),
    difference = round.(calibrated_pcts .- everest_target.pct, digits = 1)
)
simple_table(
    comparison,
    [
        :range => "HBsAg",
        :target_pct => "Target (%)",
        :calibrated_pct => "Calibrated VPop (%)",
        :difference => "Difference (%)",
    ],
)


HBsAg	Target (%)	Calibrated VPop (%)	Difference (%)

<100	37.5	35.5	-2.0
[100, 200)	10.9	10.7	-0.2
[200, 500)	19.1	21.4	2.3
[500, 1000)	21.5	16.7	-4.8
≥1000	11	15.8	4.8

6.2 Visualization

df = vcat(
    DataFrame(; log10_HBsAg = baseline_log10_HBsAg),
    DataFrame(; log10_HBsAg = calibrated_log10_HBsAg);
    source = "stage" => ["Original", "Calibrated"],
)
data_specs = data(df) * mapping(
    :log10_HBsAg,
    col = :stage => sorter("Original", "Calibrated"),
) * histogram(; bins = vcat(everest_target.log10_lower, everest_target.log10_upper[end]), normalization = :probability)
target_specs = mapping(
    collect(Iterators.flatten((row.log10_lower, row.log10_upper) for row in eachrow(everest_target))),
    repeat(everest_target.pct; inner = 2) ./ 100,
    group = repeat(1:nrow(everest_target); inner = 2),
) * visual(Lines; label = "Everest target")
draw(
    data_specs + target_specs;
    axis = (; xlabel = "Baseline HBsAg (log₁₀ IU/mL)", ylabel = "Proportion"),
)

Before and after ILP calibration to Everest trial baseline HBsAg distribution. The calibrated population (right) closely matches the target.

7 Part 5: Multi-Objective Optimization

The single-objective ILP in Part 3 maximizes population size subject to a fixed per-category tolerance ε. An alternative is to treat population size and distribution error as competing objectives and trace the Pareto front directly using a multi-objective solver. See TB-05, Part 6 for the full theoretical motivation and derivation.

7.1 Build and Solve MOA Model

moa_model = Model(() -> MOA.Optimizer(HiGHS.Optimizer))
set_attribute(moa_model, MOA.Algorithm(), MOA.EpsilonConstraint())
set_attribute(moa_model, MOA.SolutionLimit(), 50)
set_silent(moa_model)

# Binary selection variables
@variable(moa_model, y[1:length(baseline_log10_HBsAg)], Bin)

# Continuous count deviation variables (one per HBsAg category)
@variable(moa_model, δ[1:nrow(everest_target)] >= 0)

# Total selected VPs
N_total = sum(y)

# Bi-objective: minimize [-N_total, D]
# (negate N_total since JuMP requires a uniform optimization sense)
@objective(moa_model, Min, [-N_total, sum(δ)])

# Count deviation constraints
for (δ_c, row) in zip(δ, eachrow(everest_target))
    p_c = row.pct / 100
    N_c = sum(y[row.log10_lower .<= baseline_log10_HBsAg .< row.log10_upper])
    @constraint(moa_model, δ_c >= N_c - p_c * N_total)
    @constraint(moa_model, δ_c >= p_c * N_total - N_c)
end

moa_model

A JuMP Model
├ solver: MOA[algorithm=MultiObjectiveAlgorithms.EpsilonConstraint, optimizer=HiGHS]
├ objective_sense: MIN_SENSE
│ └ objective_function_type: Vector{AffExpr}
├ num_variables: 505
├ num_constraints: 515
│ ├ JuMP.AffExpr in MOI.GreaterThan{Float64}: 10
│ ├ JuMP.VariableRef in MOI.GreaterThan{Float64}: 5
│ └ JuMP.VariableRef in MOI.ZeroOne: 500
└ Names registered in the model
  └ :y, :δ

optimize!(moa_model)

assert_is_solved_and_feasible(moa_model)

7.2 Extract and Filter the Pareto Front

n_solutions = result_count(moa_model)

moa_solutions_df = map(1:n_solutions) do i
    obj = objective_value(moa_model; result = i)
    nselected = round(Int, -obj[1])
    count_dev = obj[2]

    # Recover TV distance from count deviation
    error = nselected > 0 ? count_dev / (2 * nselected) : NaN

    # Extract selected VPs
    selected = findall(yj -> value(yj) == 1, y)

    (; nselected, error, selected)
end |> DataFrame

Filter for TV-Pareto-optimality:

filter!(moa_solutions_df) do row
    !any(eachrow(moa_solutions_df)) do other
        other.nselected >= row.nselected &&
            other.error <= row.error &&
            (other.nselected > row.nselected || other.error < row.error)
    end
end

sort!(moa_solutions_df, :nselected; rev = true)
simple_table(
    moa_solutions_df,
    [
        :nselected => "# Selected VPs",
        :error => "Distribution Error",
    ],
)


# Selected VPs	Distribution Error

500	0.496
492	0.49
481	0.48
470	0.471
459	0.461
448	0.45
437	0.439
426	0.428
415	0.415
404	0.402
393	0.389
382	0.374
371	0.359
360	0.343
349	0.326
338	0.307
327	0.288
316	0.267
305	0.244
294	0.22
283	0.194
272	0.166
261	0.136
250	0.111
239	0.0839
228	0.0549
217	0.0353
206	0.0257
195	0.015
184	0.00413
173	0.00155
0	NaN

7.3 Visualization

specs = data(moa_solutions_df) *
    mapping(:error => "Distribution Error", :nselected => "# Selected VPs") *
    visual(Scatter)
draw(specs)

Pareto front from multi-objective optimization: each point is a feasible calibrated population with a different trade-off between size and distribution fidelity.

8 Summary

8.1 Key Results

Metric	Original	Calibrated
Population size	5,000	~3,000-4,000
Mean error vs target	Large	<2%
HBsAg distribution	Unconstrained	Matches Everest

8.2 Why This Matters

Response rates depend on baseline HBsAg - calibration ensures realistic predictions
Subgroup analyses require representative distributions
Regulatory credibility from matching clinical trial characteristics

8.3 What’s Next?

In Tutorial 6: VCT Simulation, we’ll use the calibrated population to run a complete HBV virtual clinical trial with multiple treatment arms.

8.4 References

Cortés-Ríos, J., Magee, M., Sher, A., Jusko, W. J., & Desikan, R. (2025). A step-by-step workflow for performing in silico clinical trials with nonlinear mixed effects models. CPT: Pharmacometrics & Systems Pharmacology. https://doi.org/10.1002/psp4.70122

Cortés‐Ríos, J., Ren, T., Hanan, N., Sher, A., Nader, A., Magee, M., Jusko, W. J., & Desikan, R. (2025). Hepatitis b in silico trials capture functional cure, indicate mechanistic pathways, and suggest prognostic biomarker signatures. Clinical Pharmacology &Amp; Therapeutics, 118(3), 600–612. https://doi.org/10.1002/cpt.3718