MILP-Based Virtual Population Calibration

Matching Response Rate Distributions with Optimization

Authors

Vijay Ivaturi

David Müller-Widmann

Note: Tumor Burden Tutorial Series

This is Tutorial 5 of 6 in the Tumor Burden series.

# Tutorial Status
1 Model Introduction ✓ Complete
2 Global Sensitivity Analysis ✓ Complete
3 Structural Identifiability ✓ Complete
4 Copula Vpop Generation ✓ Complete
5 MILP Calibration (current) ← You are here
6 VCT Simulation

1 Learning Outcomes

Virtual populations generated using copula sampling (see Tutorial 4) may not match real clinical data distributions. Calibration ensures the virtual population is representative of the actual patient population being studied (Cortés-Ríos et al., 2025).

In this tutorial, you will learn:

  • Why virtual population calibration is essential for realistic ISCTs
  • The mathematical formulation of an integer linear programming (ILP) problem
  • How to build a JuMP optimization model step-by-step
  • How to calibrate to response rate distributions (not just baseline characteristics)
  • Multi-objective optimization with maximum displacement using MultiObjectiveAlgorithms.jl
  • The mixed-integer linear programming (MILP) formulation with distribution error bound for inter-category trade-offs
  • Validating calibration results with visualizations
Tip: Prerequisites

This tutorial builds on concepts from the earlier tutorials in this series.

We’ll generate a new vpop using the copula approach and then calibrate it to match clinical response rates.

2 Background

2.1 The Problem: Why Calibration Matters

After generating a virtual population (Vpop) using copula sampling, we have patients with realistic parameter correlations. However, the marginal distributions and outcome distributions may not match what we observe in real clinical trials.

Consider this scenario for tumor burden:

  • Your Vpop has 10,000 virtual patients with parameters (f, g, k)
  • When you simulate treatment, you get certain response rates
  • The real clinical trial showed: 30% complete response, 45% partial response, 25% no response
  • Your Vpop shows different rates due to parameter distribution differences
Warning: The Distribution Mismatch Problem

If your Vpop outcome distribution doesn’t match the clinical trial:

  • Treatment effect estimates will be biased
  • Subgroup analyses won’t reflect real clinical heterogeneity
  • Regulatory reviewers may question whether the population is representative

Calibration selects a subset of your Vpop that, when simulated, matches the target response distribution.

2.2 What is ILP-Based Calibration?

ILP stands for integer linear programming. It is an optimization technique where variables are integer or binary (0 or 1), and both the objective function and the constraints are linear.

For Vpop calibration, we use binary variables to select or reject each virtual patient:

\[ x_j = \begin{cases} 1 & \text{if patient } j \text{ is selected} \\ 0 & \text{if patient } j \text{ is rejected} \end{cases} \]

2.3 Why ILP? Alternatives and Trade-offs

| Approach | Pros | Cons |
| --- | --- | --- |
| Random Rejection Sampling | Simple, fast | Inefficient, may reject too many |
| Importance Weighting | No rejection needed | Weights can be extreme, high variance |
| ILP | Optimal selection, precise matching | Computationally more expensive |

ILP is the preferred approach when you need precise control over the distribution matching.

3 Libraries

using Pumas
using StatsBase
using JuMP
using HiGHS
import MultiObjectiveAlgorithms as MOA
using Copulas
using AlgebraOfGraphics
using CairoMakie
using CategoricalArrays
using DataFramesMeta
using SummaryTables

using Random
using Statistics

# Set seed for reproducibility
Random.seed!(1234)

What each package does:

  • JuMP: Algebraic modeling language for optimization problems
  • HiGHS: High-performance open-source linear and mixed-integer programming solver
  • MultiObjectiveAlgorithms (MOA): Multi-objective optimization algorithms for JuMP
  • Copulas: For generating correlated parameter samples
  • AlgebraOfGraphics + CairoMakie: Publication-quality visualizations

4 Part 1: Understanding the Calibration Problem

4.1 The Tumor Burden Model Recap

From Tutorial 1, our tumor burden model has:

  • f: Fraction of treatment-sensitive cells (0-1)
  • g: Growth rate of resistant cells (1/day)
  • k: Death rate of sensitive cells (1/day)

The tumor dynamics follow:

\[ N(t) = N_0 \cdot f \cdot e^{-kt} + N_0 \cdot (1-f) \cdot e^{gt} \]
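This closed form can be evaluated directly; below is a minimal plain-Julia sketch (normalizing \(N_0 = 1\), with illustrative parameter values close to the population estimates used later in this tutorial):

```julia
# Closed-form tumor burden: the sensitive compartment decays at rate k,
# the resistant compartment grows at rate g; N0 is normalized to 1.
tumor_burden(t; f, g, k, N0 = 1.0) = N0 * f * exp(-k * t) + N0 * (1 - f) * exp(g * t)

# Illustrative evaluation at 18 weeks (126 days)
tumor_burden(126.0; f = 0.27, g = 0.0013, k = 0.0091)
```

At \(t = 0\) the expression reduces to \(N_0\), as expected.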

The full non-linear mixed effect model, implemented and explained in Tutorial 1: Model Introduction, is:

tumor_burden_model = @model begin
    @metadata begin
        desc = "Tumor Burden Model for NSCLC Chemotherapy"
        timeu = u"d"  # Time unit is days
    end

    @param begin
        # Typical values (population parameters)
        tvf ∈ RealDomain(lower = 0.0, upper = 1.0)
        tvg ∈ RealDomain(lower = 0.0)
        tvk ∈ RealDomain(lower = 0.0)

        # Inter-individual variability (diagonal covariance matrix)
        # Values on the transformed scale (logit for f, log for g and k)
        Ω ∈ PDiagDomain(3)

        # Residual error (standard deviation)
        σ ∈ RealDomain(lower = 0.0)
    end

    @random begin
        # Random effects vector (one per parameter)
        η ~ MvNormal(Ω)
    end

    @covariates treatment

    @pre begin
        # Transform random effects to individual parameters
        # f: logit-normal transformation (bounded 0-1)
        f = logistic(logit(tvf) + η[1])
        # g: log-normal transformation (positive value)
        g = tvg * exp(η[2])
        # k: log-normal transformation (positive value)
        # Additionally, treatment affects death rate:
        # treatment = false → control arm (no killing)
        # treatment = true → treatment arm (killing at rate tvk * exp(η[3]))
        k = treatment * tvk * exp(η[3])
    end

    # Initial conditions (normalized tumor = 1)
    @init begin
        N_sens = f           # Sensitive cells
        N_res = 1 - f        # Resistant cells
    end

    @dynamics begin
        # Sensitive cells: die at rate k (0 if control)
        N_sens' = -k * N_sens

        # Resistant cells: grow at rate g
        N_res' = g * N_res
    end

    @derived begin
        # Total normalized tumor size
        Nt := @. N_sens + N_res

        # Observation with residual error
        tumor_size ~ @. Normal(Nt, σ)
    end
end
PumasModel
  Parameters: tvf, tvg, tvk, Ω, σ
  Random effects: η
  Covariates: treatment
  Dynamical system variables: N_sens, N_res
  Dynamical system type: Matrix exponential
  Derived: tumor_size
  Observed: tumor_size

4.2 Defining Response Categories

For tumor burden, we define response based on tumor size at a fixed time point (e.g., 18 weeks):

  • tumor size < 14%: Complete response (CR)
  • 14% ≤ tumor size < 70%: Partial response (PR)
  • 70% ≤ tumor size < 100%: Stable disease (SD)
  • tumor size ≥ 100%: Progressive disease (PD)
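These cutoffs can be written as a small standalone helper (a plain-Julia sketch; the tutorial itself classifies responses with `cut` from CategoricalArrays below):

```julia
# Classify a final normalized tumor size (fraction of baseline) into a
# response category, mirroring the thresholds defined above.
function classify_response(tumor_size::Real)
    if tumor_size < 0.14
        "CR"   # Complete response
    elseif tumor_size < 0.7
        "PR"   # Partial response
    elseif tumor_size < 1.0
        "SD"   # Stable disease
    else
        "PD"   # Progressive disease
    end
end

classify_response.([0.05, 0.5, 0.8, 1.3])  # one category per patient
```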

4.3 Generating the Virtual Population

Let us generate a vpop using the copula approach from Tutorial 4.

The parameter values come from Qi & Cao (2022), who fit this model to NSCLC clinical trial data:

# Population (typical) parameter values
pop_f = 0.27    # 27% of tumor is treatment-sensitive
pop_g = 0.0013  # Resistant cells grow ~0.13% per day
pop_k = 0.0091  # Sensitive cells die ~0.91% per day

# Inter-individual variability (standard deviations on transformed scale)
ω_f = 2.16  # High variability in sensitive fraction
ω_g = 1.57  # Moderate variability in growth rate
ω_k = 1.24  # Moderate variability in death rate

# Residual error (standard deviation)
resid_σ = 0.095

# Critical correlation from the data
cor_f_g = -0.64  # Negative correlation: high f ↔ low g

We create 2000 virtual subjects for simulations with the estimated population parameters:

patients = [
    Subject(; id, time = 7 * (0:18), covariates = (; treatment = true))
    for id in 1:2_000
]
param = (; tvf = pop_f, tvg = pop_g, tvk = pop_k, Ω = Diagonal([ω_f^2, ω_g^2, ω_k^2]), σ = resid_σ)

Based on the estimated correlation matrix, we define a Gaussian copula:

Rho = [
    1.0 -0.64 0.0
    -0.64 1.0 0.0
    0.0 0.0 1.0
]
copula = GaussianCopula(Rho)

We define the joint distribution of correlated random effects based on the marginal distributions and the Gaussian copula:

correlated_randeffs_dist = SklarDist(copula, (Normal(0, ω_f), Normal(0, ω_g), Normal(0, ω_k)))

We sample correlated random effects for each virtual patient:

correlated_randeffs = rand(correlated_randeffs_dist, length(patients))
vrandeffs = [(; η) for η in eachcol(correlated_randeffs)]

Now let us simulate the tumor dynamics for each virtual patient:

sims = simobs(
    tumor_burden_model, patients, param, vrandeffs; simulate_error = false
)
Simulated population (Vector{<:Subject})
  Simulated subjects: 2000
  Simulated variables: tumor_size

We compute the observed Spearman correlation of the simulated parameters (f, g, and k):

sims_f_g_k = postprocess(sims) do generated, _
    (; f = only(unique(generated.f)), g = only(unique(generated.g)), k = only(unique(generated.k)))
end |> DataFrame
corspearman(Matrix(sims_f_g_k))
3×3 Matrix{Float64}:
  1.0        -0.643481   -0.0191359
 -0.643481    1.0         0.0374227
 -0.0191359   0.0374227   1.0

Additionally, we validate the marginal distributions of the simulated parameters:

marginals_df = DataFrame(
    parameter = ["f", "g", "k"],
    median_exp = [pop_f, pop_g, pop_k],
    median_obs = [median(sims_f_g_k.f), median(sims_f_g_k.g), median(sims_f_g_k.k)],
    omega_exp = [ω_f, ω_g, ω_k],
    omega_obs = [std(logit.(sims_f_g_k.f)), std(log.(sims_f_g_k.g)), std(log.(sims_f_g_k.k))],
)
simple_table(
    marginals_df,
    [:parameter => "Parameter", :median_exp => "Median (expected)", :median_obs => "Median (observed)", :omega_exp => "ω (expected)", :omega_obs => "ω (observed)"]
)
Parameter Median (expected) Median (observed) ω (expected) ω (observed)
f 0.27 0.262 2.16 2.18
g 0.0013 0.00129 1.57 1.57
k 0.0091 0.00905 1.24 1.24

4.4 Classifying the Tumor Response

Now let us classify the response of each virtual patient after 18 weeks.

vpop = postprocess(sims) do generated, _
    if generated.t[end] != 7 * 18 # days
        error("final time point must be 18 weeks")
    end
    tumor_size_final = generated.tumor_size[end]
    (; tumor_size_final)
end |> DataFrame
@transform! vpop :response = cut(
    :tumor_size_final,
    [0.14, 0.7, 1.0];
    labels = ["CR", "PR", "SD", "PD"],
    extend = true,
)

We compute the distribution of responses:

vpop_distribution = @by vpop :response begin
    :pct = 100 * length(:response) / nrow(vpop)
end
simple_table(
    vpop_distribution, [:response => "Response", :pct => "Vpop (%)"],
)
Response Vpop (%)
CR 2.35
PR 22
SD 26
PD 49.6

4.5 Defining the Target Distribution

Let us define a target response distribution that we want to match. In practice, one would use response rates in a clinical trial.

target_distribution = DataFrame(
    response = categorical(["CR", "PR", "SD", "PD"]),
    pct = [25.0, 40.0, 20.0, 15.0],
)
simple_table(
    target_distribution, [:response => "Response", :pct => "Target (%)"],
)
Response Target (%)
CR 25
PR 40
SD 20
PD 15

We compare the target distribution with the distribution of response rates in the current virtual population.

comparison_df = leftjoin(target_distribution, vpop_distribution; on = :response, renamecols = "_target" => "_vpop")
@rtransform! comparison_df :pct_vpop = @coalesce(:pct_vpop, 0.0)
@rtransform! comparison_df :diff = :pct_vpop - :pct_target

simple_table(
    comparison_df,
    [
        :response => "Response",
        :pct_vpop => "Vpop (%)",
        :pct_target => "Target (%)",
        :diff => "Difference (%)",
    ]
)
Response Vpop (%) Target (%) Difference (%)
CR 2.35 25 -22.6
PR 22 40 -18.0
SD 26 20 6.05
PD 49.6 15 34.6

5 Part 2: The Mathematical Formulation

Before writing code, let’s understand the ILP optimization problem mathematically.

5.1 Decision Variables

We have binary selection variables for each virtual patient: \[ x_j \in \{0, 1\} \quad \text{for } j = 1, \ldots, N \]

5.2 Objective Function

We want to maximize the number of selected patients (larger calibrated populations give better statistical power):

\[ \max \sum_{j=1}^N x_j \]

5.3 Constraints

Additionally, we want the distribution of observed response rates of the selected patients to match the target distribution.

Since we generally cannot expect the observed and target distributions to match exactly, we enforce only approximate equality of the two distributions. That is, for each response category \(c\) with target fraction \(p_c\), the proportion of selected VPs in category \(c\) is constrained to lie within an absolute error \(\epsilon > 0\) of the target:

\[ (p_c - \epsilon)\, N_{\text{total}} \leq N_c \leq (p_c + \epsilon)\, N_{\text{total}} \]

where \(N_{\text{total}} = \sum_{j=1}^N x_j\) is the total number of selected VPs and \(N_c = \sum_{j \in \text{category } c} x_j\) is the number of selected VPs in category \(c\).
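As a quick numeric illustration (with hypothetical category counts, not taken from the tutorial's Vpop), a candidate selection is feasible exactly when every category count lies inside its \(\epsilon\)-band:

```julia
# Check the ε-band constraints for given per-category counts of selected VPs.
# counts[c] is N_c, targets[c] is p_c; both indexed by category.
function within_epsilon_band(counts, targets, epsilon)
    N_total = sum(counts)
    all(
        (p - epsilon) * N_total <= n <= (p + epsilon) * N_total
        for (n, p) in zip(counts, targets)
    )
end

targets = [0.25, 0.40, 0.20, 0.15]                    # CR, PR, SD, PD
within_epsilon_band([25, 40, 20, 15], targets, 0.05)  # exact match → true
within_epsilon_band([10, 45, 25, 20], targets, 0.05)  # CR off by 15% → false
```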

5.4 Why Binary Variables Make This Hard (But Solvable)

The fundamental insight is that VP selection is an all-or-nothing decision. Each virtual patient is either:

  • Selected (\(x_j = 1\)): included in the calibrated population
  • Rejected (\(x_j = 0\)): excluded from the calibrated population

This discrete choice makes the problem an integer linear program (ILP) rather than a standard linear program (LP).

Note: Computational Complexity

Binary optimization is NP-hard: there is no known algorithm that solves all instances quickly. With \(N\) virtual patients, there are \(2^N\) possible selections:

| VPs | Possible Selections |
| --- | --- |
| 10 | 1,024 |
| 20 | ~1 million |
| 100 | \(2^{100}\) (astronomical) |

Fortunately, modern solvers like HiGHS use sophisticated algorithms (branch-and-bound, cutting planes) that solve well-structured problems like ours in seconds (Dunning et al., 2017; Huangfu & Hall, 2018).

6 Part 3: Building the JuMP Model Step-by-Step

6.1 Step 3a: Build the Optimization Model

We implement the optimization problem using a maximum absolute error of 5% for each category:

# Calibration tolerance
epsilon = 0.05

# Create optimization model
model = Model(HiGHS.Optimizer)
set_silent(model)
set_attribute(model, "presolve", "on")
set_time_limit_sec(model, 60.0)

# Decision variables
@variable(model, x[1:nrow(vpop)], Bin)     # Binary selection for each VP

# Objective: maximize number of selected VPs
N_total = sum(x)
@objective(model, Max, N_total)

# Constraints: distribution matching for each category
for row in eachrow(target_distribution)
    p_c = row.pct / 100

    # Count of selected VPs in this category
    N_c = sum(x[vpop.response .== row.response])

    # Upper bound: count <= (p_c + ε) × N_total
    @constraint(model, N_c <= (p_c + epsilon) * N_total)

    # Lower bound: count >= (p_c - ε) × N_total
    @constraint(model, N_c >= (p_c - epsilon) * N_total)
end

model
A JuMP Model
├ solver: HiGHS
├ objective_sense: MAX_SENSE
│ └ objective_function_type: AffExpr
├ num_variables: 2000
├ num_constraints: 2008
│ ├ JuMP.AffExpr in MOI.GreaterThan{Float64}: 4
│ ├ JuMP.AffExpr in MOI.LessThan{Float64}: 4
│ └ JuMP.VariableRef in MOI.ZeroOne: 2000
└ Names registered in the model
  └ :x

6.2 Step 3b: Solve the Model

Solve the optimization problem:

optimize!(model)

We ensure that the problem was solved successfully:

assert_is_solved_and_feasible(model)

In case the problem was not solved successfully, the function assert_is_solved_and_feasible throws an error describing the state of the solver.

6.3 Step 3c: Extract and Analyze Results

We can inspect the results in more detail:

solution_summary(model)
solution_summary(; result = 1, verbose = false)
├ solver_name          : HiGHS
├ Termination
│ ├ termination_status : OPTIMAL
│ ├ result_count       : 1
│ ├ raw_status         : kHighsModelStatusOptimal
│ └ objective_bound    : 2.35000e+02
├ Solution (result = 1)
│ ├ primal_status        : FEASIBLE_POINT
│ ├ dual_status          : NO_SOLUTION
│ ├ objective_value      : 2.35000e+02
│ ├ dual_objective_value : NaN
│ └ relative_gap         : 0.00000e+00
└ Work counters
  ├ solve_time (sec)   : 1.00106e-02
  ├ simplex_iterations : 5
  ├ barrier_iterations : -1
  └ node_count         : 1

From the summary, we can already see that the selected number of virtual patients (which corresponds to the objective function value of the solution) is 235.

We extract the selected virtual patients, i.e., those for which the solution is 1:

ilp_selected = findall(xi -> value(xi) > 0.5, x)  # > 0.5 guards against floating-point round-off
ilp_vpop = vpop[ilp_selected, :]

The distribution of observed responses for this subset of virtual patients is

ilp_distribution = @by ilp_vpop :response begin
    :pct = 100 * length(:response) / nrow(ilp_vpop)
end
simple_table(
    ilp_distribution, [:response => "Response", :pct => "Vpop (%)"],
)
Response Vpop (%)
CR 20
PR 35.3
SD 24.7
PD 20

We compare it with the target distribution of response rates:

ilp_comparison_df = leftjoin(target_distribution, ilp_distribution; on = :response, renamecols = "_target" => "_vpop")
@rtransform! ilp_comparison_df :pct_vpop = @coalesce(:pct_vpop, 0.0)
@rtransform! ilp_comparison_df :pct_diff = :pct_vpop - :pct_target
simple_table(
    ilp_comparison_df,
    [
        :response => "Response",
        :pct_vpop => "Vpop (%)",
        :pct_target => "Target (%)",
        :pct_diff => "Difference (%)",
    ]
)
Response Vpop (%) Target (%) Difference (%)
CR 20 25 -5.0
PR 35.3 40 -4.68
SD 24.7 20 4.68
PD 20 15 5

6.4 Evaluating Distribution Error

The ILP constrains the maximum per-category proportion error to \(\epsilon\), so the natural evaluation metric is the realized maximum absolute error:

\[ \epsilon_{\max} = \max_c \left| \frac{N_c}{N_{\text{total}}} - p_c \right| \]

This tells us how far the worst-matching category deviates from its target.

ilp_max_error = maximum(abs, ilp_comparison_df.pct_diff) / 100
0.05

6.5 Visualizing the Calibration Result

original_comparison_df = vcat(
    target_distribution,
    vpop_distribution;
    source = "type" => ["Target", "VPop"],
)
calibrated_comparison_df = vcat(
    target_distribution,
    ilp_distribution;
    source = "type" => ["Target", "VPop"],
)
comparison_df = vcat(
    original_comparison_df,
    calibrated_comparison_df;
    source = "source" => ["Original", "Calibrated"],
)

type_mapping = :type => sorter("VPop", "Target") => ""
specs = data(comparison_df) * mapping(
    :response => renamer(
        "CR" => "Complete Response",
        "PR" => "Partial Response",
        "SD" => "Stable Disease",
        "PD" => "Progressive Disease",
    ) => "Response",
    :pct => "Percentage (%)",
    dodge = type_mapping,
    row = :source => sorter("Original", "Calibrated") => "VPop",
    color = type_mapping
) * visual(BarPlot)
draw(specs; axis = (; xticklabelrotation = π / 4))

Before and after calibration: response distribution comparison. The calibrated virtual population closely matches the target distribution.


7 Part 4: Multi-Objective Optimization

7.1 Motivation

The ILP from Part 3 requires choosing \(\epsilon\) before solving. What if we want to explore the full trade-off between population size and per-category accuracy without committing to a specific tolerance?

Multi-objective optimization lets us find the entire Pareto front directly, revealing all optimal trade-offs between two competing objectives:

  1. Maximize \(N_{\text{total}} = \sum x_i\) (number of selected VPs)
  2. Minimize the maximum per-category error (the same quantity \(\epsilon\) constrains)

7.2 Mathematical Formulation

We introduce a single continuous variable \(\Delta \geq 0\) representing the maximum absolute count displacement across all categories:

\[ \begin{align} \min \quad & [-N_{\text{total}},\; \Delta] \\ \text{s.t.} \quad & N_c - p_c \cdot N_{\text{total}} \leq \Delta & \forall\, c \\ & p_c \cdot N_{\text{total}} - N_c \leq \Delta & \forall\, c \\ & x_j \in \{0, 1\} \\ & \Delta \geq 0 \end{align} \]

The key relationship is \(\Delta / N_{\text{total}} = \max_c |N_c / N_{\text{total}} - p_c|\), which is exactly the quantity that \(\epsilon\) constrains in the ILP. So this formulation directly explores the Pareto front of \((N_{\text{total}},\, \epsilon_{\max})\).

7.3 Recovering the \(\epsilon_{\max}\)-Pareto Front

Using \(\Delta\) as the second objective keeps the problem as a MILP, solvable efficiently by HiGHS. But \(\Delta\) is not the metric we ultimately care about — \(\epsilon_{\max}\) is. The Pareto fronts in \((N_\text{total}, \Delta)\) and \((N_\text{total}, \epsilon_{\max})\) space are generally not identical.

Fortunately, the \(\Delta\)-Pareto front is guaranteed to contain all \(\epsilon_{\max}\)-Pareto-optimal solutions:

Note: \(\Delta\)-Dominance Implies \(\epsilon_{\max}\)-Dominance

If solution \(T\) dominates \(S\) in \((N_\text{total}, \Delta)\) space (i.e., \(N_T \geq N_S\) and \(\Delta_T \leq \Delta_S\), with at least one strict inequality), then:

\[ \epsilon_{\max,T} = \frac{\Delta_T}{N_T} \leq \frac{\Delta_S}{N_S} = \epsilon_{\max,S} \]

This holds because \(\Delta_T \leq \Delta_S\) and \(\frac{1}{N_T} \leq \frac{1}{N_S}\), so their product satisfies \(\frac{\Delta_T}{N_T} \leq \frac{\Delta_S}{N_S}\).

Consequence: every \(\epsilon_{\max}\)-Pareto-optimal solution is also \(\Delta\)-Pareto-optimal. We can therefore find the \(\Delta\)-Pareto front efficiently (as a MILP), compute \(\epsilon_{\max} = \Delta / N_\text{total}\) for each solution, and post-filter for \(\epsilon_{\max}\)-dominance to recover the exact \(\epsilon_{\max}\)-Pareto front.
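The dominance argument is easy to sanity-check numerically (arbitrary illustrative values, not solver output):

```julia
# ε_max for a solution with N_total selected VPs and maximum count displacement Δ
eps_max(N_total, Δ) = Δ / N_total

# T dominates S in (N_total, Δ) space: N_T ≥ N_S and Δ_T ≤ Δ_S
N_S, Δ_S = 200, 12.0
N_T, Δ_T = 250, 10.0

# Dominance carries over to ε_max: Δ_T / N_T ≤ Δ_S / N_S
eps_max(N_T, Δ_T) <= eps_max(N_S, Δ_S)  # true
```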

7.4 Multi-Objective Formulation with MultiObjectiveAlgorithms.jl

MultiObjectiveAlgorithms.jl (MOA) extends JuMP with multi-objective optimization algorithms. It wraps a single-objective solver (here, HiGHS) and orchestrates multiple subproblem solves to trace the Pareto front.

We use the ε-constraint method (MOA.EpsilonConstraint), which is well-suited for bi-objective integer programs. It iterates over the range of one objective, constraining it at each step while optimizing the other, systematically enumerating nondominated solutions.

Tip: Choosing the MOA Algorithm

For bi-objective problems, MOA offers several algorithms:

| Algorithm | What it finds | Best for |
| --- | --- | --- |
| EpsilonConstraint() | All nondominated points | General bi-objective (recommended) |
| Chalmet() | All nondominated points | Bi-criterion integer programs |
| Dichotomy() | Supported extreme points | Quick overview (may miss points) |
| TambyVanderpooten() | All nondominated points | Multi-objective (p ≥ 2) discrete programs |

EpsilonConstraint is the most intuitive and gives fine-grained control over the number of Pareto points via MOA.SolutionLimit().

moa_model = Model(() -> MOA.Optimizer(HiGHS.Optimizer))
set_attribute(moa_model, MOA.Algorithm(), MOA.EpsilonConstraint())
set_attribute(moa_model, MOA.SolutionLimit(), 50)
set_silent(moa_model)

# Binary selection variables
@variable(moa_model, y[1:nrow(vpop)], Bin)

# Single continuous variable: maximum absolute count displacement
@variable(moa_model, Δ >= 0)

# Total selected VPs
N_total = sum(y)

# Bi-objective: minimize [-N_total, Δ]
# (negate N_total since JuMP requires a uniform optimization sense)
@objective(moa_model, Min, [-N_total, Δ])

# Maximum displacement constraints
for row in eachrow(target_distribution)
    p_c = row.pct / 100
    N_c = sum(y[vpop.response .== row.response])
    @constraint(moa_model, N_c - p_c * N_total <= Δ)
    @constraint(moa_model, p_c * N_total - N_c <= Δ)
end

moa_model
A JuMP Model
├ solver: MOA[algorithm=MultiObjectiveAlgorithms.EpsilonConstraint, optimizer=HiGHS]
├ objective_sense: MIN_SENSE
│ └ objective_function_type: Vector{AffExpr}
├ num_variables: 2001
├ num_constraints: 2009
│ ├ JuMP.AffExpr in MOI.LessThan{Float64}: 8
│ ├ JuMP.VariableRef in MOI.GreaterThan{Float64}: 1
│ └ JuMP.VariableRef in MOI.ZeroOne: 2000
└ Names registered in the model
  └ :y, :Δ
optimize!(moa_model)
assert_is_solved_and_feasible(moa_model)

7.5 Extracting and Filtering the Pareto Front

We extract all nondominated solutions found by the ε-constraint method, compute the maximum absolute error for each, and filter for \(\epsilon_{\max}\)-Pareto-optimality:

n_solutions = result_count(moa_model)

moa_solutions_df = map(1:n_solutions) do i
    obj = objective_value(moa_model; result = i)
    nselected = round(Int, -obj[1])
    delta = obj[2]

    # Recover max absolute error: ε_max = Δ / N_total
    max_error = nselected > 0 ? delta / nselected : NaN

    (; nselected, max_error)
end |> DataFrame

We filter for \(\epsilon_{\max}\)-Pareto-optimality by removing solutions that are dominated in \((N_\text{total}, \epsilon_{\max})\) space:

sort!(moa_solutions_df, [order(:nselected; rev = true), :max_error])
moa_solutions_df.pareto_optimal .= false
moa_solutions_df.pareto_optimal[begin] = true
dominating_error = moa_solutions_df.max_error[begin]
for row in eachrow(moa_solutions_df)
    if row.max_error < dominating_error
        row.pareto_optimal = true
        dominating_error = row.max_error
    end
end

sort!(moa_solutions_df, :nselected; rev = true)
simple_table(
    moa_solutions_df,
    [
        :nselected => "# selected VPs",
        :max_error => "Max. absolute error",
        :pareto_optimal => "Pareto-Optimal",
    ],
)
# selected VPs Max. absolute error Pareto-Optimal
2000 0.347 true
1320 0.214 true
1282 0.213 true
1244 0.212 true
1206 0.211 true
1168 0.21 true
1130 0.208 true
1092 0.207 true
1054 0.205 true
1016 0.204 true
978 0.202 true
940 0.2 true
902 0.198 true
864 0.196 true
826 0.193 true
788 0.19 true
750 0.187 true
712 0.184 true
674 0.18 true
636 0.176 true
598 0.171 true
560 0.166 true
522 0.16 true
484 0.153 true
446 0.145 true
408 0.135 true
370 0.123 true
332 0.108 true
294 0.0901 true
256 0.0664 true
218 0.0344 true
180 0 true

7.6 Visualization

We visualize the Pareto front and overlay the ILP solution from Part 3:

specs = (
    data(@rsubset(moa_solutions_df, :pareto_optimal)) * mapping(color = direct("Pareto front")) +
        data((; max_error = [ilp_max_error], nselected = [nrow(ilp_vpop)])) * mapping(color = direct("ILP solution (ϵ = 0.05)"))
) * mapping(:max_error => "Max. absolute error", :nselected => "# selected VPs") * visual(Scatter)
draw(specs)

Multi-objective Pareto front showing the trade-off between population size and maximum per-category error. The ILP solution from Part 3 is overlaid for comparison.


The Pareto front reveals the full trade-off between population size and maximum per-category error. The ILP solution from Part 3 is a single point on (or near) this front, corresponding to the specific choice of \(\epsilon = 0.05\). The multi-objective approach recovers the entire curve without requiring \(\epsilon\) to be chosen a priori.

8 Part 5: Inter-Category Trade-offs

8.1 Why Allow Trade-offs Between Categories?

The per-category formulation from Part 3 (and its multi-objective extension in Part 4) treats all categories equally: no single category’s proportion can deviate from its target by more than \(\epsilon\). But what if you are willing to accept a slightly larger error in one category, as long as other categories compensate?

Constraining the mean total error (MTE) instead of the per-category maximum allows this trade-off. This leads to a mixed-integer linear program (MILP) that is strictly more flexible than the per-category ILP.

8.2 Mean Total Error

The mean total error across all response categories (Cortés‐Ríos et al., 2025) is defined as:

\[ \text{MTE} = \frac{1}{C} \sum_c \left| \frac{N_c}{N_{\text{total}}} - p_c \right| \]

where \(C\) is the number of categories. This measures the average absolute deviation between the observed and target proportions.

Note: Relationship to Total Variation Distance

The total variation distance between two discrete probability distributions is a standard metric defined as:

\[ \text{TV} = \frac{1}{2} \sum_c |q_c - p_c| \]

where \(q_c\) and \(p_c\) are the observed and target proportions. TV is always in \([0, 1]\) regardless of the number of categories.

The mean total error is simply a rescaled version: \(\text{MTE} = 2 \cdot \text{TV} / C\). In particular, this implies that the MTE is always in \([0, 2/C]\).
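A minimal sketch verifying this relationship on a hypothetical pair of distributions (both vectors sum to 1):

```julia
# Mean total error and total variation distance between observed (q) and
# target (p) category proportions.
mte(q, p) = sum(abs, q .- p) / length(p)
tv(q, p) = sum(abs, q .- p) / 2

p = [0.25, 0.40, 0.20, 0.15]   # target
q = [0.20, 0.35, 0.25, 0.20]   # observed (hypothetical)

mte(q, p)                              # ≈ 0.05
tv(q, p)                               # ≈ 0.10
mte(q, p) ≈ 2 * tv(q, p) / length(p)   # true: MTE = 2·TV/C
```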

8.3 Mathematical Formulation

As in Part 2, we maximize the number of selected VPs. But instead of per-category constraints, we bound the mean total error using auxiliary count deviation variables.

Using the MTE directly as a constraint is problematic: \(N_c / N_{\text{total}}\) is a ratio of decision variables, making the problem a nonlinear fractional program. Instead, we work with count deviations:

\[ \delta_c = |N_c - p_c \cdot N_{\text{total}}| \]

Since \(N_c - p_c \cdot N_\text{total} = \sum_{j \in c} x_j - p_c \sum_j x_j = \sum_j (\mathbb{1}_{j \in c} - p_c) \cdot x_j\), this is a linear function of the decision variables \(x_j\). The absolute value can be linearized with auxiliary variables \(\delta_c \geq 0\) and two-sided inequality constraints:

\[ \delta_c \geq N_c - p_c \cdot N_\text{total}, \qquad \delta_c \geq p_c \cdot N_\text{total} - N_c \]

The total count deviation \(D = \sum_c \delta_c\) is related to MTE by:

\[ D = C \cdot \text{MTE} \cdot N_\text{total} \]

so the constraint \(\text{MTE} \leq \epsilon\) is equivalent to:

\[ \sum_c \delta_c \leq C \cdot \epsilon \cdot N_\text{total} \]

TipSame Epsilon, Different Guarantees

Both formulations use \(\epsilon\) but with different interpretations:

  • The ILP (Part 3) enforces \(|q_c - p_c| \leq \epsilon\) for every category, which guarantees \(\text{MTE} \leq \epsilon\).
  • The MILP enforces \(\text{MTE} \leq \epsilon\) directly, allowing individual categories to exceed \(\epsilon\) as long as the average stays within the bound.

Since the ILP’s feasible set is a subset of the MILP’s, the MILP can always select at least as many VPs.
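A quick numeric example (hypothetical observed proportions) of a distribution that the MILP accepts but the per-category ILP rejects:

```julia
p = [0.25, 0.40, 0.20, 0.15]   # target proportions
q = [0.16, 0.41, 0.21, 0.22]   # observed (hypothetical)
ε = 0.05

diffs = abs.(q .- p)                 # ≈ [0.09, 0.01, 0.01, 0.07]
maximum(diffs) <= ε                  # false → violates the per-category bound
sum(diffs) / length(p) <= ε          # true  → satisfies MTE ≤ ε (MTE ≈ 0.045)
```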

8.4 Implementation

We implement the MILP with the same \(\epsilon = 0.05\) used in Part 3:

model_milp = Model(HiGHS.Optimizer)
set_silent(model_milp)
set_attribute(model_milp, "presolve", "on")
set_time_limit_sec(model_milp, 60.0)

# Decision variables
@variable(model_milp, x_milp[1:nrow(vpop)], Bin)
@variable(model_milp, δ[1:nrow(target_distribution)] >= 0)

# Objective: maximize number of selected VPs
N_total = sum(x_milp)
@objective(model_milp, Max, N_total)

# Count deviation constraints (linearized absolute values)
for (δ_c, row) in zip(δ, eachrow(target_distribution))
    p_c = row.pct / 100
    N_c = sum(x_milp[vpop.response .== row.response])
    @constraint(model_milp, δ_c >= N_c - p_c * N_total)
    @constraint(model_milp, δ_c >= p_c * N_total - N_c)
end

# Upper bound on mean total error: MTE ≤ ε
# Since sum(δ) = C · MTE · N_total, this is equivalent to: sum(δ) ≤ C · ε · N_total
@constraint(model_milp, sum(δ) <= nrow(target_distribution) * epsilon * N_total)

model_milp
A JuMP Model
├ solver: HiGHS
├ objective_sense: MAX_SENSE
│ └ objective_function_type: AffExpr
├ num_variables: 2004
├ num_constraints: 2013
│ ├ JuMP.AffExpr in MOI.GreaterThan{Float64}: 8
│ ├ JuMP.AffExpr in MOI.LessThan{Float64}: 1
│ ├ JuMP.VariableRef in MOI.GreaterThan{Float64}: 4
│ └ JuMP.VariableRef in MOI.ZeroOne: 2000
└ Names registered in the model
  └ :x_milp, :δ
optimize!(model_milp)
assert_is_solved_and_feasible(model_milp)

8.5 Results and Comparison with ILP

milp_selected = findall(xi -> value(xi) > 0.5, x_milp)  # > 0.5 guards against floating-point round-off
milp_vpop = vpop[milp_selected, :]
milp_distribution = @by milp_vpop :response begin
    :pct = 100 * length(:response) / nrow(milp_vpop)
end
milp_comparison_df = leftjoin(target_distribution, milp_distribution; on = :response, renamecols = "_target" => "_vpop")
@rtransform! milp_comparison_df :pct_vpop = @coalesce(:pct_vpop, 0.0)
@rtransform! milp_comparison_df :pct_diff = :pct_vpop - :pct_target

simple_table(
    milp_comparison_df,
    [:response => "Response", :pct_vpop => "Vpop (%)", :pct_target => "Target (%)", :pct_diff => "Difference (%)"]
)
Response Vpop (%) Target (%) Difference (%)
CR 15 25 -9.98
PR 40.3 40 0.256
SD 20.1 20 0.128
PD 24.6 15 9.6
milp_mte = mean(abs, milp_comparison_df.pct_diff) / 100
0.04992012779552717

We compare the ILP and MILP solutions:

ilp_mte = mean(abs, ilp_comparison_df.pct_diff) / 100
milp_max_error = maximum(abs, milp_comparison_df.pct_diff) / 100
summary_df = DataFrame(
    method = ["ILP (max. absolute error ≤ $epsilon)", "MILP (mean total error ≤ $epsilon)"],
    nselected = [nrow(ilp_vpop), nrow(milp_vpop)],
    max_error = [ilp_max_error, milp_max_error],
    mte = [ilp_mte, milp_mte],
)
simple_table(
    summary_df,
    [
        :method => "Method",
        :nselected => "# selected VPs",
        :max_error => "Max. absolute error",
        :mte => "Mean total error",
    ]
)
Method # selected VPs Max. absolute error Mean total error
ILP (max. absolute error ≤ 0.05) 235 0.05 0.0484
MILP (mean total error ≤ 0.05) 313 0.0998 0.0499

The MILP selects at least as many virtual patients as the ILP (here 313 vs. 235), because every ILP-feasible selection is also MILP-feasible. The inter-category trade-off lets the solver accept calibrated populations that the per-category constraints would reject, at the cost that the maximum per-category error may exceed \(\epsilon\) (here 0.0998).
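The trade-off is visible in the rounded per-category differences printed in the MILP comparison table; a quick Python check using those rounded values recovers both error metrics:

```python
# Per-category differences (in %) from the MILP comparison table above.
pct_diff = [-9.98, 0.256, 0.128, 9.6]

max_error = max(abs(d) for d in pct_diff) / 100
mte = sum(abs(d) for d in pct_diff) / len(pct_diff) / 100

# CR and PD individually exceed epsilon = 0.05 ...
assert max_error > 0.05
# ... but the mean total error stays within the bound.
assert mte <= 0.05
```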

8.6 Multi-Objective Optimization with MTE

Just as we used multi-objective optimization in Part 4 to explore the per-category error trade-off, we can do the same for the MTE-based formulation. Instead of a single \(\Delta\) variable, we use per-category count deviations \(\delta_c\) and minimize their sum \(D = \sum_c \delta_c\) as the second objective.

8.6.1 From MTE to Count Deviations

We cannot use MTE directly as an optimization objective because it involves a ratio of decision variables. Instead, we minimize the total count deviation \(D = \sum_c \delta_c\), which is related to MTE by \(D = C \cdot \text{MTE} \cdot N_\text{total}\).
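A small Python sketch with hypothetical counts verifies the identity \(D = C \cdot \text{MTE} \cdot N_\text{total}\) that links the count-deviation objective back to MTE:

```python
# Hypothetical selection: counts per category and target proportions
# (illustrative numbers, not the tutorial's data).
N_c = [30, 45, 15, 10]          # selected VPs per response category
p_c = [0.25, 0.40, 0.20, 0.15]  # target proportions

N_total = sum(N_c)              # 100 selected VPs in total
C = len(N_c)                    # number of response categories

# Count deviations delta_c = |N_c - p_c * N_total| and their sum D.
deltas = [abs(n - p * N_total) for n, p in zip(N_c, p_c)]
D = sum(deltas)

# MTE computed directly from the realized proportions ...
mte = sum(abs(n / N_total - p) for n, p in zip(N_c, p_c)) / C

# ... agrees with D / (C * N_total), the identity used in the constraint.
assert abs(D - C * mte * N_total) < 1e-9
```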

8.6.2 Recovering the MTE-Pareto Front

Note\(D\)-Dominance Implies MTE-Dominance

If solution \(T\) dominates \(S\) in \((N_\text{total}, D)\) space (i.e., \(N_T \geq N_S\) and \(D_T \leq D_S\), with at least one strict inequality), then:

\[ \text{MTE}_T = \frac{D_T}{C \cdot N_T} \leq \frac{D_S}{C \cdot N_S} = \text{MTE}_S \]

This holds because \(D_T \leq D_S\) and \(\frac{1}{N_T} \leq \frac{1}{N_S}\), so their product satisfies \(\frac{D_T}{N_T} \leq \frac{D_S}{N_S}\).

Consequence: every MTE-Pareto-optimal solution is also \(D\)-Pareto-optimal. We can therefore find the \(D\)-Pareto front efficiently (as a MILP), compute \(\text{MTE} = D / (C \cdot N_\text{total})\) for each solution, and post-filter for MTE-dominance to recover the exact MTE-Pareto front.
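The implication can be sanity-checked numerically; in this hypothetical Python example, solution \(T\) dominates \(S\) in \((N_\text{total}, D)\) space and, as the note predicts, also has the smaller MTE:

```python
# Two hypothetical solutions with C = 4 categories:
# T dominates S in (N_total, D) space (more VPs, smaller count deviation).
C = 4
N_S, D_S = 200, 44.0
N_T, D_T = 250, 40.0   # N_T >= N_S and D_T <= D_S

def mte(D, N):
    """Recover MTE from the total count deviation."""
    return D / (C * N)

# D-dominance carries over to MTE-dominance.
assert mte(D_T, N_T) <= mte(D_S, N_S)
```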

moa_mte_model = Model(() -> MOA.Optimizer(HiGHS.Optimizer))
set_attribute(moa_mte_model, MOA.Algorithm(), MOA.EpsilonConstraint())
set_attribute(moa_mte_model, MOA.SolutionLimit(), 50)
set_silent(moa_mte_model)

# Binary selection variables
@variable(moa_mte_model, y_mte[1:nrow(vpop)], Bin)

# Continuous count deviation variables (one per category)
@variable(moa_mte_model, δ_mte[1:nrow(target_distribution)] >= 0)

# Total selected VPs
N_total = sum(y_mte)

# Bi-objective: minimize [-N_total, D]
# (negate N_total since JuMP requires a uniform optimization sense)
@objective(moa_mte_model, Min, [-N_total, sum(δ_mte)])

# Count deviation constraints
for (δ_c, row) in zip(δ_mte, eachrow(target_distribution))
    p_c = row.pct / 100
    N_c = sum(y_mte[vpop.response .== row.response])
    @constraint(moa_mte_model, δ_c >= N_c - p_c * N_total)
    @constraint(moa_mte_model, δ_c >= p_c * N_total - N_c)
end

moa_mte_model
A JuMP Model
├ solver: MOA[algorithm=MultiObjectiveAlgorithms.EpsilonConstraint, optimizer=HiGHS]
├ objective_sense: MIN_SENSE
│ └ objective_function_type: Vector{AffExpr}
├ num_variables: 2004
├ num_constraints: 2012
│ ├ JuMP.AffExpr in MOI.GreaterThan{Float64}: 8
│ ├ JuMP.VariableRef in MOI.GreaterThan{Float64}: 4
│ └ JuMP.VariableRef in MOI.ZeroOne: 2000
└ Names registered in the model
  └ :y_mte, :δ_mte
optimize!(moa_mte_model)
assert_is_solved_and_feasible(moa_mte_model)

8.7 Extracting and Filtering the MTE-Pareto Front

n_mte_solutions = result_count(moa_mte_model)

moa_mte_solutions_df = map(1:n_mte_solutions) do i
    obj = objective_value(moa_mte_model; result = i)
    nselected = round(Int, -obj[1])
    count_dev = obj[2]

    # Recover MTE from count deviation: MTE = D / (C · N_total)
    mte = nselected > 0 ? count_dev / (nrow(target_distribution) * nselected) : NaN

    (; nselected, mte)
end |> DataFrame
# Remove MTE-dominated solutions
sort!(moa_mte_solutions_df, [order(:nselected; rev = true), :mte])
moa_mte_solutions_df.pareto_optimal .= false
moa_mte_solutions_df.pareto_optimal[begin] = true
dominating_error = moa_mte_solutions_df.mte[begin]
for row in eachrow(moa_mte_solutions_df)
    if row.mte < dominating_error
        row.pareto_optimal = true
        dominating_error = row.mte
    end
end

sort!(moa_mte_solutions_df, :nselected; rev = true)
simple_table(
    moa_mte_solutions_df,
    [
        :nselected => "# selected VPs",
        :mte => "Mean total error",
        :pareto_optimal => "Pareto-optimal",
    ],
)
# selected VPs Mean total error Pareto-optimal
2000 0.204 true
1966 0.201 true
1928 0.199 true
1890 0.196 true
1852 0.194 true
1814 0.191 true
1776 0.188 true
1738 0.185 true
1700 0.182 true
1662 0.179 true
1624 0.175 true
1586 0.172 true
1548 0.168 true
1510 0.164 true
1472 0.16 true
1434 0.156 true
1396 0.151 true
1358 0.146 true
1320 0.141 true
1282 0.135 true
1244 0.13 true
1206 0.124 true
1168 0.117 true
1130 0.11 true
1092 0.103 true
1054 0.103 true
1016 0.102 true
978 0.101 true
940 0.1 true
902 0.0989 true
864 0.0978 true
826 0.0965 true
788 0.0952 true
750 0.0937 true
712 0.092 true
674 0.0901 true
636 0.0881 true
598 0.0857 true
560 0.083 true
522 0.08 true
484 0.0764 true
446 0.0723 true
408 0.0674 true
370 0.0615 true
332 0.0542 true
294 0.0451 true
256 0.0332 true
218 0.0172 true
180 0 true
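The dominance filter used above reduces to a single sorted sweep. Here is a minimal Python sketch of the same idea with hypothetical candidate points: sort by population size descending and error ascending, then keep a point only when it strictly improves the best error seen so far:

```python
# Candidate (nselected, mte) pairs (hypothetical values).
solutions = [(300, 0.06), (300, 0.05), (250, 0.05),
             (250, 0.04), (200, 0.07), (150, 0.02)]

# Sort by nselected descending, then mte ascending (ties keep the lower error).
solutions.sort(key=lambda s: (-s[0], s[1]))

pareto = []
best_mte = float("inf")
for nsel, mte in solutions:
    if mte < best_mte:          # strictly improves error => Pareto-optimal
        pareto.append((nsel, mte))
        best_mte = mte

assert pareto == [(300, 0.05), (250, 0.04), (150, 0.02)]
```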

8.7.1 Visualization

We visualize the MTE-Pareto front and overlay the MILP solution:

specs = (
    data(@rsubset(moa_mte_solutions_df, :pareto_optimal)) * mapping(
        color = direct("Pareto front (mean total error ≤ $epsilon)")
    ) +
        data((; mte = [milp_mte], nselected = [nrow(milp_vpop)])) * mapping(
        color = direct("MILP solution (mean total error ≤ $epsilon)")
    )
) * mapping(
    :mte => "Mean total error", :nselected => "# selected VPs"
) * visual(Scatter)
draw(specs)

Multi-objective Pareto front showing the trade-off between population size and mean total error. The MILP solution is overlaid for comparison.

The MTE-Pareto front shows how population size trades off against distribution error when inter-category compensation is allowed. The MILP solution from earlier in this section is a single point on (or near) this front. Comparing it with the max-displacement Pareto front from Part 4 shows how the two error metrics lead to different trade-off landscapes.

9 Part 6: Practical Recommendations

TipChoosing a Calibration Strategy

The calibration approaches offer different trade-offs:

  • Per-category ε constraints (Part 3): Simple, interpretable, and guarantees that each response category individually matches the target within ε. Use this when per-category accuracy matters (e.g., matching the CR rate within 5%).
  • Multi-objective with max displacement (Part 4): Finds the full Pareto front of population size vs. maximum per-category error without choosing ε a priori. Use this to explore the trade-off space before committing to a tolerance.
  • MILP with MTE bound (Part 5): More flexible, allows inter-category trade-offs while bounding the overall distribution error. This is the recommended approach for VCT simulations (see Tutorial 6).
  • Multi-objective with MTE (Part 5): Explores the MTE-based trade-off without choosing ε a priori.
TipChoosing Epsilon

When using per-category ε constraints (Part 3) or MTE-bounded MILP (Part 5), start with ε = 0.10 (10% tolerance). Note that ε has different interpretations: the ILP guarantees \(|q_c - p_c| \leq \epsilon\) per category, while the MILP enforces \(\text{MTE} \leq \epsilon\) on average:

Epsilon Effect
0.05 (5%) Very strict, fewer VPs selected
0.10 (10%) Good balance (recommended)
0.15-0.20 More VPs, acceptable for preliminary work
> 0.25 Too loose, poor distribution matching
TipHandling Infeasible Problems

If the solver returns INFEASIBLE:

  1. Increase epsilon from 0.10 to 0.15 or 0.20
  2. Check target achievability - do you have VPs in all response categories?
  3. Merge sparse categories - combine SD and PD if both are rare
  4. Increase Vpop size - generate more VPs to have more options
WarningWhen Categories Are Imbalanced

If your Vpop has very few VPs in a target category (e.g., only 2% CR but target is 25% CR):

  • Calibration may fail or reject most VPs
  • Consider: Is the Vpop generation realistic?
  • Solution: Adjust population parameters or use importance sampling

10 Summary

10.1 Key Concepts

  1. Calibration ensures realistic virtual populations that match clinical response distributions

  2. ILP optimization provides precise control over distribution matching through binary selection variables and per-category constraints

  3. Multi-objective optimization with max displacement finds the full Pareto front of population size vs. maximum per-category error without pre-specifying ε

  4. MILP with MTE bound allows inter-category trade-offs by constraining the mean total error rather than individual categories

  5. Multi-objective optimization with MTE explores the MTE-based trade-off using count-deviation objectives and MTE post-filtering

10.2 Quick Reference: Steps

  1. Generate virtual population (Tutorial 4)
  2. Simulate responses for each VP
  3. Define target response distribution
  4. Choose epsilon and formulation (ILP, MILP, or multi-objective)
  5. Run calibration
  6. Validate: compare calibrated vs target distributions
  7. Save calibrated vpop for VCT (Tutorial 6)

10.3 What’s Next?

In Tutorial 6: VCT Simulation, we’ll use the calibrated virtual population to run a complete Virtual Clinical Trial, comparing treatment effects and analyzing endpoints.

10.4 References

Cortés-Ríos, J., Magee, M., Sher, A., Jusko, W. J., & Desikan, R. (2025). A step-by-step workflow for performing in silico clinical trials with nonlinear mixed effects models. CPT: Pharmacometrics & Systems Pharmacology. https://doi.org/10.1002/psp4.70122
Cortés-Ríos, J., Ren, T., Hanan, N., Sher, A., Nader, A., Magee, M., Jusko, W. J., & Desikan, R. (2025). Hepatitis B in silico trials capture functional cure, indicate mechanistic pathways, and suggest prognostic biomarker signatures. Clinical Pharmacology & Therapeutics, 118(3), 600–612. https://doi.org/10.1002/cpt.3718
Dunning, I., Huchette, J., & Lubin, M. (2017). JuMP: A modeling language for mathematical optimization. SIAM Review, 59(2), 295–320. https://doi.org/10.1137/15M1020575
Huangfu, Q., & Hall, J. A. J. (2018). Parallelizing the dual revised simplex method. Mathematical Programming Computation, 10(1), 119–142. https://doi.org/10.1007/s12532-017-0130-5
Qi, T., & Cao, Y. (2022). Virtual clinical trials: A tool for predicting patients who may benefit from treatment beyond progression with pembrolizumab in non-small cell lung cancer. CPT: Pharmacometrics & Systems Pharmacology, 12(2), 236–249. https://doi.org/10.1002/psp4.12896