MILP-Based Virtual Population Calibration

Matching Response Rate Distributions with Optimization

Authors

Vijay Ivaturi

David Müller-Widmann

Note: Tumor Burden Tutorial Series

This is Tutorial 5 of 6 in the Tumor Burden series.

# Tutorial Status
1 Model Introduction ✓ Complete
2 Global Sensitivity Analysis ✓ Complete
3 Structural Identifiability ✓ Complete
4 Copula Vpop Generation ✓ Complete
5 MILP Calibration (current) ← You are here
6 VCT Simulation

1 Learning Outcomes

Virtual populations generated using copula sampling (see Tutorial 4) may not match real clinical data distributions. Calibration ensures the virtual population is representative of the actual patient population being studied (Cortés-Ríos et al., 2025).

In this tutorial, you will learn:

  • Why virtual population calibration is essential for realistic ISCTs
  • The mathematical formulation of an integer linear programming (ILP) problem
  • How to build a JuMP optimization model step-by-step
  • How to calibrate to response rate distributions (not just baseline characteristics)
  • Multi-objective optimization with maximum displacement using MultiObjectiveAlgorithms.jl
  • The mixed-integer linear programming (MILP) formulation with distribution error bound for inter-category trade-offs
  • Validating calibration results with visualizations
Tip: Prerequisites

This tutorial builds on concepts from the earlier tutorials in this series.

We’ll generate a new vpop using the copula approach and then calibrate it to match clinical response rates.

2 Background

2.1 The Problem: Why Calibration Matters

After generating a virtual population (Vpop) using copula sampling, we have patients with realistic parameter correlations. However, the marginal distributions and outcome distributions may not match what we observe in real clinical trials.

Consider this scenario for tumor burden:

  • Your Vpop has 10,000 virtual patients with parameters (f, g, k)
  • When you simulate treatment, you get certain response rates
  • The real clinical trial showed: 30% complete response, 45% partial response, 25% no response
  • Your Vpop shows different rates due to parameter distribution differences
Warning: The Distribution Mismatch Problem

If your Vpop outcome distribution doesn’t match the clinical trial:

  • Treatment effect estimates will be biased
  • Subgroup analyses won’t reflect real clinical heterogeneity
  • Regulatory reviewers may question whether the population is representative

Calibration selects a subset of your Vpop that, when simulated, matches the target response distribution.

2.2 What is ILP-Based Calibration?

ILP stands for integer linear programming. It is an optimization technique where variables are integer or binary (0 or 1), and both the objective function and the constraints are linear.

For Vpop calibration, we use binary variables to select or reject each virtual patient:

\[ x_j = \begin{cases} 1 & \text{if patient } j \text{ is selected} \\ 0 & \text{if patient } j \text{ is rejected} \end{cases} \]

2.3 Why ILP? Alternatives and Trade-offs

| Approach | Pros | Cons |
| --- | --- | --- |
| Random Rejection Sampling | Simple, fast | Inefficient, may reject too many |
| Importance Weighting | No rejection needed | Weights can be extreme, high variance |
| ILP | Optimal selection, precise matching | Computationally more expensive |

ILP is the preferred approach when you need precise control over the distribution matching.

3 Libraries

using Pumas
using StatsBase
using JuMP
using HiGHS
import MultiObjectiveAlgorithms as MOA
using Copulas
using AlgebraOfGraphics
using CairoMakie
using CategoricalArrays
using DataFramesMeta
using SummaryTables

using Random
using Statistics

# Set seed for reproducibility
Random.seed!(1234)

What each package does:

  • JuMP: Algebraic modeling language for optimization problems
  • HiGHS: High-performance open-source linear and mixed-integer programming solver
  • MultiObjectiveAlgorithms (MOA): Multi-objective optimization algorithms for JuMP
  • Copulas: For generating correlated parameter samples
  • AlgebraOfGraphics + CairoMakie: Publication-quality visualizations

4 Part 1: Understanding the Calibration Problem

4.1 The Tumor Burden Model Recap

From Tutorial 1, our tumor burden model has:

  • f: Fraction of treatment-sensitive cells (0-1)
  • g: Growth rate of resistant cells (1/day)
  • k: Death rate of sensitive cells (1/day)

The tumor dynamics follow:

\[ N(t) = N_0 \cdot f \cdot e^{-kt} + N_0 \cdot (1-f) \cdot e^{gt} \]
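This closed form can be evaluated directly; below is a minimal plain-Julia sketch (normalizing \(N_0 = 1\), with illustrative parameter values close to the population estimates used later in this tutorial):

```julia
# Closed-form tumor burden: the sensitive compartment decays at rate k,
# the resistant compartment grows at rate g; N0 is normalized to 1.
tumor_burden(t; f, g, k, N0 = 1.0) = N0 * f * exp(-k * t) + N0 * (1 - f) * exp(g * t)

# Illustrative evaluation at 18 weeks (126 days)
tumor_burden(126.0; f = 0.27, g = 0.0013, k = 0.0091)
```

At \(t = 0\) the expression reduces to \(N_0\), as expected.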

The full non-linear mixed effect model, implemented and explained in Tutorial 1: Model Introduction, is:

tumor_burden_model = @model begin
    @metadata begin
        desc = "Tumor Burden Model for NSCLC Chemotherapy"
        timeu = u"d"  # Time unit is days
    end

    @param begin
        # Typical values (population parameters)
        tvf ∈ RealDomain(lower = 0.0, upper = 1.0)
        tvg ∈ RealDomain(lower = 0.0)
        tvk ∈ RealDomain(lower = 0.0)

        # Inter-individual variability (diagonal covariance matrix)
        # Values on the transformed scale (logit for f, log for g and k)
        Ω ∈ PDiagDomain(3)

        # Residual error (standard deviation)
        σ ∈ RealDomain(lower = 0.0)
    end

    @random begin
        # Random effects vector (one per parameter)
        η ~ MvNormal(Ω)
    end

    @covariates treatment

    @pre begin
        # Transform random effects to individual parameters
        # f: logit-normal transformation (bounded 0-1)
        f = logistic(logit(tvf) + η[1])
        # g: log-normal transformation (positive value)
        g = tvg * exp(η[2])
        # k: log-normal transformation (positive value)
        # Additionally, treatment affects death rate:
        # treatment = false → control arm (no killing)
        # treatment = true → treatment arm (killing at rate tvk * exp(η[3]))
        k = treatment * tvk * exp(η[3])
    end

    # Initial conditions (normalized tumor = 1)
    @init begin
        N_sens = f           # Sensitive cells
        N_res = 1 - f        # Resistant cells
    end

    @dynamics begin
        # Sensitive cells: die at rate k (0 if control)
        N_sens' = -k * N_sens

        # Resistant cells: grow at rate g
        N_res' = g * N_res
    end

    @derived begin
        # Total normalized tumor size
        Nt := @. N_sens + N_res

        # Observation with residual error
        tumor_size ~ @. Normal(Nt, σ)
    end
end
PumasModel
  Parameters: tvf, tvg, tvk, Ω, σ
  Random effects: η
  Covariates: treatment
  Dynamical system variables: N_sens, N_res
  Dynamical system type: Matrix exponential
  Derived: tumor_size
  Observed: tumor_size

4.2 Defining Response Categories

For tumor burden, we define response based on tumor size at a fixed time point (e.g., 18 weeks):

  • tumor size < 14%: Complete response (CR)
  • 14% ≤ tumor size < 70%: Partial response (PR)
  • 70% ≤ tumor size < 100%: Stable disease (SD)
  • tumor size ≥ 100%: Progressive disease (PD)
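These cutoffs can be written as a small standalone helper (a plain-Julia sketch; the tutorial itself classifies responses with `cut` from CategoricalArrays below):

```julia
# Classify a final normalized tumor size (fraction of baseline) into a
# response category, mirroring the thresholds defined above.
function classify_response(tumor_size::Real)
    if tumor_size < 0.14
        "CR"   # Complete response
    elseif tumor_size < 0.7
        "PR"   # Partial response
    elseif tumor_size < 1.0
        "SD"   # Stable disease
    else
        "PD"   # Progressive disease
    end
end

classify_response.([0.05, 0.5, 0.8, 1.3])  # one category per patient
```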

4.3 Generating the Virtual Population

Let us generate a vpop using the copula approach from Tutorial 4.

The parameter values come from Qi & Cao (2022), who fit this model to NSCLC clinical trial data:

# Population (typical) parameter values
pop_f = 0.27    # 27% of tumor is treatment-sensitive
pop_g = 0.0013  # Resistant cells grow ~0.13% per day
pop_k = 0.0091  # Sensitive cells die ~0.91% per day

# Inter-individual variability (standard deviations on transformed scale)
ω_f = 2.16  # High variability in sensitive fraction
ω_g = 1.57  # Moderate variability in growth rate
ω_k = 1.24  # Moderate variability in death rate

# Residual error (standard deviation)
resid_σ = 0.095

# Critical correlation from the data
cor_f_g = -0.64  # Negative correlation: high f ↔ low g

We create 2000 virtual subjects for simulations with the estimated population parameters:

patients = [
    Subject(; id, time = 7 * (0:18), covariates = (; treatment = true))
    for id in 1:2_000
]
param = (; tvf = pop_f, tvg = pop_g, tvk = pop_k, Ω = Diagonal([ω_f^2, ω_g^2, ω_k^2]), σ = resid_σ)

Based on the estimated correlation matrix, we define a Gaussian copula:

Rho = [
    1.0 -0.64 0.0
    -0.64 1.0 0.0
    0.0 0.0 1.0
]
copula = GaussianCopula(Rho)

We define the joint distribution of correlated random effects based on the marginal distributions and the Gaussian copula:

correlated_randeffs_dist = SklarDist(copula, (Normal(0, ω_f), Normal(0, ω_g), Normal(0, ω_k)))

We sample correlated random effects for each virtual patient:

correlated_randeffs = rand(correlated_randeffs_dist, length(patients))
vrandeffs = [(; η) for η in eachcol(correlated_randeffs)]

Now let us simulate the tumor dynamics for each virtual patient:

sims = simobs(
    tumor_burden_model, patients, param, vrandeffs; simulate_error = false
)
Simulated population (Vector{<:Subject})
  Simulated subjects: 2000
  Simulated variables: tumor_size

We compute the observed Spearman correlation of the simulated parameters (f, g, and k):

sims_f_g_k = postprocess(sims) do generated, _
    (; f = only(unique(generated.f)), g = only(unique(generated.g)), k = only(unique(generated.k)))
end |> DataFrame
corspearman(Matrix(sims_f_g_k))
3×3 Matrix{Float64}:
  1.0        -0.643481   -0.0191359
 -0.643481    1.0         0.0374227
 -0.0191359   0.0374227   1.0

Additionally, we validate the marginal distributions of the simulated parameters:

marginals_df = DataFrame(
    parameter = ["f", "g", "k"],
    median_exp = [pop_f, pop_g, pop_k],
    median_obs = [median(sims_f_g_k.f), median(sims_f_g_k.g), median(sims_f_g_k.k)],
    omega_exp = [ω_f, ω_g, ω_k],
    omega_obs = [std(logit.(sims_f_g_k.f)), std(log.(sims_f_g_k.g)), std(log.(sims_f_g_k.k))],
)
simple_table(
    marginals_df,
    [:parameter => "Parameter", :median_exp => "Median (expected)", :median_obs => "Median (observed)", :omega_exp => "ω (expected)", :omega_obs => "ω (observed)"]
)
Parameter Median (expected) Median (observed) ω (expected) ω (observed)
f 0.27 0.262 2.16 2.18
g 0.0013 0.00129 1.57 1.57
k 0.0091 0.00905 1.24 1.24

4.4 Classifying the Tumor Response

Now let us classify the response of each virtual patient after 18 weeks.

vpop = postprocess(sims) do generated, _
    if generated.t[end] != 7 * 18 # days
        error("final time point must be 18 weeks")
    end
    tumor_size_final = generated.tumor_size[end]
    (; tumor_size_final)
end |> DataFrame
@transform! vpop :response = cut(
    :tumor_size_final,
    [0.14, 0.7, 1.0];
    labels = ["CR", "PR", "SD", "PD"],
    extend = true,
)

We compute the distribution of responses:

vpop_distribution = @by vpop :response begin
    :pct = 100 * length(:response) / nrow(vpop)
end
simple_table(
    vpop_distribution, [:response => "Response", :pct => "Vpop (%)"],
)
Response Vpop (%)
CR 2.35
PR 22
SD 26
PD 49.6

4.5 Defining the Target Distribution

Let us define a target response distribution that we want to match. In practice, one would use response rates in a clinical trial.

target_distribution = DataFrame(
    response = categorical(["CR", "PR", "SD", "PD"]),
    pct = [25.0, 40.0, 20.0, 15.0],
)
simple_table(
    target_distribution, [:response => "Response", :pct => "Target (%)"],
)
Response Target (%)
CR 25
PR 40
SD 20
PD 15

We compare the target distribution with the distribution of response rates in the current virtual population.

comparison_df = leftjoin(target_distribution, vpop_distribution; on = :response, renamecols = "_target" => "_vpop")
@rtransform! comparison_df :pct_vpop = @coalesce(:pct_vpop, 0.0)
@rtransform! comparison_df :diff = :pct_vpop - :pct_target

simple_table(
    comparison_df,
    [
        :response => "Response",
        :pct_vpop => "Vpop (%)",
        :pct_target => "Target (%)",
        :diff => "Difference (%)",
    ]
)
Response Vpop (%) Target (%) Difference (%)
CR 2.35 25 -22.6
PR 22 40 -18.0
SD 26 20 6.05
PD 49.6 15 34.6

5 Part 2: The Mathematical Formulation

Before writing code, let’s understand the ILP optimization problem mathematically.

5.1 Decision Variables

We have binary selection variables for each virtual patient: \[ x_j \in \{0, 1\} \quad \text{for } j = 1, \ldots, N \]

5.2 Objective Function

We want to maximize the number of selected patients (larger calibrated populations give better statistical power):

\[ \max \sum_{j=1}^N x_j \]

5.3 Constraints

Additionally, we want the distribution of observed response rates of the selected patients to match the target distribution.

Since we generally cannot expect the observed and target distributions to match exactly, we enforce only approximate equality of the two distributions. That is, for each response category \(c\) with target fraction \(p_c\), the proportion of selected VPs in category \(c\) is constrained to lie within an absolute error \(\epsilon > 0\) of the target:

\[ (p_c - \epsilon)\, N_{\text{total}} \leq N_c \leq (p_c + \epsilon)\, N_{\text{total}} \]

where \(N_{\text{total}} = \sum_{j=1}^N x_j\) is the total number of selected VPs and \(N_c = \sum_{j \in \text{category } c} x_j\) is the number of selected VPs in category \(c\).
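As a quick numeric illustration (with hypothetical category counts, not taken from the tutorial's Vpop), a candidate selection is feasible exactly when every category count lies inside its \(\epsilon\)-band:

```julia
# Check the ε-band constraints for given per-category counts of selected VPs.
# counts[c] is N_c, targets[c] is p_c; both indexed by category.
function within_epsilon_band(counts, targets, epsilon)
    N_total = sum(counts)
    all(
        (p - epsilon) * N_total <= n <= (p + epsilon) * N_total
        for (n, p) in zip(counts, targets)
    )
end

targets = [0.25, 0.40, 0.20, 0.15]                    # CR, PR, SD, PD
within_epsilon_band([25, 40, 20, 15], targets, 0.05)  # exact match → true
within_epsilon_band([10, 45, 25, 20], targets, 0.05)  # CR off by 15% → false
```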

5.4 Why Binary Variables Make This Hard (But Solvable)

The fundamental insight is that VP selection is an all-or-nothing decision. Each virtual patient is either:

  • Selected (\(x_j = 1\)): included in the calibrated population
  • Rejected (\(x_j = 0\)): excluded from the calibrated population

This discrete choice makes the problem an integer linear program (ILP) rather than a standard linear program (LP).

Note: Computational Complexity

Binary optimization is NP-hard: there is no known algorithm that solves all instances quickly. With \(N\) virtual patients, there are \(2^N\) possible selections:

| VPs | Possible Selections |
| --- | --- |
| 10 | 1,024 |
| 20 | ~1 million |
| 100 | \(2^{100}\) (astronomical) |

Fortunately, modern solvers like HiGHS use sophisticated algorithms (branch-and-bound, cutting planes) that solve well-structured problems like ours in seconds (Dunning et al., 2017; Huangfu & Hall, 2018).

6 Part 3: Building the JuMP Model Step-by-Step

6.1 Step 3a: Build the Optimization Model

We implement the optimization problem using a maximum absolute error of 5% for each category:

# Calibration tolerance
epsilon = 0.05

# Create optimization model
model = Model(HiGHS.Optimizer)
set_silent(model)
set_attribute(model, "presolve", "on")
set_time_limit_sec(model, 60.0)

# Decision variables
@variable(model, x[1:nrow(vpop)], Bin)     # Binary selection for each VP

# Objective: maximize number of selected VPs
N_total = sum(x)
@objective(model, Max, N_total)

# Constraints: distribution matching for each category
for row in eachrow(target_distribution)
    p_c = row.pct / 100

    # Count of selected VPs in this category
    N_c = sum(x[vpop.response .== row.response])

    # Upper bound: count <= (p_c + ε) × N_total
    @constraint(model, N_c <= (p_c + epsilon) * N_total)

    # Lower bound: count >= (p_c - ε) × N_total
    @constraint(model, N_c >= (p_c - epsilon) * N_total)
end

model
A JuMP Model
├ solver: HiGHS
├ objective_sense: MAX_SENSE
│ └ objective_function_type: AffExpr
├ num_variables: 2000
├ num_constraints: 2008
│ ├ JuMP.AffExpr in MOI.GreaterThan{Float64}: 4
│ ├ JuMP.AffExpr in MOI.LessThan{Float64}: 4
│ └ JuMP.VariableRef in MOI.ZeroOne: 2000
└ Names registered in the model
  └ :x

6.2 Step 3b: Solve the Model

Solve the optimization problem:

optimize!(model)

We ensure that the problem was solved successfully:

assert_is_solved_and_feasible(model)

In case the problem was not solved successfully, the function assert_is_solved_and_feasible throws an error describing the state of the solver.

6.3 Step 3c: Extract and Analyze Results

We can inspect the results in more detail:

solution_summary(model)
solution_summary(; result = 1, verbose = false)
├ solver_name          : HiGHS
├ Termination
│ ├ termination_status : OPTIMAL
│ ├ result_count       : 1
│ ├ raw_status         : kHighsModelStatusOptimal
│ └ objective_bound    : 2.35000e+02
├ Solution (result = 1)
│ ├ primal_status        : FEASIBLE_POINT
│ ├ dual_status          : NO_SOLUTION
│ ├ objective_value      : 2.35000e+02
│ ├ dual_objective_value : NaN
│ └ relative_gap         : 0.00000e+00
└ Work counters
  ├ solve_time (sec)   : 1.00106e-02
  ├ simplex_iterations : 5
  ├ barrier_iterations : -1
  └ node_count         : 1

From the summary, we can already see that the selected number of virtual patients (which corresponds to the objective function value of the solution) is 235.

We extract the selected virtual patients, i.e., those for which the solution is 1:

ilp_selected = findall(xi -> value(xi) > 0.5, x)  # > 0.5 guards against floating-point round-off
ilp_vpop = vpop[ilp_selected, :]

The distribution of observed responses for this subset of virtual patients is

ilp_distribution = @by ilp_vpop :response begin
    :pct = 100 * length(:response) / nrow(ilp_vpop)
end
simple_table(
    ilp_distribution, [:response => "Response", :pct => "Vpop (%)"],
)
Response Vpop (%)
CR 20
PR 35.3
SD 24.7
PD 20

We compare it with the target distribution of response rates:

ilp_comparison_df = leftjoin(target_distribution, ilp_distribution; on = :response, renamecols = "_target" => "_vpop")
@rtransform! ilp_comparison_df :pct_vpop = @coalesce(:pct_vpop, 0.0)
@rtransform! ilp_comparison_df :pct_diff = :pct_vpop - :pct_target
simple_table(
    ilp_comparison_df,
    [
        :response => "Response",
        :pct_vpop => "Vpop (%)",
        :pct_target => "Target (%)",
        :pct_diff => "Difference (%)",
    ]
)
Response Vpop (%) Target (%) Difference (%)
CR 20 25 -5.0
PR 35.3 40 -4.68
SD 24.7 20 4.68
PD 20 15 5

6.4 Evaluating Distribution Error

The ILP constrains the maximum per-category proportion error to \(\epsilon\), so the natural evaluation metric is the realized maximum absolute error:

\[ \epsilon_{\max} = \max_c \left| \frac{N_c}{N_{\text{total}}} - p_c \right| \]

This tells us how far the worst-matching category deviates from its target.

ilp_max_error = maximum(abs, ilp_comparison_df.pct_diff) / 100
0.05

6.5 Visualizing the Calibration Result

original_comparison_df = vcat(
    target_distribution,
    vpop_distribution;
    source = "type" => ["Target", "VPop"],
)
calibrated_comparison_df = vcat(
    target_distribution,
    ilp_distribution;
    source = "type" => ["Target", "VPop"],
)
comparison_df = vcat(
    original_comparison_df,
    calibrated_comparison_df;
    source = "source" => ["Original", "Calibrated"],
)

type_mapping = :type => sorter("VPop", "Target") => ""
specs = data(comparison_df) * mapping(
    :response => renamer(
        "CR" => "Complete Response",
        "PR" => "Partial Response",
        "SD" => "Stable Disease",
        "PD" => "Progressive Disease",
    ) => "Response",
    :pct => "Percentage (%)",
    dodge = type_mapping,
    row = :source => sorter("Original", "Calibrated") => "VPop",
    color = type_mapping
) * visual(BarPlot)
draw(specs; axis = (; xticklabelrotation = π / 4))

Before and after calibration: response distribution comparison. The calibrated virtual population closely matches the target distribution.


7 Part 4: Multi-Objective Optimization

7.1 Motivation

The ILP from Part 3 requires choosing \(\epsilon\) before solving. What if we want to explore the full trade-off between population size and per-category accuracy without committing to a specific tolerance?

Multi-objective optimization lets us find the entire Pareto front directly, revealing all optimal trade-offs between two competing objectives:

  1. Maximize \(N_{\text{total}} = \sum x_i\) (number of selected VPs)
  2. Minimize the maximum per-category error (the same quantity \(\epsilon\) constrains)

7.2 Mathematical Formulation

We introduce a single continuous variable \(\Delta \geq 0\) representing the maximum absolute count displacement across all categories:

\[ \begin{align} \min \quad & [-N_{\text{total}},\; \Delta] \\ \text{s.t.} \quad & N_c - p_c \cdot N_{\text{total}} \leq \Delta & \forall\, c \\ & p_c \cdot N_{\text{total}} - N_c \leq \Delta & \forall\, c \\ & x_j \in \{0, 1\} \\ & \Delta \geq 0 \end{align} \]

The key relationship is \(\Delta / N_{\text{total}} = \max_c |N_c / N_{\text{total}} - p_c|\), which is exactly the quantity that \(\epsilon\) constrains in the ILP. So this formulation directly explores the Pareto front of \((N_{\text{total}},\, \epsilon_{\max})\).

7.3 Recovering the \(\epsilon_{\max}\)-Pareto Front

Using \(\Delta\) as the second objective keeps the problem as a MILP, solvable efficiently by HiGHS. But \(\Delta\) is not the metric we ultimately care about — \(\epsilon_{\max}\) is. The Pareto fronts in \((N_\text{total}, \Delta)\) and \((N_\text{total}, \epsilon_{\max})\) space are generally not identical.

Fortunately, the \(\Delta\)-Pareto front is guaranteed to contain all \(\epsilon_{\max}\)-Pareto-optimal solutions:

Note: \(\Delta\)-Dominance Implies \(\epsilon_{\max}\)-Dominance

If solution \(T\) dominates \(S\) in \((N_\text{total}, \Delta)\) space (i.e., \(N_T \geq N_S\) and \(\Delta_T \leq \Delta_S\), with at least one strict inequality), then:

\[ \epsilon_{\max,T} = \frac{\Delta_T}{N_T} \leq \frac{\Delta_S}{N_S} = \epsilon_{\max,S} \]

This holds because \(\Delta_T \leq \Delta_S\) and \(\frac{1}{N_T} \leq \frac{1}{N_S}\), so their product satisfies \(\frac{\Delta_T}{N_T} \leq \frac{\Delta_S}{N_S}\).

Consequence: every \(\epsilon_{\max}\)-Pareto-optimal solution is also \(\Delta\)-Pareto-optimal. We can therefore find the \(\Delta\)-Pareto front efficiently (as a MILP), compute \(\epsilon_{\max} = \Delta / N_\text{total}\) for each solution, and post-filter for \(\epsilon_{\max}\)-dominance to recover the exact \(\epsilon_{\max}\)-Pareto front.
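The dominance argument is easy to sanity-check numerically (arbitrary illustrative values, not solver output):

```julia
# ε_max for a solution with N_total selected VPs and maximum count displacement Δ
eps_max(N_total, Δ) = Δ / N_total

# T dominates S in (N_total, Δ) space: N_T ≥ N_S and Δ_T ≤ Δ_S
N_S, Δ_S = 200, 12.0
N_T, Δ_T = 250, 10.0

# Dominance carries over to ε_max: Δ_T / N_T ≤ Δ_S / N_S
eps_max(N_T, Δ_T) <= eps_max(N_S, Δ_S)  # true
```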

7.4 Multi-Objective Formulation with MultiObjectiveAlgorithms.jl

MultiObjectiveAlgorithms.jl (MOA) extends JuMP with multi-objective optimization algorithms. It wraps a single-objective solver (here, HiGHS) and orchestrates multiple subproblem solves to trace the Pareto front.

We use the ε-constraint method (MOA.EpsilonConstraint), which is well-suited for bi-objective integer programs. It iterates over the range of one objective, constraining it at each step while optimizing the other, systematically enumerating nondominated solutions.

Tip: Choosing the MOA Algorithm

For bi-objective problems, MOA offers several algorithms:

| Algorithm | What it finds | Best for |
| --- | --- | --- |
| EpsilonConstraint() | All nondominated points | General bi-objective (recommended) |
| Chalmet() | All nondominated points | Bi-criterion integer programs |
| Dichotomy() | Supported extreme points | Quick overview (may miss points) |
| TambyVanderpooten() | All nondominated points | Multi-objective (p ≥ 2) discrete programs |

EpsilonConstraint is the most intuitive and gives fine-grained control over the number of Pareto points via MOA.SolutionLimit().

moa_model = Model(() -> MOA.Optimizer(HiGHS.Optimizer))
set_attribute(moa_model, MOA.Algorithm(), MOA.EpsilonConstraint())
set_attribute(moa_model, MOA.SolutionLimit(), 50)
set_silent(moa_model)

# Binary selection variables
@variable(moa_model, y[1:nrow(vpop)], Bin)

# Single continuous variable: maximum absolute count displacement
@variable(moa_model, Δ >= 0)

# Total selected VPs
N_total = sum(y)

# Bi-objective: minimize [-N_total, Δ]
# (negate N_total since JuMP requires a uniform optimization sense)
@objective(moa_model, Min, [-N_total, Δ])

# Maximum displacement constraints
for row in eachrow(target_distribution)
    p_c = row.pct / 100
    N_c = sum(y[vpop.response .== row.response])
    @constraint(moa_model, N_c - p_c * N_total <= Δ)
    @constraint(moa_model, p_c * N_total - N_c <= Δ)
end

moa_model
A JuMP Model
├ solver: MOA[algorithm=MultiObjectiveAlgorithms.EpsilonConstraint, optimizer=HiGHS]
├ objective_sense: MIN_SENSE
│ └ objective_function_type: Vector{AffExpr}
├ num_variables: 2001
├ num_constraints: 2009
│ ├ JuMP.AffExpr in MOI.LessThan{Float64}: 8
│ ├ JuMP.VariableRef in MOI.GreaterThan{Float64}: 1
│ └ JuMP.VariableRef in MOI.ZeroOne: 2000
└ Names registered in the model
  └ :y, :Δ
optimize!(moa_model)
assert_is_solved_and_feasible(moa_model)

7.5 Extracting and Filtering the Pareto Front

We extract all nondominated solutions found by the ε-constraint method, compute the maximum absolute error for each, and filter for \(\epsilon_{\max}\)-Pareto-optimality:

n_solutions = result_count(moa_model)

moa_solutions_df = map(1:n_solutions) do i
    obj = objective_value(moa_model; result = i)
    nselected = round(Int, -obj[1])
    delta = obj[2]

    # Recover max absolute error: ε_max = Δ / N_total
    max_error = nselected > 0 ? delta / nselected : NaN

    (; nselected, max_error)
end |> DataFrame

We filter for \(\epsilon_{\max}\)-Pareto-optimality by removing solutions that are dominated in \((N_\text{total}, \epsilon_{\max})\) space:

sort!(moa_solutions_df, [order(:nselected; rev = true), :max_error])
moa_solutions_df.pareto_optimal .= false
moa_solutions_df.pareto_optimal[begin] = true
dominating_error = moa_solutions_df.max_error[begin]
for row in eachrow(moa_solutions_df)
    if row.max_error < dominating_error
        row.pareto_optimal = true
        dominating_error = row.max_error
    end
end

sort!(moa_solutions_df, :nselected; rev = true)
simple_table(
    moa_solutions_df,
    [
        :nselected => "# selected VPs",
        :max_error => "Max. absolute error",
        :pareto_optimal => "Pareto-Optimal",
    ],
)
# selected VPs Max. absolute error Pareto-Optimal
2000 0.347 true
1320 0.214 true
1282 0.213 true
1244 0.212 true
1206 0.211 true
1168 0.21 true
1130 0.208 true
1092 0.207 true
1054 0.205 true
1016 0.204 true
978 0.202 true
940 0.2 true
902 0.198 true
864 0.196 true
826 0.193 true
788 0.19 true
750 0.187 true
712 0.184 true
674 0.18 true
636 0.176 true
598 0.171 true
560 0.166 true
522 0.16 true
484 0.153 true
446 0.145 true
408 0.135 true
370 0.123 true
332 0.108 true
294 0.0901 true
256 0.0664 true
218 0.0344 true
180 0 true

7.6 Visualization

We visualize the Pareto front and overlay the ILP solution from Part 3:

specs = (
    data(@rsubset(moa_solutions_df, :pareto_optimal)) * mapping(color = direct("Pareto front")) +
        data((; max_error = [ilp_max_error], nselected = [nrow(ilp_vpop)])) * mapping(color = direct("ILP solution (ϵ = 0.05)"))
) * mapping(:max_error => "Max. absolute error", :nselected => "# selected VPs") * visual(Scatter)
draw(specs)

Multi-objective Pareto front showing the trade-off between population size and maximum per-category error. The ILP solution from Part 3 is overlaid for comparison.


The Pareto front reveals the full trade-off between population size and maximum per-category error. The ILP solution from Part 3 is a single point on (or near) this front, corresponding to the specific choice of \(\epsilon = 0.05\). The multi-objective approach recovers the entire curve without requiring \(\epsilon\) to be chosen a priori.

8 Part 5: Inter-Category Trade-offs

8.1 Why Allow Trade-offs Between Categories?

The per-category formulation from Part 3 (and its multi-objective extension in Part 4) treats all categories equally: no single category’s proportion can deviate from its target by more than \(\epsilon\). But what if you are willing to accept a slightly larger error in one category, as long as other categories compensate?

Constraining the mean total error (MTE) instead of the per-category maximum allows this trade-off. This leads to a mixed-integer linear program (MILP) that is strictly more flexible than the per-category ILP.

8.2 Mean Total Error

The mean total error across all response categories (Cortés‐Ríos et al., 2025) is defined as:

\[ \text{MTE} = \frac{1}{C} \sum_c \left| \frac{N_c}{N_{\text{total}}} - p_c \right| \]

where \(C\) is the number of categories. This measures the average absolute deviation between the observed and target proportions.

Note: Relationship to Total Variation Distance

The total variation distance between two discrete probability distributions is a standard metric defined as:

\[ \text{TV} = \frac{1}{2} \sum_c |q_c - p_c| \]

where \(q_c\) and \(p_c\) are the observed and target proportions. TV is always in \([0, 1]\) regardless of the number of categories.

The mean total error is simply a rescaled version: \(\text{MTE} = 2 \cdot \text{TV} / C\). In particular, this implies that the MTE is always in \([0, 2/C]\).
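A minimal sketch verifying this relationship on a hypothetical pair of distributions (both vectors sum to 1):

```julia
# Mean total error and total variation distance between observed (q) and
# target (p) category proportions.
mte(q, p) = sum(abs, q .- p) / length(p)
tv(q, p) = sum(abs, q .- p) / 2

p = [0.25, 0.40, 0.20, 0.15]   # target
q = [0.20, 0.35, 0.25, 0.20]   # observed (hypothetical)

mte(q, p)                              # ≈ 0.05
tv(q, p)                               # ≈ 0.10
mte(q, p) ≈ 2 * tv(q, p) / length(p)   # true: MTE = 2·TV/C
```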

8.3 Mathematical Formulation

As in Part 2, we maximize the number of selected VPs. But instead of per-category constraints, we bound the mean total error using auxiliary count deviation variables.

Using the MTE directly as a constraint is problematic: \(N_c / N_{\text{total}}\) is a ratio of decision variables, making the problem a nonlinear fractional program. Instead, we work with count deviations:

\[ \delta_c = |N_c - p_c \cdot N_{\text{total}}| \]

Since \(N_c - p_c \cdot N_\text{total} = \sum_{j \in c} x_j - p_c \sum_j x_j = \sum_j (\mathbb{1}_{j \in c} - p_c) \cdot x_j\), this is a linear function of the decision variables \(x_j\). The absolute value can be linearized with auxiliary variables \(\delta_c \geq 0\) and two-sided inequality constraints:

\[ \delta_c \geq N_c - p_c \cdot N_\text{total}, \qquad \delta_c \geq p_c \cdot N_\text{total} - N_c \]

The total count deviation \(D = \sum_c \delta_c\) is related to MTE by:

\[ D = C \cdot \text{MTE} \cdot N_\text{total} \]

so the constraint \(\text{MTE} \leq \epsilon\) is equivalent to:

\[ \sum_c \delta_c \leq C \cdot \epsilon \cdot N_\text{total} \]

TipSame Epsilon, Different Guarantees

Both formulations use \(\epsilon\) but with different interpretations:

  • The ILP (Part 3) enforces \(|q_c - p_c| \leq \epsilon\) for every category, which guarantees \(\text{MTE} \leq \epsilon\).
  • The MILP enforces \(\text{MTE} \leq \epsilon\) directly, allowing individual categories to exceed \(\epsilon\) as long as the average stays within the bound.

Since the ILP’s feasible set is a subset of the MILP’s, the MILP can always select at least as many VPs.
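A quick numeric example (hypothetical observed proportions) of a distribution that the MILP accepts but the per-category ILP rejects:

```julia
p = [0.25, 0.40, 0.20, 0.15]   # target proportions
q = [0.16, 0.41, 0.21, 0.22]   # observed (hypothetical)
ε = 0.05

diffs = abs.(q .- p)                 # ≈ [0.09, 0.01, 0.01, 0.07]
maximum(diffs) <= ε                  # false → violates the per-category bound
sum(diffs) / length(p) <= ε          # true  → satisfies MTE ≤ ε (MTE ≈ 0.045)
```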

8.4 Implementation

We implement the MILP with the same \(\epsilon = 0.05\) used in Part 3:

model_milp = Model(HiGHS.Optimizer)
set_silent(model_milp)
set_attribute(model_milp, "presolve", "on")
set_time_limit_sec(model_milp, 60.0)

# Decision variables
@variable(model_milp, x_milp[1:nrow(vpop)], Bin)
@variable(model_milp, δ[1:nrow(target_distribution)] >= 0)

# Objective: maximize number of selected VPs
N_total = sum(x_milp)
@objective(model_milp, Max, N_total)

# Count deviation constraints (linearized absolute values)
for (δ_c, row) in zip(δ, eachrow(target_distribution))
    p_c = row.pct / 100
    N_c = sum(x_milp[vpop.response .== row.response])
    @constraint(model_milp, δ_c >= N_c - p_c * N_total)
    @constraint(model_milp, δ_c >= p_c * N_total - N_c)
end

# Upper bound on mean total error: MTE ≤ ε
# Since sum(δ) = C · MTE · N_total, this is equivalent to: sum(δ) ≤ C · ε · N_total
@constraint(model_milp, sum(δ) <= nrow(target_distribution) * epsilon * N_total)

model_milp
A JuMP Model
├ solver: HiGHS
├ objective_sense: MAX_SENSE
│ └ objective_function_type: AffExpr
├ num_variables: 2004
├ num_constraints: 2013
│ ├ JuMP.AffExpr in MOI.GreaterThan{Float64}: 8
│ ├ JuMP.AffExpr in MOI.LessThan{Float64}: 1
│ ├ JuMP.VariableRef in MOI.GreaterThan{Float64}: 4
│ └ JuMP.VariableRef in MOI.ZeroOne: 2000
└ Names registered in the model
  └ :x_milp, :δ
optimize!(model_milp)
assert_is_solved_and_feasible(model_milp)

8.5 Results and Comparison with ILP

milp_selected = findall(xi -> value(xi) > 0.5, x_milp)  # > 0.5 guards against floating-point round-off
milp_vpop = vpop[milp_selected, :]
milp_distribution = @by milp_vpop :response begin
    :pct = 100 * length(:response) / nrow(milp_vpop)
end
milp_comparison_df = leftjoin(target_distribution, milp_distribution; on = :response, renamecols = "_target" => "_vpop")
@rtransform! milp_comparison_df :pct_vpop = @coalesce(:pct_vpop, 0.0)
@rtransform! milp_comparison_df :pct_diff = :pct_vpop - :pct_target

simple_table(
    milp_comparison_df,
    [:response => "Response", :pct_vpop => "Vpop (%)", :pct_target => "Target (%)", :pct_diff => "Difference (%)"]
)
Response Vpop (%) Target (%) Difference (%)
CR 15 25 -9.98
PR 40.3 40 0.256
SD 20.1 20 0.128
PD 24.6 15 9.6
milp_mte = mean(abs, milp_comparison_df.pct_diff) / 100
0.04992012779552717

We compare the ILP and MILP solutions:

ilp_mte = mean(abs, ilp_comparison_df.pct_diff) / 100
milp_max_error = maximum(abs, milp_comparison_df.pct_diff) / 100
summary_df = DataFrame(
    method = ["ILP (max. absolute error ≤ $epsilon)", "MILP (mean total error ≤ $epsilon)"],
    nselected = [nrow(ilp_vpop), nrow(milp_vpop)],
    max_error = [ilp_max_error, milp_max_error],
    mte = [ilp_mte, milp_mte],
)
simple_table(
    summary_df,
    [
        :method => "Method",
        :nselected => "# selected VPs",
        :max_error => "Max. absolute error",
        :mte => "Mean total error",
    ]
)
Method # selected VPs Max. absolute error Mean total error
ILP (max. absolute error ≤ 0.05) 235 0.05 0.0484
MILP (mean total error ≤ 0.05) 313 0.0998 0.0499

The MILP selects at least as many virtual patients as the ILP (here 313 vs. 235), because every ILP-feasible selection is also MILP-feasible. The inter-category trade-off lets the solver accept calibrated populations that the per-category constraints would reject, at the cost that the maximum per-category error may exceed \(\epsilon\) (here 0.0998).
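The trade-off is visible in the rounded per-category differences printed in the MILP comparison table; a quick Python check using those rounded values recovers both error metrics:

```python
# Per-category differences (in %) from the MILP comparison table above.
pct_diff = [-9.98, 0.256, 0.128, 9.6]

max_error = max(abs(d) for d in pct_diff) / 100
mte = sum(abs(d) for d in pct_diff) / len(pct_diff) / 100

# CR and PD individually exceed epsilon = 0.05 ...
assert max_error > 0.05
# ... but the mean total error stays within the bound.
assert mte <= 0.05
```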

8.6 Multi-Objective Optimization with MTE

Just as we used multi-objective optimization in Part 4 to explore the per-category error trade-off, we can do the same for the MTE-based formulation. Instead of a single \(\Delta\) variable, we use per-category count deviations \(\delta_c\) and minimize their sum \(D = \sum_c \delta_c\) as the second objective.

8.6.1 From MTE to Count Deviations

We cannot use MTE directly as an optimization objective because it involves a ratio of decision variables. Instead, we minimize the total count deviation \(D = \sum_c \delta_c\), which is related to MTE by \(D = C \cdot \text{MTE} \cdot N_\text{total}\).
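A small Python sketch with hypothetical counts verifies the identity \(D = C \cdot \text{MTE} \cdot N_\text{total}\) that links the count-deviation objective back to MTE:

```python
# Hypothetical selection: counts per category and target proportions
# (illustrative numbers, not the tutorial's data).
N_c = [30, 45, 15, 10]          # selected VPs per response category
p_c = [0.25, 0.40, 0.20, 0.15]  # target proportions

N_total = sum(N_c)              # 100 selected VPs in total
C = len(N_c)                    # number of response categories

# Count deviations delta_c = |N_c - p_c * N_total| and their sum D.
deltas = [abs(n - p * N_total) for n, p in zip(N_c, p_c)]
D = sum(deltas)

# MTE computed directly from the realized proportions ...
mte = sum(abs(n / N_total - p) for n, p in zip(N_c, p_c)) / C

# ... agrees with D / (C * N_total), the identity used in the constraint.
assert abs(D - C * mte * N_total) < 1e-9
```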

8.6.2 Recovering the MTE-Pareto Front

Note\(D\)-Dominance Implies MTE-Dominance

If solution \(T\) dominates \(S\) in \((N_\text{total}, D)\) space (i.e., \(N_T \geq N_S\) and \(D_T \leq D_S\), with at least one strict inequality), then:

\[ \text{MTE}_T = \frac{D_T}{C \cdot N_T} \leq \frac{D_S}{C \cdot N_S} = \text{MTE}_S \]

This holds because \(D_T \leq D_S\) and \(\frac{1}{N_T} \leq \frac{1}{N_S}\), so their product satisfies \(\frac{D_T}{N_T} \leq \frac{D_S}{N_S}\).

Consequence: every MTE-Pareto-optimal solution is also \(D\)-Pareto-optimal. We can therefore find the \(D\)-Pareto front efficiently (as a MILP), compute \(\text{MTE} = D / (C \cdot N_\text{total})\) for each solution, and post-filter for MTE-dominance to recover the exact MTE-Pareto front.
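The implication can be sanity-checked numerically; in this hypothetical Python example, solution \(T\) dominates \(S\) in \((N_\text{total}, D)\) space and, as the note predicts, also has the smaller MTE:

```python
# Two hypothetical solutions with C = 4 categories:
# T dominates S in (N_total, D) space (more VPs, smaller count deviation).
C = 4
N_S, D_S = 200, 44.0
N_T, D_T = 250, 40.0   # N_T >= N_S and D_T <= D_S

def mte(D, N):
    """Recover MTE from the total count deviation."""
    return D / (C * N)

# D-dominance carries over to MTE-dominance.
assert mte(D_T, N_T) <= mte(D_S, N_S)
```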

moa_mte_model = Model(() -> MOA.Optimizer(HiGHS.Optimizer))
set_attribute(moa_mte_model, MOA.Algorithm(), MOA.EpsilonConstraint())
set_attribute(moa_mte_model, MOA.SolutionLimit(), 50)
set_silent(moa_mte_model)

# Binary selection variables
@variable(moa_mte_model, y_mte[1:nrow(vpop)], Bin)

# Continuous count deviation variables (one per category)
@variable(moa_mte_model, δ_mte[1:nrow(target_distribution)] >= 0)

# Total selected VPs
N_total = sum(y_mte)

# Bi-objective: minimize [-N_total, D]
# (negate N_total since JuMP requires a uniform optimization sense)
@objective(moa_mte_model, Min, [-N_total, sum(δ_mte)])

# Count deviation constraints
for (δ_c, row) in zip(δ_mte, eachrow(target_distribution))
    p_c = row.pct / 100
    N_c = sum(y_mte[vpop.response .== row.response])
    @constraint(moa_mte_model, δ_c >= N_c - p_c * N_total)
    @constraint(moa_mte_model, δ_c >= p_c * N_total - N_c)
end

moa_mte_model
A JuMP Model
├ solver: MOA[algorithm=MultiObjectiveAlgorithms.EpsilonConstraint, optimizer=HiGHS]
├ objective_sense: MIN_SENSE
│ └ objective_function_type: Vector{AffExpr}
├ num_variables: 2004
├ num_constraints: 2012
│ ├ JuMP.AffExpr in MOI.GreaterThan{Float64}: 8
│ ├ JuMP.VariableRef in MOI.GreaterThan{Float64}: 4
│ └ JuMP.VariableRef in MOI.ZeroOne: 2000
└ Names registered in the model
  └ :y_mte, :δ_mte
optimize!(moa_mte_model)
assert_is_solved_and_feasible(moa_mte_model)

8.7 Extracting and Filtering the MTE-Pareto Front

n_mte_solutions = result_count(moa_mte_model)

moa_mte_solutions_df = map(1:n_mte_solutions) do i
    obj = objective_value(moa_mte_model; result = i)
    nselected = round(Int, -obj[1])
    count_dev = obj[2]

    # Recover MTE from count deviation: MTE = D / (C · N_total)
    mte = nselected > 0 ? count_dev / (nrow(target_distribution) * nselected) : NaN

    (; nselected, mte)
end |> DataFrame
# Remove MTE-dominated solutions
sort!(moa_mte_solutions_df, [order(:nselected; rev = true), :mte])
moa_mte_solutions_df.pareto_optimal .= false
moa_mte_solutions_df.pareto_optimal[begin] = true
dominating_error = moa_mte_solutions_df.mte[begin]
for row in eachrow(moa_mte_solutions_df)
    if row.mte < dominating_error
        row.pareto_optimal = true
        dominating_error = row.mte
    end
end

sort!(moa_mte_solutions_df, :nselected; rev = true)
simple_table(
    moa_mte_solutions_df,
    [
        :nselected => "# selected VPs",
        :mte => "Mean total error",
        :pareto_optimal => "Pareto-optimal",
    ],
)
# selected VPs Mean total error Pareto-optimal
2000 0.204 true
1966 0.201 true
1928 0.199 true
1890 0.196 true
1852 0.194 true
1814 0.191 true
1776 0.188 true
1738 0.185 true
1700 0.182 true
1662 0.179 true
1624 0.175 true
1586 0.172 true
1548 0.168 true
1510 0.164 true
1472 0.16 true
1434 0.156 true
1396 0.151 true
1358 0.146 true
1320 0.141 true
1282 0.135 true
1244 0.13 true
1206 0.124 true
1168 0.117 true
1130 0.11 true
1092 0.103 true
1054 0.103 true
1016 0.102 true
978 0.101 true
940 0.1 true
902 0.0989 true
864 0.0978 true
826 0.0965 true
788 0.0952 true
750 0.0937 true
712 0.092 true
674 0.0901 true
636 0.0881 true
598 0.0857 true
560 0.083 true
522 0.08 true
484 0.0764 true
446 0.0723 true
408 0.0674 true
370 0.0615 true
332 0.0542 true
294 0.0451 true
256 0.0332 true
218 0.0172 true
180 0 true
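The dominance filter used above reduces to a single sorted sweep. Here is a minimal Python sketch of the same idea with hypothetical candidate points: sort by population size descending and error ascending, then keep a point only when it strictly improves the best error seen so far:

```python
# Candidate (nselected, mte) pairs (hypothetical values).
solutions = [(300, 0.06), (300, 0.05), (250, 0.05),
             (250, 0.04), (200, 0.07), (150, 0.02)]

# Sort by nselected descending, then mte ascending (ties keep the lower error).
solutions.sort(key=lambda s: (-s[0], s[1]))

pareto = []
best_mte = float("inf")
for nsel, mte in solutions:
    if mte < best_mte:          # strictly improves error => Pareto-optimal
        pareto.append((nsel, mte))
        best_mte = mte

assert pareto == [(300, 0.05), (250, 0.04), (150, 0.02)]
```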

8.7.1 Visualization

We visualize the MTE-Pareto front and overlay the MILP solution:

specs = (
    data(@rsubset(moa_mte_solutions_df, :pareto_optimal)) * mapping(
        color = direct("Pareto front (mean total error ≤ $epsilon)")
    ) +
        data((; mte = [milp_mte], nselected = [nrow(milp_vpop)])) * mapping(
        color = direct("MILP solution (mean total error ≤ $epsilon)")
    )
) * mapping(
    :mte => "Mean total error", :nselected => "# selected VPs"
) * visual(Scatter)
draw(specs)

Multi-objective Pareto front showing the trade-off between population size and mean total error. The MILP solution is overlaid for comparison.

The MTE-Pareto front shows how population size trades off against distribution error when inter-category compensation is allowed. The MILP solution from earlier in this section is a single point on (or near) this front. Comparing it with the max-displacement Pareto front from Part 4 shows how the two error metrics lead to different trade-off landscapes.

9 Part 6: Practical Recommendations

TipChoosing a Calibration Strategy

The calibration approaches offer different trade-offs:

  • Per-category ε constraints (Part 3): Simple, interpretable, and guarantees that each response category individually matches the target within ε. Use this when per-category accuracy matters (e.g., matching the CR rate within 5%).
  • Multi-objective with max displacement (Part 4): Finds the full Pareto front of population size vs. maximum per-category error without choosing ε a priori. Use this to explore the trade-off space before committing to a tolerance.
  • MILP with MTE bound (Part 5): More flexible, allows inter-category trade-offs while bounding the overall distribution error. This is the recommended approach for VCT simulations (see Tutorial 6).
  • Multi-objective with MTE (Part 5): Explores the MTE-based trade-off without choosing ε a priori.
TipChoosing Epsilon

When using per-category ε constraints (Part 3) or MTE-bounded MILP (Part 5), start with ε = 0.10 (10% tolerance). Note that ε has different interpretations: the ILP guarantees \(|q_c - p_c| \leq \epsilon\) per category, while the MILP enforces \(\text{MTE} \leq \epsilon\) on average:

Epsilon Effect
0.05 (5%) Very strict, fewer VPs selected
0.10 (10%) Good balance (recommended)
0.15-0.20 More VPs, acceptable for preliminary work
> 0.25 Too loose, poor distribution matching
TipHandling Infeasible Problems

If the solver returns INFEASIBLE:

  1. Increase epsilon from 0.10 to 0.15 or 0.20
  2. Check target achievability - do you have VPs in all response categories?
  3. Merge sparse categories - combine SD and PD if both are rare
  4. Increase Vpop size - generate more VPs to have more options
WarningWhen Categories Are Imbalanced

If your Vpop has very few VPs in a target category (e.g., only 2% CR but target is 25% CR):

  • Calibration may fail or reject most VPs
  • Consider: Is the Vpop generation realistic?
  • Solution: Adjust population parameters or use importance sampling

10 Summary

10.1 Key Concepts

  1. Calibration ensures realistic virtual populations that match clinical response distributions

  2. ILP optimization provides precise control over distribution matching through binary selection variables and per-category constraints

  3. Multi-objective optimization with max displacement finds the full Pareto front of population size vs. maximum per-category error without pre-specifying ε

  4. MILP with MTE bound allows inter-category trade-offs by constraining the mean total error rather than individual categories

  5. Multi-objective optimization with MTE explores the MTE-based trade-off using count-deviation objectives and MTE post-filtering

10.2 Quick Reference: Steps

  1. Generate virtual population (Tutorial 4)
  2. Simulate responses for each VP
  3. Define target response distribution
  4. Choose epsilon and formulation (ILP, MILP, or multi-objective)
  5. Run calibration
  6. Validate: compare calibrated vs target distributions
  7. Save calibrated vpop for VCT (Tutorial 6)

10.3 What’s Next?

In Tutorial 6: VCT Simulation, we’ll use the calibrated virtual population to run a complete Virtual Clinical Trial, comparing treatment effects and analyzing endpoints.

10.4 References

Cortés-Ríos, J., Magee, M., Sher, A., Jusko, W. J., & Desikan, R. (2025). A step-by-step workflow for performing in silico clinical trials with nonlinear mixed effects models. CPT: Pharmacometrics & Systems Pharmacology. https://doi.org/10.1002/psp4.70122
Cortés-Ríos, J., Ren, T., Hanan, N., Sher, A., Nader, A., Magee, M., Jusko, W. J., & Desikan, R. (2025). Hepatitis B in silico trials capture functional cure, indicate mechanistic pathways, and suggest prognostic biomarker signatures. Clinical Pharmacology & Therapeutics, 118(3), 600–612. https://doi.org/10.1002/cpt.3718
Dunning, I., Huchette, J., & Lubin, M. (2017). JuMP: A modeling language for mathematical optimization. SIAM Review, 59(2), 295–320. https://doi.org/10.1137/15M1020575
Huangfu, Q., & Hall, J. A. J. (2018). Parallelizing the dual revised simplex method. Mathematical Programming Computation, 10(1), 119–142. https://doi.org/10.1007/s12532-017-0130-5
Qi, T., & Cao, Y. (2022). Virtual clinical trials: A tool for predicting patients who may benefit from treatment beyond progression with pembrolizumab in non-small cell lung cancer. CPT: Pharmacometrics & Systems Pharmacology, 12(2), 236–249. https://doi.org/10.1002/psp4.12896