Estimate Migration Flows to Match Net Totals via Iterative Proportional Fitting
Source: R/net_matrix_ipf.R (net_matrix_ipf.Rd)
The net_matrix_ipf() function finds the maximum likelihood estimates for a flow matrix under the multiplicative log-linear model:
$$\log y_{ij} = \log \alpha_i + \log \alpha_j^{-1} + \log m_{ij}$$
where \(y_{ij}\) is the estimated migration flow from origin \(i\) to destination \(j\), and \(m_{ij}\) is the prior flow.
The function iteratively adjusts origin and destination scaling factors (\(\alpha\)) to match directional net migration totals.
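As a sketch of how a per-region update can follow from this model, the equation can be rewritten on the original scale and substituted into the net constraint (inflows minus outflows). The symbols \(n_k\), \(A_k\) and \(B_k\) are introduced here for illustration only, diagonal flows are assumed to be zero, and the function's internal parameterization may differ from this reading.

$$y_{ij} = \frac{\alpha_i}{\alpha_j} m_{ij}$$

$$n_k = \sum_{i} y_{ik} - \sum_{j} y_{kj} = \frac{A_k}{\alpha_k} - \alpha_k B_k, \qquad A_k = \sum_{i} \alpha_i m_{ik}, \qquad B_k = \sum_{j} \frac{m_{kj}}{\alpha_j}$$

$$B_k \alpha_k^2 + n_k \alpha_k - A_k = 0 \quad \Longrightarrow \quad \alpha_k = \frac{-n_k + \sqrt{n_k^2 + 4 A_k B_k}}{2 B_k}$$

Holding the other scaling factors fixed, each \(\alpha_k\) therefore solves a quadratic. When \(A_k\) and \(B_k\) are both positive, the product of the two roots is \(-A_k / B_k < 0\), so under this particular reading exactly one root is positive.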
Arguments
- net_tot
A numeric vector of net migration totals for each region. Must sum to zero.
- m
A square numeric matrix providing prior flow estimates. Must have dimensions length(net_tot) × length(net_tot).
- zero_mask
A logical matrix of the same dimensions as m, where TRUE indicates forbidden (structurally zero) flows. Defaults to disallowing diagonal flows.
- maxit
Maximum number of iterations to perform. Default is 500.
- tol
Convergence tolerance based on maximum change in \(\alpha\) between iterations. Default is 1e-6.
- verbose
Logical flag to print progress and \(\alpha\) updates during iterations. Default is FALSE.
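The requirements on these arguments can be checked before fitting. The following is a minimal base R sketch with made-up values for three hypothetical regions (X, Y, Z); it only builds inputs and a custom zero_mask, and the final call is left as a comment because it assumes the package providing net_matrix_ipf() is attached.

```r
# Prior flows for three hypothetical regions (illustrative values only)
regions <- c("X", "Y", "Z")
m <- matrix(c( 0, 10, 20,
              15,  0,  5,
               8, 12,  0),
            nrow = 3, byrow = TRUE,
            dimnames = list(orig = regions, dest = regions))
net_tot <- c(5, -2, -3)

# TRUE marks structurally zero flows: start from the diagonal
# (the documented default) and additionally forbid the X -> Z flow
zero_mask <- matrix(FALSE, nrow(m), ncol(m), dimnames = dimnames(m))
diag(zero_mask) <- TRUE
zero_mask["X", "Z"] <- TRUE

# Checks mirroring the documented requirements
stopifnot(
  isTRUE(all.equal(sum(net_tot), 0)),   # net totals must sum to zero
  nrow(m) == ncol(m),                   # m must be square
  length(net_tot) == nrow(m),           # one net total per region
  is.logical(zero_mask),
  identical(dim(zero_mask), dim(m))
)

# The inputs could then be passed on, for example:
# net_matrix_ipf(net_tot = net_tot, m = m, zero_mask = zero_mask)
```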
Value
A named list with components:
n
Estimated matrix of flows satisfying the net constraints.
it
Number of iterations used.
tol
Convergence tolerance used.
value
Sum of squared residuals between actual and target net flows.
convergence
Logical indicator of convergence within tolerance.
message
Text description of convergence result.
Details
The function avoids matrix inversion by updating \(\alpha\) using a closed-form solution to a quadratic equation at each step. Only directional net flows (column sums minus row sums) are matched, not marginal totals. Flows are constrained to be non-negative. If multiple positive roots are available when solving the quadratic, the smaller root is selected for improved stability.
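The update described above can be illustrated with a short base R sketch. It is not the package's implementation: ipf_net_sketch() is a hypothetical helper that assumes y[i, j] = alpha[i] * m[i, j] / alpha[j], zero diagonal flows, and strictly positive row and column totals in m. Under those assumptions each quadratic has a single positive root, so the sketch takes that root and does not reproduce any root-selection rule the function itself applies.

```r
# Hypothetical helper, not the package's code: cycle through regions and
# solve one quadratic per region until the scaling factors stop changing.
ipf_net_sketch <- function(net_tot, m, maxit = 500, tol = 1e-6) {
  stopifnot(nrow(m) == ncol(m), length(net_tot) == nrow(m),
            abs(sum(net_tot)) < 1e-8)
  diag(m) <- 0                       # assumption: diagonal flows are forbidden
  R <- length(net_tot)
  alpha <- rep(1, R)
  converged <- FALSE
  for (it in seq_len(maxit)) {
    alpha_old <- alpha
    for (k in seq_len(R)) {
      A <- sum(m[, k] * alpha)       # inflow to k equals A / alpha[k]
      B <- sum(m[k, ] / alpha)       # outflow from k equals alpha[k] * B
      # Positive root of B * a^2 + net_tot[k] * a - A = 0
      alpha[k] <- (-net_tot[k] + sqrt(net_tot[k]^2 + 4 * A * B)) / (2 * B)
    }
    if (max(abs(alpha - alpha_old)) < tol) {
      converged <- TRUE
      break
    }
  }
  y <- outer(alpha, 1 / alpha) * m   # y[i, j] = alpha[i] * m[i, j] / alpha[j]
  list(n = y, it = it, alpha = alpha, convergence = converged)
}

# Usage with the prior matrix and net totals from the Examples section
m <- matrix(c( 0, 100, 30, 70,
              50,   0, 45,  5,
              60,  35,  0, 40,
              20,  25, 20,  0),
            nrow = 4, byrow = TRUE,
            dimnames = list(orig = LETTERS[1:4], dest = LETTERS[1:4]))
net <- c(30, 40, -15, -55)
fit <- ipf_net_sketch(net_tot = net, m = m)

# Fitted net totals: column sums (inflows) minus row sums (outflows)
round(colSums(fit$n) - rowSums(fit$n), 2)
```

Because the scaling factors enter only as ratios, they are determined up to a common multiplier; the sketch leaves them unnormalized, which does not affect the fitted flows.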
See also
net_matrix_entropy() for entropy-based estimation minimizing KL divergence, net_matrix_lp() for L1-loss linear programming, and net_matrix_optim() for least-squares (L2) optimization.
Examples
m <- matrix(c( 0, 100, 30, 70,
              50,   0, 45,  5,
              60,  35,  0, 40,
              20,  25, 20,  0),
            nrow = 4, byrow = TRUE,
            dimnames = list(orig = LETTERS[1:4], dest = LETTERS[1:4]))
addmargins(m)
#>      dest
#> orig    A   B  C   D Sum
#>   A     0 100 30  70 200
#>   B    50   0 45   5 100
#>   C    60  35  0  40 135
#>   D    20  25 20   0  65
#>   Sum 130 160 95 115 500
sum_region(m)
#> # A tibble: 4 × 5
#>   region out_mig in_mig  turn   net
#>   <chr>    <dbl>  <dbl> <dbl> <dbl>
#> 1 A          200    130   330   -70
#> 2 B          100    160   260    60
#> 3 C          135     95   230   -40
#> 4 D           65    115   180    50
net <- c(30, 40, -15, -55)
result <- net_matrix_ipf(net_tot = net, m = m)
result$n |>
  addmargins() |>
  round(2)
#>      dest
#> orig       A      B      C     D    Sum
#>   A     0.00  80.58  26.05 34.78 141.41
#>   B    62.05   0.00  48.48  3.08 113.62
#>   C    69.11  32.48   0.00 22.89 124.48
#>   D    40.25  40.55  34.95  0.00 115.75
#>   Sum 171.41 153.62 109.48 60.75 495.26
sum_region(result$n)
#> # A tibble: 4 × 5
#>   region out_mig in_mig  turn   net
#>   <chr>    <dbl>  <dbl> <dbl> <dbl>
#> 1 A         141.  171.   313.  30.0
#> 2 B         114.  154.   267.  40.0
#> 3 C         124.  109.   234. -15.0
#> 4 D         116.   60.8  177. -55.0
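The printed result can be checked by hand against the model. The sketch below uses only base R and the rounded values shown above (so agreement is approximate); it confirms that inflows minus outflows match the requested net totals and that, because the \(\alpha\) terms cancel in the product of opposite flows when \(y_{ij} = \alpha_i m_{ij} / \alpha_j\), each pairwise product \(y_{ij} y_{ji}\) stays close to \(m_{ij} m_{ji}\).

```r
regions <- LETTERS[1:4]

# Prior flows from the example
m <- matrix(c( 0, 100, 30, 70,
              50,   0, 45,  5,
              60,  35,  0, 40,
              20,  25, 20,  0),
            nrow = 4, byrow = TRUE,
            dimnames = list(orig = regions, dest = regions))

# Fitted flows copied from the printed output (rounded to 2 decimals)
y <- matrix(c( 0.00, 80.58, 26.05, 34.78,
              62.05,  0.00, 48.48,  3.08,
              69.11, 32.48,  0.00, 22.89,
              40.25, 40.55, 34.95,  0.00),
            nrow = 4, byrow = TRUE, dimnames = dimnames(m))
net <- c(30, 40, -15, -55)

# Net migration = inflows (column sums) minus outflows (row sums)
net_fit <- colSums(y) - rowSums(y)
max(abs(net_fit - net))

# Ratio of y_ij * y_ji to m_ij * m_ji for each pair of regions;
# values near 1 are consistent with the multiplicative model
prod_ratio <- (y * t(y))[upper.tri(y)] / (m * t(m))[upper.tri(m)]
range(prod_ratio)
```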