Iterative proportional fitting routine for the indirect estimation of origin-destination migration flow tables with known margins.
Source:R/ipf2.R
The ipf2 function finds the maximum likelihood estimates for fitted values in the log-linear model:
$$\log y_{ij} = \log \alpha_{i} + \log \beta_{j} + \log m_{ij} $$
where \(m_{ij}\) is a set of prior estimates for \(y_{ij}\), and the structure of \(m_{ij}\) is itself no more complex than the model being fitted.
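Exponentiating the model gives its multiplicative form, which is what iterative proportional fitting works with. As a sketch, writing \(r_{i}\) and \(c_{j}\) for the known row and column totals (symbols introduced here for illustration, not used elsewhere on this page), one standard way to write the alternating updates at iteration \(t\) is:
$$y_{ij} = \alpha_{i} \beta_{j} m_{ij}, \qquad \alpha_{i}^{(t+1)} = \frac{r_{i}}{\sum_{j} \beta_{j}^{(t)} m_{ij}}, \qquad \beta_{j}^{(t+1)} = \frac{c_{j}}{\sum_{i} \alpha_{i}^{(t+1)} m_{ij}} $$
The order in which ipf2 visits the two margins internally may differ from this; the fixed point is the same either way.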
Arguments
- row_tot
Vector of origin totals to constrain the sum of the imputed cell rows.
- col_tot
Vector of destination totals to constrain the sum of the imputed cell columns.
- m
Matrix of auxiliary data. By default set to 1 for all origin-destination combinations.
- tol
Numeric value for the tolerance level used in the parameter estimation.
- maxit
Numeric value for the maximum number of iterations used in the parameter estimation.
- verbose
Logical value indicating whether to print the parameter estimates at each iteration. By default FALSE.
Value
Iterative Proportional Fitting routine set up in a similar manner to Agresti (2002, p.343). This is equivalent to a conditional maximization of the likelihood, as discussed by Willekens (1999), and hence provides identical indirect estimates to those obtained from the cm2 routine.
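To make the procedure concrete, the following is a minimal base-R sketch of a two-margin iterative proportional fitting loop. It is not the package's source code: the function name ipf_sketch, its default tol and maxit values, and its stopping rule (largest cell change between iterations) are all assumptions made for illustration.

```r
# Hypothetical helper, NOT part of the package: a bare-bones two-margin IPF.
# Default tol/maxit values and the stopping rule are illustrative assumptions.
ipf_sketch <- function(row_tot, col_tot, m, tol = 1e-05, maxit = 500) {
  mu <- m
  it <- 0
  d <- Inf
  while (d > tol && it < maxit) {
    mu_old <- mu
    # rescale each row so that the row sums match row_tot
    mu <- mu * (row_tot / rowSums(mu))
    # rescale each column so that the column sums match col_tot
    mu <- sweep(mu, 2, col_tot / colSums(mu), "*")
    # largest absolute change in any cell over this iteration
    d <- max(abs(mu - mu_old))
    it <- it + 1
  }
  list(mu = mu, it = it, tol = d)
}

# Willekens (1999) data, as used in the Examples section
m <- matrix(c(5, 1, 2, 7), ncol = 2)
fit <- ipf_sketch(row_tot = c(18, 20), col_tot = c(16, 22), m = m)
round(fit$mu, 2)
```

Because the fitted table is determined by the margins and the odds ratios of m, this sketch should agree with the first example on this page up to the chosen tolerance.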
The user must ensure that the row totals and the column totals have the same sum. The dimensions of the auxiliary matrix (m) must also match the lengths of the row and column totals.
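These two requirements can be checked before calling the routine. The helper below is a hypothetical pre-flight check written for this page, not a function exported by the package; it assumes both margins are supplied.

```r
# Hypothetical pre-flight check, NOT part of the package.
check_ipf_inputs <- function(row_tot, col_tot, m) {
  stopifnot(
    # margins must agree on the grand total (up to floating point error)
    isTRUE(all.equal(sum(row_tot), sum(col_tot))),
    # one row of m per origin total, one column per destination total
    nrow(m) == length(row_tot),
    ncol(m) == length(col_tot)
  )
  invisible(TRUE)
}

m <- matrix(c(5, 1, 2, 7), ncol = 2)
check_ipf_inputs(row_tot = c(18, 20), col_tot = c(16, 22), m = m)
```

The call stops with an error if any condition fails, for example when the totals sum to 38 and 37.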
If only one of the margins is known, the function can still be run. The indirect estimates will correspond to the log-linear model without the \(\alpha_{i}\) term if row_tot = NULL, or without the \(\beta_{j}\) term if col_tot = NULL.
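With a single margin no iteration is needed in principle. Dropping the \(\beta_{j}\) term leaves \(y_{ij} = \alpha_{i} m_{ij}\), so each row of m is simply rescaled to its known total. The snippet below is a base-R sketch of that calculation for the row-totals-only case, using the same data as the last example on this page; it does not call ipf2, and the object name mu_rows is hypothetical.

```r
# Sketch of the row-totals-only case (col_tot = NULL), without calling ipf2.
m <- matrix(c(5, 1, 2, 7), ncol = 2)
row_tot <- c(18, 20)

# alpha_i = row_tot[i] / sum_j m[i, j]; multiply each row of m by its alpha_i
mu_rows <- m * (row_tot / rowSums(m))

rowSums(mu_rows)  # reproduces row_tot: 18 20
round(mu_rows)
```

The column sums of mu_rows are left unconstrained, which is why they need not match any destination totals.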
Returns a list object with
- mu
Origin-Destination matrix of indirect estimates
- it
Iteration count
- tol
Tolerance level at final iteration
References
Agresti, A. (2002). Categorical Data Analysis (2nd ed.). Wiley.
Willekens, F. (1999). Modelling Approaches to the Indirect Estimation of Migration Flows: From Entropy to EM. Mathematical Population Studies, 7(3), 239-278.
Examples
## with Willekens (1999) data
dn <- LETTERS[1:2]
y <- ipf2(row_tot = c(18, 20), col_tot = c(16, 22),
          m = matrix(c(5, 1, 2, 7), ncol = 2,
                     dimnames = list(orig = dn, dest = dn)))
round(addmargins(y$mu), 2)
#>      dest
#> orig      A     B Sum
#>   A   13.25  4.75  18
#>   B    2.75 17.25  20
#>   Sum 16.00 22.00  38
## with all elements of offset equal
y <- ipf2(row_tot = c(18, 20), col_tot = c(16, 22))
round(addmargins(y$mu), 2)
#>                 Sum
#>      7.58 10.42  18
#>      8.42 11.58  20
#> Sum 16.00 22.00  38
## with bigger matrix
dn <- LETTERS[1:3]
y <- ipf2(row_tot = c(170, 120, 410), col_tot = c(500, 140, 60),
          m = matrix(c(50, 10, 220, 120, 120, 30, 545, 0, 10), ncol = 3,
                     dimnames = list(orig = dn, dest = dn)))
# display with row and col totals
round(addmargins(y$mu))
#>      dest
#> orig    A   B  C Sum
#>   A    72  40 59 170
#>   B    32  88  0 120
#>   C   396  12  1 410
#>   Sum 500 140 60 700
## only one margin known
dn <- LETTERS[1:2]
y <- ipf2(row_tot = c(18, 20), col_tot = NULL,
          m = matrix(c(5, 1, 2, 7), ncol = 2,
                     dimnames = list(orig = dn, dest = dn)))
round(addmargins(y$mu))
#>      dest
#> orig   A  B Sum
#>   A   13  5  18
#>   B    2 18  20
#>   Sum 15 23  38