Cleans epop data, downloaded using the wcde() function, for summations of population by 4, 6 or 8 education groups.
Usage
edu_group_sum(
d = NULL,
n = 4,
strip_totals = TRUE,
factor_convert = TRUE,
year_edu_start = 2020
)
Arguments
- d
Data frame of epop data downloaded using the wcde() function
- n
Number of education groups (from 4, 6 or 8)
- strip_totals
Remove total sums in the epop column. Education totals are not stripped if year < year_edu_start and n = 8, as past data on population size by 8 education groups is unavailable.
- factor_convert
Convert columns that are character strings to factors, with levels based on order of appearance.
- year_edu_start
Year from which education splits are available for the given groupings; in some versions past data is not available for some education groupings. Defaults to 2020.
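The rules described for strip_totals and factor_convert can be illustrated with a small base R sketch. This is not the package's implementation: the helpers strip_edu_totals() and to_factor(), the toy data values, and the use of "Total" as the label for total rows are all assumptions made for illustration.

```r
# Toy data: hypothetical values, not real WCDE figures.
# The "Total" label for summed rows is an assumption for this sketch.
toy <- data.frame(
  year = c(2015, 2015, 2020, 2020, 2020),
  education = c("Total", "Primary", "Total", "Primary", "Secondary"),
  epop = c(10, 10, 12, 5, 7),
  stringsAsFactors = FALSE
)

# Hypothetical helper mirroring the documented strip_totals rule:
# drop education totals, except for years before year_edu_start when n = 8.
strip_edu_totals <- function(d, n = 4, year_edu_start = 2020) {
  is_total <- d$education == "Total"
  keep_total <- n == 8 & d$year < year_edu_start
  d[!is_total | keep_total, , drop = FALSE]
}

# Hypothetical helper mirroring the documented factor_convert rule:
# factor levels follow the order of first appearance, not alphabetical order.
to_factor <- function(x) factor(x, levels = unique(x))

res4 <- strip_edu_totals(toy, n = 4)  # all "Total" rows removed
res8 <- strip_edu_totals(toy, n = 8)  # the 2015 "Total" row is kept
edu_f <- to_factor(res4$education)
```

With n = 4 every total row is dropped, while with n = 8 the total for the year before year_edu_start survives, because no 8-group breakdown exists to replace it.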
Examples
library(tidyverse)
#> ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
#> ✔ dplyr 1.1.0 ✔ readr 2.1.4
#> ✔ forcats 1.0.0 ✔ stringr 1.5.0
#> ✔ ggplot2 3.4.1 ✔ tibble 3.2.1
#> ✔ lubridate 1.9.2 ✔ tidyr 1.3.0
#> ✔ purrr 1.0.2
#> ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
#> ✖ dplyr::filter() masks stats::filter()
#> ✖ dplyr::lag() masks stats::lag()
#> ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
past_epop %>%
  filter(year == 2020) %>%
  edu_group_sum()
#> # A tibble: 42,210 × 7
#> name country_code year age sex education epop
#> <fct> <dbl> <dbl> <fct> <fct> <fct> <dbl>
#> 1 Bulgaria 100 2020 0--4 Male Under 15 161.
#> 2 Bulgaria 100 2020 0--4 Male No Education 0
#> 3 Bulgaria 100 2020 0--4 Male Primary 0
#> 4 Bulgaria 100 2020 0--4 Male Secondary 0
#> 5 Bulgaria 100 2020 0--4 Male Post Secondary 0
#> 6 Bulgaria 100 2020 0--4 Female Under 15 152.
#> 7 Bulgaria 100 2020 0--4 Female No Education 0
#> 8 Bulgaria 100 2020 0--4 Female Primary 0
#> 9 Bulgaria 100 2020 0--4 Female Secondary 0
#> 10 Bulgaria 100 2020 0--4 Female Post Secondary 0
#> # … with 42,200 more rows