widr

R interface to the World Inequality Database (WID). This package downloads Distributional National Accounts (DINA) data, validates and decodes WID variable codes, and provides helpers for currency conversion, inequality measurement, and plotting.

Installation

# CRAN
install.packages("widr")

# Development version
remotes::install_github("cherylisabella/widr")

Variable codes

WID variables follow the grammar <type><concept>[age][pop]:

Component  Width        Example  Meaning
type       1 letter     s        share
concept    5–6 letters  ptinc    pre-tax national income
age        3 digits     992      adults 20+
pop        1 letter     j        equal-split between spouses

sptinc992j = share of pre-tax national income for equal-split adults aged 20+.

wid_decode("sptinc992j")
#> $series_type  "s"
#> $concept      "ptinc"
#> $age          "992"
#> $pop          "j"

wid_encode("s", "ptinc", age = "992", pop = "j")   # "sptinc992j"
wid_is_valid(series_type = "s", concept = "ptinc")  # TRUE
wid_search("national income")                        # search the concept table

Full catalogue: https://wid.world/codes-dictionary/
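
To make the grammar concrete, here is a minimal base-R sketch that splits a code into its four components using the widths from the table above. `parse_wid_code()` is a hypothetical helper written for illustration, not part of widr, and it assumes that a population letter only ever follows an age code. Unlike `wid_decode()`, it does not check the parts against the reference tables.

```r
# Hypothetical helper for illustration only; not part of widr.
parse_wid_code <- function(code) {
  stopifnot(is.character(code), length(code) == 1)
  # 1 type letter + 5-6 concept letters, then optionally 3 digits and 1 letter
  if (!grepl("^[a-z]{6,7}([0-9]{3}[a-z]?)?$", code)) {
    stop("not a well-formed WID code: ", code)
  }
  letters_part <- sub("[0-9].*$", "", code)                 # type + concept
  suffix <- substring(code, nchar(letters_part) + 1)        # "" or age (+ pop)
  list(
    series_type = substr(letters_part, 1, 1),
    concept     = substring(letters_part, 2),
    age         = if (nzchar(suffix)) substr(suffix, 1, 3) else NA_character_,
    pop         = if (nchar(suffix) == 4) substr(suffix, 4, 4) else NA_character_
  )
}

str(parse_wid_code("sptinc992j"))
```

Codes without an age or population suffix come back with `NA` in those slots, so the result always has the same four fields.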

Download data

library(widr)

# Top 1% pre-tax income share, United States, 2000–2022
top1 <- download_wid(
  indicators = "sptinc992j",
  areas      = "US",
  perc       = "p99p100",
  years      = 2000:2022)

top1
#> <wid_df>  23 rows | 1 countries | 1 variables
#>   country   variable percentile year  value age pop
#> 1      US sptinc992j  p99p100   2000  0.168 992   j

download_wid() returns a wid_df, a data.frame subclass that works natively with dplyr, ggplot2, and base R without any $data unwrapping.
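
As a rough illustration of why a data.frame subclass needs no unwrapping, the sketch below builds a toy object with an extra S3 class in front of "data.frame". `new_wid_df()` is a hypothetical constructor (widr's internals may differ), and the values are made up, not WID data.

```r
# Hypothetical constructor for illustration; widr's real internals may differ.
new_wid_df <- function(df) {
  stopifnot(is.data.frame(df))
  class(df) <- c("wid_df", class(df))  # prepend the subclass, keep "data.frame"
  df
}

# Toy values, not real WID figures
toy <- new_wid_df(data.frame(
  country = c("US", "US"),
  year    = c(2000L, 2001L),
  value   = c(0.1, 0.2)
))

inherits(toy, "data.frame")   # data.frame methods still dispatch on it
toy[toy$year >= 2001, ]       # base subsetting works directly on the object
mean(toy$value)               # columns are reachable with $ as usual
```

Because "data.frame" stays in the class vector, any function written for data frames sees an ordinary data frame.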

Key parameters

Parameter               Default  Description
indicators              "all"    Variable codes
areas                   "all"    ISO-2 country / region codes
years                   "all"    Integer vector or "all"
perc                    "all"    Percentile codes, e.g. "p99p100"
ages                    "992"    Three-digit age code
pop                     "j"      Population unit
metadata                FALSE    Attach source info as attr(., "wid_meta")
include_extrapolations  TRUE     Include interpolated or extrapolated points
cache                   TRUE     Cache responses to disk
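
A percentile code such as "p99p100" names the lower and upper bound of a bracket, which is why it selects the top 1% in the example above. The base-R sketch below shows one way to read such a code; `parse_percentile()` is a hypothetical helper, not a widr function, and accepting decimal bounds is an assumption.

```r
# Hypothetical helper for illustration only; not part of widr.
parse_percentile <- function(code) {
  m <- regmatches(code, regexec("^p([0-9.]+)p([0-9.]+)$", code))[[1]]
  if (length(m) != 3) {
    stop("expected a code of the form p<lower>p<upper>: ", code)
  }
  bounds <- as.numeric(m[2:3])
  list(
    lower = bounds[1],
    upper = bounds[2],
    width = (bounds[2] - bounds[1]) / 100  # fraction of the population covered
  )
}

str(parse_percentile("p99p100"))  # top 1%
str(parse_percentile("p0p50"))    # bottom 50%
```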

Tidyverse integration

library(dplyr)
library(ggplot2)

top1 |>
  wid_tidy(country_names = FALSE) |>
  filter(year >= 1990) |>
  ggplot(aes(year, value)) +
  geom_line(colour = "#58a6ff", linewidth = 0.9) +
  scale_y_continuous(labels = scales::percent_format()) +
  labs(title = "Top 1% pre-tax income share — United States", x = NULL, y = NULL) +
  theme_minimal()

Inequality measures

dist <- download_wid("sptinc992j", areas = c("US", "FR"), perc = "all", years = 1990:2022)

wid_gini(dist)                        # Gini coefficient
wid_top_share(dist, top = 0.01)       # top 1% share
wid_top_share(dist, top = 0.10)       # top 10% share

thresh <- download_wid("tptinc992j", areas = "US", perc = "all")
wid_percentile_ratio(thresh)          # P90/P10
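
For intuition about what a Gini coefficient computed from bracket shares involves, here is a minimal base-R sketch using the trapezoid rule on the Lorenz curve. It is not necessarily the method `wid_gini()` uses. `gini_from_shares()` is a hypothetical helper, and the input shares are made up for illustration.

```r
# Hypothetical helper for illustration; not necessarily how wid_gini() works.
# pop_frac:     population fraction of each bracket, ordered poorest to richest
# income_share: share of total income going to each bracket
gini_from_shares <- function(pop_frac, income_share) {
  stopifnot(
    length(pop_frac) == length(income_share),
    isTRUE(all.equal(sum(pop_frac), 1)),
    isTRUE(all.equal(sum(income_share), 1))
  )
  p <- c(0, cumsum(pop_frac))       # cumulative population
  L <- c(0, cumsum(income_share))   # Lorenz curve ordinates
  # Gini = 1 - 2 * (area under the Lorenz curve), area via trapezoids
  1 - sum(diff(p) * (L[-length(L)] + L[-1]))
}

# Perfect equality: every bracket earns its population share
gini_from_shares(c(0.5, 0.4, 0.1), c(0.5, 0.4, 0.1))  # 0 up to rounding error

# Made-up shares: bottom 50% gets 20%, middle 40% gets 40%, top 10% gets 40%
gini_from_shares(c(0.5, 0.4, 0.1), c(0.2, 0.4, 0.4))  # about 0.42
```

With only a few wide brackets the trapezoid rule ignores inequality inside each bracket, so the result tends to understate the true Gini; finer percentile grids reduce that gap.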

Currency conversion

download_wid("aptinc992j", areas = c("US", "FR"), perc = "p0p50") |>
  wid_convert(target = "ppp", base_year = "2022")

Supported targets: "lcu", "usd", "eur", "gbp", "ppp", "yppp".

Plotting

wid_plot_timeseries(dist)                 # time series, one line per country
wid_plot_compare(dist, year = 2020)       # cross-country bar chart
wid_plot_lorenz(dist, country = "US")     # Lorenz curve

Reusable queries

q <- wid_query(indicators = "sptinc992j", areas = c("US", "FR"))
q <- wid_filter(q, years = 2010:2022)
wid_fetch(q)

Reference tables

Six tables synchronised from WID:

Table             Contents
wid_series_types  Series type codes (s, a, m, …)
wid_concepts      Concept codes (ptinc, hweal, nninc, …)
wid_ages          Age group codes (992, 999, …)
wid_pop_types     Population unit codes (j, i, t, …)
wid_countries     Country and region codes
wid_percentiles   Percentile codes (p99p100, p0p50, …)

widr is an independent implementation and is not affiliated with the World Inequality Lab (WIL) or the Paris School of Economics. The data are maintained by WIL.