Load data easily and efficiently with lissyuse

Author

Gonçalo Marques - Luxembourg Income Study

Description

The function lissyuse() enables LISSY R-users to quickly import whole series of data from a specific country and/or within any time period.

It also automatically merges, if needed, household level databases with individual level ones, based on the user-selected variables. To simplify this selection, lissyuse()automatically includes a set of technical variables: weights, id’s, currency, year and dname.

For quicker and more efficient jobs, we strongly advise the selection of a restrict group of variables in the argument vars.

You can click on the following links for more information on how to start with LISSY, and on METIS for detailed information on our available databases and their respective variables.

Usage

lissyuse(
  data = NULL, 
  vars = NULL, 
  from = NULL, 
  to = NULL
)  

Arguments

data

A character vector containing the databases to be loaded.

  1. One can select the databases separately, using their ccyy format, e.g., data = c("de16", "it20");

  2. To select and entire series from a given country, we recommend the use of the format all_cc, e.g., data = c("all_us", "all_uk");

  3. These two formats can also be used simultaneously, e.g., data = c("all_be", "all_es", "br21", "br20").

vars

A character vector containing the databases to be loaded, e.g., vars = c("dhi", "region_c", "age", "relation").

If nothing is selected the entire database will be loaded.

from

A numeric value representing the year (inclusive) after which the LIS/LWS databases should be loaded.

If the data argument is not specified, the lissyuse() function will load all databases regardless of the country.

to

A numeric value representing the year (inclusive) up to which the LIS/LWS databases should be loaded.

If the data argument is not specified, the lissyuse() function will load all databases regardless of the country.

Value

A list named lis_datasets or lws_datasets. Each element of the list will be a data frame named after their respective database. See the naming formats in the examples below. Each data frame will contain as many columns as the selected variables, plus the default technical ones.

Examples

# LIS # 

# Loading a list of data frames of merged h and p-level databases.  
lissyuse( 
  data = c("all_it", "de16", "us19"), 
  vars  = c("dhi", "region_c", "age", "hourstot")
)

# To check the names of the data frames. 
names(lis_datasets)

# Ways of selecting certain elemennts 
lis_datasets[["it14"]]   # By their name
lis_datasets[1:3]        # By their respective order within lis_datasets

# Selecting all the italian datasets, while restrict them to a certain year range. 
lissyuse(
  data = c("all_it"), 
  vars  = c("dhi", "region_c"), 
  from = 2004, 
  to = 2016
)

# Notice that in the previous line only h-level variables were selected. 
# This will lead to slightly different names for the data frames. 
names(lis_datasets)


# By selecting h-level variables, lissyuse will not merge the p-level files for that country-year. 
# The names of the data frames will be "ccyyih", or "cccyyip" if only p-level variables were chosen. 


# LWS # 

# Example for LWS 
lissyuse(
  data = c("all_us", "uk17", "uk19"), 
  vars = "dnw", 
  from = 2015, 
  to = 2021
)

# When working with LWS databases, the list will be named lws_datasets
names(lws_datasets)

# The naming format of the data frames will keep being "ccyy" if the user selects h and p-level databases. If not, the format will be "ccyywh" or "ccyywp".