Loads the party statements dataset (2003–2022) as a data.table. The dataset contains 82,723 official statements from party spokespersons and leadership meeting minutes from South Korea's two major political parties.

Usage

load_party_statements(path = NULL, cache = TRUE, refresh = FALSE)

Arguments

path

Character; path to a local CSV file. If NULL (default), the bundled dataset is used.

cache

Logical; if TRUE (default), the data is cached as an RDS file in the user's cache directory for faster subsequent loads.

refresh

Logical; if TRUE, any existing cache is ignored and the source CSV is re-read. Defaults to FALSE.
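The interaction between cache and refresh can be illustrated with a minimal sketch. This is not the package's actual implementation: read_with_cache(), its cache_dir argument, the "example-pkg" directory name, and the cache file name are all hypothetical, chosen only to show the general read-once, reuse-later pattern described above.

```r
library(data.table)

# Hypothetical helper illustrating the cache/refresh logic; the real
# load_party_statements() may differ in file names, paths, and details.
read_with_cache <- function(csv_path,
                            cache_dir = tools::R_user_dir("example-pkg", which = "cache"),
                            cache = TRUE,
                            refresh = FALSE) {
  cache_file <- file.path(cache_dir, "party_statements.rds")

  # Reuse the cached copy unless caching is disabled or a refresh is requested.
  if (cache && !refresh && file.exists(cache_file)) {
    dt <- readRDS(cache_file)
    setDT(dt)  # restore data.table internals after deserialization
    return(dt)
  }

  # Otherwise read the source CSV.
  dt <- fread(csv_path, encoding = "UTF-8")

  # Write (or overwrite) the cache for faster subsequent loads.
  if (cache) {
    dir.create(cache_dir, recursive = TRUE, showWarnings = FALSE)
    saveRDS(dt, cache_file)
  }
  dt
}
```

Under this sketch, refresh = TRUE with cache = TRUE re-reads the CSV and overwrites the cached RDS file, while cache = FALSE bypasses the cache entirely.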

Value

A data.table::data.table with the following columns:

Column        Description
no            Row number / identifier
year          Year of the statement
ymd           Full date (YYYY-MM-DD)
title         Title of the statement
text          Full text of the statement
filtered      Filtered/preprocessed text indicator
partisan      Party affiliation label
conservative  Conservative party indicator
id            Unique document identifier

See data_dictionary.md for the complete column reference, and the Data Descriptor (Table 9) for yearly entry counts by party.
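As a rough illustration of working with these columns, the sketch below builds a small toy table and summarizes it. The values are invented, and the column types (integer year, character ymd, 0/1 conservative) are assumptions; check data_dictionary.md for the actual types before relying on them.

```r
library(data.table)

# Toy stand-in with a subset of the documented columns.
# All values are invented for illustration; column types are assumed.
toy <- data.table(
  no           = 1:4,
  year         = c(2003L, 2003L, 2004L, 2004L),
  ymd          = c("2003-01-15", "2003-02-01", "2004-03-10", "2004-03-11"),
  title        = c("A", "B", "C", "D"),
  conservative = c(1L, 0L, 1L, 1L)
)

# Count statements per year and conservative indicator.
counts <- toy[, .N, by = .(year, conservative)][order(year, conservative)]

# Parse the YYYY-MM-DD strings into a date column for time-based filtering.
toy[, date <- NULL]
toy[, date := as.IDate(ymd)]
early_2004 <- toy[date < as.IDate("2004-03-11")]
```

The same expressions should apply to the object returned by load_party_statements(), provided the real column types match these assumptions.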

Details

For full methodology and variable descriptions, see the Data Descriptor: Lim, T.H. (2025). Scientific Data, 12, 1030. doi:10.1038/s41597-025-05220-4

Examples

if (FALSE) { # not run
  # Load with default caching
  ps <- load_party_statements()

  # Force a refresh: ignore any existing cache and re-read the source CSV
  ps <- load_party_statements(refresh = TRUE)
}