Retrieves data from a table of Statistics Netherlands. A list of available tables
can be retrieved with cbs_get_datasets(). Use the Identifier column of
cbs_get_datasets() as id in cbs_get_data() and cbs_get_meta().
Identifier of table, can be found in cbs_get_datasets()
optional filter statements, see details.
catalog id, can be retrieved with cbs_get_datasets() (set catalog=NULL to see all catalogs)
character, optional: columns to select
Should the data automatically be converted into integer and numeric?
Should column titles be added as labels (TRUE), which are visible in View?
Directory where the table should be downloaded. Defaults to a temporary directory.
Print extra messages about what is happening.
Optionally specify a different server. Useful for third party data services implementing the same protocol.
Should the data include the ID column for the rows?
data.frame with the requested data. Note that a csv copy of the data is stored in dir.
To reduce download time, the data can optionally be filtered on category values. For large tables (> 100k records) this is advisable.
The filter is specified with (see examples below):
<column_name> = <values>, in which <values> is a character vector. Rows with values that are not part of the character vector are not returned. Note that the values have to be values from the $Key column of the corresponding meta data. These may contain trailing spaces...
<column_name> = has_substring(x), in which x is a character vector. Rows with values that do not have a substring that is in x are not returned. Useful substrings are "JJ", "KW", "MM" for Periods (years, quarters, months) and "PV", "CR" and "GM" for Regions (provinces, corops, municipalities).
<column_name> = eq(<values>) | has_substring(x), which combines the two statements above.
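The three filter forms can be illustrated with a small local sketch. It does not call the CBS service and it is not the package's implementation: it only mimics, in base R, which rows each form would keep. The vector periods and the helper contains_any() are hypothetical names introduced for this illustration.

```r
# Period keys in the usual CBS layout: JJ = year, KW = quarter, MM = month
periods <- c("2000JJ00", "2000KW01", "2000MM01", "2000MM03",
             "2001JJ00", "2001MM12")

# <column_name> = <values>: keep only exact matches
keep_values <- periods %in% c("2000MM03", "2001MM12")

# <column_name> = has_substring(x): keep keys containing any element of x
# (hypothetical helper, fixed = TRUE so x is matched literally)
contains_any <- function(keys, x) {
  Reduce(`|`,
         lapply(x, function(s) grepl(s, keys, fixed = TRUE)),
         rep(FALSE, length(keys)))
}
keep_substring <- contains_any(periods, "JJ")

# <column_name> = eq(<values>) | has_substring(x): union of both conditions
keep_combined <- periods %in% "2000MM01" | keep_substring

periods[keep_values]     # the two requested months
periods[keep_substring]  # all years
periods[keep_combined]   # January 2000 plus all years
```

In a real call the same conditions are passed to cbs_get_data() as named arguments, so that the filtering takes place before the download.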
By default the columns will be converted to their type (typed=TRUE).
CBS uses multiple types of missing (unknown, suppressed, not measured, missing): users
wanting all these nuances can use typed=FALSE, which results in character columns.
All data are downloaded using cbs_download_table().
The content of CBS opendata is subject to Creative Commons Attribution (CC BY 4.0). This means that the re-use of the content is permitted, provided Statistics Netherlands is cited as the source. For more information see: https://www.cbs.nl/en-gb/about-us/website/copyright
cbs_get_meta(), cbs_download_data()
Other data retrieval: cbs_add_date_column(), cbs_add_label_columns(), cbs_download_data(), cbs_extract_table_id(), cbs_get_data_from_link()
Other query: eq(), has_substring()
if (FALSE) {
cbs_get_data( id = "7196ENG" # table id
, Periods = "2000MM03" # March 2000
, CPI = "000000" # Category code for total
)
# useful substrings:
## Periods: "JJ": years, "KW": quarters, "MM": months
## Regions: "NL", "PV": provinces, "GM": municipalities
cbs_get_data( id = "7196ENG" # table id
, Periods = has_substring("JJ") # all years
, CPI = "000000" # Category code for total
)
cbs_get_data( id = "7196ENG" # table id
, Periods = c("2000MM03","2001MM12") # March 2000 and Dec 2001
, CPI = "000000" # Category code for total
)
# combine either this
cbs_get_data( id = "7196ENG" # table id
, Periods = has_substring("JJ") | "2000MM01" # all years and Jan 2000
, CPI = "000000" # Category code for total
)
# or this: note the "eq" function
cbs_get_data( id = "7196ENG" # table id
, Periods = eq("2000MM01") | has_substring("JJ") # Jan 2000 and all years
, CPI = "000000" # Category code for total
)
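# look up valid filter values in the meta data
# (a sketch, assuming the list returned by cbs_get_meta() holds one
#  data.frame per dimension, named like the filter column, with a Key column)
meta <- cbs_get_meta("7196ENG")
head(meta$Periods$Key) # pass keys exactly as listed, including trailing spaces
# keep the raw character values, including the different kinds of missing
cbs_get_data( id = "7196ENG" # table id
, Periods = "2000MM03" # March 2000
, CPI = "000000" # Category code for total
, typed = FALSE # no conversion to integer/numeric
)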
}