Write out a chunk of a larger dataset, using Hadoop `partition=partition` nomenclature, saving it in the cache dir `dataset_name`. The chunk is identified by the `rowid` column in `rows`, which is attached to the columns of the dataset identified by `cols`. Features matching the regular expression in `rollback` are rolled back one row, and all timestamps are normalized by `dataset_normalize_timestamps`.
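The Hadoop `partition=partition` nomenclature means each chunk is written under a subdirectory named after its partition value. A minimal sketch of the resulting layout under the cache dir, assuming a partition value of 1 and Parquet part files (both are illustrative assumptions, not stated by this page):

```
dataset_name/
└── partition=1/
    └── part-0.parquet
```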
Arguments
- `dataset` (data.frameish): dataset to load from; must contain an index column and cannot be an `arrow_dplyr_query`
- `partition` (character|integer): identifies the partition the chunk will be saved in
- `dataset_name` (character): cache directory where the partition will be saved
- `rows` (data.table): identifies the rows of the dataset to load; will be appended to `dataset`
- `cols` (character): columns of the dataset to add to the partition
- `...`: arguments passed on to `dataset_rollback_event` and `dataset_normalize_timestamps`:
  - `rollback_cols` (character): vector of columns to roll back
  - `event` (character): column name containing a logical feature that indicates events to roll back
  - `by` (character): column name to group the table by
  - `timestamp_cols` (character): vector of columns to normalize; defaults to all columns with a name containing the word "timestamp"
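
A hedged usage sketch: the writer's name `dataset_partition()` is assumed here because this page does not show the function signature, and the table and column names are illustrative only.

```r
library(data.table)

dt <- data.table(
  rowid           = 1:6,                      # index column required by `dataset`
  id              = c(1, 1, 1, 2, 2, 2),
  label           = c(0, 0, 1, 0, 1, 0),
  is_event        = c(FALSE, FALSE, TRUE, FALSE, TRUE, FALSE),
  event_timestamp = Sys.time() + 1:6
)

# Rows of the chunk, identified by rowid.
rows <- data.table(rowid = 1:6)

# Hypothetical call: writes the chunk to <cache dir "events">/partition=1/,
# rolling back `label` one row per `id` group around events flagged in
# `is_event`, and normalizing `event_timestamp`.
dataset_partition(
  dataset        = dt,
  partition      = 1L,
  dataset_name   = "events",
  rows           = rows,
  cols           = c("id", "label", "is_event", "event_timestamp"),
  rollback_cols  = "label",            # passed on to dataset_rollback_event
  event          = "is_event",
  by             = "id",
  timestamp_cols = "event_timestamp"   # passed on to dataset_normalize_timestamps
)
```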