The impute function performs in-place imputation by filling missing values with aggregates computed on the “na.rm’d” vector, that is, the column with its missing values removed. You can also impute based on groupings of columns from within the dataset. These columns can be passed by index or by column name to the by parameter. Note that if a factor column is supplied, then the method must be mode.
The impute function accepts the following arguments:
dataset: The dataset containing the column to impute
column: A specific column to impute. The default, 0, imputes the entire frame.
method: The type of imputation to perform.
mean replaces NAs with the column mean;
median replaces NAs with the column median;
mode replaces NAs with the most common factor (for factor columns only).
combine_method: If method is median, then choose how to combine quantiles on even sample sizes. This parameter is ignored in all other cases. Available options for
by: The columns to group by, passed by index or by column name.
group_by_frame: Impute the column with this pre-computed grouped frame.
values: A vector of impute values (one per column). NaN indicates that the column should be skipped.
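
The behaviour described above can be illustrated with a minimal pure-Python sketch. It is not the library's implementation: the names impute_sketch, aggregate, and is_na are hypothetical, the dataset is modelled as a list of dicts, and the median case always averages the two middle values on even sample sizes instead of exposing combine_method.

```python
import math
from collections import Counter, defaultdict
from statistics import mean, median


def is_na(value):
    """Treat None and float NaN as missing (an assumption of this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def aggregate(values, method):
    """Return the fill value computed on the non-missing values only."""
    present = [v for v in values if not is_na(v)]
    if not present:
        return None  # nothing to aggregate, so the gaps are left alone
    if method == "mean":
        return mean(present)
    if method == "median":
        return median(present)  # averages the two middle values on even sizes
    if method == "mode":
        return Counter(present).most_common(1)[0][0]
    raise ValueError(f"unknown method: {method!r}")


def impute_sketch(rows, column, method="mean", by=None):
    """Fill missing entries of `column` in place, optionally per group."""
    groups = defaultdict(list)
    for row in rows:
        # Without `by`, every row falls into the same single group.
        key = tuple(row[name] for name in by) if by else ()
        groups[key].append(row)
    for members in groups.values():
        fill = aggregate([row[column] for row in members], method)
        if fill is None:
            continue
        for row in members:
            if is_na(row[column]):
                row[column] = fill
    return rows


rows = [
    {"g": "a", "x": 1.0},
    {"g": "a", "x": None},
    {"g": "a", "x": 3.0},
    {"g": "b", "x": 10.0},
    {"g": "b", "x": float("nan")},
]
impute_sketch(rows, "x", method="mean", by=["g"])
print([row["x"] for row in rows])  # [1.0, 2.0, 3.0, 10.0, 10.0]
```

Because the aggregate is computed per group, the missing value in group a is filled with that group's mean (2.0) and not the mean of the whole column. Passing method="mode" works the same way for a categorical column.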