Drops duplicated rows across specified columns.
h2o.drop_duplicates(frame, columns, keep = "first")
Arguments
- frame
An H2OFrame object to drop duplicates on.
- columns
Columns to compare when detecting duplicates. Rows with identical values in all of these columns are treated as duplicates.
- keep
Which row to keep from each set of duplicates. "first" (the default) keeps the first row and deletes the rest.
"last" keeps the last row and deletes the rest.
Examples
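if (FALSE) { # \dontrun{
library(h2o)
# Start or connect to a local H2O cluster
h2o.init()
# Upload the built-in iris data set as an H2OFrame
data <- as.h2o(iris)
# Keep only the first row for each (Species, Sepal.Length) combination
deduplicated_data <- h2o.drop_duplicates(data, c("Species", "Sepal.Length"), keep = "first")
} # }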
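
The difference between keep = "first" and keep = "last" can be seen without a running cluster. The following is a minimal base-R sketch on an ordinary data.frame, not H2O code. It assumes that h2o.drop_duplicates follows the semantics described under Arguments. The data frame df and the names cols, keep_first and keep_last are made up for illustration.

```r
# Base-R analogue of the keep semantics; this does not call H2O.
df <- data.frame(
  id    = 1:5,
  group = c("a", "b", "a", "b", "c"),
  value = c(1, 2, 1, 2, 3)
)

# Columns compared when deciding whether two rows are duplicates
cols <- c("group", "value")

# Analogue of keep = "first": drop a row if its key already appeared earlier
keep_first <- df[!duplicated(df[, cols]), ]

# Analogue of keep = "last": drop a row if its key appears again later
keep_last <- df[!duplicated(df[, cols], fromLast = TRUE), ]

print(keep_first$id)
print(keep_last$id)
```

Rows 1 and 3 share a key, as do rows 2 and 4, so the two settings keep different members of each pair. Columns outside cols (here id) do not affect which rows count as duplicates.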