Fill NA values¶
Use this function to fill in NA values in a sequential manner up to a specified limit. When using this function, you will specify the following:
Whether the method to fill the NA values should go forward (default) or backward.
Whether the NA values should be filled along rows (default) or columns.
The maximum number of consecutive NA values to fill (defaults to 1).
import h2o
h2o.init()
# Create a random data frame with 10 rows and 3 columns.
# Specify that no more than 20% of the values are NAs.
df = h2o.create_frame(rows=10,
cols=3,
real_fraction=1.0,
real_range=100,
missing_fraction=0.2,
seed=123)
df
C1 C2 C3
-------- -------- --------
nan nan -77.1047
-93.6409 -13.6593 57.4439
-93.71 25.4342 39.1013
-95.8291 -92.4271 55.4314
84.6372 -43.4759 53.1715
-57.9583 27.4148 -26.9013
83.0921 -62.7819 -91.9426
-77.9814 64.3228 -93.954
nan -80.6142 nan
27.1672 60.5492 -13.2275
[10 rows x 3 columns]
# Forward fill a row. In Python, the values for axis are 0 (row-wise) and 1 (column-wise)
filled = df.fillna(method="forward",axis=0,maxlen=1)
filled
filled
C1 C2 C3
-------- -------- --------
nan nan -77.1047
-93.6409 -13.6593 57.4439
-93.71 25.4342 39.1013
-95.8291 -92.4271 55.4314
84.6372 -43.4759 53.1715
-57.9583 27.4148 -26.9013
83.0921 -62.7819 -91.9426
-77.9814 64.3228 -93.954
-77.9814 -80.6142 -93.954
27.1672 60.5492 -13.2275
[10 rows x 3 columns]
library(h2o)
h2o.init()
# Create a random data frame with 6 rows and 2 columns.
# Specify that no more than 70% of the values are NAs.
fr_with_nas = h2o.createFrame(categorical_fraction = 0.0,
missing_fraction = 0.7,
rows = 6,
cols = 2,
seed = 123)
fr_with_nas
C1 C2
1 NaN NaN
2 -77.10471 -93.64087
3 -13.65926 57.44389
4 NaN NaN
5 39.10130 NaN
6 NaN 55.43136
[6 rows x 2 columns]
# Forward fill a row. In R, the values for axis are 1 (row-wise) and 2 (column-wise)
fr <- h2o.fillna(fr_with_nas, "forward", axis = 1, maxlen = 1L)
fr
C1 C2
1 NaN NaN
2 -77.10471 -93.64087
3 -13.65926 57.44389
4 NaN NaN
5 39.10130 39.10130
6 NaN 55.43136
[6 rows x 2 columns]