Import Dataset with H2O Drive Connector
First, we'll initialize a client with our server credentials and store it in the variable dai.
In [22]:
import driverlessai
dai = driverlessai.Client(address='http://localhost:12345', username="py", password="py")
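The cell above hardcodes the address and credentials. As a minimal sketch, the same keyword arguments could instead be read from environment variables. The helper client_kwargs_from_env and the variable names DAI_ADDRESS, DAI_USERNAME, and DAI_PASSWORD are hypothetical and are not part of the driverlessai package.

```python
import os


def client_kwargs_from_env(env=None):
    """Build keyword arguments for driverlessai.Client from a mapping.

    The variable names below are assumptions chosen for this sketch.
    """
    env = os.environ if env is None else env
    return {
        # Fall back to the address used in this notebook if none is set.
        "address": env.get("DAI_ADDRESS", "http://localhost:12345"),
        # Raise KeyError early if the credentials are missing.
        "username": env["DAI_USERNAME"],
        "password": env["DAI_PASSWORD"],
    }


# Example with an explicit mapping instead of the real environment.
kwargs = client_kwargs_from_env({"DAI_USERNAME": "py", "DAI_PASSWORD": "py"})
print(kwargs)
```

With the variables set, the client would be created as dai = driverlessai.Client(**kwargs), which is equivalent to the call shown above.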
We can check that the H2O Drive connector has been enabled on the Driverless AI server.
In [23]:
dai.connectors.list()
Out[23]:
['upload', 'file', 'hdfs', 's3', 'recipe_file', 'recipe_url', 'h2o_drive']
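The check above is done by reading the list by eye. In a script it may be useful to fail early when the connector is missing. The following is a small sketch of such a guard; the helper require_connector is hypothetical, and the list literal simply repeats the output shown above.

```python
def require_connector(available, name="h2o_drive"):
    """Return `name` if it is among the enabled connectors, else raise."""
    if name not in available:
        raise RuntimeError(
            f"Connector {name!r} is not enabled on the server; "
            f"enabled connectors: {sorted(available)}"
        )
    return name


# The list printed by dai.connectors.list() in the cell above.
enabled = ['upload', 'file', 'hdfs', 's3', 'recipe_file', 'recipe_url', 'h2o_drive']
connector = require_connector(enabled)
print(connector)
```

Against a live server, the same guard would be called as require_connector(dai.connectors.list()).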
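The following example creates a Driverless AI dataset from a file stored in H2O Drive. It specifies the data source (data_source), the name of the dataset CSV file in H2O Drive (data), and the name to give the dataset in Driverless AI (name).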
In [24]:
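# Import the file from H2O Drive and keep a handle to the new dataset.
dataset_from_drive = dai.datasets.create(
    data_source="h2o_drive",
    name="dai_dataset_name",
    data="dataset_name_in_h2o_drive",
)
# Preview the first rows of the imported dataset.
dataset_from_drive.head()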
Complete 100.00% - [4/4] Computed stats for column isdepdelayed_rec
Out[24]:
fyear | fmonth | fdayofmonth | fdayofweek | deptime | arrtime | uniquecarrier | origin | dest | distance | isdepdelayed | isdepdelayed_rec |
---|---|---|---|---|---|---|---|---|---|---|---|
"f1987" | "f10" | "f15" | "f4" | 729 | 903 | "PS" | "SAN" | "SFO" | 447 | "NO" | -1 |
"f1987" | "f10" | "f17" | "f6" | 741 | 918 | "PS" | "SAN" | "SFO" | 447 | "YES" | 1 |
"f1987" | "f10" | "f22" | "f4" | 728 | 852 | "PS" | "SAN" | "SFO" | 447 | "NO" | -1 |
"f1987" | "f10" | "f24" | "f6" | 929 | 1052 | "PS" | "SFO" | "RNO" | 192 | "YES" | 1 |
"f1987" | "f10" | "f6" | "f2" | 1505 | 1607 | "PS" | "BUR" | "OAK" | 325 | "NO" | -1 |