Import folder as Dataset
First, we'll initialize a client with our server credentials and store it in the variable dai.
In [1]:
import driverlessai
dai = driverlessai.Client(address='http://mr-dl30:12353', username="user", password="user")
We can import a folder as a dataset using any of the following connectors.
- 's3'
- 'h2o_drive'
- 'minio'
- 'gcs'
If the data path ends with "/", Driverless AI treats it as a folder path and attempts to create a single dataset from all the files in that folder.
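The trailing-slash rule is easy to miss, so a small guard before calling the client can help. The sketch below is only an illustration: `as_folder_path` and `FOLDER_CONNECTORS` are hypothetical names, not part of the driverlessai package, and the connector set simply mirrors the list above.

```python
# Connectors listed above as supporting folder import.
FOLDER_CONNECTORS = {"s3", "h2o_drive", "minio", "gcs"}


def as_folder_path(data_source: str, path: str) -> str:
    """Return `path` with a trailing "/" so it is read as a folder.

    Hypothetical helper for illustration; not part of driverlessai.
    """
    if data_source not in FOLDER_CONNECTORS:
        raise ValueError(
            f"connector {data_source!r} is not listed as supporting folder import"
        )
    # Append the slash only when it is missing.
    return path if path.endswith("/") else path + "/"


folder = as_folder_path(
    "s3", "s3://h2o-public-test-data/h2o-autodoc-data/credit-card"
)
print(folder)
```

The returned string could then be passed as the data argument of dai.datasets.create, as in the next cell.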
In [2]:
dataset_from_s3 = dai.datasets.create(
    data_source="s3",
    name="dataset_from_s3",
    data="s3://h2o-public-test-data/h2o-autodoc-data/credit-card/",
    force=True,
)
dataset_from_s3.head()
Complete 100.00% - [4/4] Computed stats for column DEFAULT_PAYMENT_NEXT_MONTH
Out[2]:
ID | LIMIT_BAL | SEX | EDUCATION | MARRIAGE | AGE | PAY_0 | PAY_2 | PAY_3 | PAY_4 | PAY_5 | PAY_6 | BILL_AMT1 | BILL_AMT2 | BILL_AMT3 | BILL_AMT4 | BILL_AMT5 | BILL_AMT6 | PAY_AMT1 | PAY_AMT2 | PAY_AMT3 | PAY_AMT4 | PAY_AMT5 | PAY_AMT6 | DEFAULT_PAYMENT_NEXT_MONTH |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
24001 | 50000 | male | university | single | 23 | 2 | 2 | 0 | 0 | 0 | 0 | 51246 | 49758 | 48456 | 44116 | 21247 | 20066 | 8 | 2401 | 2254 | 2004 | 704 | 707 | False |
24002 | 60000 | male | university | single | 26 | 0 | 0 | 0 | 0 | 0 | 0 | 58072 | 59040 | 57416 | 55736 | 26958 | 28847 | 2282 | 2324 | 2049 | 2000 | 3000 | 1120 | True |
24003 | 400000 | male | university | single | 27 | 0 | 0 | 0 | 0 | 0 | 0 | 15330 | 8626 | 11470 | 10745 | 20737 | 9545 | 2501 | 10009 | 1437 | 1105 | 510 | 959 | False |
24004 | 20000 | male | other | single | 27 | 5 | 4 | 3 | 2 | 2 | 2 | 21673 | 21051 | 20440 | 19709 | 20113 | 19840 | 0 | 0 | 0 | 900 | 0 | 0 | False |
24005 | 50000 | male | highschool | single | 27 | 0 | 0 | -2 | -2 | -1 | -1 | 32590 | -100 | 0 | 0 | 70 | 120 | 0 | 100 | 0 | 70 | 200 | 100 | False |