Import Dataset with MinIO Connector
First, initialize a client with your server credentials and store it in the variable dai.
In [10]:
import driverlessai
dai = driverlessai.Client(address='http://localhost:12345', username="py", password="py")
Use the dai.connectors.list() command to check whether the MinIO connector is enabled on the Driverless AI server.
In [7]:
dai.connectors.list()
Out[7]:
['upload', 'file', 'hdfs', 's3', 'recipe_file', 'recipe_url', 'minio', 'azrbs']
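The check above can also be done programmatically before attempting an import. The following is a minimal sketch: the helper name `is_connector_enabled` is hypothetical (not part of the `driverlessai` client), and the list is the sample output copied from above rather than a live server response.

```python
def is_connector_enabled(enabled_connectors, name="minio"):
    """Return True if `name` appears in the list of enabled connectors."""
    return name in enabled_connectors


# Sample list copied from the output above; against a live server,
# pass the result of dai.connectors.list() instead.
enabled = ['upload', 'file', 'hdfs', 's3', 'recipe_file', 'recipe_url', 'minio', 'azrbs']

# Fail early with a clear message instead of a failed import later on.
if not is_connector_enabled(enabled):
    raise RuntimeError("The MinIO connector is not enabled on this server")
```

If the connector is missing from the list, it has to be enabled in the server configuration before the examples that follow will work.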
Use MinIO credentials configured in DAI
In [9]:
dataset_from_minio = dai.datasets.create(
    data_source="minio",
    name="train_data",
    data="h2oaidev/cc_train.csv",
    force=True,
)
dataset_from_minio.head()
Complete 100.00% - [4/4] Computed stats for column default payment next month
Out[9]:
ID | LIMIT_BAL | SEX | EDUCATION | MARRIAGE | AGE | PAY_0 | PAY_2 | PAY_3 | PAY_4 | PAY_5 | PAY_6 | BILL_AMT1 | BILL_AMT2 | BILL_AMT3 | BILL_AMT4 | BILL_AMT5 | BILL_AMT6 | PAY_AMT1 | PAY_AMT2 | PAY_AMT3 | PAY_AMT4 | PAY_AMT5 | PAY_AMT6 | default payment next month |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 20000 | 2 | 2 | 1 | 24 | -2 | 2 | -1 | -1 | -2 | -2 | 3913 | 3102 | 689 | 0 | 0 | 0 | 0 | 689 | 0 | 0 | 0 | 0 | True |
2 | 120000 | 2 | 2 | 2 | 26 | -1 | 2 | 0 | 0 | 0 | 2 | 2682 | 1725 | 2682 | 3272 | 3455 | 3261 | 0 | 1000 | 1000 | 1000 | 0 | 2000 | True |
3 | 90000 | 2 | 2 | 2 | 34 | 0 | 0 | 0 | 0 | 0 | 0 | 29239 | 14027 | 13559 | 14331 | 14948 | 15549 | 1518 | 1500 | 1000 | 1000 | 1000 | 5000 | False |
4 | 50000 | 2 | 2 | 1 | 37 | 1 | 0 | 0 | 0 | 0 | 0 | 46990 | 48233 | 49291 | 28314 | 28959 | 29547 | 2000 | 2019 | 1200 | 1100 | 1069 | 1000 | False |
5 | 50000 | 1 | 2 | 1 | 57 | 2 | 0 | -1 | 0 | 0 | 0 | 8617 | 5670 | 35835 | 20940 | 19146 | 19131 | 2000 | 36681 | 10000 | 9000 | 689 | 679 | False |
Use user-specific MinIO credentials
To override the MinIO credentials configured in DAI for the current user only, pass user-specific credentials with the data_source_config argument.
Note: This is only supported in DAI 1.10.4 and later.
In [11]:
dataset_from_custom_minio = dai.datasets.create(
    data_source="minio",
    name="train_data",
    data="h2oaidev/cc_train.csv",
    force=True,
    data_source_config={
        "minio_access_key_id": "Your Minio access key ID",
        "minio_secret_access_key": "Your Minio secret access key",
    },
)
dataset_from_custom_minio.head()
Complete 100.00% - [4/4] Computed stats for column default payment next month
Out[11]:
ID | LIMIT_BAL | SEX | EDUCATION | MARRIAGE | AGE | PAY_0 | PAY_2 | PAY_3 | PAY_4 | PAY_5 | PAY_6 | BILL_AMT1 | BILL_AMT2 | BILL_AMT3 | BILL_AMT4 | BILL_AMT5 | BILL_AMT6 | PAY_AMT1 | PAY_AMT2 | PAY_AMT3 | PAY_AMT4 | PAY_AMT5 | PAY_AMT6 | default payment next month |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 20000 | 2 | 2 | 1 | 24 | -2 | 2 | -1 | -1 | -2 | -2 | 3913 | 3102 | 689 | 0 | 0 | 0 | 0 | 689 | 0 | 0 | 0 | 0 | True |
2 | 120000 | 2 | 2 | 2 | 26 | -1 | 2 | 0 | 0 | 0 | 2 | 2682 | 1725 | 2682 | 3272 | 3455 | 3261 | 0 | 1000 | 1000 | 1000 | 0 | 2000 | True |
3 | 90000 | 2 | 2 | 2 | 34 | 0 | 0 | 0 | 0 | 0 | 0 | 29239 | 14027 | 13559 | 14331 | 14948 | 15549 | 1518 | 1500 | 1000 | 1000 | 1000 | 5000 | False |
4 | 50000 | 2 | 2 | 1 | 37 | 1 | 0 | 0 | 0 | 0 | 0 | 46990 | 48233 | 49291 | 28314 | 28959 | 29547 | 2000 | 2019 | 1200 | 1100 | 1069 | 1000 | False |
5 | 50000 | 1 | 2 | 1 | 57 | 2 | 0 | -1 | 0 | 0 | 0 | 8617 | 5670 | 35835 | 20940 | 19146 | 19131 | 2000 | 36681 | 10000 | 9000 | 689 | 679 | False |
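The placeholder strings in the data_source_config example do not have to be hardcoded in the notebook. One option is to read them from environment variables, as in the sketch below. The helper `minio_config_from_env` and the variable names `MINIO_ACCESS_KEY_ID` and `MINIO_SECRET_ACCESS_KEY` are assumptions made for illustration; only the two dictionary keys come from the example above.

```python
import os


def minio_config_from_env(env=None):
    """Build a data_source_config dict from environment variables.

    The environment variable names are hypothetical; use whatever
    names your deployment defines.
    """
    env = os.environ if env is None else env
    try:
        return {
            "minio_access_key_id": env["MINIO_ACCESS_KEY_ID"],
            "minio_secret_access_key": env["MINIO_SECRET_ACCESS_KEY"],
        }
    except KeyError as missing:
        raise RuntimeError(f"Missing environment variable: {missing}") from None


# Usage with a client, mirroring the call shown earlier:
# dataset = dai.datasets.create(
#     data_source="minio",
#     name="train_data",
#     data="h2oaidev/cc_train.csv",
#     force=True,
#     data_source_config=minio_config_from_env(),
# )
```

This keeps secrets out of saved notebooks and raises a clear error when a variable is unset, rather than sending empty credentials to the server.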