Import Dataset with s3 Connector
First, we'll initialize a client with our server credentials and store it in the variable dai.
In [1]:
import driverlessai
dai = driverlessai.Client(address='http://localhost:12345', username="py", password="py")
We can check that the s3 connector has been enabled on the Driverless AI server.
In [2]:
dai.connectors.list()
Out[2]:
['upload', 'file', 'hdfs', 's3', 'recipe_file', 'recipe_url']
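In a script, the same check can be made explicit so that a missing connector fails early with a readable message. The following is a minimal sketch: `require_connector` is a hypothetical helper (not part of the `driverlessai` package), and it operates on a plain list such as the one returned by `dai.connectors.list()`.

```python
def require_connector(available, name):
    """Raise a RuntimeError if `name` is not among the enabled connectors.

    `available` is any iterable of connector names, for example the list
    returned by dai.connectors.list(). This helper is hypothetical.
    """
    available = list(available)
    if name not in available:
        raise RuntimeError(
            f"Connector {name!r} is not enabled; available: {sorted(available)}"
        )


# Example using the list shown in Out[2]; against a live server one would
# pass dai.connectors.list() instead of this literal.
enabled = ['upload', 'file', 'hdfs', 's3', 'recipe_file', 'recipe_url']
require_connector(enabled, "s3")  # passes silently because 's3' is present
```

Calling the helper before `dai.datasets.create(...)` surfaces a disabled connector as a configuration problem rather than as an import failure.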
Use AWS credentials configured in DAI
In [3]:
dataset_from_s3 = dai.datasets.create(
    data_source="s3",
    name="credit-cards-no-creds",
    data="s3://h2o-datasets/dai/CreditCard_Cat-train.csv",
    force=True,
)
dataset_from_s3.head()
Complete 100.00% - [4/4] Computed stats for column DEFAULT_PAYMENT_NEXT_MONTH
Out[3]:
ID | LIMIT_BAL | SEX | EDUCATION | MARRIAGE | AGE | PAY_0 | PAY_2 | PAY_3 | PAY_4 | PAY_5 | PAY_6 | BILL_AMT1 | BILL_AMT2 | BILL_AMT3 | BILL_AMT4 | BILL_AMT5 | BILL_AMT6 | PAY_AMT1 | PAY_AMT2 | PAY_AMT3 | PAY_AMT4 | PAY_AMT5 | PAY_AMT6 | DEFAULT_PAYMENT_NEXT_MONTH |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 20000 | female | university | married | 24 | -2 | 2 | -1 | -1 | -2 | -2 | 3913 | 3102 | 689 | 0 | 0 | 0 | 0 | 689 | 0 | 0 | 0 | 0 | True |
2 | 120000 | female | university | single | 26 | -1 | 2 | 0 | 0 | 0 | 2 | 2682 | 1725 | 2682 | 3272 | 3455 | 3261 | 0 | 1000 | 1000 | 1000 | 0 | 2000 | True |
3 | 90000 | female | university | single | 34 | 0 | 0 | 0 | 0 | 0 | 0 | 29239 | 14027 | 13559 | 14331 | 14948 | 15549 | 1518 | 1500 | 1000 | 1000 | 1000 | 5000 | False |
4 | 50000 | female | university | married | 37 | 1 | 0 | 0 | 0 | 0 | 0 | 46990 | 48233 | 49291 | 28314 | 28959 | 29547 | 2000 | 2019 | 1200 | 1100 | 1069 | 1000 | False |
5 | 50000 | male | university | married | 57 | 2 | 0 | -1 | 0 | 0 | 0 | 8617 | 5670 | 35835 | 20940 | 19146 | 19131 | 2000 | 36681 | 10000 | 9000 | 689 | 679 | False |
Use user-specific AWS credentials
Here we pass user-specific AWS credentials through the data_source_config argument. They override the credentials configured in DAI for this user only.
Note: This is supported only in DAI 1.10.4 and later.
In [4]:
dataset_from_custom_s3 = dai.datasets.create(
    data_source="s3",
    name="credit-cards-custom-creds",
    data="s3://h2o-datasets/dai/CreditCard_Cat-train.csv",
    force=True,
    data_source_config={
        "aws_access_key_id": "your AWS access key id",
        "aws_secret_access_key": "your AWS secret access key",
    },
)
dataset_from_custom_s3.head()
Complete 100.00% - [4/4] Computed stats for column DEFAULT_PAYMENT_NEXT_MONTH
Out[4]:
ID | LIMIT_BAL | SEX | EDUCATION | MARRIAGE | AGE | PAY_0 | PAY_2 | PAY_3 | PAY_4 | PAY_5 | PAY_6 | BILL_AMT1 | BILL_AMT2 | BILL_AMT3 | BILL_AMT4 | BILL_AMT5 | BILL_AMT6 | PAY_AMT1 | PAY_AMT2 | PAY_AMT3 | PAY_AMT4 | PAY_AMT5 | PAY_AMT6 | DEFAULT_PAYMENT_NEXT_MONTH |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 20000 | female | university | married | 24 | -2 | 2 | -1 | -1 | -2 | -2 | 3913 | 3102 | 689 | 0 | 0 | 0 | 0 | 689 | 0 | 0 | 0 | 0 | True |
2 | 120000 | female | university | single | 26 | -1 | 2 | 0 | 0 | 0 | 2 | 2682 | 1725 | 2682 | 3272 | 3455 | 3261 | 0 | 1000 | 1000 | 1000 | 0 | 2000 | True |
3 | 90000 | female | university | single | 34 | 0 | 0 | 0 | 0 | 0 | 0 | 29239 | 14027 | 13559 | 14331 | 14948 | 15549 | 1518 | 1500 | 1000 | 1000 | 1000 | 5000 | False |
4 | 50000 | female | university | married | 37 | 1 | 0 | 0 | 0 | 0 | 0 | 46990 | 48233 | 49291 | 28314 | 28959 | 29547 | 2000 | 2019 | 1200 | 1100 | 1069 | 1000 | False |
5 | 50000 | male | university | married | 57 | 2 | 0 | -1 | 0 | 0 | 0 | 8617 | 5670 | 35835 | 20940 | 19146 | 19131 | 2000 | 36681 | 10000 | 9000 | 689 | 679 | False |
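To keep secrets out of notebooks, the data_source_config dictionary could be built from environment variables instead of string literals. The sketch below assumes the credentials are exported under the standard AWS names `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`; `s3_config_from_env` is a hypothetical helper, not part of the `driverlessai` package, and the dictionary keys are the ones used in the cell above.

```python
import os


def s3_config_from_env(environ=None):
    """Build a data_source_config dict from AWS environment variables.

    Hypothetical helper: reads AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY
    from `environ` (defaults to os.environ) and raises KeyError if either
    is missing or empty.
    """
    env = os.environ if environ is None else environ
    required = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "aws_access_key_id": env["AWS_ACCESS_KEY_ID"],
        "aws_secret_access_key": env["AWS_SECRET_ACCESS_KEY"],
    }


# Usage against a live server (commented out because it needs a running
# Driverless AI instance and valid credentials):
# dataset_from_custom_s3 = dai.datasets.create(
#     data_source="s3",
#     name="credit-cards-custom-creds",
#     data="s3://h2o-datasets/dai/CreditCard_Cat-train.csv",
#     force=True,
#     data_source_config=s3_config_from_env(),
# )
```

Accepting an optional mapping makes the helper easy to exercise without touching the real process environment.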