Import Dataset with JDBC Connector¶
First, we'll initialize a client with our server credentials and store it in the variable dai
.
In [8]:
Copied!
import driverlessai
dai = driverlessai.Client(address='http://mr-dl26:12345', username='py', password='py')
import driverlessai
dai = driverlessai.Client(address='http://mr-dl26:12345', username='py', password='py')
We can check that the JDBC connector has been enabled on the Driverless AI server.
In [9]:
Copied!
dai.connectors.list()
dai.connectors.list()
Out[9]:
['file', 'upload', 's3', 'hdfs', 'jdbc']
The JDBC connector is considered an advanced connector. Thus, the create methods require a data_source_config
argument to use them.
User Defined JDBC Configuration¶
Here we manually specify the JDBC jar, JDBC driver, and JDBC server location.
In [22]:
Copied!
dataset_from_jdbc = dai.datasets.create(
data="SELECT * FROM creditcardtrain",
data_source="jdbc",
data_source_config=dict(
jdbc_jar="/data/jdbc-jars/postgresql-42.2.23.jar",
jdbc_driver="org.postgresql.Driver",
jdbc_url="jdbc:postgresql://mr-0xc2:5432/h2oaidev",
jdbc_username="h2oaitester",
jdbc_password="h2oaitesterreadonly"
),
name="From JDBC user defined config",
force=True
)
dataset_from_jdbc.head()
dataset_from_jdbc = dai.datasets.create(
data="SELECT * FROM creditcardtrain",
data_source="jdbc",
data_source_config=dict(
jdbc_jar="/data/jdbc-jars/postgresql-42.2.23.jar",
jdbc_driver="org.postgresql.Driver",
jdbc_url="jdbc:postgresql://mr-0xc2:5432/h2oaidev",
jdbc_username="h2oaitester",
jdbc_password="h2oaitesterreadonly"
),
name="From JDBC user defined config",
force=True
)
dataset_from_jdbc.head()
Complete 100.00% - [4/4] Computed stats for column default_payment_next_month
Out[22]:
id | limit_bal | sex | education | marriage | age | pay_1 | pay_2 | pay_3 | pay_4 | pay_5 | pay_6 | bill_amt1 | bill_amt2 | bill_amt3 | bill_amt4 | bill_amt5 | bill_amt6 | pay_amt1 | pay_amt2 | pay_amt3 | pay_amt4 | pay_amt5 | pay_amt6 | default_payment_next_month |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 20000 | female | university | married | 24 | 2 | 2 | -1 | -1 | -2 | -2 | 3913 | 3102 | 689 | 0 | 0 | 0 | 0 | 689 | 0 | 0 | 0 | 0 | 1 |
2 | 120000 | female | university | single | 26 | -1 | 2 | 0 | 0 | 0 | 2 | 2682 | 1725 | 2682 | 3272 | 3455 | 3261 | 0 | 1000 | 1000 | 1000 | 0 | 2000 | 1 |
3 | 90000 | female | university | single | 34 | 0 | 0 | 0 | 0 | 0 | 0 | 29239 | 14027 | 13559 | 14331 | 14948 | 15549 | 1518 | 1500 | 1000 | 1000 | 1000 | 5000 | 0 |
4 | 50000 | female | university | married | 37 | 0 | 0 | 0 | 0 | 0 | 0 | 46990 | 48233 | 49291 | 28314 | 28959 | 29547 | 2000 | 2019 | 1200 | 1100 | 1069 | 1000 | 0 |
5 | 50000 | male | university | married | 57 | -1 | 0 | -1 | 0 | 0 | 0 | 8617 | 5670 | 35835 | 20940 | 19146 | 19131 | 2000 | 36681 | 10000 | 9000 | 689 | 679 | 0 |
Predefined JDBC Configuration¶
Here we use a predefined configuration that was setup on the Driverless AI server. We only need to specify the configuration name along with our username and our password.
In [23]:
Copied!
dataset_from_jdbc_postgres = dai.datasets.create(
data="SELECT * FROM creditcardtrain",
data_source="jdbc",
data_source_config=dict(
jdbc_default_config="postgres",
jdbc_username="h2oaitester",
jdbc_password="h2oaitesterreadonly"
),
name="From JDBC postgres config",
force=True
)
dataset_from_jdbc_postgres.head()
dataset_from_jdbc_postgres = dai.datasets.create(
data="SELECT * FROM creditcardtrain",
data_source="jdbc",
data_source_config=dict(
jdbc_default_config="postgres",
jdbc_username="h2oaitester",
jdbc_password="h2oaitesterreadonly"
),
name="From JDBC postgres config",
force=True
)
dataset_from_jdbc_postgres.head()
Complete 100.00% - [4/4] Computed stats for column default_payment_next_month
Out[23]:
id | limit_bal | sex | education | marriage | age | pay_1 | pay_2 | pay_3 | pay_4 | pay_5 | pay_6 | bill_amt1 | bill_amt2 | bill_amt3 | bill_amt4 | bill_amt5 | bill_amt6 | pay_amt1 | pay_amt2 | pay_amt3 | pay_amt4 | pay_amt5 | pay_amt6 | default_payment_next_month |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 20000 | female | university | married | 24 | 2 | 2 | -1 | -1 | -2 | -2 | 3913 | 3102 | 689 | 0 | 0 | 0 | 0 | 689 | 0 | 0 | 0 | 0 | 1 |
2 | 120000 | female | university | single | 26 | -1 | 2 | 0 | 0 | 0 | 2 | 2682 | 1725 | 2682 | 3272 | 3455 | 3261 | 0 | 1000 | 1000 | 1000 | 0 | 2000 | 1 |
3 | 90000 | female | university | single | 34 | 0 | 0 | 0 | 0 | 0 | 0 | 29239 | 14027 | 13559 | 14331 | 14948 | 15549 | 1518 | 1500 | 1000 | 1000 | 1000 | 5000 | 0 |
4 | 50000 | female | university | married | 37 | 0 | 0 | 0 | 0 | 0 | 0 | 46990 | 48233 | 49291 | 28314 | 28959 | 29547 | 2000 | 2019 | 1200 | 1100 | 1069 | 1000 | 0 |
5 | 50000 | male | university | married | 57 | -1 | 0 | -1 | 0 | 0 | 0 | 8617 | 5670 | 35835 | 20940 | 19146 | 19131 | 2000 | 36681 | 10000 | 9000 | 689 | 679 | 0 |