Update README #805
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,148 +1,126 @@ | ||
| # databricks-jdbc | ||
| Repository for Java connector for Databricks | ||
| # Databricks JDBC Driver | ||
|
|
||
| **Status**: In Development | ||
| The Databricks JDBC driver implements the JDBC interface providing connectivity to a Databricks SQL warehouse. | ||
| Please refer to [Databricks documentation](https://docs.databricks.com/aws/en/integrations/jdbc-oss/) for more | ||
| information. | ||
|
|
||
| The Databricks JDBC driver implements the JDBC interface providing connectivity to a Databricks SQL warehouse | ||
| [License: Apache 2.0](https://opensource.org/licenses/Apache-2.0) | ||
|
|
||
| ## Getting started | ||
| You can install Databricks JDBC driver by adding the following to your `pom.xml`: | ||
| ## Prerequisites | ||
|
|
||
| ```pom.xml | ||
| Databricks JDBC is compatible with Java 11 and higher. CI testing runs on Java versions 11, 17, and 21. | ||
|
|
||
| ## Installation | ||
|
|
||
| ### Maven | ||
|
|
||
| Add the following dependency to your `pom.xml`: | ||
|
|
||
| ```xml | ||
| <dependency> | ||
| <groupId>com.databricks</groupId> | ||
| <artifactId>databricks-jdbc</artifactId> | ||
| <version>1.0.4-oss</version> | ||
| </dependency> | ||
| ``` | ||
| Databricks JDBC is compatible with Java 11 and higher. CI testing runs on Java versions 11, 17, and 21. | ||
| ## Instructions for building | ||
| From development or main branch, run `mvn clean package` | ||
|
|
||
| The jar file is generated as target/databricks-jdbc-oss-jar-with-dependencies.jar | ||
|
|
||
| ## Authentication | ||
| The JDBC driver supports the following modes for authentication: | ||
|
|
||
| 1. Personal Access Tokens: Set AuthMech=3 in connection string to use Personal Access Tokens, which can be set using PWD property. | ||
| 2. OAuth2: Set AuthMech=11 for using OAuth2. We only support Azure and AWS as cloud providers for OAuth2. | ||
| - Access Token: Set Auth_Flow=0 for providing passthrough access token using PWD property. | ||
| - Client Credentials: Set Auth_Flow=1 for using Machine-to-machine OAuth flow. | ||
| - Browser based OAuth: Set Auth_Flow=2 for using User-to-machine OAuth flow. | ||
|
|
||
| ## Integration Tests | ||
|
Collaborator
is this covered anywhere? else we can still keep the running part and move full documentation to a confluence page
Collaborator
Author
There is a separate page for integration test in docs folder |
||
| The project includes a suite of integration tests located in the | ||
| `src/test/java/com/databricks/jdbc/integration/fakeservice/tests`. Each test runs against a set of fake-services | ||
| corresponding to each production service, namely `SQL_EXEC`/`SQL_GATEWAY` and `DBFS`. The [fake-service](./src/test/java/com/databricks/jdbc/integration/fakeservice/FakeServiceExtension.java) | ||
| is based on the open-source project [WireMock](https://wiremock.org/). The tests can be run in the following | ||
| fake-service modes controlled by the environment variable <u>`FAKE_SERVICE_TEST_MODE`</u>: | ||
|
|
||
| 1. `RECORD`: In this mode, the fake-service will record the responses from the production service and save them to the | ||
| corresponding directory in `/src/test/resources/`. This mode is useful for updating the responses when contract with | ||
| the production service changes. | ||
| 2. `REPLAY` (default): In this mode, the fake-service will replay the recorded responses saved in the corresponding | ||
| directory in `/src/test/resources/`. This mode is useful for running the tests without connecting to the production | ||
| service. | ||
| 3. `DRY`: In this mode, the tests will run against the production service and the fake-service will simply act as a | ||
| pass-through proxy, meaning it neither records nor replays the responses. This mode is useful for debugging and | ||
| authoring the tests. | ||
|
|
||
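The three modes above can be sketched as a small Java enum. This is only an illustration, not the project's own type: the name `FakeServiceModeSketch`, the case-insensitive parsing (suggested by the lowercase `replay` and `record` values used in the commands in this README), and the `needsToken` helper are all assumptions.

```java
import java.util.Locale;

/** Hypothetical mirror of the three fake-service modes; not the project's own type. */
enum FakeServiceModeSketch {
  RECORD, // proxy to the production service and save its responses
  REPLAY, // serve previously recorded responses (the documented default)
  DRY;    // pass-through proxy: nothing is recorded or replayed

  /** Parses a FAKE_SERVICE_TEST_MODE value, falling back to REPLAY when unset or blank. */
  static FakeServiceModeSketch fromEnvValue(String raw) {
    if (raw == null || raw.trim().isEmpty()) {
      return REPLAY;
    }
    // Assumption: values are matched case-insensitively. Unknown values make
    // valueOf throw IllegalArgumentException.
    return valueOf(raw.trim().toUpperCase(Locale.ROOT));
  }

  /** RECORD and DRY reach the production service, so they need DATABRICKS_TOKEN. */
  boolean needsToken() {
    return this != REPLAY;
  }
}
```

A caller would typically pass `System.getenv("FAKE_SERVICE_TEST_MODE")` to `fromEnvValue` and check `needsToken()` before starting a run.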
| ### Running Integration Tests | ||
| The driver supports both SQL-Execution (default) and Thrift clients. Integration tests can be executed using either the | ||
| SQL-Execution or Thrift client, determined by setting the environment variable <u>`USE_THRIFT_CLIENT`</u> to `true` or | ||
| `false`. By default, tests run using the SQL-Execution client. Depending on the environment, either the `SQL_EXEC` or | ||
| `SQL_GATEWAY` (Thrift) fake-service is used, and test properties such as `HTTP_PATH`, `DATABRICKS_HOST`, `CATALOG`, | ||
| `SCHEMA`, etc., are loaded accordingly. | ||
|
|
||
| Running [connection](./src/test/java/com/databricks/jdbc/integration/fakeservice/tests/ConnectionIntegrationTests.java) | ||
| tests in `REPLAY` mode using `SQL_GATEWAY`: | ||
|
|
||
| ### Build from Source | ||
|
|
||
| 1. Clone the repository | ||
| 2. Run the following command: | ||
| ```bash | ||
| mvn clean package | ||
| ``` | ||
| 3. The jar file is generated as `target/databricks-jdbc-<version>.jar` | ||
| 4. The test coverage report is generated in `target/site/jacoco/index.html` | ||
|
|
||
| ## Usage | ||
|
|
||
| ### Connection String | ||
|
|
||
| ``` | ||
| USE_THRIFT_CLIENT=true FAKE_SERVICE_TEST_MODE=replay mvn -Dtest=com.databricks.jdbc.integration.fakeservice.tests.ConnectionIntegrationTests test | ||
| jdbc:databricks://<host>:<port>;transportMode=http;ssl=1;AuthMech=3;httpPath=<path>;UID=token;PWD=<token> | ||
| ``` | ||
|
|
||
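A minimal Java sketch of assembling the connection string above and handing it to `DriverManager`. The class and method names are hypothetical, and `open` only works when the driver jar is on the classpath; treat it as an illustration of the URL layout, not as the driver's recommended usage.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

/** Hypothetical helper; the names here are illustrative only. */
final class ConnectionUrlSketch {
  private ConnectionUrlSketch() {}

  /** Assembles the personal-access-token URL from its parts. */
  static String buildUrl(String host, int port, String httpPath, String token) {
    return "jdbc:databricks://" + host + ":" + port
        + ";transportMode=http;ssl=1;AuthMech=3"
        + ";httpPath=" + httpPath
        + ";UID=token;PWD=" + token;
  }

  /** Opens a connection through the standard JDBC entry point. */
  static Connection open(String url) throws SQLException {
    return DriverManager.getConnection(url);
  }
}
```

In real code the token would normally come from an environment variable or a secret store rather than a literal in source.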
| Running all tests in `REPLAY` mode using `SQL_EXEC`: | ||
| ### Authentication | ||
|
Collaborator
We can add public documentation link for config properties if one wants to go into more details on Auth
Collaborator
Author
It seems that our public documentation doesn't include all authentication types, and we also plan to exclude them from the README. Therefore, I believe this comment is now unnecessary. |
||
|
|
||
| The JDBC driver supports the following authentication methods: | ||
|
|
||
| #### Personal Access Token (PAT) | ||
|
|
||
| Use `AuthMech=3` for personal access token authentication: | ||
|
|
||
| ``` | ||
| USE_THRIFT_CLIENT=false FAKE_SERVICE_TEST_MODE=replay mvn -Dtest=*IntegrationTests test | ||
| AuthMech=3;UID=token;PWD=<your_token> | ||
| ``` | ||
|
|
||
| To run tests in either `RECORD` or `DRY` mode, set a personal access token in the <u>`DATABRICKS_TOKEN`</u> environment | ||
| variable. | ||
| #### OAuth2 Authentication | ||
|
|
||
| Use `AuthMech=11` for OAuth2-based authentication. Several OAuth flows are supported: | ||
|
|
||
| ##### Token Passthrough | ||
|
|
||
| Direct use of an existing OAuth token: | ||
|
|
||
| ``` | ||
| AuthMech=11;Auth_Flow=0;Auth_AccessToken=<your_access_token> | ||
| ``` | ||
|
|
||
| ##### OAuth Client Credentials (Machine-to-Machine) | ||
|
|
||
| Configure standard OAuth client credentials flow: | ||
|
|
||
| Running [execution](./src/test/java/com/databricks/jdbc/integration/fakeservice/tests/ExecutionIntegrationTests.java) | ||
| tests in `RECORD` mode using `SQL_EXEC`: | ||
| ``` | ||
| DATABRICKS_TOKEN=<personal-access-token> USE_THRIFT_CLIENT=false FAKE_SERVICE_TEST_MODE=record mvn -Dtest=com.databricks.jdbc.integration.fakeservice.tests.ExecutionIntegrationTests test | ||
| AuthMech=11;Auth_Flow=1;OAuth2ClientId=<client_id>;OAuth2Secret=<client_secret> | ||
| ``` | ||
| This will replace the recorded responses with the new responses from the production services. | ||
|
|
||
| ## Logging | ||
|
|
||
| The driver supports both [SLF4J](https://www.slf4j.org/) and [JUL](https://docs.oracle.com/javase/8/docs/api/java/util/logging/package-summary.html) logging frameworks. | ||
|
|
||
| - __SLF4J__: SLF4J logging can be enabled by setting the system property `-Dcom.databricks.jdbc.loggerImpl=SLF4JLOGGER`. | ||
| Customers need to provide the SLF4J binding implementation and corresponding configuration file in the classpath. | ||
| The intention is to give freedom to customers to adapt the JDBC logging as per their needs. | ||
| Example of using SLF4J with Log4j2; dependencies and configuration in `pom.xml` and `log4j2.xml` respectively: | ||
|
|
||
| ``` | ||
| <dependency> | ||
| <groupId>org.apache.logging.log4j</groupId> | ||
| <artifactId>log4j-slf4j2-impl</artifactId> | ||
| <version>${log4j.version}</version> | ||
| </dependency> | ||
| <dependency> | ||
| <groupId>org.apache.logging.log4j</groupId> | ||
| <artifactId>log4j-core</artifactId> | ||
| <version>${log4j.version}</version> | ||
| </dependency> | ||
| <dependency> | ||
| <groupId>org.apache.logging.log4j</groupId> | ||
| <artifactId>log4j-api</artifactId> | ||
| <version>${log4j.version}</version> | ||
| </dependency> | ||
| ``` | ||
|
|
||
| ``` | ||
| <?xml version="1.0" encoding="UTF-8"?> | ||
| <Configuration status="WARN"> | ||
| <Appenders> | ||
| <!-- Console appender for default logging --> | ||
| <Console name="Console" target="SYSTEM_OUT"> | ||
| <PatternLayout pattern="%d{yyyy-MM-dd HH:mm:ss} %-5level %logger{36} - %msg%n"/> | ||
| </Console> | ||
| </Appenders> | ||
|
|
||
| <Loggers> | ||
| <!-- Root logger to catch any logs that don't match other loggers --> | ||
| <Root level="info"> | ||
| <AppenderRef ref="Console"/> | ||
| </Root> | ||
| </Loggers> | ||
| </Configuration> | ||
| ``` | ||
|
|
||
| - __Java Util Logging (JUL)__: JUL logging can be enabled by setting the system property | ||
| `-Dcom.databricks.jdbc.loggerImpl=JDKLOGGER`. By default, JDBC driver uses the JUL logging framework. The intention is | ||
| to provide an out-of-the-box logging implementation without dependencies external to the JDK. There are two ways to | ||
| configure JUL logging in the JDBC driver: | ||
| - __JDBC URL__: Standard logging parameters, namely `logLevel`, `logPath`, `logFileSize` (MB), and `logFileCount`, can | ||
| be passed in the JDBC URL. Example: | ||
|
|
||
| ``` | ||
| jdbc:databricks://your-databricks-host:443;transportMode=http;ssl=1;AuthMech=3;httpPath=/sql/1.0/warehouses/your-warehouse-id;UID=token;logLevel=DEBUG;logPath=/path/to/dir;logFileSize=10;logFileCount=5 | ||
| ``` | ||
|
|
||
| - __Configuration File__: The logging properties can also be set in a `logging.properties` file. The file should be | ||
| present in the classpath. Example: | ||
|
|
||
| ``` | ||
| handlers=java.util.logging.FileHandler, java.util.logging.ConsoleHandler | ||
| .level=INFO | ||
| java.util.logging.FileHandler.level=ALL | ||
| java.util.logging.FileHandler.pattern=/path/to/dir/databricks-jdbc.log | ||
| java.util.logging.FileHandler.limit=10000000 | ||
| java.util.logging.FileHandler.count=5 | ||
| java.util.logging.FileHandler.formatter=java.util.logging.SimpleFormatter | ||
| java.util.logging.ConsoleHandler.level=ALL | ||
| java.util.logging.ConsoleHandler.formatter=java.util.logging.SimpleFormatter | ||
| ``` | ||
|
|
||
| Optional parameters: | ||
| - `AzureTenantId`: Azure tenant ID for Azure Databricks (default: null). If enabled, the driver will include refreshed | ||
| Azure Active Directory (AAD) Service Principal OAuth tokens with every request. | ||
|
|
||
| ##### Browser-Based OAuth | ||
|
|
||
| Interactive browser-based OAuth flow with PKCE: | ||
|
|
||
| ``` | ||
| AuthMech=11;Auth_Flow=2 | ||
| ``` | ||
|
|
||
| Optional parameters: | ||
| - `OAuth2ClientId` - Client ID for OAuth2 (default: databricks-cli) | ||
| - `OAuth2RedirectUrlPort` - Ports for redirect URL (default: 8020) | ||
| - `EnableOIDCDiscovery` - Enable OIDC discovery (default: 1) | ||
| - `OAuthDiscoveryURL` - OIDC discovery endpoint (default: /oidc/.well-known/oauth-authorization-server) | ||
|
|
||
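The authentication fragments listed above differ only in a few properties, which the following Java sketch makes explicit. The class name and the `safe` check are assumptions added for illustration: a value containing `;` would be split into a separate property, so the sketch rejects it. Whether the driver itself validates values this way is not stated in the README.

```java
/** Hypothetical helper that produces the auth fragments shown in this section. */
final class AuthFragmentSketch {
  private AuthFragmentSketch() {}

  /** Rejects values that would break the semicolon-separated property list. */
  private static String safe(String value) {
    if (value == null || value.isEmpty() || value.indexOf(';') >= 0) {
      throw new IllegalArgumentException("value must be non-empty and must not contain ';'");
    }
    return value;
  }

  /** Personal access token: AuthMech=3. */
  static String personalAccessToken(String token) {
    return "AuthMech=3;UID=token;PWD=" + safe(token);
  }

  /** OAuth2 token passthrough: AuthMech=11, Auth_Flow=0. */
  static String tokenPassthrough(String accessToken) {
    return "AuthMech=11;Auth_Flow=0;Auth_AccessToken=" + safe(accessToken);
  }

  /** OAuth2 client credentials (machine-to-machine): AuthMech=11, Auth_Flow=1. */
  static String clientCredentials(String clientId, String clientSecret) {
    return "AuthMech=11;Auth_Flow=1;OAuth2ClientId=" + safe(clientId)
        + ";OAuth2Secret=" + safe(clientSecret);
  }

  /** Browser-based OAuth2: AuthMech=11, Auth_Flow=2, no credentials in the URL. */
  static String browser() {
    return "AuthMech=11;Auth_Flow=2";
  }
}
```

Each fragment is appended to the `jdbc:databricks://<host>:<port>;...;httpPath=<path>` prefix with a `;` separator.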
| ### Logging | ||
|
|
||
| The driver supports both SLF4J and Java Util Logging (JUL) frameworks: | ||
|
|
||
| - **SLF4J**: Enable with `-Dcom.databricks.jdbc.loggerImpl=SLF4JLOGGER` | ||
| - **JUL**: Enable with `-Dcom.databricks.jdbc.loggerImpl=JDKLOGGER` (default) | ||
|
|
||
| For detailed logging configuration options, see [Logging Documentation](./docs/logging.md). | ||
|
|
||
| ## Running Tests | ||
|
Collaborator
We can also add JVM property for nio under using the driver section
Collaborator
Author
done
Collaborator
will this cover running fake service tests also?
Collaborator
Author
For fake service tests, I have added a separate detailed page in docs folder |
||
|
|
||
| Basic test execution: | ||
|
|
||
| ```bash | ||
| mvn test | ||
| ``` | ||
|
|
||
| **Note**: Due to a change in JDK 16 that introduced a compatibility issue with the Apache Arrow library used by the JDBC | ||
| driver, runtime errors may occur when using the JDBC driver with JDK 16 or later. To avoid these errors, restart your | ||
| application or driver with the following JVM command option: | ||
|
|
||
| ``` | ||
| --add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED | ||
| ``` | ||
|
|
||
| For more detailed information about integration tests and fake services, see [Testing Documentation](./docs/testing.md). | ||
|
|
||
| ## Documentation | ||
|
|
||
| For more information, see the following resources: | ||
| - [Integration Tests Guide](./docs/testing.md) | ||
| - [Logging Configuration](./docs/logging.md) | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,94 @@ | ||
| # Logging Configuration | ||
|
|
||
| The Databricks JDBC driver supports both [SLF4J](https://www.slf4j.org/) and [JUL](https://docs.oracle.com/javase/8/docs/api/java/util/logging/package-summary.html) logging frameworks. | ||
|
|
||
| ## SLF4J Logging | ||
|
|
||
| SLF4J logging can be enabled by setting the system property: | ||
| ``` | ||
| -Dcom.databricks.jdbc.loggerImpl=SLF4JLOGGER | ||
| ``` | ||
|
|
||
| You need to provide an SLF4J binding implementation and corresponding configuration file in the classpath. This gives you the freedom to adapt the JDBC logging to your specific needs. | ||
|
|
||
| ### Example: Using SLF4J with Log4j2 | ||
|
|
||
| Add the following dependencies to your `pom.xml`: | ||
|
|
||
| ```xml | ||
| <dependency> | ||
| <groupId>org.apache.logging.log4j</groupId> | ||
| <artifactId>log4j-slf4j2-impl</artifactId> | ||
| <version>${log4j.version}</version> | ||
| </dependency> | ||
| <dependency> | ||
| <groupId>org.apache.logging.log4j</groupId> | ||
| <artifactId>log4j-core</artifactId> | ||
| <version>${log4j.version}</version> | ||
| </dependency> | ||
| <dependency> | ||
| <groupId>org.apache.logging.log4j</groupId> | ||
| <artifactId>log4j-api</artifactId> | ||
| <version>${log4j.version}</version> | ||
| </dependency> | ||
| ``` | ||
|
|
||
| Create a `log4j2.xml` configuration file: | ||
|
|
||
| ```xml | ||
| <?xml version="1.0" encoding="UTF-8"?> | ||
| <Configuration status="WARN"> | ||
| <Appenders> | ||
| <!-- Console appender for default logging --> | ||
| <Console name="Console" target="SYSTEM_OUT"> | ||
| <PatternLayout pattern="%d{yyyy-MM-dd HH:mm:ss} %-5level %logger{36} - %msg%n"/> | ||
| </Console> | ||
| </Appenders> | ||
|
|
||
| <Loggers> | ||
| <!-- Root logger to catch any logs that don't match other loggers --> | ||
| <Root level="info"> | ||
| <AppenderRef ref="Console"/> | ||
| </Root> | ||
| </Loggers> | ||
| </Configuration> | ||
| ``` | ||
|
|
||
| ## Java Util Logging (JUL) | ||
|
|
||
| JUL logging is enabled by default, or can be explicitly set with: | ||
| ``` | ||
| -Dcom.databricks.jdbc.loggerImpl=JDKLOGGER | ||
| ``` | ||
|
|
||
| There are two ways to configure JUL logging: | ||
|
|
||
| ### 1. JDBC URL Parameters | ||
|
|
||
| Standard logging parameters can be passed in the JDBC URL: | ||
|
|
||
| ``` | ||
| jdbc:databricks://your-databricks-host:443;transportMode=http;ssl=1;AuthMech=3;httpPath=/sql/1.0/warehouses/your-warehouse-id;UID=token;logLevel=DEBUG;logPath=/path/to/dir;logFileSize=10;logFileCount=5 | ||
| ``` | ||
|
|
||
| Available parameters: | ||
| - `logLevel`: Logging level (e.g., DEBUG, INFO) | ||
| - `logPath`: Directory path for log files | ||
| - `logFileSize`: Maximum size of each log file in MB | ||
| - `logFileCount`: Maximum number of log files to keep | ||
|
|
||
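To make the structure of such a URL concrete, here is a small Java sketch that splits it into its `key=value` properties. It is not how the driver parses URLs; the class name is hypothetical, and keys are kept exactly as written, whereas the driver may treat them differently (for example, case-insensitively).

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical parser that only illustrates the URL layout. */
final class UrlPropertiesSketch {
  private UrlPropertiesSketch() {}

  /** Returns the semicolon-separated properties that follow the host:port prefix. */
  static Map<String, String> parse(String jdbcUrl) {
    Map<String, String> props = new LinkedHashMap<>();
    String[] parts = jdbcUrl.split(";");
    // parts[0] is the "jdbc:databricks://host:port" prefix; the rest are key=value pairs.
    for (int i = 1; i < parts.length; i++) {
      int eq = parts[i].indexOf('=');
      if (eq > 0) {
        props.put(parts[i].substring(0, eq), parts[i].substring(eq + 1));
      }
    }
    return props;
  }
}
```

Parsing the example URL this way yields entries such as `logLevel`, `logPath`, `logFileSize`, and `logFileCount` alongside the connection properties.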
| ### 2. Configuration File | ||
|
|
||
| Logging properties can also be set in a `logging.properties` file in the classpath: | ||
|
|
||
| ```properties | ||
| handlers=java.util.logging.FileHandler, java.util.logging.ConsoleHandler | ||
| .level=INFO | ||
| java.util.logging.FileHandler.level=ALL | ||
| java.util.logging.FileHandler.pattern=/path/to/dir/databricks-jdbc.log | ||
| java.util.logging.FileHandler.limit=10000000 | ||
| java.util.logging.FileHandler.count=5 | ||
| java.util.logging.FileHandler.formatter=java.util.logging.SimpleFormatter | ||
| java.util.logging.ConsoleHandler.level=ALL | ||
| java.util.logging.ConsoleHandler.formatter=java.util.logging.SimpleFormatter | ||
| ``` |
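For experimentation, the same kind of properties text can also be applied programmatically through the standard `java.util.logging.LogManager` API. The sketch below is plain JUL and makes no claim about how the driver locates `logging.properties`; the class name is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.logging.LogManager;

/** Hypothetical helper that feeds JUL configuration text to the LogManager. */
final class JulConfigSketch {
  private JulConfigSketch() {}

  /** Replaces the current JUL configuration with the given properties text. */
  static void apply(String propertiesText) {
    // Properties files are read as ISO-8859-1 by java.util.Properties.
    byte[] bytes = propertiesText.getBytes(StandardCharsets.ISO_8859_1);
    try {
      LogManager.getLogManager().readConfiguration(new ByteArrayInputStream(bytes));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Passing the contents of the `logging.properties` example above would configure both the file and console handlers; note that a `FileHandler` pattern must point to a writable directory.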
I think we should add our public documentation link here for more details.
done