I have been trying for a few days to add a Databricks connection as a Spark SQL one, but with no success. Has anyone done it already?
I am using this tutorial from the Databricks documentation, but with no success. What I notice is that the connection driver protocol is jdbc:spark instead of jdbc:hive2, which seems to be the one the Metabase driver uses.
Databricks also provides a JAR with its own driver. Should I look for a Metabase driver linked to this? Or should I be able to connect using Spark SQL?
It’s the first time I’ve seen someone talk about Databricks support, so I’m not sure how you should connect - or if you even can.
If Databricks is up for creating a driver that works with Metabase, then I think that will be your best chance of getting an integration.
There are 99 other databases that support has been requested for, and most will likely never be supported unless someone else creates and maintains a driver.
Have you tried contacting Databricks to figure out if they have any knowledge about connecting with Metabase (probably using the current Spark driver)?
Thanks for the quick answer @flamber. I will try to contact Databricks support to see if I get any help from there.
Do you have any other tips for a data warehouse that works with Metabase, preferably open source?
I’m not sure, but currently Metabase “only” supports 15 drivers - where Postgres and MySQL are some of the best supported.
There are a few other drivers, which are maintained by other people - have a look at the link I provided in a previous comment to see if one of the requests has a work-in-progress driver or is merely comments.
I’ve only tested the basic connection myself. The tricky part was making sure the JDBC flags and UID were correct. The UID/username is literally token, and the JDBC flags can be copied from the Databricks cluster’s JDBC info page.
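As a rough sketch, a connection string in the jdbc:spark format mentioned above looks something like the following - the hostname and HTTP path here are placeholders, not real values; copy the real ones from your cluster’s JDBC info page, and use a personal access token as the password:

```
jdbc:spark://example-workspace.cloud.databricks.com:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/0/0000-000000-example0;AuthMech=3;UID=token;PWD=<personal-access-token>
```

Note that with AuthMech=3 the username really is the literal string token, as mentioned above.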
Currently that driver only works with v0.32.x and v0.33.x - it has not been updated to work with v0.34.x. Here’s a Dockerfile based off of Metabase v0.33.6 that adds the plugin to the relevant location in the container:
FROM metabase/metabase:v0.33.6
ADD --chown=2000:2000 https://github.com/ifood/metabase-sparksql-databricks-driver/releases/download/1.0.0/sparksql-databricks.metabase-driver.jar /plugins/
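If it helps, this is roughly how you’d build and run that image - the image and container names below are just examples, not anything Metabase-specific:

```shell
# Build an image from the Dockerfile above (run in the directory containing it),
# then start a container exposing Metabase on port 3000.
docker build -t metabase-databricks .
docker run -d -p 3000:3000 --name metabase metabase-databricks
```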
@dacort,
I have followed the same link https://github.com/ifood/metabase-sparksql-databricks-driver, downloaded the jar sparksql-databricks.metabase-driver.jar, and placed it in the …/plugins/ directory.
The Metabase version we are using is v0.33.1, and unfortunately we were not able to establish a connection. When all the fields are filled in and I click save, this is the error message:
[Simba]SparkJDBCDriver Error setting/closing session: Open Session Error
To resolve the above error we downloaded the Simba custom jar SparkJDBC41.jar and placed it in /home/ubuntu/apps/metabase/plugins, but no luck - the error remains the same.
Please let me know if there is a way I can resolve this.
Hi @chandan
Try https://github.com/fhsgoncalves/metabase-sparksql-databricks-driver which is the author’s own personal GitHub account - updates will likely only happen there.
Metabase 0.33.1 was a broken build, so I would highly recommend 0.33.7.3 (or perhaps the latest 0.34.2)
Where did you download the Simba dependency from? It should be simba-spark-jdbc41-2.6.3.1003.jar, and it should be downloaded automatically - but perhaps it couldn’t be saved to ./plugins/ because of permissions?
Thanks @flamber. Could you please provide me a link to download simba-spark-jdbc41-2.6.3.1003.jar? I have downloaded it from a different link, and I just want to know whether I have to place that jar in ./plugins/ or in a different directory to resolve the error.
@chandan I actually haven’t played with Databricks yet, but dependencies are also placed in ./plugins/. I would recommend not renaming any of the files, since otherwise the driver won’t be able to find them.
And then check the log on startup to make sure it loads correctly (or gives errors) during the driver loading process.
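For example - assuming a Docker deployment where Metabase logs to stdout, and with "metabase" as a placeholder container name - something like this should show whether the plugin loaded or errored:

```shell
# Look for the driver being registered (or failing) in the startup log.
# "metabase" is a placeholder container name; for a plain JAR deployment,
# grep the console output or log file instead.
docker logs metabase 2>&1 | grep -iE "plugin|sparksql"
```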
@flamber,
I have followed the same steps as you mentioned:
1. Installed Metabase version 0.33.7.3
2. Downloaded the jars sparksql-databricks.metabase-driver.jar and simba-spark-jdbc41-2.6.3.1003.jar and placed them in the ./plugins/ directory
3. Restarted Metabase.
But it still displays the same error while I am saving the database connection:
[Simba]SparkJDBCDriver Error setting/closing session: Open Session Error.
Along with this there is another Java error stack as well; I am not sure if it is related to the Spark SQL Databricks connection.
Caused by: java.lang.RuntimeException: Unexpected error: java.security.InvalidAlgorithmParameterException: the trustAnchors parameter must be non-empty
at java.base/sun.security.validator.PKIXValidator.<init>(PKIXValidator.java:89)
at java.base/sun.security.validator.Validator.getInstance(Validator.java:181)
at java.base/sun.security.ssl.X509TrustManagerImpl.getValidator(X509TrustManagerImpl.java:300)
at java.base/sun.security.ssl.X509TrustManagerImpl.checkTrustedInit(X509TrustManagerImpl.java:176)
at java.base/sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:189)
at java.base/sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:110)
at com.simba.spark.hivecommon.utils.DSTrustManager.checkServerTrusted(Unknown Source)
at java.base/sun.security.ssl.AbstractTrustManagerWrapper.checkServerTrusted(SSLContextImpl.java:1510)
at java.base/sun.security.ssl.CertificateMessage$T12CertificateConsumer.checkServerCerts(CertificateMessage.java:625)
at java.base/sun.security.ssl.CertificateMessage$T12CertificateConsumer.onCertificate(CertificateMessage.java:460)
at java.base/sun.security.ssl.CertificateMessage$T12CertificateConsumer.consume(CertificateMessage.java:360)
at java.base/sun.security.ssl.SSLHandshake.consume(SSLHandshake.java:392)
at java.base/sun.security.ssl.HandshakeContext.dispatch(HandshakeContext.java:443)
at java.base/sun.security.ssl.HandshakeContext.dispatch(HandshakeContext.java:421)
at java.base/sun.security.ssl.TransportContext.dispatch(TransportContext.java:177)
at java.base/sun.security.ssl.SSLTransport.decode(SSLTransport.java:164)
at java.base/sun.security.ssl.SSLSocketImpl.decode(SSLSocketImpl.java:1152)
at java.base/sun.security.ssl.SSLSocketImpl.readHandshakeRecord(SSLSocketImpl.java:1063)
at java.base/sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:402)
@dacort, @flamber - let me know if any additional changes need to be made to fix this connection issue.
@flamber, thank you so much - we were able to resolve this issue and can establish the connection now.
Along with the steps I mentioned earlier, there are a few additional steps that need to be followed.
To resolve the Java error described above, please follow the link below (follow the verified answer).
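For reference, the “trustAnchors parameter must be non-empty” error generally means the JVM’s cacerts truststore is empty or missing. A commonly suggested fix - assuming a Debian/Ubuntu host with an OpenJDK package, so adjust for your distribution - is to regenerate the truststore:

```shell
# Regenerate the Java truststore that the JVM reads its trusted CAs from.
# Assumes a Debian/Ubuntu host with the ca-certificates-java package available.
sudo apt-get install --reinstall ca-certificates-java
sudo update-ca-certificates -f
# Restart Metabase afterwards so the JVM picks up the regenerated truststore.
```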
One week trying to figure this error out, but I couldn’t.
It doesn’t seem to be a driver error, since when I run it on my MacBook everything works perfectly.
I tried running Metabase in Kubernetes and an Azure VM, and this error occurs in both.
When I shut down my Databricks cluster and try to query it using Metabase, different things happen based on where I’m running.
On my local machine, the query wakes up my Databricks cluster and finishes with success.
On Azure, the query fails with 503, wakes up my Databricks cluster, and then the next query gives me the error described by @chandan.