Hive/SparkSQL connection doesn't work

#1

Hello all.

I can’t seem to get the SparkSQL connector working. According to Querying hive from metabase using sparksql it should work with Hive, right?

My setup is all local. I have a minimal Hive server in the form of a Docker container that I can connect to and query perfectly fine with beeline -u jdbc:hive2://localhost:10000. But neither the macOS App nor the .jar want to connect to it. I did put the metabase-sparksql-deps-1.2.1.spark2-standalone.jar in the plugin folders (and set the env var respectively).

With the macOS App it seems to connect fine, but when I open the database, I can’t see any table:


I tried syncing and scanning and don’t get any errors, but it doesn’t help.

With the running .jar I get java.sql.SQLException: Could not open client transport with JDBC Uri: jdbc:hive2://localhost:10000/default: java.net.ConnectException: Connection refused (Connection refused) when trying to ‘Save’. I’ve also had “Server error encountered” before, but I can’t replicate that. I’ll paste the concole output below.

All components are the most recent version, at least I hope. Downloaded both .jars yesterday, updated the macOS App yesterday and today.

Now what do I do? Where can I find additional logs (other than in the ‘Console’ app, which doesn’t contain anything useful)?

Thanks in advance!
-Elisa

$ java -jar metabase.jar
05-14 14:33:10 INFO metabase.util :: Loading Metabase...
05-14 14:33:10 INFO metabase.util :: Maximum memory available to JVM: 3.6 GB
05-14 14:33:14 INFO util.encryption :: Saved credentials encryption is DISABLED for this Metabase instance. 🔓
 For more information, see https://metabase.com/docs/latest/operations-guide/start.html#encrypting-your-database-connection-details-at-rest
05-14 14:33:17 INFO metabase.driver :: Registered abstract driver :sql  🚚
05-14 14:33:23 INFO metabase.core :: Starting Metabase in STANDALONE mode
05-14 14:33:23 INFO metabase.server :: Launching Embedded Jetty Webserver with config:
 {:port 3000}

05-14 14:33:24 INFO metabase.core :: Starting Metabase version v0.32.7 (e309f28 release-0.32.x) ...
05-14 14:33:24 INFO metabase.core :: System timezone is 'Europe/Berlin' ...
05-14 14:33:24 INFO metabase.plugins :: Loading plugins in /Users/elisae/themenmanagement/metabase/plugins...
05-14 14:33:24 DEBUG plugins.classloader :: Setting current thread context classloader to NEWLY CREATED classloader clojure.lang.DynamicClassLoader@1a445b5...
05-14 14:33:24 INFO plugins.classloader :: Added URL file:/Users/elisae/themenmanagement/metabase/plugins/metabase-sparksql-deps-1.2.1.spark2-standalone.jar to classpath
05-14 14:33:24 INFO plugins.dependencies :: Plugin 'Metabase Google Analytics Driver' depends on plugin 'Metabase Google Drivers Shared Dependencies'
05-14 14:33:24 INFO plugins.dependencies :: Metabase Google Analytics Driver dependency {:plugin Metabase Google Drivers Shared Dependencies} satisfied? false
05-14 14:33:24 INFO plugins.dependencies :: Plugins with unsatisfied deps: ["Metabase Google Analytics Driver"]
05-14 14:33:24 INFO plugins.lazy-loaded-driver :: Registering lazy loading driver :snowflake...
05-14 14:33:24 INFO metabase.driver :: Registered abstract driver :sql-jdbc (parents: :sql) 🚚
Load driver :sql-jdbc took 52 ms
05-14 14:33:24 INFO metabase.driver :: Registered driver :snowflake (parents: #{:sql-jdbc}) 🚚
05-14 14:33:25 INFO plugins.dependencies :: Metabase cannot initialize plugin Metabase Oracle Driver due to required dependencies. Metabase requires the Oracle JDBC driver in order to connect to Oracle databases, but we can't ship it as part of Metabase due to licensing restrictions. See https://metabase.com/docs/latest/administration-guide/databases/oracle.html for more details.

05-14 14:33:25 INFO plugins.dependencies :: Metabase Oracle Driver dependency {:class oracle.jdbc.OracleDriver} satisfied? false
05-14 14:33:25 INFO plugins.dependencies :: Plugins with unsatisfied deps: ["Metabase Oracle Driver" "Metabase Google Analytics Driver"]
05-14 14:33:25 INFO plugins.lazy-loaded-driver :: Registering lazy loading driver :google...
05-14 14:33:25 INFO metabase.driver :: Registered abstract driver :google  🚚
05-14 14:33:25 INFO plugins.dependencies :: Metabase Google Analytics Driver dependency {:plugin Metabase Google Drivers Shared Dependencies} satisfied? true
05-14 14:33:25 DEBUG plugins.initialize :: Dependencies satisfied; these plugins will now be loaded: ["Metabase Google Analytics Driver"]
05-14 14:33:25 INFO plugins.lazy-loaded-driver :: Registering lazy loading driver :googleanalytics...
05-14 14:33:25 INFO metabase.driver :: Registered driver :googleanalytics (parents: #{:google}) 🚚
05-14 14:33:25 INFO plugins.lazy-loaded-driver :: Registering lazy loading driver :mongo...
05-14 14:33:25 INFO metabase.driver :: Registered driver :mongo  🚚
05-14 14:33:25 INFO plugins.lazy-loaded-driver :: Registering lazy loading driver :druid...
05-14 14:33:25 INFO metabase.driver :: Registered driver :druid  🚚
05-14 14:33:25 INFO plugins.lazy-loaded-driver :: Registering lazy loading driver :presto...
05-14 14:33:25 INFO metabase.driver :: Registered driver :presto (parents: #{:sql}) 🚚
05-14 14:33:25 INFO plugins.lazy-loaded-driver :: Registering lazy loading driver :redshift...
05-14 14:33:25 INFO metabase.driver :: Registered driver :postgres (parents: :sql-jdbc) 🚚
Load driver :postgres took 729 ms
05-14 14:33:25 INFO metabase.driver :: Registered driver :redshift (parents: #{:postgres}) 🚚
05-14 14:33:25 INFO plugins.dependencies :: Metabase cannot initialize plugin Metabase Vertica Driver due to required dependencies. Metabase requires the Vertica JDBC driver in order to connect to Vertica databases, but we can't ship it as part of Metabase due to licensing restrictions. See https://metabase.com/docs/latest/administration-guide/databases/vertica.html for more details.

05-14 14:33:25 INFO plugins.dependencies :: Metabase Vertica Driver dependency {:class com.vertica.jdbc.Driver} satisfied? false
05-14 14:33:25 INFO plugins.dependencies :: Plugins with unsatisfied deps: ["Metabase Oracle Driver" "Metabase Vertica Driver"]
05-14 14:33:25 INFO plugins.lazy-loaded-driver :: Registering lazy loading driver :sqlite...
05-14 14:33:25 INFO metabase.driver :: Registered driver :sqlite (parents: #{:sql-jdbc}) 🚚
05-14 14:33:25 INFO plugins.lazy-loaded-driver :: Registering lazy loading driver :sqlserver...
05-14 14:33:25 INFO metabase.driver :: Registered driver :sqlserver (parents: #{:sql-jdbc}) 🚚
05-14 14:33:26 INFO plugins.dependencies :: Plugin 'Metabase BigQuery Driver' depends on plugin 'Metabase Google Drivers Shared Dependencies'
05-14 14:33:26 INFO plugins.dependencies :: Metabase BigQuery Driver dependency {:plugin Metabase Google Drivers Shared Dependencies} satisfied? true
05-14 14:33:26 INFO plugins.lazy-loaded-driver :: Registering lazy loading driver :bigquery...
05-14 14:33:26 INFO metabase.driver :: Registered driver :bigquery (parents: #{:sql :google}) 🚚
05-14 14:33:26 INFO plugins.lazy-loaded-driver :: Registering lazy loading driver :hive-like...
05-14 14:33:26 INFO metabase.driver :: Registered abstract driver :hive-like (parents: #{:sql-jdbc}) 🚚
05-14 14:33:26 INFO plugins.lazy-loaded-driver :: Registering lazy loading driver :sparksql...
05-14 14:33:26 INFO metabase.driver :: Registered driver :sparksql (parents: #{:hive-like}) 🚚
05-14 14:33:26 INFO metabase.driver :: Registered driver :mysql (parents: :sql-jdbc) 🚚
Load driver :mysql took 17 ms
05-14 14:33:26 INFO metabase.driver :: Registered driver :h2 (parents: :sql-jdbc) 🚚
Load driver :h2 took 22 ms
05-14 14:33:26 INFO metabase.core :: Setting up and migrating Metabase DB. Please sit tight, this may take a minute...
05-14 14:33:26 WARN metabase.db :: WARNING: Using Metabase with an H2 application database is not recommended for production deployments. For production deployments, we highly recommend using Postgres, MySQL, or MariaDB instead. If you decide to continue to use H2, please be sure to back up the database file regularly. See https://metabase.com/docs/latest/operations-guide/start.html#migrating-from-using-the-h2-database-to-mysql-or-postgres for more information.
05-14 14:33:26 INFO metabase.db :: Verifying h2 Database Connection ...
05-14 14:33:26 INFO metabase.driver :: Initializing driver :sql...
05-14 14:33:26 INFO metabase.driver :: Initializing driver :sql-jdbc...
05-14 14:33:26 INFO metabase.driver :: Initializing driver :h2...
05-14 14:33:26 INFO metabase.db :: Verify Database Connection ...  ✅
05-14 14:33:26 INFO metabase.db :: Running Database Migrations...
05-14 14:33:26 INFO metabase.db :: Setting up Liquibase...
05-14 14:33:27 INFO metabase.db :: Liquibase is ready.
05-14 14:33:27 INFO metabase.db :: Checking if Database has unrun migrations...
05-14 14:33:28 INFO metabase.db :: Database Migrations Current ...  ✅
05-14 14:33:28 INFO db.migrations :: Running all necessary data migrations, this may take a minute.
05-14 14:33:28 INFO db.migrations :: Finished running data migrations.
05-14 14:33:28 INFO metabase.events :: Starting events listener: metabase.events.notifications 👂
05-14 14:33:28 INFO metabase.events :: Starting events listener: metabase.events.metabot-lifecycle 👂
05-14 14:33:28 INFO metabase.events :: Starting events listener: metabase.events.dependencies 👂
05-14 14:33:28 INFO metabase.events :: Starting events listener: metabase.events.activity-feed 👂
05-14 14:33:28 INFO metabase.events :: Starting events listener: metabase.events.last-login 👂
05-14 14:33:28 INFO metabase.events :: Starting events listener: metabase.events.revision 👂
05-14 14:33:28 INFO metabase.events :: Starting events listener: metabase.events.sync-database 👂
05-14 14:33:28 INFO metabase.events :: Starting events listener: metabase.events.view-log 👂
05-14 14:33:28 INFO metabase.events :: Starting events listener: metabase.events.driver-notifications 👂
05-14 14:33:28 INFO metabase.task :: Initializing task SyncDatabases 📆
05-14 14:33:28 INFO task.send-pulses :: Sending scheduled pulses...
05-14 14:33:28 INFO task.send-pulses :: Sending scheduled pulses...
05-14 14:33:28 INFO task.send-pulses :: Sending scheduled pulses...
05-14 14:33:28 INFO task.send-pulses :: Sending scheduled pulses...
05-14 14:33:29 INFO task.send-pulses :: Sending scheduled pulses...
05-14 14:33:29 DEBUG sync.util :: Sync operations in flight: {:sync-metadata #{1}}
05-14 14:33:29 INFO sync.util :: STARTING: Sync metadata for h2 Database 1 'Sample Dataset'
05-14 14:33:29 DEBUG sync.util :: STARTING: step 'sync-timezone' for h2 Database 1 'Sample Dataset'
05-14 14:33:29 INFO metabase.task :: Initializing task CheckForNewVersions 📆
05-14 14:33:29 INFO metabase.task :: Initializing task SendAnonymousUsageStats 📆
05-14 14:33:29 INFO metabase.task :: Initializing task SendAbandomentEmails 📆
05-14 14:33:29 INFO metabase.task :: Initializing task SendPulses 📆
05-14 14:33:29 INFO metabase.task :: Initializing task SendFollowUpEmails 📆
05-14 14:33:29 INFO metabase.task :: Initializing task TaskHistoryCleanup 📆
05-14 14:33:29 INFO metabase.core :: Looks like this is a new installation ... preparing setup wizard
05-14 14:33:29 INFO metabase.core :: Please use the following URL to setup your Metabase installation:

http://localhost:3000/setup/
```
#2

Hi @elisae
I don’t know Spark/Hive, but what do you see in the log, when you try to force sync of the database? Admin Settings > Databases
Maybe you’re seeing something similar to this issue?
https://github.com/metabase/metabase/issues/7630 - there’s work being done to try to fix the issue in PR 9889

#3

Hi @flamber, thanks for responding.
Well, where do I find these logs? With the .jar, there is the console output, yes. But I only get to the point where I can force sync with the macOS App (the .jar doesn’t even establish a connection). Where does the App log things?

#4

Okay, that should not be the case unless you’re using different versions or very outdated Java.
Which OS and Java are you running the JAR on?

If you’re using 0.32+, then you have a log viewer in Admin Settings > Troubleshooting > Log.
If it is a Debian-like system, then you can follow this guide to setup Metabase as a service and either log output to a file or syslog:
https://metabase.com/docs/latest/operations-guide/running-metabase-on-debian.html

#5

macOS 10.13.3 and Java 1.8.0_201. Thanks for the pointer! Now this is the output when I click “Sync database schema now”:


May 15 09:34:33 DEBUG metabase.middleware.log :: GET /api/database 200 5 ms (3 DB calls) Jetty threads: 4/50 (5 idle, 0 queued) (47 total active threads) Queries in flight: 0 
May 15 09:34:33 DEBUG metabase.middleware.log :: GET /api/automagic-dashboards/database/5/candidates 200 10 ms (6 DB calls) Jetty threads: 4/50 (5 idle, 0 queued) (47 total active threads) Queries in flight: 0 
May 15 09:34:35 DEBUG metabase.middleware.log :: GET /api/database/10/schemas 200 4 ms (4 DB calls) Jetty threads: 4/50 (5 idle, 0 queued) (47 total active threads) Queries in flight: 0 
May 15 09:34:35 DEBUG metabase.middleware.log :: GET /api/database/10/metadata 200 16 ms (4 DB calls) Jetty threads: 4/50 (5 idle, 0 queued) (47 total active threads) Queries in flight: 0 
May 15 09:34:39 DEBUG metabase.middleware.log :: GET /api/session/properties 200 4 ms (1 DB calls) Jetty threads: 5/50 (4 idle, 0 queued) (47 total active threads) Queries in flight: 0 
May 15 09:34:39 DEBUG metabase.middleware.log :: GET /api/setup/admin_checklist 200 16 ms (10 DB calls) Jetty threads: 5/50 (4 idle, 0 queued) (47 total active threads) Queries in flight: 0 
May 15 09:34:39 DEBUG metabase.middleware.log :: GET /api/setting 200 512 µs (0 DB calls) Jetty threads: 4/50 (4 idle, 0 queued) (49 total active threads) Queries in flight: 0 
May 15 09:34:42 DEBUG metabase.middleware.log :: GET /api/database/10 200 8 ms (3 DB calls) Jetty threads: 4/50 (4 idle, 0 queued) (49 total active threads) Queries in flight: 0 
May 15 09:34:45 DEBUG metabase.sync.util :: Sync operations in flight: {:sync-metadata #{10}} 
May 15 09:34:45 DEBUG metabase.middleware.log :: POST /api/database/10/sync_schema 200 1 ms (1 DB calls) Jetty threads: 4/50 (4 idle, 0 queued) (49 total active threads) Queries in flight: 0 
May 15 09:34:45 INFO metabase.sync.util :: STARTING: Sync metadata for sparksql Database 10 'Hive' 
May 15 09:34:45 DEBUG metabase.sync.util :: STARTING: step 'sync-timezone' for sparksql Database 10 'Hive' 
May 15 09:34:45 DEBUG metabase.sync.util :: FINISHED: step 'sync-timezone' for sparksql Database 10 'Hive' (665 µs) 
May 15 09:34:45 DEBUG metabase.sync.util :: STARTING: step 'sync-tables' for sparksql Database 10 'Hive' 
May 15 09:34:45 ERROR metabase.sync.util :: Error running sync step: Output of db-metadata does not match schema: 
  
{:tables #{{:name (not (instance? java.lang.String nil))}}}    


("--> sync.fetch_metadata$fn__45913$db_metadata__45918.invoke(fetch_metadata.clj:10)" 
"sync.sync_metadata.tables$fn__47381$db_metadata__47386$fn__47387.invoke(tables.clj:144)" 
"sync.sync_metadata.tables$fn__47381$db_metadata__47386.invoke(tables.clj:141)" 
"sync.sync_metadata.tables$fn__47442$sync_tables_BANG___47447$fn__47448.invoke(tables.clj:161)" 
"sync.sync_metadata.tables$fn__47442$sync_tables_BANG___47447.invoke(tables.clj:156)" 
"sync.util$fn__37761$run_step_with_metadata__37766$fn__37770$fn__37772.invoke(util.clj:356)" 
"sync.util$with_start_and_finish_logging_SINGLEQUOTE_.invokeStatic(util.clj:104)" 
"sync.util$with_start_and_finish_logging_SINGLEQUOTE_.invoke(util.clj:98)" 
"sync.util$with_start_and_finish_debug_logging.invokeStatic(util.clj:121)" 
"sync.util$with_start_and_finish_debug_logging.invoke(util.clj:118)" 
"sync.util$fn__37761$run_step_with_metadata__37766$fn__37770.invoke(util.clj:353)" 
"sync.util$fn__37761$run_step_with_metadata__37766.invoke(util.clj:348)" 
"sync.util$fn__37953$run_sync_operation__37958$fn__37959$fn__37960.invoke(util.clj:441)" 
"sync.util$fn__37953$run_sync_operation__37958$fn__37959.invoke(util.clj:441)" 
"sync.util$fn__37953$run_sync_operation__37958.invoke(util.clj:435)" 
"sync.sync_metadata$fn__47495$sync_db_metadata_BANG___47500$fn__47501$fn__47502.invoke(sync_metadata.clj:51)" 
"sync.util$do_with_error_handling.invokeStatic(util.clj:149)" 
"sync.util$do_with_error_handling.invoke(util.clj:144)" 
"sync.util$do_with_error_handling.invokeStatic(util.clj:147)" 
"sync.util$do_with_error_handling.invoke(util.clj:144)" 
"driver$fn__18010.invokeStatic(driver.clj:646)" 
"driver$fn__18010.invoke(driver.clj:646)" 
"sync.util$sync_in_context$fn__37667.invoke(util.clj:140)" 
"sync.util$with_db_logging_disabled$fn__37664.invoke(util.clj:131)" 
"sync.util$with_start_and_finish_logging_SINGLEQUOTE_.invokeStatic(util.clj:104)" 
"sync.util$with_start_and_finish_logging_SINGLEQUOTE_.invoke(util.clj:98)" 
"sync.util$with_start_and_finish_logging$fn__37653.invoke(util.clj:116)" 
"sync.util$with_sync_events$fn__37648.invoke(util.clj:90)" 
"sync.util$with_duplicate_ops_prevented$fn__37639.invoke(util.clj:69)" 
"sync.util$do_sync_operation.invokeStatic(util.clj:168)" 
"sync.util$do_sync_operation.invoke(util.clj:165)" 
"sync.sync_metadata$fn__47495$sync_db_metadata_BANG___47500$fn__47501.invoke(sync_metadata.clj:50)" 
"sync.sync_metadata$fn__47495$sync_db_metadata_BANG___47500.invoke(sync_metadata.clj:47)" 
"api.database$fn__48163$fn__48164.invoke(database.clj:511)")  
May 15 09:34:45 INFO metabase.sync.util :: FINISHED: Sync metadata for sparksql Database 10 'Hive' (257 ms) 
May 15 09:34:45 DEBUG metabase.sync.util :: Sync operations in flight: {:analyze #{10}} 
May 15 09:34:45 INFO metabase.sync.util :: STARTING: Analyze data for sparksql Database 10 'Hive' 
May 15 09:34:45 DEBUG metabase.sync.util :: STARTING: step 'fingerprint-fields' for sparksql Database 10 'Hive' 
May 15 09:34:45 DEBUG metabase.sync.util :: FINISHED: step 'fingerprint-fields' for sparksql Database 10 'Hive' (365 µs) 
May 15 09:34:45 DEBUG metabase.sync.util :: STARTING: step 'classify-fields' for sparksql Database 10 'Hive' 
May 15 09:34:45 DEBUG metabase.sync.util :: FINISHED: step 'classify-fields' for sparksql Database 10 'Hive' (175 µs) 
May 15 09:34:45 DEBUG metabase.sync.util :: STARTING: step 'classify-tables' for sparksql Database 10 'Hive' 
May 15 09:34:45 DEBUG metabase.sync.util :: FINISHED: step 'classify-tables' for sparksql Database 10 'Hive' (180 µs) 
May 15 09:34:45 DEBUG metabase.sync.util ::  
################################################################# 
# Completed analyze on Hive 
# Start: 2019-05-15T07:34:45.402Z 
# End: 2019-05-15T07:34:45.405Z 
# Duration: 3 ms 
# --------------------------------------------------------------- 
# Completed step 'fingerprint-fields' 
# Start: 2019-05-15T07:34:45.402Z 
# End: 2019-05-15T07:34:45.403Z 
# Duration: 1000 µs 
# Fingerprint updates attempted null, updated null, no data found null, failed null 
# --------------------------------------------------------------- 
# Completed step 'classify-fields' 
# Start: 2019-05-15T07:34:45.403Z 
# End: 2019-05-15T07:34:45.404Z 
# Duration: 1000 µs 
# Total number of fields classified 0, 0 failed 
# --------------------------------------------------------------- 
# Completed step 'classify-tables' 
# Start: 2019-05-15T07:34:45.404Z 
# End: 2019-05-15T07:34:45.405Z 
# Duration: 1000 µs 
# Total number of tables classified 0, 0 updated 
#################################################################
 
May 15 09:34:45 INFO metabase.sync.util :: FINISHED: Analyze data for sparksql Database 10 'Hive' (12 ms)
#6

The error seems look very similar to issue #7630, so I would think it will be fixed once PR 9889 is applied (most likely version 0.33)

#7

To be honest, I’m not quite sure if it’s the same error. My table (there is only one) doesn’t have any partitions; no line of the describe table output starts with #. (PR 9215 also tried to resolve this, btw.)
But okay, we’ll see :wink: