Hey guys,
We've been happy with Metabase, using it for around 5 years in total now!
Metabase was very slow after our .43 upgrade, and has now become completely unusable with .44. I know it's probably as simple as tuning some pools, but we cannot figure this out! We have 15 semi-active users and 15 dashboards around the office. We're hosted on DigitalOcean and are using a massive managed db to host both the data and Metabase's schema.
- We run Metabase behind haproxy to terminate TLS. Timeouts are listed below
- .43 seems to have taken a far more asynchronous programming model, and this has led to problems stacking up rather than load shedding
- Running a jstack on the frozen server just shows a ton of waits at pools
- CPU on the box is idling
- Metabase is not using its full Xmx value of 3g
- JDK17
- The MB_APPLICATION_DB_MAX_CONNECTION_POOL_SIZE pool seems to get all of its connections used. So we set it to an absurd 128, but eventually all of the connections in the pool still get used and the app crawls to a halt.
/etc/default/metabase:
MB_ANON_TRACKING_ENABLED=false
MB_ASYNC_QUERY_THREAD_POOL_SIZE=16
MB_JDBC_DATA_WAREHOUSE_MAX_CONNECTION_POOL_SIZE=16
MB_APPLICATION_DB_MAX_CONNECTION_POOL_SIZE=128
MB_JETTY_ASYNC_RESPONSE_TIMEOUT=600000
/etc/haproxy/haproxy.cfg [excerpt]
timeout connect 5000
timeout server 610000
timeout server 610000
timeout tunnel 610000
frontend www-https
bind :443 ssl crt /etc/ssl/private/metabase.xyz.com.pem alpn h2,http/1.1
http-request add-header X-Forwarded-Proto https
default_backend metabase-backend
backend metabase-backend
option httpchk OPTIONS /metabase.core/initialized? HTTP/1.0
server metabase-workhorse 127.0.0.1:3000 #check disabled for right now
Most of the threads are locked up in this state:
"qtp1019098496-137" #137 prio=5 os_prio=0 cpu=1.48ms elapsed=26.68s tid=0x00007f70880ecf30 nid=0x55a0 in Object.wait() [0x00007f704c765000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(java.base@17.0.3/Native Method)
- waiting on
at com.mchange.v2.resourcepool.BasicResourcePool.awaitAvailable(BasicResourcePool.java:1503)
at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:644)
- locked <0x000000074760a0e8> (a com.mchange.v2.resourcepool.BasicResourcePool)
at com.mchange.v2.resourcepool.BasicResourcePool.checkoutResource(BasicResourcePool.java:554)
at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutAndMarkConnectionInUse(C3P0PooledConnectionPool.java:758)
at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutPooledConnection(C3P0PooledConnectionPool.java:685)
at com.mchange.v2.c3p0.impl.AbstractPoolBackedDataSource.getConnection(AbstractPoolBackedDataSource.java:140)
at metabase.db.setup$reify__35566.getConnection(setup.clj:168)
at clojure.java.jdbc$get_connection.invokeStatic(jdbc.clj:372)
at clojure.java.jdbc$get_connection.invoke(jdbc.clj:274)
at clojure.java.jdbc$db_query_with_resultset_STAR_.invokeStatic(jdbc.clj:1111)
at clojure.java.jdbc$db_query_with_resultset_STAR_.invoke(jdbc.clj:1093)
at clojure.java.jdbc$query.invokeStatic(jdbc.clj:1182)
at clojure.java.jdbc$query.invoke(jdbc.clj:1144)
at clojure.java.jdbc$query.invokeStatic(jdbc.clj:1160)
at clojure.java.jdbc$query.invoke(jdbc.clj:1144)
at metabase.server.middleware.session$current_user_info_for_session.invokeStatic(session.clj:245)
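To put a number on "most of the threads", something like the sketch below could count how many threads in a saved jstack dump are parked in that same pool wait. PoolWaitCounter and its method names are made up for illustration; the only thing taken from the real dump is the awaitAvailable frame name, and it assumes thread header lines start with a double quote, as in the trace above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper for reading a saved jstack dump; not part of Metabase or the JDK.
final class PoolWaitCounter {
    // Frame copied from the stack trace above: the c3p0 pool waiting for a free connection.
    static final String POOL_WAIT_FRAME =
            "com.mchange.v2.resourcepool.BasicResourcePool.awaitAvailable";

    // Thread header lines in a jstack dump start with the quoted thread name.
    static long countThreads(List<String> dumpLines) {
        return dumpLines.stream().filter(line -> line.startsWith("\"")).count();
    }

    // Each blocked thread has one awaitAvailable frame, so frames equal waiting threads.
    static long countPoolWaiters(List<String> dumpLines) {
        return dumpLines.stream().filter(line -> line.contains(POOL_WAIT_FRAME)).count();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java PoolWaitCounter.java <jstack-dump.txt>");
            return;
        }
        List<String> lines = Files.readAllLines(Path.of(args[0]));
        System.out.println(countPoolWaiters(lines) + " of " + countThreads(lines)
                + " threads are waiting for a pooled connection");
    }
}
```

Usage would be saving the output of jstack for the Metabase process to a file and passing that file as the only argument.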
Is there a setting to help with load shedding? Just setting a 5s max wait on the datasource would allow the server to kick some errors back to users and help us diagnose the problem.
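To illustrate the behavior I'm after, here is a minimal stand-alone sketch of a pool that gives up after a bounded wait instead of queueing forever. SheddingPool and its methods are hypothetical and use only java.util.concurrent; this is not Metabase or c3p0 code. As far as I can tell, c3p0 itself has a checkoutTimeout setting (in milliseconds) that makes getConnection() fail with an SQLException once the wait is exceeded, but I don't know whether Metabase exposes it.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a connection pool, only to show the bounded-wait idea.
final class SheddingPool {
    private final Semaphore permits;
    private final long maxWaitMillis;

    SheddingPool(int size, long maxWaitMillis) {
        this.permits = new Semaphore(size, true); // fair: longest waiter is served first
        this.maxWaitMillis = maxWaitMillis;
    }

    // Returns true if a slot was obtained within maxWaitMillis, false if the caller
    // should be shed (for example by returning an error to the user).
    boolean tryCheckout() {
        try {
            return permits.tryAcquire(maxWaitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }

    // Hands a slot back so a waiting caller can proceed.
    void checkin() {
        permits.release();
    }

    public static void main(String[] args) {
        SheddingPool pool = new SheddingPool(2, 100);
        System.out.println("first checkout:  " + pool.tryCheckout());
        System.out.println("second checkout: " + pool.tryCheckout());
        // The pool is now exhausted, so this call waits 100 ms and then gives up.
        System.out.println("third checkout:  " + pool.tryCheckout());
        pool.checkin();
        System.out.println("after a checkin: " + pool.tryCheckout());
    }
}
```

With a 5000 ms wait instead of 100 ms, a request that cannot get a connection would fail fast with a visible error rather than tying up a Jetty thread indefinitely.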
{
"browser-info": {
"language": "en-US",
"platform": "MacIntel",
"userAgent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.5 Safari/605.1.15",
"vendor": "Apple Computer, Inc."
},
"system-info": {
"file.encoding": "UTF-8",
"java.runtime.name": "OpenJDK Runtime Environment",
"java.runtime.version": "17.0.3+7",
"java.vendor": "Eclipse Adoptium",
"java.vendor.url": "https://adoptium.net/",
"java.version": "17.0.3",
"java.vm.name": "OpenJDK 64-Bit Server VM",
"java.vm.version": "17.0.3+7",
"os.name": "Linux",
"os.version": "4.15.0-189-generic",
"user.language": "en",
"user.timezone": "America/Chicago"
},
"metabase-info": {
"databases": [
"mysql",
"mongo"
],
"hosting-env": "unknown",
"application-database": "mysql",
"application-database-details": {
"database": {
"name": "MySQL",
"version": "8.0.28"
},
"jdbc-driver": {
"name": "MariaDB Connector/J",
"version": "2.7.5"
}
},
"run-mode": "prod",
"version": {
"date": "2022-08-04",
"tag": "v0.44.0",
"branch": "release-x.44.x",
"hash": "d3700f5"
},
"settings": {
"report-timezone": "US/Central"
}
}
}