Metabase Upgrade Failure from 0.49 to 0.56.15 || Migrations are Failing

We are upgrading Metabase from 0.49 to 0.56.15 and are facing migration failures.
Metabase is deployed via the Docker image on Kubernetes.

metabase-clone-db=> SELECT
COUNT(*) as total_cards,
COUNT(view_count) as cards_with_view_count,
MAX(view_count) as max_views
FROM report_card;
 total_cards | cards_with_view_count | max_views
-------------+-----------------------+-----------
        8373 |                  8373 |         0

metabase-clone-db=> SELECT id, dateexecuted, exectype, description
FROM databasechangelog
WHERE dateexecuted > '2025-12-16'
ORDER BY orderexecuted DESC
LIMIT 20;
id | dateexecuted | exectype | description
-------------------------+----------------------------+----------+------------------------------------------------------------------------------------------
v50.2024-04-25T16:29:31 | 2025-12-16 10:49:09.307685 | EXECUTED | addColumn tableName=report_card
v50.2024-04-25T03:15:02 | 2025-12-16 10:49:09.299983 | EXECUTED | addColumn tableName=permissions_group
v50.2024-04-25T03:15:01 | 2025-12-16 10:49:09.287129 | EXECUTED | addColumn tableName=core_user
v50.2024-04-25T01:04:07 | 2025-12-16 10:49:09.268389 | EXECUTED | customChange
v50.2024-04-25T01:04:06 | 2025-12-16 10:49:09.261826 | EXECUTED | customChange
v50.2024-04-25T01:04:05 | 2025-12-16 10:49:09.254468 | EXECUTED | customChange
v50.2024-04-19T17:04:04 | 2025-12-16 10:49:09.134644 | EXECUTED | sql
v50.2024-04-15T16:30:35 | 2025-12-16 10:49:09.122894 | EXECUTED | addColumn tableName=report_card
v50.2024-04-09T15:55:19 | 2025-12-16 10:49:09.11504 | EXECUTED | addColumn tableName=collection
v50.2024-03-29T10:00:00 | 2025-12-16 10:49:09.107881 | EXECUTED | addColumn tableName=report_card
v50.2024-03-28T16:30:35 | 2025-12-16 10:49:09.100721 | EXECUTED | customChange
v50.2024-03-25T14:53:00 | 2025-12-16 10:49:08.960762 | EXECUTED | addColumn tableName=query_field
v50.2024-03-24T19:34:11 | 2025-12-16 10:49:08.950676 | EXECUTED | sql
v50.2024-03-22T00:39:28 | 2025-12-16 10:49:08.922222 | EXECUTED | createIndex indexName=idx_field_usage_field_id, tableName=field_usage
v50.2024-03-22T00:38:28 | 2025-12-16 10:49:08.91196 | EXECUTED | createTable tableName=field_usage
v50.2024-03-21T17:41:00 | 2025-12-16 10:49:08.889908 | EXECUTED | addColumn tableName=metabase_table
v50.2024-03-18T16:00:01 | 2025-12-16 10:49:08.881131 | EXECUTED | addUniqueConstraint constraintName=idx_cache_config_unique_model, tableName=cache_config
v50.2024-03-18T16:00:00 | 2025-12-16 10:49:08.869095 | EXECUTED | createTable tableName=cache_config
v50.2024-03-12T17:16:38 | 2025-12-16 10:49:08.838433 | EXECUTED | dropTable tableName=activity
v50.2024-02-29T15:08:43 | 2025-12-16 10:49:08.816209 | EXECUTED | createIndex indexName=idx_query_field_field_id, tableName=query_field

metabase-clone-db=> EXPLAIN ANALYZE

SELECT c.id, COUNT(*)
FROM report_card c
LEFT JOIN view_log v ON v.model = 'card' AND v.model_id = c.id
GROUP BY c.id
LIMIT 100;
QUERY PLAN
 Limit  (cost=0.72..366588.74 rows=100 width=12) (actual time=13.071..12929.520 rows=100 loops=1)
   ->  GroupAggregate  (cost=0.72..30694415.83 rows=8373 width=12) (actual time=13.070..12929.458 rows=100 loops=1)
         Group Key: c.id
         ->  Merge Left Join  (cost=0.72..30573086.33 rows=24249155 width=4) (actual time=1.123..12924.726 rows=23319 loops=1)
               Merge Cond: (c.id = v.model_id)
               ->  Index Only Scan using report_card_pkey on report_card c  (cost=0.29..321.88 rows=8373 width=4) (actual time=0.007..0.142 rows=101 loops=1)
                     Heap Fetches: 4
               ->  Index Scan using idx_view_log_model_id on view_log v  (cost=0.44..30269629.08 rows=24249155 width=4) (actual time=0.573..12918.858 rows=23359 loops=1)
                     Filter: ((model)::text = 'card'::text)
                     Rows Removed by Filter: 300340
 Planning Time: 9.489 ms
 Execution Time: 12929.712 ms
(12 rows)

metabase-clone-db=> SELECT COUNT(*) FROM view_log WHERE model = 'card';
  count
----------
 24209123

	at metabase.app_db.core$setup_db_BANG_.doInvoke(core.clj:105)
	at clojure.lang.RestFn.invoke(RestFn.java:424)
	at metabase.core.core$init_BANG__STAR_.invokeStatic(core.clj:173)
	at metabase.core.core$init_BANG__STAR_.invoke(core.clj:151)
	at metabase.core.core$init_BANG_.invokeStatic(core.clj:229)
	at metabase.core.core$init_BANG_.invoke(core.clj:224)
	at metabase.core.core$start_normally.invokeStatic(core.clj:243)
	at metabase.core.core$start_normally.invoke(core.clj:235)
	at metabase.core.core$entrypoint.invokeStatic(core.clj:278)
	at metabase.core.core$entrypoint.doInvoke(core.clj:269)
	at clojure.lang.RestFn.invoke(RestFn.java:400)
	at clojure.lang.AFn.applyToHelper(AFn.java:152)
	at clojure.lang.RestFn.applyTo(RestFn.java:135)
	at clojure.lang.Var.applyTo(Var.java:707)
	at clojure.core$apply.invokeStatic(core.clj:667)
	at clojure.core$apply.invoke(core.clj:662)
	at metabase.core.bootstrap$_main.invokeStatic(bootstrap.clj:36)
	at metabase.core.bootstrap$_main.doInvoke(bootstrap.clj:29)
	at clojure.lang.RestFn.invoke(RestFn.java:400)
	at clojure.lang.AFn.applyToHelper(AFn.java:152)
	at clojure.lang.RestFn.applyTo(RestFn.java:135)
	at metabase.core.bootstrap.main(Unknown Source)
2025-12-17 06:34:10,035 INFO core.core :: Metabase Shutting Down ...
2025-12-17 06:34:10,036 INFO server.instance :: Shutting Down Embedded Jetty Webserver
2025-12-17 06:34:10,046 INFO notification.send :: Shutting down notification dispatchers... {mb-dispatcher-count=2}
2025-12-17 06:34:10,051 INFO notification.send :: Starting notification thread pool with 3 threads {mb-dispatcher-count=2}
2025-12-17 06:34:10,053 INFO notification.send :: Gracefully shutting down notification dispatcher with 0 pending notifications to process {mb-dispatcher-count=2}
2025-12-17 06:34:11,055 INFO notification.send :: Notification worker shut down successfully {mb-dispatcher-count=2}
2025-12-17 06:34:11,055 INFO notification.send :: Starting notification thread pool with 5 threads {mb-dispatcher-count=2}
2025-12-17 06:34:11,056 INFO notification.send :: Gracefully shutting down notification dispatcher with 0 pending notifications to process {mb-dispatcher-count=2}
2025-12-17 06:34:12,057 INFO notification.send :: Notification worker shut down successfully {mb-dispatcher-count=2}
2025-12-17 06:34:12,057 INFO notification.send :: All notification workers shut down successfully {mb-dispatcher-count=2}
2025-12-17 06:34:12,059 WARN app-db.liquibase :: ()
2025-12-17 06:34:12,060 INFO core.core :: Metabase Shutdown COMPLETE

Let me know if you need any additional details.

You seem to have also posted this on GitHub; I’m closing that one.

My ticket was marked as Not Planned, so I’m adding it here for reference.

Any update on this?

@Jay1

Can you clarify what your question is? A number of your issues were addressed in the GitHub issue.

I think I’m with the devs on this: just truncate view_log, especially if you are on Metabase OSS, where that data is no longer used.

@dwhitemv My question is that we are upgrading Metabase from 0.49 to 0.56.15 and the upgrade failed.
The logs show that an update statement failed; I added those details above.

On GitHub we were advised to increase the memory, since only 3GB is allocated to the pod and 1GB to the JVM.

We are running on K8s, so the pod is also crashing out due to the health check.

So my questions are:
1. Will increasing memory suffice and fix the problem of the update taking too long, so the migration can complete?
2. In prod, do we need to truncate view_log?
3. Or should we do both and then restart?

Increasing the JVM heap won’t hurt the migration process (unless you make it TOO big and the kernel OOM-kills the JVM). And yes, I would truncate view_log, as it is deprecated in Metabase OSS. (You can truncate audit_log too while you’re at it.)
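If you go the truncation route, a minimal sketch against the Metabase application database would look like this (assuming direct psql access; take a backup first, since TRUNCATE is not reversible):

```sql
-- Run against the Metabase app DB, NOT the warehouse.
-- Back up the database before running; TRUNCATE cannot be undone.
BEGIN;
TRUNCATE TABLE view_log;
TRUNCATE TABLE audit_log;
COMMIT;
```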

The migration time is controlled mostly by how long the migration queries take to execute. Particularly in the 0.49 timeframe there is the Great Permission Rectification, which takes quadratic time scaling with the number of tables and the number of users/groups. Once you get past that, it’s mostly table creates/alters and index builds, which are limited by database server resources.

You MUST disable health checks during upgrades, otherwise k8s will kill your pod during migration and damage your app database.
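As an illustration (names and values are placeholders, not your actual manifest), either delete the probes for the upgrade run, or give them enough grace for migrations to finish:

```yaml
# Sketch only: loosen the probe on the Metabase container during the upgrade
# so k8s does not restart the pod mid-migration.
livenessProbe:
  httpGet:
    path: /api/health      # Metabase's built-in health endpoint
    port: 3000
  initialDelaySeconds: 3600  # generous grace period while migrations run
  failureThreshold: 10
```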

@dwhitemv So I am thinking of increasing the pod memory to 4-5 GB and the JVM to 3GB, as per Metabase Upgrade Failure from 0.49 to 0.56.15 || Migrations are Failing · Issue #67065 · metabase/metabase · GitHub.
Do I need to add JAVA_TOOL_OPTIONS=-Xmx3500m in K8s for this?

I will also disable the health check.

And as you suggested, I will check if we can truncate the view_log and audit_log tables; after that I will restart the pod and we’ll see.

Please let me know if any changes are needed here.

I would back the heap down to -Xmx3000m; you need to leave ~1GB for JVM overhead and some for the OS. Otherwise it looks good.
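As a sketch, the container spec might look like this (assuming a standard Deployment; names are illustrative):

```yaml
containers:
  - name: metabase
    image: metabase/metabase:v0.56.15
    env:
      - name: JAVA_TOOL_OPTIONS
        value: "-Xmx3000m"   # leave ~1GB for JVM overhead plus room for the OS
    resources:
      limits:
        memory: 5Gi          # pod memory per the sizing discussed above
```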

But overall we should have 5GB of pod memory, correct? 3GB for the JVM and the rest for everything else.

Seems fine to me.

Hey, I truncated the view_log table and then went ahead with the upgrade, and it went well. But since we are using ClickHouse, the connections in ClickHouse suddenly spiked up to 600. Any idea why?
Is this because we have multiple dashboards with UUID columns (no semantic type), and in this new update Metabase changed those to the Category type?

I can’t see how table metadata would affect connection counts. Do you see the extra connections in the Metabase log? When queries are active/completing it will say how many connections to the target database are active.

The clickhouse driver in general has some stability issues… it’s gone through good and bad periods.

Will this Environment variables | Metabase Documentation setting help if set to true?
Because as soon as the service is scaled up, connections keep increasing even though we don’t run any queries.

You probably don’t want to disable the scheduler, it will break lots of things in Metabase.

If your users are creating email/slack notifications, those notifications run queries to generate their output.

Can you go into Clickhouse and see what those connections are doing?
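One way to do that, sketched against ClickHouse’s system tables (names per stock ClickHouse; adjust if your version differs):

```sql
-- Currently executing queries: who opened them and how long they've run.
SELECT query_id, user, elapsed, query
FROM system.processes;

-- Connection-related gauges (TCPConnection, HTTPConnection, ...).
SELECT metric, value
FROM system.metrics
WHERE metric LIKE '%Connection%';
```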

And did you check the Metabase log and see how many connections Metabase thinks are active to that database? You’re looking for lines like this (this API call is running the query for a dashboard card):

2026-01-28 08:25:36,325 DEBUG middleware.log :: POST /api/dashboard/6/dashcard/130/card/114/query 202 [ASYNC: completed] 10811ms (21 DB calls) App DB connections: 1/15 Jetty threads: 4/50 (2 idle, 0 queued) (172 total active threads) Queries in flight: 0 (0 queued); postgres DB 2 connections: 2/6 (0 threads blocked) {:metabase-user-id $$} 

Look for the part that says clickhouse DB ## connections: XXX/YYY. It will change depending on what Metabase is doing. ## will be the Metabase database ID (visible in Admin → Databases by looking at the URL for the database link). You might also see the user ID number ($$) that initiated the request.
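As a sketch, that figure can be pulled out of the log with a grep; the sample line below is copied from the example above, and in practice you would pipe in something like `kubectl logs deploy/<your-metabase-deployment>` instead (the deployment name is hypothetical):

```shell
# Extract the per-warehouse connection-pool figures from a Metabase log line.
line='POST /api/dashboard/6/dashcard/130/card/114/query 202 ... postgres DB 2 connections: 2/6 (0 threads blocked)'
echo "$line" | grep -oE '[A-Za-z]+ DB [0-9]+ connections: [0-9]+/[0-9]+'
```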

Is your Clickhouse install on-prem or cloud?

ClickHouse is self-hosted on a GCP VM.
We are not using any email/Slack notifications.

So whenever I scale up the Metabase service, connections spike in ClickHouse, and I have to scale it down immediately to avoid other prod failures.

Will test again and share the logs here

@dwhitemv
We retried again; nothing major in the logs either. We have also tried disabling the scan.

2026-02-04 10:56:01,251 DEBUG middleware.log :: POST /api/dashboard/1392/dashcard/8240/card/7795/query 202 [ASYNC: completed] 59ms (12 DB calls) App DB connections: 2/14 Jetty threads: 3/50 (5 idle, 0 queued) (107 total active threads) Queries in flight: 1 (0 queued) {:metabase-user-id 548}
2026-02-04 10:56:04,760 DEBUG middleware.log :: GET /api/activity/popular_items 200 38313ms (8 DB calls) App DB connections: 0/14 Jetty threads: 3/50 (5 idle, 0 queued) (107 total active threads) Queries in flight: 0 (0 queued) {:metabase-user-id 548}
2026-02-04 10:56:08,354 INFO middleware.cache :: Query 725c8a04 took 3.1 s to run; minimum for cache eligibility is 10.0 ms; eligible
2026-02-04 10:56:08,355 INFO middleware.cache :: Caching results for next time for query with hash "725c8a04". :floppy_disk:
2026-02-04 10:56:27,027 DEBUG middleware.log :: GET /api/cache 200 4ms (1 DB calls) App DB connections: 1/14 Jetty threads: 3/50 (6 idle, 0 queued) (109 total active threads) Queries in flight: 1 (0 queued) {:metabase-user-id 619}
2026-02-04 10:56:28,387 DEBUG middleware.log :: GET /api/user-key-value/namespace/user_acknowledgement/key/upsell-dev_instances 204 2ms (1 DB calls) App DB connections: 1/14 Jetty threads: 3/50 (6 idle, 0 queued) (109 total active threads) Queries in flight: 1 (0 queued) {:metabase-user-id 619}
2026-02-04 10:56:28,438 DEBUG middleware.log :: GET /api/setting/version-info 200 3ms (0 DB calls) App DB connections: 1/14 Jetty threads: 4/50 (6 idle, 0 queued) (109 total active threads) Queries in flight: 1 (0 queued) {:metabase-user-id 619}
2026-02-04 10:56:28,452 DEBUG middleware.log :: GET /api/setting 200 37ms (15 DB calls) App DB connections: 0/14 Jetty threads: 4/50 (6 idle, 0 queued) (109 total active threads) Queries in flight: 1 (0 queued) {:metabase-user-id 619}
2026-02-04 10:56:31,223 DEBUG middleware.log :: GET /api/bug-reporting/details 200 6ms (1 DB calls) App DB connections: 0/14 Jetty threads: 3/50 (6 idle, 0 queued) (109 total active threads) Queries in flight: 1 (0 queued) {:metabase-user-id 619}
2026-02-04 10:56:44,619 DEBUG middleware.log :: GET /api/table/2459/query_metadata 200 43ms (7 DB calls) App DB connections: 1/14 Jetty threads: 3/50 (6 idle, 0 queued) (109 total active threads) Queries in flight: 1 (0 queued) {:metabase-user-id 619}
2026-02-04 10:58:19,035 DEBUG middleware.log :: GET /api/bug-reporting/details 200 6ms (1 DB calls) App DB connections: 2/14 Jetty threads: 3/50 (6 idle, 0 queued) (103 total active threads) Queries in flight: 1 (0 queued) {:metabase-user-id 619}
2026-02-04 10:58:34,589 WARN core.core :: Received system signal: SIGTERM
2026-02-04 10:58:34,591 INFO core.core :: Metabase Shutting Down ...
2026-02-04 10:58:34,591 INFO util.queue :: Stopping listener search-index-update...
2026-02-04 10:58:34,591 INFO util.queue :: Stopping listener search-index-update...done
2026-02-04 10:58:34,592 INFO core.QuartzScheduler :: Scheduler MetabaseScheduler_$_metabase-clone-65cbd6987c-bk4zs1770201057720 shutting down.
2026-02-04 10:58:34,592 INFO core.QuartzScheduler :: Scheduler MetabaseScheduler_$_metabase-clone-65cbd6987c-bk4zs1770201057720 paused.
2026-02-04 10:58:34,592 INFO core.QuartzScheduler :: Scheduler MetabaseScheduler_$_metabase-clone-65cbd6987c-bk4zs1770201057720 shutdown complete.
2026-02-04 10:58:34,592 INFO server.instance :: Shutting Down Embedded Jetty Webserver
2026-02-04 10:58:34,597 WARN nested.HttpChannelState :: java.lang.UnsupportedOperationException: onError while invoking onError listener metabase.server.statistics_handler.proxy$java.lang.Object$AsyncListener$caf66881@213dad7a
2026-02-04 10:58:34,605 INFO analytics.prometheus :: Prometheus web-server shut down
2026-02-04 10:58:34,610 INFO notification.send :: Shutting down notification dispatchers... {mb-dispatcher-count=2}
2026-02-04 10:58:34,613 INFO notification.send :: Starting notification thread pool with 3 threads {mb-dispatcher-count=2}
2026-02-04 10:58:34,614 INFO notification.send :: Gracefully shutting down notification dispatcher with 0 pending notifications to process {mb-dispatcher-count=2}
2026-02-04 10:58:35,615 INFO notification.send :: Notification worker shut down successfully {mb-dispatcher-count=2}
2026-02-04 10:58:35,615 INFO notification.send :: Starting notification thread pool with 5 threads {mb-dispatcher-count=2}
2026-02-04 10:58:35,616 INFO notification.send :: Gracefully shutting down notification dispatcher with 0 pending notifications to process {mb-dispatcher-count=2}
2026-02-04 10:58:36,617 INFO notification.send :: Notification worker shut down successfully {mb-dispatcher-count=2}
2026-02-04 10:58:36,617 INFO notification.send :: All notification workers shut down successfully {mb-dispatcher-count=2}
2026-02-04 10:58:36,617 INFO core.core :: Metabase Shutdown COMPLETE

I don’t know what to say; the log section you posted shows one warehouse DB query that completed quickly, and no open connections at the end of that query. I hope there isn’t a bug where the ClickHouse driver is leaking connections.

Is Metabase connecting to your Clickhouse instance through a proxy or connection manager that could be holding onto connections?

We are not using any external ClickHouse driver; it’s the one Metabase provides (CH version 22.7).

No proxy is used to connect

Metabase is installed via K8s, CH is on a VM, both under the same VPC and the same project.