V46.1 timeout issues

Couple f days ago we updated from v45..3 to v46.1 and after that we are getting timeout errors while refreshing scheduled data models. Same timeout error when refreshing manually.
Where could we change timeout settings? It was working absolutely fine on older Metabase version.
I was not able to find any timeout settings in admin settings anywhere.

Thank you in advance for any help.

[13ac4ca6-6e74-4265-a91d-b73ba534abdb] 2023-04-13T08:36:52+03:00 INFO metabase.task.persist-refresh Attempting to refresh persisted model 961.
[13ac4ca6-6e74-4265-a91d-b73ba534abdb] 2023-04-13T08:36:52+03:00 INFO metabase.driver.mysql You may need to add trustServerCertificate=true to the additional connection options to connect with SSL.
[13ac4ca6-6e74-4265-a91d-b73ba534abdb] 2023-04-13T08:37:22+03:00 INFO metabase.task.persist-refresh Error refreshing persisting model with card-id 961
java.lang.Exception: Killed mysql process id 3,744,965 due to timeout.
	at metabase.driver.mysql.ddl$kill.invokeStatic(ddl.clj:43)
	at metabase.driver.mysql.ddl$kill.invoke(ddl.clj:34)
	at metabase.driver.mysql.ddl$execute_with_timeout_BANG_.invokeStatic(ddl.clj:56)
	at metabase.driver.mysql.ddl$execute_with_timeout_BANG_.invoke(ddl.clj:45)
	at metabase.driver.mysql.ddl$fn__103041.invokeStatic(ddl.clj:69)
	at metabase.driver.mysql.ddl$fn__103041.invoke(ddl.clj:62)
	at clojure.lang.MultiFn.invoke(MultiFn.java:244)
	at metabase.task.persist_refresh$reify__79259.refresh_BANG_(persist_refresh.clj:65)
	at metabase.task.persist_refresh$refresh_with_stats_BANG_$fn__79262.invoke(persist_refresh.clj:85)
	at metabase.task.persist_refresh$refresh_with_stats_BANG_.invokeStatic(persist_refresh.clj:84)
	at metabase.task.persist_refresh$refresh_with_stats_BANG_.invoke(persist_refresh.clj:69)
	at clojure.lang.AFn.applyToHelper(AFn.java:165)
	at clojure.lang.AFn.applyTo(AFn.java:144)
	at clojure.core$apply.invokeStatic(core.clj:673)
	at clojure.core$partial$fn__5914.doInvoke(core.clj:2660)
	at clojure.lang.RestFn.invoke(RestFn.java:397)
	at metabase.task.persist_refresh$save_task_history_BANG_.invokeStatic(persist_refresh.clj:108)
	at metabase.task.persist_refresh$save_task_history_BANG_.invoke(persist_refresh.clj:103)
	at metabase.task.persist_refresh$refresh_individual_BANG_.invokeStatic(persist_refresh.clj:222)
	at metabase.task.persist_refresh$refresh_individual_BANG_.invoke(persist_refresh.clj:214)
	at metabase.task.persist_refresh$refresh_job_fn_BANG_.invokeStatic(persist_refresh.clj:239)
	at metabase.task.persist_refresh$refresh_job_fn_BANG_.invoke(persist_refresh.clj:233)
	at metabase.task.persist_refresh.PersistenceRefresh.execute(persist_refresh.clj:249)
	at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
	at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:573)
[13ac4ca6-6e74-4265-a91d-b73ba534abdb] 2023-04-13T08:37:23+03:00 INFO metabase.task.persist-refresh Finished updated model-id 961 from persisted-info 41.```

Are you able to run queries against that database? If you run a simple query like select * from table limit 10 ... does it return something?

That is the point, that some of the queries runs for no more than 30 seconds, but still getting timeout error. Those run without any issues in Metabase without trying to rebuilt cache.

I am not sure i understand. What i am asking is if queries to that particular DB are running or not form metabase ... To exclude if it's something coming from that model or simply a DB connection issue

Queries running without issues.
Only using Data Model for some of them (which could be run as SQL question in ~30 seconds) I am getting timeouts.

I doubt changing this value will make any difference:

Since you are saying it timesout after 30 sec ...and there are no settings at the DB level which will close connections longer than that?

That is definitely not related to any DB settings, since simple SQL questions could be run without such an issue.
There might be some changes either in Data model behavior or some settings for it in the background.
As I said: before upgrade we did not have those issues before.

If you turn that model to SQL and try to run it directly on your Database, compare the same run with metabase how long does both take?

Same problem here. After updating Metabase to v0.46.1 we are having timeout problems when refreshing models that we did not have in previous versions.

Can you increase the timeout in MySQL? Also, how much time do those queries take when run directly in a sql client? We need more context to reproduce this

I have done this with no effect:


On the other hand, the weirdest thing is that sometimes the model is cached properly sometimes it doesn't.
We launch the caching process once a day.

The SQL query in a SqlClient takes 8.46 seconds to fetch 45K rows

We have another Model that always fails. The SQL takes 1m 22s

Can we confirm this happens only in MySQL? Anything that might be relevant in the db log or Metabase log? We need something to pull some threads from

We use MySQL.

Here is a failure log:

Hi, same problem here after upgrade to 0.46.2 from latest 0.45.x

Any info you could give us to debug?

What kind of info do you want ? We can provide some.

We have the same trace as david.s4t in our logs :

[43de1631-682e-4aad-b231-03a36fa5cbb9] 2023-05-05T10:20:36+02:00 INFO metabase.task.persist-refresh Erreur de rafraîchissement du modèle persisté avec card-id 651
java.lang.Exception: ID de processus mysql 14 011 tué en raison d'un délai d'attente.
	at metabase.driver.mysql.ddl$kill.invokeStatic(ddl.clj:43)
	at metabase.driver.mysql.ddl$kill.invoke(ddl.clj:34)
	at metabase.driver.mysql.ddl$execute_with_timeout_BANG_.invokeStatic(ddl.clj:56)
	at metabase.driver.mysql.ddl$execute_with_timeout_BANG_.invoke(ddl.clj:45)
	at metabase.driver.mysql.ddl$fn__103034.invokeStatic(ddl.clj:69)
	at metabase.driver.mysql.ddl$fn__103034.invoke(ddl.clj:62)
	at clojure.lang.MultiFn.invoke(MultiFn.java:244)
	at metabase.task.persist_refresh$reify__79265.refresh_BANG_(persist_refresh.clj:65)
	at metabase.task.persist_refresh$refresh_with_stats_BANG_$fn__79268.invoke(persist_refresh.clj:85)
	at metabase.task.persist_refresh$refresh_with_stats_BANG_.invokeStatic(persist_refresh.clj:84)
	at metabase.task.persist_refresh$refresh_with_stats_BANG_.invoke(persist_refresh.clj:69)
	at clojure.core$partial$fn__5910.invoke(core.clj:2649)
	at clojure.lang.PersistentVector.reduce(PersistentVector.java:343)
	at clojure.core$reduce.invokeStatic(core.clj:6885)
	at clojure.core$reduce.invoke(core.clj:6868)
	at metabase.task.persist_refresh$refresh_tables_BANG_$thunk__79295.invoke(persist_refresh.clj:207)
	at metabase.task.persist_refresh$save_task_history_BANG_.invokeStatic(persist_refresh.clj:108)
	at metabase.task.persist_refresh$save_task_history_BANG_.invoke(persist_refresh.clj:103)
	at metabase.task.persist_refresh$refresh_tables_BANG_.invokeStatic(persist_refresh.clj:210)
	at metabase.task.persist_refresh$refresh_tables_BANG_.invoke(persist_refresh.clj:198)
	at metabase.task.persist_refresh$refresh_job_fn_BANG_.invokeStatic(persist_refresh.clj:238)
	at metabase.task.persist_refresh$refresh_job_fn_BANG_.invoke(persist_refresh.clj:233)
	at metabase.task.persist_refresh.PersistenceRefresh.execute(persist_refresh.clj:249)
	at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
	at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:573)

We tried to desactivate cache and reactivate it, it deleted all tables in cache databases and it's unable to recreate them. We are stuck !

It looks like there is a 60s timeout, but we are not sure.

We don’t have still a way to reproduce this issue:
-which MySQL version are you using?
-which configuration does that MySQL have?
-how big is the model and how much time does the run take?

If we can’t get all the info needed to reproduce is like shooting in the dark

Hello, we rolled back to 0.45.3.1

{
  "browser-info": {
    "language": "fr-FR",
    "platform": "Linux x86_64",
    "userAgent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/113.0.0.0 Safari/537.36",
    "vendor": "Google Inc."
  },
  "system-info": {
    "file.encoding": "UTF-8",
    "java.runtime.name": "OpenJDK Runtime Environment",
    "java.runtime.version": "11.0.18+10",
    "java.vendor": "Eclipse Adoptium",
    "java.vendor.url": "https://adoptium.net/",
    "java.version": "11.0.18",
    "java.vm.name": "OpenJDK 64-Bit Server VM",
    "java.vm.version": "11.0.18+10",
    "os.name": "Linux",
    "os.version": "4.15.18-2-pve",
    "user.language": "en",
    "user.timezone": "GMT"
  },
  "metabase-info": {
    "databases": [
      "mysql"
    ],
    "hosting-env": "unknown",
    "application-database": "postgres",
    "application-database-details": {
      "database": {
        "name": "PostgreSQL",
        "version": "14.2"
      },
      "jdbc-driver": {
        "name": "PostgreSQL JDBC Driver",
        "version": "42.5.0"
      }
    },
    "run-mode": "prod",
    "version": {
      "date": "2023-03-29",
      "tag": "v0.45.3.1",
      "branch": "release-x.45.x",
      "hash": "3b3097d"
    },
    "settings": {
      "report-timezone": null
    }
  }
}

Our mysql version is 5.5.50-log.
Database is a slave from a master/slave config. Size is 100Gb, metabase is connected to the slave.
Cache database (with models) size is 1Gb.
Models take 3 to 4min to run on 0.45.3.1.

About mysql configuration, do you have specific questions ?

We have both myisam and innodb tables.

About collation :

collation_connection utf8mb4_general_ci
collation_database latin1_swedish_ci
collation_server latin1_swedish_ci

About timeouts :

connect_timeout 10
delayed_insert_timeout 300
innodb_lock_wait_timeout 50
innodb_rollback_on_timeout OFF
interactive_timeout 28800
lock_wait_timeout 31536000
net_read_timeout 30
net_write_timeout 60
slave_net_timeout 3600
wait_timeout 28800

Thanks for you help.

I reported the bug on github : Timeout issue with models in 0.46.x · Issue #30723 · metabase/metabase · GitHub