Freeze on startup upgrading from 31.2 to 32.5 - CNAME host name doesn't work on Aurora MySQL

Looks like in 32.x the MySQL driver was replaced with the MariaDB driver, so maybe that is related.

The Metabase DB is hosted on AWS Aurora and works fine with 31.2.

When starting up 32.5, the code freezes at the same place every time. It appears to verify the database connection but then freezes on “Running Database Migrations…” (example pasted below).

I’ve tried creating a new, empty database in Aurora and starting up Metabase 32.5 against that, and it also freezes at the exact same spot. This makes me believe this is a MariaDB driver issue.

Any ideas would be welcome.

05-08 15:53:45 INFO metabase.core :: Setting up and migrating Metabase DB. Please sit tight, this may take a minute…
05-08 15:53:45 INFO metabase.db :: Verifying mysql Database Connection …
05-08 15:53:45 INFO metabase.driver :: Initializing driver :sql…
05-08 15:53:45 DEBUG metabase.driver :: Reason: (“util$fn__13662$G__13657__13667.invoke(util.clj:277)”
“driver$initialize_if_needed_BANG_.invokeStatic(driver.clj:260)”
“driver$initialize_if_needed_BANG_.invoke(driver.clj:246)”
“driver$initialize_if_needed_BANG_.invokeStatic(driver.clj:251)”
“driver$initialize_if_needed_BANG_.invoke(driver.clj:246)”
“driver$initialize_if_needed_BANG_.invokeStatic(driver.clj:251)”
“driver$initialize_if_needed_BANG_.invoke(driver.clj:246)”
“driver$the_initialized_driver.invokeStatic(driver.clj:269)”
“driver$the_initialized_driver.invoke(driver.clj:265)”
“driver$dispatch_on_initialized_driver.invokeStatic(driver.clj:277)”
“driver$dispatch_on_initialized_driver.doInvoke(driver.clj:272)”
“driver.util$can_connect_with_details_QMARK_$fn__17971.invoke(util.clj:30)”)

05-08 15:53:45 INFO metabase.driver :: Initializing driver :sql-jdbc…
05-08 15:53:45 DEBUG metabase.driver :: Reason: (“util$fn__13662$G__13657__13667.invoke(util.clj:277)”
“driver$initialize_if_needed_BANG_.invokeStatic(driver.clj:260)”
“driver$initialize_if_needed_BANG_.invoke(driver.clj:246)”
“driver$initialize_if_needed_BANG_.invokeStatic(driver.clj:251)”
“driver$initialize_if_needed_BANG_.invoke(driver.clj:246)”
“driver$the_initialized_driver.invokeStatic(driver.clj:269)”
“driver$the_initialized_driver.invoke(driver.clj:265)”
“driver$dispatch_on_initialized_driver.invokeStatic(driver.clj:277)”
“driver$dispatch_on_initialized_driver.doInvoke(driver.clj:272)”
“driver.util$can_connect_with_details_QMARK_$fn__17971.invoke(util.clj:30)”)

05-08 15:53:45 INFO metabase.driver :: Initializing driver :mysql…
05-08 15:53:45 DEBUG metabase.driver :: Reason: (“util$fn__13662$G__13657__13667.invoke(util.clj:277)”
“driver$initialize_if_needed_BANG_.invokeStatic(driver.clj:260)”
“driver$initialize_if_needed_BANG_.invoke(driver.clj:246)”
“driver$the_initialized_driver.invokeStatic(driver.clj:269)”
“driver$the_initialized_driver.invoke(driver.clj:265)”
“driver$dispatch_on_initialized_driver.invokeStatic(driver.clj:277)”
“driver$dispatch_on_initialized_driver.doInvoke(driver.clj:272)”
“driver.util$can_connect_with_details_QMARK_$fn__17971.invoke(util.clj:30)”)

05-08 15:53:45 INFO metabase.db :: Verify Database Connection … ✅
05-08 15:53:45 INFO metabase.db :: Running Database Migrations…

Hi @debug

A search for issues involving Aurora suggests it’s not the first time there have been problems.
Does Aurora provide any logs, and if so, can you see what’s going on while it’s trying to start the migration?

Do you know if 0.32.5 would work with Aurora - ignoring the migration to begin with? You could test by just starting a new 0.32.5 and using H2 as the backend.

If it works, then you could try doing the migration manually - and hopefully some of the kinks with the new MariaDB Connector/J will be solved in later versions.
https://www.metabase.com/docs/latest/operations-guide/start.html#running-metabase-database-migrations-manually

Otherwise I would recommend that you create a new issue. There’s not much error log to go on…

“There’s not much error log to go on…”

Lol, you said it, brother. I'm super jealous of the people who get nice errors when they upgrade instead of a freeze.

Anyway: I tried starting 32.5 with H2, then against a blank local MySQL database. Those both worked fine.

I tried against a new DB in Aurora and got the same freeze at the same place. I don't really think it has to do with upgrading tables so much as 32.5 not behaving the same as 31.x when going against Aurora MySQL.

I've tried various configs, including both MB_DB_AUTOMIGRATE=true and MB_DB_AUTOMIGRATE=false. That flag didn't change anything, which supports my guess that the new MariaDB connector is just timing out trying to make any calls. I know it connects, though, because of the "Verify Database Connection" line in the log.

My next attempt is to create a fresh MySQL instance in Aurora to see if somehow it's the config of the old cluster that is causing issues.

Update: Looks like the issue might be using a CNAME to refer to the AWS Aurora database instead of the longer endpoint name.

CNAME worked in 31.x versions but for some reason it freezes in 32.x.

So… a workaround, but it’s unfortunate that we can’t use the easier-to-remember short names.
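
For anyone wanting to confirm whether the host name they configured is an alias rather than the Aurora endpoint itself, here is a minimal Python sketch. It is not taken from Metabase or the MariaDB connector; `describe_host` and the example host name are hypothetical, and the only real call used is the standard library's `socket.gethostbyname_ex`, whose result depends on how the local resolver reports canonical names.

```python
import socket


def describe_host(host, resolver=socket.gethostbyname_ex):
    """Return (canonical_name, is_alias) for a host name.

    `resolver` defaults to the standard library lookup; it is a parameter
    only so the function can be exercised without network access.
    """
    canonical, _aliases, _addresses = resolver(host)

    def normalize(name):
        # DNS names are case-insensitive and may come back with a
        # trailing dot when fully qualified.
        return name.rstrip(".").lower()

    # If the canonical name differs from what we asked for, the name we
    # configured is most likely an alias (for example a CNAME record).
    return canonical, normalize(canonical) != normalize(host)


# Example (placeholder host name, requires DNS access):
#   describe_host("db.example.internal")
```

If `is_alias` comes back true, the canonical name is a candidate to try in the MB_DB_xxx settings in place of the short name.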

Hmmm… that’s really strange. Which version of Java are you using? Of course the new driver would change how it communicates, but most of that logic is handled by Java - or so I would think.

And if it is the driver, then you should have problems connecting to CNAME Aurora as a datasource (Admin > Databases > create a new database) as well, right?

I think it’s the first time I’ve seen someone having an issue like this - no matter which backend or datasource they were using - interesting.

But really great that you’re sharing your debugging!

For anyone who gets this far, here is a great article on MariaDB drivers and AWS Aurora.

I have no idea why the CNAME doesn’t work, but it looks like as a best practice you don’t want to use one with Aurora anyway, because there are cases where the cluster host name may point to a read-only instance if the writer instance is currently failing over. The MariaDB connector apparently has some magic to protect against this.

🤥 EDIT: The stuff below is incorrect. I had part of my string misconfigured (including a password that needed a character escaped to work).

So, a new annoyance. The MB documentation says you can start up using a URI config string (“export MB_DB_CONNECTION_URI=<all-in-one-connection-string>”) instead of the separate MB_DB_xxx configs when launching metabase.jar, but this isn’t the case. It only works with the Heroku start script. (I downloaded the source and looked for that string; it was only in one place.)

I feel like this used to work. Anyway, I digress. I’ll continue to post any findings here, but in the short run I have my workaround to get the upgrade deployed.

Hi @debug
MB_DB_CONNECTION_URI should work with any hosting environment, but if you use it, do not set any of the other MB_DB_... variables, since I’m not sure which one takes precedence.
Can you post your string with sensitive information replaced?

Hmm, maybe never mind on that one. After tweaking it, I got MB_DB_CONNECTION_URI to be recognized, although the password is not being recognized. I think it’s because it has funky characters in it that need to be escaped. Sigh.

Final follow-up. Yep, the password had a character that needed to be escaped, so MB_DB_CONNECTION_URI does work as advertised.
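
For anyone hitting the same password problem, here is a minimal Python sketch of percent-encoding the credentials before they go into the connection string. It assumes the escaping needed is URI percent-encoding and that the URI has the shape `mysql://host:port/dbname?user=...&password=...`; neither detail is confirmed in this thread. `build_connection_uri` is a hypothetical helper, and the host, database name, and credentials are placeholders.

```python
from urllib.parse import quote


def build_connection_uri(host, port, dbname, user, password):
    """Build a connection URI with user and password percent-encoded.

    The URI layout is an assumption for illustration; check the Metabase
    docs for the exact format your version expects.
    """
    # safe="" makes quote() encode every reserved character, including
    # "/", "&", "@" and "#", which would otherwise break URI parsing.
    return (
        f"mysql://{host}:{port}/{dbname}"
        f"?user={quote(user, safe='')}&password={quote(password, safe='')}"
    )


# Placeholder values only.
print(build_connection_uri("db.example.com", 3306, "metabase", "mb_user", "p@ss&word/1"))
```

The printed value is what would go into MB_DB_CONNECTION_URI. Quote it in the shell as well, since characters such as & are otherwise interpreted by the shell before Metabase ever sees them.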