Initial schema sync process

johnml · February 28, 2016, 1:00am

I’ve attempted to add a relatively large Postgres database (over 1.5 TB) to Metabase running on Beanstalk. I haven’t debugged exactly what happens when I initially connect the database, but I can see that it stops replication (possibly locking tables?) on the RDS Postgres read replica. I left the database connected for 45 minutes, then removed it for fear that replication lag would grow beyond recovery.

Is it possible to slow the sync down or break it into pieces so that replication can continue?

sameer · February 29, 2016, 8:40pm

Which version are you using?

For really big databases (where “really big” is determined by your own tolerance for DB load) you can turn off the in-depth analysis in the database connection detail page.

johnml · March 2, 2016, 2:43pm

version v0.14.1, built on 2016-02-03

I’ll try again leaving off the “in depth analysis”.

I was hoping for an option that would allow this to be enabled, but gradually, if that makes sense. It seems to do a lot of heavy-handed locking at the moment.

sameer · March 2, 2016, 7:21pm

Makes sense. The fundamental problem is that the in-depth analysis is pretty load intensive, and we’re trying to figure out a general way to either determine the most important tables to analyze or otherwise speed things up.

julianm · January 31, 2017, 7:12pm

I had the same feeling about in-depth analysis. It takes a lot of time and resources. This should be explained to users before they choose whether to enable it.