Metabase DB cleanup?


#1

Hi,

We are using metabase with the sqlite backed data store (metabase.mv.db etc). The size of the file has consistently grown (we upgraded it several times, etc). metabase.mv.db has grown to 800MB in size, and the interface feels like it got slower with time.
We don’t have that much in our metabase (about 100 saved queries, 3 data source configurations) so we don’t understand why the size of the database is so big. Is there a cleanup operation that can help us?


#2

Interested on this. My database file is around 1.2GB, and, although I don’t feel the slowness, that is a lot of space to be taken from an EC2 server.


#3

Hey, the reason the application DB is so large is probably because the log of query executions has grown to be pretty big. We’re discussing how we can rotate this out in the future but in the meantime you can delete what’s currently the query_execution log if you don’t need it for auditing purposes. (I’m guessing you don’t if it’s a single-user instance you’re running locally)

Get an H2 shell like this:

# H2 automatically adds .h2.db or .mv.db suffix so do not specify it
java -cp metabase.jar org.h2.tools.Shell -url jdbc:h2:./metabase.db -user '' -password '' -driver org.h2.Driver

Or when running with Leiningen, we have an alias for that command:

lein h2

Then at the SQL prompt:

sql> TRUNCATE TABLE query_execution;

That will delete all the rows in the query execution log which should bring the database back to a reasonable size.

That should be all you need to do. With Postgres or MySQL you can connect to the database using psql or mysql and run the same command


#4

Interesting. I run Metabase through the ELB distribution which uses docker internally. I can’t open the database because its in use by the process, but if I try to kill the metabase process the docker VM restarts, starting it again. Is there a clean way to do this operation?

(Not a docker expert)


#5

May those operations be performed on the fly, or should I restart my instance in order to achive that?


#6

@Noam you’re running Metabase via Docker/ELB using an H2 database? Does that even work? When you restart doesn’t it wipe your database?

We definitely recommend sticking to Postgres when running on ELB. You can check out our guide for migrating here: https://metabase.com/docs/latest/operations-guide/start.html#migrating-from-using-the-h2-database-to-mysql-or-postgres

@lucas.lima with Postgres there’s no need to stop your Metabase instance. With H2 you’d have to stop it because the file can’t be opened by multiple processes at the same time


#7

Hi,

This is also an in issue in my organization. The metabase.db.trace.db file has grown to 40GB and if it continues to grow at the current rate it will become unusable. Truncating the query_execution table doesn’t seem to have changed anything - are there any other optimizations I could perform? (running using H2)


#8

Hi @eamonn

Besides recommending you to change from H2 to another database?

You should probably find out which table is large in size. You can check it with this function:
http://h2database.com/html/functions.html#disk_space_used
I haven’t tried this, but it should generate a list of all tables and sizes:

select table_name, DISK_SPACE_USED(table_name) disk_size
from INFORMATION_SCHEMA.TABLES 
where table_schema = 'PUBLIC' and table_type = 'TABLE';

#9

Hi @flamber,

Thanks for the query, no table in particular seems to be unusually large which is why we are so confused.

I’ve tried switching to postgres but there’s already a postgres DB on the server for apache airflow and the 2 won’t play together. Right now I’m waiting on approval to move metabase off it’s current server and ideally I want to migrate to run on Elastic Beanstalk.

I’d love to know what the root cause is, I’ve used metabase in a few different organizations and this is the first time the problem has occurred for me.


#10

@eamonn

I’m not sure, because I ran screaming away from h2 after my first crash/corruption :slight_smile:

You could use MariaDB/MySQL as a temporary solution, since it’s still better than h2.
Or if you’re running Docker, then you can run Postgres on a different port.

Have you tried to compact or defragment on shutdown?

sql > SHUTDOWN COMPACT;
sql > SHUTDOWN DEFRAG;

Could be an index problem, which you can re-build:
http://www.h2database.com/html/features.html#compacting