Running two instances, one is taking too much memory and the other is consuming nothing

mverma · February 7, 2020, 11:27am

I am running two instances of Metabase as Docker containers, both pointing to the same metadata and each given 4 GB of heap memory. On inspecting resource utilisation, I found one instance consuming almost 4 GB of memory while the other is barely using 1 GB. Does this mean only one instance is doing the heavy lifting (sync etc.), or am I missing some configuration for running multiple instances of Metabase?
Metabase version: 0.34.1

{
 "java.runtime.name" "OpenJDK Runtime Environment",
 "java.runtime.version" "11.0.5+10",
 "java.vendor" "AdoptOpenJDK",
 "java.vendor.url" "https://adoptopenjdk.net/",
 "java.version" "11.0.5",
 "java.vm.name" "OpenJDK 64-Bit Server VM",
 "java.vm.version" "11.0.5+10",
 "os.name" "Linux",
 "os.version" "4.15.0-1031-aws",
 "user.language" "en",
 "user.timezone" "GMT"
}

flamber · February 7, 2020, 11:54am

Hi @mverma
You should check the log (Admin > Troubleshooting > Logs): it shows the ID of each instance, so you can see what each one is doing.
It could also be that the load balancer is pushing most traffic to instance1.
But sync/scan is only done by a single instance at a time - there’s no reason for all the instances to do the same sync/scan tasks.

mverma · February 7, 2020, 12:08pm

Hey
Thanks for the prompt response. Can you tell me why memory consumption is so high even when hardly 1 or 2 people are using Metabase? I am concerned because I have been getting Out of Memory exceptions consistently and have to restart the containers. I am quite sure that our queries don’t load lots of data at any time. I am just curious what exactly sync does that makes Metabase run out of memory if it doesn’t load data into memory. Below are the most common exceptions we get.

Exception in thread "MetabaseScheduler_QuartzSchedulerThread" java.lang.OutOfMemoryError: Java heap space
Exception in thread "async-dispatch-2" java.lang.OutOfMemoryError: Java heap space
Exception in thread "QuartzScheduler_MetabaseScheduler-439af98ed35e1580740511374_ClusterManager" java.lang.OutOfMemoryError: Java heap space
Exception in thread "manifold-pool-1-1" java.lang.OutOfMemoryError: Java heap space

Thanks!!
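
To check which threads are hitting the error, the exception lines above can be tallied with a short script. This is only a sketch in Python: the regular expression is fitted to the four lines quoted in this post, and grouping thread names by whether they contain "quartz" is an assumption made for illustration, not a Metabase convention.

```python
import re
from collections import Counter

# Fitted to lines of the form quoted above, e.g.
#   Exception in thread "async-dispatch-2" java.lang.OutOfMemoryError: Java heap space
OOM_LINE = re.compile(r'Exception in thread "([^"]+)" java\.lang\.OutOfMemoryError')


def thread_family(name):
    """Coarse grouping of thread names (illustrative heuristic only)."""
    return "quartz" if "quartz" in name.lower() else "other"


def tally_oom(lines):
    """Count OutOfMemoryError lines per thread family."""
    counts = Counter()
    for line in lines:
        match = OOM_LINE.search(line)
        if match:
            counts[thread_family(match.group(1))] += 1
    return counts


# The four exception lines from this post.
SAMPLE = [
    'Exception in thread "MetabaseScheduler_QuartzSchedulerThread" java.lang.OutOfMemoryError: Java heap space',
    'Exception in thread "async-dispatch-2" java.lang.OutOfMemoryError: Java heap space',
    'Exception in thread "QuartzScheduler_MetabaseScheduler-439af98ed35e1580740511374_ClusterManager" java.lang.OutOfMemoryError: Java heap space',
    'Exception in thread "manifold-pool-1-1" java.lang.OutOfMemoryError: Java heap space',
]
```

Calling `tally_oom(SAMPLE)` counts the quoted lines; feeding it the lines of a full log file instead gives a rough picture of where the errors surface. The thread that reports the error is not necessarily the one that used up the heap.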

flamber · February 7, 2020, 2:49pm

@mverma

It depends on how the two people are using Metabase - if they do a lot of big queries and/or export data, then it can quickly eat the memory.
And it also depends on the database type - some drivers consume more memory than others (partly because of design, but also bugs in those upstream drivers).

Currently Metabase does not support streaming of results/exports, but there’s work being done on that right now, which might make it in 0.35. That’s a big reason for the memory consumption currently.

What are your Java options - I’m guessing -Xmx4g ?

mverma · February 11, 2020, 6:47pm

Yes, we are using -Xmx4g, but that still doesn’t explain why a Metabase process would consume this much memory and run out of memory even when no one is using the environment.

flamber · February 11, 2020, 10:58pm

@mverma

Even when no one is using Metabase, there are still many processes running that do everything from syncing and scanning to pulses, alerts and several other things.

I have no idea how many databases you’ve connected, nor the size (tables/columns/data) of those databases, but that has an impact as well.

But like I said, Metabase currently does not stream data; once streaming is implemented, memory usage should change a lot.

Are you saying that when you restart both instances, then one instantly goes to 4GB and the other stays at 1GB?
Or does it slowly use more memory over a period of time? How long and what’s the activity during that period?
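
One way to answer these questions is to sample container memory at intervals and see whether it jumps or creeps up. The Python sketch below assumes the instances run under Docker and that the `docker stats --no-stream --format` command is available on the host; the function names and the `|` separator are made up for illustration. It reports container-level memory, which is not the same thing as JVM heap usage.

```python
import re
import subprocess

# Binary units as printed in the MemUsage column of `docker stats`.
UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}
SIZE = re.compile(r"^\s*([\d.]+)\s*([KMGT]iB|B)\s*$")


def to_bytes(text):
    """Convert a size such as '3.9GiB' to a whole number of bytes."""
    match = SIZE.match(text)
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    return int(float(match.group(1)) * UNITS[match.group(2)])


def parse_stats_line(line):
    """Parse 'name|used / limit' into (name, used_bytes)."""
    name, usage = line.split("|", 1)
    used, _limit = usage.split("/", 1)
    return name.strip(), to_bytes(used)


def sample_containers():
    """Take one memory sample per running container (requires Docker)."""
    result = subprocess.run(
        ["docker", "stats", "--no-stream", "--format", "{{.Name}}|{{.MemUsage}}"],
        capture_output=True, text=True, check=True,
    )
    return dict(parse_stats_line(l) for l in result.stdout.splitlines() if l.strip())
```

Calling `sample_containers()` every few minutes and logging the result with a timestamp would show whether one instance climbs steadily or jumps after a particular activity.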

mverma · February 12, 2020, 5:16am

Hi,
The memory builds up over a period of time, and only one instance takes up a huge amount of memory while the other uses little. I have connected 14 databases.
As you can see, the OOM errors I get are mostly related to Quartz.
Can you suggest a heap size that would be ideal for running Metabase?

Thanks!

flamber · February 12, 2020, 7:40am

@mverma
Since it slowly grows, I’m fairly sure it’s related to the lack of streaming, so you’ll most likely see better memory usage starting from 0.35.

Out of curiosity, how many active users do you have, since you’re using a multi-instance?
Or is there another reason for running a multi-instance setup?

I cannot give you a heap size that would be perfect - it depends on the setup - so I would recommend adjusting it in 2GB increments.
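
As a rough illustration of stepping the heap in 2GB increments, here is a small Python sketch that builds the candidate -Xmx flags and a matching `docker run` argument list. It assumes the container picks up JVM flags from a `JAVA_OPTS` environment variable and uses the `metabase/metabase` image name; check both against the documentation for your version. The helper names are hypothetical, and a real command would also need the application database settings, which are left out here.

```python
def heap_candidates(start_gb, max_gb, step_gb=2):
    """List -Xmx flags from start_gb up to max_gb in step_gb increments."""
    return [f"-Xmx{gb}g" for gb in range(start_gb, max_gb + 1, step_gb)]


def docker_run_args(xmx_flag, image="metabase/metabase", host_port=3000):
    """Build an argument list for `docker run` with the given heap flag.

    Assumption: the image reads JVM options from JAVA_OPTS.
    """
    return [
        "docker", "run", "-d",
        "-p", f"{host_port}:3000",
        "-e", f"JAVA_OPTS={xmx_flag}",
        image,
    ]


# Starting from the current 4 GB heap and trying up to 10 GB:
for flag in heap_candidates(4, 10):
    print(" ".join(docker_run_args(flag)))
```

Any memory limit on the container needs to be higher than the heap size, since the JVM also uses memory outside the heap.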

mverma · February 13, 2020, 7:15am

We are running multiple instances for high availability, with hardly 5-6 concurrent users.

mverma · February 14, 2020, 8:27am

One more thing I would like to point out: even after disabling sync, Metabase still tries to sync with the database.

flamber · February 14, 2020, 2:23pm

@mverma You cannot disable sync - it can be set to Hourly/Daily. Are you talking about the field scanning process?

mverma · February 17, 2020, 7:19am

I am talking about the following setting. Even after enabling it, Metabase still tries to sync the fields.

This is a large database, so let me choose when Metabase syncs and scans

By default, Metabase does a lightweight hourly sync and an intensive daily scan of field values. If you have a large database, we recommend turning this on and reviewing when and how often the field value scans happen.

flamber · February 17, 2020, 9:39am

@mverma
Just for reference: https://www.metabase.com/docs/latest/administration-guide/01-managing-databases.html#database-sync-and-analysis
You can disable the scanning, but not the sync. After enabling “This is a large database…”, you should see a “Scheduling” tab.