@Luiggi At the time of the CPU spike, we see middleware.cache in our logs (screenshot attached). Usually a query with caching takes milliseconds to run, but at that time it takes 3-5 minutes. We have to restart the Metabase service for it to work again. Metabase is the only process running on this server.
We also noticed this unusual error during the same period:
You have other problems there: exports take 3 minutes, and getting the collection tree takes 20 seconds.
Has anyone tested this CPU utilization issue with Metabase 0.48.5? We are still facing it. Is it possible to downgrade Metabase to an older version (we were using 0.45.4.3)?
I have found a solution to our problem. Some time ago, an error with custom columns could only be fixed by inserting a "+0" into the calculations. If you remove the "+0" from the calculation, it runs quickly again and the CPU is no longer overloaded.
DO NOT go to that version; at least upgrade to 47.12.
Can you elaborate on this fix? Where do you make this change, and is there any reference to the previous problem?
I am also facing Metabase instability.
With 0.48.3 it was a high CPU issue, with the same symptoms that adinamarca describes (the UI becomes unresponsive and slow, then goes down).
Then I updated to version 0.48.5 and the problem still occurs with the same symptoms, but the CPU is not loaded to 100%; there is only high memory usage (possibly a memory leak?).
With both versions, Metabase stays alive for about 10 days, then crashes (usually when memory usage reaches 5.1 GB).
Today I will try updating to v0.48.6, but I don't see any related fixes in the changelog.
How are you running Metabase? Are you setting the XMS and XMX variables? Please check "How to run Metabase in production".
I have been using Metabase since version 0.36.0.
It uses a single MariaDB, and I have not set the XMS and XMX variables. It's the same environment as with 0.47.x, where no issues were detected, and there has been no increase in the number of users of my Metabase. All the problems started with v0.48.x.
Does v0.48.6 work effectively?
My server rebooted due to a Windows update on Tuesday (Feb 20), so now I have to wait another 8 days to see what happens. I've also started collecting CPU/RAM metrics from my server, so I'll have some record, similar to @rpataro.
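For anyone else logging memory while waiting for the problem to reproduce, here is a minimal sketch of how such a record could be used to estimate how fast memory is growing and roughly when it would reach the level where the crash was reported. Everything in it is an assumption for illustration: the function names `fit_growth` and `hours_until` are hypothetical, the sample values are made up (not real measurements from this thread), and 5.1 GB is approximated as 5100 MB. It fits a plain least-squares line, which only makes sense if the growth is roughly linear.

```python
from datetime import datetime


def fit_growth(samples):
    """Least-squares line through (timestamp, memory_mb) samples.

    Returns (slope_mb_per_hour, intercept_mb), with time measured in hours
    since the first sample.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    t0 = samples[0][0]
    xs = [(ts - t0).total_seconds() / 3600 for ts, _ in samples]
    ys = [mb for _, mb in samples]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all samples share the same timestamp")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def hours_until(samples, threshold_mb):
    """Hours after the last sample until the fitted line reaches threshold_mb.

    Returns None when memory is flat or shrinking, and 0.0 when the fitted
    line has already passed the threshold.
    """
    slope, intercept = fit_growth(samples)
    if slope <= 0:
        return None
    elapsed = (samples[-1][0] - samples[0][0]).total_seconds() / 3600
    return max((threshold_mb - intercept) / slope - elapsed, 0.0)


# Made-up daily readings in MB, starting from the reboot date mentioned above.
samples = [
    (datetime(2024, 2, 20), 1000.0),
    (datetime(2024, 2, 21), 1400.0),
    (datetime(2024, 2, 22), 1800.0),
    (datetime(2024, 2, 23), 2200.0),
    (datetime(2024, 2, 24), 2600.0),
]

slope, _ = fit_growth(samples)
remaining = hours_until(samples, threshold_mb=5100.0)
print(f"growth: {slope:.1f} MB/hour, threshold in about {remaining:.0f} hours")
```

A steady positive slope over several days would support the memory leak theory, while a curve that flattens out would point more towards a heap that is simply sized too small for the workload.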