Question's average execution duration very long even with caching

Hello!

I'm new to Metabase and want to create a fast-loading dashboard. However, some of my questions take 30-50 s to execute. I have tried to set a reasonable cache value to decrease the load time, but caching doesn't help, since, as stated in the logs, the query itself only takes about 1 second to run. How can I decrease the execution time? More memory, more CPU, ...?

When I change the setting from 'Group by week' to 'Group by day', the question loads within 2 seconds.

This is the log for the dashboard and some of its questions:

[4e7d4b99-92d4-4312-a1d5-683d323a5adf] 2022-10-16T17:36:52+02:00 INFO metabase.query-processor.card Question's average execution duration is 40.7 s; using 'magic' TTL of 8.1 hours 💾
[4e7d4b99-92d4-4312-a1d5-683d323a5adf] 2022-10-16T17:37:06+02:00 INFO metabase.query-processor.middleware.cache Query took 318.0 ms to run; minimum for cache eligibility is 5.0 s
[4e7d4b99-92d4-4312-a1d5-683d323a5adf] 2022-10-16T17:37:06+02:00 DEBUG metabase.server.middleware.log POST /api/card/17/query 202 [ASYNC: completed] 14.0 s (14 DB calls) App DB connections: 2/15 Jetty threads: 2/50 (5 idle, 0 queued) (148 total active threads) Queries in flight: 0 (0 queued)
[4e7d4b99-92d4-4312-a1d5-683d323a5adf] 2022-10-16T17:37:52+02:00 DEBUG metabase.server.middleware.log GET /api/dashboard/1 200 61.0 ms (18 DB calls) App DB connections: 2/15 Jetty threads: 3/50 (4 idle, 0 queued) (145 total active threads) Queries in flight: 0 (0 queued)
[4e7d4b99-92d4-4312-a1d5-683d323a5adf] 2022-10-16T17:37:52+02:00 DEBUG metabase.server.middleware.log GET /api/table/7/query_metadata 200 54.7 ms (9 DB calls) App DB connections: 1/15 Jetty threads: 4/50 (3 idle, 0 queued) (145 total active threads) Queries in flight: 0 (0 queued)
[4e7d4b99-92d4-4312-a1d5-683d323a5adf] 2022-10-16T17:37:52+02:00 INFO metabase.query-processor.card Question's average execution duration is 39.6 s; using 'magic' TTL of 7.9 hours 💾
[4e7d4b99-92d4-4312-a1d5-683d323a5adf] 2022-10-16T17:37:52+02:00 INFO metabase.query-processor.card Question's average execution duration is 2.0 s; using 'magic' TTL of 24.5 mins 💾
[4e7d4b99-92d4-4312-a1d5-683d323a5adf] 2022-10-16T17:37:52+02:00 INFO metabase.query-processor.card Question's average execution duration is 1.4 s; using 'magic' TTL of 16.2 mins 💾
[4e7d4b99-92d4-4312-a1d5-683d323a5adf] 2022-10-16T17:37:52+02:00 INFO metabase.query-processor.card Question's average execution duration is 41.8 s; using 'magic' TTL of 8.4 hours 💾
[4e7d4b99-92d4-4312-a1d5-683d323a5adf] 2022-10-16T17:37:52+02:00 INFO metabase.query-processor.card Question's average execution duration is 38.0 s; using 'magic' TTL of 7.6 hours 💾
[4e7d4b99-92d4-4312-a1d5-683d323a5adf] 2022-10-16T17:37:52+02:00 INFO metabase.query-processor.card Question's average execution duration is 1.8 s; using 'magic' TTL of 21.6 mins 💾
[4e7d4b99-92d4-4312-a1d5-683d323a5adf] 2022-10-16T17:37:54+02:00 INFO metabase.query-processor.middleware.cache Query took 1.2 s to run; minimum for cache eligibility is 5.0 s
[4e7d4b99-92d4-4312-a1d5-683d323a5adf] 2022-10-16T17:37:54+02:00 DEBUG metabase.server.middleware.log POST /api/dashboard/1/dashcard/14/card/15/query 202 [ASYNC: completed] 2.0 s (18 DB calls) App DB connections: 1/15 Jetty threads: 3/50 (5 idle, 0 queued) (163 total active threads) Queries in flight: 5 (0 queued)
[4e7d4b99-92d4-4312-a1d5-683d323a5adf] 2022-10-16T17:37:54+02:00 INFO metabase.query-processor.middleware.cache Query took 1.6 s to run; minimum for cache eligibility is 5.0 s
[4e7d4b99-92d4-4312-a1d5-683d323a5adf] 2022-10-16T17:37:54+02:00 DEBUG metabase.server.middleware.log POST /api/dashboard/1/dashcard/2/card/2/query 202 [ASYNC: completed] 2.2 s (19 DB calls) App DB connections: 0/15 Jetty threads: 3/50 (5 idle, 0 queued) (159 total active threads) Queries in flight: 4 (0 queued)
[4e7d4b99-92d4-4312-a1d5-683d323a5adf] 2022-10-16T17:37:55+02:00 INFO metabase.query-processor.middleware.cache Query took 1.7 s to run; minimum for cache eligibility is 5.0 s
[4e7d4b99-92d4-4312-a1d5-683d323a5adf] 2022-10-16T17:37:55+02:00 DEBUG metabase.server.middleware.log POST /api/dashboard/1/dashcard/1/card/1/query 202 [ASYNC: completed] 2.4 s (19 DB calls) App DB connections: 1/15 Jetty threads: 3/50 (5 idle, 0 queued) (155 total active threads) Queries in flight: 3 (0 queued)
[4e7d4b99-92d4-4312-a1d5-683d323a5adf] 2022-10-16T17:38:35+02:00 INFO metabase.query-processor.middleware.cache Query took 675.0 ms to run; minimum for cache eligibility is 5.0 s
[4e7d4b99-92d4-4312-a1d5-683d323a5adf] 2022-10-16T17:38:35+02:00 DEBUG metabase.server.middleware.log POST /api/dashboard/1/dashcard/3/card/3/query 202 [ASYNC: completed] 42.7 s (19 DB calls) App DB connections: 1/15 Jetty threads: 3/50 (5 idle, 0 queued) (164 total active threads) Queries in flight: 2 (0 queued)
[4e7d4b99-92d4-4312-a1d5-683d323a5adf] 2022-10-16T17:38:35+02:00 INFO metabase.query-processor.middleware.cache Query took 724.0 ms to run; minimum for cache eligibility is 5.0 s
[4e7d4b99-92d4-4312-a1d5-683d323a5adf] 2022-10-16T17:38:35+02:00 DEBUG metabase.server.middleware.log POST /api/dashboard/1/dashcard/18/card/17/query 202 [ASYNC: completed] 42.7 s (21 DB calls) App DB connections: 1/15 Jetty threads: 3/50 (5 idle, 0 queued) (160 total active threads) Queries in flight: 1 (0 queued)
[4e7d4b99-92d4-4312-a1d5-683d323a5adf] 2022-10-16T17:38:35+02:00 INFO metabase.query-processor.middleware.cache Query took 521.0 ms to run; minimum for cache eligibility is 5.0 s
[4e7d4b99-92d4-4312-a1d5-683d323a5adf] 2022-10-16T17:38:35+02:00 DEBUG metabase.server.middleware.log POST /api/dashboard/1/dashcard/18/card/16/query 202 [ASYNC: completed] 43.0 s (19 DB calls) App DB connections: 1/15 Jetty threads: 2/50 (6 idle, 0 queued) (156 total active threads) Queries in flight: 0 (0 queued)

Diagnostic info:

{
  "browser-info": {
    "language": "en-GB",
    "platform": "MacIntel",
    "userAgent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.6.1 Safari/605.1.15",
    "vendor": "Apple Computer, Inc."
  },
  "system-info": {
    "file.encoding": "UTF-8",
    "java.runtime.name": "OpenJDK Runtime Environment",
    "java.runtime.version": "11.0.16.1+1",
    "java.vendor": "Eclipse Adoptium",
    "java.vendor.url": "https://adoptium.net/",
    "java.version": "11.0.16.1",
    "java.vm.name": "OpenJDK 64-Bit Server VM",
    "java.vm.version": "11.0.16.1+1",
    "os.name": "Linux",
    "os.version": "4.15.0-180-generic",
    "user.language": "en",
    "user.timezone": "GMT"
  },
  "metabase-info": {
    "databases": [
      "bigquery-cloud-sdk",
      "mongo"
    ],
    "hosting-env": "unknown",
    "application-database": "postgres",
    "application-database-details": {
      "database": {
        "name": "PostgreSQL",
        "version": "14.4"
      },
      "jdbc-driver": {
        "name": "PostgreSQL JDBC Driver",
        "version": "42.5.0"
      }
    },
    "run-mode": "prod",
    "version": {
      "date": "2022-09-29",
      "tag": "v0.44.4",
      "branch": "release-x.44.x",
      "hash": "382d728"
    },
    "settings": {
      "report-timezone": "Europe/Vienna"
    }
  }
}

Hi @Lambod
Which database type are you querying?
There's something off in the log.
I'm having difficulty understanding how long the actual queries take to run on your database.
I'm guessing that you're using a reverse proxy in front of Metabase; which one? And have you tried running without it?

Hi @flamber,
thanks for your help! I'm querying MongoDB. As I stated above, when I change 'Group by week' to 'Group by day', the execution time drops to just a few seconds. It seems that grouping the data by week takes a lot of time.
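
For reference, here is a minimal sketch (Python with pymongo) of how one could time a day- vs. week-bucketed aggregation directly against MongoDB, bypassing Metabase entirely. The connection string, database, collection, and date field are placeholders, and this only approximates what a 'Group by day/week' question does; it is not the exact query Metabase generates. `$dateTrunc` requires MongoDB 5.0+.

```python
# Sketch: time day- vs. week-bucketed aggregations straight against MongoDB.
# "mydb", "events" and "created_at" are made-up names -- substitute your own.
import time
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # placeholder URI
coll = client["mydb"]["events"]                    # placeholder db/collection

def bucketed_count(unit):
    """Count documents per day or per week, similar to a 'Group by <unit>' question."""
    pipeline = [
        {"$group": {
            "_id": {"$dateTrunc": {"date": "$created_at", "unit": unit}},
            "count": {"$sum": 1},
        }},
        {"$sort": {"_id": 1}},
    ]
    start = time.monotonic()
    result = list(coll.aggregate(pipeline))
    print(f"group by {unit}: {len(result)} buckets in {time.monotonic() - start:.2f} s")

bucketed_count("day")
bucketed_count("week")
```

If both variants finish quickly here, the time is probably being lost outside the database.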

Metabase is set up in a Docker container on Azure (without a reverse proxy). I followed this guide for the setup: Running Metabase on Microsoft Azure, and just used a different PostgreSQL Flexible Server instance. However, for testing purposes I'm only using a very small development instance with 100 total ACU and 1.75 GB of memory.

@Lambod
So you are only having problems when querying with week granularity? That sounds like a database query planner issue, but check your database query logs.

There's something strange going on, but I cannot see if the problem comes from incorrect Azure setup (a load-balancer is a reverse-proxy), or something on the database, or something else.

Set up Metabase locally on your computer and see how it runs, then take one step at a time.

@flamber
Thanks for the guidance. I have figured out that upgrading my Azure instance to the next plan (B2: 200 total ACU, 3.5 GB memory) reduced the dashboard's execution time from 42 seconds to approximately 15 seconds. A local Docker instance pointed at the same MongoDB instance only takes 5 seconds for the same question. So it seems that this issue is related to CPU/memory.

@Lambod I don't know what 200 ACU is equivalent to, but from the logs it looks like a reverse-proxy problem (related to HTTP/1). It's worth noting that Metabase is not doing the querying, your database is, so it seems very strange that you're even seeing such a time difference.

Having way too little CPU and RAM can definitely slow things down a lot, but if you had something like 1 vCPU and 2 GB before, that should be more than plenty for a small dashboard.

@flamber, thanks for providing more details.
I have double-checked that there is no reverse proxy in front of Metabase, and I also tried using a NAT, but without any effect.

The setup instructions recommend a minimum of 200 ACU, and I'm running 2 Docker instances (just for development) on such an instance:

AT LEAST 200 total ACU and 3.5GB of memory

The more questions there are in the dashboard, the longer the query for the question with weekly grouping takes. When I open only that question, it takes 7 seconds. In the dashboard, the same question takes 18 seconds. All other questions finish within 1-2 seconds.
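
As a rough way to check whether this is the dashboard's cards competing for resources, one could fire several card queries concurrently from a small script and compare the timings against running the slow card alone. This is only a sketch; the base URL, session token, and card IDs are placeholders, and the POST /api/card/:id/query endpoint is the one visible in the log above.

```python
# Sketch: run several card queries in parallel (roughly what loading a
# dashboard does) and record how long each one takes end to end.
import time
from concurrent.futures import ThreadPoolExecutor
import requests

BASE = "https://domain.com"                  # placeholder base URL
HEADERS = {"X-Metabase-Session": "<token>"}  # placeholder session token
CARD_IDS = [1, 2, 3, 15, 16, 17]             # hypothetical card IDs

def run_card(card_id):
    start = time.monotonic()
    r = requests.post(f"{BASE}/api/card/{card_id}/query", headers=HEADERS)
    r.raise_for_status()
    # running_time (ms) is reported in the query result, if present
    return card_id, time.monotonic() - start, r.json().get("running_time")

with ThreadPoolExecutor(max_workers=len(CARD_IDS)) as pool:
    for card_id, elapsed, running_time in pool.map(run_card, CARD_IDS):
        print(f"card {card_id}: wall {elapsed:.1f} s, reported running_time {running_time} ms")
```

Comparing the wall-clock times here with the single-question timing should show whether the extra 11 seconds only appears under concurrent load.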

Is there any other log that could be useful for debugging?

@Lambod So you are not using https, but connecting directly via http on port 3000?

Disable caching in Metabase, so you can figure out where the problems are coming from.
I would check the Network tab in your browser's developer tools and correlate that with the Metabase logs.
There are many thousands of Metabase installations; your problem sounds very setup-specific.

@flamber, thanks for your help!

I have disabled caching, and the developer tools' Network tab just confirms that the query takes as long as stated in the Metabase log. The query has "running_time": 22616. The browser queries the dashboard card through https://domain.com/api/dashboard/1/dashcard/18/card/17/query on port 443.

However, even when I connect to the Docker container directly via localhost on port 3000, I get the same results in terms of running_time.

@Lambod Then enable debug query logging on your database, so you can see exactly what is going on.
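
For MongoDB, that would mean turning on the database profiler. A minimal sketch with pymongo (connection string and database name are placeholders): level 1 records operations slower than slowms into the db.system.profile collection, which you can then read back.

```python
# Sketch: enable MongoDB's profiler and inspect the slowest recorded operations.
# Connection string and database name are placeholders.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["mydb"]

# Level 1 = profile operations slower than `slowms`; level 2 = profile everything.
db.command({"profile": 1, "slowms": 1000})

# Later: look at the five slowest recorded operations.
for op in db["system.profile"].find().sort("millis", -1).limit(5):
    print(op.get("millis"), op.get("op"), op.get("command"))
```

That should show how long the week-grouped aggregation actually runs inside MongoDB, independent of Metabase.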

I don't think I can help much more.