Dashboard-specific Caching

We have several dashboards that are essentially static: they are fed data that is refreshed once a day, so they don't change all day.

However, the underlying queries are slow, and with several concurrent users they frequently cause a backlog of queries that makes Metabase unusable.

The solution we want to explore is to run a script that clears the cache (as discussed in "No way to clear cache? (or TRUNCATE TABLE QUERY_CACHE?)") and then refreshes the relevant dashboards using the API.
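Roughly, the script would be something like this. It's only a sketch: it assumes a Postgres application database, and the host, credentials, and card IDs are placeholders.

```python
import psycopg2
import requests

METABASE_URL = "https://metabase.example.com"   # placeholder
DASHBOARD_CARD_IDS = [101, 102, 103]            # placeholder: IDs of the questions on the static dashboards

# 1. Clear the existing cache directly in the application database
#    (assumes Postgres; the cache lives in the query_cache table)
conn = psycopg2.connect("dbname=metabase user=metabase password=secret host=db")
with conn, conn.cursor() as cur:
    cur.execute("TRUNCATE TABLE query_cache")
conn.close()

# 2. Log in to the Metabase API and re-run each question so the cache is warm
session = requests.post(
    f"{METABASE_URL}/api/session",
    json={"username": "bot@example.com", "password": "secret"},
).json()["id"]
headers = {"X-Metabase-Session": session}

for card_id in DASHBOARD_CARD_IDS:
    # POST /api/card/:id/query executes the saved question
    requests.post(f"{METABASE_URL}/api/card/{card_id}/query", headers=headers)
```

We would run this from cron shortly after the daily data refresh finishes.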

We would then want the dashboards to stay cached all day. However, my concern is that setting the caching threshold very low (e.g. 0 seconds) and the TTL high enough to last all day would cause every query to be cached, and I don't know what effect that would have on the application.
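For reference, this is the kind of global change I mean. It's a sketch only: the setting keys below are my guess at what the admin caching options map to, and they may not match every Metabase version.

```python
import requests

METABASE_URL = "https://metabase.example.com"        # placeholder
headers = {"X-Metabase-Session": "<session token>"}  # obtained from POST /api/session

# Global caching settings (keys assumed; check your version's admin caching panel):
#   enable-query-caching    - master switch for query caching
#   query-caching-min-ttl   - minimum query duration (seconds) before a result is cached
#   query-caching-ttl-ratio - how long cached results live, as a multiple of query duration
for key, value in {
    "enable-query-caching": True,
    "query-caching-min-ttl": 0,         # cache everything - this is the part I'm worried about
    "query-caching-ttl-ratio": 100000,  # make cached results effectively last all day
}.items():
    requests.put(f"{METABASE_URL}/api/setting/{key}", headers=headers, json={"value": value})
```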

Has anyone else tried this method, or had this issue? How have you gone about solving it?

Many Thanks,
Theo


Hi @theo
Set the TTL for each question instead, since you know which questions are on those dashboards.
Have a look at this issue and the referenced issues:
https://github.com/metabase/metabase/issues/13396
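Untested, but if you go the direct-database route it would be something along these lines (connection details and question IDs are placeholders, and I'm assuming cache_ttl is in seconds, with NULL meaning "use the global TTL"):

```python
import psycopg2

QUESTION_IDS = [101, 102, 103]   # placeholder: the questions on the static dashboards
ONE_DAY = 24 * 60 * 60           # cache_ttl value in seconds

conn = psycopg2.connect("dbname=metabase user=metabase password=secret host=db")
with conn, conn.cursor() as cur:
    # report_card is the application-database table that backs saved questions
    cur.execute(
        "UPDATE report_card SET cache_ttl = %s WHERE id = ANY(%s)",
        (ONE_DAY, QUESTION_IDS),
    )
conn.close()
```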


Perfect - that’s just what I’m looking for

Although, one question @flamber: how would this apply with a dashboard filter?
One of the dashboards has a filter that affects all the questions.

Ideally we'd cache all the possible results from the filter, e.g. if the filter is day of the week, cache the results for each day of the week.

Would this be possible? Is each question with a different filter stored as a separate question?

@theo I haven't dug into it, just going by memory, but the dashboard filters are applied to the cards, which are essentially questions, so the TTL for those questions (in the report_card table) is what defines the cache lifetime. But again, I haven't tested this.

When you make a query through the API (or whatever you are doing to pre-cache everything), you should see all the results show up in the query_cache table.
But you would have to make sure that each filter option is visited, so a cached result is stored for it.
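Something along these lines, though I haven't tested it. The exact parameters payload depends on how the filter is wired up to each card; the "target" below assumes a native question with a template tag called day_of_week, which is just an example name, so check the payload your browser sends when you use the dashboard filter.

```python
import requests

METABASE_URL = "https://metabase.example.com"        # placeholder
headers = {"X-Metabase-Session": "<session token>"}  # obtained from POST /api/session
CARD_IDS = [101, 102, 103]                           # placeholder: questions on the dashboard
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

for card_id in CARD_IDS:
    for day in DAYS:
        # Run the question once per filter value so each combination ends up cached.
        requests.post(
            f"{METABASE_URL}/api/card/{card_id}/query",
            headers=headers,
            json={"parameters": [{
                "type": "category",
                "target": ["variable", ["template-tag", "day_of_week"]],  # assumed tag name
                "value": day,
            }]},
        )
```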

Please make sure you upvote this issue: https://github.com/metabase/metabase/issues/4967
And it's definitely worth following some of the referenced issues too.


@flamber Thanks for your help on this one. Although, just looking at the report_card table, it seems all the values for cache_ttl are NULL.

After manually changing this to 100 as a test on one 5-second question, the question is still not cached. Have you got any insight into what's going on here?

@theo Correct. All of them are NULL by default, since that means the global TTL is used.
But is caching enabled globally? Check your logs, which should tell you how and when caching is handled.

Caching is enabled. My intention was to have caching enabled globally and then also have a custom cache_ttl for the questions that need it.

Then to refresh them with a script daily.

Are you saying that if caching is enabled globally, the per-question cache_ttl in report_card will not be used?

@theo No, I'm saying that if you disable caching, then it's disabled entirely. When a card's TTL is defined, that TTL is used instead of the global TTL.
https://github.com/metabase/metabase/blob/master/src/metabase/query_processor/middleware/cache.clj#L4-L6

Ah I see.

I suppose I was thinking that if I set cache_ttl to a non-NULL value, the query would be cached regardless of query time, even if it didn't exceed the minimum duration.

However, that seems not to be the case. Changing the cache_ttl field does make the result stay cached for that length of time, but the query won't be cached in the first place unless it exceeds the minimum duration.

@flamber Have you got any suggestions for how to make a card cache regardless of query time?

@theo Manually generate the query_cache entries, then you’ll have the control you’re looking for. What you’re specifically asking for isn’t part of Metabase yet, as you can see from the issues I have referenced.

Thanks @flamber, I appreciate this is not something that's part of Metabase. I'm just trying to find a workaround that will help us, and maybe some others, without having to change the application code in any way.

But manually generating the query_cache entries would involve hashing and serializing things in a way that Metabase can read, and I'm not sure how to do that. It doesn't sound straightforward: defining the queries in a script, reading the results, hashing everything and then inserting it into the Metabase DB is a fair bit of work.

One VERY hacky way that I think would work is simply to put a sleep statement, longer than the minimum duration, in the questions I want to cache, so that caching is triggered.

And just manually change their TTLs so they last all day.

Feels messy, but I think it would work.

@theo Messy/hacky is not always bad - but the log should help you figure out what is going on.
And then it’s important to check for any known issues, such as:
https://github.com/metabase/metabase/issues/13262


Oh thanks, what a pesky bug