Interested in contributing for issue 5579

adriank · August 3, 2018, 10:07pm

Posted in Gitter as well, but I’m interested in implementing https://github.com/metabase/metabase/issues/5579. The company I work for uses native queries pretty heavily, so this feature would be really beneficial for us. I was hoping to discuss the feasibility of actually contributing… From the short look I’ve taken at your documentation/repo, it looks like most of the backend work for this was already done when you added nested queries - this work is only adding the templating to the native query editor, and hooking everything up. Is that correct?

sameer · August 3, 2018, 10:24pm

I think the first step would be to propose a specific template tag, and get buy-in from the team around how to expose these.

I don’t think nested queries is what would be used here, instead we would probably just inject a query in as a sub-select. IMO it would be pretty trivial query manipulation, the main question is what the tag would look/behave like. Would it be present in the side variable panel? etc

adriank · August 3, 2018, 10:33pm

Would injecting a sub-select get the benefit of caching? One of the main reason’s we’re interested in doing this is having the sub query’s result be cached, so the data can be sliced in the main query without re-running everything.

I’ll write up a short proposal on what everything would look like - is this the right place to discuss, or should I use the github issue?

sameer · August 3, 2018, 10:44pm

So even with nested queries, we aren’t caching the nested query itself. Eg, we don’t have an in-app query engine that operates on our cache. All queries are being expanded out and pushed down to the data warehouse.

If there is a workhorse subselect query you’re trying to cache, I’d look into using a materialized view instead.

I’d say the github issue is the place for the design discussion, especially since you’re inclined to actually help out instead of just asking when it’ll be done, which by the way is much appreciated =)

adriank · August 3, 2018, 11:26pm

I guess that makes sense now that I think about it… for some reason I imagined the nested query being run sort of like a CTE which produced a temp table that could be re-used by the main query if it was run again.

I’m essentially looking for a way to automate some of the behavior of creating a materialized view. A user can write a base query, then reference that query in another query slicing/filtering the base query’s cached results. Most of our users don’t know enough about SQL to create materialized views/temp tables themselves(and don’t have write access), and having one of the devs do it doesn’t really scale. I frequently get asked if there’s a way to cache the results of part of a query so the whole thing doesn’t run every time they change a variable, and I was hoping nested queries would be the solution

sameer · August 3, 2018, 11:40pm

This is probably beyond the scope of 5579, but it would definitely be possible to do something like create materialized views for any query used in a nested tag repeatedly.

It would require either calling “out” to create the materialized view, or extending the drivers to support write statements.

Very much within the spirit of things we want to do, but it would require a bit of work

adriank · August 6, 2018, 5:08pm

Yeah that sounds a bit more involved than what I originally hoped, I’ll have to run all this by my team and see what they think. I still think the idea has merit, and would be a big help for us, but it’s just a matter of resources.

Just as a side question - I was thinking about how this would work over the weekend… Would I have access to the internal metabase DB to store metadata for the materialized views/temp tables?