Pivot tables extremely slow

flamber · July 6, 2021, 2:08pm

@ksvsem
Yes, exactly what I'm trying to explain. When you use Pivot Table, then Metabase will generate a lot of queries, so it can show sub-totals etc. Check your database query log and you can see all the queries being made.
You cannot see all the sub-total queries anywhere in Metabase, so the "Show the SQL" will only show the main query, which is just a single query.

The UI slow/freeze is this issue:
https://github.com/metabase/metabase/issues/12378 - upvote by clicking on the first post

ShunShun · March 7, 2022, 11:40am

I want to say that this problem still exists(v0.42.2). You can successfully complete the design of the pivot table in the designer. Hundreds of rows of data were completed in almost one second, but it took nearly 15 minutes to wait for the PivotTable to display.

prigal · May 2, 2022, 1:18pm

I have posted an issue on this subject, the generated query is not correctly filtered, this is why it could take from few minutes to hours on some large database to generate the pivot.

Please upvote here : https://github.com/metabase/metabase/issues/21688

jorge_qv · August 29, 2023, 8:59am

Hello @Luiggi ,
Do you know if this issue was resolved? I don't know if the query on the backend is optimized already or it calculates every subtotal in a separated query.

Thank you

domjammoo · August 29, 2023, 9:52am

Hi. Just got an update in my email that this topic still had activity. I'm no longer a metabase user due to the above issue. Just re-reviewing the thread, though, it seems the issue is clear - Running a separate query back to the database for every subtotal and grouping, for a pivot table, is the wrong design, period. A pivot table is a UI analysis tool to allow real-time crosstab reporting with multiple dimensions and ways of accumulating. The whole point is to be able to drag and drop fields, subtotals, cumulative totals between the axes in order to analyse a data set in close-to real time. It's the same as loading everything into a python dataframe and using stack / unstack / pivot etc to analyse different dimensions of the data.

As such, it's crazy to build the subtotal / total logic as an additional database query - this could never work as a flexible UI tool. The underlying dataset needs to be in memory, either in the UI or in the service that serves the UI.

Anyway, maybe you've fixed it.

I've moved to a Google Cloud Postgres instance, with Retool and pivot.js (free) sitting on top and it works, and is fast.

Hope that's useful

Dominic

Luiggi · August 29, 2023, 10:16am

Not resolved yet

Luiggi · August 29, 2023, 10:26am

All technical decisions have pros and cons. When we built the pivot in the current form we made some decisions that are now not relevant anymore. Your current stack might face limitations in the future, that’s what systems architecture is about, there’s no perfect solution

optonpo · November 8, 2023, 9:11pm

Is there a fix on the horizon?