Pivot tables extremely slow

Hi there.

I have a fairly standard pivot, with five column fields, one column field and one summed data item. My query returns about 1,000 rows.

It is taking literally between 1 and 5 minutes to render the pivot table.

I have a bar chart (on the same dashboard) using the same data, which renders instantaneously.

What’s going wrong?

Thanks

Dominic.

Hi @domjammoo
Please post “Diagnostic Info” from Admin > Troubleshooting, and which database you are querying.
Check the browser developer Network-tab and click the request to see the response-data, which also includes the query - you might want to run the query directly on your database with EXPLAIN ANALYZE prepended to see if it is an indexing problem.

Hi, thanks. Diagnostics here, only slight funny being that I thought I was using Postgres 12, and this says 11.7.

{
“browser-info”: {
“language”: “en-GB”,
“platform”: “Win32”,
“userAgent”: “Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.123 Safari/537.36”,
“vendor”: “Google Inc.”
},
“system-info”: {
“file.encoding”: “UTF-8”,
“java.runtime.name”: “OpenJDK Runtime Environment”,
“java.runtime.version”: “11.0.10+9”,
“java.vendor”: “AdoptOpenJDK”,
“java.vendor.url”: “https://adoptopenjdk.net/”,
“java.version”: “11.0.10”,
“java.vm.name”: “OpenJDK 64-Bit Server VM”,
“java.vm.version”: “11.0.10+9”,
“os.name”: “Linux”,
“os.version”: “4.14.200-155.322.amzn2.x86_64”,
“user.language”: “en”,
“user.timezone”: “GMT”
},
“metabase-info”: {
“databases”: [
“h2”,
“postgres”
],
“hosting-env”: “unknown”,
“application-database”: “postgres”,
“application-database-details”: {
“database”: {
“name”: “PostgreSQL”,
“version”: “11.7”
},
“jdbc-driver”: {
“name”: “PostgreSQL JDBC Driver”,
“version”: “42.2.18”
}
},
“run-mode”: “prod”,
“version”: {
“date”: “2021-02-19”,
“tag”: “v1.38.0.1”,
“branch”: “release-x.38.x”,
“hash”: “0635914”
},
“settings”: {
“report-timezone”: null
}
}
}

@domjammoo
You might be querying Postgres 12, but the Metabase application database you’re using is Postgres 11.7
Click the request and check the Response-tab - or enable debug logging on your Postgres to see the query.

Are you an Enterprise customer or using Metabase Cloud?

Cloud, on the free trial.

Ok, if I run the core query (the pure sql part that is enclosed in double quotes), and chop off all this stuff:

"params":null},"results_timezone":"GMT","results_metadata":{"checksum":"WF2/64jVAxb

Running from Navicat, I get 0.069 seconds. It’s a tiny, development table at this point, only 286 rows in total, and the entire query is based on a single view.
image

If I run it in Metabase as a direct SQL query, again it is instantaneous.

On the dashboard, at the moment it’s taking 22 seconds.

I have a bar chart on the same dashboard, querying the same view, but with fewer dimensions (3 dimensions instead of 8). This loads instantly. In viewing the network response, I note that:

  • The bar chart returns 240 rows from the SQL query, and generates a response array of 59 arrays (59 rows) with three elements per array.

  • The pivot table, returning the same 240 rows in SQL, returns an array of 6,893 arrays, with 8 elements per array. It looks like it creates 240 arrays, and then a sequential number indicating the column in the pivot, and then the value. My x-axis (column headings) in the pivot are month periods, there are a few gaps but roughly ~25-30 to 30 so that’s where the 6893 number is coming from.

Is that the problem do you think, writing the 6k array, or rendering it in the client?

Thanks

Hi there - any update?

Unfortunately, pivot tables are my core use case, my goal is providing self-service analytics to my operations and sales teams and, without this basic capability, I can’t get the team off excel.

Thanks very much.

@domjammoo I don’t think I’m fully understanding what you mean by 6893 arrays. Each row is returned as single array with multiple keys - one key for each column.
So there should be 240 rows, which means 240 arrays, where each array contains 8 keys, so the total amount of keys should be 1920.

There’s a key running_time in the main object (usually at the bottom), which is the millisecond time of the query that Metabase gets from the driver.

If you run the exact same question, but with regular Table visualization, does that change the speed?

I have tried multiple different databases with various schemas and I cannot make Pivot Table slow in any noticeable way compared to regular tables.
But I’m noticing that you’re returning 143KB of data, which seems like a lot for just 240 rows with 8 columns.

So this is what the pivot looks like (i’ve redacted some stuff) - you can see not many rows; and 37 columns (3x12 months plus a total). Most of the intersections are blank

by 6893 I mean this: in the query response, it starts like this:

… and there are 240 rows, before the penultimate property changes from a ‘0’ to a ‘1’, and then the 240 rows repeat

The penultimate column goes up to 63 (not sure why), repeating most (but not all) of the 240 rows, which gives us 6,940 (the numbers have gone up since before)

image.

What is slightly peculiar is, for example, I have one row that only has a single value in a single month (6,388.2). This is then included in 32 different rows:

If I change the visualization to a table (so the ‘month’ value that is used for the column headings becomes just another column), then, yes, it just runs instantaneously and returns 240 rows.

Lastly, this is the pivot table definition: nothing complicated:

image

Hello, is there any progress with this issue? We face exactly the same problem. Data is retrieved in a few seconds, simple table is rendered in few seconds, but the same question with pivot table visualisation is taking almost minute to render.

@martin.melisik Every sub-total is a separate query, so depending on how many sub-totals you have, then there's a lot of queries - so this depends on your database and latency too, and specially missing indexes can cause a lot of temporary tables, but you can only see such information if you enable debugging on your database.

Hello @flamber, is it expected that the data returned from pivot query is duplicated like stated above? It seems to happen on v0.38.3 also, and it slow things down.

@rwinzhang But the data isn't duplicated - it is returned in row format. Where every sub-total is calculated in it's own query.

Hi, it also seems that pivot table from dashboard is significantly slower than in question. We used the same filter in both: question (900 ms), dashboard (8100 ms). Why ? Is there any way to speed it up ?

@michal.balaz There should not be any different in query speed - unless you have different filters on the dashboard compared to the question.