I've encountered a bug with the cumulative sum calculation in Metabase version 0.50.xx that didn't exist in version 0.48 (I use 0.50.18 version)
In version 0.48, creating a cumulative sum of quantities sorted by categories worked correctly. However, in version 0.50, the cumulative sum does not sort properly and follows the original data order instead.
Then I try to simulate with sample data.
Steps to reproduce:
Create a question using the "Orders" table.
Join with the "Products" table on "Product ID".
Summarize by "Products → Category" and calculate the sum of "Quantity".
Add a cumulative sum of the sum of "Quantity".
Sort by "Sum of Quantity" in descending order.
Expected result: The cumulative sum should start with the value 18.920 as the first row (corresponding to the sum quantity column), but the result was started with 69540
I've attached screenshots to illustrate the issue.
then I try to summarize in 2 stage, but the result is the order of the product is not base on sum product which should be start from "Widget", but following cumulative that start from Doohickey
herebelow is the SQL code autogenerate by metabase 0.48
SELECT
"Products"."CATEGORY" AS "Products__CATEGORY",
SUM("PUBLIC"."ORDERS"."QUANTITY") AS "sum",
SUM("PUBLIC"."ORDERS"."QUANTITY") AS "sum_2"
FROM
"PUBLIC"."ORDERS"
LEFT JOIN "PUBLIC"."PRODUCTS" AS "Products" ON "PUBLIC"."ORDERS"."PRODUCT_ID" = "Products"."ID"
GROUP BY
"Products"."CATEGORY"
ORDER BY
"sum" DESC,
"Products"."CATEGORY" ASC
herebelow is SQL code autogenerate by metabase 0.50.17
SELECT
"source"."Products__CATEGORY" AS "Products__CATEGORY",
SUM("source"."QUANTITY") AS "sum",
SUM(SUM("source"."QUANTITY")) OVER (
ORDER BY
"source"."Products__CATEGORY" ASC ROWS UNBOUNDED PRECEDING
) AS "sum_2"
FROM
(
SELECT
"Products"."CATEGORY" AS "Products__CATEGORY",
"PUBLIC"."ORDERS"."QUANTITY" AS "QUANTITY"
FROM
"PUBLIC"."ORDERS"
LEFT JOIN "PUBLIC"."PRODUCTS" AS "Products" ON "PUBLIC"."ORDERS"."PRODUCT_ID" = "Products"."ID"
) AS "source"
GROUP BY
"source"."Products__CATEGORY"
ORDER BY
"sum" DESC,
"source"."Products__CATEGORY" ASC
you can see the diffrence.
I think this is big issue that make metabase don't have ability to make pareto diagram anymore. Which is common in analytic jobs
hi @Luiggi Luiggi , Thank you for your response.
I sorting because metabase default is ascending, meanwhile for create pareto diagram, we need to sort descending for the quantity, and sort ascending for the cumulative.
You can see the diffrence between 0.48 and 0.50 in my answer above for other reply.
My team recently got upgraded to v1.50 and is experiencing the same issue. Cumulative Sums by time periods now no logner return one row per time period (eg per month), but simply the number of rows in the table originally (ie not summarising at all).
Hey there, we have changed the implementation of cumulative sum and count to use sql window functions. This is a breaking change in a way that the results now depend on what you have chosen as breakouts in the summarization. Current logic is that the first grouping is used for sorting as a dimension along which the accumulation happens and the rest grouping columns are used as partitions. So to make stuff work most likely you have to move your datetime column to be the first breakout in the summarization. If that doesn't help, please report a bug here.