How to aggregate by "weighted average" of…? (i.e. ratio of sums, rather than sum of ratios)

Context

Consider the Sample Dataset – and its "Orders" table more specifically.

I want to create the following graph: Σ(Discount) ÷ Σ(Total) by User ID. Furthermore, I want the graph to be affected by the "Date Range" filter of my dashboard.

Problem

I can easily create the Σ(Discount) ÷ Σ(Total) by User ID using nested queries:

  1. I create the Σ(Discount) by User ID query
  2. I create the Σ(Total) by User ID query
  3. I create a nested query that joins both queries on User ID, then I create Σ(Discount) ÷ Σ(Total) as a "Custom column".

I can then obtain the graph I want:

However, the result dataset do not contain any date field (namely Created At). Indeed, its stucture is:

This means that I cannot apply a DateRange filter on this graph.

I can then do the following:

  1. I create the Σ(Discount) by User ID and by Created At:Day query
  2. I create the Σ(Total) by User ID and by Created At:Day query
  3. I create a nested query that joins both queries on User ID and Created At:Day using the SQL editor, and create Σ(Discount) ÷ Σ(Total) by User ID and Day as part of this query. (The SQL query is attached below, yet you'd need to adapt it depending on your Questions ID.)

Now I have a dataset that looks like what I want: both User ID and Σ(Discount) ÷ Σ(Total) columns, plus a Created At:Day column to apply the filter to.
I could now apply a DateRange filter on this query.

However, I can't create the Σ(Discount) ÷ Σ(Total) by User ID graph: I can only plot Σ(Discount ÷ Total) by User ID… which is quite different!

Question

How to plot the "weighted average" aggregate of a ratio?

I.e. in our case, how to plot Σ(Discount) ÷ Σ(Total) by User ID from a data set structured as:

User ID Date Discount Total
1 2019-12-12 0 42
1 2019-12-13 13 17
2 2019-12-12 5 15

SQL query

SELECT "source"."USER_ID" AS "USER_ID", "source"."CREATED_AT" AS "CREATED_AT", "source"."sum" AS "discount", "Question 24"."sum" AS "total", "source"."sum" / "Question 24"."sum" AS "ratio"
FROM (SELECT "PUBLIC"."ORDERS"."USER_ID" AS "USER_ID", CAST("PUBLIC"."ORDERS"."CREATED_AT" AS date) AS "CREATED_AT", sum("PUBLIC"."ORDERS"."DISCOUNT") AS "sum" FROM "PUBLIC"."ORDERS"
GROUP BY "PUBLIC"."ORDERS"."USER_ID", CAST("PUBLIC"."ORDERS"."CREATED_AT" AS date)
ORDER BY "PUBLIC"."ORDERS"."USER_ID" ASC, CAST("PUBLIC"."ORDERS"."CREATED_AT" AS date) ASC) "source" INNER JOIN (SELECT "PUBLIC"."ORDERS"."USER_ID" AS "USER_ID", CAST("PUBLIC"."ORDERS"."CREATED_AT" AS date) AS "CREATED_AT", sum("PUBLIC"."ORDERS"."TOTAL") AS "sum" FROM "PUBLIC"."ORDERS" GROUP BY "PUBLIC"."ORDERS"."USER_ID", CAST("PUBLIC"."ORDERS"."CREATED_AT" AS date)) "Question 24" ON "source"."USER_ID" = "Question 24"."USER_ID" AND "source"."CREATED_AT" = "Question 24"."CREATED_AT"
LIMIT 1048576

Hi @ebosi Is this a question or a tutorial? I’m not sure if you’re looking for help, if so, then I don’t understand the problem, and you should probably be doing the entire query in SQL, since it seems a bit too advanced for the UI.
And what is Σ? It looks like some Excel function…

Hi @flamber, I’m sorry my question isn’t clear enough. A couple answers:

  1. Yes, it’s a question: and while replying to you, I’ve realised that the answer is likely to be: use Metabase’s SQL parameters.
  2. I’ve used Σ as the summation operator, to spare a few characters over “sum”, hoping it would make it more readable…
1 Like