Average with empty fields

Hi Everyone,

I am using Metabase with a dataset that has empty (null) values in each column.

I would like to know how to keep the empty values out of the average calculation.

Right now Metabase treats all empty values as 0. It does the same if I insert NA in the columns and model the data as a number.

As I understand it, an average should be the sum divided by the count of values in the column you choose.

Thanks a lot and great work.

Hi @hokmoc
I cannot reproduce on latest release 0.37.0.2
Please post “Diagnostic Info” from Admin > Troubleshooting, and which database you’re querying.

{
  "browser-info": {
    "language": "en-US",
    "platform": "Linux x86_64",
    "userAgent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.111 Safari/537.36",
    "vendor": "Google Inc."
  },
  "system-info": {
    "file.encoding": "UTF-8",
    "java.runtime.name": "OpenJDK Runtime Environment",
    "java.runtime.version": "11.0.7+10",
    "java.vendor": "AdoptOpenJDK",
    "java.vendor.url": "https://adoptopenjdk.net/",
    "java.version": "11.0.7",
    "java.vm.name": "OpenJDK 64-Bit Server VM",
    "java.vm.version": "11.0.7+10",
    "os.name": "Linux",
    "os.version": "5.4.0-52-generic",
    "user.language": "en",
    "user.timezone": "GMT"
  },
  "metabase-info": {
    "databases": [
      "h2",
      "mysql"
    ],
    "hosting-env": "unknown",
    "application-database": "h2",
    "application-database-details": {
      "database": {
        "name": "H2",
        "version": "1.4.197 (2018-03-18)"
      },
      "jdbc-driver": {
        "name": "H2 JDBC Driver",
        "version": "1.4.197 (2018-03-18)"
      }
    },
    "run-mode": "prod",
    "version": {
      "date": "2020-10-26",
      "tag": "v0.37.0.2",
      "branch": "release-x.37.x",
      "hash": "ba7be09"
    },
    "settings": {
      "report-timezone": null
    }
  }
}

You can see the sample data here in Google Sheets: https://docs.google.com/spreadsheets/d/1HqIWq8w_cTcsXlZhpNhcGfIeHSqmI_2UC3qqL0Tmn84/edit?usp=sharing

When I calculate the average of Q3[SQ003] in Metabase, it outputs 1.83 where it should be 4.75, as you can see in Google Sheets. This appears to be because Metabase divides the sum of the column Q3[SQ003] by the total number of rows.

It should instead be divided by the count of non-empty values in that column, which is 64 and not 164.
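
To make the difference between the two denominators concrete, here is a minimal Python sketch. The function names and the six-value sample are made up for illustration (they are not taken from the spreadsheet); empty cells are represented as None.

```python
from typing import Optional, Sequence


def mean_over_all_rows(values: Sequence[Optional[float]]) -> float:
    """Treat empty cells as 0 and divide by the total number of rows."""
    total = sum(v for v in values if v is not None)
    return total / len(values)


def mean_over_non_empty(values: Sequence[Optional[float]]) -> Optional[float]:
    """Skip empty cells and divide by the number of values actually present."""
    present = [v for v in values if v is not None]
    if not present:
        return None  # no values at all, so the average is undefined
    return sum(present) / len(present)


# Hypothetical column: 3 answers and 3 empty cells.
sample = [5, None, 4, None, None, 6]

rows_average = mean_over_all_rows(sample)        # 15 / 6
non_empty_average = mean_over_non_empty(sample)  # 15 / 3
print(rows_average, non_empty_average)
```

The second function is the behaviour I expect from the Average aggregation; the first one matches what I am seeing.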

Hi @hokmoc

  1. Are you querying MySQL? Please provide a sample schema. I cannot reproduce on MariaDB 10.4 with INTEGER columns.
  2. When you use the Average function, Metabase generates a query with avg(column) (on MySQL).
  3. You should migrate away from H2 if you’re using Metabase in production:
    https://www.metabase.com/docs/latest/operations-guide/migrating-from-h2.html
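
For reference, SQL's AVG skips NULL values, so an average that looks like "sum divided by row count" usually suggests the empty cells ended up stored as 0 (or something that casts to 0) rather than as NULL. The sketch below illustrates this with Python's built-in sqlite3 module as a stand-in for MySQL; the table and column names are hypothetical, and how your import actually stored the empty cells is an assumption you would need to check in your own schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table where empty answers are stored as real NULLs.
conn.execute("CREATE TABLE answers_null (score INTEGER)")
conn.executemany(
    "INSERT INTO answers_null (score) VALUES (?)",
    [(5,), (None,), (4,), (None,), (None,), (6,)],
)

# Same data, but empty answers were stored as 0 during import.
conn.execute("CREATE TABLE answers_zero (score INTEGER)")
conn.executemany(
    "INSERT INTO answers_zero (score) VALUES (?)",
    [(5,), (0,), (4,), (0,), (0,), (6,)],
)

# AVG ignores NULLs, so only the three real answers are averaged.
avg_with_nulls = conn.execute("SELECT AVG(score) FROM answers_null").fetchone()[0]

# Zeros are ordinary values, so all six rows are averaged.
avg_with_zeros = conn.execute("SELECT AVG(score) FROM answers_zero").fetchone()[0]

# COUNT(*) counts rows, COUNT(score) counts only non-NULL values.
total_rows, non_null_rows = conn.execute(
    "SELECT COUNT(*), COUNT(score) FROM answers_null"
).fetchone()

print(avg_with_nulls, avg_with_zeros, total_rows, non_null_rows)
conn.close()
```

Comparing COUNT(*) with COUNT(column) on the real table is a quick way to see whether the empty cells are actually NULL in the database.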