Dear metabase community,
we are using metabase within our company with great success. It is a phantastic tool that enables many users without SQL knowledge to build the reports they need.
Unfortunately there is one issue that causes major problems for us regarding the usage of date/time related filters:
Many of our database tables contain time series data. The tables grow quite big over time. Each table has an indexed timestamp column. Most of our reports contain a date filter such as
- show me the data for the last month or
- show me the data for the last 5 days
When filtering the data based on the timestamp column using the metabase query builder, the generated SQL does not filter on the raw timestamp field but on some transformation of the timestamp field. Examples:
- WHERE date_trunc(‘month’, CAST(“public”.“my_events”.“event_timestamp” AS timestamp)) = date_trunc(‘month’, CAST((NOW() + INTERVAL ‘-1 month’) AS timestamp))
- WHERE CAST(“public”.“my_events”.“event_timestamp” AS date) BETWEEN CAST((NOW() + INTERVAL ‘-5 day’) AS date) AND CAST((NOW() + INTERVAL ‘-1 day’) AS date)
Postgres, MySQL and probably most other relational database systems will not be able to make use of the index on the the timestamp field when casts or functions are applied to the field in the where condition. As a result, the whole table will be read for building a report about a few days of data. As our tables grow, the reports become slower and slower - and finally unusable.
Since the upcoming metabase release 0.33 will have a focus on the query builder, I would like to highlight this issue once again. I think many people besides us would benefit from improving the query builder with respect to date / time filtering.
Is there anything we can do to get this topic prioritized for the 0.33 release?
If some information is missing or if I can help solving this issue in any way please let me know.
To show that this topic affects a lot of people I collected the discussions and github issues about this topic from the recent years (unfortunately I am not allowed to post more than two links since I am new to this list):
- Performance Improvement: Use date range instead of casting #4043 - Jan 2017
- Relative date filter field evades indices #6098 - Oct 2017
- Performance problem on postgresql’s partitioned tables when using field filter #8275 - Aug 2018:
- mysql table can not using date column index #9139 - Dec 2018:
- filter on date column takes a very long time #9187 - Jan 2019:
- Index usage on MySQL
- Timestamp filters on indexes fields in MySQL
- Select problem toward MariaDB