Bigquery timestamp grouping error

I'm pulling in 2 of the exact same tables into metabase but one is from postgres and one is from bigquery. The bigquery table is replicated directly from the postgres table via (create table bq_table as select * from postgres_table). Metabase is set to US/Pacific report time zone.

My timestamp field is "completed_at". When I try to do something simple with the BQ data source like count rows group by completed_at:day I get very strange results, the daily counts are off by significant amounts and the most current day is missing. The same operation on the postgres data source yields results as expected.

I did a spot check by querying a specific line item directly in postgres as well as directly in bigquery console

postgres: 2020-10-26 19:51:09
bigquery : 2020-10-27 02:51:09 UTC

I also looked up the same line item in metabase using the SQL editor

postgres: Oct 26, 2020, 7:51 PM
bigquery : Oct 26, 2020, 7:51 PM

It appears that bigquery converted my postgres PST timestamp to UTC and metabase converted the bigquery UTC time value back to my local time (PST) so everything appears to be fine.

However here is what a results of the simple count rows group by competed_at:day looks like

Postgres

Bigquery

I also saw this strange alert in the corner of the visualization view in the bigquery version of the question.

warning

Any ideas what could be causing this and how to resolve?

Hi @madkap
Timezones are some of the most complicated things.
https://www.metabase.com/docs/latest/troubleshooting-guide/timezones.html
BigQuery does not respect “Reporting Timezone”.

What is your configuration of “Use the Java Virtual Machine (JVM) timezone” in Admin > Databases > (your bq)
And post “Diagnostic Info” from Admin > Troubleshooting.

Hi @flamber

In bq DB use JVM timezone is off.

Here is diagnostic info

{
  "browser-info": {
    "language": "en-US",
    "platform": "Linux x86_64",
    "userAgent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36",
    "vendor": "Google Inc."
  },
  "system-info": {
    "file.encoding": "UTF-8",
    "java.runtime.name": "OpenJDK Runtime Environment",
    "java.runtime.version": "11.0.7+10-jvmci-20.1-b02",
    "java.vendor": "GraalVM Community",
    "java.vendor.url": "https://www.graalvm.org/",
    "java.version": "11.0.7",
    "java.vm.name": "OpenJDK 64-Bit Server VM",
    "java.vm.version": "11.0.7+10-jvmci-20.1-b02",
    "os.name": "Linux",
    "os.version": "5.3.0-1035-aws",
    "user.language": "en",
    "user.timezone": "America/Los_Angeles"
  },
  "metabase-info": {
    "databases": [
      "bigquery",
      "postgres"
    ],
    "hosting-env": "unknown",
    "application-database": "postgres",
    "application-database-details": {
      "database": {
        "name": "PostgreSQL",
        "version": "11.8"
      },
      "jdbc-driver": {
        "name": "PostgreSQL JDBC Driver",
        "version": "42.2.8"
      }
    },
    "run-mode": "prod",
    "version": {
      "date": "2020-10-26",
      "tag": "v0.37.0.2",
      "branch": "release-x.37.x",
      "hash": "ba7be09"
    },
    "settings": {
      "report-timezone": "US/Pacific"
    }
  }
}

@madkap
Have you tried enabling JVM timezone?
You can see the query being used when grouping - either by using the “Show the SQL” or looking at the request data in the browser developer console.
I’m guessing you’re seeing the warning triangle because the timezones doesn’t match name.

Unfortunately enabling JVM timezone does not change the results.

Thanks for your help with this, seems like a Bigquery issue, not a metabase issue.

Hi @flamber

I'm still investigating this and wanted to see if if the below behavior is expected or if this is a bug?

When the run the below query in the metabase sql editor

select 
     timestamp_trunc(completed_at,day) ts_trunc,
     completed_at
from mydb.mytable
     where lineitem_id='12345'

I get the below results in the metabase ui

ts_trunc = Nov 3, 2020, 4:00 PM
completed_at = Nov 4, 2020, 10:03 AM

When I export the results from the query editor to CSV I get the below results

ts_trunc = 2020-11-04
completed_at = 2020-11-04T18:03:53.148Z

My reporting time is set to 'US/Pacific' so everything appears consistent except the output of timestamp_trunc in metabase UI (csv output is correct). It appears that there is some kind of timezone conversion happening with timestamp_trunc so it is showing the previous day in the UI. Is this expected behavior? Is there any way to have metabase ui show correct output for timestamp_trunc?

@madkap Looks like this:
https://github.com/metabase/metabase/issues/13677 - upvote by clicking :+1: on the first post

@flamber

Thanks for the info. The export is not currently causing me any issues I’m curious why timestamp_trunc in the MB UI is not outputting expected results. This is affecting how the question builder does group by “day”.

For example in my above example if

completed_at = Nov 4, 2020, 10:03 AM

and I ran timestamp_trunc(completed_at,day), I would expect the output to be Nov 4, 2020.

However I’m getting output Nov 3, 2020, 4:00 PM.

It seems like MB is truncating & converting the timestamp instead of just truncating.

Am I misunderstanding how this is related to the github issue you linked?

@madkap I cannot reproduce, but you’re more than welcome to open an issue with steps-to-reproduce (since this involves a specific database, then a sample of that is needed).

The BigQuery driver has a long-standing issue where DATE values seem to get converted into the local timestamp equivalent (using the JVM timezone) of the UTC day at midnight. For date based graphs where you format the date without a time, this manifests as days rather silently being off by one. I’ve gotten in the habit of casting all dates to STRING to work around this—metabase processes those correctly as dates without converting it to a UTC datetime and your charts and tables will show the value properly.

Sidenote: I feel lots guilt writing about this asinine workaround that I’ve made a habit for a year+ without knowing whether or not an existing issue actually captures this. There have been quite a few timezone issues with BQ that have been reported (and many resolved), but I haven’t taken the time to look through the current ones to make sure it’s recorded there somewhere. I’ll look around again and open an issue if I can’t find one.

It sounds like this issue is very related:
https://github.com/metabase/metabase/issues/12380

@lukeman

Thanks for the info, I tried your workaround and it seems that it works when viewing the data in a table but when I try to use the question builder and group by day I still get the “offset by 1 day” issue. I think what is happening is that even though the “correct” date time string is being passed while running the query, when the field comes back to metabase, it is converted back to UTC again.

Were you able to use the question builder and group by day and see correct results etc?

So far I’ve not found a workaround for date grouping in the query builder. Most of the data I expose to Metabase is through the use of views though (typically just augmenting an existing source table with calculated columns that I can just pull into Metabase for convenience), so I do have a few tables where I’ve baked in a day_string column that can be used for display without manually casting each time.