Not able to disable starburst connector field sync after Metabase version 0.43.4 upgrade

Metabase Version :- 0.43.4
Starburst Connector Version :- 1.0.5

I am not able to stop the Metabase sync, and it's running almost every hour. Please refer to the screenshot below. Not sure why.

Hi @nilesh.sinha
You should report the problem in the driver repo: https://github.com/starburstdata/metabase-driver

Is this sync schedule logic really driver specific? The user says they followed recommendations here:


and set the field's has_field_values to none. It seems like their issue is related to this GitHub issue. Another thing to note: they are on version 1.43.4. Do they need to wait for a newer version to pick up some recent changes?

@adibs6 That issue is scan, and addressed in upcoming v44. The original issue was for sync.

Ok, so to clarify: the user disabled the "Scanning for Filter Values" setting, which is described as "Metabase can scan the values...". The user is claiming these scans still happen after turning it off. Are you saying this scan setting is not the same as the issue you resolved in 1.44?

Also, would scan tasks show up in task_history as scan_*?

@adibs6 You're seeing these, which are related to 16104 - all of them fixed in v44.
https://github.com/metabase/metabase/issues/16101
https://github.com/metabase/metabase/issues/16102
https://github.com/metabase/metabase/issues/16103

Scans are only viewable by looking in the database query log, or by enabling trace logging in Metabase.

The user claims the above, but from what I see, you cannot stop Metabase syncs; you can only change the cadence at which they run.

From their configs, the only thing that the user disabled was Metabase scans, right? From what you're saying:

Scans are only viewable by looking in the database query log, or by enabling trace logging in Metabase.

This means they can't even see scan events from the tables.

So the user should expect to see sync events in this table, since syncs cannot be turned off. Is this correct to assume?

@adibs6 You cannot disable sync (without some hacks); you can only switch between daily and hourly.
https://github.com/metabase/metabase/issues/10398

You can see analysis (sync, fingerprinting and scan) progress in the logs. You cannot see the queries unless you enable trace logging or look in the database query log.

There will always be sync activity, unless it has been disabled with hacks.

Also, @adibs6, I have scheduled the sync to run daily, which is not what is happening here.

Hello Nilesh,

From my logs, I was able to see a sync happen at the cadence I set. I just set my sync to daily to try to reproduce, but are you able to confirm from the logs that indeed more sync events are happening than expected?

Yes @adibs6, I can see sync happening more than once a day.

Please check the screenshot of the currently available logs.

I made some changes to scheduling on my side, and I will compare my logs to yours. Also, I noticed that things are slightly out of order here. I see it jumps from 07-27 to 07-25 and back to 07-27. I am not sure exactly why this is; maybe it is just related to how you are grepping the logs.

Yes, ordering we can ignore. It's just the way I am grepping it.

I'm unable to reproduce. It seems like my changes went through and are working as expected. Notice here that the sample H2 DB has the default sync schedule, while my Starburst DB syncs daily at midnight:

2022-07-27 00:00:01,404 INFO sync.util :: STARTING: step 'sync-tables' for starburst Database 24 'SB' //MIDNIGHT SYNC!
2022-07-27 00:24:00,018 INFO sync.util :: STARTING: step 'sync-tables' for h2 Database 1 'Sample Database'
2022-07-27 01:24:00,010 INFO sync.util :: STARTING: step 'sync-tables' for h2 Database 1 'Sample Database'
2022-07-27 02:24:00,007 INFO sync.util :: STARTING: step 'sync-tables' for h2 Database 1 'Sample Database'
2022-07-27 03:24:00,008 INFO sync.util :: STARTING: step 'sync-tables' for h2 Database 1 'Sample Database'
2022-07-27 04:24:00,008 INFO sync.util :: STARTING: step 'sync-tables' for h2 Database 1 'Sample Database'
2022-07-27 05:24:00,022 INFO sync.util :: STARTING: step 'sync-tables' for h2 Database 1
... Just H2 syncs from here on out ....

So, I would suggest you look at all your DBs and make sure you updated the sync schedules for everything, specifically your DataLake DB. Also, maybe something is failing and a retry is being attempted. Can you search the logs for any sort of errors or warnings? One thing that is strange is that yours doesn't seem to be syncing on any regular schedule. Notice that my syncs all happen at a fixed cadence, either every 24 hours for my SB DB or hourly in the case of H2. Yours seem sort of random, which makes me think that some sort of retry is happening.
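To make the cadence comparison above less dependent on eyeballing grep output, here is a minimal Python sketch that measures the gap between consecutive sync starts per database. It assumes log lines shaped like the 'sync-tables' lines pasted above; the function name sync_intervals and the regular expression are illustrative and not part of Metabase.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumed line shape, taken from the log excerpt in this thread:
# 2022-07-27 00:24:00,018 INFO sync.util :: STARTING: step 'sync-tables' for h2 Database 1 'Sample Database'
SYNC_START = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+ INFO sync\.util :: "
    r"STARTING: step 'sync-tables' for (?P<engine>\S+) Database (?P<db_id>\d+)"
)


def sync_intervals(lines):
    """Return {db_id: [minutes between consecutive sync starts]}."""
    starts = defaultdict(list)
    for line in lines:
        match = SYNC_START.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S")
            starts[int(match.group("db_id"))].append(stamp)

    intervals = {}
    for db_id, stamps in starts.items():
        stamps.sort()  # grep output may not be in chronological order
        intervals[db_id] = [
            round((later - earlier).total_seconds() / 60)
            for earlier, later in zip(stamps, stamps[1:])
        ]
    return intervals
```

A database on an hourly schedule should show gaps of about 60 minutes and a daily one about 1440; gaps that vary widely would support the retry or manual-trigger theories. To run it over a file, pass the open file object, for example sync_intervals(open("metabase.log")), where the file name is a placeholder.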

@flamber Lastly, can you confirm whether, even if there is a sync scheduling issue, this would be a Starburst driver issue? If there is a sync scheduling issue, I'm leaning towards it being a Metabase issue. When implementing the driver, I added sync logic but never touched anything to do with how the sync is scheduled, if I recall correctly.

@adibs6 I have referenced a lot of issues, all of them fixed in v44, so try making a backup and testing 44.0-rc2: https://github.com/metabase/metabase/releases

I cannot tell what is going on without logs, but perhaps someone is manually running re-sync?

perhaps someone is manually running re-sync?

Yeah, that makes sense: if someone is triggering a sync manually, it will show up in the logs as just another sync. @nilesh.sinha I am not sure whether you can see this from the Metabase audit tables, but you should confirm on your side whether others are doing manual syncs with this button:

@adibs6 There would be a POST request in the logs indicating that a sync was triggered manually, as opposed to a scheduled sync.

@adibs6 There hasn't been any manual sync trigger, but I am seeing some warnings:

2022-07-27 01:45:15,695 WARN sync.describe-table :: Don't know how to map column type 'string' to a Field base_type, falling back to :type/*.
2022-07-27 01:45:15,695 WARN sync.describe-table :: Don't know how to map column type 'string' to a Field base_type, falling back to :type/*.
2022-07-27 01:45:15,695 WARN sync.describe-table :: Don't know how to map column type 'string' to a Field base_type, falling back to :type/*.
2022-07-27 01:45:15,695 WARN sync.describe-table :: Don't know how to map column type 'string' to a Field base_type, falling back to :type/*.
2022-07-27 01:45:15,695 WARN sync.describe-table :: Don't know how to map column type 'string' to a Field base_type, falling back to :type/*.
2022-07-27 01:45:15,695 WARN sync.describe-table :: Don't know how to map column type 'string' to a Field base_type, falling back to :type/*.
2022-07-27 01:45:15,695 WARN sync.describe-table :: Don't know how to map column type 'string' to a Field base_type, falling back to :type/*.

Weird, why do you have a column type of string? We do not even have a string type: Data types — Trino 438 Documentation

Can you confirm/provide:

  • Confirm that it is indeed our Starburst driver sync that is producing the column type 'string'.
  • Provide the version of Trino/Starburst you are running.
  • Provide info on what Trino connector(s) you are using.
  • Grep the logs for the POST command for your DB like so: grep "POST /api/database/39/sync_schema" (I assume your DB ID is 39 here based on the logs you shared).
  • Run SELECT initial_sync_status, auto_run_queries, refingerprint, cache_field_values_schedule, metadata_sync_schedule, is_full_sync, name, id FROM metabase_database; on your application DB. WARNING: Please be careful about what you paste from this output!
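Since the DB ID of 39 in the grep above is only a guess, a small Python sketch can instead count manual sync requests for every database ID found in the logs. It only relies on the request path quoted in the grep command; the function name manual_sync_counts is illustrative, and no particular format for the rest of the log line is assumed.

```python
import re
from collections import Counter

# Request path taken from the grep command above; the ID is captured
# instead of being hard-coded.
MANUAL_SYNC = re.compile(r"POST /api/database/(\d+)/sync_schema")


def manual_sync_counts(lines):
    """Count manual sync requests per database ID."""
    counts = Counter()
    for line in lines:
        match = MANUAL_SYNC.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts
```

An empty result means no line in the scanned logs contains that request path, which is consistent with no one pressing the manual sync button during the logged period.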

You can see here I indeed only have one sync for this DB at 0 0 0 * * ? * and full sync is disabled.

metabase=# SELECT initial_sync_status, auto_run_queries, refingerprint, cache_field_values_schedule, metadata_sync_schedule, is_full_sync, name, id FROM metabase_database;
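 initial_sync_status | auto_run_queries | refingerprint | cache_field_values_schedule | metadata_sync_schedule | is_full_sync | name            | id
---------------------+------------------+---------------+-----------------------------+------------------------+--------------+-----------------+----
 complete            | t                |               | 0 0 13 * * ? *              | 0 0 0 * * ? *          | f            | SB              | 24
 complete            | t                |               | 0 0 17 * * ? *              | 0 24 * * * ? *         | t            | Sample Database |  1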
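For readers unsure how to read the schedule columns, the following Python sketch classifies such strings as hourly or daily. It assumes they are Quartz-style cron expressions with fields in the order second, minute, hour, day-of-month, month, day-of-week and an optional year; the function name describe_schedule is illustrative, and anything that is not a plain hourly or daily pattern is simply reported as "other".

```python
QUARTZ_FIELDS = (
    "second", "minute", "hour", "day_of_month", "month", "day_of_week", "year",
)


def describe_schedule(expr):
    """Classify a Quartz-style cron string as hourly, daily, or other."""
    parts = expr.split()
    if len(parts) not in (6, 7):
        raise ValueError(f"expected 6 or 7 cron fields, got {len(parts)}")
    fields = dict(zip(QUARTZ_FIELDS, parts))

    # Only handle schedules that fire every day of every month.
    every_day = (
        fields["day_of_month"] == "*"
        and fields["month"] == "*"
        and fields["day_of_week"] in ("?", "*")
    )
    minute, hour = fields["minute"], fields["hour"]
    if not every_day or not minute.isdigit():
        return "other"
    if hour == "*":
        return f"hourly at minute {int(minute)}"
    if hour.isdigit():
        return f"daily at {int(hour):02d}:{int(minute):02d}"
    return "other"
```

Applied to the rows above, the SB metadata_sync_schedule reads as daily at 00:00 and the Sample Database one as hourly at minute 24, which matches the :24 timestamps in the H2 log lines earlier in the thread.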

Please find the output of the query below.

Also, that grep command did not return anything.
Trino version :- 374
Trino Connector :- Varada