Sync with Athena

Hi,

I’m using Metabase v0.37.3 on ECS on AWS. Since most of the data I work with is in Athena, I used the driver to connect to it, which works great. However, I can’t seem to control the sync process, which queries Athena constantly, more than a hundred times per hour.
Is there a way to control how often the sync is being done to avoid too much querying in the background?

I have already enabled the following option, but it doesn’t seem to change anything: “This is a large database, so let me choose when Metabase syncs and scans”.

Thanks for the help

Hi @alex23
As with all databases, you can change the sync schedule, and disable the scan if needed, via the settings:
https://www.metabase.com/docs/latest/administration-guide/01-managing-databases.html#database-sync-and-analysis
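
If you would rather script this than click through the admin page, here is a minimal sketch of what the request body might look like. It assumes the database can be updated via `PUT /api/database/:id` and that the fields `is_full_sync`, `is_on_demand` and `schedules.metadata_sync` control the scan and sync schedule; check those names against the API docs for your version before relying on them. The helper `build_sync_settings` is hypothetical and only builds the payload; the endpoint may also require other fields (such as the database name and engine) that are not shown here.

```python
import json


def build_sync_settings(sync_hour, scan_enabled=False):
    """Build the scheduling part of a database update payload (hypothetical helper)."""
    if not 0 <= sync_hour <= 23:
        raise ValueError("sync_hour must be between 0 and 23")
    return {
        # Assumed field: whether the regular field-value scan runs at all.
        "is_full_sync": scan_enabled,
        # Assumed field: only scan when a filter widget asks for values.
        "is_on_demand": False,
        # Assumed structure: run the metadata sync once a day at sync_hour.
        "schedules": {
            "metadata_sync": {
                "schedule_type": "daily",
                "schedule_hour": sync_hour,
            }
        },
    }


# Example: sync once a day at 03:00 and keep scans disabled.
payload = build_sync_settings(3)
print(json.dumps(payload, indent=2))
```

You would then send `payload` as the JSON body with an authenticated session. Note that this only affects the scheduled sync and scan, not the one-off fingerprinting.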

Thanks a lot, I did not realize there was a scheduling option appearing at the top of the page!

@flamber,

Even after enabling scheduling manually, I still get that kind of query on Athena which scans a lot of data and costs a bunch of money:
-- Metabase
SELECT "source"."year" AS "year", "source"."substring672021" AS "substring672021", "source"."substring672022" AS "substring672022", "source"."users" AS "users", "source"."week" AS "week", "source"."substring672023" AS "substring672023" FROM (SELECT "derived_prod"."sporza_usage_aggregated"."yearweek" AS "yearweek", "derived_prod"."sporza_usage_aggregated"."segment" AS "segment", "derived_prod"."sporza_usage_aggregated"."touchpoint_platform" AS "touchpoint_platform", "derived_prod"."sporza_usage_aggregated"."year" AS "year", "derived_prod"."sporza_usage_aggregated"."users" AS "users", "derived_prod"."sporza_usage_aggregated"."week" AS "week", substring("derived_prod"."sporza_usage_aggregated"."yearweek", 1, 1234) AS "substring672021", substring("derived_prod"."sporza_usage_aggregated"."segment", 1, 1234) AS "substring672022", substring("derived_prod"."sporza_usage_aggregated"."touchpoint_platform", 1, 1234) AS "substring672023" FROM "derived_prod"."sporza_usage_aggregated") "source" LIMIT 10000

@alex23 Those are fingerprinting queries (part of the initial sync), which help Metabase work out which data types you have, so that each column can be given its best-guess field type.

Fingerprinting is only done once, and then again whenever a new fingerprint version is released. The current version is 5, so that’s roughly once a year.

You cannot disable the initial sync, since Metabase would then have no idea which tables/columns exist in your database, so you basically wouldn’t be able to use the GUI to make questions.

It’s very important to understand the difference between scans, which you can disable or reschedule, and sync (including fingerprinting), which cannot be disabled.

Fingerprinting and scans are somewhat resource-intensive tasks, but sync (by itself) is a very fast process.

Again, it has nothing to do with Athena - this is how Metabase works for all databases.