We are seeing a very large number of queries being run by Metabase, presumably related to syncing the schemas in our DBs. During a Snowflake contract discussion, it came up that our Snowflake costs look to be 25% higher than they should be, costing us hundreds (almost thousands) of dollars a month.
Here's what we saw in April in terms of queries being run in Snowflake and the credits associated with those queries:
Here's the query:
-- Top 20 query texts by cloud services credits for one warehouse and month
select
    query_text,
    user_name,
    sum(credits_used_cloud_services) as credits_used_cloud_services,
    count(query_text) as count  -- how many times each query text ran
from snowflake.account_usage.query_history
where warehouse_name = 'ADHOC_REPORTING'
  and date_trunc(month, start_time) = '2022-07-01'
group by 1, 2
order by credits_used_cloud_services desc
limit 20;
At $3 / credit this was at least $600. The thing that's alarming to me is the count of queries. We have our databases set up to only sync daily, yet these queries are each being run between 100,000 and 500,000 times in a month. At the low end that's almost 140 times every hour, and at the high end almost 700 times every hour, for an entire month.
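For anyone who wants to check the hourly numbers, here's a quick sketch of the arithmetic. It assumes a 30-day month (720 hours), which is my simplification; the function name is just for illustration.

```python
# Rough sanity check of the per-hour query rates quoted above.
# Assumption: a 30-day month, i.e. 720 hours.
HOURS_PER_MONTH = 30 * 24


def queries_per_hour(monthly_count: int, hours: int = HOURS_PER_MONTH) -> float:
    """Average number of runs per hour for a query run monthly_count times."""
    return monthly_count / hours


low = queries_per_hour(100_000)   # low end of the observed counts
high = queries_per_hour(500_000)  # high end of the observed counts

print(f"low end:  {low:.0f} queries/hour")   # low end:  139 queries/hour
print(f"high end: {high:.0f} queries/hour")  # high end: 694 queries/hour
```

Either way, that is far more than a once-a-day sync should generate.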
This seems crazy to me. Here are the settings that show the warehouse and user:
Is this happening to anyone else? This seems like something has gone horribly, horribly wrong.
I looked for an issue on GitHub but didn't find anything that I think is related to what I'm seeing. I know there are some recent updates to Metabase that allow more selective syncing of databases and schemas for Snowflake (introduced in 43.0), but we haven't upgraded yet, and that setting would help with this behavior rather than fix it. We self-host on Elastic Beanstalk and are on 0.42.2.