Is there a way we can set fetch size for JDBC for Redshift database in metabase

From Redshift documentation- “By default, the JDBC driver collects all the results for a query at one time. As a result, when you attempt to retrieve a large result set over a JDBC connection, you might encounter a client-side out-of-memory error. To enable your client to retrieve result sets in batches instead of in a single all-or-nothing fetch, set the JDBC fetch size parameter in your client application.”

https://docs.aws.amazon.com/redshift/latest/dg/queries-troubleshooting.html#set-the-JDBC-fetch-size-parameter

I took a heap dump and noticed 7 com.amazon.redshift.client.PGMessagingContext occupying memory of approximately 800 MB (ranging from 150 MB to 40 MB). We had couple of reports running which is downloading about 70 MB (70,000 rows of about 1 KB each) csv file apart from other smaller queries. Possibly, we have large resultset in metabase instead of fetching small numbers of rows and then converting to csv.

Do we know if we can set fetch size for resultset and if that is going to help in this?

Regards,
Nitin

Hi @tachniki
Currently Metabase does not support streaming, but there’s a lot of work being done to implement that in upcoming versions (maybe 0.35).
It seems like you know more than me about this, so perhaps you could read through this PR:
https://github.com/metabase/metabase/pull/11554

Thanks @flamber for response.

I believe this will solve issue for lots of other database. But I doubt this will solve issue for AWS Redshift. Most database sets fetch size of 10 by default. But Redshift seems to set it to unlimited. So, unless set, it will still return all results at one time causing same issue in spite of async handling of query.

Regards,
Nitin

@tachniki You should probably create a feature request. Searching a bit on the subject it seems like Athena might be hit by this too.
https://github.com/metabase/metabase/issues/new/choose

Just for reference:

1 Like

Thanks for help.