Since upgrading to v0.46.1 (from v0.42.3) we have been facing issues accessing just one of /api/database/N/metadata API endpoints, in particular our main Redshift database. I am not sure what goes on but it times out before returning anything. Whilst another endpoint (N=4) works just fine and returns a JSON.
N=2 (Redshift) returns a 504 Gateway timeout and in the Metabase Logs it reads:
The main difference seems to be the numbers of models that is getting metadata for. The one that is failing is our main Data Warehouse, whilst the others are smaller dbs.
Does anyone have any idea what might causing this? Could there be a permission issue on the Metabase Postgres DB? Or some upper limit on the amount of data that can be returned? This end point worked fine before the migration...
There are some further error logs that begin with:
[a167b5ad-7695-411c-8047-91d5774cdc4b] 2023-05-04T23:03:33+01:00 ERROR metabase.server Unexpected Exception in API request handler
[a167b5ad-7695-411c-8047-91d5774cdc4b] 2023-05-04T23:03:33+01:00 ERROR metabase.server Unexpected exception in endpoint
Checking our postgres database, there are no locked queries.
I don't want to roll back if possible but without this endpoint working we cannot sync our docs to metabase which is a pain!
I set up a brand new instance of Metabase locally, using docker. I connected to the same Data Warehouse. Ran the command: http://localhost:3000/api/database/2/metadata and it returned data, slowly. I then let metabase finish syncing all the table names and columns in the background, I then retried the API call and this resulted in a 50 second API response. It didn't timeout I am guessing as I am running this locally.
Given this issue wasn't present on the same Database before the upgrade this feels like a performance issue that has been introduced between versions. Or indeed a timeout rule has been implemented.
Does anyone have any insight into changes of this nature?
So what you say is that you use that endpoint first to sync metadata and then the second time it takes a huge amount of time. How many databases/fields you have? How much data did you sync? I’m guessing this is a performance issue on the endpoints
When I was testing it locally, I attempted calling the endpoint before all the synchronisation was finished, and then again straight after. All this told me was that the endpoint only really becomes available once the sync finishes.
The main thing is that prior to upgrading to the new version of Metabase we used to call the endpoint without problem (despite there being no change in the number of tables "(1,672 DB calls)". Now it appears too slow and we hit a timeout on our K8s deployment of Metabase. It does work when I run a local version of Metabase, but 50seconds.
So I agree with you that this must be a performance issue on the endpoint. If it is calling the database, maybe an Index is missing or something simple like that?