Card API returning 504 on some questions in Metabase version 0.43.3

We have recently started getting 504 Gateway Time-out errors when opening certain questions. On investigation, we traced the errors to the api/card/<id> endpoint of the cards API. This endpoint works correctly in 99% of cases; only a small number of questions are affected.

Interestingly, the same queries that fail in the affected questions run without any issues when used to create a new question.

The logs in the Troubleshoot Center show the same thing for this endpoint:

[8f9373b0-c900-433c-ba15-bedb6cff09fc] 2023-05-01T15:10:19+05:30 DEBUG metabase.server.middleware.log GET /api/card/1727 200 1.9 mins (9 DB calls) App DB connections: 1/15 Jetty threads: 5/50 (20 idle, 0 queued) (154 total active threads) Queries in flight: 0 (0 queued)
[8f9373b0-c900-433c-ba15-bedb6cff09fc] 2023-05-01T15:10:19+05:30 DEBUG metabase.server.middleware.log GET /api/card/1727 null 1.9 mins (9 DB calls) App DB connections: 1/15 Jetty threads: 5/50 (20 idle, 0 queued) (154 total active threads) Queries in flight: 0 (0 queued)
[8f9373b0-c900-433c-ba15-bedb6cff09fc] 2023-05-01T15:10:19+05:30 INFO metabase.server.middleware.exceptions Request canceled before finishing.
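To find every request that behaves like the one above in a larger log file, a small script can pull the path, status and duration out of each `metabase.server.middleware.log` line. This is only a rough sketch: the regular expression is written against the two request lines pasted here, and the unit table (`ms`, `s`, `mins`, and so on) is an assumption about what other durations look like, not something taken from the Metabase source. The names `parse_request` and `slow_requests` are made up for illustration.

```python
import re

# Seconds per duration unit. Only "mins" is confirmed by the log lines above;
# the other unit names are assumptions and may need adjusting.
UNIT_SECONDS = {"µs": 1e-6, "ms": 1e-3, "s": 1.0, "min": 60.0, "mins": 60.0}

# Matches e.g. "GET /api/card/1727 200 1.9 mins (9 DB calls)".
LINE_RE = re.compile(
    r"(?P<method>GET|POST|PUT|DELETE) (?P<path>\S+) (?P<status>\S+) "
    r"(?P<value>\d+(?:\.\d+)?) (?P<unit>\S+) \((?P<db_calls>\d+) DB calls\)"
)


def parse_request(line):
    """Return a dict describing one request log line, or None if it does not match."""
    m = LINE_RE.search(line)
    if m is None or m["unit"] not in UNIT_SECONDS:
        return None
    return {
        "method": m["method"],
        "path": m["path"],
        "status": m["status"],  # kept as text, since it can be "null"
        "seconds": float(m["value"]) * UNIT_SECONDS[m["unit"]],
        "db_calls": int(m["db_calls"]),
    }


def slow_requests(lines, threshold_seconds=60.0):
    """Return the parsed requests that took at least threshold_seconds."""
    parsed = (parse_request(line) for line in lines)
    return [p for p in parsed if p is not None and p["seconds"] >= threshold_seconds]
```

Calling `slow_requests(open("metabase.log"))` (the file name is a placeholder) would list the card IDs worth comparing against each other.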

Attaching diagnostic info as well:

{
  "browser-info": {
    "language": "en-IN",
    "platform": "MacIntel",
    "userAgent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/112.0.0.0 Safari/537.36",
    "vendor": "Google Inc."
  },
  "system-info": {
    "file.encoding": "UTF-8",
    "java.runtime.name": "OpenJDK Runtime Environment",
    "java.runtime.version": "11.0.15+10",
    "java.vendor": "Eclipse Adoptium",
    "java.vendor.url": "https://adoptium.net/",
    "java.version": "11.0.15",
    "java.vm.name": "OpenJDK 64-Bit Server VM",
    "java.vm.version": "11.0.15+10",
    "os.name": "Linux",
    "os.version": "4.14.305-227.531.amzn2.x86_64",
    "user.language": "en",
    "user.timezone": "GMT"
  },
  "metabase-info": {
    "databases": [
      "postgres",
      "snowflake"
    ],
    "hosting-env": "unknown",
    "application-database": "postgres",
    "application-database-details": {
      "database": {
        "name": "PostgreSQL",
        "version": "12.7"
      },
      "jdbc-driver": {
        "name": "PostgreSQL JDBC Driver",
        "version": "42.3.2"
      }
    },
    "run-mode": "prod",
    "version": {
      "date": "2022-06-13",
      "tag": "v0.43.3",
      "branch": "release-x.43.x",
      "hash": "c9c7ef0"
    },
    "settings": {
      "report-timezone": null
    }
  }
}

Please upgrade to the latest version and see if this keeps happening. A few things that come to mind:

  1. The API call shouldn’t take almost 2 minutes; something weird is going on there.
  2. If possible, post the definition of the question, then try to recreate it and check the diff between the working and non-working ones.

"post if possible the definition of the question and try to recreate it and check the diff of the working and non working ones"

Do you mean the raw SQL we are using to generate the question here?
And yes, we also find it weird that it times out after 2 minutes in some cases, while others take barely milliseconds.

What I mean is the following: when you hit /api/card/1727 you get a JSON document. Recreate that card, then compare the JSON from the first card with the JSON from the second.
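A comparison like that could be scripted roughly as follows. This is a sketch under assumptions rather than a tested recipe: it assumes a valid session token sent in the `X-Metabase-Session` header, and `fetch_card` and `diff_json` are hypothetical helper names, not part of Metabase. The sample dictionaries are invented and only mimic the general shape of a card.

```python
import json
import urllib.request


def fetch_card(base_url, card_id, session_token, timeout=30):
    """Fetch one card definition from GET /api/card/<id> and return it as a dict."""
    request = urllib.request.Request(
        f"{base_url}/api/card/{card_id}",
        headers={"X-Metabase-Session": session_token},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def diff_json(old, new, path=""):
    """Return (path, old_value, new_value) for every value that differs.

    Nested dicts are walked key by key; lists and scalars are compared whole.
    A missing key and a key set to None are treated as the same thing.
    """
    if isinstance(old, dict) and isinstance(new, dict):
        diffs = []
        for key in sorted(set(old) | set(new)):
            diffs += diff_json(old.get(key), new.get(key), f"{path}/{key}")
        return diffs
    if old != new:
        return [(path, old, new)]
    return []


if __name__ == "__main__":
    # Invented sample data standing in for the two API responses.
    original = {"display": "table", "dataset_query": {"native": {"query": "SELECT 1"}}}
    recreated = {"display": "bar", "dataset_query": {"native": {"query": "SELECT 1"}}}
    for path, old, new in diff_json(original, recreated):
        print(f"{path}: {old!r} -> {new!r}")
```

In practice the two dicts would come from `fetch_card(...)` for the old and the recreated card; fields such as IDs and timestamps will always differ and can be ignored.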

The problem is that it doesn't return any data; it just times out with a 504.

Then I would check whether the database is healthy or whether the application is losing its connection to the DB, since there should be a response.