HTTP Caching for javascript

G'day,

Summary
How can we configure Metabase to allow caching for the java script please?

Longer
We're using Metabase and find that the dashboard load very slowly. On inspection, we discovered that that Javascript is getting loaded very time, and NOT caching in the browsers. We are seeing page load times in the ~6 seconds (very slow!!)

The reason seems to be that the javascript files (.js) have caching completely disabled. This might make sense for Metabase developers to ensure the new javascript works quickly during development. However for production this is really bad. You definitely want caching in your origin layer, in the CDN, and in the browser.

The headers we see are:

Cache-Control: max-age=0, no-cache, must-revalidate, proxy-revalidate

We would like to allow caching for perhaps 7 days ( 604800s ) , with stale-while-revalidate 2 days ( 172800)

Cache-Control: max-age=604800, stale-while-revalidate=172800

If we could set the headers it will mean we can offload the work of serving the java script from the Java web server. Diagram would be like this:

[slow] Java web server -> Nginx proxy/cache -> CDN proxy/cache -> Browser cache

The documentation about caching does not cover HTTP caching at all. :frowning:

We are going to assume there's no way to correct the caching for now and will rewrite the headers with our service mesh (Istio).

Thanks,
Dave Seddon

version?

0.50.19

Ah. Looks like there is a ticket about this, where somebody tried to fix it, but maybe they didn't fix it.

The ticket is closed, yet the last comment is this isn't fixed. - The fix doesn't look too good to me. I'd recommend using stale-while-revalidate if possible

Fonts aren't cachable either

https://metabase.dev.siden.io/app/fonts/Lato/lato-v16-latin-700.woff2
cache-control: max-age=0, no-cache, must-revalidate, proxy-revalidate

@daveseddon please upgrade to at least 52, and check again if this works

Thanks @Luiggi

We will test the latest version.

To help explain what I mean about the cache control headers, this is a draft MR. Please note that it is not clear to me how to make the fonts cachable.

We upgraded to latest 0.53.4.

JSS and CSS is cachable. "public, max-age=31536000".

Fonts are not cachable. "max-age=0, no-cache, must-revalidate, proxy-revalidate".

What's really killing page load times is all the api.js making the queries. We will try cranking up the number of database connections.

Actually, reloading does look better now. Caching definitely helping.

The per user settings call is actually pretty slow. It's taking 100ms for us, and I assume this is blocking everything.

I wonder if this could be made cacheable for some amount of time?

Currently it is "max-age=0, no-cache, must-revalidate, proxy-revalidate"

https://metabase.dev.siden.io/api/session/properties

Actually, just looking at what's in the properties response, there is a LOT of stuff going to the client that doesn't need to be.

For example, we are using Postgres and Clickhouse, so what do we need in here?

  • Databricks = no
  • Druid JDBC = no
  • Postgres = yes
  • SparkSQL = no
  • Mongo = no
  • Druid = no
  • Redshift = no
  • bigquery-cloud-sdk = no
  • snowflake = no
  • athena = no
  • presto-jdbc = no
  • h2 - this was the default, and we just switched to postgres = no
  • clickhouse = yes
  • sqlite = no
  • mysql = no
  • sqlserver = no

So if this /api/session/properties could filter what it returns to only the enabled databases, then this json would be a LOT smaller

@Luiggi

Unfortunately, we had to roll back. :frowning:

The issue was that when you write custom query the dashboard was not allowing the filtration fields on those custom query models which works fine with older version. Not able to add filters using the new query.

Please send us some reproduction of what you're seeing since I don't think I've heard something similar ever

also, about the /properties call... we're talking about 13kb...

@Luiggi

The description of the issue from my team:

The basic use case is a model with a native query, in this case SQL against ClickHouse database, then a question based on that model, which gets added to a dashboard. Before the upgrade, it was possible to create a dashboard filter and bind that filter to a field in the question based on the native-query model. After the upgrade, it was no longer possible to bind the dashboard filter to any field in that same question.

The type of the filter was text.

"also, about the /properties call... we're talking about 13kb..."

yes, but this happens for every user, and is blocking. e.g. All the rest of the calls don't start until this properties returns. This call is taking 100ms is a long time. 1/10th of a second.

Larry Page famously says google search results need to take <200ms, and that's obviously for a search against the entire internet, not just some client config.

Recommend to:

  1. Only return setting that the client needs.
  2. Make it cacheable with stale while revalidate

We've cranked up the settings to get more performance.

For other's here's our config ( I couldn't find any examples )

  - apiVersion: external-secrets.io/v1beta1
    kind: ExternalSecret
    metadata:
      name: metabase-pg-env
    spec:
      data:
        - remoteRef: &defaultRemoteRef
            conversionStrategy: Default
            decodingStrategy: None
            metadataPolicy: None
            key: <SNIP>
          secretKey: password
        - remoteRef:
            <<: *defaultRemoteRef
            key: <SNIP>
          secretKey: user
      secretStoreRef:
        kind: ClusterSecretStore
        name: system-manager
      target:
        template:
          data:
            MB_DB_TYPE: postgres
            MB_DB_DBNAME: metabase
            MB_DB_PORT: "5432"
            MB_DB_USER: "{{ .user }}"
            MB_DB_PASS: "{{ .password }}"
            MB_DB_HOST: metabase-pg-rw
            # Metabase at scale
            #
            # https://www.metabase.com/learn/metabase-basics/administration/administration-and-operation/metabase-at-scale
            #- For every 20 concurrent users, figure roughly need 1 CPU core and 1GB of RAM
            #- Machines with 4-8 Gigabytes should handle hundreds of users, and you can bump the number of cores and gigabytes of memory if needed.
            #
            # Reference issues
            # https://github.com/metabase/metabase/issues/12177
            # https://github.com/metabase/metabase/issues/42139
            #
            # https://www.metabase.com/docs/latest/configuring-metabase/environment-variables#mb_application_db_max_connection_pool_size
            # default MB_APPLICATION_DB_MAX_CONNECTION_POOL_SIZE = 15
            MB_APPLICATION_DB_MAX_CONNECTION_POOL_SIZE: "100"
            #
            # https://www.metabase.com/docs/latest/configuring-metabase/environment-variables#mb_async_query_thread_pool_size
            # default MB_ASYNC_QUERY_THREAD_POOL_SIZE = 50
            MB_ASYNC_QUERY_THREAD_POOL_SIZE: "100"
            #
            # https://www.metabase.com/docs/latest/configuring-metabase/environment-variables#mb_attachment_table_row_limit-1
            # default MB_ATTACHMENT_TABLE_ROW_LIMIT = 20
            MB_ATTACHMENT_TABLE_ROW_LIMIT: "100"
            #
            # https://www.metabase.com/docs/latest/configuring-metabase/environment-variables#mb_jdbc_data_warehouse_max_connection_pool_size
            # default MB_JDBC_DATA_WAREHOUSE_MAX_CONNECTION_POOL_SIZE = 15
            MB_JDBC_DATA_WAREHOUSE_MAX_CONNECTION_POOL_SIZE: "100"
            #
            # https://www.metabase.com/docs/latest/configuring-metabase/environment-variables#mb_jetty_maxthreads
            # default MB_JETTY_MAXTHREADS = 50
            MB_JETTY_MAXTHREADS: "200"
            #
            # https://www.metabase.com/docs/latest/configuring-metabase/environment-variables#mb_jetty_minthreads
            # default MB_JETTY_MINTHREADS = 8
            MB_JETTY_MINTHREADS: "32"
            #
            # https://www.metabase.com/docs/latest/configuring-metabase/environment-variables#mb_jetty_maxidletime
            # default MB_JETTY_MAXIDLETIME = 200000 ( 200 seconds = 3.33... minutes)
            # Let's increase this to 10 minutes
            MB_JETTY_MAXIDLETIME: "600000"
            #
            # https://www.metabase.com/docs/latest/configuring-metabase/environment-variables#mb_jetty_daemon
            # default MB_JETTY_DAEMON false
            # It's not clear what changing this will do.  Let's leave it for now.
            #
            # https://www.metabase.com/docs/latest/data-modeling/model-persistence#difference-between-persisted-models-and-caching
            #https://www.metabase.com/docs/latest/configuring-metabase/environment-variables#mb_persisted_models_enabled
            # default MB_PERSISTED_MODELS_ENABLED = false
            # not suppored with clickhouse
            # MB_PERSISTED_MODELS_ENABLED: true
            #
            # https://www.metabase.com/docs/latest/configuring-metabase/environment-variables#mb_query_caching_max_kb
            # default MB_QUERY_CACHING_MAX_KB = 2000 (2 MB)
            # Let's increase this by x100 to 200MB
            MB_QUERY_CACHING_MAX_KB: "200000"
            #
            # https://www.metabase.com/docs/latest/configuring-metabase/environment-variables#mb_sql_jdbc_fetch_size
            # default MB_SQL_JDBC_FETCH_SIZE = 500
            MB_SQL_JDBC_FETCH_SIZE: "1000"

            JAVA_OPTS: -Xms8G -Xmx8G -XX:+UseCompressedOops -XX:+UseParallelGC -Xlog:gc*:file=/tmp/metabase-gc.log

We also increased the max connections for Postgres, because the default was 100.

Although this didn't seem to help actually. properties took 312ms!

On refresh properties too 90ms

Again, it took 59! Yay! But look the fonts took 239+249ms. :frowning:

Upgrade the clickhouse driver, native questions should expose the filter values to the dashboard filters

Thanks Luiggi

We used:
Metabase v0.53.4
Clickhouse driver 1.53.2

Oh. This slow /api/session/properties is going away. I guess they realized it's slow! Great! Can't wait :slight_smile:

;; TODO: deprecate /api/session/properties and have a single endpoint for listing settings
(api.macros/defendpoint :get "/"
  "Get all `Settings` and their values. You must be a superuser or have `setting` permission to do this.
  For non-superusers, a list of visible settings and values can be retrieved using the /api/session/properties endpoint."
  []