HTTP error 400 Invalid SNI when upgrading from 0.45.3 to 0.46.0

Hello, everyone!

This is my first post on these forums, as we are giving Metabase OSS a try, getting to know it and checking its possibilities. So far we are very happy with what we are seeing!

Unfortunately, today I had the brilliant idea of upgrading Metabase from version 0.45.3 to version 0.46.0 and now I am seeing this error in the log file:

WARN server.HttpChannel :: handleException / org.eclipse.jetty.http.BadMessageException: 400: Invalid SNI

I presume that the embedded Jetty web server has been switched to enforcing SNI (jetty.ssl.sniRequired=true). The thing is, I thought I had my Java KeyStore properly configured. It contains just a wildcard certificate for the internal domain of the cluster, which is different from the public domain set up on NGINX, which acts as a reverse proxy:

# keytool -list -v -keystore keystore.jks 
Enter keystore password:  
Keystore type: PKCS12
Keystore provider: SUN

Your keystore contains 1 entry

Alias name: internaldomain.com
Creation date: Apr 4, 2023
Entry type: PrivateKeyEntry
Certificate chain length: 1
Certificate[1]:
Owner: CN=internaldomain.com
Issuer: CN=R3, O=Let's Encrypt, C=US
Serial number: 476c498fec6c6006b8b9a3a0d3f51850187
Valid from: Fri Mar 17 23:03:11 UTC 2023 until: Thu Jun 15 23:03:10 UTC 2023
Certificate fingerprints:
         SHA1: A2:6D:C1:08:FA:C0:62:61:4C:E8:C5:34:E7:29:36:28:F9:18:BF:B1
         SHA256: 6C:58:AE:E3:6D:C9:E4:CC:81:E0:4C:43:B3:A5:D3:9B:B7:26:84:A7:CF:C4:E0:44:AB:2C:87:07:C8:40:5D:94
Signature algorithm name: SHA256withRSA
Subject Public Key Algorithm: 256-bit EC (secp256r1) key
Version: 3

Extensions: 

#1: ObjectId: 1.3.6.1.4.1.11129.2.4.2 Criticality=false
0000: [..]
[..]
00F0: [..]

#2: ObjectId: 1.3.6.1.5.5.7.1.1 Criticality=false
AuthorityInfoAccess [
  [
   accessMethod: ocsp
   accessLocation: URIName: http://r3.o.lencr.org
, 
   accessMethod: caIssuers
   accessLocation: URIName: http://r3.i.lencr.org/
]
]

#3: ObjectId: 2.5.29.35 Criticality=false
AuthorityKeyIdentifier [
KeyIdentifier [
0000: [..]
0010: [..]
]
]

#4: ObjectId: 2.5.29.19 Criticality=true
BasicConstraints:[
  CA:false
  PathLen: undefined
]

#5: ObjectId: 2.5.29.32 Criticality=false
CertificatePolicies [
  [CertificatePolicyId: [2.23.140.1.2.1]
[]  ]
  [CertificatePolicyId: [1.3.6.1.4.1.44947.1.1.1]
[PolicyQualifierInfo: [
  qualifierID: 1.3.6.1.5.5.7.2.1
  qualifier: 0000: [..] sencrypt.org
]]  ]
]

#6: ObjectId: 2.5.29.37 Criticality=false
ExtendedKeyUsages [
  serverAuth
  clientAuth
]

#7: ObjectId: 2.5.29.15 Criticality=true
KeyUsage [
  DigitalSignature
]

#8: ObjectId: 2.5.29.17 Criticality=false
SubjectAlternativeName [
  DNSName: *.internaldomain.com
  DNSName: internaldomain.com
]

#9: ObjectId: 2.5.29.14 Criticality=false
SubjectKeyIdentifier [
KeyIdentifier [
0000: 0F A2 27 61 3A 09 AD 40   34 12 21 03 E8 18 81 4E  ..'a:..@4.!....N
0010: 7D 66 4F B8                                        .fO.
]
]

*******************************************
*******************************************
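As a rough aid for question 1 below, here is a minimal sketch of how the two SubjectAlternativeName entries in that listing relate to the host names involved in this setup. It assumes the usual rule that a wildcard covers exactly one left-most label; the function names are hypothetical and this is not Jetty's or Metabase's own matching code.

```python
from typing import Iterable


def san_matches(pattern: str, hostname: str) -> bool:
    """Return True if one SAN entry covers the hostname.

    Assumption: a leading "*." matches exactly one left-most label.
    """
    pattern = pattern.lower().rstrip(".")
    hostname = hostname.lower().rstrip(".")
    if pattern.startswith("*."):
        suffix = pattern[1:]  # e.g. ".internaldomain.com"
        if not hostname.endswith(suffix):
            return False
        first_label = hostname[: -len(suffix)]
        # The wildcard must stand for a single, non-empty label.
        return bool(first_label) and "." not in first_label
    return pattern == hostname


def certificate_covers(hostname: str, sans: Iterable[str]) -> bool:
    """Return True if any SAN entry covers the hostname."""
    return any(san_matches(san, hostname) for san in sans)


# SAN entries copied from the keytool listing above.
SANS = ["*.internaldomain.com", "internaldomain.com"]

for name in ("metabase1.internaldomain.com", "metabase.publicdomain.com"):
    print(name, certificate_covers(name, SANS))
```

Under that assumption the certificate covers the internal FQDN but not the public name that NGINX serves, which is worth keeping in mind when reading the proxy configuration that follows.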

This is how the NGINX 1.18 is configured:

upstream metabase {
  server metabase1.internaldomain.com:8080 fail_timeout=10s max_fails=1;
  keepalive 100;
  keepalive_requests 1000;
  keepalive_timeout 75s;
}

server {
  listen 443 ssl http2;
  listen [::]:443 ssl http2;
  server_name metabase.publicdomain.com;
  ssl_certificate /etc/ssl/certs/publicdomain.com.crt;
  ssl_certificate_key /etc/ssl/private/publicdomain.com.key;
  ssl_trusted_certificate /etc/ssl/chains/andronautic.com.chn;
  include inc/ssl-options.conf;
  resolver 192.168.0.253 192.168.0.254  ipv6=off;
  [.. logfiles ..]
  location / {
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_set_header X-Forwarded-Host $host;
    proxy_set_header Host $http_host;
    proxy_set_header Connection "keep-alive";
    proxy_set_header Early-Data $ssl_early_data;
    proxy_http_version 1.1;
    proxy_redirect off;
    proxy_ssl_session_reuse on;
    proxy_ssl_protocols TLSv1.2 TLSv1.3;
    proxy_ssl_trusted_certificate /etc/ssl/certs/ISRG_Root_X1.pem;
    proxy_pass https://metabase;
  }
}

This is the system info as seen on the Metabase logfile:

metabase1 metabase[6970]:  {"file.encoding" "UTF-8", 
metabase1 metabase[6970]:  "java.runtime.name" "OpenJDK Runtime Environment", 
metabase1 metabase[6970]:  "java.runtime.version" "11.0.18+10-post-Debian-1deb11u1", 
metabase1 metabase[6970]:  "java.vendor" "Debian", 
metabase1 metabase[6970]:  "java.vendor.url" "https://tracker.debian.org/openjdk-11", 
metabase1 metabase[6970]:  "java.version" "11.0.18", 
metabase1 metabase[6970]:  "java.vm.name" "OpenJDK 64-Bit Server VM", 
metabase1 metabase[6970]:  "java.vm.version" "11.0.18+10-post-Debian-1deb11u1", 
metabase1 metabase[6970]:  "os.name" "Linux", 
metabase1 metabase[6970]:  "os.version" "5.15.104-1-pve"

I have checked the environment variables document looking for some sort of MB_JETTY_SSL_SNI option, but I have not been able to find one. Not that it would be a good solution for a wanna-be production environment, but at least it would give me more time (downgrading to 0.45.3 does not seem to be an option due to changes in the database schema).

So, questions would be:

  1. Can anyone point out what is not right in my Java KeyStore?
  2. Alternatively, can anyone instruct me on how to prevent Jetty from checking SNI?

Thanks in advance. Keep up the good work on this product!

interesting, thanks for sending this issue, I found the same issue while trying to enable SSL on localhost: Env var to disable SNI check in Jetty · Issue #29660 · metabase/metabase · GitHub.

I'm guessing you're not trying that but rather on internaldomain.com, is that correct? what do you have set up in the site url setting of Metabase?

Hi, Luiggi. Thanks for replying to my post.

I think you may be referring to MB_SITE_URL, which is not set at the moment. I think that I should be setting it to the public URL, correct? In my case that would be:

MB_SITE_URL="https://metabase.publicdomain.com"

There is also MB_JETTY_HOST, which is set to 0.0.0.0 at the moment. I just read in the documentation that it can be set to a hostname, which in my case would be metabase1.internaldomain.com (the FQDN of the LXC in the cluster). Should I set it to the hostname instead of the IP address 0.0.0.0?

Thanks.

please try setting that env var to your URL and let me know if it works (try either of the two env vars); they should be set to a hostname

please take into account that it's the first time I've seen this issue, so I'm completely lost on how to proceed and I'm trusting you completely here

Hi, Luiggi.

I added these two environment variables:

MB_SITE_NAME="My Metabase"
MB_SITE_URL="https://metabase.publicdomain.com"

The error persisted.

Then I tried changing MB_JETTY_HOST=0.0.0.0 into MB_JETTY_HOST=metabase1.internaldomain.com, but the error persisted, too.

Also, I re-read the documentation about the MB_JETTY_HOST variable after reading the logfile output and I am inclined to believe that setting this one to a hostname does not make any difference, as it will resolve such host name to an IP address and then bind Jetty to it (that is its sole purpose, to bind Jetty to certain network interfaces).
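To illustrate that reading of the documentation, here is a small Python sketch (not anything Metabase runs; the function name is made up): resolving a bind host yields only IP addresses, so the name itself would play no part in a later TLS or SNI check.

```python
import socket
from typing import List


def bind_addresses(host: str, port: int) -> List[str]:
    """Return the IP addresses a server socket could bind to for `host`."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


# A numeric address resolves to itself; a hostname would resolve to its IP(s).
print(bind_addresses("0.0.0.0", 8080))
```

Calling it with metabase1.internaldomain.com instead would return whatever IP addresses that name resolves to in the cluster, which is all a bind call gets to see.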

Incidentally, when I rolled back the version (i.e. swapped JAR files from 0.46.0 to 0.45.3.1 and restored the previous PostgreSQL dump), I found this error when accessing the website, which has nothing to do with this thread:

2023-04-04 18:50:47,716 ERROR middleware.log :: #033[31mGET /api/bookmark 500 14.2 ms (1 DB calls) 
{:via 
[{:type org.postgresql.util.PSQLException, 
:message "ERROR: relation \"app\" does not exist\n  Position: 1752", 
:at [org.postgresql.core.v3.QueryExecutorImpl receiveErrorResponse "QueryExecutorImpl.java" 2676]}], 
:trace 
[[org.postgresql.core.v3.QueryExecutorImpl receiveErrorResponse "QueryExecutorImpl.java" 2676] 
[org.postgresql.core.v3.QueryExecutorImpl processResults "QueryExecutorImpl.java" 2366] 
[org.postgresql.core.v3.QueryExecutorImpl execute "QueryExecutorImpl.java" 356] 
[org.postgresql.jdbc.PgStatement executeInternal "PgStatement.java" 496] 
[org.postgresql.jdbc.PgStatement execute "PgStatement.java" 413] 
2023-04-04 18:50:47,808 ERROR middleware.log :: #033[31mGET /api/collection/tree 500 27.3 ms (7 DB calls) 
{:via 
[{:type org.postgresql.util.PSQLException, 
:message "ERROR: relation \"app\" does not exist\n  Position: 47", 
:at [org.postgresql.core.v3.QueryExecutorImpl receiveErrorResponse "QueryExecutorImpl.java" 2676]}], 
:trace 
[[org.postgresql.core.v3.QueryExecutorImpl receiveErrorResponse "QueryExecutorImpl.java" 2676] 
[org.postgresql.core.v3.QueryExecutorImpl processResults "QueryExecutorImpl.java" 2366] 
[org.postgresql.core.v3.QueryExecutorImpl execute "QueryExecutorImpl.java" 356] 
[org.postgresql.jdbc.PgStatement executeInternal "PgStatement.java" 496] 
[org.postgresql.jdbc.PgStatement execute "PgStatement.java" 413] 
[clojure.java.jdbc$db_query_with_resultset_STAR_ invokeStatic "jdbc.clj" 1113] 
[clojure.java.jdbc$db_query_with_resultset_STAR_ invoke "jdbc.clj" 1093] 
[clojure.java.jdbc$query invokeStatic "jdbc.clj" 1182] 
[clojure.java.jdbc$query invoke "jdbc.clj" 1144] 
[toucan.db$query invokeStatic "db.clj" 308] 
[toucan.db$query doInvoke "db.clj" 304] 
[clojure.lang.RestFn invoke "RestFn.java" 410] 
[toucan.db$simple_select invokeStatic "db.clj" 414] 
[toucan.db$simple_select invoke "db.clj" 403] 
[toucan.db$select invokeStatic "db.clj" 708] 
[toucan.db$select doInvoke "db.clj" 702]

My guess is that this must have happened at some point while going through all the upgrades from v0.43.0 up to 0.45.0. Fortunately, I just spoke to the guys and they have stopped testing the Metabase installation until they find the time to implement some changes on the customer websites. Which means I am at your disposal to test this as much as you want. I can even drop the database and start from scratch. :slight_smile:

Thanks.

you cannot go back between versions, so in order for this to work, you'll need to restore a backup of 45.3

by the way, you're doing SSL termination at NGINX AND at the Jetty level right?

Oh, I did restore a backup from version 0.45.3. The error seems to prevent the left-hand menu from loading. Anyway, it's a staging installation, so no serious harm done. From now on I'll be more careful when upgrading :wink:

And yes, I am using TLS on NGINX on the URL metabase.publicdomain.com and I am also using SSL on Metabase's Jetty:

# Jetty HTTP server settings
MB_JETTY_HOST=0.0.0.0
# MB_JETTY_PORT=8080
MB_JETTY_SSL=true
MB_JETTY_SSL_PORT=8080
MB_JETTY_SSL_KEYSTORE="/opt/metabase/keystore.jks"
MB_JETTY_SSL_KEYSTORE_PASSWORD="my-password"

# Other options passed to the JVM
JAVA_OPTS="-Xms2g -Xmx3g -Dhttps.proxyHost=proxy.internaldomain.com -Dhttps.proxyPort=8080 -Dhttp.proxyHost=proxy.internaldomain.com -Dhttp.proxyPort=8080 --add-to-start=http2"

P.S. Requesting HTTP/2 support or not did not make any difference either. And, anyway, NGINX does not support HTTP/2 when proxying, only HTTP/1.1.

So I got a little bit worried about this and just did some checks:
I created a key pair with

keytool -genkey -keyalg RSA -alias localhost -keystore selfsigned.jks -validity 365 -keysize 2048

every item was set to "localhost"

which gave me this

Enter keystore password:  
Keystore type: PKCS12
Keystore provider: SUN

Your keystore contains 1 entry

localhost, Apr 4, 2023, PrivateKeyEntry, 
Certificate fingerprint (SHA-256): EA:70:F7:79:CB:DF:7E:6D:13:BE:61:AB:E3:CF:10:E6:78:99:22:DF:29:64:9B:8B:AE:EF:E1:94:FE:3A:2E:5D

Then started Metabase normally with the env vars:
- "MB_JETTY_SSL=true"
- "MB_JETTY_SSL_PORT=8443"
- "MB_JETTY_SSL_KEYSTORE=/app/selfsigned2.jks"
- "MB_JETTY_SSL_KEYSTORE_PASSWORD=storepass"

and everything worked. I just left the github repo with the working demo here:

So in this case, is it possible that there's something weird between the cert and the hostname?

also, can you try passing the following in JAVA_OPTS: -Djetty.ssl.sniHostCheck=false

Good morning, Luiggi.

I'll give it a try as soon as I have a moment. I also slept on it, and here are some thoughts on where the problem might be:

  1. The configuration of NGINX (proxy_set_header Host $http_host;), which is passing metabase.publicdomain.com to Jetty.
  2. Some misconfiguration of the Keystore. It was created using the Java Keystore Ansible module. Nothing weird with the module, nothing wrong in the listing I posted in my opening post in this thread, but I am no expert with JKS.
  3. Adding some sort of host check skipping property, which you kindly figured out already.

As I said, I'll get back to you as soon as possible. Thank you very much for following up this thread.

Back again, @Luiggi!

I tried the following:

  1. Changing proxy_set_header Host from $http_host to $proxy_host, so that metabase1.internaldomain.com is passed as host to Metabase instead of metabase.publicdomain.com. It did not work. :frowning:
  2. Added -Djetty.ssl.sniHostCheck=false to JAVA_OPTS. It did not work either. Browsed the Internet and I am unsure why it did not work, to be honest. Tricky... :person_shrugging:

My intuition kept pointing at how NGINX passes the request upstream to the proxied server, i.e. Jetty, so I kept digging until I found the solution. In the server block in NGINX I added/modified these directives:

proxy_set_header Host "metabase1.internaldomain.com";
proxy_ssl_server_name on; # Default is 'off'

And it worked! :smile:

It's a pity I could not use NGINX's variable $proxy_host in the proxy_set_header Host directive, as it includes both the "name and port of a proxied server as specified in the proxy_pass directive". Fortunately, no need to explicitly set proxy_ssl_name as this one is set by default to the host part of $proxy_host, i.e. metabase1.internaldomain.com.
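For anyone who wants to test this without touching the proxy, here is a minimal diagnostic sketch in Python. It opens a TLS connection with a chosen SNI value, sends a chosen Host header, and returns the HTTP status line, so the two values can be varied independently. The function names and the "/" path are assumptions for illustration, and certificate verification is switched off on purpose because this is only a probe.

```python
import socket
import ssl


def build_request(host_header: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request carrying the given Host header."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host_header}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def probe(address: str, port: int, sni: str, host_header: str,
          timeout: float = 5.0) -> str:
    """Connect with the given SNI, send the given Host header, return the status line."""
    context = ssl.create_default_context()
    # Diagnostic only: skip certificate verification so the probe reaches the HTTP layer.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with socket.create_connection((address, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=sni) as tls:
            tls.sendall(build_request(host_header))
            with tls.makefile("rb") as reader:
                status_line = reader.readline()
    return status_line.decode("latin-1").strip()
```

Going by what this thread reports, calling probe("metabase1.internaldomain.com", 8080, sni="metabase1.internaldomain.com", host_header="metabase.publicdomain.com") against 0.46.0 should come back with a 400 status line, while host_header="metabase1.internaldomain.com" should not. I have not run it against a real instance.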

Fortunately, the server block is templated via Ansible, so I can just define those values easily upon deploy.

A final thought: the Jetty parameter jetty.ssl.sniHostCheck should have worked, but it didn't. If you are proficient in the world of Java, which I am not, it would be nice to check the code in Jetty's repo. Maybe there is a bug there. You know, a typo, a wrong condition, or similar.

Thanks a lot for your insightful feedback and your help, Luiggi. Very much appreciated. And I hope this thread helps anyone with the same setup as they upgrade from 0.45.X to 0.46.

thanks to you for all your effort and for being technically specific with your issue, this is clearly a thread worth reading for anyone who's using Metabase with a reverse proxy and SSL

I am facing the same problem but we use haproxy in front instead of nginx, any idea on how to accomplish its equivalent?

Hi, @djuarezg! I am sorry but I haven't used HAProxy in so many years! A quick search on Google brought this up, though.

Should you be able to find the solution, please post it here for future reference.

Still no idea on how to fix this issue but I found the culprit.

Metabase 0.46.0 uses ring-jetty9-adapter/jetty9.clj at master · sunng87/ring-jetty9-adapter · GitHub instead of ring/jetty.clj at master · ring-clojure/ring · GitHub as referenced on metabase/config.clj at 93c96217e54a9adf8da8c19123429d081092fdfb · metabase/metabase · GitHub

Basically, there is now an sni-host-check setting, enabled by default, that was not there before, and I do not know how to override it.

For anyone coming with the same problem but using HAProxy instead of NGINX, the approach is much the same as the one listed above as working for NGINX:

"http-request set-header Host internaldomainname" should be enough for it to work; I just tested it on my side. Gracias @jsabater! Your message really helped track this down.
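Putting the NGINX and HAProxy reports together, the behaviour looks consistent with a check along the following lines. This is a simplified model written in Python, not Jetty's or the adapter's actual code. The function and parameter names are hypothetical, and the assumed rule (the request's Host name is compared against the certificate Jetty serves) is inferred from this thread, not confirmed against the source.

```python
from typing import Optional


def sni_verdict(sni_host: Optional[str],
                cert_matches_host_header: bool,
                sni_required: bool = False,
                sni_host_check: bool = True) -> int:
    """Return the HTTP status a request would get under the assumed rule."""
    # Assumed rule 1: if SNI is required, a handshake without SNI is rejected.
    if sni_required and sni_host is None:
        return 400
    # Assumed rule 2: with the host check on, the Host header must be covered
    # by the certificate that the server presented.
    if sni_host_check and not cert_matches_host_header:
        return 400
    return 200


# Host: metabase.publicdomain.com is not covered by *.internaldomain.com.
print(sni_verdict("metabase1.internaldomain.com", cert_matches_host_header=False))
# Host rewritten to the internal name by the proxy.
print(sni_verdict("metabase1.internaldomain.com", cert_matches_host_header=True))
```

If the model is right, it would explain why rewriting the Host header at the proxy is enough: the proxy then presents a name the internal certificate covers, whatever public name the browser used.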

I installed Metabase v0.41.5 and configured SSL on it with port 8443. I can access the site no problem with https://hostname.mydomain.com:8443.
But we use vanity URLs on F5 in our network. The URL on F5 is different, with its own SSL config, and it points to the Metabase server as a pool member.
When I hit the externally published URL, I get HTTP ERROR 400 Invalid SNI.
We have another instance, used in production, that runs version v0.41.5, and it works just fine with the above setup.
Is there any way to overcome this issue? The only way it works is if we change the port in the F5 configuration to use the non-SSL port with the new version.

Check if you can do the header config stated above

I checked it, but there's no NGINX involved in our case.
It's a vanity URL hosted on F5, and it points to a pool with the actual server as a pool member.
Is there any way to set it up in the Metabase config?

not right now, is there any way you could set it up in F5?

Incidentally, what do you mean by a "vanity URL hosted on F5"? Do you mean F5 the company?

Anyway, generally speaking, when using SNI there is no other way but to set up the headers properly. The only solution I can think of for you is to find out why jetty.ssl.sniHostCheck=false does not work and somehow make it work (maybe a bug to fix in Jetty?).