Metabase lists inaccessible tables from Redshift in a database

Hello,

We are currently using metabase 0.44, self hosted. It connects to our Redshift cluster. The db user configured for the database is a programmatically configured user, which defines access to various tables and schemas.

The challenge we are meeting is that it loads tables and fields for inaccessible objects (tables and views). I have tried browsing through the metabase repository to understand exactly how metabase enumerates tables and fields, but I am unable to find it.

When I as the metabase db user issue i.e. this SQL:

select table_schema, table_name, column_name, data_type as type_name
from information_schema.columns
where table_schema not like 'pg_%' 
and table_schema not like 'information_schema'

I only see the correct tables and columns, which have been granted access to.
I am very curious to understand which tables metabase enumerates from, as in Redshift this seems to violate the granted access.

The main challenge this causes for us is that users are able to find tables which doesn't work, because we don't want to give access to it, and it also pollutes the available models / tables a lot.

{
  "browser-info": {
    "language": "en-GB",
    "platform": "MacIntel",
    "userAgent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36",
    "vendor": "Google Inc."
  },
  "system-info": {
    "file.encoding": "UTF-8",
    "java.runtime.name": "OpenJDK Runtime Environment",
    "java.runtime.version": "11.0.16+8",
    "java.vendor": "Eclipse Adoptium",
    "java.vendor.url": "https://adoptium.net/",
    "java.version": "11.0.16",
    "java.vm.name": "OpenJDK 64-Bit Server VM",
    "java.vm.version": "11.0.16+8",
    "os.name": "Linux",
    "os.version": "4.14.281-212.502.amzn2.x86_64",
    "user.language": "en",
    "user.timezone": "Europe/Berlin"
  },
  "metabase-info": {
    "databases": [
      "redshift",
      "h2"
    ],
    "hosting-env": "unknown",
    "application-database": "postgres",
    "application-database-details": {
      "database": {
        "name": "PostgreSQL",
        "version": "13.4"
      },
      "jdbc-driver": {
        "name": "PostgreSQL JDBC Driver",
        "version": "42.3.5"
      }
    },
    "run-mode": "prod",
    "version": {
      "date": "2022-08-04",
      "tag": "v0.44.0",
      "branch": "release-x.44.x",
      "hash": "d3700f5"
    },
    "settings": {
      "report-timezone": null
    }
  }
}

Best Regards,
Jørgen

1 Like

Hi @JorgenG
Perhaps the Redshift driver code is more helpful for you to read:
https://github.com/metabase/metabase/blob/master/modules/drivers/redshift/src/metabase/driver/redshift.clj
If Metabase has select privileges to a table select true from table where 1 <> 1 limit 0, then it will think the table should be included.

Thanks a lot for the reply, this helps a lot!

I'm still confused at this behavior from Redshift actually.

Issuing

select true from usable_schema.denied_table where 1 <> 1 limit 0

does not fail with permission denied.

However, doing:

select * from usable_schema.denied_table limit 100

does fail with permission denied.

Continuing the experimentation;

Queries that fail with permission denied:

select true from usable_schema.denied_table limit 1
select * from usable_schema.denied_table limit 1

Queries that are successful (with no data):

select true from usable_schema.denied_table where 1 <> 1 limit 0
select true from usable_schema.denied_table where 1 <> 1 limit 1
select * from usable_schema.denied_table limit 0

I'm gonna see if I can somehow figure out how to get that query to fail with permission denied.
If not, I suspect this is an issue with Redshift, and that possibly this behavior should be overridden.

Do you happen to know where this access check query is implemented, and whether it could be easily overridden?

I think I found what I needed now, and I also noticed that SQLite has an override for this. I'll prepare a PR for it.

Of interest though is that the original query works if the schema is inaccessible, however, if the schema is accessible, it appears that redshift optimises the query in such a way that permissions aren't evaluation.

Thanks a lot for the help here.

@JorgenG It's important to keep https://github.com/metabase/metabase/pull/13746 in mind, since there's performance reasons for not selecting data.

Thanks for that reference, I can totally understand that, but in this case, we end up showing hundreds of tables that is not queryable for our users. They also "appear" to be browsable and selectable.

I'll create an issue in Github to continue the conversation there, maybe there are different solutions to this problem.

@JorgenG You can just hide the table in Admin > Data Model.
You are the first experiencing this, so I'm guessing that it has to do with some rather specific privileges, so please make sure to include exact steps to reproduce.

It would be quite painful to manually clicking hide on hundreds of tables when we have automation in place. Also considering every time a new table is added which should not be accessible we would have to remember to do this. We'd want this to be reflected by actual permission control in the database.

I think what you mention there is a good hint. I will investigate if I am able to fix this from Redshift access control perspective as well. Maybe there is something we do oddly there.

Thanks a lot for all the help, at least we now understand what the cause of it is.