Joining Data From Two Different Databases

Hello, everyone

There is an ongoing discussion in my workplace about gathering all our data in one place to avoid creatind data silos. Is it possible to pull the data from different silos direct to metabase?

I can give a more concrete example. Whenever I have to write a Query, I have to pick a data source:

To get all the Data in one place, we have to direct it from the different sources to a common database. This database is connected to Metabase and it is what we're using to make queries.

Is there a way to "load" all the different Databases into Metabase - and merge them. Let's say, is there a way to cut off the middle man? Or do we always have to "prepare" the data into a common database, from which metabase then retrieves it's data?

Thanks for the Clarification

Hi @celioxf
No, Metabase does not support queries across databases. You would need to setup a warehouse in SQL Server, BigQuery, Athena or any other supported database that supports that.
https://github.com/metabase/metabase/issues/3953 - upvote by clicking :+1: on the first post

1 Like

hi @flamber, does this mean I can't join tables from different DBS on metabase?

@Gloria Correct. You cannot makes queries across different database connections. It would need a warehouse to be able to combine the data.

Joining across database is horribly inefficient unless you're doing a lot of caching in some sort of proxy.
It's slightly better if you do something like 'linked databases' in MS SQL (it can be done in others too), but that just reduces the traffic back to the BI tool.
There are plenty of ETL tools (or ELT if you prefer) around. I use both MS SSIS and Talend. Other advantage of using a separate database for your BI is that you no longer need to worry about impacting application performance. Plus, you can do more with indexes to improve query performance.

hmm.. this is very useful. For now, I think we would go the route of a DW

@flamber, can DW be created and linked to metabase?

@Gloria Yes, a warehouse like BigQuery can be queried across datasets, since they are all in the same database connection.
So if you need to create more than a single database connection in Metabase Admin > Databases, then you cannot make queries across those connections.
Many uses separate schemas in Postgres or SQL Server to place their datasets in each their schema.

@flamber can Bigquery work for postgressql?

@Gloria I think you are misunderstanding. BigQuery is a database, just like Postgres is a database.
You need to split your datasets into schemas on Postgres, so Metabase can have one connection to the database.