Multiple databases in one question without a data warehouse- YES YOU CAN!


#1

I’ve been meaning to test this idea for a while, finally found the time today. If there’s enough interest and people struggle with this, I’ll do a proper ‘How To’.

First the bad news (for some of you). You need Windows. Get over it! :stuck_out_tongue_winking_eye:

I’m not including details of any security settings to keep stuff clearer.

Install SQL Server Express (2016 or 2017). This is the free edition, only limitation is the database size of 10GB. As we’re not storing any data, that’s not a problem.
You’ll need SQL Server Management Studio too.
In SQL Server Configuration Manager, set the Named Pipes and TCP/IP to enabled:
image
This is just to allow remote connections. You need to do this even if Metabase is on the same server.

Next create an ODBC connection for each database you want to use. I used a sample from Postgresql and another from MySQL.

Create a database in SQL Server. I called mine Dashboard Data.

Now for the magic. :mage:

In SSMS (SQL Server Management Studio), open ‘Server Objects’ and select ‘Linked Servers’:
image

Right Click Linked Servers and select ‘New Linked Server’:
image

This isn’t as bad as it looks.
Top box, Linked Server, is just the name you’re going to be using to refer to this server from within SQL. Best to name it the same as your ODBC connection.
Provider, change the drop down to ‘Microsoft OLE DB Provider for ODBC Drivers’
Data source is the name of your ODBC driver.
Leave the rest blank.

Next select Security:

Keeping it simple, just change the bottom option to ‘Be made using this security context’. The Remote Login is just the username you use to login to the other database. You’ll need to fill in your password too.
Click OK. Hopefully no error messages.

If it’s working correctly, you should see the linked server in the tree view and the databases within them. These are mine:
image

Although you’re able to see the tables, you won’t be able to see column names. This is just one of those oddities of the security around database schemas. Doesn’t matter.

Next we need to add these tables to our DashboardData database. We’re not copying data, just creating views.
My Postgresql databse has a table called actor in the publc schema. To create a view in DashboardData:

USE dashboardData
GO
create view pg_actor2 as
select * from openquery(postgresql, 'SELECT * FROM public.actor')

Openquery just tells SQL that we’re using a linked database. The first parameter is the name of the linked server, the seconds is the SQL.
I prefixed the name of the view with pg_ to help identify the source database. Strictly speaking, this would be better in a different schema, but I’m keeping things simple.

Repeat for each table you’d like to see in your dashboard database. If someone clever could write a script to do this for all tables, please share.

You can now use the views in DashboardData using normal SQL.
select * from pg_actor2

Now just add the new database to Metabase in the normal way.

Limitations
SQL Server is pretty clever about how it retrieves the data and joins it together on the server, but performance may suffer with large datasets where you join across different linked servers. That’s no different to doing this in a query with multiple sources, just something to be aware of.


Best way to overcome lack of joins
WIKI: query issues (SQL/others)
More SQL - how to create a view when you don't have admin rights