Help or guides to designing a BI data model?

I’m looking for information around designing a data model. I’ve implemented Metabase and progressed through the typical steps of a growing business with a growing operational team. I have stake holders using ad-hoc queries that I have written to get data, some even creating their own queries. The questions have started to change from “can you make this query” to “how do I find this data”, great progress.

Now I would like to start working on a plan to create a denormalized data model. I plan to start with the same data currently accessible, data about our application. If it makes sense, and I can find the time, I will start looking into ingesting data from other services (Salesforce, Recurly, HubSpot etc.).

I was hoping that someone could point me towards info/guides/articles that will help me to start defining a flat data model for BI use. If you have any info you would like to share, I’d greatly appreciate it. Thank!

1 Like

This is indeed not so straightforward and it takes a long time for a schema to mature and tune.

Segment.com is a service that does such normalising for you, so you could look at the schema’s that Segment.com exposes for it’s various sources, for instance https://segment.com/docs/sources/cloud-apps/zendesk/#collections

For “event” style datasets, have a look at the tracking specs: https://segment.com/docs/spec/semantic/

Or, just get a demo account and connect those services and see what comes out.

HTH (more suggestions welcome)

Agreed this is often not simple. Panoply has published a decent overview that might be helpful. https://panoply.io/data-warehouse-guide/data-warehouse-concepts-traditional-vs-cloud/

(I don’t work for or use Panoply, I just stumbled across this content.)