Hi, I'd like to use the metabase API to fetch the top 200 tables that are relevant to the current user. This would include tables used in cards that are most viewed/bookmarked by the user, and cards generally popular across the organization.
This is what I've tried:
- Using the search api for tables like so
/api/search?models=table&table_db_id={db_id}
to get tables ranked by a scoring function.
- The results don't seem to be ordered by popularity.
score.dashboard
is 0 for every table;score.recency
seems to reflect the table's schema update date, not the latest record insert date which is what I'd want. On large dbs, this just gives me some random 1000 tables instead of the popular ones.
- Use the search api for card like so
/api/search?models=card&table_db_id={db_id}&limit={limit}
and then parse the resulting SQLs for table names
- This gives pinned cards first which is good. However the later cards in the result have very low
score.dashboard
, even those used in multiple dashboards. The recency score seems correct afaict. The challenge is that I don't know how many tables I'm going to get after parsing the cards' queries (i'm also assuming all the cards are based on native queries, which might not be true generally)
- Use the
/api/card
api to get all cards, and then parse tables out of them to get top X tables.
- This just times out on large dbs
- Use the
/api/database/{dbId}?include=tables
api to get all tables in the database, and then filter manually for recency
- This does return all the tables in a reasonable amount of time. However the table are missing
latest_sync_timestamp.latest_record_timestamp
field which I would need to do the recency ranking (this field is present when I manually fetch the record for a particular table using/api/table/{tableId}
)
- Use the activity APIs
/api/recent_views
and/api/popular_items
- These give ~100 entities which are mostly dashboards. I'd have to manually query each dashboard to further get their cards and then parse those to get tables.
Approach #2 seems the most promising to me, but with it's own challenges (What's a goodlimit
? Is there a way to set a custom 'scoring' function/weight so that I can prioritize cards appearing in dashboards more over recency? What if it's not a native query?).
Would love any help on this!