GitHub, Deep Copy, Cloning Dashboards and Underlying Tables

We built a great report for one of our clients that leverages an aggregated data table in SQL Server. Before the concept of cascading filters arrived in v37, we had to create separate aggregate tables per client, add them to Metabase, and then rebuild all of the questions from a master example dashboard. I thought there must be an easier way. After looking in the discussion forum, I found a Python-based GitHub solution that allows me to deep copy dashboards and questions: https://github.com/vvaezian/metabase_api_python. Well, I'm not super technical. I had one of our technical people help me set up my environment, and I was able to clone and deep copy a dashboard. This successfully created a new dashboard and independent copies of the questions on that dashboard. The problem is that because the data connection is different, there doesn't seem to be an easy way to update the table underlying each question without recreating the question from scratch.

I think a lot of the heavy lifting is solved by the GitHub solution (thank you, vvaezian), but I needed to go into the underlying tables to update specific information, and this is where I got a little confused. I would welcome any help or direction on how best to update these underlying tables.

I then needed to identify the following:

  1. Database ID for both the original cloned database and the new connection.
    a. Cloned database_id = 1; the new connection, which we will call database_id = 2.
  2. Cloned table
    a. Cloned table_id = 1; the new table, table_id = 2.
  3. Cloned collection
    a. New cloned collection_id = 2.

With this information, I looked up records in the report_card table so that I could update some key fields.

  1. I updated the dataset_query field, doing a search and replace of database":1 with database":2.
  2. I updated the dataset_query field, doing a search and replace of source-table":1 with source-table":2.
  3. I copied the value of the dataset_query field into a text editor and isolated all of the field-id values, since the existing values relate to the original data connection. I used the id of each original value to look up its text name, and then used the text name to look up the new field-id for the same name on the new connection. Once this was complete, I was able to do a search and replace on report_card.dataset_query of the old field number field-id = 1 (from database_id = 1) with the new field number field-id = 2 (from database_id = 2). (A sketch automating this appears after the list.)
  4. Updated report_card.database_id = 2.
  5. Updated report_card.table_id = 2.
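For what it's worth, here is the kind of script I ended up wanting for steps 1-5. This is only a rough sketch: it assumes direct psycopg2 access to the Metabase application database, that dataset_query is stored as text, and that you have already built the old-to-new field-id map from step 3. The ids and connection string are the placeholder values from the list above.

import json
import psycopg2  # assumes direct access to the Metabase application database

OLD_DB, NEW_DB = 1, 2          # database ids from step 1
OLD_TABLE, NEW_TABLE = 1, 2    # table ids from step 2
FIELD_MAP = {1: 2}             # old field-id -> new field-id, from the name lookup in step 3

def remap(node):
    # Recursively rewrite ["field-id", n] references inside a dataset_query.
    if isinstance(node, list):
        if len(node) == 2 and node[0] == "field-id" and isinstance(node[1], int):
            return ["field-id", FIELD_MAP.get(node[1], node[1])]
        return [remap(x) for x in node]
    if isinstance(node, dict):
        return {k: remap(v) for k, v in node.items()}
    return node

conn = psycopg2.connect("dbname=metabase user=metabase")  # placeholder DSN
cur = conn.cursor()
cur.execute("SELECT id, dataset_query FROM report_card WHERE database_id = %s", (OLD_DB,))
for card_id, raw in cur.fetchall():
    q = json.loads(raw)                         # dataset_query is stored as text
    q["database"] = NEW_DB                      # step 1
    if q.get("query", {}).get("source-table") == OLD_TABLE:
        q["query"]["source-table"] = NEW_TABLE  # step 2
    q = remap(q)                                # step 3
    cur.execute(
        "UPDATE report_card SET dataset_query = %s, database_id = %s, table_id = %s WHERE id = %s",
        (json.dumps(q), NEW_DB, NEW_TABLE, card_id),  # steps 4-5
    )
conn.commit()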

This seemed to work well, and I was able to open the dashboard and see the new questions properly formatted. There was still a little bit of wonkiness, but I was working through those details when I decided to look for other tables that might hold information that needed to be updated with legacy database, table or field values.

That is when I found the column report_dashboardcard.parameter_mappings.

It looks like this column holds values such as parameter_id, field-id and card_id that most likely reference the legacy data connection. I think field-id is relatively self-explanatory and just needs the same replace/update logic I used on the dataset_query field. I'm a little stumped on what parameter_id is. I thought at first it might be related to the dashboard filters, so I tried removing the filters and saving the dashboard to see if the underlying table dropped the parameter ids, but it does not. I also know that the parameter ids are shared between the original dashboard and the new cloned dashboard, and my guess is that this will cause a problem down the road.

My overall question is: what tables do I need to update simply to have new questions in a new dashboard pointed at a new connection? This assumes that I've added the connection and need to update the underlying tables, as I do not believe the Python script I have will update that information. If I'm missing something, please let me know. Thank you for any advice.

Pinging @King_Edward (GitHub vvaezian)


Hi @Slickrock22,

As I understand it, you have been able to deepcopy the dashboard, and now you have trouble doing something with the duplicated dashboard.
If you provide an example of the thing you want to do, it would make things easier.
For example you said

I needed to go into the underlying tables to be able to update specific information

Give an example for this.

P.S. Dashboards are composed of cards (AKA questions). Cards are based on database tables.

Thanks for the response, @King_Edward. When you deep copy via the Python API, it retains all the same database number and field number values. I want to change the database that the copied cards point to. That means I must also update the field numbers in certain tables. I think I know most of the columns that need to be updated, but I'm a little stumped on what the column report_dashboardcard.parameter_mappings is used for. I find field-id values in that column that most likely will need to be updated so they match the revised field numbers of the new database connection.

When replacing the database that a card points to, I'm not sure what data other than the database id needs to be changed as well. This can be tested with a minimal example.

As I understand it, you have a blueprint for a set of cards and a dashboard for those cards, and you want to be able to change the database that these cards use.
Instead of changing the database id of the cards, a probably better approach is to create new cards with the new database id that you want to use (using the create_card function) and then replace the cards in the duplicated dashboard with the newly created cards. All of this can, of course, be automated.
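For illustration, something along these lines. This is only a sketch: the create_card keywords, especially custom_json, are assumptions that may differ across package versions, and the names and ids are placeholders.

from metabase_api import Metabase_API  # pip install metabase-api

mb = Metabase_API('https://my.metabase.url', 'user@example.com', 'password')

# Recreate the card against the new database/table instead of editing the old one.
# custom_json mirrors the original card's dataset_query, pointed at the new ids.
mb.create_card(
    card_name='Response Rate (new client)',
    collection_id=2,                      # the duplicated dashboard's collection
    custom_json={
        'name': 'Response Rate (new client)',
        'display': 'table',
        'visualization_settings': {},
        'dataset_query': {
            'type': 'query',
            'database': 2,                # new database_id
            'query': {'source-table': 2}  # new table_id
        },
    },
)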

@King_Edward Thanks! My understanding of Metabase is that database fields (columns) get assigned field-ids that are unique for each database/field combination. That would mean you need to update both the database_id and the field-ids (which I did, and it works). I will investigate create_card. Thanks for the suggestion.

Dear @Slickrock22 / @King_Edward, I was looking at this and have followed through the code for create_card. Great API wrapper, and thank you @Slickrock22 for your work on this so far. I wondered, @Slickrock22, did you get this working?

I need to copy entire dashboards and cards to multiple schemas so we can roll out across different identical schemas.

I'll try to implement this feature in the coming weeks, as there seem to be enough requests for it, and I've also found use cases for it in my own work.

Fantastic, thank you @King_Edward, I look forward to hearing on it. If I can be of assistance, please let me know.

I will share this in the spirit of trying to give back, as long as no one teases me for how bad my approach is :laughing:

This is how I clone a dashboard and change the database connection and field values (plus a failed attempt at updating the parameters). Feel free to ask any questions, propose a better way, or rip it to shreds. No pride of ownership. Hope this helps someone.

  1. Update the https://github.com/vvaezian/metabase_api_python package to the latest revision to ensure we do not run into a Metabase version problem.
  2. Go to https://[MyMetabaseURL]/ and create a new collection. Take note of the collection ID at the end of the URL once created. (This doesn't work exactly as desired, as it creates a new collection/folder inside the collection.)
  3. Add a new database connection in Metabase. Look at previous connections for the inputs (Name should be XXX and Username should be XXX). Turn on "This is a large database...".
  4. Click Next, set Scan to Daily @ Midnight and Regularly on a Schedule @ Midnight, then click Save.
  5. Go back, click on the new connection, and take note of the number in the URL. That is your new database_id.
  6. Click on Data Model, select the new connection XXX, then select the table XXX_SurveyResults and look at the URL. The last number is your new table_id.
  7. Run the Metabase-API deep copy in Eclipse, updating the source_dashboard_id and destination_collection_id: Damb.copy_dashboard(source_dashboard_id=XX, destination_collection_id=XX, deepcopy=True). This process currently uses dashboard XX; if we need to change that, we will need to update the replace queries below with the correct numeric values, since Metabase abstracts them. (A standalone sketch of this call follows the generator query below.)
  8. Open pgAdmin and connect to the production Postgres Metabase database (Databases -> ebdb -> Schemas -> public -> Tables) to run the queries. The queries need to be modified first; this part is no fun.
    a. Don't update fields from table 15 when you do the search-and-replace updates below, as those are benchmarks and are not updated to use the new database connection.
    b. Update these values so you have a reference when you edit the queries below. The 'from' ID (9) stays the same as long as you clone dashboard_id 38; the 'to' values are what we collected above.
    c. database_id: 9 -> 14 (update 14 to whatever the new database_id should be)
    d. table_id: 16 -> 73 (update 73 to whatever the new table_id should be)
  9. Run the generator query below. It will produce more update statements than are needed for the cards, but it is much easier than the manual file-modification approach I used before. You will need to update the Collection_id, table_id and database_id values.

Select CONCAT('UPDATE report_card SET dataset_query = Replace(dataset_query,''["field-id",',t1.id,']'',''["field-id",',t2.id,']'') Where Collection_id = 30 and table_id = 70 and database_id = 11;')
From
(Select id,name from public.metabase_field WHERE table_id = 70) t1
Inner Join
(Select id,name from public.metabase_field WHERE table_id = 78) t2
ON (t1.name=t2.name)
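For reference, step 7's deep-copy call looks like this as a standalone script. The URL and credentials are placeholders, and the ids are the example values from the steps above.

from metabase_api import Metabase_API

mb = Metabase_API('https://[MyMetabaseURL]', 'user@example.com', 'password')

# deepcopy=True duplicates the cards as well, not just the dashboard layout.
mb.copy_dashboard(source_dashboard_id=38, destination_collection_id=27, deepcopy=True)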

  1. Conversion keys:
    • "database":9 -> "database":14
    • "source-table":16 -> "source-table":73
    • dashboard_id = 38 -> dashboard_id = 50
    • First report_card.id value for the new cards = 434
  2. Update the database number:
    UPDATE report_card SET dataset_query = Replace(dataset_query,'"database":9','"database":14') Where ID >= 434 and Collection_id = 27 and table_id = 16 and database_id = 9;
    UPDATE report_card SET dataset_query = Replace(dataset_query,'"source-table":16','"source-table":73') Where ID >= 434 and Collection_id = 27 and table_id = 16 and database_id = 9;

  3. Update the field-id numbers:
    UPDATE report_card SET dataset_query = Replace(dataset_query,'["field-id",494]','["field-id",1888]') Where ID >= 434 and Collection_id = 27 and table_id = 16 and database_id = 9;
    UPDATE report_card SET dataset_query = Replace(dataset_query,'["field-id",492]','["field-id",1886]') Where ID >= 434 and Collection_id = 27 and table_id = 16 and database_id = 9;
    UPDATE report_card SET dataset_query = Replace(dataset_query,'["field-id",491]','["field-id",1885]') Where ID >= 434 and Collection_id = 27 and table_id = 16 and database_id = 9;
    UPDATE report_card SET dataset_query = Replace(dataset_query,'["field-id",495]','["field-id",1889]') Where ID >= 434 and Collection_id = 27 and table_id = 16 and database_id = 9;
    UPDATE report_card SET dataset_query = Replace(dataset_query,'["field-id",485]','["field-id",1879]') Where ID >= 434 and Collection_id = 27 and table_id = 16 and database_id = 9;
    UPDATE report_card SET dataset_query = Replace(dataset_query,'["field-id",515]','["field-id",1909]') Where ID >= 434 and Collection_id = 27 and table_id = 16 and database_id = 9;
    UPDATE report_card SET dataset_query = Replace(dataset_query,'["field-id",519]','["field-id",1914]') Where ID >= 434 and Collection_id = 27 and table_id = 16 and database_id = 9;
    UPDATE report_card SET dataset_query = Replace(dataset_query,'["field-id",482]','["field-id",1876]') Where ID >= 434 and Collection_id = 27 and table_id = 16 and database_id = 9;
    UPDATE report_card SET dataset_query = Replace(dataset_query,'["field-id",496]','["field-id",1890]') Where ID >= 434 and Collection_id = 27 and table_id = 16 and database_id = 9;
    UPDATE report_card SET dataset_query = Replace(dataset_query,'["field-id",499]','["field-id",1893]') Where ID >= 434 and Collection_id = 27 and table_id = 16 and database_id = 9;
    UPDATE report_card SET dataset_query = Replace(dataset_query,'["field-id",518]','["field-id",1913]') Where ID >= 434 and Collection_id = 27 and table_id = 16 and database_id = 9;
    UPDATE report_card SET dataset_query = Replace(dataset_query,'["field-id",507]','["field-id",1901]') Where ID >= 434 and Collection_id = 27 and table_id = 16 and database_id = 9;
    UPDATE report_card SET dataset_query = Replace(dataset_query,'["field-id",511]','["field-id",1905]') Where ID >= 434 and Collection_id = 27 and table_id = 16 and database_id = 9;
    UPDATE report_card SET dataset_query = Replace(dataset_query,'["field-id",522]','["field-id",1916]') Where ID >= 434 and Collection_id = 27 and table_id = 16 and database_id = 9;
    UPDATE report_card SET dataset_query = Replace(dataset_query,'["field-id",526]','["field-id",1920]') Where ID >= 434 and Collection_id = 27 and table_id = 16 and database_id = 9;

  4. Update the database_id column:

UPDATE report_card SET database_id = 14 Where ID >= 434 and Collection_id = 27 and table_id = 16 and database_id = 9;

  5. Update the table_id column:
    UPDATE report_card SET table_id = 73 Where ID >= 434 and Collection_id = 27 and table_id = 16 and database_id = 14;

STOP - Under Construction Beneath Here


This was an attempt to update the filters on the dashboard. It looks like the associated card_id updates, but the parameter_id stays the same on clone, and the fields still need to be manually updated. For now, manually delete the filters and add them back.

Select t1.name, t1.card_id,t1.parameter_mappings, t2.name,t2.card_id, t2.parameter_mappings from
(Select report_card.name,report_dashboardcard.card_id, report_dashboardcard.parameter_mappings from public.report_dashboardcard
INNER JOIN report_card ON report_card.id = report_dashboardcard.card_id
Where dashboard_id = 38) t1
Left Join
(Select report_card.name,report_dashboardcard.card_id, report_dashboardcard.parameter_mappings from public.report_dashboardcard
INNER JOIN report_card ON report_card.id = report_dashboardcard.card_id
Where dashboard_id = 46) t2
ON (t1.name=t2.name)

UPDATE report_dashboardcard SET parameter_mappings = Replace(parameter_mappings,'["field-id",327]','["field-id",492]') Where dashboard_id = 46;
UPDATE report_dashboardcard SET parameter_mappings = Replace(parameter_mappings,'["field-id",355]','["field-id",518]') Where dashboard_id = 46;
UPDATE report_dashboardcard SET parameter_mappings = Replace(parameter_mappings,'["field-id",329]','["field-id",495]') Where dashboard_id = 46;
UPDATE report_dashboardcard SET parameter_mappings = Replace(parameter_mappings,'["field-id",356]','["field-id",522]') Where dashboard_id = 46;
UPDATE report_dashboardcard SET parameter_mappings = Replace(parameter_mappings,'["field-id",353]','["field-id",519]') Where dashboard_id = 46;
UPDATE report_dashboardcard SET parameter_mappings = Replace(parameter_mappings,'["field-id",357]','["field-id",482]') Where dashboard_id = 46;
UPDATE report_dashboardcard SET parameter_mappings = Replace(parameter_mappings,'["field-id",325]','["field-id",496]') Where dashboard_id = 46;
UPDATE report_dashboardcard SET parameter_mappings = Replace(parameter_mappings,'["field-id",317]','["field-id",515]') Where dashboard_id = 46;
UPDATE report_dashboardcard SET parameter_mappings = Replace(parameter_mappings,'["field-id",337]','["field-id",494]') Where dashboard_id = 46;
UPDATE report_dashboardcard SET parameter_mappings = Replace(parameter_mappings,'["field-id",330]','["field-id",507]') Where dashboard_id = 46;
UPDATE report_dashboardcard SET parameter_mappings = Replace(parameter_mappings,'["field-id",350]','["field-id",511]') Where dashboard_id = 46;
UPDATE report_dashboardcard SET parameter_mappings = Replace(parameter_mappings,'["field-id",354]','["field-id",485]') Where dashboard_id = 46;
UPDATE report_dashboardcard SET parameter_mappings = Replace(parameter_mappings,'["field-id",360]','["field-id",526]') Where dashboard_id = 46;
UPDATE report_dashboardcard SET parameter_mappings = Replace(parameter_mappings,'["field-id",336]','["field-id",499]') Where dashboard_id = 46;
UPDATE report_dashboardcard SET parameter_mappings = Replace(parameter_mappings,'["field-id",351]','["field-id",512]') Where dashboard_id = 46;
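If it's useful, this is the direction I would try next for the parameter ids: give the cloned dashboard's filters fresh ids and point the card mappings at them. This is only a sketch built on assumptions: that dashboard filters live as a JSON array in report_dashboard.parameters, that parameter_mappings entries carry a parameter_id key, and that a short random hex string is an acceptable id (Metabase appears to generate ids of that shape, but I have not verified it). Dashboard 46 and the DSN are placeholders.

import json
import uuid
import psycopg2

NEW_DASHBOARD_ID = 46  # the cloned dashboard from the statements above

conn = psycopg2.connect("dbname=ebdb user=metabase")  # placeholder DSN
cur = conn.cursor()

# Give each filter on the cloned dashboard a fresh id.
cur.execute("SELECT parameters FROM report_dashboard WHERE id = %s", (NEW_DASHBOARD_ID,))
params = json.loads(cur.fetchone()[0] or "[]")
id_map = {}
for p in params:
    new_id = uuid.uuid4().hex[:8]  # assumption: short hex ids are acceptable
    id_map[p["id"]] = new_id
    p["id"] = new_id
cur.execute("UPDATE report_dashboard SET parameters = %s WHERE id = %s",
            (json.dumps(params), NEW_DASHBOARD_ID))

# Point each card's parameter_mappings at the fresh ids.
cur.execute("SELECT id, parameter_mappings FROM report_dashboardcard WHERE dashboard_id = %s",
            (NEW_DASHBOARD_ID,))
for row_id, pm in cur.fetchall():
    mappings = json.loads(pm or "[]")
    for m in mappings:
        m["parameter_id"] = id_map.get(m["parameter_id"], m["parameter_id"])
    cur.execute("UPDATE report_dashboardcard SET parameter_mappings = %s WHERE id = %s",
                (json.dumps(mappings), row_id))
conn.commit()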


@King_Edward Do you think there is a way to clone the filters on the dashboard as well? It looks like the filters are cloned in report_dashboardcard for the cards, but they use the same parameter_id as the dashboard they were cloned from, which leads me to think that when embedded, there will be confusion between the filters from the previously cloned report and the new one. I have been deleting the filters and recreating them manually on the cloned dashboard, which is a pain.

It should be possible. We'll see.

@King_Edward Any rough ideas when you might add dashboard filters to your cloning process?

Thanks!

I should have said months instead of weeks, as I have been busy with other stuff.
I've created two tickets for this task in the repo.

Any Postgres gurus who can help me clean this up? I am not sure how to write the syntax in Postgres. The syntax, once complete, will allow users to change a data source on a dashboard. Filters still will not work and will need to be deleted on the dashboard and recreated from scratch.

--I need to declare the variables that will be used to update the new collection, dashboard and cards
do $$
declare
_source_database_id integer := 17;
_source_table_id integer := 78;
_source_collection_id integer := 34;
_destination_database_id integer := 19;
_destination_table_id integer := 80;
_destination_collection_id integer := 35;

Begin -- Just to see if I could get them to display.

raise notice '% % % % %', _source_database_id, _source_table_id, _destination_collection_id, _destination_database_id, _destination_table_id;

end $$;

--I wasn't sure how to add the newly declared variables to my queries, so I just used placeholders written as ||declared_variable_name||. I need some help figuring out how to properly splice in the variables.


UPDATE report_card SET dataset_query = Replace(dataset_query,'"database":||_source_database_id||','"database":||_destination_database_id||') Where Collection_id = ||_destination_collection_id|| and table_id = ||_source_table_id|| and database_id = ||_source_database_id||;
UPDATE report_card SET dataset_query = Replace(dataset_query,'"source-table":||_source_table_id||','"source-table":||_destination_table_id||') Where Collection_id = ||_destination_collection_id|| and table_id = ||_source_table_id|| and database_id = ||_source_database_id||;

-- This query creates the update statements that I normally copy and run manually. Somehow I need a way to run this query and then execute its results, as right now the query is only a select that generates the update statements. Way above my pay grade.

Select CONCAT('UPDATE report_card SET dataset_query = Replace(dataset_query,''["field-id",',t1.id,']'',''["field-id",',t2.id,']'')
Where Collection_id = ||_destination_collection_id|| and table_id = ||_source_table_id|| and database_id = ||_source_database_id||;')
From
(Select id,name from public.metabase_field WHERE table_id = ||_source_table_id||) t1
INNER Join
(Select id,name from public.metabase_field WHERE table_id = ||_destination_table_id||) t2
ON (t1.name=t2.name)

--This is the same simple replace as the first block.
UPDATE report_card SET database_id = ||_destination_database_id|| Where Collection_id = ||_destination_collection_id|| and table_id = ||_source_table_id|| and database_id = ||_source_database_id||;
UPDATE report_card SET table_id = ||_destination_table_id|| Where Collection_id = ||_destination_collection_id|| and table_id = ||_source_table_id|| and database_id = ||_destination_database_id||;
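Alternatively, the variable splicing could be sidestepped by doing it in Python. Here is a rough, untested sketch with psycopg2 that builds the field-id pairs by name (as the CONCAT query above does) and runs the replacements directly; it only covers the field-id portion, the ids mirror the declared variables above, and the DSN is a placeholder.

import psycopg2

SRC_DB, SRC_TABLE = 17, 78   # _source_database_id / _source_table_id
DST_TABLE = 80               # _destination_table_id
DST_COLLECTION = 35          # _destination_collection_id

conn = psycopg2.connect("dbname=ebdb user=metabase")  # placeholder DSN
cur = conn.cursor()

# Pair old and new field ids by column name, as the CONCAT query above does.
cur.execute("""
    SELECT t1.id, t2.id
    FROM metabase_field t1
    JOIN metabase_field t2 ON t1.name = t2.name
    WHERE t1.table_id = %s AND t2.table_id = %s
""", (SRC_TABLE, DST_TABLE))

# Run each replacement directly instead of copy/pasting generated statements.
for old_id, new_id in cur.fetchall():
    cur.execute(
        """UPDATE report_card
           SET dataset_query = replace(dataset_query, %s, %s)
           WHERE collection_id = %s AND table_id = %s AND database_id = %s""",
        ('["field-id",%d]' % old_id, '["field-id",%d]' % new_id,
         DST_COLLECTION, SRC_TABLE, SRC_DB),
    )
conn.commit()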

Any help is greatly appreciated!

@King_Edward Just wanted to check back in and see if you are still able to modify your cloning process to duplicate filters and allow for data source modifications as well?

Also, I am usually nervous when I upgrade to the newest version of Metabase out of fear that the cloning process will not work perfectly with the upgraded version. Is this a reasonable fear? Have you experienced previous Metabase version upgrades creating problems with your awesome cloning script?

Thanks!

I made some progress on automating the changing of filter sources. I'll try to push it in the new release this weekend.

Regarding your second question, there is always a chance that the Metabase team makes changes to their API that are not backward-compatible (see here). But it is rare. And if it happens, I can make the required changes to my package to fix it.


See the clone_card function.

This is a cross-post from an attempt to get this answered as a standalone question, but maybe it is better here.

We are currently using the https://github.com/vvaezian/metabase_api_python deep-cloning process to copy an existing dashboard with filters to a new dashboard with the same filters. Once this is complete, we change the underlying data source to a new data source using some custom Postgres SQL. What we are finding is that report_dashboardcard.parameter_mappings has duplicate values in the parameter mappings column for the same filter across different dashboards and different collections. I just wondered if anybody had any idea whether this is going to cause a problem, and whether we need to unpublish the report for embedding, delete the filters, and then republish the report so that it gets a new parameter_id for the new filters. Thanks!

@King_Edward

Loving the functionality of this, but I ran into an issue using the clone_card function today.

filters_data = card_info['dataset_query']['native']['template-tags']

KeyError: 'native'

I pulled the data for the card_info variable down today and realised there is no longer a 'native' or 'template-tags' key.

{'description': None, 'archived': False, 'collection_position': None, 'table_id': 83,

'result_metadata': [{'semantic_type': 'type/Category', 'coercion_strategy': None, 'name': 'BusinessUnitCity', 'field_ref':
['field', 2300, None], 'effective_type': 'type/Text', 'id': 2300, 'display_name': 'BusinessUnitCity', 'fingerprint': {'global': {'distinct-count': 9, 'nil%': 0.0}, 'type': {'type/Text':
{'percent-json': 0.0, 'percent-url': 0.0, 'percent-email': 0.0, 'percent-state': 0.0, 'average-length': 9.4411}}}, 'base_type': 'type/Text'}, {'name': 'expression', 'display_name': 'ResponseRate', 'base_type':
'type/Float', 'effective_type': 'type/Float', 'field_ref': ['aggregation', 0], 'semantic_type': None, 'fingerprint': {'global': {'distinct-count': 15, 'nil%': 0.0}, 'type': {'type/Number':
{'min': 0.17477203647416414, 'q1': 0.20321322790141572, 'q3': 0.23475829725829725, 'max': 0.28015075376884424, 'sd': 0.027557622730681446, 'avg': 0.21932606564101043}}}},
{'name': 'count', 'display_name': 'Count', 'base_type': 'type/Integer', 'effective_type': 'type/Integer', 'semantic_type': 'type/Quantity', 'field_ref': ['aggregation', 1], 'fingerprint':
{'global': {'distinct-count': 15, 'nil%': 0.0}, 'type': {'type/Number': {'min': 286.0, 'q1': 514.5, 'q3': 789.75, 'max': 1410.0, 'sd': 276.51904265779046, 'avg': 673.7333333333333}}}}],

'can_write': True, 'database_id': 22, 'enable_embedding': False, 'collection_id': 44, 'query_type': 'query', 'name': 'Response Rate Location', 'last_query_start': '2021-12-03T17:18:57.895632Z',
'dashboard_count': 1, 'average_query_time': 190.86131386861314, 'creator_id': 1, 'moderation_reviews': [], 'updated_at': '2021-12-03T17:18:57.969918Z', 'made_public_by_id': None, 'embedding_params': None, 'cache_ttl': None,

'dataset_query': {'type': 'query', 'query': {'source-table': 83, 'aggregation': [['aggregation-options', ['/', ['sum', ['field', 2296, None]], ['sum', ['field', 2298, None]]], {'display-name': 'ResponseRate'}], ['count']],
'filter': ['and', ['=', ['field', 2325, None], 'Summary'], ['time-interval', ['field', 2292, None], -12, 'month', {'include-current': True}]], 'breakout': [['field', 2300, None]]}, 'database': 22},

'id': 884, 'display': 'combo',

'visualization_settings': {'gauge.segments': [{'min': 0, 'max': 0.075, 'color': '#ED6E6E', 'label': ''}, {'min': 0.075, 'max': 0.15, 'color': '#F9CF48', 'label': ''}, {'min': 0.15, 'max': 1, 'color': '#84BB4C', 'label': ''}],
'series_settings': {'EDGEWATER': {'line.interpolate': 'cardinal', 'line.marker_enabled': True}, 'NEWARK': {'line.interpolate': 'cardinal', 'line.marker_enabled': True}, 'QUEEN ANNE': {'line.interpolate': 'cardinal', 'line.marker_enabled': True},
'HUNT VALLEY': {'line.interpolate': 'cardinal', 'line.marker_enabled': True}, 'POCOMOKE CITY': {'line.interpolate': 'cardinal', 'line.marker_enabled': True}, 'CECILTON':
{'line.interpolate': 'cardinal', 'line.marker_enabled': True}, 'WESTMINSTER': {'line.marker_enabled': True, 'line.interpolate': 'cardinal'}, 'CLAYTON': {'line.interpolate': 'cardinal', 'line.marker_enabled': True},
'SALISBURY': {'line.interpolate': 'cardinal', 'line.marker_enabled': True}, 'OXFORD': {'line.interpolate': 'cardinal', 'line.marker_enabled': True}, 'CHESTERTOWN': {'line.interpolate': 'cardinal', 'line.marker_enabled': True},
'E NEW MARKET': {'line.interpolate': 'cardinal', 'line.marker_enabled': True}, 'HANOVER': {'line.interpolate': 'cardinal', 'line.marker_enabled': True}, 'MECHANICSVILLE': {'line.interpolate': 'cardinal', 'line.marker_enabled':
True}, 'WHITEFORD': {'line.marker_enabled': True, 'line.interpolate': 'cardinal'}}, 'graph.dimensions': ['BusinessUnitCity'], 'column_settings': {'["name","expression"]': {'decimals': 0, 'number_style': 'percent'}},
'graph.metrics': ['expression', 'count']},

'collection': {'authority_level': None, 'description': None, 'archived': False, 'slug': 'Collection_Name', 'color': '#509EE3', 'name': 'Namer',
'personal_owner_id': None, 'id': 44, 'location': '/', 'namespace': None},

'created_at': '2021-09-02T19:33:04.88087Z', 'public_uuid': None}
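Looking at the dump, query_type is 'query' rather than 'native', so presumably only native (SQL) cards carry template-tags. If so, a guard along these lines might sidestep the KeyError (a rough, untested sketch):

# Only native (SQL) cards have template-tags; GUI-built cards have
# dataset_query['type'] == 'query' and no filters to pull from there.
dataset_query = card_info['dataset_query']
if dataset_query.get('type') == 'native':
    filters_data = dataset_query['native'].get('template-tags', {})
else:
    filters_data = {}  # query-builder card: nothing to clone here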

Does any of the above information stick out as an obvious, easy replacement for the two variables, or would this be a longer restructuring issue?