Field values not populated

Hi,
I have issues on some fields not generated the values, detail as below

Issues : field values for some fields not pupulated, i have check in table metabase_fieldvalues, there is no record for that field, try to insert manually but if i do manual rescan, the record deleted by metabase

Question :

  1. is there any limitation on the count of record for populating field values?
  2. is there any character restriction like aphostrope, backspace, tab character etc.?

BI Environment :
Metabase version : 0.33.3
OS : Ubuntu 16.04
Java : OpenJDK 11
Database for Metabase : Postgres
Database for BI Source : MariaDB columnstore 1.2.5

Please inform if there is any data i need to provide to trace this issues

Thank you,

Hi @bambang

When you run the sync/scan, then check the log for any warnings/errors. Admin > Troubleshooting > Logs.

I think 0.33.3 changed the log level, so it doesnโ€™t show the working sync/scan anymore, but you can change the log level to DEBUG:
https://metabase.com/docs/latest/operations-guide/log-configuration.html

Manually editing metadata is generally a bad idea - I would recommend that you use the API:
https://github.com/metabase/metabase/blob/master/docs/api-documentation.md

Most of the sync/scan code located here, if youโ€™re interested:
https://github.com/metabase/metabase/tree/master/src/metabase/sync

Hi Flamber,

Thank you for your response, i have change log level to DEBUG, there is so many output for this event, i have cut from โ€œrescan_valuesโ€.

But the problem is i have no idea where to start looking for error, appreciate your help on this.

log4j.propoerties : https://pastebin.com/5Lzgv3VY
metabase.log : https://pastebin.com/JN35AdXb

@bambang
But do you see any errors/warnings - just with normal logging - when you run sync/scan?
If everything is normal without errors, then you should see something like this:

09-22 00:00:00 INFO sync.util :: STARTING: Cache field values in h2 Database 4 '1 - Sample2'
09-22 00:00:00 INFO sync.util :: STARTING: Sync metadata for h2 Database 4 '1 - Sample2'
09-22 00:00:00 INFO sync-metadata.tables :: Updating description for tables: (Table  'PUBLIC.PEOPLE' Table  'PUBLIC.REVIEWS' Table  'PUBLIC.ORDERS' Table  'PUBLIC.PRODUCTS')
09-22 00:00:00 INFO sync.util :: FINISHED: Sync metadata for h2 Database 4 '1 - Sample2' (184.5 ms)
09-22 00:00:00 INFO sync.util :: STARTING: Analyze data for h2 Database 4 '1 - Sample2'
09-22 00:00:00 WARN util.date :: Possible timezone conflict found on database 1 - Sample2. JVM timezone is America/Guatemala and detected database timezone is America/Chicago.
09-22 00:00:00 INFO sync.analyze :: fingerprint-fields Analyzed [*******ยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยท] ๐Ÿ˜ž   15% Table 12 'PUBLIC.PEOPLE'
09-22 00:00:00 INFO sync.analyze :: fingerprint-fields Analyzed [***************ยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยท] ๐Ÿ˜•   31% Table 13 'PUBLIC.REVIEWS'
09-22 00:00:00 INFO sync.analyze :: classify-fields Analyzed [***********************ยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยท] ๐Ÿ˜ฌ   46% Table 12 'PUBLIC.PEOPLE'
09-22 00:00:00 INFO sync.analyze :: classify-fields Analyzed [******************************ยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยทยท] ๐Ÿ˜Œ   62% Table 13 'PUBLIC.REVIEWS'
09-22 00:00:00 INFO sync.analyze :: classify-tables Analyzed [**************************************ยทยทยทยทยทยทยทยทยทยทยทยท] ๐Ÿ˜‹   77% Table 12 'PUBLIC.PEOPLE'
09-22 00:00:00 INFO sync.analyze :: classify-tables Analyzed [**********************************************ยทยทยทยท] ๐Ÿ˜   92% Table 13 'PUBLIC.REVIEWS'
09-22 00:00:00 INFO sync.util :: FINISHED: Cache field values in h2 Database 4 '1 - Sample2' (232.1 ms)
09-22 00:00:00 INFO sync.util :: FINISHED: Analyze data for h2 Database 4 '1 - Sample2' (25.2 ms)

You only need more logging, if youโ€™re not seeing anything. Iโ€™m not sure if INFO would be enough, but the interesting part is in the sync module, so maybe only change logging levels for that. I think that would be something like:
log4j.logger.metabase.sync=DEBUG

Hi Flamber,

I have change logging level as your suggestion, and found no error in log, but still not populated in metabase DB

DB LOG = https://pastebin.com/22EDTsAJ
Metabase LOG = https://pastebin.com/DMcHQ9eE

@bambang
Okay, which field is it, that doesnโ€™t update?

Because I can see 11 fields that it checks for updated with values.
Example, you can ProductName (field ID 376) in the Metabase log, and then you see the corresponding query in the Postgres log: SELECT ... `ProductName` AS `ProductName` ... LIMIT 5000;

I have seen thousands of posts and issues, but nothing that deleted the FieldValues, when you re-scanned, so Iโ€™m a bit puzzled by whatโ€™s going on.

Iโ€™m inclined to have you make a backup of the metadata and then run Admin > Databases > (db) > Discard saved field values.

By any chance, do you have Visibility set to something other than Everywhere? Admin > Data Model > (database) > (table) > (column) :gear:
Or table visibility set to Hidden? Admin > Data Model > (database) > (table) > (just below the table title)

Which version of Postgres are you using?

Hi Flamber,

Because I can see 11 fields that it checks for updated with values.
Example, you can ProductName (field ID 376) in the Metabase log, and then you see the corresponding query in the Postgres log: SELECT ...ProductNameASProductName... LIMIT 5000;

We are using postgres as metabase database ( in replacement of default metabase database), and for datawarehouse we are using mariadb columnstore 1.2.5, that log is come from mariadb.
I can run query in log in mariadb without any problem

Iโ€™m inclined to have you make a backup of the metadata and then run Admin > Databases > (db) > Discard saved field values.

Already done discard saved values, but still not populated

By any chance, do you have Visibility set to something other than Everywhere? Admin > Data Model > (database) > (table) > (column) :gear:
Or table visibility set to Hidden? Admin > Data Model > (database) > (table) > (just below the table title)

No, table set to queryable and field set to anywhere

Which version of Postgres are you using?

psql (PostgreSQL) 10.10 (Ubuntu 10.10-0ubuntu0.18.04.1)

@bambang
Yeah, I meant ColumnStore - not Postgres.
So is ProductName one of the fields that are not being populated with values?
And you have no problems running the scan query for that from within Metabase Native query? And it returns a bunch of values?

Hi Flamber,

Thankyou for your quick response :+1:

Actually there is some other field not populated, like vendorid, vendorname, etc. detail as screenshot below

Query test to productname

DataModel

Generated field_values

@bambang
Okay, this gave a bit more insight into your structure.
It looks like most of your columns (in MariaDB) are type/Text. I have a feeling thatโ€™s the cause of your problems.

@sbelak Simon, you know everything about the sync-process :wink: Does Metabase not like Text columns for field values, when doing scan, even when they have a Field Type set to i.e. Category?

We donโ€™t store field values for text fields that have no special type and an average length > 50. Globally โ€“ ie. for all types โ€“ there is also a cardinality constraint (< 100) and total size of textual representation of values (100 * cardinality threshold)

1 Like

Hi Simon,

So there is rules to make field values stored in metabase system

  1. Text fields with no special type
  2. Average length > 50
  3. Cardinality constraint (<100)
  4. Total size of textual representation of values (100* cardinality threshold)

is the condition above are AND condition or OR condition? how if i want to check manually if fields are met above condition?

Thank you

Hi Simon,

@sbelak

Sorry to bothering you, is there any update for our issues? at least can you show us how to check the condition of the fields that doesnt meet the validation

@bambang Remember to tag @sbelak otherwise he might not get notified.

@sbelak,

So there is rules to make field values stored in metabase system

Text fields with no special type
Average length > 50
Cardinality constraint (<100)
Total size of textual representation of values (100* cardinality threshold)
is the condition above are AND condition or OR condition? how if i want to check manually if fields are met above condition?

Thank you

AND.

Depending on your SQL dialect, there should be LENGTH function, so you could do something like this:

select count(distinct my_field) as cardinality, avg(length(my_field)) as avg_size, sum(length(my_field)) as total_size from my_table