Hi,
I have an issue where values are not generated for some fields. Details below.
Issue: field values for some fields are not populated. I checked the metabase_fieldvalues table and there is no record for those fields. I tried inserting records manually, but when I run a manual re-scan, the records are deleted by Metabase.
Questions:
Is there any limitation on the number of records for populating field values?
Is there any character restriction, e.g. apostrophe, backspace, tab character, etc.?
BI Environment :
Metabase version : 0.33.3
OS : Ubuntu 16.04
Java : OpenJDK 11
Database for Metabase : Postgres
Database for BI Source : MariaDB ColumnStore 1.2.5
Please let me know if there is any data I need to provide to trace this issue.
@bambang
But do you see any errors/warnings - just with normal logging - when you run sync/scan?
If everything is normal without errors, then you should see something like this:
09-22 00:00:00 INFO sync.util :: STARTING: Cache field values in h2 Database 4 '1 - Sample2'
09-22 00:00:00 INFO sync.util :: STARTING: Sync metadata for h2 Database 4 '1 - Sample2'
09-22 00:00:00 INFO sync-metadata.tables :: Updating description for tables: (Table 'PUBLIC.PEOPLE' Table 'PUBLIC.REVIEWS' Table 'PUBLIC.ORDERS' Table 'PUBLIC.PRODUCTS')
09-22 00:00:00 INFO sync.util :: FINISHED: Sync metadata for h2 Database 4 '1 - Sample2' (184.5 ms)
09-22 00:00:00 INFO sync.util :: STARTING: Analyze data for h2 Database 4 '1 - Sample2'
09-22 00:00:00 WARN util.date :: Possible timezone conflict found on database 1 - Sample2. JVM timezone is America/Guatemala and detected database timezone is America/Chicago.
09-22 00:00:00 INFO sync.analyze :: fingerprint-fields Analyzed [*******···········································] 15% Table 12 'PUBLIC.PEOPLE'
09-22 00:00:00 INFO sync.analyze :: fingerprint-fields Analyzed [***************···································] 31% Table 13 'PUBLIC.REVIEWS'
09-22 00:00:00 INFO sync.analyze :: classify-fields Analyzed [***********************···························] 46% Table 12 'PUBLIC.PEOPLE'
09-22 00:00:00 INFO sync.analyze :: classify-fields Analyzed [******************************····················] 62% Table 13 'PUBLIC.REVIEWS'
09-22 00:00:00 INFO sync.analyze :: classify-tables Analyzed [**************************************············] 77% Table 12 'PUBLIC.PEOPLE'
09-22 00:00:00 INFO sync.analyze :: classify-tables Analyzed [**********************************************····] 92% Table 13 'PUBLIC.REVIEWS'
09-22 00:00:00 INFO sync.util :: FINISHED: Cache field values in h2 Database 4 '1 - Sample2' (232.1 ms)
09-22 00:00:00 INFO sync.util :: FINISHED: Analyze data for h2 Database 4 '1 - Sample2' (25.2 ms)
You only need more logging if you're not seeing anything. I'm not sure if INFO would be enough, but the interesting part is in the sync module, so maybe only change the logging level for that. I think that would be something like: log4j.logger.metabase.sync=DEBUG
@bambang
Okay, which field is it that doesn't update?
Because I can see 11 fields that it checks for updated values.
For example, you can see ProductName (field ID 376) in the Metabase log, and then you see the corresponding query in the Postgres log: SELECT ... `ProductName` AS `ProductName` ... LIMIT 5000;
I have seen thousands of posts and issues, but nothing where the FieldValues were deleted on re-scan, so I'm a bit puzzled by what's going on.
I'm inclined to have you make a backup of the metadata and then run Admin > Databases > (db) > Discard saved field values.
By any chance, do you have Visibility set to something other than Everywhere? Admin > Data Model > (database) > (table) > (column)
Or table visibility set to Hidden? Admin > Data Model > (database) > (table) > (just below the table title)
We are using Postgres as the Metabase application database (replacing the default embedded database), and for the data warehouse we are using MariaDB ColumnStore 1.2.5; that log comes from MariaDB.
I can run the query from the log in MariaDB without any problem.
I already ran Discard saved field values, but the values are still not populated.
No, the table is set to Queryable and the field visibility is set to Everywhere.
@bambang
Yeah, I meant ColumnStore - not Postgres.
So is ProductName one of the fields that are not being populated with values?
And you have no problems running the scan query for that from within Metabase Native query? And it returns a bunch of values?
@bambang
Okay, this gave a bit more insight into your structure.
It looks like most of your columns (in MariaDB) are type/Text. I have a feeling thatโs the cause of your problems.
@sbelak Simon, you know everything about the sync process. Does Metabase not like Text columns for field values when doing a scan, even when they have a Field Type set to e.g. Category?
We don't store field values for text fields that have no special type and an average length > 50. Globally (i.e. for all types) there is also a cardinality constraint (< 100) and a limit on the total size of the textual representation of the values (100 * the cardinality threshold).
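Read as code, the rules described above look roughly like this. This is only a sketch based on the wording in this thread, not Metabase's actual implementation; the exact threshold values and the assumption that the checks combine as AND conditions are taken from the description above.

```python
# Sketch of the field-values scan rules described in this thread.
# Thresholds and their AND-combination are assumptions based on the
# description above, not Metabase's actual source code.

AVG_LENGTH_THRESHOLD = 50       # text fields with no special type
CARDINALITY_THRESHOLD = 100     # applies to all field types
TOTAL_SIZE_THRESHOLD = 100 * CARDINALITY_THRESHOLD  # total textual size

def should_store_field_values(values, is_text=True, has_special_type=False):
    """Return True if, per the rules above, values would be cached."""
    distinct = {str(v) for v in values}
    if not distinct:
        return True
    avg_length = sum(len(v) for v in distinct) / len(distinct)
    total_size = sum(len(v) for v in distinct)

    # Rule: skip long text fields that have no special type.
    if is_text and not has_special_type and avg_length > AVG_LENGTH_THRESHOLD:
        return False
    # Global cardinality constraint.
    if len(distinct) >= CARDINALITY_THRESHOLD:
        return False
    # Global limit on total textual size.
    if total_size > TOTAL_SIZE_THRESHOLD:
        return False
    return True
```

Under this reading, a single failed check is enough to exclude a field, which is why a column with several hundred distinct values would never get cached regardless of its average length.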
Sorry to bother you, is there any update on our issue? At least, can you show us how to check whether a field meets the validation conditions?
So there are rules that determine whether field values are stored in the Metabase system:
Text fields with no special type
Average length > 50
Cardinality constraint (< 100)
Total size of textual representation of values (100 * cardinality threshold)
Are the conditions above AND conditions or OR conditions? And how can I check manually whether a field meets them?
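The suggested query itself is not quoted in the thread, but a query producing the three numbers reported below (cardinality, average size, total size) can be sketched like this. The table and column names here are hypothetical, and sqlite3 is used only as a self-contained stand-in for MariaDB.

```python
import sqlite3

# Hypothetical table/column names; sqlite3 stands in for MariaDB here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (ProductName TEXT)")
conn.executemany("INSERT INTO products VALUES (?)",
                 [(f"product-{i}",) for i in range(644)])

# Cardinality, average length, and total length of the distinct values.
cardinality, avg_size, total_size = conn.execute("""
    SELECT COUNT(*), AVG(LENGTH(ProductName)), SUM(LENGTH(ProductName))
    FROM (SELECT DISTINCT ProductName FROM products)
""").fetchone()

print(cardinality, round(avg_size, 2), total_size)
```

On MariaDB you would run the equivalent SELECT against the real table (CHAR_LENGTH counts characters, while LENGTH counts bytes). A cardinality at or above 100 already fails the constraint regardless of the other two numbers.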
Tried the suggested query, and this is the result:
cardinality = 644
avg_size = 21.92
total_size = 969,583
Compared with the validation rules:
Text fields with no special type => pass
Average length > 50 => pass
Cardinality constraint (<100) => fail
Total size of textual representation of values (100* cardinality threshold) => fail
Based on the above results, especially validation number 3, the number of distinct values should not exceed 100. Is there any way to change this value in the settings or anywhere else?