🚸 Restrict "column 'col' not in dataframe" error to at most 10 columns #2549

Zethson · 2025-03-10T17:40:23Z

Before:
ValidationError: column 'date' not in dataframe. Columns in dataframe: ['transaction_amount_usd_cent', 'currency_name', 'transaction_id', 'transaction_date', 'transaction_time', 'merchant_id', 'merchant_name', 'merchant_category', 'payment_method', 'card_type', 'card_last_four', 'customer_id', 'customer_email', 'transaction_status', 'is_recurring', 'geography_country', 'geography_city', 'device_type', 'platform', 'is_international', 'exchange_rate', 'fees_applied']

After:
ValidationError: column 'date' not in dataframe. Found 22 columns including: 'transaction_amount_usd_cent', 'currency_name', 'transaction_id', 'transaction_date', 'transaction_time', 'merchant_id', 'merchant_name', 'merchant_category', 'payment_method', 'card_type'...

This is in line with the rest of our UX, where we also show only the first 10 hits. The more columns a dataframe has, the bigger the difference. I saw a user's notebook become unreadable because the error listed every column of their dataset.
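For illustration, the truncation described above can be sketched in plain Python. This is not the code in this PR: `summarize_columns` and `missing_column_message` are hypothetical names, and the output format is only modeled on the "After" message shown above.

```python
from collections.abc import Iterable


def summarize_columns(columns: Iterable[str], limit: int = 10) -> str:
    """Return a short, quoted preview of at most `limit` column names.

    Hypothetical helper for illustration; not part of lamindb.
    """
    names = [str(c) for c in columns]
    shown = ", ".join(f"'{name}'" for name in names[:limit])
    # Only append an ellipsis when some names were actually left out.
    suffix = "..." if len(names) > limit else ""
    return f"Found {len(names)} columns including: {shown}{suffix}"


def missing_column_message(
    missing: str, columns: Iterable[str], limit: int = 10
) -> str:
    """Build the shortened 'column not in dataframe' message."""
    return f"column '{missing}' not in dataframe. {summarize_columns(columns, limit)}"
```

With 22 columns, `missing_column_message("date", columns)` lists only the first 10 names and still reports the total count, so the message length stays bounded no matter how wide the dataframe is.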

codecov · 2025-03-11T10:01:53Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.16%. Comparing base (cbe457a) to head (be56055).
Report is 3 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2549      +/-   ##
==========================================
+ Coverage   92.15%   92.16%   +0.01%     
==========================================
  Files          60       60              
  Lines        9988    10023      +35     
==========================================
+ Hits         9204     9238      +34     
- Misses        784      785       +1


github-actions · 2025-03-11T10:05:10Z

🚀 Deployed on https://67d02884cfb7a7a36f9762df--lamindb-qnwk.netlify.app

sunnyosun

This looks complicated and is a lot of code. Could you use _format_values instead? It is already used everywhere to truncate lists in log messages.

lamindb/lamindb/models/_from_values.py

Line 301 in 1373325

def _format_values(

Zethson · 2025-03-11T12:09:35Z

@sunnyosun thank you. I don't know why I didn't think of it but it makes it much more concise.

sunnyosun · 2025-03-11T14:09:52Z

lamindb/curators/__init__.py

+                if "column" in err_msg and "not in dataframe" in err_msg:
+                    missing_col = err_msg.split("column '")[1].split("'")[0]
+                    display_cols_str = _format_values(
+                        list(self._dataset.columns), n=10, quotes=True, sep="'"


Could this be simplified to display_cols_str = _format_values(self._dataset.columns, n=10)?

sunnyosun · 2025-03-11T14:12:20Z

lamindb/curators/__init__.py

+                    display_cols_str = _format_values(
+                        list(self._dataset.columns), n=10, quotes=True, sep="'"
+                    )
+                    err_msg = f"column '{missing_col}' not in dataframe. {len(list(self._dataset.columns))} columns in dataframe including: {display_cols_str}"


err_msg = f"column '{missing_col}' not in dataframe. {len(list(self._dataset.columns))} columns in dataframe: {display_cols_str}"

What happens if more than one column is missing from the dataframe?
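To make the question concrete: the split-based parsing in the diff above returns only the first quoted name. Below is a minimal sketch of a parser that collects every missing column instead. It assumes each missing column appears in the message as "column '<name>' not in dataframe", which is the shape shown in this PR's description. Whether pandera ever reports several such columns in one message is not established in this thread, and `find_missing_columns` is a hypothetical name.

```python
import re

# Assumed message shape, taken from the example in this PR's description.
MISSING_COLUMN_RE = re.compile(r"column '([^']+)' not in dataframe")


def find_missing_columns(err_msg: str) -> list[str]:
    """Return every column name reported as missing, in order, without duplicates."""
    seen: dict[str, None] = {}  # dict keys preserve insertion order
    for name in MISSING_COLUMN_RE.findall(err_msg):
        seen.setdefault(name, None)
    return list(seen)
```

A message without the pattern yields an empty list, so the caller can fall back to the original error text. Column names that contain an apostrophe would not match this pattern.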

sunnyosun · 2025-03-11T14:15:25Z

This is OK if you need it urgently, but it's a bit patchy. Ideally, we should have a parser for all pandera errors.

Zethson · 2025-03-11T15:00:20Z

No, this is not urgent, so I might revisit it later with more general code.

✨ Improve col not found error message

215e61f

Zethson changed the title ~~✨ Improve col not found error message~~ 🚸 Improve col not found error message Mar 10, 2025

🎨 Fix tests

7707a84

github-actions bot temporarily deployed to pull request March 11, 2025 10:05 Inactive

Zethson marked this pull request as ready for review March 11, 2025 10:05

🎨 Polish

a63a7a0

Zethson changed the title ~~🚸 Improve col not found error message~~ 🚸 Restrict "column 'col' not in dataframe" error to at most 10 columns Mar 11, 2025

Merge branch 'main' into feature/df_not_col_msg

08a038f

github-actions bot temporarily deployed to pull request March 11, 2025 10:46 Inactive

Zethson requested a review from sunnyosun March 11, 2025 10:52

sunnyosun requested changes Mar 11, 2025

Zethson added 2 commits March 11, 2025 12:51

🎨 Polish

d654e21

Merge branch 'feature/df_not_col_msg' of https://github.com/laminlabs…

be56055

…/lamindb into feature/df_not_col_msg

Zethson requested a review from sunnyosun March 11, 2025 12:09

github-actions bot temporarily deployed to pull request March 11, 2025 12:11 Inactive

sunnyosun reviewed Mar 11, 2025

sunnyosun marked this pull request as draft March 13, 2025 13:45

falexwolf force-pushed the main branch from f6030cd to af28cf8 Compare March 26, 2025 16:27
