
fix: optimize Excel row counting for files with abnormal max_row #13018

Merged
KevinHuSh merged 1 commit into infiniflow:main from yuehong136:fix/excel-row-count
Feb 6, 2026

Conversation

@yuehong136
Contributor

@yuehong136 yuehong136 commented Feb 5, 2026

What problem does this PR solve?

Some Excel files carry abnormal max_row metadata (e.g., max_row=1,048,534 when only 300 rows actually contain data). This causes two problems:

  • row_number() returns an incorrect count, creating 350+ tasks instead of 1
  • list(ws.rows) iterates through over a million empty rows, causing the system to hang

This PR uses a binary search to find the last row that actually contains data.
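
For readers who want to see the idea in isolation, here is a minimal sketch of such a binary search. It is not the code from this PR: the function name `find_last_data_row`, the `row_has_data` predicate, and the simulated sheet are all hypothetical. It also assumes that rows with data form one contiguous block starting at row 1; if a sheet has blank rows in the middle of its data, a plain binary search like this can undercount.

```python
from typing import Callable, List


def find_last_data_row(row_has_data: Callable[[int], bool], max_row: int) -> int:
    """Return the highest 1-based row index that has data, or 0 if none.

    Assumption (not taken from the PR): rows with data form one contiguous
    block starting at row 1, so row_has_data(r) is True up to some row and
    False for every row after it.
    """
    lo, hi = 1, max_row
    last = 0  # highest row confirmed to contain data so far
    while lo <= hi:
        mid = (lo + hi) // 2
        if row_has_data(mid):
            last = mid      # data found: the last data row is mid or later
            lo = mid + 1
        else:
            hi = mid - 1    # empty row: the last data row is before mid
    return last


# Simulated sheet matching the numbers in the description:
# 300 real rows, while the metadata reports 1,048,534.
REPORTED_MAX_ROW = 1_048_534
ACTUAL_ROWS = 300
probes: List[int] = []  # records which rows were inspected


def simulated_row_has_data(r: int) -> bool:
    probes.append(r)
    return r <= ACTUAL_ROWS


last_row = find_last_data_row(simulated_row_has_data, REPORTED_MAX_ROW)
# last_row is 300; the number of probes grows with log2(REPORTED_MAX_ROW)
print(last_row, len(probes))
```

With openpyxl, the predicate could plausibly be something like `any(ws.cell(row=r, column=c).value is not None for c in range(1, ws.max_column + 1))`, but that wiring is an assumption here and may differ from what the PR does. The point of the sketch is only that a couple of dozen row probes replace a scan over every row up to the reported max_row.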

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • Performance Improvement

Co-authored-by: Cursor <cursoragent@cursor.com>
@dosubot dosubot bot added the size:M and 🐞 bug labels Feb 5, 2026
@KevinHuSh KevinHuSh added the ci label Feb 6, 2026
@KevinHuSh KevinHuSh marked this pull request as draft February 6, 2026 01:54
@KevinHuSh KevinHuSh marked this pull request as ready for review February 6, 2026 01:55
@codecov

codecov bot commented Feb 6, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 44.13%. Comparing base (1262533) to head (ebe2982).
⚠️ Report is 7 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff             @@
##             main   #13018       +/-   ##
===========================================
+ Coverage   33.71%   44.13%   +10.41%     
===========================================
  Files          43       43               
  Lines        9378     9378               
  Branches      107      107               
===========================================
+ Hits         3162     4139      +977     
+ Misses       6207     5220      -987     
- Partials        9       19       +10     


@KevinHuSh KevinHuSh merged commit 5333e76 into infiniflow:main Feb 6, 2026
1 of 2 checks passed

Labels

  • 🐞 bug (Something isn't working, pull request that fix bug.)
  • ci (Continue Integration)
  • size:M (This PR changes 30-99 lines, ignoring generated files.)


3 participants