Panic with 'byte index is not a char boundary' #1464

Closed
bredov opened this issue Apr 19, 2017 · 2 comments
Labels
bug Panic, non-idempotency, invalid code, etc.

Comments

@bredov

bredov commented Apr 19, 2017

rustfmt panics in certain situations on a codebase that contains multibyte characters. Here is the backtrace:

stack backtrace:
   0: std::sys::imp::backtrace::tracing::imp::unwind_backtrace
   1: std::panicking::default_hook::{{closure}}
   2: std::panicking::default_hook
   3: std::panicking::rust_panic_with_hook
   4: std::panicking::begin_panic
   5: std::panicking::begin_panic_fmt
   6: rust_begin_unwind
   7: core::panicking::panic_fmt
   8: core::str::slice_error_fail
   9: rustfmt::utils::trim_newlines
  10: rustfmt::visitor::FmtVisitor::visit_item
  11: rustfmt::visitor::FmtVisitor::walk_mod_items
  12: rustfmt::visitor::FmtVisitor::format_separate_mod
  13: rustfmt::format_ast
  14: rustfmt::run
  15: rustfmt::execute
  16: rustfmt::main
  17: __rust_maybe_catch_panic
  18: std::rt::lang_start

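It appears that the panic happens in the utils::trim_newlines function, at the &input[start..end] line, because end is calculated incorrectly: let end = input.rfind(|c| c != '\n' && c != '\r').unwrap_or(0) + 1;. rfind returns the byte offset at which the last matching character starts, so the + 1 is only correct when that character is one byte long. For a Unicode character that spans several bytes, end lands inside the character and the slice panics.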
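To make the failure concrete, here is a minimal sketch that isolates the end-offset expression quoted above. The helper name naive_end is hypothetical and the surrounding rustfmt code is not reproduced; only standard-library calls are used.

```rust
/// Hypothetical helper: computes the end offset the same way as the
/// expression quoted in the report (`rfind(..).unwrap_or(0) + 1`).
fn naive_end(input: &str) -> usize {
    input.rfind(|c: char| c != '\n' && c != '\r').unwrap_or(0) + 1
}

/// Reports whether slicing `input[..naive_end(input)]` would be valid.
/// `str::get` returns `None` where the indexing operator would panic.
fn naive_slice_is_valid(input: &str) -> bool {
    input.get(..naive_end(input)).is_some()
}
```

With ASCII input such as "abc\n" the offset falls on a character boundary and the slice is valid. With "caf\u{e9}\n" the last non-newline character is two bytes long, so the offset points into the middle of it and naive_slice_is_valid returns false.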

I've tried to fix the problem, but I don't know how to get the width of the character at a given offset in a &str using stable Rust features. Once unicode-rs/unicode-segmentation#21 is resolved, it should be possible to use GraphemeCursor for that purpose.

To test whether a proper addend resolves the issue, I used this hacky solution:

let end = input.rfind(|c| c != '\n' && c != '\r').unwrap_or(0);
let rest = ::std::str::from_utf8(&input.as_bytes()[end..]).unwrap();
let char_len = rest.chars().next().unwrap().len_utf8();
let end = end + char_len;

and it did help.
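The same idea can be written without the round trip through from_utf8. The following is only a sketch of a boundary-safe helper, not the actual rustfmt implementation: the function bodies are assumptions based on the behaviour described in this report, and trim_newlines_short is a hypothetical name for an alternative built on the standard str::trim_matches.

```rust
/// Sketch of a boundary-safe `trim_newlines`: strips leading and trailing
/// '\n' and '\r' characters. Not the real rustfmt code.
fn trim_newlines(input: &str) -> &str {
    fn is_newline(c: char) -> bool {
        c == '\n' || c == '\r'
    }

    // Byte offset of the first character that is not a newline.
    let start = match input.find(|c: char| !is_newline(c)) {
        Some(i) => i,
        None => return "", // empty or newline-only input
    };

    // Walk backwards to the last non-newline character and add its
    // UTF-8 width, so `end` always falls on a character boundary.
    let end = input
        .char_indices()
        .rev()
        .find(|&(_, c)| !is_newline(c))
        .map(|(i, c)| i + c.len_utf8())
        .unwrap_or(start);

    &input[start..end]
}

/// Shorter alternative that delegates the boundary handling to the
/// standard library.
fn trim_newlines_short(input: &str) -> &str {
    input.trim_matches(|c: char| c == '\n' || c == '\r')
}
```

For example, trim_newlines("\n// \u{451}\r\n") returns "// \u{451}", where the naive + 1 version would panic. Both functions should agree on every input; the explicit version is shown only because it mirrors the structure of the code discussed here.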

@nrc
Member

nrc commented May 1, 2017

cc #6

@nrc nrc added the bug Panic, non-idempotency, invalid code, etc. label May 1, 2017
@topecongiro
Contributor

I think this is no longer an issue since we got rid of trim_newlines.


3 participants