Document str::split behavior with contiguous separators #26130

Merged
merged 1 commit into from
Jun 10, 2015
31 changes: 28 additions & 3 deletions src/libcollections/str.rs
@@ -1180,9 +1180,8 @@ impl str {
/// matched by a pattern.
///
/// The pattern can be a simple `&str`, `char`, or a closure that
-/// determines the split.
-/// Additional libraries might provide more complex patterns like
-/// regular expressions.
+/// determines the split. Additional libraries might provide more complex
+/// patterns like regular expressions.
///
/// # Iterator behavior
///
@@ -1224,6 +1223,32 @@ impl str {
/// let v: Vec<&str> = "abc1defXghi".split(|c| c == '1' || c == 'X').collect();
/// assert_eq!(v, ["abc", "def", "ghi"]);
/// ```
///
/// If a string contains multiple contiguous separators, you will end up
/// with empty strings in the output:
///
/// ```
/// let x = "||||a||b|c".to_string();
/// let d: Vec<_> = x.split('|').collect();
///
/// assert_eq!(d, &["", "", "", "", "a", "", "b", "c"]);
/// ```
///
/// This can lead to possibly surprising behavior when whitespace is used
/// as the separator. This code is correct:
///
/// ```
/// let x = "    a  b c".to_string();
/// let d: Vec<_> = x.split(' ').collect();
///
/// assert_eq!(d, &["", "", "", "", "a", "", "b", "c"]);
/// ```
///
/// It does _not_ give you:
///
/// ```rust,ignore
/// assert_eq!(d, &["a", "b", "c"]);
Member

"It does not give you": isn't that too confusing to bring up? What if we instead show that `d.filter(|s| !s.is_empty())` produces this?
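
A minimal sketch of what this suggestion could look like, assuming the filter is applied to the `split` iterator before collecting (a collected `Vec` has no `filter` method of its own). The helper name `non_empty_fields` is hypothetical and used only for illustration:

```rust
/// Hypothetical helper: split on single spaces, then drop the empty
/// fields that contiguous separators produce.
fn non_empty_fields(s: &str) -> Vec<&str> {
    s.split(' ').filter(|field| !field.is_empty()).collect()
}

fn main() {
    let x = "    a  b c";

    // Plain `split` keeps one empty string per extra separator.
    let d: Vec<&str> = x.split(' ').collect();
    assert_eq!(d, ["", "", "", "", "a", "", "b", "c"]);

    // Filtering out the empty fields leaves only the words.
    assert_eq!(non_empty_fields(x), ["a", "b", "c"]);
}
```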

Member Author

I thought contrasting it with another common way `split` could be implemented would be clarifying.
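
For reference, the standard library's `str::split_whitespace` is one example of that other behavior: it treats a run of whitespace as a single separator and never yields empty strings. A short sketch comparing the two; the function names `by_single_space` and `by_whitespace_runs` are hypothetical, chosen for illustration:

```rust
/// Splits on every individual space, so each extra space yields an
/// empty string.
fn by_single_space(s: &str) -> Vec<&str> {
    s.split(' ').collect()
}

/// Splits on runs of whitespace, so no empty strings are produced.
fn by_whitespace_runs(s: &str) -> Vec<&str> {
    s.split_whitespace().collect()
}

fn main() {
    let x = "    a  b c";
    assert_eq!(by_single_space(x), ["", "", "", "", "a", "", "b", "c"]);
    assert_eq!(by_whitespace_runs(x), ["a", "b", "c"]);
}
```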

Member

Ok. I realize I'm not the target audience, I don't actually know what style works best to communicate.

I would not use this style myself -- I would not mention confusion, I would just underscore the facts as well as possible.

Just my thoughts -- it's awesome this PR addresses this in some way, by underscoring the facts of how this function works!

Member Author

I am usually on that side too, but this seems like a pretty common confusion.

Member

How about `assert!(d != &["a", "b", "c"])`?
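
One way such a negative assertion could be written as a compiling example, sketched here with the standard `assert_ne!` macro in place of `assert!` with `!=`. This is an illustration only; the helper `split_on_space` is a hypothetical name:

```rust
/// Hypothetical helper mirroring the doc example: split on single spaces.
fn split_on_space(s: &str) -> Vec<&str> {
    s.split(' ').collect()
}

fn main() {
    let d = split_on_space("    a  b c");

    // The output is not just the three words...
    assert_ne!(d, ["a", "b", "c"]);

    // ...because the five extra spaces each contribute an empty string.
    assert_eq!(d.len(), 8);
}
```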

/// ```
#[stable(feature = "rust1", since = "1.0.0")]
pub fn split<'a, P: Pattern<'a>>(&'a self, pat: P) -> Split<'a, P> {
core_str::StrExt::split(&self[..], pat)