proposal: bufio: Scanner.{Text,Bytes}Seq #70657

pkierski opened this issue Dec 3, 2024 · 8 comments

pkierski commented Dec 3, 2024

Proposal Details

Since iterators were introduced in Go 1.23, a growing number of libraries have been built on this feature. Reading data by lines or words is quite a common task and could be accomplished with a for ... range loop. It requires just two simple methods on bufio.Scanner:

func (s *Scanner) TextSeq() iter.Seq[string] {
	return func(yield func(string) bool) {
		for s.Scan() {
			if !yield(s.Text()) {
				break
			}
		}
	}
}

func (s *Scanner) BytesSeq() iter.Seq[[]byte] {
	return func(yield func([]byte) bool) {
		for s.Scan() {
			if !yield(s.Bytes()) {
				break
			}
		}
	}
}

Reading a whole file as a collection of lines could look like this:

	f, err := os.Open("file.txt")
	if err != nil {
		panic(err)
	}
	defer f.Close()
	scanner := bufio.NewScanner(f)
	
	// read all lines as slice of strings
	lines := slices.Collect(scanner.TextSeq())

	// instead of:
	// lines := make([]string, 0)
	// for scanner.Scan() {
	// 	lines = append(lines, scanner.Text())
	// }
gopherbot added this to the Proposal milestone on Dec 3, 2024

@seankhliao
Member

this doesn't address how errors should be surfaced, see #70084 for related discussion

@mateusz834
Member

Also why global functions, instead of methods?

@pkierski
Author

pkierski commented Dec 3, 2024

Also why global functions, instead of methods?

My bad, I copied the code from my experiments. I meant methods, of course.

@mateusz834
Member

this doesn't address how errors should be surfaced, see #70084 for related discussion

And #70631

ianlancetaylor moved this to Incoming in Proposals on Dec 3, 2024
@earthboundkid
Contributor

I think the naming convention is BytesSeq, like strings.SplitSeq etc. in 1.24.

@pkierski
Author

pkierski commented Dec 7, 2024

I think the naming convention is BytesSeq, like strings.SplitSeq etc. in 1.24.

Thank you, changed according to this suggestion.

adonovan changed the title from "proposal: bufio: Scanner.IterText/Scanner.IterBytes" to "proposal: bufio: Scanner.{Text,Bytes}Seq" on Apr 23, 2025
@adonovan
Member

The need to remember to call Scanner.Err after iterating, even with the traditional API, is already a problem; iterators may make it even easier to forget to do so. I think we should put this proposal aside until we have a consensus on how to deal with iterable sequences with the potential for errors. In the meantime, you can always write this function yourself.

Status: Incoming