Parsing vs. algorithms? #547

Closed
derifatives opened this issue May 14, 2017 · 2 comments

@derifatives

Hi all. I opened Issue #541 last week suggesting that the Matrix exercise would be better without the string tests. The eventual decision was to remove the tests.

Having just encountered similarish issues with 'Word Count', I'm trying to understand what the philosophy is here. Reading the problem, it sounds like a fun little problem that is basically about building a function that turns a list into an association list of counts: [a] -> [(a, Int)]. Having written the initial code, I spent the majority of my time figuring out how to parse strings to give the desired answers. This experience has been repeated across quite a few problems:

  • The initial description of the problem is vague. (Is this intentional to teach people about dealing with vague specs?) Here, for instance, we're given one hypothetical example and the desired output, and the description of the problem is high level: "a function that takes a text and returns how many times each word appears*."
  • The actual problem contains some set of additional concerns that are not part of the spec, are only discoverable via trial and error or reading the test file, and don't seem especially complete or coherent. In this case, you had to know that lowercase and uppercase letters are the same, that numbers with digit characters can occur as words, that commas may or may not have spaces after them, that quotes inside of words are part of the word but quotes around words should be removed, and that punctuation other than commas or quotes should be removed. It's not clear whether I have to deal with mixed alphanumeric words or quotations that are longer than one word, since those aren't part of the test suite.
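To make the split concrete, here is a minimal Haskell sketch that separates the counting core from the parsing rules listed above. The names `counts`, `tokens`, and `wordCount` are hypothetical, not taken from the exercise or its test file. The tokenizer encodes only the rules as I read them from the tests. Its handling of the untested cases (mixed alphanumeric words, multi-word quotations) is an arbitrary choice, not part of any spec.

```haskell
import Data.Char (isAlphaNum, toLower)
import Data.List (dropWhileEnd)
import qualified Data.Map.Strict as Map

-- The counting core: a list in, an association list of counts out.
-- The result is ordered by key because it goes through a Map.
counts :: Ord a => [a] -> [(a, Int)]
counts xs = Map.toList (Map.fromListWith (+) [(x, 1) | x <- xs])

-- Hypothetical tokenizer (an assumption, not the canonical behaviour):
--   * letters are lowercased, digits are kept as word characters
--   * apostrophes inside a word are kept, surrounding ones are stripped
--   * every other character (commas, punctuation) acts as a separator
tokens :: String -> [String]
tokens = filter (not . null) . map stripQuotes . words . map normalize
  where
    normalize c
      | isAlphaNum c || c == '\'' = toLower c
      | otherwise                 = ' '
    stripQuotes = dropWhileEnd (== '\'') . dropWhile (== '\'')

-- The full exercise is then just the composition of the two parts.
wordCount :: String -> [(String, Int)]
wordCount = counts . tokens
```

In this sketch `counts` is two lines, and everything else exists only to satisfy parsing rules that the description never states.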

Is this all intentional or just sloppy? I'm finding that I'm spending most of my time on a bunch of these problems dealing with parsing issues which weren't part of the problem description. Thoughts?

@petertseng
Member

> The initial description of the problem is vague
>
> The actual problem contains some set of additional concerns that are not part of the spec, are only discoverable via trial and error or reading the test file, and don't seem especially complete or coherent.

For this case, and for any other case where the problem description is unclear, please report it to x-common. The description comes from https://github.com/exercism/x-common/blob/master/exercises/word-count/description.md and the test cases we use come from https://github.com/exercism/x-common/blob/master/exercises/word-count/canonical-data.json (when that file exists). For example, some cases were added in exercism/problem-specifications#403. A proposal to remove them in exercism/problem-specifications#726 received insufficient support.

And for this case in particular: I agree, there pretty much is no description. I expect a lot out of the README, especially the requirements.

@petertseng
Member

Thanks again. Don't forget to open the issue in x-common.
