Parsing vs. algorithms? #547

derifatives · 2017-05-14T18:02:17Z

Hi all. I opened Issue #541 last week suggesting that the Matrix exercise would be better without the string tests. The eventual decision was to remove the tests.

Having just run into similar issues with 'Word Count', I'm trying to understand what the philosophy is here. Reading the problem, it sounds like a fun little exercise that is basically about building a function that turns a list into an association list of counts: [a] -> [(a, Int)]. Once I had written that initial code, I spent the majority of my time figuring out how to parse strings to give the desired answers. This experience has been repeated across quite a few problems:

1. The initial description of the problem is vague. (Is this intentional, to teach people about dealing with vague specs?) Here, for instance, we're given one hypothetical example and the desired output, and the description of the problem is high level: "a function that takes a text and returns how many times each word appears."
2. The actual problem contains some set of additional concerns that are not part of the spec, are only discoverable via trial and error or reading the test file, and don't seem especially complete or coherent. In this case, you had to know that lowercase and uppercase letters are the same, that numbers with digit characters can occur as words, that commas may or may not have spaces after them, that quotes inside of words are part of the word but quotes around words should be removed, and that punctuation other than commas or quotes should be removed. It's not clear whether I have to deal with mixed alphanumeric words or quotations that are longer than one word, since those aren't part of the test suite.
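
To make the split concrete, here is a rough sketch of how the two halves might look, using only `base`. The names `countItems`, `tokenize`, and `wordCount` are hypothetical, and the tokenizer only encodes the rules as I inferred them from the tests; it is not the official spec. Note that counting needs at least an `Ord` (or `Eq`) constraint, so the signature is slightly more specific than `[a] -> [(a, Int)]`. For the cases the tests leave open, the sketch simply assumes that mixed alphanumeric words count as one word and that surrounding quotes are stripped word by word.

```haskell
import Data.Char (isAlphaNum, toLower)
import Data.List (dropWhileEnd, group, sort)

-- The "algorithm" half: count how often each element occurs.
-- Results come back sorted by element because of the sort/group approach.
countItems :: Ord a => [a] -> [(a, Int)]
countItems xs = [(x, length g) | g@(x : _) <- group (sort xs)]

-- The "parsing" half (assumed rules, inferred from the test file):
--   * letters are lowercased, digits are kept
--   * apostrophes survive inside a word but are stripped from its ends
--   * every other character, commas included, acts as a separator
tokenize :: String -> [String]
tokenize = filter (not . null) . map stripQuotes . words . map normalize
  where
    normalize c
      | isAlphaNum c || c == '\'' = toLower c
      | otherwise = ' '
    stripQuotes = dropWhileEnd (== '\'') . dropWhile (== '\'')

-- The two halves composed.
wordCount :: String -> [(String, Int)]
wordCount = countItems . tokenize
```

With these assumptions, `wordCount "one,two,three"` yields one entry per word, and `wordCount "Joe can't tell between 'large' and large."` counts `large` twice while keeping `can't` intact. The counting function is a single line; nearly everything else is parsing policy that the description never states.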

Is this all intentional or just sloppy? I'm finding that I'm spending most of my time on a bunch of these problems dealing with parsing issues which weren't part of the problem description. Thoughts?


petertseng · 2017-05-14T18:14:01Z

> The initial description of the problem is vague

> The actual problem contains some set of additional concerns that are not part of the spec, are only discoverable via trial and error or reading the test file, and don't seem especially complete or coherent.

For this case, and for every other case where the problem description is unclear, please report it to x-common: the description lives at https://github.com/exercism/x-common/blob/master/exercises/word-count/description.md and the test cases we use come from https://github.com/exercism/x-common/blob/master/exercises/word-count/canonical-data.json (when that file exists). For example, some cases were added in exercism/problem-specifications#403. A proposal to remove them in exercism/problem-specifications#726 received insufficient support.

And for this case in particular: I agree, there pretty much is no description. I expect a lot out of the README, especially the requirements.

petertseng · 2017-05-23T21:41:29Z

Thanks again. Don't forget to open the issue in x-common.

petertseng closed this as completed May 23, 2017

petertseng mentioned this issue Jun 11, 2017

anagram: Should we case about case sensitivity and/or duplicates? #559

Closed

derifatives mentioned this issue Jul 20, 2017

Philosophy of problem descriptions vs. test suites, using "word-count" as an example. exercism/problem-specifications#869

Open
