Skip to content

nucleotide-count: Add canonical-data.json #505

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

bunnymatic
Copy link
Contributor

@bunnymatic bunnymatic commented Jan 23, 2017

@bunnymatic
Copy link
Contributor Author

I noticed that the current Java test suite for Nucleotide includes a test that checks for exceptions being thrown. I did not include that test here but if I should, I see a pattern in here that i could use... https://github.com/exercism/x-common/blob/master/exercises/grains/canonical-data.json

@bunnymatic bunnymatic force-pushed the nucleotide-count-shared-test-data branch from 2639bde to c2ec6d6 Compare January 23, 2017 01:43
Copy link
Member

@petertseng petertseng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, these cases are looking good.

There are some cases that I would wonder about whether to add, all related to invalid inputs;

  • invalid nucleotide in strand of either function
  • invalid nucleotide in count

Past discussions appear to be:

So since we are adding the canonical data now, now seems as good a time as any to discuss whether these cases should be included. If so, let's get them added. If not, I think we can merge what's currently here as-is, right? Maybe nucleotideCounts("GGGGGGGG") would be helpful to show that the solution isn't just special-casing the empty dna strand and can still handle characters that are absent.

@bunnymatic
Copy link
Contributor Author

bunnymatic commented Jan 23, 2017 via email

@bunnymatic
Copy link
Contributor Author

So... if you're inclined to start a pattern or continue the pattern of "an exception was thrown", i'm happy to add something like:

   {
        "description": "this whatever raises an exception",
        "strand": "bogus",
        "expected": -1
    }

as needed. Should I proceed with that? Do we need a quorum of 👍 's?

@jtigger
Copy link
Contributor

jtigger commented Jan 23, 2017

While it hasn't gotten marked as "policy", there was something that looks like rough consensus with regards to exceptional values in #401:

{
  "description": "garbage input",
  "input": "100$#^0002",
  "expected": {
    "error": {
      "message": "Invalid input."
    }
}

@bunnymatic
Copy link
Contributor Author

bunnymatic commented Jan 23, 2017 via email

@bunnymatic
Copy link
Contributor Author

I've added a proposal for 2 new exception cases. One is handled (already) in the java tests for this exercise.

@jtigger
Copy link
Contributor

jtigger commented Jan 23, 2017

I like the simpler syntax.

@jtigger
Copy link
Contributor

jtigger commented Jan 23, 2017

Thanks for doing this, btw, @bunnymatic; every time someone codifies an exercise, an angel gets her wings. 👼

But seriously, it significantly reduces the work required to port. Even for tracks without a test generator, it's far far easier for both the contributor and the reviewer with this kind of data in hand. 🙇

@Insti
Copy link
Contributor

Insti commented Jan 23, 2017

I really don't like that this problem is trying to test 2 things.

Either it should count individual nucleotides, or count all of them.

Someone convince me that doing both is a good idea.

@petertseng
Copy link
Member

How came it to be this way?

Looks like through exercism/exercism@4c43852

Where'd the problem come from? http://rosalind.info/problems/dna/ - there, it actually just wants the four counts at once, never asking for the individuals. So asking for the individuals is our extension!

Should it stay this way?

I have no arguments to offer for either side.

I note that if going back to taking only all counts, this problem becomes quite similar to word-count. Compared to word count, there is no longer the obligation to split on words, but an added obligation to validate the input for valid nucleotides only.

@petertseng
Copy link
Member

validates nucleotides in requested nucleotide

validates nucleotides in the strand

Shall there also be a test for validating nucleotides in the strand for count, as well?

@Insti
Copy link
Contributor

Insti commented Jan 24, 2017

Counting them all at once seems like it makes for a more interesting problem with a couple of different approaches to discuss.

My vote is for only test for the case where all nucleotides are counted at once.

@Insti Insti changed the title Add canonical-data.json for Nucleotide Count nucleotide-count: Add canonical-data.json Jan 24, 2017
@bunnymatic
Copy link
Contributor Author

@petertseng - i'm down with another exception test around validate nucleotides during count. I'll wait until I know whether or not we're going to do the single "countAll" tests or both.

@bunnymatic
Copy link
Contributor Author

bunnymatic commented Jan 24, 2017

As for the two test discussion, I've seen (in reviewing this for other folks on the site) different approaches that may be a remnant of the problem statement and the test cases. I've seen folks solve it by writing count(nucleotide) = function() {} where that runs the strand and pulls out only that nucleotide and counts it up. then counts() simply runs through all known nucleotides and calls count(nuc). The other approach has been - counts() runs the whole strand and counts everything and stashes it on the instance. Then count just looks into this instance map. These two solutions are good for discussion points around memoization/caching, lookup speed etc.

If we went with the "count all" as the only tests, do you think we'd still get both of these solutions and would it reduce the possibility for discussion/learning around these exercises? food for thought.

@bunnymatic bunnymatic force-pushed the nucleotide-count-shared-test-data branch from 6b88d08 to 32f4315 Compare January 28, 2017 01:57
"strand": "",
"nucleotide": "A",
"expected": 0
},
{
"description": "dna strand with only guanine has no adenine",
"description": "strand with one nucleotide",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'one repeated nucleotide'

@Insti
Copy link
Contributor

Insti commented Jan 28, 2017

If we went with the "count all" as the only tests, do you think we'd still get both of these solutions and would it reduce the possibility for discussion/learning around these exercises?

It would not reduce the discussion possibility.
Both approaches are still applicable for 'count all'

}
},
{
"description": "strand with single nucleotide",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"repeated"

@bunnymatic bunnymatic force-pushed the nucleotide-count-shared-test-data branch from 561081f to a99bd0a Compare January 28, 2017 17:40
@bunnymatic
Copy link
Contributor Author

Given the big rewrite, this last commit is a squish leaving only the latest version of the data file around.

Copy link
Contributor

@Insti Insti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

@bunnymatic
Copy link
Contributor Author

Once this is merged, we could also close this https://github.com/exercism/todo/issues/116

Copy link
Member

@petertseng petertseng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An idea is to include a line of the form "Closes https://github.com/exercism/todo/issues/116" or "Closes https://github.com/exercism/todo/issues/116" in the commit message - then when we merge the commit, it will automatically close the issue due to https://help.github.com/articles/closing-issues-via-commit-messages/

I'm OK with these changes. It is a departure from what's currently in tracks, but anyone who wanted to argue for keeping both kinds of tests has had many days to voice their opinion.

So if nobody says anything at 24 hours since the last change, let's merge.

@petertseng petertseng merged commit ab2a8d5 into exercism:master Jan 30, 2017
@petertseng
Copy link
Member

Thanks!

petertseng added a commit to exercism/haskell that referenced this pull request Feb 1, 2017
It was decided that the exercise was doing too many things, and count
really is just one element of nucleotideCounts anyway.

exercism/problem-specifications#505
emcoding pushed a commit that referenced this pull request Nov 19, 2018
.gitignore simplecov output directory.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants