simple-cipher: Add canonical-data.json #1241

gustavosobral · 2018-05-11T22:22:34Z

Can anyone clarify the CI build output for me? I see that it is failing, but I don't understand why.

As I understood from the canonical-data.json specification, I should be able to nest cases keywords.

For this problem specifically, there is a test case where we don't assert an exact value but instead use a match test to check a property against a regex, so I used a match key for it.

Closes #586

gustavosobral · 2018-05-12T14:13:12Z

Okay, I figured it out: the build output was not very clear. I was missing some required properties, and the output was too verbose to spot that.

petertseng

This is an interesting one to deal with, for a few reasons. I thank you for rising to the challenge.

There are a few cases where encode only has plaintext as an input, and other times when it has key and plaintext. I would think that in all cases both are needed, otherwise the test would seem to be underspecified. How would we be able to determine the ciphertext if we don't have both a plaintext and a key?

Same comment with decode.

I considered explicitly testing plaintexts with non-letter characters (which I assume would just get passed through?), but since they are already covered in https://github.com/exercism/problem-specifications/blob/master/exercises/atbash-cipher/canonical-data.json and https://github.com/exercism/problem-specifications/blob/master/exercises/rotational-cipher/canonical-data.json, I don't feel a particular need to include them in this exercise. I understand that there may be tracks that implement this exercise but neither Atbash cipher nor rotational cipher; I guess I can't defend my position in those cases.

petertseng · 2018-05-14T07:27:55Z

exercises/simple-cipher/canonical-data.json

+          "input": {
+            "plaintext": "aaaaaaaaaa"
+          },
+          "expected": "cipher.key"


This is interesting because we are to understand that this is not the literal string "cipher.key" but instead the key of the cipher, whatever that may be.

This therefore falls under one of the categories of #1225 .

This suggests some possible courses of action:

Make no change to this PR. It's up to readers of this JSON file to understand that certain fields are symbolic instead of literal strings:

The expected of every encode

The input.ciphertext of every decode

I'm not even sure that this list is correct

Make no change to this PR. It's up to readers of this JSON file to understand that all strings beginning with the seven characters cipher. are symbolic instead of literal strings.

Make a change to this PR that uses a different type (perhaps some JSON object) for such symbolic inputs and/or outputs.

Add your suggestion here

I will not yet express a preference ordering between these options.
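To make option 2 concrete, here is a minimal sketch of how a test generator might tell symbolic strings from literal ones. The `Cipher` dataclass, the `resolve` helper, and the `SYMBOLIC_PREFIX` constant are hypothetical names introduced only for illustration; nothing in this PR defines them.

```python
from dataclasses import dataclass


@dataclass
class Cipher:
    """Hypothetical stand-in for a track's cipher object; only `key` matters here."""
    key: str


# Assumption: every string starting with these seven characters is symbolic.
SYMBOLIC_PREFIX = "cipher."


def resolve(value, cipher):
    """Return the runtime value named by a symbolic string, or the literal unchanged."""
    if isinstance(value, str) and value.startswith(SYMBOLIC_PREFIX):
        attribute = value[len(SYMBOLIC_PREFIX):]
        return getattr(cipher, attribute)
    return value


cipher = Cipher(key="abcdefghij")
print(resolve("cipher.key", cipher))  # symbolic: resolved to the cipher's key
print(resolve("aaaaaaaaaa", cipher))  # literal: passed through unchanged
```

The weakness of this convention is visible in the sketch: a literal plaintext that happens to start with "cipher." could not be expressed, which is one argument for a distinct JSON type as in option 3.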

Make a change to this PR that uses a different type (perhaps some JSON object) for such symbolic inputs and/or outputs.

Add your suggestion here

As a sort of combination of the above, what about the following:

"expected": { "match": "^{cipher.key}$" }

Where {symbol} is to be interpolated with local variables before matching is applied.

On the surface, this approach seems to simply add complication to the current syntax. However, if this is to act as a precedent for future situations, the interpolation approach allows multiple variable values to be used.

Examples of such use:

"^{name.first} {name.last}$"

"^{a + b / c}$"

However, if my fellow maintainers think this is too complicated, my vote would be option 1.
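As a rough illustration of the interpolation idea, the sketch below substitutes `{symbol}` placeholders before matching. It assumes that symbols are dotted names looked up in nested dictionaries and that interpolated values should be regex-escaped; the helper names `lookup`, `interpolate`, and `matches` are hypothetical. Expressions such as `{a + b / c}` are deliberately not handled here.

```python
import re

# Placeholders must start with a letter or underscore, so regex
# quantifiers such as a{2} are left untouched.
PLACEHOLDER = re.compile(r"\{([A-Za-z_][A-Za-z0-9_.]*)\}")


def lookup(path, variables):
    """Follow a dotted path such as 'cipher.key' through nested dicts."""
    value = variables
    for part in path.split("."):
        value = value[part]
    return value


def interpolate(pattern, variables):
    """Replace each {symbol} with the regex-escaped value it names."""
    return PLACEHOLDER.sub(
        lambda m: re.escape(str(lookup(m.group(1), variables))), pattern
    )


def matches(pattern, variables, actual):
    """Interpolate the pattern, then require the whole string to match."""
    return re.fullmatch(interpolate(pattern, variables), actual) is not None


variables = {"cipher": {"key": "abcdefghij"}}
print(matches("^{cipher.key}$", variables, "abcdefghij"))
```

Escaping the interpolated value is a design choice: without it, a key containing regex metacharacters would change the meaning of the pattern.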

This one is very specific to the canonical-data specification/syntax. I believe you guys know way more than me about where we are and where we should go. I'm completely open to suggestions.

petertseng · 2018-05-14T07:30:11Z

exercises/simple-cipher/canonical-data.json

+          "expected": "aaaaaaaaaa"
+        },
+        {
+          "description": "It is reversible. I.e., if you apply decode in a encoded result, you must see the same plaintext encode parameter as a result of the decode method",


This test is useful to show the student what is possible. However, it is not possible for the test to reject any implementation that the previous two tests have not already rejected, since it just composes the previous two tests together.

Thus, the question is: Is the advantage of showing the student the possibilities worth the disadvantage of the test that does not help catch any more mistakes?

I personally thought it was not worth keeping this test. Let me know your thoughts on it.

The first test guarantees that if we pass aaaaaaaaaa as the argument to encode, the result should be the same as the key. The second one guarantees that if we pass the key as the argument to decode, we get aaaaaaaaaa as the result. This is the first test that uses a different plaintext.

I believe it is good to explicitly verify that the algorithm is reversible, since that is a characteristic of it. But you're right: if encode and decode are already verified with valid "reversible" outputs, the composition of encode and decode must be reversible.

I personally like these reversible test cases, but I see your point.
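For readers weighing this trade-off, here is a minimal sketch of what the round-trip check amounts to. It assumes a lowercase-only shift cipher with the key passed explicitly; `shift`, `encode`, `decode`, and `is_reversible` are illustrative names, not the canonical API.

```python
def shift(text, key, direction):
    """Shift each letter by the matching key letter, cycling the key."""
    result = []
    for index, char in enumerate(text):
        offset = ord(key[index % len(key)]) - ord("a")
        shifted = (ord(char) - ord("a") + direction * offset) % 26
        result.append(chr(shifted + ord("a")))
    return "".join(result)


def encode(plaintext, key):
    return shift(plaintext, key, +1)


def decode(ciphertext, key):
    return shift(ciphertext, key, -1)


def is_reversible(plaintext, key, enc=encode, dec=decode):
    """Round-trip property: decoding an encoded text returns the plaintext."""
    return dec(enc(plaintext, key), key) == plaintext


def identity(text, key):
    """A deliberately wrong implementation that leaves the text unchanged."""
    return text


print(is_reversible("abcdefghij", "abcdefghij"))
# The broken pair still satisfies the round-trip property on its own.
print(is_reversible("abcdefghij", "abcdefghij", enc=identity, dec=identity))
```

The second call shows why the round-trip test cannot replace the individual encode and decode tests: a pair of do-nothing functions is trivially reversible.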

petertseng · 2018-05-14T07:32:13Z

exercises/simple-cipher/canonical-data.json

+      "description": "Incorrect key cipher",
+      "cases": [
+        {
+          "description": "It throws an error with an all uppercase key",


Since not all languages throw errors, I am considering asking for more language-agnostic wording. For example, "it rejects an all-uppercase key" or "it treats an all-uppercase key as an error" or something.

Good point! I think it would have been wrong if I had written "it throws an exception", which is very specific. But throwing a program error seems quite universal to me: it's language-agnostic and operating-system oriented.

For sure there is room for improvement, I liked your second suggestion more.

What do you think about my points?

petertseng · 2018-05-14T07:33:36Z

exercises/simple-cipher/canonical-data.json

+      "cases": [
+        {
+          "description": "It throws an error with an all uppercase key",
+          "property": "new",


This is interesting since new might too strongly imply an object-oriented interface, which would raise questions as to how this file is to be understood by languages less amenable to object-oriented interfaces.

I don't have a concrete suggestion yet. Let's see if there are any.

Good point, this is a tricky one.

I think the problem is that we're not testing the constructor method itself, but verifying that assigning an invalid key to key must cause an error, which is done in the constructor here.

One option would be to set the property to key, or maybe something like key= would also be valid for me. I don't have any better or more concrete suggestion either.

We could also use the encode property and name the case something like Encode using uppercase key.

petertseng · 2018-05-14T07:36:48Z

exercises/simple-cipher/canonical-data.json

+          "expected": { "error": "Bad key" }
+        },
+        {
+          "description": "It throws an error with empty key",


I understand that we can immediately know, upon providing an empty key, that we will never be able to encipher or decipher any texts, so we can immediately reject the empty key.

~~What about cases where the key is shorter than the text? How are they handled?~~

For anyone asking that question, the answer is in the currently last test case, entitled "It can handle messages longer than the key"

Yes, you're right
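The two points in this thread (rejecting bad keys up front, and handling messages longer than the key) can be sketched as follows. This assumes that valid keys are non-empty and all lowercase, and that a short key is simply repeated; `validate_key` and `stretch_key` are hypothetical helper names. Raising `ValueError` is only the Python idiom; other tracks may signal the error differently.

```python
from itertools import cycle, islice


def validate_key(key):
    """Reject empty keys and keys containing anything but lowercase a-z."""
    if not key or not all("a" <= char <= "z" for char in key):
        raise ValueError("Bad key")
    return key


def stretch_key(key, length):
    """Repeat a valid key until it covers a message of the given length."""
    return "".join(islice(cycle(validate_key(key)), length))


print(stretch_key("abc", 7))

for bad_key in ("", "ABCDEF"):
    try:
        validate_key(bad_key)
    except ValueError as error:
        print(repr(bad_key), "rejected:", error)
```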

petertseng · 2018-05-14T07:37:24Z

exercises/simple-cipher/canonical-data.json

+          "expected": "aaaaaaaaaa"
+        },
+        {
+          "description": "It is reversible. I.e., if you apply decode in a encoded result, you must see the same plaintext encode parameter as a result of the decode method",


Same question as above: is showing the possibilities worth a test that catches no extra mistakes?

Please look at my reply in the other review.

gustavosobral · 2018-05-15T06:17:02Z

Thank you so much for your extensive feedback @petertseng! I will get back to it as soon as I can

gustavosobral · 2018-05-15T21:47:32Z

How would we be able to determine the ciphertext if we don't have both a plaintext and a key?

If we pass aaaaaaaaaa as parameter to the encode method, it should return the same value as the key (which can be randomly generated).
If we pass the key as argument to the decode method, it should return aaaaaaaaaa.

So you don't necessarily need both values for testing.
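A minimal sketch of that reasoning, assuming a lowercase shift cipher in which the letter a shifts by zero: encoding a run of a's reproduces the key, whatever the random key turns out to be. The function names and the key length of 10 are assumptions for illustration, and the key is passed explicitly here rather than held in a cipher object.

```python
import random
from string import ascii_lowercase


def random_key(length):
    """Generate a random lowercase key of the requested length."""
    return "".join(random.choices(ascii_lowercase, k=length))


def shift(text, key, direction):
    """Shift each letter by the matching key letter, cycling the key."""
    result = []
    for index, char in enumerate(text):
        offset = ord(key[index % len(key)]) - ord("a")
        shifted = (ord(char) - ord("a") + direction * offset) % 26
        result.append(chr(shifted + ord("a")))
    return "".join(result)


def encode(plaintext, key):
    return shift(plaintext, key, +1)


def decode(ciphertext, key):
    return shift(ciphertext, key, -1)


key = random_key(10)  # the length is an assumption, chosen to mirror the thread
all_a = "a" * len(key)

# The test never needs to know the key's value in advance.
print(encode(all_a, key) == key)
print(decode(key, key) == all_a)
```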

ErikSchierboom

A general remark: I'm not a huge fan of starting every test case description with "it". To me, that does not add anything of value to the description. As an example, the second test case is named "It can encode". What does "it" refer to? I would much prefer it to be named something like "Encode using random key". Same goes for the other descriptions that start with "it".

ErikSchierboom · 2018-05-16T06:36:51Z

exercises/simple-cipher/canonical-data.json

+  "version": "1.0.0",
+  "cases": [
+    {
+      "description": "Random key cipher",


Reading the exercise's description, the random key part is actually the third and last step in the description. To me, this implies that the test cases involving the random key should also be moved to the bottom of the test cases.

The third step is making it fault tolerant. The first step is the cipher without key assignment, that's the random key. But yeah, I will organize the order.

@gustavosobral Is it correct that you have not yet applied the re-ordering?

The first step is the cipher without a key, so that's the Random key set of cases. The second step is the cipher accepting keys; this is the Substitution set of cases. The third step is making it fault tolerant; that's the Incorrect key set of cases. I believe they are in order now, no?

Aha, I think I did not explain it correctly. What I meant was not to re-order the individual test cases within the "Random key cipher" case, but to move the "Random key cipher" case itself. At the moment, the order is:

"Random key cipher"

"Substitution cipher"

"Incorrect key cipher"

I would argue that the first case should actually be the last, as it is a special case of the "Substitution cipher" case. Furthermore, the README also lists it as the third case.

Oh, I see what you mean. So if I understood correctly, your suggestion is to change the order to:

Substitution cipher

Random key cipher

Incorrect key cipher

Right? I think we all agree that the last step is making it fault tolerant against incorrect keys?

I see your point and agree that we can see the Random key part as an expansion of the Substitution part, but the thing is that the exercise's description.md explicitly says that step 2 is the random key cipher:

Step 2

Shift ciphers are no fun though when your kid sister figures it out. Try amending the code to allow us to specify a key and use that for the shift distance. This is called a substitution cipher.

So the exercise description treats accepting a key parameter as an expansion of the randomly generated case, which I also believe is correct. Do you see my point here?

Would you still suggest the re-order in this case?

Thank you so much for the feedback and sorry for the late reply.

Ah, I then misread the description. It's fine to leave it as-is then.

ErikSchierboom · 2018-05-16T06:42:03Z

exercises/simple-cipher/canonical-data.json

+      "cases": [
+        {
+          "description": "It throws an error with an all uppercase key",
+          "property": "new",


We could also use the encode property and name the case something like Encode using uppercase key.

ErikSchierboom · 2018-05-16T06:46:00Z

exercises/simple-cipher/canonical-data.json

+      "description": "Substitution cipher",
+      "cases": [
+        {
+          "description": "It keeps submitted key",


Is this a useful test? The current format subtly implies state (keeping track of the key). Why not modify the input for the other test cases to have the key explicitly passed as an input value? This is how virtually all canonical data cases are set up: all input is specified within the test case itself. As an example, the next case would then look like this:

{
  "description": "It can encode",
  "property": "encode",
  "input": {
    "plaintext": "aaaaaaaaaa",
    "key": "abcdefghij"
  },
  "expected": "abcdefghij"
},

Very good point. Thank you
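To show why the stateless shape is convenient, here is a sketch of a generic runner that dispatches on property and passes input straight through as keyword arguments. The `encode` function, the `PROPERTIES` table, and `run_case` are hypothetical; the case itself is the one suggested above.

```python
def encode(plaintext, key):
    """Illustrative lowercase shift cipher; the key is cycled over the text."""
    result = []
    for index, char in enumerate(plaintext):
        offset = ord(key[index % len(key)]) - ord("a")
        result.append(chr((ord(char) - ord("a") + offset) % 26 + ord("a")))
    return "".join(result)


# Map each canonical "property" name to the function under test.
PROPERTIES = {"encode": encode}

case = {
    "description": "It can encode",
    "property": "encode",
    "input": {"plaintext": "aaaaaaaaaa", "key": "abcdefghij"},
    "expected": "abcdefghij",
}


def run_case(case):
    """Call the function named by the case with its inputs; compare to expected."""
    function = PROPERTIES[case["property"]]
    return function(**case["input"]) == case["expected"]


print(case["description"], "passed" if run_case(case) else "failed")
```

Because every input lives inside the case, the runner needs no knowledge of earlier cases or of any object that was constructed before it.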

gustavosobral · 2018-05-23T21:05:52Z

We could also use the encode property and name the case something like Encode using uppercase key.

It's not only the encode method that should fail; decode should too. What do you think? (I don't know why, but I could not comment on this directly in your review. Sorry.)

A general remark: I'm not a huge fan of starting every test case description with "it". To me, that does not add anything of value to the description.

I totally agree.

Thanks for the feedback @ErikSchierboom

gustavosobral · 2018-05-23T21:12:20Z

Thank you guys for the extensive feedback! Please check my comments and my changes @petertseng @cmccandless and @ErikSchierboom

ErikSchierboom · 2018-05-24T10:34:43Z

Great set of improvements! I've just added a small question.

ErikSchierboom · 2018-05-30T12:17:26Z

@petertseng @rpottsoh Would you mind reviewing this PR in the current state?

rpottsoh · 2018-05-31T23:55:17Z

Overall I think this is great! Initially I was confused by the use of symbolic and literal strings. I don't think I have seen this used before in the test data, and it caught me by surprise. I didn't review all the comments above before reviewing the JSON; I just dove in. I think it could be worthwhile to include a comment of sorts at the beginning of the JSON to indicate that some of the strings are symbolic, and perhaps to indicate which ones.

Once I finally reviewed the comments and the README the JSON made perfect sense.

Thanks @gustavosobral for working on this. 👍

gustavosobral · 2018-06-02T11:05:53Z

Thank you for the feedback @rpottsoh. It makes total sense to add comments in the file.

I just did it, thanks.

rpottsoh · 2018-06-02T12:14:41Z

exercises/simple-cipher/canonical-data.json

+  "exercise": "simple-cipher",
+  "version": "1.0.0",
+  "comments":
+    ["Some of the strings used in this file are symbolic and doesn't represent their literal value. They are:",


Use "do not" instead of "doesn't".

rpottsoh

Thanks for adding the comment at the beginning.

ErikSchierboom · 2018-06-03T12:32:15Z

@petertseng Are you interested in reviewing this PR? If not, no worries of course.

ErikSchierboom · 2018-06-04T06:51:39Z

With three approvers, I think we are now ready to merge this. Thanks a lot @gustavosobral!

gustavosobral added 2 commits May 12, 2018 00:18

Add simple-cipher canonical-data

6158197

Fix schema validation adding missing properties

289dab2

gustavosobral changed the title ~~Add simple-cipher canonical-data.json~~ simple-cipher: Add canonical-data.json May 14, 2018

petertseng reviewed May 14, 2018


ErikSchierboom reviewed May 16, 2018


Review

92cc456

cmccandless approved these changes May 24, 2018


ErikSchierboom approved these changes May 29, 2018


Add comments to the exercise definition

80d76a2

rpottsoh reviewed Jun 2, 2018


Grammar comment fix

9cb43e5

rpottsoh approved these changes Jun 2, 2018


ErikSchierboom merged commit 267f1ea into exercism:master Jun 4, 2018

cmccandless mentioned this pull request Jun 5, 2018

simple-cipher: update tests to canonical data 1.0.0 exercism/python#1399

Closed

petertseng mentioned this pull request Mar 26, 2019

Schema optimises for testing f(inputs) = output. What of other types of tests? #1225

Closed
