Incomplete specification for run-length encoding? #238

Closed
remcopeereboom opened this issue Apr 26, 2016 · 7 comments

Comments

@remcopeereboom
Contributor

remcopeereboom commented Apr 26, 2016

Unless I'm mistaken, the tests for the run-length encoding problem don't account for numerical data. Single character sequences are not prefixed with a number and runs of characters are not prefixed with an escape flag. This leads to ambiguity when dealing with strings like these: 22333, which encodes to 22333, but what should it decode to: 22333 or (2233 * 3) or 2(23 * 3) or one of the other possibilities?
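The ambiguity can be reproduced with a minimal sketch of the scheme as described above: runs longer than one get a count prefix, single characters are left bare. The function names `encode` and `decode` are illustrative assumptions, not taken from any track's implementation.

```python
from itertools import groupby
import string

DIGITS = set(string.digits)  # ASCII digits only


def encode(text: str) -> str:
    """Prefix each run longer than one with its length; leave singles bare."""
    out = []
    for char, group in groupby(text):
        run_length = len(list(group))
        out.append(f"{run_length}{char}" if run_length > 1 else char)
    return "".join(out)


def decode(encoded: str) -> str:
    """Greedy decoder: consecutive digits form the count for the next character."""
    out = []
    count = ""
    for char in encoded:
        if char in DIGITS:
            count += char
        else:
            out.append(char * int(count or "1"))
            count = ""
    # Trailing digits with no character after them are silently dropped.
    return "".join(out)
```

With these definitions, `encode("2A")` and `encode("AA")` both produce `2A`, so no decoder can recover both inputs, and `decode(encode("22333"))` does not return `22333`.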

@Cohen-Carlisle
Member

The thought also occurred to me but I assumed it was intentional.
Perhaps a precondition of no digit characters should be explicitly called out if that is the case.

I'd imagine we could run into some interesting problems with the metadata otherwise as different languages may have preferred ways to handle this (multidimensional arrays vs tuples vs escapes in strings, etc.)

@remcopeereboom
Contributor Author

remcopeereboom commented Apr 27, 2016

Perhaps a precondition of no digit characters should be explicitly called out if that is the case.

Really any choice is fine, as long as it's specified. That said, I'm not keen on disallowing numbers, given that the code explicitly handles Unicode characters. The simplest solution would be to also prefix single-character sequences with a count of 1. That does tend to defeat the optimization in most practical applications, however.

@Cohen-Carlisle
Member

Wouldn't including 1s still not solve the problem of numbers? As per your example, 2233 is ambiguous: is it "233" * 2 or "33" * 22, etc.? I like the idea of simply calling out that the approach currently outlined only works on inputs without digits.
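A small sketch of why always writing the count does not help on its own, as long as counts may span several digits. The name `encode_with_counts` is hypothetical and only serves this illustration.

```python
from itertools import groupby


def encode_with_counts(text: str) -> str:
    """Write every run as <count><char>, including runs of length one."""
    return "".join(f"{len(list(group))}{char}" for char, group in groupby(text))


# Two different inputs that collide on the same encoding:
short_input = "1AA"      # one '1', then two 'A's
long_input = "A" * 112   # a single run of 112 'A's
collision = encode_with_counts(short_input) == encode_with_counts(long_input)
```

Both inputs encode to `112A`, which can be read either as `11` + `2A` or as the count `112` followed by `A`, so digits in the input remain ambiguous under this variant.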

@remcopeereboom
Contributor Author

Yes, you are completely right. I missed that.

@behrtam
Contributor

behrtam commented Nov 3, 2016

I think adding 1s (always including the count) and limiting runs to chunks of at most 9 would solve the problem, because you would get a clear count/character pattern. So 2233 would always decode to 2 * '2' + 3 * '3' => '22333', but you would also get encode('11') == '1111', and if you get more than 9 consecutive identical characters, it gets quite complicated: '9W3WB9B3W3...'.

There are already 9 tracks that implemented this exercise without 1s, so maybe the easier way would be to change the readme instead of all the implementations.
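The fixed-width idea above (always a count, at most 9 per chunk) could be sketched roughly as follows. This is only an illustration under that assumption; `encode_pairs`, `decode_pairs`, and `MAX_RUN` are made-up names, and none of the existing tracks is known to work this way.

```python
from itertools import groupby

MAX_RUN = 9  # a single digit, so every chunk is exactly two characters


def encode_pairs(text: str) -> str:
    """Encode as <digit><char> pairs, splitting long runs into chunks of at most 9."""
    out = []
    for char, group in groupby(text):
        remaining = len(list(group))
        while remaining > 0:
            chunk = min(remaining, MAX_RUN)
            out.append(f"{chunk}{char}")
            remaining -= chunk
    return "".join(out)


def decode_pairs(encoded: str) -> str:
    """Decode strict <digit><char> pairs; the character itself may be a digit."""
    if len(encoded) % 2 != 0:
        raise ValueError("encoded text must consist of count/character pairs")
    out = []
    for i in range(0, len(encoded), 2):
        count, char = encoded[i], encoded[i + 1]
        if count not in "123456789":
            raise ValueError(f"invalid count {count!r} at position {i}")
        out.append(char * int(count))
    return "".join(out)
```

Because every pair has a fixed width, `decode_pairs("2233")` is unambiguously `22333`, and a run of twelve `W`s becomes `9W3W`. The cost is that input without repeats doubles in length.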

@Insti
Contributor

Insti commented Nov 6, 2016

I think that in order to keep the exercise simple, it would be better to not include numbers.

Explicitly mentioning that it only needs to handle letters in the Readme is probably a good idea.

Here is the description.md

@robkeim
Contributor

robkeim commented Jan 31, 2017

I just sent out #530 with the clarification on the simplified input as @Insti suggested.
