
Should this track use test generators? #1513


Closed · cmccandless opened this issue Sep 25, 2018 · 24 comments

@cmccandless (Contributor) commented Sep 25, 2018

Pros:

  • Consistent test format
  • (Generally) Much easier to add new tests from the canonical data

Cons:

  • Some exercises don't have canonical data (although this number is shrinking rapidly)
  • Creating test generators for new exercises (or for exercises without generators) can be more complicated than implementing the exercise itself.

Discussion commence... now.


TL;DR:

  • Test generators are (probably) a good idea
  • Use Jinja2 templates for code generation
  • Support exercise-specific templates
@yawpitch (Contributor)

Is there a working example of this for another language track?

Generally I'd say yes it would seem sensible to generate tests from the canonical data ... that said, with Python's rather rich and diverse -- and at times infuriatingly half-sensical -- set of assert statement variants, I foresee a great deal of difficulty reliably mapping canonical data into a generated test suite that wouldn't need sometimes substantial manual intervention after the fact.

@cmccandless (Contributor, Author)

Is there a working example of this for another language track?

Absolutely. The two I am most aware of are the C# and Ruby tracks.

I foresee a great deal of difficulty reliably mapping canonical data into a generated test suite that wouldn't need sometimes substantial manual intervention after the fact.

You're not wrong. The solution to this is to have a single base generator that might work for ~half of the exercises, then to extend/override the base generator for the exercises that require a more custom solution (example in my fork), or to just create individual generators from the start (probably overkill). This requires a heavier initial workload, but makes adding/modifying test cases trivial because one only needs to run the appropriate generator and verify that the generated code behaves correctly.
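
One possible shape for that base-plus-overrides idea, with hypothetical names (the fork linked above will differ):

class BaseGenerator:
    def test_case(self, case):
        # Covers the roughly half of exercises that are plain function calls.
        return "self.assertEqual({}(**{!r}), {!r})".format(
            case["property"], case["input"], case["expected"]
        )

class TwoBucketGenerator(BaseGenerator):
    def test_case(self, case):
        # Exercise-specific override for canonical data the base can't express.
        ...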

@yawpitch (Contributor)

Right, well looking at your example, an initial thought is that all the code generation is a good case for Jinja2. Once you've got a template for each of the various forms of assert, you can use the tools in Jinja2 to combine them and render them. Much simpler than directly writing all those newlines.
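
A minimal sketch of that idea, assuming hypothetical template and field names; the real macro file would live on disk rather than in a string:

from jinja2 import Environment

env = Environment()
template = env.from_string("""\
{% macro assert_equal(case) %}\
def test_{{ case["description"] | replace(" ", "_") | lower }}(self):
    self.assertEqual({{ case["property"] }}(), {{ case["expected"] }})
{% endmacro %}\
{% for case in cases %}{{ assert_equal(case) }}{% endfor %}""")

cases = [{"description": "say hi", "property": "hello", "expected": "'Hello, World!'"}]
print(template.render(cases=cases))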

@cmccandless (Contributor, Author)

I like it! I don't have the time to dive into that myself right now, but here are my thoughts:

  • Use Jinja2
  • Have a central generate.py <exercise> script that uses either:
    • An exercise-specific template, if available (hello_world.tmpl)
    • A general-use template (base.tmpl)

It would be really nice to have that generate.py and base.tmpl implemented in the next week or so. That way, there would be a whole chunk of issues ("validate base.tmpl for <exercise>" or "implement <exercise>.tmpl") ready for Hacktoberfest contributors!
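
A rough sketch of how that dispatch could look; the paths and names here are assumptions, and the real script would need proper canonical-data handling:

import json
import sys
from pathlib import Path

from jinja2 import Environment, FileSystemLoader

def generate(exercise):
    env = Environment(loader=FileSystemLoader("templates"))
    # Prefer an exercise-specific template; fall back to the general one.
    specific = Path("templates") / (exercise.replace("-", "_") + ".tmpl")
    template = env.get_template(specific.name if specific.exists() else "base.tmpl")
    data = json.loads(Path("canonical-data", exercise + ".json").read_text())
    out = Path("exercises", exercise, exercise.replace("-", "_") + "_test.py")
    out.write_text(template.render(**data))

if __name__ == "__main__":
    generate(sys.argv[1])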

@yawpitch (Contributor)

I'm not certain I'll have enough time free to dedicate to this in the next two weeks, but maybe.

What are the constraints, though? Specifically what version of Python would this need to be targeted to? And how freely can dependencies be added? I see a requirements-travis.yaml, but nothing else. The README indicates that the tests themselves need to run in 2.7 onwards, but does the test generator?

I'd consider a simple, leverage-what-we-can workflow to be: generate the file from a general template that imports another template of Jinja2 macros defining the blocks for specific kinds of tests, render that to disk without worrying about formatting, then kick off an autoformatter like Black to handle line length, etc. But Black, for instance, requires Python >= 3.6.
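
For the formatting step, something like this could sit after rendering; black.format_str is Black's programmatic entry point (API as of recent Black versions), and Black itself is what imposes the Python >= 3.6 floor:

import black

def render_and_format(template, context):
    source = template.render(**context)
    # Let Black handle line length and layout so the templates don't have to.
    return black.format_str(source, mode=black.Mode())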

@yawpitch (Contributor)

If we don't care about the autoformat idea, then Jinja2 itself is backwards-compatible, but we could quite easily get into some overlength lines and just uggo tests.

@cmccandless (Contributor, Author)

I would be open to autoformatting, and I don't think requiring Python >= 3.6 is unreasonable. I am not familiar with Black specifically, but it looks solid enough to count on; can I assume it conforms to PEP8?

@yawpitch (Contributor)

In their own words Black uses "a strict subset of pep-8"; it's a bit more opinionated than pep-8 (for instance it prefers double quotes to single quotes unless there are double quotes in the string itself), and it does a few things that flake8 for instance disagrees with.

Though maybe we should start with the Jinja2 part of things, and look at autoformatting later. When it comes down to it, glancing through a number of canonical data files, we're going to have our hands full just generating tests in any format.

@cmccandless (Contributor, Author)

Though maybe we should start with the Jinja2 part of things, and look at autoformatting later.

Fair enough. Having looked at test generation in the past, I would propose that there are 3 categories of test suites (see the sketch after this list):

  • Stateless: no state information is used; the test format is as simple as module.property(**input) (Ex: hello-world, bob, two-bucket)
  • Stateful: requires instantiation of some user object; slightly more complicated format (Ex: simple-cipher, kindergarten-garden)
  • Other: involves some sort of list of operations; very difficult to generate test code (Ex: react, zipper)
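
Roughly what generated tests in the first two categories might look like (illustrative only; the exercises come from the list above):

def test_hello_world(self):
    # Stateless: call a module-level function directly.
    self.assertEqual(hello(), "Hello, World!")

def test_simple_cipher_encode(self):
    # Stateful: instantiate the user's class first, then assert on a method.
    cipher = Cipher("abcdefghij")
    self.assertEqual(cipher.encode("aaaaaaaaaa"), "abcdefghij")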

@yawpitch (Contributor)

Yeah, there's just no obvious indication in the canonical data which is which.

For instance for hello-world the "exercise" would map to the module name and the "property" to a function to be imported from that module.

But for allergies the "exercise" could still map to the module name, but also to a class within that module, and the "property" in this case would map (via string transformation to conform to pep8) to a method on that class.

There's no way to know, programmatically, which type of generator would be required. Which sort of implies that the default template should be for the stateless variety and you'd need to special-case all the stateful ones. And for "other", I just dunno.
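
A hypothetical helper for that pep8 string transformation (not from either PR): canonical properties arrive camelCased ('allergicTo'), while Python methods are snake_cased ('allergic_to').

import re

def camel_to_snake(name):
    # Insert an underscore before each interior capital, then lowercase.
    return re.sub(r'(?<!^)(?=[A-Z])', '_', name).lower()

# camel_to_snake('allergicTo') -> 'allergic_to'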

@cmccandless (Contributor, Author)

Which sort of implies that the default template should be for the stateless variety and you'd need to special case all the stateful.

That's the conclusion I reached as well. Stateful could also have a generic template, of the form:

def test_stateful_format(self):
    # Derive the class name from the exercise slug: 'simple-cipher' -> 'SimpleCipher'.
    cls = getattr(module, exercise.title().replace('-', ''))
    obj = cls(**inputs)  # named 'inputs' to avoid shadowing the builtin 'input'
    # 'prop' holds the canonical "property" string naming the method under test.
    self.assertEqual(getattr(obj, prop)(), expected)

@cmccandless (Contributor, Author)

I started on the generic template and generator script here.

@yawpitch (Contributor)

Hah! Too fast man! I'm working on one too!

@cmccandless (Contributor, Author)

I wasn't sure if you were going to start working on it immediately or not; I don't mind taking a back seat in this one! Feel free to hijack anything from my work in progress that you think would be useful for yours.

@cmccandless (Contributor, Author)

@yawpitch Perhaps it would be wise to create a WIP PR for this?

@yawpitch (Contributor) commented Sep 28, 2018 via email

@yawpitch (Contributor)

Got a bit of time to take a pass at this today; still getting up to speed with everything in the canonical data format, and not sure how to handle conversion of named inputs to positional/keyword arguments on any sort of reliable basis. But it's early days.

@cmccandless (Contributor, Author)

not sure how to handle conversion of named inputs to positional / keyword arguments

A simple solution would be to treat all inputs as keyword arguments.
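
A sketch of that "everything is a keyword argument" rendering, with hypothetical helper names:

def format_call(prop, case_input):
    # Render the canonical "input" dict straight into the call site.
    kwargs = ", ".join("{}={!r}".format(k, v) for k, v in case_input.items())
    return "{}({})".format(prop, kwargs)

# format_call("hello", {}) -> "hello()"
# format_call("answer", {"question": "What is 1 plus 1?"})
#   -> "answer(question='What is 1 plus 1?')"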

@yawpitch (Contributor) commented Oct 1, 2018 via email

@cmccandless (Contributor, Author)

Also eliminates the problem of the input being defined in JSON in a type that has no inherent order in Python.

👍

@yawpitch (Contributor) commented Oct 2, 2018

Opened a new PR (#1547); this one is actually functional and will create tests, plus will autoformat them to pep-8 and verify that the generated code is viable ... this revealed some additional issues with using the dict in the canonical data to supply keyword arguments (POV, for instance, uses a Python reserved keyword as a name in the input, and ETL uses numerical keys, which are illegal names to pass into a function call).

Basically I stopped doing so much work to update the incoming dict and instead made the text processing filters available to the templates ... this should make it easier to build out templates that need to approach the data differently (i.e. teasing out a class name and instance methods).
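
A sketch of what exposing such a filter might look like; the names are illustrative, not the actual code in #1547:

import keyword

from jinja2 import Environment

def safe_name(name):
    # Dodge reserved words (e.g. 'from' -> 'from_') and numeric keys ('1' -> '_1').
    if keyword.iskeyword(name):
        return name + "_"
    if not name.isidentifier():
        return "_" + name
    return name

env = Environment()
env.filters["safe_name"] = safe_name  # used in a template as {{ key | safe_name }}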

@github-actions (Contributor)

This issue has been automatically marked as abandoned because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.


@cmccandless (Contributor, Author)

The test generator has been in place for quite a while now; this can be closed.
