initial proof-of-concept in a container #11

Open
mroth wants to merge 1 commit into Khan:master from mroth:containerize
Conversation

@mroth
Contributor

@mroth mroth commented Sep 21, 2017

This is a potential alternate way to look at distribution of a dev tool with dependencies using Docker, versus the current vendoring/packaging methodology.

Instead of trying to vendor/bundle all the python/node dependency files in a way that will work on most systems, just build a self-contained docker image. This frees us from having to worry at all about if/what versions of Python/NodeJS are on the machine.

When the linter is updated, new images would be uploaded to Docker Hub (using that vs a private GCR since this is an open source project), and end-users can docker pull khanacademy/khan-linter:latest (or a specific version/tag).
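
As a rough illustration of what running the pulled image could look like from a wrapper script, here is a minimal sketch. Only the image name comes from this description; the `/src` mount point and the assumption that the image's entrypoint is the linter and accepts file paths are hypothetical, as is the helper name `docker_lint_command`.

```python
import os


def docker_lint_command(files, image="khanacademy/khan-linter:latest",
                        source_dir=None):
    """Build the argv for linting `files` inside the container.

    Assumes (hypothetically) that the image's entrypoint is the linter
    and that it takes file paths relative to the working directory.
    """
    source_dir = os.path.abspath(source_dir or os.getcwd())
    return [
        "docker", "run", "--rm",
        # Hypothetical mount point: expose the checkout to the container.
        "-v", "{}:/src".format(source_dir),
        # Run from the mounted checkout so relative paths resolve.
        "-w", "/src",
        image,
    ] + list(files)
```

The returned list can be handed to `subprocess.call`, so no Python or NodeJS dependencies of the linter need to exist on the host.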

This would also provide an alternate way to run on a server without worrying about conflicting dependencies between this and the app it is linting.

Note I haven't actually pushed the built image to Docker Hub yet, will do so if/after this is merged.

Future enhancements

For convenience when tagging docker images, it would be good to be more consistent about updating the semantic version of releases, and then have the make image task here add the version number to the image tag. People could then more easily reason about what version they have on their system, especially when listing images with docker images or some such.
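
One way the version-tagging idea could work, sketched under assumptions: the helper names, the strict `MAJOR.MINOR.PATCH` check, and the choice to also tag `latest` are illustrative and not part of this PR. Passing `-t` more than once to `docker build` is standard Docker behavior.

```python
import re

IMAGE = "khanacademy/khan-linter"
# Hypothetical policy: accept only plain MAJOR.MINOR.PATCH versions.
SEMVER = re.compile(r"\d+\.\d+\.\d+")


def image_tags(version):
    """Return the tags a release build would apply to the image."""
    if not SEMVER.fullmatch(version):
        raise ValueError("not a MAJOR.MINOR.PATCH version: %r" % version)
    return ["{}:{}".format(IMAGE, version), "{}:latest".format(IMAGE)]


def build_command(version, context="."):
    """Build the `docker build` argv that applies every tag at once."""
    cmd = ["docker", "build"]
    for tag in image_tags(version):
        cmd += ["-t", tag]
    return cmd + [context]
```

With both tags applied, an image listing shows the concrete version next to `latest`.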

What would it mean to fully embrace this?

In a mystical future world if the docker image was to become the "one true way" to use this, then there are some cleanup tasks that would probably become worth looking at:

  • the py2 vs py3 branching and vendoring logic could be removed entirely.
  • the update_check can be removed entirely in favor of pulling the latest docker image (it currently won't fire within the container because it's not in a git repo, but the code could be removed for cleanliness then.)
  • the big vendored directories can be removed from this git repo, as with the associated Make tasks.

@cjfuller
Contributor

Can you elaborate a little more on how the py2 vs. py3 thing would be solved? I would think we would need a separate container for each. (The key problem here is not the version under which it has to run, but the version it has to lint. As far as I know you have to lint code of the same version of python you're running?)

@mroth
Contributor Author

mroth commented Sep 21, 2017

> Can you elaborate a little more on how the py2 vs. py3 thing would be solved? I would think we would need a separate container for each. (The key problem here is not the version under which it has to run, but the version it has to lint. As far as I know you have to lint code of the same version of python you're running?)

Oh interesting, I had assumed that was just there to handle running under different environments; I didn't realize our lint rules required running under the same version. That would obviously require some changes if true.

@cjfuller
Contributor

Yeah, as far as I know, the off-the-shelf python linters do version-matched linting. So, e.g., to lint our python3.6 code, we'd need to run under python3.6.
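
A small sketch of why this happens, assuming the linter parses source with the grammar of the interpreter it runs under (as tools built on Python's own parser do). The helper name is hypothetical; `compile` is the standard built-in.

```python
def parses_under_this_interpreter(source):
    """Return True if `source` is valid syntax for the running Python.

    The built-in compile() uses the grammar of the interpreter
    executing this function, not the version the code was written for.
    """
    try:
        compile(source, "<snippet>", "exec")
    except SyntaxError:
        return False
    return True
```

Under Python 3, `parses_under_this_interpreter('print "hello"')` is False because the Python 2 print statement is a syntax error there, so a linter in that situation reports a parse failure instead of real lint results.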

@mroth
Contributor Author

mroth commented Sep 21, 2017

Okay, that will require a different approach then. While we could have separate images for py2 vs py3, my assumption is it would probably be most convenient for users to have a single image that has both dependencies installed and could use the proper one via a flag or some such?
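
A minimal sketch of the single-image idea, assuming both interpreters are installed in the image as `python2` and `python3`. The `--python` flag, its default, and the `runlint.py` path are hypothetical choices for illustration, not an existing interface.

```python
import argparse

# Hypothetical interpreter names inside a single image that ships both.
INTERPRETERS = {"2": "python2", "3": "python3"}


def parse_args(argv):
    """Parse a hypothetical --python flag plus the files to lint."""
    parser = argparse.ArgumentParser(prog="runlint")
    parser.add_argument("--python", choices=sorted(INTERPRETERS), default="3",
                        help="Python major version of the code being linted")
    parser.add_argument("files", nargs="*")
    return parser.parse_args(argv)


def linter_command(argv):
    """Pick the interpreter matching the code under lint."""
    args = parse_args(argv)
    # "runlint.py" stands in for the real entry point (an assumption).
    return [INTERPRETERS[args.python], "runlint.py"] + args.files
```

A container entrypoint could then exec the returned command, so users pull one image and choose the interpreter per invocation.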

@carterjbastian

Wow! Very cool! This is a pretty sweet proof of concept that seems like it would make things significantly easier. I haven't worked on the linter, so I'm curious to hear what other changes we'd need to consider before committing to this way of doing things.

@csilvers
Member

(I commented in phabricator, but repeating here, ugh so confusing):

Cool! I am learning a lot already.

How long does it take to run runlint over a single file in the docker world as compared to directly? I ask because the git hooks/etc often run lint over just one or two files, if that's what's in your commit.
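
One way to measure this is sketched below. The helper name and the two example invocations in the comments are assumptions; no timings are claimed here.

```python
import subprocess
import time


def best_time(cmd, repeats=3):
    """Return the fastest wall-clock time, in seconds, of running `cmd`.

    Taking the minimum over a few runs reduces noise from cold caches.
    """
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        # Discard linter output; only elapsed time matters here.
        subprocess.call(cmd, stdout=subprocess.DEVNULL,
                        stderr=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return min(timings)


# Hypothetical comparison for one file (paths and entrypoint assumed):
#   direct = best_time(["runlint", "foo.py"])
#   docker = best_time(["docker", "run", "--rm", "-v", "/path/to/repo:/src",
#                       "-w", "/src", "khanacademy/khan-linter:latest",
#                       "foo.py"])
```

The difference between the two numbers is mostly container startup overhead, which matters most for the one-or-two-file case in git hooks.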

Member

@csilvers csilvers left a comment

Clearing out my github queue...
