Package version mechanism incompatible with AWS Lambda #584

danie1k opened this issue Jul 16, 2019 · 20 comments
Labels
Invalid Not a bug, PEBKAC, or an unsupported setup

Comments

@danie1k

danie1k commented Jul 16, 2019

Tested on AWS Lambda. The following code does not work there because the deployment has no package manager:

https://github.com/Julian/jsonschema/blob/e4fa34f6517895a81e5ba7e648dc0796f25f9b21/jsonschema/__init__.py#L32-L33

It was added in commit b07d0f1.

The stack trace I'm getting:

Traceback (most recent call last):
  File "/var/task/wsgi_handler.py", line 44, in import_app
    wsgi_module = importlib.import_module(wsgi_fqn_parts[-1])
  File "/var/lang/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 678, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/var/task/index.py", line 11, in <module>
    (...)
    from jsonschema.exceptions import ValidationError
  File "/var/task/jsonschema/__init__.py", line 33, in <module>
    __version__ = get_distribution(__name__).version
  File "/var/task/pkg_resources/__init__.py", line 481, in get_distribution
    dist = get_provider(dist)
  File "/var/task/pkg_resources/__init__.py", line 357, in get_provider
    return working_set.find(moduleOrReq) or require(str(moduleOrReq))[0]
  File "/var/task/pkg_resources/__init__.py", line 900, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/var/task/pkg_resources/__init__.py", line 786, in resolve
    raise DistributionNotFound(req, requirers)
pkg_resources.DistributionNotFound: The 'jsonschema' distribution was not found and is required by the application
START RequestId: xxx-xxx-xxx Version: $LATEST
module initialization error: Unable to import index.api

END RequestId: xxx-xxx-xxx
module initialization error
Unable to import index.api
danie1k changed the title from "Package version mehcanism incompatible with serverless environment" to "Package version mechanism incompatible with serverless deployment" on Jul 16, 2019
@Julian
Member

Julian commented Jul 20, 2019

Hi.

If lambda doesn't actually install the packages, that indeed wouldn't be a supported configuration.

@danie1k
Author

danie1k commented Jul 22, 2019

I don't know how it works on Azure or other serverless cloud providers, but in AWS Lambda all third-party packages are just zipped together with your code and sent to Amazon's servers.
So they all share the same root directory, which in my stack trace shows up as /var/task/.

The following files are from my project and exist in the repository:

  • /var/task/wsgi_handler.py
  • /var/task/index.py

And these are third-party packages, normally installed through pip:

  • /var/task/jsonschema/
  • /var/task/pkg_resources/
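The flat layout described above can be inspected mechanically. The following is a minimal sketch, assuming packages sit directly under one directory (such as /var/task/) and that any surviving metadata follows the wheel-style "<name>-<version>.dist-info" naming; the helper name `find_dist_info` is hypothetical and not part of any library.

```python
from pathlib import Path


def find_dist_info(root, dist_name):
    """Return names of metadata entries for dist_name found directly under root.

    Assumption: metadata is stored as "<name>-<version>.dist-info" (or the
    older ".egg-info") next to the package. Only names are matched; the
    metadata itself is not parsed.
    """
    prefix = dist_name.lower().replace("-", "_") + "-"
    return sorted(
        entry.name
        for entry in Path(root).iterdir()
        if entry.name.lower().startswith(prefix)
        and entry.name.endswith((".dist-info", ".egg-info"))
    )
```

An empty result for a package whose source directory is present would be consistent with the DistributionNotFound error in the stack trace.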

@harshavardhangsv

Hi @danie1k, I had the same error. In my case it was because I had enabled slim mode in my Serverless configuration. Disabling slim mode there fixed it for me. Hope it helps!

@Julian
Member

Julian commented Jul 23, 2019

Yeah I'd say overall this is something that has to be handled within AWS, with however they manage to support "normal" package installations.

@harshavardhangsv thanks for chiming in with the specific suggestion on what to tweak there!

Going to close this, but if any others have this issue (or have suggestions for what's needed on the AWS side) I'm sure others finding this will appreciate it.

Julian closed this as completed on Jul 23, 2019
@danie1k
Author

danie1k commented Jul 24, 2019

@harshavardhangsv thanks for the tip! But...

The slim parameter makes the code package smaller so that it contains only the files the application needs to run in production; deploying e.g. tests to production is not a good idea, IMO.

To remove the tests, information and caches from the installed packages, enable the slim option. This will: strip the .so files, remove pycache and dist-info directories as well as .pyc and .pyo files.

@Julian With slim: false, the package can grow to several times its size. Do we really want to force people to do that for their whole applications, only because one single package uses a non-standard (however brilliant) method for setting one parameter? That parameter is not even used by most developers and is not important in everyday programming. It really matters only for deployment mechanisms and package managers.

More information on the slim flag: https://www.npmjs.com/package/serverless-python-requirements#slim-package.

@Julian
Member

Julian commented Jul 25, 2019

I don't think this is as uncommon as you're making it out to be. jsonschema uses the same "style" of versioning as it always has (and over the past major version all that happened was switching from one tool that did so to another, setuptools-scm, which is even a PyPA "official" way to do so).

I of course want things to work for as many users as possible, but AWS is doing nonstandard things here, so if anything I look at it in reverse: should one single relatively uncommon platform (AWS Lambda) constrain any package looking to do Python versioning in "modern" ways (modern meaning literally available for the past 5 years :), or should Lambda just fix itself (or be used in a different configuration)? E.g., if all slim does is ship source code directly, why not instead ship the output of pip install --target, or use any of the other ways to actually build things before they reach the server? I don't know enough about AWS Lambda here, but I do feel jsonschema is not doing anything out of the ordinary.

@zgoda

zgoda commented Sep 12, 2019

Disabling slim mode did not change anything for me; I'm still getting "The 'jsonschema' distribution was not found and is required by the application".

@Julian
Member

Julian commented Sep 12, 2019

Hard to help without seeing what commands you (or Amazon) run to install the package, and what output they produced.

@zgoda

zgoda commented Sep 12, 2019

From the deployment instructions in the AWS docs, I can only assume nothing is "installed".

https://docs.aws.amazon.com/lambda/latest/dg/lambda-python-how-to-create-deployment-package.html#python-package-dependencies

EDIT: everything works fine with 2.6.0.

@cwells

cwells commented Jun 4, 2020

Can you please add a note to the front page of the documentation that this library is not Lambda-compatible? I just threw a few hours down the drain that could have easily been avoided were this prominently noted.

Also, on a side-note, labeling Lambda an "uncommon" platform is perhaps a bit out-of-touch. My current project is about 90% lambda code vs traditional server instances and I don't believe my approach is unusual.

In any case, breaking backwards-compatibility without a really good reason is always a bad idea.

For my own use, I've pinned this package to 3.0.2 and I guess I will be forced to maintain a fork going forward.

@Julian
Member

Julian commented Jun 17, 2020

> In any case, breaking backwards-compatibility without a really good reason is always a bad idea.

@cwells AWS Lambda was never supported by this library. The only supported platforms are those that run under CI.

@cwells

cwells commented Jul 29, 2020

@Julian It may not have been supported, but it worked. My point is that it no longer works, and it's unclear to me what tangible benefit was obtained by the change that broke it.

You may not have planned on having lambda support, but you got it for free. Why throw it away?

@glyph

glyph commented Aug 1, 2020

@cwells The change which broke this was a migration from vcversioner (abandoned; last released on 2016-04-12; maintained by an individual) to importlib-metadata and setuptools-scm (supported; recently released; maintained by the usual suspects in the python packaging cabal). Using actively maintained tools for packaging has a whole slew of benefits.

Rather than maintaining a fork, your energy would probably be better spent contributing upstream fixes to those libraries to make them compatible with Lambda. Better yet, ask the Lambda Python team to engage more actively with the packaging community and to develop tooling integration that works with Python's evolving PEP-specified toolchain, which solves a whole bunch of problems for tons of other platforms, rather than mandating the somewhat odd "installed but not installed" artifact shape for Python, which creates lots of these problems. I'm mostly familiar with the problems rather than the solutions in this space (which is why I avoid Lambda), but it looks like there are others actively engaged with this problem; for example, maybe the right place for the fix to land is in https://pypi.org/project/lambda-setuptools/.
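For readers unfamiliar with the mechanism under discussion: a package that reads its own version from installed metadata can only do so if that metadata (the *.dist-info directory) was shipped. Below is a minimal sketch using the standard library's `importlib.metadata` (Python 3.8+). The fallback is added purely for illustration; this is not jsonschema's actual code, and `lookup_version` is a hypothetical name.

```python
from importlib import metadata


def lookup_version(dist_name, default="0+unknown"):
    """Return the installed version of dist_name, or default if no metadata exists."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # Raised when no dist-info/egg-info for dist_name is on sys.path,
        # e.g. when a packaging step stripped it from the deployment.
        return default
```

Without the `except` branch, the lookup raises at import time, which is the same failure mode as the DistributionNotFound error reported in this issue.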

Julian changed the title from "Package version mechanism incompatible with serverless deployment" to "Package version mechanism incompatible with AWS Lambda" on Aug 1, 2020
@cwells

cwells commented Aug 2, 2020

@glyph If I had time to contribute, I would concur. Unfortunately I'm engaged in developing a startup where I am the only developer, so I don't have the resources to commit to any of the above. This is doubly true since Python packaging isn't an area I have any particular expertise in. I selected this library at a time it was compatible, only to have it break underneath of me during an update.

I'm certainly willing to file a bug report with AWS and will do so, but at this time, a fork appears to be the least time-consuming option, since cherry-picking commits to the library is much more straightforward than diving into the mess that is Python packaging.

I'd also add that of the Python libraries I utilize (which are numerous), this is the only one to cause any sort of problem, hence my skepticism about the value of the change.

@glyph

glyph commented Aug 2, 2020

I can sympathize with these sentiments. Based on my understanding of the Lambda & Python Packaging ecosystems, unless something changes to make them work together more smoothly, I would expect to see this sort of breakage propagate through more libraries as people adopt more modern packaging standards. (Probably if you just ignore it, eventually someone on either the Lambda or the open source side will fix it, since the platform will get less and less useful if lots of libraries start breaking it.)

@cwells

cwells commented Aug 3, 2020

I did a little reading last night, and came across this comment in setuptools-scm:

pypa/setuptools-scm#143 (comment)

setuptools-scm doesn't appear particularly well-documented, but it seems it has some options for writing out the version to the filesystem during installation rather than doing it dynamically at runtime.

Maybe this would be a solution? It seems calculating a static value at runtime isn't terribly efficient anyway.

I'd love to test this, but it's unclear where I would add this setting as there is no setup.py =(
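The idea of a version written out at build time, rather than computed at runtime, might look roughly like this on the reading side. This is a sketch under stated assumptions: the generated module is assumed to expose a `version` attribute, a module name such as `mypkg._version` is hypothetical, and `read_static_version` is not part of setuptools-scm or jsonschema.

```python
import importlib


def read_static_version(module_name, default="0+unknown"):
    """Read a `version` attribute from a generated module, if it exists."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The generated file is missing, e.g. when running from a raw checkout.
        return default
    return getattr(module, "version", default)
```

Because this only imports a plain Python module, it would not depend on *.dist-info being present in the deployed artifact.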

@cwells

cwells commented Aug 3, 2020

Ok, it seems we were barking up the wrong tree. The problem stems from the serverless-python-requirements plugin. I came across this issue: serverless/serverless-python-requirements#441

Changing my serverless.yml to the following appears to work:

custom:
  pythonRequirements:
    dockerizePip: non-linux
    useDownloadCache: false
    useStaticCache: false
    slim: true
    slimPatternsAppendDefaults: false
    slimPatterns:
      - '**/*.py[c|o]'
      - '**/__pycache__*'
    layer:
      name: api
      description: API requirements
      compatibleRuntimes:
        - python3.8

Kudos to @glyph for shaming me into a bit more effort.

@cwells

cwells commented Aug 4, 2020

@Julian Perhaps a blurb can be added to the docs related to deploying this library using Serverless?

If deploying with the Serverless framework, ensure that the *.dist-info directories are included in the deployed .zip file using something like this:

pythonRequirements:
    dockerizePip: non-linux
    useDownloadCache: false
    useStaticCache: false
    slim: true
    slimPatternsAppendDefaults: false
    slimPatterns:
      - '**/*.py[c|o]'
      - '**/__pycache__*'

Otherwise you will be unable to import the library. By default, the serverless-python-requirements plugin ignores these files when using slim mode.
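Whether the metadata actually made it into the deployed archive can be checked before uploading. The following is a minimal sketch using only the standard library's `zipfile`; the function name `dist_info_in_zip` is hypothetical, and it assumes wheel-style "<name>-<version>.dist-info" directory names anywhere in the archive (so it also matches dependencies nested inside a layer).

```python
import zipfile


def dist_info_in_zip(zip_file, dist_name):
    """Return the *.dist-info directory names for dist_name found in an archive.

    zip_file may be a path or a binary file-like object.
    """
    prefix = dist_name.lower().replace("-", "_") + "-"
    found = set()
    with zipfile.ZipFile(zip_file) as archive:
        for member in archive.namelist():
            # Check every path component so nested layouts are covered too.
            for part in member.split("/"):
                if part.lower().startswith(prefix) and part.endswith(".dist-info"):
                    found.add(part)
    return sorted(found)
```

An empty list for `jsonschema` would suggest that the slim patterns are still stripping the metadata.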

@Julian
Member

Julian commented Aug 4, 2020

Happy to see that added (maybe to the FAQ?)!

@cwells

cwells commented Aug 4, 2020

Sounds good. I'll create a PR.

Sorry, going to have to rescind that. Editing rst is apparently not something vscode does well (no plugin was able to render a preview of the faq) and I'm not willing to go down that rabbit hole. I haven't used rst in years for a reason =P

simenheg added a commit to oslokommune/okdata-pipeline that referenced this issue Dec 4, 2020
Make necessary packaging tweaks (taken from the discussion at
python-jsonschema/jsonschema#584) to make the
`jsonschema` package work when running in AWS Lambda.