Implement a more robust malware detector #7748
Comments
There was a fairly public effort, More details are in this LWN article: https://lwn.net/Articles/574215/
Thanks a lot! This goes in the direction we were heading, I guess, leaving at least a few options that were suggested:
I have clearly reached my competency level, and continued a bit beyond. I’d love to learn more, but I won’t be able to suggest a lot, and at this point anything I might add will likely be laughable proof of the Dunning-Kruger effect...
PEP 578 + the new audit API in Python 3.8 would probably work well for this purpose. We'd still need some amount of sandboxing, though.
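To make the audit API idea concrete, here is a minimal sketch using `sys.addaudithook` (Python 3.8+). The set of watched event names and the `record_event` helper are illustrative assumptions, not anything Warehouse ships. Note that a hook runs in the same process as the code it observes, so it is not a sandbox by itself.

```python
import sys

# Event names come from CPython's audit events table; which ones to
# watch is an assumption made for this sketch.
WATCHED_EVENTS = {"exec", "os.system", "subprocess.Popen", "socket.connect"}

observed = []


def record_event(event, args):
    # The hook is called for every audit event raised in this interpreter,
    # so it only records the names we care about and never raises.
    if event in WATCHED_EVENTS:
        observed.append(event)


# Audit hooks cannot be removed once they are installed.
sys.addaudithook(record_event)

# exec() raises the "exec" audit event with the compiled code object.
exec("x = 1 + 1")
print(observed)
```

A real analysis would run the package's `setup.py` or import it inside an isolated environment and collect these events from there.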
I'm now more convinced that the way to go would rather be to provide a way for third parties to be notified any time a release is uploaded, and an API to report a status on those ("found safe" / "malware" / ...), along the lines of the API we already have for CVEs etc. Implementing malware detection within the Warehouse codebase and/or on the PyPI server itself is not the solution.
Looks like the only APIs for that are the XML-based RSS feeds: https://warehouse.pypa.io/api-reference/feeds.html
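For anyone wanting to build on the RSS feeds, a rough sketch of extracting releases from a feed document with the standard library follows. The `SAMPLE_FEED` content, the "name version" title format, and the `parse_updates` helper are assumptions for illustration; check the real feed before relying on them.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample shaped like a generic RSS 2.0 feed; the real feed
# may carry more fields or format titles differently.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Recent updates</title>
    <item>
      <title>example-package 1.0.0</title>
      <link>https://pypi.org/project/example-package/1.0.0/</link>
    </item>
    <item>
      <title>other-package 2.3</title>
      <link>https://pypi.org/project/other-package/2.3/</link>
    </item>
  </channel>
</rss>"""


def parse_updates(feed_xml):
    """Return (name, version, link) tuples for every <item> in the feed."""
    root = ET.fromstring(feed_xml)
    updates = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        # Assumes the title is "<name> <version>"; split on the last space.
        name, _, version = title.rpartition(" ")
        updates.append((name, version, link))
    return updates


for update in parse_updates(SAMPLE_FEED):
    print(update)
```

A scanner would still have to poll the feed and remember which entries it has already seen, which is the limitation raised here.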
For effective event propagation through the network, it may be possible to use https://docs.libp2p.io/concepts/publish-subscribe/ so that every interested party could run their own node that receives notifications automatically, without polling PyPI endpoints. As an extension to that, nodes can sign the events/package hashes with the result of validation checks and submit them to the network the same way. That would require quite a bit of prototyping, so I propose participating in Gitcoin grants to attract more people who can help: psf/fundable-packaging-improvements#40
This is gone now #13647
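As a rough illustration of the "sign the package hash together with the validation result" idea, here is a sketch using only the standard library. It uses an HMAC with a shared key as a stand-in; a real peer-to-peer network would need asymmetric signatures so that anyone can verify a report. The `make_report` and `verify_report` helpers and the payload fields are hypothetical.

```python
import hashlib
import hmac
import json


def make_report(artifact_bytes, verdict, key):
    """Build a signed report binding a verdict to the artifact's SHA-256."""
    digest = hashlib.sha256(artifact_bytes).hexdigest()
    # sort_keys gives a stable serialization, so the signature is reproducible.
    payload = json.dumps({"sha256": digest, "verdict": verdict}, sort_keys=True)
    signature = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"payload": payload, "signature": signature}


def verify_report(report, key):
    """Check that the payload was signed with the given key."""
    expected = hmac.new(
        key, report["payload"].encode("utf-8"), hashlib.sha256
    ).hexdigest()
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected, report["signature"])


key = b"demo-key-for-illustration-only"
report = make_report(b"fake sdist contents", "found safe", key)
print(verify_report(report, key))
```

Publishing such reports over a pubsub topic is left out on purpose; only the hashing and signing step is shown.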
Hello there. I'm probably going to say a bunch of obvious things, sorry in advance :/
The current YARA-based malware detector can be circumvented easily:
`import builtins` will happily not be detected because not all spaces have been marked as repeatable) `timeit` does `eval` or that `platform` has a `popen` method... Did I mention that `().__class__.__bases__[0].__subclasses__()[88]` is `<class 'zipimport.zipimporter'>`? I think it's endless...
That being said, maybe there IS such a thing as being thorough. I doubt it. Maybe detecting nearly all dunder methods AND unusual standard lib modules and functions AND a few builtins... Maybe a whitelist? I'm afraid this would make more noise than signal, but maybe we should try.
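A small sketch of the two evasion styles mentioned above. The `NAIVE_RULE` pattern is a hypothetical literal signature written as a Python regex, not one of Warehouse's actual YARA rules, and `find_class` is an illustrative helper. The numeric index in `__subclasses__()[88]` depends on the interpreter version and on which modules are loaded, so the sketch looks the class up by name instead.

```python
import re
import zipimport  # imported only to guarantee the class is loaded

# Hypothetical literal signature with exactly one space.
NAIVE_RULE = re.compile(r"import os")


def flagged(source):
    """Return True when the naive signature matches the source text."""
    return NAIVE_RULE.search(source) is not None


def find_class(name):
    """Walk the class tree from `object` without importing anything by name."""
    root = ().__class__.__bases__[0]  # tuple -> object
    stack, seen = [root], set()
    while stack:
        cls = stack.pop()
        if id(cls) in seen:
            continue
        seen.add(id(cls))
        if cls.__name__ == name:
            return cls
        # type.__subclasses__(cls) also works when cls is itself a metaclass.
        stack.extend(type.__subclasses__(cls))
    return None


print(flagged("import os"))            # literal form
print(flagged("import  os"))           # same statement with two spaces
print(flagged('__import__("o" + "s")'))  # name assembled at runtime
print(find_class("zipimporter"))
```

Only the first sample matches the literal signature, even though all three statements end up importing `os`.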
(For reference, https://ctf-wiki.github.io/ctf-wiki/pwn/linux/sandbox/python-sandbox-escape/)
So... There is one remaining way to know what a script does: executing it in a sandboxed environment, but this raises questions too:
y
) could do that (and the idea of including PyPy in PyPI is a nice level of meta ;) )
(One advantage of this approach would be the ability to extract metadata from sdists, though, which I believe is another problem that exists out there)
So many questions... I hope this hasn't already been answered in another issue, I couldn't find anything when I searched.
Ping @xmunoz and @woodruffw to continue the discussion.