Skip to content

loop.sock_recv failure because of delayed callback handling #78976

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
kyuupichan mannequin opened this issue Sep 24, 2018 · 8 comments
Closed

loop.sock_recv failure because of delayed callback handling #78976

kyuupichan mannequin opened this issue Sep 24, 2018 · 8 comments
Assignees

Comments

@kyuupichan
Copy link
Mannequin

kyuupichan mannequin commented Sep 24, 2018

BPO 34795
Nosy @asvetlov, @1st1, @kyuupichan

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

Show more details

GitHub fields:

assignee = 'https://github.com/asvetlov'
closed_at = None
created_at = <Date 2018-09-24.21:40:22.404>
labels = ['3.7', 'expert-asyncio']
title = 'loop.sock_recv failure because of delayed callback handling'
updated_at = <Date 2018-10-09.08:57:24.311>
user = 'https://github.com/kyuupichan'

bugs.python.org fields:

activity = <Date 2018-10-09.08:57:24.311>
actor = 'asvetlov'
assignee = 'asvetlov'
closed = False
closed_date = None
closer = None
components = ['asyncio']
creation = <Date 2018-09-24.21:40:22.404>
creator = 'kyuupichan'
dependencies = []
files = []
hgrepos = []
issue_num = 34795
keywords = []
message_count = 7.0
messages = ['326291', '326292', '326293', '326296', '327307', '327309', '327310']
nosy_count = 3.0
nosy_names = ['asvetlov', 'yselivanov', 'kyuupichan']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = None
url = 'https://bugs.python.org/issue34795'
versions = ['Python 3.6', 'Python 3.7']

@kyuupichan
Copy link
Mannequin Author

kyuupichan mannequin commented Sep 24, 2018

In certain circumstances the coroutine loop.sock_recv() registers a callback internally, which is called on e.g. task cancellation. The callback assumes a file descriptor that was open and valid at the time the callback was registered is still open and valid when the callback is called, and this need not be the case. For example, when client code handling exceptions (in particular, cancellation) closes the socket itself.

I have reports of this error for Python 3.6 and 3.7 by a user of my library which uses asyncio, and that bad things happen on both Windows (the event loop dies) and Linux (a traceback happens but no loop death).
It likely happens in earlier versions of Python too but my library requires >= Python 3.6

I hope the above description in addition to pointing out the problematic lines in asyncio make the bug clear. I tried to produce a simplified testcase but it doesn't trigger the issue on Linux. I am told it does trigger it on Windows (I don't have Windows to test on).

Here are the problematic lines in selector_events.py where the FD is registered; the callback above them assumes the DF remains valid (but the socket might have been closed in the meantime):

https://github.com/python/cpython/blob/master/Lib/asyncio/selector_events.py#L378-L380

The same problem is evident in sock_recv_into, and perhaps other calls.

Here is code I am told triggers the problem on Windows; it seems to require something a little more complex to trigger reliably on Unix platforms:

import asyncio
import socket


class MyProtocol(asyncio.Protocol):

    def connection_made(self, transport):
        transport.write(b'123')   # just in case a write is needed


port = 6666


async def connect_and_recv(loop, sock):
    try:
        await loop.sock_connect(sock, ('127.0.0.1', port))
        while True:
            await loop.sock_recv(sock, 20)
    except asyncio.CancelledError:
        print("Cancelled")
        sock.close()


async def main(loop):
    server = await loop.create_server(MyProtocol, '127.0.0.1', port)
    sock = socket.socket()
    sock.setblocking(False)
    task = loop.create_task(connect_and_recv(loop, sock))
    await asyncio.sleep(0.1)
    task.cancel()
    await asyncio.sleep(0.1)


loop = asyncio.get_event_loop()
loop.run_until_complete(main(loop))

Here are backtraces from a client application doing things more complex but similar in spirit to the snippet above:

File "C:\Users\User\AppData\Local\Programs\Python\Python36-32\lib\asyncio\base_events.py", line 455, in run_until_complete
self.run_forever()
File "C:\Users\User\AppData\Local\Programs\Python\Python36-32\lib\asyncio\base_events.py", line 422, in run_forever
self._run_once()
File "C:\Users\User\AppData\Local\Programs\Python\Python36-32\lib\asyncio\base_events.py", line 1398, in _run_once
event_list = self._selector.select(timeout)
File "C:\Users\User\AppData\Local\Programs\Python\Python36-32\lib\selectors.py", line 323, in select
r, w, _ = self._select(self._readers, self._writers, [], timeout)
File "C:\Users\User\AppData\Local\Programs\Python\Python36-32\lib\selectors.py", line 314, in _select
r, w, x = select.select(r, w, w, timeout)
OSError: [WinError 10038] An operation was attempted on something that is not a socket

and one for Linux:

Exception in callback BaseSelectorEventLoop._sock_recv(<Future cancelled>, True, <socket.socke...2049, proto=0>, 5)
handle: <Handle BaseSelectorEventLoop._sock_recv(<Future cancelled>, True, <socket.socke...2049, proto=0>, 5)>
Traceback (most recent call last):
  File "/usr/lib/python3.6/asyncio/events.py", line 127, in _run
    self._callback(*self._args)
  File "/usr/lib/python3.6/asyncio/selector_events.py", line 378, in _sock_recv
    self.remove_reader(fd)
  File "/usr/lib/python3.6/asyncio/selector_events.py", line 342, in remove_reader
    return self._remove_reader(fd)
  File "/usr/lib/python3.6/asyncio/selector_events.py", line 279, in _remove_reader
    key = self._selector.get_key(fd)
  File "/usr/lib/python3.6/selectors.py", line 189, in get_key
    return mapping[fileobj]
  File "/usr/lib/python3.6/selectors.py", line 70, in __getitem__
    fd = self._selector._fileobj_lookup(fileobj)
  File "/usr/lib/python3.6/selectors.py", line 224, in _fileobj_lookup
    return _fileobj_to_fd(fileobj)
  File "/usr/lib/python3.6/selectors.py", line 41, in _fileobj_to_fd
    raise ValueError("Invalid file descriptor: {}".format(fd))
ValueError: Invalid file descriptor: -1

@kyuupichan kyuupichan mannequin added 3.7 (EOL) end of life topic-asyncio labels Sep 24, 2018
@1st1
Copy link
Member

1st1 commented Sep 24, 2018

Can you reproduce this with uvloop?

Andrew: I think we need to "lock" sockets from closing in sock_recv and friends the same way we do it in uvloop.

@kyuupichan
Copy link
Mannequin Author

kyuupichan mannequin commented Sep 24, 2018

This seems related: https://bugs.python.org/issue30064

@kyuupichan
Copy link
Mannequin Author

kyuupichan mannequin commented Sep 24, 2018

My library user reports:

I can't reproduce the issue with uvloop on linux. (looks like uvloop does not work on windows atm)

@asvetlov
Copy link
Contributor

asvetlov commented Oct 7, 2018

Yuri, do you mean socket._io_refs manipulations?

@1st1
Copy link
Member

1st1 commented Oct 7, 2018

yes

@asvetlov
Copy link
Contributor

asvetlov commented Oct 7, 2018

Looks like a hack but we have no choice

@asvetlov asvetlov self-assigned this Oct 9, 2018
@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
@ezio-melotti ezio-melotti moved this to Todo in asyncio Jul 17, 2022
@kumaraditya303 kumaraditya303 removed the 3.7 (EOL) end of life label Sep 22, 2022
@kumaraditya303
Copy link
Contributor

kumaraditya303 commented Sep 22, 2022

Unable to reproduce on main or 3.10. I assume this was already fixed in #74250 so closing.

./python.bat main.py 
Running Debug|x64 interpreter...
D:\python\main\main.py:34: DeprecationWarning: There is no current event loop
  loop = asyncio.get_event_loop()
Cancelled

@kumaraditya303 kumaraditya303 closed this as not planned Won't fix, can't repro, duplicate, stale Sep 22, 2022
Repository owner moved this from Todo to Done in asyncio Sep 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

No branches or pull requests

3 participants