Define the interface of a CodeLike object #117087

iritkatriel · 2024-03-20T16:29:49Z

The C API for monitoring (#111997) works with code-like objects, so that the user is not required to create a CodeObject where there isn't one already.

We need to define the Python API of a CodeLike so that it's useful for tools that use monitoring. This issue is to define which fields of CodeObject we want to have on a CodeLike.

@markshannon @scoder @nedbat @gaogaotiantian

Linked PRs

GH-117087: Initial implementation of support for 'code like' objects in sys.monitoring #131414

gaogaotiantian · 2024-03-26T01:04:04Z

So basically the CodeLike object is a Python object for Python libraries to get information, right? If it complies with the existing monitoring callback, then we'd expect any callback function to work even with the CodeLike object, but that's not entirely possible, right? The user using sys.monitoring can do whatever they want with the code object, so unless we mimic a full CodeObject, they can always fall into some trap.

Or are we actually talking about a useful object that can provide some information about the code, which could be used by a variety of tools? For example, when Cython triggers a LINE monitoring event, it provides a CodeLike object that contains a line number for the Cython code, even though it has nothing to do with a CPython CodeObject. Then the question would be: what information do the tools need for monitoring events? A debugger might be too complicated to satisfy, but the profilers might have something in common.

markshannon · 2024-04-12T09:08:52Z

The main purpose of the code-like object, from the perspective of tools, is to convert a code_like/offset pair into a full location: filename, startline, startcolumn, endline, endcolumn.

We also want to support the introspection methods/attributes of code objects, with the same names, for ease of porting coverage.py, profile, etc.

So, how about this:

from abc import ABCMeta, abstractmethod

class CodeLike(metaclass=ABCMeta):

    @abstractmethod
    def offset_to_location(self, offset):
        """Returns the 5-tuple (filename, startline, startcolumn, endline, endcolumn) for the given offset.
         May return None if the offset is valid, but there is no location information for it.
         If the offset is not valid, a ValueError should be raised"""

    # abstractproperty is deprecated; property + abstractmethod is the equivalent spelling.
    @property
    @abstractmethod
    def co_name(self):
        "The (short) name of this callable"

    @property
    @abstractmethod
    def co_qualname(self):
        "The full, qualified name of this callable"

    @property
    @abstractmethod
    def co_filename(self):
        "The name of the primary file defining this callable"

    @property
    @abstractmethod
    def co_argcount(self):
        "The maximum number of arguments for this callable"

    @property
    @abstractmethod
    def co_posonlyargcount(self):
        "The number of positional-only arguments for this callable"

    @property
    @abstractmethod
    def co_kwonlyargcount(self):
        "The maximum number of keyword-only arguments for this callable"
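As a rough illustration of how a non-CPython producer might satisfy the location part of this proposal, here is a minimal sketch. The `TableCodeLike` class, its constructor arguments, and the `example.pyx` data are hypothetical, and the `CodeLike` base is trimmed to three members to keep the example short; none of this is an existing CPython API.

```python
from abc import ABCMeta, abstractmethod


class CodeLike(metaclass=ABCMeta):
    """Trimmed copy of the proposed interface: location lookup plus two names."""

    @abstractmethod
    def offset_to_location(self, offset): ...

    @property
    @abstractmethod
    def co_name(self): ...

    @property
    @abstractmethod
    def co_filename(self): ...


class TableCodeLike(CodeLike):
    """Hypothetical code-like object backed by an explicit offset table."""

    def __init__(self, name, filename, locations, size):
        self._name = name
        self._filename = filename
        # offset -> (startline, startcolumn, endline, endcolumn)
        self._locations = locations
        # Offsets 0 .. size-1 are treated as valid.
        self._size = size

    @property
    def co_name(self):
        return self._name

    @property
    def co_filename(self):
        return self._filename

    def offset_to_location(self, offset):
        if not 0 <= offset < self._size:
            raise ValueError(f"invalid offset: {offset}")
        span = self._locations.get(offset)
        if span is None:
            return None  # valid offset, but no location recorded for it
        return (self._filename, *span)


func = TableCodeLike(
    "add",
    "example.pyx",
    {0: (10, 0, 10, 18), 2: (11, 4, 11, 16)},
    size=6,
)
print(func.offset_to_location(2))
```

A tool that only needs a full location from a code_like/offset pair would call `offset_to_location()` and never touch bytecode.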

scoder · 2024-05-27T02:42:39Z

My impression is that most code that uses tracing currently expects a real CodeObject. And the current implementation requires it, see #111997 (comment) and here:

cpython/Python/instrumentation.c

Lines 1611 to 1630 in de19694

    
    allocate_instrumentation_data(PyCodeObject *code)
    {
        ASSERT_WORLD_STOPPED_OR_LOCKED(code);
        if (code->_co_monitoring == NULL) {
            code->_co_monitoring = PyMem_Malloc(sizeof(_PyCoMonitoringData));
            if (code->_co_monitoring == NULL) {
                PyErr_NoMemory();
                return -1;
            }
            code->_co_monitoring->local_monitors = (_Py_LocalMonitors){ 0 };
            code->_co_monitoring->active_monitors = (_Py_LocalMonitors){ 0 };
            code->_co_monitoring->tools = NULL;
            code->_co_monitoring->lines = NULL;
            code->_co_monitoring->line_tools = NULL;
            code->_co_monitoring->per_instruction_opcodes = NULL;
            code->_co_monitoring->per_instruction_tools = NULL;
        }
        return 0;
    }

Can't we split the CodeObject type somehow and expose a public part (or subclass) of it, so that code that needs the public interface keeps working, while internal code stays internal and, e.g., goes through a generic "here's more internal stuff" pointer? The _PyCoMonitoringData already goes in that direction.

The current CodeObject is really two things in one: a CallableMetadataObject and a BytecodeObject. They are not the same, even in CPython. Builtin functions should have the first but not the second.
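To make the "two things in one" idea concrete, here is a minimal sketch that pulls only the callable-metadata half out of a real code object. `CallableMetadata` and `metadata_of` are hypothetical names for illustration, not a proposed or existing type. The field set mirrors the attributes discussed in this thread.

```python
from dataclasses import dataclass
from types import CodeType


@dataclass(frozen=True)
class CallableMetadata:
    """Hypothetical 'public part' of a code object: no bytecode, no consts."""
    name: str
    qualname: str
    filename: str
    argcount: int
    posonlyargcount: int
    kwonlyargcount: int


def metadata_of(code: CodeType) -> CallableMetadata:
    return CallableMetadata(
        name=code.co_name,
        # co_qualname only exists on Python 3.11+; fall back to the short name.
        qualname=getattr(code, "co_qualname", code.co_name),
        filename=code.co_filename,
        argcount=code.co_argcount,
        posonlyargcount=code.co_posonlyargcount,
        kwonlyargcount=code.co_kwonlyargcount,
    )


def example(a, b, /, c, *, d):
    return a


meta = metadata_of(example.__code__)
print(meta)
```

A builtin function could in principle provide the same record without having any bytecode behind it.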

markshannon · 2025-03-17T12:46:48Z

In the sys.monitoring docs, CodeType is used in the various event callbacks and in sys.monitoring.get_local_events and sys.monitoring.set_local_events.

Changing the callback signature to expect CodeLike instead of just CodeType is mainly a documentation and social issue. We need buy-in from tool authors, not just a change to the docs.
Pure Python code without explicit isinstance checks should just work, though.
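A small sketch of what "should just work" means in practice: the callback below has the `(code, instruction_offset)` shape of a PY_START callback and reads only `co_filename` and `co_name`, so it accepts anything that provides those attributes. The `SimpleNamespace` stand-in and the `example.pyx` values are made up for the example, and the callback is called directly rather than registered with sys.monitoring.

```python
from types import SimpleNamespace

seen = []


def on_py_start(code, instruction_offset):
    # Duck typing only: no isinstance(code, CodeType) check.
    seen.append((code.co_filename, code.co_name, instruction_offset))


def sample():
    pass


# A real code object ...
on_py_start(sample.__code__, 0)
# ... and a stand-in for a code-like object from another producer.
on_py_start(SimpleNamespace(co_filename="example.pyx", co_name="add"), 0)

print(seen)
```

A callback that did `isinstance(code, CodeType)` or called `dis.get_instructions(code)` would not accept the second argument, which is the porting cost being discussed.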

To support sys.monitoring.get_local_events and sys.monitoring.set_local_events we'll need two more methods on CodeLike objects:

from abc import ABCMeta, abstractmethod

class CodeLike(metaclass=ABCMeta):

    ...

    @abstractmethod
    def __get_local_events__(self, tool_id: int) -> int:
        """Gets the local events previously set by __set_local_events__.
           Called by sys.monitoring.get_local_events(tool_id, self)"""

    @abstractmethod
    def __set_local_events__(self, tool_id: int, event_set: int) -> None:
        "Sets the local events. Called by sys.monitoring.set_local_events(tool_id, self, event_set)"
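One way a producer might implement these two hooks is to keep a bitmask per tool id. The sketch below is an assumption about a reasonable implementation, not the behaviour of the linked PR: `LocalEventsStore`, the module-level `get_local_events`/`set_local_events` stand-ins, and the `EVENT_A`/`EVENT_B` bit values are all hypothetical.

```python
class LocalEventsStore:
    """Hypothetical code-like fragment: one event bitmask per tool id."""

    def __init__(self):
        self._local_events = {}

    def __get_local_events__(self, tool_id: int) -> int:
        # Tools that never set anything see an empty event set.
        return self._local_events.get(tool_id, 0)

    def __set_local_events__(self, tool_id: int, event_set: int) -> None:
        if event_set:
            self._local_events[tool_id] = event_set
        else:
            self._local_events.pop(tool_id, None)


# Stand-ins for how sys.monitoring could delegate to the hooks.
def get_local_events(tool_id, code_like):
    return code_like.__get_local_events__(tool_id)


def set_local_events(tool_id, code_like, event_set):
    code_like.__set_local_events__(tool_id, event_set)


EVENT_A = 1 << 0  # illustrative bit values, not real sys.monitoring constants
EVENT_B = 1 << 1

code_like = LocalEventsStore()
set_local_events(2, code_like, EVENT_A | EVENT_B)
print(get_local_events(2, code_like))
```

Keeping the state on the code-like object means each tool's local events stay independent, matching the per-tool `tool_id` argument in the proposed signatures.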

nedbat · 2025-03-19T12:40:17Z

I'm a tool author, but a bit lost on what is being asked of me here. Coverage.py now fully supports branch coverage using sys.monitoring as it is.

markshannon · 2025-03-19T15:50:19Z

@nedbat We are asking which parts of the code object API coverage (and other tools that use sys.monitoring) actually use, and how easy or difficult it would be to change to use only the API proposed above.

For example, does coverage use the co_positions() method, which is not in the proposed API?
And, if it does use co_positions, how easy would it be to use the proposed offset_to_location() method instead?

nedbat · 2025-03-20T17:42:32Z

Things coverage.py does with code objects:

- iterates over code.co_consts to find nested code objects (including using isinstance(c, CodeType) to identify them).
- checks if c.co_name != "__annotate__" to skip annotations while looking for code objects.
- creates them using the compile() built-in.
- reads them from .pyc files using marshal.load().
- uses c.co_lines() to get line numbers of executable code. Also still uses c.co_lnotab and c.co_firstlineno for the same, but that will be gone when we drop 3.9 in the fall.
- for debug logging, uses c.co_name, c.co_filename, and c.co_firstlineno.
- calls dis.get_instructions(c) to analyze bytecode.
- uses c.co_firstlineno to record possible jump destinations.
- uses id(code) to uniquely identify them (we had a discussion about equality of code objects).

I don't use co_positions() at all.

JacobCoffee added type-feature A feature request or enhancement interpreter-core (Objects, Python, Grammar, and Parser dirs) and removed interpreter-core (Objects, Python, Grammar, and Parser dirs) labels Oct 10, 2024

bedevere-app bot mentioned this issue Mar 18, 2025

GH-117087: Initial implementation of support for 'code like' objects in sys.monitoring #131414

Draft

picnixz added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Mar 18, 2025
