Can we split _INIT_CALL_PY_EXACT_ARGS further in Tier 2? #666

Open
@gvanrossum

_INIT_CALL_PY_EXACT_ARGS is already quite streamlined, but we may be able to squeeze a bit more out of it in the abstract interpreter. In many cases the abstract interpreter can know that the self_or_null input is either always NULL or never NULL. In those cases we could simplify to one of the following:

        // always NULL
        replicate(5) pure op(_INIT_CALL_PY_EXACT_ARGS_ALWAYS_NULL, (callable, null, args[oparg] -- new_frame: _PyInterpreterFrame*)) {
            assert(null == NULL);
            (void)null;
            STAT_INC(CALL, hit);
            PyFunctionObject *func = (PyFunctionObject *)callable;
            new_frame = _PyFrame_PushUnchecked(tstate, func, oparg);
            for (int i = 0; i < oparg; i++) {
                new_frame->localsplus[i] = args[i];
            }
        }
        // never NULL
        replicate(5) pure op(_INIT_CALL_PY_EXACT_ARGS_NEVER_NULL, (callable, self, args[oparg] -- new_frame: _PyInterpreterFrame*)) {
            assert(self != NULL);
            STAT_INC(CALL, hit);
            PyFunctionObject *func = (PyFunctionObject *)callable;
            new_frame = _PyFrame_PushUnchecked(tstate, func, oparg + 1);
            new_frame->localsplus[0] = self;
            for (int i = 0; i < oparg; i++) {
                new_frame->localsplus[i+1] = args[i];
            }
        }
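
To see why the split is safe, here is a minimal standalone sketch in plain C (not the bytecode DSL) that checks the two specialized forms fill a frame the same way as a generic form that tests self_or_null at run time. Everything in it is a hypothetical stand-in: `Value`, `MockFrame`, the `init_*` helpers and `split_matches_generic` are made up for illustration, and the real `_PyInterpreterFrame` and `_PyFrame_PushUnchecked` are not modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins: a value is an opaque pointer and a "frame" is a
   small fixed array of locals plus a count. */
#define MAX_LOCALS 8
typedef const void *Value;
typedef struct {
    Value localsplus[MAX_LOCALS];
    int argcount;
} MockFrame;

/* Generic form: decides at run time whether self is part of the arguments. */
static void init_generic(MockFrame *f, Value self_or_null,
                         const Value *args, int oparg) {
    int n = 0;
    if (self_or_null != NULL) {
        f->localsplus[n++] = self_or_null;
    }
    for (int i = 0; i < oparg; i++) {
        f->localsplus[n++] = args[i];
    }
    f->argcount = n;
}

/* Specialized form for the case where self_or_null is known to be NULL. */
static void init_always_null(MockFrame *f, const Value *args, int oparg) {
    for (int i = 0; i < oparg; i++) {
        f->localsplus[i] = args[i];
    }
    f->argcount = oparg;
}

/* Specialized form for the case where self is known to be non-NULL. */
static void init_never_null(MockFrame *f, Value self,
                            const Value *args, int oparg) {
    assert(self != NULL);
    f->localsplus[0] = self;
    for (int i = 0; i < oparg; i++) {
        f->localsplus[i + 1] = args[i];
    }
    f->argcount = oparg + 1;
}

/* Compare only the slots that were actually written. */
static int frames_equal(const MockFrame *a, const MockFrame *b) {
    if (a->argcount != b->argcount) {
        return 0;
    }
    for (int i = 0; i < a->argcount; i++) {
        if (a->localsplus[i] != b->localsplus[i]) {
            return 0;
        }
    }
    return 1;
}

/* Returns 1 if each specialized form matches the generic form on its case. */
static int split_matches_generic(void) {
    static const int a = 1, b = 2, s = 3;
    const Value args[2] = { &a, &b };
    MockFrame generic, split;

    init_generic(&generic, NULL, args, 2);
    init_always_null(&split, args, 2);
    if (!frames_equal(&generic, &split)) {
        return 0;
    }

    init_generic(&generic, &s, args, 2);
    init_never_null(&split, &s, args, 2);
    return frames_equal(&generic, &split);
}
```

Calling `split_matches_generic()` from a `main` exercises both cases. The only thing each specialized form drops is the run-time NULL test, which is the saving the proposal is after.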

This split does cost about 10 extra uop instructions, but we seem to have about 77 left (more, if we lower the starting point below 300). It also costs extra special-casing in the abstract interpreter. But the JIT templates ought to become even smaller.
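
The extra special-casing in the abstract interpreter could be as small as a three-way choice. The sketch below is an assumption about its shape, not the actual optimizer code: the `Nullness` enum, the `MockUop` tags and `choose_init_call_uop` are hypothetical names, and how the abstract interpreter really records what it knows about self_or_null is not modelled.

```c
/* Hypothetical summary of what the abstract interpreter has proved about
   the self_or_null input at this call site. */
typedef enum {
    NULLNESS_UNKNOWN,
    NULLNESS_ALWAYS_NULL,
    NULLNESS_NEVER_NULL
} Nullness;

/* Hypothetical tags standing in for the three uops discussed above. */
typedef enum {
    UOP_INIT_CALL_PY_EXACT_ARGS,
    UOP_INIT_CALL_PY_EXACT_ARGS_ALWAYS_NULL,
    UOP_INIT_CALL_PY_EXACT_ARGS_NEVER_NULL
} MockUop;

/* Pick the most specific uop the known facts allow; fall back to the
   existing generic uop when nothing is known. */
static MockUop choose_init_call_uop(Nullness known) {
    switch (known) {
    case NULLNESS_ALWAYS_NULL:
        return UOP_INIT_CALL_PY_EXACT_ARGS_ALWAYS_NULL;
    case NULLNESS_NEVER_NULL:
        return UOP_INIT_CALL_PY_EXACT_ARGS_NEVER_NULL;
    default:
        return UOP_INIT_CALL_PY_EXACT_ARGS;
    }
}
```

Keeping the generic uop as the fallback means the rewrite is purely an optimization: a call site with unknown nullness behaves exactly as it does today.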

@markshannon @brandtbucher
