[pass] Fix DCE to keep initializers that are inputs #2245

Closed
wants to merge 10 commits into from

Conversation

leshabirukov
Contributor

@leshabirukov leshabirukov commented Apr 29, 2025

Fix #2235

Add a check so that an initializer for an input remains.
keeping input initializers
@justinchuby justinchuby changed the title Patch to keep input initializers [pass] Fix DCE to keep initializers that are inputs Apr 29, 2025
Collaborator

@justinchuby justinchuby left a comment

Thanks!

codecov bot commented Apr 29, 2025

❌ 1 Tests Failed:

| Tests completed | Failed | Passed | Skipped |
|---|---|---|---|
| 2011 | 1 | 2010 | 260 |
View the top 1 failed test(s) by shortest run time
onnxscript.backend.onnx_export_test.TestOnnxBackEnd::test_export2python_produces_correct_onnx_script_model_0981_test_ai_onnx_ml_tree_ensemble_set_membership
Stack Traces | 0.021s run time
onnxscript/converter.py:460: in _eval_constant_expr
    return eval(cpl, self.globals, locals)  # pylint: disable=eval-used
E   NameError: name 'nan' is not defined

The above exception was the direct cause of the following exception:
..../test_ort_nightly/lib/python3.11.../site-packages/parameterized/parameterized.py:620: in standalone_func
    return func(*(a + p.args), **p.kwargs, **kw)
onnxscript/backend/onnx_export_test.py:271: in test_export2python_produces_correct_onnx_script_model
    functions = extract_functions(backend_test.name, code, self.test_folder)
onnxscript/backend/onnx_export_test.py:137: in extract_functions
    mod = importlib.import_module(import_name)
.../hostedtoolcache/Python/3.11.12.../x64/lib/python3.11/importlib/__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
<frozen importlib._bootstrap>:1204: in _gcd_import
    ???
<frozen importlib._bootstrap>:1176: in _find_and_load
    ???
<frozen importlib._bootstrap>:1147: in _find_and_load_unlocked
    ???
<frozen importlib._bootstrap>:690: in _load_unlocked
    ???
..../test_ort_nightly/lib/python3.11.../_pytest/assertion/rewrite.py:185: in exec_module
    exec(co, module.__dict__)
tests/onnx_backend_test_code/test_ai_onnx_ml_tree_ensemble_set_membership.py:9: in <module>
    @script()
onnxscript/main.py:94: in transform
    result = script_check(f_ast, opset, env, src, default_opset=default_opset)
onnxscript/main.py:38: in script_check
    return convert.translate_function_def(f)
onnxscript/converter.py:1452: in translate_function_def
    fn_ir = self._translate_function_def_common(stmt)
onnxscript/converter.py:1439: in _translate_function_def_common
    self._translate_stmt(s, index_of_stmt=i)
onnxscript/converter.py:961: in _translate_stmt
    return self._translate_assign_stmt(node)
onnxscript/converter.py:1048: in _translate_assign_stmt
    assign(lhs, rhs)
onnxscript/converter.py:992: in assign
    t = self._translate_expr(rhs, lhs).name
onnxscript/converter.py:546: in _translate_expr
    r = self._translate_call_expr(node)
onnxscript/converter.py:825: in _translate_call_expr
    attrs = [
onnxscript/converter.py:826: in <listcomp>
    self._translate_attr(x, y, callee.op_schema.attributes[x])
onnxscript/converter.py:510: in _translate_attr
    val = self._eval_constant_expr(expr)
onnxscript/converter.py:462: in _eval_constant_expr
    raise NameError(
E   NameError: ERROR: Missing names, globals contains ['__name__', '__doc__', '__package__', '__loader__', '__spec__', '__file__', '__cached__', '__builtins__', '@py_builtins', '@pytest_ar', 'numpy', 'TensorProto', 'make_tensor', 'script', 'external_tensor', 'Opset', 'FLOAT', 'ai_onnx_ml5'], locals [].
E   at: Function 'bck_test_ai_onnx_ml_tree_ensemble_set_membership', line 3
E       Y = ai_onnx_ml5.TreeEnsemble(X, aggregate_function=1, leaf_targetids=[0, 1, 2, 3], leaf_weights=make_tensor("value", 1, dims=[4], vals=[1.0, 10.0, 1000.0, 100.0]), membership_values=make_tensor("value", 1, dims=[8], vals=[1.2000000476837158, 3.700000047683716, 8.0, 9.0, nan, 12.0, 7.0, nan]), n_targets=4, nodes_falseleafs=[1, 0, 1], nodes_falsenodeids=[2, 2, 3], nodes_featureids=[0, 0, 0], nodes_modes=make_tensor("value", 2, dims=[3], vals=[0, 6, 6]), nodes_splits=make_tensor("value", 1, dims=[3], vals=[11.0, 232344.0, nan]), nodes_trueleafs=[0, 1, 1], nodes_truenodeids=[1, 0, 1], post_transform=0, tree_roots=[0])
E                                                                                                                                                                                             ^
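
For readers following the trace above: the failing test evaluates generated source that contains a bare `nan` token, and `nan` is not among the globals listed in the error. Below is a minimal standard-library sketch of that failure mode, not the onnxscript converter itself. The helper `eval_constant` and the idea of binding `nan` in the globals are assumptions made for illustration; the thread does not say how the test was eventually fixed.

```python
import math

# Source text shaped like the generated attribute value in the trace above.
expr = "[1.2, 8.0, nan, 12.0]"


def eval_constant(source, global_names):
    """Hypothetical helper: evaluate a constant expression against explicit globals."""
    return eval(compile(source, "<expr>", "eval"), global_names, {})


# Without a binding for `nan`, evaluation raises NameError, as in the trace.
try:
    eval_constant(expr, {})
    failed = False
    message = ""
except NameError as exc:
    failed = True
    message = str(exc)

# Assumed workaround for illustration only: bind `nan` before evaluating.
values = eval_constant(expr, {"nan": float("nan")})
```

With the binding in place, the third element evaluates to a float NaN and the rest of the list is unchanged.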


warn user about unused initialized inputs
moved removing unused inputs to special function
for i, inp in reversed(list(enumerate(graph_inputs))):
    if inp.name in initializers and not (inp.uses() or inp in graph_outputs):
        if self.remove_initialized_inputs:
            del graph_inputs[i]
Collaborator

Why would a user want to remove a graph input? It is part of the model's signature, even if the input is never used. Removing it could cause a failure when the user supplies a value for that input and the system rejects it as an invalid input (e.g., this could happen with onnxruntime). Against this disadvantage, there is not much gained by removing the model input.

The initializer value can, however, be removed if there is no use of that value within the model. Maybe that will save some memory for large initializers.

Collaborator

I think the scenario is that when an initializer happens to be declared as an input (as some converters do), removing only the initializer but not the input makes the same call fail. A user may decide not to care about those inputs since, for example, in ORT inputs are provided by keyword.
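
To make the scenario concrete, here is a small pure-Python sketch. It assumes the usual ONNX reading that an input with a same-named initializer has a default value and may be omitted by the caller. The function `required_inputs` and the names `x` and `w` are hypothetical and only model the calling contract; no ONNX library is involved.

```python
def required_inputs(graph_inputs, initializers):
    """Inputs the caller must feed: those with no default from an initializer."""
    return [name for name in graph_inputs if name not in initializers]


# Hypothetical model: `w` is declared as an input and also has an initializer.
graph_inputs = ["x", "w"]
initializers = {"w": [1.0, 2.0]}

before = required_inputs(graph_inputs, initializers)

# Simulate a pass that drops the unused initializer but keeps the input.
pruned = {name: value for name, value in initializers.items() if name != "w"}
after = required_inputs(graph_inputs, pruned)
```

Before pruning, a caller only has to feed `x`; afterwards `w` is required as well, so a call that worked earlier would now be missing an input.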

Contributor Author

> Why would a user want to remove a graph-input?
> ...

The issue arose from a concrete real-life network I need to process; see
#2211
It is the result of transforming a network in which every weight was declared as an initialized input. I'm not sure what exactly its creator intended; perhaps it was for probing.

> ...
> The initializer value can, however, be removed, ...

Nope. I have provided an example here:
#2235 (comment)

Collaborator

@justinchuby justinchuby left a comment

@leshabirukov Apologies, but after discussion with @gramalingam, we think that instead of creating an option, it is better to handle the initialized inputs separately. I am working on this in #2253. If you don't mind, could you retain the line `if not (init.uses() or init in graph_outputs or init in graph_inputs):` and remove the rest of the changes in the PR? I will update the other PR to provide the needed functionality.
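
As a rough illustration of what the retained condition does, here is a sketch with stand-in classes. `Value` and `prune_initializers` are hypothetical and are not the onnxscript IR or the actual pass; only the `if not (...)` condition is taken from the comment above.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Value:
    """Hypothetical stand-in for an IR value (compared by identity)."""

    name: str
    consumers: list = field(default_factory=list)

    def uses(self):
        return self.consumers


def prune_initializers(initializers, graph_inputs, graph_outputs):
    """Drop initializers that are unused and are neither graph inputs nor outputs."""
    for init in list(initializers.values()):
        if not (init.uses() or init in graph_outputs or init in graph_inputs):
            del initializers[init.name]


w_used = Value("w_used", consumers=["MatMul"])
w_input = Value("w_input")  # unused, but also declared as a graph input
w_dead = Value("w_dead")  # unused and not part of the signature
inits = {v.name: v for v in (w_used, w_input, w_dead)}

prune_initializers(inits, graph_inputs=[w_input], graph_outputs=[])
remaining = sorted(inits)
```

Only `w_dead` is removed; `w_input` survives because it is still listed among the graph inputs.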

@github-project-automation github-project-automation bot moved this from Todo to In Progress in ONNX Script Review Board Apr 30, 2025
@justinchuby
Collaborator

@leshabirukov I have created #2257. Unfortunately we will have to revert some of the changes you made in favor of #2253. Will the proposed passes work for you?

@justinchuby justinchuby marked this pull request as draft May 1, 2025 02:25
@leshabirukov
Contributor Author

> @leshabirukov I have created #2257. Unfortunately we will have to revert some of the changes you made in favor of #2253. Will the proposed passes work for you?

No problem. Once I was aware of the dangling initialized input issue, I knew what to do about it. But think about people who hit the same issue: I insist there should at least be a warning. I propose preserving this function (in another file, if need be):

def _maybe_remove_unused_initialized_inputs(model: ir.Model, do_remove: bool) -> None:
    """Remove, or only warn about, graph inputs that have an initializer but are never used."""
    graph_outputs = model.graph.outputs
    initializers = model.graph.initializers
    graph_inputs = model.graph.inputs
    unused_init_inputs = []
    # Iterate in reverse so that deleting by index does not shift the remaining entries.
    for i, inp in reversed(list(enumerate(graph_inputs))):
        if inp.name in initializers and not (inp.uses() or inp in graph_outputs):
            if do_remove:
                del graph_inputs[i]
            else:
                unused_init_inputs.append(inp.name)
    if unused_init_inputs:
        logger.warning(
            "RemoveUnusedNodesPass: Found unused initialized inputs %s,"
            " consider turning `remove_initialized_inputs` on",
            unused_init_inputs,
        )

And call it from DCE with do_remove=False
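
A sketch of the two modes, using stand-in objects instead of `ir.Model` so that it runs without onnxscript. `Value`, `check_unused_initialized_inputs`, and the logger name are hypothetical, and the helper returns the reported names purely to make the behaviour visible here.

```python
import logging
from dataclasses import dataclass, field

logger = logging.getLogger("dce_sketch")  # hypothetical logger name


@dataclass(eq=False)
class Value:
    """Hypothetical stand-in for an IR value (compared by identity)."""

    name: str
    consumers: list = field(default_factory=list)

    def uses(self):
        return self.consumers


def check_unused_initialized_inputs(graph_inputs, graph_outputs, initializers, do_remove):
    """Same loop as the proposed function, on plain lists and dicts."""
    unused = []
    # Reverse order keeps indices valid while deleting.
    for i, inp in reversed(list(enumerate(graph_inputs))):
        if inp.name in initializers and not (inp.uses() or inp in graph_outputs):
            if do_remove:
                del graph_inputs[i]
            else:
                unused.append(inp.name)
    if unused:
        logger.warning("Found unused initialized inputs %s", unused)
    return unused


x = Value("x", consumers=["Relu"])
w = Value("w")  # initialized input that nothing consumes
inputs = [x, w]
inits = {"w": w}

# Warn-only mode, as suggested for the DCE pass: the inputs stay untouched.
reported = check_unused_initialized_inputs(inputs, [], inits, do_remove=False)
names_after_warn = [v.name for v in inputs]

# Removal mode: the unused initialized input is dropped from the signature.
check_unused_initialized_inputs(inputs, [], inits, do_remove=True)
names_after_remove = [v.name for v in inputs]
```

In warn-only mode the signature is left as it was and only the names are reported, which matches the behaviour requested for DCE in this comment.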

@justinchuby
Collaborator

Thanks. I think that’s reasonable

@leshabirukov
Contributor Author

OK, then two questions:

  1. API: How would a user (me) invoke _maybe_remove_unused_initialized_inputs, and which file should it be in?
  2. Technical: I'm closing this PR, right?

@justinchuby
Collaborator

Sure!

  1. Users can use https://github.com/microsoft/onnxscript/pull/2253/files#diff-11fab46b14484b2a8d9637f8408b1956943d0e74a569128eee2329a4f6c9e344R133 before calling the dce pass
  2. Feel free to do so. I merged in the fixes for the dce pass already, thank you for helping improve the passes!

@leshabirukov
Contributor Author

OK, don't forget about the warning for DCE.

@github-project-automation github-project-automation bot moved this from In Progress to Done in ONNX Script Review Board May 2, 2025
Development

Successfully merging this pull request may close these issues.

If we not removed initialized input, should we keep initializer also?
3 participants