Error in test_integration_nnunetv2_runner #7694

@KumoLiu

Description

[2024-04-20T22:34:15.575Z] 2024-04-20 22:34:15.017497: do_dummy_2d_data_aug: False
[2024-04-20T22:34:15.575Z] 2024-04-20 22:34:15.017690: Creating new 5-fold cross-validation split...
[2024-04-20T22:34:15.575Z] 2024-04-20 22:34:15.018173: Desired fold for training: 0
[2024-04-20T22:34:15.575Z] 2024-04-20 22:34:15.018205: This split has 8 training and 2 validation cases.
[2024-04-20T22:34:15.575Z] using pin_memory on device 0
[2024-04-20T22:34:15.830Z] using pin_memory on device 0
[2024-04-20T22:34:15.830Z] 2024-04-20 22:34:15.437163: Using torch.compile...
[2024-04-20T22:34:16.389Z] /usr/local/lib/python3.10/dist-packages/torch/optim/lr_scheduler.py:28: UserWarning: The verbose parameter is deprecated. Please use get_last_lr() to access the learning rate.
[2024-04-20T22:34:16.390Z]   warnings.warn("The verbose parameter is deprecated. Please use get_last_lr() "
[2024-04-20T22:34:16.390Z] 
[2024-04-20T22:34:16.390Z] This is the configuration used by this training:
[2024-04-20T22:34:16.390Z] Configuration name: 3d_fullres
[2024-04-20T22:34:16.390Z]  {'data_identifier': 'nnUNetPlans_3d_fullres', 'preprocessor_name': 'DefaultPreprocessor', 'batch_size': 2, 'patch_size': [24, 24, 24], 'median_image_size_in_voxels': [21.0, 20.5, 20.5], 'spacing': [1.0, 1.0, 1.0], 'normalization_schemes': ['CTNormalization'], 'use_mask_for_norm': [False], 'resampling_fn_data': 'resample_data_or_seg_to_shape', 'resampling_fn_seg': 'resample_data_or_seg_to_shape', 'resampling_fn_data_kwargs': {'is_seg': False, 'order': 3, 'order_z': 0, 'force_separate_z': None}, 'resampling_fn_seg_kwargs': {'is_seg': True, 'order': 1, 'order_z': 0, 'force_separate_z': None}, 'resampling_fn_probabilities': 'resample_data_or_seg_to_shape', 'resampling_fn_probabilities_kwargs': {'is_seg': False, 'order': 1, 'order_z': 0, 'force_separate_z': None}, 'architecture': {'network_class_name': 'dynamic_network_architectures.architectures.unet.PlainConvUNet', 'arch_kwargs': {'n_stages': 3, 'features_per_stage': [32, 64, 128], 'conv_op': 'torch.nn.modules.conv.Conv3d', 'kernel_sizes': [[3, 3, 3], [3, 3, 3], [3, 3, 3]], 'strides': [[1, 1, 1], [2, 2, 2], [2, 2, 2]], 'n_conv_per_stage': [2, 2, 2], 'n_conv_per_stage_decoder': [2, 2], 'conv_bias': True, 'norm_op': 'torch.nn.modules.instancenorm.InstanceNorm3d', 'norm_op_kwargs': {'eps': 1e-05, 'affine': True}, 'dropout_op': None, 'dropout_op_kwargs': None, 'nonlin': 'torch.nn.LeakyReLU', 'nonlin_kwargs': {'inplace': True}, 'deep_supervision': True}, '_kw_requires_import': ['conv_op', 'norm_op', 'dropout_op', 'nonlin']}, 'batch_dice': False} 
[2024-04-20T22:34:16.390Z] 
[2024-04-20T22:34:16.390Z] These are the global plan.json settings:
[2024-04-20T22:34:16.390Z]  {'dataset_name': 'Dataset001_dataroot', 'plans_name': 'nnUNetPlans', 'original_median_spacing_after_transp': [1.0, 1.0, 1.0], 'original_median_shape_after_transp': [21, 20, 20], 'image_reader_writer': 'SimpleITKIO', 'transpose_forward': [0, 1, 2], 'transpose_backward': [0, 1, 2], 'experiment_planner_used': 'ExperimentPlanner', 'label_manager': 'LabelManager', 'foreground_intensity_properties_per_channel': {'0': {'max': 1.0, 'mean': 0.6936690211296082, 'median': 0.5, 'min': 0.5, 'percentile_00_5': 0.5, 'percentile_99_5': 1.0, 'std': 0.24357101321220398}}} 
[2024-04-20T22:34:16.390Z] 
[2024-04-20T22:34:16.390Z] 2024-04-20 22:34:16.104184: unpacking dataset...
[2024-04-20T22:34:42.874Z] 2024-04-20 22:34:18.737174: unpacking done...
[2024-04-20T22:34:42.874Z] 2024-04-20 22:34:18.737975: Unable to plot network architecture: nnUNet_compile is enabled!
[2024-04-20T22:34:42.874Z] 2024-04-20 22:34:18.767291: 
[2024-04-20T22:34:42.874Z] 2024-04-20 22:34:18.767349: Epoch 0
[2024-04-20T22:34:42.874Z] 2024-04-20 22:34:18.767457: Current learning rate: 0.01
[2024-04-20T22:34:42.874Z] 2024-04-20 22:34:38.642981: train_loss -0.4542
[2024-04-20T22:34:42.874Z] 2024-04-20 22:34:38.643111: val_loss -0.7395
[2024-04-20T22:34:42.874Z] 2024-04-20 22:34:38.643246: Pseudo dice [0.8952, 0.9323]
[2024-04-20T22:34:42.874Z] 2024-04-20 22:34:38.643299: Epoch time: 19.88 s
[2024-04-20T22:34:42.874Z] 2024-04-20 22:34:38.643342: Yayy! New best EMA pseudo Dice: 0.9137
[2024-04-20T22:34:42.874Z] 2024-04-20 22:34:39.571666: Training done.
[2024-04-20T22:34:42.874Z] 2024-04-20 22:34:39.579603: Using splits from existing split file: ./work_dir/nnUNet_preprocessed/Dataset001_dataroot/splits_final.json
[2024-04-20T22:34:42.874Z] 2024-04-20 22:34:39.579757: The split file contains 5 splits.
[2024-04-20T22:34:42.874Z] 2024-04-20 22:34:39.579809: Desired fold for training: 0
[2024-04-20T22:34:42.874Z] 2024-04-20 22:34:39.579854: This split has 8 training and 2 validation cases.
[2024-04-20T22:34:42.874Z] 2024-04-20 22:34:39.579975: predicting case_0
[2024-04-20T22:34:42.874Z] 2024-04-20 22:34:39.581187: case_0, shape torch.Size([1, 22, 22, 21]), rank 0
[2024-04-20T22:34:42.874Z] Prediction on device was unsuccessful, probably due to a lack of memory. Moving results arrays to CPU
[2024-04-20T22:34:42.874Z] Traceback (most recent call last):
[2024-04-20T22:34:42.874Z]   File "/usr/local/bin/nnUNetv2_train", line 8, in <module>
[2024-04-20T22:34:42.874Z]     sys.exit(run_training_entry())
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/nnunetv2/run/run_training.py", line 274, in run_training_entry
[2024-04-20T22:34:42.874Z]     run_training(args.dataset_name_or_id, args.configuration, args.fold, args.tr, args.p, args.pretrained_weights,
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/nnunetv2/run/run_training.py", line 214, in run_training
[2024-04-20T22:34:42.874Z]     nnunet_trainer.perform_actual_validation(export_validation_probabilities)
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/nnunetv2/training/nnUNetTrainer/nnUNetTrainer.py", line 1211, in perform_actual_validation
[2024-04-20T22:34:42.874Z]     prediction = predictor.predict_sliding_window_return_logits(data)
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
[2024-04-20T22:34:42.874Z]     return func(*args, **kwargs)
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/nnunetv2/inference/predict_from_raw_data.py", line 649, in predict_sliding_window_return_logits
[2024-04-20T22:34:42.874Z]     predicted_logits = self._internal_predict_sliding_window_return_logits(data, slicers, False)
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/nnunetv2/inference/predict_from_raw_data.py", line 607, in _internal_predict_sliding_window_return_logits
[2024-04-20T22:34:42.874Z]     raise e
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/nnunetv2/inference/predict_from_raw_data.py", line 590, in _internal_predict_sliding_window_return_logits
[2024-04-20T22:34:42.874Z]     prediction = self._internal_maybe_mirror_and_predict(workon)[0].to(results_device)
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/nnunetv2/inference/predict_from_raw_data.py", line 537, in _internal_maybe_mirror_and_predict
[2024-04-20T22:34:42.874Z]     prediction = self.network(x)
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
[2024-04-20T22:34:42.874Z]     return self._call_impl(*args, **kwargs)
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1520, in _call_impl
[2024-04-20T22:34:42.874Z]     return forward_call(*args, **kwargs)
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/eval_frame.py", line 454, in _fn
[2024-04-20T22:34:42.874Z]     return fn(*args, **kwargs)
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
[2024-04-20T22:34:42.874Z]     return self._call_impl(*args, **kwargs)
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1520, in _call_impl
[2024-04-20T22:34:42.874Z]     return forward_call(*args, **kwargs)
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 904, in catch_errors
[2024-04-20T22:34:42.874Z]     return callback(frame, cache_entry, hooks, frame_state, skip=1)
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 769, in _convert_frame
[2024-04-20T22:34:42.874Z]     result = inner_convert(
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 398, in _convert_frame_assert
[2024-04-20T22:34:42.874Z]     return _compile(
[2024-04-20T22:34:42.874Z]   File "/usr/lib/python3.10/contextlib.py", line 79, in inner
[2024-04-20T22:34:42.874Z]     return func(*args, **kwds)
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 669, in _compile
[2024-04-20T22:34:42.874Z]     guarded_code = compile_inner(code, one_graph, hooks, transform)
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/utils.py", line 249, in time_wrapper
[2024-04-20T22:34:42.874Z]     r = func(*args, **kwargs)
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 542, in compile_inner
[2024-04-20T22:34:42.874Z]     out_code = transform_code_object(code, transform)
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/bytecode_transformation.py", line 1033, in transform_code_object
[2024-04-20T22:34:42.874Z]     transformations(instructions, code_options)
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 163, in _fn
[2024-04-20T22:34:42.874Z]     return fn(*args, **kwargs)
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 507, in transform
[2024-04-20T22:34:42.874Z]     tracer.run()
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/symbolic_convert.py", line 2122, in run
[2024-04-20T22:34:42.874Z]     super().run()
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/symbolic_convert.py", line 785, in run
[2024-04-20T22:34:42.874Z]     and self.step()
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/symbolic_convert.py", line 748, in step
[2024-04-20T22:34:42.874Z]     getattr(self, inst.opname)(inst)
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/symbolic_convert.py", line 2241, in RETURN_VALUE
[2024-04-20T22:34:42.874Z]     self.output.compile_subgraph(
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/output_graph.py", line 931, in compile_subgraph
[2024-04-20T22:34:42.874Z]     self.compile_and_call_fx_graph(tx, list(reversed(stack_values)), root)
[2024-04-20T22:34:42.874Z]   File "/usr/lib/python3.10/contextlib.py", line 79, in inner
[2024-04-20T22:34:42.874Z]     return func(*args, **kwds)
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/output_graph.py", line 1102, in compile_and_call_fx_graph
[2024-04-20T22:34:42.874Z]     compiled_fn = self.call_user_compiler(gm)
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/utils.py", line 249, in time_wrapper
[2024-04-20T22:34:42.874Z]     r = func(*args, **kwargs)
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/output_graph.py", line 1175, in call_user_compiler
[2024-04-20T22:34:42.874Z]     raise BackendCompilerFailed(self.compiler_fn, e).with_traceback(
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/output_graph.py", line 1156, in call_user_compiler
[2024-04-20T22:34:42.874Z]     compiled_fn = compiler_fn(gm, self.example_inputs())
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/repro/after_dynamo.py", line 117, in debug_wrapper
[2024-04-20T22:34:42.874Z]     compiled_gm = compiler_fn(gm, example_inputs)
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/__init__.py", line 1730, in __call__
[2024-04-20T22:34:42.874Z]     return compile_fx(model_, inputs_, config_patches=self.config)
[2024-04-20T22:34:42.874Z]   File "/usr/lib/python3.10/contextlib.py", line 79, in inner
[2024-04-20T22:34:42.874Z]     return func(*args, **kwds)
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/compile_fx.py", line 1321, in compile_fx
[2024-04-20T22:34:42.874Z]     return aot_autograd(
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/backends/common.py", line 57, in compiler_fn
[2024-04-20T22:34:42.874Z]     cg = aot_module_simplified(gm, example_inputs, **kwargs)
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/_functorch/aot_autograd.py", line 891, in aot_module_simplified
[2024-04-20T22:34:42.874Z]     compiled_fn = create_aot_dispatcher_function(
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/utils.py", line 249, in time_wrapper
[2024-04-20T22:34:42.874Z]     r = func(*args, **kwargs)
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/_functorch/aot_autograd.py", line 604, in create_aot_dispatcher_function
[2024-04-20T22:34:42.874Z]     compiled_fn = compiler_fn(flat_fn, fake_flat_args, aot_config, fw_metadata=fw_metadata)
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/_functorch/_aot_autograd/runtime_wrappers.py", line 434, in aot_wrapper_dedupe
[2024-04-20T22:34:42.874Z]     return compiler_fn(flat_fn, leaf_flat_args, aot_config, fw_metadata=fw_metadata)
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/_functorch/_aot_autograd/runtime_wrappers.py", line 639, in aot_wrapper_synthetic_base
[2024-04-20T22:34:42.874Z]     return compiler_fn(flat_fn, flat_args, aot_config, fw_metadata=fw_metadata)
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/_functorch/_aot_autograd/jit_compile_runtime_wrappers.py", line 72, in aot_dispatch_base
[2024-04-20T22:34:42.874Z]     fw_module, updated_flat_args, maybe_subclass_meta = aot_dispatch_base_graph(  # type: ignore[misc]
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/_functorch/_aot_autograd/dispatch_and_compile_graph.py", line 90, in aot_dispatch_base_graph
[2024-04-20T22:34:42.874Z]     fw_module = _create_graph(
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/_functorch/_aot_autograd/dispatch_and_compile_graph.py", line 40, in _create_graph
[2024-04-20T22:34:42.874Z]     fx_g = make_fx(
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/fx/experimental/proxy_tensor.py", line 1099, in wrapped
[2024-04-20T22:34:42.874Z]     t = dispatch_trace(wrap_key(func, args, fx_tracer, pre_dispatch), tracer=fx_tracer, concrete_args=tuple(phs))
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/_compile.py", line 24, in inner
[2024-04-20T22:34:42.874Z]     return torch._dynamo.disable(fn, recursive)(*args, **kwargs)
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/eval_frame.py", line 454, in _fn
[2024-04-20T22:34:42.874Z]     return fn(*args, **kwargs)
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/external_utils.py", line 25, in inner
[2024-04-20T22:34:42.874Z]     return fn(*args, **kwargs)
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/fx/experimental/proxy_tensor.py", line 550, in dispatch_trace
[2024-04-20T22:34:42.874Z]     graph = tracer.trace(root, concrete_args)
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/eval_frame.py", line 454, in _fn
[2024-04-20T22:34:42.874Z]     return fn(*args, **kwargs)
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/external_utils.py", line 25, in inner
[2024-04-20T22:34:42.874Z]     return fn(*args, **kwargs)
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/fx/_symbolic_trace.py", line 793, in trace
[2024-04-20T22:34:42.874Z]     (self.create_arg(fn(*args)),),
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/fx/experimental/proxy_tensor.py", line 575, in wrapped
[2024-04-20T22:34:42.874Z]     track_tensor_tree(flat_tensors, flat_proxies, constant=None, tracer=tracer)
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/fx/experimental/proxy_tensor.py", line 246, in track_tensor_tree
[2024-04-20T22:34:42.874Z]     wrap_with_proxy(inner_res, proxy_res, constant)
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/fx/experimental/proxy_tensor.py", line 220, in wrap_with_proxy
[2024-04-20T22:34:42.874Z]     wrap_with_proxy(ee, proxy[idx], get_constant(idx))
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/fx/experimental/proxy_tensor.py", line 206, in wrap_with_proxy
[2024-04-20T22:34:42.874Z]     set_meta(proxy, e)
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/fx/experimental/proxy_tensor.py", line 167, in set_meta
[2024-04-20T22:34:42.874Z]     proxy.node.meta['val'] = extract_val(val)
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/fx/experimental/proxy_tensor.py", line 137, in extract_val
[2024-04-20T22:34:42.874Z]     return snapshot_fake(val)
[2024-04-20T22:34:42.874Z]   File "/usr/local/lib/python3.10/dist-packages/torch/fx/experimental/proxy_tensor.py", line 133, in snapshot_fake
[2024-04-20T22:34:42.874Z]     return val.detach()
[2024-04-20T22:34:42.874Z] torch._dynamo.exc.BackendCompilerFailed: backend='inductor' raised:
[2024-04-20T22:34:42.874Z] RuntimeError: Cannot set version_counter for inference tensor
[2024-04-20T22:34:42.874Z] 
[2024-04-20T22:34:42.874Z] Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information
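
The traceback above fails inside the `inductor` backend while validation runs the compiled network, and the log notes that `nnUNet_compile` is enabled. One way to check whether `torch.compile` is the trigger might be to rerun the same training with compilation switched off. The sketch below is only an illustration: it assumes nnU-Net v2 reads the `nnUNet_compile` environment variable and treats values other than `true`/`1`/`t` as disabled, and the helper names `build_training_env` and `build_training_command` are hypothetical, not part of nnU-Net or MONAI. The dataset, configuration, and fold are the ones shown in the log.

```python
import os


def build_training_env(base_env=None):
    """Return a copy of the environment with nnU-Net compilation turned off.

    Assumption: nnU-Net v2 only compiles when nnUNet_compile is one of
    "true", "1" or "t" (case-insensitive), so "f" should disable it.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["nnUNet_compile"] = "f"
    return env


def build_training_command(dataset="Dataset001_dataroot",
                           configuration="3d_fullres",
                           fold=0):
    """Build the nnUNetv2_train command line for the run shown in the log."""
    return ["nnUNetv2_train", dataset, configuration, str(fold)]


if __name__ == "__main__":
    env = build_training_env()
    cmd = build_training_command()
    # Only prints the command; on a machine with nnU-Net installed it could be
    # passed to subprocess.run(cmd, env=env, check=True) instead.
    print("would run:", " ".join(cmd), "with nnUNet_compile =", env["nnUNet_compile"])
```

If validation passes with compilation disabled, that would point at the interaction between `torch.compile` and the inference-mode prediction path rather than at the test data.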
