Skip to content

attach monitoring data to hdf5 file#2901

Merged
maxnoe merged 41 commits into
mainfrom
mondata_merger
Jan 13, 2026
Merged

attach monitoring data to hdf5 file#2901
maxnoe merged 41 commits into
mainfrom
mondata_merger

Conversation

@TjarkMiener
Copy link
Copy Markdown
Member

@TjarkMiener TjarkMiener commented Dec 3, 2025

This PR tweaks the merger component to add functionalities listed in #2900 (comment)

closes #2890

@TjarkMiener TjarkMiener marked this pull request as ready for review December 4, 2025 12:14
Comment thread src/ctapipe/io/hdf5merger.py Outdated
Comment thread src/ctapipe/io/hdf5merger.py Outdated
Comment thread src/ctapipe/io/hdf5merger.py Outdated
Comment thread src/ctapipe/io/hdf5merger.py
Comment thread src/ctapipe/io/hdf5merger.py Outdated
Comment thread src/ctapipe/conftest.py Outdated
@TjarkMiener TjarkMiener marked this pull request as draft December 4, 2025 17:35
@TjarkMiener TjarkMiener marked this pull request as ready for review December 5, 2025 14:00
@TjarkMiener TjarkMiener requested a review from mexanick December 5, 2025 14:00
@TjarkMiener TjarkMiener marked this pull request as draft December 8, 2025 16:18
@TjarkMiener
Copy link
Copy Markdown
Member Author

Back to draft mode since new test files will be uploaded soon. Requires some updates then.

@TjarkMiener
Copy link
Copy Markdown
Member Author

Should be rebase to #2881

Comment thread src/ctapipe/io/hdf5eventsource.py
Comment thread src/ctapipe/io/tests/test_hdf5eventsource.py Outdated
@TjarkMiener TjarkMiener changed the base branch from main to update_hdf_format_monitoring December 11, 2025 13:51
@TjarkMiener
Copy link
Copy Markdown
Member Author

Should be rebase to #2881

With the data format version bump of #2881 (and the additional here), the tests are failing because the interface i0.2.0 test data were created with data format v7.2.0. I will re-create with the latest version, but it is raising a crucial point: We need to allow attaching of monitoring from different data format versions, in order not to break the CI with the interface file when (unrelated) data format changes results in a bump of version.

@maxnoe
Copy link
Copy Markdown
Member

maxnoe commented Dec 12, 2025

As far as I can tell, the CI failure has nothing to do with data format versions.

And this also should never be the case, as long as we support the older versions (which we should, to some point).

@maxnoe
Copy link
Copy Markdown
Member

maxnoe commented Dec 12, 2025

E.g. this is simply a bug:

>               f"Input file {other.filename:!r} has different data model version:"
                             ^^^^^^^^^^^^^^^^^^^
                f" {other_version}, expected {self.data_model_version}"
            )
E           ValueError: Invalid format specifier '!r' for object of type 'str'

The colon is wrong here, it needs to be {other.filename!r} not {other.filename:!r}

Comment thread src/ctapipe/io/hdf5merger.py Outdated
Comment thread src/ctapipe/io/hdf5merger.py Outdated
Comment thread src/ctapipe/io/hdf5merger.py Outdated
@TjarkMiener TjarkMiener marked this pull request as ready for review December 12, 2025 10:42
@TjarkMiener TjarkMiener requested a review from maxnoe December 12, 2025 10:42
Comment thread src/ctapipe/io/hdf5merger.py
Comment thread src/ctapipe/io/hdf5merger.py Outdated
Comment thread src/ctapipe/io/hdf5merger.py
Comment thread src/ctapipe/io/hdf5monitoringsource.py
Comment thread src/ctapipe/tools/process.py Outdated
@TjarkMiener
Copy link
Copy Markdown
Member Author

I needed to give some thoughts about the different options in the merger component. As I see it now we have three main usecases to cover with this component:

  1. Merging of simulation, where we have many obs_id
  2. Merging of chunks in the same observational block
  3. Attaching of monitoring data to already merged products

This implies the following:

  1. Process tool should not attach monitoring data to the resulting HDF5 file.
  2. monitoring trait is redundant should be removed because monitoring data should only be attached via attach_monitoring trait flag.
  3. single-ob and attach_monitoring can not be True at the same time and should result in a TraitError.
  4. attach_monitoring can only be True if we are in appending mode (i.e. output file needs to exist)

@maxnoe @mexanick please let me know if you agree so I can add this logic to the option of this component.

Comment thread src/ctapipe/io/hdf5merger.py Outdated
@TjarkMiener
Copy link
Copy Markdown
Member Author

TjarkMiener commented Jan 7, 2026

From my side this PR would be ready for review, we just had an internal discussion yesterday if we should allow or forbid merging monitoring data to the event data. What are your thoughts on this, @maxnoe @kosack? In the current version of this PR, we allow this behaviour.

EDIT:

Yes, events files should be able to contain monitoring data.

Answered by @maxnoe in discussion above

Comment thread src/ctapipe/io/hdf5merger.py
add also a test to raise CannotMerge for invalid merge of different monitoring types
@TjarkMiener TjarkMiener requested a review from mexanick January 8, 2026 10:35
mexanick
mexanick previously approved these changes Jan 8, 2026
Copy link
Copy Markdown
Contributor

@mexanick mexanick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@maxnoe I think the only remaining open topic is considering the check whether the file contains simulation data. I would prefer to use metadata for it and not the negative event IDs. Re-sonar complexity complain, I'd accept it.

@maxnoe
Copy link
Copy Markdown
Member

maxnoe commented Jan 8, 2026

Yes, metadata is certainly better than the event ids, which shouldn't even exist, it's a very bad workaround due to a simtel limitation of not assigning unique event ids to calibration events.

@mexanick
Copy link
Copy Markdown
Contributor

@maxnoe @kosack could you please review? This PR is highly desirable for the upcoming DPPS release.

Comment thread src/ctapipe/io/datawriter.py
@maxnoe
Copy link
Copy Markdown
Member

maxnoe commented Jan 12, 2026

The only thing I see missing is the resolution of the attach_monitoring vs. monitoring discussion.

I thought we agreed to replace attach_monitoring by something like: file_type = CaseLessStrEenum(["events", "monitoring"])?

This removes boolean flags of single-ob and attach-monitoring
Comment thread src/ctapipe/io/hdf5merger.py Outdated
TjarkMiener and others added 3 commits January 12, 2026 14:03
" observation blocks. This option switches to merging multiple"
" chunks of events of the same ob."
),
),
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can keep this flag (and add one for the monitoring merging) by doing:

        "single-ob": (
            {"HDF5Merger": {"merge_strategy":  "events-single-ob"}},
            (
                "By default, the merge tool assumes it is merging multiple"
                " observation blocks. This option switches to merging multiple"
                " chunks of events of the same ob."
            ),
        )

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good! Done in 3a71ef0.

@TjarkMiener TjarkMiener requested a review from mexanick January 12, 2026 13:45
@ctao-sonarqube
Copy link
Copy Markdown

ctao-sonarqube Bot commented Jan 12, 2026

@maxnoe maxnoe merged commit 8b5da7a into main Jan 13, 2026
12 of 13 checks passed
@maxnoe maxnoe deleted the mondata_merger branch January 13, 2026 17:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Merging pixel stats of same event type breaks trigger table logic

4 participants