Skip to content

Disp reconstruction#2138

Merged
maxnoe merged 26 commits into
cta-observatory:masterfrom
LukasBeiske:ml_disp_new
Jan 19, 2023
Merged

Disp reconstruction#2138
maxnoe merged 26 commits into
cta-observatory:masterfrom
LukasBeiske:ml_disp_new

Conversation

@LukasBeiske
Copy link
Copy Markdown
Contributor

This is a continuation of #2064 after #1767 was merged.

For now I only implemented a simple weighted average to combine single telescope predictions into event predictions. The estimation of the error of this average seems to underestimate the error a bit, based on a small MC study I did.
I will look into other ways to combine the telescope predictions in the future (including other error estimates), but for now this is better than no error at all, in my opinion.

Comment thread ctapipe/coordinates/disp.py Outdated
Comment thread ctapipe/coordinates/disp.py Outdated
Comment thread ctapipe/reco/sklearn.py Outdated
Comment thread ctapipe/reco/sklearn.py Outdated
Comment thread ctapipe/reco/sklearn.py
Comment thread ctapipe/reco/sklearn.py Outdated
@codecov
Copy link
Copy Markdown

codecov Bot commented Nov 25, 2022

Codecov Report

Base: 92.75% // Head: 92.78% // Increases project coverage by +0.02% 🎉

Coverage data is based on head (596811b) compared to base (0e4ffea).
Patch coverage: 94.27% of modified lines in pull request are covered.

❗ Current head 596811b differs from pull request most recent head 22c1ed0. Consider uploading reports for the commit 22c1ed0 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2138      +/-   ##
==========================================
+ Coverage   92.75%   92.78%   +0.02%     
==========================================
  Files         216      216              
  Lines       18050    18385     +335     
==========================================
+ Hits        16742    17058     +316     
- Misses       1308     1327      +19     
Impacted Files Coverage Δ
ctapipe/tools/process.py 87.87% <ø> (+1.21%) ⬆️
ctapipe/tools/train_energy_regressor.py 85.29% <ø> (ø)
ctapipe/tools/train_particle_classifier.py 86.58% <ø> (ø)
ctapipe/tools/train_disp_reconstructor.py 87.05% <87.05%> (ø)
ctapipe/conftest.py 94.42% <91.66%> (-0.17%) ⬇️
ctapipe/tools/apply_models.py 95.04% <92.00%> (+1.94%) ⬆️
ctapipe/reco/sklearn.py 94.60% <93.41%> (-0.89%) ⬇️
ctapipe/reco/stereo_combination.py 95.31% <96.15%> (+0.27%) ⬆️
ctapipe/containers.py 100.00% <100.00%> (ø)
ctapipe/coordinates/disp.py 100.00% <100.00%> (ø)
... and 14 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

Comment thread ctapipe/reco/sklearn.py Outdated
Comment thread ctapipe/reco/sklearn.py Outdated
Comment thread ctapipe/reco/sklearn.py Outdated
Comment thread ctapipe/reco/sklearn.py Outdated
Comment thread ctapipe/reco/sklearn.py Outdated
Comment thread ctapipe/containers.py Outdated
Comment thread ctapipe/reco/sklearn.py Outdated
Comment thread ctapipe/containers.py Outdated
Comment thread ctapipe/reco/sklearn.py Outdated
Comment thread ctapipe/coordinates/disp.py Outdated
Comment thread ctapipe/coordinates/disp.py Outdated
Comment thread ctapipe/tools/train_disp_reconstructor.py
Comment thread ctapipe/reco/stereo_combination.py Outdated
Comment thread ctapipe/reco/stereo_combination.py Outdated
Comment thread ctapipe/reco/stereo_combination.py Outdated
Comment thread ctapipe/reco/sklearn.py Outdated
Comment thread ctapipe/reco/sklearn.py
@LukasBeiske
Copy link
Copy Markdown
Contributor Author

I seem to have missed that commit 71672d8 breaks apply-models for the energy regressor (haven't tested the particle classifier yet) which is also why some tests are failing atm.
However, I don't understand which change causes this error, especially since I did not actually change anything at the point in the code where the error occurs.

Traceback (most recent call last):
  File "/scratch/lbeiske/envs/cta/lib/python3.8/site-packages/tables/table.py", line 2192, in append
    wbufRA = np.rec.array(rows, dtype=self._v_dtype)
  File "/scratch/lbeiske/envs/cta/lib/python3.8/site-packages/numpy/core/records.py", line 1078, in array
    new = obj.view(dtype)
ValueError: When changing to a larger dtype, its size must be a divisor of the total size in bytes of the last axis of the array.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/net/nfshome/home/lbeiske/ctapipe/ctapipe/core/tool.py", line 358, in run
    self.start()
  File "/net/nfshome/home/lbeiske/ctapipe/ctapipe/tools/apply_models.py", line 178, in start
    self._apply(reconstructor)
  File "/net/nfshome/home/lbeiske/ctapipe/ctapipe/tools/apply_models.py", line 272, in _apply
    write_table(
  File "/net/nfshome/home/lbeiske/ctapipe/ctapipe/io/astropy_helpers.py", line 211, in write_table
    h5_table.append(table.as_array())
  File "/scratch/lbeiske/envs/cta/lib/python3.8/site-packages/tables/table.py", line 2194, in append
    raise ValueError("rows parameter cannot be converted into a "
ValueError: rows parameter cannot be converted into a recarray object compliant with table '/dl2/event/telescope/energy/RandomForestRegressor/tel_001 (Table(18158,)fletcher32, shuffle, blosc:zstd(5)) '''. The error was: <When changing to a larger dtype, its size must be a divisor of the total size in bytes of the last axis of the array.>

@nbiederbeck
Copy link
Copy Markdown
Contributor

nbiederbeck commented Dec 2, 2022

We found the issue in

observation_table = self.read_observation_information(start=start, stop=stop)
table = join_allow_empty(
table, observation_table, keys="obs_id", join_type="left"
)

(Pdb) table['obs_id'].dtype
dtype('int32')
(Pdb) observation_table['obs_id'].dtype
dtype('uint64')

The join makes a float64 out of this.

Do we need to reprocess the files after #2096?

Comment thread ctapipe/io/tableloader.py Outdated
Comment thread ctapipe/reco/preprocessing.py
Comment thread ctapipe/reco/stereo_combination.py Outdated
Comment thread ctapipe/tools/tests/test_train.py
Comment thread ctapipe/tools/train_disp_reconstructor.py Outdated
Comment thread ctapipe/reco/sklearn.py Outdated
nbiederbeck
nbiederbeck previously approved these changes Dec 8, 2022
Comment thread ctapipe/reco/sklearn.py Outdated
@maxnoe
Copy link
Copy Markdown
Member

maxnoe commented Dec 8, 2022

@LukasBeiske Could you show some performance plots if you think this is ready?

@LukasBeiske
Copy link
Copy Markdown
Contributor Author

LukasBeiske commented Dec 8, 2022

Sure. cv_disp.pdf these are the performance plots for the model I am currently using.

This is the config I am using atm:

TrainDispReconstructor:
  CrossValidator:
    n_cross_validations: 5

  random_seed: 42

  DispReconstructor:
    norm_cls: RandomForestRegressor
    norm_config:
      n_estimators: 69
      max_features: 0.5227
      max_samples: 0.7138
      min_samples_leaf: 0.000013
      n_jobs: 40

    log_target: True

    sign_cls: RandomForestClassifier
    sign_config:
      n_estimators: 343
      max_features: 0.6587
      max_samples: 0.5815
      min_samples_leaf: 0.000035
      n_jobs: 40

    features:
      - log_RandomForestRegressor_energy
      - log_tel_impact_distance
      - log_RandomForestRegressor_tel_energy
      - log_abs_timing_slope
      - peak_time_std
      - concentration_pixel
      - hillas_length
      - concentration_cog
      - timing_deviation
      - scaled_length
      - HillasReconstructor_h_max
      - RandomForestRegressor_energy
      - area
      - peak_time_kurtosis
      - RandomForestRegressor_tel_energy
      - timing_slope
      - hillas_skewness
      - HillasReconstructor_core_x
      - HillasReconstructor_core_y

    QualityQuery:
      quality_criteria:
        - ["enough intensity", "hillas_intensity > 50"]
        - ["Positive width", "hillas_width > 0"]
        - ["enough pixels", "morphology_n_pixels > 3"]
        - ["not clipped", "leakage_intensity_width_2 < 0.5"]
        - ["HillasValid", "HillasReconstructor_is_valid"]

    FeatureGenerator:
      features:
        - ["area", "hillas_width * hillas_length"]
        - ["log_RandomForestRegressor_energy", "log(RandomForestRegressor_energy)"]
        - ["log_RandomForestRegressor_tel_energy", "log(RandomForestRegressor_tel_energy)"]
        - ["log_tel_impact_distance", "log(HillasReconstructor_tel_impact_distance)"]
        - ["log_abs_timing_slope", "log(abs(timing_slope))"]

@maxnoe
Copy link
Copy Markdown
Member

maxnoe commented Dec 8, 2022

Why is core-y so important? That's confusing me a bit.

Could you show the single LST performance? I.e. no stereo features for LST?

@LukasBeiske
Copy link
Copy Markdown
Contributor Author

For mono I used this config:

TrainDispReconstructor:
  CrossValidator:
    n_cross_validations: 5

  random_seed: 42

  DispReconstructor:
    norm_cls: RandomForestRegressor
    norm_config:
      n_estimators: 100
      max_features: "sqrt"
      min_samples_leaf: 0.00001
      n_jobs: 40

    log_target: True

    sign_cls: RandomForestClassifier
    sign_config:
      n_estimators: 100
      max_features: "sqrt"
      min_samples_split: 0.00001
      n_jobs: 40

    features:
      - hillas_intensity
      - hillas_length
      - hillas_width
      - hillas_skewness
      - hillas_kurtosis
      - timing_slope
      - timing_deviation
      - peak_time_std
      - concentration_cog
      - leakage_intensity_width_2
      - morphology_n_pixels
      - scaled_length
      - scaled_width
      - RandomForestRegressor_energy
      - log_abs_timing_slope

    QualityQuery:
      quality_criteria:
        - ["enough intensity", "hillas_intensity > 50"]
        - ["Positive width", "hillas_width > 0"]
        - ["enough pixels", "morphology_n_pixels > 3"]
        - ["not clipped", "leakage_intensity_width_2 < 0.5"]

    FeatureGenerator:
      features:
        - ["log_abs_timing_slope", "log(abs(timing_slope))"]

And the performance plots still look pretty good imo (sign performs slightly worse): cv_disp_mono.pdf

@maxnoe maxnoe requested a review from kosack January 19, 2023 14:58
Comment thread ctapipe/reco/stereo_combination.py
@maxnoe maxnoe merged commit 64cbea6 into cta-observatory:master Jan 19, 2023
@LukasBeiske LukasBeiske deleted the ml_disp_new branch January 19, 2023 16:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

module:reco issues related to ctapipe.reco new functionality

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants