Disp reconstruction #2138
Codecov Report
Base: 92.75% // Head: 92.78% // Increases project coverage by
Additional details and impacted files
@@ Coverage Diff @@
## master #2138 +/- ##
==========================================
+ Coverage 92.75% 92.78% +0.02%
==========================================
Files 216 216
Lines 18050 18385 +335
==========================================
+ Hits 16742 17058 +316
- Misses 1308 1327 +19
|
I seem to have missed that commit 71672d8 breaks |
|
We found the issue in ctapipe/ctapipe/io/tableloader.py, lines 326 to 329 in f789773:
(Pdb) table['obs_id'].dtype
dtype('int32')
(Pdb) observation_table['obs_id'].dtype
dtype('uint64')
The join makes a float64 out of this. Do we need to reprocess the files after #2096?
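The float64 result follows from NumPy's type promotion rules: no integer type can hold both the int32 and the uint64 range, so the common type is float64. A minimal sketch of the effect, with hypothetical array names and a cast that is only one possible workaround (not necessarily the fix adopted in this PR):

```python
import numpy as np

# Common dtype NumPy picks when int32 and uint64 have to be combined
promoted = np.result_type(np.int32, np.uint64)
print(promoted)

# Hypothetical stand-ins for the two obs_id columns being joined
obs_id_events = np.array([1, 2, 3], dtype=np.int32)
obs_id_observations = np.array([1, 2, 3], dtype=np.uint64)

# Assumption: the ids are non-negative, so casting to uint64 is lossless
aligned = obs_id_events.astype(np.uint64)
common = np.result_type(aligned, obs_id_observations)
```

With both key columns on the same integer dtype, the combined dtype stays an integer type.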
|
@LukasBeiske Could you show some performance plots if you think this is ready? |
|
Sure. These are the performance plots for the model I am currently using: cv_disp.pdf. This is the config I am using at the moment:
TrainDispReconstructor:
  CrossValidator:
    n_cross_validations: 5
  random_seed: 42
  DispReconstructor:
    norm_cls: RandomForestRegressor
    norm_config:
      n_estimators: 69
      max_features: 0.5227
      max_samples: 0.7138
      min_samples_leaf: 0.000013
      n_jobs: 40
    log_target: True
    sign_cls: RandomForestClassifier
    sign_config:
      n_estimators: 343
      max_features: 0.6587
      max_samples: 0.5815
      min_samples_leaf: 0.000035
      n_jobs: 40
    features:
      - log_RandomForestRegressor_energy
      - log_tel_impact_distance
      - log_RandomForestRegressor_tel_energy
      - log_abs_timing_slope
      - peak_time_std
      - concentration_pixel
      - hillas_length
      - concentration_cog
      - timing_deviation
      - scaled_length
      - HillasReconstructor_h_max
      - RandomForestRegressor_energy
      - area
      - peak_time_kurtosis
      - RandomForestRegressor_tel_energy
      - timing_slope
      - hillas_skewness
      - HillasReconstructor_core_x
      - HillasReconstructor_core_y
    QualityQuery:
      quality_criteria:
        - ["enough intensity", "hillas_intensity > 50"]
        - ["Positive width", "hillas_width > 0"]
        - ["enough pixels", "morphology_n_pixels > 3"]
        - ["not clipped", "leakage_intensity_width_2 < 0.5"]
        - ["HillasValid", "HillasReconstructor_is_valid"]
    FeatureGenerator:
      features:
        - ["area", "hillas_width * hillas_length"]
        - ["log_RandomForestRegressor_energy", "log(RandomForestRegressor_energy)"]
        - ["log_RandomForestRegressor_tel_energy", "log(RandomForestRegressor_tel_energy)"]
        - ["log_tel_impact_distance", "log(HillasReconstructor_tel_impact_distance)"]
        - ["log_abs_timing_slope", "log(abs(timing_slope))"]
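The config trains two models: a regressor for the norm of disp (on a log target, given log_target: True) and a classifier for its sign. As a rough sketch of how the two outputs are usually combined in the disp method, assuming the regressor output is inverted with exp and the classifier returns +1 or -1 (the function and argument names here are hypothetical, not ctapipe's API):

```python
import numpy as np


def disp_to_position(cog_lon, cog_lat, psi, log_norm, sign):
    """Shift the image centroid along the main shower axis by the signed disp.

    Assumptions: log_norm is the regressor prediction of log(|disp|),
    sign is +1 or -1 from the classifier, psi is the Hillas orientation
    angle in radians, and all positions share the same angular unit.
    """
    disp = sign * np.exp(log_norm)
    lon = cog_lon + disp * np.cos(psi)
    lat = cog_lat + disp * np.sin(psi)
    return lon, lat


# Centroid at the origin, axis along lon, |disp| = 0.5, negative sign
lon, lat = disp_to_position(0.0, 0.0, 0.0, np.log(0.5), -1)
```

The sign decides on which side of the centroid, along the shower axis, the source position is placed, which is why a slightly worse sign classifier directly affects the angular resolution.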
|
Why is core-y so important? That's confusing me a bit. Could you show the single LST performance? I.e. no stereo features for LST? |
|
For mono I used this config:
TrainDispReconstructor:
  CrossValidator:
    n_cross_validations: 5
  random_seed: 42
  DispReconstructor:
    norm_cls: RandomForestRegressor
    norm_config:
      n_estimators: 100
      max_features: "sqrt"
      min_samples_leaf: 0.00001
      n_jobs: 40
    log_target: True
    sign_cls: RandomForestClassifier
    sign_config:
      n_estimators: 100
      max_features: "sqrt"
      min_samples_split: 0.00001
      n_jobs: 40
    features:
      - hillas_intensity
      - hillas_length
      - hillas_width
      - hillas_skewness
      - hillas_kurtosis
      - timing_slope
      - timing_deviation
      - peak_time_std
      - concentration_cog
      - leakage_intensity_width_2
      - morphology_n_pixels
      - scaled_length
      - scaled_width
      - RandomForestRegressor_energy
      - log_abs_timing_slope
    QualityQuery:
      quality_criteria:
        - ["enough intensity", "hillas_intensity > 50"]
        - ["Positive width", "hillas_width > 0"]
        - ["enough pixels", "morphology_n_pixels > 3"]
        - ["not clipped", "leakage_intensity_width_2 < 0.5"]
    FeatureGenerator:
      features:
        - ["log_abs_timing_slope", "log(abs(timing_slope))"]
And the performance plots still look pretty good in my opinion (the sign performs slightly worse): cv_disp_mono.pdf
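To make the QualityQuery and FeatureGenerator entries of the mono config concrete, here is a small NumPy sketch of what the expressions compute. The input values are made up for illustration, and applying the cuts before generating the feature is an assumption of this sketch, not a statement about the order ctapipe uses:

```python
import numpy as np

# Made-up per-telescope parameters for three events (illustration only)
table = {
    "hillas_intensity": np.array([30.0, 120.0, 500.0]),
    "hillas_width": np.array([0.02, 0.0, 0.05]),
    "morphology_n_pixels": np.array([5, 8, 12]),
    "leakage_intensity_width_2": np.array([0.1, 0.2, 0.3]),
    "timing_slope": np.array([-2.0, 4.0, -8.0]),
}

# The four quality criteria, combined with a logical AND
mask = (
    (table["hillas_intensity"] > 50)
    & (table["hillas_width"] > 0)
    & (table["morphology_n_pixels"] > 3)
    & (table["leakage_intensity_width_2"] < 0.5)
)

# Generated feature: log(abs(timing_slope)) for the surviving events
log_abs_timing_slope = np.log(np.abs(table["timing_slope"][mask]))
```

The first event fails the intensity cut and the second fails the width cut, so only the third event is kept.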
This is a continuation of #2064 after #1767 was merged.
For now I have only implemented a simple weighted average to combine the single-telescope predictions into event predictions. Based on a small MC study I did, the uncertainty estimate for this average seems to slightly underestimate the true error.
I will look into other ways to combine the telescope predictions in the future (including other error estimates), but for now this is better than no error at all, in my opinion.
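As an illustration of the kind of combination described above, the following sketch computes a weighted mean of per-telescope predictions together with the weighted standard deviation as a simple spread estimate. The function name, the choice of weights, and the error definition are assumptions made for this example and may differ from what the PR implements:

```python
import numpy as np


def weighted_mean_with_error(values, weights):
    """Weighted mean and weighted standard deviation of telescope predictions.

    Hypothetical helper: the weights could be, for example, image
    intensities; the error is the weighted spread around the mean.
    """
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    total = weights.sum()
    mean = np.sum(weights * values) / total
    variance = np.sum(weights * (values - mean) ** 2) / total
    return mean, np.sqrt(variance)


# Made-up predictions of one coordinate from three telescopes
tel_predictions = [0.10, 0.14, 0.12]
tel_weights = [200.0, 50.0, 150.0]
mean, error = weighted_mean_with_error(tel_predictions, tel_weights)
```

With a single telescope the spread is zero by construction, so a mono event would need a different error estimate.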