Refactor effect_summary to experiment-owned implementations#606
Refactor effect_summary to experiment-owned implementations#606drbenvincent merged 11 commits intopymc-labs:mainfrom
effect_summary to experiment-owned implementations#606Conversation
…nd 'post' logic within Base
PR SummaryStreamlines effect summarization by decentralizing logic and expanding ITS capabilities.
Written by Cursor Bugbot for commit d7b0fdc. This will update automatically on new commits. Configure here. |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #606 +/- ##
==========================================
+ Coverage 93.74% 94.08% +0.34%
==========================================
Files 41 42 +1
Lines 6827 7190 +363
Branches 458 458
==========================================
+ Hits 6400 6765 +365
+ Misses 267 262 -5
- Partials 160 163 +3 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
drbenvincent
left a comment
There was a problem hiding this comment.
Thanks for this contribution! The removal of ~175 lines of duplicated code is a win, and the test coverage you've added is excellent.
I also appreciate you raising the architectural concerns proactively. You've identified the same issues I've been thinking about: the base class is starting to accumulate experiment-specific logic, and _comparison_period_summary feels misplaced.
My main concern with the current approach: The base class now accesses ITS-specific attributes like datapost, treatment_time, and treatment_end_time - note the # type: ignore[attr-defined] comments, which are a signal that we're reaching into subclass territory. This creates an awkward coupling where BaseExperiment knows about concepts that only exist in ITS.
This feels like a good opportunity to rethink the reporting architecture more holistically rather than patching incrementally.
I see two promising directions:
Option 1: Experiment-owned effect_summary() with shared utilities
Each experiment class owns its effect_summary() implementation entirely. The base class either has no implementation or declares it abstract. reporting.py becomes a pure utility module—stateless functions that take data and return results, with no knowledge of experiment objects. ITS handles its three-period logic; SC handles multi-unit logic; scalar experiments (DiD, RD, RKink) handle theirs. Code reuse happens through utility function calls, not inheritance.
Option 2: Strategy/Generator pattern
Create an EffectSummaryGenerator hierarchy in reporting.py. Each experiment instantiates the appropriate generator (e.g., TimeSeriesEffectGenerator, ThreePeriodEffectGenerator). This keeps reporting logic centralized while avoiding base class bloat.
Both approaches keep the base class clean and make experiment-specific logic explicit. I'd lean toward Option 1 for its simplicity—it's easier to understand where logic lives, and pure utility functions are straightforward to test.
What do you think? Happy to discuss further or pair on a refactor if you're interested in tackling this. Your three-period work and tests would slot in nicely once the architecture is settled.
|
Thanks for the detailed feedback! I’m glad to see the reduction in code duplication. I completely agree with your point regarding the Decision: Moving forward with Option 1 Future Refinements:
|
|
bugbot review |
|
Inclined to merge soon. Looks like a good improvement. Just dropping in some auto generated summary to document the changes in prose form, not just code :) TL;DRThis PR moves summary logic into the experiment classes. Before: After: Before: Centralized Dispatch in Base ClassProblems:
After: Experiment-Owned ImplementationBenefits:
|
drbenvincent
left a comment
There was a problem hiding this comment.
@JeanVanDyk if you can update from main and resolve the conflicts then I'm happy to merge
effect_summary to experiment-owned implementations
Hi everyone,
I removed the override of
effect_summaryin interrupted_time_series.py, which removed in total ~100 lines of duplicated logic.While reviewing the current implementation, I have three architectural concerns that I’d like to discuss before we merge:
Relocation of
_comparison_period_summary: The_comparison_period_summarymethod is currently located in iterrupted_time_series.py. Given that its primary responsibility is formatting data for output, it feels like it might be a better fit for reporting.py. Moving it there would help centralize our reporting logic and make it easier to find in the future, such as the_effect_summary_{experiment type}we already find there.Refactoring base.py (Lines 200–375): The logic currently sitting between lines 200 and 375appears to be exclusively specific to experiments :
To keep base.py clean and maintainable, I believe it would be better to move this code into its corresponding experiment-specific class. This ensures our base class remains generic and doesn't "leak" logic from individual experiments.
We could however keep lines 314-375 into another method which would be called both by SyntheticControl and InterruptedTimeSeries to prevent duplicate logic.
Let me know what you think! I’m happy to handle the refactoring if we’re in agreement on the direction.
📚 Documentation preview 📚: https://causalpy--606.org.readthedocs.build/en/606/