-
Notifications
You must be signed in to change notification settings - Fork 89
Refactoring ITS summary method #606
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…nd 'post' logic within Base
PR SummaryMajor refactor to localize effect summarization and enhance ITS functionality.
Written by Cursor Bugbot for commit fba5a61. This will update automatically on new commits. Configure here. |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #606 +/- ##
==========================================
+ Coverage 93.27% 93.77% +0.50%
==========================================
Files 37 38 +1
Lines 5632 5991 +359
Branches 367 367
==========================================
+ Hits 5253 5618 +365
+ Misses 248 241 -7
- Partials 131 132 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
drbenvincent
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this contribution! The removal of ~175 lines of duplicated code is a win, and the test coverage you've added is excellent.
I also appreciate you raising the architectural concerns proactively. You've identified the same issues I've been thinking about: the base class is starting to accumulate experiment-specific logic, and _comparison_period_summary feels misplaced.
My main concern with the current approach: The base class now accesses ITS-specific attributes like datapost, treatment_time, and treatment_end_time - note the # type: ignore[attr-defined] comments, which are a signal that we're reaching into subclass territory. This creates an awkward coupling where BaseExperiment knows about concepts that only exist in ITS.
This feels like a good opportunity to rethink the reporting architecture more holistically rather than patching incrementally.
I see two promising directions:
Option 1: Experiment-owned effect_summary() with shared utilities
Each experiment class owns its effect_summary() implementation entirely. The base class either has no implementation or declares it abstract. reporting.py becomes a pure utility module—stateless functions that take data and return results, with no knowledge of experiment objects. ITS handles its three-period logic; SC handles multi-unit logic; scalar experiments (DiD, RD, RKink) handle theirs. Code reuse happens through utility function calls, not inheritance.
Option 2: Strategy/Generator pattern
Create an EffectSummaryGenerator hierarchy in reporting.py. Each experiment instantiates the appropriate generator (e.g., TimeSeriesEffectGenerator, ThreePeriodEffectGenerator). This keeps reporting logic centralized while avoiding base class bloat.
Both approaches keep the base class clean and make experiment-specific logic explicit. I'd lean toward Option 1 for its simplicity—it's easier to understand where logic lives, and pure utility functions are straightforward to test.
What do you think? Happy to discuss further or pair on a refactor if you're interested in tackling this. Your three-period work and tests would slot in nicely once the architecture is settled.
|
Thanks for the detailed feedback! I’m glad to see the reduction in code duplication. I completely agree with your point regarding the Decision: Moving forward with Option 1 Future Refinements:
|
Hi everyone,
I removed the override of
effect_summaryin interrupted_time_series.py, which removed in total ~100 lines of duplicated logic.While reviewing the current implementation, I have three architectural concerns that I’d like to discuss before we merge:
Relocation of
_comparison_period_summary: The_comparison_period_summarymethod is currently located in iterrupted_time_series.py. Given that its primary responsibility is formatting data for output, it feels like it might be a better fit for reporting.py. Moving it there would help centralize our reporting logic and make it easier to find in the future, such as the_effect_summary_{experiment type}we already find there.Refactoring base.py (Lines 200–375): The logic currently sitting between lines 200 and 375appears to be exclusively specific to experiments :
To keep base.py clean and maintainable, I believe it would be better to move this code into its corresponding experiment-specific class. This ensures our base class remains generic and doesn't "leak" logic from individual experiments.
We could however keep lines 314-375 into another method which would be called both by SyntheticControl and InterruptedTimeSeries to prevent duplicate logic.
Let me know what you think! I’m happy to handle the refactoring if we’re in agreement on the direction.
📚 Documentation preview 📚: https://causalpy--606.org.readthedocs.build/en/606/