-
Notifications
You must be signed in to change notification settings - Fork 76
Soiling algorithm updates #435
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: development
Are you sure you want to change the base?
Conversation
…tio and fit multiple soiling rates per soiling interval (piecewise)) as well as CODS algorithm being added
…rials' into development
Signed-off-by: nmoyer <[email protected]>
Move SRR and CODS development branch from noromo01 to rdtools repo
…nto bare_except_error
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## development #435 +/- ##
===============================================
- Coverage 96.19% 96.03% -0.17%
===============================================
Files 11 11
Lines 2209 2370 +161
===============================================
+ Hits 2125 2276 +151
- Misses 84 94 +10 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
…rror' into qnguyen345-bare_except_error
Seems like the pytest.PytestRemovedIn8Warning: in the degradation_and_soiling_example notebook might be related to nbval < 0.10 computationalmodelling/nbval#180 However, nbval greater than 0.9.6 has a bug with semicolon computationalmodelling/nbval#194 @mdeceglie - I believe you removed most of the semicolons from the notebooks? If so, we could try using a newer version of nbval. |
There are still a few semicolons, I'm fine with attempting to remove them and updating nbval |
rdtools/soiling.py
Outdated
"max_neg_step": min(run.delta), | ||
"start_loss": 1, | ||
"inferred_start_loss": run.pi_norm.median(), # changed from mean/Matt | ||
"inferred_end_loss": run.pi_norm.median(), # changed from mean/Matt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mdeceglie - This change from .mean() to .median() seems to cause the RuntimeWarning:Mean of empty slice.
Seems counterintuitive but changing it back to mean get's rid of the warning...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could add a check whether pi_norm
is empty.
Also, I'm not sure why "inferred_start_loss" and "inferred_end_loss" are the same here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm adding the following condition for now. If there's a better way , we can change it:
"inferred_start_loss": np.nan if run.pi_norm.isna().any() else run.pi_norm.median(), # changed from mean/Matt "inferred_end_loss": np.nan if run.pi_norm.isna().any() else run.pi_norm.median(), # changed from mean/Matt
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This pull request updates the soiling algorithm by modifying dependencies, adding new testing fixtures, and refining plotting functions.
- Updated nbval dependency and CLI flag for compatibility with the latest version
- Introduced two new test fixtures for soiling normalization variations
- Revised plotting functions to remove runtime warnings and standardize docstrings
Reviewed Changes
Copilot reviewed 11 out of 12 changed files in this pull request and generated no comments.
File | Description |
---|---|
setup.py | Removed the nbval version constraint, potentially affecting bug workarounds |
rdtools/test/conftest.py | Added new fixtures for simulating soiling normalization with negative shifts and piecewise slopes |
rdtools/plotting.py | Standardized docstrings and removed experimental warnings from plotting functions |
.github/workflows/nbval.yaml | Updated the nbval CLI flag to match the new version's requirements |
Files not reviewed (1)
- docs/sphinx/source/changelog/v3.0.0-beta.0.rst: Language not supported
Comments suppressed due to low confidence (2)
setup.py:39
- The removal of the nbval version constraint may reintroduce the semicolon bug noted in earlier versions. Confirm that the currently used nbval version resolves that issue.
"nbval",
.github/workflows/nbval.yaml:32
- Ensure that the new CLI flag '--nbval-sanitize-with' is supported by the nbval version in use and functions as expected compared to the previous '--sanitize-with' flag.
pytest --nbval --nbval-sanitize-with docs/nbval_sanitization_rules.cfg docs/${{ matrix.notebook-file }}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Many minor comments, but my biggest point of discussion is how the the values for the new parameters neg_shift
and picewise
can be combined and how those combos can be combined with the new _complex
cleaning options.
Must neg_shift and picewise always be turned on together? If not, how should the cleaning assumptions behave? If so, perhaps we can combine them into a single parameter.
Ideally I'd like to remove the _complex
versions of the cleaning assumptions and instead have the existing versions check the values of neg_shift
and piecewise
and adjust their logic appropriately,
After addressing all this we should give the docs a once over to ensure everything appears the way we want.
The soiling module is currently experimental. The API, results, | ||
and default behaviors may change in future releases (including MINOR | ||
and PATCH releases) as the code matures. | ||
''' | ||
from rdtools import degradation as RdToolsDeg |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's change this to from rdtools import degradation
and make the associated changes throughout
raise ValueError('Daily insolation series must have ' | ||
'daily frequency') | ||
if pd.infer_freq(self.insolation_daily.index) != "D": | ||
raise ValueError("Daily insolation series must have " "daily frequency") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
raise ValueError("Daily insolation series must have " "daily frequency") | |
raise ValueError("Daily insolation series must have daily frequency") |
outlier_factor=1.5): | ||
''' | ||
if pd.infer_freq(self.precipitation_daily.index) != "D": | ||
raise ValueError("Precipitation series must have " "daily frequency") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
raise ValueError("Precipitation series must have " "daily frequency") | |
raise ValueError("Precipitation series must have daily frequency") |
raise ValueError('Daily performance metric series must have ' | ||
'daily frequency') | ||
if pd.infer_freq(self.pm.index) != "D": | ||
raise ValueError("Daily performance metric series must have " "daily frequency") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
raise ValueError("Daily performance metric series must have " "daily frequency") | |
raise ValueError("Daily performance metric series must have daily frequency") |
############################################################################### | ||
# all code below for new piecewise fitting in soiling intervals within srr/Matt | ||
############################################################################### |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this comment can be deleted
("perfect_clean_complex", True, True, 0.977116), | ||
("inferred_clean_complex", True, True, 0.975805)]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can perfect_clean_complex
and inferred_clean_complex
be used with neg_shift
OR piecewise
? Or do both neg_shift
and picewise
have to be True together? If not, we should test update the doc strings in soiling.py to say "or", and augment the the test matrix to mix and match.
("perfect_clean_complex", True, True, 0.966912), | ||
("inferred_clean_complex", True, True, 0.965565)]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same questions as above regarding mixing and matching piecewise and neg_shift
''' | ||
####################################################################### | ||
# add neg_shift and piecewise to the following def/Matt | ||
def run(self, reps=1000, day_scale=13, clean_threshold="infer", trim=False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's consider what defaults we want (also for soiling_srr()
, which should be aligned with the run method). Should we activate some of the new functionality by default?
method : str, {'half_norm_clean', 'random_clean', 'perfect_clean', | ||
perfect_clean_complex,inferred_clean_complex} \ | ||
default 'half_norm_clean' | ||
|
||
How to treat the recovery of each cleaning event | ||
|
||
* 'random_clean' - a random recovery between 0-100% | ||
* 'random_clean' - a random recovery between 0-100%, | ||
pair with piecewise=False and neg_shift=False | ||
* 'perfect_clean' - each cleaning event returns the performance | ||
metric to 1 | ||
metric to 1, | ||
pair with piecewise=False and neg_shift=False | ||
* 'half_norm_clean' - The starting point of each interval is taken | ||
randomly from a half normal distribution with its mode (mu) at 1 and | ||
its sigma equal to 1/3 * (1-b) where b is the intercept of the fit to | ||
the interval. | ||
the interval, | ||
pair with piecewise=False and neg_shift=False | ||
*'perfect_clean_complex' - each detected clean event returns the | ||
performance metric to 1 while negative shifts in the data or | ||
piecewise linear fits result in no cleaning, | ||
pair with piecewise=True and neg_shift=True | ||
*'inferred_clean_complex' - at each detected clean event the | ||
performance metric increases based on fits to the data while | ||
negative shifts in the data or piecewise linear fits result in no | ||
cleaning, | ||
pair with piecewise=True and neg_shift=True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The updated doc string is not rendering correctly on read the docs
method : str, {'half_norm_clean', 'random_clean', 'perfect_clean', | ||
perfect_clean_complex,inferred_clean_complex} \ | ||
default 'perfect_clean_complex' | ||
|
||
How to treat the recovery of each cleaning event | ||
|
||
* 'random_clean' - a random recovery between 0-100% | ||
* 'random_clean' - a random recovery between 0-100%, | ||
pair with piecewise=False and neg_shift=False | ||
* 'perfect_clean' - each cleaning event returns the performance | ||
metric to 1 | ||
metric to 1, | ||
pair with piecewise=False and neg_shift=False | ||
* 'half_norm_clean' - The starting point of each interval is taken | ||
randomly from a half normal distribution with its mode (mu) at 1 and | ||
its sigma equal to 1/3 * (1-b) where b is the intercept of the fit to | ||
the interval. | ||
the interval, | ||
pair with piecewise=False and neg_shift=False | ||
* 'perfect_clean_complex' - each detected clean event returns the | ||
performance metric to 1 while negative shifts in the data or | ||
piecewise linear fits result in no cleaning, | ||
pair with piecewise=True and neg_shift=True | ||
* 'inferred_clean_complex' - at each detected clean event the | ||
performance metric increases based on fits to the data while | ||
negative shifts in the data or piecewise linear fits result in no | ||
cleaning, | ||
pair with piecewise=True and neg_shift=True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated doc string not appearing correctly in read the docs
__init__.py