-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Return diffuse components from Perez transposition model #259
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
anomam
commented
Nov 1, 2016
- Give the option to users to obtain the diffuse components calculated in the Perez transposition model
- Maintain backward compatibility for users who don't need it
- Write tests to make sure that the results make sense
* dictionary returned depending on optional boolean input argument for backward compatibility
* all tests are passing
Nice. I think this is a good approach for now. We could consider an API change for 0.5 if enough users express a preference. I think we should test that the expected component dataframe actually has the values we want instead of only testing that the sum is correct. Travis shows some dtype errors on Python 3. |
Thanks for the feedback @wholmgren . I think that the Python3 issue is solved now with the last commit I just made. Concerning your second point, do you have any recommendation about what data/result I should use as reference to test each of the components? |
I think your existing test inputs are fine. My suggestion was only that you should also add an You'll also need to add a note and your name or username to the 0.4.2 whatsnew file. You might need to rebase/merge the latest commits before doing that. |
* and not just the sum of them
Got it! Thanks for the details. I tried to apply those suggestions in the last few commits. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The df_components = pd.DataFrame(components)
in your test made me realize that there were a couple of minor issues in the code regarding types. I commented more on the diff.
return sky_diffuse | ||
if return_components: | ||
component_keys = ('isotropic', 'circumsolar', 'horizon') | ||
diffuse_components = dict.fromkeys(component_keys) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct me if I'm wrong, but I don't think the fromkeys
call is doing anything useful for us here. You can just call diffuse_components = OrderedDict()
and delete the component_keys
variable. The keys will created, in order, in the lines below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure (see following commit). I was just wondering why we wanted to use an ordered dictionary instead of a regular one here?
|
||
# Set values of components to 0 when sky_diffuse is 0 | ||
for k in diffuse_components.keys(): | ||
diffuse_components[k][sky_diffuse == 0] = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line will fail with scalar inputs (I personally like using scalars when doing some exploratory work). My recommendation is to replace this loop with:
mask = sky_diffuse == 0
if isinstance(sky_diffuse, pd.Series):
diffuse_components = pd.DataFrame(diffuse_components)
diffuse_components.ix[mask] = 0
else:
diffuse_components = {k: np.where(mask, 0, v) for k, v in diffuse_components.items()}
This has the unfortunate side effect of promoting a scalar to an array, but at least it's consistent with the rest of the api.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it.
* Defining components as ordered dict * Treating case where inputs can be scalar
Not too sure why the build is failing in Travis... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of tweaks to make to the test, but we are very close!
@@ -177,9 +177,11 @@ def test_perez_components(): | |||
index=times | |||
) | |||
df_components = pd.DataFrame(components) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
components should already be a dataframe, so this line is no longer necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is not always a dataframe right now:
In [5]: irradiance.perez(15, 0, 100, 1000, 1300, 0, 0, 1, return_components=True)
Out[5]:
(array(98.4801177930642),
{'circumsolar': array(63.06009605934864),
'horizon': array(1.2960861412441615),
'isotropic': array(34.123935592471405)})
But I'll make sure it is!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your test function uses a series as an input, so it gets a DataFrame as an output. I think the perez function itself currently has the correct behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I just got it, you don't want it as a dataframe in all cases!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. pandas in, pandas out. Otherwise, keep the types as similar as is practical.
assert_series_equal(out, expected, check_less_precise=2) | ||
assert_frame_equal(df_components, expected_components) | ||
assert_frame_equal(df_components[columns], expected_components[columns]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Specifying the columns in this way has the downside that you're not actually testing if the order is as expected. You can remove the [columns] and test the order if you change the way the dataframe is created. Two options for that:
expected_components = pd.DataFrame(index=times)
expected_components['first_name'] = first_values
# and so on...
expected_components = pd.DataFrame(np.array([first_values, second_values, third_values]).T,
columns=['first', 'second', 'third'], index=times)
Not sure if you'd actually need the transpose there or not.
Sorry I missed the new commit -- feel free to ping me in the future! Thanks Marc! |