Skip to content

Return diffuse components from Perez transposition model #259

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Dec 2, 2016

Conversation

anomam
Copy link
Contributor

@anomam anomam commented Nov 1, 2016

  • Give the option to users to obtain the diffuse components calculated in the Perez transposition model
  • Maintain backward compatibility for users who don't need it
  • Write tests to make sure that the results make sense

Marc Anoma added 2 commits November 1, 2016 14:42
* dictionary returned depending on optional boolean input argument for
 backward compatibility
* all tests are passing
@wholmgren
Copy link
Member

Nice. I think this is a good approach for now. We could consider an API change for 0.5 if enough users express a preference.

I think we should test that the expected component dataframe actually has the values we want instead of only testing that the sum is correct.

Travis shows some dtype errors on Python 3.

@wholmgren wholmgren added this to the 0.4.2 milestone Nov 4, 2016
@anomam
Copy link
Contributor Author

anomam commented Nov 17, 2016

Thanks for the feedback @wholmgren . I think that the Python3 issue is solved now with the last commit I just made.

Concerning your second point, do you have any recommendation about what data/result I should use as reference to test each of the components?

@wholmgren
Copy link
Member

I think your existing test inputs are fine. My suggestion was only that you should also add an expected_components DataFrame in your test and then add assert_frame_equals(components, expected_components). That way we can be sure that the right values are always assigned to the right columns. It certainly doesn't hurt to keep your existing check on the sum, too.

You'll also need to add a note and your name or username to the 0.4.2 whatsnew file. You might need to rebase/merge the latest commits before doing that.

@anomam
Copy link
Contributor Author

anomam commented Nov 18, 2016

Got it! Thanks for the details. I tried to apply those suggestions in the last few commits.
Please let me know if there's anything else I need to add/change.
Thx

Copy link
Member

@wholmgren wholmgren left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The df_components = pd.DataFrame(components) in your test made me realize that there were a couple of minor issues in the code regarding types. I commented more on the diff.

return sky_diffuse
if return_components:
component_keys = ('isotropic', 'circumsolar', 'horizon')
diffuse_components = dict.fromkeys(component_keys)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct me if I'm wrong, but I don't think the fromkeys call is doing anything useful for us here. You can just call diffuse_components = OrderedDict() and delete the component_keys variable. The keys will created, in order, in the lines below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure (see following commit). I was just wondering why we wanted to use an ordered dictionary instead of a regular one here?


# Set values of components to 0 when sky_diffuse is 0
for k in diffuse_components.keys():
diffuse_components[k][sky_diffuse == 0] = 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line will fail with scalar inputs (I personally like using scalars when doing some exploratory work). My recommendation is to replace this loop with:

mask = sky_diffuse == 0
if isinstance(sky_diffuse, pd.Series):
    diffuse_components = pd.DataFrame(diffuse_components)
    diffuse_components.ix[mask] = 0
else:
    diffuse_components = {k: np.where(mask, 0, v) for k, v in diffuse_components.items()}

This has the unfortunate side effect of promoting a scalar to an array, but at least it's consistent with the rest of the api.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it.

* Defining components as ordered dict
* Treating case where inputs can be scalar
@anomam
Copy link
Contributor Author

anomam commented Nov 18, 2016

Not too sure why the build is failing in Travis...

Copy link
Member

@wholmgren wholmgren left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of tweaks to make to the test, but we are very close!

@@ -177,9 +177,11 @@ def test_perez_components():
index=times
)
df_components = pd.DataFrame(components)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

components should already be a dataframe, so this line is no longer necessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is not always a dataframe right now:

In [5]: irradiance.perez(15, 0, 100, 1000, 1300, 0, 0, 1, return_components=True)
Out[5]: 
(array(98.4801177930642),
 {'circumsolar': array(63.06009605934864),
  'horizon': array(1.2960861412441615),
  'isotropic': array(34.123935592471405)})

But I'll make sure it is!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your test function uses a series as an input, so it gets a DataFrame as an output. I think the perez function itself currently has the correct behavior.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I just got it, you don't want it as a dataframe in all cases!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. pandas in, pandas out. Otherwise, keep the types as similar as is practical.

assert_series_equal(out, expected, check_less_precise=2)
assert_frame_equal(df_components, expected_components)
assert_frame_equal(df_components[columns], expected_components[columns])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specifying the columns in this way has the downside that you're not actually testing if the order is as expected. You can remove the [columns] and test the order if you change the way the dataframe is created. Two options for that:

expected_components = pd.DataFrame(index=times)
expected_components['first_name'] = first_values
# and so on...

expected_components = pd.DataFrame(np.array([first_values, second_values, third_values]).T, 
    columns=['first', 'second', 'third'], index=times)

Not sure if you'd actually need the transpose there or not.

@wholmgren
Copy link
Member

Sorry I missed the new commit -- feel free to ping me in the future!

Thanks Marc!

@wholmgren wholmgren merged commit 730f48e into pvlib:master Dec 2, 2016
@anomam anomam deleted the perez_transp_components branch December 12, 2016 21:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants