Skip to content

Add functionality for reading EPW weather files #677

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Mar 29, 2019

Conversation

roelloonen
Copy link
Contributor

@roelloonen roelloonen commented Mar 20, 2019

Adds an EPW reader to iotools.tmy and includes files for testing.

Most of the code is adapted from the original TMY3 functions in iotools.tmy.
Tested locally with both downloaded EPW files and web links.
Tried to edit and verify test_tmy.py. This seems to work fine but needs confirmation.
In general, I am fairly new to Python and a total novice in the use of GitHub. Tips and directions are therefore very much welcome.

  • Closes add epw file reader #591
  • I am familiar with the contributing guidelines.
  • Fully tested. Added and/or modified tests to ensure correct behavior for all reasonable inputs. Tests (usually) must pass on the TravisCI and Appveyor testing services.
  • Updates entries to docs/sphinx/source/api.rst for API changes.
  • Adds description and name entries in the appropriate docs/sphinx/source/whatsnew file for all changes.
  • Code quality and style is sufficient. Passes LGTM and SticklerCI checks.
  • New code is fully documented. Includes sphinx/numpydoc compliant docstrings and comments in the code where necessary.
  • Pull request is nearly complete and ready for detailed review.

Brief description of the problem and proposed solution (if not already fully described in the issue linked to above):
Adds EPW reader to iotools.tmy similar to TMY2 and TMY3. Both local files and web links are supported.

Adds EPW reader to iotools.tmy and includes files for testing
@roelloonen roelloonen marked this pull request as ready for review March 21, 2019 00:04
@cwhanse
Copy link
Member

cwhanse commented Mar 21, 2019

@roelloonen thanks, and welcome!

@wholmgren should we put this function in a new iotools.epw rather than in iotools.tmy?

@wholmgren
Copy link
Member

should we put this function in a new iotools.epw rather than in iotools.tmy?

Yes.

Copy link
Member

@wholmgren wholmgren left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the good pull request.


if filename is None:
try:
filename = _interactive_load()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does anyone use the interactive load for tmy files? I'd rather deprecate that feature in tmy than continue it here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anticipating that the answer will be no, I have omitted this option in the new read_epw function


# Shift one hour back because EPW's usage of hour 24
# and dateutil's inability to handle that.
data["hour"] = data["hour"].apply(lambda x: x - 1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The epw function in this comment did not apply a time shift. Which is correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. EPW data files use a similar 1-24h formulation as TMY3, so I mistakenly assumed this shift would be necessary. Apparently, pd.to_datetime was already handling things just fine, but in this update I had to add a correction to the last line to avoid violation of all dates belonging to the same year. There must be a prettier solution for this, but it seems to work for the time being.

includes a couple of other small fixes and updated testing files
@roelloonen
Copy link
Contributor Author

should we put this function in a new iotools.epw rather than in iotools.tmy?

Yes.

Commit 35c7951 creates a new iotools.epw and addresses Will's other suggestions.

'hour']]))

# Localize time series
data = data.tz_localize(int(meta['TZ'] * 3600))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wholmgren I know this line is copied from iotools.tmy but how does it work? I see that it does, but as far as I can tell, it's undocumented behavior of pandas.DataFrame.tz_localize.

Copy link
Member

@wholmgren wholmgren Mar 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know how, but we have a bit of context here: https://pvlib-python.readthedocs.io/en/latest/timetimezones.html#fixed-offsets

data.tz_localize(pytz.FixedOffset(int(meta['TZ'] * 60))) would be better. I have a vague recollection that it did not work on an earlier version of pandas, but probably ok now. Something like you could specify Timestamp(tz=...) but could not simply pass it to DataFrame.tz_localize

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a helper function to translate from timezone offset as a float to a pytz object. Suggest that we accept it in this PR, and open a new issue to improve here, in iotools.tmy and in readthedocs.

@wholmgren
Copy link
Member

wholmgren commented Mar 22, 2019 via email

Remove line that was only meant for debugging
data['year'].iloc[-1] = coerce_year - 1

# Update index with correct date information
data = data.set_index(pd.to_datetime(data[['year', 'month', 'day',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line causes problems on the python 2.7-min configuration. Have you checked that the index is correct in your debugging? It's not explicitly tested in any of the tests. https://travis-ci.org/pvlib/pvlib-python/jobs/510118505#L914

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies if this is a silly question, but I think I need a bit more explanation.
What would 'correctness' of the index mean?
Do you have any example of how this can be tested?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only meant that the index is what you expected it to be. Here's a more formal example for the index expectation for the read_solrad test function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest creating the index separately, and localizing it, then setting the dataframe index.
E.g.,

idx = pd.DatetimeIndex(pd.to_datetime(data[['year', 'month', 'day', 'hour']])
idx.tz_localize(xxx)
data.index = idx

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I managed to follow Cliff's suggestion for separately creating the index first. This seems to work fine.
But I am getting pretty much stuck with the error in the python 2.7-min configuration.
I have verified that the index gets created, and that it makes sense. The zero padding solution is not helping.
I think that the issue is that the index comes in the form of a series, which to_datetime does not like, apparently.
I have tried to solutions proposed here, but it is not working for me.
Any ideas or suggestions would be very welcome.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is related to the time specification using hour = 24. I checked out your code and got it to work like this:

    # create index that supplies correct date and time zone information
    dts = data[['month', 'day']].astype(str).apply(lambda x: x.str.zfill(2))
    hrs = (data['hour'] - 1).astype(str).str.zfill(2)
    dtscat = data['year'].astype(str) + dts['month'] + dts['day'] + hrs
    idx = pd.to_datetime(dtscat, format='%Y%m%d%H')
    idx = idx.dt.tz_localize(int(meta['TZ'] * 3600))
    data.index = idx

Consider adding a +1 hour timedelta to the index. In any case, we'll need to make sure the docstring correctly describes the index.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@roelloonen , @cwhanse, @janinefreeman, @wholmgren : Hi all- I'm curious about the -1 hour offset in read_epw() in the current pvlib release (line 171). I think that this is a mistake - when I try to match modeled dni using ghi and dhi, I can only do this when I calculate solar position at (timestamp + 30 minutes), which is different from the TMY convention of calculating solar position at (timestamp - 30 minutes).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cdeline The shift is because EPW files follow an hour-ending timestamp convention, and thus hours take values from 1 to 24. Hour values are shifted to hour-beginning to avoid changing the date for the hour=24 timestamp in the pandas index.

The docstring for read_epw describes hour-ending convention for all variables, which needs to be corrected since we've shifted to hour beginning.

I'm confused about the line number you cite: I see that shift at line 165 of read_epw - is it numbered 171 in your IDE, or are we actually looking at different code?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cliff- I was referring to V0.6.3 which has an additional bit of importing at the top:
try:
# python 2 compatibility
from urllib2 import urlopen, Request
except ImportError:
from urllib.request import urlopen, Request

Otherwise, I think it's identical to this 0.6.1 version here. Thanks for your discussion above.

This commit forces the data format and now pads a zero in front of the 'hour' column. I believe that this solves the Python 2.7 issue, but Travis testing needs to confirm this
@wholmgren
Copy link
Member

wholmgren commented Mar 26, 2019

Can you run flake8 on your new files? If not, see my print out below. There are a lot of issues with trailing whitespace, whitespace in blank lines, and a few unused imports. Unfortunately, sticklerci seems to have stopped working and I can't tell how to fix it immediately.

Edited to add that you can ignore the too long lines in the docstring. Not worth the effort for those.

$ flake8 pvlib/iotools/epw.py
pvlib/iotools/epw.py:5:1: F401 'datetime' imported but unused
pvlib/iotools/epw.py:7:1: F401 're' imported but unused
pvlib/iotools/epw.py:15:1: F401 'dateutil' imported but unused
pvlib/iotools/epw.py:22:1: W293 blank line contains whitespace
pvlib/iotools/epw.py:25:1: W293 blank line contains whitespace
pvlib/iotools/epw.py:26:69: W291 trailing whitespace
pvlib/iotools/epw.py:27:58: W291 trailing whitespace
pvlib/iotools/epw.py:28:71: W291 trailing whitespace
pvlib/iotools/epw.py:30:1: W293 blank line contains whitespace
pvlib/iotools/epw.py:31:1: W293 blank line contains whitespace
pvlib/iotools/epw.py:41:64: W291 trailing whitespace
pvlib/iotools/epw.py:43:1: W293 blank line contains whitespace
pvlib/iotools/epw.py:44:1: W293 blank line contains whitespace
pvlib/iotools/epw.py:52:64: W291 trailing whitespace
pvlib/iotools/epw.py:62:1: W293 blank line contains whitespace
pvlib/iotools/epw.py:77:1: W293 blank line contains whitespace
pvlib/iotools/epw.py:78:1: W293 blank line contains whitespace
pvlib/iotools/epw.py:79:1: W293 blank line contains whitespace
pvlib/iotools/epw.py:80:80: E501 line too long (190 > 79 characters)
pvlib/iotools/epw.py:82:80: E501 line too long (190 > 79 characters)
pvlib/iotools/epw.py:83:80: E501 line too long (187 > 79 characters)
pvlib/iotools/epw.py:89:80: E501 line too long (96 > 79 characters)
pvlib/iotools/epw.py:90:80: E501 line too long (89 > 79 characters)
pvlib/iotools/epw.py:91:80: E501 line too long (90 > 79 characters)
pvlib/iotools/epw.py:92:80: E501 line too long (93 > 79 characters)
pvlib/iotools/epw.py:93:80: E501 line too long (82 > 79 characters)
pvlib/iotools/epw.py:94:80: E501 line too long (129 > 79 characters)
pvlib/iotools/epw.py:95:80: E501 line too long (125 > 79 characters)
pvlib/iotools/epw.py:96:80: E501 line too long (121 > 79 characters)
pvlib/iotools/epw.py:97:80: E501 line too long (131 > 79 characters)
pvlib/iotools/epw.py:98:80: E501 line too long (135 > 79 characters)
pvlib/iotools/epw.py:99:80: E501 line too long (130 > 79 characters)
pvlib/iotools/epw.py:100:80: E501 line too long (125 > 79 characters)
pvlib/iotools/epw.py:101:80: E501 line too long (122 > 79 characters)
pvlib/iotools/epw.py:102:80: E501 line too long (127 > 79 characters)
pvlib/iotools/epw.py:103:80: E501 line too long (123 > 79 characters)
pvlib/iotools/epw.py:104:80: E501 line too long (126 > 79 characters)
pvlib/iotools/epw.py:105:80: E501 line too long (86 > 79 characters)
pvlib/iotools/epw.py:106:80: E501 line too long (128 > 79 characters)
pvlib/iotools/epw.py:107:80: E501 line too long (159 > 79 characters)
pvlib/iotools/epw.py:108:80: E501 line too long (87 > 79 characters)
pvlib/iotools/epw.py:109:80: E501 line too long (104 > 79 characters)
pvlib/iotools/epw.py:110:80: E501 line too long (375 > 79 characters)
pvlib/iotools/epw.py:111:80: E501 line too long (87 > 79 characters)
pvlib/iotools/epw.py:112:80: E501 line too long (144 > 79 characters)
pvlib/iotools/epw.py:113:80: E501 line too long (159 > 79 characters)
pvlib/iotools/epw.py:114:80: E501 line too long (108 > 79 characters)
pvlib/iotools/epw.py:115:80: E501 line too long (146 > 79 characters)
pvlib/iotools/epw.py:116:80: E501 line too long (121 > 79 characters)
pvlib/iotools/epw.py:117:80: E501 line too long (181 > 79 characters)
pvlib/iotools/epw.py:118:80: E501 line too long (113 > 79 characters)
pvlib/iotools/epw.py:119:80: E501 line too long (190 > 79 characters)
pvlib/iotools/epw.py:127:1: W293 blank line contains whitespace
pvlib/iotools/epw.py:140:1: W293 blank line contains whitespace
pvlib/iotools/epw.py:144:18: E231 missing whitespace after ','
pvlib/iotools/epw.py:144:63: E231 missing whitespace after ','
pvlib/iotools/epw.py:144:75: W291 trailing whitespace
pvlib/iotools/epw.py:145:42: E231 missing whitespace after ','
pvlib/iotools/epw.py:153:78: W291 trailing whitespace
pvlib/iotools/epw.py:154:61: W291 trailing whitespace
pvlib/iotools/epw.py:155:78: W291 trailing whitespace
pvlib/iotools/epw.py:156:73: W291 trailing whitespace
pvlib/iotools/epw.py:157:64: W291 trailing whitespace
pvlib/iotools/epw.py:158:67: W291 trailing whitespace
pvlib/iotools/epw.py:159:68: W291 trailing whitespace
pvlib/iotools/epw.py:161:77: W291 trailing whitespace
pvlib/iotools/epw.py:162:54: W291 trailing whitespace
pvlib/iotools/epw.py:164:1: W293 blank line contains whitespace
pvlib/iotools/epw.py:165:76: W291 trailing whitespace
pvlib/iotools/epw.py:168:1: W293 blank line contains whitespace
pvlib/iotools/epw.py:174:72: W291 trailing whitespace
pvlib/iotools/epw.py:175:65: W291 trailing whitespace
pvlib/iotools/epw.py:177:53: W291 trailing whitespace
pvlib/iotools/epw.py:180:22: W292 no newline at end of file

$ flake8 pvlib/test/test_epw.py     
pvlib/test/test_epw.py:1:1: F401 'inspect' imported but unused
pvlib/test/test_epw.py:11:1: E302 expected 2 blank lines, found 1
pvlib/test/test_epw.py:14:1: E302 expected 2 blank lines, found 1
pvlib/test/test_epw.py:16:80: E501 line too long (132 > 79 characters)
pvlib/test/test_epw.py:19:1: E302 expected 2 blank lines, found 1
pvlib/test/test_epw.py:22:43: W291 trailing whitespace

Updates code for index creation in epw.py
Improves docstring for clarity
Cleans trailing spaces, missing whitespace, etc. identified by pycodestyle in epw.py and test_epw.py
Last hour of the year led to an error in the forced year check. It was because of a remnant of my quick-and-dirty approach which is no longer necessary using Will's solution.
Copy link
Member

@wholmgren wholmgren left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more minor change. Can you also update the whatsnew file with a note about the addition and your name and/or username?

@wholmgren wholmgren added this to the 0.6.2 milestone Mar 27, 2019
Copy link
Member

@cwhanse cwhanse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Subject to making stickler happy

Copy link
Member

@wholmgren wholmgren left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. @cwhanse the long lines in the documentation table are tricky. Sphinx simple table markup (like we're using here) doesn't allow for cells to continue to the next line. The more complicated table markup is a major pain. I recommend merging as is.

@cwhanse
Copy link
Member

cwhanse commented Mar 28, 2019

Stickler overruled. I restarted the failed travis build for py36.

@wholmgren wholmgren merged commit c9e3314 into pvlib:master Mar 29, 2019
@wholmgren
Copy link
Member

thanks @roelloonen!

@cdeline
Copy link
Contributor

cdeline commented Mar 29, 2019

Hi guys, thank you for the work on this - I was about to get started on a pull request from my fork, but you beat me to it. :)
In github.com/nrel/bifacial_radiance we currently make use of a standalone readepw.py file based on a code stub submitted on 10/17/17 by squoilin (#261). Once your work here gets merged into a production version of pvlib, we'll point to the pvlib version instead. It looks consistent with what we have right now, with the exception of some header names.

One consideration: when comparing the header names for EPW and TMY3 file imports, is it useful to maintain consistency? Here are some examples of differences. Since EPW files don't contain headers, couldn't we at least adopt something like the existing TMY2 / TMY3 for consistency? Or have these EPW header names been chosen to match some other standard?

TMY3 EPW
USAF WMO_code
State state-prov
ETR etr
ETRN etrn
GHI ghi
DHI dhi
DNI dni
Ghillum global_hor_illum
Zenithlum zenith_luminance
Pressure atmospheric_pressure
DryBulb temp_air
Wspd wind_speed
Wdir wind_direction
RHum relative_humidity

@cdeline
Copy link
Contributor

cdeline commented Mar 29, 2019

Crap, I'm 3 minutes too late. LOL

@wholmgren
Copy link
Member

@cdeline my thought was "crap I was 3 minutes too early"!

Or have these EPW header names been chosen to match some other standard?

We are following the (incomplete) standards documented here: https://pvlib-python.readthedocs.io/en/latest/variables_style_rules.html

@cdeline
Copy link
Contributor

cdeline commented Mar 29, 2019

OK. Is there a translator or option to have the TMY3 and TMY2 files read out into this consistent format? It would be nice to design my code around this to be functional whether I import a TMY2 / TMY3 / EPW file. Like another _recolumn function in tmy.py if you guys are moving to a different IO header convention?

@wholmgren
Copy link
Member

There is not currently a translator or option but I'd support creating one. Unlikely to do it myself, though. These header conventions have been around for a long time, but the TMY readers have been around even longer and we never updated the old functions.

@cwhanse
Copy link
Member

cwhanse commented Mar 29, 2019

@cdeline are you asking if there is a 'standard' format to which to write TMY2/3/EPW data from pvlib?

@cdeline
Copy link
Contributor

cdeline commented Mar 29, 2019

Yeah, I guess that was my question - if you're moving to a new pvlib header convention for the .epw reader, it might be useful to have a way to cast the legacy TMY2 and TMY3 readers into the same header format by passing in a 'newheader' boolean or something.

I suppose it's no big deal for me to just rename the dictionary keys of the new EPW output. I was just figuring that you would want all of your different readers in the .iotools module to have a consistent output instead of one returning 'temp_air' and another returning 'DryBulb' and whatnot.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants