Add functionality for reading EPW weather files #677

roelloonen · 2019-03-20T23:58:12Z

Adds an EPW reader to iotools.tmy and includes files for testing.

Most of the code is adapted from the original TMY3 functions in iotools.tmy.
Tested locally with both downloaded EPW files and web links.
Tried to edit and verify test_tmy.py. This seems to work fine but needs confirmation.
In general, I am fairly new to Python and a total novice in the use of GitHub. Tips and directions are therefore very much welcome.

Closes add epw file reader #591
I am familiar with the contributing guidelines.
Fully tested. Added and/or modified tests to ensure correct behavior for all reasonable inputs. Tests (usually) must pass on the TravisCI and Appveyor testing services.
Updates entries to docs/sphinx/source/api.rst for API changes.
Adds description and name entries in the appropriate docs/sphinx/source/whatsnew file for all changes.
Code quality and style is sufficient. Passes LGTM and SticklerCI checks.
New code is fully documented. Includes sphinx/numpydoc compliant docstrings and comments in the code where necessary.
Pull request is nearly complete and ready for detailed review.

Brief description of the problem and proposed solution (if not already fully described in the issue linked to above):
Adds EPW reader to iotools.tmy similar to TMY2 and TMY3. Both local files and web links are supported.

Adds EPW reader to iotools.tmy and includes files for testing

cwhanse · 2019-03-21T15:04:59Z

@roelloonen thanks, and welcome!

@wholmgren should we put this function in a new iotools.epw rather than in iotools.tmy?

wholmgren · 2019-03-21T15:07:54Z

should we put this function in a new iotools.epw rather than in iotools.tmy?

Yes.

wholmgren

thanks for the good pull request.

wholmgren · 2019-03-21T15:24:33Z

pvlib/iotools/tmy.py

+
+    if filename is None:
+        try:
+            filename = _interactive_load()


Does anyone use the interactive load for tmy files? I'd rather deprecate that feature in tmy than continue it here.

Anticipating that the answer will be no, I have omitted this option in the new read_epw function

wholmgren · 2019-03-21T15:28:35Z

pvlib/iotools/tmy.py

+
+    # Shift one hour back because EPW's usage of hour 24
+    # and dateutil's inability to handle that.  
+    data["hour"] = data["hour"].apply(lambda x: x - 1)


The epw function in this comment did not apply a time shift. Which is correct?

Thanks. EPW data files use a similar 1-24h formulation as TMY3, so I mistakenly assumed this shift would be necessary. Apparently, pd.to_datetime was already handling things just fine, but in this update I had to add a correction to the last line to avoid violation of all dates belonging to the same year. There must be a prettier solution for this, but it seems to work for the time being.

pvlib/iotools/tmy.py

includes a couple of other small fixes and updated testing files

roelloonen · 2019-03-21T23:57:31Z

should we put this function in a new iotools.epw rather than in iotools.tmy?

Yes.

Commit 35c7951 creates a new iotools.epw and addresses Will's other suggestions.

cwhanse · 2019-03-22T15:36:41Z

pvlib/iotools/epw.py

+                                               'hour']]))
+
+    # Localize time series
+    data = data.tz_localize(int(meta['TZ'] * 3600))


@wholmgren I know this line is copied from iotools.tmy but how does it work? I see that it does, but as far as I can tell, it's undocumented behavior of pandas.DataFrame.tz_localize.

I don't know how, but we have a bit of context here: https://pvlib-python.readthedocs.io/en/latest/timetimezones.html#fixed-offsets

data.tz_localize(pytz.FixedOffset(int(meta['TZ'] * 60))) would be better. I have a vague recollection that it did not work on an earlier version of pandas, but probably ok now. Something like you could specify Timestamp(tz=...) but could not simply pass it to DataFrame.tz_localize

Maybe a helper function to translate from timezone offset as a float to a pytz object. Suggest that we accept it in this PR, and open a new issue to improve here, in iotools.tmy and in readthedocs.

wholmgren · 2019-03-22T16:38:49Z

Ok with me

…

On Fri, Mar 22, 2019 at 9:36 AM Cliff Hansen ***@***.***> wrote: ***@***.**** commented on this pull request. ------------------------------ In pvlib/iotools/epw.py <#677 (comment)>: > + + # We only have to skip 6 rows instead of 7 because we have already used + # the realine call above. + data = pd.read_csv(csvdata, skiprows=6, header=0, names=colnames) + + # Change to single year if requested + if coerce_year is not None: + data["year"] = coerce_year + data['year'].iloc[-1] = coerce_year - 1 + + # Update index with correct date information + data = data.set_index(pd.to_datetime(data[['year', 'month', 'day', + 'hour']])) + + # Localize time series + data = data.tz_localize(int(meta['TZ'] * 3600)) Maybe a helper function to translate from timezone offset as a float to a pytz object. Suggest that we accept it in this PR, and open a new issue to improve here, in iotools.tmy and in readthedocs. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#677 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AELiR3BgA6RQmrHx-1SqTD-47Ra_JztAks5vZQaTgaJpZM4cAfhg> .

pvlib/iotools/epw.py

Remove line that was only meant for debugging

wholmgren · 2019-03-22T22:10:18Z

pvlib/iotools/epw.py

+        data['year'].iloc[-1] = coerce_year - 1
+
+    # Update index with correct date information     
+    data = data.set_index(pd.to_datetime(data[['year', 'month', 'day', 


This line causes problems on the python 2.7-min configuration. Have you checked that the index is correct in your debugging? It's not explicitly tested in any of the tests. https://travis-ci.org/pvlib/pvlib-python/jobs/510118505#L914

Apologies if this is a silly question, but I think I need a bit more explanation.
What would 'correctness' of the index mean?
Do you have any example of how this can be tested?

I only meant that the index is what you expected it to be. Here's a more formal example for the index expectation for the read_solrad test function.

Suggest creating the index separately, and localizing it, then setting the dataframe index.
E.g.,

idx = pd.DatetimeIndex(pd.to_datetime(data[['year', 'month', 'day', 'hour']])
idx.tz_localize(xxx)
data.index = idx

I managed to follow Cliff's suggestion for separately creating the index first. This seems to work fine.
But I am getting pretty much stuck with the error in the python 2.7-min configuration.
I have verified that the index gets created, and that it makes sense. The zero padding solution is not helping.
I think that the issue is that the index comes in the form of a series, which to_datetime does not like, apparently.
I have tried to solutions proposed here, but it is not working for me.
Any ideas or suggestions would be very welcome.

The problem is related to the time specification using hour = 24. I checked out your code and got it to work like this:

# create index that supplies correct date and time zone information dts = data[['month', 'day']].astype(str).apply(lambda x: x.str.zfill(2)) hrs = (data['hour'] - 1).astype(str).str.zfill(2) dtscat = data['year'].astype(str) + dts['month'] + dts['day'] + hrs idx = pd.to_datetime(dtscat, format='%Y%m%d%H') idx = idx.dt.tz_localize(int(meta['TZ'] * 3600)) data.index = idx

Consider adding a +1 hour timedelta to the index. In any case, we'll need to make sure the docstring correctly describes the index.

@roelloonen , @cwhanse, @janinefreeman, @wholmgren : Hi all- I'm curious about the -1 hour offset in read_epw() in the current pvlib release (line 171). I think that this is a mistake - when I try to match modeled dni using ghi and dhi, I can only do this when I calculate solar position at (timestamp + 30 minutes), which is different from the TMY convention of calculating solar position at (timestamp - 30 minutes).

@cdeline The shift is because EPW files follow an hour-ending timestamp convention, and thus hours take values from 1 to 24. Hour values are shifted to hour-beginning to avoid changing the date for the hour=24 timestamp in the pandas index.

The docstring for read_epw describes hour-ending convention for all variables, which needs to be corrected since we've shifted to hour beginning.

I'm confused about the line number you cite: I see that shift at line 165 of read_epw - is it numbered 171 in your IDE, or are we actually looking at different code?

Cliff- I was referring to V0.6.3 which has an additional bit of importing at the top:
try:
# python 2 compatibility
from urllib2 import urlopen, Request
except ImportError:
from urllib.request import urlopen, Request

Otherwise, I think it's identical to this 0.6.1 version here. Thanks for your discussion above.

pvlib/test/test_epw.py

This commit forces the data format and now pads a zero in front of the 'hour' column. I believe that this solves the Python 2.7 issue, but Travis testing needs to confirm this

note .dt option in localizing. see: https://stackoverflow.com/questions/26089670/unable-to-apply-methods-on-timestamps-using-series-built-ins

wholmgren · 2019-03-26T00:56:18Z

Can you run flake8 on your new files? If not, see my print out below. There are a lot of issues with trailing whitespace, whitespace in blank lines, and a few unused imports. Unfortunately, sticklerci seems to have stopped working and I can't tell how to fix it immediately.

Edited to add that you can ignore the too long lines in the docstring. Not worth the effort for those.

$ flake8 pvlib/iotools/epw.py
pvlib/iotools/epw.py:5:1: F401 'datetime' imported but unused
pvlib/iotools/epw.py:7:1: F401 're' imported but unused
pvlib/iotools/epw.py:15:1: F401 'dateutil' imported but unused
pvlib/iotools/epw.py:22:1: W293 blank line contains whitespace
pvlib/iotools/epw.py:25:1: W293 blank line contains whitespace
pvlib/iotools/epw.py:26:69: W291 trailing whitespace
pvlib/iotools/epw.py:27:58: W291 trailing whitespace
pvlib/iotools/epw.py:28:71: W291 trailing whitespace
pvlib/iotools/epw.py:30:1: W293 blank line contains whitespace
pvlib/iotools/epw.py:31:1: W293 blank line contains whitespace
pvlib/iotools/epw.py:41:64: W291 trailing whitespace
pvlib/iotools/epw.py:43:1: W293 blank line contains whitespace
pvlib/iotools/epw.py:44:1: W293 blank line contains whitespace
pvlib/iotools/epw.py:52:64: W291 trailing whitespace
pvlib/iotools/epw.py:62:1: W293 blank line contains whitespace
pvlib/iotools/epw.py:77:1: W293 blank line contains whitespace
pvlib/iotools/epw.py:78:1: W293 blank line contains whitespace
pvlib/iotools/epw.py:79:1: W293 blank line contains whitespace
pvlib/iotools/epw.py:80:80: E501 line too long (190 > 79 characters)
pvlib/iotools/epw.py:82:80: E501 line too long (190 > 79 characters)
pvlib/iotools/epw.py:83:80: E501 line too long (187 > 79 characters)
pvlib/iotools/epw.py:89:80: E501 line too long (96 > 79 characters)
pvlib/iotools/epw.py:90:80: E501 line too long (89 > 79 characters)
pvlib/iotools/epw.py:91:80: E501 line too long (90 > 79 characters)
pvlib/iotools/epw.py:92:80: E501 line too long (93 > 79 characters)
pvlib/iotools/epw.py:93:80: E501 line too long (82 > 79 characters)
pvlib/iotools/epw.py:94:80: E501 line too long (129 > 79 characters)
pvlib/iotools/epw.py:95:80: E501 line too long (125 > 79 characters)
pvlib/iotools/epw.py:96:80: E501 line too long (121 > 79 characters)
pvlib/iotools/epw.py:97:80: E501 line too long (131 > 79 characters)
pvlib/iotools/epw.py:98:80: E501 line too long (135 > 79 characters)
pvlib/iotools/epw.py:99:80: E501 line too long (130 > 79 characters)
pvlib/iotools/epw.py:100:80: E501 line too long (125 > 79 characters)
pvlib/iotools/epw.py:101:80: E501 line too long (122 > 79 characters)
pvlib/iotools/epw.py:102:80: E501 line too long (127 > 79 characters)
pvlib/iotools/epw.py:103:80: E501 line too long (123 > 79 characters)
pvlib/iotools/epw.py:104:80: E501 line too long (126 > 79 characters)
pvlib/iotools/epw.py:105:80: E501 line too long (86 > 79 characters)
pvlib/iotools/epw.py:106:80: E501 line too long (128 > 79 characters)
pvlib/iotools/epw.py:107:80: E501 line too long (159 > 79 characters)
pvlib/iotools/epw.py:108:80: E501 line too long (87 > 79 characters)
pvlib/iotools/epw.py:109:80: E501 line too long (104 > 79 characters)
pvlib/iotools/epw.py:110:80: E501 line too long (375 > 79 characters)
pvlib/iotools/epw.py:111:80: E501 line too long (87 > 79 characters)
pvlib/iotools/epw.py:112:80: E501 line too long (144 > 79 characters)
pvlib/iotools/epw.py:113:80: E501 line too long (159 > 79 characters)
pvlib/iotools/epw.py:114:80: E501 line too long (108 > 79 characters)
pvlib/iotools/epw.py:115:80: E501 line too long (146 > 79 characters)
pvlib/iotools/epw.py:116:80: E501 line too long (121 > 79 characters)
pvlib/iotools/epw.py:117:80: E501 line too long (181 > 79 characters)
pvlib/iotools/epw.py:118:80: E501 line too long (113 > 79 characters)
pvlib/iotools/epw.py:119:80: E501 line too long (190 > 79 characters)
pvlib/iotools/epw.py:127:1: W293 blank line contains whitespace
pvlib/iotools/epw.py:140:1: W293 blank line contains whitespace
pvlib/iotools/epw.py:144:18: E231 missing whitespace after ','
pvlib/iotools/epw.py:144:63: E231 missing whitespace after ','
pvlib/iotools/epw.py:144:75: W291 trailing whitespace
pvlib/iotools/epw.py:145:42: E231 missing whitespace after ','
pvlib/iotools/epw.py:153:78: W291 trailing whitespace
pvlib/iotools/epw.py:154:61: W291 trailing whitespace
pvlib/iotools/epw.py:155:78: W291 trailing whitespace
pvlib/iotools/epw.py:156:73: W291 trailing whitespace
pvlib/iotools/epw.py:157:64: W291 trailing whitespace
pvlib/iotools/epw.py:158:67: W291 trailing whitespace
pvlib/iotools/epw.py:159:68: W291 trailing whitespace
pvlib/iotools/epw.py:161:77: W291 trailing whitespace
pvlib/iotools/epw.py:162:54: W291 trailing whitespace
pvlib/iotools/epw.py:164:1: W293 blank line contains whitespace
pvlib/iotools/epw.py:165:76: W291 trailing whitespace
pvlib/iotools/epw.py:168:1: W293 blank line contains whitespace
pvlib/iotools/epw.py:174:72: W291 trailing whitespace
pvlib/iotools/epw.py:175:65: W291 trailing whitespace
pvlib/iotools/epw.py:177:53: W291 trailing whitespace
pvlib/iotools/epw.py:180:22: W292 no newline at end of file

$ flake8 pvlib/test/test_epw.py     
pvlib/test/test_epw.py:1:1: F401 'inspect' imported but unused
pvlib/test/test_epw.py:11:1: E302 expected 2 blank lines, found 1
pvlib/test/test_epw.py:14:1: E302 expected 2 blank lines, found 1
pvlib/test/test_epw.py:16:80: E501 line too long (132 > 79 characters)
pvlib/test/test_epw.py:19:1: E302 expected 2 blank lines, found 1
pvlib/test/test_epw.py:22:43: W291 trailing whitespace

Updates code for index creation in epw.py Improves docstring for clarity Cleans trailing spaces, missing whitespace, etc. identified by pycodestyle in epw.py and test_epw.py

Last hour of the year led to an error in the forced year check. It was because of a remnant of my quick-and-dirty approach which is no longer necessary using Will's solution.

wholmgren

One more minor change. Can you also update the whatsnew file with a note about the addition and your name and/or username?

pvlib/iotools/epw.py

cwhanse

Subject to making stickler happy

wholmgren

Looks good to me. @cwhanse the long lines in the documentation table are tricky. Sphinx simple table markup (like we're using here) doesn't allow for cells to continue to the next line. The more complicated table markup is a major pain. I recommend merging as is.

cwhanse · 2019-03-28T15:45:45Z

Stickler overruled. I restarted the failed travis build for py36.

wholmgren · 2019-03-29T18:51:50Z

thanks @roelloonen!

cdeline · 2019-03-29T18:54:10Z

Hi guys, thank you for the work on this - I was about to get started on a pull request from my fork, but you beat me to it. :)
In github.com/nrel/bifacial_radiance we currently make use of a standalone readepw.py file based on a code stub submitted on 10/17/17 by squoilin (#261). Once your work here gets merged into a production version of pvlib, we'll point to the pvlib version instead. It looks consistent with what we have right now, with the exception of some header names.

One consideration: when comparing the header names for EPW and TMY3 file imports, is it useful to maintain consistency? Here are some examples of differences. Since EPW files don't contain headers, couldn't we at least adopt something like the existing TMY2 / TMY3 for consistency? Or have these EPW header names been chosen to match some other standard?

TMY3	EPW
USAF	WMO_code
State	state-prov
ETR	etr
ETRN	etrn
GHI	ghi
DHI	dhi
DNI	dni
Ghillum	global_hor_illum
Zenithlum	zenith_luminance
Pressure	atmospheric_pressure
DryBulb	temp_air
Wspd	wind_speed
Wdir	wind_direction
RHum	relative_humidity

cdeline · 2019-03-29T18:54:48Z

Crap, I'm 3 minutes too late. LOL

wholmgren · 2019-03-29T18:57:48Z

@cdeline my thought was "crap I was 3 minutes too early"!

Or have these EPW header names been chosen to match some other standard?

We are following the (incomplete) standards documented here: https://pvlib-python.readthedocs.io/en/latest/variables_style_rules.html

cdeline · 2019-03-29T19:03:58Z

OK. Is there a translator or option to have the TMY3 and TMY2 files read out into this consistent format? It would be nice to design my code around this to be functional whether I import a TMY2 / TMY3 / EPW file. Like another _recolumn function in tmy.py if you guys are moving to a different IO header convention?

wholmgren · 2019-03-29T19:19:09Z

There is not currently a translator or option but I'd support creating one. Unlikely to do it myself, though. These header conventions have been around for a long time, but the TMY readers have been around even longer and we never updated the old functions.

cwhanse · 2019-03-29T20:51:17Z

@cdeline are you asking if there is a 'standard' format to which to write TMY2/3/EPW data from pvlib?

cdeline · 2019-03-29T22:16:23Z

Yeah, I guess that was my question - if you're moving to a new pvlib header convention for the .epw reader, it might be useful to have a way to cast the legacy TMY2 and TMY3 readers into the same header format by passing in a 'newheader' boolean or something.

I suppose it's no big deal for me to just rename the dictionary keys of the new EPW output. I was just figuring that you would want all of your different readers in the .iotools module to have a consistent output instead of one returning 'temp_air' and another returning 'DryBulb' and whatnot.

Add functionality for reading EPW weather files

7d1e172

Adds EPW reader to iotools.tmy and includes files for testing

roelloonen marked this pull request as ready for review March 21, 2019 00:04

Include documentation for read_epw

b88c249

wholmgren reviewed Mar 21, 2019

View reviewed changes

pvlib/iotools/tmy.py Outdated Show resolved Hide resolved

read_epw is now in a separate module instead of part of tmy.py

35c7951

includes a couple of other small fixes and updated testing files

cwhanse reviewed Mar 22, 2019

View reviewed changes

wholmgren reviewed Mar 22, 2019

View reviewed changes

pvlib/iotools/epw.py Outdated Show resolved Hide resolved

Update epw.py

6a00408

Remove line that was only meant for debugging

wholmgren reviewed Mar 22, 2019

View reviewed changes

cwhanse reviewed Mar 22, 2019

View reviewed changes

pvlib/test/test_epw.py Outdated Show resolved Hide resolved

roelloonen added 4 commits March 25, 2019 15:39

Force date format

0dc262e

This commit forces the data format and now pads a zero in front of the 'hour' column. I believe that this solves the Python 2.7 issue, but Travis testing needs to confirm this

Merge remote-tracking branch 'origin/master'

18d0221

Remove debugging statement

e942115

Testing dir update and new code structure for localizing index

e6d4031

note .dt option in localizing. see: https://stackoverflow.com/questions/26089670/unable-to-apply-methods-on-timestamps-using-series-built-ins

roelloonen added 2 commits March 26, 2019 17:24

Update index creation following Will's suggestion and cleaned up code

af47da9

Updates code for index creation in epw.py Improves docstring for clarity Cleans trailing spaces, missing whitespace, etc. identified by pycodestyle in epw.py and test_epw.py

Changed indexing of last hour of the year

b4cf990

Last hour of the year led to an error in the forced year check. It was because of a remnant of my quick-and-dirty approach which is no longer necessary using Will's solution.

wholmgren reviewed Mar 27, 2019

View reviewed changes

pvlib/iotools/epw.py Outdated Show resolved Hide resolved

wholmgren added enhancement io labels Mar 27, 2019

wholmgren added this to the 0.6.2 milestone Mar 27, 2019

Minor change to default input for filename and update of What's New

aa2e7cc

cwhanse approved these changes Mar 28, 2019

View reviewed changes

wholmgren approved these changes Mar 28, 2019

View reviewed changes

wholmgren merged commit c9e3314 into pvlib:master Mar 29, 2019

cwhanse mentioned this pull request Oct 14, 2019

Improve docstring for iotools.tmy.read_tmy3 and read_tmy2 #714

Closed

wholmgren mentioned this pull request Jul 15, 2020

Remove networking code from iotools.read_tmy3, tkinter from read_tmy2/3 #1004

Merged

6 tasks

Add functionality for reading EPW weather files #677

Add functionality for reading EPW weather files #677

Uh oh!

Conversation

roelloonen commented Mar 20, 2019 • edited by wholmgren Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cwhanse commented Mar 21, 2019

Uh oh!

wholmgren commented Mar 21, 2019

Uh oh!

wholmgren left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

roelloonen commented Mar 21, 2019

Uh oh!

Choose a reason for hiding this comment

Uh oh!

wholmgren Mar 22, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

wholmgren commented Mar 22, 2019 via email

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

wholmgren commented Mar 26, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

wholmgren left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

cwhanse left a comment

Choose a reason for hiding this comment

Uh oh!

wholmgren left a comment

Choose a reason for hiding this comment

Uh oh!

cwhanse commented Mar 28, 2019

Uh oh!

wholmgren commented Mar 29, 2019

Uh oh!

cdeline commented Mar 29, 2019

Uh oh!

cdeline commented Mar 29, 2019

Uh oh!

roelloonen commented Mar 20, 2019 •

edited by wholmgren

Loading

wholmgren Mar 22, 2019 •

edited

Loading

wholmgren commented Mar 26, 2019 •

edited

Loading

cdeline commented Mar 29, 2019 •

edited

Loading