-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Add MIDC reader #605
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add MIDC reader #605
Conversation
pvlib/iotools/__init__.py
Outdated
@@ -3,3 +3,4 @@ | |||
from pvlib.iotools.srml import read_srml # noqa: F401 | |||
from pvlib.iotools.srml import read_srml_month_from_solardat # noqa: F401 | |||
from pvlib.iotools.surfrad import read_surfrad # noqa: F401 | |||
from pvlib.iotools.midc import read_midc # noqa: F401 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E261 at least two spaces before inline comment
Revisiting providing some default mapping, originally I was concerned about mappings making assumptions for the user. Global PSP [W/m^2] @wholmgren suggested that we either:
@cwhanse do you have any thoughts on these approaches or ideas on instruments to favor? |
My reaction is to favor #1, append an instrument type code to the measurement code, e.g., ghi_psp, ghi_cmp22, ghi_li200, ghi_rc (for reference cell). I don't think we want to arbitrate any debates about the merits of various instruments. Better for us to help to maintain clarity, and let the user decide which of the instruments is mapped to |
Great, thanks for the feedback Cliff. |
@wholmgren There is at least one site that reports everything in Standard units: https://midcdmz.nrel.gov/apps/go2url.pl?site=SPMD I did find one more issue, sometimes there appear to be two names for the same variable at a site, as is the case with "Dew Point Temp [deg C]" and "Dewpoint Temp [deg C]" listed in the field api here: https://midcdmz.nrel.gov/apps/field_api.pl?NWTC |
Too bad about the units inconsistency. I suppose that everyone that wants to use this function is going to need to inspect the sites on the MIDC page, though, so maybe it's reasonable to say check the MIDC page for units and be sure to convert before using with other pvlib functions. Either way is fine with me.
this is ok with me |
pvlib/iotools/midc.py
Outdated
variable_map: dictionary | ||
Dictionary for mapping MIDC field names to pvlib names. See variable | ||
`VARIABLE_MAP` for default and Notes section below for a description of | ||
its format. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
W291 trailing whitespace
pvlib/iotools/midc.py
Outdated
----- | ||
Keys of the `variable_map` dictionary should include the first part | ||
of a MIDC field name which indicates the variable being measured. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
W293 blank line contains whitespace
pvlib/iotools/midc.py
Outdated
of a MIDC field name which indicates the variable being measured. | ||
|
||
e.g. 'Global PSP [W/m^2]' is entered as a key of 'Global' | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
W293 blank line contains whitespace
pvlib/iotools/midc.py
Outdated
e.g. 'Global PSP [W/m^2]' is entered as a key of 'Global' | ||
|
||
The 'PSP' indicating instrument is appended to the pvlib variable name | ||
after mapping to differentiate measurements of the same variable. For a full |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E501 line too long (80 > 79 characters)
pvlib/iotools/midc.py
Outdated
|
||
The 'PSP' indicating instrument is appended to the pvlib variable name | ||
after mapping to differentiate measurements of the same variable. For a full | ||
list of pvlib variable names see the `Variable Style Rules <https://pvlib-python.readthedocs.io/en/latest/variables_style_rules.html>`_. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E501 line too long (140 > 79 characters)
pvlib/iotools/midc.py
Outdated
|
||
References | ||
---------- | ||
.. [1] National Renewable Energy Laboratory: Measurement and Instrumentation Data Center |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E501 line too long (92 > 79 characters)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is mergeable except for a typo. It would be nice if it also included a function that helped users construct a url for requesting data. From the comment in the original issue, an example url is:
url = 'http://midcdmz.nrel.gov/apps/plot.pl?site=UAT&start=20101103&edy=2&emo=4&eyr=2018&year=2016&month=1&day=1&endyear=2017&endmonth=12&endday=31&time=23:59&inst=4&inst=5&inst=6&inst=7&inst=9&type=data&first=3&math=0&second=-1&value=0.0&user=0&axis=1'
partial implementation might look like
def construct_url(site, data_start, data_end, epoch_start):
base_url = 'http://midcdmz.nrel.gov/apps/plot.pl?'
args = {'start': epoch_start.strftime('%Y%M%D'), 'year': data_start.strftime('%Y'), 'month': ..., ...}
args_str = '&'.join(['{}={}'.format(k, v) for k, v in args.items()])
return url + args_str
I've only manually constructed urls for the Tucson site, so I'm not sure about the difficulty in doing it for all sites.
pvlib/iotools/midc.py
Outdated
Parameters | ||
---------- | ||
variable_map: Dictionary | ||
A dictionary for mapping MIDC field nameto pvlib name. See VARIABLE_MAP |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nameto -> name to
pvlib/iotools/midc.py
Outdated
if field_name.startswith(midc_name): | ||
# extract the instument and units field and then remove units | ||
instrument_units = field_name[len(midc_name):] | ||
instrument = instrument_units[:instrument_units.find('[') - 1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this logic assumes that units are always specified in this format and will produce the wrong result otherwise. Probably a safe assumption, but maybe add a comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is something I had considered and forgot to include in implementation. Would it be preferable to handle the case where no units are provided? e.g. instrument_units.find('[') returns -1?
@wholmgren Building urls programatically for the plot.pl endpoint for more than one site seems problematic. Variables are requested with the We could request data from the raw data api which has a simpler url format that returns all variables. However, the raw data is not quality checked, it is just raw data from the data loggers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good except for a couple of minor concerns
pvlib/test/test_midc.py
Outdated
def test_read_midc_raw_data_from_nrel(): | ||
start_ts = pd.Timestamp('20181018') | ||
end_ts = pd.Timestamp('20181019') | ||
midc.read_midc_raw_data_from_nrel('UAT', start_ts, end_ts) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should assert a few reasonable things about the returned data
|
||
test_dir = os.path.dirname( | ||
os.path.abspath(inspect.getfile(inspect.currentframe()))) | ||
midc_testfile = os.path.join(test_dir, '../data/midc_20181014.txt') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps we should add a raw data file as well to decouple the format test from the network test
assert data.index[-1] == end | ||
|
||
|
||
def test_midc_format_index_raw(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doesn't this also need @network
?
I just ran into an issue with this where the reported Timezone throws an error when trying to call DataFrame.tz_localize(), particularly with PST. I'll try to find a reasonable solution to this before it gets merged. |
See bottom of this sub-section:
https://pvlib-python.readthedocs.io/en/latest/timetimezones.html#finding-a-time-zone
We could add a dict that maps problematic time zones to corresponding
Etc/GMT. Could be general purpose and public, or we could keep it simple
and private.
…On Tue, Oct 23, 2018 at 11:51 PM lboeman ***@***.***> wrote:
I just ran into an issue with this where the reported Timezone throws an
error when trying to call DataFrame.tz_localize(), particularly with PST.
I'll try to find a reasonable solution to this before it gets merged.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#605 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AELiR1_WnPlyqBL3h7nbAuXFeWTJL7suks5un6tkgaJpZM4XhLM0>
.
|
@wholmgren I wrote a very simple mapper function something along the lines of:
Right now I'm only aware of PST and CST causing problems and I don't believe writing the ~10 lines of code to be an issue where it is needed. Perhaps if this is a more common problem it warrants a public utility function with some default mapping and maybe an iotools.util (or similar) module? |
The map function seems like overkill but maybe that's just me. What about... tz_raw = # string from metadata
tz_map = {'PST': 'Etc/GMT+8', 'CST': 'Etc/GMT+6'} # maybe put in iotools.util
timezone = tz_map.get(tz_raw, tz_raw) # return tz_raw if not in mapping
|
I think this is ready to merge. Any final comments/concerns? |
Merge it |
pvlib python pull request guidelines
Thank you for your contribution to pvlib python! You may delete all of these instructions except for the list below.
You may submit a pull request with your code at any stage of completion.
The following items must be addressed before the code can be merged. Please don't hesitate to ask for help if you're unsure of how to accomplish any of the items below:
docs/sphinx/source/api.rst
for API changes.docs/sphinx/source/whatsnew
file for all changes.Brief description of the problem and proposed solution (if not already fully described in the issue linked to above):