Description
Describe the bug
The iotools MIDC Parser produces duplicate column names in the returned DataFrame.
NREL MIDC data labels are not formatted consistently enough for pvlib to map them automatically to native variable names. For example, some MIDC stations use brackets to differentiate similarly named columns, while others use parentheses. The UNLV site returns a csv with Global UVE [W/m^2] and Global UVE [index] as labels for variables 'Global UVE' and 'Global UVE (index)' respectively. iotools parses both of these names into ghi_UVE.
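To illustrate how two distinct labels can collapse into one name, here is a minimal sketch of a prefix-based rename that discards the bracketed suffix. It is a simplified stand-in, not pvlib's actual implementation; `PREFIX_MAP` and `simplified_rename` are hypothetical names introduced only for this example.

```python
# Hypothetical prefix map, for illustration only (not pvlib's real mapping).
PREFIX_MAP = {"Global": "ghi", "Direct": "dni"}


def simplified_rename(label, prefix_map=PREFIX_MAP):
    """Replace a known prefix and drop the bracketed suffix of a label."""
    for prefix, short in prefix_map.items():
        if label.startswith(prefix):
            rest = label[len(prefix):]
            # Everything from the first '[' onward is discarded. That is
            # usually the units, but at UNLV it is the only distinguishing part.
            instrument = rest.split("[", 1)[0].strip()
            return short + ("_" + instrument.replace(" ", "_") if instrument else "")
    # Labels without a known prefix are returned unchanged.
    return label


labels = ["Global UVE [W/m^2]", "Global UVE [index]"]
renamed = [simplified_rename(label) for label in labels]
print(renamed)  # both labels map to the same name
```

Because the only difference between the two UNLV labels sits inside the brackets, any rename that strips the bracketed part cannot keep them apart.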
To Reproduce
Steps to reproduce the behavior:
Running the following code returns a DataFrame with duplicate ghi_UVE labels.
import pvlib
import pandas as pd
df = pvlib.iotools.read_midc_raw_data_from_nrel('UNLV', pd.Timestamp('20190501'), pd.Timestamp('20190502'))
df.columns
Returns:
Index(['Unnamed: 0', 'Year', 'DOY', 'PST', 'dni_Normal', 'ghi_Horiz',
'ghi_UVA', 'ghi_UVE', 'ghi_UVE', 'Dry Bulb Temp [deg C]',
'Avg Wind Speed @ 30ft [m/s]', 'Avg Wind Direction @ 30ft [deg from N]',
'Peak Wind Speed @ 30ft [m/s]', 'UVSAET Temp [deg C]',
'Logger Temp [deg C]', 'Logger Battery [VDC]',
'Wind Chill Temp [deg C]', 'dhi_Horiz_(calc)', 'solar_zenith',
'solar_azimuth', 'airmass'],
dtype='object')
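A quick way to confirm the duplication without reading the full index is to ask pandas which labels repeat. The sketch below uses a hand-copied subset of the columns above rather than a live download, so it runs offline.

```python
import pandas as pd

# Subset of the column labels from the output above, copied by hand.
columns = pd.Index(["dni_Normal", "ghi_Horiz", "ghi_UVA", "ghi_UVE", "ghi_UVE"])

# duplicated(keep=False) flags every occurrence of a repeated label.
duplicates = columns[columns.duplicated(keep=False)].unique().tolist()

print(columns.is_unique)
print(duplicates)
```

With the real DataFrame, the same check would be applied to `df.columns`.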
Expected behavior
Instead of applying a generic mapping automatically, a user should be able to pass in an optional VARIABLE_MAP dictionary to use for renaming columns. The default behavior should be to return the MIDC data as a DataFrame with the original labels.
Versions:
- pvlib.__version__: 0.6.1
- pandas.__version__: 0.24.2
- python: 3.7.2