I'm new to NOAA/NCEP/NWS models, Siphon/THREDDS, etc., and to pvlib, but I am interested in the pvlib forecast module. I understand that it is deprecated. Indeed, the code at pvlib/forecast.py throws errors, as is evident from the current documentation (which contains error messages) and from running the code on my machine. Perhaps the issue is obvious to experts here, but it took me some time to figure out what was happening. I offer a quick-and-dirty fix below, along with a comment/question.
The error is: Handler dispatch failed; nested exception is java.lang.AssertionError: Multiple feature collections cannot be written as a CF dataset. It occurs when fm.get_processed_data(latitude, longitude, start, end) is called. I'm using Python 3.8.10, pvlib 0.8.0, and pandas 1.4.3. I'm also using the GFS ForecastModel, but my fix below could be extended to the other models.
Apparently, the issue is that the current code tries to request several types of variables in one query, for example: u-component_of_wind_isobaric, Total_cloud_cover_entire_atmosphere_Mixed_intervals_Average, and Temperature_surface. But it seems that only one type of variable can be accessed in any one query. The variable type can be identified by the last word in the name: "isobaric", "Average", and "surface". The one exception to this list is the variable Total_cloud_cover_convective_cloud, which I ignore as it does not seem to be used for power calculations with GFS data. The quick fix is to query each variable type separately, which I do below using a kind keyword. After the dataframes are constructed, I concatenate them.
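The split-by-suffix idea can be sketched in isolation. The variable names below are the real THREDDS names from my fix; the kind_of helper itself is just illustrative, not part of pvlib:

```python
# Group THREDDS variable names by their trailing "kind" token, so each
# group can be requested in a separate query.
variables = [
    'u-component_of_wind_isobaric',
    'v-component_of_wind_isobaric',
    'Total_cloud_cover_entire_atmosphere_Mixed_intervals_Average',
    'Temperature_surface',
]

def kind_of(name):
    # The kind is the last underscore-separated word of the variable name.
    return name.rsplit('_', 1)[-1]

groups = {}
for name in variables:
    groups.setdefault(kind_of(name), []).append(name)

print(groups)
# {'isobaric': [...], 'Average': [...], 'surface': [...]}
```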
An issue/question is that given identical time spans, the three queries return dataframes with somewhat different time indexes. I don't know why this happens. In my example run, after concatenation, the forecast_data dataframe looks like this:
data before dropna, shape=(84, 9)
temp_air wind_speed ghi dni dhi total_clouds low_clouds mid_clouds high_clouds
2022-11-18 00:30:00-07:00 NaN NaN 0.000000 0.000000 0.000000 100.000000 0.0 3.700000 100.0
2022-11-18 02:00:00-07:00 14.460541 2.229132 0.000000 0.000000 0.000000 100.000000 0.0 48.299999 100.0
2022-11-18 05:00:00-07:00 13.531006 2.466797 NaN NaN NaN NaN NaN NaN NaN
I simply drop rows with NaNs. Does anyone have a better solution? Interpolating the NaNs would be one possibility. Dropping them reduces the number of rows from 84 to 28, which is quite a cut.
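One alternative worth trying (a sketch on toy data, not tested against live THREDDS output) is time-weighted interpolation before dropping the rows that remain empty at the edges:

```python
import pandas as pd

# Toy frames standing in for two of the per-kind results; the real
# forecast_data comes from pd.concat of the three queries, whose time
# indexes are offset from each other.
idx_a = pd.date_range('2022-11-18 00:00', periods=4, freq='3h', tz='US/Mountain')
idx_b = pd.date_range('2022-11-18 01:30', periods=4, freq='3h', tz='US/Mountain')
a = pd.DataFrame({'temp_air': [10.0, 11.0, 12.0, 13.0]}, index=idx_a)
b = pd.DataFrame({'ghi': [0.0, 50.0, 200.0, 400.0]}, index=idx_b)

merged = pd.concat([a, b], axis=1)
# method='time' weights by the actual timestamps; rows still holding
# NaN at the leading edge are then dropped.
filled = merged.interpolate(method='time').dropna()
print(filled)
```

This keeps 7 of the 8 merged rows instead of only the (rare) rows where all three indexes coincide.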
Below I give my edited version of the GFS class, and after that I give my script (based on forecast.py) which runs it.
class GFS(ForecastModel):
    _resolutions = ['Half', 'Quarter']
    _kind = ['isobaric', 'Average', 'surface']

    def __init__(self, resolution='quarter', set_type='best', kind='isobaric'):
        model_type = 'Forecast Model Data'
        self.kind = kind
        resolution = resolution.title()
        if resolution not in self._resolutions:
            raise ValueError(f'resolution must be in {self._resolutions}')
        if kind not in self._kind:
            raise ValueError(f'kind must be in {self._kind}')
        model = f'GFS {resolution} Degree Forecast'
        # isobaric variables will require a vert_level to prevent
        # excessive data downloads
        # split the variables into different groups, based on kind
        if kind == 'isobaric':
            self.variables = {
                'wind_speed_u': 'u-component_of_wind_isobaric',
                'wind_speed_v': 'v-component_of_wind_isobaric',
            }
            self.output_variables = [
                'wind_speed',
            ]
        elif kind == 'Average':
            self.variables = {
                'total_clouds': 'Total_cloud_cover_entire_atmosphere_Mixed_intervals_Average',
                'low_clouds': 'Low_cloud_cover_low_cloud_Mixed_intervals_Average',
                'mid_clouds': 'Medium_cloud_cover_middle_cloud_Mixed_intervals_Average',
                'high_clouds': 'High_cloud_cover_high_cloud_Mixed_intervals_Average',
                'boundary_clouds': 'Total_cloud_cover_boundary_layer_cloud_Mixed_intervals_Average',
                'ghi_raw': 'Downward_Short-Wave_Radiation_Flux_surface_Mixed_intervals_Average',
            }
            self.output_variables = [
                'ghi',
                'dni',
                'dhi',
                'total_clouds',
                'low_clouds',
                'mid_clouds',
                'high_clouds',
            ]
        elif kind == 'surface':
            self.variables = {
                'temp_air': 'Temperature_surface',
                'wind_speed_gust': 'Wind_speed_gust_surface',
                # 'convect_clouds': 'Total_cloud_cover_convective_cloud',
            }
            self.output_variables = [
                'temp_air',
            ]
        super().__init__(model_type, model, set_type, vert_level=100000)

    def process_data(self, data, cloud_cover='total_clouds', **kwargs):
        """
        Defines the steps needed to convert raw forecast data
        into processed forecast data.

        Parameters
        ----------
        data: DataFrame
            Raw forecast data
        cloud_cover: str, default 'total_clouds'
            The type of cloud cover used to infer the irradiance.

        Returns
        -------
        data: DataFrame
            Processed forecast data.
        """
        data = super().process_data(data, **kwargs)
        if self.kind == 'Average':
            irrads = self.cloud_cover_to_irradiance(data[cloud_cover], **kwargs)
            data = data.join(irrads, how='outer')
        elif self.kind == 'isobaric':
            data['wind_speed'] = self.uv_to_speed(data)
        elif self.kind == 'surface':
            data['temp_air'] = self.kelvin_to_celsius(data['temp_air'])
        return data[self.output_variables]
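For reference, the two conversions used in process_data reduce to simple formulas. This standalone sketch shows what uv_to_speed and kelvin_to_celsius (inherited from ForecastModel) are assumed to compute:

```python
import numpy as np
import pandas as pd

# Standalone equivalents of the inherited helpers, for illustration only.
df = pd.DataFrame({'wind_speed_u': [3.0, 0.0], 'wind_speed_v': [4.0, 5.0]})

# uv_to_speed: magnitude of the horizontal wind vector.
wind_speed = np.sqrt(df['wind_speed_u']**2 + df['wind_speed_v']**2)

# kelvin_to_celsius: subtract 273.15 from the Kelvin temperature.
temp_c = pd.Series([288.15, 300.0]) - 273.15

print(wind_speed.tolist())  # [5.0, 5.0]
```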
My code to run the example in forecast.py is below.
from pvlib import solarposition, irradiance, atmosphere, pvsystem, inverter, temperature
#from pvlib.forecast import GFS, NAM, NDFD, RAP, HRRR
from forecast_module import GFS, NAM, NDFD, RAP, HRRR # my version of the forecast module
import datetime
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
import os
import pandas as pd
import sys
import time
import pytz
# Choose a location.
# Tucson, AZ
latitude = 32.2
longitude = -110.9
tz = 'US/Mountain'
tzinfo = pytz.timezone(tz)
surface_tilt = 30
surface_azimuth = 180 # pvlib uses 0=North, 90=East, 180=South, 270=West convention
albedo = 0.2
start = pd.Timestamp(datetime.date.today(), tz=tz) # today's date
#dt = datetime.datetime(2020, 7, 7, 6, 0, tzinfo=tzinfo)
#start = pd.Timestamp(dt)
end = start + pd.Timedelta(days=7) # 7 days from today
# Define forecast model
data = {}
for kind in ['isobaric', 'Average', 'surface']:
fm = GFS(kind=kind)
data[kind] = {'fm': fm, 'data': fm.get_processed_data(latitude, longitude, start, end)}
forecast_data = pd.concat([data['surface']['data'], data['isobaric']['data'], data['Average']['data']], axis=1)
# The three kinds have different time indices, and so forecast_data has nans in some rows. Drop these.
print("\ndata before dropna, shape={} \n{}\n".format(forecast_data.shape, forecast_data.head()))
forecast_data.dropna(axis=0, how='any', inplace=True)
print("\ndata after dropna, shape={} \n{}\n".format(forecast_data.shape, forecast_data.head()))
forecast_data['temp_air'].plot()
plt.title('temp_air');
plt.savefig("temp_air.png")
plt.close('all')
ghi = forecast_data['ghi']
ghi.plot()
plt.ylabel('Irradiance ($W/m^{-2}$)')
plt.title('GHI');
plt.savefig("ghi.png")
plt.close('all')
# retrieve time and location parameters, for Average which contains dhi/dni data
time = forecast_data.index
a_point = data['Average']['fm'].location
solpos = a_point.get_solarposition(time, method='nrel_numba')
solpos.plot()
plt.title('solar position');
plt.savefig("solpos.png")
plt.close('all')
dni_extra = irradiance.get_extra_radiation(time)
dni_extra.plot()
plt.ylabel('Extra terrestrial radiation ($W/m^{-2}$)')
plt.title('DNI Extra');
plt.savefig("dni_extra.png")
plt.close('all')
airmass = atmosphere.get_relative_airmass(solpos['apparent_zenith'])
airmass.plot()
plt.ylabel('Airmass')
plt.title('Airmass');
plt.savefig("Airmass.png")
plt.close('all')
poa_sky_diffuse = irradiance.haydavies(
surface_tilt,
surface_azimuth,
forecast_data['dhi'],
forecast_data['dni'],
dni_extra,
solpos['apparent_zenith'],
solpos['azimuth']
)
poa_sky_diffuse.plot()
plt.ylabel('Irradiance ($W/m^{-2}$)')
plt.title('POA Sky Diffuse');
plt.savefig("poa_sky_diffuse.png")
plt.close('all')
poa_ground_diffuse = irradiance.get_ground_diffuse(surface_tilt, ghi, albedo=albedo)
poa_ground_diffuse.plot()
plt.ylabel('Irradiance ($W/m^{-2}$)')
plt.title('POA Ground Diffuse');
plt.savefig("poa_ground_diffuse.png")
plt.close('all')
aoi = irradiance.aoi(surface_tilt, surface_azimuth, solpos['apparent_zenith'], solpos['azimuth'])
aoi.plot()
plt.ylabel('Angle of incidence (deg)')
plt.title('Angle of Incidence');
plt.savefig("aoi.png")
plt.close('all')
poa_irrad = irradiance.poa_components(
aoi,
forecast_data['dni'],
poa_sky_diffuse,
poa_ground_diffuse
)
poa_irrad.plot()
plt.ylabel('Irradiance ($W/m^{-2}$)')
plt.title('POA Irradiance');
plt.savefig("poa_irrad.png")
plt.close('all')
ambient_temperature = forecast_data['temp_air']
wnd_spd = forecast_data['wind_speed']
thermal_params = temperature.TEMPERATURE_MODEL_PARAMETERS['sapm']['open_rack_glass_polymer']
pvtemp = temperature.sapm_cell(poa_irrad['poa_global'], ambient_temperature, wnd_spd, **thermal_params)
pvtemp.plot()
plt.ylabel('Temperature (C)');
plt.title('PV Temp');
plt.savefig("pvtemp.png")
plt.close('all')
sandia_modules = pvsystem.retrieve_sam('SandiaMod')
sandia_module = sandia_modules.Canadian_Solar_CS5P_220M___2009_
effective_irradiance = pvsystem.sapm_effective_irradiance(poa_irrad.poa_direct, poa_irrad.poa_diffuse,
airmass, aoi, sandia_module)
sapm_out = pvsystem.sapm(effective_irradiance, pvtemp, sandia_module)
print("\nsapm_out= \n{}\n".format(sapm_out.head()))
sapm_out[['p_mp']].plot()
plt.ylabel('DC Power (W)');
plt.title('DC Power');
plt.savefig("dc_power.png")
plt.close('all')
sapm_inverters = pvsystem.retrieve_sam('sandiainverter')
sapm_inverter = sapm_inverters['ABB__MICRO_0_25_I_OUTD_US_208__208V_']
p_ac = inverter.sandia(sapm_out.v_mp, sapm_out.p_mp, sapm_inverter)
p_ac.plot()
plt.ylabel('AC Power (W)')
plt.ylim(0, None);
plt.title('AC Power');
plt.savefig("ac_power.png")
plt.close('all')
print("\np_ac describe= \n{}\n".format(p_ac.describe()))
print(p_ac.index.freq)
p_ac[start:start+pd.Timedelta(days=2)].plot();
plt.title('AC Power, 2 Day');
plt.savefig("ac_power_2day.png")
plt.close('all')
# integrate power to find energy yield over the forecast period
energy = p_ac.sum() * 3
print("\nenergy= {}\n".format(energy))
# I think it would be better to integrate p_ac over time to get the energy.
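A sketch of that time integration on toy data (assuming p_ac is in watts on a DatetimeIndex); the trapezoidal rule also handles irregular time steps, unlike sum() * 3:

```python
import numpy as np
import pandas as pd

# Toy AC power series at 3-hour steps, standing in for p_ac (watts).
idx = pd.date_range('2022-11-18 06:00', periods=4, freq='3h', tz='US/Mountain')
p_ac = pd.Series([0.0, 100.0, 150.0, 50.0], index=idx)

# Elapsed hours since the first timestamp.
hours = (p_ac.index - p_ac.index[0]).total_seconds() / 3600.0
dt = np.diff(hours)

# Trapezoidal rule: average adjacent samples, weight by interval length.
energy_wh = float(np.sum((p_ac.values[:-1] + p_ac.values[1:]) / 2 * dt))
print(energy_wh)  # 825.0
```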