pvlib.iam.marion_integrate uses too much memory for vector inputs #1402
Comments
I see no reason not to use What about a third option - complement the existing I don't think (*) I think the calculation may be wrong when applied to ground-reflected irradiance, since an integrating element is defined by a solid angle, not by an equal area portion of the viewed ground. I think that a proper view factor is needed in the integrand. The sky diffuse calculation doesn't have this problem because it is assumed that the irradiance source is the hemisphere rather than the ground plane. |
What an interesting conundrum. I too feel
Sorry, these are just some random ideas, because this problem just seems so fun! |
Not wholly opposed, but the cool part of
Good point, I was ignoring the parameter names and treating it like a general interpolator:
Does Equation 8 in the reference not define a proper solid angle integration element? Maybe I need to take another look but I thought that was done correctly for all three regions, including the ground plane.
Yes this would do the same memory reclamation as the |
Not opposed to del but would the numpy out kwarg help? |
I think it's worth trying. I'd like to see its memory profile in the chart with the other options. |
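For reference, the `out` idea can be sketched as below. This is only a toy illustration, not pvlib's actual code: the function names `scaled_cosine_naive` and `scaled_cosine_inplace` are hypothetical, and the arithmetic is a stand-in for the real integrand. It shows how passing `out=` to NumPy ufuncs lets one buffer be reused instead of allocating a new full-size array at every step.

```python
import numpy as np


def scaled_cosine_naive(angles_deg, scale):
    # Each operation here allocates a fresh full-size temporary array.
    return scale * np.cos(np.radians(angles_deg))


def scaled_cosine_inplace(angles_deg, scale):
    # Hypothetical variant: allocate one buffer, then overwrite it in place.
    buf = np.radians(angles_deg)      # the only full-size allocation
    np.cos(buf, out=buf)              # result written back into buf
    np.multiply(buf, scale, out=buf)  # result written back into buf
    return buf


angles = np.linspace(0, 90, 1_000_000)
naive = scaled_cosine_naive(angles, 2.0)
inplace = scaled_cosine_inplace(angles, 2.0)
```

Whether this beats `del` in practice would still need to be measured with the same profiling setup as the other options.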
I don't oppose anything above. Other options:
|
I think you can still use a (dynamically calculated) lookup table with custom IAM functions. Something like:

    if isinstance(surface_tilt, pd.Series):
        # use a look-up table to avoid memory error
        lut_surface_tilts = np.linspace(0, 90, 181)
        idxs = np.searchsorted(lut_surface_tilts, surface_tilt)
        iam_diffuse_sky = pvlib.iam.marion_integrate(iam_function, lut_surface_tilts, "sky")[idxs]
    else:
        # surface_tilt is just a number, no need for a look-up table
        iam_diffuse_sky = pvlib.iam.marion_integrate(iam_function, surface_tilt, "sky")

|
Hi Karel, long time no see! Some ideas:
|
Hi Anton, Nice to interact with you! I do agree that there are better options than checking for a Series. I just wanted to post the general idea. However, I don't think interpolation is worth the effort. The LUT values are already close enough. Unless one cares about the 4th digit after the comma... |
Hi Karel, for me it's not about accuracy but I have a phobia of discontinuities! Your general idea is good. |
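To make the look-up-table-plus-interpolation idea concrete, here is a minimal sketch. It assumes a hypothetical stand-in function `expensive_diffuse_iam` in place of a costly call such as `pvlib.iam.marion_integrate(iam_function, surface_tilt, "sky")`; the helper name `diffuse_iam_via_lut` and the 91-point grid are also assumptions for illustration. Linear interpolation with `np.interp` keeps the result continuous in `surface_tilt`, which avoids the step changes of a nearest-index lookup.

```python
import numpy as np


def expensive_diffuse_iam(surface_tilt):
    # Hypothetical stand-in for a memory-hungry integration; the formula
    # is arbitrary and only gives the example a smooth function of tilt.
    return 0.95 - 0.05 * np.cos(np.radians(surface_tilt)) ** 2


def diffuse_iam_via_lut(surface_tilt, n_points=91):
    # Evaluate the expensive function on a small grid of tilts only.
    lut_tilts = np.linspace(0, 90, n_points)
    lut_values = expensive_diffuse_iam(lut_tilts)
    # Linearly interpolate between grid points (values outside 0-90
    # are clamped to the end points by np.interp).
    return np.interp(surface_tilt, lut_tilts, lut_values)


# An 8760-long tilt series, as for an hourly tracker simulation.
tilts = np.random.default_rng(0).uniform(0, 90, 8760)
approx = diffuse_iam_via_lut(tilts)
exact = expensive_diffuse_iam(tilts)
max_err = np.max(np.abs(approx - exact))
```

With this approach the expensive function only ever sees `n_points` inputs, regardless of how long the time series is.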
`pvlib.iam.marion_integrate` (which is mostly relevant as a helper for `pvlib.iam.marion_diffuse`) needs quite a bit of memory when passed vector inputs. An input of length 1000 allocates around 2 GB of memory on my machine, so naively passing in a standard 8760 would use roughly 17-18 GB. Unfortunately I was very much focused on fixed tilt simulations when I wrote pvlib's implementation and never tried it out on large vector inputs, so this problem went unnoticed until @spaneja pointed it out to me.

I think any vectorized implementation of this algorithm is going to be rather memory-heavy, so I'm skeptical that achieving even a factor of 10 reduction in memory usage is possible here without completely changing the approach (and likely shifting the burden from memory to CPU). However, here are two low-hanging fruits worth considering:
- Using a `del` statement to tell Python that large intermediate arrays are no longer needed allows Python to reclaim that memory immediately and recycle it for subsequent allocations. This is probably a simplification of what actually happens, but it seems consistent with the observations below.
- Using `np.float32` cuts memory usage in half compared with `np.float64` and (probably) doesn't meaningfully change the result. It's not like `surface_tilt` has more than a few sig figs anyway.

Here is a rough memory and timing comparison (using memory_profiler, very handy).
`pvlib` is the current implementation; the two `del` variants use a strategic sprinkling of `del` but are otherwise not much different from `pvlib`. This is for an input of length 1000. The traces here are memory usage sampled at short intervals across a single function invocation; for example the blue `pvlib` trace shows that the function call took 1.4 seconds to complete and had a peak memory usage slightly higher than 2 GB.

So using a few `del`s cuts peak memory usage roughly in half. Dropping down to `np.float32` cuts it roughly in half again (and gives a nontrivial speedup too). It's possible that further improvements can be had with other tricks (e.g. using the `out` parameter that some numpy functions provide) but I've not yet explored them.

My main question: are we open to using these two strategies in pvlib? Despite being built into python itself, `del` still seems unpythonic to me for some reason. Switching away from `float64` is objectionable to the extent that it's the standard in scientific computing and is therefore baked into the models by assumption. I think I'm cautiously open to both of the above approaches, iff they are accompanied by good explanatory comments and switching to `float32` can be reasonably shown to not introduce a meaningful difference in output.

Remark: even ignoring this memory bloat, I tend to think that applying `marion_integrate` directly to an 8760 is a bit strange. In simulations with time series `surface_tilt`s, a better approach IMHO is to calculate the IAM values only for `np.linspace(0, 90, 91)` (1-degree steps) or similar and use `pvlib.iam.interp` to generate the 8760 IAM series. If nothing else, we might suggest that in the docs.
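As a rough illustration of the two strategies (not the actual `marion_integrate` internals), the sketch below uses a hypothetical function `clipped_mean` that builds 2-D intermediates of shape `(len(tilt), n)`, releases them with `del` once they are no longer needed, and accepts a `dtype` argument so `float32` and `float64` results can be compared. The arithmetic is a toy stand-in chosen only to produce large temporaries.

```python
import numpy as np


def clipped_mean(tilt_deg, n=180, dtype=np.float64):
    # Column vector of tilts and row vector of integration angles,
    # so that products broadcast to a (len(tilt), n) array.
    tilt = np.radians(np.asarray(tilt_deg, dtype=dtype))[:, np.newaxis]
    theta = np.linspace(0, np.pi / 2, n, dtype=dtype)[np.newaxis, :]

    a = np.cos(tilt) * np.cos(theta)  # large temporary
    b = np.sin(tilt) * np.sin(theta)  # large temporary
    c = a + b
    # a and b are not used again; dropping the references lets the
    # allocator reuse that memory for anything allocated afterwards.
    del a, b

    np.clip(c, 0, 1, out=c)  # in place, no extra full-size array
    result = c.mean(axis=1)
    del c
    return result


tilts = np.linspace(0, 90, 1000)
r64 = clipped_mean(tilts, dtype=np.float64)
r32 = clipped_mean(tilts, dtype=np.float32)
max_diff = np.max(np.abs(r64 - r32))  # size of the float32 vs float64 discrepancy
```

Comparing `r32` against `r64` in this way is one possible check that dropping to `float32` does not change the output meaningfully; the real function would need the same comparison on realistic inputs.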