Organize pvlib.data #1056

Open
cwhanse opened this issue Sep 9, 2020 · 6 comments
Comments

@cwhanse
Member

cwhanse commented Sep 9, 2020

pvlib.data currently contains 1) databases for module and inverter models; 2) Linke turbidity values; 3) data files for tests and examples; and 4) variable_style_rules.csv. It is accurately described as a "junk drawer." The majority of the data files are in category 3, supporting tests.

As a start, maybe create data.tests, and perhaps subfolders within tests that mirror the subfolders in pvlib.tests. And perhaps add a prefix or other text to file names to help identify where or how each file is used, e.g., PVsyst_demo.csv becomes test_sdm_pvsyst_demo.csv.
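The renaming idea can be sketched as follows. This is only an illustration: `prefixed_name` is a hypothetical helper, and the `test_<topic>_` pattern is inferred from the single example above rather than from any agreed convention.

```python
from pathlib import PurePosixPath


def prefixed_name(filename: str, topic: str) -> str:
    """Build a name of the form test_<topic>_<lower-cased stem><suffix>.

    Hypothetical helper; the naming pattern is an assumption based on the
    PVsyst_demo.csv -> test_sdm_pvsyst_demo.csv example.
    """
    path = PurePosixPath(filename)
    return f"test_{topic}_{path.stem.lower()}{path.suffix}"


print(prefixed_name("PVsyst_demo.csv", "sdm"))  # test_sdm_pvsyst_demo.csv
```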

@wholmgren
Member

Mirroring the structure would be an improvement.

Another option to consider is moving the subpackage tests into the subpackage, along with a data subdirectory within that test directory. For example:

pvlib/data  # databases, linke turbidity, anything a user might need
pvlib/tests  # test_atmosphere.py, etc
pvlib/tests/data  # singleaxis_tracker_wslope.csv, etc
pvlib/iotools/tests  # test_tmy.py, etc
pvlib/iotools/tests/data  # pvgis_tmy_test.dat, etc
pvlib/ivtools/tests
pvlib/ivtools/tests/data

I proposed a similar structure when we created pvlib/tests/iotools. I was outvoted but I still think it's better!
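With a layout like the one listed above, each test module could find its data relative to its own location. The sketch below assumes a hypothetical helper `data_dir_for`; it is not an existing pvlib function, and the file names in the comments are taken from the listing.

```python
from pathlib import Path


def data_dir_for(test_file: str) -> Path:
    """Return the `data` directory that sits next to a test module.

    Hypothetical helper, shown only to illustrate the proposed layout.
    """
    return Path(test_file).resolve().parent / "data"


# Inside a test module such as pvlib/iotools/tests/test_tmy.py one might write:
#     DATA_DIR = data_dir_for(__file__)
#     tmy_file = DATA_DIR / "pvgis_tmy_test.dat"
```

Because the path is derived from the test module itself, moving a tests directory together with its data subdirectory would not require changing the tests.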

@cwhanse
Member Author

cwhanse commented Sep 9, 2020

I'm in favor of pvlib/tests/data, etc. rather than pvlib/data/tests.

@echedey-ls
Contributor

echedey-ls commented Oct 20, 2024

68 files to organize... I propose splitting up the work to make it manageable. I'd say the first step is to categorize each file. Feel free to react to this message with a 👍 if you want to contribute to that. Next week I'll assign each volunteer a range of files by mentioning them in this message.

For those brave enough, you can start on it now: https://docs.google.com/spreadsheets/d/12LeEFa9-wRqc3v7utfgcTk96KTmaWfhHSPkLx6K0QhY/edit?usp=sharing

There are five categories: the four Cliff listed in this issue plus one for unknown; multiple can be selected for each entry. My approach would be to look up where these files are mentioned and select the appropriate labels. This could be automated, but I don't feel like overengineering today, and automation wouldn't account for files not mentioned anywhere (if any).
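For reference, the lookup step could be automated roughly like this. It is a sketch under assumptions: `find_mentions` and `unmentioned` are hypothetical names, the search is a plain substring match over `*.py` and `*.rst` files, and the resulting labels would still need a human review.

```python
from pathlib import Path


def find_mentions(repo_root, data_dir, patterns=("*.py", "*.rst")):
    """Map each data file name to the source files whose text mentions it."""
    repo_root = Path(repo_root)
    data_files = sorted(p.name for p in Path(data_dir).iterdir() if p.is_file())
    mentions = {name: [] for name in data_files}
    for pattern in patterns:
        for source in sorted(repo_root.rglob(pattern)):
            if not source.is_file():
                continue
            text = source.read_text(encoding="utf-8", errors="ignore")
            for name in data_files:
                # Plain substring match: short names may give false positives.
                if name in text:
                    mentions[name].append(source.relative_to(repo_root).as_posix())
    return mentions


def unmentioned(mentions):
    """Return the data files that no scanned source file refers to."""
    return sorted(name for name, sources in mentions.items() if not sources)
```

Files returned by `unmentioned` are the ones a text search cannot classify, so they would fall into the "unknown" category until someone checks them by hand.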

React 👎 if you are against doing it this way (and potentially have a better idea)

@cwhanse
Member Author

cwhanse commented Oct 21, 2024

@echedey-ls can you add "Lookup table" or something like that to the pull-down options, for files like the CEC module parameters?

@echedey-ls
Contributor

This is a summary of the classification, available in the spreadsheet's third sheet.

[Image: summary of the file classification]

@AdamRJensen
Member

> This is a summary of the classification, available in the spreadsheet's third sheet.

@echedey-ls This looks good to me.

Should we create a sub-folder for the files only used for testing, i.e., the files that can be excluded?
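One way to pick the candidates for such a sub-folder, given a classification like the one in the spreadsheet, is sketched below. The function name, the label strings, and the file names are all hypothetical placeholders, not the actual spreadsheet contents.

```python
def files_used_only_in_tests(labels):
    """Return the files whose only label is "tests".

    `labels` maps a file name to the set of categories assigned to it.
    """
    return sorted(name for name, cats in labels.items() if set(cats) == {"tests"})


# Placeholder classification for illustration only.
example_labels = {
    "example_a.csv": {"tests"},
    "example_b.csv": {"tests", "examples"},
    "example_c.csv": {"database"},
}
print(files_used_only_in_tests(example_labels))  # ['example_a.csv']
```

Files carrying more than one label stay where they are, since excluding them would affect something other than the tests.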

4 participants