PyCropYieldPrediction-withTransfer

An extension of Gabriel Tseng's PyTorch implementation of Jiaxuan You's Deep Gaussian Process built on a CNN for soybean crop forecasting in Argentina. In addition, code components from the work "Deep Transfer Learning for Crop Yield Prediction with Remote Sensing Data" are used to export Argentine satellite data.

The code was used to produce the results published in "Leveraging Remote Sensing Data for Yield Prediction with Deep Transfer Learning". The paper is open access and can be found at https://www.mdpi.com/1424-8220/24/3/770 . If you find our code helpful, please cite our work as follows:

@Article{s24030770,
AUTHOR = {Huber, Florian and Inderka, Alvin and Steinhage, Volker},
TITLE = {Leveraging Remote Sensing Data for Yield Prediction with Deep Transfer Learning},
JOURNAL = {Sensors},
VOLUME = {24},
YEAR = {2024},
NUMBER = {3},
ARTICLE-NUMBER = {770},
URL = {https://www.mdpi.com/1424-8220/24/3/770},
ISSN = {1424-8220},
DOI = {10.3390/s24030770}
}

Pipeline

USA

Exporting

Run

python run.py export

to export the US satellite data to your Google Drive. You will need up to 165 GB of storage. The export class supports checkpointing, and the Earth Engine Task Manager shows your ongoing tasks. The export may take a long time. Once all the data has been exported to your Google Drive, move the folders crop_yield-data_image, crop_yield-data_mask and crop_yield-data_temperature into your local data folder (Google Drive Desktop is recommended; otherwise the data will be downloaded as many ZIP files). The yield data can be downloaded from the USDA. Examples of the format can be found in the data directory.
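Before moving on, it can help to confirm that the three exported folders actually arrived in the local data folder. The following is a minimal sketch, not part of the repository: the function name missing_export_folders and the location "data" are assumptions, and only the three folder names come from the step above.

```python
from pathlib import Path

# Folder names taken from the export step above.
EXPECTED_FOLDERS = (
    "crop_yield-data_image",
    "crop_yield-data_mask",
    "crop_yield-data_temperature",
)


def missing_export_folders(data_dir):
    """Return the expected export folders that are absent or empty in data_dir."""
    data_dir = Path(data_dir)
    missing = []
    for name in EXPECTED_FOLDERS:
        folder = data_dir / name
        # A folder counts as missing if it does not exist or holds no entries yet.
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(name)
    return missing


if __name__ == "__main__":
    # "data" is an assumed location for the local data folder.
    print(missing_export_folders("data"))
```

An empty list means all three folders exist and contain at least one entry; it does not check that the export itself is complete.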

(Optional) Data Cleansing

If you want to apply our data cleansing (>2000 cropland pixels) to your own data, run

python run.py data_cleansing

and

python cyp/data/merge_yield_pix-count_usa.py

Note that the columns of the corresponding CSV files are addressed by position, so the column order must match ours. The formatting of our data can be found in the data directory.
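The cleansing rule itself is a simple threshold on the cropland pixel count. The sketch below illustrates the idea on a positional CSV; it is not the code from run.py or merge_yield_pix-count_usa.py. The column layout, the function name filter_by_pixel_count and the sample rows are hypothetical, and the strict ">" comparison follows the ">2000" wording above.

```python
import csv
import io

MIN_CROPLAND_PIXELS = 2000  # threshold stated in this README


def filter_by_pixel_count(rows, count_col, min_pixels=MIN_CROPLAND_PIXELS):
    """Keep rows whose pixel count (at position count_col) exceeds min_pixels."""
    kept = []
    for row in rows:
        # Columns are addressed by position, so the column order matters.
        if int(row[count_col]) > min_pixels:
            kept.append(row)
    return kept


# Hypothetical layout: year, state, county, yield, cropland pixel count.
sample = io.StringIO(
    "2015,17,1,55.2,3500\n"
    "2015,17,3,48.9,1200\n"
    "2016,19,5,60.1,2001\n"
)
rows = list(csv.reader(sample))
print(filter_by_pixel_count(rows, count_col=4))
```

Because the filter reads the count by index, a file with reordered columns would be filtered on the wrong field without raising an error.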

Preprocessing

python run.py process

Merges data and splits them by year. Saves files as .npy files.
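As an illustration of the "split by year, save as .npy" idea, here is a minimal NumPy sketch. It does not reproduce the repository's preprocessing: the function name save_by_year, the one-file-per-year naming and the toy arrays are assumptions made for the example.

```python
import tempfile
from pathlib import Path

import numpy as np


def save_by_year(samples, years, out_dir):
    """Write one .npy file per year, each holding that year's samples."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = {}
    for year in np.unique(years):
        path = out_dir / f"{int(year)}.npy"
        # Boolean mask selects the rows belonging to this year.
        np.save(path, samples[years == year])
        paths[int(year)] = path
    return paths


if __name__ == "__main__":
    samples = np.arange(12).reshape(6, 2)  # six toy samples
    years = np.array([2015, 2015, 2016, 2016, 2016, 2017])
    with tempfile.TemporaryDirectory() as tmp:
        paths = save_by_year(samples, years, tmp)
        print({year: np.load(path).shape for year, path in paths.items()})
```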

Feature Engineering

python run.py engineer

Generates histograms from the processed .npy files.
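The histogram step can be pictured as counting pixel values per band and time step. The sketch below is only an illustration under stated assumptions: the (height, width, time, band) array layout, the 32 bins, the value range and the function name band_histograms are all hypothetical and may differ from what run.py engineer does.

```python
import numpy as np


def band_histograms(image, num_bins=32, value_range=(0.0, 1.0)):
    """Turn an (H, W, T, B) image stack into normalised (B, T, num_bins) histograms."""
    _, _, num_times, num_bands = image.shape
    hist = np.zeros((num_bands, num_times, num_bins))
    for b in range(num_bands):
        for t in range(num_times):
            counts, _ = np.histogram(
                image[:, :, t, b], bins=num_bins, range=value_range
            )
            total = counts.sum()
            # Normalise so each histogram sums to 1; leave empty ones as zeros.
            if total > 0:
                hist[b, t] = counts / total
    return hist


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    toy_image = rng.random((8, 8, 3, 2))  # 8x8 pixels, 3 time steps, 2 bands
    print(band_histograms(toy_image).shape)
```

Discarding pixel positions this way keeps the input size fixed regardless of the size of the region.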

(Optional) Hyperparameter tuning

python run.py run_optuna_usa

Non-cross-validated hyperparameter search (run hyp_multi_trans_cnn_usa for a ten-fold cross-validation, but its runtime is very long). Results are saved in the data folder under the name given by out_hyp_csv.
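The runtime difference comes from how often each candidate configuration is trained: once for the plain search, ten times for ten-fold cross-validation. The following standard-library sketch shows generic k-fold splitting; it is not the repository's implementation, and the names k_fold_indices, cross_validated_score and the evaluate callback are hypothetical.

```python
import random


def k_fold_indices(num_samples, k=10, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation."""
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val_idx = folds[i]
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, val_idx


def cross_validated_score(evaluate, num_samples, k=10):
    """Average a user-supplied evaluate(train_idx, val_idx) over all k folds."""
    scores = [evaluate(tr, va) for tr, va in k_fold_indices(num_samples, k)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Placeholder evaluate: a real one would train a model and return its error.
    print(cross_validated_score(lambda tr, va: len(va), num_samples=25))
```

Each hyperparameter trial would call evaluate k times, which is why the cross-validated search costs roughly k times as much as the plain one.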

Model Training

python run.py train_cnn

Trains the CNN and saves the model and the results in data/models/<new_model>. Additional information is logged to your Weights and Biases account.

Argentina

The basic procedure for Argentina is the same, but some paths and names need to be adjusted. The step descriptions from the US pipeline apply here as well and are not repeated.

Exporting

python cyp/data/argentina_export.py

The yield data can be downloaded from the Ministerio de Agricultura. Examples of the format can be found in the data directory.

(Optional) Data Cleansing

python run.py data_cleansing

Adjust the names and paths inside run.py to the Argentinian values, as indicated by the comments in the file.

python cyp/data/yield-csv_to-utf_with-buxacre.py

This removes Spanish special characters, converts tons per acre to bushels per acre, and applies the data cleansing threshold of at least 2000 cropland pixels. The variable YIELDFILE at the top of the script can be changed to the name of your yield data file.
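The two text and unit transformations can be sketched with the standard library. This is an illustration, not the code from yield-csv_to-utf_with-buxacre.py: it assumes that "tons" means metric tons, that a soybean bushel weighs 60 lb, and that accented characters are replaced by their base letters; the function names are hypothetical.

```python
import unicodedata

# Assumptions: metric tons, and the standard 60 lb soybean bushel.
LB_PER_METRIC_TON = 2204.62262
LB_PER_SOYBEAN_BUSHEL = 60.0


def strip_accents(text):
    """Replace accented characters with their unaccented ASCII base letters."""
    decomposed = unicodedata.normalize("NFKD", text)
    # Dropping non-ASCII bytes removes the combining accent marks.
    return decomposed.encode("ascii", "ignore").decode("ascii")


def tons_to_bushels(tons_per_acre):
    """Convert a soybean yield from metric tons per acre to bushels per acre."""
    return tons_per_acre * LB_PER_METRIC_TON / LB_PER_SOYBEAN_BUSHEL


if __name__ == "__main__":
    print(strip_accents("Córdoba"))
    print(round(tons_to_bushels(1.0), 2))
```

Under these assumptions one metric ton of soybeans corresponds to about 36.74 bushels.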

Preprocessing

python run.py process_argentina

Feature Engineering

python run.py arg_engineer

(Optional) Hyperparameter tuning

python run.py run_optuna

Model Training

python run.py train_trans_cnn

To change the referenced US model, adjust the paths in models/transfer_base.py and models/transfer_convnet.py.

Setup

To set up the environment, the package manager Anaconda with Python 3.7 is required. Run

conda env create -f crop_yield_prediction.yml

to create an environment named crop_yield_prediction and run

conda activate crop_yield_prediction

to activate the environment.
Additionally, you need to sign up for Google Earth Engine and authenticate within the crop_yield_prediction environment by running

earthengine authenticate

and following the instructions.
Weights and Biases is used to track experiments. Run

wandb login

and follow the instructions to activate wandb. You can also disable it by running

wandb disabled

