An extension of Gabriel Tseng's PyTorch implementation of Jiaxuan You's Deep Gaussian Process built on a CNN for soybean crop forecasting in Argentina. In addition, code components from the work "Deep Transfer Learning for Crop Yield Prediction with Remote Sensing Data" are used to export Argentine satellite data.
The code was used to produce the results published in "Leveraging Remote Sensing Data for Yield Prediction with Deep Transfer Learning". The paper is open access and available at https://www.mdpi.com/1424-8220/24/3/770 . If you find our code helpful, please cite our work as follows:
@Article{s24030770,
AUTHOR = {Huber, Florian and Inderka, Alvin and Steinhage, Volker},
TITLE = {Leveraging Remote Sensing Data for Yield Prediction with Deep Transfer Learning},
JOURNAL = {Sensors},
VOLUME = {24},
YEAR = {2024},
NUMBER = {3},
ARTICLE-NUMBER = {770},
URL = {https://www.mdpi.com/1424-8220/24/3/770},
ISSN = {1424-8220},
DOI = {10.3390/s24030770}
}
Run
python run.py export
to export the US satellite data to your Google Drive. You will need up to 165 GB of storage. The export class supports checkpointing.
The Earth Engine Task Manager shows your ongoing tasks. The export can take a long time.
Once all the data has been exported to your Google Drive, you can drag the folders crop_yield-data_image, crop_yield-data_mask and
crop_yield-data_temperature into your local data folder (Google Drive Desktop is recommended,
otherwise the data will be downloaded as many separate ZIP files).
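Before continuing, it can help to confirm that all three exported folders actually arrived in the local data folder. The following is a minimal sketch and not part of the repository: the helper name `missing_export_folders` is hypothetical, and only the three folder names are taken from the text above.

```python
from pathlib import Path

# Folder names as listed above; everything else here is illustrative.
EXPECTED_FOLDERS = (
    "crop_yield-data_image",
    "crop_yield-data_mask",
    "crop_yield-data_temperature",
)


def missing_export_folders(data_dir):
    """Return the expected export folders that are not present in data_dir."""
    root = Path(data_dir)
    return [name for name in EXPECTED_FOLDERS if not (root / name).is_dir()]
```

Calling `missing_export_folders("data")` returns an empty list when all three folders are in place.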
The yield data can be downloaded from the USDA. Examples of the format can be found in the data directory.
If you want to apply our data cleansing (more than 2000 cropland pixels) to your own data, run
python run.py data_cleansing
and
python cyp/data/merge_yield_pix-count_usa.py
Note that the corresponding CSV files are addressed by column order. The formatting of our data can be found in the data directory.
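To illustrate the cleansing rule, here is a minimal sketch that keeps only rows whose cropland pixel count exceeds the threshold, addressing the count by column position rather than by name. It is not the repository's implementation: the function name, the column index, and the sample layout are assumptions made for the example.

```python
import csv
import io


def keep_rows_above_pixel_threshold(csv_text, count_column, min_pixels=2000):
    """Return (header, rows) with only rows whose pixel count is > min_pixels.

    count_column is the zero-based position of the pixel-count column;
    which position that is depends on your own CSV layout (assumption).
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    kept = [row for row in reader if int(row[count_column]) > min_pixels]
    return header, kept
```

Because the column is selected by position, reordering columns in your own file silently changes which values are compared, so it is worth checking the layout against the examples in the data directory first.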
python run.py process
Merges the data and splits it by year. Saves the results as .npy files.
python run.py engineer
Generates histograms from the processed .npy files.
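As a rough illustration of what turning pixel values into a histogram involves, the sketch below bins the values of one band into equal-width bins and normalizes the counts. The bin count, the value range, and the function name are assumptions for the example and are not taken from the repository, which works on the .npy arrays directly.

```python
def band_histogram(pixels, num_bins=32, value_range=(0.0, 1.0)):
    """Normalized equal-width histogram of one band's pixel values.

    Values outside value_range are ignored. Returns a list of num_bins
    fractions that sum to 1, or all zeros if no value falls in range.
    """
    low, high = value_range
    width = (high - low) / num_bins
    counts = [0] * num_bins
    for value in pixels:
        if value < low or value > high:
            continue
        # The upper edge belongs to the last bin.
        index = min(int((value - low) / width), num_bins - 1)
        counts[index] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * num_bins
    return [count / total for count in counts]
```

Reducing each image to per-band histograms discards pixel positions and keeps only the distribution of values, which is what makes the input size independent of the region's shape.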
python run.py run_optuna_usa
Runs a non-cross-validated hyperparameter search (run hyp_multi_trans_cnn_usa for a ten-fold cross-validation, but its runtime is immense).
Results are saved in the data folder with the name given by out_hyp_csv.
python run.py train_cnn
Trains the CNN and saves the model and the results in data/models/<new_model>. Additional information is saved to your Weights and Biases account.
The basic procedure for Argentina is the same, but some paths and names need to be adjusted. The step descriptions carry over from the US pipeline and are not repeated here.
python cyp/data/argentina_export.py
The yield data can be downloaded from the Ministerio de Agricultura. Examples of the format can be found in the data directory.
python run.py data_cleansing
Adjust the names and paths inside run.py to the Argentine values, as indicated in the comments there.
python cyp/data/yield-csv_to-utf_with-buxacre.py
This removes Spanish characters, converts tons per acre to bushels per acre, and applies the data cleansing (at least 2000 cropland pixels).
The variable YIELDFILE in the head of the script can be changed to the name of your yield data file.
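The two transformations described above can be sketched as follows. This is an illustration under stated assumptions, not the script's actual code: it assumes metric tons and the standard soybean bushel weight of 60 lb, and both function names are hypothetical. Check the script itself for the exact factor it uses.

```python
import unicodedata

# Assumptions: metric tons, and 60 lb per bushel of soybeans.
LB_PER_METRIC_TON = 2204.62262
LB_PER_BUSHEL_SOYBEAN = 60.0


def strip_accents(text):
    """Replace accented characters by their unaccented base letters."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def tons_to_bushels_per_acre(tons_per_acre):
    """Convert a soybean yield from tons per acre to bushels per acre."""
    return tons_per_acre * LB_PER_METRIC_TON / LB_PER_BUSHEL_SOYBEAN
```

For example, `strip_accents("Córdoba")` gives `"Cordoba"`, and under these assumptions one ton per acre corresponds to roughly 36.74 bushels per acre.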
python run.py process_argentina
python run.py arg_engineer
python run.py run_optuna
python run.py train_trans_cnn
To change the referenced US model, adjust the paths in models/transfer_base.py and models/transfer_convnet.py.
To set up the environment, the package manager Anaconda with Python 3.7 is required. Run
conda env create -f crop_yield_prediction.yml
to create an environment named crop_yield_prediction and run
conda activate crop_yield_prediction
to activate the environment.
Additionally, you need to sign up for Google Earth Engine
and authenticate yourself within the crop_yield_prediction environment by running
earthengine authenticate
and following the instructions.
Weights and Biases is used to track experiments. Run
wandb login
and follow the instructions to activate wandb. You can also disable it by running
wandb disabled
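As an alternative to the command above, wandb also reads the WANDB_MODE environment variable, so tracking can be switched off for a single process. This is a minimal sketch, assuming the variable is set before wandb is initialized; the repository itself does not do this, and the helper name is hypothetical.

```python
import os


def disable_wandb_for_this_process():
    """Turn wandb logging into a no-op for the current process only."""
    # Must run before wandb.init() is called anywhere in the program.
    os.environ["WANDB_MODE"] = "disabled"


disable_wandb_for_this_process()
```

Unlike the command-line setting, this does not persist beyond the running process.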