Hierarchical_foliar_trait_estimation

Estimating foliar plant function traits using hyperspectral data.

Paper

The working manuscript file is paper_manuscript.qmd

Getting and Cleaning ECOSIS data

All codes in R_codes/Getting_and_cleaning_ECOSIS_data

Getting ECOSIS data

The code 01_get_data_ecosis.R reads the ECOSIS website and downloads data which has not been already downloaded and separates the output into metadata and spectra folders

Cleaning ECOSIS data

02_create_parquet_metadata.R: This is the main code that writes the parquet file for the metadata. Each data folder that is analyzed has its own parquet file which contains the downloaded metadata dataframe plus the metadata parquet file that I create. The code also writes a trait_already_done_for_metadata.txt to the data folder which contains the traits that I have already added the metadata for.
The code 03_adding_genus_species_info.R crunches the metadata and adds a column for genus and species. We also add Family data using taxize package. Other data related to taxize can be added as well if the user wants.

Supporting_R_functions

These are .R files in R_codes/supporting_R_functions used by main R files

getting_traits_data.R: This code has functions used by adding_more_metadata_information_to_metadata_files.R. Code is mostly automatic but needs a visual inspection for new trait names and to build the database in [[trait_and_sample_id_Database_for_ECOSIS_Data.R]].
[[Steps_to_Add_Metadata_for_a_Trait_in_a_dataset.R]]: This is an instruction file for [[adding_more_metadata_information_to_metadata_files.R]] and needs to be read before running beyond Line 20 of the [[adding_more_metadata_information_to_metadata_files.R]]. This file contains steps on how and where to add the metadata information which is then read by the various R-codes.
[[trait_and_sample_id_Database_for_ECOSIS_Data.R]]: This is where most of the metadata specific to each dataset is saved.
[[scrape_wikipedia_for_species_metadata_table.R]]: This is the code which scrapes wikipedia and adds metadata group information. The data arising from this is stored in data/Species_data. Update: I found that using ChatGPT, it is much more accurate, so the updated species_attribute_data is the one that is currently being used in the code (data/species_data/species_attribute_data_Dhruva_GPT.csv)

Extra data cleaning for some datasets

Folder: Getting_and_cleaning_ECOSIS_Data/Further_data_cleaning: In some of the datasets, I write further R-codes which clean the metadata so that they can be used in the analysis. For such datasets, the original downloaded datasets are in the sub-folder "data_downloaded_using_api" and the modified versions are in the main dataset folder.

Where data are kept

Both the downloaded data and the parquet versions are in the raw_data folder inside their respective dataset folders.

Regression Analysis

Bayesian Analysis

Run the file supervised_pc_and_raw_spectra_bayesian.R with the prediction algorithm set to "raw_spectra". This will output and save a brms file which will have the MCMC samples from the fitted model. This uses all of the spectra.
Next run "covariate_reduction.R" to run the analysis to find the best subset of spectra which gives similar prediction as the full model. This requires the output brms file saved in the above step.

Name		Name	Last commit message	Last commit date
Latest commit History 157 Commits
R_codes		R_codes
data		data
output_data_frame_predictions		output_data_frame_predictions
paper_draft		paper_draft
paper_manuscript_files		paper_manuscript_files
.Rhistory		.Rhistory
.gitattributes		.gitattributes
.gitignore		.gitignore
Hierarchical_foliar_trait_estimation.Rproj		Hierarchical_foliar_trait_estimation.Rproj
LICENSE		LICENSE
README.html		README.html
README.md		README.md
data_test_Carotenoid_Area.csv		data_test_Carotenoid_Area.csv
data_train_Carotenoid_Area.csv		data_train_Carotenoid_Area.csv
file_for_getting_intital_values_of_parameters_from_past_brms_fits.R		file_for_getting_intital_values_of_parameters_from_past_brms_fits.R
methods_in_ecology_and_evolution_title_page_for_submission.docx		methods_in_ecology_and_evolution_title_page_for_submission.docx
out.csv		out.csv
out2.csv		out2.csv
paper_manuscript.odt		paper_manuscript.odt
paper_manuscript.qmd		paper_manuscript.qmd
paper_trash.qmd		paper_trash.qmd
prediction_file.csv		prediction_file.csv
prediction_files.csv		prediction_files.csv
references.bib		references.bib
supplementary_file.qmd		supplementary_file.qmd
test.csv		test.csv

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Hierarchical_foliar_trait_estimation

Paper

Getting and Cleaning ECOSIS data

Getting ECOSIS data

Cleaning ECOSIS data

Supporting_R_functions

Extra data cleaning for some datasets

Where data are kept

Regression Analysis

Bayesian Analysis

About

Uh oh!

Releases

Packages

Uh oh!

Contributors 2

Uh oh!

Languages

License

DhruvaKathuria/Hierarchical_foliar_trait_estimation

Folders and files

Latest commit

History

Repository files navigation

Hierarchical_foliar_trait_estimation

Paper

Getting and Cleaning ECOSIS data

Getting ECOSIS data

Cleaning ECOSIS data

Supporting_R_functions

Extra data cleaning for some datasets

Where data are kept

Regression Analysis

Bayesian Analysis

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors 2

Uh oh!

Languages

Packages