Fix seed type & feature importance column #46

Merged 7 commits on Feb 26, 2025
6 changes: 1 addition & 5 deletions .github/workflows/build.yml
@@ -9,19 +9,15 @@ jobs:
   build:
     name: build
     runs-on: ubuntu-latest
-    strategy:
-      matrix:
-        python-version: ["3.11"]

     steps:
       - uses: actions/checkout@v3
         with:
           persist-credentials: false
           fetch-depth: 0
-      - uses: conda-incubator/setup-miniconda@v2
+      - uses: conda-incubator/setup-miniconda@v3
         with:
           python-version: 3.11
-          miniforge-variant: Mambaforge
           miniforge-version: latest
           activate-environment: github-actions
           environment-file: workflow/envs/github-actions.yml
24 changes: 12 additions & 12 deletions Dockerfile
@@ -1,8 +1,8 @@
-FROM condaforge/mambaforge:latest
+FROM condaforge/miniforge3:latest
 LABEL io.github.snakemake.containerized="true"
-LABEL io.github.snakemake.conda_env_hash="6aa289536136aae2d34bac6dce9ce47d037da888ed09e2c8ada989c90ef10658"
+LABEL io.github.snakemake.conda_env_hash="c1edb6a4917211d511a661c54768e671f7f067b1ea473011e8fdcbc485178d2c"

-# Step 1: Retrieve conda environments
+# Step 2: Retrieve conda environments

 # Conda environment:
 # source: workflow/envs/graphviz.yml
@@ -17,7 +17,7 @@ COPY workflow/envs/graphviz.yml /conda-envs/b42323b0ffd5d034544511c9db1bdead/env

 # Conda environment:
 # source: workflow/envs/mikropml.yml
-# prefix: /conda-envs/3f83a46ff5ea715a12fde6ee46136b0b
+# prefix: /conda-envs/e7c23e20e8aab7662ae81be2ad57d998
 # name: mikropml
 # channels:
 # - conda-forge
@@ -30,15 +30,15 @@ COPY workflow/envs/graphviz.yml /conda-envs/b42323b0ffd5d034544511c9db1bdead/env
 # - r-future
 # - r-future.apply
 # - r-import
-# - r-mikropml>=1.5.0
+# - r-mikropml>=1.6.0
 # - r-patchwork
 # - r-rmarkdown
 # - r-rpart
 # - r-purrr
 # - r-schtools>=0.4.0
 # - r-tidyverse
-RUN mkdir -p /conda-envs/3f83a46ff5ea715a12fde6ee46136b0b
-COPY workflow/envs/mikropml.yml /conda-envs/3f83a46ff5ea715a12fde6ee46136b0b/environment.yaml
+RUN mkdir -p /conda-envs/e7c23e20e8aab7662ae81be2ad57d998
+COPY workflow/envs/mikropml.yml /conda-envs/e7c23e20e8aab7662ae81be2ad57d998/environment.yaml

 # Conda environment:
 # source: workflow/envs/smk.yml
@@ -54,9 +54,9 @@ COPY workflow/envs/mikropml.yml /conda-envs/3f83a46ff5ea715a12fde6ee46136b0b/env
 RUN mkdir -p /conda-envs/457b7b75191d44b96e5086432876e333
 COPY workflow/envs/smk.yml /conda-envs/457b7b75191d44b96e5086432876e333/environment.yaml

-# Step 2: Generate conda environments
+# Step 3: Generate conda environments

-RUN mamba env create --prefix /conda-envs/b42323b0ffd5d034544511c9db1bdead --file /conda-envs/b42323b0ffd5d034544511c9db1bdead/environment.yaml && \
-    mamba env create --prefix /conda-envs/3f83a46ff5ea715a12fde6ee46136b0b --file /conda-envs/3f83a46ff5ea715a12fde6ee46136b0b/environment.yaml && \
-    mamba env create --prefix /conda-envs/457b7b75191d44b96e5086432876e333 --file /conda-envs/457b7b75191d44b96e5086432876e333/environment.yaml && \
-    mamba clean --all -y
+RUN conda env create --prefix /conda-envs/b42323b0ffd5d034544511c9db1bdead --file /conda-envs/b42323b0ffd5d034544511c9db1bdead/environment.yaml && \
+    conda env create --prefix /conda-envs/e7c23e20e8aab7662ae81be2ad57d998 --file /conda-envs/e7c23e20e8aab7662ae81be2ad57d998/environment.yaml && \
+    conda env create --prefix /conda-envs/457b7b75191d44b96e5086432876e333 --file /conda-envs/457b7b75191d44b96e5086432876e333/environment.yaml && \
+    conda clean --all -y
2 changes: 1 addition & 1 deletion workflow/envs/mikropml.yml
@@ -10,7 +10,7 @@ dependencies:
   - r-future
   - r-future.apply
   - r-import
-  - r-mikropml>=1.5.0
+  - r-mikropml>=1.6.0
   - r-patchwork
   - r-rmarkdown
   - r-rpart
3 changes: 2 additions & 1 deletion workflow/scripts/find_feature_importance.R
@@ -36,7 +36,8 @@ feat_imp <- mikropml::get_feature_importance(
   seed = seed,
 )

-wildcards <- schtools::get_wildcards_tbl()
+wildcards <- schtools::get_wildcards_tbl() %>%
+  dplyr::mutate(seed = as.numeric(seed))

 readr::write_csv(
   feat_imp %>%
6 changes: 3 additions & 3 deletions workflow/scripts/plot_feature_importance.R
@@ -7,13 +7,13 @@ feat_df <- readr::read_csv(snakemake@input[["csv"]])
 top_n <- as.numeric(snakemake@params[["top_n"]])

 top_feats <- feat_df %>%
-  group_by(method, names) %>%
+  group_by(method, feat) %>%
   summarize(median_diff = median(perf_metric_diff)) %>%
   slice_max(order_by = median_diff, n = top_n)

 feat_plot <- feat_df %>%
-  right_join(top_feats, by = c("method", "names")) %>%
-  mutate(features = factor(names, levels = unique(top_feats$names))) %>%
+  right_join(top_feats, by = c("method", "feat")) %>%
+  mutate(features = factor(feat, levels = unique(top_feats$feat))) %>%
   ggplot(aes(x = perf_metric_diff, y = features, color = method)) +
   geom_boxplot() +
   facet_wrap(~method) +
3 changes: 2 additions & 1 deletion workflow/scripts/train_ml.R
@@ -18,7 +18,8 @@ ml_results <- mikropml::run_ml(
   hyperparameters = hyperparams
 )

-wildcards <- schtools::get_wildcards_tbl()
+wildcards <- schtools::get_wildcards_tbl() %>%
+  dplyr::mutate(seed = as.numeric(seed))

 readr::write_csv(
   ml_results$performance %>%
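
Note on the seed cast in train_ml.R and find_feature_importance.R: the diff does not state the motivation, so the following is only a sketch under assumptions. Snakemake wildcards are strings, so a wildcards table would plausibly carry seed as character, while model results carry it as a number. If the two tables are later joined on seed, dplyr refuses to join a character column to a numeric one. The tibbles and the AUC value below are hypothetical placeholders, not output from this workflow.

```r
library(dplyr)

# Hypothetical stand-ins: wildcard values as strings, results with a numeric seed.
wildcards <- tibble(method = "glmnet", seed = "100")
performance <- tibble(method = "glmnet", seed = 100, AUC = 0.5)  # placeholder value

# Joining a numeric seed to a character seed raises an error in dplyr.
mismatch <- tryCatch(
  inner_join(performance, wildcards, by = c("method", "seed")),
  error = function(e) "incompatible types"
)

# Casting the wildcard seed to numeric first lets the join succeed.
fixed <- wildcards %>%
  mutate(seed = as.numeric(seed))
joined <- inner_join(performance, fixed, by = c("method", "seed"))
```

Casting on the wildcards side, as the diff does, keeps seed numeric in the written CSV, which matches what downstream numeric handling would expect.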