Update base image #1484

Open: wants to merge 1 commit into base: main
29 changes: 12 additions & 17 deletions Dockerfile.tmpl
@@ -7,6 +7,10 @@ FROM gcr.io/kaggle-images/python-lightgbm-whl:${BASE_IMAGE_TAG}-${LIGHTGBM_VERSI
{{ end }}
FROM ${BASE_IMAGE}:${BASE_IMAGE_TAG}

#b/415358342: UV reports missing requirements files https://github.com/googlecolab/colabtools/issues/5237
ENV UV_CONSTRAINT= \
UV_BUILD_CONSTRAINT=

ADD kaggle_requirements.txt /kaggle_requirements.txt

# Freeze existing requirements from base image for critical packages:
@@ -29,39 +33,30 @@ ENV PATH="~/.local/bin:${PATH}"
RUN uv pip uninstall --system google-cloud-bigquery-storage

# b/394382016: sigstore (dependency of kagglehub) requires prerelease packages, so it is installed separately.
RUN uv pip install --system --force-reinstall --prerelease=allow kagglehub[pandas-datasets,hf-datasets,signing]>=0.3.9

# b/408284143: google-cloud-automl 2.0.0 introduced incompatible API changes, need to pin to 1.0.1
# google-cloud-automl 2.0.0 introduced incompatible API changes, need to pin to 1.0.1
RUN uv pip install --system --force-reinstall --prerelease=allow kagglehub[pandas-datasets,hf-datasets,signing]>=0.3.12 \
google-cloud-automl==1.0.1

# b/408284435: Keras 3.6 broke test_keras.py > test_train > keras.datasets.mnist.load_data()
# See https://github.com/keras-team/keras/commit/dcefb139863505d166dd1325066f329b3033d45a
# Colab base is on Keras 3.8, we have to install the package separately
RUN uv pip install --system google-cloud-automl==1.0.1 google-cloud-aiplatform google-cloud-translate==3.12.1 \
google-cloud-videointelligence google-cloud-vision google-genai "keras<3.6"
RUN uv pip install --system "keras<3.6"

# uv cannot install this in requirements.txt without --no-build-isolation
# to avoid affecting the larger build, we'll post-install it.
RUN uv pip install --no-build-isolation --system "git+https://github.com/Kaggle/learntools"

# b/408281617: Torch is adamant that it can not install cudnn 9.3.x, only 9.1.x, but Tensorflow can only support 9.3.x.
# This conflict causes a number of package downgrades, which are handled in this command
# b/302136621: Fix eli5 import for learntools
RUN uv pip install --system --force-reinstall --extra-index-url https://pypi.nvidia.com "cuml-cu12==25.2.1" \
"nvidia-cudnn-cu12==9.3.0.75" scipy tsfresh scikit-learn==1.2.2 category-encoders eli5

RUN uv pip install --system --force-reinstall "pynvjitlink-cu12==0.5.2"

# b/385145217 Latest Colab lacks mkl numpy, install it.
RUN uv pip install --system --force-reinstall -i https://pypi.anaconda.org/intel/simple numpy

# newer daal4py requires tbb>=2022, but libpysal is downgrading it for some reason
RUN uv pip install --system "tbb>=2022" "libpysal==4.9.2"

# b/404590350: Ray and torchtune have conflicting tune cli, we will prioritize torchtune.
RUN uv pip install --system --force-reinstall --no-deps torchtune
# b/415358158: Gensim removed from Colab image to upgrade scipy
RUN uv pip install --system --force-reinstall --no-deps torchtune gensim

# Adding non-package dependencies:

ADD clean-layer.sh /tmp/clean-layer.sh
ADD patches/nbconvert-extensions.tpl /opt/kaggle/nbconvert-extensions.tpl
ADD patches/template_conf.json /opt/kaggle/conf.json
@@ -181,13 +176,13 @@ RUN mkdir -p /root/.jupyter && touch /root/.jupyter/jupyter_nbconvert_config.py
mkdir -p /etc/ipython/ && echo "c = get_config(); c.IPKernelApp.matplotlib = 'inline'" > /etc/ipython/ipython_config.py && \
/tmp/clean-layer.sh

# Fix to import bq_helper library without downgrading setuptools
# Fix to import bq_helper library without downgrading setuptools and upgrading protobuf
RUN mkdir -p ~/src && git clone https://github.com/SohierDane/BigQuery_Helper ~/src/BigQuery_Helper && \
mkdir -p ~/src/BigQuery_Helper/bq_helper && \
mv ~/src/BigQuery_Helper/bq_helper.py ~/src/BigQuery_Helper/bq_helper/__init__.py && \
mv ~/src/BigQuery_Helper/test_helper.py ~/src/BigQuery_Helper/bq_helper/ && \
sed -i 's/)/packages=["bq_helper"])/g' ~/src/BigQuery_Helper/setup.py && \
uv pip install --system -e ~/src/BigQuery_Helper && \
uv pip install --system -e ~/src/BigQuery_Helper "protobuf<3.21" && \
/tmp/clean-layer.sh


2 changes: 1 addition & 1 deletion config.txt
@@ -1,5 +1,5 @@
BASE_IMAGE=us-docker.pkg.dev/colab-images/public/runtime
BASE_IMAGE_TAG=release-colab_20250219-060225_RC01
BASE_IMAGE_TAG=release-colab_20250404-060113_RC00
LIGHTGBM_VERSION=4.6.0
CUDA_MAJOR_VERSION=12
CUDA_MINOR_VERSION=5
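
Note on the tag bump above: the Dockerfile template's `FROM ${BASE_IMAGE}:${BASE_IMAGE_TAG}` line suggests the two values in config.txt are joined into a single image reference. The diff does not show how the build actually reads config.txt, so the following is only a minimal sketch under that assumption; `parse_config` and `base_image_ref` are hypothetical helpers, not part of this repository.

```python
# Sketch only: assumes config.txt holds simple KEY=VALUE lines and that the
# build joins BASE_IMAGE and BASE_IMAGE_TAG as "<image>:<tag>".

CONFIG_TEXT = """\
BASE_IMAGE=us-docker.pkg.dev/colab-images/public/runtime
BASE_IMAGE_TAG=release-colab_20250404-060113_RC00
LIGHTGBM_VERSION=4.6.0
CUDA_MAJOR_VERSION=12
CUDA_MINOR_VERSION=5
"""


def parse_config(text):
    """Parse KEY=VALUE lines into a dict, skipping blanks and comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def base_image_ref(config):
    """Hypothetical helper: build the reference used by the FROM line."""
    return f"{config['BASE_IMAGE']}:{config['BASE_IMAGE_TAG']}"


if __name__ == "__main__":
    print(base_image_ref(parse_config(CONFIG_TEXT)))
```

Running it against the new config values prints the image reference that the `FROM` line would resolve to, which can be handy for checking that the tag exists before kicking off a full build.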
14 changes: 14 additions & 0 deletions kaggle_requirements.txt
@@ -20,6 +20,7 @@ arrow
bayesian-optimization
boto3
catboost
category-encoders
cesium
comm
cytoolz
@@ -32,6 +33,8 @@ deap
dipy
docker
easyocr
# b/302136621: Fix eli5 import for learntools
eli5
emoji
fastcore>=1.7.20
fasttext
@@ -42,6 +45,13 @@ fuzzywuzzy
geojson
# geopandas > v0.14.4 breaks learn tools
geopandas==v0.14.4
gensim
google-cloud-aiplatform
# b/315753846: Unpin translate package.
google-cloud-translate==3.12.1
google-cloud-videointelligence
google-cloud-vision
google-genai
gpxpy
h2o
haversine
@@ -112,12 +122,16 @@ qtconsole
ray
rgf-python
s3fs
# b/302136621: Fix eli5 import for learntools
scikit-learn==1.2.2
# Scikit-learn accelerated library for x86
scikit-learn-intelex>=2023.0.1
scikit-multilearn
scikit-optimize
scikit-plot
scikit-surprise
# b/415358158: Gensim removed from Colab image to upgrade scipy to 1.14.1
scipy==1.15.1
# Also pinning seaborn for learntools
seaborn==0.12.2
git+https://github.com/facebookresearch/segment-anything.git
4 changes: 3 additions & 1 deletion tests/test_automl.py
@@ -8,7 +8,9 @@

def _make_credentials():
import google.auth.credentials
return Mock(spec=google.auth.credentials.Credentials)
credentials = Mock(spec=google.auth.credentials.Credentials)
credentials.universe_domain = 'googleapis.com'
return credentials

class TestAutoMl(unittest.TestCase):

4 changes: 3 additions & 1 deletion tests/test_gcs.py
@@ -8,7 +8,9 @@

def _make_credentials():
import google.auth.credentials
return Mock(spec=google.auth.credentials.Credentials)
credentials = Mock(spec=google.auth.credentials.Credentials)
credentials.universe_domain = 'googleapis.com'
return credentials

class TestStorage(unittest.TestCase):
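
Note on the `_make_credentials` change in both test files: a `Mock(spec=...)` returns another `Mock` for any attribute that exists on the spec, so `credentials.universe_domain` would not compare equal to a plain string unless it is set explicitly. The PR does not state the motivation; presumably the updated Google client libraries compare this attribute against `'googleapis.com'`. The sketch below illustrates the behaviour with a hypothetical stand-in `Credentials` class and a hypothetical `universe_matches` check, so it runs without `google-auth` installed.

```python
from unittest.mock import Mock


class Credentials:
    """Hypothetical stand-in for google.auth.credentials.Credentials."""

    @property
    def universe_domain(self):
        return "googleapis.com"


def make_credentials_before():
    # Mirrors the old helper: attributes found on the spec are child Mocks.
    return Mock(spec=Credentials)


def make_credentials_after():
    # Mirrors the new helper: the attribute is pinned to a real string.
    credentials = Mock(spec=Credentials)
    credentials.universe_domain = "googleapis.com"
    return credentials


def universe_matches(credentials, expected="googleapis.com"):
    # Hypothetical check; an assumption about what a client might do.
    return credentials.universe_domain == expected


if __name__ == "__main__":
    print(universe_matches(make_credentials_before()))
    print(universe_matches(make_credentials_after()))
```

With the old helper the comparison is between a `Mock` and a string, which is false; with the new helper it is a plain string comparison.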