justinchuby/onnx-safetensors

onnx-safetensors

ONNX extension for saving to and loading from safetensors 🤗.

Features

  • ✅ Load and save ONNX weights from and to safetensors
  • ✅ Support all ONNX data types, including float8, float4, and 4-bit ints
  • ✅ Allow ONNX backends (including ONNX Runtime) to use safetensors

Install

pip install --upgrade onnx-safetensors

Usage

Load tensors to an ONNX model

Tip

You can use safetensors as external data for ONNX.

import os
import onnx
import onnx_safetensors

# Provide your ONNX model here
model: onnx.ModelProto

tensor_file = "path/to/onnx_model/model.safetensors"
base_dir = "path/to/onnx_model"
data_path = "model.safetensors"

# Apply weights from the safetensors file to the model, turning them into in-memory tensors
# NOTE: If the model grows beyond 2GB, offload the weights with onnx_safetensors.save_file, or use onnx.save with external data options, to keep the ONNX model valid
model = onnx_safetensors.load_file(model, tensor_file)

# If you want to use the safetensors file in ONNX Runtime:
# Use safetensors as external data in the ONNX model
model_with_external_data = onnx_safetensors.load_file_as_external_data(model, data_path, base_dir=base_dir)

# Save the modified model
# This model is a valid ONNX model using external data from the safetensors file
onnx.save(model_with_external_data, os.path.join(base_dir, "model_using_safetensors.onnx"))

Save weights to a safetensors file

import os
import onnx
import onnx_safetensors

# Provide your ONNX model here
model: onnx.ModelProto
base_dir = "path/to/onnx_model"
data_path = "model.safetensors"

# Offload weights from ONNX model to safetensors file without changing the model
onnx_safetensors.save_file(model, data_path, base_dir=base_dir, replace_data=False)  # Generates model.safetensors

# If you want to use the safetensors file in ONNX Runtime:
# Offload weights from ONNX model to safetensors file and use it as external data for the model by setting replace_data=True
model_with_external_data = onnx_safetensors.save_file(model, data_path, base_dir=base_dir, replace_data=True)

# Save the modified model
# This model is a valid ONNX model using external data from the safetensors file
onnx.save(model_with_external_data, os.path.join(base_dir, "model_using_safetensors.onnx"))

Save an ONNX model with safetensors weights

The save_model function is a convenient way to save both the ONNX model and its weights to separate files:

import onnx
import onnx_safetensors

# Provide your ONNX model here
model: onnx.ModelProto

# Save model and weights in one step
# This creates model.onnx and model.safetensors
onnx_safetensors.save_model(model, "model.onnx")

# You can also specify a custom name for the weights file
onnx_safetensors.save_model(model, "model.onnx", external_data="weights.safetensors")

Shard large models

For large models, you can automatically shard the weights across multiple safetensors files:

import onnx
import onnx_safetensors

# Provide your ONNX model here
model: onnx.ModelProto

# Shard the model into multiple files (e.g., 5GB per shard)
# This creates:
# - model.onnx
# - model-00001-of-00003.safetensors
# - model-00002-of-00003.safetensors
# - model-00003-of-00003.safetensors
# - model.safetensors.index.json (index file mapping tensors to shards)
onnx_safetensors.save_model(model, "model.onnx", max_shard_size="5GB")

# You can also use save_file with sharding
onnx_safetensors.save_file(
    model,
    "weights.safetensors",
    base_dir="path/to/save",
    max_shard_size="5GB"
)

The sharding format is compatible with the Hugging Face transformers library, making it easy to share and load models across different frameworks.
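For reference, the index file follows the Hugging Face sharded-checkpoint layout: a JSON object whose weight_map maps each tensor name to the shard file that stores it. The tensor names and shard count below are made up for illustration; a minimal sketch of resolving a tensor to its shard:

```python
import json

# A made-up index in the Hugging Face sharded-checkpoint layout.
# "weight_map" maps each tensor name to the shard file that stores it.
index_json = """
{
  "metadata": {"total_size": 10737418240},
  "weight_map": {
    "encoder.weight": "model-00001-of-00003.safetensors",
    "decoder.weight": "model-00003-of-00003.safetensors"
  }
}
"""

index = json.loads(index_json)

def shard_for(tensor_name: str) -> str:
    """Return the shard file that holds the given tensor."""
    return index["weight_map"][tensor_name]

print(shard_for("decoder.weight"))  # model-00003-of-00003.safetensors
```

Because the layout matches what transformers writes, tools that already understand `model.safetensors.index.json` can locate tensors in the shards without any ONNX-specific logic.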

Embed ONNX model in a safetensors file

For storage or transfer purposes, you can embed an entire ONNX model (structure and weights) into a single safetensors file:

import onnx
import onnx_safetensors

# Provide your ONNX model here
model: onnx.ModelProto

# Save the entire model (structure + weights) into a safetensors file
onnx_safetensors.save_safetensors_model(model, "model.safetensors")

# Later, extract the model from the safetensors file
model = onnx_safetensors.extract_safetensors_model("model.safetensors")

# Or extract and save to an ONNX file that references the safetensors file as external data
onnx_safetensors.extract_safetensors_model(
    "model.safetensors",
    output_path="model.onnx"
)

Note

This format is for storage/transfer only and is not compatible with ONNX Runtime. Use extract_safetensors_model with output_path to create a runnable ONNX model that references the safetensors file as external data.

Command Line Interface

onnx-safetensors provides a command-line interface for converting ONNX models to use the safetensors format:

# Basic conversion
onnx-safetensors convert input.onnx output.onnx

# Convert with sharding (split large models into multiple files)
onnx-safetensors convert input.onnx output.onnx --max-shard-size 5GB

# You can also specify size in MB
onnx-safetensors convert input.onnx output.onnx --max-shard-size 500MB

# Embed an ONNX model into a safetensors file
onnx-safetensors embed input.onnx output.safetensors

The convert command:

  • Loads an ONNX model from the input path
  • Saves it with safetensors external data to the output path
  • Optionally shards large models using --max-shard-size
  • Creates index files automatically when sharding is enabled
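Size strings like "5GB" and "500MB" follow the convention popularized by Hugging Face transformers, where units are decimal (1GB = 10**9 bytes). The helper below is a hypothetical sketch of that convention, not the library's actual parser:

```python
import re

# Hypothetical parser for size strings like "5GB" or "500MB",
# assuming decimal units (1KB = 10**3 bytes), as in Hugging Face transformers.
_UNITS = {"KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12}

def parse_size(size: str) -> int:
    """Convert a human-readable size string into a byte count."""
    match = re.fullmatch(r"(\d+)\s*(KB|MB|GB|TB)", size.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized size string: {size!r}")
    value, unit = match.groups()
    return int(value) * _UNITS[unit]

print(parse_size("5GB"))   # 5000000000
print(parse_size("500MB")) # 500000000
```

Each shard is filled up to this byte budget before a new shard file is started, so a 12GB model with a 5GB limit would land in three shards.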

The embed command:

  • Loads an ONNX model from the input path
  • Embeds the entire model (structure and weights) into a single safetensors file
  • Useful for storage or transfer purposes
  • Use onnx_safetensors.extract_safetensors_model in Python to extract the model later
