ONNX extension for saving to and loading from safetensors 🤗.

- ✅ Load and save ONNX weights from and to safetensors
- ✅ Support all ONNX data types, including float8, float4 and 4-bit ints
- ✅ Allow ONNX backends (including ONNX Runtime) to use safetensors
```bash
pip install --upgrade onnx-safetensors
```

> **Tip:** You can use safetensors as external data for ONNX.
```python
import os

import onnx
import onnx_safetensors

# Provide your ONNX model here
model: onnx.ModelProto

tensor_file = "path/to/onnx_model/model.safetensors"
base_dir = "path/to/onnx_model"
data_path = "model.safetensors"

# Apply weights from the safetensors file to the model, loading them as in-memory tensors
# NOTE: If the model grows beyond 2GB, you will need to offload the weights with
# onnx_safetensors.save_file, or onnx.save with external data options, to keep the
# ONNX model valid
model = onnx_safetensors.load_file(model, tensor_file)

# If you want to use the safetensors file in ONNX Runtime:
# use safetensors as external data in the ONNX model
model_with_external_data = onnx_safetensors.load_file_as_external_data(model, data_path, base_dir=base_dir)

# Save the modified model
# This model is a valid ONNX model using external data from the safetensors file
onnx.save(model_with_external_data, os.path.join(base_dir, "model_using_safetensors.onnx"))
```

```python
import os

import onnx
import onnx_safetensors

# Provide your ONNX model here
model: onnx.ModelProto

base_dir = "path/to/onnx_model"
data_path = "model.safetensors"

# Offload weights from the ONNX model to a safetensors file without changing the model
onnx_safetensors.save_file(model, data_path, base_dir=base_dir, replace_data=False)  # Generates model.safetensors

# If you want to use the safetensors file in ONNX Runtime:
# offload weights from the ONNX model to the safetensors file and use it as external
# data for the model by setting replace_data=True
model_with_external_data = onnx_safetensors.save_file(model, data_path, base_dir=base_dir, replace_data=True)

# Save the modified model
# This model is a valid ONNX model using external data from the safetensors file
onnx.save(model_with_external_data, os.path.join(base_dir, "model_using_safetensors.onnx"))
```

The `save_model` function is a convenient way to save both the ONNX model and its weights to separate files:
```python
import onnx
import onnx_safetensors

# Provide your ONNX model here
model: onnx.ModelProto

# Save the model and its weights in one step
# This creates model.onnx and model.safetensors
onnx_safetensors.save_model(model, "model.onnx")

# You can also specify a custom name for the weights file
onnx_safetensors.save_model(model, "model.onnx", external_data="weights.safetensors")
```

For large models, you can automatically shard the weights across multiple safetensors files:
```python
import onnx
import onnx_safetensors

# Provide your ONNX model here
model: onnx.ModelProto

# Shard the model into multiple files (e.g., 5GB per shard)
# This creates:
# - model.onnx
# - model-00001-of-00003.safetensors
# - model-00002-of-00003.safetensors
# - model-00003-of-00003.safetensors
# - model.safetensors.index.json (index file mapping tensors to shards)
onnx_safetensors.save_model(model, "model.onnx", max_shard_size="5GB")

# You can also use save_file with sharding
onnx_safetensors.save_file(
    model,
    "weights.safetensors",
    base_dir="path/to/save",
    max_shard_size="5GB",
)
```

The sharding format is compatible with the Hugging Face transformers library, making it easy to share and load models across different frameworks.
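As a rough illustration of that convention, the sketch below builds and parses a hypothetical `model.safetensors.index.json`. The file names, tensor names, and sizes here are invented for the example; the `weight_map` layout follows the Hugging Face transformers sharded-checkpoint convention, where each tensor name maps to the shard file that stores it.

```python
import json
import tempfile
from collections import defaultdict
from pathlib import Path

# A hypothetical index file in the Hugging Face sharding convention:
# "weight_map" maps each tensor name to the shard file that stores it.
index = {
    "metadata": {"total_size": 10_737_418_240},
    "weight_map": {
        "encoder.weight": "model-00001-of-00002.safetensors",
        "encoder.bias": "model-00001-of-00002.safetensors",
        "decoder.weight": "model-00002-of-00002.safetensors",
    },
}

with tempfile.TemporaryDirectory() as tmp:
    index_path = Path(tmp) / "model.safetensors.index.json"
    index_path.write_text(json.dumps(index))

    # Read the index back and group tensor names by shard file,
    # which tells a loader which files it needs for which tensors
    weight_map = json.loads(index_path.read_text())["weight_map"]
    shards = defaultdict(list)
    for tensor_name, shard_file in weight_map.items():
        shards[shard_file].append(tensor_name)

    for shard_file, names in sorted(shards.items()):
        print(shard_file, names)
```

Any tool that understands this index layout can locate tensors across shards without loading every file.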
For storage or transfer purposes, you can embed an entire ONNX model (structure and weights) into a single safetensors file:
```python
import onnx
import onnx_safetensors

# Provide your ONNX model here
model: onnx.ModelProto

# Save the entire model (structure + weights) into a safetensors file
onnx_safetensors.save_safetensors_model(model, "model.safetensors")

# Later, extract the model from the safetensors file
model = onnx_safetensors.extract_safetensors_model("model.safetensors")

# Or extract and save to an ONNX file that references the safetensors file as external data
onnx_safetensors.extract_safetensors_model(
    "model.safetensors",
    output_path="model.onnx",
)
```

> **Note:** This format is for storage/transfer only and is not compatible with ONNX Runtime. Use `extract_safetensors_model` with `output_path` to create a runnable ONNX model that references the safetensors file as external data.
ONNX-safetensors provides a command-line interface for converting ONNX models to use safetensors format:
```bash
# Basic conversion
onnx-safetensors convert input.onnx output.onnx

# Convert with sharding (split large models into multiple files)
onnx-safetensors convert input.onnx output.onnx --max-shard-size 5GB

# You can also specify the size in MB
onnx-safetensors convert input.onnx output.onnx --max-shard-size 500MB

# Embed an ONNX model into a safetensors file
onnx-safetensors embed input.onnx output.safetensors
```

The `convert` command:

- Loads an ONNX model from the input path
- Saves it with safetensors external data to the output path
- Optionally shards large models using `--max-shard-size`
- Creates index files automatically when sharding is enabled

The `embed` command:

- Loads an ONNX model from the input path
- Embeds the entire model (structure and weights) into a single safetensors file
- Useful for storage or transfer purposes
- Use `onnx_safetensors.extract_safetensors_model` in Python to extract the model later
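To make the `--max-shard-size` values above concrete, here is a hypothetical helper showing one way a size string such as `5GB` or `500MB` could be interpreted as a byte count (using binary units here for illustration; the actual CLI's parsing rules and unit semantics may differ):

```python
import re

def parse_shard_size(size: str) -> int:
    """Parse a size string like '5GB' or '500MB' into a byte count.

    Hypothetical helper for illustration only; the real onnx-safetensors
    CLI may interpret size strings differently.
    """
    match = re.fullmatch(r"(\d+)\s*(KB|MB|GB)", size.strip(), re.IGNORECASE)
    if match is None:
        raise ValueError(f"Unrecognized size string: {size!r}")
    value, unit = int(match.group(1)), match.group(2).upper()
    # Binary units: 1 KB = 1024 bytes, 1 MB = 1024**2, 1 GB = 1024**3
    factor = {"KB": 1024, "MB": 1024**2, "GB": 1024**3}[unit]
    return value * factor

print(parse_shard_size("5GB"))    # 5368709120
print(parse_shard_size("500MB"))  # 524288000
```

Once a shard budget is expressed in bytes, tensors can be assigned to shards greedily until the next tensor would exceed the budget, which is the general pattern sharded-checkpoint writers follow.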