Convert FireRedASR-AED to ONNX format with batch inference support, to accelerate inference while maintaining the original ASR performance.
- Create and activate the conda environment
  ```shell
  conda create -n asr_export python=3.12
  conda activate asr_export
  ```
- Install dependencies
  ```shell
  pip install -r requirements.txt
  ```
  Note: onnxruntime-gpu 1.22.0 requires glibc >= 2.27.
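Since onnxruntime-gpu 1.22.0 requires glibc >= 2.27, it can help to check the host's glibc before installing. A minimal standard-library sketch (the helper name is ours, not part of this repo):

```python
import platform

def meets_glibc_requirement(version_str: str, required=(2, 27)) -> bool:
    """Compare a dotted glibc version string against a required (major, minor)."""
    parts = tuple(int(p) for p in version_str.split(".")[:2])
    return parts >= required

# platform.libc_ver() reports e.g. ("glibc", "2.31") on most Linux systems
libc_name, libc_version = platform.libc_ver()
if libc_name == "glibc":
    print("glibc >= 2.27:", meets_glibc_requirement(libc_version))
else:
    print("Not glibc (e.g. musl or non-Linux); check manually.")
```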
- Download or prepare the FireRedASR-AED weights, e.g.,
  ```shell
  huggingface-cli download FireRedTeam/FireRedASR-AED-L --local-dir ./weights/FireRedASR-AED-L
  ```
- Export FireRedASR-AED to ONNX (saved to `onnx_folder_path`)
  ```shell
  python Export_FireRedASR_AED_Batch.py --model_path ./weights/FireRedASR-AED-L --project_path ./FireRedASR --onnx_folder_path ./onnx_model
  ```
- (Optional, with limited improvement) Optimize the exported ONNX models with ONNXSlim (saved to `./onnx_model` by default)
  ```shell
  python Optim_FireRedASR_AED_ONNX_Batch.py --input onnx_model --output onnx_slim
  ```
- Inference with CUDA
  ```shell
  python Inference_FireRedASR_AED_ONNX_Batch.py --model_path ./weights/FireRedASR-AED-L --project_path ./FireRedASR --onnx_folder_path ./onnx_slim --batch_size 4
  ```
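Batch inference requires padding variable-length utterances to a common length before feeding them to the exported encoder. The sketch below illustrates the padding idea with plain Python lists; the function name and scalar "features" are ours for illustration, not this repo's API, which additionally tracks true lengths so padded frames can be masked out:

```python
def pad_batch(feature_seqs, pad_value=0.0):
    """Pad variable-length feature sequences to the longest one in the batch.

    feature_seqs: list of per-utterance frame lists.
    Returns (padded_batch, lengths); lengths records each utterance's true
    frame count so padding can be masked during attention.
    """
    lengths = [len(seq) for seq in feature_seqs]
    max_len = max(lengths)
    padded = [seq + [pad_value] * (max_len - len(seq)) for seq in feature_seqs]
    return padded, lengths

# Three "utterances" of 2, 4, and 3 frames (scalar features for brevity)
batch, lens = pad_batch([[1.0, 2.0], [1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])
print(lens)           # → [2, 4, 3]
print(len(batch[0]))  # → 4 (every row padded to the max length)
```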