System Info
transformers==4.51.3
Python version: 3.11
Who can help?
@zach-huggingface @SunMarc @MekkCyber
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
When using SFTTrainer with BitsAndBytes and the TensorBoard integration, the TrainingArguments are serialized to JSON for logging, which fails with:
```
[rank0]: Traceback (most recent call last):
[rank0]: main({'model_name_or_path': 'meta-llama/Llama-4-Scout-17B-16E-Instruct', 'model_revision': 'main', 'torch_dtype': 'bfloat16', 'attn_implementation': 'flex_attention', 'use_liger': False, 'use_peft': False, 'lora_r': 16, 'lora_alpha': 8, 'lora_dropout': 0.05, 'lora_target_modules': ['q_proj', 'v_proj', 'k_proj', 'o_proj', 'gate_proj', 'up_proj', 'down_proj'], 'lora_modules_to_save': [], 'load_in_4bit': False, 'load_in_8bit': True, 'dataset_name': 'gsm8k', 'dataset_config': 'main', 'dataset_train_split': 'train', 'dataset_test_split': 'test', 'dataset_text_field': 'text', 'dataset_kwargs': {'add_special_tokens': False, 'append_concat_token': False}, 'max_seq_length': 512, 'dataset_batch_size': 1000, 'packing': False, 'num_train_epochs': 10, 'per_device_train_batch_size': 1, 'per_device_eval_batch_size': 1, 'auto_find_batch_size': False, 'eval_strategy': 'epoch', 'bf16': True, 'tf32': False, 'learning_rate': 0.0002, 'warmup_steps': 10, 'lr_scheduler_type': 'inverse_sqrt', 'optim': 'adamw_torch_fused', 'max_grad_norm': 1.0, 'seed': 42, 'gradient_accumulation_steps': 1, 'gradient_checkpointing': False, 'gradient_checkpointing_kwargs': {'use_reentrant': False}, 'fsdp': 'full_shard auto_wrap', 'fsdp_config': {'activation_checkpointing': True, 'cpu_ram_efficient_loading': False, 'sync_module_states': True, 'use_orig_params': True, 'limit_all_gathers': False}, 'save_strategy': 'epoch', 'save_total_limit': 1, 'resume_from_checkpoint': False, 'log_level': 'info', 'logging_strategy': 'steps', 'logging_steps': 1, 'report_to': ['tensorboard'], 'output_dir': '/mnt/shared/Llama-4-Scout-17B-16E-Instruct'})
[rank0]: File "/tmp/tmp.jsNRcydokN/ephemeral_script.py", line 126, in main
[rank0]: trainer.train(resume_from_checkpoint=checkpoint)
[rank0]: File "/opt/app-root/lib64/python3.11/site-packages/transformers/trainer.py", line 2238, in train
[rank0]: return inner_training_loop(
[rank0]: ^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/opt/app-root/lib64/python3.11/site-packages/transformers/trainer.py", line 2462, in _inner_training_loop
[rank0]: self.control = self.callback_handler.on_train_begin(args, self.state, self.control)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/opt/app-root/lib64/python3.11/site-packages/transformers/trainer_callback.py", line 506, in on_train_begin
[rank0]: return self.call_event("on_train_begin", args, state, control)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/opt/app-root/lib64/python3.11/site-packages/transformers/trainer_callback.py", line 556, in call_event
[rank0]: result = getattr(callback, event)(
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/opt/app-root/lib64/python3.11/site-packages/transformers/integrations/integration_utils.py", line 698, in on_train_begin
[rank0]: self.tb_writer.add_text("args", args.to_json_string())
[rank0]: ^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/opt/app-root/lib64/python3.11/site-packages/transformers/training_args.py", line 2509, in to_json_string
[rank0]: return json.dumps(self.to_dict(), indent=2)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/usr/lib64/python3.11/json/__init__.py", line 238, in dumps
[rank0]: **kw).encode(obj)
[rank0]: ^^^^^^^^^^^
[rank0]: File "/usr/lib64/python3.11/json/encoder.py", line 202, in encode
[rank0]: chunks = list(chunks)
[rank0]: ^^^^^^^^^^^^
[rank0]: File "/usr/lib64/python3.11/json/encoder.py", line 432, in _iterencode
[rank0]: yield from _iterencode_dict(o, _current_indent_level)
[rank0]: File "/usr/lib64/python3.11/json/encoder.py", line 406, in _iterencode_dict
[rank0]: yield from chunks
[rank0]: File "/usr/lib64/python3.11/json/encoder.py", line 406, in _iterencode_dict
[rank0]: yield from chunks
[rank0]: File "/usr/lib64/python3.11/json/encoder.py", line 439, in _iterencode
[rank0]: o = _default(o)
[rank0]: ^^^^^^^^^^^
[rank0]: File "/usr/lib64/python3.11/json/encoder.py", line 180, in default
[rank0]: raise TypeError(f'Object of type {o.__class__.__name__} '
[rank0]: TypeError: Object of type BitsAndBytesConfig is not JSON serializable
```
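The failure mode can be reproduced without transformers at all: `json.dumps` raises this exact `TypeError` whenever a dict value is an arbitrary object with no JSON encoding. Below is a minimal sketch using a hypothetical `QuantConfig` class as a stand-in for `BitsAndBytesConfig` (the real config lives in transformers; only the shape of the failure is reproduced here):

```python
import json

class QuantConfig:
    """Stand-in for BitsAndBytesConfig: a plain object json cannot encode."""
    def __init__(self, load_in_8bit=True):
        self.load_in_8bit = load_in_8bit

# TrainingArguments.to_dict() leaves the config object embedded in the dict,
# so json.dumps hits it and falls through to the default encoder, which raises.
args_dict = {"output_dir": "/tmp/out", "quantization_config": QuantConfig()}

try:
    json.dumps(args_dict, indent=2)
except TypeError as e:
    print(e)  # Object of type QuantConfig is not JSON serializable
```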
Expected behavior
The BitsAndBytesConfig should be converted to a dict (it exposes a to_dict() method) before the TrainingArguments are serialized, so that to_json_string() succeeds and the TensorBoard callback can log the arguments.
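One possible shape of the fix, sketched without transformers: pass a `default=` handler to `json.dumps` that falls back to an object's own `to_dict()` when it has one. The `QuantConfig` class below is a hypothetical stand-in for `BitsAndBytesConfig`, not the real config; the point is only that a `to_dict()`-aware encoder makes the dump succeed:

```python
import json

class QuantConfig:
    """Stand-in for BitsAndBytesConfig (hypothetical), exposing to_dict()."""
    def __init__(self, load_in_8bit=True):
        self.load_in_8bit = load_in_8bit

    def to_dict(self):
        return {"load_in_8bit": self.load_in_8bit}

def json_default(obj):
    # Fall back to the object's own dict representation when available;
    # otherwise re-raise the standard TypeError.
    if hasattr(obj, "to_dict"):
        return obj.to_dict()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

args_dict = {"output_dir": "/tmp/out", "quantization_config": QuantConfig()}
print(json.dumps(args_dict, default=json_default, indent=2))
```

With this handler the nested config serializes as a plain JSON object instead of raising, which is the behavior the TensorBoard callback needs.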