Merged
Changes from 8 commits
1 change: 1 addition & 0 deletions README.md
@@ -53,6 +53,7 @@ Using CLI for fine-tuning LLMs:

## What's New

- [PR 592](https://github.com/h2oai/h2o-llmstudio/pull/592) Started deprecating RLHF in favor of DPO/IPO optimization. Training is disabled, but old experiments remain viewable. RLHF will be fully removed in a future release.
Review comment (Contributor): Training is disabled
- [PR 530](https://github.com/h2oai/h2o-llmstudio/pull/530) Introduced a new problem type for DPO/IPO optimization, which can be used as an alternative to RLHF (a minimal sketch of both losses follows this diff).
- [PR 288](https://github.com/h2oai/h2o-llmstudio/pull/288) Introduced Deepspeed for sharded training, allowing larger models to be trained on machines with multiple GPUs. Requires NVLink. This feature replaces FSDP and offers more flexibility. Deepspeed requires a system installation of cudatoolkit; we recommend version 11.8. See [Recommended Install](#recommended-install).
- [PR 449](https://github.com/h2oai/h2o-llmstudio/pull/449) New problem type for Causal Classification Modeling allows training binary and multiclass models using LLMs.
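The DPO/IPO entries above only name the technique. For orientation, here is a minimal sketch of the DPO and IPO preference losses, assuming per-sequence log-probabilities have already been summed; the function and tensor names are illustrative assumptions and do not come from H2O LLM Studio's code.

```python
# Minimal sketch of the DPO and IPO objectives referenced by PR 530 / PR 592.
# Tensor and function names are illustrative assumptions, not the project's code.
import torch
import torch.nn.functional as F


def preference_loss(
    policy_chosen_logps: torch.Tensor,    # log p_theta(y_chosen | x), shape (batch,)
    policy_rejected_logps: torch.Tensor,  # log p_theta(y_rejected | x)
    ref_chosen_logps: torch.Tensor,       # same quantities under the frozen reference model
    ref_rejected_logps: torch.Tensor,
    beta: float = 0.1,
    loss_type: str = "dpo",
) -> torch.Tensor:
    """DPO (Rafailov et al., 2023) and IPO (Azar et al., 2023) preference losses."""
    # Log-ratio of policy vs. reference model for chosen and rejected completions.
    chosen_logratios = policy_chosen_logps - ref_chosen_logps
    rejected_logratios = policy_rejected_logps - ref_rejected_logps
    logits = chosen_logratios - rejected_logratios

    if loss_type == "dpo":
        # -log sigmoid(beta * margin): increases the margin between chosen and rejected.
        loss = -F.logsigmoid(beta * logits)
    elif loss_type == "ipo":
        # IPO regresses the margin toward 1 / (2 * beta) instead of maximizing it.
        loss = (logits - 1.0 / (2.0 * beta)) ** 2
    else:
        raise ValueError(f"Unknown loss_type: {loss_type}")
    return loss.mean()


if __name__ == "__main__":
    b = 4
    print(preference_loss(torch.randn(b), torch.randn(b), torch.randn(b), torch.randn(b)).item())
```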
2 changes: 0 additions & 2 deletions documentation/docs/tooltips/experiments/_problem-type.mdx
@@ -4,8 +4,6 @@ Defines the problem type of the experiment, which also defines the settings H2O

- DPO Modeling: Used to fine-tune large language models using Direct Preference Optimization

- Rlhf Language Modeling: Used to fine-tune RLHF language models

- Sequence To Sequence Modeling: Used to fine-tune large sequence to sequence models

- Causal Classification Modeling: Used to fine-tune causal classification models
1 change: 0 additions & 1 deletion llm_studio/app_utils/config.py
@@ -60,7 +60,6 @@ def get_size(x):
"problem_types": [
"text_causal_language_modeling_config",
"text_dpo_modeling_config",
"text_rlhf_language_modeling_config",
"text_sequence_to_sequence_modeling_config",
"text_causal_classification_modeling_config",
],
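For context, the config.py hunk removes `text_rlhf_language_modeling_config` from the list of problem-type config modules, which is what hides the RLHF problem type from the app. Below is a minimal sketch of how such a string registry might be resolved to config modules; the package path and helper name are assumptions, not the project's actual code.

```python
# Hypothetical sketch of resolving problem-type names (as listed in config.py above)
# to their config modules. Package path and helper name are assumptions.
import importlib
from types import ModuleType

PROBLEM_TYPES = [
    "text_causal_language_modeling_config",
    "text_dpo_modeling_config",
    # "text_rlhf_language_modeling_config",  # removed by this PR
    "text_sequence_to_sequence_modeling_config",
    "text_causal_classification_modeling_config",
]


def load_problem_type(name: str, package: str = "llm_studio.python_configs") -> ModuleType:
    """Import the config module registered under `name` (hypothetical package path)."""
    if name not in PROBLEM_TYPES:
        raise KeyError(f"Unknown or deprecated problem type: {name}")
    return importlib.import_module(f"{package}.{name}")
```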