
feat: Add delay to multi turn conversations #452

Merged
debermudez merged 2 commits into main from dbermudez/aip-568-session-delay-not-added-in-multi-turn-scenarios
Nov 7, 2025

Conversation

Contributor

@debermudez debermudez commented Nov 6, 2025

Summary by CodeRabbit

Release Notes

  • New Features

    • Added configurable delays between turns in multi-turn conversations, allowing realistic simulation of reading and thinking time.
    • Delays can be controlled via mean, standard deviation, and ratio parameters.
  • Documentation

    • Updated multi-turn tutorial with detailed conversation flow, turn sequencing, and delay configuration guidance.
  • Tests

    • Added tests for turn delay configuration and zero-delay scenarios.

@github-actions

github-actions Bot commented Nov 6, 2025

Try out this PR

Quick install:

pip install --upgrade --force-reinstall git+https://github.com/ai-dynamo/aiperf.git@dbermudez/aip-568-session-delay-not-added-in-multi-turn-scenarios

Recommended with virtual environment (using uv):

uv venv --python 3.12 && source .venv/bin/activate
uv pip install --upgrade --force-reinstall git+https://github.com/ai-dynamo/aiperf.git@dbermudez/aip-568-session-delay-not-added-in-multi-turn-scenarios

@codecov

codecov Bot commented Nov 6, 2025

Codecov Report

❌ Patch coverage is 40.00000% with 6 lines in your changes missing coverage. Please review.

Files with missing lines:
  • src/aiperf/workers/worker.py: 33.33% patch coverage, 6 lines missing ⚠️


@coderabbitai

coderabbitai Bot commented Nov 6, 2025

Walkthrough

This pull request implements multi-turn conversation delays to simulate realistic reading and thinking time between turns. It adds configuration parameters (mean, stddev, ratio), applies conditional logic to prevent zero/negative delays, and integrates delay application across the dataset composer and worker execution layers with proper async handling and trace logging.

Changes

Cohort / File(s) Summary
Documentation
docs/tutorials/multi-turn.md
Introduces "Real-World Conversation Flow" subsection detailing turn sequencing (Turn 0, Turn 1, etc.) with explicit delays. Documents delay control parameters (conversation-turn-delay-mean, -stddev, -ratio) and clarifies execution flow including first-turn behavior (no delay) and subsequent-turn delays. Updates Quick Reference to align with revised flow model.
Core Implementation
src/aiperf/dataset/composer/synthetic.py, src/aiperf/workers/worker.py
Adds conditional logic in the composer to sample delays only when the turn is not the first and the configured mean is greater than 0. The worker applies per-turn delays via asyncio.sleep before sending non-first turns, converting milliseconds to seconds, adds trace logging, and moves the task stats increment to occur after the delay.
Tests
tests/composers/test_synthetic_composer.py
Adds two new tests: test_turn_delays_from_config_options validates delays from TurnDelayConfig propagate correctly and scale with ratio parameter; test_turn_delays_with_zero_mean validates zero mean results in no delays. Uses fixed RNG seed for deterministic outcomes across single-modality and multi-turn scenarios.
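The composer-side sampling described above can be sketched as follows. This is a minimal illustration, not the aiperf implementation: the function name, parameter names, and use of a Gaussian distribution are assumptions based on the mean/stddev/ratio parameters and guard conditions described in this PR.

```python
import random

MILLIS_PER_SECOND = 1000


def sample_turn_delay_ms(is_first_turn, mean_ms, stddev_ms, ratio, rng=None):
    """Return a sampled per-turn delay in milliseconds, or None when no delay applies."""
    # Guard: no delay for the first turn, or when delays are disabled (mean <= 0).
    if is_first_turn or mean_ms <= 0:
        return None
    rng = rng or random.Random()
    # Sample from a normal distribution and scale by the ratio parameter;
    # clamp at zero so negative samples never produce a negative delay.
    return max(rng.gauss(mean_ms, stddev_ms) * ratio, 0.0)
```

With this shape, the guard short-circuits before any RNG call, which matches the review note that disabled delays avoid unnecessary sampling.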

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Async handling in worker: Verify that asyncio.sleep integration is correct and doesn't introduce race conditions or deadlocks
  • Delay guard logic: Confirm conditional checks (turn_index > 0, mean > 0) are consistently applied across composer and worker
  • Test coverage: Validate that fixed RNG seed produces expected delay distributions and ratio scaling behavior
  • Documentation alignment: Ensure tutorial accurately reflects implementation behavior, especially around zero/negative delay handling

Poem

🐰 Hops through the turns with thoughtful delay,
Reading and pondering along the way,
Conversations flourish, so wonderfully real,
With configured pauses to seal the deal!

Pre-merge checks

✅ Passed checks (3 passed)
  • Description Check: ✅ Passed. Skipped because CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title accurately summarizes the main change: adding delay functionality to multi-turn conversations, reflected across documentation, synthetic composer logic, worker execution, and tests.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 100.00%, above the required 80.00% threshold.

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e5194fa and 06cf1852f7554d58abdb2f071c49784e8318ca35.

📒 Files selected for processing (4)
  • docs/tutorials/multi-turn.md (1 hunks)
  • src/aiperf/dataset/composer/synthetic.py (1 hunks)
  • src/aiperf/workers/worker.py (3 hunks)
  • tests/composers/test_synthetic_composer.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
tests/composers/test_synthetic_composer.py (4)
src/aiperf/common/config/input_config.py (1)
  • InputConfig (32-325)
src/aiperf/common/config/conversation_config.py (3)
  • ConversationConfig (113-156)
  • TurnConfig (73-110)
  • TurnDelayConfig (19-70)
src/aiperf/common/random_generator.py (2)
  • reset (467-478)
  • init (396-426)
src/aiperf/dataset/composer/synthetic.py (2)
  • SyntheticDatasetComposer (15-193)
  • create_dataset (34-57)
src/aiperf/workers/worker.py (2)
src/aiperf/common/protocols.py (2)
  • is_trace_enabled (55-55)
  • trace (63-63)
tests/utils/time_traveler.py (1)
  • sleep (37-43)
🪛 markdownlint-cli2 (0.18.1)
docs/tutorials/multi-turn.md

370-370: Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: build (macos-latest, 3.12)
  • GitHub Check: build (macos-latest, 3.10)
  • GitHub Check: build (ubuntu-latest, 3.11)
  • GitHub Check: build (ubuntu-latest, 3.10)
  • GitHub Check: integration-tests (macos-latest, 3.10)
  • GitHub Check: integration-tests (ubuntu-latest, 3.12)
  • GitHub Check: integration-tests (macos-latest, 3.11)
  • GitHub Check: integration-tests (macos-latest, 3.12)
  • GitHub Check: integration-tests (ubuntu-latest, 3.10)
  • GitHub Check: integration-tests (ubuntu-latest, 3.11)
🔇 Additional comments (8)
src/aiperf/dataset/composer/synthetic.py (1)

82-87: LGTM! Clean guard logic for delay application.

The conditional guard correctly ensures delays are only sampled and applied when:

  1. The turn is not the first turn (not is_first)
  2. The configured mean delay is positive (mean > 0)

This prevents unnecessary RNG sampling when delays are disabled and aligns with the real-world conversation flow where the first turn has no delay.

src/aiperf/workers/worker.py (3)

183-200: Excellent documentation of multi-turn delay behavior.

The docstring clearly explains the real-world conversation flow simulation, including:

  • Step-by-step turn sequence with delays
  • Configuration parameters for controlling delays
  • Clear distinction between first turn (no delay) and subsequent turns

This documentation will help maintainers understand the delay semantics.


214-223: Delay application logic is correct and well-traced.

The implementation properly:

  • Guards against applying delays to the first turn (turn_index > 0)
  • Validates delay is present and positive
  • Converts milliseconds to seconds correctly
  • Uses asyncio.sleep for async-compatible delays
  • Logs delay application at trace level for debugging

225-225: Good placement of task stats increment.

Moving the increment after the delay correctly reflects that the task hasn't truly begun until after the simulated user thinking time. This ensures timing metrics accurately capture when work actually starts.

tests/composers/test_synthetic_composer.py (2)

282-344: Comprehensive test coverage for delay configuration and scaling.

This test effectively validates:

  • Delays are configured via TurnDelayConfig parameters
  • First turn has no delay (None)
  • Subsequent turns have positive delays with expected variance
  • Ratio scaling correctly adjusts delay values (tested with 1.0 and 0.5)
  • Reproducibility with fixed RNG seed

The assertions use reasonable ranges (1000-4000ms and 500-2000ms) that account for stddev=500.
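The ratio-scaling and fixed-seed determinism being tested can be illustrated like this. The helper is a hypothetical stand-in for the composer, not the actual test code; it only demonstrates that a fixed seed makes delays reproducible and that halving the ratio halves every sampled delay.

```python
import random


def sample_delays(num_turns, mean_ms, stddev_ms, ratio, seed):
    """Sample per-turn delays; the first turn always gets None."""
    rng = random.Random(seed)
    delays = [None]  # Turn 0: no delay.
    for _ in range(num_turns - 1):
        delays.append(max(rng.gauss(mean_ms, stddev_ms) * ratio, 0.0))
    return delays


# Same seed, different ratios: the underlying Gaussian samples are identical,
# so each delay with ratio=0.5 is exactly half its ratio=1.0 counterpart.
full = sample_delays(4, 2000, 500, ratio=1.0, seed=42)
half = sample_delays(4, 2000, 500, ratio=0.5, seed=42)
```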


345-374: Good test for zero-mean delay guard condition.

This test correctly validates that when mean=0, no delays are sampled or applied to any turns (all turn.delay values are None). This confirms the guard condition in the composer works as expected.

docs/tutorials/multi-turn.md (2)

354-382: Clear documentation of realistic conversation flow.

The new section effectively explains:

  • Why delays are applied (simulating reading/thinking time)
  • When delays are applied (before non-first turns)
  • How to configure delays (mean, stddev, ratio parameters)

The terminology uses "Turn 0" for the first turn and "Turn 1" for the second turn, which aligns with the 0-based indexing in the code (turn_index).


383-394: Execution flow updated to reflect delay behavior.

The revised flow correctly describes:

  • Turn 1 as the first turn with no delay
  • Delay application before subsequent turns
  • History accumulation between turns

Minor note: The documentation uses "turn 1" and "turn 2" which could be interpreted as the first and second turns (0-indexed: turn_index 0 and 1). This is consistent with common user understanding (first turn = turn 1), though it differs from the 0-based turn_index in code. The current phrasing "Execute turn 1 (first turn, no delay)" makes this clear.



@debermudez debermudez force-pushed the dbermudez/aip-568-session-delay-not-added-in-multi-turn-scenarios branch from 06cf185 to cacea00 on November 6, 2025 at 22:22
Contributor

@ajcasagrande ajcasagrande left a comment


Approving, but recommend considering addressing nit: MILLIS_PER_SECOND before merge.
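The nit refers to naming the millisecond-to-second conversion factor rather than using a bare 1000 at the call site. A minimal sketch of that change (the constant name comes from the review comment; where it would live in aiperf is an assumption):

```python
# Named constant instead of a magic number, per the review nit.
MILLIS_PER_SECOND = 1000


def ms_to_seconds(delay_ms: float) -> float:
    """Convert a configured delay from milliseconds to seconds."""
    return delay_ms / MILLIS_PER_SECOND
```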

Comment thread src/aiperf/workers/worker.py
Comment thread src/aiperf/workers/worker.py Outdated
Signed-off-by: Elias Bermudez <dbermudez@nvidia.com>
@debermudez debermudez force-pushed the dbermudez/aip-568-session-delay-not-added-in-multi-turn-scenarios branch from 385d005 to 357c66f on November 7, 2025 at 01:04
@debermudez debermudez enabled auto-merge (squash) on November 7, 2025 at 01:04
@debermudez debermudez merged commit 306e943 into main Nov 7, 2025
20 of 21 checks passed
@debermudez debermudez deleted the dbermudez/aip-568-session-delay-not-added-in-multi-turn-scenarios branch November 7, 2025 01:18
saturley-hall pushed a commit that referenced this pull request Nov 7, 2025
Signed-off-by: Elias Bermudez <dbermudez@nvidia.com>
Signed-off-by: Harrison King Saturley-Hall <hsaturleyhal@nvidia.com>
saturley-hall added a commit that referenced this pull request Nov 7, 2025
Signed-off-by: Elias Bermudez <dbermudez@nvidia.com>
Signed-off-by: Harrison King Saturley-Hall <hsaturleyhal@nvidia.com>
Co-authored-by: Elias Bermudez <6505145+debermudez@users.noreply.github.com>
vinhngx pushed a commit to vinhngx/aiperf that referenced this pull request Jan 12, 2026
Signed-off-by: Elias Bermudez <dbermudez@nvidia.com>
Signed-off-by: vinhn <vinhn@nvidia.com>