Fix libsql prompt scope regressions#1651
Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses several regressions related to Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request enhances multi-tenancy support by ensuring system prompts and workspace configurations are correctly scoped to individual users. Key changes include introducing a scoped_to_user method in the Workspace struct, adding Clone derives to WorkspaceStorage and Repository, and modifying the agent's dispatcher to utilize user-scoped workspaces for system prompt generation. Additionally, cron expressions in routine triggers are now normalized, with corresponding test updates. A performance and readability improvement was suggested for the read_user_ids logic in src/workspace/mod.rs, recommending the use of a HashSet to optimize its complexity from O(n^2) to O(n).
| let mut read_user_ids = vec![user_id.clone()]; | ||
| for scope in &self.read_user_ids { | ||
| if scope != &self.user_id && !read_user_ids.contains(scope) { | ||
| read_user_ids.push(scope.clone()); | ||
| } | ||
| } | ||
| for scope in crate::workspace::layer::MemoryLayer::read_scopes(&memory_layers) { | ||
| if !read_user_ids.contains(&scope) { | ||
| read_user_ids.push(scope); | ||
| } | ||
| } |
There was a problem hiding this comment.
For improved performance and readability, you can refactor the logic for building read_user_ids. The current implementation uses Vec::contains within loops, resulting in O(n^2) complexity. By using a HashSet to track seen scopes, you can achieve O(n) complexity.
Here's a suggested refactoring that also simplifies the logic by chaining iterators:
use std::collections::HashSet;
let mut read_user_ids = vec![user_id.clone()];
let mut seen_scopes: HashSet<String> = [user_id.clone()].into();
// Combine scopes from the original workspace (excluding the old user) and the updated memory layers.
let other_scopes = self.read_user_ids.iter()
.filter(|s| *s != &self.user_id)
.cloned()
.chain(crate::workspace::layer::MemoryLayer::read_scopes(&memory_layers));
for scope in other_scopes {
if seen_scopes.insert(scope.clone()) {
read_user_ids.push(scope);
}
}References
- The use of
HashSetfor tracking seen scopes aligns with the principle of using efficient data structures for deduplication to improve performance, similar to howHashMapis recommended for O(1) lookups in other deduplication scenarios, avoiding linear scans.
There was a problem hiding this comment.
Thanks. I agree a HashSet-backed dedupe would be a reasonable cleanup if this path grows, but I am going to leave it as-is for this change. In practice read_user_ids is very small here, this path is not performance-sensitive, and the current Vec-based construction keeps insertion order obvious and deterministic for debugging. If we revisit it later, I would likely keep the Vec for stable ordering and add a HashSet only for seen tracking rather than switching the representation outright.
There was a problem hiding this comment.
Pull request overview
Fixes multi-tenant regressions where chat system prompt generation and routine cron schedules were being scoped/serialized inconsistently, particularly when using libSQL-backed workspaces.
Changes:
- Scope chat system prompt generation to the incoming message’s user workspace (rather than the startup/owner workspace).
- Normalize cron expressions on routine creation so create/update persist the same canonical schedule string.
- Update E2E/regression tests to seed identity data into the active test user scope and expect normalized cron schedules.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| tests/multi_tenant_system_prompt.rs | Updates test module docs to reflect the regression being covered. |
| tests/e2e_workspace_coverage.rs | Seeds IDENTITY.md into the active test user workspace instead of the owner workspace. |
| tests/e2e_builtin_tool_coverage.rs | Updates expected stored cron schedule to the normalized 7-field form. |
| src/workspace/repository.rs | Derives Clone for Repository to support cloning workspace storage. |
| src/workspace/mod.rs | Derives Clone for WorkspaceStorage and adds Workspace::scoped_to_user() for per-user scoping. |
| src/tools/builtin/routine.rs | Normalizes cron schedule when building cron triggers during routine creation + adds a unit test. |
| src/agent/dispatcher.rs | Uses a user-scoped workspace for system prompt generation based on IncomingMessage.user_id. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
I think there's one valid issue:
In src/workspace/mod.rs lines 595-603, the read_user_ids loop has O(n²) complexity: let mut read_user_ids = vec![user_id.clone()]; Fix: Use HashSet instead: |
Summary
Testing