feat(reborn): add prompt write safety policy #3167
Conversation
Code Review
This pull request introduces a comprehensive prompt-write safety policy framework to protect sensitive files, such as system prompts and agent identities, from malicious injections. It implements a versioned registry for protected paths, a safety policy trait with a default sanitizer-based implementation, and integrates enforcement checks into the memory backend and filesystem adapters. Feedback identifies critical issues with the non-atomic nature of the new append_file implementations, which are susceptible to race conditions. Additionally, there are opportunities to optimize performance by avoiding redundant database reads for content hashing and to improve the architecture by centralizing policy enforcement to prevent duplicate checks and the loss of safety allowances in the filesystem adapter.
```rust
async fn append_file(&self, path: &VirtualPath, bytes: &[u8]) -> Result<(), FilesystemError> {
    self.ensure_file_documents(path, FilesystemOperation::AppendFile)?;
    let document_path = self.parse_file_path(path, FilesystemOperation::AppendFile)?;
    let context = MemoryContext::new(document_path.scope().clone());
    let mut combined = self
        .backend
        .read_document(&context, &document_path)
        .await?
        .unwrap_or_default();
    let is_protected = prompt_write_protected_classification(
        self.prompt_safety_policy.as_ref(),
        &self.prompt_protected_path_registry,
        &document_path,
    )
    .is_some();
    let previous_hash = if is_protected {
        std::str::from_utf8(&combined).ok().map(content_sha256)
    } else {
        None
    };
    combined.extend_from_slice(bytes);
    if is_protected {
        let content = std::str::from_utf8(&combined).map_err(|_| {
            memory_error(
                path.clone(),
                FilesystemOperation::AppendFile,
                "memory document content must be UTF-8",
            )
        })?;
        enforce_prompt_write_safety(
            self.prompt_safety_policy.as_ref(),
            &self.prompt_protected_path_registry,
            PromptWriteSafetyCheck {
                scope: context.scope(),
                path: &document_path,
                operation: PromptWriteOperation::Append,
                source: PromptWriteSource::MemoryFilesystemAdapter,
                content,
                previous_content_hash: previous_hash.as_deref(),
                allowance: context.prompt_write_safety_allowance(),
                filesystem_operation: FilesystemOperation::AppendFile,
            },
        )
        .await?;
    }
    self.backend
        .write_document(&context, &document_path, &combined)
        .await
}
```
The append_file implementation performs a non-atomic read-modify-write operation, which can lead to race conditions and data loss. To prevent this, a write lock should be held across the entire operation. Consider extending the MemoryBackend trait with an append_document method to allow backends to handle this atomically (e.g., using BEGIN IMMEDIATE in libSQL to serialize concurrent writers).
References
- To prevent race conditions when updating a swappable resource, hold a write lock across the entire read-modify-write operation, including both the inner resource mutation and the snapshot refresh.
- When saving document versions in libSQL, wrap the SELECT and INSERT operations in a transaction with BEGIN IMMEDIATE to acquire a write lock upfront. This serializes concurrent writers and prevents race conditions where multiple requests might fetch the same next version number, leading to unique constraint violations.
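The atomicity the reviewer is asking for can be sketched with a self-contained in-memory backend whose `append_document` holds a single lock across the whole read-modify-write cycle. The names here (`LockedMemoryBackend`, the method signatures) are illustrative, not the PR's actual trait:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical in-memory backend: one lock guards the document map, and
// `append_document` performs read + extend + write inside a single
// critical section, so concurrent appends cannot interleave.
#[derive(Default)]
pub struct LockedMemoryBackend {
    documents: Mutex<HashMap<String, Vec<u8>>>,
}

impl LockedMemoryBackend {
    // Atomic append: the MutexGuard is held for the entire read-modify-write.
    pub fn append_document(&self, path: &str, bytes: &[u8]) {
        let mut docs = self.documents.lock().unwrap();
        docs.entry(path.to_string())
            .or_default()
            .extend_from_slice(bytes);
    }

    pub fn read_document(&self, path: &str) -> Option<Vec<u8>> {
        self.documents.lock().unwrap().get(path).cloned()
    }
}

// Demo: 8 threads each append 100 single bytes. Because the lock spans the
// whole cycle, no update is lost and the final length is exactly 800.
pub fn concurrent_append_demo() -> usize {
    let backend = Arc::new(LockedMemoryBackend::default());
    let handles: Vec<_> = (0..8)
        .map(|_| {
            let b = Arc::clone(&backend);
            thread::spawn(move || {
                for _ in 0..100 {
                    b.append_document("agents/system_prompt.md", b"x");
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    backend
        .read_document("agents/system_prompt.md")
        .map(|bytes| bytes.len())
        .unwrap_or(0)
}
```

The PR's current code instead releases any backend-level synchronization between `read_document` and `write_document`, which is exactly the window in which a concurrent append can be lost.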
```rust
async fn append_file(&self, path: &VirtualPath, bytes: &[u8]) -> Result<(), FilesystemError> {
    let document_path = self.parse_file_path(path, FilesystemOperation::AppendFile)?;
    let mut combined = self
        .repository
        .read_document(&document_path)
        .await?
        .unwrap_or_default();
    let is_protected = prompt_write_protected_classification(
        self.prompt_safety_policy.as_ref(),
        &self.prompt_protected_path_registry,
        &document_path,
    )
    .is_some();
    let previous_hash = if is_protected {
        std::str::from_utf8(&combined).ok().map(content_sha256)
    } else {
        None
    };
    combined.extend_from_slice(bytes);
    if is_protected {
        let content = std::str::from_utf8(&combined).map_err(|_| {
            memory_error(
                path.clone(),
                FilesystemOperation::AppendFile,
                "memory document content must be UTF-8",
            )
        })?;
        enforce_prompt_write_safety(
            self.prompt_safety_policy.as_ref(),
            &self.prompt_protected_path_registry,
            PromptWriteSafetyCheck {
                scope: document_path.scope(),
                path: &document_path,
                operation: PromptWriteOperation::Append,
                source: PromptWriteSource::MemoryDocumentFilesystem,
                content,
                previous_content_hash: previous_hash.as_deref(),
                allowance: None,
                filesystem_operation: FilesystemOperation::AppendFile,
            },
        )
        .await?;
    }
    self.repository
        .write_document(&document_path, &combined)
        .await?;
    if let Some(indexer) = &self.indexer {
        let _ = indexer.reindex_document(&document_path).await;
    }
    Ok(())
}
```
Similar to the adapter implementation, this append_file implementation is not atomic. Concurrent appends will result in race conditions. To ensure reliability, the MemoryDocumentRepository trait should be extended with an atomic append_document method so that implementations can hold a write lock (e.g., using BEGIN IMMEDIATE in libSQL) across the entire read-modify-write cycle.
References
- To prevent race conditions when updating a swappable resource, hold a write lock across the entire read-modify-write operation, including both the inner resource mutation and the snapshot refresh.
- When saving document versions in libSQL, wrap the SELECT and INSERT operations in a transaction with BEGIN IMMEDIATE to acquire a write lock upfront. This serializes concurrent writers and prevents race conditions where multiple requests might fetch the same next version number, leading to unique constraint violations.
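In libSQL/SQLite terms, the serialized append the reviewer describes would look roughly like the following. This is a hedged sketch: the `documents` table and its column names are invented for illustration, and the safety check still runs in the host language between the read and the write.

```sql
BEGIN IMMEDIATE;   -- acquire the write lock before reading, serializing writers
SELECT content FROM documents WHERE path = :path;
-- host side: combined = old content || appended bytes,
-- then run the prompt-write safety check on the combined content
UPDATE documents SET content = :combined WHERE path = :path;
COMMIT;            -- or ROLLBACK if the safety policy rejects the append
```

Because `BEGIN IMMEDIATE` takes the write lock up front, a second concurrent appender blocks at its own `BEGIN IMMEDIATE` rather than reading the stale content.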
```rust
let previous_hash = if prompt_write_protected_classification(
    self.prompt_safety_policy.as_ref(),
    &self.prompt_protected_path_registry,
    path,
)
.is_some()
{
    self.repository
        .read_document(path)
        .await?
        .and_then(|bytes| std::str::from_utf8(&bytes).ok().map(content_sha256))
} else {
    None
};
```
This explicit read_document call to obtain a content hash for the safety policy is performed for every write to a protected path, even if the configured policy does not utilize the hash. This results in redundant database roundtrips. While performance optimizations should be gated on profiling per repository guidelines, this architectural redundancy should be addressed by making the read conditional based on policy requirements or deferring it to the repository layer.
References
- Performance optimizations, such as adding a caching layer, should be gated on profiling, especially when existing mechanisms (e.g., ETags, short-circuiting for specific paths) already reduce overhead for common cases.
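One way to make the read conditional, as suggested, is to let the policy declare up front whether it consumes the previous-content hash, so callers only pay for the extra roundtrip when a policy will use it. The trait slice and names below (`requires_previous_content_hash`, the example policies) are hypothetical, not the PR's actual API:

```rust
// Hypothetical slice of the safety-policy trait: policies advertise whether
// they consume `previous_content_hash`, so callers can skip the extra
// read_document roundtrip when the answer is "no".
pub trait PromptWriteSafetyPolicy {
    /// Conservative default: assume the hash is needed.
    fn requires_previous_content_hash(&self) -> bool {
        true
    }
}

// A sanitizer-style policy inspects only the new content, so it opts out.
pub struct SanitizerOnlyPolicy;

impl PromptWriteSafetyPolicy for SanitizerOnlyPolicy {
    fn requires_previous_content_hash(&self) -> bool {
        false
    }
}

// A hash-comparing policy keeps the conservative default.
pub struct HashComparingPolicy;

impl PromptWriteSafetyPolicy for HashComparingPolicy {}

// Caller side: fetch the old document only for protected paths whose policy
// will actually use the hash.
pub fn should_fetch_previous(policy: &dyn PromptWriteSafetyPolicy, is_protected: bool) -> bool {
    is_protected && policy.requires_previous_content_hash()
}
```

With this shape, the default sanitizer-based policy in the PR would skip the `read_document` roundtrip entirely, while a future hash-pinning policy would opt back in without touching the call sites.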
```rust
let is_protected = prompt_write_protected_classification(
    self.prompt_safety_policy.as_ref(),
    &self.prompt_protected_path_registry,
    &document_path,
)
.is_some();
if is_protected {
    let content = std::str::from_utf8(bytes).map_err(|_| {
        memory_error(
            path.clone(),
            FilesystemOperation::WriteFile,
            "memory document content must be UTF-8",
        )
    })?;
    let previous_hash = self
        .backend
        .read_document(&context, &document_path)
        .await?
        .and_then(|bytes| std::str::from_utf8(&bytes).ok().map(content_sha256));
    enforce_prompt_write_safety(
        self.prompt_safety_policy.as_ref(),
        &self.prompt_protected_path_registry,
        PromptWriteSafetyCheck {
            scope: context.scope(),
            path: &document_path,
            operation: PromptWriteOperation::Write,
            source: PromptWriteSource::MemoryFilesystemAdapter,
            content,
            previous_content_hash: previous_hash.as_deref(),
            allowance: context.prompt_write_safety_allowance(),
            filesystem_operation: FilesystemOperation::WriteFile,
        },
    )
    .await?;
}
```
Safety policy enforcement is implemented in both MemoryBackendFilesystemAdapter and RepositoryMemoryBackend. When these layers are stacked, the same safety checks are executed twice on the same content. Since RepositoryMemoryBackend acts as the primary persistence bridge, it may be more appropriate to centralize enforcement there and have the adapter rely on the backend's enforcement, especially since the backend will see the final combined content in the case of appends.
```diff
@@ -898,11 +1510,96 @@ impl RootFilesystem for MemoryBackendFilesystemAdapter {
         self.ensure_file_documents(path, FilesystemOperation::WriteFile)?;
         let document_path = self.parse_file_path(path, FilesystemOperation::WriteFile)?;
         let context = MemoryContext::new(document_path.scope().clone());
```
The adapter creates a fresh MemoryContext for every operation, which effectively strips any safety allowances (like empty_prompt_file_clear) that might have been intended by the caller. Since the RootFilesystem trait does not currently support passing context or allowances, this makes it impossible to perform policy-restricted operations (such as clearing a protected file) through the filesystem interface, even for privileged host-level callers.
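Assuming the RootFilesystem trait signature cannot change, one minimal fix is to let the adapter carry a default allowance fixed at construction time by a privileged host-level caller, and thread it into each per-operation context instead of building an allowance-free one. Everything below (`AdapterConfig`, `with_allowance`, the allowance enum variant) is an illustrative sketch, not the PR's types:

```rust
// Hypothetical allowance, mirroring the `empty_prompt_file_clear` case the
// review mentions.
#[derive(Clone, Copy, PartialEq, Debug)]
pub enum PromptWriteSafetyAllowance {
    EmptyPromptFileClear,
}

// Sketch of adapter-level configuration: the allowance is decided once, by
// whoever constructs the adapter, rather than lost in a fresh MemoryContext
// created per operation.
pub struct AdapterConfig {
    default_allowance: Option<PromptWriteSafetyAllowance>,
}

impl AdapterConfig {
    // Ordinary callers get no allowance: protected files stay locked down.
    pub fn new() -> Self {
        Self { default_allowance: None }
    }

    // Privileged host-level constructor: opts in to a one-shot allowance.
    pub fn with_allowance(allowance: PromptWriteSafetyAllowance) -> Self {
        Self { default_allowance: Some(allowance) }
    }

    // Each per-operation context would copy this instead of starting empty,
    // so policy-restricted operations remain possible through the
    // filesystem interface for privileged callers.
    pub fn context_allowance(&self) -> Option<PromptWriteSafetyAllowance> {
        self.default_allowance
    }
}
```

This keeps the trait untouched while restoring a path for host-level code to clear a protected prompt file when policy permits it.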
This reverts commit e0023fd.
Summary
Refs #3019.
Scope note
This PR implements the Reborn memory substrate for #3019: policy/registry/event types plus repository and filesystem write/append enforcement. It intentionally does not complete final product-service wiring for future/outer patch, import, seed, profile, admin system-prompt, or capability-specific mutation surfaces. Those callers must route their final resolved content through this hook (or present an explicit one-shot allowance) before #3019 is closed as a full cutover blocker.
Test Plan
- cargo test -p ironclaw_memory --features libsql
- cargo clippy -p ironclaw_memory --features libsql --all-targets -- -D warnings
- cargo test -p ironclaw_architecture
- cargo fmt --check
- git diff --check

Notes
- cargo test -p ironclaw_safety -p ironclaw_memory -p ironclaw_host_runtime currently fails on origin/reborn-integration in the unrelated host-runtime test host_http_egress_consumes_staged_policy_when_request_validation_fails; I reproduced the same failure with this branch's changes stashed.