-
Notifications
You must be signed in to change notification settings - Fork 83
Conversation
This change introduces secrets on the fly scanning. It still requires * Response interception, e.g. outout from the LLM * Validation with an end to end IDE <> LLM flow
Note, this will fail when #120 merges as I can't push formatted code and not keep getting caught up in format rebases from merge conflicts. |
def translate_completion_input_params(self, kwargs: Dict) -> Optional[ChatCompletionRequest]: | ||
try: | ||
return ChatCompletionRequest(**kwargs) | ||
except Exception as e: | ||
raise ValueError(f"Invalid completion parameters: {str(e)}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this functionality moved to the LLamaCppInputNormalizer
and is not needed here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, there was a lot of movement in the core engine and not enough docs. Let's merge this and clean up in a follow up. We have too many PRs in flight now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved, just left one comment :)
|
||
def generate_session_key(self, session_id): | ||
"""Generates a session key with an associated timestamp.""" | ||
key = os.urandom(32) # Generate a 256-bit key |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a question, is this the recommended way in python to get random data? I see that we are importing cryptography, does it provide some function to securely generate randomness? (I know this is urandom and won't block so this is OK, just asking)
self.session_keys[session_id] = (key, time.time()) | ||
return key | ||
|
||
def get_session_key(self, session_id): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is one caller currently and it uses None as the session ID. Were you thinking about using the chat ID as the session? I guess it should be something the client doesn't set?
|
||
def decrypt_token(self, encrypted_token, session_id): | ||
"""Decrypts a token and validates its timestamp to prevent replay attacks.""" | ||
key = self.get_session_key(session_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason I was asking about the session ID was that I was wondering if we could make the session key single-use and delete after use to help against replay attacks.
Returns: | ||
Absolute position in text | ||
""" | ||
lines = text.split("\n") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and now I understand why you asked me about line splits in the normalization..
so we don't handle those ourselves. I think we should test on windows how are the snippets sent from the client to the server..
if isinstance(request["messages"], list): | ||
for msg in request["messages"]: | ||
if msg.get("role") == "user" and msg.get("content") == extracted_string: | ||
msg["content"] = protected_string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a big deal, but get_last_user_message returns (msg, index), so you could just replace the message at that index instead of looking for it.
Let's not change it now and maybe file an issue, I think it would be nice to change get_last_user_message to return a named tuple instead so that it's more obvious
def _ensure_signatures_loaded(cls) -> None: | ||
"""Ensure signatures are loaded before use.""" | ||
if not cls._signatures_loaded: | ||
with cls._instance_lock: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, I will borrow this pattern for the snippet extraction :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a couple of comments inline. We should merge this and iterate.
Sounds good, I am going to review the crypto code a few times again, I had to do a last minute refactor to a different library as py crypto is no longer maintained |
This change introduces secrets on the fly scanning. It still requires