Commit 029b71f

feat: remove keyring trace (#105)
* feat: add change document to remove the keyring trace
* feat: remove all references to the keyring trace and bump version of affected specification documents
* docs: add another drawback and example
* docs: add security implications
* docs: change changes file naming convention to use ISO8601 date of proposal as UID
* chore: reorganize changes files to group all files for a change into a directory
* chore: revert prettier formatting of copyright header (see #110)
1 parent e9a28c8 commit 029b71f

10 files changed

+633
-239
lines changed

Lines changed: 294 additions & 0 deletions
@@ -0,0 +1,294 @@
1+
[//]: # (Copyright Amazon.com Inc. or its affiliates. All Rights Reserved.)
2+
[//]: # (SPDX-License-Identifier: CC-BY-SA-4.0)
3+
4+
# The AWS Encryption SDK (ESDK) and the keyring trace
5+
6+
## Background
7+
8+
When we designed keyrings,
9+
we added a concept of a "keyring trace"
10+
that the keyring uses to communicate
11+
what actions it took.
12+
This is an evolution of earlier indicators
13+
in the decryption API that indicated which master key
14+
decrypted the data key.
15+
In both cases, we exposed the data to the caller
16+
but did not include any guidance on what they should do with it,
17+
how to interact with it,
18+
or why it is important.
19+
This is similar to how we treat encryption context
20+
in the encryption and decryption API results.
21+
22+
## Goals
23+
24+
Our goal is to determine how, or if,
25+
we should expose the keyring trace.
26+
27+
## Success Measurements
28+
29+
We will know we are succeeding if we can assemble
30+
multiple known customer problems that we think keyring traces solve
31+
and present examples that address each problem
32+
that _either_ demonstrate why keyring traces are needed
33+
and how they solve those problems
34+
_or_ demonstrate why keyring traces are not needed.
35+
36+
## Out of Scope
37+
38+
Anything that requires us to add API surface area,
39+
whether that is modifying existing APIs or interfaces,
40+
must be treated as a new feature.
41+
All new features must be
42+
reviewed through the specification modification process.
43+
44+
## Issues and Alternatives
45+
46+
As they exist today,
47+
keyring traces are not very usable,
48+
but more importantly
49+
we never explain or show why they should be used.
50+
51+
Each issue that follows depends on
52+
the answer to the previous issue.
53+
54+
_Preferred options are in italics._
55+
56+
**New feature requirements are in bold.**
57+
58+
### Issue 0: Why should callers interact with the keyring trace?
59+
60+
If we cannot define a clear purpose for the keyring trace
61+
that is not already met by other ESDK framework components,
62+
we should not expose it to callers.
63+
This needs to include not only
64+
an explanation of what problems the keyring trace solves,
65+
but also guidance on how to use the keyring trace
66+
to solve those problems
67+
and where in the framework those problems should be solved.
68+
69+
- _Option: They shouldn't._
70+
71+
- If we cannot come up with any problems
72+
that the keyring trace solves in its current state,
73+
then we should not expose it to customers in any way
74+
and we should not mention it in any documentation or examples.
75+
It should remain an implementation detail
76+
until or unless we find a use for it.
77+
78+
- Option: Asynchronous audit log.
79+
80+
- Writing the keyring trace to an audit log
81+
would give customers useful metrics on
82+
how they are using the ESDK
83+
throughout their systems.
84+
85+
- counter: This just moves the question of "why" down the road.
86+
87+
- Option: Data protection controls.
88+
89+
- Not all keyrings provide the same protections.
90+
One use of the keyring trace could be to
91+
validate that certain protections were applied to
92+
the encrypted data keys in an encrypted message.
93+
94+
- ex: Require that all keyrings that encrypted the data key
95+
also signed the encryption context.
96+
97+
- alternative: Inspect keyrings before use
98+
to check that they meet your requirements.
99+
100+
- Option: Live usage audit.
101+
102+
- Because keyring behaviors can get complex,
103+
a live audit of keyring actions
104+
could be useful to enforce wrapping key requirements.
105+
106+
- ex: Allow only AWS KMS wrapping keys within a specific account
107+
on decryption.
108+
109+
- alternative: Make a keyring that filters out undesirable EDKs.
110+
111+
- If a customer accepts encrypted messages from unverified sources,
112+
they might not want to trust encrypted messages
113+
that contain EDKs for unknown wrapping keys
114+
and use unsigned algorithm suites.
115+
116+
- alternative: Make a CMM that checks these requirements
117+
before attempting to decrypt any EDKs.
118+
119+
- Option: Notification of failures and no-ops on decryption.
120+
121+
- **Requires adding a new keyring trace action flag.**
122+
- Because CMMs and keyrings can be deeply nested
123+
and keyrings do not halt decryption
124+
if they encounter an error on decrypt,
125+
it can be difficult to determine
126+
why a decryption request failed.
127+
Requiring keyrings to add keyring trace entries
128+
that describe no-op and failure events
129+
would help a caller determine
130+
why no EDKs could be decrypted.
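
The "filter out undesirable EDKs" alternative listed under the live usage audit option can be illustrated without any keyring trace. The following is a minimal sketch, not ESDK code: `EncryptedDataKey` and `filter_edks_by_account` are hypothetical names, and the sketch assumes that AWS KMS EDKs carry the provider ID `aws-kms` and a key ARN as their provider info.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class EncryptedDataKey:
    """Hypothetical stand-in for an EDK; not the ESDK's own type."""
    provider_id: str
    provider_info: str
    ciphertext: bytes


def filter_edks_by_account(
    edks: Iterable[EncryptedDataKey], allowed_account: str
) -> List[EncryptedDataKey]:
    """Keep only EDKs whose provider info is a KMS key ARN in allowed_account."""
    kept = []
    for edk in edks:
        if edk.provider_id != "aws-kms":  # assumed provider ID for AWS KMS
            continue
        # ARN layout: arn:partition:service:region:account-id:resource
        parts = edk.provider_info.split(":", 5)
        if (
            len(parts) == 6
            and parts[0] == "arn"
            and parts[2] == "kms"
            and parts[4] == allowed_account
        ):
            kept.append(edk)
    return kept
```

A wrapping keyring could apply such a filter to the EDK list before handing it to a child keyring, so EDKs from other accounts are never attempted.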
131+
132+
### Issue 1: What should callers read from the keyring trace?
133+
134+
The keyring trace is defined as a list of entries,
135+
each entry composed of
136+
one or more action flags that describe what a keyring did,
137+
as well as information that identifies the keyring that performed those actions.
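
To make the shape of the data concrete, here is a minimal sketch of such a trace. The names `TraceFlag` and `TraceEntry` are illustrative assumptions, not taken from any ESDK implementation; the flag names mirror the actions listed under Issue 4, and the identifier is modeled as a key namespace and key name pair.

```python
from dataclasses import dataclass
from enum import Flag, auto


class TraceFlag(Flag):
    """Hypothetical action flags, one per action this document lists."""
    GENERATED_DATA_KEY = auto()
    ENCRYPTED_DATA_KEY = auto()
    SIGNED_ENCRYPTION_CONTEXT = auto()
    DECRYPTED_DATA_KEY = auto()
    VERIFIED_ENCRYPTION_CONTEXT = auto()


@dataclass(frozen=True)
class TraceEntry:
    """One entry: who acted (namespace and name) and what they did (flags)."""
    key_namespace: str
    key_name: str
    flags: TraceFlag


# A trace is a list of entries; an entry can carry several flags at once.
trace = [
    TraceEntry(
        "aws-kms",
        "example-key-1",
        TraceFlag.GENERATED_DATA_KEY
        | TraceFlag.ENCRYPTED_DATA_KEY
        | TraceFlag.SIGNED_ENCRYPTION_CONTEXT,
    ),
    TraceEntry("raw-rsa", "example-key-2", TraceFlag.ENCRYPTED_DATA_KEY),
]
```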
138+
139+
1. _Option: Both the action flag and the keyring identifier._
140+
141+
- If both the action taken and the keyring that took it are important,
142+
the caller MUST be able to connect a trace entry
143+
to a keyring instance.
144+
145+
1. Option: Nothing.
146+
147+
- If the keyring trace is intended solely for asynchronous audit,
148+
the caller should not interact with it at runtime.
149+
150+
1. Option: Only the action flag values.
151+
152+
- If the primary value is in the action taken
153+
rather than the keyring that took that action,
154+
the caller should not attempt to connect a trace entry
155+
to a keyring instance
156+
or to an EDK.
157+
158+
1. Option: Only the keyring identifier.
159+
160+
- Included for completeness.
161+
If the only thing that is important is which keyrings took any action,
162+
the keyring trace is already overly complicated.
163+
164+
### Issue 2: How should callers interact with the keyring trace?
165+
166+
More than one of these options might be necessary,
167+
depending on the answer to **Issue 1**.
168+
169+
1. Option: Given an action flag, find all entries containing that flag.
170+
171+
- This is straightforward and already possible
172+
with the current structure of the keyring trace entries.
173+
174+
1. Option: Given a keyring, find all entries created by that keyring.
175+
176+
- **This will likely require an addition to the keyring interface.**
177+
- Because keyrings can have more than one key namespace and key name,
178+
connecting a keyring to one or more trace entries can be difficult.
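
The two lookups above might look like the following sketch. All names are hypothetical. The second function assumes the keyring can report every (key namespace, key name) pair it may write into the trace, which is exactly the kind of interface addition the second option would require.

```python
from dataclasses import dataclass
from enum import Flag, auto
from typing import List, Set, Tuple


class TraceFlag(Flag):
    """Hypothetical subset of action flags, enough for this example."""
    ENCRYPTED_DATA_KEY = auto()
    SIGNED_ENCRYPTION_CONTEXT = auto()


@dataclass(frozen=True)
class TraceEntry:
    key_namespace: str
    key_name: str
    flags: TraceFlag


def entries_with_flag(trace: List[TraceEntry], flag: TraceFlag) -> List[TraceEntry]:
    """Lookup by action: needs nothing beyond the entries themselves."""
    return [entry for entry in trace if flag in entry.flags]


def entries_for_keyring(
    trace: List[TraceEntry], identifiers: Set[Tuple[str, str]]
) -> List[TraceEntry]:
    """Lookup by keyring: needs the keyring to declare its identifiers.

    `identifiers` is an assumption; keyrings expose no such set today.
    """
    return [
        entry
        for entry in trace
        if (entry.key_namespace, entry.key_name) in identifiers
    ]
```

The asymmetry is the point: the first lookup works on the trace alone, while the second depends on information that only the keyring can supply.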
179+
180+
### Issue 3: Where and when should callers interact with the keyring trace?
181+
182+
1. _Option: Within cryptographic materials managers (CMMs)._
183+
184+
- All request and message values can be accessed at this level.
185+
- This should be sufficient for enforcing requirements
186+
either statically or based on the request or message metadata.
187+
188+
- ex: A CMM that requires that
189+
all keyrings that encrypted the data key
190+
also signed the encryption context.
191+
- ex: A CMM that requires that an escrow keyring
192+
encrypted the data key for any messages
193+
whose encryption context contains a specific value.
194+
- ex: A CMM that writes the keyring trace to an audit log.
195+
196+
1. _Option: Within keyrings._
197+
198+
- Not all request and message values can be accessed at this level.
199+
- This should be sufficient for keyrings that might choose
200+
to take (or not take) certain actions based on
201+
previous actions.
202+
203+
- ex: A multi-keyring that keeps trying child keyrings
204+
until at least one keyring has
205+
verified the encryption context.
206+
207+
1. Option: Outside of the ESDK.
208+
209+
- **Requires adding output values to the API signatures.**
210+
- The keyring trace must be returned from the top-level APIs.
211+
- This should only be necessary if the requirements
212+
that we expect customers to want to enforce
213+
vary across messages
214+
or depend on details outside of
215+
the message and request metadata.
216+
217+
1. Option: Within the ESDK client.
218+
219+
- **Requires adding input values to the API signatures.**
220+
- **Requires adding a new conceptual feature.**
221+
- The caller provides per-request keyring trace checking requirements

222+
that the ESDK client performs after calling the CMM.
223+
224+
- This is conceptually similar to previous ideas about
225+
how to give customers a way to check the encryption context
226+
before decrypting an encrypted message.
227+
- This should only be necessary if the requirements
228+
that we expect customers to want to enforce
229+
vary across messages
230+
or depend on details outside of
231+
the message and request metadata.
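
As an illustration of the first CMM example above (every keyring that encrypted the data key must also have signed the encryption context), the check a CMM could run over the trace might look like this sketch. The types and the function name are hypothetical, and the sketch assumes that a keyring's actions may be spread over several entries that share one identifier.

```python
from dataclasses import dataclass
from enum import Flag, auto
from typing import Dict, List, Tuple


class TraceFlag(Flag):
    """Hypothetical subset of action flags, enough for this example."""
    ENCRYPTED_DATA_KEY = auto()
    SIGNED_ENCRYPTION_CONTEXT = auto()


@dataclass(frozen=True)
class TraceEntry:
    key_namespace: str
    key_name: str
    flags: TraceFlag


def check_encryptors_signed(trace: List[TraceEntry]) -> None:
    """Raise ValueError if any identifier encrypted without also signing."""
    combined: Dict[Tuple[str, str], TraceFlag] = {}
    for entry in trace:
        key = (entry.key_namespace, entry.key_name)
        # Merge flags from all entries that share an identifier.
        combined[key] = combined.get(key, TraceFlag(0)) | entry.flags
    offenders = sorted(
        key
        for key, flags in combined.items()
        if TraceFlag.ENCRYPTED_DATA_KEY in flags
        and TraceFlag.SIGNED_ENCRYPTION_CONTEXT not in flags
    )
    if offenders:
        raise ValueError(f"encrypted without signing: {offenders}")
```

A CMM would call this on the trace returned by its keyring and refuse to release the encryption materials if it raises.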
232+
233+
### Issue 4: Which action flags should a keyring trace entry allow?
234+
235+
1. Option: Successful actions.
236+
237+
- Any action that a keyring completes successfully.
238+
- This is what happens today for:
239+
- generate data key
240+
- encrypt data key
241+
- sign encryption context
242+
- decrypt data key
243+
- verify encryption context
244+
245+
1. Option: Failure.
246+
247+
- **Requires adding a new keyring trace action flag.**
248+
- Any action that a keyring attempted but failed to complete.
249+
- This is useful for debugging why an encrypt or decrypt request failed.
250+
251+
1. Option: No-op.
252+
253+
- **Requires adding a new keyring trace action flag.**
254+
- If a keyring chooses to do nothing.
255+
- This is useful for debugging why an encrypt or decrypt request failed.
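
To show how failure and no-op flags could help with debugging, here is a sketch that summarizes a decryption attempt from its trace. `FAILED` and `NO_OP` do not exist today; they stand in for the new flags these options would require, and every other name here is equally hypothetical.

```python
from dataclasses import dataclass
from enum import Flag, auto
from typing import List


class TraceFlag(Flag):
    DECRYPTED_DATA_KEY = auto()
    FAILED = auto()  # hypothetical: keyring attempted an action and failed
    NO_OP = auto()   # hypothetical: keyring chose to do nothing


@dataclass(frozen=True)
class TraceEntry:
    key_namespace: str
    key_name: str
    flags: TraceFlag


def summarize_decrypt_attempt(trace: List[TraceEntry]) -> List[str]:
    """Return one line per keyring that failed or did nothing.

    Returns an empty list if some keyring decrypted the data key.
    """
    if any(TraceFlag.DECRYPTED_DATA_KEY in entry.flags for entry in trace):
        return []
    lines = []
    for entry in trace:
        name = f"{entry.key_namespace}/{entry.key_name}"
        if TraceFlag.FAILED in entry.flags:
            lines.append(f"{name}: attempted and failed")
        elif TraceFlag.NO_OP in entry.flags:
            lines.append(f"{name}: took no action")
    return lines
```

Without such flags, a trace for a failed decryption would be empty, which gives the caller nothing to inspect.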
256+
257+
## One-Way Doors
258+
259+
Any change that would add API surface area is a one-way door.
260+
Any such changes must be treated as new features
261+
and handled through the specification modification process.
262+
263+
1. Adding functionality to the keyring interface. (**Issue 2**)
264+
1. Returning the keyring trace from the ESDK APIs. (**Issue 3**)
265+
1. Adding a "message requirements" system to the ESDK APIs. (**Issue 3**)
266+
1. Adding new keyring trace action flags. (**Issue 4**)
267+
268+
## Impact
269+
270+
1. All pending and future ESDK releases are blocked by these issues.
271+
1. Each of the one-way doors also represents a new feature
272+
that must be reviewed through the specification modification process.
273+
This will impact all projected ESDK development and release targets.
274+
275+
## Open Questions
276+
277+
- Is it important to be able to tie
278+
a successful keyring trace entry to an EDK?
279+
- Is the order of entries in the keyring trace important? If so, what order?
280+
281+
- Absolute order?
282+
- Relative order?
283+
- State of materials beforehand?
284+
- What about concurrent actions? (ex: parallel multi-keyring)
285+
286+
- "[..] the requirements
287+
that we expect customers to want to enforce
288+
vary across messages
289+
or depend on details outside of
290+
the message and request metadata."
291+
292+
- Do these requirements exist
293+
and are they requirements that
294+
the ESDK should support solving?
