Skip to content

Conversation

@DJAndries
Copy link
Collaborator

@DJAndries DJAndries commented Aug 25, 2023

This will allow us to have separate OPRF instances for each P3A cadence: monthly, weekly and daily.

@DJAndries DJAndries requested a review from a team as a code owner August 25, 2023 04:45
@DJAndries DJAndries force-pushed the multiple-instances branch 2 times, most recently from f360faf to b3a2eec Compare August 25, 2023 20:26
@DJAndries
Copy link
Collaborator Author

This PR is blocked by https://github.com/brave/reviews/issues/1381 for the new calendar-duration crate.

Copy link
Contributor

@rillian rillian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've previously made separate submissions endpoints for each P3A report type. Why combine the randomness servers for each cadence, rather than having separate deployments?

How would you update the PRIVATE_KEY env variable from the key sharing design in #175 for multiple instances.

Comment on lines +57 to +64
fn calculate(base_time: OffsetDateTime, instance_epoch_duration: CalendarDuration) -> Self {
let now = time::OffsetDateTime::now_utc();
let mut elapsed_epoch_count = 0;
let mut next_rotation = base_time + instance_epoch_duration;
while next_rotation < now {
next_rotation = next_rotation + instance_epoch_duration;
elapsed_epoch_count += 1;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. If our goal is to align with periodic points on the calendar, maybe it would be less convoluted to have a Daily, Weekly, Monthly enum and calculate the delay to the next point from that, instead of relying on the right interaction with the base time?

The time::Date type has next_day() and next_occurance(Weekday) methods that might be helpful here. Looks like it doesn't provide exactly what we need for months, but the Month type has next() and previous() methods which wrap around properly, so a value could be constructed that way.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. If our goal is to align with periodic points on the calendar, maybe it would be less convoluted to have a Daily, Weekly, Monthly enum and calculate the delay to the next point from that, instead of relying on the right interaction with the base time?

i'm unsure how using an enum would simplify the calculation of the initial epoch, which was the code that you highlighted. previously we used arithmetic to calculate the starting epoch, but it seems that we can no longer do this with monthly epochs due to the fact that months vary in the amount of days. i switched to an iterative approach to account for this.

Also, switching from a duration string defined by the user to an enum will make testing short epoch duration difficult, which was a concern that you mentioned here: #50 (comment)

The ability to define a custom epoch duration provides this kind of flexibility.

The time::Date type has next_day() and next_occurance(Weekday) methods that might be helpful here. Looks like it doesn't provide exactly what we need for months, but the Month type has next() and previous() methods which wrap around properly, so a value could be constructed that way.

The new small crate calendar-duration provides an abstraction layer for this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm unsure how using an enum would simplify the calculation of the initial epoch

Sorry, I don't mean that it simplifies the code. I meant an enum would simplify the interface. This change does two things. It adds a parser so someone using the command line doesn't have to figure out how many seconds a week is, or whatever. That's nice. Separately, it allows specifying durations over a month in a way that aligns with calendar dates, which isn't something that was possible before, because months aren't a fixed number of seconds.

But, to get what we want, which is to have epochs rotating at the zero hour UTC on daily, weekly, and monthly cadences, we still have to think carefully about whether the --epoch-base-time we pass is a Monday and the first day of a month. (It is. Did you do that on purpose?) Seems like it would be nicer to be able to just request that directly and let the code do the math.

an enum will make testing short epoch duration difficult

I think we could still include a Seconds(u64) variant in the enum for testing or other custom intervals.

Also, would this change support rotating epochs more frequently than the calendar measurement period like we've discussed doing with P3A? What arguments would I pass to rotate three times a month?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(It is. Did you do that on purpose?)

nope, that was all you 😃

I think we could still include a Seconds(u64) variant in the enum for testing or other custom intervals.

what benefit would this provide over just specifying --epoch-duration 30s or something similar?

Also, would this change support rotating epochs more frequently than the calendar measurement period like we've discussed doing with P3A? What arguments would I pass to rotate three times a month?

This is a good question. I'm afraid not. One idea for a future PR: we could rename --epoch-duration to --report-cadence-duration and add a switch called --epochs-per-report-period. we could perform a small amount of extra arithmetic to calculate the epoch; I expect that the burden wouldn't be significant

@DJAndries
Copy link
Collaborator Author

DJAndries commented Aug 29, 2023

Why combine the randomness servers for each cadence, rather than having separate deployments?

a few reasons:

  • when I added support for multiple cadences for json measurements in the client, i was encouraged to continue to use a single host during sec/privacy review. less hosts in the client makes audits easier
  • less devops overhead. this is especially important now that we are implementing a key sync mechanism, which increases the complexity of our deployments. given the limited human resources we have allocated to star + devops, it's ideal to keep our deployments as lean as possible
  • less attestation overhead in the client. the client only needs to attest one host, instead of three. this results in slightly less network/computation overhead + less state management in the client.
  • constellation-processors will follow the same pattern, for reasons similar to the above.

How would you update the PRIVATE_KEY env variable from the key sharing design in #175 for multiple instances.

left a comment in the linked issue with some questions. if we do decide to move forward with the env var, i have some ideas to solve this.

@rillian
Copy link
Contributor

rillian commented Aug 31, 2023

Why combine the randomness servers for each cadence, rather than having separate deployments?

Thanks. All those reasons make sense to me.

We could pass the instance name through an optional instance or cadence property on the submission, similar to how the client can request a specific epoch. That would avoid the need for a separate /instance/randomness endpoint, but it would have to be a query parameter on the GET /info which isn't any more tidy.

@DJAndries DJAndries force-pushed the multiple-instances branch 2 times, most recently from d003522 to 4e287db Compare September 20, 2023 23:34
Copy link
Contributor

@rillian rillian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's a bug with the currentEpoch field. The value returned with randomness requests doesn't match what's in the log output or the value returned from instances/main/info.

Copy link
Contributor

@rillian rillian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the design here isn't what we really want, but the implementation seems fine, so approving is a bases for future feedback as we gain experience with it.

Json(request): Json<RandomnessRequest>,
) -> Result<Json<RandomnessResponse>, Error> {
#[instrument(skip(state, request))]
async fn randomness(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there are no await calls in here, this function could be synchronous, handling the full request in a single future. Might improve throughput, but I didn't check.

@rillian rillian merged commit 9396188 into main Sep 26, 2023
@rillian rillian deleted the multiple-instances branch September 26, 2023 23:08
@DJAndries
Copy link
Collaborator Author

DJAndries commented Sep 26, 2023

I think the design here isn't what we really want

Can you elaborate on this? Feedback is much appreciated. Especially before we use these endpoints in production.

@rillian
Copy link
Contributor

rillian commented Sep 26, 2023

Just what I was trying to say before about how the epoch durations are configured. We want to align rotations with calendar periods, so I suspect something that let us say "This one rotates Sunday midnight, this one rotates on the 1st, 5th, and 18th days of the month" would be a better API.

@DJAndries
Copy link
Collaborator Author

Just what I was trying to say before about how the epoch durations are configured. We want to align rotations with calendar periods, so I suspect something that let us say "This one rotates Sunday midnight, this one rotates on the 1st, 5th, and 18th days of the month" would be a better API.

Gotcha, will give this some more thought

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants