@shawnhathaway shawnhathaway commented Sep 9, 2020

What changed?
An incremental PR on the way to making MutableState a proto everywhere. This PR:

  • Removes persistence.InternalWorkflowExecutionInfo. This can return later once we are happy with the object structure.
  • Aligns WorkflowExecutionInfo (WEI) field types with the expected proto-generated types
  • Adds nil resiliency for time values where they were accessed unsafely
  • Embeds persistenceblobs.ExecutionStats in persistenceblobs.WorkflowExecutionInfo and deprecates HistorySize. HistorySize must be dual-written, both as a field and in the new stats sub-object, until we validate complete migration and can remove the deprecated field (v1.1+++).

Bug Fixes:

  • createMutableState/copyWorkflowExecutionInfo test util now copies WorkflowTaskScheduledTimestamp
  • historyEngine_test#TestRecordWorkflowTaskStarted***Sticky*** tests correctly compare response.ScheduledTime to the scheduled event written
  • timestamp.UnixOrZero now functions as UnixNano (to be renamed in a separate PR)
    • Introduced timestamp.UnixOrEmpty here to replace it, since zero is ambiguous

How did you test it?
Local integration, unit tests. Buildkite validation

Potential risks
Time is now nillable, so nil pointer dereferences could be hiding in the code. This is the first incremental PR, and more work will be done on this in upcoming PRs. The pipelines should give us extra coverage here.
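To make the nil risk concrete, here is a minimal sketch of guarding *time.Time fields before use. The names timeValue, taskInfo, and startedAfterScheduled are hypothetical and exist only for illustration; they are not taken from this PR.

```go
package main

import (
	"fmt"
	"time"
)

// timeValue is a hypothetical nil-safe dereference:
// a nil pointer yields the empty time.Time instead of panicking.
func timeValue(t *time.Time) time.Time {
	if t == nil {
		return time.Time{}
	}
	return *t
}

// taskInfo is an illustrative stand-in, not the real WorkflowExecutionInfo.
type taskInfo struct {
	ScheduledTime *time.Time
	StartedTime   *time.Time
}

// startedAfterScheduled compares two optional times without
// dereferencing a nil pointer; it reports false if either is unset.
func startedAfterScheduled(info taskInfo) bool {
	if info.StartedTime == nil || info.ScheduledTime == nil {
		return false
	}
	return info.StartedTime.After(*info.ScheduledTime)
}

func main() {
	scheduled := time.Date(2020, 9, 9, 0, 0, 0, 0, time.UTC)
	info := taskInfo{ScheduledTime: &scheduled} // StartedTime left nil

	fmt.Println(timeValue(info.StartedTime).IsZero())
	fmt.Println(startedAfterScheduled(info))
}
```

Writing `*info.StartedTime` directly in `main` would panic, which is the kind of hidden failure the risk note describes.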

return TimePtr(UnixOrEmptyTime(nanos))
}

func UnixOrZeroTime(nanos int64) time.Time {
Contributor Author

UnixOrZeroTime will be renamed to UnixNano in a later PR. UnixOrEmptyTime will now return an empty time object.
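The difference between the two helpers can be sketched as follows. The function bodies are assumptions inferred from the description in this thread, not the actual implementations in the timestamp package, and the lower-case names are local stand-ins.

```go
package main

import (
	"fmt"
	"time"
)

// unixOrZeroTime is an assumed implementation: it always converts nanos
// to a time, so 0 maps to the Unix epoch (1970-01-01T00:00:00Z).
func unixOrZeroTime(nanos int64) time.Time {
	return time.Unix(0, nanos).UTC()
}

// unixOrEmptyTime is an assumed implementation: 0 is treated as "unset"
// and maps to the empty time.Time value.
func unixOrEmptyTime(nanos int64) time.Time {
	if nanos == 0 {
		return time.Time{}
	}
	return time.Unix(0, nanos).UTC()
}

func main() {
	// The epoch is a real instant, so IsZero reports false.
	fmt.Println(unixOrZeroTime(0).IsZero())
	// The empty value is what IsZero is designed to detect.
	fmt.Println(unixOrEmptyTime(0).IsZero())
}
```

This is why "zero" is ambiguous: a caller checking IsZero() gets different answers depending on which helper produced the value.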

import "temporal/api/history/v1/message.proto";
import "temporal/api/failure/v1/message.proto";

message ReplicationInfo {
Contributor Author

@shawnhathaway shawnhathaway Sep 9, 2020

This was mistakenly removed in a prior commit. @alexshtin, it appears we don't validate the build for proto?

Contributor

We don't. You are supposed to regenerate the Go files manually every time. I agree, we should add this step and check whether there are any changes in the api dir after make proto.

WorkflowTaskAttempt: sourceInfo.WorkflowTaskAttempt,
WorkflowTaskStartedTimestamp: sourceInfo.WorkflowTaskStartedTimestamp,
WorkflowTaskOriginalScheduledTimestamp: sourceInfo.WorkflowTaskOriginalScheduledTimestamp,
WorkflowTaskScheduledTimestamp: sourceInfo.WorkflowTaskScheduledTimestamp,
Contributor Author

Long-standing bug: this field was missing.

s.NotNil(response)
s.True(response.StartedTime.After(*expectedResponse.ScheduledTime))
expectedResponse.StartedTime = response.StartedTime
expectedResponse.ScheduledTime = &time.Time{}
Contributor Author

ScheduledTime should match the event time. This bug was hidden by the missing field in copyWorkflowExecutionInfo.

- ScheduledTimestamp: m.msb.timeSource.Now().UnixNano(),
- StartedTimestamp: 0,
+ ScheduledTimestamp: timestamp.TimePtr(m.msb.timeSource.Now()),
+ StartedTimestamp: timestamp.UnixOrZeroTimePtr(0),
Contributor Author

@shawnhathaway shawnhathaway Sep 9, 2020

Outstanding TODO: initialize zero values with nil instead of 0. Note that this applies to Time only; for Durations (i.e. timeouts), 0 and nil are genuinely distinct.

@shawnhathaway shawnhathaway marked this pull request as ready for review September 9, 2020 02:56

// Back compat for GetHistorySize
if info.GetHistorySize() >= 0 && info.GetExecutionStats() == nil {
executionInfo.ExecutionStats = &persistenceblobs.ExecutionStats{HistorySize: 0}
Contributor Author

Ignore the bug here. It is fixed in a separate branch/commit, with a test to ensure this works correctly during migration. I will separate it out and apply it to this PR shortly.
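For illustration only, here is one way the dual write and the backward-compatible read could look. The struct and function names are hypothetical stand-ins for the persistenceblobs types. Carrying the legacy value into the new stats object is an assumption about the intended behavior, not the fix that was actually applied in the separate commit.

```go
package main

import "fmt"

// executionStats and workflowExecutionInfo are simplified, hypothetical
// stand-ins for the generated proto types.
type executionStats struct {
	HistorySize int64
}

type workflowExecutionInfo struct {
	HistorySize    int64 // deprecated field, still written during migration
	ExecutionStats *executionStats
}

// setHistorySize dual-writes the size to the deprecated field and to the
// embedded stats object.
func setHistorySize(info *workflowExecutionInfo, size int64) {
	info.HistorySize = size
	if info.ExecutionStats == nil {
		info.ExecutionStats = &executionStats{}
	}
	info.ExecutionStats.HistorySize = size
}

// backfillStats handles records written before the stats object existed.
// Assumption: the legacy value should be carried over when stats are absent.
func backfillStats(info *workflowExecutionInfo) {
	if info.ExecutionStats == nil {
		info.ExecutionStats = &executionStats{HistorySize: info.HistorySize}
	}
}

func main() {
	legacy := &workflowExecutionInfo{HistorySize: 4096} // old record, no stats
	backfillStats(legacy)
	fmt.Println(legacy.ExecutionStats.HistorySize)

	fresh := &workflowExecutionInfo{}
	setHistorySize(fresh, 10)
	fmt.Println(fresh.HistorySize, fresh.ExecutionStats.HistorySize)
}
```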

Comment on lines +231 to +233
WorkflowTaskStartedTimestamp *time.Time
WorkflowTaskScheduledTimestamp *time.Time
WorkflowTaskOriginalScheduledTimestamp *time.Time
Contributor

I think these need to be renamed to *Time, not *Timestamp.

Contributor Author

Next PR :)

WorkflowTaskScheduleId: executionInfo.WorkflowTaskScheduleID,
WorkflowTaskStartedId: executionInfo.WorkflowTaskStartedID,
WorkflowTaskRequestId: executionInfo.WorkflowTaskRequestID,
WorkflowTaskTimeout: timestamp.DurationFromSeconds(executionInfo.WorkflowTaskTimeout),
Contributor

So ideally, when we finish all the time conversions, we should not use timestamp.DurationFromSeconds and timestamp.TimestampFromTime at all (maybe only when we read from configs, but even there we can use duration and time).

Contributor Author

We definitely shouldn't need it in the serialization layers, and I agree with the overall direction. However, we may still need it in some places in the code, such as the Persistence interfaces, which take int64 timeouts in some areas to pass to Cassandra as a TTL (seconds only), whereas the SQL layers may use timestamps.

WorkflowTypeName: "wType",
WorkflowRunTimeout: 20,
DefaultWorkflowTaskTimeout: 13,
WorkflowRunTimeout: timestamp.DurationFromSeconds(20),
Contributor

You lose the beauty of Go durations here. Wouldn't timestamp.DurationPtr(20 * time.Second) be better?

Contributor Author

IMHO, having DurationFromUnit() helpers avoids mistakes in math and makes refactoring a bit simpler. We can always drop down to `DurationPtr()` for more complex times. Let's chat more.
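The two styles under discussion can be compared side by side. The helper bodies below are assumed shapes for DurationFromSeconds and DurationPtr, written as local lower-case functions; the real timestamp package may differ.

```go
package main

import (
	"fmt"
	"time"
)

// durationPtr returns a pointer to d (assumed shape of DurationPtr).
func durationPtr(d time.Duration) *time.Duration {
	return &d
}

// durationFromSeconds converts whole seconds to a *time.Duration
// (assumed shape of DurationFromSeconds). The unit lives in the name,
// so callers cannot forget to multiply by time.Second.
func durationFromSeconds(seconds int64) *time.Duration {
	return durationPtr(time.Duration(seconds) * time.Second)
}

func main() {
	fromUnit := durationFromSeconds(20)
	fromExpr := durationPtr(20 * time.Second)
	fmt.Println(*fromUnit == *fromExpr) // both styles yield the same value

	// The expression style covers values a single-unit helper cannot express.
	mixed := durationPtr(90*time.Second + 500*time.Millisecond)
	fmt.Println(*mixed)
}
```

The mistake the unit-named helper guards against is something like `durationPtr(20)`, which compiles but means 20 nanoseconds.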

executionInfo := mutableState.executionInfo
executionInfo.HasRetryPolicy = true
- executionInfo.WorkflowExpirationTime = s.now.Add(1000 * time.Second)
+ executionInfo.WorkflowExpirationTime = timestamp.TimeNowPtrUtcAddSeconds(1000)
Contributor

I don't like this helper func :-). Yeah, definitely don't like.

StartTimestamp: startTimeUnixNano,
ExecutionTimestamp: executionTimeUnixNano,
- RunTimeout: int64(runTimeout),
+ RunTimeout: int64(timestamp.DurationValue(runTimeout).Round(time.Second).Seconds()),
Contributor

I expect all these rounds to be temporary, right?

Contributor Author

Some of these will stay due to Cassandra TTL being in seconds.
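As a sketch of why the rounding matters when a store only accepts whole-second TTLs: the helpers below (durationValue, ttlSeconds, millis) are hypothetical local names, and the nil-to-zero behavior of durationValue is an assumption about what DurationValue does.

```go
package main

import (
	"fmt"
	"time"
)

// durationValue is an assumed nil-safe dereference: nil yields 0.
func durationValue(d *time.Duration) time.Duration {
	if d == nil {
		return 0
	}
	return *d
}

// ttlSeconds converts an optional duration to whole seconds, rounding to
// the nearest second (time.Duration.Round rounds halves away from zero).
func ttlSeconds(d *time.Duration) int64 {
	return int64(durationValue(d).Round(time.Second).Seconds())
}

// millis is a small illustrative constructor for sub-second durations.
func millis(n int64) *time.Duration {
	d := time.Duration(n) * time.Millisecond
	return &d
}

func main() {
	fmt.Println(ttlSeconds(millis(1499))) // rounds down to 1
	fmt.Println(ttlSeconds(millis(1500))) // rounds up to 2
	fmt.Println(ttlSeconds(nil))          // unset timeout becomes 0
}
```

Without the Round call, converting through Seconds() and int64 would truncate, so 1.9s would become a 1-second TTL.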

@shawnhathaway shawnhathaway merged commit faf2fe7 into temporalio:master Sep 11, 2020