-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Changes ListClosedWorkflow result ordering for SQL-based visibility stores to be based off of workflow close time #578
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes ListClosedWorkflow result ordering for SQL-based visibility stores to be based off of workflow close time #578
Conversation
| mdb.converter.ToMySQLDateTime(*filter.MaxStartTime), | ||
| *filter.RunID, | ||
| *filter.MinStartTime, | ||
| *filter.MaxStartTime, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shawnhathaway - this also seemed like a bug to me. can you confirm that this change makes sense
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea this does look like a bug as it doesn't appear MinStartTime is even affected by the paging token and this would break the contract of returning earlier records than specified with the filter it appears.
| pdb.converter.ToPostgresDateTime(*filter.MaxStartTime), | ||
| *filter.RunID, | ||
| *filter.MinStartTime, | ||
| *filter.MaxStartTime, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shawnhathaway - ditto here
| CREATE INDEX by_type_start_time ON executions_visibility (namespace_id, workflow_type_name, status, start_time DESC, run_id); | ||
| CREATE INDEX by_workflow_id_start_time ON executions_visibility (namespace_id, workflow_id, status, start_time DESC, run_id); | ||
| CREATE INDEX by_status_by_close_time ON executions_visibility (namespace_id, status, start_time DESC, run_id); | ||
| CREATE INDEX by_status_by_start_time ON executions_visibility (namespace_id, status, start_time DESC, run_id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shawnhathaway - technically we can create a partial index here (https://www.postgresql.org/docs/8.0/indexes-partial.html). MySQL doesn't support this, but Postgres does. This means we can have some indexes only for workflows in the closed state while others are in the open state. The downside is that there are more updates to the indices. Let me know if you think it is worth doing the differentiation here, or if you think it is fine as-is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Imho, this should be fine as is unless it's very low hanging fruit effort-wise to add the partials. It adds size to index, but from my understanding high-performant visibility users will use Elastisearch. Let's sync today.
…listclosedworkflowexecutions_order_sql
| CREATE INDEX by_type_start_time ON executions_visibility (namespace_id, workflow_type_name, status, start_time DESC, run_id); | ||
| CREATE INDEX by_workflow_id_start_time ON executions_visibility (namespace_id, workflow_id, status, start_time DESC, run_id); | ||
| CREATE INDEX by_status_by_close_time ON executions_visibility (namespace_id, status, start_time DESC, run_id); | ||
| CREATE INDEX by_status_by_start_time ON executions_visibility (namespace_id, status, start_time DESC, run_id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Imho, this should be fine as is unless it's very low hanging fruit effort-wise to add the partials. It adds size to index, but from my understanding high-performant visibility users will use Elastisearch. Let's sync today.
| mdb.converter.ToMySQLDateTime(*filter.MaxStartTime), | ||
| *filter.RunID, | ||
| *filter.MinStartTime, | ||
| *filter.MaxStartTime, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea this does look like a bug as it doesn't appear MinStartTime is even affected by the paging token and this would break the contract of returning earlier records than specified with the filter it appears.
This PR fixes the following issues:
SQL stores were using the Workflow Open Time instead of the Close Time for pagination, which is not aligned with Cassandra and is not the intention.
The SQL store pagination logic had a bug where the tie-breaker logic of using run_id for two rows with the same close_time was not implemented properly.
The SQL store pagination logic had a bug where it was using the MinStart time instead of the MaxStart time for pagination purposes (to be confirmed as to whether this is truly a bug or not, but it does appear that way)
Fixes Makefile so that schema installation for MySQL / Postgres functions properly.
Fixes Postgres username / password in postgres development environment.
Adds appropriate MySQL/Postgres indexes for querying by close_time
Testing: