@markmark206 markmark206 commented Jul 30, 2020

What changed?

Update Front End Load Balancer configs to default to multiple task queue partitions, to match the defaults in the Matching Engine.

Why?

This is part of the general effort to improve the usability of the "task queue partitions" functionality.

Before this change, all the pollers defaulted to the root partition, so attempting to run a tq describe on a non-root partition resulted in the "No pollers" error message.

This change allows for pollers to be distributed across available (as configured) task queue partitions.

This change allows running tctl [admin] tq describe on non-root task queue partitions.
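To make the effect of the partition-count default concrete, here is a minimal sketch in Go. It is not the code from this PR: `partitionName` and `pickPollPartition` are hypothetical names, and the uniform random choice is an assumption made for illustration. Only the naming scheme (the root partition keeps the plain task queue name, the others are `/__temporal_sys/<task queue>/<n>`) is taken from the tctl output in this description.

```go
package main

import (
	"fmt"
	"math/rand"
)

// partitionPrefix matches the prefix visible in the tctl output in this PR.
const partitionPrefix = "/__temporal_sys/"

// partitionName returns the name of partition n of a task queue.
// Partition 0 is the root partition and keeps the plain task queue name.
func partitionName(taskQueue string, n int) string {
	if n <= 0 {
		return taskQueue
	}
	return fmt.Sprintf("%s%s/%d", partitionPrefix, taskQueue, n)
}

// pickPollPartition is a hypothetical stand-in for the load balancer's
// choice of partition for a poll request. pick(n) must return a value in
// [0, n); choosing uniformly at random is an assumption of this sketch.
func pickPollPartition(taskQueue string, numPartitions int, pick func(n int) int) string {
	if numPartitions <= 1 {
		// With a single configured partition every poller lands on the root.
		return taskQueue
	}
	return partitionName(taskQueue, pick(numPartitions))
}

func main() {
	for _, n := range []int{1, 4} {
		seen := map[string]bool{}
		for i := 0; i < 1000; i++ {
			seen[pickPollPartition("temporal-bench", n, rand.Intn)] = true
		}
		fmt.Printf("configured partitions=%d, distinct poll targets=%d\n", n, len(seen))
	}
}
```

With one configured partition the sketch can only ever return the root name, which mirrors the "No pollers" behaviour on non-root partitions described above. With four, polls are spread over the root and the three `/__temporal_sys/...` partitions.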

Here is the output illustrating the current functionality:

  1. listing all the partitions when running with one instance of the matching service:
◉ 10:49:25 [markmark ~/src/temporal-mm] (describe|m9u) $ ./tctl --ns benchtest tq lp  --tq temporal-bench
     WORKFLOWTASKQUEUEPARTITION    |      HOST
  temporal-bench                   | 127.0.0.1:7235
  /__temporal_sys/temporal-bench/1 | 127.0.0.1:7235
  /__temporal_sys/temporal-bench/2 | 127.0.0.1:7235
  /__temporal_sys/temporal-bench/3 | 127.0.0.1:7235
     ACTIVITYTASKQUEUEPARTITION    |      HOST
  temporal-bench                   | 127.0.0.1:7235
  /__temporal_sys/temporal-bench/1 | 127.0.0.1:7235
  /__temporal_sys/temporal-bench/2 | 127.0.0.1:7235
  /__temporal_sys/temporal-bench/3 | 127.0.0.1:7235
  1. sub-partitions can now be queried for pollers:
◉ 10:47:41 [markmark ~/src/temporal-mm] (describe|m9u) $ ./tctl --ns benchtest tq describe  --tq /__temporal_sys/temporal-bench/1
2020/07/30 10:49:00 INFO  No logger configured for temporal client. Created default one.
         WORKFLOW POLLER IDENTITY         |     LAST ACCESS TIME
  [email protected]@ | 2020-07-30T10:45:28-07:00
◉ 10:49:03 [markmark ~/src/temporal-mm] (describe|m9u) $ ./tctl --ns benchtest tq describe  --tq /__temporal_sys/temporal-bench/2
2020/07/30 10:49:12 INFO  No logger configured for temporal client. Created default one.
         WORKFLOW POLLER IDENTITY         |     LAST ACCESS TIME
  [email protected]@ | 2020-07-30T10:48:25-07:00
◉ 10:49:14 [markmark ~/src/temporal-mm] (describe|m9u) $ ./tctl --ns benchtest tq describe  --tq /__temporal_sys/temporal-bench/2
2020/07/30 10:49:18 INFO  No logger configured for temporal client. Created default one.
         WORKFLOW POLLER IDENTITY         |     LAST ACCESS TIME
  [email protected]@ | 2020-07-30T10:48:25-07:00
◉ 10:49:20 [markmark ~/src/temporal-mm] (describe|m9u) $ ./tctl --ns benchtest tq describe  --tq /__temporal_sys/temporal-bench/3
2020/07/30 10:49:23 INFO  No logger configured for temporal client. Created default one.
Error: No poller for taskqueue: /__temporal_sys/temporal-bench/3
  1. admin: sub-partitions can now be queried using the tctl admin tq describe command:
◉ 11:09:53 [markmark ~/src/temporal-mm] (describe|m9u) $ ./tctl --ns benchtest admin tq describe  --tq temporal-bench
  READ LEVEL | ACK LEVEL | BACKLOG | LEASE START TASKID | LEASE END TASKID
     1100000 |    400722 |       0 |            1100001 |          1200000

         WORKFLOW POLLER IDENTITY         |     LAST ACCESS TIME
  [email protected]@ | 2020-07-30T11:23:57-07:00
◉ 11:24:49 [markmark ~/src/temporal-mm] (describe|m9u) $ ./tctl --ns benchtest admin tq describe  --tq /__temporal_sys/temporal-bench/3
  READ LEVEL | ACK LEVEL | BACKLOG | LEASE START TASKID | LEASE END TASKID
      100000 |         0 |       0 |             100001 |           200000

         WORKFLOW POLLER IDENTITY         |     LAST ACCESS TIME
  [email protected]@ | 2020-07-30T11:21:58-07:00
  1. Running `tctl [admin] tq describe` on a partition with no pollers results in the appropriate error message:
◉ 11:32:22 [markmark ~/src/temporal-mm] (describe|m9u) $ ./tctl --ns benchtest admin tq describe  --tq /__temporal_sys/temporal-bench/3
  READ LEVEL | ACK LEVEL | BACKLOG | LEASE START TASKID | LEASE END TASKID
      100000 |         0 |       0 |             100001 |           200000

Error: No poller for taskqueue: /__temporal_sys/temporal-bench/3
  1. task queue partitions in instances with multiple matching service replicas are spread across replicas:

(Notice that the HOST information differs from partition to partition, except for the root partition and /__temporal_sys/temporal-bench/1, which are hosted on the same node. In this example, running in our pipeline, we have 3 replicas and 4 tq partitions, so at least two partitions have to share a replica.)

bash-5.0# ./tctl --ns benchtest tq lp  --tq temporal-bench
     WORKFLOWTASKQUEUEPARTITION    |        HOST
  temporal-bench                   | 172.31.35.207:7235
  /__temporal_sys/temporal-bench/1 | 172.31.35.207:7235
  /__temporal_sys/temporal-bench/2 | 172.31.51.136:7235
  /__temporal_sys/temporal-bench/3 | 172.31.20.173:7235
     ACTIVITYTASKQUEUEPARTITION    |        HOST
  temporal-bench                   | 172.31.35.207:7235
  /__temporal_sys/temporal-bench/1 | 172.31.35.207:7235
  /__temporal_sys/temporal-bench/2 | 172.31.51.136:7235
  /__temporal_sys/temporal-bench/3 | 172.31.20.173:7235
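
As a quick way to check how partitions are spread across matching replicas, the listing above can be grouped by host. The Go sketch below is only an illustration: `partitionsByHost` is a hypothetical helper, not part of tctl, and it assumes rows of the form `<partition> | <host>` as printed by `tq lp` in this description.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// partitionsByHost groups partition names by the host column of
// "<partition> | <host>" rows. Header rows and lines without exactly
// one "|" separator are skipped.
func partitionsByHost(listing string) map[string][]string {
	byHost := make(map[string][]string)
	for _, line := range strings.Split(listing, "\n") {
		cols := strings.Split(line, "|")
		if len(cols) != 2 {
			continue
		}
		name := strings.TrimSpace(cols[0])
		host := strings.TrimSpace(cols[1])
		if name == "" || host == "" || host == "HOST" {
			continue
		}
		byHost[host] = append(byHost[host], name)
	}
	return byHost
}

func main() {
	// Workflow task queue rows copied from the multi-replica listing above.
	listing := `
     WORKFLOWTASKQUEUEPARTITION    |        HOST
  temporal-bench                   | 172.31.35.207:7235
  /__temporal_sys/temporal-bench/1 | 172.31.35.207:7235
  /__temporal_sys/temporal-bench/2 | 172.31.51.136:7235
  /__temporal_sys/temporal-bench/3 | 172.31.20.173:7235`

	byHost := partitionsByHost(listing)
	hosts := make([]string, 0, len(byHost))
	for h := range byHost {
		hosts = append(hosts, h)
	}
	sort.Strings(hosts) // stable order for printing
	for _, h := range hosts {
		fmt.Printf("%s hosts %d partition(s): %v\n", h, len(byHost[h]), byHost[h])
	}
}
```

For the listing in this description, the grouping shows three distinct hosts, with the root partition and /__temporal_sys/temporal-bench/1 sharing one of them.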

How did you test it?

I tested this by running CLI on my machine.

Potential risks
Not sure, but if we see any problems we will fix or revert.

@markmark206 markmark206 requested review from a team, alexshtin, mastermanu and samarabbas July 30, 2020 17:52
@samarabbas
Contributor

@markmark206 can you make sure buildkite builds are green?

@markmark206
Author

@samarabbas yes, looks like flakiness in either tests or buildkite (the test passes on my machine, and re-running in buildkite succeeded).

(I have since picked up the updated master, so it is now rebuilding again; we'll see what happens now.)

@markmark206 markmark206 changed the title Default to multiple tq partitions in FE LB configs Default to multiple TaskQueue partitions in FrontEnd LoadBalancer configs Jul 30, 2020
@markmark206 markmark206 merged commit 58f95bc into temporalio:master Jul 30, 2020
@markmark206 markmark206 deleted the describe branch July 30, 2020 22:20