Explicitly exit ShuffleWorker process when terminate future finished #75

In our usage, we encountered a case where the shuffle worker's registration timed out and triggered a fatal error, but the shuffle worker process did not exit. As a result, no new worker was spawned to replace it.

The reason is that, on a fatal error, the shuffle worker executes closeAsync and shuts down all of its component services. The process then exits only once all non-daemon threads have exited. However, our metric client starts extra threads that are not closed properly, and these keep the process alive. This should be fixed by closing these threads in the reporter#close method.
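To illustrate the reporter-side fix, here is a minimal sketch of a reporter whose close method stops the thread it started. The class `SketchMetricReporter` and its methods are hypothetical names used for illustration only; they are not the actual metric client API, and the real reporter may manage its threads differently.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical reporter used for illustration; not the real metric client. */
final class SketchMetricReporter implements AutoCloseable {

    // The thread created here is a non-daemon thread (when created from a
    // non-daemon thread such as main), so it keeps the JVM alive until stopped.
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> new Thread(r, "metric-reporter"));

    /** Starts periodic reporting on the background thread. */
    void start(Runnable report, long periodMillis) {
        scheduler.scheduleAtFixedRate(report, 0L, periodMillis, TimeUnit.MILLISECONDS);
    }

    /** Stops the background thread so that it cannot block process exit. */
    @Override
    public void close() throws InterruptedException {
        scheduler.shutdownNow();
        // Assumed bound of 5 seconds; wait briefly for the worker thread to finish.
        scheduler.awaitTermination(5, TimeUnit.SECONDS);
    }

    boolean isTerminated() {
        return scheduler.isTerminated();
    }

    public static void main(String[] args) throws Exception {
        try (SketchMetricReporter reporter = new SketchMetricReporter()) {
            reporter.start(() -> System.out.println("reporting metrics"), 100L);
            Thread.sleep(250L);
        }
        // With close() called, no non-daemon thread remains and the JVM can exit.
    }
}
```

If `close()` were omitted, the scheduler thread would still be running after `main` returns and the JVM would not exit on its own, which matches the behavior described above.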

Still, I think we should improve the shutdown logic a bit. We could explicitly exit the shuffle worker process once the termination future has completed. That way, shutdown is safe in any situation where some threads cannot be released in a timely manner.
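The proposal could look roughly like the following sketch. The helper `ExitOnTermination`, its `install` method, and the exit codes are assumptions made for illustration; the actual ShuffleWorker entry point and the type of its termination future may differ. The exit action is passed in as a parameter so that the logic can be exercised without terminating the JVM.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.IntConsumer;

/** Hypothetical helper used for illustration; not part of the actual ShuffleWorker code. */
final class ExitOnTermination {

    // Assumed exit codes; the real project may define its own.
    static final int SUCCESS_EXIT_CODE = 0;
    static final int FAILURE_EXIT_CODE = 1;

    private ExitOnTermination() {}

    /** Calls exitHandler with an exit code as soon as the termination future completes. */
    static void install(CompletableFuture<Void> terminationFuture, IntConsumer exitHandler) {
        terminationFuture.whenComplete((ignored, failure) ->
                exitHandler.accept(failure == null ? SUCCESS_EXIT_CODE : FAILURE_EXIT_CODE));
    }

    public static void main(String[] args) {
        // A lingering non-daemon thread, standing in for a thread that was never closed.
        Thread leaked = new Thread(() -> {
            try {
                Thread.sleep(Long.MAX_VALUE);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "leaked-thread");
        leaked.start();

        CompletableFuture<Void> terminationFuture = new CompletableFuture<>();
        install(terminationFuture, System::exit);

        // Completing the future triggers System.exit, so the process ends
        // even though the leaked thread is still running.
        terminationFuture.complete(null);
    }
}
```

Note that `System.exit` still runs shutdown hooks, so a hook that blocks could delay the exit. `Runtime.getRuntime().halt(...)` skips the hooks, but whether that is acceptable here is a design decision for the project.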



