Track scheduling frontier #521

Draft
wants to merge 1 commit into master

Conversation

petrosagg
Contributor

This PR introduces a scheduling frontier: a frontier that tracks the dataflows that can be scheduled in the cluster. This is accomplished using a timestamp that can be one of two variants (a code sketch follows the list):

  • An `Installed(id)` variant, which represents the fact that `id` might be scheduled.
  • A `Future(lower)` variant, which represents the fact that dataflows with `id >= lower` might be scheduled.
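
As a rough sketch of what such a timestamp could look like (the name `SchedTime` and the exact derives are illustrative assumptions, not the PR's actual definitions):

```rust
use timely::PartialOrder;

// Illustrative two-variant timestamp; the PR's actual type may differ.
// The derived total order is only used for consolidating updates.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord, Hash)]
enum SchedTime {
    /// Dataflow `id` might be scheduled.
    Installed(usize),
    /// Any dataflow with `id >= lower` might be scheduled.
    Future(usize),
}

impl PartialOrder for SchedTime {
    fn less_equal(&self, other: &Self) -> bool {
        use SchedTime::*;
        match (self, other) {
            // Future(a) covers every id >= a, so it precedes any later Future...
            (Future(a), Future(b)) => a <= b,
            // ...and any Installed(id) with id >= a.
            (Future(a), Installed(id)) => a <= id,
            // Distinct installed dataflows are incomparable, so each
            // Installed(id) can be retracted independently.
            (Installed(a), Installed(b)) => a == b,
            // An Installed element never precedes a Future element.
            (Installed(_), Future(_)) => false,
        }
    }
}
```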

Each worker initializes a local `MutableAntichain` with `num_worker` copies of `Future(0)` elements (the minimum), which represents the fact that any worker might schedule any id. A channel is allocated through the allocator so that changes to that local view of the frontier can be communicated between all the workers.
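
For illustration, the per-worker initialization could look like the following, using the hypothetical `SchedTime` above (the peer count is hard-coded here for brevity):

```rust
use timely::progress::frontier::MutableAntichain;

fn main() {
    let num_workers = 4; // illustrative; in practice the worker's peer count
    let mut frontier = MutableAntichain::new();
    // num_workers copies of the minimum element Future(0): initially any
    // worker might schedule any dataflow id.
    for _change in frontier.update_iter((0..num_workers).map(|_| (SchedTime::Future(0), 1))) {
        // changes to the local view would be sent to peers here
    }
    // Every Installed(id) is still beyond this frontier, so anything may run.
    assert!(frontier.less_equal(&SchedTime::Installed(7)));
}
```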

When a dataflow is installed in a worker, `Installed(id)` and `Future(id+1)` elements are inserted into the frontier and a `Future(id)` element is retracted.
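
A sketch of the corresponding batch of updates, building on the types above (all names are illustrative, and `broadcast` stands in for the channel mentioned earlier):

```rust
fn install_dataflow(frontier: &mut MutableAntichain<SchedTime>, id: usize) {
    let batch = vec![
        (SchedTime::Installed(id), 1),  // `id` might now be scheduled
        (SchedTime::Future(id + 1), 1), // later ids might still arrive
        (SchedTime::Future(id), -1),    // this worker's Future(id) is spent
    ];
    // Apply the batch to the local view; the same batch would also be
    // sent over the allocated channel so peers can mirror it.
    for _change in frontier.update_iter(batch) {
        // broadcast(_change); // hypothetical peer channel
    }
}
```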

When a dataflow naturally terminates, an `Installed(id)` element is retracted from the frontier.

When a dataflow is dropped through `drop_dataflow`, it is first moved into a frozen dataflow list but all its resources are maintained. Then `Installed(id)` is retracted from the frontier to let the other workers know that it will not be scheduled anymore.

It is ensured that only one of the two paths above is taken, so that an `Installed(id)` element is retracted exactly once (see the sketch below).
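
One way to see why the retraction can only happen once: whichever path fires first removes the dataflow from the set of live dataflows, and only that removal triggers the retraction. A hedged sketch, with all names illustrative:

```rust
use std::collections::HashMap;

struct Dataflow; // stand-in for the real per-dataflow state

fn retire(
    live: &mut HashMap<usize, Dataflow>,
    frozen: &mut Vec<(usize, Dataflow)>,
    frontier: &mut MutableAntichain<SchedTime>,
    id: usize,
    via_drop_dataflow: bool,
) {
    // `remove` succeeds at most once per id, so Installed(id) cannot be
    // retracted twice even if both paths are attempted.
    if let Some(dataflow) = live.remove(&id) {
        if via_drop_dataflow {
            // drop_dataflow path: keep the resources alive in the frozen
            // list until the global frontier passes Installed(id).
            frozen.push((id, dataflow));
        }
        // On natural termination, `dataflow` is simply dropped here.
        for _change in frontier.update_iter(Some((SchedTime::Installed(id), -1))) {
            // broadcast(_change); // hypothetical peer channel
        }
    }
}
```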

Eventually all workers will retract their copies of an `Installed(id)`, either because the dataflow was frozen or because it naturally terminated. When this happens `Installed(id)` stops being beyond the global scheduling frontier, and at that moment each worker is free to drop the dataflow and its associated resources since no worker will schedule it again.
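
The final cleanup could then be a per-worker check against the global frontier (a sketch, reusing the hypothetical `frozen` list from above):

```rust
fn collect_frozen(
    frozen: &mut Vec<(usize, Dataflow)>,
    frontier: &MutableAntichain<SchedTime>,
) {
    // Keep a frozen dataflow only while Installed(id) is still beyond the
    // global scheduling frontier; once it is not, no worker can schedule
    // `id` again and its resources can be dropped safely.
    frozen.retain(|(id, _)| frontier.less_equal(&SchedTime::Installed(*id)));
}
```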

Co-authored-by: Jan Teske <[email protected]>
Signed-off-by: Petros Angelatos <[email protected]>