[Doc] Make C++ code more plausible #2156

Merged on Apr 10, 2025 (2 commits)
11 changes: 7 additions & 4 deletions media/docs/cpp/blackwell_cluster_launch_control.md
@@ -43,17 +43,18 @@ __device__ non_persistent_kernel(...) {
   setup_common_data_structures();
   dim3 workCoordinates = blockIdx;
   coordinate_specific_compute(workCoordinates);
-}
+}
 ```
 #### Static Persistent Kernel
 ``` c++
 // Static Persistent Kernel
 __device__ static_persistent_kernel(...) {
   setup_common_data_structures(...);
   dim3 workCoordinates = blockIdx;
+  bool isValidId;
   do {
     coordinate_specific_compute(workCoordinates);
-    isValidId, workCoordinates = staticTileScheduler.fetch_next_work();
+    std::tie(isValidId, workCoordinates) = staticTileScheduler.fetch_next_work();
   } while (isValidId);
 }
 ```
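
The static persistent loop in this hunk can be exercised on the host with a small mock. The sketch below is only an illustration under stated assumptions: `Coord`, `StaticTileSchedulerSketch`, and `run_static_persistent_block` are hypothetical names, the scheduler is assumed to return a `std::pair<bool, Coord>`, and tiles are assumed to be handed out at a fixed stride equal to the grid size. The real `staticTileScheduler` interface is not shown in this diff and may differ.

```cpp
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical host-side stand-in for CUDA's dim3; only x is used here.
struct Coord {
  unsigned x = 0, y = 0, z = 0;
};

// Hypothetical static scheduler: each block visits tiles at a fixed
// stride of grid_size, starting from its own block index.
class StaticTileSchedulerSketch {
 public:
  StaticTileSchedulerSketch(unsigned block_id, unsigned grid_size,
                            unsigned num_tiles)
      : current_(block_id), stride_(grid_size), num_tiles_(num_tiles) {}

  // Returns (isValid, coordinates) for the next tile of this block.
  std::pair<bool, Coord> fetch_next_work() {
    current_ += stride_;
    Coord c;
    c.x = current_;
    return {current_ < num_tiles_, c};
  }

 private:
  unsigned current_;
  unsigned stride_;
  unsigned num_tiles_;
};

// Mirrors the do/while loop from the documentation. Like the original
// loop, it assumes the first tile (block_id) is valid.
std::vector<unsigned> run_static_persistent_block(unsigned block_id,
                                                  unsigned grid_size,
                                                  unsigned num_tiles) {
  std::vector<unsigned> processed;
  StaticTileSchedulerSketch scheduler(block_id, grid_size, num_tiles);
  Coord workCoordinates;
  workCoordinates.x = block_id;
  bool isValidId = false;
  do {
    // Stands in for coordinate_specific_compute(workCoordinates).
    processed.push_back(workCoordinates.x);
    std::tie(isValidId, workCoordinates) = scheduler.fetch_next_work();
  } while (isValidId);
  return processed;
}
```

With 4 blocks and 10 tiles, block 1 in this mock processes tiles 1, 5, and 9 before the scheduler reports an invalid ID and the loop exits.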
@@ -65,9 +66,11 @@ __device__ static_persistent_kernel(...) {
 __device__ clc_dynamic_persistent_kernel(...) {
   setup_common_data_structures(...);
   dim3 workCoordinates = blockIdx;
+  dim3 newClcID;
+  bool isValidId;
   do {
     coordinate_specific_compute(workCoordinates);
-    isValidId, newClcID = clcTileScheduler.fetch_next_work();
+    std::tie(isValidId, newClcID) = clcTileScheduler.fetch_next_work();
     workCoordinates = newClcID;
   } while (isValidId);
 }
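
For context on why the `isValidId, newClcID = ...` form was replaced: in C++ that expression is a comma expression, not a multiple assignment, so only the right-hand operand is assigned. The sketch below contrasts the two forms on plain host types. `fetch_next_work_stub`, `Result`, `comma_form`, and `tie_form` are hypothetical names used only for illustration, and the stub is assumed to return a `std::pair`. The comma form here assigns a plain `int` so that it compiles at all.

```cpp
#include <tuple>
#include <utility>

// Hypothetical stand-in for a scheduler query returning (valid, id).
std::pair<bool, int> fetch_next_work_stub() { return {true, 42}; }

struct Result {
  bool valid;
  int id;
};

// Comma form: parses as `(valid), (id = next)`. The left operand is
// evaluated and discarded, so `valid` keeps its old value.
Result comma_form() {
  bool valid = false;
  int id = 0;
  int next = 42;
  (void)valid, id = next;  // the cast only silences unused-value warnings
  return {valid, id};
}

// std::tie form: builds a tuple of references, so both variables are
// assigned from the returned pair.
Result tie_form() {
  bool valid = false;
  int id = 0;
  std::tie(valid, id) = fetch_next_work_stub();
  return {valid, id};
}
```

In this sketch `comma_form()` leaves `valid` as `false` while `tie_form()` sets it to `true`, which is the behavior the loop condition `while (isValidId)` depends on.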
@@ -76,7 +79,7 @@ __device__ clc_dynamic_persistent_kernel(...) {

 ### Cluster Launch Control Pipeline Class

-Please refer to the `PipelineCLCFetchAsync` pipeline class defined in [Cluster launch control pipeline class](https://github.com/NVIDIA/cutlass/tree/main/include/cutlass/pipeline/sm100_pipeline.hpp). Cluster launch control queries can be pipelined and mananged by an asynchronous pipeline with producer-consumer relationship (See
+Please refer to the `PipelineCLCFetchAsync` pipeline class defined in [Cluster launch control pipeline class](https://github.com/NVIDIA/cutlass/tree/main/include/cutlass/pipeline/sm100_pipeline.hpp). Cluster launch control queries can be pipelined and managed by an asynchronous pipeline with producer-consumer relationship (See
 [pipeline](pipeline.md) document). The producer is the scheduler warp of the 0th CTA in the cluster and the consumers are all warps that need `ClcID`s.

 To setup a CLC pipeline correctly, we need to make sure the params are set to the right values: