Commit faf248b: Turn off nb execution

1 parent ead87d1 commit faf248b

2 files changed (+13, -13 lines)

cuda_bindings/docs/source/conf.py

Lines changed: 1 addition & 1 deletion

````diff
@@ -32,7 +32,7 @@
 # ones.
 extensions = ["sphinx.ext.autodoc", "sphinx.ext.napoleon", "myst_nb", "enum_tools.autoenum"]
 
-jupyter_execute_notebooks = "force"
+nb_execution_mode = "off"
 numfig = True
 
 # Add any paths that contain templates here, relative to this directory.
````
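For context, `jupyter_execute_notebooks` is the legacy option name; current MyST-NB releases use `nb_execution_mode` instead. A minimal sketch of the relevant conf.py lines, with the documented execution modes noted in a comment:

```python
# conf.py (sketch): MyST-NB replaced the legacy jupyter_execute_notebooks
# setting with nb_execution_mode; documented modes include "off", "auto",
# "force", and "cache".
extensions = ["myst_nb"]
nb_execution_mode = "off"  # never execute notebooks during the docs build
```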

cuda_bindings/docs/source/overview.md

Lines changed: 12 additions & 12 deletions
````diff
@@ -48,7 +48,7 @@ API](https://docs.nvidia.com/cuda/cuda-driver-api/index.html) and
 Python package. In this example, you copy data from the host to device. You need
 [NumPy](https://numpy.org/doc/stable/contents.html) to store data on the host.
 
-```{code-cell} python
+```python
 from cuda.bindings import driver, nvrtc
 import numpy as np
 ```
````
````diff
@@ -58,7 +58,7 @@ example is provided.
 In a future release, this may automatically raise exceptions using a Python
 object model.
 
-```{code-cell} python
+```python
 def _cudaGetErrorEnum(error):
     if isinstance(error, driver.CUresult):
         err, name = driver.cuGetErrorName(error)
````
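The hunk above truncates the error-handling helper. A minimal self-contained sketch of the pattern it supports (the name `checkCudaErrors` and the unwrapping behavior are illustrative; the underlying convention, which cuda.bindings does follow, is that driver calls return a tuple with the error code first):

```python
def checkCudaErrors(result):
    # cuda.bindings driver calls return a tuple: (error_code, *values)
    err = result[0]
    if int(err) != 0:
        raise RuntimeError(f"CUDA error code {int(err)}")
    # Unwrap the payload: none -> None, one value -> the value, else the tuple
    values = result[1:]
    if len(values) == 0:
        return None
    return values[0] if len(values) == 1 else values
```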
````diff
@@ -86,7 +86,7 @@ Python that requires some understanding of CUDA C++. For more information, see
 [An Even Easier Introduction to
 CUDA](https://developer.nvidia.com/blog/even-easier-introduction-cuda/).
 
-```{code-cell} python
+```python
 saxpy = """\
 extern "C" __global__
 void saxpy(float a, float *x, float *y, float *out, size_t n)
````
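The kernel itself computes single-precision `a * x + y` elementwise; a host-side NumPy equivalent (illustrative only, not part of the diff) makes the arithmetic concrete:

```python
import numpy as np

# Host-side equivalent of the saxpy kernel: out[i] = a * x[i] + y[i]
a = np.float32(2.0)
x = np.arange(4, dtype=np.float32)  # [0, 1, 2, 3]
y = np.ones(4, dtype=np.float32)
out = a * x + y
```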
````diff
@@ -108,7 +108,7 @@ In the following code example, the Driver API is initialized so that the NVIDIA
 and GPU are accessible. Next, the GPU is queried for their compute capability. Finally,
 the program is compiled to target our local compute capability architecture with FMAD enabled.
 
-```{code-cell} python
+```python
 # Initialize CUDA Driver API
 checkCudaErrors(driver.cuInit(0))
 
````
````diff
@@ -138,7 +138,7 @@ context. CUDA contexts are analogous to host processes for the device. In the
 following code example, a handle for compute device 0 is passed to
 `cuCtxCreate` to designate that GPU for context creation.
 
-```{code-cell} python
+```python
 # Create context
 context = checkCudaErrors(driver.cuCtxCreate(0, cuDevice))
 ```
````
````diff
@@ -148,7 +148,7 @@ module. A module is analogous to dynamically loaded libraries for the device.
 After loading into the module, extract a specific kernel with
 `cuModuleGetFunction`. It is not uncommon for multiple kernels to reside in PTX.
 
-```{code-cell} python
+```python
 # Load PTX as module data and retrieve function
 ptx = np.char.array(ptx)
 # Note: Incompatible --gpu-architecture would be detected here
````
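The incompatibility noted in that comment concerns NVRTC's `--gpu-architecture` option, whose value is assembled from the compute capability queried earlier. A sketch of that flag construction (the values 8 and 6 are hypothetical, standing in for what `cuDeviceGetAttribute` would report):

```python
# Hypothetical compute capability, e.g. as queried via cuDeviceGetAttribute
major, minor = 8, 6
# NVRTC takes byte-string options such as b"--gpu-architecture=compute_86"
arch_arg = bytes(f"--gpu-architecture=compute_{major}{minor}", "ascii")
```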
````diff
@@ -161,7 +161,7 @@ application performance, you can input data on the device to eliminate data
 transfers. For completeness, this example shows how you would transfer data to
 and from the device.
 
-```{code-cell} python
+```python
 NUM_THREADS = 512  # Threads per block
 NUM_BLOCKS = 32768  # Blocks per grid
 
````
````diff
@@ -184,7 +184,7 @@ Python doesn't have a natural concept of pointers, yet `cuMemcpyHtoDAsync` exp
 `void*`. Therefore, `XX.ctypes.data` retrieves the pointer value associated with
 XX.
 
-```{code-cell} python
+```python
 dXclass = checkCudaErrors(driver.cuMemAlloc(bufferSize))
 dYclass = checkCudaErrors(driver.cuMemAlloc(bufferSize))
 dOutclass = checkCudaErrors(driver.cuMemAlloc(bufferSize))
````
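The `bufferSize` passed to `cuMemAlloc` follows from the launch configuration shown earlier; a sketch of the arithmetic (`NUM_THREADS` and `NUM_BLOCKS` repeated from the hunk above, with a `float32` element size assumed to match the kernel's `float` arguments):

```python
import numpy as np

NUM_THREADS = 512   # threads per block
NUM_BLOCKS = 32768  # blocks per grid

n = np.uint32(NUM_THREADS * NUM_BLOCKS)         # total elements per array
bufferSize = n * np.dtype(np.float32).itemsize  # bytes per device buffer
```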
````diff
@@ -209,7 +209,7 @@ Like `cuMemcpyHtoDAsync`, `cuLaunchKernel` expects `void**` in the argument list
 the earlier code example, it creates `void**` by grabbing the `void*` value of each
 individual argument and placing them into its own contiguous memory.
 
-```{code-cell} python
+```python
 # The following code example is not intuitive
 # Subject to change in a future release
 dX = np.array([int(dXclass)], dtype=np.uint64)
````
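The `void*` / `void**` packing can be demonstrated without a GPU: each argument contributes the address of its own host storage, and those addresses are gathered into one contiguous `uint64` array. A minimal illustration (not the document's full launch code; the stand-in arguments are hypothetical):

```python
import ctypes
import numpy as np

# Two stand-in kernel arguments living in host memory
a = np.array([2.0], dtype=np.float32)
x = np.array([10.0, 20.0], dtype=np.float32)

# void**: contiguous array holding the void* address of each argument
args = np.array([a.ctypes.data, x.ctypes.data], dtype=np.uint64)

# Dereferencing the first void* recovers the scalar argument's value
first = ctypes.cast(int(args[0]), ctypes.POINTER(ctypes.c_float)).contents.value
```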
````diff
@@ -222,7 +222,7 @@ args = np.array([arg.ctypes.data for arg in args], dtype=np.uint64)
 
 Now the kernel can be launched:
 
-```{code-cell} python
+```python
 checkCudaErrors(driver.cuLaunchKernel(
     kernel,
     NUM_BLOCKS,  # grid x dim
````
````diff
@@ -251,7 +251,7 @@ stream are serialized. After the call to transfer data back to the host is
 executed, `cuStreamSynchronize` is used to halt CPU execution until all operations
 in the designated stream are finished.
 
-```{code-cell} python
+```python
 # Assert values are same after running kernel
 hZ = a * hX + hY
 if not np.allclose(hOut, hZ):
````
````diff
@@ -261,7 +261,7 @@ if not np.allclose(hOut, hZ):
 Perform verification of the data to ensure correctness and finish the code with
 memory clean up.
 
-```{code-cell} python
+```python
 checkCudaErrors(driver.cuStreamDestroy(stream))
 checkCudaErrors(driver.cuMemFree(dXclass))
 checkCudaErrors(driver.cuMemFree(dYclass))
````
