|
84 | 84 | "import numpy as np\n", |
85 | 85 | "from scipy.stats import zscore\n", |
86 | 86 | "from voxelwise_tutorials.io import load_hdf5_array\n", |
| 87 | + "from voxelwise_tutorials.utils import zscore_runs\n", |
87 | 88 | "\n", |
88 | 89 | "file_name = os.path.join(directory, \"responses\", f\"{subject}_responses.hdf\")\n", |
89 | 90 | "Y_train = load_hdf5_array(file_name, key=\"Y_train\")\n", |
|
96 | 97 | "run_onsets = load_hdf5_array(file_name, key=\"run_onsets\")\n", |
97 | 98 | "\n", |
98 | 99 | "# zscore each training run separately\n", |
99 | | - "Y_train = np.split(Y_train, run_onsets[1:])\n", |
100 | | - "Y_train = np.concatenate([zscore(run, axis=0) for run in Y_train], axis=0)\n", |
| 100 | + "Y_train = zscore_runs(Y_train, run_onsets)\n", |
101 | 101 | "# zscore each test run separately\n", |
102 | 102 | "Y_test = zscore(Y_test, axis=1)" |
103 | 103 | ] |
|
287 | 287 | "cell_type": "markdown", |
288 | 288 | "metadata": {}, |
289 | 289 | "source": [ |
290 | | - "## Intermission: understanding delays\n", |
| 290 | + "## Understanding delays\n", |
291 | 291 | "\n", |
292 | 292 | "To have an intuitive understanding of what we accomplish by delaying the\n", |
293 | 293 | "features before model fitting, we will simulate one voxel and a single\n", |
294 | 294 | "feature. We will then create a ``Delayer`` object (which was used in the\n", |
295 | | - "previous pipeline) and visualize its effect on our single feature. Let's\n", |
296 | | - "start by simulating the data.\n", |
297 | | - "\n" |
| 295 | + "previous pipeline) and visualize its effect on our single feature. \n", |
| 296 | + "\n", |
| 297 | + "Let's start by simulating the data. We assume a simple scenario in which an event in\n", |
| 298 | + "our experiment occurred at $t = 20$ seconds and lasted for 10 seconds. For each timepoint, our simulated feature\n", |
| 299 | + "is a simple variable that indicates whether the event occurred or not." |
298 | 300 | ] |
299 | 301 | }, |
300 | 302 | { |
|
305 | 307 | }, |
306 | 308 | "outputs": [], |
307 | 309 | "source": [ |
308 | | - "# number of total trs\n", |
309 | | - "n_trs = 50\n", |
310 | | - "# repetition time for the simulated data\n", |
311 | | - "TR = 2.0\n", |
312 | | - "rng = np.random.RandomState(42)\n", |
313 | | - "y = rng.randn(n_trs)\n", |
314 | | - "x = np.zeros(n_trs)\n", |
315 | | - "# add some arbitrary value to our feature\n", |
316 | | - "x[15:20] = 0.5\n", |
317 | | - "x += rng.randn(n_trs) * 0.1 # add some noise\n", |
| 310 | + "from voxelwise_tutorials.delays_toy import create_voxel_data\n", |
318 | 311 | "\n", |
319 | | - "# create a delayer object and delay the features\n", |
320 | | - "delayer = Delayer(delays=[0, 1, 2, 3, 4])\n", |
321 | | - "x_delayed = delayer.fit_transform(x[:, None])" |
| 312 | + "# simulate an activation pulse at 20 s for 10 s of duration\n", |
| 313 | + "simulated_X, simulated_Y, times = create_voxel_data(onset=20, duration=10)" |
| 314 | + ] |
| 315 | + }, |
| 316 | + { |
| 317 | + "cell_type": "markdown", |
| 318 | + "metadata": {}, |
| 319 | + "source": [ |
| 320 | + "We next plot the simulated data. In this toy example, we assumed a \"canonical\" \n", |
| 321 | + "hemodynamic response function (HRF) (a double gamma function). This is an idealized\n", |
| 322 | + "HRF that is often used in the literature to model the BOLD response. In practice, \n", |
| 323 | + "however, the HRF can vary significantly across brain areas.\n", |
| 324 | + "\n", |
| 325 | + "Because of the HRF, notice that even though the event occurred at $t = 20$ seconds, \n", |
| 326 | + "the BOLD response is delayed in time. " |
| 327 | + ] |
| 328 | + }, |
| 329 | + { |
| 330 | + "cell_type": "code", |
| 331 | + "execution_count": null, |
| 332 | + "metadata": {}, |
| 333 | + "outputs": [], |
| 334 | + "source": [ |
| 335 | + "import matplotlib.pyplot as plt\n", |
| 336 | + "from voxelwise_tutorials.delays_toy import plot_delays_toy\n", |
| 337 | + "\n", |
| 338 | + "plot_delays_toy(simulated_X, simulated_Y, times)\n", |
| 339 | + "plt.show()" |
322 | 340 | ] |
323 | 341 | }, |
324 | 342 | { |
325 | 343 | "cell_type": "markdown", |
326 | 344 | "metadata": {}, |
327 | 345 | "source": [ |
328 | | - "In the next cell we are plotting six lines. The subplot at the top shows the\n", |
329 | | - "simulated BOLD response, while the other subplots show the simulated feature\n", |
330 | | - "at different delays. The effect of the delayer is clear: it creates multiple\n", |
| 346 | + "We next create a `Delayer` object and use it to delay the simulated feature. \n", |
| 347 | + "The effect of the delayer is clear: it creates multiple\n", |
331 | 348 | "copies of the original feature shifted forward in time by how many samples we\n", |
332 | 349 | "requested (in this case, from 0 to 4 samples, which correspond to 0, 2, 4, 6,\n", |
333 | 350 | "and 8 s in time with a 2 s TR).\n", |
334 | 351 | "\n", |
335 | 352 | "When these delayed features are used to fit a voxelwise encoding model, the\n", |
336 | 353 | "brain response $y$ at time $t$ is simultaneously modeled by the\n", |
337 | | - "feature $x$ at times $t-0, t-2, t-4, t-6, t-8$. In the remaining\n", |
338 | | - "of this example we will see that this method improves model prediction\n", |
339 | | - "accuracy and it allows to account for the underlying shape of the hemodynamic\n", |
340 | | - "response function.\n", |
341 | | - "\n" |
| 354 | + "feature $x$ at times $t-0, t-2, t-4, t-6, t-8$. For example, the time sample highlighted\n", |
| 355 | + "in the plot below ($t = 30$ seconds) is modeled by the features at \n", |
| 356 | + "$t = 30, 28, 26, 24, 22$ seconds." |
342 | 357 | ] |
343 | 358 | }, |
344 | 359 | { |
345 | 360 | "cell_type": "code", |
346 | 361 | "execution_count": null, |
347 | | - "metadata": { |
348 | | - "collapsed": false |
349 | | - }, |
| 362 | + "metadata": {}, |
350 | 363 | "outputs": [], |
351 | 364 | "source": [ |
352 | | - "import matplotlib.pyplot as plt\n", |
| 365 | + "# create a delayer object and delay the features\n", |
| 366 | + "delayer = Delayer(delays=[0, 1, 2, 3, 4])\n", |
| 367 | + "simulated_X_delayed = delayer.fit_transform(simulated_X[:, None])\n", |
353 | 368 | "\n", |
354 | | - "fig, axs = plt.subplots(6, 1, figsize=(6, 6), constrained_layout=True, sharex=True)\n", |
355 | | - "times = np.arange(n_trs) * TR\n", |
356 | | - "\n", |
357 | | - "axs[0].plot(times, y, color=\"r\")\n", |
358 | | - "axs[0].set_title(\"BOLD response\")\n", |
359 | | - "for i, (ax, xx) in enumerate(zip(axs.flat[1:], x_delayed.T)):\n", |
360 | | - " ax.plot(times, xx, color=\"k\")\n", |
361 | | - " ax.set_title(\n", |
362 | | - " \"$x(t - {0:.0f})$ (feature delayed by {1} sample{2})\".format(\n", |
363 | | - " i * TR, i, \"\" if i == 1 else \"s\"\n", |
364 | | - " )\n", |
365 | | - " )\n", |
366 | | - "for ax in axs.flat:\n", |
367 | | - " ax.axvline(40, color=\"gray\")\n", |
368 | | - " ax.set_yticks([])\n", |
369 | | - "_ = axs[-1].set_xlabel(\"Time [s]\")\n", |
| 369 | + "# plot the simulated data and highlight t = 30\n", |
| 370 | + "plot_delays_toy(simulated_X_delayed, simulated_Y, times, highlight=30)\n", |
370 | 371 | "plt.show()" |
371 | 372 | ] |
372 | 373 | }, |
| 374 | + { |
| 375 | + "cell_type": "markdown", |
| 376 | + "metadata": {}, |
| 377 | + "source": [ |
| 378 | + "This simple example shows how the delayed features take into account of the HRF. \n", |
| 379 | + "This approach is often referred to as a \"finite impulse response\" (FIR) model.\n", |
| 380 | + "By delaying the features, the regression model learns the weights for each voxel \n", |
| 381 | + "separately. Therefore, the FIR approach is able to adapt to the shape of the HRF in each \n", |
| 382 | + "voxel, without assuming a fixed canonical HRF shape. \n", |
| 383 | + "As we will see in the remaining of this notebook, this approach improves model \n", |
| 384 | + "prediction accuracy significantly." |
| 385 | + ] |
| 386 | + }, |
373 | 387 | { |
374 | 388 | "cell_type": "markdown", |
375 | 389 | "metadata": {}, |
|
0 commit comments