|
1440 | 1440 | "outputs": [], |
1441 | 1441 | "source": [ |
1442 | 1442 | "from scipy.stats import zscore\n", |
| 1443 | + "from voxelwise_tutorials.utils import zscore_runs\n", |
1443 | 1444 | "\n", |
1444 | 1445 | "# indices of the first sample of each run\n", |
1445 | 1446 | "run_onsets = load_hdf5_array(file_name, key=\"run_onsets\")\n", |
1446 | 1447 | "print(run_onsets)\n", |
1447 | 1448 | "\n", |
1448 | 1449 | "# zscore each training run separately\n", |
1449 | | - "Y_train = np.split(Y_train, run_onsets[1:])\n", |
1450 | | - "Y_train = np.concatenate([zscore(run, axis=0) for run in Y_train], axis=0)\n", |
| 1450 | + "Y_train = zscore_runs(Y_train, run_onsets)\n", |
1451 | 1451 | "# zscore each test run separately\n", |
1452 | 1452 | "Y_test = zscore(Y_test, axis=1)" |
1453 | 1453 | ] |
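For reference, here is a minimal sketch of what a helper like `zscore_runs` presumably does, matching the per-run z-scoring that the removed lines performed by hand (the actual implementation in `voxelwise_tutorials.utils` may differ):

```python
import numpy as np
from scipy.stats import zscore

def zscore_runs_sketch(data, run_onsets):
    """Z-score each run separately over time (illustrative sketch, not the library code)."""
    # split the samples at each run onset (the first onset is 0, so we drop it)
    runs = np.split(data, run_onsets[1:])
    # z-score each run along the time axis, then stitch the runs back together
    return np.concatenate([zscore(run, axis=0) for run in runs], axis=0)

# toy example: two runs of 5 samples each, 3 voxels
data = np.random.randn(10, 3)
print(zscore_runs_sketch(data, np.array([0, 5])).shape)  # (10, 3)
```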
|
1474 | 1474 | "source": [ |
1475 | 1475 | "Y_test = Y_test.mean(0)\n", |
1476 | 1476 | "# We need to zscore the test data again, because we took the mean across repetitions.\n", |
1477 | | - "# This averaging step makes the standard deviation approximately equal to 1/sqrt(n_repeats)\n", |
1478 | 1477 | "Y_test = zscore(Y_test, axis=0)\n", |
1479 | 1478 | "\n", |
1480 | 1479 | "print(\"(n_samples_test, n_voxels) =\", Y_test.shape)" |
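A quick numerical illustration of why this second z-scoring is needed: averaging z-scored repetitions shrinks the standard deviation, roughly by $1/\sqrt{n_\mathrm{repeats}}$ when the repetitions behave like independent noise. The snippet below is a toy check with made-up numbers, not part of the dataset:

```python
import numpy as np
from scipy.stats import zscore

rng = np.random.RandomState(0)
n_repeats, n_samples = 10, 1000
# independent noise repetitions, z-scored over time like Y_test
repeats = zscore(rng.randn(n_repeats, n_samples), axis=1)
averaged = repeats.mean(0)
print(averaged.std())          # about 1 / sqrt(10) ~ 0.32
print(zscore(averaged).std())  # back to 1 after re-z-scoring
```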
|
2117 | 2116 | "Similarly to {cite:t}`huth2012`, we correct the coefficients of features linked by a\n", |
2118 | 2117 | "semantic relationship. When building the wordnet features, if a frame was\n", |
2119 | 2118 | "labeled with `wolf`, the authors automatically added the semantically linked\n", |
2120 | | - "categories `canine`, `carnivore`, `placental mammal`, `mamma`, `vertebrate`,\n", |
| 2119 | + "categories `canine`, `carnivore`, `placental mammal`, `mammal`, `vertebrate`,\n", |
2121 | 2120 | "`chordate`, `organism`, and `whole`. The authors thus argue that the same\n", |
2122 | 2121 | "correction needs to be done on the coefficients.\n", |
2123 | 2122 | "\n" |
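As an illustration of this correction, here is a minimal sketch assuming a hypothetical hypernym map: each category's coefficient receives the summed coefficients of its hypernyms, mirroring how the hypernym labels were added when the features were built. The helper name, the map, and the coefficient values are all made up; the tutorial's actual correction code may differ:

```python
import numpy as np

# hypothetical hypernym map: each category lists its WordNet ancestors
hypernyms = {
    "wolf": ["canine", "carnivore", "placental mammal", "mammal",
             "vertebrate", "chordate", "organism", "whole"],
    "canine": ["carnivore", "placental mammal", "mammal",
               "vertebrate", "chordate", "organism", "whole"],
}

def correct_coefficients_sketch(coef, categories, hypernyms):
    """Add each category's hypernym coefficients to its own (illustrative sketch)."""
    index = {name: ii for ii, name in enumerate(categories)}
    corrected = coef.copy()
    for name, parents in hypernyms.items():
        for parent in parents:
            if name in index and parent in index:
                corrected[index[name]] += coef[index[parent]]
    return corrected

categories = ["wolf", "canine", "carnivore", "placental mammal", "mammal",
              "vertebrate", "chordate", "organism", "whole"]
coef = np.random.randn(len(categories))  # one toy coefficient per category
print(correct_coefficients_sketch(coef, categories, hypernyms))
```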
|
2413 | 2412 | "import numpy as np\n", |
2414 | 2413 | "from scipy.stats import zscore\n", |
2415 | 2414 | "from voxelwise_tutorials.io import load_hdf5_array\n", |
| 2415 | + "from voxelwise_tutorials.utils import zscore_runs\n", |
2416 | 2416 | "\n", |
2417 | 2417 | "file_name = os.path.join(directory, \"responses\", f\"{subject}_responses.hdf\")\n", |
2418 | 2418 | "Y_train = load_hdf5_array(file_name, key=\"Y_train\")\n", |
|
2425 | 2425 | "run_onsets = load_hdf5_array(file_name, key=\"run_onsets\")\n", |
2426 | 2426 | "\n", |
2427 | 2427 | "# zscore each training run separately\n", |
2428 | | - "Y_train = np.split(Y_train, run_onsets[1:])\n", |
2429 | | - "Y_train = np.concatenate([zscore(run, axis=0) for run in Y_train], axis=0)\n", |
| 2428 | + "Y_train = zscore_runs(Y_train, run_onsets)\n", |
2430 | 2429 | "# zscore each test run separately\n", |
2431 | 2430 | "Y_test = zscore(Y_test, axis=1)" |
2432 | 2431 | ] |
|
2616 | 2615 | "cell_type": "markdown", |
2617 | 2616 | "metadata": {}, |
2618 | 2617 | "source": [ |
2619 | | - "## Intermission: understanding delays\n", |
| 2618 | + "## Understanding delays\n", |
2620 | 2619 | "\n", |
2621 | 2620 | "To have an intuitive understanding of what we accomplish by delaying the\n", |
2622 | 2621 | "features before model fitting, we will simulate one voxel and a single\n", |
2623 | 2622 | "feature. We will then create a ``Delayer`` object (which was used in the\n", |
2624 | | - "previous pipeline) and visualize its effect on our single feature. Let's\n", |
2625 | | - "start by simulating the data.\n", |
2626 | | - "\n" |
| 2623 | + "previous pipeline) and visualize its effect on our single feature. \n", |
| 2624 | + "\n", |
| 2625 | + "Let's start by simulating the data. We assume a simple scenario in which an event in\n", |
| 2626 | + "our experiment occurred at $t = 20$ seconds and lasted for 10 seconds. For each timepoint, our simulated feature\n", |
| 2627 | + "is a simple variable that indicates whether the event occurred or not." |
2627 | 2628 | ] |
2628 | 2629 | }, |
2629 | 2630 | { |
|
2634 | 2635 | }, |
2635 | 2636 | "outputs": [], |
2636 | 2637 | "source": [ |
2637 | | - "# number of total trs\n", |
2638 | | - "n_trs = 50\n", |
2639 | | - "# repetition time for the simulated data\n", |
2640 | | - "TR = 2.0\n", |
2641 | | - "rng = np.random.RandomState(42)\n", |
2642 | | - "y = rng.randn(n_trs)\n", |
2643 | | - "x = np.zeros(n_trs)\n", |
2644 | | - "# add some arbitrary value to our feature\n", |
2645 | | - "x[15:20] = 0.5\n", |
2646 | | - "x += rng.randn(n_trs) * 0.1 # add some noise\n", |
| 2638 | + "from voxelwise_tutorials.delays_toy import create_voxel_data\n", |
2647 | 2639 | "\n", |
2648 | | - "# create a delayer object and delay the features\n", |
2649 | | - "delayer = Delayer(delays=[0, 1, 2, 3, 4])\n", |
2650 | | - "x_delayed = delayer.fit_transform(x[:, None])" |
| 2640 | + "# simulate an activation pulse at 20 s for 10 s of duration\n", |
| 2641 | + "simulated_X, simulated_Y, times = create_voxel_data(onset=20, duration=10)" |
| 2642 | + ] |
| 2643 | + }, |
| 2644 | + { |
| 2645 | + "cell_type": "markdown", |
| 2646 | + "metadata": {}, |
| 2647 | + "source": [ |
| 2648 | + "We next plot the simulated data. In this toy example, we assumed a \"canonical\" \n", |
| 2649 | + "hemodynamic response function (HRF) (a double gamma function). This is an idealized\n", |
| 2650 | + "HRF that is often used in the literature to model the BOLD response. In practice, \n", |
| 2651 | + "however, the HRF can vary significantly across brain areas.\n", |
| 2652 | + "\n", |
| 2653 | + "Because of the HRF, notice that even though the event occurred at $t = 20$ seconds, \n", |
| 2654 | + "the BOLD response is delayed in time. " |
| 2655 | + ] |
| 2656 | + }, |
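As a side note, a common parameterization of such a double gamma HRF combines two gamma densities, one for the peak (around 5-6 s) and a smaller one for the late undershoot (around 15-16 s). The sketch below uses SPM-like parameters as an assumption; the exact shape used by `create_voxel_data` may differ:

```python
import numpy as np
from scipy.stats import gamma

def double_gamma_hrf(times, peak=6.0, undershoot=16.0, ratio=1.0 / 6.0):
    """Canonical-style double gamma HRF (sketch with assumed SPM-like parameters)."""
    hrf = gamma.pdf(times, peak) - ratio * gamma.pdf(times, undershoot)
    return hrf / np.abs(hrf).max()  # normalize the peak to 1

times_hrf = np.arange(0, 32, 2.0)  # one value per TR of 2 s
print(np.round(double_gamma_hrf(times_hrf), 2))
```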
| 2657 | + { |
| 2658 | + "cell_type": "code", |
| 2659 | + "execution_count": null, |
| 2660 | + "metadata": {}, |
| 2661 | + "outputs": [], |
| 2662 | + "source": [ |
| 2663 | + "import matplotlib.pyplot as plt\n", |
| 2664 | + "from voxelwise_tutorials.delays_toy import plot_delays_toy\n", |
| 2665 | + "\n", |
| 2666 | + "plot_delays_toy(simulated_X, simulated_Y, times)\n", |
| 2667 | + "plt.show()" |
2651 | 2668 | ] |
2652 | 2669 | }, |
2653 | 2670 | { |
2654 | 2671 | "cell_type": "markdown", |
2655 | 2672 | "metadata": {}, |
2656 | 2673 | "source": [ |
2657 | | - "In the next cell we are plotting six lines. The subplot at the top shows the\n", |
2658 | | - "simulated BOLD response, while the other subplots show the simulated feature\n", |
2659 | | - "at different delays. The effect of the delayer is clear: it creates multiple\n", |
| 2674 | + "We next create a `Delayer` object and use it to delay the simulated feature. \n", |
| 2675 | + "The effect of the delayer is clear: it creates multiple\n", |
2660 | 2676 | "copies of the original feature shifted forward in time by how many samples we\n", |
2661 | 2677 | "requested (in this case, from 0 to 4 samples, which correspond to 0, 2, 4, 6,\n", |
2662 | 2678 | "and 8 s in time with a 2 s TR).\n", |
2663 | 2679 | "\n", |
2664 | 2680 | "When these delayed features are used to fit a voxelwise encoding model, the\n", |
2665 | 2681 | "brain response $y$ at time $t$ is simultaneously modeled by the\n", |
2666 | | - "feature $x$ at times $t-0, t-2, t-4, t-6, t-8$. In the remaining\n", |
2667 | | - "of this example we will see that this method improves model prediction\n", |
2668 | | - "accuracy and it allows to account for the underlying shape of the hemodynamic\n", |
2669 | | - "response function.\n", |
2670 | | - "\n" |
| 2682 | + "feature $x$ at times $t-0, t-2, t-4, t-6, t-8$. For example, the time sample highlighted\n", |
| 2683 | + "in the plot below ($t = 30$ seconds) is modeled by the features at \n", |
| 2684 | + "$t = 30, 28, 26, 24, 22$ seconds." |
2671 | 2685 | ] |
2672 | 2686 | }, |
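Written out for one voxel, this delayed regression is a small finite impulse response model: with a 2 s TR and delays of 0 to 4 samples, the learned weights $\beta_0, \dots, \beta_4$ combine the current and past values of the feature (with a noise term $\epsilon_t$):

$$
y(t) = \sum_{d=0}^{4} \beta_d \, x(t - 2d) + \epsilon_t
= \beta_0 x(t) + \beta_1 x(t-2) + \beta_2 x(t-4) + \beta_3 x(t-6) + \beta_4 x(t-8) + \epsilon_t
$$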
2673 | 2687 | { |
2674 | 2688 | "cell_type": "code", |
2675 | 2689 | "execution_count": null, |
2676 | | - "metadata": { |
2677 | | - "collapsed": false |
2678 | | - }, |
| 2690 | + "metadata": {}, |
2679 | 2691 | "outputs": [], |
2680 | 2692 | "source": [ |
2681 | | - "import matplotlib.pyplot as plt\n", |
| 2693 | + "# create a delayer object and delay the features\n", |
| 2694 | + "delayer = Delayer(delays=[0, 1, 2, 3, 4])\n", |
| 2695 | + "simulated_X_delayed = delayer.fit_transform(simulated_X[:, None])\n", |
2682 | 2696 | "\n", |
2683 | | - "fig, axs = plt.subplots(6, 1, figsize=(6, 6), constrained_layout=True, sharex=True)\n", |
2684 | | - "times = np.arange(n_trs) * TR\n", |
2685 | | - "\n", |
2686 | | - "axs[0].plot(times, y, color=\"r\")\n", |
2687 | | - "axs[0].set_title(\"BOLD response\")\n", |
2688 | | - "for i, (ax, xx) in enumerate(zip(axs.flat[1:], x_delayed.T)):\n", |
2689 | | - " ax.plot(times, xx, color=\"k\")\n", |
2690 | | - " ax.set_title(\n", |
2691 | | - " \"$x(t - {0:.0f})$ (feature delayed by {1} sample{2})\".format(\n", |
2692 | | - " i * TR, i, \"\" if i == 1 else \"s\"\n", |
2693 | | - " )\n", |
2694 | | - " )\n", |
2695 | | - "for ax in axs.flat:\n", |
2696 | | - " ax.axvline(40, color=\"gray\")\n", |
2697 | | - " ax.set_yticks([])\n", |
2698 | | - "_ = axs[-1].set_xlabel(\"Time [s]\")\n", |
| 2697 | + "# plot the simulated data and highlight t = 30\n", |
| 2698 | + "plot_delays_toy(simulated_X_delayed, simulated_Y, times, highlight=30)\n", |
2699 | 2699 | "plt.show()" |
2700 | 2700 | ] |
2701 | 2701 | }, |
| 2702 | + { |
| 2703 | + "cell_type": "markdown", |
| 2704 | + "metadata": {}, |
| 2705 | + "source": [ |
| 2706 | + "This simple example shows how the delayed features take into account of the HRF. \n", |
| 2707 | + "This approach is often referred to as a \"finite impulse response\" (FIR) model.\n", |
| 2708 | + "By delaying the features, the regression model learns the weights for each voxel \n", |
| 2709 | + "separately. Therefore, the FIR approach is able to adapt to the shape of the HRF in each \n", |
| 2710 | + "voxel, without assuming a fixed canonical HRF shape. \n", |
| 2711 | + "As we will see in the remaining of this notebook, this approach improves model \n", |
| 2712 | + "prediction accuracy significantly." |
| 2713 | + ] |
| 2714 | + }, |
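To make this concrete, the small sketch below reuses the toy variables defined above, fits a plain ridge regression on the delayed feature, and prints the learned weight per delay; for a single feature, these weights trace an FIR estimate of the voxel's HRF sampled every TR. The scikit-learn `Ridge` estimator is used here only for illustration; it is not the pipeline fitted later in this notebook:

```python
from sklearn.linear_model import Ridge

# fit the toy voxel from the delayed copies of the single feature
ridge = Ridge(alpha=1.0)
ridge.fit(simulated_X_delayed, simulated_Y)

# one weight per delay (0, 2, 4, 6, 8 s): an FIR estimate of this voxel's HRF
for delay, weight in zip([0, 2, 4, 6, 8], ridge.coef_):
    print(f"delay {delay} s: weight {weight:.2f}")
```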
2702 | 2715 | { |
2703 | 2716 | "cell_type": "markdown", |
2704 | 2717 | "metadata": {}, |
|
2988 | 3001 | "import numpy as np\n", |
2989 | 3002 | "from scipy.stats import zscore\n", |
2990 | 3003 | "from voxelwise_tutorials.io import load_hdf5_array\n", |
| 3004 | + "from voxelwise_tutorials.utils import zscore_runs\n", |
2991 | 3005 | "\n", |
2992 | 3006 | "file_name = os.path.join(directory, \"responses\", f\"{subject}_responses.hdf\")\n", |
2993 | 3007 | "Y_train = load_hdf5_array(file_name, key=\"Y_train\")\n", |
|
3000 | 3014 | "run_onsets = load_hdf5_array(file_name, key=\"run_onsets\")\n", |
3001 | 3015 | "\n", |
3002 | 3016 | "# zscore each training run separately\n", |
3003 | | - "Y_train = np.split(Y_train, run_onsets[1:])\n", |
3004 | | - "Y_train = np.concatenate([zscore(run, axis=0) for run in Y_train], axis=0)\n", |
| 3017 | + "Y_train = zscore_runs(Y_train, run_onsets)\n", |
3005 | 3018 | "# zscore each test run separately\n", |
3006 | 3019 | "Y_test = zscore(Y_test, axis=1)" |
3007 | 3020 | ] |
|
3383 | 3396 | "semantic information.\n", |
3384 | 3397 | "\n", |
3385 | 3398 | "To better disentangle the two feature spaces, we developed a joint model\n", |
3386 | | - "called `banded ridge regression` {cite}`nunez2019,dupre2022`, which fits multiple feature spaces\n", |
| 3399 | + "called **banded ridge regression** {cite}`nunez2019,dupre2022`, which fits multiple feature spaces\n", |
3387 | 3400 | "simultaneously with optimal regularization for each feature space. This model\n", |
3388 | 3401 | "is described in the next example.\n", |
3389 | 3402 | "\n" |
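As a preview of the idea, here is a minimal numpy sketch of banded ridge regression with two feature spaces: each band gets its own regularization strength, chosen on a held-out split. This is only a conceptual illustration with made-up data, not the himalaya solvers used in the next example:

```python
import numpy as np

rng = np.random.RandomState(0)
n_train, n_val, n_a, n_b = 200, 100, 20, 30
X_a = rng.randn(n_train + n_val, n_a)  # feature space A (e.g., motion energy)
X_b = rng.randn(n_train + n_val, n_b)  # feature space B (e.g., wordnet)
y = X_a @ rng.randn(n_a) + 0.1 * X_b @ rng.randn(n_b) + rng.randn(n_train + n_val)

def banded_ridge_fit(X_a, X_b, y, alpha_a, alpha_b):
    """Closed-form ridge with a separate penalty for each feature space."""
    X = np.hstack([X_a, X_b])
    penalties = np.concatenate([np.full(X_a.shape[1], alpha_a),
                                np.full(X_b.shape[1], alpha_b)])
    return np.linalg.solve(X.T @ X + np.diag(penalties), X.T @ y)

# grid search over one regularization per band, scored on a validation split
best = None
for alpha_a in [0.1, 1, 10, 100]:
    for alpha_b in [0.1, 1, 10, 100]:
        coef = banded_ridge_fit(X_a[:n_train], X_b[:n_train], y[:n_train],
                                alpha_a, alpha_b)
        pred = np.hstack([X_a[n_train:], X_b[n_train:]]) @ coef
        score = np.corrcoef(pred, y[n_train:])[0, 1]
        if best is None or score > best[0]:
            best = (score, alpha_a, alpha_b)
print("best validation correlation %.2f with alphas %s" % (best[0], best[1:]))
```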
|
3488 | 3501 | "import numpy as np\n", |
3489 | 3502 | "from scipy.stats import zscore\n", |
3490 | 3503 | "from voxelwise_tutorials.io import load_hdf5_array\n", |
| 3504 | + "from voxelwise_tutorials.utils import zscore_runs\n", |
3491 | 3505 | "\n", |
3492 | 3506 | "file_name = os.path.join(directory, \"responses\", f\"{subject}_responses.hdf\")\n", |
3493 | 3507 | "Y_train = load_hdf5_array(file_name, key=\"Y_train\")\n", |
|
3500 | 3514 | "run_onsets = load_hdf5_array(file_name, key=\"run_onsets\")\n", |
3501 | 3515 | "\n", |
3502 | 3516 | "# zscore each training run separately\n", |
3503 | | - "Y_train = np.split(Y_train, run_onsets[1:])\n", |
3504 | | - "Y_train = np.concatenate([zscore(run, axis=0) for run in Y_train], axis=0)\n", |
| 3517 | + "Y_train = zscore_runs(Y_train, run_onsets)\n", |
3505 | 3518 | "# zscore each test run separately\n", |
3506 | 3519 | "Y_test = zscore(Y_test, axis=1)" |
3507 | 3520 | ] |
|