
Commit 03b02a4: Add merged notebooks
1 parent 01609e0

File tree

2 files changed (+77, -65 lines)

tutorials/notebooks/shortclips/vem_tutorials_merged_for_colab.ipynb

Lines changed: 69 additions & 56 deletions
@@ -1440,14 +1440,14 @@
  "outputs": [],
  "source": [
  "from scipy.stats import zscore\n",
+ "from voxelwise_tutorials.utils import zscore_runs\n",
  "\n",
  "# indice of first sample of each run\n",
  "run_onsets = load_hdf5_array(file_name, key=\"run_onsets\")\n",
  "print(run_onsets)\n",
  "\n",
  "# zscore each training run separately\n",
- "Y_train = np.split(Y_train, run_onsets[1:])\n",
- "Y_train = np.concatenate([zscore(run, axis=0) for run in Y_train], axis=0)\n",
+ "Y_train = zscore_runs(Y_train, run_onsets)\n",
  "# zscore each test run separately\n",
  "Y_test = zscore(Y_test, axis=1)"
  ]
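The lines removed in this hunk show exactly what the new helper replaces. As a rough sketch (assuming `zscore_runs` simply packages the old two-line idiom; the real `voxelwise_tutorials.utils.zscore_runs` may differ in details), the helper can be written as:

```python
import numpy as np
from scipy.stats import zscore


def zscore_runs(Y, run_onsets):
    """Z-score each run separately, then reassemble.

    Hypothetical re-implementation based on the two lines this commit
    removes; the real voxelwise_tutorials.utils.zscore_runs may differ.
    Y: array of shape (n_samples, n_voxels).
    run_onsets: index of the first sample of each run (first entry is 0).
    """
    # split at every onset after the first, giving one array per run
    runs = np.split(Y, run_onsets[1:])
    # z-score each run along the time axis, then stitch the runs back
    return np.concatenate([zscore(run, axis=0) for run in runs], axis=0)
```

Factoring the idiom into one helper avoids repeating the split/concatenate pair in every notebook cell this diff touches.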
@@ -1474,7 +1474,6 @@
  "source": [
  "Y_test = Y_test.mean(0)\n",
  "# We need to zscore the test data again, because we took the mean across repetitions.\n",
- "# This averaging step makes the standard deviation approximately equal to 1/sqrt(n_repeats)\n",
  "Y_test = zscore(Y_test, axis=0)\n",
  "\n",
  "print(\"(n_samples_test, n_voxels) =\", Y_test.shape)"
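The deleted comment stated that averaging across repetitions shrinks the standard deviation to roughly 1/sqrt(n_repeats), which is why the test data must be z-scored again. A quick numerical check (with simulated unit-variance noise, not the tutorial's actual responses) confirms that factor:

```python
import numpy as np

rng = np.random.default_rng(0)
n_repeats, n_samples = 10, 100_000

# independent, unit-variance "repeats" of the same test run
Y = rng.standard_normal((n_repeats, n_samples))

# averaging across repetitions shrinks the noise standard deviation
Y_mean = Y.mean(axis=0)
print(Y_mean.std())  # close to 1 / sqrt(10), i.e. about 0.316
```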
@@ -2117,7 +2116,7 @@
  "Similarly to {cite:t}`huth2012`, we correct the coefficients of features linked by a\n",
  "semantic relationship. When building the wordnet features, if a frame was\n",
  "labeled with `wolf`, the authors automatically added the semantically linked\n",
- "categories `canine`, `carnivore`, `placental mammal`, `mamma`, `vertebrate`,\n",
+ "categories `canine`, `carnivore`, `placental mammal`, `mammal`, `vertebrate`,\n",
  "`chordate`, `organism`, and `whole`. The authors thus argue that the same\n",
  "correction needs to be done on the coefficients.\n",
  "\n"
@@ -2413,6 +2412,7 @@
  "import numpy as np\n",
  "from scipy.stats import zscore\n",
  "from voxelwise_tutorials.io import load_hdf5_array\n",
+ "from voxelwise_tutorials.utils import zscore_runs\n",
  "\n",
  "file_name = os.path.join(directory, \"responses\", f\"{subject}_responses.hdf\")\n",
  "Y_train = load_hdf5_array(file_name, key=\"Y_train\")\n",
@@ -2425,8 +2425,7 @@
  "run_onsets = load_hdf5_array(file_name, key=\"run_onsets\")\n",
  "\n",
  "# zscore each training run separately\n",
- "Y_train = np.split(Y_train, run_onsets[1:])\n",
- "Y_train = np.concatenate([zscore(run, axis=0) for run in Y_train], axis=0)\n",
+ "Y_train = zscore_runs(Y_train, run_onsets)\n",
  "# zscore each test run separately\n",
  "Y_test = zscore(Y_test, axis=1)"
  ]
@@ -2616,14 +2615,16 @@
  "cell_type": "markdown",
  "metadata": {},
  "source": [
- "## Intermission: understanding delays\n",
+ "## Understanding delays\n",
  "\n",
  "To have an intuitive understanding of what we accomplish by delaying the\n",
  "features before model fitting, we will simulate one voxel and a single\n",
  "feature. We will then create a ``Delayer`` object (which was used in the\n",
- "previous pipeline) and visualize its effect on our single feature. Let's\n",
- "start by simulating the data.\n",
- "\n"
+ "previous pipeline) and visualize its effect on our single feature. \n",
+ "\n",
+ "Let's start by simulating the data. We assume a simple scenario in which an event in\n",
+ "our experiment occurred at $t = 20$ seconds and lasted for 10 seconds. For each timepoint, our simulated feature\n",
+ "is a simple variable that indicates whether the event occurred or not."
  ]
  },
  {
@@ -2634,71 +2635,83 @@
  },
  "outputs": [],
  "source": [
- "# number of total trs\n",
- "n_trs = 50\n",
- "# repetition time for the simulated data\n",
- "TR = 2.0\n",
- "rng = np.random.RandomState(42)\n",
- "y = rng.randn(n_trs)\n",
- "x = np.zeros(n_trs)\n",
- "# add some arbitrary value to our feature\n",
- "x[15:20] = 0.5\n",
- "x += rng.randn(n_trs) * 0.1 # add some noise\n",
+ "from voxelwise_tutorials.delays_toy import create_voxel_data\n",
  "\n",
- "# create a delayer object and delay the features\n",
- "delayer = Delayer(delays=[0, 1, 2, 3, 4])\n",
- "x_delayed = delayer.fit_transform(x[:, None])"
+ "# simulate an activation pulse at 20 s for 10 s of duration\n",
+ "simulated_X, simulated_Y, times = create_voxel_data(onset=20, duration=10)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "We next plot the simulated data. In this toy example, we assumed a \"canonical\" \n",
+ "hemodynamic response function (HRF) (a double gamma function). This is an idealized\n",
+ "HRF that is often used in the literature to model the BOLD response. In practice, \n",
+ "however, the HRF can vary significantly across brain areas.\n",
+ "\n",
+ "Because of the HRF, notice that even though the event occurred at $t = 20$ seconds, \n",
+ "the BOLD response is delayed in time. "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import matplotlib.pyplot as plt\n",
+ "from voxelwise_tutorials.delays_toy import plot_delays_toy\n",
+ "\n",
+ "plot_delays_toy(simulated_X, simulated_Y, times)\n",
+ "plt.show()"
  ]
  },
  {
  "cell_type": "markdown",
  "metadata": {},
  "source": [
- "In the next cell we are plotting six lines. The subplot at the top shows the\n",
- "simulated BOLD response, while the other subplots show the simulated feature\n",
- "at different delays. The effect of the delayer is clear: it creates multiple\n",
+ "We next create a `Delayer` object and use it to delay the simulated feature. \n",
+ "The effect of the delayer is clear: it creates multiple\n",
  "copies of the original feature shifted forward in time by how many samples we\n",
  "requested (in this case, from 0 to 4 samples, which correspond to 0, 2, 4, 6,\n",
  "and 8 s in time with a 2 s TR).\n",
  "\n",
  "When these delayed features are used to fit a voxelwise encoding model, the\n",
  "brain response $y$ at time $t$ is simultaneously modeled by the\n",
- "feature $x$ at times $t-0, t-2, t-4, t-6, t-8$. In the remaining\n",
- "of this example we will see that this method improves model prediction\n",
- "accuracy and it allows to account for the underlying shape of the hemodynamic\n",
- "response function.\n",
- "\n"
+ "feature $x$ at times $t-0, t-2, t-4, t-6, t-8$. For example, the time sample highlighted\n",
+ "in the plot below ($t = 30$ seconds) is modeled by the features at \n",
+ "$t = 30, 28, 26, 24, 22$ seconds."
  ]
  },
  {
  "cell_type": "code",
  "execution_count": null,
- "metadata": {
- "collapsed": false
- },
+ "metadata": {},
  "outputs": [],
  "source": [
- "import matplotlib.pyplot as plt\n",
+ "# create a delayer object and delay the features\n",
+ "delayer = Delayer(delays=[0, 1, 2, 3, 4])\n",
+ "simulated_X_delayed = delayer.fit_transform(simulated_X[:, None])\n",
  "\n",
- "fig, axs = plt.subplots(6, 1, figsize=(6, 6), constrained_layout=True, sharex=True)\n",
- "times = np.arange(n_trs) * TR\n",
- "\n",
- "axs[0].plot(times, y, color=\"r\")\n",
- "axs[0].set_title(\"BOLD response\")\n",
- "for i, (ax, xx) in enumerate(zip(axs.flat[1:], x_delayed.T)):\n",
- " ax.plot(times, xx, color=\"k\")\n",
- " ax.set_title(\n",
- " \"$x(t - {0:.0f})$ (feature delayed by {1} sample{2})\".format(\n",
- " i * TR, i, \"\" if i == 1 else \"s\"\n",
- " )\n",
- " )\n",
- "for ax in axs.flat:\n",
- " ax.axvline(40, color=\"gray\")\n",
- " ax.set_yticks([])\n",
- "_ = axs[-1].set_xlabel(\"Time [s]\")\n",
+ "# plot the simulated data and highlight t = 30\n",
+ "plot_delays_toy(simulated_X_delayed, simulated_Y, times, highlight=30)\n",
  "plt.show()"
  ]
  },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "This simple example shows how the delayed features take into account of the HRF. \n",
+ "This approach is often referred to as a \"finite impulse response\" (FIR) model.\n",
+ "By delaying the features, the regression model learns the weights for each voxel \n",
+ "separately. Therefore, the FIR approach is able to adapt to the shape of the HRF in each \n",
+ "voxel, without assuming a fixed canonical HRF shape. \n",
+ "As we will see in the remaining of this notebook, this approach improves model \n",
+ "prediction accuracy significantly."
+ ]
+ },
  {
  "cell_type": "markdown",
  "metadata": {},
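The Delayer behavior described in this hunk (delays `[0, 1, 2, 3, 4]` producing copies of the feature shifted forward by 0 to 8 s at a 2 s TR) can be sketched in plain NumPy. This is an illustrative stand-in, not the tutorial's actual `Delayer` transformer, which lives in the voxelwise_tutorials package:

```python
import numpy as np


def delay_features(X, delays):
    """Create delayed copies of each feature column.

    Minimal sketch of what a Delayer-style transformer does; the
    tutorial's Delayer class may differ (e.g. in padding or ordering).
    X: array of shape (n_samples, n_features).
    Returns an array of shape (n_samples, n_features * len(delays)).
    """
    n_samples, n_features = X.shape
    X_delayed = np.zeros((n_samples, n_features * len(delays)))
    for i, delay in enumerate(delays):
        beg = i * n_features
        if delay == 0:
            X_delayed[:, beg:beg + n_features] = X
        else:
            # shift forward in time; the first `delay` samples stay zero
            X_delayed[delay:, beg:beg + n_features] = X[:-delay]
    return X_delayed
```

With such delayed copies as regressors, the response at time $t$ is modeled by the feature at $t, t-2, t-4, t-6, t-8$ seconds, which is exactly the FIR reading given in the markdown cell above.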
@@ -2988,6 +3001,7 @@
  "import numpy as np\n",
  "from scipy.stats import zscore\n",
  "from voxelwise_tutorials.io import load_hdf5_array\n",
+ "from voxelwise_tutorials.utils import zscore_runs\n",
  "\n",
  "file_name = os.path.join(directory, \"responses\", f\"{subject}_responses.hdf\")\n",
  "Y_train = load_hdf5_array(file_name, key=\"Y_train\")\n",
@@ -3000,8 +3014,7 @@
  "run_onsets = load_hdf5_array(file_name, key=\"run_onsets\")\n",
  "\n",
  "# zscore each training run separately\n",
- "Y_train = np.split(Y_train, run_onsets[1:])\n",
- "Y_train = np.concatenate([zscore(run, axis=0) for run in Y_train], axis=0)\n",
+ "Y_train = zscore_runs(Y_train, run_onsets)\n",
  "# zscore each test run separately\n",
  "Y_test = zscore(Y_test, axis=1)"
  ]
@@ -3383,7 +3396,7 @@
  "semantic information.\n",
  "\n",
  "To better disentangle the two feature spaces, we developed a joint model\n",
- "called `banded ridge regression` {cite}`nunez2019,dupre2022`, which fits multiple feature spaces\n",
+ "called **banded ridge regression** {cite}`nunez2019,dupre2022`, which fits multiple feature spaces\n",
  "simultaneously with optimal regularization for each feature space. This model\n",
  "is described in the next example.\n",
  "\n"
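Banded ridge regression, mentioned in this hunk, gives each feature space its own regularization strength. A toy closed-form solver illustrates the idea (the function name and the closed-form approach are illustrative only; the tutorials fit this model with cross-validated solvers from the himalaya package):

```python
import numpy as np


def banded_ridge_solve(X_list, Y, alphas):
    """Banded ridge with fixed per-band regularization (a toy sketch).

    Minimizes ||Y - sum_i X_i b_i||^2 + sum_i alpha_i ||b_i||^2.
    Scaling each feature space X_i by 1/sqrt(alpha_i) reduces the problem
    to ordinary ridge with alpha = 1 on the stacked features.
    """
    X_scaled = np.hstack([X / np.sqrt(a) for X, a in zip(X_list, alphas)])
    n_features = X_scaled.shape[1]
    coef = np.linalg.solve(X_scaled.T @ X_scaled + np.eye(n_features),
                           X_scaled.T @ Y)
    # map the weights back to the original (unscaled) feature spaces
    coefs_per_band, start = [], 0
    for X, a in zip(X_list, alphas):
        n = X.shape[1]
        coefs_per_band.append(coef[start:start + n] / np.sqrt(a))
        start += n
    return coefs_per_band
```

In practice the per-band `alphas` are unknown and must be tuned by cross-validation, which is what makes the himalaya solvers used in the tutorials non-trivial.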
@@ -3488,6 +3501,7 @@
  "import numpy as np\n",
  "from scipy.stats import zscore\n",
  "from voxelwise_tutorials.io import load_hdf5_array\n",
+ "from voxelwise_tutorials.utils import zscore_runs\n",
  "\n",
  "file_name = os.path.join(directory, \"responses\", f\"{subject}_responses.hdf\")\n",
  "Y_train = load_hdf5_array(file_name, key=\"Y_train\")\n",
@@ -3500,8 +3514,7 @@
  "run_onsets = load_hdf5_array(file_name, key=\"run_onsets\")\n",
  "\n",
  "# zscore each training run separately\n",
- "Y_train = np.split(Y_train, run_onsets[1:])\n",
- "Y_train = np.concatenate([zscore(run, axis=0) for run in Y_train], axis=0)\n",
+ "Y_train = zscore_runs(Y_train, run_onsets)\n",
  "# zscore each test run separately\n",
  "Y_test = zscore(Y_test, axis=1)"
  ]

tutorials/notebooks/shortclips/vem_tutorials_merged_for_colab_model_fitting.ipynb

Lines changed: 8 additions & 9 deletions
@@ -843,14 +843,14 @@
  "outputs": [],
  "source": [
  "from scipy.stats import zscore\n",
+ "from voxelwise_tutorials.utils import zscore_runs\n",
  "\n",
  "# indice of first sample of each run\n",
  "run_onsets = load_hdf5_array(file_name, key=\"run_onsets\")\n",
  "print(run_onsets)\n",
  "\n",
  "# zscore each training run separately\n",
- "Y_train = np.split(Y_train, run_onsets[1:])\n",
- "Y_train = np.concatenate([zscore(run, axis=0) for run in Y_train], axis=0)\n",
+ "Y_train = zscore_runs(Y_train, run_onsets)\n",
  "# zscore each test run separately\n",
  "Y_test = zscore(Y_test, axis=1)"
  ]
@@ -877,7 +877,6 @@
  "source": [
  "Y_test = Y_test.mean(0)\n",
  "# We need to zscore the test data again, because we took the mean across repetitions.\n",
- "# This averaging step makes the standard deviation approximately equal to 1/sqrt(n_repeats)\n",
  "Y_test = zscore(Y_test, axis=0)\n",
  "\n",
  "print(\"(n_samples_test, n_voxels) =\", Y_test.shape)"
@@ -1520,7 +1519,7 @@
  "Similarly to {cite:t}`huth2012`, we correct the coefficients of features linked by a\n",
  "semantic relationship. When building the wordnet features, if a frame was\n",
  "labeled with `wolf`, the authors automatically added the semantically linked\n",
- "categories `canine`, `carnivore`, `placental mammal`, `mamma`, `vertebrate`,\n",
+ "categories `canine`, `carnivore`, `placental mammal`, `mammal`, `vertebrate`,\n",
  "`chordate`, `organism`, and `whole`. The authors thus argue that the same\n",
  "correction needs to be done on the coefficients.\n",
  "\n"
@@ -1823,6 +1822,7 @@
  "import numpy as np\n",
  "from scipy.stats import zscore\n",
  "from voxelwise_tutorials.io import load_hdf5_array\n",
+ "from voxelwise_tutorials.utils import zscore_runs\n",
  "\n",
  "file_name = os.path.join(directory, \"responses\", f\"{subject}_responses.hdf\")\n",
  "Y_train = load_hdf5_array(file_name, key=\"Y_train\")\n",
@@ -1835,8 +1835,7 @@
  "run_onsets = load_hdf5_array(file_name, key=\"run_onsets\")\n",
  "\n",
  "# zscore each training run separately\n",
- "Y_train = np.split(Y_train, run_onsets[1:])\n",
- "Y_train = np.concatenate([zscore(run, axis=0) for run in Y_train], axis=0)\n",
+ "Y_train = zscore_runs(Y_train, run_onsets)\n",
  "# zscore each test run separately\n",
  "Y_test = zscore(Y_test, axis=1)"
  ]
@@ -2218,7 +2217,7 @@
  "semantic information.\n",
  "\n",
  "To better disentangle the two feature spaces, we developed a joint model\n",
- "called `banded ridge regression` {cite}`nunez2019,dupre2022`, which fits multiple feature spaces\n",
+ "called **banded ridge regression** {cite}`nunez2019,dupre2022`, which fits multiple feature spaces\n",
  "simultaneously with optimal regularization for each feature space. This model\n",
  "is described in the next example.\n",
  "\n"
@@ -2323,6 +2322,7 @@
  "import numpy as np\n",
  "from scipy.stats import zscore\n",
  "from voxelwise_tutorials.io import load_hdf5_array\n",
+ "from voxelwise_tutorials.utils import zscore_runs\n",
  "\n",
  "file_name = os.path.join(directory, \"responses\", f\"{subject}_responses.hdf\")\n",
  "Y_train = load_hdf5_array(file_name, key=\"Y_train\")\n",
@@ -2335,8 +2335,7 @@
  "run_onsets = load_hdf5_array(file_name, key=\"run_onsets\")\n",
  "\n",
  "# zscore each training run separately\n",
- "Y_train = np.split(Y_train, run_onsets[1:])\n",
- "Y_train = np.concatenate([zscore(run, axis=0) for run in Y_train], axis=0)\n",
+ "Y_train = zscore_runs(Y_train, run_onsets)\n",
  "# zscore each test run separately\n",
  "Y_test = zscore(Y_test, axis=1)"
  ]
