# -*- coding: utf-8 -*-
# ---
# jupyter:
#   jupytext:
#     formats: ipynb,py:percent
#     text_representation:
#       extension: .py
#       format_name: percent
#       format_version: '1.2'
#     jupytext_version: 1.2.4
#   kernelspec:
#     display_name: Python 3
#     language: python
#     name: python3
# ---
# %% [markdown]
# # A guide to neuroimaging meta-analysis
# %% [markdown]
# ## Alexandre Perez
# Neuro-Data Science - ORIGAMI lab
# %% [markdown]
# ## Foreword
# %% [markdown]
# In neuroimaging, meta-analysis is an essential tool for synthesizing literature results and for addressing the recurrent lack of reproducibility of studies conducted on small numbers of subjects.
# This notebook gathers and explains some well-known meta-analysis techniques, discusses their limitations and applies them to real fMRI data. Neuroimaging meta-analyses are divided into two types: coordinate-based meta-analysis (CBMA) and image-based meta-analysis (IBMA); so is this notebook.
#
# **Warning**:
# In order to illustrate the techniques, operations have been coded in an **inefficient** way in this notebook for the sake of clarity. Do not reuse the provided code for real analyses. These techniques are implemented in a more efficient and complete way in the Python package NiMARE. The use of this package is strongly recommended and is illustrated at the end of the notebook.
# %% [markdown]
# ### Requirements and installation
#
# If you run this notebook on Binder, then everything is all set. However, if you run it manually, please refer to the README instructions in the repository to install the required dependencies.
# %% [markdown]
# ## Data
# %% [markdown]
# Before getting to the heart of the matter, let's load the data that will be used to illustrate the techniques. On first reading, one can jump to the next section: **Coordinate-based meta-analysis**.
# %%
# %matplotlib inline
import warnings
warnings.filterwarnings("ignore", category=FutureWarning)
import os
from os import path as osp
import multiprocessing
from joblib import Parallel, delayed
import nilearn
import numpy as np
from nilearn import masking, plotting
from nipy.labs.statistical_mapping import get_3d_peaks
import nibabel as nib
import scipy
from matplotlib import pyplot as plt
import ntpath
from nimare.dataset import Dataset
import nimare
import copy
import shutil
# Some global variables used in the notebook
input_dir = 'data/sim/' # Studies' directory at the git repository root
study_filename = 'study_'  # Template of a study image: f'{study_filename}{i}.nii.gz'
template = nilearn.datasets.load_mni152_template()
Ni, Nj, Nk = template.shape
affine = template.affine
gray_mask = masking.compute_gray_matter_mask(template)
# %% [markdown]
# ### Required data structure
# %% [markdown]
# The dataset is a set of 10 studies that have been entirely simulated from a real base image, from the study's map down to each of the subjects' maps (not reported here). The simulation was performed by adding normal noise, previously smoothed with a Gaussian kernel, to the base image. From the simulated subjects (10 for each study), the standard deviation map was derived.
#
# The data, located at the root of the repo, follow the following structure:
# ```
# root_of_repo/
# notebook.ipynb
# data/
# ...
# sim/
# study_0.nii.gz
# study_0_con.nii.gz
# study_0_se.nii.gz
# study_1.nii.gz
# study_1_con.nii.gz
# study_1_se.nii.gz
# ...
# ```
# Where
# - `study_i.nii.gz` is the z map file of the i-th study.
# - `study_i_con.nii.gz` is the contrast map file of the i-th study.
# - `study_i_se.nii.gz` is the standard deviation map of the subjects of the i-th study.
#
# Note that you can change both the data directory and the template names at the beginning of this notebook. However, your custom filenames would still have to meet the naming requirement `prefix_i_type.nii.gz` where:
# - `prefix` is custom.
# - `i` is the index of the study.
# - `type` is the map's type (z: empty string, contrast: `con`, standard deviation: `se`)
# %% [markdown]
# ### Construct list of images paths
# %%
# retrieve path of images in the data directory
img_paths = []
i = 0
while True:
    path = os.path.abspath(f'{input_dir}{study_filename}{i}.nii.gz')
    if not os.path.isfile(path):
        break
    img_paths.append(path)
    i += 1
N_img = len(img_paths)
# %% [markdown]
# ### Extract peaks
# %% [markdown]
# The provided data contain full brain images (with the same dimensions as the MNI template image). This is perfect for image-based meta-analysis methods but not for coordinate-based ones. To demonstrate the coordinate-based meta-analysis (CBMA) techniques, we need to extract the peaks' coordinates from these images. Then we will also be able to run the CBMA methods on the provided data.
#
# To do so, we use the `get_3d_peaks` function of the `nipy.labs.statistical_mapping` module of the `nipy` package.
# %%
def get_activations(path, threshold, space='ijk'):
    """
    Retrieve the activation coordinates from an image.

    Args:
        path (string or Nifti1Image): Path to or object of a
            nibabel.Nifti1Image from which to extract coordinates.
        threshold (float): Peaks under this threshold will not be detected.
        space (string): Space of coordinates. Available: 'ijk' and 'pos'.

    Returns:
        (tuple): Size 3 tuple of np.array storing respectively the X, Y and
            Z coordinates.
    """
    I, J, K = [], [], []
    try:
        img = nilearn.image.load_img(path)
    except ValueError:  # File path not found
        print(f'File {path} not found. Ignored.')
        return None
    if np.isnan(img.get_fdata()).any():
        print(f'Img {path} contains NaN. Ignored.')
        return None

    img = nilearn.image.resample_to_img(img, template)
    peaks = get_3d_peaks(img, mask=gray_mask, threshold=threshold)
    if not peaks:
        return I, J, K

    for peak in peaks:
        I.append(peak[space][0])
        J.append(peak[space][1])
        K.append(peak[space][2])
    del peaks

    I, J, K = np.array(I), np.array(J), np.array(K)
    if space == 'ijk':
        I, J, K = I.astype(int), J.astype(int), K.astype(int)
    return I, J, K
# %%
activation_thresh = 1.96
activation_peaks = [get_activations(path, activation_thresh, space='ijk')
                    for path in img_paths]
# %% [markdown]
# For convenience, we turn the peaks' coordinates into binary Nifti images and create a list of them:
# %%
binary_imgs = []
for I, J, K in activation_peaks:
    arr = np.zeros(template.shape)
    arr[I, J, K] = 1
    img = nib.Nifti1Image(arr, affine)
    binary_imgs.append(img)
# %% [markdown]
# Now that the data is properly loaded, let's talk about meta-analysis techniques.
# %% [markdown]
# ## Coordinate-based meta-analysis (CBMA)
# %% [markdown]
# The first type of neuroimaging meta-analysis technique is coordinate-based meta-analysis. These techniques only use a list of activation coordinates (which can be seen as binary maps) as input.
# This part has been written using the following references: [1] [2] [5].
# %% [markdown]
# **Context**:
# Coordinate-based meta-analysis techniques are used in the following context:
#
# One has $N$ experiments that deal with the same or related task/topic. For each of these experiments, one has a list of coordinates of activation peaks in the brain. One wants to combine these results to build a common activation map in the hope of having a stronger prior on the activation areas of a given task than each experiment can give individually.
# %% [markdown]
# ### ALE
# %% [markdown]
# The key idea of the Activation Likelihood Estimation (ALE) algorithm is to consider the activation foci not in a discrete way, but rather as the centers of probability distributions.
#
# #### Notations:
# - $N$: Number of experiments/studies.
# - $S_n$: Number of subjects in the experiment $n$.
# - $F_n$: Number of foci in the experiment $n$.
#
# %% [markdown]
# #### Pipeline
# The ALE pipeline is summarized in figure [1](#ALE_fig), and can be described as follows:
#
# 1. Gather the reported activation peaks in $N$ binary maps, with one map per study.
# 2. For each of these maps, apply the following pipeline:
#     a. Consider each peak individually and convolve it with a Gaussian kernel, resulting in one map per peak. Note that the width of the Gaussian kernel depends on the number of subjects in the considered experiment.
#     b. Merge the obtained probability maps within the given study into a single map, called a Modeled Activation (MA) map.
# 3. Merge all the MA maps across studies into a single map; this ALE map is the result of the meta-analysis.
# %% [markdown]
# <a id='ALE_fig'></a>
# 
# %% [markdown]
# **Figure 1**: The ALE pipeline. The input are $N$ binary maps, one for each of the studies.
# %% [markdown]
# Three points remain unclear for now:
# - How the gaussian kernel depends on the number of subjects.
# - How to merge probability maps into MA maps (labeled by 1 in figure 1).
# - How to merge MA maps into an ALE map (labeled by 2 in figure 1).
#
# Let's clarify these points.
# %% [markdown]
# #### ALE Kernel
# %% [markdown]
# The idea of having the kernel depend on the number of subjects in an experiment is justified by the fact that an experiment with few subjects carries more uncertainty than one with many subjects. Therefore, the lower the number of subjects in an experiment, the greater the FWHM of the Gaussian kernel.
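# %% [markdown]
# The exact FWHM values used in ALE come from empirically calibrated between-subject and between-template variances (Eickhoff et al., 2009), which are not reproduced here. As a purely illustrative sketch of the qualitative rule only (a wider kernel for smaller samples, with a made-up base width):
# %%
def illustrative_kernel_sigma(n_subjects, base_sigma_vox=6.):
    # Illustration only, NOT the published ALE calibration: the spatial
    # uncertainty of a peak shrinks roughly with the square root of the
    # sample size, and the kernel width shrinks accordingly.
    return base_sigma_vox / np.sqrt(n_subjects)


print([round(illustrative_kernel_sigma(s), 2) for s in (5, 10, 20, 40)])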
# %% [markdown]
# #### Merging probability maps (within a study)
# %% [markdown]
# Why do we split the binary map into as many maps as there are foci, leaving only one focus in each map, only to merge them back together? The convolution is linear, so why don't we simply convolve the full binary map?
# The answer is that the operation used to merge the probability maps is not linear: the MA map is derived from the probability maps by taking, at each point in space, the maximum over all the probability maps.
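# %% [markdown]
# A toy 1-D illustration of this voxel-wise maximum (with made-up probability values): summing the two overlapping profiles would exceed 1, whereas the maximum keeps a valid probability at each position.
# %%
prob_a = np.array([0., .6, .9, .6, 0.])  # probability profile of focus A
prob_b = np.array([0., 0., .6, .9, .6])  # probability profile of focus B
ma_profile = np.maximum(prob_a, prob_b)  # -> [0., .6, .9, .9, .6]
print(ma_profile)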
# %% [markdown]
# #### Merging MA maps (across studies)
# %% [markdown]
# To understand how the MA maps are merged together into the ALE map, we first have to remember the meaning of the MA maps and what we want to express in the ALE map.
# %% [markdown]
# Remember that the key point of ALE is to see the discrete foci as the centers of probability distributions. Hence, the MA map of an experiment can be thought of as a reasonable model of the distribution of the experiment's peaks. More specifically, the value associated with a given coordinate (i, j, k) represents the likelihood that an activation peak lies at this coordinate.
# If we denote $\textit{MA}_n$ the MA map associated with the experiment $n$, $$\textit{MA}_n(i, j, k) = P(\mbox{ijk activated in the experiment $n$})$$
# %% [markdown]
# A natural expectation for a meta-analysis map such as ALE is that it approximates the likelihood that a given coordinate $ijk$ is truly activated in at least one experiment. Hence, if we denote $\textit{ALE}$ the ALE map:
# $$
# \begin{equation} \label{eq1}
# \begin{split}
# \textit{ALE}(i,j,k) & = P(\mbox{$ijk$ activated by at least one experiment}) \\
# & = 1 - P(\mbox{$ijk$ not activated by any experiment}) \\
# & = 1 - P(\mbox{For each experiment $n$, $ijk$ not activated in $\textit{MA}_n$}) \\
# & = 1 - \prod_{n=1}^{N}P(\mbox{$ijk$ not activated in $\textit{MA}_n$}) \\
# & = 1 - \prod_{n=1}^{N}(1-P(\mbox{$ijk$ activated in $\textit{MA}_n$})) \\
# & = 1 - \prod_{n=1}^{N}(1 - \textit{MA}_n(i,j,k))
# \end{split}
# \end{equation}
# $$
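# %% [markdown]
# For instance, at a voxel where two studies have MA values 0.3 and 0.5 (made-up numbers), the ALE value is the probability of the union:
# %%
ma_values = np.array([.3, .5])
ale_value = 1 - np.prod(1 - ma_values)  # 1 - 0.7 * 0.5 = 0.65
print(ale_value)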
# %% [markdown]
# #### Mathematical recap
# %% [markdown]
# For the sake of simplicity and to be closer to the implementation, we will consider the mathematical objects directly in the discrete 3D brain space rather than the continuous case.
# %% [markdown]
# We recall the previous notations:
# - $N$: Number of experiments/studies.
# - $S_n$: Number of subjects in experiment $n$.
# - $F_n$: Number of foci in experiment $n$.
# %% [markdown]
# And add new ones:
# - Let $(N_i, N_j, N_k)$ be the shape of the brain space. Typically $(N_i, N_j, N_k) = (91, 109, 91)$ in the MNI space.
# - For all $n \in [1 .. N],$
# * let $B_n \in \{0, 1\}^{N_iN_jN_k}$ be the binary map associated with experiment $n$.
# * let $D_{p, n} \in \{0, 1\}^{N_iN_jN_k}$ be the binary Dirac map associated with experiment $n$ and focus $p$, for all $p \in [1 .. F_n]$; in other words, a null map with only the position of focus $p$ set to 1 (called a dirac map in figure [1](#ALE_fig)).
# * let $\textit{MA}_n \in \mathbb{R}^{N_iN_jN_k}$ be the Modeled Activation map of experiment $n$.
# * let $G_{S_n}$ be the gaussian kernel associated with experiment $n$.
# - Let $\textit{ALE} \in \mathbb{R}^{N_iN_jN_k}$ be the desired ALE map.
# %% [markdown]
# Given these notations, we have the following:
# %% [markdown]
# $$
# \begin{equation}
# B_n(i, j, k) = \sum_{p=1}^{F_n}D_{p, n}(i, j, k)
# \end{equation}
# $$
# %% [markdown]
# $$
# \begin{equation}
# \textit{MA}_n(i, j, k) = \underset{p \in [1..F_n]}{\text{max}} \Big[(D_{p, n}*G_{S_n})(i, j, k)\Big]
# \end{equation}
# $$
# %% [markdown]
# $$
# \begin{equation}
# \textit{ALE}(i, j, k) = 1 - \prod_{n=1}^{N}(1 - \textit{MA}_n(i,j,k))
# \end{equation}
# $$
# %% [markdown]
# $$
# \begin{equation}
# \textit{ALE} = 1 - \prod_{n=1}^{N}(1 - \textit{MA}_n)
# \end{equation}
# $$
# %% [markdown]
# #### Inefficient python implementation
# %% [markdown]
# Let's see what output is obtained with this method on the loaded studies.
# %% [markdown]
# | Math variables | Python variables |
# |:-------------------:|:-------------------------:|
# |$B_n$ | `binary_arr` |
# |$D_{p, n}*G_{S_n}$ | `prob_arrs[p, :, :, :]` |
# |$\textit{MA}_n$ | `ma_maps[n, :, :, :]` |
# |$\textit{ALE}$ | `ale_arr` |
#
# <center>Link between math and python variables.</center>
# %%
# sigma is the width (in voxels) of the gaussian kernel used to convolve the
# dirac maps into probability maps.
sigma = 2.


def compute_ale_ma_maps():
    """
    Uses the following global variables:
        N_img, Ni, Nj, Nk
        binary_imgs
        sigma
    """
    ma_maps = np.zeros((N_img, Ni, Nj, Nk))
    for n in range(N_img):
        print(f'{n+1} out of {N_img}', end='\r')
        binary_img = binary_imgs[n]

        # Create one dirac map per peak
        binary_arr = binary_img.get_fdata()
        nz_i, nz_j, nz_k = np.nonzero(binary_arr)
        prob_arrs = np.zeros((len(nz_i), Ni, Nj, Nk))
        for p in range(len(nz_i)):
            prob_arrs[p, nz_i[p], nz_j[p], nz_k[p]] = 1
            # Gaussian convolve each dirac map into a probability map
            prob_arrs[p, :, :, :] = scipy.ndimage.gaussian_filter(
                prob_arrs[p, :, :, :], sigma=sigma)

        # Merge probability maps (voxel-wise maximum over peaks)
        ma_maps[n, :, :, :] = np.max(prob_arrs, axis=0)
    return ma_maps


def merge_ale_ma_maps(ma_maps):
    return 1 - np.prod(1 - ma_maps, axis=0)


ale_ma_maps = compute_ale_ma_maps()
ale_arr = merge_ale_ma_maps(ale_ma_maps)
ale_img = nib.Nifti1Image(ale_arr, affine)
# %%
plotting.view_img(ale_img, threshold=0.01)
# %% [markdown]
# ### KDA
# %% [markdown]
# The Kernel Density Analysis (KDA) technique differs from ALE in the way peaks are treated. Whereas ALE derives a probability map from the peaks, in KDA the value of interest is the peak density (namely, the number of peaks lying within a sphere of a given radius).
# %% [markdown]
# #### Pipeline
# The KDA pipeline is quite similar to the ALE one, the few differences being the kernel and the way maps are merged together.
#
# The pipeline is summarized in figure [2](#KDA_fig) and can be described as follows:
#
# 1. First of all, gather the reported activation peaks in $N$ binary maps (one map per study).
# 2. Convolve each of these maps with a spherical uniform kernel to build the Modeled Activation (MA) maps.
# 3. Derive the KDA map by summing the MA maps together.
# %% [markdown]
# <a id='KDA_fig'></a>
# 
# %% [markdown]
# **Figure 2**: The KDA pipeline.
# %% [markdown]
# The first thing to notice is that the KDA process is simpler than the ALE one. Indeed, in KDA, merge operation #1 of ALE (the non-linear element-wise maximum) is replaced by an element-wise sum, which is linear. So, in KDA, all these steps can be collapsed into a single convolution of the binary map.
#
# The second merge operation (#2 in figure [2](#KDA_fig)) is also replaced by an element-wise sum.
# %% [markdown]
# Let's give a bit of interpretation of the different parts in KDA.
# %% [markdown]
# #### KDA Kernel
# %% [markdown]
# As previously stated, the KDA kernel is a sphere of ones with a chosen radius $r$, and zero elsewhere. Therefore, the result of convolving a dirac map (i.e. a zero-valued map with a single 1) with this kernel is simply a sphere of ones of the same radius.
# The derived map expresses a density: at each voxel, the value gives the number of peaks lying at most at a distance $r$ from that voxel. In the particular case of a dirac map, the derived density is 0 further than $r$ from the unique peak and 1 within a distance $r$ of it.
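# %% [markdown]
# A quick sanity check of this interpretation on a toy dirac map, using a crude hand-built ball kernel (the `uniform_kernel` function defined below plays this role with real affines):
# %%
toy = np.zeros((20, 20, 20))
toy[10, 10, 10] = 1  # a single peak: a dirac map
ii, jj, kk = np.indices((7, 7, 7)) - 3  # offsets of a radius-3 ball (voxels)
ball = ((ii**2 + jj**2 + kk**2) <= 9).astype(float)
density = scipy.ndimage.convolve(toy, ball, mode='constant')
# 1 within radius 3 of the peak, 0 beyond:
print(density[10, 10, 10], density[10, 10, 13], density[10, 10, 14])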
# %% [markdown]
# #### MA maps
# %% [markdown]
# Following from the previous point, the result of convolving a binary map (i.e. a sum of dirac maps) with the KDA kernel of radius $r$ simply expresses the density of the binary map with radius $r$.
# %% [markdown]
# #### KDA map
# %% [markdown]
# Here again, since merge operation #2 is just a sum, the KDA map expresses a density with radius $r$ across studies, that is to say, the number of peaks across all studies lying at most at a distance $r$.
# %% [markdown]
# #### Mathematical recap
# %% [markdown]
# - Let $(N_i, N_j, N_k)$ be the shape of the brain space. Typically $(N_i, N_j, N_k) = (91, 109, 91)$ in the MNI space.
# - For all $n \in [1 .. N],$
# * let $B_n \in \{0, 1\}^{N_iN_jN_k}$ be the binary map associated with experiment $n$.
# * let $\textit{MA}_n \in \mathbb{R}^{N_iN_jN_k}$ be the Modeled Activation map of experiment $n$.
# * let $K_r$ be the uniform spherical kernel of radius $r$.
# - Let $\textit{KDA} \in \mathbb{R}^{N_iN_jN_k}$ be the desired KDA map.
# %% [markdown]
# $$
# \begin{equation}
# \textit{MA}_n = B_n*K_r
# \end{equation}
# $$
# %% [markdown]
# $$
# \begin{equation}
# \textit{KDA}(i, j, k) = \sum_{n=1}^N\textit{MA}_n(i, j, k)
# \end{equation}
# $$
# %% [markdown]
# #### Inefficient python implementation
# %% [markdown]
# Let's see what output is obtained with this method on the loaded studies.
# %% [markdown]
# | Math variables | Python variables |
# |:-------------------:|:--------------------------------------------------------:|
# |$B_n$ | `binary_arr` |
# |$\textit{MA}_n$ | `prob_arrs[:, :, :]` <br> and `ma_maps[n, :, :, :]` |
# |$\textit{KDA}$ | `kda_arr` |
#
# <center>Link between math and python variables.</center>
# %%
r = 15.  # mm


def uniform_kernel(r, affine):
    A, B, C = r, r, r
    a = int(A / abs(affine[0, 0]))
    b = int(B / abs(affine[1, 1]))
    c = int(C / abs(affine[2, 2]))

    kernel = np.zeros((2 * a + 1, 2 * b + 1, 2 * c + 1))
    i0, j0, k0 = a, b, c
    for i in range(2 * a + 1):
        for j in range(2 * b + 1):
            for k in range(2 * c + 1):
                if ((i - i0) / a)**2 + ((j - j0) / b)**2 + ((k - k0) / c)**2 <= 1:
                    kernel[i, j, k] = 1
    return kernel


def compute_kda_ma_maps():
    ma_maps = np.zeros((N_img, Ni, Nj, Nk))
    for n in range(N_img):
        print(f'{n+1} out of {N_img}', end='\r')
        binary_img = binary_imgs[n]

        # Convolve the binary map with the spherical kernel
        binary_arr = binary_img.get_fdata()
        ma_maps[n, :, :, :] = scipy.ndimage.convolve(
            binary_arr, uniform_kernel(r, affine), mode='constant')
    return ma_maps


def merge_kda_ma_maps(ma_maps):
    return np.sum(ma_maps, axis=0)


kda_ma_maps = compute_kda_ma_maps()
kda_arr = merge_kda_ma_maps(kda_ma_maps)
kda_img = nib.Nifti1Image(kda_arr, affine)
# %%
plotting.view_img(kda_img, threshold=25)
# %% [markdown]
# ### ALE and KDA limitations [1]
# %% [markdown]
# Both ALE and KDA treat each focus independently. As a consequence, if one study reports many more peaks than the others, it will have an outsized impact on the results, even when the increased number of foci is not relevant and not indicative of the study's quality (e.g. due to different thresholding). To prevent studies with many peaks from getting the upper hand, the MKDA technique does not treat all foci equally; instead, it takes the study as the unit of analysis.
# A more fundamental issue is that studies with no activations cannot be included in the meta-analyses.
# %% [markdown]
# ### MKDA
# %% [markdown]
# The Multilevel Kernel Density Analysis (MKDA) technique, as the name suggests, is closely related to KDA. Actually, it is an improved version of KDA in which the unit of analysis is no longer the peak location but the study, which allows it to avoid the drawbacks mentioned above.
# %% [markdown]
# #### Pipeline
# %% [markdown]
# The MKDA pipeline is basically the KDA one with extra steps.
#
# The pipeline is summarized in figure [3](#MKDA_fig) and can be described as follows:
#
# 1. First of all, gather the reported activation peaks in $N$ binary maps.
# 2. Convolve each of these maps with a spherical uniform kernel.
# 3. Binarize these maps by setting all strictly positive values to 1 to build the Comparison Indicator Maps (CIM).
# 4. Derive the MKDA map by performing a weighted average of the CIMs (the weights are described later).
# %% [markdown]
# <a id='MKDA_fig'></a>
# 
# %% [markdown]
# **Figure 3**: The MKDA pipeline.
# %% [markdown]
# Let's describe these extra steps.
# %% [markdown]
# #### MKDA Kernel
# %% [markdown]
# The MKDA kernel is the same as the KDA kernel, namely, a sphere of ones with a chosen radius $r$ and zero elsewhere.
# %% [markdown]
# #### Binarization
# %% [markdown]
# In comparison to KDA, a new step is added between the convolution and the merge operations: binarization. The purpose of this operation is to prevent studies with many peaks from getting the upper hand over studies with only a few, which, as said before, is a problem with ALE and KDA.
# %% [markdown]
# #### Weighted average
# %% [markdown]
# The purpose of the average itself is intuitive, but what about the weighting?
#
# 1. We want larger studies to carry more weight in the meta-analysis, because it is assumed that the larger the sample, the more precise the results.
# 2. Moreover, two main kinds of studies exist in the literature: those that treat subjects as fixed effects (FFX, mostly older studies), and those that treat subjects as random effects (RFX, which use the appropriate degrees of freedom). The idea is to downweight the former because their results may be less generalizable beyond their sample. There is no agreed-upon value for this downweighting, but some papers propose 0.75 ([1]).
# %% [markdown]
# In order to meet these needs, a reasonable weight for an experiment $n$ is the following:
#
# $$
# \alpha_n = \delta_n\sqrt{S_n}
# $$
#
# With
# - $S_n$ the number of subjects in experiment $n$.
# - $\delta_n$ the RFX/FFX weight given to experiment $n$:
#
# $$\delta_n = \begin{cases}
# 1 & \mbox{if RFX}\\
# 0.75 & \mbox{if FFX}
# \end{cases}
# $$
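# %% [markdown]
# The inefficient implementation below omits these weights, since no per-study sample sizes or RFX/FFX labels are available for the simulated data (all studies being equivalent, the weighted average reduces to a scaled sum). As a sketch of the weighting itself, with **hypothetical** metadata:
# %%
example_sample_sizes = np.array([10, 10, 20, 15])      # hypothetical S_n
example_is_rfx = np.array([True, True, False, True])   # hypothetical RFX/FFX


def mkda_weights(sample_sizes, is_rfx, ffx_downweight=.75):
    """alpha_n = delta_n * sqrt(S_n), with delta_n = 1 (RFX) or 0.75 (FFX)."""
    delta = np.where(is_rfx, 1., ffx_downweight)
    return delta * np.sqrt(sample_sizes)


def weighted_mkda(cim_maps, weights):
    """Weighted voxel-wise average of binarized CIMs, shape (N, Ni, Nj, Nk)."""
    return np.average(cim_maps, axis=0, weights=weights)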
# %% [markdown]
# #### Mathematical recap
# %% [markdown]
# - Let $N \in \mathbb{N}^*$ be the number of experiments/studies.
# - Let $S_n \in \mathbb{N}$ be the number of subjects in the experiment $n$.
# - Let $(N_i, N_j, N_k)$ be the shape of the brain space. Typically $(N_i, N_j, N_k) = (91, 109, 91)$ in MNI space.
# - For all $n \in [1 .. N],$
# * let $B_n \in \{0, 1\}^{N_iN_jN_k}$ be the binary map associated with experiment $n$.
# * let $\textit{MA}_n \in \mathbb{R}^{N_iN_jN_k}$ be the Modeled Activation map of experiment $n$.
# * let $K_r$ be the uniform spherical kernel of radius $r$.
# - Let $\textit{MKDA} \in \mathbb{R}^{N_iN_jN_k}$ be the desired MKDA map.
# %% [markdown]
# $$
# \begin{equation}
# \textit{MA}_n(i, j, k) = \min((B_n*K_r)(i, j, k),~1)
# \end{equation}
# $$
# %% [markdown]
# $$
# \begin{equation}
# \textit{MKDA}(i, j, k) = \frac{1}{\sum_{n=1}^N\alpha_n}\sum_{n=1}^N\alpha_n\textit{MA}_n(i, j, k)
# \end{equation}
# $$
# %% [markdown]
# With:
# $$
# \alpha_n = \delta_n\sqrt{S_n}
# $$
# %% [markdown]
# $$\delta_n = \begin{cases}
# 1 & \mbox{if RFX}\\
# 0.75 & \mbox{if FFX}
# \end{cases}
# $$
# %% [markdown]
# #### Inefficient python implementation
# %% [markdown]
# Let's see what output is obtained with this method on the loaded studies.
# %% [markdown]
# | Math variables | Python variables |
# |:-------------------:|:--------------------------------------------------------:|
# |$B_n$ | `binary_arr` |
# |$\textit{MA}_n$ | `prob_arrs[:, :, :]` <br> and `ma_maps[n, :, :, :]` |
# |$\textit{MKDA}$ | `mkda_arr` |
#
# <center>Link between math and python variables.</center>
# %%
r = 10.  # radius of the sphere, mm


def compute_mkda_ma_maps():
    ma_maps = np.zeros((N_img, Ni, Nj, Nk))
    for n in range(N_img):
        print(f'{n+1} out of {N_img}', end='\r')
        binary_img = binary_imgs[n]

        # Convolve the binary map with the spherical kernel...
        binary_arr = binary_img.get_fdata()
        prob_arrs = scipy.ndimage.convolve(binary_arr,
                                           uniform_kernel(r, affine),
                                           mode='constant')
        # ...and binarize the result to obtain the CIM
        ma_maps[n, :, :, :] = prob_arrs > 0
    return ma_maps


def merge_mkda_ma_maps(ma_maps):
    # The simulated studies are all equivalent (no sample sizes or RFX/FFX
    # labels), so the weights alpha_n are equal and the weighted average is
    # proportional to a plain sum.
    return np.sum(ma_maps, axis=0)


mkda_ma_maps = compute_mkda_ma_maps()
mkda_arr = merge_mkda_ma_maps(mkda_ma_maps)
mkda_img = nib.Nifti1Image(mkda_arr, affine)
# %%
plotting.view_img(mkda_img, threshold=5)
# %% [markdown]
# ## Coordinate-based meta-analysis limitations
# %% [markdown]
# The main limitation of coordinate-based meta-analysis is the loss of information between the full brain image (which is acquired during the experiment) and the coordinates reported in the published paper. Another limitation of this representation is the high sensitivity of the coordinates to the methods used in the study, from thresholding to reporting.
# That is why, whenever the full brain images are available, one should use image-based meta-analysis to obtain results based on more information.
# %% [markdown]
# ## Image-based meta-analysis (IBMA)
# %% [markdown]
# In image-based meta-analysis, as the name suggests, we have at our disposal at least the full group-level activation image from each study, and often the full activation image for each subject within each study. The idea of image-based meta-analysis is to mix these maps together to create a meta-analysis map.
#
# We will go through three methods: Fisher's, Stouffer's and the multilevel GLM. These methods differ in the way they combine images. Fisher's and Stouffer's are general methods used to combine a number of independent tests, whereas the multilevel GLM is more neuroimaging-specific in the way the operations are conceived.
#
# This part has been written using the following references: [3] [4] [5].
# %% [markdown]
# **Context**
# What precisely are the inputs of image-based meta-analysis?
# The type of map (contrast, z, t or standard error maps) depends on the method used, and will be detailed in each section.
#
# However, whatever the type of map, one should be aware of the following caveat:
# since the input data is now the full brain image instead of coordinates (which are assumed to be in a common space), one should check that all images are in a common space (e.g. the [MNI152](http://www.bic.mni.mcgill.ca/ServicesAtlases/ICBM152NLin2009)).
# %% [markdown]
# ### Fisher
# %% [markdown]
# Fisher's combination method, like Stouffer's, aims to combine a number of independent tests into one.
# %% [markdown]
# **Input data**
# Only the group-level z maps from each of the studies are required.
#
# **Output data**
# The output of the method is a single p-value map, from which a z map, seen as the meta-analysis result, can be computed.
# %% [markdown]
# #### Mathematical foreword
# Before applying Fisher's method to neuroimaging, let's consider the general case.
#
# Let $T_1, ..., T_N$ be $N$ independent tests of $H_0$.
# Let $p_1, ..., p_N$ be their corresponding p-values.
# %% [markdown]
# Fisher's method builds the test statistic $T_f$ as follows:
# $$
# T_f = -2\sum_{k=1}^N\ln(p_k)
# $$
# %% [markdown]
# An important result is that, under the null hypothesis $H_0$, $T_f$ follows a $\chi^2$ distribution with $2N$ degrees of freedom. Thus, the only remaining thing to do is to compare the obtained value of $T_f$ to the null $\chi^2_{2N}$ distribution. To control the risk of error at level $\alpha$, one chooses a threshold $\chi^2_{2N,\alpha}$ such that the probability of observing a $\chi^2_{2N}$ value greater than or equal to $\chi^2_{2N,\alpha}$ is $\alpha$.
# Large values of $T_f$, less likely under the $\chi^2_{2N}$ distribution, lead to the rejection of $H_0$.
#
# The p-value $p_f$ of the combined probability test can then be computed by looking at the tail of the $\chi^2_{2N}$ distribution.
#
# $$
# p_f = \mathbb{P}(X \geq T_f)
# $$
# With
# $$
# X \sim \chi^2_{2N}
# $$
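# %% [markdown]
# As a quick illustration, here is Fisher's combination of three made-up independent p-values:
# %%
p_toy = np.array([.04, .10, .30])
T_f_toy = -2 * np.sum(np.log(p_toy))                    # Fisher statistic
p_f_toy = scipy.stats.chi2.sf(T_f_toy, 2 * len(p_toy))  # chi2 tail, 2N dof
print(f'T_f = {T_f_toy:.2f}, combined p-value = {p_f_toy:.4f}')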
# %% [markdown]
# <img src="resource/chi2.png" width="400"> \
# **Figure 4:** The $\chi^2$ distribution.
# %% [markdown]
# #### Neuroimaging application
#
# The key idea is to understand how to transform z maps into p-value maps and vice versa.
#
# **Z map to p-value map**
# Z maps store z-scores. For a voxel $v$, the null hypothesis $H_0$ is "the voxel is not activated". When $H_0$ is true, the z-score of this voxel follows a standard normal distribution. Since we are interested in activation rather than deactivation, given a z-score $Z$, its associated p-value $p_Z$ is the probability of observing a value greater than $Z$:
# %% [markdown]
# $$
# p_Z = \mathbb{P}(X \geq Z)
# $$
# With
# $$X \sim \mathcal{N}(0, 1)$$
#
# Which can be rewritten using the normal cumulative distribution function $\Phi$, and gives the link between z score and its p-value:
# $$
# p_Z = 1 - \Phi(Z)
# $$
# %% [markdown]
# <img src="resource/normal.jpg" width="300">
#
# <a id='Normal'></a>**Figure 5:** The normal distribution of the z map.
# %% [markdown]
# Considering only activation (the right side of the distribution) and not deactivation makes this a one-sided test. Considering both (answering the more general question of excessively positive or negative values) is referred to as a two-sided test.
# %% [markdown]
# **p-value map to Z map**
# Using the inverse normal cumulative distribution function, it is straightforward to invert the previous process. Given a p-value $p_Z$, the associated z-score is:
# %% [markdown]
# $$
# Z = \Phi^{-1}(1-p_Z)
# $$
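# %% [markdown]
# A quick sanity check of this round trip with scipy:
# %%
z_check = 1.96
p_check = scipy.stats.norm.sf(z_check)  # p_Z = 1 - Phi(Z), about 0.025
z_back = scipy.stats.norm.isf(p_check)  # Z = Phi^{-1}(1 - p_Z), about 1.96
print(p_check, z_back)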
# %% [markdown]
# **Application**
# Let's consider a single brain voxel $v$. In each study $k$, the question of whether $v$ is activated, i.e. whether we reject $H_0$, is represented by the hypothesis test $T_k$. The studies are assumed independent; so are the tests $T_k$. For each study $k$, we derive the p-value $p_k$ from the z-score associated with voxel $v$ as described previously. We then combine all these p-values into a statistic $T_f$ using Fisher's formula, compare $T_f$ to the $\chi^2_{2N}$ distribution to obtain the combined p-value $p_f$, and finally derive the meta-analysis z-score from $p_f$ as described above.
# By performing this task voxel-wise, we can derive the meta-analysis z map.
# %% [markdown]
# #### Inefficient python implementation
# %% [markdown]
# Let's see what output is obtained with this method on the loaded studies.
# %% [markdown]
# | Math variables | Python variables|
# |---------------------|-------------|
# |$T_f$ | `T_f` |
# |$p_f$ | `p_f` |
# |$\Phi^{-1}(1-p_f)$ | `fisher_arr`|
#
# <center>Link between math and python variables.</center>
# %%
def z_to_p(z_arr):
    return 1 - scipy.stats.norm.cdf(z_arr)


def p_to_z(p_arr):
    """
    scipy.special.ndtri returns the argument x for which the area under the
    Gaussian probability density function (integrated from minus infinity
    to x) is equal to y.
    """
    return scipy.special.ndtri(1 - p_arr)


p_values = np.zeros((N_img, Ni, Nj, Nk))
for n in range(N_img):
    img = nilearn.image.load_img(img_paths[n])
    arr = img.get_fdata()
    p_values[n, :, :, :] = z_to_p(arr)

T_f = np.nan_to_num(-2 * np.sum(np.log(p_values), axis=0))
# chi2 survival function:
p_f = scipy.stats.chi2.sf(T_f, 2 * N_img)
fisher_arr = p_to_z(p_f)

# Truncate inf values
maxnan = fisher_arr[fisher_arr != np.inf].max()
minnan = fisher_arr[fisher_arr != -np.inf].min()
np.nan_to_num(fisher_arr, copy=False, posinf=maxnan, neginf=minnan)
fisher_img = nib.Nifti1Image(np.array(fisher_arr), affine)
# %%
plotting.plot_stat_map(fisher_img)
# %% [markdown]
# ### Stouffer's
# %% [markdown]
# Stouffer's method is very close to Fisher's. Both take z maps as input, produce a z map as output, and aim to combine the individual studies' tests into one.
# %% [markdown]
# **Input data**
# Only the z map of each study is required.
#
# **Output data**
# The output of the method is a single p-value map, from which a z map, seen as the meta-analysis result, can be computed.
# %% [markdown]
# **Mathematical foreword**
#
# Let $Z_1, ..., Z_N$ be $N$ z-scores. Stouffer's method builds the test statistic $T_s$ as follows:
# %% [markdown]
# $$
# T_s = \sum_{k=1}^N \frac{Z_k}{\sqrt{N}}
# $$
# %% [markdown]
# Note that since $Z_k = \Phi^{-1}(1-p_k)$ with the previous notations, Stouffer's method can also be seen as a p-value combination method, like Fisher's.
# %% [markdown]
# Under $H_0$, since $Z_k \sim \mathcal{N}(0, 1)$, $T_s$ follows a standard normal distribution
# $$
# T_s \sim \mathcal{N}(0, 1)
# $$
# %% [markdown]
# From there, the process is similar to the one explained in the Fisher section, replacing the $\chi^2_{2N}$ distribution with the $\mathcal{N}(0, 1)$ one.
# %% [markdown]
# The p-value $p_s$ of the combined probability test is similarly given by
#
# $$
# p_s = \mathbb{P}(X \geq T_s)
# $$
# With
# $$
# X \sim \mathcal{N}(0,1)
# $$
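# %% [markdown]
# As a quick illustration on three made-up z-scores:
# %%
z_toy = np.array([1.2, 2., .5])
T_s_toy = np.sum(z_toy) / np.sqrt(len(z_toy))  # Stouffer statistic
p_s_toy = scipy.stats.norm.sf(T_s_toy)         # compare to N(0, 1)
print(f'T_s = {T_s_toy:.2f}, combined p-value = {p_s_toy:.4f}')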
# %% [markdown]
# #### Application
# As with Fisher's method, let's consider a single brain voxel $v$. We combine the z-scores of the studies into a statistic $T_s$. The p-value $p_s$ and its associated z-score are then derived by comparing $T_s$ to the $\mathcal{N}(0, 1)$ distribution. By performing this task voxel-wise, we can derive the meta-analysis z map.
# %% [markdown]
# #### Inefficient python implementation
# %% [markdown]
# Let's see what output is obtained with this method on the loaded studies.
# %% [markdown]
# | Math variables | Python variables|
# |---------------------|-----------------|
# |$T_s$ | `T_s` |
# |$p_s$ | `p_s` |
# |$\Phi^{-1}(1-p_s)$ | `stouffers_arr` |
#
# <center>Link between math and python variables.</center>
# %%
T_s = np.zeros((Ni, Nj, Nk))
for n in range(N_img):
    img = nilearn.image.load_img(img_paths[n])
    T_s += img.get_fdata()
T_s /= np.sqrt(N_img)

p_s = scipy.stats.norm.sf(T_s, loc=0, scale=1)
stouffers_arr = p_to_z(p_s)

# Truncate inf values
maxnan = stouffers_arr[stouffers_arr != np.inf].max()
minnan = stouffers_arr[stouffers_arr != -np.inf].min()
np.nan_to_num(stouffers_arr, copy=False, posinf=maxnan, neginf=minnan)
stouffers_img = nib.Nifti1Image(stouffers_arr, affine)
# %%
plotting.view_img(stouffers_img)
# %% [markdown]
# ### Fisher's or Stouffer's?
# %% [markdown]
# Here are some comparisons found in the literature:
#
# Stouffer's method is commonly preferred in the presence of weights, as the weighted test statistic
# $\sum^n_{i=1} w_i\Phi^{-1}(1-p_i)$ retains a closed-form null distribution, $\mathcal{N}(0,\sum_{i=1}^n w_i^2)$ [6]. This straightforward way to introduce weights can make Stouffer's method more powerful than Fisher's when the p-values come from studies of different sizes [7] [8].
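# %% [markdown]
# A minimal sketch of this weighted variant with **hypothetical** sample sizes, standardizing the statistic back to $\mathcal{N}(0, 1)$:
# %%
z_w = np.array([1.2, 2., .5])
w = np.sqrt(np.array([10., 40., 20.]))         # weights: sqrt of study sizes
T_w = np.sum(w * z_w) / np.sqrt(np.sum(w**2))  # N(0, 1) under H0
p_w = scipy.stats.norm.sf(T_w)
print(f'T_w = {T_w:.2f}, combined p-value = {p_w:.4f}')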
# %% [markdown]
# ### Multilevel GLM
# %% [markdown]
# The last technique we will cover is also the most complex: the multilevel General Linear Model (GLM). Below is the general idea of a simplified version of the method, in which a strong hypothesis has been made.
#
# The key idea of this technique is to add a level of GLM on top of the other usual levels, that is, adding an inter-study GLM on top of the subject-level and within-study GLMs.
#
# This part strongly references the paper of Beckmann et al. [4].
# %% [markdown]
# <a id="GLM_Model"></a>
# #### GLM Model
# A GLM attempts to explain a variable $Y$ as a combination of $\beta$-weighted predictors $X$ plus noise $e$. The $\beta$ weights are the unknown values to be estimated. The simplest formulation of the GLM is the following:
#
# $$
# Y = X\beta + e
# $$
#
# $$e \sim \mathcal{N}(0, \sigma^2 I)$$
#
# Where $Y$ and $e$ are vectors and $X$ is a matrix.
# - $X$ is called the design matrix and is fixed by the experimenter (assumed full rank).
# - $Y$ holds the observations.
# - $\beta$ is unknown and is estimated in the GLM process.
# - $e$ is unknown, but assumptions are made on its covariance structure (e.g. independent and identically distributed).
# %% [markdown]
# **1. Estimation of $\beta$**
# Under the more general assumption $e \sim \mathcal{N}(0, V)$, $\beta$ and its covariance can be estimated using the Generalized Least Squares (GLS) technique through the following formulae:
#
# $$
# \hat{\beta} = (X^TV^{-1}X)^{-1}X^TV^{-1}Y
# $$
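# %% [markdown]
# A minimal numpy sketch of this GLS estimate on simulated data (all names here are illustrative and independent of the meta-analysis pipeline above):
# %%
rng = np.random.RandomState(0)
n_obs = 100
X_toy = np.column_stack([np.ones(n_obs), rng.normal(size=n_obs)])  # design
beta_true = np.array([1., 2.])
V = np.eye(n_obs)  # assumed noise covariance (i.i.d. case)
Y_toy = X_toy @ beta_true + rng.normal(size=n_obs)

V_inv = np.linalg.inv(V)
beta_hat = np.linalg.solve(X_toy.T @ V_inv @ X_toy, X_toy.T @ V_inv @ Y_toy)
print(beta_hat)  # close to [1., 2.]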