Review: Differentiating BOLD from non-BOLD signals in fMRI time series using multi-echo EPI

Disclaimer: I'm afraid I haven't done a very good job reviewing the entirety of this paper because the stats/processing part was pretty much opaque to me. I've done my best to glean what I can out of it, and then I've focused as much as I can on the acquisition, since that is one part where I can penetrate the text and offer some useful commentary. Perhaps someone with better knowledge of stats/ICA/processing will review those sections elsewhere.

The last paper I reviewed used a bias field map to attempt to correct for some of the effects of subject motion in time series EPI. A different approach is taken by Prantik Kundu et al. in another recently published study. In their paper, Differentiating BOLD from non-BOLD signals in fMRI time series using multi-echo EPI, Kundu et al. set out to differentiate between signal changes that have a plausible neurally-driven BOLD origin from those that are likely to have been modulated by something other than neuronal activity. In the latter category we have cardiac and respiratory fluctuations and, of course, subject motion.

The method involves sorting BOLD-like from spurious changes using an independent component analysis (ICA) and to then "de-noise" the time series before applying connectivity analysis. For resting state fMRI in particular, the lack of any sort of ground truth and an absence of independent knowledge that one has with task-based fMRI makes disambiguating neurally driven signal changes from artifacts a major problem. Kundu et al. use a relatively simple philosophical approach to the separation:

"We hypothesized that if TE-dependence could be used to differentiate BOLD and non-BOLD signals, non-BOLD signal could be removed to denoise data without conventional noise modeling. To test this hypothesis, whole brain multi-echo data were acquired at 3 TEs and decomposed with Independent Components Analysis (ICA) after spatially concatenating data across space and TE. Components were analyzed for the degree to which their signal changes fit models for R₂* and S₀ change, and summary scores were developed to characterize each component as BOLD-like or not BOLD-like."

And, noting again the caveat that there is an absence of ground truth, the approach seems to work:

"These scores clearly differentiated BOLD-like “functional network” components from non BOLD-like components related to motion, pulsatility, and other nuisance effects. Using non BOLD-like component time courses as noise regressors dramatically improved seed-based correlation mapping by reducing the effects of high and low frequency non-BOLD fluctuations."

What does it mean to be BOLD-like?

BOLD contrast is achieved via changes of T₂* in and around the venous blood. In a typical fMRI experiment the TE of a T₂*-sensitive image acquisition such as EPI is set approximately equal to the T₂* of the gray matter, thereby producing a contrast mechanism that is maximally sensitive to changes in T₂*. (See Note 1.) Instead, Kundu et al. acquire three different T₂*-weighted images per slice, i.e. three successive images are acquired at different TEs, allowing a fit for T₂* for each voxel in the brain that contains signal. (In the paper the authors usually use the relaxivity, defined as 1/T₂* = R₂*, rather than the relaxation rate constant, T₂* but the terms are clearly interchangeable.)

Multi-echo acquisition has been used before to characterize BOLD changes, or to combat dropout effects that arise from the distribution of T₂* values across a brain. (See the paper's Introduction for a review.) What's new in this work is the use of ICA to separate R₂*-dependent components for each voxel's time series, where the goodness-of-fit to an R₂* model can be used to characterize whether a particular time series component is more likely to be neurally-driven, i.e. BOLD-like, than a non-BOLD change, such as could arise from head motion.

So, the method relies upon the ability to model appropriately signal changes that are BOLD-like from everything else. What does it mean to be BOLD-like? Kundu et al. explain it in detail in their Theory section, but in brief it means that a signal has a mono-exponential TE dependence that is consistent with small changes in magnetic susceptibility due to small changes in oxygenation in the (venous) blood. For BOLD-like modulation, then, the change of signal level from a baseline state to an activated state, cast as ΔS/S, is linearly dependent on acquisition TE in the limit of small changes in T₂*, as shown at bottom-right in the figure below. Non-BOLD-like modulations don't fit this model. Instead, the ΔS/S is TE-invariant, as shown at bottom-left:

Mono-exponentiality is assumed in the BOLD versus non-BOLD characterization. Is that fair? I think it probably is, for voxels that aren't significantly larger than about 4 mm on a side. But the assumption is likely to be better the smaller one can make the voxels. There are only three TEs with which to characterize the TE dependence anyway, so for the time being that would seem to be a limitation that we are required to live with.

The other assumption is that only small changes in magnetic susceptibility are expected in BOLD-like fluctuations. The concentration of deoxyhemoglobin in venous blood varies by a few percent as the upstream neural activity varies; we're not expecting huge shifts of T₂*. Large changes in magnetic susceptibility across a voxel can accompany movement, however. But in the case of head movement the dephasing across a voxel leads to changes in signal level that will be reflected in the term ΔS₀/S₀ as well. Thus, movement may also cause a change in the TE dependence but we can still separate movement from BOLD-driven signals because we want changes in the TE dependence only; no change in ΔS₀/S₀.

The practical stuff

The immediate problem is obtaining data suitable to fitR₂*. This isn't trivial because it takes tens of milliseconds to acquire a single image, a problem that manifests in routine EPI as distortion in the phase encoding dimension. On their 3 T GE scanner, Kundu et al. were able to acquire three different T₂* weightings (TE = 15, 39 and 63 ms) using relatively large voxels (3.75x3.75 mm in-plane, 4.2 mm slice thickness) by employing SENSE parallel imaging with an acceleration factor of two. Even so the TR was 2500 ms for 31 slices (0.3 mm gap) to cover the whole brain.

Pulse and respiration data were recorded separately and allowed different types of de-noising to be compared. I won't get into the comparisons for brevity; also because I'm only interested in the multi-echo data characterized by ICA. For the ICA pipeline, then, conventional slice timing correction was applied and motion correction (a.k.a. realignment) was applied using the central TE image for each time point. I don't know why the central TE image was selected, but perhaps it was because it's the median value and thus might be expected to minimally bias the time series to BOLD-like or non-BOLD-like changes. I think I would have been tempted to use the first TE images because they exhibit less dropout, thus there should be more brain signal for the realignment algorithm to work with.

At this point some sort of magic happens. As previously mentioned, I have zero knowledge of ICA and only very slightly greater knowledge about stats in general. Apologies for glossing over this part and making an assumption that the fitting and statistical procedures were valid; I'm not qualified to do anything else. Anyway, ICA was applied to the time series data by treating the three TE images at each time point as a fourth spatial variable for spatial ICA.Each ICA component was then analyzed for its TE dependence: BOLD-like or non-BOLD-like, as established by the TE-dependent criteria mentioned previously. The magic thus produces two new variables, κ and ρ, to characterize each independent component of a voxel's time series:

"High κ indicated strong ΔR₂*-like character (BOLD-like), and high ρ indicated strong ΔS₀-like character (non BOLD-like)."

But will it blend?

By this point I feel like my head has been in a Blendtec. (This is the last time I try to review a heavy stats/modeling paper!) However, the results are compelling (for someone who has no knowledge of ICA or the actual processing used in the paper). For example, here is a comparison of a BOLD-like (top panel) versus a non-BOLD component (bottom panel):

The TE dependence of images in the lower left quadrant suggests that the non-BOLD modulation isn't driven by neurons, instead the peripheral signal changes resemble classic head movement artifacts. Thus, for the non-BOLD (artifact) component there is a corresponding ring on the periphery of the brain having high percent ΔS₀ (bottom-right corner image); head movement primarily modulates signal level without a TE dependence. In contrast, BOLD-like modulations show very little change in percent ΔS₀, instead showing strong, localized changes in ΔR₂* (top-right corner image). For the BOLD-like component shown here, κ was 184 and ρ was 15 whereas for the artifact component κ was 22 and ρ was 90.

The generation of κ and ρ permittedautomated sorting of the BOLD-like wheat from the non-BOLD-like chaff:

"The ICA components were rank-ordered based on their κ and ρ scores. These two rank orderings (κ-spectrum and ρ-spectrum) were used to differentiate BOLD components from non-BOLD components. Both κ and ρ spectra were found to be L-curves with well-defined elbows distinguishing high score and low score regimes. This inherent separation was used to identify BOLD components in an automated procedure. First, the elbows of κ and ρ spectra were identified. The spectra were scanned from right to left to identify an abruptly high score following a series of similarly valued low scores. The κ and ρ scores marking abrupt changes were used as thresholds. Those components with κ greater than the κ threshold and ρ less than the ρ threshold were considered BOLD components. All other components were considered non-BOLD components. These were used as noise regressors in time course de-noising."

The elbows in the L-shaped rank plots of independent components seemed to be clearer for κ (BOLD-like) than for ρ (non-BOLD) values, but the features were certainly complementary:

Maps corresponding to high κ matched resting state networks seen in other studies. I'm not sure if that is, by itself, a good thing but let's move on. Maps of the highest ρ tended to feature brain edges or CSF-filled spaces, a strong indication that they would be motion-related artifacts. Maps of components near the elbows of the κ and ρ spectra were more difficult to interpret, but tended to be more suggestive of artifact than "proper" BOLD networks, according to the authors' interpretation.

_{Which brings us to the final proof: connectivity analysis. Time series correlation using seeds in hippocampus and brain stem showed that the de-noising using multi-echo ICA yielded spatial patterns that were more consistent across subjects than when standard de-noising techniques (such as RETROICOR) were used instead:}

"The group T-maps based on low κ de-noising showed much higher T-statistics for connected regions than the group T-maps based on standard de-noising. This indicated that (Z-transformed) correlation coefficients based on ME-ICA were more consistent across subjects than Z-transformed correlation coefficients based on standard de-noising."

That seems like a good finding. One naively assumes that brains are connected with greater similarity than dissimilarity when examined with our relatively coarse fMRI tools.

But why the hippocampal and brain stem seeds?

"Studying functional connectivity of subcortical regions is challenging due to low functional contrast-to-noise due to CSF and blood flow pulsatility and distance from receiver elements. Where standard de-noising showed no clear correlation patterns for the hippocampal and brain stem seeds, ME-ICA de-noising revealed robust correlation patterns. The brain stem seed was localized to the anterior pons that contains corticospinal (pyramindal) tracts connecting to premotor, parietal, and motor regions (Kiernan, 2009). This pattern of anatomical connectivity agrees well with the pattern of functional connectivity exposed after ME-ICA de-noising. The hippocampus seed was localized to the head of the right hippocampus that has anatomical connectivity to sensory regions via temporal and entorhinal cortices (Kiernan, 2009). The pattern of functional connectivity exposed after ME-ICA denoising agreed with this pattern of anatomical connectivity."

This also seems to be good news.

Limitations of the study and ME-ICA

Assuming that the statistical evaluation works as presented, what are the limitations of using ME-ICA for de-noising fMRI data? My first concern is the use of three TEs, two of which (39 ms and 65 ms) may not offer very much signal in important brain regions having short T₂* (below perhaps 20 ms); namely, portions of frontal and temporal lobes. There is the danger of a 2-point or even a 1-point fit to the TE dependence in these regions. How does the method fare when the SNR is very low? I would want to assess regional variations in the de-noising beforepushing for widespread adoption.

And talking of validation, de-noising doesn't turn a bad experiment into a good one. I assume - because the paper gives no indication to the contrary - that all eight subjects were compliant. Perhaps they were all experienced fMRI subjects, too. Thus, I would put the data acquired in the experiments into the "good" bin; the subjects probably didn't move much compared to your typical, off the street fMRI volunteers, or kids, or elderly subjects with a medical condition. How does ME-ICA fare when movement is higher than for these eight subjects? Does the model always capture the non-BOLD components and leave the same BOLD-like networks? I would want to see some failure tests before moving to global use of ME-ICA. It would be very useful to have the same subjects scanned under different intentional movement regimes, for example.

But I also have a minor concern about how the method fares with very small amounts of subject motion, too. Very small head movements could cause small T₂* changes with minuscule concomitant signal intensity changes, through small shifts in magnetic susceptibility gradients across tissue boundaries, for instance. I wonder, then, whether the method might characterize very small movements as being BOLD-like. That would be perverse. Again, it would be important to test the method under the movement extremes to see how it could work (and fail) in practice.

My final concern is whether the method would be adopted based on the acquisition parameters presented. The acquisition requires multiple TEs for each slice, a temporally expensive thing to do. In fact, getting down to the TEs of 15, 39 and 65 ms as used in the study required the use of parallel imaging (SENSE with acceleration factor of two), an option that isn't without its own penalties (of increased motion sensitivity and decreased SNR). And even so, it was only possible to acquire 31 slices in TR = 2500 ms. I can see a lot of people turning their noses up at that performance. It might be feasible to decrease the TEs used, thereby increasing the number of slices/TR, by using even higher acceleration factors, but again the SNR goes farther down and the motion sensitivity gets larger. More on the prospects for pulse sequence developments in the final section, below.

What I do like about the proposed method, however, is the principle. This paper tries to develop a conceptual framework for discerning noise components, using a simple model of how a signal should behave with TE in order to be considered BOLD-like. It's a step up from the T₂*-weighted acquisitions everyone else uses for connectivity. I note that nobody does plain diffusion-weighted imaging any more, we've moved to diffusion models such as tensors to fit and interpret the anisotropic motion-encoded signals that we acquire. Yet we haven't made that step en masse for fMRI as yet. We're still doing the same T₂*-weighted imaging that we were doing in the early nineties. A move towards principled evaluations - something more quantitative, as presented in this paper - would seem to be a worthwhile advance, and I welcome it.

Whatdevelopments might benefit multi-echo acquisitions?

As I mentioned, I suspect the biggest hurdle facing ME-ICA isn't the complexity of the analysis or the lack of rigorous failure analysis when the method is applied to wiggly subjects, but rather the temporal overhead associated with the multiple echo acquisition. I did a quick back-of-the-envelope calculation and I reckon it is possible to get the same TE=15,39,65 ms performance without parallel imaging (no SENSE or GRAPPA) using 6/8ths partial Fourier acquisitions instead; assuming 0.5 ms echo spacing for a 64x64 matrix over a 220x220 mm field-of-view. That eliminates the increased motion sensitivity of parallel imaging but it doesn't get the data into the bag any faster.

The authors suggest solutions to the speed issue in their Discussion section: use multi-band (MB) imaging in the slice direction to speed things up. I've recently started tinkering with MB-EPI and the acceleration is really quite amazing. With the University of Minnesota's variant (developed as part of the Human Connectome Project) it is possible to get whole brain, 2 mm isotropic voxels with a BOLD-optimal TE of 38 ms in a TR of 1300 ms using a 6-fold acceleration in the slice dimension. Damn, that's fast. Now, before you all rush off to use MB-EPI for all of your experiments I would caution you that there are likely similar motion sensitivities in MB-EPI as there are for in-plane GRAPPA - they use similar principles applied orthogonally. So, there is a good chance that motion may hamper MB-EPI performance a lot more than you'd like. The validations haven't yet been presented. But if we assume that those clever pulse sequence folk can maintain a reasonable degree of robustness to motion then what might a multi-echo, MB-EPI acquisition look like?

Avoiding in-plane parallel imaging and sticking to the partial Fourier scheme I suggested above, it would be reasonable to obtain TE=15,39,65 ms images at each slice position using a multi-band factor of between two and six, and expect to get a corresponding improvement in the number of slices/TR. We really only need a factor of two to get full brain coverage when the slices are approximately 3 mm thick. But this assumes comparable in-plane resolution - between 3 and 4 mm - to that used in the current study. Higher resolution necessarily increases the echo train length of the EPI readout and would thus extend the TEs well beyond the 15-70 ms range we need to fit T₂* at 3 T.

What are the options for substantially increasing the in-plane resolution beyond 3 mm? The 2 mm voxels for the MB-EPI sequence I've tested had a minimum TE of 38 ms. The echo train length per image, using 6/8ths partial Fourier, is already some 38 ms long, too. I don't think many parts of the brain would fit T₂* very well with TE=38,76,114 ms images, even with the reduced dephasing (via modest extension of T₂*) that one gets from smaller voxels. To reduce the TEs to values that could be expected to fit T₂* for most brain regions at 3 T requires either faster gradients or in-plane parallel imaging. With respect to gradient speed, we are already at the limit set by cardiac stimulation when using a whole body gradient set, so that's off the table at the present time. Perhaps in-plane parallel imaging can be used profitably? Doubtless someone will be able to show that using in-plane GRAPPA and high resolution can be made to work with multi-band imaging under certain circumstances - good subjects not prone to moving very much, perhaps - but at the risk of being boring I don't think these are the sorts of sequences that should be applied in routine practice. Motion would have to be very low indeed or the image quality would be awful.

Another option might be to move away from T₂* BOLD and instead try to do the same sort of ICA decomposition for T₂ BOLD, using multiple spin echoes in conjunction with multi-band encoding. Such an approach has its own limitations, of course: power deposition goes up a lot, maybe prohibitively, while overall BOLD sensitivity (at 3 T) is diminished by about 50%. Or, perhaps asymmetric spin echo images could be obtained, using the spin echo to extend the lifetime of the signal but offsetting the center of each image readout to encode some T₂* rather than T₂. We have options to explore, perhaps trading sensitivity for specificity. And that goal - specificity - is the main lesson of the paper, I think. It's where we should be aiming. Doing the same basic T₂*-weighted resting state acquisitions over and over isn't getting us very far. I'm glad that Kundu et al. are suggesting ways to push through our present inertia.

In sum, then, I don't see everyone switching to multi-echo EPI acquisitions by this time next year. Best case, I suspect some hardy types might try ME-ICA as a way to validate what others are seeing; to use it as a yardstick. We still don't have ground truth, but I would put more faith into a network derived from ME-ICA than I would from the coincidental findings of a hundred "standard" resting state fMRI studies.

_________________

Notes:

1. It is well known that different regions of the brain have different T₂*, thus requiring different TE for optimal BOLD contrast. The voxel resolution as well as neuroanatomical variations, and magnetic susceptibility arising from the skull and sinuses all interact to produce a complicated T₂* dependence across a brain. Therefore, a compromise value of TE is usually set to achieve sufficient BOLD contrast in "good" regions of the brain, such as parietal and occipital lobes, where T₂* is 30-50 ms at 3 T, as well as in "bad" regions of the brain, such as frontal and temporal lobes, where the T₂* is generally below 25 ms. Echo times in the range 25-35 ms are thus typical for 3-4 mm voxels. The TE for optimal BOLD contrast can sometimes be increased when voxels smaller than about 3 mm are used. An example of adjusting TE with voxel resolution is given in this paper, on amygdala fMRI.

References and Links:

Differentiating BOLD and non-BOLD signals in fMRI time series using multi-echo EPI.
P Kundu, SJ Inati, JW Evans, W-M Luh and PA Bandettini. NeuroImage60 (3), 1759-70 (2012).

http://www.sciencedirect.com/science/article/pii/S1053811911014303
http://dx.doi.org/10.1016/j.neuroimage.2011.12.028
PMID: 22209809