meegkit.cca

Canonical Correlation Analysis.

Functions

cca_crossvalidate(xx, yy[, shifts, sfreq, ...])

CCA with cross-validation.

mcca(C, n_channels[, n_keep])

Multiway canonical correlation analysis.

nt_cca([X, Y, lags, C, m, thresh, sfreq])

Compute CCA from covariance.

whiten(C[, fudge])

Whiten covariance matrix C of X.

whiten_nt(C[, thresh, keep])

Covariance whitening function from noisetools.

whiten_svd(X)

SVD whitening.

whiten_zca(C[, thresh])

Compute ZCA whitening matrix (aka Mahalanobis whitening).

meegkit.cca.cca_crossvalidate(xx, yy, shifts=None, sfreq=1, surrogate=False, plot=False)

CCA with cross-validation.

Parameters:
  • xx (list of arrays | array) – If a list, each element should have shape=(n_times, n_chans). If an array, it should be 3D with shape=(n_times, n_chans, n_trials).

  • yy (list of arrays | array) – If a list, each element should have shape=(n_times, n_chans). If an array, it should be 3D with shape=(n_times, n_chans, n_trials).

  • shifts (array, shape=(n_shifts,)) – Array of shifts to apply to y relative to x (can be negative).

  • sfreq (float) – Sampling frequency. If not 1, shifts are assumed to be given in seconds.

  • surrogate (bool) – If True, estimate SD of correlation over non-matching pairs.

  • plot (bool) – Produce some plots.

Returns:

  • AA, BB (arrays) – Cell arrays of transform matrices.

  • RR (array, shape=(n_comps, n_shifts, n_trials)) – Correlations (2D).

  • SD (array) – Standard deviation of correlation over non-matching pairs (2D).
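The two accepted input layouts for xx and yy can be illustrated with plain NumPy. This is a minimal sketch on synthetic data; the variable names and array sizes are made up for illustration, and the shifts are assumed to be in samples because sfreq is left at 1.

```python
import numpy as np

rng = np.random.default_rng(0)
n_times, n_chans_x, n_chans_y, n_trials = 200, 4, 3, 5

# Form 1: a list with one (n_times, n_chans) array per trial.
xx_list = [rng.standard_normal((n_times, n_chans_x)) for _ in range(n_trials)]
yy_list = [rng.standard_normal((n_times, n_chans_y)) for _ in range(n_trials)]

# Form 2: a single 3D array of shape (n_times, n_chans, n_trials).
xx_3d = np.stack(xx_list, axis=-1)
yy_3d = np.stack(yy_list, axis=-1)

# Shifts of y relative to x; negative values are allowed.
shifts = np.arange(-2, 3)
```

Either pair (xx_list, yy_list) or (xx_3d, yy_3d) matches the shapes described for xx and yy above.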

meegkit.cca.mcca(C, n_channels, n_keep=[])

Multiway canonical correlation analysis.

As described in [1].

Parameters:
  • C (array, shape=(n_channels * n_datasets, n_channels * n_datasets)) – Covariance matrix of aggregated data sets.

  • n_channels (int) – Number of channels of each data set.

  • n_keep (int) – Number of components to keep (for orthogonal transforms).

Returns:

  • A (array, shape=(n_channels * n_datasets, n_channels * n_datasets)) – Transform matrix.

  • scores (array, shape=(n_comps,)) – Commonality score (ranges from 1 to N^2).

  • AA (list of arrays, shapes = (n_channels, n_channels * n_datasets)) – Subject-specific MCCA transform matrices.

References

[1]

de Cheveigne, A., Di Liberto, G. M., Arzounian, D., Wong, D., Hjortkjaer, J., Fuglsang, S. A., & Parra, L. C. (2018). Multiway Canonical Correlation Analysis of Brain Signals. bioRxiv, 344960.
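The covariance matrix of aggregated data sets expected as C can be built by concatenating the data sets along the channel axis. The sketch below uses synthetic data and NumPy only. It assumes an unnormalized covariance (no division by n_times) and removes channel means beforehand; both choices are assumptions of this example, not documented requirements of mcca.

```python
import numpy as np

rng = np.random.default_rng(0)
n_times, n_channels, n_datasets = 500, 4, 3

# One (n_times, n_channels) array per data set, sharing the same time axis.
datasets = [rng.standard_normal((n_times, n_channels)) for _ in range(n_datasets)]

# Concatenate along channels, center, then form the aggregate covariance.
X = np.concatenate(datasets, axis=1)   # (n_times, n_channels * n_datasets)
X = X - X.mean(axis=0)
C = X.T @ X                            # (n_channels * n_datasets,) * 2

# The k-th diagonal block of C is the covariance of data set k alone.
k = 1
block = C[k * n_channels:(k + 1) * n_channels,
          k * n_channels:(k + 1) * n_channels]
```

With this layout, C has the shape listed for the C parameter, and n_channels is the per-data-set channel count.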

meegkit.cca.nt_cca(X=None, Y=None, lags=None, C=None, m=None, thresh=1e-12, sfreq=1)

Compute CCA from covariance.

Parameters:
  • X (arrays, shape=(n_times, n_chans[, n_trials])) – Data.

  • Y (arrays, shape=(n_times, n_chans[, n_trials])) – Data.

  • lags (array, shape=(n_lags,)) – Array of lags. A positive lag means Y delayed relative to X. If sfreq is > 1, lags are interpreted as times in seconds.

  • C (array, shape=(n_chans, n_chans[, n_lags])) – Covariance matrix of [X, Y]. C can be 3D, in which case CCA is derived independently from each page.

  • m (int) – Number of channels of X.

  • thresh (float) – Discard principal components below this value.

  • sfreq (float) – Sampling frequency. If not 1, lags are assumed to be given in seconds.

Returns:

  • A (array, shape=(n_chans_X, n_comps[, n_lags])) – Transform matrix mapping X to canonical space, where n_comps is equal to min(n_chans_X, n_chans_Y).

  • B (array, shape=(n_chans_Y, n_comps[, n_lags])) – Transform matrix mapping Y to canonical space, where n_comps is equal to min(n_chans_X, n_chans_Y).

  • R (array, shape=(n_comps, n_lags)) – Correlation scores.

Notes

Usage 1: CCA of X, Y.

    >> A, B, R = nt_cca(X, Y)

Usage 2: CCA of X, Y for each value of lags.

    >> A, B, R = nt_cca(X, Y, lags)

A positive lag indicates that Y is delayed relative to X.

Usage 3: CCA from covariance matrix.

    >> C = [X, Y].T * [X, Y]
    >> A, B, R = nt_cca(None, None, None, C=C, m=X.shape[1])

Use the third form to handle multiple files or large data (covariance C can be calculated chunk-by-chunk).
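The chunk-by-chunk accumulation mentioned for the third form can be sketched with NumPy alone. The data here are synthetic and chunk_size is an arbitrary illustrative value; in practice each chunk might come from a separate file.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 4))
Y = rng.standard_normal((1000, 3))

n_total = X.shape[1] + Y.shape[1]
C = np.zeros((n_total, n_total))
chunk_size = 250  # illustrative value

# Accumulate [X, Y].T @ [X, Y] one chunk of samples at a time.
for start in range(0, X.shape[0], chunk_size):
    stop = start + chunk_size
    XY = np.hstack([X[start:stop], Y[start:stop]])
    C += XY.T @ XY

# Reference: the same matrix computed from the full data in one pass.
XY_full = np.hstack([X, Y])
C_full = XY_full.T @ XY_full

m = X.shape[1]  # number of channels of X, as expected by the m parameter
```

Because the covariance is a sum over samples, the accumulated C matches the single-pass result, and only one chunk needs to be in memory at a time.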

Warning

Means of X and Y are NOT removed.

Warning

A, B are scaled so that (X * A)^2 and (Y * B)^2 are identity matrices (differs from sklearn).
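The two warnings can be made concrete with a small NumPy sketch of CCA computed from a covariance matrix. This is a textbook whitening-plus-SVD formulation, not the nt_cca implementation; inv_sqrt and cca_from_cov are hypothetical helpers written for this example. Means are removed explicitly before forming C, and the resulting transforms satisfy the identity scaling described above.

```python
import numpy as np

def inv_sqrt(C, thresh=1e-12):
    """Symmetric inverse square root of a covariance matrix (hypothetical helper)."""
    d, V = np.linalg.eigh(C)
    keep = d > thresh * d.max()          # drop negligible components
    return (V[:, keep] / np.sqrt(d[keep])) @ V[:, keep].T

def cca_from_cov(C, m):
    """Textbook CCA from the covariance of [X, Y]; m is the channel count of X."""
    Cxx, Cyy, Cxy = C[:m, :m], C[m:, m:], C[:m, m:]
    Wx, Wy = inv_sqrt(Cxx), inv_sqrt(Cyy)
    U, R, Vt = np.linalg.svd(Wx @ Cxy @ Wy, full_matrices=False)
    return Wx @ U, Wy @ Vt.T, R

rng = np.random.default_rng(0)
shared = rng.standard_normal((2000, 2))  # two sources common to X and Y
X = np.hstack([shared, rng.standard_normal((2000, 2))]) @ rng.standard_normal((4, 4))
Y = np.hstack([shared, rng.standard_normal((2000, 1))]) @ rng.standard_normal((3, 3))

# Means are not removed for you, so center the data before forming C.
X = X - X.mean(axis=0)
Y = Y - Y.mean(axis=0)

XY = np.hstack([X, Y])
A, B, R = cca_from_cov(XY.T @ XY, m=X.shape[1])

# Under this scaling the Gram matrices of the projections are identities.
gram_x = (X @ A).T @ (X @ A)
gram_y = (Y @ B).T @ (Y @ B)
```

Here gram_x and gram_y are identity matrices up to numerical error, and the leading entries of R are close to 1 because X and Y share two sources.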

meegkit.cca.whiten(C, fudge=1e-18)

Whiten covariance matrix C of X.

If X has shape=(observations, components) and W is the whitening matrix returned by this function, then X_white = np.dot(X, W).

References

https://stackoverflow.com/questions/6574782/how-to-whiten-matrix-in-pca
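One common way to obtain such a whitening matrix is through the eigendecomposition of C. The sketch below is an assumption about the general technique, not a copy of the library code; whiten_sketch is a hypothetical name, and fudge plays the role of a small constant added to the eigenvalues before inversion.

```python
import numpy as np

def whiten_sketch(C, fudge=1e-18):
    """Eigendecomposition-based whitening matrix (illustrative, hypothetical)."""
    d, V = np.linalg.eigh(C)
    # Scale each eigen-direction by 1 / sqrt(eigenvalue), then rotate back.
    return V @ np.diag(1.0 / np.sqrt(d + fudge)) @ V.T

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 3)) @ rng.standard_normal((3, 3))
X = X - X.mean(axis=0)

C = X.T @ X / X.shape[0]
W = whiten_sketch(C)
X_white = np.dot(X, W)

# The covariance of the whitened data should be close to the identity.
C_white = X_white.T @ X_white / X.shape[0]
```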

meegkit.cca.whiten_nt(C, thresh=1e-12, keep=False)

Covariance whitening function from noisetools.

Parameters:
  • C (array) – Covariance matrix.

  • thresh (float) – PCA threshold.

  • keep (bool) – If True, infrathreshold components are set to zero. If False (default), infrathreshold components are truncated.
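The difference between truncating and zeroing infrathreshold components shows up in the shape of the whitening matrix. The following sketch is illustrative only: whiten_pca_sketch is a hypothetical function, and it assumes that the threshold is applied to eigenvalues relative to the largest one, which this page does not confirm.

```python
import numpy as np

def whiten_pca_sketch(C, thresh=1e-12, keep=False):
    """PCA whitening with thresholding (hypothetical, for illustration)."""
    d, V = np.linalg.eigh(C)
    order = np.argsort(d)[::-1]          # sort by decreasing eigenvalue
    d, V = d[order], V[:, order]
    good = d / d.max() > thresh          # assumed: relative threshold
    scale = np.zeros_like(d)
    scale[good] = 1.0 / np.sqrt(d[good])
    W = V * scale                        # scale each column of V
    # keep=True retains zeroed columns, keep=False drops them.
    return W if keep else W[:, good]

rng = np.random.default_rng(0)
# Rank-2 data spread over 4 channels, so two components are negligible.
X = rng.standard_normal((500, 2)) @ rng.standard_normal((2, 4))
C = X.T @ X

W_trunc = whiten_pca_sketch(C, keep=False)   # shape (4, 2)
W_zero = whiten_pca_sketch(C, keep=True)     # shape (4, 4), last two columns zero
```

Truncation changes the output dimensionality, while zeroing preserves it at the cost of carrying all-zero components.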

meegkit.cca.whiten_svd(X)

SVD whitening.
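A generic form of SVD whitening replaces the singular values of X with ones. The sketch below shows that idea with NumPy; it assumes X has shape (n_times, n_chans) with full column rank, and it is not taken from the library source.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((300, 4)) @ rng.standard_normal((4, 4))

# X = U @ diag(s) @ Vt; dropping diag(s) leaves orthonormal columns.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
X_white = U @ Vt

# The Gram matrix of the whitened data is the identity (unnormalized).
gram = X_white.T @ X_white
```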

meegkit.cca.whiten_zca(C, thresh=None)

Compute ZCA whitening matrix (aka Mahalanobis whitening).

Parameters:
  • C (array) – Covariance matrix.

  • thresh (float) – Whitening constant; it prevents division by zero.

Returns:

  • ZCA (array, shape=(n_chans, n_chans)) – ZCA matrix, to be multiplied with data.