meegkit.asr
Artifact Subspace Reconstruction.
Functions
| asr_calibrate | Calibration function for the Artifact Subspace Reconstruction method. |
| asr_process | Apply Artifact Subspace Reconstruction method. |
| clean_windows | Remove periods with abnormally high-power content from continuous data. |
Classes
| ASR | Artifact Subspace Reconstruction. |
- class meegkit.asr.ASR(sfreq=250, cutoff=5, blocksize=100, win_len=0.5, win_overlap=0.66, max_dropout_fraction=0.1, min_clean_fraction=0.25, name='asrfilter', method='euclid', estimator='scm', **kwargs)
Bases: object
Artifact Subspace Reconstruction.
Artifact subspace reconstruction (ASR) is an automatic, online, component-based artifact removal method for removing transient or large-amplitude artifacts in multi-channel EEG recordings [1].
- Parameters:
sfreq (float) – Sampling rate of the data, in Hz.
The following are optional parameters (the key parameter of the method is the cutoff):
cutoff (float) – Standard deviation cutoff for rejection. Portions of X whose variance is larger than this threshold relative to the calibration data are considered missing data and will be removed. The most aggressive value that can be used without losing too much EEG is 2.5. A quite conservative value would be 5 (default=5).
blocksize (int) – Block size for calculating the robust data covariance and thresholds, in samples; reduces the memory and time requirements of the robust estimators by this factor (down to n_chans x n_chans x n_samples x 16 / blocksize bytes) (default=100).
win_len (float) – Window length (s) that is used to check the data for artifact content. This is ideally as long as the expected time scale of the artifacts but not shorter than half a cycle of the high-pass filter that was used (default=0.5).
win_overlap (float) – Window overlap fraction. The fraction of two successive windows that overlaps. Higher overlap ensures that fewer artifact portions are going to be missed, but is slower (default=0.66).
max_dropout_fraction (float) – Maximum fraction of windows that can be subject to signal dropouts (e.g., sensor unplugged), used for threshold estimation (default=0.1).
min_clean_fraction (float) – Minimum fraction of windows that need to be clean, used for threshold estimation (default=0.25).
method ({'riemann', 'euclid'}) – Method to use. If 'riemann', use the Riemannian-modified version of ASR [2].
memory (float) – Memory size (s), regulates the number of covariance matrices to store.
estimator (str in {'scm', 'lwf', 'oas', 'mcd'}) – Covariance estimator (default: ‘scm’ which computes the sample covariance). Use ‘lwf’ if you need regularization (requires pyriemann).
- ``state_``
Initial state of the ASR filter.
- Type:
dict
- ``zi_``
Filter initial conditions.
- Type:
array, shape=(n_channels, filter_order)
- ``ab_``
Coefficients of an IIR filter that is used to shape the spectrum of the signal when calculating artifact statistics. The output signal does not go through this filter. This is an optional way to tune the sensitivity of the algorithm to each frequency component of the signal. The default filter is less sensitive at alpha and beta frequencies and more sensitive at delta (blinks) and gamma (muscle) frequencies.
- Type:
2-tuple
- ``cov_``
Previous covariance matrix.
- Type:
array, shape=(channels, channels)
- ``state_``
Previous ASR parameters (as derived by asr_calibrate()) for successive calls to transform(). Required fields are:
- M : Mixing matrix
- T : Threshold matrix
- R : Reconstruction matrix (array | None)
- Type:
dict
References
[1] Kothe, C. A. E., & Jung, T. P. (2016). U.S. Patent Application No. 14/895,440. https://patents.google.com/patent/US20160113587A1/en
[2] Blum, S., Jacobsen, N. S. J., Bleichner, M. G., & Debener, S. (2019). A Riemannian Modification of Artifact Subspace Reconstruction for EEG Artifact Handling. Frontiers in Human Neuroscience, 13. https://doi.org/10.3389/fnhum.2019.00141
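Examples
The memory bound quoted for the blocksize parameter can be evaluated directly to see how the block size trades memory against granularity. The following is a minimal sketch of that documented formula only; the helper name robust_cov_memory_bytes is hypothetical and is not part of meegkit.

```python
def robust_cov_memory_bytes(n_chans, n_samples, blocksize=100):
    """Evaluate the documented bound n_chans x n_chans x n_samples x 16 / blocksize.

    Hypothetical helper for illustration; it is not a meegkit function.
    """
    if blocksize < 1:
        raise ValueError("blocksize must be a positive integer")
    return n_chans * n_chans * n_samples * 16 / blocksize


# 64 channels, 60 s of calibration data at 250 Hz (15000 samples)
for blocksize in (1, 10, 100):
    mib = robust_cov_memory_bytes(64, 15000, blocksize) / 2**20
    print(f"blocksize={blocksize:>3}: about {mib:.1f} MiB")
```

Under this bound, each tenfold increase in blocksize reduces the estimated footprint tenfold.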
- __init__(sfreq=250, cutoff=5, blocksize=100, win_len=0.5, win_overlap=0.66, max_dropout_fraction=0.1, min_clean_fraction=0.25, name='asrfilter', method='euclid', estimator='scm', **kwargs)
- fit(X, y=None, **kwargs)
Calibration for the Artifact Subspace Reconstruction method.
The input to this function is a multi-channel time series of calibration data. In typical use, the calibration data is clean resting EEG of about 1 minute duration (it can also be longer). On-task data can also be used if the fraction of artifact content is below the breakdown point of the robust statistics used for estimation (50% theoretical, ~30% practical). If more than 30-50% of the data consists of artifacts, bad time windows should be removed beforehand. This data is used to estimate the thresholds that the ASR processing function uses to identify and remove artifact components.
The calibration data must have been recorded for the same cap design from which data for cleanup will be recorded, and ideally should be from the same session and same subject, but it is possible to reuse the calibration data from a previous session and montage to the extent that the cap is placed in the same location (where loss in accuracy is more or less proportional to the mismatch in cap placement).
- Parameters:
X (array, shape=(n_channels, n_samples)) – Calibration data. It should have been high-pass filtered (for example at 0.5Hz or 1Hz using a Butterworth IIR filter) and be reasonably clean EEG of not much less than 30 seconds (this method is typically used with 1 minute or more).
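The duration and zero-mean expectations on the calibration data can be checked before calibrating. The sketch below assumes NumPy is available; check_calibration_data is a hypothetical helper, not a meegkit function, and the 30 s default simply mirrors the guidance above.

```python
import numpy as np


def check_calibration_data(X, sfreq, min_seconds=30.0):
    """Return (duration_s, is_long_enough, max_abs_channel_mean).

    Hypothetical helper, not part of meegkit. X has shape (n_channels, n_samples).
    """
    X = np.asarray(X, dtype=float)
    if X.ndim != 2:
        raise ValueError("expected an array of shape (n_channels, n_samples)")
    duration = X.shape[1] / sfreq
    # High-pass filtered data should have channel means close to zero.
    max_abs_mean = float(np.abs(X.mean(axis=1)).max())
    return duration, duration >= min_seconds, max_abs_mean


rng = np.random.default_rng(0)
sfreq = 250
X = rng.standard_normal((8, 60 * sfreq))  # 8 channels, 60 s of synthetic data
duration, long_enough, max_abs_mean = check_calibration_data(X, sfreq)
print(duration, long_enough, round(max_abs_mean, 3))
```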
- reset()
Reset filter.
- transform(X, y=None, **kwargs)
Apply Artifact Subspace Reconstruction.
- Parameters:
X (array, shape=([n_trials, ]n_channels, n_samples)) – Raw data.
- Returns:
out – Filtered data.
- Return type:
array, shape=([n_trials, ]n_channels, n_samples)
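The shape notation ([n_trials, ]n_channels, n_samples) means that the input may be either continuous (2D) or epoched (3D) data, and that the output keeps the same layout. The following sketch illustrates that convention with NumPy; apply_per_trial and demean are hypothetical stand-ins and do not reflect how meegkit handles the trial dimension internally.

```python
import numpy as np


def apply_per_trial(X, func):
    """Apply func to each (n_channels, n_samples) trial, preserving X's layout.

    Hypothetical helper; `func` stands in for any per-trial cleaning step.
    """
    X = np.asarray(X)
    if X.ndim == 2:  # continuous data: (n_channels, n_samples)
        return func(X)
    if X.ndim == 3:  # epoched data: (n_trials, n_channels, n_samples)
        return np.stack([func(trial) for trial in X])
    raise ValueError("expected a 2D or 3D array")


def demean(trial):
    # Placeholder processing step: remove each channel's mean.
    return trial - trial.mean(axis=1, keepdims=True)


continuous = np.arange(12.0).reshape(2, 6)
epochs = np.arange(24.0).reshape(2, 2, 6)
print(apply_per_trial(continuous, demean).shape)
print(apply_per_trial(epochs, demean).shape)
```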
- meegkit.asr.asr_calibrate(X, sfreq, cutoff=5, blocksize=100, win_len=0.5, win_overlap=0.66, max_dropout_fraction=0.1, min_clean_fraction=0.25, method='euclid', estimator='scm')
Calibration function for the Artifact Subspace Reconstruction method.
The input to this function is a multi-channel time series of calibration data. In typical use, the calibration data is clean resting EEG of about 1 minute duration (it can also be longer). On-task data can also be used if the fraction of artifact content is below the breakdown point of the robust statistics used for estimation (50% theoretical, ~30% practical). If more than 30-50% of the data consists of artifacts, bad time windows should be removed beforehand. This data is used to estimate the thresholds that the ASR processing function uses to identify and remove artifact components.
The calibration data must have been recorded for the same cap design from which data for cleanup will be recorded, and ideally should be from the same session and same subject, but it is possible to reuse the calibration data from a previous session and montage to the extent that the cap is placed in the same location (where loss in accuracy is more or less proportional to the mismatch in cap placement).
The calibration data should have been high-pass filtered (for example at 0.5Hz or 1Hz using a Butterworth IIR filter).
- Parameters:
X (array, shape=([n_trials, ]n_channels, n_samples)) – zero-mean (e.g., high-pass filtered) and reasonably clean EEG of not much less than 30 seconds (this method is typically used with 1 minute or more).
sfreq (float) – Sampling rate of the data, in Hz.
cutoff (float) – Standard deviation cutoff for rejection. X portions whose variance is larger than this threshold relative to the calibration data are considered missing data and will be removed. The most aggressive value that can be used without losing too much EEG is 2.5. A quite conservative value would be 5 (default=5).
blocksize (int) – Block size for calculating the robust data covariance and thresholds, in samples; reduces the memory and time requirements of the robust estimators by this factor (down to n_chans x n_chans x n_samples x 16 / blocksize bytes) (default=100).
win_len (float) – Window length (s) that is used to check the data for artifact content. This is ideally as long as the expected time scale of the artifacts but short enough to allow for several thousand windows to compute statistics over (default=0.5).
win_overlap (float) – Window overlap fraction. The fraction of two successive windows that overlaps. Higher overlap ensures that fewer artifact portions are going to be missed, but is slower (default=0.66).
max_dropout_fraction (float) – Maximum fraction of windows that can be subject to signal dropouts (e.g., sensor unplugged), used for threshold estimation (default=0.1).
min_clean_fraction (float) – Minimum fraction of windows that need to be clean, used for threshold estimation (default=0.25).
method ({'euclid', 'riemann'}) – Metric to compute the covariance matrix average.
- Returns:
M (array) – Mixing matrix.
T (array) – Threshold matrix.
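The number of windows available for threshold estimation depends on the recording length, win_len, and win_overlap. The sketch below estimates that count with plain sliding-window arithmetic; count_windows is a hypothetical helper, and the exact rounding used inside meegkit may differ.

```python
def count_windows(n_samples, sfreq, win_len=0.5, win_overlap=0.66):
    """Estimate how many sliding windows fit into n_samples.

    Hypothetical helper; meegkit's own rounding may differ slightly.
    """
    win_samples = int(round(win_len * sfreq))
    # Step between window starts, at least one sample.
    step = max(1, int(round(win_samples * (1 - win_overlap))))
    if n_samples < win_samples:
        return 0
    return (n_samples - win_samples) // step + 1


sfreq = 250
for seconds in (30, 60, 300):
    print(seconds, "s ->", count_windows(seconds * sfreq, sfreq), "windows")
```

Longer recordings or a higher overlap give more windows for the statistics, at the cost of computation time.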
- meegkit.asr.asr_process(X, X_filt, state, cov=None, detrend=False, method='riemann', sample_weight=None)
Apply Artifact Subspace Reconstruction method.
This function cleans a multi-channel signal using the ASR method. The required inputs are the data matrix, the filtered data used to estimate the covariance (or the covariance itself), and the filter state.
- Parameters:
X (array, shape=([n_trials, ]n_channels, n_samples)) – Raw data.
X_filt (array, shape=([n_trials, ]n_channels, n_samples)) – Yulewalk-filtered epochs to estimate covariance. Optional if covariance is provided.
state (dict) – Initial ASR parameters (as derived by asr_calibrate()):
- M : Mixing matrix
- T : Threshold matrix
- R : Previous reconstruction matrix (array | None)
cov (array, shape=([n_trials, ]n_channels, n_channels) | None) – Covariance. If None (default), then it is computed from X_filt. If a 3D array is provided, the average covariance is computed from all the elements in it.
detrend (bool) – If True, detrend filtered data (default=False).
method ({'euclid', 'riemann'}) – Metric to compute the covariance matrix average.
- Returns:
clean (array, shape=([n_trials, ]n_channels, n_samples)) – Clean data.
state (3-tuple) – Output ASR parameters.
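The relationship between X_filt and cov described above can be sketched with NumPy: a sample covariance per trial, then an average over the trial axis. This is an illustration under stated assumptions, not meegkit's implementation: the arithmetic mean shown corresponds only to a Euclidean average (the 'riemann' metric uses a different mean that is not shown), and both helper names are hypothetical.

```python
import numpy as np


def sample_covariance(X_filt):
    """Sample covariance of zero-mean data with shape (n_channels, n_samples)."""
    X_filt = np.asarray(X_filt, dtype=float)
    return X_filt @ X_filt.T / X_filt.shape[1]


def average_covariance(cov):
    """Collapse (n_trials, n_channels, n_channels) to one matrix (Euclidean mean)."""
    cov = np.asarray(cov, dtype=float)
    return cov if cov.ndim == 2 else cov.mean(axis=0)


rng = np.random.default_rng(42)
epochs = rng.standard_normal((5, 4, 1000))  # 5 trials, 4 channels
covs = np.stack([sample_covariance(trial) for trial in epochs])
cov = average_covariance(covs)
print(covs.shape, cov.shape)
```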
- meegkit.asr.clean_windows(X, sfreq, max_bad_chans=0.2, zthresholds=[-3.5, 5], win_len=0.5, win_overlap=0.66, min_clean_fraction=0.25, max_dropout_fraction=0.1, show=False)
Remove periods with abnormally high-power content from continuous data.
This function cuts segments from the data which contain high-power artifacts. Specifically, only windows are retained which have less than a certain fraction of “bad” channels, where a channel is bad in a window if its power is above or below a given upper/lower threshold (in standard deviations from a robust estimate of the EEG power distribution in the channel).
- Parameters:
X (array, shape=(n_channels, n_samples)) – Continuous data set, assumed to be appropriately high-passed (e.g. > 1Hz or 0.5Hz - 2.0Hz transition band)
max_bad_chans (float) – The maximum number or fraction of bad channels that a retained window may still contain (more than this and it is removed). Reasonable range is 0.05 (very clean output) to 0.3 (very lax cleaning of only coarse artifacts) (default=0.2).
zthresholds (2-tuple) – The minimum and maximum standard deviations within which the power of a channel must lie (relative to a robust estimate of the clean EEG power distribution in the channel) for it to be considered “not bad”. (default=[-3.5, 5]).
The following are detail parameters that usually do not have to be tuned. If you can't get the function to do what you want, you might consider adapting these to your data.
win_len (float) – Window length (s) that is used to check the data for artifact content. This is ideally as long as the expected time scale of the artifacts but not shorter than half a cycle of the high-pass filter that was used (default=0.5).
win_overlap (float) – Window overlap fraction. The fraction of two successive windows that overlaps. Higher overlap ensures that fewer artifact portions are going to be missed, but is slower (default=0.66).
min_clean_fraction (float) – Minimum fraction that needs to be clean. This is the minimum fraction of time windows that need to contain essentially uncontaminated EEG. (default=0.25)
max_dropout_fraction (float) – Maximum fraction that can have dropouts. This is the maximum fraction of time windows that may have arbitrarily low amplitude (e.g., due to the sensors being unplugged) (default=0.1).
- Returns:
clean (array, shape=(n_channels, n_samples)) – Dataset with bad time periods removed.
sample_mask (boolean array, shape=(1, n_samples)) – Mask of retained samples (logical array).
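The rejection rule described above (a window is dropped when too many channels have abnormal power) can be sketched as follows. This is a deliberately simplified illustration, not meegkit's algorithm: it assumes non-overlapping windows, uses median/MAD z-scores instead of the library's robust fit of the power distribution, treats max_bad_chans as a fraction only, and returns a 1D mask rather than one of shape (1, n_samples). The name reject_windows is hypothetical.

```python
import numpy as np


def reject_windows(X, sfreq, max_bad_chans=0.2, zthresholds=(-3.5, 5),
                   win_len=0.5):
    """Simplified window rejection on (n_channels, n_samples) data.

    Hypothetical sketch: uses non-overlapping windows and median/MAD z-scores,
    not meegkit's robust distribution fit.
    """
    X = np.asarray(X, dtype=float)
    n_chans, n_samples = X.shape
    win = int(round(win_len * sfreq))
    n_win = n_samples // win
    # RMS power per channel and window: shape (n_channels, n_win)
    segs = X[:, :n_win * win].reshape(n_chans, n_win, win)
    rms = np.sqrt((segs ** 2).mean(axis=2))
    # Robust z-score of each window's RMS within its channel
    med = np.median(rms, axis=1, keepdims=True)
    mad = np.median(np.abs(rms - med), axis=1, keepdims=True) * 1.4826
    z = (rms - med) / mad
    bad = (z < zthresholds[0]) | (z > zthresholds[1])
    # Keep windows whose fraction of bad channels is within the limit
    keep_win = bad.mean(axis=0) <= max_bad_chans
    # Samples in a trailing partial window are discarded (mask stays False)
    sample_mask = np.zeros(n_samples, dtype=bool)
    sample_mask[:n_win * win] = np.repeat(keep_win, win)
    return X[:, sample_mask], sample_mask


rng = np.random.default_rng(1)
sfreq = 250
X = rng.standard_normal((8, 20 * sfreq))
X[:, 1000:1250] *= 20  # inject a 1 s high-amplitude burst on all channels
clean, mask = reject_windows(X, sfreq)
print(X.shape, clean.shape, mask[1000:1250].any())
```

In this synthetic example the injected burst falls in windows where every channel exceeds the upper threshold, so those samples are excluded from the returned data.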