class sklearn_extra.kernel_methods.EigenProClassifier(batch_size='auto', n_epoch=2, n_components=1000, subsample_size='auto', kernel='rbf', gamma=0.02, degree=3, coef0=1, kernel_params=None, random_state=None)

Classification using EigenPro iteration.

Trains a least-squares kernel classification model using mini-batch EigenPro iteration.

batch_size : int, default='auto'

Mini-batch size for gradient descent.

n_epoch : int, default=2

The number of passes over the training data.

n_components : int, default=1000

The maximum number of eigendirections used in modifying the kernel operator. The convergence rate speedup over normal gradient descent is approximately the ratio of the largest eigenvalue to the n_components-th eigenvalue. However, computing the eigenvalues may take time for large n_components.
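The eigenvalue ratio mentioned above can be estimated on a small subsample. The following is a minimal sketch using only NumPy; the helpers `rbf_kernel_matrix` and `estimated_speedup` are hypothetical names introduced for illustration and are not part of sklearn_extra, and the estimator's internal computation may differ.

```python
import numpy as np

def rbf_kernel_matrix(X, gamma):
    """Dense RBF kernel matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    # Clip tiny negative values caused by floating point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

def estimated_speedup(X, n_components, gamma=0.02):
    """Ratio of the largest eigenvalue to the n_components-th largest one."""
    K = rbf_kernel_matrix(X, gamma)
    eigvals = np.linalg.eigvalsh(K)[::-1]  # eigvalsh is ascending; reverse it
    return eigvals[0] / eigvals[n_components - 1]

rng = np.random.RandomState(0)
X_sub = rng.randn(200, 20)  # illustrative subsample, not real data
ratio_small = estimated_speedup(X_sub, n_components=5)
ratio_large = estimated_speedup(X_sub, n_components=50)
```

Because the eigenvalues are sorted in decreasing order, the ratio can only grow as n_components increases, which is the trade-off the parameter description refers to: a larger ratio at the cost of a more expensive eigendecomposition.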

subsample_size : int, default='auto'

The size of the subsample used for estimating the largest n_components eigenvalues and eigenvectors. When set to 'auto', it is 4000 if there are fewer than 100,000 training samples, and 12000 otherwise.
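The 'auto' rule can be restated as a one-line function. This sketch only mirrors the rule as documented here; `auto_subsample_size` is a hypothetical helper, and the library may apply further adjustments that this page does not describe.

```python
def auto_subsample_size(n_samples):
    """Documented 'auto' rule: 4000 below 100,000 training samples, else 12000."""
    return 4000 if n_samples < 100_000 else 12000

small = auto_subsample_size(4000)      # fewer than 100,000 samples
large = auto_subsample_size(250_000)   # 100,000 samples or more
```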

kernel : string or callable, default='rbf'

Kernel mapping used internally. Strings can be anything supported by scikit-learn; however, there is special support for the rbf, laplace, and cauchy kernels. If a callable is given, it should accept two arguments and return a floating point number.

gamma : float or 'scale', default=0.02

Kernel coefficient. If ‘scale’, gamma = 1/(n_features*X.var()). Interpretation of the default value is left to the kernel; see the documentation for sklearn.metrics.pairwise. For kernels that use bandwidth, bandwidth = 1/sqrt(2*gamma).
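The two formulas above can be checked numerically. This is a small sketch with NumPy on synthetic data; the variable names are illustrative only.

```python
import numpy as np

rng = np.random.RandomState(0)
X = rng.randn(500, 20)  # synthetic data for illustration
n_features = X.shape[1]

# 'scale' rule: gamma = 1 / (n_features * X.var())
gamma_scale = 1.0 / (n_features * X.var())

# Bandwidth implied by a given gamma: bandwidth = 1 / sqrt(2 * gamma)
bandwidth_scale = 1.0 / np.sqrt(2.0 * gamma_scale)

# With gamma = 0.02, as in the class signature, the bandwidth is 5.
bandwidth_default = 1.0 / np.sqrt(2.0 * 0.02)
```

Inverting the relation gives gamma = 1/(2*bandwidth**2), so a larger bandwidth corresponds to a smaller gamma.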

degree : float, default=3

Degree of the polynomial kernel. Ignored by other kernels.

coef0 : float, default=1

Zero coefficient for polynomial and sigmoid kernels. Ignored by other kernels.

kernel_params : mapping of string to any, default=None

Additional keyword arguments for the kernel function when it is passed as a callable object.
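As an illustration of a callable kernel with extra keyword arguments, the sketch below defines a hypothetical function `scaled_laplacian` that takes two vectors and returns a float, plus a `kernel_params` mapping for its extra `scale` argument. Both the function and its `scale` parameter are assumptions made for this example; how the estimator invokes the callable internally is not shown here.

```python
import numpy as np

def scaled_laplacian(x, y, scale=1.0):
    """Hypothetical kernel: exp(-||x - y||_1 / scale) for two 1-D vectors."""
    return float(np.exp(-np.sum(np.abs(x - y)) / scale))

# Extra keyword arguments, in the form expected by kernel_params.
kernel_params = {"scale": 2.0}

x = np.array([0.0, 1.0])
y = np.array([1.0, 1.0])
value = scaled_laplacian(x, y, **kernel_params)  # exp(-1 / 2)
```

Following the parameter descriptions above, such a function would presumably be supplied as kernel=scaled_laplacian together with kernel_params=kernel_params.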

random_state : int, RandomState instance or None, default=None

The seed of the pseudo-random number generator to use when shuffling the data. If int, random_state is the seed used by the random number generator. If RandomState instance, random_state is the random number generator. If None, the random number generator is the RandomState instance used by np.random.
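The three cases can be sketched as a small NumPy-only function. `resolve_random_state` is a hypothetical helper written for this example; it relies on the private attribute `np.random.mtrand._rand` to reach NumPy's global RandomState, and it is not claimed to be the code the estimator runs.

```python
import numpy as np

def resolve_random_state(random_state):
    """Map None / int / RandomState to a RandomState, as described above."""
    if random_state is None:
        # Global RandomState instance used by the np.random functions.
        return np.random.mtrand._rand
    if isinstance(random_state, np.random.RandomState):
        return random_state
    return np.random.RandomState(random_state)  # treat the value as a seed

# The same integer seed yields the same shuffling order.
order_a = resolve_random_state(1).permutation(10)
order_b = resolve_random_state(1).permutation(10)
```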


  • Siyuan Ma, Mikhail Belkin “Diving into the shallows: a computational perspective on large-scale machine learning”, NIPS 2017.


>>> from sklearn_extra.kernel_methods import EigenProClassifier
>>> import numpy as np
>>> n_samples, n_features, n_targets = 4000, 20, 3
>>> rng = np.random.RandomState(1)
>>> x_train = rng.randn(n_samples, n_features)
>>> y_train = rng.randint(n_targets, size=n_samples)
>>> rgs = EigenProClassifier(n_epoch=3, gamma=.01, subsample_size=50)
>>> rgs.fit(x_train, y_train)
EigenProClassifier(gamma=0.01, n_epoch=3, subsample_size=50)
>>> y_pred = rgs.predict(x_train)
>>> loss = np.mean(y_train != y_pred)
__init__(batch_size='auto', n_epoch=2, n_components=1000, subsample_size='auto', kernel='rbf', gamma=0.02, degree=3, coef0=1, kernel_params=None, random_state=None)

Initialize self. See help(type(self)) for accurate signature.

Examples using sklearn_extra.kernel_methods.EigenProClassifier