sklearn_extra.kernel_methods.EigenProClassifier
class sklearn_extra.kernel_methods.EigenProClassifier(batch_size='auto', n_epoch=2, n_components=1000, subsample_size='auto', kernel='rbf', gamma=0.02, degree=3, coef0=1, kernel_params=None, random_state=None)
Classification using EigenPro iteration.
Trains a least-squares kernel classification model using mini-batch EigenPro iteration.
- Parameters:
- batch_size : int or 'auto', default='auto'
Mini-batch size for gradient descent.
- n_epoch : int, default=2
The number of passes over the training data.
- n_components : int, default=1000
The maximum number of eigendirections used in modifying the kernel operator. The convergence speedup over plain gradient descent is approximately the ratio of the largest eigenvalue to the n_components-th eigenvalue. Computing the eigenvalues can be slow when n_components is large.
- subsample_size : int or 'auto', default='auto'
The number of subsamples used to estimate the largest n_components eigenvalues and eigenvectors. When set to 'auto', it is 4000 if there are fewer than 100,000 training samples, and 12000 otherwise.
- kernel : string or callable, default='rbf'
Kernel mapping used internally. A string can name any kernel supported by scikit-learn; the rbf, laplace, and cauchy kernels have special support. A callable should accept two arguments and return a floating point number.
- gamma : float or 'scale', default=0.02
Kernel coefficient. If 'scale', gamma = 1/(n_features*X.var()). Interpretation of the default value is left to the kernel; see the documentation for sklearn.metrics.pairwise. For kernels that use bandwidth, bandwidth = 1/sqrt(2*gamma).
- degree : float, default=3
Degree of the polynomial kernel. Ignored by other kernels.
- coef0 : float, default=1
Zero coefficient for polynomial and sigmoid kernels. Ignored by other kernels.
- kernel_params : mapping of string to any, default=None
Additional keyword arguments for the kernel function when it is passed as a callable.
- random_state : int, RandomState instance or None, default=None
The seed of the pseudo-random number generator used when shuffling the data. If int, random_state is the seed used by the random number generator; if a RandomState instance, random_state is the random number generator; if None, the random number generator is the RandomState instance used by np.random.
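The numeric rules quoted in the parameter descriptions (the 'scale' value of gamma, the gamma-to-bandwidth conversion, and the 'auto' subsample size) can be written out as a minimal NumPy sketch. The helper names scale_gamma, bandwidth_from_gamma, and auto_subsample_size are hypothetical and are not part of sklearn_extra; they encode only the formulas as documented here, and the library may apply further adjustments internally.

```python
import numpy as np


def scale_gamma(X):
    """Documented 'scale' rule: gamma = 1 / (n_features * X.var())."""
    X = np.asarray(X, dtype=float)
    return 1.0 / (X.shape[1] * X.var())


def bandwidth_from_gamma(gamma):
    """Documented conversion: bandwidth = 1 / sqrt(2 * gamma)."""
    return 1.0 / np.sqrt(2.0 * gamma)


def auto_subsample_size(n_samples):
    """Documented 'auto' rule: 4000 below 100,000 training samples, else 12000."""
    return 4000 if n_samples < 100_000 else 12000


# Illustrative data only; shapes are arbitrary.
rng = np.random.RandomState(0)
X = rng.randn(500, 20)

gamma = scale_gamma(X)
bandwidth = bandwidth_from_gamma(gamma)
subsample = auto_subsample_size(X.shape[0])
```

Writing the rules as small functions makes it easy to check what value a given dataset would produce before passing an explicit gamma or subsample_size to the estimator.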
References
Siyuan Ma, Mikhail Belkin “Diving into the shallows: a computational perspective on large-scale machine learning”, NIPS 2017.
Examples
>>> from sklearn_extra.kernel_methods import EigenProClassifier
>>> import numpy as np
>>> n_samples, n_features, n_targets = 4000, 20, 3
>>> rng = np.random.RandomState(1)
>>> x_train = rng.randn(n_samples, n_features)
>>> y_train = rng.randint(n_targets, size=n_samples)
>>> rgs = EigenProClassifier(n_epoch=3, gamma=.01, subsample_size=50)
>>> rgs.fit(x_train, y_train)
EigenProClassifier(gamma=0.01, n_epoch=3, subsample_size=50)
>>> y_pred = rgs.predict(x_train)
>>> loss = np.mean(y_train != y_pred)
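The n_components description states that the speedup is roughly the largest eigenvalue divided by the n_components-th eigenvalue. The sketch below illustrates that ratio on a small subsample using plain NumPy and an RBF kernel. It is an assumption-laden illustration, not the estimator's internal procedure: rbf_kernel_matrix and speedup_estimate are hypothetical helpers, and the index used for the "n_components-th" eigenvalue follows the wording of this page.

```python
import numpy as np


def rbf_kernel_matrix(X, gamma):
    """Dense RBF kernel matrix: K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    np.maximum(sq_dists, 0.0, out=sq_dists)  # clip tiny negatives from round-off
    return np.exp(-gamma * sq_dists)


def speedup_estimate(X, gamma, n_components):
    """Ratio of the largest eigenvalue to the n_components-th largest one."""
    K = rbf_kernel_matrix(X, gamma)
    eigvals = np.linalg.eigvalsh(K)[::-1]  # eigvalsh is ascending; reverse it
    return eigvals[0] / eigvals[n_components - 1]


# A small random subsample stands in for the training data.
rng = np.random.RandomState(1)
X_sub = rng.randn(200, 20)

ratio_10 = speedup_estimate(X_sub, gamma=0.01, n_components=10)
ratio_50 = speedup_estimate(X_sub, gamma=0.01, n_components=50)
```

Because the eigenvalues are sorted in descending order, the ratio cannot decrease as n_components grows, which matches the trade-off described above: more components promise a larger speedup but require a more expensive eigendecomposition.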
Examples using sklearn_extra.kernel_methods.EigenProClassifier
Comparison of EigenPro and SVC on Digit Classification