`ranksim`.RankSimilarityTransform

class ranksim.RankSimilarityTransform(*, n_filters='auto', max_filters=5000, n_fast_filters=1000, initialize='random', spreading='max', n_iter=5, random_state=None, filter_function='auto', create_distribution=None, **kwargs)

Rank Similarity Transform

Transform the data based on the responses of rank similarity filters. The number of output dimensions equals n_filters, and output values lie between 0 and 1.

Read more in the User Guide.

Parameters

n_filters: {‘auto’} or int, default=‘auto’

Number of filters to use. ‘auto’ will determine this based on max_filters, n_fast_filters and the size of the input data.

max_filters: int, default=5000

Maximum number of filters to allocate.

Only used when n_filters='auto'.

n_fast_filters: int, default=1000

Minimum number of filters to allocate, unless the input data has fewer samples than this number.

Only used when n_filters='auto'.
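
The exact rule that ‘auto’ uses is not given here, so the following is only a sketch of the documented bounds. It assumes a hypothetical, size-dependent `candidate` count and clamps it between n_fast_filters and max_filters; the helper name `resolve_n_filters` and the choice to fall back to the sample count for small inputs are assumptions, not part of ranksim.

```python
def resolve_n_filters(n_samples, candidate, max_filters=5000, n_fast_filters=1000):
    """Clamp a hypothetical candidate filter count to the documented bounds."""
    # Lower bound: n_fast_filters, unless the data has fewer samples than that
    # (assumption: the bound then drops to the number of samples).
    lower = min(n_fast_filters, n_samples)
    # Upper bound: max_filters.
    return max(lower, min(candidate, max_filters))


# Illustrative inputs only; the candidate values are made up.
for n_samples, candidate in [(200, 50), (1797, 300), (100_000, 20_000)]:
    print(n_samples, candidate, resolve_n_filters(n_samples, candidate))
```

Under these assumptions, a small dataset is limited by its own size, a mid-sized one is raised to n_fast_filters, and a large one is capped at max_filters.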

initialize: {‘random’, ‘weighted_avg’, ‘plusplus’}, default=‘random’

Type of filter initialization.

‘random’: each filter is initialized from a randomly chosen data point.
‘weighted_avg’: filters are created from similar data; used when there are more filters than input data points.
‘plusplus’: filters are initialized from dissimilar data points, as in k-means++.
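
To make the k-means++ comparison concrete, here is a minimal pure-Python sketch of standard k-means++ seeding, which picks each new seed with probability proportional to its squared distance from the nearest seed already chosen. It illustrates the general technique only; the function names are hypothetical, and ranksim may measure dissimilarity differently (for example, by rank similarity rather than Euclidean distance).

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def plusplus_seed_indices(X, k, seed=0):
    """Return indices of k mutually dissimilar rows of X (k-means++ seeding)."""
    rng = random.Random(seed)
    # The first seed is drawn uniformly at random.
    chosen = [rng.randrange(len(X))]
    # Squared distance from every point to its nearest chosen seed.
    d2 = [sq_dist(x, X[chosen[0]]) for x in X]
    while len(chosen) < k:
        # Far-away points are more likely to be picked; chosen points have weight 0.
        idx = rng.choices(range(len(X)), weights=d2, k=1)[0]
        chosen.append(idx)
        d2 = [min(d, sq_dist(x, X[idx])) for d, x in zip(d2, X)]
    return chosen


X = [[0, 0], [0, 1], [10, 10], [10, 11], [20, 0]]
seeds = plusplus_seed_indices(X, k=3, seed=0)
print(seeds)
```

Because an already chosen point has zero weight, the sketch never selects the same row twice as long as the rows are distinct and k does not exceed their number.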

spreading: {‘max’, ‘weighted_avg’} or None, default=‘max’

Determines how data is spread between filters during training.

‘max’: each data point is allocated to the filter that responds to it most strongly.
‘weighted_avg’: the weighted average of a fixed number of data points is allocated to the maximally responding filter; used when there are more filters than data points.
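
The ‘max’ strategy can be pictured as an argmax assignment. The sketch below is an illustration under assumptions: it uses a plain dot product as a stand-in for the filter response (the actual rank similarity response is not specified here), and the names `allocate_to_max` and `dot` are hypothetical.

```python
def dot(x, f):
    """Stand-in response: dot product of a data point and a filter."""
    return sum(a * b for a, b in zip(x, f))


def allocate_to_max(X, filters, response=dot):
    """Group row indices of X by the filter that responds most strongly."""
    groups = [[] for _ in filters]
    for i, x in enumerate(X):
        scores = [response(x, f) for f in filters]
        # Index of the maximally responding filter (first one wins on ties).
        best = max(range(len(filters)), key=scores.__getitem__)
        groups[best].append(i)
    return groups


X = [[1, 0], [0, 1], [2, 0]]
filters = [[1, 0], [0, 1]]
groups = allocate_to_max(X, filters)
print(groups)
```

In a training sweep, each filter would then presumably be recomputed from the points allocated to it, repeated n_iter times.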

n_iter: int, default=5

Number of iterations/sweeps over the training dataset to perform during training.

random_state: int, RandomState instance or None, default=None

Determines random number generation for filter initialization. Pass an int for reproducible results across multiple function calls.

filter_function: {‘auto’} or callable, default=‘auto’

Function that determines the filter weights from subsections of the input data. ‘auto’ takes the mean and then ranks it, with the ranks optionally drawn from a distribution (see create_distribution).
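
As a rough picture of the "mean and rank" behaviour, the sketch below averages a group of rows feature by feature and replaces each mean with its ordinal rank. This is an assumption-laden illustration: the function name is hypothetical, and the tie handling, rank direction, and any normalization used by ranksim are not documented here.

```python
def mean_rank_filter(rows):
    """Average rows per feature, then return the ordinal rank of each mean."""
    n = len(rows)
    # Per-feature mean over the group of rows.
    means = [sum(col) / n for col in zip(*rows)]
    # Feature indices sorted by ascending mean (ties keep feature order).
    order = sorted(range(len(means)), key=lambda j: means[j])
    ranks = [0] * len(means)
    for rank, j in enumerate(order):
        ranks[j] = rank  # 0 = smallest mean
    return ranks


rows = [[0, 4, 2], [2, 8, 1]]
weights = mean_rank_filter(rows)
print(weights)
```

Here the per-feature means are 1, 6 and 1.5, so the second feature receives the highest rank.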

create_distribution: {‘confusion’}, callable or None, default=None

Creates a distribution to draw ranks from.

‘confusion’ is a distribution based on the confusability of features in the input data.

Note: the ‘confusion’ option is extremely slow.

Attributes

filters_: ndarray of shape (n_filters_, n_features)

Weights of the calculated filters.

n_filters_: int

Number of filters.

n_iter_: int

The number of iterations run by the spreading function.

n_outputs_: int

Number of outputs.

filterFactory_: class

Class used to create the filters.

Examples

>>> from ranksim import RankSimilarityTransform
>>> from sklearn.datasets import load_digits
>>> X, _ = load_digits(return_X_y=True)
>>> X.shape
(1797, 64)
>>> embedding = RankSimilarityTransform(n_filters=10, n_iter=20,
...                                     random_state=0)
>>> X_transformed = embedding.fit_transform(X)
>>> X_transformed.shape
(1797, 10)
>>> X_transformed[:1, :]
array([[0.22846891, 0.03269542, 0.        , 0.17862841, 0.23724085,
        0.09489637, 0.        , 1.        , 0.47966508, 0.22846891]])

Methods

`fit`(X[, y])	Fit the rank similarity transform from the training dataset.
`fit_transform`(X[, y])	Fit to data, then transform it.
`get_params`([deep])	Get parameters for this estimator.
`set_params`(**params)	Set the parameters of this estimator.
`transform`(X[, n_best])	Transforms X.
