macrosynergy.learning.transformers
Collection of custom scikit-learn transformer classes.
- class ENetSelector(alpha=1.0, l1_ratio=0.5, positive=True)
Bases: BaseEstimator, SelectorMixin
- get_support(indices=False)
Return a boolean mask, or integer indices, of the features selected from the input Pandas dataframe.
- Parameters:
indices – If True, return the integer column indices of the selected features instead of a boolean mask.
- Return <np.ndarray>:
Boolean mask or integer index of the selected features
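The two return forms can be sketched with plain scikit-learn and numpy. The snippet below assumes, for illustration only, that a feature counts as selected when its fitted elastic-net coefficient is non-zero. The synthetic data and variable names are hypothetical, and this is not the library's implementation.

```python
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
# Only features 0 and 2 drive the target; features 1 and 3 are pure noise.
y = 2.0 * X[:, 0] + 1.5 * X[:, 2] + 0.1 * rng.normal(size=200)

model = ElasticNet(alpha=0.5, l1_ratio=0.5, positive=True).fit(X, y)

# Assumption: "selected" means a non-zero fitted coefficient.
mask = model.coef_ != 0          # analogue of get_support(indices=False)
indices = np.flatnonzero(mask)   # analogue of get_support(indices=True)
```

The mask has one entry per input column, whereas the index array only lists the positions of the retained columns, so `X[:, mask]` and `X[:, indices]` select the same features.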
- class LassoSelector(alpha, positive=True)
Bases: BaseEstimator, SelectorMixin
- get_support(indices=False)
Return a boolean mask, or integer indices, of the features selected from the input Pandas dataframe.
- Parameters:
indices – If True, return the integer column indices of the selected features instead of a boolean mask.
- Return <np.ndarray>:
Boolean mask or integer index of the selected features
- class MapSelector(threshold=0.05, positive=False)
Bases: BaseEstimator, SelectorMixin
- fit(X, y)
Fit method that assesses the significance of each feature using the Macrosynergy panel test.
- get_support(indices=False)
Return a boolean mask, or integer indices, of the features selected from the input Pandas dataframe.
- Parameters:
indices – If True, return the integer column indices of the selected features instead of a boolean mask.
- Return <np.ndarray>:
Boolean mask or integer index of the selected features
- class FeatureAverager(use_signs=False)
Bases: BaseEstimator, TransformerMixin
- fit(X, y=None)
Fit method. This transformer simply averages the features, so no fitting is required.
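The averaging itself can be sketched in a few lines of pandas. The helper name `average_features`, the output column name `signal`, the index level names, and the reading of `use_signs` as "return only the sign of the average" are all assumptions made for illustration, not confirmed details of the class.

```python
import numpy as np
import pandas as pd


def average_features(X: pd.DataFrame, use_signs: bool = False) -> pd.DataFrame:
    """Hypothetical helper: row-wise mean of all feature columns."""
    signal = X.mean(axis=1).to_frame(name="signal")
    # Assumption: use_signs=True keeps only the direction of the average.
    return np.sign(signal) if use_signs else signal


# Small panel indexed by (cross-section, date); the level names are illustrative.
index = pd.MultiIndex.from_product(
    [["AUD", "CAD"], pd.to_datetime(["2024-01-31", "2024-02-29"])],
    names=["cid", "real_date"],
)
X = pd.DataFrame(
    {"f1": [1.0, -2.0, 0.5, 3.0], "f2": [3.0, -1.0, -1.5, 1.0]},
    index=index,
)

avg = average_features(X)
signs = average_features(X, use_signs=True)
```

Because the mean is taken row by row, nothing has to be learned from a training set, which is consistent with a fit method that does no work.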
- class ZnScoreAverager(neutral='zero', use_signs=False)
Bases: BaseEstimator, TransformerMixin
- fit(X, y=None)
Fit method to extract the relevant standardisation/normalisation statistics from a training set, so that point-in-time (PiT) statistics can be computed in the transform method for a hold-out set.
- transform(X)
Transform method to compute an out-of-sample benchmark signal for each unique date in the input test dataframe. At a given test time, the relevant statistics (implied by the choice of neutral value) are calculated using all training information plus all test information up to and including that test time. The test time is included because it denotes the time at which the return became available, and the features lag behind the returns.
- Parameters:
X (DataFrame) – Pandas dataframe of input features.
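The point-in-time logic described for transform can be sketched as follows. This is a simplified illustration and not the library's code: it handles one feature for one cross-section with unique, sorted dates, the helper name `pit_zn_scores` is hypothetical, and using the root mean square around zero as the scale for a zero neutral level is an assumption.

```python
import numpy as np
import pandas as pd


def pit_zn_scores(train: pd.Series, test: pd.Series) -> pd.Series:
    """Hypothetical helper: zero-neutral scores for `test`, one date at a time."""
    scores = []
    for date in test.index:
        # Pool all training data with test data up to and including `date`.
        history = pd.concat([train, test.loc[:date]])
        # Assumption: with a zero neutral level, the scale is the
        # root mean square of the pooled history around zero.
        scale = np.sqrt((history ** 2).mean())
        scores.append(test.loc[date] / scale)
    return pd.Series(scores, index=test.index, name="zn_score")


train = pd.Series(
    [1.0, -1.0, 1.0, -1.0],
    index=pd.date_range("2024-01-01", periods=4, freq="D"),
)
test = pd.Series(
    [2.0, -2.0],
    index=pd.date_range("2024-01-05", periods=2, freq="D"),
)

scores = pit_zn_scores(train, test)
```

The scale used for the second test date already includes the first test observation, so each score relies only on information available at its own date.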
- class PanelMinMaxScaler
Bases: BaseEstimator, TransformerMixin, OneToOneFeatureMixin
Transformer class that extends scikit-learn’s MinMaxScaler() to panel datasets. It replicates that class but, critically, returns a Pandas dataframe or series instead of a numpy array. This preserves the multi-indexing of the inputs after transformation, so that the scaled features can be passed into transformers that require cross-sectional and temporal knowledge.
- NOTE: This class should primarily be used to satisfy the assumptions of various models.
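The motivation for returning a dataframe can be sketched with scikit-learn's own MinMaxScaler. The example below is only an illustration of the idea, not the class's implementation, and the index level names and data are hypothetical.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Small panel indexed by (cross-section, date); the level names are illustrative.
index = pd.MultiIndex.from_product(
    [["AUD", "CAD"], pd.to_datetime(["2024-01-31", "2024-02-29"])],
    names=["cid", "real_date"],
)
X = pd.DataFrame(
    {"f1": [0.0, 5.0, 10.0, 2.5], "f2": [-1.0, 1.0, 0.0, 3.0]},
    index=index,
)

scaler = MinMaxScaler().fit(X)
# By default scikit-learn returns a bare numpy array here, so the panel
# index and column names have to be re-attached by hand.
X_scaled = pd.DataFrame(scaler.transform(X), index=X.index, columns=X.columns)
```

With the multi-index restored, a downstream step can still group by cross-section or by date, which a plain array would not allow.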
- class PanelStandardScaler(with_mean=True, with_std=True)
Bases: BaseEstimator, TransformerMixin, OneToOneFeatureMixin