macrosynergy.signal.signal_return_relations#

Module for analysing and visualizing the relations between a signal and a return series.

class SignalReturnRelations(df, rets=None, sigs=None, cids=None, sig_neg=None, cosp=False, start=None, end=None, blacklist=None, freqs='M', agg_sigs='last', fwin=1, slip=0, ms_panel_test=False, additional_metrics=None)[source]#

Bases: object

Class for analysing and visualizing the relations between a signal and a return series.

Parameters:
  • df (DataFrame) – standardized DataFrame with the following necessary columns: ‘cid’, ‘xcat’, ‘real_date’ and ‘value’.

  • rets (Union[str, List[str]]) – one or several target return categories.

  • sigs (Union[str, List[str]]) – one or several signal categories for which detailed relational statistics can be calculated.

  • sig_neg (Union[bool, List[bool]]) – if set to True puts the signal in negative terms for all analysis. Default is False.

  • cosp (bool) – If True the comparative statistics are calculated only for the “communal sample periods”, i.e. periods and cross-sections that have values for all compared signals. Default is False.

  • start (str) – earliest date in ISO format. Default is None in which case the earliest date available will be used.

  • end (str) – latest date in ISO format. Default is None in which case the latest date in the df will be used.

  • blacklist (dict) – cross-sections with date ranges that should be excluded from the data frame. If one cross-section has several blacklist periods append numbers to the cross-section code.

  • freqs (Union[str, List[str]]) – letters denoting all frequencies at which the series may be sampled. This must be a selection of ‘D’, ‘W’, ‘M’, ‘Q’, ‘A’. Default is only ‘M’. The return series will always be summed over the sample period. The signal series will be aggregated according to the values of agg_sigs.

  • agg_sigs (Union[str, List[str]]) – aggregation method applied to the signal values in down-sampling. The default is “last”. Alternatives are “mean”, “median” and “sum”. If a single aggregation type is chosen for multiple signal categories it is applied to all of them.

  • fwin (int) – forward window of return category in base periods. Default is 1. This conceptually corresponds to the holding period of a position in accordance with the signal.

  • slip (int) – implied slippage of feature availability for relationship with the target category. This mimics the relationship between trading signals and returns, which is often characterized by a delay due to the setup of positions. Technically, this is a negative lag (early arrival) of the target category in working days prior to any frequency conversion. Default is 0.

  • ms_panel_test (bool) – if True the Macrosynergy Panel test is calculated. Please note that this is a very time-consuming operation and should be used only if you require the result.

  • additional_metrics (List[Callable]) – list of additional metrics to be calculated and added to the output table.
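The input layout described for df and blacklist can be sketched as follows. This is a minimal illustration using only the standard library. The cross-section codes, the category names, the numbered key format (`CAD_1`, `CAD_2`) and the (start, end) tuples are assumptions chosen for the example, not values prescribed by the package.

```python
from datetime import date, timedelta

# Hypothetical cross-sections and categories, for illustration only.
cids = ["AUD", "CAD"]
xcats = ["XR", "CRY"]  # one return category and one signal category


def business_days(start, n):
    """Return the first n weekdays on or after start."""
    days, d = [], start
    while len(days) < n:
        if d.weekday() < 5:
            days.append(d)
        d += timedelta(days=1)
    return days


# Long format: one row per (cid, xcat, real_date) with a single 'value'.
rows = [
    {"cid": cid, "xcat": xcat, "real_date": d, "value": 0.1 * i}
    for cid in cids
    for xcat in xcats
    for i, d in enumerate(business_days(date(2020, 1, 1), 5))
]

# Assumed blacklist layout: a number is appended to the cross-section code
# when one cross-section has several excluded periods.
blacklist = {
    "CAD_1": (date(2020, 1, 2), date(2020, 1, 3)),
    "CAD_2": (date(2020, 1, 7), date(2020, 1, 7)),
}


def is_blacklisted(row):
    """True if the row's date falls in any blacklist period of its cid."""
    for key, (start, end) in blacklist.items():
        if key.split("_")[0] == row["cid"] and start <= row["real_date"] <= end:
            return True
    return False


kept = [r for r in rows if not is_blacklisted(r)]
```

Passing `rows` to `pandas.DataFrame` would give a frame with the four required columns.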

accuracy_bars(ret=None, sigs=None, freq=None, agg_sig=None, type='cross_section', title=None, title_fontsize=16, size=None, legend_pos='best')[source]#

Plot a bar chart of the overall and balanced accuracy metrics. For the types “cross_section” and “years”, the first signal in the list of signals is used if sigs is not specified.

Parameters:
  • type (str) – type of segment over which bars are drawn. Either “cross_section” (default), “years” or “signals”.

  • title (str) – chart header - default will be applied if none is chosen.

  • title_fontsize (int) – font size of chart header. Default is 16.

  • size (Tuple[float]) – 2-tuple of width and height of plot - default will be applied if none is chosen.

  • legend_pos (str) – position of legend box. Default is ‘best’. See the documentation of matplotlib.pyplot.legend.
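As a rough illustration of the two metrics plotted here, the sketch below computes accuracy and balanced accuracy of signal signs against return signs using their standard definitions. The function name and the sample numbers are hypothetical, and treating zero values as negative is an assumption of this sketch, not a statement about the package.

```python
def accuracy_metrics(signals, returns):
    """Return (accuracy, balanced accuracy) of signal signs vs. return signs.

    Assumes both positive and negative returns are present in the sample.
    """
    pairs = [(s > 0, r > 0) for s, r in zip(signals, returns)]
    true_pos = sum(1 for s, r in pairs if s and r)
    true_neg = sum(1 for s, r in pairs if not s and not r)
    n_pos = sum(1 for _, r in pairs if r)
    n_neg = len(pairs) - n_pos
    # Accuracy: share of correctly predicted signs.
    accuracy = (true_pos + true_neg) / len(pairs)
    # Balanced accuracy: average of the hit rates for positive and
    # negative returns, which corrects for an unbalanced sample.
    balanced = 0.5 * (true_pos / n_pos + true_neg / n_neg)
    return accuracy, balanced


acc, bal = accuracy_metrics([1, 1, 1, -1], [0.5, 0.2, -0.1, 0.3])
```

In this sample half of the signs are predicted correctly, but the only negative return is missed, so balanced accuracy falls below plain accuracy.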

correlation_bars(ret=None, sigs=None, freq=None, type='cross_section', title=None, title_fontsize=16, size=None, legend_pos='best')[source]#

Plot correlation coefficients and significance.

Parameters:
  • ret (str) – return category. Default is the first return category.

  • sigs – signal category. Default is the first signal category.

  • type (str) – type of segment over which bars are drawn. Either “cross_section” (default), “years” or “signals”.

  • title (str) – chart header. Default will be applied if none is chosen.

  • title_fontsize (int) – font size of chart header. Default is 16.

  • size (Tuple[float]) – 2-tuple of width and height of plot. Default will be applied if none is chosen.

  • legend_pos (str) – position of legend box. Default is ‘best’. See matplotlib.pyplot.legend.
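To make the “cross_section” segmentation concrete, the sketch below computes one Pearson correlation coefficient per cross-section, which is the kind of value a bar would represent. The helper name and the sample data are hypothetical, and the significance test is left out; which correlation measures the package reports is not specified on this page.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical (signal, return) samples per cross-section.
samples = {
    "AUD": ([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]),
    "CAD": ([1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0]),
}

# One coefficient per segment.
bars = {cid: pearson(sig, ret) for cid, (sig, ret) in samples.items()}
```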

static apply_slip(df, slip, cids, xcats, metrics)[source]#

Calls the apply slip method that is defined in management/utils.py.

Parameters:
  • df (DataFrame) – standardised DataFrame.

  • slip (int) – slip value to apply to df.

  • cids (List[str]) – list of cids in df to apply slip.

  • xcats (List[str]) – list of xcats in df to apply slip.

  • metrics (List[str]) – list of metrics in df to apply slip.

Return type:

DataFrame
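The effect of a slip on a single daily series can be sketched as a forward shift by `slip` observations. This is an illustrative reading of the slip parameter under the assumption that slippage delays the availability of a value; the helper below is hypothetical and operates on a plain list rather than on the standardised DataFrame.

```python
def apply_slip_sketch(values, slip):
    """Delay a daily series by `slip` observations, padding with None."""
    if slip < 0:
        raise ValueError("slip must be a non-negative integer")
    n = len(values)
    k = min(slip, n)  # never pad beyond the length of the series
    return [None] * k + list(values[: n - k])


daily_signal = [1, 2, 3, 4, 5]
delayed = apply_slip_sketch(daily_signal, 2)
```

With a slip of 2, the value observed on the first day only becomes usable on the third day, and the last two observations drop out of the sample.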

static is_list_of_strings(variable)[source]#

Function used to test whether a variable is a list of strings, to avoid a single string being treated as a list of characters.

Parameters:
  • variable (Any) – variable to be tested.

Returns:

True if variable is a list of strings, False otherwise.

Return type:

bool
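A check of this kind can be sketched in one line. The function below is a hypothetical stand-in, not the package's implementation; note that in this sketch an empty list counts as a list of strings.

```python
def is_list_of_strings_sketch(variable):
    """True only for a list whose elements are all strings."""
    # A bare string is iterable, so the isinstance(list) test comes first.
    return isinstance(variable, list) and all(
        isinstance(item, str) for item in variable
    )


single = is_list_of_strings_sketch("XR")
several = is_list_of_strings_sketch(["XR", "CRY"])
mixed = is_list_of_strings_sketch(["XR", 1])
```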

manipulate_df(xcat, freq, agg_sig)[source]#

Used to manipulate the DataFrame to the desired format for the analysis. Firstly reduces the dataframe to only include data outside of the blacklist and data that is relevant to xcat and sig. Then applies the slip to the dataframe. It then converts the dataframe to the desired format for the analysis and checks whether any negative signs should be introduced.

Parameters:
  • xcat (str) – xcat to be analysed.

  • freq (str) – frequency to be used in analysis.

  • agg_sig (str) – aggregation method to be used in analysis.

  • sig – signal to be analysed.

  • sst – Boolean that specifies whether this function is to be used for a single statistic table.

  • df_result – DataFrame to be used for single statistic table. None by default, and when using with sst set to False.
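The frequency conversion described for freqs and agg_sigs (returns summed over the sample period, signals aggregated, here with “last”) can be sketched for the monthly case as follows. The helper and the sample data are hypothetical and use plain dictionaries in place of the DataFrame.

```python
from collections import defaultdict
from datetime import date


def downsample_monthly(series, how):
    """Aggregate {date: value} to {(year, month): value} by 'sum' or 'last'."""
    buckets = defaultdict(list)
    for day in sorted(series):  # chronological order matters for 'last'
        buckets[(day.year, day.month)].append(series[day])
    if how == "sum":
        return {period: sum(vals) for period, vals in buckets.items()}
    if how == "last":
        return {period: vals[-1] for period, vals in buckets.items()}
    raise ValueError("how must be 'sum' or 'last'")


daily_returns = {
    date(2020, 1, 30): 1.0,
    date(2020, 1, 31): 2.0,
    date(2020, 2, 3): -1.0,
}
daily_signals = {
    date(2020, 1, 30): 0.2,
    date(2020, 1, 31): 0.4,
    date(2020, 2, 3): 0.1,
}

monthly_returns = downsample_monthly(daily_returns, "sum")
monthly_signals = downsample_monthly(daily_signals, "last")
```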

map_pval(ret_vals, sig_vals)[source]#

calculate_single_stat(stat, ret=None, sig=None, type=None)[source]#

Calculates a single statistic for a given signal-return relation.

Parameters:
  • stat (str) – statistic to be calculated.

  • ret (str) – return category. Default is the first return category.

  • sig (str) – signal category. Default is the first signal category.

  • type – type of segment over which the statistic is calculated. Either “panel” (default), “years” or “signals”.

summary_table(cross_section=False, years=False)[source]#

signals_table(sigs=None)[source]#

cross_section_table()[source]#

yearly_table()[source]#

single_relation_table(ret=None, xcat=None, freq=None, agg_sigs=None, table_type=None)[source]#

Computes all the statistics for one specific signal-return relation.

Parameters:
  • ret (str) – single target return category. Default is first in target return list of the class.

  • xcat (str) – single signal category to be considered. Default is first in feature category list of the class.

  • freq (str) – letter denoting single frequency at which the series will be sampled. This must be one of the frequencies selected for the class. If not specified uses the freq stored in the class.

  • agg_sigs (str) – aggregation method applied to the signal values in down-sampling.

  • table_type (str) – type of table to be returned. Either “summary”, “years”, “cross_section”.

reindex_multindex_df(df, desired_order, var_type)[source]#

multiple_relations_table(rets=None, xcats=None, freqs=None, agg_sigs=None, signal_name_dict=None, return_name_dict=None)[source]#

Calculates all the statistics for each specified return and signal category, at each specified frequency and with each specified aggregation method. If none are specified, the statistics are calculated for all categories, frequencies and aggregation methods stored in the class.

Parameters:
  • rets (Union[str, List[str]]) – target return categories.

  • xcats (Union[str, List[str]]) – signal categories to be considered.

  • freqs (Union[str, List[str]]) – letters denoting the frequencies at which the series are to be sampled. This must be a selection of ‘D’, ‘W’, ‘M’, ‘Q’, ‘A’. If not specified, the freqs stored in the class are used.

  • agg_sigs (Union[str, List[str]]) – aggregation methods applied to the signal values in down-sampling.

single_statistic_table(stat, type='panel', rows=['xcat', 'agg_sigs'], columns=['ret', 'freq'], show_heatmap=False, title=None, title_fontsize=16, row_names=None, column_names=None, signal_name_dict=None, return_name_dict=None, min_color=None, max_color=None, figsize=(14, 8), annotate=True, round=5)[source]#

Creates a table that shows the specified statistic for each of the rows and columns specified as arguments.

Parameters:
  • stat (str) – type of statistic to be displayed (this can be any of the column names of summary_table).

  • type (str) – type of the statistic displayed. This can be based on the overall panel (“panel”, default), an average of annual panels (“mean_years”), an average of cross-sectional relations (“mean_cids”), the positive ratio across years (“pr_years”), or the positive ratio across sections (“pr_cids”).

  • rows (List[str]) – row indices, which can be return categories, feature categories, frequencies and/or aggregations. The choice is made through a list of one or more of “xcat”, “ret”, “freq” and “agg_sigs”. The default is [“xcat”, “agg_sigs”], resulting in index strings <xcat> (<agg_sigs>), or <xcat> if only one aggregation is available.

  • columns (List[str]) – column indices, which can be return categories, feature categories, frequencies and/or aggregations. The choice is made through a list of one or more of “xcat”, “ret”, “freq” and “agg_sigs”. The default is [“ret”, “freq”], resulting in index strings <ret> (<freq>), or <ret> if only one frequency is available.

  • show_heatmap (bool) – if True, the table is visualized as a heatmap. Default is False.

  • title (Optional[str]) – plot title; if none given default title is shown.

  • title_fontsize (int) – font size of title. Default is 16.

  • row_names (Optional[List[str]]) – specifies the labels of rows in the heatmap. If None, the indices of the generated DataFrame are used.

  • column_names (Optional[List[str]]) – specifies the labels of columns in the heatmap. If None, the columns of the generated DataFrame are used.

  • min_color (Optional[float]) – minimum value of the color scale. Default is None, in which case the minimum value of the table is used.

  • max_color (Optional[float]) – maximum value of the color scale. Default is None, in which case the maximum value of the table is used.

  • figsize (Tuple[float]) – Tuple (w, h) of width and height of graph.

  • annotate (bool) – if True, the values are annotated in the heatmap.

  • round (int) – number of decimals to round the values to on the heatmap’s annotations.

Returns:

DataFrame with the specified statistic for each row and column
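The difference between the averaged and the positive-ratio variants of type can be illustrated with a few yearly values of one statistic. This is an assumed reading of “mean_years” and “pr_years” (mean of the yearly values and share of years with a positive value); the numbers are made up for the example.

```python
# Hypothetical values of one statistic, calculated separately for each year.
yearly_stat = {2019: 0.12, 2020: -0.05, 2021: 0.08, 2022: 0.03}

values = list(yearly_stat.values())

# Assumed meaning of "mean_years": average of the yearly statistics.
mean_years = sum(values) / len(values)

# Assumed meaning of "pr_years": share of years with a positive statistic.
pr_years = sum(1 for v in values if v > 0) / len(values)
```

The same two aggregations applied to per-cross-section values would correspond to “mean_cids” and “pr_cids” under this reading.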

set_df_labels(rows_dict, rows, columns)[source]#

Creates two lists of strings that will be used as the row and column labels for the resulting dataframe.

Parameters:
  • rows_dict (Dict) – dictionary containing the values for each of the xcat, ret, freq and agg_sigs categories.

  • rows (List[str]) – list of strings specifying which of the categories are included in the rows of the dataframe.

  • columns (List[str]) – list of strings specifying which of the categories are included in the columns of the dataframe.

get_rowcol(hash, rowcols)[source]#

Calculates which row/column the hash belongs to.

Parameters:
  • hash (str) – hash of the statistic.

  • rowcols (List[str]) – list of strings specifying which of the categories are in the rows/columns of the dataframe.