macrosynergy.pnl.naive_pnl#

“Naive” PnLs with limited signal options and disregarding transaction costs.

class NaivePnL(df, ret, sigs, cids=None, bms=None, start=None, end=None, blacklist=None)[source]#

Bases: object

Computes and collects illustrative PnLs with limited signal options and disregarding transaction costs.

Parameters:
  • df (DataFrame) – standardized DataFrame with the following necessary columns: ‘cid’, ‘xcat’, ‘real_date’ and ‘value’.

  • ret (str) – return category.

  • sigs (List[str]) – signal categories. Multiple signals can be passed to the constructor; their respective vintages are held on the instance’s DataFrame. Each signal can subsequently be referenced through self.make_pnl(), which takes a single signal per call.

  • cids (List[str]) – cross sections that are traded. Default is all in the dataframe.

  • bms (Union[str, List[str]]) – list of benchmark tickers for which correlations are displayed against PnL strategies.

  • start (str) – earliest date in ISO format. Default is None and earliest date in df is used.

  • end (str) – latest date in ISO format. Default is None and latest date in df is used.

  • blacklist (dict) – cross-sections with date ranges that should be excluded from the dataframe.
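As an illustration of the inputs described above, the following minimal sketch builds a toy DataFrame with the four required columns and filters it with a blacklist. The category names "FXXR" and "CPI", the helper apply_blacklist, and the blacklist layout (cross-section mapped to a start and end date) are assumptions made for this example, not the library's confirmed behaviour.

```python
import pandas as pd

# Toy long-format panel with the four required columns.
dates = pd.bdate_range("2024-01-01", periods=3)
rows = [
    (cid, xcat, date, value)
    for cid in ("AUD", "CAD")
    for xcat, value in (("FXXR", 0.1), ("CPI", 2.0))  # hypothetical categories
    for date in dates
]
df = pd.DataFrame(rows, columns=["cid", "xcat", "real_date", "value"])

# Assumed blacklist layout: cross-section -> (first excluded date, last excluded date).
blacklist = {"CAD": (pd.Timestamp("2024-01-02"), pd.Timestamp("2024-01-03"))}


def apply_blacklist(df, blacklist):
    """Drop rows of blacklisted cross-sections inside the excluded date range."""
    keep = pd.Series(True, index=df.index)
    for cid, (start, end) in blacklist.items():
        excluded = (df["cid"] == cid) & df["real_date"].between(start, end)
        keep &= ~excluded
    return df[keep]


filtered = apply_blacklist(df, blacklist)
```

Under these assumptions, a DataFrame of this shape would be passed as df, with ret="FXXR" and sigs=["CPI"].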

add_bm(df, bms, tickers)[source]#

Returns a dictionary with benchmark return series.

Parameters:
  • df (DataFrame) – aggregate DataFrame passed into the Class.

  • bms (List[str]) – benchmark return tickers.

  • tickers (List[str]) – the available tickers held in the reduced DataFrame. The reduced DataFrame consists exclusively of the signal & return categories.

classmethod rebalancing(dfw, rebal_freq='daily', rebal_slip=0)[source]#

The signals are calculated daily and for each individual cross-section defined in the panel. However, re-balancing a position can occur more infrequently than daily. Therefore, produce the re-balancing values according to the more infrequent timeline (weekly or monthly).

Parameters:
  • dfw (DataFrame) – DataFrame with each category represented by a column and the daily signal is also included with the column name ‘psig’.

  • rebal_freq (str) – re-balancing frequency for positions according to the signal; must be one of ‘daily’ (default), ‘weekly’ or ‘monthly’.

  • rebal_slip (int) – re-balancing slippage in days. Default is 0.

Return <pd.Series>:

will return a pd.Series containing the associated signals according to the re-balancing frequency.
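The idea of re-balancing less often than daily can be sketched for a single signal series as follows. This is a simplified illustration, not the library's implementation: the function name hold_between_rebalances is hypothetical, and taking the first available date of each week or month as the re-balancing date is an assumption.

```python
import pandas as pd


def hold_between_rebalances(psig, rebal_freq="daily", rebal_slip=0):
    """Hold each period's first signal value until the next re-balancing date."""
    if rebal_freq == "daily":
        held = psig.copy()
    elif rebal_freq in ("weekly", "monthly"):
        code = "W" if rebal_freq == "weekly" else "M"
        periods = psig.index.to_period(code)
        # True on the first available date of each week or month (assumed rule).
        is_rebal_date = ~periods.duplicated()
        held = psig.where(is_rebal_date).ffill()
    else:
        raise ValueError("rebal_freq must be 'daily', 'weekly' or 'monthly'")
    # Slippage delays the new position by the given number of days.
    return held.shift(rebal_slip)


# Six business days spanning a month end: Jan 29, 30, 31, Feb 1, 2 and 5.
dates = pd.bdate_range("2024-01-29", periods=6)
psig = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], index=dates)
monthly = hold_between_rebalances(psig, "monthly")
```

With monthly re-balancing the January value is held until the first February date, at which point the position switches to the new signal value.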

make_pnl(sig, sig_op='zn_score_pan', sig_add=0, sig_neg=False, pnl_name=None, rebal_freq='daily', rebal_slip=0, vol_scale=None, min_obs=261, iis=True, sequential=True, neutral='zero', thresh=None)[source]#

Calculate daily PnL and add to class instance.

Parameters:
  • sig (str) – name of raw signal that is basis for positioning. The signal is assumed to be recorded at the end of the day prior to position taking.

  • sig_op (str) – signal transformation options; must be one of ‘zn_score_pan’, ‘zn_score_cs’, or ‘binary’. The default is ‘zn_score_pan’. ‘zn_score_pan’: transforms raw signals into z-scores around zero value based on the whole panel. The neutral level & standard deviation will use the cross-section of panels. ‘zn_score_cs’: transforms signals to z-scores around zero based on cross-section alone. ‘binary’: transforms signals into uniform long/shorts (1/-1) across all sections. N.B.: zn-score here means standardized score with zero being the natural neutral level and standardization through division by mean absolute value.

  • sig_add (float) – add a constant to the signal after initial transformation. This gives PnLs a long or short bias relative to the signal score. Default is 0.

  • sig_neg (bool) – if True the PnL is based on the negative value of the transformed signal. Default is False.

  • pnl_name (str) – name of the PnL to be generated and stored. Default is None, i.e. a default name is given. The default name will be: ‘PNL_<signal name>[_<NEG>]’, with the last part added if sig_neg has been set to True. Previously calculated PnLs of the same name will be overwritten. This means that if a set of PnLs are to be compared, each PnL requires a distinct name.

  • rebal_freq (str) – re-balancing frequency for positions according to the signal; must be one of ‘daily’ (default), ‘weekly’ or ‘monthly’. The re-balancing is only concerned with the signal value on the re-balancing date, which is determined by the frequency chosen. Additionally, the re-balancing frequency will be applied to make_zn_scores() if used as the method to produce the raw signals.

  • rebal_slip (int) – re-balancing slippage in days. Default is 0. A value of 1 means that it takes one day to re-balance the position and that the new positions produce PnL from the second day after the signal has been recorded.

  • vol_scale (float) – ex-post scaling of PnL to annualized volatility given. This is for comparative visualization and not out-of-sample. Default is None.

  • min_obs (int) – the minimum number of observations required to calculate zn_scores. Default is 261.

  • iis (bool) – if True (default) zn-scores are also calculated for the initial sample period defined by min_obs, on an in-sample basis, to avoid losing history.

  • sequential (bool) – if True (default) score parameters (neutral level and standard deviations) are estimated sequentially with concurrently available information only.

  • neutral (str) – method to determine neutral level. Default is ‘zero’. Alternatives are ‘mean’ and ‘median’.

  • thresh (float) – threshold value beyond which scores are winsorized, i.e. contained at that threshold. Therefore, the threshold is the maximum absolute score value that the function is allowed to produce. The minimum threshold is one standard deviation. Default is no threshold.
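The signal options above can be illustrated with a simplified, single-series sketch in plain Python. It is not the library's implementation: the function transform_signal is hypothetical, the score uses the whole sample at once rather than sequential estimation, and applying sig_neg before sig_add is an assumption. The zn-score follows the note above, dividing by the mean absolute value with zero as the neutral level.

```python
def transform_signal(raw, sig_op="zn_score_pan", sig_add=0.0, sig_neg=False, thresh=None):
    """Turn raw signal values into positions (simplified, in-sample sketch)."""
    if sig_op == "binary":
        # Uniform long/short positions of 1 or -1 (0 stays flat).
        out = [float((x > 0) - (x < 0)) for x in raw]
    elif sig_op in ("zn_score_pan", "zn_score_cs"):
        # Zero is the neutral level; scale by the mean absolute value.
        mean_abs = sum(abs(x) for x in raw) / len(raw)
        out = [x / mean_abs for x in raw]
        if thresh is not None:
            # Winsorize: cap the absolute score at the threshold.
            out = [max(-thresh, min(thresh, x)) for x in out]
    else:
        raise ValueError("sig_op must be 'zn_score_pan', 'zn_score_cs' or 'binary'")
    if sig_neg:
        out = [-x for x in out]
    # Assumed order: the constant bias is added after any negation.
    return [x + sig_add for x in out]


raw = [2.0, -1.0, 3.0, -2.0]
returns = [0.0, 0.4, -0.2, 0.6]

scores = transform_signal(raw)  # [1.0, -0.5, 1.5, -1.0]
# The signal recorded at the end of day t earns the return of day t+1.
daily_pnl = [s * r for s, r in zip(scores[:-1], returns[1:])]
```

In the library the panel and cross-section variants differ in which observations enter the scaling; this sketch treats one series only, so both options give the same result here.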

make_long_pnl(vol_scale=None, label=None)[source]#

Computes long-only returns, which act as a basis for comparison against the signal-adjusted returns. A long-only position is taken in the return category held in ‘self.ret’.

Parameters:
  • vol_scale (Optional[float]) – ex-post scaling of PnL to annualized volatility given. This is for comparative visualization and not out-of-sample, and is applied to the long-only position. Default is None.

  • label (Optional[str]) – associated label that will be mapped to the long-only DataFrame. The label will be used in the plotting graphic for plot_pnls(). If a label is not defined, the default will be the name of the return category.

static long_only_pnl(dfw, vol_scale=None, label=None)[source]#

Method used to compute the PnL accrued from simply taking a long-only position in the category, ‘self.ret’. The returns from the category are not predicated on any exogenous signal.

Parameters:
  • dfw (DataFrame) –

  • vol_scale (float) – ex-post scaling of PnL to annualized volatility given. This is for comparative visualization and not out-of-sample. Default is None.

  • label (str) – associated label that will be mapped to the long-only DataFrame.

Return <pd.DataFrame> panel_pnl:

standardised dataframe containing exclusively the return category, and the long-only panel return.
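Ex-post volatility scaling, as referred to by vol_scale, can be sketched as follows. The helper scale_to_vol is hypothetical, and the annualization factor of 261 trading days per year is an assumption of this example. Because the scaling factor uses the realized volatility of the full sample, the result is suitable for comparison only, not for out-of-sample evaluation.

```python
import math
import statistics


def scale_to_vol(daily_pnl, vol_scale, days_per_year=261):
    """Rescale a daily PnL series so its annualized volatility equals vol_scale."""
    realized = statistics.stdev(daily_pnl) * math.sqrt(days_per_year)
    if realized == 0:
        raise ValueError("cannot scale a PnL series with zero volatility")
    factor = vol_scale / realized
    return [pnl * factor for pnl in daily_pnl]


long_only = [0.5, -0.3, 0.2, -0.1, 0.4]
scaled = scale_to_vol(long_only, vol_scale=10.0)
```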

plot_pnls(pnl_cats=None, pnl_cids=['ALL'], start=None, end=None, facet=False, ncol=3, same_y=True, title='Cumulative Naive PnL', title_fontsize=20, xcat_labels=None, xlab='', ylab='% of risk capital, no compounding', share_axis_labels=True, figsize=(12, 7), aspect=1.7, height=3, label_adj=0.05, title_adj=0.95, y_label_adj=0.95)[source]#

Plot line chart of cumulative PnLs, single PnL, multiple PnL types per cross section, or multiple cross sections per PnL type.

Parameters:
  • pnl_cats (List[str]) – list of PnL categories that should be plotted.

  • pnl_cids (List[str]) – list of cross sections to be plotted; default is ‘ALL’ (global PnL). Note: one can only have multiple PnL categories or multiple cross sections, not both.

  • start (str) – earliest date in ISO format. Default is None and earliest date in df is used.

  • end (str) – latest date in ISO format. Default is None and latest date in df is used.

  • facet (bool) – parameter to control whether each PnL series is plotted on its own respective grid using Seaborn’s FacetGrid. Default is False and all series will be plotted in the same graph.

  • ncol (int) – number of columns in facet grid. Default is 3. If the total number of PnLs is less than ncol, the number of columns will be adjusted at runtime.

  • same_y (bool) – if True (default) all plots in facet grid share same y axis.

  • title (str) – allows entering text for a custom chart header.

  • title_fontsize (int) – font size for the title. Default is 20.

  • xcat_labels (Union[List[str], dict]) – custom labels to be used for the PnLs.

  • xlab (str) – label for x-axis of the plot (or subplots if faceted). Default is an empty string.

  • ylab (str) – label for y-axis of the plot (or subplots if faceted), default is ‘% of risk capital, no compounding’.

  • share_axis_labels (bool) – if True (default) the axis labels are shared by all subplots in the facet grid.

  • figsize (Tuple) – tuple of plot width and height. Default is (12 , 7).

  • aspect (float) – width-height ratio for plots in facet. Default is 1.7.

  • height (float) – height of plots in facet. Default is 3.

  • label_adj (float) – parameter that sets bottom of figure to fit the label. Default is 0.05.

  • title_adj (float) – parameter that sets top of figure to accommodate title. Default is 0.95.

  • y_label_adj (float) – parameter that sets left of figure to fit the y-label. Default is 0.95.

Return type:

None
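The default y-axis label, ‘% of risk capital, no compounding’, suggests that the plotted line is a running sum of daily PnLs rather than a compounded return. A minimal sketch of that reading, using made-up values:

```python
from itertools import accumulate

daily_pnl = [1.0, -0.5, 0.25]  # hypothetical daily PnL in % of risk capital

# Without compounding, the cumulative PnL is the running sum of daily values.
cumulative = list(accumulate(daily_pnl))  # [1.0, 0.5, 0.75]
```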

signal_heatmap(pnl_name, pnl_cids=None, start=None, end=None, freq='m', title='Average applied signal values', x_label='', y_label='', figsize=None)[source]#

Display heatmap of signals across times and cross-sections.

Parameters:
  • pnl_name (str) – name of naive PnL whose signals are displayed. N.B.: Signal here is the value that actually determines the concurrent PnL.

  • pnl_cids (List[str]) – cross-sections. Default is all available.

  • start (str) – earliest date in ISO format. Default is None and earliest date in df is used.

  • end (str) – latest date in ISO format. Default is None and latest date in df is used.

  • freq (str) – frequency for which signal average is displayed. Default is monthly (‘m’). The only alternative is quarterly (‘q’).

  • title (str) – allows entering text for a custom chart header.

  • x_label (str) – label for the x-axis. Default is an empty string.

  • y_label (str) – label for the y-axis. Default is an empty string.

  • figsize (Optional[Tuple[float, float]]) – width and height in inches. Default is (14, number of cross sections).
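The aggregation behind such a heatmap can be sketched with pandas: average the daily signals per month, then arrange cross-sections as rows and periods as columns. The data below are invented, and the layout is an assumption about what the chart displays rather than the library's code.

```python
import pandas as pd

# Four business days spanning a month end: Jan 30, 31, Feb 1 and 2.
dates = pd.bdate_range("2024-01-30", periods=4)
signals = pd.DataFrame(
    {"AUD": [1.0, 3.0, -1.0, -3.0], "CAD": [0.5, 0.5, 2.0, 4.0]},
    index=dates,
)

# Average applied signal per calendar month (the freq='m' case).
monthly = signals.groupby(signals.index.to_period("M")).mean()

# Cross-sections as rows and periods as columns, ready for a heatmap.
heat = monthly.T
```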

agg_signal_bars(pnl_name, freq='m', metric='direction', title=None, y_label='Sum of Std. across the Panel')[source]#

Display aggregate signal strength and - potentially - direction.

Parameters:
  • pnl_name (str) – name of the PnL whose signal is to be visualized. N.B.: The referenced signal corresponds to the series that determines the concurrent PnL.

  • freq (str) – frequency at which the signal is visualized. Default is monthly (‘m’). The alternative is quarterly (‘q’).

  • metric (str) – the type of signal value. Default is “direction”. Alternative is “strength”.

  • title (str) – allows entering text for a custom chart header. Default will be “Directional Bar Chart of <pnl_name>.”.

  • y_label (str) – label for the y-axis. Default is the sum of standard deviations across the panel corresponding to the default signal transformation: ‘zn_score_pan’.

evaluate_pnls(pnl_cats, pnl_cids=['ALL'], start=None, end=None, label_dict=None)[source]#

Table of key PnL statistics.

Parameters:
  • pnl_cats (List[str]) – list of PnL categories that should be evaluated.

  • pnl_cids (List[str]) – list of cross-sections to be evaluated; default is ‘ALL’ (global PnL). Note: one can only have multiple PnL categories or multiple cross-sections, not both.

  • start (str) – earliest date in ISO format. Default is None and earliest date in df is used.

  • end (str) – latest date in ISO format. Default is None and latest date in df is used.

  • label_dict (Dict[str, str]) – dictionary with keys as pnl_cats and values as new labels for the PnLs.

Return <pd.DataFrame>:

standardized DataFrame with key PnL performance statistics.
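The kind of statistics such a table might contain can be sketched in plain Python. The selection below (annualized return, annualized volatility, Sharpe ratio and peak-to-trough drawdown), the helper pnl_stats, and the 261-day annualization factor are assumptions for illustration; the actual table may report different or additional measures.

```python
import math
import statistics


def pnl_stats(daily_pnl, days_per_year=261):
    """Compute a few common performance statistics for a daily PnL series."""
    ann_return = statistics.fmean(daily_pnl) * days_per_year
    ann_std = statistics.stdev(daily_pnl) * math.sqrt(days_per_year)

    # Peak-to-trough drawdown of the non-compounded cumulative PnL.
    cumulative, peak, max_drawdown = 0.0, 0.0, 0.0
    for pnl in daily_pnl:
        cumulative += pnl
        peak = max(peak, cumulative)
        max_drawdown = min(max_drawdown, cumulative - peak)

    return {
        "ann_return": ann_return,
        "ann_std": ann_std,
        "sharpe": ann_return / ann_std,
        "max_drawdown": max_drawdown,
    }


stats = pnl_stats([1.0, -2.0, 3.0, -1.0, 2.0])
```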

print_pnl_names()[source]#

Print list of names of available PnLs in the class instance.

pnl_df(pnl_names=None, cs=False)[source]#

Return dataframe with PnLs.

Parameters:
  • pnl_names (List[str]) – list of names of PnLs to be returned. Default is None, in which case all available PnLs are returned.

  • cs (bool) – inclusion of cross section PnLs. Default is False.

Return <pd.DataFrame>:

custom DataFrame with PnLs

create_results_dataframe(title, df, ret, sigs, cids, sig_ops, sig_adds, neutrals, threshs, bm=None, sig_negs=None, cosp=False, start=None, end=None, blacklist=None, freqs='M', agg_sigs='last', sigs_renamed=None, fwin=1, slip=0)[source]#

Create a DataFrame with key performance metrics for the signals and PnLs.

Parameters:
  • title (str) – title of the DataFrame.

  • df (DataFrame) – DataFrame with the data.

  • ret (str) – name of the return signal.

  • sigs (Union[str, List[str]]) – name of the comparative signal(s).

  • cids (Union[str, List[str]]) – name of the cross-section(s).

  • sig_ops (Union[str, List[str]]) – operation(s) to be applied to the signal(s).

  • sig_adds (Union[float, List[float]]) – value(s) to be added to the signal(s).

  • neutrals (Union[str, List[str]]) – neutralization method(s) to be applied.

  • threshs (Union[float, List[float]]) – threshold(s) to be applied to the signal(s).

  • bm (str) – name of the benchmark signal.

  • sig_negs (Union[bool, List[bool]]) – whether the signal(s) should be negated.

  • cosp (bool) – whether the signals should be cross-sectionally standardized.

  • start (str) – start date of the analysis.

  • end (str) – end date of the analysis.

  • blacklist (dict) – dictionary with the blacklisted dates.

  • freqs (Union[str, List[str]]) – frequency of the rebalancing.

  • agg_sigs (Union[str, List[str]]) – aggregation method(s) for the signal(s).

  • sigs_renamed (dict) – dictionary with the renamed signals.

  • fwin (int) – frequency of the rolling window.

  • slip (int) – slippage to be applied to the PnLs.

Return <pd.DataFrame>:

DataFrame with the performance metrics.