macrosynergy.management.simulate.simulate_quantamental_data

Module with functionality for generating mock quantamental data for testing purposes.

simulate_ar(nobs, mean=0, sd_mult=1, ar_coef=0.75)

Create an autocorrelated data series as a numpy array.

Parameters:
  • nobs (int) – number of observations.

  • mean (float) – mean of values, default is zero.

  • sd_mult (float) – standard deviation multiplier of values, default is 1. Note that this also affects non-zero means.

  • ar_coef (float) – autoregression coefficient (between 0 and 1), default is 0.75.

Return <np.ndarray>:

autocorrelated data series.
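
To make the roles of the parameters concrete, here is a minimal sketch of a first-order autoregressive generator. The function name `ar1_sketch` and its `seed` argument are hypothetical and not part of the library, and the way `mean` and `sd_mult` are combined is an assumption based on the note that `sd_mult` also affects non-zero means; the library's actual implementation may differ.

```python
import numpy as np


def ar1_sketch(nobs, mean=0.0, sd_mult=1.0, ar_coef=0.75, seed=None):
    """Hypothetical AR(1) generator for illustration (requires nobs >= 1)."""
    rng = np.random.default_rng(seed)
    shocks = rng.standard_normal(nobs)
    series = np.empty(nobs)
    # Start from a unit-variance draw and scale later innovations so that
    # the process keeps a stationary variance of 1 for |ar_coef| < 1.
    series[0] = shocks[0]
    innov_sd = np.sqrt(1.0 - ar_coef**2)
    for t in range(1, nobs):
        series[t] = ar_coef * series[t - 1] + innov_sd * shocks[t]
    # Assumption: the multiplier scales the mean-shifted series, so a
    # non-zero mean is scaled as well.
    return sd_mult * (mean + series)


x = ar1_sketch(nobs=500, mean=1.0, sd_mult=2.0, ar_coef=0.75, seed=0)
print(x.shape)
```

With `ar_coef=0` the loop reduces to independent standard normal draws, which is a quick way to check the sketch.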

dataframe_generator(df_cids, df_xcats, cid, xcat)

Helper function used to construct the quantamental DataFrame.

Parameters:
  • df_cids (DataFrame) – DataFrame with parameters by cid (see make_qdf).

  • df_xcats (DataFrame) – DataFrame with parameters by xcat (see make_qdf).

  • cid (str) – individual cross-section.

  • xcat (str) – individual category.

Return <Tuple[pd.DataFrame, pd.DatetimeIndex]>:

Tuple containing the quantamental DataFrame and a DatetimeIndex of the business days.

make_qdf(df_cids, df_xcats, back_ar=0)

Make quantamental DataFrame with basic columns: ‘cid’, ‘xcat’, ‘real_date’, ‘value’.

Parameters:
  • df_cids (DataFrame) – DataFrame with parameters by cid. Row indices are cross-sections. Columns are: ‘earliest’: string of earliest date (ISO) for which country values are available; ‘latest’: string of latest date (ISO) for which country values are available; ‘mean_add’: float of country-specific addition to any category’s mean; ‘sd_mult’: float of country-specific multiplier of a category’s standard deviation.

  • df_xcats (DataFrame) – DataFrame with parameters by xcat. Row indices are categories. Columns are: ‘earliest’: string of earliest date (ISO) for which category values are available; ‘latest’: string of latest date (ISO) for which category values are available; ‘mean_add’: float of category-specific addition; ‘sd_mult’: float of category-specific multiplier of the category’s standard deviation; ‘ar_coef’: float between 0 and 1 setting the autocorrelation of the category; ‘back_coef’: float, coefficient with which the communal (mean 0, SD 1) background factor is added to category values.

  • back_ar (float) – float between 0 and 1 setting the autocorrelation of the background factor. Default is zero.

Return <pd.DataFrame>:

basic quantamental DataFrame according to specifications.
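
The two parameter frames can be assembled with plain pandas, as in the sketch below. The cross-sections, categories, dates and numbers are illustrative values chosen for this example, not defaults of the library. The import is guarded so that the frames can be inspected even where macrosynergy is not installed.

```python
import pandas as pd

# Parameters by cross-section: one row per cid.
df_cids = pd.DataFrame.from_dict(
    {
        "AUD": ["2010-01-01", "2020-12-31", 0.5, 2.0],
        "GBP": ["2011-01-01", "2020-11-30", 0.0, 1.0],
        "USD": ["2012-01-01", "2020-11-30", -0.2, 0.5],
    },
    orient="index",
    columns=["earliest", "latest", "mean_add", "sd_mult"],
)

# Parameters by category: one row per xcat.
df_xcats = pd.DataFrame.from_dict(
    {
        "XR": ["2010-01-01", "2020-12-31", 0.0, 1.0, 0.0, 0.3],
        "CRY": ["2011-01-01", "2020-10-30", 1.0, 2.0, 0.9, 0.5],
    },
    orient="index",
    columns=["earliest", "latest", "mean_add", "sd_mult", "ar_coef", "back_coef"],
)

try:
    from macrosynergy.management.simulate.simulate_quantamental_data import make_qdf
except ImportError:
    make_qdf = None  # library not installed; the frames above are still valid

if make_qdf is not None:
    # back_ar=0.75 gives the communal background factor some persistence.
    dfd = make_qdf(df_cids, df_xcats, back_ar=0.75)
    print(dfd.head())
```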

make_qdf_black(df_cids, df_xcats, blackout)

Make quantamental DataFrame with basic columns: ‘cid’, ‘xcat’, ‘real_date’, ‘value’. In this DataFrame, the ‘value’ column consists of binary values denoting whether the cross-section is active on the corresponding dates.

Parameters:
  • df_cids (DataFrame) – DataFrame with parameters by cid. Row indices are cross-sections. Columns are: ‘earliest’: string of earliest date (ISO) for which country values are available; ‘latest’: string of latest date (ISO) for which country values are available; ‘mean_add’: float of country-specific addition to any category’s mean; ‘sd_mult’: float of country-specific multiplier of a category’s standard deviation.

  • df_xcats (DataFrame) – DataFrame with parameters by xcat. Row indices are categories. Columns are: ‘earliest’: string of earliest date (ISO) for which category values are available; ‘latest’: string of latest date (ISO) for which category values are available; ‘mean_add’: float of category-specific addition; ‘sd_mult’: float of category-specific multiplier of the category’s standard deviation; ‘ar_coef’: float between 0 and 1 setting the autocorrelation of the category; ‘back_coef’: float, coefficient with which the communal (mean 0, SD 1) background factor is added to category values.

  • blackout (dict) – Dictionary defining the blackout periods for each cross-section. The expected form of the dictionary is: {'AUD': (Timestamp('2000-01-13 00:00:00'), Timestamp('2000-01-13 00:00:00')), 'USD_1': (Timestamp('2000-01-03 00:00:00'), Timestamp('2000-01-05 00:00:00')), 'USD_2': (Timestamp('2000-01-09 00:00:00'), Timestamp('2000-01-10 00:00:00')), 'USD_3': (Timestamp('2000-01-12 00:00:00'), Timestamp('2000-01-12 00:00:00'))}. Each value is a tuple holding the start and end date of the respective blackout period. A cross-section may have more than one blackout period for a single category, in which case its keys are suffixed with an index (‘USD_1’, ‘USD_2’, and so on) to distinguish the periods.

Return <pd.DataFrame>:

basic quantamental DataFrame according to specifications with binary values.
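
The key convention of the blackout dictionary can be illustrated with plain pandas. The helpers `periods_by_cid` and `in_blackout` below are hypothetical and only show how indexed keys such as 'USD_1' and 'USD_2' map back to one cross-section; they make no claim about how the library builds its binary ‘value’ column.

```python
import pandas as pd

blackout = {
    "AUD": (pd.Timestamp("2000-01-13"), pd.Timestamp("2000-01-13")),
    "USD_1": (pd.Timestamp("2000-01-03"), pd.Timestamp("2000-01-05")),
    "USD_2": (pd.Timestamp("2000-01-09"), pd.Timestamp("2000-01-10")),
    "USD_3": (pd.Timestamp("2000-01-12"), pd.Timestamp("2000-01-12")),
}


def periods_by_cid(blackout):
    """Group (start, end) tuples by cross-section, dropping any '_<n>' suffix."""
    grouped = {}
    for key, (start, end) in blackout.items():
        cid = key.split("_")[0]
        grouped.setdefault(cid, []).append((start, end))
    return grouped


def in_blackout(dates, periods):
    """Boolean Series marking the dates that fall inside any of the periods."""
    mask = pd.Series(False, index=dates)
    for start, end in periods:
        mask |= (dates >= start) & (dates <= end)
    return mask


dates = pd.bdate_range("2000-01-03", "2000-01-14")  # business days only
grouped = periods_by_cid(blackout)
usd_mask = in_blackout(dates, grouped["USD"])
aud_mask = in_blackout(dates, grouped["AUD"])
print(usd_mask[usd_mask].index.tolist())
```

Because the date range contains business days only, the second USD period (9 to 10 January 2000) contributes a single day: 9 January 2000 was a Sunday.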

generate_lines(sig_len, style='linear')

Returns a numpy array of a line with a given length.

Parameters:
  • sig_len (int) – The number of elements in the returned array.

  • style (str) – The style of the line. Default is ‘linear’. Current choices are: linear, decreasing-linear, sharp-hill, four-bit-sine, sine, cosine, sawtooth. Adding “inv” or “inverted” to the style will return the inverted version of that line. For example, ‘inv-sawtooth’ or ‘inverted sawtooth’ will return the inverted sawtooth line. ‘any’ will return a random line. ‘all’ will return a list of all the available styles.

Return <Union[np.ndarray, List[str]]>:

A numpy array of the line. If style is ‘all’, then a list (of strings) of all the available styles is returned.

NOTE: It is indeed possible to request an “inverted linear” or “inverted decreasing-linear” line. They’re just there for completeness and readability.


make_test_df(cids=['AUD', 'CAD', 'GBP'], xcats=['XR', 'CRY'], start='2010-01-01', end='2020-12-31', style='any')

Generates a test dataframe with pre-defined values. These values are meant to be used for testing purposes only. The function generates a standard quantamental dataframe in which the value column is populated with pre-defined values. These values are simple lines or waves that are easy to identify and differentiate in a plot.

Parameters:
  • cids (List[str]) – A list of strings for cids.

  • xcats (List[str]) – A list of strings for xcats.

  • start (str) – An ISO-formatted date string for the start date.

  • end (str) – An ISO-formatted date string for the end date.

  • style (str) – A string that specifies the type of line to generate. Current choices are: ‘linear’, ‘decreasing-linear’, ‘sharp-hill’, ‘four-bit-sine’, ‘sine’, ‘cosine’, ‘sawtooth’, ‘any’. See macrosynergy.management.simulate.simulate_quantamental_data.generate_lines().

Return type:

QuantamentalDataFrame
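
A short usage sketch for the two test-data helpers follows, based only on the signatures documented on this page. The import is guarded because the example assumes macrosynergy is installed in a version that exposes both functions; the chosen cids, dates and style are arbitrary.

```python
try:
    from macrosynergy.management.simulate.simulate_quantamental_data import (
        generate_lines,
        make_test_df,
    )
except ImportError:
    # Library (or these helpers) not available in this environment.
    generate_lines = make_test_df = None

if make_test_df is not None:
    # 'all' returns the list of available style names instead of an array.
    styles = generate_lines(sig_len=10, style="all")
    print(styles)

    # Positional arguments follow the documented order:
    # cids, xcats, start, end.
    df = make_test_df(
        ["AUD", "CAD"], ["XR"], "2010-01-01", "2010-12-31", style="sine"
    )
    print(df.head())
```

Using a fixed style such as ‘sine’ rather than ‘any’ keeps the generated values reproducible across test runs.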