macrosynergy.management.utils.check_availability

Module for checking the availability of data in a Quantamental DataFrame. Includes functions for checking the start years and end dates of the series in a DataFrame, as well as for visualizing the results.

check_availability(df, xcats=None, cids=None, start=None, start_size=None, end_size=None, start_years=True, missing_recent=True, use_last_businessday=True)

Wrapper for visualizing start and end dates of a filtered DataFrame.

Parameters:
  • df (DataFrame) – standardized DataFrame with the following necessary columns: ‘cid’, ‘xcat’, ‘real_date’.

  • xcats (List[str]) – extended categories to be checked on. Default is all in the DataFrame.

  • cids (List[str]) – cross sections to be checked on. Default is all in the DataFrame.

  • start (str) – string representing earliest considered date. Default is None.

  • start_size (Tuple[float]) – tuple of floats with width/length of the start years heatmap. Default is None (format adjusted to data).

  • end_size (Tuple[float]) – tuple of floats with width/length of the end dates heatmap. Default is None (format adjusted to data).

  • start_years (bool) – boolean indicating whether or not to display a chart of starting years for each cross-section and indicator. Default is True (display start years).

  • missing_recent (bool) – boolean indicating whether or not to display a chart of the number of missing recent days for each cross-section and indicator. Default is True (display missing days).

  • use_last_businessday (bool) – boolean indicating whether or not to use the last business day before today as the end date. Default is True.
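The input format and the filter arguments described above can be illustrated with a minimal pandas sketch. It is not the library's implementation: `filter_panel`, the `value` column and the ticker names are hypothetical and exist only to show what `xcats`, `cids` and `start` are documented to select.

```python
import pandas as pd

# Illustrative long-format panel: one row per (cid, xcat, real_date).
# The "value" column and the tickers are made up for this example.
dates = pd.bdate_range("2020-01-01", "2020-01-10")
rows = [
    {"cid": cid, "xcat": xcat, "real_date": d, "value": 0.0}
    for cid in ["AUD", "CAD", "GBP"]
    for xcat in ["CPI", "XR"]
    for d in dates
]
df = pd.DataFrame(rows)


def filter_panel(df, xcats=None, cids=None, start=None):
    """Hypothetical helper mirroring the documented filter arguments."""
    out = df
    if xcats is not None:
        out = out[out["xcat"].isin(xcats)]
    if cids is not None:
        out = out[out["cid"].isin(cids)]
    if start is not None:
        out = out[out["real_date"] >= pd.Timestamp(start)]
    return out.reset_index(drop=True)


# Keep one category, two cross-sections, and dates from 6 January 2020 on.
subset = filter_panel(df, xcats=["CPI"], cids=["AUD", "GBP"], start="2020-01-06")
print(subset.head())
```

A DataFrame shaped like `df` is what the `df` parameter expects; the real function additionally draws the start-year and missing-day charts.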

missing_in_df(df, xcats=None, cids=None)

Print the cross-sections and categories that are missing from the DataFrame.

Parameters:
  • df (QuantamentalDataFrame) – standardized DataFrame with the following necessary columns: ‘cid’, ‘xcat’, ‘real_date’.

  • xcats (List[str]) – extended categories to be checked on. Default is all in the DataFrame.

  • cids (List[str]) – cross sections to be checked on. Default is all in the DataFrame.
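As a rough illustration of this kind of check, the sketch below reports which requested categories are absent altogether and, for each category that is present, which requested cross-sections have no rows. The helper `find_missing` and its return format are assumptions made for this example; the library function prints its findings, and its exact output is not reproduced here.

```python
import pandas as pd


def find_missing(df, xcats=None, cids=None):
    """Hypothetical helper: report requested xcats/cids absent from df."""
    xcats = sorted(df["xcat"].unique()) if xcats is None else list(xcats)
    cids = sorted(df["cid"].unique()) if cids is None else list(cids)

    present_xcats = set(df["xcat"])
    missing_xcats = [x for x in xcats if x not in present_xcats]

    # For each category that exists, list the cross-sections without data.
    missing_cids = {}
    for xcat in xcats:
        if xcat in present_xcats:
            have = set(df.loc[df["xcat"] == xcat, "cid"])
            missing_cids[xcat] = [c for c in cids if c not in have]
    return missing_xcats, missing_cids


df = pd.DataFrame(
    {
        "cid": ["AUD", "CAD", "AUD"],
        "xcat": ["CPI", "CPI", "XR"],
        "real_date": pd.to_datetime(["2020-01-01"] * 3),
    }
)
missing_xcats, missing_cids = find_missing(
    df, xcats=["CPI", "XR", "GROWTH"], cids=["AUD", "CAD"]
)
print(missing_xcats)  # ['GROWTH']
print(missing_cids)   # {'CPI': [], 'XR': ['CAD']}
```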

check_startyears(df)

DataFrame with starting years across all extended categories and cross-sections.

Parameters:

df (DataFrame) – standardized DataFrame with the following necessary columns: ‘cid’, ‘xcat’, ‘real_date’.
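A table of starting years can be sketched in plain pandas by taking the earliest `real_date` per cross-section and category and pivoting the categories into columns. This is an illustration of the idea under that assumption, not the library's code; the sample data is invented.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "cid": ["AUD", "AUD", "AUD", "CAD", "CAD"],
        "xcat": ["CPI", "CPI", "XR", "CPI", "XR"],
        "real_date": pd.to_datetime(
            ["2001-03-01", "2005-06-01", "1999-01-04", "2010-02-01", "2003-07-01"]
        ),
    }
)

# Earliest date per (cid, xcat), reduced to its year, with xcats as columns.
start_years = (
    df.groupby(["cid", "xcat"])["real_date"].min().dt.year.unstack("xcat")
)
print(start_years)
```

Replacing `min()` with `max()` and dropping `.dt.year` gives the analogous table of end dates.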

check_enddates(df)

DataFrame with end dates across all extended categories and cross sections.

Parameters:

df (DataFrame) – standardized DataFrame with the following necessary columns: ‘cid’, ‘xcat’, ‘real_date’.

Return type:

DataFrame

business_day_dif(df, maxdate)

Number of business days between two respective business dates.

Parameters:
  • df (DataFrame) – DataFrame with cross-sections as rows and categories as columns. Each cell in the DataFrame will correspond to the start date of the respective series.

  • maxdate (Timestamp) – maximum release date found in the received DataFrame. In principle, all series should have values up until the respective business date. The difference will represent possible missing values.

Returns:

DataFrame consisting of business day differences for all series.
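To make the idea of a business-day gap concrete, the sketch below counts weekdays between each series' last date and a reference date using `numpy.busday_count`. This is a swapped-in technique for illustration and may differ from how the library computes the difference; `business_days_behind` and the sample dates are hypothetical, and the table is assumed to contain no missing dates.

```python
import numpy as np
import pandas as pd

# Illustrative table of last available dates: cross-sections as rows,
# categories as columns.
end_dates = pd.DataFrame(
    {
        "CPI": pd.to_datetime(["2024-03-15", "2024-03-08"]),
        "XR": pd.to_datetime(["2024-03-13", "2024-03-15"]),
    },
    index=["AUD", "CAD"],
)
maxdate = pd.Timestamp("2024-03-15")


def business_days_behind(end_dates, maxdate):
    """Hypothetical helper: weekdays from each cell up to (not including) maxdate."""
    last = end_dates.to_numpy().astype("datetime64[D]")
    ref = np.datetime64(maxdate.date(), "D")
    # busday_count counts Monday-Friday days in the half-open range [last, ref).
    counts = np.busday_count(last, ref)
    return pd.DataFrame(counts, index=end_dates.index, columns=end_dates.columns)


gaps = business_days_behind(end_dates, maxdate)
print(gaps)
```

A value of 0 means the series is current as of `maxdate`; larger values indicate how many business days of data may be missing.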

visual_paneldates(df, size=None, use_last_businessday=True)

Visualize panel dates with color codes.

Parameters:
  • df (DataFrame) – DataFrame with cross-sections as rows and categories as columns.

  • size (Tuple[float]) – tuple of floats with width/length of displayed heatmap.

  • use_last_businessday (bool) – boolean indicating whether or not to use the last business day before today as the end date. Default is True.