"If you can't measure it, you can't manage it" takes on a new meaning in the face of the COVID-19 pandemic. The systemic declines affecting assets across all categories are abundantly clear. What's less obvious is where the points of friction within a company's operating model lie driving the narrative that declines in economic activity will translate to lower future earnings. Identifying those points of friction or, as many refer to them, inflection points necessitates having a firm grasp around what makes a company tick. At a high level, we can generalize those activities by answering three fundamental questions to help frame a company's operating model:
To begin exploring those questions, this notebook will guide you through analyzing exposure to countries with a large number of confirmed cases of COVID-19.
This example workbook provides queries in SQL formatted for MSSQL and sample python code. It may be necessary to adjust the query language to suit a different SQL DBMS. Sample queries and code should be viewed as possible examples of data calculations and are subject to change. The queries and code provided should be used as a guide to understand how the underlying data items can be used but are not guaranteed to represent the same methodology as the FactSet Workstation or other industry calculations. Queries cannot be guaranteed to be written for performance and/or efficiency.
import pandas as pd
import numpy as np
import pyodbc
import matplotlib.pyplot as plt
plt.style.use('seaborn-whitegrid')
from matplotlib import colors
import seaborn as sns
sns.set_context('talk')
import sys
import os
sys.path.append('../')
import utils.loadsql as loadsql
sys.path.append('/')
#Define Directory of SQL Queries
sql_path = './../SQL Queries/'
To begin exploring those questions, we'll limit our focus to companies held within the S&P 500 and analyze their exposure to countries with a large number of confirmed cases of COVID-19. That list is growing as we write this, so for now we've limited our scope to China, Italy, South Korea, and Japan (CIKJ).
etf_ticker_regions: This is a list of ETF Tickers with their region appended, i.e., SPY-US. This will be used to retrieve our universe from FactSet Ownership.
target_countries: This is a list of target countries for our analysis. Enter countries with their ISO2 abbreviation.
# Examples: IEUR-US (ISHARES TR/ISHARES CORE MSCI )
# IPAC-US (iShares Core MSCI Pacific ETF | IPAC)
etf_ticker_regions = ['SPY-US'] #S&P 500 Proxy
target_countries = ['CN', 'JP', 'IT', 'KR']
#Convert the country list into a format for a SQL "IN" clause:
target_countries = "'"+"','".join(ctry for ctry in target_countries)+"'"
Enter a DSN associated with the MSSQL database containing FactSet Standard DataFeeds.
dsn = 'SDF'
cxn = pyodbc.connect('DSN={dsn_name}'.format(dsn_name = dsn))
# matplotlib label function
def autolabel(rects):
"""Attach a text label above each bar in *rects*, displaying its height."""
for rect in rects:
height = rect.get_height()
ax.annotate(
"{}%".format(height),
xy=(rect.get_x() + rect.get_width() / 2, height),
xytext=(0, 3), # 3 points vertical offset
textcoords="offset points",
ha="center",
va="bottom",
)
We will leverage the following data sets in our analysis:
FactSet Ownership: Retrieval of our baseline universe
FactSet Supply Chain Relationships: Retrieval of each companies Suppliers and Customers with their Country of Domicile
FactSet Geographic Revenue: Retrieval of each companies total percentage revenue exposure to the target countries
FactSet Data Management Solutions: Retrieval of each company's full corporate entity structure. For each company in the hierarchy: country of domicile, country of risk, and geographic revenue country of risk.
To learn more about any of these content sets, visit www.open.factset.com or check out their associated starter kits within Data Exploration or on the individual product cards.
To get started, let's retrieve our data and begin to create a few aggregate values for analysis.
%%time
all_index_agg_df = []
dms_all_df = []
gr_all_df = []
for etf_ticker_region in etf_ticker_regions:
# Retrieve Supply Chain Data
q = loadsql.get_sql_q(
os.path.join(
sql_path, "1.1.1 Current Suppliers and Customers for Effected Countries.sql"
),
show=0,
connection=dsn,
).format(etf_ticker_region=etf_ticker_region, target_countries=target_countries)
df = pd.read_sql(q, cxn)
# Retrieve GeoRev Data
id_list = df.factset_entity_id.unique().tolist()
id_list = "'" + "','".join(ids for ids in id_list) + "'"
q = loadsql.get_sql_q(
os.path.join(
sql_path,
"1.1.2 Current Geographic Revenue Exposure for Effected Countries.sql",
),
show=0,
connection=dsn,
).format(id_list=id_list, target_countries=target_countries)
gr = pd.read_sql(q, cxn)
df = df.merge(gr, how="left", on="factset_entity_id")
gr["ETF"] = etf_ticker_region
# Retrieve DMS Data
q = loadsql.get_sql_q(
os.path.join(
sql_path, "1.1.3 Entity Structure Exposure with Effected Countries.sql"
),
show=0,
connection=dsn,
).format(id_list=id_list, target_countries=target_countries)
dms = pd.read_sql(q, cxn)
df = df.merge(dms, how="left", on="factset_entity_id")
# Create bins for later analysis
dms["Entity Structure Size"] = np.where(
dms.companies_in_entity_structure < 50,
"<=49",
np.where(
dms.companies_in_entity_structure < 500,
"<=499",
np.where(dms.companies_in_entity_structure < 999, "<=999", ">=1000"),
),
).tolist()
dms["ETF"] = etf_ticker_region
# Retrieve RBICS Data
q = loadsql.get_sql_q(
os.path.join(sql_path, "1.1.4 RBICS Sectors for Universe.sql"),
show=0,
connection=dsn,
).format(id_list=id_list)
df = df.merge(pd.read_sql(q, cxn), how="left", on="factset_entity_id")
# Map RBICS to other data sets
dms["RBICS_l1"] = dms.factset_entity_id.map(
dict(zip(df.factset_entity_id, df.RBICS_l1))
)
gr["RBICS_l1"] = gr.factset_entity_id.map(
dict(zip(df.factset_entity_id, df.RBICS_l1))
)
# Calculate Aggregate Fields
df[
"Pct Effected Customers by Country of Domicile"
] = df.effected_customers_by_country_of_domicile.divide(df.num_customers).multiply(
100
)
df[
"Pct Effected Suppliers by Country of Domicile"
] = df.effected_suppliers_by_country_of_domicile.divide(df.num_suppliers).multiply(
100
)
df["Pct of Effected Companies in Entity Structure by Country of Domicile"] = (
df["effected_subs_by_country_of_domicile"]
.divide(df.companies_in_entity_structure)
.multiply(100)
)
df["Pct of Effected Companies in Entity Structure by Country of Risk"] = (
df["effected_subs_by_cor"]
.divide(df.companies_in_entity_structure)
.multiply(100)
)
df["Pct of Effected Companies in Entity Structure by GeoRev Country of Risk"] = (
df["effected_subs_by_cor_georev"]
.divide(df.companies_in_entity_structure)
.multiply(100)
)
df = df[
[
"proper_name",
"factset_entity_id",
"RBICS_l1",
"RBICS_l2",
"RBICS_l3",
"RBICS_l4",
"Pct Effected Customers by Country of Domicile",
"effected_customers_avg_rank",
"Pct Effected Suppliers by Country of Domicile",
"effected_suppliers_avg_rank",
"rev_exposure_to_target_countries",
"Pct of Effected Companies in Entity Structure by Country of Domicile",
"Pct of Effected Companies in Entity Structure by Country of Risk",
"Pct of Effected Companies in Entity Structure by GeoRev Country of Risk",
]
].rename(
columns={
"proper_name": "Company Name",
"effected_suppliers_avg_rank": "Effected Suppliers Avg Relevance Rank",
"effected_customers_avg_rank": "Effected Customers Avg Relevance Rank",
"rev_exposure_to_target_countries": "Pct Rev Exposure to Effected Countries",
}
)
df.fillna(0, inplace=True)
df["ETF"] = etf_ticker_region
all_index_agg_df.append(df)
gr_all_df.append(gr)
dms_all_df.append(dms)
del df
del dms
del gr
print("{etf} Complete.".format(etf=etf_ticker_region))
all_index_agg_df = pd.concat(all_index_agg_df).set_index(
["Company Name", "RBICS_l1", "RBICS_l4", "ETF"]
)
gr_all_df = pd.concat(gr_all_df).set_index("ETF")
dms_all_df = pd.concat(dms_all_df).set_index("ETF")
print("Complete. Data Retrieved.")
With mounting concerns about the decline in production coming from countries with exposure to COVID-19, it's important to understand how that perceived lack of production will impact companies in our universe. Let's begin by looking to the country of domicile (CoD) by counting the number of entities in each company's corporate hierarchy with a CoD in CIKJ. The companies in the S&P 500, while headquartered in the US, are globally distributed businesses with massive operations across the globe. The chart below helps paint that picture, showing many companies within the index having anywhere from 50 to 500 entities in their corporate hierarchy.
for etf in dms_all_df.index.get_level_values(0).unique().tolist():
dms_all_df.loc[etf].groupby(
["Entity Structure Size", "RBICS_l1"]
).factset_entity_id.nunique().swaplevel(0).unstack().plot(
kind="bar", figsize=(12, 6), stacked=True
)
plt.title("S&P 500".format(etf=etf), fontsize=18)
plt.suptitle("Complexity of Companies - Entity Structure Size", fontsize=20)
plt.xlabel("RBICS Economy")
plt.legend(
title="Entity Structure Size", loc="center left", bbox_to_anchor=(1, 0.5)
)
plt.annotate(
"""Calculation: Total number number of entities within each constituents corporate hierarchy.
Source: FactSet - Data Management Solutions: Entity Structure""",
(0, 0),
(0, -250),
xycoords="axes fraction",
textcoords="offset points",
va="top",
fontsize=12,
)
plt.show()
The sheer number of global relationships shown above raises an interesting question: is country of domicile the best measure of "exposure" to a country or set of countries? Perhaps, but by definition country of domicile is limited to a single location and doesn't explore the global network most companies operate in. One alternative is country of risk (CoR), which is designed to capture influences beyond the country of domicile by accounting for the affect economic and political turmoil can have on companies. Let's explore both measures side by side using RBICS Economy as the sector classification. For CoD, we measure the percentage of entities across each corporate hierarchy with a country of domicile in CIKJ. For CoR, we conduct a similar analysis but use FactSet GoeRev Country of Risk (GeoCor).
for etf in dms_all_df.index.get_level_values(0).unique().tolist():
country_exposure = (
dms_all_df.loc[etf]
.groupby(["RBICS_l1"])[
[
"companies_in_entity_structure",
"effected_subs_by_country_of_domicile",
"effected_subs_by_cor",
"effected_subs_by_cor_georev",
]
]
.sum()
)
country_exposure = (
country_exposure[
["effected_subs_by_country_of_domicile", "effected_subs_by_cor_georev"]
]
.divide(country_exposure.companies_in_entity_structure, axis=0)
.multiply(100)
.sort_values("effected_subs_by_cor_georev", ascending=False)
)
labels = country_exposure.index.tolist()
sups = [
round(x, 2) for x in country_exposure["effected_subs_by_country_of_domicile"]
]
rr = [round(x, 2) for x in country_exposure["effected_subs_by_cor_georev"]]
x = np.arange(len(labels)) # the label locations
width = 0.35 # the width of the bars
fig, ax = plt.subplots(figsize=(12, 6))
rects1 = ax.bar(x - width / 2, sups, width, label="Country of Domicile")
rects2 = ax.bar(x + width / 2, rr, width, label="GeoCOR")
# Add some text for labels, title and custom x-axis tick labels, etc.
ax.set_ylabel("Percentage")
ax.set_title(
"{etf}: Entity Structure Country Exposure".format(etf=etf), fontsize=22
)
ax.set_xticks(x)
ax.set_xticklabels(labels, rotation=90)
ax.legend()
plt.ylim(0, 10)
autolabel(rects1)
autolabel(rects2)
plt.legend(title="", loc="center left", bbox_to_anchor=(1, 0.5))
plt.annotate(
"""Calculation: Percentage of entities across the each constituents' corporate hierarchy wher the country of domicile or GeoCOR is {cn}.
Source: FactSet - Data Management Solutions: Entity Structure""".format(
cn=target_countries.replace("'", "")
),
(0, 0),
(0, -175),
xycoords="axes fraction",
textcoords="offset points",
va="top",
fontsize=12,
)
plt.show()
Both measures show minimal exposure to the affected countries and seemingly side-step the relationships we are after. At the outset of the article, we mentioned the noticeable influence COVID-19 has had on markets, so why are these measures falling short? A potential reason is the scope of these metrics. Both CoD and CoR do not account for more involved business models, like the large, multi-national businesses present in our universe. Even CoR, which accounts for several factors beyond CoD when assigning a country designation, doesn't explore each of the country exposures a company has, which are of paramount importance during a pandemic.
So how can we expand upon our current measures of exposure? Let's begin by looking beyond primary sources of revenue and view all geographies contributing to a company's top line. The histogram below illustrates the distribution of the S&P 500 companies' percentage of revenue generated from CIKJ by RBICS Economy.
for etf in all_index_agg_df.index.get_level_values(3).unique().tolist():
all_index_agg_df[all_index_agg_df.index.get_level_values(3) == etf][
"Pct Rev Exposure to Effected Countries"
].hist(
by=all_index_agg_df[
all_index_agg_df.index.get_level_values(3) == etf
].index.get_level_values(1),
bins=10,
figsize=(12, 12),
sharex=True,
sharey=False,
)
plt.suptitle(
"Distribution of Company Level Geographic Revenue Exposure to Target Countries\n by RBICS Economy - {etf}".format(
etf=etf
),
fontsize=16,
)
plt.annotate(
"""Calculation: Distribution of the total geographic revenue exposure to {cn} by company within each RBICS Economy.
Source: FactSet - Geographic Revenue Exposure and FactSet - RBICS""".format(
cn=target_countries.replace("'", "")
),
(-2.75, 0),
(0, -75),
xycoords="axes fraction",
textcoords="offset points",
va="top",
fontsize=12,
)
plt.show()
In our hunt to quantify exposure, this provides us with a few breadcrumbs. Utilities, Telecommunications, and Business Services exhibit minimal exposure, while Technology, Non-Energy Materials, and Healthcare's revenue exposure could translate into higher risk due to COVID-19. Recall that 'risk' is defined as the level of uncertainty around an outcome, and in this case, it's unclear that companies with a large amount of revenue from CIKJ will be able to count on that same level of revenue in the future.
Seeing the relatively high percentage of revenue from CIKJ in Technology, we begin to wonder what types of technology firms are at higher/lower risk than others. The chart below illustrates the distribution of total geographic revenue exposure to CIKJ by industry group.
values = ["Pct Rev Exposure to Effected Countries"]
for etf in all_index_agg_df.index.get_level_values(3).unique().tolist():
df = all_index_agg_df[all_index_agg_df.index.get_level_values(3) == etf]
mask = (df.index.get_level_values(1) == "Technology") & (
df.groupby(df.index.get_level_values(2)).factset_entity_id.transform("size") > 1
)
plt.figure(figsize=(12, 5))
sns.boxplot(
x=df[mask].index.get_level_values(2), y=values[0], data=df[mask]
).set_title(
"Geographic Revenue Exposure to Target Countries - {etf} \n Technology Industry Groups".format(
etf=etf
),
fontsize=16,
)
plt.xticks(rotation=90, fontsize=14)
plt.ylabel("Geogrphic Rev. Exposure (PCT)")
plt.xlabel("RBICS Industry Group")
plt.annotate(
"""Calculation: Distribution of the total geographic revenue exposure to {cn} by company within each RBICS Economy.
Source: FactSet - Geographic Revenue Exposure and FactSet - RBICS""".format(
cn=target_countries.replace("'", "")
),
(0, 0),
(0, -350),
xycoords="axes fraction",
textcoords="offset points",
va="top",
fontsize=12,
)
plt.show()
Measuring risk based on revenue exposure leads us to identify Specialized Semiconductors, Analog and Mixed Signal Semiconductors (AMSS), and Semiconductor Manufacturing Capital Equipment (SMCE) as industries that will be heavily impacted by COVID-19. In addition to high revenue exposure (35-65% generate in CIJK), the activities associated with these industries could be susceptible to outside intervention as governments continue to manage the outbreak. For example, manufacturing firms may be hard-pressed to find adequate work from home scenarios in the event of extended lockdowns or quarantines, at least without diminishing their current or expected levels of production, leading us to another important question: how will company to company relationships be impacted?
Expanding our definition of exposure to account for global sources of revenue has already borne fruit relative to our earlier, more limited definitions relying on CoD and CoR, but there's more we can do. Exploring company supply chains is an integral component to understanding the sources of global revenue and how they may evolve. A simple method to quantify a company's supply chain exposure is to determine the percentage of suppliers and customers with a country of domicile listed as being one of the affected countries (CIKJ). Again, focusing on Technology, it becomes clear Semiconductor companies have significant supply chain exposure. Nearly a third of AMSS companies' customer and SMCE companies' suppliers are domiciled within CIKJ.
for etf in all_index_agg_df.index.get_level_values(3).unique().tolist():
values = [
"Pct Effected Suppliers by Country of Domicile",
"Pct Effected Customers by Country of Domicile",
]
# Limit to Technology, L4's>1 Company, and target ETF
mask = (
(all_index_agg_df.index.get_level_values(1) == "Technology")
& (
all_index_agg_df.groupby(
all_index_agg_df.index.get_level_values(2)
).factset_entity_id.transform("size")
> 1
)
& (all_index_agg_df.index.get_level_values(3) == etf)
)
df = all_index_agg_df[mask].copy()
df = df.groupby(df.index.get_level_values(2))[values].median()
df.sort_values(values[0], inplace=True, ascending=False)
labels = df.index.tolist()
sups = [round(x, 2) for x in df[values[0]]]
cust = [round(x, 2) for x in df[values[1]]]
x = np.arange(len(labels)) # the label locations
width = 0.35 # the width of the bars
fig, ax = plt.subplots(figsize=(16, 6))
rects1 = ax.bar(x - width / 2, sups, width, label="Median Pct. Suppliers Effected")
rects2 = ax.bar(x + width / 2, cust, width, label="Median Pct. Customers Effected")
# Add some text for labels, title and custom x-axis tick labels, etc.
ax.set_ylabel("Percentage of Supply Chain")
ax.set_xlabel("RBICS Industry Group")
ax.set_title(
"{etf} Supply Chain Exposure\nCustomers and Suppliers From Target Countries".format(
etf=etf
),
fontsize=22,
)
ax.set_xticks(x)
ax.set_xticklabels(labels, rotation=90)
ax.legend()
autolabel(rects1)
autolabel(rects2)
plt.legend(title="", loc="center left", bbox_to_anchor=(1, 0.5))
plt.annotate(
"""Calculation: Percentage of customers/suppliers within a company's supply chain that are domiciled in {cn}.
Source: FactSet - Supply Chain Relationships""".format(
cn=target_countries.replace("'", "")
),
(0, 0),
(0, -325),
xycoords="axes fraction",
textcoords="offset points",
va="top",
fontsize=12,
)
plt.show()
Sticking with S&P 500 Technology sector, let's see how companies in AMSS and SMCE have fared by stacking up each measure of exposure and market performance from February 1st to March 13th. The last three columns to the right include absolute and benchmark/industry relative returns to separate returns attributable to the market from more idiosyncratic returns driven by the company (e.g. remove beta to market/industry).
values = [
"Pct Effected Customers by Country of Domicile",
"Pct Effected Suppliers by Country of Domicile",
"Pct Rev Exposure to Effected Countries",
"Pct of Effected Companies in Entity Structure by Country of Domicile",
"Pct of Effected Companies in Entity Structure by GeoRev Country of Risk",
]
final_df = all_index_agg_df[values].copy()
# Rank Companies on each value, sum their total ranks, rerank on "total rank"
final_df["Rank"] = (
final_df[values]
.rank(ascending=False, method="first")
.sum(axis=1)
.rank(ascending=True, method="first")
)
final_df = final_df.droplevel(1)
final_df.set_index("Rank", append=True, inplace=True)
final_df.sort_index(level=(3)).droplevel(3, axis=0)[values].head(
25
).style.background_gradient(subset=values, cmap="Reds").set_properties(
**{"width": "10em", "text-align": "center"}
).format(
"{:0.1f}%"
)
Companies in the SMCE industry have outperformed both the index and their direct peer group. Still, with high levels of exposure across the different measures shown above, it begs the question if that trend will continue or if a more severe drawdown is looming. AMSS results, on the other hand, are mixed. Texas Instruments, for example, has outperformed relative to the sector and market while generating 57% of its revenues in CIJK. Perhaps this seemingly anomalous result is a function of their business model, where near-term revenue growth is tied to contracts that won't be impacted in the subsequent months, or maybe the market has yet to account for this exposure.
With 500 companies in our universe and potentially endless connections when considering each company's supply chain, it's clear that we're just getting started. As we conclude our short analysis, it's essential to keep in mind that as with most measures of risk, financial or otherwise, there isn't one universal metric that tells the entire story. Gaining a more complete understanding of a company's country/countries of risk requires a variety of checkpoints, similar to going to the doctor for an annual check-up. Doctor's providing health assessments rarely turn to one number, rather they look to a series of tests to form a consensus (blood pressure, cholesterol level, diet & exercise habits, etc.). In that same vein, the more information you have at your disposal, the better prepared you are to understand the relationships of interest. In the case of country exposures, these relationships are informed by understanding what your companies do, where they do it, and who they do it with.
values = ["Pct of Effected Companies in Entity Structure by Country of Domicile"]
all_index_agg_df[values].sort_values(values, ascending=False).head(
10
).style.background_gradient(subset=values[0], cmap="Blues").set_properties(
**{"text-align": "center"}
).format(
{values[0]: "{:0.1f}%"}
)
values = ["Pct of Effected Companies in Entity Structure by Country of Risk"]
all_index_agg_df[values].sort_values(values, ascending=False).head(
10
).style.background_gradient(subset=values[0], cmap="Blues").set_properties(
**{"text-align": "center"}
).format(
{values[0]: "{:0.1f}%"}
)
values = ["Pct of Effected Companies in Entity Structure by GeoRev Country of Risk"]
all_index_agg_df[values].sort_values(values, ascending=False).head(10).style.background_gradient(
subset=values[0], cmap="Greens"
).set_properties(**{"text-align": "center"}).format({values[0]: "{:0.1f}%"})
# For this example, we will focus on 1 index, to show multiple or to target a specific index, adjust this filter
values = [
"Pct Effected Customers by Country of Domicile",
"Effected Customers Avg Relevance Rank",
]
all_index_agg_df[values].sort_values(values, ascending=[False, True]).head(
10
).style.background_gradient(subset=values[0], cmap="Oranges").background_gradient(
subset=values[1], cmap="Oranges_r"
).set_properties(
**{"text-align": "center"}
).format(
{values[0]: "{:0.1f}%"}
)
values = [
"Pct Effected Suppliers by Country of Domicile",
"Effected Suppliers Avg Relevance Rank",
]
all_index_agg_df[values].sort_values(values, ascending=[False, True]).head(
10
).style.background_gradient(subset=values[0], cmap="Oranges").background_gradient(
subset=values[1], cmap="Oranges_r"
).set_properties(
**{"text-align": "center"}
).format(
{values[0]: "{:0.1f}%"}
)
To learn more about the content sets used within this notebook, check out the following resources: