Sector neutralization is a technique for removing sector exposures from a factor by ranking and comparing stocks separately within each sector rather than across the entire market. The goal is to hedge out sector bets and reduce the impact of sector-specific risks on the portfolio.
Many fundamental metrics naturally vary across sectors and industries. These variations don't necessarily indicate better or worse companies but simply represent differences in the way different industries operate. Sector neutralization can be useful with these kinds of metrics.
In this notebook we'll use the debt-to-equity ratio (D/E ratio) to explore sector neutralization.
Let's start by looking at how the D/E ratio varies across sectors. To do so, we define a pipeline for our base universe, with the D/E ratio and sector as columns:
from zipline.pipeline import master, sharadar, Pipeline
from codeload.fundamental_factors.universe import BaseUniverse
fundamentals = sharadar.Fundamentals.slice('ART')
de = fundamentals.DE.latest
sector = master.SecuritiesMaster.sharadar_Sector.latest
universe = BaseUniverse()
pipeline = Pipeline(
    columns={
        'de': de,
        'sector': sector
    },
    screen=universe
)
Then we run the pipeline, group the results by sector, and plot the median D/E ratio for each sector:
from zipline.research import run_pipeline
results = run_pipeline(pipeline, '2022-12-30', '2022-12-30')
results.groupby('sector').median().plot(kind="barh", title="Median debt-to-equity ratio by sector");
The median D/E ratio for the financial sector is much higher than for other sectors, and the D/E ratio for the healthcare sector is considerably lower than for other sectors. This means that if we rank stocks by their D/E ratio, the financial sector will be over-represented at one end of the rankings and the healthcare sector will be over-represented at the other end.
Let's visualize this over-representation. To do so, we'll rank by D/E ratio in ascending order (that is, assigning ranking 1 to the lowest D/E ratio) and form quintiles from the ranks. We will then be able to analyze the sector representation in the highest and lowest quintiles, respectively.
pipeline = Pipeline(
    columns={
        # we mask rank() with universe to avoid ranking stocks that aren't in our universe
        'quintiles': de.rank(mask=universe).quintiles(),
        'sector': sector,
    },
    screen=universe
)
results = run_pipeline(pipeline, '2022-12-30', '2022-12-30')
To establish a baseline, let's count the number of securities per sector to see how different sectors compare in size. We can see that Financial Services, Healthcare, Technology, and Industrials are the largest sectors in the market:
securities_by_sector = results.groupby('sector').quintiles.count()
ax = securities_by_sector.plot(kind="pie", title="Number of securities by sector")
ax.set_ylabel('');
Now, let's count the number of securities per sector in the lowest and highest quintiles by D/E ratio. The quintile labels are zero-indexed, meaning quintile 0 contains stocks with the lowest D/E ratios, and quintile 4 contains stocks with the highest D/E ratios.
lowest_quintile_by_sector = results[results.quintiles == 0].groupby('sector').quintiles.count()
highest_quintile_by_sector = results[results.quintiles == 4].groupby('sector').quintiles.count()
We then view pie charts of the results. As expected, healthcare stocks dominate the low D/E quintile, and financial stocks dominate the high D/E quintile. Consequently, a long-short portfolio formed using the D/E ratio would largely be just a bet on healthcare vs financials:
import matplotlib.pyplot as plt
fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(12, 6))
lowest_quintile_by_sector.plot(kind="pie", ax=axes[0], title="Number of securities in lowest DE quintile by sector")
highest_quintile_by_sector.plot(kind="pie", ax=axes[1], title="Number of securities in highest DE quintile by sector")
for ax in axes:
    ax.set_ylabel('')
fig.tight_layout()
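The over-representation shown in the pie charts can also be expressed as numbers. The following is a minimal pandas sketch on hypothetical toy data (the `toy` DataFrame and its values are invented for illustration). It compares each sector's share of a quintile with its share of the whole universe. With the real `results` DataFrame, the analogous call would presumably be `pd.crosstab(results.sector, results.quintiles, normalize="columns")`.

```python
import pandas as pd

# Hypothetical toy data: one row per stock, with its sector and D/E quintile
toy = pd.DataFrame({
    "sector": ["Healthcare", "Healthcare", "Healthcare",
               "Financial Services", "Financial Services", "Financial Services",
               "Technology", "Technology"],
    "quintiles": [0, 0, 1, 4, 4, 3, 0, 4],
})

# Share of each quintile's members that come from each sector
# (normalize="columns" makes every quintile column sum to 1)
sector_share = pd.crosstab(toy["sector"], toy["quintiles"], normalize="columns")

# Share of the whole universe in each sector, as a baseline for comparison
universe_share = toy["sector"].value_counts(normalize=True)

print(sector_share[[0, 4]].round(2))
print(universe_share.round(2))
```

A sector whose share of a quintile is well above its share of the universe is over-represented in that quintile.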
To avoid sector concentration, we can incorporate the sector into the ranking process by using the groupby parameter of rank(). Normally, the rank() method ranks all securities from $1 \rightarrow N$, where $N$ is the number of securities in the universe. In contrast, rank(groupby=sector) ranks securities from $1 \rightarrow N_s$ within each sector, where $N_s$ is the number of securities in that sector. In the resulting output, if there are $S$ sectors, there will be $S$ stocks ranked 1, $S$ stocks ranked 2, etc. (up to the size of the smallest sector). Ranking by sector will allow us to form a portfolio in which each sector is equally represented.
The following pipeline expression ranks by sector, then forms quintiles from the resulting ranks:
neutral_quintiles = de.rank(mask=universe, groupby=sector).quintiles()
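To make the difference between universe-wide and per-sector ranking concrete, here is a small pandas analogue on hypothetical toy data. It is only a sketch of the idea, not how Pipeline implements rank(), and details such as tie handling may differ.

```python
import pandas as pd

# Hypothetical toy universe: three stocks in sector A, two in sector B
toy = pd.DataFrame(
    {
        "sector": ["A", "A", "A", "B", "B"],
        "de": [0.5, 2.0, 1.0, 8.0, 6.0],
    },
    index=["a1", "a2", "a3", "b1", "b2"],
)

# Rank across the whole universe: ranks run from 1 to N, with N = 5
toy["rank"] = toy["de"].rank()

# Rank within each sector: ranks restart at 1 in every sector
toy["sector_rank"] = toy.groupby("sector")["de"].rank()

print(toy)
```

In the universe-wide ranking, both sector B stocks sit at the top (ranks 4 and 5). In the per-sector ranking, each sector has its own stock ranked 1.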
Let's re-run the previous pipeline with the neutral quintiles to see how this affects the composition of our low D/E quintile:
pipeline = Pipeline(
    columns={
        'quintiles': de.rank(mask=universe).quintiles(),
        'neutral_quintiles': neutral_quintiles,
        'sector': sector,
    },
    screen=universe
)
results = run_pipeline(pipeline, '2022-12-30', '2022-12-30')
We plot the sector breakdown of the low D/E quintile without sector neutralization (on the left) and with sector neutralization (on the right). Sector neutralization has eliminated the over-weighting of healthcare and created balance across sectors:
lowest_quintile_by_sector_neutralized = results[results.neutral_quintiles == 0].groupby('sector').neutral_quintiles.count()
import matplotlib.pyplot as plt
fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(12, 6))
lowest_quintile_by_sector.plot(kind="pie", ax=axes[0], title="Number of securities in lowest DE quintile by sector (not neutralized)")
lowest_quintile_by_sector_neutralized.plot(kind="pie", ax=axes[1], title="Number of securities in lowest DE quintile by sector (neutralized)")
for ax in axes:
    ax.set_ylabel('')
fig.tight_layout()
Let's create a similar set of plots for the high D/E quintile, to see if we have eliminated the over-weighting of financial stocks in this quintile:
highest_quintile_by_sector_neutralized = results[results.neutral_quintiles == 4].groupby('sector').neutral_quintiles.count()
fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(12, 6))
highest_quintile_by_sector.plot(kind="pie", ax=axes[0], title="Number of securities in highest DE quintile by sector (not neutralized)")
highest_quintile_by_sector_neutralized.plot(kind="pie", ax=axes[1], title="Number of securities in highest DE quintile by sector (neutralized)")
for ax in axes:
    ax.set_ylabel('')
fig.tight_layout()
That doesn't look good! The sector-neutralized, high D/E quintile (right-hand plot) is now highly concentrated in four sectors: Healthcare, Financial Services, Technology, and Industrials. What's going on? You may notice that these four sectors are the four largest sectors, as we saw earlier. When we rank by sector using rank(groupby=sector), the stocks in each sector are ranked from $1 \rightarrow N$, where $N$ is the number of securities in the sector. This means that larger sectors end up having securities with larger maximum ranks. As a result, when we get to the last quintile (which contains stocks with high ranks), only the larger sectors have any securities left to put in the quintile.
This means that if you want to construct a sector-neutralized high D/E quintile, you should not use quintile 4 of the D/E ratio ranked by sector in ascending order. Rather, you should form quintiles from D/E ratio ranked by sector in descending order (highest D/E ratio first) and select quintile 0.
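The sector-size effect described above, and the descending-rank fix, can be reproduced with plain pandas. This is a sketch under stated assumptions: the two sectors ("Big" and "Small") and their D/E values are hypothetical, and `pd.qcut` stands in for Pipeline's quintiles(), whose exact bin edges may differ.

```python
import numpy as np
import pandas as pd

# Hypothetical toy universe: a sector with 10 stocks and one with 5
toy = pd.DataFrame({
    "sector": ["Big"] * 10 + ["Small"] * 5,
    "de": np.concatenate([np.arange(1.0, 11.0), np.arange(1.5, 6.5)]),
})

by_sector = toy.groupby("sector")["de"]
asc_rank = by_sector.rank(ascending=True)    # lowest D/E gets rank 1
desc_rank = by_sector.rank(ascending=False)  # highest D/E gets rank 1

# Form quintiles from the pooled per-sector ranks (0 = lowest ranks)
toy["asc_quintile"] = pd.qcut(asc_rank, 5, labels=False)
toy["desc_quintile"] = pd.qcut(desc_rank, 5, labels=False)

# High D/E stocks taken from the top quintile of the ascending ranks:
# only the big sector has ranks large enough to land here
high_from_asc = toy[toy["asc_quintile"] == 4].groupby("sector").size()

# High D/E stocks taken from the first quintile of the descending ranks:
# both sectors contribute
high_from_desc = toy[toy["desc_quintile"] == 0].groupby("sector").size()

print(high_from_asc)
print(high_from_desc)
```

In this toy example, the first selection contains stocks from the big sector only, while the second contains the same number of stocks from each sector.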
Ranking in the appropriate direction and taking the first group is the approach used by the top() and bottom() Factor methods in Pipeline, so you can avoid the above problem by using those methods to select high and low D/E stocks. The following code selects the 10 stocks from each sector with the lowest D/E ratio and the 10 stocks from each sector with the highest D/E ratio:
# this ranks D/E from low to high and takes the first 10 per sector
lowest_de_stocks = de.bottom(10, mask=universe, groupby=sector)
# this ranks D/E from high to low and takes the first 10 per sector
highest_de_stocks = de.top(10, mask=universe, groupby=sector)
To double-check that this results in the sector balance we expect, let's create a pipeline containing the low D/E and high D/E groups of stocks we just selected. We use the if_else() Filter method combined with Constant() to label the groups as "low" and "high", respectively:
from zipline.pipeline import Constant
# limit the output to the low DE and high DE stocks we just selected
screen = lowest_de_stocks | highest_de_stocks
# label the two groups of stocks (if_else() returns the first argument
# if lowest_de_stocks is True and the second argument if it is False)
label = lowest_de_stocks.if_else(Constant('low'), Constant('high'))
pipeline = Pipeline(
    columns={
        'label': label,
        'sector': sector
    },
    screen=screen
)
Running this pipeline, we confirm that there are 10 stocks per sector in the low and high D/E groups:
results = run_pipeline(pipeline, start_date="2022-12-30", end_date="2022-12-30")
results['label'] = results.label.cat.remove_unused_categories()
results.groupby(['label', 'sector']).size()
label  sector
high                              0
       Basic Materials           10
       Communication Services    10
       Consumer Cyclical         10
       Consumer Defensive        10
       Energy                    10
       Financial Services        10
       Healthcare                10
       Industrials               10
       Real Estate               10
       Technology                10
       Utilities                 10
low                               0
       Basic Materials           10
       Communication Services    10
       Consumer Cyclical         10
       Consumer Defensive        10
       Energy                    10
       Financial Services        10
       Healthcare                10
       Industrials               10
       Real Estate               10
       Technology                10
       Utilities                 10
dtype: int64
Another way to neutralize sectors is to demean them. With demeaning, we calculate the mean D/E ratio for each sector and subtract it from the observed values. Accordingly, the resulting values are centered around zero for each sector, which allows us to better compare values across sectors. If we rank stocks on their demeaned D/E ratios, we are ranking them not on their absolute D/E ratio but on how high or low their D/E ratio is relative to the sector average.
The following pipeline expression forms quintiles from sector-demeaned D/E ratios:
demeaned_quintiles = de.demean(mask=universe, groupby=sector).quintiles()
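As a rough illustration of what sector demeaning does to the values, here is a pandas sketch on hypothetical toy data. It mirrors the idea of subtracting each sector's mean and is not a description of Pipeline's internals.

```python
import pandas as pd

# Hypothetical toy data: a high-leverage sector and a low-leverage sector
toy = pd.DataFrame({
    "sector": ["Financial Services"] * 3 + ["Healthcare"] * 3,
    "de": [8.0, 10.0, 12.0, 0.1, 0.3, 0.5],
})

# Mean D/E of each stock's own sector, aligned to the original rows
sector_mean = toy.groupby("sector")["de"].transform("mean")

# Demeaned D/E: how far each stock is from its sector average
toy["de_demeaned"] = toy["de"] - sector_mean

print(toy)
```

After demeaning, both sectors are centered on zero, so a healthcare stock with a D/E of 0.5 now compares as "above its sector average" rather than "far below every financial stock".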
Alternatively, we can z-score the D/E ratios, which is like demeaning but includes the additional step of dividing the demeaned values by the standard deviation of D/E ratios for the sector. Whereas demeaning neutralizes the effect of one sector having generally higher or lower D/E ratios than another sector, z-scoring additionally removes the effect of one sector having a wider variation of D/E ratios than another sector.
To z-score in Pipeline, just modify the previous expression to use zscore() instead of demean():
zscored_quintiles = de.zscore(mask=universe, groupby=sector).quintiles()
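The difference between demeaning and z-scoring is easiest to see on two sectors with the same shape but different spreads. The following pandas sketch uses hypothetical sectors ("Wide" and "Narrow") and assumes a population standard deviation (`ddof=0`); Pipeline's exact convention may differ.

```python
import pandas as pd

# Hypothetical toy data: same pattern of values, very different dispersion
toy = pd.DataFrame({
    "sector": ["Wide"] * 3 + ["Narrow"] * 3,
    "de": [0.0, 10.0, 20.0, 0.9, 1.0, 1.1],
})

grouped = toy.groupby("sector")["de"]

# Demeaning removes each sector's level but keeps its spread
toy["de_demeaned"] = toy["de"] - grouped.transform("mean")

# Z-scoring additionally divides by the sector's standard deviation
# (ddof=0 is an assumption made for this sketch)
sector_std = grouped.transform(lambda s: s.std(ddof=0))
toy["de_zscore"] = toy["de_demeaned"] / sector_std

print(toy.round(3))
```

The demeaned values still differ by a factor of 100 between the two sectors, so the wide sector would dominate both extremes of a pooled ranking. The z-scores are the same in both sectors.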
Note that, unlike the ranking approaches shown above, demeaning and z-scoring don't guarantee that your quantiles will be equally weighted among all sectors. It is still possible that the best or worst D/E ratios will cluster more in one sector than another, even after adjusting for sector differences. Moreover, since some sectors are larger than others, we should naturally expect that those larger sectors will make up a larger portion of any given quantile, on average.
An alternative to neutralizing by sector is to neutralize by industry. Industries are more specific than sectors. In theory, the appeal of using industries instead of sectors is that companies within a given industry are more closely related than companies within a given sector, thus providing a more accurate benchmark. However, many industries are too small to allow for meaningful comparisons. For this reason, sectors usually provide a better balance of granularity and adequate sample size. Industries are best used when you wish to specifically target a particular industry or group of industries, which you know in advance are large enough to yield meaningful results.
The number of stocks per industry is shown below:
pipeline = Pipeline(
    columns={
        'sector': master.SecuritiesMaster.sharadar_Sector.latest,
        'industry': master.SecuritiesMaster.sharadar_Industry.latest,
    },
    screen=universe
)
results = run_pipeline(pipeline, '2022-12-30', '2022-12-30')
counts = results.groupby(['sector', 'industry']).size()
print(counts[counts > 0].to_string())
sector                  industry
Basic Materials         Agricultural Inputs                        10
                        Aluminum                                    3
                        Building Materials                          6
                        Building Products & Equipment              29
                        Chemicals                                  18
                        Coking Coal                                 5
                        Copper                                      2
                        Gold                                       11
                        Lumber & Wood Production                    4
                        Other Industrial Metals & Mining            5
                        Other Precious Metals & Mining              3
                        Paper & Paper Products                      6
                        Specialty Chemicals                        49
                        Steel                                      14
                        Thermal Coal                                8
                        Uranium                                     4
Communication Services  Telecom Services                           28
Consumer Cyclical       Advertising Agencies                       22
                        Apparel Manufacturing                      17
                        Apparel Retail                             32
                        Auto & Truck Dealerships                   14
                        Auto Manufacturers                         13
                        Auto Parts                                 35
                        Broadcasting                               19
                        Department Stores                           5
                        Entertainment                              27
                        Footwear & Accessories                     12
                        Furnishings Fixtures & Appliances          22
                        Gambling                                    9
                        Home Improvement Retail                     9
                        Leisure                                    26
                        Lodging                                     8
                        Luxury Goods                                5
                        Packaging & Containers                     21
                        Personal Services                          12
                        Publishing                                  8
                        Recreational Vehicles                      15
                        Residential Construction                   20
                        Resorts & Casinos                          17
                        Restaurants                                48
                        Specialty Retail                           41
                        Textile Manufacturing                       3
Consumer Defensive      Beverages - Brewers                         2
                        Beverages - Non-Alcoholic                  10
                        Beverages - Wineries & Distilleries         6
                        Confectioners                               4
                        Discount Stores                             9
                        Education & Training Services              14
                        Farm Products                              17
                        Food Distribution                           9
                        Grocery Stores                             10
                        Household & Personal Products              21
                        Packaged Foods                             42
                        Pharmaceutical Retailers                    5
                        Tobacco                                     5
Energy                  Oil & Gas Drilling                          7
                        Oil & Gas E&P                              70
                        Oil & Gas Equipment & Services             49
                        Oil & Gas Integrated                        4
                        Oil & Gas Midstream                        36
                        Oil & Gas Refining & Marketing             19
Financial Services      Asset Management                           86
                        Banks - Diversified                         5
                        Banks - Regional                          346
                        Capital Markets                            32
                        Credit Services                            44
                        Financial Conglomerates                     3
                        Financial Data & Stock Exchanges           10
                        Insurance - Diversified                     9
                        Insurance - Life                           17
                        Insurance - Property & Casualty            37
                        Insurance - Reinsurance                     7
                        Insurance - Specialty                      18
                        Insurance Brokers                          11
                        Mortgage Finance                           18
Healthcare              Biotechnology                             461
                        Diagnostics & Research                     61
                        Drug Manufacturers - General               12
                        Drug Manufacturers - Specialty & Generic   46
                        Healthcare Plans                            9
                        Medical Care Facilities                    37
                        Medical Devices                            93
                        Medical Distribution                        7
                        Medical Instruments & Supplies             44
Industrials             Aerospace & Defense                        44
                        Airlines                                   13
                        Airports & Air Services                     2
                        Business Equipment & Supplies               8
                        Conglomerates                               9
                        Consulting Services                        15
                        Diversified Industrials                     1
                        Electrical Equipment & Parts               35
                        Engineering & Construction                 32
                        Farm & Heavy Construction Machinery        22
                        Industrial Distribution                    16
                        Infrastructure Operations                   1
                        Integrated Freight & Logistics             14
                        Marine Shipping                             7
                        Metal Fabrication                          13
                        Pollution & Treatment Controls              8
                        Railroads                                  10
                        Rental & Leasing Services                  20
                        Security & Protection Services             17
                        Shell Companies                            54
                        Specialty Business Services                29
                        Specialty Industrial Machinery             74
                        Staffing & Employment Services             25
                        Tools & Accessories                        14
                        Travel Services                             9
                        Trucking                                   16
                        Waste Management                           12
Real Estate             REIT - Diversified                         26
                        REIT - Healthcare Facilities               16
                        REIT - Hotel & Motel                       18
                        REIT - Industrial                          17
                        REIT - Mortgage                            39
                        REIT - Office                              26
                        REIT - Residential                         18
                        REIT - Retail                              27
                        REIT - Specialty                           17
                        Real Estate - Development                   9
                        Real Estate - Diversified                   6
                        Real Estate Services                       18
Technology              Communication Equipment                    47
                        Computer Hardware                          23
                        Consumer Electronics                       13
                        Electronic Components                      31
                        Electronic Gaming & Multimedia             11
                        Electronics & Computer Distribution         8
                        Health Information Services                37
                        Information Technology Services            54
                        Internet Content & Information             29
                        Internet Retail                            21
                        Scientific & Technical Instruments         23
                        Semiconductor Equipment & Materials        24
                        Semiconductors                             49
                        Software - Application                    160
                        Software - Infrastructure                  86
                        Solar                                      12
Utilities               Utilities - Diversified                    16
                        Utilities - Independent Power Producers     2
                        Utilities - Regulated Electric             25
                        Utilities - Regulated Gas                  13
                        Utilities - Regulated Water                13
                        Utilities - Renewable                       8
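One way to act on these counts is to check which industries clear a minimum size before neutralizing by industry. The sketch below uses a handful of the counts shown above. The threshold of 30 stocks (`MIN_STOCKS`) is an arbitrary assumption for illustration, not a recommendation from this notebook.

```python
import pandas as pd

# A few (sector, industry) counts copied from the output above
counts = pd.Series({
    ("Healthcare", "Biotechnology"): 461,
    ("Healthcare", "Medical Devices"): 93,
    ("Healthcare", "Medical Distribution"): 7,
    ("Basic Materials", "Copper"): 2,
    ("Financial Services", "Banks - Regional"): 346,
})
counts.index.names = ["sector", "industry"]

# Hypothetical minimum number of stocks for a meaningful within-industry ranking
MIN_STOCKS = 30

# Keep only the industries that meet the threshold
large_enough = counts[counts >= MIN_STOCKS]

# Fraction of the listed industries that are usable at this threshold
share_usable = len(large_enough) / len(counts)

print(large_enough)
print(f"{share_usable:.0%} of these industries have at least {MIN_STOCKS} stocks")
```

An industry such as Medical Distribution, with 7 stocks, would contribute only one or two stocks to each quintile, which is too few for a stable comparison.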