Visualising the health of the global economy with Python
Part II of a series on building a free, automated dashboard of real-time macroeconomic indicators
In part I of this series we created an automated process for sourcing and wrangling economic data. The sources ranged from convenient ones such as the Quandl API, through the slightly more challenging data sets of the Reserve Bank of Australia, to really difficult ones such as the German Bundesbank.
In this part of the series we will perform some simple analyses and mainly focus on visualising the data we sourced and wrangled before. The main tools will be Python libraries such as Matplotlib and Seaborn.
The full code for both parts of this series can be found on my GitHub.
Both of these libraries can be imported with two lines of code as shown below. We also set a few parameters underneath, e.g. selecting the colour scheme “seaborn-bright”. Seaborn works well with Matplotlib because it is built on top of it as a simpler, high-level interface. We also import the other dependencies from part I of the series, such as pandas, which we will need later for manipulating some data.
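A minimal sketch of such a setup cell follows; it is not the original script. One assumption worth flagging: the style that older Matplotlib releases call “seaborn-bright” is named “seaborn-v0_8-bright” from Matplotlib 3.6 onwards, so the sketch picks whichever name is installed and falls back to the default style otherwise. The variable names `candidates` and `style_name` are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display; drop it in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Newer Matplotlib versions prefix the Seaborn styles with "seaborn-v0_8-".
candidates = ["seaborn-v0_8-bright", "seaborn-bright"]
style_name = next((s for s in candidates if s in plt.style.available), "default")

# Apply the colour scheme to every chart created from here on.
plt.style.use(style_name)
print("Using style:", style_name)
```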
Basic plotting of line charts
We start with some basic line charts of market prices of inflation expectations. Inflation is one of the most important macroeconomic indicators alongside growth and unemployment. Tracking forward-looking time series such as the indicators below is often more instructive than looking at last month’s actual inflation gauges (e.g. CPI, PCE). Moreover, these time series provide a “real-time” insight rather than just one data point per month. Here’s a brief explanation from the St. Louis Fed’s FRED:
5-Year, 5-Year Forward Inflation Expectation Rate
This series is a measure of expected inflation (on average) over the five-year period that begins five years from today.
10-Year Breakeven Inflation Rate
The breakeven inflation rate represents a measure of expected inflation derived from 10-Year Treasury Constant Maturity Securities and 10-Year Treasury Inflation-Indexed Constant Maturity Securities. The latest value implies what market participants expect inflation to be in the next 10 years, on average.
The first section of the code will look familiar to those who read part I of this series. The second section, from line 11, defines the start date for the chart. This date is used in line 13, which slices the dataframe from ‘20140101’. An end date could also be added, e.g. end = ‘20190702’, so that the dataframe would be sliced in this way: US_infl_exp_download.loc[start:end]. If no end date is specified, the date range runs from the start date all the way to the last available data point.
Lines 14 and 15 set the size of the chart and give it a title. Line 16 creates the actual plot, while line 17 uses the Seaborn library to remove the borders of the chart, except for the bottom one, i.e. the x-axis. There are various ways of removing borders with Matplotlib methods; however, they are more complicated than using sns.despine (Seaborn).
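The slicing behaviour can be tried out on its own. The sketch below uses a synthetic stand-in for US_infl_exp_download (the column names and values are made up, since the real data comes from the download step in part I) and compares the open-ended slice with the bounded one.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the downloaded data; the values are invented for illustration.
dates = pd.date_range("2013-01-01", "2019-12-31", freq="B")
US_infl_exp_download = pd.DataFrame(
    {
        "5y5y forward": np.linspace(2.5, 1.8, len(dates)),
        "10y breakeven": np.linspace(2.4, 1.7, len(dates)),
    },
    index=dates,
)

start = "20140101"
end = "20190702"

# Open-ended slice: from the start date to the last available data point.
from_start = US_infl_exp_download.loc[start:]

# Bounded slice: label-based slicing includes both the start and the end date.
window = US_infl_exp_download.loc[start:end]

print(from_start.index[0].date(), "to", from_start.index[-1].date())
print(window.index[0].date(), "to", window.index[-1].date())
```

Because `.loc` slices by label rather than by position, the end date itself is part of the result, unlike ordinary Python list slicing.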
The final line of code has various uses but from a practical point of view it is mainly included to remove unnecessary text from the chart output that Python creates when plotting.
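The plotting steps described above could look roughly like the following sketch. It is not the original script: the data is synthetic, the figure size and title are placeholders, and it uses Matplotlib's object-oriented interface, which may differ from the calls in the original listing.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic random-walk series standing in for the two inflation measures.
rng = np.random.default_rng(0)
dates = pd.date_range("2014-01-01", periods=500, freq="B")
infl_exp = pd.DataFrame(
    {
        "5y5y forward": 2.2 + rng.normal(0, 0.01, len(dates)).cumsum(),
        "10y breakeven": 2.0 + rng.normal(0, 0.01, len(dates)).cumsum(),
    },
    index=dates,
)

# Set the size of the chart and give it a title.
fig, ax = plt.subplots(figsize=(12, 6))
ax.set_title("US inflation expectations (synthetic data)")

# Draw one line per column.
for column in infl_exp.columns:
    ax.plot(infl_exp.index, infl_exp[column], label=column)
ax.legend()

# despine removes the top and right borders by default; left=True also
# removes the left one, so only the bottom border (the x-axis) remains.
sns.despine(ax=ax, left=True)
```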
Below is the result of our script.
Adding moving averages
One simple addition to line charts is a moving average, which is especially useful when a time series is noisy. We use one such dataset below, the volatility index of the S&P 500 (VIX). Simply put, it measures investors’ fear that the stock market will decline drastically. If the index spikes, it means there is more demand to buy downside protection in the form of (put) options. Since it’s a very noisy data set, it makes sense to look at longer-term moving averages that smooth out short-term spikes.
We can do that with only one additional line of code (see line 5 below). This calculates the moving average (MA) of the VIX with a lookback period of 30 days. The corresponding chart is right below the code and the MA is the green line.
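As a rough illustration of that one-line calculation, the sketch below applies pandas' `rolling(...).mean()` to a synthetic, noisy series. The column names `VIX` and `MA30` and the data itself are assumptions, not taken from the original script.

```python
import numpy as np
import pandas as pd

# Synthetic noisy series standing in for the VIX; values are invented.
rng = np.random.default_rng(42)
dates = pd.date_range("2018-01-01", periods=250, freq="B")
vix = pd.DataFrame({"VIX": 15 + rng.gamma(2.0, 2.0, len(dates))}, index=dates)

# The one additional line: a moving average over the last 30 observations.
# The first 29 rows are NaN because the window is not yet full.
vix["MA30"] = vix["VIX"].rolling(window=30).mean()

print(vix.tail())
```

Note that the window counts rows, so on daily market data a window of 30 means 30 trading days rather than 30 calendar days. Calling `vix.plot()` afterwards would draw both lines on one chart.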
Bar charts are often used to represent and compare categorical values, e.g. yield changes for different tenors of German bonds. Bar charts should not be confused with histograms, which show distributions of variables where ranges of data are often grouped into bins or intervals.
With the code below we create a self-updating dataframe of yield data from 1 month ago, 3 months ago etc. and calculate the differences between those values and today's. The key operation is to slice the existing dataframe. In line 6, for example, we get the yield data from a year ago, so we go back 252 data points rather than 365. A year only has around 250 trading days because no data points are produced on weekends or public holidays.
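The idea of slicing relative to the latest row can be sketched as follows. The tenors, the synthetic yields and the `lookbacks` dictionary are assumptions; only the 252-row look-back for one year comes from the text, while 21 and 63 rows are common approximations for one and three months.

```python
import numpy as np
import pandas as pd

# Synthetic daily yields for a few tenors; values are invented for illustration.
rng = np.random.default_rng(1)
dates = pd.date_range("2018-01-01", periods=400, freq="B")
tenors = ["2Y", "5Y", "10Y", "30Y"]
yields = pd.DataFrame(
    rng.normal(0, 0.02, (len(dates), len(tenors))).cumsum(axis=0) + [-0.6, -0.3, 0.4, 1.0],
    index=dates,
    columns=tenors,
)

# Look-back periods expressed in trading days (rows), not calendar days.
lookbacks = {"1M": 21, "3M": 63, "1Y": 252}

# Negative positions count from the end, so the result updates itself
# whenever new rows are appended to the dataframe.
latest = yields.iloc[-1]
changes = pd.DataFrame(
    {label: latest - yields.iloc[-1 - n] for label, n in lookbacks.items()}
)

print(changes.round(2))
```

The result has one row per tenor and one column per look-back period, so `changes.plot(kind="bar")` would produce a grouped bar chart of the yield changes.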
The output of the script shows that bond yields have fallen across the entire yield curve — usually not a good sign for the economy.
A very similar way to show these developments is to plot the yield curve with a modified line chart. We use the US Treasury yield data here because it has moved in interesting ways recently. We have to do similar slicing operations to extract the data as in the German bond example above. We also use a different set of parameters here for fonts and font sizes (lines 8–11) to show how these parameters can be set globally, not just for one particular plot. We can also run visualisations directly from the pandas dataframe (lines 13–14), which uses Matplotlib as the underlying library. Finally, we use Seaborn in the final line of code to remove the chart’s borders, apart from the x-axis.
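A hedged sketch of these three ideas (global font settings, plotting straight from the dataframe, removing borders) is shown below. The specific rcParams values, tenors and yields are illustrative assumptions rather than the values used in the original script.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; not needed in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Global settings: every plot created afterwards picks these up.
plt.rcParams["font.family"] = "sans-serif"
plt.rcParams["font.size"] = 12
plt.rcParams["axes.titlesize"] = 14
plt.rcParams["axes.labelsize"] = 12

# Illustrative (made-up) yields: one row per tenor, one column per snapshot.
curves = pd.DataFrame(
    {
        "Today": [2.10, 1.85, 1.80, 2.00, 2.50],
        "1Y ago": [2.00, 2.50, 2.70, 2.85, 3.00],
    },
    index=["3M", "2Y", "5Y", "10Y", "30Y"],
)

# DataFrame.plot uses Matplotlib under the hood and returns the Axes object.
ax = curves.plot(figsize=(10, 5), marker="o", title="Yield curve (illustrative values)")
ax.set_ylabel("Yield (%)")

# Keep only the bottom border, i.e. the x-axis.
sns.despine(ax=ax, left=True)
```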
The chart below shows the changes in yields in a different way compared to the bar chart in the previous section. Both are useful but if the absolute level of yields is essential then using the yield curve below is preferable to the bar chart that only shows the differences.
Creating multiple plots that are shown as part of the same overall chart can be done very easily. Much of the below code will look familiar as it is very similar to the above. The only significant difference is in line 7 where the parameter subplots is given the argument True and layout is set to (5,2), the result of which can be seen immediately below.
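The effect of those two parameters can be reproduced with a small, self-contained sketch. The ten tenor names and the synthetic yields are assumptions; the `subplots=True` and `layout=(5, 2)` arguments are the ones described above.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; not needed in a notebook
import numpy as np
import pandas as pd

# Ten synthetic yield series, one per tenor; values are invented.
rng = np.random.default_rng(7)
dates = pd.date_range("2018-01-01", periods=300, freq="B")
tenors = ["6M", "1Y", "2Y", "3Y", "5Y", "7Y", "10Y", "15Y", "20Y", "30Y"]
yields = pd.DataFrame(
    rng.normal(0, 0.02, (len(dates), len(tenors))).cumsum(axis=0),
    index=dates,
    columns=tenors,
)

# subplots=True draws each column in its own panel;
# layout=(5, 2) arranges the ten panels in 5 rows and 2 columns.
axes = yields.plot(subplots=True, layout=(5, 2), figsize=(12, 14), sharex=True)

print(axes.shape)
```

The layout must provide at least as many panels as there are columns, so ten series fit exactly into a 5 by 2 grid.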
Correlation heat maps
Heat maps are a great way of visualising the Pearson correlation coefficient between each variable and every other variable in our dataset for a specified period of time. We want to compare heat maps for two different periods: the first one is from early 2018 until the end of Q1 2019. The second period is the last quarter of available data, April to June 2019. We set those date ranges below.
Other parts of the script will look familiar. We generate titles for each chart in lines 13 and 34 and adjust their position along the y-axis. We also generate subtitles for each chart that display the period of time for which the heat map is calculated. For the subtitle of the first chart we use Python’s old string formatting style (see line 14) and for the subtitle of the second chart we use the new style with curly brackets (see line 35). Both work well and it’s ultimately a matter of preference which you want to use.
The heat map itself is generated with Seaborn in lines 16–20 and 37–41. To make sure we display the full range of the correlation coefficient from -1 to 1 (instead of showing a narrower range that often makes the chart difficult to decipher) we set the parameter vmin to -1. We choose the colour map coolwarm, which has a nice range from blue to red. When the argument annot is set to True, the individual correlation coefficient for each cell of the heat map is displayed in the chart. The last argument, annot_kws, sets the font size for these annotations.
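To make the two formatting styles concrete, here is a small sketch. The exact date strings and the wording of the subtitles are assumptions chosen to match the two periods described above.

```python
# Hypothetical date ranges matching the two periods in the text.
start1, end1 = "20180101", "20190331"
start2, end2 = "20190401", "20190630"

# Old style: the % operator with %s placeholders.
subtitle1 = "Period: %s to %s" % (start1, end1)

# New style: str.format with curly brackets.
subtitle2 = "Period: {} to {}".format(start2, end2)

print(subtitle1)  # Period: 20180101 to 20190331
print(subtitle2)  # Period: 20190401 to 20190630
```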
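The heat map call described above might look like the following sketch for the second period. The data is synthetic, the correlation is computed on yield levels (an assumption, as the original script is not shown), and `vmax=1` is added here for symmetry even though the text only mentions vmin.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic yields for a handful of tenors; values are invented.
rng = np.random.default_rng(3)
dates = pd.date_range("2018-01-01", "2019-06-28", freq="B")
tenors = ["6M", "1Y", "5Y", "10Y"]
yields = pd.DataFrame(
    rng.normal(0, 0.02, (len(dates), len(tenors))).cumsum(axis=0),
    index=dates,
    columns=tenors,
)

# Restrict the data to one period, then compute pairwise Pearson correlations
# (Pearson is the default method of DataFrame.corr).
period = yields.loc["20190401":"20190630"]
corr = period.corr()

fig, ax = plt.subplots(figsize=(8, 6))
sns.heatmap(
    corr,
    vmin=-1,                 # show the full range of the coefficient
    vmax=1,
    cmap="coolwarm",         # blue for negative, red for positive values
    annot=True,              # print the coefficient in every cell
    annot_kws={"size": 10},  # font size of the annotations
    ax=ax,
)
# y > 1 moves the title slightly above the chart.
ax.set_title("Correlation of yields (synthetic data)", y=1.05)
```

Repeating the same steps with the first date range gives the second heat map for comparison.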
Both charts show very interesting results. From early 2018 until the end of the first quarter of 2019, short-term bond yields (6 months to 1 year) showed negative correlations to longer-dated bonds. However, since April 2019 all tenors have been strongly positively correlated. That means short-term yields (“the front of the yield curve”) move in tandem with long-term yields. This may be a consequence of poor economic data in Europe that could cause the ECB to continue with monetary easing, pushing down yields across the entire yield curve.
In the next part of this series we’ll show how to turn these visualisations into an interactive format, so users can select and manipulate data and charts via a browser-based UI.