Natural language processing – a new tool to decode the Fed

What is natural language processing, and how can we use it to analyse FOMC minutes and the Beige Book?

Key points

  • New quantitative analysis techniques have paved the way to explore alternative data sources. By using natural language processing techniques, economists can systematically study transcripts such as central banks’ publications, newspaper articles or company statements to identify patterns and draw conclusions.
  • We focus our study on the US economy by analysing the minutes to Federal Open Market Committee (FOMC) meetings and the ‘Beige Book’ – a qualitative summary of US economic conditions.
  • Translating these transcripts into sentiment scores, we illustrate common patterns with US GDP growth. We highlight that the latest Beige Book report, published on 15 July, points to a strong rebound in activity between the end of May and the beginning of July. We also provide extra insight into the FOMC minutes by highlighting the topics most discussed at committee meetings and their evolution over time.

Until recently, quantitative economic analysis has been mainly based on the study of hard data and survey evidence. However, new quantitative analysis techniques have paved the way to explore alternative data sources. By using natural language processing (NLP) techniques, economists can systematically study transcripts such as central banks’ publications, newspaper articles or company statements. Here we explain how we have extended our toolbox by adding this method and present some of our results. We focus on the US economy, using the minutes of Federal Open Market Committee (FOMC) meetings and the Federal Reserve’s (Fed) Beige Book. The Beige Book is published every six weeks and covers current economic conditions across the 12 Federal Reserve Districts. These transcripts have been chosen because they are important releases, have a high frequency (every six/seven weeks) and a regular format.

What is natural language processing?

In essence, natural language processing identifies key words or phrases used to describe certain situations and then counts the number of times these descriptions occur. This provides a summary statistic for how positive or negative the report or text is, which can then be compared to more traditional measures of activity. The quality of the result depends on refining the list of words or phrases included in the count.

In this application, our economic sentiment score ranks specific words from the Fed’s Beige Book into two categories – positives and negatives. We use an enhanced version of the Loughran-McDonald sentiment dictionary combined with Henry’s word list, a finance-specific dictionary. The sentiment score is calculated by totalling the word count in both categories and then normalising these with respect to the total number of sentiment-related words. This gives us a balance of positive/negative descriptions of the US economy as reflected in the Fed’s communication since 1996. The indicators echo the US business cycles as defined by GDP growth (Exhibit 1) with positive scores during booms and negative readings during slowdowns.
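As an illustration, the calculation described above could look like the following. This is a minimal sketch, not the actual implementation: it assumes the balance is (positives - negatives) / (positives + negatives), which is one common reading of "normalising with respect to the total number of sentiment-related words", and the two word lists are tiny placeholders rather than the enhanced dictionary.

```python
import re

# Placeholder word lists for illustration only (assumption); the real
# dictionary combines Loughran-McDonald and Henry's list with adjustments.
POSITIVE = {"strong", "improved", "gains", "rebound"}
NEGATIVE = {"weak", "declined", "losses", "slowdown"}


def sentiment_score(text, positive=POSITIVE, negative=NEGATIVE):
    """Return (P - N) / (P + N), or 0.0 if no sentiment word is found.

    The result lies between -1 (only negative words) and +1 (only
    positive words). The formula itself is an assumption.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    pos = sum(1 for tok in tokens if tok in positive)
    neg = sum(1 for tok in tokens if tok in negative)
    total = pos + neg
    return (pos - neg) / total if total else 0.0


example = "Activity was strong and sales improved, but hiring declined."
score = sentiment_score(example)  # two positive words, one negative word
```

With this definition a score bounded between -1 and +1 is obtained for each report, which makes readings comparable across documents of different lengths.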

Exhibit 1: Sentiment score as a proxy for GDP growth
   

Source: Fed and AXA IM Macro Research, as of July 2020

We attempt to minimise biases inherent in the methodology by making various adjustments to the raw scores. While the Loughran-McDonald dictionary is a good starting point for economic text analysis, it was created on the premise that there are “significant relationships between stock price reactions and the sentiment of news releases, as measured by word classifications”. In practice, a lot of words that are generally perceived as negative can have positive connotations, depending on their context. For example, the words “delinquencies”, “prolongation” and “unemployment” are categorised as negative. However, falling credit delinquencies, the prolongation of a fiscal stimulus plan or a reduction in unemployment rates would all be positive developments – this creates room for misrepresentation.

To fine-tune this, we have enhanced the dictionary by adjusting each category of words based on its context. For example, we added the words "contracted" and "weighs" to the negative sentiment and removed the words “unemployment” and “collaboration”. Overall, these alterations make a meaningful improvement to the sentiment scores. For instance, we include the words "upward" and "boosts" which represent 5% of the positive words used in the reports over the last 10 years respectively.
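The dictionary adjustments mentioned above can be pictured as simple set operations. The sketch below uses only the example words quoted in the text; the base lists are placeholders and the helper name `adjust_dictionary` is hypothetical.

```python
def adjust_dictionary(positive, negative, add_positive=(), add_negative=(), remove=()):
    """Return adjusted (positive, negative) word sets.

    Words in `remove` are dropped from both categories; additions are
    merged into their category. All words are lower-cased.
    """
    removed = {w.lower() for w in remove}
    new_pos = ({w.lower() for w in positive} | {w.lower() for w in add_positive}) - removed
    new_neg = ({w.lower() for w in negative} | {w.lower() for w in add_negative}) - removed
    return new_pos, new_neg


# Placeholder base lists (assumption), standing in for the full dictionary.
base_positive = {"collaboration", "gains"}
base_negative = {"unemployment", "losses"}

positive, negative = adjust_dictionary(
    base_positive,
    base_negative,
    add_positive=["upward", "boosts"],      # words cited in the text
    add_negative=["contracted", "weighs"],  # words cited in the text
    remove=["unemployment", "collaboration"],
)
```

Keeping the adjustments separate from the base lists makes it easy to rerun the scores with and without them and to see how much each change matters.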

“I know you think you understand what you thought I said but I'm not sure you realize that what you heard is not what I meant” – Alan Greenspan, Chair of the Fed, 18 Nov 2012

Central bank communication is notoriously perceived as ambiguous – what do we do with negations? Our model verifies if there are negative – or positive – words preceding (or following) a sentiment word and adjusts the count accordingly. Furthermore, the macroeconomic backdrop often changes the implications of specific topics. The Fed's objectives are to promote maximum employment and price stability, now defined as an inflation target of 2%. Thus the notion of high or low inflation is relative to this yardstick. The phrase "prices fell sharply" will have different connotations depending on where we stand relative to the target – the regime defines the sentiment related to the subject.
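One simple way to handle negations is to look at a short window of words around each sentiment word and flip its polarity when a negation term is found. The sketch below only checks the words that precede the sentiment word; the window size, the negation list and the word lists are all assumptions made for illustration.

```python
import re

# Illustrative lists only (assumption).
NEGATORS = {"not", "no", "never", "without"}
POSITIVE = {"strong", "improved"}
NEGATIVE = {"weak", "declined"}


def count_with_negation(text, window=2):
    """Count positive and negative words, flipping polarity after a negator.

    A sentiment word is treated as negated if any of the `window` tokens
    immediately before it is in NEGATORS.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = neg = 0
    for i, tok in enumerate(tokens):
        if tok not in POSITIVE and tok not in NEGATIVE:
            continue
        context = tokens[max(0, i - window):i]
        negated = any(t in NEGATORS for t in context)
        # A positive word stays positive unless negated, and vice versa.
        if (tok in POSITIVE) != negated:
            pos += 1
        else:
            neg += 1
    return pos, neg


counts = count_with_negation("Demand was not strong, but exports improved.")
```

In this example "not strong" is counted as negative and "improved" as positive, so the two cancel out rather than giving a spuriously upbeat reading.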

The target level of inflation since January 2012 allows us to identify inflation regimes and improve the quality of our sentiment indicator (Exhibit 2). However, it is widely acknowledged, including by the Fed, that the maximum level of employment is not directly measurable since it is determined by non-monetary factors that affect the structure of the labour market[1].
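The idea that the regime defines the sentiment of a price-related phrase can be sketched as a small rule. The rule below is an assumption made for illustration, not the method used for the indicator: a price move counts as positive when it brings inflation closer to the 2% target and negative otherwise.

```python
def price_move_polarity(direction, inflation, target=2.0):
    """Return +1 or -1 for a price move, given the inflation regime.

    direction: "fall" or "rise", the move described in the text.
    inflation: current inflation rate in percent.
    Assumed rule: a move towards the target is positive. Inflation
    exactly at target is treated like below target (arbitrary choice).
    """
    if direction not in ("fall", "rise"):
        raise ValueError("direction must be 'fall' or 'rise'")
    above_target = inflation > target
    moves_toward_target = (direction == "fall") == above_target
    return 1 if moves_toward_target else -1


# "Prices fell sharply" when inflation is well above target
high_regime = price_move_polarity("fall", inflation=4.0)
# The same phrase when inflation is already below target
low_regime = price_move_polarity("fall", inflation=1.0)
```

The same phrase therefore contributes with opposite signs in the two regimes, which is the behaviour described in the text.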

Exhibit 2: The inflation topic represents around 10% of Beige Book themes covered

  

Source: Datastream and AXA IM Research, as of July 2020

What does NLP tell us about the US economy?

Using the above NLP technique, we built sentiment scores for each document (Exhibit 1). Both scores show common patterns with measures of US output and, even though they are quite volatile, they are highly correlated with each other. Since 2009, we also observe a large structural break to the upside for the Beige Book score, while the minutes score now seems less volatile. We suggest that the central bank has adopted more cautious communication since the global financial crisis in 2008, but we will observe whether this behaviour continues through the current downturn.

Sentiment scores highlight a growing use of negative words before the dot-com crisis in 2001 and the global financial crisis in 2008. Scores are well-correlated with GDP figures, sometimes leading the peak (or trough) in activity. More recently, the Beige Book sentiment score has declined drastically, following rising trade tensions between China and the US and then the coronavirus pandemic.
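Whether a score leads activity can be checked by correlating the score at one date with GDP growth some periods later. The sketch below is one hedged way to do this; it assumes the two series have already been aligned to the same frequency, and the numbers in the example are made up purely to show the mechanics.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation undefined for a constant series")
    return cov / sqrt(vx * vy)


def lagged_correlation(score, gdp, lead=0):
    """Correlate the score at time t with GDP growth at time t + lead."""
    if lead < 0:
        raise ValueError("lead must be non-negative")
    if lead == 0:
        return pearson(score, gdp)
    return pearson(score[:-lead], gdp[lead:])


# Made-up series in which GDP repeats the score one period later.
toy_score = [1, 3, 2, 5, 4]
toy_gdp = [0, 1, 3, 2, 5]
lead_one = lagged_correlation(toy_score, toy_gdp, lead=1)
```

Comparing the correlation across several values of `lead` gives a rough indication of whether the score tends to move ahead of activity.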

The latest Beige Book was published on 15 July and it is quite informative about the recovery of activity. The sentiment score increased to -0.54 from -0.85. Despite a massive rebound, the score remains far from its end-2019 level. We continue to be cautious on the outlook as a recent reacceleration in the number of new US virus cases is slowing the pace of reopening and may impact the sentiment outlook after 6 July, the closing date of the current survey.

The Fed minutes’ sentiment score has been an interesting herald of Fed policy (Exhibits 3 and 4). We observe that when the policy rate was far from the zero lower bound, a sharp fall in the score was followed by rate cuts. More recently, the Fed increased the policy rate for the first time since the financial crisis. This came only in 2015, despite some large increases in sentiment scores in 2013 and 2014, followed by a large drop in sentiment across 2015. Thereafter, the Fed pursued interest rate normalisation, consistent with a gradual improvement in the sentiment score, until trade tensions picked up over 2017-18. The sentiment score fell over the second half of 2018 before the Fed announced a pause in its interest rate normalisation at the end of the year.

Exhibit 3: Sentiment score and Fed Funds Rate

  

Source: Fed and AXA IM Macro Research, as of July 2020

The sentiment score seems to be a reliable guide to the policy rate when it is the main tool of monetary policy. However, the score has been quite volatile even as Fed policy stabilised at the effective lower bound. Does the score provide as good a guide to unconventional policy, such as Treasury purchases under the quantitative easing (QE) programme? The dotted lines in Exhibit 4 highlight rapid drops in the sentiment score. In the initial years of QE, such declines in the sentiment score were followed by a policy response, either rapid increases in net asset purchases (2009, 2010 and 2012), or the Maturity Extension Program in 2011. However, subsequent declines in 2013 and 2015 did not see further unconventional policy easing.
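Rapid drops of the kind marked in Exhibit 4 could be flagged mechanically. The sketch below marks any reading that falls by more than a chosen threshold relative to the previous one; the threshold of 0.3 and the example values are assumptions, as the text does not say how the drops were identified.

```python
def rapid_drops(scores, threshold=0.3):
    """Return the positions where the score fell by more than `threshold`
    compared with the previous reading."""
    return [
        i
        for i in range(1, len(scores))
        if scores[i - 1] - scores[i] > threshold
    ]


# Made-up readings, for illustration only.
toy_scores = [0.2, 0.25, -0.2, -0.1, -0.5]
flagged = rapid_drops(toy_scores)
```

A rule like this makes the choice of episodes reproducible, although the threshold would need to be calibrated to the volatility of the actual series.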

Exhibit 4: Sentiment score during the zero lower bound

  

Source: Fed and AXA IM Macro Research, as of June 2020

What's the hot topic right now?

Another interesting aspect of the NLP method is the ability to analyse the topics most covered by the Fed. To achieve this, we studied the residual of processed words – relevant words unrelated to sentiment. We looked at these by reviewing the 30 most frequent words used and measuring the frequency of pre-defined topics.

In Exhibit 5, a score above zero means the word has been used more frequently in the June report than in the March report. For example, “policy”, “GDP”, “unemployment”, “pandemic” and “uncertainty” have been used more frequently in the latest report.
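A comparison of this kind can be sketched as a difference in relative word frequencies between two reports. The exact definition of the score is not given in the text, so the measure below (share of words in the newer report minus share in the older one) is an assumption, and the example texts are made up.

```python
import re
from collections import Counter


def relative_freq(text, exclude=frozenset()):
    """Share of each word among all retained words in `text`."""
    tokens = [t for t in re.findall(r"[a-z]+", text.lower()) if t not in exclude]
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: c / total for word, c in counts.items()} if total else {}


def frequency_change(new_text, old_text, top_n=30, exclude=frozenset()):
    """Change in relative frequency for the `top_n` words of the newer text.

    `exclude` can hold stop words and sentiment words, so that only the
    residual, topic-related vocabulary is compared.
    """
    new = relative_freq(new_text, exclude)
    old = relative_freq(old_text, exclude)
    top = sorted(new, key=new.get, reverse=True)[:top_n]
    return {word: new[word] - old.get(word, 0.0) for word in top}


# Made-up texts, for illustration only.
change = frequency_change(
    "pandemic pandemic policy growth",
    "policy growth growth growth",
)
```

A positive value means the word takes up a larger share of the newer report, a negative value a smaller one, and zero no change.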

Exhibit 5: Minutes summary in 30 words

  

Source: Fed and AXA IM Macro Research, as of June 2020

This kind of instantaneous picture is extremely helpful to identify unusual topics such as mentions of trade tensions or the pandemic. Furthermore, we can predefine some topics and follow them over time.

Exhibit 6: “Hot topic” coverage over time

  

Source: Fed and AXA IM Macro Research, as of June 2020

Exhibit 6 also helps illustrate which categories Fed members focus on during FOMC meetings. “Economic conditions” is still the most discussed topic, and it has increased in dominance over the last two meetings, in line with an increase in mentions of the “pandemic”. “Inflation” as a topic remains highly discussed, even if we observe a declining reference to inflation targeting.
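Tracking pre-defined topics can be sketched as counting keyword hits per topic and expressing them as shares of all topic hits in a document. The topic names follow the text, but the keyword lists below are hypothetical and far shorter than any realistic set.

```python
import re
from collections import Counter

# Hypothetical keyword lists (assumption), for illustration only.
TOPICS = {
    "economic conditions": {"activity", "growth", "gdp", "spending"},
    "inflation": {"inflation", "prices"},
    "pandemic": {"pandemic", "coronavirus", "virus"},
}


def topic_shares(text, topics=TOPICS):
    """Return each topic's share of all topic keyword hits in `text`."""
    tokens = re.findall(r"[a-z]+", text.lower())
    hits = Counter()
    for tok in tokens:
        for topic, keywords in topics.items():
            if tok in keywords:
                hits[topic] += 1
    total = sum(hits.values())
    return {topic: (hits[topic] / total if total else 0.0) for topic in topics}


shares = topic_shares(
    "Inflation stayed low while growth slowed as the pandemic hit spending."
)
```

Applying `topic_shares` to each set of minutes in date order gives one share per topic per meeting, which is the kind of series plotted in a coverage-over-time chart.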

Future study

The above analysis illustrates a novel approach to data analytics, and allows us to apply a supplementary method for judging economic performance and gauging monetary policy in the US. We also provide an interesting way of tracking the prevalence of certain discussion topics at FOMC meetings. We will follow these relationships with interest over the coming months.

More broadly, NLP opens up a wide range of alternative measures for many aspects of different economies, with immediate application to other key central bank publications as well as more general qualitative reports.

[1] In the FOMC’s June 2020 Summary of Economic Projections, a longer-run normal rate of unemployment between 3.5% and 4.7% has been estimated.