Historical Time Series. Data Quality Issues and Solutions.

Issues.

Incorrect metadata.

Questionable index construction methods.

Unjustified faith in somebody else's analytics.

Errors typical for manual data entry.

Price changes that are not market moves.

Misinterpreting documents describing corporate actions.

Data points unskillfully manufactured by vendors.

Missing dividends.

Solutions and Recommendations.

Have good error-detection and error-correction software.

Protect your data against modification by operators lacking the necessary skills.

Pay attention to correlations across asset classes.

Time zones: account for non-contemporaneous quotes.

Beware of potentially material differences between composite and primary exchange quotes.

==========

Incorrect metadata. In the example below, the error arises from the creator of the index and the data vendor using different reference tables.

This example of incorrect metadata illustrates the ability of an experienced practitioner to see the root cause of a problem without the benefit of seeing the operations of either the index administrator or the data vendor.

The screenshot below shows Country=Slovakia for the KOSPI tracker index. That designation is clearly incorrect because KOSPI is a Korean index.

The reason for the error: the index administrator tells Bloomberg that the country is South Korea, or "SK". On its side, Bloomberg looks up "SK" in the ISO table of country codes. In that table (ISO 3166-1 alpha-2), the code for South Korea is "KR", while "SK" is reserved for Slovakia.
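
The mechanics of the mismatch can be sketched in a few lines of Python. The two lookup tables below are simplified illustrations, not the actual reference data of the index administrator or of Bloomberg.

```python
# The administrator's in-house convention uses "SK" for South Korea.
ADMIN_COUNTRY_CODES = {"South Korea": "SK"}

# The vendor resolves two-letter codes against ISO 3166-1 alpha-2,
# where "SK" means Slovakia and South Korea is "KR".
ISO_3166_ALPHA2 = {
    "KR": "South Korea",
    "SK": "Slovakia",
}

code = ADMIN_COUNTRY_CODES["South Korea"]  # the administrator sends "SK"
print(ISO_3166_ALPHA2[code])               # the vendor decodes it as "Slovakia"
```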

Questionable index construction methods.

At one of the major banks, the JCMDCB Index was at one time the most material time series used in historical simulation. A closer look revealed that during the credit crisis the behavior of the index seemed to be unrelated to the market. As further investigation established, for the entire 2008 calendar year this Global Emerging Markets index comprised just two constituents: one Egyptian bond and one Mexican bond.

A sophisticated analytical engine that is in fact a crude manual operation.

Bloomberg Fair Value curves are expensive groups of time series, constructed by proprietary Bloomberg analytics. The details of the construction are undisclosed. Supposedly, the analytical engine performing the construction is a sophisticated software package.

We have investigated a sample of the data and will demonstrate the results on about one month of history of the Korean Government yield curve. The data for one of the tenors, 6 months, is shown below. It turns out that Bloomberg's analytical engine does not even try to mark individual tenors on the curve. Instead, it subjects all the tenors to the same daily move (a parallel shift). In real life, the curve does not behave like that: short-term rates might move by one amount while long-term rates move by another. Curve movements are much more complicated than simply going up or down; the shape of the curve changes as well.

So if the Bloomberg analytics performs nothing but parallel shifts while in reality the shape of the curve changes every day, after a while the shape of the Bloomberg curve becomes materially different from the real curve. At that point, a human operator intervenes, manually adjusting certain tenors to bring the curve back into shape.

How do we know that the adjustments are both infrequent and manual, performed by a human rather than a computer? To answer that question, let's start by putting the data for all the tenors in a table.

Now let's calculate the daily changes in the value of each tenor, and separate the daily changes into a part common to all the tenors (parallel shift), and "the rest":

Suppose someone is helping you bring heavy bags to your car, and you want to tip that person. How much would you give? You might tip a dollar, or five dollars, but it is very unlikely that you would tip something like 16 cents. Unless you have a pocket full of change you want to get rid of, the size of your tip is most likely to be a round number.

The same happens here. While the parallel shifts are small fractional numbers, "the rest" is a set of large round numbers.
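
A minimal sketch of the decomposition, assuming the tenor history sits in a pandas DataFrame with dates as rows and tenors as columns. The numbers are made up for illustration, with a hand-adjusted 6-month tenor on the last day:

```python
import pandas as pd

# Hypothetical yields (%) for a few tenors; dates as rows, tenors as columns.
curve = pd.DataFrame(
    {"6M": [3.52, 3.55, 3.58, 3.86],
     "1Y": [3.61, 3.64, 3.67, 3.70],
     "3Y": [3.90, 3.93, 3.96, 3.99]},
    index=pd.to_datetime(["2011-05-02", "2011-05-03", "2011-05-04", "2011-05-05"]),
)

daily_changes = curve.diff().dropna()

# The component common to all tenors, approximated here by the cross-tenor median.
parallel_shift = daily_changes.median(axis=1)

# "The rest": whatever is left after removing the common shift.
residual = daily_changes.sub(parallel_shift, axis=0)

print(parallel_shift)  # small fractional numbers on every date
print(residual)        # mostly zeros; a manual adjustment stands out as a round number
```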

For the Bloomberg Fair Value Korean Government yield curve, similar evidence of manual operation was noted in other investigated periods as well. Here is another part of history for the short end of the curve:

The data:

Daily changes decomposed into "parallel shift" and "manual adjustments":

Data entry errors in time series that are not supposed to be subject to such errors.

We have already shown that Bloomberg Fair Value curves are not produced entirely by an analytical engine but are subject to crude manual adjustments by human operators. On top of that, the Bloomberg Fair Value time series have errors that are typical of a manual data entry process, as opposed to an automated feed.

On June 29, 2012, the 3-year tenor of the Bloomberg Fair Value C366 quasi-government curve has a spike.

The spike in the 3-year tenor looks suspicious, especially because the neighboring 2-year and 4-year tenors have no spike on that day.

The spike is indeed an error. Apparently, someone was typing in the tenor names as well as the values and mistakenly entered "3Y" instead of "3M" for the 4.1745 value, so the 4.1745 3-month yield ended up assigned to the 3-year tenor.
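
A check of this kind is easy to automate. Below is a minimal sketch, assuming the curve history is a pandas DataFrame with dates as rows and tenors as columns; the function name and the 0.5-point threshold are illustrative choices, not any vendor's actual logic.

```python
import pandas as pd

def flag_spikes(curve: pd.DataFrame, threshold: float = 0.5) -> pd.DataFrame:
    """Return a boolean mask of suspicious points: a tenor moved by much
    more than the curve as a whole on the same day, and the move was
    undone the next day.  The threshold is in yield points."""
    moves = curve.diff()
    # Move of each tenor net of the common (parallel-shift) component.
    excess = moves.sub(moves.median(axis=1), axis=0)
    spike = excess.abs() > threshold                          # unusually large own move...
    reversal = (excess * excess.shift(-1)) < -threshold ** 2  # ...reversed the next day
    return spike & reversal
```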

Material price changes that have nothing to do with market moves.

According to Bloomberg, on May 13, 2019 the 5-year CDS spreads of Deutsche Bank experienced an unprecedented contraction, tightening from 180 basis points (bps) to 100 bps. In the prior twelve years, Deutsche Bank spreads had tightened by more than 40 bps only twice. The first time was during the credit crisis, on Oct. 13, 2008, the best market day in recent history. The second was on May 10, 2010, two business days after the Flash Crash, when the European governments announced relief measures for the unfolding European debt crisis. The 80 bps tightening on May 13, 2019, a quiet day, could not possibly be a market move.

Prior to May 11, 2019, the legal definition of Senior Unsecured (SNRFOR) spreads made them little different from Loss Absorbing Capacity (SNRLAC) spreads; this was specific to German banks. The change in the definition was to be implemented over the weekend of May 11, 2019.

On Monday, May 13, 2019, Markit Partners did not report SNRLAC spreads. On Tuesday, May 14, 2019, reporting of SNRLAC spreads resumed. As of May 13, the SNRFOR spreads were reported and were significantly tighter than SNRLAC.
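
A move of this size is trivial to flag automatically. Here is a minimal sketch, assuming a date-indexed pandas Series of spreads in basis points; the function name and the lookback window are our illustrative choices.

```python
import pandas as pd

def unprecedented_moves(spreads: pd.Series, lookback_years: int = 12) -> pd.Series:
    """Flag daily moves larger than anything observed over the lookback window."""
    moves = spreads.diff().abs()
    # Largest move previously observed, excluding the current day.
    prior_max = moves.shift(1).rolling(f"{365 * lookback_years}D").max()
    return moves[moves > prior_max]
```

An 80 bps move on a quiet day stands far above a prior twelve-year maximum of roughly 40 bps, so even this crude screen would have caught the error.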

Misinterpreting documents describing corporate actions.

On a number of occasions, Bloomberg has missed a corporate action, such as a dividend payment. Below are the statistics for a Vanguard Mid-Cap index as originally shown by Bloomberg.

Here is the same index after we notified Bloomberg and the error was fixed.

For CMT CN Equity, Bloomberg had an incorrect adjustment factor for a corporate action because the corporate event was misinterpreted: the price of the rights issued on Dec. 23, 2015 was incorrectly treated as "per share", while in fact it was "per right", one right being equal to 4.3386 common shares.

Bloomberg corrected the error after being notified about it.

Data points unskillfully manufactured by vendors.

What happens if an instrument did not trade because the exchange was closed for a holiday or for some other reason? In some cases, the price is simply unreported. For example, during Superstorm Sandy the U.S. equity market was closed for two days. Consequently, the SPX Index, as well as almost all other equities, has a two-day gap. One exception is the Russell family of indices: for the two days in question, they were reported as having the same value as the day before.

Data vendors might try something more sophisticated. Sometimes, these efforts result in filled values that are not reasonable.

If a filled value is good, it should blend with the rest and not be easily discoverable. Conversely, if it stands out, and someone can take your dataset and tell which points are real and which are filled by you, your filling methods might need improvement.

For example, by looking at the reported closing values of the MSCI Brazil index (MXBR Index) alongside those of other equities, we were pretty sure that the 2008-11-20 closing value of 1,416.27 was not real.

The conclusion was quickly verified by checking the trading volume information. As can be seen below, the trading volume on 2008-11-20 was zero, which means MSCI had come up with the 1,416.27 value by employing some filling technique. The reason for zero trading volume: a local holiday in Brazil.

To see why the 1,416.27 value is unrealistic, let's compare the relative performance of the MXBR Index and the Brazil ETF that was trading in the U.S. on that day:

Given the task of filling the missing 2008-11-20 point, we would have produced a value about 200 points lower than the one reported by MSCI.

Note: some banks protect themselves against poorly manufactured data points on holidays and other non-trading days by automatically rejecting quotes associated with zero trading volume. As a side benefit, this practice also rejects IPO prices, so the large jumps often observed on the very first day of trading do not enter the historical simulation.
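
A minimal sketch of that rejection rule, assuming daily quotes in a pandas DataFrame with close and volume columns. The numbers mimic the MXBR situation and are illustrative.

```python
import pandas as pd

# Illustrative daily quotes; the middle row mimics a vendor-filled holiday
# value reported with zero trading volume.
quotes = pd.DataFrame(
    {"close": [1448.90, 1416.27, 1233.60],
     "volume": [3_100_000, 0, 2_800_000]},
    index=pd.to_datetime(["2008-11-19", "2008-11-20", "2008-11-21"]),
)

# Reject quotes associated with zero trading volume; the dropped dates
# become gaps, to be repaired later by proper gap-filling software.
clean = quotes.loc[quotes["volume"] > 0]
print(clean)
```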

Missing dividends.

A missing dividend is seemingly a mundane problem, but it can be pretty disquieting when it affects your own retirement portfolio because the portfolio's benchmark was involved.

Here, the Vanguard Variable Insurance Fund - Mid-Cap Index Portfolio tracks the performance of the CRSP US Mid Cap Index and serves as the benchmark for certain retirement funds. The index performance statistics shown below are incorrect because Bloomberg missed a large dividend.

We notified Bloomberg about the error, and after correcting it Bloomberg is showing much improved statistics:

Please be aware of the following: if you own a dividend-paying instrument, the actual payment of the dividend might come days or even weeks after the instrument goes ex-dividend. During that period, the cash value of your account is going to be lower than the accrued value. So if we are interested in the cash value, there is nothing wrong with a sudden drop in the account value on the ex-dividend date.

Need for good error-detection and error-correction software.

Experienced practitioners often have an intuition as to where the errors might be hiding. Still, even they can be overwhelmed by the sheer amount of data to be analyzed. It is therefore essential to have good software that can detect errors and, once the errors have been removed, repair the dataset by filling those points, as well as the points missing from the original data.

Some banks run programs performing no more than rudimentary data quality checks. Some firms do not do even that. There are large banks that claim to have implemented comprehensive and sophisticated outlier detection and gap-filling software, while in fact the implementation is so wanting that it cannot detect even the most obvious errors, such as the Deutsche Bank spread error on 2019-05-13, which is easily detectable by visual review.

To illustrate what error-correction software might be capable of, we took seven weeks of history of the SPX Index, four major banks, and IBM (the pink table on the left). Next, we removed some data points (the yellow table on the right).

We then gave the yellow table to a gap-filling program and asked it to find the maximum likelihood values for all the missing points. The green table on the right is the output.

In the input data (the yellow table), one of the days is Labor Day, 2011-09-05. On that day, all the prices are missing for real. Since no information whatsoever was provided for Labor Day, the program's best guess for the filled values on 2011-09-05 is the geometric average of the day before and the day after.
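
That guess is easy to reproduce. A minimal sketch, with illustrative closing prices standing in for the day before and the day after the holiday:

```python
import numpy as np

def fill_isolated_gap(prev_close: float, next_close: float) -> float:
    """Geometric average of the surrounding closes: with no information
    for the missing day, this splits the log-return evenly across the gap."""
    return float(np.sqrt(prev_close * next_close))

# Illustrative closes for the day before and the day after Labor Day:
print(fill_isolated_gap(1173.97, 1165.24))  # ~1169.6
```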

You might be thinking that for such a small dataset the results are too good to be true. For example, the true closing price for JPM on 2011-09-01 is $36.30, and the program's guess of $36.31 is within one cent of the true value. Such good performance on a small dataset is due to the period selection: the U.S. downgrade of Aug. 8, 2011 and the period around it were a time of crisis. In such times, correlations tend to be materially higher than usual, and with unusually high correlations the task of gap-filling becomes significantly easier.

Protect your data against modification by operators lacking the necessary skills.

Unskilled operators can do a lot of damage to the data. In the discussion of the Deutsche Bank spread error on 2019-05-13, we mentioned that between the beginning of 2007 and the beginning of 2020, the 5-year CDS spread of Deutsche Bank moved by more than 40 basis points on just three occasions. On all three occasions, it was a tightening. Two of the three were real moves, in agreement with the rest of the market. One of the three was an error, originating at Markit Partners.

At one large bank, both real moves had been removed from history by human operators, who deemed both moves anomalous. At the same time, the operators retained the largest of the three moves, the only one that is erroneous. Ironically, the outlier detection software at that bank did not flag the enormous erroneous move either. That made the treatment of all three large moves incorrect: the good moves were thrown away while the bad move was retained.

Pay attention to correlations across asset classes.

Some banks analyze their equity and credit datasets separately. In doing so, they fail to fully utilize the information coming from company-specific events that simultaneously affect the equity and the credit of a company.

For example, during the emissions scandal, the CDS spreads of Volkswagen widened dramatically. The degree of the widening is well corroborated by the degree of decline in the equity price. The chart below displays the CDS spread and the inverse equity price together.
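
Such corroboration can be checked mechanically. A minimal sketch, assuming date-indexed pandas Series for the CDS spread and the equity price; the function name is our own.

```python
import pandas as pd

def credit_equity_corroboration(cds_bps: pd.Series, equity_px: pd.Series) -> float:
    """Correlation between daily changes in the CDS spread and in the
    inverse equity price.  Company-specific stress should move both the
    same way; a large spread move with no echo in equity is suspect."""
    inv_px = 1.0 / equity_px
    joined = pd.concat({"cds": cds_bps.diff(), "inv_px": inv_px.diff()}, axis=1).dropna()
    return joined["cds"].corr(joined["inv_px"])
```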

Time zones: account for non-contemporaneous quotes.

Normally, the correlation between the SPX Index and the NKY Index is over 70%. To retain that high correlation, the closing prices must be placed in their corresponding time zones. If the closing prices are viewed simply as daily quotes, without any regard to the difference in closing times, the correlation would be about 20% or lower. In fact, the correlation of the NKY Index to the previous day's SPX Index is usually over 50%, much higher than the correlation to the same-day SPX.
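
A minimal sketch of the lagged-correlation check, assuming date-indexed pandas Series of closing levels; the function name and the use of log returns are our illustrative choices.

```python
import numpy as np
import pandas as pd

def lagged_return_corr(nky: pd.Series, spx: pd.Series, lag_days: int = 0) -> float:
    """Correlation of NKY daily log returns with SPX log returns lagged
    by `lag_days`.  Tokyo closes before New York, so today's NKY should
    line up with yesterday's SPX."""
    r_nky = np.log(nky).diff()
    r_spx = np.log(spx).diff().shift(lag_days)
    joined = pd.concat({"nky": r_nky, "spx": r_spx}, axis=1).dropna()
    return joined["nky"].corr(joined["spx"])

# Expectation from the text: lagged_return_corr(nky, spx, lag_days=1)
# is usually over 50%, while the naive same-day figure is about 20% or lower.
```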

In the plot below, the three major equity indices from Japan, Europe, and the U.S. are shown. Recognition of time zones helps to visualize their strong correlation.

In addition, the plot demonstrates that using the prior day's quote for a missing value might produce unreasonable results. October 13, 2008 was one of the biggest equity rallies in history. That day was a local holiday in Japan, so the NKY value is unavailable. The Prior Value point plotted for 2008-10-13 looks out of place. The NKY value in the chart was suggested by our gap-filling software and looks much more in agreement with the other two indices.

Beware of potentially material differences between composite and primary exchange quotes.

As an example, for IBM the composite ticker is IBM US Equity and the primary exchange ticker is IBM UN Equity. For actively traded U.S. equities like IBM, the difference between primary exchange and composite quotes is usually either non-existent or negligible. The same cannot be said about actively traded names in Europe and Japan.

The plot below shows a spike on 2016-11-16 in the closing prices of Canon, if judged by the composite quotes. The primary exchange quotes have no spike.
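
A simple screen for such discrepancies, assuming date-indexed pandas Series of composite and primary exchange closes; the 1% tolerance is an illustrative threshold.

```python
import pandas as pd

def composite_vs_primary(composite: pd.Series, primary: pd.Series,
                         tolerance: float = 0.01) -> pd.Series:
    """Return the dates where the composite close deviates from the
    primary exchange close by more than `tolerance` (1% by default)."""
    rel_diff = (composite / primary - 1.0).abs()
    return rel_diff[rel_diff > tolerance]
```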

Based on our experience, in Japan and Europe the primary exchange time series behave better.

For the U.S., you might want to use composite quotes. There is an additional benefit in that: some firms lose a stock's history when the primary exchange ticker changes due to the company's move, let's say, from the New York Stock Exchange to NASDAQ, while the composite ticker is unaffected by such a move.