What is Alternative Data for Finance?

Alternative data is data that gives an early indication of "officially-reported" data that has not yet been released. Officially-reported data are data points that are publicly and uniformly disseminated, such as company financial statements and official government statistics like unemployment. Critically, alternative data is only a prediction of some "true" value; it cannot be the actual value itself. In some cases, the “officially-reported” data may never actually be released, for example when you’re trying to proxy the revenues of a private company that never publicly reports them.

What Can Alternative Data Help Measure?

[Figure: Alternative Data Chain]

Company Health

Online Reviews


Sentiment Data


Web Crawling


Alternative Data to Proxy Company Revenues


Certain types of data tell you about the run-up to a company actually generating revenues.

App Downloads

If you’re interested in Target’s online sales, one of the things you may want to look at is how many people are downloading the Target app on their phones. That may give you an early indication of how much people will be spending at Target.

Satellite Data

For a company like Home Depot, one of the things you can do is buy data from satellite imagery companies that can determine how many cars are in a parking lot. You might be able to use this to figure out how many people are shopping at Home Depot and compare it to equivalent periods in previous years.
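As a rough sketch of that year-over-year comparison, the snippet below computes the change in average daily car counts between equivalent quarters. The counts, the quarter labels, and the function name are made up for illustration; a real satellite data feed would come in its own format.

```python
def year_over_year_change(current_count: float, prior_year_count: float) -> float:
    """Fractional change in car counts versus the same period a year earlier."""
    if prior_year_count <= 0:
        raise ValueError("prior-year count must be positive")
    return current_count / prior_year_count - 1.0


# Hypothetical average daily car counts per (year, quarter); illustrative only.
car_counts = {
    ("2016", "Q2"): 1200.0,
    ("2017", "Q2"): 1320.0,
}

change = year_over_year_change(
    car_counts[("2017", "Q2")], car_counts[("2016", "Q2")]
)
print(f"Q2 parking-lot traffic changed {change:+.1%} year over year")
```

Comparing against the same quarter of the previous year, rather than the previous quarter, helps keep seasonal effects out of the number.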

Drone Data

Satellite data may not have the resolution or the frequency you're looking for. In that case, you could use drones to do parking lot counts more frequently.

Mobile Location Data

What happens if you're interested in Sears, whose stores are typically located inside malls? Parking lot traffic gives you an indication of overall mall traffic, but you can do better. Most people visiting a mall will have a phone capable of tracking location. Many of them have given a number of apps permission to track their location, both while they use the app and in the background. Knowing that, you can draw polygons around the stores you care about and buy location data from any number of these apps to determine traffic to each of the stores in the mall.
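To make the polygon idea concrete, here is a minimal sketch using the standard ray-casting point-in-polygon test. It assumes coordinates are already in a flat x/y plane (a reasonable approximation for something the size of a store) and that each location ping carries a device identifier. The polygon, the pings, and all names are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: True if the point lies inside the polygon."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical store footprint and anonymized location pings.
store_polygon = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
pings = [
    ("device_a", (1.0, 1.0)),
    ("device_b", (5.0, 1.0)),  # outside the store
    ("device_a", (2.0, 2.5)),
    ("device_c", (3.9, 0.1)),
]

# Count each device once, no matter how many pings it produced in the store.
visitors = {device for device, loc in pings if point_in_polygon(loc, store_polygon)}
print(f"{len(visitors)} distinct devices seen inside the store polygon")
```

Counting distinct devices rather than raw pings keeps one phone that reports its location often from looking like many visitors.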

With the type of data available, you could theoretically find out where traffic is moving. For example, if someone used to shop at Sears, there is data out there that tells you where those customers are going now.

Other Types of Location Data

What do you do when you want to track foot traffic to Starbucks, particularly in dense cities? People don't spend a lot of time at a Starbucks so it can be difficult to distinguish between someone walking by the Starbucks, someone standing outside of the Starbucks waiting for the bus, someone in the office above the Starbucks, and someone actually at the Starbucks buying coffee.

One of the ways to deal with this is something like Foursquare check-in data. When someone uses Foursquare, they're making that distinction explicit. You know with much more certainty that they're at the Starbucks. That being said, you then have to rely on people checking in or otherwise using the Foursquare API.

We know of companies that are installing beacons in client stores to let the client know where in the store customers are spending their time.


Once an order has been placed, there is a direct line to company revenue. One can think about this as placing an order online.

Point of Sale Data

While all of the previous data tells you how much traffic is entering the store, it doesn’t necessarily tell you how much people spend once they are inside. The point of sale terminal (a more generic term for cash register) that the firm uses to ring up your purchases knows how much you’re spending. Some point of sale providers are able to anonymize and aggregate that data so that investors can get a sense of sales and, in particular, same-store sales, which is a key retail metric. While point of sale data captures both credit card and cash purchases made in the retail establishment, it misses the online portion of the business. Most businesses these days do a percentage of their sales online, and those sales are not captured by the physical point of sale terminal.
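Same-store sales only compares stores that were open in both periods, so that openings and closures do not distort the growth figure. A minimal sketch, assuming aggregated sales per store are available as simple dictionaries (the store names and sales figures are hypothetical):

```python
def same_store_sales_growth(prior: dict, current: dict) -> float:
    """Sales growth computed only over stores present in both periods."""
    comparable = prior.keys() & current.keys()
    if not comparable:
        raise ValueError("no stores are present in both periods")
    prior_total = sum(prior[s] for s in comparable)
    current_total = sum(current[s] for s in comparable)
    return current_total / prior_total - 1.0


# Hypothetical aggregated sales per store for two equivalent periods.
prior_sales = {"store_1": 100.0, "store_2": 200.0, "store_3": 150.0}  # store_3 later closed
current_sales = {"store_1": 110.0, "store_2": 205.0, "store_4": 90.0}  # store_4 is new

sss_growth = same_store_sales_growth(prior_sales, current_sales)
total_growth = sum(current_sales.values()) / sum(prior_sales.values()) - 1.0
print(f"same-store sales growth: {sss_growth:+.1%}, total sales growth: {total_growth:+.1%}")
```

In this made-up example the comparable stores grew while total sales fell, which is exactly the kind of difference the same-store metric is meant to expose.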

The point of sale can capture individual line items as well. It knows not only that you made a purchase at Macy’s but also that you bought a Louis Vuitton bag there along with other items. That has implications for an array of other companies beyond the particular retailer.

Credit Card Data

Credit card data can help fill the gap with online purchases. This data tends to be messy and there are a number of companies that help aggregate this information. Credit card data can come from a variety of different sources:

  • The bank that issued the card
  • The payment processor
  • Visa/Mastercard/Amex/etc
  • Apps that have connected to your bank account and that are helping you manage money

Each of these has its own pros and cons, but they all boil down to how much of the total picture you are able to obtain. For example, if you only get data from one bank, you’re going to miss all the other banks. If you’re only getting data from a particular processor, you may miss sales in other countries.
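One simple way to think about "how much of the total picture" a source gives you is a coverage ratio: the spend you observe in the data divided by the revenue the company eventually reported for the same period. The sketch below uses made-up numbers and hypothetical names, and it assumes you have both figures for past quarters.

```python
def coverage_ratio(observed_spend: float, reported_revenue: float) -> float:
    """Share of officially reported revenue that the data source captured."""
    if reported_revenue <= 0:
        raise ValueError("reported revenue must be positive")
    return observed_spend / reported_revenue


# Hypothetical (quarter, observed card spend, reported revenue), in $ millions.
history = [
    ("Q1", 12.0, 100.0),
    ("Q2", 13.0, 104.0),
    ("Q3", 11.0, 110.0),
]

ratios = {quarter: coverage_ratio(spend, revenue) for quarter, spend, revenue in history}
for quarter, ratio in ratios.items():
    print(f"{quarter}: data source covered {ratio:.1%} of reported revenue")
```

A ratio that drifts from quarter to quarter may be a hint that the source is gaining or losing coverage, for example because customers are shifting to cards or countries it does not see.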

E-mail Receipts

There are a number of apps out there that you’ve granted access to your e-mail. Some of them are e-mail clients; others, like Unroll.me, do things like make it easier to unsubscribe from e-mail newsletters. With access to your e-mail, they are able to scan for and aggregate the receipts you receive. This would include things like Lyft and Uber receipts, which are always sent via e-mail. Uber got into some hot water recently because they were purchasing data about Lyft’s ridership that was derived from e-mail receipts.

Alternative Data to Proxy Company Costs

Hiring Activity

Major recruitment firms as well as job posting websites are able to measure the level of hiring activity at various firms. Depending on how a particular firm conducts its hiring, this can give an indication of company costs. In cases where hiring is linked to orders, an unexpected spike in hiring could indicate an unexpected spike in orders.

Alternative Data to Proxy Financial Market Activity

Crowd Sourced Data

There are a number of apps out there that let many users make predictions about company financials and even about whether they think a stock is going to go up or down. While we have not evaluated the claims, some of those companies claim that their aggregate user base can predict company financials and market movements more accurately than Wall Street research desks.

Privacy Issues with Alternative Data

Location data needs proper privacy agreements. Those privacy policies that nobody reads are actually important, and hedge funds with strong compliance teams will absolutely not use any data that was collected before the privacy policy allowed it. In that case, the funds work with the provider to modify the privacy agreements.

Some data crosses over into “creepy” territory despite privacy policies that technically allow the collection of certain types of information.

One interesting thing is that some providers, including major credit card companies, don’t want to sell data under their own name. They will transmit it through a third party that does nothing other than remove the major brand’s name from the data.

In all cases, personally identifying information must be removed. For finance, that actually isn’t even an issue: investors want company-level data, not data about individuals. One of the things we’re doing is helping providers understand that.

How Investors Purchase and Use Alternative Data

Jobs in Alternative Data

- We know of one large fund with a team of almost 50 people working on alternative data. They have people to source it, vet it, make reports, do analysis, and so on.

- Matt Ober, who used to work at WorldQuant, was making $200k there before he was poached by Third Point, where he is now apparently making $2M. We only know this because WorldQuant is suing him.

- One hedge fund we know of is looking for a data scientist to work with alternative data. The pay for that role is $250k to $400k.

- We’ve heard of one hedge fund that has a backlog of about 2000 potential data providers. They have yet to go through this whole list and that’s what we’re trying to help with.

- BlackRock is literally advertising a position called “Data Hunter” right now.

- Some traditional data providers, like FactSet, know that their data is getting more and more commoditized. One thing they’re doing in response is sourcing alternative data providers and bringing them into the FactSet ecosystem.

- Based on the job postings, we see people sourcing alt data fall into these categories:

  - Specialized alternative data people

  - Repurposed market data people

  - Data scientists who are tasked with finding the data in addition to analyzing it

- Some risk management departments have been looking into alternative data


Fundamental Investors

Until recently, the major users of what we call alternative data were typically quantitative funds. A few other firms would purchase certain types of alternative data, but most funds relied on company financial statements only. As more and more data becomes available, we have seen a surge of interest from fundamental investors across the board, from small hedge funds to large asset managers.

- Predict next quarter’s revenue numbers

- Anomaly detection vs. competitors. For example, one company might see a surge on a particular day that its competitors do not.
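One simple way to flag that kind of one-day surge is to compare a company's daily change in some metric (traffic, card spend, app downloads) against the same day's changes at its competitors. The sketch below uses a plain z-score with a threshold of 3; the threshold, the numbers, and the function name are assumptions chosen for illustration, not a recommended method.

```python
from statistics import mean, pstdev


def peer_relative_zscore(target_change: float, peer_changes: list) -> float:
    """How many peer standard deviations the target sits from the peer mean."""
    mu = mean(peer_changes)
    sigma = pstdev(peer_changes)  # population standard deviation of the peers
    if sigma == 0:
        raise ValueError("peer changes have no variation")
    return (target_change - mu) / sigma


# Hypothetical one-day changes in a traffic metric for five competitors.
peer_changes = [0.01, -0.02, 0.00, 0.03, -0.01]
target_change = 0.25  # the company we care about saw a 25% jump

z = peer_relative_zscore(target_change, peer_changes)
is_anomaly = abs(z) > 3.0  # threshold is an arbitrary illustrative choice
print(f"z-score vs. peers: {z:.1f}, anomaly: {is_anomaly}")
```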

Opportunity for New Companies in Alternative Data

  • More and more providers are popping up.

  • The data needs to be made more usable, and there are various layers to this.

  • As more providers come online, we see a need to be able to sift through a lot of the new providers.

  • There’s a pricing challenge. New providers typically don't know how valuable their data is to various firms.

  • There are a number of firms that provide interesting data sets, typically for marketing purposes, but have not yet realized that their data could be useful for investors.

  • There is potential for intermediaries that do some level of anonymization, and this could be on both ends. Some data providers do not really want the broader market to know that they are selling this data, so they go through potentially several layers of anonymization so that the end user doesn’t know who the underlying data provider is. There’s also potential for companies working on behalf of the investment manager to hide the fact that a particular investment manager is in the market for a certain data set. There can be a number of reasons for this. One is to avoid disclosing an investment thesis. Another is to obtain better pricing, since the biggest fund managers feel like they're getting ripped off.

How Long Will the Alternative Data Trend Last?

When earnings are announced for a stock, they tend to surprise financial markets. In some cases, the price volatility on an earnings day is 10 times that of a regular trading day for the stock. This indicates that there is something in the officially-reported data that the market was not able to anticipate. So long as that earnings-day volatility exists, there will be an opportunity for investors to gain an edge in the markets, and a reason for them to continue purchasing new and differentiated alternative data sets.
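To illustrate the comparison, the sketch below computes the ratio of the average absolute return on earnings days to the average absolute return on all other days. Using mean absolute return as the volatility measure is a simplifying assumption, and the dates and returns are invented.

```python
from statistics import mean


def earnings_day_volatility_ratio(daily_returns: dict, earnings_days: set) -> float:
    """Mean absolute return on earnings days divided by that on regular days."""
    earnings_moves = [abs(r) for day, r in daily_returns.items() if day in earnings_days]
    regular_moves = [abs(r) for day, r in daily_returns.items() if day not in earnings_days]
    if not earnings_moves or not regular_moves:
        raise ValueError("need at least one earnings day and one regular day")
    return mean(earnings_moves) / mean(regular_moves)


# Hypothetical daily returns for one stock; the fourth day is an earnings day.
daily_returns = {
    "day_1": 0.004,
    "day_2": -0.006,
    "day_3": 0.005,
    "day_4": 0.050,
    "day_5": -0.005,
}
earnings_days = {"day_4"}

ratio = earnings_day_volatility_ratio(daily_returns, earnings_days)
print(f"earnings-day moves are about {ratio:.0f}x regular-day moves")
```

If alternative data made earnings fully predictable, you would expect this ratio to drift toward 1 over time.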