In the financial domain, "alternative data" is anything apart from traditional pricing and fundamentals data that can be used in fair value analysis. Given an explosion in relevant big data sources, this area has experienced rapid growth in recent years. Here are the major "themes" that cover most of the important datasets available to an investor:


Information from job postings and resumes can be used to predict a company's business growth, corporate strategy and labour costs. It can also help diagnose employee loyalty to the company and its talent retention capabilities.


Although pricing was the first machine-readable data used to analyse companies, modern computational capabilities allow an ever larger amount of signal to be extracted from intraday exchange and OTC quotes.


Sell-side analysts have for decades formed a consensus view on a company’s growth and profitability outlook. At the same time, we see a number of new approaches in that vein, for instance crowdsourcing platforms that seek to broaden the audience of forecasters and to increase the number of metrics and prediction formats.


Holdings of major financial institutions such as index, mutual and hedge funds can be used both to track an individual company’s investor patterns and to map relationships between companies based on shared shareholders.
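One way to map relationships between companies through shared shareholders is to compare the sets of funds that hold each company. The sketch below is a minimal illustration that uses Jaccard similarity as the overlap measure; the fund names, tickers and the choice of metric are assumptions made for the example, not a description of any particular dataset or product.

```python
from itertools import combinations

# Hypothetical holdings: fund name -> set of tickers it holds (illustrative only).
holdings = {
    "FundA": {"AAA", "BBB", "CCC"},
    "FundB": {"AAA", "BBB"},
    "FundC": {"CCC", "DDD"},
}


def holders_by_company(holdings):
    """Invert fund -> tickers into ticker -> set of funds holding it."""
    result = {}
    for fund, tickers in holdings.items():
        for ticker in tickers:
            result.setdefault(ticker, set()).add(fund)
    return result


def shareholder_overlap(holdings):
    """Jaccard similarity of the holder sets for every pair of companies."""
    holders = holders_by_company(holdings)
    overlap = {}
    for a, b in combinations(sorted(holders), 2):
        shared = holders[a] & holders[b]
        union = holders[a] | holders[b]
        overlap[(a, b)] = len(shared) / len(union)
    return overlap


overlap = shareholder_overlap(holdings)
for pair, score in sorted(overlap.items()):
    print(pair, round(score, 2))
```

A score of 1.0 means two companies are held by exactly the same funds, while 0.0 means they have no holder in common. A real application would probably weight holders by position size rather than treating every fund equally.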


Correctly interpreted satellite imagery can yield precise estimates of manufacturing volumes, footfall in retail chains and activity at commodity deposits. This data can be relevant at both the individual-company and the macro level.


Information about current consumer sales can be obtained from credit card data, e-commerce apps, parsed email receipts etc. It can be aggregated by buyer, product and manufacturer, and can be used to predict revenue numbers well in advance of the company's official financial reporting.
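The aggregation step described above can be sketched in a few lines. The records, field names and amounts here are hypothetical and stand in for whatever schema a real transaction feed would provide.

```python
from collections import defaultdict

# Hypothetical transaction records (illustrative values, not real data).
transactions = [
    {"buyer": "u1", "product": "phone", "manufacturer": "AcmeCo", "amount": 500.0},
    {"buyer": "u2", "product": "phone", "manufacturer": "AcmeCo", "amount": 450.0},
    {"buyer": "u1", "product": "laptop", "manufacturer": "Globex", "amount": 1200.0},
]


def total_by(records, key):
    """Sum the 'amount' field of the records, grouped by the given key."""
    totals = defaultdict(float)
    for record in records:
        totals[record[key]] += record["amount"]
    return dict(totals)


by_manufacturer = total_by(transactions, "manufacturer")
by_product = total_by(transactions, "product")
by_buyer = total_by(transactions, "buyer")
print(by_manufacturer)
```

Turning such panel totals into a revenue estimate would additionally require scaling for panel coverage and mapping manufacturers to listed entities, which this sketch leaves out.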


Following developments in natural language processing, both official news channels and social media posts can be evaluated for their impact on the companies mentioned.


Annual and quarterly reporting (mostly 10-Ks and 10-Qs submitted to the SEC) can be used both to track the major metrics reflecting the company's business and to capture more nuanced factors, such as delays in filing the reports.
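As an illustration of the filing-delay factor, one simple approach is to measure the number of days between the end of a reporting period and the filing date, then compare the latest lag with the company's own history. The dates below are made up, and the comparison against a plain historical average is an assumption chosen for brevity.

```python
from datetime import date
from statistics import mean

# Hypothetical (period_end, filing_date) pairs for one company, oldest first.
filings = [
    (date(2022, 3, 31), date(2022, 5, 5)),
    (date(2022, 6, 30), date(2022, 8, 4)),
    (date(2022, 9, 30), date(2022, 11, 4)),
    (date(2023, 3, 31), date(2023, 5, 20)),
]


def filing_lags(filings):
    """Days elapsed between each period end and its filing date."""
    return [(filed - period_end).days for period_end, filed in filings]


def latest_lag_vs_history(filings):
    """Latest filing lag minus the average of all earlier lags, in days."""
    lags = filing_lags(filings)
    return lags[-1] - mean(lags[:-1])


lags = filing_lags(filings)
extra_days = latest_lag_vs_history(filings)
print(lags, extra_days)
```

A positive value indicates that the latest report was filed later than usual for this company.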


ESG (environmental, social and corporate governance) data reflects both the internal metrics of a well-run company (boardroom composition, executive pay, voting patterns etc.) and the quality of its relationship with the outside world (interaction with stakeholders and impact on the environment).


A big chunk of listed companies’ capital is obtained through the bond market, so bond and CDS prices are closely related to a company's stock price.

Why is this important?

As the returns in the active management industry show, alpha is much harder to find today than ten or twenty years ago. While traditional techniques see their edge disappear due to overcrowding, the onus is on managers to find new sources of outperformance in a rapidly changing informational landscape. Data that was once considered scarce has suddenly become ubiquitous, overwhelming researchers and portfolio managers with the "three Vs of big data": variety, volume and velocity.

Number of alternative datasets, by year (source: alternativedata.org)
We believe the race will belong to those who have the right tools to turn quantity into quality. To that end, we are building a universal data processing framework that will ingest and normalize the data in real time, delivering a turnkey solution for building your own data mosaic.