Quantifying the semantics of search behavior before stock market moves |
| |
Authors: | Chester Curme, Tobias Preis, H. Eugene Stanley, Helen Susannah Moat |
| |
Affiliation: | (a) Center for Polymer Studies and Department of Physics, Boston University, Boston, MA 02215; (b) Warwick Business School, University of Warwick, Coventry CV4 7AL, United Kingdom |
| |
Abstract: | Technology is becoming deeply interwoven into the fabric of society. The Internet has become a central source of information for many people when making day-to-day decisions. Here, we present a method to mine the vast data Internet users create when searching for information online, to identify topics of interest before stock market moves. In an analysis of historic data from 2004 until 2012, we draw on records from the search engine Google and online encyclopedia Wikipedia as well as judgments from the service Amazon Mechanical Turk. We find evidence of links between Internet searches relating to politics or business and subsequent stock market moves. In particular, we find that an increase in search volume for these topics tends to precede stock market falls. We suggest that extensions of these analyses could offer insight into large-scale information flow before a range of real-world events. Financial crises arise from the complex interplay of decisions made by many individuals. Stock market data provide extremely detailed records of such decisions, and as such both these data and the complex networks that underlie them have generated considerable scientific attention (1–20). However, despite their gargantuan size, such datasets capture only the final action taken at the end of a decision-making process. No insight is provided into earlier stages of this process, where traders may gather information to determine what the consequences of various actions may be (21). Nowadays, the Internet is a core information resource for humans worldwide, and much information gathering takes place online. For many, search engines such as Google act as a gateway to information on the Internet. Google, like other search engines, collects extensive data on the behavior of its users (22–25), and some of these data are made publicly available via its service Google Trends.
These datasets catalog important aspects of human information gathering activities on a global scale and thereby open up new opportunities to investigate early stages of collective decision making. In line with this suggestion, previous studies have shown that the volume of search engine queries for specific keywords can be linked to a range of real-world events (26), such as the popularity of films, games, and music on their release (27); unemployment rates (28); reports of flu infections (29); and trading volumes in US stock markets (30, 31). A recent study showed that Internet users from countries with a higher per capita gross domestic product (GDP), in comparison with Internet users from countries with a lower per capita GDP, search for proportionally more information about the future than information about the past (32). Here, we investigate whether we can identify topics for which changes in online information-gathering behavior can be linked to the sign of subsequent stock market moves. A number of recent results suggest that online search behavior may measure the attention of investors to stocks before investing (33–35). We build on a recently introduced method (33) that uses trading strategies based on search volume data to identify online precursors for stock market moves. This previous analysis of search volume for 98 terms of varying financial relevance suggests that, at least in historic data, increases in search volume for financially relevant search terms tend to precede significant losses in financial markets (33). Similarly, Moat et al. (36) demonstrated a link between changes in the number of views of Wikipedia articles relating to financial topics and subsequent large stock market moves.
The importance of the semantic content of these Wikipedia articles is emphasized by a parallel analysis that finds no such link for data from Wikipedia pages relating to actors and filmmakers. Financial market systems are complex, however, and trading decisions are usually based on information about a huge variety of socioeconomic topics and societal events. The initial examples above (33, 36) focus on a narrow range of preidentified financially related topics. Instead of choosing topics for which search data should be retrieved and investigating whether links exist between the search data and financial market moves, here we present a method that allows us to identify topics for which levels of online interest change before large movements of the Standard & Poor’s 500 index (S&P 500). Although we restrict ourselves to stock market moves in this study, our methodology can be readily extended to determine topics that Internet users search for before the emergence of other large-scale real-world events. Our approach is as follows. First, we take a large online corpus, Wikipedia, and use a well-known technique from computational linguistics (37) to identify lists of words constituting semantic topics within this corpus. Second, to give each of these automatically identified topics a name, we engage users of the online service Amazon Mechanical Turk. Third, we take lists of the most representative words of each of these topics and retrieve data on how frequently Google users searched for the terms over the past 9 years. Finally, we use the method introduced in ref. 33 to examine whether the search volume for each of these terms contains precursors of large stock market moves. We find that our method is capable of automatically identifying topics of interest before stock market moves and provide evidence that for complex events such as financial market movements valuable information may be contained in search engine data for keywords with less obvious semantic connections. |
| |
Keywords: | complex systems; computational social science; data science; online data; financial markets |
|
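The final step of the approach described in the abstract, a trading strategy driven by search volume, can be illustrated with a minimal sketch. The record above does not spell out the rule, so everything specific here is an assumption made for illustration: the comparison of last week's search volume against a moving average of the preceding `delta_t` weeks, the default `delta_t=3`, the use of weekly prices, and the function name `search_volume_strategy` are all hypothetical and are not taken from this text. The idea shown is only that a rise in search volume triggers a short position for the coming week and a fall triggers a long position.

```python
import math
from statistics import mean


def search_volume_strategy(volume, price, delta_t=3):
    """Cumulative log return of a hypothetical search-volume trading rule.

    volume[t]: relative search volume for one term in week t
    price[t]:  index price at the start of week t
    delta_t:   number of earlier weeks used as the baseline (assumed value)
    """
    log_return = 0.0
    # Start once a full baseline window exists; stop one week early
    # because each position is closed at price[t + 1].
    for t in range(delta_t + 1, len(price) - 1):
        # Mean search volume over the delta_t weeks before week t - 1.
        baseline = mean(volume[t - 1 - delta_t : t - 1])
        change = volume[t - 1] - baseline
        if change > 0:
            # Search volume rose: sell at price[t], buy back at price[t + 1].
            log_return += math.log(price[t]) - math.log(price[t + 1])
        elif change < 0:
            # Search volume fell: buy at price[t], sell at price[t + 1].
            log_return += math.log(price[t + 1]) - math.log(price[t])
        # No change in search volume: no position this week.
    return log_return


if __name__ == "__main__":
    # Made-up toy numbers, purely to show the call; not real data.
    weekly_volume = [10, 10, 10, 20, 5, 5, 5]
    weekly_price = [100, 100, 100, 100, 100, 90, 99]
    print(search_volume_strategy(weekly_volume, weekly_price))
```

Under these assumptions, a term whose strategy return is consistently higher than that of randomly timed trades would be a candidate precursor of market moves. Transaction costs are ignored in this sketch.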
|