System and method for predicting security price movements using financial news
A method of creating a price prediction model that forecasts short-term price fluctuations in financial instruments by collecting, analyzing and classifying financial news for a financial instrument into categories. Distributions for the changes in price of the financial instrument for a set period of time and distributions for the changes in price of the financial instrument as a result of the financial news for each news category for a set period of time are then obtained. If the distributions for the changes in price of the financial instrument are statistically significantly different than the distributions for the changes in price of the financial instrument for a particular news category, and the mean for the change in price is greater or less than zero, a signal is produced indicating the trading action that should be taken for the financial instrument.
This application is a continuation of and claims priority of U.S. application Ser. No. 10/113,895 filed Mar. 28, 2002 which claims priority to U.S. provisional application 60/350,264 filed on Jan. 18, 2002.
BACKGROUND OF THE INVENTION
A. Field of the Invention
A portion of the disclosure of this patent document contains material that is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
Background and Prior Art
One “Holy Grail” in the financial markets is the development of an automated system that predicts price movements of financial instruments. If one could predict whether prices will move up or down for financial instruments such as stocks, bonds, and commodities, one would have a way to generate money. Several prediction strategies exist that find patterns in price fluctuations. They fall into two categories: fundamental analysis and technical analysis. Fundamental analysis is performed by an analyst who keeps abreast of the news and data affecting a specific stock or market. The successful analyst warehouses correlation in the market and predicts the correct trend. This type of analysis often involves a prediction with a long-term horizon, such as a few months or years. Technical analysis is performed by a person or machine that looks for numeric trends in changes in financial and economic measures. Technical analysis is often used for short-term and long-term trading. The present invention is a fusion of fundamental and technical analysis. The invention predicts the movement of a financial instrument given historical closing prices and daily financial news about the underlying financial instrument.
The Engineering and Economic research literature is replete with approaches that use historical stock prices and economic values for predicting when to purchase a stock. For example, Yoon and Swales used a four-layered neural network to determine well performing firms and poorly performing firms using nine economic measures as input [1]. However, these approaches, whether they use neural networks or statistical regression, do not incorporate the events, and in particular, the news events that are responsible for the actual day-to-day price movements.
Economic news event studies have motivated several research projects. A typical event study would determine if a correlation exists between price changes and a particular event such as stock splits, merger announcements, or the reporting of earnings. Page A-5 in this document contains an example using merger announcements. Some related research has used proxies for more general classifications of news. For example, Depken [4] uses a decomposition of volume as a proxy for “Good” and “Bad” to study how split-stocks react to news. In this work and others, the measure of interest is the statistical variance of volume and price changes. However, it is not clear that event studies using variance or volatility as the measure of interest have predictive value. Volatility can be defined as the standard deviation (square root of the variance) of the annual expected return of a security. By definition, volatility does not predict the direction of price movements, only a dispersion of possible annual returns, both negative and positive.
Upon close examination of the Economic event study literature, it is evident that prediction is not the purpose of the research. The motivation of this research is to find and explain a market behavior in the context of a correlation between specific events and price changes, thus much of the research does not provide results for prediction, or recommend how the techniques described could be used in a prediction process. See Chan [3] for a comprehensive summary of previous related research for Economic event studies.
There is some recent research from the Machine Learning and Information Retrieval literature that is concerned with prediction. This research attempts to find a correlation with the words in the news that co-occur with surprising price changes. For example, Fawcett and Provost [5] find a set of words that often occur with 10% price changes in a stock. This type of text retrieval process shares a similarity to the invention described here, because it is extensible to events in general and not specific to predefined events. However, in this type of research the words predict when a particular price change event will occur, and there is no attempt to use an analyst's classification of “news” as input.
SUMMARY OF THE INVENTION
This SYSTEM AND METHOD FOR PREDICTING SECURITY PRICE MOVEMENTS USING FINANCIAL NEWS forecasts short-term price fluctuations in domestic or international stocks. However, the present invention may be utilized for any financial instrument and the embodiment of this approach is not limited to applications in the stock market.
In one specific embodiment of the approach, textual financial documents obtained from public interest web sites were reviewed by financial analysts and classified as either “good news” or “bad news” relative to the expected performance of a financial instrument. In addition, “mixed news” and “mention news” were used as classifications for financial news. Distributions of price changes for a particular financial instrument were sampled from the data based on the occurrences of the different classifications of news. In this embodiment of the approach, the distributions were used to form a model that produces buy, sell, and no-trade signals for the financial instrument. The model is then used to predict when to buy, sell or not trade the stock given the daily occurrences of the underlying company's financial news.
The present invention, described herein, is for predicting short-term price fluctuations in domestic or international stocks. However, the present invention can be utilized for any financial instrument; therefore, it should be understood that the embodiment of this approach is not limited to applications in the stock market.
The salient distinction between this invention and previous approaches is the novel use of news as the input to the price prediction model. In embodiments of this invention, an analyst classifies or judges financial news articles using the following four classes or categories:
GOOD—good news, an event that improves the fundamental outlook of the company (ex: ‘results of a study that proved the high effectiveness of JNJ's coated stents, and cited it as likely the first to receive government approval’), better than expected earnings, a new contract, the expectation of new business, the acquiring of key personnel, etc.
BAD—bad news, something financially detrimental to the company or its industry, events such as extremely large litigation settlements, pipeline shutdowns due to indeterminately long political turmoil, unexpected poor earnings, loss of key clients, loss of key personnel, announcement of bankruptcy, unusual insider selling, etc.
MIXED—mixed news, some good and some bad news mixed in the same story, article not specifying why the price movement was contrary to what the fundamentals indicated (ex: while the earnings were bad year over year, they were better than consensus), bad earnings with expectation of good earnings growth, layoffs implying improved bottom line, loss of business and gain of new business, etc.
MENTION—mention news, the company's name is mentioned in an article in passing, (ex: ‘JNJ is the second largest pharmaceutical company, behind MRK’), a fundamental change in a company that was announced weeks ago, etc.
The judgements for stories are used for two purposes: 1) to build a price prediction model (see
In one embodiment of the invention, analysts classified financial news stories that were available on the internet from various news feeds. The stories and articles, about publicly traded companies, were from the Associated Press and Reuters financial news wires. For the purpose of this embodiment, a total of three analysts were used, each with a Master's degree in Business Administration and a background comprising several years of financial markets experience. They were given guidelines similar to those listed above. In this embodiment, classification was based on the impact of the event on the financial outlook of the company, and not whether the stock price would go up or down.
A price prediction model for a stock is determined using historical closing prices and a set of financial news judgements for the articles about the stock. The approach is illustrated in
For example, assume we have a stock with the following data:
The training period is Jan. 2, 2002-Jan. 5, 2002.
The distribution of the 1-day change in price of the stock in general is:
t1=0.5, t2=−0.17, and t3=0.6.
The distribution of the 2-day change in price of the stock in general is:
t2=−0.25, and t3=0.33.
Incorporated herein are references to pages A-1 to A-3 of the Appendix, which provide a description and equations for calculating the mean and the standard deviation of a distribution.
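The mean and standard deviation of the example 1-day distribution above can be sketched as follows. Because the Appendix equations are not reproduced here, the use of the sample standard deviation (divisor n−1) is an assumption, and the function names are illustrative only.

```python
import math

def mean(values):
    """Arithmetic mean of a list of price changes."""
    return sum(values) / len(values)

def sample_std(values):
    """Sample standard deviation (divisor n-1). Assumption: the Appendix
    may instead define the population form (divisor n)."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

# 1-day changes of the stock in general, taken from the example above.
one_day_changes = [0.5, -0.17, 0.6]
mu_stock = mean(one_day_changes)
sigma_stock = sample_std(one_day_changes)
```

With only three observations the estimates are very rough; the embodiments described later use roughly two years of trading days.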
The apparatus for collecting analyst classifications via a website is illustrated in
Price change distributions for the days when news appears are determined for each class or category of news. For example, if at t0, there existed an article assessed as “good news”, the price change between t0 and t1 becomes a member of the distribution for good news, which is assumed to be approximately normal and represented as ˜N(μgood, σgood). In addition, distributions ˜N(μbad, σbad), ˜N(μmixed, σmixed), and ˜N(μmention, σmention) are also determined for days where bad, mixed, and mention news appear in the news.
Referring to example 1 above:
The distribution of the 1-day change in price of the stock when good news appears is:
t1=0.5
The distribution of the 2-day change in price of the stock when good news appears is:
t2=−0.25
The distribution of the 1-day change in price of the stock when bad news appears is:
t2=−0.17
The distribution of the 2-day change in price of the stock when bad news appears is:
t3=0.33
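One way to collect such per-class distributions is sketched below. The closing prices are hypothetical and are not the prices behind example 1; treating a price change as a fractional change, and the names `k_day_changes` and `changes_by_class`, are assumptions made for illustration.

```python
from collections import defaultdict

def k_day_changes(prices, k):
    """Fractional change from day i to day i+k, keyed by the starting day i.
    Assumption: changes are fractional rather than absolute."""
    return {i: (prices[i + k] - prices[i]) / prices[i]
            for i in range(len(prices) - k)}

def changes_by_class(prices, news, k):
    """Group k-day changes by the news classes observed on the starting day."""
    groups = defaultdict(list)
    for day, change in k_day_changes(prices, k).items():
        for label in news.get(day, []):
            groups[label].append(change)
    return dict(groups)

# Hypothetical closing prices for days t0..t3.
prices = [100.0, 102.0, 101.0, 104.0]
# Good news on t0 and bad news on t1, mirroring the layout of example 1.
news = {0: ["good"], 1: ["bad"]}

one_day = changes_by_class(prices, news, 1)
two_day = changes_by_class(prices, news, 2)
```

Here the change from t0 to t1 joins the 1-day good-news distribution and the change from t1 to t2 joins the 1-day bad-news distribution, as in the example.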
The five distributions are used to create the price prediction model. The price prediction model has four classifiers that produce buy, sell, and no-trade signals. There is one classifier Cclass for each news class, i.e., good, bad, mixed, and mention news. A classifier Cclass produces a buy signal for a news class if (˜N(μclass, σclass)≠˜N(μstock, σstock) and μclass>0), a sell signal if (˜N(μclass, σclass)≠˜N(μstock, σstock) and μclass<0), and a no-trade signal otherwise. Whether ˜N(μclass, σclass)≠˜N(μstock, σstock) holds is determined by a statistical hypothesis test of whether the distributions are significantly different [2]. Pages A-4, A-5, and A-6 of the appendix describe a significance test for determining whether the distributions are significantly different.
If the distributions are significantly different, then classifier Cclass will produce a buy signal when μclass>0, and a sell signal when μclass<0. If the distributions are not significantly different, or μclass=0, then classifier Cclass will produce a no-trade signal.
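The decision rule of a single classifier can be sketched as below. The function name `classify` and the boolean argument are illustrative; the outcome of the significance test is assumed to be computed elsewhere.

```python
def classify(mu_class, significantly_different):
    """Signal for one news class, given the mean price change for that class
    and whether its distribution differs significantly from the stock's."""
    if significantly_different and mu_class > 0:
        return "buy"
    if significantly_different and mu_class < 0:
        return "sell"
    # Not significantly different, or a mean of exactly zero.
    return "no-trade"
```

For instance, `classify(-0.012, True)` yields a sell signal, while any class whose distribution is not significantly different yields no-trade regardless of its mean.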
When the price distribution of the class of news is statistically-significantly different than the price distribution of the stock in general, i.e., (˜N(μclass, σclass)≠˜N(μstock, σstock)), it implies that μclass≠μstock above and beyond random chance. In terms of price movement, it implies that, on average, the change in price of a stock will be μclass when articles from the news class appear, and not μstock. For example, if a stock has moved up on average 2% in one day when good news appears, and in general the stock has historically moved 0.01% a day, this information implies that an investor can improve upon a buy and hold return strategy for the stock by investing only on the days when good news appears. If this event occurred 5 times in the course of a year, the investor would have an estimated return of 10%. The buy and hold strategy has an estimated return of roughly 2.8%.
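The arithmetic behind the comparison can be sketched as follows, assuming simple (non-compounded) addition of average daily changes. The text does not state how many days underlie its buy and hold figure, so the day count is left as a caller-supplied assumption; the function name is illustrative.

```python
def estimated_return(mean_change_per_day, days_invested):
    """Simple (non-compounded) estimate: average change times days invested."""
    return mean_change_per_day * days_invested

# Investing only on the 5 good-news days, at an average of 2% per day.
news_strategy = estimated_return(0.02, 5)

def buy_and_hold(days_in_year):
    """Buy and hold at 0.01% per day. The day count is an assumption."""
    return estimated_return(0.0001, days_in_year)
```

The news-driven estimate is 5 × 2% = 10%, earned while holding the stock on only five days, which is the comparison the paragraph draws against holding the stock all year.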
In one embodiment of the invention (see
In
In this embodiment of the invention, a two-sample t test with unequal variance [2] was used, with α<0.1 as the threshold for determining whether there was a significant difference between the sample distributions of 1 and 2-day price changes for the stock and the sample distributions of 1 and 2-day price changes for the stock when news from a particular news class appears. Based on the 2-day distributions for Boeing are illustrated in
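A two-sample t test with unequal variance (often called Welch's t test) can be sketched with the standard library alone, as below. This is not the Appendix procedure, which is not reproduced here: the Welch-Satterthwaite degrees of freedom and the numerical integration of the t density are choices made for this sketch, and the sample data are hypothetical.

```python
import math

def _t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def t_two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t)
    if upper == 0.0:
        return 1.0
    h = upper / steps
    total = _t_pdf(0.0, df) + _t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * _t_pdf(i * h, df)
    central_mass = 2 * total * h / 3  # probability of |T| <= |t|
    return max(0.0, 1.0 - central_mass)

def welch_t_test(a, b):
    """Return (t, df, p) for two samples with possibly unequal variances.
    Each sample needs at least two values and the pooled variance must be nonzero."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    sa, sb = va / na, vb / nb
    t = (ma - mb) / math.sqrt(sa + sb)
    df = (sa + sb) ** 2 / (sa ** 2 / (na - 1) + sb ** 2 / (nb - 1))
    return t, df, t_two_sided_p(t, df)

# Hypothetical price changes, for illustration only.
good_news_changes = [0.020, 0.030, 0.025, 0.018, 0.022]
stock_changes = [0.001, -0.002, 0.000, 0.003, -0.001, 0.002]
t_stat, dof, p_value = welch_t_test(good_news_changes, stock_changes)
significant = p_value < 0.1  # threshold used in this embodiment
```

When SciPy is available, `scipy.stats.ttest_ind` with `equal_var=False` performs the same test and avoids the hand-written integration.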
In another embodiment of the invention, the 1-day price change distributions for Boeing were collected for the trading days between Jun. 30, 1999, and Aug. 31, 2001, for every tj−ti=1 business day. In addition, distributions for the 1-day price changes were collected for the four news classes good, bad, mixed, and mention. The distributions of 1-day price changes between Jun. 30, 1999 and Aug. 31, 2001 are plotted for Boeing in general (white area) and for days when bad news appears (black area within white area). On average, over a 1-day period when bad news appeared on day ti, the stock changed by −1.2% with a s.d. of 2.3%. The stock of Boeing was up an average of 0.06% over a 1-day period in general. In this embodiment of the invention, bad news gave rise to a classifier with a sell signal, because the distribution of price changes when bad news appeared was statistically-significantly different based on the t test described above, and the other classes of news gave rise to classifiers producing no-trade signals.
In another embodiment of the invention (see
The daily price prediction process (see
In general, once the price prediction models are calculated for a financial instrument, it is straightforward to apply the price prediction model. The daily news for the financial instrument is categorized into good, bad, mixed, and mention news. Each article produces a trade signal depending on its news class and its associated classifier in the prediction model. If the number of buy signals exceeds the number of sell and no-trade signals, then the instrument is purchased and then sold in 1 or 2 days (depending on the number of days used to gather the distributions). If the number of sell signals exceeds the number of buy and no-trade signals, the instrument is sold short and then repurchased in 1 or 2 days. A no-trade decision is made when neither a buy nor a sell decision is predicted.
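The daily decision can be sketched as a vote over the per-article signals. The sketch reads "exceeds the number of sell and no-trade signals" as exceeding their combined count, which is an interpretation rather than something the text states explicitly; the function name is illustrative.

```python
from collections import Counter

def daily_decision(signals):
    """Combine the per-article signals for one instrument on one day.
    Assumption: a side must outnumber the other two signal types combined."""
    counts = Counter(signals)
    buy, sell, hold = counts["buy"], counts["sell"], counts["no-trade"]
    if buy > sell + hold:
        return "buy"       # purchase, then sell in 1 or 2 days
    if sell > buy + hold:
        return "sell"      # sell short, then repurchase in 1 or 2 days
    return "no-trade"
```

Under this reading, two bad-news articles and one mention article on the same day give a sell decision only if the bad-news classifier emits sell signals and the mention classifier's no-trade signal does not tie the count.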
For example, once the prediction models for Boeing are determined (see
One embodiment of this invention assumes that prediction and trading will occur a few minutes before the 4 pm stock market close of the current day. It was run for 16 stocks and their price prediction models were determined using distributions for 1 and 2-day price changes. The stock prediction models were based on historical closing prices and financial news occurring on the trading days between Jun. 30, 1999, and Aug. 31, 2001. The results are presented in
Although the invention has been described and illustrated in the context of stocks, it is to be clearly understood that the same is intended by way of illustration and example only, and is not to be taken by way of limitation. The spirit and scope of this invention is also applicable to financial instruments of any kind that are affected by publicly available news.
Claims
1-28. (canceled)
29. A method of providing financial news items and significance of the financial news items, the method comprising the steps of:
- receiving financial news items for one or more financial instruments associated with business entities;
- classifying the financial news items into categories that include: a positive category indicating that a particular news item is favorable to the financial outlook of the associated business entity; and a negative category indicating that a particular news item is unfavorable to the financial outlook of the associated business entity;
- sending the financial news items to a computer of a user for display on the user computer; and
- sending the classified categories of the financial news items to the user computer for display on the user computer near the respective news items.
30. The method according to claim 29, wherein:
- each category is represented by a graphic symbol; and
- the step of sending the classified categories includes sending the associated graphic symbols.
31. The method according to claim 30, wherein:
- a graphic symbol for the positive category includes an arrow pointing upwards; and
- a graphic symbol for the negative category includes an arrow pointing downwards.
32. The method according to claim 29, wherein the categories include a mixed category indicating that a particular news item contains both a favorable and unfavorable financial outlooks of the associated business entity.
33. The method according to claim 29, wherein the categories include a mention category indicating that a particular news item mentions the associated business entity.
34. The method according to claim 29, wherein the categories include a mention category indicating that a particular news item mentions the associated publicly traded company.
35. A method of providing financial news items and significance of the financial news items for use in making trading decisions of financial instruments representing publicly traded companies, the method comprising the steps of:
- receiving financial news items for a plurality of financial instruments representing a plurality of publicly traded companies;
- classifying the financial news items into categories that indicate the impact of the news items on the financial outlook of the associated publicly traded companies;
- sending the financial news items to a computer of a user for display on the user computer; and
- sending the classified categories of the financial news items to the user computer for display on the user computer near the respective news items for analysis by the user in making trading decisions of the financial instruments.
36. The method according to claim 35, wherein the categories include:
- a positive category indicating that a particular news item is favorable to the financial outlook of the associated publicly traded companies; and
- a negative category indicating that a particular news item is unfavorable to the financial outlook of the associated publicly traded companies.
37. The method according to claim 35, wherein:
- each category is represented by a graphic symbol; and
- the step of sending the classified categories includes sending the associated graphic symbols.
38. The method according to claim 37, wherein:
- a graphic symbol for the positive category includes an arrow pointing upwards; and
- a graphic symbol for the negative category includes an arrow pointing downwards.
39. The method according to claim 35, wherein the categories include a mixed category indicating that a particular news item contains both a favorable and unfavorable financial outlooks of the publicly traded company.
40. The method according to claim 35 wherein the categories include a mention category indicating that a particular news item mentions the associated publicly traded company.
Type: Application
Filed: Jul 11, 2008
Publication Date: Feb 26, 2009
Inventor: Ron Papka (Short Hills, NJ)
Application Number: 12/218,216
International Classification: G06Q 40/00 (20060101);