System and method for predicting security price movements using financial news

A method of creating a price prediction model that forecasts short-term price fluctuations in financial instruments by collecting, analyzing and classifying financial news for a financial instrument into categories. Distributions for the changes in price of the financial instrument for a set period of time and distributions for the changes in price of the financial instrument as a result of the financial news for each news category for a set period of time are then obtained. If the distributions for the changes in price of the financial instrument are statistically significantly different than the distributions for the changes in price of the financial instrument for a particular news category, and the mean for the change in price is greater or less than zero, a signal is produced indicating the trading action that should be taken for the financial instrument.

Description
PRIORITY

This application is a continuation of and claims priority of U.S. application Ser. No. 10/113,895 filed Mar. 28, 2002 which claims priority to U.S. provisional application 60/350,264 filed on Jan. 18, 2002.

BACKGROUND OF THE INVENTION

A. Field of the Invention

A portion of the disclosure of this patent document contains material that is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.

Background and Prior Art

One “Holy Grail” in the financial markets is the development of an automated system that predicts price movements of financial instruments. If one were able to predict whether prices of financial instruments such as stocks, bonds, and commodities will move up or down, one would have a way to generate money. Several prediction strategies exist that find patterns in price fluctuations. They fall into two categories: fundamental analysis and technical analysis. Fundamental analysis is performed by an analyst who keeps abreast of the news and data affecting a specific stock or market. The successful analyst warehouses correlation in the market and predicts the correct trend. This type of analysis often involves a prediction with a long-term horizon, such as a few months or years. Technical analysis is performed by a person or machine that looks for numeric trends in changes in financial and economic measures. Technical analysis is often used for short-term and long-term trading. The following invention is a fusion of fundamental and technical analysis. The invention predicts the movement of a financial instrument given historical closing prices and daily financial news about the underlying financial instrument.

The Engineering and Economic research literature is replete with approaches that use historical stock prices and economic values for predicting when to purchase a stock. For example, Yoon and Swales used a four-layered neural network to determine well performing firms and poorly performing firms using nine economic measures as input [1]. However, these approaches, whether they use neural networks or statistical regression, do not incorporate the events, and in particular, the news events that are responsible for the actual day-to-day price movements.

Economic news event studies have motivated several research projects. A typical event study would determine if a correlation exists between price changes and a particular event such as stock splits, merger announcements, or the reporting of earnings. Page A-5 of the Appendix contains an example using merger announcements. Some related research has used proxies for more general classifications of news. For example, Depken [4] uses a decomposition of volume as a proxy for “Good” and “Bad” to study how split-stocks react to news. In this work and others, the measure of interest is the statistical variance of volume and price changes. However, it is not clear that event studies using variance or volatility as the measure of interest have predictive value. Volatility can be defined as the standard deviation (square root of the variance) of the annual expected return of a security. By definition, volatility does not predict the direction of price movements, only a dispersion of possible annual returns, both negative and positive.

Upon close examination of the Economic event study literature, it is evident that prediction is not the purpose of the research. The motivation of this research is to find and explain a market behavior in the context of a correlation between specific events and price changes, thus much of the research does not provide results for prediction, or recommend how the techniques described could be used in a prediction process. See Chan [3] for a comprehensive summary of previous related research for Economic event studies.

There is some recent research from the Machine Learning and Information Retrieval literature that is concerned with prediction. This research attempts to find a correlation with the words in the news that co-occur with surprising price changes. For example, Fawcett and Provost [5] find a set of words that often occur with 10% price changes in a stock. This type of text retrieval process shares a similarity to the invention described here, because it is extensible to events in general and not specific to predefined events. However, in this type of research the words predict when a particular price change event will occur, and there is no attempt to use an analyst's classification of “news” as input.

SUMMARY OF THE INVENTION

This SYSTEM AND METHOD FOR PREDICTING SECURITY PRICE MOVEMENTS USING FINANCIAL NEWS forecasts short-term price fluctuations in domestic or international stocks. However, the present invention may be utilized for any financial instrument and the embodiment of this approach is not limited to applications in the stock market.

In one specific embodiment of the approach, textual financial documents obtained from public interest web sites were reviewed by financial analysts and classified to be either “good news” or “bad news” relative to the expected performance of a financial instrument. In addition, “mixed news” and “mention news” were used as classifications for financial news. Distributions of price changes for a particular financial instrument were sampled from the data based on the occurrences of the different classification of news. In this embodiment of the approach, the distributions were used to form a model that produces buy, sell, and no-trade signals for the financial instrument. The model is then used to predict when to buy, sell or not trade the stock given the daily occurrences of the underlying company's financial news.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1A: Historical News and Price Classification Building Process.

FIG. 1B: Apparatus used by an analyst to visualize news articles and associated news classifications.

FIG. 1C: Apparatus used to gather a classification for a news article.

FIG. 2A: Prediction Model and 2-day price change distribution for a stock and for four classes of news.

FIG. 2B: Prediction Model and 2-day price change distributions for a stock and for days when good news appears.

FIG. 2C: Prediction Model and 2-day price change distributions for a stock and for days when bad news appears.

FIG. 2D: Prediction Model and 2-day price change distributions for a stock and for days when mixed news appears.

FIG. 2E: Prediction Model and 2-day price change distributions for a stock and for days when mention news appears.

FIG. 3: Prediction Model and 1-day price change distributions for a stock and for days when good news appears.

FIG. 4A: Daily Trade Decision Process for a Stock.

FIG. 4B: Apparatus used to predict price direction given prediction models.

FIG. 5: Sample Return Effectiveness based on 16 Stocks.

DETAILED DESCRIPTION OF THE INVENTION

The present invention, described herein, is for predicting short-term price fluctuations in domestic or international stocks. However, the present invention can be utilized for any financial instrument; therefore, it should be understood that the embodiment of this approach is not limited to applications in the stock market.

The salient distinction between this invention and previous approaches is the novel use of news as the input to the price prediction model. In embodiments of this invention, an analyst classifies or judges financial news articles using the following four classes or categories:

GOOD—good news, an event that improves the fundamental outlook of the company (ex: ‘results of a study that proved the high effectiveness of JNJ's coated stents, and cited it as likely the first to receive government approval’), better than expected earnings, a new contract, the expectation of new business, the acquiring of key personnel, etc.

BAD—bad news, something financially detrimental to the company or its industry, events such as extremely large litigation settlements, pipeline shutdowns due to indeterminately long political turmoil, unexpected poor earnings, loss of key clients, loss of key personnel, announcement of bankruptcy, unusual insider selling, etc.

MIXED—mixed news, some good and some bad news mixed in the same story, article not specifying why the price movement was contrary to what the fundamentals indicated (ex: while the earnings were bad year over year, they were better than consensus), bad earnings with expectation of good earnings growth, layoffs implying improved bottom line, loss of business and gain of new business, etc.

MENTION—mention news, the company's name is mentioned in an article in passing, (ex: ‘JNJ is the second largest pharmaceutical company, behind MRK’), a fundamental change in a company that was announced weeks ago, etc.

The judgements for stories are used for two purposes: 1) to build a price prediction model (see FIGS. 1A-C), and 2) to be used as input for a daily price prediction process for making actual trades (see FIGS. 4A-B). Clearly, the analyst's judgements are subjective, but it is assumed that the analyst is an expert and has experience in the financial markets, and has some specification for the guidelines of the different categories. The above set of classes or categories would be useful for stocks.

In one embodiment of the invention, analysts classified financial news stories that were available on the internet from various news feeds. The stories and articles were from the Associated Press and Reuters financial news wires and concerned publicly traded companies. For the purpose of this embodiment, a total of three analysts were used, each with a Master's degree in Business Administration and a background comprising several years of financial markets experience. They were given guidelines similar to those listed above. In this embodiment, classification was based on the impact of the event on the financial outlook of the company, and not whether the stock price would go up or down.

A price prediction model for a stock is determined using historical closing prices and a set of financial news judgements for the articles about the stock. The approach is illustrated in FIG. 1A. The first step is to use historical daily closing prices for the stock and determine a mean, μstock, and standard deviation, σstock, for the change in price for a stock. The change in price for a stock between times ti and tj is: (closing_price(tj) − closing_price(ti))/closing_price(ti). During the training period for the stock's price prediction model, distributions are gathered where tj−ti are 1 and 2 business days apart. The distribution of price changes is assumed to be approximately normal (bell shaped curve), and the distribution is represented as ˜N(μstock, σstock), or ˜N(μ, σ) stock as a shorthand.

For example, assume we have a stock with the following data:

Stock: XMPL

Date          Closing Price  News Class       1-Day Change in Price (%)  2-Day Change in Price (%)
Jan. 2, 2002  1.00           GOOD, 1 article
Jan. 3, 2002  1.50           BAD, 1 article   0.50
Jan. 4, 2002  1.25           No News          −0.17                      0.25
Jan. 5, 2002  2.00           No News          0.60                       0.33

EXAMPLE 1

The training period is Jan. 2, 2002-Jan. 5, 2002.

The distribution of the 1-day change in price of the stock in general is:


t1=0.5, t2=−0.17, and t3=0.6.

The distribution of the 2-day change in price of the stock in general is:


t2=0.25, and t3=0.33.

Incorporated herein are references to pages A-1 to A-3 of the Appendix, which provide a description and equations for calculating the mean and the standard deviation of a distribution.
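The price-change calculation for Example 1 can be sketched in a few lines of Python. This is only an illustrative sketch: the function name price_changes is hypothetical, the list of closing prices is assumed to hold one entry per consecutive business day, and the sample standard deviation (n−1 denominator) is an assumption, since the Appendix equations are not reproduced here.

```python
from statistics import mean, stdev


def price_changes(closes, horizon):
    """Relative change (close[j] - close[i]) / close[i] for j = i + horizon.

    `closes` is assumed to hold one closing price per consecutive business day.
    """
    return [
        (closes[i + horizon] - closes[i]) / closes[i]
        for i in range(len(closes) - horizon)
    ]


# Closing prices for the hypothetical stock XMPL from Example 1.
closes = [1.00, 1.50, 1.25, 2.00]

one_day = price_changes(closes, 1)  # changes over 1 business day
two_day = price_changes(closes, 2)  # changes over 2 business days

# Mean and sample standard deviation (n - 1 denominator, an assumption)
# describing the approximately normal distribution of price changes.
mu_1, sigma_1 = mean(one_day), stdev(one_day)
mu_2, sigma_2 = mean(two_day), stdev(two_day)

print("1-day changes:", [round(c, 2) for c in one_day], "mean:", round(mu_1, 3))
print("2-day changes:", [round(c, 2) for c in two_day], "mean:", round(mu_2, 3))
```

With only three and two observations respectively, the resulting means and standard deviations are illustrative only; a real training period would span many trading days.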

The apparatus for collecting analyst classifications via a website is illustrated in FIGS. 1B-C. A listing of news article titles for a company is displayed on the computer screen. In addition, each article has a graphic indicating the classification of the article, or a graphic indicating that the article needs to be classified. In one embodiment of the experiment (see FIG. 1B), an up arrow in a green box indicated the article was classified as good news, a down arrow in a red box indicated bad news. An up and down arrow in a yellow box indicated mixed news, a horizontal line in a gray box indicated mention news. If a J appeared in the box, the analyst clicked on the box and would enter the information required to maintain the classification for the stock's news over time. The apparatus in FIG. 1C is used to collect the classification for each article of news. The stock's ticker, the date/time of the article, the location of the article, and the analyst's classification are entered. When done, the analyst clicks the ‘Submit Judgement’ button on the graphic user interface. The classifications are used with daily price changes to build the price prediction model for a stock.

Price change distributions for the days when news appears are determined for each class or category of news. For example, if at t0, there existed an article assessed as “good news”, the price change between t0 and t1 becomes a member of the distribution for good news, which is assumed to be approximately normal and represented as ˜N(μgood, σgood). In addition, distributions ˜N(μbad, σbad), ˜N(μmixed, σmixed), and ˜N(μmention, σmention) are also determined for days where bad, mixed, and mention news appear in the news.

Referring to example 1 above:

The distribution of the 1-day change in price of the stock when good news appears is:


t1=0.5

The distribution of the 2-day change in price of the stock when good news appears is:


t2=0.25

The distribution of the 1-day change in price of the stock when bad news appears is:


t2=−0.17

The distribution of the 2-day change in price of the stock when bad news appears is:


t3=0.33
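The grouping of price changes by news class in Example 1 can be sketched as follows. The function name changes_by_news_class and the data layout (one set of news classes per trading day) are assumptions made for illustration; in particular, this sketch counts a day once per class regardless of how many articles of that class appeared.

```python
from collections import defaultdict


def changes_by_news_class(closes, news, horizon):
    """Collect, for each news class, the price changes that start on its news days.

    closes[i] is the closing price on day i and news[i] is the set of news
    classes observed on day i (an assumed layout). The change attributed to
    news on day i is (closes[i + horizon] - closes[i]) / closes[i].
    """
    samples = defaultdict(list)
    for i in range(len(closes) - horizon):
        change = (closes[i + horizon] - closes[i]) / closes[i]
        for news_class in news[i]:
            samples[news_class].append(change)
    return dict(samples)


# Example 1: good news on Jan. 2, bad news on Jan. 3, no news afterwards.
closes = [1.00, 1.50, 1.25, 2.00]
news = [{"good"}, {"bad"}, set(), set()]

one_day = changes_by_news_class(closes, news, horizon=1)
two_day = changes_by_news_class(closes, news, horizon=2)

print("1-day:", {k: [round(c, 2) for c in v] for k, v in one_day.items()})
print("2-day:", {k: [round(c, 2) for c in v] for k, v in two_day.items()})
```

Each resulting list is the sample from which the mean and standard deviation of the corresponding news-class distribution would be estimated.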

The five distributions are used to create the price prediction model. The price prediction model has four classifiers that produce buy, sell, and no-trade signals. There is one classifier Cclass for each news class, i.e., good, bad, mixed, and mention news. A classifier Cclass produces a buy signal for a news class if (˜N(μclass, σclass)≠˜N(μstock, σstock) and μclass>0), a sell signal if (˜N(μclass, σclass)≠˜N(μstock, σstock) and μclass<0), and a no-trade signal otherwise. Whether ˜N(μclass, σclass)≠˜N(μstock, σstock) holds is determined by a statistical hypothesis test that tests if the distributions are significantly different [2]. We refer to pages A-4, A-5, and A-6 in the Appendix, which describe a significance test to determine if the distributions are significantly different.

If the distributions are significantly different, then classifier Cclass will produce a buy signal when μclass>0, and a sell signal when μclass<0. If the distributions are not significantly different, or μclass=0, then classifier Cclass will produce a no-trade signal.
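The classifier rule just described can be sketched as a small function. The names Signal and classifier_signal are hypothetical, and the sketch assumes the significance test has already been reduced to a p-value that is compared against a threshold alpha (0.1 is used as the default here purely for illustration).

```python
from enum import Enum


class Signal(Enum):
    BUY = "buy"
    SELL = "sell"
    NO_TRADE = "no-trade"


def classifier_signal(mean_class, p_value, alpha=0.1):
    """Signal for one news class.

    mean_class: mean price change on days when this class of news appears.
    p_value:    result of a significance test comparing the news-class
                distribution with the distribution of the stock in general
                (assumed to be computed elsewhere).
    alpha:      significance threshold (an assumed default).
    """
    significantly_different = p_value < alpha
    if significantly_different and mean_class > 0:
        return Signal.BUY
    if significantly_different and mean_class < 0:
        return Signal.SELL
    # Not significantly different, or the mean is exactly zero.
    return Signal.NO_TRADE


print(classifier_signal(0.02, 0.03))    # significant, positive mean
print(classifier_signal(-0.013, 0.04))  # significant, negative mean
print(classifier_signal(-0.008, 0.45))  # not significant
```

Keeping the significance test outside the function leaves the choice of hypothesis test open, which matches the text's separation between the test and the sign of the mean.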

When the price distribution of the class of news is statistically-significantly different than the price distribution of the stock in general, i.e., (˜N(μclass, σclass)≠˜N(μstock, σstock)), it implies that μclass≠μstock above and beyond random chance. In terms of price movement, it implies that, on average, the change in price of a stock will be μclass when articles from the news class appear, and not μstock. For example, if a stock has moved up on average 2% in one day when good news appears, and in general the stock has historically moved 0.01% a day, knowing this information implies that an investor can improve upon a buy and hold return strategy for the stock by investing only on the days when good news appears. If this event occurred 5 times in the course of a year, the investor would have an estimated return of 10%. The buy and hold strategy has an estimated return of roughly 2.8%.

In one embodiment of the invention (see FIGS. 2A-2E), price change distributions for Boeing were calculated for the trading days between Jun. 30, 1999, and Aug. 31, 2001 for every tj−ti=2 business days. In addition, distributions for the 2-day price changes were collected for the four news classes good, bad, mixed, and mention. The five distributions are specified in the legend of the graph in FIG. 2A, and shown individually relative to the 2-day changes in price of Boeing's stock in FIGS. 2B-2E.

In FIG. 2B, the distributions of 2-day price changes between Jun. 30, 1999 and Aug. 31, 2001 are plotted for Boeing in general (white area) and for days when good news appears (grey area within white area). For example, there were 2 occurrences of Boeing's stock going down 7.5% over a 2-day period when good news was reported. On average, over a 2-day period when good news appeared on day ti, the stock went down 1.3% with a s.d. of 3.0%. The stock of Boeing was up an average of 0.1% over a 2-day period independent of the type of news reported. The standard deviation for the 2-day price change of the stock in general was 3.2% and is listed with the distribution based on good news in the legend of FIG. 2B. In FIG. 2C the distribution of 2-day price changes is graphed when bad news was reported. The stock went down an average of 1.9% with a s.d. of 3.3%. In FIG. 2D the distribution for mixed news appears, and the stock went down an average of 0.8% with a s.d. of 2.8%. In FIG. 2E the distribution of 2-day price changes when the stock is mentioned has an average of −0.7% with a s.d. of 3.6%. Note that the distributions for mixed and mention news are sparse and they only contain a few articles in the sample of articles available during this time period.

In this embodiment of the invention, a two-sample t test with unequal variance [2] was used, with α<0.1 as the threshold, to determine whether there was a significant difference between the sample distributions of 1 and 2-day price changes for the stock and the sample distribution of 1 and 2-day price changes for the stock when news from a particular news class appears. The 2-day distributions for Boeing are illustrated in FIGS. 2A-E. The four classifiers that make up the 2-day prediction model for Boeing are depicted in the legend of FIG. 2A. The distributions for good news (FIG. 2B) and bad news (FIG. 2C) were significantly different than the distribution of 2-day price changes for the stock in general. Since the means of the good and bad news price distributions are negative, the predictions for their associated classifiers will both be sell signals. The mixed news (FIG. 2D) and mention news (FIG. 2E) distributions were not significantly different than the distribution of the stock in general, and their classifiers in the 2-day prediction model for Boeing will produce no-trade signals.
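A two-sample t test with unequal variance (commonly known as Welch's t test) can be sketched with the Python standard library alone. This is an illustrative sketch, not the procedure from the Appendix: the function names are hypothetical, the two-sided p-value is obtained by numerically integrating the Student t density with Simpson's rule, and each sample is assumed to contain at least two observations with non-zero combined variance.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    va = variance(sample_a) / len(sample_a)  # squared standard error of mean a
    vb = variance(sample_b) / len(sample_b)  # squared standard error of mean b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (
        va ** 2 / (len(sample_a) - 1) + vb ** 2 / (len(sample_b) - 1)
    )
    return t, df


def t_pdf(x, df):
    """Density of Student's t distribution (lgamma avoids overflow for large df)."""
    log_norm = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t)
    if upper == 0.0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    area = total * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1.0 - 2.0 * area)


def significantly_different(class_changes, stock_changes, alpha=0.1):
    """True when the two samples differ at significance level alpha."""
    t, df = welch_t(class_changes, stock_changes)
    return two_sided_p(t, df) < alpha


# Made-up numbers purely to exercise the functions.
t, df = welch_t([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 4.0, 6.0, 8.0, 10.0])
print(round(t, 3), round(df, 3), round(two_sided_p(t, df), 3))
```

Where SciPy is available, scipy.stats.ttest_ind with equal_var=False performs the same test and could replace the hand-written integration.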

In another embodiment of the invention, the 1-day price change distributions for Boeing were calculated for the trading days between Jun. 30, 1999, and Aug. 31, 2001 for every tj−ti=1 business day. In addition, distributions for the 1-day price changes were collected for the four news classes good, bad, mixed, and mention. The distributions of 1-day price changes between Jun. 30, 1999 and Aug. 31, 2001 are plotted for Boeing in general (white area) and for days when bad news appears (black area within white area). On average, over a 1-day period when bad news appeared on day ti, the stock went down 1.2% with a s.d. of 2.3%. The stock of Boeing was up an average of 0.06% over a 1-day period in general. In this embodiment of the invention, bad news gave rise to a classifier with a sell signal, because the distribution of price changes when bad news appeared was statistically-significantly different based on the t test described above, and the other classes of news gave rise to classifiers producing no-trade signals.

In another embodiment of the invention (see FIG. 3), the same parameters for the t test were used to determine that a buy signal is predicted when stories discussing good news about AT&T appear. This was the case since the 1-day distribution of price changes for AT&T in general is statistically-significantly different than the 1-day distribution of price changes for days when articles containing good news appear. The bad, mixed and mention distributions resulted in classifiers that produce no-trade signals.

The daily price prediction process (see FIG. 4A), which can be used for making actual trades, uses the price prediction model for a stock, which is described above, and the analysts' classifications for news stories that appeared between the previous day's market close and the time of prediction. The articles about the stock during this time period are considered the stock's daily financial news. Each article is categorized by an analyst, and gives rise to one buy, sell, or no-trade signal. In an embodiment of the invention, the stock is purchased and held for 1 day (when tj−ti=1) if the number of buy signals is greater than the number of sell and no-trade signals combined. The stock is sold short and the trade unwound after 1 day (when tj−ti=1) if the number of sell signals is greater than the number of buy and no-trade signals combined.

In general, once the price prediction models are calculated for a financial instrument, it is straightforward to apply the price prediction model. The daily news for the financial instrument is categorized into good, bad, mixed, and mention news. Each article produces a trade signal depending on its news class and its associated classifier in the prediction model. If the number of buy signals exceeds the number of sell and no-trade signals, then the instrument is purchased and then sold in 1 or 2 days (depending on the number of days used to gather the distributions). If the number of sell signals exceeds the number of buy and no-trade signals, the instrument is sold short and then repurchased in 1 or 2 days. A no-trade decision is made when neither a buy nor a sell decision is predicted.

For example, once the prediction models for Boeing are determined (see FIGS. 2B-2E), the apparatus in FIG. 4B, which is an embodiment of the invention, can be used to predict future 1-day and 2-day price movements given the stock's prediction model. As depicted in FIGS. 2B-E, Boeing resulted in a prediction model for tj−ti=1 such that a sell signal results for good news and no-trade signals result for all other classifications of news. The prediction model for tj−ti=2 (see FIG. 2B) is such that a sell signal results for good and bad news and no-trade signals result for mixed and mention news. If Boeing had 4 good, 1 bad, 1 mixed, and 1 mention news article appear between the previous market close and the time of trading, then the prediction would be to sell Boeing and unwind the trade in 1 day, and also (in a separate trade) to sell Boeing and unwind the trade in 2 days.
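The daily decision rule and the Boeing example above can be sketched as follows. The function name daily_decision and the dictionary representation of a prediction model (news class mapped to a signal) are assumptions for illustration; the signals in the two models are taken from the paragraph above.

```python
from collections import Counter


def daily_decision(model, article_classes):
    """Combine per-article signals into one trade decision.

    model:           maps a news class to "buy", "sell", or "no-trade"
                     (an assumed representation of the prediction model).
    article_classes: the news class of each article in the daily news.
    """
    signals = Counter(model.get(c, "no-trade") for c in article_classes)
    buys, sells, holds = signals["buy"], signals["sell"], signals["no-trade"]
    if buys > sells + holds:
        return "buy"
    if sells > buys + holds:
        return "sell"
    return "no-trade"


# Prediction models for Boeing as stated in the example above.
one_day_model = {"good": "sell", "bad": "no-trade",
                 "mixed": "no-trade", "mention": "no-trade"}
two_day_model = {"good": "sell", "bad": "sell",
                 "mixed": "no-trade", "mention": "no-trade"}

# Daily news: 4 good, 1 bad, 1 mixed, and 1 mention article.
daily_news = ["good"] * 4 + ["bad", "mixed", "mention"]

print("1-day decision:", daily_decision(one_day_model, daily_news))
print("2-day decision:", daily_decision(two_day_model, daily_news))
```

In the 1-day model the four good articles yield 4 sell signals against 3 no-trade signals, and in the 2-day model the good and bad articles yield 5 sell signals against 2 no-trade signals, so both decisions are to sell.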

One embodiment of this invention assumes that prediction and trading will occur a few minutes before the 4 pm stock market close of the current day. It was run for 16 stocks and their price prediction models were determined using distributions for 1 and 2-day price changes. The stock prediction models were based on historical closing prices and financial news occurring on the trading days between Jun. 30, 1999, and Aug. 31, 2001. The results are presented in FIG. 5. In total, 40 trades were predicted for the time period between Sep. 4, 2001 and Sep. 28, 2001. The average buy and hold return for this period was −11.37%, and the average prediction model return, or the resulting return using an embodiment of the invention was 2.82% in the same period. The results suggest that using this invention produces a significantly greater return on investment than a buy and hold strategy. The data also suggests that for some stocks, there exists a correlation between the price movement of the stock, and the appearance of good, bad, mixed, and mention news.

Although the invention has been described and illustrated in the context of stocks, it is to be clearly understood that the same is intended by way of illustration and example only, and is not to be taken by way of limitation. The spirit and scope of this invention is also applicable to financial instruments of any kind that are affected by publicly available news.

Claims

1-28. (canceled)

29. A method of providing financial news items and significance of the financial news items, the method comprising the steps of:

receiving financial news items for one or more financial instruments associated with business entities;
classifying the financial news items into categories that include: a positive category indicating that a particular news item is favorable to the financial outlook of the associated business entity; and a negative category indicating that a particular news item is unfavorable to the financial outlook of the associated business entity;
sending the financial news items to a computer of a user for display on the user computer; and
sending the classified categories of the financial news items to the user computer for display on the user computer near the respective news items.

30. The method according to claim 29, wherein:

each category is represented by a graphic symbol; and
the step of sending the classified categories includes sending the associated graphic symbols.

31. The method according to claim 30, wherein:

a graphic symbol for the positive category includes an arrow pointing upwards; and
a graphic symbol for the negative category includes an arrow pointing downwards.

32. The method according to claim 29, wherein the categories include a mixed category indicating that a particular news item contains both a favorable and unfavorable financial outlooks of the associated business entity.

33. The method according to claim 29, wherein the categories include a mention category indicating that a particular news item mentions the associated business entity.

34. The method according to claim 29, wherein the categories include a mention category indicating that a particular news item mentions the associated publicly traded company.

35. A method of providing financial news items and significance of the financial news items for use in making trading decisions of financial instruments representing publicly traded companies, the method comprising the steps of:

receiving financial news items for a plurality of financial instruments representing a plurality of publicly traded companies;
classifying the financial news items into categories that indicate the impact of the news items on the financial outlook of the associated publicly traded companies;
sending the financial news items to a computer of a user for display on the user computer; and
sending the classified categories of the financial news items to the user computer for display on the user computer near the respective news items for analysis by the user in making trading decisions of the financial instruments.

36. The method according to claim 35, wherein the categories include:

a positive category indicating that a particular news item is favorable to the financial outlook of the associated publicly traded companies; and
a negative category indicating that a particular news item is unfavorable to the financial outlook of the associated publicly traded companies.

37. The method according to claim 35, wherein:

each category is represented by a graphic symbol; and
the step of sending the classified categories includes sending the associated graphic symbols.

38. The method according to claim 37, wherein:

a graphic symbol for the positive category includes an arrow pointing upwards; and
a graphic symbol for the negative category includes an arrow pointing downwards.

39. The method according to claim 35, wherein the categories include a mixed category indicating that a particular news item contains both a favorable and unfavorable financial outlooks of the publicly traded company.

40. The method according to claim 35 wherein the categories include a mention category indicating that a particular news item mentions the associated publicly traded company.

Patent History
Publication number: 20090055324
Type: Application
Filed: Jul 11, 2008
Publication Date: Feb 26, 2009
Inventor: Ron Papka (Short Hills, NJ)
Application Number: 12/218,216
Classifications
Current U.S. Class: 705/36.0R
International Classification: G06Q 40/00 (20060101);