Patents by Inventor Philip Shi-lung Yu

Philip Shi-lung Yu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7548937
    Abstract: A computer implemented method, apparatus, and computer usable program code for processing multi-way stream correlations. Stream data are received for correlation. A task is formed for continuously partitioning a multi-way stream correlation workload into smaller workload pieces. Each of the smaller workload pieces may be processed by a single host. The stream data are sent to different hosts for correlation processing.
    Type: Grant
    Filed: May 4, 2006
    Date of Patent: June 16, 2009
    Assignee: International Business Machines Corporation
    Inventors: Xiaohui Gu, Haixun Wang, Philip Shi-lung Yu
  • Publication number: 20090100014
    Abstract: Techniques are disclosed for adaptive source filtering and load shedding in data stream processing systems. For example, in one aspect of the invention, a method for use in filtering data in a distributed data stream processing system, wherein a server receives and processes one or more data streams from one or more data sources, comprises the steps of the server periodically re-configuring one or more filters and sending the one or more periodically re-configured filters to the one or more data sources, and the one or more data sources performing data filtering based on the one or more periodically re-configured filters received from the server.
    Type: Application
    Filed: October 10, 2007
    Publication date: April 16, 2009
    Inventors: Bugra Gedik, Kun-Lung Wu, Philip Shi-lung Yu
  • Publication number: 20090094265
    Abstract: A system and method for rights protection of a dataset that includes multiple trajectory objects includes determining an intensity power for embedding a watermarking key in a data trajectory. The data trajectory is modified to embed a watermarking key at the intensity power such that the intensity power guarantees an original pair-wise relationship between distance-based neighboring objects before and after embedding of the key such that a modified trajectory provides a watermarked version of the data trajectory.
    Type: Application
    Filed: October 5, 2007
    Publication date: April 9, 2009
    Inventors: Michail Vlachos, Philip Shi-Lung Yu
  • Publication number: 20090086755
    Abstract: Systems and methods for the identification of correlated burst events among two or more data streams, given one or more specific query time spans, are disclosed. Also broadly contemplated is the act of finding, from one or more data streams, those streams that have correlated burst events with another given data stream within a time span.
    Type: Application
    Filed: July 22, 2008
    Publication date: April 2, 2009
    Inventors: Shyh-Kwei Chen, Michail Vlachos, Kun-Lung Wu, Philip Shi-lung Yu
  • Publication number: 20090077148
    Abstract: Techniques for perturbing an evolving data stream are provided. The evolving data stream is received. An online linear transformation is applied to received values of the evolving data stream generating a plurality of transform coefficients. A plurality of significant transform coefficients are selected from the plurality of transform coefficients. Noise is embedded into each of the plurality of significant transform coefficients, thereby perturbing the evolving data stream. A total noise variance does not exceed a defined noise variance threshold.
    Type: Application
    Filed: September 14, 2007
    Publication date: March 19, 2009
    Inventors: Philip Shi-Lung Yu, Spyridon Papadimitriou
  • Patent number: 7487167
    Abstract: A technique for classifying data from a test data stream is provided. A stream of training data having class labels is received. One or more class-specific clusters of the training data are determined and stored. At least one test instance of the test data stream is classified using the one or more class-specific clusters.
    Type: Grant
    Filed: May 31, 2007
    Date of Patent: February 3, 2009
    Assignee: International Business Machines Corporation
    Inventors: Charu C. Aggarwal, Philip Shi-Lung Yu
  • Publication number: 20090024618
    Abstract: The present invention provides an index structure for managing weighted-sequences in large databases. A weighted-sequence is defined as a two-dimensional structure in which each element in the sequence is associated with a weight. A series of network events, for instance, is a weighted-sequence because each event is associated with a timestamp. Querying a large sequence database by events' occurrence patterns is a first step towards understanding the temporal causal relationships among the events. The index structure proposed herein enables the efficient retrieval from the database of all subsequences (contiguous and non-contiguous) that match a given query sequence both by events and by weights. The index structure also takes into consideration the nonuniform frequency distribution of events in the sequence data.
    Type: Application
    Filed: August 26, 2008
    Publication date: January 22, 2009
    Inventors: Wei Fan, Chang-Shing Perng, Haixun Wang, Philip Shi-Lung Yu
  • Patent number: 7480913
    Abstract: The present invention relates to the problem of scheduling work for employees and/or other resources in a help desk or similar environment. The employees have different levels of training and availabilities. The jobs, which occur as a result of dynamically occurring events, consist of multiple tasks ordered by chain precedence. Each job and/or task carries with it a penalty which is a step function of the time taken to complete it, the deadlines and penalties having been negotiated as part of one or more service level agreement contracts. The goal is to minimize the total amount of penalties paid. The invention consists of a pair of heuristic schemes for this difficult scheduling problem, one greedy and one randomized. The greedy scheme is used to provide a quick initial solution, while the greedy and randomized schemes are combined in order to think more deeply about particular problem instances.
    Type: Grant
    Filed: September 9, 2003
    Date of Patent: January 20, 2009
    Assignee: International Business Machines Corporation
    Inventors: Melissa Jane Buco, Rong Nickle Chang, Laura Zaihua Luan, Christopher Ward, Joel Leonard Wolf, Philip Shi-lung Yu
  • Patent number: 7454410
    Abstract: A Web crawler data collection method is provided for collecting information associated with a plurality of queries, which is used to calculate estimates of return probabilities, clicking probabilities and incorrect response probabilities. The estimated return probabilities relate to a probability that a search engine will return a particular Web page in a particular position of a particular query result page. The estimated clicking probabilities relate to a frequency with which a client selects a returned Web page in a particular position of a particular query result. The estimated incorrect response probabilities relate to the probability that a query to a stale version of a particular Web page yields an incorrect or vacuous response. Further, information may be collected regarding the characteristics and update time distributions of a plurality of Web pages.
    Type: Grant
    Filed: May 9, 2003
    Date of Patent: November 18, 2008
    Assignee: International Business Machines Corporation
    Inventors: Mark Steven Squillante, Joel Leonard Wolf, Philip Shi-Lung Yu
  • Publication number: 20080250265
    Abstract: A system and method for using continuous failure predictions for proactive failure management in distributed cluster systems includes a sampling subsystem configured to continuously monitor and collect operation states of different system components. An analysis subsystem is configured to build classification models to perform on-line failure predictions. A failure prevention subsystem is configured to take preventive actions on failing components based on failure warnings generated by the analysis subsystem.
    Type: Application
    Filed: April 5, 2007
    Publication date: October 9, 2008
    Inventors: Shu-Ping Chang, Xiaohui Gu, Spyridon Papadimitriou, Philip Shi-lung Yu
  • Publication number: 20080234977
    Abstract: Methods and apparatus are provided for outlier detection in databases by determining sparse low dimensional projections. These sparse projections are used for the purpose of determining which points are outliers. The methodologies of the invention are very relevant in providing a novel definition of exceptions or outliers for the high dimensional domain of data.
    Type: Application
    Filed: June 6, 2008
    Publication date: September 25, 2008
    Applicant: International Business Machines Corporation
    Inventors: Charu C. Aggarwal, Philip Shi-Lung Yu
  • Patent number: 7418455
    Abstract: The present invention provides an index structure for managing weighted-sequences in large databases. A weighted-sequence is defined as a two-dimensional structure in which each element in the sequence is associated with a weight. A series of network events, for instance, is a weighted-sequence because each event is associated with a timestamp. Querying a large sequence database by events' occurrence patterns is a first step towards understanding the temporal causal relationships among the events. The index structure proposed herein enables the efficient retrieval from the database of all subsequences (contiguous and non-contiguous) that match a given query sequence both by events and by weights. The index structure also takes into consideration the nonuniform frequency distribution of events in the sequence data.
    Type: Grant
    Filed: November 26, 2003
    Date of Patent: August 26, 2008
    Assignee: International Business Machines Corporation
    Inventors: Wei Fan, Chang-Shing Perng, Haixun Wang, Philip Shi-Lung Yu
  • Publication number: 20080189494
    Abstract: A method (and system) of storing data in a value-based storage system includes optimizing a value of data stored in the value-based storage system.
    Type: Application
    Filed: April 3, 2008
    Publication date: August 7, 2008
    Applicant: International Business Machines Corporation
    Inventors: Nikhil Bansal, Frederick Douglis, Lisa Karen Fleischer, Kirsten Weale Hildrum, Akshay Kumar Reddy Katta, John Davis Palmer, Elizabeth Suzanne Richards, David Tao, William Harold Tetzlaff, Joel Leonard Wolf, Philip Shi-lung Yu
  • Publication number: 20080189241
    Abstract: There are provided a method, a computer program product, and a system for maintaining a materialized view defined on a relation of a relational database. The method includes the step of performing content-based filtering on the relation to identify an update to the relation as being irrelevant with respect to the materialized view.
    Type: Application
    Filed: April 2, 2008
    Publication date: August 7, 2008
    Inventors: Gang Luo, Philip Shi-lung Yu
  • Publication number: 20080172355
    Abstract: A system and method for processing continual queries includes partitioning an entire monitoring area into regions of different size based upon node and query densities, and deciding an amount of information updates to be received from nodes in the regions based upon load conditions in each region. Information updates are received based on the amount of information updates determined for each region.
    Type: Application
    Filed: January 12, 2007
    Publication date: July 17, 2008
    Inventors: Bugra Gedik, Kun-Lung Wu, Philip Shi-lung Yu
  • Patent number: 7395250
    Abstract: Methods and apparatus are provided for outlier detection in databases by determining sparse low dimensional projections. These sparse projections are used for the purpose of determining which points are outliers. The methodologies of the invention are very relevant in providing a novel definition of exceptions or outliers for the high dimensional domain of data.
    Type: Grant
    Filed: October 11, 2000
    Date of Patent: July 1, 2008
    Assignee: International Business Machines Corporation
    Inventors: Charu C. Aggarwal, Philip Shi-Lung Yu
  • Patent number: 7379939
    Abstract: A technique for classifying data from a test data stream is provided. A stream of training data having class labels is received. One or more class-specific clusters of the training data are determined and stored. At least one test instance of the test data stream is classified using the one or more class-specific clusters.
    Type: Grant
    Filed: June 30, 2004
    Date of Patent: May 27, 2008
    Assignee: International Business Machines Corporation
    Inventors: Charu C. Aggarwal, Philip Shi-Lung Yu
  • Publication number: 20080082475
    Abstract: A system and method for resource adaptive classification of data streams. Embodiments of systems and methods provide classifying data received in a computer, including discretizing the received data, constructing an intermediate data structure from said received data as training instances, performing subspace sampling on said received data as test instances and adaptively classifying said received data based on statistics of said subspace sampling.
    Type: Application
    Filed: September 12, 2006
    Publication date: April 3, 2008
    Applicant: International Business Machines Corporation
    Inventors: Charu C. Aggarwal, Philip Shi-lung Yu
  • Patent number: 7353218
    Abstract: A technique of clustering data of a data stream is provided. Online statistics are first created from the data stream. Offline processing of the online statistics is then performed when offline processing is either required or desired. Online statistics may be created through the reception of data points from the data stream and the formation and updating of data groups. Offline processing may be performed by reclustering groups of data points around sampled data points and reporting the newly formed clusters.
    Type: Grant
    Filed: August 14, 2003
    Date of Patent: April 1, 2008
    Assignee: International Business Machines Corporation
    Inventors: Charu C. Aggarwal, Philip Shi-Lung Yu
  • Publication number: 20080071721
    Abstract: A system and method for learning models from scarce and/or skewed training data includes partitioning a data stream into a sequence of time windows. A most likely current class distribution to classify portions of the data stream is determined based on observing training data in a current time window and based on concept drift probability patterns using historical information.
    Type: Application
    Filed: August 18, 2006
    Publication date: March 20, 2008
    Inventors: Haixun Wang, Jian Yin, Philip Shi-lung Yu