Data Mining Patents (Class 707/776)
  • Patent number: 8682926
    Abstract: A system, method, and computer program product are provided for augmenting a user profile utilizing data collected in association with a social network service profile. In use, a first profile of a user managed by an entity is identified. Additionally, a second profile of the user maintained by a social network service separate from the entity is identified. Further, data associated with the second profile of the user is collected. Moreover, the first profile of the user is augmented utilizing the data.
    Type: Grant
    Filed: December 19, 2012
    Date of Patent: March 25, 2014
    Assignee: Amdocs Software Systems Limited
    Inventors: Amit Braytenbaum, Oren Agassy
  • Patent number: 8676839
    Abstract: An information processing device includes a range of activity information acquisition unit which acquires information of the range of activity that is an area that a user who contributes contribution information on a facility visits, and a reliability level evaluation unit which evaluates the level of reliability for the contribution information based on the information of the range of activity.
    Type: Grant
    Filed: August 11, 2011
    Date of Patent: March 18, 2014
    Assignee: Sony Corporation
    Inventor: Satoshi Suzuno
  • Patent number: 8676840
    Abstract: A network's evolution is characterized by graph evolution rules. A graph, formed by merging multiple graphs representing the multiple snapshots of the network, that represents an evolutionary network is mined to identify evolutional patterns of the network. A pattern is selected from the identified patterns. Graph evolution rules are generated using identified evolutional patterns. The generated graph evolution rules represent the evolutional patterns of the network, the rules indicating that any occurrence of a child pattern of the selected pattern implies a corresponding occurrence of the selected pattern.
    Type: Grant
    Filed: June 6, 2012
    Date of Patent: March 18, 2014
    Assignee: Yahoo! Inc.
    Inventors: Francesco Bonchi, Aristides Gionis, Michele Berlingerio, Björn Bringmann
  • Publication number: 20140074885
    Abstract: A processor-implemented method, system, and/or computer program product generates and utilizes synthetic context-based objects. A non-contextual data object is associated with a context object to define a synthetic context-based object, where the non-contextual data object ambiguously relates to multiple subject-matters, and where the context object provides a context that identifies a specific subject-matter, from the multiple subject-matters, of the non-contextual data object. The synthetic context-based object is then associated with at least one specific data store, which includes data that is associated with data contained in the non-contextual data object and the context object. A request for a data store that is associated with the synthetic context-based object results in the return of at least one data store that is associated with the synthetic context-based object.
    Type: Application
    Filed: September 11, 2012
    Publication date: March 13, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: SAMUEL S. ADAMS, ROBERT R. FRIEDLANDER, JAMES R. KRAEMER
  • Patent number: 8671111
    Abstract: A method includes providing a columnar database comprising a plurality of columnar data structures associated with one column attribute; providing first data records having a plurality of first attribute-value pairs comprising counting information indicative of a number of first data records having the respective first attribute-value pair; providing mask data structures comprising one or more second attribute-value pairs; selecting second data records by intersecting the columnar data structures and the mask data structures; selecting one of the column attributes and one value contained in the column data structure associated with said selected column attribute as the destination attribute-value pair; creating one second rule for each first attribute-value pair; calculating, for each second rule, a co-occurrence-count between its respective source attribute-value pair and its destination attribute-value pair; and specifically selecting one or more of said second rules as the first rules in dependence on the
    Type: Grant
    Filed: May 7, 2012
    Date of Patent: March 11, 2014
    Assignee: International Business Machines Corporation
    Inventors: Patrick Dantressangle, Eberhard Hechler, Martin Oberhofer, Michael Wurst
  • Patent number: 8670968
    Abstract: A method for training a ranking application. The method includes ranking the help postings to create an initial ranking using initial parameter values, and storing user interactions with the help postings to obtain stored interactions. Simulations are performed using the stored interactions to generate revised parameter values for the ranking application. Performing the simulations includes calculating relevance values from the stored interactions, creating a test posting, assigning, to the test posting, an initial score and a relevance value randomly selected from the relevance values to generate a test ranking, and simulating user interactions with the test ranking to generate simulated rankings. The simulated rankings are analyzed to obtain revised parameter values. The method further includes ranking, using the revised parameter values, the help postings to generate a revised ranking, and displaying the help postings in the forum according to the revised ranking.
    Type: Grant
    Filed: August 31, 2012
    Date of Patent: March 11, 2014
    Assignee: Intuit Inc.
    Inventors: Igor A. Podgorny, Floyd J. Morgan, Derek Szydlowski
  • Patent number: 8671093
    Abstract: Approaches and techniques are discussed for ranking the documents indicated in search results for a query based on click-through information collected for the query in previous query sessions. According to an embodiment of the invention, when calculating a relevance score for a particular document, one may overcome positional bias by utilizing click-through information about other documents previously returned in the same search results as the particular document. According to an embodiment, one may utilize a Dynamic Bayesian Network, based on said click-through information, to model relevance. According to an embodiment of the invention, one may utilize click-through information to generate targets for learning a ranking function.
    Type: Grant
    Filed: November 18, 2008
    Date of Patent: March 11, 2014
    Assignee: Yahoo! Inc.
    Inventors: Olivier Chapelle, Anne Ya Zhang
  • Patent number: 8666785
    Abstract: A method and system is provided for validating claim submissions against a claim policy that can perform a comparative analysis by comparing structured or unstructured claim submissions to semantically structured policies to direct and optimize processing of the claim submission. A method and system is also provided for enabling semantic interoperability across different proprietary electronic transaction records. Semantic queries and semantic analysis can be performed on a collection of electronic transaction records originating from different proprietary systems.
    Type: Grant
    Filed: July 28, 2011
    Date of Patent: March 4, 2014
    Assignee: Wairever Inc.
    Inventors: Wasyl Baluta, Shafquat Mahmud
  • Patent number: 8666998
    Abstract: A method, system and computer program product provides a first characteristic associated with a first data set and a single data value, and a second characteristic associated with a second data set; and calculates at least one of: 1) the similarity of the first data set with the second data set based on the first and second characteristics, 2) the similarity of the first data set with the single data value based on the first characteristic and the single data value, 3) confidence indicating how well the first characteristic reflects properties of the first data set based on the first characteristic, and 4) confidence indicating how well the similarity of the first data set with the single data value reflects properties of the single data value based on the first characteristic and the single data value.
    Type: Grant
    Filed: June 30, 2011
    Date of Patent: March 4, 2014
    Assignee: International Business Machines Corporation
    Inventors: Sebastian Nelke, Martin A Oberhofer, Yannick Saillet, Jens Seifert
  • Patent number: 8666913
    Abstract: The present invention relates to a method of checking data gathered from a content source comprising: receiving initial data from the content source; training a data profiler to generate a set of trusted constraint modules, said training comprising (1) selecting constraint modules having parameters that are applicable to the initial data, (2) adjusting the parameters of the applicable constraint modules to conform with new data from the content source, (3) identifying non-stable constraint modules, and (4) generating a set of trusted constraint modules by removing the non-stable constraint modules; applying the set of trusted constraint modules to subsequently received data from the content source to determine whether the subsequently received data meets the parameters of the set of trusted constraint modules; and signaling a failure upon the subsequently received data failing to meet the parameters of the set of trusted constraint modules.
    Type: Grant
    Filed: November 12, 2010
    Date of Patent: March 4, 2014
    Assignee: Connotate, Inc.
    Inventors: Vincent Sgro, Aravind Kalyan Varma Datla, HimaBindu Datla
  • Patent number: 8661067
    Abstract: Various embodiments for optimizing data migration and recall in a computing storage environment by a processor device are provided. Data stored in the computing storage environment is analyzed over a predetermined period of time to identify a usage pattern of a portion of the data. The portion of the data having the usage pattern is recalled in advance of a usage time, the usage time predicted by the usage pattern for the portion of the data to be accessed.
    Type: Grant
    Filed: October 13, 2010
    Date of Patent: February 25, 2014
    Assignee: International Business Machines Corporation
    Inventors: Douglas L. Lehr, Franklin E. McCune, David C. Reed, Max D. Smith
  • Patent number: 8661039
    Abstract: A process for evaluating cross-domain clusterability upon a target domain and a source domain. The cross-domain clusterability is calculated as a linear combination of a target clusterability and a source-target pair matchability, by use of a trade-off parameter that determines relative contribution of the target clusterability and the source-target pair matchability. The target clusterability quantifies how clusterable the target domain is. The source-target pair matchability is calculated as an average of a target-side matchability and a source-side matchability, which quantifies how well target centroids of the target domain are aligned with the source centroids and how well source centroids of the source domain are aligned with the target centroids, respectively.
    Type: Grant
    Filed: April 2, 2012
    Date of Patent: February 25, 2014
    Assignee: International Business Machines Corporation
    Inventors: Jeffrey M. Achtermann, Indrajit Bhattacharya, Kevin W. English, Jr., Shantanu R. Godbole, Sachindra Joshi, Ashwin Srinivasan, Ashish Verma
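The combination described in the abstract above can be sketched as follows. This is a minimal illustration, not the patented method: the abstract does not give the exact formula, so the convex form `tradeoff * a + (1 - tradeoff) * b`, the function names, and the assumption that all input scores are already computed are hypothetical.

```python
def pair_matchability(target_side: float, source_side: float) -> float:
    """Average of the target-side and source-side matchability scores."""
    return (target_side + source_side) / 2.0


def cross_domain_clusterability(
    target_clusterability: float,
    target_side: float,
    source_side: float,
    tradeoff: float = 0.5,
) -> float:
    """Linear combination of target clusterability and pair matchability.

    ASSUMPTION: the trade-off parameter lies in [0, 1] and weights the two
    terms as a convex combination; the abstract only says it sets their
    relative contribution.
    """
    if not 0.0 <= tradeoff <= 1.0:
        raise ValueError("tradeoff must be in [0, 1]")
    matchability = pair_matchability(target_side, source_side)
    return tradeoff * target_clusterability + (1.0 - tradeoff) * matchability
```

With `tradeoff=1.0` only the target clusterability counts; with `tradeoff=0.0` only the source-target pair matchability counts.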
  • Publication number: 20140052758
    Abstract: Techniques are presented for providing a software fitting assessment. The techniques may be performed by methods, apparatus, and/or computer program products. The techniques include automatically matching on a computer system one or more specified requirements for a project with one or more software functions stored in a repository. The automatically matching includes mining the repository in order to match requirements. The repository includes software functions, requirements accumulated from previous projects, and results of stored matches between the software functions and the requirements accumulated from previous projects. The techniques include outputting by the computer system one or more results of the matching.
    Type: Application
    Filed: September 12, 2012
    Publication date: February 20, 2014
    Applicant: International Business Machines Corporation
    Inventors: Matthew J. Callery, Michael Desmond, Sophia Krasikov, Harold L. Ossher, Edith Schonberg, Harini Srinivasan
  • Publication number: 20140052757
    Abstract: Techniques are presented for providing a software fitting assessment. The techniques may be performed by methods, apparatus, and/or computer program products. The techniques include automatically matching on a computer system one or more specified requirements for a project with one or more software functions stored in a repository. The automatically matching includes mining the repository in order to match requirements. The repository includes software functions, requirements accumulated from previous projects, and results of stored matches between the software functions and the requirements accumulated from previous projects. The techniques include outputting by the computer system one or more results of the matching.
    Type: Application
    Filed: August 17, 2012
    Publication date: February 20, 2014
    Applicant: International Business Machines Corporation
    Inventors: Matthew J. Callery, Michael Desmond, Sophia Krasikov, Harold L. Ossher, Edith Schonberg, Harini Srinivasan
  • Patent number: 8655884
    Abstract: A computer system for evaluating cross-domain clusterability upon a target domain and a source domain. The cross-domain clusterability is calculated as a linear combination of a target clusterability and a source-target pair matchability, by use of a trade-off parameter that determines relative contribution of the target clusterability and the source-target pair matchability. The target clusterability quantifies how clusterable the target domain is. The source-target pair matchability is calculated as an average of a target-side matchability and a source-side matchability, which quantifies how well target centroids of the target domain are aligned with the source centroids and how well source centroids of the source domain are aligned with the target centroids, respectively.
    Type: Grant
    Filed: March 29, 2012
    Date of Patent: February 18, 2014
    Assignee: International Business Machines Corporation
    Inventors: Jeffrey M. Achtermann, Indrajit Bhattacharya, Kevin W. English, Jr., Shantanu R. Godbole, Sachindra Joshi, Ashwin Srinivasan, Ashish Verma
  • Patent number: 8655869
    Abstract: A data-driven information navigation system and method enable search and analysis of a set of objects or other materials by certain common attributes that characterize the materials, as well as by relationships among the materials. The invention includes several aspects of a data-driven information navigation system that employs this navigation mode. The navigation system of the present invention includes features of a knowledge base, a navigation model that defines and enables computation of a collection of navigation states, a process for computing navigation states that represent incremental refinements relative to a given navigation state, and methods of implementing the preceding features.
    Type: Grant
    Filed: September 12, 2011
    Date of Patent: February 18, 2014
    Assignee: Oracle OTC Subsidiary LLC
    Inventors: Adam J. Ferrari, Frederick C. Knabe, Vinay S. Mohta, Jason P. Myatt, Benjamin S. Scarlet, Daniel Tunkelang, John S. Walter, Joyce Jeanpin Wang, Michael Tucker
  • Patent number: 8655911
    Abstract: Techniques are provided for (1) extending SQL to support direct invocation of frequent itemset operations, (2) improving the performance of frequent itemset operations by clustering itemset combinations to more efficiently use previously produced results, and (3) making on-the-fly selection of the occurrence counting technique to use during each phase of a multiple phase frequent itemset operation. When directly invoked in an SQL statement, a frequent itemset operation may receive input from results of operations specified in the SQL statement, and provide its results directly to other operations specified in the SQL statement. By clustering itemset combinations, resources may be used more efficiently by retaining intermediate information as long as it is useful, and then discarding it to free up volatile memory.
    Type: Grant
    Filed: August 18, 2003
    Date of Patent: February 18, 2014
    Assignee: Oracle International Corporation
    Inventors: Wei Li, Jiansheng Huang, Ari Mozes
  • Publication number: 20140046977
    Abstract: The various embodiments herein provide a system and method for mining frequent patterns in relationship space from a plurality of relationship sequences extracted from big data. The system comprises a data repository for collecting and storing the big data, an Entity Store for collecting and storing a plurality of entities from the big data, an Entity Hierarchy for representing a hierarchical structure of entities, a Relationship Store for collecting and storing relationship instances between the plurality of entities, a Relationship Hierarchy for representing a hierarchical structure of relationships, a language/domain model for organizing entities and relationships in a hierarchical manner, a Pattern Query Processing Module (PQPM) for processing a pattern query related to finding patterns in relationships and entities, a Pattern Generation Module (PGM) to generate frequent patterns, and a Frequent Pattern Display Module (FPDM) to provide a visual presentation of the mined patterns.
    Type: Application
    Filed: January 31, 2013
    Publication date: February 13, 2014
    Applicant: XURMO TECHNOLOGIES PVT. LTD.
    Inventors: SRIDHAR GOPALAKRISHNAN, SUJATHA RAVIPRASAD UPADHYAYA
  • Patent number: 8650213
    Abstract: Distributed privacy preserving data mining techniques are provided. A first entity of a plurality of entities in a distributed computing environment exchanges summary information with a second entity of the plurality of entities via a privacy-preserving data sharing protocol such that the privacy of the summary information is preserved, the summary information associated with an entity relating to data stored at the entity. The first entity may then mine data based on at least the summary information obtained from the second entity via the privacy-preserving data sharing protocol. The first entity may obtain, from the second entity via the privacy-preserving data sharing protocol, information relating to the number of transactions in which a particular itemset occurs and/or information relating to the number of transactions in which a particular rule is satisfied.
    Type: Grant
    Filed: May 23, 2007
    Date of Patent: February 11, 2014
    Assignee: International Business Machines Corporation
    Inventors: Charu C. Aggarwal, Philip Shi-Lung Yu
  • Patent number: 8645417
    Abstract: An approach is described for performing a name search using a name search operation and a ranking operation. The name search operation may take text as input and apply a fuzzy matching operation and a lookup operation to generate a collection of candidate names with respective probability scores. In other cases, speech or handwriting recognition may generate the collection of candidate names and probability scores. The ranking operation may then rank these candidate names using a ranking function. The ranking function may rank the candidate names based on the probability scores associated with the names and at least one other factor. One such factor may reflect whether information provided by a user matches profile information associated with a candidate name under consideration. Another factor may reflect an extent of a nexus between the user and a person associated with the candidate name. Other types of factors can be used.
    Type: Grant
    Filed: June 18, 2008
    Date of Patent: February 4, 2014
    Assignee: Microsoft Corporation
    Inventors: Dirk H. Groeneveld, Dmitriy Meyerzon, David Mowatt, Jessica A. Alspaugh
  • Patent number: 8645418
    Abstract: A method and an apparatus for word quality mining and evaluating are disclosed. The method includes: calculating a Document Frequency (DF) of a word in mass categorized data; evaluating the word in multiple single aspects according to the DF of the word; and evaluating the word in multiple aspects according to the multiple single-aspect evaluations to obtain an importance weight of the word. According to the solution of the present invention, the importance of the word in the mass categorized data may be evaluated, and words with high quality may be obtained through an integrated evaluation.
    Type: Grant
    Filed: May 7, 2012
    Date of Patent: February 4, 2014
    Assignee: Tencent Technology (Shenzhen) Company Limited
    Inventors: Huaijun Liu, Zhongbo Jiang, Gaolin Fang
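The first step named in the abstract above, computing a word's Document Frequency over categorized data, can be sketched as below. This is only an illustration under assumptions: the input layout (a dict mapping each category to a list of tokenized documents) and both function names are hypothetical, and the single-aspect and multi-aspect evaluations are not shown because the abstract does not define them.

```python
from collections import Counter


def document_frequency(documents):
    """Count, for each word, how many documents contain it at least once.

    `documents` is an iterable of token lists (assumed layout).
    """
    df = Counter()
    for tokens in documents:
        # set() so a word repeated inside one document is counted once
        df.update(set(tokens))
    return df


def document_frequency_by_category(categorized):
    """Per-category DF for data shaped as {category: [token list, ...]}."""
    return {cat: document_frequency(docs) for cat, docs in categorized.items()}
```

A word whose per-category DF is concentrated in one category is the kind of signal a later evaluation step could use, though how the patent weighs it is not stated here.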
  • Patent number: 8639696
    Abstract: A computer program product evaluating cross-domain clusterability upon a target domain and a source domain. The cross-domain clusterability is calculated as a linear combination of a target clusterability and a source-target pair matchability, by use of a trade-off parameter that determines relative contribution of the target clusterability and the source-target pair matchability. The target clusterability quantifies how clusterable the target domain is. The source-target pair matchability is calculated as an average of a target-side matchability and a source-side matchability, which quantifies how well target centroids of the target domain are aligned with the source centroids and how well source centroids of the source domain are aligned with the target centroids, respectively.
    Type: Grant
    Filed: March 28, 2012
    Date of Patent: January 28, 2014
    Assignee: International Business Machines Corporation
    Inventors: Jeffrey M. Achtermann, Indrajit Bhattacharya, Kevin W. English, Jr., Shantanu R. Godbole, Sachindra Joshi, Ashwin Srinivasan, Ashish Verma
  • Patent number: 8639719
    Abstract: System and methods are provided that enable a data and information repository with a semantic engine that enables users to easily capture information in various formats from various devices along with rich metadata relating to that information. The information repository can be configured to query the captured information and any metadata to extrapolate new meaning, including semantic meaning, and to perform various tasks, including but not limited to sharing of the information and metadata. In some embodiments, the information repository is configured to generate recommendations to users based on analysis of the captured information.
    Type: Grant
    Filed: February 2, 2012
    Date of Patent: January 28, 2014
    Inventors: Paul Tepper Fisher, Zeeshan Hussain Zaidi
  • Publication number: 20140025663
    Abstract: Multi-media content for inclusion into an SMS (Short Message Service), MMS (Multi-media Message Service), IM (Instant Message) or other message type can be searched, pre-searched, fetched and pre-fetched based upon predictive- and rules-based searching techniques. A system can predict or infer an in-process message, for example, based upon a portion of the inputted text message. Thereafter, in real- or near real-time, content related to the topic of conversation can be retrieved from a local store, remote stores (e.g., servers) or cloud-based sources. The retrieved content can be incorporated into the SMS, MMS, or IM message as appropriate or desired thereby enhancing the messaging experience.
    Type: Application
    Filed: September 20, 2013
    Publication date: January 23, 2014
    Applicant: AT&T Mobility II LLC
    Inventors: Kristin Marie Pascal, Andrew Evan Klonsky, Matthew James Bailey
  • Patent number: 8635237
    Abstract: A method, a system and a computer program product for enabling a customer response speech recognition unit to dynamically receive customer feedback. The customer response speech recognition unit is positioned at a customer location. The speech recognition unit is automatically initialized when one or more spoken words are detected. The response statements of customers are dynamically received by the customer response speech recognition unit at the customer location, in real time. The customer response speech recognition unit determines when the one or more spoken words of the customer response statement are associated with a score in a database. An analysis of the words is performed to generate a score that reflects the evaluation of the subject by the customer. The score is dynamically updated as new evaluations are received, and the score is displayed within a graphical user interface (GUI) to be viewed by one or more potential customers.
    Type: Grant
    Filed: July 2, 2009
    Date of Patent: January 21, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Ravi P. Bansal, Mike V. Macias, Saidas T. Kottawar, Salil P. Gandhi, Sandip D. Mahajan
  • Patent number: 8630989
    Abstract: Described herein are methods, systems, apparatuses and products for automatically discovering patterns in a text corpus. An aspect provides extracting at least one context string related to at least one annotator from the at least one text corpus; analyzing the at least one context string for at least one sequence, the at least one sequence comprised of at least one subsequence; determining at least one sequence signature for each at least one sequence by applying applicable rules to the at least one sequence; and grouping the at least one sequence signature into at least one group.
    Type: Grant
    Filed: May 27, 2011
    Date of Patent: January 14, 2014
    Assignee: International Business Machines Corporation
    Inventors: Sebastian Johannes Blohm, Vivian Yaw-Wen Chu, Ching-Tien Ho, Yunyao Li, Huaiyu Zhu
  • Patent number: 8630823
    Abstract: A feature parameter candidate generation apparatus has a storage unit that stores the values of feature parameters extracted from each of samples, an index value calculation unit that calculates an index value, which is obtained by normalizing the number of the kinds of the values of feature parameters by the number of the samples, for each of the feature parameters, an evaluation object selection unit that selects combinations of feature parameters which are objects to be evaluated, an evaluation unit that evaluates whether the uniformity of a frequency distribution of index values of the individual feature parameters for combinations of feature parameters selected as the objects to be evaluated satisfies a predetermined criterion, and a candidate determination unit that determines, as feature parameter candidates to be given to the model generation device, a combination of feature parameters that is evaluated to satisfy the predetermined criterion.
    Type: Grant
    Filed: October 31, 2008
    Date of Patent: January 14, 2014
    Assignee: Omron Corporation
    Inventors: Mitsuhiro Yoneda, Hiroshi Nakajima, Naoki Tsuchiya, Hiroshi Tasaki
  • Publication number: 20140012878
    Abstract: Enables data mining of monitored information, activities, and agreements associated with a throttling system. An agreement includes one or more conditions to satisfy the agreement, such as one or more tasks or activities to be performed by an agreement performer or events that may be detected, and actions performed to enforce or assert the agreement such as controlling the electronic device and/or enabling or disabling or otherwise limiting, reducing or increasing the amount or type of information allowed with respect to any or all electronic devices associated with the agreement performer.
    Type: Application
    Filed: August 7, 2012
    Publication date: January 9, 2014
    Inventors: Negeen MOUSSAVIAN, Amir Moussavian, Ben Badiee, Mark Lewis
  • Patent number: 8626772
    Abstract: A method is provided for determining a correlation between a reference user and another user on the basis of two sets of ratings, where each rating is associated with a respective user. In response to a trigger, user ratings associated with the reference user and user ratings associated with the other user are collected, and all co-rated items of these two sets are correlated on the basis of an adjusted cosine correlation function which is weighted by a first and a second weighting function. The correlation is then stored and may be repeated for a plurality of users. The stored correlations may be used e.g. for ranking purposes.
    Type: Grant
    Filed: September 29, 2008
    Date of Patent: January 7, 2014
    Assignee: Telefonaktiebolaget L M Ericsson (publ)
    Inventors: Mattias Lidstrom, Jonas Bjork, Joakim Soderberg
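A weighted correlation over co-rated items, as described in the abstract above, might look like the sketch below. Several points are assumptions rather than the patented method: "adjusted cosine" is read here as mean-centering each user's ratings by that user's overall mean before taking the cosine, and the two weighting functions, which the abstract leaves unspecified, are modeled as hypothetical per-item callables that default to 1.0.

```python
import math


def weighted_adjusted_cosine(ratings_a, ratings_b, weight_a=None, weight_b=None):
    """Correlate two users over their co-rated items.

    ratings_a / ratings_b: dict mapping item -> numeric rating.
    weight_a / weight_b: hypothetical callables item -> float (default 1.0).
    Returns a value in [-1, 1], or 0.0 when no correlation can be computed.
    """
    weight_a = weight_a or (lambda item: 1.0)
    weight_b = weight_b or (lambda item: 1.0)

    co_rated = ratings_a.keys() & ratings_b.keys()
    if not co_rated:
        return 0.0

    # ASSUMPTION: center on each user's mean over all of that user's ratings
    mean_a = sum(ratings_a.values()) / len(ratings_a)
    mean_b = sum(ratings_b.values()) / len(ratings_b)

    num = den_a = den_b = 0.0
    for item in co_rated:
        dev_a = weight_a(item) * (ratings_a[item] - mean_a)
        dev_b = weight_b(item) * (ratings_b[item] - mean_b)
        num += dev_a * dev_b
        den_a += dev_a * dev_a
        den_b += dev_b * dev_b

    if den_a == 0.0 or den_b == 0.0:
        return 0.0
    return num / math.sqrt(den_a * den_b)
```

The result for each pair of users could then be stored and sorted to rank other users against the reference user, which matches the use the abstract mentions.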
  • Patent number: 8626791
    Abstract: Methods, systems, and apparatus, including computer programs encoded on one or more computer storage devices, for caching predictive models are described. Records are obtained, each record including a time of a previously submitted predictive request and an identifier of a trained predictive model. A trained scheduling model is generated using the records as training data. A set of identifiers of trained predictive models are determined from a plurality of trained predictive models that are stored in a secondary memory of a computing system. The target time is inputted to the trained scheduling model. In response, a second predictive output is received that comprises the set of identifiers. A set of trained predictive models are obtained that correspond to the set of identifiers from the secondary memory. The set of trained predictive models are stored in a primary memory of the computing system.
    Type: Grant
    Filed: June 14, 2011
    Date of Patent: January 7, 2014
    Assignee: Google Inc.
    Inventors: Wei-Hao Lin, Travis H. K. Green, Robert Kaplow, Gang Fu, Gideon S. Mann
  • Patent number: 8626790
    Abstract: A processor is operated to combine a first row of a dimension table in a data warehouse with a second row in the dimension table. The result is a combined row that includes a row identification key for the first row and a row identification key for the second row. The row identification key for the first row joins the combined row to fact data from a prior time period. The second row corresponds to a current time period that is later than the prior time period. The processor is also operated to join at least a portion of the combined row to at least a portion of the corresponding row in a fact table associated with the dimension table. The fact data from the prior time period is included in the corresponding row in the fact table.
    Type: Grant
    Filed: April 23, 2010
    Date of Patent: January 7, 2014
    Assignee: Hartford Fire Insurance Company
    Inventors: Asha Kiran Potdar, Harikrishna Raghumandala, John Vernale
  • Publication number: 20140006447
    Abstract: A method, computer program product, and system for generating epigenetic cohorts for a specific time period through clustering of epigenetic surprisal data at a specific time, comprising: receiving a phenotypic and/or demographic parameter and a cluster characteristics input from a user; searching the epigenetic surprisal data at a specific time for the parameter and storing matches in a repository; generating a cluster comprising a centroid for each parameter by populating the cluster based on the matches of the parameter with the epigenetic surprisal data at a specific time period; determining at least two epigenetic cohorts for a specific time period from the cluster for each parameter and based on the input from the user; and if the cohorts do not match the input of the user, reporting the cohorts determined to the user and returning to the step of receiving a parameter and characteristic input from a user.
    Type: Application
    Filed: July 31, 2012
    Publication date: January 2, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Robert R. Friedlander, James R. Kraemer
  • Publication number: 20130346447
    Abstract: Methods and systems of performing data mining may include receiving a plurality of web log records and a plurality of call log records; associating one or more web log records with a call log record, wherein the associated user for each of the associated one or more web log records and the call log record are the same; identifying one or more patterns among the web log records for the plurality of call log records, wherein each pattern comprises one or more web accesses, a time stamp at which each of the one or more web accesses is performed and the call topic for the call log record; identifying one or more web log records associated with a new call; and predicting a call topic for the new call based on at least one pattern and the one or more web log records.
    Type: Application
    Filed: June 21, 2012
    Publication date: December 26, 2013
    Applicant: XEROX CORPORATION
    Inventors: Changjun Wu, Shanmuga-Nathan Gnanasambandam, Gueyoung Jung, Shi Zhao
  • Patent number: 8612410
    Abstract: The disclosed subject matter provides for employing timed fingerprint location (TFL) information in dynamically selecting a subset of content from a set of content. TFL information can provide location information for user equipment without employing conventional location techniques. As such, TFL information can provide for location-centric selection of content. Further, secondary information correlated to TFL information can be received and employed in dynamic content subset selection. In an aspect, rules can be employed to predict the future location of a user equipment such that dynamic content selection can be tailored to the predicted future position of the user equipment. Moreover, privacy components can be employed to limit the propagation of sensitive information.
    Type: Grant
    Filed: June 30, 2011
    Date of Patent: December 17, 2013
    Assignee: AT&T Mobility II LLC
    Inventors: Sheldon Meredith, Mark Austin
  • Patent number: 8612533
    Abstract: The disclosure relates to systems and methods of burning, snapshotting, streaming and curating geofeeds, each geofeed including a plurality of geofeed content items that are aggregated from a plurality of content providers using respective requests formatted specifically for individual ones of the plurality of content providers, where each individual set of a plurality of content is relevant to one or more geographically definable locations. Archives of a geofeed may be generated by burning portions or all of the geofeed content items and/or generating snapshots of geofeeds at different times. A real-time geofeed may be streamed by continuously or periodically obtaining newly available geofeed content items and updating a geofeed stream in real-time. Collections of geofeed content items may be curated in order to organize and follow geofeed content items of interest.
    Type: Grant
    Filed: March 7, 2013
    Date of Patent: December 17, 2013
    Assignee: Geofeedr, Inc.
    Inventors: Philip B. Harris, Scott K. Mitchell, Michael J. Mulroy
  • Patent number: 8612479
    Abstract: Systems and methods are described that detect fraud in existing logs of raw data. There can be several disparate logs, each including data of disparate data types and generated by different and possibly unrelated software enterprise applications. The fraud management system aggregates and organizes the raw log data, extends the raw data with reference data, archives the data in a manner that facilitates efficient access and processing of the data, allows for investigation of potentially fraudulent usage scenarios, and uses the results of the investigation to identify patterns of data that correspond to high risk usage scenarios and/or process steps. In subsequent processing, archived data can be compared against the identified patterns corresponding to high risk usage scenarios to detect matches, and the invention thereby automatically detects high risk usage scenarios and issues appropriate alerts and reports.
    Type: Grant
    Filed: May 15, 2007
    Date of Patent: December 17, 2013
    Assignee: FIS Financial Compliance Solutions, LLC
    Inventors: Jwahar R. Bammi, Bagepalli C. Krishna, Robert Posniak, Joseph Walsh
  • Publication number: 20130332449
    Abstract: The present invention provides a computer-implemented code generation system that generates data processing code from a directed acyclic graph (DAG). The generated code is both declarative and procedural, and can be run in a relational database or in a Map Reduce implementation using Apache Pig. Each node of the DAG specifies operations performed on tabular data that can be stored in a delimited plain text file, a spreadsheet, or a relational database.
    Type: Application
    Filed: June 6, 2013
    Publication date: December 12, 2013
    Inventors: John David Amos, Oleg Merlugov
  • Publication number: 20130325896
    Abstract: In one embodiment, receiving, from a user of a social network, an image with embedded metadata; and suggesting, to the user, information to be associated with the image based on the embedded metadata.
    Type: Application
    Filed: May 29, 2012
    Publication date: December 5, 2013
    Inventor: Peter William Hunt
  • Publication number: 20130325897
    Abstract: Method, system, and programs for generating questions for a user. A request for content from a user is received via the communication platform. The content is retrieved from a content source. A question is generated for the user based on the content requested by the user and a history of previous information accessed or posted by the user. The question is sent to the user.
    Type: Application
    Filed: May 30, 2012
    Publication date: December 5, 2013
    Applicant: YAHOO! INC.
    Inventors: Nitin Motgi, Masood Mortazavi, Bruno Fernandez-Ruiz
  • Publication number: 20130325898
    Abstract: Large-scale event processing systems are often designed to perform data mining operations by storing a large set of events in a massive database, applying complex queries to the records of the events, and generating reports and notifications. However, because such queries are performed on very large data sets, the processing of the queries often introduces a significant delay between the occurrence of the events and the reporting or notification thereof. Instead, a large-scale event processing system may be devised as a large state machine organized according to an evaluation plan, comprising a graph of event processors that, in realtime, evaluate each event in an event stream to update an internal state of the event processor, and to perform responses when response conditions are met. The continuous monitoring and evaluation of the stream of events may therefore enable the event processing system to provide realtime responses and notifications to complex queries.
    Type: Application
    Filed: August 8, 2013
    Publication date: December 5, 2013
    Applicant: Microsoft Corporation
    Inventors: Nir Nice, Daniel Sitton, Dror Kremer, Michael Feldman
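    Illustrative sketch (not taken from the application): the contrast drawn in the abstract above, between querying a stored event database and updating processor state as each event arrives, can be pictured with a tiny stateful processor. The event fields, the `FailedLoginProcessor` class and the `run` helper are hypothetical, and a flat list of processors stands in for the graph of event processors that the abstract describes.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List


@dataclass
class FailedLoginProcessor:
    """Keeps a per-user counter and responds when a response condition is met."""
    threshold: int = 3
    counts: Dict[str, int] = field(default_factory=dict)
    alerts: List[str] = field(default_factory=list)

    def on_event(self, event: dict) -> None:
        user = event["user"]
        if event["type"] == "login_failed":
            # Update internal state from this single event.
            self.counts[user] = self.counts.get(user, 0) + 1
            # Response condition: the counter has just reached the threshold.
            if self.counts[user] == self.threshold:
                self.alerts.append(f"alert: {user} failed {self.threshold} logins")
        elif event["type"] == "login_ok":
            self.counts[user] = 0  # a successful login resets the counter


def run(stream: Iterable[dict], processors: list) -> None:
    """Feed every event to every processor as it arrives; nothing is stored."""
    for event in stream:
        for processor in processors:
            processor.on_event(event)


stream = [
    {"user": "bob", "type": "login_failed"},
    {"user": "bob", "type": "login_failed"},
    {"user": "alice", "type": "login_failed"},
    {"user": "bob", "type": "login_ok"},
    {"user": "bob", "type": "login_failed"},
    {"user": "bob", "type": "login_failed"},
    {"user": "bob", "type": "login_failed"},
]
processor = FailedLoginProcessor()
run(stream, [processor])
```

    The alert is raised while the stream is being consumed, so no query over historical records is needed to answer "which users just failed three logins in a row".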
  • Patent number: 8601024
    Abstract: Described is releasing output data representing a search log, in which the data is suitable for most data mining/analysis applications, but is safe to publish by preserving user privacy. The search log is processed such that a query is only included if a sufficient count of that query is present; noise may be added. User contributions that are considered may be limited to a maximum number of queries. The output may indicate how often (possibly plus noise) that each query appeared. Other output may comprise a query-action graph, a query-inaction graph and/or a query-reformulation graph, with nodes representing queries and nodes representing actions, inactions or reformulations (e.g., clicked URLs, skipped URLs, or selected related queries), and edges between nodes representing action, skip or selection counts (possibly plus noise). The output may correspond to the top results/related queries returned from a search.
    Type: Grant
    Filed: June 16, 2009
    Date of Patent: December 3, 2013
    Assignee: Microsoft Corporation
    Inventors: Krishnaram Kenthapadi, Aleksandra Korolova, Nina Mishra, Alexandros Ntoulas
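    Illustrative sketch (not taken from the patent): the three ingredients named in the abstract above (a cap on each user's contributions, a minimum count for a query to be included, and optional noise) can be combined as follows. The function names, parameters and sample log are hypothetical; Laplace noise and thresholding on the noisy count are choices made for this sketch, and no privacy guarantee is claimed for these particular settings.

```python
import random
from collections import Counter


def laplace(scale: float, rng: random.Random) -> float:
    """Laplace(0, scale) sample as the difference of two exponentials."""
    if scale == 0:
        return 0.0
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def release_query_counts(log, max_per_user, threshold, noise_scale, seed=None):
    """Return {query: noisy_count} for queries whose noisy count meets the threshold."""
    rng = random.Random(seed)
    used = Counter()    # how many queries each user has contributed so far
    counts = Counter()  # per-query counts over the capped contributions
    for user, query in log:
        if used[user] < max_per_user:
            used[user] += 1
            counts[query] += 1

    released = {}
    for query, count in counts.items():
        noisy = count + laplace(noise_scale, rng)
        if noisy >= threshold:
            released[query] = noisy
    return released


log = [
    ("u1", "weather"), ("u2", "weather"), ("u3", "weather"),
    ("u1", "rare disease"), ("u1", "a"), ("u1", "b"),
]
# noise_scale=0 disables the noise so the result is easy to inspect.
released = release_query_counts(log, max_per_user=2, threshold=3, noise_scale=0)
```

    With these inputs only "weather" is released: user u1's third and fourth queries are ignored by the cap, and "rare disease" falls below the count threshold.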
  • Patent number: 8601023
    Abstract: The invention comprises a set of complementary techniques that dramatically improve enterprise search and navigation results. The core of the invention is an expertise or knowledge index, called UseRank, that tracks the behavior of website visitors. The expertise-index is designed to focus on the four key discoveries of enterprise attributes: Subject Authority, Work Patterns, Content Freshness, and Group Know-how. The invention produces useful, timely, cross-application, expertise-based search and navigation results. In contrast, traditional Information Retrieval technologies such as inverted index, NLP, or taxonomy tackle the same problem with an opposite set of attributes than what the enterprise needs: Content Population, Word Patterns, Content Existence, and Statistical Trends. Overall, the invention encompasses Baynote Search, an enhancement over existing IR searches; Baynote Guide, a set of community-driven navigations; and Baynote Insights, aggregated views of visitor interests and trends and content gaps.
    Type: Grant
    Filed: October 17, 2007
    Date of Patent: December 3, 2013
    Assignee: Baynote, Inc.
    Inventors: Scott Brave, Robert Bradshaw, Jack Jia, Christopher Minson
  • Patent number: 8595718
    Abstract: A computer system in accordance with one or more embodiments of the invention includes one or more data miners configured to mine software deliverables for metadata, a metadata filter configured to generate a filtered view of metadata associated with a subset of the software deliverables, an inventory generator configured to generate an inventory of the subset, a rules manager configured to generate rules using the filtered view and the inventory, where the rules are based on software relationships within the subset, and a package generator configured to generate a knowledge package based on the rules, where the knowledge package includes guidelines for obtaining the subset and installing the subset.
    Type: Grant
    Filed: August 17, 2007
    Date of Patent: November 26, 2013
    Assignee: Oracle America, Inc.
    Inventors: Ilan Naslavsky, Yuval Turgeman
  • Patent number: 8595781
    Abstract: Systems and methods for identifying which video segment is being displayed on a screen of a television system. The video segment is identified by deriving data from the television signals, the derived data being indicative of the video segment being displayed on the screen. This feature can be used to extract a viewer's reaction (such as changing the channel) to a specific video segment (such as an advertisement) and reporting the extracted information as metrics. The systems and methods may further provide contextually targeted content to the television system. The contextual targeting is based on not only identification of the video segment being displayed, but also a determination concerning the playing time or offset time of the particular portion of the video segment being currently displayed.
    Type: Grant
    Filed: May 27, 2010
    Date of Patent: November 26, 2013
    Assignee: Cognitive Media Networks, Inc.
    Inventors: Zeev Neumeier, Edo Liberty
  • Publication number: 20130311513
    Abstract: Systems and methods for storing and accessing data. Example embodiments may perform optimization based on patterns of requests received by the system and relations between data sets identified by the system. Example embodiments may identify restrictions on a data set based on a different data set. Conditions for automatically algebraically partitioning the data set based on a constituent of a different data set may be evaluated, including evaluation of the relationship between the data sets and identification of a pattern of statements restricting the data set using the same logical structure. If the conditions are met, component data sets and a partition data set may be algebraically defined based on ranges applied to constituent(s) of the other data set. The component data sets may also be realized in storage to physically partition the data set.
    Type: Application
    Filed: May 15, 2012
    Publication date: November 21, 2013
    Applicant: Algebraix Data Corporation
    Inventors: Christopher M. Piedmonte, William A. Rogers
  • Publication number: 20130311283
    Abstract: A data mining method for a social network of a terminal user and related methods, apparatuses and systems are provided. A user identifier of a terminal may be acquired, then a communication record of a user is acquired by using the user identifier, and first social information is obtained according to the user identifier and the communication record; a data packet for accessing, by the user, a social network service is acquired according to the user identifier, and second social information is obtained according to the user identifier and the data packet; information published by the user on the Internet may be acquired according to the user identifier, and third social information is obtained according to the user identifier and the published information; and finally, a user social database is established or updated by using the first social information, the second social information and the third social information.
    Type: Application
    Filed: May 15, 2013
    Publication date: November 21, 2013
    Applicant: Huawei Technologies Co., Ltd.
    Inventors: Hewei Liu, Dong Tang, Shaoyu Wang
  • Patent number: 8589326
    Abstract: Even with SIP and the varied communications options available to users, the monitoring of presence itself may not provide sufficient information to determine a most appropriate contact modality for a communication. Therefore, there is a need for a more complete method of generating a landscape of recent activity from which a preferred communication modality can be determined, with a corresponding solution provided that allows a contactor to contact a contactee via that modality. For example, a network of bearer channels or feeds can be established as input to a presence determination module. The inputs can include not only presence information, but also feeds from one or more blogs, micro blogs, social networking sites, etc., that allow a more complete picture of recent activity to be determined. Based on this input a preferred contact modality can be determined that may enhance the ability of a contactor to contact a contactee.
    Type: Grant
    Filed: July 2, 2010
    Date of Patent: November 19, 2013
    Assignee: Avaya Inc.
    Inventors: Mehmet C. Balasaygun, Michael J. Killian
  • Patent number: 8590049
    Abstract: A method and apparatus for providing an anonymization of data are disclosed. For example, the method receives a request for anonymizing, wherein the request comprises a bipartite graph for a plurality of associations or a table that encodes the plurality of associations for the bipartite graph. The method places each node in the bipartite graph in a safe group and provides an anonymized graph that encodes the plurality of associations of the bipartite graph, if a safe group for all nodes of the bipartite graph is found.
    Type: Grant
    Filed: August 17, 2009
    Date of Patent: November 19, 2013
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Graham Cormode, Divesh Srivastava, Ting Yu, Qing Zhang
  • Patent number: 8583686
    Abstract: The present invention provides a system, method and computer program for multi-dimensional temporal abstraction and data mining. The invention comprises collecting and optionally cleaning multi-dimensional data, the multi-dimensional data including a plurality of data streams; temporally abstracting the multi-dimensional data; and relatively aligning the temporally abstracted multi-dimensional data based on a shared time point of interest.
    Type: Grant
    Filed: July 22, 2010
    Date of Patent: November 12, 2013
    Assignee: University of Ontario Institute of Technology
    Inventor: Carolyn Patricia McGregor
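    Illustrative sketch (not taken from the patent): the two steps named in the abstract above, temporal abstraction and relative alignment to a shared time point of interest, can be pictured on a single numeric stream. The threshold-based states ("low", "normal", "high"), the function names and the sample readings are assumptions made for this sketch; the same functions would be applied to each stream of a multi-dimensional data set.

```python
def abstract_stream(samples, low, high):
    """Collapse (time, value) samples into (start, end, state) intervals."""
    intervals = []
    for t, v in samples:
        state = "low" if v < low else "high" if v > high else "normal"
        if intervals and intervals[-1][2] == state:
            # Same state as the previous sample: extend the open interval.
            start, _, _ = intervals[-1]
            intervals[-1] = (start, t, state)
        else:
            intervals.append((t, t, state))
    return intervals


def align(intervals, t0):
    """Re-express interval times relative to a time point of interest t0."""
    return [(start - t0, end - t0, state) for start, end, state in intervals]


# Hypothetical heart-rate readings as (minute, beats per minute).
heart_rate = [(0, 80), (1, 82), (2, 130), (3, 135), (4, 85)]
intervals = abstract_stream(heart_rate, low=60, high=120)
aligned = align(intervals, t0=2)  # t0: e.g. the minute an event of interest occurred
```

    After alignment, intervals from different subjects or streams share a common zero, so a pattern such as "high for one minute starting at the event" can be compared across them.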
  • Patent number: 8583687
    Abstract: Systems and methods for storing and accessing data. Example embodiments may perform optimization based on patterns of requests received by the system and relations between data sets identified by the system. Example embodiments may identify restrictions on a data set based on a different data set. Conditions for automatically algebraically partitioning the data set based on a constituent of a different data set may be evaluated, including evaluation of the relationship between the data sets and identification of a pattern of statements restricting the data set using the same logical structure. If the conditions are met, component data sets and a partition data set may be algebraically defined based on ranges applied to constituent(s) of the other data set. The component data sets may also be realized in storage to physically partition the data set.
    Type: Grant
    Filed: May 15, 2012
    Date of Patent: November 12, 2013
    Assignee: Algebraix Data Corporation
    Inventors: Christopher M. Piedmonte, William A. Rogers