Patents by Inventor Alfredo Alba

Alfredo Alba has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20180225259
    Abstract: A method comprising receiving a document having multiple sections of different types using a processor. The method also comprises obtaining a plurality of lexicons using the processor, each of the lexicons for interpreting fragments in one or more of the section types. The method further comprises interpreting fragments in a first section of the multiple sections using the processor and one or more lexicons. The method still further comprises determining, based upon the interpretation and using the processor, that a fragment in the first section is misplaced. The method still further comprises re-locating, using the processor, the misplaced fragment to a second section of the multiple sections in the document to generate a re-organized document. The method additionally includes storing the re-organized document in a hardware storage system using the processor.
    Type: Application
    Filed: June 19, 2017
    Publication date: August 9, 2018
    Inventors: Alfredo ALBA, Anni R. CODEN, Clemens DREWS, Daniel F. GRUHL, Neal R. LEWIS, Pablo N. MENDES, Cartic RAMAKRISHNAN, Joseph F. TERDIMAN
  • Publication number: 20180225374
    Abstract: A mechanism is provided in a data processing system comprising at least one processor and at least one memory, the at least one memory comprising instructions executed by the at least one processor to cause the at least one processor to implement an automated lexicon expansion for an identified corpus. For a selected corpus in a set of corpora, the mechanism determines an estimated number of new terms in the selected corpus that are not in the lexicon based on a frequency count of known terms in the selected corpus. Responsive to the estimated number of new terms in the selected corpus being greater than a threshold, the mechanism performs lexicon expansion using the selected corpus to form an expanded lexicon. Responsive to the estimated number of new terms in the selected corpus not being greater than the threshold, the mechanism halts lexicon expansion.
    Type: Application
    Filed: December 8, 2017
    Publication date: August 9, 2018
    Inventors: Alfredo Alba, Clemens Drews, Daniel F. Gruhl, Linda H. Kato, Neal R. Lewis, Pablo N. Mendes, Meenakshi Nagarajan
  • Publication number: 20180225276
    Abstract: A system, comprising an input device configured to receive a first item and a second item, and a processor communicably coupled to the input device and configured to determine that the first item is a fragment matching a lexicon, and place the fragment in a section of a document, the section selected based on the matching lexicon, wherein the processor is configured to perform the determination and the placement after it receives the first item but before it receives the second item.
    Type: Application
    Filed: February 9, 2017
    Publication date: August 9, 2018
    Inventors: Alfredo ALBA, Anni R. CODEN, Clemens DREWS, Daniel F. GRUHL, Neal R. LEWIS, Pablo N. MENDES, Cartic RAMAKRISHNAN, Joseph F. TERDIMAN
  • Publication number: 20180225373
    Abstract: A mechanism is provided in a data processing system comprising at least one processor and at least one memory, the at least one memory comprising instructions executed by the at least one processor to cause the at least one processor to implement an automated lexicon expansion for an identified corpus. For a selected corpus in a set of corpora, the mechanism determines an estimated number of new terms in the selected corpus that are not in the lexicon based on a frequency count of known terms in the selected corpus. Responsive to the estimated number of new terms in the selected corpus being greater than a threshold, the mechanism performs lexicon expansion using the selected corpus to form an expanded lexicon. Responsive to the estimated number of new terms in the selected corpus not being greater than the threshold, the mechanism halts lexicon expansion.
    Type: Application
    Filed: February 7, 2017
    Publication date: August 9, 2018
    Inventors: Alfredo Alba, Clemens Drews, Daniel F. Gruhl, Linda H. Kato, Neal R. Lewis, Pablo N. Mendes, Meenakshi Nagarajan
  • Publication number: 20180225258
    Abstract: A computer program product comprising a computer-readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to receive a document having multiple section headers, segment the document into at least first and second sections based on the section headers, segment items in the first section into fragments and identify a section type for each of the fragments, determine that the identified section type for at least one of the fragments better matches a type of the second section than it matches a type of the first section, and re-locate the at least one of the fragments to the second section.
    Type: Application
    Filed: February 9, 2017
    Publication date: August 9, 2018
    Inventors: Alfredo ALBA, Anni R. CODEN, Clemens DREWS, Daniel F. GRUHL, Neal R. LEWIS, Pablo N. MENDES, Cartic RAMAKRISHNAN, Joseph F. TERDIMAN
  • Publication number: 20170091191
    Abstract: Influencers (individuals or groups) over a selected audience (observers or recipients of information, objects and/or events) on a given topic are measured based on influence features, which include a sentiment flipping influence feature indicative of ability of an audience member to influence other audience members to change their sentiment on the selected topic. Other influence features include the ability to influence others: to change followership; to express interest in a topic associated with a hashtag pioneered by the influencer, based on the effectiveness and phrasing of language used. The output of the influence engine can be a score representing the relative influence of audience members over the audience on the topic of interest. Influencers may be ranked according to their total influence score over the audience on the topic.
    Type: Application
    Filed: September 29, 2015
    Publication date: March 30, 2017
    Inventors: Alfredo Alba, Clemens Drews, Daniel Gruhl, Neal R. Lewis, Pablo N. Mendes, Meenakshi Nagarajan, Cartic Ramakrishnan
  • Patent number: 9146983
    Abstract: A method for creating a semantically aggregated index in an indexer-agnostic index building system includes: extracting documents from a data source, each document including a data object; distributing the documents to a plurality of processing nodes within the system; for each node: indexing the data objects for each document into fields using semantic rules; and grouping indexed data objects for related fields by: classifying the documents into logical groups based on the semantic rules; and creating a searchable index shard for related logical groups.
    Type: Grant
    Filed: August 24, 2012
    Date of Patent: September 29, 2015
    Assignee: International Business Machines Corporation
    Inventors: Alfredo Alba, Chad E. DeLuca, Vuk Ercegovac, Thomas D. Griffin, Jun Rao, Asim V. Singh, Kevin B. Wang
  • Patent number: 9104749
    Abstract: A computer program product for an indexer-agnostic index building system includes a computer readable storage medium to store a computer readable program, wherein the computer readable program, when executed on a computer, causes the computer to perform operations for creating a semantically aggregated index. The operations include: extracting documents from a data source, wherein each document includes a data object; distributing the documents to a plurality of processing nodes within the system; for each node: indexing the data objects for each document into fields using semantic rules; and grouping indexed data objects for related fields by: classifying the documents into logical groups based on the semantic rules; and creating a searchable index shard for related logical groups.
    Type: Grant
    Filed: January 12, 2011
    Date of Patent: August 11, 2015
    Assignee: International Business Machines Corporation
    Inventors: Alfredo Alba, Chad E. DeLuca, Vuk Ercegovac, Thomas D. Griffin, Jun Rao, Asim V. Singh, Kevin B. Wang
  • Publication number: 20150220510
    Abstract: Embodiments of the present invention relate to interactive optimization of messages published to digital media based on past performance of similar messages. In one embodiment, an input token is received. At least one candidate substitute token is retrieved from a dictionary. The dictionary comprises a mapping from the input token to the at least one candidate substitute token. A score associated with the at least one candidate substitute token is determined. A score associated with the input token is determined. The score associated with the input token, the at least one candidate substitute token, and the score associated with the at least one candidate substitute token are outputted.
    Type: Application
    Filed: January 31, 2014
    Publication date: August 6, 2015
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Alfredo Alba, Timothy Bethea, Clemens Drews, Daniel Gruhl, Neal R. Lewis, Meenakshi Nagarajan
  • Publication number: 20150220643
    Abstract: Embodiments of the present invention relate to scoring of messages published to digital media based on past performance of similar messages. In one embodiment, an input token is received. A plurality of messages is selected from a corpus of messages. Each of the plurality of messages has a publication time and contents. The contents of each of the plurality of messages include the input token. A plurality of root messages is determined from the plurality of messages. Each of the plurality of root messages relates to at least one related message. The at least one related message is one of the plurality of messages. Each of the plurality of root messages is the earliest message of the corpus of messages related to its at least one related message. A score is determined for the input token based on the plurality of root messages.
    Type: Application
    Filed: January 31, 2014
    Publication date: August 6, 2015
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Alfredo Alba, Clemens Drews, Daniel Gruhl, Neal R. Lewis, Meenakshi Nagarajan
  • Patent number: 8352412
    Abstract: A system for transforming domain specific unstructured data into structured data including an intake platform controlled by feedback from a control platform. The intake platform includes an intake acquisition module for acquiring data building baseline data related to a domain and problem of interest, an intake pre-processing module, an intake language module, an intake application descriptors module, and an intake adjudication module. The control platform includes a control data acquisition module, a control data consistency collator, a control auditor, a control event definition and policy repository, an error resolver, and an output that outputs results of the workflow into structured data enabled to be used in data analytics.
    Type: Grant
    Filed: February 27, 2009
    Date of Patent: January 8, 2013
    Assignee: International Business Machines Corporation
    Inventors: Alfredo Alba, Varun Bhagwan, Tyrone W. A. Grandison, Daniel F. Gruhl, Jan H. Pieper
  • Publication number: 20120323920
    Abstract: A method for creating a semantically aggregated index in an indexer-agnostic index building system includes: extracting documents from a data source, each document including a data object; distributing the documents to a plurality of processing nodes within the system; for each node: indexing the data objects for each document into fields using semantic rules; and grouping indexed data objects for related fields by: classifying the documents into logical groups based on the semantic rules; and creating a searchable index shard for related logical groups.
    Type: Application
    Filed: August 24, 2012
    Publication date: December 20, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Alfredo Alba, Chad E. DeLuca, Vuk Ercegovac, Thomas D. Griffin, Jun Rao, Asim V. Singh, Kevin B. Wang
  • Publication number: 20120323919
    Abstract: Embodiments of the invention relate to building a distributed reverse semantic index. In one general embodiment, a plurality of documents are received, with each document having at least one defined rule and/or semantic. The documents are distributed among a plurality of nodes of a system. The documents are processed in a generally parallel fashion. Processing the documents includes processing text data of each of the documents and breaking each document into fields to index the text data to create index data by deferring how to categorize the text data based upon the defined rule and/or semantics. The indexed data is combined back together to create an indexer-agnostic semantic index including a plurality of the semantic index shards and to semantically classify the documents based on the index shards into groups based on document type to create the distributed reverse semantic index.
    Type: Application
    Filed: August 27, 2012
    Publication date: December 20, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Alfredo Alba, Chad E. DeLuca, Vuk Ercegovac, Thomas D. Griffin, Jun Rao, Eugene J. Shekita, Asim V. Singh, Yuanyuan Tian, Kevin B. Wang
  • Patent number: 8307025
    Abstract: The present invention relates to a method for the configurable real time transformation of dissimilar data sources, the method further consisting of the steps of acquiring real time information pertaining to at least one data source, wherein the information comprises reference information that is associated with the data source, data transformation specification information that is associated with the data source, and scheduled event specification information that is associated with the data source, and maintaining the data source information.
    Type: Grant
    Filed: May 29, 2008
    Date of Patent: November 6, 2012
    Assignee: International Business Machines Corporation
    Inventor: Alfredo Alba
  • Publication number: 20120254089
    Abstract: Embodiments of the invention relate to building a distributed reverse semantic index. In one general embodiment, a plurality of documents are received, with each document having at least one defined rule and/or semantic. The documents are distributed among a plurality of nodes of a system. The documents are processed in a generally parallel fashion. Processing the documents includes processing text data of each of the documents and breaking each document into fields to index the text data to create index data by deferring how to categorize the text data based upon the defined rule and/or semantics. The indexed data is combined back together to create an indexer-agnostic semantic index including a plurality of the semantic index shards and to semantically classify the documents based on the index shards into groups based on document type to create the distributed reverse semantic index.
    Type: Application
    Filed: March 31, 2011
    Publication date: October 4, 2012
    Applicant: International Business Machines Corporation
    Inventors: Alfredo Alba, Chad E. DeLuca, Vuk Ercegovac, Thomas D. Griffin, Jun Rao, Eugene J. Shekita, Asim V. Singh, Yuanyuan Tian, Kevin B. Wang
  • Publication number: 20120179684
    Abstract: A computer program product for an indexer-agnostic index building system includes a computer readable storage medium to store a computer readable program, wherein the computer readable program, when executed on a computer, causes the computer to perform operations for creating a semantically aggregated index. The operations include: extracting documents from a data source, wherein each document includes a data object; distributing the documents to a plurality of processing nodes within the system; for each node: indexing the data objects for each document into fields using semantic rules; and grouping indexed data objects for related fields by: classifying the documents into logical groups based on the semantic rules; and creating a searchable index shard for related logical groups.
    Type: Application
    Filed: January 12, 2011
    Publication date: July 12, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Alfredo Alba, Chad E. DeLuca, Vuk Ercegovac, Thomas D. Griffin, Jun Rao, Asim V. Singh, Kevin B. Wang
  • Patent number: 7809711
    Abstract: The present invention provides a method, system, and service of analyzing electronic documents in an intranet, where the intranet includes a plurality of web sites. In an exemplary embodiment, the method, system, and service include (1) crawling HTML content and text content in a set of the sites, (2) deep-scanning non-HTML content and non-text content in the set of sites, (3) reverse-scanning the set of sites, (4) performing a semantic analysis of the crawled content and the deep-scanned content, (5) correlating the results of the semantic analysis with the results of the reverse-scanning, and (6) comparing user navigation patterns and content from the members of the set of sites. In a further embodiment, the method, system, and service further include combining the results of the performing, the results of the correlating, and the results of the comparing.
    Type: Grant
    Filed: June 2, 2006
    Date of Patent: October 5, 2010
    Assignee: International Business Machines Corporation
    Inventors: Alfredo Alba, Varun Bhagwan, Daniel Frederick Gruhl, Savitha Srinivasan
  • Publication number: 20100223226
    Abstract: A system for transforming domain specific unstructured data into structured data including an intake platform controlled by feedback from a control platform. The intake platform includes an intake acquisition module for acquiring data building baseline data related to a domain and problem of interest, an intake pre-processing module, an intake language module, an intake application descriptors module, and an intake adjudication module. The control platform includes a control data acquisition module, a control data consistency collator, a control auditor, a control event definition and policy repository, an error resolver, and an output that outputs results of the workflow into structured data enabled to be used in data analytics.
    Type: Application
    Filed: February 27, 2009
    Publication date: September 2, 2010
    Applicant: International Business Machines Corporation
    Inventors: Alfredo Alba, Varun Bhagwan, Tyrone W. A. Grandison, Daniel F. Gruhl, Jan H. Pieper
  • Publication number: 20090248631
    Abstract: A method and system for processing complex long running queries with respect to a database in which the database workload is determined in terms of quality of service (QoS) requirements with respect to short running queries, which can be of a transactional type, in which long running queries are partitioned into a plurality of sub-queries that satisfy the database QoS requirements, are then processed and the results of processing the plurality of sub-queries are aggregated so as to correspond to the processing of the long running query.
    Type: Application
    Filed: March 31, 2008
    Publication date: October 1, 2009
    Inventors: Alfredo Alba, Nikolaos Anerousis, Michael Ching, Genady Y. Grabarnik, Larisa Shwartz
  • Publication number: 20080294790
    Abstract: The present invention relates to a method for the configurable real time transformation of dissimilar data sources, the method further consisting of the steps of acquiring real time information pertaining to at least one data source, wherein the information comprises reference information that is associated with the data source, data transformation specification information that is associated with the data source, and scheduled event specification information that is associated with the data source, and maintaining the data source information.
    Type: Application
    Filed: May 29, 2008
    Publication date: November 27, 2008
    Applicant: International Business Machines Corporation
    Inventor: Alfredo Alba
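
As an illustration of one idea in the listing above, the abstract for publication 20090248631 describes partitioning a long running query into sub-queries, processing them, and aggregating the results. The Python sketch below is only a rough reading of that idea and is not taken from the patent: the in-memory dictionary standing in for a database table, the function names, and the use of a fixed `max_span` as a stand-in for a QoS budget are all assumptions made for illustration.

```python
from typing import Dict, Iterator, Tuple


def partition_range(lo: int, hi: int, max_span: int) -> Iterator[Tuple[int, int]]:
    """Split the half-open key range [lo, hi) into chunks of at most max_span keys."""
    if max_span <= 0:
        raise ValueError("max_span must be positive")
    start = lo
    while start < hi:
        end = min(start + max_span, hi)
        yield (start, end)
        start = end


def run_partitioned_sum(rows: Dict[int, int], lo: int, hi: int, max_span: int) -> int:
    """Sum the values for keys in [lo, hi) one bounded sub-query at a time.

    `rows` is a hypothetical stand-in for a table. `max_span` is an assumed
    proxy for a QoS limit: each sub-query covers at most that many keys.
    """
    total = 0
    for start, end in partition_range(lo, hi, max_span):
        # One sub-query: only keys in [start, end) are considered here.
        total += sum(value for key, value in rows.items() if start <= key < end)
    # The aggregated total corresponds to running the whole range at once.
    return total
```

Because the sub-ranges are disjoint and together cover the full range, the aggregated sum matches the result of a single query over the whole range. A real system would presumably size each sub-query from the measured workload rather than from a fixed constant.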