Patents by Inventor David A. Ferrucci

David A. Ferrucci has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20090292687
    Abstract: A system, method and computer program product for conducting questions and answers with deferred type evaluation based on any corpus of data. The method includes processing a query, including waiting until a “Type” (i.e., a descriptor) is determined and a candidate answer is provided; the Type is not required to be part of a predetermined ontology but is only a lexical/grammatical item. Then, a search is conducted for evidence that the candidate answer has the required lexical answer type (LAT) (e.g., as determined by a matching function that can leverage a parser, a semantic interpreter and/or a simple pattern matcher). In another embodiment, an attempt may be made to match the LAT to a known ontological type and then look up a candidate answer in an appropriate knowledge base, database, or the like determined by that type. Then, all the evidence from all the different ways of determining that the candidate answer has the expected LAT is combined and one or more answers are delivered to a user.
    Type: Application
    Filed: May 23, 2008
    Publication date: November 26, 2009
    Applicant: International Business Machines Corporation
    Inventors: James Fan, David Ferrucci, David C. Gondek, Wlodek W. Zadrozny
  • Publication number: 20090287678
    Abstract: A system, method and computer program product for providing answers to questions based on any corpus of data. The method facilitates generating a number of candidate passages from the corpus that answer an input query, and finds the correct resulting answer by collecting supporting evidence from the multiple passages. By analyzing all retrieved passages and each passage's metadata in parallel, a plurality of output data structures including candidate answers is generated. Then, supporting passage retrieval operations are performed upon the set of candidate answers by each of a plurality of parallel operating modules, and for each candidate answer, the data corpus is traversed to find those passages containing the candidate answer in addition to query terms. All candidate answers are automatically scored using the supporting passages by a plurality of scoring modules, each producing a module score.
    Type: Application
    Filed: May 14, 2008
    Publication date: November 19, 2009
    Applicant: International Business Machines Corporation
    Inventors: Eric W. Brown, David Ferrucci, Adam Lally, Wlodek W. Zadrozny
  • Publication number: 20080168080
    Abstract: An unknown annotator and its annotation type system are compared against a reference annotation type system. The comparison is done by providing a plurality of documents, and annotating each document using the reference set of document annotators, producing instances of reference annotation types, to generate a pre-annotated reference document set, and using the subject annotator and its subject annotation type system to generate a pre-annotated evaluation document set. Documents in the pre-annotated evaluation document set are compared to documents in the pre-annotated reference document set, and matches in location, within the compared documents, of instances of the subject annotation types and the reference annotation types are identified. Based on the matching data, reference document annotation types are selected that sufficiently correlate with the subject annotation type system.
    Type: Application
    Filed: January 5, 2007
    Publication date: July 10, 2008
    Inventors: Yurdaer N. Doganata, Youssef Drissi, David A. Ferrucci, Tong-Haing Fin, Genady Grabarnik, Lev Kozakov
  • Publication number: 20070168193
    Abstract: A method (and system) which autonomously generates a cohesive script from a text database for creating a speech corpus for concatenative text-to-speech, and more particularly, which generates cohesive scripts having fluency and natural prosody that can be used to generate compact text-to-speech recordings that cover a plurality of phonetic events.
    Type: Application
    Filed: January 17, 2006
    Publication date: July 19, 2007
    Applicant: International Business Machines Corporation
    Inventors: Andrew Aaron, David Ferrucci, John Pitrelli
  • Patent number: 7139752
    Abstract: Disclosed is a system architecture, components and a searching technique for an Unstructured Information Management System (UIMS). The UIMS may be provided as middleware for the effective management and interchange of unstructured information over a wide array of information sources. The architecture generally includes a search engine, data storage, analysis engines containing pipelined document annotators and various adapters. The searching technique makes use of a two-level searching technique. Also disclosed is a system, method and computer program product to process document data. The method includes inputting a document and operating at least one text analysis engine that comprises a plurality of coupled annotators for tokenizing document data and for identifying and annotating a particular type of semantic content. Operating the at least one text analysis engine generates a plurality of views of a document, where each of the plurality of views is derived from a different tokenization of the document.
    Type: Grant
    Filed: May 30, 2003
    Date of Patent: November 21, 2006
    Assignee: International Business Machines Corporation
    Inventors: Andrei Z. Broder, David Carmel, Arthur C. Ciccolo, David Ferrucci, Yoelle Maarek, Yosi Mass, Aya Soffer, Wlodek W. Zadrozny
  • Publication number: 20040243645
    Abstract: Disclosed is a system architecture, components and a searching technique for an Unstructured Information Management System (UIMS). The UIMS may be provided as middleware for the effective management and interchange of unstructured information over a wide array of information sources. The architecture generally includes a search engine, data storage, analysis engines containing pipelined document annotators and various adapters. The searching technique makes use of a two-level searching technique. Also disclosed is a system, method and computer program product to process document data. The method includes inputting a document and operating at least one text analysis engine that comprises a plurality of coupled annotators for tokenizing document data and for identifying and annotating a particular type of semantic content. Operating the at least one text analysis engine generates a plurality of views of a document, where each of the plurality of views is derived from a different tokenization of the document.
    Type: Application
    Filed: May 30, 2003
    Publication date: December 2, 2004
    Applicant: International Business Machines Corporation
    Inventors: Andrei Z. Broder, David Carmel, Arthur C. Ciccolo, David Ferrucci, Yoelle Maarek, Yosi Mass, Aya Soffer, Wlodek W. Zadrozny
  • Publication number: 20040243560
    Abstract: Disclosed is a system architecture, components and a searching technique for an Unstructured Information Management System (UIMS). The UIMS may be provided as middleware for the effective management and interchange of unstructured information over a wide array of information sources. The architecture generally includes a search engine, data storage, analysis engines containing pipelined document annotators and various adapters. The searching technique makes use of a two-level searching technique. The data processing system includes a token inverted file system storing tokens obtained by at least one tokenizer from document data. An annotation inverted file system stores annotations, a list of one or more occurrences of each annotation, and, for each listed occurrence, a set comprised of at least two token locations spanned by the respective annotation.
    Type: Application
    Filed: May 30, 2003
    Publication date: December 2, 2004
    Applicant: International Business Machines Corporation
    Inventors: Andrei Z. Broder, David Ferrucci, Alan Marwick, Yosi Mass, Wlodek W. Zadrozny
  • Publication number: 20040243556
    Abstract: Disclosed is a system architecture, components and a searching technique for an Unstructured Information Management System (UIMS). The UIMS may be provided as middleware for the effective management and interchange of unstructured information over a wide array of information sources. The architecture generally includes a search engine, data storage, analysis engines containing pipelined document annotators and various adapters. The searching technique makes use of a two-level searching technique. A data structure includes an application program interface specifying a type system, an index repository and workflow composition. The type system specifies inheritance relationships between elements and the attributes and features of the elements, and functions to specify constraints on an execution sequence of the document annotators. The index repository records results of processing of the document annotators.
    Type: Application
    Filed: May 30, 2003
    Publication date: December 2, 2004
    Applicant: International Business Machines Corporation
    Inventors: David Ferrucci, Thilo Goetz, Thomas Hampp, Alan D. Marwick, Oliver Suhre, Wlodek W. Zadrozny
  • Publication number: 20040243554
    Abstract: Disclosed is a system architecture, components and a searching technique for an Unstructured Information Management System (UIMS). The UIMS may be provided as middleware for the effective management and interchange of unstructured information over a wide array of information sources. The architecture generally includes a search engine, data storage, analysis engines containing pipelined document annotators and various adapters. The searching technique makes use of a two-level searching technique.
    Type: Application
    Filed: May 30, 2003
    Publication date: December 2, 2004
    Applicant: International Business Machines Corporation
    Inventors: Andrei Z. Broder, Arthur C. Ciccolo, David Ferrucci, Alan D. Marwick, Wlodek W. Zadrozny