Patents by Inventor David Carmel

David Carmel has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20090327271
    Abstract: Information retrieval with unified search between heterogeneous objects is described. The method includes: indexing a first object as a document in a search index; referencing a second object related to the first object in a facet of the document; and storing a relationship strength between the first and second objects in the facet of the document in the search index. Multiple heterogeneous objects can be related to the first object and referenced in multiple facets of the document, each with its relationship strength to the first object. Scoring an indirect object by indirect relation to a query object can be carried out by aggregating the relationship strengths between the indirect object and the retrieved objects multiplied by the retrieved objects' direct scores of relationship strength to the query object.
    Type: Application
    Filed: June 30, 2008
    Publication date: December 31, 2009
    Inventors: Einat Amitay, David Carmel, Nadav Golbandi, Nadav Y. Har'el, Shila Ofek-Koifman, Sivan Yogev
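
The indirect scoring described in the abstract above can be illustrated with a short sketch. It is not the claimed implementation: the function name `score_indirect`, the dictionary layout, and the sample values are all assumptions made for illustration.

```python
from collections import defaultdict

def score_indirect(direct_scores, facets):
    """Aggregate indirect scores for objects referenced in document facets.

    direct_scores: {retrieved_object: direct score against the query}
    facets: {retrieved_object: {related_object: relationship_strength}}
    (Both layouts are hypothetical.)
    """
    totals = defaultdict(float)
    for obj, direct in direct_scores.items():
        for related, strength in facets.get(obj, {}).items():
            # Relationship strength multiplied by the retrieved object's direct score.
            totals[related] += direct * strength
    return dict(totals)

# Hypothetical data: two documents retrieved for a query, each referencing people.
direct_scores = {"doc1": 2.0, "doc2": 1.0}
facets = {
    "doc1": {"alice": 0.5, "bob": 1.0},
    "doc2": {"alice": 1.0},
}
indirect = score_indirect(direct_scores, facets)
```

Here `alice` is reached through both documents, so her score sums two products, while `bob` is reached through one.
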
  • Publication number: 20090307209
    Abstract: An apparatus for searching a document collection is provided.
    Type: Application
    Filed: June 10, 2008
    Publication date: December 10, 2009
    Inventors: David Carmel, Adam Darlow, Yael Petruschka, Aya Soffer
  • Publication number: 20090276457
    Abstract: A method and system are provided for maintaining profiles of information channels available on the Web, wherein the information channels are accessed via pull-only protocols. The method includes monitoring one or more channels by a channel pull action at a monitoring rate, wherein the monitoring rate is determined for the one or more channels based on the number of update events in a previous time period. The method may optionally include filtering the update events in the time period by a novelty measure, wherein the filtering disregards events that do not include significant novel information. The monitoring rate is adapted based on reinforcement learning applying iterative learning rules over time.
    Type: Application
    Filed: April 30, 2008
    Publication date: November 5, 2009
    Inventors: David Carmel, Haggai Roitman, Elad Yom-Tov
  • Publication number: 20090222441
    Abstract: Disclosed is a system architecture, components and a searching technique for an Unstructured Information Management System (UIMS). The UIMS may be provided as middleware for the effective management and interchange of unstructured information over a wide array of information sources. The architecture generally includes a search engine, data storage, analysis engines containing pipelined document annotators and various adapters. The searching technique makes use of a two-level searching technique. A search query includes a search operator containing a plurality of search sub-expressions each having an associated weight value. The search engine returns a document or documents having a weight value sum that exceeds a threshold weight value sum. The search operator is implemented as a Boolean predicate that functions as a Weighted AND (WAND).
    Type: Application
    Filed: June 13, 2008
    Publication date: September 3, 2009
    Inventors: Andrei Z. Broder, David Carmel, Michael Herscovici, Aya Soffer, Jason Zien
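
The Weighted AND predicate named in the abstract above can be sketched as a weighted threshold test. This is only an illustration under stated assumptions: sub-expressions are reduced to single terms, a document is a set of terms, and the comparison is taken as "meets or exceeds" the threshold (the abstract says "exceeds"). The function name `wand` and the sample data are hypothetical.

```python
def wand(threshold, weighted_terms, document_terms):
    """Return True when the weights of the matched terms sum to at least `threshold`.

    weighted_terms: list of (term, weight) pairs standing in for sub-expressions.
    document_terms: set of terms present in the document.
    """
    total = sum(weight for term, weight in weighted_terms if term in document_terms)
    return total >= threshold

# Hypothetical query with three unit-weight terms and a document matching two of them.
terms = [("search", 1.0), ("index", 1.0), ("wand", 1.0)]
doc = {"search", "index"}

strict = wand(3.0, terms, doc)    # behaves like AND: every term must match
loose = wand(1.0, terms, doc)     # behaves like OR: any single term suffices
partial = wand(2.0, terms, doc)   # in between: any two terms suffice
```

With unit weights, moving the threshold between 1 and the number of terms shifts the predicate from OR-like to AND-like behavior.
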
  • Publication number: 20090106375
    Abstract: A method and system are provided for conversation detection in email systems. Multiple email messages are provided and grouped as relating to a conversation. The grouping is carried out by applying a similarity function based on a similarity of the email messages' attributes, the similarity function including a similarity between the email messages' participants and at least one of a similarity between the email messages' subjects or a similarity between the email messages' contents. The similarity function may also include the similarity between the email messages' dates. The similarity function may also include weightings for the contributions of the email messages' attributes. A graphical user interface is provided in an email client which includes means for viewing email messages by conversation.
    Type: Application
    Filed: October 23, 2007
    Publication date: April 23, 2009
    Inventors: David Carmel, Shai Erera
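
A weighted similarity function of the kind the abstract above describes might look like the following sketch. The Jaccard measure, the weights, the threshold, the greedy grouping pass, and the message fields are assumptions chosen for illustration; the abstract does not specify them.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections, treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def message_similarity(m1, m2, w_participants=0.5, w_subject=0.3, w_content=0.2):
    """Weighted sum of participant, subject, and content similarity (assumed weights)."""
    return (
        w_participants * jaccard(m1["participants"], m2["participants"])
        + w_subject * jaccard(m1["subject"].split(), m2["subject"].split())
        + w_content * jaccard(m1["content"].split(), m2["content"].split())
    )

def group_conversations(messages, threshold=0.5):
    """Greedy grouping: attach a message to the first conversation with a similar member."""
    conversations = []
    for msg in messages:
        for conv in conversations:
            if any(message_similarity(msg, other) >= threshold for other in conv):
                conv.append(msg)
                break
        else:
            conversations.append([msg])
    return conversations

# Hypothetical mailbox.
messages = [
    {"id": 1, "participants": {"ann", "bob"}, "subject": "budget review",
     "content": "please review the budget draft"},
    {"id": 2, "participants": {"ann", "bob"}, "subject": "re budget review",
     "content": "the budget draft looks fine"},
    {"id": 3, "participants": {"carl", "dana"}, "subject": "lunch friday",
     "content": "lunch on friday at noon"},
]
groups = group_conversations(messages)
group_ids = [[m["id"] for m in conv] for conv in groups]
```

Messages 1 and 2 share participants and most subject words, so they land in one conversation; message 3 shares nothing and starts its own.
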
  • Patent number: 7512602
    Abstract: Disclosed is a system architecture, components and a searching technique for an Unstructured Information Management System (UIMS). The UIMS may be provided as middleware for the effective management and interchange of unstructured information over a wide array of information sources. The architecture generally includes a search engine, data storage, analysis engines containing pipelined document annotators and various adapters. The searching technique makes use of a two-level searching technique. A search query includes a search operator containing a plurality of search sub-expressions each having an associated weight value. The search engine returns a document or documents having a weight value sum that exceeds a threshold weight value sum. The search operator is implemented as a Boolean predicate that functions as a Weighted AND (WAND).
    Type: Grant
    Filed: November 30, 2006
    Date of Patent: March 31, 2009
    Assignee: International Business Machines Corporation
    Inventors: Andrei Z. Broder, David Carmel, Michael Herscovici, Aya Soffer, Jason Zien
  • Publication number: 20090055481
    Abstract: The present invention discloses an email application that includes a subject evaluation engine, which is able to automatically generate a subject heading suggestion for an email message based upon content contained in the email message. The subject evaluation engine can also compare a pre-existing subject heading of the email message against content contained in the email message. User selectable interface elements can be included in the email application for invoking the suggestion and comparison functions of the subject evaluation engine. Further, the subject evaluation can automatically be initiated before an email message is sent, can be used to notify a user when the message's subject is inconsistent with the message's content, and can suggest one or more replacement subject headings for the current heading.
    Type: Application
    Filed: August 20, 2007
    Publication date: February 26, 2009
    Applicant: International Business Machines Corporation
    Inventors: David Carmel, Shai Erera, Itzhack Goldberg, Boaz Mizrachi
  • Patent number: 7406462
    Abstract: A query difficulty prediction unit includes a query difficulty predictor to determine the extent of overlap between query documents received from a search engine operating on an input query and sub-query documents received from the search engine operating on sub-queries of the input query. The unit generates a query difficulty prediction from the extent of overlap.
    Type: Grant
    Filed: October 19, 2004
    Date of Patent: July 29, 2008
    Assignee: International Business Machines Corporation
    Inventors: David Carmel, Lawrence Adam Darlow, Shai Fine, Elad Yom-Tov
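
The overlap idea in the abstract above can be illustrated with a toy example. Everything here is an assumption made for the sketch: the `search` function is a stand-in for a real search engine, each sub-query is a single query term, and the final score (mean overlap divided by the result-list size, higher suggesting an easier query) is a simplification rather than the predictor the patent describes.

```python
def search(index, query_terms, k):
    """Toy search engine: rank documents by how many query terms they contain."""
    scores = {}
    for doc_id, terms in index.items():
        hits = sum(1 for t in query_terms if t in terms)
        if hits:
            scores[doc_id] = hits
    # Sort by descending score, breaking ties by document id for determinism.
    ranked = sorted(scores, key=lambda d: (-scores[d], d))
    return ranked[:k]

def overlap_features(index, query_terms, k):
    """Overlap between the full query's top-k and each single-term sub-query's top-k."""
    full = set(search(index, query_terms, k))
    return [len(full & set(search(index, [t], k))) for t in query_terms]

def agreement_score(overlaps, k):
    """Assumed scoring rule: mean overlap as a fraction of k."""
    return sum(overlaps) / (len(overlaps) * k)

# Hypothetical four-document collection.
index = {
    "d1": {"query", "difficulty", "prediction"},
    "d2": {"query", "difficulty"},
    "d3": {"query", "expansion"},
    "d4": {"index", "pruning"},
}
easy = overlap_features(index, ["query", "difficulty"], k=2)
hard = overlap_features(index, ["query", "expansion"], k=2)
easy_score = agreement_score(easy, k=2)
hard_score = agreement_score(hard, k=2)
```

In the second query, each sub-query agrees with the full query on only one of the two top documents, so the score drops.
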
  • Patent number: 7401073
    Abstract: A method for searching a document collection includes providing an index of terms indicating the documents in which the terms appear. A first statistical distribution of each of at least some of the terms in the index and a second statistical distribution of each of at least some of the categories are estimated over the documents in the collection. A query including one or more of the terms and a category restriction referring to at least one of the categories is accepted. A modified term distribution is produced by operating on the first statistical distribution of at least one of the terms in the query using the second statistical distribution, responsively to the category restriction. The query is applied to the index to return a response, in which occurrences of the at least one of the terms are scored responsively to the modified term distribution.
    Type: Grant
    Filed: April 28, 2005
    Date of Patent: July 15, 2008
    Assignee: International Business Machines Corporation
    Inventors: David Carmel, Adam Darlow, Yael Petruschka, Aya Soffer
  • Publication number: 20080147644
    Abstract: A method for searching a corpus of documents, such as the World Wide Web, includes defining a knowledge domain and identifying a set of reference documents in the corpus pertinent to the domain. Upon inputting a query, the corpus is searched using the set of reference documents to find one or more of the documents in the corpus that contain information in the domain relevant to the query. The set of reference documents is updated with the found documents that are most relevant to the domain. The updated set is used in searching the corpus for information in the domain relevant to subsequent queries.
    Type: Application
    Filed: December 10, 2007
    Publication date: June 19, 2008
    Inventors: Yariv Aridor, David Carmel, Michael Herscovici, Yoelle Maarek-Smadja, Aya Soffer, Ronny Lempel
  • Patent number: 7356527
    Abstract: An apparatus and method are provided for pruning an index of a corpus of text documents by creating an inverted index of terms appearing in the documents, wherein the index includes postings of the terms in the documents, ranking the postings in the index, and pruning from the index the postings below a given level in the ranking.
    Type: Grant
    Filed: December 19, 2001
    Date of Patent: April 8, 2008
    Assignee: International Business Machines Corporation
    Inventors: David Carmel, Doron Cohen, Ronald Fagin, Eitan Farchi, Michael Herscovici, Yoelle Maarek, Aya Soffer
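
The build, rank, and prune steps in the abstract above can be sketched as follows. Ranking postings by raw term frequency and keeping a fixed number of postings per term are assumptions made for illustration; the abstract does not say how postings are ranked or where the cut-off level is set.

```python
from collections import Counter

def build_index(docs):
    """Inverted index: term -> list of (doc_id, term_frequency) postings."""
    index = {}
    for doc_id, text in docs.items():
        for term, tf in Counter(text.split()).items():
            index.setdefault(term, []).append((doc_id, tf))
    return index

def prune(index, keep):
    """Keep only the `keep` highest-ranked postings of each term.

    Postings are ranked by term frequency (an assumed ranking), with ties
    broken by document id so the result is deterministic.
    """
    pruned = {}
    for term, postings in index.items():
        ranked = sorted(postings, key=lambda p: (-p[1], p[0]))
        pruned[term] = ranked[:keep]
    return pruned

# Hypothetical three-document corpus.
docs = {
    "d1": "search search search index",
    "d2": "search index index",
    "d3": "search",
}
full_index = build_index(docs)
pruned_index = prune(full_index, keep=2)
```

The term `search` starts with three postings and keeps its two highest-ranked ones; `index` has only two postings, so none are dropped.
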
  • Publication number: 20080033971
    Abstract: A method and system for analyzing a document set (202, 420) are provided. The method includes determining a set of terms (312) from the terms of the document set that minimizes a distance measurement (405) from the given set of documents (420). The method includes using a greedy algorithm to build the set of terms incrementally, at each stage finding a single word that is closest to the document set (202, 420). The set of terms is evaluated to assess the ability to find the document set (202, 420). The set of terms is compared with expected terms to evaluate the ability to find the document set (202, 420). A measure of the ability to find a document set (202, 420) is provided by computing a distance measure (403) between a document set and an entire collection.
    Type: Application
    Filed: August 1, 2006
    Publication date: February 7, 2008
    Inventors: David Carmel, Adam Darlow, Shai Fine, Dan Pelleg, Elad Yom-Tov
  • Patent number: 7318057
    Abstract: A method for searching a corpus of documents, such as the World Wide Web, includes defining a knowledge domain and identifying a set of reference documents in the corpus pertinent to the domain. Upon inputting a query, the corpus is searched using the set of reference documents to find one or more of the documents in the corpus that contain information in the domain relevant to the query. The set of reference documents is updated with the found documents that are most relevant to the domain. The updated set is used in searching the corpus for information in the domain relevant to subsequent queries.
    Type: Grant
    Filed: August 1, 2003
    Date of Patent: January 8, 2008
    Assignee: International Business Machines Corporation
    Inventors: Yariv Aridor, David Carmel, Michael Herscovici, Yoelle Maarek-Smadja, Aya Soffer, Ronny Lempel
  • Publication number: 20070112763
    Abstract: Disclosed is a system architecture, components and a searching technique for an Unstructured Information Management System (UIMS). The UIMS may be provided as middleware for the effective management and interchange of unstructured information over a wide array of information sources. The architecture generally includes a search engine, data storage, analysis engines containing pipelined document annotators and various adapters. The searching technique makes use of a two-level searching technique. A search query includes a search operator containing a plurality of search sub-expressions each having an associated weight value. The search engine returns a document or documents having a weight value sum that exceeds a threshold weight value sum. The search operator is implemented as a Boolean predicate that functions as a Weighted AND (WAND).
    Type: Application
    Filed: November 30, 2006
    Publication date: May 17, 2007
    Inventors: Andrei Broder, David Carmel, Michael Herscovici, Aya Soffer, Jason Zien
  • Publication number: 20070016545
    Abstract: A method and system for the detection of missing content in a searchable repository is provided. A system includes: a missing content query identifier (401) for identifying queries to a search engine (102) for which no or little relevant content is returned; a missing content detector (110) which clusters missing content queries by topic; and an output provider for providing details of a missing content topic.
    Type: Application
    Filed: July 14, 2005
    Publication date: January 18, 2007
    Applicant: International Business Machines Corporation
    Inventors: Andrei Broder, David Carmel, Adam Darlow, Shai Fine, Elad Yom-Tov
  • Publication number: 20070016574
    Abstract: A method and system are provided for merging results in distributed information retrieval. A search manager (104) is in communication with a plurality of components, wherein a component is a search engine (106-108) working on a document collection and returning results in the form of a list of documents to a search query. The search manager (104) submits a query (202) to the plurality of components; receives results (213) from each component in the form of a list of documents; estimates (208) the success of a component in handling the query to generate a merit score (210) for a component per query; applies (220) the merit score (210) to the results for the component; and merges (222) results from the plurality of components by ranking in order of the applied merit score.
    Type: Application
    Filed: July 14, 2005
    Publication date: January 18, 2007
    Applicant: International Business Machines Corporation
    Inventors: David Carmel, Adam Darlow, Shai Fine, Elad Yom-Tov
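
The merging step in the abstract above can be sketched as follows. The merit scores are supplied by hand here, and applying a merit score by multiplying it into each document's component-local score is an assumption; the abstract does not say how the score is estimated or applied. The function and engine names are hypothetical.

```python
def merge_results(component_results, merit_scores):
    """Merge per-component result lists into one ranking.

    component_results: {component: [(doc_id, local_score), ...]}
    merit_scores: {component: merit score for this query}
    Returns (doc_id, weighted_score, component) tuples, best first.
    """
    merged = []
    for component, results in component_results.items():
        merit = merit_scores[component]
        for doc_id, score in results:
            # Assumed way of applying the merit score: scale the local score.
            merged.append((doc_id, score * merit, component))
    merged.sort(key=lambda r: -r[1])
    return merged

# Hypothetical results from two search engines for the same query.
component_results = {
    "engineA": [("a1", 0.9), ("a2", 0.5)],
    "engineB": [("b1", 0.8), ("b2", 0.7)],
}
merit_scores = {"engineA": 0.5, "engineB": 1.0}
merged = merge_results(component_results, merit_scores)
merged_ids = [doc_id for doc_id, _, _ in merged]
```

Although `a1` has the highest local score, the lower merit of its engine places it behind both results from `engineB`.
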
  • Patent number: 7146361
    Abstract: Disclosed is a system architecture, components and a searching technique for an Unstructured Information Management System (UIMS). The UIMS may be provided as middleware for the effective management and interchange of unstructured information over a wide array of information sources. The architecture generally includes a search engine, data storage, analysis engines containing pipelined document annotators and various adapters. The searching technique makes use of a two-level searching technique. A search query includes a search operator containing a plurality of search sub-expressions each having an associated weight value. The search engine returns a document or documents having a weight value sum that exceeds a threshold weight value sum. The search operator is implemented as a Boolean predicate that functions as a Weighted AND (WAND).
    Type: Grant
    Filed: May 30, 2003
    Date of Patent: December 5, 2006
    Assignee: International Business Machines Corporation
    Inventors: Andrei Z. Broder, David Carmel, Michael Herscovici, Aya Soffer, Jason Zien
  • Patent number: 7139752
    Abstract: Disclosed is a system architecture, components and a searching technique for an Unstructured Information Management System (UIMS). The UIMS may be provided as middleware for the effective management and interchange of unstructured information over a wide array of information sources. The architecture generally includes a search engine, data storage, analysis engines containing pipelined document annotators and various adapters. The searching technique makes use of a two-level searching technique. Also disclosed are a system, method and computer program product to process document data. The method includes inputting a document and operating at least one text analysis engine that comprises a plurality of coupled annotators for tokenizing document data for identifying and annotating a particular type of semantic content. Operating the at least one text analysis engine generates a plurality of views of a document, where each of the plurality of views is derived from a different tokenization of the document.
    Type: Grant
    Filed: May 30, 2003
    Date of Patent: November 21, 2006
    Assignee: International Business Machines Corporation
    Inventors: Andrei Z. Broder, David Carmel, Arthur C. Ciccolo, David Ferrucci, Yoelle Maarek, Yosi Mass, Aya Soffer, Wlodek W. Zadrozny
  • Publication number: 20060248074
    Abstract: A method for searching a document collection includes providing an index of terms indicating the documents in which the terms appear. A first statistical distribution of each of at least some of the terms in the index and a second statistical distribution of each of at least some of the categories are estimated over the documents in the collection. A query including one or more of the terms and a category restriction referring to at least one of the categories is accepted. A modified term distribution is produced by operating on the first estimated statistical distribution of at least one of the terms in the query using the second estimated statistical distribution of the at least one of the categories, responsively to the category restriction. The query is applied to the index so as to return a response, in which occurrences of the at least one of the terms are scored responsively to the modified term distribution.
    Type: Application
    Filed: April 28, 2005
    Publication date: November 2, 2006
    Applicant: International Business Machines Corporation
    Inventors: David Carmel, Adam Darlow, Yael Petruschka, Aya Soffer
  • Patent number: 7072827
    Abstract: A method for morphological disambiguation includes receiving an input string and morphologically analyzing the string to generate a list of candidate analyses of the string, each candidate analysis including a respective word and a linguistic pattern of the word. The pattern of each of the analyses is evaluated against a predefined criterion in order to select one or more of the analyses from the list. The method is suitable particularly for computerized analysis and searching in Hebrew and other Semitic languages.
    Type: Grant
    Filed: June 29, 2000
    Date of Patent: July 4, 2006
    Assignee: International Business Machines Corporation
    Inventors: David Carmel, Yoelle Maarek-Smadja, Victoria Skoblikov