Patents by Inventor Haggai Roitman

Haggai Roitman has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11030209
    Abstract: Methods and systems for generating and evaluating fused query lists. A query on a corpus of documents is evaluated using a plurality of retrieval methods and a ranked list for each of the plurality of retrieval methods is obtained. A plurality of fused ranked lists is sampled, each fusing said ranked lists for said plurality of retrieval methods, and the sampled fused ranked lists are sorted. In an unsupervised manner, an objective comprising a likelihood that a fused ranked list, fusing said ranked lists for each of said plurality of retrieval methods, is relevant to a query and a relevance event, is optimized to optimize the sampling, until convergence is achieved. Documents of the fused ranked list are determined based on the optimization.
    Type: Grant
    Filed: December 28, 2018
    Date of Patent: June 8, 2021
    Assignee: International Business Machines Corporation
    Inventors: Haggai Roitman, Bar Weiner, Shai Erera
  • Publication number: 20210165812
    Abstract: A computer-implemented method, computerized apparatus and computer program product for minimum coordination passage scoring. Given a candidate passage in a document collection potentially matching a query received, a set of overlapping terms between the candidate passage and the query is determined. For each overlapping term in the set, a first measure of a weight of the term in the query, a second measure of a weight of the term in the candidate passage, and a third measure of a specificity of the term in the document collection are calculated. A function of the first and second measure is evaluated to obtain a value reflecting a condition on the relation therebetween. A minimum coordination score representing a relative similarity between the candidate passage and the query is determined based on the value and the first, second and third measures obtained for each of the overlapping terms.
    Type: Application
    Filed: December 2, 2019
    Publication date: June 3, 2021
    Inventors: Doron Cohen, Haggai Roitman, Oren Sar-Shalom
  • Publication number: 20210157829
    Abstract: Embodiments may provide automated summarization of documents, such as scientific documents, by using a prior distribution on logical sections learnt from a corpus of human authored summaries. For example, a method of document summarization may comprise receiving, at the computer system, a document and segmenting the document into a plurality of sentences, identifying, at the computer system, sections in the document and aligning each sentence in the document to a section logical role, and summarizing, at the computer system, the document using a probability distribution.
    Type: Application
    Filed: November 21, 2019
    Publication date: May 27, 2021
    Inventors: Odellia Boni, Doron Cohen, Guy Feigenblat, David Konopnicki, Haggai Roitman
  • Publication number: 20210133199
    Abstract: Embodiments of the present systems and methods may provide techniques that provide improved information retrieval. For example, a method may comprise receiving, at the computer system, a query to retrieve a document from a corpus of documents, retrieving, at the computer system, a plurality of documents from the corpus of documents using a plurality of retrieval methods, each retrieval method generating a ranked list of retrieved documents and a score for each document, fusing, at the computer system, the generated ranked list of retrieved documents to form an aggregated ranked list of retrieved documents by re-scoring, at the computer system, the plurality of documents according to its passage scores, with respect to the query and associating, at the computer system, a given document and its maximal passage using relevance information induced from the plurality of ranked lists.
    Type: Application
    Filed: October 31, 2019
    Publication date: May 6, 2021
    Inventors: Shai Erera, Guy Feigenblat, Yosi Mass, Haggai Roitman, Bar Weiner
  • Patent number: 10984168
    Abstract: A system for generating a multi-modal summary of a digital document, comprising a processor adapted for: extracting from the document a plurality of graphical elements; generating a set of textual descriptions, each generated for one of the graphical elements and associated therewith; selecting, from the set of textual descriptions and a set of text fragments extracted from the document, a set of representative elements having a highest score computed by applying thereto a score function, where a set of representative elements' score is indicative of a degree by which the set of representative elements represents the document; for each representative element of the set of representative elements, where the element is a textual description of a graphical element of the plurality of graphical elements, replacing the element with the graphical element associated therewith; and generating, using the set of representative elements, another document comprising a multi-modal summary of the document.
    Type: Grant
    Filed: February 10, 2020
    Date of Patent: April 20, 2021
    Assignee: International Business Machines Corporation
    Inventors: Odellia Boni, Guy Feigenblat, Haggai Roitman
  • Publication number: 20210109959
    Abstract: Automated keyphrase extraction from a digital text document. A pool of candidate keyphrases of the digital text document is created. A cross-entropy method is then employed to compute a set of output keyphrases out of the pool of candidate keyphrases, by iteratively optimizing an objective function that is configured to cause the set of output keyphrases to be descriptive of one or more main topics discussed in the digital text document. The set of output keyphrases may be used for at least one of: text summarization, text categorization, opinion mining, and document indexing.
    Type: Application
    Filed: October 10, 2019
    Publication date: April 15, 2021
    Inventors: Odellia Boni, Doron Cohen, Guy Feigenblat, David Konopnicki, Haggai Roitman
  • Patent number: 10956409
    Abstract: A session search relevance model identifies a user's dynamic information need based on a feedback model and a session relevance model. The feedback model is based on query changes in the session search and user interest in particular documents presented throughout the session search. The relevance model modifies a user's current query to retrieve documents most relevant to a user's information need.
    Type: Grant
    Filed: May 10, 2017
    Date of Patent: March 23, 2021
    Assignee: International Business Machines Corporation
    Inventors: Haggai Roitman, Doron Cohen, Nir Levine
  • Publication number: 20210042383
    Abstract: A system for generating a summary of a text document is disclosed. In some examples, the system includes a processor configured to generate an initial summary of an original document. The initial summary includes a selection of extracted sentences copied from the original document. For each extracted sentence of the initial summary, the processor processes the extracted sentence to generate an abstracted sentence, and generates vector representations of the extracted sentence, the abstracted sentence, the original document, and the current summary. The vector representations are then input to a decision network to compute an editing decision. The editing decision is selected from a group of possible decisions that includes a decision to add the extracted sentence and a decision to add the abstracted sentence. The processor also updates the current summary based on the editing decision.
    Type: Application
    Filed: August 5, 2019
    Publication date: February 11, 2021
    Inventors: Guy Feigenblat, David Konopnicki, Edward Moroshko, Haggai Roitman
  • Patent number: 10902191
    Abstract: A system for generating a summary of a text document is disclosed. In some examples, the system includes a processor configured to generate an initial summary of an original document. The initial summary includes a selection of extracted sentences copied from the original document. For each extracted sentence of the initial summary, the processor processes the extracted sentence to generate an abstracted sentence, and generates vector representations of the extracted sentence, the abstracted sentence, the original document, and the current summary. The vector representations are then input to a decision network to compute an editing decision. The editing decision is selected from a group of possible decisions that includes a decision to add the extracted sentence and a decision to add the abstracted sentence. The processor also updates the current summary based on the editing decision.
    Type: Grant
    Filed: August 5, 2019
    Date of Patent: January 26, 2021
    Assignee: International Business Machines Corporation
    Inventors: Guy Feigenblat, David Konopnicki, Edward Moroshko, Haggai Roitman
  • Publication number: 20210011937
    Abstract: A method comprising receiving digital documents, a query statement, and a summary length constraint; identifying, for each of said digital documents, a sentence subset, based, at least in part, on said query statement, a modified version of said summary length constraint, and a first set of quality objectives; generating, for each of said sentence subsets, a random forest representation; iteratively (i) sampling, from each of said random forest representations, a plurality of tokens to create a corresponding candidate document summary, based, at least in part, on weights assigned to each of said tokens, (ii) assigning a quality ranking to said candidate document summary, based, at least in part, on said first set of quality objectives and a second set of quality objectives, and (iii) adjusting said weights, based, at least in part, on said quality rankings; and outputting a highest ranking said candidate document as a compressed summary.
    Type: Application
    Filed: July 9, 2019
    Publication date: January 14, 2021
    Inventors: Odellia Boni, Doron Cohen, Guy Feigenblat, David Konopnicki, Haggai Roitman
  • Patent number: 10831770
    Abstract: A computer implemented method for estimating quality of document retrieval comprising: retrieving from a corpus of documents stored on at least one storage a plurality of digital documents which comply with a document retrieval query according to a retrieval model; computing a plurality of retrieval scores each calculated for one of the plurality of digital documents using a relevance function scoring a relevance of one of the retrieved plurality of digital documents to the query; computing a calibrated weighted product model (WPM) estimator by calculating a combination of the plurality of retrieval scores weighted according to a plurality of retrieval features of the corpus and/or the query and/or a document, wherein the plurality of retrieval features are weighted according to a relative importance; and using the calibrated WPM estimator to score the plurality of digital documents' relevance to the query.
    Type: Grant
    Filed: December 12, 2017
    Date of Patent: November 10, 2020
    Assignee: International Business Machines Corporation
    Inventors: Shai Erera, Haggai Roitman, Oren Sar-Shalom, Bar Weiner
  • Patent number: 10831806
    Abstract: A system comprising at least one hardware processor; and a non-transitory computer-readable storage medium having stored thereon program instructions executable to receive, as input, one or more digital documents, a query statement, and a summary length constraint, automatically generate, for each of said one or more digital documents, an initial summary based, at least in part, on a first sentence selection which satisfies said query statement, a modified said summary length constraint, and a first summary quality goal, automatically extract, from each of said initial summaries, one or more associated feedback metrics, and automatically generate, for each of said one or more digital documents, a final summary based, at least in part, on: (i) a second sentence selection which satisfies said query statement, said summary length constraint, and a second summary quality goal, and (ii) at least one of said associated feedback metrics.
    Type: Grant
    Filed: October 29, 2018
    Date of Patent: November 10, 2020
    Assignee: International Business Machines Corporation
    Inventors: Guy Feigenblat, David Konopnicki, Haggai Roitman
  • Patent number: 10740338
    Abstract: A method of computing a query performance prediction (QPP), comprising: receiving a target search query and a set of target search results obtained by executing the target search query on a corpus of data-elements, computing variations of the target search query, receiving a candidate set of search results for each of the variations, computing a statistical similarity metric indicative of statistically significant similarity or dissimilarity between each candidate set of search results and the set of target search results, clustering the candidate sets of search results into a cluster of pseudo effective reference lists (PE-RL) according to an association with a statistical similarity requirement, and into a cluster of pseudo ineffective reference lists (PIE-RL) according to an association with a statistical dissimilarity requirement, and computing the QPP of the target search results responsive to the target search query according to an aggregation of the PE-RL cluster and PIE-RL cluster.
    Type: Grant
    Filed: July 23, 2017
    Date of Patent: August 11, 2020
    Assignee: International Business Machines Corporation
    Inventor: Haggai Roitman
  • Publication number: 20200210415
    Abstract: Methods and systems for generating and evaluating fused query lists. A query on a corpus of documents is evaluated using a plurality of retrieval methods and a ranked list for each of the plurality of retrieval methods is obtained. A plurality of fused ranked lists is sampled, each fusing said ranked lists for said plurality of retrieval methods, and the sampled fused ranked lists are sorted. In an unsupervised manner, an objective comprising a likelihood that a fused ranked list, fusing said ranked lists for each of said plurality of retrieval methods, is relevant to a query and a relevance event, is optimized to optimize the sampling, until convergence is achieved. Documents of the fused ranked list are determined based on the optimization.
    Type: Application
    Filed: December 28, 2018
    Publication date: July 2, 2020
    Inventors: Haggai Roitman, Bar Weiner, Shai Erera
  • Publication number: 20200210489
    Abstract: An illustrative embodiment includes a method for post-retrieval query performance prediction using hybrid document-passage information. The method includes: obtaining a set of documents responsive to a specific query; extracting document-level information regarding respective documents within the set; extracting passage-level information regarding respective passages of documents within the set; and estimating a likelihood that the set of documents includes relevant information to the specific query using both the document-level information and the passage-level information.
    Type: Application
    Filed: December 27, 2018
    Publication date: July 2, 2020
    Inventor: Haggai Roitman
  • Publication number: 20200210437
    Abstract: An exemplary method includes: determining a pool of documents, wherein each document is within at least one of a plurality of lists, each of the lists results from executing a query on a corpus, and the corpus comprises at least the pool of documents; determining a first ranking of documents within the pool of documents based at least in part on first scores computed for respective documents within the pool; estimating relevance to the specified query at least of respective documents within the first ranking, wherein the relevance is estimated without user feedback regarding the relevance; and determining a second ranking of documents within the pool based at least in part on second scores computed at least for respective documents within the first ranking, wherein the second score for a given document is computed based at least in part on the estimated relevance of at least the given document.
    Type: Application
    Filed: December 27, 2018
    Publication date: July 2, 2020
    Inventors: Haggai Roitman, Shai Erera, Bar Weiner
  • Publication number: 20200210438
    Abstract: Techniques are disclosed for query performance prediction (QPP) in the fusion-based retrieval setting. Symmetric list similarity measures used in traditional QPP techniques do not properly account for relevance-dependent aspects of the relationship between a given (base) reference list generated using an information retrieval technique and a final fused list generated using a fusion technique, as such a relationship is actually asymmetric. Embodiments more properly model the asymmetric relationship of reference and fused lists using an asymmetric co-relevance model that estimates, assuming a reference list contains relevant information, the odds that the fused list will be observed. In particular, the asymmetric co-relevance between a reference list and a fused list may be determined by adjusting a symmetric co-relevance of the reference list and the fused list using an odds ratio between the symmetric co-relevance of the reference list and the fused list to the reference list's own relevance.
    Type: Application
    Filed: December 31, 2018
    Publication date: July 2, 2020
    Inventors: Haggai Roitman, Shai Erera, Bar Weiner
  • Patent number: 10698908
    Abstract: A computerized method comprising using hardware processor(s) for receiving, from a computerized search engine, digital input data comprising a group of relevancy score sets, where each relevancy score set comprises scores associated with computerized search terms and search field pairs found in electronic documents. Two or more statistical values are computed of the relevancy score sets, one or more of the two or more statistical values for each relevancy score set. Based on some of the two or more statistical values, some relevancy score sets are reduced from the group to create a reduced group. The reduced group is sent to the computerized search engine for presenting a search result to a user on a computer display.
    Type: Grant
    Filed: July 12, 2016
    Date of Patent: June 30, 2020
    Assignee: International Business Machines Corporation
    Inventors: Doron Cohen, Haggai Roitman
  • Publication number: 20200134091
    Abstract: A system comprising at least one hardware processor; and a non-transitory computer-readable storage medium having stored thereon program instructions executable to receive, as input, one or more digital documents, a query statement, and a summary length constraint, automatically generate, for each of said one or more digital documents, an initial summary based, at least in part, on a first sentence selection which satisfies said query statement, a modified said summary length constraint, and a first summary quality goal, automatically extract, from each of said initial summaries, one or more associated feedback metrics, and automatically generate, for each of said one or more digital documents, a final summary based, at least in part, on: (i) a second sentence selection which satisfies said query statement, said summary length constraint, and a second summary quality goal, and (ii) at least one of said associated feedback metrics.
    Type: Application
    Filed: October 29, 2018
    Publication date: April 30, 2020
    Inventors: Guy Feigenblat, David Konopnicki, Haggai Roitman
  • Publication number: 20200065346
    Abstract: A method, computer system, and computer program product for generating a multi-document summary is provided. The embodiment may include receiving a query statement, one or more documents, one or more summary constraints, and quality goals. The embodiment may include identifying one or more keywords within the query statement. The embodiment may include performing a sentence selection from the one or more documents based on the one or more identified keywords. The embodiment may include generating a plurality of candidate summaries of the one or more documents based on the performed sentence selection, the goals, and a cross entropy method. The embodiment may include calculating a quality score for each of the plurality of generated candidate summaries using a plurality of quality features. The embodiment may include selecting a candidate summary from the plurality of generated candidate summaries with the highest calculated quality score that also satisfies a quality score threshold.
    Type: Application
    Filed: October 31, 2019
    Publication date: February 27, 2020
    Inventors: Odellia Boni, Guy Feigenblat, David Konopnicki, Haggai Roitman
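
Several of the abstracts above (publication numbers 20200065346 and 20210109959, for example) mention a cross-entropy method for choosing a subset of candidates under a quality objective. As a rough illustration only, the generic technique can be sketched as below. This is not the patented method: the function name `ce_select`, the quality function (query-term coverage under a word budget), and every parameter value are assumptions made for the example.

```python
import random


def ce_select(sentences, query_terms, max_words,
              n_samples=200, elite_frac=0.1, iters=30,
              smoothing=0.7, seed=0):
    """Generic cross-entropy search over sentence subsets (illustrative only)."""
    rng = random.Random(seed)
    n = len(sentences)
    # Hypothetical tokenization: lowercase, split on whitespace, trim punctuation.
    tokens = [[w.strip(".,;:!?") for w in s.lower().split()] for s in sentences]
    probs = [0.5] * n  # per-sentence selection probability

    def quality(mask):
        # Assumed objective: fraction of query terms covered,
        # or 0.0 if the subset is empty or exceeds the word budget.
        chosen = [tokens[i] for i in range(n) if mask[i]]
        length = sum(len(t) for t in chosen)
        if length == 0 or length > max_words:
            return 0.0
        covered = {w for t in chosen for w in t} & query_terms
        return len(covered) / len(query_terms)

    best_mask, best_score = [False] * n, 0.0
    for _ in range(iters):
        # Sample candidate subsets from the current distribution.
        samples = [[rng.random() < p for p in probs] for _ in range(n_samples)]
        scored = sorted(((quality(m), m) for m in samples),
                        key=lambda pair: pair[0], reverse=True)
        elite = scored[:max(1, int(elite_frac * n_samples))]
        if elite[0][0] > best_score:
            best_score, best_mask = elite[0]
        # Move each probability toward its frequency among the elite samples.
        for i in range(n):
            freq = sum(m[i] for _, m in elite) / len(elite)
            probs[i] = smoothing * freq + (1 - smoothing) * probs[i]
    return [sentences[i] for i in range(n) if best_mask[i]], best_score


docs = [
    "Fusion combines ranked lists from several retrieval methods.",
    "The weather was pleasant that day.",
    "Query performance prediction estimates retrieval quality.",
    "Summaries should be short.",
]
summary, score = ce_select(docs, {"fusion", "retrieval"}, max_words=12)
print(summary, score)
```

The sampling distribution is a simple product of independent Bernoulli variables, one per sentence, which keeps the update step to a single averaged frequency. A real system would likely combine several quality features rather than the single coverage score assumed here.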