Patents by Inventor Haggai Roitman

Haggai Roitman has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11281677
    Abstract: An exemplary method includes: determining a pool of documents, wherein each document is within at least one of a plurality of lists, each of the lists results from executing a query on a corpus, and the corpus comprises at least the pool of documents; determining a first ranking of documents within the pool of documents based at least in part on first scores computed for respective documents within the pool; estimating relevance to the specified query at least of respective documents within the first ranking, wherein the relevance is estimated without user feedback regarding the relevance; and determining a second ranking of documents within the pool based at least in part on second scores computed at least for respective documents within the first ranking, wherein the second score for a given document is computed based at least in part on the estimated relevance of at least the given document.
    Type: Grant
    Filed: December 27, 2018
    Date of Patent: March 22, 2022
    Assignee: International Business Machines Corporation
    Inventors: Haggai Roitman, Shai Erera, Bar Weiner
  • Patent number: 11275749
    Abstract: Techniques are disclosed for query performance prediction (QPP) in the fusion-based retrieval setting. Symmetric list similarity measures used in traditional QPP techniques do not properly account for relevance-dependent aspects of the relationship between a given (base) reference list generated using an information retrieval technique and a final fused list generated using a fusion technique, as such a relationship is actually asymmetric. Embodiments more properly model the asymmetric relationship of reference and fused lists using an asymmetric co-relevance model that estimates, assuming a reference list contains relevant information, the odds that the fused list will be observed. In particular, the asymmetric co-relevance between a reference list and a fused list may be determined by adjusting a symmetric co-relevance of the reference list and the fused list using an odds ratio between the symmetric co-relevance of the reference list and the fused list to the reference list's own relevance.
    Type: Grant
    Filed: December 31, 2018
    Date of Patent: March 15, 2022
    Assignee: International Business Machines Corporation
    Inventors: Haggai Roitman, Shai Erera, Bar Weiner
  • Patent number: 11269942
    Abstract: Automated keyphrase extraction from a digital text document. A pool of candidate keyphrases of the digital text document is created. A cross-entropy method is then employed to compute a set of output keyphrases out of the pool of candidate keyphrases, by iteratively optimizing an objective function that is configured to cause the set of output keyphrases to be descriptive of one or more main topics discussed in the digital text document. The set of output keyphrases may be used for at least one of: text summarization, text categorization, opinion mining, and document indexing.
    Type: Grant
    Filed: October 10, 2019
    Date of Patent: March 8, 2022
    Assignee: International Business Machines Corporation
    Inventors: Odellia Boni, Doron Cohen, Guy Feigenblat, David Konopnicki, Haggai Roitman
  • Patent number: 11269965
    Abstract: A method, computer system, and computer program product for generating a multi-document summary is provided. The embodiment may include receiving a query statement, one or more documents, one or more summary constraints, and quality goals. The embodiment may include identifying one or more keywords within the query statement. The embodiment may include performing a sentence selection from the one or more documents based on the one or more identified keywords. The embodiment may include generating a plurality of candidate summaries of the one or more documents based on the performed sentence selection, the goals, and a cross entropy method. The embodiment may include calculating a quality score for each of the plurality of generated candidate summaries using a plurality of quality features. The embodiment may include selecting a candidate summary from the plurality of generated candidate summaries with the highest calculated quality score that also satisfies a quality score threshold.
    Type: Grant
    Filed: October 31, 2019
    Date of Patent: March 8, 2022
    Assignee: International Business Machines Corporation
    Inventors: Odellia Boni, Guy Feigenblat, David Konopnicki, Haggai Roitman
  • Publication number: 20220043794
    Abstract: Multimodal table encoding, including: Receiving an electronic document that contains a table. The table includes multiple rows, multiple columns, and a schema comprising column labels or row labels. The electronic document includes a description of the table which is located externally to the table. Next, operating separate machine learning encoders to separately encode the description, schema, each of the rows, and each of the columns of the table, respectively. The schema, the rows, and the columns are encoded together with end-of-column tokens and end-of-row tokens that mark an end of each column and row, respectively. Then, applying a machine learning gating mechanism to the encoded description, encoded schema, encoded rows, and encoded columns, to produce a fused encoding of the table, wherein the fused encoding is representative of both a structure of the table and a content of the table.
    Type: Application
    Filed: July 15, 2020
    Publication date: February 10, 2022
    Inventors: Roee Shraga, Haggai Roitman, Guy Feigenblat, Mustafa Canim
  • Patent number: 11238076
    Abstract: A method including: Obtaining multiple conversation texts, one text per conversation, wherein each of the multiple conversation texts comprises: multiple messages authored by multiple parties, and a reference to an electronic document that provides resolution of a problem that is common to all the conversations. Calculating an importance score for each of the multiple messages of all the conversation texts. Clustering the multiple messages of all the conversation texts into multiple bins. Calculating an aggregated importance score for each of the multiple bins, based on the importance scores of the messages contained in the respective bin. Enriching (a) the electronic document, or (b) a record of the electronic document in an index of electronic documents, with at least some of the multiple bins and their aggregated importance scores, wherein the at least some of the multiple bins are added as fields to the electronic document or to the record.
    Type: Grant
    Filed: April 19, 2020
    Date of Patent: February 1, 2022
    Assignee: International Business Machines Corporation
    Inventors: Haggai Roitman, Shai Erera, Doron Cohen, Yosi Mass, Or Rivlin
  • Patent number: 11222277
    Abstract: A pseudo-relevance feedback (PRF) system is disclosed that determines an optimized relevance model for a search query by utilizing a posterior relevance model to estimate the likelihood that an initial set of top-K retrieved documents would be retrieved given the posterior relevance model, re-ranking the top-K documents based on their respective estimates of likelihood of retrieval, determining a rank similarity between the initial ranking of the top-K documents and the re-ranking of the top-K documents, updating one or more model parameters of the posterior relevance model based on the rank similarity, and iteratively performing the above process until the rank similarity is maximized, at which point, the optimized relevance model is obtained.
    Type: Grant
    Filed: January 29, 2016
    Date of Patent: January 11, 2022
    Assignee: International Business Machines Corporation
    Inventors: Artem Barger, Roy Levin, Haggai Roitman
  • Publication number: 20210397595
    Abstract: Ad-hoc table retrieval, including: Representing each of a plurality of tables as a multi-field text document in which: different modalities of the table are represented as separate fields, and a concatenation of all the modalities is represented as a separate, auxiliary field. Receiving a query. Executing the query on the multi-field text documents, to retrieve a list of preliminarily-ranked candidate tables out of the plurality of tables. Calculating an intrinsic table similarity score for each of the candidate tables, based on the query and the auxiliary field. Calculating an extrinsic table similarity score for each of the candidate tables, based on a cluster hypothesis of the candidate tables. Combining: the preliminary rankings, the intrinsic table similarity scores, and the extrinsic table similarity scores, to re-rank the candidate tables.
    Type: Application
    Filed: June 23, 2020
    Publication date: December 23, 2021
    Inventors: Haggai Roitman, Guy Feigenblat, Mustafa Canim, Roee Shraga
  • Publication number: 20210390418
    Abstract: An example system includes a processor to receive a query. The processor can retrieve ranked candidates from an index based on the query. The processor can re-rank the ranked candidates using a Bidirectional Encoder Representations from Transformers (BERT) query-question (Q-q) model trained to match queries to questions of a frequently asked question (FAQ) dataset, wherein the BERT Q-q model is fine-tuned using paraphrases generated for the questions in the FAQ dataset. The processor can return the re-ranked candidates in response to the query.
    Type: Application
    Filed: June 10, 2020
    Publication date: December 16, 2021
    Inventors: Yosi Mass, Boaz Carmeli, Haggai Roitman, David Konopnicki
  • Publication number: 20210342684
    Abstract: A system and a computer-implemented method for ranking tabular data entities by likelihood of comprising answers for (natural language) queries, based on multimodal descriptions of the tabular data entities, comprising separate representations, which represent different aspects of the tabular data entities. The ranking is based on joint representations, generated from the query representation and separate representations of the tabular data entities' aspects, using gated multimodal units. The computer-implemented method may be used for applications such as web searches, data aggregation, and research tasks.
    Type: Application
    Filed: April 29, 2020
    Publication date: November 4, 2021
    Inventors: Roee Shraga, Haggai Roitman, Guy Feigenblat, Mustafa Canim
  • Patent number: 11163780
    Abstract: Embodiments of the present systems and methods may provide techniques that provide improved information retrieval. For example, a method may comprise receiving, at the computer system, a query to retrieve a document from a corpus of documents, retrieving, at the computer system, a plurality of documents from the corpus of documents using a plurality of retrieval methods, each retrieval method generating a ranked list of retrieved documents and a score for each document, fusing, at the computer system, the generated ranked list of retrieved documents to form an aggregated ranked list of retrieved documents by re-scoring, at the computer system, the plurality of documents according to its passage scores, with respect to the query and associating, at the computer system, a given document and its maximal passage using relevance information induced from the plurality of ranked lists.
    Type: Grant
    Filed: October 31, 2019
    Date of Patent: November 2, 2021
    Assignee: International Business Machines Corporation
    Inventors: Shai Erera, Guy Feigenblat, Yosi Mass, Haggai Roitman, Bar Weiner
  • Publication number: 20210326346
    Abstract: An example system includes a processor to receive concepts extracted from a result set corresponding to a query and result associations for each extracted concept. The processor is to build a graph based on the extracted concepts, wherein the graph comprises a number of nodes representing the extracted concepts and weighted edges representing similarity between concepts extracted from shared results. The processor is to partition the graph into subgraphs with vertices corresponding to candidate facets for vertices having higher sums of weighted edges. The processor is to rank the candidate facets. The processor is to select higher ranked candidate facets to use as facets. The processor is to output facets with a result set in response to the query.
    Type: Application
    Filed: April 21, 2020
    Publication date: October 21, 2021
    Inventors: Or Rivlin, Yosi Mass, Haggai Roitman, David Konopnicki
  • Publication number: 20210326369
    Abstract: A method including: Obtaining multiple conversation texts, one text per conversation, wherein each of the multiple conversation texts comprises: multiple messages authored by multiple parties, and a reference to an electronic document that provides resolution of a problem that is common to all the conversations. Calculating an importance score for each of the multiple messages of all the conversation texts. Clustering the multiple messages of all the conversation texts into multiple bins. Calculating an aggregated importance score for each of the multiple bins, based on the importance scores of the messages contained in the respective bin. Enriching (a) the electronic document, or (b) a record of the electronic document in an index of electronic documents, with at least some of the multiple bins and their aggregated importance scores, wherein the at least some of the multiple bins are added as fields to the electronic document or to the record.
    Type: Application
    Filed: April 19, 2020
    Publication date: October 21, 2021
    Inventors: Haggai Roitman, Shai Erera, Doron Cohen, Yosi Mass, Or Rivlin
  • Publication number: 20210294863
    Abstract: An example system includes a processor to receive a pseudo-relevance set including top results from a search engine in response to transmitting a set of concatenated messages of a dialog. The processor can execute a first fixed point operation on the pseudo-relevance set to generate weighted terms. The processor can also execute a second fixed point operation on a message graph including nodes with a heaviness based on the weighted terms.
    Type: Application
    Filed: March 17, 2020
    Publication date: September 23, 2021
    Inventors: Haggai Roitman, Doron Cohen, Yosi Mass, Shai Erera
  • Patent number: 11120351
    Abstract: The method includes receiving, by one or more processors, an initial query term. The method further includes generating, by one or more processors, an expanded query based on the received initial query term and one or more related terms to the received initial query. The method further includes determining, by one or more processors, weights corresponding to terms in the received initial query term and the generated expanded query term based on a predicted effect on query performance.
    Type: Grant
    Filed: September 21, 2015
    Date of Patent: September 14, 2021
    Assignee: International Business Machines Corporation
    Inventors: Ella Rabinovich, Haggai Roitman
  • Patent number: 11093512
    Abstract: A method for automated selection of a search result ranker comprising: providing a set of queries; for each of said queries, receiving, from a search engine, a plurality of relevancy score sets, wherein each relevancy score set is associated with search results found in a corpus of electronic documents using each of a plurality of computerized search result rankers; calculating a difficulty score for each of said queries relative to all other queries in the set, based on said plurality of relevancy score sets associated with said query; calculating a quality score for each of said search result rankers based on said plurality of relevancy score sets associated with said search result ranker, wherein each of said plurality of relevancy score sets is weighed according to the difficulty score of its associated query; and selecting one of said search rankers based on said quality score.
    Type: Grant
    Filed: April 30, 2018
    Date of Patent: August 17, 2021
    Assignee: International Business Machines Corporation
    Inventors: Doron Cohen, Shai Erera, Haggai Roitman, Bar Weiner
  • Patent number: 11080317
    Abstract: A method comprising receiving digital documents, a query statement, and a summary length constraint; identifying, for each of said digital documents, a sentence subset, based, at least in part, on said query statement, a modified version of said summary length constraint, and a first set of quality objectives, generating, for each of said sentence subsets, a random forest representation; iteratively (i) sampling, from each of said random forest representations, a plurality of tokens to create a corresponding candidate document summary, based, at least in part, on weights assigned to each of said tokens, (ii) assigning a quality ranking to said candidate document summary, based, at least in part, on said first set of quality objectives and a second set of quality objectives, and (iii) adjusting said weights, based, at least in part, on said quality rankings; and outputting a highest ranking said candidate document as a compressed summary.
    Type: Grant
    Filed: July 9, 2019
    Date of Patent: August 3, 2021
    Assignee: International Business Machines Corporation
    Inventors: Odellia Boni, Doron Cohen, Guy Feigenblat, David Konopnicki, Haggai Roitman
  • Patent number: 11061951
    Abstract: Embodiments may provide automated summarization of documents, such as scientific documents by using a prior distribution on logical sections learnt from a corpus of human authored summaries. For example, a method of document summarization may comprise receiving, at the computer system, a document and segmenting the document into a plurality of sentences, identifying, at the computer system, sections in the document and aligning each sentence in the document to a section logical role, and summarizing, at the computer system, the document using a probability distribution.
    Type: Grant
    Filed: November 21, 2019
    Date of Patent: July 13, 2021
    Assignee: International Business Machines Corporation
    Inventors: Odellia Boni, Doron Cohen, Guy Feigenblat, David Konopnicki, Haggai Roitman
  • Patent number: 11030209
    Abstract: Methods and systems for generating and evaluating fused query lists. A query on a corpus of documents is evaluated using a plurality of retrieval methods and a ranked list for each of the plurality of retrieval methods is obtained. A plurality of fused ranked lists is sampled, each fusing said ranked lists for said plurality of retrieval methods, and the sampled fused ranked lists are sorted. In an unsupervised manner, an objective comprising a likelihood that a fused ranked list, fusing said ranked lists for each of said plurality of retrieval methods, is relevant to a query and a relevance event, is optimized to optimize the sampling, until convergence is achieved. Documents of the fused ranked list are determined based on the optimization.
    Type: Grant
    Filed: December 28, 2018
    Date of Patent: June 8, 2021
    Assignee: International Business Machines Corporation
    Inventors: Haggai Roitman, Bar Weiner, Shai Erera
  • Publication number: 20210165812
    Abstract: A computer-implemented method, computerized apparatus and computer program product for minimum coordination passage scoring. Given a candidate passage in a document collection potentially matching a query received, a set of overlapping terms between the candidate passage and the query is determined. For each overlapping term in the set, a first measure of a weight of the term in the query, a second measure of a weight of the term in the candidate passage, and a third measure of a specificity of the term in the document collection are calculated. A function of the first and second measures is evaluated to obtain a value reflecting a condition on the relation therebetween. A minimum coordination score representing a relative similarity between the candidate passage and the query is determined based on the value and the first, second and third measures obtained for each of the overlapping terms.
    Type: Application
    Filed: December 2, 2019
    Publication date: June 3, 2021
    Inventors: Doron Cohen, Haggai Roitman, Oren Sar-Shalom
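
The passage-scoring abstract in the last entry (publication 20210165812) names three per-term measures but does not give formulas. The sketch below is one possible reading, not the claimed method. It assumes relative term frequency for the query and passage weights, a smoothed inverse document frequency for specificity, and the minimum of the two weights as the "function of the first and second measures". The function name, parameters, and all three formulas are illustrative assumptions.

```python
import math
from collections import Counter


def min_coordination_score(query_terms, passage_terms, doc_freq, num_docs):
    """Hypothetical minimum-coordination score for one candidate passage.

    query_terms, passage_terms: lists of tokens.
    doc_freq: dict mapping a term to the number of documents containing it.
    num_docs: number of documents in the collection.
    All weighting choices below are assumptions made for illustration.
    """
    q_counts = Counter(query_terms)
    p_counts = Counter(passage_terms)

    # Terms that occur in both the query and the candidate passage.
    overlap = set(q_counts) & set(p_counts)

    score = 0.0
    for term in overlap:
        # First measure (assumed): relative frequency of the term in the query.
        q_weight = q_counts[term] / len(query_terms)
        # Second measure (assumed): relative frequency in the passage.
        p_weight = p_counts[term] / len(passage_terms)
        # Third measure (assumed): smoothed inverse document frequency.
        specificity = math.log((num_docs + 1) / (doc_freq.get(term, 0) + 1))
        # Assumed coordination function: the smaller of the two weights,
        # so a term counts only as much as both sides support it.
        score += min(q_weight, p_weight) * specificity
    return score


if __name__ == "__main__":
    query = ["fusion", "ranking"]
    passage = ["fusion", "of", "ranked", "lists"]
    print(min_coordination_score(query, passage, {"fusion": 9}, 99))
```

Taking the minimum keeps a passage from gaining credit by repeating a term more often than the query emphasizes it. Other functions of the two weights would fit the abstract's wording equally well.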