Patents by Inventor Yosi Mass
Yosi Mass has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11947604
Abstract: An example system includes a processor to receive a pseudo-relevance set including top results from a search engine in response to transmitting a set of concatenated messages of a dialog. The processor can execute a first fixed point operation on the pseudo-relevance set to generate weighted terms. The processor can also execute a second fixed point operation on a message graph including nodes with a heaviness based on the weighted terms.
Type: Grant
Filed: March 17, 2020
Date of Patent: April 2, 2024
Assignee: International Business Machines Corporation
Inventors: Haggai Roitman, Doron Cohen, Yosi Mass, Shai Erera
-
Patent number: 11853296
Abstract: Clarification-question selection, including: Receiving a search conversation that includes utterances by a user and by a conversational search system. Retrieving, from a solution documents database, text passages that are relevant to the search conversation. Retrieving, from a clarification questions database, for each of the text passages, candidate clarification questions that are relevant to both the respective text passage and the search conversation.
Type: Grant
Filed: July 28, 2021
Date of Patent: December 26, 2023
Assignee: International Business Machines Corporation
Inventors: Yosi Mass, Doron Cohen, David Konopnicki
-
Patent number: 11797545
Abstract: An example system includes a processor to receive concepts extracted from a result set corresponding to a query and result associations for each extracted concept. The processor is to build a graph based on the extracted concepts, wherein the graph comprises a number of nodes representing the extracted concepts and weighted edges representing similarity between concepts extracted from shared results. The processor is to partition the graph into subgraphs with vertices corresponding to candidate facets for vertices having higher sums of weighted edges. The processor is to rank the candidate facets. The processor is to select higher ranked candidate facets to use as facets. The processor is to output facets with a result set in response to the query.
Type: Grant
Filed: April 21, 2020
Date of Patent: October 24, 2023
Assignee: International Business Machines Corporation
Inventors: Or Rivlin, Yosi Mass, Haggai Roitman, David Konopnicki
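The graph-building step described in this abstract can be illustrated with a minimal sketch. It is not the patented method: the Jaccard edge weight, the function names, and the input shape (a mapping from each concept to the set of result ids it was extracted from) are all assumptions made for illustration, and the partitioning and ranking steps are omitted.

```python
from itertools import combinations


def build_concept_graph(concept_results):
    """Build weighted edges between concepts that share at least one result.

    concept_results: dict mapping concept -> set of result ids (assumed shape).
    The edge weight is the Jaccard overlap of the two result sets, which is an
    illustrative choice and not taken from the abstract.
    """
    edges = {}
    for a, b in combinations(sorted(concept_results), 2):
        shared = concept_results[a] & concept_results[b]
        if shared:
            union = concept_results[a] | concept_results[b]
            edges[(a, b)] = len(shared) / len(union)
    return edges


def weighted_degree(concept_results, edges):
    """Sum the weights of the edges incident to each concept node."""
    degree = {concept: 0.0 for concept in concept_results}
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return degree


concept_results = {"python": {1, 2}, "java": {2, 3}, "cooking": {4}}
edges = build_concept_graph(concept_results)
degree = weighted_degree(concept_results, edges)
```

Concepts with a higher weighted degree would be the natural starting points for candidate facets under this reading of the abstract.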
-
Patent number: 11790885
Abstract: A method, computer system, and a computer program product for natural language processing are provided. A first text corpus that includes semi-structured content that includes hierarchical nodes may be received. Some of the hierarchical nodes may be masked. Node embeddings and level embeddings may be generated from the semi-structured content of the first text corpus and from the masked hierarchical nodes. The node embeddings and the level embeddings may be included in a bi-directional transformer model. The bi-directional transformer model may be trained on the first text corpus by reducing loss from the bi-directional transformer model predicting the masked hierarchical nodes.
Type: Grant
Filed: May 6, 2021
Date of Patent: October 17, 2023
Assignee: International Business Machines Corporation
Inventors: Haggai Roitman, Yosi Mass, Doron Cohen, Jatin Ganhotra
-
Patent number: 11775839
Abstract: An example system includes a processor to receive a query. The processor can retrieve ranked candidates from an index based on the query. The processor can re-rank the ranked candidates using a Bidirectional Encoder Representations from Transformers (BERT) query-question (Q-q) model trained to match queries to questions of a frequently asked question (FAQ) dataset, wherein the BERT Q-q model is fine-tuned using paraphrases generated for the questions in the FAQ dataset. The processor can return the re-ranked candidates in response to the query.
Type: Grant
Filed: June 10, 2020
Date of Patent: October 3, 2023
Assignee: International Business Machines Corporation
Inventors: Yosi Mass, Boaz Carmeli, Haggai Roitman, David Konopnicki
-
Patent number: 11720634
Abstract: Training a machine learning language model to generate clarification questions for use in conversational search, including: Obtaining multiple dialogs between users and agents, each dialog including messages exchanged between a user and an agent, wherein one of the messages of each dialog includes a reference to a solution document provided by the agent. For each of the dialogs, operating a search engine to retrieve a text passage, relevant to at least one of the messages of the respective dialog, from the respective solution document. Training a machine learning language model to generate a new clarification question given at least one new message and multiple new text passages, wherein the training is based on a training set which comprises, for each of the dialogs: said at least one of the messages of the respective dialog, and the text passage retrieved for the respective dialog.
Type: Grant
Filed: March 9, 2021
Date of Patent: August 8, 2023
Assignee: International Business Machines Corporation
Inventors: Yosi Mass, Haggai Roitman, Doron Cohen
-
Publication number: 20230029829
Abstract: Clarification-question selection, including: Receiving a search conversation that includes utterances by a user and by a conversational search system. Retrieving, from a solution documents database, text passages that are relevant to the search conversation. Retrieving, from a clarification questions database, for each of the text passages, candidate clarification questions that are relevant to both the respective text passage and the search conversation.
Type: Application
Filed: July 28, 2021
Publication date: February 2, 2023
Inventors: Yosi Mass, Doron Cohen, David Konopnicki
-
Publication number: 20220358906
Abstract: A method, computer system, and a computer program product for natural language processing are provided. A first text corpus that includes semi-structured content that includes hierarchical nodes may be received. Some of the hierarchical nodes may be masked. Node embeddings and level embeddings may be generated from the semi-structured content of the first text corpus and from the masked hierarchical nodes. The node embeddings and the level embeddings may be included in a bi-directional transformer model. The bi-directional transformer model may be trained on the first text corpus by reducing loss from the bi-directional transformer model predicting the masked hierarchical nodes.
Type: Application
Filed: May 6, 2021
Publication date: November 10, 2022
Inventors: Haggai Roitman, Yosi Mass, Doron Cohen, Jatin Ganhotra
-
Publication number: 20220300561
Abstract: An embodiment for predicting a performance of a query in retrieving multifield documents is provided. The embodiment may include receiving a query from a user. The embodiment may also include retrieving a list of multifield documents from a corpus of documents in response to the query. The embodiment may further include generating a pseudo-effective (PE) reference-list for each field in the corpus of documents. The embodiment may also include executing one or more existing query performance prediction (QPP) methods on the retrieved list and each generated PE reference-list. The embodiment may further include deriving one or more extended QPP methods. The embodiment may also include estimating a performance of the query in obtaining the retrieved list of multifield documents based on the one or more extended QPP methods.
Type: Application
Filed: March 16, 2021
Publication date: September 22, 2022
Inventors: Yosi Mass, Haggai Roitman, Guy Feigenblat, Roee Shraga
-
Publication number: 20220292139
Abstract: Training a machine learning language model to generate clarification questions for use in conversational search, including: Obtaining multiple dialogs between users and agents, each dialog including messages exchanged between a user and an agent, wherein one of the messages of each dialog includes a reference to a solution document provided by the agent. For each of the dialogs, operating a search engine to retrieve a text passage, relevant to at least one of the messages of the respective dialog, from the respective solution document. Training a machine learning language model to generate a new clarification question given at least one new message and multiple new text passages, wherein the training is based on a training set which comprises, for each of the dialogs: said at least one of the messages of the respective dialog, and the text passage retrieved for the respective dialog.
Type: Application
Filed: March 9, 2021
Publication date: September 15, 2022
Inventors: Yosi Mass, Haggai Roitman, Doron Cohen
-
Patent number: 11436288
Abstract: An embodiment for predicting a performance of a query in retrieving multifield documents is provided. The embodiment may include receiving a query from a user. The embodiment may also include retrieving a list of multifield documents from a corpus of documents in response to the query. The embodiment may further include generating a pseudo-effective (PE) reference-list for each field in the corpus of documents. The embodiment may also include executing one or more existing query performance prediction (QPP) methods on the retrieved list and each generated PE reference-list. The embodiment may further include deriving one or more extended QPP methods. The embodiment may also include estimating a performance of the query in obtaining the retrieved list of multifield documents based on the one or more extended QPP methods.
Type: Grant
Filed: March 16, 2021
Date of Patent: September 6, 2022
Assignee: International Business Machines Corporation
Inventors: Yosi Mass, Haggai Roitman, Guy Feigenblat, Roee Shraga
-
Patent number: 11238076
Abstract: A method including: Obtaining multiple conversation texts, one text per conversation, wherein each of the multiple conversation texts comprises: multiple messages authored by multiple parties, and a reference to an electronic document that provides resolution of a problem that is common to all the conversations. Calculating an importance score for each of the multiple messages of all the conversation texts. Clustering the multiple messages of all the conversation texts into multiple bins. Calculating an aggregated importance score for each of the multiple bins, based on the importance scores of the messages contained in the respective bin. Enriching (a) the electronic document, or (b) a record of the electronic document in an index of electronic documents, with at least some of the multiple bins and their aggregated importance scores, wherein the at least some of the multiple bins are added as fields to the electronic document or to the record.
Type: Grant
Filed: April 19, 2020
Date of Patent: February 1, 2022
Assignee: International Business Machines Corporation
Inventors: Haggai Roitman, Shai Erera, Doron Cohen, Yosi Mass, Or Rivlin
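The aggregation and enrichment steps in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the importance scores and bin assignments are taken as given inputs (the abstract does not say how they are computed), summation is an assumed aggregation rule, and the function names and the `bin_<rank>` field names are hypothetical.

```python
from collections import defaultdict


def aggregate_bins(messages, scores, bin_ids):
    """Group messages by bin and sum their importance scores (assumed rule)."""
    totals = defaultdict(float)
    members = defaultdict(list)
    for message, score, bin_id in zip(messages, scores, bin_ids):
        totals[bin_id] += score
        members[bin_id].append(message)
    return {b: {"messages": members[b], "score": totals[b]} for b in totals}


def enrich(document, bins, top_k=2):
    """Return a copy of the document with the top-scoring bins added as fields."""
    ranked = sorted(bins.items(), key=lambda kv: kv[1]["score"], reverse=True)
    enriched = dict(document)
    for rank, (_, info) in enumerate(ranked[:top_k]):
        # Hypothetical field naming: bin_0 is the highest-scoring bin.
        enriched[f"bin_{rank}"] = {
            "text": " ".join(info["messages"]),
            "score": info["score"],
        }
    return enriched


bins = aggregate_bins(["m1", "m2", "m3"], [0.5, 0.25, 1.0], [0, 0, 1])
enriched = enrich({"id": "doc"}, bins, top_k=1)
```

Keeping the original document untouched and returning an enriched copy makes it easy to apply the same step to either the document itself or its index record.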
-
Publication number: 20210390418
Abstract: An example system includes a processor to receive a query. The processor can retrieve ranked candidates from an index based on the query. The processor can re-rank the ranked candidates using a Bidirectional Encoder Representations from Transformers (BERT) query-question (Q-q) model trained to match queries to questions of a frequently asked question (FAQ) dataset, wherein the BERT Q-q model is fine-tuned using paraphrases generated for the questions in the FAQ dataset. The processor can return the re-ranked candidates in response to the query.
Type: Application
Filed: June 10, 2020
Publication date: December 16, 2021
Inventors: Yosi Mass, Boaz Carmeli, Haggai Roitman, David Konopnicki
-
Patent number: 11163780
Abstract: Embodiments of the present systems and methods may provide techniques that provide improved information retrieval. For example, a method may comprise receiving, at the computer system, a query to retrieve a document from a corpus of documents, retrieving, at the computer system, a plurality of documents from the corpus of documents using a plurality of retrieval methods, each retrieval method generating a ranked list of retrieved documents and a score for each document, fusing, at the computer system, the generated ranked list of retrieved documents to form an aggregated ranked list of retrieved documents by re-scoring, at the computer system, the plurality of documents according to its passage scores, with respect to the query and associating, at the computer system, a given document and its maximal passage using relevance information induced from the plurality of ranked lists.
Type: Grant
Filed: October 31, 2019
Date of Patent: November 2, 2021
Assignee: International Business Machines Corporation
Inventors: Shai Erera, Guy Feigenblat, Yosi Mass, Haggai Roitman, Bar Weiner
-
Publication number: 20210326346
Abstract: An example system includes a processor to receive concepts extracted from a result set corresponding to a query and result associations for each extracted concept. The processor is to build a graph based on the extracted concepts, wherein the graph comprises a number of nodes representing the extracted concepts and weighted edges representing similarity between concepts extracted from shared results. The processor is to partition the graph into subgraphs with vertices corresponding to candidate facets for vertices having higher sums of weighted edges. The processor is to rank the candidate facets. The processor is to select higher ranked candidate facets to use as facets. The processor is to output facets with a result set in response to the query.
Type: Application
Filed: April 21, 2020
Publication date: October 21, 2021
Inventors: Or Rivlin, Yosi Mass, Haggai Roitman, David Konopnicki
-
Publication number: 20210326369
Abstract: A method including: Obtaining multiple conversation texts, one text per conversation, wherein each of the multiple conversation texts comprises: multiple messages authored by multiple parties, and a reference to an electronic document that provides resolution of a problem that is common to all the conversations. Calculating an importance score for each of the multiple messages of all the conversation texts. Clustering the multiple messages of all the conversation texts into multiple bins. Calculating an aggregated importance score for each of the multiple bins, based on the importance scores of the messages contained in the respective bin. Enriching (a) the electronic document, or (b) a record of the electronic document in an index of electronic documents, with at least some of the multiple bins and their aggregated importance scores, wherein the at least some of the multiple bins are added as fields to the electronic document or to the record.
Type: Application
Filed: April 19, 2020
Publication date: October 21, 2021
Inventors: Haggai Roitman, Shai Erera, Doron Cohen, Yosi Mass, Or Rivlin
-
Publication number: 20210294863
Abstract: An example system includes a processor to receive a pseudo-relevance set including top results from a search engine in response to transmitting a set of concatenated messages of a dialog. The processor can execute a first fixed point operation on the pseudo-relevance set to generate weighted terms. The processor can also execute a second fixed point operation on a message graph including nodes with a heaviness based on the weighted terms.
Type: Application
Filed: March 17, 2020
Publication date: September 23, 2021
Inventors: Haggai Roitman, Doron Cohen, Yosi Mass, Shai Erera
-
Publication number: 20210133199
Abstract: Embodiments of the present systems and methods may provide techniques that provide improved information retrieval. For example, a method may comprise receiving, at the computer system, a query to retrieve a document from a corpus of documents, retrieving, at the computer system, a plurality of documents from the corpus of documents using a plurality of retrieval methods, each retrieval method generating a ranked list of retrieved documents and a score for each document, fusing, at the computer system, the generated ranked list of retrieved documents to form an aggregated ranked list of retrieved documents by re-scoring, at the computer system, the plurality of documents according to its passage scores, with respect to the query and associating, at the computer system, a given document and its maximal passage using relevance information induced from the plurality of ranked lists.
Type: Application
Filed: October 31, 2019
Publication date: May 6, 2021
Inventors: Shai Erera, Guy Feigenblat, Yosi Mass, Haggai Roitman, Bar Weiner
-
Patent number: 10831793
Abstract: A method of estimating a thematic similarity of sentences, comprising receiving a corpus of a plurality of documents describing a plurality of topics where each document comprises a plurality of sentences arranged in a plurality of sections, constructing sentence triplets for at least some of the sentences, each sentence triplet comprising a respective sentence, a respective positive sentence selected randomly from the section comprising the respective sentence and a respective negative sentence selected randomly from another section, training a first neural network with the sentence triplets to identify sentence-sentence vectors mapping each sentence with a shorter distance to its respective positive sentence compared to the distance to its respective negative sentence and outputting the first neural network for estimating thematic similarity between a pair of sentences by computing a distance between the sentence-sentence vectors produced for each sentence of the pair by the first neural network.
Type: Grant
Filed: October 23, 2018
Date of Patent: November 10, 2020
Assignee: International Business Machines Corporation
Inventors: Ranit Aharonov, Liat Ein Dor, Alon Halfon, Yosi Mass, Ilya Shnayderman, Noam Slonim, Elad Venezian
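The triplet-construction step in this abstract can be sketched as below. The sketch covers only the sampling of triplets, not the neural network training. The function name, the nested-list input shape, and the choice to draw the negative sentence from another section of the same document are assumptions; the abstract says only "another section".

```python
import random


def build_triplets(documents, seed=0):
    """Build (sentence, positive, negative) triplets.

    documents: list of documents, each a list of sections, each section a
    list of sentences (assumed shape). The positive is drawn at random from
    the same section; the negative is drawn at random from a different
    section of the same document (an assumption of this sketch).
    """
    rng = random.Random(seed)
    triplets = []
    for sections in documents:
        for i, section in enumerate(sections):
            others = [s for j, sec in enumerate(sections) if j != i for s in sec]
            if len(section) < 2 or not others:
                continue  # no valid positive or negative available
            for k, sentence in enumerate(section):
                positive = rng.choice(section[:k] + section[k + 1:])
                negative = rng.choice(others)
                triplets.append((sentence, positive, negative))
    return triplets


doc = [["a1", "a2"], ["b1", "b2"]]
triplets = build_triplets([doc])
```

A model trained on such triplets would be asked to place each sentence closer to its positive than to its negative, which is the objective the abstract describes.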
-
Patent number: 10810375
Abstract: A method comprising: operating at least one hardware processor for: receiving, as input, at least one named entity, modifying said named entity based on a plurality of modification rules to generate a set of candidate named entities corresponding to said named entity, and identifying, for at least one candidate named entity in said set of candidate named entities, an article in a knowledge base of articles, wherein a title of said article matches said candidate named entity.
Type: Grant
Filed: July 8, 2018
Date of Patent: October 20, 2020
Assignee: International Business Machines Corporation
Inventors: Yosi Mass, Amir Menczel, Dafna Sheinwald, Ilya Shnayderman, Noam Slonim
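As a rough illustration of the candidate-generation and title-matching flow in this abstract, the sketch below applies a few modification rules to a named entity and looks each candidate up in a set of article titles. The specific rules, the exact-match lookup, and all function names are hypothetical; the abstract does not list the rules the method uses.

```python
def _strip_article(text):
    """Drop a leading 'the ' (hypothetical rule)."""
    for prefix in ("the ", "The "):
        if text.startswith(prefix):
            return text[len(prefix):]
    return text


def candidate_entities(entity):
    """Apply illustrative modification rules; keep unique candidates in order."""
    rules = [
        lambda s: s,                    # keep the entity unchanged
        lambda s: s.title(),            # normalize capitalization
        lambda s: s.replace("-", " "),  # replace hyphens with spaces
        _strip_article,                 # remove a leading article
    ]
    seen, candidates = set(), []
    for rule in rules:
        candidate = rule(entity).strip()
        if candidate and candidate not in seen:
            seen.add(candidate)
            candidates.append(candidate)
    return candidates


def match_articles(entity, titles):
    """Return the candidates that exactly match an article title."""
    title_set = set(titles)
    return [c for c in candidate_entities(entity) if c in title_set]


candidates = candidate_entities("the beatles")
matches = match_articles("the beatles", ["The Beatles", "Beatles"])
```

Generating several candidates before the lookup lets a simple exact-match index tolerate surface variation in how the entity was written.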