Patents by Inventor Salim Roukos
Salim Roukos has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11868716
Abstract: One or more computer processors parse a received natural language question into an abstract meaning representation (AMR) graph. The one or more computer processors enrich the AMR graph into an extended AMR graph. The one or more computer processors transform the extended AMR graph into a query graph utilizing a path-based approach, wherein the query graph is a directed edge-labeled graph. The one or more computer processors generate one or more answers to the natural language question through one or more queries created utilizing the query graph.
Type: Grant
Filed: August 31, 2021
Date of Patent: January 9, 2024
Assignee: International Business Machines Corporation
Inventors: Srinivas Ravishankar, Pavan Kapanipathi Bangalore, Ibrahim Abdelaziz, Nandana Mihindukulasooriya, Dinesh Garg, Salim Roukos, Alexander Gray
-
Patent number: 11853149
Abstract: Generating error event descriptions by receiving a set of error messages associated with an error event, generating a tokenization of at least one line of the set of error messages, providing the tokenization to an attention head according to a context of the tokenization, providing an output of the attention head as input to a generative model, generating a description of the error event according to the output, and providing the description to a user.
Type: Grant
Filed: September 10, 2021
Date of Patent: December 26, 2023
Assignee: International Business Machines Corporation
Inventors: Anjali Shah, Jennifer A. Mallette, Salim Roukos
-
Publication number: 20230229859
Abstract: Methods, systems, and computer program products for zero-shot entity linking based on symbolic information are provided herein. A computer-implemented method includes obtaining a knowledge graph comprising a set of entities and a training dataset comprising text samples for at least a subset of the entities in the knowledge graph; training a machine learning model to map an entity mention substring of a given sample of text to one corresponding entity in the set of entities, wherein the machine learning model is trained using a multi-task machine learning framework using symbolic information extracted from the knowledge graph; and mapping an entity mention substring of a new sample of text to one of the entities in the set using the trained machine learning model.
Type: Application
Filed: January 14, 2022
Publication date: July 20, 2023
Inventors: Dinesh Khandelwal, G P Shrivatsa Bhargav, Saswati Dana, Dinesh Garg, Pavan Kapanipathi Bangalore, Salim Roukos, Alexander Gray, L. Venkata Subramaniam
-
Publication number: 20230084422
Abstract: Generating error event descriptions by receiving a set of error messages associated with an error event, generating a tokenization of at least one line of the set of error messages, providing the tokenization to an attention head according to a context of the tokenization, providing an output of the attention head as input to a generative model, generating a description of the error event according to the output, and providing the description to a user.
Type: Application
Filed: September 10, 2021
Publication date: March 16, 2023
Inventors: Anjali Shah, Jennifer A. Mallette, Salim Roukos
-
Publication number: 20230060589
Abstract: One or more computer processors parse a received natural language question into an abstract meaning representation (AMR) graph. The one or more computer processors enrich the AMR graph into an extended AMR graph. The one or more computer processors transform the extended AMR graph into a query graph utilizing a path-based approach, wherein the query graph is a directed edge-labeled graph. The one or more computer processors generate one or more answers to the natural language question through one or more queries created utilizing the query graph.
Type: Application
Filed: August 31, 2021
Publication date: March 2, 2023
Inventors: Srinivas Ravishankar, Pavan Kapanipathi Bangalore, Ibrahim Abdelaziz, Nandana Mihindukulasooriya, Dinesh Garg, Salim Roukos, Alexander Gray
-
Publication number: 20220207384
Abstract: A system, computer program product, and method are provided for extraction of factual data from unstructured natural language (NL) text. A detection model is applied to convert unstructured NL text in a first language to annotated NL text. The detection model identifies two or more mentions from the unstructured NL text and a logical position of the mentions. The detection model further identifies a sequential position for each of the mentions and attaches a sequential position identifier. A pattern of rules corresponding with the annotated NL text is identified and applied to the annotated NL text, and one or more facts embedded within the annotated NL text are extracted and converted into structured data.
Type: Application
Filed: December 30, 2020
Publication date: June 30, 2022
Applicant: International Business Machines Corporation
Inventors: Radu Florian, Salim Roukos, Martin Franz
-
Patent number: 11373041
Abstract: A processor may receive a text segment. The processor may analyze the text segment at a plurality of granularity levels wherein each of the plurality of granularity levels has a comparative selection value for identifying one or more objects of interest within the text segment. The processor may select an optimized granularity level with an optimum comparative selection value. The processor may identify the one or more objects of interest within the text segment. The processor may display the one or more objects of interest to a user.
Type: Grant
Filed: September 18, 2020
Date of Patent: June 28, 2022
Assignee: International Business Machines Corporation
Inventors: Jian Ni, Radu Florian, Salim Roukos, Vittorio Castelli
-
Publication number: 20220129770
Abstract: A computer-implemented method according to one embodiment includes identifying a natural language query; translating the natural language query into an intermediate representation; converting the intermediate representation into one or more query triples; and performing relation linking between each of the one or more query triples and a plurality of knowledge base triples.
Type: Application
Filed: October 23, 2020
Publication date: April 28, 2022
Inventors: Nandana Mihindukulasooriya, Gaetano Rossiello, Alfio Massimiliano Gliozzo, Pavan Kapanipathi Bangalore, Salim Roukos
-
Publication number: 20220092262
Abstract: A processor may receive a text segment. The processor may analyze the text segment at a plurality of granularity levels wherein each of the plurality of granularity levels has a comparative selection value for identifying one or more objects of interest within the text segment. The processor may select an optimized granularity level with an optimum comparative selection value. The processor may identify the one or more objects of interest within the text segment. The processor may display the one or more objects of interest to a user.
Type: Application
Filed: September 18, 2020
Publication date: March 24, 2022
Inventors: Jian Ni, Radu Florian, Salim Roukos, Vittorio Castelli
-
Patent number: 9665562
Abstract: According to an aspect, a first word in a first language and a second word in a second language in a bilingual corpus are stemmed. A probability for aligning the first stem and the second stem and a distance metric between the normalized first stem and the normalized second stem are calculated. The first word and the second word are identified as a cognate pair when the probability and the distance metric meet a threshold criterion and stored as a cognate pair in a set of cognates. A candidate sentence in the second language is retrieved from a corpus. The candidate sentence is filtered by the active vocabulary of a user in the second language and the set of cognates. A sentence quality score is calculated for the candidate sentence; and the candidate sentence is ranked for presentation to the user based on the sentence quality scorer.
Type: Grant
Filed: June 22, 2016
Date of Patent: May 30, 2017
Assignee: International Business Machines Corporation
Inventors: Jiri Navratil, Salim Roukos, Robert T. Ward
-
Patent number: 9400781
Abstract: According to an aspect, a first word in a first language and a second word in a second language in a bilingual corpus are stemmed. A probability for aligning the first stem and the second stem and a distance metric between the normalized first stem and the normalized second stem are calculated. The first word and the second word are identified as a cognate pair when the probability and the distance metric meet a threshold criterion and stored as a cognate pair in a set of cognates. A candidate sentence in the second language is retrieved from a corpus. The candidate sentence is filtered by the active vocabulary of a user in the second language and the set of cognates. A sentence quality score is calculated for the candidate sentence; and the candidate sentence is ranked for presentation to the user based on the sentence quality scorer.
Type: Grant
Filed: February 8, 2016
Date of Patent: July 26, 2016
Assignee: International Business Machines Corporation
Inventors: Jiri Navratil, Salim Roukos, Robert T. Ward
-
Publication number: 20070061703
Abstract: Methods and apparatus are provided for annotating documents with one or more of entities, events and relations. Documents are annotated by presenting the document to a user; presenting the user with a list of possible entity types, wherein the list of possible entity types is configurable; and obtaining at least one mention annotation that associates a selected phrase in the document with one of the possible entity types. The selected phrase can be presented to the user, for example, based on one or more presentation rules associated with the associated entity type. The method can be implemented, for example, in a client-server configuration where a browser communicates with a remote server.
Type: Application
Filed: September 12, 2005
Publication date: March 15, 2007
Applicant: International Business Machines Corporation
Inventors: Nandakishore Kambhatla, Salim Roukos
-
Publication number: 20050237227
Abstract: A Bell Tree data structure is provided to model the process of chaining the mentions, from one or more documents, into entities, tracking the entire process; where the data structure is used in an entity tracking process that produces multiple results ranked by a product of probability scores.
Type: Application
Filed: April 27, 2004
Publication date: October 27, 2005
Applicant: International Business Machines Corporation
Inventors: Abraham Ittycheriah, Hongyan Jing, Nandakishore Kambhatla, Xiaoqiang Luo, Salim Roukos
-
Publication number: 20040111253
Abstract: A method, computer program product, and data processing system for training a statistical parser by utilizing active learning techniques to reduce the size of the corpus of human-annotated training samples (e.g., sentences) needed is disclosed. According to a preferred embodiment of the present invention, the statistical parser under training is used to compare the grammatical structure of the samples according to the parser's current level of training. The samples are then divided into clusters, with each cluster representing samples having a similar structure as ascertained by the statistical parser. Uncertainty metrics are applied to the clustered samples to select samples from each cluster that reflect uncertainty in the statistical parser's grammatical model. These selected samples may then be annotated by a human trainer for training the statistical parser.
Type: Application
Filed: December 10, 2002
Publication date: June 10, 2004
Applicant: International Business Machines Corporation
Inventors: Xiaoqiang Luo, Salim Roukos, Min Tang
-
Patent number: 6260014
Abstract: A method for recognizing speech includes the steps of providing a generic model having a baseform representation of a vocabulary of words, identifying a subset of words relating to an application, constructing a task specific model for the subset of words, constructing a composite model by combining the generic and task specific models and modifying the baseform representation of the subset of words such that the subset of words are recognized by the task specific model. A system for recognizing speech includes a composite model having a generic model having a generic baseform representation of a vocabulary of words and a task specific model for recognizing a subset of words relating to an application wherein the subset of words are recognized using a modified baseform representation. A recognizer compares words input thereto with the generic model for words other than the subset of words and with the task specific model for the subset of words.
Type: Grant
Filed: September 14, 1998
Date of Patent: July 10, 2001
Assignee: International Business Machines Corporation
Inventors: Lalit Rai Bahl, David Lubensky, Mukund Padmanabhan, Salim Roukos
-
Patent number: 6246981
Abstract: A system for conversant interaction includes a recognizer for receiving and processing input information and outputting a recognized representation of the input information. A dialog manager is coupled to the recognizer for receiving the recognized representation of the input information, the dialog manager having task-oriented forms for associating user input information therewith, the dialog manager being capable of selecting an applicable form from the task-oriented forms responsive to the input information by scoring the forms relative to each other. A synthesizer is employed for converting a response generated by the dialog manager to output the response. A program storage device and method are also provided.
Type: Grant
Filed: November 25, 1998
Date of Patent: June 12, 2001
Assignee: International Business Machines Corporation
Inventors: Kishore A. Papineni, Salim Roukos, Robert T. Ward
-
Patent number: 6219638
Abstract: A messaging system for receiving speech over a telephone and converting the speech to text includes a first server for receiving speech input by a user, a speech recognition system for converting the speech to text, a speech synthesizer for converting the text to speech for playing back the synthesized speech for correction by the user and a correction mechanism for enabling the user to correct the speech such that the corrected speech is provided as text for transmittal over a communication system.
Type: Grant
Filed: November 3, 1998
Date of Patent: April 17, 2001
Assignee: International Business Machines Corporation
Inventors: Mukund Padmanabhan, Michael Picheny, David Nahamoo, Salim Roukos
-
Patent number: 6092034
Abstract: A system and method for translating a series of source words in a first language to a series of target words in a second language is provided. The system includes an input device for inputting the series of source words. A fertility hypothesis generator operatively coupled to the input device generates at least one fertility hypotheses for a fertility of a source word, based on the source word and a context of the source word. A sense hypothesis generator operatively coupled to the input device generates sense hypotheses for a translation of the source word, based on the source word and the context of the source word. A fertility model operatively coupled to the fertility hypothesis generator determines a probability of the fertility of the source word, based on the source word and the context of the source word.
Type: Grant
Filed: July 27, 1998
Date of Patent: July 18, 2000
Assignee: International Business Machines Corporation
Inventors: Jeffrey Scott McCarley, Salim Roukos
-
Patent number: 5640487
Abstract: The present invention is an n-gram language modeler which significantly reduces the memory storage requirement and convergence time for language modelling systems and methods. The present invention aligns each n-gram with one of "n" number of non-intersecting classes. A count is determined for each n-gram representing the number of times each n-gram occurred in the training data. The n-grams are separated into classes and complement counts are determined. Using these counts and complement counts factors are determined, one factor for each class, using an iterative scaling algorithm. The language model probability, i.e., the probability that a word occurs given the occurrence of the previous two words, is determined using these factors.
Type: Grant
Filed: June 7, 1995
Date of Patent: June 17, 1997
Assignee: International Business Machines Corporation
Inventors: Raymond Lau, Ronald Rosenfeld, Salim Roukos
-
Patent number: 5467425
Abstract: The present invention is an n-gram language modeler which significantly reduces the memory storage requirement and convergence time for language modelling systems and methods. The present invention aligns each n-gram with one of "n" number of non-intersecting classes. A count is determined for each n-gram representing the number of times each n-gram occurred in the training data. The n-grams are separated into classes and complement counts are determined. Using these counts and complement counts factors are determined, one factor for each class, using an iterative scaling algorithm. The language model probability, i.e., the probability that a word occurs given the occurrence of the previous two words, is determined using these factors.
Type: Grant
Filed: February 26, 1993
Date of Patent: November 14, 1995
Assignee: International Business Machines Corporation
Inventors: Raymond Lau, Ronald Rosenfeld, Salim Roukos