Patents by Inventor Salim Roukos

Salim Roukos has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11868716
    Abstract: One or more computer processors parse a received natural language question into an abstract meaning representation (AMR) graph. The one or more computer processors enrich the AMR graph into an extended AMR graph. The one or more computer processors transform the extended AMR graph into a query graph utilizing a path-based approach, wherein the query graph is a directed edge-labeled graph. The one or more computer processors generate one or more answers to the natural language question through one or more queries created utilizing the query graph.
    Type: Grant
    Filed: August 31, 2021
    Date of Patent: January 9, 2024
    Assignee: International Business Machines Corporation
    Inventors: Srinivas Ravishankar, Pavan Kapanipathi Bangalore, Ibrahim Abdelaziz, Nandana Mihindukulasooriya, Dinesh Garg, Salim Roukos, Alexander Gray
  • Patent number: 11853149
    Abstract: Generating error event descriptions by receiving a set of error messages associated with an error event, generating a tokenization of at least one line of the set of error messages, providing the tokenization to an attention head according to a context of the tokenization, providing an output of the attention head as input to a generative model, generating a description of the error event according to the output, and providing the description to a user.
    Type: Grant
    Filed: September 10, 2021
    Date of Patent: December 26, 2023
    Assignee: International Business Machines Corporation
    Inventors: Anjali Shah, Jennifer A. Mallette, Salim Roukos
  • Publication number: 20230229859
    Abstract: Methods, systems, and computer program products for zero-shot entity linking based on symbolic information are provided herein. A computer-implemented method includes obtaining a knowledge graph comprising a set of entities and a training dataset comprising text samples for at least a subset of the entities in the knowledge graph; training a machine learning model to map an entity mention substring of a given sample of text to one corresponding entity in the set of entities, wherein the machine learning model is trained using a multi-task machine learning framework using symbolic information extracted from the knowledge graph; and mapping an entity mention substring of a new sample of text to one of the entities in the set using the trained machine learning model.
    Type: Application
    Filed: January 14, 2022
    Publication date: July 20, 2023
    Inventors: Dinesh Khandelwal, G P Shrivatsa Bhargav, Saswati Dana, Dinesh Garg, Pavan Kapanipathi Bangalore, Salim Roukos, Alexander Gray, L. Venkata Subramaniam
  • Publication number: 20230084422
    Abstract: Generating error event descriptions by receiving a set of error messages associated with an error event, generating a tokenization of at least one line of the set of error messages, providing the tokenization to an attention head according to a context of the tokenization, providing an output of the attention head as input to a generative model, generating a description of the error event according to the output, and providing the description to a user.
    Type: Application
    Filed: September 10, 2021
    Publication date: March 16, 2023
    Inventors: Anjali Shah, Jennifer A. Mallette, Salim Roukos
  • Publication number: 20230060589
    Abstract: One or more computer processors parse a received natural language question into an abstract meaning representation (AMR) graph. The one or more computer processors enrich the AMR graph into an extended AMR graph. The one or more computer processors transform the extended AMR graph into a query graph utilizing a path-based approach, wherein the query graph is a directed edge-labeled graph. The one or more computer processors generate one or more answers to the natural language question through one or more queries created utilizing the query graph.
    Type: Application
    Filed: August 31, 2021
    Publication date: March 2, 2023
    Inventors: Srinivas Ravishankar, Pavan Kapanipathi Bangalore, Ibrahim Abdelaziz, Nandana Mihindukulasooriya, Dinesh Garg, Salim Roukos, Alexander Gray
  • Publication number: 20220207384
    Abstract: A system, computer program product, and method are provided for extraction of factual data from unstructured natural language (NL) text. A detection model is applied to convert unstructured NL text in a first language to annotated NL text. The detection model identifies two or more mentions from the unstructured NL text and a logical position of the mentions. The detection model further identifies a sequential position for each of the mentions and attaches a sequential position identifier. A pattern of rules corresponding with the annotated NL text is identified and applied to the annotated NL text, and one or more facts embedded within the annotated NL text are extracted and converted into structured data.
    Type: Application
    Filed: December 30, 2020
    Publication date: June 30, 2022
    Applicant: International Business Machines Corporation
    Inventors: Radu Florian, Salim Roukos, Martin Franz
  • Patent number: 11373041
    Abstract: A processor may receive a text segment. The processor may analyze the text segment at a plurality of granularity levels wherein each of the plurality of granularity levels has a comparative selection value for identifying one or more objects of interest within the text segment. The processor may select an optimized granularity level with an optimum comparative selection value. The processor may identify the one or more objects of interest within the text segment. The processor may display the one or more objects of interest to a user.
    Type: Grant
    Filed: September 18, 2020
    Date of Patent: June 28, 2022
    Assignee: International Business Machines Corporation
    Inventors: Jian Ni, Radu Florian, Salim Roukos, Vittorio Castelli
  • Publication number: 20220129770
    Abstract: A computer-implemented method according to one embodiment includes identifying a natural language query; translating the natural language query into an intermediate representation; converting the intermediate representation into one or more query triples; and performing relation linking between each of the one or more query triples and a plurality of knowledge base triples.
    Type: Application
    Filed: October 23, 2020
    Publication date: April 28, 2022
    Inventors: Nandana Mihindukulasooriya, Gaetano Rossiello, Alfio Massimiliano Gliozzo, Pavan Kapanipathi Bangalore, Salim Roukos
  • Publication number: 20220092262
    Abstract: A processor may receive a text segment. The processor may analyze the text segment at a plurality of granularity levels wherein each of the plurality of granularity levels has a comparative selection value for identifying one or more objects of interest within the text segment. The processor may select an optimized granularity level with an optimum comparative selection value. The processor may identify the one or more objects of interest within the text segment. The processor may display the one or more objects of interest to a user.
    Type: Application
    Filed: September 18, 2020
    Publication date: March 24, 2022
    Inventors: Jian Ni, Radu Florian, Salim Roukos, Vittorio Castelli
  • Patent number: 9665562
    Abstract: According to an aspect, a first word in a first language and a second word in a second language in a bilingual corpus are stemmed. A probability for aligning the first stem and the second stem and a distance metric between the normalized first stem and the normalized second stem are calculated. The first word and the second word are identified as a cognate pair when the probability and the distance metric meet a threshold criterion and stored as a cognate pair in a set of cognates. A candidate sentence in the second language is retrieved from a corpus. The candidate sentence is filtered by the active vocabulary of a user in the second language and the set of cognates. A sentence quality score is calculated for the candidate sentence; and the candidate sentence is ranked for presentation to the user based on the sentence quality score.
    Type: Grant
    Filed: June 22, 2016
    Date of Patent: May 30, 2017
    Assignee: International Business Machines Corporation
    Inventors: Jiri Navratil, Salim Roukos, Robert T. Ward
  • Patent number: 9400781
    Abstract: According to an aspect, a first word in a first language and a second word in a second language in a bilingual corpus are stemmed. A probability for aligning the first stem and the second stem and a distance metric between the normalized first stem and the normalized second stem are calculated. The first word and the second word are identified as a cognate pair when the probability and the distance metric meet a threshold criterion and stored as a cognate pair in a set of cognates. A candidate sentence in the second language is retrieved from a corpus. The candidate sentence is filtered by the active vocabulary of a user in the second language and the set of cognates. A sentence quality score is calculated for the candidate sentence; and the candidate sentence is ranked for presentation to the user based on the sentence quality score.
    Type: Grant
    Filed: February 8, 2016
    Date of Patent: July 26, 2016
    Assignee: International Business Machines Corporation
    Inventors: Jiri Navratil, Salim Roukos, Robert T. Ward
  • Publication number: 20070061703
    Abstract: Methods and apparatus are provided for annotating documents with one or more of entities, events and relations. Documents are annotated by presenting the document to a user; presenting the user with a list of possible entity types, wherein the list of possible entity types is configurable; and obtaining at least one mention annotation that associates a selected phrase in the document with one of the possible entity types. The selected phrase can be presented to the user, for example, based on one or more presentation rules associated with the associated entity type. The method can be implemented, for example, in a client-server configuration where a browser communicates with a remote server.
    Type: Application
    Filed: September 12, 2005
    Publication date: March 15, 2007
    Applicant: International Business Machines Corporation
    Inventors: Nandakishore Kambhatla, Salim Roukos
  • Publication number: 20050237227
    Abstract: A Bell Tree data structure is provided to model the process of chaining the mentions, from one or more documents, into entities, tracking the entire process; where the data structure is used in an entity tracking process that produces multiple results ranked by a product of probability scores.
    Type: Application
    Filed: April 27, 2004
    Publication date: October 27, 2005
    Applicant: International Business Machines Corporation
    Inventors: Abraham Ittycheriah, Hongyan Jing, Nandakishore Kambhatla, Xiaoqiang Luo, Salim Roukos
  • Publication number: 20040111253
    Abstract: A method, computer program product, and data processing system for training a statistical parser by utilizing active learning techniques to reduce the size of the corpus of human-annotated training samples (e.g., sentences) needed is disclosed. According to a preferred embodiment of the present invention, the statistical parser under training is used to compare the grammatical structure of the samples according to the parser's current level of training. The samples are then divided into clusters, with each cluster representing samples having a similar structure as ascertained by the statistical parser. Uncertainty metrics are applied to the clustered samples to select samples from each cluster that reflect uncertainty in the statistical parser's grammatical model. These selected samples may then be annotated by a human trainer for training the statistical parser.
    Type: Application
    Filed: December 10, 2002
    Publication date: June 10, 2004
    Applicant: International Business Machines Corporation
    Inventors: Xiaoqiang Luo, Salim Roukos, Min Tang
  • Patent number: 6260014
    Abstract: A method for recognizing speech includes the steps of providing a generic model having a baseform representation of a vocabulary of words, identifying a subset of words relating to an application, constructing a task specific model for the subset of words, constructing a composite model by combining the generic and task specific models and modifying the baseform representation of the subset of words such that the subset of words are recognized by the task specific model. A system for recognizing speech includes a composite model having a generic model having a generic baseform representation of a vocabulary of words and a task specific model for recognizing a subset of words relating to an application wherein the subset of words are recognized using a modified baseform representation. A recognizer compares words input thereto with the generic model for words other than the subset of words and with the task specific model for the subset of words.
    Type: Grant
    Filed: September 14, 1998
    Date of Patent: July 10, 2001
    Assignee: International Business Machines Corporation
    Inventors: Lalit Rai Bahl, David Lubensky, Mukund Padmanabhan, Salim Roukos
  • Patent number: 6246981
    Abstract: A system for conversant interaction includes a recognizer for receiving and processing input information and outputting a recognized representation of the input information. A dialog manager is coupled to the recognizer for receiving the recognized representation of the input information, the dialog manager having task-oriented forms for associating user input information therewith, the dialog manager being capable of selecting an applicable form from the task-oriented forms responsive to the input information by scoring the forms relative to each other. A synthesizer is employed for converting a response generated by the dialog manager to output the response. A program storage device and method are also provided.
    Type: Grant
    Filed: November 25, 1998
    Date of Patent: June 12, 2001
    Assignee: International Business Machines Corporation
    Inventors: Kishore A. Papineni, Salim Roukos, Robert T. Ward
  • Patent number: 6219638
    Abstract: A messaging system for receiving speech over a telephone and converting the speech to text includes a first server for receiving speech input by a user, a speech recognition system for converting the speech to text, a speech synthesizer for converting the text to speech for playing back the synthesized speech for correction by the user and a correction mechanism for enabling the user to correct the speech such that the corrected speech is provided as text for transmittal over a communication system.
    Type: Grant
    Filed: November 3, 1998
    Date of Patent: April 17, 2001
    Assignee: International Business Machines Corporation
    Inventors: Mukund Padmanabhan, Michael Picheny, David Nahamoo, Salim Roukos
  • Patent number: 6092034
    Abstract: A system and method for translating a series of source words in a first language to a series of target words in a second language is provided. The system includes an input device for inputting the series of source words. A fertility hypothesis generator operatively coupled to the input device generates at least one fertility hypothesis for a fertility of a source word, based on the source word and a context of the source word. A sense hypothesis generator operatively coupled to the input device generates sense hypotheses for a translation of the source word, based on the source word and the context of the source word. A fertility model operatively coupled to the fertility hypothesis generator determines a probability of the fertility of the source word, based on the source word and the context of the source word.
    Type: Grant
    Filed: July 27, 1998
    Date of Patent: July 18, 2000
    Assignee: International Business Machines Corporation
    Inventors: Jeffrey Scott McCarley, Salim Roukos
  • Patent number: 5640487
    Abstract: The present invention is an n-gram language modeler which significantly reduces the memory storage requirement and convergence time for language modelling systems and methods. The present invention aligns each n-gram with one of "n" number of non-intersecting classes. A count is determined for each n-gram representing the number of times each n-gram occurred in the training data. The n-grams are separated into classes and complement counts are determined. Using these counts and complement counts, factors are determined, one factor for each class, using an iterative scaling algorithm. The language model probability, i.e., the probability that a word occurs given the occurrence of the previous two words, is determined using these factors.
    Type: Grant
    Filed: June 7, 1995
    Date of Patent: June 17, 1997
    Assignee: International Business Machines Corporation
    Inventors: Raymond Lau, Ronald Rosenfeld, Salim Roukos
  • Patent number: 5467425
    Abstract: The present invention is an n-gram language modeler which significantly reduces the memory storage requirement and convergence time for language modelling systems and methods. The present invention aligns each n-gram with one of "n" number of non-intersecting classes. A count is determined for each n-gram representing the number of times each n-gram occurred in the training data. The n-grams are separated into classes and complement counts are determined. Using these counts and complement counts, factors are determined, one factor for each class, using an iterative scaling algorithm. The language model probability, i.e., the probability that a word occurs given the occurrence of the previous two words, is determined using these factors.
    Type: Grant
    Filed: February 26, 1993
    Date of Patent: November 14, 1995
    Assignee: International Business Machines Corporation
    Inventors: Raymond Lau, Ronald Rosenfeld, Salim Roukos
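
The n-gram language modeler entries (patents 5640487 and 5467425) refer to counting how often each n-gram occurs in training data and to the probability of a word given the previous two words. As a rough illustration only, the Python sketch below estimates that probability by plain relative frequency of trigram counts. It does not implement the non-intersecting classes, complement counts, or iterative scaling algorithm described in the abstracts, and the function names count_trigrams and trigram_probability are hypothetical.

```python
from collections import Counter


def count_trigrams(tokens):
    """Count every trigram and every two-word history that precedes a word.

    Hypothetical helper for illustration; not taken from the patents.
    """
    trigram_counts = Counter()
    history_counts = Counter()
    for w1, w2, w3 in zip(tokens, tokens[1:], tokens[2:]):
        trigram_counts[(w1, w2, w3)] += 1
        history_counts[(w1, w2)] += 1
    return trigram_counts, history_counts


def trigram_probability(trigram_counts, history_counts, w1, w2, w3):
    """Relative-frequency estimate of P(w3 | w1, w2).

    Returns 0.0 when the history (w1, w2) never occurred in training.
    This is a simplification: the patented method derives the probability
    from per-class factors found by iterative scaling instead.
    """
    seen = history_counts[(w1, w2)]
    if seen == 0:
        return 0.0
    return trigram_counts[(w1, w2, w3)] / seen


if __name__ == "__main__":
    training = "the cat sat on the cat mat".split()
    trigrams, histories = count_trigrams(training)
    print(trigram_probability(trigrams, histories, "the", "cat", "sat"))
```

A raw relative-frequency estimate like this assigns zero probability to any trigram absent from the training data, which is one reason practical language models combine counts through smoothing or factor-based schemes.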