Patents by Inventor Ladislav Kunc

Ladislav Kunc has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11966699
    Abstract: A system for classifying a language sample intent by receiving a language sample including a set of features, identifying language sample features, determining a tokenization score for the language sample according to the language sample features, eliminating duplicate features according to the tokenization score, determining a term frequency (tf) according to the identified features and the tokenization score, determining an inverse document frequency (idf) according to the identified features and the tokenization score, and generating a term frequency-inverse document frequency (tf-idf) matrix for the identified features.
    Type: Grant
    Filed: June 17, 2021
    Date of Patent: April 23, 2024
    Assignee: International Business Machines Corporation
    Inventors: Abhishek Shah, Ladislav Kunc, Haode Qi, Lin Pan, Saloni Potdar
  • Publication number: 20240070401
    Abstract: Methods, systems, and computer program products for detecting out-of-domain text data in dialog systems using artificial intelligence techniques are provided herein. A computer-implemented method includes updating artificial intelligence techniques related to out-of-domain text data detection, the updating based on encoding training data and generating regularized representations of at least a portion of the encoded training data by combining the at least a portion of the encoded training data and at least one intent centroid associated with the updated artificial intelligence techniques; encoding input text data; computing out-of-domain scores, in connection with the at least one dialog system, for at least a portion of the encoded input text data by processing the at least a portion of encoded input data using at least a portion of the one or more updated artificial intelligence techniques; and performing one or more automated actions based on the computed out-of-domain scores.
    Type: Application
    Filed: August 29, 2022
    Publication date: February 29, 2024
    Inventors: Cheng Qian, Haode Qi, Saloni Potdar, Ladislav Kunc
  • Publication number: 20240037331
    Abstract: A method, a structure, and a computer system for OOD sentence detection in dialogue systems. The exemplary embodiments may include receiving, for a domain corresponding to a particular topic, one or more on-topic text inputs and one or more off-topic text inputs. The exemplary embodiments may further include encoding the one or more on-topic text inputs and the one or more off-topic text inputs into a latent space, as well as decoding the one or more on-topic text inputs and the one or more off-topic text inputs from the latent space. The exemplary embodiments may additionally include minimizing a reconstruction error between the encoded one or more on-topic text inputs and the decoded one or more on-topic text inputs, and maximizing a reconstruction error between the encoded one or more off-topic text inputs and the decoded one or more off-topic text inputs.
    Type: Application
    Filed: July 28, 2022
    Publication date: February 1, 2024
    Inventors: Haode Qi, Cheng Qian, Ladislav Kunc, Saloni Potdar, Eric Donald Wayne
  • Patent number: 11853712
    Abstract: A method, computer system, and computer program product for multi-lingual chatlog training are provided. The embodiment may include receiving, by a processor, a plurality of data related to conversational data in multiple languages. The embodiment may also include assigning an intent label to each conversational data. The embodiment may further include assigning a language label to each conversational data. The embodiment may also include pairing the plurality of the data related to the conversational data according to the intent label and the language label. The embodiment may further include training a machine learning model using a multi-lingual and multi-intent conversational data pairing. The embodiment may also include training the machine learning model using a single language and multi-intent conversational data pairing.
    Type: Grant
    Filed: June 7, 2021
    Date of Patent: December 26, 2023
    Assignee: International Business Machines Corporation
    Inventors: Haode Qi, Lin Pan, Abhishek Shah, Ladislav Kunc, Saloni Potdar
  • Publication number: 20220405472
    Abstract: A system for classifying a language sample intent by receiving a language sample including a set of features, identifying language sample features, determining a tokenization score for the language sample according to the language sample features, eliminating duplicate features according to the tokenization score, determining a term frequency (tf) according to the identified features and the tokenization score, determining an inverse document frequency (idf) according to the identified features and the tokenization score, and generating a term frequency-inverse document frequency (tf-idf) matrix for the identified features.
    Type: Application
    Filed: June 17, 2021
    Publication date: December 22, 2022
    Inventors: Abhishek Shah, Ladislav Kunc, Haode Qi, Lin Pan, Saloni Potdar
  • Publication number: 20220391600
    Abstract: A method, computer system, and computer program product for multi-lingual chatlog training are provided. The embodiment may include receiving, by a processor, a plurality of data related to conversational data in multiple languages. The embodiment may also include assigning an intent label to each conversational data. The embodiment may further include assigning a language label to each conversational data. The embodiment may also include pairing the plurality of the data related to the conversational data according to the intent label and the language label. The embodiment may further include training a machine learning model using a multi-lingual and multi-intent conversational data pairing. The embodiment may also include training the machine learning model using a single language and multi-intent conversational data pairing.
    Type: Application
    Filed: June 7, 2021
    Publication date: December 8, 2022
    Inventors: Haode Qi, Lin Pan, Abhishek Shah, Ladislav Kunc, Saloni Potdar
  • Patent number: 11423333
    Abstract: Mechanisms are provided for optimizing an automated machine learning (AutoML) operation to configure parameters of a machine learning model. AutoML logic is configured based on an initial default value and initial range for sampling of a parameter of the machine learning (ML) model and an initial AutoML process is executed on the ML model based on a plurality of datasets comprising a plurality of domains of data elements, utilizing the initially configured AutoML logic. For each domain, a cross-dataset default value and cross-dataset value range are derived from results of the execution of the initial AutoML process. For each domain, an entry is stored in a data structure, the entry storing the derived cross-dataset default value and cross-dataset value range for the domain. The AutoML logic performs a subsequent AutoML process on a new dataset based on one or more entries of the data structure.
    Type: Grant
    Filed: March 25, 2020
    Date of Patent: August 23, 2022
    Assignee: International Business Machines Corporation
    Inventors: Haode Qi, Ming Tan, Ladislav Kunc, Saloni Potdar
  • Patent number: 11423227
    Abstract: A mechanism is provided to implement an abnormal entity detection mechanism that facilitates detecting abnormal entities in real-time response systems through weak supervision. For each first intent from an entity labeled workspace that matches a second intent in labeled chat logs, when the entity score associated with each first entity or second entity is above a predefined significance level the first entity or the second entity is recorded. For each first intent from the entity labeled workspace that matches the second intent in the labeled chat logs: responsive to the first entity being recorded and the second entity failing to be recorded, that first entity is removed from the training data as being mistakenly included; or, responsive to the second entity being recorded and the first entity failing to be recorded, that second entity is added as a potential business case to the training data.
    Type: Grant
    Filed: February 13, 2020
    Date of Patent: August 23, 2022
    Assignee: International Business Machines Corporation
    Inventors: Haode Qi, Ming Tan, Yang Yu, Navneet N. Rao, Ladislav Kunc, Saloni Potdar
  • Patent number: 11270077
    Abstract: A computing device receives a natural language input from a user. The computing device routes the natural language input from an active domain node of multiple domain nodes of a multi-domain context-based hierarchy to a leaf node of the domain nodes by selecting a parent domain node in the hierarchy until an off-topic classifier labels the natural language input as in-domain and then selecting a subdomain node in the hierarchy until an in-domain classifier labels the natural language input with a classification label, each of the plurality of domain nodes comprising a respective off-topic classifier and a respective in-domain classifier trained for a respective domain node. The computing device outputs the classification label determined by the leaf node.
    Type: Grant
    Filed: May 13, 2019
    Date of Patent: March 8, 2022
    Assignee: International Business Machines Corporation
    Inventors: Ming Tan, Ladislav Kunc, Yang Yu, Haoyu Wang, Saloni Potdar
  • Patent number: 11138506
    Abstract: A computer-implemented method for building a semantic analysis model. In one embodiment, the computer-implemented method includes creating proxy tags comprising a set of surface form variants. The computer-implemented method creates training examples comprising a combination of terminal tokens and at least one of the proxy tags. The computer-implemented method builds the semantic analysis model using the training examples.
    Type: Grant
    Filed: October 10, 2017
    Date of Patent: October 5, 2021
    Assignee: International Business Machines Corporation
    Inventors: Donna K. Byron, Benjamin L. Johnson, Ladislav Kunc, Mary D. Swift
  • Publication number: 20210304056
    Abstract: Mechanisms are provided for performing an automated machine learning (AutoML) operation to configure parameters of a machine learning model. AutoML logic is configured based on an initial parameter sampling configuration for sampling values of parameter(s) of the machine learning (ML) model. An initial AutoML process is executed on the ML model based on a dataset utilizing the initially configured AutoML logic, to generate at least one learned value for the parameter(s) of the ML model. The dataset is analyzed to extract a set of dataset characteristics that define properties of a format and/or a content of the dataset which are stored in association with the at least one learned value as part of a training dataset. A ML prediction model is trained based on the training dataset to predict, for new datasets, corresponding new sampling configuration information based on characteristics of the new datasets.
    Type: Application
    Filed: March 25, 2020
    Publication date: September 30, 2021
    Inventors: Haode Qi, Ming Tan, Ladislav Kunc, Saloni Potdar
  • Publication number: 20210304055
    Abstract: Mechanisms are provided for optimizing an automated machine learning (AutoML) operation to configure parameters of a machine learning model. AutoML logic is configured based on an initial default value and initial range for sampling of a parameter of the machine learning (ML) model and an initial AutoML process is executed on the ML model based on a plurality of datasets comprising a plurality of domains of data elements, utilizing the initially configured AutoML logic. For each domain, a cross-dataset default value and cross-dataset value range are derived from results of the execution of the initial AutoML process. For each domain, an entry is stored in a data structure, the entry storing the derived cross-dataset default value and cross-dataset value range for the domain. The AutoML logic performs a subsequent AutoML process on a new dataset based on one or more entries of the data structure.
    Type: Application
    Filed: March 25, 2020
    Publication date: September 30, 2021
    Inventors: Haode Qi, Ming Tan, Ladislav Kunc, Saloni Potdar
  • Patent number: 11120225
    Abstract: An online version of a sentence representation generation module is updated by training a first sentence representation generation module using first labeled data of a first corpus. After training the first sentence representation generation module using the first labeled data, a second corpus of second labeled data is obtained. The second corpus is distinct from the first corpus. A subset of the first labeled data is identified based on similarities between the first corpus and the second corpus. A second sentence representation generation module is trained using the second labeled data of the second corpus and the subset of the first labeled data.
    Type: Grant
    Filed: February 5, 2019
    Date of Patent: September 14, 2021
    Assignee: International Business Machines Corporation
    Inventors: Ming Tan, Ladislav Kunc, Yang Yu, Haoyu Wang, Saloni Potdar
  • Publication number: 20210256211
    Abstract: A mechanism is provided to implement an abnormal entity detection mechanism that facilitates detecting abnormal entities in real-time response systems through weak supervision. For each first intent from an entity labeled workspace that matches a second intent in labeled chat logs, when the entity score associated with each first entity or second entity is above a predefined significance level the first entity or the second entity is recorded. For each first intent from the entity labeled workspace that matches the second intent in the labeled chat logs: responsive to the first entity being recorded and the second entity failing to be recorded, that first entity is removed from the training data as being mistakenly included; or, responsive to the second entity being recorded and the first entity failing to be recorded, that second entity is added as a potential business case to the training data.
    Type: Application
    Filed: February 13, 2020
    Publication date: August 19, 2021
    Inventors: Haode Qi, Ming Tan, Yang Yu, Navneet N. Rao, Ladislav Kunc, Saloni Potdar
  • Patent number: 11095590
    Abstract: Embodiments provide a computer implemented method, in a data processing system including a processor and a memory including instructions which are executed by the processor to cause the processor to train an enhanced chatflow system, the method including: ingesting a corpus of information including at least one user input node corresponding to a user question and at least one variation for each user input node; for each user input node: designating the node as a class; storing the node in a dialog node repository; designating each of the at least one variations as training examples for the designated class; converting the classes and the training examples into feature vector representations; training one or more training classifiers using the one or more feature vector representations of the classes; and training classification objectives using the one or more feature vector representations of the training examples.
    Type: Grant
    Filed: September 28, 2016
    Date of Patent: August 17, 2021
    Assignee: International Business Machines Corporation
    Inventors: Raimo Bakis, Ladislav Kunc, David Nahamoo, Lazaros Polymenakos, John Zakos
  • Publication number: 20210141860
    Abstract: Provided is a method, system, and computer program product for context-dependent spellchecking. The method comprises receiving context data to be used in spell checking. The method further comprises receiving a user input. The method further comprises identifying an out-of-vocabulary (OOV) word in the user input. An initial suggestion pool of candidate words is identified based, at least in part, on the context data. The method then comprises using a noisy channel approach to evaluate a probability that one or more of the candidate words of the initial suggestion pool is an intended word and should be used as a candidate for replacement of the OOV word. The method further comprises selecting one or more candidate words for replacement of the OOV word. The method further comprises outputting the one or more candidates.
    Type: Application
    Filed: November 11, 2019
    Publication date: May 13, 2021
    Inventors: Panos Karagiannis, Ladislav Kunc, Saloni Potdar, Haoyu Wang, Navneet N. Rao
  • Patent number: 10977445
    Abstract: A computer-implemented method includes obtaining a training data set including a plurality of training examples. The method includes generating, for each training example, multiple feature vectors corresponding, respectively, to multiple feature types. The method includes applying weighting factors to feature vectors corresponding to a subset of the feature types. The weighting factors are determined based on one or more of: a number of training examples, a number of classes associated with the training data set, an average number of training examples per class, a language of the training data set, a vocabulary size of the training data set, or a commonality of the vocabulary with a public corpus. The method includes concatenating the feature vectors of a particular training example to form an input vector and providing the input vector as training data to a machine-learning intent classification model to train the model to determine intent based on text input.
    Type: Grant
    Filed: February 1, 2019
    Date of Patent: April 13, 2021
    Assignee: International Business Machines Corporation
    Inventors: Yang Yu, Ladislav Kunc, Haoyu Wang, Ming Tan, Saloni Potdar
  • Patent number: 10937416
    Abstract: A method includes providing input text to a plurality of multi-task learning (MTL) models corresponding to a plurality of domains. Each MTL model is trained to generate an embedding vector based on the input text. The method further includes providing the input text to a domain identifier that is trained to generate a weight vector based on the input text. The weight vector indicates a classification weight for each domain of the plurality of domains. The method further includes scaling each embedding vector based on a corresponding classification weight of the weight vector to generate a plurality of scaled embedding vectors, generating a feature vector based on the plurality of scaled embedding vectors, and providing the feature vector to an intent classifier that is trained to generate, based on the feature vector, an intent classification result associated with the input text.
    Type: Grant
    Filed: February 1, 2019
    Date of Patent: March 2, 2021
    Assignee: International Business Machines Corporation
    Inventors: Ming Tan, Haoyu Wang, Ladislav Kunc, Yang Yu, Saloni Potdar
  • Patent number: 10922322
    Abstract: According to some aspects, a method of searching for content in response to a user voice query is provided. The method may comprise receiving the user voice query, performing speech recognition to generate N best speech recognition results comprising a first speech recognition result, performing a supervised search of at least one content repository to identify one or more supervised search results using one or more classifiers that classify the first speech recognition result into at least one class that identifies previously classified content in the at least one content repository, performing an unsupervised search of the at least one content repository to identify one or more unsupervised search results, wherein performing the unsupervised search comprises performing a word search of the at least one content repository, and generating combined results from among the one or more supervised search results and the one or more unsupervised search results.
    Type: Grant
    Filed: July 22, 2014
    Date of Patent: February 16, 2021
    Assignee: Nuance Communications, Inc.
    Inventors: Jan Kleindienst, Ladislav Kunc, Martin Labsky, Tomas Macek
  • Publication number: 20200364300
    Abstract: A computing device receives a natural language input from a user. The computing device routes the natural language input from an active domain node of multiple domain nodes of a multi-domain context-based hierarchy to a leaf node of the domain nodes by selecting a parent domain node in the hierarchy until an off-topic classifier labels the natural language input as in-domain and then selecting a subdomain node in the hierarchy until an in-domain classifier labels the natural language input with a classification label, each of the plurality of domain nodes comprising a respective off-topic classifier and a respective in-domain classifier trained for a respective domain node. The computing device outputs the classification label determined by the leaf node.
    Type: Application
    Filed: May 13, 2019
    Publication date: November 19, 2020
    Inventors: Ming Tan, Ladislav Kunc, Yang Yu, Haoyu Wang, Saloni Potdar
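
Several abstracts in this listing (patent 11966699 and publication 20220405472) describe building a term frequency-inverse document frequency (tf-idf) matrix from the features of language samples. As a rough illustration only, the following Python sketch computes a textbook tf-idf matrix for a few tokenized samples. It is not the patented method: the abstract does not define the "tokenization score", so that step is left out, and the function name `tf_idf_matrix`, the unsmoothed `log(N / df)` weighting, and the sample utterances are all assumptions made for this example.

```python
import math
from collections import Counter


def tf_idf_matrix(samples):
    """Return (vocabulary, matrix) for a list of tokenized samples.

    Hypothetical helper: plain textbook tf-idf, without the
    "tokenization score" that the patent abstract mentions.
    """
    # Collapse duplicate features into a sorted vocabulary (one column each).
    vocabulary = sorted({token for sample in samples for token in sample})
    n_samples = len(samples)

    # Document frequency: how many samples contain each feature at least once.
    doc_freq = Counter(token for sample in samples for token in set(sample))

    matrix = []
    for sample in samples:
        counts = Counter(sample)
        total = len(sample)
        row = []
        for token in vocabulary:
            # Term frequency: share of this sample's tokens equal to `token`.
            tf = counts[token] / total if total else 0.0
            # Inverse document frequency (unsmoothed, natural logarithm).
            idf = math.log(n_samples / doc_freq[token])
            row.append(tf * idf)
        matrix.append(row)
    return vocabulary, matrix


# Example utterances (made up for illustration).
samples = [
    ["reset", "my", "password"],
    ["reset", "my", "router"],
]
vocabulary, matrix = tf_idf_matrix(samples)
```

With these two samples, features shared by both utterances ("reset", "my") receive a weight of zero, while the distinguishing features ("password", "router") receive a positive weight, which is the property that makes tf-idf useful as input to an intent classifier.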