Patents by Inventor Marc Dymetman

Marc Dymetman has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240054338
    Abstract: A processor-implemented method for fine-tuning a pre-trained neural conditional language model to perform a downstream task. A pre-trained conditional language model and at least one target constraint for satisfying a task-related control objective are received. A neural model is trained to approximate a target conditional model that optimally reconciles a distance from the pre-trained conditional language model and the control objective across multiple contexts.
    Type: Application
    Filed: October 3, 2022
    Publication date: February 15, 2024
    Inventors: Tomasz KORBAK, Hady ELSAHAR, German KRUSZEWSKI, Marc DYMETMAN
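The training objective in the entry above (staying close to the pre-trained model while meeting a control objective, across multiple contexts) can be illustrated with a small distributional-policy-gradient-style loop. This is only a toy sketch under invented assumptions: a discrete output space, a hypothetical binary constraint b(x), and a target conditional model p(x|c) proportional to a(x|c)·b(x) for each context c. It is not the patented fine-tuning procedure itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "pre-trained conditional model" a(x|c): for each of 3 contexts, a
# categorical distribution over a small discrete output space.
VOCAB = ["aa", "ab", "ba", "bb"]
a = {c: rng.dirichlet(np.ones(len(VOCAB))) for c in range(3)}

def constraint(x):
    """Hypothetical binary control objective: outputs must contain an 'a'."""
    return 1.0 if "a" in x else 0.0

# Target conditional model p(x|c) proportional to a(x|c) * b(x): the model
# closest to a(.|c) whose samples all satisfy the constraint.
def target(c):
    w = a[c] * np.array([constraint(x) for x in VOCAB])
    return w / w.sum()

# Trainable approximation pi_theta(x|c): a softmax over per-context logits.
theta = np.zeros((3, len(VOCAB)))

def pi(c):
    z = np.exp(theta[c] - theta[c].max())
    return z / z.sum()

# Policy-gradient-style loop: sample from pi, reweight by p(x|c)/pi(x|c), and
# push pi toward the target conditional model across the different contexts.
lr = 0.5
for _ in range(2000):
    c = rng.integers(3)
    p, q = target(c), pi(c)
    i = rng.choice(len(VOCAB), p=q)
    weight = p[i] / q[i]                      # importance weight
    grad = -q
    grad[i] += 1.0                            # d log pi(x_i|c) / d theta[c]
    theta[c] += lr * weight * grad

for c in range(3):
    print(c, np.round(pi(c), 3), np.round(target(c), 3))
```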
  • Publication number: 20240037184
    Abstract: A sampling system includes: an energy-based model (EBM) configured to generate non-negative scores of an input having discrete classifications, respectively; and a sampling module configured to: generate a sample from a probability distribution of the EBM using a proposal distribution; set a probability of acceptance of the sample based on a minimum of (a) 1 and (b) an acceptance value determined based on the sample, a score of the sample from the EBM, the proposal distribution, and an upper boundary value; determine a distribution value between 0 and 1 using a uniform distribution; and discard the sample when the distribution value is greater than the probability of acceptance of the sample.
    Type: Application
    Filed: July 29, 2022
    Publication date: February 1, 2024
    Applicant: NAVER CORPORATION
    Inventors: Bryan EIKEMA, German KRUSZEWSKI, Hady ELSAHAR, Stéphane CLINCHANT, Marc DYMETMAN
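A minimal numpy illustration of the acceptance/rejection scheme described above, assuming a hypothetical 5-class EBM with hand-picked non-negative scores P, a uniform proposal q, and an upper boundary value beta chosen so that P(x) ≤ beta·q(x) for all x.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical energy-based model over 5 discrete classes: non-negative,
# unnormalized scores P(x).
P = np.array([0.1, 2.0, 0.5, 1.2, 0.05])

# Proposal distribution q(x) that we can sample from directly.
q = np.array([0.2, 0.2, 0.2, 0.2, 0.2])

# Upper boundary value beta chosen so that P(x) <= beta * q(x) for all x.
beta = np.max(P / q)

def sample_ebm(n):
    accepted = []
    while len(accepted) < n:
        x = rng.choice(len(q), p=q)                 # sample from the proposal
        p_accept = min(1.0, P[x] / (beta * q[x]))   # probability of acceptance
        u = rng.uniform(0.0, 1.0)                   # value from U(0, 1)
        if u <= p_accept:                           # discard when u > p_accept
            accepted.append(x)
    return np.array(accepted)

samples = sample_ebm(20000)
print("empirical:", np.round(np.bincount(samples, minlength=len(P)) / len(samples), 3))
print("target:   ", np.round(P / P.sum(), 3))
```

Because beta·q upper-bounds P, accepted samples are exact draws from the EBM's normalized distribution, which the final print roughly confirms.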
  • Patent number: 11681911
    Abstract: Methods for training a neural sequence-to-sequence (seq2seq) model. A processor receives the model and training data comprising a plurality of training source sequences and corresponding training target sequences, and generates corresponding predicted target sequences. Model parameters are updated based on a comparison of predicted target sequences to training target sequences to reduce or minimize both a local loss in the predicted target sequences and an expected loss of one or more global or semantic features or constraints between the predicted target sequences and the training target sequences given the training source sequences. Expected loss is based on global or semantic features or constraints of general target sequences given general source sequences.
    Type: Grant
    Filed: October 15, 2019
    Date of Patent: June 20, 2023
    Assignee: NAVER CORPORATION
    Inventors: Vu Cong Duy Hoang, Ioan Calapodescu, Marc Dymetman
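The combined objective above (a local token-level loss plus an expected loss on a global feature) can be sketched as follows. This is a simplified stand-in, not the patented procedure: free per-position logits replace a real seq2seq decoder, the global/semantic feature is a made-up token count, and the expected loss is estimated with a REINFORCE-style gradient.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Toy stand-in for a seq2seq decoder: free per-position logits over a small
# vocabulary (a real model would condition these on a source sequence).
SEQ_LEN, VOCAB_SIZE = 6, 4
logits = torch.zeros(SEQ_LEN, VOCAB_SIZE, requires_grad=True)

# Training target sequence and a made-up global/semantic feature: the number
# of occurrences of token 2 in a sequence.
target = torch.tensor([2, 0, 2, 1, 3, 2])

def global_feature(seq):
    return (seq == 2).float().sum()

opt = torch.optim.Adam([logits], lr=0.1)
for step in range(300):
    log_p = F.log_softmax(logits, dim=-1)

    # Local loss: token-level negative log-likelihood of the training target.
    local_loss = F.nll_loss(log_p, target)

    # Expected global loss: squared mismatch of the feature between sampled
    # predictions and the target, with a REINFORCE-style gradient estimate.
    dist = torch.distributions.Categorical(logits=logits)
    samples = dist.sample((8,))                         # 8 predicted sequences
    gap = (samples == 2).float().sum(dim=1) - global_feature(target)
    reward = -(gap ** 2)                                # smaller gap is better
    expected_loss = -(reward.detach() * dist.log_prob(samples).sum(dim=1)).mean()

    loss = local_loss + 0.1 * expected_loss             # reduce both losses
    opt.zero_grad()
    loss.backward()
    opt.step()

print(F.softmax(logits, dim=-1).argmax(dim=-1))
```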
  • Publication number: 20220108081
    Abstract: A method for generating a language model for text generation by receiving a pre-trained language model having attributes with existing probability distributions over the pre-trained language model; receiving at least one target constraint; the target constraint specifying an expectation of a target attribute over a language model that approximates the pre-trained language model; computing parameters of an energy based model by applying the target constraint to the pre-trained language model; obtaining samples from a reference policy; updating parameters of a target policy using the obtained samples and the energy based model; updating the reference policy with the target policy if the target policy is superior to the reference policy; and outputting the target policy as a target language model. The target language model is adapted to generate text with the target attribute over a probability distribution that approximates the desired probability distribution specified by the target constraint.
    Type: Application
    Filed: August 2, 2021
    Publication date: April 7, 2022
    Inventors: Marc Dymetman, Hady Elsahar, Muhammad Khalifa
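A rough end-to-end illustration of the pipeline above on a toy discrete space: first fit the energy-based model's parameter so that the constraint's expectation is met, then update a target policy from samples of a reference policy, adopting the target policy as the new reference whenever it is closer to the target distribution. The distributions, the attribute phi, the moment constraint, and the learning rate are all invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy pre-trained model a(x) over 6 discrete outputs, a target attribute
# phi(x), and a constraint asking that E_p[phi] = 0.7 (vs. roughly 0.3 under a).
a = np.array([0.30, 0.25, 0.15, 0.12, 0.10, 0.08])
phi = np.array([0.0, 0.0, 1.0, 0.0, 1.0, 1.0])
moment = 0.7

# Step 1: parameters of the energy-based model P(x) = a(x) * exp(lam * phi(x)),
# with lam chosen (by bisection here) so the normalized EBM meets the constraint.
def ebm_expectation(lam):
    w = a * np.exp(lam * phi)
    return (w / w.sum()) @ phi

lo, hi = -20.0, 20.0
for _ in range(60):
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if ebm_expectation(mid) < moment else (lo, mid)
lam = (lo + hi) / 2
p = a * np.exp(lam * phi)
p /= p.sum()                                  # desired target distribution

# Step 2: train a target policy pi_theta from samples of a reference policy q,
# using importance-weighted log-likelihood updates toward p.
theta = np.log(a.copy())
q = a.copy()

def pi():
    z = np.exp(theta - theta.max())
    return z / z.sum()

lr = 0.2
for step in range(4000):
    x = rng.choice(len(a), p=q)
    weight = p[x] / q[x]
    grad = -pi()
    grad[x] += 1.0                            # d log pi(x) / d theta
    theta += lr * weight * grad
    # Step 3: adopt the target policy as the new reference policy whenever it
    # is closer (total variation) to p than the current reference.
    if step % 500 == 0 and np.abs(pi() - p).sum() < np.abs(q - p).sum():
        q = pi()

print("target p :", np.round(p, 3))
print("pi_theta :", np.round(pi(), 3))
```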
  • Publication number: 20220083852
    Abstract: In a method for generating a normalized sequential model using a processor, a sequential energy-based model computed by a parameterized neural network is provided. The sequential energy-based model defines an unnormalized probability distribution over a target sequence for a context source. The normalized sequential model is generated by projecting the sequential energy-based model onto a target autoregressive model that approximates a normalized distribution associated with the sequential energy-based model.
    Type: Application
    Filed: September 11, 2020
    Publication date: March 17, 2022
    Inventors: Tetiana PARSHAKOVA, Marc DYMETMAN, Jean-Marc ANDRÉOLI
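A sketch of projecting an unnormalized sequential energy-based model onto an autoregressive model, under invented assumptions: binary sequences of length 4, a hypothetical energy favoring sequences with exactly two 1s, and a one-feature logistic autoregressive model trained by importance-weighted likelihood updates. The patented method targets neural sequence models; this only mirrors the projection idea.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sequential energy-based model: an unnormalized score over
# binary sequences of length 4 that favors sequences with exactly two 1s.
def ebm_score(seq):
    return np.exp(-2.0 * abs(sum(seq) - 2))

ALL_SEQS = [tuple(int(b) for b in f"{i:04b}") for i in range(16)]
Z = sum(ebm_score(s) for s in ALL_SEQS)   # used only to report the target below

# Target autoregressive model: P(y_t = 1 | y_<t) is a logistic function of a
# simple prefix feature (the number of 1s generated so far).
w = np.zeros(2)                            # weights: [bias, ones_so_far]

def sample_with_grad():
    seq, logp, grad = [], 0.0, np.zeros(2)
    for _ in range(4):
        feat = np.array([1.0, float(sum(seq))])
        p1 = 1.0 / (1.0 + np.exp(-(w @ feat)))
        y = int(rng.random() < p1)
        grad += (y - p1) * feat            # d log P(y | prefix) / dw
        logp += np.log(p1 if y else 1.0 - p1)
        seq.append(y)
    return tuple(seq), logp, grad

# Projection step: importance-weighted likelihood updates that pull the
# autoregressive model toward the (unnormalized) EBM distribution.
lr = 0.05
for _ in range(5000):
    seq, logq, grad = sample_with_grad()
    weight = ebm_score(seq) / np.exp(logq)
    w += lr * weight * grad

samples = [sample_with_grad()[0] for _ in range(5000)]
print("model E[#ones]:", np.mean([sum(s) for s in samples]))
print("EBM   E[#ones]:", sum(sum(s) * ebm_score(s) for s in ALL_SEQS) / Z)
```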
  • Publication number: 20210110254
    Abstract: Methods for training a neural sequence-to-sequence (seq2seq) model. A processor receives the model and training data comprising a plurality of training source sequences and corresponding training target sequences, and generates corresponding predicted target sequences. Model parameters are updated based on a comparison of predicted target sequences to training target sequences to reduce or minimize both a local loss in the predicted target sequences and an expected loss of one or more global or semantic features or constraints between the predicted target sequences and the training target sequences given the training source sequences. Expected loss is based on global or semantic features or constraints of general target sequences given general source sequences.
    Type: Application
    Filed: October 15, 2019
    Publication date: April 15, 2021
    Inventors: Vu Cong Duy HOANG, Ioan CALAPODESCU, Marc DYMETMAN
  • Patent number: 10853724
    Abstract: Methods, systems, and devices for semantic parsing. In an example embodiment, a method for semantic parsing can include steps, operations, or instructions such as obtaining a data pair for learning, the data pair comprising logical form data and natural utterance data; acquiring a grammar for targeted logical forms among the logical form data of the data pair; modeling data comprising other available prior knowledge using WFSA (Weighted Finite-State Automata); combining the targeted logical forms with the data modeled from that prior knowledge to form a background; and exploiting the background on the data pair. The background itself is not learned; rather, the background-RNN (Recurrent Neural Network) is learned.
    Type: Grant
    Filed: June 2, 2017
    Date of Patent: December 1, 2020
    Assignee: Xerox Corporation
    Inventors: Chunyang Xiao, Marc Dymetman
  • Patent number: 10431205
    Abstract: A dialog device comprises a natural language interfacing device (chat interface or a telephonic device), and a natural language output device (the chat interface, a display device, or a speech synthesizer outputting to the telephonic device). A computer stores natural language dialog conducted via the interfacing device and constructs a current utterance word-by-word. Each word is chosen by applying a plurality of language models to a context comprising concatenation of the stored dialog and the current utterance thus far. Each language model outputs a distribution over the words of a vocabulary. A recurrent neural network (RNN) is applied to the distributions to generate a mixture distribution. The next word is chosen using the mixture distribution. The output device outputs the current natural language utterance after it has been constructed by the computer.
    Type: Grant
    Filed: April 27, 2016
    Date of Patent: October 1, 2019
    Assignee: CONDUENT BUSINESS SERVICES, LLC
    Inventors: Phong Le, Marc Dymetman, Jean-Michel Renders
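The word-by-word construction described above can be illustrated with two toy component language models (a uniform unigram model and a copy-from-context model) whose outputs are mixed with weights produced by a recurrent network over the dialog context. The vocabulary, component models, and network sizes below are invented, and the network is untrained, so the generated utterance is essentially random; the point is the composition.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
VOCAB = ["<eos>", "hello", "how", "can", "i", "help", "you", "today"]
IDX = {w: i for i, w in enumerate(VOCAB)}

# Component language model 1: a fixed uniform unigram model over the vocabulary.
unigram = torch.ones(len(VOCAB)) / len(VOCAB)

# Component language model 2: a "copy" model that favors words already present
# in the dialog context.
def copy_model(context_ids):
    dist = torch.full((len(VOCAB),), 1e-6)
    for i in context_ids:
        dist[i] += 1.0
    return dist / dist.sum()

# Recurrent network that reads the context (stored dialog plus the utterance so
# far) and outputs mixture weights over the two component models; untrained here.
emb = torch.nn.Embedding(len(VOCAB), 16)
rnn = torch.nn.GRU(16, 16, batch_first=True)
to_weights = torch.nn.Linear(16, 2)

def next_word(context_ids):
    x = emb(torch.tensor([context_ids]))
    _, h = rnn(x)
    weights = F.softmax(to_weights(h[-1, 0]), dim=-1)        # mixture weights
    mixture = weights[0] * unigram + weights[1] * copy_model(context_ids)
    return torch.multinomial(mixture, 1).item()

# Construct the current utterance word-by-word from the stored dialog.
dialog = [IDX[w] for w in ["hello", "how", "can", "i", "help", "you"]]
utterance = []
for _ in range(5):
    utterance.append(next_word(dialog + utterance))
print([VOCAB[i] for i in utterance])
```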
  • Publication number: 20180349767
    Abstract: Methods, systems, and devices for semantic parsing. In an example embodiment, a method for semantic parsing can include steps, operations, or instructions such as obtaining a data pair for learning, the data pair comprising logical form data and natural utterance data; acquiring a grammar for targeted logical forms among the logical form data of the data pair; modeling data comprising other available prior knowledge using WFSA (Weighted Finite-State Automata); combining the targeted logical forms with the data modeled from that prior knowledge to form a background; and exploiting the background on the data pair. The background itself is not learned; rather, the background-RNN (Recurrent Neural Network) is learned.
    Type: Application
    Filed: June 2, 2017
    Publication date: December 6, 2018
    Inventors: Chunyang Xiao, Marc Dymetman
  • Publication number: 20180349765
    Abstract: A neural network apparatus includes a recurrent neural network having a log-linear output layer. The recurrent neural network is trained by training data, and the recurrent neural network models output symbols as complex combinations of attributes without requiring that each combination among the complex combinations be directly observed in the training data. The recurrent neural network is configured to permit an inclusion of flexible prior knowledge in a form of specified modular features, wherein the recurrent neural network learns to dynamically control weights of a log-linear distribution to promote the specified modular features. The recurrent neural network can be implemented as a log-linear recurrent neural network.
    Type: Application
    Filed: May 30, 2017
    Publication date: December 6, 2018
    Inventors: Marc Dymetman, Chunyang Xiao
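A forward-pass sketch of a log-linear output layer whose weights are dynamically produced by a recurrent network, with three hypothetical modular features (city, number, function word) supplied as prior knowledge. The network is untrained and the vocabulary is a toy one; a real model would also include ordinary word-identity features.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
VOCAB = ["paris", "london", "7", "42", "the", "a"]

# Specified modular features: prior knowledge given as feature functions over
# output symbols (hypothetical examples).
def is_city(w): return float(w in {"paris", "london"})
def is_number(w): return float(w.isdigit())
def is_function_word(w): return float(w in {"the", "a"})
FEATURES = [is_city, is_number, is_function_word]

# Feature matrix: one row per vocabulary symbol, one column per feature.
FEAT = torch.tensor([[f(w) for f in FEATURES] for w in VOCAB])

# Recurrent network whose hidden state dynamically controls the weights
# (lambdas) of the log-linear output distribution.
emb = torch.nn.Embedding(len(VOCAB), 8)
rnn = torch.nn.GRU(8, 8, batch_first=True)
to_lambdas = torch.nn.Linear(8, len(FEATURES))

def output_distribution(prefix_ids):
    x = emb(torch.tensor([prefix_ids]))
    _, h = rnn(x)
    lambdas = to_lambdas(h[-1, 0])            # dynamic log-linear weights
    scores = FEAT @ lambdas                   # sum_k lambda_k * f_k(y)
    return F.softmax(scores, dim=-1)          # log-linear output layer

print(output_distribution([VOCAB.index("the")]))
```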
  • Patent number: 10049106
    Abstract: A method and a system for generating a target character sequence from a semantic representation including a sequence of characters are provided. The method includes adapting a target background model, built from a vocabulary of words, to form an adapted background model. The adapted background model accepts subsequences of an input semantic representation as well as words from the vocabulary. The input semantic representation is represented as a sequence of character embeddings, which are input to an encoder. The encoder encodes each of the character embeddings to generate a respective character representation. A decoder then generates a target sequence of characters, based on the set of character representations. At a plurality of time steps, a next character in the target sequence is selected as a function of the previously generated character(s) of the target sequence and the adapted background model.
    Type: Grant
    Filed: January 18, 2017
    Date of Patent: August 14, 2018
    Assignee: Xerox Corporation
    Inventors: Raghav Goyal, Marc Dymetman
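The adapted background model's role can be sketched as a mask over next characters: at each step, only characters that keep the current partial word a prefix of a vocabulary word, or of a subsequence of the input semantic representation, are allowed. The uniform decoder_probs below is a placeholder for the trained character-level encoder-decoder; the input, vocabulary, and names are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Input semantic representation and a word vocabulary; the adapted background
# model accepts vocabulary words as well as subsequences of the input.
semantic_input = "temp=21"
vocabulary = ["the", "temperature", "is", "degrees"]
allowed_units = set(vocabulary) | {
    semantic_input[i:j]
    for i in range(len(semantic_input))
    for j in range(i + 1, len(semantic_input) + 1)
}

CHARS = sorted(set("".join(allowed_units)) | {" "})

def background_mask(partial_word):
    """Characters that keep the current partial word a prefix of some unit
    accepted by the background model (space may only end a completed unit)."""
    ok = set()
    for unit in allowed_units:
        if unit.startswith(partial_word) and len(unit) > len(partial_word):
            ok.add(unit[len(partial_word)])
        if unit == partial_word:
            ok.add(" ")
    return np.array([1.0 if c in ok else 0.0 for c in CHARS])

def decoder_probs(prefix):
    """Placeholder for the trained character-level decoder's next-character
    distribution; uniform here."""
    return np.ones(len(CHARS)) / len(CHARS)

def generate(max_len=30):
    out, word = "", ""
    for _ in range(max_len):
        p = decoder_probs(out) * background_mask(word)   # decoder x background
        if p.sum() == 0:
            break
        p /= p.sum()
        c = CHARS[rng.choice(len(CHARS), p=p)]
        out += c
        word = "" if c == " " else word + c
    return out

print(generate())
```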
  • Publication number: 20180203852
    Abstract: A method and a system for generating a target character sequence from a semantic representation including a sequence of characters are provided. The method includes adapting a target background model, built from a vocabulary of words, to form an adapted background model. The adapted background model accepts subsequences of an input semantic representation as well as words from the vocabulary. The input semantic representation is represented as a sequence of character embeddings, which are input to an encoder. The encoder encodes each of the character embeddings to generate a respective character representation. A decoder then generates a target sequence of characters, based on the set of character representations. At a plurality of time steps, a next character in the target sequence is selected as a function of the previously generated character(s) of the target sequence and the adapted background model.
    Type: Application
    Filed: January 18, 2017
    Publication date: July 19, 2018
    Applicant: Xerox Corporation
    Inventors: Raghav Goyal, Marc Dymetman
  • Patent number: 9858263
    Abstract: A method for predicting a canonical form for an input text sequence includes predicting the canonical form with a neural network model. The model includes an encoder, which generates a first representation of the input text sequence based on a representation of n-grams in the text sequence and a second representation of the input text sequence generated by a first neural network. The model also includes a decoder which sequentially predicts terms of the canonical form based on the first and second representations and a predicted prefix of the canonical form. The canonical form can be used, for example, to query a knowledge base or to generate a next utterance in a discourse.
    Type: Grant
    Filed: May 5, 2016
    Date of Patent: January 2, 2018
    Assignees: Conduent Business Services, LLC, Centre National de la Recherche Scientifique
    Inventors: Chunyang Xiao, Marc Dymetman, Claire Gardent
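An untrained forward-pass sketch of the two-part encoder and prefix-conditioned decoder described above: a bag-of-n-grams projection (unigram counts stand in for the n-gram features) is concatenated with a recurrent encoder state, and a decoder emits canonical-form terms one at a time from that representation and the predicted prefix. Vocabularies and dimensions are invented.

```python
import torch

torch.manual_seed(0)
SRC_VOCAB = ["which", "river", "flows", "through", "paris", "?"]
TGT_VOCAB = ["<s>", "</s>", "river", "that", "traverses", "paris"]
HID = 16

# First representation: a bag-of-n-grams vector (unigram counts stand in for
# the n-gram features here) passed through a linear projection.
ngram_proj = torch.nn.Linear(len(SRC_VOCAB), HID)

def ngram_representation(tokens):
    counts = torch.tensor([[float(tokens.count(w)) for w in SRC_VOCAB]])
    return ngram_proj(counts)

# Second representation: the final state of a recurrent encoder.
src_emb = torch.nn.Embedding(len(SRC_VOCAB), HID)
encoder = torch.nn.GRU(HID, HID, batch_first=True)

def rnn_representation(tokens):
    ids = torch.tensor([[SRC_VOCAB.index(w) for w in tokens]])
    _, h = encoder(src_emb(ids))
    return h[-1]

# Decoder: predicts canonical-form terms one at a time from the two input
# representations and the predicted prefix so far; untrained here.
tgt_emb = torch.nn.Embedding(len(TGT_VOCAB), HID)
decoder = torch.nn.GRUCell(HID, 2 * HID)
out_proj = torch.nn.Linear(2 * HID, len(TGT_VOCAB))

def predict_canonical_form(tokens, max_len=6):
    state = torch.cat([ngram_representation(tokens), rnn_representation(tokens)], dim=-1)
    prefix, word = [], "<s>"
    for _ in range(max_len):
        state = decoder(tgt_emb(torch.tensor([TGT_VOCAB.index(word)])), state)
        word = TGT_VOCAB[out_proj(state).argmax(dim=-1).item()]
        if word == "</s>":
            break
        prefix.append(word)
    return prefix

print(predict_canonical_form(["which", "river", "flows", "through", "paris", "?"]))
```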
  • Patent number: 9830315
    Abstract: A system and method are provided which employ a neural network model which has been trained to predict a sequentialized form for an input text sequence. The sequentialized form includes a sequence of symbols. The neural network model includes an encoder which generates a representation of the input text sequence based on a representation of n-grams in the text sequence and a decoder which sequentially predicts a next symbol of the sequentialized form based on the representation and a predicted prefix of the sequentialized form. Given an input text sequence, a sequentialized form is predicted with the trained neural network model. The sequentialized form is converted to a structured form and information based on the structured form is output.
    Type: Grant
    Filed: July 13, 2016
    Date of Patent: November 28, 2017
    Assignees: XEROX CORPORATION, Centre National de la Recherche Scientifique
    Inventors: Chunyang Xiao, Marc Dymetman, Claire Gardent
  • Publication number: 20170323636
    Abstract: A method for predicting a canonical form for an input text sequence includes predicting the canonical form with a neural network model. The model includes an encoder, which generates a first representation of the input text sequence based on a representation of n-grams in the text sequence and a second representation of the input text sequence generated by a first neural network. The model also includes a decoder which sequentially predicts terms of the canonical form based on the first and second representations and a predicted prefix of the canonical form. The canonical form can be used, for example, to query a knowledge base or to generate a next utterance in a discourse.
    Type: Application
    Filed: May 5, 2016
    Publication date: November 9, 2017
    Applicant: Conduent Business Services, LLC
    Inventors: Chunyang Xiao, Marc Dymetman, Claire Gardent
  • Publication number: 20170316775
    Abstract: A dialog device comprises a natural language interfacing device (chat interface or a telephonic device), and a natural language output device (the chat interface, a display device, or a speech synthesizer outputting to the telephonic device). A computer stores natural language dialog conducted via the interfacing device and constructs a current utterance word-by-word. Each word is chosen by applying a plurality of language models to a context comprising concatenation of the stored dialog and the current utterance thus far. Each language model outputs a distribution over the words of a vocabulary. A recurrent neural network (RNN) is applied to the distributions to generate a mixture distribution. The next word is chosen using the mixture distribution. The output device outputs the current natural language utterance after it has been constructed by the computer.
    Type: Application
    Filed: April 27, 2016
    Publication date: November 2, 2017
    Applicant: Conduent Business Services, LLC
    Inventors: Phong Le, Marc Dymetman, Jean-Michel Renders
  • Patent number: 9753893
    Abstract: In rejection sampling of a function or distribution p over a space X, a proposal distribution q(n) is refined responsive to rejection of a sample x*∈X to generate a refined proposal distribution q(n+1) selected to satisfy the criteria p(x)≤q(n+1)(x)≤q(n)(x) and q(n+1)(x*)<q(n)(x*). In a sampling mode, the sample x* is obtained by random sampling of the space X, the rejection sampling accepts or rejects x* based on comparison of a ratio p(x*)/q(x*) with a random draw, and the refined proposal distribution q(n+1) is selected to minimize a norm ‖q(n+1)‖_α where α<∞. In an optimization mode, the sample x* is obtained such that q*=q(n)(x*) maximizes q(n) over the space X, the rejection sampling accepts or rejects x* based on a difference between or ratio of q* and p(x*), and the refined proposal distribution q(n+1) is selected to minimize a norm ‖q(n+1)‖_∞=max{q(n+1)(x)}.
    Type: Grant
    Filed: June 18, 2012
    Date of Patent: September 5, 2017
    Assignee: XEROX CORPORATION
    Inventors: Marc Dymetman, Guillaume Bouchard
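The sampling-mode refinement can be illustrated on a small discrete target: start from a constant proposal that upper-bounds p, and on each rejection at x* lower the proposal at that point while preserving p(x) ≤ q(n+1)(x) ≤ q(n)(x) everywhere. The pointwise refinement used below is simply the most direct choice satisfying those criteria, not necessarily the refinement used in the patent.

```python
import numpy as np

rng = np.random.default_rng(0)

# Unnormalized target distribution p over a small discrete space X.
p = np.array([0.05, 0.9, 0.2, 0.6, 0.1, 0.02])

# Initial proposal q(0): a constant upper bound of p (q >= p everywhere).
q = np.full_like(p, p.max())

accepted, n_draws = [], 0
while len(accepted) < 5000:
    n_draws += 1
    # Draw x* from the (normalized) proposal and accept with probability p/q.
    x = rng.choice(len(q), p=q / q.sum())
    if rng.uniform() <= p[x] / q[x]:
        accepted.append(x)
    else:
        # Refinement on rejection at x*: lower the proposal at that point while
        # keeping p(x) <= q(n+1)(x) <= q(n)(x) for all x.
        q[x] = p[x]

print("acceptance rate:", round(len(accepted) / n_draws, 3))
counts = np.bincount(np.array(accepted), minlength=len(p))
print("empirical:", np.round(counts / counts.sum(), 3))
print("target:   ", np.round(p / p.sum(), 3))
```

Each refinement raises the acceptance rate of later draws, while every accepted sample remains an exact draw from the normalized target.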
  • Patent number: 9722957
    Abstract: A system and method are disclosed which enable more effective email response authoring by contact center agents, for example by automatically suggesting prototypical (entire) email responses to the human agent and by interactively suggesting next-sentence candidates during the writing process. In one method, a customer inquiry is received and a latent topic prediction is generated, based on a word-based representation of the customer inquiry. A latent topic prediction is generated for an entire agent's reply to the customer inquiry as a function of the latent topic prediction generated for the customer inquiry. A further latent topic prediction is generated for a next sentence of the agent's reply as a function of a topic prediction for the next sentence which is generated with a prediction model that has been trained on annotated sentences of agent replies. Information is output to assist the agent, based on the topic predictions.
    Type: Grant
    Filed: May 4, 2015
    Date of Patent: August 1, 2017
    Assignee: CONDUENT BUSINESS SERVICES, LLC
    Inventors: Marc Dymetman, Jean-Michel Renders, Sriram Venkatapathy, Spandana Gella
  • Publication number: 20170031896
    Abstract: A system and method permit analysis and generation to be performed with the same reversible probabilistic model. The model includes a set of factors, including a canonical factor, which is a function of a logical form and a realization thereof, a similarity factor, which is a function of a canonical text string and a surface string, a language model factor, which is a static function of a surface string, a language context factor, which is a dynamic function of a surface string, and a semantic context factor, which is a dynamic function of a logical form. When performing generation, the canonical factor, similarity factor, language model factor, and language context factor are composed to receive as input a logical form and output a surface string, and when performing analysis, the similarity factor, canonical factor, and semantic context factor are composed to take as input a surface string and output a logical form.
    Type: Application
    Filed: July 28, 2015
    Publication date: February 2, 2017
    Applicant: Xerox Corporation
    Inventors: Marc Dymetman, Sriram Venkatapathy, Chunyang Xiao
  • Patent number: 9552355
    Abstract: A system and a method for phrase-based translation are disclosed. The method includes receiving source language text to be translated into target language text. One or more dynamic bi-phrases are generated, based on the source text and the application of one or more rules, which may be based on user descriptions. A dynamic feature value is associated with each of the dynamic bi-phrases. For a sentence of the source text, static bi-phrases are retrieved from a bi-phrase table, each of the static bi-phrases being associated with one or more values of static features. Any dynamic bi-phrases that each cover at least one word of the source text are also retrieved; together, these form a set of active bi-phrases. Translation hypotheses are generated using active bi-phrases from the set and scored with a translation scoring model which takes into account the static and dynamic feature values of the bi-phrases used in the respective hypothesis. A translation, based on the hypothesis scores, is then output.
    Type: Grant
    Filed: May 20, 2010
    Date of Patent: January 24, 2017
    Assignee: XEROX CORPORATION
    Inventors: Marc Dymetman, Wilker Ferreira Aziz, Nicola Cancedda, Jean-Marc Coursimault, Vassilina Nikoulina, Lucia Specia
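A toy monotone decoder showing how static and dynamic bi-phrases interact: a hand-written static bi-phrase table, one rule that generates dynamic bi-phrases for numeric temperature expressions, and a scoring model that weights the static and dynamic feature values of the bi-phrases used. All table entries, rules, and weights are invented; a real phrase-based decoder would also handle reordering and language-model scoring.

```python
# Static bi-phrase table: source phrase -> [(target phrase, static feature value)].
static_table = {
    "the temperature": [("la température", 0.9)],
    "the": [("le", 0.3), ("la", 0.2)],
    "temperature": [("température", 0.5)],
    "is": [("est", 0.9)],
}

def dynamic_biphrases(sentence):
    """Rule-generated dynamic bi-phrases, e.g. rewrite '<number> degrees' as
    '<number> °C'; each carries a dynamic feature value instead of a static one."""
    table = {}
    for i, tok in enumerate(sentence):
        if tok.isdigit() and i + 1 < len(sentence) and sentence[i + 1] == "degrees":
            table[f"{tok} degrees"] = [(f"{tok} °C", 1.0)]
    return table

# Weights of the translation scoring model for static and dynamic features.
W_STATIC, W_DYNAMIC = 1.0, 1.5

def translate(sentence):
    dynamic_table = dynamic_biphrases(sentence)   # active dynamic bi-phrases

    def best(start):
        """Best (score, words) covering the sentence from position 'start'
        left-to-right with static or dynamic bi-phrases."""
        if start == len(sentence):
            return 0.0, []
        candidates = []
        for end in range(start + 1, len(sentence) + 1):
            phrase = " ".join(sentence[start:end])
            options = [(t, W_STATIC * f) for t, f in static_table.get(phrase, [])]
            options += [(t, W_DYNAMIC * f) for t, f in dynamic_table.get(phrase, [])]
            for target, score in options:
                rest_score, rest = best(end)
                candidates.append((score + rest_score, [target] + rest))
        return max(candidates) if candidates else (float("-inf"), [])

    score, words = best(0)
    return " ".join(words), score

print(translate("the temperature is 25 degrees".split()))
```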