Patents by Inventor Marc Dymetman
Marc Dymetman has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240054338
Abstract: A processor-implemented method for fine-tuning a pre-trained neural conditional language model to perform a downstream task. A pre-trained conditional language model and at least one target constraint for satisfying a task-related control objective are received. A neural model is trained to approximate a target conditional model that optimally reconciles a distance from the pre-trained conditional language model and the control objective across multiple contexts.
Type: Application
Filed: October 3, 2022
Publication date: February 15, 2024
Inventors: Tomasz KORBAK, Hady ELSAHAR, German KRUSZEWSKI, Marc DYMETMAN
-
Publication number: 20240037184
Abstract: A sampling system includes: an energy-based model (EBM) configured to generate non-negative scores of an input having discrete classifications, respectively; and a sampling module configured to: generate a sample from a probability distribution of the EBM using a proposal distribution; set a probability of acceptance of the sample based on a minimum of (a) 1 and (b) an acceptance value determined based on the sample, a score of the sample from the EBM, the proposal distribution, and an upper boundary value; determine a distribution value between 0 and 1 using a uniform distribution; and discard the sample when the distribution value is greater than the probability of acceptance of the sample.
Type: Application
Filed: July 29, 2022
Publication date: February 1, 2024
Applicant: NAVER CORPORATION
Inventors: Bryan EIKEMA, German KRUSZEWSKI, Hady ELSAHAR, Stéphane CLINCHANT, Marc DYMETMAN
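The accept/reject step this abstract describes can be sketched as follows. This is an illustrative toy, not the patented system: all names (`ebm_score`, `proposal_sample`, `proposal_prob`, `upper_bound`) are assumptions introduced here, and the retry loop is a simplification.

```python
import random

def sample_from_ebm(ebm_score, proposal_sample, proposal_prob, upper_bound,
                    max_tries=10000):
    """Draw one sample from an unnormalized EBM via a proposal distribution.

    ebm_score(x):      non-negative score P(x) from the EBM
    proposal_sample(): draws a candidate x from the proposal q
    proposal_prob(x):  probability q(x) of that candidate
    upper_bound:       beta, the upper boundary value in the acceptance ratio
    """
    for _ in range(max_tries):
        x = proposal_sample()
        # acceptance probability: min of 1 and P(x) / (beta * q(x))
        accept_prob = min(1.0, ebm_score(x) / (upper_bound * proposal_prob(x)))
        u = random.random()  # distribution value between 0 and 1
        if u <= accept_prob:
            return x         # keep the sample
        # otherwise discard the sample and draw again
    raise RuntimeError("no sample accepted within max_tries")
```

With a trivial proposal that always returns the same item and a tight bound, the acceptance probability is 1 and the first candidate is kept.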
-
Patent number: 11681911
Abstract: Methods for training a neural sequence-to-sequence (seq2seq) model. A processor receives the model and training data comprising a plurality of training source sequences and corresponding training target sequences, and generates corresponding predicted target sequences. Model parameters are updated based on a comparison of predicted target sequences to training target sequences to reduce or minimize both a local loss in the predicted target sequences and an expected loss of one or more global or semantic features or constraints between the predicted target sequences and the training target sequences given the training source sequences. Expected loss is based on global or semantic features or constraints of general target sequences given general source sequences.
Type: Grant
Filed: October 15, 2019
Date of Patent: June 20, 2023
Assignee: NAVER CORPORATION
Inventors: Vu Cong Duy Hoang, Ioan Calapodescu, Marc Dymetman
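The two-term objective described here (a local, per-token loss plus a penalty on global sequence features) can be sketched very roughly as below. This is a minimal illustration under assumptions of my own: the function name, the scalar global features, and the `alpha` weighting are not from the patent.

```python
def combined_loss(local_losses, global_feature_pred, global_feature_ref,
                  alpha=0.5):
    """Combine a per-token (local) loss with a global-feature penalty.

    local_losses:        per-token losses of the predicted target sequence
                         (e.g. cross-entropy values)
    global_feature_pred: a sequence-level feature of the prediction
                         (e.g. length or coverage)
    global_feature_ref:  the same feature computed on the reference target
    alpha:               balance between the local and global terms
    """
    local = sum(local_losses) / len(local_losses)       # mean local loss
    global_gap = abs(global_feature_pred - global_feature_ref)
    return (1 - alpha) * local + alpha * global_gap
```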
-
METHOD AND SYSTEM FOR CONTROLLING DISTRIBUTIONS OF ATTRIBUTES IN LANGUAGE MODELS FOR TEXT GENERATION
Publication number: 20220108081
Abstract: A method for generating a language model for text generation by receiving a pre-trained language model having attributes with existing probability distributions over the pre-trained language model; receiving at least one target constraint, the target constraint specifying an expectation of a target attribute over a language model that approximates the pre-trained language model; computing parameters of an energy-based model by applying the target constraint to the pre-trained language model; obtaining samples from a reference policy; updating parameters of a target policy using the obtained samples and the energy-based model; updating the reference policy with the target policy if the target policy is superior to the reference policy; and outputting the target policy as a target language model. The target language model is adapted to generate text with the target attribute over a probability distribution that approximates the desired probability distribution specified by the target constraint.
Type: Application
Filed: August 2, 2021
Publication date: April 7, 2022
Inventors: Marc Dymetman, Hady Elsahar, Muhammad Khalifa
-
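The first step of the method above, computing the parameters of an energy-based model so that a target attribute has a desired expectation, can be illustrated on a toy finite distribution. The function name, the bisection search, and the dictionaries below are assumptions of this sketch, not the patent's method; the real setting uses a neural language model and policy-gradient updates rather than explicit normalization.

```python
import math

def fit_ebm_lambda(base, phi, target, lo=-20.0, hi=20.0, iters=100):
    """Find lambda so that p(x) ∝ base[x] * exp(lambda * phi[x])
    satisfies the moment constraint E_p[phi] = target.

    base:   dict item -> base probability a(x) (the pre-trained model)
    phi:    dict item -> attribute value (the target attribute)
    target: desired expectation of phi under the tilted distribution
    Bisection works because E_p[phi] is monotone in lambda
    (a standard exponential-family fact).
    """
    def expectation(lam):
        w = {x: base[x] * math.exp(lam * phi[x]) for x in base}
        z = sum(w.values())
        return sum(w[x] * phi[x] for x in w) / z

    for _ in range(iters):
        mid = (lo + hi) / 2
        if expectation(mid) < target:
            lo = mid
        else:
            hi = mid
    lam = (lo + hi) / 2
    w = {x: base[x] * math.exp(lam * phi[x]) for x in base}
    z = sum(w.values())
    return lam, {x: w[x] / z for x in w}
```

For example, tilting a uniform distribution over {"a", "b"} so that the binary attribute "is a" has expectation 0.8 yields lambda = ln 4.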
Publication number: 20220083852
Abstract: In a method for generating a normalized sequential model using a processor, a sequential energy-based model computed by a parameterized neural network is provided. The sequential energy-based model defines an unnormalized probability distribution over a target sequence for a context source. The normalized sequential model is generated by projecting the sequential energy-based model onto a target autoregressive model that approximates a normalized distribution associated with the sequential energy-based model.
Type: Application
Filed: September 11, 2020
Publication date: March 17, 2022
Inventors: Tetiana PARSHAKOVA, Marc DYMETMAN, Jean-Marc ANDRÉOLI
-
Publication number: 20210110254
Abstract: Methods for training a neural sequence-to-sequence (seq2seq) model. A processor receives the model and training data comprising a plurality of training source sequences and corresponding training target sequences, and generates corresponding predicted target sequences. Model parameters are updated based on a comparison of predicted target sequences to training target sequences to reduce or minimize both a local loss in the predicted target sequences and an expected loss of one or more global or semantic features or constraints between the predicted target sequences and the training target sequences given the training source sequences. Expected loss is based on global or semantic features or constraints of general target sequences given general source sequences.
Type: Application
Filed: October 15, 2019
Publication date: April 15, 2021
Inventors: Vu Cong Duy HOANG, Ioan CALAPODESCU, Marc DYMETMAN
-
Patent number: 10853724
Abstract: Methods, systems, and devices for semantic parsing. In an example embodiment, a method for semantic parsing can include steps, operations, or instructions such as obtaining a data pair for learning, the data pair comprising logical form data and natural utterance data; acquiring grammar for targeted logical forms among the logical form data of the data pair; modeling data comprising other available prior knowledge utilizing WFSA (Weighted Finite State Automata); combining the targeted logical forms with the data modeled comprising the other available prior knowledge to form a background; and exploiting the background on the data pair. Note that we do not “learn” the background, but “learn” the background-RNN (Recurrent Neural Network).
Type: Grant
Filed: June 2, 2017
Date of Patent: December 1, 2020
Assignee: Xerox Corporation
Inventors: Chunyang Xiao, Marc Dymetman
-
Patent number: 10431205
Abstract: A dialog device comprises a natural language interfacing device (chat interface or a telephonic device), and a natural language output device (the chat interface, a display device, or a speech synthesizer outputting to the telephonic device). A computer stores natural language dialog conducted via the interfacing device and constructs a current utterance word-by-word. Each word is chosen by applying a plurality of language models to a context comprising concatenation of the stored dialog and the current utterance thus far. Each language model outputs a distribution over the words of a vocabulary. A recurrent neural network (RNN) is applied to the distributions to generate a mixture distribution. The next word is chosen using the mixture distribution. The output device outputs the current natural language utterance after it has been constructed by the computer.
Type: Grant
Filed: April 27, 2016
Date of Patent: October 1, 2019
Assignee: CONDUENT BUSINESS SERVICES, LLC
Inventors: Phong Le, Marc Dymetman, Jean-Michel Renders
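The mixture step at the heart of this abstract, combining the per-model next-word distributions under weights produced from the context, can be sketched as follows. The function name and dictionary representation are assumptions of this illustration; in the patent the weights come from an RNN applied to the distributions.

```python
def mixture_next_word(distributions, weights):
    """Combine several next-word distributions into one mixture distribution.

    distributions: list of dicts word -> probability, one per language model
    weights:       mixture weights (here given directly; in the described
                   device they are produced by an RNN from the context),
                   assumed to sum to 1
    """
    mix = {}
    for dist, w in zip(distributions, weights):
        for word, p in dist.items():
            mix[word] = mix.get(word, 0.0) + w * p
    return mix  # the next word is then chosen from this distribution
```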
-
Publication number: 20180349767
Abstract: Methods, systems, and devices for semantic parsing. In an example embodiment, a method for semantic parsing can include steps, operations, or instructions such as obtaining a data pair for learning, the data pair comprising logical form data and natural utterance data; acquiring grammar for targeted logical forms among the logical form data of the data pair; modeling data comprising other available prior knowledge utilizing WFSA (Weighted Finite State Automata); combining the targeted logical forms with the data modeled comprising the other available prior knowledge to form a background; and exploiting the background on the data pair. Note that we do not “learn” the background, but “learn” the background-RNN (Recurrent Neural Network).
Type: Application
Filed: June 2, 2017
Publication date: December 6, 2018
Inventors: Chunyang Xiao, Marc Dymetman
-
Publication number: 20180349765
Abstract: A neural network apparatus includes a recurrent neural network having a log-linear output layer. The recurrent neural network is trained by training data and the recurrent neural network models output symbols as complex combinations of attributes without requiring that each combination among the complex combinations be directly observed in the training data. The recurrent neural network is configured to permit an inclusion of flexible prior knowledge in a form of specified modular features, wherein the recurrent neural network learns to dynamically control weights of a log-linear distribution to promote the specified modular features. The recurrent neural network can be implemented as a log-linear recurrent neural network.
Type: Application
Filed: May 30, 2017
Publication date: December 6, 2018
Inventors: Marc Dymetman, Chunyang Xiao
-
Patent number: 10049106
Abstract: A method and a system for generating a target character sequence from a semantic representation including a sequence of characters are provided. The method includes adapting a target background model, built from a vocabulary of words, to form an adapted background model. The adapted background model accepts subsequences of an input semantic representation as well as words from the vocabulary. The input semantic representation is represented as a sequence of character embeddings, which are input to an encoder. The encoder encodes each of the character embeddings to generate a respective character representation. A decoder then generates a target sequence of characters, based on the set of character representations. At a plurality of time steps, a next character in the target sequence is selected as a function of a previously generated character(s) of the target sequence and the adapted background model.
Type: Grant
Filed: January 18, 2017
Date of Patent: August 14, 2018
Assignee: Xerox Corporation
Inventors: Raghav Goyal, Marc Dymetman
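The final step, selecting the next character as a function of both the decoder and the adapted background model, can be caricatured as intersecting the decoder's distribution with the set of characters the background model currently allows. This is a sketch under my own assumptions (the name `select_next_char`, the dict/set representation); the patented method combines the two models more tightly than a hard mask.

```python
def select_next_char(decoder_probs, allowed_next):
    """Pick the next output character, constrained by a background model.

    decoder_probs: dict char -> probability from the neural decoder
    allowed_next:  set of characters the (adapted) background model
                   can accept at this time step
    """
    candidates = {c: p for c, p in decoder_probs.items() if c in allowed_next}
    if not candidates:
        return None  # the background model blocks every decoder proposal
    # renormalize over the allowed characters and take the most probable one
    total = sum(candidates.values())
    return max(candidates, key=lambda c: candidates[c] / total)
```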
-
Publication number: 20180203852
Abstract: A method and a system for generating a target character sequence from a semantic representation including a sequence of characters are provided. The method includes adapting a target background model, built from a vocabulary of words, to form an adapted background model. The adapted background model accepts subsequences of an input semantic representation as well as words from the vocabulary. The input semantic representation is represented as a sequence of character embeddings, which are input to an encoder. The encoder encodes each of the character embeddings to generate a respective character representation. A decoder then generates a target sequence of characters, based on the set of character representations. At a plurality of time steps, a next character in the target sequence is selected as a function of a previously generated character(s) of the target sequence and the adapted background model.
Type: Application
Filed: January 18, 2017
Publication date: July 19, 2018
Applicant: Xerox Corporation
Inventors: Raghav Goyal, Marc Dymetman
-
Patent number: 9858263
Abstract: A method for predicting a canonical form for an input text sequence includes predicting the canonical form with a neural network model. The model includes an encoder, which generates a first representation of the input text sequence based on a representation of n-grams in the text sequence and a second representation of the input text sequence generated by a first neural network. The model also includes a decoder which sequentially predicts terms of the canonical form based on the first and second representations and a predicted prefix of the canonical form. The canonical form can be used, for example, to query a knowledge base or to generate a next utterance in a discourse.
Type: Grant
Filed: May 5, 2016
Date of Patent: January 2, 2018
Assignees: Conduent Business Services, LLC, Centre National De La Recherche Scientifique
Inventors: Chunyang Xiao, Marc Dymetman, Claire Gardent
-
Patent number: 9830315
Abstract: A system and method are provided which employ a neural network model which has been trained to predict a sequentialized form for an input text sequence. The sequentialized form includes a sequence of symbols. The neural network model includes an encoder which generates a representation of the input text sequence based on a representation of n-grams in the text sequence and a decoder which sequentially predicts a next symbol of the sequentialized form based on the representation and a predicted prefix of the sequentialized form. Given an input text sequence, a sequentialized form is predicted with the trained neural network model. The sequentialized form is converted to a structured form and information based on the structured form is output.
Type: Grant
Filed: July 13, 2016
Date of Patent: November 28, 2017
Assignees: XEROX CORPORATION, Centre National de la Recherche Scientifique
Inventors: Chunyang Xiao, Marc Dymetman, Claire Gardent
-
Publication number: 20170323636
Abstract: A method for predicting a canonical form for an input text sequence includes predicting the canonical form with a neural network model. The model includes an encoder, which generates a first representation of the input text sequence based on a representation of n-grams in the text sequence and a second representation of the input text sequence generated by a first neural network. The model also includes a decoder which sequentially predicts terms of the canonical form based on the first and second representations and a predicted prefix of the canonical form. The canonical form can be used, for example, to query a knowledge base or to generate a next utterance in a discourse.
Type: Application
Filed: May 5, 2016
Publication date: November 9, 2017
Applicant: Conduent Business Services, LLC
Inventors: Chunyang Xiao, Marc Dymetman, Claire Gardent
-
Publication number: 20170316775
Abstract: A dialog device comprises a natural language interfacing device (chat interface or a telephonic device), and a natural language output device (the chat interface, a display device, or a speech synthesizer outputting to the telephonic device). A computer stores natural language dialog conducted via the interfacing device and constructs a current utterance word-by-word. Each word is chosen by applying a plurality of language models to a context comprising concatenation of the stored dialog and the current utterance thus far. Each language model outputs a distribution over the words of a vocabulary. A recurrent neural network (RNN) is applied to the distributions to generate a mixture distribution. The next word is chosen using the mixture distribution. The output device outputs the current natural language utterance after it has been constructed by the computer.
Type: Application
Filed: April 27, 2016
Publication date: November 2, 2017
Applicant: Conduent Business Services, LLC
Inventors: Phong Le, Marc Dymetman, Jean-Michel Renders
-
Patent number: 9753893
Abstract: In rejection sampling of a function or distribution p over a space X, a proposal distribution q(n) is refined responsive to rejection of a sample x* ∈ X to generate a refined proposal distribution q(n+1) selected to satisfy the criteria p(x) ≤ q(n+1)(x) ≤ q(n)(x) and q(n+1)(x*) < q(n)(x*). In a sampling mode, the sample x* is obtained by random sampling of the space X, the rejection sampling accepts or rejects x* based on comparison of a ratio p(x*)/q(x*) with a random draw, and the refined proposal distribution q(n+1) is selected to minimize a norm ‖q(n+1)‖α where α < ∞. In an optimization mode, the sample x* is obtained such that q* = q(n)(x*) maximizes q(n) over the space X, the rejection sampling accepts or rejects x* based on a difference between or ratio of q* and p(x*), and the refined proposal distribution q(n+1) is selected to minimize a norm ‖q(n+1)‖∞ = max{q(n+1)(x)}.
Type: Grant
Filed: June 18, 2012
Date of Patent: September 5, 2017
Assignee: XEROX CORPORATION
Inventors: Marc Dymetman, Guillaume Bouchard
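The sampling-mode refinement principle, tightening the proposal q at each rejected point while keeping p ≤ q' ≤ q everywhere, can be illustrated on a finite space. This toy (the function name, the finite dictionary representation, and the maximally aggressive refinement q'(x*) = p(x*)) is an assumption of this sketch, not the patented procedure, which operates over general spaces.

```python
import random

def refine_and_sample(p, space, n, seed=0):
    """Adaptive rejection sampling over a finite space (illustrative only).

    p:     dict x -> unnormalized target weight p(x)
    space: list of points making up X
    The proposal q starts as a constant envelope and, whenever a sample x*
    is rejected, is tightened at x* to q'(x*) = p(x*); this satisfies
    p(x) <= q'(x) <= q(x) everywhere and q'(x*) < q(x*) when q(x*) > p(x*).
    """
    rng = random.Random(seed)
    q = {x: max(p.values()) for x in space}  # initial constant envelope
    out = []
    while len(out) < n:
        xs, ws = zip(*q.items())
        x = rng.choices(xs, weights=ws)[0]   # draw x* from the proposal q
        if rng.random() <= p[x] / q[x]:      # accept with probability p(x*)/q(x*)
            out.append(x)
        else:
            q[x] = p[x]                      # refine the envelope at x*
    return out, q
```

As the envelope tightens, the acceptance rate rises toward 1, which is the point of refining on rejections.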
-
Patent number: 9722957
Abstract: A system and method are disclosed which enable more effective email response authoring by contact center agents, for example, by automatically suggesting prototypical (entire) email responses to the human agent and interactive suggestion of next sentence candidates during the writing process. In one method, a customer inquiry is received and a latent topic prediction is generated, based on a word-based representation of the customer inquiry. A latent topic prediction is generated for an entire agent's reply to the customer inquiry as a function of the latent topic prediction generated for the customer inquiry. A further latent topic prediction is generated for a next sentence of the agent's reply as a function of a topic prediction for the next sentence which is generated with a prediction model that has been trained on annotated sentences of agent replies. Information is output to assist the agent, based on the topic predictions.
Type: Grant
Filed: May 4, 2015
Date of Patent: August 1, 2017
Assignee: CONDUENT BUSINESS SERVICES, LLC
Inventors: Marc Dymetman, Jean-Michel Renders, Sriram Venkatapathy, Spandana Gella
-
Publication number: 20170031896
Abstract: A system and method permit analysis and generation to be performed with the same reversible probabilistic model. The model includes a set of factors, including a canonical factor, which is a function of a logical form and a realization thereof, a similarity factor, which is a function of a canonical text string and a surface string, a language model factor, which is a static function of a surface string, a language context factor, which is a dynamic function of a surface string, and a semantic context factor, which is a dynamic function of a logical form. When performing generation, the canonical factor, similarity factor, language model factor, and language context factor are composed to receive as input a logical form and output a surface string, and when performing analysis, the similarity factor, canonical factor, and semantic context factor are composed to take as input a surface string and output a logical form.
Type: Application
Filed: July 28, 2015
Publication date: February 2, 2017
Applicant: Xerox Corporation
Inventors: Marc Dymetman, Sriram Venkatapathy, Chunyang Xiao
-
Patent number: 9552355
Abstract: A system and a method for phrase-based translation are disclosed. The method includes receiving source language text to be translated into target language text. One or more dynamic bi-phrases are generated, based on the source text and the application of one or more rules, which may be based on user descriptions. A dynamic feature value is associated with each of the dynamic bi-phrases. For a sentence of the source text, static bi-phrases are retrieved from a bi-phrase table, each of the static bi-phrases being associated with one or more values of static features. Any of the dynamic bi-phrases which each cover at least one word of the source text are also retrieved, which together form a set of active bi-phrases. Translation hypotheses are generated using active bi-phrases from the set and scored with a translation scoring model which takes into account the static and dynamic feature values of the bi-phrases used in the respective hypothesis. A translation, based on the hypothesis scores, is then output.
Type: Grant
Filed: May 20, 2010
Date of Patent: January 24, 2017
Assignee: XEROX CORPORATION
Inventors: Marc Dymetman, Wilker Ferreira Aziz, Nicola Cancedda, Jean-Marc Coursimault, Vassilina Nikoulina, Lucia Specia