Patents by Inventor Stephane Clinchant

Stephane Clinchant has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240037184
    Abstract: A sampling system includes: an energy-based model (EBM) configured to generate non-negative scores of an input having discrete classifications, respectively; and a sampling module configured to: generate a sample from a probability distribution of the EBM using a proposal distribution; set a probability of acceptance of the sample based on a minimum of (a) 1 and (b) an acceptance value determined based on the sample, a score of the sample from the EBM, the proposal distribution, and an upper boundary value; determine a distribution value between 0 and 1 using a uniform distribution; and discard the sample when the distribution value is greater than the probability of acceptance of the sample.
    Type: Application
    Filed: July 29, 2022
    Publication date: February 1, 2024
    Applicant: NAVER CORPORATION
    Inventors: Bryan EIKEMA, German KRUSZEWSKI, Hady ELSAHAR, Stéphane CLINCHANT, Marc DYMETMAN
  • Publication number: 20230418848
    Abstract: A ranker for a neural information retrieval model comprises a document encoder having a pretrained language model layer and configured to receive one or more documents and generate a sparse representation for each of the documents predicting term importance of the document over a vocabulary. A separate query encoder is configured to receive a query and generate a representation of the query over the vocabulary. Generated representations are compared to generate a set of respective document scores and rank the one or more documents.
    Type: Application
    Filed: May 5, 2023
    Publication date: December 28, 2023
    Inventors: Stéphane CLINCHANT, Carlos LASSANCE
  • Publication number: 20230214633
    Abstract: A neural model for representing an input sequence over a vocabulary in a ranker of a neural information retrieval model. An input sequence is embedded based at least on the vocabulary. An importance of each token over the vocabulary is predicted with respect to each token of the embedded input sequence. A predicted term importance of the input sequence over the vocabulary is determined by performing an activation over the embedded input sequence.
    Type: Application
    Filed: June 1, 2022
    Publication date: July 6, 2023
    Inventors: Stéphane Clinchant, Thibault Formal, Carlos Lassance, Benjamin Piwowarski
  • Publication number: 20230084333
    Abstract: Methods and systems for training a neural language model. Clean sequence pairs are received including clean source and target sequences. For each clean sequence pair, a noisy version is sampled with an adversarial generator to generate a noisy sequence pair. Parameters of the neural language model are optimized on the clean and noisy sequence pairs. Parameters of the adversarial generator are optimized to minimize a modeling loss of the adversarial generator and maximize a neural language loss of the neural language model using backpropagation.
    Type: Application
    Filed: August 31, 2021
    Publication date: March 16, 2023
    Inventors: Stéphane CLINCHANT, Badr Youbi IDRISSI
  • Publication number: 20230021996
    Abstract: Information retrieval methods employ a neural network encoder configured to receive a dense representation and generate a composite code comprising C clusters of dimension L from the dense representation. An activation function is configured to generate a sparse composite code from the composite code. The sparse composite code comprises a binary representation. An index can be generated using the sparse composite code.
    Type: Application
    Filed: June 1, 2022
    Publication date: January 26, 2023
    Inventors: Carlos LASSANCE, Stéphane CLINCHANT, Thibault FORMAL
  • Patent number: 11562039
    Abstract: A system and method perform cross-modal information retrieval, by generating a graph representing the set of media objects. Each node of the graph corresponds to a media object and is labeled with a set of features corresponding to a text part of the respective media object. Each edge between two nodes represents a similarity between a media part of the two nodes. A first relevance score is computed for each media object of the set of media objects that corresponds to a text-based score. A second relevance score is computed for each media object by inputting the graph into a graph neural network. The first relevance score and the second relevance score are combined to obtain a final ranking score for each media object.
    Type: Grant
    Filed: February 8, 2021
    Date of Patent: January 24, 2023
    Inventors: Jean-Michel Renders, Stephane Clinchant, Thibault Formal
  • Publication number: 20210349954
    Abstract: A system and method perform cross-modal information retrieval, by generating a graph representing the set of media objects. Each node of the graph corresponds to a media object and is labeled with a set of features corresponding to a text part of the respective media object. Each edge between two nodes represents a similarity between a media part of the two nodes. A first relevance score is computed for each media object of the set of media objects that corresponds to a text-based score. A second relevance score is computed for each media object by inputting the graph into a graph neural network. The first relevance score and the second relevance score are combined to obtain a final ranking score for each media object.
    Type: Application
    Filed: February 8, 2021
    Publication date: November 11, 2021
    Applicant: Naver Corporation
    Inventors: Jean-Michel Renders, Stephane Clinchant, Thibault Formal
  • Patent number: 11099016
    Abstract: Generating a pedestrian tour includes: receiving a query from a user device via a network, the query indicative of a request for a pedestrian tour and including a location and preferences including a target length for the pedestrian tour; obtaining a graph of a geographical area around the location of the user device, the graph including nodes indicative of path crossings and arcs indicative of paths connecting pairs of the nodes; determining scores for the arcs based on characteristics of the arcs, respectively, and the preferences; selecting connecting ones of the arcs based on the scores of the connecting ones of the arcs, lengths of the selected connecting ones of the arcs, and the target length; adding the selected connecting ones of the arcs to the pedestrian tour to initialize the pedestrian tour; and transmitting the pedestrian tour to the user device via the network for display.
    Type: Grant
    Filed: November 21, 2019
    Date of Patent: August 24, 2021
    Assignee: NAVER CORPORATION
    Inventors: Sofia Michel, Stéphane Clinchant, Christophe Legras, Jutta Willamowski
  • Publication number: 20200309545
    Abstract: Generating a pedestrian tour includes: receiving a query from a user device via a network, the query indicative of a request for a pedestrian tour and including a location and preferences including a target length for the pedestrian tour; obtaining a graph of a geographical area around the location of the user device, the graph including nodes indicative of path crossings and arcs indicative of paths connecting pairs of the nodes; determining scores for the arcs based on characteristics of the arcs, respectively, and the preferences; selecting connecting ones of the arcs based on the scores of the connecting ones of the arcs, lengths of the selected connecting ones of the arcs, and the target length; adding the selected connecting ones of the arcs to the pedestrian tour to initialize the pedestrian tour; and transmitting the pedestrian tour to the user device via the network for display.
    Type: Application
    Filed: November 21, 2019
    Publication date: October 1, 2020
    Applicant: NAVER CORPORATION
    Inventors: Sofia MICHEL, Stephane CLINCHANT, Christophe LEGRAS, Jutta WILLAMOWSKI
  • Patent number: 10354199
    Abstract: A classification method includes receiving a collection of samples, each sample comprising a multidimensional feature representation. A class label prediction for each sample in the collection is generated with one or more pretrained classifiers. For at least one iteration, each multidimensional feature representation is augmented with a respective class label prediction to form an augmented representation, a set of corrupted samples is generated from the augmented representations, and a transformation that minimizes a reconstruction error for the set of corrupted samples is learned. An adapted class label prediction for at least one of the samples in the collection is generated using the learned transformation and information is output, based on the adapted class label prediction. The method is useful in predicting labels for target samples where there is no access to source domain samples that are used to train the classifier and no access to target domain training data.
    Type: Grant
    Filed: December 7, 2015
    Date of Patent: July 16, 2019
    Assignee: Xerox Corporation
    Inventors: Stéphane Clinchant, Gabriela Csurka, Boris Chidlovskii
  • Patent number: 10296846
    Abstract: A domain-adapted classification system and method are disclosed. The method includes mapping an input set of representations to generate an output set of representations, using a learned transformation. The input set of representations includes a set of target samples from a target domain. The input set also includes, for each of a plurality of source domains, a class representation for each of a plurality of classes. The class representations are representative of a respective set of source samples from the respective source domain labeled with a respective class. The output set of representations includes an adapted representation of each of the target samples and an adapted class representation for each of the classes for each of the source domains. A class label is predicted for at least one of the target samples based on the output set of representations and information based on the predicted class label is output.
    Type: Grant
    Filed: November 24, 2015
    Date of Patent: May 21, 2019
    Assignee: XEROX CORPORATION
    Inventors: Gabriela Csurka, Boris Chidlovskii, Stéphane Clinchant
  • Patent number: 10055479
    Abstract: Documents of a set of documents are represented by bag-of-words (BOW) vectors. L labeled topics are provided, each labeled with a word list comprising words of a vocabulary that are representative of the labeled topic and possibly a list of relevant documents. Probabilistic classification of the documents generates for each labeled topic a document vector whose elements store scores of the documents for the labeled topic and a word vector whose elements store scores of the words of the vocabulary for the labeled topic. Non-negative matrix factorization (NMF) is performed to generate a document-topic model that clusters the documents into k topics where k>L. NMF factors representing L topics of the k topics are initialized to the document and word vectors for the L labeled topics. In some embodiments the NMF factors representing the L topics initialized to the document and word vectors are frozen, that is, are not updated by the NMF after the initialization.
    Type: Grant
    Filed: January 12, 2015
    Date of Patent: August 21, 2018
    Assignee: Xerox Corporation
    Inventors: Stephane Clinchant, Guillaume Bouchard
  • Patent number: 9916542
    Abstract: A machine learning method operates on training instances from a plurality of domains including one or more source domains and a target domain. Each training instance is represented by values for a set of features. Domain adaptation is performed using stacked marginalized denoising autoencoding (mSDA) operating on the training instances to generate a stack of domain adaptation transform layers. Each iteration of the domain adaptation includes corrupting the training instances in accord with feature corruption probabilities that are non-uniform over at least one of the set of features and the domains. A classifier is learned on the training instances transformed using the stack of domain adaptation transform layers. Thereafter, a label prediction is generated for an input instance from the target domain represented by values for the set of features by applying the classifier to the input instance transformed using the stack of domain adaptation transform layers.
    Type: Grant
    Filed: February 2, 2016
    Date of Patent: March 13, 2018
    Assignee: XEROX CORPORATION
    Inventors: Boris Chidlovskii, Gabriela Csurka, Stéphane Clinchant
  • Publication number: 20180024968
    Abstract: A method for domain adaptation of samples includes receiving training samples from a plurality of domains, the plurality of domains including at least one source domain and a target domain, each training sample including values for a set of features. A domain predictor is learned on at least some of the training samples from the plurality of domains and respective domain labels. Domain adaptation is performed on the training samples using marginalized denoising autoencoding. This generates a domain adaptation transform layer (or layers) that transforms the training samples to a common adapted feature space. The domain adaptation employs the domain predictor to bias the domain adaptation towards one of the plurality of domains. Domain adapted training samples and their class labels can be used to train a classifier for prediction of class labels for unlabeled target samples that have been domain adapted with the domain adaptation transform layer(s).
    Type: Application
    Filed: July 22, 2016
    Publication date: January 25, 2018
    Applicant: Xerox Corporation
    Inventors: Stéphane Clinchant, Gabriela Csurka, Boris Chidlovskii
  • Publication number: 20170220951
    Abstract: Training instances from a target domain are represented by feature vectors storing values for a set of features, and are labeled by labels from a set of labels. Both a noise marginalizing transform and a weighting of one or more source domain classifiers are simultaneously learned by minimizing the expectation of a loss function that is dependent on the feature vectors corrupted with noise represented by a noise probability density function, the labels, and the one or more source domain classifiers operating on the feature vectors corrupted with the noise. An input instance from the target domain is labeled with a label from the set of labels by operations including applying the learned noise marginalizing transform to an input feature vector representing the input instance and applying the one or more source domain classifiers weighted by the learned weighting to the input feature vector representing the input instance.
    Type: Application
    Filed: February 2, 2016
    Publication date: August 3, 2017
    Applicant: Xerox Corporation
    Inventors: Boris Chidlovskii, Gabriela Csurka, Stéphane Clinchant
  • Publication number: 20170220897
    Abstract: A machine learning method operates on training instances from a plurality of domains including one or more source domains and a target domain. Each training instance is represented by values for a set of features. Domain adaptation is performed using stacked marginalized denoising autoencoding (mSDA) operating on the training instances to generate a stack of domain adaptation transform layers. Each iteration of the domain adaptation includes corrupting the training instances in accord with feature corruption probabilities that are non-uniform over at least one of the set of features and the domains. A classifier is learned on the training instances transformed using the stack of domain adaptation transform layers. Thereafter, a label prediction is generated for an input instance from the target domain represented by values for the set of features by applying the classifier to the input instance transformed using the stack of domain adaptation transform layers.
    Type: Application
    Filed: February 2, 2016
    Publication date: August 3, 2017
    Applicant: Xerox Corporation
    Inventors: Boris Chidlovskii, Gabriela Csurka, Stéphane Clinchant
  • Publication number: 20170161633
    Abstract: A classification method includes receiving a collection of samples, each sample comprising a multidimensional feature representation. A class label prediction for each sample in the collection is generated with one or more pretrained classifiers. For at least one iteration, each multidimensional feature representation is augmented with a respective class label prediction to form an augmented representation, a set of corrupted samples is generated from the augmented representations, and a transformation that minimizes a reconstruction error for the set of corrupted samples is learned. An adapted class label prediction for at least one of the samples in the collection is generated using the learned transformation and information is output, based on the adapted class label prediction. The method is useful in predicting labels for target samples where there is no access to source domain samples that are used to train the classifier and no access to target domain training data.
    Type: Application
    Filed: December 7, 2015
    Publication date: June 8, 2017
    Applicant: Xerox Corporation
    Inventors: Stéphane Clinchant, Gabriela Csurka, Boris Chidlovskii
  • Publication number: 20170147944
    Abstract: A domain-adapted classification system and method are disclosed. The method includes mapping an input set of representations to generate an output set of representations, using a learned transformation. The input set of representations includes a set of target samples from a target domain. The input set also includes, for each of a plurality of source domains, a class representation for each of a plurality of classes. The class representations are representative of a respective set of source samples from the respective source domain labeled with a respective class. The output set of representations includes an adapted representation of each of the target samples and an adapted class representation for each of the classes for each of the source domains. A class label is predicted for at least one of the target samples based on the output set of representations and information based on the predicted class label is output.
    Type: Application
    Filed: November 24, 2015
    Publication date: May 25, 2017
    Applicant: Xerox Corporation
    Inventors: Gabriela Csurka, Boris Chidlovskii, Stéphane Clinchant
  • Patent number: 9460197
    Abstract: Processing methods and systems are provided for representing documents relative to importance of words in the document. A processor comprising a weighting model of word importance in a document in a collection relative to an importance of the word in other documents in the collection computes a deviation of distribution of the word from a probability distribution of the word in other documents in the collection, where the deviation distribution is weighted in accordance with a concavity control function. A concavity control parameter is adjustable relative to word frequency.
    Type: Grant
    Filed: May 3, 2012
    Date of Patent: October 4, 2016
    Assignee: Xerox Corporation
    Inventor: Stephane Clinchant
  • Patent number: 9430563
    Abstract: A set of word embedding transforms are applied to transform text words of a set of documents into K-dimensional word vectors in order to generate sets or sequences of word vectors representing the documents of the set of documents. A probabilistic topic model is learned using the sets or sequences of word vectors representing the documents of the set of documents. The set of word embedding transforms are applied to transform text words of an input document into K-dimensional word vectors in order to generate a set or sequence of word vectors representing the input document. The learned probabilistic topic model is applied to assign probabilities for topics of the probabilistic topic model to the set or sequence of word vectors representing the input document. A document processing operation such as annotation, classification, or similar document retrieval may be performed using the assigned topic probabilities.
    Type: Grant
    Filed: February 2, 2012
    Date of Patent: August 30, 2016
    Assignee: XEROX CORPORATION
    Inventors: Stéphane Clinchant, Florent Perronnin
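
As an illustration of one abstract in this listing, the accept/discard step described for publication number 20240037184 can be sketched roughly as follows. This is a minimal sketch and not the method claimed in the application. The abstract only says the acceptance value is "determined based on" the sample, its EBM score, the proposal distribution, and an upper boundary value; the ratio score / (beta * proposal probability) used here is an assumption. The names `acceptance_probability`, `sample_with_rejection`, `scores`, `proposal`, and `beta` are hypothetical.

```python
import random


def acceptance_probability(score, proposal_prob, beta):
    """Minimum of 1 and an acceptance value.

    ASSUMPTION: the acceptance value is score / (beta * proposal_prob);
    the abstract does not state the exact formula.
    """
    return min(1.0, score / (beta * proposal_prob))


def sample_with_rejection(scores, proposal, beta, n_draws, seed=0):
    """Draw from `proposal` and keep or discard each draw.

    scores:   non-negative EBM score per discrete class (hypothetical toy data)
    proposal: proposal probability per class, summing to 1
    beta:     upper boundary value (hypothetical name)
    """
    rng = random.Random(seed)
    classes = list(proposal)
    weights = [proposal[c] for c in classes]
    kept = []
    for _ in range(n_draws):
        # Generate a candidate sample using the proposal distribution.
        x = rng.choices(classes, weights=weights, k=1)[0]
        p_accept = acceptance_probability(scores[x], proposal[x], beta)
        # Distribution value between 0 and 1 from a uniform distribution.
        u = rng.random()
        # Discard the sample when u is greater than the acceptance probability.
        if u > p_accept:
            continue
        kept.append(x)
    return kept


# Toy example: three discrete classes, one with a zero score.
scores = {"a": 3.0, "b": 1.0, "c": 0.0}
proposal = {"a": 0.5, "b": 0.25, "c": 0.25}
kept = sample_with_rejection(scores, proposal, beta=6.0, n_draws=1000)
```

In this toy setup every draw of class "a" is kept (its acceptance value reaches 1), draws of "b" are kept about two thirds of the time, and draws of "c" are discarded because its score is zero.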