Patents by Inventor FRANCK DERNONCOURT

FRANCK DERNONCOURT has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20210042391
    Abstract: Certain embodiments involve a method for generating a summary. The method includes one or more processing devices performing operations including generating a set of word embeddings corresponding to each word of a text input. The operations further include generating a set of selection probabilities corresponding to each word of the text input using the respective word embeddings. Further, the operations include calculating a set of sentence saliency scores for a set of sentences of the text input using respective selection probabilities of the set of selection probabilities for each word of the text input. Additionally, the operations include generating the summary of the text input using a subset of sentences from the set of sentences with the greatest sentence saliency scores from the set of sentence saliency scores.
    Type: Application
    Filed: August 7, 2019
    Publication date: February 11, 2021
    Inventors: Sebastian Gehrmann, Franck Dernoncourt
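    A minimal Python sketch of the saliency pipeline this abstract describes. The per-word selection probability here is a frequency heuristic standing in for the embedding-based model in the disclosure:
    ```python
    # Toy sketch of saliency-based extractive summarization (pub. 20210042391).
    # `selection_prob` is a stand-in heuristic, not the patented embedding model.
    from collections import Counter

    def selection_prob(word, counts, total):
        # Placeholder: more frequent words get a higher selection probability.
        return counts[word] / total

    def summarize(sentences, k=2):
        words = [w.lower() for s in sentences for w in s.split()]
        counts, total = Counter(words), len(words)
        # Sentence saliency = mean selection probability over the sentence's words.
        saliency = [
            sum(selection_prob(w.lower(), counts, total) for w in s.split()) / len(s.split())
            for s in sentences
        ]
        # Keep the k highest-saliency sentences, emitted in document order.
        top = sorted(range(len(sentences)), key=lambda i: saliency[i], reverse=True)[:k]
        return [sentences[i] for i in sorted(top)]

    print(summarize([
        "Neural summarizers score each word first.",
        "Sentence saliency aggregates the word scores.",
        "Unrelated filler text goes here.",
    ], k=2))
    ```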
  • Publication number: 20210034699
    Abstract: Embodiments of the present invention provide systems, methods, and computer storage media for sentence compression in which a provided sentence is compressed to fit within an allotted space. Portions of the input sentence are copied to generate the compressed sentence. Upon receipt of a sentence, top candidate compressed sentences may be determined based on the probability that each segment of the input sentence will be included in a potential compressed sentence. The top candidate compressed sentences are re-ranked based on grammatical accuracy scores for each of the candidate compressed sentences using a language model trained on linguistic features of words and/or phrases. The highest-scoring candidate compressed sentence may be presented to the user.
    Type: Application
    Filed: August 2, 2019
    Publication date: February 4, 2021
    Inventors: Sebastian Gehrmann, Franck Dernoncourt
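    The compress-then-rerank flow above can be illustrated with a toy sketch; `inclusion_prob` and `grammar_score` are invented placeholders for the learned segment model and the linguistic-feature language model:
    ```python
    # Toy sketch of sentence compression with re-ranking (pub. 20210034699).
    from itertools import combinations

    def inclusion_prob(word):
        # Placeholder: longer words are treated as more likely to be kept.
        return min(1.0, len(word) / 8.0)

    def grammar_score(words):
        # Placeholder for the language-model re-ranker: reward longer output.
        return len(words)

    def compress(sentence, budget, n_candidates=5):
        words = sentence.split()
        # Candidate compressions: ordered subsets of words within the budget.
        candidates = []
        for r in range(1, len(words) + 1):
            for keep in combinations(range(len(words)), r):
                kept = [words[i] for i in keep]
                if len(" ".join(kept)) <= budget:
                    p = sum(inclusion_prob(w) for w in kept)
                    candidates.append((p, kept))
        # Top candidates by inclusion probability, then re-rank by grammar score.
        top = sorted(candidates, reverse=True)[:n_candidates]
        best = max(top, key=lambda c: grammar_score(c[1]))
        return " ".join(best[1])

    print(compress("the quick brown fox jumps over the lazy dog", budget=20))
    ```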
  • Publication number: 20210027141
    Abstract: This disclosure relates to methods, non-transitory computer readable media, and systems that can classify term sequences within a source text based on textual features analyzed by both an implicit-class-recognition model and an explicit-class-recognition model. For example, by applying machine-learning models for both implicit and explicit class recognition, the disclosed systems can determine a class corresponding to a particular term sequence within a source text and identify the particular term sequence reflecting the class. The dual-model architecture can equip the disclosed systems to apply (i) the implicit-class-recognition model to recognize implicit references to a class in source texts and (ii) the explicit-class-recognition model to recognize explicit references to the same class in source texts.
    Type: Application
    Filed: July 22, 2019
    Publication date: January 28, 2021
    Inventors: Sean MacAvaney, Franck Dernoncourt, Walter Chang, Seokhwan Kim, Doo Soon Kim, Chen Fang
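    A toy illustration of the dual-model idea, assuming keyword tables in place of the trained implicit- and explicit-class-recognition models:
    ```python
    # Toy sketch of combined implicit/explicit class recognition (pub. 20210027141).
    CLASS_KEYWORDS = {"side_effect": {"nausea", "headache", "dizziness"}}
    IMPLICIT_CUES = {"side_effect": {"felt", "worse", "after", "taking"}}

    def explicit_score(tokens, cls):
        # Explicit model stand-in: count literal mentions of the class's terms.
        return sum(t in CLASS_KEYWORDS[cls] for t in tokens)

    def implicit_score(tokens, cls):
        # Implicit model stand-in: weaker evidence from indirect cues.
        return 0.5 * sum(t in IMPLICIT_CUES[cls] for t in tokens)

    def classify(sentence, cls="side_effect", threshold=1.0):
        tokens = sentence.lower().split()
        score = explicit_score(tokens, cls) + implicit_score(tokens, cls)
        return score >= threshold, score

    print(classify("I felt worse after taking the pill"))   # implicit reference
    print(classify("The patient reported nausea"))          # explicit reference
    ```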
  • Publication number: 20210004576
    Abstract: The present disclosure relates to systems, methods, and non-transitory computer readable media for generating modified digital images based on verbal and/or gesture input by utilizing a natural language processing neural network and one or more computer vision neural networks. The disclosed systems can receive verbal input together with gesture input. The disclosed systems can further utilize a natural language processing neural network to generate a verbal command based on verbal input. The disclosed systems can select a particular computer vision neural network based on the verbal input and/or the gesture input. The disclosed systems can apply the selected computer vision neural network to identify pixels within a digital image that correspond to an object indicated by the verbal input and/or gesture input. Utilizing the identified pixels, the disclosed systems can generate a modified digital image by performing one or more editing actions indicated by the verbal input and/or gesture input.
    Type: Application
    Filed: September 18, 2020
    Publication date: January 7, 2021
    Inventors: Trung Bui, Zhe Lin, Walter Chang, Nham Le, Franck Dernoncourt
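    A schematic sketch of the routing logic the abstract describes; the parser and both "networks" are stubs standing in for the natural language processing and computer vision neural networks:
    ```python
    # Toy sketch of verbal/gesture routing to vision models (pub. 20210004576).

    def parse_command(utterance):
        # Stand-in for the NLP network: extract an action and a target object.
        action = "remove" if "remove" in utterance else "brighten"
        return {"action": action, "target": utterance.split()[-1]}

    def salient_object_net(image, gesture):
        return f"pixels near gesture {gesture}"       # placeholder segmentation

    def object_detection_net(image, target):
        return f"pixels of detected '{target}'"       # placeholder segmentation

    def edit(image, utterance, gesture=None):
        cmd = parse_command(utterance)
        # Select a vision network based on which inputs are available.
        pixels = (salient_object_net(image, gesture) if gesture
                  else object_detection_net(image, cmd["target"]))
        return f"{cmd['action']} applied to {pixels} in {image}"

    print(edit("photo.jpg", "remove the lamp", gesture=(120, 80)))
    print(edit("photo.jpg", "brighten the sky"))
    ```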
  • Publication number: 20200380030
    Abstract: Embodiments of the present invention provide systems, methods, and computer storage media for in-app video navigation in which videos including answers to user-provided queries are presented within an application, and the portions of the videos that specifically include the answer to the query are highlighted to allow for efficient and effective tutorial utilization. Upon receipt of a text or verbal query, top candidate videos including an answer to the query are determined. Within the top candidate videos, a video span with starting and ending sentence locations is identified based on the query and contextual information within each candidate video. The video span with the highest overall score, calculated from a video score and a span score, is presented to the user.
    Type: Application
    Filed: May 31, 2019
    Publication date: December 3, 2020
    Inventors: Anthony Michael Colas, Doo Soon Kim, Franck Dernoncourt, Seokhwan Kim
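    A toy sketch of the video-plus-span scoring described above; both scoring functions are word-overlap placeholders for the learned models, and the length penalty is an invented detail:
    ```python
    # Toy sketch of query-to-video-span matching (pub. 20200380030).

    def tokens(text):
        return set(text.lower().replace(".", " ").split())

    def video_score(query, transcript):
        q = tokens(query)
        return len(q & tokens(transcript)) / len(q)

    def span_score(query, sentences, start, end):
        q = tokens(query)
        span = " ".join(sentences[start:end + 1])
        # Small length penalty keeps the chosen span focused on the answer.
        return len(q & tokens(span)) / len(q) - 0.05 * (end - start)

    def best_answer(query, videos):
        best = None
        for v in videos:
            sents = v["transcript"].split(". ")
            for start in range(len(sents)):
                for end in range(start, len(sents)):
                    score = (video_score(query, v["transcript"])
                             + span_score(query, sents, start, end))
                    if best is None or score > best[0]:
                        best = (score, v["title"], (start, end))
        return best

    videos = [{"title": "Crop tutorial",
               "transcript": "Open the tool. Select crop. Drag the handles"}]
    print(best_answer("how do I crop", videos))
    ```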
  • Publication number: 20200372025
    Abstract: Embodiments of the present invention provide systems, methods, and computer storage media for techniques for identifying textual similarity and performing answer selection. A textual-similarity computing model can use a pre-trained language model to generate vector representations of a question and a candidate answer from a target corpus. The target corpus can be clustered into latent topics (or other latent groupings), and probabilities of a question or candidate answer being in each of the latent topics can be calculated and condensed (e.g., downsampled) to improve performance and focus on the most relevant topics. The condensed probabilities can be aggregated and combined with a downstream vector representation of the question (or answer) so the model can use focused topical and other categorical information as auxiliary information in a similarity computation.
    Type: Application
    Filed: May 23, 2019
    Publication date: November 26, 2020
    Inventors: Seung-hyun Yoon, Franck Dernoncourt, Trung Huu Bui, Doo Soon Kim, Carl Iwan Dockhorn, Yu Gong
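    A small sketch of the condense-and-combine step, with hard-coded vectors and topic probabilities standing in for the pre-trained language model and the latent-topic clustering:
    ```python
    # Toy sketch of topic-augmented similarity (pub. 20200372025).
    import math

    def condense(topic_probs, k=2):
        # Downsample: keep the k most probable latent topics, renormalized.
        top = sorted(topic_probs, reverse=True)[:k]
        return [p / sum(top) for p in top]

    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
        return dot / norm

    def similarity(q_vec, q_topics, a_vec, a_topics):
        # Condensed topic probabilities are appended as auxiliary features.
        return cosine(q_vec + condense(q_topics), a_vec + condense(a_topics))

    print(similarity([0.2, 0.9], [0.7, 0.2, 0.1], [0.3, 0.8], [0.6, 0.3, 0.1]))
    ```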
  • Patent number: 10817713
    Abstract: The present disclosure relates to systems, methods, and non-transitory computer readable media for generating modified digital images based on verbal and/or gesture input by utilizing a natural language processing neural network and one or more computer vision neural networks. The disclosed systems can receive verbal input together with gesture input. The disclosed systems can further utilize a natural language processing neural network to generate a verbal command based on verbal input. The disclosed systems can select a particular computer vision neural network based on the verbal input and/or the gesture input. The disclosed systems can apply the selected computer vision neural network to identify pixels within a digital image that correspond to an object indicated by the verbal input and/or gesture input. Utilizing the identified pixels, the disclosed systems can generate a modified digital image by performing one or more editing actions indicated by the verbal input and/or gesture input.
    Type: Grant
    Filed: November 15, 2018
    Date of Patent: October 27, 2020
    Assignee: Adobe Inc.
    Inventors: Trung Bui, Zhe Lin, Walter Chang, Nham Le, Franck Dernoncourt
  • Publication number: 20200327884
    Abstract: Methods and systems are provided for generating a customized speech recognition neural network system comprised of an adapted automatic speech recognition neural network and an adapted language model neural network. The automatic speech recognition neural network is first trained in a generic domain and then adapted to a target domain. The language model neural network is first trained in a generic domain and then adapted to a target domain. Such a customized speech recognition neural network system can be used to understand input vocal commands.
    Type: Application
    Filed: April 12, 2019
    Publication date: October 15, 2020
    Inventors: Trung Huu Bui, Subhadeep Dey, Franck Dernoncourt
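    The two-stage schedule (generic training, then target-domain adaptation) can be shown with a deliberately tiny stand-in model; nothing here reflects the actual network architectures:
    ```python
    # Toy sketch of generic-then-target training (pub. 20200327884).
    # A one-parameter linear "model" stands in for the ASR and language-model networks.

    def train(weight, data, lr, epochs):
        for _ in range(epochs):
            for x, y in data:
                pred = weight * x
                weight -= lr * (pred - y) * x   # gradient step on squared error
        return weight

    generic_data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9)]   # broad generic domain
    target_data = [(1.0, 2.5), (2.0, 5.1)]                # narrow target domain

    w = train(0.0, generic_data, lr=0.05, epochs=50)   # stage 1: generic training
    w = train(w, target_data, lr=0.01, epochs=50)      # stage 2: domain adaptation
    print(round(w, 3))
    ```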
  • Publication number: 20200311207
    Abstract: Methods and systems are provided for identifying subparts of a text. A neural network system can receive a set of sentences that includes context sentences and target sentences that indicate a decision point in a text. The neural network system can generate context sentence vectors and target sentence vectors by encoding context from the set of sentences. These context sentence vectors can be weighted to focus on relevant information. The weighted context sentence vectors and the target sentence vectors can then be used to output a label for the decision point in the text.
    Type: Application
    Filed: March 28, 2019
    Publication date: October 1, 2020
    Inventors: Seokhwan Kim, Walter W. Chang, Nedim Lipka, Franck Dernoncourt, Chan Young Park
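    A minimal sketch of weighting context sentence vectors against a target sentence vector; the vectors are hard-coded and the softmax weighting is an assumed instantiation of the weighting the abstract mentions:
    ```python
    # Toy sketch of weighted context sentence vectors (pub. 20200311207).
    import math

    def softmax(xs):
        exps = [math.exp(x) for x in xs]
        return [e / sum(exps) for e in exps]

    def attend(target, contexts):
        # Weight each context sentence vector by its similarity to the target.
        weights = softmax([sum(t * c for t, c in zip(target, ctx)) for ctx in contexts])
        # Weighted sum focuses the pooled vector on relevant context.
        return [sum(w * ctx[i] for w, ctx in zip(weights, contexts))
                for i in range(len(target))]

    target = [0.9, 0.1]                      # target sentence vector (decision point)
    contexts = [[0.8, 0.2], [0.1, 0.9]]      # context sentence vectors
    pooled = attend(target, contexts)
    label = "branch" if pooled[0] > pooled[1] else "no-branch"   # toy label
    print(pooled, label)
    ```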
  • Publication number: 20200312298
    Abstract: The present disclosure relates to systems, non-transitory computer-readable media, and methods that generate ground truth annotations of target utterances in digital image editing dialogues in order to create a state-driven training data set. In particular, in one or more embodiments, the disclosed systems utilize machine and user defined tags, machine learning model predictions, and user input to generate a ground truth annotation that includes frame information in addition to intent, attribute, object, and/or location information. In at least one embodiment, the disclosed systems generate ground truth annotations in conformance with an annotation ontology that results in fast and accurate digital image editing dialogue annotation.
    Type: Application
    Filed: March 27, 2019
    Publication date: October 1, 2020
    Inventors: Trung Bui, Zahra Rahimi, Yinglan Ma, Seokhwan Kim, Franck Dernoncourt
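    One plausible shape for an annotation record that carries frame, intent, attribute, object, and location information, per the abstract; the field types and validation rule are illustrative, not the disclosure's ontology:
    ```python
    # Toy shape for a digital image editing dialogue annotation (pub. 20200312298).
    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class DialogueAnnotation:
        utterance: str
        frame_id: int                      # which editing frame the turn belongs to
        intent: str                        # e.g. "adjust", "undo", "select"
        attribute: Optional[str] = None    # e.g. "brightness"
        obj: Optional[str] = None          # e.g. "sky"
        location: Optional[str] = None     # e.g. "top left"

        def validate(self, allowed=("adjust", "undo", "select")):
            # Illustrative ontology check: intents outside the ontology are rejected.
            if self.intent not in allowed:
                raise ValueError(f"intent {self.intent!r} not in ontology")
            return self

    ann = DialogueAnnotation("make the sky brighter", frame_id=0, intent="adjust",
                             attribute="brightness", obj="sky").validate()
    print(ann)
    ```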
  • Patent number: 10783314
    Abstract: Techniques are disclosed for generating a structured transcription from a speech file. In an example embodiment, a structured transcription system receives a speech file comprising speech from one or more people and generates a navigable structured transcription object. The navigable structured transcription object may comprise one or more data structures representing multimedia content with which a user may navigate and interact via a user interface. Text and/or speech relating to the speech file can be selectively presented to the user (e.g., the text can be presented via a display, and the speech can be aurally presented via a speaker).
    Type: Grant
    Filed: June 29, 2018
    Date of Patent: September 22, 2020
    Assignee: Adobe Inc.
    Inventors: Franck Dernoncourt, Walter Wei-Tuh Chang, Seokhwan Kim, Sean Fitzgerald, Ragunandan Rao Malangully, Laurie Marie Byrum, Frederic Thevenet, Carl Iwan Dockhorn
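    A sketch of what a navigable structured transcription object could look like; the sections/segments layout and the `find` helper are assumptions, since the patent only specifies navigable multimedia data structures:
    ```python
    # Toy navigable structured transcription (pat. 10783314).
    from dataclasses import dataclass, field
    from typing import List, Tuple

    @dataclass
    class Segment:
        speaker: str
        start: float     # seconds into the audio
        end: float
        text: str

    @dataclass
    class Section:
        title: str
        segments: List[Segment] = field(default_factory=list)

    @dataclass
    class StructuredTranscript:
        sections: List[Section]

        def find(self, keyword: str) -> List[Tuple[str, Segment]]:
            # Navigate to every segment whose text mentions the keyword.
            return [(sec.title, seg) for sec in self.sections
                    for seg in sec.segments if keyword in seg.text]

    doc = StructuredTranscript([
        Section("Intro", [Segment("Speaker A", 0.0, 4.2, "welcome to the meeting")]),
    ])
    print(doc.find("meeting"))
    ```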
  • Patent number: 10769495
    Abstract: In implementations of collecting multimodal image editing requests (IERs), a user interface is generated that exposes an image pair including a first image and a second image including at least one edit to the first image. A user simultaneously speaks a voice command and performs a user gesture that describe an edit of the first image used to generate the second image. The user gesture and the voice command are simultaneously recorded and synchronized with timestamps. The voice command is played back, and the user transcribes their voice command based on the play back, creating an exact transcription of their voice command. Audio samples of the voice command with respective timestamps, coordinates of the user gesture with respective timestamps, and a transcription are packaged as a structured data object for use as training data to train a neural network to recognize multimodal IERs in an image editing application.
    Type: Grant
    Filed: August 1, 2018
    Date of Patent: September 8, 2020
    Assignee: Adobe Inc.
    Inventors: Trung Huu Bui, Zhe Lin, Walter Wei-Tuh Chang, Nham Van Le, Franck Dernoncourt
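    A minimal sketch of packaging timestamped audio, timestamped gesture coordinates, and a transcription into one structured training record, as the abstract describes; the JSON field names are invented:
    ```python
    # Toy packaging of one multimodal IER training record (pat. 10769495).
    import json

    def package_ier(audio_samples, gesture_trace, transcription):
        """audio_samples: [(t_seconds, sample)]; gesture_trace: [(t_seconds, x, y)]."""
        return json.dumps({
            "audio": [{"t": t, "sample": s} for t, s in audio_samples],
            "gesture": [{"t": t, "x": x, "y": y} for t, x, y in gesture_trace],
            "transcription": transcription,   # user's exact transcription of the command
        })

    record = package_ier(
        audio_samples=[(0.00, 0.12), (0.01, 0.15)],
        gesture_trace=[(0.00, 140, 88), (0.05, 150, 90)],
        transcription="remove the lamp on the left",
    )
    print(record)
    ```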
  • Publication number: 20200160042
    Abstract: The present disclosure relates to systems, methods, and non-transitory computer readable media for generating modified digital images based on verbal and/or gesture input by utilizing a natural language processing neural network and one or more computer vision neural networks. The disclosed systems can receive verbal input together with gesture input. The disclosed systems can further utilize a natural language processing neural network to generate a verbal command based on verbal input. The disclosed systems can select a particular computer vision neural network based on the verbal input and/or the gesture input. The disclosed systems can apply the selected computer vision neural network to identify pixels within a digital image that correspond to an object indicated by the verbal input and/or gesture input. Utilizing the identified pixels, the disclosed systems can generate a modified digital image by performing one or more editing actions indicated by the verbal input and/or gesture input.
    Type: Application
    Filed: November 15, 2018
    Publication date: May 21, 2020
    Inventors: Trung Bui, Zhe Lin, Walter Chang, Nham Le, Franck Dernoncourt
  • Publication number: 20200152175
    Abstract: Techniques are disclosed for generating ASR training data. According to an embodiment, impactful ASR training corpora are generated efficiently, and the quality or relevance of the ASR training corpora being generated is increased by leveraging knowledge of the ASR system being trained. An example methodology includes: selecting one of a word or phrase, based on knowledge and/or content of said ASR training corpora; presenting a textual representation of said word or phrase; receiving a speech utterance that includes said word or phrase; receiving a transcript for said speech utterance; presenting said transcript for review (to allow for editing, if needed); and storing said transcript and said audio file in an ASR system training database. The selecting may include, for instance, selecting a word or phrase that is under-represented in said database, and/or based upon an n-gram distribution of the language, and/or based upon known areas that tend to incur transcription mistakes.
    Type: Application
    Filed: November 13, 2018
    Publication date: May 14, 2020
    Applicant: Adobe Inc.
    Inventor: Franck Dernoncourt
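    A toy selector for the under-representation strategy named in the abstract; `language_freq` stands in for an n-gram distribution over the language:
    ```python
    # Toy prompt selector for ASR corpus generation (pub. 20200152175).
    from collections import Counter

    def next_prompt(database_words, language_freq):
        counts = Counter(database_words)
        total = sum(counts.values()) or 1
        # Prompt for the word whose share of the training database falls
        # furthest below its share of the language at large.
        def deficit(word):
            return language_freq[word] - counts[word] / total
        return max(language_freq, key=deficit)

    language_freq = {"the": 0.05, "synthesis": 0.001, "adobe": 0.0005}
    database = ["the", "the", "adobe"]
    print(next_prompt(database, language_freq))   # -> "synthesis"
    ```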
  • Publication number: 20200042286
    Abstract: In implementations of collecting multimodal image editing requests (IERs), a user interface is generated that exposes an image pair including a first image and a second image including at least one edit to the first image. A user simultaneously speaks a voice command and performs a user gesture that describe an edit of the first image used to generate the second image. The user gesture and the voice command are simultaneously recorded and synchronized with timestamps. The voice command is played back, and the user transcribes their voice command based on the play back, creating an exact transcription of their voice command. Audio samples of the voice command with respective timestamps, coordinates of the user gesture with respective timestamps, and a transcription are packaged as a structured data object for use as training data to train a neural network to recognize multimodal IERs in an image editing application.
    Type: Application
    Filed: August 1, 2018
    Publication date: February 6, 2020
    Applicant: Adobe Inc.
    Inventors: Trung Huu Bui, Zhe Lin, Walter Wei-Tuh Chang, Nham Van Le, Franck Dernoncourt
  • Publication number: 20200004803
    Abstract: Techniques are disclosed for generating a structured transcription from a speech file. In an example embodiment, a structured transcription system receives a speech file comprising speech from one or more people and generates a navigable structured transcription object. The navigable structured transcription object may comprise one or more data structures representing multimedia content with which a user may navigate and interact via a user interface. Text and/or speech relating to the speech file can be selectively presented to the user (e.g., the text can be presented via a display, and the speech can be aurally presented via a speaker).
    Type: Application
    Filed: June 29, 2018
    Publication date: January 2, 2020
    Applicant: Adobe Inc.
    Inventors: Franck Dernoncourt, Walter Wei-Tuh Chang, Seokhwan Kim, Sean Fitzgerald, Ragunandan Rao Malangully, Laurie Marie Byrum, Frederic Thevenet, Carl Iwan Dockhorn
  • Publication number: 20190384807
    Abstract: Systems, methods, and non-transitory computer-readable media are disclosed that collect and analyze annotation performance data to generate digital annotations for evaluating and training automatic electronic document annotation models. In particular, in one or more embodiments, the disclosed systems provide electronic documents to annotators based on annotator topic preferences. The disclosed systems then identify digital annotations and annotation performance data such as a time period spent by an annotator in generating digital annotations and annotator responses to digital annotation questions. Furthermore, in one or more embodiments, the disclosed systems utilize the identified digital annotations and the annotation performance data to generate a final set of reliable digital annotations. Additionally, in one or more embodiments, the disclosed systems provide the final set of digital annotations for utilization in training a machine learning model to generate annotations for electronic documents.
    Type: Application
    Filed: June 13, 2018
    Publication date: December 19, 2019
    Inventors: Franck Dernoncourt, Walter Chang, Trung Bui, Sean Fitzgerald, Sasha Spala, Kishore Aradhya, Carl Dockhorn
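    One way the performance-data filtering could look; the time and self-confidence thresholds are invented stand-ins for the annotation performance signals the abstract lists:
    ```python
    # Toy reliability filter over annotation performance data (pub. 20190384807).

    def reliable(annotations, min_seconds=10.0, min_confidence=0.6):
        kept = []
        for a in annotations:
            too_fast = a["seconds_spent"] < min_seconds      # likely skimmed the document
            unsure = a["self_confidence"] < min_confidence   # annotator's own response
            if not (too_fast or unsure):
                kept.append(a)
        return kept

    anns = [
        {"span": "neural attention", "seconds_spent": 25.0, "self_confidence": 0.9},
        {"span": "the", "seconds_spent": 2.0, "self_confidence": 0.9},
    ]
    print(reliable(anns))   # only the first annotation survives
    ```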
  • Publication number: 20190278835
    Abstract: Techniques are disclosed for an abstractive summarization process for summarizing documents, including long documents. A document is encoded using an encoder-decoder architecture with attentive decoding. In particular, an encoder for modeling documents generates both word-level and section-level representations of a document. A discourse-aware decoder then captures the information flow from all discourse sections of a document. In order to extend the robustness of the generated summarization, a neural attention mechanism considers both word-level as well as section-level representations of a document. The neural attention mechanism may utilize a set of weights that are applied to the word-level representations and section-level representations.
    Type: Application
    Filed: March 8, 2018
    Publication date: September 12, 2019
    Applicant: Adobe Inc.
    Inventors: Arman Cohan, Walter W. Chang, Trung Huu Bui, Franck Dernoncourt, Doo Soon Kim
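    A small numeric sketch of attention computed at both word and section level, with word attention rescaled by its section's attention; the vectors and dot-product scoring are simplifications of the encoder-decoder in the disclosure:
    ```python
    # Toy word-level x section-level attention (pub. 20190278835).
    import math

    def softmax(xs):
        exps = [math.exp(x) for x in xs]
        return [e / sum(exps) for e in exps]

    def hierarchical_attention(decoder_state, sections):
        # sections: list of (section_vector, [word_vector, ...])
        sec_w = softmax([sum(d * s for d, s in zip(decoder_state, sec))
                         for sec, _ in sections])
        weights = []
        for sw, (_, word_vecs) in zip(sec_w, sections):
            word_w = softmax([sum(d * w for d, w in zip(decoder_state, wv))
                              for wv in word_vecs])
            # Word attention is rescaled by the attention on its section.
            weights.append([sw * w for w in word_w])
        return weights

    state = [1.0, 0.0]
    sections = [([0.9, 0.1], [[1.0, 0.0], [0.0, 1.0]]),
                ([0.1, 0.9], [[0.5, 0.5]])]
    print(hierarchical_attention(state, sections))
    ```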
  • Patent number: 10055403
    Abstract: The present disclosure relates to dialog states, which computers use to internally represent what users have in mind in a dialog. A dialog state tracker employs various rules that enhance the ability of computers to correctly identify the presence of slot-value pairs, which make up dialog states, in utterances or conversational input of a dialog. Some rules provide for identifying synonyms of values of slot-value pairs in utterances. Other rules provide for identifying slot-value pairs based on coreferences between utterances and previous utterances of dialog sessions. Rules are also provided for carrying over slot-value pairs from dialog states of previous utterances to a dialog state of a current utterance. Yet other rules provide for removing slot-value pairs from candidate dialog states, which are later used as dialog states of utterances.
    Type: Grant
    Filed: February 5, 2016
    Date of Patent: August 21, 2018
    Assignee: Adobe Systems Incorporated
    Inventors: Trung Huu Bui, Hung Hai Bui, Franck Dernoncourt
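    The rule families the patent names (synonym matching, carry-over, removal) can be illustrated with a toy tracker; the synonym table and the negation rule are invented examples of those families:
    ```python
    # Toy rule-based dialog state tracker (pat. 10055403).
    SYNONYMS = {"inexpensive": ("price", "cheap"), "cheap": ("price", "cheap")}

    def track(utterances):
        state = {}
        for utt in utterances:
            tokens = utt.lower().split()
            # Carry-over rule: start from the previous utterance's dialog state.
            new_state = dict(state)
            # Synonym rule: map synonyms onto canonical slot-value pairs.
            for t in tokens:
                if t in SYNONYMS:
                    slot, value = SYNONYMS[t]
                    new_state[slot] = value
            # Removal rule: a negated slot or value is dropped from the candidate state.
            if "not" in tokens:
                for slot in list(new_state):
                    if new_state[slot] in tokens or slot in tokens:
                        del new_state[slot]
            state = new_state
        return state

    print(track(["I want something inexpensive"]))               # {'price': 'cheap'}
    print(track(["I want something inexpensive", "not cheap"]))  # {}
    ```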
  • Publication number: 20170228366
    Abstract: The present disclosure relates to dialog states, which computers use to internally represent what users have in mind in a dialog. A dialog state tracker employs various rules that enhance the ability of computers to correctly identify the presence of slot-value pairs, which make up dialog states, in utterances or conversational input of a dialog. Some rules provide for identifying synonyms of values of slot-value pairs in utterances. Other rules provide for identifying slot-value pairs based on coreferences between utterances and previous utterances of dialog sessions. Rules are also provided for carrying over slot-value pairs from dialog states of previous utterances to a dialog state of a current utterance. Yet other rules provide for removing slot-value pairs from candidate dialog states, which are later used as dialog states of utterances.
    Type: Application
    Filed: February 5, 2016
    Publication date: August 10, 2017
    Inventors: Trung Huu Bui, Hung Hai Bui, Franck Dernoncourt