Patents Examined by Jonathan Ernesto Amaya Hernandez
  • Patent number: 12032922
    Abstract: Intelligent content is generated automatically using a system of computers including a user device and a cloud-based component that processes the user information. The system performs a process that includes receiving an input document and parsing the input document to generate inputs for a natural language generation model using a text analysis model. The natural language generation model generates one or more candidate presentation scripts based on the inputs. A presentation script is selected from the candidate presentation scripts and displayed. A text-to-speech model may be used to generate a synthesized audio presentation of the presentation script. A final presentation may be generated that includes a visual display of the input document and the corresponding audio presentation in sync with the visual display.
    Type: Grant
    Filed: May 12, 2021
    Date of Patent: July 9, 2024
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Ji Li, Konstantin Seleskerov, Huey-Ru Tsai, Muin Barkatali Momin, Ramya Tridandapani, Sindhu Vigasini Jambunathan, Amit Srivastava, Derek Martin Johnson, Gencheng Wu, Sheng Zhao, Xinfeng Chen, Bohan Li
  • Patent number: 11996093
    Abstract: An information processing apparatus and an information processing method are provided that enable suitable determination of sensing results used in estimating a user state. The information processing apparatus is provided with a determination unit that determines, on the basis of a predetermined reference, one or more second sensing results used in estimating the user state from among a plurality of first sensing results received from a plurality of devices. The information processing apparatus is further provided with an output control unit that controls an output of information on the basis of the one or more second sensing results.
    Type: Grant
    Filed: July 12, 2018
    Date of Patent: May 28, 2024
    Inventors: Shinichi Kawano, Hiro Iwase, Mari Saito, Yuhei Taki
  • Patent number: 11955118
    Abstract: A real-time processor-implemented translation method and apparatus is provided. The real-time translation method includes receiving a content, determining a delay time for real-time translation based on a silence interval of the received content and an utterance interval of the received content, generating a translation result by translating a language used in the received content, and synthesizing the translation result and the received content.
    Type: Grant
    Filed: April 17, 2020
    Date of Patent: April 9, 2024
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Youngmin Kim, Hwidong Na, Min-joong Lee, Hodong Lee
  • Patent number: 11875792
    Abstract: A computer implemented method, computer system, and computer program product for executing a voice command. A number of processor units displays a view of a location with voice command devices in response to detecting the voice command from a user. The number of processor units displays a voice command direction for the voice command in the view of the location. The number of processor units changes the voice command direction in response to a user input. The number of processor units identifies a voice command device from the voice command devices in the location based on the voice command direction to form a selected voice command device. The number of processor units executes the voice command using the selected voice command device.
    Type: Grant
    Filed: August 17, 2021
    Date of Patent: January 16, 2024
    Assignee: International Business Machines Corporation
    Inventors: Clement Decrop, Jeremy R. Fox, Tushar Agrawal, Sarbajit K. Rakshit
  • Patent number: 11848029
    Abstract: A method for detecting an audio signal comprises: obtaining a speech segment and a non-speech segment of an audio signal to be detected, extracting a first audio feature of the speech segment and a second audio feature of the non-speech segment, detecting the first audio feature using a predetermined speech segment detection model to obtain a first detection score, detecting the second audio feature using a predetermined non-speech segment detection model to obtain a second detection score, and determining whether the audio signal belongs to a target audio based on the first detection score and the second detection score.
    Type: Grant
    Filed: May 21, 2021
    Date of Patent: December 19, 2023
    Assignee: BEIJING XIAOMI PINECONE ELECTRONICS CO., LTD.
    Inventors: Yifeng Wang, Guodu Cai, Shuo Yang, Lihan Li, Peng Gao
  • Patent number: 11848000
    Abstract: Methods, systems, computer program products and data structures are described which allow for efficient correction of a transcription output of an automatic speech recognition system by a human proofreader. A method comprises receiving a voice input from a user; determining a transcription of the voice input; providing the transcription of the voice input; receiving a text input from the user indicating a revision to the transcription; determining how to revise the transcription in accordance with the text input; and revising the transcription of the voice input in accordance with the text input. A general or specialized language model, an acoustical language model, a character language model, a gaze tracker, and/or a stylus may be used to determine how to revise the transcription in accordance with the text input.
    Type: Grant
    Filed: December 12, 2019
    Date of Patent: December 19, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventor: William Duncan Lewis
  • Patent number: 11842718
    Abstract: An unambiguous phonics system (UPS) is capable of presenting text in a format with unambiguous pronunciation. The system can translate input text written in a given language (e.g., English) into a UPS representation of the text written in a UPS alphabet. A unique UPS grapheme can be used to represent each unique grapheme-phoneme combination in the input text. Thus, each letter of the input text is represented in the UPS spelling and each letter of the UPS spelling unambiguously indicates the phoneme used. For all the various grapheme-phoneme combinations for a given input grapheme, the corresponding UPS graphemes can be constructed to have visual similarity with the given input grapheme, thus easing an eventual transition from UPS spelling to traditional spelling. The UPS can include translation, complexity scoring, word/phoneme-grapheme searching, and other modules. The UPS can also include techniques to provide efficient, level-based training of the UPS alphabet.
    Type: Grant
    Filed: December 10, 2020
    Date of Patent: December 12, 2023
    Assignee: TINYIVY, INC.
    Inventor: Zachary Silverzweig
  • Patent number: 11817090
    Abstract: A phonetic search system may pass phonetic information from an automatic speech recognition (ASR) system to a natural language understanding (NLU) system for the latter to leverage when performing entity resolution in the presence of ambiguous interpretations. The ASR system may include an acoustic model and a language model. The acoustic model can process audio data to generate hypotheses that can be mapped to acoustic data; i.e., one or more acoustic units such as phonemes. The language model can process the acoustic units to generate text data representing possible transcriptions of the audio data. ASR/NLU systems may have difficulty interpreting speech when confronted with, for example, homographs, which are words that are spelled the same, but have different meanings. When uncertainty in the final transcription is high, the system can leverage the acoustic data to improve the accuracy of entity resolution.
    Type: Grant
    Filed: December 12, 2019
    Date of Patent: November 14, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: James Claiborne Moore, Majid Laali, Yasser Gonzalez Fernandez, Siyong Liang, Ameya Ashok Limaye
  • Patent number: 11804234
    Abstract: A method for enhancing telephone speech signals based on a deep convolutional neural network (CNN) is disclosed. The method is able to reduce the effect of acoustic distortions in daily scenarios during a telephone call. It is a single-channel, speech-oriented method with causal design and low latency. The novelty lies in the noise reduction method which, based on the classical gain method, uses a CNN to learn the Wiener estimator. Then, it computes the gain of the filter to enhance the speech power over the noise power for each time-frequency component of the signal. The selection of the Wiener gain estimator as an essential element of the method decreases the vulnerability to estimation errors, since the characteristics of this measure make it well suited to estimation by deep learning approaches.
    Type: Grant
    Filed: December 17, 2020
    Date of Patent: October 31, 2023
    Assignee: SYSTEM ONE NOC & DEVELOPMENT SOLUTIONS, S.A.
    Inventors: Javier Gallart Mauri, Iñigo Garcia Morte, Dayana Ribas Gonzalez, Antonio Miguel Artiaga, Alfonso Ortega Gimenez, Eduardo Lleida Solano
  • Patent number: 11741950
    Abstract: A processor-implemented method includes performing speech recognition of a speech signal, generating a plurality of first candidate sentences as a result of the performing of the speech recognition, identifying a respective named entity in each of the plurality of first candidate sentences, determining a standard expression corresponding to the identified respective named entity using phonemes of the corresponding named entity, determining whether to replace the identified named entity in each of the plurality of first candidate sentences with the determined standard expression based on a similarity between the named entity and the standard expression corresponding to the named entity and determining a plurality of second candidate sentences based on the determination result; and outputting a final sentence selected from the plurality of second candidate sentences.
    Type: Grant
    Filed: May 14, 2020
    Date of Patent: August 29, 2023
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Jeong-Hoon Park, Jihyun Lee, Hoshik Lee
  • Patent number: 11694696
    Abstract: A method and apparatus for generating a speaker identification neural network include generating a first neural network that is trained to identify a first speaker with respect to a first voice signal in a first environment, generating a second neural network for identifying a second speaker with respect to a second voice signal in a second environment, and generating the speaker identification neural network by training the second neural network based on a teacher-student training model in which the first neural network is set to a teacher neural network and the second neural network is set to a student neural network.
    Type: Grant
    Filed: November 25, 2019
    Date of Patent: July 4, 2023
    Assignees: SAMSUNG ELECTRONICS CO., LTD., SEOUL NATIONAL UNIVERSITY R&DB FOUNDATION
    Inventors: Sungchan Kang, Namsoo Kim, Cheheung Kim, Seokwan Chae
  • Patent number: 11531813
    Abstract: A method, an electronic device and a readable storage medium for creating a label marking model are disclosed. The method for creating the label marking model includes: obtaining text data and determining a word or phrase to be marked in the text data; according to the word or phrase to be marked, constructing a first training sample of the text data corresponding to a word or phrase replacing task and a second training sample corresponding to a label marking task; training a neural network model with a plurality of the first training samples and a plurality of the second training samples, respectively, until a loss function of the word or phrase replacing task and a loss function of the label marking task satisfy a preset condition, to obtain the label marking model.
    Type: Grant
    Filed: September 9, 2020
    Date of Patent: December 20, 2022
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Xinchao Xu, Haifeng Wang, Hua Wu, Zhanyi Liu
  • Patent number: 11527235
    Abstract: Text independent speaker recognition models can be utilized by an automated assistant to verify a particular user spoke a spoken utterance and/or to identify the user who spoke a spoken utterance. Implementations can include automatically updating a speaker embedding for a particular user based on previous utterances by the particular user. Additionally or alternatively, implementations can include verifying a particular user spoke a spoken utterance using output generated by both a text independent speaker recognition model as well as a text dependent speaker recognition model. Furthermore, implementations can additionally or alternatively include prefetching content for several users associated with a spoken utterance prior to determining which user spoke the spoken utterance.
    Type: Grant
    Filed: December 2, 2019
    Date of Patent: December 13, 2022
    Assignee: GOOGLE LLC
    Inventors: Pu-sen Chao, Diego Melendo Casado, Ignacio Lopez Moreno, Quan Wang
  • Patent number: 11417313
    Abstract: A speech synthesizer using artificial intelligence includes a memory configured to store a first ratio of a word classified into a minor class among a plurality of classes, a second ratio of the word which is not classified into the minor class, and a synthesized speech model and a processor configured to change a first class classification probability set of the word to a second class classification probability set, based on the first ratio, the second ratio and the first class classification probability set, and learn the synthesized speech model using the changed second class classification probability set.
    Type: Grant
    Filed: April 23, 2019
    Date of Patent: August 16, 2022
    Assignee: LG ELECTRONICS INC.
    Inventors: Jonghoon Chae, Sungmin Han
  • Patent number: 11397852
    Abstract: A news interaction method, apparatus, device and computer storage medium are proposed. Input information input by a user upon reading current news content is obtained; parsing information of the input information is obtained based on the current news content, where the parsing information includes intent information of the input information; the input information is distributed to at least one news interactive service subsystem according to the intent information of the input information, and a return result returned by the at least one news interactive service subsystem is received; and a display result is selected from the return result according to a preset policy, and provided to the user.
    Type: Grant
    Filed: December 13, 2019
    Date of Patent: July 26, 2022
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Shuo Huang, Jiaxin Lin, Zhihong Fu, Jinbo Zhan, Guang Ling, Shiwei Huang, Guyue Zhou, Chao Zhou