Patents Examined by Jonathan Ernesto Amaya Hernandez

Automated script generation and audio-visual presentations

Patent number: 12032922

Abstract: Intelligent content is generated automatically using a system of computers including a user device and a cloud-based component that processes the user information. The system performs a process that includes receiving an input document and parsing the input document to generate inputs for a natural language generation model using a text analysis model. The natural language generation model generates one or more candidate presentation scripts based on the inputs. A presentation script is selected from the candidate presentation scripts and displayed. A text-to-speech model may be used to generate a synthesized audio presentation of the presentation script. A final presentation may be generated that includes a visual display of the input document and the corresponding audio presentation in sync with the visual display.

Type: Grant

Filed: May 12, 2021

Date of Patent: July 9, 2024

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Ji Li, Konstantin Seleskerov, Huey-Ru Tsai, Muin Barkatali Momin, Ramya Tridandapani, Sindhu Vigasini Jambunathan, Amit Srivastava, Derek Martin Johnson, Gencheng Wu, Sheng Zhao, Xinfeng Chen, Bohan Li
Information processing apparatus and information processing method

Patent number: 11996093

Abstract: An information processing apparatus and an information processing method are provided that enable suitable determination of sensing results used in estimating a user state. The information processing apparatus is provided with a determination unit that determines, on the basis of a predetermined reference, one or more second sensing results used in estimating the user state from among a plurality of first sensing results received from a plurality of devices. The information processing apparatus is further provided with an output control unit that controls an output of information on the basis of the one or more second sensing results.

Type: Grant

Filed: July 12, 2018

Date of Patent: May 28, 2024

Inventors: Shinichi Kawano, Hiro Iwase, Mari Saito, Yuhei Taki
Method and apparatus with real-time translation

Patent number: 11955118

Abstract: A real-time processor-implemented translation method and apparatus is provided. The real-time translation method includes receiving a content, determining a delay time for real-time translation based on a silence interval of the received content and an utterance interval of the received content, generating a translation result by translating a language used in the received content, and synthesizing the translation result and the received content.

Type: Grant

Filed: April 17, 2020

Date of Patent: April 9, 2024

Assignee: Samsung Electronics Co., Ltd.

Inventors: Youngmin Kim, Hwidong Na, Min-joong Lee, Hodong Lee
Holographic interface for voice commands

Patent number: 11875792

Abstract: A computer implemented method, computer system, and computer program product for executing a voice command. A number of processor units displays a view of a location with voice command devices in response to detecting the voice command from a user. The number of processor units displays a voice command direction for the voice command in the view of the location. The number of processor units changes the voice command direction in response to a user input. The number of processor units identifies a voice command device from the voice command devices in the location based on the voice command direction to form a selected voice command device. The number of processor units executes the voice command using the selected voice command device.

Type: Grant

Filed: August 17, 2021

Date of Patent: January 16, 2024

Assignee: International Business Machines Corporation

Inventors: Clement Decrop, Jeremy R. Fox, Tushar Agrawal, Sarbajit K. Rakshit
Transcription revision interface for speech recognition system

Patent number: 11848000

Abstract: Methods, systems, computer program products and data structures are described which allow for efficient correction of a transcription output of an automatic speech recognition system by a human proofreader. A method comprises receiving a voice input from a user; determining a transcription of the voice input; providing the transcription of the voice input; receiving a text input from the user indicating a revision to the transcription; determining how to revise the transcription in accordance with the text input; and revising the transcription of the voice input in accordance with the text input. A general or specialized language model, an acoustical language model, a character language model, a gaze tracker, and/or a stylus may be used to determine how to revise the transcription in accordance with the text input.

Type: Grant

Filed: December 12, 2019

Date of Patent: December 19, 2023

Assignee: Microsoft Technology Licensing, LLC

Inventor: William Duncan Lewis
Method and device for detecting audio signal, and storage medium

Patent number: 11848029

Abstract: A method for detecting an audio signal comprises: obtaining a speech segment and a non-speech segment of an audio signal to be detected, extracting a first audio feature of the speech segment and a second audio feature of the non-speech segment, detecting the first audio feature using a predetermined speech segment detection model to obtain a first detection score, detecting the second audio feature using a predetermined non-speech segment detection model to obtain a second detection score, and determining whether the audio signal belongs to a target audio based on the first detection score and the second detection score.
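
The abstract does not say how the two detection scores are combined. As a minimal sketch only, the last step could be a weighted fusion followed by a threshold. The function name `is_target_audio`, the weight, the threshold, and the assumption that both scores lie in [0, 1] are all hypothetical and not taken from the patent.

```python
def is_target_audio(speech_score, non_speech_score,
                    speech_weight=0.5, threshold=0.5):
    """Fuse two detection scores into one yes/no decision.

    Assumption: both scores lie in [0, 1], where higher means
    "more likely to be the target audio".
    """
    if not 0.0 <= speech_weight <= 1.0:
        raise ValueError("speech_weight must be in [0, 1]")
    # Weighted average of the speech-segment and non-speech-segment scores.
    fused = speech_weight * speech_score + (1.0 - speech_weight) * non_speech_score
    return fused >= threshold


# Hypothetical scores from the two segment-level models.
decision = is_target_audio(0.9, 0.8)
```

Other fusion rules (for example, requiring both scores to clear their own thresholds) would fit the abstract equally well.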

Type: Grant

Filed: May 21, 2021

Date of Patent: December 19, 2023

Assignee: BEIJING XIAOMI PINECONE ELECTRONICS CO., LTD.

Inventors: Yifeng Wang, Guodu Cai, Shuo Yang, Lihan Li, Peng Gao
Unambiguous phonics system

Patent number: 11842718

Abstract: An unambiguous phonics system (UPS) is capable of presenting text in a format with unambiguous pronunciation. The system can translate input text written in a given language (e.g., English) into a UPS representation of the text written in a UPS alphabet. A unique UPS grapheme can be used to represent each unique grapheme-phoneme combination in the input text. Thus, each letter of the input text is represented in the UPS spelling and each letter of the UPS spelling unambiguously indicates the phoneme used. For all the various grapheme-phoneme combinations for a given input grapheme, the corresponding UPS graphemes can be constructed to have visual similarity with the given input grapheme, thus easing an eventual transition from UPS spelling to traditional spelling. The UPS can include translation, complexity scoring, word/phoneme-grapheme searching, and other modules. The UPS can also include techniques to provide efficient, level-based training of the UPS alphabet.

Type: Grant

Filed: December 10, 2020

Date of Patent: December 12, 2023

Assignee: TINYIVY, INC.

Inventor: Zachary Silverzweig
Entity resolution using acoustic data

Patent number: 11817090

Abstract: A phonetic search system may pass phonetic information from an automatic speech recognition (ASR) system to a natural language understanding (NLU) system for the latter to leverage when performing entity resolution in the presence of ambiguous interpretations. The ASR system may include an acoustic model and a language model. The acoustic model can process audio data to generate hypotheses that can be mapped to acoustic data; i.e., one or more acoustic units such as phonemes. The language model can process the acoustic units to generate text data representing possible transcriptions of the audio data. ASR/NLU systems may have difficulty interpreting speech when confronted with, for example, homographs, which are words that are spelled the same, but have different meanings. When uncertainty in the final transcription is high, the system can leverage the acoustic data to improve the accuracy of entity resolution.

Type: Grant

Filed: December 12, 2019

Date of Patent: November 14, 2023

Assignee: Amazon Technologies, Inc.

Inventors: James Claiborne Moore, Majid Laali, Yasser Gonzalez Fernandez, Siyong Liang, Ameya Ashok Limaye
Method for enhancing telephone speech signals based on Convolutional Neural Networks

Patent number: 11804234

Abstract: A method for enhancing telephone speech signals based on a deep Convolutional Neural Network (CNN) is disclosed. The method is able to reduce the effect of acoustic distortions in daily scenarios during a telephone call. It is a single-channel, speech-oriented method with causal design and low latency. The novelty lies in the noise reduction method which, based on the classical gain method, uses a CNN to learn the Wiener estimator. Then, it computes the gain of the filter to enhance the speech power over the noise power for each time-frequency component of the signal. The selection of the Wiener gain estimator as an essential element of the method decreases the vulnerability to estimation errors, since the characteristics of this measure make it very appropriate to be estimated by deep learning approaches.
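
For orientation, the classical Wiener gain for one time-frequency bin is G = S / (S + N), where S is the speech power and N is the noise power. The sketch below applies that gain to a noisy spectrogram. It assumes the power estimates are already available as plain numbers; in the patented method a CNN learns the estimator, which is not reproduced here. The names `wiener_gain` and `enhance` are hypothetical.

```python
def wiener_gain(speech_power, noise_power, eps=1e-12):
    """Classical Wiener gain S / (S + N) for one time-frequency bin.

    eps guards against division by zero when both powers are zero.
    """
    return speech_power / (speech_power + noise_power + eps)


def enhance(noisy_spectrum, speech_power, noise_power):
    """Scale each bin of a noisy spectrogram by its Wiener gain.

    All three arguments are lists of frames; each frame is a list of bins.
    Assumption: the power estimates are given (they stand in for the
    output of a learned estimator).
    """
    return [
        [wiener_gain(s, n) * x for x, s, n in zip(frame, s_row, n_row)]
        for frame, s_row, n_row in zip(noisy_spectrum, speech_power, noise_power)
    ]


# One frame with one complex bin: speech power 3, noise power 1, gain 0.75.
enhanced = enhance([[2 + 0j]], [[3.0]], [[1.0]])
```

Because the gain always lies between 0 and 1, an estimation error can only attenuate a bin too much or too little; it can never amplify it.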

Type: Grant

Filed: December 17, 2020

Date of Patent: October 31, 2023

Assignee: SYSTEM ONE NOC & DEVELOPMENT SOLUTIONS, S.A.

Inventors: Javier Gallart Mauri, Iñigo Garcia Morte, Dayana Ribas Gonzalez, Antonio Miguel Artiaga, Alfonso Ortega Gimenez, Eduardo Lleida Solano
Method and apparatus with speech processing

Patent number: 11741950

Abstract: A processor-implemented method includes performing speech recognition of a speech signal, generating a plurality of first candidate sentences as a result of the performing of the speech recognition, identifying a respective named entity in each of the plurality of first candidate sentences, determining a standard expression corresponding to the identified respective named entity using phonemes of the corresponding named entity, determining whether to replace the identified named entity in each of the plurality of first candidate sentences with the determined standard expression based on a similarity between the named entity and the standard expression corresponding to the named entity, determining a plurality of second candidate sentences based on the determination result, and outputting a final sentence selected from the plurality of second candidate sentences.
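
The replacement step can be illustrated with a rough sketch. It assumes a small lexicon that maps each standard expression to a phoneme sequence, and it uses `difflib.SequenceMatcher` as a stand-in similarity measure; the abstract does not specify either. The function names, the lexicon, and the 0.8 threshold are hypothetical.

```python
from difflib import SequenceMatcher


def phoneme_similarity(a, b):
    """Similarity in [0, 1] between two phoneme sequences (stand-in metric)."""
    return SequenceMatcher(None, a, b).ratio()


def normalize_entity(sentence, entity, entity_phonemes, lexicon, threshold=0.8):
    """Replace `entity` with the closest standard expression, if close enough.

    lexicon maps a standard expression to its phoneme sequence
    (a hypothetical data structure for this sketch).
    """
    best_name, best_score = None, 0.0
    for name, phonemes in lexicon.items():
        score = phoneme_similarity(entity_phonemes, phonemes)
        if score > best_score:
            best_name, best_score = name, score
    if best_name is not None and best_score >= threshold:
        return sentence.replace(entity, best_name)
    return sentence  # keep the recognizer's wording when nothing is close


lexicon = {"Samsung": ["S", "AE", "M", "S", "AH", "NG"]}
fixed = normalize_entity("call sam sung support", "sam sung",
                         ["S", "AE", "M", "S", "AH", "NG"], lexicon)
kept = normalize_entity("call cat support", "cat", ["K", "AE", "T"], lexicon)
```

Running this over every first candidate sentence would yield the second candidate sentences from which a final sentence is selected.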

Type: Grant

Filed: May 14, 2020

Date of Patent: August 29, 2023

Assignee: Samsung Electronics Co., Ltd.

Inventors: Jeong-Hoon Park, Jihyun Lee, Hoshik Lee
Method and apparatus for implementing speaker identification neural network

Patent number: 11694696

Abstract: A method and apparatus for generating a speaker identification neural network include generating a first neural network that is trained to identify a first speaker with respect to a first voice signal in a first environment, generating a second neural network for identifying a second speaker with respect to a second voice signal in a second environment, and generating the speaker identification neural network by training the second neural network based on a teacher-student training model in which the first neural network is set to a teacher neural network and the second neural network is set to a student neural network.
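
Teacher-student training is commonly implemented by having the student match the teacher's softened output distribution. The sketch below computes such a loss as a KL divergence over temperature-scaled softmax outputs. The abstract does not state which loss the patent uses, so this is an assumption, and the names `softmax` and `distillation_loss` are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over softened speaker posteriors.

    Assumption: logits are moderate in size, so no student probability
    underflows to zero.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


# Teacher sees the first-environment signal, student the second-environment one.
loss = distillation_loss([5.0, 0.0, 0.0], [0.0, 5.0, 0.0])
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two disagree, which is what drives the student toward the teacher's speaker decisions.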

Type: Grant

Filed: November 25, 2019

Date of Patent: July 4, 2023

Assignees: SAMSUNG ELECTRONICS CO., LTD., SEOUL NATIONAL UNIVERSITY R&DB FOUNDATION

Inventors: Sungchan Kang, Namsoo Kim, Cheheung Kim, Seokwan Chae
Method, electronic device and readable storage medium for creating a label marking model

Patent number: 11531813

Abstract: A method, an electronic device and a readable storage medium for creating a label marking model are disclosed. The method for creating the label marking model includes: obtaining text data and determining a word or phrase to be marked in the text data; according to the word or phrase to be marked, constructing a first training sample of the text data corresponding to a word or phrase replacing task and a second training sample corresponding to a label marking task; training a neural network model with a plurality of the first training samples and a plurality of the second training samples, respectively, until a loss function of the word or phrase replacing task and a loss function of the label marking task satisfy a preset condition, to obtain the label marking model.

Type: Grant

Filed: September 9, 2020

Date of Patent: December 20, 2022

Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventors: Xinchao Xu, Haifeng Wang, Hua Wu, Zhanyi Liu
Text independent speaker recognition

Patent number: 11527235

Abstract: Text independent speaker recognition models can be utilized by an automated assistant to verify a particular user spoke a spoken utterance and/or to identify the user who spoke a spoken utterance. Implementations can include automatically updating a speaker embedding for a particular user based on previous utterances by the particular user. Additionally or alternatively, implementations can include verifying a particular user spoke a spoken utterance using output generated by both a text independent speaker recognition model as well as a text dependent speaker recognition model. Furthermore, implementations can additionally or alternatively include prefetching content for several users associated with a spoken utterance prior to determining which user spoke the spoken utterance.
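
One simple way to update a speaker embedding from previous utterances is a running mean, with verification done by cosine similarity against the stored embedding. The abstract names neither technique, so both are assumptions here, as are the function names and the toy two-dimensional vectors.

```python
import math


def update_embedding(current, new, count):
    """Running mean: `current` already averages `count` utterance embeddings."""
    return [(c * count + x) / (count + 1) for c, x in zip(current, new)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Fold a second utterance into an embedding built from one utterance.
profile = update_embedding([1.0, 0.0], [0.0, 1.0], count=1)
score = cosine(profile, [1.0, 1.0])
```

A verification decision would then compare `score` with a threshold, possibly combined with the score of a text dependent model as the abstract describes.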

Type: Grant

Filed: December 2, 2019

Date of Patent: December 13, 2022

Assignee: GOOGLE LLC

Inventors: Pu-sen Chao, Diego Melendo Casado, Ignacio Lopez Moreno, Quan Wang
Speech synthesizer using artificial intelligence, method of operating speech synthesizer and computer-readable recording medium

Patent number: 11417313

Abstract: A speech synthesizer using artificial intelligence includes a memory configured to store a first ratio of a word classified into a minor class among a plurality of classes, a second ratio of the word which is not classified into the minor class, and a synthesized speech model and a processor configured to change a first class classification probability set of the word to a second class classification probability set, based on the first ratio, the second ratio and the first class classification probability set, and learn the synthesized speech model using the changed second class classification probability set.

Type: Grant

Filed: April 23, 2019

Date of Patent: August 16, 2022

Assignee: LG ELECTRONICS INC.

Inventors: Jonghoon Chae, Sungmin Han
News interaction method, apparatus, device and computer storage medium

Patent number: 11397852

Abstract: A news interaction method, apparatus, device and computer storage medium are proposed. Input information input by a user upon reading current news content is obtained; parsing information of the input information is obtained based on the current news content, where the parsing information includes intent information of the input information; the input information is distributed to at least one news interactive service subsystem according to the intent information of the input information, and a return result returned by the at least one news interactive service subsystem is received; and a display result is selected from the return result according to a preset policy, and provided to the user.
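
The flow in the abstract (parse intent, distribute to subsystems, select one result by policy) can be sketched as follows. Everything concrete here is hypothetical: the keyword-based intent parser, the subsystem names, and the choice of a fixed priority list as the "preset policy".

```python
def parse_intent(text, news):
    """Toy intent parser; a real system would also use the news content."""
    return "summary" if "summar" in text.lower() else "question"


def handle_input(text, news, subsystems, priority):
    """Dispatch input by intent, then pick one result by a priority policy."""
    intent = parse_intent(text, news)
    results = []
    for name, handler in subsystems.get(intent, []):
        result = handler(text, news)
        if result is not None:
            results.append((name, result))
    if not results:
        return None
    # Preset policy (assumed): the subsystem listed earliest in `priority` wins.
    rank = {name: i for i, name in enumerate(priority)}
    best = min(results, key=lambda item: rank.get(item[0], len(priority)))
    return best[1]


subsystems = {
    "question": [("search", lambda t, n: "search hit"),
                 ("qa", lambda t, n: "qa answer")],
    "summary": [("summarizer", lambda t, n: n.split(".")[0])],
}
priority = ["qa", "search", "summarizer"]

answer = handle_input("Who wrote this?", "Markets rally. Details follow.",
                      subsystems, priority)
summary = handle_input("Summarize please", "Markets rally. Details follow.",
                       subsystems, priority)
```

Keeping the policy as data (the `priority` list) means the selection rule can change without touching the subsystems themselves.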

Type: Grant

Filed: December 13, 2019

Date of Patent: July 26, 2022

Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventors: Shuo Huang, Jiaxin Lin, Zhihong Fu, Jinbo Zhan, Guang Ling, Shiwei Huang, Guyue Zhou, Chao Zhou