Patents Examined by Jonathan Ernesto Amaya Hernandez
-
Patent number: 12032922
Abstract: Automatic generation of intelligent content is created using a system of computers including a user device and a cloud-based component that processes the user information. The system performs a process that includes receiving an input document and parsing the input document to generate inputs for a natural language generation model using a text analysis model. The natural language generation model generates one or more candidate presentation scripts based on the inputs. A presentation script is selected from the candidate presentation scripts and displayed. A text-to-speech model may be used to generate a synthesized audio presentation of the presentation script. A final presentation may be generated that includes a visual display of the input document and the corresponding audio presentation in sync with the visual display.
Type: Grant
Filed: May 12, 2021
Date of Patent: July 9, 2024
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Ji Li, Konstantin Seleskerov, Huey-Ru Tsai, Muin Barkatali Momin, Ramya Tridandapani, Sindhu Vigasini Jambunathan, Amit Srivastava, Derek Martin Johnson, Gencheng Wu, Sheng Zhao, Xinfeng Chen, Bohan Li
-
Patent number: 11996093
Abstract: An information processing apparatus and an information processing method are provided that enable suitable determination of sensing results used in estimating a user state. The information processing apparatus is provided with a determination unit that determines, on the basis of a predetermined reference, one or more second sensing results used in estimating the user state from among a plurality of first sensing results received from a plurality of devices. The information processing apparatus is further provided with an output control unit that controls an output of information on the basis of the one or more second sensing results.
Type: Grant
Filed: July 12, 2018
Date of Patent: May 28, 2024
Inventors: Shinichi Kawano, Hiro Iwase, Mari Saito, Yuhei Taki
-
Patent number: 11955118
Abstract: A real-time processor-implemented translation method and apparatus is provided. The real-time translation method includes receiving a content, determining a delay time for real-time translation based on a silence interval of the received content and an utterance interval of the received content, generating a translation result by translating a language used in the received content, and synthesizing the translation result and the received content.
Type: Grant
Filed: April 17, 2020
Date of Patent: April 9, 2024
Assignee: Samsung Electronics Co., Ltd.
Inventors: Youngmin Kim, Hwidong Na, Min-joong Lee, Hodong Lee
-
Patent number: 11875792
Abstract: A computer implemented method, computer system, and computer program product for executing a voice command. A number of processor units displays a view of a location with voice command devices in response to detecting the voice command from a user. The number of processor units displays a voice command direction for the voice command in the view of the location. The number of processor units changes the voice command direction in response to a user input. The number of processor units identifies a voice command device from the voice command devices in the location based on the voice command direction to form a selected voice command device. The number of processor units executes the voice command using the selected voice command device.
Type: Grant
Filed: August 17, 2021
Date of Patent: January 16, 2024
Assignee: International Business Machines Corporation
Inventors: Clement Decrop, Jeremy R. Fox, Tushar Agrawal, Sarbajit K. Rakshit
-
Patent number: 11848029
Abstract: A method for detecting an audio signal, the method comprises: obtaining a speech segment and a non-speech segment of an audio signal to be detected, extracting a first audio feature of the speech segment and a second audio feature of the non-speech segment, detecting the first audio feature using a predetermined speech segment detection model to obtain a first detection score, detecting the second audio feature using a predetermined non-speech segment detection model to obtain a second detection score, and determining whether the audio signal belongs to a target audio based on the first detection score and the second detection score.
Type: Grant
Filed: May 21, 2021
Date of Patent: December 19, 2023
Assignee: BEIJING XIAOMI PINECONE ELECTRONICS CO., LTD.
Inventors: Yifeng Wang, Guodu Cai, Shuo Yang, Lihan Li, Peng Gao
-
Patent number: 11848000
Abstract: Methods, systems, computer program products and data structures are described which allow for efficient correction of a transcription output of an automatic speech recognition system by a human proofreader. A method comprises receiving a voice input from a user; determining a transcription of the voice input; providing the transcription of the voice input; receiving a text input from the user indicating a revision to the transcription; determining how to revise the transcription in accordance with the text input; and revising the transcription of the voice input in accordance with the text input. A general or specialized language model, an acoustical language model, a character language model, a gaze tracker, and/or a stylus may be used to determine how to revise the transcription in accordance with the text input.
Type: Grant
Filed: December 12, 2019
Date of Patent: December 19, 2023
Assignee: Microsoft Technology Licensing, LLC
Inventor: William Duncan Lewis
-
Patent number: 11842718
Abstract: An unambiguous phonics system (UPS) is capable of presenting text in a format with unambiguous pronunciation. The system can translate input text written in a given language (e.g., English) into a UPS representation of the text written in a UPS alphabet. A unique UPS grapheme can be used to represent each unique grapheme-phoneme combination in the input text. Thus, each letter of the input text is represented in the UPS spelling and each letter of the UPS spelling unambiguously indicates the phoneme used. For all the various grapheme-phoneme combinations for a given input grapheme, the corresponding UPS graphemes can be constructed to have visual similarity with the given input grapheme, thus easing an eventual transition from UPS spelling to traditional spelling. The UPS can include translation, complexity scoring, word/phoneme-grapheme searching, and other modules. The UPS can also include techniques to provide efficient, level-based training of the UPS alphabet.
Type: Grant
Filed: December 10, 2020
Date of Patent: December 12, 2023
Assignee: TINYIVY, INC.
Inventor: Zachary Silverzweig
-
Patent number: 11817090
Abstract: A phonetic search system may pass phonetic information from an automatic speech recognition (ASR) system to a natural language understanding (NLU) system for the latter to leverage when performing entity resolution in the presence of ambiguous interpretations. The ASR system may include an acoustic model and a language model. The acoustic model can process audio data to generate hypotheses that can be mapped to acoustic data; i.e., one or more acoustic units such as phonemes. The language model can process the acoustic units to generate text data representing possible transcriptions of the audio data. ASR/NLU systems may have difficulty interpreting speech when confronted with, for example, homographs, which are words that are spelled the same, but have different meanings. When uncertainty in the final transcription is high, the system can leverage the acoustic data to improve the accuracy of entity resolution.
Type: Grant
Filed: December 12, 2019
Date of Patent: November 14, 2023
Assignee: Amazon Technologies, Inc.
Inventors: James Claiborne Moore, Majid Laali, Yasser Gonzalez Fernandez, Siyong Liang, Ameya Ashok Limaye
-
Patent number: 11804234
Abstract: A method for enhancing telephone speech signals based on Deep Convolutional Neural Network (CNN) is disclosed. The method is able to reduce the effect of acoustic distortions in daily scenarios during a telephone call. It is a single-channel, speech-oriented method with causal design and low latency. The novelty lies in the noise reduction method which, based on the classical gain method, uses a CNN to learn the Wiener estimator. Then, it computes the gain of the filter to enhance the speech power over the noise power for each time-frequency component of the signal. The selection of the Wiener gain estimator as an essential element of the method, decreases the vulnerability to estimation errors since the characteristics of this measure make it very appropriate to be estimated by deep learning approaches.
Type: Grant
Filed: December 17, 2020
Date of Patent: October 31, 2023
Assignee: SYSTEM ONE NOC & DEVELOPMENT SOLUTIONS, S.A.
Inventors: Javier Gallart Mauri, Iñigo Garcia Morte, Dayana Ribas Gonzalez, Antonio Miguel Artiaga, Alfonso Ortega Gimenez, Eduardo Lleida Solano
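Illustrative note: the classical Wiener gain that this abstract builds on can be written for one time-frequency component as G = SNR / (1 + SNR), where SNR is the ratio of speech power to noise power. The sketch below computes that gain from known powers and applies it to a noisy magnitude. It is only a sketch of the classical formula: in the patented method a CNN estimates the gain, which is not reproduced here, and the function names `wiener_gain` and `enhance` are hypothetical.

```python
def wiener_gain(speech_power: float, noise_power: float) -> float:
    """Classical Wiener gain for one time-frequency bin.

    speech / (speech + noise) is algebraically the same as SNR / (1 + SNR).
    """
    total = speech_power + noise_power
    if total <= 0.0:
        # No energy at all in this bin: nothing to keep.
        return 0.0
    return speech_power / total


def enhance(noisy_magnitudes, gains):
    """Scale each noisy time-frequency magnitude by its gain (hypothetical helper)."""
    return [g * m for g, m in zip(gains, noisy_magnitudes)]
```

The gain stays between 0 and 1, so bins dominated by noise are attenuated while bins dominated by speech pass through almost unchanged.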
-
Patent number: 11741950
Abstract: A processor-implemented method includes performing speech recognition of a speech signal, generating a plurality of first candidate sentences as a result of the performing of the speech recognition, identifying a respective named entity in each of the plurality of first candidate sentences, determining a standard expression corresponding to the identified respective named entity using phonemes of the corresponding named entity, determining whether to replace the identified named entity in each of the plurality of first candidate sentences with the determined standard expression based on a similarity between the named entity and the standard expression corresponding to the named entity and determining a plurality of second candidate sentences based on the determination result; and outputting a final sentence selected from the plurality of second candidate sentences.
Type: Grant
Filed: May 14, 2020
Date of Patent: August 29, 2023
Assignee: Samsung Electronics Co., Ltd.
Inventors: Jeong-Hoon Park, Jihyun Lee, Hoshik Lee
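Illustrative note: the replacement decision in this abstract (swap a recognized named entity for its standard expression only when the two are similar enough) can be sketched as a threshold test. The abstract does not say how similarity is measured or what threshold applies, so everything here is an assumption: the sketch uses the character-level ratio from Python's standard `difflib.SequenceMatcher` as a stand-in for a phoneme-based similarity, and the function name `maybe_replace` and the default threshold of 0.8 are hypothetical.

```python
from difflib import SequenceMatcher


def maybe_replace(entity: str, standard: str, threshold: float = 0.8) -> str:
    """Return the standard expression if it is similar enough to the entity.

    Assumption: character-level similarity stands in for the phoneme-based
    similarity described in the abstract; the threshold is illustrative.
    """
    similarity = SequenceMatcher(None, entity, standard).ratio()
    return standard if similarity >= threshold else entity
```

For example, `maybe_replace("Jon Smith", "John Smith")` returns the standard form, while a dissimilar pair such as `maybe_replace("apple", "zebra")` leaves the recognized text unchanged.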
-
Patent number: 11694696
Abstract: A method and apparatus for generating a speaker identification neural network include generating a first neural network that is trained to identify a first speaker with respect to a first voice signal in a first environment, generating a second neural network for identifying a second speaker with respect to a second voice signal in a second environment, and generating the speaker identification neural network by training the second neural network based on a teacher-student training model in which the first neural network is set to a teacher neural network and the second neural network is set to a student neural network.
Type: Grant
Filed: November 25, 2019
Date of Patent: July 4, 2023
Assignees: SAMSUNG ELECTRONICS CO., LTD., SEOUL NATIONAL UNIVERSITY R&DB FOUNDATION
Inventors: Sungchan Kang, Namsoo Kim, Cheheung Kim, Seokwan Chae
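Illustrative note: teacher-student training is commonly implemented by penalizing the divergence between the teacher's and the student's output distributions. The abstract does not specify the loss, so the sketch below is an assumption: it uses a temperature-softened softmax and the Kullback-Leibler divergence, a widely used choice for knowledge distillation. The function names `softmax` and `distillation_loss` and the default temperature of 2.0 are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the softened teacher distribution to the student's.

    Assumption: this loss is a common distillation choice, not one taken
    from the patent. It is zero when both networks produce the same logits.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
```

Minimizing this value with respect to the student's parameters pulls the student's speaker predictions toward the teacher's.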
-
Patent number: 11531813
Abstract: A method, an electronic device and a readable storage medium for creating a label marking model are disclosed. The method for creating the label marking model includes: obtaining text data and determining a word or phrase to be marked in the text data; according to the word or phrase to be marked, constructing a first training sample of the text data corresponding to a word or phrase replacing task and a second training sample corresponding to a label marking task; training a neural network model with a plurality of the first training samples and a plurality of the second training samples, respectively, until a loss function of the word or phrase replacing task and a loss function of the label marking task satisfy a preset condition, to obtain the label marking model.
Type: Grant
Filed: September 9, 2020
Date of Patent: December 20, 2022
Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
Inventors: Xinchao Xu, Haifeng Wang, Hua Wu, Zhanyi Liu
-
Patent number: 11527235
Abstract: Text independent speaker recognition models can be utilized by an automated assistant to verify a particular user spoke a spoken utterance and/or to identify the user who spoke a spoken utterance. Implementations can include automatically updating a speaker embedding for a particular user based on previous utterances by the particular user. Additionally or alternatively, implementations can include verifying a particular user spoke a spoken utterance using output generated by both a text independent speaker recognition model as well as a text dependent speaker recognition model. Furthermore, implementations can additionally or alternatively include prefetching content for several users associated with a spoken utterance prior to determining which user spoke the spoken utterance.
Type: Grant
Filed: December 2, 2019
Date of Patent: December 13, 2022
Assignee: GOOGLE LLC
Inventors: Pu-sen Chao, Diego Melendo Casado, Ignacio Lopez Moreno, Quan Wang
-
Patent number: 11417313
Abstract: A speech synthesizer using artificial intelligence includes a memory configured to store a first ratio of a word classified into a minor class among a plurality of classes, a second ratio of the word which is not classified into the minor class, and a synthesized speech model and a processor configured to change a first class classification probability set of the word to a second class classification probability set, based on the first ratio, the second ratio and the first class classification probability set, and learn the synthesized speech model using the changed second class classification probability set.
Type: Grant
Filed: April 23, 2019
Date of Patent: August 16, 2022
Assignee: LG ELECTRONICS INC.
Inventors: Jonghoon Chae, Sungmin Han
-
Patent number: 11397852
Abstract: A news interaction method, apparatus, device and computer storage medium are proposed. Input information input by a user upon reading current news content is obtained; parsing information of the input information is obtained based on the current news content, where the parsing information includes intent information of the input information; the input information is distributed to at least one news interactive service subsystem according to the intent information of the input information, and a return result returned by the at least one news interactive service subsystem is received; and a display result is selected from the return result according to a preset policy, and provided to the user.
Type: Grant
Filed: December 13, 2019
Date of Patent: July 26, 2022
Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD
Inventors: Shuo Huang, Jiaxin Lin, Zhihong Fu, Jinbo Zhan, Guang Ling, Shiwei Huang, Guyue Zhou, Chao Zhou