Patents Examined by Michael N. Opsasnick

Information processing apparatus, information processing method, and program

Patent number: 11462213

Abstract: There is provided an information processing apparatus to realize more natural dialog between users and a system, the information processing apparatus including: a control unit that selects a feedback mode in response to a speech style of a user from among a plurality of modes in accordance with information related to recognition of speech of the user. The plurality of modes include a first mode in which implicit feedback is performed and a second mode in which explicit feedback is performed. Provided is an information processing method including: selecting, by a processor, a feedback mode in response to a speech style of a user from among a plurality of modes in accordance with information related to recognition of speech of the user. The plurality of modes include a first mode in which implicit feedback is performed and a second mode in which explicit feedback is performed.

Type: Grant

Filed: January 12, 2017

Date of Patent: October 4, 2022

Assignee: SONY CORPORATION

Inventor: Mari Saito
Method and apparatus for processing questions and answers, electronic device and storage medium

Patent number: 11461556

Abstract: A method for processing questions and answers includes: in a process of determining an answer to a question to be answered, determining the semantic representation on the question to be answered respectively with a first semantic representation model of question and a second semantic representation model of question. Semantic representation vectors obtained through the first semantic representation model of question and the second semantic representation model of question are spliced. A spliced semantic vector is determined as a semantic representation vector of the question to be answered. An answer semantic vector matching the semantic representation vector of the question to be answered is acquired from a vector index library of answer, and an answer corresponding to the answer semantic vector is determined as a target answer to the question to be answered.

Type: Grant

Filed: May 28, 2020

Date of Patent: October 4, 2022

Assignee: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventors: Yuchen Ding, Kai Liu, Jing Liu, Yan Chen
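
As a rough illustration of the flow in the abstract of patent 11461556 above, the sketch below splices the outputs of two question encoders and looks up the closest answer vector. The toy encoders, the in-memory index, and the use of cosine similarity for matching are all assumptions made for this example; the abstract does not specify the models, the index structure, or the matching metric.

```python
import math


def splice(vec_a, vec_b):
    """Concatenate two semantic vectors into a single spliced vector."""
    return list(vec_a) + list(vec_b)


def cosine(u, v):
    """Cosine similarity; an assumed matching metric, not taken from the patent."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def find_answer(question, model_one, model_two, answer_index):
    """Encode the question with both models, splice, and return the best-matching answer."""
    query = splice(model_one(question), model_two(question))
    best_vector, best_answer = max(answer_index, key=lambda entry: cosine(query, entry[0]))
    return best_answer


# Hypothetical stand-ins for the two question representation models.
def toy_model_one(text):
    return [float(len(text.split())), float(text.count("?"))]


def toy_model_two(text):
    return [1.0 if "capital" in text.lower() else 0.0]


# Hypothetical answer index: (answer semantic vector, answer text) pairs.
ANSWER_INDEX = [
    ([4.0, 1.0, 1.0], "Paris"),
    ([3.0, 1.0, 0.0], "Blue"),
]

if __name__ == "__main__":
    print(find_answer("What is the capital of France?", toy_model_one, toy_model_two, ANSWER_INDEX))
```

A production system would presumably replace the linear scan with an approximate nearest-neighbor index, but the splice-then-match structure would stay the same.
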
Multi-person speech separation method and apparatus using a generative adversarial network model

Patent number: 11450337

Abstract: A multi-person speech separation method is provided for a terminal. The method includes extracting a hybrid speech feature from a hybrid speech signal requiring separation, N human voices being mixed in the hybrid speech signal, N being a positive integer greater than or equal to 2; extracting a masking coefficient of the hybrid speech feature by using a generative adversarial network (GAN) model, to obtain a masking matrix corresponding to the N human voices, wherein the GAN model comprises a generative network model and an adversarial network model; and performing a speech separation on the masking matrix corresponding to the N human voices and the hybrid speech signal by using the GAN model, and outputting N separated speech signals corresponding to the N human voices.

Type: Grant

Filed: September 17, 2020

Date of Patent: September 20, 2022

Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED

Inventors: Lianwu Chen, Meng Yu, Yanmin Qian, Dan Su, Dong Yu
Method and apparatus for controlling audio frame loss concealment

Patent number: 11437047

Abstract: In accordance with an example embodiment of the present invention, disclosed is a method and an apparatus thereof for controlling a concealment method for a lost audio frame of a received audio signal. A method for a decoder of concealing a lost audio frame comprises detecting in a property of the previously received and reconstructed audio signal, or in a statistical property of observed frame losses, a condition for which the substitution of a lost frame provides relatively reduced quality. In case such a condition is detected, the concealment method is modified by selectively adjusting a phase or a spectrum magnitude of a substitution frame spectrum.

Type: Grant

Filed: December 19, 2019

Date of Patent: September 6, 2022

Assignee: Telefonaktiebolaget L M Ericsson (publ)

Inventors: Stefan Bruhn, Jonas Svedberg
Artificial intelligence based audio coding

Patent number: 11437050

Abstract: Techniques are described for coding audio signals. For example, using a neural network, a residual signal is generated for a sample of an audio signal based on inputs to the neural network. The residual signal is configured to excite a long-term prediction filter and/or a short-term prediction filter. Using the long-term prediction filter and/or the short-term prediction filter, a sample of a reconstructed audio signal is determined. The sample of the reconstructed audio signal is determined based on the residual signal generated using the neural network for the sample of the audio signal.

Type: Grant

Filed: December 10, 2019

Date of Patent: September 6, 2022

Assignee: QUALCOMM Incorporated

Inventors: Zisis Iason Skordilis, Vivek Rajendran, Guillaume Konrad Sautière, Daniel Jared Sinder
High-band signal generation

Patent number: 11437049

Abstract: A device for signal processing includes a memory and a processor. The memory is configured to store a parameter associated with a bandwidth-extended audio stream. The processor is configured to select a plurality of non-linear processing functions based at least in part on a value of the parameter. The processor is also configured to generate a high-band excitation signal based on the plurality of non-linear processing functions.

Type: Grant

Filed: October 28, 2020

Date of Patent: September 6, 2022

Assignee: QUALCOMM Incorporated

Inventors: Venkatraman Atti, Venkata Subrahmanyam Chandra Sekhar Chebiyyam
Task flow identification based on user intent

Patent number: 11423886

Abstract: The intelligent automated assistant system engages with the user in an integrated, conversational manner using natural language dialog, and invokes external services when appropriate to obtain information or perform various actions. The system can be implemented using any of a number of different platforms, such as the web, email, smartphone, and the like, or any combination thereof. In one embodiment, the system is based on sets of interrelated domains and tasks, and employs additional functionality powered by external services with which the system can interact.

Type: Grant

Filed: May 20, 2020

Date of Patent: August 23, 2022

Assignee: Apple Inc.

Inventors: Thomas Robert Gruber, Adam John Cheyer, Dag Kittlaus, Didier Rene Guzzoni, Christopher Dean Brigham, Richard Donald Giuli, Marcello Bastea-Forte, Harry Joseph Saddler
Electronic apparatus and control method for controlling a device in an Internet of Things

Patent number: 11417338

Abstract: An electronic apparatus and method of controlling the electronic apparatus are provided. The electronic apparatus includes a communicator, a storage storing information on places wherein Internet of Things (IoT) devices are located, and a processor configured to, based on receiving a control signal for controlling an IoT device located in a specific place through the communicator, control the IoT device located in the specific place based on information on the place stored in the storage. The processor is further configured to receive motion information generated based on a motion of a wearable device from the wearable device, identify a place corresponding to the motion information, and store the identified place as information on a place of an IoT device located within a predetermined distance from the wearable device, in the storage.

Type: Grant

Filed: July 26, 2019

Date of Patent: August 16, 2022

Assignee: Samsung Electronics Co., Ltd.

Inventors: Seongil Hahm, Taejun Kwon, Venkatraman Iyer, Daesung An
System and method of diarization and labeling of audio data

Patent number: 11380333

Abstract: Systems and methods of diarization using linguistic labeling include receiving a set of diarized textual transcripts. At least one heuristic is automatedly applied to the diarized textual transcripts to select transcripts likely to be associated with an identified group of speakers. The selected transcripts are analyzed to create at least one linguistic model. The linguistic model is applied to transcripted audio data to label a portion of the transcripted audio data as having been spoken by the identified group of speakers. Still further embodiments of diarization using linguistic labeling may serve to label agent speech and customer speech in a recorded and transcripted customer service interaction.

Type: Grant

Filed: December 4, 2019

Date of Patent: July 5, 2022

Assignee: Verint Systems Inc.

Inventors: Omer Ziv, Ran Achituv, Ido Shapira, Jeremie Dreyfuss
System and method for streaming end-to-end speech recognition with asynchronous decoders pruning prefixes using a joint label and frame information in transcribing technique

Patent number: 11373639

Abstract: A speech recognition system successively processes each encoder state of encoded acoustic features with frame-synchronous decoder (FSD) and label-synchronous decoder (LSD) modules. Upon identifying an encoder state carrying information about new transcription output, the system expands a current list of FSD prefixes with the FSD module, evaluates the FSD prefixes with the LSD module, and prunes the FSD prefixes according to joint FSD and LSD scores. The FSD and LSD modules are synchronized by having the LSD module process the portion of the encoder states including new transcription output identified by the FSD module and produce LSD scores for the FSD prefixes determined by the FSD module.

Type: Grant

Filed: December 12, 2019

Date of Patent: June 28, 2022

Assignee: Mitsubishi Electric Research Laboratories, Inc.

Inventors: Niko Moritz, Takaaki Hori, Jonathan Le Roux
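
The pruning step in the abstract of patent 11373639 above can be pictured with a small sketch. It assumes each prefix already has a log-domain FSD score and LSD score, and that the joint score is a weighted sum of the two. The weighting, the `beam_size` parameter, and the function name are assumptions for illustration; the abstract only says that prefixes are pruned according to joint FSD and LSD scores.

```python
def prune_prefixes(fsd_scores, lsd_scores, beam_size=2, lsd_weight=0.5):
    """Keep the beam_size prefixes with the highest joint score.

    fsd_scores and lsd_scores map each prefix to a log-domain score.
    The linear interpolation below is an assumed way to combine them.
    """
    joint = {
        prefix: (1.0 - lsd_weight) * fsd_scores[prefix] + lsd_weight * lsd_scores[prefix]
        for prefix in fsd_scores
    }
    # Sort prefixes from best (highest joint score) to worst.
    ranked = sorted(joint, key=joint.get, reverse=True)
    return ranked[:beam_size]


if __name__ == "__main__":
    fsd = {"a": -1.0, "ab": -2.0, "b": -0.5}
    lsd = {"a": -1.0, "ab": -0.5, "b": -4.0}
    print(prune_prefixes(fsd, lsd))
```

In this toy input, the prefix "b" looks best to the FSD alone but is dropped once the LSD score is taken into account, which is the kind of disagreement a joint score is meant to resolve.
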
System and method of diarization and labeling of audio data

Patent number: 11367450

Abstract: Systems and methods of diarization using linguistic labeling include receiving a set of diarized textual transcripts. At least one heuristic is automatedly applied to the diarized textual transcripts to select transcripts likely to be associated with an identified group of speakers. The selected transcripts are analyzed to create at least one linguistic model. The linguistic model is applied to transcripted audio data to label a portion of the transcripted audio data as having been spoken by the identified group of speakers. Still further embodiments of diarization using linguistic labeling may serve to label agent speech and customer speech in a recorded and transcripted customer service interaction.

Type: Grant

Filed: December 4, 2019

Date of Patent: June 21, 2022

Assignee: Verint Systems Inc.

Inventors: Omer Ziv, Ran Achituv, Ido Shapira, Jeremie Dreyfuss
Deep learning segmentation of audio using magnitude spectrogram

Patent number: 11355134

Abstract: A method, system, and computer readable medium for decomposing an audio signal into different isolated sources. The techniques and mechanisms convert an audio signal into K input spectrogram fragments. The fragments are sent into a deep neural network to isolate for different sources. The isolated fragments are then combined to form full isolated source audio signals.

Type: Grant

Filed: October 2, 2020

Date of Patent: June 7, 2022

Assignee: AUDIOSHAKE, INC.

Inventor: Luke Miner
Speech recognition using two language models

Patent number: 11341972

Abstract: In one aspect, a method comprises accessing audio data generated by a computing device based on audio input from a user, the audio data encoding one or more user utterances. The method further comprises generating a first transcription of the utterances by performing speech recognition on the audio data using a first speech recognizer that employs a language model based on user-specific data. The method further comprises generating a second transcription of the utterances by performing speech recognition on the audio data using a second speech recognizer that employs a language model independent of user-specific data. The method further comprises determining that the second transcription of the utterances includes a term from a predefined set of one or more terms. The method further comprises, based on determining that the second transcription of the utterance includes the term, providing an output of the first transcription of the utterance.

Type: Grant

Filed: October 22, 2020

Date of Patent: May 24, 2022

Assignee: Google LLC

Inventors: Alexander H. Gruenstein, Petar Aleksic
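
The selection logic in the abstract of patent 11341972 above can be sketched as follows, assuming the two transcriptions have already been produced by the two recognizers. The trigger terms, the function name, and the fallback to the second transcription when no term matches are assumptions; the abstract only states what happens when the term is found.

```python
# Hypothetical predefined set of terms; the abstract does not list any.
TRIGGER_TERMS = {"call", "text"}


def choose_output(first_transcription, second_transcription, trigger_terms=TRIGGER_TERMS):
    """Return the user-specific transcription when the generic one contains a trigger term.

    first_transcription: from the recognizer whose language model uses user-specific data.
    second_transcription: from the recognizer whose language model is independent of it.
    """
    words = set(second_transcription.lower().split())
    if words & set(trigger_terms):
        return first_transcription
    # Assumed fallback; this branch is not described in the abstract.
    return second_transcription


if __name__ == "__main__":
    print(choose_output("call Zsofia", "call Sophia"))
    print(choose_output("whether today", "weather today"))
```

The intuition is that a term such as "call" signals that a contact name probably follows, so the recognizer that knows the user's contacts is the better source for that utterance.
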
Diarization using linguistic labeling

Patent number: 11322154

Abstract: Systems and methods of diarization using linguistic labeling include receiving a set of diarized textual transcripts. At least one heuristic is automatedly applied to the diarized textual transcripts to select transcripts likely to be associated with an identified group of speakers. The selected transcripts are analyzed to create at least one linguistic model. The linguistic model is applied to transcripted audio data to label a portion of the transcripted audio data as having been spoken by the identified group of speakers. Still further embodiments of diarization using linguistic labeling may serve to label agent speech and customer speech in a recorded and transcribed customer service interaction.

Type: Grant

Filed: December 4, 2019

Date of Patent: May 3, 2022

Assignee: Verint Systems Inc.

Inventors: Omer Ziv, Ran Achituv, Ido Shapira, Jeremie Dreyfuss
Methods, systems, and media for connecting an IoT device to a call

Patent number: 11315554

Abstract: Methods, systems, and media for connecting an IoT device to a call are provided. In some embodiments, a method is provided, the method comprising: establishing, at a first end-point device, a telecommunication channel with a second end-point device; subsequent to establishing the telecommunication channel, and prior to a termination of the telecommunication channel, detecting, using the first end-point device, a voice command that includes a keyword; and in response to detecting the voice command, causing information associated with an IoT device that corresponds to the keyword to be transmitted to the second end-point device.

Type: Grant

Filed: March 15, 2018

Date of Patent: April 26, 2022

Assignee: Google LLC

Inventors: Saptarshi Bhattacharya, Shreedhar Madhavapeddi
Systems and methods for voice-based initiation of custom device actions

Patent number: 11314481

Abstract: Systems and methods for enabling voice-based interactions with electronic devices can include a data processing system maintaining a plurality of device action data sets and a respective identifier for each device action data set. The data processing system can receive, from an electronic device, an audio signal representing a voice query and an identifier. The data processing system can identify, using the identifier, a device action data set. The data processing system can identify a device action from the device action data set based on content of the audio signal. The data processing system can then identify, from the device action data set, a command associated with the device action and send the command to the electronic device for execution.

Type: Grant

Filed: May 7, 2018

Date of Patent: April 26, 2022

Assignee: GOOGLE LLC

Inventors: Bo Wang, Venkat Kotla, Chad Yoshikawa, Chris Ramsdale, Pravir Gupta, Alfonso Gomez-Jordana, Kevin Yeun, Jae Won Seo, Lantian Zheng, Sang Soo Sung
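
The lookup chain in the abstract of patent 11314481 above (identifier, then device action data set, then device action, then command) can be sketched with nested dictionaries. The sketch assumes the audio signal has already been transcribed to text and uses simple substring matching; the identifiers, phrases, and command strings are hypothetical.

```python
# Hypothetical device action data sets, keyed by identifier.
# Each data set maps a spoken phrase (device action) to a command.
DEVICE_ACTION_DATA_SETS = {
    "speaker-model-x": {
        "volume up": "CMD_VOLUME_UP",
        "pause": "CMD_PAUSE",
    },
    "lamp-model-y": {
        "turn on": "CMD_POWER_ON",
    },
}


def resolve_command(identifier, query_text, data_sets=DEVICE_ACTION_DATA_SETS):
    """Return the command for the first action phrase found in the query, or None."""
    actions = data_sets.get(identifier)
    if actions is None:
        return None  # Unknown identifier: no data set to search.
    text = query_text.lower()
    for phrase, command in actions.items():
        if phrase in text:
            return command
    return None


if __name__ == "__main__":
    print(resolve_command("lamp-model-y", "Please turn on the light"))
```

Keying the data sets by identifier means the same phrase can map to different commands on different devices, and a phrase that is valid for one device is ignored for another.
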
Methods and apparatus for reconstructing audio signals with decorrelation and differentially coded parameters

Patent number: 11308969

Abstract: A method performed in an audio decoder for decoding M encoded audio channels representing N audio channels is disclosed. The method includes receiving a bitstream containing the M encoded audio channels and a set of spatial parameters, decoding the M encoded audio channels, and extracting the set of spatial parameters from the bitstream. The method also includes analyzing the M audio channels to detect a location of a transient, decorrelating the M audio channels, and deriving N audio channels from the M audio channels and the set of spatial parameters. A first decorrelation technique is applied to a first subset of each audio channel and a second decorrelation technique is applied to a second subset of each audio channel. The first decorrelation technique represents a first mode of operation of a decorrelator, and the second decorrelation technique represents a second mode of operation of the decorrelator.

Type: Grant

Filed: October 5, 2020

Date of Patent: April 19, 2022

Assignee: Dolby Laboratories Licensing Corporation

Inventor: Mark F. Davis
Aligning spike timing of models for machine learning

Patent number: 11302309

Abstract: A technique for aligning spike timing of models is disclosed. A first model having a first architecture trained with a set of training samples is generated. Each training sample includes an input sequence of observations and an output sequence of symbols having different length from the input sequence. Then, one or more second models are trained with the trained first model by minimizing a guide loss jointly with a normal loss for each second model and a sequence recognition task is performed using the one or more second models. The guide loss evaluates dissimilarity in spike timing between the trained first model and each second model being trained.

Type: Grant

Filed: September 13, 2019

Date of Patent: April 12, 2022

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Gakuto Kurata, Kartik Audhkhasi
System and method of traffic sign translation

Patent number: 11301642

Abstract: One general aspect includes a system to translate language exhibited on a publicly viewable sign, the system including: a memory configured to include one or more executable instructions and a processor configured to execute the executable instructions, where the executable instructions enable the processor to carry out the steps of: reviewing the sign; translating relevant information conveyed on the sign from a first language to a second language; and producing an output in an interior of a vehicle, the output based on the second language of the relevant information.

Type: Grant

Filed: April 17, 2019

Date of Patent: April 12, 2022

Assignee: GM GLOBAL TECHNOLOGY OPERATIONS LLC

Inventors: Brunno L. Moretti, Esther Anderson, Luis Goncalves
Two-dimensional smoothing of post-filter masks

Patent number: 11282531

Abstract: A method includes receiving multiple samples of time-domain data that includes noise, computing a first two-dimensional (2D) time-frequency representation of the time-domain data, and processing the first time-frequency representation using a time-frequency noise reduction mask to generate a second, noise-reduced time-frequency representation of the time-domain data. The method also includes generating a time-domain output based on the noise-reduced time-frequency representation.

Type: Grant

Filed: February 3, 2020

Date of Patent: March 22, 2022

Assignee: Bose Corporation

Inventors: Ankita D. Jain, Cristian Marius Hera, Elie Bou Daher
