Patents Examined by Fariba Sirjani
-
Patent number: 11978444
Abstract: A method, system and apparatus to generate an augmented voice command, including identifying a plurality of sounds from a respective plurality of transducers to a smart speaker device, generating a visualization of the sounds using an augmented reality device, wherein one or more of the sounds can be selected using the visualization, and generating the augmented voice command for the smart speaker device, wherein the augmented voice command comprises the one or more sounds selected using the visualization of the augmented reality device.
Type: Grant
Filed: November 24, 2020
Date of Patent: May 7, 2024
Assignee: International Business Machines Corporation
Inventors: Clement Decrop, Tushar Agrawal, Jeremy R. Fox, Sarbajit K. Rakshit
-
Patent number: 11967340
Abstract: Disclosed is a method for detecting a voice from audio data, performed by a computing device according to an exemplary embodiment of the present disclosure. The method includes obtaining audio data; generating image data based on a spectrum of the obtained audio data; analyzing the generated image data by utilizing a pre-trained neural network model; and determining whether an automated response system (ARS) voice is included in the audio data, based on the analysis of the image data.
Type: Grant
Filed: June 23, 2023
Date of Patent: April 23, 2024
Assignee: ActionPower Corp.
Inventors: Subong Choi, Dongchan Shin, Jihwa Lee
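The first step of this pipeline, rendering an audio spectrum as image data for a neural classifier, can be sketched in a few lines. This is a toy illustration, not the patent's implementation; the function name, FFT size, hop length, and 0-255 grayscale scaling are all invented for the sketch:

```python
import numpy as np

def spectrogram_image(audio, n_fft=256, hop=128):
    """Turn a 1-D audio signal into a 2-D log-magnitude spectrogram scaled
    to 0-255, so it can be fed to an image model as a grayscale picture."""
    window = np.hanning(n_fft)
    frames = [np.abs(np.fft.rfft(audio[s:s + n_fft] * window))
              for s in range(0, len(audio) - n_fft + 1, hop)]
    spec = np.log(np.array(frames).T + 1e-9)      # rows: frequency, cols: time
    spec -= spec.min()                            # shift to start at zero
    spec = spec / (spec.max() + 1e-9) * 255.0     # normalize to 8-bit range
    return spec.astype(np.uint8)

img = spectrogram_image(np.random.randn(4000))
```

The resulting array has one row per frequency bin (`n_fft // 2 + 1`) and one column per analysis frame, which is the shape image models expect for a single-channel input.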
-
Patent number: 11929087
Abstract: Systems and methods for amplifying and/or attenuating audio signals are disclosed. In one implementation, a system for selectively amplifying audio signals may include at least one microphone for capturing sounds from an environment of the user and a processor. The processor may be programmed to receive an audio signal representative of sounds captured by the at least one microphone; determine whether the audio signal comprises speech by a user of the system; subject to the audio signal comprising speech by the user, modify the audio signal by attenuating a first part of the audio signal comprising the speech by the user; subject to the audio signal comprising audio other than speech by the user, modify the audio signal by amplifying a second part of the audio signal comprising audio other than the speech by the user; and transmit the modified audio signal to a hearing interface device.
Type: Grant
Filed: September 16, 2021
Date of Patent: March 12, 2024
Assignee: ORCAM TECHNOLOGIES LTD.
Inventors: Tal Rosenwein, Roi Nathan, Ronen Katsir, Yonatan Wexler, Amnon Shashua
-
Patent number: 11894012
Abstract: Disclosed are methods, systems, devices, and other implementations, including a method that includes receiving an audio signal representation, detecting in the received audio signal representation, using a first learning model, one or more silent intervals with reduced foreground sound levels, determining based on the detected one or more silent intervals an estimated full noise profile corresponding to the audio signal representation, and generating with a second learning model, based on the received audio signal representation and on the determined estimated full noise profile, a resultant audio signal representation with a reduced noise level.
Type: Grant
Filed: May 19, 2023
Date of Patent: February 6, 2024
Assignees: SoftBank Corp.
Inventors: Changxi Zheng, Ruilin Xu, Rundi Wu, Carl Vondrick, Yuko Ishiwaka
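The middle step of this pipeline, turning detected silent intervals into a noise profile and using it to denoise, can be illustrated without the learning models. A toy sketch with invented names; the patent's two neural models are replaced here by plain averaging and magnitude subtraction:

```python
import numpy as np

def noise_profile(spec, silent_frames):
    """Average the magnitude spectrogram over frames flagged as silent
    to estimate the stationary noise floor per frequency bin."""
    return spec[:, silent_frames].mean(axis=1, keepdims=True)

def spectral_subtract(spec, profile):
    """Subtract the noise profile from every frame, clamping at zero."""
    return np.maximum(spec - profile, 0.0)

# Columns 0-1 are silent (noise only); column 2 carries signal on top.
spec = np.array([[1.0, 1.0, 3.0],
                 [2.0, 2.0, 5.0]])
clean = spectral_subtract(spec, noise_profile(spec, [0, 1])) 
```

The key idea the abstract describes survives even in this crude form: frames with no foreground sound reveal the noise, and that estimate is applied to the whole signal.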
-
Patent number: 11881221
Abstract: Systems and methods are disclosed. A digitized human vocal expression of a user and digital images are received over a network from a remote device. The digitized human vocal expression is processed to determine characteristics of the human vocal expression, including pitch, volume, rapidity, a magnitude spectrum, and/or pauses in speech. Digital images are received and processed to detect characteristics of the user's face, including detecting whether any of the following is present: a sagging lip, a crooked smile, uneven eyebrows, and/or facial droop. Using the human vocal expression characteristics and face characteristics, a determination is made as to what action is to be taken. A cepstrum pitch may be determined using an inverse Fourier transform of a logarithm of a spectrum of a human vocal expression signal. The volume may be determined using peak heights in a power spectrum of the human vocal expression.
Type: Grant
Filed: June 30, 2022
Date of Patent: January 23, 2024
Assignee: The Notebook, LLC
Inventor: Karen Elaine Khaleghi
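The cepstrum pitch described in the abstract, an inverse Fourier transform of the log spectrum, is standard DSP and can be shown concretely. A minimal sketch; the function name, search band, and floor constant are choices made for the example, not taken from the patent:

```python
import numpy as np

def cepstrum_pitch(signal, sample_rate, f_min=150.0, f_max=400.0):
    """Estimate pitch from the real cepstrum: the inverse Fourier transform
    of the log magnitude spectrum, peak-picked over the quefrency range
    corresponding to the expected pitch band."""
    log_mag = np.log(np.abs(np.fft.rfft(signal)) + 1e-12)  # avoid log(0)
    cepstrum = np.fft.irfft(log_mag)
    q_min = int(sample_rate / f_max)   # shortest pitch period considered
    q_max = int(sample_rate / f_min)   # longest pitch period considered
    peak = q_min + np.argmax(cepstrum[q_min:q_max])
    return sample_rate / peak

# Quick check on a harmonic-rich 200 Hz tone.
sr = 16000
t = np.arange(sr) / sr
tone = sum(np.sin(2 * np.pi * 200 * k * t) / k for k in range(1, 6))
pitch = cepstrum_pitch(tone, sr)
```

A periodic voice signal produces regularly spaced harmonics in the log spectrum; the inverse transform concentrates that regularity into a peak at the quefrency equal to the pitch period, which is why peak-picking recovers the fundamental.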
-
Patent number: 11875817
Abstract: A technique is provided for detecting harmful behavior, such as power harassment, sexual harassment, or bullying, in a work environment and supporting its handling. A harmful behavior detecting system includes a computer that executes observation and detection of harmful behavior, including power harassment, sexual harassment, and bullying, among people in a work environment. The computer obtains voice data capturing speech around a target person; obtains voice information containing words and emotion information from the voice data; and obtains data such as vital data, date and time, or a location of the target person. The computer uses five elements, namely the words and emotion of the other person and the words, emotion, and vital data of the target person, to calculate an index value regarding the harmful behavior; estimates a state of the harmful behavior based on the index value; and outputs handling data for handling the harmful behavior in accordance with the estimated state.
Type: Grant
Filed: November 12, 2019
Date of Patent: January 16, 2024
Assignee: Hitachi Systems Ltd.
Inventors: Satoshi Iwagaki, Atsushi Shimada, Masumi Suehiro, Hidenori Chiba, Kouichi Horiuchi
-
Patent number: 11875813
Abstract: Disclosed are methods, systems, devices, and other implementations, including a method (performed by, for example, a hearing aid device) that includes obtaining a combined sound signal for signals combined from multiple sound sources in an area in which a person is located, and obtaining neural signals for the person, with the neural signals being indicative of one or more target sound sources, from the multiple sound sources, the person is attentive to. The method further includes determining a separation filter based, at least in part, on the neural signals obtained for the person, and applying the separation filter to a representation of the combined sound signal to derive a resultant separated signal representation associated with sound from the one or more target sound sources the person is attentive to.
Type: Grant
Filed: March 31, 2023
Date of Patent: January 16, 2024
Assignee: The Trustees of Columbia University in the City of New York
Inventors: Nima Mesgarani, Enea Ceolini, Cong Han
-
Patent number: 11869514
Abstract: An apparatus for decoding an audio signal includes a receiving interface, wherein the receiving interface is configured to receive a first frame and a second frame. Moreover, the apparatus includes a noise level tracing unit for determining noise level information being represented in a tracing domain. Furthermore, the apparatus includes a first reconstruction unit for reconstructing a third audio signal portion of the audio signal depending on the noise level information and a second reconstruction unit for reconstructing a fourth audio signal portion depending on noise level information being represented in the second reconstruction domain.
Type: Grant
Filed: April 15, 2020
Date of Patent: January 9, 2024
Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
Inventors: Michael Schnabel, Goran Markovic, Ralph Sperschneider, Jérémie Lecomte, Christian Helmrich
-
Patent number: 11861305
Abstract: Provided is a word processing system which includes: a first generation unit which generates, based on sentence information including a plurality of sentences, hierarchy data indicating syntax tree data for each hierarchy with regard to each sentence; a second generation unit which acquires, from a plurality of hierarchy data generated by the first generation unit, hierarchy data of a second sentence similar to hierarchy data of a first sentence generated by the first generation unit, extracts a difference between the hierarchy data of the first sentence and the hierarchy data of the second sentence, and generates, as paraphrasing rule data, first expression data as a difference in the first sentence and second expression data as a difference in the second sentence; and a storage unit which stores the paraphrasing rule data generated by the second generation unit.
Type: Grant
Filed: December 7, 2020
Date of Patent: January 2, 2024
Assignee: HITACHI, LTD.
Inventors: Yudai Kato, Noriko Takaya, Takahiro Hamada, Junya Sawazaki
-
Patent number: 11862178
Abstract: An electronic device and method are provided. The method includes identifying a speech section of a user and a speech section of a neighbor in a received audio signal, identifying a user utterance in the speech section of the user and a neighbor answer to the user utterance in the speech section of the neighbor, obtaining preference information associated with the user utterance, giving a first reliability to the neighbor answer and a second reliability to an agent answer of an artificial intelligence agent generated in response to the user utterance, based on the preference information, not responding to the user utterance when the second reliability is lower than the first reliability, and outputting the agent answer when the second reliability is equal to or higher than the first reliability.
Type: Grant
Filed: January 10, 2022
Date of Patent: January 2, 2024
Assignee: Samsung Electronics Co., Ltd.
Inventors: Hoseon Shin, Chulmin Lee
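The response-gating rule at the end of this abstract reduces to a single comparison. A minimal sketch with invented names; how the two reliabilities are computed from preference information is the substantive part of the patent and is not modeled here:

```python
def agent_response(neighbor_reliability, agent_reliability, agent_answer):
    """Output the agent's answer only when its reliability is equal to or
    higher than the neighbor's; otherwise stay silent (return None)."""
    if agent_reliability < neighbor_reliability:
        return None              # defer to the human neighbor's answer
    return agent_answer
```

Note the asymmetry the abstract specifies: a tie goes to the agent ("equal to or higher"), so only a strictly lower agent reliability suppresses the response.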
-
Patent number: 11862141
Abstract: The present technology relates to a signal processing device, a signal processing method, and a program that allow for easier sound source separation. The signal processing device includes a sound source separation unit that recursively performs sound source separation on an input acoustic signal by using a predetermined sound source separation model learned in advance to separate a predetermined sound source from an acoustic signal for learning including the predetermined sound source. The present technology can be applied to a signal processing device.
Type: Grant
Filed: March 13, 2020
Date of Patent: January 2, 2024
Assignee: SONY GROUP CORPORATION
Inventor: Naoya Takahashi
-
Patent number: 11848025
Abstract: In a method for efficiently and accurately measuring the intelligibility of speech, a user may utter a sample text, and an automatic speech assessment (ASA) system may receive an acoustic signal encoding the utterance. An automatic speech recognition (ASR) module may generate an N-best output corresponding to the utterance and generate an intelligibility score representing the intelligibility of the utterance based on the N-best output and the sample text. Generating the intelligibility score may involve (1) calculating conditional intelligibility value(s) for the N recognition result(s), and (2) determining the intelligibility score based on the conditional intelligibility value of the most intelligible recognition result. Optionally, the process of generating the intelligibility score may involve adjusting the intelligibility score to account for environmental information (e.g., a pronunciation score for the user's speech and/or a confidence score assigned to the 1-best recognition result).
Type: Grant
Filed: January 15, 2021
Date of Patent: December 19, 2023
Assignee: ELSA, Corp.
Inventors: Jorge Daniel Leonardo Proença, Xavier Anguera Miro, Ganna Raboshchuk, Ângela Maria Pereira da Costa
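Steps (1) and (2) of the scoring process can be illustrated with a crude word-level alignment standing in for the ASR internals. Everything here is invented for the sketch: the function names, and the use of a longest-common-subsequence ratio as the "conditional intelligibility" measure:

```python
def conditional_intelligibility(reference, hypothesis):
    """Fraction of reference words recovered, via a word-level longest
    common subsequence (a crude stand-in for an ASR alignment score)."""
    ref, hyp = reference.lower().split(), hypothesis.lower().split()
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i, rw in enumerate(ref):
        for j, hw in enumerate(hyp):
            dp[i + 1][j + 1] = (dp[i][j] + 1 if rw == hw
                                else max(dp[i][j + 1], dp[i + 1][j]))
    return dp[-1][-1] / len(ref) if ref else 0.0

def intelligibility_score(sample_text, n_best):
    """Step (2): score the utterance by its most intelligible result."""
    return max(conditional_intelligibility(sample_text, h) for h in n_best)
```

Taking the maximum over the N-best list captures the abstract's idea that the utterance is as intelligible as its best-matching recognition hypothesis, even when the 1-best result is not the closest match.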
-
Patent number: 11837219
Abstract: In several aspects for creating a minute of a teleconference from a record thereof, a processor classifies portions of the record as relevant or non-relevant according to corresponding relevance indicators with respect to a topic of the teleconference. A processor removes the non-relevant portions from the record. A processor classifies pairs of relevant portions as similar or non-similar according to corresponding similarity indicators. A processor removes one of the relevant portions of each similar pair of relevant portions from the minute.
Type: Grant
Filed: November 18, 2021
Date of Patent: December 5, 2023
Assignee: International Business Machines Corporation
Inventors: Damiano Bassani, Alfonso D'Aniello, Andrea Tortosa, Roberto Giordani, Michela Melfa
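The two passes described here, a relevance filter followed by similar-pair removal, can be sketched with simple word-overlap indicators. This is a toy illustration; the function names, Jaccard similarity, and both thresholds are invented stand-ins for the patent's relevance and similarity indicators:

```python
def jaccard(a, b):
    """Word-set overlap as a stand-in similarity indicator."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def make_minute(portions, topic_words, rel_thresh=0.1, sim_thresh=0.6):
    """Keep portions relevant to the topic, then drop one member of each
    similar pair, mirroring the two classification passes."""
    topic = {w.lower() for w in topic_words}
    relevant = [p for p in portions
                if len(topic & set(p.lower().split())) / max(len(topic), 1) >= rel_thresh]
    minute = []
    for p in relevant:
        # Drop this portion if it is too similar to one already kept.
        if all(jaccard(p, kept) < sim_thresh for kept in minute):
            minute.append(p)
    return minute

portions = ["budget review for q3", "budget review for q3 please",
            "lunch plans", "q3 budget approved"]
minute = make_minute(portions, ["budget", "q3"])
```

Off-topic chatter is removed by the first pass and near-duplicate statements by the second, leaving one representative of each relevant point.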
-
Patent number: 11830481
Abstract: Methods are performed by one or more processing devices for correcting prosody in audio data. A method includes operations for accessing subject audio data in an audio edit region of the audio data. The subject audio data in the audio edit region potentially lacks prosodic continuity with unedited audio data in an unedited audio portion of the audio data. The operations further include predicting, based on a context of the unedited audio data, phoneme durations including a respective phoneme duration of each phoneme in the unedited audio data. The operations further include predicting, based on the context of the unedited audio data, a pitch contour comprising at least one respective pitch value of each phoneme in the unedited audio data. Additionally, the operations include correcting prosody of the subject audio data in the audio edit region by applying the phoneme durations and the pitch contour to the subject audio data.
Type: Grant
Filed: November 30, 2021
Date of Patent: November 28, 2023
Assignee: Adobe Inc.
Inventors: Maxwell Morrison, Zeyu Jin, Nicholas Bryan, Juan Pablo Caceres Chomali, Lucas Rencker
-
Patent number: 11817117
Abstract: In various examples, end of speech (EOS) for an audio signal is determined based at least in part on a rate of speech for a speaker. For a segment of the audio signal, EOS is indicated based at least in part on an EOS threshold determined based at least in part on the rate of speech for the speaker.
Type: Grant
Filed: January 29, 2021
Date of Patent: November 14, 2023
Assignee: NVIDIA CORPORATION
Inventors: Utkarsh Vaidya, Ravindra Yeshwant Lokhande, Viraj Gangadhar Karandikar, Niranjan Rajendra Wartikar, Sumit Kumar Bhattacharya
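The idea of a rate-adjusted EOS threshold can be sketched directly. A minimal illustration; the 800 ms base threshold, the 2.5 words/s nominal rate, and the inverse-proportional scaling are invented for the example, not taken from the patent:

```python
def eos_threshold(base_ms, words_per_second, nominal_wps=2.5):
    """Scale a base end-of-speech silence threshold by the speaker's
    rate: fast talkers leave shorter gaps, so the threshold shrinks."""
    rate = max(words_per_second, 1e-6)   # guard against division by zero
    return base_ms * (nominal_wps / rate)

def is_end_of_speech(silence_ms, words_per_second, base_ms=800.0):
    """Flag EOS once observed silence exceeds the rate-adjusted threshold."""
    return silence_ms >= eos_threshold(base_ms, words_per_second)
```

The payoff is latency: for a fast speaker the recognizer can commit to end-of-speech after a shorter pause, while a slow speaker's natural gaps are not mistaken for the end of the utterance.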
-
Patent number: 11805378
Abstract: Systems, apparatuses, and methods are described for a privacy blocking device configured to prevent receipt, by a listening device, of video and/or audio data until a trigger occurs. A blocker may be configured to prevent receipt of video and/or audio data by one or more microphones and/or one or more cameras of a listening device. The blocker may use the one or more microphones, the one or more cameras, and/or one or more second microphones and/or one or more second cameras to monitor for a trigger. The blocker may process the data. Upon detecting the trigger, the blocker may transmit data to the listening device. For example, the blocker may transmit all or a part of a spoken phrase to the listening device.
Type: Grant
Filed: October 29, 2020
Date of Patent: October 31, 2023
Inventor: Thomas Stachura
-
Patent number: 11769487
Abstract: A voice topic spotting system includes a learning module and a voice topic classifier module. The learning module receives training audio segments with topic labels, generates a fast keyword filter model based on a set of topic-indicative words, and generates a topic identification model based on a training set of topic keyword-containing lattices. The voice topic classifier module includes an automatic speech recognition engine arranged to identify one or more keywords included in a received audio segment and output the one or more keywords. A fast keyword filter implements the fast keyword filter model to output the received audio segment if a topic-indicative word is detected in the audio segment. A decoder generates a topic keyword-containing lattice associated with the audio segment. A voice topic classifier implements the topic identification model to determine a topic associated with the received audio segment.
Type: Grant
Filed: March 16, 2021
Date of Patent: September 26, 2023
Assignee: RAYTHEON APPLIED SIGNAL TECHNOLOGY, INC.
Inventor: Jonathan C. Wintrode
-
Patent number: 11769493
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training acoustic models and using the trained acoustic models. A connectionist temporal classification (CTC) acoustic model is accessed, the CTC acoustic model having been trained using a context-dependent state inventory generated from approximate phonetic alignments determined by another CTC acoustic model trained without fixed alignment targets. Audio data for a portion of an utterance is received. Input data corresponding to the received audio data is provided to the accessed CTC acoustic model. Data indicating a transcription for the utterance is generated based on output that the accessed CTC acoustic model produced in response to the input data. The data indicating the transcription is provided as output of an automated speech recognition service.
Type: Grant
Filed: May 3, 2022
Date of Patent: September 26, 2023
Assignee: Google LLC
Inventors: Kanury Kanishka Rao, Andrew W. Senior, Hasim Sak
-
Patent number: 11763833
Abstract: Disclosed are a method, a device, and a computer-readable storage medium for reducing crosstalk when performing automatic speech translation between at least two users speaking different languages. The method for reducing crosstalk includes receiving a signal inputted to an out-ear microphone of a first user, wherein the first user is wearing a headset equipped with an in-ear microphone and the out-ear microphone and the signal includes a voice signal A of the first user and a voice signal B of a second user, receiving a voice signal B_in-ear inputted to an in-ear microphone of the second user, wherein the second user is wearing a headset equipped with the in-ear microphone and an out-ear microphone, and removing the voice signal B of the second user from the signal A+B inputted to the out-ear microphone of the first user, based on the voice signal B_in-ear inputted to the in-ear microphone of the second user.
Type: Grant
Filed: October 31, 2019
Date of Patent: September 19, 2023
Inventor: Jung Keun Kim
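The removal step, subtracting the second user's voice from the first user's out-ear signal using the second user's in-ear signal as a reference, can be sketched with a single least-squares gain. This is a deliberately simplified stand-in for a real adaptive or echo-cancelling filter, and all names are invented:

```python
import numpy as np

def remove_crosstalk(mixed, reference):
    """Remove a scaled copy of the reference (the second speaker's in-ear
    signal) from the mixed out-ear signal. A single least-squares gain
    stands in for a proper multi-tap adaptive filter."""
    g = np.dot(mixed, reference) / (np.dot(reference, reference) + 1e-12)
    return mixed - g * reference

a = np.array([1.0, 2.0, -1.0, 0.5])   # first user's own voice
b = np.array([0.5, -1.0, 2.0, 1.0])   # second user's leaked voice
out = remove_crosstalk(a + 0.8 * b, b)
```

By construction the residual is orthogonal to the reference signal, i.e. the component correlated with the second speaker's voice has been projected out; a practical system would use a time-varying filter to handle acoustic delay and coloration between the two microphones.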
-
Patent number: 11763836
Abstract: Disclosed is a hierarchical generated-audio detection system, comprising an audio preprocessing module, a CQCC feature extraction module, an LFCC feature extraction module, a first-stage lightweight coarse-level detection model, and a second-stage fine-level deep identification model. The audio preprocessing module preprocesses collected audio or video data to obtain an audio clip with a length not exceeding the limit; the audio clip is input into the CQCC feature extraction module and the LFCC feature extraction module respectively to obtain the CQCC feature and the LFCC feature; the CQCC feature or LFCC feature is input into the first-stage lightweight coarse-level detection model for first-stage screening to screen out the first-stage real audio and the first-stage generated audio; the CQCC feature or LFCC feature of the first-stage generated audio is input into the second-stage fine-level deep identification model to identify the second-stage real audio and the second-stage generated audio, and the second-stage generated audio …
Type: Grant
Filed: February 17, 2022
Date of Patent: September 19, 2023
Assignee: INSTITUTE OF AUTOMATION, CHINESE ACADEMY OF SCIENCES
Inventors: Jianhua Tao, Zhengkun Tian, Jiangyan Yi
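The cascade logic of the two-stage screening, where a cheap coarse model labels every clip and only clips it flags as generated reach the deeper model, can be sketched directly. A toy illustration; the stand-in models and their score thresholds are invented, and the feature extraction is not modeled:

```python
def cascade_detect(clips, coarse_model, fine_model):
    """Two-stage screening: clips the lightweight coarse model accepts as
    real stop there; the rest get a second look from the deep model."""
    labels = []
    for clip in clips:
        if coarse_model(clip) == "real":
            labels.append("real")            # accepted at stage one
        else:
            labels.append(fine_model(clip))  # stage-two deep check
    return labels

# Stand-in models over a synthetic "suspicion score" per clip:
# below 0.3 looks real to the coarse model; the fine model only calls
# scores above 0.7 generated.
coarse = lambda s: "real" if s < 0.3 else "generated"
fine = lambda s: "generated" if s > 0.7 else "real"
labels = cascade_detect([0.1, 0.5, 0.9], coarse, fine)
```

The design point of such a hierarchy is cost: most clips are dismissed by the lightweight first stage, so the expensive fine-level model only runs on the suspicious minority.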