Patents Examined by Fariba Sirjani
  • Patent number: 11978444
    Abstract: A method, system and apparatus to generate an augmented voice command, including identifying a plurality of sounds from a respective plurality of transducers to a smart speaker device, generating a visualization of the sounds using an augmented reality device, wherein one or more of the sounds can be selected using the visualization, and generating the augmented voice command for the smart speaker device, wherein the augmented voice command comprises the one or more sounds selected using the visualization of the augmented reality device.
    Type: Grant
    Filed: November 24, 2020
    Date of Patent: May 7, 2024
    Assignee: International Business Machines Corporation
    Inventors: Clement Decrop, Tushar Agrawal, Jeremy R. Fox, Sarbajit K Rakshit
  • Patent number: 11967340
    Abstract: Disclosed is a method for detecting a voice from audio data, performed by a computing device according to an exemplary embodiment of the present disclosure. The method includes obtaining audio data; generating image data based on a spectrum of the obtained audio data; analyzing the generated image data by utilizing a pre-trained neural network model; and determining whether an automated response system (ARS) voice is included in the audio data, based on the analysis of the image data.
    Type: Grant
    Filed: June 23, 2023
    Date of Patent: April 23, 2024
    Assignee: ActionPower Corp.
    Inventors: Subong Choi, Dongchan Shin, Jihwa Lee
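The "image data from a spectrum" step in this claim can be sketched as a short STFT that stacks log-magnitude frames into a 2-D array; the frame/hop sizes and log scaling are assumptions, and the pre-trained neural network model that consumes the image is omitted:

```python
import numpy as np

def spectrogram_image(audio, frame=256, hop=128):
    """Frame the signal, window, FFT, and stack log-magnitudes into a
    2-D "image" a CNN-style model could analyze."""
    win = np.hanning(frame)
    n = 1 + (len(audio) - frame) // hop
    frames = np.stack([audio[i*hop : i*hop + frame] * win for i in range(n)])
    mag = np.abs(np.fft.rfft(frames, axis=1))   # (n_frames, frame//2 + 1)
    return np.log1p(mag).T                      # freq x time, image-like

sr = 8000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)              # stand-in for obtained audio data
img = spectrogram_image(tone)
```

A classifier trained on such images would then decide whether an ARS voice is present.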
  • Patent number: 11929087
    Abstract: Systems and methods for amplifying and/or attenuating audio signals are disclosed. In one implementation, a system for selectively amplifying audio signals may include at least one microphone for capturing sounds from an environment of the user and a processor. The processor may be programmed to receive an audio signal representative of sounds captured by the at least one microphone; determine whether the audio signal comprises speech by a user of the system; subject to the audio signal comprising speech by the user, modify the audio signal by attenuating a first part of the audio signal comprising the speech by the user; subject to the audio signal comprising audio other than speech by the user, modify the audio signal by amplifying a second part of the audio signal comprising audio other than the speech by the user; and transmit the modified audio signal to a hearing interface device.
    Type: Grant
    Filed: September 16, 2021
    Date of Patent: March 12, 2024
    Assignee: ORCAM TECHNOLOGIES LTD.
    Inventors: Tal Rosenwein, Roi Nathan, Ronen Katsir, Yonatan Wexler, Amnon Shashua
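The claim's two branches reduce to a conditional gain: attenuate where the signal is the wearer's own speech, amplify elsewhere. A minimal sketch, where the gain values and the own-speech mask (normally a classifier's output) are assumptions:

```python
import numpy as np

def selective_gain(audio, own_speech_mask, attenuation=0.25, amplification=2.0):
    """Attenuate the part of the signal flagged as the user's own speech
    and amplify everything else, per the claim's two branches."""
    out = audio.astype(float).copy()
    out[own_speech_mask] *= attenuation
    out[~own_speech_mask] *= amplification
    return out

audio = np.ones(4)
own = np.array([True, True, False, False])   # hypothetical own-voice detection
processed = selective_gain(audio, own)
```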
  • Patent number: 11894012
    Abstract: Disclosed are methods, systems, devices, and other implementations, including a method that includes receiving an audio signal representation, detecting in the received audio signal representation, using a first learning model, one or more silent intervals with reduced foreground sound levels, determining based on the detected one or more silent intervals an estimated full noise profile corresponding to the audio signal representation, and generating with a second learning model, based on the received audio signal representation and on the determined estimated full noise profile, a resultant audio signal representation with a reduced noise level.
    Type: Grant
    Filed: May 19, 2023
    Date of Patent: February 6, 2024
    Assignee: SoftBank Corp.
    Inventors: Changxi Zheng, Ruilin Xu, Rundi Wu, Carl Vondrick, Yuko Ishiwaka
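The second stage of this pipeline can be approximated classically: estimate a full noise profile from the frames the first model marked as silent, then remove it everywhere by spectral subtraction. This stands in for the patent's second learning model; frame size and the zero floor are assumptions:

```python
import numpy as np

def silent_interval_denoise(noisy, silent_frames, frame=256):
    """Estimate the noise magnitude profile from silent frames, then
    subtract it from every frame's spectrum (spectral subtraction)."""
    n = len(noisy) // frame
    spec = np.fft.rfft(noisy[:n*frame].reshape(n, frame), axis=1)
    noise_profile = np.abs(spec[silent_frames]).mean(axis=0)   # full noise profile
    clean_mag = np.maximum(np.abs(spec) - noise_profile, 0.0)  # subtract, floor at 0
    clean = np.fft.irfft(clean_mag * np.exp(1j * np.angle(spec)), axis=1)
    return clean.reshape(-1)

rng = np.random.default_rng(0)
noisy = rng.normal(size=2048)                        # stand-in: background noise only
out = silent_interval_denoise(noisy, np.arange(8))   # every frame is "silent" here
```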
  • Patent number: 11881221
    Abstract: Systems and methods are disclosed. A digitized human vocal expression of a user and digital images are received over a network from a remote device. The digitized human vocal expression is processed to determine characteristics of the human vocal expression, including: pitch, volume, rapidity, a magnitude spectrum, and/or pauses in speech. Digital images are received and processed to detect characteristics of the user's face, including detecting if any of the following is present: a sagging lip, a crooked smile, uneven eyebrows, and/or facial droop. Using the human vocal expression characteristics and face characteristics, a determination is made as to what action is to be taken. A cepstrum pitch may be determined using an inverse Fourier transform of a logarithm of a spectrum of a human vocal expression signal. The volume may be determined using peak heights in a power spectrum of the human vocal expression.
    Type: Grant
    Filed: June 30, 2022
    Date of Patent: January 23, 2024
    Assignee: The Notebook, LLC
    Inventor: Karen Elaine Khaleghi
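The cepstrum pitch computation described here (inverse Fourier transform of the log spectrum, then a peak search) can be sketched directly; the 120-400 Hz search range and the log floor are assumptions:

```python
import numpy as np

def cepstrum_pitch(signal, sr, fmin=120.0, fmax=400.0):
    """Cepstrum pitch: inverse FFT of the log-magnitude spectrum, then a
    peak search over quefrencies corresponding to fmin..fmax Hz."""
    log_mag = np.log(np.abs(np.fft.rfft(signal)) + 1e-6)  # regularized log spectrum
    cepstrum = np.fft.irfft(log_mag)
    lo, hi = int(sr // fmax), int(sr // fmin)
    peak = lo + np.argmax(cepstrum[lo:hi])                # quefrency of the pitch peak
    return sr / peak

sr = 16000
t = np.arange(sr) / sr
voiced = sum(np.sin(2 * np.pi * 200.0 * k * t) for k in range(1, 6))  # 200 Hz harmonics
pitch = cepstrum_pitch(voiced, sr)
```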
  • Patent number: 11875817
    Abstract: A technique is provided for detecting harmful behavior such as power harassment, sexual harassment, or bullying in a work environment and supporting its handling. A harmful behavior detecting system includes a computer that executes observation and detection regarding harmful behavior including power harassment, sexual harassment, and bullying among people in a work environment. The computer obtains voice data capturing voice around a target person; obtains voice information containing words and emotion information from the voice data; and obtains data such as vital data, date and time, or a location of the target person. The computer uses five elements including words and an emotion of the other person, words, an emotion, and vital data of the target person to calculate an index value regarding the harmful behavior; estimate a state of the harmful behavior based on the index value; and output handling data for handling the harmful behavior in accordance with the estimated state.
    Type: Grant
    Filed: November 12, 2019
    Date of Patent: January 16, 2024
    Assignee: Hitachi Systems Ltd.
    Inventors: Satoshi Iwagaki, Atsushi Shimada, Masumi Suehiro, Hidenori Chiba, Kouichi Horiuchi
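The five-element index and state estimation can be sketched as a weighted combination plus a threshold map. The weights, the 0-1 element scores, and the state thresholds are all assumptions, not the patented formula:

```python
def harassment_index(words_other, emotion_other, words_target,
                     emotion_target, vitals_target,
                     weights=(0.25, 0.2, 0.2, 0.2, 0.15)):
    """Weighted combination of the five observed elements, each assumed
    to be pre-scored on a 0-1 scale."""
    elements = (words_other, emotion_other, words_target,
                emotion_target, vitals_target)
    return sum(w * e for w, e in zip(weights, elements))

def estimate_state(index, thresholds=(0.3, 0.6)):
    """Map the index value onto a coarse state used to select handling data."""
    if index < thresholds[0]:
        return "normal"
    return "caution" if index < thresholds[1] else "alert"

idx = harassment_index(1.0, 1.0, 1.0, 1.0, 1.0)
```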
  • Patent number: 11875813
    Abstract: Disclosed are methods, systems, devices, and other implementations, including a method (performed by, for example, a hearing aid device) that includes obtaining a combined sound signal for signals combined from multiple sound sources in an area in which a person is located, and obtaining neural signals for the person, with the neural signals being indicative of one or more target sound sources, from the multiple sound sources, the person is attentive to. The method further includes determining a separation filter based, at least in part, on the neural signals obtained for the person, and applying the separation filter to a representation of the combined sound signal to derive a resultant separated signal representation associated with sound from the one or more target sound sources the person is attentive to.
    Type: Grant
    Filed: March 31, 2023
    Date of Patent: January 16, 2024
    Assignee: The Trustees of Columbia University in the City of New York
    Inventors: Nima Mesgarani, Enea Ceolini, Cong Han
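One simplified way to realize a neural-signal-driven separation filter is to weight already-separated sources by how well their envelopes correlate with the decoded neural signal. The correlation-based attention decoding, the weight floor, and the idealized neural envelope below are assumptions:

```python
import numpy as np

def attend(separated, neural_envelope, floor=0.1):
    """Weight separated sources by the correlation of their envelopes with
    the neural signal; the attended source receives the largest weight."""
    envelopes = np.abs(separated)
    scores = np.array([np.corrcoef(e, neural_envelope)[0, 1] for e in envelopes])
    weights = np.maximum(scores, floor)
    weights /= weights.sum()
    return weights @ separated, scores

t = np.arange(4000) / 4000.0
sources = np.stack([np.sin(2*np.pi*440*t), np.sin(2*np.pi*300*t)])
neural = np.abs(sources[0])        # idealized: neural data tracks source 0's envelope
out, scores = attend(sources, neural)
```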
  • Patent number: 11869514
    Abstract: An apparatus for decoding an audio signal includes a receiving interface, wherein the receiving interface is configured to receive a first frame and a second frame. Moreover, the apparatus includes a noise level tracing unit for determining noise level information being represented in a tracing domain. Furthermore, the apparatus includes a first reconstruction unit for reconstructing a third audio signal portion of the audio signal depending on the noise level information and a second reconstruction unit for reconstructing a fourth audio signal portion depending on noise level information being represented in the second reconstruction domain.
    Type: Grant
    Filed: April 15, 2020
    Date of Patent: January 9, 2024
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Michael Schnabel, Goran Markovic, Ralph Sperschneider, Jérémie Lecomte, Christian Helmrich
  • Patent number: 11861305
    Abstract: Provided is a word processing system which includes: a first generation unit which generates, based on sentence information including a plurality of sentences, hierarchy data indicating syntax tree data for each hierarchy with regard to each sentence; a second generation unit which acquires, from a plurality of hierarchy data generated by the first generation unit, hierarchy data of a second sentence similar to hierarchy data of a first sentence generated by the first generation unit, extracts a difference between the hierarchy data of the first sentence and the hierarchy data of the second sentence, and generates, as paraphrasing rule data, first expression data as a difference in the first sentence and second expression data as a difference in the second sentence; and a storage unit which stores the paraphrasing rule data generated by the second generation unit.
    Type: Grant
    Filed: December 7, 2020
    Date of Patent: January 2, 2024
    Assignee: HITACHI, LTD.
    Inventors: Yudai Kato, Noriko Takaya, Takahiro Hamada, Junya Sawazaki
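The rule-extraction step reduces to a set difference over the two sentences' hierarchy data. A minimal sketch in which hierarchy data is modeled as flat (level, node) tuples rather than full syntax trees:

```python
def paraphrasing_rule(hier_first, hier_second):
    """Extract the differing nodes between two similar sentences' hierarchy
    data; the pair (first-side diff, second-side diff) is the rule."""
    set1, set2 = set(hier_first), set(hier_second)
    first_expr = sorted(set1 - set2)    # difference in the first sentence
    second_expr = sorted(set2 - set1)   # difference in the second sentence
    return first_expr, second_expr

rule = paraphrasing_rule(
    [(0, "S"), (1, "bought"), (2, "car")],
    [(0, "S"), (1, "purchased"), (2, "car")],
)
```

Applying the rule later would substitute the first expression for the second (or vice versa) to paraphrase new sentences.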
  • Patent number: 11862178
    Abstract: An electronic device and method are provided. The method includes identifying a speech section of a user and a speech section of a neighbor in a received audio signal, identifying a user utterance in the speech section of the user and a neighbor answer to the user utterance in the speech section of the neighbor, obtaining preference information associated with the user utterance, giving a first reliability to the neighbor answer and a second reliability to an agent answer of an artificial intelligence agent generated in response to the user utterance, based on the preference information, not responding to the user utterance when the second reliability is lower than the first reliability, and outputting the agent answer when the second reliability is equal to or higher than the first reliability.
    Type: Grant
    Filed: January 10, 2022
    Date of Patent: January 2, 2024
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Hoseon Shin, Chulmin Lee
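The response-gating logic in this claim is a direct comparison of the two reliabilities; how the reliabilities are derived from preference information is left abstract here:

```python
def respond(agent_answer, first_reliability, second_reliability):
    """Output the agent answer only when its reliability (second) is at
    least the neighbor answer's reliability (first); otherwise stay silent."""
    if second_reliability >= first_reliability:
        return agent_answer
    return None   # do not respond to the user utterance
```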
  • Patent number: 11862141
    Abstract: The present technology relates to a signal processing device, a signal processing method, and a program that allow for easier sound source separation. The signal processing device includes a sound source separation unit that recursively performs sound source separation on an input acoustic signal by using a predetermined sound source separation model learned in advance to separate a predetermined sound source from an acoustic signal for learning including the predetermined sound source. The present technology can be applied to a signal processing device.
    Type: Grant
    Filed: March 13, 2020
    Date of Patent: January 2, 2024
    Assignee: SONY GROUP CORPORATION
    Inventor: Naoya Takahashi
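Recursive separation can be sketched as repeatedly applying a one-source model and subtracting its output from the residual. The peel-one-source loop, the stopping rule, and the toy dominant-tone "model" below are assumptions standing in for the learned sound source separation model:

```python
import numpy as np

def peel_dominant_tone(x):
    """Toy stand-in for a learned one-source model: isolate the strongest
    frequency bin of the input."""
    spec = np.fft.rfft(x)
    k = int(np.argmax(np.abs(spec)))
    iso = np.zeros_like(spec)
    iso[k] = spec[k]
    return np.fft.irfft(iso, n=len(x))

def separate_recursively(mix, model, max_sources=4, min_rms=1e-3):
    """Apply the model, subtract the separated source, and repeat on the
    residual until it is negligible."""
    sources, residual = [], mix.astype(float)
    for _ in range(max_sources):
        if np.sqrt(np.mean(residual**2)) < min_rms:
            break
        src = model(residual)
        sources.append(src)
        residual = residual - src
    return sources, residual

t = np.arange(1000) / 1000.0
mix = np.sin(2*np.pi*50*t) + 0.5*np.sin(2*np.pi*120*t)
sources, residual = separate_recursively(mix, peel_dominant_tone)
```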
  • Patent number: 11848025
    Abstract: In a method for efficiently and accurately measuring the intelligibility of speech, a user may utter a sample text, and an automatic speech assessment (ASA) system may receive an acoustic signal encoding the utterance. An automatic speech recognition (ASR) module may generate an N-best output corresponding to the utterance and generate an intelligibility score representing the intelligibility of the utterance based on the N-best output and the sample text. Generating the intelligibility score may involve (1) calculating conditional intelligibility value(s) for the N recognition result(s), and (2) determining the intelligibility score based on the conditional intelligibility value of the most intelligible recognition result. Optionally, the process of generating the intelligibility score may involve adjusting the intelligibility score to account for environmental information (e.g., a pronunciation score for the user's speech and/or a confidence score assigned to the 1-best recognition result).
    Type: Grant
    Filed: January 15, 2021
    Date of Patent: December 19, 2023
    Assignee: ELSA, Corp.
    Inventors: Jorge Daniel Leonardo Proença, Xavier Anguera Miro, Ganna Raboshchuk, Ângela Maria Pereira da Costa
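The scoring scheme can be sketched by computing a conditional intelligibility value for each of the N hypotheses and keeping the best one. Position-wise word accuracy against the sample text is a stand-in for the patent's actual measure:

```python
def intelligibility_score(n_best, sample_text):
    """Score each ASR hypothesis against the sample text and return the
    value of the most intelligible recognition result."""
    ref = sample_text.lower().split()
    def conditional(hyp):
        words = hyp.lower().split()
        hits = sum(1 for a, b in zip(words, ref) if a == b)
        return hits / max(len(ref), 1)
    return max(conditional(hyp) for hyp in n_best)

score = intelligibility_score(
    ["the quick brown fox", "the quack brown fox", "a quick brown box"],
    "the quick brown fox",
)
```

The optional adjustment for pronunciation and confidence scores mentioned in the abstract is omitted.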
  • Patent number: 11837219
    Abstract: In several aspects for creating a minute of a teleconference from a record thereof, a processor classifies portions of the record as relevant or non-relevant according to corresponding relevance indicators with respect to a topic of the teleconference. A processor removes the non-relevant portions from the record. A processor classifies pairs of relevant portions as similar or non-similar according to corresponding similarity indicators. A processor removes one of the relevant portions of each similar pair of relevant portions from the minute.
    Type: Grant
    Filed: November 18, 2021
    Date of Patent: December 5, 2023
    Assignee: International Business Machines Corporation
    Inventors: Damiano Bassani, Alfonso D'Aniello, Andrea Tortosa, Roberto Giordani, Michela Melfa
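The two filtering passes (relevance, then pairwise similarity) can be sketched with simple scores; Jaccard word overlap stands in for the patent's similarity indicators, and both thresholds are assumptions:

```python
def make_minutes(portions, relevance, rel_thresh=0.5, sim_thresh=0.5):
    """Drop non-relevant portions, then drop the later member of each
    similar pair of the remaining relevant portions."""
    kept = [p for p, r in zip(portions, relevance) if r >= rel_thresh]
    def jaccard(a, b):
        wa, wb = set(a.split()), set(b.split())
        return len(wa & wb) / len(wa | wb) if wa | wb else 0.0
    minutes = []
    for p in kept:
        if all(jaccard(p, q) < sim_thresh for q in minutes):
            minutes.append(p)
    return minutes

minutes = make_minutes(
    ["budget approved for q3", "lunch was great",
     "q3 budget got approved", "ship in october"],
    [0.9, 0.1, 0.85, 0.8],
)
```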
  • Patent number: 11830481
    Abstract: Methods are performed by one or more processing devices for correcting prosody in audio data. A method includes operations for accessing subject audio data in an audio edit region of the audio data. The subject audio data in the audio edit region potentially lacks prosodic continuity with unedited audio data in an unedited audio portion of the audio data. The operations further include predicting, based on a context of the unedited audio data, phoneme durations including a respective phoneme duration of each phoneme in the unedited audio data. The operations further include predicting, based on the context of the unedited audio data, a pitch contour comprising at least one respective pitch value of each phoneme in the unedited audio data. Additionally, the operations include correcting prosody of the subject audio data in the audio edit region by applying the phoneme durations and the pitch contour to the subject audio data.
    Type: Grant
    Filed: November 30, 2021
    Date of Patent: November 28, 2023
    Assignee: Adobe Inc.
    Inventors: Maxwell Morrison, Zeyu Jin, Nicholas Bryan, Juan Pablo Caceres Chomali, Lucas Rencker
  • Patent number: 11817117
    Abstract: In various examples, end of speech (EOS) for an audio signal is determined based at least in part on a rate of speech for a speaker. For a segment of the audio signal, EOS is indicated based at least in part on an EOS threshold determined based at least in part on the rate of speech for the speaker.
    Type: Grant
    Filed: January 29, 2021
    Date of Patent: November 14, 2023
    Assignee: NVIDIA CORPORATION
    Inventors: Utkarsh Vaidya, Ravindra Yeshwant Lokhande, Viraj Gangadhar Karandikar, Niranjan Rajendra Wartikar, Sumit Kumar Bhattacharya
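A rate-dependent EOS threshold can be sketched as a base silence threshold scaled by the speaker's rate, so fast speakers get a shorter threshold and slow speakers a longer one. The inverse-rate scaling and every constant here are assumptions, not the patented formula:

```python
def is_end_of_speech(silence_sec, words_per_sec,
                     base_threshold=0.8, ref_rate=2.5):
    """Declare EOS when the observed silence exceeds a threshold scaled
    by the speaker's rate of speech."""
    threshold = base_threshold * (ref_rate / max(words_per_sec, 0.1))
    return silence_sec >= threshold
```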
  • Patent number: 11805378
    Abstract: Systems, apparatuses, and methods are described for a privacy blocking device configured to prevent receipt, by a listening device, of video and/or audio data until a trigger occurs. A blocker may be configured to prevent receipt of video and/or audio data by one or more microphones and/or one or more cameras of a listening device. The blocker may use the one or more microphones, the one or more cameras, and/or one or more second microphones and/or one or more second cameras to monitor for a trigger. The blocker may process the data. Upon detecting the trigger, the blocker may transmit data to the listening device. For example, the blocker may transmit all or a part of a spoken phrase to the listening device.
    Type: Grant
    Filed: October 29, 2020
    Date of Patent: October 31, 2023
    Inventor: Thomas Stachura
  • Patent number: 11769487
    Abstract: A voice topic spotting system includes a learning module and a voice topic classifier module. The learning module receives training audio segments with topic labels and generates a fast keyword filter model based on a set of topic-indicative words and generates a topic identification model based on a training set of topic keyword-containing lattices. The voice topic classifier module includes an automatic speech recognition engine arranged to identify one or more keywords included in a received audio segment and output the one or more keywords. A fast keyword filter implements the fast keyword filter model to output the received audio segment if a topic-indicative word is detected in the audio segment. A decoder generates a topic keyword-containing lattice associated with the audio segment. A voice topic classifier implements the voice topic identification model to determine a topic associated with the received audio segment.
    Type: Grant
    Filed: March 16, 2021
    Date of Patent: September 26, 2023
    Assignee: RAYTHEON APPLIED SIGNAL TECHNOLOGY, INC.
    Inventor: Jonathan C. Wintrode
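The two-stage design (cheap keyword screen, then topic classification) can be sketched over flat word lists. Real lattices and trained models are replaced here by simple set lookups, which is an assumption:

```python
def fast_keyword_filter(words, topic_indicative):
    """Stage 1: pass a segment on only if the ASR output contains at
    least one topic-indicative word."""
    return any(w in topic_indicative for w in words)

def classify_topic(words, topic_lexicons):
    """Stage 2 stand-in: score each topic by keyword hits and return the
    best; the patented system scores keyword-containing lattices instead."""
    scores = {t: sum(w in lex for w in words)
              for t, lex in topic_lexicons.items()}
    return max(scores, key=scores.get)

words = "we should refinance the mortgage next month".split()
lexicons = {"finance": {"refinance", "mortgage", "loan"},
            "travel": {"flight", "hotel"}}
```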
  • Patent number: 11769493
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training acoustic models and using the trained acoustic models. A connectionist temporal classification (CTC) acoustic model is accessed, the CTC acoustic model having been trained using a context-dependent state inventory generated from approximate phonetic alignments determined by another CTC acoustic model trained without fixed alignment targets. Audio data for a portion of an utterance is received. Input data corresponding to the received audio data is provided to the accessed CTC acoustic model. Data indicating a transcription for the utterance is generated based on output that the accessed CTC acoustic model produced in response to the input data. The data indicating the transcription is provided as output of an automated speech recognition service.
    Type: Grant
    Filed: May 3, 2022
    Date of Patent: September 26, 2023
    Assignee: Google LLC
    Inventors: Kanury Kanishka Rao, Andrew W. Senior, Hasim Sak
  • Patent number: 11763833
    Abstract: Disclosed are a method, a device, and a computer-readable storage medium for reducing crosstalk when performing automatic speech translation between at least two users speaking different languages. The method for reducing crosstalk includes receiving a signal inputted to an out-ear microphone of a first user, wherein the first user is wearing a headset equipped with an in-ear microphone and the out-ear microphone and the signal includes a voice signal A of the first user and a voice signal b of a second user, receiving a voice signal Binear inputted to an in-ear microphone of the second user, wherein the second user is wearing a headset equipped with the in-ear microphone and an out-ear microphone, and removing the voice signal b of the second user from the signal A+b inputted to the out-ear microphone of the first user, based on the voice signal Binear inputted to the in-ear microphone of the second user.
    Type: Grant
    Filed: October 31, 2019
    Date of Patent: September 19, 2023
    Inventor: Jung Keun Kim
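The removal step (subtracting voice b from the signal A+b using the in-ear reference Binear) can be sketched with a least-squares leakage gain. A real system would need an adaptive filter with delay and spectral-coloration modeling, so the single scalar gain is a simplification:

```python
import numpy as np

def remove_crosstalk(mix_outer, ref_inear):
    """Estimate the leakage gain of the in-ear reference into the out-ear
    mix by least squares, then subtract the scaled reference."""
    gain = np.dot(mix_outer, ref_inear) / np.dot(ref_inear, ref_inear)
    return mix_outer - gain * ref_inear

t = np.arange(8000) / 8000.0
voice_a = np.sin(2*np.pi*220*t)           # first user's own speech (A)
voice_b_inear = np.sin(2*np.pi*330*t)     # second user's in-ear reference (Binear)
mix = voice_a + 0.3 * voice_b_inear       # out-ear mic picks up A + b
clean = remove_crosstalk(mix, voice_b_inear)
```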
  • Patent number: 11763836
    Abstract: Disclosed is a hierarchical generated audio detection system, comprising an audio preprocessing module, a CQCC feature extraction module, an LFCC feature extraction module, a first-stage lightweight coarse-level detection model and a second-stage fine-level deep identification model; the audio preprocessing module preprocesses collected audio or video data to obtain an audio clip with a length not exceeding the limit; inputting the audio clip into the CQCC feature extraction module and the LFCC feature extraction module respectively to obtain the CQCC feature and the LFCC feature; inputting the CQCC feature or LFCC feature into the first-stage lightweight coarse-level detection model for first-stage screening to screen out the first-stage real audio and the first-stage generated audio; inputting the CQCC feature or LFCC feature of the first-stage generated audio into the second-stage fine-level deep identification model to identify the second-stage real audio and the second-stage generated audio.
    Type: Grant
    Filed: February 17, 2022
    Date of Patent: September 19, 2023
    Assignee: INSTITUTE OF AUTOMATION, CHINESE ACADEMY OF SCIENCES
    Inventors: Jianhua Tao, Zhengkun Tian, Jiangyan Yi
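The cascade's control flow can be sketched independently of the feature extractors: the lightweight model clears obvious real audio, and only clips it flags reach the deeper model. The scorers and thresholds below are assumptions standing in for the trained models:

```python
def hierarchical_detect(clips, coarse_score, fine_score,
                        coarse_thresh=0.5, fine_thresh=0.5):
    """Stage 1 (lightweight) screens out clearly real audio; stage 2
    (deep model) makes the final real/generated call on the remainder."""
    labels = []
    for clip in clips:
        if coarse_score(clip) < coarse_thresh:
            labels.append("real")          # cleared by the cheap filter
        else:
            labels.append("generated" if fine_score(clip) >= fine_thresh
                          else "real")     # re-examined by the deep model
    return labels

labels = hierarchical_detect(
    [0.1, 0.9, 0.6],
    coarse_score=lambda x: x,   # toy: the clip value is its suspicion score
    fine_score=lambda x: x,
)
```

Running the expensive model only on stage-1 positives is what makes the hierarchy cheaper than a single deep detector.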