Patents Examined by Mark Villena
-
Patent number: 12033613
Abstract: Proposed are a deep neural network-based non-autoregressive voice synthesizing method and a system therefor. A deep neural network-based non-autoregressive voice synthesizing system according to an embodiment may comprise: a voice feature vector column synthesizing unit which constitutes a non-recursive deep neural network based on multiple decoders, and gradually produces a voice feature vector column through the multiple decoders from a template including temporal information of a voice; and a voice reconstituting unit which transforms the voice feature vector column into voice data, wherein the voice feature vector column synthesizing unit produces a template input, and produces a voice feature vector column by adding, to the template input, sentence data refined through an attention mechanism.
Type: Grant
Filed: June 26, 2020
Date of Patent: July 9, 2024
Assignee: IUCF-HYU (INDUSTRY-UNIVERSITY COOPERATION FOUNDATION HANYANG UNIVERSITY)
Inventors: Joon-Hyuk Chang, Moa Lee
-
Patent number: 12020711
Abstract: A system and method may classify a plurality of interactions, by: obtaining a plurality of voiceprints of the plurality of interactions, wherein each voiceprint of the plurality of voiceprints represents a speaker participating in an interaction of the plurality of interactions; calculating, for each interaction, a plurality of scores, wherein each score of the plurality of scores is indicative of a similarity between the voiceprint of the interaction and one voiceprint of a set of benchmark voiceprints; calculating, for each interaction, statistics of the scores; and determining that a plurality of interactions pertain to a single cluster of interactions based on statistics of the scores of the interactions in the cluster.
Type: Grant
Filed: February 3, 2021
Date of Patent: June 25, 2024
Assignee: Nice Ltd.
Inventors: Roman Frenkel, Yarden Hazut, Rotem Shuster Radashkevich
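The scoring-and-clustering step described in this abstract can be sketched in a few lines. The patent does not name a similarity metric or statistic, so cosine similarity, the (mean, std) summary, and the `same_cluster` tolerance below are all illustrative assumptions:

```python
import math

def cosine(a, b):
    # Cosine similarity between two voiceprint embeddings.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def score_statistics(voiceprint, benchmarks):
    # Score one interaction's voiceprint against every benchmark
    # voiceprint, then summarize the scores as (mean, std).
    scores = [cosine(voiceprint, b) for b in benchmarks]
    mean = sum(scores) / len(scores)
    var = sum((s - mean) ** 2 for s in scores) / len(scores)
    return (mean, math.sqrt(var))

def same_cluster(stats_a, stats_b, tol=0.1):
    # Interactions whose score statistics against the shared benchmark
    # set are close are tentatively assigned to one cluster.
    return all(abs(x - y) <= tol for x, y in zip(stats_a, stats_b))
```

Comparing statistics rather than raw voiceprints lets interactions be grouped without ever matching speakers to each other directly, only to the fixed benchmark set.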
-
Patent number: 12002478
Abstract: Example methods and apparatus for audio watermarking and watermark detection and extraction are disclosed herein. An example apparatus disclosed herein includes memory, computer readable instructions, and processor circuitry to execute the computer readable instructions to at least detect a first symbol, a second symbol, a third symbol, and a fourth symbol sequentially in encoded audio samples, determine whether the first symbol is a synchronization symbol, in response to a determination that the first symbol is a synchronization symbol, determine that the first symbol and the third symbol are associated with a first message and the second symbol and the fourth symbol are associated with a second message, and output at least one of the first message or the second message.
Type: Grant
Filed: July 8, 2022
Date of Patent: June 4, 2024
Assignee: The Nielsen Company (US), LLC
Inventors: Venugopal Srinivasan, Alexander Pavlovich Topchy
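The de-interleaving logic here (two messages carried in alternating symbols, gated on a synchronization symbol) reduces to a small routine. The `SYNC_SYMBOLS` value below is a placeholder, not the patent's actual symbol alphabet:

```python
SYNC_SYMBOLS = {0xA5}  # hypothetical synchronization symbol values

def deinterleave(symbols):
    # Four sequentially detected symbols: if the first is a sync symbol,
    # the 1st and 3rd symbols belong to one message and the 2nd and 4th
    # to a second message; otherwise no messages are recovered.
    s1, s2, s3, s4 = symbols
    if s1 in SYNC_SYMBOLS:
        return [s1, s3], [s2, s4]
    return None
```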
-
Patent number: 12002486
Abstract: A tag estimation device capable of estimating, for an utterance made among several persons, a tag representing a result of analyzing the utterance is provided. The tag estimation device includes an utterance sequence information vector generation unit that adds a t-th utterance word feature vector and a t-th speaker vector to a (t−1)-th utterance sequence information vector u_{t−1} that includes an utterance word feature vector that precedes the t-th utterance word feature vector and a speaker vector that precedes the t-th speaker vector to generate a t-th utterance sequence information vector u_t, where t is a natural number, and a tagging unit that determines a tag l_t that represents a result of analyzing a t-th utterance from a model parameter set in advance and the t-th utterance sequence information vector u_t.
Type: Grant
Filed: September 13, 2019
Date of Patent: June 4, 2024
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Ryo Masumura, Tomohiro Tanaka
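One simplistic reading of the recurrence above, with elementwise accumulation and a linear tagger standing in for the unspecified model, might look like this (all names and the exact update rule are assumptions for illustration):

```python
def update_sequence_vector(u_prev, word_feat, speaker_vec):
    # u_t accumulates the t-th utterance word feature vector and the
    # t-th speaker vector onto u_{t-1} (here: concatenate, then add).
    combined = list(word_feat) + list(speaker_vec)
    return [a + b for a, b in zip(u_prev, combined)]

def tag_utterance(u_t, weights):
    # Toy tagging unit: one weight row per candidate tag; the index of
    # the highest score serves as the tag l_t.
    scores = [sum(w * x for w, x in zip(row, u_t)) for row in weights]
    return scores.index(max(scores))
```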
-
Patent number: 11983506
Abstract: Provided is a method of performing hybrid translation using a general-purpose neural network machine translator by a device, including: (a) receiving an original document written in a first language from a user terminal; (b) determining a field of the original document by comparing the original document with a standard document for each field stored in a database; (c) generating a primary translation in a second language for the original document by using the general-purpose neural network machine translator; and (d) converting some expressions included in the primary translation into other expressions in the second language with reference to a professional translation dictionary for the determined field to generate a secondary translation, wherein the expression includes any one of a word, a phrase, and a sentence.
Type: Grant
Filed: August 24, 2021
Date of Patent: May 14, 2024
Assignee: TWIG FARM CO., LTD.
Inventors: Sun Ho Baek, Chul Ho Baek
-
Patent number: 11978448
Abstract: Disclosed is a display device capable of classifying clickable content in a web application screen, and the display device can perform primary classification on all nodes existing in a document object model (DOM) and perform secondary classification for determining whether a node exists in the screen.
Type: Grant
Filed: February 19, 2020
Date of Patent: May 7, 2024
Assignee: LG ELECTRONICS INC.
Inventors: Jeongmin Kim, Youngtae Chung, Kwangrim Ahn
-
Patent number: 11961508
Abstract: An electronic device and method are disclosed. The electronic device includes a microphone, a speaker and a processor. The processor implements the method, including: receiving a first voice input through the microphone, outputting a response related to a function through the speaker based on the first voice input, when a second voice input is received through the microphone during output of the response, interrupting the output of the response and continuing to receive the second voice input, and executing the function based on content of the received second voice input and content of the response output up to one of: a first time point when the second voice input was initially received, and a second time point when the output of the response was interrupted.
Type: Grant
Filed: November 13, 2020
Date of Patent: April 16, 2024
Assignee: Samsung Electronics Co., Ltd.
Inventors: Heeyoung Choo, Hyunduk Cho, Yeseul Hong, Dayoung Lee, Boram Lee
-
Patent number: 11955111
Abstract: To improve prediction accuracy of utterance types in a dialog. A learning data generation device (10) according to the present invention comprises: a sort unit (11) configured to perform, based on information appended to utterances in a dialog amongst more than one speaker, the information being indicative of a dialogue scene (the scene in which the utterances in the dialog were made), sorting regarding whether the utterances are to be targets for generation of the learning data, wherein the sort unit (11) is configured to exclude utterances of a dialogue scene that includes utterances similar to utterances of a particular type from the targets for generation of learning data.
Type: Grant
Filed: August 14, 2019
Date of Patent: April 9, 2024
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Setsuo Yamada, Yoshiaki Noda, Takaaki Hasegawa
-
Patent number: 11950067
Abstract: An In-Car Communication (ICC) system supports the communication paths within a car by receiving the speech signals of a speaking passenger and playing them back for one or more listening passengers. Signal processing tasks are split into a microphone related part and into a loudspeaker related part. A sound processing system suitable for use in a vehicle having multiple acoustic zones includes a plurality of microphone In-Car Communication (Mic-ICC) instances and a plurality of loudspeaker In-Car Communication (Ls-ICC) instances. The system further includes a dynamic audio routing matrix with a controller and coupled to the Mic-ICC instances, a mixer coupled to the plurality of Mic-ICC instances, and a distributor coupled to the Ls-ICC instances.
Type: Grant
Filed: February 6, 2023
Date of Patent: April 2, 2024
Assignee: Cerence Operating Company
Inventors: Tobias Herbig, Markus Buck, Meik Pfeffinger
-
Patent number: 11948565
Abstract: A method for combining hotwords in a single utterance receives, at a first assistant-enabled device (AED), audio data corresponding to an utterance directed toward the first AED and a second AED among two or more AEDs, where the audio data includes a query specifying an operation to perform. The method also detects, using a hotword detector, a first hotword assigned to the first AED that is different than a second hotword assigned to the second AED. In response to detecting the first hotword, the method initiates processing on the audio data to determine that the audio data includes a term preceding the query that at least partially matches the second hotword assigned to the second AED. Based on the at least partial match, the method executes a collaboration routine to cause the first AED and the second AED to collaborate with one another to fulfill the query.
Type: Grant
Filed: December 11, 2020
Date of Patent: April 2, 2024
Assignee: Google LLC
Inventors: Matthew Sharifi, Victor Carbune
-
Patent number: 11948579
Abstract: According to various embodiments, an electronic device performing an operation corresponding to an utterance of a user in a multi-device environment includes a communication module, a processor, and a memory. The processor operates a first electronic device for receiving and a second electronic device for executing in association with each other. The memory stores instructions that, when executed, cause the processor to: determine whether a first utterance received at the first electronic device should be performed at the second electronic device; and when the first utterance should be performed at the second electronic device, form a connection binding with the first electronic device and the second electronic device by using one or more channels through the communication module. When an operation corresponding to the first utterance is completed at the second electronic device, the processor terminates the connection binding with the first electronic device and the second electronic device.
Type: Grant
Filed: December 7, 2020
Date of Patent: April 2, 2024
Assignee: Samsung Electronics Co., Ltd.
Inventors: Taegu Kim, Hansin Koh, Jooyeon Kim, Hyeonjae Bak, Hyunjin Park, Kyounggu Woo
-
Patent number: 11948569
Abstract: An electronic apparatus includes: a memory storing a first threshold value and a second threshold value corresponding to a receiving direction of a wake-up word, a sound receiver comprising sound receiving circuitry, and a processor configured to: identify a receiving direction of the sound based on a sound received through the sound receiver, based on a similarity between sound data obtained in response to the received sound and the wake-up word being greater than or equal to the first threshold value corresponding to the identified receiving direction, perform voice recognition for a subsequent sound received through the sound receiver, and based on the similarity being less than the first threshold value and greater than or equal to the second threshold value, change the first threshold value.
Type: Grant
Filed: October 26, 2021
Date of Patent: April 2, 2024
Assignee: Samsung Electronics Co., Ltd.
Inventor: Chanhee Choi
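The two-threshold logic in this abstract can be sketched as follows. The direction and step size of the threshold change are assumptions, since the abstract only says the first threshold is changed:

```python
def handle_wakeword(similarity, thresholds, direction, step=0.02):
    # Per-direction pair (t1, t2): at or above t1, proceed to full voice
    # recognition; between t2 and t1, adapt by changing t1 (here,
    # lowering it by `step`); below t2, ignore the sound.
    t1, t2 = thresholds[direction]
    if similarity >= t1:
        return "recognize"
    if similarity >= t2:
        thresholds[direction] = (t1 - step, t2)
        return "adapt"
    return "ignore"
```

Keeping a separate threshold pair per receiving direction lets the device stay strict toward noisy directions while relaxing toward directions where near-miss detections keep occurring.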
-
Patent number: 11947872
Abstract: Aspects of the disclosure relate to computing platforms that utilize improved natural language processing techniques for claims processing. A computing platform may receive an audio file and determine one or more utterance segments based on the audio file. The computing platform may generate and send one or more audio chunks based on the one or more utterance segments to an audio transcription system. The computing platform may receive one or more text segments in response and may generate an audio transcription file accordingly. The computing platform may assign a category to each word in the audio transcription file, and may send user interface information to a user device along with one or more commands directing the user device to generate a user interface based on the user interface information, which may cause the user device to display the user interface.
Type: Grant
Filed: May 14, 2020
Date of Patent: April 2, 2024
Assignee: Allstate Insurance Company
Inventors: Olivia Mahler-Haug, Connor Walsh, Geoff Colgan, Matthew Cluck, Ross Kirk Eley, Harideep Yeddula
-
Patent number: 11942099
Abstract: Disclosed example people monitoring methods include detecting a first watermark in a first audio signal obtained from an acoustic sensor, the first watermark identifying media presented by a monitored media device, determining whether a second watermark, different from the first watermark, is embedded in the first audio signal obtained from the acoustic sensor, the second watermark identifying at least one of a mobile device or a user of the mobile device, classifying the second watermark as a media watermark or a people monitoring watermark based on a characteristic of the second watermark, and when the second watermark is determined to be embedded in the first audio signal, reporting at least one of the second watermark or information decoded from the second watermark to identify at least one of the mobile device or the user of the mobile device as being exposed to the media presented by the monitored media device.
Type: Grant
Filed: December 29, 2021
Date of Patent: March 26, 2024
Assignee: The Nielsen Company (US), LLC
Inventors: Alexander Topchy, Padmanabhan Soundararajan, Venugopal Srinivasan
-
Patent number: 11935534
Abstract: A system and method for voice control of a media playback device is disclosed. The method includes receiving an instruction of a voice command, converting the voice command to text, transmitting the text command to the playback device, and having the playback device execute the command. An instruction may include a command to play a set of audio tracks, and the media playback device plays the set of audio tracks upon receiving the instruction.
Type: Grant
Filed: March 15, 2022
Date of Patent: March 19, 2024
Assignee: Spotify AB
Inventors: Daniel Bromand, Richard Mitic, Horia Jurcut, Jennifer Thom-Santelli, Henriette Cramer, Karl Humphreys, Robert Williams, Kurt Jacobson, Henrik Lindström
-
Patent number: 11900072
Abstract: Offered is a system that presents on a display screen a translation of a sentence together with an untranslated version of the sentence, and that can cause both of the displayed sentences to break apart into component parts in response to a simple user action, e.g., double-tapping on one of them. When the user selects (e.g., taps on) any portion of either version of the sentence, the system can identify a corresponding portion of the other version (in the other language). In some implementations, a user device can include both a microphone and a display screen, and an automatic speech recognition (ASR) engine can be used to transcribe the user's speech in one language (e.g., English) into text. The system can translate the resulting text into another language (e.g., Spanish) and display the translated text on the display screen along with the untranslated text. When a user selects a portion of a sentence, the system can also present information about the selected portion (e.g.
Type: Grant
Filed: July 18, 2017
Date of Patent: February 13, 2024
Assignee: Amazon Technologies, Inc.
Inventors: Lee Michael Bossio, Kartik Suchindra Babu, John Thomas Beck
-
Patent number: 11887598
Abstract: In one aspect, a network microphone device (NMD) includes a plurality of microphones and is configured to capture a voice input via one or more of the microphones, detect a wake word in the voice input, transmit data associated with the voice input to one or more remote computing devices associated with a voice assistant service, and receive a response from the one or more remote computing devices, the response comprising a playback command based on the voice input. The network microphone device may be configured to obtain verification information characterizing the voice input and, based on the verification information indicating that the voice input was spoken by an unverified user, functionally disable the NMD from performing the playback command.
Type: Grant
Filed: December 2, 2022
Date of Patent: January 30, 2024
Assignee: Sonos, Inc.
Inventor: Connor Kristopher Smith
-
Patent number: 11875793
Abstract: A system, method, and computer program product for implementing cognitive natural language processing software framework optimization is provided. The method includes receiving instructions associated with an audible user input of a user. An AI input intention of the user is determined and key information is extracted from the audible user input. The key information is inputted into a generated database table and additional key information is retrieved from a dialog table. A supplementary database table comprising the additional key information is generated and the key information is spliced with the additional key information. A resulting spliced data structure is merged into a final database table and natural language is converted into a request code structure within an SQL structure and an interactive AI interface presenting results of the converting is generated. Operational functionality of an AI device is enabled for audibly presenting results of the conversion.
Type: Grant
Filed: September 7, 2021
Date of Patent: January 16, 2024
Assignee: International Business Machines Corporation
Inventors: Zhong Fang Yuan, Tong Liu, De Shuo Kong, Yao Chen, Hai Bo Zou, Sarbajit K. Rakshit, Zheng Jie
-
Patent number: 11862155
Abstract: A method includes a first assistant-enabled device (AED) receiving an assignment instruction assigning a group hotword to a selected group of AEDs that includes the first AED and one or more other AEDs. Each AED is configured to wake-up from a low-power state when the group hotword is detected in streaming audio by at least one of the AEDs. The method also includes receiving audio data that corresponds to an utterance spoken by the user and includes a query that specifies an operation to perform. In response to detecting the group hotword in the audio data, the method also includes triggering the first AED to wake-up from the low-power state and executing a collaboration routine to cause the first AED and each other AED in the selected group of AEDs to collaborate with one another to fulfill performance of the operation specified by the query.
Type: Grant
Filed: December 11, 2020
Date of Patent: January 2, 2024
Assignee: Google LLC
Inventors: Matthew Sharifi, Victor Carbune
-
Patent number: 11842745
Abstract: In an embodiment, a method includes receiving a plurality of first images including at least a mouth-related portion of a human uttering a voice, wherein each first image has depth information; obtaining a noisy spectrogram including a first representation of the voice of the human; extracting a plurality of visual features using the first images, wherein one of the visual features is obtained using the depth information of a second image of the first images; extracting a plurality of audio features using the noisy spectrogram; determining a first spectrogram using the visual features and the audio features; subtracting the first spectrogram from the noisy spectrogram, to obtain a purified representation of the voice of the human; and outputting, by an I/O-related outputting device, a response using the purified representation of the voice of the human.
Type: Grant
Filed: February 16, 2021
Date of Patent: December 12, 2023
Assignee: GUANGDONG OPPO MOBILE TELECOMMUNICATIONS CORP., LTD.
Inventors: Yuan Lin, Jenhao Hsiao, Chiuman Ho
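The final subtraction step in this abstract, taken literally, is an elementwise spectrogram subtraction. The non-negative floor below is an assumption added so magnitudes stay valid; the networks that predict the first spectrogram are out of scope for this sketch:

```python
def purify(noisy_spec, predicted_spec, floor=0.0):
    # Elementwise subtraction of the predicted (interfering) spectrogram
    # from the noisy spectrogram, clamped at a non-negative floor.
    return [[max(n - p, floor) for n, p in zip(n_row, p_row)]
            for n_row, p_row in zip(noisy_spec, predicted_spec)]
```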