Patents Examined by Mark Villena
-
Patent number: 12033613
Abstract: Proposed are a deep neural network-based non-autoregressive voice synthesizing method and a system therefor. A deep neural network-based non-autoregressive voice synthesizing system according to an embodiment may comprise: a voice feature vector column synthesizing unit which constitutes a non-recursive deep neural network based on multiple decoders, and gradually produces a voice feature vector column through the multiple decoders from a template including temporal information of a voice; and a voice reconstituting unit which transforms the voice feature vector column into voice data, wherein the voice feature vector column synthesizing unit produces a template input, and produces a voice feature vector column by adding, to the template input, sentence data refined through an attention mechanism.
Type: Grant
Filed: June 26, 2020
Date of Patent: July 9, 2024
Assignee: IUCF-HYU (INDUSTRY-UNIVERSITY COOPERATION FOUNDATION HANYANG UNIVERSITY)
Inventors: Joon-Hyuk Chang, Moa Lee
-
Patent number: 12020711
Abstract: A system and method may classify a plurality of interactions, by: obtaining a plurality of voiceprints of the plurality of interactions, wherein each voiceprint of the plurality of voiceprints represents a speaker participating in an interaction of the plurality of interactions; calculating, for each interaction, a plurality of scores, wherein each score of the plurality of scores is indicative of a similarity between the voiceprint of the interaction and one voiceprint of a set of benchmark voiceprints; calculating, for each interaction, statistics of the scores; and determining that a plurality of interactions pertain to a single cluster of interactions based on statistics of the scores of the interactions in the cluster.
Type: Grant
Filed: February 3, 2021
Date of Patent: June 25, 2024
Assignee: Nice Ltd.
Inventors: Roman Frenkel, Yarden Hazut, Rotem Shuster Radashkevich
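The scoring-and-clustering step described in this abstract can be sketched in a few lines. The patent does not name a similarity metric or statistic, so cosine similarity, the (mean, std) summary, and the `same_cluster` tolerance below are all illustrative assumptions:

```python
import math

def cosine(a, b):
    # Cosine similarity between two voiceprint embeddings.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def score_statistics(voiceprint, benchmarks):
    # Score one interaction's voiceprint against every benchmark
    # voiceprint, then summarize the scores as (mean, std).
    scores = [cosine(voiceprint, b) for b in benchmarks]
    mean = sum(scores) / len(scores)
    var = sum((s - mean) ** 2 for s in scores) / len(scores)
    return (mean, math.sqrt(var))

def same_cluster(stats_a, stats_b, tol=0.1):
    # Interactions whose score statistics against the shared benchmark
    # set are close are tentatively assigned to one cluster.
    return all(abs(x - y) <= tol for x, y in zip(stats_a, stats_b))
```

Comparing statistics rather than raw voiceprints lets interactions be grouped without ever matching speakers to each other directly, only to the fixed benchmark set.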
-
Patent number: 12002478
Abstract: Example methods and apparatus for audio watermarking and watermark detection and extraction are disclosed herein. An example apparatus disclosed herein includes memory, computer readable instructions, and processor circuitry to execute the computer readable instructions to at least detect a first symbol, a second symbol, a third symbol, and a fourth symbol sequentially in encoded audio samples, determine whether the first symbol is a synchronization symbol, in response to a determination that the first symbol is a synchronization symbol, determine that the first symbol and the third symbol are associated with a first message and the second symbol and the fourth symbol are associated with a second message, and output at least one of the first message or the second message.
Type: Grant
Filed: July 8, 2022
Date of Patent: June 4, 2024
Assignee: The Nielsen Company (US), LLC
Inventors: Venugopal Srinivasan, Alexander Pavlovich Topchy
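The de-interleaving logic here (two messages carried in alternating symbols, gated on a synchronization symbol) reduces to a small routine. The `SYNC_SYMBOLS` value below is a placeholder, not the patent's actual symbol alphabet:

```python
SYNC_SYMBOLS = {0xA5}  # hypothetical synchronization symbol values

def deinterleave(symbols):
    # Four sequentially detected symbols: if the first is a sync symbol,
    # the 1st and 3rd symbols belong to one message and the 2nd and 4th
    # to a second message; otherwise no messages are recovered.
    s1, s2, s3, s4 = symbols
    if s1 in SYNC_SYMBOLS:
        return [s1, s3], [s2, s4]
    return None
```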
-
Patent number: 12002486
Abstract: A tag estimation device capable of estimating, for an utterance made among several persons, a tag representing a result of analyzing the utterance is provided. The tag estimation device includes an utterance sequence information vector generation unit that adds a t-th utterance word feature vector and a t-th speaker vector to a (t−1)-th utterance sequence information vector u_{t−1} that includes an utterance word feature vector that precedes the t-th utterance word feature vector and a speaker vector that precedes the t-th speaker vector to generate a t-th utterance sequence information vector u_t, where t is a natural number, and a tagging unit that determines a tag l_t that represents a result of analyzing a t-th utterance from a model parameter set in advance and the t-th utterance sequence information vector u_t.
Type: Grant
Filed: September 13, 2019
Date of Patent: June 4, 2024
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Ryo Masumura, Tomohiro Tanaka
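One simplistic reading of the recurrence above, with elementwise accumulation and a linear tagger standing in for the unspecified model, might look like this (all names and the exact update rule are assumptions for illustration):

```python
def update_sequence_vector(u_prev, word_feat, speaker_vec):
    # u_t accumulates the t-th utterance word feature vector and the
    # t-th speaker vector onto u_{t-1} (here: concatenate, then add).
    combined = list(word_feat) + list(speaker_vec)
    return [a + b for a, b in zip(u_prev, combined)]

def tag_utterance(u_t, weights):
    # Toy tagging unit: one weight row per candidate tag; the index of
    # the highest score serves as the tag l_t.
    scores = [sum(w * x for w, x in zip(row, u_t)) for row in weights]
    return scores.index(max(scores))
```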
-
Patent number: 11983506
Abstract: Provided is a method of performing hybrid translation using a general-purpose neural network machine translator by a device, including: (a) receiving an original document written in a first language from a user terminal; (b) determining a field of the original document by comparing the original document with a standard document for each field stored in a database; (c) generating a primary translation in a second language for the original document by using the general-purpose neural network machine translator; and (d) converting some expressions included in the primary translation into other expressions in the second language with reference to a professional translation dictionary for the determined field to generate a secondary translation, wherein the expression includes any one of a word, a phrase, and a sentence.
Type: Grant
Filed: August 24, 2021
Date of Patent: May 14, 2024
Assignee: TWIG FARM CO., LTD.
Inventors: Sun Ho Baek, Chul Ho Baek
-
Patent number: 11978448
Abstract: Disclosed is a display device capable of classifying clickable content in a web application screen, and the display device can perform primary classification on all nodes existing in a document object model (DOM) and perform secondary classification for determining whether a node exists in the screen.
Type: Grant
Filed: February 19, 2020
Date of Patent: May 7, 2024
Assignee: LG ELECTRONICS INC.
Inventors: Jeongmin Kim, Youngtae Chung, Kwangrim Ahn
-
Patent number: 11961508
Abstract: An electronic device and method are disclosed. The electronic device includes a microphone, a speaker and a processor. The processor implements the method, including: receiving a first voice input through the microphone, outputting a response related to a function through the speaker based on the first voice input, when a second voice input is received through the microphone during output of the response, interrupting the output of the response and continuing to receive the second voice input, and executing the function based on content of the received second voice input and content of the response output up to one of: a first time point when the second voice input was initially received, and a second time point when the output of the response was interrupted.
Type: Grant
Filed: November 13, 2020
Date of Patent: April 16, 2024
Assignee: Samsung Electronics Co., Ltd.
Inventors: Heeyoung Choo, Hyunduk Cho, Yeseul Hong, Dayoung Lee, Boram Lee
-
Patent number: 11955111
Abstract: To improve prediction accuracy of utterance types in a dialog. A learning data generation device (10) according to the present invention comprises: a sort unit (11) configured to perform, based on information appended to utterances in a dialog amongst more than one speaker, the information being indicative of a dialogue scene (the scene in which the utterances in the dialog were made), sorting regarding whether the utterances are to be targets for generation of the learning data, wherein the sort unit (11) is configured to exclude utterances of a dialogue scene that includes utterances similar to utterances of a particular type from the targets for generation of learning data.
Type: Grant
Filed: August 14, 2019
Date of Patent: April 9, 2024
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Setsuo Yamada, Yoshiaki Noda, Takaaki Hasegawa
-
Patent number: 11950067
Abstract: An In-Car Communication (ICC) system supports the communication paths within a car by receiving the speech signals of a speaking passenger and playing them back for one or more listening passengers. Signal processing tasks are split into a microphone related part and into a loudspeaker related part. A sound processing system suitable for use in a vehicle having multiple acoustic zones includes a plurality of microphone In-Car Communication (Mic-ICC) instances and a plurality of loudspeaker In-Car Communication (Ls-ICC) instances. The system further includes a dynamic audio routing matrix with a controller and coupled to the Mic-ICC instances, a mixer coupled to the plurality of Mic-ICC instances, and a distributor coupled to the Ls-ICC instances.
Type: Grant
Filed: February 6, 2023
Date of Patent: April 2, 2024
Assignee: Cerence Operating Company
Inventors: Tobias Herbig, Markus Buck, Meik Pfeffinger
-
Patent number: 11948565
Abstract: A method for combining hotwords in a single utterance receives, at a first assistant-enabled device (AED), audio data corresponding to an utterance directed toward the first AED and a second AED among two or more AEDs, where the audio data includes a query specifying an operation to perform. The method also detects, using a hotword detector, a first hotword assigned to the first AED that is different than a second hotword assigned to the second AED. In response to detecting the first hotword, the method initiates processing on the audio data to determine that the audio data includes a term preceding the query that at least partially matches the second hotword assigned to the second AED. Based on the at least partial match, the method executes a collaboration routine to cause the first AED and the second AED to collaborate with one another to fulfill the query.
Type: Grant
Filed: December 11, 2020
Date of Patent: April 2, 2024
Assignee: Google LLC
Inventors: Matthew Sharifi, Victor Carbune
-
Patent number: 11948579
Abstract: According to various embodiments, an electronic device performing an operation corresponding to an utterance of a user in a multi-device environment includes a communication module, a processor, and a memory. The processor operates a first electronic device for receiving and a second electronic device for executing in association with each other. The memory stores instructions that, when executed, cause the processor to: determine whether a first utterance received at the first electronic device should be performed at the second electronic device; and when the first utterance should be performed at the second electronic device, form a connection binding with the first electronic device and the second electronic device by using one or more channels through the communication module. When an operation corresponding to the first utterance is completed at the second electronic device, the processor terminates the connection binding with the first electronic device and the second electronic device.
Type: Grant
Filed: December 7, 2020
Date of Patent: April 2, 2024
Assignee: Samsung Electronics Co., Ltd.
Inventors: Taegu Kim, Hansin Koh, Jooyeon Kim, Hyeonjae Bak, Hyunjin Park, Kyounggu Woo
-
Patent number: 11948569
Abstract: An electronic apparatus includes: a memory storing a first threshold value and a second threshold value corresponding to a receiving direction of a wake-up word, a sound receiver comprising sound receiving circuitry, and a processor configured to: identify a receiving direction of the sound based on a sound received through the sound receiver, based on a similarity between sound data obtained in response to the received sound and the wake-up word being greater than or equal to the first threshold value corresponding to the identified receiving direction, perform voice recognition for a subsequent sound received through the sound receiver, and based on the similarity being less than the first threshold value and greater than or equal to the second threshold value, change the first threshold value.
Type: Grant
Filed: October 26, 2021
Date of Patent: April 2, 2024
Assignee: Samsung Electronics Co., Ltd.
Inventor: Chanhee Choi
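The two-threshold logic in this abstract can be sketched as follows. The direction and step size of the threshold change are assumptions, since the abstract only says the first threshold is changed:

```python
def handle_wakeword(similarity, thresholds, direction, step=0.02):
    # Per-direction pair (t1, t2): at or above t1, proceed to full voice
    # recognition; between t2 and t1, adapt by changing t1 (here,
    # lowering it by `step`); below t2, ignore the sound.
    t1, t2 = thresholds[direction]
    if similarity >= t1:
        return "recognize"
    if similarity >= t2:
        thresholds[direction] = (t1 - step, t2)
        return "adapt"
    return "ignore"
```

Keeping a separate threshold pair per receiving direction lets the device stay strict toward noisy directions while relaxing toward directions where near-miss detections keep occurring.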
-
Patent number: 11947872
Abstract: Aspects of the disclosure relate to computing platforms that utilize improved natural language processing techniques for claims processing. A computing platform may receive an audio file and determine one or more utterance segments based on the audio file. The computing platform may generate and send one or more audio chunks based on the one or more utterance segments to an audio transcription system. The computing platform may receive one or more text segments in response and may generate an audio transcription file accordingly. The computing platform may assign a category to each word in the audio transcription file, and may send user interface information to a user device along with one or more commands directing the user device to generate a user interface based on the user interface information, which may cause the user device to display the user interface.
Type: Grant
Filed: May 14, 2020
Date of Patent: April 2, 2024
Assignee: Allstate Insurance Company
Inventors: Olivia Mahler-Haug, Connor Walsh, Geoff Colgan, Matthew Cluck, Ross Kirk Eley, Harideep Yeddula
-
Patent number: 11942099
Abstract: Disclosed example people monitoring methods include detecting a first watermark in a first audio signal obtained from an acoustic sensor, the first watermark identifying media presented by a monitored media device, determining whether a second watermark, different from the first watermark, is embedded in the first audio signal obtained from the acoustic sensor, the second watermark identifying at least one of a mobile device or a user of the mobile device, classifying the second watermark as a media watermark or a people monitoring watermark based on a characteristic of the second watermark, and when the second watermark is determined to be embedded in the first audio signal, reporting at least one of the second watermark or information decoded from the second watermark to identify at least one of the mobile device or the user of the mobile device as being exposed to the media presented by the monitored media device.
Type: Grant
Filed: December 29, 2021
Date of Patent: March 26, 2024
Assignee: The Nielsen Company (US), LLC
Inventors: Alexander Topchy, Padmanabhan Soundararajan, Venugopal Srinivasan
-
Patent number: 11935534
Abstract: A system and method for voice control of a media playback device is disclosed. The method includes receiving an instruction of a voice command, converting the voice command to text, transmitting the text command to the playback device, and having the playback device execute the command. An instruction may include a command to play a set of audio tracks, and the media playback device plays the set of audio tracks upon receiving the instruction.
Type: Grant
Filed: March 15, 2022
Date of Patent: March 19, 2024
Assignee: Spotify AB
Inventors: Daniel Bromand, Richard Mitic, Horia Jurcut, Jennifer Thom-Santelli, Henriette Cramer, Karl Humphreys, Robert Williams, Kurt Jacobson, Henrik Lindström
-
Patent number: 11900072
Abstract: Offered is a system that presents on a display screen a translation of a sentence together with an untranslated version of the sentence, and that can cause both of the displayed sentences to break apart into component parts in response to a simple user action, e.g., double-tapping on one of them. When the user selects (e.g., taps on) any portion of either version of the sentence, the system can identify a corresponding portion of the other version (in the other language). In some implementations, a user device can include both a microphone and a display screen, and an automatic speech recognition (ASR) engine can be used to transcribe the user's speech in one language (e.g., English) into text. The system can translate the resulting text into another language (e.g., Spanish) and display the translated text on the display screen along with the untranslated text. When a user selects a portion of a sentence, the system can also present information about the selected portion (e.g.
Type: Grant
Filed: July 18, 2017
Date of Patent: February 13, 2024
Assignee: Amazon Technologies, Inc.
Inventors: Lee Michael Bossio, Kartik Suchindra Babu, John Thomas Beck
-
Patent number: 11887598
Abstract: In one aspect, a network microphone device (NMD) includes a plurality of microphones and is configured to capture a voice input via one or more of the microphones, detect a wake word in the voice input, transmit data associated with the voice input to one or more remote computing devices associated with a voice assistant service, and receive a response from the one or more remote computing devices, the response comprising a playback command based on the voice input. The network microphone device may be configured to obtain verification information characterizing the voice input and, based on the verification information indicating that the voice input was spoken by an unverified user, functionally disable the NMD from performing the playback command.
Type: Grant
Filed: December 2, 2022
Date of Patent: January 30, 2024
Assignee: Sonos, Inc.
Inventor: Connor Kristopher Smith
-
Patent number: 11875793
Abstract: A system, method, and computer program product for implementing cognitive natural language processing software framework optimization is provided. The method includes receiving instructions associated with an audible user input of a user. An AI input intention of the user is determined and key information is extracted from the audible user input. The key information is inputted into a generated database table and additional key information is retrieved from a dialog table. A supplementary database table comprising the additional key information is generated and the key information is spliced with the additional key information. A resulting spliced data structure is merged into a final database table and natural language is converted into a request code structure within an SQL structure and an interactive AI interface presenting results of the converting is generated. Operational functionality of an AI device is enabled for audibly presenting results of the conversion.
Type: Grant
Filed: September 7, 2021
Date of Patent: January 16, 2024
Assignee: International Business Machines Corporation
Inventors: Zhong Fang Yuan, Tong Liu, De Shuo Kong, Yao Chen, Hai Bo Zou, Sarbajit K. Rakshit, Zheng Jie
-
Patent number: 11862155
Abstract: A method includes a first assistant-enabled device (AED) receiving an assignment instruction assigning a group hotword to a selected group of AEDs that includes the first AED and one or more other AEDs. Each AED is configured to wake-up from a low-power state when the group hotword is detected in streaming audio by at least one of the AEDs. The method also includes receiving audio data that corresponds to an utterance spoken by the user and includes a query that specifies an operation to perform. In response to detecting the group hotword in the audio data, the method also includes triggering the first AED to wake-up from the low-power state and executing a collaboration routine to cause the first AED and each other AED in the selected group of AEDs to collaborate with one another to fulfill performance of the operation specified by the query.
Type: Grant
Filed: December 11, 2020
Date of Patent: January 2, 2024
Assignee: Google LLC
Inventors: Matthew Sharifi, Victor Carbune
-
Patent number: 11842745
Abstract: In an embodiment, a method includes receiving a plurality of first images including at least a mouth-related portion of a human uttering a voice, wherein each first image has depth information; obtaining a noisy spectrogram including a first representation of the voice of the human; extracting a plurality of visual features using the first images, wherein one of the visual features is obtained using the depth information of a second image of the first images; extracting a plurality of audio features using the noisy spectrogram; determining a first spectrogram using the visual features and the audio features; subtracting the first spectrogram from the noisy spectrogram, to obtain a purified representation of the voice of the human; and outputting, by an I/O-related outputting device, a response using the purified representation of the voice of the human.
Type: Grant
Filed: February 16, 2021
Date of Patent: December 12, 2023
Assignee: GUANGDONG OPPO MOBILE TELECOMMUNICATIONS CORP., LTD.
Inventors: Yuan Lin, Jenhao Hsiao, Chiuman Ho
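The final subtraction step in this abstract, taken literally, is an elementwise spectrogram subtraction. The non-negative floor below is an assumption added so magnitudes stay valid; the networks that predict the first spectrogram are out of scope for this sketch:

```python
def purify(noisy_spec, predicted_spec, floor=0.0):
    # Elementwise subtraction of the predicted (interfering) spectrogram
    # from the noisy spectrogram, clamped at a non-negative floor.
    return [[max(n - p, floor) for n, p in zip(n_row, p_row)]
            for n_row, p_row in zip(noisy_spec, predicted_spec)]
```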