Patents Examined by Richemond Dorvill
-
Patent number: 12380882
Abstract: A speech processing system includes an input for receiving an input utterance spoken by a user and a word alignment unit configured to align different sequences of acoustic speech models with the input utterance spoken by the user. Each different sequence of acoustic speech models corresponds to a different possible utterance that a user might make. The system identifies any parts of a read prompt text that the user skipped; any parts of the read prompt text that the user repeated; and any speech sounds that the user inserted between words of the read prompt text. The information from the word alignment unit can be used to assess the proficiency and/or fluency of the user's speech.
Type: Grant
Filed: June 19, 2020
Date of Patent: August 5, 2025
Assignee: THE CHANCELLOR, MASTERS, AND SCHOLARS OF THE UNIVERSITY OF CAMBRIDGE
Inventors: Thomas William John Ash, Anthony John Robinson
-
Patent number: 12380912
Abstract: A method is provided that is performed at a system including multiple speech collectors to collect speech from a talker to produce corresponding ones of multiple audio signals that each convey speech energy: for each audio signal: separating high-frequency speech energy from low-frequency speech energy; and determining a first energy level of the high-frequency speech energy; and determining a preferred audio signal among the multiple audio signals for subsequent processing at least based on the first energy level of each audio signal.
Type: Grant
Filed: March 8, 2022
Date of Patent: August 5, 2025
Assignee: CISCO TECHNOLOGY, INC.
Inventor: Asbjorn Therkelsen
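The selection step in the abstract above can be illustrated with a minimal sketch. It is not the patented method: the first-difference filter used here as a crude high-pass, the function names `high_band_energy` and `pick_preferred`, and the synthetic test signals are all assumptions made for illustration.

```python
import math


def high_band_energy(samples):
    """Energy of the first difference of the signal, a crude high-pass filter (assumption)."""
    return sum((b - a) ** 2 for a, b in zip(samples, samples[1:]))


def pick_preferred(signals):
    """Return the index of the signal with the largest high-frequency energy."""
    energies = [high_band_energy(s) for s in signals]
    return max(range(len(signals)), key=energies.__getitem__)


# Two synthetic "microphone" signals at a hypothetical 16 kHz sample rate.
RATE = 16000
N = 1600
low_only = [math.sin(2 * math.pi * 200 * t / RATE) for t in range(N)]
with_highs = [
    x + 0.5 * math.sin(2 * math.pi * 4000 * t / RATE)
    for t, x in enumerate(low_only)
]

preferred = pick_preferred([low_only, with_highs])
```

Here the second signal carries extra 4 kHz content, so it is the one selected. A real system would likely use a proper filter bank and combine this measure with other criteria.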
-
Patent number: 12380910
Abstract: A computer-implemented machine learning method for improving speaker separation is provided. The method comprises processing audio data to generate prepared audio data and determining feature data and speaker data from the prepared audio data through a clustering iteration to generate an audio file. The method further comprises re-segmenting the audio file to generate a speaker segment and causing to display the speaker segment through a client device.
Type: Grant
Filed: June 30, 2021
Date of Patent: August 5, 2025
Assignee: RingCentral, Inc.
Inventors: Prashant Kukde, Sushant Shivram Hiray
-
Patent number: 12367885
Abstract: An inter-channel phase difference (IPD) parameter extraction method includes obtaining a parameter for obtaining an information extraction manner for a current frame of a multi-channel signal; obtaining an IPD parameter extraction manner for the current frame based on the parameter for obtaining the information extraction manner, where the obtained IPD parameter extraction manner is one of at least two preset IPD parameter extraction manners; and obtaining an IPD parameter of the current frame based on the obtained IPD parameter extraction manner for the current frame.
Type: Grant
Filed: January 19, 2024
Date of Patent: July 22, 2025
Assignee: HUAWEI TECHNOLOGIES CO., LTD.
Inventors: Xingtao Zhang, Haiting Li, Zexin Liu, Lei Miao
-
Patent number: 12361218
Abstract: Named Entity Recognition is the identification and classification of named entities within a document. Nested NEs occur when an NE is contained within another NE. The disclosed invention leverages the CapsNet architecture for improved nested NE identification and classification. This includes deriving the features of an input text. The derived features are used to identify and classify any named entities in the text. The system is further configured to identify named entities in the text and perform clustering to group named entities. The disclosed CapsNet considers the context of the whole text to activate higher capsule layers in order to identify the named entities and classify them. The teachings of this invention are applicable to other NER models to improve nested NE identification and classification.
Type: Grant
Filed: April 14, 2022
Date of Patent: July 15, 2025
Inventors: Suzanne M Kirch, Rajiv Baronia, Vineeth Thanikonda Munirathnam, Jack Porter
-
Patent number: 12347423
Abstract: A method for decoding an audio query may include: extracting one or more acoustic features from the audio query in response to receiving the audio query from a user; determining a generic word and a domain specific word based on the one or more acoustic features; and decoding the audio query based on the generic word, the domain specific word, and the one or more acoustic features to identify at least one word associated with the audio query.
Type: Grant
Filed: March 11, 2022
Date of Patent: July 1, 2025
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Ankur Kumar, Shatrughan Singh, Aman Maghan, Ashutosh Gupta, Sachin Singh
-
Patent number: 12334103
Abstract: Methods, systems, and apparatus for door knock access control are disclosed. A monitoring system for monitoring a property includes: a proximity sensor located at a door of the property; a microphone that is configured to detect sound within an area near the door and generate audio data that represents the detected sound; and a monitor control unit configured to perform operations including: receiving, from the proximity sensor, proximity data indicating an object positioned within a set proximity to the door; based on receiving the proximity data, activating the microphone; receiving, from the microphone, the audio data; determining that a similarity between the audio data and stored audio data representing a knocking pattern satisfies similarity criteria; and in response to determining that the similarity between the audio data and the stored audio data satisfies similarity criteria, performing a monitoring system action.
Type: Grant
Filed: December 30, 2020
Date of Patent: June 17, 2025
Assignee: Alarm.com Incorporated
Inventors: William Wireko Mensah, Daniel John Koniar, Liyu Yao, Martin Logan Elliott, John Zhang
-
Patent number: 12333268
Abstract: Systems and methods for handling multilingual queries are provided. One example method includes receiving, at a computing device, an input, wherein the input comprises a multi-lingual query comprising at least a first source language and a second source language. The multi-lingual query is translated, word for word, into a destination language to produce a monolingual query, with the word order of the multilingual query and the word order of the monolingual query being the same. The monolingual query is processed using natural language processing to map the mono-lingual query to a natural language query in the destination language.
Type: Grant
Filed: February 8, 2024
Date of Patent: June 17, 2025
Assignee: ADEIA GUIDES INC.
Inventors: Ajay Kumar Mishra, Jeffry Copps Robert Jose
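The word-for-word step described in the abstract above can be sketched as a dictionary lookup that leaves word order untouched. This is only an illustration under stated assumptions: the tiny `LEXICON`, the function name `translate_word_for_word`, and the sample query are hypothetical, and the later natural language processing stage is not shown.

```python
# Hypothetical Spanish-to-English lexicon; a real system would use a full
# translation resource for each source language.
LEXICON = {"película": "movie", "de": "of", "acción": "action"}


def translate_word_for_word(query, lexicon):
    """Translate each word independently, keeping the original word order.

    Words missing from the lexicon are assumed to be in the destination
    language already and are passed through unchanged.
    """
    return " ".join(lexicon.get(word.lower(), word) for word in query.split())


mixed_query = "show me película de acción"
monolingual_query = translate_word_for_word(mixed_query, LEXICON)
```

The result, "show me movie of action", has the same number of words in the same positions as the input; turning it into a fluent query such as "show me action movies" is the job of the subsequent mapping step.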
-
Patent number: 12334052
Abstract: This application relates to a method for training a speech recognition model comprising: providing a speech training data set comprising a plurality of speech data items and corresponding speech tags; providing a speech recognition model to be trained comprising a convolution neural network, a first fully connected network, a recurrent neural network and a second fully connected network which are cascade coupled together, wherein each of the networks comprises one or more network layers each having a parameter matrix; and the speech recognition model processing speech data items to generate corresponding speech recognition results; and using the speech training data set to train the speech recognition model such that the parameter matrices of at least two adjacent network layers satisfies a predetermined constraint condition; and the speech recognition model trained using at least one loss function can generate speech recognition results at an accuracy satisfying a predetermined recognition target.
Type: Grant
Filed: July 12, 2022
Date of Patent: June 17, 2025
Assignee: MONTAGE LZ TECHNOLOGIES (CHENGDU) CO., LTD.
Inventors: Xuepeng Wang, Taibo Dong, Gang Hu
-
Patent number: 12335328
Abstract: This disclosure provides a network call method and apparatus, a computer device, and a storage medium, and belongs to the field of audio data processing. The method includes: performing time-frequency transformation on an acquired audio signal, to obtain a plurality of pieces of frequency domain information of the audio signal; determining a target bit rate corresponding to the audio signal according to the plurality of pieces of frequency domain information; and encoding the audio signal based on the target bit rate, and performing a network call based on the encoded audio signal.
Type: Grant
Filed: October 21, 2021
Date of Patent: June 17, 2025
Assignee: Tencent Technology (Shenzhen) Company Limited
Inventor: Junbin Liang
-
Patent number: 12327573
Abstract: A method of presenting a signal to a speech recognition engine is disclosed. According to an example of the method, an audio signal is received from a user. A portion of the audio signal is identified, the portion having a first time and a second time. A pause in the portion of the audio signal, the pause comprising the second time, is identified. It is determined whether the pause indicates the completion of an utterance of the audio signal. In accordance with a determination that the pause indicates the completion of the utterance, the portion of the audio signal is presented as input to the speech recognition engine. In accordance with a determination that the pause does not indicate the completion of the utterance, the portion of the audio signal is not presented as input to the speech recognition engine.
Type: Grant
Filed: April 16, 2020
Date of Patent: June 10, 2025
Assignee: Magic Leap, Inc.
Inventors: Anthony Robert Sheeder, Tushar Arora
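The gating logic in the abstract above can be sketched with a simple duration rule. The abstract does not say how the determination is made, so the energy threshold, the minimum pause length, and the names `pause_ends_utterance` and `maybe_submit` are assumptions chosen purely for illustration.

```python
def pause_ends_utterance(frame_energies, silence_threshold=0.01, min_pause_frames=30):
    """Return True if speech is followed by a trailing pause of sufficient length.

    frame_energies: per-frame energy values for the audio portion, oldest first.
    The threshold and minimum length are hypothetical tuning values.
    """
    trailing_quiet = 0
    for energy in reversed(frame_energies):
        if energy >= silence_threshold:
            break
        trailing_quiet += 1
    had_speech = trailing_quiet < len(frame_energies)
    return had_speech and trailing_quiet >= min_pause_frames


def maybe_submit(frame_energies, recognizer):
    """Pass the portion to the recognizer only when the pause ends the utterance."""
    if pause_ends_utterance(frame_energies):
        return recognizer(frame_energies)
    return None


speech = [0.5] * 50
short_pause = speech + [0.0] * 10   # likely a mid-sentence hesitation
long_pause = speech + [0.0] * 40    # likely the end of the utterance
```

With these inputs, `maybe_submit(short_pause, ...)` withholds the audio while `maybe_submit(long_pause, ...)` forwards it. A production system could replace the duration rule with richer cues such as prosody or context.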
-
Patent number: 12327091
Abstract: A system for translating speech from at least two source languages into another target language provides direct speech to target language translation. The target text is converted to speech in the target language through a TTS system. The system simplifies speech recognition and translation process by providing direct translation, includes mechanisms described herein that facilitate mixed language source speech translation, and punctuating output text streams in the target language. It also in some embodiments allows translation of speech into the target language to reflect the voice of the speaker of the source speech based on characteristics of the source language speech and speaker's voice and to produce subtitled data in the target language corresponding to the source speech. The system uses models having been trained using (i) encoder-decoder architectures with attention mechanisms and training data using TTS and (ii) parallel text training data in more than two different languages.
Type: Grant
Filed: January 13, 2020
Date of Patent: June 10, 2025
Assignee: Applications Technology (AppTek), LLC
Inventors: Evgeny Matusov, Jintao Jiang, Mudar Yaghi
-
Patent number: 12327563
Abstract: Systems and methods for audio processing are described. The system may include a display; a user input device; an audio communications terminal; a recording medium; a tagging system for tagging each of the messages recorded on the recording medium as a function of a question with which each of the messages recorded on the recording medium is associated; and a processor comprising code for: initiating one of the questions of the plurality of questions; displaying the one of the questions of the plurality of questions, having been initiated, on the display; recording via the audio communications terminal audio for a respective response associated with the one of the questions of the plurality of questions, having been initiated; directing the tagging system to tag the response associated with the one of the questions of the plurality of questions; and repeating the steps of each of the plurality of questions.
Type: Grant
Filed: February 24, 2021
Date of Patent: June 10, 2025
Inventor: Kenneth O'Reilly
-
Patent number: 12296028
Abstract: A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag. The high frequency regeneration is performed as a post-processing operation with a delay of 3010 samples per audio channel.
Type: Grant
Filed: December 16, 2024
Date of Patent: May 13, 2025
Assignee: DOLBY INTERNATIONAL AB
Inventors: Kristofer Kjoerling, Lars Villemoes, Heiko Purnhagen, Per Ekstrand
-
Patent number: 12300258
Abstract: A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag. The high frequency regeneration is performed as a post-processing operation with a delay of 3010 samples per audio channel.
Type: Grant
Filed: December 16, 2024
Date of Patent: May 13, 2025
Assignee: DOLBY INTERNATIONAL AB
Inventors: Kristofer Kjoerling, Lars Villemoes, Heiko Purnhagen, Per Ekstrand
-
Patent number: 12293150
Abstract: A registree management function receives member (user) registration, carries out a survey upon registration, performs category classification for the registered user, learns the classified categories, and the like. A comment analysis function performs text mining on comments acquired from an SNS posted comment server, determines post origin positions identified by the text mining and the level of credibility thereof, and executes evaluation and the like of a target relating to a theme. An information provision function edits a social heat map generated based on the results of analyzing the comments to be provided to the user, and also performs user category management and the like.
Type: Grant
Filed: June 17, 2019
Date of Patent: May 6, 2025
Assignee: Takenaka Corporation
Inventors: Kuniaki Andou, Rikuto Kunimoto, Takeshi Takai, Kazuo Ohtake
-
Patent number: 12288029
Abstract: Systems, apparatuses, methods, and computer program products are disclosed for distillation of a natural language processing model. An example method includes receiving, by communications circuitry, a set of text data comprising a set of observations and predicting, by processing circuitry and using the NLP model, classifications for each observation in the text data. The example method further includes generating, by model training engine, a balanced sampled data structure based on the predicted classifications for each observation in the text data and training, by the model training engine, a surrogate model using the balanced sampled data structure. The example method further includes identifying, by an interpreter and from the surrogate model, a set of most-influential tokens in the text data.
Type: Grant
Filed: May 14, 2024
Date of Patent: April 29, 2025
Assignee: Wells Fargo Bank, N.A.
Inventors: Ye Yu, Harsh Singhal, Wayne B. Shoumaker
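One plausible reading of the "balanced sampled data structure" in the abstract above is a sample that draws the same number of observations from each predicted class. The sketch below illustrates that reading only; the function name `balanced_sample`, the `per_class` and `seed` parameters, and the toy data are assumptions, and the abstract does not specify the actual sampling scheme.

```python
import random
from collections import defaultdict


def balanced_sample(texts, predicted_labels, per_class, seed=0):
    """Draw up to per_class observations from each predicted class.

    Returns a list of (text, predicted_label) pairs that could serve as
    training data for a simpler surrogate model.
    """
    by_label = defaultdict(list)
    for text, label in zip(texts, predicted_labels):
        by_label[label].append(text)

    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    sample = []
    for label in sorted(by_label):
        pool = by_label[label]
        count = min(per_class, len(pool))
        sample.extend((text, label) for text in rng.sample(pool, count))
    return sample


# Toy data: eight observations predicted "a", two predicted "b".
texts = [f"observation {i}" for i in range(10)]
labels = ["a"] * 8 + ["b"] * 2
balanced = balanced_sample(texts, labels, per_class=2)
```

Sampling on the predicted labels rather than ground truth means the surrogate learns to mimic the original model's behaviour, which is what makes its most influential tokens informative about that model.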
-
Patent number: 12283271
Abstract: A method of providing voice feedback to a listener as part of a user interface of a media playback system may include: storing multiple different voice feedback recordings in at least one computer-readable storage device, where each of the multiple different voice feedback recordings is of a different voice artist; receiving a listener command corresponding to a musical selection; determining an identifying musical characteristic of the musical selection; selecting a first voice feedback recording from the multiple different voice feedback recordings, where the first voice feedback recording corresponds to the identifying musical characteristic; and playing the first voice feedback recording to the listener via the media playback system.
Type: Grant
Filed: May 18, 2021
Date of Patent: April 22, 2025
Assignee: Spotify AB
Inventor: Sten Garmark
-
Patent number: 12272366
Abstract: Systems and methods are disclosed for providing voice interactions based on user context. Data is received that causes a voice interaction to be generated for output at a user device during an output time interval. In response, current user contextual data of the user device is retrieved. The voice interaction and output time interval are altered to increase consumption likelihood of the voice interaction based on the current user contextual data. The altered voice interaction is outputted at the user device during the altered output time interval.
Type: Grant
Filed: April 29, 2021
Date of Patent: April 8, 2025
Assignee: Adeia Guides Inc.
Inventors: Ankur Anil Aher, Jeffry Copps Robert Jose, Reda Harb
-
Patent number: 12266348
Abstract: An image display device and a voice recognition method thereof are proposed. The image display device and voice recognition method thereof stores received first utterance data in a first buffer when a wake up word is recognized from utterance data, stores second utterance data consecutively uttered from a user in a second buffer, stores third utterance data consecutively uttered from the user in a third buffer when verification of the wake up word is successful on the basis of the first utterance data after booting of an operating system is completed, and joins the second uttered data and the third uttered data respectively stored in the second buffer and the third buffer, so as to perform voice recognition for the utterance data consecutively uttered by the user. Accordingly, including the commands uttered at intervals after the wake up word, user's commands that are consecutively uttered may be recognized.
Type: Grant
Filed: September 24, 2019
Date of Patent: April 1, 2025
Assignee: LG ELECTRONICS INC.
Inventors: Woo Jin Choi, Sung Eun Kim, Hyun Woo Park, Eun Kyung Jung, Dae Gon Chae