Speech Signal Processing Patents (Class 704/200)
  • Patent number: 12033659
    Abstract: Disclosed is a method for determining important parts among a speech-to-text (STT) result and reference data, which is performed by a computing device. The method may include acquiring STT data generated by performing STT with respect to a speech signal; acquiring reference data; determining first important information in one data among the STT data and the reference data; and determining second important information linked with the first important information in other data different from data in which the first important information is determined among the STT data and the reference data.
    Type: Grant
    Filed: December 27, 2023
    Date of Patent: July 9, 2024
    Assignee: ActionPower Corp.
    Inventors: Hyungwoo Kim, Hwanbok Mun, Kangwook Kim
  • Patent number: 12033611
    Abstract: A system for use in video game development to generate expressive speech audio comprises a user interface configured to receive user-input text data and a user selection of a speech style. The system includes a machine-learned synthesizer comprising a text encoder, a speech style encoder and a decoder. The machine-learned synthesizer is configured to generate one or more text encodings derived from the user-input text data, using the text encoder of the machine-learned synthesizer; generate a speech style encoding by processing a set of speech style features associated with the selected speech style using the speech style encoder of the machine-learned synthesizer; combine the one or more text encodings and the speech style encoding to generate one or more combined encodings; and decode the one or more combined encodings with the decoder of the machine-learned synthesizer to generate predicted acoustic features.
    Type: Grant
    Filed: February 28, 2022
    Date of Patent: July 9, 2024
    Assignee: ELECTRONIC ARTS INC.
    Inventors: Siddharth Gururani, Kilol Gupta, Dhaval Shah, Zahra Shakeri, Jervis Pinto, Mohsen Sardari, Navid Aghdaie, Kazi Zaman
  • Patent number: 12027177
    Abstract: A computer-implemented method to determine whether to introduce latency into an audio stream from a particular speaker includes receiving an audio stream from a sender device. The method further includes providing, as input to a trained machine-learning model, the audio stream and a speech analysis score, information about one or more voice emotion parameters, and one or more voice emotion scores for a first user associated with the sender device, wherein the trained machine-learning model is iteratively applied to the audio stream and wherein each iteration corresponds to a respective portion of the audio stream. The method further includes generating as output, with the trained machine-learning model, a level of toxicity in the audio stream. The method further includes transmitting the audio stream to a recipient device, wherein the transmitting is performed to introduce a time delay in the audio stream based on the level of toxicity.
    Type: Grant
    Filed: September 8, 2022
    Date of Patent: July 2, 2024
    Assignee: Roblox Corporation
    Inventors: Mahesh Kumar Nandwana, Philippe Clavel, Morgan McGuire
  • Patent number: 12027151
    Abstract: A linguistic content and speaking style disentanglement model includes a content encoder, a style encoder, and a decoder. The content encoder is configured to receive input speech as input and generate a latent representation of linguistic content for the input speech as output. The content encoder is trained to disentangle speaking style information from the latent representation of linguistic content. The style encoder is configured to receive the input speech as input and generate a latent representation of speaking style for the input speech as output. The style encoder is trained to disentangle linguistic content information from the latent representation of speaking style. The decoder is configured to generate output speech based on the latent representation of linguistic content for the input speech and the latent representation of speaking style for the same or different input speech.
    Type: Grant
    Filed: November 18, 2021
    Date of Patent: July 2, 2024
    Assignee: Google LLC
    Inventors: Ruoming Pang, Andros Tjandra, Yu Zhang, Shigeki Karita
  • Patent number: 12027154
    Abstract: A method includes receiving a training example that includes audio data representing a spoken utterance and a ground truth transcription. For each word in the spoken utterance, the method also includes inserting a placeholder symbol before the respective word identifying a respective ground truth alignment for a beginning and an end of the respective word, determining a beginning word piece and an ending word piece, and generating a first constrained alignment for the beginning word piece and a second constrained alignment for the ending word piece. The first constrained alignment is aligned with the ground truth alignment for the beginning of the respective word and the second constrained alignment is aligned with the ground truth alignment for the ending of the respective word. The method also includes constraining an attention head of a second pass decoder by applying the first and second constrained alignments.
    Type: Grant
    Filed: February 9, 2023
    Date of Patent: July 2, 2024
    Assignee: Google LLC
    Inventors: Tara N. Sainath, Basilio Garcia Castillo, David Rybach, Trevor Strohman, Ruoming Pang
  • Patent number: 12020405
    Abstract: A computer-implemented method for training a convolutional neural network includes receiving a captured image. A denoised image is generated by applying the convolutional neural network to the captured image. The convolutional neural network is trained based on a high frequency loss function, as well as the captured image and the denoised image.
    Type: Grant
    Filed: November 3, 2021
    Date of Patent: June 25, 2024
    Assignee: LEICA MICROSYSTEMS CMS GMBH
    Inventor: Jose Miguel Serra Lleti
  • Patent number: 12021822
    Abstract: A computer-implemented method includes receiving a communication between first and second users via a communication channel associated with a communication space, and identifying the first user having a first role and the second user having a second role. A formality of the communication is determined based on the second role. The method includes identifying a transformer model for the communication space and monitoring the communication for an agreement clause via the transformer model by deriving an agreement clause based on the communication and classifying the derived agreement clause.
    Type: Grant
    Filed: October 4, 2022
    Date of Patent: June 25, 2024
    Assignee: International Business Machines Corporation
    Inventors: Aaron K. Baughman, Jeremy R. Fox, Raghuveer Prasad Nagar, Dinesh Kumar Bhudavaram
  • Patent number: 12020708
    Abstract: Methods and systems for enabling an efficient review of meeting content via a metadata-enriched, speaker-attributed transcript are disclosed. By incorporating speaker diarization and other metadata, the system can provide a structured and effective way to review and/or edit the transcript. One type of metadata can be image or video data to represent the meeting content. Furthermore, the present subject matter utilizes a multimodal diarization model to identify and label different speakers. The system can synchronize various sources of data, e.g., audio channel data, voice feature vectors, acoustic beamforming, image identification, and extrinsic data, to implement speaker diarization.
    Type: Grant
    Filed: October 11, 2021
    Date of Patent: June 25, 2024
    Assignee: SoundHound AI IP, LLC.
    Inventors: Kiersten L. Bradley, Ethan Coeytaux, Ziming Yin
  • Patent number: 12013884
    Abstract: A modular two-stage neural architecture is used in translating a natural language question into a logic form such as a SPARQL Protocol and RDF Query Language (SPARQL) query. In a first stage, a neural machine translation (NMT)-based sequence-to-sequence (Seq2Seq) model translates a question into a sketch of the desired SPARQL query called a SPARQL silhouette. In a second stage a neural graph search module predicts the correct relations in the underlying knowledge graph.
    Type: Grant
    Filed: June 30, 2022
    Date of Patent: June 18, 2024
    Assignee: International Business Machines Corporation
    Inventors: Saswati Dana, Dinesh Garg, Dinesh Khandelwal, G P Shrivatsa Bhargav, Sukannya Purkayastha
  • Patent number: 11993817
    Abstract: The present invention relates to coding of audio signals, and in particular to high frequency reconstruction methods including a frequency domain harmonic transposer. A system and method for generating a high frequency component of a signal from a low frequency component of the signal is described.
    Type: Grant
    Filed: January 19, 2023
    Date of Patent: May 28, 2024
    Assignee: Dolby International AB
    Inventors: Lars Villemoes, Per Ekstrand
  • Patent number: 11961525
    Abstract: This document generally describes systems, methods, devices, and other techniques related to speaker verification, including (i) training a neural network for a speaker verification model, (ii) enrolling users at a client device, and (iii) verifying identities of users based on characteristics of the users' voices. Some implementations include a computer-implemented method. The method can include receiving, at a computing device, data that characterizes an utterance of a user of the computing device. A speaker representation can be generated, at the computing device, for the utterance using a neural network on the computing device. The neural network can be trained based on a plurality of training samples that each: (i) include data that characterizes a first utterance and data that characterizes one or more second utterances, and (ii) are labeled as a matching speakers sample or a non-matching speakers sample.
    Type: Grant
    Filed: August 3, 2021
    Date of Patent: April 16, 2024
    Assignee: Google LLC
    Inventors: Georg Heigold, Samuel Bengio, Ignacio Lopez Moreno
  • Patent number: 11947593
    Abstract: A system, method, and computer program product for hierarchical categorization of sound comprising one or more neural networks implemented on one or more processors. The one or more neural networks are configured to categorize a sound into a two or more tiered hierarchical coarse categorization and a finest level categorization in the hierarchy. The categorized sound may be used to search a database for similar or contextually related sounds.
    Type: Grant
    Filed: September 28, 2018
    Date of Patent: April 2, 2024
    Inventors: Arindam Jati, Naveen Kumar, Ruxin Chen
  • Patent number: 11908483
    Abstract: This application relates to a method of extracting an inter channel feature from a multi-channel multi-sound source mixed audio signal performed at a computing device.
    Type: Grant
    Filed: August 12, 2021
    Date of Patent: February 20, 2024
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Rongzhi Gu, Shixiong Zhang, Lianwu Chen, Yong Xu, Meng Yu, Dan Su, Dong Yu
  • Patent number: 11903067
    Abstract: An audio forwarding method, an audio forwarding device and a storage medium are described. The audio forwarding method comprises: establishing a first communication link with a sound source device based on a first wireless communication protocol; establishing a second communication link with an audio playback device based on a second wireless communication protocol; receiving first audio data from the sound source device through the first communication link; processing the first audio data to generate second audio data, and storing the second audio data into a second buffer; and transmitting the second audio data to the audio playback device through the second communication link.
    Type: Grant
    Filed: June 24, 2022
    Date of Patent: February 13, 2024
    Assignee: Nanjing Zgmicro Company Limited
    Inventor: Bin Xu
  • Patent number: 11894015
    Abstract: An embedded sensor can include an audio detector, a digital signal processor, a library, and a rules engine. The digital signal processor can be configured to receive signals from the audio detector and to identify the environment in which the embedded sensor is located. The library can store statistical models associated with specific environments, and the digital signal processor can be configured to identify specific events based on detected sounds within the particular environment by utilizing the statistical model associated with the particular environment. The DSP can associate a probability of accuracy for the identified audible event. A rules engine can be configured to receive the probability and transmit a report of the detected audible event.
    Type: Grant
    Filed: October 31, 2022
    Date of Patent: February 6, 2024
    Assignee: CELLULAR SOUTH, INC.
    Inventors: Brett Rogers, Tommy Naugle, Stephen Bye, Craig Sparks, Arman Kirakosyan
  • Patent number: 11854561
    Abstract: The invention provides an audio encoder including a combination of a linear predictive coding filter having a plurality of linear predictive coding coefficients and a time-frequency converter, wherein the combination is configured to filter and to convert a frame of the audio signal into a frequency domain in order to output a spectrum based on the frame and on the linear predictive coding coefficients; a low frequency emphasizer configured to calculate a processed spectrum based on the spectrum, wherein spectral lines of the processed spectrum representing a lower frequency than a reference spectral line are emphasized; and a control device configured to control the calculation of the processed spectrum by the low frequency emphasizer depending on the linear predictive coding coefficients of the linear predictive coding filter.
    Type: Grant
    Filed: November 22, 2022
    Date of Patent: December 26, 2023
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Stefan Doehla, Bernhard Grill, Christian Helmrich, Nikolaus Rettelbach
  • Patent number: 11823670
    Abstract: Utterance-based user interfaces can include activation trigger processing techniques for detecting activation triggers and causing execution of certain commands associated with particular command pattern activation triggers without waiting for output from a separate speech processing engine. The activation trigger processing techniques can also detect speech analysis patterns and selectively activate a speech processing engine.
    Type: Grant
    Filed: April 17, 2020
    Date of Patent: November 21, 2023
    Assignee: Spotify AB
    Inventor: Richard Mitic
  • Patent number: 11810572
    Abstract: A system, method, and computer-program product includes distributing a plurality of audio data files of a speech data corpus to a plurality of computing nodes that each implement a plurality of audio processing threads, executing the plurality of audio processing threads associated with each of the plurality of computing nodes to detect a plurality of tentative speakers participating in each of the plurality of audio data files, generating, via a clustering algorithm, a plurality of clusters of embedding signatures based on a plurality of embedding signatures associated with the plurality of tentative speakers in each of the plurality of audio data files, and detecting a plurality of global speakers associated with the speech data corpus based on the plurality of clusters of embedding signatures.
    Type: Grant
    Filed: June 8, 2023
    Date of Patent: November 7, 2023
    Assignee: SAS INSTITUTE INC.
    Inventors: Xiaozhuo Cheng, Xiaolong Li, Xu Yang
  • Patent number: 11810593
    Abstract: A system configured to perform low power mode wakeword detection is provided. A device reduces power consumption without compromising functionality by placing a primary processor into a low power mode and using a secondary processor to monitor for sound detection. The secondary processor stores input audio data in a buffer component while performing sound detection on the input audio data. If the secondary processor detects a sound, the secondary processor sends an interrupt signal to the primary processor, causing the primary processor to enter an active mode. While in the active mode, the primary processor performs wakeword detection using the buffered audio data. To reduce a latency, the primary processor processes the buffered audio data at an accelerated rate. In some examples, the device may further reduce power consumption by including a second buffer component and only processing the input audio data after detecting a sound.
    Type: Grant
    Filed: November 6, 2020
    Date of Patent: November 7, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Dibyendu Nandy, Om Prakash Gangwal
  • Patent number: 11804238
    Abstract: An optimization method for an implementation of mel-frequency cepstral coefficients is provided. The optimization method includes the following steps: performing a framing step, including using a 400×16 static random access memory to temporarily store a plurality of sampling points of a sound signal with overlap, and decomposing the sound signal into a plurality of frames. Each of the plurality of frames is 400 of the sampling points, there is an overlapping region between adjacent two of the plurality of frames, and the overlapping region includes 240 of the sampling points. The optimization method further includes performing a windowing step, which includes multiplying each of the plurality of frames by a window function in a bit-level design, and the optimization method includes performing a fast Fourier transform (FFT) step, which includes applying a 512 point FFT on a frame signal to obtain a corresponding frequency spectrum.
    Type: Grant
    Filed: October 29, 2021
    Date of Patent: October 31, 2023
    Assignee: REALTEK SEMICONDUCTOR CORP.
    Inventors: Li-Li Tan, Zhi-Lin Wang, Xiao-Feng Cao, Xiao-Huan Li
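    Illustrative sketch (not part of the patent record): the framing, windowing and FFT steps named in the abstract above can be mimicked in software roughly as follows. The frame length (400), overlap (240) and FFT size (512) come from the abstract; the Hamming window, the zero-padding from 400 to 512 samples, and all function names are assumptions made for illustration. The patent itself concerns a hardware implementation, which this sketch does not model.

```python
import cmath
import math

FRAME_LEN = 400            # samples per frame (from the abstract)
OVERLAP = 240              # samples shared by adjacent frames (from the abstract)
HOP = FRAME_LEN - OVERLAP  # 160 new samples per frame
FFT_SIZE = 512             # FFT length (from the abstract)


def split_frames(signal):
    """Cut the signal into overlapping 400-sample frames, dropping any partial tail."""
    return [signal[start:start + FRAME_LEN]
            for start in range(0, len(signal) - FRAME_LEN + 1, HOP)]


def hamming(n):
    """Hamming window; the window choice is an assumption, the abstract names none."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def frame_spectrum(frame, window):
    """Window one frame, zero-pad it to 512 samples, and return its spectrum."""
    windowed = [s * w for s, w in zip(frame, window)]
    padded = windowed + [0.0] * (FFT_SIZE - len(windowed))
    return fft(padded)
```

    With these numbers each new frame advances by 400 - 240 = 160 samples, so a 720-sample signal yields three full frames.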
  • Patent number: 11798566
    Abstract: The present disclosure discloses a data transmission method performed by a computer device and a non-transitory computer-readable storage medium. According to the present disclosure, voice criticality analysis is performed on a to-be-transmitted audio to obtain a criticality level of each to-be-transmitted audio frame in the to-be-transmitted audio, and a corrected redundancy multiple of each to-be-transmitted audio frame is obtained according to a current redundancy multiple and a redundant transmission factor corresponding to the criticality level of each to-be-transmitted audio frame. Therefore, each to-be-transmitted audio frame is duplicated according to a corrected redundancy multiple of each to-be-transmitted audio frame, to obtain at least one redundancy data packet, and the at least one redundancy data packet is transmitted to a target terminal, which can improve the network anti-packet loss effect without causing network congestion.
    Type: Grant
    Filed: October 28, 2021
    Date of Patent: October 24, 2023
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventor: Junbin Liang
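    Illustrative sketch (not part of the patent record): the abstract above says each frame's corrected redundancy multiple is derived from the current redundancy multiple and a factor tied to the frame's criticality level, but it does not say how the two are combined. The sketch below assumes multiplication followed by rounding up; the level names, the factor values and the function names are all hypothetical.

```python
import math

# Hypothetical factors per criticality level; the abstract gives no values.
REDUNDANCY_FACTOR = {"low": 0.5, "medium": 1.0, "high": 2.0}


def corrected_multiple(current_multiple, criticality):
    """Scale the current multiple by the level's factor (assumed: multiply, round up)."""
    return math.ceil(current_multiple * REDUNDANCY_FACTOR[criticality])


def redundancy_packets(frames, current_multiple):
    """Build (payload, copy_index) redundancy packets from (payload, criticality) frames."""
    packets = []
    for payload, criticality in frames:
        copies = corrected_multiple(current_multiple, criticality)
        packets.extend((payload, i) for i in range(copies))
    return packets
```

    Under these assumed factors, more critical frames are duplicated more often and less critical frames less often, which is one way to improve loss resilience without raising the overall redundancy for every frame.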
  • Patent number: 11790924
    Abstract: In a stereo encoding method, a channel combination encoding solution of a current frame is first obtained, and then a quantized channel combination ratio factor of the current frame and an encoding index of the quantized channel combination ratio factor are obtained based on the obtained channel combination encoding solution, so that an obtained primary channel signal and secondary channel signal of the current frame meet a characteristic of the current frame.
    Type: Grant
    Filed: November 9, 2022
    Date of Patent: October 17, 2023
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Bin Wang, Haiting Li, Lei Miao
  • Patent number: 11784712
    Abstract: A cell site test tool provides field technicians with resources to support multiple aspects of cell site testing. The cell site test tool includes multiple, integrated and removably connectable modules such as a base module, a user interface module, and a battery module. Additional modules include a CPRI module to provide Common Public Radio Interface testing, an OTDR module to provide dedicated Optical Time-Domain Reflectometer testing, a CAA module to provide Cable Antenna Analysis testing, a fiber inspection module to visually inspect optical fiber, and an SA/CPRI module to provide Radio Frequency over Common Public Radio Interface testing.
    Type: Grant
    Filed: March 29, 2022
    Date of Patent: October 10, 2023
    Assignee: VIAVI SOLUTIONS INC.
    Inventors: Reza Vaez-Ghaemi, Craig Stephen Boledovic, Andrew Thomas Rayno, Waleed Wardak, Michael Jon Bangert, Jr., Kanwaljit Singh Rekhi
  • Patent number: 11739641
    Abstract: A method for processing speech, comprising semantically parsing a received natural language speech input with respect to a plurality of predetermined command grammars in an automated speech processing system; determining if the parsed speech input unambiguously corresponds to a command and is sufficiently complete for reliable processing, then processing the command; if the speech input ambiguously corresponds to a single command or is not sufficiently complete for reliable processing, then prompting a user for further speech input to reduce ambiguity or increase completeness, in dependence on a relationship of previously received speech input and at least one command grammar of the plurality of predetermined command grammars, reparsing the further speech input in conjunction with previously parsed speech input, and iterating as necessary. The system also monitors abort, fail or cancel conditions in the speech input.
    Type: Grant
    Filed: April 13, 2021
    Date of Patent: August 29, 2023
    Assignee: Great Northern Research, LLC
    Inventors: Philippe Roy, Paul J. Lagassey
  • Patent number: 11711589
    Abstract: The present disclosure relates to a method and system for presenting a set of control functions via an interface of a peripheral control device (PCD). A control function can include a command associated with one or more media contexts of a host media device. The method decodes a payload, from the host media device, with an encoded context identifier, where the context identifier indicates a primary media context active on the host media device. The method determines one or more control functions corresponding to the context identifier, and changes the set of control functions on the interface of the PCD to include the one or more control functions that can command the primary media context.
    Type: Grant
    Filed: December 15, 2021
    Date of Patent: July 25, 2023
    Assignee: NAGRAVISION S.A.
    Inventors: Amudha Kaliamoorthi, Prabhu Chawandi, Karthikeyan Srinivasan, Jihyun Park, Jun Seo Lee
  • Patent number: 11692907
    Abstract: Dishwashing appliances and methods, as provided herein, may include features or steps such as measuring a first pressure in a sump with a pressure sensor and storing the first pressure in a memory of the dishwashing appliance as a reference pressure. Dishwashing appliances and methods may further include features or steps for measuring a second pressure within the sump with the pressure sensor after measuring the first pressure, and determining that a check valve is failed when the second pressure exceeds the first pressure by at least a predetermined margin.
    Type: Grant
    Filed: June 25, 2020
    Date of Patent: July 4, 2023
    Assignee: Haier US Appliance Solutions, Inc.
    Inventor: Kyle Edward Durham
  • Patent number: 11689484
    Abstract: The disclosed exemplary embodiments include computer-implemented systems, apparatuses, and processes that dynamically configure and populate a digital interface based on sequential elements of message data exchanged during a chatbot session established programmatically between an apparatus and a device. For example, the apparatus may generate first messaging data that includes a candidate input value for an interface element of a digital interface, and transmit the first messaging data to the device during the programmatically established chatbot session. The apparatus may also receive, from the device during the programmatically established chatbot session, second messaging data that includes a confirmation of the candidate input value. Based on the second messaging data, the apparatus may generate populated interface data that associates the interface element with the confirmed candidate input value, and store the populated interface data within a memory.
    Type: Grant
    Filed: September 18, 2019
    Date of Patent: June 27, 2023
    Assignee: The Toronto-Dominion Bank
    Inventors: Tae Gyun Moon, Robert Alexander McCarter, Kheiver Kayode Roberts
  • Patent number: 11676580
    Abstract: An electronic device is provided. The electronic device includes a microphone, and at least one processor operatively connected to the microphone, wherein the at least one processor may include a buffer memory configured to store a first feature vector for a first voice signal obtained from the microphone as an inverse value, and an operation circuit configured to perform a norm operation for a first feature vector and a second feature vector, based on the second feature vector, based on a second voice signal streamed from the microphone and an inverse value of the first feature vector stored in the buffer memory, or calculate a similarity between the first feature vector and the second feature vector. In addition, various embodiments identified through the specification are possible.
    Type: Grant
    Filed: April 30, 2021
    Date of Patent: June 13, 2023
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Hyunbin Park, Jin Choi
  • Patent number: 11670297
    Abstract: The various implementations described herein include methods and systems for determining device leadership among voice interface devices. In one aspect, a method is performed at a first electronic device of a plurality of electronic devices, each having microphones, a speaker, processors, and memory storing programs for execution by the processors. The first device detects a voice input. It determines a device state and a relevance of the voice input. It identifies a subset of electronic devices from the plurality to which the voice input is relevant. In accordance with a determination that the subset includes the first device, the first device determines a first score of a criterion associated with the voice input and receives second scores of the criterion from other devices in the subset. In accordance with a determination that the first score is higher than the second scores, the first device responds to the detected input.
    Type: Grant
    Filed: April 27, 2021
    Date of Patent: June 6, 2023
    Assignee: Google LLC
    Inventors: Kenneth Mixter, Diego Melendo Casado, Alexander Houston Gruenstein, Terry Tai, Christopher Thaddeus Hughes, Matthew Nirvan Sharifi
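    Illustrative sketch (not part of the patent record): the leadership decision described in the abstract above amounts to a membership check followed by a score comparison. The function and parameter names below are hypothetical, and the scoring criterion itself is left abstract. Because the abstract requires the first score to be higher than the others, this sketch uses a strict comparison, so tied devices all stay silent; any tie-breaking rule would be an additional assumption.

```python
def should_respond(own_id, own_score, peer_scores, relevant_ids):
    """Return True if this device should answer the detected voice input.

    peer_scores maps device id -> score received from that device.
    relevant_ids is the subset of devices to which the input is relevant.
    """
    # A device outside the relevant subset never responds.
    if own_id not in relevant_ids:
        return False
    # Only scores from other devices in the relevant subset are compared.
    others = [score for dev, score in peer_scores.items()
              if dev in relevant_ids and dev != own_id]
    # Respond only when the local score is strictly higher than every other score.
    return all(own_score > score for score in others)
```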
  • Patent number: 11663183
    Abstract: A method includes generating from a time-series dataset multiple corresponding time-slice datasets. Each time-slice dataset has a corresponding time-slice time index and includes field-value data strings and associated field-value-time-index data strings, or pointers indicating the corresponding strings in an earlier time-slice dataset, that are the latest in the time-series dataset that are also earlier than the corresponding time-slice time index. A query of the time-series dataset for latest data records earlier than a given query time index is performed by using the time-slice datasets to reduce or eliminate the need to directly access or interrogate the time-series dataset.
    Type: Grant
    Filed: August 23, 2021
    Date of Patent: May 30, 2023
    Assignee: MOONSHADOW MOBILE, INC.
    Inventors: Roy W. Ward, David S. Alavi
  • Patent number: 11664034
    Abstract: A method of parametric coding of a multichannel digital audio signal including coding a signal arising from a channels reduction processing applied to the multichannel signal and coding spatialization information of the multichannel signal. The method includes the following acts: extraction of a plurality of items of spatialization information of the multichannel signal; obtaining at least one representation model of the extracted spatialization information; determination of at least one angle parameter of a model obtained; coding the at least one determined angle parameter so as to code the spatialization information extracted during the coding of spatialization information. Also provided are a method for decoding such a coded signal and corresponding coding and decoding devices.
    Type: Grant
    Filed: December 22, 2020
    Date of Patent: May 30, 2023
    Assignee: ORANGE
    Inventors: Bertrand Fatus, Stephane Ragot, Marc Emerit
  • Patent number: 11651107
    Abstract: A method and system of securing personally identifiable and sensitive information in conversational AI based communication. The method comprises enabling, in response to identifying a conversation session initiated with a client device, a first service provider device in a set of service providers as communication channel provider of the incoming mode and enabling a second service provider device of the set as communication channel provider of the outgoing mode; and storing at least a portion of content of the incoming conversation in a first storage medium accessible to the first provider but not the second provider, and storing at least a portion of content from the outgoing conversation at a second storage medium accessible to the second provider device but not the first provider device.
    Type: Grant
    Filed: March 17, 2020
    Date of Patent: May 16, 2023
    Assignee: Ventech Solutions, Inc.
    Inventors: Ravi Kiran Pasupuleti, Ravi Kunduru
  • Patent number: 11635904
    Abstract: The present disclosure relates to the technical field of data access, and discloses a matrix storage method, a matrix access method, an apparatus and an electronic device in the technical field of data access. The matrix storage method includes: dividing a matrix into a plurality of data blocks with a preset segmentation granularity of N rows×M columns; the plurality of data blocks includes at least one first data block of N rows×M columns; if the column number of the matrix is not an integer multiple of M, the plurality of data blocks further includes at least one second data block of N rows×P columns, the second data block is aligned with an adjacent row of first data block; and storing the data in each of the first data blocks and the second data blocks continuously in an off-chip storage.
    Type: Grant
    Filed: June 22, 2020
    Date of Patent: April 25, 2023
    Assignee: KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED
    Inventors: Yuan Ruan, Haoyang Li
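    Illustrative sketch (not part of the patent record): the block-wise storage described in the abstract above can be pictured as flattening the matrix one block at a time, so that each block's elements sit next to each other in storage. The function name and the list-of-lists matrix representation are hypothetical. The abstract only covers a column count that is not a multiple of M; letting a trailing block also have fewer than N rows is an assumption of this sketch.

```python
def store_blocked(matrix, n, m):
    """Flatten a row-major matrix block by block, each block stored contiguously."""
    rows, cols = len(matrix), len(matrix[0])
    flat = []
    for r0 in range(0, rows, n):
        for c0 in range(0, cols, m):
            # The last column block is narrower (P = cols % m columns)
            # when cols is not an integer multiple of m.
            for row in matrix[r0:r0 + n]:
                flat.extend(row[c0:c0 + m])
    return flat
```

    For a 2×5 matrix with N = 2 and M = 2, this produces two 2×2 blocks followed by one 2×1 block (P = 1), each occupying a contiguous run in the output.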
  • Patent number: 11601548
    Abstract: Internet Protocol captioned telephone service often utilizing Automated Speech Recognition can be utilized with conference calls to separate out each of the various parties' speech as text, such as with text bubbles differentiated by caller on a device of the user. Additionally, a prioritized vocabulary can be provided for each user that is not shared with a public so that if the user utilizes words in their speech not common in the general public, those words can be more accurately identified by the telephone service. The service may learn and apply that vocabulary and/or the user may provide words to the service.
    Type: Grant
    Filed: February 24, 2021
    Date of Patent: March 7, 2023
    Inventors: Beryl Burcher, James van den Bergh
  • Patent number: 11601487
    Abstract: Embodiments described herein relate to the adaptation of a real-time Web communication transmission profile, particularly the adaptation of throughput such as the video throughput of the real-time Web communication. A method is described for adapting a real-time Web communication transmission profile, including changing a transmission profile parameter of a real-time Web communication device on the basis of bandwidth-related data recovered during a real-time Web communication time period. Thus, the transmission profile can be adapted to the bandwidth of the real-time Web communication in progress, allowing a user to enjoy the best quality when the bandwidth allows and, conversely, to limit transmission errors when the bandwidth does not allow high throughput.
    Type: Grant
    Filed: June 23, 2016
    Date of Patent: March 7, 2023
    Assignee: ORANGE
    Inventors: Sandrine Lacharme, Romain Caron
  • Patent number: 11587547
    Abstract: An electronic apparatus which acquires input data to be input into a TTS module for outputting a voice through the TTS module, acquires a voice signal corresponding to the input data through the TTS module, detects an error in the acquired voice signal based on the input data, corrects the input data based on the detection result, and acquires a corrected voice signal corresponding to the corrected input data through the TTS module.
    Type: Grant
    Filed: February 12, 2020
    Date of Patent: February 21, 2023
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Hosang Sung, Kyoungbo Min, Seonho Hwang, Doohwa Hong, Eunmi Oh, Jonghoon Jeong, Kihyun Choo
  • Patent number: 11580956
    Abstract: A method includes receiving a training example that includes audio data representing a spoken utterance and a ground truth transcription. For each word in the spoken utterance, the method also includes inserting a placeholder symbol before the respective word, identifying a respective ground truth alignment for a beginning and an end of the respective word, determining a beginning word piece and an ending word piece, and generating a first constrained alignment for the beginning word piece and a second constrained alignment for the ending word piece. The first constrained alignment is aligned with the ground truth alignment for the beginning of the respective word and the second constrained alignment is aligned with the ground truth alignment for the ending of the respective word. The method also includes constraining an attention head of a second pass decoder by applying the first and second constrained alignments.
    Type: Grant
    Filed: March 17, 2021
    Date of Patent: February 14, 2023
    Assignee: Google LLC
    Inventors: Tara N. Sainath, Basi Garcia, David Rybach, Trevor Strohman, Ruoming Pang
  • Patent number: 11568883
    Abstract: The invention provides an audio encoder including a combination of a linear predictive coding filter having a plurality of linear predictive coding coefficients and a time-frequency converter, wherein the combination is configured to filter and to convert a frame of the audio signal into a frequency domain in order to output a spectrum based on the frame and on the linear predictive coding coefficients; a low frequency emphasizer configured to calculate a processed spectrum based on the spectrum, wherein spectral lines of the processed spectrum representing a lower frequency than a reference spectral line are emphasized; and a control device configured to control the calculation of the processed spectrum by the low frequency emphasizer depending on the linear predictive coding coefficients of the linear predictive coding filter.
    Type: Grant
    Filed: June 11, 2020
    Date of Patent: January 31, 2023
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Stefan Doehla, Bernhard Grill, Christian Helmrich, Nikolaus Rettelbach
  • Patent number: 11551694
    Abstract: Methods, systems, and apparatuses for improved speech recognition and transcription of user utterances are described herein. User utterances may be processed by a speech recognition computing device as well as an acoustic model. The acoustic model may be trained using historical user utterance data and machine learning techniques. The acoustic model may be used to determine whether a transcription determined by the speech recognition computing device should be overridden with an updated transcription.
    Type: Grant
    Filed: January 5, 2021
    Date of Patent: January 10, 2023
    Assignee: COMCAST CABLE COMMUNICATIONS, LLC
    Inventors: Rui Min, Stefan Deichmann, Hongcheng Wang, Geifei Yang
  • Patent number: 11526781
    Abstract: A set of partial words is received. At least one partial word in the set of partial words is completed. The set of partial words with the at least one completed partial word is run through a trained deep neural network, the trained deep neural network inferring a word embedding associated with an unfinished word in the set of partial words. An inferred word is determined based on the inferred word embedding associated with the unfinished word. A sentence may be output, which includes at least the completed partial word and the inferred word.
    Type: Grant
    Filed: October 28, 2019
    Date of Patent: December 13, 2022
    Assignee: International Business Machines Corporation
    Inventors: Su Liu, Jinho Lee, Inseok Hwang, Matthew Harrison Tong
  • Patent number: 11514890
    Abstract: According to an embodiment, disclosed is an electronic device including a speaker, a microphone, a communication interface, a processor operatively connected to the speaker, the microphone, and the communication interface, and a memory operatively connected to the processor. The memory stores instructions that, when executed, cause the processor to receive a first utterance through the microphone, to determine a speaker model by performing speaker recognition on the first utterance, to receive a second utterance through the microphone after the first utterance is received, and to detect an end-point of the second utterance, at least partially using the determined speaker model. In addition, various embodiments as understood from the specification are also possible.
    Type: Grant
    Filed: July 12, 2019
    Date of Patent: November 29, 2022
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Youngwoo Lee, Hoseon Shin, Chulmin Lee, Seungyeol Lee, Taejin Lee
  • Patent number: 11514917
    Abstract: An electronic device is provided, which includes a user interface, at least one communication module, a microphone, at least one speaker, at least one processor operatively connected with the user interface, the at least one communication module, the microphone, and the at least one speaker, and at least one memory operatively connected with the at least one processor, wherein the at least one memory stores instructions, which when executed, instruct the at least one processor to while the electronic device is wiredly or wirelessly connected with an access point (AP) connected with at least one external electronic device, after receiving, through the microphone, part of a wake-up utterance to invoke a voice-based intelligent assistant service, broadcast identification information about the electronic device and receive identification information broadcast from the external electronic device, after receiving the whole wake-up utterance through the microphone, individually transmit first information related to
    Type: Grant
    Filed: August 21, 2019
    Date of Patent: November 29, 2022
    Inventors: Junghwan Kang, Sungwoon Jang, Sangki Kang
  • Patent number: 11443747
    Abstract: Disclosed herein is an artificial intelligence apparatus for recognizing speech of a user, including a microphone and a processor configured to obtain, via the microphone, speech data including speech of a user, determine a frequency weight for each word using a speech recognition log, generate a speech recognition result corresponding to the speech data using the frequency weight, and perform control corresponding to the speech recognition result.
    Type: Grant
    Filed: October 15, 2019
    Date of Patent: September 13, 2022
    Assignee: LG Electronics Inc.
    Inventor: Jonghoon Chae
  • Patent number: 11443113
    Abstract: User-generated input is received that includes a sequence of words associated with initiation of a computer-implemented event. Thereafter, such input is parsed using at least one natural language processing (NLP) model. This parsed input is then used by a machine learning model to determine a suggested template having a plurality of fields for initiating the event. The template can then be presented in a graphical user interface. Related apparatus, systems, techniques and articles are also described.
    Type: Grant
    Filed: March 19, 2021
    Date of Patent: September 13, 2022
    Assignee: SAP SE
    Inventors: Nishant Kumar, Panish Ramakrishna, Kumaraswamy Gowda, Rajendra Vuppala, Vidhya Neelakantan, Erica Vandenhoek, Nithya Rajagopalan
  • Patent number: 11423920
    Abstract: The methods and systems described herein aid users by modifying the presentation of content to users. For example, the methods and systems suppress the dialogue track of a movie when the user engages with the content by reciting a line of the movie as it is presented to the user. Words spoken by the user are detected and compared with the words in the movie. When the user is not engaging with the movie by reciting the lines or humming tunes while watching the movie, the audio track of the movie is not modified. Content can be modified in response to engagement by a single user or by multiple users (e.g., each reciting lines of a different character in a movie). Accordingly, the methods and systems described herein provide increased interest in and engagement with content.
    Type: Grant
    Filed: September 28, 2018
    Date of Patent: August 23, 2022
    Assignee: Rovi Guides, Inc.
    Inventors: Susanto Sen, Shakir Sharfraz Ashfaq Ahamed, Sriram Ponnuswamy
  • Patent number: 11380427
    Abstract: Systems, methods, and computer-readable media having computer-executable instructions embodied thereon for protocol driven image acquisition are provided. In embodiments, a protocol is received by an image capturing device. The protocol comprises orders from a clinician, a workflow for capturing at least one image, or a combination thereof. At least one field for receiving metadata to be associated with the at least one image allows structured documentation to begin on the image capturing device. The at least one image and associated metadata are communicated to a medical information system. A patient is identified by the metadata or an existing patient to device association and the at least one image is associated with an electronic medical record for the patient.
    Type: Grant
    Filed: June 26, 2019
    Date of Patent: July 5, 2022
    Assignee: CERNER INNOVATION, INC.
    Inventors: Damon Herbst, Carla Leighow, David A. Robaska
  • Patent number: 11349878
    Abstract: In a procedure for handling security settings of a mobile end device the operating conditions of the end device are determined. Then minimum security requirements are established according to the operating conditions by evaluating contextual data regarding the operating conditions of the end device. Next it is determined whether the security settings on the end device comply with at least the minimum security requirements. Access to applications is allowed or denied according to the security settings on the mobile end device. Should the end device not meet minimum security requirements the user may be prompted to change the security settings on the end device. The method may involve locating the end device and issuing a warning if the end device does not meet minimum security settings.
    Type: Grant
    Filed: June 23, 2020
    Date of Patent: May 31, 2022
    Assignee: Unify GmbH & Co. KG
    Inventors: Karl Klug, Jurgen Totzke
  • Patent number: 11328031
    Abstract: In an approach for automatically generating and adding a timestamp to a comment left by a user on a media post based on a specific part of the media post referenced in the comment, responsive to receiving a comment on a media post, a processor completes a visual analysis and linguistic analysis of the media post. A processor completes a linguistic analysis of the comment. A processor performs a linguistic intent correlation analysis to determine a part of the media post that correlates to the comment. A processor determines a timestamp for the part of the media post. A processor adds the timestamp to the comment.
    Type: Grant
    Filed: July 11, 2020
    Date of Patent: May 10, 2022
    Assignee: International Business Machines Corporation
    Inventors: Clement Decrop, Martin G. Keen, Zachary A. Silverstein, Jeremy R. Fox
  • Patent number: 11322147
    Abstract: A voice control system for operating machinery mainly comprises: an autonomous reaction device (1) for receiving input of a voice command (11) to establish or perform operation of at least one machining task of a specific set of industrial machinery; an interaction manager (2) for receiving and outputting the voice command (11), the interaction manager (2) including interpreting an acoustic modeling algorithm, and identifying the voice command (11), so as to form an identification instruction (21), and the interpreted identification instruction (21) forming a basic machine control command and/or a machine motion control command corresponding to the operation of multiple machining tasks of the industrial machinery; and an upper controller (3) for receiving the basic machine control command and/or a machine motion control command, and operating a system of a driver (44) of the industrial machinery by voice input.
    Type: Grant
    Filed: July 30, 2018
    Date of Patent: May 3, 2022
    Inventor: Chien-Hung Liu
  • Patent number: RE49363
    Abstract: A device and a method for quantizing an LPC filter in the form of an input vector in a quantization domain comprise a calculator of a first-stage approximation of the input vector, a subtractor of the first-stage approximation from the input vector to produce a residual vector, a calculator of a weighting function from the first-stage approximation, a warper of the residual vector with the weighting function, and a quantizer of the weighted residual vector to supply a quantized weighted residual vector.
    Type: Grant
    Filed: January 23, 2018
    Date of Patent: January 10, 2023
    Inventors: Philippe Gournay, Bruno Bessette, Redwan Salami