Patents Examined by Bryan S Blankenagel
-
Patent number: 11966707
Abstract: A quantum-enhanced system and method for natural language processing (NLP) that generates a word embedding on a hybrid quantum-classical computer. A training set is provided on the classical computer, wherein the training set provides at least one pair of words and at least one binary value indicating the correlation between the pair of words. The quantum computer generates quantum state representations for each word in the pair. The quantum component evaluates the quantum correlation between the quantum state representations of the word pair using an engineered likelihood function and Bayesian inference. The word embedding is then trained on the quantum computer using an error function containing the binary value and the quantum correlation.
Type: Grant
Filed: January 13, 2022
Date of Patent: April 23, 2024
Assignee: Zapata Computing, Inc.
Inventor: Yudong Cao
-
Patent number: 11961528
Abstract: A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data, and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag.
Type: Grant
Filed: July 24, 2023
Date of Patent: April 16, 2024
Assignee: Dolby International AB
Inventors: Kristofer Kjoerling, Lars Villemoes, Heiko Purnhagen, Per Ekstrand
-
Patent number: 11948550
Abstract: Techniques for real-time accent conversion are described herein. An example computing device receives an indication of a first accent and a second accent. The computing device further receives, via at least one microphone, speech content having the first accent. The computing device is configured to derive, using a first machine-learning algorithm trained with audio data including the first accent, a linguistic representation of the received speech content having the first accent. Based on the derived linguistic representation, the computing device is configured to synthesize, using a second machine-learning algorithm trained with (i) audio data comprising the first accent and (ii) audio data including the second accent, audio data representative of the received speech content having the second accent.
Type: Grant
Filed: August 27, 2021
Date of Patent: April 2, 2024
Assignee: SANAS.AI INC.
Inventors: Maxim Serebryakov, Shawn Zhang
-
Patent number: 11948577
Abstract: Certain aspects of the disclosure are directed to apparatuses and methods for analyzing digital voice data in a data-communication system. A specific aspect is directed to a data-communication apparatus that includes a data-communication server and processing circuitry in communication therewith. The data-communication server interfaces with a plurality of remotely-situated client entities for providing data communication services.
Type: Grant
Filed: February 28, 2019
Date of Patent: April 2, 2024
Assignee: 8x8, Inc.
Inventors: Zhishen Liu, Bryan R. Martin
-
Patent number: 11942100
Abstract: Techniques for encoding audio data with metadata are described. In an example, a device receives audio data corresponding to audio detected by a microphone and receives metadata associated with the audio. The device generates encoded data based at least in part on encoding the audio data with the metadata. The encoding involves replacing a portion of the audio data with the metadata, such that the encoded data includes the metadata and a remaining portion of the audio data. The device sends the encoded data to an audio processing application.
Type: Grant
Filed: April 4, 2022
Date of Patent: March 26, 2024
Assignee: Amazon Technologies, Inc.
Inventors: Aditya Sharadchandra Joshi, Carlo Murgia, Michael Thomas Peterson
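The replace-a-portion encoding above can be sketched in a few lines. The layout below is hypothetical (the abstract does not say where in the frame the metadata goes or how its length is signaled); here the metadata is length-prefixed and overwrites the leading bytes of the frame:

```python
def embed_metadata(audio_bytes: bytes, metadata: bytes) -> bytes:
    """Replace the leading portion of an audio frame with length-prefixed
    metadata, keeping the remaining audio bytes (hypothetical layout)."""
    header = len(metadata).to_bytes(2, "big") + metadata
    if len(header) > len(audio_bytes):
        raise ValueError("metadata larger than audio frame")
    # Encoded frame = metadata header + remaining portion of the audio data.
    return header + audio_bytes[len(header):]

def extract_metadata(encoded: bytes) -> tuple[bytes, bytes]:
    """Recover the metadata and the remaining audio portion."""
    n = int.from_bytes(encoded[:2], "big")
    return encoded[2:2 + n], encoded[2 + n:]
```

Note that the frame length is unchanged, which is the point of replacing audio bytes rather than appending to them.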
-
Patent number: 11929085
Abstract: Described herein is a method of low-bitrate coding of audio data and generating enhancement metadata for controlling audio enhancement of the low-bitrate coded audio data at a decoder side, including the steps of: (a) core encoding original audio data at a low bitrate to obtain encoded audio data; (b) generating enhancement metadata to be used for controlling a type and/or amount of audio enhancement at the decoder side after core decoding the encoded audio data; and (c) outputting the encoded audio data and the enhancement metadata. Also described are an encoder configured to perform said method, a method for generating enhanced audio data from low-bitrate coded audio data based on enhancement metadata, and a decoder configured to perform that method.
Type: Grant
Filed: August 29, 2019
Date of Patent: March 12, 2024
Assignees: DOLBY INTERNATIONAL AB, DOLBY LABORATORIES LICENSING CORPORATION
Inventors: Arijit Biswas, Jia Dai, Aaron Steven Master
-
Patent number: 11922933
Abstract: A voice processing method and device include: obtaining a probability value that an audio signal, representing sound collected by a first microphone on the near-end side, includes a person's voice; determining a gain of the audio signal based on the determined probability value; processing the audio signal based on the determined gain; and sending the processed audio signal to the far-end side.
Type: Grant
Filed: June 2, 2020
Date of Patent: March 5, 2024
Assignee: YAMAHA CORPORATION
Inventor: Tetsuto Kawai
-
Patent number: 11922963
Abstract: Systems and methods are provided for generating and operating a speech enhancement model optimized for generating noise-suppressed speech outputs for improved human listening and live captioning. A computing system obtains a speech enhancement model trained on a first training dataset to generate noise-suppressed speech outputs, and an automatic speech recognition model trained on a second training dataset to generate transcription labels for spoken language utterances. A third training dataset comprising a set of spoken language utterances is applied to the speech enhancement model to obtain a first noise-suppressed speech output, which is then applied to the automatic speech recognition model to generate a noise-suppressed transcription output for the set of spoken language utterances.
Type: Grant
Filed: May 26, 2021
Date of Patent: March 5, 2024
Assignee: Microsoft Technology Licensing, LLC
Inventors: Xiaofei Wang, Sefik Emre Eskimez, Min Tang, Hemin Yang, Zirun Zhu, Zhuo Chen, Huaming Wang, Takuya Yoshioka
-
Patent number: 11915714
Abstract: Methods for modifying audio data include operations for accessing audio data having a first prosody, receiving a target prosody differing from the first prosody, and computing acoustic features representing samples. Computing respective acoustic features for a sample includes computing a pitch feature as a quantized pitch value of the sample by assigning a pitch value, of the target prosody or the audio data, to at least one of a set of pitch bins having equal widths in cents. Computing the respective acoustic features further includes computing a periodicity feature from the audio data. The respective acoustic features for the sample include the pitch feature, the periodicity feature, and other acoustic features. A neural vocoder is applied to the acoustic features to pitch-shift and time-stretch the audio data from the first prosody toward the target prosody.
Type: Grant
Filed: December 21, 2021
Date of Patent: February 27, 2024
Assignees: Adobe Inc., Northwestern University
Inventors: Maxwell Morrison, Juan Pablo Caceres Chomali, Zeyu Jin, Nicholas Bryan, Bryan A. Pardo
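The equal-width-in-cents pitch binning can be illustrated with a short sketch. The range limits and bin count (`fmin_hz`, `fmax_hz`, `n_bins`) are assumptions for illustration; the abstract only requires that the bins have equal widths in cents:

```python
import numpy as np

def quantize_pitch_cents(f0_hz, fmin_hz=50.0, fmax_hz=550.0, n_bins=256):
    """Assign pitch values (Hz) to equal-width bins in cents.

    fmin_hz, fmax_hz, and n_bins are hypothetical parameter choices.
    """
    # Convert Hz to cents above the minimum pitch (1 octave = 1200 cents).
    cents = 1200.0 * np.log2(np.asarray(f0_hz, dtype=float) / fmin_hz)
    total_cents = 1200.0 * np.log2(fmax_hz / fmin_hz)
    bin_width = total_cents / n_bins
    # Clip out-of-range pitches and assign each sample to a bin index.
    return np.clip(np.floor(cents / bin_width), 0, n_bins - 1).astype(int)
```

Equal widths in cents make the quantization perceptually uniform: each bin covers the same musical interval rather than the same number of Hz.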
-
Patent number: 11892859
Abstract: A drone system is configured to capture an audio stream that includes voice commands from an operator, to process the audio stream for identification of the voice commands, and to perform operations based on the identified voice commands. The drone system can identify a particular voice stream in the audio stream as an operator voice, and perform the command recognition with respect to the operator voice to the exclusion of other voice streams present in the audio stream. The drone can include a directional camera that is automatically and continuously focused on the operator to capture a video stream usable in disambiguation of different voice streams captured by the drone.
Type: Grant
Filed: July 28, 2022
Date of Patent: February 6, 2024
Assignee: Snap Inc.
Inventors: David Meisenholder, Steven Horowitz
-
Patent number: 11887618
Abstract: A call audio mixing processing method is provided. In the method, call audio streams from terminals of call members participating in a call are obtained. Voice analysis is performed on the call audio streams to determine voice activity corresponding to each of the terminals. The voice activity of the terminals indicates the activity levels of the call members participating in the call. According to the voice activity of the terminals, respective voice adjustment parameters corresponding to the terminals are determined. According to the respective voice adjustment parameters corresponding to the terminals, the call audio streams of the terminals are adjusted. Further, mixing processing is performed on the adjusted call audio streams to obtain a mixed audio stream.
Type: Grant
Filed: April 18, 2022
Date of Patent: January 30, 2024
Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
Inventor: Junbin Liang
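A minimal sketch of that mixing flow, with assumed specifics: voice activity is approximated here by RMS energy and the adjustment parameter is a simple proportional gain, neither of which is prescribed by the abstract:

```python
import numpy as np

def mix_call_streams(streams):
    """streams: dict of terminal_id -> 1-D float array of call audio.

    Hypothetical sketch: activity = RMS energy, gain = proportional share.
    """
    # 1. Voice analysis: estimate per-terminal voice activity (RMS here).
    activity = {t: float(np.sqrt(np.mean(s ** 2))) for t, s in streams.items()}
    total = sum(activity.values()) or 1.0  # avoid divide-by-zero on silence
    # 2. Derive a voice adjustment parameter (gain) per terminal.
    gains = {t: a / total for t, a in activity.items()}
    # 3. Adjust each stream and 4. mix into a single stream.
    mixed = sum(gains[t] * s for t, s in streams.items())
    return mixed, gains
```

Weighting by activity keeps dominant talkers prominent in the mix while attenuating mostly-silent (or mostly-noise) terminals.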
-
Patent number: 11854556
Abstract: Methods and apparatus are disclosed for supplementing partially readable and/or inaccurate codes. An example apparatus includes a watermark analyzer to select a first watermark and a second watermark decoded from media; a comparator to compare a first decoded timestamp of the first watermark to a second decoded timestamp of the second watermark; and a timestamp adjuster to adjust the second decoded timestamp based on the first decoded timestamp when at least a threshold number of symbols of the second decoded timestamp match corresponding symbols of the first decoded timestamp.
Type: Grant
Filed: November 14, 2022
Date of Patent: December 26, 2023
Assignee: The Nielsen Company (US), LLC
Inventors: David Gish, Jeremey M. Davis, Wendell D. Lynch, Christen V. Nielsen
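The threshold-match repair can be sketched on symbol strings. The `?` marker for unreadable symbols and the fill-in rule are assumptions for illustration; the abstract does not specify either:

```python
def repair_timestamp(first_ts: str, second_ts: str, threshold: int = 4) -> str:
    """Supplement a partially readable watermark timestamp.

    first_ts / second_ts: equal-length strings of decoded symbols, with
    '?' marking unreadable symbols (hypothetical convention). If at least
    `threshold` readable symbols of second_ts agree with first_ts, the
    unreadable symbols are filled in from first_ts.
    """
    matches = sum(1 for a, b in zip(first_ts, second_ts)
                  if b != "?" and a == b)
    if matches < threshold:
        return second_ts  # not enough agreement; leave the code alone
    return "".join(a if b == "?" else b
                   for a, b in zip(first_ts, second_ts))
```

The threshold guards against borrowing symbols from a watermark that actually carries a different timestamp.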
-
Patent number: 11848021
Abstract: An envelope sequence is provided that can improve approximation accuracy near peaks caused by the pitch period of an audio signal. A periodic-combined-envelope-sequence generation device according to the present invention takes, as an input audio signal, a time-domain audio digital signal in each frame, which is a predetermined time segment, and generates a periodic combined envelope sequence as an envelope sequence. The periodic-combined-envelope-sequence generation device comprises at least a spectral-envelope-sequence calculating part and a periodic-combined-envelope generating part. The spectral-envelope-sequence calculating part calculates a spectral envelope sequence of the input audio signal on the basis of time-domain linear prediction of the input audio signal.
Type: Grant
Filed: September 29, 2022
Date of Patent: December 19, 2023
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Takehiro Moriya, Yutaka Kamamoto, Noboru Harada
-
Patent number: 11837246
Abstract: The present invention relates to transposing signals in time and/or frequency, and in particular to coding of audio signals. More particularly, the present invention relates to high frequency reconstruction (HFR) methods including a frequency domain harmonic transposer. A method and system for generating a transposed output signal from an input signal using a transposition factor T is described. The system comprises an analysis window of length La that extracts a frame of the input signal, and an analysis transformation unit of order M that transforms the samples into M complex coefficients, where M is a function of the transposition factor T. The system further comprises a nonlinear processing unit altering the phase of the complex coefficients by using the transposition factor T, a synthesis transformation unit of order M transforming the altered coefficients into M altered samples, and a synthesis window of length Ls generating a frame of the output signal.
Type: Grant
Filed: February 3, 2023
Date of Patent: December 5, 2023
Assignee: DOLBY INTERNATIONAL AB
Inventors: Per Ekstrand, Lars Villemoes
-
Patent number: 11830514
Abstract: A vehicle infotainment system that adds background sounds to an outgoing call on a mobile device. The infotainment system comprises: i) a database of selectable augmenting audio signals; and ii) audio processing circuitry configured to receive at a first input an uplink signal from the infotainment system and receive at a second input a selected augmenting audio signal. The audio processing circuitry adapts the spectrum of the selected augmenting audio signal to prevent it from masking the uplink signal, and combines the adapted augmenting audio signal and the uplink signal to produce an augmented uplink signal at an output.
Type: Grant
Filed: May 27, 2021
Date of Patent: November 28, 2023
Assignee: GM GLOBAL TECHNOLOGY OPERATIONS LLC
Inventors: Omer Tsimhoni, Eli Tzirkel-Hancock
-
Patent number: 11829868
Abstract: A feature value generation device includes a generator configured to digitize non-numerical text data items collected at a plurality of timings from a target of anomaly detection, to generate vectors whose elements are feature values corresponding to the digitized data items; a learning unit configured to learn the vectors during a learning period so as to output a learning result; and a detector configured to detect, during a test period, for each of the vectors generated by the generator, an anomaly based on said each of the vectors and the learning result.
Type: Grant
Filed: October 31, 2017
Date of Patent: November 28, 2023
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Yasuhiro Ikeda, Yusuke Nakano, Keishiro Watanabe, Keisuke Ishibashi, Ryoichi Kawahara
-
Patent number: 11823692
Abstract: Methods, devices, non-transitory computer-readable media, and systems are described for compressing audio data. The techniques involve obtaining a sequence of digitized samples of an audio signal, performing a transform using the sequence of digitized samples to generate a plurality of spectral lines, obtaining a group of spectral lines from the plurality of spectral lines, and quantizing the group of spectral lines to generate a group of quantized values. Quantizing the group of spectral lines may comprise performing a specialized rounding operation on a spectral line selected from the group, using the specialized rounding operation to force a group parity value, computed for the group of quantized values, to a predetermined parity value. One or more data frames based on the group of quantized values may be outputted.
Type: Grant
Filed: May 25, 2022
Date of Patent: November 21, 2023
Assignee: QUALCOMM Incorporated
Inventors: Richard Turner, Megan Lucy Taggart, Laurent Wojcieszak, Justin Hundt
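The parity-forcing quantization can be sketched as follows. The choice of which spectral line to re-round (the one with the largest rounding error here, since it is cheapest to nudge) is an assumption; the abstract only says one selected line gets the specialized rounding:

```python
import numpy as np

def quantize_with_parity(lines, target_parity=0):
    """Quantize a group of spectral lines, forcing the group parity.

    Sketch: all lines are rounded normally; if the parity of the sum of
    quantized values differs from target_parity, the line with the
    largest rounding error is re-rounded the other way (assumed rule).
    """
    lines = np.asarray(lines, dtype=float)
    q = np.rint(lines).astype(int)
    if q.sum() % 2 != target_parity:
        err = lines - q
        i = int(np.argmax(np.abs(err)))      # cheapest line to nudge
        q[i] += 1 if err[i] > 0 else -1      # move toward the true value
    return q
```

Forcing a known parity onto each group lets the decoder use the parity as an implicit check bit without spending any extra bits in the frame.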
-
Patent number: 11817103
Abstract: Provided is a pattern recognition apparatus that provides classification robustness against any kind of domain variability. The pattern recognition apparatus 500, based on a Neural Network (NN), includes: NN training unit 501, which trains an NN model to generate NN parameters based on at least one first feature vector and at least one domain vector indicating one of the subsets in a specific domain, wherein the first feature vector is extracted from each of the subsets and the domain vector indicates an identifier corresponding to each of the subsets; and NN verification unit 502, which verifies a pair of second feature vectors in the specific domain to output whether the pair indicates the same individual or not, based on a target domain vector and the NN parameters.
Type: Grant
Filed: September 15, 2017
Date of Patent: November 14, 2023
Assignee: NEC CORPORATION
Inventors: Qiongqiong Wang, Takafumi Koshinaka
-
Patent number: 11817113
Abstract: To filter unwanted sounds from a conference call, a voice profile of a first user is generated based on a first voice signal captured by a media device during a first conference call. The voice profile may be generated by identifying a base frequency of the first voice signal and determining a plurality of voice characteristics, such as pitch, intonation, accent, loudness, and speech rate. These data may be stored in association with the first user. During a second conference call, a second voice signal captured by the media device is analyzed to determine, based on the voice profile of the first user, whether the second voice signal includes the voice of a second user. If so, the second voice signal is prevented from being transmitted into the conference call. A voice profile of the second user may be generated from the second voice signal for future use.
Type: Grant
Filed: September 9, 2020
Date of Patent: November 14, 2023
Assignee: Rovi Guides, Inc.
Inventors: Rajendran Pichaimurthy, Madhusudhan Seetharam
-
Patent number: 11816577
Abstract: Generally, the present disclosure is directed to systems and methods that generate augmented training data for machine-learned models via application of one or more augmentation techniques to audiographic images that visually represent audio signals. In particular, the present disclosure provides a number of novel augmentation operations which can be performed directly upon the audiographic image (e.g., as opposed to the raw audio data) to generate augmented training data that results in improved model performance. As an example, the audiographic images can be or include one or more spectrograms or filter bank sequences.
Type: Grant
Filed: September 28, 2021
Date of Patent: November 14, 2023
Assignee: GOOGLE LLC
Inventors: Daniel Sung-Joon Park, Quoc Le, William Chan, Ekin Dogus Cubuk, Barret Zoph, Yu Zhang, Chung-Cheng Chiu
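One family of operations that can be applied directly to the spectrogram image is masking, sketched below. The mask widths and the choice of masking as the example operation are assumptions for illustration; the abstract covers image-domain augmentation generally:

```python
import numpy as np

def augment_spectrogram(spec, max_freq_mask=8, max_time_mask=16, rng=None):
    """Apply simple masking augmentations directly to an audiographic
    image (a 2-D spectrogram: frequency rows x time columns).

    max_freq_mask / max_time_mask are hypothetical parameters.
    """
    rng = rng or np.random.default_rng(0)  # seeded default for repeatability
    out = spec.copy()
    n_freq, n_time = out.shape
    # Frequency mask: zero out a band of consecutive frequency rows.
    f = rng.integers(0, max_freq_mask + 1)
    f0 = rng.integers(0, n_freq - f + 1)
    out[f0:f0 + f, :] = 0.0
    # Time mask: zero out a span of consecutive time columns.
    t = rng.integers(0, max_time_mask + 1)
    t0 = rng.integers(0, n_time - t + 1)
    out[:, t0:t0 + t] = 0.0
    return out
```

Operating on the image rather than the raw waveform means the augmentation costs one array write per mask and needs no re-synthesis of audio.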