Pitch Patents (Class 704/207)
-
Patent number: 11816577
Abstract: Generally, the present disclosure is directed to systems and methods that generate augmented training data for machine-learned models via application of one or more augmentation techniques to audiographic images that visually represent audio signals. In particular, the present disclosure provides a number of novel augmentation operations which can be performed directly upon the audiographic image (e.g., as opposed to the raw audio data) to generate augmented training data that results in improved model performance. As an example, the audiographic images can be or include one or more spectrograms or filter bank sequences.
Type: Grant
Filed: September 28, 2021
Date of Patent: November 14, 2023
Assignee: GOOGLE LLC
Inventors: Daniel Sung-Joon Park, Quoc Le, William Chan, Ekin Dogus Cubuk, Barret Zoph, Yu Zhang, Chung-Cheng Chiu
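The abstract describes augmentation applied directly to the audiographic image rather than to the raw waveform. Below is a minimal sketch of one augmentation of that general kind, masking a random frequency band and a random time span in a log-mel spectrogram array; the mask sizes and the specific operations are illustrative assumptions, not the claimed method.

```python
import numpy as np

def augment_spectrogram(spec, rng, max_freq_mask=8, max_time_mask=20):
    """Zero out one random frequency band and one random time span,
    operating directly on the image-like [freq_bins, frames] array."""
    spec = spec.copy()
    n_freq, n_time = spec.shape

    f = int(rng.integers(0, max_freq_mask + 1))      # mask height in bins
    f0 = int(rng.integers(0, max(1, n_freq - f)))
    spec[f0:f0 + f, :] = 0.0                         # frequency mask

    t = int(rng.integers(0, max_time_mask + 1))      # mask width in frames
    t0 = int(rng.integers(0, max(1, n_time - t)))
    spec[:, t0:t0 + t] = 0.0                         # time mask
    return spec

rng = np.random.default_rng(0)
log_mel = np.random.rand(80, 300)                    # stand-in audiographic image
augmented = augment_spectrogram(log_mel, rng)
```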
-
Patent number: 11763796
Abstract: A computer-implemented method for speech synthesis, a computer device, and a non-transitory computer readable storage medium are provided. The method includes: obtaining a speech text to be synthesized; obtaining a Mel spectrum corresponding to the speech text to be synthesized according to the speech text to be synthesized; inputting the Mel spectrum into a complex neural network, and obtaining a complex spectrum corresponding to the speech text to be synthesized, wherein the complex spectrum comprises real component information and imaginary component information; and obtaining a synthetic speech corresponding to the speech text to be synthesized, according to the complex spectrum. The method can efficiently and simply complete speech synthesis.
Type: Grant
Filed: December 10, 2020
Date of Patent: September 19, 2023
Assignee: UBTECH ROBOTICS CORP LTD
Inventors: Dongyan Huang, Leyuan Sheng, Youjun Xiong
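The last step in this abstract, going from a complex spectrum (real and imaginary components) to a waveform, can be illustrated with an inverse STFT. In the hedged sketch below, the "predicted" real and imaginary parts are simply taken from a real STFT rather than from the complex neural network the patent describes.

```python
import numpy as np
from scipy.signal import stft, istft

fs = 16000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 220 * t)                  # stand-in audio to be synthesized

_, _, Z = stft(x, fs=fs, nperseg=512)
real_part, imag_part = Z.real, Z.imag            # what a complex-valued network would output

Z_pred = real_part + 1j * imag_part              # recombine predicted components
_, x_rec = istft(Z_pred, fs=fs, nperseg=512)     # synthetic speech waveform
print(np.max(np.abs(x_rec[:len(x)] - x)))        # near-perfect reconstruction in this toy case
```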
-
Patent number: 11756530
Abstract: Example embodiments relate to techniques for training artificial neural networks or other machine-learning encoders to accurately predict the pitch of input audio samples in a semitone or otherwise logarithmically-scaled pitch space. An example method may include generating, from a sample of audio data, two training samples by applying two different pitch shifts to the sample of audio training data. This can be done by converting the sample of audio data into the frequency domain and then shifting the transformed data. These known shifts are then compared to the predicted pitches generated by applying the two training samples to the encoder. The encoder is then updated based on the comparison, such that the relative pitch output by the encoder is improved with respect to accuracy. One or more audio samples, labeled with absolute pitch values, can then be used to calibrate the relative pitch values generated by the trained encoder.
Type: Grant
Filed: September 25, 2020
Date of Patent: September 12, 2023
Assignee: Google LLC
Inventors: Marco Tagliasacchi, Mihajlo Velimirovic, Matthew Sharifi, Dominik Roblek, Christian Frank, Beat Gfeller
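A hedged sketch of the training objective this abstract outlines: two copies of a frequency-domain training example are shifted by known amounts (a bin translation in a log-frequency representation standing in for a pitch shift), and the encoder is penalized when its predicted pitch difference disagrees with the known relative shift. The toy encoder, the bins-per-semitone value, and the Huber loss are illustrative assumptions.

```python
import numpy as np

def relative_pitch_loss(encoder, frame, shift_bins_a, shift_bins_b, bins_per_semitone=4):
    # Pitch-shift in the transformed (log-frequency) domain = translate bins.
    shifted_a = np.roll(frame, shift_bins_a)
    shifted_b = np.roll(frame, shift_bins_b)

    p_a = encoder(shifted_a)                      # predicted relative pitch (semitones)
    p_b = encoder(shifted_b)

    expected = (shift_bins_a - shift_bins_b) / bins_per_semitone
    err = (p_a - p_b) - expected
    return 0.5 * err ** 2 if abs(err) <= 1.0 else abs(err) - 0.5   # Huber loss

# Toy usage with a linear "encoder" (illustration only, not a trained model).
rng = np.random.default_rng(1)
frame = rng.random(190)
toy_encoder = lambda v: float(np.argmax(v)) / 4.0
print(relative_pitch_loss(toy_encoder, frame, shift_bins_a=8, shift_bins_b=-4))
```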
-
Patent number: 11749290
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing long-term prediction (LTP) are described. One example of the methods includes determining a pitch gain and a pitch lag of an input audio signal for at least a predetermined number of frames. It is determined that the pitch gain of the input audio signal has exceeded a predetermined threshold and that a change of the pitch lag of the input audio signal has been within a predetermined range for at least the predetermined number of frames. In response to determining that the pitch gain of the input audio signal has exceeded the predetermined threshold and that the change of the pitch lag has been within the predetermined range for at least the predetermined number of frames, a pitch gain is set for a current frame of the input audio signal.
Type: Grant
Filed: July 12, 2021
Date of Patent: September 5, 2023
Assignee: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Yang Gao
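A minimal sketch of the decision logic described above, with hypothetical thresholds: a pitch gain is set for the current frame only when the gain has been high and the lag stable for the last N frames.

```python
def should_set_ltp_gain(pitch_gains, pitch_lags, n_frames=4,
                        gain_threshold=0.6, max_lag_change=2):
    """Return True when the recent pitch gains exceed the threshold and the
    pitch lag has stayed within a small range over the last n_frames frames."""
    recent_gains = pitch_gains[-n_frames:]
    recent_lags = pitch_lags[-n_frames:]
    if len(recent_gains) < n_frames:
        return False
    gain_ok = all(g > gain_threshold for g in recent_gains)
    lag_ok = (max(recent_lags) - min(recent_lags)) <= max_lag_change
    return gain_ok and lag_ok

print(should_set_ltp_gain([0.70, 0.72, 0.68, 0.71], [80, 81, 80, 80]))  # True
```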
-
Patent number: 11727922
Abstract: A computerized system for deriving expression of intent from recorded speech includes: a text classification module comparing a transcription of recorded speech against a text classifier to generate a first set of representations of potential intents; a phonetics classification module comparing a phonetic transcription of the recorded speech against a phonetics classifier to generate a second set of representations; an audio classification module comparing an audio version of the recorded speech with an audio classifier to generate a third set of representations; and a discriminator module for receiving the first, second and third sets of the representations of potential intents and generating one derived expression of intent by processing the first, second and third sets together; where at least two of the text classification module, the phonetics classification module, and the audio classification module are asynchronous processes from one another.
Type: Grant
Filed: May 11, 2021
Date of Patent: August 15, 2023
Assignee: Verint Americas Inc.
Inventor: Moshe Villaizan
-
Patent number: 11711648
Abstract: Techniques are provided for audio-based detection and tracking of an acoustic source. A methodology implementing the techniques according to an embodiment includes generating acoustic signal spectra from signals provided by a microphone array, and performing beamforming on the acoustic signal spectra to generate beam signal spectra, using time-frequency masks to reduce noise. The method also includes detecting, by a deep neural network (DNN) classifier, an acoustic event, associated with the acoustic source, in the beam signal spectra. The DNN is trained on acoustic features associated with the acoustic event. The method further includes performing pattern extraction, in response to the detection, to identify time-frequency bins of the acoustic signal spectra that are associated with the acoustic event, and estimating a motion direction of the source relative to the array of microphones based on Doppler frequency shift of the acoustic event calculated from the time-frequency bins of the extracted pattern.
Type: Grant
Filed: March 10, 2020
Date of Patent: July 25, 2023
Assignee: Intel Corporation
Inventors: Kuba Lopatka, Adam Kupryjanow, Lukasz Kurylo, Karol Duzinkiewicz, Przemyslaw Maziewski, Marek Zabkiewicz
-
Patent number: 11710492
Abstract: Methods, systems, and devices for encoding are described. A device, which may be otherwise known as user equipment (UE), may support standards-compatible audio encoding (e.g., speech encoding) using a pre-encoded database. The device may receive a digital representation of an audio signal and identify, based on receiving the digital representation of the audio signal, a database that is pre-encoded according to a coding standard and that includes a quantity of digital representations of other audio signals. The device may encode the digital representation of the audio signal using a machine learning scheme and information from the database pre-encoded according to the coding standard. The device may generate a bitstream of the digital representation that is compatible with the coding standard based on encoding the digital representation of the audio signal, and output a representation of the bitstream.
Type: Grant
Filed: October 2, 2019
Date of Patent: July 25, 2023
Assignee: QUALCOMM Incorporated
Inventors: Stephane Pierre Villette, Daniel Jared Sinder
-
Patent number: 11640824
Abstract: Systems, devices, and methods transcribe words recorded in audio data. A computer-generated transcript is provided. The transcript comprises records for each word in the computer-generated transcript. At least one confirmation input is received for each record. The at least one confirmation input modifies a selected record and automatically identifies a next record for receiving a next confirmation input. A sequence of confirmation inputs may rapidly modify and validate each record in a sequence of records in the computer-generated transcript. A validated transcript is generated from the modified records and is provided from an evidence management system.
Type: Grant
Filed: July 15, 2020
Date of Patent: May 2, 2023
Assignee: Axon Enterprise, Inc.
Inventors: Noah Spitzer-Williams, Choongyeun Cho, Thomas Crosley, Zachary Charles Goist, Daniel Michael Bellia, Vinh Hein Nguyen, Chelsea Alexander-Taylor
-
Patent number: 11636836
Abstract: Provided is a method for processing audio including: acquiring an accompaniment audio signal and a voice signal of a current to-be-processed musical composition; determining a target reverberation intensity parameter value of the acquired accompaniment audio signal, wherein the target reverberation intensity parameter value is configured to indicate a rhythm speed, an accompaniment type, and a performance score of a singer of the current to-be-processed musical composition; and reverberating the acquired voice signal based on the target reverberation intensity parameter value.
Type: Grant
Filed: March 23, 2022
Date of Patent: April 25, 2023
Assignee: BEIJING DAJIA INTERNET INFORMATION TECHNOLOGY CO., LTD.
Inventors: Xiguang Zheng, Chen Zhang
-
Patent number: 11621725
Abstract: A method for partitioning of input vectors for coding is presented. The method comprises obtaining of an input vector. The input vector is segmented, in a non-recursive manner, into an integer number, NSEG, of input vector segments. A representation of a respective relative energy difference between parts of the input vector on each side of each boundary between the input vector segments is determined, in a recursive manner. The input vector segments and the representations of the relative energy differences are provided for individual coding. Partitioning units and computer programs for partitioning of input vectors for coding, as well as positional encoders, are presented.
Type: Grant
Filed: January 11, 2022
Date of Patent: April 4, 2023
Assignee: TELEFONAKTIEBOLAGET LM ERICSSON (PUBL)
Inventors: Tomas Jansson Toftgård, Volodya Grancharov, Jonas Svedberg
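A rough sketch of the partitioning idea (which also applies to related patent 11239859 below): split the input vector into NSEG segments and describe each internal boundary by the relative energy difference between the parts on either side of it. The abstract computes these representations recursively; the sketch below computes them directly, and the log-ratio form is an assumption.

```python
import numpy as np

def partition_with_energy_diffs(x, n_seg):
    """Split x into n_seg segments and return, for each internal boundary,
    a relative energy difference between the two sides of that boundary."""
    segments = np.array_split(x, n_seg)
    boundaries = np.cumsum([len(s) for s in segments])[:-1]
    diffs = []
    for b in boundaries:
        e_left = np.sum(x[:b] ** 2) + 1e-12
        e_right = np.sum(x[b:] ** 2) + 1e-12
        diffs.append(0.5 * np.log2(e_left / e_right))   # relative energy difference
    return segments, diffs

segments, diffs = partition_with_energy_diffs(np.random.randn(64), n_seg=4)
```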
-
Patent number: 11610577
Abstract: Methods and systems for providing a change to a voice interacting with a user are described. Information indicating a change that can be made to the voice can be received. The voice can be changed based on the information.
Type: Grant
Filed: November 19, 2020
Date of Patent: March 21, 2023
Assignee: Capital One Services, LLC
Inventors: Anh Truong, Mark Watson, Jeremy Goodsitt, Vincent Pham, Fardin Abdi Taghi Abad, Kate Key, Austin Walters, Reza Farivar
-
Patent number: 11605377
Abstract: The dialog device according to the present invention includes a prediction unit 254 configured to predict an utterance length attribute of a user utterance in response to the machine utterance, a selection unit 256 configured to use the utterance length attribute to select, as a feature model for usage in an end determination of the user utterance, at least one of an acoustic feature model or a lexical feature model, and an estimation unit 258 configured to estimate an end point in the user utterance using the selected model. By using this dialog device, it is possible to shorten the waiting time until a response is output to a user utterance by a machine, and to realize a more natural conversation between a user and a machine.
Type: Grant
Filed: March 19, 2020
Date of Patent: March 14, 2023
Assignee: HITACHI, LTD.
Inventors: Amalia Istiqlali Adiba, Takeshi Homma, Takashi Sumiyoshi
-
Patent number: 11553235
Abstract: Techniques have been developed to facilitate the livestreaming of group audiovisual performances. Audiovisual performances including vocal music are captured and coordinated with performances of other users in ways that can create compelling user and listener experiences. For example, in some cases or embodiments, duets with a host performer may be supported in a sing-with-the-artist style audiovisual livestream in which aspiring vocalists request or queue particular songs for a live radio show entertainment format. The developed techniques provide a communications latency-tolerant mechanism for synchronizing vocal performances captured at geographically-separated devices (e.g., at globally-distributed, but network-connected mobile phones or tablets or at audiovisual capture devices geographically separated from a live studio).
Type: Grant
Filed: June 7, 2021
Date of Patent: January 10, 2023
Assignee: Smule, Inc.
Inventors: Anton Holmberg, Benjamin Hersh, Jeannie Yang, Perry R. Cook, Jeffrey C. Smith
-
Patent number: 11545153
Abstract: Provided are a device and a method that allow a remote terminal to perform a process on the basis of a local-terminal-side user utterance. There are a local terminal and a remote terminal. The local terminal performs a process of a semantic analysis of a user utterance input into the local terminal. On the basis of a result of the semantic analysis, the local terminal determines whether or not the user utterance is a request to the remote terminal for a process. Moreover, in a case where the user utterance is a request to the remote terminal for a process, the local terminal transmits the result of the semantic analysis by a semantic-analysis part to the remote terminal. The remote terminal receives the result of the semantic analysis of the local-terminal-side user utterance, and performs a process based on the received result of the semantic analysis of the local-terminal-side user utterance.
Type: Grant
Filed: March 12, 2019
Date of Patent: January 3, 2023
Assignee: Sony Corporation
Inventor: Keiichi Yamada
-
Patent number: 11488620
Abstract: The present invention is a computer program product and method for increasing the playback speed of audio or other media files. The computer program product and method identifies pedagogic media files and adds a flag to the metadata of the media file. The flag represents the number and type of pauses or silent sections in the pedagogic media file. Based on the flag, the computer program product and method may fast forward or remove a portion of the pauses and silent sections to provide a new playback speed.
Type: Grant
Filed: June 12, 2019
Date of Patent: November 1, 2022
Assignee: International Business Machines Corporation
Inventor: Deepa Jain
-
Patent number: 11443761
Abstract: A technique, suitable for real-time processing, is disclosed for pitch tracking by detection of glottal excitation epochs in a speech signal. It uses the Hilbert envelope to enhance saliency of the glottal excitation epochs and to reduce the ripples due to the vocal tract filter. The processing comprises the steps of dynamic range compression, calculation of the Hilbert envelope, and epoch marking. The Hilbert envelope is calculated using the output of an FIR-filter-based Hilbert transformer and the delay-compensated signal. The epoch marking uses a dynamic peak detector with fast rise and slow fall and nonlinear smoothing to further enhance the saliency of the epochs, followed by a differentiator or a Teager energy operator, and amplitude-duration thresholding. The technique is meant for use in speech codecs, voice conversion, speech and speaker recognition, diagnosis of voice disorders, speech training aids, and other applications involving pitch estimation.
Type: Grant
Filed: August 3, 2019
Date of Patent: September 13, 2022
Inventors: Prem Chand Pandey, Hirak Dasgupta, Nataraj Kathriki Shambulingappa
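Two of the building blocks named in this abstract are easy to sketch: the Hilbert envelope of a speech signal (computed here via scipy's analytic-signal helper rather than the FIR Hilbert transformer the patent uses) and the Teager energy operator applied to sharpen the epoch saliency. The test signal and the processing order are illustrative assumptions.

```python
import numpy as np
from scipy.signal import hilbert

def hilbert_envelope(x):
    """Magnitude of the analytic signal."""
    return np.abs(hilbert(x))

def teager_energy(x):
    """psi[n] = x[n]^2 - x[n-1]*x[n+1], emphasizing impulsive excitation epochs."""
    psi = np.empty_like(x)
    psi[1:-1] = x[1:-1] ** 2 - x[:-2] * x[2:]
    psi[0], psi[-1] = psi[1], psi[-2]
    return psi

fs = 8000
t = np.arange(fs) / fs
speech_like = np.sin(2 * np.pi * 120 * t) * (1 + 0.5 * np.sin(2 * np.pi * 3 * t))
env = hilbert_envelope(speech_like)      # envelope with reduced vocal-tract ripple
saliency = teager_energy(env)            # sharpened epoch candidates
```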
-
Patent number: 11443751
Abstract: Innovations in phase quantization during speech encoding and phase reconstruction during speech decoding are described. For example, to encode a set of phase values, a speech encoder omits higher-frequency phase values and/or represents at least some of the phase values as a weighted sum of basis functions. Or, as another example, to decode a set of phase values, a speech decoder reconstructs at least some of the phase values using a weighted sum of basis functions and/or reconstructs lower-frequency phase values then uses at least some of the lower-frequency phase values to synthesize higher-frequency phase values. In many cases, the innovations improve the performance of a speech codec in low bitrate scenarios, even when encoded data is delivered over a network that suffers from insufficient bandwidth or transmission quality problems.
Type: Grant
Filed: February 12, 2021
Date of Patent: September 13, 2022
Assignee: Microsoft Technology Licensing, LLC
Inventors: Soren Skak Jensen, Sriram Srinivasan, Koen Bernard Vos
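A hedged sketch of representing a set of phase values as a weighted sum of basis functions: fit the weights by least squares at the encoder, reconstruct approximately at the decoder. The cosine (DCT-like) basis and the number of weights are assumptions, not the patent's basis.

```python
import numpy as np

def fit_phase_weights(phases, n_basis):
    """Fit weights so that basis @ weights approximates the phase values."""
    k = np.arange(len(phases))
    basis = np.cos(np.pi * np.outer(k + 0.5, np.arange(n_basis)) / len(phases))
    weights, *_ = np.linalg.lstsq(basis, phases, rcond=None)
    return weights, basis

phases = np.unwrap(np.random.uniform(-np.pi, np.pi, 40))   # stand-in phase values
weights, basis = fit_phase_weights(phases, n_basis=8)      # 8 weights instead of 40 phases
reconstructed = basis @ weights                            # decoder-side approximation
```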
-
Patent number: 11430461
Abstract: A method for detecting a voice activity in an input audio signal composed of frames includes that a noise characteristic of the input signal is determined based on a received frame of the input audio signal. A voice activity detection (VAD) parameter is derived based on the noise characteristic of the input audio signal using an adaptive function. The derived VAD parameter is compared with a threshold value to provide a voice activity detection decision. The input audio signal is processed according to the voice activity detection decision.
Type: Grant
Filed: September 21, 2020
Date of Patent: August 30, 2022
Assignee: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Zhe Wang
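A minimal sketch of the decision flow described above: estimate a noise characteristic per frame, derive a VAD parameter from it, and compare against a threshold. The SNR-like parameter, the smoothing constant, and the threshold are illustrative assumptions, not the patented adaptive function.

```python
import numpy as np

def vad_decision(frame, noise_energy, alpha=0.95, threshold=3.0):
    """Return (is_speech, updated noise energy estimate) for one frame."""
    frame_energy = float(np.mean(frame ** 2))
    vad_param = frame_energy / (noise_energy + 1e-12)   # parameter derived from noise characteristic
    is_speech = vad_param > threshold
    if not is_speech:                                   # track noise on non-speech frames
        noise_energy = alpha * noise_energy + (1 - alpha) * frame_energy
    return is_speech, noise_energy
```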
-
Patent number: 11423902
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for handing off a user conversation between computer-implemented agents. One of the methods includes receiving, by a computer-implemented agent specific to a user device, a digital representation of speech encoding an utterance, determining, by the computer-implemented agent, that the utterance specifies a requirement to establish a communication with another computer-implemented agent, and establishing, by the computer-implemented agent, a communication between the other computer-implemented agent and the user device.
Type: Grant
Filed: July 27, 2020
Date of Patent: August 23, 2022
Assignee: GOOGLE LLC
Inventors: Johnny Chen, Thomas L. Dean, Qiangfeng Peter Lau, Sudeep Gandhe, Gabriel Schine
-
Patent number: 11380351
Abstract: A method for pulmonary condition monitoring includes selecting a phrase from an utterance of a user of an electronic device, wherein the phrase matches an entry of multiple phrases. At least one speech feature that is associated with one or more pulmonary conditions within the phrase is identified. A pulmonary condition is determined based on analysis of the at least one speech feature.
Type: Grant
Filed: January 14, 2019
Date of Patent: July 5, 2022
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Ebrahim Nematihosseinabadi, Md Mahbubur Rahman, Viswam Nathan, Korosh Vatanparvar, Jilong Kuang, Jun Gao
-
Patent number: 11348591
Abstract: A speaker identification system and method to identify a speaker based on the speaker's voice is disclosed. In an exemplary embodiment, the speaker identification system comprises a Gaussian Mixture Model (GMM) for speaker accent and dialect identification for a given speech signal input by the speaker and an Artificial Neural Network (ANN) to identify the speaker based on the identified dialect, in which the output of the GMM is input to the ANN.
Type: Grant
Filed: September 23, 2021
Date of Patent: May 31, 2022
Assignee: King Abdulaziz University
Inventors: Muhammad Moinuddin, Ubaid M. Al-Saggaf, Shahid Munir Shah, Rizwan Ahmed Khan, Zahraa Ubaid Al-Saggaf
-
Patent number: 11303489
Abstract: A transmitting apparatus includes a first signal generating unit that generates, on the basis of data, a first signal transmitted by single carrier block transmission; a second signal generating unit that generates, on the basis of an RS, a second signal transmitted by orthogonal frequency division multiplex transmission; a switching operator that selects and outputs the second signal in a first transmission period and selects and outputs the first signal in a second transmission period; an antenna that transmits the signal output from the switching operator; and a control-signal generating unit that controls the second signal generating unit such that, in the first transmission period, the RS is arranged in a frequency band allocated for transmission of the RS from the transmitting apparatus among frequency bands usable in OFDM.
Type: Grant
Filed: January 30, 2020
Date of Patent: April 12, 2022
Assignee: Mitsubishi Electric Corporation
Inventors: Fumihiro Hasegawa, Akinori Taira
-
Patent number: 11289067
Abstract: Methods and systems for generating voices based on characteristics of an avatar. One or more characteristics of an avatar are obtained and one or more parameters of a voice synthesizer for generating a voice corresponding to the one or more avatar characteristics are determined. The voice synthesizer is configured based on the one or more parameters and a voice is generated using the parameterized voice synthesizer.
Type: Grant
Filed: June 25, 2019
Date of Patent: March 29, 2022
Assignee: International Business Machines Corporation
Inventors: Kristina Marie Brimijoin, Gregory Boland, Joseph Schwarz
-
Patent number: 11282534
Abstract: Systems and methods for intelligent playback of media content may include an intelligent media playback system that, in response to determining the speech tempo in audio content by measuring syllable density of speech in the audio content, automatically adjusts a playback speed of the audio content as the audio content is being played based on the determined speech tempo. In some embodiments, the system may automatically and dynamically adjust the playback speed to result in a desired target speech tempo. In addition, the system may determine whether to automatically adjust playback speed of the audio content, as the media is being played, based on the detected speech tempo of the speech in the audio content and the determined type of content of media. Such automatic adjustments in playback speed result in more efficient playback of the audio content.
Type: Grant
Filed: August 3, 2018
Date of Patent: March 22, 2022
Assignee: Sling Media PVT Ltd
Inventors: Yatish Jayant Naik Raikar, Varunkumar Tripathi, Karthik Mahabaleshwar Hegde
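The playback-speed rule in this abstract reduces to simple arithmetic: scale the playback rate so that the measured speech tempo (syllables per second) lands on a desired target tempo, clamped to a comfortable range. The target tempo and clamp limits below are illustrative assumptions.

```python
def playback_speed(measured_syllables_per_sec, target_syllables_per_sec=5.0,
                   min_speed=0.75, max_speed=2.0):
    """Playback rate that maps the measured speech tempo onto the target tempo."""
    if measured_syllables_per_sec <= 0:
        return 1.0
    speed = target_syllables_per_sec / measured_syllables_per_sec
    return max(min_speed, min(max_speed, speed))

print(playback_speed(3.2))   # slow speaker -> play back faster (~1.56x)
```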
-
Patent number: 11276412
Abstract: A method and device allocates a bit-budget to a plurality of first parts of a CELP core module of (a) an encoder for encoding a sound signal or (b) a decoder for decoding the sound signal. In the method and device, bit-budget allocation tables assign, for each of a plurality of intermediate bit rates, respective bit-budgets to the first CELP core module parts. A CELP core module bit rate is determined and one of the intermediate bit rates is selected based on the determined CELP core module bit rate. The respective bit-budgets assigned by the bit-budget allocation tables for the selected intermediate bit rate are allocated to the first CELP core module parts.
Type: Grant
Filed: September 20, 2018
Date of Patent: March 15, 2022
Assignee: VOICEAGE CORPORATION
Inventor: Vaclav Eksler
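A sketch of table-driven bit-budget allocation: select an intermediate bit rate based on the determined CELP core bit rate (here, the largest one not exceeding it, which is an assumption) and hand out the per-part budgets that the table assigns for that rate. The table values and part names are made up for illustration.

```python
BIT_BUDGET_TABLE = {          # intermediate bit rate (bps) -> bits per CELP core part
    7200:  {"lpc": 26, "pitch": 20, "fcb": 36, "gains": 12},
    9600:  {"lpc": 30, "pitch": 24, "fcb": 52, "gains": 14},
    13200: {"lpc": 34, "pitch": 28, "fcb": 78, "gains": 16},
}

def allocate_bit_budget(core_bit_rate):
    """Pick an intermediate bit rate from the table and return its budgets."""
    eligible = [r for r in BIT_BUDGET_TABLE if r <= core_bit_rate]
    selected = max(eligible) if eligible else min(BIT_BUDGET_TABLE)
    return selected, BIT_BUDGET_TABLE[selected]

print(allocate_bit_budget(11000))   # selects the 9600 bps row
```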
-
Patent number: 11270071
Abstract: Systems, apparatuses, and methods are described herein for providing language-level content recommendations to users based on an analysis of closed captions of content viewed by the users and other data. Language-level analysis of content viewed by a user may be performed to generate metrics that are associated with the user. The metrics may be used to provide recommendations for content, which may include advertising, that is closely aligned with the user's interests.
Type: Grant
Filed: December 28, 2017
Date of Patent: March 8, 2022
Assignee: Comcast Cable Communications, LLC
Inventor: Richard Walsh
-
Patent number: 11270721
Abstract: Pre-processing systems, methods of pre-processing, and speech processing systems for improved Automated Speech Recognition are provided. Some pre-processing systems for improved speech recognition of a speech signal are provided, which systems comprise a pitch estimation circuit; and a pitch equalization processor. The pitch estimation circuit is configured to receive the speech signal to determine a pitch index of the speech signal, and the pitch equalization processor is configured to receive the speech signal and pitch information, to equalize a speech pitch of the speech signal using the pitch information, and to provide a pitch-equalized speech signal.
Type: Grant
Filed: May 21, 2019
Date of Patent: March 8, 2022
Assignee: PLANTRONICS, INC.
Inventors: Youhong Lu, Arun Rajasekaran
-
Patent number: 11263876
Abstract: Data is collected for Self-Service Terminals (SSTs) including tallies, events, and outcomes associated with servicing the SSTs. Statistical correlations are derived from the tallies and events with respect to the outcomes. Subsequent collected data is processed with the statistical correlations and a probability for a failure of a component or a part of the component associated with a particular SST is reported for servicing the component or part before the failure.
Type: Grant
Filed: September 28, 2017
Date of Patent: March 1, 2022
Assignee: NCR Corporation
Inventors: Claudio Cifarelli, Gardiner Arthur, Iain M. N. Cowan, Massimo Mastropietro, Callum Ellis Morton
-
Patent number: 11250826
Abstract: Digital signal processing and machine learning techniques can be employed in a vocal capture and performance social network to computationally generate vocal pitch tracks from a collection of vocal performances captured against a common temporal baseline such as a backing track or an original performance by a popularizing artist. In this way, crowd-sourced pitch tracks may be generated and distributed for use in subsequent karaoke-style vocal audio captures or other applications. Large numbers of performances of a song can be used to generate a pitch track. Computationally determined pitch trackings from individual audio signal encodings of the crowd-sourced vocal performance set are aggregated and processed as an observation sequence of a trained Hidden Markov Model (HMM) or other statistical model to produce an output pitch track.
Type: Grant
Filed: October 28, 2019
Date of Patent: February 15, 2022
Assignee: Smule, Inc.
Inventors: Stefan Sullivan, John Shimmin, Dean Schaffer, Perry R. Cook
-
Patent number: 11250221
Abstract: Methods, systems, and computer-readable storage media for contextual interpretation of a Japanese word are provided. A first set of characters representing Japanese words is received. The received first set of characters is input to a neural network. The neural network is trained to process characters based on bi-directional context interpretation. The first set of characters is processed by the neural network through a plurality of learning layers that process the first set of characters in an order of the first set of characters and in a reverse order to determine semantical meanings of the characters in the first set of characters. An alphabet representation of at least one character of the first set of characters representing a Japanese word is output. The alphabet representation corresponds to a semantical meaning of the at least one character within the first set of characters.
Type: Grant
Filed: March 14, 2019
Date of Patent: February 15, 2022
Assignee: SAP SE
Inventor: Sean Saito
-
Patent number: 11244694
Abstract: A method is described that processes an audio signal. A discontinuity between a filtered past frame and a filtered current frame of the audio signal is removed using linear predictive filtering.
Type: Grant
Filed: January 23, 2017
Date of Patent: February 8, 2022
Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
Inventors: Emmanuel Ravelli, Manuel Jander, Grzegorz Pietrzyk, Martin Dietz, Marc Gayer
-
Patent number: 11241635
Abstract: The present disclosure according to at least one embodiment relates to, in the learning process of a child using smart toys, a method and system for providing an interactive service by using a smart toy, which provides a more accurately classified emotional state of the child based on at least one or more sensed data items of an optical image, a thermal image, and voice data of the child, as well as adaptively providing a flexible and versatile interactive service according to classified emotions.
Type: Grant
Filed: November 15, 2019
Date of Patent: February 8, 2022
Inventors: Heui Yul Noh, Myeong Ho Roh, Chang Woo Ban, Oh Soung Kwon, Seung Pil Lee, Seung Min Shin
-
Patent number: 11239859
Abstract: A method for partitioning of input vectors for coding is presented. The method comprises obtaining of an input vector. The input vector is segmented, in a non-recursive manner, into an integer number, NSEG, of input vector segments. A representation of a respective relative energy difference between parts of the input vector on each side of each boundary between the input vector segments is determined, in a recursive manner. The input vector segments and the representations of the relative energy differences are provided for individual coding. Partitioning units and computer programs for partitioning of input vectors for coding, as well as positional encoders, are presented.
Type: Grant
Filed: June 5, 2020
Date of Patent: February 1, 2022
Assignee: TELEFONAKTIEBOLAGET LM ERICSSON (PUBL)
Inventors: Tomas Jansson Toftgård, Volodya Grancharov, Jonas Svedberg
-
Patent number: 11227586
Abstract: Systems and methods for improving the performance of statistical model-based single-channel speech enhancement systems using a deep neural network (DNN) are disclosed. Embodiments include a DNN-trained system to predict speech presence in the input signal, and this information can be used to create frameworks for tracking noise and conducting a priori signal-to-noise ratio estimation. Example frameworks provide increased flexibility for various aspects of system design, such as gain estimation. Examples include training a DNN to detect speech in the presence of both noise and reverberation, enabling joint suppression of additive noise and reverberation. Example frameworks provide significant improvements in objective speech quality metrics relative to baseline systems.
Type: Grant
Filed: September 11, 2019
Date of Patent: January 18, 2022
Assignee: Massachusetts Institute of Technology
Inventors: Bengt J. Borgstrom, Michael S. Brandstein, Robert B. Dunn
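A hedged sketch of one way a DNN's speech-presence probability can drive noise tracking and a priori SNR estimation: the classic decision-directed recursion with a Wiener-style gain, gated by the predicted probability. This is a standard baseline formulation assumed for illustration, not necessarily the patented framework.

```python
import numpy as np

def enhance_frame(noisy_psd, noise_psd, prev_clean_psd, p_speech,
                  alpha_noise=0.9, alpha_dd=0.98):
    """One frequency-domain enhancement step using a speech-presence probability."""
    # Update the noise estimate only in proportion to predicted speech absence.
    noise_psd = p_speech * noise_psd + (1 - p_speech) * (
        alpha_noise * noise_psd + (1 - alpha_noise) * noisy_psd)

    gamma = noisy_psd / (noise_psd + 1e-12)                     # a posteriori SNR
    xi = alpha_dd * prev_clean_psd / (noise_psd + 1e-12) \
         + (1 - alpha_dd) * np.maximum(gamma - 1.0, 0.0)        # a priori SNR (decision-directed)
    gain = xi / (1.0 + xi)                                      # Wiener-style gain
    clean_psd = (gain ** 2) * noisy_psd
    return gain, clean_psd, noise_psd
```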
-
Patent number: 11217237
Abstract: At least one exemplary embodiment is directed to a method and device for voice operated control with learning. The method can include measuring a first sound received from a first microphone, measuring a second sound received from a second microphone, detecting a spoken voice based on an analysis of measurements taken at the first and second microphone, learning from the analysis when the user is speaking and a speaking level in noisy environments, training a decision unit from the learning to be robust to a detection of the spoken voice in the noisy environments, mixing the first sound and the second sound to produce a mixed signal, and controlling the production of the mixed signal based on the learning of one or more aspects of the spoken voice and ambient sounds in the noisy environments.
Type: Grant
Filed: November 13, 2018
Date of Patent: January 4, 2022
Assignee: Staton Techiya, LLC
Inventors: John Usher, Steven Goldstein, Marc Boillot
-
Patent number: 11216853
Abstract: A method and system for advertising dynamic content in an immersive digital medium user experience operate a plurality of computer processors and databases in an associated network for receiving, processing, and communicating instructions and data relating to advertising content in an immersive digital medium user experience. The method and system execute instructions and process data relating to advertising objects, the advertising objects comprising images of objects, signs, labels, and related indicia of object origin for indicating sources of purchasing one or more objects for advertising to receive advertising instructions and data from a plurality of software applications and further respond to variations in said advertising instructions and data whereby operation of said computer processors and databases enables swapping out of various advertising messages and images according to the context of said immersive digital medium user experience.
Type: Grant
Filed: March 27, 2020
Date of Patent: January 4, 2022
Inventor: Quintan Ian Pribyl
-
Patent number: 11159589
Abstract: Described are a system, method, and computer program product for task-based teleconference management. The method includes initiating a teleconference bridge and generating a teleconference session hosted by the bridge. The method also includes connecting teleconference participants of an organization to the bridge and receiving a participant identifier for each participant. The method further includes determining an association of an organization group with each participant based on a respective participant identifier. The method further includes generating display data configured to cause a computing device to display a control interface depicting: (i) the teleconference session having groups of participants, the groups selected from predetermined groups based on task data, and each participant visually associated with its group; and (ii) labels of each participant to identify the group associated therewith.
Type: Grant
Filed: August 28, 2019
Date of Patent: October 26, 2021
Assignee: Visa International Service Association
Inventors: Yi Shen, Trinath Anaparthi, Sangram Pattanaik
-
Patent number: 11134330
Abstract: Embodiments of the invention determine a speech estimate using a bone conduction sensor or accelerometer, without employing voice activity detection gating of speech estimation. Speech estimation is based either exclusively on the bone conduction signal, or is performed in combination with a microphone signal. The speech estimate is then used to condition an output signal of the microphone. There are multiple use cases for speech processing in audio devices.
Type: Grant
Filed: July 12, 2019
Date of Patent: September 28, 2021
Assignee: Cirrus Logic, Inc.
Inventors: David Leigh Watts, Brenton Robert Steele, Thomas Ivan Harvey, Vitaliy Sapozhnykov
-
Patent number: 11127416
Abstract: A method and an apparatus for voice activity detection provided in embodiments of the present disclosure allow for dividing a to-be-detected audio file into frames to obtain a first sequence of audio frames, extracting an acoustic feature of each audio frame in the first sequence of audio frames, and then inputting the acoustic feature of each audio frame to a noise-added VAD model in chronological order to obtain a probability value of each audio frame in the first sequence of audio frames; and then determining, by an electronic device, a start and an end of the voice signal according to the probability value of each audio frame. During the VAD detection, the start and the end of a voice signal in an audio are recognized with a noise-added VAD model to realize the purpose of accurately recognizing the start and the end of the voice signal.
Type: Grant
Filed: September 6, 2019
Date of Patent: September 21, 2021
Inventor: Chao Li
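Once a model has produced per-frame probabilities, turning them into start and end points can be done with a simple threshold-and-hangover rule; the sketch below assumes such a rule (the thresholds and hangover length are not from the patent).

```python
def find_voice_segments(probs, on_threshold=0.6, off_threshold=0.4, hangover=5):
    """Return (start_frame, end_frame) pairs from per-frame voice probabilities."""
    segments, start, quiet = [], None, 0
    for i, p in enumerate(probs):
        if start is None:
            if p >= on_threshold:                    # voice start detected
                start, quiet = i, 0
        else:
            if p < off_threshold:
                quiet += 1
                if quiet > hangover:                 # confirmed end of voice
                    segments.append((start, i - quiet))
                    start, quiet = None, 0
            else:
                quiet = 0
    if start is not None:
        segments.append((start, len(probs) - 1))
    return segments

print(find_voice_segments(
    [0.1, 0.2, 0.8, 0.9, 0.9, 0.3, 0.2, 0.1, 0.1, 0.1, 0.1, 0.1]))   # [(2, 4)]
```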
-
Patent number: 11094328
Abstract: Various embodiments herein each include at least one of systems, methods, and software for conference audio manipulation for inclusion and accessibility. One embodiment is in the form of a method that may be performed, for example, on a server or a participant computing device. This method includes receiving a voice signal via a network and modifying an audible characteristic of the voice signal that is perceptible when the voice signal is audibly output. The method further includes outputting the voice signal including the modified audible characteristic.
Type: Grant
Filed: September 27, 2019
Date of Patent: August 17, 2021
Assignee: NCR Corporation
Inventor: Phil Noel Day
-
Patent number: 11087231
Abstract: This disclosure is directed to an apparatus for intelligent matching of disparate input data received from disparate input data systems in a complex computing network for establishing targeted communication to a computing device associated with the intelligently matched disparate input data.
Type: Grant
Filed: December 9, 2019
Date of Patent: August 10, 2021
Assignee: Research Now Group, LLC
Inventors: Melanie D. Courtright, Vincent P. Derobertis, Michael D. Bigby, William C. Robinson, Greg Ellis, Heidi D. E. Wilton, John R. Rothwell, Jeremy S. Antoniuk
-
Patent number: 11062094
Abstract: A method of analyzing sentiments includes receiving one or more strings of text, identifying sentiments related to a first topic from the one or more strings of text, and assigning a sentiment score to each of the sentiments related to the first topic, where the sentiment score corresponds to a degree of positivity or negativity of a sentiment of the sentiments. The method further includes calculating an average sentiment score for the first topic based on the sentiment score for each of the sentiments related to the first topic, determining a percentile for the first topic based on a frequency of sentiments related to the first topic, where the percentile for the first topic is determined with respect to a maximum frequency of sentiments related to one or more other topics, and computing an X-Score based on the average sentiment score and the percentile of the first topic.
Type: Grant
Filed: May 7, 2019
Date of Patent: July 13, 2021
Assignee: LANGUAGE LOGIC, LLC
Inventors: Rick Kieser, Charles Baylis, Serge Luyens
-
Patent number: 11049492
Abstract: Described herein are real-time musical translation devices (RETM) and methods of use thereof. Exemplary uses of RETMs include optimizing the understanding and/or recall of an input message for a user and improving a cognitive process in a user.
Type: Grant
Filed: November 10, 2020
Date of Patent: June 29, 2021
Assignee: YAO THE BARD, LLC
Inventors: Leonardus H. T. Van Der Ploeg, Halley Young
-
Patent number: 11038787
Abstract: In accordance with an example embodiment of the present invention, disclosed is a method and an apparatus thereof for selecting a packet loss concealment procedure for a lost audio frame of a received audio signal. A method for selecting a packet loss concealment procedure comprises detecting an audio type of a received audio frame and determining a packet loss concealment procedure based on the audio type. In the method, detecting an audio type comprises determining a stability of a spectral envelope of signals of received audio frames.
Type: Grant
Filed: October 1, 2019
Date of Patent: June 15, 2021
Assignee: TELEFONAKTIEBOLAGET LM ERICSSON (PUBL)
Inventor: Stefan Bruhn
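A rough, heavily hedged sketch of the selection idea: measure how stable the spectral envelope has been across recent good frames and pick a concealment procedure accordingly. The stability metric, the threshold, and the two procedure names below are assumptions, not the patented audio-type classification.

```python
import numpy as np

def select_plc_procedure(envelopes, stability_threshold=0.15):
    """envelopes: list of per-frame log spectral envelopes (equal-length arrays)."""
    diffs = [np.mean(np.abs(envelopes[i] - envelopes[i - 1]))
             for i in range(1, len(envelopes))]
    instability = float(np.mean(diffs))      # small value = stable (e.g., music-like) envelope
    return "tonal_concealment" if instability < stability_threshold else "generic_concealment"
```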
-
Patent number: 11011160
Abstract: A computerized system for transforming recorded speech into a derived expression of intent from the recorded speech includes: (1) a text classification module comparing a transcription of at least a portion of recorded speech against a text classifier to generate a first set of one or more of the representations of potential intents based upon such comparison; (2) a phonetics classification module comparing a phonetic transcription of at least a portion of the recorded speech against a phonetics classifier to generate a second set of one or more of the representations of potential intents based upon such comparison; (3) an audio classification module comparing an audio version of at least a portion of the recorded speech with an audio classifier to generate a third set of one or more of the representations of potential intents based upon such comparison; and a (4) discriminator module for receiving the first, second and third sets of the one or more representations of potential intents and generating at least
Type: Grant
Filed: January 19, 2018
Date of Patent: May 18, 2021
Assignee: OPEN WATER DEVELOPMENT LLC
Inventor: Moshe Villaizan
-
Patent number: 10984813
Abstract: A method and an apparatus for detecting correctness of a pitch period, where the method for detecting correctness of a pitch period includes determining, according to an initial pitch period of an input signal in a time domain, a pitch frequency bin of the input signal, where the initial pitch period is obtained by performing open-loop detection on the input signal, determining, based on an amplitude spectrum of the input signal in a frequency domain, a pitch period correctness decision parameter, associated with the pitch frequency bin, of the input signal, and determining correctness of the initial pitch period according to the pitch period correctness decision parameter. Hence, the method and apparatus for detecting correctness of the pitch period improve, based on a relatively less complex algorithm, accuracy of detecting correctness of the pitch period.
Type: Grant
Filed: February 15, 2019
Date of Patent: April 20, 2021
Assignee: HUAWEI TECHNOLOGIES CO., LTD.
Inventors: Fengyan Qi, Lei Miao
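A hedged sketch of the general flow: map an initial time-domain pitch period to a pitch frequency bin, then form a correctness decision parameter from the amplitude spectrum. The specific parameter used here (energy near the pitch bin and its harmonics relative to total spectral energy) and the threshold are assumptions for illustration.

```python
import numpy as np

def pitch_period_looks_correct(frame, pitch_period_samples, fs=16000,
                               n_fft=1024, threshold=0.2):
    """Check whether the amplitude spectrum supports the candidate pitch period."""
    spectrum = np.abs(np.fft.rfft(frame, n_fft))
    pitch_hz = fs / pitch_period_samples              # pitch frequency from the period
    bin_width = fs / n_fft
    harmonic_energy = 0.0
    k = 1
    while k * pitch_hz < fs / 2:
        b = int(round(k * pitch_hz / bin_width))      # pitch/harmonic frequency bin
        lo, hi = max(b - 1, 0), min(b + 2, len(spectrum))
        harmonic_energy += float(np.sum(spectrum[lo:hi] ** 2))
        k += 1
    decision_param = harmonic_energy / (float(np.sum(spectrum ** 2)) + 1e-12)
    return decision_param > threshold

fs = 16000
t = np.arange(640) / fs
frame = np.sin(2 * np.pi * 200 * t) + 0.3 * np.sin(2 * np.pi * 400 * t)
print(pitch_period_looks_correct(frame, pitch_period_samples=fs // 200))   # True
```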
-
Coordinated audiovisual montage from selected crowd-sourced content with alignment to audio baseline
Patent number: 10971191
Abstract: A generally diverse set of audiovisual clips is sourced from one or more repositories for use in preparing a coordinated audiovisual work. In some cases, audiovisual clips are retrieved using tags such as user-assigned hashtags or metadata. Pre-existing associations of such tags can be used as hints that certain audiovisual clips are likely to share correspondence with an audio signal encoding of a particular song or other audio baseline. Clips are evaluated for computationally determined correspondence with an audio baseline track. In general, comparisons of audio power spectra, of rhythmic features, tempo, pitch sequences and other extracted audio features may be used to establish correspondence. For clips exhibiting a desired level of correspondence, computationally determined temporal alignments of individual clips with the baseline audio track are used to prepare a coordinated audiovisual work that mixes the selected audiovisual clips with the audio track.
Type: Grant
Filed: June 15, 2015
Date of Patent: April 6, 2021
Inventors: Mark T. Godfrey, Turner Evan Kirk, Ian S. Simon, Nick Kruge
-
Patent number: 10938366
Abstract: A volume level meter has a housing that is mounted on a microphone, and is connected to a pop filter positioned in front of a vocalist and adjacent to a microphone. The display faces the vocalist, and is arranged on the housing so that it indicates a volume level of audio signals received from the microphone. The vocalist can see indicators on the display and know the volume level of the audio signal from the microphone. This allows the vocalist to monitor the volume level indicators of the volume level display and control their vocal volume levels based on the indicators. In this way, the vocalist can reduce fluctuations in vocal volume levels that may lead to distortion of the audio signal by monitoring the volume level display.
Type: Grant
Filed: May 3, 2019
Date of Patent: March 2, 2021
Inventors: Joseph N Griffin, Corey D Chapman
-
Patent number: 10924193
Abstract: Embodiments include techniques for transmitting and receiving radio frequency (RF) signals, where the techniques include generating, via a digital-to-analog converter (DAC), a frequency signal, and filtering the frequency signal to produce a first filtered signal and a second filtered signal. The techniques also include transmitting the second filtered signal to a device under test, and filtering the second filtered signal into a sub-signal having one or more components. The techniques include mixing the first filtered signal with the sub-signal to produce a first mixed signal, subsequently mixing the first mixed signal with an output signal received from the device under test to produce a second mixed signal, and converting the second mixed signal for analysis.
Type: Grant
Filed: September 29, 2017
Date of Patent: February 16, 2021
Assignee: International Business Machines Corporation
Inventors: Mohit Kapur, Muir Kumph
-
Patent number: 10902841
Abstract: Systems, methods, and computer program products customizing and delivering contextually relevant, artificially synthesized, voiced content that is targeted toward the individual user behaviors, viewing habits, experiences and preferences of each individual user accessing the content of a content provider. A network accessible profile service collects and analyzes collected user profile data and recommends contextually applicable voices based on the user's profile data. As a user inputs a request to access voiced content or triggers voiced content maintained by a content provider, the voiced content being delivered to the user is a modified version comprising artificially synthesized human speech mimicking the recommended voice and delivering the dialogue of the voiced content, in a manner that imitates the sounds and speech patterns of the recommended voice.
Type: Grant
Filed: February 15, 2019
Date of Patent: January 26, 2021
Assignee: International Business Machines Corporation
Inventors: Su Liu, Eric J. Rozner, Inseok Hwang, Chungkuk Yoo