Pitch Determination Of Speech Signals (EPO) Patents (Class 704/E11.006)

VOCAL SOURCE EXTRACTION BY MAXIMUM PHASE DETECTION

Publication number: 20130325455

Abstract: Methods, apparatus and computer program products implement embodiments of the present invention that include receiving a time domain voice signal, and extracting a single pitch cycle from the received signal. The extracted single pitch cycle is transformed to a frequency domain, and the misclassified roots of the frequency domain are identified and corrected. Using the corrected roots, an indication of a maximum phase of the frequency domain is generated.

Type: Application

Filed: June 4, 2012

Publication date: December 5, 2013

Applicants: INTERNATIONAL BUSINESS MACHINES CORPORATION, UZDAROJI AKCINĖ BENDROVĖ LIETUVOS TYRIMU CENTRAS

Inventors: Aharon Satt, Zvi Kons, Ron Hoory
Formant Based Speech Reconstruction from Noisy Signals

Publication number: 20130231924

Abstract: Implementations of systems, methods and devices described herein enable enhancing the intelligibility of a target voice signal included in a noisy audible signal received by a hearing aid device or the like. In particular, in some implementations, systems, methods and devices are operable to generate a machine readable formant based codebook. In some implementations, the method includes determining whether or not a candidate codebook tuple includes a sufficient amount of new information to warrant either adding the candidate codebook tuple to the codebook or using at least a portion of the candidate codebook tuple to update an existing codebook tuple. Additionally and/or alternatively, in some implementations systems, methods and devices are operable to reconstruct a target voice signal by detecting formants in an audible signal, using the detected formants to select codebook tuples, and using the formant information in the selected codebook tuples to reconstruct the target voice signal.

Type: Application

Filed: August 20, 2012

Publication date: September 5, 2013

Inventors: Pierre Zakarauskas, Alexander Escott, Clarence S.H. Chu, Shawn E. Stevenson
METHOD AND APPARATUS FOR RECEIVING AND PLAYING A SIGNAL IN A RADIO RECEIVER

Publication number: 20130136276

Abstract: A method and apparatus for receiving and playing a signal in a radio receiver to suppress microphonic feedback are provided by alternately pitch shifting a received audio signal. The pitch of the received audio signal is alternately shifted up and then down, repeatedly over successive intervals of the audio signal, to produce a pitch swing signal which is then played over a speaker. The alternating pitch shifting prevents the buildup of regenerative feedback normally caused by acoustic vibrations coupling into the radio receiver.

Type: Application

Filed: November 29, 2011

Publication date: May 30, 2013

Applicant: MOTOROLA SOLUTIONS, INC.

Inventors: V. C. PRAKASH VK CHACKO, THEAN HAI OOI, KAR BOON OUNG, CHEAH HENG TAN, HUOY THYNG YOW
MULTIPLE MICROPHONE BASED LOW COMPLEXITY PITCH DETECTOR

Publication number: 20130117014

Abstract: Disclosed are various embodiments of multiple microphone based pitch detection. In one embodiment, a method includes obtaining a primary signal and a secondary signal associated with multiple microphones. A pitch value is determined based at least in part upon a level difference between the primary and secondary signals. In another embodiment, a system includes a plurality of microphones configured to provide a primary signal and a secondary signal. A level difference detector is configured to determine a level difference between the primary and secondary signals and a pitch identifier is configured to clip the primary and secondary signals based at least in part upon the level difference. In another embodiment, a method determines the presence of voice activity based upon a pitch prediction gain variation that is determined based at least in part upon a pitch lag.

Type: Application

Filed: November 7, 2011

Publication date: May 9, 2013

Applicant: BROADCOM CORPORATION

Inventors: Xianxian Zhang, Alfonsus Lunardhi
IDENTIFYING FEATURES IN A PORTION OF A SIGNAL REPRESENTING SPEECH

Publication number: 20130046533

Abstract: Methods, systems, and machine-readable media are disclosed for processing a signal representing speech. According to one embodiment, processing a signal representing speech can comprise receiving a region of the signal representing speech. The region can comprise a portion of a frame of the signal representing speech classified as a voiced frame. The region can be marked based on one or more pitch estimates for the region. A cord can be identified within the region based on occurrence of one or more events within the region of the signal. For example, the one or more events can comprise one or more glottal pulses. In such cases, the cord can begin with the onset of a first glottal pulse and extend to a point prior to the onset of a second glottal pulse. The cord may exclude a portion of the region of the signal prior to the onset of the second glottal pulse.

Type: Application

Filed: October 19, 2012

Publication date: February 21, 2013

Applicant: RED SHIFT COMPANY, LLC

Inventor: RED SHIFT COMPANY, LLC
SYSTEM AND METHOD FOR TRACKING SOUND PITCH ACROSS AN AUDIO SIGNAL USING HARMONIC ENVELOPE

Publication number: 20130041657

Abstract: A system and method may be configured to analyze audio information derived from an audio signal. The system and method may track sound pitch across the audio signal. The tracking of pitch across the audio signal may take into account change in pitch by determining at individual time sample windows in the signal duration an estimated pitch and a representation of harmonic envelope at the estimated pitch. The estimated pitch and the representation of harmonic envelope may then be implemented to determine an estimated pitch for another time sample window in the signal duration with an enhanced accuracy and/or precision.

Type: Application

Filed: August 8, 2011

Publication date: February 14, 2013

Applicant: The Intellisis Corporation

Inventors: David C. BRADLEY, Rodney Gateau, Daniel S. Goldin, Robert N. Hilton, Nicholas K. Fisher
SYSTEM AND METHOD FOR TRACKING SOUND PITCH ACROSS AN AUDIO SIGNAL

Publication number: 20130041656

Abstract: A system and method may be configured to analyze audio information derived from an audio signal. The system and method may track sound pitch across the audio signal. The tracking of pitch across the audio signal may take into account change in pitch by determining at individual time sample windows in the signal duration an estimated pitch and an estimated fractional chirp rate of the harmonics at the estimated pitch. The estimated pitch and the estimated fractional chirp rate may then be implemented to determine an estimated pitch for another time sample window in the signal duration with an enhanced accuracy and/or precision.

Type: Application

Filed: August 8, 2011

Publication date: February 14, 2013

Applicant: The Intellisis Corporation

Inventors: David C. BRADLEY, Daniel S. GOLDIN, Rodney GATEAU, Nicholas K. FISHER, Robert N. HILTON, Derrick R. ROOS, Eric WIEWIORA
ATMOSPHERE EXPRESSION WORD SELECTION SYSTEM, ATMOSPHERE EXPRESSION WORD SELECTION METHOD, AND PROGRAM

Publication number: 20130024192

Abstract: Disclosed is an information display system provided with: a signal analyzing unit which analyzes the audio signals obtained from a predetermined location and which generates ambient sound information regarding the sound generated at the predetermined location; and an ambient expression selection unit which selects an ambient expression which expresses the content of what a person is feeling from the sound generated at the predetermined location on the basis of the ambient sound information.

Type: Application

Filed: March 28, 2011

Publication date: January 24, 2013

Applicant: NEC CORPORATION

Inventors: Toshiyuki Nomura, Yuzo Senda, Kyota Higa, Takayuki Arakawa, Yasuyuki Mitsui
AUDIO SIGNAL PROCESSING METHOD AND DEVICE

Publication number: 20120239389

Abstract: Disclosed is an audio signal processing method comprising the steps of: receiving an audio signal containing current frame data; generating a first temporary output signal for the current frame when an error occurs in the current frame data, by carrying out frame error concealment with respect to the current frame data using a random codebook; generating a parameter by carrying out one or more of short-term prediction, long-term prediction and a fixed codebook search based on the first temporary output signal; and updating a memory with the parameter for the next frame; wherein the parameter comprises one or more of pitch gain, pitch delay, fixed codebook gain and a fixed codebook.

Type: Application

Filed: November 24, 2010

Publication date: September 20, 2012

Applicant: LG ELECTRONICS INC.

Inventors: Hye Jeong Jeon, Dae Hwan Kim, Hong Goo Kang, Min Ki Lee, Byung Suk Lee, Gyu Hyeok Jeong
STATE DETECTING DEVICE AND STORAGE MEDIUM STORING A STATE DETECTING PROGRAM

Publication number: 20120209598

Abstract: A state detecting device includes an input unit that receives an input voice sound; an analyzer that calculates a feature parameter of each of a plurality of frames extracted from the voice sound; a calculator that calculates the average of the feature parameters of the frames, determines a threshold on the basis of the average and statistical data representing relationships between other averages of other feature parameters obtained from a plurality of speakers and cumulative frequencies of the other feature parameters, and calculates an appearance frequency of a frame that is among the plurality of frames and whose feature parameter is larger than the threshold; a determining unit that determines, on the basis of the appearance frequency, a strained state of a vocal cord that has made the voice sound; and an output unit that outputs a result of the determination.

Type: Application

Filed: January 23, 2012

Publication date: August 16, 2012

Applicant: FUJITSU LIMITED

Inventors: Shoji HAYAKAWA, Naoshi MATSUO
SPEECH PROCESSING DEVICE, SPEECH PROCESSING METHOD, AND COMPUTER PROGRAM PRODUCT

Publication number: 20120185244

Abstract: According to one embodiment, in a speech processing device, an extractor windows a part of the speech signal and extracts a partial waveform. A calculator performs frequency analysis of the partial waveform to calculate a frequency spectrum. An estimator generates an artificial waveform that is a waveform according to an interval between the pitch marks for each harmonic component having a frequency that is a predetermined multiple of a fundamental frequency of the speech signal and estimates harmonic spectral features representing characteristics of the frequency spectrum of the harmonic component from each of the artificial waveforms. A separator separates the partial waveform into a periodic component produced from periodic vocal-fold vibration as an acoustic source and an aperiodic component produced from aperiodic acoustic sources other than the vocal-fold vibration by using the respective harmonic spectral features and the frequency spectrum of the partial waveform.

Type: Application

Filed: January 26, 2012

Publication date: July 19, 2012

Applicant: KABUSHIKI KAISHA TOSHIBA

Inventors: Masahiro Morita, Javier Latorre, Takehiko Kagoshima
SYSTEM AND METHOD FOR AUDIO SYNTHESIZER UTILIZING FREQUENCY APERTURE ARRAYS

Publication number: 20120166187

Abstract: A system and method for audio synthesizer utilizing frequency aperture cells (FAC) and frequency aperture arrays (FAA). In accordance with an embodiment, an audio processing system can be provided for the transformation of audio-band frequencies for musical and other purposes. In accordance with an embodiment, a single stream of mono, stereo, or multi-channel monophonic audio can be transformed into polyphonic music, based on a desired target musical note or set of multiple notes. At its core, the system utilizes an input waveform(s) (which can be either file-based or streamed) which is then fed into an array of filters, which are themselves optionally modulated, to generate a new synthesized audio output.

Type: Application

Filed: August 26, 2011

Publication date: June 28, 2012

Applicant: SONIC NETWORK, INC.

Inventors: James Edwin Van Buskirk, Jennifer Hruska, Jason Jordan, Al Joelson, Borislav Zlatkov
Method and System for Determining a Perceived Quality of an Audio System

Publication number: 20120143601

Abstract: The invention relates to a method for determining a quality indicator representing a perceived quality of an output signal of an audio system with respect to a reference signal. The reference signal and the output signal are processed and compared. The processing includes dividing the reference signal and the output signal into mutually corresponding time frames. Additionally, the processing includes scaling the intensity of the reference signal towards a fixed intensity level, and then performing measurements on time frames within the scaled reference signal for determining reference signal time frame characteristics. The intensity of the reference signal is then scaled from the fixed intensity level towards an intensity level related to the output signal. Further on in the method, the loudness of the output signal is scaled towards a fixed loudness level in the perceptual loudness domain. This scaling action uses the reference signal time frame characteristics.

Type: Application

Filed: August 9, 2010

Publication date: June 7, 2012

Applicants: Nederlandse Organisatie voor Toegepast-Natuurwetenschappelijk Onderzoek TNO, KONINKLIJKE KPN N.V.

Inventors: John Gerard Beerends, Jeroen Van Vugt
SPEECH PROCESSING APPARATUS AND SPEECH PROCESSING METHOD

Publication number: 20120136655

Abstract: A signal portion is extracted per frame having a specific duration from an input signal, thus generating a per-frame input signal. The per-frame input signal in the time domain is converted into a per-frame input signal in the frequency domain, thereby generating a spectral pattern of spectra. Peak spectra having peaks are detected in the spectral pattern. A harmonic spectrum is determined, in the peak spectra, having a harmonic structure showing a relationship between a fundamental pitch and a harmonic overtone.

Type: Application

Filed: November 28, 2011

Publication date: May 31, 2012

Applicant: JVC KENWOOD Corporation a corporation of Japan

Inventor: Takaaki YAMABE
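
The frame-to-spectrum-to-peaks flow in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the patented method: the function names, the relative peak threshold, and the tolerance used to decide whether a peak sits at an integer multiple of the lowest peak are all assumptions introduced here, and a direct DFT is used only to keep the sketch dependency-free.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Magnitudes of DFT bins 0 .. N/2-1, computed with a direct (slow) DFT."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)))
        for k in range(n // 2)
    ]

def peak_bins(mags, rel_threshold=0.1):
    """Bins that are local maxima and at least rel_threshold of the largest bin."""
    limit = rel_threshold * max(mags)
    return [
        k for k in range(1, len(mags) - 1)
        if mags[k] >= limit and mags[k] > mags[k - 1] and mags[k] > mags[k + 1]
    ]

def harmonic_peaks(peaks, tolerance=0.05):
    """Keep peaks lying near integer multiples of the lowest peak (assumed fundamental)."""
    if not peaks:
        return []
    f0 = peaks[0]
    return [p for p in peaks if abs(p / f0 - round(p / f0)) <= tolerance]

# Hypothetical 64-sample frame: a fundamental at bin 4 plus two overtones.
frame = [
    math.sin(2 * math.pi * 4 * i / 64)
    + 0.5 * math.sin(2 * math.pi * 8 * i / 64)
    + 0.25 * math.sin(2 * math.pi * 12 * i / 64)
    for i in range(64)
]
print(harmonic_peaks(peak_bins(magnitude_spectrum(frame))))
```

Treating the lowest detected peak as the fundamental is a simplification; a real detector would also have to handle a missing or noise-masked fundamental.
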
METHOD FOR TONE/INTONATION RECOGNITION USING AUDITORY ATTENTION CUES

Publication number: 20120116756

Abstract: In a spoken language processing method for tone/intonation recognition, an auditory spectrum may be determined for an input window of sound and one or more multi-scale features may be extracted from the auditory spectrum. Each multi-scale feature can be extracted using a separate two-dimensional spectro-temporal receptive filter. One or more feature maps corresponding to the one or more multi-scale features can be generated and an auditory gist vector can be extracted from each of the one or more feature maps. A cumulative gist vector may be obtained through augmentation of each auditory gist vector extracted from the one or more feature maps. One or more tonal characteristics corresponding to the input window of sound can be determined by mapping the cumulative gist vector to one or more tonal characteristics using a machine learning algorithm.

Type: Application

Filed: November 10, 2010

Publication date: May 10, 2012

Applicant: Sony Computer Entertainment Inc.

Inventor: Ozlem Kalinli
DSP-BASED DEVICE FOR AUDITORY SEGREGATION OF MULTIPLE SOUND INPUTS

Publication number: 20120109645

Abstract: There is provided a unique signal processing technique for localizing and characterizing each of a number of differently located acoustic sources. Specifically there is provided a method for auditory segregation of multiple voice inputs comprising the steps of: receiving a plurality of voice input signals from different source locations; filtering said voice input signals with head related transfer functions (HRTF) using a digital signal processor (DSP) thereby assigning the voice input signals to different locations in virtual auditory space; and changing the HRTF filtered voice input signals in two dimensions, wherein pitch is changed and the signal is filtered with different filters emulating vocal tracts of different sizes thereby further segregating the voice input signals from each other.

Type: Application

Filed: June 23, 2010

Publication date: May 3, 2012

Applicant: LIZARD TECHNOLOGY

Inventors: John Hallam, Jakob Christensen-Dalsgaard
QUERY BY HUMMING FOR RINGTONE SEARCH AND DOWNLOAD

Publication number: 20120101815

Abstract: Described is a technology by which a user hums, sings or otherwise plays a user-provided rendition of a ringtone (or ringback tone) through a mobile telephone to a ringtone search service (e.g., a WAP, interactive voice response or SMS-based search platform). The service matches features of the user's rendition against features of actual ringtones to determine one or more matching candidate ringtones for downloading. Features may include pitch contours (up or down), pitch intervals and durations of notes. Matching candidates may be ranked based on the determined similarity, possibly in conjunction with weighting criterion such as the popularity of the ringtone and/or the importance of the matched part. The candidate set may be augmented with other ringtones independent of the matching, such as the most popular ones downloaded by other users, ringtones from similar artists, and so forth.

Type: Application

Filed: December 29, 2011

Publication date: April 26, 2012

Applicant: Microsoft Corporation

Inventors: Lie LU, Yutao XIE, Sing XIE, Jiafan OU, Ruihao WENG
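
As a rough illustration of matching on pitch contours (up or down), the sketch below reduces a note sequence to a string of U/D/R steps and ranks candidates by string similarity. The MIDI-number representation, the catalog contents, and the use of `difflib.SequenceMatcher` as the similarity measure are assumptions made for this example; the abstract does not say how similarity is computed, and pitch intervals, durations and popularity weighting are left out.

```python
from difflib import SequenceMatcher

def contour(notes):
    """Map a note sequence (MIDI numbers) to U/D/R steps: up, down, repeat."""
    return "".join(
        "U" if b > a else "D" if b < a else "R"
        for a, b in zip(notes, notes[1:])
    )

def rank_ringtones(hummed, catalog):
    """Return (similarity, name) pairs, best match first."""
    query = contour(hummed)
    scored = [
        (SequenceMatcher(None, query, contour(notes)).ratio(), name)
        for name, notes in catalog.items()
    ]
    return sorted(scored, reverse=True)

# Hypothetical catalog; the hummed query is tone_a transposed up a fourth.
catalog = {
    "tone_a": [60, 62, 64, 62, 60],
    "tone_b": [60, 60, 59, 57, 57],
}
print(rank_ringtones([65, 67, 69, 67, 65], catalog))
```

Because only the direction of each step is kept, the match is insensitive to the key in which the user hums.
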
Artifact Reduction in Packet Loss Concealment

Publication number: 20120101814

Abstract: Various techniques are disclosed for improving packet loss concealment to reduce artifacts by using audio character measures of the audio signal. These techniques include attenuation to a noise fill instead of attenuation to silence, varying how long to wait before attenuating the extrapolation, varying the rate of attenuation of the extrapolation, attenuating periodic extrapolation at a different rate than non-periodic extrapolation, performing periodic extrapolation on successively longer fill data based on the audio character measures, adjusting weighting between periodic and non-periodic extrapolation based on the audio character measures, and adjusting weighting between periodic extrapolation and non-periodic extrapolation non-linearly.

Type: Application

Filed: October 25, 2010

Publication date: April 26, 2012

Applicant: POLYCOM, INC.

Inventor: Eric David Elias
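
The idea of attenuating toward a noise fill instead of toward silence, with an adjustable wait and attenuation rate, might look like the sketch below. The parameter names `hold` and `fade_len`, the linear fade, and the per-sample formulation are assumptions for illustration; the abstract describes choosing such values from audio character measures, which is not modeled here.

```python
def conceal(extrapolated, noise_fill, hold=2, fade_len=2):
    """Crossfade an extrapolated signal into a noise fill.

    hold:     number of samples kept at full level before attenuation starts
    fade_len: number of samples over which the extrapolation is faded out
    """
    out = []
    for i, (sample, noise) in enumerate(zip(extrapolated, noise_fill)):
        if i < hold:
            gain = 1.0
        else:
            gain = max(0.0, 1.0 - (i - hold + 1) / fade_len)
        # As the extrapolation fades out, the noise fill fades in.
        out.append(gain * sample + (1.0 - gain) * noise)
    return out

print(conceal([1.0] * 6, [0.1] * 6))
```

A longer `hold` or `fade_len` keeps the extrapolated audio audible for longer, which is the kind of knob the abstract says can be varied.
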
ESTIMATION OF SPEECH MODEL PARAMETERS

Publication number: 20120089391

Abstract: Methods for estimating speech model parameters are disclosed. For pulsed parameter estimation, a speech signal is divided into multiple frequency bands or channels using bandpass filters. Channel processing reduces sensitivity to pole magnitudes and frequencies and reduces impulse response time duration to improve pulse location and strength estimation performance. These methods are useful for high quality speech coding and reproduction at various bit rates for applications such as satellite and cellular voice communication.

Type: Application

Filed: October 7, 2011

Publication date: April 12, 2012

Applicant: Digital Voice Systems, Inc.

Inventor: Daniel W. Griffin
ESTIMATING A PITCH LAG

Publication number: 20120072209

Abstract: An electronic device for estimating a pitch lag is described. The electronic device includes a processor and executable instructions stored in memory that is in electronic communication with the processor. The electronic device obtains a current frame. The electronic device also obtains a residual signal based on the current frame. The electronic device additionally determines a set of peak locations based on the residual signal. Furthermore, the electronic device obtains a set of pitch lag candidates based on the set of peak locations. The electronic device also estimates a pitch lag based on the set of pitch lag candidates.

Type: Application

Filed: September 8, 2011

Publication date: March 22, 2012

Applicant: QUALCOMM Incorporated

Inventors: Venkatesh Krishnan, Stephane Pierre Villette
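
The peak-to-candidate-to-estimate chain described in the pitch lag abstract above can be sketched as follows. The residual is assumed to be already available (the linear-prediction step that produces it is omitted), and the relative peak threshold and the choice of the median as the final estimator are assumptions made here, not details taken from the application.

```python
from statistics import median

def peak_locations(residual, rel_threshold=0.5):
    """Indices of local maxima of |residual| at or above a fraction of the largest value."""
    limit = rel_threshold * max(abs(x) for x in residual)
    peaks = []
    for i in range(1, len(residual) - 1):
        a = abs(residual[i])
        if a >= limit and a > abs(residual[i - 1]) and a >= abs(residual[i + 1]):
            peaks.append(i)
    return peaks

def estimate_pitch_lag(residual):
    """Estimate the pitch lag (in samples) from spacings between residual peaks."""
    peaks = peak_locations(residual)
    candidates = [b - a for a, b in zip(peaks, peaks[1:])]
    return median(candidates) if candidates else None

# Hypothetical residual: pulses every 50 samples.
residual = [0.0] * 200
for position in (10, 60, 110, 160):
    residual[position] = 1.0
print(estimate_pitch_lag(residual))
```
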
DETERMINING PITCH CYCLE ENERGY AND SCALING AN EXCITATION SIGNAL

Publication number: 20120072208

Abstract: An electronic device for determining a set of pitch cycle energy parameters is described. The electronic device includes a processor and executable instructions stored in memory. The electronic device obtains a frame, a set of filter coefficients and a residual signal based on the frame and the set of filter coefficients. The electronic device determines a set of peak locations based on the residual signal and segments the residual signal such that each segment includes one peak. The electronic device determines a first set of pitch cycle energy parameters based on a frame region between two consecutive peak locations and maps regions between peaks in the residual signal to regions between peaks in a synthesized excitation signal to produce a mapping. The electronic device determines a second set of pitch cycle energy parameters based on the first set of pitch cycle energy parameters and the mapping.

Type: Application

Filed: September 8, 2011

Publication date: March 22, 2012

Applicant: QUALCOMM Incorporated

Inventors: Venkatesh Krishnan, Stephane Pierre Villette
Method For Communicating and Displaying Interactive Avatar

Publication number: 20120058747

Abstract: A method for communication and for displaying an interactive avatar or hologram corresponding to a remote party.

Type: Application

Filed: September 8, 2010

Publication date: March 8, 2012

Inventors: James Yiannios, Mourad Ben Ayed
SPEECH SYNTHESIZER, SPEECH SYNTHESIS METHOD AND COMPUTER PROGRAM PRODUCT

Publication number: 20120053933

Abstract: According to one embodiment, a first storage unit stores n band noise signals obtained by applying n band-pass filters to a noise signal. A second storage unit stores n band pulse signals. A parameter input unit inputs a fundamental frequency, n band noise intensities, and a spectrum parameter. An extraction unit extracts for each pitch mark the n band noise signals while shifting. An amplitude control unit changes amplitudes of the extracted band noise signals and band pulse signals in accordance with the band noise intensities. A generation unit generates a mixed sound source signal by adding the n band noise signals and the n band pulse signals. A generation unit generates the mixed sound source signal generated based on the pitch mark. A vocal tract filter unit generates a speech waveform by applying a vocal tract filter using the spectrum parameter to the generated mixed sound source signal.

Type: Application

Filed: March 18, 2011

Publication date: March 1, 2012

Applicant: KABUSHIKI KAISHA TOSHIBA

Inventors: Masatsune Tamura, Masahiro Morita, Takehiko Kagoshima
AUTOMATIC MARKING METHOD FOR KARAOKE VOCAL ACCOMPANIMENT

Publication number: 20120022859

Abstract: An automatic marking method for Karaoke vocal accompaniment is provided. In the method, pitch, beat position and volume of a singer are compared with the original pitch, beat position and volume of the theme of a song to generate a score of pitch, a score of beat and a score of emotion respectively, so as to obtain a weighted total score in a weighted marking method. By using the method, the pitch, beat position and volume error of each section of the song sung by the singer can be exactly worked out, and a pitch curve and a volume curve can be displayed, so that the singer can learn which part is sung incorrectly and which part needs to be enhanced. The present invention also has the advantages of dual effects of teaching and entertainment, high practicability and technical advancement.

Type: Application

Filed: April 7, 2009

Publication date: January 26, 2012

Inventor: Wen-Hsin Lin
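
The weighted total score mentioned in the karaoke abstract above amounts to a weighted average of the three component scores. The weights below are hypothetical; the abstract does not give them.

```python
def weighted_total(pitch_score, beat_score, emotion_score, weights=(0.5, 0.3, 0.2)):
    """Combine pitch, beat and emotion (volume) scores into one weighted total."""
    w_pitch, w_beat, w_emotion = weights
    total_weight = w_pitch + w_beat + w_emotion
    return (
        w_pitch * pitch_score + w_beat * beat_score + w_emotion * emotion_score
    ) / total_weight

print(weighted_total(80, 90, 70))
```

Dividing by the sum of the weights keeps the result on the same scale as the inputs even if the weights do not add up to one.
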
VOICE RECOGNITION TERMINAL

Publication number: 20120004908

Abstract: A voice recognition terminal executes a local voice recognition process and utilizes an external center voice recognition process. The terminal includes: a voice message synthesizing element for synthesizing at least one of a voice message to be output from a speaker according to the external center voice recognition process and a voice message to be output from the speaker according to the local voice recognition process so as to distinguish between characteristics of the voice message to be output from the speaker according to the external center voice recognition process and characteristics of the voice message to be output from the speaker according to the local voice recognition process; and a voice output element for outputting a synthesized voice message from the speaker.

Type: Application

Filed: June 28, 2011

Publication date: January 5, 2012

Applicant: DENSO CORPORATION

Inventors: Kunio YOKOI, Kazuhisa SUZUKI, Masayuki TAKAMI, Naoyori TANZAWA
Transient signal encoding method and device, decoding method and device, and processing system

Patent number: 8063809

Abstract: A transient signal encoding method and device, decoding method and device, and processing system, where the transient signal encoding method includes: obtaining a reference sub-frame where a maximal time envelope having a maximal amplitude value is located from time envelopes of all sub-frames of an input transient signal; adjusting an amplitude value of the time envelope of each sub-frame before the reference sub-frame in such a way that a first difference is greater than a preset first threshold, in which the first difference is a difference between the amplitude value of the time envelope of each sub-frame before the reference sub-frame and the amplitude value of the maximal time envelope; and writing the adjusted time envelope into bitstream.

Type: Grant

Filed: June 29, 2011

Date of Patent: November 22, 2011

Assignee: Huawei Technologies Co., Ltd.

Inventors: Zexin Liu, Longyin Chen, Lei Miao, Chen Hu, Wei Xiao, Herve Marcel Taddei, Qing Zhang
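
A minimal sketch of the envelope adjustment described above, assuming integer (already quantized) time-envelope values: sub-frames before the reference sub-frame are lowered where needed so that they differ from the maximal envelope by more than the threshold. The function name, the integer representation, and the choice to lower a value to exactly `peak - threshold - 1` are assumptions for illustration only.

```python
def adjust_pre_peak_envelopes(envelopes, threshold):
    """Lower sub-frame envelopes before the peak so that peak - value > threshold."""
    reference = max(range(len(envelopes)), key=lambda i: envelopes[i])
    peak = envelopes[reference]
    adjusted = list(envelopes)
    for i in range(reference):
        if peak - adjusted[i] <= threshold:
            # Smallest integer reduction that makes the difference exceed the threshold.
            adjusted[i] = peak - threshold - 1
    return adjusted

print(adjust_pre_peak_envelopes([5, 9, 10, 4], threshold=3))
```

Sub-frames after the reference sub-frame are left untouched in this sketch, matching the abstract's focus on the sub-frames that precede the transient.
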
Method and Apparatus for Audio Source Separation

Publication number: 20110282658

Abstract: The present invention relates to co-channel audio source separation. In one embodiment a first frequency-related representation of plural regions of the acoustic signal is prepared over time, and a two-dimensional transform of plural two-dimensional localized regions of the first frequency-related representation, each less than an entire frequency range of the first frequency related representation, is obtained to provide a two-dimensional compressed frequency-related representation with respect to each two dimensional localized region. For each of the plural regions, at least one pitch is identified. The pitch from the plural regions is processed to provide multiple pitch estimates over time. In another embodiment, a mixed acoustic signal is processed by localizing multiple time-frequency regions of a spectrogram of the mixed acoustic signal to obtain one or more acoustic properties.

Type: Application

Filed: September 3, 2010

Publication date: November 17, 2011

Applicant: Massachusetts Institute of Technology

Inventors: Tianyu Wang, Thomas R. Quatieri, JR.
Adaptive Filter Pitch Extraction

Publication number: 20110276324

Abstract: An enhancement system extracts pitch from a processed speech signal. The system estimates the pitch of voiced speech by deriving filter coefficients of an adaptive filter and using the obtained filter coefficients to derive pitch. The pitch estimation may be enhanced by using various techniques to condition the input speech signal, such as spectral modification of the background noise and the speech signal, and/or reduction of the tonal noise from the speech signal.

Type: Application

Filed: May 11, 2011

Publication date: November 10, 2011

Inventors: Rajeev Nongpiur, Phillip A. Hetherington
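
One way to derive pitch from adaptive filter coefficients, sketched under stated assumptions: a normalized LMS (NLMS) predictor estimates the current sample from delayed past samples, and after adaptation the delay with the largest coefficient is taken as the pitch period. The NLMS update, the tap range, and the step size are choices made for this illustration; the abstract does not specify the filter structure, and the input conditioning it mentions is omitted.

```python
import math

def adaptive_pitch_period(signal, min_lag=20, taps=40, step=0.5):
    """Estimate the pitch period (in samples) from NLMS predictor coefficients."""
    weights = [0.0] * taps
    for n in range(min_lag + taps - 1, len(signal)):
        # Past samples delayed by min_lag .. min_lag + taps - 1.
        past = [signal[n - min_lag - k] for k in range(taps)]
        predicted = sum(w * u for w, u in zip(weights, past))
        error = signal[n] - predicted
        norm = sum(u * u for u in past) + 1e-9
        weights = [w + step * error * u / norm for w, u in zip(weights, past)]
    # The delay whose coefficient is largest is taken as the period.
    best_tap = max(range(taps), key=lambda k: weights[k])
    return min_lag + best_tap

# Hypothetical test tone with a period of 40 samples.
tone = [math.sin(2 * math.pi * i / 40) for i in range(2000)]
print(adaptive_pitch_period(tone))
```

The searched delays run from `min_lag` to `min_lag + taps - 1`, so the true period has to fall inside that window for the estimate to be meaningful.
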
SPEECH-BASED SPEAKER RECOGNITION SYSTEMS AND METHODS

Publication number: 20110276323

Abstract: The illustrative embodiments described herein provide systems and methods for authenticating a speaker. In one embodiment, a method includes receiving reference speech input including a reference passphrase to form a reference recording, and receiving test speech input including a test passphrase to form a test recording. The method includes determining whether the test passphrase matches the reference passphrase, and determining whether one or more voice features of the speaker of the test passphrase matches one or more voice features of the speaker of the reference passphrase. The method authenticates the speaker of the test speech input in response to determining that the reference passphrase matches the test passphrase and that one or more voice features of the speaker of the test passphrase matches one or more voice features of the speaker of the reference passphrase.

Type: Application

Filed: May 6, 2010

Publication date: November 10, 2011

Applicant: Senam Consulting, Inc.

Inventor: Serge Olegovich Seyfetdinov
INTEROPERABLE VOCODER

Publication number: 20110257965

Abstract: Encoding a sequence of digital speech samples into a bit stream includes dividing the digital speech samples into one or more frames and computing a set of model parameters for the frames. The set of model parameters includes at least a first parameter conveying pitch information. The voicing state of a frame is determined and the first parameter conveying pitch information is modified to designate the determined voicing state of the frame, if the determined voicing state of the frame is equal to one of a set of reserved voicing states. The model parameters are quantized to generate quantizer bits which are used to produce the bit stream.

Type: Application

Filed: June 27, 2011

Publication date: October 20, 2011

Applicant: DIGITAL VOICE SYSTEMS, INC.

Inventor: John C. Hardwick
PITCH-CORRECTION OF VOCAL PERFORMANCE IN ACCORD WITH SCORE-CODED HARMONIES

Publication number: 20110251840

Abstract: Despite many practical limitations imposed by mobile device platforms and application execution environments, vocal musical performances may be captured and continuously pitch-corrected for mixing and rendering with backing tracks in ways that create compelling user experiences. In some cases, the vocal performances of individual users are captured on mobile devices in the context of a karaoke-style presentation of lyrics in correspondence with audible renderings of a backing track. Such performances can be pitch-corrected in real-time at a portable computing device (such as a mobile phone, personal digital assistant, laptop computer, notebook computer, pad-type computer or netbook) in accord with pitch correction settings. In some cases, pitch correction settings include a score-coded melody and/or harmonies supplied with, or for association with, the lyrics and backing tracks.

Type: Application

Filed: April 12, 2011

Publication date: October 13, 2011

Inventors: Perry R. Cook, Ari Lazier, Tom Lieber, Turner E. Kirk
COORDINATING AND MIXING VOCALS CAPTURED FROM GEOGRAPHICALLY DISTRIBUTED PERFORMERS

Publication number: 20110251841

Abstract: Despite many practical limitations imposed by mobile device platforms and application execution environments, vocal musical performances may be captured and continuously pitch-corrected for mixing and rendering with backing tracks in ways that create compelling user experiences. Based on the techniques described herein, even mere amateurs are encouraged to share with friends and family or to collaborate and contribute vocal performances as part of virtual “glee clubs.” In some implementations, these interactions are facilitated through social network- and/or eMail-mediated sharing of performances and invitations to join in a group performance. Using uploaded vocals captured at clients such as a mobile device, a content server (or service) can mediate such virtual glee clubs by manipulating and mixing the uploaded vocal performances of multiple contributing vocalists.

Type: Application

Filed: April 12, 2011

Publication date: October 13, 2011

Inventors: Perry R. Cook, Ari Lazier, Tom Lieber, Turner E. Kirk
COMPUTATIONAL TECHNIQUES FOR CONTINUOUS PITCH CORRECTION AND HARMONY GENERATION

Publication number: 20110251842

Abstract: Using signal processing techniques described herein, pitch detection and correction of a user's vocal performance can be performed continuously and in real-time with respect to the audible rendering of the backing track at the handheld or portable computing device. In some implementations, pitch detection builds on time-domain pitch correction techniques that employ average magnitude difference function (AMDF) or autocorrelation-based techniques together with zero-crossing and/or peak picking techniques to identify differences between pitch of a captured vocal signal and score-coded target pitches. Based on detected differences, pitch correction based on pitch synchronous overlapped add (PSOLA) and/or linear predictive coding (LPC) techniques allows captured vocals to be pitch shifted in real-time to “correct” notes in accord with pitch correction settings that code score-coded melody targets and harmonies.
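
The AMDF-based detection step named in this abstract can be illustrated with a minimal sketch. It is not the method claimed in the application: the function names, the 80–500 Hz search range, the `tol` octave-error guard, and the synthetic test tone are all assumptions made for illustration. The sketch estimates the pitch of one frame and expresses its distance from a target pitch in cents.

```python
import math

def amdf(frame, lag):
    """Average magnitude difference between the frame and itself shifted by `lag`."""
    n = len(frame) - lag
    return sum(abs(frame[i] - frame[i + lag]) for i in range(n)) / n

def detect_pitch_hz(frame, sample_rate, f_min=80.0, f_max=500.0, tol=0.01):
    """Estimate pitch in Hz from the deepest AMDF valley.

    The search range and `tol` are illustrative assumptions. Taking the
    smallest lag within `tol` of the minimum guards against picking a
    multiple of the true period.
    """
    min_lag = int(sample_rate / f_max)
    max_lag = min(int(sample_rate / f_min), len(frame) - 1)
    scores = {lag: amdf(frame, lag) for lag in range(min_lag, max_lag + 1)}
    best = min(scores.values())
    lag = min(l for l, s in scores.items() if s <= best + tol)
    return sample_rate / lag

def cents_off(detected_hz, target_hz):
    """Signed distance from the target pitch in cents (100 cents = 1 semitone)."""
    return 1200.0 * math.log2(detected_hz / target_hz)

# Synthetic 200 Hz tone standing in for a captured vocal frame.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 200.0 * n / sample_rate) for n in range(400)]
pitch = detect_pitch_hz(frame, sample_rate)
print(pitch, cents_off(pitch, 220.0))
```

A pitch shifter (PSOLA or LPC based, as the abstract mentions) would then use the signed cents value to decide how far to move the note; that stage is not sketched here.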

Type: Application

Filed: April 12, 2011

Publication date: October 13, 2011

Inventors: Perry R. Cook, Ari Lazier, Tom Lieber
NONVOLATILE STORAGE SYSTEM AND MUSIC SOUND GENERATION SYSTEM

Publication number: 20110246188

Abstract: A music sound generation system is formed with a high sound quality and with a small size using a large-capacity NAND flash memory for storing music sound data. Music sound data is divided into N pitch groups and stored into N different storage modules as being divided in these storage modules. A sound generation command classification unit (3000) classifies sound generation commands provided from an external unit into N sound generation command groups. A read command unit in each access module reads data from a storage module based on the sound generation command group. This structure enables music sound data to be read from a plurality of storage modules in parallel.

Type: Application

Filed: May 26, 2010

Publication date: October 6, 2011

Inventor: Masahiro Nakanishi
ROBOT, METHOD AND PROGRAM OF CONTROLLING ROBOT

Publication number: 20110224977

Abstract: A robot may include a driving control unit configured to control a driving of a movable unit that is connected movably to a body unit, a voice generating unit configured to generate a voice, and a voice output unit configured to output the voice, which has been generated by the voice generating unit. The voice generating unit may correct the voice, which is generated, based on a bearing of the movable unit, which is controlled by the driving control unit, to the body unit.

Type: Application

Filed: September 14, 2010

Publication date: September 15, 2011

Applicant: HONDA MOTOR CO., LTD.

Inventors: Kazuhiro NAKADAI, Takuma OTSUKA, Hiroshi OKUNO
METHOD AND APPARATUS FOR OBTAINING PITCH GAIN, AND CODER AND DECODER

Publication number: 20110218800

Abstract: The present invention relates to a method and apparatus for obtaining a pitch gain, and a coder and a decoder. The method includes: obtaining information about an input signal; and obtaining a pitch gain corresponding to the information about the input signal according to the correspondence between the signal information and the pitch gain. The embodiments of the present invention obtain the corresponding pitch gain according to the signal information by using the obtained correspondence between the signal information and the pitch gain, and the pitch gain is applicable to the coder and the decoder, thus making it unnecessary for the coder to transmit the pitch gain to the decoder and solving the problem of bit overhead. The embodiments of the present invention determine the pitch gain adaptively according to the signal information, avoid consumption of extra bits for quantizing the pitch gain, avoid impact on the coding performance, and improve the compression ratio.

Type: Application

Filed: May 17, 2011

Publication date: September 8, 2011

Applicant: Huawei Technologies Co., Ltd.

Inventors: Dejun Zhang, Lei Miao, Jianfeng Xu, Fengyan Qi, Qing Zhang, Lixiong Li, Fuwei Ma
SPECTRUM CODING APPARATUS, SPECTRUM DECODING APPARATUS, ACOUSTIC SIGNAL TRANSMISSION APPARATUS, ACOUSTIC SIGNAL RECEPTION APPARATUS AND METHODS THEREOF

Publication number: 20110196674

Abstract: A spectrum coding apparatus capable of performing coding at a low bit rate and with high quality is disclosed. This apparatus is provided with a section that performs the frequency transformation of a first signal and calculates a first spectrum, a section that converts the frequency of a second signal and calculates a second spectrum, a section that estimates the shape of the second spectrum in a band of FL≤k<FH using a filter having the first spectrum in a band of 0≤k<FL as an internal state and a section that codes an outline of the second spectrum determined based on a coefficient indicating the characteristic of the filter at this time.

Type: Application

Filed: April 17, 2011

Publication date: August 11, 2011

Applicant: PANASONIC CORPORATION

Inventor: Masahiro Oshikiri
CONCEALING LOST PACKETS IN A SUB-BAND CODING DECODER

Publication number: 20110196673

Abstract: An electronic device for reconstructing a lost packet in a Sub-Band Coding (SBC) decoder is described. The electronic device includes a processor and instructions stored in memory. The electronic device detects a lost packet, obtains a zero-input response of a synthesis filter bank and obtains a coarse pitch estimate. The electronic device also obtains a fine pitch estimate based on the zero-input response and the coarse pitch estimate. The electronic device selects a last pitch period based on the fine pitch estimate and uses samples from the last pitch period for the lost packet.
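
The final step of this abstract, reusing samples from the last pitch period to fill the lost packet, can be sketched as follows. This is only an illustration under simplifying assumptions: the pitch period is taken as already known (the coarse and fine estimation from the zero-input response is not shown), and `conceal_lost_packet` is a hypothetical name.

```python
def conceal_lost_packet(history, pitch_period, packet_len):
    """Fill a lost packet by cyclically repeating the last pitch period.

    `history` holds previously decoded samples; `pitch_period` is assumed
    to come from an earlier pitch estimation stage (not shown here).
    """
    if pitch_period <= 0 or pitch_period > len(history):
        raise ValueError("pitch period must fit inside the decoded history")
    last_period = history[-pitch_period:]
    return [last_period[i % pitch_period] for i in range(packet_len)]

# Toy example: the last three decoded samples are repeated across 7 lost samples.
history = [0, 1, 2, 3, 4, 5]
filled = conceal_lost_packet(history, pitch_period=3, packet_len=7)
print(filled)
```

A practical decoder would typically also smooth the boundary between decoded and substituted samples; that is outside what the abstract states and is omitted.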

Type: Application

Filed: January 26, 2011

Publication date: August 11, 2011

Applicant: QUALCOMM Incorporated

Inventors: Amit Sharma, Jeremy P. Toman, Hyun Jin Park, Sang-Uk Ryu
SYSTEMS AND METHODS FOR SPEECH EXTRACTION

Publication number: 20110191102

Abstract: In some embodiments, a processor-readable medium stores code representing instructions to cause a processor to receive an input signal having a first component and a second component. An estimate of the first component of the input signal is calculated based on an estimate of a pitch of the first component of the input signal. An estimate of the input signal is calculated based on the estimate of the first component of the input signal and an estimate of the second component of the input signal. The estimate of the first component of the input signal is modified based on a scaling function to produce a reconstructed first component of the input signal. The scaling function is a function of at least one of the input signal, the estimate of the first component of the input signal, the estimate of the second component of the input signal, or a residual signal.

Type: Application

Filed: January 31, 2011

Publication date: August 4, 2011

Applicant: UNIVERSITY OF MARYLAND, COLLEGE PARK

Inventors: Carol Espy-Wilson, Srikanth Vishnubhotla
SIGNAL PRESENCE DETECTION USING BI-DIRECTIONAL COMMUNICATION DATA

Publication number: 20110184732

Abstract: A system and method for using bi-directional conversation data to improve signal presence detection are disclosed. The detector module is adapted to communicate with a signal enhancement module. The detector module collects data from a transmit direction of the connection and a receive direction of a data connection. The collected data from the transmit and the receive direction is used to classify at least one of data in the transmit direction and data in the receive direction. Responsive to the classification, the signal enhancement module enhances data in one of the transmit direction and the receive direction. Hence, data classification accuracy is improved by using data from both the transmit and receive directions. In one embodiment, the detector module applies a voice activity detection module (VAD) process to detect the presence or absence of voice data in the collected data.

Type: Application

Filed: April 4, 2011

Publication date: July 28, 2011

Applicant: DITECH NETWORKS, INC.

Inventor: Mahesh Godavarti
GENDER DETECTION IN MOBILE PHONES

Publication number: 20110153317

Abstract: An apparatus for wireless communications includes a processing system. The processing system is configured to receive an input sound stream of a user, split the input sound stream into a plurality of frames, classify each of the frames as one selected from the group consisting of a non-speech frame and a speech frame, determine a pitch of each of the frames in a subset of the speech frames, and identify a gender of the user from the determined pitch. To determine the pitch, the processing system is configured to filter the speech frames to compute an error signal, compute an autocorrelation of the error signal, find a maximum autocorrelation value, and set the pitch to an index of the maximum autocorrelation value.
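
The pitch determination chain in this abstract (filter the frame to get an error signal, autocorrelate the error signal, take the index of the maximum) can be sketched as below. The abstract does not say which filter is used; this sketch assumes a linear-prediction inverse filter fitted with the Levinson-Durbin recursion. The predictor order, the 60–400 Hz search range, and all function names are illustrative assumptions, and the later mapping from pitch to gender is not sketched because the abstract does not specify it.

```python
def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation of x for lags 0..max_lag."""
    return [sum(x[n] * x[n + lag] for n in range(len(x) - lag))
            for lag in range(max_lag + 1)]

def lpc_coefficients(x, order):
    """Levinson-Durbin recursion; returns a with a[0] = 1 for A(z) = sum a[j] z^-j."""
    r = autocorrelation(x, order)
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # silent or degenerate frame: stop early
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a

def prediction_error(x, a):
    """Error signal e[n] = sum_j a[j] * x[n - j], with samples before the frame taken as zero."""
    return [sum(a[j] * x[n - j] for j in range(len(a)) if n - j >= 0)
            for n in range(len(x))]

def pitch_lag(frame, sample_rate, order=4, f_min=60.0, f_max=400.0):
    """Index (in samples) of the maximum autocorrelation of the error signal."""
    e = prediction_error(frame, lpc_coefficients(frame, order))
    min_lag = int(sample_rate / f_max)
    max_lag = min(int(sample_rate / f_min), len(frame) - 1)
    r = autocorrelation(e, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])

# Synthetic voiced frame: an impulse every 80 samples through a one-pole filter.
sample_rate = 8000
frame, y = [], 0.0
for n in range(400):
    y = (1.0 if n % 80 == 0 else 0.0) + 0.9 * y
    frame.append(y)

lag = pitch_lag(frame, sample_rate)
print(lag, sample_rate / lag)
```

Working on the error signal rather than the raw frame removes most of the vocal-tract resonance, so the autocorrelation peak reflects the excitation period more directly.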

Type: Application

Filed: December 23, 2009

Publication date: June 23, 2011

Applicant: QUALCOMM INCORPORATED

Inventors: Yinian Mao, Gene Marsh
CONTINUOUS SCORE-CODED PITCH CORRECTION

Publication number: 20110144982

Abstract: Vocal musical performances may be captured and continuously pitch-corrected at a mobile device for mixing and rendering with backing tracks in ways that create compelling user experiences. In some cases, the vocal performances of individual users are captured in the context of a karaoke-style presentation of lyrics in correspondence with audible renderings of a backing track. Such performances can be pitch-corrected in real-time at the mobile device in accord with pitch correction settings. In some cases, such pitch correction settings code a particular key or scale for the vocal performance or for portions thereof. In some cases, pitch correction settings include a score-coded melody sequence of note targets supplied with, or for association with, the lyrics and/or backing track. In some cases, pitch correction settings are dynamically variable based on gestures captured at a user interface.

Type: Application

Filed: September 4, 2010

Publication date: June 16, 2011

Inventors: Spencer Salazar, Rebecca A. Fiebrink, Ge Wang, Mattias Ljungström, Jeffrey C. Smith, Perry R. Cook
CONTINUOUS PITCH-CORRECTED VOCAL CAPTURE DEVICE COOPERATIVE WITH CONTENT SERVER FOR BACKING TRACK MIX

Publication number: 20110144981

Abstract: Techniques have been developed to facilitate (1) the capture and pitch correction of vocal performances on handheld or other portable computing devices and (2) the mixing of such pitch-corrected vocal performances with backing tracks for audible rendering on targets that include such portable computing devices as well as desktops, workstations, gaming stations, even telephony targets. Implementations of the described techniques employ signal processing techniques and allocations of system functionality that are suitable given the generally limited capabilities of such handheld or portable computing devices and that facilitate efficient encoding and communication of the pitch-corrected vocal performances (or precursors or derivatives thereof) via wireless and/or wired bandwidth-limited networks for rendering on portable computing devices or other targets.

Type: Application

Filed: September 4, 2010

Publication date: June 16, 2011

Inventors: Spencer Salazar, Rebecca A. Fiebrink, Ge Wang, Mattias Ljungström, Jeffrey C. Smith, Perry R. Cook
Speech Intelligibility

Publication number: 20110125491

Abstract: The perceived quality of a speech signal is improved by estimating the average power of first and second signal components and applying a first gain factor to the second signal components to generate adjusted second signal components. The first gain factor is selected such that on application of the first gain factor to the second signal components, the ratio of the average power of the first signal components to the average power of the adjusted second signal components would be a first predetermined value, the first predetermined value being such as to inhibit perceptual distortion of the improved speech signal.
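
The gain selection described in this abstract reduces to a short calculation. Assuming the gain multiplies sample amplitudes (so power scales with its square), requiring P1 / (g² · P2) = R gives g = sqrt(P1 / (R · P2)), where P1 and P2 are the average powers of the first and second components and R is the predetermined ratio. The sketch below uses hypothetical names and toy sample values; how R is chosen is not covered by the abstract.

```python
import math

def average_power(samples):
    """Mean squared amplitude of a block of samples."""
    return sum(s * s for s in samples) / len(samples)

def first_gain_factor(first, second, target_ratio):
    """Gain g such that power(first) / power(g * second) equals target_ratio.

    Assumes the gain is applied to amplitudes, so power scales by g squared.
    """
    return math.sqrt(average_power(first) / (target_ratio * average_power(second)))

# Toy components: first has average power 4, second has average power 1.
first = [2.0, -2.0, 2.0, -2.0]
second = [1.0, -1.0, 1.0, -1.0]
g = first_gain_factor(first, second, target_ratio=1.0)
adjusted = [g * s for s in second]
print(g, average_power(first) / average_power(adjusted))
```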

Type: Application

Filed: November 23, 2009

Publication date: May 26, 2011

Inventors: Rogerio Guedes Alves, Kuan-Chieh Yen, Michael Christopher Vartanian, Sameer Arun Gadre
Speech Intelligibility

Publication number: 20110125492

Abstract: The perceived quality of a narrowband speech signal truncated from a wideband speech signal is improved by generating in a third frequency band third speech components matching first speech components in a first frequency band of the narrowband signal, and generating in a fourth frequency band fourth speech components matching second speech components in a second frequency band of the narrowband signal. A first gain factor is applied to the third speech components to generate adjusted third speech components, and a second gain factor is applied to the fourth speech components to generate adjusted fourth speech components, the gain factors being selected such that the ratios of the average powers of the adjusted third and fourth speech components to the average power of the first speech components are predetermined values.

Type: Application

Filed: November 23, 2009

Publication date: May 26, 2011

Applicant: CAMBRIDGE SILICON RADIO LIMITED

Inventors: Rogerio Guedes Alves, Kuan-Chieh Yen, Michael Christopher Vartanian, Sameer Arun Gadre
VOICE QUALITY CONVERSION APPARATUS, PITCH CONVERSION APPARATUS, AND VOICE QUALITY CONVERSION METHOD

Publication number: 20110125493

Abstract: The voice quality conversion apparatus includes: low-frequency harmonic level calculating units and a harmonic level mixing unit for calculating a low-frequency sound source spectrum by mixing a level of a harmonic of an input sound source waveform and a level of a harmonic of a target sound source waveform at a predetermined conversion ratio for each order of harmonics including fundamental, in a frequency range equal to or lower than a boundary frequency; a high-frequency spectral envelope mixing unit that calculates a high-frequency sound source spectrum by mixing the input sound source spectrum and the target sound source spectrum at the predetermined conversion ratio in a frequency range larger than the boundary frequency; and a spectrum combining unit that combines the low-frequency sound source spectrum with the high-frequency sound source spectrum at the boundary frequency to generate a sound source spectrum for an entire frequency range.

Type: Application

Filed: January 31, 2011

Publication date: May 26, 2011

Inventors: Yoshifumi Hirose, Takahiro Kamai
FRAMING METHOD AND APPARATUS

Publication number: 20110099005

Abstract: A framing method and apparatus are disclosed to overcome inconsistency of gains between sub-frames caused by simple average framing in the prior art. The method includes: obtaining the Linear Prediction Coding (LPC) order and the pitch of the signal; removing the samples inapplicable to Long-Term Prediction (LTP) synthesis according to the LPC prediction order and the pitch; and splitting the remaining samples of the signal into several sub-frames. The technical solution under the present invention is applicable to the multimedia speech coding field.

Type: Application

Filed: December 30, 2010

Publication date: April 28, 2011

Inventors: Dejun ZHANG, Fengyan Qi, Lei Miao, Jianfeng Xu, Qing Zhang, Lixiong Li, Fuwei Ma
Method and Apparatus for Performing Packet Loss or Frame Erasure Concealment

Publication number: 20110087489

Abstract: The invention concerns a method and apparatus for performing packet loss or Frame Erasure Concealment (FEC) for a speech coder that does not have a built-in or standard FEC process. A receiver with a decoder receives encoded frames of compressed speech information transmitted from an encoder. A lost frame detector at the receiver determines if an encoded frame has been lost or corrupted in transmission, or erased. If the encoded frame is not erased, the encoded frame is decoded by a decoder and a temporary memory is updated with the decoder's output. A predetermined delay period is applied and the audio frame is then output. If the lost frame detector determines that the encoded frame is erased, a FEC module applies a frame concealment process to the signal. The FEC processing produces natural sounding synthetic speech for the erased frames.

Type: Application

Filed: December 21, 2010

Publication date: April 14, 2011

Inventor: David A. Kapilow
SPEECH SYNTHESIS APPARATUS AND METHOD

Publication number: 20110087488

Abstract: According to an embodiment, a speech synthesis apparatus includes a selecting unit configured to select speaker's parameters one by one for respective speakers and obtain a plurality of speakers' parameters, the speaker's parameters being prepared for respective pitch waveforms corresponding to speaker's speech sounds, the speaker's parameters including formant frequencies, formant phases, formant powers, and window functions concerning respective formants that are contained in the respective pitch waveforms. The apparatus includes a mapping unit configured to make formants correspond to each other between the plurality of speakers' parameters using a cost function based on the formant frequencies and the formant powers. The apparatus includes a generating unit configured to generate an interpolated speaker's parameter by interpolating, at desired interpolation ratios, the formant frequencies, formant phases, formant powers, and window functions of formants which are made to correspond to each other.

Type: Application

Filed: December 16, 2010

Publication date: April 14, 2011

Inventors: Ryo Morinaka, Takehiko Kagoshima
REAL-TIME SPEAKER-ADAPTIVE SPEECH RECOGNITION APPARATUS AND METHOD

Publication number: 20110066426

Abstract: A speech recognition apparatus and method for real-time speaker adaptation are provided. The speech recognition apparatus may estimate a pitch of a speech section from an inputted speech signal, extract a speech feature for speech recognition based on the estimated pitch, and perform speech recognition with respect to the speech signal based on the speech feature. The speech recognition apparatus may be adaptively normalized depending on a speaker. Thus, the speech recognition apparatus may extract a speech feature for speech recognition, and may improve a performance of speech recognition based on the extracted speech feature.

Type: Application

Filed: July 15, 2010

Publication date: March 17, 2011

Applicant: SAMSUNG ELECTRONICS CO., LTD.

Inventor: Gil Ho LEE
