Time Patents (Class 704/211)
-
Publication number: 20080077400
Abstract: A speech-duration detector includes a starting-end detecting unit that detects a starting end of a first duration where the characteristic exceeds a threshold value as a starting end of a speech-duration, when the first duration continues for a first time length; a trailing-end-candidate detecting unit that detects a starting end of a second duration where the characteristic is lower than the threshold value as a candidate point for a trailing end of speech, when the second duration continues for a second time length; and a trailing-end-candidate determining unit that determines the candidate point as a trailing end of the speech-duration, when the second duration where the characteristic exceeds the threshold value does not continue for the first time length while a third time length elapses from measurement at the candidate point.
Type: Application
Filed: March 20, 2007
Publication date: March 27, 2008
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventors: Koichi Yamamoto, Akinori Kawamura
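The three time lengths in this abstract can be illustrated with a small frame-based sketch. It is only one reading of the abstract, not the applicant's implementation: the function names, the use of frame counts for the time lengths (`t1`, `t2`, `t3`), and the rule that a resuming run must begin inside the `t3` window are all assumptions made for illustration.

```python
def first_long_run(flags, lo, hi, value, min_len):
    """Index of the first run of `value` that starts in [lo, hi) and
    lasts at least `min_len` frames, or None if there is none."""
    i = lo
    while i < hi:
        if flags[i] != value:
            i += 1
            continue
        j = i
        while j < len(flags) and flags[j] == value:
            j += 1
        if j - i >= min_len:
            return i
        i = j
    return None


def detect_speech_duration(values, threshold, t1, t2, t3):
    """Return (start, end) frame indices of the first speech duration.

    t1: frames the characteristic must stay above threshold to count as speech
    t2: frames it must stay below threshold to create a trailing-end candidate
    t3: frames to wait after a candidate for speech to resume
    (All three are hypothetical frame counts standing in for the time lengths.)
    """
    above = [v > threshold for v in values]
    n = len(above)
    start = first_long_run(above, 0, n, True, t1)
    if start is None:
        return None
    pos = start
    while True:
        candidate = first_long_run(above, pos, n, False, t2)
        if candidate is None:
            return (start, n)  # speech runs to the end of the input
        resumed = first_long_run(above, candidate, min(n, candidate + t3), True, t1)
        if resumed is None:
            return (start, candidate)  # candidate confirmed as trailing end
        pos = resumed  # speech resumed, so keep looking for a later candidate
```

In this sketch a short dip below the threshold (shorter than `t2`) never creates a candidate, and a longer pause is discarded if a run of at least `t1` frames begins again within `t3` frames of the candidate.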
-
Patent number: 7346502
Abstract: There is provided a method of updating a noise state of a voice activity detector (VAD) for indicating an active voice mode and an inactive voice mode. The method comprises receiving an input signal having a plurality of frames, determining an elapsed time since the last update of the noise state, updating the noise state of the VAD if the elapsed time exceeds a predetermined time, determining an average minimum energy based on two or more of the plurality of frames, determining a current minimum energy based on a current frame of the plurality of frames, updating the noise state of the VAD if the average minimum energy is less than the current minimum energy, and updating the noise state of the VAD if the average minimum energy is greater than the current minimum energy plus a first predetermined value.
Type: Grant
Filed: January 26, 2006
Date of Patent: March 18, 2008
Assignee: Mindspeed Technologies, Inc.
Inventors: Yang Gao, Eyal Shlomot, Adil Benyassine
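The three update conditions listed in this abstract can be written as a single predicate. This is a hedged sketch: the function and parameter names are hypothetical, and how the average and current minimum energies are computed is not stated in the abstract, so they are taken as inputs.

```python
def should_update_noise_state(elapsed, max_elapsed,
                              avg_min_energy, cur_min_energy, margin):
    """Return True if any of the abstract's three update conditions holds.

    elapsed / max_elapsed: time since the last update and the predetermined limit
    avg_min_energy: average minimum energy over two or more frames
    cur_min_energy: minimum energy of the current frame
    margin: the "first predetermined value" (name assumed for illustration)
    """
    if elapsed > max_elapsed:
        return True  # too long since the last update
    if avg_min_energy < cur_min_energy:
        return True  # average minimum is below the current minimum
    if avg_min_energy > cur_min_energy + margin:
        return True  # average minimum exceeds the current minimum by more than the margin
    return False
```

Under these conditions the noise state is left alone only when the update is recent and the average minimum energy lies between the current minimum and the current minimum plus the margin.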
-
Publication number: 20080059157
Abstract: Method and computing apparatus for processing speech signal data. A speech signal is divided into frames. Each frame is characterized by a frame number T representing a unique interval of time. Each speech signal is characterized by a power spectrum with respect to frame T and frequency band ?. A speech segment and a reverberation segment of the speech signal are determined. L filter coefficients W(k) (k=1, 2, . . . , L) respectively corresponding to L frames immediately preceding frame T are computed such that the L filter coefficients minimize a function ? that is a linear combination of a sum of squares of a residual speech power in the reverberation segment and a sum of squares of a subtracted speech power in the speech segment. The computed L filter coefficients are stored within storage media of the computing apparatus.
Type: Application
Filed: August 7, 2007
Publication date: March 6, 2008
Inventors: Takashi Fukuda, Osamu Ichikawa, Masafumi Nishimura
-
Publication number: 20080046233
Abstract: A technique for concealing the effect of a lost frame in a series of frames representing an encoded audio signal in a sub-band predictive coding system is provided. In accordance with the technique, one or more received frames in the series of frames are decoded to generate a full-band output audio signal, wherein the full-band output audio signal comprises a combination of at least a first sub-band decoded audio signal and a second sub-band decoded audio signal. The full-band output audio signal corresponding to the one or more received frames is stored. Then, a full-band output audio signal corresponding to the lost frame is synthesized, wherein synthesizing the full-band output audio signal corresponding to the lost frame comprises performing waveform extrapolation based on the stored full-band output audio signal corresponding to the one or more received frames.
Type: Application
Filed: August 15, 2007
Publication date: February 21, 2008
Applicant: BROADCOM CORPORATION
Inventors: Juin-Hwey Chen, Jes Thyssen, Robert W. Zopf
-
Patent number: 7333619
Abstract: A method and apparatus for de-noising weak bio-signals having a relatively low signal to noise ratio utilizes an iterative process of wavelet de-noising a data set comprised of a new set of frames of wavelet coefficients partially generated through a cyclic shift algorithm. The method preferably operates on a data set having 2N frames, and the iteration is performed N-1 times. The resultant wavelet coefficients are then linearly averaged and an inverse discrete wavelet transform is performed to arrive at the de-noised original signal. The method is preferably carried out in a digital processor.
Type: Grant
Filed: May 30, 2006
Date of Patent: February 19, 2008
Assignees: Everest Biomedical Instruments Company, Washington University
Inventors: Elvir Causevic, Eldar Causevic, Mladen Victor Wickerhauser
-
Patent number: 7328149
Abstract: A portion of an audio signal is separated into multiple frames from which one or more different features are extracted. These different features are used, in combination with a set of rules, to classify the portion of the audio signal into one of multiple different classifications (for example, speech, non-speech, music, environment sound, silence, etc.). In one embodiment, these different features include one or more of line spectrum pairs (LSPs), a noise frame ratio, periodicity of particular bands, spectrum flux features, and energy distribution in one or more of the bands. The line spectrum pairs are also optionally used to segment the audio signal, identifying audio classification changes as well as speaker changes when the audio signal is speech.
Type: Grant
Filed: November 29, 2004
Date of Patent: February 5, 2008
Assignee: Microsoft Corporation
Inventors: Hao Jiang, Hongjiang Zhang
-
Publication number: 20080027718
Abstract: The range of disclosed configurations includes methods in which subbands of a speech signal are separately encoded, with the excitation of a first subband being derived from a second subband. Gain factors are calculated to indicate a time-varying relation between envelopes of the original first subband and of the synthesized first subband. The gain factors are quantized, and quantized values that exceed the pre-quantized values are re-coded.
Type: Application
Filed: December 13, 2006
Publication date: January 31, 2008
Inventors: Venkatesh Krishnan, Ananthapadmanabhan A. Kandhadai
-
Patent number: 7324944
Abstract: Systems and methods for dynamically analyzing temporality in an individual's speech in order to selectively categorize the speech fluency of the individual and/or to selectively provide speech training based on the results of the dynamic analysis. Temporal variables in one or more speech samples are dynamically quantified. The temporal variables in combination with a dynamic process, which is derived from analyses of temporality in the speech of native speakers and language learners, are used to provide a fluency score that identifies a proficiency of the individual. In some implementations, temporal variables are measured instantaneously.
Type: Grant
Filed: December 11, 2003
Date of Patent: January 29, 2008
Assignee: Brigham Young University, Technology Transfer Office
Inventors: Lynne Hansen, Joshua Rowe
-
Patent number: 7321851
Abstract: The present invention relates to the decoding-/playback part of received sound data packets in systems for transmission of sound over packet switched networks. According to the invention, the lengths of received signal frames are manipulated by performing time expansion or time compression of one or more signal frames at time varying intervals and with time varying lengths of the expansion or the compression, said intervals and said lengths being determined so as to maintain a continuous flow of signal samples to be played back.
Type: Grant
Filed: February 4, 2000
Date of Patent: January 22, 2008
Assignees: Global IP Solutions (GIPS) AB, Global IP Solutions, Inc.
Inventors: Soren V. Andrsen, Willem B. Kleijn, Patrik Sörqvist
-
Publication number: 20080004869
Abstract: An audio encoder, an audio decoder or an audio processor includes a filter for generating a filtered audio signal, the filter having a variable warping characteristic, the characteristic being controllable in response to a time-varying control signal, the control signal indicating a small or no warping characteristic or a comparatively high warping characteristic. Furthermore, a controller is connected for providing the time-varying control signal, which depends on the audio signal. The filtered audio signal can be introduced to an encoding processor having different encoding algorithms, one of which is a coding algorithm adapted to a specific signal pattern. Alternatively, the filter is a post-filter receiving a decoded audio signal.
Type: Application
Filed: June 30, 2006
Publication date: January 3, 2008
Inventors: Juergen Herre, Bernhard Grill, Markus Multrus, Stefan Bayer, Ulrich Kraemer, Jens Hirschfeld, Stefan Wabnik, Gerald Schuller
-
Patent number: 7305338
Abstract: Circuitry and a method compensate the erasure of speech signal data or similar periodic signal data, by substitution using past periodic signal data input. After a predetermined number of latest periodic signal data have been saved, whether or not an erasure occurs is determined with every periodic signal data sequence, which is a unit of processing. When an erasure occurs, one of periodic signal data sequences saved, which lies in a determined segment to be used, is used to generate synthetic data for substitution. The position of the segment to be used is determined such that when the erasure continues over units of processing, the position sequentially varies gradually for each processing unit.
Type: Grant
Filed: May 14, 2004
Date of Patent: December 4, 2007
Assignee: Oki Electric Industry Co., Ltd.
Inventors: Atsushi Tashiro, Hiromi Aoyagi, Masashi Takada
-
Patent number: 7305337
Abstract: The present invention includes a method for speech encoding and decoding and a design of speech coder and decoder. The characteristic of speech encoding method relies on the type of data with high compression rate after the whole speech data is compressed. The present invention is able to lower the bit rate of the original speech from 64 Kbps to 1.6 Kbps and provide a bit rate lower than the traditional compression method. It can provide good speech quality, and attain the function of storing the maximum speech data with minimum memory. As to the speech decoding method, some random noises are appropriately added into the exciting source, so that more speech characteristics can be simulated to produce various speech sounds. In addition, the present invention also discloses a coder and a decoder designed by application specific integrated circuit, and the structural design is optimized according to the software.
Type: Grant
Filed: December 24, 2002
Date of Patent: December 4, 2007
Assignee: National Cheng Kung University
Inventors: Jhing-Fa Wang, Jia-Ching Wang, Yun-Fei Chao, Han-Chiang Chen, Ming-Chi Shih
-
Patent number: 7302064
Abstract: A method and apparatus for de-noising weak bio-signals having a relatively low signal to noise ratio utilizes an iterative process of de-noising a data set comprised of a new set of frames. The method separately performs a non-linear de-noising operation on each of the component frames and combines the resultant de-noised frames to form a combined resultant de-noised input signal. The method is preferably carried out in a digital processor.
Type: Grant
Filed: January 24, 2006
Date of Patent: November 27, 2007
Assignee: Brainscope Company, Inc.
Inventors: Elvir Causevic, Eldar Causevic
-
Patent number: 7299184
Abstract: An embodiment of the present invention is a method for generating a listener-interest-filtered work for an audio or audio-visual work, which method includes steps of: (a) generating one or more average speed contours for one or more audio or audio-visual works for one or more categories of users; (b) converting the one or more average speed contours to one or more conceptual speed association data structures; and (c) forming a listener-interest-filtered conceptual speed association data structure from the one or more conceptual speed association data structures.
Type: Grant
Filed: September 7, 2004
Date of Patent: November 20, 2007
Assignee: Enounce Incorporated
Inventor: Donald J. Hejna, Jr.
-
Publication number: 20070265841
Abstract: An information processing apparatus, comprises: a lower time series data generation unit having a plurality of recurrent neural networks which learn predetermined time series data, and generate prediction time series data according to the learning result; an upper time series data generation unit having recurrent neural networks which learn error time series data that is time series data of errors raised at the time of the learning by the respective plural recurrent neural networks of the lower time series data generation unit, and generate prediction error time series data that is time series data of prediction errors according to the learning result; and a conversion unit that performs nonlinear conversion for the prediction errors generated by the upper time series data generation unit, wherein the lower time series data generation unit outputs the prediction time series data generated by the respective plural recurrent neural networks according to the prediction errors which have undergone the nonlinear c
Type: Application
Filed: May 14, 2007
Publication date: November 15, 2007
Inventors: Jun Tani, Ryunosuke Nishimoto, Masato Ito
-
Patent number: 7277847
Abstract: A method for determining intensity characteristics of background noise during speech pauses of speech signals includes determining a proportion of speech pauses in the undisturbed source speech signal so as to define a frequency threshold. The disturbed speech signal is divided into short successive signal elements, an intensity value is determined for each of the signal elements, and a cumulative relative frequency distribution is formed from the determined intensity values of the signal elements. The cumulative relative frequency distribution is used to determine an intensity threshold value which corresponds to the defined frequency threshold. At least one intensity characteristic of the background noise during the speech pauses is determined using a region of the cumulative relative frequency distribution below the intensity threshold value.
Type: Grant
Filed: April 3, 2002
Date of Patent: October 2, 2007
Assignee: Deutsche Telekom AG
Inventor: Jens Berger
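The steps in this abstract can be sketched roughly as follows. The sketch makes several assumptions that the abstract does not state: intensity is taken as the mean square of each signal element, the pause proportion is supplied directly rather than measured from the source signal, and the "intensity characteristic" is taken to be the mean intensity of the elements at or below the threshold. All names are hypothetical.

```python
def noise_intensity_in_pauses(signal, element_len, pause_proportion):
    """Estimate background-noise intensity during speech pauses.

    signal: samples of the disturbed speech signal
    element_len: samples per short signal element
    pause_proportion: proportion of pauses in the undisturbed source
                      signal, used here as the frequency threshold
    Returns (intensity_threshold, mean_intensity_below_threshold).
    """
    # Divide the signal into successive elements and compute one intensity each.
    intensities = sorted(
        sum(x * x for x in signal[i:i + element_len]) / element_len
        for i in range(0, len(signal) - element_len + 1, element_len)
    )
    n = len(intensities)
    if n == 0:
        raise ValueError("signal is shorter than one element")
    # In the sorted list, the k-th value has cumulative relative frequency
    # (k + 1) / n, so the frequency threshold selects the lowest `count` values.
    count = max(1, int(pause_proportion * n))
    threshold = intensities[count - 1]
    below = intensities[:count]
    return threshold, sum(below) / count
```

Sorting the intensities stands in for building the cumulative relative frequency distribution explicitly; reading the distribution at the frequency threshold is then a simple index lookup.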
-
Patent number: 7254532
Abstract: The invention relates to a method for determining voice activity in a signal section of an audio signal. The result, i.e., whether voice activity is present in the section of the signal thus observed, depends upon spectral and temporal stationarity of the signal section and/or prior signal sections. In a first step, the method determines whether there is spectral stationarity in the observed signal section. In a second step, the method determines whether there is temporal stationarity in the signal section in question. The final decision as to the presence of voice activity in the signal section observed depends upon the initial values of both steps.
Type: Grant
Filed: March 16, 2001
Date of Patent: August 7, 2007
Assignee: Deutsche Telekom AG
Inventors: Alexander Kyrill Fischer, Christoph Erdmann
-
Publication number: 20070179781
Abstract: A filter apparatus for filtering a time domain input signal to obtain a time domain output signal, which is a representation of the time domain input signal filtered using a filter characteristic having a non-uniform amplitude/frequency characteristic, comprises a complex analysis filter bank for generating a plurality of complex subband signals from the time domain input signals, a plurality of intermediate filters, wherein at least one of the intermediate filters of the plurality of the intermediate filters has a non-uniform amplitude/frequency characteristic, wherein the plurality of intermediate filters have a shorter impulse response compared to an impulse response of a filter having the filter characteristic, and wherein the non-uniform amplitude/frequency characteristics of the plurality of intermediate filters together represent the non-uniform filter characteristic, and a complex synthesis filter bank for synthesizing the output of the intermediate filters to obtain the time domain output signal.
Type: Application
Filed: September 1, 2006
Publication date: August 2, 2007
Inventor: Lars Villemoes
-
Patent number: 7251596
Abstract: The present invention provides a unique wave-trigon transformation (WTT) method for performing transformation process over a wave signal. The present invention also provides a pitch detecting method and apparatus for detecting pitch based on the WTT process as well as a sentence detecting method and apparatus for detecting a sentence in a sound signal based on the WTT process. The pitch detecting method and apparatus can effectively detect pitch in a sound signal. In the WTT process, an inputted wave signal (such as a sound signal) is transformed into a series of trigons, and an energy-width spectrum is formed using these trigons. For a sound signal containing voice, the distribution of trigons transformed from the sound signal has a certain pattern. By analyzing the pattern, whether a pitch is contained in the sound signal can be determined. In particular, existence of a pitch can be determined by determining and evaluating the periodicity of trigons in a candidate chained peak in the energy-width spectrum.
Type: Grant
Filed: December 23, 2002
Date of Patent: July 31, 2007
Assignee: Canon Kabushiki Kaisha
Inventors: Lianshan Zhu, Tao Yu
-
Publication number: 20070174051
Abstract: An adaptive time/frequency-based encoding mode determination apparatus including a time domain feature extraction unit to generate a time domain feature by analysis of a time domain signal of an input audio signal, a frequency domain feature extraction unit to generate a frequency domain feature corresponding to each frequency band generated by division of a frequency domain corresponding to a frame of the input audio signal into a plurality of frequency domains, by analysis of a frequency domain signal of the input audio signal, and a mode determination unit to determine any one of a time-based encoding mode and a frequency-based encoding mode, with respect to each frequency band, by use of the time domain feature and the frequency domain feature.
Type: Application
Filed: September 21, 2006
Publication date: July 26, 2007
Applicant: SAMSUNG Electronics Co., Ltd.
Inventors: Eun Mi Oh, Ki Hyun Choo, Jung-Hoe Kim, Chang Yong Son
-
Patent number: 7239999
Abstract: A method of pitch corrected speed control (PCSC) playback in which a decoder rate controller receives a desired playback speed from a PCSC controller and determines the number of decoded digital audio samples stored in a buffer. The rate controller then determines the required number of execution times of a parametric speech decoder based on the desired playback speed and the number of decoded samples stored in the buffer. The parametric speech decoder is then executed the determined number of times.
Type: Grant
Filed: July 23, 2002
Date of Patent: July 3, 2007
Assignee: Intel Corporation
Inventor: Changwon D. Rhee
-
Patent number: 7231344
Abstract: The shape of windows used during linear predictive analysis can be optimized through the use of gradient-descent based window optimization procedures. Window optimization may be achieved fairly precisely through the use of a primary optimization procedure, or less precisely through the use of an alternate optimization procedure. Both optimization procedures use the principle of gradient-descent to find a window sequence that will either minimize the prediction error energy or maximize the segmental prediction gain. However, the primary optimization procedure uses a Levinson-Durbin based algorithm to determine the gradient while the alternate optimization procedure uses an estimate of the gradient based on the basic definition of a derivative. These optimization procedures can be implemented as computer readable software code. Additionally, the optimization procedures may be implemented in a window optimization device which generally includes a window optimization unit and may also include an interface unit.
Type: Grant
Filed: October 29, 2002
Date of Patent: June 12, 2007
Assignee: NTT DoCoMo, Inc.
Inventor: Wai C. Chu
-
Patent number: 7219061
Abstract: Predetermined macrosegments of the fundamental frequency are determined by a neural network, and these predefined macrosegments are reproduced by fundamental-frequency sequences stored in a database. The fundamental frequency is generated on the basis of a relatively large text section which is analyzed by the neural network. Microstructures from the database are received in the fundamental frequency. The fundamental frequency thus formed is thus optimized both with regard to its macrostructure and to its microstructure. As a result, an extremely natural sound is achieved.
Type: Grant
Filed: October 24, 2000
Date of Patent: May 15, 2007
Assignee: Siemens Aktiengesellschaft
Inventors: Caglayan Erdem, Martin Holzapfel
-
Patent number: 7197464
Abstract: A system, method and computer-readable medium are disclosed for operating a communications network. The method aspect comprises receiving an audio signal and to remove a first portion of a frame of the audio signal, and generating an overlap-added segment from (1) a first segment of the frame, the first segment being located before the first portion; and (2) a second segment of the frame, the second segment comprising an endmost portion of a terminal section of the frame. The method preferably operates in a discontinuous transmission packet telephony network having a channel access delay.
Type: Grant
Filed: July 27, 2005
Date of Patent: March 27, 2007
Assignee: AT&T Corp.
Inventors: Richard Vandervoort Cox, David A. Kapilow
-
Patent number: 7143047
Abstract: A data-compressed audio waveform is temporally modified without requiring complete decompression of the audio signal. Packets of compressed audio data are first unpacked, to remove scaling that was applied in the formation of the packets. The unpacked data is then temporally modified, using one of a number of different approaches. This modification takes place while the audio information remains in a data-compressed format. New packets are then assembled from the modified data, to produce a data-compressed output stream that can be subsequently processed in a conventional manner to reproduce the desired sound. The assembly of the new packets employs a technique for inferring an auditory model from the original packets, to requantize the data in the output packets.
Type: Grant
Filed: September 17, 2004
Date of Patent: November 28, 2006
Assignee: Vulcan Patents LLC
Inventors: Michele M. Covell, Malcolm Slaney, Arthur Rothstein
-
Patent number: 7143029
Abstract: An apparatus for changing the playback rate of recorded speech includes memory storing a plurality of recorded speech messages and a plurality of feature tables. Each feature table is associated with an individual one of the speech messages and includes speech frame parameters based on the jitter states of speech frames of the associated recorded speech message. A playback module receives input specifying a recorded speech message in the memory to be played and the rate at which the recorded speech message is to be played back. In response to the input, the playback module uses a set of decision rules to modify the specified speech message based on the speech frame parameters in the feature table associated with the specified speech message and the specified playback rate, prior to playing back the specified speech message.
Type: Grant
Filed: September 9, 2004
Date of Patent: November 28, 2006
Assignee: Mitel Networks Corporation
Inventor: Moustafa Elshafei
-
Patent number: 7139699
Abstract: Method and apparatus to measure jitter (period-to-period fluctuations in fundamental frequency) among the voices of suicidal, major depressed, and non-suicidal patients to predict near-term suicidal risk.
Type: Grant
Filed: October 5, 2001
Date of Patent: November 21, 2006
Inventors: Stephen E. Silverman, Asli Ozdas, Marilyn K. Silverman
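The abstract defines jitter as period-to-period fluctuation in fundamental frequency but gives no formula. As a hedged illustration only, the sketch below uses one common relative-jitter measure (mean absolute difference between consecutive pitch periods, divided by the mean period); this choice and the function name are assumptions, not the patented method.

```python
def relative_jitter(periods):
    """Mean absolute difference between consecutive pitch periods,
    normalised by the mean period (a common jitter measure, assumed here).

    periods: sequence of successive pitch-period lengths (e.g. in ms).
    """
    if len(periods) < 2:
        raise ValueError("need at least two pitch periods")
    diffs = [abs(b - a) for a, b in zip(periods, periods[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_period = sum(periods) / len(periods)
    return mean_diff / mean_period
```

A perfectly periodic voice gives a value of zero under this measure, and larger values indicate stronger cycle-to-cycle variation.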
-
Patent number: 7120577
Abstract: A system and terminal for facilitating a “virtual presence” allows users on a communication network to simply begin speaking through other users. A system immediately detects the destination party's name, and begins routing the audio signal to a particular destination without any noticeable call set-up. Additionally, the system performs pitch corrected speed control in order to allow the detection and processing of a speech pattern without causing delay to an end user.
Type: Grant
Filed: January 9, 2003
Date of Patent: October 10, 2006
Assignee: Intel Corporation
Inventor: Howard Bubb
-
Patent number: 7085724
Abstract: The invention relates to a linking unit 100, a parametric encoder 400 and a method for generating linking information L indicating components of consecutive extended segments sp and sc which may be linked together in order to form a sinusoidal track. The segments sp and sc approximate consecutive segments of a sinusoidal audio or speech signal s. The linking unit comprises a calculating unit 120 for generating a similarity matrix S(m,n) in response to received sinusoidal code data and an evaluating unit 140 for receiving and evaluating said similarity matrix S in order to generate said linking information by selecting those pairs of components m,n the similarity of which is maximal. According to the invention the calculating unit 120 is adapted to calculate the similarity matrix S by additionally considering information about the phase consistency between the components of the extended previous segment sp and the extended current segment sc.
Type: Grant
Filed: January 14, 2002
Date of Patent: August 1, 2006
Assignee: Koninklijke Philips Electronics N.V.
Inventors: Albertus Cornelis Den Brinker, Arnoldus Werner Johannes Oomen, Fransiscus Marinus Jozephus De Bont, Erik Gosuinus Petrus Schuijers
-
Patent number: 7072829
Abstract: With respect to each of codes corresponding to code vectors in a code book stored in a code book storage section, an expectation degree storage section stores an expectation degree at which observation is expected when an integrated parameter with respect to a word as a recognition target is inputted. A vector quantization section vector-quantizes the integrated parameter and outputs a series of codes of a code vector which has a shortest distance to the integrated parameter.
Type: Grant
Filed: June 10, 2002
Date of Patent: July 4, 2006
Assignee: Sony Corporation
Inventors: Tetsujiro Kondo, Norifumi Yoshiwara
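The vector quantization step in this abstract (choosing the code of the code vector closest to each input parameter) can be illustrated with a short sketch. Euclidean distance and the function name are assumptions; the abstract says only "shortest distance" and does not describe the expectation-degree lookup in enough detail to sketch it.

```python
def vector_quantize(vectors, codebook):
    """Map each input vector to the index (code) of its nearest code vector.

    vectors: sequence of equal-length numeric tuples (the input parameters)
    codebook: sequence of code vectors of the same length
    Distance is squared Euclidean, an assumption made for this sketch.
    """
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        min(range(len(codebook)), key=lambda k: sqdist(v, codebook[k]))
        for v in vectors
    ]
```

The returned list is the series of codes; a recogniser along the lines of the abstract would then look up a stored expectation degree for each code.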
-
Patent number: 7069208
Abstract: A system and method for the concealment of errors resulting from missing or corrupted data in the transmission of audio signals in compressed digital packet formats is disclosed. The system utilizes a circular FIFO buffer to store audio frames from the transmitted audio signal, and a beat detector, to identify the presence of beats in the audio signal. The error concealment method replaces erroneous audio frames with error-free audio frames by a process which takes into account the presence and location of the detected beats.
Type: Grant
Filed: January 24, 2001
Date of Patent: June 27, 2006
Assignee: Nokia, Corp.
Inventor: Ye Wang
-
Patent number: 7069210
Abstract: Method of and system for coding a sound signal (10) as multiple independent streams of frames (14, 15) by creating frames (1,2,3,4,5,6) using sinusoidal coding and then placing frame i into stream i modulo the number of streams, method of and system for reconstructing a sound signal (23) by decoding frames from multiple streams (21, 22) in an interleaved fashion and reconstructing missing frames by using information from surrounding frames, system for recording and playing back sound signals implementing the above two methods, where under normal circumstances both streams (31, 32) of a coded signal are stored, and when capacity on the storage medium (35) is low, only one of the two streams of a coded signal is stored while one of the two streams of existing coded signals is overwritten and allowing a decoder (37) to reconstruct a sound signal by using either both or the one available stream for that sound signal.
Type: Grant
Filed: November 29, 2000
Date of Patent: June 27, 2006
Assignee: Koninklijke Philips Electronics N.V.
Inventor: Rakesh Taori
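The "frame i into stream i modulo the number of streams" rule can be sketched as below. The sketch uses zero-based frame indices and marks frames from an unavailable stream as `None`; how the patent reconstructs missing frames from surrounding frames is not specified in the abstract and is left out. Function names are hypothetical.

```python
def split_into_streams(frames, num_streams):
    """Place frame i (zero-based) into stream i % num_streams."""
    return [frames[k::num_streams] for k in range(num_streams)]


def merge_streams(streams, total_frames):
    """Interleave streams back into frame order.

    streams: list of per-stream frame lists; an unavailable stream is None.
    Frames belonging to an unavailable stream come back as None so a
    decoder could fill them in from the surrounding frames.
    """
    n = len(streams)
    merged = []
    for i in range(total_frames):
        stream = streams[i % n]
        merged.append(None if stream is None else stream[i // n])
    return merged
```

With two streams, losing one still leaves every other frame intact, which is what would let a decoder rebuild the signal from a single stream.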
-
Patent number: 7054454
Abstract: A method and apparatus for de-noising weak bio-signals having a relatively low signal to noise ratio utilizes an iterative process of wavelet de-noising a data set comprised of a new set of frames of wavelet coefficients partially generated through a cyclic shift algorithm. The method preferably operates on a data set having 2N frames, and the iteration is performed N-1 times. The resultant wavelet coefficients are then linearly averaged and an inverse discrete wavelet transform is performed to arrive at the de-noised original signal. The method is preferably carried out in a digital processor.
Type: Grant
Filed: March 29, 2002
Date of Patent: May 30, 2006
Assignee: Everest Biomedical Instruments Company
Inventors: Elvir Causevic, Eldar Causevic, Mladen Victor Wickerhauser
-
Patent number: 7054453
Abstract: A method and apparatus for de-noising weak bio-signals having a relatively low signal to noise ratio utilizes an iterative process of de-noising a data set comprised of a new set of frames. The method separately performs a non-linear de-noising operation on each of the component frames and combines the resultant de-noised frames to form a combined resultant de-noised input signal. The method is preferably carried out in a digital processor.
Type: Grant
Filed: March 29, 2002
Date of Patent: May 30, 2006
Assignee: Everest Biomedical Instruments Co.
Inventors: Elvir Causevic, Eldar Causevic
-
Patent number: 7043433
Abstract: Embodiments of the present invention provide method and apparatus for determining audience affinity and/or aptitude in portions of media works and for developing information that represent measures of the audience affinity and/or aptitude. Further embodiments of present invention provide method and apparatus for utilizing the information to create altered media works and/or to present the altered media works to an audience. One embodiment of the present invention is a method for inferring audience affinity or aptitude with regard to content or properties of portions of a media work which includes: (a) presenting the media work to an audience; (b) obtaining user input regarding presentation rates for the portions of the media work; (c) correlating content or properties of the portion with the presentation rates; and (d) associating audience affinity or aptitude with the correlated content or properties.
Type: Grant
Filed: September 16, 1999
Date of Patent: May 9, 2006
Assignee: Enounce, Inc.
Inventor: Donald J. Hejna, Jr.
-
Patent number: 7043425
Abstract: In order to improve recognition performance, a no-speech sound model correction section performs an adaptation of a no-speech sound model which is a sound model representing a no-speech state on the basis of input data observed in an interval immediately before a speech recognition interval for the object of speech recognition and the degree of freshness representing the recentness of the input data.
Type: Grant
Filed: March 24, 2005
Date of Patent: May 9, 2006
Assignee: Sony Corporation
Inventor: Hongchang Pao
-
Patent number: 7016850
Abstract: Speech at the beginning of a talkspurt in a discontinuous transmission (DTX) packet telephony system is speeded up to help make up for an access delay incurred during channel allocation. Incoming speech frames are buffered, a pitch period for a current portion of the signal is estimated, and then a pitch period's worth of the signal is cut from that portion. This is continued until the original access delay, as estimated from the time lag between the commencement of voice input for the talkspurt, and notification that a channel is available, is eliminated. The remainder of the talkspurt is then transmitted without such compression.
Type: Grant
Filed: January 25, 2001
Date of Patent: March 21, 2006
Assignee: AT&T Corp.
Inventors: Richard Vandervoort Cox, David A Kapilow
-
Patent number: 7010491Abstract: With the goal of presenting a waveform compression and expansion apparatus with which the sound quality of such things as musical tones that are expressed by waveforms is satisfactory following the compression and expansion of the waveforms of the musical tones etc., a method and system for waveform compression and expansion is disclosed in which all of the multiple number of band divided waveforms that comprise the original waveform which has been band divided are apportioned to at least two kinds of compression and expansion formats and form a multiple number of compressed and expanded waveforms by compression or expansion an identical amount only in the direction of the temporal axis.Type: GrantFiled: December 9, 1999Date of Patent: March 7, 2006Assignee: Roland CorporationInventor: Tadao Kikumoto
-
Patent number: 7010481Abstract: In a method for performing a segmentation operation upon a synthesizing speech signal and an input speech signal, a synthesized speech signal and a speech element duration signal are generated from the synthesizing speech signal. A first feature parameter is extracted from the synthesized speech signal, and a second feature parameter is extracted from the input speech signal. A dynamic programming matching operation is performed upon the second feature parameter with reference to the first feature parameter and the speech element duration signal to obtain segmentation points of the input speech signal.Type: GrantFiled: March 27, 2002Date of Patent: March 7, 2006Assignee: NEC CorporationInventor: Takuya Takizawa
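As a rough illustration of the dynamic programming matching named in this abstract, the sketch below uses plain dynamic time warping over one-dimensional features and then carries the reference segment boundaries over to the input. The function names, the scalar features, the absolute-difference cost, and the frame-count durations are assumptions for the example; the patent's actual features and matching constraints are not given in the abstract.

```python
def dtw_path(ref, inp):
    """Align two feature sequences with classic dynamic time warping and
    return the alignment path as (ref_index, inp_index) pairs."""
    n, m = len(ref), len(inp)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(ref[i - 1] - inp[j - 1])  # assumed local distance
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the end to the start along the cheapest predecessors.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
        path.append((i - 1, j - 1))
    path.reverse()
    return path

def segment_input(ref, durations, inp):
    """Map reference segment boundaries (from per-element durations, in
    frames) to the first input frame aligned with each boundary."""
    path = dtw_path(ref, inp)
    boundaries, start = [], 0
    for d in durations[:-1]:
        start += d
        boundaries.append(min(j for i, j in path if i == start))
    return boundaries

# Hypothetical data: the synthesized reference has two elements lasting
# 3 and 2 frames; the input speaks the same elements more slowly.
reference = [0.0, 0.0, 0.0, 5.0, 5.0]
spoken = [0.0] * 6 + [5.0] * 4
points = segment_input(reference, [3, 2], spoken)
```

Here `points` holds the input frame index at which the second element begins.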
-
Patent number: 6999922Abstract: The present invention (110) permits a user to speed up and slow down speech without changing the speaker's pitch (102, 110, 112, 128, 402–416). It is a user adjustable feature to change the spoken rate to the listener's preferred listening rate or comfort. It can be included on the phone as a customer convenience feature without changing any characteristics of the speaker's voice besides the speaking rate with soft key button (202) combinations (in interconnect or normal). From the user's perspective, it would seem only that the talker changed his speaking rate, and not that the speech was digitally altered in any way. The pitch and general prosody of the speaker are preserved. The following uses of the time expansion/compression feature are listed to complement already existing technologies or applications in progress including messaging services, messaging applications and games, real-time feature to slow down the listening rate.Type: GrantFiled: June 27, 2003Date of Patent: February 14, 2006Assignee: Motorola, Inc.Inventors: Marc Andre Boillot, John Gregory Harris, Thomas Lawrence Reinke
-
Patent number: 6982377Abstract: A time scale modification method employs separate bands obtained through an analysis polyphase filter bank with separate time-scale modification processing for the bands. The outputs are combined using a synthesis filter bank. Some constraints are imposed on the time-scale modification processing, such as a limitation of the range of overlap adjustment values for bands other than the greatest energy band, to eliminate noise due to aliasing and inter-channel phase mismatch. This invention produces output quality considerably higher than conventional time-domain time-scale modification methods for general music signals with computational requirements comparable to those of conventional time-domain time-scale modification methods.Type: GrantFiled: December 18, 2003Date of Patent: January 3, 2006Assignee: Texas Instruments IncorporatedInventors: Atsuhiro Sakurai, Steven Trautmann, Daniel L. Zelazo
-
Patent number: 6983241Abstract: To address the need for choosing values of harmonic noise weighting (HNW) coefficient (εp) so that the amount of harmonic noise weighting can be optimized, a method and apparatus for performing harmonic noise weighting in digital speech coders is provided herein. During operation, received speech is analyzed to determine a pitch period. HNW coefficients are then chosen based on the pitch period, and a perceptual noise weighting filter (C(z)) is determined based on the harmonic-noise weighting (HNW) coefficients (εp).Type: GrantFiled: October 14, 2004Date of Patent: January 3, 2006Assignee: Motorola, Inc.Inventors: Udar Mittal, James P. Ashley
-
Patent number: 6952673Abstract: A system and method for automatically adjusting the rate at which recorded speech is played back as a typist manually transcribes the speech. The typing speed is measured and a speech playback rate determined based on the measured speed. The playback rate of the audio is then automatically increased or decreased as appropriate to match the typing speed.Type: GrantFiled: February 20, 2001Date of Patent: October 4, 2005Assignee: International Business Machines CorporationInventors: Arnon Amir, Michael Rodeh
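One simple way to picture the rate matching described in this abstract is a proportional mapping from typing speed to playback rate. The abstract does not say how the rate is derived, so the function name `playback_rate`, the words-per-minute units, the proportional rule, and the clamping limits below are all assumptions made for illustration.

```python
def playback_rate(typing_wpm, speech_wpm, min_rate=0.5, max_rate=2.0):
    """Return a playback-rate multiplier that would make the recorded
    speech arrive at roughly the typist's measured speed.

    typing_wpm: measured typing speed in words per minute.
    speech_wpm: speaking rate of the recording at normal (1.0x) playback.
    min_rate, max_rate: assumed limits that keep the audio intelligible.
    """
    if speech_wpm <= 0:
        raise ValueError("speech_wpm must be positive")
    rate = typing_wpm / speech_wpm
    # Clamp so that very slow or very fast typing does not produce
    # unusable playback speeds.
    return max(min_rate, min(max_rate, rate))

# A typist at 120 wpm transcribing speech recorded at 160 wpm.
rate = playback_rate(120, 160)
```

In a running system the typing speed would be re-measured periodically and the rate recomputed, so playback slows down or speeds up as the typist does.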
-
Patent number: 6947887Abstract: A low speed encoding method based on Internet protocol (IP) includes the steps of determining speech characteristic parameters in TN duration, determining an optimized frame length T for successive speech data processing according to the characteristic parameters, making compressed encoding of the speech data in every T, assembling a packet of the encoded bits with TCP and UDP, again assembling a packet of the assembled bits with IP, and finally outputting the channel. The method uses a single frame, variable length frame, intra-frame adaptive low speed speech encoding method, which has the advantages of reducing the bit rate and raising transmission efficiency. The method takes an optimized length encoded frame as a unit to break the IP datagram, and therefore raises encoding and decoding quality of the speech data greatly. Informal tests show that the method can raise a MOS (mean opinion score) value from 0.1 to 0.2.Type: GrantFiled: February 19, 2003Date of Patent: September 20, 2005Assignee: Huawei Technologies Co., Ltd.Inventors: Shengxi Pan, Yingtao Li
-
Patent number: 6920421Abstract: In order to improve recognition performance, a no-speech sound model correction section performs an adaptation of a no-speech sound model which is a sound model representing a no-speech state on the basis of input data observed in an interval immediately before a speech recognition interval for the object of speech recognition and the degree of freshness representing the recentness of the input data.Type: GrantFiled: December 26, 2000Date of Patent: July 19, 2005Assignee: Sony CorporationInventor: Hongchang Pao
-
Patent number: 6898565Abstract: A system and terminal for facilitating a “virtual presence” allows users on a communication network to simply begin speaking through other users. A system immediately detects the destination party's name, and begins routing the audio signal to a particular destination without any noticeable call set-up. Additionally, the system performs pitch corrected speed control in order to allow the detection and processing of a speech pattern without causing delay to an end user.Type: GrantFiled: January 6, 2003Date of Patent: May 24, 2005Assignee: Intel CorporationInventor: Howard Bubb
-
Patent number: 6876965Abstract: A voice activity detector is disclosed for use with a radio transmitter to continuously sense the presence of speech in an audio signal. Initially, the audio signal is processed to produce a train of signal samples. Signal peaks are identified therefrom, which are used to compute respective values for a succession of quasi-pitch periods associated with the signal sample train. The quasi-pitch period values are then selectively compared with one another, in order to determine the presence or absence of a speech component.Type: GrantFiled: February 28, 2001Date of Patent: April 5, 2005Assignee: Telefonaktiebolaget LM Ericsson (publ)Inventors: Fisseha Mekuria, Joakim Persson
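The peak-based detection outlined in this abstract can be sketched as below. The peak-picking rule, the amplitude threshold, the 15% similarity tolerance, and the minimum number of matching period pairs are illustrative assumptions; the abstract says only that quasi-pitch periods derived from signal peaks are selectively compared with one another.

```python
import math

def peak_positions(samples, threshold):
    """Indices of local maxima whose amplitude exceeds the threshold."""
    return [i for i in range(1, len(samples) - 1)
            if samples[i] > threshold
            and samples[i] > samples[i - 1]
            and samples[i] >= samples[i + 1]]

def quasi_pitch_periods(peaks):
    """Distances, in samples, between successive peaks."""
    return [b - a for a, b in zip(peaks, peaks[1:])]

def looks_like_speech(samples, threshold=0.5, tolerance=0.15, min_matches=3):
    """Report speech when enough neighbouring quasi-pitch periods agree
    to within the given relative tolerance (an assumed decision rule)."""
    periods = quasi_pitch_periods(peak_positions(samples, threshold))
    matches = sum(1 for p, q in zip(periods, periods[1:])
                  if abs(p - q) <= tolerance * max(p, q))
    return matches >= min_matches

# Hypothetical inputs: a voiced-like periodic signal, silence, and an
# irregular click train with no consistent period.
voiced = [math.sin(2 * math.pi * i / 40) for i in range(400)]
silence = [0.0] * 400
clicks = [1.0 if i in (10, 30, 90, 100, 200, 390) else 0.0 for i in range(400)]
```

Voiced speech tends to produce regularly spaced peaks, so consistent periods suggest speech, while silence and irregular clicks do not.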
-
Patent number: 6868377Abstract: A method and apparatus to inexpensively and efficiently process audio and speech signals. A method for processing a signal having at least one region of interest is provided. The method begins by dividing the signal into a plurality of sub-band signals, wherein a selected sub-band signal includes the region of interest. The selected sub-band is processed by a phase vocoder to produce a vocoder output signal. Next, at least a portion of the sub-bands are time-aligned with the vocoder output signal. Finally, the aligned sub-band signals and the vocoder output signal are combined to form an output signal.Type: GrantFiled: November 23, 1999Date of Patent: March 15, 2005Assignee: Creative Technology Ltd.Inventor: Jean Laroche
-
Patent number: 6850882Abstract: A method of and device for the diagnosis and treatment of speech dynamically measures the functioning of the velum in the control of nasality during speech. Various components of oral and nasal airflow are separated and selectively analyzed including (i) the fundamental frequency component of each airflow during voiced speech, (ii) a plurality of voice components that cover a frequency range encompassing at least the lowest vocal tract resonance (the first formant), and (iii) the subsonic and infrasonic components of at least the nasal airflow. By comparing the nasal and oral airflow components at the voice fundamental frequency, a nasalization measure for voiced speech sounds is formed which emulates methods that compare low frequency nasal and oral airflow during voiced speech, while eliminating or greatly reducing the problems associated with comparing these low frequency airflows, and which improves upon previous methods based on measuring and comparing nasal and oral radiated sound pressure.Type: GrantFiled: October 23, 2000Date of Patent: February 1, 2005Inventor: Martin Rothenberg
-
Patent number: 6842735Abstract: A data-compressed audio waveform is temporally modified without requiring complete decompression of the audio signal. Packets of compressed audio data are first unpacked, to remove scaling that was applied in the formation of the packets. The unpacked data is then temporally modified, using one of a number of different approaches. This modification takes place while the audio information remains in a data-compressed format. New packets are then assembled from the modified data, to produce a data-compressed output stream that can be subsequently processed in a conventional manner to reproduce the desired sound. The assembly of the new packets employs a technique for inferring an auditory model from the original packets, to requantize the data in the output packets.Type: GrantFiled: September 13, 2000Date of Patent: January 11, 2005Assignee: Interval Research CorporationInventors: Michele M. Covell, Malcolm Slaney, Arthur Rothstein