Specialized Information Patents (Class 704/206)
  • Patent number: 6742003
    Abstract: A system that incorporates an interactive graphical user interface for visualizing clusters (categories) and segments (summarized clusters) of data. Specifically, the system automatically categorizes incoming case data into clusters, summarizes those clusters into segments, determines similarity measures for the segments, scores the selected segments through the similarity measures, and then forms and visually depicts hierarchical organizations of those selected clusters. The system also automatically and dynamically reduces, as necessary, a depth of the hierarchical organization, through elimination of unnecessary hierarchical levels and inter-nodal links, based on similarity measures of segments or segment groups. Attribute/value data that tends to meaningfully characterize each segment is also scored, rank ordered based on normalized scores, and then graphically displayed.
    Type: Grant
    Filed: April 30, 2001
    Date of Patent: May 25, 2004
    Assignee: Microsoft Corporation
    Inventors: David E. Heckerman, Paul S. Bradley, David M. Chickering, Christopher A. Meek
  • Publication number: 20040083093
    Abstract: A method for measuring nasality senses a sound, amplifies the sensed sound signal, converts the sound signal to a digital sound signal, transforms the digital time-domain sound signal to the frequency domain with a Fast Fourier Transform (FFT), determines a cut frequency (fcut), calculates a low-frequency power and a high-frequency power of each window, calculates an acoustic low/high ratio (ALHR), and calculates an average acoustic low/high ratio (ALHRave). With such a method, the ALHRave reflects a person's nasality and nasal airway status, and the method is conveniently performed.
    Type: Application
    Filed: October 25, 2002
    Publication date: April 29, 2004
    Inventors: Guo-She Lee, Terry B. J. Kuo
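The windowed low/high power ratio described in this abstract can be sketched as follows. This is a minimal illustration, not the applicants' implementation: the function names, the use of non-overlapping rectangular windows, the linear (non-dB) ratio, and the naive DFT standing in for an FFT are all assumptions made for the example.

```python
import cmath
import math


def power_spectrum(window):
    """Return |X[k]|^2 for bins 0..N/2 of a real window (naive DFT, assumed stand-in for an FFT)."""
    n = len(window)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(window))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def alhr(window, sample_rate, f_cut):
    """Ratio of power below f_cut to power at or above f_cut for one window."""
    n = len(window)
    power = power_spectrum(window)
    low = sum(p for k, p in enumerate(power) if k * sample_rate / n < f_cut)
    high = sum(p for k, p in enumerate(power) if k * sample_rate / n >= f_cut)
    return low / high if high > 0 else float("inf")


def alhr_average(signal, sample_rate, f_cut, window_size):
    """Average the per-window ratio over consecutive non-overlapping windows."""
    starts = range(0, len(signal) - window_size + 1, window_size)
    ratios = [alhr(signal[s:s + window_size], sample_rate, f_cut) for s in starts]
    if not ratios:
        raise ValueError("signal is shorter than one window")
    return sum(ratios) / len(ratios)
```

How fcut is chosen is left to the caller here; the abstract states only that a cut frequency is determined.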
  • Patent number: 6718297
    Abstract: The present invention classifies an input signal as either voice or data with reduced energy consumption. The present invention includes a frequency estimator and an energy estimator for processing an input signal and a classification unit connected to both the frequency and energy estimators for classifying the input signal. The frequency estimator includes a delay and difference integrator. In operation, the delay receives the input signal and generates a delayed input signal and the difference integrator receives the delayed and input signals and generates a frequency estimate value representing both the estimated central frequency of the input signal and the estimated energy of the input signal. The energy estimator generates an estimate of the energy level of the input signal. The classification unit classifies the input as either voice or data based on a comparison of the frequency and energy estimate values and a data threshold value.
    Type: Grant
    Filed: February 15, 2000
    Date of Patent: April 6, 2004
    Assignee: The Boeing Company
    Inventors: Joseph Peebles Pride, III, Edward James Carroll, Cheryl Jean Franklin
  • Publication number: 20040054526
    Abstract: A speech encoder including a pitch detector operative to determine the pitch frequency of a speech segment, a spectral estimator operative to estimate the complex spectrum of the speech segment at the pitch frequency, an envelope encoder operative to calculate the amplitude of the complex spectrum, a phase aligner operative to remove a phase term which is linear in frequency from each of a plurality of complex values of the complex spectrum, and calculate a series of division products of each of the plurality of complex values by the square root of the absolute value of each of the complex values, where the series has a minimum total variation, thereby resulting in an aligned phase θk, and a phase encoder operative to encode the phase information.
    Type: Application
    Filed: September 13, 2002
    Publication date: March 18, 2004
    Applicant: IBM
    Inventors: Dan Chazan, Zvi Kons
  • Patent number: 6697796
    Abstract: A technique and apparatus to allow a digital search of the entries in a digital audio database such as the Flash memory of a telephone answering system, the hard drive of a voice messaging system, the audio tracks on a compact disk, a cassette tape, a digital video disk (DVD), a videotape, etc. In one disclosed embodiment, each entry in the digital audio database (e.g., each audio track, each voice message, etc.) is converted into textual information, and the converted textual information is associated with a particular audio segment within the digital audio database. The textual information allows a digital search to be performed for a particular voice message, or portion of a voice message, in a telephone answering device, or for a particular song on a music CD, etc. Once the particular audio segment(s) containing a particular textual string is (are) located, that particular audio segment may be played or otherwise accessed, either in whole or in relevant part.
    Type: Grant
    Filed: January 13, 2000
    Date of Patent: February 24, 2004
    Assignee: Agere Systems Inc.
    Inventor: Bahram Ghaffarzadeh Kermani
  • Patent number: 6678647
    Abstract: A perceptual audio coder is disclosed for encoding audio signals, such as speech or music, with different spectral and temporal resolutions for the redundancy reduction and irrelevancy reduction using cascaded filterbanks. The disclosed perceptual audio coder includes a first analysis filterbank for performing irrelevancy reduction in accordance with a psychoacoustic model and a second analysis filterbank for performing redundancy reduction. The spectral/temporal resolution of the first filterbank can be optimized for irrelevancy reduction and the spectral/temporal resolution of the second filterbank can be optimized for maximum redundancy reduction. The disclosed perceptual audio coder also includes a scaling block between the cascaded filterbank that scales the spectral coefficients, based on the employed perceptual model.
    Type: Grant
    Filed: June 2, 2000
    Date of Patent: January 13, 2004
    Assignee: Agere Systems Inc.
    Inventors: Bernd Andreas Edler, Christof Faller
  • Patent number: 6675140
    Abstract: The signal processing method includes the steps of: wavelet-transforming an input signal in a computer; and extracting features of the signal by Mellin-transforming the output of the wavelet transform step in synchrony with the input signal in a computer.
    Type: Grant
    Filed: January 28, 2000
    Date of Patent: January 6, 2004
    Assignee: Seiko Epson Corporation
    Inventors: Toshio Irino, Roy D. Patterson
  • Patent number: 6665637
    Abstract: The present invention relates to the concealment of errors in decoded acoustic signals caused by encoded data representing the acoustic signals being partially lost or damaged during transmission over a transmission medium. In case of lost data or received damaged data, a secondary reconstructed signal is produced on the basis of a primary reconstructed signal. This signal has a spectrally adjusted spectrum (Z4E), such that it deviates less with respect to spectral shape from a spectrum (Z3) of a previously reconstructed signal produced from previously received data than a spectrum (Z′4) of the primary reconstructed signal.
    Type: Grant
    Filed: October 19, 2001
    Date of Patent: December 16, 2003
    Assignee: Telefonaktiebolaget LM Ericsson (publ)
    Inventor: Stefan Bruhn
  • Patent number: 6658112
    Abstract: A voice decoder detects channel errors and loss of cryptographic synchronization using the change in spectral energy between sequential frames. The change in energy between frames is determined between corresponding LSP's of said successive frames and summed together. A running average of the change in energy for a predetermined number of frames is maintained. Current voice frames are eliminated based on the difference between the change in energy associated with the current frame and the running average. Accordingly, offensive audio associated with such channel errors or cryptographic synchronization loss is eliminated.
    Type: Grant
    Filed: August 6, 1999
    Date of Patent: December 2, 2003
    Assignee: General Dynamics Decision Systems, Inc.
    Inventors: David L. Barron, William Chunhung Yip, Paul Lopez Kennedy
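The frame-elimination logic in this abstract can be illustrated with a short sketch. Several details are assumptions made for the example and are not stated in the abstract: the change between frames is taken as the sum of absolute differences between corresponding LSP values, the running average covers only frames that were kept, a dropped frame is not used as the reference for the next comparison, and `history_len` and `threshold` are hypothetical parameters.

```python
from collections import deque


def lsp_change(prev_lsp, curr_lsp):
    """Sum of absolute differences between corresponding LSP values (assumed measure)."""
    return sum(abs(c - p) for p, c in zip(prev_lsp, curr_lsp))


def filter_frames(frames, history_len=4, threshold=0.5):
    """Return indices of frames kept; drop frames whose change exceeds the running average by more than threshold."""
    kept = []
    history = deque(maxlen=history_len)  # recent changes of kept frames
    prev = None                          # last kept frame, used as the reference
    for index, lsp in enumerate(frames):
        if prev is not None:
            change = lsp_change(prev, lsp)
            average = sum(history) / len(history) if history else change
            if change - average > threshold:
                continue                 # eliminate this frame, keep the old reference
            history.append(change)
        kept.append(index)
        prev = lsp
    return kept
```

With these assumptions, a single corrupted frame between otherwise similar frames is dropped while its neighbours are kept.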
  • Patent number: 6654716
    Abstract: The invention relates to encoding of broadband and narrowband acoustic source signals (x) such that the perceived sound quality of corresponding reconstructed signals is improved in comparison to the known solutions. An enhancement estimation unit (102), operating in serial or in parallel with the regular encoding/decoding means (101), perceptually enhances a reconstructed acoustic source signal by utilization of an enhancement spectrum (C) comprising a larger number of spectral coefficients than the number of sample values in corresponding frames of the signals carrying the basic encoded representation of the acoustic source signal. The thus extended block length of the enhancement spectrum frame provides a basis for accomplishing the desired improvement of the perceived sound quality.
    Type: Grant
    Filed: October 19, 2001
    Date of Patent: November 25, 2003
    Assignee: Telefonaktiebolaget LM Ericsson
    Inventors: Stefan Bruhn, Susanne Olvenstam
  • Patent number: 6651041
    Abstract: A source signal (e.g. a speech sample) is processed or transmitted by a speech coder 1 and converted into a reception signal (coded speech signal). The source and reception signals are separately subjected to preprocessing 2 and psychoacoustic modelling 3. This is followed by a distance calculation 4, which assesses the similarity of the signals. Lastly, an MOS calculation is carried out in order to obtain a result comparable with human evaluation. According to the invention, in order to assess the transmission quality a spectral similarity value is determined which is based on calculation of the covariance of the spectra of the source signal and reception signal and division of the covariance by the standard deviations of the two said spectra. The method makes it possible to obtain an objective assessment (speech quality prediction) while taking the human auditory process into account.
    Type: Grant
    Filed: February 9, 2001
    Date of Patent: November 18, 2003
    Assignee: Ascom AG
    Inventor: Pero Juric
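The spectral similarity value described in this abstract, the covariance of the two spectra divided by their standard deviations, is the correlation coefficient of the spectra. A minimal sketch follows; the function name and the representation of each spectrum as a plain list of band values are assumptions for the example, and the preprocessing, psychoacoustic modelling, and MOS mapping steps are not shown.

```python
import math


def spectral_similarity(source_spectrum, received_spectrum):
    """Covariance of two spectra divided by the product of their standard deviations."""
    if not source_spectrum or len(source_spectrum) != len(received_spectrum):
        raise ValueError("spectra must be non-empty and of equal length")
    n = len(source_spectrum)
    mean_s = sum(source_spectrum) / n
    mean_r = sum(received_spectrum) / n
    cov = sum((s - mean_s) * (r - mean_r)
              for s, r in zip(source_spectrum, received_spectrum)) / n
    std_s = math.sqrt(sum((s - mean_s) ** 2 for s in source_spectrum) / n)
    std_r = math.sqrt(sum((r - mean_r) ** 2 for r in received_spectrum) / n)
    if std_s == 0 or std_r == 0:
        raise ValueError("similarity is undefined for a flat spectrum")
    return cov / (std_s * std_r)
```

The value lies between -1 and 1, and a received spectrum that is a positively scaled copy of the source spectrum scores 1, so the measure is insensitive to overall gain.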
  • Patent number: 6633839
    Abstract: In a distributed speech recognition system comprising a first communication device which receives a speech input (34), encodes data representative of the speech input, and transmits the encoded data and a second remotely-located communication device which receives the encoded data and compares the encoded data with a known data set, the device including a processor with a program which controls the processor to operate according to a method of reconstructing the speech input including the step of receiving encoded data including encoded spectral data and encoded energy data. The method further includes the step of decoding the encoded spectral data and encoded energy data to determine the spectral data and energy data. The method also includes the step of combining the spectral data and energy data to reconstruct the speech input.
    Type: Grant
    Filed: February 2, 2001
    Date of Patent: October 14, 2003
    Assignee: Motorola, Inc.
    Inventors: William M. Kushner, Jeffrey Meunier, Mark A. Jasiuk, Tenkasi V. Ramabadran
  • Patent number: 6629067
    Abstract: A range control system includes an input section for inputting a singing voice, a fundamental frequency extracting section for extracting a fundamental frequency of the inputted voice, and a pitch control section for performing a pitch control of the inputted voice so as to match the extracted fundamental frequency with a given frequency. The system further includes a formant extracting section for extracting a formant of the inputted voice, and a formant filter section for performing a filter operation relative to the pitch-controlled voice so that the pitch-controlled voice has a characteristic of the extracted formant. The system further includes an input loudness detecting section for detecting a first loudness of the inputted voice, and a loudness control section for controlling a second loudness of the voice subjected to the filter operation to match with the first loudness.
    Type: Grant
    Filed: May 14, 1998
    Date of Patent: September 30, 2003
    Assignee: Kabushiki Kaisha Kawai Gakki Seisakusho
    Inventors: Tsutomu Saito, Hiroshi Kato, Youichi Kondo
  • Publication number: 20030182105
    Abstract: The present invention relates to a method and system for distinguishing speech from music in a digital audio signal in real time.
    Type: Application
    Filed: February 21, 2003
    Publication date: September 25, 2003
    Inventors: Mikhael A. Sall, Sergei N. Gramnitskiy, Alexandr L. Maiboroda, Victor V. Redkov, Anatoli I. Tikhotsky, Andrei B. Viktorov
  • Patent number: 6615169
    Abstract: A speech coding method and device for encoding and decoding an input signal and providing synthesized speech, wherein the higher frequency components of the synthesized speech are achieved by high-pass filtering and coloring an artificial signal to provide a processed artificial signal. The processed artificial signal is scaled by a first scaling factor during the active speech periods of the input signal and a second scaling factor during the non-active speech periods, wherein the first scaling factor is characteristic of the higher frequency band of the input signal and the second scaling factor is characteristic of the lower frequency band of the input signal. In particular, the second scaling factor is estimated based on the lower frequency components of the synthesized speech and the coloring of the artificial signal is based on the linear predictive coding coefficients characteristic of the lower frequency of the input signal.
    Type: Grant
    Filed: October 18, 2000
    Date of Patent: September 2, 2003
    Assignee: Nokia Corporation
    Inventors: Pasi Ojala, Jani Rotola-Pukkila, Janne Vainio, Hannu Mikkola
  • Patent number: 6584437
    Abstract: A method and apparatus for coding successive pitch periods of a speech signal. Based on a priori knowledge of statistical properties of successive speech periods, a shaped lattice structure is designed to cover the most probable points in the pitch space. The codebook index search starts with finding an open-loop estimate in the pitch space considering all dimensions and refining the open-loop estimate in a closed-loop search separately in each dimension based on the shaped lattice structure. The closed-loop search for the first subframe is for obtaining an absolute pitch period or a delta pitch while the closed-loop search for each of the other subframes is for obtaining a delta pitch for the respective subframe.
    Type: Grant
    Filed: June 11, 2001
    Date of Patent: June 24, 2003
    Assignee: Nokia Mobile Phones Ltd.
    Inventors: Ari Heikkinen, Vesa T. Ruoppila, Samuli Pietilä
  • Patent number: 6574592
    Abstract: The voice detecting device includes a voice frequency detector for detecting the frequency of the input voice, then discriminating whether the detected frequency falls within a preset frequency range of voices to be detected, and outputting the result of the discrimination; an input signal level detector for detecting the energy level of the input voice, then comparing it to confirm whether the detected energy level exceeds a preset energy level threshold value of voices to be detected, and outputting the result of the comparison; a voice input judge part responsive to the result of discrimination and the result of comparison to judge whether a voice satisfying conditions for voice detection has been input or not, and output a first status signal in accordance with the result of the judgement; and a voice duration measure part for measuring the duration of the first status signal, then comparing it to confirm whether the detected duration falls within a preset range of duration threshold value, and outputting
    Type: Grant
    Filed: March 20, 2000
    Date of Patent: June 3, 2003
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Keigo Nankawa, Masaki Saito
  • Publication number: 20030078768
    Abstract: Method and apparatus to measure jitter (period-to-period fluctuations in fundamental frequency) among the voices of suicidal, major depressed, and non-suicidal patients to predict near-term suicidal risk.
    Type: Application
    Filed: October 5, 2001
    Publication date: April 24, 2003
    Inventors: Stephen E. Silverman, Asli Ozdas, Marilyn K. Silverman
  • Patent number: 6535846
    Abstract: A voice signal processing system with multiple parallel control paths, each of which addresses different problems, such as the high peak-to-RMS signal ratios characteristic of speech, wide variations in RMS speech levels, and high background noise levels. Different families of input-output control curves are used simultaneously to achieve efficient peak limiting and dynamic range compression as well as low-level dynamic expansion to prevent excessive amplification of background noise. In addition, a delay in the audio path relative to the control path makes it possible to employ an effective look-ahead in the control path, with FIR filtering smoothing-matched to the look-ahead. Digital domain peak interpolators are used for estimating the peaks of the input signal in the continuous time domain.
    Type: Grant
    Filed: August 7, 2000
    Date of Patent: March 18, 2003
    Assignee: K.S. Waves Ltd.
    Inventor: Meir Shashoua
  • Patent number: 6507820
    Abstract: The present invention relates to a method for the band expansion of speech for telephones, in particular for mobile telephones, by increasing the effective sampling rate of the speech signal by the insertion of additional samples and subsequent filtering of the expanded bandwidth speech signal.
    Type: Grant
    Filed: July 3, 2000
    Date of Patent: January 14, 2003
    Assignee: Telefonaktiebolaget LM Ericsson
    Inventor: Petra Deutgen
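The two steps named in this abstract, inserting additional samples to raise the effective sampling rate and then filtering the result, can be sketched as below. The abstract does not say what values the inserted samples take or which filter is used, so zero-valued inserted samples, an expansion factor of 2, and a caller-supplied FIR filter are all assumptions made for the example.

```python
def insert_samples(signal, factor=2, fill=0.0):
    """Raise the sampling rate by `factor`, placing `factor - 1` fill samples after each input sample."""
    out = []
    for x in signal:
        out.append(x)
        out.extend([fill] * (factor - 1))
    return out


def fir_filter(signal, taps):
    """Direct-form FIR filter: y[n] = sum over k of taps[k] * x[n - k], with zero initial state."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * signal[n - k]
        out.append(acc)
    return out


# Example: double the rate, then apply a hypothetical two-tap filter.
expanded = fir_filter(insert_samples([1.0, 2.0, 3.0]), [0.5, 0.5])
```

Inserting zeros mirrors the original band into the newly available upper band, and the filter that follows determines how much of that mirrored content is retained.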
  • Patent number: 6493668
    Abstract: A system suitable for use in a speech recognition system or other voice processing system extracts features related to the frequency and amplitude characteristics of an input speech signal using a plurality of complex band pass filters by processing the outputs of adjacent bandpass filters.
    Type: Grant
    Filed: June 15, 2001
    Date of Patent: December 10, 2002
    Inventor: Yigal Brandman
  • Patent number: 6470310
    Abstract: Processing for producing encoded output representing information about a pitch period of an input speech signal is performed. The pitch period of a previously entered speech signal is stored in a buffer. A search range-determining portion determines a range in which a current pitch period is analyzed, according to the pitch period of the previously entered speech signal. A presently entered speech signal is applied from a speech input terminal. A pitch analysis portion makes a pitch analysis of candidates for the pitch period contained in the determined search range. Information about the pitch period is delivered from an output terminal and stored in the buffer for subsequent processing. The pitch period of the speech signal can be calculated with a small amount of calculation and represented with a small amount of information.
    Type: Grant
    Filed: September 28, 1999
    Date of Patent: October 22, 2002
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Masahiro Oshikiri, Kimio Miseki
  • Patent number: 6466904
    Abstract: There is provided a speech decoder comprising a means for generating an excitation signal and a means for performing harmonic analysis and synthesis on the excitation signal in order to generate a smooth, periodic speech signal. The speech decoder further comprises a mixing means for mixing the excitation signal with the smooth, periodic signal and a synthesizing means for synthesizing the modified excitation signal into a speech signal that can be played to a user through a listening means. There is also provided a receiver that incorporates a speech decoder such as the decoder described above as well as a method for speech decoding.
    Type: Grant
    Filed: July 25, 2000
    Date of Patent: October 15, 2002
    Assignee: Conexant Systems, Inc.
    Inventors: Yang Gao, Huan-yu Su
  • Patent number: 6463405
    Abstract: A method, system and product are provided for loss-less encoding of a digital signal representing an audible sound. The method includes dividing the digital signal into a plurality of frames, dividing each of the plurality of frames into a plurality of subbands, and assigning each of the plurality of subbands an indicator selected from the group consisting of positive, zero, and negative, wherein the indicator selected is based on a polarity and a magnitude of the subband. The method further includes assigning each of the plurality of subbands one of a plurality of scale factors, wherein each scale factor represents a sound level range of at most two decibels, and generating a digital word for each of the plurality of frames, each digital word having a scale factor section including the scale factors for the plurality of subbands in the frame, and a sample data section including the indicators for the plurality of subbands in the frame.
    Type: Grant
    Filed: December 20, 1996
    Date of Patent: October 8, 2002
    Inventor: Eliot M. Case
  • Patent number: 6463407
    Abstract: A low-bit-rate coding technique for unvoiced segments of speech includes the steps of extracting high-time-resolution energy coefficients from a frame of speech, quantizing the energy coefficients, generating a high-time-resolution energy envelope from the quantized energy coefficients, and reconstituting a residue signal by shaping a randomly generated noise vector with quantized values of the energy envelope. The energy envelope may be generated with linear interpolation technique. A post-processing measure may be obtained and compared with a predefined threshold to determine whether the coding algorithm is performing adequately.
    Type: Grant
    Filed: November 13, 1998
    Date of Patent: October 8, 2002
    Assignee: Qualcomm Inc.
    Inventors: Amitava Das, Sharath Manjunath
  • Patent number: 6453284
    Abstract: For tracking multiple, simultaneous voices, predicted tracking is used to follow individual voices through time, even when the voices are very similar in fundamental frequency. An acoustic waveform comprised of a group of voices is submitted to a frequency estimator, which may employ an average magnitude difference function (AMDF) calculation to determine the voice fundamental frequencies that are present for each voice. These frequency estimates are then used as input values to a recurrent neural network that tracks each of the frequencies by predicting the current fundamental frequency value for each voice present based on past fundamental frequency values in order to disambiguate any fundamental frequency trajectories that may be converging in frequency.
    Type: Grant
    Filed: July 26, 1999
    Date of Patent: September 17, 2002
    Assignee: Texas Tech University Health Sciences Center
    Inventor: D. Dwayne Paschall
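The average magnitude difference function (AMDF) mentioned in this abstract can be sketched for a single voice as follows. The function names and the search bounds `f_min` and `f_max` are assumptions for the example; the multi-voice estimation and the recurrent neural network tracker described in the abstract are not shown.

```python
def amdf(frame, lag):
    """Average magnitude difference between the frame and itself shifted by `lag` samples."""
    n = len(frame) - lag
    return sum(abs(frame[i] - frame[i + lag]) for i in range(n)) / n


def estimate_f0(frame, sample_rate, f_min, f_max):
    """Pick the lag with the smallest AMDF inside the allowed pitch range and convert it to Hz."""
    min_lag = max(1, int(sample_rate / f_max))
    max_lag = min(len(frame) - 1, int(sample_rate / f_min))
    if min_lag > max_lag:
        raise ValueError("frame too short for the requested pitch range")
    best_lag = min(range(min_lag, max_lag + 1), key=lambda lag: amdf(frame, lag))
    return sample_rate / best_lag
```

The AMDF also dips at integer multiples of the true period, so the search range should be narrow enough to exclude them or the estimate can land an octave low.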
  • Patent number: 6449592
    Abstract: A method for tracking the phase of a quasi-periodic signal includes the steps of estimating the phase of the signal for frames during which the signal is periodic, monitoring the performance of the estimated phase with a closed-loop performance measure, and measuring the phase of the signal for frames during which the signal is periodic and performance of the estimated phase falls below a predefined threshold level. In estimating the phase, the initial phase value is set equal to the estimated final phase value of the previous frame if the previous frame was periodic. The initial phase value is set equal to a measured phase value of the previous frame if the previous frame was nonperiodic, or if the previous frame was periodic and performance of the estimated phase for the previous frame fell below the predefined threshold level. For frames during which the signal is nonperiodic, the phase of the signal is measured.
    Type: Grant
    Filed: February 26, 1999
    Date of Patent: September 10, 2002
    Assignee: Qualcomm Incorporated
    Inventor: Amitava Das
  • Publication number: 20020103637
    Abstract: The present invention relates to digital audio coding systems that employ high frequency reconstruction (HFR) methods. It teaches how to improve the overall performance of such systems, by means of an adaption over time of the crossover frequency between the lowband coded by a core codec, and the highband coded by an HFR system. Different methods of establishing the instantaneous optimum choice of crossover frequency are introduced.
    Type: Application
    Filed: November 15, 2001
    Publication date: August 1, 2002
    Inventors: Fredrik Henn, Andrea Ehret, Michael Schug
  • Patent number: 6418405
    Abstract: A system controller (106) includes a speech encoder (107) that dynamically segments frames of a low bit rate digital voice message. Speech model parameters have been generated in a sequence of frames. The speech model parameters include quantized speech spectral parameter vectors. The speech encoder selects (1820) a first quantized speech spectral parameter vector as a current anchor vector, selects (1820, 1830) a second quantized speech spectral parameter vector located a predetermined number of frames (LMAX) from the current anchor vector as a target speech parameter vector, and perturbs (1840) the target speech parameter vector to derive a plurality (K) of perturbed speech parameter vectors.
    Type: Grant
    Filed: September 30, 1999
    Date of Patent: July 9, 2002
    Assignee: Motorola, Inc.
    Inventors: Sunil Satyamurti, Jian-Cheng Huang, Floyd Simpson, Kenneth Finlon
  • Patent number: 6415251
    Abstract: If not all original subbands are selected for processing in conventional subband coders or decoders, aliasing distortion is generated by the characteristics of their subband band-splitting filters or subband band synthesis filters. To improve sound quality in a subband decoder, the decoded frequency components in the overlap region adjacent to a subband selected not to be decoded are band-limited prior to synthesis. Alternatively, in a subband coder, the sound quality in a processed subband adjacent to one not to be coded is improved by band-limiting the filtering frequency overlap region between these subbands prior to coding. By thus decoding only the non-overlapping part of the subband adjacent to an omitted subband, signal distortion is reduced.
    Type: Grant
    Filed: June 3, 1999
    Date of Patent: July 2, 2002
    Assignee: Sony Corporation
    Inventors: Yoshiaki Oikawa, Mitsuyuki Hatanaka, Kenzo Akagiri
  • Patent number: 6377915
    Abstract: A decoder compares a spectral envelope value y8 on a frequency axis with a predetermined threshold f9 to identify a voiced region and an unvoiced region. An excitation signal is produced by using excitations suitable for respective frequency regions. An encoder applies the nonuniform quantization to the period of the aperiodic pitch in accordance with its frequency of occurrence. The result of the nonuniform quantization is transmitted together with the quantization result of the unvoiced state and the periodic pitch as one code. A decoder obtains spectral envelope amplitude l8′ from the spectral envelope information, and identifies a frequency band e10′ where the spectral envelope amplitude value is maximized in each of respective bands divided on the frequency axis.
    Type: Grant
    Filed: March 14, 2000
    Date of Patent: April 23, 2002
    Assignee: YRP Advanced Mobile Communication Systems Research Laboratories Co., Ltd.
    Inventor: Seishi Sasaki
  • Patent number: 6351729
    Abstract: There is disclosed a method for processing a time-varying signal to produce a high-resolution spectrogram that represents power as a function of both frequency and time. Data blocks of a time series, which represents a sampled signal, are subjected to processing which results in a sequence of frequency-dependent functions referred to as eigencoefficients. Each eigencoefficient represents signal information projected onto a local frequency domain using a respective one of K Slepian sequences or Slepian functions. The spectrogram is derived from time- and frequency-dependent expansions formed from the eigencoefficients.
    Type: Grant
    Filed: July 12, 1999
    Date of Patent: February 26, 2002
    Assignee: Lucent Technologies Inc.
    Inventor: David James Thomson
  • Publication number: 20020007268
    Abstract: Encoding (2) a signal (A) is provided, wherein frequency and amplitude information of at least one sinusoidal component in the signal (A) is determined (20), and sinusoidal parameters (f,a) representing the frequency and amplitude information are transmitted (22), and wherein further a phase jitter parameter (p) is transmitted, which represents an amount of phase jitter that should be added during restoring the sinusoidal component from the transmitted sinusoidal parameters (f,a).
    Type: Application
    Filed: June 20, 2001
    Publication date: January 17, 2002
    Inventors: Arnoldus Werner Johannes Oomen, Albertus Cornelis Den Brinker
  • Patent number: 6339756
    Abstract: An audio digital CODEC is provided with various parameters that when changed affect the quality of the resultant audio. These psycho-acoustic parameters include the standard ISO parameters and additional parameters to aid in effecting a pure resulting audio quality. The psycho-acoustic parameters located in the audio digital CODEC can be monitored and controlled by the user. The parameters can be monitored by a speaker associated with the CODEC or headphones. The user can control the adjustment of the psycho-acoustic parameters through the use of knobs present on the front panel of the CODEC or graphic or digital representations. Adjustment of the parameters will provide real time change of the resulting audio sound that the user can monitor through the speaker or the headphones. Selections may also be made to connect to a plurality of transmission facilities.
    Type: Grant
    Filed: September 19, 2000
    Date of Patent: January 15, 2002
    Assignee: Corporate Computer Systems
    Inventor: Larry W. Hinderks
  • Patent number: 6338036
    Abstract: When a sound which is to be recognized is input to a device, this invention briefly informs the user of whether the sound has been input in an appropriate state.
    Type: Grant
    Filed: August 23, 1999
    Date of Patent: January 8, 2002
    Assignee: Seiko Epson Corporation
    Inventor: Yasunaga Miyazawa
  • Patent number: 6332119
    Abstract: An audio digital CODEC can be connected to a plurality of digital transmission facilities. The CODEC has a plurality of programmable compression schemes which are upgradeable and downloadable. One of the programmable compression schemes is provided with various parameters that when changed affect the quality of the resultant audio. These psycho-acoustic parameters include the standard ISO parameters and additional parameters and can be monitored and controlled by a user.
    Type: Grant
    Filed: March 20, 2000
    Date of Patent: December 18, 2001
    Assignee: Corporate Computer Systems
    Inventor: Larry W. Hinderks
  • Patent number: 6263306
    Abstract: A technique for obtaining an intermediate set of frequency-dependent features from a speech signal for use in speech processing and in obtaining estimates of speech pitch. The technique utilizes multiple tapers derived from Slepian sequences to obtain a product of the speech signal and the Slepian functions. Multiple tapered Fourier transforms are then obtained from the product, from which the set of frequency-dependent features are calculated. In a preferred embodiment, a derivative of the cepstrum of the speech signal is used as an estimate of speech signal pitch. In another preferred embodiment, the F-spectrum is calculated from the product and the F-cepstrum is obtained therefrom by calculating the Fourier transform of the smoothed derivative of the log of the F-spectrum. The maximum of the F-cepstrum also provides a pitch estimation.
    Type: Grant
    Filed: February 26, 1999
    Date of Patent: July 17, 2001
    Assignee: Lucent Technologies Inc.
    Inventors: Michael Sean Fee, Ching Elizabeth Ho, Partha Pratim Mitra, Bijan Pesaran
  • Publication number: 20010005824
    Abstract: A sound image localization apparatus that can support input signals of a plurality of sampling frequencies while being small in circuitry size is provided.
    Type: Application
    Filed: December 19, 2000
    Publication date: June 28, 2001
    Inventors: Naoyuki Kato, Yoshinori Kumamoto
  • Patent number: 6216134
    Abstract: A system that provides for the graphic visualization of the categories of a collection of records. The graphic visualization is referred to as a “category graph.” The system optionally displays the category graph as a “similarity graph” or a “hierarchical map.” When displaying a category graph, the system displays a graphic representation of each category. The system displays the category graph as a similarity graph or a hierarchical map in a way that visually illustrates the similarity between categories. The display of a category graph allows a data analyst to better understand the similarity and dissimilarity between categories. A similarity graph includes a node for each category and an arc connecting nodes representing categories whose similarity is above a threshold. A hierarchical map is a tree structure that includes a node for each base category along with nodes representing combinations of similar categories.
    Type: Grant
    Filed: June 25, 1998
    Date of Patent: April 10, 2001
    Assignee: Microsoft Corporation
    Inventors: David E. Heckerman, David Maxwell Chickering, Usama M. Fayyad, Christopher A. Meek
  • Patent number: 6195632
    Abstract: An iterative formant analysis, based on minimizing the arc-length of various curves under various filter constraints, estimates formant frequencies with desirable properties for text-to-speech applications. A class of arc-length cost functions may be employed. Some of these have analytic solutions and thus lend themselves well to applications requiring speed and reliability. The arc-length inverse filtering techniques are inherently pitch synchronous and are useful in realizing high quality pitch tracking and pitch epoch marking.
    Type: Grant
    Filed: November 25, 1998
    Date of Patent: February 27, 2001
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventor: Steve Pearson
  • Patent number: 6182036
    Abstract: A method of extracting features for a voice recognition system of a wireless communication device. The wireless communication device includes a transmitter and a receiver coupled to a call processor, a microphone for inputting audible sounds, and an analog-to-digital converter coupled to the microphone to output digital signals. The call processor includes a coefficient generator coupled to the analog-to-digital converter and generating representative coefficients, a differentiator coupled to the coefficient generator to generate delta coefficients, and an extractor outputting a portion of the representative coefficients and the delta coefficients as a feature vector for use in voice recognition, wherein every other representative coefficient is used in the feature vector.
    Type: Grant
    Filed: February 23, 1999
    Date of Patent: January 30, 2001
    Assignee: Motorola, Inc.
    Inventor: Daniel Charles Poppert
  • Patent number: 6154549
    Abstract: A method and apparatus for providing sound in a spatial environment is provided. Control is provided over the perceived position of sound sources, including perceived positions behind the listener or listeners. The invention is effective for multiple simultaneous listeners distributed throughout the spatial environment. The invention provides techniques for real-time interaction with graphic images, computer game controls, or other computer-based events. The invention provides control over the relative amplitude of signals provided to each of a plurality of spatially diverse transducers. By controlling the relative amplitudes of signals applied to each of the transducers, the present invention provides control over the perceived position of sound sources in a spatial environment with respect to one or more listeners. Attenuation parameters may be stored in a table, indexed by the spatial region to which they apply, and used to adjust the relative amplitude of each of the transducers used to produce a sound.
    Type: Grant
    Filed: May 2, 1997
    Date of Patent: November 28, 2000
    Assignee: Extreme Audio Reality, Inc.
    Inventors: Glenn Arnold, Daniel Bates
  • Patent number: 6112169
    Abstract: A system and method for preserving the natural sound of a signal that is processed by an analysis step of converting the signal into a sequence of overlapping windowed DFT representations and a synthesis step of converting these DFT representations back to a time domain signal. For example, the system and method are applicable to analysis-synthesis systems based on a sequence of overlapping windowed DFT representations in which either: (1) the analysis transforms overlap in time by a different amount than the synthesis transforms, or (2) the modification involves a re-mapping of transform values from one frequency location to another. The phases of the complex-valued DFT representations may be modified so that synthesis of the time domain signal results in a natural sound despite the effects of, e.g., either (1) or (2).
    Type: Grant
    Filed: November 7, 1996
    Date of Patent: August 29, 2000
    Assignee: Creative Technology, Ltd.
    Inventor: Mark Dolson
  • Patent number: 6098036
    Abstract: A speech coding system and associated method rely on a speech encoder and a speech decoder. The speech decoder includes a Linear Predictive Coding (LPC) filter having an input and an output. The LPC filter provides synthesized speech at the output in response to voiced and unvoiced excitation provided at the input. A harmonic generator for providing voiced excitation to the input of the LPC filter includes a spectral formant enhancer for attenuating the amplitude of harmonics generated by the harmonic generator in spectral valleys between formant peaks of respective frames of voiced speech. The system and method reduce perceived buzziness while increasing perceived spectral depth of synthesized speech at the output of the LPC filter.
    Type: Grant
    Filed: July 13, 1998
    Date of Patent: August 1, 2000
    Assignee: Lockheed Martin Corp.
    Inventors: Richard Louis Zinser, Jr., Mark Lewis Grabb, Glen William Brooksby, Steven Robert Koch
  • Patent number: 6094629
    Abstract: A speech coding system and associated method rely on a speech encoder and a speech decoder. The encoder includes a spectral quantizer for computing line spectral frequencies (LSFs) for respective frames of speech and for quantizing the LSFs to obtain a minimum bit representation of a spectral envelope of each respective frame of speech. For even numbered frames of speech the LSFs are quantized using a vector quantization technique. For odd numbered frames of speech samples the LSFs are quantized using a dynamic bit allocation (DBA) method. The dynamic bit allocation method determines an interpolation factor for interpolating between the LSFs of the previous and next frames. According to the dynamic bit allocation method the most perceptually important LSFs are represented by relatively more bits, while the least perceptually important LSFs are represented by relatively fewer bits.
    Type: Grant
    Filed: July 13, 1998
    Date of Patent: July 25, 2000
    Assignee: Lockheed Martin Corp.
    Inventors: Mark Lewis Grabb, Steven Robert Koch, Glen William Brooksby, Richard Louis Zinser, Jr.
  • Patent number: 6064954
    Abstract: Apparatus is disclosed for digitally encoding an input audio signal, for storage or transmission, comprising: a pitch detector for determining at least a dominant time-domain periodicity in the input signal; a generator for generating a prediction signal based on the dominant time domain periodicity of the input signal; a first discrete frequency domain transform generator for generating a frequency domain representation of the input signal; a second discrete frequency domain transform generator for generating a frequency domain representation of the prediction signal; a subtractor to subtract at least a portion of the frequency domain representation of the prediction signal from the frequency domain representation of the input signal to generate an error signal; and a generator to generate an output signal from the error signal and parameters defining the prediction signal. A corresponding decoder is also described.
    Type: Grant
    Filed: March 4, 1998
    Date of Patent: May 16, 2000
    Assignee: International Business Machines Corp.
    Inventors: Gilad Cohen, Yossef Cohen, Doron Hoffman, Hagai Krupnik, Aharon Satt
  • Patent number: 6041295
    Abstract: An adjustable CODEC receives a signal and compresses a received audio signal based on a psycho-acoustic compression system. The psycho-acoustic compression system includes several psycho-acoustic parameters which may be adjusted by the user to optimize the compression system for a specific application, transmission medium, or end user, for example. The compression and the adjustment of compression parameters may occur simultaneously in real time without interruption of the compressed signal. Additionally, the parameters may be adjusted in smart groups of a plurality of parameters that are, for example, sympathetic or are often adjusted together. An additional signal may also be combined with the compressed signal and the resultant combined signal may be transmitted over a single transmission path. For example, the additional signal may be multiplexed into the compressed data stream for TDMA transmission.
    Type: Grant
    Filed: April 10, 1996
    Date of Patent: March 21, 2000
    Assignee: Corporate Computer Systems
    Inventor: Larry W. Hinderks
  • Patent number: 6006175
    Abstract: By simultaneously recording EM wave reflections and acoustic speech information, the positions and velocities of the speech organs as speech is articulated can be defined for each acoustic speech unit. Well defined time frames and feature vectors describing the speech, to the degree required, can be formed. Such feature vectors can uniquely characterize the speech unit being articulated in each time frame. The onset of speech, rejection of external noise, vocalized pitch periods, articulator conditions, accurate timing, the identification of the speaker, acoustic speech unit recognition, and organ mechanical parameters can be determined.
    Type: Grant
    Filed: February 6, 1996
    Date of Patent: December 21, 1999
    Assignee: The Regents of the University of California
    Inventor: John F. Holzrichter
  • Patent number: 5978756
    Abstract: An audio stream is analyzed to distinguish silent periods from non-silent periods and an encoded bitstream is generated for the audio stream, wherein the silent periods are represented by one or more sets of canned encoded data corresponding to representative silent periods. In a preferred embodiment, one of the sets of canned encoded data is randomly selected for each silent period. There may be different sets of silent periods corresponding to different types of silent periods, where a particular type of silent period is selected based on some characteristic of the audio stream (e.g., energy level of the silent periods). In addition, the sets of encoded data may be generated from actual silent periods of the audio stream.
    Type: Grant
    Filed: March 28, 1996
    Date of Patent: November 2, 1999
    Assignee: Intel Corporation
    Inventors: Mark R. Walker, Jeffrey Kidder, Michael Keith
  • Patent number: RE36478
    Abstract: A sinusoidal model for acoustic waveforms is applied to develop a new analysis/synthesis technique which characterizes a waveform by the amplitudes, frequencies, and phases of component sine waves. These parameters are estimated from a short-time Fourier transform. Rapid changes in the highly-resolved spectral components are tracked using the concept of "birth" and "death" of the underlying sine waves. The component values are interpolated from one frame to the next to yield a representation that is applied to a sine wave generator. The resulting synthetic waveform preserves the general waveform shape and is perceptually indistinguishable from the original. Furthermore, in the presence of noise the perceptual characteristics of the waveform as well as the noise are maintained. The method and devices are particularly useful in speech coding, time-scale modification, frequency scale modification and pitch modification.
    Type: Grant
    Filed: April 12, 1996
    Date of Patent: December 28, 1999
    Assignee: Massachusetts Institute of Technology
    Inventors: Robert J. McAulay, Thomas F. Quatieri, Jr.
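
The sinusoidal analysis step described in the abstract of RE36478 (estimating amplitudes, frequencies, and phases of component sine waves from a short-time Fourier transform) can be illustrated with a minimal single-frame sketch. This is not the patented method: it omits the "birth"/"death" tracking and the frame-to-frame interpolation, and it assumes the sine waves fall on exact DFT bins. The function names (`hann`, `dft`, `sine_peaks`, `resynthesize`), the Hann window, and the `floor` threshold are illustrative assumptions, not taken from the patent.

```python
import cmath
import math


def hann(n):
    # Periodic Hann window (an assumed choice; the abstract names no window).
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def dft(frame):
    # Direct DFT of a real frame, non-negative frequency bins only.
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n // 2 + 1)
    ]


def sine_peaks(frame, sample_rate, floor=0.01):
    # Return (frequency_hz, amplitude, phase) for each local spectral maximum.
    # `floor` is a hypothetical amplitude threshold that rejects numerical noise.
    n = len(frame)
    window = hann(n)
    spectrum = dft([x * w for x, w in zip(frame, window)])
    scale = 2.0 / sum(window)  # undo the window gain for a real sinusoid
    mags = [abs(c) * scale for c in spectrum]
    peaks = []
    for k in range(1, len(mags) - 1):
        if mags[k] > floor and mags[k] > mags[k - 1] and mags[k] >= mags[k + 1]:
            peaks.append((k * sample_rate / n, mags[k], cmath.phase(spectrum[k])))
    return peaks


def resynthesize(peaks, n, sample_rate):
    # Drive a bank of cosine generators with the estimated parameters.
    return [
        sum(a * math.cos(2 * math.pi * f * t / sample_rate + p) for f, a, p in peaks)
        for t in range(n)
    ]


# Example frame: two bin-centred sine waves at 1000 Hz and 2000 Hz.
sample_rate = 8000
n = 256
frame = [
    math.cos(2 * math.pi * 1000 * t / sample_rate)
    + 0.5 * math.cos(2 * math.pi * 2000 * t / sample_rate + 0.3)
    for t in range(n)
]
peaks = sine_peaks(frame, sample_rate)
rebuilt = resynthesize(peaks, n, sample_rate)
```

Phases are referenced to the start of the frame. For components that do not sit on a bin centre, the peak frequency and amplitude would need refinement (for example by interpolating around the peak), which this sketch leaves out.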