Abstract: A highly efficient, low delay pitch parameter derivation and quantization permits overall delay which is a fraction of prior coding delays for equivalent speech quality at low bitrates. In distinguishing between pitch period information for voiced and non-voiced frames of input signals, non-voiced frames are assigned a non-zero "bias" value, while voiced frames have associated with them generated pitch information based on an analysis of signals in a present frame and comparison with signals relating to the pitch in a prior frame. Transitions from non-voiced to voiced input frames are efficiently accomplished using a non-uniform quantization method based on an analysis of a sequence of frames. Typical uses include low delay, low-bitrate coders such as Code Excited Linear Prediction (CELP).
Abstract: A speech recognition system may be trained with data that is independent from previous acoustics. This method of training is quicker and more cost effective than previous training methods. In training the system, after a vocabulary word is input into the system, a first set of phonemes representative of the vocabulary word is determined. Next, the first set of phonemes is compared with a second set of phonemes representative of a second vocabulary word. The first vocabulary word and the second vocabulary word are different. The comparison generates a confusability index. The confusability index for the second word is a measure of the likelihood that the second word will be mistaken as another vocabulary word, e.g., the first word, already in the system. This process may be repeated for each newly desired vocabulary word.
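The phoneme-set comparison described in this abstract can be illustrated with a minimal sketch. The abstract does not say how the confusability index is computed, so the normalized edit-distance score below, the function names, and the toy lexicon are all assumptions made for illustration only.

```python
def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def confusability_index(phonemes_a, phonemes_b):
    """Assumed metric: 1.0 for identical sequences, 0.0 for fully distinct ones."""
    longest = max(len(phonemes_a), len(phonemes_b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(phonemes_a, phonemes_b) / longest


# Hypothetical lexicon: word -> phoneme sequence.
LEXICON = {
    "cat": ["k", "ae", "t"],
    "cap": ["k", "ae", "p"],
    "dog": ["d", "ao", "g"],
}


def most_confusable(new_word, new_phonemes, lexicon):
    """Return the (word, index) pair already in the lexicon closest to new_word."""
    scores = {
        word: confusability_index(new_phonemes, phonemes)
        for word, phonemes in lexicon.items()
        if word != new_word
    }
    return max(scores.items(), key=lambda item: item[1])
```

Because the score is computed from phoneme strings alone, no recorded acoustic data is needed, which matches the abstract's claim of training that is independent of previous acoustics.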
Abstract: An adaptive postfilter is used on the decoding side of tandem codecs (coder/decoders). Post-filter parameters are adapted using a backward synthesis filter. The parameters used are 10th order LPC (Linear Predictive Coding) predictor coefficients. The system employed uses Low-Delay Code Excited Linear Predictive codecs (LD-CELP).
December 9, 1996
Date of Patent: December 2, 1997
Lucent Technologies, Inc.
Juin-Hwey Chen, Richard Vandervoort Cox, Nuggehally Sampath Jayant
Abstract: Codebook vectors may be considered critical if they give poor energy approximations and exhibit a particular shape with smaller components near the beginning and larger components toward the end of the vector. Standard deviation may be used to identify critical codevectors based on energy approximation error measured in decibels. A low-bit rate (typically 8 kbit/s or less), low-delay digital coder and decoder based on Code Excited Linear Prediction for speech and similar signals features backward adaptive adjustment for codebook gain and short-term synthesis filter parameters and forward adaptive adjustment of long-term (pitch) synthesis filter parameters. In addition, the coder makes use of an excitation codebook and the coding is based on a set of codebook vector energies for a set of codebook vectors in the codebook. The codebook energies are calculated by identifying a set of approximations for the non-critical codebook vector energies.
Abstract: A low-bitrate (typically 8 kbit/s or less), low-delay digital coder and decoder based on Code Excited Linear Prediction for speech and similar signals features backward adaptive adjustment for codebook gain and short-term synthesis filter parameters and forward adaptive adjustment of long-term (pitch) synthesis filter parameters. A highly efficient, low delay pitch parameter derivation and quantization permits overall delay which is a fraction of prior coding delays for equivalent speech quality at low bitrates.
Abstract: A method and apparatus for quantizing audio signals is disclosed which advantageously produces a quantized audio signal which can be encoded within an acceptable range. Advantageously, the quantizer uses a scale factor which is interpolated between a threshold based on the calculated threshold of hearing at a given frequency and the absolute threshold of hearing at the same frequency.
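As a rough illustration of the interpolation described here, the sketch below blends two thresholds given in decibels and derives a uniform quantizer step size from the result. The linear-in-dB blend, the weight `alpha`, and the step-size rule (uniform quantization noise power of step²/12) are assumptions; the abstract does not specify the interpolation rule or how the scale factor maps to a step size.

```python
import math


def interpolate_threshold_db(calculated_db, absolute_db, alpha):
    """Blend the calculated threshold with the absolute threshold of hearing.

    alpha = 0.0 returns the calculated threshold, alpha = 1.0 the absolute
    threshold. Linear interpolation in dB is an assumption of this sketch.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return (1.0 - alpha) * calculated_db + alpha * absolute_db


def step_size_for_noise_db(noise_db):
    """Uniform quantizer step whose noise power (step**2 / 12) equals noise_db."""
    noise_power = 10.0 ** (noise_db / 10.0)
    return math.sqrt(12.0 * noise_power)


def quantize(value, step):
    """Round a coefficient to the nearest multiple of the step size."""
    return round(value / step) * step
```

One plausible use, again an assumption, is to search over `alpha` until the quantized signal fits the available bit budget.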
Abstract: Techniques for encoding audio signals are disclosed wherein a left channel signal masking threshold and a right channel signal masking threshold may be advantageously adjusted. A given time block of a stereo audio signal comprises a left channel signal and a right channel signal, each of which is represented in a frequency domain by a first plurality of frequency partitions and a second plurality of frequency partitions, respectively. For a frequency partition from the first plurality of frequency partitions and a corresponding frequency partition from the second plurality of frequency partitions, a scheme first calculates a set of masking thresholds comprising the left channel masking threshold and the right channel masking threshold. Next, based upon the set of masking thresholds, the scheme calculates an adjusted left channel masking threshold and an adjusted right channel masking threshold.
Abstract: A method and apparatus for performing a Modified Discrete Cosine Transform on an audio signal is disclosed which utilizes a Discrete Fourier Transform. Illustratively, the MDCT spectral coefficients for the signal are generated from the real FFT spectral coefficients.
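The relationship between the MDCT and the DFT can be checked numerically. The sketch below uses a standard textbook identity (pre-twiddle the input, take a DFT of the full block, post-twiddle, keep the real part); it is not necessarily the construction claimed in the patent, which refers to real FFT coefficients. The naive O(n²) `dft` stands in for an FFT so the example needs only the standard library, and all function names are hypothetical.

```python
import cmath
import math


def mdct_direct(x):
    """MDCT by definition: 2N real inputs -> N real coefficients."""
    n_half = len(x) // 2
    n0 = 0.5 + n_half / 2
    return [
        sum(x[n] * math.cos(math.pi / n_half * (n + n0) * (k + 0.5))
            for n in range(len(x)))
        for k in range(n_half)
    ]


def dft(z):
    """Naive discrete Fourier transform (placeholder for an FFT routine)."""
    m = len(z)
    return [
        sum(z[n] * cmath.exp(-2j * math.pi * n * k / m) for n in range(m))
        for k in range(m)
    ]


def mdct_via_dft(x):
    """Same coefficients as mdct_direct, obtained from one DFT of length 2N."""
    m = len(x)            # block length 2N, assumed even
    n_half = m // 2
    n0 = 0.5 + n_half / 2
    # Pre-twiddle: shift the frequency grid by half a bin.
    pre = [x[n] * cmath.exp(-1j * math.pi * n / m) for n in range(m)]
    spectrum = dft(pre)
    # Post-twiddle: apply the phase offset n0, then keep the real part.
    return [
        (cmath.exp(-1j * math.pi * n0 * (k + 0.5) / n_half) * spectrum[k]).real
        for k in range(n_half)
    ]
```

Replacing `dft` with a fast transform gives the usual O(N log N) cost while leaving the twiddle steps unchanged.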
Abstract: Signal distribution systems, such as interactive cable television systems, are disclosed in which a central server that communicates with remote terminals assigns a specific address to each newly-connected terminal for control messages and polls. In the case of an interactive cable television system, such terminal is the set-top box or converter connected between the cable system and the television receiver. Periodically, the server broadcasts a control message containing a tentative address, and polls the tentative address. Upon receiving such a message, a newly-connected converter stores the tentative address and responds to the poll with an uplink message. After receiving the uplink message, the server inserts the tentative address in its polling list in accordance with the transmission delay measured from sending the poll to receiving the uplink message and selects a new tentative address to use in subsequent broadcast control messages.
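The address-assignment handshake in this abstract can be sketched as a small simulation. Several details are assumptions: the class and method names are hypothetical, addresses are consecutive integers, at most one terminal joins per round, and "in accordance with the transmission delay" is interpreted as keeping the polling list sorted by measured delay.

```python
import bisect
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Terminal:
    delay: float                   # simulated poll-to-reply delay
    address: Optional[int] = None  # unset until a tentative address is stored

    def on_broadcast(self, tentative: int) -> None:
        # Only a newly connected terminal adopts the tentative address.
        if self.address is None:
            self.address = tentative

    def on_poll(self, polled: int) -> Optional[float]:
        # Reply (reported here as the measured delay) only when polled by address.
        return self.delay if self.address == polled else None


@dataclass
class Server:
    tentative: int = 1
    polling_list: List[Tuple[float, int]] = field(default_factory=list)

    def registration_round(self, newcomer: Optional[Terminal]) -> Optional[int]:
        """Broadcast the tentative address, poll it, and register any reply."""
        if newcomer is None:
            return None
        assigned = self.tentative
        newcomer.on_broadcast(assigned)
        delay = newcomer.on_poll(assigned)
        if delay is None:
            return None  # terminal already had an address; nothing to register
        # Assumed policy: keep the polling list ordered by measured delay.
        bisect.insort(self.polling_list, (delay, assigned))
        self.tentative += 1  # select a new tentative address for later rounds
        return assigned
```

A terminal that already holds an address ignores later broadcasts, so repeated rounds do not disturb established entries in the polling list.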
Abstract: A method and an apparatus for spinning off a channel so as to provide a viewer with a choice of viewing a first program (e.g., the TV program), a second program (e.g., a news brief), or both are disclosed. More specifically, a method for providing such a choice entails initially having a system for transmitting programs wherein a first program is being transmitted on a first channel to a set of subscriber stations. Next, a temporary channel is defined. The transmission of the first program is continued on the temporary channel. The transmission of the first program is discontinued on the first channel. A second program is transmitted on the first channel. Advantageously, the invention provides viewers with a programming choice not otherwise available.
Abstract: A system is disclosed for delivering audio and/or video signals to users in connection with the provision of interactive television services. Various sources of such signals are connected to a digital network, such as a packet network. Also connected to such network are control and application processors and interfaces to distribution arrangements such as cable television systems and telephone subscriber loops. Each user has a signal converter for receiving a digital signal from the distribution arrangement, converting such signal for viewing on a conventional television receiver and transmitting control packets to other elements of the system. The signal sources can include a data cache for storing recorded video and audio materials, a broadcast source for receiving broadcast signals, apparatus for composing multimedia signals from multiple sources and apparatus for running games.
Abstract: A technique for the masking of quantizing noise in the coding of audio signals is usable with the types of channel coding known as "noiseless" or Huffman coding and with variable radix packing. In a multichannel environment, noise masking thresholds may be determined by combining sets of power spectra for each of the channels. The stereophonic embodiment eliminates redundancies in the sum and difference signals, so that the stereo coding uses significantly less than twice the bit rate of the comparable monaural signal. The technique can be used both in transmission of signals and in recording for reproduction, particularly recording and reproduction of music. Compatibility with the ISDN transmission rates known as 1B, 2B and 3B rates has been achieved.
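The sum and difference signals mentioned in this abstract can be illustrated with a minimal sketch. The 1/2 scaling and the function names are assumptions, and the sketch shows only the lossless channel transform, not the masking-threshold computation or the entropy coding.

```python
def to_sum_difference(left, right):
    """Convert left/right samples to (scaled) sum and difference signals."""
    total = [(l + r) / 2 for l, r in zip(left, right)]
    diff = [(l - r) / 2 for l, r in zip(left, right)]
    return total, diff


def from_sum_difference(total, diff):
    """Invert to_sum_difference, recovering left and right."""
    left = [t + d for t, d in zip(total, diff)]
    right = [t - d for t, d in zip(total, diff)]
    return left, right


def energy(signal):
    """Sum of squared samples, a crude proxy for how many bits a signal needs."""
    return sum(v * v for v in signal)
```

When the two channels are highly correlated, the difference signal carries little energy, which is one reason stereo coding can use well under twice the monaural bit rate.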
Abstract: Described is a technique for an interactive television ("ITV") system wherein viewers are allowed to select a desired level of advertisements with which they are provided. The technique comprises transmitting to an interactive services subscriber location a program and a set of advertisements (collectively referred to as a "show"). The set of advertisements is selected based upon an input from a user associated with the interactive services subscriber location. The input comprises an indicator of an amount of advertisements in the set of advertisements. Another feature of the ITV system described is that it allows for adjusting an amount of a bill of a subscriber to interactive television services based upon the amount of advertisements viewed in a show.
April 29, 1994
Date of Patent: July 2, 1996
Donald E. Blahut, William M. Schell, Guy A. Story, Edward S. Szurkowski
Abstract: An image-coding system reduces image data redundancies and perceptual irrelevancies through progressive sub-band coding. The image is separated into a plurality of sub-bands. From this sub-band information, a perceptual metric, based on the properties of the sub-band filters, quantizer error distribution, and properties of the human visual system, is determined which provides the maximum amount of coding noise that may be introduced to each pixel in every sub-band without causing perceptible degradation of the coded image. This perceptual metric is used to adjust the quantizer used in encoding each sub-band signal. In addition, redundancy in the output of the quantizer is reduced using a multidimensional Huffman compression scheme.
Abstract: A circuit prevents optical screening systems from being pinned in an inoperative state under certain light conditions. The circuit comprises a photodetector for converting scattered light from a visual code (e.g., universal product code) into current and a current limiting circuit coupled to the photodetector for limiting the current capable of passing through the photodetector.
Abstract: Speaker independent recognition of small vocabularies, spoken over the long distance telephone network, is achieved using two types of models, one type for defined vocabulary words (e.g., collect, calling-card, person, third-number and operator), and one type for extraneous input which ranges from non-speech sounds to groups of non-vocabulary words (e.g., "I want to make a collect call please"). For this type of key word spotting, modifications are made to a connected word speech recognition algorithm based on state-transitional (hidden Markov) models which allow it to recognize words from a pre-defined vocabulary list spoken in an unconstrained fashion. Statistical models of both the actual vocabulary words and the extraneous speech and background noises are created. A syntax-driven connected word recognition system is then used to find the best sequence of extraneous input and vocabulary word models for matching the actual input speech.
October 6, 1993
Date of Patent: April 16, 1996
Chin H. Lee, Lawrence R. Rabiner, Jay G. Wilpon
Abstract: Coding of high quality stereophonic audio signals is accomplished in a perceptual filterbank coder which exploits interchannel redundancies and psychoacoustics. Using perceptual principles, switching between a normal and a short window of input samples improves output signal quality for certain input signals, particularly those having a rapid attack. Switching is also accomplished between coding of left and right channels and so-called sum and difference channels in response to particular signal conditions.
Abstract: A system for and method of enhancing image/video signals to be decoded is disclosed. The system for and method of post-filtering preferably uses a temporal filter and a spatial filter, both of which are preferably adaptive, although neither is required to be. The system for and method of post-filtering may also be used with only an adaptive spatial post-filter. In this case, the performance is upper-bounded by the performance of systems using both the adaptive temporal and adaptive spatial post-filter.
Abstract: Different components of television programs, such as the video and audio components or different time segments of the program, are assigned to different channels transmitted in the form of packetized digital information in at least one of the channels of a multi-channel cable television distribution system. A particular subscriber's converter is configured to receive a particular subset of program components by enabling the converter to receive the virtual channels carrying the components in such subset. An embodiment is disclosed in which different combinations of program segments are enabled for different subscribers. Another embodiment is disclosed in which the viewing of a program by multiple subscribers who begin viewing at different times is synchronized by using "filler" program segments having different lengths.
Abstract: The invention relates to a system for transmitting stored programs, such as movies and musical works, to customers via a distribution network, such as a cable television system. The programs are stored in compressed form in a program library, such as a tape cartridge library. On receipt of a request for a program, a request processor sends control messages causing a data block comprising the requested program to be read from the program library at high speed and stored in a large dynamic random access memory (DRAM) in a server. The server then sends the program from DRAM over the distribution system to a customer as a series of digital packets. Each instance of sending a program is managed by a separate command word. The customer can request such operations as "fast forward" by sending control messages that change pointers in the command word. Command words can be grouped to send multiple audio and/or video overlays simultaneously and linked to send program sequences.