Correlation Patents (Class 704/263)
- Patent number: 11935515
  Abstract: A method of generating a synthetic voice by capturing audio data, cutting it into discrete phoneme and pitch segments, forming superior phoneme and pitch segments by averaging segments having similar phoneme, pitch, and other sound qualities, and training neural networks to correctly concatenate the segments.
  Type: Grant
  Filed: December 27, 2021
  Date of Patent: March 19, 2024
  Inventor: Claude Polonov
- Patent number: 11636845
  Abstract: A method includes generating first synthesized speech by using text and a first emotion information vector configured for the text, extracting a second emotion information vector included in the first synthesized speech, determining whether correction of the second emotion information vector is needed by comparing a loss value, calculated by using the first emotion information vector and the second emotion information vector, with a preconfigured threshold, re-performing speech synthesis by using a third emotion information vector generated by correcting the second emotion information vector, and outputting the generated synthesized speech, thereby configuring emotion information of speech in a more effective manner. A speech synthesis apparatus may be associated with an artificial intelligence module, drone (unmanned aerial vehicle, UAV), robot, augmented reality (AR) devices, virtual reality (VR) devices, devices related to 5G services, and the like.
  Type: Grant
  Filed: July 14, 2020
  Date of Patent: April 25, 2023
  Assignee: LG ELECTRONICS INC.
  Inventors: Siyoung Yang, Yongchul Park, Sungmin Han, Sangki Kim, Juyeong Jang, Minook Kim
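The abstract above describes a closed correction loop: synthesize with a requested emotion vector, measure the emotion actually realized, and re-synthesize with a corrected vector if the loss exceeds a threshold. Below is a minimal Python sketch of that control flow only; the `synthesize` and `extract_emotion` functions are hypothetical stand-ins, not LG's models, and the additive correction rule is an assumption.

```python
import numpy as np

# Toy stand-ins; a real system would call a TTS model and an emotion recognizer.
def synthesize(text, emotion_vec, rng):
    """Hypothetical: return 'speech' whose realized emotion drifts from the request."""
    drift = rng.normal(scale=0.1, size=emotion_vec.shape)
    return {"text": text, "realized_emotion": emotion_vec + drift}

def extract_emotion(speech):
    """Hypothetical: recover the emotion vector carried by the synthesized speech."""
    return speech["realized_emotion"]

def synthesize_with_correction(text, target_emotion, threshold=0.05, max_iters=5, seed=0):
    rng = np.random.default_rng(seed)
    target = np.asarray(target_emotion, dtype=float)
    emotion = target.copy()                        # first emotion information vector
    speech = synthesize(text, emotion, rng)
    for _ in range(max_iters):
        realized = extract_emotion(speech)         # second emotion information vector
        loss = np.mean((target - realized) ** 2)
        if loss <= threshold:                      # no correction needed
            break
        # third emotion information vector: push the request against the observed error
        emotion = emotion + (target - realized)
        speech = synthesize(text, emotion, rng)
    return speech

print(synthesize_with_correction("hello", [0.8, 0.1, 0.1]))
```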
- Patent number: 9875301
  Abstract: Systems and methods for learning topic models from unstructured data and applying the learned topic models to recognize semantics for new data items are described herein. In at least one embodiment, a corpus of multimedia data items associated with a set of labels may be processed to generate a refined corpus of multimedia data items associated with the set of labels. Such processing may include arranging the multimedia data items in clusters based on similarities of extracted multimedia features and generating intra-cluster and inter-cluster features. The intra-cluster and the inter-cluster features may be used for removing multimedia data items from the corpus to generate the refined corpus. The refined corpus may be used for training topic models for identifying labels. The resulting models may be stored and subsequently used for identifying semantics of a multimedia data item input by a user.
  Type: Grant
  Filed: April 30, 2014
  Date of Patent: January 23, 2018
  Assignee: Microsoft Technology Licensing, LLC
  Inventors: Xian-Sheng Hua, Jin Li, Yoshitaka Ushiku
- Patent number: 8972258
  Abstract: Techniques disclosed herein include using a Maximum A Posteriori (MAP) adaptation process that imposes sparseness constraints to generate acoustic parameter adaptation data for specific users based on a relatively small set of training data. The resulting acoustic parameter adaptation data identifies changes for a relatively small fraction of acoustic parameters from a baseline acoustic speech model instead of changes to all acoustic parameters. This results in user-specific acoustic parameter adaptation data that is several orders of magnitude smaller than storage amounts otherwise required for a complete acoustic model. This provides customized acoustic speech models that increase recognition accuracy at a fraction of expected data storage requirements.
  Type: Grant
  Filed: May 22, 2014
  Date of Patent: March 3, 2015
  Assignee: Nuance Communications, Inc.
  Inventors: Vaibhava Goel, Peder A. Olsen, Steven J. Rennie, Jing Huang
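The storage saving claimed above comes from keeping only the parameters that actually moved away from the baseline model. A minimal sketch of storing and applying such a sparse delta is shown below; the relative-change threshold is an illustrative stand-in, not the patented MAP sparseness criterion.

```python
import numpy as np

def sparse_adaptation_delta(baseline, adapted, rel_threshold=0.05):
    """Keep only parameters whose adapted value moved noticeably from the baseline.
    The relative-change test here is an assumption standing in for the MAP
    sparseness constraint described in the abstract."""
    change = np.abs(adapted - baseline)
    keep = change > rel_threshold * (np.abs(baseline) + 1e-12)
    idx = np.flatnonzero(keep)
    return idx, adapted[idx]            # typically far smaller than the full model

def apply_delta(baseline, idx, values):
    user_model = baseline.copy()
    user_model[idx] = values
    return user_model

baseline = np.random.randn(1_000_000).astype(np.float32)   # stand-in acoustic parameters
adapted = baseline.copy()
adapted[:500] += 0.5                                        # user-specific shifts
idx, vals = sparse_adaptation_delta(baseline, adapted)
print(f"stored {idx.size} of {baseline.size} parameters")
restored = apply_delta(baseline, idx, vals)
```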
- Patent number: 8930200
  Abstract: A vector joint encoding/decoding method and a vector joint encoder/decoder are provided. More than two vectors are jointly encoded, and an encoding index of at least one vector is split and then combined between different vectors, so that the encoding idle spaces of different vectors can be recombined, which saves encoding bits. Because an encoding index of a vector is split and the shorter split indexes are then recombined, the bit-width requirements of the operating parts in encoding/decoding calculation are also reduced.
  Type: Grant
  Filed: July 24, 2013
  Date of Patent: January 6, 2015
  Assignee: Huawei Technologies Co., Ltd
  Inventors: Fuwei Ma, Dejun Zhang, Lei Miao, Fengyan Qi
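The "encoding idle space" being recombined is the waste that appears when each index range is rounded up to a whole number of bits. A small arithmetic sketch of that underlying saving follows; it shows plain mixed-radix packing of two indices, not the patent's specific split-and-recombine scheme.

```python
from math import ceil, log2

def joint_encode(i1, n1, i2, n2):
    """Pack two indices with ranges n1 and n2 into one index in [0, n1*n2)."""
    return i1 * n2 + i2

def joint_decode(code, n1, n2):
    return code // n2, code % n2

n1, n2 = 11, 23                                   # per-vector index ranges (illustrative)
separate_bits = ceil(log2(n1)) + ceil(log2(n2))   # 4 + 5 = 9 bits
joint_bits = ceil(log2(n1 * n2))                  # ceil(log2(253)) = 8 bits
print(separate_bits, joint_bits)                  # the idle space of each range is reused

code = joint_encode(7, n1, 19, n2)
assert joint_decode(code, n1, n2) == (7, 19)
```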
- Publication number: 20140358547
  Abstract: Systems and methods for prosody prediction include extracting features from runtime data using a parametric model. The features from runtime data are compared with features from training data using an exemplar-based model to predict prosody of the runtime data. The features from the training data are paired with exemplars from the training data and stored on a computer readable storage medium.
  Type: Application
  Filed: September 17, 2013
  Publication date: December 4, 2014
  Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
  Inventors: Raul Fernandez, Asaf Rendel
- Publication number: 20140358546
  Abstract: Systems and methods for prosody prediction include extracting features from runtime data using a parametric model. The features from runtime data are compared with features from training data using an exemplar-based model to predict prosody of the runtime data. The features from the training data are paired with exemplars from the training data and stored on a computer readable storage medium.
  Type: Application
  Filed: August 28, 2013
  Publication date: December 4, 2014
  Applicant: International Business Machines Corporation
  Inventors: Raul Fernandez, Asaf Rendel
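The two IBM prosody publications above share one core step: runtime features are matched against training features that are stored paired with prosody exemplars. A minimal nearest-neighbour sketch of that lookup is below; the Euclidean distance and the pitch-contour exemplars are assumptions for illustration, not the disclosed exemplar model.

```python
import numpy as np

class ExemplarProsodyModel:
    """Minimal sketch: training features are paired with prosody exemplars
    (here, pitch contours); runtime features are matched by nearest neighbour."""
    def __init__(self, train_features, exemplars):
        self.train_features = np.asarray(train_features, dtype=float)
        self.exemplars = exemplars                     # list of pitch contours

    def predict(self, runtime_features):
        runtime_features = np.asarray(runtime_features, dtype=float)
        dists = np.linalg.norm(self.train_features - runtime_features, axis=1)
        return self.exemplars[int(np.argmin(dists))]

# toy data: 3 training units with 4-dimensional features, each paired with an F0 contour
feats = [[0.1, 0.2, 0.0, 1.0], [0.9, 0.1, 0.3, 0.0], [0.4, 0.8, 0.5, 0.2]]
contours = [np.array([120.0, 125.0, 130.0]),
            np.array([200.0, 190.0, 180.0]),
            np.array([150.0, 150.0, 150.0])]
model = ExemplarProsodyModel(feats, contours)
print(model.predict([0.85, 0.15, 0.25, 0.05]))         # picks the second exemplar
```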
- Patent number: 8788268
  Abstract: A speech synthesis system can select recorded speech fragments, or acoustic units, from a very large database of acoustic units to produce artificial speech. When a pair of acoustic units in the database does not have an associated concatenation cost, the system assigns a default concatenation cost. The system then synthesizes speech, identifies the acoustic unit sequential pairs generated and their respective concatenation costs, and stores those concatenation costs likely to occur.
  Type: Grant
  Filed: November 19, 2012
  Date of Patent: July 22, 2014
  Assignee: AT&T Intellectual Property II, L.P.
  Inventors: Mark Charles Beutnagel, Mehryar Mohri, Michael Dennis Riley
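The mechanism described above is essentially a cost cache with a default fallback: pairs never observed in synthesis get a default concatenation cost, while pairs that do occur get their true cost computed and kept. A small sketch of that data structure follows, with a toy cost function standing in for the real spectral-mismatch measure.

```python
class ConcatCostCache:
    """Sketch of the caching idea: costs for unit pairs actually produced during
    synthesis are computed and stored; unseen pairs fall back to a default cost."""
    def __init__(self, compute_cost, default_cost=1.0):
        self.compute_cost = compute_cost      # expensive mismatch measure (toy here)
        self.default_cost = default_cost
        self.cache = {}

    def lookup(self, unit_a, unit_b):
        return self.cache.get((unit_a, unit_b), self.default_cost)

    def record(self, unit_a, unit_b):
        # called for sequential pairs observed while synthesizing speech
        self.cache[(unit_a, unit_b)] = self.compute_cost(unit_a, unit_b)

cache = ConcatCostCache(lambda a, b: abs(hash((a, b))) % 100 / 100.0)  # toy cost
print(cache.lookup("unit_17", "unit_42"))   # default until the pair is observed
cache.record("unit_17", "unit_42")
print(cache.lookup("unit_17", "unit_42"))
```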
- Patent number: 8768704
  Abstract: An input signal that includes linguistic content in a first language may be received by a computing device. The linguistic content may include text or speech. Based on an acoustic feature comparison between a plurality of first-language speech sounds and a plurality of second-language speech sounds, the computing device may associate the linguistic content in the first language with one or more phonemes from a second language. The computing device may also determine a phonemic representation of the linguistic content in the first language based on use of the one or more phonemes from the second language. The phonemic representation may be indicative of a pronunciation of the linguistic content in the first language according to speech sounds of the second language.
  Type: Grant
  Filed: October 14, 2013
  Date of Patent: July 1, 2014
  Assignee: Google Inc.
  Inventors: Javier Gonzalvo Fructuoso, Ioannis Agiomyrgiannakis
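The central step above, associating first-language sounds with second-language phonemes via an acoustic feature comparison, can be illustrated with a nearest-phoneme lookup. The sketch below uses made-up two-dimensional "acoustic features" and a plain Euclidean distance; both are assumptions, not the comparison actually used in the patent.

```python
import numpy as np

def map_to_second_language(l1_phoneme_feats, l2_inventory):
    """For each first-language phoneme (acoustic feature vector), pick the
    closest second-language phoneme by acoustic-feature distance."""
    names = list(l2_inventory)
    l2_matrix = np.array([l2_inventory[n] for n in names], dtype=float)
    mapping = {}
    for phoneme, feats in l1_phoneme_feats.items():
        dists = np.linalg.norm(l2_matrix - np.asarray(feats, dtype=float), axis=1)
        mapping[phoneme] = names[int(np.argmin(dists))]
    return mapping

# toy 2-D "acoustic features", purely illustrative
l1 = {"θ": [0.2, 0.7], "ð": [0.25, 0.55]}             # first-language dental fricatives
l2 = {"t": [0.3, 0.8], "s": [0.1, 0.6], "d": [0.35, 0.5]}
print(map_to_second_language(l1, l2))                 # phonemic representation in L2 sounds
```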
- Patent number: 8738376
  Abstract: Techniques disclosed herein include using a Maximum A Posteriori (MAP) adaptation process that imposes sparseness constraints to generate acoustic parameter adaptation data for specific users based on a relatively small set of training data. The resulting acoustic parameter adaptation data identifies changes for a relatively small fraction of acoustic parameters from a baseline acoustic speech model instead of changes to all acoustic parameters. This results in user-specific acoustic parameter adaptation data that is several orders of magnitude smaller than storage amounts otherwise required for a complete acoustic model. This provides customized acoustic speech models that increase recognition accuracy at a fraction of expected data storage requirements.
  Type: Grant
  Filed: October 28, 2011
  Date of Patent: May 27, 2014
  Assignee: Nuance Communications, Inc.
  Inventors: Vaibhava Goel, Peder A. Olsen, Steven J. Rennie, Jing Huang
- Patent number: 8731938
  Abstract: A computer-implemented system and method for identifying and masking special information within recorded speech is provided. A field for entry of special information is identified. Movement of a pointer device along a trajectory towards the field is also identified. A correlation of the pointer device movement and entry of the special information is determined based on a location of the trajectory in relation to the field. A threshold is applied to the correlation. The special information is received as verbal speech. A recording of the special information is rendered unintelligible when the threshold is satisfied.
  Type: Grant
  Filed: April 26, 2013
  Date of Patent: May 20, 2014
  Assignee: Intellisist, Inc.
  Inventor: G. Kevin Doren
- Patent number: 8706493
  Abstract: In one embodiment of a controllable prosody re-estimation system, a TTS/STS engine consists of a prosody prediction/estimation module, a prosody re-estimation module, and a speech synthesis module. The prosody prediction/estimation module generates predicted or estimated prosody information. The prosody re-estimation module then re-estimates the predicted or estimated prosody information and produces new prosody information, according to a set of controllable parameters provided by a controllable prosody parameter interface. The new prosody information is provided to the speech synthesis module to produce a synthesized speech.
  Type: Grant
  Filed: July 11, 2011
  Date of Patent: April 22, 2014
  Assignee: Industrial Technology Research Institute
  Inventors: Cheng-Yuan Lin, Chien-Hung Huang, Chih-Chung Kuo
- Patent number: 8576961
  Abstract: A method for determining an overlap and add length estimate comprises determining a plurality of correlation values of a plurality of ordered frequency domain samples obtained from a data frame; comparing the correlation values of a first subset of the samples to a first predetermined threshold to determine a first edge sample; comparing the correlation values of a second subset of the samples to a second predetermined threshold to determine a second edge sample; using the first and second edge samples to determine an overlap and add length estimate; and providing the overlap and add length estimate to an overlap and add circuit.
  Type: Grant
  Filed: June 15, 2009
  Date of Patent: November 5, 2013
  Assignee: Olympus Corporation
  Inventors: Haidong Zhu, Dumitru Mihai Ionescu, Abu Amanullah
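A simplified reading of the edge-finding procedure above: scan one subset of the ordered correlation values for the first sample exceeding a threshold, scan another subset for the last sample exceeding a second threshold, and take the span between the two edges as the overlap-and-add length. The sketch below treats the two subsets as the leading and trailing halves, which is an assumption on my part.

```python
import numpy as np

def ola_length_estimate(corr, low_thresh, high_thresh):
    """Simplified sketch: find a leading edge in the first half and a trailing
    edge in the second half of the ordered correlation values, and use the
    distance between the two edges as the overlap-and-add length estimate."""
    corr = np.asarray(corr, dtype=float)
    half = len(corr) // 2
    first_half, second_half = corr[:half], corr[half:]

    lead = np.flatnonzero(first_half >= low_thresh)
    trail = np.flatnonzero(second_half >= high_thresh)
    if lead.size == 0 or trail.size == 0:
        return None                                   # no reliable edges found
    first_edge = int(lead[0])
    second_edge = int(half + trail[-1])
    return second_edge - first_edge + 1

corr = np.concatenate([np.linspace(0, 1, 16), np.linspace(1, 0, 16)])
print(ola_length_estimate(corr, low_thresh=0.5, high_thresh=0.5))   # 16
```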
- Patent number: 8566106
  Abstract: A method and device for searching an algebraic codebook during encoding of a sound signal, wherein the algebraic codebook comprises a set of codevectors formed of a number of pulse positions and a number of pulses distributed over the pulse positions. In the algebraic codebook searching method and device, a reference signal for use in searching the algebraic codebook is calculated. In a first stage, a position of a first pulse is determined in relation with the reference signal and among the number of pulse positions. In each of a number of stages subsequent to the first stage, (a) an algebraic codebook gain is recomputed, (b) the reference signal is updated using the recomputed algebraic codebook gain and (c) a position of another pulse is determined in relation with the updated reference signal and among the number of pulse positions.
  Type: Grant
  Filed: September 11, 2008
  Date of Patent: October 22, 2013
  Assignee: Voiceage Corporation
  Inventors: Redwan Salami, Vaclav Eksler, Milan Jelinek
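The staged structure described above (place a pulse, recompute the gain, refresh the reference, place the next pulse) resembles a greedy matching-pursuit search. The toy sketch below follows only that staged structure; the way it picks positions and updates the reference is a simplification of my own, not the patented search.

```python
import numpy as np

def greedy_pulse_search(target, positions, n_pulses):
    """Toy greedy analogue of the staged search: one pulse per stage, a scalar
    gain recomputed each stage, and a refreshed reference for the next stage."""
    codevector = np.zeros_like(target)
    reference = target.copy()                    # initial reference signal
    chosen, gain = [], 0.0
    for _ in range(n_pulses):
        free = [p for p in positions if p not in chosen]
        # pick the allowed position where the reference is largest in magnitude
        pos = max(free, key=lambda p: abs(reference[p]))
        codevector[pos] += np.sign(reference[pos])
        chosen.append(pos)
        # recompute the gain that best scales the codevector toward the target
        gain = np.dot(target, codevector) / np.dot(codevector, codevector)
        # refresh the reference: what the scaled codevector has not yet explained
        reference = target - gain * codevector
    return chosen, gain

rng = np.random.default_rng(1)
target = rng.normal(size=64)
tracks = list(range(0, 64, 4))                   # allowed pulse positions (toy "track")
print(greedy_pulse_search(target, tracks, n_pulses=4))
```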
- Patent number: 8494845
  Abstract: Provided is a signal distortion elimination apparatus comprising: an inverse filter application means that outputs the signal obtained by applying an inverse filter to an observed signal as a restored signal when a predetermined iteration termination condition is met and outputs the signal obtained by applying the inverse filter to the observed signal as an ad-hoc signal when the predetermined iteration termination condition is not met; a prediction error filter calculation means that segments the ad-hoc signal into frames and outputs a prediction error filter of each frame obtained by performing linear prediction analysis of the ad-hoc signal of each frame; an inverse filter calculation means that calculates an inverse filter such that a concatenation of innovation estimates of the respective frames becomes mutually independent among their samples, where the innovation estimate of a single frame (an innovation estimate) is the signal obtained by applying the prediction error filter of the corresponding frame
  Type: Grant
  Filed: February 16, 2007
  Date of Patent: July 23, 2013
  Assignee: Nippon Telegraph and Telephone Corporation
  Inventors: Takuya Yoshioka, Takafumi Hikichi, Masato Miyoshi
- Patent number: 8370153
  Abstract: A speech analyzer includes a vocal tract and sound source separating unit which separates a vocal tract feature and a sound source feature from an input speech, based on a speech generation model, a fundamental frequency stability calculating unit which calculates a temporal stability of a fundamental frequency of the input speech in the sound source feature, from the separated sound source feature, a stable analyzed period extracting unit which extracts time information of a stable period, based on the temporal stability, and a vocal tract feature interpolation unit which interpolates a vocal tract feature which is not included in the stable period, using a vocal tract feature included in the extracted stable period.
  Type: Grant
  Filed: May 3, 2010
  Date of Patent: February 5, 2013
  Assignee: Panasonic Corporation
  Inventors: Yoshifumi Hirose, Takahiro Kamai
- Patent number: 8321225
  Abstract: The subject matter of this specification can be implemented in, among other things, a computer-implemented method including receiving text to be synthesized as a spoken utterance. The method includes analyzing the received text to determine attributes of the received text and selecting one or more utterances from a database based on a comparison between the attributes of the received text and attributes of text representing the stored utterances. The method includes determining, for each utterance, a distance between a contour of the utterance and a hypothetical contour of the spoken utterance, the determination based on a model that relates distances between pairs of contours of the utterances to relationships between attributes of text for the pairs. The method includes selecting a final utterance having a contour with a closest distance to the hypothetical contour and generating a contour for the received text based on the contour of the final utterance.
  Type: Grant
  Filed: November 14, 2008
  Date of Patent: November 27, 2012
  Assignee: Google Inc.
  Inventors: Martin Jansche, Michael D. Riley, Andrew M. Rosenberg, Terry Tai
- Patent number: 8315872
  Abstract: A speech synthesis system can select recorded speech fragments, or acoustic units, from a very large database of acoustic units to produce artificial speech. The selected acoustic units are chosen to minimize a combination of target and concatenation costs for a given sentence. However, as concatenation costs, which are measures of the mismatch between sequential pairs of acoustic units, are expensive to compute, processing can be greatly reduced by pre-computing and caching the concatenation costs. Unfortunately, the number of possible sequential pairs of acoustic units makes such caching prohibitive. However, statistical experiments reveal that while about 85% of the acoustic units are typically used in common speech, less than 1% of the possible sequential pairs of acoustic units occur in practice.
  Type: Grant
  Filed: November 29, 2011
  Date of Patent: November 20, 2012
  Assignee: AT&T Intellectual Property II, L.P.
  Inventors: Mark Charles Beutnagel, Mehryar Mohri, Michael Dennis Riley
- Patent number: 8214216
  Abstract: A simply configured speech synthesis device and the like for producing a natural synthetic speech at high speed. When data representing a message template is supplied, a voice unit editor (5) searches a voice unit database (7) for voice unit data on a voice unit whose sound matches a voice unit in the message template. Further, the voice unit editor (5) predicts the cadence of the message template and selects, one at a time, a best match of each voice unit in the message template from the voice unit data that has been retrieved, according to the cadence prediction result. For a voice unit for which no match can be selected, an acoustic processor (41) is instructed to supply waveform data representing the waveform of each unit voice. The voice unit data that is selected and the waveform data that is supplied by the acoustic processor (41) are combined to generate data representing a synthetic speech.
  Type: Grant
  Filed: June 3, 2004
  Date of Patent: July 3, 2012
  Assignee: Kabushiki Kaisha Kenwood
  Inventor: Yasushi Sato
- Patent number: 8145477
  Abstract: Systems, methods, and apparatus described include waveform alignment operations in which a single set of evaluated cosines and sines is used to calculate cross-correlations of two periodic waveforms at two different phase shifts.
  Type: Grant
  Filed: December 1, 2006
  Date of Patent: March 27, 2012
  Inventors: Sharath Manjunath, Ananthapadmanabhan A. Kandhadai
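The reuse of one set of evaluated cosines and sines rests on a standard trigonometric identity: once the correlations of a waveform with cos(ωn) and sin(ωn) are known, its correlation with cos(ωn + φ) at any phase shift φ follows from two multiplications, with no new trig evaluations. The check below demonstrates only that identity, not the patented alignment procedure.

```python
import numpy as np

n = np.arange(256)
w = 2 * np.pi * 7 / 256                      # analysis frequency
x = np.random.default_rng(0).normal(size=n.size)

# Single set of evaluated cosines and sines.
cos_base, sin_base = np.cos(w * n), np.sin(w * n)
C, S = np.dot(x, cos_base), np.dot(x, sin_base)

for phi in (0.3, 1.2):                       # two different phase shifts
    direct = np.dot(x, np.cos(w * n + phi))          # brute-force reference
    reused = np.cos(phi) * C - np.sin(phi) * S       # reuses the same cos/sin set
    assert np.isclose(direct, reused)
print("correlations at both phase shifts reproduced without new trig evaluations")
```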
- Patent number: 8078466
  Abstract: A method for generating animated sequences of talking heads in text-to-speech applications wherein a processor samples a plurality of frames comprising image samples. The processor reads first data comprising one or more parameters associated with noise-producing orifice images of sequences of at least three concatenated phonemes which correspond to an input stimulus. The processor reads, based on the first data, second data comprising images of a noise-producing entity. The processor generates an animated sequence of the noise-producing entity.
  Type: Grant
  Filed: November 30, 2009
  Date of Patent: December 13, 2011
  Assignee: AT&T Intellectual Property II, L.P.
  Inventors: Eric Cosatto, Hans Peter Graf, Juergen Schroeter
- Patent number: 7835909
  Abstract: A method and apparatus for normalizing a histogram utilizing a backward cumulative histogram which can cumulate a probability distribution function in an order from a greatest to smallest value so as to estimate a noise robust histogram. A method of normalizing a speech feature vector includes: extracting the speech feature vector from a speech signal; calculating a probability distribution function using the extracted speech feature vector; calculating a backward cumulative distribution function by cumulating the probability distribution function in an order from a largest to smallest value; and normalizing a histogram using the backward cumulative distribution function.
  Type: Grant
  Filed: December 12, 2006
  Date of Patent: November 16, 2010
  Assignee: Samsung Electronics Co., Ltd.
  Inventors: So-Young Jeong, Gil Jin Jang, Kwang Cheol Oh
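The abstract lists the steps explicitly: estimate a PDF from the feature values, accumulate it from the largest value down, and normalize with that backward CDF. The sketch below follows those steps; the choice of a standard Gaussian as the normalization target and the inversion via 1 − B(x) are my assumptions, not details taken from the patent.

```python
import numpy as np
from scipy.stats import norm

def backward_cdf_normalize(features, n_bins=64):
    """Histogram-normalize a 1-D feature track using a backward cumulative
    histogram (accumulated from the largest bin down to the smallest),
    mapping onto a standard Gaussian."""
    hist, edges = np.histogram(features, bins=n_bins)
    pdf = hist / hist.sum()
    backward_cdf = np.cumsum(pdf[::-1])[::-1]            # roughly P(X >= bin), largest-first
    bin_idx = np.clip(np.digitize(features, edges[:-1]) - 1, 0, n_bins - 1)
    b = backward_cdf[bin_idx]
    # Map through the complementary probability; the clip keeps ppf finite.
    return norm.ppf(np.clip(1.0 - b + pdf[bin_idx], 1e-6, 1 - 1e-6))

noisy_features = np.random.default_rng(0).gamma(2.0, size=1000)   # skewed toy features
normalized = backward_cdf_normalize(noisy_features)
print(normalized.mean(), normalized.std())                        # roughly 0 and 1
```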
- Patent number: 7756715
  Abstract: Apparatus, method, and medium for processing an audio signal using a correlation between bands are provided. The apparatus includes an encoding unit encoding an input audio signal and a decoding unit decoding the encoded input audio signal.
  Type: Grant
  Filed: November 17, 2005
  Date of Patent: July 13, 2010
  Assignee: Samsung Electronics Co., Ltd.
  Inventors: Junghoe Kim, Dohyung Kim, Sihwa Lee
- Patent number: 7747441
  Abstract: A high quality speech is reproduced with a small data amount in speech coding and decoding for performing compression coding and decoding of a speech signal to a digital signal. In a speech coding method according to code-excited linear prediction (CELP) speech coding, the noise level of the speech in the coding period concerned is evaluated by using a code or coding result of at least one of spectrum information, power information, and pitch information, and different excitation codebooks are used based on the evaluation result.
  Type: Grant
  Filed: January 16, 2007
  Date of Patent: June 29, 2010
  Assignee: Mitsubishi Denki Kabushiki Kaisha
  Inventor: Tadashi Yamaura
- Patent number: 7660714
  Abstract: A noise suppression device comprises subband SN ratio calculation means which receives a noise likeness signal, an input signal spectrum and a subband-based estimated noise spectrum, calculates the subband-based input signal average spectrum, calculates a subband-based mixture ratio of the subband-based estimated noise spectrum to the subband-based input signal average spectrum on the basis of the noise likeness signal, and calculates the subband-based SN ratio on the basis of the subband-based estimated noise spectrum, the subband-based input signal average spectrum and the mixture ratio.
  Type: Grant
  Filed: October 29, 2007
  Date of Patent: February 9, 2010
  Assignee: Mitsubishi Denki Kabushiki Kaisha
  Inventors: Satoru Furuta, Shinya Takahashi
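The abstract names three ingredients per subband: the input-signal average spectrum, the estimated noise spectrum, and a mixture ratio driven by a noise-likeness signal. It does not spell out how they are combined, so the sketch below is only one plausible arrangement; the linear scaling of the noise estimate by the noise-likeness value is entirely an assumption.

```python
import numpy as np

def subband_snr(input_spectrum, noise_estimate, noise_likeness, n_subbands=8, eps=1e-10):
    """Sketch: average the spectra per subband, derive a mixture ratio from a
    noise-likeness value assumed to lie in [0, 1], and form a per-subband SNR in dB."""
    bands_in = np.array_split(np.asarray(input_spectrum, dtype=float), n_subbands)
    bands_nz = np.array_split(np.asarray(noise_estimate, dtype=float), n_subbands)
    input_avg = np.array([b.mean() for b in bands_in])
    noise_avg = np.array([b.mean() for b in bands_nz])
    # Assumed rule: the mixture ratio scales the noise estimate, relative to the
    # subband input average, by how noise-like the current frame appears.
    mixture_ratio = np.clip(noise_likeness * noise_avg / (input_avg + eps), 0.0, 1.0)
    effective_noise = mixture_ratio * input_avg
    return 10.0 * np.log10((input_avg + eps) / (effective_noise + eps))

rng = np.random.default_rng(0)
speech_plus_noise = rng.gamma(2.0, size=256)          # toy power spectrum
noise_floor = np.full(256, 0.5)                       # toy noise estimate
print(subband_snr(speech_plus_noise, noise_floor, noise_likeness=0.7))
```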
- Patent number: 7630897
  Abstract: A method for generating animated sequences of talking heads in text-to-speech applications wherein a processor samples a plurality of frames comprising image samples. The processor reads first data comprising one or more parameters associated with noise-producing orifice images of sequences of at least three concatenated phonemes which correspond to an input stimulus. The processor reads, based on the first data, second data comprising images of a noise-producing entity. The processor generates an animated sequence of the noise-producing entity.
  Type: Grant
  Filed: May 19, 2008
  Date of Patent: December 8, 2009
  Assignee: AT&T Intellectual Property II, L.P.
  Inventors: Eric Cosatto, Hans Peter Graf, Juergen Schroeter
- Publication number: 20090234653
  Abstract: Provided is an audio decoding device performing frame loss compensation capable of obtaining decoded audio that sounds natural to the ear with little noise. The audio decoding device includes: a non-cyclic pulse waveform detection unit (19) for detecting a non-cyclic pulse waveform section in an (n−1)-th frame which is repeatedly used with a pitch cycle in the n-th frame upon compensation of loss of the n-th frame; a non-cyclic pulse waveform suppression unit (17) for suppressing a non-cyclic pulse waveform by replacing an audio source signal existing in the non-cyclic pulse waveform section in the (n−1)-th frame by a noise signal; and a synthesis filter (20) for using a linear prediction coefficient decoded by an LPC decoding unit (11) to perform synthesis, using the audio source signal of the (n−1)-th frame from the non-cyclic pulse waveform suppression unit (17) as a drive audio source, thereby obtaining the decoded audio signal of the n-th frame.
  Type: Application
  Filed: December 26, 2006
  Publication date: September 17, 2009
  Applicant: MATSUSHITA ELECTRIC INDUSTRIAL CO., LTD.
  Inventors: Takuya Kawashima, Hiroyuki Ehara
- Patent number: 7257535
  Abstract: A system and method are provided for processing audio and speech signals using a pitch and voicing dependent spectral estimation algorithm (voicing algorithm) to accurately represent voiced speech, unvoiced speech, and mixed speech in the presence of background noise, and background noise with a single model. The present invention also modifies the synthesis model based on an estimate of the current input signal to improve the perceptual quality of the speech and background noise under a variety of input conditions. The present invention also improves the voicing dependent spectral estimation algorithm robustness by introducing the use of a Multi-Layer Neural Network in the estimation process. The voicing dependent spectral estimation algorithm provides an accurate and robust estimate of the voicing probability under a variety of background noise conditions. This is essential to providing high quality intelligible speech in the presence of background noise.
  Type: Grant
  Filed: October 28, 2005
  Date of Patent: August 14, 2007
  Assignee: Lucent Technologies Inc.
  Inventors: Joseph Gerard Aguilar, Juin-Hwey Chen, Wei Wang, Robert W. Zopf
- Patent number: 7003461
  Abstract: An adaptive codebook search (ACS) algorithm is based on a set of matrix operations suitable for data processing engines supporting a single instruction multiple data (SIMD) architecture. The result is a reduction in memory access and increased parallelism to produce an overall improvement in the computational efficiency of ACS processing.
  Type: Grant
  Filed: July 9, 2002
  Date of Patent: February 21, 2006
  Assignee: Renesas Technology Corporation
  Inventor: Clifford Tavares
- Patent number: 6996291
  Abstract: After one or both of a pair of images are obtained, an auto-correlation function for one of those images is generated to determine a smear amount and possibly a smear direction. The smear amount and direction are used to identify potential locations of a peak portion of the correlation function between the pair of images. The pair of images is then correlated only at offset positions corresponding to the one or more of the potential peak locations. In some embodiments, the pair of images is correlated according to a sparse set of image correlation function value points around the potential peak locations. In other embodiments, the pair of images is correlated at a dense set of correlation function value points around the potential peak locations. The correlation function values of these correlation function value points are then analyzed to determine the offset position of the true correlation function peak.
  Type: Grant
  Filed: August 6, 2001
  Date of Patent: February 7, 2006
  Assignee: Mitutoyo Corporation
  Inventor: Michael Nahum
- Patent number: 6937977
  Abstract: A start of an input speech signal is detected during presentation of an output audio signal and an input start time, relative to the output audio signal, is determined. The input start time is then provided for use in responding to the input speech signal. In another embodiment, the output audio signal has a corresponding identification. When the input speech signal is detected during presentation of the output audio signal, the identification of the output audio signal is provided for use in responding to the input speech signal. Information signals comprising data and/or control signals are provided in response to at least the contextual information provided, i.e., the input start time and/or the identification of the output audio signal. In this manner, the present invention accurately establishes a context of an input speech signal relative to an output audio signal regardless of the delay characteristics of the underlying communication system.
  Type: Grant
  Filed: October 5, 1999
  Date of Patent: August 30, 2005
  Assignee: fastmobile, Inc.
  Inventor: Ira A. Gerson
- Patent number: 6865535
  Abstract: In a synchronization control apparatus, a voice-language-information generating section generates the voice language information of a word which a robot utters. A voice synthesizing section calculates phoneme information and a phoneme continuation duration according to the voice language information, and also generates synthesized-voice data according to an adjusted phoneme continuation duration. An articulation-operation generating section calculates an articulation-operation period according to the phoneme information. A voice-operation adjusting section adjusts the phoneme continuation duration and the articulation-operation period. An articulation-operation executing section operates an organ of articulation according to the adjusted articulation-operation period.
  Type: Grant
  Filed: December 27, 2000
  Date of Patent: March 8, 2005
  Assignee: Sony Corporation
  Inventors: Keiichi Yamada, Kenichiro Kobayashi, Tomoaki Nitta, Makoto Akabane, Masato Shimakawa, Nobuhide Yamazaki, Erika Kobayashi
- Patent number: 6804649
  Abstract: Voice synthesis with improved expressivity is obtained in a voice synthesiser of source-filter type by making use of a library of source sound categories in the source module. Each source sound category corresponds to a particular morphological category and is derived from analysis of real vocal sounds, by inverse filtering so as to subtract the effect of the vocal tract. The library may be parametrical, that is, the stored data corresponds not to the inverse-filtered sounds themselves but to synthesis coefficients for resynthesising the inverse-filtered sounds using any suitable re-synthesis technique, such as the phase vocoder technique. The coefficients are derived by Short Time Fourier Transform (STFT) analysis.
  Type: Grant
  Filed: June 1, 2001
  Date of Patent: October 12, 2004
  Assignee: Sony France S.A.
  Inventor: Eduardo Reck Miranda
- Patent number: 6681202
  Abstract: The invention describes a system that generates a wide band signal (100-7000 Hz) from a telephony band (or narrow band: 300-3400 Hz) speech signal to obtain an extended band speech signal (100-3400 Hz). This technique is particularly advantageous since it increases signal naturalness and listening comfort while keeping compatibility with all current telephony systems. The described technique is inspired by Linear Predictive speech coders. The speech signal is thus split into a spectral envelope and a short-term residual signal. Both signals are extended separately and recombined to create an extended band signal.
  Type: Grant
  Filed: November 13, 2000
  Date of Patent: January 20, 2004
  Assignee: Koninklijke Philips Electronics N.V.
  Inventors: Giles Miet, Andy Gerrits
- Patent number: 6662161
  Abstract: A method for generating animated sequences of talking heads in text-to-speech applications wherein a processor samples a plurality of frames comprising image samples. Representative parameters are extracted from the image samples and stored in an animation library. The processor also samples a plurality of multiphones comprising images together with their associated sounds. The processor extracts parameters from these images comprising data characterizing mouth shapes, maps, rules, or equations, and stores the resulting parameters and sound information in a coarticulation library. The animated sequence begins with the processor considering an input phoneme sequence, recalling from the coarticulation library parameters associated with that sequence, and selecting appropriate image samples from the animation library based on that sequence. The image samples are concatenated together, and the corresponding sound is output, to form the animated synthesis.
  Type: Grant
  Filed: September 7, 1999
  Date of Patent: December 9, 2003
  Assignee: AT&T Corp.
  Inventors: Eric Cosatto, Hans Peter Graf, Juergen Schroeter
- Publication number: 20030061051
  Abstract: A voice synthesizing system keeps the required amount of calculation satisfactorily small and also keeps the required file size small. The system includes a compressed pitch segment database storing compressed voice waveform segments; a pitch developing portion that, when a voice waveform segment needed for voice waveform synthesis is demanded, reads the segment out of the database and decompresses the compressed data to reproduce the original voice waveform segment; and a cache processing portion that temporarily stores voice waveform segments already used in voice waveform synthesis and, when a voice waveform segment needed for synthesis is demanded, returns the demanded segment to the demander if it is already stored, or otherwise obtains the segment from the database via the pitch developing portion, holds it, and returns it to the demander.
  Type: Application
  Filed: September 26, 2002
  Publication date: March 27, 2003
  Applicant: NEC CORPORATION
  Inventors: Reishi Kondo, Hiroaki Hattori
- Patent number: 6513007
  Abstract: There is provided a synthesized sound generating apparatus and method which can achieve responsive and high-quality speech synthesis based on a real-time convolution operation. Coefficients are generated by using dynamic cutting to extract characteristic information from a first signal. A convolution operation is performed on a second signal using the generated coefficients to generate a synthesized signal. During the convolution operation, an interpolation process is performed on the coefficients to prevent a rapid change in the level of the generated synthesized signal when the coefficients are switched.
  Type: Grant
  Filed: July 20, 2000
  Date of Patent: January 28, 2003
  Assignee: Yamaha Corporation
  Inventor: Akio Takahashi
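One common way to realize coefficient interpolation during block convolution is to ramp linearly from the old coefficient set to the new one over several blocks, so the output level does not jump at the switch. The sketch below does exactly that with overlap-add block convolution; the block size, number of ramp steps, and linear ramp are illustrative choices, not values from the patent.

```python
import numpy as np

def block_convolve_interpolated(signal, old_coefs, new_coefs, block=128, n_steps=8):
    """Process the signal in blocks; over the first n_steps blocks the filter
    coefficients are linearly interpolated from the old set to the new set.
    Both coefficient sets are assumed to have the same length."""
    out = np.zeros(len(signal) + len(new_coefs) - 1)
    for b, start in enumerate(range(0, len(signal), block)):
        alpha = min(1.0, b / float(n_steps))              # interpolation factor per block
        coefs = (1.0 - alpha) * old_coefs + alpha * new_coefs
        seg = signal[start:start + block]
        # overlap-add the convolved block at its position
        out[start:start + len(seg) + len(coefs) - 1] += np.convolve(seg, coefs)
    return out[:len(signal)]

rng = np.random.default_rng(0)
x = rng.normal(size=1024)
h_old = rng.normal(size=64) * np.hanning(64)
h_new = rng.normal(size=64) * np.hanning(64)
y = block_convolve_interpolated(x, h_old, h_new)
print(y.shape)
```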
- Patent number: 6421636
  Abstract: An apparatus and method is disclosed for converting an input signal having frequency related information sustained over a first duration of time into an output signal sustained over a second duration of time at substantially the same first frequency by adding or subtracting to the effective wave length of the output signal. Preferably, the signals are converted in digital form with samples added or subtracted to frequency convert the signal.
  Type: Grant
  Filed: May 30, 2000
  Date of Patent: July 16, 2002
  Assignee: Pixel Instruments
  Inventors: J. Carl Cooper, Steve Anderson
- Patent number: 6208958
  Abstract: A pitch determination apparatus and method using spectro-temporal autocorrelation to prevent pitch determination errors are provided.
  Type: Grant
  Filed: January 7, 1999
  Date of Patent: March 27, 2001
  Assignee: Samsung Electronics Co., Ltd.
  Inventors: Yong-duk Cho, Moo-Young Kim
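The abstract only names the technique, so the sketch below shows one generic way to combine a temporal autocorrelation with a spectral autocorrelation when scoring candidate pitch lags; the equal weighting and the mapping from lag to spectral bin shift are my assumptions, not the patented formulation.

```python
import numpy as np

def spectro_temporal_pitch(x, fs, fmin=60.0, fmax=400.0, alpha=0.5):
    """Score each candidate pitch lag with a weighted sum of the time-domain
    autocorrelation and a spectral autocorrelation (harmonic-spacing match)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = len(x)
    spec = np.abs(np.fft.rfft(x * np.hanning(n)))

    lags = np.arange(int(fs / fmax), int(fs / fmin) + 1)
    scores = []
    for lag in lags:
        # temporal autocorrelation, normalized by the signal energy
        rt = np.dot(x[:-lag], x[lag:]) / (np.dot(x, x) + 1e-12)
        # spectral autocorrelation at the bin shift matching this lag's harmonic spacing
        shift = max(1, int(round(n / lag)))
        if shift >= len(spec):
            scores.append(-np.inf)
            continue
        rs = np.dot(spec[:-shift], spec[shift:]) / (np.dot(spec, spec) + 1e-12)
        scores.append(alpha * rt + (1 - alpha) * rs)
    best_lag = lags[int(np.argmax(scores))]
    return fs / best_lag

fs = 8000
t = np.arange(0, 0.05, 1 / fs)
x = sum(np.sin(2 * np.pi * 150 * k * t) / k for k in range(1, 5))   # 150 Hz harmonic tone
print(round(spectro_temporal_pitch(x, fs)))                         # expected near 150
```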
- Patent number: 5983181
  Abstract: The table document preparation module prepares a table document containing cells, the read-out attribute setting module sets a read-out attribute specifying a way of reading out cell data supplied through the setting screen from the table document preparation module being assisted by the setting display module, the voice-generating data generation module generates voice-generating data for the table document according to the way of reading out specified by the read-out attribute, and the voice synthesis module synthesizes voices according to the voice-generating data.
  Type: Grant
  Filed: December 3, 1997
  Date of Patent: November 9, 1999
  Assignee: Justsystem Corp.
  Inventor: Nobuhide Yamazaki
- Patent number: 5850438
  Abstract: In a transmission system, a tone is transmitted by a transmitter to a receiver via a transmission channel. In the receiver, a tone detector is used to detect the presence of a signalling tone. In order to improve the reliability of the tone detector when the arrival time of the signalling tone is unknown, a number of correlators having mutually displaced measuring periods are used. More than two correlators are used in order to reduce the measuring period, also resulting in an improved reliability of the tone detection.
  Type: Grant
  Filed: April 16, 1996
  Date of Patent: December 15, 1998
  Assignee: U.S. Philips Corporation
  Inventors: Harm Braams, Cornelis M. Moerman
- Patent number: 5850437
  Abstract: In a transmission system, a signalling tone is transmitted by a transmitter to a receiver via a transmission channel. In the receiver, a tone detector detects the presence of a signalling tone. In order to improve the reliability of the tone detector, a number of correlating elements which determine a correlation value between an input signal and a reference signal are used. Absolute output signals of the correlating elements are added by an adder to derive a combined correlation signal to be used for detection.
  Type: Grant
  Filed: April 16, 1996
  Date of Patent: December 15, 1998
  Assignee: U.S. Philips Corporation
  Inventors: Harm Braams, Cornelis M. Moerman
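The two Philips tone-detector entries above share one idea: run several correlators over mutually displaced measuring windows, take the absolute value of each output, and sum them into a single detection statistic compared with a threshold. The sketch below illustrates that combination with cosine and sine references; the window length, number of windows, and threshold are illustrative values, not figures from the patents.

```python
import numpy as np

def tone_detected(signal, fs, tone_hz, n_windows=4, win_len=160, threshold=80.0):
    """Split the input into displaced measuring windows; in each, correlate
    against cosine and sine references at the tone frequency, take absolute
    values, and sum everything into one statistic compared with a threshold."""
    n = np.arange(win_len)
    ref_cos = np.cos(2 * np.pi * tone_hz * n / fs)
    ref_sin = np.sin(2 * np.pi * tone_hz * n / fs)
    stat = 0.0
    for w in range(n_windows):
        seg = signal[w * win_len:(w + 1) * win_len]
        if len(seg) < win_len:
            break
        stat += abs(np.dot(seg, ref_cos)) + abs(np.dot(seg, ref_sin))
    return stat, stat > threshold

fs = 8000
t = np.arange(0, 0.1, 1 / fs)
tone = 0.5 * np.sin(2 * np.pi * 1000 * t)
noise = 0.5 * np.random.default_rng(0).normal(size=t.size)
print(tone_detected(tone + 0.1 * noise, fs, 1000))   # large statistic, True
print(tone_detected(noise, fs, 1000))                # small statistic, False
```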
- Patent number: 5832442
  Abstract: A method is disclosed for modifying parameters of audio signals by dividing a digital signal converted from an original analog signal into sound frames, modifying a pitch and a playing rate of the digital signal within a frame, subsequently splicing a last modified frame with a first non-modified frame, and calculating the mean absolute error to define the best splicing point in terms of producing minimal or no audible noise, such that various sections of sound signals can be spliced together to achieve pitch and playing-rate modification. An apparatus is also disclosed for implementing the method, the apparatus comprising input and output amplifiers, a low pass filter at the input and a low pass filter at the output, analog-to-digital and digital-to-analog converters, and a pitch shifting processor.
  Type: Grant
  Filed: June 23, 1995
  Date of Patent: November 3, 1998
  Assignee: Electronics Research & Service Organization
  Inventors: Gang-Janp Lin, Sau-Gee Chen, Der-Chwan Wu, Yuan-An Kao, Yen-Hui Wang
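Finding the best splicing point by mean absolute error amounts to sliding the next frame against the end of the modified frame and keeping the offset whose overlap region matches most closely. A minimal sketch follows; the search range and overlap length are illustrative values, not parameters from the patent.

```python
import numpy as np

def best_splice_offset(tail, head, search=64, overlap=32):
    """Try candidate offsets into `head` and keep the one whose overlap region
    has the smallest mean absolute error against the end of `tail`."""
    ref = tail[-overlap:]
    errors = [np.mean(np.abs(head[off:off + overlap] - ref))
              for off in range(search)]
    return int(np.argmin(errors))

fs = 8000
t = np.arange(0, 0.03, 1 / fs)
frame_a = np.sin(2 * np.pi * 200 * t)            # last modified frame (toy)
frame_b = np.sin(2 * np.pi * 200 * t + 1.3)      # next frame, arbitrary phase
off = best_splice_offset(frame_a, frame_b)
# continue the output right after the matched 32-sample overlap region
spliced = np.concatenate([frame_a, frame_b[off + 32:]])
print(off)
```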
- Patent number: 5809456
  Abstract: The present invention relates to a method and to equipment for coding and decoding a sampled speech signal. It belongs to systems used in speech processing, in particular for compression of speech information. The method is based upon a time/frequency description and on a representation of the prototype as a fundamental period of a periodic waveform; moreover the excitation of the synthesis filter is carried out through a single, phase-adapted pulse.
  Type: Grant
  Filed: June 27, 1996
  Date of Patent: September 15, 1998
  Assignee: Alcatel Italia S.P.A.
  Inventors: Silvio Cucchi, Marco Fratti
- Patent number: 5761635
  Abstract: A synthesis filter is disclosed which models the effect of the fundamental frequency of speech for digital speech coders operating on the analysis-by-synthesis principle. High fundamental frequencies having a period shorter than the corresponding cycle length of the frame employed in the analysis-by-synthesis method are optimally encoded. The filter is constructed of a number of parallel, separately updatable synthesis-memory blocks. When analysis delays shorter than the analysis frame are used, a portion of a signal that was stored in memory several frames earlier is selected and scaled to approximate the missing portion of the analysis frame using the available portion of the analysis frame.
  Type: Grant
  Filed: April 29, 1996
  Date of Patent: June 2, 1998
  Assignee: Nokia Mobile Phones Ltd.
  Inventor: Kari Juhani Jarvinen