Sound Editing Patents (Class 704/278)
  • Publication number: 20080103782
    Abstract: A method for audio modulation is provided. The method includes: obtaining the digital audio signals of the caller in the process of communications; analyzing the digital audio signals to obtain a voice frequency of the caller; reading a voice frequency of the user from a memory, and calculating the rate of the voice frequencies between the caller and the user; modulating the user's analog audio signals according to the rate of the voice frequencies; converting the modulated analog audio signals into digital audio signals; and coding the digital audio signals, modulating the coded digital audio signals, and transmitting the modulated digital audio signals to the caller. Through the method, the user's voice is modulated to sound like the caller's voice, thereby increasing the interest of the process of communicating. The present invention also provides a communication device with the function of audio modulation.
    Type: Application
    Filed: October 30, 2007
    Publication date: May 1, 2008
    Applicant: HON HAI PRECISION INDUSTRY CO., LTD.
    Inventors: SHIH-FANG WONG, CHUNG-JEN WANG
  • Patent number: 7363232
    Abstract: The present invention provides a method and system for processing an audio signal. According to an exemplary method, an audio signal such as a digital voice signal is received and divided into one or more individual unit cycles. An audio speed conversion operation is enabled by repeating or removing one or more of the individual unit cycles. In particular, repeating one or more of the individual unit cycles decreases audio speed, and removing one or more of the individual unit cycles increases audio speed.
    Type: Grant
    Filed: June 29, 2001
    Date of Patent: April 22, 2008
    Assignee: Thomson Licensing
    Inventors: Magdy Megeid, Markus Inkamp
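    The repeat-or-remove idea in the abstract above can be sketched as follows. This is a minimal illustration, not the patented method: it assumes fixed-length unit cycles (a real system would more likely align cycles to the signal's period), and the names `split_into_cycles`, `change_speed`, `cycle_len`, and `every` are hypothetical.

```python
from typing import List, Sequence


def split_into_cycles(samples: Sequence[float], cycle_len: int) -> List[List[float]]:
    """Divide the signal into unit cycles of cycle_len samples (the last may be shorter)."""
    return [list(samples[i:i + cycle_len]) for i in range(0, len(samples), cycle_len)]


def change_speed(samples: Sequence[float], cycle_len: int, every: int, mode: str) -> List[float]:
    """Repeat ("slower") or remove ("faster") every `every`-th unit cycle.

    Repeating cycles lengthens the output, so playback sounds slower;
    removing cycles shortens it, so playback sounds faster.
    """
    if mode not in ("slower", "faster"):
        raise ValueError("mode must be 'slower' or 'faster'")
    out: List[float] = []
    for index, cycle in enumerate(split_into_cycles(samples, cycle_len), start=1):
        selected = index % every == 0
        if mode == "faster" and selected:
            continue  # drop this cycle entirely
        out.extend(cycle)
        if mode == "slower" and selected:
            out.extend(cycle)  # play this cycle a second time
    return out


signal = list(range(8))
slower = change_speed(signal, cycle_len=2, every=2, mode="slower")
faster = change_speed(signal, cycle_len=2, every=2, mode="faster")
```

    With eight samples and two-sample cycles, repeating every second cycle yields twelve samples, and removing every second cycle yields four.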
  • Publication number: 20080086310
    Abstract: A system to generate a contextually specific custom audio file targeted to the user. The user can be a traveler who has particular travel data and the system can generate content tailored to the user's travel data. The content in the audio file can be chosen based on relevancy to the user's travel data. The audio file can be served to the user via a computer communications network such as the Internet.
    Type: Application
    Filed: October 9, 2006
    Publication date: April 10, 2008
    Inventor: Kent Campbell
  • Patent number: 7356476
    Abstract: An audio signal processing method for repairing an anomalous state such as noise, a discontinuity, and a break of sound, comprising detecting the anomalous state of an audio signal, deleting the audio signal in the anomalous segment, deducing the correct audio signal by referring to the waveform of the audio signal before and after the deleted segment, generating a repair signal for repairing the signal in the deleted segment based on the deduced result, inserting the repair signal into the deleted segment, and connecting it to the audio signal before and after the deleted segment.
    Type: Grant
    Filed: April 22, 2005
    Date of Patent: April 8, 2008
    Assignee: Sony Corporation
    Inventors: Mototsugu Abe, Akira Inoue, Jun Matsumoto, Koji Suginuma, Masayuki Nishiguchi
  • Patent number: 7356465
    Abstract: The invention relates to a computer device comprising a memory 108 for storing audio signals 114, in part pre-recorded, each corresponding to a defined source, by means of spatial position data 116, and a processing module 110 for processing these audio signals in real time as a function of the spatial position data. The processing module 110 allows for the instantaneous power level parameters to be calculated on the basis of audio signals 114, the corresponding sources being defined by instantaneous power level parameters. The processing module 110 comprises a selection module 120 for regrouping certain of the audio signals into a variable number of audio signal groups, and the processing module 110 is capable of calculating spatial position data which is representative of a group of audio signals as a function of the spatial position data 116 and instantaneous power level parameters for each corresponding source.
    Type: Grant
    Filed: December 31, 2003
    Date of Patent: April 8, 2008
    Assignee: Inria Institut National de Recherche en Informatique et en Automatique
    Inventors: Nicolas Tsingos, Emmanuel Gallo, George Drettakis
  • Publication number: 20080075292
    Abstract: An audio processing apparatus (20) connected between an audio source (10) and an audio output (30) is provided. The audio processing apparatus is suitable for singing practice of users and includes a vocal removing unit (210) and a pitch change unit (220). The vocal removing unit is for removing vocal signals from audio signals of the audio source, and the pitch change unit is for changing pitch of the audio signals and outputting the audio signals to the audio output.
    Type: Application
    Filed: August 27, 2007
    Publication date: March 27, 2008
    Applicant: HON HAI PRECISION INDUSTRY CO., LTD.
    Inventors: Shih-Fang Wong, Chung-Jen Wang
  • Patent number: 7343019
    Abstract: A normalization for streaming digital audio signals applies a gain factor according to the maximum sample magnitude in a window of samples and compares the gain factor to prior gain factors to adjust the gain factor for the samples in the window of samples. Adaptation of the gain factor with rapid decreases but slow increases avoids saturation but allows quiet passages.
    Type: Grant
    Filed: July 25, 2002
    Date of Patent: March 11, 2008
    Assignee: Texas Instruments Incorporated
    Inventors: Timothy C. Hankins, Thomas Millikan, Christopher A. Scarr, Jason Kridner, Gabriel Dagani
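    A rough sketch of windowed gain adaptation with fast decreases and slow increases is shown below. It is an illustration under stated assumptions, not the patent's algorithm: the parameters `target_peak`, `max_gain`, and `rise` (the largest factor by which the gain may grow from one window to the next) are hypothetical, and only the single previous gain is consulted.

```python
from typing import List, Sequence


def window_gains(samples: Sequence[float], window: int,
                 target_peak: float = 1.0, max_gain: float = 8.0,
                 rise: float = 1.5) -> List[float]:
    """Return one gain per window of `window` samples."""
    gains: List[float] = []
    gain = None
    for start in range(0, len(samples), window):
        block = samples[start:start + window]
        peak = max(abs(s) for s in block)
        # Gain that would bring this window's peak exactly to the target.
        wanted = max_gain if peak == 0 else min(max_gain, target_peak / peak)
        if gain is None or wanted < gain:
            gain = wanted  # rapid decrease: drop immediately to avoid saturation
        else:
            gain = min(wanted, gain * rise)  # slow increase: limited growth per window
        gains.append(gain)
    return gains


def normalize_stream(samples: Sequence[float], window: int, **kwargs) -> List[float]:
    """Apply the per-window gains to the samples."""
    gains = window_gains(samples, window, **kwargs)
    return [s * gains[i // window] for i, s in enumerate(samples)]


audio = [0.5, 0.5, 1.0, 1.0, 0.25, 0.25]
gains = window_gains(audio, window=2)
normalized = normalize_stream(audio, window=2)
```

    Because the applied gain never exceeds `target_peak / peak` for its window, no output sample exceeds the target, while the capped growth keeps a quiet passage from being boosted abruptly.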
  • Patent number: 7336890
    Abstract: A “music video parser” automatically detects and segments music videos in a combined audio-video media stream. Automatic detection and segmentation is achieved by integrating shot boundary detection, video text detection and audio analysis to automatically detect temporal boundaries of each music video in the media stream. In one embodiment, song identification information, such as, for example, a song name, artist name, album name, etc., is automatically extracted from the media stream using video optical character recognition (OCR). This information is then used in alternate embodiments for cataloging, indexing and selecting particular music videos, and in maintaining statistics such as the times particular music videos were played, and the number of times each music video was played.
    Type: Grant
    Filed: February 19, 2003
    Date of Patent: February 26, 2008
    Assignee: Microsoft Corporation
    Inventors: Lie Lu, Yan-Feng Sun, Mingjing Li, Xian-Sheng Hua, Hong-Jiang Zhang
  • Patent number: 7319964
    Abstract: The present invention provides for a method and apparatus for segmenting a multi-media program based upon audio events. In an embodiment a method of classifying an audio stream is provided. This method includes receiving an audio stream, sampling the audio stream at a predetermined rate, and then combining a predetermined number of samples into a clip. A plurality of features are then determined for the clip and are analyzed using a linear approximation algorithm. The clip is then characterized based upon the results of the analysis conducted with the linear approximation algorithm.
    Type: Grant
    Filed: June 7, 2004
    Date of Patent: January 15, 2008
    Assignee: AT&T Corp.
    Inventors: Qian Huang, Zhu Liu
  • Patent number: 7319764
    Abstract: Some embodiments of the invention provide a method for controlling the volume of an audio track. This method represents the volume of an audio track with a graph. This graph is defined along two axes, with one axis representing time and the other representing the volume level. A user can adjust the graph at different instances in time in order to change the volume level in the audio track at these instances. Different embodiments use different types of graphs to represent volume. For instance, some embodiments use a deformable line bar.
    Type: Grant
    Filed: January 6, 2003
    Date of Patent: January 15, 2008
    Assignee: Apple Inc.
    Inventors: Glenn Reid, James Brasure
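    The time-versus-volume graph described above can be modeled as a list of control points with linear interpolation between them. The sketch below is only one plausible reading: the point representation and the names `volume_at` and `apply_volume` are assumptions, and the patent's deformable line bar is not modeled.

```python
from bisect import bisect_right
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (time in seconds, volume level)


def volume_at(points: Sequence[Point], t: float) -> float:
    """Volume at time t, interpolated linearly between points sorted by time."""
    times = [p[0] for p in points]
    if t <= times[0]:
        return points[0][1]   # before the first point: hold its level
    if t >= times[-1]:
        return points[-1][1]  # after the last point: hold its level
    i = bisect_right(times, t)  # index of the first point strictly later than t
    (t0, v0), (t1, v1) = points[i - 1], points[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def apply_volume(samples: Sequence[float], sample_rate: int,
                 points: Sequence[Point]) -> List[float]:
    """Scale each sample by the graph's volume at that sample's time."""
    return [s * volume_at(points, n / sample_rate) for n, s in enumerate(samples)]


graph = [(0.0, 1.0), (1.0, 0.0), (2.0, 0.5)]
shaped = apply_volume([1.0, 1.0, 1.0, 1.0], sample_rate=2, points=graph)
```

    In this model, a user adjustment at some instant corresponds to moving or adding a point in `graph`.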
  • Patent number: 7310604
    Abstract: Complex sound events are created by generating multiple different kinds of simpler sounds with randomly varying repetition rates. The average repetition rate can also be variable. The values of sound parameters such as wave selection, pitch distribution, pan distribution and amplitude distribution can have random distributions, as determined by various control inputs, some of which have their own random distributions.
    Type: Grant
    Filed: October 19, 2001
    Date of Patent: December 18, 2007
    Assignee: Analog Devices, Inc.
    Inventors: Kim Cascone, Sean M. Costello, Nicholas J. Porcaro, Timothy S. Stilson, Scott A. Van Duyne
  • Patent number: 7308408
    Abstract: A method and system for providing efficient menu services for an information processing system that uses a telephone or other form of audio user interface. In one embodiment, the menu services provide effective support for novice users by providing a full listing of available keywords and rotating house advertisements which inform novice users of potential features and information. For experienced users, cues are rendered so that at any time the user can say a desired keyword to invoke the corresponding application. The menu is flat to facilitate its usage. Full keyword listings are rendered after the user is given a brief cue to say a keyword. Service messages rotate words and word prosody. When listening to receive information from the user, after the user has been cued, soft background music or other audible signals are rendered to inform the user that a response may now be spoken to the service.
    Type: Grant
    Filed: September 29, 2004
    Date of Patent: December 11, 2007
    Assignee: Microsoft Corporation
    Inventors: Lisa Joy Stifelman, Hadi Partovi, Haleh Partovi, David Bryan Alpert, Matthew Talin Marx, Scott James Bailey, Kyle D. Sims, Darby McDonough Bailey, Roderick Steven Brathwaite, Eugene Koh, Angus Macdonald Davis
  • Patent number: 7277859
    Abstract: A method for a digest generating apparatus to generate a digest of image content or sound content is provided. The digest generating apparatus obtains an audience rating or an audience count of image content or sound content at established time intervals, and extracts images from the image content or extracts sounds from the sound content. The extracted images or the extracted sounds correspond to a time when the audience rating or the audience count exceeds a threshold. Then, the digest generating apparatus generates a digest by using the extracted images or the extracted sounds. The audience rating or the audience count can be obtained by using audience data corresponding to an audience group having a specific user profile.
    Type: Grant
    Filed: December 20, 2002
    Date of Patent: October 2, 2007
    Assignee: Nippon Telegraph and Telephone Corporation
    Inventors: Tomoki Watanabe, Shinya Uegaki, Katsumi Kishida, Koichiro Yamamoto, Takashi Hosobuchi, Wataru Inoue, Hisashi Matsukawa
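    The threshold step in this abstract amounts to selecting the time intervals whose audience rating exceeds a cutoff. A minimal sketch follows, assuming ratings arrive as one value per fixed-length interval; `digest_ranges` and its parameters are hypothetical names, and the merging of adjacent intervals is a convenience added here rather than something the abstract states.

```python
from typing import List, Sequence, Tuple


def digest_ranges(ratings: Sequence[float], interval_seconds: int,
                  threshold: float) -> List[Tuple[int, int]]:
    """Return (start, end) second ranges whose rating exceeds the threshold."""
    ranges: List[Tuple[int, int]] = []
    for i, rating in enumerate(ratings):
        if rating > threshold:
            start = i * interval_seconds
            end = start + interval_seconds
            if ranges and ranges[-1][1] == start:
                # Extend the previous range when intervals are back to back.
                ranges[-1] = (ranges[-1][0], end)
            else:
                ranges.append((start, end))
    return ranges


clips = digest_ranges([3, 8, 9, 2, 7], interval_seconds=60, threshold=5)
```

    The resulting ranges identify which portions of the image or sound content would be extracted and concatenated into the digest.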
  • Patent number: 7266286
    Abstract: Temporarily storing audio data to be reproduced in the block form, temporarily storing synthesized audio data in the block form, generating and supplying a reference signal, and calculating the first address of the block of audio data, the editing system of the present invention enables identifying the buffering position of the recording signal and then matching the position to the reproduced signal, resulting in quick editing.
    Type: Grant
    Filed: November 14, 2001
    Date of Patent: September 4, 2007
    Assignee: Sony Corporation
    Inventors: Seiji Tanizawa, Satoru Tobita, Hideaki Miyauchi, Kazushi Sato, Keiji Hirai
  • Patent number: 7233832
    Abstract: Systems implementing the invention allow a user to time stretch an audio track without changing the pitch of the sound, and to produce optimal audible qualities of the output signal. The approach utilized in the invention relies on providing several time stretching methods, each one of which is selected based on one or more criteria of the audio data properties. One method relies on crossfading pairs of segments of audio data while running one segment backward every other repetition. The second time stretching method detects inaudible segments and inserts longer periods of audible data within those segments. The third method utilizes a reverb to create a reverb segment that is played after the original segment.
    Type: Grant
    Filed: April 4, 2003
    Date of Patent: June 19, 2007
    Assignee: Apple Inc.
    Inventors: Sol Friedman, Chris Moulios
  • Patent number: 7219065
    Abstract: A sound processor including a microphone (1), a pre-amplifier (2), a bank of N parallel filters (3), means for detecting short-duration transitions in the envelope signal of each filter channel, and means for applying gain to the outputs of these filter channels in which the gain is related to a function of the second-order derivative of the slow-varying envelope signal in each filter channel, to assist in perception of low-intensity short-duration speech features in said signal.
    Type: Grant
    Filed: October 25, 2000
    Date of Patent: May 15, 2007
    Inventors: Andrew E. Vandali, Graeme M. Clark
  • Patent number: 7203647
    Abstract: A speech output apparatus is disclosed, which can allow the user to easily catch synthetic speech when the synthetic speech is output upon being superposed on a music output. The apparatus can output music and synthetic speech that indicates contents of information such as an e-mail and is superposed on the music. When the synthetic speech is output to be superposed on the music during output, the apparatus gradually decreases a tone volume of the music.
    Type: Grant
    Filed: August 13, 2002
    Date of Patent: April 10, 2007
    Assignee: Canon Kabushiki Kaisha
    Inventors: Makoto Hirota, Hideo Kuboyama
  • Patent number: 7197458
    Abstract: A method and apparatus for verifying automatically that a plurality of derivative audio (or other multimedia) files have acceptable sound quality. In one embodiment, each derivative file is compared on a byte-by-byte basis to a corresponding original file to generate a difference. The difference is compared to a threshold value (that may be determined empirically). If the difference is too large for many bytes, the derivative file is tagged as having an unacceptable sound quality. In another embodiment, segments of the original and derivative files are converted to the frequency domain and analysis is performed in this domain. The resulting signal could be a tag indicating whether the derivative file is acceptable, or could be a more comprehensive signal indicative of what kind of errors were detected and in what temporal and/or spectral region for diagnostic purposes.
    Type: Grant
    Filed: May 9, 2002
    Date of Patent: March 27, 2007
    Assignee: Warner Music Group, Inc.
    Inventors: George H. Lydecker, Todd Yvega
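    The byte-by-byte embodiment can be sketched as below. This is a simplified illustration with assumed details: the abstract does not say how "too large for many bytes" is quantified, so `byte_threshold` and `max_bad_fraction` are hypothetical parameters, and treating a length mismatch as unacceptable is likewise an assumption.

```python
def is_acceptable(original: bytes, derivative: bytes,
                  byte_threshold: int, max_bad_fraction: float) -> bool:
    """Tag a derivative file as acceptable (True) or unacceptable (False)."""
    if len(original) != len(derivative):
        return False  # assumption: files of different length cannot be compared
    if not original:
        return True
    # Count bytes whose difference from the original exceeds the threshold.
    bad = sum(1 for a, b in zip(original, derivative) if abs(a - b) > byte_threshold)
    return bad / len(original) <= max_bad_fraction


source = bytes([10, 20, 30, 40])
copy = bytes([11, 20, 90, 40])
lenient = is_acceptable(source, copy, byte_threshold=2, max_bad_fraction=0.25)
strict = is_acceptable(source, copy, byte_threshold=2, max_bad_fraction=0.1)
```

    Here one byte out of four differs by more than the threshold, so the copy passes a 25% tolerance and fails a 10% tolerance.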
  • Patent number: 7191134
    Abstract: A method of changing psychological stress indices by evaluating manifestations of physiological change in the human voice wherein the utterances of a subject under examination are formatted as electrical signals and processed to alter selected characteristics which have been found to change with psycho-physiological state changes, such that the resultant output data signals are perceptually unchanged yet display none of the undesired physiological response characteristics. Apparatus for performing changes of this type includes a data input port, means for spectral alteration and a data output port.
    Type: Grant
    Filed: March 25, 2002
    Date of Patent: March 13, 2007
    Inventor: Patrick O'Neal Nunally
  • Patent number: 7184952
    Abstract: A simple and efficient method for producing an obfuscated speech signal which may be used to mask a stream of speech, is disclosed. A speech signal representing the speech stream to be masked is obtained. The speech signal is then temporally partitioned into segments, preferably corresponding to phonemes within the speech stream. The segments are then stored in a memory, and some or all of the segments are subsequently selected, retrieved, and assembled into an obfuscated speech signal representing an unintelligible speech stream that, when combined with the speech signal or reproduced and combined with the speech stream, provides a masking effect. While the presently preferred embodiment finds application most readily in an open plan office, embodiments suitable for use in restaurants, classrooms, and in telecommunications systems are also disclosed.
    Type: Grant
    Filed: July 12, 2006
    Date of Patent: February 27, 2007
    Assignee: Applied Minds, Inc.
    Inventors: W. Daniel Hillis, Bran Ferren, Russel Howe
  • Patent number: 7177278
    Abstract: In a method of processing a transmitted encoded media data stream, the stream is received. If a data element arrives prior to, or at, a predetermined playout deadline, the data element is decoded, the media represented by the decoded data element is played, and the data element is provided to a decoder state machine to update a decoder state. If a data element arrives after the predetermined playout deadline, the data element is provided to the decoder state machine to update the decoder state. In one embodiment, if the specified data element fails to arrive by the playout deadline, a subsequently received data element is saved in memory. Then, if the specified data element arrives after the predetermined playout deadline, the specified data element and the saved, subsequently received, data element are provided to the decoder state machine to update the decoder state.
    Type: Grant
    Filed: February 25, 2002
    Date of Patent: February 13, 2007
    Assignee: Broadcom Corporation
    Inventor: Wilfrid LeBlanc
  • Patent number: 7162424
    Abstract: The invention relates to a method for defining a sequence of sound modules for synthesis of a speech signal in a tonal language corresponding to a sequence of speech modules. The method according to the invention differs from known methods in that the speech modules represent triphones, which each comprise one phoneme with the respective context, and with syllables in the tonal language being composed of one or more triphones. This results in a high level of flexibility for the synthesis of tonal languages.
    Type: Grant
    Filed: April 26, 2002
    Date of Patent: January 9, 2007
    Assignee: Siemens Aktiengesellschaft
    Inventors: Martin Holzapfel, Jianhua Tao
  • Patent number: 7155394
    Abstract: A system is disclosed for playing prerecorded audio encoded in a fault tolerant manner as a series of invisible dots carried on a substrate together with a photographic image. The system has a detector that detects the dot form of the prerecorded audio on the substrate and outputs a first signal; a decoder interconnected to the detector that decodes the first signal to produce an output signal; and an audio emitter interconnected to the processor that receives the output signal and outputs corresponding sounds. The dots may be infrared absorbing and the encoding can include Reed-Solomon encoding of the prerecorded audio. The system can include a wand-like arm having a slot through which the photograph is inserted.
    Type: Grant
    Filed: August 8, 2003
    Date of Patent: December 26, 2006
    Assignee: Silverbrook Research Pty Ltd
    Inventors: Kia Silverbrook, Paul Lapstun, Simon Robert Walmsley
  • Patent number: 7149686
    Abstract: A system and method for eliminating synchronization errors using speech recognition. Using separate audio and visual speech recognition techniques, the inventive system and method identifies visemes, or visual cues which are indicative of articulatory type, in the video content, and identifies phones and their articulatory types in the audio content. Once the two recognition techniques have been applied, the outputs are compared to determine the relative alignment and, if not aligned, a synchronization algorithm is applied to time-adjust one or both of the audio and the visual streams in order to achieve synchronization.
    Type: Grant
    Filed: June 23, 2000
    Date of Patent: December 12, 2006
    Assignee: International Business Machines Corporation
    Inventors: Paul S. Cohen, John R. Dildine, Edward J. Gleason
  • Patent number: 7143028
    Abstract: A simple and efficient method for producing an obfuscated speech signal which may be used to mask a stream of speech, is disclosed. A speech signal representing the speech stream to be masked is obtained. The speech signal is then temporally partitioned into segments, preferably corresponding to phonemes within the speech stream. The segments are then stored in a memory, and some or all of the segments are subsequently selected, retrieved, and assembled into an obfuscated speech signal representing an unintelligible speech stream that, when combined with the speech signal or reproduced and combined with the speech stream, provides a masking effect. While the presently preferred embodiment finds application most readily in an open plan office, embodiments suitable for use in restaurants, classrooms, and in telecommunications systems are also disclosed.
    Type: Grant
    Filed: July 24, 2002
    Date of Patent: November 28, 2006
    Assignee: Applied Minds, Inc.
    Inventors: W. Daniel Hillis, Bran Ferren, Russel Howe, Brian Eno
  • Patent number: 7127306
    Abstract: A recording and/or reproducing apparatus includes a microphone, a semiconductor memory, an operating section and a controller. An output signal from the microphone is written in the semiconductor memory and the written signals are read out from the semiconductor memory. The operating section performs input processing for writing a digital signal outputted by an analog/digital converter, reading out the digital signal stored in the semiconductor memory and for erasing the digital signal stored in the semiconductor memory. The control section controls the writing of the microphone output signal in the semiconductor memory based on an input from the operating section and the readout of the digital signal stored in the semiconductor memory.
    Type: Grant
    Filed: March 12, 2001
    Date of Patent: October 24, 2006
    Assignee: Sony Corporation
    Inventor: Kenichi Iida
  • Patent number: 7103842
    Abstract: A presentation system is realized which can present picture data and voice data in a simplified manner after pictures are shot with a digital still camera, etc. and voices are recorded with a voice recorder, etc. First, when the presentation system executes the program, a folder is selected to specify a recording area in the personal computer 10 (step 1). After obtaining temporal information of all picture data and voice data in the folder, the presentation system relates the data to pages of slides so that the picture data and the voice data correspond to the slides at the time of data presentation (step 2). Next, the presentation system has the user select whether information processing for data will be performed or not (step 3).
    Type: Grant
    Filed: September 6, 2001
    Date of Patent: September 5, 2006
    Assignee: Sony Corporation
    Inventors: Hiroki Masuda, Takuhiko Takatsu, Hiroyuki Bando, Yukio Takeyari
  • Patent number: 7103430
    Abstract: The present invention provides an apparatus having a microphone, an analog to digital converting circuit, a semiconductor memory, an input device, and a controller. The analog to digital converting circuit converts an output signal from the microphone into a digital signal. The semiconductor memory stores the output signal from the analog to digital converting circuit. The input device carries out at least input of a record start and a record end. The controller, according to the input from the input device, carries out operation control for start and stop of writing into the semiconductor memory a digital signal from the analog to digital converting circuit. When the input device is operated and a predetermined time interval has passed, the controller controls to start writing the digital signal from the analog/digital conversion circuit into the semiconductor memory.
    Type: Grant
    Filed: July 13, 2001
    Date of Patent: September 5, 2006
    Assignee: Sony Corporation
    Inventor: Eiichi Yamada
  • Patent number: 7099827
    Abstract: A packet stream is generated by combining a plurality of packets corresponding to style-of-rendition identification information which are selected from among a number of packets usable for producing waveforms corresponding to various styles of rendition. Then, a waveform having characteristics of the style of rendition indicated by the style-of-rendition identification information is produced on the basis of the generated packet stream. The packet stream includes a plurality of packets and time information of the individual packets and controls the pitch, amplitude and shape of the waveform to be produced. By thus combining packets corresponding to the style-of-rendition identification information and producing a waveform on the basis of the packet stream, there can be provided a waveform corresponding to a desired style of rendition in a simplified manner with great facility.
    Type: Grant
    Filed: September 22, 2000
    Date of Patent: August 29, 2006
    Assignee: Yamaha Corporation
    Inventor: Motoichi Tamura
  • Patent number: 7096186
    Abstract: A sound signal is received which contains sound characteristics to be represented in musical notation. The characteristics, such as a volume level of the sound signal, are extracted out of the received sound signal, and various parameters for use in subsequent analysis of the sound signal are set in accordance with the extracted characteristics. Also, a desired scale determining condition is set by a user. Pitch of the sound signal is determined using the thus-set parameters. The determined pitch is rounded to any one of scale notes, corresponding to the user-set scale determining condition. Also, a given unit note length is set as a predetermined criterion or reference for determining a note length, and a length of the scale note determined from the received sound signal is determined using the thus-set unit note length as a minimum determination unit, i.e., with an accuracy of the unit note length.
    Type: Grant
    Filed: August 10, 1999
    Date of Patent: August 22, 2006
    Assignee: Yamaha Corporation
    Inventor: Tomoyuki Funaki
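    The two rounding steps in this abstract (pitch to the nearest scale note, duration to a multiple of a unit note length) might look like the sketch below. It assumes equal temperament with A4 = 440 Hz and MIDI note numbering, neither of which the abstract specifies; `C_MAJOR`, `round_to_scale`, and `quantize_length` are hypothetical names.

```python
import math
from typing import Sequence

C_MAJOR = (0, 2, 4, 5, 7, 9, 11)  # pitch classes allowed by the user-set scale


def frequency_to_midi(freq_hz: float) -> float:
    """Fractional MIDI note number for a frequency (A4 = 440 Hz = note 69)."""
    return 69 + 12 * math.log2(freq_hz / 440.0)


def round_to_scale(freq_hz: float, scale: Sequence[int] = C_MAJOR) -> int:
    """Round a detected pitch to the nearest MIDI note belonging to the scale."""
    midi = frequency_to_midi(freq_hz)
    base = math.floor(midi)
    # Search one octave either side of the detected pitch for in-scale notes.
    candidates = [n for n in range(base - 12, base + 13) if n % 12 in scale]
    return min(candidates, key=lambda n: abs(n - midi))


def quantize_length(duration: float, unit: float) -> int:
    """Note length as a whole number of unit note lengths (at least one)."""
    return max(1, round(duration / unit))


a4 = round_to_scale(440.0)
sharp_of_a4 = round_to_scale(480.0)
units = quantize_length(0.9, unit=0.25)
```

    A pitch of 480 Hz lies between A#4 and B4; since A#4 is not in C major, it is rounded to B4 (note 71).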
  • Patent number: 7089559
    Abstract: A mechanism is provided for chaining server applications. A chaining module is provided that receives a series of server applications and chains them together passing the output of one to the input of the next. The series of server applications may be passed to the chaining module in a chain option. A properties file may be provided to register names of server applications. A name may be associated with the chaining module and the options may be specified in the properties file. Thus, a chain of server applications may be registered by name.
    Type: Grant
    Filed: July 31, 2001
    Date of Patent: August 8, 2006
    Assignee: International Business Machines Corporation
    Inventor: Richard J. Redpath
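    The chaining mechanism can be pictured as function composition over a name registry. The sketch below stands in plain Python callables for server applications and a dictionary for the properties file; these substitutions, and the names `chain` and `registry`, are assumptions made purely for illustration.

```python
from functools import reduce
from typing import Callable, Dict, Sequence

App = Callable[[str], str]  # stand-in for a server application


def chain(registry: Dict[str, App], names: Sequence[str]) -> App:
    """Build one application that passes each output to the next input."""
    apps = [registry[name] for name in names]  # look up each link by registered name

    def chained(data: str) -> str:
        return reduce(lambda value, app: app(value), apps, data)

    return chained


# Stand-in for a properties file that registers applications by name.
registry: Dict[str, App] = {
    "strip": str.strip,
    "upper": str.upper,
    "exclaim": lambda text: text + "!",
}
# The chain itself is registered under a name, like any other application.
registry["shout"] = chain(registry, ["strip", "upper", "exclaim"])
result = registry["shout"]("  hi ")
```

    Registering the chain under its own name means it can in turn appear as a link in another chain.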
  • Patent number: 7085719
    Abstract: A method and apparatus are provided for adjusting a content of an oral presentation provided by an agent of an organization and perceived by a human target of the organization based upon an objective of the organization. The method includes the steps of detecting a content of the oral presentation provided by the agent and modifying the oral presentation provided by the agent to produce the oral presentation perceived by the human target based upon the detected content and the organizational objective.
    Type: Grant
    Filed: July 13, 2000
    Date of Patent: August 1, 2006
    Assignee: Rockwell Electronics Commerce Technologies LLC
    Inventors: Craig R. Shambaugh, Anthony Dezonno, Mark J. Power, Kenneth Venner, James Martin, Darryl Hymel, Laird C. Williams
  • Patent number: 7080016
    Abstract: Audio information is read out of a recording medium which records musical pieces as audio information. The audio information read out is processed to detect BPM values and positions of beats in the musical piece. A musical piece is reproduced from the recording medium by reproducing audio information in accordance with the detected BPM values and positions of beats. The BPM value indicates the tempo of a musical piece, and the beat indicates a strength of a sound which repeatedly appears in each musical piece. When the audio information is reproduced in accordance with the detected BPM values and positions of beats, the musical piece is reproduced at the correct tempo and beats without giving an unnatural feeling to the listener.
    Type: Grant
    Filed: September 23, 2002
    Date of Patent: July 18, 2006
    Assignee: Pioneer Corporation
    Inventors: Masahiko Miyashita, Koji Ogura, Kensuke Chiba, Takeaki Funada
  • Patent number: 7062442
    Abstract: The method and system are for locating and recording time-limited signal sequences in media channels that may contain undesirable signal components, e.g., recording music in radio transmissions. The signals are continuously buffered in a memory. The user identifies a desired source material. Out of this desired source material a section may be taken as a search key. The device may also select search keys automatically. If a second instance of the search key is detected, signal sequences that in time are connected to the search keys are compared. The signal sequences that by comparison are substantially identical are identified as belonging to the same, wanted, source material. In the next step, an iteration of the above procedure results in a longer and higher quality segment of source material than the initial common segment.
    Type: Grant
    Filed: October 23, 2001
    Date of Patent: June 13, 2006
    Assignee: Popcatcher AB
    Inventors: Jakob Berg, Rickard Berg, Tomas Ahrne
  • Patent number: 7054817
    Abstract: A computer system is provided including a control module 20 and data collection module 22 which generate user interfaces enabling a user to identify a vocabulary and a number of speakers from whom utterances are to be obtained. The data collection module 22 then co-ordinates the collection of utterance data for the words in the vocabulary from these speakers and stores the data in a speaker database 24. When a satisfactory set of utterances has been collected, the utterances are passed to a model generation module 25 which generates a speech model using the utterances. The speech model is stored by the model generation module 25 in a model database 26. The generated model stored within the model database 26 can then be tested using a testing module 27 and other utterances stored within the speaker database 24. If the performance of the model is unsatisfactory, further or different utterances can be used to generate new models for storage within the model database 26.
    Type: Grant
    Filed: January 25, 2002
    Date of Patent: May 30, 2006
    Assignee: Canon Europa N.V.
    Inventor: Yuan Shao
  • Patent number: 7035418
    Abstract: Provided in accordance with the invention are a sound source identifying apparatus and method whereby objects as sound sources can be determined as to their locations with higher accuracy by using sound information and image information thereof and are separated from mixed sounds with certainty by using position information thereof. The sound source identifying apparatus (10) is constructed to include a sound collecting part; an imaging part; a sensing part; an image processing part; a sound processing part; and a control part.
    Type: Grant
    Filed: June 7, 2000
    Date of Patent: April 25, 2006
    Assignee: Japan Science and Technology Agency
    Inventors: Hiroshi Okuno, Hiroaki Kitano, Yukiko Nakagawa
  • Patent number: 7035807
    Abstract: A Sound on Sound-Annotations (SOS-A) system facilitates the collection, categorization, and retrieval of streams of sound. A stream of sound is captured and annotations of sound concerning the stream of sound are generated for positions of interest or relevancy. The annotations add additional information concerning the stream of sound at the points of interest. Markers of sound are logically or physically inserted in the stream of sound to identify the locations associated with the annotations of sound. The markers of sound point to or link the annotation of sound. The annotations of sound are also captured and can convey any information desired; for example, add description, provide evidence, challenge the validity, ask questions, etc. Any form or frequency of sound can be utilized with the stream of sound, the marker of sound, and/or the annotations of sound.
    Type: Grant
    Filed: February 19, 2002
    Date of Patent: April 25, 2006
    Inventors: John W. Brittain, Thomas J. Eccles
  • Patent number: 7020613
    Abstract: A method and system of mixing audio to convert a plurality of input voices into a single output voice is described. The audio mixing system has a decoding device, an audio mixing device and a frame package unit. The input voices, which include a plurality of audio frames, are partially decoded by the decoding device to acquire audio parameters of the input voices. One audio frame of the input voices is then selected by the audio mixing device as a target frame according to the audio parameters. Finally, the target frame is packaged by the frame package unit so as to be identical to the original format of the input voices.
    Type: Grant
    Filed: July 26, 2002
    Date of Patent: March 28, 2006
    Assignee: AT Chip Corporation
    Inventors: Pao-Chi Chang, Ching-Chang Chen
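    The frame-selection idea in the abstract above can be sketched as follows. The abstract does not say which audio parameter drives the selection, so this sketch assumes a per-frame energy value and picks the highest. The `Frame` type, its fields and both function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Frame:
    stream_id: int
    energy: float   # assumed parameter obtained by partial decoding
    payload: bytes  # still-encoded frame data, forwarded untouched


def select_target_frame(frames: Sequence[Frame]) -> Frame:
    """Pick one frame among the concurrent input frames (highest energy here)."""
    if not frames:
        raise ValueError("need at least one input frame")
    return max(frames, key=lambda f: f.energy)


def mix_by_selection(streams: Sequence[Sequence[Frame]]) -> List[bytes]:
    """For each time slot, forward the payload of the selected frame."""
    return [select_target_frame(slot).payload for slot in zip(*streams)]


voice_a = [Frame(0, 0.1, b"a0"), Frame(0, 0.9, b"a1")]
voice_b = [Frame(1, 0.5, b"b0"), Frame(1, 0.2, b"b1")]
mixed = mix_by_selection([voice_a, voice_b])
```

    Because the selected payload is passed through without re-encoding, the output keeps the input format, which is the property the abstract attributes to the frame package unit.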
  • Patent number: 7012650
    Abstract: The invention uses digital signal processing (DSP) techniques to synchronize an audio encoding process with a video synchronization signal. Namely, the encoder parameters of a DSP microchip are preset according to characteristics of an audio frame. A buffer temporarily stores the audio frame prior to sending it to an encoder. The buffer then transfers the frame in response to receiving a video synchronization signal in conjunction with authorization from a microprocessor. As such, the encoding sequence of the audio frame coincides with the video synchronization signal. Since the corresponding video frame is already slaved to the video synchronization signal, the audio samples are effectively processed in sequence with the video data. Prior to outputting the encoded audio frame to a multiplexor, the encoder sends a value to the microprocessor representing the difference between the end of the encoded audio frame and a second video synchronization signal.
    Type: Grant
    Filed: June 14, 2001
    Date of Patent: March 14, 2006
    Assignees: Sony Corporation, Sony Electronics Inc.
    Inventors: Fengduo Hu, Lin Yin, Yew-Koon Tan
  • Patent number: 7013283
    Abstract: A system and a concomitant method for providing programming content in response to an audio signal. The programming content and the audio signal are transmitted in a network having a forward channel and a back channel. In one embodiment, the system comprises a local processing unit and a remote server computer. A first user provides a first audio signal containing a request for programming content from a service provider. The local processing unit receives the first audio signal and transmits the received first audio signal to a service provider via the back channel. The remote server computer receives the first audio signal from the back channel, recognizes the first user and the request for programming content, retrieves the requested programming content from a program database and transmits the programming content to the local processing unit via the forward channel.
    Type: Grant
    Filed: November 16, 2000
    Date of Patent: March 14, 2006
    Assignee: Sarnoff Corporation
    Inventors: Michael Chase Murdock, John Pearson, Paul Sajda
  • Patent number: 6999933
    Abstract: A speech recognition device (1) processes speech data (SD) of a dictation and thus establishes recognized text information (ETI) and link information (LI) of the dictation. In a synchronous playback mode of the speech recognition device (1), during the acoustic playback of the dictation, a correction device (10) synchronously marks the word of the recognized text information (ETI) that relates to the speech data (SD) just played back, as indicated by the link information (LI). The correction device (10) now allows, when the synchronous playback mode is active, correction of an incorrect word of the recognized text information (ETI), so that time-saving correction of incorrect words is possible.
    Type: Grant
    Filed: March 25, 2002
    Date of Patent: February 14, 2006
    Assignee: Koninklijke Philips Electronics N.V.
    Inventor: Dieter Hoi
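    The role of the link information in the abstract above can be illustrated with a small sketch. It assumes the link information is simply a sorted list of word start times in seconds, one per recognized word. That representation and the function names are hypothetical; the patent's actual data format is not given here.

```python
from bisect import bisect_right
from typing import List, Optional


def word_index_at(start_times: List[float], position: float) -> Optional[int]:
    """Return the index of the word being played at `position`, or None."""
    i = bisect_right(start_times, position) - 1
    return i if i >= 0 else None


def correct_word(words: List[str], start_times: List[float],
                 position: float, replacement: str) -> List[str]:
    """Return a copy of `words` with the currently marked word replaced."""
    fixed = list(words)
    i = word_index_at(start_times, position)
    if i is not None:
        fixed[i] = replacement
    return fixed


words = ["the", "quick", "fax"]
starts = [0.0, 0.4, 0.9]  # assumed link information: start time of each word
marked = word_index_at(starts, 0.5)
corrected = correct_word(words, starts, 1.2, "fox")
```

    Looking up the marked word from the playback position is what lets a correction be applied while playback is still running.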
  • Patent number: 6990449
    Abstract: A method for converting text to concatenated voice by utilizing a digital voice library and a set of playback rules is provided. The digital voice library includes a plurality of speech items including words and syllables and a corresponding plurality of voice recordings. Each speech item corresponds to at least one available voice recording. The method comprises training the digital voice library to associate each syllable speech item with a literal text syllable of the particular syllable speech item.
    Type: Grant
    Filed: March 27, 2001
    Date of Patent: January 24, 2006
    Assignee: Qwest Communications International Inc.
    Inventor: Eliot M. Case
  • Patent number: 6978240
    Abstract: A speech translation system and method are disclosed utilizing a holographic storage medium having a plurality of frames therein, each frame containing one or more discrete speech wave forms thereon for comparison with a spoken word wave form to select a wave form of a second language of equivalent meaning which through a digital audio player can be made audible for speech translation.
    Type: Grant
    Filed: January 31, 2002
    Date of Patent: December 20, 2005
    Inventor: Gregory R. Brotz
  • Patent number: 6975995
    Abstract: A system and method for providing a service of playing an accompaniment/musical performance is disclosed. In order to embody the system and method for providing the service of playing the accompaniment/musical performance, virtual orchestra system (VOS) files, which are converted from digital music files, e.g., musical instrument digital interface (MIDI) files, and include play order notes and sound data for each musical instrument capable of being played, are used. A server provides the VOS files through a network, e.g., a local area network (LAN), an Intranet, a value added network (VAN), the Internet or a public switched telephone network. A piece of music is selected by a user through at least one client terminal. The play order note for each musical instrument is provided and the sound data for each musical instrument is played based on the play order note, thereby playing in solo or in concert. (At this time, the sound for the other musical instruments is silent or used as background music.)
    Type: Grant
    Filed: December 20, 2000
    Date of Patent: December 13, 2005
    Assignees: Hanseulsoft Co., Ltd., P&IB Co., LTD
    Inventor: Yun-Jong Kim
  • Patent number: 6975912
    Abstract: A recording and/or reproducing apparatus includes a microphone, a semiconductor memory, an operating section and a controller. An output signal from the microphone is written in the semiconductor memory and the written signals are read out from the semiconductor memory. The operating section performs input processing for writing a digital signal outputted by an analog/digital converter, for reading out the digital signal stored in the semiconductor memory and for erasing the digital signal stored in the semiconductor memory. The control section controls the writing of the microphone output signal in the semiconductor memory based on an input from the operating section and the readout of the digital signal stored in the semiconductor memory.
    Type: Grant
    Filed: September 28, 2000
    Date of Patent: December 13, 2005
    Assignee: Sony Corporation
    Inventor: Kenichi Iida
  • Patent number: 6928405
    Abstract: The invention relates to a method for adding an information title containing audio data to a document. It provides a method for rapidly adding audio data to those data stored in a computer (PC or laptop) or PDA (Personal Data Assistant). The invention allows audio data to be recorded by an audio recorder and saved as an audio file after the information title of a document is opened. A link between the opened information title and the audio file is created, and an audio link tag is then shown on the information title linked with the audio data. The audio data can be easily retrieved by clicking the information title.
    Type: Grant
    Filed: September 5, 2001
    Date of Patent: August 9, 2005
    Assignee: Inventec Corporation
    Inventor: Shen-Yu Wu
  • Patent number: 6925340
    Abstract: An apparatus and method for recording and reproducing a sound (i.e. audio) signal corresponding to a video signal at a higher than normal speed. The method delimits a sound signal reproduced from a recording medium at a speed higher than a normal speed into successive processing unit periods. For each processing unit period, sound absence portion(s) of the reproduced sound signal are deleted (or partially deleted) within a range corresponding to a normal speed reproduction. Sound presence portions preceding and following the deleted absence portions are joined or compressed to produce a recognizable sound signal.
    Type: Grant
    Filed: August 22, 2000
    Date of Patent: August 2, 2005
    Assignee: Sony Corporation
    Inventors: Taro Suito, Masashi Ohta, Masayoshi Miura
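    The deletion of sound absence portions described in the abstract above can be sketched for a single processing unit. This minimal version assumes samples are floats, that "absence" means an absolute amplitude below a threshold, and that deletion works sample by sample. The threshold value and function name are hypothetical, and the abstract's compression of sound presence portions is left out.

```python
from typing import List


def drop_silence(unit: List[float], speed: float,
                 threshold: float = 0.01) -> List[float]:
    """Shorten one processing unit toward its normal-speed length.

    A unit read at `speed` times normal speed holds `speed` times too many
    samples. Silent samples are removed, earliest first, until the unit is
    down to the normal-speed length or no silent samples remain.
    """
    target = int(len(unit) / speed)
    excess = len(unit) - target
    out = []
    for sample in unit:
        if excess > 0 and abs(sample) < threshold:
            excess -= 1  # delete this silent sample
            continue
        out.append(sample)  # keep sound, and silence beyond the excess
    return out


unit = [0.5, 0.0, 0.0, 0.4, 0.0, 0.3, 0.0, 0.2]
double_speed = drop_silence(unit, 2.0)
partial = drop_silence([0.5, 0.0, 0.0, 0.0, 0.4, 0.3], 1.5)
```

    In the second call only two of the three silent samples are removed, which corresponds to the "partially deleted" case; when too little silence is available the result stays longer than the target.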
  • Patent number: 6876969
    Abstract: A document read-out apparatus has a document read-out function for reading out a document according to a first speech parameter. The document read-out apparatus is provided with a first specifying section which specifies a keyword, and a read-out section which reads out the document according to a second speech parameter, different from the first speech parameter, until the keyword within the document is reached.
    Type: Grant
    Filed: January 25, 2001
    Date of Patent: April 5, 2005
    Assignee: Fujitsu Limited
    Inventor: Makiko Nakao
  • Patent number: 6865537
    Abstract: A request for reading audio information is sent to an audio information source in accordance with an amount of information accumulated in a buffer memory, a predetermined amount of audio information is read from the buffer memory in accordance with a preset speed magnification, and the predetermined amount of audio information is reproduced after undergoing a reproducing speed conversion treatment. First portions of the audio information may be successively cut out in accordance with window functions, connected together, and rendered to serve as an output for converting a reproducing speed in a first channel. Second portions of the audio information may be successively cut out in accordance with window functions, connected together, and rendered to serve as an output for converting a reproducing speed in a second channel. The audio information may be independently reproduced through the first and second channels.
    Type: Grant
    Filed: March 29, 2001
    Date of Patent: March 8, 2005
    Assignees: Pioneer Corporation, Futek Electronics Co., LTD
    Inventors: Mitsuo Yasushi, Masatoshi Yanagidaira, Kunio Yarita
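    The abstract above describes cutting out portions of audio with window functions and connecting them to change the reproducing speed. One common way to do that is overlap-add, sketched below for one channel. The Hann window, the 50% overlap, the frame length and the function name are assumptions made for illustration, not details from the patent.

```python
import math
from typing import List


def ola_speed_change(samples: List[float], speed: float,
                     frame: int = 256) -> List[float]:
    """Overlap-add time scaling: read frames every `hop_in` samples and
    write them every `hop_out` samples, with hop_in = hop_out * speed."""
    hop_out = frame // 2
    hop_in = max(1, round(hop_out * speed))
    # Periodic Hann window; copies shifted by half a frame sum to 1.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame)
              for n in range(frame)]
    if len(samples) < frame:
        return []
    n_frames = (len(samples) - frame) // hop_in + 1
    out = [0.0] * ((n_frames - 1) * hop_out + frame)
    for k in range(n_frames):
        src, dst = k * hop_in, k * hop_out
        for n in range(frame):
            out[dst + n] += samples[src + n] * window[n]
    return out


constant = [1.0] * 4096
faster = ola_speed_change(constant, 2.0)
```

    For two-channel material the same function could be applied to each channel separately, in line with the abstract's independent first and second channels.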
  • Patent number: 6862568
    Abstract: A method for converting text to concatenated voice by utilizing a digital voice library and a set of playback rules is provided. The method comprises generating voice data based on a sequence of voice recordings by concatenating adjacent recordings in the sequence of voice recordings. Concatenating a first recording and a second recording adjacent to the first recording includes manipulating the ending sonic feature of the first recording to determine a first recording switch point, manipulating the starting sonic feature of the second recording to determine a second recording switch point, and synchronizing the first recording switch point and the second recording switch point.
    Type: Grant
    Filed: March 27, 2001
    Date of Patent: March 1, 2005
    Assignee: Qwest Communications International, Inc.
    Inventor: Eliot M. Case