Sound Editing Patents (Class 704/278)
  • Patent number: 6859779
    Abstract: A background sound sending side multiplexes and sends, in a multiplexer, uttered encoded speech data generated in a speech sending section and encoded background sound data outputted from a background sound storing section. Simultaneously, a background sound reproducing section reproduces the encoded background sound data, and the reproduced background sound signal is superposed on received speech in a receiving section and outputted from a receiver. A background sound receiving side demultiplexes, in a demultiplexer, received multiplexed data into received encoded speech data and encoded background sound data which are decoded in the receiving section and the background sound reproducing section respectively, and in the receiving section, a sound in which received speech and background sound are superposed is outputted from a receiver.
    Type: Grant
    Filed: February 27, 2001
    Date of Patent: February 22, 2005
    Assignee: Hitachi Ltd.
    Inventor: Tohru Yokoyama
  • Patent number: 6839672
    Abstract: An enhanced arrangement for a talking head driven by text is achieved by sending FAP information to a rendering arrangement that allows the rendering arrangement to employ the received FAPs in synchronism with the speech that is synthesized. In accordance with one embodiment, FAPs that correspond to visemes which can be developed from phonemes that are generated by a TTS synthesizer in the rendering arrangement are not included in the sent FAPs, to allow the local generation of such FAPs. In a further enhancement, a process is included in the rendering arrangement for creating a smooth transition from one FAP specification to the next FAP specification. This transition can follow any selected function. In accordance with one embodiment, a separate FAP value is evaluated for each of the rendered video frames.
    Type: Grant
    Filed: December 31, 1998
    Date of Patent: January 4, 2005
    Assignee: AT&T Corp.
    Inventors: Mark Charles Beutnagel, Joern Ostermann, Ariel Fischer, Yao Wang
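Note on patent 6839672: the abstract describes a smooth transition between successive FAP specifications that "can follow any selected function", with one FAP value evaluated per rendered video frame. A minimal Python sketch of that idea follows. The function names, the cosine easing curve, and the scalar FAP values are assumptions made for illustration and are not taken from the patent.

```python
import math


def ease_cosine(t):
    """Hypothetical easing curve: maps t in [0, 1] smoothly onto [0, 1]."""
    return 0.5 - 0.5 * math.cos(math.pi * t)


def fap_per_frame(start, end, n_frames, ease=ease_cosine):
    """Return one FAP value per rendered frame, moving from start to end.

    `ease` is the selected transition function; any callable that maps
    0 -> 0 and 1 -> 1 can be supplied.
    """
    if n_frames < 2:
        # With a single frame there is no room for a transition.
        return [end]
    return [
        start + (end - start) * ease(i / (n_frames - 1))
        for i in range(n_frames)
    ]
```

Passing `ease=lambda t: t` gives a plain linear transition; other curves can be swapped in without changing the per-frame evaluation loop.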
  • Publication number: 20040267539
    Abstract: A method for generating output sound data in a predetermined time period includes mixing input sound data in the predetermined time period with input sound data in the time period previous to the predetermined time period, and with output sound data in the time period previous to the predetermined time period in order to generate the output sound data in the predetermined time period.
    Type: Application
    Filed: August 29, 2003
    Publication date: December 30, 2004
    Inventor: Gin-Der Wu
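Note on publication 20040267539: the abstract describes each output period as a mix of the current input, the previous input, and the previous output. This can be sketched in Python as a first-order recursive mix. The scalar-per-period representation and the weights `a`, `b`, `c` are assumptions for illustration; the abstract gives no coefficients.

```python
def mix_periods(inputs, a=0.5, b=0.25, c=0.25):
    """Mix each input with the previous input and the previous output.

    The weights a, b, c are illustrative assumptions. The previous input
    and previous output are taken as 0.0 before the first period.
    """
    outputs = []
    prev_in = 0.0
    prev_out = 0.0
    for x in inputs:
        y = a * x + b * prev_in + c * prev_out
        outputs.append(y)
        prev_in, prev_out = x, y
    return outputs
```

Because each output feeds into the next one, a single input pulse decays over several periods rather than stopping abruptly.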
  • Publication number: 20040267541
    Abstract: A method and corresponding equipment by which a synthesizer/MIDI (musical instrument digital interface) device (10) is able to optimally perform a MIDI file (11) taking into account not the polyphony required by the MIDI file (11) as in SP-MIDI (scalable polyphony MIDI), but taking into account instead extended scalable polyphony (XSP) data 12b including the maximum number of instantaneous voices required by the MIDI file and the categories in which they occur for different channel masking, and also taking into account the architecture of the synthesizer/MIDI device (10) in terms of a voice complexity coefficient table (12b) indicating the relative complexity (corresponding to a resource requirement) for voices in each category. The result is a total voice requirement table 12c-1 indicating typically less masking than would be required for the same synthesizer/MIDI device to play the MIDI file according to SP-MIDI.
    Type: Application
    Filed: April 16, 2004
    Publication date: December 30, 2004
    Inventors: Matti S. Hamalainen, Timo Kosonen
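Note on publication 20040267541: the abstract combines per-category maximum voice counts with a voice complexity coefficient table to obtain a total voice requirement. A minimal Python sketch of that weighted sum follows. The category names, the table layout, and the rule of choosing the largest channel count that fits the device capacity are assumptions for illustration only.

```python
def total_voice_requirement(max_voices, complexity):
    """Sum per-category voice counts weighted by complexity coefficients."""
    return sum(count * complexity[cat] for cat, count in max_voices.items())


def channels_to_play(xsp_table, complexity, capacity):
    """Return the largest unmasked-channel count whose requirement fits.

    xsp_table maps a number of unmasked channels to a dict of
    {category: maximum instantaneous voices}. Returns 0 if nothing fits.
    """
    best = 0
    for n_channels in sorted(xsp_table):
        needed = total_voice_requirement(xsp_table[n_channels], complexity)
        if needed <= capacity:
            best = n_channels
    return best


# Hypothetical example data: two voice categories with different costs.
complexity = {"basic": 1.0, "sampled": 2.0}
xsp_table = {
    1: {"basic": 4},
    2: {"basic": 6, "sampled": 2},
    3: {"basic": 10, "sampled": 4},
}
```

With this example data and a capacity of 10, the device could play two channels unmasked, since the weighted requirement for three channels is 18.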
  • Publication number: 20040267540
    Abstract: The present invention (110) permits a user to speed up and slow down speech without changing the speaker's pitch (102, 110, 112, 128, 402-416). It is a user adjustable feature to change the spoken rate to the listener's preferred listening rate or comfort. It can be included on the phone as a customer convenience feature without changing any characteristics of the speaker's voice besides the speaking rate with soft key button (202) combinations (in interconnect or normal). From the user's perspective, it would seem only that the talker changed his speaking rate, and not that the speech was digitally altered in any way. The pitch and general prosody of the speaker are preserved. The following uses of the time expansion/compression feature are listed to complement already existing technologies or applications in progress including messaging services, messaging applications and games, real-time feature to slow down the listening rate.
    Type: Application
    Filed: June 27, 2003
    Publication date: December 30, 2004
    Applicant: MOTOROLA, INC.
    Inventors: Marc Andre Boillot, John Gregory Harris, Thomas Lawrence Reinke
  • Publication number: 20040236582
    Abstract: The invention aims at providing server apparatus capable of outputting image data and sound data thereby deactivating sound transmission at a low cost and with ease. That is, a sound input device (microphone) for converting sound to a sound signal is made detachable. A connection detector for detecting whether this sound input device (microphone) is connected is provided. In case the sound input device is connected to a sound input section, the sound transmission function is automatically controlled into the operating state. In case the sound input device is not connected, the sound transmission function is automatically controlled into the non-operating state. Thus, only a simple procedure of removing the sound input device from the sound input section is needed to deactivate sound transmission. This allows switching between activation and deactivation of sound transmission at a low cost.
    Type: Application
    Filed: May 13, 2004
    Publication date: November 25, 2004
    Applicant: MATSUSHITA ELECTRIC INDUSTRIAL CO., LTD.
    Inventors: Tadashi Yoshikai, Toshiyuki Kihara, Yoshiyuki Watanabe, Hisashi Koga, Yuji Arima
  • Publication number: 20040199395
    Abstract: A timeline-based approach for selecting and manipulating audio tracks is presented. This is accomplished via a graphical user interface that provides users with a series of visual cues and enhancements when selecting a particular area of an audio track depicted within the interface. These visual cues are rendered as a display region having multiple other display areas, components or interface components that provide the user with a location for initiating actions upon the file. User input provided to the timeline component generates a selection overlay that indicates a selected area of the audio file. The user can perform numerous actions with that audio file, such as copying and pasting. The user can do this more quickly and efficiently because the user is not required to switch tools. Everything is accomplished “modelessly.” Multiple instances of the selection overlay applied, for example, across multiple audio tracks may achieve even more powerful results.
    Type: Application
    Filed: April 1, 2004
    Publication date: October 7, 2004
    Inventor: Egan Schulz
  • Publication number: 20040193429
    Abstract: A singing voice extraction section 2 that extracts human singing voices from the digital sound generator data 11 and obtains the singing voice data 12 in the ADPCM format, the BGM generating section 13 that generates the BGM data 13 in the MIDI format, the MIDI adjustment section 4 that generates simulated singing voice data in the MIDI format based on the extracted singing voices and adds such data to the BGM data 13, and the file generating section 5 that processes the singing voice data 12 and the BGM and simulated singing voice data 14 into a single music file 15 are established. Bandwidth limiting is significantly executed for the singing voice portion. Through generating the BGM portion in the MIDI format, the entire amount of data is decreased, and the singing voice portion that has been deteriorated in quality due to the performance of bandwidth limiting is supplemented by MIDI data. Due to this, the quality of the reproduced singing voices can be also maintained beyond a predominating level.
    Type: Application
    Filed: March 12, 2004
    Publication date: September 30, 2004
    Applicant: SUNS-K CO., LTD.
    Inventor: Hirohito Kimoto
  • Publication number: 20040186733
    Abstract: The stream sourcing content delivery system goes to a database and builds a physical stream, based on a schedule. The stream source content delivery system works at a station ID (SID), finds the order of the delivery of content for the station based upon the schedule, and downloads a plurality of music files to its hard drive to enable play back. The stream source content delivery system then concatenates the files, to create a stream, and awaits the request of one or more stream recipients. Some preferred system embodiments further comprise a fail-safe mode, whereby a loop of music is generated from the downloaded stream, and is delivered to one or more users when further access to content is interrupted, such that recipients experience an uninterrupted delivery of a plurality of files, e.g. songs.
    Type: Application
    Filed: October 16, 2003
    Publication date: September 23, 2004
    Inventors: Stephen Loomis, David Biderman, Simon Gibson, Thomas Pepper, Andrew Dickson
  • Publication number: 20040186734
    Abstract: A method and apparatus for mixing audio streams, and an information storage medium that stores mixing information. The information storage medium includes at least one audio stream that contains a multiplicity of audio data obtained from respective multiple channels, and mixing information used to mix at least parts of the multiplicity of audio data. Accordingly, it is possible to mix and reproduce different types of channel components without changing the channel formats of different audio streams. Furthermore, it is also possible to perform dynamic mixing on multiple channel components, thus enabling adaptation to a change in audio content and characteristics thereof and thereby reproducing audio data more appropriately. In particular, since mixing information is described in interactive data allowing an interaction with a user, it is possible to provide the user with more applications.
    Type: Application
    Filed: December 29, 2003
    Publication date: September 23, 2004
    Applicant: Samsung Electronics Co., Ltd.
    Inventors: Jung-kwon Heo, Sung-wook Park, Hyun-kwon Chung, Kil-soo Jung
  • Patent number: 6782365
    Abstract: A graphic interface system and product are provided for editing an encoded audio signal. The system includes a receiver for receiving an encoded audio signal having multiple frequency subbands, as well as control logic operative to generate a spectral graph of the encoded audio signal, the spectral graph including an amplitude of each frequency subband as a function of time, and to mark a selectable edit point of the encoded audio signal. The system also includes a display unit for displaying the spectral graph including the edit point marked, and an input device for selecting the edit point. The product includes a storage medium having computer readable programmed instructions recorded thereon.
    Type: Grant
    Filed: December 20, 1996
    Date of Patent: August 24, 2004
    Assignee: Qwest Communications International Inc.
    Inventor: Eliot M. Case
  • Patent number: 6778960
    Abstract: A speech information processing apparatus which sets the duration of phonological series with accuracy, and sets a natural phoneme duration in accordance with phonemic/linguistic environment. For this purpose, the duration of a predetermined unit of phonological series is obtained based on a duration model for an entire segment. Then the duration of each of the phonemes constructing the phonological series is obtained based on the duration model for the entire segment. Then the duration of each phoneme is set based on the duration of the phonological series and the duration of each phoneme.
    Type: Grant
    Filed: March 28, 2001
    Date of Patent: August 17, 2004
    Assignee: Canon Kabushiki Kaisha
    Inventor: Toshiaki Fukada
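Note on patent 6778960: the abstract sets each phoneme's final duration from both the duration of the whole phonological series and the individual phoneme estimates, without stating how the two are combined. One simple way to reconcile them, assumed here purely for illustration, is to scale the per-phoneme estimates proportionally so that they sum to the series duration.

```python
def fit_phoneme_durations(series_duration, phoneme_estimates):
    """Scale per-phoneme estimates so they sum to the series duration.

    Proportional scaling is an assumption for illustration; the abstract
    does not specify the combination rule.
    """
    total = sum(phoneme_estimates)
    if total <= 0:
        raise ValueError("phoneme estimates must sum to a positive value")
    scale = series_duration / total
    return [d * scale for d in phoneme_estimates]
```

For example, estimates of 50, 100 and 50 ms fitted to a 300 ms series become 75, 150 and 75 ms, which keeps the relative phoneme lengths while honouring the series-level duration.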
  • Patent number: 6772125
    Abstract: An audio/video reproducing apparatus is connectable to a communications network for selectively reproducing items of audio/video material from a recording medium in response to a request received via the communications network. The audio/video reproducing apparatus may comprise a control processor operable in use to receive data representing the request for the audio/video material item via the communications network. A reproducing processor is operable in response to signals identifying the audio/video material items from the control processor to reproduce the audio/video material items. The data identifying the audio/video material items includes meta data indicative of the audio/video material items. The meta data may be one of UMID, tape ID and time codes, and a Unique Material Identifier of the material items.
    Type: Grant
    Filed: December 4, 2001
    Date of Patent: August 3, 2004
    Assignee: Sony United Kingdom Limited
    Inventors: Vincent Carl Harradine, Alan Turner, Morgan William David, Michael Williams, Mark John McGrath, Andrew Kydd, Jonathan Thorpe
  • Publication number: 20040148177
    Abstract: The performance method of audio effect is executed by application software together with an output device. A referring signal indicative of a genre of a digital audio signal is provided. A DSP engine accepts the referring signal to process the referring signal and to produce a genre command. The output device accepts the genre command to synchronously perform an audio effect. Apparatus of audio performance includes a digital signal processor engine, an output circuit, and audio output device. The DSP engine, which accepts a referring signal, is configured to produce a genre command in accordance with the referring signal. The output circuit is configured to accept the genre command and process an audio effect of the genre in accordance with the genre command. The audio output device is configured to perform the audio effect of the genre without manual settings.
    Type: Application
    Filed: January 27, 2003
    Publication date: July 29, 2004
    Inventors: Yung-Chiuan Weng, Chu Hung-Yi Cheng
  • Publication number: 20040143349
    Abstract: Broadcast music, or other audio that a user wants to hear, is recorded based on criteria obtained from a user. Any of a plurality of techniques may be used to identify the audio, alone or in combination with other identification techniques, including length of song, fingerprint recognition of digital or analog audio, scheduled programming, or metadata transmitted in the same or an adjacent channel or frequency. The criteria used to determine whether to save a recording may be based on attributes included in the identification database, such as artist, genre, popularity, station programming, year, signal quality, etc. The audio selected by a user for listening may be recorded, or a programmable tuner or other input selector may automatically record desired music regardless of whether the music is being output for listening. The audio recorded may be obtained from any source, including analog and digital radio, Internet radio streams and removable pre-recorded media.
    Type: Application
    Filed: October 28, 2003
    Publication date: July 22, 2004
    Applicant: Gracenote, Inc.
    Inventors: Dale T. Roberts, Michael W. Mantle, Maxwell Wells, Randall Cook, Brian Hamilton
  • Publication number: 20040133430
    Abstract: This invention is related to a sound apparatus including a unit which selects one audio information from pieces of audio information of a plurality of application programs executed in an information processing apparatus, and a unit which outputs the selected audio information.
    Type: Application
    Filed: December 18, 2003
    Publication date: July 8, 2004
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventor: Takayasu Tsuchiuchi
  • Publication number: 20040128144
    Abstract: Systems and methods by which voice/data communications may occur in multiple modes/protocols are disclosed. In particular, systems and methods are provided for multiple native mode/protocol voice and data transmissions and receptions with a computing system having a multi-bus structure, including, for example, a TDM bus and a packet bus, and multi-protocol framing engines. Such systems preferably include subsystem functions such as PBX, voice mail and other telephony functions, LAN hub and data router. In preferred embodiments, a TDM bus and a packet bus are intelligently bridged and managed, thereby enabling such multiple mode/protocol voice and data transmissions to be intelligently managed and controlled with a single, integrated system. A computer or other processor includes a local area network controller, which provides routing and hub(s) for one or more packet networks. The computer also is coupled to a buffer/framer, which serves to frame/deframe data to/from the computer from TDM bus.
    Type: Application
    Filed: October 10, 2003
    Publication date: July 1, 2004
    Inventors: Christopher Sean Johnson, Scott K. Pickett
  • Patent number: 6738784
    Abstract: The invention relates to an information and/or document processing system. The information processing system includes at least one user input device, a transcription center at which voice files received from the at least one user input device are transcribed to text format and a natural language processing system receiving the transcribed voice files for analysis and processing. The natural language processing system applies knowledge based analysis for compiling the transcribed voice files. The system further includes a dynamic experiential database processing the compiled and transcribed voice files to add value to the incoming information.
    Type: Grant
    Filed: April 6, 2000
    Date of Patent: May 18, 2004
    Assignee: Dictaphone Corporation
    Inventor: Simon L. Howes
  • Publication number: 20040093220
    Abstract: A method for generating subtitles for audiovisual material receives and analyses a text file containing dialogue spoken in audiovisual material and provides a signal representative of the text. The text information and audio signal are aligned in time using time alignment speech recognition and the text and timing information are then output to a subtitle file. Colours can be assigned to different speakers or groups of speakers. Subtitles are derived by receiving and analyzing a text file containing dialogue spoken by considering each word in turn and the text information signal, assigning a score to each subtitle in a plurality of different possible subtitle formatting options which lead to that word. The steps are then repeated until all the words in the text information signal have been used and the subtitle formatting option which gives the best overall score is then derived.
    Type: Application
    Filed: December 6, 2002
    Publication date: May 13, 2004
    Inventors: David Graham Kirby, Christopher Edward Poole, Adam Wiewiorka, William Oscar Lahr
  • Patent number: 6735564
    Abstract: A method and arrangement for managing talk groups of a telecommunication system at a dispatcher station of the telecommunications system having one or more talk groups which may consist of one or more users and which are controlled by the dispatcher at the dispatcher station. The arrangement includes a two-channel or a multichannel sound reproducing system which is configured to create an artificial acoustic space at the dispatcher station, and reproduce voices of each talk group so that the voices are heard from a certain point of the acoustic space, which allows the dispatcher to recognize the talk group to which the voice belongs on the basis of the location of the voice.
    Type: Grant
    Filed: December 28, 2000
    Date of Patent: May 11, 2004
    Assignee: Nokia Networks Oy
    Inventor: Pekka Puhakainen
  • Patent number: 6728680
    Abstract: A data processing system collects video and audio samples of acceptable speech production. A video camera focuses on a speaker's face and, particularly, articulation visible in the area of the mouth or other body movements associated with speech production. Video files are used to archive acceptable and unacceptable productions. These files may then be used to provide feedback about acceptable and unacceptable ways to produce speech. A speech professional or language teacher may play a model speech production and a subject speech attempt simultaneously to compare articulation, audio analysis, and appearance of articulators. A subject may play a model speech production and record a speech attempt simultaneously to attempt to mimic the appearance of articulators. Image processing may be used to create a mirror image of a video model or a current attempt or both to avoid left-right confusion.
    Type: Grant
    Filed: November 16, 2000
    Date of Patent: April 27, 2004
    Assignee: International Business Machines Corporation
    Inventors: Joseph D. Aaron, Peter Thomas Brunet, Frederik C. M. Kjeldsen, Paul S. Luther, Robert Bruce Mahaffey
  • Patent number: 6728682
    Abstract: Audio associated with a video program, such as an audio track or live or recorded commentary, may be analyzed to recognize or detect one or more predetermined sound patterns, such as words or sound effects. The recognized or detected sound patterns may be used to enhance video processing, by controlling video capture and/or delivery during editing, or to facilitate selection of clips or splice points during editing.
    Type: Grant
    Filed: October 25, 2001
    Date of Patent: April 27, 2004
    Assignee: Avid Technology, Inc.
    Inventor: Peter Fasciano
  • Patent number: 6721711
    Abstract: The present invention relates to an audio waveform reproduction apparatus for reproducing a recorded audio waveform at a reproduction tempo that can be specified as desired, and its object is to achieve that the reproduction does not deviate from the tempo when performed at a tempo that is different from the tempo at the time of recording of the audio waveform.
    Type: Grant
    Filed: October 18, 2000
    Date of Patent: April 13, 2004
    Assignee: Roland Corporation
    Inventor: Atsushi Hoshiai
  • Patent number: 6717522
    Abstract: A message providing apparatus according to one embodiment is apparently a small-sized (for example, screen size of about 3.5 inches) flat display (for example, LCD), wherein a speaker is provided at a cabinet, and a memory card that is a message data storage medium is inserted into the cabinet. Advertisement display data is stored to be classified into a moving image layer indicating a landscape, a still image layer indicating a commodity, a graphic (POP) layer indicating a price, a character layer indicating a catch phrase, and a sound layer which are displayed to be superimposed as required. BGM music data is also stored as the sound layer, and is reproduced in synchronism with a display. Advertisement characters are printed at the cabinet. When power is supplied, the apparatus automatically starts data reproduction of the memory card (by using a power on play function), and when all the data has been reproduced, the apparatus automatically restarts reproduction (by using an auto replay function).
    Type: Grant
    Filed: June 16, 2000
    Date of Patent: April 6, 2004
    Assignee: Toppan Printing Co., Ltd.
    Inventors: Hideo Nagatomo, Toshihiko Kobayashi
  • Patent number: 6711543
    Abstract: A voice operated portable information management system that is substantially language independent and capable of supporting a substantially unlimited vocabulary. The system includes a microphone, speaker, clock and GPS connected to a speech processing system.
    Type: Grant
    Filed: May 30, 2002
    Date of Patent: March 23, 2004
    Assignee: CameronSound, Inc.
    Inventor: Seth A. Cameron
  • Patent number: 6704671
    Abstract: The present invention provides for a method and system for identifying a sonic event of interest within a received audio signal. A sonic event is characterized by a predetermined rate of change in the perceived audio volume, and is associated with the loudness of the audio. The present invention detects a sonic event such as a percussive hit without requiring that the detector be disabled for a fixed time to avoid false triggering. Because the detector is not disabled during the detection process, sonic events occurring in close proximity are easily recognized and not ignored as in some conventional systems.
    Type: Grant
    Filed: July 22, 1999
    Date of Patent: March 9, 2004
    Assignee: Avid Technology, Inc.
    Inventor: Frederick W. Umminger, III
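Note on patent 6704671: the abstract characterizes a sonic event by the rate of change in perceived volume and stresses that the detector is never disabled for a hold-off period. A rough Python sketch of that idea follows. Frame RMS level in dB is used as a stand-in for perceived loudness, and the frame size and rise threshold are illustrative values; all of these are assumptions rather than details from the patent.

```python
import math


def frame_levels_db(samples, frame_size):
    """Return the RMS level in dB for each complete frame of samples."""
    levels = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        rms = math.sqrt(sum(s * s for s in frame) / frame_size)
        # Clamp to avoid log10(0) on silent frames.
        levels.append(20.0 * math.log10(max(rms, 1e-6)))
    return levels


def detect_events(samples, frame_size=4, rise_db=12.0):
    """Return indices of frames whose level rises by at least rise_db.

    Every frame is tested against its predecessor, so there is no
    hold-off time and closely spaced events are all reported.
    """
    levels = frame_levels_db(samples, frame_size)
    return [
        i for i in range(1, len(levels))
        if levels[i] - levels[i - 1] >= rise_db
    ]
```

Two loud bursts separated by a single quiet frame are both reported, which is the behaviour a fixed hold-off detector would miss.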
  • Publication number: 20040044428
    Abstract: This invention has as its object to record audio data by a simple operation upon recording the audio data as an operation sound or startup sound of an image sensing apparatus such as a digital camera or the like. To this end, upon recording audio data, since a startup sound, operation sound, shutter sound, and self-timer sound are set as purposes of audio data to be recorded, an audio recording time is set by selecting a desired one of these purposes. After the audio recording time is set, when audio recording is started by a user's intention, audio recording is executed for the set time, and other operations such as an audio recording stop operation and the like are inhibited during this interval. Hence, audio recording is executed until the set audio recording time elapses.
    Type: Application
    Filed: August 29, 2003
    Publication date: March 4, 2004
    Inventor: Hiroaki Yoshino
  • Patent number: 6697796
    Abstract: A technique and apparatus to allow a digital search of the entries in a digital audio database such as the Flash memory of a telephone answering system, the hard drive of a voice messaging system, the audio tracks on a compact disk, a cassette tape, a digital video disk (DVD), a videotape, etc. In one disclosed embodiment, each entry in the digital audio database (e.g., each audio track, each voice message, etc.) is converted into textual information, and the converted textual information is associated with a particular audio segment within the digital audio database. The textual information allows a digital search to be performed for a particular voice message, or portion of a voice message, in a telephone answering device, or for a particular song on a music CD, etc. Once the particular audio segment(s) containing a particular textual string is (are) located, that particular audio segment may be played or otherwise accessed, either in whole or in relevant part.
    Type: Grant
    Filed: January 13, 2000
    Date of Patent: February 24, 2004
    Assignee: Agere Systems Inc.
    Inventor: Bahram Ghaffarzadeh Kermani
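Note on patent 6697796: the abstract associates converted text with each audio segment so that a digital search can locate segments containing a given string. A minimal Python sketch of such a lookup follows, assuming the speech-to-text step has already produced one transcript per segment. The word-level index and the all-words-must-match query rule are simplifications chosen for illustration, and the segment ids are hypothetical.

```python
from collections import defaultdict

PUNCTUATION = ".,!?"


def build_index(transcripts):
    """Map each lowercased word to the set of segment ids containing it.

    transcripts is a dict of {segment_id: transcribed text}.
    """
    index = defaultdict(set)
    for segment_id, text in transcripts.items():
        for word in text.lower().split():
            index[word.strip(PUNCTUATION)].add(segment_id)
    return index


def search(index, query):
    """Return the segment ids whose transcript contains every query word."""
    words = [w.strip(PUNCTUATION) for w in query.lower().split()]
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result
```

The returned ids would then be used to play back or otherwise access the matching voice messages or tracks.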
  • Publication number: 20040034536
    Abstract: A system and method including an agent for selecting a song among simultaneously streaming songs based on user information is presented.
    Type: Application
    Filed: August 14, 2002
    Publication date: February 19, 2004
    Applicants: Sony Corporation, Sony Music Entertainment Inc.
    Inventor: David A. Hughes
  • Patent number: 6687671
    Abstract: A method is applied for providing electronically collected and summarized meeting information. First, verbal sounds are electronically collected. The verbal sounds are transmitted to a processor. The processor automatically converts the verbal sounds, as collected, into an electronic text file. The text from the file is then automatically scanned and summarized, in accordance with a predetermined algorithm for identifying one or more key terms in the text file, into an electronic summary file. The text file and/or the summary file are then automatically and electronically distributed to a predetermined number of users.
    Type: Grant
    Filed: March 13, 2001
    Date of Patent: February 3, 2004
    Assignees: Sony Corporation, Sony Electronics, Inc.
    Inventors: Gregory D. Gudorf, Philip Michael Abram, Marc Beckwitt, Kazuaki Iso, Brian Raymond, Brian M. Siegel
  • Publication number: 20040019491
    Abstract: A method of pitch corrected speed control (PCSC) playback in which a decoder rate controller receives a desired playback speed from a PCSC controller and determines the number of decoded digital audio samples stored in a buffer. The rate controller then determines the required number of execution times of a parametric speech decoder based on the desired playback speed and the number of decoded samples stored in the buffer. The parametric speech decoder is then executed the determined number of times.
    Type: Application
    Filed: July 23, 2002
    Publication date: January 29, 2004
    Inventor: Changwon D. Rhee
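Note on publication 20040019491: the abstract has a rate controller decide how many times to run the parametric speech decoder from the desired playback speed and the number of decoded samples already buffered. The Python sketch below shows one plausible calculation. The model (each output block consumes `speed * block_size` decoded samples, and each decoder run produces `frame_size` samples) and the default sizes are assumptions for illustration, not the method claimed in the publication.

```python
import math


def decoder_runs(speed, buffered, block_size=160, frame_size=160):
    """Return how many decoder executions are needed for the next block.

    speed      -- desired playback speed (1.0 is normal speed)
    buffered   -- decoded samples already waiting in the buffer
    block_size -- output samples per playback block (assumed value)
    frame_size -- samples produced per decoder execution (assumed value)
    """
    needed = math.ceil(speed * block_size)
    shortfall = max(0, needed - buffered)
    return math.ceil(shortfall / frame_size)
```

Under this model, faster playback drains the buffer sooner and triggers more decoder runs per block, while slow playback with a well-filled buffer may need none.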
  • Publication number: 20040015252
    Abstract: An audio signal processing device which processes audio signals and outputs the audio signals, is made capable of storing a plurality of setting data as scenes, selecting a portion to be copied as copy data from current data being setting data representing a current status of the device in accordance with the setting at a copy data selection section, selecting a scene to be a paste destination of the copy data in accordance with the setting at a paste destination selection section, and rewriting with the copy data a portion corresponding to the copy data among the scene selected as the paste destination.
    Type: Application
    Filed: July 9, 2003
    Publication date: January 22, 2004
    Applicant: YAMAHA CORPORATION
    Inventors: Masaru Aiso, Akio Suyama
  • Patent number: 6681208
    Abstract: A method of converting text to speech in a communication device includes providing a code table containing coded speech parameters. Next steps include inputting a text message into a communication device, and dividing the text message into phonics. A next step includes mapping each of the phonics against the code table to find the coded speech parameters corresponding to each of the phonics. A next step includes processing the coded speech parameters corresponding to each of the phonics to provide an audio signal. In this way, text can be mapped directly to a vocoder table without intermediate translation steps.
    Type: Grant
    Filed: September 25, 2001
    Date of Patent: January 20, 2004
    Assignee: Motorola, Inc.
    Inventors: Bin Wu, Fan He
  • Patent number: 6678661
    Abstract: A method for highlighting a desired portion in an audio sequence for use in a visual display challenged environment. The method includes storing the audio sequence in memory. Next, the user selects a desired portion of the audio sequence and the selected portion is distinguished from the remainder of the audio sequence by automatically varying an audio characteristic of the selected portion during playback, without permanently altering the selected portion. In a related embodiment, the audio characteristic that is varied is pitch of the selected portion.
    Type: Grant
    Filed: February 11, 2000
    Date of Patent: January 13, 2004
    Assignee: International Business Machines Corporation
    Inventors: Gordon James Smith, George Willard Van Leeuwen
  • Patent number: 6675141
    Abstract: A method and apparatus for converting the reproducing speed of an acoustic signal where, of the acoustic signals held in a data recording section 1, the input acoustic signal s1 (sampled for max. pitch cycle×2) is read from a process-start position P. A low-pass filter 7 controls the high-band component of the acoustic signal s1. A decimation section 8 performs appropriate down-sampling on a signal output from the low pass filter 7. The signal, thus down-sampled, is read into a signal buffer section 9. A down-sampled, input acoustic signal s2 is transferred from the signal buffer section 9 to a pitch-calculating section 3, which calculates a pitch cycle s3.
    Type: Grant
    Filed: October 26, 2000
    Date of Patent: January 6, 2004
    Assignee: Sony Corporation
    Inventors: Akira Inoue, Masayuki Nishiguchi
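    Illustrative note: the low-pass filtering and decimation stages described above can be sketched as follows. This is a rough sketch under stated assumptions: a moving average stands in for the low-pass filter, and the function names and the decimation factor are hypothetical, not taken from the patent.

```python
from typing import List


def moving_average(signal: List[float], width: int) -> List[float]:
    """Crude low-pass filter: mean of the current and up to width-1 previous samples."""
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - width + 1): i + 1]
        out.append(sum(window) / len(window))
    return out


def decimate(signal: List[float], factor: int) -> List[float]:
    """Keep every factor-th sample."""
    return signal[::factor]


def downsample(signal: List[float], factor: int) -> List[float]:
    """Low-pass first to limit aliasing, then decimate."""
    return decimate(moving_average(signal, factor), factor)
```

    The down-sampled signal is shorter, so a later pitch calculation would have fewer samples to process.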
  • Patent number: 6658307
    Abstract: A method for configuring the functional properties of an audiological device in the form of a hearing aid initially provides a hearing aid with an IC whose properties can be configured in different ways. Configuration upgrade information is either distributed to middlemen on a separate data carrier or transmitted online from a data store of the manufacturer to a programming station of the middleman. Using this configuration information, the middleman can upgrade hearing aids that are initially present as basic hearing aids in customized fashion, and the hearing aid manufacturer is paid for the upgrade.
    Type: Grant
    Filed: October 12, 2000
    Date of Patent: December 2, 2003
    Assignee: Siemens Aktiengesellschaft
    Inventor: Stefan Mueller
  • Publication number: 20030216924
    Abstract: A digital recording system for stereophonic audio signals is disclosed. Included is an MP3 encoder coupled to an ADC for encoding the two channels of digital audio signals into a compressed data stream. Before being directed into a flash memory for editing, the compressed data stream is split into the original two channels of audio data blocks. An editing device is coupled to the flash memory for block-by-block editing of the stored audio data blocks. On being subsequently retrieved from the flash memory, the two channels of audio data blocks are reformatted into an MP3 data stream preparatory to introduction into an MP3 decoder.
    Type: Application
    Filed: May 16, 2003
    Publication date: November 20, 2003
    Applicant: TEAC Corporation
    Inventors: Katsuhiko Yanase, Akira Suzuki, Eiji Ueda
  • Publication number: 20030208359
    Abstract: A method and apparatus for controlling buffering of an audio stream so that audio can be played continuously without stoppage using a minimum amount of buffering are provided. The method for controlling buffering of data includes (a) determining a buffering period of compressed audio data on the basis of a time stamp which corresponds to a packet division unit, by considering the state of a network, (b) calculating a time period where the compressed audio data is to be stored in a buffer, on the basis of the buffering period that is determined in step (a), (c) buffering the inputted compressed audio data in a decoding buffer for the time period that is calculated in step (b), and (d) decoding the compressed audio data that is stored in the decoding buffer when the time period passes in step (c) and storing the decoded audio data in a composition buffer for synchronization with video data.
    Type: Application
    Filed: January 7, 2003
    Publication date: November 6, 2003
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Sang-Uk Kang, Dae-Gyu Bae
  • Publication number: 20030158613
    Abstract: A method for testing micropower short-wave frequency-modulation digital radios includes the steps of providing each test station on a production line with a steel cage adapted to accommodate a transmitter and a corresponding receiver under test, and providing, between the two steel cages located at two adjacent test stations, an interference generator that transmits signal-free carriers having the same frequency as the electric waves transmitted from the transmitter under test. Electric waves radiated outward from a steel cage are first attenuated by that steel cage, then disrupted by electric waves from the interference generator, and then attenuated again by the steel cage at the other test station. Since the attenuated signal-free carriers are not decoded at the receiver in any other steel cage, radio tests can be conducted simultaneously at two adjacent test stations without mutual interference.
    Type: Application
    Filed: February 19, 2002
    Publication date: August 21, 2003
    Inventor: Peter Chen
  • Patent number: 6604078
    Abstract: In a voice edit device for editing voice information, the voice information is stored in a voice information storage unit 21, text information corresponding to the voice information stored in the voice information storage unit 21 is stored in a text information storage unit 23, and voice/text association information indicating the corresponding relationship between the voice information and the text information is stored in a voice/text association information storage unit 22. When the voice information is edited, a user indicates an edit target portion on a text displayed on a display device 6, and indicates an edit type. Display control means 12 outputs text edit target portion information indicating the text information which corresponds to the edit target portion indicated on the text, and editing means 14 edits the voice information stored in the voice information storage unit 21 on the basis of the text edit target portion information, the voice/text association information and the edit type.
    Type: Grant
    Filed: August 18, 2000
    Date of Patent: August 5, 2003
    Assignee: NEC Corporation
    Inventor: Izumi Shimazaki
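    Illustrative note: editing voice data through a text selection depends on the voice/text association. The sketch below assumes a hypothetical association format, one (start_sample, end_sample) pair per displayed word, and shows only a "delete" edit type; the patent does not specify these details.

```python
from typing import List, Tuple


def delete_by_text(
    samples: List[int],
    association: List[Tuple[int, int]],
    first_word: int,
    last_word: int,
) -> List[int]:
    """Remove the voice samples that correspond to words first_word..last_word.

    association[i] is the (start, end) sample range of word i, with end exclusive.
    """
    start = association[first_word][0]
    end = association[last_word][1]
    return samples[:start] + samples[end:]
```

    After a real edit the association table would also need to be updated so that later words point at their shifted sample ranges.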
  • Publication number: 20030144847
    Abstract: A unique, fully integrated, fully programmable, and highly flexible sound distribution system and methodology for providing masking sound, background music, and paging capabilities in up to eight zones of a building or space is provided. The methodology embodied in the system includes internal masking sounds that are uniquely pre-filtered to provide efficient and effective masking of distracting sounds within selectable zones of the space with a minimum masking sound dB sound level and with a pleasant sounding and non-annoying masking sound. The system also incorporates the capacity to be controlled from a remote or local telephone to adjust the volume level in any zone serviced by the system by issuing appropriate DTMF codes from the telephone's keypad. Unique bi-tone diagnostic functions are provided for assuring that the entire system is correctly wired and installed and for troubleshooting operational anomalies.
    Type: Application
    Filed: March 28, 2002
    Publication date: July 31, 2003
    Inventors: Kenneth P. Roy, Thomas J. Johnson, Ronald Fuller, Steve Dove
  • Patent number: 6591240
    Abstract: A speech signal modification and concatenation method is provided, in which spoken messages having different voice characteristics can be concatenated without causing a sense of incompatibility, and it is possible to efficiently perform addition or modification of spoken messages. In the speech signal modification and concatenation method, when two speech signals having different voice characteristics are concatenated, the speech signals are concatenated by modifying a parameter indicating a character of speech signals in a manner such that the parameter is gradually changed from a value indicating a feature of one of the speech signals to a value indicating a feature of the other speech signal over a predetermined period.
    Type: Grant
    Filed: September 25, 1996
    Date of Patent: July 8, 2003
    Assignee: Nippon Telegraph and Telephone Corporation
    Inventor: Masanobu Abe
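    Illustrative note: the gradual parameter change across a concatenation point can be sketched with linear interpolation. The abstract does not say which interpolation is used, so the linear choice and the name `transition` are assumptions.

```python
from typing import List


def transition(start_value: float, end_value: float, steps: int) -> List[float]:
    """Move a speech parameter from start_value to end_value over `steps` frames."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    span = end_value - start_value
    return [start_value + span * i / (steps - 1) for i in range(steps)]
```

    For example, a pitch-like parameter could be moved from the first speaker's value to the second speaker's value over the frames around the join.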
  • Publication number: 20030120495
    Abstract: A method for a digest generating apparatus to generate a digest of image content or sound content is provided. The digest generating apparatus obtains an audience rating or an audience count of image content or sound content at established time intervals, and extracts images from the image content or extracts sounds from the sound content. The extracted images or the extracted sounds correspond to a time when the audience rating or the audience count exceeds a threshold. Then, the digest generating apparatus generates a digest by using the extracted images or the extracted sounds. The audience rating or the audience count can be obtained by using audience data corresponding to an audience group having a specific user profile.
    Type: Application
    Filed: December 20, 2002
    Publication date: June 26, 2003
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Tomoki Watanabe, Shinya Uegaki, Katsumi Kishida, Koichiro Yamamoto, Takashi Hosobuchi, Wataru Inoue, Hisashi Matsukawa
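    Illustrative note: selecting digest segments by audience rating can be sketched as a threshold scan over ratings sampled at fixed intervals. The function name, the merging of adjacent intervals, and the sample values are assumptions made for illustration.

```python
from typing import List, Tuple


def digest_intervals(
    ratings: List[float], threshold: float, interval_seconds: int
) -> List[Tuple[int, int]]:
    """Return (start, end) time ranges, in seconds, where the rating exceeds the threshold.

    ratings[i] covers the interval starting at i * interval_seconds.
    Adjacent qualifying intervals are merged into one range.
    """
    ranges: List[Tuple[int, int]] = []
    for i, rating in enumerate(ratings):
        if rating > threshold:
            start = i * interval_seconds
            end = start + interval_seconds
            if ranges and ranges[-1][1] == start:
                ranges[-1] = (ranges[-1][0], end)
            else:
                ranges.append((start, end))
    return ranges
```

    The returned ranges identify which images or sounds to extract; the extraction itself is not shown.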
  • Patent number: 6577999
    Abstract: In a computer speech dictation system, a method for automatically managing a plurality of acoustic models. The method is intended to ensure that only high reliability acoustic models which accurately reflect word pronunciations for a given user are retained. The method is accomplished by assigning a base quality metric value for each of the acoustic models maintained by a speech recognition application. The quality metric is incremented or decremented upon the occurrence of certain events relevant to the reliability of the acoustic model. Acoustic models are discarded when the quality metric value falls below a threshold value.
    Type: Grant
    Filed: March 8, 1999
    Date of Patent: June 10, 2003
    Assignee: International Business Machines Corporation
    Inventors: James R. Lewis, Barbara Ballard
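    Illustrative note: the quality-metric bookkeeping described above can be sketched as a small class. The base value, threshold, event deltas, and the name `AcousticModelPool` are hypothetical; the abstract does not give concrete numbers or events.

```python
from typing import Dict


class AcousticModelPool:
    """Tracks a quality metric per acoustic model and discards unreliable models."""

    def __init__(self, base_quality: int = 10, threshold: int = 5) -> None:
        self.base_quality = base_quality
        self.threshold = threshold
        self.quality: Dict[str, int] = {}

    def add(self, word: str) -> None:
        """Register a model with the base quality metric value."""
        self.quality[word] = self.base_quality

    def record_event(self, word: str, delta: int) -> None:
        """Increment or decrement the metric; discard the model below the threshold."""
        if word not in self.quality:
            return
        self.quality[word] += delta
        if self.quality[word] < self.threshold:
            del self.quality[word]
```

    A positive delta might follow a recognition the user accepts, and a negative delta might follow a correction; those event choices are assumptions.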
  • Patent number: 6546367
    Abstract: Statistical data including an average value, a standard deviation, and a minimum value of a phoneme duration of each phoneme is stored in a memory. When speech production time is determined for a phoneme string in a predetermined expiratory paragraph, the total phoneme duration of the phoneme string is set so as to become equal to the speech production time. Based on the set phoneme duration, phonemes are connected and a speech waveform is generated. To set a phoneme duration for each phoneme, a phoneme duration initial value is first set based on an average value, obtained by equally dividing the speech production time by phonemes of the phoneme string, and a phoneme duration range, set based on statistical data of each phoneme. Then, the phoneme duration initial value is adjusted based on the statistical data and the speech production time.
    Type: Grant
    Filed: March 9, 1999
    Date of Patent: April 8, 2003
    Assignee: Canon Kabushiki Kaisha
    Inventor: Mitsuru Otsuka
  • Publication number: 20030065517
    Abstract: Audio information is read out of a recording medium which records musical pieces as audio information. The audio information read out is processed to detect BPM values and positions of beats in the musical piece. A musical piece is reproduced from the recording medium by reproducing audio information in accordance with the detected BPM values and positions of beats. The BPM value indicates the tempo of a musical piece, and the beat indicates a strength of a sound which repeatedly appears in each musical piece. When the audio information is reproduced in accordance with the detected BPM values and positions of beats, the musical piece is reproduced at the correct tempo and beats without giving an unnatural feeling to the listener.
    Type: Application
    Filed: September 23, 2002
    Publication date: April 3, 2003
    Applicant: PIONEER CORPORATION
    Inventors: Masahiko Miyashita, Koji Ogura, Kensuke Chiba, Takeaki Funada
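    Illustrative note: the relationship between beat positions and a BPM value can be sketched as below. This assumes beat times in seconds are already available; detecting beats from audio, which is what the abstract describes, is not shown, and `bpm_from_beats` is a hypothetical name.

```python
from typing import List


def bpm_from_beats(beat_times: List[float]) -> float:
    """Estimate tempo in beats per minute from ascending beat times in seconds."""
    if len(beat_times) < 2:
        raise ValueError("need at least two beats")
    mean_interval = (beat_times[-1] - beat_times[0]) / (len(beat_times) - 1)
    if mean_interval <= 0:
        raise ValueError("beat times must be increasing")
    return 60.0 / mean_interval
```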
  • Publication number: 20030055636
    Abstract: A gain adjustment unit uses a power ratio, Padd/Pdif, as an index for judging the strength of speech in an audio signal. Padd is the power of a sum signal of a left channel signal and a right channel signal, and Pdif is the power of the difference signal of the left channel signal and the right channel signal. When the power ratio is small, speech is absent from the audio signal and the gain of the sum signal of the left channel signal and right channel signal is minimized. As a result, it becomes possible to suppress a speech enhancement process when speech is absent from the audio signal to thereby eliminate negative effects associated therewith.
    Type: Application
    Filed: September 16, 2002
    Publication date: March 20, 2003
    Applicant: Matsushita Electric Industrial Co., Ltd.
    Inventors: Naoyuki Katuo, Yoshinori Kumamoto
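    Illustrative note: the Padd/Pdif index can be sketched as below. The hard threshold, the two gain levels, and the function names are assumptions; the abstract states only that the gain of the sum signal is minimized when the ratio is small.

```python
from typing import List


def power(signal: List[float]) -> float:
    """Mean squared amplitude of a signal."""
    return sum(x * x for x in signal) / len(signal)


def sum_signal_gain(
    left: List[float],
    right: List[float],
    ratio_threshold: float = 1.0,
    min_gain: float = 0.0,
    max_gain: float = 1.0,
) -> float:
    """Choose the gain for the (L + R) signal from the power ratio Padd / Pdif."""
    p_add = power([l + r for l, r in zip(left, right)])
    p_dif = power([l - r for l, r in zip(left, right)])
    if p_dif == 0.0:
        # Identical channels: treat any non-silent sum as speech-like.
        return max_gain if p_add > 0.0 else min_gain
    return max_gain if p_add / p_dif >= ratio_threshold else min_gain
```

    Speech is usually mixed to the center of a stereo image, so it appears mostly in the sum signal; that is why a large Padd/Pdif suggests speech is present.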
  • Patent number: 6522746
    Abstract: Methods and apparatus for processing a transmitted voice signal include a centralized frame controller providing at least one boundary control signal to voice processing blocks and controlling the operation of the voice processing blocks on the transmitted voice signal based upon the boundary control signal.
    Type: Grant
    Filed: November 3, 2000
    Date of Patent: February 18, 2003
    Assignee: Tellabs Operations, Inc.
    Inventors: Daniel J. Marchok, Richard C. Younce, Charles W. K. Gritton, Ravi Chandran
  • Publication number: 20030028385
    Abstract: An audio player includes a memory storing one or more audio data files and at least one personalized audio profile comprising for each ear a map of amplitude audio frequency profile. A file selector permits a user to select one or more of the stored audio data files and at least one of the personalized profiles. A data processor accesses the selected files and profiles from the memory and processes the selected files with the profile to generate a processed audio signal for each ear. An actuator reproduces the audio information stored in the processed audio signal as sound for each ear. Personalized profiles are produced by monitoring and storing the responses of users to audible signals having different frequencies.
    Type: Application
    Filed: July 1, 2002
    Publication date: February 6, 2003
    Inventor: Athena Christodoulou
  • Publication number: 20030014262
    Abstract: A system and method for providing a service of playing an accompaniment/musical performance is disclosed. In order to embody the system and method for providing the service of playing the accompaniment/musical performance, virtual orchestra system (VOS) files, which are converted from digital music files, e.g., musical instrument digital interface (MIDI) files, and include play order notes and sound data for each musical instrument capable of being played, are used. A server provides the VOS files through a network, e.g., a local area network (LAN), an Intranet, a value added network (VAN), the Internet or a public switched telephone network. A piece of music is selected by a user through at least one client terminal. The play order note for each musical instrument is provided and the sound data for each musical instrument is played based on the play order note, thereby playing in solo or in concert. (At this time, the sound for the other musical instruments is silent or used as background music.)
    Type: Application
    Filed: June 20, 2002
    Publication date: January 16, 2003
    Inventor: Yun-Jong Kim