Sound Editing Patents (Class 704/278)
  • Patent number: 8145497
    Abstract: Provided are a user interface for processing digital data, a method for processing a media interface, and a recording medium thereof. The user interface is used for converting a selected script into voice to generate digital data having a form of a voice file corresponding to the script, or for managing the generated digital data. In the method, the user interface is displayed. The user interface includes at least a text window on which a script to be converted into voice is written, and an icon to be selected for converting the script written on the text window into voice.
    Type: Grant
    Filed: July 10, 2008
    Date of Patent: March 27, 2012
    Assignee: LG Electronics Inc.
    Inventors: Tae Hee Ahn, Sung Hun Kim, Dong Hoon Lee
  • Patent number: 8145496
    Abstract: A programmed “Stutter Edit” creates, stores and triggers combinations of effects to be used on a repeated short sample (“slice”) of recorded audio. The combination of effects (“gesture”) acts on the sample over a specified duration (“gesture length”), with the change in parameters for each effect over the gesture length being dictated by user-defined curves. Such a system affords wide manipulation of audio recorded on-the-fly, perfectly suited for live performance. These effects preferably include not only stuttering but also imposing an amplitude envelope on the slice being triggered, sample rate and bit rate manipulation, panning (interpolation between pre-defined spatial positions), high- and low-pass filters and compression. Destructive edits, such as reversing, pitch shifting, and fading may also alter the way the Stutter Edit is heard. More advanced techniques, including the use of filters, FX processors, and other plug-ins, can increase the detail and uniqueness of a particular Stutter Edit effect.
    Type: Grant
    Filed: May 25, 2007
    Date of Patent: March 27, 2012
    Inventor: Brian Transeau
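The core gesture described above, repeating a short slice while a user-defined curve drives an effect parameter over the gesture length, can be sketched as follows. All names (`stutter_gesture`, `amp_curve`) are hypothetical illustrations, not the patented implementation:

```python
# Minimal sketch of a "stutter edit" gesture: repeat a short slice of audio
# over a gesture length while a user-defined curve scales the amplitude of
# each repeat.

def stutter_gesture(slice_samples, repeats, amp_curve):
    """Repeat `slice_samples` `repeats` times; scale repeat i by amp_curve(t),
    where t runs from 0.0 to 1.0 over the gesture length."""
    out = []
    for i in range(repeats):
        t = i / max(repeats - 1, 1)          # position within the gesture
        gain = amp_curve(t)                  # user-defined parameter curve
        out.extend(s * gain for s in slice_samples)
    return out

# A linearly decaying envelope: full volume at the start, silent at the end.
decay = lambda t: 1.0 - t
result = stutter_gesture([1.0, -1.0], repeats=3, amp_curve=decay)
```

The same loop structure extends to the other gesture parameters the abstract lists (pan position, filter cutoff, bit depth) by evaluating additional curves at each `t`.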
  • Patent number: 8121849
    Abstract: According to some embodiments, content filtering is provided for a digital audio signal.
    Type: Grant
    Filed: November 22, 2010
    Date of Patent: February 21, 2012
    Assignee: Intel Corporation
    Inventors: Christopher J. Cormack, Tony Moy
  • Patent number: 8116463
    Abstract: A method and an apparatus for detecting audio signals are disclosed. The input audio signal is inspected to check whether it is a foreground frame or a background frame; the detected background signal is further inspected according to a music characteristic value and a decision rule. Therefore, background music can be detected, and the classifying performance of the voice/music classifier is improved.
    Type: Grant
    Filed: December 27, 2010
    Date of Patent: February 14, 2012
    Assignee: Huawei Technologies Co., Ltd.
    Inventor: Zhe Wang
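The two-stage detection described in the abstract can be sketched roughly as below. The energy threshold and the lag-1 autocorrelation stand-in for the music characteristic value are assumptions for illustration, not the patented decision rule:

```python
# Stage 1: split frames into foreground/background by short-term energy.
# Stage 2: test background frames against a "music characteristic" threshold.

def frame_energy(frame):
    return sum(s * s for s in frame) / len(frame)

def classify_frames(frames, energy_thresh, music_value, music_thresh):
    """Return a label per frame: 'foreground', 'background-music', or 'background'."""
    labels = []
    for frame in frames:
        if frame_energy(frame) >= energy_thresh:
            labels.append("foreground")
        elif music_value(frame) >= music_thresh:
            labels.append("background-music")   # decision rule on background frames
        else:
            labels.append("background")
    return labels

# Toy "music characteristic": a periodicity proxy via lag-1 autocorrelation.
music_value = lambda f: sum(a * b for a, b in zip(f, f[1:])) / max(len(f) - 1, 1)
frames = [[0.9, 0.8, 0.9], [0.1, 0.1, 0.1], [0.1, -0.1, 0.1]]
labels = classify_frames(frames, 0.5, music_value, 0.005)
```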
  • Patent number: 8117034
    Abstract: A speech recognition device (1) processes speech data (SD) of a dictation and thus establishes recognized text information (ETI) and link information (LI) of the dictation. In a synchronous playback mode of the speech recognition device (1), during the acoustic playback of the dictation a correction device (10) synchronously marks the word of the recognized text information (ETI) that relates to the speech data (SD) just played back, as indicated by the link information (LI); the currently marked word features the position of an audio cursor (AC). When a user of the speech recognition device (1) recognizes an incorrect word, he positions a text cursor (TC) at the incorrect word and corrects it. Cursor synchronization means (15) now make it possible to synchronize the text cursor (TC) with the audio cursor (AC) or the audio cursor (AC) with the text cursor (TC), so that the positioning of the respective cursor (AC, TC) is simplified considerably.
    Type: Grant
    Filed: March 26, 2002
    Date of Patent: February 14, 2012
    Assignee: Nuance Communications Austria GmbH
    Inventor: Wolfgang Gschwendtner
  • Patent number: 8103511
    Abstract: An audio file generation method and system. A computing system receives a first audio file comprising first speech data associated with a first party. The computing system receives a second audio file comprising second speech data associated with a second party. The first audio file differs from the second audio file. The computing system generates a third audio file from the second audio file. The third audio file differs from the second audio file. The process to generate the third audio file includes identifying a first set of attributes missing from the second audio file and adding the first set of attributes to the second audio file. The process to generate the third audio file additionally includes removing a second set of attributes from the second audio file. The third audio file includes third speech data associated with the second party. The computing system broadcasts the third audio file.
    Type: Grant
    Filed: May 28, 2008
    Date of Patent: January 24, 2012
    Assignee: International Business Machines Corporation
    Inventors: Sara H. Basson, Brian R. Heasman, Dimitri Kanevsky, Edward Emile Kelley
  • Patent number: 8095367
    Abstract: Methods and systems of parasitic sensing are shown and described. The method includes measuring, at a first time using one or more electrical elements native to a domain, a parameter of a circuit within the domain, and measuring the parameter again at a second time using the one or more electrical elements native to the domain. The method also includes comparing the parameter measurement from the first time to the parameter measurement at the second time and determining, in response to the comparison, that an activity occurred within the domain.
    Type: Grant
    Filed: October 30, 2007
    Date of Patent: January 10, 2012
    Assignee: Raytheon BBN Technologies Corp.
    Inventors: Ronald Bruce Coleman, John Scott Knight, George Shepard, Richard Madden
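The comparison step at the heart of this method is straightforward: two measurements of the same circuit parameter are compared and activity is declared when the change exceeds a tolerance. The tolerance value and function name below are assumptions for illustration:

```python
# Sketch of the parasitic-sensing comparison: a circuit parameter (e.g. a
# measured impedance) sampled at two times is compared, and activity is
# declared when the change exceeds a tolerance.

def activity_occurred(measure_t1, measure_t2, tolerance):
    """Return True when the parameter shifted more than `tolerance`
    between the two measurements."""
    return abs(measure_t2 - measure_t1) > tolerance

quiet = activity_occurred(50.2, 50.3, tolerance=1.0)   # small drift: no activity
active = activity_occurred(50.2, 55.7, tolerance=1.0)  # large shift: activity
```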
  • Patent number: 8050415
    Abstract: A method and an apparatus for detecting audio signals are disclosed. The input audio signal is detected to determine whether it is a background frame. The detected background signal is further detected according to a music characterization value and a decision rule. Therefore, background music can be detected, and the classifying performance of the voice/music classifier is improved.
    Type: Grant
    Filed: April 25, 2011
    Date of Patent: November 1, 2011
    Assignee: Huawei Technologies Co., Ltd.
    Inventor: Zhe Wang
  • Patent number: 8050926
    Abstract: An apparatus for adjusting a prompt voice depending on the environment comprises a receiver module for receiving a background sound, an analyzer module for generating a control signal according to the background sound, and an output module for adjusting an output frequency of a prompt voice according to the control signal and outputting the adjusted prompt voice.
    Type: Grant
    Filed: November 5, 2007
    Date of Patent: November 1, 2011
    Assignee: Micro-Star Int'l Co., Ltd
    Inventor: Chien Ming Huang
  • Patent number: 8050931
    Abstract: In a masking sound generation apparatus, a CPU analyzes a speech utterance speed of a received sound signal. Then, the CPU copies the received sound signal into a plurality of sound signals and performs the following processing on each of the sound signals. Namely, the CPU divides each of the sound signals into frames on the basis of a frame length determined on the basis of the speech utterance speed. Reverse process is performed on each of the frames to replace a waveform of the frame with a reverse waveform, and a windowing process is performed to achieve a smooth connection between the frames. Then, the CPU randomly rearranges the order of the frames and mixes the plurality of sound signals to generate a masking sound signal.
    Type: Grant
    Filed: March 19, 2008
    Date of Patent: November 1, 2011
    Assignee: Yamaha Corporation
    Inventors: Atsuko Ito, Yasushi Shimizu, Akira Miki, Masato Hata
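The masking pipeline in the abstract (divide into frames, reverse each frame's waveform, window for smooth joins, randomize frame order) can be sketched as below. The frame length and triangular window are assumptions; the patent derives the frame length from the measured speech utterance speed:

```python
# Sketch of the masking-sound pipeline: frame the signal, reverse each frame,
# apply a triangular window for smooth connections, and shuffle frame order.
import random

def make_masker(signal, frame_len, seed=0):
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal), frame_len)]
    processed = []
    for frame in frames:
        rev = frame[::-1]                                  # reverse waveform
        n = len(rev)
        window = [1.0 - abs(2 * i / (n - 1) - 1.0) if n > 1 else 1.0
                  for i in range(n)]                       # triangular window
        processed.append([s * w for s, w in zip(rev, window)])
    random.Random(seed).shuffle(processed)                 # randomize frame order
    return [s for frame in processed for s in frame]

masked = make_masker([0.1, 0.2, 0.3, 0.4, 0.5, 0.6], frame_len=3)
```

The patent additionally mixes several independently shuffled copies of the signal; that is one more loop over `make_masker` with different seeds, summing the results.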
  • Publication number: 20110264453
    Abstract: In a method of adapting communications in a communication system comprising at least two terminals (1,2), a signal carrying at least a representation of at least part of an information content of an audio signal captured at a first terminal (1) and representing speech is communicated between the first terminal (1) and a second terminal (2). A modified version of the audio signal is made available at the second terminal (2). At least one of the terminals (1,2) generates the modified version by re-creating the audio signal in a version modified such that at least one prosodic aspect of the represented speech is adapted in dependence on input data (22) provided at at least one of the terminals (1,2).
    Type: Application
    Filed: December 15, 2009
    Publication date: October 27, 2011
    Applicant: KONINKLIJKE PHILIPS ELECTRONICS N.V.
    Inventors: Dirk Brokken, Nicolle Hanneke van Schijndel, Mark Thomas Johnson, Joanne Henriette Desiree Monique Westerink, Paul Marcel Carl Lemmens
  • Patent number: 8037413
    Abstract: This specification describes technologies relating to editing digital audio data. In some implementations, a computer-implemented method is provided. The method includes displaying a visual representation of audio data, receiving an input selecting a selected portion of audio data within the visual representation, the selecting including applying a brush tool to the visual representation of the audio data, and editing the selected portion of audio data including determining a degree of opacity for the selected audio data and applying an editing effect according to the degree of opacity.
    Type: Grant
    Filed: January 30, 2008
    Date of Patent: October 11, 2011
    Assignee: Adobe Systems Incorporated
    Inventor: David E. Johnston
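The opacity-weighted editing described above amounts to blending each sample with its effected version in proportion to the brush's opacity at that sample. The function and parameter names below are illustrative only:

```python
# Sketch of opacity-weighted editing: a brush stroke leaves an opacity value
# per sample, and the edit (here, a simple gain change) is applied in
# proportion to that opacity.

def apply_effect_with_opacity(samples, opacity, effect_gain):
    """Blend each sample with its effected version, weighted by brush opacity
    (0.0 = untouched, 1.0 = fully effected)."""
    return [s * (1.0 - o) + (s * effect_gain) * o
            for s, o in zip(samples, opacity)]

samples = [1.0, 1.0, 1.0]
opacity = [0.0, 0.5, 1.0]        # brush pressed harder toward the right
edited = apply_effect_with_opacity(samples, opacity, effect_gain=0.0)  # mute effect
```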
  • Patent number: 8032360
    Abstract: A system and method for high-quality variable speed playback of audio-visual (A/V) media is provided. The system receives an encoded visual signal and an encoded audio signal. The encoded visual signal is decoded to generate a decoded visual signal and the encoded audio signal is decoded to generate a decoded audio signal. The decoded audio signal is time scale modified to generate a time scale modified audio signal. The decoded visual signal and the time scale modified audio signal are then synchronized for playback at a predefined playback speed. Only partial decoding of the encoded audio signal may be performed to conserve processing power.
    Type: Grant
    Filed: May 13, 2004
    Date of Patent: October 4, 2011
    Assignee: Broadcom Corporation
    Inventor: Juin-Hwey Chen
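Time scale modification of the audio track is the step that lets playback speed change without shifting pitch. A naive overlap-add (OLA) sketch is below; the window and hop sizes are illustrative assumptions, and a production system would add waveform-similarity alignment to avoid phase artifacts:

```python
# Overlap-add (OLA) time-scale modification sketch: frames are read from the
# input at an analysis hop that depends on playback speed and written at a
# fixed synthesis hop with a cross-fade window, changing duration without
# changing pitch.

def ola_stretch(signal, speed, frame_len=4, syn_hop=2):
    """Return `signal` played at `speed` (2.0 = twice as fast)."""
    ana_hop = int(round(syn_hop * speed))       # read faster for higher speed
    window = [0.0, 0.5, 1.0, 0.5]               # sums to 1 at 50% overlap
    n_frames = max((len(signal) - frame_len) // ana_hop + 1, 1)
    out = [0.0] * (syn_hop * (n_frames - 1) + frame_len)
    for k in range(n_frames):
        frame = signal[k * ana_hop: k * ana_hop + frame_len]
        for i, s in enumerate(frame):
            out[k * syn_hop + i] += s * window[i]
    return out

fast = ola_stretch([1.0] * 16, speed=2.0)   # roughly half the output samples
same = ola_stretch([1.0] * 16, speed=1.0)
```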
  • Publication number: 20110239107
    Abstract: A transcript editor enables text-based editing of time-based media that includes spoken dialog. It involves an augmented transcript that includes timing metadata associating words and phrases within the transcript with the corresponding temporal locations within the time-based media where the text is spoken, allowing the augmented transcript to be edited without playback of the time-based media. After editing, the augmented transcript is processed by a media editing system to automatically generate an edited version of the time-based media that includes only the segments of the time-based media containing the speech corresponding to the edited augmented transcript.
    Type: Application
    Filed: March 29, 2010
    Publication date: September 29, 2011
    Inventors: Michael E. Phillips, Glenn Lea
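Because each word carries its time span, deleting words from the transcript directly yields the list of media segments to keep. A minimal sketch, assuming a simple (word, start, end) layout for the augmented transcript:

```python
# Sketch of text-driven media editing: the edited word list is matched against
# the timed transcript, and the surviving words' time spans are merged into
# the segments of media to keep.

def segments_to_keep(augmented_transcript, edited_words):
    """`augmented_transcript` is a list of (word, start_sec, end_sec) tuples;
    return merged time ranges covering only the words kept in the edit."""
    kept, remaining = [], list(edited_words)
    for word, start, end in augmented_transcript:
        if remaining and remaining[0] == word:
            remaining.pop(0)
            if kept and abs(kept[-1][1] - start) < 1e-9:
                kept[-1] = (kept[-1][0], end)      # extend an adjacent segment
            else:
                kept.append((start, end))
    return kept

transcript = [("hello", 0.0, 0.4), ("um", 0.4, 0.6), ("world", 0.6, 1.0)]
edl = segments_to_keep(transcript, ["hello", "world"])
```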
  • Publication number: 20110224990
    Abstract: A speaker speed conversion system includes: a risk site detection unit (22) for detecting sites of risk regarding sound quality from among speech that is received as input, a frame boundary detection unit (23) for searching for a plurality of points that can serve as candidates of frame boundaries from among speech that is received as input and, of these points, supplying as a frame boundary the point that is predicted to be best from the standpoint of sound quality, and an OLA unit (25) for implementing speed conversion based on the detection results in the frame boundary detection unit (23); wherein the frame boundary detection unit (23) eliminates, from candidates of frame boundaries, sites of risk regarding sound quality that were detected in the risk site detection unit (22).
    Type: Application
    Filed: July 22, 2008
    Publication date: September 15, 2011
    Inventor: Satoshi Hosokawa
  • Publication number: 20110218798
    Abstract: Techniques implemented as systems, methods, and apparatuses, including computer program products, for obfuscating sensitive content in an audio source representative of an interaction between a contact center caller and a contact center agent. The techniques include performing, by an analysis engine of a contact center system, a context-sensitive content analysis of the audio source to identify each audio source segment that includes content determined by the analysis engine to be sensitive content based on its context; and processing, by an obfuscation engine of the contact center system, one or more identified audio source segments to generate corresponding altered audio source segments each including obfuscated sensitive content.
    Type: Application
    Filed: March 5, 2010
    Publication date: September 8, 2011
    Applicant: Nexidia Inc.
    Inventor: Marsal Gavalda
  • Patent number: 7996230
    Abstract: A marker is derived from an interaction between a person and an agent of a business and the agent's user interface. A part of a speech signal that corresponds to a portion of the person's special information is located with the marker. The speech signal results from the interaction between the person and the agent. The part of the speech signal that corresponds to the portion of the person's special information is rendered unintelligible.
    Type: Grant
    Filed: August 6, 2009
    Date of Patent: August 9, 2011
    Assignee: Intellisist, Inc.
    Inventor: G. Kevin Doren
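Once the marker locates the span of speech carrying the special information, rendering it unintelligible reduces to overwriting that span of samples. Replacing the span with silence (rather than, say, noise or a tone) is an assumption for this sketch:

```python
# Illustrative redaction step: given a marker (start, end) in seconds derived
# from the agent's interaction, overwrite that span of the speech signal so
# the sensitive portion is unintelligible.

def redact(samples, sample_rate, start_sec, end_sec):
    out = list(samples)
    lo = int(start_sec * sample_rate)
    hi = min(int(end_sec * sample_rate), len(out))
    for i in range(lo, hi):
        out[i] = 0.0                      # render the span unintelligible
    return out

speech = [0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3]
clean = redact(speech, sample_rate=4, start_sec=0.5, end_sec=1.5)
```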
  • Patent number: 7991619
    Abstract: A system, method and computer program product for performing blind change detection audio segmentation that combines hypothesized boundaries from several segmentation algorithms to achieve the final segmentation of the audio stream. Automatic segmentation of the audio streams according to the system and method of the invention may be used for many applications like speech recognition, speaker recognition, audio data mining, online audio indexing, and information retrieval systems, where the actual boundaries of the audio segments are required.
    Type: Grant
    Filed: June 19, 2008
    Date of Patent: August 2, 2011
    Assignee: International Business Machines Corporation
    Inventors: Upendra V. Chaudhari, Mohamed Kamal Omar, Ganesh N. Ramaswamy
  • Publication number: 20110178805
    Abstract: According to one embodiment, a sound quality control device includes: a time domain analysis module configured to perform a time-domain analysis on an audio-input signal; a frequency domain analysis module configured to perform a frequency-domain analysis on a frequency-domain signal; a first calculation module configured to calculate first speech/music scores based on the analysis results; a compensation filtering processing module configured to generate a filtered signal; a second calculation module configured to calculate second speech/music scores based on the filtered signal; a score correction module configured to generate one of corrected speech/music scores based on a difference between the first speech/music score and the second speech/music score; and a sound quality control module configured to control a sound quality of the audio-input signal based on the one of the corrected speech/music scores.
    Type: Application
    Filed: September 29, 2010
    Publication date: July 21, 2011
    Inventors: Hirokazu Takeuchi, Hiroshi Yonekubo
  • Publication number: 20110145001
    Abstract: A data stream is filtered to produce a filtered data stream. The data stream is analyzed based on an acoustic parameter to determine whether a predetermined condition is satisfied. At least one extraneous portion of the data stream, in which the predetermined condition is satisfied, is determined. Thereafter, the at least one extraneous portion is deleted from the data stream to produce the filtered data stream.
    Type: Application
    Filed: December 10, 2009
    Publication date: June 16, 2011
    Applicant: AT&T INTELLECTUAL PROPERTY I, L.P.
    Inventors: Yeon-Jun KIM, I. Dan MELAMED, Bernard S. RENGER, Steven Neil TISCHER
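The filtering loop the abstract describes can be sketched as follows. Using frame energy as the acoustic parameter (e.g. to drop long silences) and the threshold value are assumptions for illustration:

```python
# Sketch of acoustic-parameter filtering: frames in which the predetermined
# condition holds (here, energy below a threshold) are treated as extraneous
# and deleted from the stream.

def filter_stream(frames, energy, threshold):
    """Drop frames whose energy falls below `threshold`."""
    return [f for f in frames if energy(f) >= threshold]

energy = lambda f: sum(s * s for s in f) / len(f)
stream = [[0.5, 0.5], [0.01, 0.0], [0.6, 0.4]]
filtered = filter_stream(stream, energy, threshold=0.01)
```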
  • Publication number: 20110145002
    Abstract: A method, apparatus, and computer-readable medium for editing a data stream based on a corpus are provided. The data stream includes stream words. A sequence includes a predetermined number of sequential words of the stream words. The method, apparatus, and computer-readable medium determine whether the sequence exists in the corpus at least at a predetermined minimum frequency. When the sequence exists in the corpus at least at the predetermined minimum frequency, the sequence is edited in the data stream.
    Type: Application
    Filed: September 17, 2010
    Publication date: June 16, 2011
    Applicant: AT&T INTELLECTUAL PROPERTY I, L.P.
    Inventors: Ilya Dan MELAMED, Yeon-Jun KIM
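The frequency test above is an n-gram lookup: count each fixed-length word sequence in the corpus, then check stream sequences against a minimum frequency. What "editing" does to a matched sequence (here, simply collecting it) is an assumption of this sketch:

```python
# Sketch of corpus-based editing: sequences of `n` stream words that occur in
# the corpus at least `min_freq` times qualify for editing.
from collections import Counter

def ngrams(words, n):
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

def edit_stream(stream_words, corpus_words, n, min_freq):
    counts = Counter(ngrams(corpus_words, n))
    edited = []
    for gram in ngrams(stream_words, n):
        if counts[gram] >= min_freq:
            edited.append(gram)           # sequence qualifies for editing
    return edited

corpus = "thank you for calling thank you for calling".split()
stream = "well thank you for your call".split()
matches = edit_stream(stream, corpus, n=2, min_freq=2)
```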
  • Patent number: 7962335
    Abstract: Techniques and tools related to delayed or lost coded audio information are described. For example, a concealment technique for one or more missing frames is selected based on one or more factors that include a classification of each of one or more available frames near the one or more missing frames. As another example, information from a concealment signal is used to produce substitute information that is relied on in decoding a subsequent frame. As yet another example, a data structure having nodes corresponding to received packet delays is used to determine a desired decoder packet delay value.
    Type: Grant
    Filed: July 14, 2009
    Date of Patent: June 14, 2011
    Assignee: Microsoft Corporation
    Inventors: Hosam A. Khalil, Tian Wang, Kazuhito Koishida, Xiaoqin Sun, Wei-Ge Chen
  • Publication number: 20110137658
    Abstract: Provided is a method of canceling a vocal signal, wherein the method includes obtaining a difference signal between two audio signals; and smoothing the frequency of the difference signal. Also provided is a device for canceling a vocal signal, the device including a subtracter which obtains a difference signal between two audio signals; and a frequency smoothing unit which smoothes a frequency of the difference signal.
    Type: Application
    Filed: October 12, 2010
    Publication date: June 9, 2011
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventor: Jun-ho LEE
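The subtraction step works because material panned identically into both stereo channels (typically the lead vocal) cancels in the difference signal. In the sketch below, a 3-tap moving average stands in for the frequency-smoothing step of the abstract and is only an illustrative simplification:

```python
# Minimal sketch of center-channel vocal cancellation: subtract the two
# channels, then smooth the difference signal.

def cancel_vocal(left, right):
    diff = [l - r for l, r in zip(left, right)]       # center content cancels
    smoothed = []
    for i in range(len(diff)):
        lo, hi = max(i - 1, 0), min(i + 2, len(diff))
        smoothed.append(sum(diff[lo:hi]) / (hi - lo)) # simple smoothing
    return smoothed

vocal = [0.5, 0.5, 0.5, 0.5]                 # identical in both channels
side = [0.1, 0.2, 0.1, 0.2]                  # instruments, left channel only
left = [v + s for v, s in zip(vocal, side)]
right = list(vocal)
out = cancel_vocal(left, right)
```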
  • Patent number: 7945446
    Abstract: Spectrum envelope of an input sound is detected. In the meantime, a converting spectrum is acquired which is a frequency spectrum of a converting sound comprising a plurality of sounds, such as unison sounds. Output spectrum is generated by imparting the detected spectrum envelope of the input sound to the acquired converting spectrum. Sound signal is synthesized on the basis of the generated output spectrum. Further, a pitch of the input sound may be detected, and frequencies of peaks in the acquired converting spectrum may be varied in accordance with the detected pitch of the input sound. In this manner, the output spectrum can have the pitch and spectrum envelope of the input sound and spectrum frequency components of the converting sound comprising a plurality of sounds, and thus, unison sounds can be readily generated with simple arrangements.
    Type: Grant
    Filed: March 9, 2006
    Date of Patent: May 17, 2011
    Assignee: Yamaha Corporation
    Inventors: Hideki Kemmochi, Yasuo Yoshioka, Jordi Bonada
  • Patent number: 7933768
    Abstract: A vocoder system for improving the performance expression of an output sound while lightening the computational load. The system includes formant detection means and division means in which the center frequencies have been fixed. The modulation levels of each of the frequency bands divided by the division means are set by a setting means based on the levels of the corresponding frequency bands detected by the formant detection means and on formant information with which the formants are changed. Therefore, it is possible to improve the performance expression of the output sound with a light computational load and without the need to calculate and change the filter coefficients of each filter for each sample in order to change the center frequency and bandwidth of each of the filters comprising the division means.
    Type: Grant
    Filed: March 23, 2004
    Date of Patent: April 26, 2011
    Assignee: Roland Corporation
    Inventor: Tadao Kikumoto
  • Publication number: 20110054910
    Abstract: A system provided herein may perform automatic temporal alignment between a music audio signal and lyrics with higher accuracy than ever. A non-fricative section extracting portion 4 extracts non-fricative sound sections, where no fricative sounds exist, from the music audio signal. An alignment portion 17 includes a phone model 15 for singing voice capable of estimating phonemes corresponding to temporal-alignment features. The alignment portion 17 performs an alignment operation using as inputs temporal-alignment features obtained from a temporal-alignment feature extracting portion 11, information on vocal and non-vocal sections obtained from a vocal section estimating portion 9, and a phoneme network SN, on conditions that no phonemes exist at least in non-vocal sections and that no fricative phonemes exist in non-fricative sound sections.
    Type: Application
    Filed: February 5, 2009
    Publication date: March 3, 2011
    Applicant: NATIONAL INSTITUTE OF ADVANCED INDUSTRIAL SCIENCE AND TECHNOLOGY
    Inventors: Hiromasa Fujihara, Masataka Goto
  • Patent number: 7865367
    Abstract: A system that includes a speaker workstation and a system that includes an auditor device. The speaker workstation is configured to perform a method for generating a Speech Hyperlink-Time table in conjunction with a system of universal time. The speaker workstation creates a Speech Hyperlink table. While a speech is being spoken by a speaker, the speaker workstation recognizes each hyperlinked term of the Speech Hyperlink table being spoken by the speaker, and for each recognized hyperlinked term, generates a row in the Speech Hyperlink-Time table. The auditor device is configured to perform a method for processing a speech in conjunction with a system of universal time. The auditor device determines and records, in a record of a Selections Hyperlink-Time table, a universal time corresponding to a hyperlinked term spoken during a speech.
    Type: Grant
    Filed: March 12, 2009
    Date of Patent: January 4, 2011
    Assignee: International Business Machines Corporation
    Inventor: Fernando Incertis Carro
  • Patent number: 7865360
    Abstract: An audio device for modifying the voice transmitted during a telephone call particularly suitable for a mobile telephone system receives from the user of the audio device an analog speech signal. A converter converts the analog speech signal into a digital speech signal comprising at least one fundamental frequency. A set of coded data represents a musical score comprising a set of notes, each note being defined by a fundamental frequency, a duration, and an instrument that plays the note. A digital music signal is extracted from the set of coded data, and a first portion of the digital speech signal is mixed with a first portion of the digital music signal to produce a combined digital signal.
    Type: Grant
    Filed: March 18, 2004
    Date of Patent: January 4, 2011
    Assignee: IPG Electronics 504 Limited
    Inventors: Xavier Fourquin, Pierre Bonnard
  • Patent number: 7865370
    Abstract: According to some embodiments, content filtering is provided for a digital audio signal.
    Type: Grant
    Filed: November 21, 2008
    Date of Patent: January 4, 2011
    Assignee: Intel Corporation
    Inventors: Christopher J. Cormack, Tony Moy
  • Publication number: 20100332237
    Abstract: According to one embodiment, a sound quality correction apparatus calculates various feature parameters for identifying the speech signal and the music signal from an input audio signal and, based on the various feature parameters thus calculated, also calculates a speech/music identification score indicating which of the speech signal and the music signal the input audio signal is closer to. Then, based on this speech/music identification score, the correction strength of each of plural sound quality correctors is controlled to execute different types of the sound quality correction processes on the input audio signal.
    Type: Application
    Filed: February 4, 2010
    Publication date: December 30, 2010
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventor: Hirokazu TAKEUCHI
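Score-driven control of the correctors can be sketched as below. The linear blend between two corrector strengths is an assumption about the control law; the patent controls plural correctors of different types:

```python
# Sketch of score-driven correction control: a speech/music score in [0, 1]
# (0 = speech-like, 1 = music-like) sets the strengths of two correctors so
# that, e.g., dialogue enhancement fades out as music confidence rises.

def corrector_strengths(speech_music_score):
    """Return (speech_corrector_strength, music_corrector_strength)."""
    score = min(max(speech_music_score, 0.0), 1.0)
    return (1.0 - score, score)

speech_like = corrector_strengths(0.1)   # mostly speech correction
music_like = corrector_strengths(0.9)    # mostly music correction
```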
  • Patent number: 7831421
    Abstract: Techniques and tools related to delayed or lost coded audio information are described. For example, a concealment technique for one or more missing frames is selected based on one or more factors that include a classification of each of one or more available frames near the one or more missing frames. As another example, information from a concealment signal is used to produce substitute information that is relied on in decoding a subsequent frame. As yet another example, a data structure having nodes corresponding to received packet delays is used to determine a desired decoder packet delay value.
    Type: Grant
    Filed: May 31, 2005
    Date of Patent: November 9, 2010
    Assignee: Microsoft Corporation
    Inventors: Hosam A. Khalil, Tian Wang, Kazuhito Koishida, Xiaoqin Sun, Wei-Ge Chen
  • Patent number: 7814284
    Abstract: A data redundancy elimination system.
    Type: Grant
    Filed: January 18, 2007
    Date of Patent: October 12, 2010
    Assignee: Cisco Technology, Inc.
    Inventors: Gideon Glass, Maxim Martynov, Qiwen Zhang, Etai Lev Ran, Dan Li
  • Publication number: 20100250257
    Abstract: This invention includes: a voice quality feature database (101) holding voice quality features; a speaker attribute database (106) holding, for each voice quality feature, an identifier enabling a user to expect a voice quality of the voice quality feature; a weight setting unit (103) setting a weight for each acoustic feature of a voice quality; a scaling unit (105) calculating display coordinates of each voice quality feature based on the acoustic features in the voice quality feature and the weights set by the weight setting unit (103); a display unit (107) displaying the identifier of each voice quality feature on the calculated display coordinates; a position input unit (108) receiving designated coordinates; and a voice quality mix unit (110) (i) calculating a distance between (1) the received designated coordinates and (2) the display coordinates of each of a part or all of the voice quality features, and (ii) mixing the acoustic features of the part or all of the voice quality features together based
    Type: Application
    Filed: June 4, 2008
    Publication date: September 30, 2010
    Inventors: Yoshifumi Hirose, Takahiro Kamai
  • Patent number: 7796626
    Abstract: For supporting a decoding of encoded frames, which belong to a sequence of frames received via a packet switched network, it is detected whether a particular encoded frame has been received after a scheduled decoding time for the particular encoded frame and before a scheduled decoding time for a next encoded frame. In case the particular encoded frame is detected to have been received after its scheduled decoding time and before the scheduled decoding time for the next encoded frame, the particular encoded frame is re-scheduled to be decoded at the scheduled decoding time for the next encoded frame.
    Type: Grant
    Filed: September 26, 2006
    Date of Patent: September 14, 2010
    Assignee: Nokia Corporation
    Inventors: Ari Lakaniemi, Pasi S. Ojala
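The rescheduling rule reduces to a three-way decision per frame: on time, late but still usable, or too late. The concrete times below are illustrative:

```python
# Sketch of the rescheduling rule: a frame arriving after its own decode
# deadline but before the next frame's deadline is decoded at that next
# deadline instead of being discarded.

def schedule_decode(arrival, deadline, next_deadline):
    """Return the time at which to decode the frame, or None to discard it."""
    if arrival <= deadline:
        return deadline                   # on time: decode as scheduled
    if arrival < next_deadline:
        return next_deadline              # late but usable: re-schedule
    return None                           # too late: conceal instead

on_time = schedule_decode(arrival=9.5, deadline=10.0, next_deadline=10.02)
late = schedule_decode(arrival=10.01, deadline=10.0, next_deadline=10.02)
lost = schedule_decode(arrival=10.5, deadline=10.0, next_deadline=10.02)
```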
  • Patent number: 7792674
    Abstract: A method and machine-readable medium for providing virtual spatial sound with an audio visual player are disclosed. Input audio is processed into output audio having spatial attributes associated with the spatial sound represented in a room display.
    Type: Grant
    Filed: March 30, 2007
    Date of Patent: September 7, 2010
    Assignee: Smith Micro Software, Inc.
    Inventors: Robert J. E. Dalton, Jr., Rupen Dolasia
  • Patent number: 7778837
    Abstract: Systems and methods that create a classification of sentences in a language, and further construct associated local versions of language models, based on geographical location and/or other demographic criteria, wherein such local language models can be of different levels of granularity according to the chosen demographic criteria. The subject innovation employs a classification encoder component that forms a classification (e.g. a tree structure) of sentences, and a local language models encoder component, which employs the classification of sentences in order to construct the localized language models. A decoder component can subsequently enable local word wheeling and/or local web search by blending k-best answers from local language models of varying demographic granularity that match users' demographics. Hence, k-best matches for input data by users in one demographic locality can be different from k-best matches for the same input by other users in another locality.
    Type: Grant
    Filed: November 30, 2006
    Date of Patent: August 17, 2010
    Assignee: Microsoft Corporation
    Inventors: Bo Thiesson, Kenneth W. Church
  • Patent number: 7778823
    Abstract: Some embodiments of the invention provide a method of processing audio data while creating a media presentation. The media presentation includes several audio streams. The method processes a section of a first audio stream and stores the processed section of the first audio stream. The method also processes a section of a second audio stream that overlaps with the processed section of the first audio stream. The method processes the section of the second audio stream independently of the first audio stream. In some embodiments, the method processes the first audio stream section by applying an effect to the first audio stream section. Also, in some embodiments, the processing of the first audio stream section entails performing a sample rate conversion on the first audio stream section.
    Type: Grant
    Filed: December 21, 2007
    Date of Patent: August 17, 2010
    Assignee: Apple Inc.
    Inventor: Kenneth M. Carson
  • Publication number: 20100195812
    Abstract: The claimed subject matter relates to an architecture that can preprocess audio portions of communications in order to enrich multiparty communication sessions or environments. In particular, the architecture can provide both a public channel for public communications that are received by substantially all connected parties and can further provide a private channel for private communications that are received by a selected subset of all connected parties. Most particularly, the architecture can apply an audio transform to communications that occur during the multiparty communication session based upon a target audience of the communication. By way of illustration, the architecture can apply a whisper transform to private communications, an emotion transform based upon relationships, an ambience or spatial transform based upon physical locations, or a pace transform based upon lack of presence.
    Type: Application
    Filed: February 5, 2009
    Publication date: August 5, 2010
    Applicant: MICROSOFT CORPORATION
    Inventors: Dinei A. Florencio, Alejandro Acero, William Buxton, Phillip A. Chou, Ross G. Cutler, Jason Garms, Christian Huitema, Kori M. Quinn, Daniel Allen Rosenfeld, Zhengyou Zhang
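    The architecture above selects an audio transform based on the target audience of a communication (e.g., a whisper transform for private messages). A minimal dispatch sketch under assumed names; the transform functions are placeholders, not the patented signal processing:

    ```python
    def whisper_transform(audio):
        # Placeholder for the private-channel whisper transform.
        return ("whisper", audio)

    def identity_transform(audio):
        # Public communications pass through unmodified here.
        return ("public", audio)

    def preprocess(audio, target_audience, all_parties):
        """Pick an audio transform based on who will receive the message."""
        if set(target_audience) < set(all_parties):  # proper subset => private
            return whisper_transform(audio)
        return identity_transform(audio)
    ```

    The same dispatch point could select the emotion, ambience/spatial, or pace transforms the abstract mentions, keyed on relationships, locations, or presence rather than audience size.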
  • Publication number: 20100198600
    Abstract: A voice conversion training system, voice conversion system, voice conversion client-server system, and program that realize voice conversion to be performed with low load of training are provided. In a server 10, an intermediate conversion function generation unit 101 generates an intermediate conversion function F, and a target conversion function generation unit 102 generates a target conversion function G. In a mobile terminal 20, an intermediate voice conversion unit 211 uses the conversion function F to generate speech of an intermediate speaker from speech of a source speaker, and a target voice conversion unit 212 uses the conversion function G to convert the intermediate speaker's speech generated by the intermediate voice conversion unit 211 into speech of a target speaker.
    Type: Application
    Filed: November 28, 2006
    Publication date: August 5, 2010
    Inventor: Tsuyoshi Masuda
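    The two-stage scheme above composes a source-to-intermediate function F with an intermediate-to-target function G on the client. A sketch of that composition, with simple linear feature maps standing in for the trained conversion functions (the real F and G would be learned models, e.g. GMM-based feature mappings; the scale/offset values here are arbitrary):

    ```python
    import numpy as np

    def make_linear_map(scale, offset):
        """Stand-in for a trained conversion function over speech features."""
        return lambda features: scale * features + offset

    F = make_linear_map(1.1, 0.2)   # source -> intermediate (server-trained)
    G = make_linear_map(0.9, -0.1)  # intermediate -> target (server-trained)

    def convert(source_features):
        """Client-side conversion: source -> intermediate -> target."""
        intermediate = F(source_features)  # intermediate voice conversion unit
        return G(intermediate)             # target voice conversion unit

    src = np.array([1.0, 2.0, 3.0])
    out = convert(src)
    ```

    The training saving comes from the factoring: each source speaker needs only an F to the shared intermediate speaker, and each target speaker only a G from it, rather than one function per source-target pair.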
  • Publication number: 20100161326
    Abstract: A speech recognition system includes: a speed level classifier for measuring a moving speed of a moving object by using a noise signal at an initial time of speech recognition to determine a speed level of the moving object; a first speech enhancement unit for enhancing sound quality of an input speech signal of the speech recognition by using a Wiener filter, if the speed level of the moving object is equal to or lower than a specific level; and a second speech enhancement unit enhancing the sound quality of the input speech signal by using a Gaussian mixture model, if the speed level of the moving object is higher than the specific level. The system further includes an end point detection unit for detecting start and end points, and an elimination unit for eliminating sudden noise components based on a sudden noise Gaussian mixture model.
    Type: Application
    Filed: July 21, 2009
    Publication date: June 24, 2010
    Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventors: Sung Joo Lee, Ho-Young Jung, Jeon Gue Park, Hoon Chung, Yunkeun Lee, Byung Ok Kang, Hyung-Bae Jeon, Jong Jin Kim, Ki-young Park, Euisok Chung, Ji Hyun Wang, Jeom Ja Kang
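    The system above gates between two enhancement paths on an estimated speed level. A minimal dispatch sketch; the speed classifier, threshold value, and both enhancement functions are crude placeholders for the patented components:

    ```python
    import numpy as np

    SPEED_THRESHOLD = 60.0  # hypothetical "specific level"

    def classify_speed(noise_frame):
        """Crude stand-in for the speed level classifier: maps the noise
        energy measured at the start of recognition to a speed estimate."""
        return float(np.mean(noise_frame ** 2)) * 1000.0

    def wiener_enhance(signal):
        return ("wiener", signal)  # placeholder for the Wiener-filter path

    def gmm_enhance(signal):
        return ("gmm", signal)     # placeholder for the GMM-based path

    def enhance(signal, noise_frame):
        speed = classify_speed(noise_frame)
        if speed <= SPEED_THRESHOLD:
            return wiener_enhance(signal)  # at or below the specific level
        return gmm_enhance(signal)         # above the specific level
    ```

    The design point is that the routing decision uses only the initial noise frame, so the enhancement path is fixed before the speech itself is processed.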
  • Patent number: 7743016
    Abstract: An apparatus for processing a signal and method thereof are disclosed. Data coding and entropy coding are performed with interconnection, and grouping is used to enhance coding efficiency. The present invention includes the steps of obtaining index information, entropy-decoding the index information, identifying a content corresponding to the entropy-decoded index information, and selecting an entropy table.
    Type: Grant
    Filed: October 4, 2006
    Date of Patent: June 22, 2010
    Assignee: LG Electronics Inc.
    Inventors: Hee Suk Pang, Hyen O Oh, Dong Soo Kim, Jae Hyun Lim, Yang-Won Jung, Hyo Jin Kim
  • Patent number: 7739112
    Abstract: A signal connecting method and apparatus is provided which can reduce noises and create natural synthesized voices. The signal connecting method (or apparatus) for connecting a plurality of waveform signals and creating a synthesized waveform signal, has: a step (or unit) for determining an upper limit frequency of a frequency spectrum of each of the plurality of waveform signals; and a step (or unit) for filtering at least a connection portion of each waveform signal by using predetermined filter characteristics having the determined upper limit frequency. The cut-off frequency of the filtering is the higher upper limit frequency in upper limit frequencies of spectra of adjacent two waveform signals before and after the connection portion of the waveform signals. Higher harmonics to be caused by discontinuity of the connection portion of waveform signals can be effectively removed and noises of synthesized waveform signals can be reduced considerably.
    Type: Grant
    Filed: June 27, 2002
    Date of Patent: June 15, 2010
    Assignees: Kabushiki Kaisha Kenwood, Advanced Telecommunications Research Institute International
    Inventors: Yasushi Sato, Davin Patrick
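    The connecting method above determines each waveform's upper limit frequency and filters with a cut-off equal to the higher of the two adjacent upper limits. A sketch using an ideal FFT low-pass as a stand-in for the patent's filter, and filtering whole segments rather than only the connection portion (both simplifications; the `threshold` heuristic for the upper limit is also an assumption):

    ```python
    import numpy as np

    def upper_limit_frequency(segment, rate, threshold=1e-3):
        """Highest frequency whose magnitude exceeds a small relative threshold."""
        spectrum = np.abs(np.fft.rfft(segment))
        freqs = np.fft.rfftfreq(len(segment), d=1.0 / rate)
        significant = freqs[spectrum > threshold * spectrum.max()]
        return significant.max() if significant.size else 0.0

    def lowpass(segment, rate, cutoff):
        """Ideal FFT low-pass: zero all components above the cut-off."""
        spectrum = np.fft.rfft(segment)
        freqs = np.fft.rfftfreq(len(segment), d=1.0 / rate)
        spectrum[freqs > cutoff] = 0.0
        return np.fft.irfft(spectrum, n=len(segment))

    def connect(seg_a, seg_b, rate):
        # Cut-off = the higher upper limit of the two adjacent segments.
        cutoff = max(upper_limit_frequency(seg_a, rate),
                     upper_limit_frequency(seg_b, rate))
        return np.concatenate([lowpass(seg_a, rate, cutoff),
                               lowpass(seg_b, rate, cutoff)])
    ```

    Choosing the higher of the two upper limits preserves all content present in either segment while still removing the higher harmonics introduced by the discontinuity at the join.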
  • Patent number: 7734472
    Abstract: The invention concerns a speech recognition enhancer (51) and a speech recognition system comprising such speech recognition enhancer (51), an audio input unit (41) and a speech recognizer (61, 3). The speech recognition enhancer (51) is arranged between the audio input unit (41) and the speech recognizer (61, 3). The speech recognition enhancer (51) has a parametrizable pre-filtering unit (511), a parametrizable dynamic voice level control unit (512), a parametrizable noise reduction unit (513) and a parametrizable voice level control unit (514). The parameters of these parametrizable units (511, 512, 513, 514) are adjusted to the characteristics of the specific audio input unit (41) and/or the characteristics of the specific speech recognizer (61, 3) for adapting the audio input unit (41) to the speech recognizer (61, 3).
    Type: Grant
    Filed: September 29, 2004
    Date of Patent: June 8, 2010
    Assignee: Alcatel
    Inventor: Michael Walker
  • Publication number: 20100137030
    Abstract: Disclosed is a technique for presenting audible items to a user in a manner that allows the user to easily distinguish them and to select from among them. A number of audible items are rendered simultaneously to the user. To prevent the sounds from blending together into a sonic mishmash, some of the items are “conditioned” while they are being rendered. For example, one audible item might be rendered more quietly than another, or one item can be moved up in register compared with another. Some embodiments combine audible conditioning with visual avatars portrayed on, for example, a display screen of a user device. During the rendering, each audible item is paired with an avatar, the pairing based on some suitable criterion, such as a type of conditioning applied to the audible item. Audible spatial placement is mimicked by visual placement of the avatars on the user's display screen.
    Type: Application
    Filed: December 2, 2008
    Publication date: June 3, 2010
    Applicant: MOTOROLA, INC.
    Inventor: Changxue Ma
  • Patent number: 7698139
    Abstract: In a method and apparatus for a differentiated voice output, systems existing in a vehicle, such as the on-board computer, the navigation system, and others, can be connected with a voice output device. The voice outputs of different systems can be differentiated by way of voice characteristics.
    Type: Grant
    Filed: June 20, 2003
    Date of Patent: April 13, 2010
    Assignee: Bayerische Motoren Werke Aktiengesellschaft
    Inventors: Georg Obert, Klaus-Josef Bengler
  • Patent number: 7698138
    Abstract: A broadcast receiving system includes a broadcast receiving part for receiving a broadcast in which additional information that corresponds to an object appearing in broadcast contents and that contains keyword information for specifying the object is broadcasted simultaneously with the broadcast contents; a recognition vocabulary generating section for generating a recognition vocabulary set in a manner corresponding to the additional information by using a synonym dictionary; a speech recognition section for performing the speech recognition of a voice uttered by a viewing person, and for thereby specifying keyword information corresponding to a recognition vocabulary set when a word recognized as the speech recognition result is contained in the recognition vocabulary set; and a displaying section for displaying additional information corresponding to the specified keyword information.
    Type: Grant
    Filed: December 26, 2003
    Date of Patent: April 13, 2010
    Assignee: Panasonic Corporation
    Inventors: Yumiko Kato, Takahiro Kamai, Hideyuki Yoshida, Yoshifumi Hirose
  • Patent number: 7693717
    Abstract: An apparatus comprising a session file, session file editor, annotation window, concatenation software and training software. The session file includes one or more audio files and text associated with each audio file segment. The session file editor displays text and provides text selection capability and plays back audio. The annotation window, operably associated with the session file editor, supports user modification of the selected text; the annotation window saves modified text corresponding to the selected text from the session file editor and audio associated with the modified text. The concatenation software concatenates modified text and audio associated therewith for two or more instances of the selected text. The training software trains a speech user profile using a concatenated file formed by the concatenating software.
    Type: Grant
    Filed: April 12, 2006
    Date of Patent: April 6, 2010
    Assignee: Custom Speech USA, Inc.
    Inventors: Jonathan Kahn, Michael C. Huttinger
  • Publication number: 20100076771
    Abstract: A voice signal processing apparatus and method includes determining maximum amplitude values of a plurality of different voice frame signals obtained by applying different amounts of phase shift to frequency components of voice frame signals of a predetermined length divided from a digital voice signal, and selecting the voice frame signal whose maximum amplitude value is the minimum among the maximum amplitude values of the plurality of different voice frame signals.
    Type: Application
    Filed: September 16, 2009
    Publication date: March 25, 2010
    Applicant: Fujitsu Limited
    Inventor: Fumio AMANO
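    The method above generates phase-shifted variants of a frame and keeps the one with the smallest peak amplitude (a peak-reduction technique). A minimal sketch; the uniform phase shift applied to every component and the particular set of candidate shifts are illustrative assumptions, not the patent's specific scheme:

    ```python
    import numpy as np

    def phase_shift(frame, shift_radians):
        """Apply a uniform phase shift to the frame's frequency components.
        (irfft enforces a real result, so this is an approximation.)"""
        spectrum = np.fft.rfft(frame)
        spectrum[1:] *= np.exp(1j * shift_radians)  # leave DC untouched
        return np.fft.irfft(spectrum, n=len(frame))

    def min_peak_variant(frame, shifts=(0.0, np.pi / 2, np.pi, 3 * np.pi / 2)):
        """Return the shifted variant whose peak amplitude is smallest."""
        variants = [phase_shift(frame, s) for s in shifts]
        peaks = [np.max(np.abs(v)) for v in variants]
        return variants[int(np.argmin(peaks))]
    ```

    Because a zero shift is among the candidates, the selected variant's peak can never exceed the original frame's, so the transformation only reduces (or preserves) the maximum amplitude.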
  • Patent number: RE41939
    Abstract: An audio/video reproducing apparatus is connectable to a communications network for selectively reproducing items of audio/video material from a recording medium in response to a request received via the communications network. The audio/video reproducing apparatus may comprise a control processor operable in use to receive data representing the request for the audio/video material item via the communications network. A reproducing processor is operable in response to signals identifying the audio/video material items from the control processor to reproduce the audio/video material items. The data identifying the audio/video material items includes meta data indicative of the audio/video material items. The meta data may be one of a UMID (a Unique Material Identifier of the material items), a tape ID, and time codes.
    Type: Grant
    Filed: August 3, 2006
    Date of Patent: November 16, 2010
    Assignee: Sony United Kingdom Limited
    Inventors: Vincent Carl Harradine, Alan Turner, Morgan William David, Michael Williams, Mark John McGrath, Andrew Kydd, Jonathan Thorpe
  • Patent number: RE42647
    Abstract: The present invention provides a text-to-speech conversion system (TTS) for synchronizing with multimedia and a method for organizing input data of the TTS which can enhance the naturalness of synthesized speech and accomplish the synchronization of multimedia with TTS by defining additional prosody information, the information required to synchronize TTS with multimedia, and an interface between this information and TTS for use in the production of the synthesized speech.
    Type: Grant
    Filed: September 30, 2002
    Date of Patent: August 23, 2011
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Jung Chul Lee, Min Soo Hahn, Hang Seop Lee, Jae Woo Yang, Youngjik Lee