For Storage Or Transmission Patents (Class 704/201)

Neural network (Class 704/202)

Transformation (Class 704/203)

Orthogonal functions (Class 704/204)

Frequency (Class 704/205)

Specialized information (Class 704/206)

Time (Class 704/211)

Linear prediction (Class 704/219)

Analysis by synthesis (Class 704/220)

Pattern matching vocoders (Class 704/221)

Normalizing (Class 704/224)

Gain control (Class 704/225)

Noise (Class 704/226)

Adaptive bit allocation (Class 704/229)

Quantization (Class 704/230)

Speech Signal Enhancement Using Visual Information

Publication number: 20140337016

Abstract: Visual information is used to alter or set an operating parameter of an audio signal processor, other than a beamformer. A digital camera captures visual information about a scene that includes a human speaker and/or a listener. The visual information is analyzed to ascertain information about acoustics of a room. A distance between the speaker and a microphone may be estimated, and this distance estimate may be used to adjust an overall gain of the system. Distances among, and locations of, the speaker, the listener, the microphone, a loudspeaker and/or a sound-reflecting surface may be estimated. These estimates may be used to estimate reverberations within the room and adjust aggressiveness of an anti-reverberation filter, based on an estimated ratio of direct to indirect (reverberated) sound energy expected to reach the microphone.

Type: Application

Filed: October 17, 2011

Publication date: November 13, 2014

Applicant: NUANCE COMMUNICATIONS, INC.

Inventors: Tobias Herbig, Tobias Wolff, Markus Buck

Signal processing based on audio context

Patent number: 8886524

Abstract: Described herein are systems, methods, and apparatus for determining audio context between an audio source and an audio sink and selecting signal profiles based at least in part on that audio context. The signal profiles may include noise cancellation which is configured to facilitate operation within the audio context. Audio context may include user-to-user and user-to-device communications.

Type: Grant

Filed: May 1, 2012

Date of Patent: November 11, 2014

Assignee: Amazon Technologies, Inc.

Inventors: Yuzo Watanabe, Stephen Polansky, Matthew P. Bell

Systems, devices and methods for list display and management

Patent number: 8880397

Abstract: Exemplary embodiments provide systems, devices and methods that allow creation and management of lists of items in an integrated manner on an interactive graphical user interface. A user may speak a plurality of list items in a natural unbroken manner to provide an audio input stream into an audio input device. Exemplary embodiments may automatically process the audio input stream to convert the stream into a text output, and may process the text output into one or more n-grams that may be used as list items to populate a list on a user interface.

Type: Grant

Filed: October 21, 2011

Date of Patent: November 4, 2014

Assignee: Wal-Mart Stores, Inc.

Inventors: Dion Almaer, Bernard Paul Cousineau, Ben Galbraith
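
As a rough illustration of the n-gram step described in the abstract above, the sketch below splits already-transcribed text into word n-grams. The function name word_ngrams and the whitespace tokenization are assumptions made for illustration; the abstract does not say how the n-grams are formed or how list items are chosen from them.

```python
def word_ngrams(text, n):
    """Return all contiguous word n-grams of `text` (hypothetical helper).

    Tokenization is plain whitespace splitting, which is an assumption;
    a real system would work on the speech recognizer's text output.
    """
    words = text.lower().split()
    # Slide a window of n words across the token list.
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]
```

Calling word_ngrams("milk eggs whole wheat bread", 2) yields candidate two-word items such as "whole wheat" and "wheat bread", from which a list-building step could then select.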

Critical sampling encoding with a predictive encoder

Patent number: 8880411

Abstract: A method for encoding and decoding a digital audio signal is provided, said method comprising the steps of: encoding a first sequence of samples of the digital signal according to a transform encoding; encoding a second sequence of samples of the digital signal according to a predictive encoding; wherein the second sequence starts before the end of the first sequence, a subsequence common to the first and second sequences being thus encoded both by predictive encoding and by transform encoding.

Type: Grant

Filed: October 5, 2009

Date of Patent: November 4, 2014

Assignee: Orange

Inventors: Pierrick Philippe, David Virette

METHOD OF AND APPARATUS FOR EVALUATING INTELLIGIBILITY OF A DEGRADED SPEECH SIGNAL

Publication number: 20140316773

Abstract: The present invention relates to a method of evaluating intelligibility of a degraded speech signal received from an audio transmission system conveying a reference signal. The method comprises sampling said reference and degraded signal into frames, and forming frame pairs. For each pair one or more difference functions representing a difference between the degraded and reference signal are provided. A difference function is selected and compensated for different disturbance types, such as to provide a disturbance density function adapted to human auditory perception. An overall quality parameter is determined indicative of the intelligibility of the degraded signal. The method comprises determining a switching parameter indicative of audio power level of said degraded signal, for performing said selecting.

Type: Application

Filed: November 15, 2012

Publication date: October 23, 2014

Applicant: Nederlandse Organisatie voor toegepast-natuurwetenschappelijk onderzoek TNO

Inventor: John Gerard Beerends

Verification of Extracted Data

Publication number: 20140316772

Abstract: Facts are extracted from speech and recorded in a document using codings. Each coding represents an extracted fact and includes a code and a datum. The code may represent a type of the extracted fact and the datum may represent a value of the extracted fact. The datum in a coding is rendered based on a specified feature of the coding. For example, the datum may be rendered as boldface text to indicate that the coding has been designated as an “allergy.” In this way, the specified feature of the coding (e.g., “allergy”-ness) is used to modify the manner in which the datum is rendered. A user inspects the rendering and provides, based on the rendering, an indication of whether the coding was accurately designated as having the specified feature. A record of the user's indication may be stored, such as within the coding itself.

Type: Application

Filed: June 27, 2014

Publication date: October 23, 2014

Inventors: Detlef Koll, Michael Finke

SYSTEMS AND METHODS FOR SOURCE SIGNAL SEPARATION

Publication number: 20140316771

Abstract: A method includes receiving an input signal comprising an original domain signal and creating a first window data set and a second window data set from the signal, wherein an initiation of the second window data set is offset from an initiation of the first window data set, converting the first window data set and the second window data set to a frequency domain and storing the resulting data as data in a second domain different from the original domain, performing complex spectral phase evolution (CSPE) on the second domain data to estimate component frequencies of the first and second window data sets, using the component frequencies estimated in the CSPE, sampling a set of second-domain high resolution windows to select a mathematical representation comprising a second-domain high resolution window that fits at least one of the amplitude, phase, amplitude modulation and frequency modulation of a component of an underlying signal wherein the component comprises at least one oscillator peak, generating an ou

Type: Application

Filed: March 13, 2014

Publication date: October 23, 2014

Applicant: Kaonyx Labs LLC

Inventors: Kevin M. Short, Brian T. Hone

System and method for updating information in electronic calendars

Patent number: 8868427

Abstract: Systems and methods for updating electronic calendar information. Speech is received from a user at a vehicle telematics unit (VTU), wherein the speech is representative of information related to a particular vehicle trip. The received speech is recorded in the VTU as a voice memo, and data associated with the voice memo is communicated from the VTU to a computer running a calendaring application. The data is associated with a field of the calendaring application, and stored in association with the calendaring application field.

Type: Grant

Filed: June 10, 2010

Date of Patent: October 21, 2014

Assignee: General Motors LLC

Inventor: Jeffrey P. Rysenga

Parallel processing of data sets

Patent number: 8868470

Abstract: Systems, methods, and devices are described for implementing learning algorithms on data sets. A data set may be partitioned into a plurality of data partitions that may be distributed to two or more processors, such as a graphics processing unit. The data partitions may be processed in parallel by each of the processors to determine local counts associated with the data partitions. The local counts may then be aggregated to form a global count that reflects the local counts for the data set. The partitioning may be performed by a data partition algorithm and the processing and the aggregating may be performed by a parallel collapsed Gibbs sampling (CGS) algorithm and/or a parallel collapsed variational Bayesian (CVB) algorithm. In addition, the CGS and/or the CVB algorithms may be associated with the data partition algorithm and may be parallelized to train a latent Dirichlet allocation model.

Type: Grant

Filed: November 9, 2010

Date of Patent: October 21, 2014

Assignee: Microsoft Corporation

Inventors: Ning-Yi Xu, Feng-Hsiung Hsu, Feng Yan
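
The partition, local-count, and aggregate pattern in the abstract above can be sketched as follows. This is only a minimal illustration of that pattern: it does not implement collapsed Gibbs sampling, collapsed variational Bayes, or LDA training, and the round-robin partitioning, the thread pool standing in for multiple processors, and all function names are assumptions.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(data, num_parts):
    """Split `data` into `num_parts` slices (round-robin; an assumed scheme)."""
    return [data[i::num_parts] for i in range(num_parts)]


def local_count(part):
    """Count item occurrences within one data partition."""
    return Counter(part)


def global_count(data, num_parts=2):
    """Count each partition in parallel, then merge the local counts."""
    parts = partition(data, num_parts)
    # Threads stand in for the separate processors named in the abstract.
    with ThreadPoolExecutor(max_workers=num_parts) as pool:
        local_counts = list(pool.map(local_count, parts))
    total = Counter()
    for counts in local_counts:
        total.update(counts)  # aggregate local counts into a global count
    return total
```

Because counting is additive, the merged result is the same regardless of how the data set is partitioned, which is what makes the local step safe to run in parallel.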

Audio signal bandwidth extension in CELP-based speech coder

Patent number: 8868432

Abstract: A method for decoding an audio signal having a bandwidth that extends beyond a bandwidth of a CELP excitation signal in an audio decoder including a CELP-based decoder element. The method includes obtaining a second excitation signal having an audio bandwidth extending beyond the audio bandwidth of the CELP excitation signal, obtaining a set of signals by filtering the second excitation signal with a set of bandpass filters, scaling the set of signals using a set of energy-based parameters, and obtaining a composite output signal by combining the scaled set of signals with a signal based on the audio signal decoded by the CELP-based decoder element.

Type: Grant

Filed: September 28, 2011

Date of Patent: October 21, 2014

Assignee: Motorola Mobility LLC

Inventors: Jonathan A. Gibbs, James P. Ashley, Udar Mittal

METHODS AND APPARATUS FOR MASKING SPEECH IN A PRIVATE ENVIRONMENT

Publication number: 20140309991

Abstract: A speech masking apparatus includes a microphone and a speaker. The microphone can detect a human voice. The speaker can output a masking language which can include phonemes resembling human speech. At least one component of the masking language can have a pitch, a volume, a theme, and/or a phonetic content substantially matching a pitch, a volume, a theme, and/or a phonetic content of the voice.

Type: Application

Filed: March 10, 2014

Publication date: October 16, 2014

Applicant: Medical Privacy Solutions, LLC

Inventors: Babak ARVANAGHI, Joel FECHTER

Determining pitch cycle energy and scaling an excitation signal

Patent number: 8862465

Abstract: An electronic device for determining a set of pitch cycle energy parameters is described. The electronic device includes a processor and executable instructions stored in memory. The electronic device obtains a frame, a set of filter coefficients and a residual signal based on the frame and the set of filter coefficients. The electronic device determines a set of peak locations based on the residual signal and segments the residual signal such that each segment includes one peak. The electronic device determines a first set of pitch cycle energy parameters based on a frame region between two consecutive peak locations and maps regions between peaks in the residual signal to regions between peaks in a synthesized excitation signal to produce a mapping. The electronic device determines a second set of pitch cycle energy parameters based on the first set of pitch cycle energy parameters and the mapping.

Type: Grant

Filed: September 8, 2011

Date of Patent: October 14, 2014

Assignee: QUALCOMM Incorporated

Inventors: Venkatesh Krishnan, Stephane Pierre Villette

METHOD FOR ENCODING VOICE SIGNAL, METHOD FOR DECODING VOICE SIGNAL, AND APPARATUS USING SAME

Publication number: 20140303965

Abstract: The present invention relates to a method for encoding a voice signal, a method for decoding a voice signal, and an apparatus using the same. The method for encoding the voice signal according to the present invention, includes the steps of: determining an eco-zone in a present frame; allocating bits for the present frame on the basis of the location of the eco-zone; and encoding the present frame using the allocated bits, wherein the step of allocating the bits allocates more bits in the section in which the eco-zone is located than in the section in which the eco-zone is not located.

Type: Application

Filed: October 29, 2012

Publication date: October 9, 2014

Inventors: Younghan Lee, Gyuhyeok Jeong, Ingyu Kang, Hyejeong Jeon, Lagyoung Kim

COMMUNICATION SYSTEM AND TERMINAL DEVICE

Publication number: 20140303966

Abstract: A communication system according to the present invention includes a plurality of terminal devices that are able to communicate mutually. Each of the terminal devices includes a voice input conversion device, a voice transmitting device, a voice receiving device, and a voice reproducing device. When there is a plurality of voice signals whose reproduction has not been completed, the voice reproducing device arranges the voice signals so that the respective voices do not overlap and then reproduces them.

Type: Application

Filed: December 14, 2012

Publication date: October 9, 2014

Inventors: Tsutomu Adachi, Tomoyoshi Yokoi, Shigeru Hayashi, Takezumi Kondo, Tatsumi Kuroda, Daisuke Mouri, Takeo Nozawa, Kenshi Takenaka, Hiroshi Maekawa, Tsuyoshi Kawanishi
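
One way to picture the non-overlapping reproduction described in the abstract above is a simple playback scheduler. The sketch below assumes first-come, first-served ordering by arrival time and represents each voice signal as an (arrival, duration) pair in seconds; both choices, and the name schedule_playback, are illustrative assumptions rather than details taken from the application.

```python
def schedule_playback(signals):
    """Assign non-overlapping playback slots to pending voice signals.

    `signals` is a list of (arrival_time, duration) tuples in seconds.
    Returns a list of (start, end) times in arrival order. The ordering
    policy is an assumption; the abstract only requires no overlap.
    """
    schedule = []
    free_at = 0.0  # time at which the reproducing device becomes idle
    for arrival, duration in sorted(signals):
        start = max(arrival, free_at)  # wait for the previous voice to finish
        end = start + duration
        schedule.append((start, end))
        free_at = end
    return schedule
```

For example, two messages arriving at 0 s and 1 s, lasting 3 s and 2 s, would be played back to back rather than mixed together.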

Loudness maximization with constrained loudspeaker excursion

Patent number: 8855322

Abstract: An original loudness level of an audio signal is maintained for a mobile device while maintaining sound quality as good as possible and protecting the loudspeaker used in the mobile device. The loudness of an audio (e.g., speech) signal may be maximized while controlling the excursion of the diaphragm of the loudspeaker (in a mobile device) to stay within the allowed range. In an implementation, the peak excursion is predicted (e.g., estimated) using the input signal and an excursion transfer function. The signal may then be modified to limit the excursion and to maximize loudness.

Type: Grant

Filed: August 9, 2011

Date of Patent: October 7, 2014

Assignee: QUALCOMM Incorporated

Inventors: Sang-Uk Ryu, Jongwon Shin, Roy Silverstein, Andre Gustavo P. Schevciw, Pei Xiang
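
The excursion prediction in the abstract above can be illustrated with a small sketch, assuming the excursion transfer function is available as a finite impulse response and the signal is modified by a single scalar gain. Both are simplifying assumptions: the abstract does not state how the transfer function is represented or how the signal is modified, and the function names are hypothetical.

```python
def predict_excursion(signal, excursion_ir):
    """Predict diaphragm excursion by convolving the input signal with an
    assumed excursion impulse response (direct-form convolution)."""
    if not signal or not excursion_ir:
        return []
    out = [0.0] * (len(signal) + len(excursion_ir) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(excursion_ir):
            out[i + j] += s * h
    return out


def limit_gain(signal, excursion_ir, max_excursion):
    """Return a scalar gain (at most 1.0) that keeps the predicted peak
    excursion within `max_excursion`."""
    excursion = predict_excursion(signal, excursion_ir)
    if not excursion:
        return 1.0
    peak = max(abs(x) for x in excursion)
    if peak <= max_excursion:
        return 1.0  # already within the allowed range; leave level unchanged
    return max_excursion / peak
```

A scalar gain is the bluntest possible limiter; it protects the loudspeaker but does nothing to maximize loudness, which the patent treats as a separate goal.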

Method, medium, and apparatus encoding and/or decoding multichannel audio signals

Patent number: 8849678

Abstract: A method, medium, and apparatus encoding and/or decoding a multichannel audio signal. The method includes detecting the type of spatial extension data included in an encoding result of an audio signal, if the spatial extension data is data indicating a core audio object type related to a technique of encoding core audio data, detecting the core audio object type; decoding core audio data by using a decoding technique according to the detected core audio object type, if the spatial extension data is residual coding data, decoding the residual coding data by using the decoding technique according to the core audio object type, and up-mixing the decoded core audio data by using the decoded residual coding data. According to the method, the core audio data and residual coding data may be decoded by using an identical decoding technique, thereby reducing complexity at the decoding end.

Type: Grant

Filed: October 28, 2013

Date of Patent: September 30, 2014

Assignee: Samsung Electronics Co., Ltd.

Inventors: Jung-hoe Kim, Eun-mi Oh

Multi-channel synthesizer and method for generating a multi-channel output signal

Patent number: 8843378

Abstract: A multi-channel synthesizer includes a post processor for determining post processed reconstruction parameters or quantities derived from the reconstruction parameter for an actual time portion of the input signal so that the post processed reconstruction parameter or the post processed quantity is different from the corresponding quantized and inversely quantized reconstruction parameter in that the value of the post processed reconstruction parameter or the derived quantity is not bound by the quantization step size. A multi-channel reconstructor uses the post-processed reconstruction parameter for reconstructing the multi-channel output signal.

Type: Grant

Filed: June 30, 2004

Date of Patent: September 23, 2014

Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung E.V.

Inventors: Juergen Herre, Sascha Disch, Johannes Hilpert, Christian Ertel, Andreas Hoelzer, Claus-Christian Spenger

Method and apparatus of visual feedback for latency in communication media

Patent number: 8843365

Abstract: A method and apparatus are provided for visualizing the latency in a conversation between a local speaker and at least one remote speaker separated from the local speaker by a communication medium. A latency estimate is obtained. A timing indication of at least the end of a conversational turn by the local speaker is obtained, and an outbound graphic is displayed, indicating the progress of at least the end-of-turn across the communication medium toward the remote speaker. The outbound graphical indication is displayed with a transit time across the medium that is derived from the latency estimate. An inbound graphic is displayed, indicating the progress across the communication medium toward the local speaker, of a start of a conversational turn by the remote speaker, which is imputed to begin when the remote speaker receives the local speaker's end-of-turn. The inbound graphical indication is displayed with a transit time across the medium that is derived from the latency estimate.

Type: Grant

Filed: April 12, 2011

Date of Patent: September 23, 2014

Assignee: Alcatel Lucent

Inventor: James W. McGowan

System and method for providing internet based phone conferences using multiple codecs

Patent number: 8842580

Abstract: A method of communicating digitized speech from a transmitting forum participant comprises the step of receiving a data structure that includes said digitized speech. The data structure is analyzed to determine whether the digitized speech is redundantly represented in a plurality of forms in the data structure. A portion of the data structure is forwarded to a receiving forum participant, thereby communicating the digitized speech from the transmitting forum participant. In this method, when the digitized speech is redundantly represented in the data structure in a plurality of forms, the forwarding step includes a step of selecting one or more forms, based on a function, from the plurality of forms in the data structure. Furthermore, the portion of the data structure that is forwarded to the receiving forum participant includes data in the data structure that corresponds to each of the selected one or more forms.

Type: Grant

Filed: December 28, 2011

Date of Patent: September 23, 2014

Assignee: Entropy Processing NV LLC

Inventors: Kyle Granger, Edward A. Lerner, James E. G. Morris, Jonathan B. Blossom, Martin Hung

Methods and Arrangements in a Telecommunications Network

Publication number: 20140249808

Abstract: The present invention relates to a postfilter and a postfilter control to be associated with a postfilter for improving perceived quality of speech reconstructed at a speech decoder. The postfilter control comprises means for measuring stationarity of a speech signal reconstructed at a decoder, means for determining a coefficient to a postfilter control parameter based on the measured stationarity, and means for transmitting the determined coefficient to a postfilter, such that the postfilter can process the reconstructed speech signal by applying the determined coefficient to the postfilter control parameter to obtain an enhanced speech signal.

Type: Application

Filed: May 15, 2014

Publication date: September 4, 2014

Applicant: TELEFONAKTIEBOLAGET L M ERICSSON (PUBL)

Inventor: Volodya Grancharov

Systems, methods, and apparatus for frame erasure recovery

Patent number: 8825477

Abstract: In one configuration, erasure of a significant frame of a sustained voiced segment is detected. An adaptive codebook gain value for the erased frame is calculated based on the preceding frame. If the calculated value is less than (alternatively, not greater than) a threshold value, a higher adaptive codebook gain value is used for the erased frame. The higher value may be derived from the calculated value or selected from among one or more predefined values.

Type: Grant

Filed: December 13, 2010

Date of Patent: September 2, 2014

Assignee: Qualcomm Incorporated

Inventors: Venkatesh Krishnan, Ananthapadmanabhan Arasanipatai Kandhadai
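
The gain-selection rule in the abstract above reduces to a threshold comparison, sketched below. The sketch assumes erasure detection has already happened and that a gain value has already been calculated from the preceding frame; the threshold of 0.4 and the predefined replacement gain of 0.8 are made-up illustrative values, not figures from the patent.

```python
def erased_frame_gain(calculated_gain, threshold=0.4, predefined_gain=0.8):
    """Choose the adaptive codebook gain to use for an erased frame.

    `calculated_gain` is the value derived from the preceding frame.
    If it falls below `threshold`, a higher predefined value is used
    instead. All numeric defaults are illustrative assumptions.
    """
    if calculated_gain < threshold:
        return predefined_gain  # substitute a higher gain
    return calculated_gain
```

The abstract also allows the higher value to be derived from the calculated value rather than chosen from predefined values; that variant is not shown here.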

METHOD AND APPARATUS FOR COMMUNICATING MESSAGES AMONGST A NODE, DEVICE AND A USER OF A DEVICE

Publication number: 20140236586

Abstract: A method and apparatus that modifies static media, such as music files being played to a user of the device, upon the generation or receipt of an alert, notification or message, so that information in the alert, notification or message can be incorporated into the media files then communicated to the user. In a further embodiment, a user's response to the communicated information can be sensed using one or more sensors and transducers so as to provide feedback to the device, and then optionally to a node in a system.

Type: Application

Filed: February 18, 2013

Publication date: August 21, 2014

Applicant: TELEFONAKTIEBOLAGET L M ERICSSON (PUBL)

Inventor: TELEFONAKTIEBOLAGET L M ERICSSON (PUBL)

Method and system for asynchronous pipeline architecture for multiple independent dual/stereo channel PCM processing

Patent number: 8805678

Abstract: Aspects of a method and system for an asynchronous pipeline architecture for multiple independent dual/stereo channel PCM processing are provided. Asynchronous pipeline processing of audio information comprised within a decoded PCM frame may be based on metadata information generated from the decoded PCM frame and an output decoding rate. The asynchronous pipeline processing may comprise mixing a primary audio information portion and a secondary audio information portion, sample rate converting the audio information, and buffering the audio information. The asynchronous pipeline processing may comprise multiple pipeline stages. Feeding back an output of one of the pipeline stages to an input of a previous one of the pipeline stages may be enabled. The metadata information may comprise a frame start indicator associated with the decoded PCM frame and/or a plurality of mixing coefficients.

Type: Grant

Filed: November 9, 2006

Date of Patent: August 12, 2014

Assignee: Broadcom Corporation

Inventor: David Wu

Quality improvement techniques in an audio encoder

Patent number: 8805696

Abstract: An audio encoder implements multi-channel coding decision, band truncation, multi-channel rematrixing, and header reduction techniques to improve quality and coding efficiency. In the multi-channel coding decision technique, the audio encoder dynamically selects between joint and independent coding of a multi-channel audio signal via an open-loop decision based upon (a) energy separation between the coding channels, and (b) the disparity between excitation patterns of the separate input channels. In the band truncation technique, the audio encoder performs open-loop band truncation at a cut-off frequency based on a target perceptual quality measure. In multi-channel rematrixing technique, the audio encoder suppresses certain coefficients of a difference channel by scaling according to a scale factor, which is based on current average levels of perceptual quality, current rate control buffer fullness, coding mode, and the amount of channel separation in the source.

Type: Grant

Filed: October 7, 2013

Date of Patent: August 12, 2014

Assignee: Microsoft Corporation

Inventors: Wei-Ge Chen, Naveen Thumpudi, Ming-Chieh Lee

Low bitrate audio encoding/decoding scheme with common preprocessing

Patent number: 8804970

Abstract: An audio encoder has a common preprocessing stage, an information sink based encoding branch such as a spectral domain encoding branch, an information source based encoding branch such as an LPC-domain encoding branch, and a switch for switching between these branches at inputs into these branches or outputs of these branches, controlled by a decision stage. An audio decoder has a spectral domain decoding branch, an LPC-domain decoding branch, one or more switches for switching between the branches and a common post-processing stage for post-processing a time-domain audio signal for obtaining a post-processed audio signal.

Type: Grant

Filed: January 11, 2011

Date of Patent: August 12, 2014

Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung E.V.

Inventors: Bernhard Grill, Stefan Bayer, Guillaume Fuchs, Stefan Geyersberger, Ralf Geiger, Johannes Hilpert, Ulrich Kraemer, Jeremie Lecomte, Markus Multrus, Max Neuendorf, Harald Popp, Nikolaus Rettelbach, Frederik Nagel, Sascha Disch, Juergen Herre, Yoshikazu Yokotani, Stefan Wabnik, Gerald Schuller, Jens Hirschfeld

Method and apparatus for encoding and decoding audio signal using layered sinusoidal pulse coding

Patent number: 8805680

Abstract: Provided are a method and an apparatus for encoding and decoding an audio signal. A method for encoding an audio signal includes receiving a transformed audio signal, dividing the transformed audio signal into a plurality of subbands, performing a first sinusoidal pulse coding operation on the subbands, determining a performance region of a second sinusoidal pulse coding operation among the subbands on the basis of coding information of the first sinusoidal pulse coding operation, and performing the second sinusoidal pulse coding operation on the determined performance region, wherein the first sinusoidal pulse coding operation is performed variably according to the coding information. Accordingly, it is possible to further improve the quality of a synthesized signal by considering the sinusoidal pulse coding of a lower layer when encoding or decoding an audio signal in an upper layer by a layered sinusoidal pulse coding scheme.

Type: Grant

Filed: May 19, 2010

Date of Patent: August 12, 2014

Assignee: Electronics and Telecommunications Research Institute

Inventors: Mi-Suk Lee, Heesik Yang, Hyun-Woo Kim, Jongmo Sung, Hyun-Joo Bae, Byung-Sun Lee

STREAMING ENCODER, PROSODY INFORMATION ENCODING DEVICE, PROSODY-ANALYZING DEVICE, AND DEVICE AND METHOD FOR SPEECH SYNTHESIZING

Publication number: 20140222421

Abstract: A speech-synthesizing device includes a hierarchical prosodic module, a prosody-analyzing device, and a prosody-synthesizing unit. The hierarchical prosodic module generates at least a first hierarchical prosodic model. The prosody-analyzing device receives a low-level linguistic feature, a high-level linguistic feature and a first prosodic feature, and generates at least a prosodic tag based on the low-level linguistic feature, the high-level linguistic feature, the first prosodic feature and the first hierarchical prosodic model. The prosody-synthesizing unit synthesizes a second prosodic feature based on the hierarchical prosodic module, the low-level linguistic feature and the prosodic tag.

Type: Application

Filed: January 30, 2014

Publication date: August 7, 2014

Applicant: National Chiao Tung University

Inventors: Sin-Horng Chen, Yih-Ru Wang, Chen-Yu Chiang, Chiao-Hua Hsieh

DATA PROCESSING METHOD THAT SELECTIVELY PERFORMS ERROR CORRECTION OPERATION IN RESPONSE TO DETERMINATION BASED ON CHARACTERISTIC OF PACKETS CORRESPONDING TO SAME SET OF SPEECH DATA, AND ASSOCIATED DATA PROCESSING APPARATUS

Publication number: 20140222420

Abstract: A data processing method for performing data processing on wireless received data and an associated data processing apparatus are provided, where the data processing method is applied to an electronic device. The data processing method includes the steps of: wirelessly receiving a plurality of packets corresponding to a same set of speech data from another electronic device; and selectively performing error correction operation on at least one of the plurality of packets to obtain the set of speech data, wherein whether to perform the error correction operation is determined according to at least one characteristic of the plurality of packets. More particularly, the error correction operation is selectively performed for at least one scenario of a timing critical scenario and a re-transmission limited scenario.

Type: Application

Filed: August 8, 2013

Publication date: August 7, 2014

Applicant: MEDIATEK INC.

Inventors: Wei-Kun Su, Hsuan-Yi Hou, Wei-Chu Lai, Chia-Wei Tao, Cheng-Lun Hu, Chieh-Cheng Cheng

DIALOGUE SYSTEM AND METHOD FOR RESPONDING TO MULTIMODAL INPUT USING CALCULATED SITUATION ADAPTABILITY

Publication number: 20140214410

Abstract: A dialogue system and a method for the same are disclosed. The dialogue system includes a multimodal input unit receiving speech and non-speech information of a user, a domain reasoner, which stores a plurality of pre-stored situations, each of which is formed by a combination of one or more speech and non-speech information, calculating each adaptability of the pre-stored situations on the basis of a generated situation based on the speech and the non-speech information received from the multimodal input unit, and determining a current domain according to the calculated adaptability, a dialogue manager to select a response corresponding to the current domain, and a multimodal output unit to output the response. The dialogue system performs domain reasoning using a situation including information combinations reflected in the domain reasoning process, current information, and a speech recognition result, and reduces the size of a dialogue search space while increasing domain reasoning accuracy.

Type: Application

Filed: April 2, 2014

Publication date: July 31, 2014

Applicant: SAMSUNG ELECTRONICS CO., LTD

Inventors: Jun Won JANG, Woo Sup HAN

Protection of Private Information in a Client/Server Automatic Speech Recognition System

Publication number: 20140207442

Abstract: A mobile device is adapted for protecting private information on the mobile device in a hybrid automatic speech recognition arrangement. The mobile device includes a speech input component for receiving a speech input signal from a user. Additionally, the mobile device includes a local ASR arrangement for performing local ASR processing of the speech input signal and determining if private information is included within the speech input signal. A control unit on the mobile device obscures private information in the speech input signal if the local ASR arrangement identifies information within a speech recognition result as private information. The control unit releases the speech input signal with the obscured private information for transmission to a remote server for further ASR processing. Results from the remote server's ASR processing are integrated and combined with results from local ASR processing to display information on the mobile device.

Type: Application

Filed: January 24, 2013

Publication date: July 24, 2014

Applicant: NUANCE COMMUNICATIONS, INC.

Inventors: William F. Ganong, III, Paul J. Vozila

System and method for babble noise detection

Patent number: 8788265

Abstract: A method, device, system, and computer program product calculate a gradient index as a sum of magnitudes of gradients of speech signals from a received frame at each change of direction; and provide an indication that the frame contains babble noise if the gradient index, energy information, and background noise level exceed pre-determined thresholds or a voice activity detector algorithm and sound level indicate babble noise.

Type: Grant

Filed: May 25, 2004

Date of Patent: July 22, 2014

Assignee: Nokia Solutions and Networks Oy

Inventors: Laura Laaksonen, Päivi Valve
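
As a rough illustration of the gradient index described in the abstract above, the sketch below sums the gradient magnitudes at each sample where the gradient changes sign. The function names, the absence of any normalization, and the threshold parameters are assumptions made for illustration; the patent's exact formula is not given in the abstract. Only the threshold branch of the decision is shown, not the voice-activity-detector branch.

```python
def gradient_index(frame):
    """Sum |gradient| at every point where the gradient changes direction."""
    gradients = [b - a for a, b in zip(frame, frame[1:])]
    total = 0.0
    for previous, current in zip(gradients, gradients[1:]):
        if previous * current < 0:  # opposite signs: the signal changed direction
            total += abs(current)
    return total


def indicates_babble(gradient_idx, energy, noise_level,
                     gradient_threshold, energy_threshold, noise_threshold):
    """Threshold branch only: all three measures must exceed their thresholds."""
    return (gradient_idx > gradient_threshold
            and energy > energy_threshold
            and noise_level > noise_threshold)


if __name__ == "__main__":
    print(gradient_index([0, 1, 0, 1, 0]))  # oscillating frame
    print(gradient_index([0, 1, 2, 3]))     # monotone frame, no direction change
```
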
Method and an apparatus for processing speech, audio, and speech/audio signal using mode information

Patent number: 8781843

Abstract: A method of processing a signal, which includes receiving at least one of a first signal and a second signal, receiving mode information, and decoding the at least one of the first signal and the second signal using at least one of a first coding scheme and a second coding scheme according to the mode information. Further, the mode information indicates which one of at least three modes a prescribed mode corresponds to.

Type: Grant

Filed: October 15, 2008

Date of Patent: July 15, 2014

Assignee: Intellectual Discovery Co., Ltd.

Inventors: Hyen-O Oh, Hong Goo Kang, Chang Heon Lee, Sang Wook Shin, Yang Won Jung
Audio device switching with reduced pop and click

Patent number: 8779962

Abstract: This document discusses, among other things, apparatus and methods including an analog-to-digital converter (ADC) configured to receive an enable signal and to provide an ADC output signal to control logic, wherein the control logic is configured to provide a control voltage to a control input of a switch. In an example, the control voltage includes the ADC output signal when the ADC output signal is below a first threshold or above a second threshold. In certain examples, the control logic is configured to transition the control voltage from the first threshold to the second threshold when the ADC output signal is between the first and second thresholds.

Type: Grant

Filed: April 10, 2013

Date of Patent: July 15, 2014

Assignee: Fairchild Semiconductor Corporation

Inventors: John L. Carpentier, Julie Lynn Stultz, Steven Macaluso
Audio coding

Patent number: 8781844

Abstract: A method for encoding an audio signal including: processing a selected subset of a lower series of samples forming a lower frequency spectral band of the audio signal and a higher series of samples forming a higher frequency spectral band of the audio signal to parametrically encode the higher series of samples forming the higher frequency spectral band by identifying a sub-series of the lower series of samples.

Type: Grant

Filed: September 25, 2009

Date of Patent: July 15, 2014

Assignee: Nokia Corporation

Inventors: Lasse Juhani Laaksonen, Mikko Tapio Tammi, Adriana Vasilache, Anssi Sakari Ramo
METHOD AND SYSTEM FOR TRANSMITTING AUDIO SIGNAL

Publication number: 20140195223

Abstract: A method and a system for transmitting an audio signal are provided. The system includes a transmission device and a receiving device communicating with the transmission device via a network. The method includes receiving and sampling the audio signal, recording values of points of the sampled audio signal using the transmission device, segmenting the sampled audio signal into a plurality of frames, extracting and encoding characteristic information from each frame, to obtain a group of generated codes, transmitting each group of generated codes to a receiving device sequentially using the transmission device, and decoding each group of generated codes using the receiving device, to obtain a decoded audio signal.

Type: Application

Filed: December 26, 2013

Publication date: July 10, 2014

Applicants: HONG FU JIN PRECISION INDUSTRY (ShenZhen) CO., LTD., HON HAI PRECISION INDUSTRY CO., LTD.

Inventors: FANG-YOU WANG, CHIH-HUA HSU, HANG XU
Speech Modification for Distributed Story Reading

Publication number: 20140195222

Abstract: Various embodiments provide an interactive, shared, story-reading experience in which stories can be experienced from remote locations. Various embodiments enable augmentation or modification of audio and/or video associated with the story-reading experience. This can include augmentation and modification of a reader's voice, face, and/or other content associated with the story as the story is read.

Type: Application

Filed: January 7, 2013

Publication date: July 10, 2014

Applicant: MICROSOFT CORPORATION

Inventors: Alan W. Peevers, John C. Tang, Nizamettin Gok, Gina Danielle Venolia, Kori Inkpen Quinn, Simon Andrew Longbottom, Kurt A. Thywissen
Coding/decoding method, system and apparatus

Patent number: 8775166

Abstract: An encoding method includes: extracting core layer characteristic parameters and enhancement layer characteristic parameters of a background noise signal, encoding the core layer characteristic parameters and enhancement layer characteristic parameters to obtain a core layer codestream and an enhancement layer codestream. The disclosure also provides an encoding device, a decoding device and method, an encapsulating method, a reconstructing method, an encoding-decoding system and an encoding-decoding method. By describing the background noise signal with the enhancement layer characteristic parameters, the background noise signal can be processed by using more accurate encoding and decoding method, so as to improve the quality of encoding and decoding the background noise signal.

Type: Grant

Filed: August 14, 2009

Date of Patent: July 8, 2014

Assignee: Huawei Technologies Co., Ltd.

Inventors: Hualin Wan, Libin Zhang
Method for emotion communication between emotion signal sensing device and emotion service providing device

Patent number: 8775186

Abstract: Provided is a method for emotion communication to share a user's emotions between an emotion signal sensing device and an emotion service providing device. The method for emotion communication includes: sensing, by the emotion signal sensing device, biological and environmental information of the user and generating an emotion signal and emotion information of the user based on the biological and environmental information; establishing an emotion communication connection with the emotion service providing device; transmitting the emotion signal and the emotion information to the emotion service providing device over the established emotion communication connection; and breaking the connection with the emotion service providing device.

Type: Grant

Filed: December 29, 2010

Date of Patent: July 8, 2014

Assignee: Electronics and Telecommunications Research Institute

Inventors: Hyun Soon Shin, Sung Won Lee, Choong Seon Hong
Yule walker based low-complexity voice activity detector in noise suppression systems

Patent number: 8775168

Abstract: A Yule-Walker based, low-complexity voice activity detector (VAD) is disclosed. An input signal is typically noisy speech (i.e., corrupted with, for example, babble noise). In one embodiment, a first initialization stage of the VAD computes an occurrence of a silent period within the input signal and the AR parameters. The VAD could accordingly compute a tentative adaptive threshold and output hypothesis H1 (which means speech is present) during this stage. During the second initialization stage, the VAD generally builds a database of associated values and computes the adaptive threshold accordingly. The second initialization stage could also output tentative VAD decisions based on the tentative threshold computed in the first initialization stage. Finally, the VAD periodically retrains or updates AR parameters, threshold values and/or the database and outputs VAD decisions accordingly.

Type: Grant

Filed: August 3, 2007

Date of Patent: July 8, 2014

Assignee: STMicroelectronics Asia Pacific PTE, Ltd.

Inventors: Karthik Muralidhar, Anoop Kumar Krishna
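
The abstract above does not say how the AR parameters are obtained. One common route, assumed here and not taken from the patent, is to compute the frame autocorrelation and solve the Yule-Walker equations with the Levinson-Durbin recursion. The names `autocorrelation` and `levinson_durbin` are illustrative, and the sign convention is x[n] ≈ a[0]·x[n-1] + a[1]·x[n-2] + ….

```python
def autocorrelation(x, max_lag):
    """Biased (unnormalized) autocorrelation r[0..max_lag] of one frame."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the Yule-Walker equations for AR coefficients.

    Returns (coefficients, prediction_error), where the coefficients predict
    x[n] from the `order` previous samples.
    """
    a = [0.0] * (order + 1)  # a[1..order] hold the coefficients; a[0] is unused
    error = r[0]
    for i in range(1, order + 1):
        if error <= 0:  # silent or degenerate frame: stop early
            break
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error  # reflection coefficient for this order
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        error *= (1.0 - k * k)
    return a[1:], error


if __name__ == "__main__":
    # Autocorrelation sequence of an ideal AR(1) process with coefficient 0.5.
    print(levinson_durbin([1.0, 0.5, 0.25], 2))
```

A detector along these lines might compare the prediction error, or a statistic derived from the coefficients, against an adaptive threshold; how the patent does so is not stated in the abstract.
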
HOME APPLIANCE AND OPERATION METHOD THEREOF

Publication number: 20140188463

Abstract: A home appliance and an operation method thereof are disclosed. The operation method of the home appliance includes entering a voice recognition mode, receiving voice data through a microphone, recognizing the received voice data, and, in a case in which the recognized voice data contains information related to another home appliance, transmitting the recognized voice data to the corresponding home appliance. Consequently, sharing of voice data between home appliances is achieved.

Type: Application

Filed: January 2, 2014

Publication date: July 3, 2014

Applicant: LG ELECTRONICS INC.

Inventors: Seonghwan NOH, Sungwook HAN, Sangbae PARK, Chansung JEON
Machine translation of indirect speech

Patent number: 8768687

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating direct speech messages based on voice commands that include indirect speech messages. In one aspect, a method includes receiving a voice input corresponding to an utterance. A determination is made whether a transcription of the utterance includes a command to initiate a communication to a user and a segment that is classified as indirect speech. In response to determining that the transcription of the utterance includes the command and the segment that is classified as indirect speech, the segment that is classified as indirect speech is provided as input to a machine translator. In response to providing the segment that is classified as indirect speech to the machine translator, a direct speech segment is received from the machine translator. A communication is initiated that includes the direct speech segment.

Type: Grant

Filed: April 29, 2013

Date of Patent: July 1, 2014

Assignee: Google Inc.

Inventors: Matthias Quasthoff, Simon Tickner
Sound encoding device and sound encoding method

Patent number: 8768691

Abstract: A sound encoder for efficiently encoding stereophonic sound. A prediction parameter analyzer determines a delay difference D and an amplitude ratio g of a first-channel sound signal with respect to a second-channel sound signal as channel-to-channel prediction parameters from a first-channel decoded signal and a second-channel sound signal. A prediction parameter quantizer quantizes the prediction parameters, and a signal predictor predicts a second-channel signal using the first decoded signal and the quantization prediction parameters. The prediction parameter quantizer encodes and quantizes the prediction parameters (the delay difference D and the amplitude ratio g) using a relationship (correlation) between the delay difference D and the amplitude ratio g attributed to a spatial characteristic (e.g., distance) from a sound source of the signal to a receiving point.

Type: Grant

Filed: March 23, 2006

Date of Patent: July 1, 2014

Assignee: Panasonic Corporation

Inventor: Koji Yoshida
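
The channel-to-channel prediction in the abstract above can be sketched as follows, under the assumption that the second channel is modeled as second[n] ≈ g · first[n − D]. The estimation method shown (cross-correlation peak for D, square root of the energy ratio for g) and all function names are illustrative choices, not the patent's procedure, and the quantization of D and g is omitted.

```python
import math


def estimate_prediction_params(first, second, max_delay):
    """Estimate delay difference D and amplitude ratio g between two channels."""
    best_delay, best_corr = 0, float("-inf")
    for delay in range(-max_delay, max_delay + 1):
        corr = 0.0
        for n in range(len(second)):
            m = n - delay
            if 0 <= m < len(first):
                corr += second[n] * first[m]
        if corr > best_corr:  # keep the delay with the strongest correlation
            best_delay, best_corr = delay, corr
    energy_first = sum(x * x for x in first)
    energy_second = sum(x * x for x in second)
    gain = math.sqrt(energy_second / energy_first) if energy_first > 0 else 0.0
    return best_delay, gain


def predict_second_channel(first, delay, gain):
    """Predict the second channel as a delayed, scaled copy of the first."""
    return [gain * first[n - delay] if 0 <= n - delay < len(first) else 0.0
            for n in range(len(first))]


if __name__ == "__main__":
    first = [0, 0, 1, 2, 1, 0, 0, 0]
    second = [0, 0, 0, 0, 0.5, 1, 0.5, 0]  # first channel delayed by 2, halved
    d, g = estimate_prediction_params(first, second, max_delay=3)
    print(d, g, predict_second_channel(first, d, g))
```
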
Frequency band extension apparatus and method, encoding apparatus and method, decoding apparatus and method, and program

Patent number: 8762135

Abstract: Provided is a frequency band extension apparatus and method, an encoding apparatus and method, a decoding apparatus and method, and a program. Band-pass filters obtain a plurality of lowband subband signals from an input signal. A frequency envelope extracting circuit extracts a frequency envelope from the plurality of lowband subband signals obtained by the plurality of band-pass filters. A highband signal generating circuit generates highband signal components on the basis of the frequency envelope obtained by the frequency envelope extracting circuit, and the plurality of subband signals obtained by the band-pass filters. A frequency band extension apparatus extends the frequency band of the input signal on the basis of the highband signal components generated by the highband signal generating circuit.

Type: Grant

Filed: August 28, 2009

Date of Patent: June 24, 2014

Assignee: Sony Corporation

Inventors: Hiroyuki Honma, Toru Chinen, Yuki Yamamoto, Yuhki Mitsufuji, Kenichi Makino
Decoding method and decoding apparatus therefor

Patent number: 8762158

Abstract: A method and apparatus for generating synthesis audio signals are provided. The method includes decoding a bitstream; splitting the decoded bitstream into n sub-band signals; generating n transformed sub-band signals by transforming the n sub-band signals in a frequency domain; and generating synthesis audio signals by respectively multiplying the n transformed sub-band signals by values corresponding to synthesis filter bank coefficients.

Type: Grant

Filed: August 5, 2011

Date of Patent: June 24, 2014

Assignee: Samsung Electronics Co., Ltd.

Inventors: Hyun-wook Kim, Han-gil Moon, Sang-hoon Lee
SYSTEM AND METHOD FOR GENERATING PERSONALIZED TAG RECOMMENDATIONS FOR TAGGING AUDIO CONTENT

Publication number: 20140172419

Abstract: Systems, methods, and computer-readable storage media for generating personalized tag recommendations using speech analytics. The system first analyzes an audio stream to identify topics in the audio stream. Next, the system identifies tags related to the topics to yield identified tags. Based on the identified tags, the system then generates a tag recommendation for tagging the audio stream. The system can also send the tag recommendation to a device associated with a user for presentation to the user.

Type: Application

Filed: December 14, 2012

Publication date: June 19, 2014

Applicant: AVAYA INC.

Inventors: Ajita JOHN, Doree Duncan SELIGMANN
METHOD FOR CLASSIFYING VOICE CONFERENCE MINUTES, DEVICE, AND SYSTEM

Publication number: 20140163970

Abstract: Embodiments of the present invention provide a method, device, and system for classifying voice conference minutes. The method is: performing voice source locating according to audio data of the conference site so as to acquire a location of a voice source corresponding to the audio data, writing the location of the voice source into additional field information of the audio data, writing a voice activation flag into the additional field information, packaging the audio data as an audio code stream, and sending the audio code stream and the additional field information of the audio code stream to a recording server, so that the recording server classifies the audio data according to the additional field information and writes a participant identity that corresponds to the location of the voice source corresponding to the audio data into the additional field information of the audio code stream.

Type: Application

Filed: November 29, 2013

Publication date: June 12, 2014

Applicant: Huawei Technologies Co., Ltd.

Inventor: Wuzhou ZHAN
METHOD OF USING A MOBILE DEVICE AS A MICROPHONE, METHOD OF AUDIO PLAYBACK, AND RELATED DEVICE AND SYSTEM

Publication number: 20140163971

Abstract: The present disclosure discloses a method of using a mobile terminal as a microphone, an audio playback method, and related device and system. The method of using a mobile terminal as a microphone comprises: receiving identification information from a media device; establishing a data connection with the media device based on the identification information; converting a voice signal into audio data and sending the audio data to the media device, enabling the media device to output the audio data. According to the present disclosure, a mobile device and a media device can coordinate with each other. By connecting a mobile device, such as a mobile phone etc., with a media device, the mobile device can be used as a microphone. This makes it convenient for a user to use a microphone whenever and wherever, and meets the user's need.

Type: Application

Filed: December 3, 2013

Publication date: June 12, 2014

Applicant: Tencent Technology (Shenzhen) Company Limited

Inventor: Bo SONG
Audio transmission, recording and reproducing system

Patent number: RE44954

Abstract: An audio recording and reproducing apparatus includes a controller for controlling overall operation, a hard disc for writing and reading audio data, an audio compression/expansion circuit for expanding compressed audio data, and an external I/O port. The audio recording and reproducing apparatus is connected to a network service center to obtain desired music data from storage of the network service center and to store it on the hard disc.

Type: Grant

Filed: June 23, 2011

Date of Patent: June 17, 2014

Assignee: Sony Corporation

Inventors: Kazunori Ozawa, Nobuhiro Tone, Masahiro Asai
Method and apparatus for the provision of information signals based upon speech recognition

Patent number: RE45041

Abstract: A wireless system comprises at least one subscriber unit in wireless communication with an infrastructure. Each subscriber unit implements a speech recognition client, and the infrastructure comprises a speech recognition server. A given subscriber unit takes as input an unencoded speech signal that is subsequently parameterized by the speech recognition client. The parameterized speech is then provided to the speech recognition server that, in turn, performs speech recognition analysis on the parameterized speech. Information signals, based in part upon any recognized utterances identified by the speech recognition analysis, are subsequently provided to the subscriber unit. The information signals may be used to control the subscriber unit itself; to control one or more devices coupled to the subscriber unit, or may be operated upon by the subscriber unit or devices coupled thereto.

Type: Grant

Filed: May 10, 2013

Date of Patent: July 22, 2014

Assignee: BlackBerry Limited

Inventor: Ira A. Gerson
Method and apparatus for the provision of information signals based upon speech recognition

Patent number: RE45066

Abstract: A wireless system comprises at least one subscriber unit in wireless communication with an infrastructure. Each subscriber unit implements a speech recognition client, and the infrastructure comprises a speech recognition server. A given subscriber unit takes as input an unencoded speech signal that is subsequently parameterized by the speech recognition client. The parameterized speech is then provided to the speech recognition server that, in turn, performs speech recognition analysis on the parameterized speech. Information signals, based in part upon any recognized utterances identified by the speech recognition analysis, are subsequently provided to the subscriber unit. The information signals may be used to control the subscriber unit itself; to control one or more devices coupled to the subscriber unit, or may be operated upon by the subscriber unit or devices coupled thereto.

Type: Grant

Filed: May 10, 2013

Date of Patent: August 5, 2014

Assignee: BlackBerry Limited

Inventor: Ira A. Gerson
