Patents by Inventor Raquel Tato

Raquel Tato has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8635065
    Abstract: The present invention discloses an apparatus for automatic extraction of important events in audio signals comprising: signal input means for supplying audio signals; audio signal fragmenting means for partitioning audio signals supplied by the signal input means into audio fragments of a predetermined length and for allocating a sequence of one or more audio fragments to a respective audio window; feature extracting means for analyzing acoustic characteristics of the audio signals comprised in the audio fragments and for analyzing acoustic characteristics of the audio signals comprised in the audio windows; and important event extraction means for extracting important events in audio signals supplied by the audio signal fragmenting means based on predetermined important event classifying rules depending on acoustic characteristics of the audio signals comprised in the audio fragments and on acoustic characteristics of the audio signals comprised in the audio windows, wherein each important event extracted
    Type: Grant
    Filed: November 10, 2004
    Date of Patent: January 21, 2014
    Assignee: Sony Deutschland GmbH
    Inventors: Silke Goronzy-Thomae, Thomas Kemp, Ralf Kompe, Yin Hay Lam, Krzysztof Marasek, Raquel Tato
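The fragment-and-window partitioning described in the abstract above can be sketched in a few lines. This is an illustrative reconstruction only: fixed-length, non-overlapping fragments grouped into fixed-size windows. The actual fragment length, window allocation, and the important-event classifying rules are defined by the patent's claims and are not shown here.

```python
def fragment_signal(samples, frag_len):
    """Partition a 1-D sample sequence into consecutive, non-overlapping
    fragments of frag_len samples (trailing remainder is dropped)."""
    return [samples[i:i + frag_len]
            for i in range(0, len(samples) - frag_len + 1, frag_len)]

def windows_of(fragments, frags_per_window):
    """Allocate a sequence of consecutive fragments to each analysis window,
    so features can be computed per fragment and per window."""
    return [fragments[i:i + frags_per_window]
            for i in range(0, len(fragments) - frags_per_window + 1, frags_per_window)]

# Toy signal of 10 samples: 5 fragments of 2 samples, 2 windows of 2 fragments.
frags = fragment_signal(list(range(10)), 2)
wins = windows_of(frags, 2)
```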
  • Patent number: 8200488
    Abstract: The invention provides a method for processing speech comprising the steps of receiving a speech input (SI) of a speaker, generating speech parameters (SP) from said speech input (SI), determining parameters describing an absolute loudness (L) of said speech input (SI), and evaluating (EV) said speech input (SI) and/or said speech parameters (SP) using said parameters describing the absolute loudness (L). In particular, the step of evaluation (EV) comprises a step of emotion recognition and/or speaker identification. Further, a microphone array comprising a plurality of microphones is used for determining said parameters describing the absolute loudness. With a microphone array the distance of the speaker from the microphone array can be determined and the loudness can be normalized by the distance.
    Type: Grant
    Filed: December 10, 2003
    Date of Patent: June 12, 2012
    Assignee: Sony Deutschland GmbH
    Inventors: Thomas Kemp, Ralf Kompe, Raquel Tato
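The final step of the abstract, normalizing loudness by the speaker's distance from the microphone array, can be illustrated with the free-field inverse-distance law: sound pressure level drops by 20·log10(d/d_ref) dB, so adding that term refers the measured level back to a reference distance. The 1/r spreading assumption and the 1 m reference are my own illustrative choices, not details stated in the patent.

```python
import math

def normalized_loudness_db(measured_db, distance_m, ref_distance_m=1.0):
    """Compensate a measured sound level for spherical (1/r) spreading,
    referring it back to ref_distance_m. Free-field assumption only."""
    return measured_db + 20.0 * math.log10(distance_m / ref_distance_m)

# The same source measured at 1 m and at 2 m normalizes to the same level.
at_1m = normalized_loudness_db(60.0, 1.0)
at_2m = normalized_loudness_db(53.979400086720375, 2.0)  # 60 dB minus 20*log10(2)
```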
  • Patent number: 7962330
    Abstract: An apparatus for automatic dissection of segmented audio signals, wherein at least one information signal for identifying programs included in said audio signals and for identifying contents included in said programs is provided. A content detection device detects programs and contents belonging to the respective programs in the information signal. A program weighting device weights each program included in the information signal based on the contents of the respective program detected by the content detection device. A program ranking device identifies programs of the same category and ranks said programs based on a weighting result for each program provided by the program weighting device.
    Type: Grant
    Filed: November 10, 2004
    Date of Patent: June 14, 2011
    Assignee: Sony Deutschland GmbH
    Inventors: Silke Goronzy, Thomas Kemp, Ralf Kompe, Yin Hay Lam, Krzysztof Marasek, Raquel Tato
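The weight-then-rank scheme in the abstract above can be sketched as follows. All names, the additive weighting rule, and the sample data are illustrative assumptions; the patent does not specify how content weights are combined.

```python
def rank_programs(programs, content_weights):
    """programs: {name: (category, [content labels])}.
    Weight each program by summing per-content weights, then rank the
    programs of each category by descending weight."""
    by_category = {}
    for name, (category, contents) in programs.items():
        score = sum(content_weights.get(c, 0.0) for c in contents)
        by_category.setdefault(category, []).append((name, score))
    return {cat: sorted(entries, key=lambda e: e[1], reverse=True)
            for cat, entries in by_category.items()}

ranked = rank_programs(
    {"NewsA": ("news", ["politics", "weather"]),
     "NewsB": ("news", ["weather"]),
     "Match": ("sports", ["goal"])},
    {"politics": 2.0, "weather": 1.0, "goal": 3.0})
```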
  • Patent number: 7729914
    Abstract: To detect and determine a current emotional state (CES) of a human being from a spoken speech input (SI), it is suggested in a method for detecting emotions to identify first and second feature classes (A, E) with, in particular distinct, dimensions of an underlying emotional manifold (EM) or emotional space (ES) and/or with subspaces thereof.
    Type: Grant
    Filed: October 4, 2002
    Date of Patent: June 1, 2010
    Assignee: Sony Deutschland GmbH
    Inventors: Raquel Tato, Thomas Kemp, Krzysztof Marasek
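The two feature classes (A, E) in the abstract above are commonly read as the activation/arousal and evaluation (valence) axes of a two-dimensional emotional space. A toy quadrant mapping illustrates the idea of locating an emotional state in such a space; the labels and zero thresholds below are illustrative, not the patent's.

```python
def emotion_from_dimensions(arousal, valence):
    """Map a point in a 2-D arousal/valence emotional space to a coarse
    emotion label by quadrant (illustrative labels only)."""
    if arousal >= 0:
        return "excited/happy" if valence >= 0 else "angry/afraid"
    return "calm/content" if valence >= 0 else "sad/bored"
```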
  • Patent number: 7680654
    Abstract: An audio data segmentation apparatus for segmenting audio data, including means for supplying audio data, means for dividing the audio data supplied into audio clips of a predetermined length, means for discriminating the audio clips into predetermined audio classes, the audio classes identifying a kind of audio data included in the respective audio clip, and means for segmenting the audio data into audio meta patterns based on a sequence of audio classes of consecutive audio clips, each meta pattern being allocated to a predetermined type of contents of the audio data. It is difficult to achieve good results with known methods for segmentation of audio data into meta patterns, since the rules for the allocation of the meta patterns are unsatisfactory.
    Type: Grant
    Filed: November 10, 2004
    Date of Patent: March 16, 2010
    Assignee: Sony Deutschland GmbH
    Inventors: Silke Goronzy, Thomas Kemp, Ralf Kompe, Yin Hay Lam, Krzysztof Marasek, Raquel Tato
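Mapping a sequence of per-clip audio classes to meta patterns, as the abstract above describes, might be sketched as a greedy longest-match over a pattern table. The pattern table, the greedy strategy, and the "unclassified" fallback are illustrative assumptions; the patent's actual allocation rules are not reproduced here.

```python
def segment_meta_patterns(class_seq, patterns):
    """class_seq: list of per-clip class labels.
    patterns: {tuple of consecutive class labels: meta-pattern name}.
    Greedy left-to-right longest match; unmatched clips fall through
    as single-clip 'unclassified' segments."""
    out, i = [], 0
    max_len = max(len(k) for k in patterns)
    while i < len(class_seq):
        for n in range(min(max_len, len(class_seq) - i), 0, -1):
            key = tuple(class_seq[i:i + n])
            if key in patterns:
                out.append((patterns[key], key))
                i += n
                break
        else:
            out.append(("unclassified", (class_seq[i],)))
            i += 1
    return out

segments = segment_meta_patterns(
    ["speech", "speech", "music", "noise"],
    {("speech", "speech"): "dialogue", ("music",): "music block"})
```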
  • Patent number: 7376559
    Abstract: A method for pre-processing speech, in particular for recognizing speech, including receiving a speech signal, separating a spectrum of said speech signal into a number of predetermined frequency sub-bands, analyzing said speech signal within each of said frequency sub-bands, generating respective band-dependent acoustic feature data for each of said respective frequency sub-bands, deriving band-dependent likelihoods for occurrences of speech elements or of sequences thereof within said speech signal based on said band-dependent acoustic feature data, analyzing said speech signal within said spectrum, generating full-band acoustic feature data, which are at least in part representative for said speech signal with respect to said spectrum, deriving a full-band likelihood for occurrences of speech elements or of sequences thereof within said speech signal based on said full-band acoustic feature data, deriving an overall likelihood for occurrences of speech elements within said speech signal based on said band-dependent likelihoods
    Type: Grant
    Filed: March 25, 2004
    Date of Patent: May 20, 2008
    Assignee: Sony Deutschland GmbH
    Inventors: Raquel Tato, Thomas Kemp, Antoni Abella
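The final recombination step in the abstract above, deriving an overall likelihood from the band-dependent likelihoods b_1..b_N and the full-band likelihood, is often done as a weighted sum in the log domain in multi-band speech recognition. The weighted-log-sum rule below is one common scheme and an assumption on my part, not a formula taken from the patent.

```python
import math

def combined_log_likelihood(band_likelihoods, full_band_likelihood,
                            band_weights, full_weight):
    """Recombine per-sub-band likelihoods with a full-band likelihood as a
    weighted sum of log-likelihoods (one common recombination scheme)."""
    total = full_weight * math.log(full_band_likelihood)
    for b, w in zip(band_likelihoods, band_weights):
        total += w * math.log(b)
    return total

# With unit weights and likelihoods of e, each term contributes log(e) = 1.
score = combined_log_likelihood([math.e, math.e], math.e, [1.0, 1.0], 1.0)
```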
  • Patent number: 7373301
    Abstract: To reduce the error rate when classifying emotions from an acoustical speech input (SI) only, it is suggested to include a process of speaker identification to obtain certain speaker identification data (SID) on the basis of which the process of recognizing an emotional state is adapted and/or configured. In particular, speaker-specific feature extractors (FE) and/or emotion classifiers (EC) are selected based on said speaker identification data (SID).
    Type: Grant
    Filed: July 31, 2002
    Date of Patent: May 13, 2008
    Assignee: Sony Deutschland GmbH
    Inventors: Thomas Kemp, Ralf Kompe, Raquel Tato
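The selection step the abstract above describes, choosing a speaker-specific emotion classifier based on the speaker identification result, reduces to a lookup with a speaker-independent fallback. The dictionary-based dispatch and the fallback behavior are illustrative assumptions.

```python
def classify_emotion(features, speaker_id, speaker_classifiers, default_classifier):
    """Pick a speaker-specific emotion classifier when one exists for the
    identified speaker; otherwise fall back to a speaker-independent one."""
    classifier = speaker_classifiers.get(speaker_id, default_classifier)
    return classifier(features)

# Toy classifiers standing in for trained models.
known = {"alice": lambda f: "happy"}
fallback = lambda f: "neutral"
```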
  • Publication number: 20050160449
    Abstract: Apparatus and method for automatic dissection of segmented audio signals. According to the present invention, an apparatus for automatic dissection of segmented audio signals, wherein at least one information signal for identifying programmes included in said audio signals and for identifying contents included in said programmes is provided, comprises: content detection means for detecting programmes and contents belonging to the respective programmes in the information signal; programme weighting means for weighting each programme comprised in the information signal based on the contents of the respective programme detected by the content detection means; and programme ranking means for identifying programmes of the same category and ranking said programmes based on a weighting result for each programme provided by the programme weighting means.
    Type: Application
    Filed: November 10, 2004
    Publication date: July 21, 2005
    Inventors: Silke Goronzy, Thomas Kemp, Ralf Kompe, Yin Lam, Krzysztof Marasek, Raquel Tato
  • Publication number: 20050131688
    Abstract: An apparatus for classifying audio signals comprises audio signal clipping means for partitioning audio signals into audio clips, and class discrimination means for discriminating the audio clips provided by the audio signal clipping means into predetermined audio classes based on predetermined audio class classifying rules, by analysing acoustic characteristics of the audio signals comprised in the audio clips, wherein a predetermined audio class classifying rule is provided for each audio class, and each audio class represents a respective kind of audio signals comprised in the corresponding audio clip. The determination process to find acceptable audio class classifying rules for each audio class according to the prior art is depending on both the used raw audio signals and the personal experience of the person conducting the determination process. Thus, the determination process usually is very difficult, time consuming and subjective.
    Type: Application
    Filed: November 10, 2004
    Publication date: June 16, 2005
    Inventors: Silke Goronzy, Thomas Kemp, Ralf Kompe, Yin Lam, Krzysztof Marasek, Raquel Tato
  • Publication number: 20050120368
    Abstract: A method and an apparatus for effecting the method are proposed that allow to define a subset of video signals from a source set of video signals on the basis of meta data available for the source set of video signals. The meta data assign a generic term to a sub-section of the audio channel of the source set of video signals, a class description to one or more sub-units of the sub-section for classifying the origin of the respective sub-unit, a category allocation to a segment, which is formed by a string of one or more classified sub-units of a sub-section, and a rating value to the segment for rating the reliability of the category allocation of the segment. The method includes steps for selecting segments of a sub-section with a rating value above a defined threshold value, assigning a priority value to each category, and specifying a first subset of video signals by defining an arrangement of selected segments by an order based on the respective priority and rating values related to each segment.
    Type: Application
    Filed: November 10, 2004
    Publication date: June 2, 2005
    Inventors: Silke Goronzy, Thomas Kemp, Ralf Kompe, Yin Lam, Krzysztof Marasek, Raquel Tato
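The selection-and-ordering steps in the abstract above can be sketched directly: keep the segments whose rating exceeds the threshold, then order them by category priority and rating. The tuple sort key and the sample data are illustrative assumptions, not the patent's arrangement rule.

```python
def select_segments(segments, threshold, priority):
    """segments: list of (category, rating) pairs.
    Keep segments rated above the threshold and arrange them by category
    priority first, then by rating, both descending."""
    kept = [s for s in segments if s[1] > threshold]
    return sorted(kept, key=lambda s: (priority.get(s[0], 0), s[1]), reverse=True)

ordered = select_segments(
    [("speech", 0.9), ("music", 0.95), ("speech", 0.4)],
    threshold=0.5,
    priority={"speech": 2, "music": 1})
```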
  • Publication number: 20050114388
    Abstract: An audio data segmentation apparatus for segmenting audio data comprises audio data input means for supplying audio data, audio data clipping means for dividing the audio data supplied by the audio data input means into audio clips of a predetermined length, class discrimination means for discriminating the audio clips supplied by the audio data clipping means into predetermined audio classes, the audio classes identifying a kind of audio data included in the respective audio clip, and segmenting means for segmenting the audio data into audio meta patterns based on a sequence of audio classes of consecutive audio clips, each meta pattern being allocated to a predetermined type of contents of the audio data. It is difficult to achieve good results with known methods for segmentation of audio data into meta patterns, since the rules for the allocation of the meta patterns are unsatisfactory.
    Type: Application
    Filed: November 10, 2004
    Publication date: May 26, 2005
    Inventors: Silke Goronzy, Thomas Kemp, Ralf Kompe, Yin Lam, Krzysztof Marasek, Raquel Tato
  • Publication number: 20050102135
    Abstract: The present invention discloses an apparatus for automatic extraction of important events in audio signals comprising: signal input means for supplying audio signals; audio signal fragmenting means for partitioning audio signals supplied by the signal input means into audio fragments of a predetermined length and for allocating a sequence of one or more audio fragments to a respective audio window; feature extracting means for analysing acoustic characteristics of the audio signals comprised in the audio fragments and for analysing acoustic characteristics of the audio signals comprised in the audio windows; and important event extraction means for extracting important events in audio signals supplied by the audio signal fragmenting means based on predetermined important event classifying rules depending on acoustic characteristics of the audio signals comprised in the audio fragments and on acoustic characteristics of the audio signals comprised in the audio windows, wherein each important event extracted by
    Type: Application
    Filed: November 10, 2004
    Publication date: May 12, 2005
    Inventors: Silke Goronzy, Thomas Kemp, Ralf Kompe, Yin Lam, Krzysztof Marasek, Raquel Tato
  • Patent number: 6862497
    Abstract: There is proposed a method that may be universally used for controlling a man-machine interface unit. A learning sample is used in order at least to derive and/or initialize a target action (t) to be carried out and to lead the user from an optional current status (ec) to an optional desired target status (et) as the final status (ef). This learning sample (l) is formed by a data triple made up by an initial status (ei) before an optional action (a) carried out by the user, a final status (ef) after the action taken place, and the action taken place (a).
    Type: Grant
    Filed: June 3, 2002
    Date of Patent: March 1, 2005
    Assignees: Sony Corporation, Sony International (Europe) GmbH
    Inventors: Thomas Kemp, Ralf Kompe, Raquel Tato, Masahiro Fujita, Katsuki Minamino, Kenta Kawamoto, Rika Horinaka
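The data triple the abstract above describes, an initial status, the user's action, and the resulting final status, can be modeled as a simple record, with a target action derived by looking up a previously observed transition. The lookup logic and all names here are an illustrative sketch, not the patent's learning procedure.

```python
from collections import namedtuple

# A learning sample as the data triple from the abstract:
# initial status e_i, action a, final status e_f.
LearningSample = namedtuple("LearningSample", ["initial_status", "action", "final_status"])

def target_action(samples, current_status, desired_status):
    """Return an action that previously led from current_status to
    desired_status, if any such learning sample was observed."""
    for s in samples:
        if s.initial_status == current_status and s.final_status == desired_status:
            return s.action
    return None

history = [LearningSample("idle", "wave", "greeted"),
           LearningSample("greeted", "speak", "talking")]
```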
  • Publication number: 20040236570
    Abstract: The invention provides a method for pre-processing speech, in particular in a method for recognizing speech, comprising the steps of receiving a speech signal (S), separating a spectrum (F) of said speech signal (S) into a given number (N) of predetermined frequency sub-bands (F1, . . . , FN), analyzing said speech signal (S) within each of said frequency sub-bands (F1, . . . , FN), thereby generating respective band-dependent acoustic feature data (O1, . . . , ON) for each of said respective frequency sub-bands (F1, . . . , FN), which band-dependent acoustic feature data (O1, . . . ON) are at least in part representative for said speech signal (S) with respect to a respective frequency sub-band (F1, . . . , FN), deriving band-dependent likelihoods (b1, . . . , bN) for occurrences of speech elements (P1, . . . , Pm) or of sequences thereof within said speech signal (S) based on said band-dependent acoustic feature data (O1, . . .
    Type: Application
    Filed: March 25, 2004
    Publication date: November 25, 2004
    Inventors: Raquel Tato, Thomas Kemp, Antoni Abella
  • Publication number: 20040128127
    Abstract: The invention provides a method for processing speech comprising the steps of receiving a speech input (SI) of a speaker, generating speech parameters (SP) from said speech input (SI), determining parameters describing an absolute loudness (L) of said speech input (SI), and evaluating (EV) said speech input (SI) and/or said speech parameters (SP) using said parameters describing the absolute loudness (L). In particular, the step of evaluation (EV) comprises a step of emotion recognition and/or speaker identification. Further, a microphone array comprising a plurality of microphones is used for determining said parameters describing the absolute loudness. With a microphone array the distance of the speaker from the microphone array can be determined and the loudness can be normalized by the distance.
    Type: Application
    Filed: December 10, 2003
    Publication date: July 1, 2004
    Inventors: Thomas Kemp, Ralf Kompe, Raquel Tato
  • Publication number: 20040039483
    Abstract: There is proposed a method that may be universally used for controlling a man-machine interface unit. A learning sample is used in order at least to derive and/or initialize a target action (t) to be carried out and to lead the user from an optional current status (ec) to an optional desired target status (et) as the final status (ef). This learning sample (l) is formed by a data triple made up by an initial status (ei) before an optional action (a) carried out by the user, a final status (ef) after the action taken place, and the action taken place (a).
    Type: Application
    Filed: June 16, 2003
    Publication date: February 26, 2004
    Inventors: Thomas Kemp, Ralf Kompe, Raquel Tato, Masahiro Fujita, Katsuki Minamino, Kenta Kawamoto, Rika Horinaka
  • Publication number: 20030069728
    Abstract: To detect and determine a current emotional state (CES) of a human being from a spoken speech input (SI), it is suggested in a method for detecting emotions to identify first and second feature classes (A, E) with, in particular distinct, dimensions of an underlying emotional manifold (EM) or emotional space (ES) and/or with subspaces thereof.
    Type: Application
    Filed: October 4, 2002
    Publication date: April 10, 2003
    Inventors: Raquel Tato, Thomas Kemp, Krzysztof Marasek
  • Publication number: 20030028384
    Abstract: To reduce the error rate when classifying emotions from an acoustical speech input (SI) only, it is suggested to include a process of speaker identification to obtain certain speaker identification data (SID) on the basis of which the process of recognizing an emotional state is adapted and/or configured. In particular, speaker-specific feature extractors (FE) and/or emotion classifiers (EC) are selected based on said speaker identification data (SID).
    Type: Application
    Filed: July 31, 2002
    Publication date: February 6, 2003
    Inventors: Thomas Kemp, Ralf Kompe, Raquel Tato