Patents by Inventor Ozlem Kalinli

Ozlem Kalinli has filed for patents to protect the following inventions. This listing includes both pending patent applications and patents already granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10424289
    Abstract: A speech recognition system includes a phone classifier and a boundary classifier. The phone classifier generates combined boundary posteriors from a combination of auditory attention features and phone posteriors by feeding phone posteriors of neighboring frames of an audio signal into a machine learning algorithm to classify phone posterior context information. The boundary classifier estimates boundaries in speech contained in the audio signal from the combined boundary posteriors.
    Type: Grant
    Filed: August 14, 2018
    Date of Patent: September 24, 2019
    Assignee: Sony Interactive Entertainment Inc.
    Inventor: Ozlem Kalinli-Akbacak
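The context-window step this abstract describes — feeding phone posteriors of neighboring frames, together with auditory attention features, into a boundary classifier — can be sketched as follows. This is an illustrative reading of the abstract, not code from the patent; the function names, the window radius, and the toy data are all hypothetical.

```python
# Hypothetical sketch: phone posteriors of neighboring frames are
# concatenated with auditory attention features to form the input
# vector for a boundary classifier.

def context_window(posteriors, t, radius=1):
    """Concatenate phone posteriors of frames t-radius .. t+radius,
    zero-padding at the signal edges."""
    dim = len(posteriors[0])
    window = []
    for k in range(t - radius, t + radius + 1):
        if 0 <= k < len(posteriors):
            window.extend(posteriors[k])
        else:
            window.extend([0.0] * dim)
    return window

def boundary_input(posteriors, attention_feats, t, radius=1):
    """Combined feature vector: posterior context plus attention features."""
    return context_window(posteriors, t, radius) + list(attention_feats[t])

# Toy data: 4 frames, 2-phone posteriors, 1 attention feature per frame.
post = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]
attn = [[0.05], [0.10], [0.70], [0.15]]
x = boundary_input(post, attn, t=2)
# x = [0.8, 0.2, 0.2, 0.8, 0.1, 0.9, 0.7]
```

In a full system, `x` would be fed to a trained machine-learning model that outputs the combined boundary posterior for frame `t`.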
  • Publication number: 20190005943
    Abstract: A speech recognition system includes a phone classifier and a boundary classifier. The phone classifier generates combined boundary posteriors from a combination of auditory attention features and phone posteriors by feeding phone posteriors of neighboring frames of an audio signal into a machine learning algorithm to classify phone posterior context information. The boundary classifier estimates boundaries in speech contained in the audio signal from the combined boundary posteriors.
    Type: Application
    Filed: August 14, 2018
    Publication date: January 3, 2019
    Inventor: Ozlem Kalinli-Akbacak
  • Patent number: 10127927
    Abstract: A method for emotion or speaking style recognition and/or clustering comprises receiving one or more speech samples, generating a set of training data by extracting one or more acoustic features from every frame of the one or more speech samples, and generating a model from the set of training data, wherein the model identifies emotion- or speaking-style-dependent information in the set of training data. The method may further comprise receiving one or more test speech samples, generating a set of test data by extracting one or more acoustic features from every frame of the one or more test speech samples, transforming the set of test data using the model to better represent emotion- or speaking-style-dependent information, and using the transformed data for clustering and/or classification to discover speech with a similar emotion or speaking style.
    Type: Grant
    Filed: June 18, 2015
    Date of Patent: November 13, 2018
    Assignee: Sony Interactive Entertainment Inc.
    Inventors: Ozlem Kalinli-Akbacak, Ruxin Chen
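The train-then-transform flow in this abstract can be sketched minimally as follows. The feature choice and the "model" (a feature-wise mean/standard-deviation standardization, standing in for whatever learned transform the patent actually claims) are assumptions for illustration only.

```python
# Illustrative sketch of the abstract's flow: pool per-frame acoustic
# features into training data, fit a simple model, then transform test
# features before clustering/classification.

def extract_features(samples):
    """One toy acoustic feature (e.g. frame energy) per frame of each sample."""
    return [frame for sample in samples for frame in sample]

def fit_model(train):
    """Fit feature mean and standard deviation over the training frames."""
    n = len(train)
    mean = sum(train) / n
    var = sum((x - mean) ** 2 for x in train) / n
    std = var ** 0.5 if var > 0 else 1.0
    return {"mean": mean, "std": std}

def transform(model, test):
    """Map test features into the model's normalized space."""
    return [(x - model["mean"]) / model["std"] for x in test]
```

A downstream clustering algorithm (e.g. k-means) would then operate on the transformed features to group speech with a similar emotion or speaking style.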
  • Patent number: 10049657
    Abstract: Phoneme boundaries may be determined from a signal corresponding to recorded audio by extracting auditory attention features from the signal and extracting phoneme posteriors from the signal. The auditory attention features and phoneme posteriors may then be combined to detect boundaries in the signal.
    Type: Grant
    Filed: May 26, 2017
    Date of Patent: August 14, 2018
    Assignee: Sony Interactive Entertainment Inc.
    Inventor: Ozlem Kalinli-Akbacak
  • Publication number: 20170263240
    Abstract: Phoneme boundaries may be determined from a signal corresponding to recorded audio by extracting auditory attention features from the signal and extracting phoneme posteriors from the signal. The auditory attention features and phoneme posteriors may then be combined to detect boundaries in the signal.
    Type: Application
    Filed: May 26, 2017
    Publication date: September 14, 2017
    Inventor: Ozlem Kalinli-Akbacak
  • Patent number: 9672811
    Abstract: Phoneme boundaries may be determined from a signal corresponding to recorded audio by extracting auditory attention features from the signal and extracting phoneme posteriors from the signal. The auditory attention features and phoneme posteriors may then be combined to detect boundaries in the signal.
    Type: Grant
    Filed: May 23, 2013
    Date of Patent: June 6, 2017
    Assignee: Sony Interactive Entertainment Inc.
    Inventor: Ozlem Kalinli-Akbacak
  • Patent number: 9251783
    Abstract: In syllable or vowel or phone boundary detection during speech, an auditory spectrum may be determined for an input window of sound and one or more multi-scale features may be extracted from the auditory spectrum. Each multi-scale feature can be extracted using a separate two-dimensional spectro-temporal receptive filter. One or more feature maps corresponding to the one or more multi-scale features can be generated and an auditory gist vector can be extracted from each of the one or more feature maps. A cumulative gist vector may be obtained through augmentation of each auditory gist vector extracted from the one or more feature maps. One or more syllable or vowel or phone boundaries in the input window of sound can be detected by mapping the cumulative gist vector to one or more syllable or vowel or phone boundary characteristics using a machine learning algorithm.
    Type: Grant
    Filed: June 17, 2014
    Date of Patent: February 2, 2016
    Assignee: Sony Computer Entertainment Inc.
    Inventors: Ozlem Kalinli-Akbacak, Ruxin Chen
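The gist-extraction steps named in this abstract — reducing each feature map to an auditory gist vector and augmenting the per-map vectors into a cumulative gist vector — can be sketched as below. The grid size and block-averaging scheme are illustrative choices, not details taken from the patent.

```python
# Minimal sketch: each 2-D feature map is averaged over a coarse grid to
# form a gist vector; per-map gist vectors are augmented (concatenated)
# into a cumulative gist vector.

def gist_vector(feature_map, grid=2):
    """Average a 2-D feature map over a grid x grid partition."""
    rows, cols = len(feature_map), len(feature_map[0])
    rstep, cstep = rows // grid, cols // grid
    gist = []
    for i in range(grid):
        for j in range(grid):
            block = [feature_map[r][c]
                     for r in range(i * rstep, (i + 1) * rstep)
                     for c in range(j * cstep, (j + 1) * cstep)]
            gist.append(sum(block) / len(block))
    return gist

def cumulative_gist(feature_maps, grid=2):
    """Concatenate the gist vectors of all feature maps."""
    vec = []
    for fmap in feature_maps:
        vec.extend(gist_vector(fmap, grid))
    return vec
```

The cumulative gist vector would then be mapped to boundary characteristics by a trained machine-learning model, per the abstract.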
  • Publication number: 20160027452
    Abstract: A method for emotion or speaking style recognition and/or clustering comprises receiving one or more speech samples, generating a set of training data by extracting one or more acoustic features from every frame of the one or more speech samples, and generating a model from the set of training data, wherein the model identifies emotion- or speaking-style-dependent information in the set of training data. The method may further comprise receiving one or more test speech samples, generating a set of test data by extracting one or more acoustic features from every frame of the one or more test speech samples, transforming the set of test data using the model to better represent emotion- or speaking-style-dependent information, and using the transformed data for clustering and/or classification to discover speech with a similar emotion or speaking style.
    Type: Application
    Filed: June 18, 2015
    Publication date: January 28, 2016
    Inventors: Ozlem Kalinli-Akbacak, Ruxin Chen
  • Patent number: 9244285
    Abstract: Methods of eye gaze tracking are provided using magnetized contact lenses tracked by magnetic sensors and/or reflective contact lenses tracked by video-based sensors. Tracking information of contact lenses from magnetic sensors and video-based sensors may be used to improve eye tracking and/or combined with other sensor data to improve accuracy. Furthermore, reflective contact lenses improve blink detection, while eye gaze tracking is otherwise unimpeded by magnetized contact lenses. Additionally, contact lenses may be adapted for viewing 3D information.
    Type: Grant
    Filed: January 21, 2014
    Date of Patent: January 26, 2016
    Assignee: Sony Computer Entertainment Inc.
    Inventors: Ruxin Chen, Ozlem Kalinli
  • Patent number: 9031293
    Abstract: Features, including one or more acoustic features, visual features, linguistic features, and physical features, may be extracted from signals obtained by one or more sensors with a processor. The acoustic, visual, linguistic, and physical features may be analyzed with one or more machine learning algorithms, and an emotional state of a user may be extracted from analysis of the features.
    Type: Grant
    Filed: October 19, 2012
    Date of Patent: May 12, 2015
    Assignee: Sony Computer Entertainment Inc.
    Inventor: Ozlem Kalinli-Akbacak
  • Patent number: 9020822
    Abstract: Emotion recognition may be implemented on an input window of sound. One or more auditory attention features may be extracted from an auditory spectrum for the window using one or more two-dimensional spectro-temporal receptive filters. One or more feature maps corresponding to the one or more auditory attention features may be generated. Auditory gist features may be extracted from feature maps, and the auditory gist features may be analyzed to determine one or more emotion classes corresponding to the input window of sound. In addition, a bottom-up auditory attention model may be used to select emotionally salient parts of speech and execute emotion recognition only on the salient parts of speech while ignoring the rest of the speech signal.
    Type: Grant
    Filed: October 19, 2012
    Date of Patent: April 28, 2015
    Assignee: Sony Computer Entertainment Inc.
    Inventor: Ozlem Kalinli-Akbacak
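The salience-gating idea in this abstract — using a bottom-up attention score to select emotionally salient parts of speech and running recognition only on those — reduces to a simple filter. The threshold and per-frame scores below are illustrative placeholders, not values from the patent.

```python
# Toy sketch of salience gating: frames whose bottom-up attention score
# clears a threshold are kept; emotion recognition would run only on them.

def salient_frames(frames, salience, threshold=0.5):
    """Return only the frames whose attention score is >= threshold."""
    return [f for f, s in zip(frames, salience) if s >= threshold]

kept = salient_frames(["frame0", "frame1", "frame2"], [0.9, 0.2, 0.6])
# kept = ["frame0", "frame2"]
```

The rest of the signal is simply ignored, which is the claimed efficiency benefit of running the classifier only on salient segments.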
  • Patent number: 9009039
    Abstract: Technologies are described herein for noise adaptive training to achieve robust automatic speech recognition. Through the use of these technologies, a noise adaptive training (NAT) approach may use both clean and corrupted speech for training. The NAT approach may normalize the environmental distortion as part of the model training. A set of underlying “pseudo-clean” model parameters may be estimated directly. This may be done without point estimation of clean speech features as an intermediate step. The pseudo-clean model parameters learned from the NAT technique may be used with a Vector Taylor Series (VTS) adaptation. Such adaptation may support decoding noisy utterances during the operating phase of an automatic speech recognition system.
    Type: Grant
    Filed: June 12, 2009
    Date of Patent: April 14, 2015
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Michael Lewis Seltzer, James Garnet Droppo, Ozlem Kalinli, Alejandro Acero
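The VTS adaptation this abstract builds on relates clean-speech and noise statistics through a standard mismatch function: in the log-spectral domain, a noisy-speech mean can be approximated as y ≈ x + log(1 + exp(n − x)) from a (pseudo-clean) speech mean x and a noise mean n. The sketch below shows only that textbook relation, not the patent's training procedure.

```python
# Hedged sketch of the standard VTS mismatch function in the log-spectral
# domain: approximate the noisy mean from pseudo-clean and noise means.

import math

def vts_noisy_mean(clean_mean, noise_mean):
    """Per-dimension VTS approximation: y = x + log(1 + exp(n - x))."""
    return [x + math.log1p(math.exp(n - x))
            for x, n in zip(clean_mean, noise_mean)]

# Equal speech and noise energy: the noisy mean rises by log(2).
y = vts_noisy_mean([0.0], [0.0])
```

NAT-style training would iterate between estimating the pseudo-clean parameters and applying this adaptation, so that the environmental distortion is normalized out of the acoustic model.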
  • Publication number: 20150073794
    Abstract: In syllable or vowel or phone boundary detection during speech, an auditory spectrum may be determined for an input window of sound and one or more multi-scale features may be extracted from the auditory spectrum. Each multi-scale feature can be extracted using a separate two-dimensional spectro-temporal receptive filter. One or more feature maps corresponding to the one or more multi-scale features can be generated and an auditory gist vector can be extracted from each of the one or more feature maps. A cumulative gist vector may be obtained through augmentation of each auditory gist vector extracted from the one or more feature maps. One or more syllable or vowel or phone boundaries in the input window of sound can be detected by mapping the cumulative gist vector to one or more syllable or vowel or phone boundary characteristics using a machine learning algorithm.
    Type: Application
    Filed: June 17, 2014
    Publication date: March 12, 2015
    Inventors: Ozlem Kalinli-Akbacak, Ruxin Chen
  • Publication number: 20140198382
    Abstract: Methods of eye gaze tracking are provided using magnetized contact lenses tracked by magnetic sensors and/or reflective contact lenses tracked by video-based sensors. Tracking information of contact lenses from magnetic sensors and video-based sensors may be used to improve eye tracking and/or combined with other sensor data to improve accuracy. Furthermore, reflective contact lenses improve blink detection, while eye gaze tracking is otherwise unimpeded by magnetized contact lenses. Additionally, contact lenses may be adapted for viewing 3D information.
    Type: Application
    Filed: January 21, 2014
    Publication date: July 17, 2014
    Applicant: Sony Computer Entertainment Inc.
    Inventors: Ruxin Chen, Ozlem Kalinli
  • Patent number: 8756061
    Abstract: In syllable or vowel or phone boundary detection during speech, an auditory spectrum may be determined for an input window of sound and one or more multi-scale features may be extracted from the auditory spectrum. Each multi-scale feature can be extracted using a separate two-dimensional spectro-temporal receptive filter. One or more feature maps corresponding to the one or more multi-scale features can be generated and an auditory gist vector can be extracted from each of the one or more feature maps. A cumulative gist vector may be obtained through augmentation of each auditory gist vector extracted from the one or more feature maps. One or more syllable or vowel or phone boundaries in the input window of sound can be detected by mapping the cumulative gist vector to one or more syllable or vowel or phone boundary characteristics using a machine learning algorithm.
    Type: Grant
    Filed: April 1, 2011
    Date of Patent: June 17, 2014
    Assignee: Sony Computer Entertainment Inc.
    Inventors: Ozlem Kalinli, Ruxin Chen
  • Publication number: 20140149112
    Abstract: Phoneme boundaries may be determined from a signal corresponding to recorded audio by extracting auditory attention features from the signal and extracting phoneme posteriors from the signal. The auditory attention features and phoneme posteriors may then be combined to detect boundaries in the signal.
    Type: Application
    Filed: May 23, 2013
    Publication date: May 29, 2014
    Applicant: Sony Computer Entertainment Inc.
    Inventor: Ozlem Kalinli-Akbacak
  • Publication number: 20140114655
    Abstract: Emotion recognition may be implemented on an input window of sound. One or more auditory attention features may be extracted from an auditory spectrum for the window using one or more two-dimensional spectro-temporal receptive filters. One or more feature maps corresponding to the one or more auditory attention features may be generated. Auditory gist features may be extracted from feature maps, and the auditory gist features may be analyzed to determine one or more emotion classes corresponding to the input window of sound. In addition, a bottom-up auditory attention model may be used to select emotionally salient parts of speech and execute emotion recognition only on the salient parts of speech while ignoring the rest of the speech signal.
    Type: Application
    Filed: October 19, 2012
    Publication date: April 24, 2014
    Applicant: Sony Computer Entertainment Inc.
    Inventor: Ozlem Kalinli-Akbacak
  • Publication number: 20140112556
    Abstract: Features, including one or more acoustic features, visual features, linguistic features, and physical features, may be extracted from signals obtained by one or more sensors with a processor. The acoustic, visual, linguistic, and physical features may be analyzed with one or more machine learning algorithms, and an emotional state of a user may be extracted from analysis of the features.
    Type: Application
    Filed: October 19, 2012
    Publication date: April 24, 2014
    Applicant: Sony Computer Entertainment Inc.
    Inventor: Ozlem Kalinli-Akbacak
  • Patent number: 8676574
    Abstract: In a spoken language processing method for tone/intonation recognition, an auditory spectrum may be determined for an input window of sound and one or more multi-scale features may be extracted from the auditory spectrum. Each multi-scale feature can be extracted using a separate two-dimensional spectro-temporal receptive filter. One or more feature maps corresponding to the one or more multi-scale features can be generated and an auditory gist vector can be extracted from each of the one or more feature maps. A cumulative gist vector may be obtained through augmentation of each auditory gist vector extracted from the one or more feature maps. One or more tonal characteristics corresponding to the input window of sound can be determined by mapping the cumulative gist vector to one or more tonal characteristics using a machine learning algorithm.
    Type: Grant
    Filed: November 10, 2010
    Date of Patent: March 18, 2014
    Assignee: Sony Computer Entertainment Inc.
    Inventor: Ozlem Kalinli
  • Patent number: 8632182
    Abstract: Methods of eye gaze tracking are provided using magnetized contact lenses tracked by magnetic sensors and/or reflective contact lenses tracked by video-based sensors. Tracking information of contact lenses from magnetic sensors and video-based sensors may be used to improve eye tracking and/or combined with other sensor data to improve accuracy. Furthermore, reflective contact lenses improve blink detection, while eye gaze tracking is otherwise unimpeded by magnetized contact lenses. Additionally, contact lenses may be adapted for viewing 3D information.
    Type: Grant
    Filed: May 5, 2011
    Date of Patent: January 21, 2014
    Assignee: Sony Computer Entertainment Inc.
    Inventors: Ruxin Chen, Ozlem Kalinli