Patents by Inventor Albert Joseph Kishan Thambiratnam
Albert Joseph Kishan Thambiratnam has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20220207076
Abstract: The present disclosure provides a method and apparatus for generative image acquisition. A query can be received. A first set of images retrieved according to the query can be obtained. It can be determined that the first set of images includes a first image that partially satisfies the query. A missing component of the first image compared to the query can be detected. A second set of images can be generated based on the first image and the missing component.
Type: Application
Filed: March 17, 2020
Publication date: June 30, 2022
Inventors: Dehua Cui, Albert Joseph Kishan Thambiratnam, Lin Su
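The retrieve-detect-generate flow in this abstract can be illustrated with a toy sketch. Image content is modeled as a set of tags and the generation step is a stand-in; the function and field names are assumptions for illustration, not from the patent.

```python
# Toy sketch of the retrieval-then-generation flow: retrieve images for a
# query, find the best partial match, detect what it is missing, and hand
# the base image plus the missing concepts to a (stubbed) generation step.

def missing_components(query_tags, image_tags):
    """Return the query concepts absent from a retrieved image."""
    return query_tags - image_tags

def acquire(query_tags, corpus):
    # Step 1: retrieve images ranked by tag overlap with the query.
    ranked = sorted(corpus, key=lambda img: len(query_tags & img["tags"]),
                    reverse=True)
    best = ranked[0]
    # Step 2: detect what the best partial match is missing.
    missing = missing_components(query_tags, best["tags"])
    if not missing:
        return [best]  # the query is already fully satisfied
    # Step 3: stand-in for generation -- a real system would synthesize
    # new images conditioned on the base image and the missing concepts.
    return [{"base": best["id"], "added": m} for m in sorted(missing)]

corpus = [{"id": 1, "tags": {"dog", "park"}},
          {"id": 2, "tags": {"dog", "beach"}}]
results = acquire({"dog", "park", "frisbee"}, corpus)
# results describes a new image: image 1 with "frisbee" added
```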
-
Patent number: 9483557
Abstract: In various embodiments, a transcript that represents a media file is created. Keyword candidates that may represent topics and/or content associated with the media content are then extracted from the transcript. Furthermore, a keyword set may be generated for the media content utilizing a mutual information criterion. In other embodiments, one or more queries may be generated based at least in part on the transcript, and a plurality of web documents may be retrieved based at least in part on the one or more queries. Additional keyword candidates may be extracted from each web document and then ranked. A subset of the keyword candidates may then be selected to form a keyword set associated with the media content.
Type: Grant
Filed: March 4, 2011
Date of Patent: November 1, 2016
Assignee: Microsoft Technology Licensing LLC
Inventors: Albert Joseph Kishan Thambiratnam, Sha Meng, Gang Li, Frank Torsten Bernd Seide
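A mutual-information criterion for keyword selection can be sketched with pointwise mutual information: candidates that are frequent in the transcript but rare in a background corpus score highest. This is a minimal illustration of the general idea; the patent's exact statistic, smoothing, and candidate extraction are not reproduced here.

```python
import math
from collections import Counter

def pmi_scores(doc_tokens, background_tokens):
    """PMI-style score of each transcript token against a background corpus."""
    doc, bg = Counter(doc_tokens), Counter(background_tokens)
    n_doc, n_bg = len(doc_tokens), len(background_tokens)
    scores = {}
    for word, count in doc.items():
        p_doc = count / n_doc
        p_bg = (bg[word] + 1) / (n_bg + len(bg))  # add-one smoothing
        scores[word] = math.log(p_doc / p_bg)
    return scores

transcript = "speech lattice search lattice index speech".split()
background = "the a of speech and to index the a of".split()
scores = pmi_scores(transcript, background)
keywords = sorted(scores, key=scores.get, reverse=True)[:2]
# "lattice" ranks first: common in the transcript, absent from the background
```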
-
Patent number: 8825481
Abstract: Techniques are described for training a speech recognition model for accented speech. A subword parse table is employed that models mispronunciations at multiple subword levels, such as the syllable, position-specific cluster, and/or phone levels. Mispronunciation probability data is then generated at each level based on inputted training data, such as phone-level annotated transcripts of accented speech. Data from different levels of the subword parse table may then be combined to determine the accented speech model. Mispronunciation probability data at each subword level is based at least in part on context at that level. In some embodiments, phone-level annotated transcripts are generated using a semi-supervised method.
Type: Grant
Filed: January 20, 2012
Date of Patent: September 2, 2014
Assignee: Microsoft Corporation
Inventors: Albert Joseph Kishan Thambiratnam, Timo Pascal Mertens, Frank Torsten Bernd Seide
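Combining mispronunciation probability data from multiple subword levels can be illustrated as a simple linear interpolation of phone-level and syllable-level substitution probabilities. The tables, weights, and function names below are made up for illustration; the patent's parse-table construction and combination rule are not specified here.

```python
# Illustrative substitution probabilities P(heard | intended) at two
# subword levels. A real system would estimate these from phone-level
# annotated transcripts of accented speech.
PHONE_SUB = {("t", "d"): 0.30, ("th", "t"): 0.25}   # phone level
SYLLABLE_SUB = {("ter", "der"): 0.40}               # syllable level

def mispron_prob(intended, heard, level_weights=(0.6, 0.4)):
    """Interpolate phone- and syllable-level mispronunciation probabilities."""
    w_phone, w_syl = level_weights
    p_phone = PHONE_SUB.get((intended["phone"], heard["phone"]), 0.0)
    p_syl = SYLLABLE_SUB.get((intended["syllable"], heard["syllable"]), 0.0)
    return w_phone * p_phone + w_syl * p_syl

p = mispron_prob({"phone": "t", "syllable": "ter"},
                 {"phone": "d", "syllable": "der"})
# 0.6 * 0.30 + 0.4 * 0.40 = 0.34
```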
-
Patent number: 8650029
Abstract: A voice activity detection (VAD) module analyzes a media file, such as an audio file or a video file, to determine whether one or more frames of the media file include speech. A speech recognizer generates feedback relating to an accuracy of the VAD determination. The VAD module leverages the feedback to improve subsequent VAD determinations. The VAD module also utilizes a look-ahead window associated with the media file to adjust estimated probabilities or VAD decisions for previously processed frames.
Type: Grant
Filed: February 25, 2011
Date of Patent: February 11, 2014
Assignee: Microsoft Corporation
Inventors: Albert Joseph Kishan Thambiratnam, Weiwu Zhu, Frank Torsten Bernd Seide
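The look-ahead idea can be sketched as revising each frame's speech probability using a small window of future frames, so an isolated low-probability frame inside a speech region is not misclassified. The windowed-mean smoothing rule and threshold below are illustrative assumptions, not the patent's method.

```python
def smooth_with_lookahead(frame_probs, lookahead=2):
    """Average each frame's speech probability with up to `lookahead` future frames."""
    smoothed = []
    for i in range(len(frame_probs)):
        window = frame_probs[i:i + lookahead + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed

def vad_decisions(frame_probs, threshold=0.5, lookahead=2):
    """Per-frame speech/non-speech decisions after look-ahead smoothing."""
    return [p >= threshold for p in smooth_with_lookahead(frame_probs, lookahead)]

# Frame 1 alone would be classified as non-speech, but the look-ahead
# window pulls it back up using the strong frames that follow it.
decisions = vad_decisions([0.9, 0.2, 0.9, 0.9])
```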
-
Publication number: 20130191126
Abstract: Techniques are described for training a speech recognition model for accented speech. A subword parse table is employed that models mispronunciations at multiple subword levels, such as the syllable, position-specific cluster, and/or phone levels. Mispronunciation probability data is then generated at each level based on inputted training data, such as phone-level annotated transcripts of accented speech. Data from different levels of the subword parse table may then be combined to determine the accented speech model. Mispronunciation probability data at each subword level is based at least in part on context at that level. In some embodiments, phone-level annotated transcripts are generated using a semi-supervised method.
Type: Application
Filed: January 20, 2012
Publication date: July 25, 2013
Applicant: Microsoft Corporation
Inventors: Albert Joseph Kishan Thambiratnam, Timo Pascal Mertens, Frank Torsten Bernd Seide
-
Publication number: 20120226696
Abstract: In various embodiments, a transcript that represents a media file is created. Keyword candidates that may represent topics and/or content associated with the media content are then extracted from the transcript. Furthermore, a keyword set may be generated for the media content utilizing a mutual information criterion. In other embodiments, one or more queries may be generated based at least in part on the transcript, and a plurality of web documents may be retrieved based at least in part on the one or more queries. Additional keyword candidates may be extracted from each web document and then ranked. A subset of the keyword candidates may then be selected to form a keyword set associated with the media content.
Type: Application
Filed: March 4, 2011
Publication date: September 6, 2012
Applicant: Microsoft Corporation
Inventors: Albert Joseph Kishan Thambiratnam, Sha Meng, Gang Li, Frank Torsten Bernd Seide
-
Publication number: 20120221330
Abstract: A voice activity detection (VAD) module analyzes a media file, such as an audio file or a video file, to determine whether one or more frames of the media file include speech. A speech recognizer generates feedback relating to an accuracy of the VAD determination. The VAD module leverages the feedback to improve subsequent VAD determinations. The VAD module also utilizes a look-ahead window associated with the media file to adjust estimated probabilities or VAD decisions for previously processed frames.
Type: Application
Filed: February 25, 2011
Publication date: August 30, 2012
Applicant: Microsoft Corporation
Inventors: Albert Joseph Kishan Thambiratnam, Weiwu Zhu, Frank Torsten Bernd Seide
-
Patent number: 8060494
Abstract: A full-text lattice indexing and searching system and method for indexing word lattices using a text indexer to enable enhanced searching of audio content. The system and method utilize a Time-Anchored Lattice Expansion (TALE) method that represents word lattices such that they can be indexed with existing text indexers with little or no modification. Embodiments of the system and method include an indexing module for generating and indexing word lattices based on audio content and a searching module for allowing searching of a full-text index containing indexed word lattices. The indexing module includes a custom IFilter and a custom Wordbreaker. Embodiments of the searching module include an ExpandQuery function for decorating an input query and a custom Stemmer. Embodiments of the searching module also include a GenerateSnippets module that extracts information from the indexed word lattices to enable the creation of clickable snippets.
Type: Grant
Filed: December 7, 2007
Date of Patent: November 15, 2011
Assignee: Microsoft Corporation
Inventors: Frank T. B. Seide, Peng Yu, Albert Joseph Kishan Thambiratnam
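The core of time-anchored lattice expansion can be sketched as flattening a word lattice into one indexable token per edge, each anchored at its start time, so an ordinary full-text indexer can store competing hypotheses and a query hit can be located in the audio. The edge tuple format, field names, and posterior threshold below are illustrative assumptions.

```python
# A word lattice as a list of edges: (word, start_time, end_time, posterior).
lattice = [
    ("play", 0.0, 0.4, 0.9),
    ("pray", 0.0, 0.4, 0.1),   # competing hypothesis over the same span
    ("ball", 0.4, 0.8, 0.8),
]

def expand_for_indexing(lattice, min_posterior=0.05):
    """Emit one indexable token per sufficiently likely lattice edge,
    anchored at its start time so search hits map back to the audio."""
    return [
        {"term": word, "anchor": start, "score": posterior}
        for word, start, end, posterior in lattice
        if posterior >= min_posterior
    ]

tokens = expand_for_indexing(lattice)
# Both "play" and "pray" are indexed at anchor 0.0, so a query for either
# term leads to the same position in the audio.
```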
-
Publication number: 20100268534
Abstract: Described is a technology that provides highly accurate speech-recognized text transcripts of conversations, particularly telephone or meeting conversations. Speech is received for recognition at high quality and separately for each user, that is, independently of any transmission. Moreover, because the speech is received separately, a personalized recognition model adapted to each user's voice and vocabulary may be used. The separately recognized text is then merged into a transcript of the communication. The transcript may be labeled with the identity of each user who spoke the corresponding speech. The transcript may be output dynamically as the conversation takes place, or later, such as contingent upon each user agreeing to release his or her text. The transcript may be incorporated into the text or data of another program, such as to insert it as a thread in a larger email conversation or the like.
Type: Application
Filed: April 17, 2009
Publication date: October 21, 2010
Applicant: Microsoft Corporation
Inventors: Albert Joseph Kishan Thambiratnam, Frank Torsten Bernd Seide, Peng Yu, Roy Geoffrey Wallace
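Merging separately recognized speech into one labeled transcript can be sketched as interleaving each speaker's timestamped segments by start time. The segment format and function name are assumptions for illustration, not taken from the patent.

```python
def merge_transcripts(per_user_segments):
    """per_user_segments: {speaker: [(start_time, text), ...]}.
    Returns speaker-labeled lines ordered by when each segment began."""
    merged = []
    for speaker, segments in per_user_segments.items():
        for start, text in segments:
            merged.append((start, speaker, text))
    merged.sort()  # chronological order across all speakers
    return [f"{speaker}: {text}" for _, speaker, text in merged]

lines = merge_transcripts({
    "Alice": [(0.0, "Hi Bob."), (5.2, "Sure, Tuesday works.")],
    "Bob":   [(2.1, "Can we meet next week?")],
})
# The two separately recognized streams interleave into one conversation.
```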
-
Publication number: 20090150337
Abstract: A full-text lattice indexing and searching system and method for indexing word lattices using a text indexer to enable enhanced searching of audio content. The system and method utilize a Time-Anchored Lattice Expansion (TALE) method that represents word lattices such that they can be indexed with existing text indexers with little or no modification. Embodiments of the system and method include an indexing module for generating and indexing word lattices based on audio content and a searching module for allowing searching of a full-text index containing indexed word lattices. The indexing module includes a custom IFilter and a custom Wordbreaker. Embodiments of the searching module include an ExpandQuery function for decorating an input query and a custom Stemmer. Embodiments of the searching module also include a GenerateSnippets module that extracts information from the indexed word lattices to enable the creation of clickable snippets.
Type: Application
Filed: December 7, 2007
Publication date: June 11, 2009
Applicant: Microsoft Corporation
Inventors: Frank T. B. Seide, Peng Yu, Albert Joseph Kishan Thambiratnam