Patents by Inventor Albert Joseph Kishan Thambiratnam
Albert Joseph Kishan Thambiratnam has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20220207076
Abstract: The present disclosure provides a method and apparatus for generative image acquisition. A query can be received. A first set of images retrieved according to the query can be obtained. It can be determined that the first set of images includes a first image that partially satisfies the query. A missing component of the first image compared to the query can be detected. A second set of images can be generated based on the first image and the missing component.
Type: Application
Filed: March 17, 2020
Publication date: June 30, 2022
Inventors: Dehua Cui, Albert Joseph Kishan Thambiratnam, Lin Su
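The retrieve-detect-generate flow in this abstract can be illustrated with a toy sketch. Image content is modeled as a set of tags and the generation step is a stand-in; the function and field names are assumptions for illustration, not from the patent.

```python
# Toy sketch of the retrieval-then-generation flow: retrieve images for a
# query, find the best partial match, detect what it is missing, and hand
# the base image plus the missing concepts to a (stubbed) generation step.

def missing_components(query_tags, image_tags):
    """Return the query concepts absent from a retrieved image."""
    return query_tags - image_tags

def acquire(query_tags, corpus):
    # Step 1: retrieve images ranked by tag overlap with the query.
    ranked = sorted(corpus, key=lambda img: len(query_tags & img["tags"]),
                    reverse=True)
    best = ranked[0]
    # Step 2: detect what the best partial match is missing.
    missing = missing_components(query_tags, best["tags"])
    if not missing:
        return [best]  # the query is already fully satisfied
    # Step 3: stand-in for generation -- a real system would synthesize
    # new images conditioned on the base image and the missing concepts.
    return [{"base": best["id"], "added": m} for m in sorted(missing)]

corpus = [{"id": 1, "tags": {"dog", "park"}},
          {"id": 2, "tags": {"dog", "beach"}}]
results = acquire({"dog", "park", "frisbee"}, corpus)
# results describes a new image: image 1 with "frisbee" added
```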
-
Patent number: 9483557
Abstract: In various embodiments, a transcript that represents a media file is created. Keyword candidates that may represent topics and/or content associated with the media content are then extracted from the transcript. Furthermore, a keyword set may be generated for the media content utilizing a mutual information criterion. In other embodiments, one or more queries may be generated based at least in part on the transcript, and a plurality of web documents may be retrieved based at least in part on the one or more queries. Additional keyword candidates may be extracted from each web document and then ranked. A subset of the keyword candidates may then be selected to form a keyword set associated with the media content.
Type: Grant
Filed: March 4, 2011
Date of Patent: November 1, 2016
Assignee: Microsoft Technology Licensing LLC
Inventors: Albert Joseph Kishan Thambiratnam, Sha Meng, Gang Li, Frank Torsten Bernd Seide
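A mutual-information criterion for keyword selection can be sketched with pointwise mutual information: candidates that are frequent in the transcript but rare in a background corpus score highest. This is a minimal illustration of the general idea; the patent's exact statistic, smoothing, and candidate extraction are not reproduced here.

```python
import math
from collections import Counter

def pmi_scores(doc_tokens, background_tokens):
    """PMI-style score of each transcript token against a background corpus."""
    doc, bg = Counter(doc_tokens), Counter(background_tokens)
    n_doc, n_bg = len(doc_tokens), len(background_tokens)
    scores = {}
    for word, count in doc.items():
        p_doc = count / n_doc
        p_bg = (bg[word] + 1) / (n_bg + len(bg))  # add-one smoothing
        scores[word] = math.log(p_doc / p_bg)
    return scores

transcript = "speech lattice search lattice index speech".split()
background = "the a of speech and to index the a of".split()
scores = pmi_scores(transcript, background)
keywords = sorted(scores, key=scores.get, reverse=True)[:2]
# "lattice" ranks first: common in the transcript, absent from the background
```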
-
Patent number: 8825481
Abstract: Techniques are described for training a speech recognition model for accented speech. A subword parse table is employed that models mispronunciations at multiple subword levels, such as the syllable, position-specific cluster, and/or phone levels. Mispronunciation probability data is then generated at each level based on inputted training data, such as phone-level annotated transcripts of accented speech. Data from different levels of the subword parse table may then be combined to determine the accented speech model. Mispronunciation probability data at each subword level is based at least in part on context at that level. In some embodiments, phone-level annotated transcripts are generated using a semi-supervised method.
Type: Grant
Filed: January 20, 2012
Date of Patent: September 2, 2014
Assignee: Microsoft Corporation
Inventors: Albert Joseph Kishan Thambiratnam, Timo Pascal Mertens, Frank Torsten Bernd Seide
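Combining mispronunciation probability data from multiple subword levels can be illustrated as a simple linear interpolation of phone-level and syllable-level substitution probabilities. The tables, weights, and function names below are made up for illustration; the patent's parse-table construction and combination rule are not specified here.

```python
# Illustrative substitution probabilities P(heard | intended) at two
# subword levels. A real system would estimate these from phone-level
# annotated transcripts of accented speech.
PHONE_SUB = {("t", "d"): 0.30, ("th", "t"): 0.25}   # phone level
SYLLABLE_SUB = {("ter", "der"): 0.40}               # syllable level

def mispron_prob(intended, heard, level_weights=(0.6, 0.4)):
    """Interpolate phone- and syllable-level mispronunciation probabilities."""
    w_phone, w_syl = level_weights
    p_phone = PHONE_SUB.get((intended["phone"], heard["phone"]), 0.0)
    p_syl = SYLLABLE_SUB.get((intended["syllable"], heard["syllable"]), 0.0)
    return w_phone * p_phone + w_syl * p_syl

p = mispron_prob({"phone": "t", "syllable": "ter"},
                 {"phone": "d", "syllable": "der"})
# 0.6 * 0.30 + 0.4 * 0.40 = 0.34
```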
-
Patent number: 8650029
Abstract: A voice activity detection (VAD) module analyzes a media file, such as an audio file or a video file, to determine whether one or more frames of the media file include speech. A speech recognizer generates feedback relating to an accuracy of the VAD determination. The VAD module leverages the feedback to improve subsequent VAD determinations. The VAD module also utilizes a look-ahead window associated with the media file to adjust estimated probabilities or VAD decisions for previously processed frames.
Type: Grant
Filed: February 25, 2011
Date of Patent: February 11, 2014
Assignee: Microsoft Corporation
Inventors: Albert Joseph Kishan Thambiratnam, Weiwu Zhu, Frank Torsten Bernd Seide
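The look-ahead idea can be sketched as revising each frame's speech probability using a small window of future frames, so an isolated low-probability frame inside a speech region is not misclassified. The windowed-mean smoothing rule and threshold below are illustrative assumptions, not the patent's method.

```python
def smooth_with_lookahead(frame_probs, lookahead=2):
    """Average each frame's speech probability with up to `lookahead` future frames."""
    smoothed = []
    for i in range(len(frame_probs)):
        window = frame_probs[i:i + lookahead + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed

def vad_decisions(frame_probs, threshold=0.5, lookahead=2):
    """Per-frame speech/non-speech decisions after look-ahead smoothing."""
    return [p >= threshold for p in smooth_with_lookahead(frame_probs, lookahead)]

# Frame 1 alone would be classified as non-speech, but the look-ahead
# window pulls it back up using the strong frames that follow it.
decisions = vad_decisions([0.9, 0.2, 0.9, 0.9])
```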
-
Publication number: 20130191126
Abstract: Techniques are described for training a speech recognition model for accented speech. A subword parse table is employed that models mispronunciations at multiple subword levels, such as the syllable, position-specific cluster, and/or phone levels. Mispronunciation probability data is then generated at each level based on inputted training data, such as phone-level annotated transcripts of accented speech. Data from different levels of the subword parse table may then be combined to determine the accented speech model. Mispronunciation probability data at each subword level is based at least in part on context at that level. In some embodiments, phone-level annotated transcripts are generated using a semi-supervised method.
Type: Application
Filed: January 20, 2012
Publication date: July 25, 2013
Applicant: Microsoft Corporation
Inventors: Albert Joseph Kishan Thambiratnam, Timo Pascal Mertens, Frank Torsten Bernd Seide
-
Publication number: 20120226696
Abstract: In various embodiments, a transcript that represents a media file is created. Keyword candidates that may represent topics and/or content associated with the media content are then extracted from the transcript. Furthermore, a keyword set may be generated for the media content utilizing a mutual information criterion. In other embodiments, one or more queries may be generated based at least in part on the transcript, and a plurality of web documents may be retrieved based at least in part on the one or more queries. Additional keyword candidates may be extracted from each web document and then ranked. A subset of the keyword candidates may then be selected to form a keyword set associated with the media content.
Type: Application
Filed: March 4, 2011
Publication date: September 6, 2012
Applicant: Microsoft Corporation
Inventors: Albert Joseph Kishan Thambiratnam, Sha Meng, Gang Li, Frank Torsten Bernd Seide
-
Publication number: 20120221330
Abstract: A voice activity detection (VAD) module analyzes a media file, such as an audio file or a video file, to determine whether one or more frames of the media file include speech. A speech recognizer generates feedback relating to an accuracy of the VAD determination. The VAD module leverages the feedback to improve subsequent VAD determinations. The VAD module also utilizes a look-ahead window associated with the media file to adjust estimated probabilities or VAD decisions for previously processed frames.
Type: Application
Filed: February 25, 2011
Publication date: August 30, 2012
Applicant: Microsoft Corporation
Inventors: Albert Joseph Kishan Thambiratnam, Weiwu Zhu, Frank Torsten Bernd Seide
-
Patent number: 8060494
Abstract: A full-text lattice indexing and searching system and method for indexing word lattices using a text indexer to enable enhanced searching of audio content. The system and method utilize a Time-Anchored Lattice Expansion (TALE) method that represents word lattices such that they can be indexed with existing text indexers with little or no modification. Embodiments of the system and method include an indexing module for generating and indexing word lattices based on audio content and a searching module for allowing searching of a full-text index containing indexed word lattices. The indexing module includes a custom IFilter and a custom Wordbreaker. Embodiments of the searching module include an ExpandQuery function for decorating an input query and a custom Stemmer. Embodiments of the searching module also include a GenerateSnippets module that extracts information from the indexed word lattices to enable the creation of clickable snippets.
Type: Grant
Filed: December 7, 2007
Date of Patent: November 15, 2011
Assignee: Microsoft Corporation
Inventors: Frank T. B. Seide, Peng Yu, Albert Joseph Kishan Thambiratnam
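The core of time-anchored lattice expansion can be sketched as flattening a word lattice into one indexable token per edge, each anchored at its start time, so an ordinary full-text indexer can store competing hypotheses and a query hit can be located in the audio. The edge tuple format, field names, and posterior threshold below are illustrative assumptions.

```python
# A word lattice as a list of edges: (word, start_time, end_time, posterior).
lattice = [
    ("play", 0.0, 0.4, 0.9),
    ("pray", 0.0, 0.4, 0.1),   # competing hypothesis over the same span
    ("ball", 0.4, 0.8, 0.8),
]

def expand_for_indexing(lattice, min_posterior=0.05):
    """Emit one indexable token per sufficiently likely lattice edge,
    anchored at its start time so search hits map back to the audio."""
    return [
        {"term": word, "anchor": start, "score": posterior}
        for word, start, end, posterior in lattice
        if posterior >= min_posterior
    ]

tokens = expand_for_indexing(lattice)
# Both "play" and "pray" are indexed at anchor 0.0, so a query for either
# term leads to the same position in the audio.
```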
-
Publication number: 20100268534
Abstract: Described is a technology that provides highly accurate speech-recognized text transcripts of conversations, particularly telephone or meeting conversations. Speech is received for recognition at high quality and separately for each user, that is, independently of any transmission. Moreover, because the speech is received separately, a personalized recognition model adapted to each user's voice and vocabulary may be used. The separately recognized text is then merged into a transcript of the communication. The transcript may be labeled with the identity of each user who spoke the corresponding speech. The transcript may be output dynamically as the conversation takes place, or later, such as contingent upon each user agreeing to release his or her text. The transcript may be incorporated into the text or data of another program, such as to insert it as a thread in a larger email conversation or the like.
Type: Application
Filed: April 17, 2009
Publication date: October 21, 2010
Applicant: Microsoft Corporation
Inventors: Albert Joseph Kishan Thambiratnam, Frank Torsten Bernd Seide, Peng Yu, Roy Geoffrey Wallace
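Merging separately recognized speech into one labeled transcript can be sketched as interleaving each speaker's timestamped segments by start time. The segment format and function name are assumptions for illustration, not taken from the patent.

```python
def merge_transcripts(per_user_segments):
    """per_user_segments: {speaker: [(start_time, text), ...]}.
    Returns speaker-labeled lines ordered by when each segment began."""
    merged = []
    for speaker, segments in per_user_segments.items():
        for start, text in segments:
            merged.append((start, speaker, text))
    merged.sort()  # chronological order across all speakers
    return [f"{speaker}: {text}" for _, speaker, text in merged]

lines = merge_transcripts({
    "Alice": [(0.0, "Hi Bob."), (5.2, "Sure, Tuesday works.")],
    "Bob":   [(2.1, "Can we meet next week?")],
})
# The two separately recognized streams interleave into one conversation.
```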
-
Publication number: 20090150337
Abstract: A full-text lattice indexing and searching system and method for indexing word lattices using a text indexer to enable enhanced searching of audio content. The system and method utilize a Time-Anchored Lattice Expansion (TALE) method that represents word lattices such that they can be indexed with existing text indexers with little or no modification. Embodiments of the system and method include an indexing module for generating and indexing word lattices based on audio content and a searching module for allowing searching of a full-text index containing indexed word lattices. The indexing module includes a custom IFilter and a custom Wordbreaker. Embodiments of the searching module include an ExpandQuery function for decorating an input query and a custom Stemmer. Embodiments of the searching module also include a GenerateSnippets module that extracts information from the indexed word lattices to enable the creation of clickable snippets.
Type: Application
Filed: December 7, 2007
Publication date: June 11, 2009
Applicant: Microsoft Corporation
Inventors: Frank T. B. Seide, Peng Yu, Albert Joseph Kishan Thambiratnam