Patents by Inventor Albert Joseph Kishan Thambiratnam

Albert Joseph Kishan Thambiratnam has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20220207076
    Abstract: The present disclosure provides a method and apparatus for generative image acquisition. A query can be received. A first set of images retrieved according to the query can be obtained. It can be determined that the first set of images includes a first image that partially satisfies the query. A missing component of the first image compared to the query can be detected. A second set of images based on the first image and the missing component can be generated.
    Type: Application
    Filed: March 17, 2020
    Publication date: June 30, 2022
    Inventors: Dehua Cui, Albert Joseph Kishan Thambiratnam, Lin Su
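A minimal sketch of the partial-match step this abstract describes, assuming each image is represented by a set of descriptive tags (a hypothetical data model; the patent does not specify a representation):

```python
# Find an image that satisfies a query only partially, and report which
# query components it is missing (tag-set representation is assumed).

def find_partial_match(query_terms, images):
    """Return (image_name, missing_terms) for an image that partially
    satisfies the query, or None if no partial match exists."""
    for name, tags in images.items():
        matched = query_terms & tags
        missing = query_terms - tags
        if matched and missing:          # partial, not full, satisfaction
            return name, missing
    return None

query = {"dog", "beach", "sunset"}
images = {
    "img1": {"dog", "park"},             # matches "dog"; missing beach, sunset
    "img2": {"cat", "sofa"},             # matches nothing
}
result = find_partial_match(query, images)
```

The detected missing components could then seed the generation of the second image set.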
  • Patent number: 9483557
    Abstract: In various embodiments, a transcript that represents a media file is created. Keyword candidates that may represent topics and/or content associated with the media content are then extracted from the transcript. Furthermore, a keyword set may be generated for the media content utilizing a mutual information criterion. In other embodiments, one or more queries may be generated based at least in part on the transcript, and a plurality of web documents may be retrieved based at least in part on the one or more queries. Additional keyword candidates may be extracted from each web document and then ranked. A subset of the keyword candidates may then be selected to form a keyword set associated with the media content.
    Type: Grant
    Filed: March 4, 2011
    Date of Patent: November 1, 2016
    Assignee: Microsoft Technology Licensing LLC
    Inventors: Albert Joseph Kishan Thambiratnam, Sha Meng, Gang Li, Frank Torsten Bernd Seide
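A toy sketch of ranking keyword candidates with a mutual-information-style criterion: candidates that occur far more often in the transcript than in a background corpus score higher. This uses pointwise mutual information weighted by document frequency, a simplification; the patent's exact criterion is not specified here.

```python
import math
from collections import Counter

def pmi_scores(transcript_tokens, background_tokens):
    """Score each transcript word by p_doc * log(p_doc / p_background),
    with add-one smoothing on the background estimate."""
    t = Counter(transcript_tokens)
    b = Counter(background_tokens)
    n_t, n_b = len(transcript_tokens), len(background_tokens)
    scores = {}
    for word, count in t.items():
        p_doc = count / n_t
        p_bg = (b.get(word, 0) + 1) / (n_b + len(b))   # smoothed background prob
        scores[word] = p_doc * math.log(p_doc / p_bg)
    return scores

transcript = "speech recognition model speech lattice".split()
background = "the cat sat on the mat the model".split()
scores = pmi_scores(transcript, background)
```

Here "speech" outranks "model" because "model" also appears in the background corpus, so it is less informative about this particular media file.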
  • Patent number: 8825481
    Abstract: Techniques are described for training a speech recognition model for accented speech. A subword parse table is employed that models mispronunciations at multiple subword levels, such as the syllable, position-specific cluster, and/or phone levels. Mispronunciation probability data is then generated at each level based on inputted training data, such as phone-level annotated transcripts of accented speech. Data from different levels of the subword parse table may then be combined to determine the accented speech model. Mispronunciation probability data at each subword level is based at least in part on context at that level. In some embodiments, phone-level annotated transcripts are generated using a semi-supervised method.
    Type: Grant
    Filed: January 20, 2012
    Date of Patent: September 2, 2014
    Assignee: Microsoft Corporation
    Inventors: Albert Joseph Kishan Thambiratnam, Timo Pascal Mertens, Frank Torsten Bernd Seide
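A hedged sketch of the kind of statistic the subword parse table above would hold at the phone level: conditional mispronunciation probabilities P(observed | canonical) estimated by counting over phone-level annotated transcript pairs (data and function names are illustrative, not from the patent).

```python
from collections import Counter, defaultdict

def mispron_probs(aligned_pairs):
    """aligned_pairs: (canonical_phone, observed_phone) tuples.
    Returns {canonical: {observed: probability}}."""
    counts = defaultdict(Counter)
    for canonical, observed in aligned_pairs:
        counts[canonical][observed] += 1
    return {
        c: {o: n / sum(obs.values()) for o, n in obs.items()}
        for c, obs in counts.items()
    }

# Accented speakers often realize /th/ as /t/ or /d/:
pairs = [("th", "th"), ("th", "t"), ("th", "t"), ("th", "d"), ("s", "s")]
probs = mispron_probs(pairs)
```

The patent's model additionally conditions on context and combines estimates across the syllable, cluster, and phone levels, which this sketch omits.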
  • Patent number: 8650029
    Abstract: A voice activity detection (VAD) module analyzes a media file, such as an audio file or a video file, to determine whether one or more frames of the media file include speech. A speech recognizer generates feedback relating to an accuracy of the VAD determination. The VAD module leverages the feedback to improve subsequent VAD determinations. The VAD module also utilizes a look-ahead window associated with the media file to adjust estimated probabilities or VAD decisions for previously processed frames.
    Type: Grant
    Filed: February 25, 2011
    Date of Patent: February 11, 2014
    Assignee: Microsoft Corporation
    Inventors: Albert Joseph Kishan Thambiratnam, Weiwu Zhu, Frank Torsten Bernd Seide
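An illustrative sketch of the look-ahead idea above: per-frame speech probabilities are smoothed with a window that includes future frames, so the decision for an earlier frame can be revised once later evidence arrives (window size and threshold are assumptions, not taken from the patent).

```python
def smooth_vad(frame_probs, lookahead=2, threshold=0.5):
    """Decide speech/non-speech per frame, averaging each frame's
    probability with up to `lookahead` future frames."""
    decisions = []
    for i in range(len(frame_probs)):
        window = frame_probs[i:i + lookahead + 1]   # current + future frames
        avg = sum(window) / len(window)
        decisions.append(avg >= threshold)
    return decisions

# Frame 1 would be rejected on its own (0.3), but look-ahead rescues it:
probs = [0.9, 0.3, 0.9, 0.8, 0.1]
decisions = smooth_vad(probs)
```

The recognizer-feedback loop the abstract describes would adjust the underlying probabilities themselves; this sketch shows only the look-ahead smoothing.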
  • Publication number: 20130191126
    Abstract: Techniques are described for training a speech recognition model for accented speech. A subword parse table is employed that models mispronunciations at multiple subword levels, such as the syllable, position-specific cluster, and/or phone levels. Mispronunciation probability data is then generated at each level based on inputted training data, such as phone-level annotated transcripts of accented speech. Data from different levels of the subword parse table may then be combined to determine the accented speech model. Mispronunciation probability data at each subword level is based at least in part on context at that level. In some embodiments, phone-level annotated transcripts are generated using a semi-supervised method.
    Type: Application
    Filed: January 20, 2012
    Publication date: July 25, 2013
    Applicant: Microsoft Corporation
    Inventors: Albert Joseph Kishan Thambiratnam, Timo Pascal Mertens, Frank Torsten Bernd Seide
  • Publication number: 20120226696
    Abstract: In various embodiments, a transcript that represents a media file is created. Keyword candidates that may represent topics and/or content associated with the media content are then extracted from the transcript. Furthermore, a keyword set may be generated for the media content utilizing a mutual information criterion. In other embodiments, one or more queries may be generated based at least in part on the transcript, and a plurality of web documents may be retrieved based at least in part on the one or more queries. Additional keyword candidates may be extracted from each web document and then ranked. A subset of the keyword candidates may then be selected to form a keyword set associated with the media content.
    Type: Application
    Filed: March 4, 2011
    Publication date: September 6, 2012
    Applicant: Microsoft Corporation
    Inventors: Albert Joseph Kishan Thambiratnam, Sha Meng, Gang Li, Frank Torsten Bernd Seide
  • Publication number: 20120221330
    Abstract: A voice activity detection (VAD) module analyzes a media file, such as an audio file or a video file, to determine whether one or more frames of the media file include speech. A speech recognizer generates feedback relating to an accuracy of the VAD determination. The VAD module leverages the feedback to improve subsequent VAD determinations. The VAD module also utilizes a look-ahead window associated with the media file to adjust estimated probabilities or VAD decisions for previously processed frames.
    Type: Application
    Filed: February 25, 2011
    Publication date: August 30, 2012
    Applicant: Microsoft Corporation
    Inventors: Albert Joseph Kishan Thambiratnam, Weiwu Zhu, Frank Torsten Bernd Seide
  • Patent number: 8060494
    Abstract: A full-text lattice indexing and searching system and method for indexing word lattices using a text indexer to enable enhanced searching of audio content. The system and method utilize a Time-Anchored Lattice Expansion (TALE) method that represents word lattices such that they can be indexed with existing text indexers with little or no modification. Embodiments of the system and method include an indexing module for generating and indexing word lattices based on audio content and a searching module for allowing searching of a full-text index containing indexed word lattices. The indexing module includes a custom IFilter and a custom Wordbreaker. Embodiments of the searching module include an ExpandQuery function for decorating an input query and a custom Stemmer. Embodiments of the searching module also include a GenerateSnippets module that extracts information from the indexed word lattices to enable the creation of clickable snippets.
    Type: Grant
    Filed: December 7, 2007
    Date of Patent: November 15, 2011
    Assignee: Microsoft Corporation
    Inventors: Frank T. B. Seide, Peng Yu, Albert Joseph Kishan Thambiratnam
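A rough sketch of the time-anchoring idea behind TALE: each lattice arc (a word hypothesis with a start time) is emitted as a plain-text token whose document position is derived from its time anchor, so an unmodified text indexer sees competing hypotheses that overlap in time at the same position (the quantization step is an assumption; the patent's expansion is more elaborate).

```python
def tale_expand(arcs, frame=0.5):
    """arcs: list of (word, start_time_seconds).
    Returns a mapping from quantized token position to the word
    hypotheses anchored at that position."""
    positions = {}
    for word, start in arcs:
        pos = int(start / frame)            # time anchor -> token position
        positions.setdefault(pos, []).append(word)
    return positions

# Two competing hypotheses for the same audio span share a position:
lattice = [("meeting", 0.0), ("meaty", 0.1), ("tomorrow", 0.6)]
index_tokens = tale_expand(lattice)
```

A text indexer that treats same-position tokens as alternatives can then match a query against any hypothesis in the lattice, not just the 1-best transcript.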
  • Publication number: 20100268534
    Abstract: Described is a technology that provides highly accurate speech-recognized text transcripts of conversations, particularly telephone or meeting conversations. Speech is received for recognition at high quality and separately for each user, that is, independent of any transmission. Moreover, because the speech is received separately, a personalized recognition model adapted to each user's voice and vocabulary may be used. The separately recognized text is then merged into a transcript of the communication. The transcript may be labeled with the identity of each user that spoke the corresponding speech. The output of the transcript may be dynamic as the conversation takes place, or may occur later, such as contingent upon each user agreeing to release his or her text. The transcript may be incorporated into the text or data of another program, such as to insert it as a thread in a larger email conversation or the like.
    Type: Application
    Filed: April 17, 2009
    Publication date: October 21, 2010
    Applicant: Microsoft Corporation
    Inventors: Albert Joseph Kishan Thambiratnam, Frank Torsten Bernd Seide, Peng Yu, Roy Geoffrey Wallace
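A simple sketch of the merge step this abstract describes: each user's separately recognized utterances are interleaved by timestamp and labeled with the speaker's identity (the `(speaker, start_time, text)` shape is illustrative, not from the patent).

```python
def merge_transcripts(*speaker_streams):
    """Merge per-speaker utterance streams into one labeled transcript,
    ordered by each utterance's start time."""
    merged = [utt for stream in speaker_streams for utt in stream]
    merged.sort(key=lambda utt: utt[1])     # order by start time
    return [f"{speaker}: {text}" for speaker, start, text in merged]

alice = [("Alice", 0.0, "Hi Bob."), ("Alice", 7.2, "Sounds good.")]
bob = [("Bob", 2.5, "Hi, shall we meet tomorrow?")]
transcript = merge_transcripts(alice, bob)
```

Because each stream is recognized independently, the merge can also run later, e.g. only after every participant agrees to release their text.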
  • Publication number: 20090150337
    Abstract: A full-text lattice indexing and searching system and method for indexing word lattices using a text indexer to enable enhanced searching of audio content. The system and method utilize a Time-Anchored Lattice Expansion (TALE) method that represents word lattices such that they can be indexed with existing text indexers with little or no modification. Embodiments of the system and method include an indexing module for generating and indexing word lattices based on audio content and a searching module for allowing searching of a full-text index containing indexed word lattices. The indexing module includes a custom IFilter and a custom Wordbreaker. Embodiments of the searching module include an ExpandQuery function for decorating an input query and a custom Stemmer. Embodiments of the searching module also include a GenerateSnippets module that extracts information from the indexed word lattices to enable the creation of clickable snippets.
    Type: Application
    Filed: December 7, 2007
    Publication date: June 11, 2009
    Applicant: Microsoft Corporation
    Inventors: Frank T. B. Seide, Peng Yu, Albert Joseph Kishan Thambiratnam