Patents by Inventor Frank Seide

Frank Seide has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9699404
    Abstract: Aligning a closed caption track to a media content item includes calculating the offset and the drift between the closed caption track and the media content item. The closed caption track is aligned to the media content item as a function of the calculated offset and drift.
    Type: Grant
    Filed: March 19, 2014
    Date of Patent: July 4, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Dennis Cronin, Frank Seide, Ian Kennedy
  • Patent number: 9292787
    Abstract: A deep tensor neural network (DTNN) is described herein, wherein the DTNN is suitable for employment in a computer-implemented recognition/classification system. Hidden layers in the DTNN comprise at least one projection layer, which includes a first subspace of hidden units and a second subspace of hidden units. The first subspace of hidden units receives a first nonlinear projection of input data to a projection layer and generates the first set of output data based at least in part thereon, and the second subspace of hidden units receives a second nonlinear projection of the input data to the projection layer and generates the second set of output data based at least in part thereon. A tensor layer, which can be converted into a conventional layer of a DNN, generates the third set of output data based upon the first set of output data and the second set of output data.
    Type: Grant
    Filed: August 29, 2012
    Date of Patent: March 22, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Dong Yu, Li Deng, Frank Seide
  • Patent number: 9177550
    Abstract: Various technologies described herein pertain to conservatively adapting a deep neural network (DNN) in a recognition system for a particular user or context. A DNN is employed to output a probability distribution over models of context-dependent units responsive to receipt of captured user input. The DNN is adapted for a particular user based upon the captured user input, wherein the adaption is undertaken conservatively such that a deviation between outputs of the adapted DNN and the unadapted DNN is constrained.
    Type: Grant
    Filed: March 6, 2013
    Date of Patent: November 3, 2015
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Dong Yu, Kaisheng Yao, Hang Su, Gang Li, Frank Seide
  • Publication number: 20150271442
    Abstract: Aligning a closed caption track to a media content item includes calculating the offset and the drift between the closed caption track and the media content item. The closed caption track is aligned to the media content item as a function of the calculated offset and drift.
    Type: Application
    Filed: March 19, 2014
    Publication date: September 24, 2015
    Applicant: Microsoft Corporation
    Inventors: Dennis Cronin, Frank Seide, Ian Kennedy
  • Publication number: 20140257803
    Abstract: Various technologies described herein pertain to conservatively adapting a deep neural network (DNN) in a recognition system for a particular user or context. A DNN is employed to output a probability distribution over models of context-dependent units responsive to receipt of captured user input. The DNN is adapted for a particular user based upon the captured user input, wherein the adaption is undertaken conservatively such that a deviation between outputs of the adapted DNN and the unadapted DNN is constrained.
    Type: Application
    Filed: March 6, 2013
    Publication date: September 11, 2014
    Applicant: Microsoft Corporation
    Inventors: Dong Yu, Kaisheng Yao, Hang Su, Gang Li, Frank Seide
  • Publication number: 20140067735
    Abstract: A deep tensor neural network (DTNN) is described herein, wherein the DTNN is suitable for employment in a computer-implemented recognition/classification system. Hidden layers in the DTNN comprise at least one projection layer, which includes a first subspace of hidden units and a second subspace of hidden units. The first subspace of hidden units receives a first nonlinear projection of input data to a projection layer and generates the first set of output data based at least in part thereon, and the second subspace of hidden units receives a second nonlinear projection of the input data to the projection layer and generates the second set of output data based at least in part thereon. A tensor layer, which can be converted into a conventional layer of a DNN, generates the third set of output data based upon the first set of output data and the second set of output data.
    Type: Application
    Filed: August 29, 2012
    Publication date: March 6, 2014
    Applicant: Microsoft Corporation
    Inventors: Dong Yu, Li Deng, Frank Seide
  • Patent number: 7890849
    Abstract: The concurrent presentation technique provides information about content related to a source media currently being presented to a user in a fashion that allows the user to keep viewing the source media while either interactively or non-interactively perusing a list of related content. Thus, the user can see a list of related content without interrupting the presentation experience, and if desired, the user can choose to interact with the list to obtain further information about available related content.
    Type: Grant
    Filed: September 15, 2006
    Date of Patent: February 15, 2011
    Assignee: Microsoft Corporation
    Inventors: Neema Moraveji, Kishan Thambiratnam, Jun Liu, Roger Yu, Frank Seide
  • Publication number: 20090187588
    Abstract: Described herein is technology for, among other things, distributed indexing of file content. Content-based indexing of a file involves determining whether content-based index information for the file is available from an external source. This avoids repeating already-performed content analysis, which is time-consuming and computationally intensive, especially for non-text files. The content-based index information, if it is available, is received from the external source and may be stored. If the content-based index information is not available or is not complete, content-based index information for the file is generated and stored. Moreover, the generated content-based index information is shared with the external source. Once content analysis of the file is performed to generate content-based index information for the file, the content-based index information is available and sharable as needed. There is no need to repeat the same content analysis on the file.
    Type: Application
    Filed: January 23, 2008
    Publication date: July 23, 2009
    Applicant: Microsoft Corporation
    Inventors: Albert J. K. Thambiratnam, Frank Seide
  • Publication number: 20080072132
    Abstract: The concurrent presentation technique provides information about content related to a source media currently being presented to a user in a fashion that allows the user to keep viewing the source media while either interactively or non-interactively perusing a list of related content. Thus, the user can see a list of related content without interrupting the presentation experience, and if desired, the user can choose to interact with the list to obtain further information about available related content.
    Type: Application
    Filed: September 15, 2006
    Publication date: March 20, 2008
    Applicant: Microsoft Corporation
    Inventors: Neema Moraveji, Kishan Thambiratnam, Jun Liu, Roger Peng Yu, Frank Seide
  • Publication number: 20070255565
    Abstract: Search results are provided in a format that allows users to efficiently determine whether audio or video documents identified from a search query actually contain the words in the query. This is achieved by returning snippets of text around query term matches and allowing the user to play a segment of the audio signal by selecting a word in the snippet. In other embodiments, markers are placed on a timeline that represents the duration of the audio signal. Each marker represents a query term match and when selected causes the audio signal to begin to play near the temporal location represented by the marker.
    Type: Application
    Filed: April 10, 2006
    Publication date: November 1, 2007
    Applicant: Microsoft Corporation
    Inventors: Roger Yu, Frank Seide, Kaijiang Chen
  • Publication number: 20070244902
    Abstract: The best features of both Internet video search and television-type viewing experience have been combined. A user may use a remote control to enter search terms on a television monitor. A search engine may then search for video files accessible on the Internet that correspond to the search terms. Indicators of relevant search results may then be shown on the television monitor, enabling the user to select one to play. This enables the user to search for and view Internet video content in a television-like experience.
    Type: Application
    Filed: April 17, 2006
    Publication date: October 18, 2007
    Applicant: Microsoft Corporation
    Inventors: Frank Seide, Lie Lu, Neema Moraveji, Roger Yu, Wei-Ying Ma
  • Publication number: 20070153989
    Abstract: Improved systems and methods are provided for transcribing audio files of voice mails sent over a unified messaging system. Customized grammars specific to a voice mail recipient are created and utilized to transcribe a received voice mail by comparing the audio file to commonly utilized words, names, acronyms, and phrases used by the recipient. Key elements are identified from the resulting text transcription to aid the recipient in processing received voice mails based on the significant content contained in the voice mail.
    Type: Application
    Filed: December 30, 2005
    Publication date: July 5, 2007
    Applicant: Microsoft Corporation
    Inventors: David Howell, Sridhar Sundararaman, David Fong, Frank Seide
  • Publication number: 20060212897
    Abstract: Systems and methods for analyzing the content of audio/video files using speech recognition and data mining technologies are provided. As it can generally be assumed that a user's interest is highly correlated with an audio/video clip or television program the user may be watching, methods and systems for utilizing the results of speech recognition and data mining technology implementation to retrieve relevant advertising content for display are also provided.
    Type: Application
    Filed: March 18, 2005
    Publication date: September 21, 2006
    Applicant: Microsoft Corporation
    Inventors: Ying Li, Li Li, Tarek Najm, Hongbin Gao, Benyu Zhang, Xianfang Wang, Frank Seide, Roger Yu, Hua-Jun Zeng, Jian-Lai Zhou, Zheng Chen
  • Publication number: 20060085191
    Abstract: A speech signal is decoded by determining a production-related value for a current state based on an optimal production-related value at the end of a preceding state, the optimal production-related value being selected from a set of continuous values. The production-related value is used to determine a likelihood of a phone being represented by a set of observation vectors that are aligned with a path between the preceding state and the current state. The likelihood of the phone is combined with a score from the preceding state to determine a score for the current state, the score from the preceding state being associated with a discrete class of production-related values wherein the class matches the class of the optimal production-related value.
    Type: Application
    Filed: December 6, 2005
    Publication date: April 20, 2006
    Applicant: Microsoft Corporation
    Inventors: Li Deng, Jian-Lai Zhou, Frank Seide, Asela Gunawardana, Hagai Attias, Alejandro Acero, Xuedong Huang
  • Publication number: 20050159953
    Abstract: A method of searching audio data is provided including receiving a query defining multiple phonetic possibilities. The method also includes comparing the query with a lattice of phonetic hypotheses associated with the audio data to identify if at least one of the multiple phonetic possibilities is approximated by at least one phonetic hypothesis in the lattice of phonetic hypotheses.
    Type: Application
    Filed: January 15, 2004
    Publication date: July 21, 2005
    Applicant: Microsoft Corporation
    Inventors: Frank Seide, Eric Chang
  • Patent number: 6513037
    Abstract: A data base query submitted in natural speech normally requires a dialog with the data base system, which repeatedly prompts the user to submit further statements. From each speech utterance submitted by the user a plurality of sets of statements are derived. The statements in these sets are tested for consistency with stored statements determined previously and consistent new statements are stored and stored statements are corrected or verified. Moreover, the stored statements are basically used in each dialogue step in order to derive from these statements an optimum request for the user by the system. Preferably, the statements are also stored with probability values or reliability values, the corresponding values of new statements to be stored being derived from the reliabilities of the statements of the respective speech utterance and the corresponding consistent statements stored.
    Type: Grant
    Filed: August 17, 1999
    Date of Patent: January 28, 2003
    Assignee: Koninklijke Philips Electronics N.V.
    Inventors: Bernhard J. Rüber, Andreas Kellner, Frank Seide
  • Patent number: 5987410
    Abstract: A method and device for recognizing speech that has a sequence of words each including one or more letters. The word and letters form a recognition data base. The method receives and recognizes the speech by preliminary modelling among various probably recognized sequences. The method selects one or more model sequences as result. In particular, the method allows in a model sequence of exclusively letters, various words as a subset. Such words are used to qualify one or more neighbouring or included letters in the sequence. An applicable model is a mixed information unit model.
    Type: Grant
    Filed: November 10, 1997
    Date of Patent: November 16, 1999
    Assignee: U.S. Philips Corporation
    Inventors: Andreas Kellner, Frank Seide
  • Patent number: 5987409
    Abstract: The determination of a plurality of sequences of words from a speech signal with a decreasing probability of correspondence utilizes the best word sequence as a basis and as further word sequences there are determined only those which enclose a part of the best word sequence, that is to say the remainder of these word sequences. To this end, the recognition involves first the formation of a word graph and the best word sequence is separately stored as a tree which initially has one branch only. The word boundaries of this word sequence form nodes in this tree. Because only nodes of this tree have to be taken into account for the next-best word sequences, the calculation is substantially simpler than if the complete word graph were first completely expanded in the form of a tree and completely searched again for each new word sequence.
    Type: Grant
    Filed: September 26, 1997
    Date of Patent: November 16, 1999
    Assignee: U.S. Philips Corporation
    Inventors: Bach-Hiep Tran, Frank Seide, Volker Steinbiss
  • Patent number: 5892960
    Abstract: The method and system are used to process a set of data elements, such as components of vectors used in pattern recognition, in parallel on a conventional sequential processor 30. A group of data elements is loaded into a data register of the processor 30 as a compound data element. The data register is operated on using a conventional processor instruction, such as an addition, subtraction or multiplication. The compound data element comprises an arrangement of at least two blocks, with each block comprising at the low order bit positions a data element and at the high order bit position(s) at least one separation bit. The separation bits are assigned predetermined bit values ensuring that during a processor operation the desired parallel result is formed.
    Type: Grant
    Filed: March 24, 1997
    Date of Patent: April 6, 1999
    Assignee: U.S. Philips Corporation
    Inventor: Frank Seide
  • Patent number: 5857169
    Abstract: A time-sequential input pattern (20), which is derived from a continual physical quantity such as speech, is recognized. The system includes input means (30), which accesses the physical quantity and therefrom generates a sequence of input observation vectors. The input observation vectors represent the input pattern. A reference pattern database (40) is used for storing reference patterns, which consist of a sequence of reference units. Each reference unit is represented by associated reference probability densities. A tree builder (60) represents for each reference unit the set of associated reference probability densities as a tree structure. Each leaf node of the tree corresponds to a reference probability density. Each non-leaf node corresponds to a cluster probability density, which is derived from all reference probability densities corresponding to leaf nodes in branches below the non-leaf node.
    Type: Grant
    Filed: August 28, 1996
    Date of Patent: January 5, 1999
    Assignee: U.S. Philips Corporation
    Inventor: Frank Seide
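
The abstract of patent 5892960 above describes packing several data elements into one register, with separation bits between them, so that a single ordinary instruction works on all of them at once. The following is a minimal sketch of that general idea, not the patented implementation. The 7-bit element width, the 32-bit register size, and all function names are assumptions chosen for illustration; Python integers are masked to imitate a fixed-width register.

```python
# Illustrative sketch only: element width, register width and names are assumed.
ELEM_BITS = 7                      # data bits per block (assumption)
BLOCK_BITS = ELEM_BITS + 1         # one separation bit at the high-order end
REGISTER_BITS = 32                 # simulated register width (assumption)
BLOCKS = REGISTER_BITS // BLOCK_BITS
REGISTER_MASK = (1 << REGISTER_BITS) - 1

# A word with only the separation bit of every block set.
SEP_MASK = sum(1 << (i * BLOCK_BITS + ELEM_BITS) for i in range(BLOCKS))


def pack(values):
    """Pack small non-negative integers into one compound word.

    Each value occupies the low-order bits of its block; the separation
    bit of each block is left at 0.
    """
    if len(values) > BLOCKS:
        raise ValueError("too many elements for one register")
    word = 0
    for i, v in enumerate(values):
        if not 0 <= v < (1 << ELEM_BITS):
            raise ValueError("element does not fit in ELEM_BITS")
        word |= v << (i * BLOCK_BITS)
    return word


def unpack(word, count):
    """Return the first `count` blocks, each including its separation bit."""
    block_mask = (1 << BLOCK_BITS) - 1
    return [(word >> (i * BLOCK_BITS)) & block_mask for i in range(count)]


def packed_add(a, b):
    """Add all blocks with one addition.

    With separation bits at 0, a carry out of a block's data bits lands in
    that block's own separation bit and never reaches the next block.
    """
    return (a + b) & REGISTER_MASK


def packed_sub(a, b):
    """Subtract all blocks with one subtraction.

    Setting the separation bits of the minuend to 1 lets every block absorb
    its own borrow. Each result block then holds 2**ELEM_BITS + (x - y).
    """
    return ((a | SEP_MASK) - b) & REGISTER_MASK


sums = unpack(packed_add(pack([100, 27, 127, 0]), pack([100, 3, 127, 5])), 4)
raw_diffs = unpack(packed_sub(pack([5, 100]), pack([10, 30])), 2)
diffs = [d - (1 << ELEM_BITS) for d in raw_diffs]   # remove the offset
```

Here `sums` holds the four element-wise sums and `diffs` holds the two signed element-wise differences, each obtained from a single integer operation on the packed word. The cost is one bit of capacity per block, which is the trade-off the separation bits make for keeping blocks independent.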