Patents by Inventor Thomas M. Soemo

Thomas M. Soemo has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240078682
    Abstract: Training a multi-object tracking model includes: generating a plurality of training images based at least on scene generation information, each training image comprising a plurality of objects to be tracked; generating, for each training image, original simulated data based at least on the scene generation information, the original simulated data comprising tag data for a first object; locating, within the original simulated data, tag data for the first object, based on at least an anomaly alert (e.g., occlusion alert, proximity alert, motion alert) associated with the first object in the first training image; based at least on locating the tag data for the first object, modifying at least a portion of the tag data for the first object from the original simulated data, thereby generating preprocessed training data from the original simulated data; and training a multi-object tracking model with the preprocessed training data to produce a trained multi-object tracker.
    Type: Application
    Filed: November 13, 2023
    Publication date: March 7, 2024
    Inventors: Ishani CHAKRABORTY, Jonathan C. HANZELKA, Lu YUAN, Pedro Urbina ESCOS, Thomas M. SOEMO
  • Patent number: 11854211
    Abstract: Training a multi-object tracking model includes: generating a plurality of training images based at least on scene generation information, each training image comprising a plurality of objects to be tracked; generating, for each training image, original simulated data based at least on the scene generation information, the original simulated data comprising tag data for a first object; locating, within the original simulated data, tag data for the first object, based on at least an anomaly alert (e.g., occlusion alert, proximity alert, motion alert) associated with the first object in the first training image; based at least on locating the tag data for the first object, modifying at least a portion of the tag data for the first object from the original simulated data, thereby generating preprocessed training data from the original simulated data; and training a multi-object tracking model with the preprocessed training data to produce a trained multi-object tracker.
    Type: Grant
    Filed: January 26, 2022
    Date of Patent: December 26, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Ishani Chakraborty, Jonathan C. Hanzelka, Lu Yuan, Pedro Urbina Escos, Thomas M. Soemo
  • Patent number: 11335008
    Abstract: Training a multi-object tracking model includes: generating a plurality of training images based at least on scene generation information, each training image comprising a plurality of objects to be tracked; generating, for each training image, original simulated data based at least on the scene generation information, the original simulated data comprising tag data for a first object; locating, within the original simulated data, tag data for the first object, based on at least an anomaly alert (e.g., occlusion alert, proximity alert, motion alert) associated with the first object in the first training image; based at least on locating the tag data for the first object, modifying at least a portion of the tag data for the first object from the original simulated data, thereby generating preprocessed training data from the original simulated data; and training a multi-object tracking model with the preprocessed training data to produce a trained multi-object tracker.
    Type: Grant
    Filed: September 18, 2020
    Date of Patent: May 17, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Ishani Chakraborty, Jonathan C. Hanzelka, Lu Yuan, Pedro Urbina Escos, Thomas M. Soemo
  • Publication number: 20220148197
    Abstract: Training a multi-object tracking model includes: generating a plurality of training images based at least on scene generation information, each training image comprising a plurality of objects to be tracked; generating, for each training image, original simulated data based at least on the scene generation information, the original simulated data comprising tag data for a first object; locating, within the original simulated data, tag data for the first object, based on at least an anomaly alert (e.g., occlusion alert, proximity alert, motion alert) associated with the first object in the first training image; based at least on locating the tag data for the first object, modifying at least a portion of the tag data for the first object from the original simulated data, thereby generating preprocessed training data from the original simulated data; and training a multi-object tracking model with the preprocessed training data to produce a trained multi-object tracker.
    Type: Application
    Filed: January 26, 2022
    Publication date: May 12, 2022
    Inventors: Ishani CHAKRABORTY, Jonathan C. HANZELKA, Lu YUAN, Pedro Urbina ESCOS, Thomas M. SOEMO
  • Publication number: 20220092792
    Abstract: Training a multi-object tracking model includes: generating a plurality of training images based at least on scene generation information, each training image comprising a plurality of objects to be tracked; generating, for each training image, original simulated data based at least on the scene generation information, the original simulated data comprising tag data for a first object; locating, within the original simulated data, tag data for the first object, based on at least an anomaly alert (e.g., occlusion alert, proximity alert, motion alert) associated with the first object in the first training image; based at least on locating the tag data for the first object, modifying at least a portion of the tag data for the first object from the original simulated data, thereby generating preprocessed training data from the original simulated data; and training a multi-object tracking model with the preprocessed training data to produce a trained multi-object tracker.
    Type: Application
    Filed: September 18, 2020
    Publication date: March 24, 2022
    Inventors: Ishani CHAKRABORTY, Jonathan C. HANZELKA, Lu YUAN, Pedro Urbina ESCOS, Thomas M. SOEMO
  • Patent number: 10534438
    Abstract: A multimedia entertainment system combines both gestures and voice commands to provide an enhanced control scheme. A user's body position or motion may be recognized as a gesture, and may be used to provide context to recognize user generated sounds, such as speech input. Likewise, speech input may be recognized as a voice command, and may be used to provide context to recognize a body position or motion as a gesture. Weights may be assigned to the inputs to facilitate processing. When a gesture is recognized, a limited set of voice commands associated with the recognized gesture are loaded for use. Further, additional sets of voice commands may be structured in a hierarchical manner such that speaking a voice command from one set of voice commands leads to the system loading a next set of voice commands.
    Type: Grant
    Filed: April 28, 2017
    Date of Patent: January 14, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Christian Klein, Ali M. Vassigh, Jason S. Flaks, Vanessa Larco, Thomas M. Soemo
  • Patent number: 10368120
    Abstract: A method and system are disclosed in which a group of people are able to replicate the physical world experience of going with a group of friends to pick a movie, watch the movie together, and provide commentary on the movie itself in the virtual world on a virtual couch while each user is sitting in different physical locations. Additionally, the virtual representation of the destination that the group of people are watching the movie together in can be themed to allow users to watch movies in different locations pivoting on special events or by the user's choice.
    Type: Grant
    Filed: August 11, 2016
    Date of Patent: July 30, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Andrew Lawrence Mattingly, Brian Charles Kramp, Thomas M. Soemo, Eddie Mays
  • Patent number: 9945946
    Abstract: Examples are disclosed herein that relate to depth imaging techniques using ultrasound. One example provides an ultrasonic depth sensing system configured to, for an image frame, emit an ultrasonic pulse from each of a plurality of transducers, receive a reflection of each ultrasonic pulse at a microphone array, perform transmit beamforming and also receive beamforming computationally after receiving the reflections, form a depth image, and output the depth image for the image frame.
    Type: Grant
    Filed: September 11, 2014
    Date of Patent: April 17, 2018
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Ivan Dokmanic, Ivan J. Tashev, Thomas M. Soemo
  • Publication number: 20170228036
    Abstract: A multimedia entertainment system combines both gestures and voice commands to provide an enhanced control scheme. A user's body position or motion may be recognized as a gesture, and may be used to provide context to recognize user generated sounds, such as speech input. Likewise, speech input may be recognized as a voice command, and may be used to provide context to recognize a body position or motion as a gesture. Weights may be assigned to the inputs to facilitate processing. When a gesture is recognized, a limited set of voice commands associated with the recognized gesture are loaded for use. Further, additional sets of voice commands may be structured in a hierarchical manner such that speaking a voice command from one set of voice commands leads to the system loading a next set of voice commands.
    Type: Application
    Filed: April 28, 2017
    Publication date: August 10, 2017
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Christian Klein, Ali M. Vassigh, Jason S. Flaks, Vanessa Larco, Thomas M. Soemo
  • Publication number: 20170041658
    Abstract: A method and system are disclosed in which a group of people are able to replicate the physical world experience of going with a group of friends to pick a movie, watch the movie together, and provide commentary on the movie itself in the virtual world on a virtual couch while each user is sitting in different physical locations. Additionally, the virtual representation of the destination that the group of people are watching the movie together in can be themed to allow users to watch movies in different locations pivoting on special events or by the user's choice.
    Type: Application
    Filed: August 11, 2016
    Publication date: February 9, 2017
    Inventors: Andrew Lawrence Mattingly, Brian Charles Kramp, Thomas M. Soemo, Eddie Mays
  • Patent number: 9423945
    Abstract: A method and system are disclosed in which a group of people are able to replicate the physical world experience of going with a group of friends to pick a movie, watch the movie together, and provide commentary on the movie itself in the virtual world on a virtual couch while each user is sitting in different physical locations. Additionally, the virtual representation of the destination that the group of people are watching the movie together in can be themed to allow users to watch movies in different locations pivoting on special events or by the user's choice.
    Type: Grant
    Filed: August 24, 2015
    Date of Patent: August 23, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Andrew Lawrence Mattingly, Brian Charles Kramp, Thomas M. Soemo, Eddie Mays
  • Publication number: 20160077206
    Abstract: Examples are disclosed herein that relate to depth imaging techniques using ultrasound. One example provides an ultrasonic depth sensing system configured to, for an image frame, emit an ultrasonic pulse from each of a plurality of transducers, receive a reflection of each ultrasonic pulse at a microphone array, perform transmit beamforming and also receive beamforming computationally after receiving the reflections, form a depth image, and output the depth image for the image frame.
    Type: Application
    Filed: September 11, 2014
    Publication date: March 17, 2016
    Inventors: Ivan Dokmanic, Ivan J. Tashev, Thomas M. Soemo
  • Publication number: 20150363099
    Abstract: A method and system are disclosed in which a group of people are able to replicate the physical world experience of going with a group of friends to pick a movie, watch the movie together, and provide commentary on the movie itself in the virtual world on a virtual couch while each user is sitting in different physical locations. Additionally, the virtual representation of the destination that the group of people are watching the movie together in can be themed to allow users to watch movies in different locations pivoting on special events or by the user's choice.
    Type: Application
    Filed: August 24, 2015
    Publication date: December 17, 2015
    Inventors: Andrew Lawrence Mattingly, Brian Charles Kramp, Thomas M. Soemo, Eddie Mays
  • Patent number: 9118737
    Abstract: A method and system are disclosed in which a group of people are able to replicate the physical world experience of going with a group of friends to pick a movie, watch the movie together, and provide commentary on the movie itself in the virtual world on a virtual couch while each user is sitting in different physical locations. Additionally, the virtual representation of the destination that the group of people are watching the movie together in can be themed to allow users to watch movies in different locations pivoting on special events or by the user's choice.
    Type: Grant
    Filed: February 24, 2014
    Date of Patent: August 25, 2015
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Andrew Lawrence Mattingly, Brian Charles Kramp, Thomas M. Soemo, Eddie Mays
  • Publication number: 20140173462
    Abstract: A method and system are disclosed in which a group of people are able to replicate the physical world experience of going with a group of friends to pick a movie, watch the movie together, and provide commentary on the movie itself in the virtual world on a virtual couch while each user is sitting in different physical locations. Additionally, the virtual representation of the destination that the group of people are watching the movie together in can be themed to allow users to watch movies in different locations pivoting on special events or by the user's choice.
    Type: Application
    Filed: February 24, 2014
    Publication date: June 19, 2014
    Applicant: Microsoft Corporation
    Inventors: Andrew Lawrence Mattingly, Brian Charles Kramp, Thomas M. Soemo, Eddie Mays
  • Patent number: 8661353
    Abstract: A method and system are disclosed in which a group of people are able to replicate the physical world experience of going with a group of friends to pick a movie, watch the movie together, and provide commentary on the movie itself in the virtual world on a virtual couch while each user is sitting in different physical locations. Additionally, the virtual representation of the destination that the group of people are watching the movie together in can be themed to allow users to watch movies in different locations pivoting on special events or by the user's choice.
    Type: Grant
    Filed: August 31, 2009
    Date of Patent: February 25, 2014
    Assignee: Microsoft Corporation
    Inventors: Andrew Lawrence Mattingly, Brian Charles Kramp, Thomas M. Soemo, Eddie Mays
  • Patent number: 8660847
    Abstract: A system for integrating local speech recognition with cloud-based speech recognition in order to provide an efficient natural user interface is described. In some embodiments, a computing device determines a direction associated with a particular person within an environment and generates an audio recording associated with the direction. The computing device then performs local speech recognition on the audio recording in order to detect a first utterance spoken by the particular person and to detect one or more keywords within the first utterance. The first utterance may be detected by applying voice activity detection techniques to the audio recording. The first utterance and the one or more keywords are subsequently transferred to a server which may identify speech sounds within the first utterance associated with the one or more keywords and adapt one or more speech recognition techniques based on the identified speech sounds.
    Type: Grant
    Filed: September 2, 2011
    Date of Patent: February 25, 2014
    Assignee: Microsoft Corporation
    Inventors: Thomas M. Soemo, Leo Soong, Michael H. Kim, Chad R. Heinemann, Dax H. Hawkins
  • Publication number: 20130060571
    Abstract: A system for integrating local speech recognition with cloud-based speech recognition in order to provide an efficient natural user interface is described. In some embodiments, a computing device determines a direction associated with a particular person within an environment and generates an audio recording associated with the direction. The computing device then performs local speech recognition on the audio recording in order to detect a first utterance spoken by the particular person and to detect one or more keywords within the first utterance. The first utterance may be detected by applying voice activity detection techniques to the audio recording. The first utterance and the one or more keywords are subsequently transferred to a server which may identify speech sounds within the first utterance associated with the one or more keywords and adapt one or more speech recognition techniques based on the identified speech sounds.
    Type: Application
    Filed: September 2, 2011
    Publication date: March 7, 2013
    Applicant: Microsoft Corporation
    Inventors: Thomas M. Soemo, Leo Soong, Michael H. Kim, Chad R. Heinemann, Dax H. Hawkins
  • Patent number: 8387015
    Abstract: Scalable empirical testing of media file playback utilizes test hooks in each media player to support simulated human interaction and playback monitoring. A media crawler catalogs media files accumulated in a media file database to create a wordlist. One or more scalable instances of media tester accesses the wordlist to select items of work linked to media files. Work items and/or operating modes of media tester specify test parameters such as performance profiles or further define testing such as specifying repetitious playback on one or more media players. Media files are downloaded to and played by a scalable number of media players. Playback performance is monitored, analyzed and reported. Failure reports are accompanied by instructions to reproduce failures and cross-references to content or source code in media files. Failures can be audited by additional work items for follow-up testing.
    Type: Grant
    Filed: January 31, 2008
    Date of Patent: February 26, 2013
    Assignee: Microsoft Corporation
    Inventors: Russell D. Christensen, Jun Ma, Thomas M. Soemo
  • Patent number: 8296151
    Abstract: A multimedia entertainment system combines both gestures and voice commands to provide an enhanced control scheme. A user's body position or motion may be recognized as a gesture, and may be used to provide context to recognize user generated sounds, such as speech input. Likewise, speech input may be recognized as a voice command, and may be used to provide context to recognize a body position or motion as a gesture. Weights may be assigned to the inputs to facilitate processing. When a gesture is recognized, a limited set of voice commands associated with the recognized gesture are loaded for use. Further, additional sets of voice commands may be structured in a hierarchical manner such that speaking a voice command from one set of voice commands leads to the system loading a next set of voice commands.
    Type: Grant
    Filed: June 18, 2010
    Date of Patent: October 23, 2012
    Assignee: Microsoft Corporation
    Inventors: Christian Klein, Ali M. Vassigh, Jason S. Flaks, Vanessa Larco, Thomas M. Soemo
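The tag-data preprocessing step recited in the multi-object tracking filings above (e.g., patent 11854211) can be sketched roughly as follows. All class, field, and function names here are hypothetical illustrations of the claim language, not the patented implementation:

```python
from dataclasses import dataclass, field

@dataclass
class TagData:
    object_id: int
    bbox: tuple          # (x, y, w, h) ground-truth box from the simulator
    visible: bool = True

@dataclass
class SimulatedFrame:
    image: object                               # rendered training image
    tags: dict = field(default_factory=dict)    # object_id -> TagData
    alerts: list = field(default_factory=list)  # e.g. [("occlusion", 3)]

def preprocess(frames):
    """Locate tag data for objects flagged by anomaly alerts (occlusion,
    proximity, motion) and modify it, turning the original simulated data
    into preprocessed training data."""
    for frame in frames:
        for alert_type, object_id in frame.alerts:
            tag = frame.tags.get(object_id)
            if tag is None:
                continue
            if alert_type == "occlusion":
                # Hypothetical policy: exclude heavily occluded boxes
                # from the training loss.
                tag.visible = False
    return frames
```

The preprocessed frames would then feed a standard multi-object tracking training loop, which the filings leave open.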
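The gesture-gated, hierarchical voice-command scheme described in patents 8296151 and 10534438 could be modeled roughly like this; the gestures, command names, and tree shape below are illustrative assumptions:

```python
# Hierarchical command sets: a recognized gesture loads a limited set of
# voice commands, and speaking a command from one set loads the next set.
COMMAND_TREE = {
    "wave": {                       # gesture -> top-level command set
        "open menu": {
            "play movie": {},
            "settings": {},
        },
        "volume": {
            "up": {},
            "down": {},
        },
    },
}

class CommandContext:
    def __init__(self, tree):
        self.tree = tree
        self.active = {}            # nothing loaded until a gesture is seen

    def on_gesture(self, gesture):
        """A recognized gesture loads its limited set of voice commands."""
        self.active = self.tree.get(gesture, {})
        return list(self.active)

    def on_voice(self, utterance):
        """A command from the loaded set loads the next set in the
        hierarchy; anything outside the loaded set is ignored."""
        if utterance in self.active:
            self.active = self.active[utterance]
            return list(self.active)
        return None
```

Restricting the recognizer's vocabulary to the currently loaded set is what lets each input mode provide context for the other.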
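The two-stage local/cloud speech recognition pipeline of patent 8660847 might look like the sketch below: a crude local pass detects an utterance and known keywords, then hands both to a server for adaptation. The energy threshold, keyword set, and function names are assumptions, not the patented method:

```python
KEYWORDS = {"play", "pause", "search"}

def detect_utterance(audio_samples, threshold=0.01):
    """Toy voice-activity detection: keep the recording only if its mean
    energy exceeds a threshold (a stand-in for a real VAD)."""
    energy = sum(s * s for s in audio_samples) / max(len(audio_samples), 1)
    return audio_samples if energy > threshold else None

def local_recognize(utterance_text):
    """Local pass: detect known keywords within the utterance."""
    return [w for w in utterance_text.lower().split() if w in KEYWORDS]

def send_to_server(utterance_text, keywords):
    """Placeholder for the cloud stage, which per the abstract would
    identify the speech sounds behind each keyword and adapt one or more
    recognition techniques accordingly."""
    return {"utterance": utterance_text, "keywords": keywords}
```

The split keeps cheap keyword spotting on the device while the server does the expensive adaptation.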
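The scalable media-testing architecture of patent 8387015 (a crawler catalogs media files into a wordlist of work items; tester instances drain the wordlist, drive playback via test hooks, and report failures with reproduction info) can be sketched under these assumed names:

```python
import queue

def crawl(media_db):
    """Media crawler: catalog media files into a wordlist of work items."""
    wordlist = queue.Queue()
    for path in media_db:
        wordlist.put({"file": path, "repetitions": 1})
    return wordlist

def run_tester(wordlist, play):
    """One media-tester instance: pull work items, play each file the
    requested number of times, and collect failure reports accompanied
    by the work item needed to reproduce the failure."""
    failures = []
    while not wordlist.empty():
        item = wordlist.get()
        for _ in range(item["repetitions"]):
            ok = play(item["file"])     # test hook simulating playback
            if not ok:
                failures.append({"file": item["file"], "repro": item})
    return failures
```

Because the wordlist is a shared queue, any number of tester instances can be pointed at it, which is the scalability property the abstract emphasizes.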