Patents by Inventor Qiang Huo

Qiang Huo has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20190139222
    Abstract: An image processing method and an apparatus (300) are provided. The method includes: obtaining a depth image of a protuberant object (210); selecting a plurality of test points lying on a circle centered at a pixel in the depth image; calculating a protuberance value of that center pixel based on a comparison between its depth value and the depth value of each of the selected test points (240); and determining one or more salient points of the protuberant object by using the protuberance value of each pixel in the depth image (250).
    Type: Application
    Filed: June 30, 2016
    Publication date: May 9, 2019
    Inventors: Qiang Huo, Yuseok Ban
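The protuberance idea in the abstract above can be sketched as follows. This is an illustrative reading, not the patented method: the circle radius, the number of test points, and the comparison rule (counting test points that lie farther from the camera than the center pixel) are all assumptions.

```python
import numpy as np

def protuberance_map(depth, radius=5, num_test_points=8):
    """Score each pixel by comparing its depth with test points on a
    surrounding circle. Comparison rule and parameters are assumed."""
    h, w = depth.shape
    angles = np.linspace(0, 2 * np.pi, num_test_points, endpoint=False)
    offsets = [(int(round(radius * np.cos(a))), int(round(radius * np.sin(a))))
               for a in angles]
    scores = np.zeros_like(depth, dtype=float)
    for y in range(radius, h - radius):
        for x in range(radius, w - radius):
            center = depth[y, x]
            # Count test points farther from the camera than the center:
            # a protuberant point sticks out in front of its surroundings.
            scores[y, x] = sum(depth[y + dy, x + dx] > center
                               for dx, dy in offsets) / num_test_points
    return scores
```

Pixels whose score is high on every surrounding circle would then be candidate salient points of the protuberant object.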
  • Publication number: 20190114072
    Abstract: Some implementations may present a media file that includes video on a touchscreen display. A user gesture performed on the touchscreen display may be detected. The user gesture may include one of a tap gesture, a swipe gesture, or a tap and hold and drag while holding gesture. Text selected by the user gesture may be determined. One or more follow-up actions may be performed automatically based at least partly on the text selected by the user gesture.
    Type: Application
    Filed: December 12, 2018
    Publication date: April 18, 2019
    Inventors: Peng Bai, Jun Du, Lei Sun, Qiang Huo
  • Patent number: 10216730
    Abstract: Some implementations disclosed herein provide techniques and arrangements to enable translating language characters in media content. For example, some implementations receive a user selection of a first portion of media content. Some implementations disclosed herein may, based on the first portion, identify a second portion of the media content. The second portion of the media content may include one or more first characters of a first language. Some implementations disclosed herein may create an image that includes the second portion of the media content and may send the image to a server. Some implementations disclosed herein may receive one or more second characters of a second language corresponding to a translation of the one or more first characters of the first language from the server.
    Type: Grant
    Filed: January 29, 2016
    Date of Patent: February 26, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Jun Du, Lei Sun, Jian Sun, Qiang Huo
  • Publication number: 20190050743
    Abstract: The disclosure relates to a method and apparatus for training a learning machine, wherein the apparatus includes: a broadcasting module for broadcasting an initial global model for a training cycle to a plurality of worker nodes; a receiving module for receiving a plurality of updated local models from the plurality of worker nodes, wherein each updated local model is generated by one of the plurality of worker nodes independently based on a data split assigned to the worker node and the initial global model for the training cycle; an aggregating module for aggregating the plurality of updated local models to obtain an aggregated model; and a generating module for generating an updated global model for the training cycle based at least on the aggregated model and historical information which is obtained from a preceding training cycle.
    Type: Application
    Filed: March 18, 2016
    Publication date: February 14, 2019
    Inventors: Kai Chen, Qiang Huo
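One training cycle of the broadcast/aggregate scheme described above can be sketched as below. The specific update rule shown, filtering the aggregated model update with the preceding cycle's update (a block-momentum-style use of "historical information"), is an assumption for illustration, not a claim about the patented generating module.

```python
import numpy as np

def update_global(initial_global, local_models, prev_delta, block_momentum=0.9):
    """One training cycle: aggregate workers' local models, then combine
    the resulting update with the previous cycle's update.
    The momentum-style rule is an illustrative assumption."""
    # Aggregate the plurality of updated local models by averaging.
    aggregated = np.mean(local_models, axis=0)
    # Blend the aggregated update with historical information from the
    # preceding training cycle.
    delta = block_momentum * prev_delta + (aggregated - initial_global)
    new_global = initial_global + delta
    return new_global, delta
```

Each worker would have produced its local model independently from its assigned data split before this aggregation step runs.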
  • Patent number: 10191650
    Abstract: Some implementations may present a media file that includes video on a touchscreen display. A user gesture performed on the touchscreen display may be detected. The user gesture may include one of a tap gesture, a swipe gesture, or a tap and hold and drag while holding gesture. Text selected by the user gesture may be determined. One or more follow-up actions may be performed automatically based at least partly on the text selected by the user gesture.
    Type: Grant
    Filed: March 30, 2016
    Date of Patent: January 29, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Peng Bai, Jun Du, Lei Sun, Qiang Huo
  • Publication number: 20170147920
    Abstract: The use of the alternating direction method of multipliers (ADMM) algorithm to train a classifier may reduce the amount of classifier training time with little degradation in classifier accuracy. The training involves partitioning the training data for training the classifier into multiple data blocks. The partitions may preserve the joint distribution of input features and an output class of the training data. The training may further include performing an ADMM iteration on the multiple data blocks in an initial order using multiple worker nodes. Subsequently, the training of the classifier is determined to be completed if a stop criterion is satisfied following the ADMM iteration. Otherwise, if the stop criterion is determined to be unsatisfied following the ADMM iteration, one or more additional ADMM iterations may be performed on different orders of the multiple data blocks until the stop criterion is satisfied.
    Type: Application
    Filed: April 8, 2014
    Publication date: May 25, 2017
    Inventors: Qiang Huo, Zhi-Jie Yan, Kai Chen
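The training loop described in the abstract above (an ADMM pass over the data blocks, repeated over different block orders until a stop criterion holds) has this control-flow shape. The `admm_iteration` and `stop_criterion` callables are placeholders; the abstract does not specify their internals, and reshuffling is only one way to obtain "different orders".

```python
import random

def train_with_admm(blocks, admm_iteration, stop_criterion, max_iters=100):
    """Run ADMM iterations over the data blocks, reordering the blocks
    between passes, until the stop criterion is satisfied."""
    order = list(range(len(blocks)))
    model = None
    for _ in range(max_iters):
        model = admm_iteration(model, [blocks[i] for i in order])
        if stop_criterion(model):
            return model
        random.shuffle(order)  # a different block order for the next pass
    return model
```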
  • Patent number: 9613450
    Abstract: Dynamic texture mapping is used to create a photorealistic three dimensional animation of an individual with facial features synchronized with desired speech. Audiovisual data of an individual reading a known script is obtained and stored in an audio library and an image library. The audiovisual data is processed to extract feature vectors used to train a statistical model. An input audio feature vector corresponding to desired speech with which the animation will be synchronized is provided. The statistical model is used to generate a trajectory of visual feature vectors that corresponds to the input audio feature vector. These visual feature vectors are used to identify a matching image sequence from the image library. The resulting sequence of images, concatenated from the image library, provides a photorealistic image sequence with facial features, such as lip movements, synchronized with the desired speech. This image sequence is applied to the three-dimensional model.
    Type: Grant
    Filed: May 3, 2011
    Date of Patent: April 4, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Lijuan Wang, Frank Soong, Qiang Huo, Zhengyou Zhang
  • Publication number: 20160210040
    Abstract: Some implementations may present a media file that includes video on a touchscreen display. A user gesture performed on the touchscreen display may be detected. The user gesture may include one of a tap gesture, a swipe gesture, or a tap and hold and drag while holding gesture. Text selected by the user gesture may be determined. One or more follow-up actions may be performed automatically based at least partly on the text selected by the user gesture.
    Type: Application
    Filed: March 30, 2016
    Publication date: July 21, 2016
    Inventors: Peng Bai, Jun Du, Lei Sun, Qiang Huo
  • Publication number: 20160147743
    Abstract: Some implementations disclosed herein provide techniques and arrangements to enable translating language characters in media content. For example, some implementations receive a user selection of a first portion of media content. Some implementations disclosed herein may, based on the first portion, identify a second portion of the media content. The second portion of the media content may include one or more first characters of a first language. Some implementations disclosed herein may create an image that includes the second portion of the media content and may send the image to a server. Some implementations disclosed herein may receive one or more second characters of a second language corresponding to a translation of the one or more first characters of the first language from the server.
    Type: Application
    Filed: January 29, 2016
    Publication date: May 26, 2016
    Inventors: Jun Du, Lei Sun, Jian Sun, Qiang Huo
  • Patent number: 9336775
    Abstract: A high-dimensional posterior-based feature with partial distance elimination may be utilized for speech recognition. The log likelihood values of a large number of Gaussians are needed to generate the high-dimensional posterior feature. Gaussians with very small log likelihoods are associated with zero posterior values. Log likelihoods for Gaussians for a speech frame may be evaluated with a partial distance elimination method. If the partial distance of a Gaussian is already too small, the Gaussian will have a zero posterior value. The partial distance may be calculated by sequentially adding individual dimensions in a group of dimensions. The partial distance elimination occurs when less than all of the dimensions in the group are sequentially added.
    Type: Grant
    Filed: March 5, 2013
    Date of Patent: May 10, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Jinyu Li, Zhijie Yan, Qiang Huo, Yifan Gong
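Partial distance elimination as described above can be sketched for a single diagonal-covariance Gaussian. The diagonal form and the threshold interface are assumptions; the key point from the abstract is that the distance accumulates one dimension at a time and evaluation stops early once the log likelihood can no longer matter.

```python
def log_likelihood_with_pde(x, mean, inv_var, log_const, threshold):
    """Diagonal-Gaussian log likelihood with partial distance elimination:
    accumulate the squared distance dimension by dimension and bail out
    as soon as the log likelihood falls below `threshold` (such a
    Gaussian would get a zero posterior)."""
    dist = 0.0
    for xi, mi, ivi in zip(x, mean, inv_var):
        dist += (xi - mi) ** 2 * ivi
        # log likelihood = log_const - 0.5 * dist only decreases as more
        # dimensions are added, so an early exit is safe.
        if log_const - 0.5 * dist < threshold:
            return None  # eliminated before all dimensions were summed
    return log_const - 0.5 * dist
```

A frame's posterior feature would then assign zero to every Gaussian that returns `None`, avoiding the full-distance computation for most of the large Gaussian pool.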
  • Patent number: 9329692
    Abstract: Some implementations may present a media file that includes video on a touchscreen display. A user gesture performed on the touchscreen display may be detected. The user gesture may include one of a tap gesture, a swipe gesture, or a tap and hold and drag while holding gesture. Text selected by the user gesture may be determined. One or more follow-up actions may be performed automatically based at least partly on the text selected by the user gesture.
    Type: Grant
    Filed: September 27, 2013
    Date of Patent: May 3, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Peng Bai, Qiang Huo, Jun Du, Lei Sun
  • Patent number: 9251144
    Abstract: Some implementations disclosed herein provide techniques and arrangements to enable translating language characters in media content. For example, some implementations receive a user selection of a first portion of media content. Some implementations disclosed herein may, based on the first portion, identify a second portion of the media content. The second portion of the media content may include one or more first characters of a first language. Some implementations disclosed herein may create an image that includes the second portion of the media content and may send the image to a server. Some implementations disclosed herein may receive one or more second characters of a second language corresponding to a translation of the one or more first characters of the first language from the server.
    Type: Grant
    Filed: October 19, 2011
    Date of Patent: February 2, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Jun Du, Lei Sun, Jian Sun, Qiang Huo
  • Patent number: 9207765
    Abstract: Techniques and systems for inputting data to interactive media devices are disclosed herein. In some aspects, a sensing device senses an object as it moves in a trajectory indicative of a desired input to an interactive media device. Recognition software may be used to translate the trajectory into various suggested characters or navigational commands. The suggested characters may be ranked based on a likelihood of being an intended input. The suggested characters may be displayed on a user interface at least in part based on the rank and made available for selection as the intended input.
    Type: Grant
    Filed: December 31, 2009
    Date of Patent: December 8, 2015
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Lei Ma, Qiang Huo
  • Publication number: 20150199960
    Abstract: Methods and systems for i-vector based clustering of training data in speech recognition are described. An i-vector may be extracted from a speech segment of the speech training data to represent acoustic information. The extracted i-vectors from the speech training data may be clustered into multiple clusters using a hierarchical divisive clustering algorithm. Using a cluster of the multiple clusters, an acoustic model may be trained. This trained acoustic model may be used in speech recognition.
    Type: Application
    Filed: August 24, 2012
    Publication date: July 16, 2015
    Inventors: Qiang Huo, Zhi-Jie Yan, Yu Zhang, Jian Xu
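A hierarchical divisive (top-down) clustering of i-vectors, as mentioned in the abstract above, can be sketched like this. The particular split rule used (thresholding on the dominant principal direction) and the size-based stopping rule are illustrative assumptions, not the patented algorithm.

```python
import numpy as np

def divisive_cluster(ivectors, max_size):
    """Recursively split a set of i-vectors into clusters, top-down.
    Split rule (sign of projection onto the leading principal
    direction) is an assumption for illustration."""
    if len(ivectors) <= max_size:
        return [ivectors]
    centered = ivectors - ivectors.mean(axis=0)
    # Direction of maximum variance: leading right singular vector.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    side = centered @ vt[0] >= 0
    if side.all() or (~side).all():   # degenerate split; stop here
        return [ivectors]
    return (divisive_cluster(ivectors[side], max_size)
            + divisive_cluster(ivectors[~side], max_size))
```

Each resulting cluster of speech segments could then be used to train its own acoustic model.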
  • Publication number: 20150095855
    Abstract: Some implementations may present a media file that includes video on a touchscreen display. A user gesture performed on the touchscreen display may be detected. The user gesture may include one of a tap gesture, a swipe gesture, or a tap and hold and drag while holding gesture. Text selected by the user gesture may be determined. One or more follow-up actions may be performed automatically based at least partly on the text selected by the user gesture.
    Type: Application
    Filed: September 27, 2013
    Publication date: April 2, 2015
    Applicant: Microsoft Corporation
    Inventors: Peng Bai, Qiang Huo, Jun Du, Lei Sun
  • Patent number: 8977042
    Abstract: A character recognition system receives an unknown character and recognizes the character based on a pre-trained recognition model. Prior to recognizing the character, the character recognition system may pre-process the character to rotate the character to a normalized orientation. By rotating the character to a normalized orientation in both training and recognition stages, the character recognition system releases the pre-trained recognition model from considering character prototypes in different orientations and thereby speeds up recognition of the unknown character. In one example, the character recognition system rotates the character to the normalized orientation by aligning a line between a sum of coordinates of starting points and a sum of coordinates of ending points of each stroke of the character with a normalized direction.
    Type: Grant
    Filed: March 23, 2012
    Date of Patent: March 10, 2015
    Assignee: Microsoft Corporation
    Inventors: Qiang Huo, Jun Du
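The orientation-normalization step in the example from the abstract above can be sketched as follows: form the line from the sum of stroke starting-point coordinates to the sum of stroke ending-point coordinates, and rotate the character so that line points along a normalized direction. The choice of target direction here is an assumption.

```python
import math

def normalization_angle(strokes, target_angle=0.0):
    """Angle to rotate a character so the line from the summed stroke
    start points to the summed stroke end points aligns with
    `target_angle` radians (target direction is assumed)."""
    sx = sum(s[0][0] for s in strokes)
    sy = sum(s[0][1] for s in strokes)
    ex = sum(s[-1][0] for s in strokes)
    ey = sum(s[-1][1] for s in strokes)
    current = math.atan2(ey - sy, ex - sx)
    return target_angle - current

def rotate_point(p, angle):
    """Rotate a 2-D point about the origin by `angle` radians."""
    x, y = p
    return (x * math.cos(angle) - y * math.sin(angle),
            x * math.sin(angle) + y * math.cos(angle))
```

Applying the same rotation in both the training and recognition stages is what lets the recognition model ignore character prototypes in other orientations.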
  • Publication number: 20140257814
    Abstract: A high-dimensional posterior-based feature with partial distance elimination may be utilized for speech recognition. The log likelihood values of a large number of Gaussians are needed to generate the high-dimensional posterior feature. Gaussians with very small log likelihoods are associated with zero posterior values. Log likelihoods for Gaussians for a speech frame may be evaluated with a partial distance elimination method. If the partial distance of a Gaussian is already too small, the Gaussian will have a zero posterior value. The partial distance may be calculated by sequentially adding individual dimensions in a group of dimensions. The partial distance elimination occurs when less than all of the dimensions in the group are sequentially added.
    Type: Application
    Filed: March 5, 2013
    Publication date: September 11, 2014
    Applicant: Microsoft Corporation
    Inventors: Jinyu Li, Zhijie Yan, Qiang Huo, Yifan Gong
  • Publication number: 20130251249
    Abstract: A character recognition system receives an unknown character and recognizes the character based on a pre-trained recognition model. Prior to recognizing the character, the character recognition system may pre-process the character to rotate the character to a normalized orientation. By rotating the character to a normalized orientation in both training and recognition stages, the character recognition system releases the pre-trained recognition model from considering character prototypes in different orientations and thereby speeds up recognition of the unknown character. In one example, the character recognition system rotates the character to the normalized orientation by aligning a line between a sum of coordinates of starting points and a sum of coordinates of ending points of each stroke of the character with a normalized direction.
    Type: Application
    Filed: March 23, 2012
    Publication date: September 26, 2013
    Applicant: Microsoft Corporation
    Inventors: Qiang Huo, Jun Du
  • Patent number: 8515758
    Abstract: Some implementations provide for speech recognition based on structured modeling, irrelevant variability normalization and unsupervised online adaptation of one or more speech recognition parameters. Some implementations may improve the ability of a runtime speech recognizer or decoder to adapt to new speakers and new environments.
    Type: Grant
    Filed: April 14, 2010
    Date of Patent: August 20, 2013
    Assignee: Microsoft Corporation
    Inventor: Qiang Huo
  • Publication number: 20130185070
    Abstract: A speech recognition system trains a plurality of feature transforms and a plurality of acoustic models using an irrelevant variability normalization based discriminative training. The speech recognition system employs the trained feature transforms to absorb or ignore variability within an unknown speech that is irrelevant to phonetic classification. The speech recognition system may then recognize the unknown speech using the trained recognition models. The speech recognition system may further perform an unsupervised adaptation to adapt the feature transforms for the unknown speech and thus increase the accuracy of recognizing the unknown speech.
    Type: Application
    Filed: January 12, 2012
    Publication date: July 18, 2013
    Applicant: Microsoft Corporation
    Inventors: Qiang Huo, Zhi-Jie Yan, Yu Zhang