Patents by Inventor Qiang Huo
Qiang Huo has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20190139222
Abstract: An image processing method and an apparatus (300) are provided. The method includes: obtaining a depth image of a protuberant object (210); selecting a plurality of test points on a circle centered at a pixel in the depth image (the center point of the circle); calculating a protuberance value of the center point based on a comparison between the depth value of the center point and the depth value of each of the selected test points (240); and determining one or more salient points of the protuberant object by using the protuberance value of each pixel in the depth image (250).
Type: Application
Filed: June 30, 2016
Publication date: May 9, 2019
Inventors: Qiang Huo, Yuseok Ban
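The comparison described in the abstract can be sketched as follows. This is a minimal illustration, not the patent's exact formula: the function name, the sampling of eight points on the circle, and the fraction-of-farther-points scoring rule are all assumptions.

```python
import numpy as np

def protuberance_value(depth, cy, cx, radius=5, num_points=8):
    """Score how protuberant the center pixel (cy, cx) is: the fraction of
    test points on a surrounding circle whose depth is greater (farther from
    the camera) than the center's depth. Hypothetical scoring rule."""
    h, w = depth.shape
    angles = np.linspace(0, 2 * np.pi, num_points, endpoint=False)
    center = depth[cy, cx]
    score = 0
    for a in angles:
        ty = int(round(cy + radius * np.sin(a)))
        tx = int(round(cx + radius * np.cos(a)))
        if 0 <= ty < h and 0 <= tx < w:
            # Count test points that lie farther from the camera than the center.
            score += int(depth[ty, tx] > center)
    return score / num_points
```

Salient points of the protuberant object would then be pixels whose protuberance value is locally maximal over the whole depth image.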
-
Publication number: 20190114072
Abstract: Some implementations may present a media file that includes video on a touchscreen display. A user gesture performed on the touchscreen display may be detected. The user gesture may include one of a tap gesture, a swipe gesture, or a tap and hold and drag while holding gesture. Text selected by the user gesture may be determined. One or more follow-up actions may be performed automatically based at least partly on the text selected by the user gesture.
Type: Application
Filed: December 12, 2018
Publication date: April 18, 2019
Inventors: Peng Bai, Jun Du, Lei Sun, Qiang Huo
-
Patent number: 10216730
Abstract: Some implementations disclosed herein provide techniques and arrangements to enable translating language characters in media content. For example, some implementations receive a user selection of a first portion of media content. Some implementations disclosed herein may, based on the first portion, identify a second portion of the media content. The second portion of the media content may include one or more first characters of a first language. Some implementations disclosed herein may create an image that includes the second portion of the media content and may send the image to a server. Some implementations disclosed herein may receive one or more second characters of a second language corresponding to a translation of the one or more first characters of the first language from the server.
Type: Grant
Filed: January 29, 2016
Date of Patent: February 26, 2019
Assignee: Microsoft Technology Licensing, LLC
Inventors: Jun Du, Lei Sun, Jian Sun, Qiang Huo
-
Publication number: 20190050743
Abstract: The disclosure relates to a method and apparatus for training a learning machine. The apparatus includes: a broadcasting module for broadcasting an initial global model for a training cycle to a plurality of worker nodes; a receiving module for receiving a plurality of updated local models from the plurality of worker nodes, wherein each updated local model is generated by one of the plurality of worker nodes independently based on a data split assigned to the worker node and the initial global model for the training cycle; an aggregating module for aggregating the plurality of updated local models to obtain an aggregated model; and a generating module for generating an updated global model for the training cycle based at least on the aggregated model and historical information obtained from a preceding training cycle.
Type: Application
Filed: March 18, 2016
Publication date: February 14, 2019
Inventors: Kai Chen, Qiang Huo
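The aggregate-then-update cycle described in this abstract can be sketched as below. The momentum-style use of the previous cycle's update as "historical information" is an assumption in the spirit of blockwise model-update filtering, not necessarily the patent's exact rule; the function and parameter names are illustrative, and models are flat lists of floats for simplicity.

```python
def update_global_model(initial_global, local_models, prev_delta,
                        block_momentum=0.5, block_lr=1.0):
    """One training cycle: average the workers' updated local models, then
    combine the resulting update with the previous cycle's delta (historical
    information) before applying it to the global model."""
    n = len(local_models)
    # Aggregate the updated local models by simple parameter averaging.
    aggregated = [sum(m[i] for m in local_models) / n
                  for i in range(len(initial_global))]
    # Blend the fresh aggregated update with the preceding cycle's delta.
    new_delta = [block_momentum * d + block_lr * (a - g)
                 for d, a, g in zip(prev_delta, aggregated, initial_global)]
    updated_global = [g + d for g, d in zip(initial_global, new_delta)]
    return updated_global, new_delta
```

With `block_momentum=0`, this reduces to plain model averaging; the historical term smooths the trajectory of global models across cycles.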
-
Patent number: 10191650
Abstract: Some implementations may present a media file that includes video on a touchscreen display. A user gesture performed on the touchscreen display may be detected. The user gesture may include one of a tap gesture, a swipe gesture, or a tap and hold and drag while holding gesture. Text selected by the user gesture may be determined. One or more follow-up actions may be performed automatically based at least partly on the text selected by the user gesture.
Type: Grant
Filed: March 30, 2016
Date of Patent: January 29, 2019
Assignee: Microsoft Technology Licensing, LLC
Inventors: Peng Bai, Jun Du, Lei Sun, Qiang Huo
-
Publication number: 20170147920
Abstract: The use of the alternating direction method of multipliers (ADMM) algorithm to train a classifier may reduce classifier training time with little degradation in classifier accuracy. The training involves partitioning the training data for training the classifier into multiple data blocks. The partitions may preserve the joint distribution of input features and an output class of the training data. The training may further include performing an ADMM iteration on the multiple data blocks in an initial order using multiple worker nodes. Subsequently, the training of the classifier is determined to be completed if a stop criterion is satisfied following the ADMM iteration. Otherwise, if the stop criterion is determined to be unsatisfied following the ADMM iteration, one or more additional ADMM iterations may be performed on different orders of the multiple data blocks until the stop criterion is satisfied.
Type: Application
Filed: April 8, 2014
Publication date: May 25, 2017
Inventors: Qiang Huo, Zhi-Jie Yan, Kai Chen
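The partitioning step above, where each data block preserves the joint distribution of input features and output class, can be sketched with a stratified round-robin split. This is one common way to achieve such a partition, offered as an assumption; the patent does not mandate this particular recipe, and the names are illustrative.

```python
import random
from collections import defaultdict

def stratified_blocks(samples, labels, num_blocks, seed=0):
    """Partition (sample, label) pairs into num_blocks blocks so that each
    block keeps roughly the same class proportions as the full training set."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for s, y in zip(samples, labels):
        by_class[y].append(s)
    blocks = [[] for _ in range(num_blocks)]
    for y, items in by_class.items():
        rng.shuffle(items)
        # Deal each class's samples round-robin across the blocks.
        for i, s in enumerate(items):
            blocks[i % num_blocks].append((s, y))
    return blocks
```

Worker nodes would then run ADMM iterations over these blocks, reshuffling the block order between iterations until the stop criterion is met.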
-
Patent number: 9613450
Abstract: Dynamic texture mapping is used to create a photorealistic three-dimensional animation of an individual with facial features synchronized with desired speech. Audiovisual data of an individual reading a known script is obtained and stored in an audio library and an image library. The audiovisual data is processed to extract feature vectors used to train a statistical model. An input audio feature vector corresponding to desired speech with which the animation will be synchronized is provided. The statistical model is used to generate a trajectory of visual feature vectors that corresponds to the input audio feature vector. These visual feature vectors are used to identify a matching image sequence from the image library. The resulting sequence of images, concatenated from the image library, provides a photorealistic image sequence with facial features, such as lip movements, synchronized with the desired speech. This image sequence is applied to the three-dimensional model.
Type: Grant
Filed: May 3, 2011
Date of Patent: April 4, 2017
Assignee: Microsoft Technology Licensing, LLC
Inventors: Lijuan Wang, Frank Soong, Qiang Huo, Zhengyou Zhang
-
Publication number: 20160210040
Abstract: Some implementations may present a media file that includes video on a touchscreen display. A user gesture performed on the touchscreen display may be detected. The user gesture may include one of a tap gesture, a swipe gesture, or a tap and hold and drag while holding gesture. Text selected by the user gesture may be determined. One or more follow-up actions may be performed automatically based at least partly on the text selected by the user gesture.
Type: Application
Filed: March 30, 2016
Publication date: July 21, 2016
Inventors: Peng Bai, Jun Du, Lei Sun, Qiang Huo
-
Publication number: 20160147743
Abstract: Some implementations disclosed herein provide techniques and arrangements to enable translating language characters in media content. For example, some implementations receive a user selection of a first portion of media content. Some implementations disclosed herein may, based on the first portion, identify a second portion of the media content. The second portion of the media content may include one or more first characters of a first language. Some implementations disclosed herein may create an image that includes the second portion of the media content and may send the image to a server. Some implementations disclosed herein may receive one or more second characters of a second language corresponding to a translation of the one or more first characters of the first language from the server.
Type: Application
Filed: January 29, 2016
Publication date: May 26, 2016
Inventors: Jun Du, Lei Sun, Jian Sun, Qiang Huo
-
Patent number: 9336775
Abstract: A high-dimensional posterior-based feature with partial distance elimination may be utilized for speech recognition. The log likelihood values of a large number of Gaussians are needed to generate the high-dimensional posterior feature. Gaussians with very small log likelihoods are associated with zero posterior values. Log likelihoods for Gaussians for a speech frame may be evaluated with a partial distance elimination method. If the partial distance of a Gaussian is already too small, the Gaussian will have a zero posterior value. The partial distance may be calculated by sequentially adding individual dimensions in a group of dimensions. The partial distance elimination occurs when less than all of the dimensions in the group are sequentially added.
Type: Grant
Filed: March 5, 2013
Date of Patent: May 10, 2016
Assignee: Microsoft Technology Licensing, LLC
Inventors: Jinyu Li, Zhijie Yan, Qiang Huo, Yifan Gong
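The partial distance elimination idea above can be sketched for a diagonal-covariance Gaussian: the squared-distance terms are accumulated dimension by dimension, and evaluation stops early once the running log-likelihood can no longer exceed a threshold, since such a Gaussian would receive a zero posterior anyway. Checking after every single dimension (rather than after each group of dimensions) and the variable names are simplifying assumptions.

```python
def log_likelihood_with_pde(x, mean, inv_var, log_const, threshold):
    """Evaluate a diagonal Gaussian log-likelihood with partial distance
    elimination. Returns None when the Gaussian is eliminated early."""
    ll = log_const
    for xi, mi, ivi in zip(x, mean, inv_var):
        # Each dimension only subtracts from ll, so ll is an upper bound
        # on the final log-likelihood at every step.
        ll -= 0.5 * (xi - mi) ** 2 * ivi
        if ll < threshold:
            return None  # eliminated: posterior treated as zero
    return ll
```

Because each added dimension can only lower the log-likelihood, the early exit never discards a Gaussian that could have exceeded the threshold.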
-
Patent number: 9329692
Abstract: Some implementations may present a media file that includes video on a touchscreen display. A user gesture performed on the touchscreen display may be detected. The user gesture may include one of a tap gesture, a swipe gesture, or a tap and hold and drag while holding gesture. Text selected by the user gesture may be determined. One or more follow-up actions may be performed automatically based at least partly on the text selected by the user gesture.
Type: Grant
Filed: September 27, 2013
Date of Patent: May 3, 2016
Assignee: Microsoft Technology Licensing, LLC
Inventors: Peng Bai, Qiang Huo, Jun Du, Lei Sun
-
Patent number: 9251144
Abstract: Some implementations disclosed herein provide techniques and arrangements to enable translating language characters in media content. For example, some implementations receive a user selection of a first portion of media content. Some implementations disclosed herein may, based on the first portion, identify a second portion of the media content. The second portion of the media content may include one or more first characters of a first language. Some implementations disclosed herein may create an image that includes the second portion of the media content and may send the image to a server. Some implementations disclosed herein may receive one or more second characters of a second language corresponding to a translation of the one or more first characters of the first language from the server.
Type: Grant
Filed: October 19, 2011
Date of Patent: February 2, 2016
Assignee: Microsoft Technology Licensing, LLC
Inventors: Jun Du, Lei Sun, Jian Sun, Qiang Huo
-
Patent number: 9207765
Abstract: Techniques and systems for inputting data to interactive media devices are disclosed herein. In some aspects, a sensing device senses an object as it moves in a trajectory indicative of a desired input to an interactive media device. Recognition software may be used to translate the trajectory into various suggested characters or navigational commands. The suggested characters may be ranked based on a likelihood of being an intended input. The suggested characters may be displayed on a user interface at least in part based on the rank and made available for selection as the intended input.
Type: Grant
Filed: December 31, 2009
Date of Patent: December 8, 2015
Assignee: Microsoft Technology Licensing, LLC
Inventors: Lei Ma, Qiang Huo
-
Publication number: 20150199960
Abstract: Methods and systems for i-vector based clustering of training data in speech recognition are described. An i-vector may be extracted from a speech segment of the speech training data to represent acoustic information. The extracted i-vectors from the speech training data may be clustered into multiple clusters using a hierarchical divisive clustering algorithm. Using a cluster of the multiple clusters, an acoustic model may be trained. This trained acoustic model may be used in speech recognition.
Type: Application
Filed: August 24, 2012
Publication date: July 16, 2015
Inventors: Qiang Huo, Zhi-Jie Yan, Yu Zhang, Jian Xu
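Hierarchical divisive clustering of i-vectors can be sketched as repeated binary splits: at each step, the cluster with the largest distortion is divided in two. The split rule here (seeding a short 2-means refinement with the cluster's two farthest members) is an assumed stand-in; the patent does not pin this choice down, and i-vector extraction itself is outside this sketch.

```python
import numpy as np

def divisive_clustering(vectors, num_clusters):
    """Split a set of vectors (rows) into num_clusters clusters by
    repeatedly bisecting the cluster with the largest total distortion."""
    clusters = [np.asarray(vectors, dtype=float)]
    while len(clusters) < num_clusters:
        # Pick the cluster with the largest sum of squared distances to its mean.
        distortions = [((c - c.mean(axis=0)) ** 2).sum() for c in clusters]
        target = clusters.pop(int(np.argmax(distortions)))
        # Seed two centers with the pair of points farthest apart.
        d = np.linalg.norm(target[:, None, :] - target[None, :, :], axis=2)
        i, j = np.unravel_index(d.argmax(), d.shape)
        centers = np.stack([target[i], target[j]])
        for _ in range(10):  # a few 2-means refinement passes
            assign = np.linalg.norm(
                target[:, None, :] - centers[None, :, :], axis=2).argmin(axis=1)
            for k in (0, 1):
                if (assign == k).any():
                    centers[k] = target[assign == k].mean(axis=0)
        clusters += [target[assign == 0], target[assign == 1]]
    return clusters
```

Each resulting cluster would then supply the data for training one cluster-specific acoustic model.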
-
Publication number: 20150095855
Abstract: Some implementations may present a media file that includes video on a touchscreen display. A user gesture performed on the touchscreen display may be detected. The user gesture may include one of a tap gesture, a swipe gesture, or a tap and hold and drag while holding gesture. Text selected by the user gesture may be determined. One or more follow-up actions may be performed automatically based at least partly on the text selected by the user gesture.
Type: Application
Filed: September 27, 2013
Publication date: April 2, 2015
Applicant: Microsoft Corporation
Inventors: Peng Bai, Qiang Huo, Jun Du, Lei Sun
-
Patent number: 8977042
Abstract: A character recognition system receives an unknown character and recognizes the character based on a pre-trained recognition model. Prior to recognizing the character, the character recognition system may pre-process the character to rotate the character to a normalized orientation. By rotating the character to a normalized orientation in both training and recognition stages, the character recognition system releases the pre-trained recognition model from considering character prototypes in different orientations and thereby speeds up recognition of the unknown character. In one example, the character recognition system rotates the character to the normalized orientation by aligning a line between a sum of coordinates of starting points and a sum of coordinates of ending points of each stroke of the character with a normalized direction.
Type: Grant
Filed: March 23, 2012
Date of Patent: March 10, 2015
Assignee: Microsoft Corporation
Inventors: Qiang Huo, Jun Du
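The rotation normalization in the example above can be sketched directly: sum the stroke starting points, sum the stroke ending points, and rotate the whole character so the line between the two sums aligns with a fixed direction. Choosing the positive x-axis as the normalized direction is an assumption for illustration.

```python
import math

def normalize_orientation(strokes):
    """Rotate a character (a list of strokes, each a list of (x, y) points)
    so the line from the sum of stroke starting points to the sum of stroke
    ending points aligns with the positive x-axis."""
    sx = sum(s[0][0] for s in strokes)
    sy = sum(s[0][1] for s in strokes)
    ex = sum(s[-1][0] for s in strokes)
    ey = sum(s[-1][1] for s in strokes)
    angle = math.atan2(ey - sy, ex - sx)
    cos_a, sin_a = math.cos(-angle), math.sin(-angle)
    # Rotate every point by -angle so the start->end line maps onto the x-axis.
    return [[(x * cos_a - y * sin_a, x * sin_a + y * cos_a) for x, y in s]
            for s in strokes]
```

Applying the same normalization in both training and recognition means the model only ever sees characters in one orientation.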
-
Publication number: 20140257814
Abstract: A high-dimensional posterior-based feature with partial distance elimination may be utilized for speech recognition. The log likelihood values of a large number of Gaussians are needed to generate the high-dimensional posterior feature. Gaussians with very small log likelihoods are associated with zero posterior values. Log likelihoods for Gaussians for a speech frame may be evaluated with a partial distance elimination method. If the partial distance of a Gaussian is already too small, the Gaussian will have a zero posterior value. The partial distance may be calculated by sequentially adding individual dimensions in a group of dimensions. The partial distance elimination occurs when less than all of the dimensions in the group are sequentially added.
Type: Application
Filed: March 5, 2013
Publication date: September 11, 2014
Applicant: Microsoft Corporation
Inventors: Jinyu Li, Zhijie Yan, Qiang Huo, Yifan Gong
-
Publication number: 20130251249
Abstract: A character recognition system receives an unknown character and recognizes the character based on a pre-trained recognition model. Prior to recognizing the character, the character recognition system may pre-process the character to rotate the character to a normalized orientation. By rotating the character to a normalized orientation in both training and recognition stages, the character recognition system releases the pre-trained recognition model from considering character prototypes in different orientations and thereby speeds up recognition of the unknown character. In one example, the character recognition system rotates the character to the normalized orientation by aligning a line between a sum of coordinates of starting points and a sum of coordinates of ending points of each stroke of the character with a normalized direction.
Type: Application
Filed: March 23, 2012
Publication date: September 26, 2013
Applicant: Microsoft Corporation
Inventors: Qiang Huo, Jun Du
-
Patent number: 8515758
Abstract: Some implementations provide for speech recognition based on structured modeling, irrelevant variability normalization and unsupervised online adaptation of one or more speech recognition parameters. Some implementations may improve the ability of a runtime speech recognizer or decoder to adapt to new speakers and new environments.
Type: Grant
Filed: April 14, 2010
Date of Patent: August 20, 2013
Assignee: Microsoft Corporation
Inventor: Qiang Huo
-
Publication number: 20130185070
Abstract: A speech recognition system trains a plurality of feature transforms and a plurality of acoustic models using an irrelevant variability normalization based discriminative training. The speech recognition system employs the trained feature transforms to absorb or ignore variability within an unknown speech that is irrelevant to phonetic classification. The speech recognition system may then recognize the unknown speech using the trained recognition models. The speech recognition system may further perform an unsupervised adaptation to adapt the feature transforms for the unknown speech and thus increase the accuracy of recognizing the unknown speech.
Type: Application
Filed: January 12, 2012
Publication date: July 18, 2013
Applicant: Microsoft Corporation
Inventors: Qiang Huo, Zhi-Jie Yan, Yu Zhang