Patents by Inventor Qiang Huo
Qiang Huo has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20190139222
Abstract: An image processing method and an apparatus (300) are provided. The method includes: obtaining a depth image of a protuberant object (210); selecting a plurality of test points on a circle centered at a pixel in the depth image (the center point of the circle); calculating a protuberance value of the center point based on a comparison between the depth value of the center point and the depth value of each of the selected test points (240); and determining one or more salient points of the protuberant object by using the protuberance value of each pixel in the depth image (250).
Type: Application
Filed: June 30, 2016
Publication date: May 9, 2019
Inventors: Qiang Huo, Yuseok Ban
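The comparison described in the abstract can be sketched as follows. This is a minimal illustration, not the patent's exact formula: the function name, the sampling of eight points on the circle, and the fraction-of-farther-points scoring rule are all assumptions.

```python
import numpy as np

def protuberance_value(depth, cy, cx, radius=5, num_points=8):
    """Score how protuberant the center pixel (cy, cx) is: the fraction of
    test points on a surrounding circle whose depth is greater (farther from
    the camera) than the center's depth. Hypothetical scoring rule."""
    h, w = depth.shape
    angles = np.linspace(0, 2 * np.pi, num_points, endpoint=False)
    center = depth[cy, cx]
    score = 0
    for a in angles:
        ty = int(round(cy + radius * np.sin(a)))
        tx = int(round(cx + radius * np.cos(a)))
        if 0 <= ty < h and 0 <= tx < w:
            # Count test points that lie farther from the camera than the center.
            score += int(depth[ty, tx] > center)
    return score / num_points
```

Salient points of the protuberant object would then be pixels whose protuberance value is locally maximal over the whole depth image.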
-
Publication number: 20190114072
Abstract: Some implementations may present a media file that includes video on a touchscreen display. A user gesture performed on the touchscreen display may be detected. The user gesture may include one of a tap gesture, a swipe gesture, or a tap and hold and drag while holding gesture. Text selected by the user gesture may be determined. One or more follow-up actions may be performed automatically based at least partly on the text selected by the user gesture.
Type: Application
Filed: December 12, 2018
Publication date: April 18, 2019
Inventors: Peng Bai, Jun Du, Lei Sun, Qiang Huo
-
Patent number: 10216730
Abstract: Some implementations disclosed herein provide techniques and arrangements to enable translating language characters in media content. For example, some implementations receive a user selection of a first portion of media content. Some implementations disclosed herein may, based on the first portion, identify a second portion of the media content. The second portion of the media content may include one or more first characters of a first language. Some implementations disclosed herein may create an image that includes the second portion of the media content and may send the image to a server. Some implementations disclosed herein may receive one or more second characters of a second language corresponding to a translation of the one or more first characters of the first language from the server.
Type: Grant
Filed: January 29, 2016
Date of Patent: February 26, 2019
Assignee: Microsoft Technology Licensing, LLC
Inventors: Jun Du, Lei Sun, Jian Sun, Qiang Huo
-
Publication number: 20190050743
Abstract: The disclosure relates to a method and apparatus for training a learning machine. The apparatus includes: a broadcasting module for broadcasting an initial global model for a training cycle to a plurality of worker nodes; a receiving module for receiving a plurality of updated local models from the plurality of worker nodes, wherein each updated local model is generated by one of the plurality of worker nodes independently based on a data split assigned to the worker node and the initial global model for the training cycle; an aggregating module for aggregating the plurality of updated local models to obtain an aggregated model; and a generating module for generating an updated global model for the training cycle based at least on the aggregated model and historical information obtained from a preceding training cycle.
Type: Application
Filed: March 18, 2016
Publication date: February 14, 2019
Inventors: Kai Chen, Qiang Huo
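The aggregate-then-update cycle described in this abstract can be sketched as below. The momentum-style use of the previous cycle's update as "historical information" is an assumption in the spirit of blockwise model-update filtering, not necessarily the patent's exact rule; the function and parameter names are illustrative, and models are flat lists of floats for simplicity.

```python
def update_global_model(initial_global, local_models, prev_delta,
                        block_momentum=0.5, block_lr=1.0):
    """One training cycle: average the workers' updated local models, then
    combine the resulting update with the previous cycle's delta (historical
    information) before applying it to the global model."""
    n = len(local_models)
    # Aggregate the updated local models by simple parameter averaging.
    aggregated = [sum(m[i] for m in local_models) / n
                  for i in range(len(initial_global))]
    # Blend the fresh aggregated update with the preceding cycle's delta.
    new_delta = [block_momentum * d + block_lr * (a - g)
                 for d, a, g in zip(prev_delta, aggregated, initial_global)]
    updated_global = [g + d for g, d in zip(initial_global, new_delta)]
    return updated_global, new_delta
```

With `block_momentum=0`, this reduces to plain model averaging; the historical term smooths the trajectory of global models across cycles.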
-
Patent number: 10191650
Abstract: Some implementations may present a media file that includes video on a touchscreen display. A user gesture performed on the touchscreen display may be detected. The user gesture may include one of a tap gesture, a swipe gesture, or a tap and hold and drag while holding gesture. Text selected by the user gesture may be determined. One or more follow-up actions may be performed automatically based at least partly on the text selected by the user gesture.
Type: Grant
Filed: March 30, 2016
Date of Patent: January 29, 2019
Assignee: Microsoft Technology Licensing, LLC
Inventors: Peng Bai, Jun Du, Lei Sun, Qiang Huo
-
Publication number: 20170147920
Abstract: The use of the alternating direction method of multipliers (ADMM) algorithm to train a classifier may reduce classifier training time with little degradation in classifier accuracy. The training involves partitioning the training data for training the classifier into multiple data blocks. The partitions may preserve the joint distribution of input features and an output class of the training data. The training may further include performing an ADMM iteration on the multiple data blocks in an initial order using multiple worker nodes. Subsequently, the training of the classifier is determined to be completed if a stop criterion is satisfied following the ADMM iteration. Otherwise, if the stop criterion is determined to be unsatisfied following the ADMM iteration, one or more additional ADMM iterations may be performed on different orders of the multiple data blocks until the stop criterion is satisfied.
Type: Application
Filed: April 8, 2014
Publication date: May 25, 2017
Inventors: Qiang Huo, Zhi-Jie Yan, Kai Chen
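The partitioning step above, where each data block preserves the joint distribution of input features and output class, can be sketched with a stratified round-robin split. This is one common way to achieve such a partition, offered as an assumption; the patent does not mandate this particular recipe, and the names are illustrative.

```python
import random
from collections import defaultdict

def stratified_blocks(samples, labels, num_blocks, seed=0):
    """Partition (sample, label) pairs into num_blocks blocks so that each
    block keeps roughly the same class proportions as the full training set."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for s, y in zip(samples, labels):
        by_class[y].append(s)
    blocks = [[] for _ in range(num_blocks)]
    for y, items in by_class.items():
        rng.shuffle(items)
        # Deal each class's samples round-robin across the blocks.
        for i, s in enumerate(items):
            blocks[i % num_blocks].append((s, y))
    return blocks
```

Worker nodes would then run ADMM iterations over these blocks, reshuffling the block order between iterations until the stop criterion is met.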
-
Patent number: 9613450
Abstract: Dynamic texture mapping is used to create a photorealistic three-dimensional animation of an individual with facial features synchronized with desired speech. Audiovisual data of an individual reading a known script is obtained and stored in an audio library and an image library. The audiovisual data is processed to extract feature vectors used to train a statistical model. An input audio feature vector corresponding to desired speech with which the animation will be synchronized is provided. The statistical model is used to generate a trajectory of visual feature vectors that corresponds to the input audio feature vector. These visual feature vectors are used to identify a matching image sequence from the image library. The resulting sequence of images, concatenated from the image library, provides a photorealistic image sequence with facial features, such as lip movements, synchronized with the desired speech. This image sequence is applied to the three-dimensional model.
Type: Grant
Filed: May 3, 2011
Date of Patent: April 4, 2017
Assignee: Microsoft Technology Licensing, LLC
Inventors: Lijuan Wang, Frank Soong, Qiang Huo, Zhengyou Zhang
-
Publication number: 20160210040
Abstract: Some implementations may present a media file that includes video on a touchscreen display. A user gesture performed on the touchscreen display may be detected. The user gesture may include one of a tap gesture, a swipe gesture, or a tap and hold and drag while holding gesture. Text selected by the user gesture may be determined. One or more follow-up actions may be performed automatically based at least partly on the text selected by the user gesture.
Type: Application
Filed: March 30, 2016
Publication date: July 21, 2016
Inventors: Peng Bai, Jun Du, Lei Sun, Qiang Huo
-
Publication number: 20160147743
Abstract: Some implementations disclosed herein provide techniques and arrangements to enable translating language characters in media content. For example, some implementations receive a user selection of a first portion of media content. Some implementations disclosed herein may, based on the first portion, identify a second portion of the media content. The second portion of the media content may include one or more first characters of a first language. Some implementations disclosed herein may create an image that includes the second portion of the media content and may send the image to a server. Some implementations disclosed herein may receive one or more second characters of a second language corresponding to a translation of the one or more first characters of the first language from the server.
Type: Application
Filed: January 29, 2016
Publication date: May 26, 2016
Inventors: Jun Du, Lei Sun, Jian Sun, Qiang Huo
-
Patent number: 9336775
Abstract: A high-dimensional posterior-based feature with partial distance elimination may be utilized for speech recognition. The log likelihood values of a large number of Gaussians are needed to generate the high-dimensional posterior feature. Gaussians with very small log likelihoods are associated with zero posterior values. Log likelihoods for Gaussians for a speech frame may be evaluated with a partial distance elimination method. If the partial distance of a Gaussian is already too small, the Gaussian will have a zero posterior value. The partial distance may be calculated by sequentially adding individual dimensions in a group of dimensions. The partial distance elimination occurs when less than all of the dimensions in the group are sequentially added.
Type: Grant
Filed: March 5, 2013
Date of Patent: May 10, 2016
Assignee: Microsoft Technology Licensing, LLC
Inventors: Jinyu Li, Zhijie Yan, Qiang Huo, Yifan Gong
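The partial distance elimination idea above can be sketched for a diagonal-covariance Gaussian: the squared-distance terms are accumulated dimension by dimension, and evaluation stops early once the running log-likelihood can no longer exceed a threshold, since such a Gaussian would receive a zero posterior anyway. Checking after every single dimension (rather than after each group of dimensions) and the variable names are simplifying assumptions.

```python
def log_likelihood_with_pde(x, mean, inv_var, log_const, threshold):
    """Evaluate a diagonal Gaussian log-likelihood with partial distance
    elimination. Returns None when the Gaussian is eliminated early."""
    ll = log_const
    for xi, mi, ivi in zip(x, mean, inv_var):
        # Each dimension only subtracts from ll, so ll is an upper bound
        # on the final log-likelihood at every step.
        ll -= 0.5 * (xi - mi) ** 2 * ivi
        if ll < threshold:
            return None  # eliminated: posterior treated as zero
    return ll
```

Because each added dimension can only lower the log-likelihood, the early exit never discards a Gaussian that could have exceeded the threshold.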
-
Patent number: 9329692
Abstract: Some implementations may present a media file that includes video on a touchscreen display. A user gesture performed on the touchscreen display may be detected. The user gesture may include one of a tap gesture, a swipe gesture, or a tap and hold and drag while holding gesture. Text selected by the user gesture may be determined. One or more follow-up actions may be performed automatically based at least partly on the text selected by the user gesture.
Type: Grant
Filed: September 27, 2013
Date of Patent: May 3, 2016
Assignee: Microsoft Technology Licensing, LLC
Inventors: Peng Bai, Qiang Huo, Jun Du, Lei Sun
-
Patent number: 9251144
Abstract: Some implementations disclosed herein provide techniques and arrangements to enable translating language characters in media content. For example, some implementations receive a user selection of a first portion of media content. Some implementations disclosed herein may, based on the first portion, identify a second portion of the media content. The second portion of the media content may include one or more first characters of a first language. Some implementations disclosed herein may create an image that includes the second portion of the media content and may send the image to a server. Some implementations disclosed herein may receive one or more second characters of a second language corresponding to a translation of the one or more first characters of the first language from the server.
Type: Grant
Filed: October 19, 2011
Date of Patent: February 2, 2016
Assignee: Microsoft Technology Licensing, LLC
Inventors: Jun Du, Lei Sun, Jian Sun, Qiang Huo
-
Patent number: 9207765
Abstract: Techniques and systems for inputting data to interactive media devices are disclosed herein. In some aspects, a sensing device senses an object as it moves in a trajectory indicative of a desired input to an interactive media device. Recognition software may be used to translate the trajectory into various suggested characters or navigational commands. The suggested characters may be ranked based on a likelihood of being an intended input. The suggested characters may be displayed on a user interface at least in part based on the rank and made available for selection as the intended input.
Type: Grant
Filed: December 31, 2009
Date of Patent: December 8, 2015
Assignee: Microsoft Technology Licensing, LLC
Inventors: Lei Ma, Qiang Huo
-
Publication number: 20150199960
Abstract: Methods and systems for i-vector based clustering of training data in speech recognition are described. An i-vector may be extracted from a speech segment of the speech training data to represent acoustic information. The extracted i-vectors from the speech training data may be clustered into multiple clusters using a hierarchical divisive clustering algorithm. Using a cluster of the multiple clusters, an acoustic model may be trained. This trained acoustic model may be used in speech recognition.
Type: Application
Filed: August 24, 2012
Publication date: July 16, 2015
Inventors: Qiang Huo, Zhi-Jie Yan, Yu Zhang, Jian Xu
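Hierarchical divisive clustering of i-vectors can be sketched as repeated binary splits: at each step, the cluster with the largest distortion is divided in two. The split rule here (seeding a short 2-means refinement with the cluster's two farthest members) is an assumed stand-in; the patent does not pin this choice down, and i-vector extraction itself is outside this sketch.

```python
import numpy as np

def divisive_clustering(vectors, num_clusters):
    """Split a set of vectors (rows) into num_clusters clusters by
    repeatedly bisecting the cluster with the largest total distortion."""
    clusters = [np.asarray(vectors, dtype=float)]
    while len(clusters) < num_clusters:
        # Pick the cluster with the largest sum of squared distances to its mean.
        distortions = [((c - c.mean(axis=0)) ** 2).sum() for c in clusters]
        target = clusters.pop(int(np.argmax(distortions)))
        # Seed two centers with the pair of points farthest apart.
        d = np.linalg.norm(target[:, None, :] - target[None, :, :], axis=2)
        i, j = np.unravel_index(d.argmax(), d.shape)
        centers = np.stack([target[i], target[j]])
        for _ in range(10):  # a few 2-means refinement passes
            assign = np.linalg.norm(
                target[:, None, :] - centers[None, :, :], axis=2).argmin(axis=1)
            for k in (0, 1):
                if (assign == k).any():
                    centers[k] = target[assign == k].mean(axis=0)
        clusters += [target[assign == 0], target[assign == 1]]
    return clusters
```

Each resulting cluster would then supply the data for training one cluster-specific acoustic model.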
-
Publication number: 20150095855
Abstract: Some implementations may present a media file that includes video on a touchscreen display. A user gesture performed on the touchscreen display may be detected. The user gesture may include one of a tap gesture, a swipe gesture, or a tap and hold and drag while holding gesture. Text selected by the user gesture may be determined. One or more follow-up actions may be performed automatically based at least partly on the text selected by the user gesture.
Type: Application
Filed: September 27, 2013
Publication date: April 2, 2015
Applicant: Microsoft Corporation
Inventors: Peng Bai, Qiang Huo, Jun Du, Lei Sun
-
Patent number: 8977042
Abstract: A character recognition system receives an unknown character and recognizes the character based on a pre-trained recognition model. Prior to recognizing the character, the character recognition system may pre-process the character to rotate the character to a normalized orientation. By rotating the character to a normalized orientation in both training and recognition stages, the character recognition system releases the pre-trained recognition model from considering character prototypes in different orientations and thereby speeds up recognition of the unknown character. In one example, the character recognition system rotates the character to the normalized orientation by aligning a line between a sum of coordinates of starting points and a sum of coordinates of ending points of each stroke of the character with a normalized direction.
Type: Grant
Filed: March 23, 2012
Date of Patent: March 10, 2015
Assignee: Microsoft Corporation
Inventors: Qiang Huo, Jun Du
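The rotation normalization in the example above can be sketched directly: sum the stroke starting points, sum the stroke ending points, and rotate the whole character so the line between the two sums aligns with a fixed direction. Choosing the positive x-axis as the normalized direction is an assumption for illustration.

```python
import math

def normalize_orientation(strokes):
    """Rotate a character (a list of strokes, each a list of (x, y) points)
    so the line from the sum of stroke starting points to the sum of stroke
    ending points aligns with the positive x-axis."""
    sx = sum(s[0][0] for s in strokes)
    sy = sum(s[0][1] for s in strokes)
    ex = sum(s[-1][0] for s in strokes)
    ey = sum(s[-1][1] for s in strokes)
    angle = math.atan2(ey - sy, ex - sx)
    cos_a, sin_a = math.cos(-angle), math.sin(-angle)
    # Rotate every point by -angle so the start->end line maps onto the x-axis.
    return [[(x * cos_a - y * sin_a, x * sin_a + y * cos_a) for x, y in s]
            for s in strokes]
```

Applying the same normalization in both training and recognition means the model only ever sees characters in one orientation.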
-
Publication number: 20140257814
Abstract: A high-dimensional posterior-based feature with partial distance elimination may be utilized for speech recognition. The log likelihood values of a large number of Gaussians are needed to generate the high-dimensional posterior feature. Gaussians with very small log likelihoods are associated with zero posterior values. Log likelihoods for Gaussians for a speech frame may be evaluated with a partial distance elimination method. If the partial distance of a Gaussian is already too small, the Gaussian will have a zero posterior value. The partial distance may be calculated by sequentially adding individual dimensions in a group of dimensions. The partial distance elimination occurs when less than all of the dimensions in the group are sequentially added.
Type: Application
Filed: March 5, 2013
Publication date: September 11, 2014
Applicant: Microsoft Corporation
Inventors: Jinyu Li, Zhijie Yan, Qiang Huo, Yifan Gong
-
Publication number: 20130251249
Abstract: A character recognition system receives an unknown character and recognizes the character based on a pre-trained recognition model. Prior to recognizing the character, the character recognition system may pre-process the character to rotate the character to a normalized orientation. By rotating the character to a normalized orientation in both training and recognition stages, the character recognition system releases the pre-trained recognition model from considering character prototypes in different orientations and thereby speeds up recognition of the unknown character. In one example, the character recognition system rotates the character to the normalized orientation by aligning a line between a sum of coordinates of starting points and a sum of coordinates of ending points of each stroke of the character with a normalized direction.
Type: Application
Filed: March 23, 2012
Publication date: September 26, 2013
Applicant: Microsoft Corporation
Inventors: Qiang Huo, Jun Du
-
Patent number: 8515758
Abstract: Some implementations provide for speech recognition based on structured modeling, irrelevant variability normalization and unsupervised online adaptation of one or more speech recognition parameters. Some implementations may improve the ability of a runtime speech recognizer or decoder to adapt to new speakers and new environments.
Type: Grant
Filed: April 14, 2010
Date of Patent: August 20, 2013
Assignee: Microsoft Corporation
Inventor: Qiang Huo
-
Publication number: 20130185070
Abstract: A speech recognition system trains a plurality of feature transforms and a plurality of acoustic models using an irrelevant variability normalization based discriminative training. The speech recognition system employs the trained feature transforms to absorb or ignore variability within an unknown speech that is irrelevant to phonetic classification. The speech recognition system may then recognize the unknown speech using the trained recognition models. The speech recognition system may further perform an unsupervised adaptation to adapt the feature transforms for the unknown speech and thus increase the accuracy of recognizing the unknown speech.
Type: Application
Filed: January 12, 2012
Publication date: July 18, 2013
Applicant: Microsoft Corporation
Inventors: Qiang Huo, Zhi-Jie Yan, Yu Zhang