Patents by Inventor Cha Zhang

Cha Zhang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240071047
    Abstract: The disclosure herein describes generating input key-standard key mappings for a form. A set of input key-value pairs are received, and a subset of candidate form types are determined from a set of form types using the input key-value pairs. A set of standard keys associated with the determined subset of candidate form types are obtained. A set of input key-standard key pairs are generated using the set of input key-value pairs and the obtained set of standard keys, and the set of input key-standard key pairs is narrowed using a narrowing rule. Ranking scores for each input key-standard key pair of the narrowed set of input key-standard key pairs are generated. Each input key of the set of input key-value pairs is mapped to a standard key of the set of standard keys using at least the generated ranking scores of the narrowed set of input key-standard key pairs.
    Type: Application
    Filed: November 29, 2022
    Publication date: February 29, 2024
    Inventors: Souvik KUNDU, Jianwen ZHANG, Kaushik CHAKRABARTI, Yuet CHING, Leon ROMANIUK, Zheng CHEN, Cha ZHANG, Neta HAIBY, Vinod KURPAD, Anatoly Yevgenyevich PONOMAREV
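    The flow described in this abstract (generate candidate pairs, narrow them with a rule, rank, then map each input key) can be sketched roughly as follows. All names are illustrative, and plain string similarity stands in for the trained ranking model, which the abstract does not specify:

    ```python
    from difflib import SequenceMatcher

    def narrow(pairs, min_score=0.3):
        """Narrowing rule (assumed): drop pairs whose score is too low."""
        return [(ik, sk, s) for ik, sk, s in pairs if s >= min_score]

    def map_keys(input_keys, standard_keys):
        # Generate every input-key / standard-key pair with a ranking score
        # (string similarity here is only a stand-in for a learned ranker).
        pairs = [(ik, sk, SequenceMatcher(None, ik.lower(), sk.lower()).ratio())
                 for ik in input_keys for sk in standard_keys]
        pairs = narrow(pairs)
        mapping = {}
        # Assign each input key to its best-scoring surviving standard key.
        for ik, sk, score in sorted(pairs, key=lambda p: -p[2]):
            mapping.setdefault(ik, sk)
        return mapping

    mapping = map_keys(["Invoice No.", "Cust Name"],
                       ["invoice_number", "customer_name", "total"])
    ```

    The greedy highest-score-first assignment is one simple way to realize "mapped using at least the generated ranking scores"; the patent leaves the exact selection procedure open.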
  • Publication number: 20230084845
    Abstract: The disclosure herein describes providing signature data of an input document. Text data of the input document is obtained (e.g., OCR data generated from image data), and a first set of signature fields are identified using signature key-value pairs of the text data. A first subset of signed signature fields and a first subset of unsigned signature fields are determined based on mapping to a set of predicted values. A second set of signature fields are determined using a region prediction model applied to image data of the input document. Region images associated with the first subset of unsigned signature fields and with the second set of signature fields are obtained, and a second set of signed signature fields and a second set of unsigned signature fields are determined using a signature recognition model. Signature output data is provided including signed signature fields and/or unsigned signature fields.
    Type: Application
    Filed: September 13, 2021
    Publication date: March 16, 2023
    Inventors: Yijuan LU, Lynsey LIU, Andrei A. GAIVORONSKI, Yu CHENG, Dinei Afonso Ferreira FLORENCIO, Cha ZHANG, John Richard CORRING
  • Patent number: 11562588
    Abstract: Interfaces and systems are provided for harvesting ground truth from forms to be used in training models based on key-value pairings in the forms and to later use the trained models to identify related key-value pairings in new forms. Initially, forms are identified and clustered to identify a subset of forms to label with the key-value pairings. Users provide input to identify keys to use in labeling and then select/highlight text from forms that are presented concurrently with the keys in order to associate the highlighted text with the key(s) as the corresponding key-value pairing(s). After labeling the forms with the key-value pairings, the key-value pairing data is used as ground truth for training a model to independently identify the key-value pairing(s) in new forms. Once trained, the model is used to identify the key-value pairing(s) in new forms.
    Type: Grant
    Filed: March 26, 2020
    Date of Patent: January 24, 2023
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Dinei Afonso Ferreira Florencio, Yu-Yun Dai, Cha Zhang, Shih Chia Wang
  • Patent number: 11093740
    Abstract: The disclosed technology is generally directed to optical character recognition for forms. In one example of the technology, optical character recognition is performed on a plurality of forms. The forms of the plurality of forms include at least one type of form. Anchors are determined for the forms, including corresponding anchors for each type of form of the plurality of forms. Feature rules are determined, including corresponding feature rules for each type of form of the plurality of forms. Features and labels are determined for each form of the plurality of forms. A training model is generated based on a ground truth that includes a plurality of key-value pairs corresponding to the plurality of forms, and further based on the determined features and labels for the plurality of forms.
    Type: Grant
    Filed: November 9, 2018
    Date of Patent: August 17, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Dinei Afonso Ferreira Florencio, Cha Zhang, Gil Moshe Nahmias, Yu-Yun Dai
  • Patent number: 11055560
    Abstract: The disclosed technology is generally directed to optical text recognition for forms. In one example of the technology, line grouping rules are generated based on the generic forms and a ground truth for the generic forms. Line groupings are applied to the generic forms based on the line grouping rules. Feature extraction rules are generated. Features are extracted from the generic forms based on the feature extraction rules. A key-value classifier model is generated, such that the key-value classifier model is configured to determine, for each line of a form: a probability that the line is a value, and a probability that the line is a key. A key-value pairing model is generated, such that the key-value pairing model is configured to predict, for each key in a form, which value in the form corresponds to the key.
    Type: Grant
    Filed: May 15, 2019
    Date of Patent: July 6, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Dinei Afonso Ferreira Florencio, Cha Zhang, Gil Moshe Nahmias, Yu-Yun Dai, Sean Louis Goldberg
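    The two-model pipeline in this abstract (a key-value classifier that scores each line, then a pairing model that matches keys to values) could be sketched as below. The heuristic classifier and distance-based pairing are toy stand-ins; the patent's models are learned:

    ```python
    def classify(line):
        """Toy stand-in for the key-value classifier: (p_key, p_value)."""
        return (0.9, 0.1) if line["text"].endswith(":") else (0.1, 0.9)

    def pair_keys(lines):
        keys = [l for l in lines if classify(l)[0] > 0.5]
        vals = [l for l in lines if classify(l)[1] > 0.5]
        pairs = {}
        for k in keys:
            # Toy pairing model: choose the spatially closest value line.
            v = min(vals, key=lambda v: (v["x"] - k["x"]) ** 2 + (v["y"] - k["y"]) ** 2)
            pairs[k["text"]] = v["text"]
        return pairs

    lines = [{"text": "Name:", "x": 0, "y": 0}, {"text": "Alice", "x": 10, "y": 0},
             {"text": "Date:", "x": 0, "y": 10}, {"text": "2020", "x": 10, "y": 10}]
    pairs = pair_keys(lines)
    ```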
  • Publication number: 20210133438
    Abstract: Interfaces and systems are provided for harvesting ground truth from forms to be used in training models based on key-value pairings in the forms and to later use the trained models to identify related key-value pairings in new forms. Initially, forms are identified and clustered to identify a subset of forms to label with the key-value pairings. Users provide input to identify keys to use in labeling and then select/highlight text from forms that are presented concurrently with the keys in order to associate the highlighted text with the key(s) as the corresponding key-value pairing(s). After labeling the forms with the key-value pairings, the key-value pairing data is used as ground truth for training a model to independently identify the key-value pairing(s) in new forms. Once trained, the model is used to identify the key-value pairing(s) in new forms.
    Type: Application
    Filed: March 26, 2020
    Publication date: May 6, 2021
    Inventors: Dinei Afonso Ferreira Florencio, Yu-Yun Dai, Cha Zhang, Shih Chia Wang
  • Publication number: 20200160086
    Abstract: The disclosed technology is generally directed to optical text recognition for forms. In one example of the technology, line grouping rules are generated based on the generic forms and a ground truth for the generic forms. Line groupings are applied to the generic forms based on the line grouping rules. Feature extraction rules are generated. Features are extracted from the generic forms based on the feature extraction rules. A key-value classifier model is generated, such that the key-value classifier model is configured to determine, for each line of a form: a probability that the line is a value, and a probability that the line is a key. A key-value pairing model is generated, such that the key-value pairing model is configured to predict, for each key in a form, which value in the form corresponds to the key.
    Type: Application
    Filed: May 15, 2019
    Publication date: May 21, 2020
    Inventors: Dinei Afonso Ferreira FLORENCIO, Cha ZHANG, Gil Moshe NAHMIAS, Yu-Yun DAI, Sean Louis GOLDBERG
  • Publication number: 20200151443
    Abstract: The disclosed technology is generally directed to optical character recognition for forms. In one example of the technology, optical character recognition is performed on a plurality of forms. The forms of the plurality of forms include at least one type of form. Anchors are determined for the forms, including corresponding anchors for each type of form of the plurality of forms. Feature rules are determined, including corresponding feature rules for each type of form of the plurality of forms. Features and labels are determined for each form of the plurality of forms. A training model is generated based on a ground truth that includes a plurality of key-value pairs corresponding to the plurality of forms, and further based on the determined features and labels for the plurality of forms.
    Type: Application
    Filed: November 9, 2018
    Publication date: May 14, 2020
    Inventors: Dinei Afonso Ferreira FLORENCIO, Cha ZHANG, Gil Moshe NAHMIAS, Yu-Yun DAI
  • Publication number: 20180330272
    Abstract: A method includes obtaining a first classifier trained on a first dataset having a first dataset class, the first classifier having a plurality of first parameters, obtaining a second dataset having a second dataset class, loading the first parameters into a second classifier, merging a subset of the first dataset class and the second dataset class into a merged class, and training the second classifier using the merged class.
    Type: Application
    Filed: June 7, 2017
    Publication date: November 15, 2018
    Inventors: Yuxiao Hu, Lei Zhang, Christopher Buehler, Cha Zhang, Anna Roth, Cornelia Carapcea
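    The transfer recipe in this abstract (load the first classifier's parameters into a second classifier, merge the shared class, continue training) can be illustrated with a minimal nearest-centroid model. The model family and all names are assumptions; the patent does not prescribe them:

    ```python
    def train(samples):
        """Fit one centroid per class from (features, label) pairs."""
        sums, counts = {}, {}
        for x, y in samples:
            sums.setdefault(y, [0.0] * len(x))
            counts[y] = counts.get(y, 0) + 1
            sums[y] = [a + b for a, b in zip(sums[y], x)]
        return {y: [v / counts[y] for v in s] for y, s in sums.items()}

    def predict(params, x):
        return min(params, key=lambda y: sum((a - b) ** 2 for a, b in zip(params[y], x)))

    def continue_training(params, samples, lr=0.5):
        """Nudge loaded centroids toward new samples; add new classes."""
        out = dict(params)
        for x, y in samples:
            if y in out:  # class shared with the first dataset -> merged
                out[y] = [(1 - lr) * a + lr * b for a, b in zip(out[y], x)]
            else:
                out[y] = list(x)
        return out

    # First classifier trained on the first dataset.
    first = train([([0.0, 0.0], "cat"), ([1.0, 0.0], "dog")])
    # Load its parameters into the second classifier, then keep training on
    # the second dataset; "cat" is the merged class spanning both datasets.
    second = continue_training(first, [([0.0, 1.0], "cat"), ([5.0, 5.0], "truck")])
    ```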
  • Patent number: 9980040
    Abstract: Various examples related to determining a location of an active participant are provided. In one example, image data of a room from an image capture device is received. First audio data from a first microphone array at the image capture device is received. Second audio data from a second microphone array spaced from the image capture device is received. Using a three dimensional model, a location of the second microphone array is determined. Using the first audio data, second audio data, location of the second microphone array, and an angular orientation of the second microphone array, an estimated location of the active participant is determined.
    Type: Grant
    Filed: February 24, 2017
    Date of Patent: May 22, 2018
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Oliver Arthur Whyte, Ross Cutler, Avronil Bhattacharjee, Adarsh Prakash Murthy Kowdle, Adam Kirk, Stanley T. Birchfield, Cha Zhang
  • Patent number: 9959627
    Abstract: A three-dimensional shape parameter computation system and method for computing three-dimensional human head shape parameters from two-dimensional facial feature points. A series of images containing a user's face is captured. Embodiments of the system and method deduce the 3D parameters of the user's head by examining a series of captured images of the user over time and in a variety of head poses and facial expressions, and then computing an average. An energy function is constructed over a batch of frames containing 2D face feature points obtained from the captured images, and the energy function is minimized to solve for the head shape parameters valid for the batch of frames. Head pose parameters and facial expression and animation parameters can vary over each captured image in the batch of frames. In some embodiments this minimization is performed using a modified Gauss-Newton minimization technique using a single iteration.
    Type: Grant
    Filed: May 6, 2015
    Date of Patent: May 1, 2018
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Nikolay Smolyanskiy, Christian F. Huitema, Cha Zhang, Lin Liang, Sean Eron Anderson, Zhengyou Zhang
  • Publication number: 20180096195
    Abstract: Examples are disclosed herein that relate to face detection. One example provides a computing device comprising a logic subsystem and a storage subsystem holding instructions executable by the logic subsystem to receive an image, apply a tile array to the image, the tile array comprising a plurality of tiles, and perform face detection on at least a subset of the tiles, where determining whether or not to perform face detection on a given tile is based on a likelihood that the tile includes at least a portion of a human face.
    Type: Application
    Filed: November 25, 2015
    Publication date: April 5, 2018
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Cristian Canton Ferrer, Stanley T. Birchfield, Adam Kirk, Cha Zhang
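    The tile-gating idea above (run the expensive face detector only on tiles whose face likelihood is high enough) can be sketched as follows. The tile size, the likelihood function, and the threshold are assumptions for illustration:

    ```python
    def tiles(width, height, tile=64):
        """Yield (x, y, w, h) tiles covering the image."""
        for y in range(0, height, tile):
            for x in range(0, width, tile):
                yield (x, y, min(tile, width - x), min(tile, height - y))

    def detect_faces(image, face_likelihood, detector, threshold=0.5):
        hits = []
        for region in tiles(image["width"], image["height"]):
            if face_likelihood(image, region) >= threshold:  # cheap gate
                hits.extend(detector(image, region))          # costly detector
        return hits

    # Stub likelihood and detector showing that only gated tiles are scanned.
    image = {"width": 128, "height": 64}
    likely = lambda img, r: 1.0 if r[0] == 0 else 0.0
    found = detect_faces(image, likely, lambda img, r: [r])
    ```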
  • Patent number: 9756284
    Abstract: The described implementations relate to enhancing images, such as in videoconferencing scenarios. One system includes a poriferous display screen having generally opposing front and back surfaces. This system also includes a camera positioned proximate to the back surface to capture an image through the poriferous display screen.
    Type: Grant
    Filed: August 3, 2015
    Date of Patent: September 5, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Cha Zhang, Timothy A. Large, Zhengyou Zhang, Ruigang Yang
  • Patent number: 9742780
    Abstract: Techniques for automatically connecting to a service controller are described herein. In one example, a service controller device includes a processor and a computer-readable memory storage device storing executable instructions that cause the processor to broadcast at least one of an access credential, connection information or an access credential hash embedded in an audio signal. The processor can also authenticate a client device based on a transmission of at least one of the connection information, the access credential, or the access credential hash from the client device to the client connector and transmit data to the client device in response to authenticating the client device.
    Type: Grant
    Filed: October 5, 2015
    Date of Patent: August 22, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Sasa Junuzovic, Yinpeng Chen, Cha Zhang, Dinei Florencio, Zhengyou Zhang, Alastair Wolman
  • Publication number: 20170201825
    Abstract: Various examples related to determining a location of an active participant are provided. In one example, image data of a room from an image capture device is received. First audio data from a first microphone array at the image capture device is received. Second audio data from a second microphone array spaced from the image capture device is received. Using a three dimensional model, a location of the second microphone array is determined. Using the first audio data, second audio data, location of the second microphone array, and an angular orientation of the second microphone array, an estimated location of the active participant is determined.
    Type: Application
    Filed: February 24, 2017
    Publication date: July 13, 2017
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Oliver Arthur Whyte, Ross Cutler, Avronil Bhattacharjee, Adarsh Prakash Murthy Kowdle, Adam Kirk, Stanley T. Birchfield, Cha Zhang
  • Patent number: 9672416
    Abstract: The description relates to facial tracking. One example can include an orientation structure configured to position the wearable device relative to a user's face. The example can also include a camera secured by the orientation structure parallel to or at a low angle to the user's face to capture images across the user's face. The example can further include a processor configured to receive the images and to map the images to parameters associated with an avatar model.
    Type: Grant
    Filed: April 29, 2014
    Date of Patent: June 6, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Cha Zhang, Zhengyou Zhang, Bernardino Romera Paredes
  • Patent number: 9648346
    Abstract: Multi-view video that is being streamed to a remote device in real time may be encoded. Frames of a real-world scene captured by respective video cameras are received for compression. A virtual viewpoint, positioned relative to the video cameras, is used to determine expected contributions of individual portions of the frames to a synthesized image of the scene from the viewpoint position using the frames. For each frame, compression rates for individual blocks of a frame are computed based on the determined contributions of the individual portions of the frame. The frames are compressed by compressing the blocks of the frames according to their respective determined compression rates. The frames are transmitted in compressed form via a network to a remote device, which is configured to render the scene using the compressed frames.
    Type: Grant
    Filed: June 25, 2009
    Date of Patent: May 9, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Cha Zhang, Dinei A. Florencio
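    The rate-allocation step in this abstract, where blocks contributing more to the synthesized virtual view receive more bits, can be sketched with a simple proportional rule. The linear mapping from contribution to bits is an assumption; the patent only says rates are computed based on determined contributions:

    ```python
    def allocate_rates(contributions, total_bits):
        """Split a bit budget across blocks in proportion to each block's
        expected contribution to the synthesized virtual-viewpoint image."""
        total = sum(contributions)
        if total == 0:
            # No block contributes: fall back to an even split.
            return [total_bits / len(contributions)] * len(contributions)
        return [total_bits * c / total for c in contributions]

    # A block contributing 3x as much gets 3x the bits.
    rates = allocate_rates([1.0, 3.0], 400)
    ```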
  • Patent number: 9621795
    Abstract: Various examples related to determining a location of an active speaker are provided. In one example, image data of a room from an image capture device is received and a three dimensional model is generated. First audio data from a first microphone array at the image capture device is received. Second audio data from a second microphone array laterally spaced from the image capture device is received. Using the three dimensional model, a location of the second microphone array with respect to the image capture device is determined. Using the audio data and the location and angular orientation of the second microphone array, an estimated location of the active speaker is determined. Using the estimated location, a setting for the image capture device is determined and outputted to highlight the active speaker.
    Type: Grant
    Filed: January 8, 2016
    Date of Patent: April 11, 2017
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Oliver Arthur Whyte, Ross Cutler, Avronil Bhattacharjee, Adarsh Prakash Murthy Kowdle, Adam Kirk, Stanley T. Birchfield, Cha Zhang
  • Patent number: 9536046
    Abstract: Described is a technology by which medical patient facial images are acquired and maintained for associating with a patient's records and/or other items. A video camera may provide video frames, such as captured when a patient is being admitted to a hospital. Face detection may be employed to clip the facial part from the frame. Multiple images of a patient's face may be displayed on a user interface to allow selection of a representative image. Also described is obtaining the patient images by processing electronic documents (e.g., patient records) to look for a face pictured therein.
    Type: Grant
    Filed: January 12, 2010
    Date of Patent: January 3, 2017
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Michael Gillam, John Christopher Gillotte, Craig Frederick Feied, Jonathan Alan Handler, Renato Reder Cazangi, Rajesh Kutpadi Hegde, Zhengyou Zhang, Cha Zhang
  • Patent number: 9536312
    Abstract: A depth construction module is described that receives depth images provided by two or more depth capture units. Each depth capture unit generates its depth image using a structured light technique, that is, by projecting a pattern onto an object and receiving a captured image in response thereto. The depth construction module then identifies at least one deficient portion in at least one depth image that has been received, which may be attributed to overlapping projected patterns that impinge the object. The depth construction module then uses a multi-view reconstruction technique, such as a plane sweeping technique, to supply depth information for the deficient portion. In another mode, a multi-view reconstruction technique can be used to produce an entire depth scene based on captured images received from the depth capture units, that is, without first identifying deficient portions in the depth images.
    Type: Grant
    Filed: May 16, 2011
    Date of Patent: January 3, 2017
    Assignee: Microsoft Corporation
    Inventors: Cha Zhang, Wenwu Zhu, Zhengyou Zhang, Philip A. Chou