Patents by Inventor Cha Zhang

Cha Zhang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240071047
    Abstract: The disclosure herein describes generating input key-standard key mappings for a form. A set of input key-value pairs are received, and a subset of candidate form types are determined from a set of form types using the input key-value pairs. A set of standard keys associated with the determined subset of candidate form types are obtained. A set of input key-standard key pairs are generated using the set of input key-value pairs and the obtained set of standard keys, and the set of input key-standard key pairs is narrowed using a narrowing rule. Ranking scores for each input key-standard key pair of the narrowed set of input key-standard key pairs are generated. Each input key of the set of input key-value pairs is mapped to a standard key of the set of standard keys using at least the generated ranking scores of the narrowed set of input key-standard key pairs.
    Type: Application
    Filed: November 29, 2022
    Publication date: February 29, 2024
    Inventors: Souvik KUNDU, Jianwen ZHANG, Kaushik CHAKRABARTI, Yuet CHING, Leon ROMANIUK, Zheng CHEN, Cha ZHANG, Neta HAIBY, Vinod KURPAD, Anatoly Yevgenyevich PONOMAREV
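    The flow described in this abstract (generate candidate pairs, narrow them with a rule, rank, then map each input key) can be sketched roughly as follows. All names are illustrative, and plain string similarity stands in for the trained ranking model, which the abstract does not specify:

    ```python
    from difflib import SequenceMatcher

    def narrow(pairs, min_score=0.3):
        """Narrowing rule (assumed): drop pairs whose score is too low."""
        return [(ik, sk, s) for ik, sk, s in pairs if s >= min_score]

    def map_keys(input_keys, standard_keys):
        # Generate every input-key / standard-key pair with a ranking score
        # (string similarity here is only a stand-in for a learned ranker).
        pairs = [(ik, sk, SequenceMatcher(None, ik.lower(), sk.lower()).ratio())
                 for ik in input_keys for sk in standard_keys]
        pairs = narrow(pairs)
        mapping = {}
        # Assign each input key to its best-scoring surviving standard key.
        for ik, sk, score in sorted(pairs, key=lambda p: -p[2]):
            mapping.setdefault(ik, sk)
        return mapping

    mapping = map_keys(["Invoice No.", "Cust Name"],
                       ["invoice_number", "customer_name", "total"])
    ```

    The greedy highest-score-first assignment is one simple way to realize "mapped using at least the generated ranking scores"; the patent leaves the exact selection procedure open.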
  • Publication number: 20230084845
    Abstract: The disclosure herein describes providing signature data of an input document. Text data of the input document is obtained (e.g., OCR data generated from image data), and a first set of signature fields are identified using signature key-value pairs of the text data. A first subset of signed signature fields and a first subset of unsigned signature fields are determined based on mapping to a set of predicted values. A second set of signature fields are determined using a region prediction model applied to image data of the input document. Region images associated with the first subset of unsigned signature fields and with the second set of signature fields are obtained, and a second set of signed signature fields and a second set of unsigned signature fields are determined using a signature recognition model. Signature output data is provided including signed signature fields and/or unsigned signature fields.
    Type: Application
    Filed: September 13, 2021
    Publication date: March 16, 2023
    Inventors: Yijuan LU, Lynsey LIU, Andrei A. GAIVORONSKI, Yu CHENG, Dinei Afonso Ferreira FLORENCIO, Cha ZHANG, John Richard CORRING
  • Patent number: 11562588
    Abstract: Interfaces and systems are provided for harvesting ground truth from forms to be used in training models based on key-value pairings in the forms and to later use the trained models to identify related key-value pairings in new forms. Initially, forms are identified and clustered to identify a subset of forms to label with the key-value pairings. Users provide input to identify keys to use in labeling and then select/highlight text from forms that are presented concurrently with the keys in order to associate the highlighted text with the key(s) as the corresponding key-value pairing(s). After labeling the forms with the key-value pairings, the key-value pairing data is used as ground truth for training a model to independently identify the key-value pairing(s) in new forms. Once trained, the model is used to identify the key-value pairing(s) in new forms.
    Type: Grant
    Filed: March 26, 2020
    Date of Patent: January 24, 2023
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Dinei Afonso Ferreira Florencio, Yu-Yun Dai, Cha Zhang, Shih Chia Wang
  • Patent number: 11093740
    Abstract: The disclosed technology is generally directed to optical character recognition for forms. In one example of the technology, optical character recognition is performed on a plurality of forms. The forms of the plurality of forms include at least one type of form. Anchors are determined for the forms, including corresponding anchors for each type of form of the plurality of forms. Feature rules are determined, including corresponding feature rules for each type of form of the plurality of forms. Features and labels are determined for each form of the plurality of forms. A training model is generated based on a ground truth that includes a plurality of key-value pairs corresponding to the plurality of forms, and further based on the determined features and labels for the plurality of forms.
    Type: Grant
    Filed: November 9, 2018
    Date of Patent: August 17, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Dinei Afonso Ferreira Florencio, Cha Zhang, Gil Moshe Nahmias, Yu-Yun Dai
  • Patent number: 11055560
    Abstract: The disclosed technology is generally directed to optical text recognition for forms. In one example of the technology, line grouping rules are generated based on the generic forms and a ground truth for the generic forms. Line groupings are applied to the generic forms based on the line grouping rules. Feature extraction rules are generated. Features are extracted from the generic forms based on the feature extraction rules. A key-value classifier model is generated, such that the key-value classifier model is configured to determine, for each line of a form: a probability that the line is a value, and a probability that the line is a key. A key-value pairing model is generated, such that the key-value pairing model is configured to predict, for each key in a form, which value in the form corresponds to the key.
    Type: Grant
    Filed: May 15, 2019
    Date of Patent: July 6, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Dinei Afonso Ferreira Florencio, Cha Zhang, Gil Moshe Nahmias, Yu-Yun Dai, Sean Louis Goldberg
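    The two-model pipeline in this abstract (a key-value classifier that scores each line, then a pairing model that matches keys to values) could be sketched as below. The heuristic classifier and distance-based pairing are toy stand-ins; the patent's models are learned:

    ```python
    def classify(line):
        """Toy stand-in for the key-value classifier: (p_key, p_value)."""
        return (0.9, 0.1) if line["text"].endswith(":") else (0.1, 0.9)

    def pair_keys(lines):
        keys = [l for l in lines if classify(l)[0] > 0.5]
        vals = [l for l in lines if classify(l)[1] > 0.5]
        pairs = {}
        for k in keys:
            # Toy pairing model: choose the spatially closest value line.
            v = min(vals, key=lambda v: (v["x"] - k["x"]) ** 2 + (v["y"] - k["y"]) ** 2)
            pairs[k["text"]] = v["text"]
        return pairs

    lines = [{"text": "Name:", "x": 0, "y": 0}, {"text": "Alice", "x": 10, "y": 0},
             {"text": "Date:", "x": 0, "y": 10}, {"text": "2020", "x": 10, "y": 10}]
    pairs = pair_keys(lines)
    ```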
  • Publication number: 20210133438
    Abstract: Interfaces and systems are provided for harvesting ground truth from forms to be used in training models based on key-value pairings in the forms and to later use the trained models to identify related key-value pairings in new forms. Initially, forms are identified and clustered to identify a subset of forms to label with the key-value pairings. Users provide input to identify keys to use in labeling and then select/highlight text from forms that are presented concurrently with the keys in order to associate the highlighted text with the key(s) as the corresponding key-value pairing(s). After labeling the forms with the key-value pairings, the key-value pairing data is used as ground truth for training a model to independently identify the key-value pairing(s) in new forms. Once trained, the model is used to identify the key-value pairing(s) in new forms.
    Type: Application
    Filed: March 26, 2020
    Publication date: May 6, 2021
    Inventors: Dinei Afonso Ferreira Florencio, Yu-Yun Dai, Cha Zhang, Shih Chia Wang
  • Publication number: 20200160086
    Abstract: The disclosed technology is generally directed to optical text recognition for forms. In one example of the technology, line grouping rules are generated based on the generic forms and a ground truth for the generic forms. Line groupings are applied to the generic forms based on the line grouping rules. Feature extraction rules are generated. Features are extracted from the generic forms based on the feature extraction rules. A key-value classifier model is generated, such that the key-value classifier model is configured to determine, for each line of a form: a probability that the line is a value, and a probability that the line is a key. A key-value pairing model is generated, such that the key-value pairing model is configured to predict, for each key in a form, which value in the form corresponds to the key.
    Type: Application
    Filed: May 15, 2019
    Publication date: May 21, 2020
    Inventors: Dinei Afonso Ferreira FLORENCIO, Cha ZHANG, Gil Moshe NAHMIAS, Yu-Yun DAI, Sean Louis GOLDBERG
  • Publication number: 20200151443
    Abstract: The disclosed technology is generally directed to optical character recognition for forms. In one example of the technology, optical character recognition is performed on a plurality of forms. The forms of the plurality of forms include at least one type of form. Anchors are determined for the forms, including corresponding anchors for each type of form of the plurality of forms. Feature rules are determined, including corresponding feature rules for each type of form of the plurality of forms. Features and labels are determined for each form of the plurality of forms. A training model is generated based on a ground truth that includes a plurality of key-value pairs corresponding to the plurality of forms, and further based on the determined features and labels for the plurality of forms.
    Type: Application
    Filed: November 9, 2018
    Publication date: May 14, 2020
    Inventors: Dinei Afonso Ferreira FLORENCIO, Cha ZHANG, Gil Moshe NAHMIAS, Yu-Yun DAI
  • Publication number: 20180330272
    Abstract: A method includes obtaining a first classifier trained on a first dataset having a first dataset class, the first classifier having a plurality of first parameters, obtaining a second dataset having a second dataset class, loading the first parameters into a second classifier, merging a subset of the first dataset class and the second dataset class into a merged class, and training the second classifier using the merged class.
    Type: Application
    Filed: June 7, 2017
    Publication date: November 15, 2018
    Inventors: Yuxiao Hu, Lei Zhang, Christopher Buehler, Cha Zhang, Anna Roth, Cornelia Carapcea
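    The transfer recipe in this abstract (load the first classifier's parameters into a second classifier, merge the shared class, continue training) can be illustrated with a minimal nearest-centroid model. The model family and all names are assumptions; the patent does not prescribe them:

    ```python
    def train(samples):
        """Fit one centroid per class from (features, label) pairs."""
        sums, counts = {}, {}
        for x, y in samples:
            sums.setdefault(y, [0.0] * len(x))
            counts[y] = counts.get(y, 0) + 1
            sums[y] = [a + b for a, b in zip(sums[y], x)]
        return {y: [v / counts[y] for v in s] for y, s in sums.items()}

    def predict(params, x):
        return min(params, key=lambda y: sum((a - b) ** 2 for a, b in zip(params[y], x)))

    def continue_training(params, samples, lr=0.5):
        """Nudge loaded centroids toward new samples; add new classes."""
        out = dict(params)
        for x, y in samples:
            if y in out:  # class shared with the first dataset -> merged
                out[y] = [(1 - lr) * a + lr * b for a, b in zip(out[y], x)]
            else:
                out[y] = list(x)
        return out

    # First classifier trained on the first dataset.
    first = train([([0.0, 0.0], "cat"), ([1.0, 0.0], "dog")])
    # Load its parameters into the second classifier, then keep training on
    # the second dataset; "cat" is the merged class spanning both datasets.
    second = continue_training(first, [([0.0, 1.0], "cat"), ([5.0, 5.0], "truck")])
    ```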
  • Patent number: 9980040
    Abstract: Various examples related to determining a location of an active participant are provided. In one example, image data of a room from an image capture device is received. First audio data from a first microphone array at the image capture device is received. Second audio data from a second microphone array spaced from the image capture device is received. Using a three dimensional model, a location of the second microphone array is determined. Using the first audio data, second audio data, location of the second microphone array, and an angular orientation of the second microphone array, an estimated location of the active participant is determined.
    Type: Grant
    Filed: February 24, 2017
    Date of Patent: May 22, 2018
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Oliver Arthur Whyte, Ross Cutler, Avronil Bhattacharjee, Adarsh Prakash Murthy Kowdle, Adam Kirk, Stanley T. Birchfield, Cha Zhang
  • Patent number: 9959627
    Abstract: A three-dimensional shape parameter computation system and method for computing three-dimensional human head shape parameters from two-dimensional facial feature points. A series of images containing a user's face is captured. Embodiments of the system and method deduce the 3D parameters of the user's head by examining a series of captured images of the user over time and in a variety of head poses and facial expressions, and then computing an average. An energy function is constructed over a batch of frames containing 2D face feature points obtained from the captured images, and the energy function is minimized to solve for the head shape parameters valid for the batch of frames. Head pose parameters and facial expression and animation parameters can vary over each captured image in the batch of frames. In some embodiments this minimization is performed using a modified Gauss-Newton minimization technique using a single iteration.
    Type: Grant
    Filed: May 6, 2015
    Date of Patent: May 1, 2018
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Nikolay Smolyanskiy, Christian F. Huitema, Cha Zhang, Lin Liang, Sean Eron Anderson, Zhengyou Zhang
  • Publication number: 20180096195
    Abstract: Examples are disclosed herein that relate to face detection. One example provides a computing device comprising a logic subsystem and a storage subsystem holding instructions executable by the logic subsystem to receive an image, apply a tile array to the image, the tile array comprising a plurality of tiles, and perform face detection on at least a subset of the tiles, where determining whether or not to perform face detection on a given tile is based on a likelihood that the tile includes at least a portion of a human face.
    Type: Application
    Filed: November 25, 2015
    Publication date: April 5, 2018
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Cristian Canton Ferrer, Stanley T. Birchfield, Adam Kirk, Cha Zhang
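    The tile-gating idea above (run the expensive face detector only on tiles whose face likelihood is high enough) can be sketched as follows. The tile size, the likelihood function, and the threshold are assumptions for illustration:

    ```python
    def tiles(width, height, tile=64):
        """Yield (x, y, w, h) tiles covering the image."""
        for y in range(0, height, tile):
            for x in range(0, width, tile):
                yield (x, y, min(tile, width - x), min(tile, height - y))

    def detect_faces(image, face_likelihood, detector, threshold=0.5):
        hits = []
        for region in tiles(image["width"], image["height"]):
            if face_likelihood(image, region) >= threshold:  # cheap gate
                hits.extend(detector(image, region))          # costly detector
        return hits

    # Stub likelihood and detector showing that only gated tiles are scanned.
    image = {"width": 128, "height": 64}
    likely = lambda img, r: 1.0 if r[0] == 0 else 0.0
    found = detect_faces(image, likely, lambda img, r: [r])
    ```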
  • Patent number: 9756284
    Abstract: The described implementations relate to enhancing images, such as in videoconferencing scenarios. One system includes a poriferous display screen having generally opposing front and back surfaces. This system also includes a camera positioned proximate to the back surface to capture an image through the poriferous display screen.
    Type: Grant
    Filed: August 3, 2015
    Date of Patent: September 5, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Cha Zhang, Timothy A. Large, Zhengyou Zhang, Ruigang Yang
  • Patent number: 9742780
    Abstract: Techniques for automatically connecting to a service controller are described herein. In one example, a service controller device includes a processor and a computer-readable memory storage device storing executable instructions that cause the processor to broadcast at least one of an access credential, connection information or an access credential hash embedded in an audio signal. The processor can also authenticate a client device based on a transmission of at least one of the connection information, the access credential, or the access credential hash from the client device to the client connector and transmit data to the client device in response to authenticating the client device.
    Type: Grant
    Filed: October 5, 2015
    Date of Patent: August 22, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Sasa Junuzovic, Yinpeng Chen, Cha Zhang, Dinei Florencio, Zhengyou Zhang, Alastair Wolman
  • Publication number: 20170201825
    Abstract: Various examples related to determining a location of an active participant are provided. In one example, image data of a room from an image capture device is received. First audio data from a first microphone array at the image capture device is received. Second audio data from a second microphone array spaced from the image capture device is received. Using a three dimensional model, a location of the second microphone array is determined. Using the first audio data, second audio data, location of the second microphone array, and an angular orientation of the second microphone array, an estimated location of the active participant is determined.
    Type: Application
    Filed: February 24, 2017
    Publication date: July 13, 2017
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Oliver Arthur Whyte, Ross Cutler, Avronil Bhattacharjee, Adarsh Prakash Murthy Kowdle, Adam Kirk, Stanley T. Birchfield, Cha Zhang
  • Patent number: 9672416
    Abstract: The description relates to facial tracking. One example can include an orientation structure configured to position the wearable device relative to a user's face. The example can also include a camera secured by the orientation structure parallel to or at a low angle to the user's face to capture images across the user's face. The example can further include a processor configured to receive the images and to map the images to parameters associated with an avatar model.
    Type: Grant
    Filed: April 29, 2014
    Date of Patent: June 6, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Cha Zhang, Zhengyou Zhang, Bernardino Romera Paredes
  • Patent number: 9648346
    Abstract: Multi-view video that is being streamed to a remote device in real time may be encoded. Frames of a real-world scene captured by respective video cameras are received for compression. A virtual viewpoint, positioned relative to the video cameras, is used to determine expected contributions of individual portions of the frames to a synthesized image of the scene from the viewpoint position using the frames. For each frame, compression rates for individual blocks of a frame are computed based on the determined contributions of the individual portions of the frame. The frames are compressed by compressing the blocks of the frames according to their respective determined compression rates. The frames are transmitted in compressed form via a network to a remote device, which is configured to render the scene using the compressed frames.
    Type: Grant
    Filed: June 25, 2009
    Date of Patent: May 9, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Cha Zhang, Dinei A. Florencio
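    The rate-allocation step in this abstract, where blocks contributing more to the synthesized virtual view receive more bits, can be sketched with a simple proportional rule. The linear mapping from contribution to bits is an assumption; the patent only says rates are computed based on determined contributions:

    ```python
    def allocate_rates(contributions, total_bits):
        """Split a bit budget across blocks in proportion to each block's
        expected contribution to the synthesized virtual-viewpoint image."""
        total = sum(contributions)
        if total == 0:
            # No block contributes: fall back to an even split.
            return [total_bits / len(contributions)] * len(contributions)
        return [total_bits * c / total for c in contributions]

    # A block contributing 3x as much gets 3x the bits.
    rates = allocate_rates([1.0, 3.0], 400)
    ```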
  • Patent number: 9621795
    Abstract: Various examples related to determining a location of an active speaker are provided. In one example, image data of a room from an image capture device is received and a three dimensional model is generated. First audio data from a first microphone array at the image capture device is received. Second audio data from a second microphone array laterally spaced from the image capture device is received. Using the three dimensional model, a location of the second microphone array with respect to the image capture device is determined. Using the audio data and the location and angular orientation of the second microphone array, an estimated location of the active speaker is determined. Using the estimated location, a setting for the image capture device is determined and outputted to highlight the active speaker.
    Type: Grant
    Filed: January 8, 2016
    Date of Patent: April 11, 2017
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Oliver Arthur Whyte, Ross Cutler, Avronil Bhattacharjee, Adarsh Prakash Murthy Kowdle, Adam Kirk, Stanley T. Birchfield, Cha Zhang
  • Patent number: 9536046
    Abstract: Described is a technology by which medical patient facial images are acquired and maintained for associating with a patient's records and/or other items. A video camera may provide video frames, such as captured when a patient is being admitted to a hospital. Face detection may be employed to clip the facial part from the frame. Multiple images of a patient's face may be displayed on a user interface to allow selection of a representative image. Also described is obtaining the patient images by processing electronic documents (e.g., patient records) to look for a face pictured therein.
    Type: Grant
    Filed: January 12, 2010
    Date of Patent: January 3, 2017
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Michael Gillam, John Christopher Gillotte, Craig Frederick Feied, Jonathan Alan Handler, Renato Reder Cazangi, Rajesh Kutpadi Hegde, Zhengyou Zhang, Cha Zhang
  • Patent number: 9536312
    Abstract: A depth construction module is described that receives depth images provided by two or more depth capture units. Each depth capture unit generates its depth image using a structured light technique, that is, by projecting a pattern onto an object and receiving a captured image in response thereto. The depth construction module then identifies at least one deficient portion in at least one depth image that has been received, which may be attributed to overlapping projected patterns that impinge the object. The depth construction module then uses a multi-view reconstruction technique, such as a plane sweeping technique, to supply depth information for the deficient portion. In another mode, a multi-view reconstruction technique can be used to produce an entire depth scene based on captured images received from the depth capture units, that is, without first identifying deficient portions in the depth images.
    Type: Grant
    Filed: May 16, 2011
    Date of Patent: January 3, 2017
    Assignee: Microsoft Corporation
    Inventors: Cha Zhang, Wenwu Zhu, Zhengyou Zhang, Philip A. Chou