Patents by Inventor Mengqing Guo

Mengqing Guo has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240338959
    Abstract: The system generates templates for use in synthetic document generation. Generating a template includes selecting entities to be included in the template and/or selecting characteristics of entities in the template. The system may execute randomization functions to determine which entities, of a candidate set of entities, are to be included in a template. The randomization functions may accept, as input, probabilities associated with the entities to compute the inclusion or exclusion of entities in the template. Different entities may be associated with different corresponding probabilities. The system may execute randomization functions to determine characteristics of entities that are included in a template. The characteristics may be selected from a specific candidate set of characteristics or a range of characteristic values.
    Type: Application
    Filed: April 4, 2023
    Publication date: October 10, 2024
    Applicant: Oracle International Corporation
    Inventors: Karan Dua, Praneet Pabolu, Mengqing Guo
  • Publication number: 20240338958
    Abstract: Techniques are disclosed for optical character recognition of extensible markup language content. A method can include a system generating first training data comprising extensible markup language (XML) content, the first training data comprising a first plurality of training instances, each training instance including a respective image comprising XML content and annotation information for the respective image. The system can train a plurality of machine learning models using the first training data to perform image-based XML content extraction. The system can generate a plurality of trained machine learning models based at least in part on the training.
    Type: Application
    Filed: April 6, 2023
    Publication date: October 10, 2024
    Applicant: Oracle International Corporation
    Inventors: Liyu Gong, Yuying Wang, Mengqing Guo, Tao Sheng, Jun Qian
  • Publication number: 20240221407
    Abstract: Techniques for multi-stage training of a machine learning model to extract key-value pairs from documents are disclosed. In an initial training stage, a system trains a machine learning model using a set of training data including unlabeled documents of various document categories. The initial stage identifies relationships among tokens (words, numbers, and punctuation) in documents. In a second training stage, the system re-trains the machine learning model using a set of training data which includes a particular category of documents while excluding other categories of documents. The second training stage is a supervised machine learning stage in which the training data is labeled to identify key-value pairs in the documents. In the initial training stage, the system sets parameters of the machine learning model to an initial state. In the second stage, the system modifies the parameters of the machine learning model based on the characteristics of the training data set including the documents of the particular category.
    Type: Application
    Filed: January 4, 2023
    Publication date: July 4, 2024
    Applicant: Oracle International Corporation
    Inventors: Yazhe Hu, Jeaff Wang, Mengqing Guo, Tao Sheng, Jun Qian
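
The template-generation abstract above (publication number 20240338959) describes randomization functions that take per-entity probabilities to decide inclusion, and then pick characteristics from either a candidate set or a range of values. The following is a minimal sketch of that general idea, not the method claimed in the application. Every name in it (EntitySpec, generate_template, font_sizes, x_range, and the sample entities) is hypothetical and chosen only for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class EntitySpec:
    """Hypothetical description of one candidate entity (illustration only)."""
    name: str
    inclusion_probability: float  # chance that the entity appears in a template
    font_sizes: tuple             # discrete candidate set for one characteristic
    x_range: tuple                # (low, high) range for another characteristic


def generate_template(specs, rng):
    """Build one template by sampling entities and their characteristics."""
    template = {}
    for spec in specs:
        # Inclusion decision: each entity has its own probability.
        if rng.random() < spec.inclusion_probability:
            template[spec.name] = {
                # Characteristic drawn from a specific candidate set.
                "font_size": rng.choice(spec.font_sizes),
                # Characteristic drawn from a range of values.
                "x": rng.uniform(*spec.x_range),
            }
    return template


# Assumed candidate entities; the probabilities are made-up values.
CANDIDATES = [
    EntitySpec("invoice_number", 0.9, (10, 12, 14), (0.05, 0.30)),
    EntitySpec("company_logo", 0.4, (10,), (0.60, 0.95)),
    EntitySpec("due_date", 0.7, (9, 10, 11), (0.05, 0.50)),
]

if __name__ == "__main__":
    rng = random.Random(7)  # seeded so a run can be repeated
    for _ in range(3):
        print(generate_template(CANDIDATES, rng))
```

Passing a seeded random.Random instance keeps each generated template reproducible, which is a design choice of this sketch rather than something the abstract states.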