Patents by Inventor Li Juan Gao

Li Juan Gao has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11971916
    Abstract: A system and method for table conversion including converting a table containing text in tabular form to an image, labeling each text area of the image with a bounding box, determining for each bounding box, a position information, a semantic information, and an image information, reconstructing the image into a graph form having a plurality of nodes, wherein each node represents the bounding box of the text areas of the image, inputting at least two nodes into a trained neural network to determine a relative relationship between the at least two nodes, building a knowledge graph using the relative relationship of the at least two nodes, and translating the knowledge graph into machine readable natural language.
    Type: Grant
    Filed: November 30, 2021
    Date of Patent: April 30, 2024
    Assignee: International Business Machines Corporation
    Inventors: Zhong Fang Yuan, Tong Liu, Li Juan Gao, Si Heng Sun, Na Liu
  • Patent number: 11972108
    Abstract: A method, computer program product, and computer system for generating and using a basic state layer. N task models are provided (N ≥ 2). Each task model was trained on a same pre-trained backbone model. Each task model includes M feature layers and a task layer (M ≥ 1). Each feature layer of each task model includes a parameter matrix that is different for the different models. An encoder-decoder model is trained. The encoder-decoder model includes sequentially: an input layer, an encoder, M hidden layers, a decoder, and an output layer. The encoder is a neural network that maps and compresses the parameter matrices in the input layer into the M hidden layers, which generates a basic state model. The decoder is a neural network that receives the basic state model as input and generates the output layer to be identical to the input layer.
    Type: Grant
    Filed: November 15, 2021
    Date of Patent: April 30, 2024
    Assignee: International Business Machines Corporation
    Inventors: Zhong Fang Yuan, Tong Liu, Li Juan Gao, Na Liu, Xiang Yu Yang
  • Patent number: 11881042
    Abstract: A system and method for field extraction including determining a key position of a key in an electronic file, isolating candidate key values based on a distance from the key position, selecting a key value from the candidate key values based on an output of a trained neural network, and extracting the key and the key value from the electronic file, regardless of a key-value structure.
    Type: Grant
    Filed: November 30, 2021
    Date of Patent: January 23, 2024
    Assignee: International Business Machines Corporation
    Inventors: Zhong Fang Yuan, Tong Liu, Li Juan Gao, Peng HuangFu, Si Heng Sun, Yi Chen Zhong
  • Publication number: 20240004913
    Abstract: In an approach for using an open source of existing text labeling models to label sentences that need to be clustered with multiple external tags and then to use the tags as auxiliary information to perform the clustering at a dual level, a processor receives a set of text, wherein the set of text contains one or more sentences. A processor tags each sentence of the set of text with one or more tags using a plurality of open-source text classification models. A processor performs a preliminary clustering of one or more nodes under strict conditions using a canopy clustering algorithm.
    Type: Application
    Filed: June 29, 2022
    Publication date: January 4, 2024
    Inventors: Zhong Fang Yuan, Tong Liu, Wen Wang, Li Juan Gao, Xiang Yu Yang
  • Patent number: 11860980
    Abstract: A method and related system detail a split of an architecture of a monolithic application into an architecture of a micro service application. The method receives source code for the monolithic application, and maps the source code into a directed graph. The graph is split into subgraphs and optimized. The method further provides the detailing of the micro service application split, based on the subgraphs.
    Type: Grant
    Filed: January 5, 2022
    Date of Patent: January 2, 2024
    Assignee: International Business Machines Corporation
    Inventors: Li Juan Gao, Zhong Fang Yuan, Chen Gao, Tong Liu
  • Patent number: 11854287
    Abstract: A method, a computer program product, and a computer system compare images for content consistency. The method includes receiving a first image including a first document and a second image including a second document. The method includes performing a visual classification analysis on the first image and the second image. The visual classification analysis generates an overlap of the first image with the second image. The method includes determining whether a region of the overlap is indicative of a content inconsistency. As a result of the region of the overlap being indicative of a content inconsistency, the method includes performing a character recognition analysis on a first area of the first image and a second area of the second image corresponding to the region of the overlap to verify the content inconsistency.
    Type: Grant
    Filed: November 23, 2021
    Date of Patent: December 26, 2023
    Assignee: International Business Machines Corporation
    Inventors: Li Juan Gao, Zhong Fang Yuan, Tong Liu, Ming Xia Shi, Ming Jin Chen
  • Patent number: 11809454
    Abstract: Label-based document classification using artificial intelligence includes collecting, by one or more processors, a plurality of pre-trained classification models into a model pool and a plurality of documents into a document pool. The collected plurality of pre-trained classification models are applied in parallel to the plurality of documents in the document pool to generate a list of labels. Based on the list of labels, a final label result is generated according to which a baseline algorithm for document classification is generated by the one or more processors.
    Type: Grant
    Filed: November 21, 2020
    Date of Patent: November 7, 2023
    Assignee: International Business Machines Corporation
    Inventors: Zhong Fang Yuan, Tong Liu, Li Juan Gao, Ming Jin Chen, Ke Yong Zhang
  • Publication number: 20230244868
    Abstract: An example operation may include one or more of executing a machine learning model on training data, where the training data comprises a plurality of word strings, identifying words within the training data that are extracted by the machine learning model during the executing, determining a color for the machine learning model based on the identified words and a predefined mapping of words to colors, and rendering, via a user interface, a label associated with the machine learning model in the determined color for the machine learning model.
    Type: Application
    Filed: January 31, 2022
    Publication date: August 3, 2023
    Inventors: Zhong Fang Yuan, Tong Liu, Ting Yao Liu, Li Juan Gao, Hai Bo Zou
  • Publication number: 20230229741
    Abstract: A method and related system detail a split of an architecture of a monolithic application into an architecture of a micro service application. The method receives source code for the monolithic application, and maps the source code into a directed graph. The graph is split into subgraphs and optimized. The method further provides the detailing of the micro service application split, based on the subgraphs.
    Type: Application
    Filed: January 5, 2022
    Publication date: July 20, 2023
    Inventors: Li Juan Gao, Zhong Fang Yuan, Chen Gao, Tong Liu
  • Publication number: 20230179410
    Abstract: A method, computer system, and a computer program product for data protection is provided. The present invention may include, generating an encoder network. The present invention may also include, encoding a training data using the generated encoder network, wherein the training data includes natural language data. The present invention may further include, training a deep learning model using the encoded training data.
    Type: Application
    Filed: December 6, 2021
    Publication date: June 8, 2023
    Inventors: Li Juan Gao, Zhong Fang Yuan, Ming Jin Chen, Tong Liu
  • Publication number: 20230169101
    Abstract: A system and method for table conversion including converting a table containing text in tabular form to an image, labeling each text area of the image with a bounding box, determining for each bounding box, a position information, a semantic information, and an image information, reconstructing the image into a graph form having a plurality of nodes, wherein each node represents the bounding box of the text areas of the image, inputting at least two nodes into a trained neural network to determine a relative relationship between the at least two nodes, building a knowledge graph using the relative relationship of the at least two nodes, and translating the knowledge graph into machine readable natural language.
    Type: Application
    Filed: November 30, 2021
    Publication date: June 1, 2023
    Inventors: Zhong Fang Yuan, Tong Liu, Li Juan Gao, Si Heng Sun, Na Liu
  • Publication number: 20230169786
    Abstract: A system and method for field extraction including determining a key position of a key in an electronic file, isolating candidate key values based on a distance from the key position, selecting a key value from the candidate key values based on an output of a trained neural network, and extracting the key and the key value from the electronic file, regardless of a key-value structure.
    Type: Application
    Filed: November 30, 2021
    Publication date: June 1, 2023
    Inventors: Zhong Fang Yuan, Tong Liu, Li Juan Gao, Peng HuangFu, Si Heng Sun, Yi Chen Zhong
  • Publication number: 20230152971
    Abstract: A method, computer program product, and computer system for generating and using a basic state layer. N task models are provided (N ≥ 2). Each task model was trained on a same pre-trained backbone model. Each task model includes M feature layers and a task layer (M ≥ 1). Each feature layer of each task model includes a parameter matrix that is different for the different models. An encoder-decoder model is trained. The encoder-decoder model includes sequentially: an input layer, an encoder, M hidden layers, a decoder, and an output layer. The encoder is a neural network that maps and compresses the parameter matrices in the input layer into the M hidden layers, which generates a basic state model. The decoder is a neural network that receives the basic state model as input and generates the output layer to be identical to the input layer.
    Type: Application
    Filed: November 15, 2021
    Publication date: May 18, 2023
    Inventors: Zhong Fang Yuan, Tong Liu, Li Juan Gao, Na Liu, Xiang Yu Yang
  • Publication number: 20230127907
    Abstract: Embodiments of the present disclosure relate to question answering. A computer-implemented method includes determining a plurality of intention candidates of a user from the user's question; determining a set of entities and attributes associated with the set of entities from the plurality of intention candidates; constructing a decision tree from the set of entities and the attributes associated with the set of entities, wherein each node of the decision tree is associated with a respective one of the attributes and represents a respective subset of the plurality of intention candidates, and wherein the respective subset of the plurality of intention candidates are split based on the entities associated with the respective one of the attributes; and generating a question corresponding to a node of the decision tree to determine the user's intention.
    Type: Application
    Filed: October 22, 2021
    Publication date: April 27, 2023
    Inventors: Zhong Fang Yuan, Tong Liu, Li Juan Gao, Yi Chen Zhong, Hai Bo Zou
  • Publication number: 20230073932
    Abstract: A computer-implemented method, according to one embodiment, includes: receiving an image having characters that correspond to a language, and using a text recognition algorithm to determine a first language believed to correspond to the characters. A first confidence level associated with the first language is also computed, and a determination is made as to whether the first confidence level associated with the first language is outside a predetermined range. In response to determining that the first confidence level associated with the first language is not outside the predetermined range, the first language is output as the given language. The text recognition algorithm is trained using a simple shallow neural network and a generated mixed language corpus. The generated mixed language corpus is formed by: randomly sampling libraries having vocabulary and/or characters therein, and combining the randomly sampled vocabulary and/or characters to form the generated mixed language corpus.
    Type: Application
    Filed: September 7, 2021
    Publication date: March 9, 2023
    Inventors: Zhong Fang Yuan, Tong Liu, Li Juan Gao, Xiang Yu Yang, Qiang He, Yu Pan
  • Patent number: 11501550
    Abstract: A method, system, and computer program product for segmenting and processing documents for optical character recognition is provided. The method includes receiving a document and detecting different types of text data. The document is divided into a plurality of text regions associated with the different types of said text data. Optical noise is removed from each text region and differing optical character recognition software code is selected for application to each text region. The differing optical character recognition software code is executed with respect to each text region resulting in extractable computer readable text located within each said text region.
    Type: Grant
    Filed: November 24, 2020
    Date of Patent: November 15, 2022
    Assignee: International Business Machines Corporation
    Inventors: Zhong Fang Yuan, Yu Pan, Tong Liu, Yi Chen Zhong, Li Juan Gao, Qiong Wu, Dan Dan Wu
  • Publication number: 20220180180
    Abstract: A data-driven model compression technique is introduced that only targets to provide same accuracy as the original (not compressed) model in certain areas by reducing compression parameters. A compression engine relies on backpropagation to determine an extent of parameter value changes and designate certain parameters as key parameters. The model matrix is reshaped according to importance of each neuron. Only randomly generated parameter values of the reshaped parameter matrix are fine tuned to create a reliable compressed neural network model.
    Type: Application
    Filed: December 9, 2020
    Publication date: June 9, 2022
    Inventors: Tong Liu, Zhong Fang Yuan, Kun Yan Yin, He Li, Li Juan Gao
  • Publication number: 20220164370
    Abstract: Label-based document classification using artificial intelligence includes collecting, by one or more processors, a plurality of pre-trained classification models into a model pool and a plurality of documents into a document pool. The collected plurality of pre-trained classification models are applied in parallel to the plurality of documents in the document pool to generate a list of labels. Based on the list of labels, a final label result is generated according to which a baseline algorithm for document classification is generated by the one or more processors.
    Type: Application
    Filed: November 21, 2020
    Publication date: May 26, 2022
    Inventors: Zhong Fang Yuan, Tong Liu, Li Juan Gao, Ming Jin Chen, Ke Yong Zhang
  • Publication number: 20220164572
    Abstract: A method, system, and computer program product for segmenting and processing documents for optical character recognition is provided. The method includes receiving a document and detecting different types of text data. The document is divided into a plurality of text regions associated with the different types of said text data. Optical noise is removed from each text region and differing optical character recognition software code is selected for application to each text region. The differing optical character recognition software code is executed with respect to each text region resulting in extractable computer readable text located within each said text region.
    Type: Application
    Filed: November 24, 2020
    Publication date: May 26, 2022
    Inventors: Zhong Fang Yuan, Yu Pan, Tong Liu, Yi Chen Zhong, Li Juan Gao, Qiong Wu, Dan Dan Wu