Patents by Inventor Li Juan Gao
Li Juan Gao has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11971916
Abstract: A system and method for table conversion including converting a table containing text in tabular form to an image, labeling each text area of the image with a bounding box, determining for each bounding box, a position information, a semantic information, and an image information, reconstructing the image into a graph form having a plurality of nodes, wherein each node represents the bounding box of the text areas of the image, inputting at least two nodes into a trained neural network to determine a relative relationship between the at least two nodes, building a knowledge graph using the relative relationship of the at least two nodes, and translating the knowledge graph into machine readable natural language.
Type: Grant
Filed: November 30, 2021
Date of Patent: April 30, 2024
Assignee: International Business Machines Corporation
Inventors: Zhong Fang Yuan, Tong Liu, Li Juan Gao, Si Heng Sun, Na Liu
-
Patent number: 11972108
Abstract: A method, computer program product, and computer system for generating and using a basic state layer. N task models are provided (N ≥ 2). Each task model was trained on a same pre-trained backbone model. Each task model includes M feature layers and a task layer (M ≥ 1). Each feature layer of each task model includes a parameter matrix that is different for the different models. An encoder-decoder model is trained. The encoder-decoder model includes sequentially: an input layer, an encoder, M hidden layers, a decoder, and an output layer. The encoder is a neural network that maps and compresses the parameter matrices in the input layer into the M hidden layers, which generates a basic state model. The decoder is a neural network that receives the basic state model as input and generates the output layer to be identical to the input layer.
Type: Grant
Filed: November 15, 2021
Date of Patent: April 30, 2024
Assignee: International Business Machines Corporation
Inventors: Zhong Fang Yuan, Tong Liu, Li Juan Gao, Na Liu, Xiang Yu Yang
-
Patent number: 11881042
Abstract: A system and method for field extraction including determining a key position of a key in an electronic file, isolating candidate key values based on a distance from the key position, selecting a key value from the candidate key values based on an output of a trained neural network, and extracting the key and the key value from the electronic file, regardless of a key-value structure.
Type: Grant
Filed: November 30, 2021
Date of Patent: January 23, 2024
Assignee: International Business Machines Corporation
Inventors: Zhong Fang Yuan, Tong Liu, Li Juan Gao, Peng HuangFu, Si Heng Sun, Yi Chen Zhong
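The distance-based candidate step in the abstract above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the patent: the `TextBox` type, the Euclidean distance measure, the `max_distance` cutoff, and the `score` callable, which merely stands in for the trained neural network that the abstract mentions.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class TextBox:
    """Hypothetical text fragment with the position of its anchor point."""
    text: str
    x: float
    y: float


def candidate_values(key, boxes, max_distance):
    """Return the boxes within max_distance of the key box, nearest first."""
    def distance(box):
        return hypot(box.x - key.x, box.y - key.y)

    near = [b for b in boxes if b is not key and distance(b) <= max_distance]
    return sorted(near, key=distance)


def select_value(key, candidates, score):
    """Pick the highest-scoring candidate, or None if there are none.

    `score(key, candidate)` is a placeholder for a trained model's output.
    """
    return max(candidates, key=lambda b: score(key, b), default=None)


if __name__ == "__main__":
    boxes = [
        TextBox("Invoice No.", 10, 10),
        TextBox("12345", 60, 10),
        TextBox("Total", 10, 200),
    ]
    key = boxes[0]
    # Toy scoring rule (an assumption): prefer purely numeric text.
    best = select_value(key, candidate_values(key, boxes, 80),
                        lambda k, b: 1.0 if b.text.isdigit() else 0.0)
    print(key.text, "->", best.text if best else None)
```

Because candidates are chosen by proximity rather than by a fixed layout rule, the same two functions apply whether the value sits beside or below its key.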
-
Publication number: 20240004913
Abstract: In an approach for using an open source of existing text labeling models to label sentences that need to be clustered with multiple external tags and then to use the tags as auxiliary information to perform the clustering at a dual level, a processor receives a set of text, wherein the set of text contains one or more sentences. A processor tags each sentence of the set of text with one or more tags using a plurality of open-source text classification models. A processor performs a preliminary clustering of one or more nodes under strict conditions using a canopy clustering algorithm.
Type: Application
Filed: June 29, 2022
Publication date: January 4, 2024
Inventors: Zhong Fang Yuan, Tong Liu, Wen Wang, Li Juan Gao, Xiang Yu Yang
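For readers unfamiliar with canopy clustering, which the abstract above names as its preliminary clustering step, the following is a minimal sketch of the generic textbook algorithm. It is not the application's implementation: the point representation (numeric tuples), the Euclidean metric, and the threshold names `t1` (loose) and `t2` (tight) are assumptions chosen for illustration.

```python
from math import dist


def canopy_clusters(points, t1, t2):
    """Group points into possibly overlapping canopies.

    t1 is the loose threshold (canopy membership) and t2 the tight
    threshold (removal from further consideration); t1 must exceed t2.
    Returns a list of (center, members) pairs.
    """
    if t1 <= t2:
        raise ValueError("t1 must be greater than t2")

    remaining = list(points)
    canopies = []
    while remaining:
        center = remaining.pop(0)
        # Every point within the loose threshold joins this canopy.
        members = [center] + [p for p in remaining if dist(center, p) < t1]
        canopies.append((center, members))
        # Points within the tight threshold cannot seed or join later canopies.
        remaining = [p for p in remaining if dist(center, p) >= t2]
    return canopies


if __name__ == "__main__":
    for center, members in canopy_clusters(
            [(0, 0), (1, 0), (5, 0), (6, 0), (3, 0)], t1=3.5, t2=1.5):
        print(center, members)
```

A point that lies between the two thresholds can belong to more than one canopy, which is why canopy clustering is commonly used as a cheap first pass before a stricter second stage.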
-
Patent number: 11860980
Abstract: A method and related system detail a split of an architecture of a monolithic application into an architecture of a micro service application. The method receives source code for the monolithic application, and maps the source code into a directed graph. The graph is split into subgraphs and optimized. The method further provides the detailing of the micro service application split, based on the subgraphs.
Type: Grant
Filed: January 5, 2022
Date of Patent: January 2, 2024
Assignee: International Business Machines Corporation
Inventors: Li Juan Gao, Zhong Fang Yuan, Chen Gao, Tong Liu
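The abstract above does not state how the directed graph is split or optimized. As one simple illustration of the general idea, the sketch below treats a dependency graph as a list of (caller, callee) edges and splits it into weakly connected components, so that code units with no dependency path between them land in different subgraphs. The splitting criterion, the function name, and the example class names are all assumptions, not the patented method.

```python
from collections import defaultdict


def weakly_connected_components(edges, nodes=()):
    """Split a directed graph into weakly connected components.

    `edges` is an iterable of (source, target) pairs; `nodes` may list
    additional isolated nodes. Edge direction is ignored for the split.
    """
    neighbors = defaultdict(set)
    for node in nodes:
        neighbors[node]  # register isolated nodes
    for src, dst in edges:
        neighbors[src].add(dst)
        neighbors[dst].add(src)

    seen = set()
    components = []
    for start in neighbors:
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:  # iterative depth-first traversal
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        components.append(component)
    return components


if __name__ == "__main__":
    # Hypothetical call edges extracted from a monolith's source code.
    calls = [
        ("OrderController", "OrderService"),
        ("OrderService", "OrderRepo"),
        ("UserController", "UserService"),
    ]
    for component in weakly_connected_components(calls):
        print(sorted(component))
```

Real monoliths are rarely this cleanly separable; a practical split would also need some way to cut edges between tightly coupled groups, which this sketch does not attempt.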
-
Patent number: 11854287
Abstract: A method, a computer program product, and a computer system compare images for content consistency. The method includes receiving a first image including a first document and a second image including a second document. The method includes performing a visual classification analysis on the first image and the second image. The visual classification analysis generates an overlap of the first image with the second image. The method includes determining whether a region of the overlap is indicative of a content inconsistency. As a result of the region of the overlap being indicative of a content inconsistency, the method includes performing a character recognition analysis on a first area of the first image and a second area of the second image corresponding to the region of the overlap to verify the content inconsistency.
Type: Grant
Filed: November 23, 2021
Date of Patent: December 26, 2023
Assignee: International Business Machines Corporation
Inventors: Li Juan Gao, Zhong Fang Yuan, Tong Liu, Ming Xia Shi, Ming Jin Chen
-
Patent number: 11809454
Abstract: Label-based document classification using artificial intelligence includes collecting, by one or more processors, a plurality of pre-trained classification models into a model pool and a plurality of documents into a document pool. The collected plurality of pre-trained classification models are applied in parallel to the plurality of documents in the document pool to generate a list of labels. Based on the list of labels, a final label result is generated according to which a baseline algorithm for document classification is generated by the one or more processors.
Type: Grant
Filed: November 21, 2020
Date of Patent: November 7, 2023
Assignee: International Business Machines Corporation
Inventors: Zhong Fang Yuan, Tong Liu, Li Juan Gao, Ming Jin Chen, Ke Yong Zhang
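The model-pool idea in the abstract above can be sketched as follows. This is an illustration under stated assumptions only: each "model" is a plain Python callable that maps a document string to a label, the pool is run with a thread pool as one possible reading of "in parallel", and the final label is chosen by majority vote, which the abstract does not specify.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def label_documents(models, documents):
    """Apply every model to every document; return one label list per document."""
    def run(model):
        return [model(doc) for doc in documents]

    # One task per model; pool.map preserves the order of `models`.
    with ThreadPoolExecutor() as pool:
        per_model = list(pool.map(run, models))

    # Transpose from per-model rows to per-document rows.
    return [list(labels) for labels in zip(*per_model)]


def final_label(labels):
    """Majority vote; ties go to the label that appears first in `labels`."""
    return Counter(labels).most_common(1)[0][0]


if __name__ == "__main__":
    # Toy keyword-based stand-ins for pre-trained classifiers.
    models = [
        lambda d: "finance" if "invoice" in d else "general",
        lambda d: "finance" if "total" in d else "general",
        lambda d: "general",
    ]
    docs = ["invoice total due", "meeting agenda"]
    for doc, labels in zip(docs, label_documents(models, docs)):
        print(doc, labels, final_label(labels))
```

The voted labels could then serve as training targets for a separate classifier, which is one way to read the abstract's "baseline algorithm", though the abstract leaves that step open.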
-
Publication number: 20230244868
Abstract: An example operation may include one or more of executing a machine learning model on training data, where the training data comprises a plurality of word strings, identifying words within the training data that are extracted by the machine learning model during the executing, determining a color for the machine learning model based on the identified words and a predefined mapping of words to colors, and rendering, via a user interface, a label associated with the machine learning model in the determined color for the machine learning model.
Type: Application
Filed: January 31, 2022
Publication date: August 3, 2023
Inventors: Zhong Fang Yuan, Tong Liu, Ting Yao Liu, Li Juan Gao, Hai Bo Zou
-
Publication number: 20230229741
Abstract: A method and related system detail a split of an architecture of a monolithic application into an architecture of a micro service application. The method receives source code for the monolithic application, and maps the source code into a directed graph. The graph is split into subgraphs and optimized. The method further provides the detailing of the micro service application split, based on the subgraphs.
Type: Application
Filed: January 5, 2022
Publication date: July 20, 2023
Inventors: Li Juan GAO, Zhong Fang YUAN, Chen GAO, Tong LIU
-
Publication number: 20230179410
Abstract: A method, computer system, and a computer program product for data protection is provided. The present invention may include, generating an encoder network. The present invention may also include, encoding a training data using the generated encoder network, wherein the training data includes natural language data. The present invention may further include, training a deep learning model using the encoded training data.
Type: Application
Filed: December 6, 2021
Publication date: June 8, 2023
Inventors: Li Juan Gao, Zhong Fang Yuan, Ming Jin Chen, Tong Liu
-
Publication number: 20230169101
Abstract: A system and method for table conversion including converting a table containing text in tabular form to an image, labeling each text area of the image with a bounding box, determining for each bounding box, a position information, a semantic information, and an image information, reconstructing the image into a graph form having a plurality of nodes, wherein each node represents the bounding box of the text areas of the image, inputting at least two nodes into a trained neural network to determine a relative relationship between the at least two nodes, building a knowledge graph using the relative relationship of the at least two nodes, and translating the knowledge graph into machine readable natural language.
Type: Application
Filed: November 30, 2021
Publication date: June 1, 2023
Inventors: Zhong Fang Yuan, Tong Liu, Li Juan Gao, Si Heng Sun, Na Liu
-
Publication number: 20230169786
Abstract: A system and method for field extraction including determining a key position of a key in an electronic file, isolating candidate key values based on a distance from the key position, selecting a key value from the candidate key values based on an output of a trained neural network, and extracting the key and the key value from the electronic file, regardless of a key-value structure.
Type: Application
Filed: November 30, 2021
Publication date: June 1, 2023
Inventors: Zhong Fang Yuan, Tong Liu, Li Juan Gao, Peng HuangFu, Si Heng Sun, Yi Chen Zhong
-
Publication number: 20230152971
Abstract: A method, computer program product, and computer system for generating and using a basic state layer. N task models are provided (N ≥ 2). Each task model was trained on a same pre-trained backbone model. Each task model includes M feature layers and a task layer (M ≥ 1). Each feature layer of each task model includes a parameter matrix that is different for the different models. An encoder-decoder model is trained. The encoder-decoder model includes sequentially: an input layer, an encoder, M hidden layers, a decoder, and an output layer. The encoder is a neural network that maps and compresses the parameter matrices in the input layer into the M hidden layers, which generates a basic state model. The decoder is a neural network that receives the basic state model as input and generates the output layer to be identical to the input layer.
Type: Application
Filed: November 15, 2021
Publication date: May 18, 2023
Inventors: Zhong Fang Yuan, Tong Liu, Li Juan Gao, Na Liu, Xiang Yu Yang
-
Publication number: 20230127907
Abstract: Embodiments of the present disclosure relate to question answering. A computer-implemented method includes determining a plurality of intention candidates of a user from the user's question; determining a set of entities and attributes associated with the set of entities from the plurality of intention candidates; constructing a decision tree from the set of entities and the attributes associated with the set of entities, wherein each node of the decision tree is associated with a respective one of the attributes and represents a respective subset of the plurality of intention candidates, and wherein the respective subset of the plurality of intention candidates are split based on the entities associated with the respective one of the attributes; and generating a question corresponding to a node of the decision tree to determine the user's intention.
Type: Application
Filed: October 22, 2021
Publication date: April 27, 2023
Inventors: Zhong Fang Yuan, Tong Liu, Li Juan Gao, Yi Chen Zhong, Hai Bo Zou
-
Publication number: 20230073932
Abstract: A computer-implemented method, according to one embodiment, includes: receiving an image having characters that correspond to a language, and using a text recognition algorithm to determine a first language believed to correspond to the characters. A first confidence level associated with the first language is also computed, and a determination is made as to whether the first confidence level associated with the first language is outside a predetermined range. In response to determining that the first confidence level associated with the first language is not outside the predetermined range, the first language is output as the given language. The text recognition algorithm is trained using a simple shallow neural network and a generated mixed language corpus. The generated mixed language corpus is formed by: randomly sampling libraries having vocabulary and/or characters therein, and combining the randomly sampled vocabulary and/or characters to form the generated mixed language corpus.
Type: Application
Filed: September 7, 2021
Publication date: March 9, 2023
Inventors: Zhong Fang Yuan, Tong Liu, Li Juan Gao, Xiang Yu Yang, Qiang He, Yu Pan
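The corpus-generation step described at the end of the abstract above (randomly sampling vocabulary libraries and combining the samples) might look roughly like the sketch below. The library structure (a dict of language code to word list), the sample sizes, the space-joined output, and the function name are assumptions for illustration; the abstract does not describe these details.

```python
import random


def mixed_language_corpus(libraries, n_samples, tokens_per_sample, seed=None):
    """Build synthetic mixed-language strings by random sampling.

    `libraries` maps a language name to a non-empty list of tokens.
    Each output string joins tokens drawn from randomly chosen libraries.
    """
    rng = random.Random(seed)  # local generator keeps runs reproducible
    names = sorted(libraries)  # fixed order so a seed gives stable output
    corpus = []
    for _ in range(n_samples):
        tokens = [
            rng.choice(libraries[rng.choice(names)])
            for _ in range(tokens_per_sample)
        ]
        corpus.append(" ".join(tokens))
    return corpus


if __name__ == "__main__":
    libs = {
        "en": ["hello", "world"],
        "fr": ["bonjour", "monde"],
        "zh": ["你好", "世界"],
    }
    for line in mixed_language_corpus(libs, n_samples=3, tokens_per_sample=4, seed=0):
        print(line)
```

Passing a seed makes the generated corpus repeatable, which helps when comparing training runs.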
-
Patent number: 11501550
Abstract: A method, system, and computer program product for segmenting and processing documents for optical character recognition is provided. The method includes receiving a document and detecting different types of text data. The document is divided into a plurality of text regions associated with the different types of said text data. Optical noise is removed from each text region and differing optical character recognition software code is selected for application to each text region. The differing optical character recognition software code is executed with respect to each text region resulting in extractable computer readable text located within each said text region.
Type: Grant
Filed: November 24, 2020
Date of Patent: November 15, 2022
Assignee: International Business Machines Corporation
Inventors: Zhong Fang Yuan, Yu Pan, Tong Liu, Yi Chen Zhong, Li Juan Gao, Qiong Wu, Dan Dan Wu
-
Publication number: 20220180180
Abstract: A data-driven model compression technique is introduced that only targets to provide same accuracy as the original (not compressed) model in certain areas by reducing compression parameters. A compression engine relies on backpropagation to determine an extent of parameter value changes and designate certain parameters as key parameters. The model matrix is reshaped according to importance of each neuron. Only randomly generated parameter values of the reshaped parameter matrix are fine tuned to create a reliable compressed neural network model.
Type: Application
Filed: December 9, 2020
Publication date: June 9, 2022
Inventors: Tong Liu, Zhong Fang Yuan, Kun Yan Yin, He Li, Li Juan Gao
-
Publication number: 20220164370
Abstract: Label-based document classification using artificial intelligence includes collecting, by one or more processors, a plurality of pre-trained classification models into a model pool and a plurality of documents into a document pool. The collected plurality of pre-trained classification models are applied in parallel to the plurality of documents in the document pool to generate a list of labels. Based on the list of labels, a final label result is generated according to which a baseline algorithm for document classification is generated by the one or more processors.
Type: Application
Filed: November 21, 2020
Publication date: May 26, 2022
Inventors: Zhong Fang Yuan, Tong Liu, Li Juan Gao, Ming Jin Chen, Ke Yong Zhang
-
Publication number: 20220164572
Abstract: A method, system, and computer program product for segmenting and processing documents for optical character recognition is provided. The method includes receiving a document and detecting different types of text data. The document is divided into a plurality of text regions associated with the different types of said text data. Optical noise is removed from each text region and differing optical character recognition software code is selected for application to each text region. The differing optical character recognition software code is executed with respect to each text region resulting in extractable computer readable text located within each said text region.
Type: Application
Filed: November 24, 2020
Publication date: May 26, 2022
Inventors: Zhong Fang Yuan, Yu Pan, Tong Liu, Yi Chen Zhong, Li Juan Gao, Qiong Wu, Dan Dan Wu