Patents by Inventor Zhong Fang Yuan

Zhong Fang Yuan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11977990
    Abstract: A first set of features associated with a neural network are parameterized. A decision tree is generated from the first set of features. One or more adjustments for the neural network are received at the decision tree. A second set of features associated with the adjustments at the decision tree are parameterized. The parameterized first and second set of features are combined into a plurality of parameters. From the plurality, an adjusted neural network is generated.
    Type: Grant
    Filed: April 28, 2020
    Date of Patent: May 7, 2024
    Assignee: International Business Machines Corporation
    Inventors: Zhong Fang Yuan, De Shuo Kong, Yun He Gao, Tong Liu, Peng Yun Sun, Ya Dong Li
  • Publication number: 20240144041
    Abstract: One or more systems, devices, computer program products and/or computer-implemented methods of use provided herein relate to a process to facilitate abnormal document self-discovery. A system can comprise a memory that stores computer executable components, and a processor that executes the computer executable components stored in the memory, wherein the computer executable components can comprise an object detection component that can generate a knowledge graph with vectors corresponding with nodes representative of a page layout of a document. Additionally, the computer executable components can comprise an evaluation component that can compare the vectors to identify whether an edge is present between corresponding vectors of the knowledge graph; an encoder component that can re-code the knowledge graph; and a comparison component that can compare a structure of the knowledge graph with one or more other knowledge graphs corresponding to one or more other documents to determine if the document is abnormal.
    Type: Application
    Filed: November 1, 2022
    Publication date: May 2, 2024
    Inventors: Zhong Fang Yuan, Fei Wang, Kun Yan Yin, Tong Liu, Yue Liu
  • Patent number: 11971916
    Abstract: A system and method for table conversion including converting a table containing text in tabular form to an image, labeling each text area of the image with a bounding box, determining for each bounding box, a position information, a semantic information, and an image information, reconstructing the image into a graph form having a plurality of nodes, wherein each node represents the bounding box of the text areas of the image, inputting at least two nodes into a trained neural network to determine a relative relationship between the at least two nodes, building a knowledge graph using the relative relationship of the at least two nodes, and translating the knowledge graph into machine readable natural language.
    Type: Grant
    Filed: November 30, 2021
    Date of Patent: April 30, 2024
    Assignee: International Business Machines Corporation
    Inventors: Zhong Fang Yuan, Tong Liu, Li Juan Gao, Si Heng Sun, Na Liu
  • Patent number: 11972108
    Abstract: A method, computer program product, and computer system for generating and using a basic state layer. N task models are provided (N≥2). Each task model was trained on a same pre-trained backbone model. Each task model includes M feature layers and a task layer (M≥1). Each feature layer of each task model includes a parameter matrix that is different for the different models. An encoder-decoder model is trained. The encoder-decoder model includes sequentially: an input layer, an encoder, M hidden layers, a decoder, and an output layer. The encoder is a neural network that maps and compresses the parameter matrices in the input layer into the M hidden layers, which generates a basic state model. The decoder is a neural network that receives the basic state model as input and generates the output layer to be identical to the input layer.
    Type: Grant
    Filed: November 15, 2021
    Date of Patent: April 30, 2024
    Assignee: International Business Machines Corporation
    Inventors: Zhong Fang Yuan, Tong Liu, Li Juan Gao, Na Liu, Xiang Yu Yang
  • Patent number: 11972525
    Abstract: An example operation may include one or more of generating a three-dimensional (3D) model of an object via execution of a machine learning model on one or more images of the object, capturing a plurality of snapshots of the 3D model of the object at different angles to generate a plurality of snapshot images of the object, fusing a feature into each of the plurality of snapshots to generate a plurality of fused snapshots of the 3D model of the object, and storing the plurality of fused snapshots of the 3D model of the object in memory.
    Type: Grant
    Filed: February 21, 2022
    Date of Patent: April 30, 2024
    Assignee: International Business Machines Corporation
    Inventors: Kun Yan Yin, Zhong Fang Yuan, Yi Chen Zhong, Lu Yu, Tong Liu
  • Publication number: 20240129582
    Abstract: Using labelled training content, a content classification model is trained. Using the trained content classification model, a label describing a first content is determined. The first content is classified into a category in a set of categories using the label. Responsive to the first content being classified into a category of inappropriate content, the first content is removed from a storage location.
    Type: Application
    Filed: October 17, 2022
    Publication date: April 18, 2024
    Applicant: International Business Machines Corporation
    Inventors: Si Tong Zhao, Zhong Fang Yuan, Tong Liu, Yi Chen Zhong, Yuan Yuan Ding
  • Publication number: 20240119275
    Abstract: A method for contrastive learning by selecting dropout ratios and locations based on reinforcement learning includes receiving training data having a positive sample corresponding to a target and negative samples not corresponding to the target. A dropout policy for a neural network is produced based on the training data, where the dropout policy identifies at least one connection between neurons in the neural network to drop out. The training data is encoded, based on the dropout policy, to form embeddings, where the embeddings include multiple positive sample embeddings corresponding to the positive sample and multiple negative sample embeddings corresponding to the negative samples.
    Type: Application
    Filed: September 28, 2022
    Publication date: April 11, 2024
    Inventors: Zhong Fang Yuan, Si Tong Zhao, Tong Liu, Yi Chen Zhong, Yuan Yuan Ding, Hai Bo Zou
  • Publication number: 20240096121
    Abstract: Provided are a computer program product, system, and method for training and using a vector encoder to determine vectors for sub-images of text in an image to subject to optical character recognition. A vector encoder is trained to encode images representing text into vectors in a vector space. Vectors of images representing similar text have a high degree of cohesion in the vector space. Vectors of images representing dissimilar text have a low degree of cohesion in the vector space. An input image is processed to determine sub-images of the input image that bound text represented in the input image. The sub-images are inputted to the vector encoder to output sub-image vectors. The vector encoder generates a search vector for search text. Optical character recognition is applied to at least one region of the input image including the sub-images having sub-image vectors matching the search vector.
    Type: Application
    Filed: September 15, 2022
    Publication date: March 21, 2024
    Inventors: Zhong Fang Yuan, Tong Liu, Yi Chen Zhong, Xiang Yu Yang, Guan Chao Li
  • Publication number: 20240062570
    Abstract: A computer-implemented method, system and computer program product for detecting Unicode injection in text. A language model is trained to determine if text data (e.g., text fragment) conforms with human writing habits using negative and positive samples. Negative samples include samples of text that are not classified as being suspect for containing Unicode characters. Such negative samples include text written by humans. Positive samples include samples of text that are to be classified as being suspect for containing Unicode characters. Such positive samples may be formed by randomly inserting Unicode characters into the corpus of negative samples. After training the language model, the language model is able to determine whether the received text data (e.g., text fragment) is suspect for containing Unicode characters based on whether the text data conforms with human writing habits.
    Type: Application
    Filed: August 19, 2022
    Publication date: February 22, 2024
    Inventors: Zhong Fang Yuan, Tong Liu, Ting Ting Cao, Hai Bo Zou, Xiang Yu Yang
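    The sample-construction step in this abstract (positive samples formed by randomly inserting characters into human-written negative samples) can be illustrated with a minimal sketch. The character pool `INJECTED_CHARS`, the function names, and the label encoding (0 for negative, 1 for positive) are assumptions made for illustration and are not taken from the application.

```python
import random

# Hypothetical pool of invisible or direction-control characters.
# The abstract does not say which characters are injected.
INJECTED_CHARS = ["\u200b", "\u200d", "\u202e", "\ufeff"]


def make_positive_sample(text: str, n_insertions: int, rng: random.Random) -> str:
    """Return a copy of `text` with characters inserted at random positions."""
    chars = list(text)
    for _ in range(n_insertions):
        # randint is inclusive on both ends, so appending at the end is possible.
        pos = rng.randint(0, len(chars))
        chars.insert(pos, rng.choice(INJECTED_CHARS))
    return "".join(chars)


def build_training_set(human_texts, n_insertions=2, seed=0):
    """Pair each human-written text (label 0) with an injected variant (label 1)."""
    rng = random.Random(seed)
    negatives = [(t, 0) for t in human_texts]
    positives = [(make_positive_sample(t, n_insertions, rng), 1) for t in human_texts]
    return negatives + positives
```

    The labelled pairs returned by `build_training_set` would then be fed to whatever language model is being trained; that training step is outside the scope of this sketch.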
  • Patent number: 11881042
    Abstract: A system and method for field extraction including determining a key position of a key in an electronic file, isolating candidate key values based on a distance from the key position, selecting a key value from the candidate key values based on an output of a trained neural network, and extracting the key and the key value from the electronic file, regardless of a key-value structure.
    Type: Grant
    Filed: November 30, 2021
    Date of Patent: January 23, 2024
    Assignee: International Business Machines Corporation
    Inventors: Zhong Fang Yuan, Tong Liu, Li Juan Gao, Peng HuangFu, Si Heng Sun, Yi Chen Zhong
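    The two-stage selection described in this abstract (isolate candidates by distance from the key, then pick one using a model's output) might look roughly like the following. The `TextBox` type, the Euclidean distance measure, and the `score` callable standing in for the trained neural network are all assumptions for illustration; the patent does not specify them here.

```python
import math
from dataclasses import dataclass


@dataclass
class TextBox:
    """A piece of text with a position in the file (hypothetical representation)."""
    text: str
    x: float
    y: float


def candidate_values(key_box, boxes, max_distance):
    """Return boxes within `max_distance` of the key, nearest first."""
    found = []
    for box in boxes:
        if box is key_box:
            continue
        dist = math.hypot(box.x - key_box.x, box.y - key_box.y)
        if dist <= max_distance:
            found.append((dist, box))
    found.sort(key=lambda pair: pair[0])
    return [box for _, box in found]


def select_value(key_box, candidates, score):
    """Pick the candidate with the highest score.

    `score(key_box, candidate)` is a stand-in for the trained network's output.
    Returns None when there are no candidates.
    """
    return max(candidates, key=lambda box: score(key_box, box), default=None)
```

    Because candidates are chosen by position alone, the sketch does not depend on the value sitting to the right of or below the key, which matches the abstract's "regardless of a key-value structure".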
  • Patent number: 11875793
    Abstract: A system, method, and computer program product for implementing cognitive natural language processing software framework optimization is provided. The method includes receiving instructions associated with an audible user input of a user. An AI input intention of the user is determined and key information is extracted from the audible user input. The key information is inputted into a generated database table and additional key information is retrieved from a dialog table. A supplementary database table comprising the additional key information is generated and the key information is spliced with the additional key information. A resulting spliced data structure is merged into a final database table and natural language is converted into a request code structure within an SQL structure and an interactive AI interface presenting results of the converting is generated. Operational functionality of an AI device is enabled for audibly presenting results of the conversion.
    Type: Grant
    Filed: September 7, 2021
    Date of Patent: January 16, 2024
    Assignee: International Business Machines Corporation
    Inventors: Zhong Fang Yuan, Tong Liu, De Shuo Kong, Yao Chen, Hai Bo Zou, Sarbajit K. Rakshit, Zheng Jie
  • Publication number: 20240004913
    Abstract: In an approach for using an open source of existing text labeling models to label sentences that need to be clustered with multiple external tags and then to use the tags as auxiliary information to perform the clustering at a dual level, a processor receives a set of text, wherein the set of text contains one or more sentences. A processor tags each sentence of the set of text with one or more tags using a plurality of open-source text classification models. A processor performs a preliminary clustering of one or more nodes under strict conditions using a canopy clustering algorithm.
    Type: Application
    Filed: June 29, 2022
    Publication date: January 4, 2024
    Inventors: Zhong Fang Yuan, Tong Liu, Wen Wang, Li Juan Gao, Xiang Yu Yang
  • Patent number: 11860980
    Abstract: A method and related system detail a split of an architecture of a monolithic application into an architecture of a micro service application. The method receives source code for the monolithic application, and maps the source code into a directed graph. The graph is split into subgraphs and optimized. The method further provides the detailing of the micro service application split, based on the subgraphs.
    Type: Grant
    Filed: January 5, 2022
    Date of Patent: January 2, 2024
    Assignee: International Business Machines Corporation
    Inventors: Li Juan Gao, Zhong Fang Yuan, Chen Gao, Tong Liu
  • Publication number: 20230419077
    Abstract: A computer-implemented process for modifying a training dataset includes the following operations. The training dataset is benchmarked using a State Of The Art (SOTA) neural network to determine a benchmark for the training dataset. The training set is divided into a plurality of slices. A sequence of a plurality of atomic operations are selected using a selection strategy generator operating on one of the plurality of slices. The sequence of the plurality of atomic operations is applied to modify the one of the plurality of slices to generate a revised one of the plurality of slices. Reverse reinforcement learning is performed on the revised one of the plurality of slices using the benchmark and the SOTA neural network. The training dataset is modified by replacing the one of the plurality of slices with the revised one of the plurality of slices to generate a modified training dataset.
    Type: Application
    Filed: June 23, 2022
    Publication date: December 28, 2023
    Inventors: Zhong Fang Yuan, Tong Liu, Wen Wang, Hai Bo Zou, Xiang Yu Yang
  • Publication number: 20230419163
    Abstract: Training data models using machine learning can include training a computer data model of data distribution using a training data set. The training data set includes training data and additional training data, and the training data and the additional training data being represented by layers of data representing the data distribution of the training data set. The computer data model using the additional training data is iteratively trained for each of the layers of the training data set. Statistical noise is added randomly to each of the layers of the training data set. Data variations are detected in each of the layers of the additional training data. The data variations are diluted in each of the additional layers of the training data, and the computer data model is retrained for the training data set using the diluted data variations in each of the layers of the additional training data.
    Type: Application
    Filed: June 28, 2022
    Publication date: December 28, 2023
    Inventors: Zhong Fang Yuan, Tong Liu, Wen Wang, Xiang Yu Yang, Cheng Gang Hu
  • Patent number: 11854287
    Abstract: A method, a computer program product, and a computer system compare images for content consistency. The method includes receiving a first image including a first document and a second image including a second document. The method includes performing a visual classification analysis on the first image and the second image. The visual classification analysis generates an overlap of the first image with the second image. The method includes determining whether a region of the overlap is indicative of a content inconsistency. As a result of the region of the overlap being indicative of a content inconsistency, the method includes performing a character recognition analysis on a first area of the first image and a second area of the second image corresponding to the region of the overlap to verify the content inconsistency.
    Type: Grant
    Filed: November 23, 2021
    Date of Patent: December 26, 2023
    Assignee: International Business Machines Corporation
    Inventors: Li Juan Gao, Zhong Fang Yuan, Tong Liu, Ming Xia Shi, Ming Jin Chen
  • Patent number: 11809454
    Abstract: Label-based document classification using artificial intelligence includes collecting, by one or more processors, a plurality of pre-trained classification models into a model pool and a plurality of documents into a document pool. The collected plurality of pre-trained classification models are applied in parallel to the plurality of documents in the document pool to generate a list of labels. Based on the list of labels, a final label result is generated according to which a baseline algorithm for document classification is generated by the one or more processors.
    Type: Grant
    Filed: November 21, 2020
    Date of Patent: November 7, 2023
    Assignee: International Business Machines Corporation
    Inventors: Zhong Fang Yuan, Tong Liu, Li Juan Gao, Ming Jin Chen, Ke Yong Zhang
  • Patent number: 11803709
    Abstract: A method, computer program product and computer system to provide topic guide during document drafting is provided. A processor retrieves at least one section of text from a document. A processor receives a target topic for the document. A processor extracts at least one local topic from the at least one section of text. A processor generates a semantic network comprising the at least one local topic and the target topic. A processor determines a deviation value for the at least one local topic based on a distance between the at least one local topic and the target topic in the semantic network. A processor, in response to the deviation value exceeding a threshold value, alerts a user that the at least one section of text from the document is off-topic from the target topic.
    Type: Grant
    Filed: September 23, 2021
    Date of Patent: October 31, 2023
    Assignee: International Business Machines Corporation
    Inventors: Xiang Yu Yang, Wen Jie Hao, Zhong Fang Yuan, Wang Hu Dang, Deng Xin Luo, Jia Yong Xie, Wen Wang
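    The deviation check in this abstract can be sketched under simple assumptions: the semantic network is treated as an undirected graph of topic names, the deviation value is the number of hops between a local topic and the target topic, and a topic with no path to the target is treated as off-topic. None of these choices are stated in the patent abstract, and the function names are hypothetical.

```python
from collections import deque


def graph_distance(edges, start, goal):
    """Breadth-first hop count between two nodes, or None if unreachable."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour == goal:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def off_topic_sections(edges, target_topic, local_topics, threshold):
    """Return section ids whose local topic deviates from the target.

    `local_topics` maps a section id to its extracted local topic.
    A section is flagged when the hop count exceeds `threshold`
    or when no path to the target exists (an assumption of this sketch).
    """
    flagged = []
    for section, topic in local_topics.items():
        deviation = graph_distance(edges, topic, target_topic)
        if deviation is None or deviation > threshold:
            flagged.append(section)
    return flagged
```

    In a drafting tool, the returned section ids would drive the user alert that the abstract describes.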
  • Patent number: 11783131
    Abstract: Provided is a method, computer program product, and system for fusing knowledge graphs to generate a larger knowledgebase for responding to cross document questions. A processor may extract contextual information from a plurality of documents. The processor may generate, based on the extracted contextual information, a knowledge graph for each document of the plurality of documents. The processor may analyze each knowledge graph to determine if one or more entities of each knowledge graph are linked. The processor may fuse, in response to an entity in a first knowledge graph being linked to an entity in a second knowledge graph, the first knowledge graph with the second knowledge graph to create a fused knowledge graph.
    Type: Grant
    Filed: September 10, 2020
    Date of Patent: October 10, 2023
    Assignee: International Business Machines Corporation
    Inventors: Zhong Fang Yuan, Chen Gao, Tong Liu, De Shuo Kong, Ci-Wei Lan, Rong Fu He
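    As a rough illustration of the fusion step in this abstract, the sketch below represents each per-document knowledge graph as a set of (subject, relation, object) triples and treats two entities as linked when they have the same name. Both the representation and the linking rule are assumptions; the patent may determine links differently.

```python
def entities(graph):
    """Collect every subject and object that appears in a set of triples."""
    return {s for s, _, _ in graph} | {o for _, _, o in graph}


def fuse_linked(graphs):
    """Merge graphs that share at least one entity; leave the rest separate.

    Each graph is a set of (subject, relation, object) triples.
    Returns a list of fused graphs with pairwise-disjoint entity sets.
    """
    fused = []
    for graph in graphs:
        merged = set(graph)
        kept = []
        for existing in fused:
            if entities(existing) & entities(merged):
                merged |= existing  # linked entity found: absorb this graph
            else:
                kept.append(existing)
        kept.append(merged)
        fused = kept
    return fused
```

    For example, a graph containing ("IBM", "headquartered_in", "Armonk") and another containing ("Armonk", "located_in", "New York") share the entity "Armonk" and would be fused into a single graph, which could then answer a question that spans both documents.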
  • Publication number: 20230316041
    Abstract: Disclosed are techniques for modifying deep learning models (such as neural networks) to run more efficiently in computing environments with limited floating point computation resources. A deep learning model is trained using a set of training data. Input and output values are then recorded from the layers of the trained model when supplied with the training data, which are then used to generate deep forest decision tree models corresponding to individual layers of the trained model. Experimental versions of the trained model are then generated with different layers of the trained model replaced with their corresponding deep forest decision tree models. These experimental versions are then ranked according to the accuracy of their results compared to the results of the trained model. An updated trained model is then generated with one or more layers replaced with their corresponding deep forest decision tree models.
    Type: Application
    Filed: March 29, 2022
    Publication date: October 5, 2023
    Inventors: Zhong Fang Yuan, Tong Liu, Hai Bo Zou, Si Heng Sun, Na Liu