Patents by Inventor Dong Rui Li

Dong Rui Li has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20250131759
    Abstract: In an approach, a processor performs document layout analysis on a document generating a plurality of textual regions; extracts characteristics from each of the plurality of textual regions and associates the respective characteristics with the respective textual region as metadata; classifies each of the plurality of textual regions as an optical character recognition (OCR) region, non-OCR valuable region, or non-OCR non-valuable region using a classifier; performs OCR on each OCR region generating an OCR output; identifies associated constant OCR data from a constant OCR data repository for each non-OCR valuable region; merges the associated constant OCR data with the OCR output generating complete OCR data for the received document; performs data extraction on the complete OCR data to identify data fields and key-value pairs generating extracted data; and determines whether the extracted data is valid based on a set of rules.
    Type: Application
    Filed: October 24, 2023
    Publication date: April 24, 2025
    Inventors: Jun Hong Zhao, Dong Rui Li, Ang Yi, Jing Zhang, Hai Cheng Wang, Yang Zhong Li
  • Publication number: 20250077539
    Abstract: A system, method, and computer program product are configured to extract key value pair (KVP) data from one or more documents; obtain a content index for the one or more documents; and enhance the content index with the KVP data to provide a KVP content index.
    Type: Application
    Filed: September 1, 2023
    Publication date: March 6, 2025
    Inventors: Dong Rui Li, Xue Lan Zhang, Xue Xu, Zai Ming Lao, Ye Chen
  • Publication number: 20250053746
    Abstract: Disclosed embodiments provide techniques for creating a smaller version of an original document. An ontology is defined, and a document type for an original document is determined, based on the ontology. Multiple key-value pairs (KVPs) are extracted from the original document based on the document type. Pages in the original document that include one or more KVPs are identified, and the smaller, condensed version includes the identified pages from the original document.
    Type: Application
    Filed: August 10, 2023
    Publication date: February 13, 2025
    Inventors: Dong Rui Li, Ye Chen, Zai Ming Lao, Xue Lan Zhang, Xue Xu, Wei U Wang
  • Patent number: 12056948
    Abstract: In an approach, a processor identifies a plurality of text separators in a borderless table, a text separator of the plurality of text separators defining a non-text region between two consecutive text lines in the borderless table. A processor classifies the plurality of text separators into a number of target clusters comprised in a target group based on property information related to the plurality of text separators, the number of target clusters corresponding to a number of separator types. A processor provides indication information to indicate respective separator types of the plurality of text separators based on a result of the classifying.
    Type: Grant
    Filed: July 19, 2021
    Date of Patent: August 6, 2024
    Assignee: International Business Machines Corporation
    Inventors: Ang Yi, Nazrul Islam, Rajesh M. Desai, Jing Zhang, Dong Rui Li, Xue Mei Deng, Ye Chen, Hai Cheng Wang
  • Patent number: 11842143
    Abstract: A method, a computer system, and a computer program product are provided for generating a detailed thumbnail and/or preview of content. In one embodiment, the technique comprises analyzing data obtained from the content and classifying it according to a plurality of specific types. A plurality of key information may then be extracted from the data according to the classification. A plurality of key-values are correlated to the plurality of key information so as to provide a plurality of key-value pairs. These pairs are consolidated to generate a thumbnail and/or a preview that renders at least the key information provided by the consolidated key-value pairs.
    Type: Grant
    Filed: August 30, 2022
    Date of Patent: December 12, 2023
    Assignee: International Business Machines Corporation
    Inventors: Dong Rui Li, Zai Ming Lao, Ye Chen, Xue Lan Zhang, Xue Xu
  • Publication number: 20230012784
    Abstract: In an approach, a processor identifies a plurality of text separators in a borderless table, a text separator of the plurality of text separators defining a non-text region between two consecutive text lines in the borderless table. A processor classifies the plurality of text separators into a number of target clusters comprised in a target group based on property information related to the plurality of text separators, the number of target clusters corresponding to a number of separator types. A processor provides indication information to indicate respective separator types of the plurality of text separators based on a result of the classifying.
    Type: Application
    Filed: July 19, 2021
    Publication date: January 19, 2023
    Inventors: Ang Yi, Nazrul Islam, Rajesh M. Desai, Jing Zhang, Dong Rui Li, Xue Mei Deng, Ye Chen, Hai Cheng Wang
  • Patent number: 11514121
    Abstract: Embodiments of the present disclosure relate to a method, system, and computer program product for webpage customization. In some embodiments, a method is disclosed. According to the method, a webpage to be provided to a user is obtained. The webpage comprises at least a first element having a first set of style attributes. A second element matching the first element is determined from a set of elements customized for the user. The second element has a second set of style attributes. The webpage is customized for the user by applying at least part of the second set of style attributes to the first element. The customized webpage is provided to the user. In other embodiments, a system and a computer program product are disclosed.
    Type: Grant
    Filed: August 10, 2020
    Date of Patent: November 29, 2022
    Assignee: International Business Machines Corporation
    Inventors: Dong Rui Li, Ang Yi, Hai Cheng Wang, Jun Hong Zhao, Ye Chen, Xiao Jian Lian, Jing Chen
  • Publication number: 20220309072
    Abstract: A computer transforms content of a composite table into structured data objects. The computer receives a composite table and identifies a data zone characterized by data columns, and a header zone. The computer identifies first header cells arranged coextensive with a single data column and second header cells arranged coextensive with a set of data columns. The computer generates a hierarchical representation of said header cells based, at least in part, on the header cell arrangements. The computer generates a revised table based on the hierarchical representation, with the first header cells identifying a data column and the second header cells identifying a first header cell. The computer generates structured data objects that represent the zones and are arranged based, at least in part, on the revised table, where the structured data objects are keyed to the first header cells.
    Type: Application
    Filed: March 26, 2021
    Publication date: September 29, 2022
    Inventors: Xue Lan Zhang, Hai Cheng Wang, Jing Zhang, Jun Hong Zhao, Ang Yi, Dong Rui Li
  • Patent number: 11436249
    Abstract: A computer transforms content of a composite table into structured data objects. The computer receives a composite table and identifies a data zone characterized by data columns, and a header zone. The computer identifies first header cells arranged coextensive with a single data column and second header cells arranged coextensive with a set of data columns. The computer generates a hierarchical representation of said header cells based, at least in part, on the header cell arrangements. The computer generates a revised table based on the hierarchical representation, with the first header cells identifying a data column and the second header cells identifying a first header cell. The computer generates structured data objects that represent the zones and are arranged based, at least in part, on the revised table, where the structured data objects are keyed to the first header cells.
    Type: Grant
    Filed: March 26, 2021
    Date of Patent: September 6, 2022
    Assignee: International Business Machines Corporation
    Inventors: Xue Lan Zhang, Hai Cheng Wang, Jing Zhang, Jun Hong Zhao, Ang Yi, Dong Rui Li
  • Publication number: 20220043870
    Abstract: Embodiments of the present disclosure relate to a method, system, and computer program product for webpage customization. In some embodiments, a method is disclosed. According to the method, a webpage to be provided to a user is obtained. The webpage comprises at least a first element having a first set of style attributes. A second element matching the first element is determined from a set of elements customized for the user. The second element has a second set of style attributes. The webpage is customized for the user by applying at least part of the second set of style attributes to the first element. The customized webpage is provided to the user. In other embodiments, a system and a computer program product are disclosed.
    Type: Application
    Filed: August 10, 2020
    Publication date: February 10, 2022
    Inventors: Dong Rui Li, Ang Yi, Hai Cheng Wang, Jun Hong Zhao, Ye Chen, Xiao Jian Lian, Jing Chen
  • Patent number: 11108712
    Abstract: A method, system and computer program product for processing messages sent to a recipient. The communication channel used to send the message is identified, where such a communication channel is not currently being used by the recipient. After identifying the communication channel(s) currently being used by the recipient, the contact information of users who have previously communicated with the recipient using the communication channel(s) currently being used by the recipient is analyzed. The message is then marked with the identity of the sender as well as the communication channel used by the sender after matching the contact information of the sender with the contact information of a user who had previously communicated with the recipient using the communication channel(s) currently being used by the recipient. The marked message is then sent to the recipient using a communication channel currently being used by the recipient.
    Type: Grant
    Filed: November 18, 2019
    Date of Patent: August 31, 2021
    Assignee: International Business Machines Corporation
    Inventors: Xiang Cai, Yan Fen Guo, Dong Rui Li, Xiao Jian Lian, Cheng Fang Wang, Hong Dong Zhao
  • Patent number: 10664440
    Abstract: A computing system performs file conversion upon format expiration. A computing system evaluates a risk of expiration of a first format associated with a first format object. The computing system determines whether the risk of expiration is above a threshold. In response to determining that the risk of expiration is above the threshold, the computing system identifies one or more files stored in the first format for conversion. The computing system converts one or more files stored in the first format to a second format. The computing system saves the one or more files in the second format.
    Type: Grant
    Filed: February 9, 2017
    Date of Patent: May 26, 2020
    Assignee: International Business Machines Corporation
    Inventors: Yu Fen Chang, Peng Hui Jiang, Dong Rui Li, Lin Sun, Li Xiang, Ting Xie, Yuan Lin Yang
  • Publication number: 20200084165
    Abstract: A method, system and computer program product for processing messages sent to a recipient. The communication channel used to send the message is identified, where such a communication channel is not currently being used by the recipient. After identifying the communication channel(s) currently being used by the recipient, the contact information of users who have previously communicated with the recipient using the communication channel(s) currently being used by the recipient is analyzed. The message is then marked with the identity of the sender as well as the communication channel used by the sender after matching the contact information of the sender with the contact information of a user who had previously communicated with the recipient using the communication channel(s) currently being used by the recipient. The marked message is then sent to the recipient using a communication channel currently being used by the recipient.
    Type: Application
    Filed: November 18, 2019
    Publication date: March 12, 2020
    Inventors: Xiang Cai, Yan Fen Guo, Dong Rui Li, Xiao Jian Lian, Cheng Fang Wang, Hong Dong Zhao
  • Patent number: 10545911
    Abstract: A computing system performs file conversion upon format expiration. A computing system evaluates a risk of expiration of a first format associated with a first format object. The computing system determines whether the risk of expiration is above a threshold. In response to determining that the risk of expiration is above the threshold, the computing system identifies one or more files stored in the first format for conversion. The computing system converts one or more files stored in the first format to a second format. The computing system saves the one or more files in the second format.
    Type: Grant
    Filed: April 24, 2019
    Date of Patent: January 28, 2020
    Assignee: International Business Machines Corporation
    Inventors: Yu Fen Chang, Peng Hui Jiang, Dong Rui Li, Lin Sun, Li Xiang, Ting Xie, Yuan Lin Yang
  • Patent number: 10545912
    Abstract: A computing system performs file conversion upon format expiration. A computing system evaluates a risk of expiration of a first format associated with a first format object. The computing system determines whether the risk of expiration is above a threshold. In response to determining that the risk of expiration is above the threshold, the computing system identifies one or more files stored in the first format for conversion. The computing system converts one or more files stored in the first format to a second format. The computing system saves the one or more files in the second format.
    Type: Grant
    Filed: April 24, 2019
    Date of Patent: January 28, 2020
    Assignee: International Business Machines Corporation
    Inventors: Yu Fen Chang, Peng Hui Jiang, Dong Rui Li, Lin Sun, Li Xiang, Ting Xie, Yuan Lin Yang
  • Patent number: 10536405
    Abstract: A method, system and computer program product for processing messages sent to a recipient. The communication channel used to send the message is identified, where such a communication channel is not currently being used by the recipient. After identifying the communication channel(s) currently being used by the recipient, the contact information of users who have previously communicated with the recipient using the communication channel(s) currently being used by the recipient is analyzed. The message is then marked with the identity of the sender as well as the communication channel used by the sender after matching the contact information of the sender with the contact information of a user who had previously communicated with the recipient using the communication channel(s) currently being used by the recipient. The marked message is then sent to the recipient using a communication channel currently being used by the recipient.
    Type: Grant
    Filed: November 13, 2017
    Date of Patent: January 14, 2020
    Assignee: International Business Machines Corporation
    Inventors: Xiang Cai, Yan Fen Guo, Dong Rui Li, Xiao Jian Lian, Cheng Fang Wang, Hong Dong Zhao
  • Publication number: 20190251059
    Abstract: A computing system performs file conversion upon format expiration. A computing system evaluates a risk of expiration of a first format associated with a first format object. The computing system determines whether the risk of expiration is above a threshold. In response to determining that the risk of expiration is above the threshold, the computing system identifies one or more files stored in the first format for conversion. The computing system converts one or more files stored in the first format to a second format. The computing system saves the one or more files in the second format.
    Type: Application
    Filed: April 24, 2019
    Publication date: August 15, 2019
    Inventors: Yu Fen Chang, Peng Hui Jiang, Dong Rui Li, Lin Sun, Li Xiang, Ting Xie, Yuan Lin Yang
  • Publication number: 20190251060
    Abstract: A computing system performs file conversion upon format expiration. A computing system evaluates a risk of expiration of a first format associated with a first format object. The computing system determines whether the risk of expiration is above a threshold. In response to determining that the risk of expiration is above the threshold, the computing system identifies one or more files stored in the first format for conversion. The computing system converts one or more files stored in the first format to a second format. The computing system saves the one or more files in the second format.
    Type: Application
    Filed: April 24, 2019
    Publication date: August 15, 2019
    Inventors: Yu Fen Chang, Peng Hui Jiang, Dong Rui Li, Lin Sun, Li Xiang, Ting Xie, Yuan Lin Yang
  • Publication number: 20190149491
    Abstract: A method, system and computer program product for processing messages sent to a recipient. The communication channel used to send the message is identified, where such a communication channel is not currently being used by the recipient. After identifying the communication channel(s) currently being used by the recipient, the contact information of users who have previously communicated with the recipient using the communication channel(s) currently being used by the recipient is analyzed. The message is then marked with the identity of the sender as well as the communication channel used by the sender after matching the contact information of the sender with the contact information of a user who had previously communicated with the recipient using the communication channel(s) currently being used by the recipient. The marked message is then sent to the recipient using a communication channel currently being used by the recipient.
    Type: Application
    Filed: November 13, 2017
    Publication date: May 16, 2019
    Inventors: Xiang Cai, Yan Fen Guo, Dong Rui Li, Xiao Jian Lian, Cheng Fang Wang, Hong Dong Zhao
  • Patent number: 10229038
    Abstract: A selection of a plurality of graphical user interface (GUI) components of a GUI is received. The selection is received so that the GUI can be tested. Attributes of the selection of GUI components are determined. Using the attributes, a default procedure for testing the GUI is determined. The default procedure includes a first set of input values for GUI components of the plurality of GUI components. The default procedure includes a first sequence in which the first set of input values are provided. Modifications to the default procedure are received. Using the modifications, a final procedure with a second set of input values provided in a second sequence is generated. The GUI is tested with the final procedure. Testing the GUI includes providing the second set of input values to respective GUI components in the second sequence.
    Type: Grant
    Filed: March 15, 2016
    Date of Patent: March 12, 2019
    Assignee: International Business Machines Corporation
    Inventors: Zhu Hong Cai, Dong Rui Li, Miao Liu, Ying Shen, Kui Song, Wen Yin, Dan Zhu
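
The flow described in the abstract of patent 10229038 (default procedure derived from component attributes, then modified into a final procedure) can be illustrated with a minimal sketch. This is not the patented implementation: the `Component` class, the `kind` attribute, the `DEFAULT_INPUTS` table, and all function names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """A selected GUI component; `kind` is the attribute used to pick defaults."""
    name: str
    kind: str  # hypothetical attribute values: "text", "checkbox", "button"


# Hypothetical default input per component kind (an assumption, not from the patent).
DEFAULT_INPUTS = {"text": "sample", "checkbox": True, "button": "click"}


def default_procedure(components):
    """Build the default procedure: one (component name, input value) step per
    component, in selection order (the first sequence)."""
    return [(c.name, DEFAULT_INPUTS[c.kind]) for c in components]


def apply_modifications(procedure, new_values=None, new_order=None):
    """Return the final procedure: override input values, then reorder steps."""
    new_values = new_values or {}
    steps = {name: new_values.get(name, value) for name, value in procedure}
    order = new_order or [name for name, _ in procedure]
    return [(name, steps[name]) for name in order]


def run_test(procedure, send_input):
    """Provide each input value to its component, in sequence, via `send_input`."""
    for name, value in procedure:
        send_input(name, value)


selected = [
    Component("username", "text"),
    Component("remember", "checkbox"),
    Component("submit", "button"),
]
default = default_procedure(selected)
final = apply_modifications(
    default,
    new_values={"username": "alice"},
    new_order=["remember", "username", "submit"],
)

# A stand-in for a real GUI driver: record what would be sent to each component.
log = []
run_test(final, lambda name, value: log.append((name, value)))
```

Here `send_input` is a placeholder for whatever driver actually delivers values to the GUI. Keeping it as a callback lets the procedure-building logic be exercised without a real interface.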