Patents by Inventor Xu Zhong
Xu Zhong has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12366950
Abstract: Provided are an interaction method and apparatus, an electronic device, and a storage medium. The method includes displaying a target page, where target content containing a target object is displayed in the target page; receiving a shake trigger operation of a user; and in response to the shake trigger operation, displaying associated content of the target content in a layer above the target page and displaying the detail page of the target object after completing displaying the associated content.
Type: Grant
Filed: July 27, 2023
Date of Patent: July 22, 2025
Assignee: Beijing Zitiao Network Technology Co., Ltd.
Inventors: Han Xu, Mengqi Wu, Xiaolei Shi, Xu Zhong, Huan Wang, Shuo Wang, Ji Liu, Zhiquan Zhang, Zhiyong Luo, Jia Liu, Chengkai Peng, Yongkang Chen, Ziqi Liu, Jialong Zhao, Yirui Cao, Lei Jin
-
Patent number: 12367352
Abstract: Deep learning techniques are disclosed for extraction of embedded data from documents. In an exemplary technique, a set of unstructured text data is received. One or more text groupings are generated by processing the set of unstructured text data. One or more text grouping embeddings are generated in a format for input to a machine learning model based on the one or more generated text groupings. One or more output predictions are generated by inputting the one or more text grouping embeddings into the machine learning model. Each output prediction of the one or more output predictions correspond to a predicted aspect of a text grouping of the one or more text groupings.
Type: Grant
Filed: August 12, 2022
Date of Patent: July 22, 2025
Assignee: ORACLE INTERNATIONAL CORPORATION
Inventors: Xu Zhong, Yakupitiyage Don Thanuja Samodhye Dharmasiri, Thanh Long Duong, Mark Edward Johnson
-
Publication number: 20250157209
Abstract: Techniques for extracting key information from a document using machine-learning models in a chatbot system is disclosed herein. In one particular aspect, a method is provided that includes receiving a set of data, which includes key fields, within a document at a data processing system that includes a table detection module, a key information extraction module, and a table extraction module. Text information and corresponding location data are extracted via optical character recognition. The table detection module detects whether one or more tables are present in the document and, if applicable, a location of each of the tables. The key information extraction module extracts text from the key fields. The table extraction module extracts each of the tables based on input from the optical character recognition and the table detection module. Extraction results include the text from the key fields and each of the tables can be output.
Type: Application
Filed: December 26, 2024
Publication date: May 15, 2025
Applicant: Oracle International Corporation
Inventors: Yakupitiyage Don Thanuja Samodhye Dharmasiri, Xu Zhong, Ahmed Ataallah Ataallah Abobakr, Hongtao Yang, Budhaditya Saha, Shaoke Xu, Shashi Prasad Suravarapu, Mark Edward Johnson, Thanh Long Duong
-
Patent number: 12277158
Abstract: Techniques for maintaining list-type text formatting when converting content from a source content format to a destination content format are disclosed. A system generates text content by applying text formatting tags to segments of characters obtained from a source electronic document. The system parses a static-display type source electronic document to obtain character data of the characters in the source document. The system analyzes the parsed data to identify text arranged in a list-type text format in the source document. The system generates text content in a destination content format different from the source format by applying tags to segments of the text content designating the segments items in a list.
Type: Grant
Filed: May 31, 2023
Date of Patent: April 15, 2025
Assignee: Oracle International Corporation
Inventors: Vishank Bhatia, Xu Zhong, Thanh Long Duong, Mark Johnson, Srinivasa Phani Kumar Gadde, Vishal Vishnoi
-
Publication number: 20250118398
Abstract: Techniques are disclosed for automatically generating Subjective, Objective, Assessment and Plan (SOAP) notes. Particularly, techniques are disclosed for training data collection and evaluation for automatic SOAP note generation. Training data is accessed, and evaluation process is performed on the training data to result in evaluated training data. A fine-tuned machine-learning model is generated using the evaluated training data. The fine-tuned machine-learning model can be used to perform a task associated with generating a SOAP note.
Type: Application
Filed: September 13, 2024
Publication date: April 10, 2025
Applicant: Oracle International Corporation
Inventors: Shubham Pawankumar Shah, Syed Najam Abbas Zaidi, Xu Zhong, Poorya Zaremoodi, Srinivasa Phani Kumar Gadde, Arash Shamaei, Ganesh Kumar, Thanh Tien Vu, Nitika Mathur, Chang Xu, Shiquan Yang, Sagar Kalyan Gollamudi
-
Publication number: 20250094480
Abstract: Techniques are disclosed herein for generating and using a knowledge base of information extracted from documents. The techniques include accessing a document comprising text and dividing the document into a plurality of chunks of text. The chunks are indexed by storing each chunk mapped to respective identifying metadata including a chunk index for each chunk. A query is received and a chunk relevant to the query is identified. A prompt is formulated including the query, the identified relevant chunk, and a subsequent chunk. The prompt is provided to a language model and output is received from the language model based on the prompt. An answer to the query is returned based on the received output.
Type: Application
Filed: September 13, 2024
Publication date: March 20, 2025
Applicant: Oracle International Corporation
Inventors: Yingqiong Shi, Charles Woodrow Dickstein, Aashna Devang Kanuga, Xu Zhong, Xin Xu
-
Publication number: 20250094464
Abstract: Techniques are disclosed herein for selecting document chunks that are most relevant to a query. The techniques include receiving a query and comparing a plurality of stored text passages to the query using a first similarity metric. Based on the comparison, a subset of the plurality of stored text passages that are most similar to the query are selected. A plurality of sentences from the subset of the plurality of stored text passages are identified. The identified sentences are ranked based on the query and a second similarity metric. A subset of the sentences are selected based on the ranking. The subset of the sentences or a derivative thereof are output in response to the query.
Type: Application
Filed: September 13, 2024
Publication date: March 20, 2025
Applicant: Oracle International Corporation
Inventors: Xu Zhong, Aashna Devang Kanuga
-
Publication number: 20250083404
Abstract: The present application provides a processing method of a metal composite structure including a first metal layer and a second metal layer stacked on the first metal layer. The processing method includes the steps of defining a first through hole in the first metal layer, and drilling in the first through hole toward the second metal layer to form a traction hole in the second metal layer. Then, hot melt drilling is performed on a surface of the second metal layer away from the first metal layer toward the first through hole, thereby causing the second metal layer to crack under a traction force of the traction hole to form a second through hole, and a portion of the second metal layer to be melted and squeezed to form a bushing which adheres to at least a portion of a sidewall of the first through hole.
Type: Application
Filed: September 12, 2024
Publication date: March 13, 2025
Inventors: Xin HUANG, Sheng-Hao HONG, Jian-Xiong QIAN, Lei ZHU, Peng XIE, Xiang-Kun MENG, Feng FANG, Hui WU, Xiao-Hui CHEN, Shuang-Xu ZHONG, Ren-Jun YANG, Chao CHENG, Zhi-Qiang SHEN, Ye-An SUN
-
Patent number: 12217497
Abstract: Techniques for extracting key information from a document using machine-learning models in a chatbot system is disclosed herein. In one particular aspect, a method is provided that includes receiving a set of data, which includes key fields, within a document at a data processing system that includes a table detection module, a key information extraction module, and a table extraction module. Text information and corresponding location data are extracted via optical character recognition. The table detection module detects whether one or more tables are present in the document and, if applicable, a location of each of the tables. The key information extraction module extracts text from the key fields. The table extraction module extracts each of the tables based on input from the optical character recognition and the table detection module. Extraction results include the text from the key fields and each of the tables can be output.
Type: Grant
Filed: August 15, 2022
Date of Patent: February 4, 2025
Assignee: ORACLE INTERNATIONAL CORPORATION
Inventors: Yakupitiyage Don Thanuja Samodhye Dharmasiri, Xu Zhong, Ahmed Ataallah Ataallah Abobakr, Hongtao Yang, Budhaditya Saha, Shaoke Xu, Shashi Prasad Suravarapu, Mark Edward Johnson, Thanh Long Duong
-
Publication number: 20240419886
Abstract: A data corpus is partitioned into text strings for header classification. A group characteristic is computed for a text string, and whether the group characteristic satisfies a group characteristic criterion is determined. The text string may be disqualified from header classification if the group characteristic criterion is not satisfied, or one or more font characteristics may be determined for the text string if the group characteristic criterion is satisfied. A font characteristic that meets one or more prevalence criteria may be identified and evaluated to determine whether the font characteristic meets at least one font characteristic criterion. The text string may be disqualified from header classification if the font characteristic criterion is not satisfied, or if the font characteristic meets the font characteristic criterion, the text string is classified as a header, and tagged content is generated by applying a header tag to the text string.
Type: Application
Filed: April 30, 2024
Publication date: December 19, 2024
Applicant: Oracle International Corporation
Inventors: Sagar Gollamudi, Vishank Bhatia, Xu Zhong, Thanh Long Duong, Mark Johnson, Srinivasa Phani Kumar Gadde, Vishal Vishnoi
-
Publication number: 20240338395
Abstract: Techniques for multi-layer training of a machine learning model are disclosed. A system pre-trains a machine learning model on training data obtained from unlabeled document graph data by executing unsupervised pre-training tasks on the unlabeled document graph data to generate a labeled pre-training data set. The system modifies document graphs to change attributes of nodes in the document graphs. The system pre-trains the machine learning model with a data set including the modified document graphs and un-modified document graphs to generate prediction associated with the modifications to the document graphs. Subsequent to pre-training, the system fine-tunes the machine learning model with a set of labeled training data to generate predictions associated with a specific attribute of a document graph.
Type: Application
Filed: April 10, 2023
Publication date: October 10, 2024
Applicant: Oracle International Corporation
Inventors: Xu Zhong, Don Dharmasiri, Thanh Long Duong, Mark Johnson, Srinivasa Phani Kumar Gadde, Vishal Vishnoi
-
Patent number: 12056434
Abstract: Techniques for generating formatting tags for textual content obtained from a source electronic document are disclosed. A system parses a digital file to obtain information about characters in an electronic document. The system applies tags to text generated based on the textual content of the electronic document by creating segments of textually-consecutive characters and applying corresponding text formatting style tags to the segments. The system further identifies segments of text overlapping bounding boxes in the electronic document. The system generates textual content including a segment of text and a corresponding hyperlink associated with the segment of text. The system further generates textual content by selectively applying line breaks from the source electronic document in the textual content.
Type: Grant
Filed: January 6, 2023
Date of Patent: August 6, 2024
Assignee: Oracle International Corporation
Inventors: Vishank Bhatia, Xu Zhong, Thanh Long Duong, Mark Johnson, Srinivasa Phani Kumar Gadde, Vishal Vishnoi, King-Hwa Lee, Christopher Kennewick
-
Publication number: 20240256771
Abstract: Techniques for identifying content in key-value pairs of documents using a graph neural network (GNN) are disclosed. A system trains a GNN to identify key-value pair groupings in documents. The GNN classifies nodes in document graphs as key-type nodes and answer-type nodes. The GNN also classifies edges connecting nodes in the document graphs for keeping in the document graph or removing from the document graph. The resulting document graph includes key-value groupings in the document. Upon identifying content matching a query, a system returns as a query response content from among the key-value groupings.
Type: Application
Filed: January 30, 2023
Publication date: August 1, 2024
Applicant: Oracle International Corporation
Inventors: Xu Zhong, Thanh Long Duong, Mark Johnson
-
Patent number: 12001775
Abstract: A data corpus is partitioned into text strings for header classification. A group characteristic is computed for a text string, and whether the group characteristic satisfies a group characteristic criterion is determined. The text string may be disqualified from header classification if the group characteristic criterion is not satisfied, or one or more font characteristics may be determined for the text string if the group characteristic criterion is satisfied. A font characteristic that meets one or more prevalence criteria may be identified and evaluated to determine whether the font characteristic meets at least one font characteristic criterion. The text string may be disqualified from header classification if the font characteristic criterion is not satisfied, or if the font characteristic meets the font characteristic criterion, the text string is classified as a header, and tagged content is generated by applying a header tag to the text string.
Type: Grant
Filed: June 13, 2023
Date of Patent: June 4, 2024
Assignee: Oracle International Corporation
Inventors: Sagar Gollamudi, Vishank Bhatia, Xu Zhong, Thanh Long Duong, Mark Johnson, Srinivasa Phani Kumar Gadde, Vishal Vishnoi
-
Publication number: 20240169161
Abstract: Obtaining collections of sentences in different languages that are usable for training models in various applications of artificial intelligence is provided. A method is provided that obtains, from text corpus, webpages in a plurality of languages, each of the webpages corresponding to an URL; obtains annotations for each of the webpages based on its URL, to obtain annotated data entries corresponding to the webpages, each of the annotated data entries including a classification label corresponding to a sub-topic of one of a plurality of topics, where each of the plurality of topics includes a corresponding plurality of sub-topics; filters the annotated data entries to obtain topic-specific content in a target language based on the classification labels, the topic-specific content corresponding to one or more sub-topics; performs post-processing on the topic-specific content to obtain result data; and outputs the result data for the topic.
Type: Application
Filed: August 21, 2023
Publication date: May 23, 2024
Applicant: Oracle International Corporation
Inventors: Paria Jamshid Lou, Gioacchino Tangari, Jason Black, Bhagya Gayathri Hettige, Xu Zhong, Poorya Zaremoodi, Thanh Long Duong, Mark Edward Johnson
-
Publication number: 20240126795
Abstract: Techniques are disclosed herein for integrating document question answering in an artificial intelligence-based platform, such as a chatbot system. The techniques include receiving a query from a user, rewriting the query to include one or more specific descriptors, computing an embedding vector for the rewritten query, retrieving one or more textual passages from a document store utilizing the embedding vector for the rewritten query, determining one or more answers to the rewritten query within the one or more textual passages, and returning the one or more answers.
Type: Application
Filed: October 13, 2023
Publication date: April 18, 2024
Applicant: Oracle International Corporation
Inventors: Xu Zhong, Thanh Long Duong, Mark Edward Johnson, Charles Woodrow Dickstein, King-Hwa Lee, Xin Xu, Srinivasa Phani Kumar Gadde, Vishal Vishnoi, Christopher Kennewick, Balakota Srinivas Vinnakota, Raefer Christopher Gabriel
-
Publication number: 20240126800
Abstract: Techniques for maintaining list-type text formatting when converting content from a source content format to a destination content format are disclosed. A system generates text content by applying text formatting tags to segments of characters obtained from a source electronic document. The system parses a static-display type source electronic document to obtain character data of the characters in the source document. The system analyzes the parsed data to identify text arranged in a list-type text format in the source document. The system generates text content in a destination content format different from the source format by applying tags to segments of the text content designating the segments items in a list.
Type: Application
Filed: May 31, 2023
Publication date: April 18, 2024
Applicant: Oracle International Corporation
Inventors: Vishank Bhatia, Xu Zhong, Thanh Long Duong, Mark Johnson, Srinivasa Phani Kumar Gadde, Vishal Vishnoi
-
Patent number: 11914943
Abstract: Techniques for generating text content arranged in a consistent read order from a source document including text corresponding to different read orders are disclosed. A system parses a binary file representing an electronic document to identify characters and metadata associated with the characters. The system pre-sorts a character order of characters in each line of the electronic document to generate an ordered list of characters arranged according to the right-to-left reading order. The system performs a layout-mirroring operation to change a position of characters within the modified document relative to a right edge of the document and a left edge of the document. Subsequent to performing layout-mirroring, the system identifies native left-to-right reading-order text in-line with the native right-to-left reading-order text.
Type: Grant
Filed: February 15, 2023
Date of Patent: February 27, 2024
Assignee: Oracle International Corporation
Inventors: Xu Zhong, Vishank Bhatia, Thanh Long Duong, Mark Johnson, Srinivasa Phani Kumar Gadde, Vishal Vishnoi
-
Publication number: 20240061992
Abstract: Techniques for generating formatting tags for textual content obtained from a source electronic document are disclosed. A system parses a digital file to obtain information about characters in an electronic document. The system applies tags to text generated based on the textual content of the electronic document by creating segments of textually-consecutive characters and applying corresponding text formatting style tags to the segments. The system further identifies segments of text overlapping bounding boxes in the electronic document. The system generates textual content including a segment of text and a corresponding hyperlink associated with the segment of text. The system further generates textual content by selectively applying line breaks from the source electronic document in the textual content.
Type: Application
Filed: January 6, 2023
Publication date: February 22, 2024
Applicant: Oracle International Corporation
Inventors: Vishank Bhatia, Xu Zhong, Thanh Long Duong, Mark Johnson, Srinivasa Phani Kumar Gadde, Vishal Vishnoi, King-Hwa Lee, Christopher Kennewick
-
Publication number: 20240061989
Abstract: Techniques for generating text content arranged in a consistent read order from a source document including text corresponding to different read orders are disclosed. A system parses a binary file representing an electronic document to identify characters and metadata associated with the characters. The system pre-sorts a character order of characters in each line of the electronic document to generate an ordered list of characters arranged according to the right-to-left reading order. The system performs a layout-mirroring operation to change a position of characters within the modified document relative to a right edge of the document and a left edge of the document. Subsequent to performing layout-mirroring, the system identifies native left-to-right reading-order text in-line with the native right-to-left reading-order text.
Type: Application
Filed: February 15, 2023
Publication date: February 22, 2024
Applicant: Oracle International Corporation
Inventors: Xu Zhong, Vishank Bhatia, Thanh Long Duong, Mark Johnson, Srinivasa Phani Kumar Gadde, Vishal Vishnoi