Context Analysis Or Word Recognition (e.g., Character String) Patents (Class 382/229)

Multi-word phrase based analysis of electronic documents

Patent number: 11748388

Abstract: A document processing system is configured to identify, for each accessed electronic document in a first set of multiple electronic documents, a set of identified multi-word phrases determined to be in ordered text information in the accessed electronic document, each multi-word phrase of the set of identified multi-word phrases including adjacent words in the ordered text information; and determine, for each accessed electronic document in the first set of multiple electronic documents, a selected document type from the first set of document types based at least on an analysis of the set of identified multi-word phrases with respect to multi-word-phrase characteristics identified by a first definition and associated with each document type in a first set of document types associated with a first document-set type.

Type: Grant

Filed: October 14, 2020

Date of Patent: September 5, 2023

Assignee: Docufree Corporation

Inventor: John Frank Walsh
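The abstract of patent 11748388 describes collecting multi-word phrases made of adjacent words and choosing a document type by comparing them with per-type phrase characteristics. The sketch below is only an illustration of that idea, not the patented method: the `DEFINITION` table, its weights, the 2-to-3-word window, and the function names are all assumptions made up for this example.

```python
import re
from collections import Counter

def multi_word_phrases(text, min_words=2, max_words=3):
    """Return every phrase of min_words..max_words adjacent words in the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    phrases = []
    for n in range(min_words, max_words + 1):
        for i in range(len(words) - n + 1):
            phrases.append(" ".join(words[i:i + n]))
    return phrases

# Hypothetical definition: document type -> characteristic phrases and weights.
DEFINITION = {
    "invoice": {"amount due": 2.0, "invoice number": 2.0, "bill to": 1.0},
    "lease": {"security deposit": 2.0, "monthly rent": 2.0, "term of lease": 1.0},
}

def select_document_type(text, definition=DEFINITION):
    """Score each document type by weighted phrase hits; None if nothing matches."""
    counts = Counter(multi_word_phrases(text))
    scores = {
        doc_type: sum(weight * counts[phrase] for phrase, weight in phrases.items())
        for doc_type, phrases in definition.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

With this toy definition, a text containing "invoice number", "amount due", and "bill to" is assigned the type `invoice`, while a text with none of the listed phrases yields `None`.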
Systems and methods of business categorization and service recommendation

Patent number: 11741511

Abstract: In one aspect, the present disclosure relates to a method of generating business descriptions performed by a server, said method may include: receiving a plurality of invoices, each invoice being associated with a business of a plurality of businesses; extracting a plurality of texts from the plurality of invoices; embedding the plurality of texts to a vector space to obtain a plurality of invoice vectors; generating a plurality of clusters in the vector space, each cluster of the plurality of clusters comprising at least one invoice vector of the plurality of invoice vectors; generating a description for a cluster, the description for the cluster representing all invoice vectors assigned to the cluster; for each business of the plurality of businesses that has at least one invoice vector assigned to the cluster, associating the business with the description; and indexing the plurality of businesses within a database by the generated descriptions.

Type: Grant

Filed: February 3, 2020

Date of Patent: August 29, 2023

Assignee: Intuit Inc.

Inventors: Erez Katzenelson, Elik Sror, Shlomi Medalion, Shimon Shahar, Shir Meir Lador, Sigalit Bechler, Alexander Zhicharevich, Onn Bar

Document access control based on document component layouts

Patent number: 11734445

Abstract: In an approach for providing a document access control based on document component layouts, a processor detects a layout of a document, the layout including one or more components of the document. A processor defines an access policy to access the one or more components based on the layout. A processor authorizes a request to access the one or more components based on the access policy and the layout. A processor retrieves the one or more components based on the access policy and the authorized request.

Type: Grant

Filed: December 2, 2020

Date of Patent: August 22, 2023

Assignee: International Business Machines Corporation

Inventors: Peter Zhong, Antonio Jose Jimeno Yepes, Lenin Mehedy

Hybrid in-memory/pageable spatial column data

Patent number: 11726985

Abstract: Disclosed herein are system, method, and computer program product embodiments for maintaining of a geometric object in a database. An embodiment operates by a database maintaining a first page storing a data block in the database's on-disk store such that the data block stores at least one byte of the geometric object. After receiving the request for the geometric object, the database loads the page storing the geometric object in the in-memory store and determines the size of the geometric object. Based on the size of the geometric object, the database stores the geometric object in the in-memory store directly or in a heap of the in-memory store.

Type: Grant

Filed: June 2, 2020

Date of Patent: August 15, 2023

Assignee: SAP SE

Inventors: Colin Florendo, Surendra Vishnoi, Janardhan Hungund, Manuel Caroli

Automated indexing and extraction of information in digital documents

Patent number: 11727702

Abstract: Systems and methods for automated indexing and extraction of information in digital documents are disclosed. A method may comprise selecting a page number of a digital document to identify a page containing targeted information; inputting an image of the page into a visual machine learning network (visual ML), wherein the visual ML is trained to recognize text associated with the targeted information in an image; identifying by the visual ML, a section of the image that contains the targeted information; inputting the page number, the digital document, and coordinates of the section into an extraction module; and extracting the targeted information by the extraction module from the section.

Type: Grant

Filed: January 17, 2023

Date of Patent: August 15, 2023

Assignee: VelocityEHS Holdings, Inc.

Inventors: Julia Penfield, Aatish Suman, Veeru Talreja, Misbah Zahid Khan

Object detection and image classification based optical character recognition

Patent number: 11727697

Abstract: A system performs optical character recognition (OCR) on an image displaying a portion of an object. An image classification system identifies the object in the image, based on which one or more object detection models identify labels associated with the object within the image. The system determines text of the identified labels using OCR, and analyzes the OCR resultant text for discrepancies and/or inaccuracies. In response to identifying a discrepancy, the system provides a recommendation for improving the accuracy of the OCR resultant text.

Type: Grant

Filed: June 30, 2021

Date of Patent: August 15, 2023

Assignee: Salesforce, Inc.

Inventors: Dennis Schultz, Daniel Thomas Harrison, Christopher Anthony Kemp, Michael A. Salem

Machine learning enabled text analysis with multi-language support

Patent number: 11720752

Abstract: A language determination model may be applied to select a first machine learning model or a second machine learning model to analyze the input text. The first machine learning model trained to analyze text in a first language, the second machine learning model trained to analyze text in a second language, and the input text may be in a third language. The language determination model may select the first machine learning model based on the first machine learning model having a better performance analyzing text in the third language than the second machine learning model. The language determination model may be updated based on an actual performance of the first machine learning model analyzing the input text. Moreover, the first machine learning model may be subject to additional training if the actual performance of the first machine learning model analyzing the input text is below a threshold value.

Type: Grant

Filed: July 7, 2020

Date of Patent: August 8, 2023

Assignee: SAP SE

Inventor: Tobias Weller
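The abstract of patent 11720752 describes routing text in a third language to whichever of two models is expected to perform better, then updating that expectation from the observed performance and flagging the model for further training when it falls below a threshold. A minimal sketch of that feedback loop follows. The class name, the score table, the exponential-style update, and the default thresholds are assumptions for illustration only; the patent does not specify them.

```python
class LanguageRouter:
    """Pick the model with the best expected score for a language, then learn from results."""

    def __init__(self, expected, learning_rate=0.5, retrain_below=0.6):
        # expected[(model_name, language)] = estimated score in [0, 1] (hypothetical values)
        self.expected = dict(expected)
        self.learning_rate = learning_rate
        self.retrain_below = retrain_below

    def select(self, language):
        """Return the model with the highest expected score for this language."""
        scores = {m: s for (m, lang), s in self.expected.items() if lang == language}
        return max(scores, key=scores.get)  # raises ValueError if the language is unknown

    def report(self, model, language, actual):
        """Move the estimate toward the observed score; True means 'schedule retraining'."""
        key = (model, language)
        self.expected[key] += self.learning_rate * (actual - self.expected[key])
        return actual < self.retrain_below
```

For example, if a Spanish-trained model is initially expected to handle Portuguese better than an English-trained one, a poor observed score lowers its estimate, and the next Portuguese input may be routed to the other model.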
Method and apparatus for automatically extracting information from unstructured data

Patent number: 11710099

Abstract: Various methods, apparatuses/systems, and media for automatically extracting information from unstructured data are provided. A receiver receives digitized data of a document having unstructured data format. A processor applies machine learning models for sectioning the digitized data. An OCR device applies an OCR processing to the sectioned digitized data. The processor matches the sectioned digitized data to patterns and rules; applies classification models to the matched digitized data to identify entities and events from the sectioned digitized data; automatically link each entity with corresponding event in a hierarchical format to generate a document having structured data format; and output the document having the structured data with metadata having the linked entity with corresponding event in the hierarchical format to downstream applications.

Type: Grant

Filed: May 5, 2020

Date of Patent: July 25, 2023

Assignee: JPMORGAN CHASE BANK, N.A.

Inventors: Debraj Majumdar, Loryfel Nunez, Adam Leonard Harry Clark, Jayson Lashin, Amish Seth, Noriel E. Flores, Blesson Thomas

System and method for automated content annotation workflow

Patent number: 11704482

Abstract: An automated content annotation workflow is disclosed. An example embodiment is configured for: registering a plurality of labelers to which annotation tasks are assigned; populating a labeling queue with content data to be annotated; assigning annotation tasks from the labeling queue to the plurality of labelers; enabling the plurality of labelers in an annotation review queue to modify or delete annotations applied by prior labelers; and evaluating a level of performance of the plurality of labelers in applying the annotations.

Type: Grant

Filed: February 3, 2022

Date of Patent: July 18, 2023

Assignee: LABELBOX, INC.

Inventors: Manu Sharma, Brian Rieger, Dan Rasmuson, Connor Harwood, Ryan Quinn, Kyle Owens, Randall Lin

Long running workflows for robotic process automation

Patent number: 11704224

Abstract: Systems and methods for executing a robotic process automation (RPA) workflow are provided. The RPA workflow is executed by a first robot. The execution of the RPA workflow is suspended by the first robot. A current context of the RPA workflow is serialized at a time of the suspension and the current context of the RPA workflow is stored. The execution of the RPA workflow is resumed by a second robot based on a triggering condition by retrieving the current context of the RPA workflow. The first robot and the second robot may be the same robot or different robots.

Type: Grant

Filed: April 7, 2022

Date of Patent: July 18, 2023

Assignee: UiPath, Inc.

Inventors: Palak Kadakia, Liji J. Kunnath, Amol Awate, Remus Rusanu
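The suspend-and-resume pattern in the abstract of patent 11704224 (serialize the workflow context at suspension, store it, and let the same or another robot pick it up later) can be sketched in a few lines. This is a toy model, not UiPath's implementation: the step names, the dictionary used as a store, and the `run`/`resume` functions are hypothetical, and JSON stands in for whatever serialization a real system uses.

```python
import json

# Hypothetical workflow steps, for illustration only.
STEPS = ["open_invoice", "request_approval", "post_payment"]

def run(context, store, key, suspend_after="request_approval"):
    """Execute steps from context['next_step']; suspend by serializing the context."""
    while context["next_step"] < len(STEPS):
        step = STEPS[context["next_step"]]
        context["done"].append(step)
        context["next_step"] += 1
        if step == suspend_after:
            store[key] = json.dumps(context)  # persisted context at suspension time
            return "suspended"
    return "completed"

def resume(store, key):
    """Called by any robot once the triggering condition is met."""
    context = json.loads(store.pop(key))
    status = run(context, store, key)
    return status, context
```

Because the second robot rebuilds its state entirely from the stored context, it does not matter whether it is the same process that suspended the workflow.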
Character recognition method and apparatus, electronic device, and storage medium

Patent number: 11699276

Abstract: A method, apparatus, electronic device, and storage medium for character recognition are provided. The method may perform image processing on an acquired original image to obtain a region to be recognized. The region may include a character. The method may determine an area ratio of the region to be recognized on the original image. The method may determine an angle between the region to be recognized and a preset direction. The method may determine a character density of the region to be recognized. The method may perform character recognition on the character in the region to be recognized in response to determining that the area ratio is greater than a ratio threshold, the angle is less than an angle threshold, and the character density is less than a density threshold.

Type: Grant

Filed: July 16, 2021

Date of Patent: July 11, 2023

Assignees: Beijing Xiaomi Mobile Software Co., Ltd., Beijing Xiaomi Pinecone Electronics Co., Ltd.

Inventor: Dong Wang
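The three-way gate in the abstract of patent 11699276 (recognize only when the area ratio exceeds a threshold, the angle is below a threshold, and the character density is below a threshold) is easy to express directly. In the sketch below, the `Region` fields, the default threshold values, and the definition of density as characters per unit of region area are assumptions; the abstract does not define them.

```python
from dataclasses import dataclass

@dataclass
class Region:
    area: float        # region area in pixels
    angle_deg: float   # angle between the region and the preset direction
    char_count: int    # estimated number of characters in the region

def should_recognize(region, image_area, ratio_threshold=0.05,
                     angle_threshold=15.0, density_threshold=0.01):
    """Return True only if all three conditions from the abstract hold."""
    if region.area <= 0 or image_area <= 0:
        return False
    area_ratio = region.area / image_area
    density = region.char_count / region.area  # assumed definition of density
    return (area_ratio > ratio_threshold
            and abs(region.angle_deg) < angle_threshold
            and density < density_threshold)
```

A large, nearly level, sparsely populated region passes the gate; the same region rotated well past the angle threshold does not.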
On-device partial recognition systems and methods

Patent number: 11694459

Abstract: Disclosed is an approach of on-device partial recognition that includes performing partial recognition on an image of a document captured by a mobile device to detect and/or recognize a specific area (e.g., barcodes, non-relevant text, etc.) and filling the recognized area with a solid color. Because the solid color area has a maximum compression ratio, this approach can lead to image size reduction and increased network throughput for client-server based data recognition where further processing such as advanced data extraction is performed at the server side. The approach can be enforced with neural network algorithms to exclude non-relevant information (e.g., logos, phrases, words, etc.).

Type: Grant

Filed: May 24, 2021

Date of Patent: July 4, 2023

Assignee: Open Text Corporation

Inventors: Mikhail Yurievitch Zakharov, Kirill Vaniukov, Christopher Dale Lund
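The size-reduction claim in the abstract of patent 11694459 rests on the fact that a solid-color area compresses far better than detailed content. The sketch below demonstrates that effect on a synthetic grayscale image. It makes several assumptions: the image is a flat byte string of random noise, `zlib` stands in for a real image codec, and `fill_region` and the chosen box are hypothetical names and values.

```python
import random
import zlib

def fill_region(pixels, width, box, value=0):
    """Return a copy of a flat grayscale image with the box filled by one value."""
    left, top, right, bottom = box
    out = bytearray(pixels)
    for y in range(top, bottom):
        start = y * width + left
        out[start:start + (right - left)] = bytes([value]) * (right - left)
    return bytes(out)

# Synthetic 64x64 "image" of noise, seeded so the run is repeatable.
rng = random.Random(0)
width, height = 64, 64
image = bytes(rng.randrange(256) for _ in range(width * height))

# Pretend the top 48 rows were recognized as non-relevant and mask them.
masked = fill_region(image, width, box=(0, 0, 64, 48))

original_size = len(zlib.compress(image))
masked_size = len(zlib.compress(masked))
print(original_size, masked_size)
```

The masked image has the same dimensions but a much smaller compressed size, which is the property the abstract relies on to reduce the upload sent to the server.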
Text refinement network

Patent number: 11688190

Abstract: Systems and methods for text segmentation are described. Embodiments of the inventive concept are configured to receive an image including a foreground text portion and a background portion, classify each pixel of the image as foreground text or background using a neural network that refines a segmentation prediction using a key vector representing features of the foreground text portion, wherein the key vector is based on the segmentation prediction, and identify the foreground text portion based on the classification.

Type: Grant

Filed: November 5, 2020

Date of Patent: June 27, 2023

Assignee: ADOBE INC.

Inventors: Zhifei Zhang, Xingqian Xu, Zhaowen Wang, Brian Price

Providing field extraction recommendations for display

Patent number: 11681900

Abstract: Systems and methods include obtaining a set of events, each event in the set of events comprising a time-stamped portion of raw machine data, the raw machine data produced by one or more components within an information technology or security environment and reflects activity within the information technology or security environment. Thereafter, a first neural network is used to automatically identify variable text to extract as a field from the set of events. An indication of the variable text is provided as a field extraction recommendation, for example, to a user device for presentation to a user.

Type: Grant

Filed: June 15, 2020

Date of Patent: June 20, 2023

Assignee: Splunk Inc.

Inventors: Adam Jamison Oliner, Nghi Huu Nguyen, Jacob Leverich, Zidong Yang

Method and device for controlling text position in a computer display

Patent number: 11681856

Abstract: The present disclosure relates to UI systems and processes including methods for controlling text position in a computer display. A target word in a body of text may be maintained in position by forward rendering and backward rendering, iteratively as the text is modified by the addition or deletion of words or by modifications affecting height or width of a word.

Type: Grant

Filed: November 14, 2022

Date of Patent: June 20, 2023

Assignee: Ascender AI LLC

Inventor: Braddock Gaskill

Image processing apparatus, image processing system, control method thereof, and storage medium

Patent number: 11647139

Abstract: An image processing apparatus according to the present disclosure is an image processing apparatus for automatically transmitting a document file by using a result of a character recognition process on a scan image of a document as a property, and includes: at least one processor that executes the program to perform: extracting a confidence factor indicating a degree of certainty of the result of the character recognition process; in a case where the extracted confidence factor is above a predetermined threshold value, determining that the document file using the result of the character recognition process as the property is allowed to be automatically transmitted; and setting the predetermined threshold value such that an incorrect transmission ratio of document files to be automatically transmitted reaches a target incorrect transmission ratio.

Type: Grant

Filed: December 7, 2021

Date of Patent: May 9, 2023

Assignee: Canon Kabushiki Kaisha

Inventor: Junya Arakawa
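The abstract of patent 11647139 describes tuning a confidence threshold so that the share of incorrectly auto-transmitted files meets a target. One simple way to do this, assuming a labeled history of past recognition results is available, is to scan candidate thresholds from low to high and keep the first one that satisfies the target. The function name, the history format, and this scanning strategy are assumptions for illustration, not the procedure claimed in the patent.

```python
def choose_threshold(history, target_ratio):
    """Pick the lowest threshold whose auto-transmitted set meets the target.

    history: list of (confidence, was_correct) pairs from past recognitions.
    A file is auto-transmitted when its confidence is above the threshold.
    Returns None if no threshold leaves any files that meet the target.
    """
    candidates = sorted({0.0} | {conf for conf, _ in history})
    for threshold in candidates:
        sent = [ok for conf, ok in history if conf > threshold]
        if not sent:
            break  # nothing would be transmitted at this or any higher threshold
        incorrect_ratio = sent.count(False) / len(sent)
        if incorrect_ratio <= target_ratio:
            return threshold
    return None
```

Choosing the lowest qualifying threshold keeps as many files as possible on the automatic path while still respecting the target ratio.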
Document retrieval apparatus and document retrieval method

Patent number: 11640432

Abstract: A document retrieval apparatus includes: a storage unit that stores documents and dictionaries applied to a model, a correspondence between a model and documents applied to the model, and a correspondence between a document and dictionaries applied to the dictionary; a model selecting unit that selects a model; a search target document specifying unit that specifies documents applied to the model selected by the model selecting unit as search target documents; a dictionary specifying unit that specifies dictionaries applied to the search target document; a query receiving unit that inputs a query; a search keyword extraction unit that extracts a search keyword group by applying the dictionary specified by the dictionary specifying unit to the query; a retrieving unit that retrieves the search target document using the search keyword group; and a retrieval result presenting unit that displays search results retrieved by the retrieving unit.

Type: Grant

Filed: June 5, 2020

Date of Patent: May 2, 2023

Assignee: FANUC CORPORATION

Inventors: Yuji Tuboguchi, Masao Kamiguchi, Noriaki Neko

Electronic device and method for providing multiple services respectively corresponding to multiple external objects included in image

Patent number: 11636675

Abstract: An electronic device according to various embodiments includes a communication circuit, a memory, and a processor, and the processor is configured to: receive a first image from a first external electronic device by using the communication circuit; perform image recognition with respect to the first image by using the first image; generate information regarding an external object included in the first image, based on a result of the recognition; based on the information regarding the external object satisfying a first designated condition, transmit at least a portion of the first image to a second external electronic device corresponding to the first designated condition; and, based on the information regarding the external object satisfying a second designated condition, transmit the at least portion of the first image to a third external electronic device corresponding to the second designated condition.

Type: Grant

Filed: November 14, 2019

Date of Patent: April 25, 2023

Assignee: Samsung Electronics Co., Ltd.

Inventors: Dasom Lee, Jungeun Lee, Sungoh Kim, Hyunhee Park

Automated document intake and processing system

Patent number: 11631266

Abstract: An automated documentation intake and processing system involves pre-processing source images of intake documents and applying a YOLO CNN model to identify fields of interest (FOIs) therein which may contain target data. The system classifies FOIs and upsamples the contents in order to digitize and extract target data. The system may be trained to differentiate between target data and non-target data and incorporates an adjustable confidence score that reflects the system's degree of accuracy at predicting the correct FOI (e.g., name, address, insurance number, vehicle registration number). The system is pre-trained to detect a subset of documentation types, form fields, and form field types. However, the system is configured to adapt to variations of the same types of documentation, such as different insurance cards or driver licenses from different states.

Type: Grant

Filed: March 31, 2020

Date of Patent: April 18, 2023

Assignee: Wilco Source Inc

Inventors: Aravind Sampath, Sri Kedarnath Relangi, Suresh Kankanala, Sundararaman Ramasamy

Inference method, inference device, and recording medium

Patent number: 11625931

Abstract: An inference method includes acquiring similarities between a domain name serving as an analysis object and each domain name indicated in a legitimate domain name list as feature amounts, and inferring a degree that the domain name serving as the analysis object is wrongly recognized as a legitimate domain name based on the feature amounts acquired at the acquiring and a training model that outputs, as a response to input of the feature amounts, a degree that the domain name serving as the analysis object is wrongly recognized as the legitimate domain name, by processing circuitry.

Type: Grant

Filed: February 4, 2020

Date of Patent: April 11, 2023

Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Daiki Chiba, Takashi Koide, Ayako Hasegawa, Mitsuaki Akiyama
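The feature step in the abstract of patent 11625931 computes the similarity between a candidate domain name and every entry in a list of legitimate names. The sketch below uses `difflib.SequenceMatcher` as a stand-in similarity measure and replaces the trained model with a trivial rule (the maximum similarity). The list contents, the function names, and both substitutions are assumptions for illustration; the patent does not name a specific similarity metric in its abstract.

```python
from difflib import SequenceMatcher

# Hypothetical list of legitimate domain names.
LEGITIMATE = ["example.com", "paypal.com", "google.com"]

def similarity_features(domain, legitimate=LEGITIMATE):
    """One similarity value in [0, 1] per legitimate domain name."""
    return [SequenceMatcher(None, domain, legit).ratio() for legit in legitimate]

def confusion_score(domain, legitimate=LEGITIMATE):
    """Stand-in for the trained model: highest similarity to any legitimate name.

    An exact match is itself legitimate, so it scores 0.
    """
    if domain in legitimate:
        return 0.0
    return max(similarity_features(domain, legitimate))
```

A look-alike such as `paypa1.com` scores much higher than an unrelated name, which is the signal a real trained model would learn to weigh against other features.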
Text detection, caret tracking, and active element detection

Patent number: 11625138

Abstract: Detection of typed and/or pasted text, caret tracking, and active element detection for a computing system are disclosed. The location on the screen associated with a computing system where the user has been typing or pasting text, potentially including hot keys or other keys that do not cause visible characters to appear, can be identified and the physical position on the screen where typing or pasting occurred can be provided based on the current resolution of where one or more characters appeared, where the cursor was blinking, or both. This can be done by identifying locations on the screen where changes occurred and performing text recognition and/or caret detection on these locations. The physical position of the typing or pasting activity allows determination of an active or focused element in an application displayed on the screen.

Type: Grant

Filed: May 20, 2021

Date of Patent: April 11, 2023

Assignee: UiPath, Inc.

Inventor: Vaclav Skarda

Optical character detection using a low-power sensor

Patent number: 11620806

Abstract: Various aspects of the present disclosure generally relate to optical character detection. In some aspects, a user device may receive, from a vision sensor, a first image that is associated with a first optical character image. The user device may determine, using an image processing model, that the first image depicts the first optical character image. The user device may cause, based at least in part on determining that the first image depicts the first optical character image, a camera to capture a second image that is associated with a second optical character image. The user device may perform an action associated with the second image. Numerous other aspects are provided.

Type: Grant

Filed: June 4, 2020

Date of Patent: April 4, 2023

Assignee: QUALCOMM Incorporated

Inventors: Ravishankar Sivalingam, Russell Gruhlke, Ravindra Vaman Shenoy, Donald William Kidwell, Jr., Evgeni Gousev, Kebin Li, Khurshid Syed Alam, Edwin Chongwoo Park, Arnold Jason Gum

Image processing method, image processing device, electronic device and storage medium

Patent number: 11610291

Abstract: An image processing method, an image processing device, an electronic device, and a non-transitory computer readable storage medium are provided.

Type: Grant

Filed: June 10, 2021

Date of Patent: March 21, 2023

Assignee: Hangzhou Glority Software Limited

Inventors: Qingsong Xu, Qing Li

Assigning documents to entities of a database

Patent number: 11593417

Abstract: In an approach, a processor groups documents into a plurality of groups based on similarity, where: documents of each group have a same document structure; and the document structure is defined by coordinates of text blocks. A processor, for each group of the plurality of groups and for each document of the respective group: retrieves a value of each text block of the respective document in accordance with a document structure of the group; and assigns to each text block of the respective document an attribute that represents the retrieved value of the text block. A processor assigns a first document of the documents to an entity of a database that matches the first document based on the group of text block values and the assigned attributes of the document.

Type: Grant

Filed: January 21, 2021

Date of Patent: February 28, 2023

Assignee: International Business Machines Corporation

Inventors: Thomas Schwarz, Albert Maier, Michael Baessler, Oliver Suhre, Peter Gerstl, Werner Schuetz, Jonathan Roesner, Mariya Chkalova

Visual domain detection systems and methods

Patent number: 11580760

Abstract: Disclosed is an effective domain name defense solution in which a domain name string may be provided to or obtained by a computer embodying a visual domain analyzer. The domain name string may be rendered or otherwise converted to an image. An optical character recognition function may be applied to the image to read out a text string which can then be compared with a protected domain name to determine whether the text string generated by the optical character recognition function from the image converted from the domain name string is similar to or matches the protected domain name. This visual domain analysis can be dynamically applied in an online process or proactively applied in an offline process to hundreds of millions of domain names.

Type: Grant

Filed: May 4, 2020

Date of Patent: February 14, 2023

Assignee: PROOFPOINT, INC.

Inventors: Gaurav Mitesh Dalal, Ali Mesdaq, Sharon Huffner, Harold Nguyen

Electronic device and method for providing multiple services respectively corresponding to multiple external objects included in image

Patent number: 11580732

Abstract: An electronic device according to various embodiments includes a communication circuit, a memory, and a processor, and the processor is configured to: receive a first image from a first external electronic device by using the communication circuit; perform image recognition with respect to the first image by using the first image; generate information regarding an external object included in the first image, based on a result of the recognition; based on the information regarding the external object satisfying a first designated condition, transmit at least a portion of the first image to a second external electronic device corresponding to the first designated condition; and, based on the information regarding the external object satisfying a second designated condition, transmit the at least portion of the first image to a third external electronic device corresponding to the second designated condition.

Type: Grant

Filed: November 14, 2019

Date of Patent: February 14, 2023

Assignee: Samsung Electronics Co., Ltd.

Inventors: Dasom Lee, Jungeun Lee, Sungoh Kim, Hyunhee Park

Integration of an email client with hosted applications

Patent number: 11558321

Abstract: Disclosed are various embodiments for integrating an email client with hosted applications. An email is received from an email client. An image that is a component of the email is identified and sent to an optical character recognition (OCR) service. Extracted text is received from the OCR service. A request for an action object is then sent to a connector for an application, the action object representing a potential action that could be performed with the application based on the extracted text from the OCR service. The action object is then sent to the email client, which is configured to display a prompt allowing a user to perform the action represented by the action object.

Type: Grant

Filed: January 24, 2022

Date of Patent: January 17, 2023

Assignee: VMWARE, INC.

Inventors: Rohit Pradeep Shetty, Shree Harsha Shedigumme

Approximate estimation of number of distinct keys in a multiset using a sample

Patent number: 11537594

Abstract: Herein are quantitative analytics to increase the accuracy of cardinality estimation without increasing sample size. In an embodiment, a computer selects a few sample values from a multiset. A high-frequency exact count of distinct values that have at least a threshold amount of occurrences in the sample values is counted. A low-frequency exact count of distinct values in the sample that do not have at least the threshold amount of occurrences in the sample is counted. Based on multiple binomial probabilities, an upper bound of a count of missing distinct values in the multiset that are not in the sample is calculated. A total count of distinct values (NDV) in the multiset is estimated based on: a) the high-frequency exact count of distinct values, b) the low-frequency exact count of distinct values, and c) the upper bound of the count of missing distinct values in the multiset that are not in the sample.

Type: Grant

Filed: February 5, 2021

Date of Patent: December 27, 2022

Assignee: Oracle International Corporation

Inventor: Suratna Budalakoti

Parsing an ink document using object-level and stroke-level processing

Patent number: 11514695

Abstract: Technology is described herein for parsing an ink document having a plurality of ink strokes. The technology performs stroke-level processing on the plurality of ink strokes to produce stroke-level information, the stroke-level information identifying at least one characteristic associated with each ink stroke. The technology also performs object-level processing on individual objects within the ink document to produce object-level information, the object-level information identifying one or more groupings of ink strokes in the ink document. The technology then parses the ink document into constituent parts based on the stroke-level information and the object-level information. In some implementations, the technology converts the ink stroke data into an ink image. The stroke-level processing and/or the object-level processing may operate on the ink image using one or more neural networks.

Type: Grant

Filed: December 10, 2020

Date of Patent: November 29, 2022

Assignee: Microsoft Technology Licensing, LLC

Inventors: Oussama Elachqar, Badal Yadav, Oz Solomon, Sergey Aleksandrovich Doroshenko, Nima Mohajerin

Probabilistic text index for semi-structured data in columnar analytics storage formats

Patent number: 11514697

Abstract: Herein is a probabilistic indexing technique for searching semi-structured text documents in columnar storage formats such as Parquet, using columnar input/output (I/O) avoidance, and needing minimal storage overhead. In an embodiment, a computer associates columns with text strings that occur in semi-structured documents. Text words that occur in the text strings are detected. Respectively for each text word, a bitmap, of a plurality of bitmaps, that contains a respective bit for each column is generated. Based on at least one of the bitmaps, some of the columns or some of the semi-structured documents are accessed.

Type: Grant

Filed: July 15, 2020

Date of Patent: November 29, 2022

Assignee: Oracle International Corporation

Inventors: Jian Wen, Hamed Ahmadi, Sanjay Jinturkar, Nipun Agarwal, Lijian Wan, Shrikumar Hariharasubrahmanian
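The abstract of patent 11514697 describes per-word bitmaps that hold one bit per column, so that a query can skip columns that cannot contain its words. The sketch below is one possible reading of that idea, not Oracle's implementation. It assumes that words are hashed into a fixed number of bitmaps (which is what makes the index probabilistic here), uses Python integers as bitmaps, and uses hypothetical names throughout.

```python
import re
import zlib

class ColumnWordIndex:
    """Fixed number of bitmaps; each bitmap has one bit per registered column."""

    def __init__(self, num_bitmaps=64):
        self.num_bitmaps = num_bitmaps
        self.bitmaps = [0] * num_bitmaps
        self.columns = []

    def _slot(self, word):
        # Deterministic hash: different words may share a slot (false positives).
        return zlib.crc32(word.encode("utf-8")) % self.num_bitmaps

    def add_column(self, name, strings):
        """Register a column and set its bit for every word in its text strings."""
        col_id = len(self.columns)
        self.columns.append(name)
        for text in strings:
            for word in re.findall(r"\w+", text.lower()):
                self.bitmaps[self._slot(word)] |= 1 << col_id

    def candidate_columns(self, query):
        """Columns that may contain all query words; others can be skipped."""
        mask = (1 << len(self.columns)) - 1
        for word in re.findall(r"\w+", query.lower()):
            mask &= self.bitmaps[self._slot(word)]
        return [name for i, name in enumerate(self.columns) if (mask >> i) & 1]
```

The result can include columns that do not actually contain the words, but it never omits one that does, so only the candidate columns need to be read from storage and verified.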
Methods and systems for depth-aware image searching

Patent number: 11514102

Abstract: Embodiments provide systems, methods, and non-transitory computer storage media for providing search result images based on associations of keywords and depth-levels of an image. In embodiments, depth-levels of an image are identified using depth-map information of the image to identify depth-segments of the image. The depth-segments are analyzed to determine keywords associated with each depth-segment based on objects, features, or content in each depth-segment. An image depth-level data structure is generated by matching keywords generated for the entire image with the keywords at each depth-level and assigning the depth-level to the keyword in the image depth-level data structure for the entire image. The image depth-level data structure may be queried for images that contain keywords and depth-level information that match the keywords and depth-level information specified in a search query.

Type: Grant

Filed: August 14, 2019

Date of Patent: November 29, 2022

Assignee: Adobe Inc.

Inventors: Subham Gupta, Anuradha, Arnab Sil

Document processing optimization

Patent number: 11501551

Abstract: There is a need for a more effective and efficient document processing solution. Accordingly, various embodiments of the present invention introduce various document processing optimization solutions. In one example, a method includes identifying a plurality of input pages each associated with a related input document of a plurality of input documents; for each input page of the plurality of input pages, generating a segmented page; processing each segmented page using a trained encoder model to generate a fixed-dimensional representation of the input page; determining, based at least in part on each fixed-dimensional representation, a plurality of document clusters; determining a plurality of processing groups, where each processing group is associated with one or more related document clusters of the plurality of document clusters; and performing the document processing optimization based at least in part on the plurality of processing groups.

Type: Grant

Filed: June 8, 2020

Date of Patent: November 15, 2022

Assignee: Optum Services (Ireland) Limited

Inventor: Raja Mukherji
Schema-informed extraction for unstructured data

Patent number: 11494425

Abstract: A method of extracting data from documents is provided. The method comprises receiving input of a number of documents and input of a schema of data items available for extraction from the documents. The documents are parsed into a machine-readable representation, and data items in the machine-readable representation are identified according to the schema. Interpretations of data items are propagated within the documents to disambiguate identified data items, and identified data items are matched with other data items in the documents according to the schema. Only identified data items that include a minimal set of interpretations specified by the schema are extracted.

Type: Grant

Filed: February 3, 2020

Date of Patent: November 8, 2022

Assignee: S&P Global Inc.

Inventors: Chester Curme, Delphine Vendryes, Baojia Tong, Matthew Theisen, David Relyea
Method and apparatus for detecting text regions in image, device, and medium

Patent number: 11482023

Abstract: A method and apparatus for detecting text regions in an image, a device, and a medium are provided. The method may include: detecting, based on feature representation of an image, a first text region in the image, where the first text region covers a text in the image, a region occupied by the text being of a certain shape; determining, based on a feature block of the first text region, text geometry information associated with the text, where the text geometry information includes a text centerline of the text and distance information of the centerline from the upper and lower borders of the text; and adjusting, based on the text geometry information associated with the text, the first text region to a second text region, where the second text region also covers the text and is smaller than the first text region.

Type: Grant

Filed: December 11, 2019

Date of Patent: October 25, 2022

Assignee: Beijing Baidu Netcom Science and Technology Co., Ltd.

Inventors: Chengquan Zhang, Zuming Huang, Mengyi En, Junyu Han, Errui Ding
Artificial intelligence engine architecture for generating candidate drugs

Patent number: 11462304

Abstract: An artificial intelligence engine architecture for generating candidate drugs is disclosed. In one embodiment, a method includes generating, via a creator module, a candidate drug compound including a sequence of a candidate drug compound, including the candidate drug compound as a node in a knowledge graph; generating, via a descriptor module, a description of the candidate drug compound at the node in the knowledge graph, wherein the description comprises drug compound structural information, drug compound activity information, and drug compound semantic information; based on the description, performing, via a scientist module, a benchmark analysis of a parameter of the creator module; and modifying, based on the benchmark analysis, the creator module to change the parameter in a desired way during a subsequent benchmark analysis.

Type: Grant

Filed: June 4, 2021

Date of Patent: October 4, 2022

Assignee: Peptilogics, Inc.

Inventors: Francis Lee, Jonathan D. Steckbeck, Hannes Holste
Dynamic soft keyboard

Patent number: 11416142

Abstract: In accordance with one or more aspects of a dynamic soft keyboard, a user input is received via a soft keyboard having multiple keys. Information describing a current input environment for the soft keyboard is obtained, and a determination is made as to which one or more keys of the multiple keys was intended to be selected by the user input. This determination is made based at least in part on the current input environment.

Type: Grant

Filed: November 23, 2021

Date of Patent: August 16, 2022

Assignee: Microsoft Technology Licensing, LLC

Inventors: Erik M. Geidl, Shawn R. LeProwse, Ian C. LeGrow, Reed L. Townsend
Object recognition and tagging based on fusion deep learning models

Patent number: 11416672

Abstract: Certain embodiments involve transforming an electronic document into a tagged electronic document. For instance, an electronic document processing application generates a tagged electronic document from an input electronic document. The electronic document processing application accesses one or more feature maps that identify, via a set of object-recognition rules, identified objects in the electronic document. The electronic document processing application also obtains a heat map of the electronic document that represents attributes in a pixel-wise manner. The electronic document processing application computes a tag by applying a fusion deep learning model to the one or more feature maps and the heat map. The electronic document processing application generates the tagged electronic document by applying the tag to the electronic document.

Type: Grant

Filed: November 24, 2020

Date of Patent: August 16, 2022

Assignee: Adobe Inc.

Inventors: Vlad Morariu, Rajiv Jain, Nishant Sankaran
Dialogue system, dialogue method, and storage medium

Patent number: 11417319

Abstract: According to one embodiment, a dialogue system includes a setting apparatus and a processing apparatus. The setting apparatus sets in advance a plurality of words that are in impossible combination relationships to each other. The processing apparatus acquires speech of a user, and when a speech recognition result of an object included in the speech includes a word combination included in the plurality of words that are in impossible combination relationships to each other, outputs a notification to the user that processing of the object cannot be carried out.

Type: Grant

Filed: February 20, 2018

Date of Patent: August 16, 2022

Assignee: Kabushiki Kaisha Toshiba

Inventors: Takami Yoshida, Kenji Iwata, Yuka Kobayashi, Masami Akamine
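
The check described in the abstract above can be sketched as a lookup against a preset collection of word pairs. This is an illustrative sketch only: representing the "impossible combination relationships" as unordered pairs, and the names `find_impossible_combination` and `handle_request`, are assumptions rather than details from the patent.

```python
from itertools import combinations


def find_impossible_combination(words, impossible_pairs):
    """Return the first pair of recognized words that may not be combined, or None."""
    for first, second in combinations(words, 2):
        if frozenset((first, second)) in impossible_pairs:
            return first, second
    return None


def handle_request(recognized_words, impossible_pairs):
    """Notify the user when the recognized words contain an impossible combination."""
    conflict = find_impossible_combination(recognized_words, impossible_pairs)
    if conflict:
        return f"Cannot process: '{conflict[0]}' and '{conflict[1]}' cannot be combined."
    return "OK: " + " ".join(recognized_words)


# Pairs set in advance (hypothetical example data).
impossible_pairs = {frozenset(("iced", "hot"))}
print(handle_request(["hot", "iced", "coffee"], impossible_pairs))
print(handle_request(["iced", "coffee"], impossible_pairs))
```
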
Reinforcement learning based locally interpretable models

Patent number: 11403490

Abstract: A method for training a locally interpretable model includes obtaining a set of training samples and training a black-box model using the set of training samples. The method also includes generating, using the trained black-box model and the set of training samples, a set of auxiliary training samples and training a baseline interpretable model using the set of auxiliary training samples. The method also includes training, using the set of auxiliary training samples and baseline interpretable model, an instance-wise weight estimator model. For each auxiliary training sample in the set of auxiliary training samples, the method also includes determining, using the trained instance-wise weight estimator model, a selection probability for the auxiliary training sample. The method also includes selecting, based on the selection probabilities, a subset of auxiliary training samples and training the locally interpretable model using the subset of auxiliary training samples.

Type: Grant

Filed: September 23, 2020

Date of Patent: August 2, 2022

Assignee: Google LLC

Inventors: Sercan Omer Arik, Jinsung Yoon, Tomas Jon Pfister
Information processing apparatus and non-transitory computer readable medium

Patent number: 11361529

Abstract: An information processing apparatus includes a processor configured to obtain, for each character of plural characters recognized from an image, (a) position of the character in the image, (b) size of the character, and (c) confidence level of a character recognition result of the character; and determine whether to regard the character as a noise based on a distance between the character and its nearest character, the size of the character, and the confidence level of the character recognition result of the character.

Type: Grant

Filed: August 22, 2019

Date of Patent: June 14, 2022

Assignee: FUJIFILM Business Innovation Corp.

Inventors: Beili Ren, Shunichi Kimura
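
The noise decision in the abstract above combines three signals: distance to the nearest character, character size, and recognition confidence. The sketch below shows one way such signals could be combined. The thresholds, the two-out-of-three voting rule, and the `Char` type are all assumptions chosen for illustration, not values or rules taken from the patent.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Char:
    text: str
    x: float           # centre x position in the image
    y: float           # centre y position in the image
    size: float        # character height in pixels
    confidence: float  # recognition confidence in [0, 1]


def nearest_distance(target, chars):
    """Distance from `target` to the closest other recognized character."""
    others = [c for c in chars if c is not target]
    if not others:
        return float("inf")
    return min(hypot(c.x - target.x, c.y - target.y) for c in others)


def is_noise(target, chars, max_gap_ratio=3.0, min_size=6.0, min_confidence=0.5):
    """Flag a character as noise when at least two of three signals agree (assumed rule)."""
    isolated = nearest_distance(target, chars) > max_gap_ratio * target.size
    tiny = target.size < min_size
    unsure = target.confidence < min_confidence
    return (isolated + tiny + unsure) >= 2


chars = [
    Char("a", 0, 0, 10, 0.90),
    Char("b", 12, 0, 10, 0.95),
    Char(".", 200, 80, 3, 0.20),  # small, far away, low confidence
]
print([c.text for c in chars if is_noise(c, chars)])
```
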
System and method for transforming unstructured numerical information into a structured format

Patent number: 11347733

Abstract: Embodiments described herein automatically classify numerical expressions from a textual document and designate a context to understand each numerical expression. Specifically, numerical expressions from a textual context are classified as nominal or cardinal. For cardinal numerical expressions that carry a quantitative meaning, inference terms are determined from the textual context to associate with the cardinal numerical expressions. The numerical expressions are then translated to a format of a numerical value and stored with metadata indicating the unit scale or the meaning of the numerical value.

Type: Grant

Filed: August 8, 2019

Date of Patent: May 31, 2022

Assignee: salesforce.com, inc.

Inventor: Joy Mustafi
Long running workflows for robotic process automation

Patent number: 11334465

Abstract: Systems and methods for executing a robotic process automation (RPA) workflow are provided. The RPA workflow is executed by a first robot. The execution of the RPA workflow is suspended by the first robot. A current context of the RPA workflow is serialized at a time of the suspension and the current context of the RPA workflow is stored. The execution of the RPA workflow is resumed by a second robot based on a triggering condition by retrieving the current context of the RPA workflow. The first robot and the second robot may be the same robot or different robots.

Type: Grant

Filed: December 17, 2019

Date of Patent: May 17, 2022

Assignee: UiPath, Inc.

Inventors: Palak Kadakia, Liji J. Kunnath, Amol Awate, Remus Rusanu
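
The suspend-and-resume flow in the abstract above can be sketched with a serialized context that one robot stores and another retrieves. This is a minimal sketch under stated assumptions: JSON as the serialization format, an in-memory `ContextStore`, the `SUSPEND` marker, and the `run` function are all hypothetical, and the `resume` flag stands in for the triggering condition.

```python
import json

SUSPEND = "suspend"  # hypothetical marker a step returns to request suspension


class ContextStore:
    """In-memory stand-in for persistent storage of serialized contexts."""

    def __init__(self):
        self._data = {}

    def save(self, workflow_id, context):
        self._data[workflow_id] = json.dumps(context)

    def load(self, workflow_id):
        return json.loads(self._data[workflow_id])


def run(robot_name, workflow_id, steps, store, resume=False):
    """Run steps in order; on suspension, serialize and store the current context."""
    context = store.load(workflow_id) if resume else {"next_step": 0, "log": []}
    while context["next_step"] < len(steps):
        step = steps[context["next_step"]]
        context["next_step"] += 1
        context["log"].append(f"{robot_name}:{step.__name__}")
        if step(context) == SUSPEND:
            store.save(workflow_id, context)
            return False, context
    return True, context


def submit_request(context):
    context["request_id"] = 42
    return SUSPEND  # wait for an external event before continuing


def process_approval(context):
    context["done"] = True


store = ContextStore()
steps = [submit_request, process_approval]
finished, _ = run("robot-A", "wf-1", steps, store)
print(finished)
finished, context = run("robot-B", "wf-1", steps, store, resume=True)
print(finished, context["log"])
```

Because the stored context records the index of the next step, the resuming robot does not need any in-memory state from the robot that suspended the workflow.
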
Content fragments aligned to content criteria

Patent number: 11308146

Abstract: Content fragments aligned to content criteria enable rich sets of multimodal content to be generated based on specified content criteria, such as content needs pertaining to various content delivery platforms and scenarios. For instance, the described techniques take a set of content (e.g., text, images, etc.) along with specified content criteria (e.g., business/user need) and create content fragment variants that are tailored to the content criteria with respect to both the information presented as well as the style of the content presented.

Type: Grant

Filed: March 4, 2020

Date of Patent: April 19, 2022

Assignee: Adobe Inc.

Inventors: Gaurav Verma, Suryateja B V, Samagra Sharma, Balaji Vasan Srinivasan
Network analyzer that provides answer to inquiry about network failure and network analyzing method

Patent number: 11296925

Abstract: A network analyzer includes a storage and a processor to provide an answer to an inquiry including an inquiry statement and a device log. The storage stores first information indicating a relation between a previously received inquiry statement and an answer for the inquiry statement and second information indicating a relation between a previously received device log and an answer for the device log. The processor selects a first answer candidate for a new inquiry statement and calculates a first degree of certainty of the first answer candidate based on the first information. The processor selects a second answer candidate for a new device log corresponding to the new inquiry statement and calculates a second degree of certainty of the second answer candidate based on the second information. The processor determines an answer based on the first answer candidate, the first degree of certainty, the second answer candidate, and the second degree of certainty.

Type: Grant

Filed: April 1, 2020

Date of Patent: April 5, 2022

Assignee: FUJITSU LIMITED

Inventor: Shinji Yamashita
Saliency mapping by feature reduction and perturbation modeling in medical imaging

Patent number: 11263744

Abstract: For saliency mapping, a machine-learned classifier is used to classify input data. A perturbation encoder is trained and/or applied for saliency mapping of the machine-learned classifier. The training and/or application (testing) of the perturbation encoder uses less than all feature maps of the machine-learned classifier, such as selecting different feature maps of different hidden layers in a multiscale approach. The subset used is selected based on gradients from back-projection. The training of the perturbation encoder may be unsupervised, such as using an entropy score, or semi-supervised, such as using the entropy score and a difference of a perturbation mask from a ground truth segmentation.

Type: Grant

Filed: December 9, 2019

Date of Patent: March 1, 2022

Assignee: Siemens Healthcare GmbH

Inventors: Youngjin Yoo, Pascal Ceccaldi, Eli Gibson, Mariappan S. Nadar
Browsing images via mined hyperlinked text snippets

Patent number: 11250203

Abstract: Images stored in an information repository are prepared for browsing. For each image in the repository, text in the repository is mined to extract snippets of text about the image which are semantically relevant to the image, and for each of these snippets of text, keyterms are detected in the snippet of text which represent either concepts that are related to the image or entities that are related to the image, and the snippet of text and keyterms are associated with the image. Each keyterm that is associated with each image in the repository is hyperlinked to each other image in the repository that has this keyterm associated therewith. A graphical user interface allows a user to browse the images in the repository by using their associated snippets of text and hyperlinked keyterms.

Type: Grant

Filed: August 12, 2013

Date of Patent: February 15, 2022

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Simon John Baker, Anitha Kannan, Krishnan Ramnath
Reformatting of context sensitive data

Patent number: 11244122

Abstract: A method for dynamically detecting and converting context-sensitive information in a first language and a first format to a second language and a second format that is understandable to an end user based on a user-specified setting is provided. The method may use a built-in camera of a computing device to dynamically detect and capture an image frame of context-sensitive information. The method may use optical character recognition (OCR), as well as contextual information such as GPS data available from a mobile computing device, to automatically translate and reformat the context-sensitive information in real-time so a user may understand it unambiguously.

Type: Grant

Filed: October 28, 2015

Date of Patent: February 8, 2022

Assignee: International Business Machines Corporation

Inventors: Yu-Ning Hsu, Elaine I H Liao, Chih-Yuan Lin, Cheng-Yu Yu
Method for user-operation mode selection and terminals

Patent number: 11237703

Abstract: A method and device for page displaying, a terminal, and a storage medium are provided according to an implementation of the present disclosure. The present disclosure is related to the technical field of terminals. The method includes the following. A user configuration interface is displayed, where the user configuration interface includes a first area and a second area, and on the first area selectable operation modes are displayed. A selection operation on the first area is received, and at least one target operation mode is determined from the selectable operation modes in response to the selection operation. Information associated with the at least one target operation mode is displayed on the second area.

Type: Grant

Filed: July 14, 2020

Date of Patent: February 1, 2022

Assignee: GUANGDONG OPPO MOBILE TELECOMMUNICATIONS CORP., LTD.

Inventors: Qianyang Huang, Lixia Duan
Image reading apparatus and information processing apparatus that reads documents and generates image data

Patent number: 11223727

Abstract: Provided is an image reading apparatus capable of eliminating the need for a user to correct a character portion that cannot be recognized by OCR and reducing the operation burden on the user. A non-word detecting unit detects a non-word that is not considered to be a word among a plurality of words constituting the text in a document. A determining unit determines whether or not a compound word obtained by combining the non-word with at least one of the word immediately before the non-word and the word immediately after the non-word in that arrangement order is a word. A character correcting unit identifies the text portion corresponding to the compound word in the text in the document as a failed character recognition portion, and corrects the text of the failed character recognition portion to the text of the compound word.

Type: Grant

Filed: November 26, 2020

Date of Patent: January 11, 2022

Assignee: KYOCERA Document Solutions Inc.

Inventors: Aida Yagon, Ronald Reyes, Charles Allera
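
The compound-word correction in the abstract above can be sketched as a pass over the recognized words that tries to join each non-word with its neighbours. This is an illustrative sketch only: the plain set used as a dictionary, the order in which candidates are tried, and the name `merge_non_words` are assumptions, not details from the patent.

```python
def merge_non_words(words, dictionary):
    """Merge each non-word with its neighbours when the joined text is a known word."""
    result = []
    i = 0
    while i < len(words):
        word = words[i]
        if word.lower() in dictionary:
            result.append(word)
            i += 1
            continue
        prev_word = result[-1] if result else None
        next_word = words[i + 1] if i + 1 < len(words) else None
        # Candidates keep the original arrangement order: before + non-word + after.
        if prev_word and next_word and (prev_word + word + next_word).lower() in dictionary:
            result[-1] = prev_word + word + next_word
            i += 2
        elif prev_word and (prev_word + word).lower() in dictionary:
            result[-1] = prev_word + word
            i += 1
        elif next_word and (word + next_word).lower() in dictionary:
            result.append(word + next_word)
            i += 2
        else:
            result.append(word)  # no known compound; leave the text unchanged
            i += 1
    return result


# Hypothetical word list standing in for a real dictionary.
dictionary = {"the", "document", "was", "scan", "scanned", "in", "information"}
print(merge_non_words(["the", "docu", "ment", "was", "scanned"], dictionary))
```
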
Information processing apparatus and non-transitory computer readable medium for selecting a proper version of a recognition dictionary that is not necessarily a latest version

Patent number: 11200450

Abstract: An information processing apparatus includes a selection unit that, when a target document is recognized, selects a first mode in which a latest version of a recognition dictionary is applied, or a second mode in which a version of the recognition dictionary is applied, the version of the recognition dictionary having a highest correct answer rate among plural versions different from the latest version, the correct answer rate being obtained from a recognition result and a confirmation or correction result of each of plural documents.

Type: Grant

Filed: October 9, 2019

Date of Patent: December 14, 2021

Assignee: FUJIFILM Business Innovation Corp.

Inventor: Shintaro Nishioka
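
The version selection in the abstract above can be sketched by computing a correct answer rate per dictionary version from pairs of recognition results and confirmed results. This is a minimal sketch: the abstract does not state when the second mode is chosen, so the rule used here (switch only when an older version has a strictly higher rate than the latest) is an assumption, as are the function names and the data layout.

```python
def correct_answer_rate(records):
    """Fraction of (recognized, confirmed) pairs where recognition was correct."""
    if not records:
        return 0.0
    return sum(1 for recognized, confirmed in records if recognized == confirmed) / len(records)


def select_version(history, latest):
    """Return (mode, version), where mode is "first" (latest) or "second" (best older).

    `history` maps version -> list of (recognized, confirmed) pairs.
    """
    rates = {version: correct_answer_rate(records) for version, records in history.items()}
    older = {version: rate for version, rate in rates.items() if version != latest}
    if not older:
        return "first", latest
    best_older = max(older, key=older.get)
    # Assumed rule: prefer the latest version unless an older one is strictly better.
    if older[best_older] > rates.get(latest, 0.0):
        return "second", best_older
    return "first", latest


history = {
    "v1": [("A", "A"), ("B", "B")],
    "v2": [("A", "A"), ("8", "B")],
    "v3": [("4", "A"), ("8", "B")],
}
print(select_version(history, latest="v3"))
```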
