Context Analysis Or Word Recognition (e.g., Character String) Patents (Class 382/229)
-
Patent number: 12149862
Abstract: A method includes receiving, at one or more processors, a content description and a matrix code. The content description includes data describing one or more computer-readable files. The matrix code is encoded with a link to the one or more files. The method includes the one or more processors determining a condition indicative of a first computing device being unable to display the file with at least a minimum quality level. The determination is based, at least in part, on the content description. The method includes the processor outputting the matrix code to a first display operatively coupled to the one or more processors. Outputting the matrix code occurs in response to determining the condition indicative of the first computing device being unable to display the one or more files with at least the minimum quality level. Furthermore, the method includes displaying, on the first display, the matrix code.
Type: Grant
Filed: February 8, 2019
Date of Patent: November 19, 2024
Assignee: Snap-on Incorporated
Inventors: Patrick S. Merg, Roy S. Brozovich
-
Patent number: 12124813
Abstract: Disclosed embodiments include a computer readable medium that may include instructions that when executed by one or more processing devices cause the one or more processing devices to perform a method. The method may include: receiving an identification of at least one source text document; loading text of the at least one source text document; analyzing the text of the at least one source text document; generating at least one summary snippet associated with one or more portions of the text of the at least one source text document, wherein the at least one summary snippet conveys a meaning associated with the one or more portions of the text, but includes one or more textual differences relative to the one or more portions of the text of the at least one source text document; and causing the at least one summary snippet to be shown on a display.
Type: Grant
Filed: January 12, 2023
Date of Patent: October 22, 2024
Assignee: AI21 Labs
Inventors: Barak Peleg, Dan Padnos, Amnon Morag, Gilad Lumbroso, Yoav Shoham, Ori Goshen, Barak Lenz, Or Dagan, Guy Einy
-
Patent number: 12112561
Abstract: Techniques are disclosed to provide an interactive visual representation of semantically related extracted data. In various embodiments, a plurality of data entities are extracted from a file, each entity comprising a key-value pair. One or more sets of related entities, each set comprising an occurrence of a defined repeating type of entity set, are identified among the plurality of data entities. Data associating the one or more sets of related entities with the file and the defined repeating type of entity set are stored.
Type: Grant
Filed: November 23, 2021
Date of Patent: October 8, 2024
Assignee: Figma, Inc.
Inventors: Amy Pham Le, Joren Lauwers, Richard Vaughan Stebbing, Ankur Goyal
-
Patent number: 12106594
Abstract: An information processing apparatus includes a processor configured to receive, if an error is found in sorting of a form image after an operator checks and corrects a result of character recognition performed on the sorted form image, an instruction to cause a process to revert to the sorting for the form image.
Type: Grant
Filed: June 2, 2021
Date of Patent: October 1, 2024
Assignee: FUJIFILM BUSINESS INNOVATION CORP.
Inventor: Masashi Ban
-
Patent number: 12100234
Abstract: In an exemplary embodiment, the invention comprises a principled edit-distance system that performs a method for determining the probability of character errors. In another exemplary embodiment, the invention comprises a post-OCR error correction system that performs a context-sensitive correction method. In another exemplary embodiment, the invention comprises a post-OCR error correction system that performs a comprehensive, unified correction process based on generalized edit distance analysis, wherein the objective is to find a corrected sentence that has the overall smallest edit distance across all levels. In another exemplary embodiment, the invention comprises a post-OCR error correction system that comprises one or more subjective fractional rank-based dictionaries. In another embodiment, the invention comprises a post-OCR error correction system that performs the automatic assignment of rank to words per-document dictionaries.
Type: Grant
Filed: November 12, 2021
Date of Patent: September 24, 2024
Assignee: Lexalytics, Inc.
Inventors: Jeff Catlin, Brian Pinette
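The edit-distance idea named in this abstract can be illustrated with a minimal sketch. This is not the patented method: it is an ordinary weighted Levenshtein distance in which visually confusable characters are cheaper to substitute. The cost table `CONFUSION_COST` and the function names are hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical substitution costs for characters an OCR engine often confuses.
# These values are assumptions for illustration, not taken from the patent.
CONFUSION_COST: Dict[Tuple[str, str], float] = {
    ("0", "o"): 0.2,
    ("1", "l"): 0.2,
    ("5", "s"): 0.3,
}


def sub_cost(a: str, b: str) -> float:
    """Cost of replacing character a with b (symmetric lookup, default 1.0)."""
    if a == b:
        return 0.0
    return CONFUSION_COST.get((a, b), CONFUSION_COST.get((b, a), 1.0))


def weighted_edit_distance(source: str, target: str) -> float:
    """Levenshtein distance with unit insert/delete and weighted substitution."""
    prev = [float(j) for j in range(len(target) + 1)]
    for i, sc in enumerate(source, start=1):
        curr = [float(i)]
        for j, tc in enumerate(target, start=1):
            curr.append(min(
                prev[j] + 1.0,                   # delete sc
                curr[j - 1] + 1.0,               # insert tc
                prev[j - 1] + sub_cost(sc, tc),  # substitute sc -> tc
            ))
        prev = curr
    return prev[-1]


def best_correction(token: str, dictionary: Iterable[str]) -> str:
    """Pick the dictionary word closest to the OCR token."""
    return min(dictionary, key=lambda w: weighted_edit_distance(token, w))
```

For example, `best_correction("he1lo", ["hello", "help", "halo"])` selects `"hello"`, because substituting `1` for `l` is cheap under the assumed cost table.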
-
Patent number: 12100257
Abstract: A system comprising: a document holder; an image sensor directed toward the document holder; a polarized filter between the image sensor and the document holder; a filter motor configured to rotate the polarized filter; one or more processors; and at least one memory having stored thereon computer program code that, when executed by the one or more processors, instructs the one or more processors to: control the filter motor to rotate the polarized filter; control the image sensor to capture a plurality of images of a document within the document holder, the plurality of images corresponding to respective relative orientations of the polarized filter to the document; identify a feature of the document; and output, in response to comparing respective visual characteristics of the feature of the document to corresponding expected visual characteristics of the first feature for the document, an indication of a verification of the document.
Type: Grant
Filed: July 12, 2023
Date of Patent: September 24, 2024
Assignee: CAPITAL ONE SERVICES, LLC
Inventors: Kevin Osborn, Daniel Marsch
-
Patent number: 12099080
Abstract: An ingress protection assembly for leading a wire, such as a signal and/or a supply line, through a partition wall of an electrical device is provided. The ingress protection assembly includes a first wall section provided with a receiving slot extending through the first wall section in a lead-through direction and opening at an inlet facing in an assembly direction of the ingress protection assembly; and a second wall section provided with a counter slot extending through the second wall section in the lead-through direction and opening at a counter inlet facing against the assembly direction, wherein at least in a fully assembled state of the ingress protection assembly, the first wall section and the second wall section are at least partially superimposed in a projection along the lead-through direction such that the receiving slot and the counter slot together form an aperture configured for tightly encompassing the wire.
Type: Grant
Filed: March 26, 2020
Date of Patent: September 24, 2024
Assignee: Landis+Gyr AG
Inventor: Andrew Cakebread
-
Patent number: 12086189
Abstract: A document search device includes a processor, and a memory storing program instructions that cause the processor to search for an input keyword in a document database in which document information including text data extracted by using a character recognition process from document image data generated by imaging a paper document is stored, select a similar keyword in accordance with a degree of similarity to the input keyword from a group of wildcard strings generated from the input keyword and search for the similar keyword in the document database, the degree of similarity being determined by comparing each character of the input keyword with a corresponding character of a wildcard string in the group of wildcard strings, and output a search result obtained by searching for the input keyword in the document database and a search result obtained by searching for the similar keyword in the document database.
Type: Grant
Filed: June 22, 2023
Date of Patent: September 10, 2024
Assignee: RESONAC CORPORATION
Inventors: Yoshishige Okuno, Takuya Minami, Eriko Takeda, Hajime Hotta
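One way to picture the wildcard-string search described in this abstract is the sketch below. It is a loose illustration under stated assumptions, not the claimed device: wildcard strings are formed by replacing up to `max_wildcards` characters with `?`, the degree of similarity is the fraction of positions where the keyword and the wildcard string agree, and matching uses Python's `fnmatch.fnmatchcase`. The function names and the 0.75 threshold are hypothetical, and keywords are assumed to contain no glob metacharacters (`*`, `?`, `[`).

```python
from fnmatch import fnmatchcase
from itertools import combinations
from typing import Iterable, List


def wildcard_strings(keyword: str, max_wildcards: int = 1) -> List[str]:
    """Replace every combination of up to max_wildcards characters with '?'."""
    patterns = []
    for k in range(1, max_wildcards + 1):
        for positions in combinations(range(len(keyword)), k):
            chars = list(keyword)
            for p in positions:
                chars[p] = "?"
            patterns.append("".join(chars))
    return patterns


def similarity(keyword: str, pattern: str) -> float:
    """Fraction of character positions where keyword and pattern agree."""
    if not keyword:
        return 0.0
    same = sum(1 for a, b in zip(keyword, pattern) if a == b)
    return same / len(keyword)


def similar_hits(keyword: str, vocabulary: Iterable[str],
                 min_similarity: float = 0.75,
                 max_wildcards: int = 1) -> List[str]:
    """Words (other than the keyword) matching a sufficiently similar pattern."""
    patterns = [p for p in wildcard_strings(keyword, max_wildcards)
                if similarity(keyword, p) >= min_similarity]
    words = list(vocabulary)
    return sorted({w for w in words for p in patterns
                   if w != keyword and fnmatchcase(w, p)})
```

With an OCR-derived vocabulary such as `["resin", "res1n", "rosin", "raisin"]`, `similar_hits("resin", ...)` returns the one-character variants `res1n` and `rosin`, which an exact keyword search would miss.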
-
Patent number: 12073397
Abstract: The present disclosure involves systems, software, and computer implemented methods for transaction auditing. One example method includes determining valid pixel-based pattern(s) that are included in valid reference images. Fraudulent pixel-based pattern(s) that are included in fraudulent reference images are determined. A request to classify an image is received. A determination is made as to whether pixel values in the image match a valid pixel-based pattern or a fraudulent pixel-based pattern. In response to determining that the pixel values match a valid pixel-based pattern, a likelihood of classifying the image as a valid image is increased. In response to determining that the pixel values match a fraudulent pixel-based pattern, a likelihood of classifying the image as a fraudulent image is increased. The image is classified in response to the request as either a valid image or a fraudulent image based on the likelihoods.
Type: Grant
Filed: April 11, 2022
Date of Patent: August 27, 2024
Assignee: SAP SE
Inventors: Jesper Lind, Suchitra Sundararaman
-
Patent number: 12028372
Abstract: Systems, methods and products for identifying “similar” threats by clustering the threats based on corresponding forensics. A corpus of forensic data for a plurality of threat URLs is obtained by a threat protection system, the data including forensic elements corresponding to each threat URL. For each pair of threat URLs, the corresponding forensic elements are examined to identify shared forensic elements. A similarity score is then generated for the pair of threat URLs based on the comparison of the corresponding forensic elements, including both malicious and non-malicious elements. Based on the similarity score generated for each pair of threat URLs, clusters of the threat URLs are identified, with each cluster including a subset of the plurality of threat URLs. Clusters of URLs similar to a selected URL may be identified by accessing the threat cluster information using a similar-threat search interface or through internal APIs of the threat protection system.
Type: Grant
Filed: March 26, 2021
Date of Patent: July 2, 2024
Assignee: PROOFPOINT, INC.
Inventors: Garrick Dasbach, Jonathan Ogilvie
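The pairwise-score-then-cluster flow in this abstract can be sketched as follows. The abstract does not say how the similarity score is computed or how clusters are formed, so this sketch assumes a Jaccard score over sets of forensic elements and single-link grouping with a union-find structure. The function names, the 0.5 threshold, and the sample data are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Shared elements divided by all elements seen in either set."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_urls(forensics: Dict[str, Set[str]],
                 threshold: float = 0.5) -> List[List[str]]:
    """Group URLs whose pairwise score meets the threshold (single link)."""
    parent = {url: url for url in forensics}

    def find(u: str) -> str:
        # Follow parent links to the cluster root, compressing the path.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    for u, v in combinations(forensics, 2):
        if jaccard(forensics[u], forensics[v]) >= threshold:
            parent[find(u)] = find(v)

    clusters: Dict[str, List[str]] = {}
    for url in forensics:
        clusters.setdefault(find(url), []).append(url)
    return sorted(sorted(members) for members in clusters.values())
```

Given `{"a": {"x", "y", "z"}, "b": {"x", "y"}, "c": {"q"}}`, the URLs `a` and `b` share two of three elements and fall into one cluster, while `c` stays on its own.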
-
Patent number: 12027070
Abstract: A computer-implemented method for providing a framework to identify questions and answers dynamically from a dataset based on previous learning and an evaluation score of a user. The method includes creating a library of potential questions and answers from the dataset based on the previous learning and evaluation score of the user, and generating a set of personalized questions, for the user, related to the dataset by utilizing sentence-based machine translation (SBMT) and natural language processing (NLP) tools. The method further includes identifying a plurality of answers for the set of personalized questions for the user, based on collective information available in the dataset, and providing, to the user, the plurality of answers for the set of personalized questions for verification and evaluation.
Type: Grant
Filed: March 15, 2022
Date of Patent: July 2, 2024
Assignee: International Business Machines Corporation
Inventors: Pinaki Bhattacharya, Harish Bharti, Rajeev Mittal, Anupama Ratha, Dinesh Wadekar, Sandeep Sukhija
-
Patent number: 12019622
Abstract: Disclosed herein are system, method, and computer program product embodiments for maintaining of a geometric object in a database. An embodiment operates by a database maintaining a first page storing a data block in the database's on-disk store such that the data block stores at least one byte of the geometric object. After receiving the request for the geometric object, the database loads the page storing the geometric object in the in-memory store and determines the size of the geometric object. Based on the size of the geometric object, the database stores the geometric object in the in-memory store directly or in a heap of the in-memory store.
Type: Grant
Filed: June 29, 2023
Date of Patent: June 25, 2024
Assignee: SAP SE
Inventors: Colin Florendo, Surendra Vishnoi, Janardhan Hungund, Manuel Caroli
-
Patent number: 12008986
Abstract: A speech recognition system includes, or has access to, conventional speech recognizer data, including a conventional acoustic model and pronunciation dictionary. The speech recognition system generates restructured speech recognizer data from the conventional speech recognizer data. When used at runtime by a speech recognizer module, the restructured speech recognizer data produces more accurate and efficient results than those produced using the conventional speech recognizer data. The restructuring involves segmenting entries of the conventional pronunciation dictionary and acoustic model according to their constituent phonemes and grouping those entries with the same initial N phonemes, for some integer N (e.g., N=3), and deriving a restructured dictionary with a corresponding semi-word acoustic model for the various grouped entries.
Type: Grant
Filed: April 27, 2020
Date of Patent: June 11, 2024
Assignee: Interactions LLC
Inventors: Ilija Zeljkovic, Andrej Ljolje
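The grouping step in this abstract (entries sharing the same initial N phonemes) can be illustrated with a short sketch. It covers only the grouping of dictionary entries, not the acoustic-model restructuring, and the function name and sample pronunciations are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_by_initial_phonemes(
    pronunciations: Dict[str, List[str]], n: int = 3
) -> Dict[Tuple[str, ...], List[str]]:
    """Group words whose pronunciations share the same first n phonemes."""
    groups: Dict[Tuple[str, ...], List[str]] = defaultdict(list)
    for word, phonemes in pronunciations.items():
        # Words shorter than n phonemes are keyed by their full pronunciation.
        groups[tuple(phonemes[:n])].append(word)
    return dict(groups)


# Illustrative, made-up pronunciation entries (assumption, not from the patent).
lexicon = {
    "interact": ["IH", "N", "T", "ER", "AE", "K", "T"],
    "internet": ["IH", "N", "T", "ER", "N", "EH", "T"],
    "cat": ["K", "AE", "T"],
}
grouped = group_by_initial_phonemes(lexicon, n=3)
```

Here `interact` and `internet` land in the same group keyed by `("IH", "N", "T")`, while `cat` forms its own group.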
-
Patent number: 12001669
Abstract: An image processing device and method for more easily performing a write operation on an image. Write information corresponding to a feature of an image of processing target is searched for based on relevant information that associates the write information with the feature of the image. For example, it is furthermore possible to search for write information corresponding to the feature of an image of the processing target based on the learning result of the user's behavior. The present invention can be applied to, for example, an information processing device, an image processing device, electronic equipment, an image processing method, a program, or the like.
Type: Grant
Filed: March 17, 2020
Date of Patent: June 4, 2024
Assignee: Sony Group Corporation
Inventors: Gakuya Kondo, Chao Ma
-
Patent number: 11995115
Abstract: An information extracting device includes an acquiring unit that acquires, with regard to each information source, a data group made up of data including a content relating to an object, a location where the content was recorded, and a date and time at which the content was recorded, a generating unit that generates a word vector of which the date and time, the location, content using a weight based on the date and time, and a type of the content, are each components, for each of the data of the data group, a distance calculating unit that calculates a distance among the word vectors, a classifying unit that classifies each of data of the data group on the basis of the distance among the word vectors, and an extracting unit that calculates a reliability with regard to the classification, and extracts data from the data group on the basis of the reliability.
Type: Grant
Filed: June 14, 2019
Date of Patent: May 28, 2024
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Naoto Abe, Hitoshi Seshimo, Hiroshi Konishi
-
Patent number: 11983945
Abstract: A new segment of electronic handwriting is provided to a handwriting recognition module to obtain a plurality of textual interpretations of the new segment. The textual interpretations obtained from the handwriting recognition module are scored based on how each respective electronic handwriting representation would change a display of existing electronic content when the respective electronic handwriting representation is displayed substantially at the user designated position within or adjacent to the existing electronic content. Based on the scoring, an electronic handwriting representation corresponding to a respective textual interpretation of the plurality of textual interpretations is selected, and the existing electronic content is modified to include the selected electronic handwriting representation located substantially at the user designated position.
Type: Grant
Filed: May 21, 2021
Date of Patent: May 14, 2024
Assignee: Google LLC
Inventors: Maria Cirimele, Thomas William Buckley, Robert Ky Mickle, Tayeb Al Karim
-
Patent number: 11967072
Abstract: The present disclosure relates to techniques for segmenting objects within medical images using a deep learning network that is localized with object detection based on a derived contrast mechanism. Particularly, aspects are directed to localizing an object of interest within a first medical image having a first characteristic, projecting a bounding box or segmentation mask of the object of interest onto a second medical image having a second characteristic to define a portion of the second medical image, and inputting the portion of the second medical image into a deep learning model that is constructed as a detector using a weighted loss function capable of segmenting the portion of the second medical image and generating a segmentation boundary around the object of interest. The segmentation boundary may be used to calculate a volume of the object of interest for determining a diagnosis and/or a prognosis of a subject.
Type: Grant
Filed: February 7, 2022
Date of Patent: April 23, 2024
Assignee: Genentech, Inc.
Inventors: Luke Xie, Kai Henrik Barck, Omid Bazgir
-
Patent number: 11941349
Abstract: A computer-implemented method for handwritten text line wrapping includes: obtaining, from a user, at least two words of handwritten text on a screen; determining an original bounding box for the at least two words; creating at least one line-break character for the at least two words; determining at least one baseline for the at least two words; determining a new bounding box for the at least two words based on the at least one baseline; generating, on the screen, a text box; moving, on the screen, at least one of the at least two words from a first line of at least one line of handwritten text to a second line of the at least one line of handwritten text, wherein the second line of handwritten text fits within the text box; and adjusting at least one gap between the at least one line of handwritten text.
Type: Grant
Filed: September 12, 2022
Date of Patent: March 26, 2024
Assignee: Lenovo (Singapore) Pte. Ltd.
Inventors: Tran Minh Khuong Vu, Ryohta Nomura
-
Patent number: 11942086
Abstract: A description support device for displaying information on a topic to be checked in an utterance by a user, the description support device includes: an inputter to acquire input information indicating an utterance sentence corresponding to the utterance; a controller to generate information indicating a check result of the topic for the utterance sentence; and a display to display information generated by the controller, wherein the display is configured to display a checklist indicating whether or not the topic is described in the utterance sentence indicated by the input information sequentially acquired by the inputter, and wherein the display is configured to display, according to a likelihood of each utterance sentence, display information including the utterance sentence, the likelihood defining the check result of the topic in the checklist.
Type: Grant
Filed: December 17, 2020
Date of Patent: March 26, 2024
Assignee: PANASONIC INTELLECTUAL PROPERTY MANAGEMENT CO., LTD.
Inventors: Natsuki Saeki, Shoichi Araki, Masakatsu Hoshimi, Takahiro Kamai
-
Patent number: 11928553
Abstract: Based upon the principles of randomness and self-modification a novel computing machine is constructed. This computing machine executes computations, so that it is difficult to apprehend by an adversary and hijack with malware. These methods can also be used to help thwart reverse engineering of proprietary algorithms, hardware design and other areas of intellectual property. Using quantum randomness in the random instructions and self-modification in the meta instructions, creates computations that are incomputable by a digital computer. In an embodiment, a more powerful computational procedure is created than a computational procedure equivalent to a digital computer procedure. Current digital computer algorithms and procedures can be constructed or designed with ex-machine programs, that are specified by standard instructions, random instructions and meta instructions. A novel computer is invented so that a program's execution is difficult to apprehend.
Type: Grant
Filed: August 14, 2021
Date of Patent: March 12, 2024
Assignee: Aemea Inc.
Inventor: Michael Stephen Fiske
-
Patent number: 11893747
Abstract: The invention provides an image segmentation method and an electronic device. The image segmentation method includes the following steps. Regression analysis is performed on a first gray-scale image to obtain a residual image having an object backbone area. A pixel value of each pixel in the object backbone area is defined as an average gray-scale value of the object backbone area in the residual image, and a second gray-scale image having the object backbone area is generated. It is recursively determined whether a residual polarity of each adjacent pixel adjacent to edge pixels of the object backbone area in the residual image is the same as a residual polarity of the corresponding edge pixel, and whether a pixel value of each adjacent pixel is greater than a first threshold, so as to expand the object backbone area in the second gray-scale image, which is extracted as a target object.
Type: Grant
Filed: July 1, 2021
Date of Patent: February 6, 2024
Assignee: Coretronic Corporation
Inventor: Huai-En Wu
-
Patent number: 11893047
Abstract: Systems and methods for automated indexing and extraction of information in digital documents are disclosed. A method may comprise identifying a page containing targeted information; inputting an image of the page into a visual machine learning network (visual ML), wherein the visual ML is trained to recognize text associated with the targeted information in an image; identifying by the visual ML, a section of the image that contains the targeted information; inputting the digital document, and coordinates of the section into an extraction module; and extracting the targeted information by the extraction module from the section.
Type: Grant
Filed: June 21, 2023
Date of Patent: February 6, 2024
Assignee: VelocityEHS Holdings, Inc.
Inventors: Julia Penfield, Aatish Suman, Veeru Talreja, Misbah Zahid Khan
-
Patent number: 11875590
Abstract: Examples provide a self-supervised language model for document-to-document similarity scoring and ranking long documents of arbitrary length in an absence of similarity labels. In a first stage of a two-staged hierarchical scoring, a sentence similarity matrix is created for each paragraph in the candidate document. A sentence similarity score is calculated based on the sentence similarity matrix. In the second stage, a paragraph similarity matrix is constructed based on aggregated sentence similarity scores associated with the first candidate document. A total similarity score for the document is calculated based on the normalized paragraph similarity matrix for each candidate document in a collection of documents. The model is trained using a masked language model and intra-and-inter document sampling. The documents are ranked based on the similarity scores for the documents.
Type: Grant
Filed: December 19, 2022
Date of Patent: January 16, 2024
Assignee: Microsoft Technology Licensing, LLC
Inventors: Itzik Malkiel, Dvir Ginzburg, Noam Koenigstein, Oren Barkan, Nir Nice
-
Patent number: 11836996
Abstract: The present disclosure discloses a method and apparatus for recognizing a text. The method comprises: acquiring images of a text area of an input image, the acquired images including a text centerline graph, a text direction offset graph, a text boundary offset graph, and a text character classification graph; extracting coordinates of feature points of a character center from the text centerline graph; sorting the extracted coordinates of the feature points based on the text direction offset graph to obtain a coordinate sequence of the feature points; determining a polygonal bounding box of the text area based on the coordinate sequence of the feature points of the character center and the text boundary offset graph; and determining a classification result of the feature points of the character center, based on the coordinate sequence of the feature points of the character center and the text character classification graph.
Type: Grant
Filed: March 23, 2021
Date of Patent: December 5, 2023
Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
Inventors: Xiaoqiang Zhang, Pengyuan Lv, Shanshan Liu, Chengquan Zhang
-
Patent number: 11829701
Abstract: A computer-implemented method for obtaining content of a document is provided. The method includes: receiving data in an unknown format obtained by an OCR application from the document, the data comprising a plurality of visual elements; for each of the plurality of visual elements, obtaining a position in the document; determining, from the plurality of visual elements, one or more graphic elements and one or more textual elements; determining a particular graphic element from the one or more graphic elements based on the position of the particular graphic element; determining, from the one or more textual elements, a key that is associated with the particular graphic element; determining, from the one or more textual elements, one or more attributes that are associated with the particular graphic element; generating an association between the key and each of the one or more attributes; and providing a structured representation of the association.
Type: Grant
Filed: September 16, 2022
Date of Patent: November 28, 2023
Assignee: Accenture Global Solutions Limited
Inventors: Ameet Sunil Chaubal, Paulina Sperling, Ruth Anne Sullivan, Abhishek Kumar, Bradley Roy Hardwick, Jr.
-
Patent number: 11830294
Abstract: An electronic voting system is described that utilizes printed vote records (PVRs) in which a voter's vote selections are recorded in voter readable characters. Optical character recognition (OCR) techniques may then be utilized to scan the PVR to record the voter's selections. The OCR data is then utilized to generate the cast vote record. Thus, the electronic voting system directly interprets the voter selections from the PVR just as the voter sees the data. In this manner “what you see is what you get” printed vote record data is provided for a voter's viewing and that same data is used to generate the cast vote record.
Type: Grant
Filed: September 23, 2021
Date of Patent: November 28, 2023
Assignee: Hart InterCivic, Inc.
Inventors: James M. Canter, Drew E. Tinney, Ievgen Konovalenko
-
Patent number: 11810382
Abstract: Techniques for training an optical character recognition (OCR) model to detect and recognize text in images for robotic process automation (RPA) are disclosed. A text detection model and a text recognition model may be trained separately and then combined to produce the OCR model. Synthetic data and a smaller amount of real, human-labeled data may be used for training to increase the speed and accuracy with which the OCR text detection model and the text recognition model can be trained. After the OCR model has been trained, a workflow may be generated that includes an activity calling the OCR model, and a robot implementing the workflow may be generated and deployed.
Type: Grant
Filed: October 13, 2021
Date of Patent: November 7, 2023
Assignee: UiPath, Inc.
Inventors: Dorin Andrei Laza, Trong Canh Nguyen
-
Patent number: 11803555
Abstract: Methods, systems, and computer program products for a customer relationship management (CRM) system are provided herein. Embodiments presented herein provide for exchange of data between disparate, distributed systems; subscribe to and/or publish customer data change event; creation of master records for consumers using static and streaming sources; providing data provenance, auditing capabilities, and queries across multiple tenants and third party systems. Embodiments provide a single view of a customer in a distributed system environment.
Type: Grant
Filed: January 31, 2019
Date of Patent: October 31, 2023
Assignee: Salesforce, Inc.
Inventors: Leo Duy Tran, David Angulo, David Woodward, Abhinav Chadda, David Hacker, Steven Ness, Matt Lagrotte, Jason Moody, Daniel Marchant, Matthew James Mondok, Federico Recio, Mehmet Gokmen Orun, Steven Kostrzewski, Christopher Bill, Kaustubh Barde, Lydia Lodovisi, Sarah Flamion, Jamin Hall, Charles Fineman
-
Patent number: 11804092
Abstract: An electronic voting system is described that utilizes printed vote records (PVRs) in which a voter's vote selections are recorded in voter readable characters. Optical character recognition (OCR) techniques may then be utilized to scan the PVR to record the voter's selections. The OCR data is then utilized to generate the cast vote record. Thus, the electronic voting system directly interprets the voter selections from the PVR just as the voter sees the data. In this manner “what you see is what you get” printed vote record data is provided for a voter's viewing and that same data is used to generate the cast vote record.
Type: Grant
Filed: September 23, 2021
Date of Patent: October 31, 2023
Assignee: Hart InterCivic, Inc.
Inventors: James M. Canter, Drew E. Tinney, Ievgen Konovalenko
-
Patent number: 11797581
Abstract: An information processing apparatus accepts text data. When specifying a word included in the accepted text data, the information processing apparatus generates a code associated with the specified word and generates information that associates the appearance position of the specified word in the text data with the word. The information processing apparatus stores therein the generated code and the information in association with the accepted text data.
Type: Grant
Filed: June 5, 2019
Date of Patent: October 24, 2023
Assignee: FUJITSU LIMITED
Inventors: Masahiro Kataoka, Ryo Matsumura, Satoshi Onoue
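The pairing of a per-word code with appearance positions, as described in this abstract, resembles a dictionary-coded text plus a positional index. The sketch below assumes integer codes assigned in order of first appearance and word-level positions. The function name, the lowercase `\w+` tokenization, and the return shape are hypothetical choices, not details from the patent.

```python
import re
from typing import Dict, List, Tuple


def encode_text(text: str) -> Tuple[Dict[str, int], List[int], Dict[str, List[int]]]:
    """Return (word -> code, coded text, word -> appearance positions)."""
    codes: Dict[str, int] = {}
    positions: Dict[str, List[int]] = {}
    encoded: List[int] = []
    for index, match in enumerate(re.finditer(r"\w+", text.lower())):
        word = match.group()
        # A new word receives the next unused integer code.
        code = codes.setdefault(word, len(codes))
        encoded.append(code)
        positions.setdefault(word, []).append(index)
    return codes, encoded, positions
```

Storing `codes`, `encoded`, and `positions` alongside the source text lets a lookup such as `positions["the"]` answer where a word occurs without rescanning the text.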
-
Patent number: 11797551
Abstract: A document retrieval apparatus includes a processor which receives an input of a keyword, acquires an author's name and a document file from a digital document database which stores document files of text data obtained by performing a character recognition process with respect to document image data of handwritten documents, and names of authors who wrote the handwritten documents, references an associating keyword database which stores information associating the authors' names, keywords, and associating keywords, to acquire an associating keyword of the input keyword, from the received input keyword and the acquired author's name, searches the acquired document file, using the input keyword and the acquired associating keyword, and outputs a search result of the searching.
Type: Grant
Filed: February 10, 2020
Date of Patent: October 24, 2023
Assignee: RESONAC CORPORATION
Inventors: Takuya Minami, Yu Kawahara, Shimpei Takemoto, Eriko Takeda, Yoshishige Okuno
-
Patent number: 11797767
Abstract: The present disclosure discloses methods and systems for generating multiple scanned files when scanning a document. The method includes receiving a document for scanning from a user. Once received, a user interface is displayed to the user to input one or more keywords based on which multiple scanned files are to be generated. A single scanned file is generated in a pre-defined format. One or more pages having the keywords as input by the user are identified from the scanned file. Based on the one or more identified pages having the keywords input by the user, separate multiple scanned files are automatically generated. As a result, a single scan activity performed by the user generates multiple scanned files.
Type: Grant
Filed: April 20, 2021
Date of Patent: October 24, 2023
Assignee: XEROX CORPORATION
Inventors: Srinivasarao Bindana, Dara N Lubin, Madhu Talapaneni
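The page-identification step in this abstract can be sketched in a few lines, assuming the scanned file has already been reduced to one recognized text string per page. The function name and the case-insensitive substring match are assumptions; the abstract does not specify how pages are matched or how the output files are written.

```python
from typing import Dict, List


def pages_by_keyword(pages: List[str], keywords: List[str]) -> Dict[str, List[int]]:
    """Map each keyword to the 1-based numbers of the pages that contain it."""
    result: Dict[str, List[int]] = {}
    for keyword in keywords:
        needle = keyword.casefold()
        result[keyword] = [
            number
            for number, text in enumerate(pages, start=1)
            if needle in text.casefold()
        ]
    return result
```

Each resulting page list could then be written out as its own file, so that one scan yields one output file per keyword.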
-
Patent number: 11797750
Abstract: A method for modifying a printable file includes receiving the printable file; identifying an element representing one or more text characters in the printable file; tagging the element; and incorporating metadata in the printable file, wherein the metadata is associated in the printable file with the tagged element and includes the one or more text characters. A method for using a printable file including at least one tagged graphics object that represents one or more characters and associated metadata includes receiving the printable file; and performing an activity using the printable file and the metadata within the printable file.
Type: Grant
Filed: March 24, 2022
Date of Patent: October 24, 2023
Assignee: GLOBAL GRAPHICS SOFTWARE LIMITED
Inventors: Nigel Wild, Martin Bailey
-
Patent number: 11768993Abstract: Methods, apparatus, systems and articles of manufacture are disclosed for receipt decoding. An example apparatus includes processor circuitry to execute instructions to extract text from the receipt image, the text including bounding boxes; associate ones of the bounding boxes to link horizontally related fields of the receipt image by selecting a first bounding box; identifying first horizontally aligned bounding boxes, the first horizontally aligned bounding boxes to include at least one bounding box of the bounding boxes that is horizontally aligned relative to the first bounding box; adding the first horizontally aligned bounding boxes to a word sync list; and connecting ones of the first horizontally aligned bounding boxes and the first bounding box based on at least one of an amount of the first horizontally aligned bounding boxes in the word sync list and a relationship among the first horizontally aligned bounding boxes and the first bounding box.Type: GrantFiled: August 8, 2022Date of Patent: September 26, 2023Assignee: Nielsen Consumer LLCInventors: Kannan Shanmuganathan, Hussain Masthan, Padmanabhan Soundararajan, Jose Javier Yebes Torres, Raju Kumar Allam
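A minimal sketch of linking horizontally aligned bounding boxes, under stated assumptions: the `Box` type, the vertical-overlap test, and the 0.5 overlap threshold are hypothetical choices for illustration, and the patent's own alignment criteria may differ. Boxes are assumed to have non-zero height.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    text: str
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height (assumed > 0)


def vertical_overlap(a, b):
    """Fraction of the shorter box's height shared by both boxes."""
    top = max(a.y, b.y)
    bottom = min(a.y + a.h, b.y + b.h)
    return max(0.0, bottom - top) / min(a.h, b.h)


def horizontally_aligned(first, boxes, min_overlap=0.5):
    """Collect boxes on the same row as `first` (the 'word sync list' role)."""
    aligned = [
        b for b in boxes
        if b is not first and vertical_overlap(first, b) >= min_overlap
    ]
    return sorted(aligned, key=lambda b: b.x)


def link_row(first, boxes):
    """Connect `first` with its aligned boxes, reading left to right."""
    row = sorted([first] + horizontally_aligned(first, boxes), key=lambda b: b.x)
    return " ".join(b.text for b in row)


boxes = [
    Box("MILK", 10, 100, 40, 12),
    Box("2.99", 200, 102, 30, 12),
    Box("BREAD", 10, 130, 50, 12),
]
row_text = link_row(boxes[0], boxes)
print(row_text)
```

Here the item name and its price are joined because their boxes overlap vertically, while the box on the next row is left out.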
-
Patent number: 11763588Abstract: Described herein are various technologies pertaining to text extraction from a document. A computing device receives the document. The document comprises computer-readable text and a layout, wherein the layout defines positions of the computer-readable text within a two-dimensional area represented by the document. Responsive to receiving the document, the computing device identifies at least one textual element in the computer-readable text based upon spatial factors between portions of the computer-readable text and contextual relationships between the portions of the computer-readable text. The computing device then outputs the at least one textual element.Type: GrantFiled: November 1, 2021Date of Patent: September 19, 2023Inventors: Ralph Meier, Thorsten Wanschura, Johannes Hausmann, Harry Urbschat
-
Patent number: 11763479Abstract: Various implementations disclosed herein include devices, systems, and methods that provide measurements of objects based on a location of a surface of the objects. An exemplary process may include obtaining a three-dimensional (3D) representation of a physical environment that was generated based on depth data and light intensity image data, generating a 3D bounding box corresponding to an object in the physical environment based on the 3D representation, determining a class of the object based on the 3D semantic data, determining a location of a surface of the object based on the class of the object, the location determined by identifying a plane within the 3D bounding box having semantics in the 3D semantic data satisfying surface criteria for the object, and providing a measurement of the object, the measurement of the object determined based on the location of the surface of the object.Type: GrantFiled: December 1, 2022Date of Patent: September 19, 2023Assignee: Apple Inc.Inventors: Amit Jain, Aditya Sankar, Qi Shan, Alexandre Da Veiga, Shreyas V. Joshi
-
Patent number: 11758088Abstract: Embodiments of the present disclosure provide a method and apparatus for aligning a paragraph and a video. The method may include: acquiring a commentary and a candidate material resource set corresponding to the commentary, a candidate material resource being a video or an image; acquiring a matching degree between each paragraph in the commentary and each candidate material resource in the candidate material resource set; and determining a candidate material resource sequence corresponding to each paragraph in the commentary based on the matching degrees between the paragraphs in the commentary and the candidate material resources, playing durations of the candidate material resources and text lengths of the paragraphs in the commentary, an image playing duration being a preset image playing duration.Type: GrantFiled: December 4, 2019Date of Patent: September 12, 2023Assignees: Baidu.com Times Technology (Beijing) Co., Ltd., Baidu USA LLCInventors: Hao Tian, Xi Chen, Jeff ChienYu Wang, Daming Lu
-
Patent number: 11755659Abstract: A document search device includes a document search unit configured to search for an input keyword in a document database in which document information including text data is stored, the text data being extracted, by using a character recognition process, from document image data generated by imaging a paper document, a similar keyword selecting unit configured to select a similar keyword in accordance with a degree of similarity to the input keyword, from a group of wildcard strings generated from the input keyword, and cause the document search unit to search for the similar keyword in the document database, and an output unit configured to output a search result obtained by searching for the input keyword in the document database and a search result obtained by searching for the similar keyword in the document database.Type: GrantFiled: September 26, 2019Date of Patent: September 12, 2023Assignee: Resonac CorporationInventors: Yoshishige Okuno, Takuya Minami, Eriko Takeda, Hajime Hotta
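One way to read the similar-keyword step is sketched below, purely as an assumption: wildcard strings are formed by replacing one character of the keyword with `?`, candidate words come from a hypothetical vocabulary of OCR-extracted terms, and similarity is scored with `difflib.SequenceMatcher`. None of these choices is stated in the abstract, and the keyword is assumed to contain no glob metacharacters such as `[` or `*`.

```python
from difflib import SequenceMatcher
from fnmatch import fnmatchcase


def wildcard_strings(keyword):
    """Replace each character in turn with the single-character wildcard '?'."""
    return [keyword[:i] + "?" + keyword[i + 1:] for i in range(len(keyword))]


def similar_keywords(keyword, vocabulary, min_similarity=0.7):
    """Return vocabulary words matching a wildcard string, most similar first."""
    patterns = wildcard_strings(keyword)
    candidates = {
        word for word in vocabulary
        if word != keyword and any(fnmatchcase(word, p) for p in patterns)
    }
    scored = [(SequenceMatcher(None, keyword, w).ratio(), w) for w in candidates]
    return [w for score, w in sorted(scored, reverse=True) if score >= min_similarity]


# Hypothetical vocabulary; "res1n" stands in for an OCR misread of "resin".
vocabulary = ["resin", "res1n", "rosin", "raisin", "basin"]
matches = similar_keywords("resin", vocabulary)
print(matches)
```

Searching for both the input keyword and the returned similar keywords would then recover documents whose text contains a character-recognition error.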
-
Patent number: 11748388Abstract: A document processing system is configured to identify, for each accessed electronic document in a first set of multiple electronic documents, a set of identified multi-word phrases determined to be in ordered text information in the accessed electronic document, each multi-word phrase of the set of identified multi-word phrases including adjacent words in the ordered text information; and determine, for each accessed electronic document in the first set of multiple electronic documents, a selected document type from the first set of document types based at least on an analysis of the set of identified multi-word phrases with respect to multi-word-phrase characteristics identified by a first definition and associated with each document type in a first set of document types associated with a first document-set type.Type: GrantFiled: October 14, 2020Date of Patent: September 5, 2023Assignee: Docufree CorporationInventor: John Frank Walsh
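As a rough sketch of phrase-based document typing, assuming two-word phrases and a simple count of matching phrases per type. The `definitions` mapping, the scoring rule, and the function names are hypothetical and stand in for the "first definition" mentioned in the abstract.

```python
def adjacent_phrases(text, n=2):
    """Return the set of n-word phrases made of adjacent words in the text."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def classify(text, definitions):
    """Pick the document type whose expected phrases occur most often.

    definitions: {document type: list of expected multi-word phrases}
    (assumed non-empty). Returns None when no expected phrase is found.
    """
    phrases = adjacent_phrases(text)
    scores = {
        doc_type: len(phrases & set(expected))
        for doc_type, expected in definitions.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


definitions = {
    "invoice": ["amount due", "invoice number"],
    "lease": ["security deposit", "monthly rent"],
}
doc_type = classify("Invoice number 42 amount due in 30 days", definitions)
print(doc_type)
```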
-
Patent number: 11741511Abstract: In one aspect, the present disclosure relates to a method of generating business descriptions performed by a server, said method may include: receiving a plurality of invoices, each invoice being associated with a business of a plurality of businesses; extracting a plurality of texts from the plurality of invoices; embedding the plurality of texts to a vector space to obtain a plurality of invoice vectors; generating a plurality of clusters in the vector space, each cluster of the plurality of clusters comprising at least one invoice vector of the plurality of invoice vectors; generating a description for a cluster, the description for the cluster representing all invoice vectors assigned to the cluster; for each business of the plurality of businesses that has at least one invoice vector assigned to the cluster, associating the business with the description; and indexing the plurality of businesses within a database by the generated descriptions.Type: GrantFiled: February 3, 2020Date of Patent: August 29, 2023Assignee: Intuit Inc.Inventors: Erez Katzenelson, Elik Sror, Shlomi Medalion, Shimon Shahar, Shir Meir Lador, Sigalit Bechler, Alexander Zhicharevich, Onn Bar
-
Patent number: 11734445Abstract: In an approach for providing a document access control based on document component layouts, a processor detects a layout of a document, the layout including one or more components of the document. A processor defines an access policy to access the one or more components based on the layout. A processor authorizes a request to access the one or more components based on the access policy and the layout. A processor retrieves the one or more components based on the access policy and the authorized request.Type: GrantFiled: December 2, 2020Date of Patent: August 22, 2023Assignee: International Business Machines CorporationInventors: Peter Zhong, Antonio Jose Jimeno Yepes, Lenin Mehedy
-
Patent number: 11726985Abstract: Disclosed herein are system, method, and computer program product embodiments for maintaining a geometric object in a database. An embodiment operates by a database maintaining a first page storing a data block in the database's on-disk store such that the data block stores at least one byte of the geometric object. After receiving the request for the geometric object, the database loads the page storing the geometric object in the in-memory store and determines the size of the geometric object. Based on the size of the geometric object, the database stores the geometric object in the in-memory store directly or in a heap of the in-memory store.Type: GrantFiled: June 2, 2020Date of Patent: August 15, 2023Assignee: SAP SEInventors: Colin Florendo, Surendra Vishnoi, Janardhan Hungund, Manuel Caroli
-
Patent number: 11727697Abstract: A system performs optical character recognition (OCR) on an image displaying a portion of an object. An image classification system identifies the object in the image, based on which one or more object detection models identify labels associated with the object within the image. The system determines text of the identified labels using OCR, and analyzes the OCR resultant text for discrepancies and/or inaccuracies. In response to identifying a discrepancy, the system provides a recommendation for improving the accuracy of the OCR resultant text.Type: GrantFiled: June 30, 2021Date of Patent: August 15, 2023Assignee: Salesforce, Inc.Inventors: Dennis Schultz, Daniel Thomas Harrison, Christopher Anthony Kemp, Michael A. Salem
-
Patent number: 11727702Abstract: Systems and methods for automated indexing and extraction of information in digital documents are disclosed. A method may comprise selecting a page number of a digital document to identify a page containing targeted information; inputting an image of the page into a visual machine learning network (visual ML), wherein the visual ML is trained to recognize text associated with the targeted information in an image; identifying by the visual ML, a section of the image that contains the targeted information; inputting the page number, the digital document, and coordinates of the section into an extraction module; and extracting the targeted information by the extraction module from the section.Type: GrantFiled: January 17, 2023Date of Patent: August 15, 2023Assignee: VelocityEHS Holdings, Inc.Inventors: Julia Penfield, Aatish Suman, Veeru Talreja, Misbah Zahid Khan
-
Patent number: 11720752Abstract: A language determination model may be applied to select a first machine learning model or a second machine learning model to analyze input text. The first machine learning model may be trained to analyze text in a first language, the second machine learning model may be trained to analyze text in a second language, and the input text may be in a third language. The language determination model may select the first machine learning model based on the first machine learning model having a better performance analyzing text in the third language than the second machine learning model. The language determination model may be updated based on an actual performance of the first machine learning model analyzing the input text. Moreover, the first machine learning model may be subject to additional training if the actual performance of the first machine learning model analyzing the input text is below a threshold value.Type: GrantFiled: July 7, 2020Date of Patent: August 8, 2023Assignee: SAP SEInventor: Tobias Weller
-
Patent number: 11710099Abstract: Various methods, apparatuses/systems, and media for automatically extracting information from unstructured data are provided. A receiver receives digitized data of a document having unstructured data format. A processor applies machine learning models for sectioning the digitized data. An OCR device applies an OCR processing to the sectioned digitized data. The processor matches the sectioned digitized data to patterns and rules; applies classification models to the matched digitized data to identify entities and events from the sectioned digitized data; automatically links each entity with corresponding event in a hierarchical format to generate a document having structured data format; and outputs the document having the structured data with metadata having the linked entity with corresponding event in the hierarchical format to downstream applications.Type: GrantFiled: May 5, 2020Date of Patent: July 25, 2023Assignee: JPMORGAN CHASE BANK, N.A.Inventors: Debraj Majumdar, Loryfel Nunez, Adam Leonard Harry Clark, Jayson Lashin, Amish Seth, Noriel E. Flores, Blesson Thomas
-
Patent number: 11704224Abstract: Systems and methods for executing a robotic process automation (RPA) workflow are provided. The RPA workflow is executed by a first robot. The execution of the RPA workflow is suspended by the first robot. A current context of the RPA workflow is serialized at a time of the suspension and the current context of the RPA workflow is stored. The execution of the RPA workflow is resumed by a second robot based on a triggering condition by retrieving the current context of the RPA workflow. The first robot and the second robot may be the same robot or different robots.Type: GrantFiled: April 7, 2022Date of Patent: July 18, 2023Assignee: UiPath, Inc.Inventors: Palak Kadakia, Liji J. Kunnath, Amol Awate, Remus Rusanu
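The suspend-and-resume flow in this abstract can be illustrated with a small sketch. Everything here is an assumption made for the example: the `Workflow` class, JSON as the serialization format, a plain dictionary as the context store, and the step and robot names.

```python
import json


class Workflow:
    """Toy workflow whose whole state lives in a serializable context."""

    def __init__(self, steps, context=None):
        self.steps = steps
        self.context = context or {"next_step": 0, "data": {}}

    def run(self, robot, suspend_before=None):
        """Run steps in order; return the serialized context if suspended."""
        while self.context["next_step"] < len(self.steps):
            i = self.context["next_step"]
            if i == suspend_before:
                # Suspend: serialize the current context for storage.
                return json.dumps(self.context)
            # Hypothetical step body: record which robot executed the step.
            self.context["data"][self.steps[i]] = robot
            self.context["next_step"] = i + 1
        return None


store = {}  # stand-in for persistent storage of suspended contexts
steps = ["read_invoice", "await_approval", "post_payment"]

# The first robot runs until the suspension point and stores the context.
store["job-1"] = Workflow(steps).run("robot-A", suspend_before=2)

# Later, a second robot retrieves the context and resumes where it stopped.
resumed = Workflow(steps, json.loads(store["job-1"]))
finished = resumed.run("robot-B")
print(resumed.context["data"])
```

Because the context is the only state carried across the suspension, the resuming robot can be the same robot or a different one.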
-
Patent number: 11704482Abstract: An automated content annotation workflow is disclosed. An example embodiment is configured for: registering a plurality of labelers to which annotation tasks are assigned; populating a labeling queue with content data to be annotated; assigning annotation tasks from the labeling queue to the plurality of labelers; enabling the plurality of labelers in an annotation review queue to modify or delete annotations applied by prior labelers; and evaluating a level of performance of the plurality of labelers in applying the annotations.Type: GrantFiled: February 3, 2022Date of Patent: July 18, 2023Assignee: LABELBOX, INC.Inventors: Manu Sharma, Brian Rieger, Dan Rasmuson, Connor Harwood, Ryan Quinn, Kyle Owens, Randall Lin
-
Patent number: 11699276Abstract: A method, apparatus, electronic device, and storage medium for character recognition are provided. The method may perform image processing on an acquired original image to obtain a region to be recognized. The region may include a character. The method may determine an area ratio of the region to be recognized on the original image. The method may determine an angle between the region to be recognized and a preset direction. The method may determine a character density of the region to be recognized. The method may perform character recognition on the character in the region to be recognized in response to determining that the area ratio is greater than a ratio threshold, the angle is less than an angle threshold, and the character density is less than a density threshold.Type: GrantFiled: July 16, 2021Date of Patent: July 11, 2023Assignees: Beijing Xiaomi Mobile Software., Ltd., Beijing Xiaomi Pinecone Electronics Co., Ltd.Inventor: Dong Wang
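The three-threshold gate described in this abstract might be sketched as follows. The threshold values, parameter names, and the helper that computes the area ratio are hypothetical; the abstract only states the direction of each comparison.

```python
def area_ratio(region_width, region_height, image_width, image_height):
    """Share of the original image covered by the region to be recognized."""
    return (region_width * region_height) / (image_width * image_height)


def should_recognize(ratio, angle_deg, char_density,
                     ratio_threshold=0.1,
                     angle_threshold=15.0,
                     density_threshold=0.8):
    """Run character recognition only if all three conditions hold.

    The default thresholds are placeholders, not values from the patent.
    """
    return (
        ratio > ratio_threshold
        and angle_deg < angle_threshold
        and char_density < density_threshold
    )


ratio = area_ratio(400, 300, 800, 600)  # region covers a quarter of the image
decision = should_recognize(ratio, angle_deg=5.0, char_density=0.4)
print(ratio, decision)
```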
-
Patent number: 11694459Abstract: Disclosed is an approach of on-device partial recognition that includes performing partial recognition on an image of a document captured by a mobile device to detect and/or recognize a specific area (e.g., barcodes, non-relevant text, etc.) and filling the recognized area with a solid color. Because the solid color area has a maximum compression ratio, this approach can lead to image size reduction and increased network throughput for client-server based data recognition where further processing such as advanced data extraction is performed at the server side. The approach can be enforced with neural network algorithms to exclude non-relevant information (e.g., logos, phrases, words, etc.).Type: GrantFiled: May 24, 2021Date of Patent: July 4, 2023Assignee: Open Text CorporationInventors: Mikhail Yurievitch Zakharov, Kirill Vaniukov, Christopher Dale Lund
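To illustrate why filling a recognized area with a solid color shrinks the upload, here is a small sketch. It uses a random grayscale buffer as a stand-in for a captured image and `zlib` as a stand-in for the image codec; both are assumptions, as are the function name and the box coordinates.

```python
import random
import zlib


def fill_region(pixels, width, box, color=255):
    """Overwrite a rectangular area of a grayscale buffer with one color.

    pixels: bytearray in row-major order; box: (left, top, right, bottom).
    """
    left, top, right, bottom = box
    for y in range(top, bottom):
        start = y * width + left
        pixels[start:start + (right - left)] = bytes([color]) * (right - left)
    return pixels


width = height = 64
rng = random.Random(0)  # fixed seed so the example is repeatable
original = bytearray(rng.randrange(256) for _ in range(width * height))

# Pretend the top half was recognized on-device (for example, a barcode).
masked = fill_region(bytearray(original), width, (0, 0, 64, 32))

size_before = len(zlib.compress(bytes(original)))
size_after = len(zlib.compress(bytes(masked)))
print(size_before, size_after)
```

The masked buffer compresses to a smaller size because the filled area carries almost no information, which is the effect the abstract relies on to reduce the data sent to the server.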