Segmenting Individual Characters Or Words Patents (Class 382/177)

Separating touching or overlapping characters (Class 382/178)

Segmenting hand-printed characters (Class 382/179)

Natural language translation device

Patent number: 10311149

Abstract: A natural language translation device contains a bus and an input interface connecting to the bus for receiving a source sentence in a first natural language to be translated to a target sentence in a second natural language one word at a time in sequential order. A two-dimensional (2-D) symbol containing a super-character characterizing the i-th word of the target sentence based on the received source sentence is formed in accordance with a set of 2-D symbol creation rules. The i-th word of the target sentence is obtained by classifying the 2-D symbol via a deep learning model that contains multiple ordered convolution layers in a Cellular Neural Networks or Cellular Nonlinear Networks (CNN) based integrated circuit.

Type: Grant

Filed: August 8, 2018

Date of Patent: June 4, 2019

Assignee: Gyrfalcon Technology Inc.

Inventors: Lin Yang, Patrick Z. Dong, Catherine Chi, Charles Jin Young, Jason Z Dong, Baohua Sun
Neural network for keyboard input decoding

Patent number: 10248313

Abstract: In some examples, a computing device includes at least one processor; and at least one module, operable by the at least one processor to: output, for display at an output device, a graphical keyboard; receive an indication of a gesture detected at a location of a presence-sensitive input device, wherein the location of the presence-sensitive input device corresponds to a location of the output device that outputs the graphical keyboard; determine, based on at least one spatial feature of the gesture that is processed by the computing device using a neural network, at least one character string, wherein the at least one spatial feature indicates at least one physical property of the gesture; and output, for display at the output device, based at least in part on the processing of the at least one spatial feature of the gesture using the neural network, the at least one character string.

Type: Grant

Filed: March 29, 2017

Date of Patent: April 2, 2019

Assignee: Google LLC

Inventors: Shumin Zhai, Thomas Breuel, Ouais Alsharif, Yu Ouyang, Francoise Beaufays, Johan Schalkwyk
Image analyzing apparatus, image analyzing, and storage medium

Patent number: 10181075

Abstract: An information processing apparatus includes an evaluation unit configured to evaluate whether a partial region of a photographing range of an imaging unit is a region suitable for analysis processing to be performed based on feature quantities of an object, with reference to a track of the object in an image captured by the imaging unit, and an output control unit configured to control the information processing apparatus to output information reflecting an evaluation result obtained by the evaluation unit. Accordingly, the information processing apparatus can support a user to improve the accuracy of the analysis processing to be performed based on the feature quantities of the object.

Type: Grant

Filed: October 12, 2016

Date of Patent: January 15, 2019

Assignee: Canon Kabushiki Kaisha

Inventors: Hiroshi Tojo, Tomoya Honjo, Shinji Yamamoto
Smart flip operation for grouped objects

Patent number: 10176148

Abstract: Technologies are described to provide smart flipping of groups of objects. According to some examples, a graphics module within an application may determine whether an object within a group of objects to be flipped is flippable, that is, whether it can be flipped without resulting in loss of object context after the flip operation. Then, the graphics module may flip the group of objects by translating all objects (moving their locations to appropriate new locations based on the flip operation), flipping the objects that can be flipped, and not flipping any object deemed not flippable, thereby preserving the displayed context of that object.

Type: Grant

Filed: August 27, 2015

Date of Patent: January 8, 2019

Assignee: Microsoft Technology Licensing, LLC

Inventor: Rahul Dhaundiyal
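
The flip behavior described in the abstract of patent 10176148 can be illustrated with a minimal Python sketch. It is not taken from the patent: the `Shape` class, its fields, and the choice of a horizontal flip about the group's bounding box are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Shape:
    x: float                # left edge of the object's bounding box (hypothetical field)
    width: float
    flippable: bool         # False for objects that would lose context when mirrored
    mirrored: bool = False  # whether the object's own content is drawn mirrored


def flip_group_horizontally(shapes):
    """Mirror a group of shapes about the vertical centre line of the group."""
    left = min(s.x for s in shapes)
    right = max(s.x + s.width for s in shapes)
    result = []
    for s in shapes:
        # Every object is translated to its mirrored location inside the group.
        new_x = left + right - (s.x + s.width)
        # Only flippable objects have their own content mirrored.
        mirrored = (not s.mirrored) if s.flippable else s.mirrored
        result.append(Shape(new_x, s.width, s.flippable, mirrored))
    return result


group = [Shape(0, 10, True), Shape(20, 5, False)]
flipped = flip_group_horizontally(group)
```

In this sketch both objects swap sides, but only the first one is marked as mirrored; the second keeps its original orientation.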
Rendering texts on electronic devices

Patent number: 10134367

Abstract: In one embodiment, dividing a set of texts into one or more text blocks, each text block including a portion of the set of texts; rendering each text block to obtain one or more rendered text blocks; determining a placement instruction for each rendered text block, the placement instruction indicating a position of the rendered text block when it is displayed; and sending the one or more rendered text blocks and their respectively associated placement instructions to an electronic device for displaying on the electronic device.

Type: Grant

Filed: March 24, 2016

Date of Patent: November 20, 2018

Assignee: Facebook, Inc.

Inventor: Barak Reuven Naveh
System and method for filtering keywords

Patent number: 10114889

Abstract: Techniques for filtering information are described herein. In accordance with the present disclosure, a text acquisition module is configured to acquire text content to be filtered and a scanning module is configured to scan the text content to be filtered. The disclosed techniques scan the text content through a preset keyword dictionary, record a position of each keyword in the text content and acquire character pitch between keywords in the text content according to the position of each keyword in text content. A pitch judgment module is configured to judge whether the character pitch exceeds a preset character pitch and filter the keyword(s) in the text content in response to a determination that the character pitch exceeds the preset character pitch.

Type: Grant

Filed: May 15, 2013

Date of Patent: October 30, 2018

Assignee: Beijing Qihoo Technology Company Limited

Inventors: Menggang Han, Tiejun Li, Xuping Liu
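
The scanning and pitch measurement described in the abstract of patent 10114889 might look roughly like the Python sketch below. The function names are hypothetical, and "character pitch" is assumed here to mean the number of characters between the end of one keyword hit and the start of the next. The abstract does not say what filtering does to the flagged keywords, so the sketch only reports the pairs whose pitch exceeds the preset value.

```python
def find_keywords(text, dictionary):
    """Return (start, keyword) pairs for every dictionary hit, sorted by position."""
    hits = []
    for kw in dictionary:
        start = text.find(kw)
        while start != -1:
            hits.append((start, kw))
            start = text.find(kw, start + 1)
    return sorted(hits)


def keyword_pitches(text, dictionary):
    """Return (keyword, next_keyword, pitch) for each pair of consecutive hits."""
    hits = find_keywords(text, dictionary)
    pitches = []
    for (s1, k1), (s2, k2) in zip(hits, hits[1:]):
        # Assumed definition: characters between the end of k1 and the start of k2.
        pitches.append((k1, k2, s2 - (s1 + len(k1))))
    return pitches


def pairs_exceeding(text, dictionary, preset_pitch):
    """Return the consecutive keyword pairs whose pitch exceeds the preset pitch."""
    return [(a, b) for a, b, p in keyword_pitches(text, dictionary) if p > preset_pitch]


sample = "cheap pills and more cheap stuff"
pitches = keyword_pitches(sample, ["cheap", "pills"])
flagged = pairs_exceeding(sample, ["cheap", "pills"], preset_pitch=5)
```

With the sample text, the first two hits are one character apart and the last two are ten characters apart, so only the second pair is flagged.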
Natural language processing via a two-dimensional symbol having multiple ideograms contained therein

Patent number: 10102453

Abstract: A string of natural language texts is received and formed into a multi-layer 2-D symbol in a first computing system. The 2-D symbol comprises a matrix of N×N pixels of data representing a “super-character”. The matrix is divided into M×M sub-matrices with each sub-matrix containing (N/M)×(N/M) pixels. N and M are positive integers, and N is preferably a multiple of M. Each sub-matrix represents one ideogram defined in an ideogram collection set. The “super-character” represents a meaning formed from a specific combination of a plurality of ideograms. The meaning of the “super-character” is learned in a second computing system by using an image processing technique to classify the 2-D symbol, which is formed in the first computing system and transmitted to the second computing system. The image processing technique includes predefining a set of categories and determining a probability for associating each of the predefined categories with the meaning of the “super-character”.

Type: Grant

Filed: September 1, 2017

Date of Patent: October 16, 2018

Assignee: Gyrfalcon Technology Inc.

Inventors: Lin Yang, Patrick Z. Dong, Baohua Sun
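
The matrix layout in the abstract of patent 10102453 (an N×N matrix divided into M×M sub-matrices of (N/M)×(N/M) pixels each) can be made concrete with a short Python sketch. The function name and the example values N = 224 and M = 4 are assumptions chosen for illustration, not figures from the patent.

```python
def sub_matrix_bounds(n, m):
    """Map each (row, col) block index to its (top, left, bottom, right) pixel bounds.

    Bounds are half-open, so a block covers rows top..bottom-1 and columns left..right-1.
    """
    if n % m != 0:
        # The abstract only says N is "preferably" a multiple of M;
        # this sketch handles that case alone.
        raise ValueError("this sketch assumes N is a multiple of M")
    size = n // m
    return {
        (r, c): (r * size, c * size, (r + 1) * size, (c + 1) * size)
        for r in range(m)
        for c in range(m)
    }


bounds = sub_matrix_bounds(224, 4)
```

Each of the 16 blocks in this example is 56×56 pixels and would hold one ideogram.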
Natural language processing using a CNN based integrated circuit

Patent number: 10083171

Abstract: A string of natural language texts is received and formed into a multi-layer 2-D symbol in a computing system. The 2-D symbol comprises a matrix of N×N pixels of K-bit data representing a “super-character”. The matrix is divided into M×M sub-matrices with each sub-matrix containing (N/M)×(N/M) pixels. K, N and M are positive integers, and N is preferably a multiple of M. Each sub-matrix represents one ideogram defined in an ideogram collection set. The “super-character” represents a meaning formed from a specific combination of a plurality of ideograms. The meaning of the “super-character” is learned by classifying the 2-D symbol via a trained convolutional neural networks model having bi-valued 3×3 filter kernels in a Cellular Neural Networks or Cellular Nonlinear Networks (CNN) based integrated circuit.

Type: Grant

Filed: September 19, 2017

Date of Patent: September 25, 2018

Assignee: Gyrfalcon Technology Inc.

Inventors: Lin Yang, Patrick Z. Dong, Baohua Sun
Generation and use of trained file classifiers for malware detection

Patent number: 10068187

Abstract: A method includes accessing information identifying multiple files and identifying classification data for the multiple files, where the classification data indicates, for a particular file of the multiple files, whether the particular file includes malware. The method also includes generating n-gram vectors for the multiple files by, for each file, generating an n-gram vector indicating occurrences of character pairs in printable characters representing the file. The method further includes generating and storing a file classifier using the n-gram vectors and the classification data as supervised training data.

Type: Grant

Filed: May 31, 2017

Date of Patent: September 4, 2018

Assignee: SPARKCOGNITION, INC.

Inventor: Na Sai
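
The n-gram vector in the abstract of patent 10068187 can be sketched in Python as follows. Two assumptions are made that the abstract does not confirm: the "printable characters representing the file" are obtained by keeping only the printable ASCII bytes of the file, and "occurrences" are recorded as counts rather than as presence flags. The function names are hypothetical.

```python
import string
from collections import Counter

# Byte values of the printable ASCII characters.
PRINTABLE = set(string.printable.encode("ascii"))


def printable_chars(data: bytes) -> str:
    """Keep only the printable ASCII bytes of a file's contents."""
    return "".join(chr(b) for b in data if b in PRINTABLE)


def bigram_counts(data: bytes) -> Counter:
    """Count each pair of adjacent printable characters."""
    text = printable_chars(data)
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


counts = bigram_counts(b"\x00abab\xffc")
```

A classifier would then be trained on such vectors together with the malware or non-malware labels; that training step is outside this sketch.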
Method for line and word segmentation for handwritten text images

Patent number: 10062001

Abstract: A method for segmenting an image containing handwritten text into line segments and word segments. The image is horizontally down-sampled at a first ratio. Connected regions in the down-sampled image are detected; horizontally neighboring ones are merged to form lines, to segment the original image into line images. Each line image is horizontally down-sampled at a second ratio which is smaller than the first ratio. Connected regions in the down-sampled line image are detected to obtain potential word segmentation positions. A path is a way of dividing the line at some or all of the potential word segmentation positions into multiple path segments; for each of all possible paths, word recognition is applied to each path segment to calculate a word recognition score, and an average word recognition score for the path is calculated; the path with the highest score gives the final word segmentation.

Type: Grant

Filed: September 29, 2016

Date of Patent: August 28, 2018

Assignee: KONICA MINOLTA LABORATORY U.S.A., INC.

Inventor: Duanduan Yang
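
The path search at the end of the abstract of patent 10062001 can be illustrated with a small Python sketch. A character string stands in for the line image, and `toy_score` is a hypothetical stand-in for a word recognizer; neither comes from the patent. The sketch enumerates every subset of the potential cut positions, which grows exponentially with the number of positions.

```python
from itertools import combinations


def best_path(line, cut_positions, score):
    """Try every subset of the sorted cut positions and keep the best-scoring path.

    Returns (average_score, segments) for the path with the highest average.
    """
    best = None
    for k in range(len(cut_positions) + 1):
        for cuts in combinations(cut_positions, k):
            edges = [0, *cuts, len(line)]
            segments = [line[a:b] for a, b in zip(edges, edges[1:])]
            avg = sum(score(s) for s in segments) / len(segments)
            if best is None or avg > best[0]:
                best = (avg, segments)
    return best


VOCAB = {"hand", "written", "text"}


def toy_score(segment):
    # Hypothetical recognizer: full score for a known word, zero otherwise.
    return 1.0 if segment in VOCAB else 0.0


avg, words = best_path("handwrittentext", [4, 8, 11], toy_score)
```

Here the cut at position 8 falls inside a word, so the winning path uses only the cuts at 4 and 11.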
Method and system for processing semantic fragments

Patent number: 10049101

Abstract: The present invention discloses a method and system for processing semantic fragments. Some embodiments of the present invention provide a method for processing semantic fragments. The method comprises: obtaining a plurality of groups of semantic fragments, the plurality of groups of semantic fragments at least including a first group of semantic fragments generated from a first data processing flow and a second group of semantic fragments generated from a second data processing flow, the first data processing flow being different from the second data processing flow; and merging the first group of semantic fragments and the second group of semantic fragments based on semantic equivalence. A corresponding system is also disclosed.

Type: Grant

Filed: August 6, 2015

Date of Patent: August 14, 2018

Assignee: International Business Machines Corporation

Inventors: Wei Hua Duan, Jia Ji, Jiang Lu, Wei Jie Wang, Qiang Xu, Liang Xue
Indicating a word length using an input device

Patent number: 10042543

Abstract: A method, apparatus, and program product are disclosed for receiving an input from an input device, determining one or more characteristics of the received input, the one or more characteristics indicating a word length, and presenting a list of one or more words having word lengths determined according to the indicated word length.

Type: Grant

Filed: September 18, 2013

Date of Patent: August 7, 2018

Assignee: Lenovo (Singapore) PTE. LTD.

Inventors: Russell Speight VanBlon, John Carl Mese, Nathan J. Peterson, Rod D. Waltermann, Arnold S. Weksler
Method and system for OCR-free vehicle identification number localization

Patent number: 9965677

Abstract: Methods and systems for localizing numbers and characters in captured images. A side image of a vehicle captured by one or more cameras can be preprocessed to determine a region of interest. A confidence value of a series of windows within regions of interest of different sizes and aspect ratios containing a structure of interest can be calculated. Highest confidence candidate regions can then be identified with respect to the regions of interest and at least one region adjacent to the highest confidence candidate regions. An OCR operation can then be performed in the adjacent region. An identifier can then be returned from the adjacent region in order to localize numbers and characters in the side image of the vehicle.

Type: Grant

Filed: December 9, 2014

Date of Patent: May 8, 2018

Assignee: Conduent Business Services, LLC

Inventors: Orhan Bulan, Howard Mizes, Vladimir Kozitsky, Aaron M. Burry
Augmenting text with multimedia assets

Patent number: 9940307

Abstract: Systems and methods are provided for providing a navigation interface to access or otherwise use electronic content items. In one embodiment, an augmentation application identifies at least one entity referenced in a document. The entity can be referenced in at least two portions of the document by at least two different words or phrases. The augmentation application associates the at least one entity with at least one multimedia asset. The augmentation application generates a layout including at least some content of the document referencing the at least one entity and the at least one multimedia asset associated with the at least one entity. The augmentation application renders the layout for display.

Type: Grant

Filed: December 31, 2012

Date of Patent: April 10, 2018

Assignee: Adobe Systems Incorporated

Inventors: Emre Demiralp, Gavin Stuart Peter Miller, Walter W. Chang, Daicho Ito, Grayson Squier Lang
Information processing system for displaying handwriting action trajectory based on meta information

Patent number: 9928414

Abstract: There is provided an information processing system including a first control unit configured to associate handwriting action trajectory information indicating a user's handwriting action trajectory with meta information capable of being detected from an actual environment where the user's handwriting action is performed.

Type: Grant

Filed: January 27, 2015

Date of Patent: March 27, 2018

Assignee: SONY CORPORATION

Inventors: Makoto Saito, Hiroaki Kitano
Information processing apparatus, information processing method, and information processing program for synthesizing a modified stroke

Patent number: 9922406

Abstract: There is provided an image processing apparatus including an input device configured to receive a stroke input, and a display controller configured to control a displaying of a modified stroke, wherein the modified stroke is synthesized based on characteristic parameters of the received stroke input and characteristic parameters of a reference stroke that has been matched to the received stroke input.

Type: Grant

Filed: May 15, 2014

Date of Patent: March 20, 2018

Assignee: SONY CORPORATION

Inventors: Yoshihito Ohki, Yasuyuki Koga, Tsubasa Tsukahara, Ikuo Yamano, Hiroyuki Mizunuma, Miwa Ichikawa
User interface for overlapping handwritten text input

Patent number: 9881224

Abstract: A “Stroke Untangler” composes handwritten messages from handwritten strokes representing overlapping letters or partial letter segments that are drawn on a touchscreen device or touch-sensitive surface. These overlapping strokes are automatically untangled and then segmented and combined into one or more letters, words, or phrases. Advantageously, segmentation and composition are performed without requiring user gestures, timeouts, or other inputs to delimit characters within words, and without using handwriting recognition-based techniques to guide untangling and composing of the overlapping strokes to form characters. In other words, the user draws multiple overlapping strokes. Those strokes are then automatically segmented and combined into one or more corresponding characters. Text recognition of the resulting characters is then performed. Further, the segmentation and combination are performed in real-time, thereby enabling real-time rendering of the resulting characters in a user interface window.

Type: Grant

Filed: December 17, 2013

Date of Patent: January 30, 2018

Assignee: Microsoft Technology Licensing, LLC

Inventors: Wolf Kienzle, Kenneth Paul Hinckley, Mudit Agrawal
Using extracted image text

Patent number: 9881231

Abstract: Methods, systems, and apparatus including computer program products for using extracted image text are provided. In one implementation, a computer-implemented method is provided. The method includes receiving an input of one or more image search terms and identifying keywords from the received one or more image search terms. The method also includes searching a collection of keywords including keywords extracted from image text, retrieving an image associated with extracted image text corresponding to one or more of the image search terms, and presenting the image.

Type: Grant

Filed: December 1, 2016

Date of Patent: January 30, 2018

Assignee: Google LLC

Inventors: Adrian Ulges, Luc Vincent
Method for searching for, recognizing and locating a term in ink, and a corresponding device, program and language

Patent number: 9875254

Abstract: A method for searching for at least one term, consisting of at least one character, in at least one set of ink data is disclosed. This method advantageously includes an operation for converting ink data into intermediate data, in an intermediate format, in the form of at least one segmentation graph, each node of one of the graphs including at least one ink segment associated with at least one assumption of correspondence with a recognition unit, and an operation for searching for the term or terms, carried out on the intermediate data, the conversion operation being carried out once and for all during storage of one of the sets of data, and the search operation being capable of being carried out at any time.

Type: Grant

Filed: January 10, 2006

Date of Patent: January 23, 2018

Assignee: MYSCRIPT

Inventor: Pierre-Michel Lallican
Adaptive guidelines for handwriting

Patent number: 9874950

Abstract: One embodiment provides a method, involving: receiving, at a device, handwriting input from a user; detecting, using a processor, a location of at least a part of the handwriting input; and providing, on a display device, at least one adaptive line to guide the handwriting input; wherein the at least one adaptive line is positioned based on the location of at least a part of the handwriting input. Other aspects are described and claimed.

Type: Grant

Filed: November 21, 2014

Date of Patent: January 23, 2018

Assignee: Lenovo (Singapore) Pte. Ltd.

Inventors: Jianbang Zhang, Steven Richard Perrin, Russell Speight VanBlon, Joshua Neil Novak
Dynamic portmanteau word semantic identification

Patent number: 9852125

Abstract: An approach is provided to discover new portmanteaus, such as when ingesting documents into a question answering (QA) system. The approach works by analyzing words included in electronic documents and identifying words as being possible portmanteaus. To analyze a portmanteau found in a document, the approach identifies morphemes that are included in the identified portmanteau and candidate words that correspond to each of the identified morphemes. A meaning for the new portmanteau is then derived from the meanings of the candidate words.

Type: Grant

Filed: September 28, 2015

Date of Patent: December 26, 2017

Assignee: International Business Machines Corporation

Inventors: Corville O. Allen, Albert A. Chung, Andrew R. Freed, Sorabh Murgai
Dynamic portmanteau word semantic identification

Patent number: 9852124

Abstract: An approach is provided to discover new portmanteaus, such as when ingesting documents into a question answering (QA) system. The approach works by analyzing words included in electronic documents and identifying words as being possible portmanteaus. To analyze a portmanteau found in a document, the approach identifies morphemes that are included in the identified portmanteau and candidate words that correspond to each of the identified morphemes. A meaning for the new portmanteau is then derived from the meanings of the candidate words.

Type: Grant

Filed: September 2, 2015

Date of Patent: December 26, 2017

Assignee: International Business Machines Corporation

Inventors: Corville O. Allen, Albert A. Chung, Andrew R. Freed, Sorabh Murgai
Page layout determination of an image undergoing optical character recognition

Patent number: 9785849

Abstract: A method and system are provided for identifying a page layout of an image that includes textual regions. The textual regions are to undergo optical character recognition (OCR). The system includes an input component that receives an input image that includes words around which bounding boxes have been formed and a text identifying component that groups the words into a plurality of text regions. A reading line component groups words within each of the text regions into reading lines. A text region sorting component sorts the text regions in accordance with their reading order.

Type: Grant

Filed: November 13, 2013

Date of Patent: October 10, 2017

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Mircea Cimpoi, Sasa Galic, Milan Vugdelija
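
The reading-line step in the abstract of patent 9785849 can be sketched in Python as below. Only the grouping of words into reading lines is covered; text-region detection and region sorting are left out. The grouping rule (a word joins a line when more than half of its height overlaps that line vertically) and the function name are assumptions, since the abstract does not describe the rule.

```python
def group_into_lines(boxes):
    """Group word bounding boxes into reading lines.

    boxes: list of (word, left, top, bottom) tuples.
    Returns a list of lines, top to bottom, each a list of words left to right.
    """
    lines = []  # each entry: [top, bottom, [(left, word), ...]]
    for word, left, top, bottom in sorted(boxes, key=lambda b: b[2]):
        for line in lines:
            overlap = min(bottom, line[1]) - max(top, line[0])
            # Assumed rule: join a line when over half the word's height overlaps it.
            if overlap > 0.5 * (bottom - top):
                line[0] = min(line[0], top)
                line[1] = max(line[1], bottom)
                line[2].append((left, word))
                break
        else:
            lines.append([top, bottom, [(left, word)]])
    lines.sort(key=lambda entry: entry[0])
    return [[w for _, w in sorted(entry[2])] for entry in lines]


boxes = [
    ("world", 60, 11, 21),
    ("Hello", 10, 10, 20),
    ("Second", 10, 40, 50),
    ("line", 70, 41, 51),
]
reading_lines = group_into_lines(boxes)
```

The input order of the boxes does not matter in this sketch: words are first ordered by their top edge and each line is then sorted by left edge.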
Using extracted image text

Patent number: 9779325

Abstract: Methods, systems, and apparatus including computer program products for using extracted image text are provided. In one implementation, a computer-implemented method is provided. The method includes receiving an input of one or more image search terms and identifying keywords from the received one or more image search terms. The method also includes searching a collection of keywords including keywords extracted from image text, retrieving an image associated with extracted image text corresponding to one or more of the image search terms, and presenting the image.

Type: Grant

Filed: December 1, 2016

Date of Patent: October 3, 2017

Assignee: Google Inc.

Inventors: Adrian Ulges, Luc Vincent
System and method for transcribing historical records into digitized text

Patent number: 9767353

Abstract: A handwriting recognition system converts word images on documents, such as document images of historical records, into computer searchable text. Word images (snippets) on the document are located, and have multiple word features identified. For each word image, a word feature vector is created representing multiple word features. Based on the similarity of word features (e.g., the distance between feature vectors), similar words are grouped together in clusters, and a centroid that has features most representative of words in the cluster is selected. A digitized text word is selected for each cluster based on review of a centroid in the cluster, and is assigned to all words in that cluster and is used as computer searchable text for those word images where they appear in documents. An analyst may review clusters to permit refinement of the parameters used for grouping words in clusters, including the adjustment of weights and other factors used for determining the distance between feature vectors.

Type: Grant

Filed: August 31, 2015

Date of Patent: September 19, 2017

Assignee: Ancestry.com Operations Inc.

Inventors: Jack Reese, Michael Murdock, Shawn Reid, Laryn Brown
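
The centroid selection in the abstract of patent 9767353 can be illustrated with a short Python sketch. It assumes, beyond what the abstract states, that the distance is a weighted Euclidean distance and that the "most representative" word is the cluster member with the smallest total distance to the other members. The function names and the two-dimensional feature vectors are hypothetical.

```python
import math


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two feature vectors."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def representative(cluster, weights):
    """Return the index of the member with the smallest total distance to the others."""
    return min(
        range(len(cluster)),
        key=lambda i: sum(
            weighted_distance(cluster[i], other, weights) for other in cluster
        ),
    )


cluster = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
idx = representative(cluster, weights=(1.0, 1.0))
```

Changing the `weights` argument plays the role of the analyst's adjustment of weights mentioned in the abstract: it alters which feature differences count most and can therefore change which word image is chosen.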
Patent mapping

Patent number: 9697577

Abstract: The present inventive subject matter provides systems, methods, software, and data structures for patent mapping, storage, and searching. Some such embodiments include mapping patent documents, claims, and claim limitations. Some further embodiments provide for searching a universe of patent documents by patent document, claim, limitation, class, element, or concept.

Type: Grant

Filed: December 1, 2010

Date of Patent: July 4, 2017

Assignee: Lucid Patent LLC

Inventors: Steven W. Lundberg, Janal M. Kalis, Pradeep Sinha
Nearsighted camera object detection

Patent number: 9684984

Abstract: A system and process of nearsighted (myopia) camera object detection involves detecting the objects through edge detection and outlining or thickening them with a heavy border. Thickening may include making the object bold in the case of text characters. The bold characters are then much more apparent and heavier weighted than the background. Thresholding operations are then applied (usually multiple times) to the grayscale image to remove all but the darkest foreground objects in the background resulting in a nearsighted (myopic) image. Additional processes may be applied to the nearsighted image, such as morphological closing, contour tracing and bounding of the objects or characters. The bound objects or characters can then be averaged to provide repositioning feedback for the camera user. Processed images can then be captured and subjected to OCR to extract relevant information from the image.

Type: Grant

Filed: July 8, 2015

Date of Patent: June 20, 2017

Assignee: Sage Software, Inc.

Inventor: Scott E. Barton
Keyword determinations from voice data

Patent number: 9679570

Abstract: Topics of potential interest to a user, useful for purposes such as targeted advertising and product recommendations, can be extracted from voice content produced by a user. A computing device can capture voice content, such as when a user speaks into or near the device. One or more sniffer algorithms or processes can attempt to identify trigger words in the voice content, which can indicate a level of interest of the user. For each identified potential trigger word, the device can capture adjacent audio that can be analyzed, on the device or remotely, to attempt to determine one or more keywords associated with that trigger word. The identified keywords can be stored and/or transmitted to an appropriate location accessible to entities such as advertisers or content providers who can use the keywords to attempt to select or customize content that is likely relevant to the user.

Type: Grant

Filed: August 17, 2015

Date of Patent: June 13, 2017

Assignee: AMAZON TECHNOLOGIES, INC.

Inventor: Kiran K. Edara
Methods of content-based image area selection

Patent number: 9678642

Abstract: A system and methods for selecting a region of pixels in an image displayed on a touch-sensitive interface are disclosed. The method for selecting the region of pixels is based on determined connectivity of pixels in the image indicating content of the image and includes determining connected pixels on the image representing the content without performing character recognition, detecting a text selection gesture indicative of selecting the region in the image, determining coordinates of the text selection gesture performed on the touch-sensitive interface and selecting the region in the image by bounding a first set of pixels located at a proximity from the coordinates of the text selection gesture.

Type: Grant

Filed: May 29, 2015

Date of Patent: June 13, 2017

Assignee: Lexmark International, Inc.

Inventors: Stuart Willard Daniel, Ahmed Hamad Mohamed Eid, Shaun Timothy Love
Information processing apparatus, information processing system, information processing method and storage medium

Patent number: 9679217

Abstract: According to one embodiment, an information processing apparatus includes an image acquisition module, an elevation-angle acquisition module, a character deformation specification module, a character detection dictionary storage, a character detection dictionary selector and a character detector. The elevation-angle acquisition module is configured to acquire an elevation angle of a photographic device assumed when the photographic device has obtained an acquired image. The character deformation specification module is configured to specify how an appearance of the character in the acquired image is deformed, based on the acquired elevation angle.

Type: Grant

Filed: August 25, 2015

Date of Patent: June 13, 2017

Assignee: Kabushiki Kaisha Toshiba

Inventors: Kaoru Suzuki, Yojiro Tonouchi, Tomoyuki Shibata, Isao Mihara
Triggering actions in response to optically or acoustically capturing keywords from a rendered document

Patent number: 9633013

Abstract: A system for processing text captured from rendered documents is described. The system receives a sequence of one or more words optically or acoustically captured from a rendered document by a user. The system identifies among words of the sequence a word with which an action has been associated. The system then performs the associated action with respect to the user.

Type: Grant

Filed: March 22, 2016

Date of Patent: April 25, 2017

Assignee: Google Inc.

Inventors: Martin T. King, Dale L. Grover, Clifford A. Kushler, James Q. Stafford-Fraser
Image processing apparatus and image processing method

Patent number: 9607237

Abstract: When there is a possibility that a third character region is redundantly selected in both a case where the line extraction process is performed starting from a first character region and a case where the line extraction process is performed starting from a second character region located in a line different from a line containing the first character region, the line recognition unit determines which line to incorporate the third character region in, by comparing a case of incorporating the third character region into the line starting with the first character region, with a case of incorporating the third character region into the line starting with the second character region.

Type: Grant

Filed: February 27, 2014

Date of Patent: March 28, 2017

Assignee: OMRON Corporation

Inventors: Hirotaka Wada, Tomoyoshi Aizawa, Norikazu Tonogai, Tadashi Hyuga, Yoshihisa Minato, Masamichi Oe, Koji Kobayashi
Image processing apparatus, image processing method and computer-readable storage medium

Patent number: 9600731

Abstract: According to one embodiment, an image processing apparatus includes a calculation unit. The calculation unit is configured to calculate a first similarity degree group which is composed of similarity degrees between respective characters constituting a first character string appearing on a first image and respective candidate characters in a candidate character group, to calculate a second similarity degree group which is composed of similarity degrees between respective characters constituting a second character string appearing on a second image and the respective candidate characters, and to calculate a third similarity degree group which is composed of similarity degrees between respective characters constituting a third character string appearing on the second image and the respective candidate characters.

Type: Grant

Filed: April 8, 2015

Date of Patent: March 21, 2017

Assignee: TOSHIBA TEC KABUSHIKI KAISHA

Inventors: Masaaki Yasunaga, Kazuki Taira
Document file generating device and document file generation method

Patent number: 9575935

Abstract: A document file that draws a picture finely is created without increasing the file size. When a size of a first file computed before a process of vectorization is smaller than a size of a file of a manuscript, a process of vectorization is performed. When a size of a second file computed in the process of vectorization is smaller than the size of the file of the manuscript, a process after an end of the process of vectorization is performed. When a size of a third file computed in the process after the end of the process of vectorization is smaller than the size of the file of the manuscript, a vectorization file that is written in vectorized data is generated.

Type: Grant

Filed: January 24, 2015

Date of Patent: February 21, 2017

Assignee: KYOCERA Document Solutions Inc.

Inventor: Motoki Hiratsuka
Automatic accuracy estimation for audio transcriptions

Patent number: 9570068

Abstract: Embodiments of the present invention provide an approach for estimating the accuracy of a transcription of a voice recording. Specifically, in a typical embodiment, each word of a transcription of a voice recording is checked against a customer-specific dictionary and/or a common language dictionary. The number of words not found in either dictionary is determined. An accuracy number for the transcription is calculated from the number of said words not found and the total number of words in the transcription.

Type: Grant

Filed: June 3, 2016

Date of Patent: February 14, 2017

Assignee: International Business Machines Corporation

Inventors: James E. Bostick, John M. Ganci, Jr., John P. Kaemmerer, Craig M. Trim
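
The dictionary check described in the abstract of patent 9570068 can be sketched in a few lines of Python. This is a minimal illustration, not the patented method: the abstract does not give the formula for the accuracy number, so the ratio used here (one minus the share of unknown words) is an assumption, as are the function name `estimate_accuracy` and the tokenizing regular expression.

```python
import re


def estimate_accuracy(transcript, customer_dictionary, common_dictionary):
    """Return (accuracy, unknown_words) for a transcript.

    Assumption: accuracy = 1 - unknown / total. The abstract only says the
    number is calculated from these two counts.
    """
    # Hypothetical tokenizer: lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", transcript.lower())
    if not words:
        return 0.0, []
    known = customer_dictionary | common_dictionary
    # A word counts as unknown only if it is missing from BOTH dictionaries.
    unknown = [w for w in words if w not in known]
    return 1.0 - len(unknown) / len(words), unknown


accuracy, unknown = estimate_accuracy(
    "The quick brwn fox",
    customer_dictionary={"fox"},
    common_dictionary={"the", "quick"},
)
print(accuracy, unknown)
```

Keeping the customer-specific dictionary separate from the common one makes it easy to swap in a different customer vocabulary without touching the shared word list.
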
Image processing apparatus, image processing method and computer-readable storage medium

Patent number: 9563812

Abstract: According to one embodiment, an image processing apparatus includes a calculation unit and a recognition unit. The calculation unit is configured to calculate a first similarity degree group which is composed of similarity degrees between respective characters constituting a first character string appearing on a first image and respective candidate characters in a candidate character group, and to calculate a second similarity degree group which is composed of similarity degrees between respective characters constituting a second character string appearing on a second image and the respective candidate characters in the candidate character group.

Type: Grant

Filed: April 8, 2015

Date of Patent: February 7, 2017

Assignee: Toshiba TEC Kabushiki Kaisha

Inventors: Masaaki Yasunaga, Kazuki Taira
Image processing apparatus, image forming apparatus, and computer readable medium

Patent number: 9552621

Abstract: An image processing apparatus includes a reception unit, an acquisition unit, an enlarging/reducing unit, and a detector. The reception unit receives two image data to be compared. The acquisition unit acquires character sizes of characters contained in the two image data received by the reception unit. The enlarging/reducing unit enlarges or reduces the image data received by the reception unit such that the character sizes of the characters contained in the two image data acquired by the acquisition unit coincide with each other. The detector detects a difference between the two image data which have been enlarged or reduced by the enlarging/reducing unit such that the character sizes of the characters contained in the two image data coincide with each other.

Type: Grant

Filed: November 21, 2014

Date of Patent: January 24, 2017

Assignee: FUJI XEROX CO., LTD.

Inventors: Tetsuharu Watanabe, Naoyuki Enomoto, Yozo Kashima, Tomohisa Ishikawa
Example-based error detection system for automatic evaluation of writing, method for same, and error detection apparatus for same

Patent number: 9542383

Abstract: An error detection system for automatically evaluating writing includes: an example construction apparatus to collect example sentences including various literary styles, to break up the collected example sentences in units of morphemes, and to construct the example sentences in an example-based index database (DB); and an error detection apparatus to break up an input sentence in units of morphemes, to generate one or more morpheme sequences bound in arbitrary window sizes based on one or more of morphemes of the broken-up input sentence, to search the example-based index DB for each of the generated morpheme sequences, and to detect an error according to a frequency at which said each morpheme is arranged in a corresponding morpheme sequence among morpheme sequences searched for through the example-based index DB.

Type: Grant

Filed: December 4, 2013

Date of Patent: January 10, 2017

Assignee: SK TELECOM CO., LTD.

Inventors: Seunghwan Kim, Eunsook Lee, Seongmook Kim, Dongnam Kim, Sung Kim
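
The windowed lookup described in the abstract of patent 9542383 can be illustrated with a small Python sketch. It is a simplification under stated assumptions: whitespace-separated tokens stand in for morphemes (a real system would use a morphological analyzer), a single fixed window size replaces the arbitrary window sizes, and the names `build_index` and `suspicious_windows` and the `min_count` threshold are hypothetical.

```python
from collections import Counter


def windows(tokens, size):
    """All contiguous token sequences of the given window size."""
    return [tuple(tokens[i:i + size]) for i in range(len(tokens) - size + 1)]


def build_index(example_sentences, size=2):
    """Count how often each window occurs in the example sentences."""
    index = Counter()
    for sentence in example_sentences:
        index.update(windows(sentence.split(), size))
    return index


def suspicious_windows(sentence, index, size=2, min_count=1):
    """Windows of the input that occur fewer than min_count times in the index."""
    return [w for w in windows(sentence.split(), size) if index[w] < min_count]


index = build_index(["she goes to school", "he goes to work"])
print(suspicious_windows("she go to school", index))
```

A window that never appears in the example index is only a hint of an error; a sparse example collection will flag correct but rare phrasing as well.
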
Template matching with data correction

Patent number: 9530068

Abstract: An approach is provided to generate forms with template inclusions. In the approach, optical character recognition (OCR) text is compared to corresponding text in a selected form. Characters of text in the OCR text are then replaced with text from the template text; the replacement results in a form with template inclusions. The form with template inclusions is then processed by a forms processing operation.

Type: Grant

Filed: November 10, 2014

Date of Patent: December 27, 2016

Assignee: International Business Machines Corporation

Inventors: Keith P. Biegert, Brendan C. Bull, David Contreras, Robert C. Sizemore, Sterling R. Smith
Triggering actions in response to optically or acoustically capturing keywords from a rendered document

Patent number: 9514134

Abstract: A system for processing text captured from rendered documents is described. The system receives a sequence of one or more words optically or acoustically captured from a rendered document by a user. The system identifies among words of the sequence a word with which an action has been associated. The system then performs the associated action with respect to the user.

Type: Grant

Filed: July 15, 2015

Date of Patent: December 6, 2016

Assignee: Google Inc.

Inventors: Martin T. King, Dale L. Grover, Clifford A. Kushler, James Q. Stafford-Fraser
Providing in-line previews of a source image for aid in correcting OCR errors

Patent number: 9501853

Abstract: The present disclosure is directed toward systems and methods for assisting users in correcting OCR errors. For example, systems and methods described herein involve identifying the position of a cursor within a machine-readable document. Systems and methods described herein also involve identifying corresponding position co-ordinates in a source image, as well as capturing an image preview from the source image based on the corresponding position co-ordinates. Systems and methods described herein may also involve providing the preview of the source image within the machine-readable document.

Type: Grant

Filed: January 9, 2015

Date of Patent: November 22, 2016

Assignee: ADOBE SYSTEMS INCORPORATED

Inventors: Sayeed Ullah Khan, Subrato Namata, Uttam Dwivedi
Text extraction

Patent number: 9495357

Abstract: Embodiments are used to extract terms from any text set that are used on other text, such as in a repository, that then can be used in a variety of applications, from providing search results, to analyzing data sets, to building a variety of text generation tools, such as messaging and emails.

Type: Grant

Filed: May 2, 2014

Date of Patent: November 15, 2016

Inventors: Athena Ann Smyros, Constantine John Smyros
Media item re-orientation

Patent number: 9472037

Abstract: The present invention provides a method of re-orienting an image of a media item, comprising determining at least one linear array formed by a plurality of locations associated with an image of the media item; determining a skew angle of said linear array with respect to a reference axis; and re-orienting said image by rotating said image responsive to said skew angle. Apparatus for re-orienting an image of a media item and a document processing module are also provided.

Type: Grant

Filed: January 31, 2014

Date of Patent: October 18, 2016

Assignee: NCR CORPORATION

Inventor: Ping Chen
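
The skew-angle step described in the abstract of patent 9472037 can be sketched as follows. This is an illustrative assumption rather than the patented procedure: the linear array is fitted with an ordinary least-squares line, the reference axis is taken to be the horizontal x-axis, and the function names are hypothetical. Rotating points stands in for rotating the full image.

```python
import math


def skew_angle_degrees(points):
    """Angle between the best-fit line through the points and the x-axis.

    Assumes the array is closer to horizontal than vertical; a perfectly
    vertical array would need the roles of x and y swapped.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    # Least-squares slope is sxy / sxx; atan2 avoids dividing by zero.
    return math.degrees(math.atan2(sxy, sxx))


def rotate_points(points, angle_degrees, center=(0.0, 0.0)):
    """Rotate points counter-clockwise by angle_degrees about center."""
    a = math.radians(angle_degrees)
    cx, cy = center
    return [
        (cx + (x - cx) * math.cos(a) - (y - cy) * math.sin(a),
         cy + (x - cx) * math.sin(a) + (y - cy) * math.cos(a))
        for x, y in points
    ]


locations = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
angle = skew_angle_degrees(locations)
deskewed = rotate_points(locations, -angle)  # rotate back by the skew angle
print(angle, deskewed)
```
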
Coarse document classification

Patent number: 9418312

Abstract: Systems and methods coarsely classify unknown documents as belonging, or not, to a group with reference document(s). Documents get scanned into digital images. Counts of contours are taken. The closer the counts of the contours of the unknown document reside to the reference document(s), the more likely the documents are all of a same type. Embodiments typify contour analysis, classification acceptance or not, application of algorithms, and imaging devices with scanners, to name a few.

Type: Grant

Filed: September 10, 2014

Date of Patent: August 16, 2016

Assignee: LEXMARK INTERNATIONAL TECHNOLOGY, SA

Inventors: Ranajyoti Chakraborti, Kunal Das, Rajib Dutta, Sabyasachi Samanta, Subhadeep Samanta
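
The count-comparison idea in the abstract of patent 9418312 can be illustrated with a short Python sketch. Several things here are assumptions and not taken from the patent: 4-connected foreground regions in a binary grid stand in for contours, closeness is measured as a relative difference, and the names `count_regions` and `classify` and the `tolerance` value are hypothetical.

```python
def count_regions(grid):
    """Count 4-connected regions of 1s in a binary grid (stand-in for contours)."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or seen[r][c]:
                continue
            count += 1
            stack = [(r, c)]
            seen[r][c] = True
            while stack:  # iterative flood fill over the region
                y, x = stack.pop()
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
    return count


def classify(count, reference_counts, tolerance=0.2):
    """Return the closest reference type, or None if none is within tolerance."""
    best_type, best_diff = None, None
    for doc_type, ref in reference_counts.items():
        diff = abs(count - ref) / max(ref, 1)
        if best_diff is None or diff < best_diff:
            best_type, best_diff = doc_type, diff
    return best_type if best_diff is not None and best_diff <= tolerance else None


page = [[1, 1, 0, 0],
        [0, 0, 0, 1],
        [1, 0, 0, 1]]
print(classify(count_regions(page), {"invoice": 3, "receipt": 10}))
```
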
System and methods for computerized machine-learning based authentication of electronic documents including use of linear programming for classification

Patent number: 9406030

Abstract: Electronic document classification comprising providing training documents sorted into classes; linear programming including selecting inputs which maximize an output, given constraints on inputs, the output maximized being a difference between: a. first estimated probability that a document instance will be correctly classified, by a classifier corresponding to given inputs, as belonging to its own class, and b. second estimated probability that document instance will be classified, by the classifier, as not belonging to its own class; and classifying electronic document instances into classes, using a preferred classifier corresponding to the inputs selected by the linear programming. A computerized electronic document forgery detection method provides training documents and uses a processor to select value-ranges of non-trivial parameters, such that selected value-range(s) of parameters are typical to an authentic document of given class, and atypical to a forged document of same class.

Type: Grant

Filed: July 23, 2012

Date of Patent: August 2, 2016

Assignee: AU10TIX LIMITED

Inventors: Guy Dolev, Sergey Markin, Avi Bar-Nissim, Asher Uziel
Detecting errors in recognized text

Patent number: 9384389

Abstract: Some examples include detecting errors in text that has been recognized using automated text recognition technology. For instance, errors in the recognized text may be detected based on glyph image similarity and the use of a language model, dictionary information, or the like. Some implementations may group together glyphs based on association of the glyphs with the same glyph identifier and a similarity of the appearance of the glyphs. Furthermore, the words associated with each glyph may be checked against a language model, such as to check a spelling or other validity of the words, and a score may be assigned to each group of glyphs based on the validity of the words corresponding to the glyphs in that group. Groups that have a score that fails to meet a threshold may be reviewed by a person or may undergo automated correction techniques.

Type: Grant

Filed: September 12, 2012

Date of Patent: July 5, 2016

Assignee: Amazon Technologies, Inc.

Inventors: Viswanath Sankaranarayanan, Sridhar Jayaraman
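
The group scoring described in the abstract of patent 9384389 can be sketched in Python. This is a simplified illustration under assumptions: glyphs are taken to be already grouped (by identifier and appearance) into group ids, a plain word set stands in for the language model, and the score is the share of valid words per group. The names `flag_glyph_groups` and `threshold` are hypothetical.

```python
from collections import defaultdict


def flag_glyph_groups(glyph_occurrences, dictionary, threshold=0.5):
    """Return {group_id: score} for groups whose score is below the threshold.

    glyph_occurrences: iterable of (group_id, word_containing_the_glyph).
    Score (an assumption): fraction of the group's words found in the dictionary.
    """
    words_by_group = defaultdict(list)
    for group_id, word in glyph_occurrences:
        words_by_group[group_id].append(word)

    flagged = {}
    for group_id, words in words_by_group.items():
        score = sum(w.lower() in dictionary for w in words) / len(words)
        if score < threshold:
            flagged[group_id] = score  # candidate for review or correction
    return flagged


occurrences = [
    ("e_1", "the"), ("e_1", "see"),
    ("c_7", "thc"), ("c_7", "scc"), ("c_7", "cat"),
]
print(flag_glyph_groups(occurrences, {"the", "see", "cat"}))
```

Scoring a whole group at once means one misrecognized glyph shape can be caught through the many invalid words it produces, even when each single word looks plausible in isolation.
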
Coarse document classification in an imaging device

Patent number: 9367760

Abstract: Systems and methods coarsely classify unknown documents as belonging, or not, to a group with reference document(s). Documents get scanned into digital images. Counts of contours are taken. The closer the counts of the contours of the unknown document reside to the reference document(s), the more likely the documents are all of a same type. Embodiments typify contour analysis, classification acceptance or not, application of algorithms, and imaging devices with scanners, to name a few.

Type: Grant

Filed: November 24, 2015

Date of Patent: June 14, 2016

Assignee: LEXMARK INTERNATIONAL, INC.

Inventors: Ranajyoti Chakraborti, Kunal Das, Rajib Dutta, Sabyasachi Samanta, Subhadeep Samanta
Identifying user marks using patterned lines on pre-printed forms

Patent number: 9361536

Abstract: Methods, devices, and systems replace solid lines of user-fillable areas of a print job with patterned lines and then print the print job with the patterned lines to print user-fillable pre-printed forms, using a printing device. These methods, devices, and systems also scan at least one of the user-fillable pre-printed forms having user markings to produce a scan, using an optical scanner. Further, such methods, devices, and systems produce an altered scan by removing only the patterned lines from the scan to leave the user markings in the altered scan using the image processor. Then, these methods, devices, and systems can identify user-supplied characters by performing automated character recognition on the user markings in the altered scan using the image processor and output such user-supplied characters from the image processor.

Type: Grant

Filed: December 16, 2014

Date of Patent: June 7, 2016

Assignee: Xerox Corporation

Inventors: Richard L. Howe, Eric M. Gross, Dennis L. Venable
Methods and apparatus for identifying tables in digital files

Patent number: 9348848

Abstract: A method for identifying a table in a digital file includes extracting lines from a layout of the digital file, wherein the lines comprise horizontal lines and vertical lines. The method also includes identifying intersected line groups, wherein each intersected line group comprises a horizontal line of the extracted horizontal lines and a vertical line of the extracted vertical lines, the horizontal line and the vertical line intersecting with each other. The method further includes determining whether the number of intersected lines in each intersected line group is larger than a first threshold. If yes, the method further includes identifying an area in which the intersected line groups are located as a table area. If no, the method further includes performing vertical projection on characters in the area, and identifying the area as a table area based on results of the vertical projection.

Type: Grant

Filed: April 26, 2013

Date of Patent: May 24, 2016

Assignees: Peking University Founder Group Co., Ltd., Beijing Founder Apabi Technology Ltd.

Inventors: Ning Dong, Wenjuan Huang
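
The line-grouping step described in the abstract of patent 9348848 can be sketched with a union-find structure. The sketch makes several assumptions that the abstract does not state: lines are axis-aligned tuples, groups are formed transitively from pairwise intersections, and the fallback vertical projection on characters is omitted. The names `intersects`, `find_table_groups`, and `min_lines` are hypothetical.

```python
def intersects(horizontal, vertical):
    """horizontal = (y, x_start, x_end); vertical = (x, y_start, y_end)."""
    hy, hx1, hx2 = horizontal
    vx, vy1, vy2 = vertical
    return hx1 <= vx <= hx2 and vy1 <= hy <= vy2


def find_table_groups(horizontals, verticals, min_lines=4):
    """Group mutually intersecting lines; keep groups with more than min_lines."""
    lines = [("h", h) for h in horizontals] + [("v", v) for v in verticals]
    parent = list(range(len(lines)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    offset = len(horizontals)
    for i, h in enumerate(horizontals):
        for j, v in enumerate(verticals):
            if intersects(h, v):
                parent[find(i)] = find(offset + j)  # merge the two groups

    groups = {}
    for idx, line in enumerate(lines):
        groups.setdefault(find(idx), []).append(line)
    # "Larger than a first threshold" per the abstract; min_lines is that threshold.
    return [g for g in groups.values() if len(g) > min_lines]


horizontals = [(0, 0, 30), (10, 0, 30), (20, 0, 30), (100, 0, 30)]
verticals = [(0, 0, 20), (15, 0, 20), (30, 0, 20)]
tables = find_table_groups(horizontals, verticals)
print(len(tables), len(tables[0]))
```

The isolated rule at y = 100 intersects nothing, so it forms a group of one and is discarded; the bounding box of each surviving group would give a candidate table area.
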
Recognizing handwriting input using rotatable support lines

Patent number: 9304681

Abstract: Software, firmware, and systems are described for identifying characters in a handwritten input received from a user on an input device, irrespective of an angle that the input is received at. In one implementation, the system establishes an anchor point and distances from the anchor point to reference support lines. A set of candidate characters is identified based on received handwritten input. The system estimates support lines for each of the candidate characters. The system ranks the candidate characters based on a total deviation measurement from the expectation for each candidate, where the expectation in part is based on the established distance from the established anchor point to reference support lines, and identifies a best-ranked candidate based at least in part on a smallest total deviation measurement.

Type: Grant

Filed: May 1, 2015

Date of Patent: April 5, 2016

Assignee: Nuance Communications, Inc.

Inventors: Jonas Andersson, Lars Jonas Morwing
