Segmenting Individual Characters Or Words Patents (Class 382/177)
  • Patent number: 10311149
    Abstract: A natural language translation device contains a bus and an input interface connecting to the bus for receiving a source sentence in a first natural language to be translated to a target sentence in a second natural language one word at a time in sequential order. A two-dimensional (2-D) symbol containing a super-character characterizing the i-th word of the target sentence based on the received source sentence is formed in accordance with a set of 2-D symbol creation rules. The i-th word of the target sentence is obtained by classifying the 2-D symbol via a deep learning model that contains multiple ordered convolution layers in a Cellular Neural Networks or Cellular Nonlinear Networks (CNN) based integrated circuit.
    Type: Grant
    Filed: August 8, 2018
    Date of Patent: June 4, 2019
    Assignee: Gyrfalcon Technology Inc.
    Inventors: Lin Yang, Patrick Z. Dong, Catherine Chi, Charles Jin Young, Jason Z Dong, Baohua Sun
  • Patent number: 10248313
    Abstract: In some examples, a computing device includes at least one processor; and at least one module, operable by the at least one processor to: output, for display at an output device, a graphical keyboard; receive an indication of a gesture detected at a location of a presence-sensitive input device, wherein the location of the presence-sensitive input device corresponds to a location of the output device that outputs the graphical keyboard; determine, based on at least one spatial feature of the gesture that is processed by the computing device using a neural network, at least one character string, wherein the at least one spatial feature indicates at least one physical property of the gesture; and output, for display at the output device, based at least in part on the processing of the at least one spatial feature of the gesture using the neural network, the at least one character string.
    Type: Grant
    Filed: March 29, 2017
    Date of Patent: April 2, 2019
    Assignee: Google LLC
    Inventors: Shumin Zhai, Thomas Breuel, Ouais Alsharif, Yu Ouyang, Francoise Beaufays, Johan Schalkwyk
  • Patent number: 10181075
    Abstract: An information processing apparatus includes an evaluation unit configured to evaluate whether a partial region of a photographing range of an imaging unit is a region suitable for analysis processing to be performed based on feature quantities of an object, with reference to a track of the object in an image captured by the imaging unit, and an output control unit configured to control the information processing apparatus to output information reflecting an evaluation result obtained by the evaluation unit. Accordingly, the information processing apparatus can support a user to improve the accuracy of the analysis processing to be performed based on the feature quantities of the object.
    Type: Grant
    Filed: October 12, 2016
    Date of Patent: January 15, 2019
    Assignee: Canon Kabushiki Kaisha
    Inventors: Hiroshi Tojo, Tomoya Honjo, Shinji Yamamoto
  • Patent number: 10176148
    Abstract: Technologies are described to provide smart flipping of groups of objects. According to some examples, a graphics module within an application may determine whether an object within a group of objects to be flipped is flippable, that is, whether it can be flipped without resulting in loss of object context after the flip operation. Then, the graphics module may flip the group of objects by translating all objects (moving their locations to appropriate new locations based on the flip operation), flipping the objects that can be flipped, and not flipping the object deemed not flippable, thereby preserving the displayed context of the object.
    Type: Grant
    Filed: August 27, 2015
    Date of Patent: January 8, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventor: Rahul Dhaundiyal
  • Patent number: 10134367
    Abstract: In one embodiment, dividing a set of texts into one or more text blocks, each text block including a portion of the set of texts; rendering each text block to obtain one or more rendered text blocks; determining a placement instruction for each rendered text block, the placement instruction indicating a position of the rendered text block when it is displayed; and sending the one or more rendered text blocks and their respectively associated placement instructions to an electronic device for displaying on the electronic device.
    Type: Grant
    Filed: March 24, 2016
    Date of Patent: November 20, 2018
    Assignee: Facebook, Inc.
    Inventor: Barak Reuven Naveh
  • Patent number: 10114889
    Abstract: Techniques for filtering information are described herein. In accordance with the present disclosure, a text acquisition module is configured to acquire text content to be filtered and a scanning module is configured to scan the text content to be filtered. The disclosed techniques scan the text content through a preset keyword dictionary, record a position of each keyword in the text content and acquire character pitch between keywords in the text content according to the position of each keyword in text content. A pitch judgment module is configured to judge whether the character pitch exceeds a preset character pitch and filter the keyword(s) in the text content in response to a determination that the character pitch exceeds the preset character pitch.
    Type: Grant
    Filed: May 15, 2013
    Date of Patent: October 30, 2018
    Assignee: Beijing Qihoo Technology Company Limited
    Inventors: Menggang Han, Tiejun Li, Xuping Liu
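    Sketch: the character-pitch check in the abstract above can be illustrated roughly as follows. This is a minimal sketch, not the patented implementation. It assumes that "character pitch" means the number of characters between the end of one keyword and the start of the next, and the function names (`keyword_positions`, `pitches`, `exceeds_pitch`) are hypothetical.

```python
def keyword_positions(text, dictionary):
    """Return sorted (start_index, keyword) pairs for every dictionary hit in text."""
    hits = []
    for kw in dictionary:
        start = text.find(kw)
        while start != -1:
            hits.append((start, kw))
            start = text.find(kw, start + 1)
    return sorted(hits)


def pitches(hits):
    """Gap, in characters, between each keyword and the next one (assumed definition)."""
    gaps = []
    for (s1, k1), (s2, k2) in zip(hits, hits[1:]):
        gaps.append((k1, k2, s2 - (s1 + len(k1))))
    return gaps


def exceeds_pitch(text, dictionary, preset_pitch):
    """Keyword pairs whose pitch exceeds the preset pitch."""
    hits = keyword_positions(text, dictionary)
    return [(a, b, gap) for a, b, gap in pitches(hits) if gap > preset_pitch]


if __name__ == "__main__":
    print(exceeds_pitch("buy cheap pills now", ["cheap", "now"], preset_pitch=5))
```

    What happens to the flagged keywords afterwards (the "filter" step) is left out, since the abstract does not spell it out.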
  • Patent number: 10102453
    Abstract: A string of natural language texts is received and formed into a multi-layer 2-D symbol in a first computing system. The 2-D symbol comprises a matrix of N×N pixels of data representing a “super-character”. The matrix is divided into M×M sub-matrices with each sub-matrix containing (N/M)×(N/M) pixels. N and M are positive integers, and N is preferably a multiple of M. Each sub-matrix represents one ideogram defined in an ideogram collection set. “Super-character” represents a meaning formed from a specific combination of a plurality of ideograms. The meaning of the “super-character” is learned in a second computing system by using an image processing technique to classify the 2-D symbol, which is formed in the first computing system and transmitted to the second computing system. The image processing technique includes predefining a set of categories and determining a probability for associating each of the predefined categories with the meaning of the “super-character”.
    Type: Grant
    Filed: September 1, 2017
    Date of Patent: October 16, 2018
    Assignee: Gyrfalcon Technology Inc.
    Inventors: Lin Yang, Patrick Z. Dong, Baohua Sun
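    Sketch: the layout described in the abstract above (an N×N pixel matrix divided into M×M sub-matrices of (N/M)×(N/M) pixels each) can be illustrated with plain nested lists. This is only an illustration under the assumption that N is an exact multiple of M, which the abstract calls preferable rather than required; the function name `split_into_submatrices` is hypothetical.

```python
def split_into_submatrices(matrix, m):
    """Split an N x N matrix into an M x M grid of (N/M) x (N/M) tiles."""
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        raise ValueError("matrix must be square")
    if m <= 0 or n % m != 0:
        raise ValueError("this sketch assumes N is a positive multiple of M")
    size = n // m
    tiles = []
    for block_row in range(m):
        rows = matrix[block_row * size:(block_row + 1) * size]
        tiles.append([
            [row[block_col * size:(block_col + 1) * size] for row in rows]
            for block_col in range(m)
        ])
    return tiles


if __name__ == "__main__":
    # A 4 x 4 "symbol" whose pixel values encode their own position.
    symbol = [[r * 4 + c for c in range(4)] for r in range(4)]
    for tile_row in split_into_submatrices(symbol, 2):
        print(tile_row)
```

    In the abstract's terms, each tile would hold the pixels of one ideogram; how ideograms are rendered into pixels is not covered here.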
  • Patent number: 10083171
    Abstract: A string of natural language texts is received and formed into a multi-layer 2-D symbol in a computing system. The 2-D symbol comprises a matrix of N×N pixels of K-bit data representing a “super-character”. The matrix is divided into M×M sub-matrices with each sub-matrix containing (N/M)×(N/M) pixels. K, N and M are positive integers, and N is preferably a multiple of M. Each sub-matrix represents one ideogram defined in an ideogram collection set. “Super-character” represents a meaning formed from a specific combination of a plurality of ideograms. The meaning of the “super-character” is learned by classifying the 2-D symbol via a trained convolutional neural networks model having bi-valued 3×3 filter kernels in a Cellular Neural Networks or Cellular Nonlinear Networks (CNN) based integrated circuit.
    Type: Grant
    Filed: September 19, 2017
    Date of Patent: September 25, 2018
    Assignee: Gyrfalcon Technology Inc.
    Inventors: Lin Yang, Patrick Z. Dong, Baohua Sun
  • Patent number: 10068187
    Abstract: A method includes accessing information identifying multiple files and identifying classification data for the multiple files, where the classification data indicates, for a particular file of the multiple files, whether the particular file includes malware. The method also includes generating n-gram vectors for the multiple files by, for each file, generating an n-gram vector indicating occurrences of character pairs in printable characters representing the file. The method further includes generating and storing a file classifier using the n-gram vectors and the classification data as supervised training data.
    Type: Grant
    Filed: May 31, 2017
    Date of Patent: September 4, 2018
    Assignee: SPARKCOGNITION, INC.
    Inventor: Na Sai
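    Sketch: the n-gram vectors of character pairs mentioned in the abstract above can be illustrated as bigram counts over a file's printable characters. This is a rough sketch under two assumptions not stated in the abstract: "printable" is taken to mean ASCII bytes 0x20 to 0x7E, and the vector stores occurrence counts over a fixed, caller-supplied vocabulary. All names are hypothetical.

```python
from collections import Counter


def printable_text(data: bytes) -> str:
    """Keep only printable ASCII bytes (assumed range 0x20-0x7E)."""
    return "".join(chr(b) for b in data if 32 <= b <= 126)


def bigram_counts(data: bytes) -> Counter:
    """Count adjacent character pairs in the printable text."""
    text = printable_text(data)
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


def to_vector(counts, vocabulary):
    """Project the counts onto a fixed list of character pairs."""
    return [counts.get(pair, 0) for pair in vocabulary]


if __name__ == "__main__":
    counts = bigram_counts(b"ab\x00ab")
    print(to_vector(counts, ["ab", "ba", "zz"]))
```

    Vectors built this way, paired with malware/non-malware labels, are the kind of supervised training data the abstract refers to; the classifier itself is outside this sketch.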
  • Patent number: 10062001
    Abstract: A method for segmenting an image containing handwritten text into line segments and word segments. The image is horizontally down sampled at a first ratio. Connected regions in the down-sampled image are detected; horizontal neighboring ones are merged to form lines, to segment the original image into line images. Each line image is horizontally down sampled at a second ratio which is smaller than the first ratio. Connected regions in the down-sampled line image are detected to obtain potential word segmentation positions. A path is a way of dividing the line at some or all of the potential word segmentation positions into multiple path segments; for each of all possible paths, word recognition is applied to each path segment to calculate a word recognition score, and an average word recognition score for the path is calculated; the path with the highest score gives the final word segmentation.
    Type: Grant
    Filed: September 29, 2016
    Date of Patent: August 28, 2018
    Assignee: KONICA MINOLTA LABORATORY U.S.A., INC.
    Inventor: Duanduan Yang
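    Sketch: the path search at the end of the abstract above (try every way of cutting a line at some or all potential positions, score each segment, keep the path with the highest average) can be illustrated as below. The word recognizer is replaced by a caller-supplied scoring function, which is an assumption of this sketch, as are the function names.

```python
from itertools import combinations


def all_paths(cuts):
    """Yield every subset of the potential cut positions, including the empty one."""
    for r in range(len(cuts) + 1):
        yield from combinations(cuts, r)


def segments(length, chosen):
    """Turn chosen cut positions into (start, end) segment bounds."""
    bounds = [0, *chosen, length]
    return list(zip(bounds, bounds[1:]))


def best_path(length, cuts, score_segment):
    """Return (average_score, chosen_cuts) for the highest-scoring path."""
    best = None
    for chosen in all_paths(cuts):
        segs = segments(length, chosen)
        avg = sum(score_segment(a, b) for a, b in segs) / len(segs)
        if best is None or avg > best[0]:
            best = (avg, chosen)
    return best


if __name__ == "__main__":
    line = "thecat"            # stand-in for a line image
    vocab = {"the", "cat"}     # stand-in for a word recognizer

    def score(a, b):
        return 1.0 if line[a:b] in vocab else 0.0

    print(best_path(len(line), [2, 3], score))
```

    Exhaustive enumeration grows as 2^k in the number of potential cut positions k, so this form is only practical when the down-sampling step leaves few candidates per line.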
  • Patent number: 10049101
    Abstract: The present invention discloses a method and system for processing semantic fragments. Some embodiments of the present invention provide a method for processing semantic fragments. The method comprises: obtaining a plurality of groups of semantic fragments, the plurality of groups of semantic fragments at least including a first group of semantic fragments generated from a first data processing flow and a second group of semantic fragments generated from a second data processing flow, the first data processing flow being different from the second data processing flow; and merging the first group of semantic fragments and the second group of semantic fragments based on semantic equivalence. A corresponding system is also disclosed.
    Type: Grant
    Filed: August 6, 2015
    Date of Patent: August 14, 2018
    Assignee: International Business Machines Corporation
    Inventors: Wei Hua Duan, Jia Ji, Jiang Lu, Wei Jie Wang, Qiang Xu, Liang Xue
  • Patent number: 10042543
    Abstract: A method, apparatus, and program product are disclosed for receiving an input from an input device, determining one or more characteristics of the received input, the one or more characteristics indicating a word length, and presenting a list of one or more words having word lengths determined according to the indicated word length.
    Type: Grant
    Filed: September 18, 2013
    Date of Patent: August 7, 2018
    Assignee: Lenovo (Singapore) PTE. LTD.
    Inventors: Russell Speight VanBlon, John Carl Mese, Nathan J. Peterson, Rod D. Waltermann, Arnold S. Weksler
  • Patent number: 9965677
    Abstract: Methods and systems for localizing numbers and characters in captured images. A side image of a vehicle captured by one or more cameras can be preprocessed to determine a region of interest. A confidence value of series of windows within regions of interest of different sizes and aspect ratios containing a structure of interest can be calculated. Highest confidence candidate regions can then be identified with respect to the regions of interest and at least one region adjacent to the highest confidence candidate regions. An OCR operation can then be performed in the adjacent region. An identifier can then be returned from the adjacent region in order to localize numbers and characters in the side image of the vehicle.
    Type: Grant
    Filed: December 9, 2014
    Date of Patent: May 8, 2018
    Assignee: Conduent Business Services, LLC
    Inventors: Orhan Bulan, Howard Mizes, Vladimir Kozitsky, Aaron M. Burry
  • Patent number: 9940307
    Abstract: Systems and methods are provided for providing a navigation interface to access or otherwise use electronic content items. In one embodiment, an augmentation application identifies at least one entity referenced in a document. The entity can be referenced in at least two portions of the document by at least two different words or phrases. The augmentation application associates the at least one entity with at least one multimedia asset. The augmentation application generates a layout including at least some content of the document referencing the at least one entity and the at least one multimedia asset associated with the at least one entity. The augmentation application renders the layout for display.
    Type: Grant
    Filed: December 31, 2012
    Date of Patent: April 10, 2018
    Assignee: Adobe Systems Incorporated
    Inventors: Emre Demiralp, Gavin Stuart Peter Miller, Walter W. Chang, Daicho Ito, Grayson Squier Lang
  • Patent number: 9928414
    Abstract: There is provided an information processing system including a first control unit configured to associate handwriting action trajectory information indicating a user's handwriting action trajectory with meta information capable of being detected from an actual environment where the user's handwriting action is performed.
    Type: Grant
    Filed: January 27, 2015
    Date of Patent: March 27, 2018
    Assignee: SONY CORPORATION
    Inventors: Makoto Saito, Hiroaki Kitano
  • Patent number: 9922406
    Abstract: There is provided an image processing apparatus including an input device configured to receive a stroke input, and a display controller configured to control a displaying of a modified stroke, wherein the modified stroke is synthesized based on characteristic parameters of the received stroke input and characteristic parameters of a reference stroke that has been matched to the received stroke input.
    Type: Grant
    Filed: May 15, 2014
    Date of Patent: March 20, 2018
    Assignee: SONY CORPORATION
    Inventors: Yoshihito Ohki, Yasuyuki Koga, Tsubasa Tsukahara, Ikuo Yamano, Hiroyuki Mizunuma, Miwa Ichikawa
  • Patent number: 9881224
    Abstract: A “Stroke Untangler” composes handwritten messages from handwritten strokes representing overlapping letters or partial letter segments that are drawn on a touchscreen device or touch-sensitive surface. These overlapping strokes are automatically untangled and then segmented and combined into one or more letters, words, or phrases. Advantageously, segmentation and composition are performed without requiring user gestures, timeouts, or other inputs to delimit characters within words, and without using handwriting recognition-based techniques to guide untangling and composing of the overlapping strokes to form characters. In other words, the user draws multiple overlapping strokes. Those strokes are then automatically segmented and combined into one or more corresponding characters. Text recognition of the resulting characters is then performed. Further, the segmentation and combination are performed in real-time, thereby enabling real-time rendering of the resulting characters in a user interface window.
    Type: Grant
    Filed: December 17, 2013
    Date of Patent: January 30, 2018
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Wolf Kienzle, Kenneth Paul Hinckley, Mudit Agrawal
  • Patent number: 9881231
    Abstract: Methods, systems, and apparatus including computer program products for using extracted image text are provided. In one implementation, a computer-implemented method is provided. The method includes receiving an input of one or more image search terms and identifying keywords from the received one or more image search terms. The method also includes searching a collection of keywords including keywords extracted from image text, retrieving an image associated with extracted image text corresponding to one or more of the image search terms, and presenting the image.
    Type: Grant
    Filed: December 1, 2016
    Date of Patent: January 30, 2018
    Assignee: Google LLC
    Inventors: Adrian Ulges, Luc Vincent
  • Patent number: 9875254
    Abstract: A method for searching for at least one term, consisting of at least one character, in at least one set of ink data is disclosed. This method advantageously includes an operation for converting ink data into intermediate data, in an intermediate format, in the form of at least one segmentation graph, each node of one of the graphs including at least one ink segment associated with at least one assumption of correspondence with a recognition unit, and an operation for searching for the term or terms, carried out on the intermediate data, the conversion operation being carried out once and for all during storage of one of the sets of data, and the search operation being capable of being carried out at any time.
    Type: Grant
    Filed: January 10, 2006
    Date of Patent: January 23, 2018
    Assignee: MYSCRIPT
    Inventor: Pierre-Michel Lallican
  • Patent number: 9874950
    Abstract: One embodiment provides a method, involving: receiving, at a device, handwriting input from a user; detecting, using a processor, a location of at least a part of the handwriting input; and providing, on a display device, at least one adaptive line to guide the handwriting input; wherein the at least one adaptive line is positioned based on the location of at least a part of the handwriting input. Other aspects are described and claimed.
    Type: Grant
    Filed: November 21, 2014
    Date of Patent: January 23, 2018
    Assignee: Lenovo (Singapore) Pte. Ltd.
    Inventors: Jianbang Zhang, Steven Richard Perrin, Russell Speight VanBlon, Joshua Neil Novak
  • Patent number: 9852125
    Abstract: An approach is provided to discover new portmanteaus, such as when ingesting documents into a question answering (QA) system. The approach works by analyzing words included in electronic documents and identifying words that are possible portmanteaus. To analyze a portmanteau found in a document, the approach identifies morphemes that are included in the identified portmanteau and candidate words that correspond to each of the identified morphemes. A meaning for the new portmanteau is then derived from the meanings of the candidate words.
    Type: Grant
    Filed: September 28, 2015
    Date of Patent: December 26, 2017
    Assignee: International Business Machines Corporation
    Inventors: Corville O. Allen, Albert A. Chung, Andrew R. Freed, Sorabh Murgai
  • Patent number: 9852124
    Abstract: An approach is provided to discover new portmanteaus, such as when ingesting documents into a question answering (QA) system. The approach works by analyzing words included in electronic documents and identifying words that are possible portmanteaus. To analyze a portmanteau found in a document, the approach identifies morphemes that are included in the identified portmanteau and candidate words that correspond to each of the identified morphemes. A meaning for the new portmanteau is then derived from the meanings of the candidate words.
    Type: Grant
    Filed: September 2, 2015
    Date of Patent: December 26, 2017
    Assignee: International Business Machines Corporation
    Inventors: Corville O. Allen, Albert A. Chung, Andrew R. Freed, Sorabh Murgai
  • Patent number: 9785849
    Abstract: A method and system are provided for identifying a page layout of an image that includes textual regions. The textual regions are to undergo optical character recognition (OCR). The system includes an input component that receives an input image that includes words around which bounding boxes have been formed and a text identifying component that groups the words into a plurality of text regions. A reading line component groups words within each of the text regions into reading lines. A text region sorting component sorts the text regions in accordance with their reading order.
    Type: Grant
    Filed: November 13, 2013
    Date of Patent: October 10, 2017
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Mircea Cimpoi, Sasa Galic, Milan Vugdelija
  • Patent number: 9779325
    Abstract: Methods, systems, and apparatus including computer program products for using extracted image text are provided. In one implementation, a computer-implemented method is provided. The method includes receiving an input of one or more image search terms and identifying keywords from the received one or more image search terms. The method also includes searching a collection of keywords including keywords extracted from image text, retrieving an image associated with extracted image text corresponding to one or more of the image search terms, and presenting the image.
    Type: Grant
    Filed: December 1, 2016
    Date of Patent: October 3, 2017
    Assignee: Google Inc.
    Inventors: Adrian Ulges, Luc Vincent
  • Patent number: 9767353
    Abstract: A handwriting recognition system converts word images on documents, such as document images of historical records, into computer searchable text. Word images (snippets) on the document are located, and have multiple word features identified. For each word image, a word feature vector is created representing multiple word features. Based on the similarity of word features (e.g., the distance between feature vectors), similar words are grouped together in clusters, and a centroid that has features most representative of words in the cluster is selected. A digitized text word is selected for each cluster based on review of a centroid in the cluster, and is assigned to all words in that cluster and is used as computer searchable text for those word images where they appear in documents. An analyst may review clusters to permit refinement of the parameters used for grouping words in clusters, including the adjustment of weights and other factors used for determining the distance between feature vectors.
    Type: Grant
    Filed: August 31, 2015
    Date of Patent: September 19, 2017
    Assignee: Ancestry.com Operations Inc.
    Inventors: Jack Reese, Michael Murdock, Shawn Reid, Laryn Brown
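    Sketch: the abstract above selects, for each cluster of word feature vectors, a centroid whose features are most representative of the cluster, using a distance with adjustable weights. One plausible reading, shown here purely as an assumption, is to pick the cluster member closest to the weighted mean. The function names and the weighted Euclidean distance are choices of this sketch, not details from the patent.

```python
import math


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two feature vectors."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def representative(vectors, weights):
    """Index of the cluster member closest to the cluster mean."""
    dims = len(vectors[0])
    mean = [sum(v[d] for v in vectors) / len(vectors) for d in range(dims)]
    return min(
        range(len(vectors)),
        key=lambda i: weighted_distance(vectors[i], mean, weights),
    )


if __name__ == "__main__":
    cluster = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
    idx = representative(cluster, weights=[1.0, 1.0])
    print("representative word image:", idx)
```

    Once a reviewer transcribes the representative word image, the same text would be assigned to every word image in that cluster, as the abstract describes.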
  • Patent number: 9697577
    Abstract: The present inventive subject matter provides systems, methods, software, and data structures for patent mapping, storage, and searching. Some such embodiments include mapping patent documents, claims, and claim limitations. Some further embodiments provide for searching a universe of patent documents by patent document, claim, limitation, class, element, or concept.
    Type: Grant
    Filed: December 1, 2010
    Date of Patent: July 4, 2017
    Assignee: Lucid Patent LLC
    Inventors: Steven W. Lundberg, Janal M. Kalis, Pradeep Sinha
  • Patent number: 9684984
    Abstract: A system and process of nearsighted (myopia) camera object detection involves detecting the objects through edge detection and outlining or thickening them with a heavy border. Thickening may include making the object bold in the case of text characters. The bold characters are then much more apparent and heavier weighted than the background. Thresholding operations are then applied (usually multiple times) to the grayscale image to remove all but the darkest foreground objects in the background resulting in a nearsighted (myopic) image. Additional processes may be applied to the nearsighted image, such as morphological closing, contour tracing and bounding of the objects or characters. The bound objects or characters can then be averaged to provide repositioning feedback for the camera user. Processed images can then be captured and subjected to OCR to extract relevant information from the image.
    Type: Grant
    Filed: July 8, 2015
    Date of Patent: June 20, 2017
    Assignee: Sage Software, Inc.
    Inventor: Scott E. Barton
  • Patent number: 9679570
    Abstract: Topics of potential interest to a user, useful for purposes such as targeted advertising and product recommendations, can be extracted from voice content produced by a user. A computing device can capture voice content, such as when a user speaks into or near the device. One or more sniffer algorithms or processes can attempt to identify trigger words in the voice content, which can indicate a level of interest of the user. For each identified potential trigger word, the device can capture adjacent audio that can be analyzed, on the device or remotely, to attempt to determine one or more keywords associated with that trigger word. The identified keywords can be stored and/or transmitted to an appropriate location accessible to entities such as advertisers or content providers who can use the keywords to attempt to select or customize content that is likely relevant to the user.
    Type: Grant
    Filed: August 17, 2015
    Date of Patent: June 13, 2017
    Assignee: AMAZON TECHNOLOGIES, INC.
    Inventor: Kiran K. Edara
  • Patent number: 9678642
    Abstract: A system and methods for selecting a region of pixels in an image displayed on a touch-sensitive interface are disclosed. The method for selecting the region of pixels is based on determined connectivity of pixels in the image indicating content of the image and includes determining connected pixels on the image representing the content without performing character recognition, detecting a text selection gesture indicative of selecting the region in the image, determining coordinates of the text selection gesture performed on the touch-sensitive interface and selecting the region in the image by bounding a first set of pixels located at a proximity from the coordinates of the text selection gesture.
    Type: Grant
    Filed: May 29, 2015
    Date of Patent: June 13, 2017
    Assignee: Lexmark International, Inc.
    Inventors: Stuart Willard Daniel, Ahmed Hamad Mohamed Eid, Shaun Timothy Love
  • Patent number: 9679217
    Abstract: According to one embodiment, an information processing apparatus includes an image acquisition module, an elevation-angle acquisition module, a character deformation specification module, a character detection dictionary storage, a character detection dictionary selector and a character detector. The elevation-angle acquisition module is configured to acquire an elevation angle of a photographic device assumed when the photographic device has obtained an acquired image. The character deformation specification module is configured to specify how an appearance of the character in the acquired image is deformed, based on the acquired elevation angle.
    Type: Grant
    Filed: August 25, 2015
    Date of Patent: June 13, 2017
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Kaoru Suzuki, Yojiro Tonouchi, Tomoyuki Shibata, Isao Mihara
  • Patent number: 9633013
    Abstract: A system for processing text captured from rendered documents is described. The system receives a sequence of one or more words optically or acoustically captured from a rendered document by a user. The system identifies among words of the sequence a word with which an action has been associated. The system then performs the associated action with respect to the user.
    Type: Grant
    Filed: March 22, 2016
    Date of Patent: April 25, 2017
    Assignee: Google Inc.
    Inventors: Martin T. King, Dale L. Grover, Clifford A. Kushler, James Q. Stafford-Fraser
  • Patent number: 9607237
    Abstract: When there is a possibility that a third character region is redundantly selected in both a case where the line extraction process is performed starting from a first character region and a case where the line extraction process is performed starting from a second character region located in a line different from a line containing the first character region, the line recognition unit determines which line to incorporate the third character region in, by comparing a case of incorporating the third character region into the line starting with the first character region, with a case of incorporating the third character region into the line starting with the second character region.
    Type: Grant
    Filed: February 27, 2014
    Date of Patent: March 28, 2017
    Assignee: OMRON Corporation
    Inventors: Hirotaka Wada, Tomoyoshi Aizawa, Norikazu Tonogai, Tadashi Hyuga, Yoshihisa Minato, Masamichi Oe, Koji Kobayashi
  • Patent number: 9600731
    Abstract: According to one embodiment, an image processing apparatus includes a calculation unit. The calculation unit is configured to calculate a first similarity degree group which is composed of similarity degrees between respective characters constituting a first character string appearing on a first image and respective candidate characters in a candidate character group, to calculate a second similarity degree group which is composed of similarity degrees between respective characters constituting a second character string appearing on a second image and the respective candidate characters, and to calculate a third similarity degree group which is composed of similarity degrees between respective characters constituting a third character string appearing on the second image and the respective candidate characters.
    Type: Grant
    Filed: April 8, 2015
    Date of Patent: March 21, 2017
    Assignee: TOSHIBA TEC KABUSHIKI KAISHA
    Inventors: Masaaki Yasunaga, Kazuki Taira
  • Patent number: 9575935
    Abstract: A document file that draws a picture finely is created without increasing the file size. When a size of a first file computed before a process of vectorization is smaller than a size of a file of a manuscript, the process of vectorization is performed. When a size of a second file computed in the process of vectorization is smaller than the size of the file of the manuscript, a process after an end of the process of vectorization is performed. When a size of a third file computed in the process after the end of the process of vectorization is smaller than the size of the file of the manuscript, a vectorization file that is written in vectorized data is generated.
    Type: Grant
    Filed: January 24, 2015
    Date of Patent: February 21, 2017
    Assignee: KYOCERA Document Solutions Inc.
    Inventor: Motoki Hiratsuka
  • Patent number: 9570068
    Abstract: Embodiments of the present invention provide an approach for estimating the accuracy of a transcription of a voice recording. Specifically, in a typical embodiment, each word of a transcription of a voice recording is checked against a customer-specific dictionary and/or a common language dictionary. The number of words not found in either dictionary is determined. An accuracy number for the transcription is calculated from the number of said words not found and the total number of words in the transcription.
    Type: Grant
    Filed: June 3, 2016
    Date of Patent: February 14, 2017
    Assignee: International Business Machines Corporation
    Inventors: James E. Bostick, John M. Ganci, Jr., John P. Kaemmerer, Craig M. Trim
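    Sketch: the accuracy estimate in the abstract above is computed from the number of transcript words found in neither dictionary and the total word count. The abstract does not give the formula; this sketch assumes accuracy = 1 - (unknown words / total words). The tokenization (whitespace split, punctuation stripped, lowercased) and the function name are also assumptions.

```python
def estimate_accuracy(transcript, customer_dict, common_dict):
    """Return (accuracy, unknown_words) for a transcript checked against two dictionaries."""
    words = [w.strip(".,!?;:\"'").lower() for w in transcript.split()]
    words = [w for w in words if w]
    if not words:
        raise ValueError("transcript contains no words")
    unknown = [
        w for w in words
        if w not in customer_dict and w not in common_dict
    ]
    return 1 - len(unknown) / len(words), unknown


if __name__ == "__main__":
    common = {"please", "reset", "the"}
    customer = {"flux", "capacitor"}
    accuracy, unknown = estimate_accuracy(
        "Please reset the flux capacitor zzyx.", customer, common
    )
    print(f"estimated accuracy: {accuracy:.2f}, unknown words: {unknown}")
```

    Checking the customer-specific dictionary as well as the common one keeps domain terms from being counted as transcription errors.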
  • Patent number: 9563812
    Abstract: According to one embodiment, an image processing apparatus includes a calculation unit and a recognition unit. The calculation unit is configured to calculate a first similarity degree group which is composed of similarity degrees between respective characters constituting a first character string appearing on a first image and respective candidate characters in a candidate character group, and to calculate a second similarity degree group which is composed of similarity degrees between respective characters constituting a second character string appearing on a second image and the respective candidate characters in the candidate character group.
    Type: Grant
    Filed: April 8, 2015
    Date of Patent: February 7, 2017
    Assignee: Toshiba TEC Kabushiki Kaisha
    Inventors: Masaaki Yasunaga, Kazuki Taira
  • Patent number: 9552621
    Abstract: An image processing apparatus includes a reception unit, an acquisition unit, an enlarging/reducing unit, and a detector. The reception unit receives two image data to be compared. The acquisition unit acquires character sizes of characters contained in the two image data received by the reception unit. The enlarging/reducing unit enlarges or reduces the image data received by the reception unit such that the character sizes of the characters contained in the two image data acquired by the acquisition unit coincide with each other. The detector detects a difference between the two image data which have been enlarged or reduced by the enlarging/reducing unit such that the character sizes of the characters contained in the two image data coincide with each other.
    Type: Grant
    Filed: November 21, 2014
    Date of Patent: January 24, 2017
    Assignee: FUJI XEROX CO., LTD.
    Inventors: Tetsuharu Watanabe, Naoyuki Enomoto, Yozo Kashima, Tomohisa Ishikawa
  • Patent number: 9542383
    Abstract: An error detection system for automatically evaluating writing includes: an example construction apparatus to collect example sentences including various literary styles, to break up the collected example sentences in units of morphemes, and to construct the example sentences in an example-based index database (DB); and an error detection apparatus to break up an input sentence in units of morphemes, to generate one or more morpheme sequences bound in arbitrary window sizes based on one or more of morphemes of the broken-up input sentence, to search the example-based index DB for each of the generated morpheme sequences, and to detect an error according to a frequency at which said each morpheme is arranged in a corresponding morpheme sequence among morpheme sequences searched for through the example-based index DB.
    Type: Grant
    Filed: December 4, 2013
    Date of Patent: January 10, 2017
    Assignee: SK TELECOM CO., LTD.
    Inventors: Seunghwan Kim, Eunsook Lee, Seongmook Kim, Dongnam Kim, Sung Kim
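The example-index lookup in the abstract of 9542383 can be sketched roughly as follows. Several things here are assumptions made for illustration: whitespace splitting stands in for real morpheme analysis, a `Counter` stands in for the example-based index DB, and the names `windows`, `build_index`, `detect_errors`, and `min_count` are hypothetical.

```python
from collections import Counter

def windows(tokens, size):
    """All contiguous token sequences of the given window size."""
    return [tuple(tokens[i:i + size]) for i in range(len(tokens) - size + 1)]

def build_index(example_sentences, sizes=(2, 3)):
    """Count every window of each size across the example sentences."""
    index = Counter()
    for sentence in example_sentences:
        tokens = sentence.split()  # stand-in for morpheme segmentation
        for size in sizes:
            index.update(windows(tokens, size))
    return index

def detect_errors(sentence, index, sizes=(2, 3), min_count=1):
    """Return windows of the input seen fewer than min_count times in the index."""
    tokens = sentence.split()
    return [w for size in sizes
            for w in windows(tokens, size)
            if index[w] < min_count]

index = build_index(["she goes to school", "he goes to work"])
suspect = detect_errors("she go to school", index, sizes=(2,))
print(suspect)  # [('she', 'go'), ('go', 'to')]
```

A rarely seen window is only a hint that something is wrong; a larger example collection would be needed before low frequency could be read as an error with any confidence.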
  • Patent number: 9530068
    Abstract: An approach is provided to generate forms with template inclusions. In the approach, optical character recognition (OCR) text is compared to corresponding text in a selected form. Characters of text in the OCR text are then replaced with text from the template text, the replacing results in a form with template inclusions. The form with template inclusions is then processed by a forms processing operation.
    Type: Grant
    Filed: November 10, 2014
    Date of Patent: December 27, 2016
    Assignee: International Business Machines Corporation
    Inventors: Keith P. Biegert, Brendan C. Bull, David Contreras, Robert C. Sizemore, Sterling R. Smith
  • Patent number: 9514134
    Abstract: A system for processing text captured from rendered documents is described. The system receives a sequence of one or more words optically or acoustically captured from a rendered document by a user. The system identifies among words of the sequence a word with which an action has been associated. The system then performs the associated action with respect to the user.
    Type: Grant
    Filed: July 15, 2015
    Date of Patent: December 6, 2016
    Assignee: Google Inc.
    Inventors: Martin T. King, Dale L. Grover, Clifford A. Kushler, James Q. Stafford-Fraser
  • Patent number: 9501853
    Abstract: The present disclosure is directed toward systems and methods for assisting users in correcting OCR errors. For example, systems and methods described herein involve identifying the position of a cursor within a machine-readable document. Systems and methods described herein also involve identifying corresponding position co-ordinates in a source image, as well as capturing an image preview from the source image based on the corresponding position co-ordinates. Systems and methods described herein may also involve providing the preview of the source image within the machine-readable document.
    Type: Grant
    Filed: January 9, 2015
    Date of Patent: November 22, 2016
    Assignee: ADOBE SYSTEMS INCORPORATED
    Inventors: Sayeed Ullah Khan, Subrato Namata, Uttam Dwivedi
  • Patent number: 9495357
    Abstract: Embodiments are used to extract terms from any text set that are used on other text, such as in a repository, that then can be used in a variety of applications, from providing search results, to analyzing data sets, to building a variety of text generation tools, such as messaging and emails.
    Type: Grant
    Filed: May 2, 2014
    Date of Patent: November 15, 2016
    Inventors: Athena Ann Smyros, Constantine John Smyros
  • Patent number: 9472037
    Abstract: The present invention provides a method of re-orienting an image of a media item, comprising determining at least one linear array formed by a plurality of locations associated with an image of the media item; determining a skew angle of said linear array with respect to a reference axis; and re-orienting said image by rotating said image responsive to said skew angle. Apparatus for re-orienting an image of a media item and a document processing module are also provided.
    Type: Grant
    Filed: January 31, 2014
    Date of Patent: October 18, 2016
    Assignee: NCR CORPORATION
    Inventor: Ping Chen
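The skew correction in the abstract of 9472037 can be illustrated with a small geometric sketch. It assumes the "linear array" is a list of (x, y) locations, that the reference axis is the x-axis, and that a least-squares line fit is an acceptable way to estimate the skew angle; the abstract does not say how the angle is computed, and the names `skew_angle` and `rotate` are hypothetical.

```python
import math

def skew_angle(points):
    """Angle in radians between the least-squares line through points and the x-axis.

    Assumes the array is closer to horizontal than to vertical.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return math.atan2(sxy, sxx)  # atan(slope), with slope = sxy / sxx

def rotate(points, angle):
    """Rotate points about the origin by angle (radians, counter-clockwise)."""
    c, s = math.cos(angle), math.sin(angle)
    return [(x * c - y * s, x * s + y * c) for x, y in points]

# Ten locations lying on a line tilted 5 degrees from the x-axis.
tilt = math.radians(5)
locations = [(i * math.cos(tilt), i * math.sin(tilt)) for i in range(10)]
angle = skew_angle(locations)
deskewed = rotate(locations, -angle)  # rotate back by the measured skew
```

For a real image the same angle would be applied to every pixel, typically through an image library's rotation routine rather than point by point.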
  • Patent number: 9418312
    Abstract: Systems and methods coarsely classify unknown documents in a group or not with reference document(s). Documents get scanned into digital images. Counts of contours are taken. The closer the counts of the contours of the unknown document reside to the reference document(s), the more likely the documents are all of a same type. Embodiments typify contour analysis, classification acceptance or not, application of algorithms, and imaging devices with scanners, to name a few.
    Type: Grant
    Filed: September 10, 2014
    Date of Patent: August 16, 2016
    Assignee: LEXMARK INTERNATIONAL TECHNOLOGY, SA
    Inventors: Ranajyoti Chakraborti, Kunal Das, Rajib Dutta, Sabyasachi Samanta, Subhadeep Samanta
  • Patent number: 9406030
    Abstract: Electronic document classification comprising providing training documents sorted into classes; linear programming including selecting inputs which maximize an output, given constraints on inputs, the output maximized being a difference between: a. first estimated probability that a document instance will be correctly classified, by a classifier corresponding to given inputs, as belonging to its own class, and b. second estimated probability that document instance will be classified, by the classifier, as not belonging to its own class; and classifying electronic document instances into classes, using a preferred classifier corresponding to the inputs selected by the linear programming. A computerized electronic document forgery detection method provides training documents and uses a processor to select value-ranges of non-trivial parameters, such that selected value-range(s) of parameters are typical to an authentic document of given class, and atypical to a forged document of same class.
    Type: Grant
    Filed: July 23, 2012
    Date of Patent: August 2, 2016
    Assignee: AU10TIX LIMITED
    Inventors: Guy Dolev, Sergey Markin, Avi Bar-Nissim, Asher Uziel
  • Patent number: 9384389
    Abstract: Some examples include detecting errors in text that has been recognized using automated text recognition technology. For instance, errors in the recognized text may be detected based on glyph image similarity and the use of a language model, dictionary information, or the like. Some implementations may group together glyphs based on association of the glyphs with the same glyph identifier and a similarity of the appearance of the glyphs. Furthermore, the words associated with each glyph may be checked against a language model, such as to check a spelling or other validity of the words, and a score may be assigned to each group of glyphs based on the validity of the words corresponding to the glyphs in that group. Groups that have a score that fails to meet a threshold may be reviewed by a person or may undergo automated correction techniques.
    Type: Grant
    Filed: September 12, 2012
    Date of Patent: July 5, 2016
    Assignee: Amazon Technologies, Inc.
    Inventors: Viswanath Sankaranarayanan, Sridhar Jayaraman
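The group scoring described in the abstract of 9384389 might look roughly like the sketch below. It assumes the glyphs have already been grouped by glyph identifier and appearance, so each input item is a (group key, word) pair. A plain word set stands in for the language model, and the score (share of valid words) and the names `score_glyph_groups` and `threshold` are illustrative assumptions.

```python
from collections import defaultdict

def score_glyph_groups(glyph_words, dictionary, threshold=0.5):
    """Score each glyph group by the share of its words found in the dictionary.

    Returns (scores, flagged), where flagged lists groups scoring below threshold.
    """
    words_by_group = defaultdict(list)
    for group, word in glyph_words:
        words_by_group[group].append(word)

    scores = {
        group: sum(w in dictionary for w in words) / len(words)
        for group, words in words_by_group.items()
    }
    flagged = [group for group, score in scores.items() if score < threshold]
    return scores, flagged

# Illustrative data: two visually distinct clusters both labeled "e" by the OCR engine.
observations = [
    (("e", 0), "the"), (("e", 0), "tree"),
    (("e", 1), "thc"), (("e", 1), "trce"),
]
scores, flagged = score_glyph_groups(observations, {"the", "tree"})
print(flagged)  # [('e', 1)]
```

Here the second cluster is probably a "c" misread as "e", which is why every word containing it fails the dictionary check.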
  • Patent number: 9367760
    Abstract: Systems and methods coarsely classify unknown documents in a group or not with reference document(s). Documents get scanned into digital images. Counts of contours are taken. The closer the counts of the contours of the unknown document reside to the reference document(s), the more likely the documents are all of a same type. Embodiments typify contour analysis, classification acceptance or not, application of algorithms, and imaging devices with scanners, to name a few.
    Type: Grant
    Filed: November 24, 2015
    Date of Patent: June 14, 2016
    Assignee: LEXMARK INTERNATIONAL, INC.
    Inventors: Ranajyoti Chakraborti, Kunal Das, Rajib Dutta, Sabyasachi Samanta, Subhadeep Samanta
  • Patent number: 9361536
    Abstract: Methods, devices, and systems replace solid lines of user-fillable areas of a print job with patterned lines and then print the print job with the patterned lines to print user-fillable pre-printed forms, using a printing device. These methods, devices, and systems also scan at least one of the user-fillable pre-printed forms having user markings to produce a scan, using an optical scanner. Further, such methods, devices, and systems produce an altered scan by removing only the patterned lines from the scan to leave the user markings in the altered scan using the image processor. Then, these methods, devices, and systems can identify user-supplied characters by performing automated character recognition on the user markings in the altered scan using the image processor and output such user-supplied characters from the image processor.
    Type: Grant
    Filed: December 16, 2014
    Date of Patent: June 7, 2016
    Assignee: Xerox Corporation
    Inventors: Richard L. Howe, Eric M. Gross, Dennis L. Venable
  • Patent number: 9348848
    Abstract: A method for identifying a table in a digital file includes extracting lines from a layout of the digital file, wherein the lines comprise horizontal lines and vertical lines. The method also includes identifying intersected line groups, wherein each intersected line group comprises a horizontal line of the extracted horizontal lines and a vertical line of the extracted vertical lines, the horizontal line and the vertical line intersecting with each other. The method further includes determining whether the number of intersected lines in each intersected line group is larger than a first threshold. If yes, the method further includes identifying an area in which the intersected line groups are located as a table area. If no, the method further includes performing vertical projection on characters in the area, and identifying the area as a table area based on results of the vertical projection.
    Type: Grant
    Filed: April 26, 2013
    Date of Patent: May 24, 2016
    Assignees: Peking University Founder Group Co., Ltd., Beijing Founder Apabi Technology Ltd.
    Inventors: Ning Dong, Wenjuan Huang
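The line-grouping step in the abstract of 9348848 can be sketched as follows. The sketch assumes lines are already extracted as axis-aligned segments and uses a union-find to merge intersecting lines into groups. The segment encoding, the names `intersects` and `table_areas`, and the default threshold are assumptions. The fallback branch (vertical projection on characters) is not covered here.

```python
def intersects(h, v):
    """h = (y, x1, x2) is a horizontal segment; v = (x, y1, y2) is a vertical one."""
    hy, hx1, hx2 = h
    vx, vy1, vy2 = v
    return hx1 <= vx <= hx2 and vy1 <= hy <= vy2

def table_areas(horizontals, verticals, threshold=4):
    """Bounding boxes (x_min, y_min, x_max, y_max) of groups with more than threshold lines."""
    lines = [("h", h) for h in horizontals] + [("v", v) for v in verticals]
    parent = list(range(len(lines)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    offset = len(horizontals)
    for i, h in enumerate(horizontals):
        for j, v in enumerate(verticals):
            if intersects(h, v):
                parent[find(i)] = find(offset + j)

    groups = {}
    for idx, line in enumerate(lines):
        groups.setdefault(find(idx), []).append(line)

    areas = []
    for members in groups.values():
        if len(members) <= threshold:
            continue  # too few intersecting lines to call this a table
        xs, ys = [], []
        for kind, (a, b, c) in members:
            if kind == "h":          # (y, x1, x2)
                ys.append(a)
                xs += [b, c]
            else:                    # (x, y1, y2)
                xs.append(a)
                ys += [b, c]
        areas.append((min(xs), min(ys), max(xs), max(ys)))
    return areas

# A 2x2 grid (3 horizontal + 3 vertical rules) plus one isolated rule lower on the page.
horizontals = [(0, 0, 30), (10, 0, 30), (20, 0, 30), (50, 0, 30)]
verticals = [(0, 0, 20), (15, 0, 20), (30, 0, 20)]
areas = table_areas(horizontals, verticals)
print(areas)  # [(0, 0, 30, 20)]
```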
  • Patent number: 9304681
    Abstract: Software, firmware, and systems are described for identifying characters in a handwritten input received from a user on an input device, irrespective of an angle that the input is received at. In one implementation, the system establishes an anchor point and distances from the anchor point to reference support lines. A set of candidate characters is identified based on received handwritten input. The system estimates support lines for each of the candidate characters. The system ranks the candidate characters based on a total deviation measurement from the expectation for each candidate, where the expectation in part is based on the established distance from the established anchor point to reference support lines, and identifies a best-ranked candidate based at least in part on a smallest total deviation measurement.
    Type: Grant
    Filed: May 1, 2015
    Date of Patent: April 5, 2016
    Assignee: Nuance Communications, Inc.
    Inventors: Jonas Andersson, Lars Jonas Morwing