Segmenting Individual Characters Or Words Patents (Class 382/177)
  • Patent number: 9288039
    Abstract: A system and method for text language identification allow private information of a server and a client to be kept secret from each other. An encrypted score for each of a plurality of languages is received by the server from the client. The encrypted scores are generated by homomorphic addition of encrypted frequencies of n-grams in a list of n-grams extracted from text. The unencrypted list is not provided to the server. The encrypted frequencies of the n-grams in the list are extracted using encrypted resources which, for each of the plurality of languages, include an encrypted frequency for each of a set of n-grams. At the server, the encrypted scores are decrypted to generate unencrypted scores and information is provided to the client based on the unencrypted scores from which the client is able to identify a language for the text.
    Type: Grant
    Filed: December 1, 2014
    Date of Patent: March 15, 2016
    Assignee: XEROX CORPORATION
    Inventors: Nicolas Monet, Johan Clier
  • Patent number: 9256592
    Abstract: The positioning of elements of a broken word can be corrected by receiving an optical character recognition (OCR) conversion of a printed publication and identifying multiple parts of the broken word from the OCR conversion to place in a graphical user interface (GUI). The multiple parts can be placed in the GUI using original positioning data for the printed publication. A user can make a selection in the GUI indicating that multiple parts from the OCR conversion are of the broken word and can automatically adjust bounds of the multiple parts to form a corrected word.
    Type: Grant
    Filed: November 7, 2012
    Date of Patent: February 9, 2016
    Assignee: Amazon Technologies, Inc.
    Inventors: Satishkumar Kothandapani Shanmugasundaram, Shubham Chandra Gupta, Arpita Agrawal
  • Patent number: 9250802
    Abstract: According to an embodiment, a shaping device includes an acquiring unit, an extracting unit, first and second calculators, a determining unit, a shaping unit, and a display unit. The acquiring unit is configured to acquire strokes handwritten by a user. The extracting unit is configured to extract multiple combinations of strokes. The first calculator is configured to calculate a first likelihood representing a probability that each combination relates to a target graphic. The second calculator is configured to calculate a second likelihood representing a probability that each combination relates to an incomplete shape. The determining unit is configured to determine whether there is a first combination having the first likelihood not less than a first threshold and the second likelihood not more than a second threshold. The shaping unit is configured to shape the strokes into the target graphic. The display unit is configured to display a result of shaping.
    Type: Grant
    Filed: March 5, 2014
    Date of Patent: February 2, 2016
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Shihomi Takahashi, Tomoyuki Shibata, Kazunori Imoto, Yasunobu Yamauchi
  • Patent number: 9251143
    Abstract: Converting technical data from field oriented electronic data sources into natural language form is disclosed. An approach includes obtaining document data from an input document, wherein the document data is in a non-natural language form. The approach includes determining a data type of the document data from one of a plurality of data types defined in a detection and conversion database. The approach includes translating the document data to a natural language form based on the determined data type. The approach additionally includes outputting the translated document data in natural language form to an output data stream.
    Type: Grant
    Filed: January 13, 2012
    Date of Patent: February 2, 2016
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: John J. Bird, Doyle J. McCoy
  • Patent number: 9245051
    Abstract: An approach is provided for conducting a search based on an extraction of a search term from available sensor data. The approach involves determining sensor data associated with at least one device, the sensor data determined from among a plurality of available data modes. The approach also involves processing and/or facilitating a processing of the sensor data to cause, at least in part, an extraction of one or more search terms for at least one query. The approach further involves determining one or more results of the at least one query based, at least in part, on context information associated with the at least one device, user profile information associated with the at least one device, or a combination thereof.
    Type: Grant
    Filed: September 20, 2011
    Date of Patent: January 26, 2016
    Assignee: NOKIA TECHNOLOGIES OY
    Inventors: Aaron Licata, Adetokunbo Bamidele, Mark Travis Fulks
  • Patent number: 9230514
    Abstract: Methods and systems for rendering text to simulate human penmanship are described. A text rendering engine converts a text string into an image that can be displayed on a string using one or more seed numbers to influence the rendering and appearance of the text. The text rendering engine may render each character of the text string using a size, weight, slope, or Bezier curve control point selected based on the seed numbers.
    Type: Grant
    Filed: June 20, 2012
    Date of Patent: January 5, 2016
    Assignee: Amazon Technologies, Inc.
    Inventors: Michael Patrick Bacus, Shawn C. Deyell, Hong Chen
  • Patent number: 9215260
    Abstract: A system and method for a live streaming platform that can redundantly process input streams in parallel ingestion pipelines is disclosed herein. Ingested input streams in the parallel pipelines can be segmented using a stable segmentation function that creates identical segments in each of the streams in the pipelines. If errors occur, or there are disruptions in one or more of the input streams or pipelines, the live streaming platform can switch between the input streams on a per segment basis to provide reliable streaming feeds to a content distribution network. A master stream can be constructed from each of the master segments per a time period based on a reliability of each of the input streams and segments. Practicing pipeline affinity by selecting subsequent master segments from the same pipeline can minimize glitches.
    Type: Grant
    Filed: August 18, 2014
    Date of Patent: December 15, 2015
    Assignee: Google Inc.
    Inventors: Francisco Manuel Galanes, Vijnan Shastri, Pawel Jurczyk
  • Patent number: 9148472
    Abstract: In one embodiment, there is provided a server. The server includes: a receiver configured to receive, from an electronic device, an instruction signal instructing the server to detect a name of a first person, positional information representing a position of the electronic device and an image data of the first person; an extractor configured to extract electronic devices existing within a certain range from the electronic device based on the positional information, wherein positional information of the electronic devices are stored in the server; a detector configured to detect the first person by performing face detection on persons who are associated with the extracted electronic devices using the image data of the first person; and an output module configured to output the name of the first person.
    Type: Grant
    Filed: February 8, 2013
    Date of Patent: September 29, 2015
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Eita Shuto
  • Patent number: 9116890
    Abstract: A system for processing text captured from rendered documents is described. The system receives a sequence of one or more words optically or acoustically captured from a rendered document by a user. The system identifies among words of the sequence a word with which an action has been associated. The system then performs the associated action with respect to the user.
    Type: Grant
    Filed: June 11, 2014
    Date of Patent: August 25, 2015
    Assignee: Google Inc.
    Inventors: Martin T. King, Dale L. Grover, Clifford A. Kushler, James Q. Stafford-Fraser
  • Patent number: 9110904
    Abstract: A method and devices for obtaining metadata associated with programs from multiple metadata sources. Title metadata is compared to determine if any title metadata match. Transformation rules are applied to the title metadata when title metadata does not match with other title metadata. The transformation rules transform the title metadata into a common format. The title metadata is compared after transformation to determine whether the title metadata matches other title metadata. When title metadata matches other title metadata, the metadata associated with a program is aggregated.
    Type: Grant
    Filed: September 21, 2011
    Date of Patent: August 18, 2015
    Assignee: VERIZON PATENT AND LICENSING INC.
    Inventors: Zhiying Jin, Haiyan Zhou, Xuefeng Yao
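The matching flow in the abstract above (compare titles, transform the ones that do not match into a common format, compare again, then aggregate) can be sketched roughly as follows. This is only an illustration: the abstract does not list the transformation rules, so the rules in `normalize_title`, the record layout, and all function names here are assumptions. Because identical titles normalize to the same key, a single normalization pass covers both the direct comparison and the comparison after transformation.

```python
import re


def normalize_title(title: str) -> str:
    """Transform a title into a common format.

    The rules below are hypothetical examples; the patent abstract does not
    say which transformation rules are actually used.
    """
    t = title.lower().strip()
    t = re.sub(r"[^\w\s]", "", t)        # drop punctuation
    t = re.sub(r"\s+", " ", t)           # collapse runs of whitespace
    t = re.sub(r"^(the|a|an) ", "", t)   # drop a leading article
    return t


def aggregate(records):
    """Group metadata records whose titles match after normalization."""
    groups = {}
    for rec in records:
        groups.setdefault(normalize_title(rec["title"]), []).append(rec)
    return list(groups.values())


records = [
    {"title": "The Office", "source": "A"},
    {"title": "the  office!", "source": "B"},
    {"title": "Lost", "source": "A"},
]
groups = aggregate(records)
```

Here the first two records fall into one group and the third stays on its own.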
  • Publication number: 20150146982
    Abstract: A method and an electronic device are provided for obtaining an image or a video frame, including applying to the image or the video frame, at least one image processing technique, scanning the image or the video frame, to identify a text item, determining an item type for the identified text item, and determining an action, corresponding to the item type.
    Type: Application
    Filed: November 26, 2013
    Publication date: May 28, 2015
    Applicant: BLACKBERRY LIMITED
    Inventors: Wade TSAI, David Michael SUTTON, Jean-Francois DESGAGNES, Ryan William JOSAL
  • Patent number: 9031831
    Abstract: Embodiments of the present invention disclose a dictionary lookup method and an electronic device that implements the dictionary lookup method. The dictionary lookup method allows a user to quickly obtain meanings and translations of words from electronic dictionaries while reading a text on a display screen of the electronic device, wherein reading text is utilized by performing an optical character recognition comprising of determining a set of base forms of each inflected recognized word. Advantageously, in one embodiment the meanings (e.g., the base forms) and translations may be displayed in a balloon, in a pop-up window, as subscript, as superscript, or in any other suitable manner when the user touches a word on the display screen, in one embodiment.
    Type: Grant
    Filed: January 14, 2011
    Date of Patent: May 12, 2015
    Assignee: ABBYY Development LLC
    Inventor: Dmitry Levchenko
  • Patent number: 9020262
    Abstract: The present disclosure includes a system and method for symbol compression using conditional entropy estimation. One method for symbol compression using conditional entropy estimation includes approximating a quantity of symbol encoding bits for a number of symbols using a conditional entropy estimation. Dictionary entries are generated from the number of symbols so as to minimize a total bit-stream quantity. The total bit-stream quantity includes at least the approximated quantity of symbol encoding bits and a quantity of dictionary entries encoding bits. The symbols are encoded using the dictionary entries as a reference.
    Type: Grant
    Filed: July 31, 2012
    Date of Patent: April 28, 2015
    Assignees: Hewlett-Packard Development Company, L.P., Purdue Research Foundation
    Inventors: Dejan Depalov, Peter Bauer, Charles A. Bouman, Jan Allebach, Yandong Guo
  • Patent number: 9020265
    Abstract: A system and method is provided for automatically recognizing building numbers in street level images. In one aspect, a processor selects a street level image that is likely to be near an address of interest. The processor identifies those portions of the image that are visually similar to street numbers, and then extracts the numeric values of the characters displayed in such portions. If an extracted value corresponds with the building number of the address of interest such as being substantially equal to the address of interest, the extracted value and the image portion are displayed to a human operator. The human operator confirms, by looking at the image portion, whether the image portion appears to be a building number that matches the extracted value. If so, the processor stores a value that associates that building number with the street level image.
    Type: Grant
    Filed: June 5, 2014
    Date of Patent: April 28, 2015
    Assignee: Google Inc.
    Inventors: Bo Wu, Alessandro Bissacco, Raymond W. Smith, Kong Man Cheung, Andrea Frome, Shlomo Urbach
  • Patent number: 9014477
    Abstract: A method and apparatus for automatically identifying character segments for character recognition is provided. The method involves receiving a plurality of words and a ground truth corresponding to each word of the plurality of words. The plurality of words may be received in a cursive script. Each word of the plurality of words is segmented into one or more character segments based on the ground truth corresponding to each word. Thereafter, the segmentation of each word is refined by iteratively re-segmenting each word based on one or more similar character segments.
    Type: Grant
    Filed: October 27, 2011
    Date of Patent: April 21, 2015
    Assignee: King Abdulaziz City for Science and Technology (KACST)
    Inventors: Ahmad Abdulkader, Hussein Khalid Al-Omari, Mohammad Sulaiman Khorsheed
  • Patent number: 9014478
    Abstract: A method and apparatus for determining a reading order of characters. The method includes preparing a list of character information, which is character information extracted from image data by character recognition processing and preparing a list of line information, which is made up of a line box surrounding a set of characters which are continuously aligned in the same direction in image data and an alignment direction of characters in the line box. In response to a request for adding character information to the list of character information, extracting a line box containing a character region of the character to be added, obtaining all character information having the character region contained in the concerned line box from the list of character information and rearranging according to the position with respect to the alignment direction of characters corresponding to the line box to determine a new reading order of characters.
    Type: Grant
    Filed: August 30, 2012
    Date of Patent: April 21, 2015
    Assignee: International Business Machines Corporation
    Inventors: Toshinari Itoko, Daisuke Sato
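As a rough illustration of the reading-order update described in the abstract above: when a character is added, find the line box that contains its region, collect every character inside that box, and sort them along the box's alignment direction. The `Char` and `LineBox` types, their field names, and the axis-aligned containment test are assumptions made for this sketch, not details taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class Char:
    text: str
    x: float
    y: float
    w: float
    h: float


@dataclass
class LineBox:
    x: float
    y: float
    w: float
    h: float
    direction: str  # "horizontal" or "vertical" (assumed encoding)

    def contains(self, c: Char) -> bool:
        # True when the character region lies fully inside this line box.
        return (self.x <= c.x and c.x + c.w <= self.x + self.w
                and self.y <= c.y and c.y + c.h <= self.y + self.h)


def add_character(chars, lines, new_char):
    """Add new_char and return the new reading order for its line, or None."""
    chars.append(new_char)
    line = next((ln for ln in lines if ln.contains(new_char)), None)
    if line is None:
        return None  # no line box contains the new character
    in_line = [c for c in chars if line.contains(c)]
    if line.direction == "horizontal":
        return sorted(in_line, key=lambda c: c.x)
    return sorted(in_line, key=lambda c: c.y)


chars = [Char("a", 0, 0, 10, 10), Char("c", 20, 0, 10, 10)]
lines = [LineBox(0, 0, 100, 10, "horizontal")]
order = add_character(chars, lines, Char("b", 10, 0, 10, 10))
```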
  • Patent number: 9008425
    Abstract: A method of detection of numbered captions in a document includes receiving a document including a sequence of document pages and identifying illustrations on pages of the document. For each identified illustration, associated text is identified. An imitation page is generated for each of the identified illustrations, each imitation page comprising a single illustration and its associated text. For a sequence of the imitation pages, a sequence of terms is identified. Each term is derived from a text fragment of the associated text of a respective imitation page. The terms of a sequence comply with at least one predefined numbering scheme which defines a form and an incremental state of the terms in a sequence. The terms of the identified sequence of terms are construed as being at least a part of a numbered caption for a respective illustration in the document.
    Type: Grant
    Filed: January 29, 2013
    Date of Patent: April 14, 2015
    Assignee: Xerox Corporation
    Inventors: Herve Dejean, Jean-Luc Meunier
  • Patent number: 9008428
    Abstract: Machines, systems and methods for character recognition disambiguation are provided. The method comprises selecting a first set of characters that match a first visual profile based on results of a character recognition process applied to target content; selecting a subset of the first set based on criteria associated with at least one of confidence level with which characters grouped in the subset are recognized or fragmentation associated with the characters grouped in the subset; and disambiguating recognition results for the characters grouped in the subset by displaying the characters along with context information, wherein reviewing two or more of the characters on a display screen along with context information associated with said two or more characters allows a human operator to select one or more suspect characters from among the two or more characters.
    Type: Grant
    Filed: January 28, 2013
    Date of Patent: April 14, 2015
    Assignee: International Business Machines Corporation
    Inventors: Ella Barkan, Itoko Toshinari, Asaf Tzadok
  • Patent number: 8995768
    Abstract: A method for processing data of a scanned book having a plurality of pages is disclosed. The method includes obtaining page image data from a page. The method further includes segmenting and recognizing the page image data to obtain locations of rectangular boxes corresponding to the respective characters and text codes for the respective characters. The method also includes obtaining respective aggregated character line information for each line of characters. The method further includes adjusting the rectangular boxes in accordance with the obtained aggregated character line information.
    Type: Grant
    Filed: December 28, 2012
    Date of Patent: March 31, 2015
    Assignees: Peking University Founder Group Co., Ltd., Beijing Founder Apabi Technology Ltd.
    Inventors: Ruiheng Qiu, Yun Li
  • Patent number: 8995780
    Abstract: A method for creating a binary mask image from an inputted digital image of a scanned document, including the steps of creating a binarized image by binarizing the inputted digital image, detecting first text regions representing light text on a dark background, and inverting the first text regions, such that the inverted first text regions are interpretable in the same way as dark text on a light background. A method for comparing in a binary image a first pixel blob with a second pixel blob to determine whether they represent matching symbols, including the steps of detecting a line in one blob not present in the other and/or determining if one of the blobs represents an italicized symbol where the other does not.
    Type: Grant
    Filed: December 23, 2013
    Date of Patent: March 31, 2015
    Assignee: I.R.I.S.
    Inventors: Michel Dauw, Pierre De Muelenaere
  • Patent number: 8989494
    Abstract: A method and apparatus for determining a reading order of characters. The method includes preparing a list of character information, which is character information extracted from image data by character recognition processing and preparing a list of line information, which is made up of a line box surrounding a set of characters which are continuously aligned in the same direction in image data and an alignment direction of characters in the line box. In response to a request for adding character information to the list of character information, extracting a line box containing a character region of the character to be added, obtaining all character information having the character region contained in the concerned line box from the list of character information and rearranging according to the position with respect to the alignment direction of characters corresponding to the line box to determine a new reading order of characters.
    Type: Grant
    Filed: June 5, 2012
    Date of Patent: March 24, 2015
    Assignee: International Business Machines Corporation
    Inventors: Toshinari Itoko, Daisuke Sato
  • Publication number: 20150071542
    Abstract: In embodiments, one or more computer-readable media may have instructions stored thereon which, when executed by a processor of a computing device provide the computing device with a redaction module. The redaction module may be configured to receive a request to redact a selection of text from a document and identify instances of the text occurring within the document through an analysis of word coordinate information of an image of the document. The redaction module may further be configured to generate redaction information, including redaction coordinates, the redaction coordinates may be based on the word coordinate information associated with respective instances of the text occurring within the document. The redactions, when applied to the image in accordance with the redaction coordinates, may redact the respective instances of the text. Other embodiments may be described and/or claimed.
    Type: Application
    Filed: September 6, 2013
    Publication date: March 12, 2015
    Applicant: Lighthouse Document Technologies, Inc. (d/b/a Lighthouse eDiscovery)
    Inventors: Christopher Byron Dahl, Debora Noemi Motyka Jones, Kevin Patrick O'Neill, Geoffrey Alan David Belger, Vladas Walter Mazelis, Nathaniel Byington, Beau Hodges Holt, John Charles Olson
  • Patent number: 8977054
    Abstract: Candidate identification utilizing fingerprint identification is disclosed.
    Type: Grant
    Filed: August 6, 2014
    Date of Patent: March 10, 2015
    Assignee: DST Technologies, Inc.
    Inventor: Joshua O. Highley
  • Patent number: 8977072
    Abstract: Various embodiments of the present invention relate to a method, system and computer program product for detecting and recognizing text in the images captured by cameras and scanners. First, a series of image-processing techniques is applied to detect text regions in the image. Subsequently, the detected text regions pass through different processing stages that reduce blurring and the negative effects of variable lighting. This results in the creation of multiple images that are versions of the same text region. Some of these multiple versions are sent to a character-recognition system. The resulting texts from each of the versions of the image sent to the character-recognition system are then combined to a single result, wherein the single result is detected text.
    Type: Grant
    Filed: December 13, 2012
    Date of Patent: March 10, 2015
    Assignee: A9.com, Inc.
    Inventors: Raghavan Manmatha, Mark A. Ruzon
  • Patent number: 8965126
    Abstract: A character recognition device includes image input unit that receives an image, character region detection unit that detects a character region in the image, character region separation unit that separates the character region on a character-by-character basis, character recognition unit that performs character-by-character recognition on the characters present in separated regions and outputs one or more character recognition result candidates for each character, first character string transition data creation unit that receives the candidates, calculates weights for transitions to the candidates and creates first character string transition data based on a set of the candidates and the weights, and WFST processing unit that sequentially performs state transitions based on the first character string transition data, accumulates weights in each state transition and calculates a cumulative weight for each state transition, and outputs one or more state transition results based on the cumulative weight.
    Type: Grant
    Filed: February 24, 2012
    Date of Patent: February 24, 2015
    Assignee: NTT DOCOMO, INC.
    Inventors: Takafumi Yamazoe, Minoru Etoh, Takeshi Yoshimura, Kosuke Tsujino
  • Patent number: 8965112
    Abstract: Systems and methods for sequence transcription with neural networks are provided. More particularly, a neural network can be implemented to map a plurality of training images received by the neural network into a probabilistic model of sequences comprising P(S|X) by maximizing log P(S|X) on the plurality of training images. X represents an input image and S represents an output sequence of characters for the input image. The trained neural network can process a received image containing characters associated with building numbers. The trained neural network can generate a predicted sequence of characters by processing the received image.
    Type: Grant
    Filed: December 17, 2013
    Date of Patent: February 24, 2015
    Assignee: Google Inc.
    Inventors: Julian Ibarz, Yaroslav Bulatov, Ian Goodfellow
  • Patent number: 8965127
    Abstract: A word segmentation method for processing a document image applies clustering analysis to the spacing segments of a line. The spacing segments are generated by thresholding a one-dimensional vertical projection profile of the line. Taking advantage of the bimodal distribution of spacing length distribution of text lines, a k-means clustering algorithm is used, with the number of clusters pre-set to two, to classify the spacing segments as either character spacing or word spacing. Moreover, k-means++ initialization is used to enhance performance of cluster analysis. The clustering result such as cluster centers and compactness is used to prune single-word text line, single table item, etc. The locations of the word spacing segments are then used to segment the line of text into words.
    Type: Grant
    Filed: March 14, 2013
    Date of Patent: February 24, 2015
    Assignee: Konica Minolta Laboratory U.S.A., Inc.
    Inventors: Chaohong Wu, Wei Ming
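The word segmentation idea in the abstract above can be sketched in a few lines: compute a vertical projection profile of a binarized text line, threshold it to find spacing segments, cluster the spacing lengths into two groups, and split the line at the gaps that fall in the wider cluster. This is a minimal sketch under stated assumptions: the line is a list of pixel rows with 1 for ink, the two cluster centers are seeded with the minimum and maximum gap length rather than with k-means++ as the abstract describes, and the pruning step based on cluster compactness is omitted. All function names are hypothetical.

```python
def column_profile(line):
    """Vertical projection profile: ink count per pixel column."""
    return [sum(col) for col in zip(*line)]


def spacing_segments(profile, threshold=0):
    """Interior (start, end) runs of columns with ink count <= threshold."""
    gaps, start = [], None
    for i, v in enumerate(profile):
        if v <= threshold:
            if start is None:
                start = i
        elif start is not None:
            gaps.append((start, i))
            start = None
    # A run touching the left edge is margin; a trailing run is never closed.
    return [g for g in gaps if g[0] > 0]


def two_means(values, iters=20):
    """1-D k-means with k=2, seeded with min and max (not k-means++)."""
    lo, hi = float(min(values)), float(max(values))
    for _ in range(iters):
        small = [v for v in values if abs(v - lo) <= abs(v - hi)]
        large = [v for v in values if abs(v - lo) > abs(v - hi)]
        if not large:
            break
        lo, hi = sum(small) / len(small), sum(large) / len(large)
    return lo, hi


def segment_words(line):
    """Return (start, end) column ranges, one per detected word."""
    profile = column_profile(line)
    ink = [i for i, v in enumerate(profile) if v > 0]
    if not ink:
        return []
    gaps = spacing_segments(profile)
    if not gaps:
        return [(ink[0], ink[-1] + 1)]
    lengths = [e - s for s, e in gaps]
    lo, hi = two_means(lengths)
    words, start = [], ink[0]
    for (s, e), n in zip(gaps, lengths):
        if abs(n - hi) < abs(n - lo):  # gap belongs to the word-spacing cluster
            words.append((start, s))
            start = e
    words.append((start, ink[-1] + 1))
    return words


# One-row toy line: two "words", each made of two 2-pixel "characters".
row = [1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1]
words = segment_words([row])
```

When every gap has the same length the two centers coincide, no gap is classified as word spacing, and the line comes back as a single word, which loosely mirrors the single-word case the abstract mentions.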
  • Patent number: 8965123
    Abstract: A system and a method for identification of alphanumeric characters present in a series in an image are disclosed. The system and method captures the image and further processes it for binarization by computing a pattern of the image. The generated binarized images are then filtered for removing unwanted components. Candidate images are identified out of the filtered binarized images. All the obtained candidate images are combined to generate a final candidate image which is further segmented in order to recognize a valid alphanumeric character present in the series.
    Type: Grant
    Filed: March 25, 2013
    Date of Patent: February 24, 2015
    Assignee: Tata Consultancy Services Limited
    Inventors: Tanushyam Chattopadhyay, Ujjwal Bhattacharya, Bidyut Baran Chaudhuri
  • Patent number: 8958643
    Abstract: Recognition of numerical characters is disclosed, including: extracting a subimage from a received image comprising information pertaining to a plurality of numerical characters, wherein the extracted subimage is associated with one of the plurality of numerical characters; and performing recognition based at least in part on a set of topological information associated with the subimage, including: processing the subimage to obtain the set of topological information associated with the subimage; comparing the set of topological information associated with the subimage with a preset set of stored topological information; determining that in the event that the set of topological information associated with the subimage matches the preset set of stored topological information, the subimage is associated with a recognized numerical character associated with the preset set of stored topological information.
    Type: Grant
    Filed: June 11, 2014
    Date of Patent: February 17, 2015
    Assignee: Alibaba Group Holding Limited
    Inventor: Xiang Sun
  • Patent number: 8953886
    Abstract: Character recognition is described. In one embodiment, it may use matched sequences rather than character shape to determine a computer-legible result.
    Type: Grant
    Filed: August 8, 2013
    Date of Patent: February 10, 2015
    Assignee: Google Inc.
    Inventors: Martin T. King, Dale L. Grover, Clifford A. Kushler, James Quentin Stafford-Fraser
  • Patent number: 8953885
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing optical character recognition. In one aspect, a method includes receiving a text image I. A set of feature functions are evaluated for a log linear model to determine respective feature values for the text image I, wherein each feature function h_i maps the text image I to a feature value, and wherein each feature function h_i is associated with a respective feature weight λ_i. A transcription T̂ is determined that minimizes a cost of the log linear model.
    Type: Grant
    Filed: September 14, 2012
    Date of Patent: February 10, 2015
    Assignee: Google Inc.
    Inventors: Franz Josef Och, Ashok Chhabedia Popat, Dmitriy Genzel, Michael E. Jahr
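To make the log-linear selection in the abstract above concrete, the sketch below scores each candidate transcription by a weighted sum of feature values and picks the one with the lowest cost. Several things here are assumptions: the cost is taken to be the negated weighted sum, the feature functions are given the candidate transcription as a second argument, and the two toy features (a recognizer score lookup and a lexicon check) are invented for illustration.

```python
def cost(weights, features, image, transcription):
    """Assumed cost: negated weighted sum of feature values."""
    return -sum(w * h(image, transcription) for w, h in zip(weights, features))


def best_transcription(weights, features, image, candidates):
    """Pick the candidate transcription with the lowest cost."""
    return min(candidates, key=lambda t: cost(weights, features, image, t))


# Toy stand-ins: the "image" is just a table of hypothetical recognizer scores.
image = {"cat": 0.6, "cot": 0.7}
lexicon = {"cat"}

features = [
    lambda img, t: img.get(t, 0.0),              # recognizer score feature
    lambda img, t: 1.0 if t in lexicon else 0.0,  # lexicon membership feature
]
weights = [1.0, 0.5]

choice = best_transcription(weights, features, image, ["cat", "cot"])
```

With these weights the lexicon feature outweighs the slightly higher recognizer score for "cot", so "cat" is selected.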
  • Patent number: 8948511
    Abstract: An automated document processing system is configured to normalize zones obtained from a document, and to extract articles from the normalized zones. In one configuration, the system receives at least one zone from the document, and applies at least one zone-breaking factor, thereby creating normalized sub-zones within which text lines are consistent with the at least one zone-breaking factor. The normalized sub-zones may be evaluated to obtain a reading order. Adjacent sub-zones are joined if text similarity exceeds a threshold value. Weakly joined sub-zones are separated where indicated by a topic vectors analysis of the weakly joined sub-zones.
    Type: Grant
    Filed: October 19, 2005
    Date of Patent: February 3, 2015
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Daniel Ortega, Sherif Yacoub, Jose Abad Peiro, Paolo Faraboschi
  • Patent number: 8941864
    Abstract: Disclosed is an image processing apparatus which (i) determines whether or not characters to be subjected to a character recognition process in image data have a size larger than a predetermined size, (ii) in a case where the characters are determined to be larger than the predetermined size, reduces at least a region including the characters so that the size of the characters fits within the predetermined size, and (iii) performs a character recognition process of the characters with use of the reduced image data.
    Type: Grant
    Filed: November 9, 2011
    Date of Patent: January 27, 2015
    Assignee: Sharp Kabushiki Kaisha
    Inventors: Hitoshi Hirohata, Akihito Yoshida, Atsuhisa Morimoto, Yohsuke Konishi
  • Patent number: 8923618
    Abstract: An expression, for which complementary information can be outputted, is extracted from a document obtained by character recognition for an image. Complementary information related to the extracted expression is outputted when a character or a symbol adjacent to the beginning or the end of the extracted expression is not a predetermined character or symbol. Output of complementary information related to the extracted expression is skipped when the character or symbol adjacent to the beginning or the end of the extracted expression is the predetermined character or symbol. A problem that complementary information unrelated to an original text is outputted is prevented even when a false character recognition occurs.
    Type: Grant
    Filed: September 14, 2012
    Date of Patent: December 30, 2014
    Assignee: Sharp Kabushiki Kaisha
    Inventor: Takeshi Kutsumi
  • Patent number: 8917276
    Abstract: A graphics or image rendering system, such as a map image rendering system, receives image data from an image database in the form of vector data that defines various image objects, such as roads, geographical boundaries, etc., and textures defining text strings to be displayed on the image to provide, for example, labels for the image objects. The imaging rendering system renders the images such that the individual characters of the text strings are placed on the image following a multi-segmented or curved line. This rendering system enables text strings to be placed on a map image so that the text follows the center line of a curved or angled road or other image feature without knowing the specifics of the curvature of the line along which the text will be placed when creating the texture that stores the text string information.
    Type: Grant
    Filed: March 19, 2013
    Date of Patent: December 23, 2014
    Assignee: Google Inc.
    Inventor: Brian Cornell
  • Patent number: 8914278
    Abstract: A computer-assisted language correction system including spelling correction functionality, misused word correction functionality, grammar correction functionality and vocabulary enhancement functionality utilizing contextual feature-sequence functionality employing an internet corpus.
    Type: Grant
    Filed: July 31, 2008
    Date of Patent: December 16, 2014
    Assignee: Ginger Software, Inc.
    Inventors: Yael Karov Zangvil, Avner Zangvil
  • Patent number: 8913833
    Abstract: An image processing apparatus includes: an extraction unit that extracts a first image and a second image similar to the first image, in a first resolution; and a generation unit that generates an image in a second resolution based on the respective images extracted by the extraction unit and phases of the respective images calculated with precision higher than one pixel in the first resolution.
    Type: Grant
    Filed: May 2, 2007
    Date of Patent: December 16, 2014
    Assignee: Fuji Xerox Co., Ltd.
    Inventors: Yutaka Koshi, Shunichi Kimura, Ikken So, Masanori Sekino
  • Patent number: 8908971
    Abstract: Methods, devices and systems are described for transcribing text from artifacts to electronic files. A computer system is provided, wherein the computer system comprises a computer-readable storage device. An image of the artifact is received wherein text is present on the artifact. A first portion of the text is analyzed. Characters representing the first portion of the text are identified at a first confidence level equal to or greater than a threshold confidence level. The characters representing the first portion of the text are stored. A second portion of the text appearing on the artifact is analyzed. A plurality of candidates to represent the second portion of the text are identified at a second confidence level below the threshold confidence level. Finally, the plurality of candidates are presented to a user for selection.
    Type: Grant
    Filed: September 25, 2013
    Date of Patent: December 9, 2014
    Assignee: Ancestry.com Operations Inc.
    Inventor: Lee Samuel Jensen
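    Illustration (not from the patent): the threshold rule in this abstract amounts to storing high-confidence readings and queueing the rest for a user. A minimal sketch, assuming each text portion arrives as a list of (text, confidence) candidates; the function name, data layout, and the 0.9 default are hypothetical.

```python
def triage_portions(portions, threshold=0.9):
    """Split recognized portions into stored readings and candidates for user review.

    portions: iterable of (portion_id, [(text, confidence), ...]).
    The 0.9 default threshold is an assumed value, not taken from the patent.
    """
    stored, review = [], []
    for portion_id, candidates in portions:
        # Rank candidates from most to least confident.
        ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
        if ranked and ranked[0][1] >= threshold:
            stored.append((portion_id, ranked[0][0]))
        else:
            # Below the threshold: keep every candidate so a user can choose.
            review.append((portion_id, [text for text, _ in ranked]))
    return stored, review
```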
  • Patent number: 8910308
    Abstract: Systems and methods are provided for challenge/response animation. In one implementation, a request for protected content may be received from a client, and the protected content may comprise data. A challenge phrase comprising a plurality of characters may be determined, and a computer processor may divide the challenge phrase into at least two character subsets selected from the characters comprising the challenge phrase. Each of the at least two character subsets may include less than all of the characters comprising the challenge phrase. The at least two character subsets may be sent to the client in response to the request; and an answer to the challenge phrase may be received from the client in response to the at least two character subsets. Access to the protected content may be limited based on whether the answer correctly solves the challenge phrase.
    Type: Grant
    Filed: August 19, 2013
    Date of Patent: December 9, 2014
    Assignee: AOL Inc.
    Inventor: Scott Dorfman
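    Illustration (not from the patent): one way to divide a challenge phrase into subsets that each hold fewer than all of its characters is to deal the characters out by position. The round-robin scheme, the blank placeholder, and the function name are assumptions; the abstract does not specify how the subsets are chosen.

```python
def split_challenge(phrase, parts=2):
    """Deal the characters of phrase into `parts` frames, keeping their positions.

    Each frame shows only the characters assigned to it; a blank marks a
    character that is held back for another frame.
    """
    if parts < 2 or len(phrase) < parts:
        raise ValueError("need at least two parts and one character per part")
    frames = []
    for k in range(parts):
        frame = "".join(ch if i % parts == k else " " for i, ch in enumerate(phrase))
        frames.append(frame)
    return frames
```

    For example, split_challenge("K7QX") yields the frames "K Q " and " 7 X"; neither frame alone contains the whole phrase, and a client answer would then be compared against the original phrase.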
  • Patent number: 8903131
    Abstract: Information display equipment that can display translated words and/or translation information in real time. The information display equipment relates to a camera dictionary that can perform dictionary display in real time. In addition, this equipment distinguishes characters included in an object photographed by a photographing portion. Then this equipment extracts information corresponding to these characters from a dictionary. Examples of the information corresponding to the characters are translated words or illustrative examples for a certain term. Then a display portion displays the information corresponding to the characters.
    Type: Grant
    Filed: December 7, 2010
    Date of Patent: December 2, 2014
    Assignee: Kabushiki Kaisha Square Enix
    Inventor: Ryota Aomi
  • Patent number: 8903173
    Abstract: An image of a rectangular target is resolved. First and second dimensions for the rectangular target are determined from an initial image. A cropped and de-skewed final image for the rectangular target is produced responsive to the first and second dimensions.
    Type: Grant
    Filed: December 21, 2011
    Date of Patent: December 2, 2014
    Assignee: NCR Corporation
    Inventor: Jeffrey S. Cooper
  • Patent number: 8891871
    Abstract: One system to which the present invention is applied obtains the digitized form image of a form, recognizes a character string existing in the obtained form image, extracts a headline wording being a predetermined character string from the recognized character strings, determines a table structure existing in the form image, on the basis of the extracted headline wording and the arrangement of headline wordings in the form image and specifies a correspondence relationship between a headline wording and a character string other than the headline wording that is recognized, using the determination result.
    Type: Grant
    Filed: June 19, 2009
    Date of Patent: November 18, 2014
    Assignee: Fujitsu Frontech Limited
    Inventors: Shinichi Eguchi, Hajime Kawashima, Kouichi Kanamoto, Shohei Hasegawa, Katsutoshi Kobara, Maki Yabuki
  • Patent number: 8891872
    Abstract: A system and method for identifying characters using a processor and a sparse distributed memory (SDM) module. The system and method are configured to receive image data relating to an object having a surface with physical markings thereon. The physical markings include characters-of-interest. The system and method are also configured to analyze the image data to convert at least one of the characters-of-interest in the image data into a corresponding feature vector. The system and method are also configured to identify the characters-of-interest using the feature vector and the SDM module. A suggested identity for the characters-of-interest is provided.
    Type: Grant
    Filed: December 16, 2011
    Date of Patent: November 18, 2014
    Assignee: General Electric Company
    Inventors: Joseph Salvo, John Carbone, Lynn Ann Derose, Daniel Messier, Bouchra Bouqata, Adam McCann, William Leonard
  • Patent number: 8873856
    Abstract: The technology is directed to determining a class associated with an image. In some examples, a method determines the class associated with an image. The method can include determining a segmentation score for an image segment based on a comparison of the image segment and a region of an image. The region of the image can be associated with the image segment. The method further includes determining a confidence score for the image segment based on the segmentation score and a classification score. The classification score can be indicative of a similarity between the image segment and at least one class. The method further includes determining a class associated with the image based on the confidence score. The method further includes outputting the class associated with the image.
    Type: Grant
    Filed: September 9, 2013
    Date of Patent: October 28, 2014
    Assignee: Matrox Electronic Systems, Ltd.
    Inventors: Sylvain Chapleau, Vincent Paquin
  • Patent number: 8855413
    Abstract: Described is a method for identifying text or other information in one or more images and reflowing images of individual elements of text at a word boundary or character boundary on devices of different sizes. The text may be rescaled while retaining the look and feel of the original text. The size may be scaled according to one or more parameters. Text may be captured in a plurality of images and merged together to form a single document or document-like collection. Text may be fully recognized, indexed, sorted and/or be made searchable. Text may be wrapped around objects and features identified as non-text or non-informational elements in an image. Borders or edges between successive elements of text may be smoothed, combined, overlapped and/or blended. Backgrounds of text may be adjusted to make the appearance of successive elements aesthetically pleasing or as close to the original as possible.
    Type: Grant
    Filed: May 13, 2011
    Date of Patent: October 7, 2014
    Assignee: ABBYY Development LLC
    Inventor: Ding-Yuan Tang
  • Patent number: 8854691
    Abstract: An image processing apparatus extracts a line segment included in an image, and includes a density gradient direction determining section that determines a direction, in which density changes, of each processing unit composed of a predetermined number of pixels of an image, and a line segment extracting section that regards a couple of processing units whose density gradient directions are opposite each other as a processing unit pair and extracts a processing unit group including a plurality of processing unit pairs allocated in a row in a direction perpendicular to the density gradient directions as a line segment.
    Type: Grant
    Filed: February 2, 2012
    Date of Patent: October 7, 2014
    Assignee: Murata Machinery Ltd.
    Inventor: Nariyasu Kan
  • Publication number: 20140294302
    Abstract: A system for processing text captured from rendered documents is described. The system receives a sequence of one or more words optically or acoustically captured from a rendered document by a user. The system identifies among words of the sequence a word with which an action has been associated. The system then performs the associated action with respect to the user.
    Type: Application
    Filed: June 11, 2014
    Publication date: October 2, 2014
    Inventors: Martin T. King, Dale L. Grover, Clifford A. Kushler, James Q. Stafford-Fraser
  • Patent number: 8848240
    Abstract: An object area containing a text or a graphic is extracted from a document image containing the text or the graphic. Then, on the basis of the extracted object area and stored size information, a cropping area is determined that surrounds the object area with given margins. Further, setting of margins is received. Then, on the basis of the object area and the received setting of the margins, the cropping area is determined. Then, the cropping area determined on the basis of the object area and the size information or alternatively the cropping area determined on the basis of the object area and the setting of the margins is cropped from the document image.
    Type: Grant
    Filed: June 21, 2011
    Date of Patent: September 30, 2014
    Assignee: Sharp Kabushiki Kaisha
    Inventors: Atsuhisa Morimoto, Yohsuke Konishi, Hitoshi Hirohata, Akihito Yoshida
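    Illustration (not from the patent): determining a cropping area that surrounds an object area with given margins can be sketched as a box expansion. The coordinate convention, the function name, and the clamping to the page bounds are assumptions added for the example.

```python
def cropping_area(obj, margins, page):
    """Expand an object box by per-side margins, clamped to the page.

    obj:     (left, top, right, bottom) of the extracted object area, in pixels
    margins: (left, top, right, bottom) margin widths, in pixels
    page:    (width, height) of the document image
    Clamping to the page is an assumption; the abstract does not mention it.
    """
    left = max(0, obj[0] - margins[0])
    top = max(0, obj[1] - margins[1])
    right = min(page[0], obj[2] + margins[2])
    bottom = min(page[1], obj[3] + margins[3])
    return (left, top, right, bottom)
```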
  • Patent number: 8842925
    Abstract: A method and apparatus for encoding an image is provided. An image coding unit, including a region that deviates from a boundary of a current picture, is divided to obtain a coding unit having a smaller size than the size of the image coding unit, and encoding is performed only in a region that does not deviate from the boundary of the current picture. A method and apparatus for decoding an image encoded by the method and apparatus for encoding an image is also provided.
    Type: Grant
    Filed: November 11, 2013
    Date of Patent: September 23, 2014
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Min-su Cheon
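    Illustration (not from the patent): the boundary handling in this abstract can be pictured as splitting a coding unit that crosses the picture edge into smaller units and keeping only those inside the picture. The quadtree split, the minimum size of 8, and the function name are assumptions; the sketch also assumes the picture dimensions are multiples of the minimum size.

```python
def units_inside(x, y, size, width, height, min_size=8):
    """Return (x, y, size) coding units that lie entirely inside the picture.

    A unit crossing the picture boundary is divided into four smaller units;
    parts lying completely outside the picture are dropped, not encoded.
    """
    if x >= width or y >= height:
        return []  # entirely outside the picture
    if x + size <= width and y + size <= height:
        return [(x, y, size)]  # entirely inside: keep as one unit
    if size <= min_size:
        raise ValueError("picture dimensions must be multiples of min_size")
    half = size // 2
    units = []
    for dy in (0, half):
        for dx in (0, half):
            units += units_inside(x + dx, y + dy, half, width, height, min_size)
    return units
```

    For a 96x64 picture, a 64x64 unit at (64, 0) crosses the right edge, so only its two left 32x32 quadrants are kept.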
  • Publication number: 20140270526
    Abstract: A word segmentation method for processing a document image applies clustering analysis to the spacing segments of a line. The spacing segments are generated by thresholding a one-dimensional vertical projection profile of the line. Taking advantage of the bimodal distribution of spacing lengths in text lines, a k-means clustering algorithm is used, with the number of clusters pre-set to two, to classify the spacing segments as either character spacing or word spacing. Moreover, k-means++ initialization is used to enhance performance of cluster analysis. The clustering result, such as cluster centers and compactness, is used to prune single-word text lines, single table items, etc. The locations of the word spacing segments are then used to segment the line of text into words.
    Type: Application
    Filed: March 14, 2013
    Publication date: September 18, 2014
    Applicant: KONICA MINOLTA LABORATORY U.S.A., INC.
    Inventors: Chaohong Wu, Wei Ming
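    Illustration (not from the application): the pipeline in this abstract, thresholding a projection profile into spacing segments and clustering their lengths into two groups, can be sketched in plain Python. This sketch seeds the two centers with the smallest and largest gap instead of k-means++ and omits the compactness-based pruning; all function names are hypothetical.

```python
def spacing_segments(profile, threshold=0):
    """Return (start, length) for each interior run of columns with ink <= threshold."""
    runs, start = [], None
    for x, value in enumerate(profile):
        if value <= threshold:
            if start is None:
                start = x
        elif start is not None:
            runs.append((start, x - start))
            start = None
    # A trailing run is never closed, so only a leading run needs dropping.
    return [run for run in runs if run[0] > 0]


def classify_gaps(gaps, iterations=20):
    """1-D k-means with k=2: True marks a word gap, False a character gap."""
    if not gaps:
        return []
    lo, hi = float(min(gaps)), float(max(gaps))
    if lo == hi:
        return [False] * len(gaps)  # one cluster only: no word gap found
    labels = []
    for _ in range(iterations):
        labels = [abs(g - hi) < abs(g - lo) for g in gaps]
        small = [g for g, is_word in zip(gaps, labels) if not is_word]
        large = [g for g, is_word in zip(gaps, labels) if is_word]
        new_lo, new_hi = sum(small) / len(small), sum(large) / len(large)
        if (new_lo, new_hi) == (lo, hi):
            break  # centers stopped moving
        lo, hi = new_lo, new_hi
    return labels


def word_breaks(profile, threshold=0):
    """Return the start column of every spacing segment classified as a word gap."""
    segments = spacing_segments(profile, threshold)
    flags = classify_gaps([length for _, length in segments])
    return [start for (start, _), is_word in zip(segments, flags) if is_word]
```

    Here profile is assumed to hold the ink count of each pixel column of one text line; the returned columns mark where the line would be cut into words.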