Context Analysis Or Word Recognition (e.g., Character String) Patents (Class 382/229)
  • Publication number: 20080002893
    Abstract: Methods, systems, and apparatus including computer program products for recognizing text in images are provided. In one implementation, a computer-implemented method for recognizing text in an image is provided. The method includes receiving a plurality of images. The method also includes processing the images to detect a corresponding set of regions of the images, each image having a region corresponding to each other image region, as potentially containing text. The method further includes combining the regions to generate an enhanced region image and performing optical character recognition on the enhanced region image.
    Type: Application
    Filed: June 29, 2006
    Publication date: January 3, 2008
    Inventors: Luc Vincent, Adrian Ulges
  • Publication number: 20080002916
    Abstract: Methods, systems, and apparatus including computer program products for using extracted image text are provided. In one implementation, a computer-implemented method is provided. The method includes receiving an input of one or more image search terms and identifying keywords from the received one or more image search terms. The method also includes searching a collection of keywords including keywords extracted from image text, retrieving an image associated with extracted image text corresponding to one or more of the image search terms, and presenting the image.
    Type: Application
    Filed: June 29, 2006
    Publication date: January 3, 2008
    Inventors: Luc Vincent, Adrian Ulges
  • Patent number: 7313277
    Abstract: A method for recognition of a handwritten pattern comprises the steps of forming (4) a representation of the handwritten pattern, forming (6) at least two sub-configurations by dividing the representation of the handwritten pattern, and processing the subconfigurations. The step of processing comprises the steps of comparing (8) each subconfiguration with reference configurations, selecting (10) at least one subconfiguration candidate for each subconfiguration among the reference configurations based on said step of comparing, and determining (12) at least one candidate pattern consisting of one selected subconfiguration candidate for each subconfiguration. The method further comprises the steps of comparing (14) the representation of the handwritten pattern to the candidate pattern, and computing (16) a cost function in order to find a closest matching candidate pattern.
    Type: Grant
    Filed: February 8, 2002
    Date of Patent: December 25, 2007
    Assignee: Zi Decuma AB
    Inventors: Jonas Morwing, Gunnar Sparr
  • Publication number: 20070286486
    Abstract: A system for automatically recognizing a handwriting image and converting such image to text data including a sequence of validated words has an image input device, a number of handwriting recognition engines, and a control unit. A first handwriting recognition engine is responsive to the image input device, for analyzing the data file and providing one or more possible text words for each successive word in the data file. The first handwriting recognition engine further provides a resemblance indication for each possible text word indicating a level of resemblance between its appearance and the appearance of the handwritten word in the data file. In the event that there is not a high level of confidence in the selection of the first handwriting recognition engine, a selection of a validated word is based on the selections of one or more of the other handwriting recognition engines.
    Type: Application
    Filed: May 30, 2006
    Publication date: December 13, 2007
    Inventor: Ira P. Goldstein
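    Illustrative sketch (not taken from the application): the fallback described in the abstract could look roughly like the following. The engine callables, the 0.8 threshold, and the majority vote that also counts the first engine's selection are all assumptions made for the example.

```python
from collections import Counter


def best_candidate(candidates):
    """Return the (word, resemblance) pair with the highest resemblance."""
    return max(candidates, key=lambda pair: pair[1])


def validate_word(word_image, primary_engine, other_engines, threshold=0.8):
    """Pick a validated word; engines are hypothetical callables that
    return a list of (word, resemblance) pairs for one handwritten word."""
    word, resemblance = best_candidate(primary_engine(word_image))
    if resemblance >= threshold:
        return word  # high confidence: accept the first engine's selection
    # Low confidence: let every engine vote with its top selection (assumed rule).
    votes = Counter([word])
    votes.update(best_candidate(engine(word_image))[0] for engine in other_engines)
    return votes.most_common(1)[0][0]


# Stub engines standing in for real recognizers.
primary = lambda img: [("clear", 0.55), ("dear", 0.45)]
second = lambda img: [("dear", 0.90)]
third = lambda img: [("dear", 0.70), ("clear", 0.30)]
print(validate_word(None, primary, [second, third]))  # prints "dear"
```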
  • Patent number: 7305435
    Abstract: A mobile device, system, method, and software for communicating with the internet utilizing a written universal resource locator (URL). A camera unit is used to receive a raw visual light image containing a written URL, the raw visual light image is converted to an electronic image, and the device locates glyphs of at least one particular standardized set of URL characters in the electronic image, for example glyphs corresponding to www. Then the URL characters are extracted from the electronic image, the URL is sent in a request signal to a web server, and in response an internet site is presented. The mobile device includes initiation means for sending an instruction to obtain a raw visual light image that includes glyphs of at least one particular set of characters, such as www, and further includes a camera, a display, and an internet interface. The mobile device processes an electronic image signal provided by the camera, in order to obtain the web site signal from the internet interface.
    Type: Grant
    Filed: August 25, 2003
    Date of Patent: December 4, 2007
    Assignee: Nokia Corporation
    Inventor: Kimmo Hämynen
  • Patent number: 7302343
    Abstract: Methods are disclosed for encoding latitude/longitude coordinates within a URL in a relatively compact form. The method includes converting latitude and longitude coordinates from floating-point numbers to non-negative integers. A set of base-N string representations are generated for the integers (N represents the number of characters in an implementation-defined character set being utilized). The latitude string and longitude string are then concatenated to yield a single output string. The output string is utilized as a geographic indicator with a URL.
    Type: Grant
    Filed: July 31, 2003
    Date of Patent: November 27, 2007
    Assignee: Microsoft Corporation
    Inventor: Bryan Beatty
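    Illustrative sketch (not taken from the patent): the steps in the abstract, float to non-negative integer, integer to base-N string, then concatenation, might be written as below. The 62-character alphabet, the five-decimal scale, and the fixed width of five characters per coordinate are assumptions chosen for the example.

```python
import string

# Hypothetical character set; the abstract only says it is implementation-defined.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase  # N = 62
SCALE = 100_000  # assumed precision: five decimal places
WIDTH = 5        # assumed fixed width so the output can be split again


def to_base_n(value, width=WIDTH):
    """Encode a non-negative integer as a fixed-width base-N string."""
    if value < 0:
        raise ValueError("value must be non-negative")
    n = len(ALPHABET)
    digits = []
    for _ in range(width):
        value, remainder = divmod(value, n)
        digits.append(ALPHABET[remainder])
    if value:
        raise ValueError("value does not fit in the given width")
    return "".join(reversed(digits))


def from_base_n(text):
    """Decode a base-N string back to an integer."""
    value = 0
    for ch in text:
        value = value * len(ALPHABET) + ALPHABET.index(ch)
    return value


def encode(lat, lon):
    """Shift to non-negative integers, encode each, and concatenate."""
    lat_int = round((lat + 90.0) * SCALE)    # 0 .. 18,000,000
    lon_int = round((lon + 180.0) * SCALE)   # 0 .. 36,000,000
    return to_base_n(lat_int) + to_base_n(lon_int)


def decode(token):
    """Split the concatenated string and undo the shift and scale."""
    lat_int = from_base_n(token[:WIDTH])
    lon_int = from_base_n(token[WIDTH:])
    return lat_int / SCALE - 90.0, lon_int / SCALE - 180.0
```

    With these assumptions every coordinate pair becomes a ten-character token, for example `encode(-90.0, -180.0)` gives `"0000000000"`, which could then be appended to a URL as a query parameter.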
  • Patent number: 7289668
    Abstract: Methods and systems for document image decoding incorporating a Stack algorithm improve document image decoding. The application of the Stack algorithm is iterated to improve decoding. A provisional weight is determined for a partial path to reduce template matching. In addition, semantically equivalent hypotheses are identified to reduce redundant hypotheses.
    Type: Grant
    Filed: August 9, 2002
    Date of Patent: October 30, 2007
    Assignee: Xerox Corporation
    Inventors: Daniel H. Greene, Tze-Lei Poo, Ashok C. Popat
  • Patent number: 7283126
    Abstract: A touch system comprises a touch panel having a touch surface and a projector presenting images onto the touch surface. A computer executing an applications program is coupled to the touch panel and the projector. The computer is responsive to contact output generated by the touch panel in response to proximity of a pointer to the touch surface and updates image data conveyed to the projector so that images presented on the touch surface reflect pointer activity. The computer executes a gesture suggestion and writing recognition routine. The gesture suggestion and writing recognition routine performs recognition to convert ink on the touch surface into another object based on a selected interpretation.
    Type: Grant
    Filed: June 12, 2002
    Date of Patent: October 16, 2007
    Assignee: Smart Technologies Inc.
    Inventor: Andy Leung
  • Publication number: 20070206884
    Abstract: There is provided an image processing apparatus including a character recognition section that executes character recognition on an input document image and outputs a character recognition result, an item name extraction section that extracts a character string relevant to an item name of an information item from the character recognition result, an item value extraction section that extracts a character string of an item value corresponding to the item name from the vicinity of the character string relevant to the item name in the document image, and an extraction information creation section that creates extraction information by associating the character string of the item value extracted by the item value extraction section to the item name.
    Type: Application
    Filed: August 29, 2006
    Publication date: September 6, 2007
    Inventor: Masahiro Kato
  • Patent number: 7251367
    Abstract: A system augments stylus keyboarding with shorthand gesturing. The system defines a shorthand symbol for each word according to its movement pattern on an optimized stylus keyboard. The system recognizes word patterns by identifying an input as a stroke, and then matching the stroke to a stored list of word patterns. The system then generates and displays the matched word to the user.
    Type: Grant
    Filed: December 20, 2002
    Date of Patent: July 31, 2007
    Assignee: International Business Machines Corporation
    Inventor: Shumin Zhai
  • Patent number: 7240062
    Abstract: Multiple recognition engines (110) provide different interpretations (116) of a word at a given location within a scanned document (108). A word node corresponding to each unique interpretation is stored within a word index (102), with each word node being linked to word nodes of previously and subsequently recognized words.
    Type: Grant
    Filed: March 6, 2001
    Date of Patent: July 3, 2007
    Assignee: iArchives, Inc.
    Inventors: Timothy L. Andersen, Frederick A. Zarndt, Robert B. Wille, Michael E. Rimer, Michael U. Bailey, G. Bret Millar, E. Derek Rowley
  • Publication number: 20070140568
    Abstract: A history management apparatus includes: a detection unit that detects a character or a word included in image data which is processed in an image processing apparatus; and a storage unit that stores information to specify a detection position of the detected character or word in the image data and the image data in association with each other. The stored information is provided to an image data searching process using the character or the word and the detecting position as search conditions.
    Type: Application
    Filed: June 13, 2006
    Publication date: June 21, 2007
    Applicant: Fuji Xerox Co., Ltd.
    Inventors: Yoshihide Kohtani, Ayumi Segi
  • Patent number: 7221800
    Abstract: Text that is adjacent to predetermined indicia is detected in a digital image. Alternatively, or in addition, a digital image can have text that is adjacent to a predefined insertion field. Text that is input and/or derived from an optically scanned image is substituted in the digital image for the predetermined indicia and/or for the predefined insertion field. The substituted text matches the font of text adjacent thereto in the digital image. The digital image having the substitution is rendered.
    Type: Grant
    Filed: August 29, 2003
    Date of Patent: May 22, 2007
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Robert Sesek, Travis J. Parry, Chad A. Stevens
  • Patent number: 7219310
    Abstract: Glyph instructions are formed which are understandable by a person following the instructions, irrespective of which written language is understood by the person. The glyph instructions follow defined grammar and syntax rules. A plurality of action glyphs are used to represent a plurality of defined actions capable of being undertaken by the person following the instructions. A plurality of material glyphs are defined to represent a plurality of materials which are includable as part of the instruction, and a plurality of instrumentation glyphs are defined to represent a plurality of instruments which may be included in the instructions. Selected ones of the action glyphs, material glyphs and instrumentation glyphs are arranged in relationship to each other in accordance with the predetermined grammar and syntax to form specific instructions understandable by the person following the instruction, irrespective of the written language which is understood by the person.
    Type: Grant
    Filed: November 5, 2001
    Date of Patent: May 15, 2007
    Assignee: Xerox Corporation
    Inventors: Jesus Santoyo Ortega, Jose Luis Duenas, Rosa Elena Castillo, Jésus Esquivel, Hugo C. Correa, Mauricio Campos, Salvador De Luna, Gilberto Esparza
  • Patent number: 7200271
    Abstract: In accordance with this invention, a method, computer program product, and system for performing automated recognition of blocks of text within a graphic file are provided. The method, computer program product, and system automatically transform drawings into a graphic file format that provides enriched electronic display and text search of the graphic files. The text within large sets of drawings, parts catalogs, and various manuals is automatically discovered, extracted and indexed by the geometric location of the text within the graphic file. Single lines of text and blocks of text are recognized by utilizing geometric reasoning techniques based upon proximity and font characteristics. As such, the present invention automatically produces an interactive electronic representation of a graphic file that allows a user to quickly and accurately search graphic files for particular text, whether the text appears in a single line or over multiple lines in a block of text.
    Type: Grant
    Filed: October 4, 2001
    Date of Patent: April 3, 2007
    Assignee: The Boeing Company
    Inventors: John H. Boose, Lawrence S. Baum, Molly L. Boose
  • Patent number: 7197185
    Abstract: A logical separation between pages, such as an implicit page break, is introduced to separate text entered during one handwriting session from text entered during another handwriting session. The amount of time elapsed since ink has been captured on the previous page is a factor that may be used to determine whether to insert an implicit page break into the new page. A change in context, such as a different date or different recognized subject matter labels, is also a factor that may be considered in determining whether to insert an implicit page break.
    Type: Grant
    Filed: January 19, 2006
    Date of Patent: March 27, 2007
    Assignee: Microsoft Corporation
    Inventors: Charlton E Lui, Anthony S Smith, Dan W Altman, Cynthia C Tee, Evan M Feldman
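    Illustrative sketch (not taken from the patent): the two factors named in the abstract, elapsed time since the last ink and a change in recognized subject labels, could be combined as below. The one-hour gap and the "no shared labels" rule are assumptions for the example.

```python
from datetime import datetime, timedelta


def needs_implicit_page_break(last_ink, now, previous_labels, current_labels,
                              max_gap=timedelta(hours=1)):
    """Decide whether new ink should start a logically separate page.

    previous_labels and current_labels are sets of recognized subject labels.
    """
    if now - last_ink > max_gap:
        return True  # a long pause suggests a new handwriting session
    if previous_labels and current_labels:
        # Assumed rule: no label in common means the context has changed.
        return previous_labels.isdisjoint(current_labels)
    return False


start = datetime(2006, 1, 19, 9, 0)
print(needs_implicit_page_break(start, start + timedelta(hours=3), set(), set()))  # True
```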
  • Patent number: 7190833
    Abstract: The invention relates to a mobile device with a built-in image capture device, and a character recognition function to present the information gathered with the character recognition result. With the mobile device, the character line extraction process is displayed whenever necessary, and the resolution of an image to be inputted for recognition processing is enhanced. Accordingly, it is possible for the operator to select the target character line with ease. In addition, the mobile device has a character recognition ratio improved by the enhancement in resolution.
    Type: Grant
    Filed: July 23, 2002
    Date of Patent: March 13, 2007
    Assignee: Hitachi, Ltd.
    Inventors: Tatsuhiko Kagehiro, Minenobu Seki, Hiroshi Sako
  • Patent number: 7181068
    Abstract: A mathematical expression recognizing device comprises a character recognition unit which recognizes characters in a document image, a dictionary storing a pair of evaluation scores for each type of word, the score showing the possibility of belonging to the text and that of belonging to the mathematical expression, an evaluation unit which obtains the evaluation scores showing the possibility of belonging to the text and that of belonging to the mathematical expression for each of the words included in the recognized characters with reference to the dictionary, and a mathematical expression detecting unit which searches for an optimal path connecting words by selecting one of the text and the mathematical expression based on a formative grammar and the evaluation scores showing the possibility of belonging to the text and that of belonging to the mathematical expression for each of the words, thereby detecting characters belonging to the mathematical expression.
    Type: Grant
    Filed: March 5, 2002
    Date of Patent: February 20, 2007
    Assignees: Kabushiki Kaisha Toshiba
    Inventors: Masakazu Suzuki, Kazuaki Yokota, Yuko Eto
  • Patent number: 7181067
    Abstract: In computerized recognition having multiple experts, a method and system is described that obtains an optimum value for an expert tuning parameter in a single pass over sample tuning data. Each tuning sample is applied to two experts, resulting in scores from which ranges of parameters that correct recognition errors without changing correct results for that sample are determined. To determine the range data for a given sample, the experts return scores for each prototype in a database, the scores separated into matching and non-matching scores. The matching and non-matching scores from each expert are compared, providing upper and lower bounds defining ranges. Maxima and minima histograms track upper and lower bound range data, respectively. An analysis of the histograms based on the full set of tuning samples provides the optimum value. For tuning multiple parameters, each parameter may be optimized by this method in isolation, and then iterated.
    Type: Grant
    Filed: February 2, 2005
    Date of Patent: February 20, 2007
    Assignee: Microsoft Corporation
    Inventor: Gregory N. Hullender
  • Patent number: 7167588
    Abstract: Methods and systems for document image decoding incorporating a Stack algorithm improve document image decoding. The application of the Stack algorithm is iterated to improve decoding. A provisional weight is determined for a partial path to reduce template matching. In addition, semantically equivalent hypotheses are identified to reduce redundant hypotheses.
    Type: Grant
    Filed: August 9, 2002
    Date of Patent: January 23, 2007
    Assignee: Xerox Corporation
    Inventors: Daniel H. Greene, Justin Romberg, Ashok C. Popat
  • Patent number: 7164798
    Abstract: Systems and methods for learning-based automatic commercial content detection are described. In one aspect, program data is divided into multiple segments. The segments are analyzed to determine visual, audio, and context-based feature sets that differentiate commercial content from non-commercial content. The context-based features are a function of single-side left and/or right neighborhoods of segments of the multiple segments.
    Type: Grant
    Filed: February 18, 2003
    Date of Patent: January 16, 2007
    Assignee: Microsoft Corporation
    Inventors: Xian-Sheng Hua, Lie Lu, Mingjing Li, Hong-Jiang Zhang
  • Patent number: 7155061
    Abstract: In a computing device, a method and system for searching for matching ink words or phrases, by comparing a given search term of at least one word (and possibly alternates) with the words in a document, including recognized ink words and any possible alternates for those recognized words as returned by a recognizer. Various matching tests are possible because of the use of alternates, which also may have corresponding probability rankings that may influence the search. Searching may occur in actively edited ink documents, or the recognition results may be saved as saved search file data that can be searched independent of recognition.
    Type: Grant
    Filed: June 21, 2002
    Date of Patent: December 26, 2006
    Assignee: Microsoft Corporation
    Inventors: Charlton E. Lui, Gregory H. Manto, Vikram Madan, Ryan E. Cukierman, Jon E. Clark
  • Patent number: 7142716
    Abstract: A document image search apparatus generates a text by performing the character recognition of a document image and determines a re-process scope. Then, the apparatus generates a candidate character lattice from the re-recognition result of the re-process scope, generates character strings from the candidate character lattice and adds the character strings to the text. Then, the apparatus performs index search using the text with the character strings added.
    Type: Grant
    Filed: September 12, 2001
    Date of Patent: November 28, 2006
    Assignee: Fujitsu Limited
    Inventors: Yutaka Katsuyama, Satoshi Naoi, Fumihito Nishino
  • Patent number: 7136530
    Abstract: A method and apparatus for extracting information from symbolically compressed document images. A deciphering module generates first and second text strings by deciphering respective sequences of template identifiers in first and second symbolically compressed document images. A conditional n-gram module receives the first and second text strings from the deciphering module and extracts n-gram terms therefrom based on a predicate condition. A comparison module generates a measure of similarity between the first and second symbolically compressed document images based on the n-gram terms extracted by the conditional n-gram module.
    Type: Grant
    Filed: September 30, 2003
    Date of Patent: November 14, 2006
    Assignee: Ricoh Co., Ltd.
    Inventors: Dar-Shyang Lee, Jonathan J. Hull
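    Illustrative sketch (not taken from the patent): starting from two already-deciphered text strings, conditional n-gram extraction and comparison might look like this. The predicate (keep only the first letter of each word) and the cosine measure are assumptions; the abstract does not name a specific predicate or similarity measure.

```python
import math
from collections import Counter


def conditional_ngrams(text, n, predicate):
    """Keep characters whose position satisfies predicate, then count n-grams."""
    kept = [ch for i, ch in enumerate(text) if predicate(text, i)]
    return Counter("".join(kept[i:i + n]) for i in range(len(kept) - n + 1))


def follows_space(text, i):
    """Hypothetical predicate: true for the first character of each word."""
    return text[i] != " " and (i == 0 or text[i - 1] == " ")


def cosine_similarity(a, b):
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values()) * sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


first = conditional_ngrams("the quick brown fox", 2, follows_space)
second = conditional_ngrams("the quick brown dog", 2, follows_space)
print(cosine_similarity(first, second))
```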
  • Patent number: 7130470
    Abstract: A method and system for context-based sorting of character strings. A first sorting weight of a current character of a character string is determined from a first table. The first sorting weight is stored. Provided the current character is a predetermined character, a second table is accessed. A second sorting weight of the current character is determined from the location of a preceding character within the second table. The first sorting weight is replaced with the second sorting weight for the current character. Embodiments of the present invention provide an efficient method of context-based sorting in languages, such as Japanese, where the sorting weight of a character can be altered by the preceding character.
    Type: Grant
    Filed: March 15, 2002
    Date of Patent: October 31, 2006
    Assignee: Oracle International Corporation
    Inventor: Ching Lan Ho
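    Illustrative sketch (not taken from the patent): the two-table lookup in the abstract can be expressed as a sort key. The weights, the choice of the katakana prolonged sound mark as the "predetermined character", and the table contents are all hypothetical.

```python
# First table: default sorting weight of each character (hypothetical values).
FIRST_TABLE = {"ア": 1, "イ": 2, "カ": 6, "キ": 7, "ー": 0}

# Characters whose weight depends on the character before them (assumed).
CONTEXT_CHARS = {"ー"}

# Second table: indexed by the preceding character; here the prolonged sound
# mark takes the weight of the preceding character's vowel (hypothetical).
SECOND_TABLE = {"ア": 1, "カ": 1, "イ": 2, "キ": 2}


def sort_key(text):
    """Build a tuple of sorting weights, applying the context rule."""
    weights = []
    previous = None
    for ch in text:
        weight = FIRST_TABLE[ch]  # first sorting weight (KeyError if unknown)
        if ch in CONTEXT_CHARS and previous in SECOND_TABLE:
            weight = SECOND_TABLE[previous]  # replace with the second weight
        weights.append(weight)
        previous = ch
    return tuple(weights)


print(sorted(["キー", "カー", "カイ"], key=sort_key))  # ['カー', 'カイ', 'キー']
```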
  • Patent number: 7124130
    Abstract: The present invention is directed to an address recognition apparatus for recognizing a written address. The apparatus includes an input device that receives a scanned image of the written address and transforms the image into digital data, a character recognizing section that recognizes a word string in the digital data on a unit character basis, a word extracting section that extracts characters recognized by the character recognizing section on a unit word basis, and an address word string dictionary that previously stores a plurality of first word strings. The apparatus further includes an address word string recognizing section that collates a second word string, determines words of the second word string respectively corresponding to the words of the first word string, evaluates each of the first word strings, and recognizes one of the first word strings as the address word string.
    Type: Grant
    Filed: September 4, 2003
    Date of Patent: October 17, 2006
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Naotake Natori
  • Patent number: 7120302
    Abstract: The present invention embodies a character recognition method for constructing a result string from a plurality of result sets. Each result set comprises at least one candidate character, and each candidate character has an associated confidence indication. The method can begin by selecting a plurality of character types. For each selected character type, a candidate string can be created by concatenating a candidate character of the selected character type from each result set. The associated confidence indication for each concatenated candidate character can be combined to form a corresponding combined confidence indication for each created candidate string. The created candidate string with the most favorable corresponding combined confidence indication can be selected as the result string.
    Type: Grant
    Filed: July 31, 2001
    Date of Patent: October 10, 2006
    Assignee: RAF Technology, Inc.
    Inventor: Stephen E. M. Billester
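    Illustrative sketch (not taken from the patent): building one candidate string per character type and keeping the one with the best combined confidence might look like this. The two character types, the product as the combining rule, and the choice of the highest-confidence candidate of each type are assumptions.

```python
# Hypothetical character types; the abstract does not enumerate them.
CHAR_TYPES = {
    "digit": str.isdigit,
    "alpha": str.isalpha,
}


def build_result_string(result_sets):
    """result_sets: one list of (candidate_char, confidence) per position."""
    best_string, best_score = None, -1.0
    for is_type in CHAR_TYPES.values():
        chars, score = [], 1.0
        for candidates in result_sets:
            of_type = [(c, p) for c, p in candidates if is_type(c)]
            if not of_type:
                break  # this type cannot cover every position
            char, confidence = max(of_type, key=lambda pair: pair[1])
            chars.append(char)
            score *= confidence  # assumed combining rule: product
        else:
            if score > best_score:
                best_string, best_score = "".join(chars), score
    return best_string


result_sets = [
    [("1", 0.60), ("l", 0.40)],
    [("0", 0.70), ("O", 0.30)],
    [("S", 0.55), ("5", 0.45)],
]
print(build_result_string(result_sets))  # prints "105"
```

    Picking the top candidate at each position independently would give "10S" here; requiring one character type across the whole string yields "105" instead.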
  • Patent number: 7110607
    Abstract: A multilevel image into which a color or a black-and-white image is converted is input, and a slightly indistinct binary image generating unit generates a slightly indistinct binary image that includes a slightly indistinct line pattern and does not include background noise. Additionally, a shape-preserved binary image generating unit generates a binary image that preserves the shape of a line pattern and includes background noise. These images are ANDed for each pixel, so that a binary image that preserves the shape of the line pattern and does not include the background noise is generated.
    Type: Grant
    Filed: November 28, 2001
    Date of Patent: September 19, 2006
    Assignee: Fujitsu Limited
    Inventors: Katsuhito Fujimoto, Atsuko Ohara, Satoshi Naoi
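    Illustrative sketch (not taken from the patent): the final per-pixel AND of the two binary images can be shown with small hand-made masks. How each mask is produced is not covered here; the sample data is invented for the example.

```python
def and_images(indistinct, shape_preserved):
    """Per-pixel AND of two binary images given as lists of rows of 0/1."""
    if len(indistinct) != len(shape_preserved) or any(
        len(a) != len(b) for a, b in zip(indistinct, shape_preserved)
    ):
        raise ValueError("images must have the same dimensions")
    return [
        [a & b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(indistinct, shape_preserved)
    ]


# Thick, blurred stroke with no background noise.
indistinct = [
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0],
    [1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
# Thin, accurate stroke plus two isolated noise pixels.
shape_preserved = [
    [0, 0, 0, 0, 1],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
]
for row in and_images(indistinct, shape_preserved):
    print(row)
```

    The result keeps the thin stroke in the second row and drops both noise pixels, because noise falls outside the blurred mask.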
  • Patent number: 7110998
    Abstract: The querying application of the present invention provides method and apparatus to find related queries having the greatest-valued and/or least-valued results. Elements (or inputs) of the related queries overlap with the elements of the user query. Preferably, the querying application of the present invention enables the user to trace other queries having the greatest-valued and/or least-valued results that overlap other elements of the user query. In accordance with another aspect of the present invention, the querying application of the present invention provides method and apparatus to find non-related queries having the greatest-valued and/or least-valued results.
    Type: Grant
    Filed: October 12, 1999
    Date of Patent: September 19, 2006
    Assignee: Virtual Gold, Inc.
    Inventors: Inderpal S. Bhandari, Rajiv Pratap, Krishnakumar Ramanujam
  • Patent number: 7106905
    Abstract: Systems and methods for processing text-based electronic documents are provided. Briefly described, one embodiment of a method for processing a text-based electronic document comprises the steps of: comparing at least one word in a text-based electronic document to a native language dictionary to determine whether the at least one word conforms to a predefined rule; for each of the at least one word that does not conform to the predefined rule, fragmenting the at least one word into word fragments; combining at least two consecutive word fragments; and comparing the combination of the word fragments to the native language dictionary.
    Type: Grant
    Filed: August 23, 2002
    Date of Patent: September 12, 2006
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventor: Steven J. Simske
  • Patent number: 7092574
    Abstract: A mathematical expression recognizing device comprises a character recognition unit which recognizes characters in a document image, a dictionary storing a pair of evaluation scores for each type of word, the score showing the possibility of belonging to the text and that of belonging to the mathematical expression, an evaluation unit which obtains the evaluation scores showing the possibility of belonging to the text and that of belonging to the mathematical expression for each of the words included in the recognized characters with reference to the dictionary, and a mathematical expression detecting unit which searches for an optimal path connecting words by selecting one of the text and the mathematical expression based on a formative grammar and the evaluation scores showing the possibility of belonging to the text and that of belonging to the mathematical expression for each of the words, thereby detecting characters belonging to the mathematical expression.
    Type: Grant
    Filed: March 5, 2002
    Date of Patent: August 15, 2006
    Assignees: Kabushiki Kaisha Toshiba
    Inventors: Masakazu Suzuki, Kazuaki Yokota, Yuko Eto
  • Patent number: 7092567
    Abstract: A method of post-processing character data from an optical character recognition (OCR) engine and apparatus to perform the method. This exemplary method includes segmenting the character data into a set of initial words. The set of initial words is word level processed to determine at least one candidate word corresponding to each initial word. The set of initial words is segmented into a set of sentences. Each sentence in the set of sentences includes a plurality of initial words and candidate words corresponding to the initial words. A sentence is selected from the set of sentences. The selected sentence is word disambiguity processed to determine a plurality of final words. A final word is selected from the at least one candidate word corresponding to a matching initial word. The plurality of final words is then assembled as post-processed OCR data.
    Type: Grant
    Filed: November 4, 2002
    Date of Patent: August 15, 2006
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Yue Ma, Jinhong Katherine Guo, Mu Li, Yu-kun Tong, Tian-shun Yao, Jing-bo Zhu
  • Patent number: 7081975
    Abstract: The present invention employs an instruction button for inputting an instruction to simultaneously execute registration, in a full-text-search database, of a document scanned by one scanning operation and subjected to OCR processing, and registration of the document in a designated folder contained in a database of a folder tree structure. This single instruction button enables the user to simultaneously register read images in a folder, and text data resulting from OCR processing executed on the read images.
    Type: Grant
    Filed: September 7, 2004
    Date of Patent: July 25, 2006
    Assignees: Kabushiki Kaisha Toshiba, Toshiba Tec Kabushiki Kaisha
    Inventors: Nobuhisa Yoda, Tatsuya Haraguchi
  • Patent number: 7047238
    Abstract: Disclosed are a document retrieval method and system for separately performing a process for correcting erroneously recognized characters existing in characteristic character strings within a seed document or the documents to be registered and a process for tolerating erroneously recognized characters existing in the documents targeted for retrieval. The process for correcting erroneously recognized characters existing in characteristic character strings extracts characteristic character strings from a read document, replaces the extracted characteristic character strings containing erroneously recognized characters with character strings appropriate for document retrieval, and selects characteristic character strings for use in actual document retrieval.
    Type: Grant
    Filed: February 21, 2003
    Date of Patent: May 16, 2006
    Assignees: Hitachi, Ltd., Hitachi Systems & Services, Ltd.
    Inventors: Katsumi Tada, Hisashi Takatori
  • Patent number: 7039637
    Abstract: An evaluator system accepts input textual messages in unknown languages and assesses which character sets, corresponding to languages, match that message. Textual messages whose individual characters are encoded in 16 bit Unicode or other universal format are parsed, and character sets which can express each character and the accumulated correspondence is logged. When the character sets against which the message is being tested only provide partial matches, the invention can determine which offers the best fit, including by means of a weighting function. The evaluation technology of the invention can be applied to multipart documents, and to search engines and indices. Documents can be indexed according to assigned character sets, and query strings matched to indices according to language.
    Type: Grant
    Filed: August 27, 1999
    Date of Patent: May 2, 2006
    Assignee: International Business Machines Corporation
    Inventors: Brendan P. Murray, Kuniaki Takizawa
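    Illustrative sketch (not taken from the patent): counting, per candidate character set, how many characters of a Unicode message it can express gives a simple best-fit score. The candidate list and the tie-break (earlier candidate wins) are assumptions, and the weighting function mentioned in the abstract is not modeled. The encodings used are standard Python codecs.

```python
# Hypothetical candidate list, ordered by preference for tie-breaking.
CANDIDATES = ["ascii", "latin-1", "iso8859-7", "shift_jis", "koi8-r"]


def score_charsets(message, candidates=CANDIDATES):
    """Count how many characters of message each character set can express."""
    scores = dict.fromkeys(candidates, 0)
    for ch in message:
        for name in candidates:
            try:
                ch.encode(name)
            except UnicodeEncodeError:
                continue  # this character set cannot express the character
            scores[name] += 1
    return scores


def best_charset(message, candidates=CANDIDATES):
    """Return the candidate with the highest count; earlier entries win ties."""
    scores = score_charsets(message, candidates)
    return max(candidates, key=lambda name: scores[name])


print(best_charset("hello"))       # prints "ascii"
print(best_charset("καλημέρα"))    # prints "iso8859-7"
print(best_charset("こんにちは"))   # prints "shift_jis"
```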
  • Patent number: 7039240
    Abstract: Methods and systems for document image decoding incorporating a Stack algorithm improve document image decoding. The application of the Stack algorithm is iterated to improve decoding. A provisional weight is determined for a partial path to reduce template matching. In addition, semantically equivalent hypotheses are identified to reduce redundant hypotheses.
    Type: Grant
    Filed: August 9, 2002
    Date of Patent: May 2, 2006
    Assignee: Xerox Corporation
    Inventors: Daniel H. Greene, Justin K Romberg, Tze-Lei Poo, Ashok C. Popat
  • Patent number: 7031521
    Abstract: A logical separation between pages, such as an implicit page break, is introduced to separate text entered during one handwriting session from text entered during another handwriting session. The amount of time elapsed since ink has been captured on the previous page is a factor that may be used to determine whether to insert an implicit page break into the new page. A change in context, such as a different date or different recognized subject matter labels, is also a factor that may be considered in determining whether to insert an implicit page break.
    Type: Grant
    Filed: October 1, 2004
    Date of Patent: April 18, 2006
    Assignee: Microsoft Corporation
    Inventors: Charlton E. Lui, Anthony S. Smith, Dan W. Altman, Cynthia C. Tee, Evan M. Feldman
  • Patent number: 7031002
    Abstract: A system and method of using character set matching to identify the matching or best-matching font to print text of indeterminate language are presented. Today's operating systems do not provide the native tools and functions to easily display text of unknown language or multiple languages. The complexity of any underlying code that handles a multilingual display is sharply increased due to the text being segmented into multiple text runs. The invention employs a character set engine that provides necessary character set guessing functionality, as well as an enumerator module to build a linked list of suitable output fonts to display text from an arbitrary language, and multilingual text. Output on a laser, inkjet or other printing apparatus can be granted by traversing that list.
    Type: Grant
    Filed: August 27, 1999
    Date of Patent: April 18, 2006
    Assignee: International Business Machines Corporation
    Inventor: David D. Taieb
  • Patent number: 7024042
    Abstract: The capacity of a character feature dictionary is reduced, and stored as a feature dictionary. The capacity is reduced by clustering feature vectors in units of columns or rows for character features, by making m column vectors represent the column or row features, and by assigning 1 to m identification numbers. The capacity of the dictionary can be further reduced by representing a column or row feature with an addition sum of other column or row features, or differential features after clustering is performed, or by performing dimension compression for character features. Word recognition is performed by synthesizing a word feature for a comparison based on a word list to be recognized, and by making a comparison between a feature extracted from an input word and the synthesized feature. Or, a comparison between input word and input word features whose numbers of dimensions are different may be made with nonlinear elastic matching.
    Type: Grant
    Filed: September 12, 2001
    Date of Patent: April 4, 2006
    Assignee: Fujitsu Limited
    Inventor: Yoshinobu Hotta
  • Patent number: 6990237
    Abstract: A logical separation between pages, such as an implicit page break, is introduced to separate text entered during one handwriting session from text entered during another handwriting session. If the user leaves more than a threshold amount of blank space at the bottom of the page immediately preceding the new page, then an implicit page break may be inserted at the beginning of the new page. The amount of blank space left at the end of the preceding page may be combined with other criteria to determine whether to insert an implicit page break. The amount of time elapsed since ink has been captured on the previous page is another factor that may be used by itself or combined with other factors to determine whether to insert an implicit page break into the new page. A change in context, such as a different date or different recognized subject matter labels, is also a factor that may be considered in determining whether to insert an implicit page break.
    Type: Grant
    Filed: June 14, 2004
    Date of Patent: January 24, 2006
    Assignee: Microsoft Corporation
    Inventors: Charlton E. Lui, Anthony S. Smith, Dan W. Altman, Cynthia C. Tee, Evan M. Feldman
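
The page-break factors named in this abstract (blank space left on the preceding page, time elapsed since ink was last captured, and a change in context) could be combined roughly as follows. This is a sketch under stated assumptions: the thresholds, the equal-weight scoring, and all names are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass
class PageState:
    blank_fraction: float          # share of the preceding page left blank at the bottom
    seconds_since_last_ink: float  # time since ink was last captured on that page
    context_changed: bool          # e.g. a different date or subject label was recognized


def should_insert_implicit_break(state, blank_threshold=0.3,
                                 idle_threshold_s=3600.0, min_score=2):
    """Count how many factors fire; insert a break if at least min_score do.

    min_score=1 lets any single factor decide, larger values require
    factors to agree. All defaults are illustrative assumptions.
    """
    score = 0
    if state.blank_fraction > blank_threshold:
        score += 1
    if state.seconds_since_last_ink > idle_threshold_s:
        score += 1
    if state.context_changed:
        score += 1
    return score >= min_score


print(should_insert_implicit_break(PageState(0.5, 7200.0, False)))
```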
  • Patent number: 6978044
    Abstract: This invention compares each character of a first character string with each character of a second character string, votes in a matrix having two sides corresponding to the characters of the first character string and the characters of the second character string, and calculates values of the voting result for respective components arranged in an oblique direction of the matrix. The matching result is determined based on the calculated values of the voting result. As a result, a high-speed and highly precise matching process which is noise-resistant and takes the character arrangement into consideration can be attained.
    Type: Grant
    Filed: March 30, 2004
    Date of Patent: December 20, 2005
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Takuma Akagi
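
The voting scheme in this abstract can be sketched as below. The sketch assumes that a "vote" is cast whenever two characters are equal and that the "oblique direction" means the diagonals of the comparison matrix, so votes are tallied per diagonal offset rather than in an explicit matrix. The function name is hypothetical.

```python
from collections import Counter


def diagonal_vote_match(first, second):
    """Vote for every equal character pair and tally votes per diagonal.

    Cell (i, j) of the comparison matrix lies on diagonal j - i, so summing
    along a diagonal rewards matches that preserve character order.
    Returns (best_offset, votes), or (None, 0) if no characters match.
    """
    votes = Counter()
    for i, a in enumerate(first):
        for j, b in enumerate(second):
            if a == b:
                votes[j - i] += 1
    if not votes:
        return None, 0
    return max(votes.items(), key=lambda kv: kv[1])


print(diagonal_vote_match("recognition", "recogn1tion"))
```

A single misrecognized character only removes one vote from the winning diagonal, which is one way to read the noise resistance the abstract claims.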
  • Patent number: 6975766
    Abstract: A named entity discriminating system capable of discriminating named entities such as location names, personal names, and organization names in text with a high degree of accuracy is provided. A reading means reads text from a hypertext database. A single text analyzing means analyzes each text read by the reading means and detects candidates for the named entity in the text. A complex text analyzing means estimates the likelihood of the candidate named entity detected by the single text analyzing means by an analysis with reference to referring link text or linked text of the text in which the candidate named entity appears.
    Type: Grant
    Filed: September 7, 2001
    Date of Patent: December 13, 2005
    Assignee: NEC Corporation
    Inventor: Toshikazu Fukushima
  • Patent number: 6963665
    Abstract: A form sheet type determining method and apparatus for determining to which of predetermined form sheets an input form sheet corresponds. A plurality of sets of keywords are registered in a keyword register with one set of keywords for each predetermined form sheet type; image data of an input form sheet is read, character strings are extracted from the read image data, and character recognition is performed on each extracted character string; each of the character recognized strings is extracted as a keyword; the extracted keywords are collated, for each form sheet type, with the sets of keywords registered in the keyword register, thereby to determine the type of the input form sheet.
    Type: Grant
    Filed: August 25, 2000
    Date of Patent: November 8, 2005
    Assignee: Hitachi, Ltd.
    Inventors: Atsuhiro Imaizumi, Masato Teramoto, Tsukasa Yasue
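
The keyword collation step in this abstract might look like the following sketch. It assumes character recognition has already produced a list of strings, that collation is a case-insensitive exact match, and that the form type with the most keyword hits wins. The register contents and function name are hypothetical.

```python
def determine_form_type(recognized_strings, keyword_register):
    """Collate recognized strings against each form type's registered keywords.

    keyword_register maps a form type to its set of keywords. Returns the
    type with the most hits, or None when nothing matches.
    """
    extracted = {s.strip().lower() for s in recognized_strings if s.strip()}
    best_type, best_hits = None, 0
    for form_type, keywords in keyword_register.items():
        hits = len(extracted & {k.lower() for k in keywords})
        if hits > best_hits:
            best_type, best_hits = form_type, hits
    return best_type


register = {
    "invoice": {"Invoice No", "Amount Due", "Bill To"},
    "order": {"Order No", "Ship To", "Quantity"},
}
print(determine_form_type(["Invoice No", "Bill To", "Quantity"], register))
```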
  • Patent number: 6950555
    Abstract: In a combined holistic and analytic recognition system, the holistic recognition module will recognize an input word or phrase image by matching an input string of character features for the whole word or phrase against a string of prototype features for a plurality of reference words in a lexicon. This will yield a holistic answer list of recognized word or phrase candidates for the input word or phrase along with a confidence value for each answer on the list. At the same time, based on each answer in the answer list, the holistic recognition module will generate a list of character features and segment the character features into sets for each character in an answer. The analytical recognition module uses segmentation hypotheses from the segmented character feature sets to cut the image of the input string of characters into individual character images.
    Type: Grant
    Filed: February 16, 2001
    Date of Patent: September 27, 2005
    Assignee: Parascript LLC
    Inventors: Alexander Filatov, Igor Kil, Arseni Seregin
  • Patent number: 6944344
    Abstract: A search apparatus searches for a keyword from a character recognition result using an index table, the character recognition result being obtained as a result of character recognition of characters in an original document. The index table includes an index character string; a position of a portion, in the character recognition result, which matches the index character string; and a credibility which is defined for each character included in the index character string and indicates a probability of the character existing in a portion, in the original document, which corresponds to a portion, in the character recognition result, which matches the character.
    Type: Grant
    Filed: June 6, 2001
    Date of Patent: September 13, 2005
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Taro Imagawa, Kenji Kondo, Yoshihiko Matsukawa, Tsuyoshi Mekata
  • Patent number: 6941030
    Abstract: A character-recognition pre-processing apparatus includes extraction means for extracting an image of a character string to be subjected to character recognition; setting means for setting the smallest rectangle that surrounds the character string image extracted; specifying means for specifying the position of each character within the smallest rectangle set by the setting means; detection means for detecting, at each character position specified, the shortest distance between a character region and the lower edge of the smallest rectangle, and the shortest distance between the character region and the upper edge of the smallest rectangle; and judgment means for judging whether the character string extracted is in an upright state or an inverted state, on the basis of variations in the two shortest distances detected.
    Type: Grant
    Filed: November 30, 2000
    Date of Patent: September 6, 2005
    Assignee: PFU Limited
    Inventors: Hiroshi Kakutani, Yasuharu Inami
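
The upright/inverted judgment in this abstract rests on how much the two per-character distances vary. The abstract does not say which pattern indicates which state, so the sketch below adds an assumption: in upright text most characters rest on a common baseline, so the gaps to the lower edge vary less than the gaps to the upper edge. The function name and sample values are hypothetical.

```python
from statistics import pvariance


def is_upright(bottom_gaps, top_gaps):
    """Judge orientation from per-character gaps to the bounding rectangle.

    bottom_gaps[i] / top_gaps[i] are the shortest distances from character i
    to the lower / upper edge of the smallest enclosing rectangle.
    Assumption (not from the abstract): upright text has the steadier bottom.
    """
    return pvariance(bottom_gaps) <= pvariance(top_gaps)


# Gaps in pixels for a five-character string; one descender, three short letters.
print(is_upright([0, 0, 0, 3, 0], [0, 4, 4, 0, 4]))
```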
  • Patent number: 6937766
    Abstract: A method for generating an index of the text of a video image sequence is provided. The method includes the steps of determining the image text objects in each of a plurality of frames of the video image sequence; comparing the image text objects in each of the plurality of frames of the video image sequence to obtain a record of frame sequences having matching image text objects; extracting the content for each of the similar image text objects in text string format; and storing the text string for each image text object as a video text object in a retrievable medium.
    Type: Grant
    Filed: April 13, 2000
    Date of Patent: August 30, 2005
    Assignee: MATE—Media Access Technologies Ltd.
    Inventors: Itzhak Wilf, Joseph Ladkani, Ovadya Menadeva, Hayit Greenspan
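
The frame-sequence record described in this abstract can be sketched as a run-length grouping. For brevity the sketch assumes each frame has already been reduced to one text string (or None when no text object was detected) and that "matching" means string equality; the abstract itself compares image text objects before extracting their content. The function name is hypothetical.

```python
from itertools import groupby


def index_video_text(frame_texts):
    """Collapse consecutive frames showing the same text into index entries.

    Returns a list of (first_frame, last_frame, text) tuples.
    """
    index = []
    frame = 0
    for text, run in groupby(frame_texts):
        length = len(list(run))
        if text is not None:               # skip runs with no detected text
            index.append((frame, frame + length - 1, text))
        frame += length
    return index


frames = ["NEWS", "NEWS", None, "SPORT", "SPORT", "SPORT"]
print(index_video_text(frames))
```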
  • Patent number: 6927774
    Abstract: A character display device and method therefor are adapted to obtain a proximal reference point of each character comprising a character series and calculate display coordinates of each character from said proximal reference point and the display angle and display reference position of the character series.
    Type: Grant
    Filed: December 8, 2000
    Date of Patent: August 9, 2005
    Assignee: Mitsubishi Denki Kabushiki Kaisha
    Inventor: Fumiko Yano
  • Patent number: 6922489
    Abstract: A method of interpreting an image using a statistical or probabilistic interpretation model is disclosed. The image has associated therewith contextual information. The method comprises the following steps: providing the contextual information associated with the image for analysis; analyzing the additional contextual information to identify predetermined features relating to the image; and biasing the statistical or probabilistic interpretation model in accordance with the identified features.
    Type: Grant
    Filed: October 29, 1998
    Date of Patent: July 26, 2005
    Assignees: Canon Kabushiki Kaisha, Canon Information Systems Research Australia Pty. Ltd.
    Inventors: Alison Joan Lennon, Delphine Anh Dao Le
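
Biasing a probabilistic interpretation model with contextual information, as this abstract describes, can be illustrated by reweighting label priors. The sketch assumes the contextual information is a text caption, that the "predetermined features" are keywords, and that biasing means multiplying priors by boost factors and renormalizing. The table, labels, and function name are all hypothetical.

```python
# Hypothetical table: contextual feature -> multiplicative boost per label.
FEATURE_BOOSTS = {
    "beach": {"water": 3.0, "sand": 2.0},
    "forest": {"foliage": 3.0},
}


def biased_priors(priors, context_text):
    """Scale label priors by boosts for features found in the context, then renormalize."""
    lowered = context_text.lower()
    boosts = {}
    for feature, label_boosts in FEATURE_BOOSTS.items():
        if feature in lowered:
            for label, factor in label_boosts.items():
                boosts[label] = boosts.get(label, 1.0) * factor
    weighted = {label: p * boosts.get(label, 1.0) for label, p in priors.items()}
    total = sum(weighted.values())
    return {label: w / total for label, w in weighted.items()}


uniform = {"sky": 0.25, "water": 0.25, "sand": 0.25, "foliage": 0.25}
print(biased_priors(uniform, "Beach holiday"))
```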
  • Patent number: 6917708
    Abstract: A method of automatically recognizing text. The text is divided into whole words which are each recognized. Each whole word is characterized according to its silhouette. The silhouette is characterized by features in the silhouette such as upwardly extending “polls” and downwardly extending “holes”. The silhouette may also be characterized by its first syllable blends. Numbers are assigned to each of the different characteristics, and numbers may also be assigned based on analysis of a database of different kinds of cursive words. Recognition may be automatically carried out by a recognizing system which recognizes in this way.
    Type: Grant
    Filed: January 19, 2001
    Date of Patent: July 12, 2005
    Assignee: California Institute of Technology
    Inventors: Rodney M. Goodman, Donal J. Woods, Patricia A. Keaton, Joseph Chen