Context Analysis Or Word Recognition (e.g., Character String) Patents (Class 382/229)
-
Publication number: 20080002893
Abstract: Methods, systems, and apparatus including computer program products for recognizing text in images are provided. In one implementation, a computer-implemented method for recognizing text in an image is provided. The method includes receiving a plurality of images. The method also includes processing the images to detect a corresponding set of regions of the images, each image having a region corresponding to each other image region, as potentially containing text. The method further includes combining the regions to generate an enhanced region image and performing optical character recognition on the enhanced region image.
Type: Application
Filed: June 29, 2006
Publication date: January 3, 2008
Inventors: Luc Vincent, Adrian Ulges
-
Publication number: 20080002916
Abstract: Methods, systems, and apparatus including computer program products for using extracted image text are provided. In one implementation, a computer-implemented method is provided. The method includes receiving an input of one or more image search terms and identifying keywords from the received one or more image search terms. The method also includes searching a collection of keywords including keywords extracted from image text, retrieving an image associated with extracted image text corresponding to one or more of the image search terms, and presenting the image.
Type: Application
Filed: June 29, 2006
Publication date: January 3, 2008
Inventors: Luc Vincent, Adrian Ulges
-
Patent number: 7313277
Abstract: A method for recognition of a handwritten pattern comprises the steps of forming (4) a representation of the handwritten pattern, forming (6) at least two sub-configurations by dividing the representation of the handwritten pattern, and processing the subconfigurations. The step of processing comprises the steps of comparing (8) each subconfiguration with reference configurations, selecting (10) at least one subconfiguration candidate for each subconfiguration among the reference configurations based on said step of comparing, and determining (12) at least one candidate pattern consisting of one selected subconfiguration candidate for each subconfiguration. The method further comprises the steps of comparing (14) the representation of the handwritten pattern to the candidate pattern, and computing (16) a cost function in order to find a closest matching candidate pattern.
Type: Grant
Filed: February 8, 2002
Date of Patent: December 25, 2007
Assignee: Zi Decuma AB
Inventors: Jonas Morwing, Gunnar Sparr
-
Publication number: 20070286486
Abstract: A system for automatically recognizing a handwriting image and converting such image to text data including a sequence of validated words, has an image input device, a number of handwriting recognition engines, and a control unit. A first handwriting recognition engine is responsive to the image input device, for analyzing the data file and providing one or more possible text words for each successive word in the data file. The first handwriting recognition engine further provides a resemblance indication for each possible text word indicating a level of resemblance between its appearance and the appearance of the handwritten word in the data file. In the event that there is not a high level of confidence in the selection of the first handwriting recognition engine, a selection of a validated word is based on the selections of one or more of the other handwriting recognition engines.
Type: Application
Filed: May 30, 2006
Publication date: December 13, 2007
Inventor: Ira P. Goldstein
-
Patent number: 7305435
Abstract: A mobile device, system, method, and software for communicating with the internet utilizing a written universal resource locator (URL). A camera unit is used to receive a raw visual light image containing a written URL, the raw visual light image is converted to an electronic image, and the device locates glyphs of at least one particular standardized set of URL characters in the electronic image, for example glyphs corresponding to www. Then the URL characters are extracted from the electronic image, the URL is sent in a request signal to a web server, and in response an internet site is presented. The mobile device includes initiation means for sending an instruction to obtain a raw visual light image that includes glyphs of at least one particular set of characters, such as www, and further includes a camera, a display, and an internet interface. The mobile device processes an electronic image signal provided by the camera, in order to obtain the web site signal from the internet interface.
Type: Grant
Filed: August 25, 2003
Date of Patent: December 4, 2007
Assignee: Nokia Corporation
Inventor: Kimmo Hämynen
-
Patent number: 7302343
Abstract: Methods are disclosed for encoding latitude/longitude coordinates within a URL in a relatively compact form. The method includes converting latitude and longitude coordinates from floating-point numbers to non-negative integers. A set of base-N string representations are generated for the integers (N represents the number of characters in an implementation-defined character set being utilized). The latitude string and longitude string are then concatenated to yield a single output string. The output string is utilized as a geographic indicator with a URL.
Type: Grant
Filed: July 31, 2003
Date of Patent: November 27, 2007
Assignee: Microsoft Corporation
Inventor: Bryan Beatty
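The encoding steps in this abstract (float to non-negative integer, integer to base-N string, concatenation) can be illustrated with a minimal sketch. This is not the patented implementation: the 36-character alphabet, the five-decimal-place scale factor, the fixed width of five characters per coordinate, and all function names are assumptions chosen for illustration.

```python
import string

# Hypothetical implementation-defined character set; N is its size (36 here).
ALPHABET = string.digits + string.ascii_lowercase
N = len(ALPHABET)
SCALE = 100_000  # assumed precision: five decimal places of a degree
WIDTH = 5        # assumed fixed width; 36**5 exceeds the largest scaled value


def to_base_n(value: int) -> str:
    """Render a non-negative integer as a fixed-width base-N string."""
    if not 0 <= value < N ** WIDTH:
        raise ValueError("value out of range for the chosen width")
    digits = []
    for _ in range(WIDTH):
        value, remainder = divmod(value, N)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def from_base_n(text: str) -> int:
    """Inverse of to_base_n."""
    value = 0
    for ch in text:
        value = value * N + ALPHABET.index(ch)
    return value


def encode(lat: float, lon: float) -> str:
    """Shift both coordinates to be non-negative, scale to integers, concatenate."""
    lat_int = round((lat + 90.0) * SCALE)
    lon_int = round((lon + 180.0) * SCALE)
    return to_base_n(lat_int) + to_base_n(lon_int)


def decode(token: str) -> tuple[float, float]:
    """Recover approximate coordinates from the concatenated string."""
    lat = from_base_n(token[:WIDTH]) / SCALE - 90.0
    lon = from_base_n(token[WIDTH:]) / SCALE - 180.0
    return lat, lon
```

Because each half has a fixed width, the concatenated string can be split without a separator, and the alphabet contains only characters that need no escaping in a URL.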
-
Patent number: 7289668
Abstract: Methods and systems for document image decoding incorporating a Stack algorithm improve document image decoding. The application of the Stack algorithm is iterated to improve decoding. A provisional weight is determined for a partial path to reduce template matching. In addition, semantically equivalent hypotheses are identified to reduce redundant hypotheses.
Type: Grant
Filed: August 9, 2002
Date of Patent: October 30, 2007
Assignee: Xerox Corporation
Inventors: Daniel H. Greene, Tze-Lei Poo, Ashok C. Popat
-
Patent number: 7283126
Abstract: A touch system comprises a touch panel having a touch surface and a projector presenting images onto the touch surface. A computer executing an applications program is coupled to the touch panel and the projector. The computer is responsive to contact output generated by the touch panel in response to proximity of a pointer to the touch surface and updates image data conveyed to the projector so that images presented on the touch surface reflect pointer activity. The computer executes a gesture suggestion and writing recognition routine. The gesture suggestion and writing recognition routine performs recognition to convert ink on the touch surface into another object based on a selected interpretation.
Type: Grant
Filed: June 12, 2002
Date of Patent: October 16, 2007
Assignee: Smart Technologies Inc.
Inventor: Andy Leung
-
Publication number: 20070206884
Abstract: There is provided an image processing apparatus including a character recognition section that executes character recognition on an input document image and outputs a character recognition result, an item name extraction section that extracts a character string relevant to an item name of an information item from the character recognition result, an item value extraction section that extracts a character string of an item value corresponding to the item name from the vicinity of the character string relevant to the item name in the document image, and an extraction information creation section that creates extraction information by associating the character string of the item value extracted by the item value extraction section to the item name.
Type: Application
Filed: August 29, 2006
Publication date: September 6, 2007
Inventor: Masahiro Kato
-
Patent number: 7251367
Abstract: A system augments stylus keyboarding with shorthand gesturing. The system defines a shorthand symbol for each word according to its movement pattern on an optimized stylus keyboard. The system recognizes word patterns by identifying an input as a stroke, and then matching the stroke to a stored list of word patterns. The system then generates and displays the matched word to the user.
Type: Grant
Filed: December 20, 2002
Date of Patent: July 31, 2007
Assignee: International Business Machines Corporation
Inventor: Shumin Zhai
-
Patent number: 7240062
Abstract: Multiple recognition engines (110) provide different interpretations (116) of a word at a given location within a scanned document (108). A word node corresponding to each unique interpretation is stored within a word index (102), with each word node being linked to word nodes of previously and subsequently recognized words.
Type: Grant
Filed: March 6, 2001
Date of Patent: July 3, 2007
Assignee: iArchives, Inc.
Inventors: Timothy L. Andersen, Frederick A. Zarndt, Robert B. Wille, Michael E. Rimer, Michael U. Bailey, G. Bret Millar, E. Derek Rowley
-
Publication number: 20070140568
Abstract: A history management apparatus includes: a detection unit that detects a character or a word included in image data which is processed in an image processing apparatus; and a storage unit that stores information to specify a detection position of the detected character or word in the image data and the image data in association with each other. The stored information is provided to an image data searching process using the character or the word and the detecting position as search conditions.
Type: Application
Filed: June 13, 2006
Publication date: June 21, 2007
Applicant: Fuji Xerox Co., Ltd.
Inventors: Yoshihide Kohtani, Ayumi Segi
-
Patent number: 7221800
Abstract: Text that is adjacent to predetermined indicia is detected in a digital image. Alternatively, or in addition, a digital image can have text that is adjacent to a predefined insertion field. Text that is input and/or derived from an optically scanned image is substituted in the digital image for the predetermined indicia and/or for the predefined insertion field. The substituted text matches the font of text adjacent thereto in the digital image. The digital image having the substitution is rendered.
Type: Grant
Filed: August 29, 2003
Date of Patent: May 22, 2007
Assignee: Hewlett-Packard Development Company, L.P.
Inventors: Robert Sesek, Travis J. Parry, Chad A. Stevens
-
Patent number: 7219310
Abstract: Glyph instructions are formed which are understandable by a person following the instructions, irrespective of which written language is understood by the person. The glyph instructions follow defined grammar and syntax rules. A plurality of action glyphs are used to represent a plurality of defined actions capable of being undertaken by the person following the instructions. A plurality of material glyphs are defined to represent a plurality of materials which are includable as part of the instruction, and a plurality of instrumentation glyphs are defined to represent a plurality of instruments which may be included in the instructions. Selected ones of the action glyphs, material glyphs and instrumentation glyphs are arranged in relationship to each other in accordance with the predetermined grammar and syntax to form specific instructions understandable by the person following the instruction, irrespective of the written language which is understood by the person.
Type: Grant
Filed: November 5, 2001
Date of Patent: May 15, 2007
Assignee: Xerox Corporation
Inventors: Jesus Santoyo Ortega, Jose Luis Duenas, Rosa Elena Castillo, Jésus Esquivel, Hugo C. Correa, Mauricio Campos, Salvador De Luna, Gilberto Esparza
-
Patent number: 7200271
Abstract: In accordance with this invention, a method, computer program product, and system for performing automated recognition of blocks of text within a graphic file are provided. The method, computer program product, and system automatically transform drawings into a graphic file format that provides enriched electronic display and text search of the graphic files. The text within large sets of drawings, parts catalogs, and various manuals is automatically discovered, extracted and indexed by the geometric location of the text within the graphic file. Single lines of text and blocks of text are recognized by utilizing geometric reasoning techniques based upon proximity and font characteristics. As such, the present invention automatically produces an interactive electronic representation of a graphic file that allows a user to quickly and accurately search graphic files for particular text, whether the text appears in a single line or over multiple lines in a block of text.
Type: Grant
Filed: October 4, 2001
Date of Patent: April 3, 2007
Assignee: The Boeing Company
Inventors: John H. Boose, Lawrence S. Baum, Molly L. Boose
-
Patent number: 7197185
Abstract: A logical separation between pages, such as an implicit page break, is introduced to separate text entered during one handwriting session from text entered during another handwriting session. The amount of time elapsed since ink has been captured on the previous page is a factor that may be used to determine whether to insert an implicit page break into the new page. A change in context, such as a different date or different recognized subject matter labels, is also a factor that may be considered in determining whether to insert an implicit page break.
Type: Grant
Filed: January 19, 2006
Date of Patent: March 27, 2007
Assignee: Microsoft Corporation
Inventors: Charlton E Lui, Anthony S Smith, Dan W Altman, Cynthia C Tee, Evan M Feldman
-
Patent number: 7190833
Abstract: The invention relates to a mobile device with a built-in image capture device, and a character recognition function to present the information gathered with the character recognition result. With the mobile device, the character line extraction process is displayed whenever necessary, and the resolution of an image to be inputted for recognition processing is enhanced. Accordingly, it is possible for the operator to select the target character line with ease. In addition, the mobile device has a character recognition ratio improved by the enhancement in resolution.
Type: Grant
Filed: July 23, 2002
Date of Patent: March 13, 2007
Assignee: Hitachi, Ltd.
Inventors: Tatsuhiko Kagehiro, Minenobu Seki, Hiroshi Sako
-
Patent number: 7181068
Abstract: A mathematical expression recognizing device comprises a character recognition unit which recognizes characters in a document image, a dictionary storing a pair of evaluation scores for each type of word, the score showing the possibility of belonging to the text and that of belonging to the mathematical expression, an evaluation unit which obtains the evaluation scores showing the possibility of belonging to the text and that of belonging to the mathematical expression for each of the words included in the recognized characters with reference to the dictionary, and a mathematical expression detecting unit which searches for an optimal path connecting words by selecting one of the text and the mathematical expression based on a formative grammar and the evaluation scores showing the possibility of belonging to the text and that of belonging to the mathematical expression for each of the words, thereby detecting characters belonging to the mathematical expression.
Type: Grant
Filed: March 5, 2002
Date of Patent: February 20, 2007
Assignees: Kabushiki Kaisha Toshiba
Inventors: Masakazu Suzuki, Kazuaki Yokota, Yuko Eto
-
Patent number: 7181067
Abstract: In computerized recognition having multiple experts, a method and system is described that obtains an optimum value for an expert tuning parameter in a single pass over sample tuning data. Each tuning sample is applied to two experts, resulting in scores from which ranges of parameters that correct incorrect recognition errors without changing correct results for that sample are determined. To determine the range data for a given sample, the experts return scores for each prototype in a database, the scores separated into matching and non-matching scores. The matching and non-matching scores from each expert are compared, providing upper and lower bounds defining ranges. Maxima and minima histograms track upper and lower bound range data, respectively. An analysis of the histograms based on the full set of tuning samples provides the optimum value. For tuning multiple parameters, each parameter may be optimized by this method in isolation, and then iterated.
Type: Grant
Filed: February 2, 2005
Date of Patent: February 20, 2007
Assignee: Microsoft Corporation
Inventor: Gregory N. Hullender
-
Patent number: 7167588
Abstract: Methods and systems for document image decoding incorporating a Stack algorithm improve document image decoding. The application of the Stack algorithm is iterated to improve decoding. A provisional weight is determined for a partial path to reduce template matching. In addition, semantically equivalent hypotheses are identified to reduce redundant hypotheses.
Type: Grant
Filed: August 9, 2002
Date of Patent: January 23, 2007
Assignee: Xerox Corporation
Inventors: Daniel H. Greene, Justin Romberg, Ashok C. Popat
-
Patent number: 7164798
Abstract: Systems and methods for learning-based automatic commercial content detection are described. In one aspect, program data is divided into multiple segments. The segments are analyzed to determine visual, audio, and context-based feature sets that differentiate commercial content from non-commercial content. The context-based features are a function of single-side left and/or right neighborhoods of segments of the multiple segments.
Type: Grant
Filed: February 18, 2003
Date of Patent: January 16, 2007
Assignee: Microsoft Corporation
Inventors: Xian-Sheng Hua, Lie Lu, Mingjing Li, Hong-Jiang Zhang
-
Patent number: 7155061
Abstract: In a computing device, a method and system for searching for matching ink words or phrases, by comparing a given search term of at least one word (and possibly alternates) with the words in a document, including recognized ink words and any possible alternates for those recognized words as returned by a recognizer. Various matching tests are possible because of the use of alternates, which also may have corresponding probability rankings that may influence the search. Searching may occur in actively edited ink documents, or the recognition results may be saved as saved search file data that can be searched independent of recognition.
Type: Grant
Filed: June 21, 2002
Date of Patent: December 26, 2006
Assignee: Microsoft Corporation
Inventors: Charlton E. Lui, Gregory H. Manto, Vikram Madan, Ryan E. Cukierman, Jon E. Clark
-
Patent number: 7142716
Abstract: A document image search apparatus generates a text by performing the character recognition of a document image and determines a re-process scope. Then, the apparatus generates a candidate character lattice from the re-recognition result of the re-process scope, generates character strings from the candidate character lattice and adds the character strings to the text. Then, the apparatus performs index search using the text with the character strings added.
Type: Grant
Filed: September 12, 2001
Date of Patent: November 28, 2006
Assignee: Fujitsu Limited
Inventors: Yutaka Katsuyama, Satoshi Naoi, Fumihito Nishino
-
Patent number: 7136530
Abstract: A method and apparatus for extracting information from symbolically compressed document images. A deciphering module generates first and second text strings by deciphering respective sequences of template identifiers in first and second symbolically compressed document images. A conditional n-gram module receives the first and second text strings from the deciphering module and extracts n-gram terms therefrom based on a predicate condition. A comparison module generates a measure of similarity between the first and second symbolically compressed document images based on the n-gram terms extracted by the conditional n-gram module.
Type: Grant
Filed: September 30, 2003
Date of Patent: November 14, 2006
Assignee: Ricoh Co., Ltd.
Inventors: Dar-Shyang Lee, Jonathan J. Hull
-
Patent number: 7130470
Abstract: A method and system for context-based sorting of character strings. A first sorting weight of a current character of a character string is determined from a first table. The first sorting weight is stored. Provided the current character is a predetermined character, a second table is accessed. A second sorting weight of the current character is determined from the location of a preceding character within the second table. The first sorting weight is replaced with the second sorting weight for the current character. Embodiments of the present invention provide an efficient method of context-based sorting in languages, such as Japanese, where the sorting weight of a character can be altered by the preceding character.
Type: Grant
Filed: March 15, 2002
Date of Patent: October 31, 2006
Assignee: Oracle International Corporation
Inventor: Ching Lan Ho
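The two-table lookup in this abstract can be sketched as follows. The tables are toy data invented for illustration, not the patent's tables: the sketch assumes the katakana prolonged sound mark "ー" is the only context-sensitive character and that it takes the weight of the vowel implied by the preceding kana.

```python
# Hypothetical first table: context-free sorting weight for each character.
FIRST_TABLE = {"ア": 1, "イ": 2, "カ": 6, "キ": 7, "ー": 99}

# Hypothetical set of "predetermined" characters whose weight depends on context.
CONTEXT_SENSITIVE = {"ー"}

# Hypothetical second table, indexed by the PRECEDING character: the weight
# that a context-sensitive character takes after it (the weight of its vowel).
SECOND_TABLE = {"ア": 1, "カ": 1, "イ": 2, "キ": 2}


def sort_key(text: str) -> list[int]:
    """Build a list of sorting weights, applying the context rule per character."""
    weights = []
    previous = None
    for ch in text:
        weight = FIRST_TABLE[ch]              # first sorting weight
        if ch in CONTEXT_SENSITIVE and previous in SECOND_TABLE:
            weight = SECOND_TABLE[previous]   # replaced by the second weight
        weights.append(weight)
        previous = ch
    return weights
```

With these tables, `sorted(words, key=sort_key)` places "カー" before "カイ", whereas the context-free weight of 99 would have placed it after.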
-
Patent number: 7124130
Abstract: The present invention is directed to an address recognition apparatus for recognizing a written address. The apparatus includes an input device that receives a scanned image of the written address and transforms the image into digital data, a character recognizing section that recognizes a word string in the digital data on a unit character basis, a word extracting section that extracts characters recognized by the character recognizing section on a unit word basis, and an address word string dictionary that previously stores a plurality of first word strings. The apparatus further includes an address word string recognizing section that collates a second word string, determines words of the second word string respectively corresponding to the words of the first word string, evaluates each of the first word strings, and recognizes one of the first word strings as the address word string.
Type: Grant
Filed: September 4, 2003
Date of Patent: October 17, 2006
Assignee: Kabushiki Kaisha Toshiba
Inventor: Naotake Natori
-
Patent number: 7120302
Abstract: The present invention embodies a character recognition method for constructing a result string from a plurality of result sets. Each result set comprises at least one candidate character, and each candidate character has an associated confidence indication. The method can begin by selecting a plurality of character types. For each selected character type, a candidate string can be created by concatenating a candidate character of the selected character type from each result set. The associated confidence indication for each concatenated candidate character can be combined to form a corresponding combined confidence indication for each created candidate string. The created candidate string with the most favorable corresponding combined confidence indication can be selected as the result string.
Type: Grant
Filed: July 31, 2001
Date of Patent: October 10, 2006
Assignee: RAF Technology, Inc.
Inventor: Stephen E. M. Billester
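A minimal sketch of the selection procedure this abstract describes, under stated assumptions: the two character types (digits and letters), the use of a product to combine confidences, and the rule of taking the highest-confidence candidate of each type per position are illustrative choices, not details confirmed by the abstract.

```python
# Hypothetical character types, each defined by a membership test.
CHARACTER_TYPES = {"digit": str.isdigit, "alpha": str.isalpha}


def best_result_string(result_sets):
    """Pick the single-type candidate string with the highest combined confidence.

    result_sets holds one list per character position; each list contains
    (candidate_character, confidence) pairs with confidence in (0, 1].
    Returns (string, combined_confidence), or None if no type covers
    every position.
    """
    best = None
    for is_type in CHARACTER_TYPES.values():
        chars = []
        combined = 1.0
        for candidates in result_sets:
            typed = [(ch, conf) for ch, conf in candidates if is_type(ch)]
            if not typed:
                break  # this type cannot fill this position; skip the type
            ch, conf = max(typed, key=lambda pair: pair[1])
            chars.append(ch)
            combined *= conf  # assumed combination rule: product
        else:
            if best is None or combined > best[1]:
                best = ("".join(chars), combined)
    return best


# Example: three positions, each ambiguous between a digit and a letter.
example_sets = [
    [("O", 0.6), ("0", 0.5)],
    [("1", 0.9), ("l", 0.3)],
    [("5", 0.8), ("S", 0.7)],
]
result = best_result_string(example_sets)
```

Picking the top candidate independently at each position would give the mixed string "O15"; constraining every position to one character type yields the all-digit string "015" instead, since its combined confidence (0.36) beats the all-letter string "OlS" (0.126).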
-
Patent number: 7110607
Abstract: A multilevel image into which a color or a black-and-white image is converted is input, and a slightly indistinct binary image generating unit generates a slightly indistinct binary image that includes a slightly indistinct line pattern and does not include background noise. Additionally, a shape-preserved binary image generating unit generates a binary image that preserves the shape of a line pattern and includes background noise. These images are ANDed for each pixel, so that a binary image that preserves the shape of the line pattern and does not include the background noise is generated.
Type: Grant
Filed: November 28, 2001
Date of Patent: September 19, 2006
Assignee: Fujitsu Limited
Inventors: Katsuhito Fujimoto, Atsuko Ohara, Satoshi Naoi
-
Patent number: 7110998
Abstract: The querying application of the present invention provides method and apparatus to find related queries having the greatest-valued and/or least-valued results. Elements (or inputs) of the related queries overlap with the elements of the user query. Preferably, the querying application of the present invention enables the user to trace other queries having the greatest-valued and/or least-valued results that overlap other elements of the user query. In accordance with another aspect of the present invention, the querying application of the present invention provides method and apparatus to find non-related queries having the greatest-valued and/or least-valued results.
Type: Grant
Filed: October 12, 1999
Date of Patent: September 19, 2006
Assignee: Virtual Gold, Inc.
Inventors: Inderpal S. Bhandari, Rajiv Pratap, Krishnakumar Ramanujam
-
Patent number: 7106905
Abstract: Systems and methods for processing text-based electronic documents are provided. Briefly described, one embodiment of a method for processing a text-based electronic document comprises the steps of: comparing at least one word in a text-based electronic document to a native language dictionary to determine whether the at least one word conforms to a predefined rule; for each of the at least one word that does not conform to the predefined rule, fragmenting the at least one word into word fragments; combining at least two consecutive word fragments; and comparing the combination of the word fragments to the native language dictionary.
Type: Grant
Filed: August 23, 2002
Date of Patent: September 12, 2006
Assignee: Hewlett-Packard Development Company, L.P.
Inventor: Steven J. Simske
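The fragment-and-recombine steps in this abstract can be sketched as below. The abstract does not say how words are fragmented or what the predefined rule is, so this sketch assumes that the rule is plain dictionary membership, that fragments are the letter runs between non-letter characters, and that the longest matching combination is preferred. The dictionary contents and function name are hypothetical.

```python
import re

# Hypothetical stand-in for a native language dictionary.
DICTIONARY = {"recognition", "character", "optical"}


def repair_word(word: str, dictionary=DICTIONARY) -> list[str]:
    """Return the word unchanged if it is in the dictionary; otherwise try to
    rebuild a dictionary word from two or more consecutive fragments."""
    if word.lower() in dictionary:
        return [word]
    # Assumed fragmenting rule: split on any run of non-letter characters.
    fragments = [f for f in re.split(r"[^A-Za-z]+", word) if f]
    count = len(fragments)
    # Try combinations of consecutive fragments, longest run first.
    for size in range(count, 1, -1):
        for start in range(count - size + 1):
            combined = "".join(fragments[start:start + size])
            if combined.lower() in dictionary:
                return fragments[:start] + [combined] + fragments[start + size:]
    return [word]  # no combination matched; leave the word as it was
```

For example, `repair_word("recog-nition")` splits the token into "recog" and "nition", joins them, and finds "recognition" in the dictionary.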
-
Patent number: 7092574
Abstract: A mathematical expression recognizing device comprises a character recognition unit which recognizes characters in a document image, a dictionary storing a pair of evaluation scores for each type of word, the score showing the possibility of belonging to the text and that of belonging to the mathematical expression, an evaluation unit which obtains the evaluation scores showing the possibility of belonging to the text and that of belonging to the mathematical expression for each of the words included in the recognized characters with reference to the dictionary, and a mathematical expression detecting unit which searches for an optimal path connecting words by selecting one of the text and the mathematical expression based on a formative grammar and the evaluation scores showing the possibility of belonging to the text and that of belonging to the mathematical expression for each of the words, thereby detecting characters belonging to the mathematical expression.
Type: Grant
Filed: March 5, 2002
Date of Patent: August 15, 2006
Assignees: Kabushiki Kaisha Toshiba
Inventors: Masakazu Suzuki, Kazuaki Yokota, Yuko Eto
-
Patent number: 7092567
Abstract: A method of post-processing character data from an optical character recognition (OCR) engine and apparatus to perform the method. This exemplary method includes segmenting the character data into a set of initial words. The set of initial words is word level processed to determine at least one candidate word corresponding to each initial word. The set of initial words is segmented into a set of sentences. Each sentence in the set of sentences includes a plurality of initial words and candidate words corresponding to the initial words. A sentence is selected from the set of sentences. The selected sentence is word disambiguity processed to determine a plurality of final words. A final word is selected from the at least one candidate word corresponding to a matching initial word. The plurality of final words is then assembled as post-processed OCR data.
Type: Grant
Filed: November 4, 2002
Date of Patent: August 15, 2006
Assignee: Matsushita Electric Industrial Co., Ltd.
Inventors: Yue Ma, Jinhong Katherine Guo, Mu Li, Yu-kun Tong, Tian-shun Yao, Jing-bo Zhu
-
Patent number: 7081975
Abstract: The present invention employs an instruction button for inputting an instruction to simultaneously execute registration, in a full-text-search database, of a document scanned by one scanning operation and subjected to OCR processing, and registration of the document in a designated folder contained in a database of a folder tree structure. This single instruction button enables the user to simultaneously register read images in a folder, and text data resulting from OCR processing executed on the read images.
Type: Grant
Filed: September 7, 2004
Date of Patent: July 25, 2006
Assignees: Kabushiki Kaisha Toshiba, Toshiba Tec Kabushiki Kaisha
Inventors: Nobuhisa Yoda, Tatsuya Haraguchi
-
Patent number: 7047238
Abstract: Disclosed are a document retrieval method and system for separately performing a process for correcting erroneously recognized characters existing in characteristic character strings within a seed document or the documents to be registered and a process for tolerating erroneously recognized characters existing in the documents targeted for retrieval. The process for correcting erroneously recognized characters existing in characteristic character strings extracts characteristic character strings from a read document, replaces the extracted characteristic character strings containing erroneously recognized characters with character strings appropriate for document retrieval, and selects characteristic character strings for use in actual document retrieval.
Type: Grant
Filed: February 21, 2003
Date of Patent: May 16, 2006
Assignees: Hitachi, Ltd., Hitachi Systems & Services, Ltd.
Inventors: Katsumi Tada, Hisashi Takatori
-
Patent number: 7039637
Abstract: An evaluator system accepts input textual messages in unknown languages and assesses which character sets, corresponding to languages, match that message. Textual messages whose individual characters are encoded in 16 bit Unicode or other universal format are parsed, and character sets which can express each character and the accumulated correspondence is logged. When the character sets against which the message is being tested only provide partial matches, the invention can determine which offers the best fit, including by means of a weighting function. The evaluation technology of the invention can be applied to multipart documents, and to search engines and indices. Documents can be indexed according to assigned character sets, and query strings matched to indices according to language.
Type: Grant
Filed: August 27, 1999
Date of Patent: May 2, 2006
Assignee: International Business Machines Corporation
Inventors: Brendan P. Murray, Kuniaki Takizawa
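The per-character evaluation in this abstract can be sketched with Python's standard codecs. This is an illustration under assumptions, not the patented evaluator: the candidate list, the uniform weight of one per character, and the tie-breaking rule (earlier entries in the list win) are all choices made for the sketch.

```python
# Hypothetical candidate character sets, ordered from narrowest to broadest
# so that ties are resolved in favor of the earlier (narrower) entry.
CANDIDATE_CHARSETS = ["ascii", "latin-1", "iso8859_7", "koi8_r", "shift_jis"]


def can_encode(ch: str, charset: str) -> bool:
    """True if the character set can express this single character."""
    try:
        ch.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def score_charsets(message: str) -> dict[str, int]:
    """Accumulate, per character set, how many characters it can express."""
    return {
        charset: sum(can_encode(ch, charset) for ch in message)
        for charset in CANDIDATE_CHARSETS
    }


def best_charset(message: str) -> str:
    """Return the best-fitting character set, even when no set matches fully."""
    scores = score_charsets(message)
    # max() returns the first maximal element, so list order breaks ties.
    return max(CANDIDATE_CHARSETS, key=lambda charset: scores[charset])
```

A weighting function, as the abstract mentions, could replace the uniform count, for instance by giving rarer characters more influence on the score.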
-
Patent number: 7039240
Abstract: Methods and systems for document image decoding incorporating a Stack algorithm improve document image decoding. The application of the Stack algorithm is iterated to improve decoding. A provisional weight is determined for a partial path to reduce template matching. In addition, semantically equivalent hypotheses are identified to reduce redundant hypotheses.
Type: Grant
Filed: August 9, 2002
Date of Patent: May 2, 2006
Assignee: Xerox Corporation
Inventors: Daniel H. Greene, Justin K Romberg, Tze-Lei Poo, Ashok C. Popat
-
Patent number: 7031521
Abstract: A logical separation between pages, such as an implicit page break, is introduced to separate text entered during one handwriting session from text entered during another handwriting session. The amount of time elapsed since ink has been captured on the previous page is a factor that may be used to determine whether to insert an implicit page break into the new page. A change in context, such as a different date or different recognized subject matter labels, is also a factor that may be considered in determining whether to insert an implicit page break.
Type: Grant
Filed: October 1, 2004
Date of Patent: April 18, 2006
Assignee: Microsoft Corporation
Inventors: Charlton E. Lui, Anthony S. Smith, Dan W. Altman, Cynthia C. Tee, Evan M. Feldman
-
Patent number: 7031002
Abstract: A system and method of using character set matching to identify the matching or best-matching font to print text of indeterminate language are presented. Today's operating systems do not provide the native tools and functions to easily display text of unknown language or multiple languages. The complexity of any underlying code that handles a multilingual display is sharply increased due to the text being segmented into multiple text runs. The invention employs a character set engine that provides necessary character set guessing functionality, as well as an enumerator module to build a linked list of suitable output fonts to display text from an arbitrary language, and multilingual text. Output on a laser, inkjet or other printing apparatus can be granted by traversing that list.
Type: Grant
Filed: August 27, 1999
Date of Patent: April 18, 2006
Assignee: International Business Machines Corporation
Inventor: David D. Taieb
-
Patent number: 7024042Abstract: The capacity of a character feature dictionary is reduced, and stored as a feature dictionary. The capacity is reduced by clustering feature vectors in units of columns or rows for character features, by making m column vectors represent the column or row features, and by assigning 1 to m identification numbers. The capacity of the dictionary can be further reduced by representing a column or row feature with an addition sum of other column or row features, or differential features after clustering is performed, or by performing dimension compression for character features. Word recognition is performed by synthesizing a word feature for a comparison based on a word list to be recognized, and by making a comparison between a feature extracted from an input word and the synthesized feature. Or, a comparison between input word and input word features whose numbers of dimensions are different may be made with nonlinear elastic matching.Type: GrantFiled: September 12, 2001Date of Patent: April 4, 2006Assignee: Fujitsu LimitedInventor: Yoshinobu Hotta
-
Patent number: 6990237Abstract: A logical separation between pages, such as an implicit page break, is introduced to separate text entered during one handwriting session from text entered during another handwriting session. If the user leaves more than a threshold amount of blank space at the bottom of the page immediately preceding the new page, then an implicit page break may be inserted at the beginning of the new page. The amount of blank space left at the end of the preceding page may be combined with other criteria to determine whether to insert an implicit page break. The amount of time elapsed since ink has been captured on the previous page is another factor that may be used by itself or combined with other factors to determine whether to insert an implicit page break into the new page. A change in context, such as a different date or different recognized subject matter labels, is also a factor that may be considered in determining whether to insert an implicit page break.Type: GrantFiled: June 14, 2004Date of Patent: January 24, 2006Assignee: Microsoft CorporationInventors: Charlton E. Lui, Anthony S. Smith, Dan W. Altman, Cynthia C. Tee, Evan M. Feldman
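The factors named in this abstract (blank space left on the preceding page, time since ink was last captured, and a change in context) can be sketched as a simple decision function. This is only an illustration: the `PageState` fields, the threshold values, and the choice to combine the factors with a plain logical OR are all assumptions, not details taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class PageState:
    """Hypothetical summary of the page preceding the new page."""
    blank_fraction_at_bottom: float  # share of the page left empty, 0.0 to 1.0
    seconds_since_last_ink: float    # time since ink was last captured
    context_changed: bool            # e.g. a different date or subject label


def should_insert_page_break(state: PageState,
                             blank_threshold: float = 0.25,
                             idle_threshold_s: float = 3600.0) -> bool:
    """Return True if any single factor suggests a new handwriting session.

    Both thresholds are illustrative defaults, and OR-ing the factors is
    an assumption; the abstract only says they may be combined.
    """
    if state.blank_fraction_at_bottom > blank_threshold:
        return True
    if state.seconds_since_last_ink > idle_threshold_s:
        return True
    return state.context_changed
```

A weighted score over the three factors would be an equally plausible reading of "combined with other criteria"; the OR form is used here only because it is the shortest to state.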
-
Patent number: 6978044Abstract: This invention is to compare each character of a first character string with each character of a second character string, vote for a matrix having two sides corresponding to the characters of the first character string and the characters of the second character string and calculate values of the voting result for respective components arranged in an oblique direction of the matrix. The matching result is determined based on the calculated values of the voting result. As a result, a high-speed and highly precise matching process which is noise-resistant and takes the character arrangement into consideration can be attained.Type: GrantFiled: March 30, 2004Date of Patent: December 20, 2005Assignee: Kabushiki Kaisha ToshibaInventor: Takuma Akagi
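The voting scheme in this abstract can be illustrated with a short sketch. Every pair of equal characters casts a vote into a cell of a matrix whose two sides are the two strings, and votes are then summed along each oblique (diagonal) line. The function name `diagonal_vote_score`, the exact-equality comparison, and the tie-breaking rule are assumptions made for this example, not details from the patent.

```python
from collections import Counter


def diagonal_vote_score(first: str, second: str) -> tuple[int, int]:
    """Return (offset, votes) for the diagonal that collects the most votes.

    Cell (i, j) of the conceptual matrix gets a vote when first[i] equals
    second[j]. Cells on the same diagonal share the value j - i, so the
    votes are accumulated per offset instead of in a full matrix.
    """
    votes = Counter()
    for i, a in enumerate(first):
        for j, b in enumerate(second):
            if a == b:
                votes[j - i] += 1
    if not votes:
        return 0, 0
    # Assumed tie-break: prefer the diagonal closest to the main one.
    offset, count = max(votes.items(), key=lambda kv: (kv[1], -abs(kv[0])))
    return offset, count


print(diagonal_vote_score("TOKYO", "XXTOKYO"))  # shifted by two characters
print(diagonal_vote_score("T0KYO", "TOKYO"))    # one misrecognized character
```

Because a single misrecognized character only removes one vote from the diagonal, the score degrades gradually under noise while still reflecting character order.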
-
Patent number: 6975766Abstract: A named entity discriminating system capable of discriminating named entities such as location names, personal names, and organization names in text with a high degree of accuracy is provided. A reading means reads text from a hypertext database. A single text analyzing means analyzes each text read by the reading means and detects candidates for the named entity in the text. A complex text analyzing means estimates the likelihood of the candidate named entity detected by the single text analyzing means by an analysis with reference to referring link text or linked text of the text in which the candidate named entity appears.Type: GrantFiled: September 7, 2001Date of Patent: December 13, 2005Assignee: NEC CorporationInventor: Toshikazu Fukushima
-
Patent number: 6963665Abstract: A form sheet type determining method and apparatus for determining to which of predetermined form sheets an input form sheet corresponds. A plurality of sets of keywords are registered in a keyword register with one set of keywords for each predetermined form sheet type; image data of an input form sheet is read, character strings are extracted from the read image data, and character recognition is performed on each extracted character string; each of the character recognized strings is extracted as a keyword; the extracted keywords are collated, for each form sheet type, with the sets of keywords registered in the keyword register, thereby to determine the type of the input form sheet.Type: GrantFiled: August 25, 2000Date of Patent: November 8, 2005Assignee: Hitachi, Ltd.Inventors: Atsuhiro Imaizumi, Masato Teramoto, Tsukasa Yasue
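The collation step described in this abstract can be sketched as follows: one set of keywords is registered per form sheet type, and the strings recognized on an input sheet are matched against each set. The scoring rule (fraction of registered keywords found), the case-insensitive comparison, and all names and sample data below are assumptions for illustration only.

```python
def determine_form_type(recognized_strings, keyword_register):
    """Pick the form type whose registered keywords best match the input.

    recognized_strings: character-recognized strings from the input sheet.
    keyword_register: dict mapping a form type name to its keyword list.
    Returns (form_type, score); form_type is None if nothing matches.
    """
    recognized = {s.strip().lower() for s in recognized_strings}
    best_type, best_score = None, 0.0
    for form_type, keywords in keyword_register.items():
        if not keywords:
            continue
        hits = sum(1 for k in keywords if k.strip().lower() in recognized)
        score = hits / len(keywords)  # assumed score: fraction of keywords found
        if score > best_score:
            best_type, best_score = form_type, score
    return best_type, best_score


# Hypothetical register and recognition output.
register = {
    "invoice": ["Invoice No", "Amount Due", "Bill To"],
    "order": ["Order No", "Ship To"],
}
print(determine_form_type(["invoice no", "Bill To", "Date"], register))
```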
-
Patent number: 6950555Abstract: In a combined holistic and analytic recognition system, the holistic recognition module will recognize an input word or phrase image by matching an input string of character features for the whole word or phrase against a string of prototype features for a plurality of reference words in a lexicon. This will yield a holistic answer list of recognized word or phrase candidates for the input word or phrase along with a confidence value for each answer on the list. At the same time based on each answer in the answer list, the holistic recognition module will generate a list of character features and segment the character features into sets for each character in an answer. The analytical recognition module uses segmentation hypotheses from the segmented character feature sets to cut the image of the input string of characters into individual character images.Type: GrantFiled: February 16, 2001Date of Patent: September 27, 2005Assignee: Parascript LLCInventors: Alexander Filatov, Igor Kil, Arseni Seregin
-
Patent number: 6944344Abstract: A search apparatus searches for a keyword from a character recognition result using an index table. The character recognition result is obtained as a result of character recognition of characters in an original document. The index table includes an index character string; a position of a portion, in the character recognition result, which matches the index character string; and a credibility which is defined for each character included in the index character string and indicates a probability of the character existing in a portion, in the original document, which corresponds to a portion, in the character recognition result, which matches the character.Type: GrantFiled: June 6, 2001Date of Patent: September 13, 2005Assignee: Matsushita Electric Industrial Co., Ltd.Inventors: Taro Imagawa, Kenji Kondo, Yoshihiko Matsukawa, Tsuyoshi Mekata
-
Patent number: 6941030Abstract: A character-recognition pre-processing apparatus includes extraction means for extracting an image of a character string to be subjected to character recognition; setting means for setting the smallest rectangle that surrounds the character string image extracted; specifying means for specifying the position of each character within the smallest rectangle set by the setting means; detection means for detecting, at each character position specified, the shortest distance between a character region and the lower edge of the smallest rectangle, and the shortest distance between the character region and the upper edge of the smallest rectangle; and judgment means for judging whether the character string extracted is in an upright state or an inverted state, on the basis of variations in the two shortest distances detected.Type: GrantFiled: November 30, 2000Date of Patent: September 6, 2005Assignee: PFU LimitedInventors: Hiroshi Kakutani, Yasuharu Inami
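A rough sketch of the judgment described in this abstract: for each character position, measure the shortest distance from the character's ink to the upper and lower edges of the bounding rectangle, then compare how much the two distances vary along the string. The abstract does not state the decision rule, so the one used here (upright text is assumed to vary less along its lower edge than along its upper edge) is an assumption, as are the function names and the binary-image input format.

```python
from statistics import pvariance


def edge_gaps(image, col_range):
    """Shortest distances from ink in the given columns to the top and bottom.

    image: list of rows of 0/1 pixels, already cropped to the smallest
    rectangle around the character string. Returns None if no ink is found.
    """
    start, stop = col_range
    ink_rows = [r for r, row in enumerate(image) if any(row[start:stop])]
    if not ink_rows:
        return None
    return ink_rows[0], len(image) - 1 - ink_rows[-1]


def judge_orientation(image, char_columns):
    """Return 'upright', 'inverted', or 'undetermined'.

    Assumed rule (not from the patent): the edge along which the gaps vary
    less is taken to be the baseline side of the string.
    """
    gaps = [g for g in (edge_gaps(image, c) for c in char_columns) if g]
    if not gaps:
        return "undetermined"
    top_var = pvariance([top for top, _ in gaps])
    bottom_var = pvariance([bottom for _, bottom in gaps])
    if top_var == bottom_var:
        return "undetermined"
    return "upright" if bottom_var < top_var else "inverted"


# Three one-pixel-wide "characters" of different heights on a shared baseline.
sample = [
    [1, 0, 0],
    [1, 0, 1],
    [1, 1, 1],
    [1, 1, 1],
]
columns = [(0, 1), (1, 2), (2, 3)]
print(judge_orientation(sample, columns))
print(judge_orientation(sample[::-1], columns))
```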
-
Patent number: 6937766Abstract: A method for generating an index of the text of a video image sequence is provided. The method includes the steps of determining the image text objects in each of a plurality of frames of the video image sequence; comparing the image text objects in each of the plurality of frames of the video image sequence to obtain a record of frame sequences having matching image text objects; extracting the content for each of the similar image text objects in text string format; and storing the text string for each image text object as a video text object in a retrievable medium.Type: GrantFiled: April 13, 2000Date of Patent: August 30, 2005Assignee: MATE—Media Access Technologies Ltd.Inventors: Itzhak Wilf, Joseph Ladkani, Ovadya Menadeva, Hayit Greenspan
-
Patent number: 6927774Abstract: A character display device and method therefor are adapted to obtain a proximal reference point of each character comprising a character series and calculate display coordinates of each character from said proximal reference point and the display angle and display reference position of the character series.Type: GrantFiled: December 8, 2000Date of Patent: August 9, 2005Assignee: Mitsubishi Denki Kabushiki KaishaInventor: Fumiko Yano
-
Patent number: 6922489Abstract: A method of interpreting an image using a statistical or probabilistic interpretation model is disclosed. The image has associated therewith contextual information. The method comprises the following steps: providing the contextual information associated with the image for analysis; analyzing the additional contextual information to identify predetermined features relating to the image; and biasing the statistical or probabilistic interpretation model in accordance with the identified features.Type: GrantFiled: October 29, 1998Date of Patent: July 26, 2005Assignees: Canon Kabushiki Kaisha, Canon Information Systems Research Australia Pty. Ltd.Inventors: Alison Joan Lennon, Delphine Anh Dao Le
-
Patent number: 6917708Abstract: A method of automatically recognizing text. The text is divided into whole words which are each recognized. Each whole word is characterized according to its silhouette. The silhouette is characterized by features in the silhouette such as upwardly extending “polls” and downwardly extending “holes”. The silhouette may also be characterized by its first syllable blends. Numbers are assigned to each of the different characteristics, and numbers may also be assigned based on analysis of a database of different kinds of cursive words. Recognition may be automatically carried out by a recognizing system which recognizes in this way.Type: GrantFiled: January 19, 2001Date of Patent: July 12, 2005Assignee: California Institute of TechnologyInventors: Rodney M. Goodman, Donal J. Woods, Patricia A. Keaton, Joseph Chen