Context Analysis Or Word Recognition (e.g., Character String) Patents (Class 382/229)
  • Patent number: 7532758
    Abstract: A method and apparatus for generating a template for use in handwriting recognition are provided. In the method and apparatus text is obtained, character strings in the text are identified, each character string being formed from a sequence of one or more characters and each character having a respective type, a sequence of character types is determined for each character string and a template is defined for each character type sequence.
    Type: Grant
    Filed: April 14, 2008
    Date of Patent: May 12, 2009
    Assignee: Silverbrook Research Pty Ltd
    Inventor: Jonathon Leigh Napper
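    The template idea in the abstract above can be sketched in a few lines: map each character to a type code, derive the type sequence per string, and group strings under that sequence. The type codes (A=letter, D=digit, P=other) are illustrative assumptions, not taken from the patent.

    ```python
    # Map each character to a coarse type, then key strings by their
    # character-type sequence, as the abstract describes.
    def char_type(ch):
        if ch.isalpha():
            return "A"
        if ch.isdigit():
            return "D"
        return "P"

    def type_sequence(string):
        return "".join(char_type(c) for c in string)

    def build_templates(text):
        templates = {}
        for word in text.split():
            templates.setdefault(type_sequence(word), []).append(word)
        return templates

    # "call" -> "AAAA", "911" -> "DDD", "v2.0" -> "ADPD"
    templates = build_templates("call 911 now v2.0 ok")
    ```

    A recognizer could then constrain its hypotheses to the type sequences seen in real text.
    
    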
  • Patent number: 7522771
    Abstract: Methods, systems, and computer-readable media for ascertaining neighborhood information in a dynamically changing environment, such as an electronic ink environment, may include: (a) receiving data representing plural electronic ink strokes; (b) defining a first vertex associated with a first ink stroke; and (c) determining neighboring vertices to the first vertex, wherein the neighboring vertices are associated with ink stroke(s) other than the first ink stroke. Additional systems, methods, and computer-readable media may include: (a) receiving data representing plural electronic ink strokes; (b) defining plural vertices associated with the ink strokes; (c) receiving input indicating a selection of an ink component; and (d) determining at least one neighboring component by determining which ink component(s) located outside of the selection include one or more ink strokes having vertices that neighbor vertices included in the selection.
    Type: Grant
    Filed: March 17, 2005
    Date of Patent: April 21, 2009
    Assignee: Microsoft Corporation
    Inventors: Herry Sutanto, Ming Ye, Sashi Raghupathy
  • Publication number: 20090092323
    Abstract: A system and method for character error correction is provided, useful for a user of mobile appliances to produce written text with reduced errors. The system includes an interface, a word prediction engine, a statistical engine, an editing distance calculator, and a selector. A string of characters, known as the inputted word, may be entered into the mobile device via the interface. The word prediction engine may then generate word candidates similar to the inputted word using fuzzy logic and user preferences generated from past user behavior. The statistical engine may then generate variable error costs determined by the probability of erroneously inputting any given character. The editing distance calculator may then determine the editing distance between the inputted word and each of the word candidates by grid comparison using the variable error costs. The selector may choose one or more preferred candidates from the word candidates using the editing distances.
    Type: Application
    Filed: October 4, 2007
    Publication date: April 9, 2009
    Inventors: Weigen Qiu, Samuel Yin Lun Pun
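    The variable-cost editing distance in the abstract above amounts to a Levenshtein grid whose substitution cost depends on the character pair. A minimal sketch follows; the cost table is a made-up example (e.g. 'q' is cheap to confuse with 'w' because the keys are adjacent), not the patent's statistical engine.

    ```python
    # Weighted Levenshtein distance: substitution cost varies per
    # character pair, so likely typing errors are penalized less.
    def weighted_edit_distance(src, dst, sub_cost=None):
        sub_cost = sub_cost or {}
        m, n = len(src), len(dst)
        # grid[i][j] = cost of turning src[:i] into dst[:j]
        grid = [[0.0] * (n + 1) for _ in range(m + 1)]
        for i in range(1, m + 1):
            grid[i][0] = float(i)
        for j in range(1, n + 1):
            grid[0][j] = float(j)
        for i in range(1, m + 1):
            for j in range(1, n + 1):
                if src[i - 1] == dst[j - 1]:
                    sub = 0.0
                else:
                    sub = sub_cost.get((src[i - 1], dst[j - 1]), 1.0)
                grid[i][j] = min(grid[i - 1][j] + 1.0,      # deletion
                                 grid[i][j - 1] + 1.0,      # insertion
                                 grid[i - 1][j - 1] + sub)  # substitution
        return grid[m][n]

    def best_candidates(word, candidates, costs):
        return sorted(candidates,
                      key=lambda c: weighted_edit_distance(word, c, costs))

    costs = {("q", "w"): 0.3}   # adjacent keys: a likely slip, so low cost
    print(best_candidates("qord", ["word", "cord", "ward"], costs)[0])  # -> word
    ```
    
    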
  • Patent number: 7515751
    Abstract: In a computing device, a method and system for searching for matching ink words or phrases, by comparing a given search term of at least one word (and possibly alternates) with the words in a document, including recognized ink words and any possible alternates for those recognized words as returned by a recognizer. Various matching tests are possible because of the use of alternates, which also may have corresponding probability rankings that may influence the search. Searching may occur in actively edited ink documents, or the recognition results may be saved as saved search file data that can be searched independent of recognition.
    Type: Grant
    Filed: September 11, 2006
    Date of Patent: April 7, 2009
    Assignee: Microsoft Corporation
    Inventors: Charlton E. Lui, Gregory H. Manto, Vikram Madan, Ryan E. Cukierman, Jon E. Clark
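    The alternate-aware search the abstract describes can be sketched as follows: each recognized ink word carries a ranked list of (text, probability) alternates from the recognizer, and a query hits a word if any alternate matches. The data layout is an assumption for illustration, not the patent's file format.

    ```python
    # Search recognized ink text through recognizer alternates: a query
    # matches a word if any of its alternates matches, and the best
    # matching alternate's probability can rank the hit.
    def matches(query, ink_word_alternates, min_prob=0.0):
        """Return the probability of the best matching alternate, or None."""
        best = None
        for text, prob in ink_word_alternates:
            if text.lower() == query.lower() and prob >= min_prob:
                if best is None or prob > best:
                    best = prob
        return best

    doc = [
        [("hello", 0.9), ("hullo", 0.1)],
        [("word", 0.6), ("ward", 0.3), ("wand", 0.1)],
    ]
    # the second ink word matches via its lower-ranked alternate "ward"
    hits = [i for i, word in enumerate(doc) if matches("ward", word) is not None]
    ```
    
    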
  • Patent number: 7512275
    Abstract: When first and second images are input, a partial image feature calculating unit calculates feature values of partial images of the two images. A maximum matching score position searching unit searches for a position of the second image that attains to the highest matching score with each of the partial images of the first image. A movement-vector-based similarity score calculating unit calculates similarity between the first and second images, using information related to that partial image whose movement vector has direction and length within a prescribed range, which movement vector representing positional relation between a reference position for measuring, for each of the partial images, the position of the partial image in the first image and the position of the maximum matching score corresponding to the partial image searched out by the maximum matching score position searching unit. The images as the object of collation may belong to the same category classified based on the feature values.
    Type: Grant
    Filed: October 19, 2004
    Date of Patent: March 31, 2009
    Assignee: Sharp Kabushiki Kaisha
    Inventors: Manabu Yumoto, Yasufumi Itoh, Takashi Horiyama, Manabu Onozaki, Toshiya Okamoto
  • Publication number: 20090074306
    Abstract: Word correlations are estimated using a content-based method, which uses visual features of image representations of the words. The image representations of the subject words may be generated by retrieving images from data sources (such as the Internet) using image search with the subject words as query words. One aspect of the techniques is based on calculating the visual distance or visual similarity between the sets of retrieved images corresponding to each query word. The other is based on calculating the visual consistence among the set of the retrieved images corresponding to a conjunctive query word. The combination of the content-based method and a text-based method may produce an even better result.
    Type: Application
    Filed: December 13, 2007
    Publication date: March 19, 2009
    Applicant: MICROSOFT CORPORATION
    Inventors: Jing Liu, Bin Wang, Zhiwei Li, Mingjing Li, Wei-Ying Ma
  • Patent number: 7505180
    Abstract: A method for performing optical character recognition (OCR) on an image of a document including text includes embedding a physical manifestation of digital information associated with the text on the document. When the document is scanned with a scanning device, the digital information and a digital text file are produced. The digital text file is proofed using the digital information.
    Type: Grant
    Filed: November 15, 2005
    Date of Patent: March 17, 2009
    Assignee: Xerox Corporation
    Inventors: Dennis C. DeYoung, Devin J. Rosenbauer
  • Patent number: 7499588
    Abstract: A global optimization framework for optical character recognition (OCR) of low-resolution photographed documents that combines a binarization-type process, segmentation, and recognition into a single process. The framework includes a machine learning approach trained on a large amount of data. A convolutional neural network can be employed to compute a classification function at multiple positions and take grey-level input which eliminates binarization. The framework utilizes preprocessing, layout analysis, character recognition, and word recognition to output high recognition rates. The framework also employs dynamic programming and language models to arrive at the desired output.
    Type: Grant
    Filed: May 20, 2004
    Date of Patent: March 3, 2009
    Assignee: Microsoft Corporation
    Inventors: Charles E. Jacobs, James R. Rinker, Patrice Y. Simard, Paul A. Viola
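    A much-simplified stand-in for the word-level decoding the abstract describes: per-position character scores (here invented numbers, as if from a classifier evaluated at multiple positions) are combined with a small lexicon, and the word with the highest total log-score wins. The real framework's dynamic programming and layout analysis are out of scope here.

    ```python
    import math

    # Pick the lexicon word whose characters best match the classifier
    # scores at each position; log-scores avoid floating underflow.
    def decode(position_scores, lexicon, floor=1e-6):
        best_word, best_score = None, -math.inf
        for word in lexicon:
            if len(word) != len(position_scores):
                continue
            score = sum(math.log(pos.get(ch, floor))
                        for pos, ch in zip(position_scores, word))
            if score > best_score:
                best_word, best_score = word, score
        return best_word

    scores = [{"c": 0.7, "e": 0.2}, {"a": 0.6, "o": 0.3}, {"t": 0.9}]
    print(decode(scores, ["cat", "cot", "eat", "dog"]))  # -> cat
    ```
    
    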
  • Patent number: 7496233
    Abstract: A user defines a job flow of desired service cooperation according to a GUI screen displayed on a client terminal where parallel processing of plural parallel-executable jobs can be set. According to the thus-defined job flow, an instruction data generation server generates instruction data defining the content of processes, a storage location of a document as a subject, and other items. When the user selects a desired one of the instruction data, the selected instruction data is sent to a cooperative processing server.
    Type: Grant
    Filed: September 15, 2003
    Date of Patent: February 24, 2009
    Assignee: Fuji Xerox Co., Ltd.
    Inventors: Kazuko Kirihara, Yuji Hikawa, Yukio Tajima, Akihiro Enomoto, Hidekazu Ozawa
  • Publication number: 20090034851
    Abstract: Systems and methods for classifying content as adult content and, if desired, blocking content so classified from presentation to a user are provided. Received content is analyzed using a sequential series of classification techniques, each successive technique being implemented only if the previous technique did not result in classification of the content as adult content. In this way, adult content may be identified across a variety of different media types (e.g., text, images, video, etc.) and yet processing power may be reserved if one or more techniques requiring less power is sufficient to determine that the received content is, in fact, adult content. Content classification may be performed in-band (that is, in substantially real-time such that content may be identified and/or blocked at the time results of a user query are returned) or out-of-band (that is, prospectively as new content is received but not in association with a user query).
    Type: Application
    Filed: August 3, 2007
    Publication date: February 5, 2009
    Applicant: MICROSOFT CORPORATION
    Inventors: Xiadong Fan, Richard Qian
  • Patent number: 7487461
    Abstract: A command pattern recognition system based on a virtual keyboard layout combines pattern recognition with a virtual, graphical, or on-screen keyboard to provide a command control method with relative ease of use. The system allows the user to conveniently issue commands on pen-based computing or communication devices. The system supports a very large set of commands, including practically all commands needed for any application. By utilizing shortcut definitions it can work with any existing software without any modification. In addition, the system utilizes various techniques to achieve reliable recognition of a very large gesture vocabulary. Further, the system provides feedback and display methods to help the user effectively use and learn command gestures for commands.
    Type: Grant
    Filed: May 4, 2005
    Date of Patent: February 3, 2009
    Assignee: International Business Machines Corporation
    Inventors: Shumin Zhai, Per-Ola Kristensson
  • Publication number: 20090028446
    Abstract: An image of a character string composed of M characters is clipped from a document image, and the image is divided into separate characters. Image features of each character image are extracted. Based on the image features, the N (N>1, an integer) character images with the highest degrees of similarity are selected as candidate characters from a character image feature dictionary, which stores the image features of character images on a per-character basis, and a first index matrix of M×N cells is prepared. A candidate character string composed of a plurality of candidate characters constituting a first column of the first index matrix is subjected to a lexical analysis according to a language model, whereby a second index matrix having a character string which makes sense is prepared. In the language model, statistics are taken and then the lexical analysis is performed.
    Type: Application
    Filed: January 10, 2008
    Publication date: January 29, 2009
    Inventors: Bo Wu, Jianjun Dou, Ning Le, Yadong Wu, Jing Jia
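    The M×N candidate matrix above can be sketched concretely: each of the M character positions keeps N candidates ranked by image similarity, and a language model (here a toy bigram table with invented scores) re-ranks the combinations to find the string that "makes sense".

    ```python
    from itertools import product

    # Toy bigram scores; unseen pairs get a small default probability.
    bigram = {("t", "h"): 0.9, ("h", "e"): 0.8, ("t", "n"): 0.01,
              ("n", "e"): 0.1}

    def best_string(index_matrix):
        best, best_score = None, -1.0
        for combo in product(*index_matrix):   # every candidate string
            score = 1.0
            for a, b in zip(combo, combo[1:]):
                score *= bigram.get((a, b), 0.001)
            if score > best_score:
                best, best_score = "".join(combo), score
        return best

    # 3 positions (M=3), up to 2 candidates each (N=2): 'n'/'h' and
    # 'c'/'e' are classic visual confusions for a degraded scan of "the".
    matrix = [["t"], ["n", "h"], ["c", "e"]]
    print(best_string(matrix))  # -> the
    ```

    Exhaustive enumeration is exponential in M; a real system would use beam search or dynamic programming over the matrix columns.
    
    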
  • Publication number: 20090016617
    Abstract: A mobile apparatus for receiving an electronic message that comprises a text message from a sender. The mobile device comprises a contact records repository that stores a number of digital images, which are associated with a respective number of user identifiers. The mobile device further comprises a text analysis module that identifies predefined expressions in the text message, an image-editing module that matches one of the user identifiers with the sender and edits the associated digital image according to the identified predefined expression, and an output module for outputting the edited digital image.
    Type: Application
    Filed: July 13, 2007
    Publication date: January 15, 2009
    Applicant: Samsung Electronics Co., Ltd.
    Inventors: Orna Bregman-Amitai, Nili Karmon
  • Publication number: 20090005078
    Abstract: A portable communication apparatus is provided which comprises an image-capturing means for capturing an image; a character recognition means for recognizing characters which appear in that captured image; a location means for identifying the current location of the portable communication apparatus; a data retrieval means for accessing one or more databases in order to retrieve data based on the recognized characters and on the current location of the portable communication apparatus.
    Type: Application
    Filed: June 24, 2008
    Publication date: January 1, 2009
    Applicant: xSights Media Ltd.
    Inventor: Eran DARIEL
  • Publication number: 20080317359
    Abstract: A printer 1 has transportation paths for conveying media in two directions, that of a first transportation path P1 and that of a second transportation path P2 (or third transportation path P3) perpendicular to the first transportation path P1. With this printer 1 a single compact unit can be used for media processing by reading and printing the media, as well as for printing receipts and validation printing.
    Type: Application
    Filed: August 28, 2008
    Publication date: December 25, 2008
    Inventors: TOSHIYUKI SASAKI, Masashi Fujikawa, Kunio Omura
  • Patent number: 7457464
    Abstract: A digital image is composed at a digital transmitter device from a hardcopy source. The digital image includes an optically scanned image. Indicia is detected on the hardcopy image. A substitute is made for the indicia in the composed digital image. A modified rendering of the digital image is output.
    Type: Grant
    Filed: August 29, 2003
    Date of Patent: November 25, 2008
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Chad A. Stevens, Robert Sesek, Travis J. Parry
  • Patent number: 7454063
    Abstract: The present invention is a method of optical character recognition. First, text is received. Next, all words in the text are identified and associated with the appropriate line in the document. The directional derivative of the pixellation density function defining the text is then taken, and the highest-value points for each word are identified from this derivative. These highest-value points are used to calculate a baseline for each word. A median anticipated baseline is also calculated and used to verify each baseline, which is corrected as necessary. Each word is then parsed into feature regions, and the features are identified through a series of complex analyses. After identifying the main features, outlying ornaments are identified and associated with appropriate features. The results are then compared to a database to identify the features and then displayed.
    Type: Grant
    Filed: September 22, 2005
    Date of Patent: November 18, 2008
    Assignee: The United States of America as represented by the Director National Security Agency
    Inventors: Kyle E Kneisl, Jesse Otero
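    One step of the abstract above can be sketched on a toy bitmap: compute per-row ink density for a word, take its discrete derivative, and treat the row with the steepest density drop as the baseline estimate (most strokes end there). The 0/1 grid below is invented, not real scan data, and the real method's median-baseline verification is omitted.

    ```python
    # Estimate a word's baseline from the derivative of row ink density.
    def row_density(bitmap):
        return [sum(row) for row in bitmap]

    def baseline_row(bitmap):
        density = row_density(bitmap)
        # discrete derivative between adjacent rows (top to bottom)
        deriv = [density[i + 1] - density[i] for i in range(len(density) - 1)]
        # steepest drop marks where most strokes end, i.e. the baseline
        return min(range(len(deriv)), key=lambda i: deriv[i])

    word = [
        [0, 1, 0, 0, 0],   # ascender
        [1, 1, 1, 1, 1],   # x-height body
        [1, 1, 1, 1, 1],
        [0, 0, 0, 1, 0],   # descender below the baseline
    ]
    print(baseline_row(word))  # -> 2 (baseline sits below row 2)
    ```
    
    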
  • Publication number: 20080273802
    Abstract: A form processing program which is capable of automatically extracting keywords. When the image of a scanned form is entered, a layout recognizer extracts a readout region of the form image, a character recognizer recognizes characters within the readout region. A form logical definition database stores form logical definitions defining strings as keywords according to logical structures which are common to forms of same type. A possible string extractor extracts as possible strings combinations of recognized characters each of which satisfies defined relationships of a string. A linking unit links the possible strings according to positional relationships, and determines a combination of possible strings as keywords.
    Type: Application
    Filed: July 8, 2008
    Publication date: November 6, 2008
    Applicant: FUJITSU LIMITED
    Inventors: Hiroaki Takebe, Katsuhito Fujimoto
  • Patent number: 7446817
    Abstract: A method and apparatus for detecting text associated with video are provided. The method of detecting the text of the video includes reading a t-th frame (where t is a positive integer) among frames forming the video as a current frame, determining whether there is a text area detected from a previous frame which is a (t−N)-th (where N is a positive integer) frame among the frames forming the video, in the current frame, and upon determining that there is no text area detected from the previous frame in the current frame, detecting the text area in the entire current frame. Upon determining that there is the text area detected from the previous frame in the current frame, the text area is detected from a remaining area obtained by excluding from the current frame an area corresponding to the text area detected from the previous frame. Whether there is a text area in a next frame which is a (t+N)-th frame among the frames forming the video is verified.
    Type: Grant
    Filed: February 14, 2005
    Date of Patent: November 4, 2008
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Cheolkon Jung, Jiyeun Kim, Youngsu Moon
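    The frame-skipping control flow in the abstract can be sketched as follows. `detect` here is a stand-in for a real text detector and just reports which toy regions contain text; N=2 is an assumed sampling interval, and the re-verification of carried-over areas is simplified away.

    ```python
    N = 2  # examine every N-th frame (assumed value)

    def detect(frame_regions, exclude=frozenset()):
        """Pretend detector: a region holds text if its value is 'text'."""
        return {name for name, content in frame_regions.items()
                if name not in exclude and content == "text"}

    def process_video(frames):
        known = set()          # text regions carried over from frame t-N
        results = []
        for t, frame in enumerate(frames):
            if t % N != 0:
                continue       # skip frames between samples
            if not known:
                known = detect(frame)                         # full-frame search
            else:
                known = known | detect(frame, exclude=known)  # remaining area only
            results.append((t, sorted(known)))
        return results

    frames = [{"top": "text", "bottom": "bg"},
              {"top": "text", "bottom": "bg"},
              {"top": "text", "bottom": "text"}]
    print(process_video(frames))  # -> [(0, ['top']), (2, ['bottom', 'top'])]
    ```
    
    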
  • Patent number: 7443316
    Abstract: A method (300) for entering a character into an electronic device (100) is provided. The method (300) includes displaying (301) input character keys (204) on a touch sensitive region (202) of a display screen (105) of the device (100), the keys identifying an associated character. Next, a display step (309) shows at least one entered character in a display region (201) of the screen, the entered character having been selected by actuation of one of the character keys (204). Next, a group of potential subsequent characters that follow the entered character is predicted (311, 317). A second set of input character keys (205) identifying the potential subsequent characters is displayed (327). The second set of keys (205) are grouped together (323) such that their relative screen locations with respect to each other are different to that of corresponding keys in the first set of keys (204).
    Type: Grant
    Filed: September 1, 2005
    Date of Patent: October 28, 2008
    Assignee: Motorola, Inc.
    Inventor: Swee Ho Lim
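    The prediction step above is easy to sketch: given the characters entered so far, collect the distinct characters that follow that prefix in a word list, and show keys only for those. The word list is a toy assumption.

    ```python
    # Predict the next-character key set from a prefix and a word list.
    WORDS = ["the", "they", "then", "that", "this", "dog"]

    def next_characters(prefix):
        return sorted({w[len(prefix)] for w in WORDS
                       if w.startswith(prefix) and len(w) > len(prefix)})

    print(next_characters("th"))   # keys to show after typing "t", "h"
    ```

    A production keyboard would draw these from a trie or frequency-weighted dictionary rather than a flat list.
    
    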
  • Patent number: 7444021
    Abstract: The present invention provides a method of identifying a string formed from a number of hand-written characters, such as hand-written words. In order to achieve this, the method operates to determine character probabilities for each character in the string, as well as to determine the probability of the string corresponding to a predetermined form of template. In this regard, each template represents a respective combination of character types. The template and character probabilities are then combined to determine string probabilities, with the character string being identified in accordance with the determined string probabilities.
    Type: Grant
    Filed: October 15, 2002
    Date of Patent: October 28, 2008
    Assignee: Silverbrook Research Pty Ltd
    Inventor: Jonathon Leigh Napper
  • Patent number: 7437001
    Abstract: A method for recognition of a handwritten pattern comprises the steps of forming (4) a representation of the handwritten pattern, forming (6) at least two subconfigurations by dividing the representation of the handwritten pattern, and processing the subconfigurations. The step of processing comprises the steps of comparing (8) each subconfiguration with reference configurations, selecting (10) at least one subconfiguration candidate for each subconfiguration among the reference configurations based on said step of comparing, and determining (12) at least one candidate pattern consisting of one selected subconfiguration candidate for each subconfiguration. The method further comprises the steps of comparing (14) the representation of the handwritten pattern to the candidate pattern, and computing (16) a cost function in order to find a closest matching candidate pattern.
    Type: Grant
    Filed: June 5, 2007
    Date of Patent: October 14, 2008
    Assignee: ZI Decuma AB
    Inventors: Jonas Morwing, Gunnar Sparr
  • Publication number: 20080240582
    Abstract: A method and an apparatus for character string recognition may be provided that enables prevention of a decrease in recognition accuracy for a character string even when distortion of an image appears in a direction perpendicular to a medium transfer direction.
    Type: Application
    Filed: March 31, 2008
    Publication date: October 2, 2008
    Applicant: NIDEC SANKYO CORPORATION
    Inventor: Hiroshi NAKAMURA
  • Publication number: 20080212882
    Abstract: The present invention is related to a method and system providing a pattern-classifier encoded dictionary for use in language processing systems implemented in computer systems. The pattern encoded dictionary according to the present invention may be utilized in Optical Character Recognition (OCR) systems or Automatic Speech Recognition (ASR) systems to retrieve reliably identified words used in an adaptive manner or as a tool to configure said OCR or ASR system.
    Type: Application
    Filed: June 14, 2006
    Publication date: September 4, 2008
    Applicant: Lumex AS
    Inventors: Hans Christian Meyer, Mats Stefan Carlin, Knut Tharald Fosseide
  • Patent number: 7420701
    Abstract: Systems and methods for accurately recognizing a language format of an input imaging data stream when no explicit language switch is present. A sniffer process is initiated when an imaging device receives an input imaging data stream. The sniffer process analyzes an initial sample of the input stream to determine the language format by enumerating through a set of language recognizers that are implemented as callback functions. The enumeration uses a dynamic heuristic approach to selecting the order in which to try the language recognizers. Each language recognizer has a sample size associated with it. For each language recognizer enumerated, the sniffer process pre-reads the associated sample size and invokes the associated callback function with the byte sample. The enumeration continues until a language recognizer acknowledges recognition of the language format or the set of language recognizers is exhausted.
    Type: Grant
    Filed: June 10, 2004
    Date of Patent: September 2, 2008
    Assignee: Sharp Laboratories of America, Inc.
    Inventor: Andrew Rodney Ferlitsch
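    The sniffer pattern above can be sketched as a registry of recognizers, each pairing a sample size with a callback; the sniffer feeds each callback a pre-read of that many bytes until one claims the stream. The recognizers below match well-known file signatures and are simplified assumptions, not the patent's actual printer-language set, and the dynamic ordering heuristic is omitted.

    ```python
    # Each entry: (language name, bytes to pre-read, callback).
    def recognize_postscript(sample):
        return sample.startswith(b"%!PS")

    def recognize_pdf(sample):
        return sample.startswith(b"%PDF-")

    RECOGNIZERS = [
        ("postscript", 4, recognize_postscript),
        ("pdf", 5, recognize_pdf),
    ]

    def sniff(stream_bytes):
        for name, sample_size, callback in RECOGNIZERS:
            if callback(stream_bytes[:sample_size]):
                return name
        return None     # set exhausted: no recognizer claimed the stream

    print(sniff(b"%PDF-1.4 ..."))  # -> pdf
    ```
    
    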
  • Publication number: 20080208576
    Abstract: Character information recognition means (101) extracts, through a character recognition process, character information from a selection button included in an index image. Based on text data having been outputted from the character information recognition means (101), index dictionary creation means (102) creates an index dictionary usable for a speech recognition process performed by speech recognition means (104). The speech recognition means (104) performs the speech recognition process by using speech data (D1) retrieved through an ADC (7) and the index dictionary stored in storage means (107). Based on a result of the speech recognition process performed by the speech recognition means (104), reproduction control means (105) performs reproduction control of a chapter. Thus, a desired button can be selected by speech, from chapter selection buttons displayed on a chapter selection image of a DVD video.
    Type: Application
    Filed: November 4, 2005
    Publication date: August 28, 2008
    Applicant: MATSUSHITA ELECTRIC INDUSTRIAL CO., LTD.
    Inventors: Atsushi Iisaka, Atsushi Yamashita, Takuya Hirai
  • Patent number: 7418442
    Abstract: Providing the ability to search a document for content recorded as both ink characters and text characters. A character from a search query word is retrieved. A program retrieves the character in the electronic document. The program determines if the character in the electronic document is an ink or text character. For text characters, the character in the document content is compared to a character in the search query word to determine if the characters match. For ink characters, an ink alternate term is obtained. A character in the ink alternate is compared to the character of the search query word to determine if the characters match. Once all characters in the ink alternate word are compared, another ink alternate word is retrieved and compared to the search query word.
    Type: Grant
    Filed: September 30, 2003
    Date of Patent: August 26, 2008
    Assignee: Microsoft Corporation
    Inventor: Nathaniel Marvin Myhre
  • Patent number: 7415137
    Abstract: A method of processing an image includes steps of identifying one candidate for a human face region within an image; calculating a probability that the candidate for human face region represents a human face; and saving the probability as attached information to the image. The method of processing an image can also include steps of identifying one candidate for human face region within an image; calculating a probability that the candidate for human face region represents a human face; judging whether or not the candidate for human face region represents a human face by comparing the probability with a threshold; and saving a result of the step of judging as attached information to the image. According to these methods, results of identifying candidates for human face regions will be saved to the image, and further processes to be conducted on the image can be facilitated.
    Type: Grant
    Filed: November 20, 2003
    Date of Patent: August 19, 2008
    Assignee: Canon Kabushiki Kaisha
    Inventors: Xinwu Chen, Xin Ji, Libing Wang, Yoshihiro Ishida
  • Publication number: 20080193021
    Abstract: A method and apparatus for generating a template for use in handwriting recognition are provided. In the method and apparatus text is obtained, character strings in the text are identified, each character string being formed from a sequence of one or more characters and each character having a respective type, a sequence of character types is determined for each character string and a template is defined for each character type sequence.
    Type: Application
    Filed: April 14, 2008
    Publication date: August 14, 2008
    Inventor: Jonathon Leigh Napper
  • Patent number: 7406201
    Abstract: A method for encoding characters includes identifying one or more sequences of the character codes that are likely to be generated due to a segmentation error in application of a pattern recognition process, and associating a respective extension character code with each of the sequences. The area of an image containing characters is divided into segments, such that each segment contains approximately one character. The pattern recognition process is applied to each of the segments in order to generate an input string of character codes. At least one of the identified sequences of the character codes in the input string is replaced with the respective extension character code so as to generate a modified string. The output string is determined by comparing the modified string to a directory of known strings.
    Type: Grant
    Filed: December 4, 2003
    Date of Patent: July 29, 2008
    Assignee: International Business Machines Corporation
    Inventors: Andre Heilper, Eugene Walach
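    The extension-code scheme above can be illustrated with a classic OCR confusion: a segmenter may split one "m" glyph into "r"+"n". Mapping both "rn" and "m" to the same extension code before dictionary lookup makes the comparison tolerant of the error. The confusion table and the Private Use Area code points are assumptions for the example.

    ```python
    # Fold confusable sequences and their source glyphs to shared codes;
    # longer sequences are listed (and replaced) first.
    CONFUSIONS = [("rn", "\ue000"), ("m", "\ue000"),   # rn <-> m
                  ("vv", "\ue001"), ("w", "\ue001")]   # vv <-> w

    def to_extension_codes(string):
        for seq, code in CONFUSIONS:
            string = string.replace(seq, code)
        return string

    def lookup(ocr_string, dictionary):
        target = to_extension_codes(ocr_string)
        for word in dictionary:
            if to_extension_codes(word) == target:
                return word
        return None

    # "mat" mis-segmented as "rnat" still finds its dictionary entry.
    print(lookup("rnat", ["cat", "mat", "rat"]))  # -> mat
    ```
    
    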
  • Patent number: 7403656
    Abstract: A character recognition method that is robust under an unknown illumination condition is provided. An apparatus for realizing such robust character recognition includes plural different binarization units, means for synthesizing character sub-image candidates that have been obtained from the binarization units, and means for analyzing character sub-image candidates and for recognizing an image as a character string consisting of character sub-image candidates.
    Type: Grant
    Filed: February 4, 2005
    Date of Patent: July 22, 2008
    Assignee: Hitachi, Ltd.
    Inventor: Masashi Koga
  • Publication number: 20080170786
    Abstract: A technique that can contribute to a reduction in an operation burden in managing a processing result of semantic determination processing applied to objects included in an image is provided. An object included in an image of image data is extracted. A semantic of the object in a layout of the image data is determined. When it is determined that plural objects have an identical semantic, a display unit is caused to notify information concerning the plural objects, which are determined as having the semantic, in association with information concerning the semantic.
    Type: Application
    Filed: December 28, 2007
    Publication date: July 17, 2008
    Applicants: KABUSHIKI KAISHA TOSHIBA, TOSHIBA TEC KABUSHIKI KAISHA
    Inventors: Hajime Tomizawa, Akihiko Fujiwara
  • Publication number: 20080170075
    Abstract: A display controller includes a character display unit for displaying character information on a display unit; a keyword detecting unit for detecting a predetermined keyword from the character information displayed by the character display unit; an image information detecting unit for detecting image information including additional information corresponding to the keyword detected by the keyword detecting unit, from image information including predetermined additional information and stored in a storing unit; and a thumbnail image displaying unit for displaying on the display unit a thumbnail image(s) of the image information detected by the image information detecting unit.
    Type: Application
    Filed: January 15, 2008
    Publication date: July 17, 2008
    Applicant: SONY ERICSSON MOBILE COMMUNICATIONS JAPAN, INC.
    Inventors: Seiji MURAMATSU, Yoshimitsu Funabashi, Mayu Irimajiri, Atsushi Imai, Keiko Hiraoka, Takamoto Tsuda, Takeshi Matsuzawa, Takeshi Tanigawa, Tomoharu Okamoto, Akihiko Adachi, Tatsuhiko Nishimura
  • Publication number: 20080166057
    Abstract: A video structuring device includes: character string extraction means for determining whether or not a character string is present in a frame image, and if it determines that a character string is present, generating character string position information for the character string present in a character string present frame image in which the character string is present, and outputting the character string position information, frame identifying information for identifying the character string present frame image, and the character string present frame image; video information storage means for storing frame identifying information, character string present frame image and character string position information in an index file all associated with one another; and structure information presentation means for associating character string display in the form of an image which is produced by cutting an area where the character string is present based on the character string present frame image and character string
    Type: Application
    Filed: October 24, 2006
    Publication date: July 10, 2008
    Inventor: Noboru Nakajima
  • Publication number: 20080159635
    Abstract: A system for enabling user interaction with computer software which includes a computer system which transfers print data to a printer. The printer is responsive to the print data to print a form by printing information indicative of a text field coincident with coded data indicative of the text field, so that when a sensing device is moved relative to the text field the sensing device can sense the coded data and generate the indicating data indicative of its movement. The computer system uses the indicating data to determine the relative movement and then perform an action associated with the text field based on the movement. The computer system further determines the information, an identity indicative of the text field, and a layout defining an arrangement for coded data indicative of the identity and information, and generates the print data to be indicative of the identity, layout and information.
    Type: Application
    Filed: March 17, 2008
    Publication date: July 3, 2008
    Inventors: Paul Lapstun, Kia Silverbrook
  • Patent number: 7391419
    Abstract: An information distribution system is configured to deliver various types of content provided by an information distributor to information receivers through a network, transmitting the content to be distributed after converting it to colors, color values, or color digital values. Converting the content to colors, color values, or color digital values reduces the amount of information transmitted, which shortens the time required for distribution, improves practicality, and reduces distribution costs.
    Type: Grant
    Filed: May 22, 2002
    Date of Patent: June 24, 2008
    Assignee: Tani Electronics Corporation
    Inventor: Okie Tani
  • Patent number: 7391527
    Abstract: A method is directed to using a multifunction printer to identify pages of a printed document that have a specified text string. The method comprises electronically converting, with the multifunction printer, a plurality of pages of the printed document to a plurality of electronic text pages corresponding to the printed pages. The multifunction printer electronically searches the plurality of electronic text pages to identify which electronic text pages include the specified text string. The multifunction printer then communicates the identified electronic text pages to the user.
    Type: Grant
    Filed: April 29, 2003
    Date of Patent: June 24, 2008
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Cory Irwin, Carl Price
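The search step this abstract describes — scanning the OCR'd text of each printed page for a specified string and reporting which pages contain it — can be sketched minimally. The function name and case-insensitive matching here are illustrative assumptions, not details from the patent:

```python
def find_pages_with_string(pages, query):
    """Return 1-based page numbers whose OCR text contains the query.

    `pages` is a list of strings, one per converted page; matching is
    case-insensitive, mirroring a typical "find in document" search.
    """
    query = query.lower()
    return [i + 1 for i, text in enumerate(pages) if query in text.lower()]

pages = [
    "Quarterly report summary",
    "Revenue grew in the second quarter",
    "Appendix: methodology notes",
]
print(find_pages_with_string(pages, "quarter"))  # [1, 2]
```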
  • Publication number: 20080137971
    Abstract: A method and system for character recognition are described. In one embodiment, it may use matched sequences rather than character shape to determine a computer legible result.
    Type: Application
    Filed: April 1, 2005
    Publication date: June 12, 2008
    Inventors: Martin T. King, Dale L. Grover, Clifford A. Kushler, James Q. Stafford-Fraser
  • Publication number: 20080131006
    Abstract: A pure adversarial optical character recognition (OCR) approach in identifying text content in images. An image and a search term are input to a pure adversarial OCR module, which searches the image for presence of the search term. The image may be extracted from an email by an email processing engine. The OCR module may split the image into several character-blocks that each has a reasonable probability of containing a character (e.g., an ASCII character). The OCR module may form a sequence of blocks that represent a candidate match to the search term and calculate the similarity of the candidate sequence to the search term. The OCR module may be configured to output whether or not the search term is found in the image and, if applicable, the location of the search term in the image.
    Type: Application
    Filed: August 16, 2007
    Publication date: June 5, 2008
    Inventor: Jonathan James Oliver
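The scoring idea in this abstract — forming a sequence of character-blocks and calculating its similarity to the search term — can be sketched as a product of per-block character probabilities. The dictionary representation and the multiplicative score are assumptions for illustration, not the patent's actual similarity measure:

```python
def sequence_score(block_probs, term):
    """Estimate how well a sequence of character-blocks matches a term.

    `block_probs` is a list of dicts, one per block, mapping candidate
    characters to the probability that the block depicts that character.
    The score is the product of per-block probabilities for the term's
    characters; a length mismatch scores zero.
    """
    if len(block_probs) != len(term):
        return 0.0
    score = 1.0
    for probs, ch in zip(block_probs, term):
        score *= probs.get(ch, 0.0)
    return score

blocks = [{"v": 0.5, "y": 0.5}, {"1": 0.75, "i": 0.25}]
print(sequence_score(blocks, "vi"))  # 0.5 * 0.25 = 0.125
```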
  • Publication number: 20080131005
    Abstract: An adversarial approach in detecting inappropriate text content in images. An expression from a listing of expressions may be selected. The listing of expressions may include words, phrases, or other textual content indicative of a particular type of message. Using the selected expression as a reference, the image is searched for a section that could be similar to the selected expression. The similarity between the selected expression and the section of the image may be in terms of shape. The section may be scored against the selected expression to determine how well the selected expression matches the section. The score may be used to determine whether or not the selected expression is present in the image.
    Type: Application
    Filed: May 16, 2007
    Publication date: June 5, 2008
    Inventor: Jonathan James Oliver
  • Patent number: 7379603
    Abstract: Methods of organizing a series of sibling data entities in a digital computer are provided for preserving sibling ranking information associated with the sibling data entities and for attaching the sibling ranking information to a joint parent of the sibling data entities to facilitate on-demand generation of ranked parent candidates. A rollup function of the present invention builds a rollup matrix (126) that embodies information about the sibling entities and the sibling ranking information and provides a method for reading out the ranked parent candidates from the rollup matrix in order of their parent confidences (141). Parent confidences are based on the sibling ranking information, either alone or in combination with n-gram dictionary ranking or other ranking information.
    Type: Grant
    Filed: April 8, 2003
    Date of Patent: May 27, 2008
    Assignee: RAF Technology, Inc.
    Inventors: David Justin Ross, Stephen E. M. Billester, Brent R. Smith
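The readout this abstract describes — combining per-position sibling rankings into parent candidates ordered by confidence — can be sketched by enumerating character combinations and scoring each as a product of its characters' confidences. The exhaustive enumeration and multiplicative confidence are simplifying assumptions; the patent's rollup matrix supports on-demand generation without enumerating everything:

```python
from itertools import product

def ranked_parents(sibling_rankings):
    """Combine per-position character rankings into ranked string candidates.

    Each entry of `sibling_rankings` is a list of (char, confidence) pairs
    for one position. A parent candidate's confidence is taken as the
    product of its characters' confidences.
    """
    candidates = []
    for combo in product(*sibling_rankings):
        chars = "".join(c for c, _ in combo)
        conf = 1.0
        for _, p in combo:
            conf *= p
        candidates.append((chars, conf))
    return sorted(candidates, key=lambda t: t[1], reverse=True)

rankings = [[("c", 0.8), ("e", 0.2)], [("a", 0.6), ("o", 0.4)], [("t", 1.0)]]
best = ranked_parents(rankings)[0]
print(best)  # highest-confidence parent candidate first
```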
  • Patent number: 7379596
    Abstract: An improved system and method for personalizing recognition of an input method is provided. A trainable handwriting recognizer may be personalized by using ink written by the user and text authored by the user. The system includes a personalization service engine and a framework with interfaces for collecting, storing, and accessing user ink and authored information for training recognizers. The trainers of the system may include a text trainer for augmenting a recognizer's dictionary using text content and a shape trainer for tuning generic recognizer components using ink data supplied by a user. The trainers may load multiple trainer clients, each capable of training one or more specific recognizers. Furthermore, a framework is provided for supporting pluggable trainers. Any trainable recognizer may be dynamically personalized using the harvested information authored by the user and ink written by the user.
    Type: Grant
    Filed: October 24, 2003
    Date of Patent: May 27, 2008
    Assignee: Microsoft Corporation
    Inventors: Patrick Haluptzok, Ross Nathaniel Luengen, Benoit J. Jurion, Michael Revow, Richard Kane Sailor
  • Publication number: 20080118162
    Abstract: A mobile communications device with an integrated camera is directed towards text. A video stream is analyzed in real time to detect one or more words in a specified region of the video frames and to indicate the detected words on a display. Users can select a word in a video stream and subsequently move or extend the initial selection. It is thus possible to select multiple words. A subregion of the video frame comprising the detected word(s) is pre-processed and compressed before being sent to a remote optical character recognition (OCR) function which may be integrated in an online service such as an online search service.
    Type: Application
    Filed: November 20, 2006
    Publication date: May 22, 2008
    Applicant: Microsoft Corporation
    Inventor: Frank Siegemund
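The pre-processing step this abstract mentions — cutting out the subregion of the video frame that contains the detected words before sending it to the remote OCR service — can be sketched as a simple crop. The pixel-grid representation and box convention are illustrative assumptions:

```python
def crop_region(frame, box):
    """Crop the subregion of a frame (2-D list of pixel values) containing
    the selected word, so only that portion need be compressed and sent to
    a remote OCR service. `box` is (top, left, bottom, right), with
    exclusive bottom/right bounds.
    """
    top, left, bottom, right = box
    return [row[left:right] for row in frame[top:bottom]]

frame = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
print(crop_region(frame, (1, 1, 3, 3)))  # [[5, 6], [9, 10]]
```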
  • Publication number: 20080115070
    Abstract: Text analysis methods, text analysis apparatuses, and articles of manufacture are described according to some aspects. In one aspect, a text analysis method includes accessing information indicative of data content of a collection of text comprising a plurality of different topics, using a computing device, analyzing the information indicative of the data content, and using results of the analysis, identifying a presence of a new topic in the collection of text.
    Type: Application
    Filed: November 10, 2006
    Publication date: May 15, 2008
    Inventors: Paul D. Whitney, Alan R. Willse, Charles A. Lopresti, Amanda M. White
  • Patent number: 7369704
    Abstract: In a circumstance where an image processing apparatus is connected to and capable of communicating with a plurality of processing servers each performing a specific data processing service, what kind of processing is performed on document image data read out from a document by image reading means is determined in accordance with the document image data. Then an address of a processing server capable of performing the processing thus determined is searched. Then at least a part of the document image data or character-string image extracted therefrom is supplied to the address thus searched, and the data processing service is requested. From this address, a result of the data processing service is obtained, and the obtained result of the data processing service is outputted.
    Type: Grant
    Filed: May 17, 2005
    Date of Patent: May 6, 2008
    Assignee: Sharp Kabushiki Kaisha
    Inventor: Tomoyuki Honma
  • Patent number: 7362902
    Abstract: Character data for a plurality of characters on which character recognition is being performed is received for processing. The character data includes character assignments and character locations. A reference location is defined in relation to a location of one of the characters, and the character assignments are resolved into one or more groupings according to a distance of the characters from the reference location.
    Type: Grant
    Filed: May 28, 2004
    Date of Patent: April 22, 2008
    Assignee: Affiliated Computer Services, Inc.
    Inventors: Billy S. Baker, Gary S. Smith
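The grouping step this abstract describes — resolving character assignments into groupings according to each character's distance from a reference location — can be sketched with a gap threshold along one axis. The fixed pixel gap and one-dimensional positions are simplifying assumptions for illustration:

```python
def group_by_distance(chars, reference_x, gap=20):
    """Group recognized characters into runs by position relative to a
    reference location.

    `chars` is a list of (char, x) pairs; positions are measured relative
    to `reference_x`, and a new group starts whenever the gap to the
    previous character exceeds `gap` units.
    """
    groups, current = [], []
    prev = None
    for ch, x in sorted(chars, key=lambda t: t[1]):
        rel = x - reference_x
        if prev is not None and rel - prev > gap:
            groups.append("".join(current))
            current = []
        current.append(ch)
        prev = rel
    if current:
        groups.append("".join(current))
    return groups

chars = [("J", 0), ("a", 10), ("n", 20), ("5", 80), ("2", 90)]
print(group_by_distance(chars, reference_x=0))  # ['Jan', '52']
```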
  • Patent number: 7356188
    Abstract: Described herein is a technology for recognizing the content of text documents. The technology determines one or more hash values for the content of a text document. Alternatively, the technology may generate a “sifted text” version of a document. In one implementation described herein, document recognition is used to determine whether the content of one document is copied (i.e., plagiarized) from another document. This is done by comparing hash values of documents (or alternatively their sifted text). In another implementation described herein, document recognition is used to categorize the content of a document so that it may be grouped with other documents in the same category. This abstract itself is not intended to limit the scope of this patent. The scope of the present invention is pointed out in the appended claims.
    Type: Grant
    Filed: April 24, 2001
    Date of Patent: April 8, 2008
    Assignee: Microsoft Corporation
    Inventors: Ramarathnam Venkatesan, Michael T. Malkin
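The comparison this abstract describes — hashing document content and comparing hash values to detect copying — can be sketched with word-shingle hashing. The shingle size, SHA-256, and the overlap ratio are illustrative choices, not the patented hashing or sifted-text scheme:

```python
import hashlib

def shingle_hashes(text, k=5):
    """Hash every k-word shingle of the whitespace-normalized text."""
    words = text.lower().split()
    return {
        hashlib.sha256(" ".join(words[i:i + k]).encode()).hexdigest()
        for i in range(max(len(words) - k + 1, 1))
    }

def overlap(a, b, k=5):
    """Fraction of document a's shingle hashes that also occur in b;
    values near 1.0 suggest b contains a copy of a's content."""
    ha, hb = shingle_hashes(a, k), shingle_hashes(b, k)
    return len(ha & hb) / len(ha)

doc = "the quick brown fox jumps over the lazy dog near the river"
copy = "intro text the quick brown fox jumps over the lazy dog near the river end"
print(overlap(doc, copy))  # 1.0 — every shingle of doc appears in copy
```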
  • Publication number: 20080037879
    Abstract: An expansion of the construction and organization of the electronic literary macramé (ELM), the knowledge transfer tool (KTT), or any document of similar type to enrich the connections and associations for their readers, providing for manual author- or editor-defined links and directives for hypertext handling and navigation, easy-to-use indexing capabilities, structuring and presentation of information in a visually-organized form such as a table, list, matrix, tree, pyramid, or other two-dimensional arrangement, with all features integrated into an unobtrusive and enriched referencing mechanism to assist authors, editors and readers of an ELM, KTT, or other electronic document of similar type.
    Type: Application
    Filed: July 25, 2007
    Publication date: February 14, 2008
    Inventor: Dana W. Paxson
  • Publication number: 20080025618
    Abstract: A form processing apparatus extracts layout information and character information from a form document. A candidate extracting unit extracts word candidates from the character information. A frequency digitizing unit calculates emission probability of a word candidate from each element. A relation digitizing unit calculates transition probability that relationship between word candidates is established. An evaluating unit calculates an evaluation value indicative of a probability of appearance of word candidates in respective logical elements. A determining unit determines the element and a word candidate thereof as the element and a character string thereof in the form document, based on the evaluation value.
    Type: Application
    Filed: November 15, 2006
    Publication date: January 31, 2008
    Inventors: Akihiro Minagawa, Hiroaki Takebe, Katsuhito Fujimoto
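The evaluation this abstract describes — scoring word candidates for each logical element using emission probabilities and pairwise transition probabilities — can be sketched with an exhaustive search over assignments. The data layout, the default probabilities, and the brute-force search (rather than dynamic programming) are simplifying assumptions:

```python
from itertools import product

def best_labeling(elements, candidates, emission, transition):
    """Pick one word candidate per logical element, maximizing the product
    of emission probabilities P(word | element) and transition
    probabilities between consecutive elements' chosen words.
    Missing emissions score 0.0; missing transitions default to 1.0.
    """
    best, best_score = None, -1.0
    for combo in product(candidates, repeat=len(elements)):
        score = 1.0
        for el, w in zip(elements, combo):
            score *= emission.get((el, w), 0.0)
        for a, b in zip(combo, combo[1:]):
            score *= transition.get((a, b), 1.0)
        if score > best_score:
            best, best_score = dict(zip(elements, combo)), score
    return best, best_score

elements = ["name", "date"]
candidates = ["Alice", "2020-01-01"]
emission = {("name", "Alice"): 0.9, ("name", "2020-01-01"): 0.05,
            ("date", "Alice"): 0.05, ("date", "2020-01-01"): 0.9}
transition = {("Alice", "2020-01-01"): 0.8}
print(best_labeling(elements, candidates, emission, transition))
```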
  • Patent number: 7317465
    Abstract: A method of displaying an image may include receiving image data for the image, and defining first and second sub-frames of the image. The first and second sub-frames may have corresponding pluralities of image elements, with each image element of the second sub-frame spatially offset an offset distance from a corresponding image element of the first sub-frame. The first sub-frame may be displayed in a first position, and the second sub-frame may be displayed in a second position. Each displayed image element of the second sub-frame may be spatially offset substantially the offset distance from the corresponding displayed image element of the first sub-frame.
    Type: Grant
    Filed: January 27, 2004
    Date of Patent: January 8, 2008
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Will Allen, Edward B. Anderson
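The sub-frame definition this abstract describes — two sets of image elements with the second spatially offset from the first — can be sketched by sampling a frame at two diagonally offset positions, so that alternating their display approximates the full image. The half-resolution sampling and one-pixel offset are illustrative assumptions, not the patent's specific offset distance:

```python
def split_subframes(frame):
    """Derive two sub-frames from a full frame (2-D list of pixel values):
    the second samples positions offset one pixel diagonally from the
    first, so each element of the second sub-frame is spatially offset
    from the corresponding element of the first.
    """
    first = [row[0::2] for row in frame[0::2]]
    second = [row[1::2] for row in frame[1::2]]
    return first, second

frame = [[1, 2], [3, 4]]
a, b = split_subframes(frame)
print(a, b)  # [[1]] [[4]]
```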