Segmenting Individual Characters Or Words Patents (Class 382/177)
-
Publication number: 20100166309Abstract: A Mixed Media Reality (MMR) system and associated techniques are disclosed. The MMR system provides mechanisms for forming a mixed media document that includes media of at least two types (e.g., printed paper as a first medium and digital content and/or web link as a second medium). In one particular embodiment, the MMR system includes an MMR user, a MMR computer, a user printer that produces a printed document, a networked media server, an office portal, a service provider server, an electronic display that is electrically connected to a set-top box, a document scanner, a network, a capture device, a cellular infrastructure, wireless fidelity (Wi-Fi) technology, Bluetooth® technology, infrared (IR) technology, wired technology, and a geo location mechanism.Type: ApplicationFiled: March 8, 2010Publication date: July 1, 2010Applicant: RICOH CO., LTD.Inventors: Jonathan J. Hull, Berna Erol, Jamey Graham, Peter E. Hart, Dar-Shyang Lee, Kurt Piersol
-
Patent number: 7738737Abstract: An image processing apparatus sequentially reduces a document, while changing the reduction factor step-by-step. Next, the image processing apparatus refers to the characters that constitute the document that has been reduced with the respective reduction factors, and specifies a reduction factor at which blank regions surrounded by line portions that express each character do not disappear. When an appropriate reduction factor is specified, the image processing apparatus specifies a resolution of the characters for that reduction factor, and converts the resolution of the document data to that specified resolution. Then, the image processing apparatus performs various processing for the document data whose resolution has been converted. Thus the resolution of document data is converted such that the document is reduced with a reduction factor suitable for computer processing.Type: GrantFiled: March 20, 2006Date of Patent: June 15, 2010Assignee: Fuji Xerox Co., Ltd.Inventors: Katsuhiko Itonori, Hiroaki Ikegami, Hideaki Ashikaga, Shunichi Kimura, Hiroki Yoshimura, Masanori Onda, Masahiro Kato, Masanori Satake
-
Patent number: 7734092Abstract: A method of processing an image includes receiving a digital version of the image, processing the digital version of the image through at least two binarization processes to thereby create a first binarization and a second binarization, and processing the first binarization through a first optical character recognition process to thereby create a first OCR output file. Processing the first binarization through a first optical character recognition process includes compiling first metrics associated with the first OCR output file. The method also includes processing the second binarization through the first optical character recognition process to thereby create a second OCR output file. Processing the second binarization through the first optical character recognition process includes compiling second metrics associated with the second OCR output file. The method also includes using the metrics, at least in part, to select a final OCR output file from among the OCR output files.Type: GrantFiled: November 15, 2006Date of Patent: June 8, 2010Assignee: Ancestry.com Operations Inc.Inventors: Donald B. Curtis, Shawn Reid
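The selection step in this abstract can be illustrated with a minimal sketch: one image is binarized in several ways, each binarization goes through the same OCR process, and the metrics decide which output is kept. Everything named here is an assumption for illustration only: `threshold_binarizer`, `toy_ocr`, and the single "confidence" metric are hypothetical stand-ins, not the patented method or any real OCR engine.

```python
from typing import Callable, Dict, List, Tuple

Image = List[List[int]]    # grayscale pixels, 0..255
Binary = List[List[int]]   # 1 = ink, 0 = background


def threshold_binarizer(level: int) -> Callable[[Image], Binary]:
    """Return a binarizer that marks pixels darker than `level` as ink."""
    def binarize(img: Image) -> Binary:
        return [[1 if px < level else 0 for px in row] for row in img]
    return binarize


def toy_ocr(binary: Binary) -> Tuple[str, float]:
    """Hypothetical OCR stand-in: returns (text, confidence metric).

    The confidence simply rewards an ink fraction near 0.3; a real engine
    would report its own recognition metrics.
    """
    ink = sum(sum(row) for row in binary)
    area = sum(len(row) for row in binary)
    return f"{ink} ink pixels", 1.0 - abs(ink / area - 0.3)


def select_ocr_output(
    image: Image,
    binarizers: Dict[str, Callable[[Image], Binary]],
    ocr: Callable[[Binary], Tuple[str, float]],
) -> Tuple[str, str]:
    """Run the same OCR on every binarization and keep the best-scoring output."""
    results = []
    for name, binarize in binarizers.items():
        text, confidence = ocr(binarize(image))
        results.append((confidence, name, text))
    best = max(results, key=lambda r: r[0])
    return best[1], best[2]


image = [[10, 200, 120], [130, 20, 220]]
binarizers = {"t100": threshold_binarizer(100), "t150": threshold_binarizer(150)}
winner, text = select_ocr_output(image, binarizers, toy_ocr)
```

Keeping the metric computation inside the OCR callable mirrors the abstract's wording that compiling the metrics is part of the OCR processing step.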
-
Patent number: 7715628Abstract: Precise grayscale character segmentation apparatus and method. The precise grayscale character segmentation apparatus comprises an adjustment and segmentation unit for adjusting and segmenting an inputted low resolution text line image undergone coarse segmentation, so as to generate an adjusted character image; a character image binarization unit for generating a binary character image from the character image inputted therein; a noise removal unit for removing noise information in the binary character image generated by the binarization unit; and a final character image segmentation unit for generating a precisely segmented character image from the binary character image from which noise has been removed.Type: GrantFiled: February 17, 2006Date of Patent: May 11, 2010Assignee: Fujitsu LimitedInventors: Sun Jun, Yoshinobu Hotta, Yutaka Katsuyama, Satoshi Naoi
-
Patent number: 7710602Abstract: Embodiments of the present invention comprise methods and systems for context-based adaptive image processing wherein print job elements are processed according to context, which may be determined by segmentation and analysis of print job elements.Type: GrantFiled: March 31, 2003Date of Patent: May 4, 2010Assignee: Sharp Laboratories of America, Inc.Inventor: James E. Owen
-
Publication number: 20100104188Abstract: Computer-implemented methods and systems are provided for text segmentation of textual data. Rules are accessed that define how the input stream is to be segmented into textual data elements through pattern matching. The one or more rules are applied to the input stream to determine the textual data elements in the input stream which are then provided as output.Type: ApplicationFiled: October 27, 2008Publication date: April 29, 2010Inventor: Peter Anthony Vetere
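As a rough illustration of rule-driven segmentation by pattern matching, the sketch below applies an ordered list of regular-expression rules to an input string and emits the matched textual data elements. The rule names and patterns (`NUMBER`, `WORD`, `PUNCT`) are assumptions chosen for the example; the application does not specify them.

```python
import re
from typing import List, Tuple

# Hypothetical rule set: (element type, regular expression), tried in order.
RULES: List[Tuple[str, str]] = [
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("WORD", r"[A-Za-z]+(?:'[A-Za-z]+)?"),
    ("PUNCT", r"[^\w\s]"),
]


def segment(text: str, rules: List[Tuple[str, str]] = RULES) -> List[Tuple[str, str]]:
    """Split `text` into (element type, element) pairs using the given rules."""
    pattern = re.compile("|".join(f"(?P<{name}>{regex})" for name, regex in rules))
    return [(m.lastgroup, m.group()) for m in pattern.finditer(text)]


elements = segment("Don't pay 3.50!")
```

Because the rules are data rather than code, a different segmentation behavior only requires a different rule list.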
-
Patent number: 7703113Abstract: In certain embodiments, a method for generating fees using a receiving device, involves distributing censored video from a distributor video to a receiving device; and uncensoring the censored video using the receiving device upon payment of a fee. The receiving device uses overlay data received from the distributor to uncensor the censored video by overlaying the overlay data over the censored video using a video overlay frame to overlay a video frame containing the censored video data in accordance with boundaries determined by an alpha plane within the receiving device. This abstract should not be considered limiting since embodiments consistent with the present invention may involve more, different or fewer elements.Type: GrantFiled: July 24, 2007Date of Patent: April 20, 2010Assignees: Sony Corporation, Sony Electronics Inc.Inventor: Thomas Patrick Dawson
-
Patent number: 7697758Abstract: Techniques for shape clustering and applications in processing various documents, including an output of an optical character recognition (OCR) process.Type: GrantFiled: September 11, 2006Date of Patent: April 13, 2010Assignee: Google Inc.Inventors: Luc Vincent, Raymond W. Smith
-
Patent number: 7692813Abstract: Upon synthesizing objects, information bits indicating the types of objects are lost. To solve this problem, this invention provides an image processing apparatus having discrimination means for discriminating a type of object to be rendered, determination means for determining the presence/absence of synthesis of the discriminated object, synthesis means for synthesizing an object and information of the type of object in accordance with the determination result, and processing means for appending information indicating the type of synthesized object to a rendering result obtained by rendering the object to be rendered in units of pixels.Type: GrantFiled: March 9, 2006Date of Patent: April 6, 2010Assignee: Canon Kabushiki KaishaInventors: Ken-ichi Ohta, Shigeo Yamagata, Takuto Harada, Atsushi Matsumoto
-
Publication number: 20100066735Abstract: There is provided a 3D human machine interface (“3D HMI”), which 3D HMI may include (1) an image acquisition assembly, (2) an initializing module, (3) an image segmentation module, (4) a segmented data processing module, (5) a scoring module, (6) a projection module, (7) a fitting module, (8) a scoring and error detection module, (9) a recovery module, (10) a three dimensional correlation module, (11) a three dimensional skeleton prediction module, (12) an output module and a (13) depth extraction module.Type: ApplicationFiled: April 15, 2007Publication date: March 18, 2010Inventor: Dor Givon
-
Patent number: 7680648Abstract: Methods and systems for improving text segmentation are disclosed. In one embodiment, at least a first segmented result and a second segmented result are determined from a string of characters, a first frequency of occurrence for the first segmented result and a second frequency of occurrence for the second segmented result are determined, and an operable segmented result is identified from the first segmented result and the second segmented result based at least in part on the first frequency of occurrence and the second frequency of occurrence.Type: GrantFiled: September 30, 2004Date of Patent: March 16, 2010Assignee: Google Inc.Inventors: Gilad Israel Elbaz, Jacob Leon Mandelson
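The idea of choosing between segmented results by frequency of occurrence can be sketched as follows. The frequency table `FREQ` is invented for the example, and scoring a candidate by the product of relative word frequencies is one plausible reading, not necessarily the scoring the patent uses.

```python
from math import prod
from typing import Dict, List

# Hypothetical occurrence counts; real counts would come from a corpus or query log.
FREQ: Dict[str, int] = {"no": 50, "now": 40, "here": 30, "where": 20, "nowhere": 5}


def candidate_segmentations(s: str, vocab: Dict[str, int]) -> List[List[str]]:
    """Enumerate every way to split `s` into words found in `vocab`."""
    if not s:
        return [[]]
    out: List[List[str]] = []
    for i in range(1, len(s) + 1):
        head = s[:i]
        if head in vocab:
            out.extend([head] + rest for rest in candidate_segmentations(s[i:], vocab))
    return out


def pick_segmentation(s: str, freq: Dict[str, int]) -> List[str]:
    """Return the candidate whose words have the highest joint relative frequency."""
    candidates = candidate_segmentations(s, freq)
    if not candidates:
        return [s]  # nothing in the vocabulary matches; leave the string whole
    total = sum(freq.values())
    return max(candidates, key=lambda segs: prod(freq[w] / total for w in segs))


best = pick_segmentation("nowhere", FREQ)
```

With these made-up counts the split into "now" and "here" scores higher than "no" plus "where" or the unsplit word; different counts would change the outcome.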
-
Patent number: 7676089Abstract: An apparatus, method, system, computer program and product, each capable of applying document layout analysis to a document image with control of a non-character area. A non-character area is extracted from a document image to be processed. A character image is generated from the document image by removing the non-character area from the document image. The character image is segmented into a plurality of sections to generate a segmented image. The segmented image is adjusted using a selected component of the non-character image to generate an adjusted segmented image. A segmentation result is output, which is generated based on the adjusted segmented image.Type: GrantFiled: February 28, 2006Date of Patent: March 9, 2010Assignee: Ricoh Company, Ltd.Inventor: Hirobumi Nishida
-
Publication number: 20100054599Abstract: A document processing apparatus includes: a character segmentation unit that segments a plurality of character images from a document image; a character image classifying unit that classifies the character images to categories corresponding to each of the character images; an average character image obtaining unit that obtains average character images for each of the categories of the character images classified by the character image classifying unit; a character recognizing unit that performs a character recognition to a character contained in each of the average character images; and an output unit that outputs character discriminating information as a character recognition result obtained by the character recognizing unit.Type: ApplicationFiled: February 17, 2009Publication date: March 4, 2010Applicant: Fuji Xerox Co., Ltd.Inventor: Katsuhiko Itonori
-
Publication number: 20100040287Abstract: Methods and systems for segmenting printed media pages into individual articles quickly and efficiently. A printed media based image that may include a variety of columns, headlines, images, and text is input into the system which comprises a block segmenter and an article segmenter system. The block segmenter identifies and produces blocks of textual content from a printed media image while the article segmenter system determines which blocks of textual content belong to one or more articles in the printed media image based on a classifier algorithm. A method for segmenting printed media pages into individual articles is also presented.Type: ApplicationFiled: August 13, 2008Publication date: February 18, 2010Applicant: Google Inc.Inventors: Ankur Jain, Vivek Sahasranaman, Shobhit Saxena, Krishnendu Chaudhury
-
Publication number: 20100034461Abstract: A method of generating a media signal is provided. The method detects a pattern indicating a request for a media signal to be generated from an input image, extracts a region identified by the detected pattern and generates the media signal for the extracted region.Type: ApplicationFiled: April 28, 2009Publication date: February 11, 2010Applicant: SAMSUNG ELECTRONICS CO., LTD.Inventor: Kuk-hyun HAN
-
Publication number: 20100008581Abstract: A method of characterizing a word image includes traversing the word image in steps with a window and at each of a plurality of the steps, identifying a window image. For each of the plurality of window images, a feature is extracted. The word image is characterized, based on the features extracted from the plurality of window images, wherein the features are considered as a loose collection with associated sequential information.Type: ApplicationFiled: July 8, 2008Publication date: January 14, 2010Applicant: Xerox CorporationInventor: Marco J. Bressan
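A minimal sketch of the window traversal described here: the word image is stepped through with a fixed-width window, one feature is extracted per window, and each feature is kept together with its horizontal offset so the sequential information is not lost. The ink-density feature and the function name `window_features` are assumptions for illustration; the application leaves the feature type open.

```python
from typing import List, Tuple

Binary = List[List[int]]  # 1 = ink, 0 = background, all rows the same length


def window_features(image: Binary, width: int, step: int) -> List[Tuple[int, float]]:
    """Slide a window left to right and return (x offset, ink density) pairs."""
    n_cols = len(image[0])
    features = []
    for x in range(0, max(n_cols - width, 0) + 1, step):
        window = [row[x:x + width] for row in image]
        ink = sum(sum(row) for row in window)
        area = len(window) * len(window[0])
        features.append((x, ink / area))
    return features


word_image = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
]
features = window_features(word_image, width=2, step=1)
```

Returning the offset alongside each feature lets a downstream model treat the features as a loose collection while still using their order when that helps.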
-
Publication number: 20100008582Abstract: A method for recognizing an image photographed by a camera and translating characters in connection with an electronic dictionary is provided. The method includes directly selecting an area to be recognized from the photographed character image and performing character recognition, translating and recognizing characters of a user's selected word in connection with dictionary data, and displaying translation result information of user's selected character or word in connection with dictionary data on a screen device. The recognition includes providing information on location of the selected character image area and location of the recognized character string words to the user, and then translating a character string or word in a location area selected by the user. The electronic dictionary-connected search and translation is for searching the character or word selected in connection with the electronic dictionary database, and providing translation result to the user.Type: ApplicationFiled: July 9, 2009Publication date: January 14, 2010Applicant: Samsung Electronics Co., Ltd.Inventors: Sang-Ho KIM, Seong-Taek Hwang, Sang-Wook Oh, Hyun-Soo Kim, Jung-Rim Kim, Ji-Hoon Kim, Dong-Chang Lee, Yun-Je Oh, Hee-Won Jung
-
Patent number: 7643986Abstract: A translation device for translating a document has an image analysis unit and a translation unit. The image analysis unit determines a word and an abbreviation of the word. The translation unit translates the word and generates a new abbreviation based on the translated word.Type: GrantFiled: August 30, 2005Date of Patent: January 5, 2010Assignee: Fuji Xerox Co., Ltd.Inventors: Naoko Sato, Masatoshi Tagawa, Michihiro Tamune, Atsushi Itoh, Hiroshi Masuichi, Kiyoshi Tashiro
-
Publication number: 20090324081Abstract: Disclosed is a method and an apparatus for recognizing a character and efficiently removing a misrecognized character. The method includes detecting character regions including at least one character in an input image, converting the input image into a binary image, discriminating the characters from a non-character, re-classifying the character region including a number of characters equal to or less than a threshold into a non-character region, and outputting only the characters present in the character region.Type: ApplicationFiled: June 24, 2009Publication date: December 31, 2009Applicant: SAMSUNG ELECTRONICS CO., LTD.Inventors: Sang-Wook OH, Seong-Taek HWANG, Sang-Ho KIM, Hee-Won JUNG
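The re-classification rule at the end of this abstract is simple enough to sketch directly: any detected region holding no more characters than a threshold is treated as a non-character region and dropped from the output. The representation of a region as a string of recognized characters and the name `filter_character_regions` are assumptions for the example.

```python
from typing import List


def filter_character_regions(regions: List[str], threshold: int) -> List[str]:
    """Keep only regions whose character count exceeds `threshold`.

    Regions with `threshold` or fewer characters are re-classified as
    non-character regions and therefore omitted from the result.
    """
    return [region for region in regions if len(region) > threshold]


detected = ["EXIT", "x", "OPEN", ";;"]
kept = filter_character_regions(detected, threshold=2)
```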
-
Publication number: 20090290797Abstract: An image processing apparatus has a separation unit for separating objects constituting an image input by an image input unit, a setting unit for setting a criterion to determine whether or not a separated object is stored, and a determination unit for determining whether the separated object is stored based on the criterion set by the setting unit. The image processing apparatus also has a unit for displaying the separated object, responding to a user access via an interface unit, when the separated object is determined to be stored by the determination unit and storing the separated object such that the separated object can be reused.Type: ApplicationFiled: February 11, 2009Publication date: November 26, 2009Applicant: CANON KABUSHIKI KAISHAInventors: Junya Arakawa, Hiroshi Kaburagi, Tsutomu Sakaue, Takeshi Namikala, Manabu Takebayashi, Reiji Misawa, Osamu Iinuma, Naoki Ito, Yoichi Kashibuchi, Shinji Sano
-
Patent number: 7617173Abstract: The present invention includes methods for printing and verifying postage indicia. At least a portion of the indicia is printed with a resolution characteristic that may be changed from indicium to indicium. Each indicium includes data that indicates the resolution used to print the indicium or indicium portion.Type: GrantFiled: October 28, 2003Date of Patent: November 10, 2009Assignee: Pitney Bowes Inc.Inventor: Easwaran Nambudiri
-
Patent number: 7616333Abstract: An application programming interface instantiates an ink analyzer object that receives document data for a document containing electronic ink content from a software application hosting the document and running on a first processing thread. The ink analyzer object then employs the first thread to make a copy of the document data, provides the copy of the document data to an electronic ink analysis process, and returns control of the first processing thread to the analysis process. After the analysis process has analyzed the electronic ink, the ink analyzer object reconciles the results of the analysis process with current document data for the document.Type: GrantFiled: October 14, 2005Date of Patent: November 10, 2009Assignee: Microsoft CorporationInventors: Jamie N. Wakeam, Gavin M. Gear, Jerome J. Turner, Sebastian Poulose, Subha Bhattacharyay, Todd M. Landstad, Roman Snystar, Timothy H. Kannapel, Jennifer Teed, Erin Devoy
-
Publication number: 20090268039Abstract: An apparatus and method for outputting multimedia, and an education apparatus using a camera, are disclosed, wherein an object is photographed by a camera, feature points are extracted from images of the photographed object, and the multimedia corresponding to the image that best matches the feature points is output from a database, such that an output speed of the multimedia can be increased.Type: ApplicationFiled: April 29, 2008Publication date: October 29, 2009Inventor: Man Hui Yi
-
Patent number: 7606418Abstract: A writing analysis apparatus analyzes a content of a writing, probes into various kinds of images contained in the given writing, etc., in a time series manner. The writing analytic apparatus includes: a writing source having writing data; a word list having one or more sets of word data representing a predetermined image; and writing analyzing means which decomposes a writing in the writing source into a predetermined analysis unit which includes at least one sentence, extracts words existing in the word list from the analysis unit and creates an analytic table which shows each extracted word in accordance with the analysis unit. An image included in the writing can be analyzed based on various factors presented in the word list.Type: GrantFiled: June 15, 2004Date of Patent: October 20, 2009Inventors: Keiko Mizoo, Asahiko Mizoo
-
Patent number: 7602995Abstract: An apparatus, system, method, and computer program product is disclosed, each capable of correcting distortion in a scanned image, using at least a character line extracted from the scanned image. The character line is extracted based on a circumscribed rectangle, representing the vertical component of the character. The distortion in the scanned image is corrected based on the length of the circumscribed rectangle in the main scanning direction.Type: GrantFiled: February 10, 2005Date of Patent: October 13, 2009Assignee: Ricoh Company, Ltd.Inventors: Tadashi Araki, Maki Shinoda
-
Patent number: 7596270Abstract: A method, system, and computer-readable medium containing computer-executable instructions are provided, for randomly relocating text character images of a scanned-in Asian character document to produce a shuffled image, wherein the meaning of text in the shuffled image is not understandable although individual characters forming the text in the shuffled image are recognizable. In one embodiment, the method includes generally four steps: (1) dividing an Asian character document image into a text image portion and a non-text image portion; (2) structuring the text image portion into a multiple resolution-level pyramid; (3) extracting shuffleable character images by analyzing the multiple-resolution-level pyramid; and (4) shuffling some or all of the extracted shuffleable character images to create a shuffled image. The shuffled (e.g., encoded) image can be reshuffled (e.g.Type: GrantFiled: September 23, 2005Date of Patent: September 29, 2009Assignee: DynaComware Taiwan Inc.Inventor: Kuo-Young Cheng
-
Patent number: 7596269Abstract: A system for processing text captured from rendered documents is described. The system receives a sequence of one or more words optically or acoustically captured from a rendered document by a user. The system identifies among words of the sequence a word with which an action has been associated. The system then performs the associated action with respect to the user.Type: GrantFiled: April 1, 2005Date of Patent: September 29, 2009Assignee: Exbiblio B.V.Inventors: Martin T. King, Dale L. Grover, Clifford A. Kushler, James Q. Stafford-Fraser
-
Patent number: 7593572Abstract: Ink-parser-parameter optimization may be performed via parallel processing to accelerate searching for a set of optimal ink-parser parameters. Evaluators may parse pages of ink notes with different groups of parameters and may compute corresponding values for evaluation functions. Separate evaluation functions may be defined for the following types of ink-parser parsing engines: writing parser, writing/drawing classification, table detection, and list detection. A searcher may perform a grid-searching algorithm or a genetic algorithm to generate groups of parameters and may then pass the parameters to available evaluators for evaluation until evaluation-function values for a group of parameters satisfy a convergence condition.Type: GrantFiled: February 9, 2006Date of Patent: September 22, 2009Assignee: Microsoft CorporationInventors: Zhouchen Lin, Yantao Li, Yu Zou, Xianfang Wang, Jian Wang
-
Patent number: 7593574Abstract: Systems and methods are disclosed that facilitate normalizing and beautifying digitally generated handwriting, such as can be generated on a tablet PC or via scanning a handwritten document. A classifier can identify extrema in the digital handwriting and label such extrema according to predefined categories (e.g., bottom, baseline, midline, top, other, . . . ). Multi-linear regression, polynomial regression, etc., can be performed to align labeled extrema to respective and corresponding desired points as indicated by the labels. Additionally, displacement techniques can be applied to the regressed handwriting to optimize legibility for reading by a human viewer and/or for character recognition by a handwriting recognition application. The displacement techniques can comprise a “rubber sheet” displacement algorithm in conjunction with a “rubber rod” displacement algorithm, which can collectively preserve spatial features of the handwriting during warping thereof.Type: GrantFiled: July 1, 2005Date of Patent: September 22, 2009Assignee: Microsoft CorporationInventors: Patrice Y. Simard, Maneesh Agrawala, David W. Steinkraus
-
Publication number: 20090202151Abstract: An image processing apparatus of an embodiment of the invention includes a character region characteristic determination unit to identify a character region of an image and to output a character region characteristic determination signal, a character region image separation unit to separate, based on the character region characteristic determination signal, the image into at least two attribute regions, that is, plural character region images and an other region image, and a separated image processing unit to process each of the plural character region images and the other region image, and in at least the separated image processing unit, according to a characteristic of each of the plural character region images, at least one process of a compression method, a compression ratio, a resolution, and a multi-value number for at least one of the character region images is different from a process of the other region image or the other character region image.Type: ApplicationFiled: February 13, 2008Publication date: August 13, 2009Applicants: KABUSHIKI KAISHA TOSHIBA, TOSHIBA TEC KABUSHIKI KAISHAInventor: Sunao Tabata
-
Publication number: 20090169106Abstract: A method for altering a recognition error correction data structure, the method includes: altering at least one key out of a set of semantically similar keys in response to text appearance probabilities of keys of the set of semantically similar keys to provide an at least one altered key; and replacing the at least one key by the at least one altered key.Type: ApplicationFiled: January 2, 2008Publication date: July 2, 2009Inventors: Ella Barkan, Tal Drory, Andre Heilper
-
Publication number: 20090161955Abstract: A method for extracting a character string from print data rasterizes the print data into a raster image. Then, the method divides the raster image into a character region and non-character region and determines character data used for metadata based on the raster image of the character region and character data extracted from the print data and drawn at approximately the same position as the character region.Type: ApplicationFiled: December 17, 2008Publication date: June 25, 2009Applicant: CANON KABUSHIKI KAISHAInventor: Naohiro Isshiki
-
Patent number: 7545992Abstract: The present invention provides an image processing system and image processing method which can reliably transmit image information to a destination without attaching a large file which applies load to an e-mail system or reception terminal and make the receiving side easily acquire necessary image data on the basis of determination on the receiving side. In an image input/output device (10), image information is input from an image input device (201) and stored in a HDD (208) in a control unit (200). A low-resolution image or vector data is generated from the image information in accordance with the properties of objects contained in the image information. The generated information and information about the storage location of the image information are transmitted to a designated transmission destination.Type: GrantFiled: July 6, 2005Date of Patent: June 9, 2009Assignee: Canon Kabushiki KaishaInventors: Shinichi Kato, Hiroyuki Yaguchi
-
Publication number: 20090129676Abstract: Disclosed are systems and methods for segmenting a string comprised of one or more string segments using similarity values. In embodiments, each string segment may contain at least a variation of a marker string that may be used to separate string segments in the string. In embodiments, a similarity value representing the result of comparing the marker string to substrings of the string may be computed, and a similarity vector representing the set of comparisons for the locations on the string may be generated. In embodiments, the similarity vector may be used to identify candidate segmentation locations in the string. In embodiments, a set of segmentation locations in the string may be derived from the candidate segmentation locations in the string, and the string may be segmented according to the set of segmentation locations.Type: ApplicationFiled: November 20, 2007Publication date: May 21, 2009Inventors: Ali Zandifar, Jing Xiao
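The similarity-vector approach can be sketched as below: the marker string is compared with the substring at every position, the resulting scores form the similarity vector, and local peaks above a threshold become segmentation locations. Using `difflib.SequenceMatcher.ratio()` as the similarity value, the 0.7 threshold, and the peak-picking rule are all assumptions made for this example rather than details taken from the application.

```python
from difflib import SequenceMatcher
from typing import List


def similarity_vector(s: str, marker: str) -> List[float]:
    """Similarity of `marker` to the substring of equal length at each position."""
    m = len(marker)
    return [
        SequenceMatcher(None, marker, s[i:i + m]).ratio()
        for i in range(len(s) - m + 1)
    ]


def segment_by_marker(s: str, marker: str, threshold: float = 0.7) -> List[str]:
    """Cut `s` at local peaks of the similarity vector that reach `threshold`."""
    sims = similarity_vector(s, marker)
    cuts = []
    for i, value in enumerate(sims):
        left = sims[i - 1] if i > 0 else -1.0
        right = sims[i + 1] if i + 1 < len(sims) else -1.0
        if value >= threshold and value > left and value >= right:
            cuts.append(i)
    # Always start the first segment at position 0.
    bounds = cuts if cuts and cuts[0] == 0 else [0] + cuts
    bounds = bounds + [len(s)]
    return [s[a:b] for a, b in zip(bounds, bounds[1:])]


segments = segment_by_marker("ID:aaID:bbID:cc", "ID:")
```

Lowering the threshold makes the sketch tolerate distorted variations of the marker, at the cost of more spurious candidate locations.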
-
Patent number: 7532756Abstract: A grayscale character dictionary generation apparatus, comprising a first synthetic grayscale degraded character image generation unit for generating first synthetic grayscale degraded character images using binary character images inputted therein; a clustering unit for dividing each category of the first synthetic grayscale degraded character images generated by the first synthetic grayscale degraded character image generation unit into a plurality of clusters; a template generation unit for generating template for each of the clusters; a transformation matrix generation unit for generating transformation matrix in relation to each of the templates; and a second synthetic grayscale degraded character dictionary generation unit for obtaining character feature of every grayscale degraded character of each of the clusters using the transformation matrix, and for constructing eigenspace of each category of the synthetic grayscale degraded character, which is the second synthetic grayscale character dictionary.Type: GrantFiled: January 11, 2006Date of Patent: May 12, 2009Assignee: Fujitsu LimitedInventors: Sun Jun, Yoshinobu Hotta, Yutaka Katsuyama, Satoshi Naoi
-
Patent number: 7526128Abstract: A method and system of line extraction in a digital ink sequence of handwritten text data points are provided, in which a stroke sequence comprised of a sequence of strokes is obtained; the strokes are segmented into a sequence of substrokes by applying a stroke segmentation algorithm; angular differences are calculated between neighboring groups of substrokes in the sequence of substrokes; and the positions of the extrema of the angular differences are determined, thereby identifying the substrokes at line breaks and enabling line extraction.Type: GrantFiled: February 17, 2004Date of Patent: April 28, 2009Assignee: Silverbrook Research Pty LtdInventors: Dimitrios Koubaroulis, Jonathon Leigh Napper, Paul Lapstun, Kia Silverbrook
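A simplified sketch of finding line breaks from angular differences follows. It assumes each substroke has already been reduced to a centroid point, compares the direction of each step between consecutive centroids with the mean direction of up to `k` neighboring steps on either side, and reports large local maxima as line breaks. The centroid representation, the parameter `k`, and the 90-degree threshold are illustrative assumptions, not the procedure claimed in the patent.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def line_breaks(centroids: List[Point], k: int = 2, threshold_deg: float = 90.0) -> List[int]:
    """Return indices of centroids that start a new line of handwriting."""
    # Unit direction of each step between consecutive substroke centroids.
    units = []
    for (x1, y1), (x2, y2) in zip(centroids, centroids[1:]):
        dx, dy = x2 - x1, y2 - y1
        norm = math.hypot(dx, dy) or 1.0
        units.append((dx / norm, dy / norm))

    # Angular difference between each step and its neighboring group of steps.
    diffs = []
    for i, (ux, uy) in enumerate(units):
        group = units[max(0, i - k):i] + units[i + 1:i + 1 + k]
        mx = sum(g[0] for g in group)
        my = sum(g[1] for g in group)
        dot = ux * mx + uy * my
        cross = mx * uy - my * ux
        diffs.append(abs(math.degrees(math.atan2(cross, dot))))

    # A line break is a local maximum of the angular difference above threshold.
    return [
        i + 1
        for i, d in enumerate(diffs)
        if d >= threshold_deg
        and (i == 0 or d > diffs[i - 1])
        and (i + 1 == len(diffs) or d >= diffs[i + 1])
    ]


# Two lines of four substrokes each, written left to right.
points = [(0, 0), (1, 0), (2, 0), (3, 0), (0, -1), (1, -1), (2, -1), (3, -1)]
breaks = line_breaks(points)
```

Comparing a step with neighbors on both sides gives a single peak at the return sweep, whereas comparing only consecutive steps would produce two adjacent peaks per line break.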
-
Publication number: 20090103808Abstract: An image processing method comprises analysing an image of a portion of text, and detecting the inter-line spacing and the inter-word spacing across the area of the image. Based on the inter-line and inter-word spacings, a quadrilateral shape is derived which represents the deformation of the text image from an undistorted image. The image is modified to perform perspective correction based on the derived quadrilateral.Type: ApplicationFiled: September 22, 2008Publication date: April 23, 2009Inventors: Prasenjit Dey, Anbumani Subramanian
-
Patent number: 7512272Abstract: A method and system for recognizing alphabetic characters that contain diacritics is described. An image analysis separates the character into its constituent components. The one or more diacritic components are then distinguished and isolated from the base portion of the character. Optical recognition is performed separately on the base portion. The diacritic is recognized through a special image analysis and pattern recognition algorithms. The image analysis extracts geometric information from the one or more diacritic components. The extracted information is used as input for the pattern recognition algorithms. The output is a code that corresponds to a particular diacritic. The recognized base portion and diacritic are combined and a check is performed for acceptable combinations in a chosen language. By separately recognizing the base portion and diacritic, the character sets used by the recognizer can be narrowed, resulting in greater recognition.Type: GrantFiled: October 5, 2004Date of Patent: March 31, 2009Assignee: Cardiff Software, Inc.Inventors: Isaac Mayzlin, Emily Ann Deere
-
Publication number: 20090080775Abstract: In an image-processing apparatus having a capability of performing region distinction processing and an image region discrimination processing method, a first region distinction unit uses a previously set threshold value for an image region distinction to perform a region distinction processing of a character and a non-character on image data read from an original document, an edge feature amount image and a character determination signal are obtained, a second region distinction unit makes a region distinction on the edge feature amount image based on the threshold value and generates and displays sub-region images obtained by dividing the edge feature amount image into plural parts, a character discrimination strength adjustment is performed on a display screen while each of the sub-region images is visually identified, the correction parameter is reflected in the edge feature amount image, and the region distinction processing is performed again.Type: ApplicationFiled: September 24, 2007Publication date: March 26, 2009Applicants: KABUSHIKI KAISHA TOSHIBA, TOSHIBA TEC KABUSHIKI KAISHAInventor: Hiromasa Tanaka
-
Patent number: 7505632Abstract: The present invention relates to a method, apparatus and storage medium for enhancing a document image, and a method, apparatus and storage medium for character recognition. To enhance the document image, especially a half-tone block image, and improve its recognition ratio, the block image is segmented into line images, which are subjected to noise reduction. Then, based on the connected component densities, the noise-reduced line images are sorted into three types: normal line images, broken-stroke line images and hollow-stroke line images. Based on their types and other properties, the noise-reduced line images are enhanced, generating enhanced line images, which together constitute an enhanced block image.Type: GrantFiled: November 12, 2004Date of Patent: March 17, 2009Assignee: Canon Kabushiki KaishaInventors: Ou Hu, Xian Li
-
Publication number: 20090060335Abstract: A method of characterizing a word image includes traversing the word image stepwise with a window to provide a plurality of window images. For each of the plurality of window images, the method includes splitting the window image to provide a plurality of cells. A feature, such as a gradient direction histogram, is extracted from each of the plurality of cells. The word image can then be characterized based on the features extracted from the plurality of window images.Type: ApplicationFiled: August 30, 2007Publication date: March 5, 2009Inventors: Jose A. Rodriguez Serrano, Florent C. Perronnin
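The windowing scheme in this abstract can be illustrated with a short sketch. It is only one possible reading, not the method claimed in the application: the window width, step size, 2x2 cell grid, 8-bin histogram, and every function name below are assumptions chosen for illustration, and the image is taken to be a plain list of rows of grey values.

```python
import math

def gradient_histogram(cell, bins=8):
    """Histogram of gradient directions in one cell, weighted by gradient magnitude."""
    hist = [0.0] * bins
    rows, cols = len(cell), len(cell[0])
    for y in range(rows):
        for x in range(cols):
            # Central differences, clamped at the cell border.
            gx = cell[y][min(x + 1, cols - 1)] - cell[y][max(x - 1, 0)]
            gy = cell[min(y + 1, rows - 1)][x] - cell[max(y - 1, 0)][x]
            mag = math.hypot(gx, gy)
            if mag == 0:
                continue
            angle = math.atan2(gy, gx) % (2 * math.pi)
            hist[int(angle / (2 * math.pi) * bins) % bins] += mag
    return hist

def split_into_cells(window, n_rows, n_cols):
    """Split a window into an n_rows x n_cols grid of cells (row-major order)."""
    rows, cols = len(window), len(window[0])
    cells = []
    for r in range(n_rows):
        for c in range(n_cols):
            y0, y1 = r * rows // n_rows, (r + 1) * rows // n_rows
            x0, x1 = c * cols // n_cols, (c + 1) * cols // n_cols
            cells.append([row[x0:x1] for row in window[y0:y1]])
    return cells

def characterize_word_image(image, window_width=4, step=2,
                            n_rows=2, n_cols=2, bins=8):
    """Slide a window left to right and return one feature vector per window."""
    width = len(image[0])
    features = []
    for x0 in range(0, width - window_width + 1, step):
        window = [row[x0:x0 + window_width] for row in image]
        vector = []
        for cell in split_into_cells(window, n_rows, n_cols):
            vector.extend(gradient_histogram(cell, bins))
        features.append(vector)
    return features

# Hypothetical 4x8 word image with a vertical edge at column 3.
word = [[0, 0, 0, 1, 1, 1, 1, 1] for _ in range(4)]
sequence = characterize_word_image(word)
print(len(sequence), len(sequence[0]))
```

Each window yields a vector of `n_rows * n_cols * bins` values, so the word image becomes a sequence of fixed-length feature vectors, which is the form a sequence model or similar classifier would typically consume.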
-
Publication number: 20090060336Abstract: A document image processing apparatus includes a specifying section, an extracting section, a recognizing section, an interpreting section, an arranging section and a generating section. The specifying section specifies a sentence region including a character row from a document image. The extracting section extracts at least one of character row images included in the specified sentence region. The recognizing section recognizes respective characters included in the extracted character row image. The interpreting section interprets an original sentence character row comprising the recognized characters and generates an interpreted sentence character row. The arranging section arranges the respective character row images in the sentence region by contracting the respective character row images. The arranging section arranges the generated respective interpreted sentence character rows in a vacant region except a region arranging the respective character row images from the sentence region.Type: ApplicationFiled: March 19, 2008Publication date: March 5, 2009Applicant: FUJI XEROX CO., LTD.Inventor: Yuya Konno
-
Patent number: 7471826Abstract: A method for segmentation of characters in text that segments text into lines, words and slices and determines at least one of fixed pitch and proportional pitch prior to segmentation. The method computes histograms of the lines and defines widths of lobes of the histograms of the lines as the character pitches. In addition, the method further analyzes the character pitches; segments lines into words; computes histograms of the words and aggregates the histograms of the words at predetermined points. Moreover, the method segments the words, slices them into an upper slice and a lower slice, and further segments the upper slice and the lower slice. The results are then combined to provide for both coarse and fine segmentation that enhance the performance of character OCR for documents scanned as at least one of gray-scale images and color images.Type: GrantFiled: March 31, 2008Date of Patent: December 30, 2008Assignee: International Business Machines CorporationInventors: Yaakov Navon, Eugeniusz Walach
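The histogram step in this abstract can be sketched as a column projection over a binarized text line, with each run of non-empty columns treated as a lobe. This is a minimal illustration under stated assumptions, not the patented procedure: the binary input format, the function names, and in particular the 20% spread rule used to guess fixed versus proportional pitch are hypothetical.

```python
def column_histogram(line_image):
    """Count ink pixels (value 1) in each column of a binary text line."""
    return [sum(col) for col in zip(*line_image)]

def find_lobes(hist, threshold=0):
    """Return (start, end) column ranges where the histogram exceeds threshold."""
    lobes, start = [], None
    for x, value in enumerate(hist):
        if value > threshold and start is None:
            start = x
        elif value <= threshold and start is not None:
            lobes.append((start, x))
            start = None
    if start is not None:
        lobes.append((start, len(hist)))
    return lobes

def classify_pitch(lobes, tolerance=0.2):
    """Guess 'fixed' or 'proportional' from the spread of lobe widths.

    Assumed heuristic: widths within `tolerance` of the mean count as fixed pitch.
    """
    widths = [end - start for start, end in lobes]
    if not widths:
        return "unknown"
    mean = sum(widths) / len(widths)
    spread = max(widths) - min(widths)
    return "fixed" if spread <= tolerance * mean else "proportional"

# Hypothetical two-row binary line containing three glyphs of equal width.
line = [[1, 1, 0, 1, 1, 0, 1, 1],
        [1, 0, 0, 1, 1, 0, 0, 1]]
lobes = find_lobes(column_histogram(line))
print(lobes, classify_pitch(lobes))
```

The same `find_lobes` routine could be reused on word images or on the upper and lower slices of a word, since each is again a binary image with a column histogram.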
-
Publication number: 20080317343Abstract: Aspects of the present invention relate to systems and methods for determining text orientation in a digital image.Type: ApplicationFiled: June 21, 2007Publication date: December 25, 2008Inventors: Ahmet Mufit Ferman, Jon M. Speigle
-
Patent number: 7468801Abstract: An application programming interface instantiates an ink analyzer object that receives document data for a document containing electronic ink content from a software application hosting the document and running on a first processing thread. The ink analyzer object then employs the first thread to make a copy of the document data, provides the copy of the document data to an electronic ink analysis process, and returns control of the first processing thread to the analysis process. After the analysis process has analyzed the electronic ink, the ink analyzer object reconciles the results of the analysis process with current document data for the document.Type: GrantFiled: August 21, 2003Date of Patent: December 23, 2008Assignee: Microsoft CorporationInventors: Jamie Wakeam, Richard Duncan, Bodin Dresevic, Herry Sutanto, Sashi Raghupathy, Timothy H. Kannapel, Zoltan Szilagyi, Jerome Turner, Todd Landstad, Haiyong Wang, Roman Snytsar
-
Publication number: 20080304746Abstract: To provide a method and apparatus for character string recognition that enables improvement in accuracy of character recognition while maintaining high-speed operation performance in character recognition.Type: ApplicationFiled: April 28, 2008Publication date: December 11, 2008Applicant: NIDEC SANKYO CORPORATIONInventor: Hiroshi Nakamura
-
Patent number: 7460711Abstract: A method for reading a meter includes (1) capturing a first image of digits displayed by the meter, (2) roughly locating the digits by correlating the entire first image against symbols, (3) precisely locating the digits by correlating the digits against the symbols, which are now rotated, resized, and repositioned to maximize correlation, (4) determining and storing nominal centers of the digits in a nonvolatile memory. The method further includes (5) capturing a second image of the digits, (6) locating regions of interest in the second image according to the nominal centers, (7) determining vertical positions of full digits (or partial digits) in the regions of interest, (8) aligning symbols (or partial symbols) and the full digits (or the partial digits) according to the vertical position, and (9) correlating the symbol and the full digits (or the partial symbols and the partial digits).Type: GrantFiled: August 27, 2004Date of Patent: December 2, 2008Assignee: Avago Technologies ECBU IP (Singapore) Pte. Ltd.Inventors: Richard L. Baer, Mark M Butterworth, Peter H. Mahowald
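The correlation steps in this abstract can be illustrated with a normalized cross-correlation between a digit region and a set of symbol templates. The sketch below covers only the final matching idea. It assumes the region and templates are already aligned and equal in size, and it omits the rotation, resizing, and partial-digit handling the abstract mentions. All names and the tiny 3x3 templates are hypothetical.

```python
import math

def normalized_correlation(a, b):
    """Normalized cross-correlation of two equal-sized 2D patches, in [-1, 1]."""
    fa = [v for row in a for v in row]
    fb = [v for row in b for v in row]
    mean_a, mean_b = sum(fa) / len(fa), sum(fb) / len(fb)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(fa, fb))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in fa) *
                    sum((y - mean_b) ** 2 for y in fb))
    # A perfectly flat patch has no variance; treat it as uncorrelated.
    return num / den if den else 0.0

def classify_digit(region, templates):
    """Return the label of the template that correlates best with the region."""
    return max(templates,
               key=lambda label: normalized_correlation(region, templates[label]))

# Hypothetical 3x3 symbol templates and a slightly noisy region of interest.
templates = {
    "1": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "7": [[1, 1, 1], [0, 0, 1], [0, 0, 1]],
}
region = [[1, 1, 1], [0, 0, 1], [0, 1, 1]]
print(classify_digit(region, templates))
```

Subtracting the mean and dividing by the standard deviations makes the score insensitive to overall brightness and contrast, which matters when the meter is read under varying illumination.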
-
Publication number: 20080292186Abstract: A word recognition method of performing recognition processing with respect to each word candidate obtained by reading characters in character information written in a reading material is provided. This word recognition method includes a matching processing step of collating each word candidate with a plurality of words in a word dictionary and calculating, every word, a matching score indicative of a degree that each word candidate matches with a word, a character quality score calculating step of calculating a character quality score indicative of a degree that a character candidate constituting each word candidate matches with an arbitrary character, and a correcting step of correcting a matching score obtained at the matching processing step based on a character quality score acquired at the character quality score calculating step.Type: ApplicationFiled: August 1, 2008Publication date: November 27, 2008Applicant: KABUSHIKI KAISHA TOSHIBAInventor: Tomoyuki Hamamura
-
Patent number: 7454063Abstract: The present invention is a method of optical character recognition. First, text is received. Next all words in the text are identified and associated with the appropriate line in the document. The directional derivative of the pixellation density function defining the text is then taken, and the highest value points for each word are identified from this equation. These highest value points are used to calculate a baseline for each word. A median anticipated baseline is also calculated and used to verify each baseline, which is corrected as necessary. Each word is then parsed into feature regions, and the features are identified through a series of complex analyses. After identifying the main features, outlying ornaments are identified and associated with appropriate features. The results are then compared to a database to identify the features and then displayed.Type: GrantFiled: September 22, 2005Date of Patent: November 18, 2008Assignee: The United States of America as represented by the Director National Security AgencyInventors: Kyle E Kneisl, Jesse Otero
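One way to picture the baseline steps in this abstract is to treat the per-row ink count of a word as its density function, take the difference between adjacent rows as a discrete vertical derivative, and place the baseline where the density drops most sharply. This is an interpretive sketch only: the abstract does not specify the derivative direction or the correction rule, so the row-difference approach, the tolerance test, and all names here are assumptions.

```python
def row_density(word_image):
    """Count ink pixels (value 1) in each row of a binary word image."""
    return [sum(row) for row in word_image]

def estimate_baseline(word_image):
    """Return the last row before the sharpest drop in ink density."""
    density = row_density(word_image)
    drops = [density[y] - density[y + 1] for y in range(len(density) - 1)]
    if not drops:
        return 0
    return max(range(len(drops)), key=drops.__getitem__)

def correct_baselines(baselines, tolerance=1):
    """Replace baselines that stray from the median by more than tolerance rows."""
    if not baselines:
        return []
    median = sorted(baselines)[len(baselines) // 2]
    return [b if abs(b - median) <= tolerance else median for b in baselines]

# Hypothetical word with an ascender in row 0 and a descender in row 3.
word = [[0, 1, 0, 0],
        [1, 1, 1, 1],
        [1, 1, 1, 1],
        [0, 1, 0, 0],
        [0, 0, 0, 0]]
print(estimate_baseline(word))
print(correct_baselines([2, 2, 7, 2]))
```

Checking each word against the median for its line guards against words made up mostly of descenders or punctuation, whose own density profile gives a misleading estimate.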
-
Publication number: 20080267503Abstract: An interactive system provides for increasing retrieval performance of images depicting text by allowing users to provide relevance feedback on words contained in the images. The system includes a user interface through which the user queries the system with query terms for images contained in the system. Word image suggestions are displayed to the user through the user interface, where each word image suggestion contains text, as recognized from the word image by the system, that is the same as or a slight variant of the particular query terms. Word image suggestions can be included in the system by the user to increase system recall of images for the one or more query terms and can be excluded from the system by the user to increase precision of image retrieval results for particular query terms.Type: ApplicationFiled: April 26, 2007Publication date: October 30, 2008Applicant: FUJI XEROX CO., LTD.Inventors: Laurent Denoue, John E. Adcock, David M. Hilbert, Daniel Billsus