Distinguishing Text From Other Regions Patents (Class 382/176)

Image processing apparatus and image processing method

Patent number: 8995033

Abstract: An image processing apparatus includes an image data acquiring portion, an image determining portion, and an image data converting portion. The image data acquiring portion is configured to acquire first image data representing a color image. The image determining portion is configured to determine whether or not the image represented by the first image data is an image mainly composed of black characters. The image data converting portion is configured to convert the first image data to second image data by converting the first image data to binary image data when the image determining portion determines that the image represented by the first image data is an image mainly composed of black characters.

Type: Grant

Filed: March 26, 2014

Date of Patent: March 31, 2015

Assignee: KYOCERA Document Solutions Inc.

Inventor: Kunihiko Tanaka
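
The decision described in the abstract above (binarize only when the page is mainly black characters) can be sketched as follows. This is a minimal illustration, not the patented method: the neutral-pixel test, the function names, and every threshold are assumptions chosen for the example.

```python
def is_mainly_black_text(pixels, chroma_tol=30, min_neutral_ratio=0.95):
    """Guess whether a color image is mostly black text on plain paper.

    pixels: list of (r, g, b) tuples in 0..255.
    Assumption: a page of black characters has almost no saturated pixels,
    so we count pixels whose channels are nearly equal.
    """
    if not pixels:
        return False
    neutral = sum(1 for r, g, b in pixels
                  if max(r, g, b) - min(r, g, b) <= chroma_tol)
    return neutral / len(pixels) >= min_neutral_ratio


def to_binary(pixels, threshold=128):
    """Map each pixel to 1 (ink) or 0 (paper) using the channel mean."""
    return [1 if (r + g + b) / 3 < threshold else 0 for r, g, b in pixels]


def convert(pixels):
    """Return binary data for text-like pages, the original data otherwise."""
    if is_mainly_black_text(pixels):
        return to_binary(pixels)
    return pixels


page = [(255, 255, 255)] * 90 + [(10, 10, 10)] * 10
photo = [(200, 30, 30)] * 50 + [(255, 255, 255)] * 50
binary_page = convert(page)      # ones and zeros
unchanged_photo = convert(photo)  # still (r, g, b) tuples
```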
Feature sensitive captioning of media content

Patent number: 8989490

Abstract: There are provided methods and systems for use in performing feature sensitive captioning of media content. In one implementation, such a method includes detecting an aesthetically determinative feature of a media content unit selected by a user, and determining a captioning aesthetic for a caption of the media content unit based at least in part on the aesthetically determinative feature. The captioning aesthetic may include a background aesthetic and a text aesthetic. The captioning aesthetic may be utilized by a feature sensitive captioning application to produce a feature sensitive caption for the media content unit.

Type: Grant

Filed: July 18, 2013

Date of Patent: March 24, 2015

Assignee: Disney Enterprises, Inc.

Inventors: Jinsong Gu, Amy Ilyse Rosenthal, Gillian Salit, Scott Gerlach, Andrea Kuhnertova
Detection of transitions between text and non-text frames in a video stream

Patent number: 8989499

Abstract: Detecting the start of a credit roll within a video program may allow for the automatic extension of video recordings among other functions. The start of the credit roll may be detected by determining the number of text blocks within a sequence of frames and identifying a point in the sequence of frames where a difference between the number of text blocks in frames occurring before the point and the number of text blocks in frames occurring after the point is greatest and exceeds a specified threshold. Text blocks may be identified within each frame by partitioning the frame into one or more segments and recording the segments having a pixel of a sufficiently high contrast. Contiguous segments may be merged or combined into single blocks, which may then be filtered to remove noise and false positives. Additional content may be inserted into the credit roll frames.

Type: Grant

Filed: October 20, 2010

Date of Patent: March 24, 2015

Assignee: Comcast Cable Communications, LLC

Inventors: Oliver Jojic, David F. Houghton
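
The transition test in the abstract above (find the point where the text-block count jumps the most, provided the jump exceeds a threshold) might look like the sketch below. The per-frame counts are taken as given, and comparing averages over a fixed window on each side is an assumption of this example, as are the names and default values.

```python
def find_credit_roll_start(block_counts, window=3, threshold=2.0):
    """Return the frame index where text-block counts rise the most.

    block_counts: number of text blocks detected in each frame, in order.
    Compares the mean count in `window` frames before each candidate point
    with the mean count in `window` frames after it. Returns None when no
    rise exceeds `threshold`.
    """
    best_index, best_diff = None, threshold
    for i in range(window, len(block_counts) - window + 1):
        before = sum(block_counts[i - window:i]) / window
        after = sum(block_counts[i:i + window]) / window
        diff = after - before
        if diff > best_diff:
            best_index, best_diff = i, diff
    return best_index


counts = [0, 1, 0, 0, 1, 6, 7, 8, 7, 8]
start = find_credit_roll_start(counts)
```

With these counts the largest rise is at frame index 5, where the counts move from about one block per frame to six or more.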
Control apparatus controlling processing of image read by reading device

Patent number: 8989489

Abstract: In a control apparatus, a controller operates as: identifying a reading condition instructed for reading an image from a document; and determining a method of an analysis processing, the identifying including identifying a reading section instructed to read an image from the document. If an identified reading condition satisfies a first condition including that an identified reading section is a first reading section configured to read an image from a document while maintaining the document to be stationary, a first analysis processing configured to extract a first type region from a read out image is determined. If the identified reading condition satisfies a second condition including that the identified reading section is a second reading section configured to read an image from the document while conveying the document, a second analysis processing configured to extract a second type region from the read out image is determined.

Type: Grant

Filed: June 30, 2013

Date of Patent: March 24, 2015

Assignee: Brother Kogyo Kabushiki Kaisha

Inventor: Mayumi Kuraya
Method and system for preprocessing the region of video containing text

Patent number: 8989491

Abstract: A method and system for preprocessing a text-containing region of a video. The invention provides a method and system for preprocessing the text-containing region of a video to improve the optical character recognition input.

Type: Grant

Filed: December 29, 2010

Date of Patent: March 24, 2015

Assignee: Tata Consultancy Services Limited

Inventors: Tanushyam Chattopadhyay, Aniruddha Sinha, Arpan Pal
DETECTING TEXT USING STROKE WIDTH BASED TEXT DETECTION

Publication number: 20150078664

Abstract: Detecting text using stroke width based text detection. As a part of the text detection, a representation of an image is generated that includes pixels that are associated with the stroke widths of components of the image. Connected components of the image are identified by filtering out portions of the pixels using metrics related to stroke width. Text is detected in the image based on the identified connected components.

Type: Application

Filed: November 26, 2014

Publication date: March 19, 2015

Inventors: Boris Epshtein, Eyal Ofek, Yonatan Wexler
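
The idea in the abstract above, that text strokes have a roughly constant width while other shapes do not, can be illustrated with a deliberately simplified sketch. It is not the stroke width transform of the application: here a pixel's width is just the shorter of the horizontal and vertical foreground runs through it, and the coefficient-of-variation cutoff `max_cv` is an assumed parameter.

```python
from statistics import mean, pstdev


def stroke_width_map(img):
    """Map each foreground pixel (y, x) of a 0/1 grid to a crude stroke width."""
    h, w = len(img), len(img[0])

    def run(y, x, dy, dx):
        # Length of the foreground run starting at (y, x) in one direction.
        n = 0
        while 0 <= y < h and 0 <= x < w and img[y][x]:
            n += 1
            y += dy
            x += dx
        return n

    widths = {}
    for y in range(h):
        for x in range(w):
            if img[y][x]:
                horiz = run(y, x, 0, 1) + run(y, x, 0, -1) - 1
                vert = run(y, x, 1, 0) + run(y, x, -1, 0) - 1
                widths[(y, x)] = min(horiz, vert)
    return widths


def text_candidates(img, max_cv=0.3):
    """Keep 8-connected components whose stroke width varies little."""
    widths = stroke_width_map(img)
    seen, kept = set(), []
    for start in widths:
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], []
        while stack:
            y, x = stack.pop()
            comp.append((y, x))
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    nb = (y + dy, x + dx)
                    if nb in widths and nb not in seen:
                        seen.add(nb)
                        stack.append(nb)
        vals = [widths[p] for p in comp]
        if pstdev(vals) / mean(vals) <= max_cv:
            kept.append(sorted(comp))
    return kept


# A one-pixel-wide bar (stroke-like) next to a filled triangle (not stroke-like).
image = [
    [1, 0, 1, 0, 0, 0, 0],
    [1, 0, 1, 1, 0, 0, 0],
    [1, 0, 1, 1, 1, 0, 0],
    [1, 0, 1, 1, 1, 1, 0],
    [1, 0, 1, 1, 1, 1, 1],
]
candidates = text_candidates(image)
```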
LINE SEGMENTATION METHOD APPLICABLE TO DOCUMENT IMAGES CONTAINING HANDWRITING AND PRINTED TEXT CHARACTERS OR SKEWED TEXT LINES

Publication number: 20150063699

Abstract: A text line segmentation method for a document image containing printed text and handwriting, or a document image containing skewed lines of printed text. Connected components (CCs) are obtained for the document, and their bounding boxes and centroids are calculated. The CCs are categorized into three categories based on bounding box sizes: small objects, regular text objects, and large objects involving handwriting. The centroids of regular text objects are used in a cluster analysis to find the vertical centers of the N text lines. Then, each CC is classified into one of the N lines based on the vertical distance between its centroid and the vertical centers of text lines, and copied into a corresponding object board. Extra spaces are removed from the object boards to obtain the line segments. The large object involving handwriting will be classified into one of the lines but absent from other lines.

Type: Application

Filed: August 30, 2013

Publication date: March 5, 2015

Applicant: KONICA MINOLTA LABORATORY U.S.A., INC.

Inventor: Chaohong Wu
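
A rough sketch of the line assignment described in the abstract above: regular-sized components vote for the line centers, and every component, including small and large ones, is then attached to the nearest center. The abstract does not name the clustering method; the gap-based 1-D grouping, the height limits, and the function names here are assumptions.

```python
def line_centers(centroid_ys, max_gap=10):
    """Group sorted vertical centroids; a gap above max_gap starts a new line."""
    ys = sorted(centroid_ys)
    if not ys:
        return []
    groups = [[ys[0]]]
    for y in ys[1:]:
        if y - groups[-1][-1] > max_gap:
            groups.append([y])
        else:
            groups[-1].append(y)
    return [sum(g) / len(g) for g in groups]


def assign_to_lines(boxes, min_h=5, max_h=40, max_gap=10):
    """boxes: list of (x0, y0, x1, y1). Returns (centers, boxes per line)."""
    def center_y(b):
        return (b[1] + b[3]) / 2

    # Only regular text objects are used to locate the lines.
    regular = [b for b in boxes if min_h <= b[3] - b[1] <= max_h]
    centers = line_centers([center_y(b) for b in regular], max_gap)
    if not centers:
        return [], []
    lines = [[] for _ in centers]
    for b in boxes:
        nearest = min(range(len(centers)),
                      key=lambda i: abs(center_y(b) - centers[i]))
        lines[nearest].append(b)
    return centers, lines


boxes = [
    (0, 10, 10, 30), (12, 11, 22, 31),   # regular text, first line
    (0, 60, 10, 80), (12, 62, 22, 82),   # regular text, second line
    (5, 18, 7, 20),                      # small object (e.g. a dot)
    (30, 50, 80, 100),                   # large object (e.g. handwriting)
]
centers, lines = assign_to_lines(boxes)
```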
Assisted OCR

Publication number: 20150063698

Abstract: A method including determining a position of each glyph in an image of a text document, identifying word boundaries in the document thereby implying the existence of a first plurality of words, preparing a first array of word lengths based on the first plurality of words, preparing a second array of word lengths based on a second plurality of words of a text file including a certain text, comparing at least part of the first array to at least part of the second array to find a best alignment between the first and second array, deriving a layout of at least part of the certain text as arranged in the image of the text document at least based on the best alignment and the position of at least some of the glyphs in the image. Related apparatus and methods are also described.

Type: Application

Filed: August 28, 2013

Publication date: March 5, 2015

Inventors: Guy Adini, Harel Cain, Oded Rimon
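
To make the word-length alignment in the abstract above concrete, here is a minimal sketch that slides the word lengths measured in the image over the word lengths of the known text and keeps the offset with the smallest total difference. A practical system would likely need an alignment that tolerates merged or split words; the sliding-window cost and all names are assumptions of this example.

```python
def best_alignment(image_lengths, text_lengths):
    """Return (offset, cost) of the best match of image_lengths in text_lengths.

    cost is the sum of absolute word-length differences at that offset.
    Returns None if the text is shorter than the image word sequence.
    """
    n = len(image_lengths)
    best = None
    for offset in range(len(text_lengths) - n + 1):
        window = text_lengths[offset:offset + n]
        cost = sum(abs(a - b) for a, b in zip(image_lengths, window))
        if best is None or cost < best[1]:
            best = (offset, cost)
    return best


text_words = "the quick brown fox jumps over the lazy dog".split()
text_lengths = [len(w) for w in text_words]
# Word lengths estimated from glyph positions in the image: "fox jumps over the".
image_lengths = [3, 5, 4, 3]
offset, cost = best_alignment(image_lengths, text_lengths)
matched_words = text_words[offset:offset + len(image_lengths)]
```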
OPTICAL CHARACTER RECOGNITION BY ITERATIVE RE-SEGMENTATION OF TEXT IMAGES USING HIGH-LEVEL CUES

Publication number: 20150055866

Abstract: Disclosed techniques include receiving an electronic image containing depictions of characters, segmenting at least some of the depictions of characters using a first segmentation technique to produce a first segmented portion, and performing a first character recognition on the first segmented portion to determine a first sequence of characters. The techniques also include determining, based on the performing the first character recognition, that the first sequence of characters does not match the depictions of characters. The techniques further include segmenting at least some of the depictions of characters using a second segmentation technique, based on the determining, to produce a second segmented portion, and performing a second character recognition on at least a portion of the second segmented portion to produce a second sequence of characters. The techniques also include outputting a third sequence of characters based on at least part of the second sequence of characters.

Type: Application

Filed: May 25, 2012

Publication date: February 26, 2015

Inventors: Mark Joseph Cummins, Alessandro Bissacco
Image processing device, method and storage medium for storing and displaying an electronic document

Patent number: 8965125

Abstract: Character code data and vector drawing data are both listed and provided in a re-editable manner. Electronic data is generated in which information obtained by vectorizing character areas in an image and information obtained by recognizing characters in the image are stored in respective storage locations. As for the electronic data generated in this manner, because character code data and vector drawing data generated from the input image are both presented by a display and edit program, a user can immediately utilize both sets of data.

Type: Grant

Filed: September 24, 2013

Date of Patent: February 24, 2015

Assignee: Canon Kabushiki Kaisha

Inventors: Taeko Yamazaki, Tomotoshi Kanatsu, Makoto Enomoto, Kitahiro Kaneda
Method for segmenting text words in document images

Patent number: 8965127

Abstract: A word segmentation method for processing a document image applies clustering analysis to the spacing segments of a line. The spacing segments are generated by thresholding a one-dimensional vertical projection profile of the line. Taking advantage of the bimodal distribution of spacing length distribution of text lines, a k-means clustering algorithm is used, with the number of clusters pre-set to two, to classify the spacing segments as either character spacing or word spacing. Moreover, k-means++ initialization is used to enhance performance of cluster analysis. The clustering result such as cluster centers and compactness is used to prune single-word text line, single table item, etc. The locations of the word spacing segments are then used to segment the line of text into words.

Type: Grant

Filed: March 14, 2013

Date of Patent: February 24, 2015

Assignee: Konica Minolta Laboratory U.S.A., Inc.

Inventors: Chaohong Wu, Wei Ming
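
The two-cluster classification of spacing segments in the abstract above can be sketched with a plain 1-D k-means, k = 2. The abstract mentions k-means++ initialization; this example instead seeds the two centers with the smallest and largest gap, which is an assumption made to keep the sketch deterministic.

```python
def two_means(gaps, iterations=20):
    """Cluster 1-D gap widths into two groups; return (small_center, large_center)."""
    lo, hi = float(min(gaps)), float(max(gaps))
    for _ in range(iterations):
        small = [g for g in gaps if abs(g - lo) <= abs(g - hi)]
        large = [g for g in gaps if abs(g - lo) > abs(g - hi)]
        if not large:
            break  # all gaps look alike, e.g. a single-word line
        new_lo = sum(small) / len(small)
        new_hi = sum(large) / len(large)
        if (new_lo, new_hi) == (lo, hi):
            break
        lo, hi = new_lo, new_hi
    return lo, hi


def word_breaks(gaps):
    """Return indices of gaps classified as word spacing."""
    lo, hi = two_means(gaps)
    return [i for i, g in enumerate(gaps) if abs(g - hi) < abs(g - lo)]


# Gap widths (in pixels) between consecutive ink runs of one text line.
gaps = [2, 3, 2, 9, 2, 3, 10, 2]
breaks = word_breaks(gaps)
```

When every gap has the same width, the two centers coincide and no word break is reported, which loosely mirrors the pruning of single-word lines mentioned in the abstract.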
Sequence transcription with deep neural networks

Patent number: 8965112

Abstract: Systems and methods for sequence transcription with neural networks are provided. More particularly, a neural network can be implemented to map a plurality of training images received by the neural network into a probabilistic model of sequences comprising P(S|X) by maximizing log P(S|X) on the plurality of training images. X represents an input image and S represents an output sequence of characters for the input image. The trained neural network can process a received image containing characters associated with building numbers. The trained neural network can generate a predicted sequence of characters by processing the received image.

Type: Grant

Filed: December 17, 2013

Date of Patent: February 24, 2015

Assignee: Google Inc.

Inventors: Julian Ibarz, Yaroslav Bulatov, Ian Goodfellow
Method for binarizing scanned document images containing gray or light colored text printed with halftone pattern

Patent number: 8947736

Abstract: A method for binarizing scanned document images containing gray or light colored text printed with halftone patterns. The document image is initially binarized and connected image components are extracted from the initial binary image as text characters. Each text character is classified as either a halftone text character or a non-halftone text character based on an analysis of its topology features. The topology features may be the Euler number of the text character; a text character with a Euler number below -2 is classified as halftone text. The gray-scale document image is then divided into halftone text regions containing only halftone text characters and non-halftone text regions. Each region is binarized using its own pixel value statistics. This eliminates the influence of black text on the threshold values for binarizing halftone text. The binary maps of the regions are combined to generate the final binary map.

Type: Grant

Filed: November 15, 2010

Date of Patent: February 3, 2015

Assignee: Konica Minolta Laboratory U.S.A., Inc.

Inventors: Songyang Yu, Wei Ming
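
The Euler number test in the abstract above can be illustrated as follows. The Euler number of a binary shape is the number of connected components minus the number of holes, so a character riddled with halftone gaps scores strongly negative. The flood-fill counting, the connectivity choices (8 for ink, 4 for background), and the default limit of -2 are assumptions of this sketch.

```python
N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def count_regions(grid, value, neighbors):
    """Count connected regions of cells equal to `value`."""
    h, w = len(grid), len(grid[0])
    seen, count = set(), 0
    for y in range(h):
        for x in range(w):
            if grid[y][x] != value or (y, x) in seen:
                continue
            count += 1
            seen.add((y, x))
            stack = [(y, x)]
            while stack:
                cy, cx = stack.pop()
                for dy, dx in neighbors:
                    ny, nx = cy + dy, cx + dx
                    if (0 <= ny < h and 0 <= nx < w
                            and grid[ny][nx] == value and (ny, nx) not in seen):
                        seen.add((ny, nx))
                        stack.append((ny, nx))
    return count


def euler_number(char):
    """Components minus holes for a 0/1 grid holding one character image."""
    w = len(char[0])
    # Pad with background so the outside forms exactly one background region.
    padded = [[0] * (w + 2)] + [[0] + list(r) + [0] for r in char] + [[0] * (w + 2)]
    components = count_regions(padded, 1, N8)
    holes = count_regions(padded, 0, N4) - 1
    return components - holes


def is_halftone(char, limit=-2):
    return euler_number(char) < limit


solid = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
ring = [[1, 1, 1], [1, 0, 1], [1, 1, 1]]
dotted = [
    [1, 1, 1, 1, 1, 1, 1, 1, 1],
    [1, 0, 1, 0, 1, 0, 1, 0, 1],
    [1, 1, 1, 1, 1, 1, 1, 1, 1],
]
```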
Apparatus and method for scanning and decoding information in an identified location in a document

Patent number: 8947745

Abstract: An imaging scanner identifies first and second locations in a first and second captured image of a document, analyzes each character in the identified locations, and produces a first and second string, each including a character and a confidence value. The device determines that a first measurement of the confidence values in each of the first and second string is beyond a range of a first threshold. The device compares the confidence value for each character in the first string with a corresponding confidence value in the second string, selects a character from one of the first or second string with a higher confidence value, and produces a combined string including the selected characters and the confidence value associated with each selected character.

Type: Grant

Filed: July 3, 2013

Date of Patent: February 3, 2015

Assignee: Symbol Technologies, Inc.

Inventor: Ming-Xi Zhao
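
The per-character selection described in the abstract above reduces to a short merge. This sketch assumes both decoded strings have the same length and represents each character as a hypothetical `(char, confidence)` pair; the threshold check that triggers the merge is left out.

```python
def combine(first, second):
    """Pick, position by position, the character with the higher confidence.

    first, second: equal-length lists of (char, confidence) pairs.
    Ties keep the character from `first`.
    """
    return [a if a[1] >= b[1] else b for a, b in zip(first, second)]


scan_1 = [("A", 0.90), ("8", 0.40), ("C", 0.80)]
scan_2 = [("A", 0.70), ("B", 0.95), ("C", 0.60)]
merged = combine(scan_1, scan_2)
merged_text = "".join(ch for ch, _ in merged)
```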
Automated document processing system

Patent number: 8948511

Abstract: An automated document processing system is configured to normalize zones obtained from a document, and to extract articles from the normalized zones. In one configuration, the system receives at least one zone from the document, and applies at least one zone-breaking factor, thereby creating normalized sub-zones within which text lines are consistent with the at least one zone-breaking factor. The normalized sub-zones may be evaluated to obtain a reading order. Adjacent sub-zones are joined if text similarity exceeds a threshold value. Weakly joined sub-zones are separated where indicated by a topic vectors analysis of the weakly joined sub-zones.

Type: Grant

Filed: October 19, 2005

Date of Patent: February 3, 2015

Assignee: Hewlett-Packard Development Company, L.P.

Inventors: Daniel Ortega, Sherif Yacoub, Jose Abad Peiro, Paolo Faraboschi
Feature Sensitive Captioning of Media Content

Publication number: 20150023597

Abstract: There are provided methods and systems for use in performing feature sensitive captioning of media content. In one implementation, such a method includes detecting an aesthetically determinative feature of a media content unit selected by a user, and determining a captioning aesthetic for a caption of the media content unit based at least in part on the aesthetically determinative feature. The captioning aesthetic may include a background aesthetic and a text aesthetic. The captioning aesthetic may be utilized by a feature sensitive captioning application to produce a feature sensitive caption for the media content unit.

Type: Application

Filed: July 18, 2013

Publication date: January 22, 2015

Applicant: Disney Enterprises, Inc.

Inventors: Jinsong Gu, Amy Ilyse Rosenthal, Gillian Salit, Scott Gerlach, Andrea Kuhnertova
Image processing apparatus, image processing method, and computer-readable recording device

Patent number: 8938122

Abstract: An image processing apparatus includes a small area divider that divides, on the basis of edge information of an image, the image into multiple small areas each including multiple pixels; an attribute probability estimator that estimates attribute probability for each of the small areas, which is probability that the small area is attributed to a specific area to be detected; an adjacent-small-area connection strength calculator that calculates connection strength that quantitatively indicates a degree to which small areas adjacent to each other among the multiple small areas are attributed to the same area that is the specific area or a non-specific area; and a specific area detector that detects the specific area on the basis of the attribute probability and the connection strength.

Type: Grant

Filed: May 4, 2012

Date of Patent: January 20, 2015

Assignee: Olympus Corporation

Inventors: Yamato Kanda, Makoto Kitamura, Masashi Hirota, Takashi Kono, Takehiro Matsuda
Foreground analysis based on tracking information

Patent number: 8934714

Abstract: Techniques for performing foreground analysis are provided. The techniques include identifying a region of interest in a video scene, detecting a static foreground object in the region of interest, and determining whether the static foreground object is abandoned or removed, wherein said determining comprises performing a foreground analysis based on tracking information and pruning one or more false alarms using one or more track statistics.

Type: Grant

Filed: May 6, 2013

Date of Patent: January 13, 2015

Assignee: International Business Machines Corporation

Inventors: Rogerio S. Feris, Arun Hampapur, Frederik C. Kjeldsen, Hao-Wei Liu
ARTICLE ESTIMATING SYSTEM, ARTICLE ESTIMATING METHOD, AND ARTICLE ESTIMATING PROGRAM

Publication number: 20150003729

Abstract: A server 2 includes an extraction unit 21, an analysis unit 22, a first estimating unit 24, an information acquisition unit 25 and a second estimating unit 26. The extraction unit 21 extracts an image area for each article. The analysis unit 22 analyzes the image area to acquire analysis information. The first estimating unit 24 narrows down candidates estimated to correspond to the article in the image area based on the analysis information. When the candidates were able to be narrowed down, the information acquisition unit 25 acquires additional information of a reference article. The second estimating unit 26 attempts a narrowing process based on the additional information of the reference article in addition to the analysis information, for the image area including a spine, which is an image area in which candidates were unable to be narrowed down.

Type: Application

Filed: April 8, 2013

Publication date: January 1, 2015

Applicant: RAKUTEN, INC.

Inventor: Yasuyuki Hayashi
System and method for determining co-occurrence groups of images

Patent number: 8923629

Abstract: A system and a method are disclosed that determine images with co-occurrence groups of individuals from an image collection. A value of a similarity metric is computed for each pair of images of the image collection, the value of the similarity metric being computed based on a comparison of the number of individuals in common between the images of the pair and the total number of individuals identified in both images of the pair. The collection of images is clustered based on the computed values of the similarity metric. At least one co-occurrence group is determined based on the results of the clustering, where a co-occurrence group is determined as a cluster of images that have a similar combination of individuals.

Type: Grant

Filed: April 27, 2011

Date of Patent: December 30, 2014

Assignee: Hewlett-Packard Development Company, L.P.

Inventor: Yuli Gao
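
One plausible reading of the similarity metric in the abstract above is the Jaccard index: individuals shared by two images divided by all distinct individuals appearing in either image. That reading, and the names used below, are assumptions of this sketch.

```python
def similarity(people_a, people_b):
    """Jaccard similarity between the sets of individuals in two images."""
    a, b = set(people_a), set(people_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_matrix(images):
    """images: list of sets of individual identifiers, one set per image."""
    return [[similarity(a, b) for b in images] for a in images]


images = [{"ann", "bob"}, {"bob", "cat"}, {"ann", "bob"}]
matrix = similarity_matrix(images)
```

Images 0 and 2 show the same pair of people and therefore score 1.0, which is the kind of signal a clustering step could use to form a co-occurrence group.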
Image processing apparatus and image processing program

Patent number: 8923635

Abstract: An image processing apparatus includes a first path information calculating unit, a second path information calculating unit, and a path selecting unit. The first path information calculating unit calculates first path information which is information representing a first path for separating areas from an image. The second path information calculating unit calculates second path information representing a second path for separating the areas from the image, the second path being the reverse of the first path. The path selecting unit selects one of the first path information calculated by the first path information calculating unit and the second path information calculated by the second path information calculating unit.

Type: Grant

Filed: September 9, 2010

Date of Patent: December 30, 2014

Assignee: Fuji Xerox Co., Ltd.

Inventor: Eiichi Tanaka
Information output device and information output method

Patent number: 8923618

Abstract: An expression, for which complementary information can be outputted, is extracted from a document obtained by character recognition for an image. Complementary information related to the extracted expression is outputted when a character or a symbol adjacent to the beginning or the end of the extracted expression is not a predetermined character or symbol. Output of complementary information related to the extracted expression is skipped when the character or symbol adjacent to the beginning or the end of the extracted expression is the predetermined character or symbol. A problem that complementary information unrelated to an original text is outputted is prevented even when a false character recognition occurs.

Type: Grant

Filed: September 14, 2012

Date of Patent: December 30, 2014

Assignee: Sharp Kabushiki Kaisha

Inventor: Takeshi Kutsumi
Detecting text using stroke width based text detection

Patent number: 8917935

Abstract: Detecting text using stroke width based text detection. As a part of the text detection, a representation of an image is generated that includes pixels that are associated with the stroke widths of components of the image. Connected components of the image are identified by filtering out portions of the pixels using metrics related to stroke width. Text is detected in the image based on the identified connected components.

Type: Grant

Filed: May 19, 2008

Date of Patent: December 23, 2014

Assignee: Microsoft Corporation

Inventors: Boris Epshtein, Eyal Ofek, Yonatan Wexler
METHOD AND SYSTEM FOR RECOGNIZING INFORMATION

Publication number: 20140355883

Abstract: Embodiments of the present application relate to a method for recognizing information, a system for recognizing information, and a computer program product for recognizing information. A method for recognizing information is provided. The method includes locating a card zone for each frame within a card image frame sequence comprising a plurality of frames, locating an information zone within each card zone, dividing each information zone into at least one character zone, de-blurring a character zone corresponding to a same region across all the frames in the card image frame sequence, and recognizing character string information based on the de-blurred character zone.

Type: Application

Filed: May 30, 2014

Publication date: December 4, 2014

Inventors: Yang Li, Guo Chen
SEGREGATION OF HANDWRITTEN INFORMATION FROM TYPOGRAPHIC INFORMATION ON A DOCUMENT

Publication number: 20140348422

Abstract: A system for segregating handwritten information from typographic information on a document may include a memory, an interface, and a processor. The memory stores an electronic document image of a document where the electronic document image includes pixels and each pixel has a characteristic. The processor may receive, via the interface, the electronic document image and may identify first, second and third most frequently occurring characteristics of the pixels of the electronic document image. The pixels having the first most frequently occurring characteristic represent a background of the document. The processor may determine the typographic information of the document as represented by pixels having the second most frequently occurring characteristic. The processor may determine the handwritten information of the document as represented by pixels having the third most frequently occurring characteristic.

Type: Application

Filed: August 12, 2014

Publication date: November 27, 2014

Inventors: Paul M. Ives, Peter E. Clark, Michael V. Gentry
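
The frequency ranking in the abstract above maps naturally onto a histogram of pixel characteristics. In this sketch the characteristic is a hypothetical color label per pixel; the role names and the `"other"` fallback are assumptions of the example.

```python
from collections import Counter


def segregate(pixels):
    """Label each pixel by the frequency rank of its characteristic.

    Most frequent -> background, second -> typographic, third -> handwritten.
    """
    ranked = [value for value, _ in Counter(pixels).most_common(3)]
    roles = dict(zip(ranked, ("background", "typographic", "handwritten")))
    return [roles.get(p, "other") for p in pixels]


pixels = ["white"] * 10 + ["black"] * 5 + ["blue"] * 2 + ["red"]
labels = segregate(pixels)
```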
Systems and methods for automatically processing electronic documents

Patent number: 8897563

Abstract: In a document analysis system that receives and processes jobs from a plurality of users, in which each job may contain multiple electronic documents, to extract data from the electronic documents, a method of automatically pre-processing each received electronic document using a plurality of image transformation algorithms to improve subsequent data extraction from said document is provided. The method includes: electronically partitioning each received electronic document page into pieces; automatically processing each piece of the received electronic document page using each of a plurality of image pre-processing algorithms to produce a plurality of image variations of each piece; and analyzing the outputs of subsequent processing and data extraction, on each of the image variations of the pieces to determine which output is best, from the plurality of outputs for each piece.

Type: Grant

Filed: October 28, 2013

Date of Patent: November 25, 2014

Assignee: Gruntworx, LLC

Inventors: Girish Welling, Nirupam Sarkar, Tushar Mahata, Vartika Singh, Depankar Neogi, Steven K. Ladd
Extracting documents from a natural scene image

Patent number: 8897565

Abstract: The present technology proposes techniques for extracting forms and other types of documents from images taken with a mobile client device. By calculating and making adjustments along a document's detected borders, an input image can be transformed such that the document within the image may be properly aligned and background clutter completely removed. The resulting text fields of the extracted document are thus upright, aligned and locatable at predictable points.

Type: Grant

Filed: June 29, 2012

Date of Patent: November 25, 2014

Assignee: Google Inc.

Inventors: Leon Palm, Hartwig Adam
Information processing apparatus and recording medium for increasing precision in detecting blank space

Patent number: 8891112

Abstract: An information processing apparatus for performing image processing for document data created by a document creation application to generate print data of each page and sending the generated print data to an image forming apparatus, the information processing apparatus including: a control section for specifying, based on document data, a region where drawing object data included in the document data exists as a drawing object region, and detecting blank space in each print page based on the specified drawing object region.

Type: Grant

Filed: March 1, 2011

Date of Patent: November 18, 2014

Assignee: Konica Minolta Business Technologies, Inc.

Inventor: Koji Sato
System and method for digitizing documents and encoding information relating to same for display by handheld computing devices

Patent number: 8891145

Abstract: A system and method are provided for processing scanned documents by digitizing the scanned documents, converting the digitized documents to a JPEG2000 file, encoding content information corresponding to the digitized documents using spatial capabilities of JPEG2000's Region of Interest feature, and creating an image file having the digitized documents and the region of interest information for forwarding to a computing device, such as a handheld computing device, for display. The system and method are especially useful in Digital Mail applications which entail digitizing and delivering mail documents to recipients.

Type: Grant

Filed: May 16, 2007

Date of Patent: November 18, 2014

Assignee: Xerox Corporation

Inventor: William K. Stumbo
Image overlaying device and image overlaying program

Patent number: 8878874

Abstract: An image overlaying device includes an image inputting unit, a memory and a controller. The controller is configured to obtain template data which define a plurality of layout areas and to extract each of the plurality of layout areas from the obtained template data. The controller determines positions of the extracted plurality of layout areas. The controller stores, in the memory, layout order information corresponding to an order of the layout area and further stores, in the memory, image order information corresponding to an order of the image data of the documents. The controller determines corresponding image data of documents corresponding to each of the plurality of layout areas and generates overlaid image data by laying out, in each of the plurality of layout areas, the determined corresponding image data of the documents.

Type: Grant

Filed: September 30, 2010

Date of Patent: November 4, 2014

Assignee: Brother Kogyo Kabushiki Kaisha

Inventor: Keigo Yano
Comparing extracted card data with user data

Patent number: 8879783

Abstract: Extracting card data comprises receiving, by one or more computing devices, a digital image of a card; performing an image recognition process on the digital representation of the card; identifying an image in the digital representation of the card; comparing the identified image to an image database comprising a plurality of images and determining that the identified image matches a stored image in the image database; determining a card type associated with the stored image and associating the card type with the card based on the determination that the identified image matches the stored image; and performing a particular optical character recognition algorithm on the digital representation of the card, the particular optical character recognition algorithm being based on the determined card type. Another example uses an issuer identification number to improve data extraction. Another example compares extracted data with user data to improve accuracy.

Type: Grant

Filed: November 12, 2013

Date of Patent: November 4, 2014

Assignee: Google Inc.

Inventors: Xiaohang Wang, Farhan Shamsi, Sanjiv Kumar, Henry Allan Rowley, Marcus Quintana Mitchell
Combined-media scene tracking for audio-video summarization

Patent number: 8872979

Abstract: Techniques are presented for analyzing audio-video segments, usually from multiple sources. A combined similarity measure is determined from text similarities and video similarities. The text and video similarities measure similarity between audio-video scenes for text and video, respectively. The combined similarity measure is then used to determine similar scenes in the audio-video segments. When the audio-video segments are from multiple audio-video sources, the similar scenes are common scenes in the audio-video segments. Similarities may be converted to or measured by distance. Distance matrices may be determined by using the similarity matrices. The text and video distance matrices are normalized before the combined similarity matrix is determined. Clustering is performed using distance values determined from the combined similarity matrix.

Type: Grant

Filed: May 21, 2002

Date of Patent: October 28, 2014

Assignee: Avaya Inc.

Inventors: Amit Bagga, Jianying Hu, Jialin Zhong
METHOD AND SYSTEM USING TWO PARALLEL OPTICAL CHARACTER RECOGNITION PROCESSES

Publication number: 20140314319

Abstract: A method and a system for providing a text-based representation of a portion of a working area to a user are provided. The method includes acquiring an image of the entire working area and performing a fast OCR process on at least a region of interest of the image corresponding to the portion of the working area, thereby rapidly obtaining an initial machine-encoded representation of the portion of the working area and immediately presenting it to the user as the text-based representation. In parallel with the fast OCR process, a high-precision OCR process is performed on at least the region of interest of the image, thereby obtaining a high-precision machine-encoded representation of the portion of the working area. Upon completing the high-precision OCR process, the high-precision machine-encoded representation of the portion of the working area is presented to the user as the text-based representation, in replacement of the initial machine-encoded representation.

Type: Application

Filed: April 18, 2014

Publication date: October 23, 2014

Applicant: TECHNOLOGIES HUMANWARE INC.

Inventors: Pierre HAMEL, Alain BÉLANGER, Éric BEAUCHAMP
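
The two-pass flow in the abstract above might be sketched as follows, using threads from Python's standard library. `fast_ocr` and `precise_ocr` are hypothetical stand-ins that return canned strings; a real system would call actual OCR engines, and nothing here reflects the applicant's implementation.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def fast_ocr(region):
    """Hypothetical quick recognizer: returns at once, with recognition errors."""
    return "he11o w0rld"

def precise_ocr(region):
    """Hypothetical high-precision recognizer: slower, but accurate."""
    time.sleep(0.05)  # simulate the extra processing time
    return "hello world"

def read_region(region, present):
    """Run both recognizers in parallel; show the fast text, then replace it."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        fast = pool.submit(fast_ocr, region)
        precise = pool.submit(precise_ocr, region)
        present(fast.result())     # initial representation, shown right away
        present(precise.result())  # replaces it once the slow pass finishes

shown = []
read_region(b"raw pixels", shown.append)
```

Both jobs are submitted before either result is awaited, so the slow pass is already running while the user reads the first result.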
Method and system for a text data entry from an electronic document

Patent number: 8867838

Abstract: A method for processing an electronic document is provided. The electronic document includes a plurality of text fields and a text label associated with each of the plurality of text fields. The method includes the step of extracting the plurality of text fields from the electronic document. The method includes the step of grouping the plurality of extracted text fields to generate a plurality of groups. The method includes the step of labeling the plurality of groups based on a first pre-defined criteria to generate a plurality of labeled groups. The method includes the step of distributing the plurality of labeled groups in a plurality of queues based on a second pre-defined criteria. The method includes the step of transmitting the plurality of labeled groups from the plurality of queues to one or more crowdworkers based on a third pre-defined criteria.

Type: Grant

Filed: September 13, 2012

Date of Patent: October 21, 2014

Assignee: Xerox Corporation

Inventors: Chithralekha Balamurugan, Shourya Roy, Jacki O'Neill, Sujit Gujar
Detecting separator lines in a web page

Patent number: 8867837

Abstract: A system and method of detecting separator lines in a web page may include determining coordinates of visible web elements on a web page, generating an edge image of the web page based on the coordinates of the web elements, filtering edges belonging to non-separator line elements within the edge image, detecting horizontal lines within the edge image, detecting vertical lines within the edge image, and filtering short lines within the edge image. A system for detecting separator lines in a web page may include a memory device, and a processor communicatively coupled to the memory, in which the processor determines coordinates of visible web elements on a web page, generates an edge image of the web page based on the coordinates of the web elements, filters edges belonging to non-separator line elements within the edge image, detects horizontal lines within the edge image, detects vertical lines within the edge image, and filters short lines within the edge image.

Type: Grant

Filed: July 30, 2010

Date of Patent: October 21, 2014

Assignee: Hewlett-Packard Development Company, L.P.

Inventors: Hui-Man Hou, Li-Wei Zheng, Jian-Ming Jin, Jian Fan, Suk Hwan Lim
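
As a rough illustration of the pipeline in the abstract above, the sketch below draws element borders into a binary edge image, scans for horizontal and vertical runs, and drops short ones. The box format (inclusive corner coordinates), the run-length scan, and the `min_len` threshold are assumptions made for the example; the step that filters edges of non-separator elements is omitted.

```python
def edge_image(width, height, boxes):
    """Draw the border of each (x0, y0, x1, y1) box, corners inclusive, into a 0/1 grid."""
    img = [[0] * width for _ in range(height)]
    for x0, y0, x1, y1 in boxes:
        for x in range(x0, x1 + 1):
            img[y0][x] = img[y1][x] = 1
        for y in range(y0, y1 + 1):
            img[y][x0] = img[y][x1] = 1
    return img

def horizontal_lines(img, min_len):
    """Return (row, start, end) for every run of edge pixels at least min_len long."""
    lines = []
    for y, row in enumerate(img):
        x = 0
        while x < len(row):
            if row[x]:
                start = x
                while x < len(row) and row[x]:
                    x += 1
                if x - start >= min_len:  # short runs are filtered out
                    lines.append((y, start, x - 1))
            else:
                x += 1
    return lines

def vertical_lines(img, min_len):
    """Same scan on the transposed image; results are (column, start, end)."""
    transposed = [list(col) for col in zip(*img)]
    return horizontal_lines(transposed, min_len)

# A one-pixel-high rule spanning x = 1..8 and a small 2x2 element.
img = edge_image(10, 6, [(1, 1, 8, 1), (2, 3, 3, 4)])
h_lines = horizontal_lines(img, min_len=5)
v_lines = vertical_lines(img, min_len=5)
```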
METHOD OF MANAGING IMAGE AND ELECTRONIC DEVICE THEREOF

Publication number: 20140307966

Abstract: A system processes an image in an electronic device, by determining whether a text character is included in an image and extracting the determined text character from the image. The extracted text character is stored in association with the image.

Type: Application

Filed: April 7, 2014

Publication date: October 16, 2014

Applicant: Samsung Electronics Co., Ltd.

Inventors: Bo-Kun CHOI, Han-Jib KIM, Pil-Joo YOON, Soo-Ji HWANG
STRAIGHTENING OUT DISTORTED PERSPECTIVE ON IMAGES

Publication number: 20140307967

Abstract: Methods for correcting distortions in an image including text, or an image of a page that includes text, are disclosed. The methods include identifying reliable and substantially straight lines from elements in the image. Vanishing points are determined from the lines. Parameters associated with a rectangle are determined. A coordinate conversion is performed.

Type: Application

Filed: June 26, 2014

Publication date: October 16, 2014

Applicant: ABBYY Development LLC

Inventors: Olga Kacher, Vladimir Rybkin
Method, device, and system for performing color enhancement on whiteboard color image

Patent number: 8860806

Abstract: Disclosed are a device, a method, and a system for enhancing color. The device comprises a unit used to extract a foreground portion from a whiteboard color image to serve as whiteboard contents; a unit used to stretch R, G, and B channel values of each of a plurality of foreground pixels forming the whiteboard contents; a unit used to adjust color tone of each of the foreground pixels; a unit used to adjust a ratio of color saturation degree to color intensity of each of the foreground pixels so as to cause the ratio to approach a ratio expectation value; and a unit used to increase the color saturation degree and the color intensity of each of the foreground pixels so as to cause the two to approach a color saturation degree expectation value and a color intensity expectation value Id, respectively.

Type: Grant

Filed: August 15, 2011

Date of Patent: October 14, 2014

Assignee: Ricoh Company, Ltd.

Inventors: Wenbo Zhang, Yan Li
Image processing apparatus and image processing method for extracting a line segment

Patent number: 8854691

Abstract: An image processing apparatus extracts a line segment included in an image, and includes a density gradient direction determining section that determines a direction, in which density changes, of each processing unit composed of a predetermined number of pixels of an image, and a line segment extracting section that regards a couple of processing units whose density gradient directions are opposite each other as a processing unit pair and extracts a processing unit group including a plurality of processing unit pairs allocated in a row in a direction perpendicular to the density gradient directions as a line segment.

Type: Grant

Filed: February 2, 2012

Date of Patent: October 7, 2014

Assignee: Murata Machinery Ltd.

Inventor: Nariyasu Kan
Image rectification using an orientation vector field

Patent number: 8855419

Abstract: This invention is a method for rectifying an input digital image including warped textual information. The method includes analyzing the input digital image to determine local orientations for a plurality of local image regions and determining an orientation vector field by interpolating between the determined local orientations for a lattice of positions. A set of streamlines are determined responsive to the orientation vector field. A global deformation function is formed by interpolating between the streamlines and is used to form a rectified image.

Type: Grant

Filed: November 20, 2012

Date of Patent: October 7, 2014

Assignee: Eastman Kodak Company

Inventors: Hao Wu, Kevin Edward Spaulding
Systems and methods for block recomposition for compound image compression

Patent number: 8855418

Abstract: A new approach is proposed that contemplates systems and methods to support block-based compression of a compound image by skipping “don't care” blocks in the layers of the image while neither introducing significant overhead nor requiring changes to the compression method used. The block-based compression approach first segments a compound image into multiple layers and then recomposes a new set of image layers, possibly with new dimensions, from only the non-“don't care” blocks in the layers of the original image. The approach may later decompress the compressed image layers and restore the image by copying the decompressed blocks to their respective positions in the original image.

Type: Grant

Filed: August 14, 2013

Date of Patent: October 7, 2014

Assignee: Citrix Systems, Inc.

Inventor: Bernd Oliver Christiansen
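
The recomposition idea in the abstract above can be sketched as packing only the meaningful blocks into a compact list while remembering where each came from. Here a layer is a grid of blocks with `None` marking a "don't care" block; this representation and the function names are assumptions for illustration, and the actual compression of the packed blocks is left out.

```python
def recompose(layer):
    """Collect the non-"don't care" blocks of a layer and their grid positions."""
    packed, positions = [], []
    for r, row in enumerate(layer):
        for c, block in enumerate(row):
            if block is not None:  # None marks a "don't care" block
                packed.append(block)
                positions.append((r, c))
    return packed, positions

def restore(packed, positions, rows, cols):
    """Copy each block back to its original position in an empty layer."""
    layer = [[None] * cols for _ in range(rows)]
    for block, (r, c) in zip(packed, positions):
        layer[r][c] = block
    return layer

# A 2x3 text layer in which only two blocks carry content.
layer = [["A", None, None],
         [None, None, "B"]]
packed, positions = recompose(layer)
restored = restore(packed, positions, rows=2, cols=3)
```

Because the packed list is an ordinary, smaller image layer, any block-based compressor can process it unchanged, which is the point the abstract makes about not modifying the compression method.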
Data recognition in content

Patent number: 8849041

Abstract: The disclosure relates to recognizing data such as items or entities in content. In some aspects, content may be received and feature information, such as face recognition data and voice recognition data may be generated. Scene segmentation may also be performed on the content, grouping the various shots of the video content into one or more shot collections, such as scenes. For example, a decision lattice representative of possible scene segmentations may be determined and the most probable path through the decision lattice may be selected as the scene segmentation. Upon generating the feature information and performing the scene segmentation, one or more items or entities that are present in the scene may be identified.

Type: Grant

Filed: June 4, 2012

Date of Patent: September 30, 2014

Assignee: Comcast Cable Communications, LLC

Inventors: Jan Neumann, Evelyne Tzoukermann, Amit Bagga, Oliver Jojic, Bageshree Shevade, David Houghton, Corey Farrell
Document analysis systems and methods

Patent number: 8849031

Abstract: A method embodiment herein begins by capturing a source image. The source image is segmented into first planes. The first planes can each comprise a mask plane and foreground plane combination. The binary images in the first planes are structurally analyzed to identify different regions of text, tables, handwriting, line art, equations, etc., using a document model that has information of size, shape, and spatial arrangement of possible regions. Then, the method extracts (crops out) these regions from the foreground plane to create second mask/foreground plane pairs. Thus, the method creates “second” planes from the first planes, so that a separate second plane is created for each of the regions. Next, tags are associated with each of the second planes (to create tagged mask/foreground plane pairs) and the second planes and associated tags are combined into a mixed raster content (MRC) document.

Type: Grant

Filed: October 20, 2005

Date of Patent: September 30, 2014

Assignee: Xerox Corporation

Inventor: John C. Handley
Image processing apparatus, image reading apparatus, image forming apparatus, image processing method, and recording medium

Patent number: 8848240

Abstract: An object area containing a text or a graphic is extracted from a document image containing the text or the graphic. Then, on the basis of the extracted object area and stored size information, a cropping area is determined that surrounds the object area with given margins. Further, setting of margins is received. Then, on the basis of the object area and the received setting of the margins, the cropping area is determined. Then, the cropping area determined on the basis of the object area and the size information or alternatively the cropping area determined on the basis of the object area and the setting of the margins is cropped from the document image.

Type: Grant

Filed: June 21, 2011

Date of Patent: September 30, 2014

Assignee: Sharp Kabushiki Kaisha

Inventors: Atsuhisa Morimoto, Yohsuke Konishi, Hitoshi Hirohata, Akihito Yoshida
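
A minimal sketch of deriving a cropping area from an object area and margins, as the abstract above describes. Rectangles are `(left, top, right, bottom)` in pixels; clamping the result to the page bounds is an added assumption, and the function name is hypothetical.

```python
def cropping_area(object_area, margins, page_width, page_height):
    """Expand the object area by per-side margins, clamped to the page."""
    left, top, right, bottom = object_area
    m_left, m_top, m_right, m_bottom = margins
    return (max(0, left - m_left),
            max(0, top - m_top),
            min(page_width, right + m_right),
            min(page_height, bottom + m_bottom))

# A 20-pixel margin on every side; the right edge is limited by the page width.
area = cropping_area((100, 50, 410, 300), (20, 20, 20, 20), 420, 600)
```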
Efficient blending methods for AR applications

Patent number: 8842909

Abstract: The use of optical character recognition (OCR) in mobile devices is becoming prevalent with the increasing use of mobile devices. One important application for OCR in mobile devices is recognizing and translating the text to a language understandable by the user. Techniques are provided for replacing symbols in an image, while reducing the artifacts as a result of re-rendering of the background image.

Type: Grant

Filed: October 27, 2011

Date of Patent: September 23, 2014

Assignee: Qualcomm Incorporated

Inventors: Hyung-Il Koo, Young-Ki Baik, Beom-Su Kim
Method and apparatus for encoding and decoding coding unit of picture boundary

Patent number: 8842925

Abstract: A method and apparatus for encoding an image is provided. An image coding unit, including a region that deviates from a boundary of a current picture, is divided to obtain a coding unit having a smaller size than the size of the image coding unit, and encoding is performed only in a region that does not deviate from the boundary of the current picture. A method and apparatus for decoding an image encoded by the method and apparatus for encoding an image is also provided.

Type: Grant

Filed: November 11, 2013

Date of Patent: September 23, 2014

Assignee: Samsung Electronics Co., Ltd.

Inventor: Min-su Cheon
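
One way to picture the boundary handling in the abstract above is a recursive split: a coding unit that crosses the picture boundary is divided into smaller units until each piece lies either fully inside or fully outside the picture, and only the inside pieces are encoded. The quadtree split, the `min_size` parameter, and the requirement that picture dimensions be multiples of `min_size` are assumptions of this sketch, not details taken from the patent.

```python
def units_to_encode(x, y, size, pic_w, pic_h, min_size=8):
    """Return (x, y, size) for each sub-unit that lies fully inside the picture."""
    if x >= pic_w or y >= pic_h:
        return []                    # entirely outside: nothing to encode
    if x + size <= pic_w and y + size <= pic_h:
        return [(x, y, size)]        # entirely inside: encode as one unit
    if size <= min_size:
        raise ValueError("picture dimensions must be multiples of min_size")
    half = size // 2
    units = []
    for dy in (0, half):             # visit the four quadrants row by row
        for dx in (0, half):
            units += units_to_encode(x + dx, y + dy, half, pic_w, pic_h, min_size)
    return units

# A 64x64 unit at x = 32 in a 48x32 picture: only a 16-pixel-wide strip is inside.
units = units_to_encode(32, 0, 64, pic_w=48, pic_h=32)
```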
Finding text in natural scenes

Patent number: 8837830

Abstract: As set forth herein, systems and methods facilitate providing an efficient edge-detection and closed-contour based approach for finding text in natural scenes such as photographic images, digital, and/or electronic images, and the like. Edge information (e.g., edges of structures or objects in the images) is obtained via an edge detection technique. Edges from text characters form closed contours even in the presence of reasonable levels of noise. Closed contour linking and candidate text line formation are two additional features of the described approach. A candidate text line classifier is applied to further screen out false-positive text identifications. Candidate text regions for placement of text in the natural scene of the electronic image are highlighted and presented to a user.

Type: Grant

Filed: June 12, 2012

Date of Patent: September 16, 2014

Assignee: Xerox Corporation

Inventors: Raja Bala, Zhigang Fan, Hengzhou Ding, Jan P. Allebach, Charles A. Bouman
Image conversion of text-based images

Patent number: 8830241

Abstract: Conversion of text-based images to vector graphics (VG) is disclosed. The text-based images may include images of equations, custom typefaces, or other types of text that may not be included in a font selection of an optical character recognition (OCR) device or an application stored on a viewing device. A textual image may be converted from a raster graphics (RG) image to a VG image, which may enable resizing and alignment of the VG image with body text. In some aspects, the server may determine a body size of a reference character in the VG image. The server may determine a baseline of the VG image that may be used to align the image with the body text.

Type: Grant

Filed: November 30, 2009

Date of Patent: September 9, 2014

Assignee: Amazon Technologies, Inc.

Inventor: Martin Gorner
Image processing device for accurately identifying region in image without increase in memory requirement

Patent number: 8830529

Abstract: An image forming device performs functions including: dividing the image into a plurality of band images each including at least one sub-region; creating region data used to identify each sub-region included in the band image; updating the region data such that the region data identifies both first and second uniform sub-regions as a single uniform region of the image when the first uniform sub-region abuts the second uniform sub-region, the first and second uniform sub-regions being included in the first and second band image and classified as the uniform sub-region, respectively; and updating the region data such that the region data identifies both first and second nonuniform sub-regions as a single nonuniform region of the image when the first nonuniform sub-region abuts the second nonuniform sub-region, the first and second nonuniform sub-regions being included in the first and second band image and classified as the nonuniform sub-region, respectively.

Type: Grant

Filed: July 30, 2012

Date of Patent: September 9, 2014

Assignee: Brother Kogyo Kabushiki Kaisha

Inventors: Ryohei Ozawa, Masaki Kondo
Identification of regions of a document

Patent number: 8832549

Abstract: Some embodiments provide a method for analyzing a document that includes a number of primitive elements. The method identifies boundaries between sets of primitive elements and identifies regions bounded by the boundaries. The method uses the identified regions to define structural elements for the document. The method defines a structured document based on the primitive elements and the structural elements.

Type: Grant

Filed: June 7, 2009

Date of Patent: September 9, 2014

Assignee: Apple Inc.

Inventors: Philip Andrew Mansfield, Michael Robert Levy
