Patents by Inventor Dan S. Bloomberg

Dan S. Bloomberg has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8910037
    Abstract: A signature for a page of text is generated. The signature serves as an identifier of the text page. Positions of words in a text page are determined. Positions of multiple second words in the text page are determined relative to the position of a first word in the text page. A signature value is generated that describes the second word positions relative to the first word position. The signature value is stored. Additional signatures for the text page can be generated, each signature describing positions of other words in the text page relative to a word in the text page for which the signature is being generated. The signatures can be used to compare the text page to another text page and generate a measure of similarity that describes the result of the comparison.
    Type: Grant
    Filed: February 28, 2012
    Date of Patent: December 9, 2014
    Assignee: Google Inc.
    Inventors: Nemanja L. Spasojevic, Guillaume Poncin, Dan S. Bloomberg
  • Patent number: 8151186
    Abstract: A signature for a page of text is generated. The signature serves as an identifier of the text page. Positions of words in a text page are determined. Positions of multiple second words in the text page are determined relative to the position of a first word in the text page. A signature value is generated that describes the second word positions relative to the first word position. The signature value is stored. Additional signatures for the text page can be generated, each signature describing positions of other words in the text page relative to a word in the text page for which the signature is being generated. The signatures can be used to compare the text page to another text page and generate a measure of similarity that describes the result of the comparison.
    Type: Grant
    Filed: September 8, 2011
    Date of Patent: April 3, 2012
    Assignee: Google Inc.
    Inventors: Nemanja L. Spasojevic, Guillaume Poncin, Dan S. Bloomberg
  • Patent number: 8151187
    Abstract: A signature for a page of text is generated. The signature serves as an identifier of the text page. Positions of words in a text page are determined. Positions of multiple second words in the text page are determined relative to the position of a first word in the text page. A signature value is generated that describes the second word positions relative to the first word position. The signature value is stored. Additional signatures for the text page can be generated, each signature describing positions of other words in the text page relative to a word in the text page for which the signature is being generated. The signatures can be used to compare the text page to another text page and generate a measure of similarity that describes the result of the comparison.
    Type: Grant
    Filed: September 26, 2011
    Date of Patent: April 3, 2012
    Assignee: Google Inc.
    Inventors: Nemanja L. Spasojevic, Guillaume Poncin, Dan S. Bloomberg
  • Publication number: 20110289395
    Abstract: The invention converts a document originating in a page-image format into a form suitable for an arbitrarily sized display, by reformatting or “re-flowing” of the document to fit an arbitrarily sized display device. A two-stage system analyzes, or “deconstructs,” page image layout. The deconstruction includes both physical (geometric) and logical (functional) segmentation of page images. The segmented image elements may include blocks, lines, and/or words of text, and other segmented image elements. The segmented image elements are synthesized and converted into an intermediate data structure. The intermediate data structure is then distilled or converted or redisplayed into any number of standard print formats.
    Type: Application
    Filed: June 3, 2011
    Publication date: November 24, 2011
    Applicant: Xerox Corporation
    Inventors: Thomas M. Breuel, Henry S. Baird, William C. Janssen, Ashok C. Popat, Dan S. Bloomberg
  • Patent number: 7489830
    Abstract: A method and system for storing and generating anti-aliased text and lineart data from compressed document image files, using an MRC model that represents the image as an ordered set of mask/image pairs at resolutions appropriate to the content of each layer. The method and system provide the ability to generate anti-aliased text data to improve appearance at both high and low resolution, and to avoid baseline jitter of compressed tokens.
    Type: Grant
    Filed: July 20, 2007
    Date of Patent: February 10, 2009
    Assignee: Xerox Corporation
    Inventors: Dan S. Bloomberg, Luc Vincent
  • Patent number: 7469059
    Abstract: Systems and methods for reorganizing raw image data captured by a camera for improved image processing are disclosed. The method generally includes separately compressing each color component, e.g., RGB, of raw grayscale image data to generate a reorganized grayscale data output file by a first processor, and performing color decoding to generate a color image output by a second processor. The raw grayscale image data may be that of an image of a target captured by a digital camera in an image capturing system. The second processor is separate from the first processor that records the raw grayscale image data of the image captured and may be physically external to the camera. The raw grayscale image data may be a 2N×2N array and the reorganized grayscale data output includes 4 N×N color-specific arrays where N is a factor of 8.
    Type: Grant
    Filed: December 17, 2004
    Date of Patent: December 23, 2008
    Assignee: Google Inc.
    Inventors: Francois-Marie Lefevere, Marin Saric, Dan S. Bloomberg
  • Patent number: 7266250
    Abstract: A method and system for storing and generating anti-aliased text and lineart data from compressed document image files, using an MRC model that represents the image as an ordered set of mask/image pairs at resolutions appropriate to the content of each layer. The method and system provide the ability to generate anti-aliased text data to improve appearance at both high and low resolution, and to avoid baseline jitter of compressed tokens.
    Type: Grant
    Filed: February 15, 2006
    Date of Patent: September 4, 2007
    Assignee: Xerox Corporation
    Inventors: Dan S. Bloomberg, Luc Vincent
  • Patent number: 6983084
    Abstract: A method of aligning a first page image and a second page image is disclosed. The first page image and the second page image are deskewed. Then, the first page image and the second page image are vertically aligned. In particular, a first vertical data set comprising a plurality of first values, each first value based on a horizontal scanline of the first page image, is generated. Moreover, a second vertical data set comprising a plurality of second values, each second value based on a horizontal scanline of the second page image, is generated. One of the first and second vertical data sets is dilated. Then, the first and second vertical data sets are cross-correlated to generate cross-correlation data. A maximum value of the cross-correlation data is determined, where the maximum value indicates vertical alignment between the first and second page images. Finally, the first and second page images are horizontally aligned.
    Type: Grant
    Filed: August 28, 2002
    Date of Patent: January 3, 2006
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Hui Chao, Dan S. Bloomberg
  • Patent number: 6952803
    Abstract: A system and method for editing and transcribing using a structured freeform editor is provided. The method implemented in the system includes interpreting structure of freeform graphic elements and selectively editing the structure and/or selectively transcribing scribble elements to an editable format.
    Type: Grant
    Filed: December 29, 1998
    Date of Patent: October 4, 2005
    Assignee: Xerox Corporation
    Inventors: Dan S. Bloomberg, Thomas P. Moran
  • Publication number: 20040205568
    Abstract: The invention converts a document originating in a page-image format into a form suitable for an arbitrarily sized display, by reformatting or “re-flowing” of the document to fit an arbitrarily sized display device.
    Type: Application
    Filed: August 27, 2002
    Publication date: October 14, 2004
    Inventors: Thomas M. Breuel, Henry S. Baird, William C. Janssen, Ashok C. Popat, Dan S. Bloomberg
  • Patent number: 6738518
    Abstract: In a text recognition system that uses a stochastic finite state network to model a document image layout, the computational efficiency of text line decoding is improved. In a typical implementation, the dynamic programming operation that accomplishes decoding uses actual scores computed between two-dimensional (2D) bitmapped character template images and the (2D) bitmapped observed image. Scoring measures the degree of a match between a character template and the observed image. Computation of these actual scores is replaced with the simpler computation of column-based (i.e., one-dimensional) heuristic scores. Because the column-based heuristic scores can be shown to be a true upper bound on actual template-image scores, the heuristic scores are accurate enough to use in place of actual scoring during text line decoding.
    Type: Grant
    Filed: May 12, 2000
    Date of Patent: May 18, 2004
    Assignee: Xerox Corporation
    Inventors: Thomas P. Minka, Dan S. Bloomberg, Ashok C. Popat
  • Patent number: 6678415
    Abstract: A text recognition system represents the decoded message of a document image as a path through an image network. A method for integrating a language model into the network selectively expands the network to accommodate the language model only for certain ones of the paths in the network, effectively managing the memory storage requirements and computational complexities of integrating the language model efficiently into the network. The language model generates probability distributions indicating the probability of a certain character occurring in a string, given one or more previous characters in the string. Selectively expanding the image network is achieved by initially using upper bounds on the language model probabilities on the branches of an unexpanded image network. A best path search operation is then performed to determine an estimated best path through the image network using these upper bound scores.
    Type: Grant
    Filed: May 12, 2000
    Date of Patent: January 13, 2004
    Assignee: Xerox Corporation
    Inventors: Ashok C. Popat, Dan S. Bloomberg, Daniel H. Greene
  • Publication number: 20030215136
    Abstract: A method of document segmentation. Specifically, one embodiment of the present invention discloses a method of document segmentation that performs a plurality of projection profiles of pixel intensities on a document containing a plurality of text lines over a range of angles. A plurality of slope values for a plurality of discrete distances perpendicular to said range of angles is calculated for the plurality of projection profiles. A set of maximum absolute slope values is sorted out from the plurality of slope values. Text lines of first and second type are identified by setting a threshold slope value. Absolute slope values greater than the threshold slope value indicate the plurality of text lines of said first type. Absolute slope values less than the threshold slope value indicate the plurality of text lines of a second type.
    Type: Application
    Filed: May 17, 2002
    Publication date: November 20, 2003
    Inventors: Hui Chao, Dan S. Bloomberg
  • Publication number: 20030215157
    Abstract: A method of aligning a first page image and a second page image is disclosed. The first page image and the second page image are deskewed. Then, the first page image and the second page image are vertically aligned. In particular, a first vertical data set comprising a plurality of first values, each first value based on a horizontal scanline of the first page image, is generated. Moreover, a second vertical data set comprising a plurality of second values, each second value based on a horizontal scanline of the second page image, is generated. One of the first and second vertical data sets is dilated. Then, the first and second vertical data sets are cross-correlated to generate cross-correlation data. A maximum value of the cross-correlation data is determined, where the maximum value indicates vertical alignment between the first and second page images. Finally, the first and second page images are horizontally aligned.
    Type: Application
    Filed: August 28, 2002
    Publication date: November 20, 2003
    Inventors: Hui Chao, Dan S. Bloomberg
  • Patent number: 6641051
    Abstract: A system for printing glyph frames around known obstructions. All frames in an area are determined to be obstructed or unobstructed, based on their location with respect to other printed areas. The unobstructed locations can be numbered and glyph data printed within. In the alternative, the good locations can be numbered modulo some number much smaller than the number of available locations to provide redundancy. The unobstructed locations can be stored in either the sync lines or in the data area of other locations known to be unobstructed. Also, the frame itself can be identified as obstructed or unobstructed to provide more redundancy.
    Type: Grant
    Filed: September 24, 1999
    Date of Patent: November 4, 2003
    Assignee: Xerox Corporation
    Inventors: Daniel H. Illowsky, Dan S. Bloomberg, Robert E. Weltman
  • Patent number: 6594393
    Abstract: In a text recognition system, the computational efficiency of a text line image decoding operation is improved by utilizing the characteristic of a graph known as the cut set. The branches of the data structure that represents the image are initially labeled with estimated scores. When estimated scores are used, the decoding operation must perform iteratively on a text line before producing the best path through the data structure. After each iteration, nodes in the best path are re-scored with actual scores. The decoding operation incorporates an operating mode called skip mode.
    Type: Grant
    Filed: May 12, 2000
    Date of Patent: July 15, 2003
    Inventors: Thomas P. Minka, Dan S. Bloomberg, Ashok C. Popat
  • Patent number: 6439465
    Abstract: Individual glyph frames for providing simple data blocks that may be read individually. A frame is usually on the order of ⅙ of an inch in size and can be used as a pointer to anywhere on the page or can contain a small piece of data such as the page or form number.
    Type: Grant
    Filed: September 24, 1999
    Date of Patent: August 27, 2002
    Assignee: Xerox Corporation
    Inventor: Dan S. Bloomberg
  • Patent number: 6427920
    Abstract: A specific bit pattern (or set of bit patterns) that has particular frequency and length requirements is used to remove low-frequency, visually observable structure from rendered embedded digital data. In use, the pattern is replicated enough times to be able to XOR with the data before rendering as glyphs, or the like. On decoding (reading), the read data is XOR'd again to recover the original data.
    Type: Grant
    Filed: September 24, 1999
    Date of Patent: August 6, 2002
    Assignee: Xerox Corporation
    Inventors: Dan S. Bloomberg, Robert E. Weltman
  • Patent number: 6076738
    Abstract: This invention provides self-clocking glyph shape codes for encoding digital data in the shapes of glyphs that are suitable for printing on hardcopy recording media. Advantageously, the glyphs are selected so that they tend not to degrade into each other when they are degraded and/or distorted as a result, for example, of being photocopied, transmitted via facsimile, and/or scanned-in to an electronic document processing system. Moreover, for at least some applications, the glyphs desirably are composed of printed pixel patterns containing nearly the same number of ON pixels and nearly the same number of OFF pixels, such that the code that is rendered by printing such glyphs on substantially uniformly spaced centers appears to have a generally uniform texture. In the case of codes printed at higher spatial densities, this texture is likely to be perceived as a generally uniform gray tone.
    Type: Grant
    Filed: May 10, 1994
    Date of Patent: June 20, 2000
    Assignee: Xerox Corporation
    Inventors: Dan S. Bloomberg, David L. Hecht, Robert F. Tow, L. Prasadam Flores
  • Patent number: RE38758
    Abstract: This invention provides self-clocking glyph shape codes for encoding digital data in the shapes of glyphs that are suitable for printing on hardcopy recording media. Advantageously, the glyphs are selected so that they tend not to degrade into each other when they are degraded and/or distorted as a result, for example, of being photocopied, transmitted via facsimile, and/or scanned-in to an electronic document processing system. Moreover, for at least some applications, the glyphs desirably are composed of printed pixel patterns containing nearly the same number of ON pixels and nearly the same number of OFF pixels, such that the code that is rendered by printing such glyphs on substantially uniformly spaced centers appears to have a generally uniform texture. In the case of codes printed at higher spatial densities, this texture is likely to be perceived as a generally uniform gray tone.
    Type: Grant
    Filed: June 18, 2001
    Date of Patent: July 19, 2005
    Assignee: Xerox Corporation
    Inventors: Dan S. Bloomberg, David L. Hecht, Robert F. Tow, L. Prasadam Flores
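
The XOR step described in the abstract of patent 6427920 above can be illustrated with a short Python sketch. It shows only the general idea stated in the abstract: a pattern is repeated to cover the data, XOR'd in before rendering, and XOR'd in again after reading to recover the original. The function name and the pattern bytes are hypothetical; the abstract does not give the actual bit patterns or their frequency and length requirements, and this is not the patented implementation.

```python
from itertools import cycle


def xor_with_pattern(data: bytes, pattern: bytes) -> bytes:
    """XOR each data byte with the pattern, repeating the pattern as needed."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    return bytes(d ^ p for d, p in zip(data, cycle(pattern)))


# Hypothetical pattern chosen only for illustration (assumption, not from the patent).
PATTERN = bytes([0b10110100, 0b01101001, 0b11010010])

# A run of identical bytes stands in for data that would otherwise
# produce a repetitive, visibly structured glyph field.
original = bytes(8)

scrambled = xor_with_pattern(original, PATTERN)   # applied before rendering
recovered = xor_with_pattern(scrambled, PATTERN)  # applied again after reading
```

Because XOR is its own inverse, applying the same repeated pattern a second time returns the original bytes, which is why encoder and decoder can share one routine.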