Patents by Inventor Jianjun Dou

Jianjun Dou has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8750616
    Abstract: In an extracting step, the extracting portion obtains a linked component composed of a plurality of mutually linking pixels from a character string region composed of a plurality of characters, and extracts section elements from the character string region, the section elements each being surrounded by a circumscribing figure circumscribing to the linked component. In the first altering step, the first altering portion combines section elements at least having a mutually overlapping part among the extracted section elements so as to prepare a new section element. In the first selecting step, the first selecting portion determines a reference size in advance and selects section elements having a size greater than the reference size, from among the section elements altered in the first altering step.
    Type: Grant
    Filed: December 21, 2007
    Date of Patent: June 10, 2014
    Assignee: Sharp Kabushiki Kaisha
    Inventors: Bo Wu, Jianjun Dou, Ning Le, Yadong Wu, Jing Jia
  • Patent number: 8295600
    Abstract: An image document processing device extracts a character sequence image having M number of characters in an image document, divides the image into individual character images, extracts features of the individual character images, and based on the features, selects N (N is an integer more than 1) character images in the order of degree of matching from a font-feature dictionary for storing features of all character images according to fonts, and generates an M×N index matrix for the extracted character sequence. In searching, the device searches an index-information storage section with respect to each search character included in a search keyword in an input search expression, and extracts an image document including an index matrix including the search keyword. This provides an image document processing device and an image document processing method each allowing indexing not requiring user's operation and each allowing highly precise searching without OCR recognition.
    Type: Grant
    Filed: December 7, 2007
    Date of Patent: October 23, 2012
    Assignee: Sharp Kabushiki Kaisha
    Inventors: Bo Wu, Jianjun Dou, Ning Le, Yadong Wu, Jing Jia
  • Patent number: 8290269
    Abstract: A headline-region initial processing section clips a headline-region image in an image document, divides the image into individual character images, and extracts features of the individual character images. Based on the features, a candidate-character-sequence generating section selects N (N is an integer more than 1) character images as candidate characters in the order of degree of matching from a font-feature dictionary for storing features of individual character images, and generates M×N index matrix where M is the number of characters in an extracted character sequence. Based on the index matrix, a document-name generating section generates a meaningful document name according to the image document. An image-document-DB management section manages accumulated image documents using the document name.
    Type: Grant
    Filed: December 10, 2007
    Date of Patent: October 16, 2012
    Assignee: Sharp Kabushiki Kaisha
    Inventors: Bo Wu, Jianjun Dou, Ning Le, Yadong Wu, Jing Jia
  • Patent number: 8208765
    Abstract: An image of a character string composed of M pieces of characters is clipped from a document image, and the image is divided into separate characters. Image features of each character image are extracted. Based on the image features, N (N>1, integer) pieces of character images in descending order of degree of similarity are selected as candidate characters, from a character image feature dictionary which stores the image features of character image in units of character, and a first index matrix of M×N cells is prepared. A candidate character string composed of a plurality of candidate characters constituting a first column of the first index matrix, is subjected to a lexical analysis according to a language model, and whereby a second index matrix having a character string which makes sense is prepared. In the language model, statistics are taken and then, the lexical analysis is performed.
    Type: Grant
    Filed: January 10, 2008
    Date of Patent: June 26, 2012
    Assignee: Sharp Kabushiki Kaisha
    Inventors: Bo Wu, Jianjun Dou, Ning Le, Yadong Wu, Jing Jia
  • Patent number: 8160402
    Abstract: An image of a character string composed of M pieces of characters is clipped from a document image, and the image is divided character by character, and image features of each character image are extracted. On the basis of the image features, N (N>1, integer) pieces of character images in descending order of degree of similarity are selected as candidate characters from a character image feature dictionary which stores the image features of character image in units of character, and the first index matrix of M×N cells is prepared. A candidate character string composed of a plurality of candidate characters constituting the first column of the first index matrix, is subjected to a lexical analysis according to a predetermined language model, whereby a second index matrix adjusted into a character string which makes sense is prepared to be utilized for searching.
    Type: Grant
    Filed: January 10, 2008
    Date of Patent: April 17, 2012
    Assignee: Sharp Kabushiki Kaisha
    Inventors: Bo Wu, Jianjun Dou, Ning Le, Yadong Wu, Jing Jia
  • Publication number: 20090028445
    Abstract: An image of a character string composed of M pieces of characters is clipped from a document image, and the image is divided character by character, and image features of each character image are extracted. On the basis of the image features, N (N>1, integer) pieces of character images in descending order of degree of similarity are selected as candidate characters from a character image feature dictionary which stores the image features of character image in units of character, and the first index matrix of M×N cells is prepared. A candidate character string composed of a plurality of candidate characters constituting the first column of the first index matrix, is subjected to a lexical analysis according to a predetermined language model, whereby a second index matrix adjusted into a character string which makes sense is prepared to be utilized for searching.
    Type: Application
    Filed: January 10, 2008
    Publication date: January 29, 2009
    Inventors: Bo Wu, Jianjun Dou, Ning Le, Yadong Wu, Jing Jia
  • Publication number: 20090028435
    Abstract: In an extracting step, the extracting portion obtains a linked component composed of a plurality of mutually linking pixels from a character string region composed of a plurality of characters, and extracts section elements from the character string region, the section elements each being surrounded by a circumscribing figure circumscribing to the linked component. In the first altering step, the first altering portion combines section elements at least having a mutually overlapping part among the extracted section elements so as to prepare a new section element. In the first selecting step, the first selecting portion determines a reference size in advance and selects section elements having a size greater than the reference size, from among the section elements altered in the first altering step.
    Type: Application
    Filed: December 21, 2007
    Publication date: January 29, 2009
    Inventors: Bo Wu, Jianjun Dou, Ning Le, Yadong Wu, Jing Jia
  • Publication number: 20090028446
    Abstract: An image of a character string composed of M pieces of characters is clipped from a document image, and the image is divided into separate characters. Image features of each character image are extracted. Based on the image features, N (N>1, integer) pieces of character images in descending order of degree of similarity are selected as candidate characters, from a character image feature dictionary which stores the image features of character image in units of character, and a first index matrix of M×N cells is prepared. A candidate character string composed of a plurality of candidate characters constituting a first column of the first index matrix, is subjected to a lexical analysis according to a language model, and whereby a second index matrix having a character string which makes sense is prepared. In the language model, statistics are taken and then, the lexical analysis is performed.
    Type: Application
    Filed: January 10, 2008
    Publication date: January 29, 2009
    Inventors: Bo Wu, Jianjun Dou, Ning Le, Yadong Wu, Jing Jia
  • Publication number: 20090030882
    Abstract: There is provided a document image processing apparatus which can reduce troubles to find a desired heading from a document image. A heading region extracting portion searches an index information DB and extracts a heading region containing a search keyword. An order setting portion automatically sets in line with a predetermined rule an order of the heading regions extracted by the heading region extracting portion. On a displaying portion is displayed a document image on which the heading regions extracted by the heading region extracting portion are highlighted in accordance with the order set by the order setting portion. A display order of search results may be set by determining importance of the extracted heading regions based on the number of the search keyword and features of character images in the heading regions.
    Type: Application
    Filed: January 10, 2008
    Publication date: January 29, 2009
    Inventors: Bo Wu, Jianjun Dou, Ning Le, Yadong Wu, Jing Jia
  • Publication number: 20080181505
    Abstract: A headline-region initial processing section clips a headline-region image in an image document, divides the image into individual character images, and extracts features of the individual character images. Based on the features, a candidate-character-sequence generating section selects N (N is an integer more than 1) character images as candidate characters in the order of degree of matching from a font-feature dictionary for storing features of individual character images, and generates M×N index matrix where M is the number of characters in an extracted character sequence. Based on the index matrix, a document-name generating section generates a meaningful document name according to the image document. An image-document-DB management section manages accumulated image documents using the document name.
    Type: Application
    Filed: December 10, 2007
    Publication date: July 31, 2008
    Inventors: Bo Wu, Jianjun Dou, Ning Le, Yadong Wu, Jing Jia
  • Publication number: 20080170810
    Abstract: An image document processing device extracts a character sequence image having M number of characters in an image document, divides the image into individual character images, extracts features of the individual character images, and based on the features, selects N (N is an integer more than 1) character images in the order of degree of matching from a font-feature dictionary for storing features of all character images according to fonts, and generates an M×N index matrix for the extracted character sequence. In searching, the device searches an index-information storage section with respect to each search character included in a search keyword in an input search expression, and extracts an image document including an index matrix including the search keyword. This provides an image document processing device and an image document processing method each allowing indexing not requiring user's operation and each allowing highly precise searching without OCR recognition.
    Type: Application
    Filed: December 7, 2007
    Publication date: July 17, 2008
    Inventors: Bo Wu, Jianjun Dou, Ning Le, Yadong Wu, Jing Jia
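
Several of the abstracts above describe an M×N index matrix: for each of the M character images in a string, the N most similar candidate characters from a feature dictionary are kept, and a keyword search is run against those candidates rather than against a single OCR result. The following Python sketch illustrates that general idea only. It is not the patented implementation: the abstracts do not specify the features or the similarity measure, so the 2-D feature vectors, the negative-Euclidean-distance similarity, and the names `similarity`, `build_index_matrix`, and `matches` are all assumptions made for illustration.

```python
from math import sqrt


def similarity(a, b):
    """Assumed similarity measure: negative Euclidean distance (larger = more similar)."""
    return -sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_index_matrix(char_features, dictionary, n):
    """Return an M x N matrix of candidate characters.

    Row i holds the n dictionary characters most similar to
    character image i, ordered from best match to worst.
    """
    matrix = []
    for feat in char_features:
        ranked = sorted(
            dictionary,
            key=lambda ch: similarity(feat, dictionary[ch]),
            reverse=True,
        )
        matrix.append(ranked[:n])
    return matrix


def matches(matrix, keyword):
    """True if each keyword character is among the candidates of consecutive rows."""
    m, k = len(matrix), len(keyword)
    return any(
        all(keyword[j] in matrix[start + j] for j in range(k))
        for start in range(m - k + 1)
    )


# Hypothetical toy dictionary: character -> made-up 2-D feature vector.
dictionary = {
    "c": (0.0, 0.0), "e": (0.1, 0.0),
    "a": (1.0, 0.0), "o": (1.1, 0.1),
    "t": (0.0, 1.0), "l": (0.1, 1.1),
}

# Made-up features for three character images (M = 3) of a noisy "cat".
char_features = [(0.04, 0.0), (1.06, 0.05), (0.02, 1.0)]

matrix = build_index_matrix(char_features, dictionary, n=2)
print(matrix)                  # first column spells "cot", not "cat"
print(matches(matrix, "cat"))  # still found via the second candidate for row 2
print(matches(matrix, "dog"))
```

In this toy data the best candidate for the middle character is "o", so the first column of the matrix reads "cot"; the keyword "cat" is still matched because "a" survives as the second candidate. That is the kind of tolerance to recognition errors the abstracts attribute to searching an index matrix, and it is also why some of the listed inventions apply a language model to the first column afterwards.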