Patents by Inventor Hong Jiang Zhang

Hong Jiang Zhang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20040264744
    Abstract: Improved methods and apparatuses are provided for use in face detection. The methods and apparatuses significantly reduce the number of candidate windows within a digital image that need to be processed using more complex and/or time consuming face detection algorithms. The improved methods and apparatuses include a skin color filter and an adaptive non-face skipping scheme.
    Type: Application
    Filed: June 30, 2003
    Publication date: December 30, 2004
    Applicant: MICROSOFT CORPORATION
    Inventors: Lei Zhang, Mingjing Li, Hong-Jiang Zhang
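The skin-color pre-filter described in the entry above can be illustrated with a minimal sketch. The patent's actual color model and non-face skipping scheme are not reproduced here; the fixed YCbCr skin range, the 40% skin-pixel threshold, and the function names below are assumptions chosen for illustration, with NumPy as the only dependency.

```python
import numpy as np

def skin_color_mask(rgb_image):
    """Boolean mask of likely skin pixels for an (H, W, 3) uint8 RGB image.

    Uses a commonly cited fixed YCbCr range for skin tones; the
    patented filter may use a different color model entirely.
    """
    rgb = rgb_image.astype(np.float32)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    # RGB -> Cb/Cr (BT.601); luma is not needed for the range test.
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return (cb >= 77) & (cb <= 127) & (cr >= 133) & (cr <= 173)

def keep_candidate_window(mask, x, y, w, h, min_skin_ratio=0.4):
    """Forward a candidate window to the full (expensive) face detector
    only if enough of its pixels look like skin."""
    window = mask[y:y + h, x:x + w]
    return window.size > 0 and window.mean() >= min_skin_ratio
```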
  • Publication number: 20040264780
    Abstract: Systems and methods for annotating a face in a digital image are described. In one aspect, a probability model is trained by mapping one or more sets of sample facial features to corresponding names of individuals. A face is then detected from an input data set of at least one digital image. Facial features are then automatically extracted from the detected face. A similarity measure is then modeled as a posterior probability that the facial features match a particular set of features identified in the probability model. The similarity measure is statistically learned. A name is then inferred as a function of the similarity measure. The face is then annotated with the name.
    Type: Application
    Filed: June 30, 2003
    Publication date: December 30, 2004
    Inventors: Lei Zhang, Longbin Chen, Mingjing Li, Hong-Jiang Zhang
  • Publication number: 20040264745
    Abstract: A face model having outer and inner facial features is matched to those of first and second models. Each facial feature of the first and second models is represented by a plurality of points that are adjusted for each matching outer and inner facial feature of the first and second models using 1) the corresponding epipolar constraint for the inner features of the first and second models and 2) the local grey-level structure of both outer and inner features of the first and second models. The matching and the adjusting are repeated, for each of the first and second models, until the points for each of the outer and inner facial features on the respective first and second models that are found to match those of the face model have a relative offset between them of no greater than a predetermined convergence tolerance. The inner facial features can include a pair of eyes, a nose and a mouth. The outer facial features can include a pair of eyebrows and a silhouette of the jaw, chin, and cheeks.
    Type: Application
    Filed: June 30, 2003
    Publication date: December 30, 2004
    Applicant: MICROSOFT CORPORATION
    Inventors: Lie Gu, Li Ziqing, Hong-Jiang Zhang
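The epipolar constraint that couples the inner features of the two models (entry above) reduces to a point-to-line distance check. The sketch below assumes a known fundamental matrix F between the two views and illustrates only that single check, not the full iterative matching procedure.

```python
import numpy as np

def epipolar_residual(p1, p2, F):
    """Distance from point p2 (second view) to the epipolar line that
    point p1 (first view) induces via the fundamental matrix F.

    p1, p2 are (x, y) pixel coordinates; a small residual means the
    pair of feature points satisfies the epipolar constraint.
    """
    x1 = np.array([p1[0], p1[1], 1.0])
    x2 = np.array([p2[0], p2[1], 1.0])
    line = F @ x1                                  # a*x + b*y + c = 0
    return abs(x2 @ line) / np.hypot(line[0], line[1])
```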
  • Publication number: 20040243541
    Abstract: An implementation of a technology, described herein, for relevance-feedback, content-based image retrieval facilitates accurate and efficient retrieval by minimizing the number of iterations of user feedback regarding the semantic relevance of exemplary images while maximizing the resulting relevance of each iteration. One technique for accomplishing this is to use a Bayesian classifier to treat positive and negative feedback examples with different strategies. In addition, query refinement techniques are applied to pinpoint the users' intended queries with respect to their feedback. These techniques further enhance the accuracy and usability of relevance feedback. This abstract itself is not intended to limit the scope of this patent. The scope of the present invention is pointed out in the appended claims.
    Type: Application
    Filed: April 26, 2004
    Publication date: December 2, 2004
    Inventors: Hong-Jiang Zhang, Zhong Su, Xingquan Zhu
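How a Bayesian classifier might "treat positive and negative feedback examples with different strategies" (entry above) can be sketched as follows: positive examples are modeled generatively with a diagonal Gaussian, while each negative example only contributes a local penalty. This is one plausible reading for illustration, not the patented formulation; the feature layout, the kernel width sigma, and the function name are assumptions.

```python
import numpy as np

def score_images(features, positives, negatives, sigma=1.0):
    """Rank database feature vectors after one round of feedback.

    features: (N, D) array of database image features.
    positives/negatives: (P, D) and (Q, D) arrays of feedback examples.
    Positives define a diagonal-Gaussian relevance model; negatives
    only penalize images that lie close to them.
    """
    mean = positives.mean(axis=0)
    var = positives.var(axis=0) + 1e-6
    log_pos = -0.5 * np.sum((features - mean) ** 2 / var, axis=1)

    penalty = np.zeros(len(features))
    for neg in negatives:
        d2 = np.sum((features - neg) ** 2, axis=1)
        penalty += np.exp(-d2 / (2.0 * sigma ** 2))   # closeness to a negative

    return log_pos - penalty                          # higher = more relevant
```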
  • Publication number: 20040240708
    Abstract: Improvements are provided to effectively assess a user's face and head pose such that a computer or like device can track the user's attention toward one or more display devices. The region of the display or graphical user interface that the user is turned toward can then be automatically selected without requiring the user to provide further inputs. A frontal face detector is applied to detect the user's frontal face, and key facial points such as the left/right eye centers, left/right mouth corners, nose tip, etc., are then detected by component detectors. The system then tracks the user's head with an image tracker and determines the yaw, tilt and roll angles and other pose information of the user's head through a coarse-to-fine process according to the key facial points and/or confidence outputs from a pose estimator.
    Type: Application
    Filed: May 30, 2003
    Publication date: December 2, 2004
    Applicant: MICROSOFT CORPORATION
    Inventors: Yuxiao Hu, Lei Zhang, Mingjing Li, Hong-Jiang Zhang
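As a rough illustration of how key facial points constrain pose (entry above), the sketch below derives the in-plane roll angle from the eye line and a coarse yaw cue from the nose offset. This is only a geometric toy; the patented system also uses mouth corners, an image tracker, and a learned coarse-to-fine pose estimator.

```python
import math

def rough_pose_from_keypoints(left_eye, right_eye, nose_tip):
    """Estimate roll (degrees) and a unitless yaw cue from three
    (x, y) facial key points."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    roll = math.degrees(math.atan2(dy, dx))        # in-plane head rotation

    eye_mid_x = (left_eye[0] + right_eye[0]) / 2.0
    eye_span = math.hypot(dx, dy) or 1.0
    # A nose tip displaced toward one eye suggests the head is turned.
    yaw_cue = (nose_tip[0] - eye_mid_x) / eye_span
    return roll, yaw_cue

# Example: the nose sits left of the eye midpoint -> head turned to its right.
print(rough_pose_from_keypoints((100, 120), (160, 122), (118, 150)))
```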
  • Publication number: 20040228542
    Abstract: In one aspect, the present disclosure describes a process for automatic artifact compensation in a digital representation of an image. The process includes detecting, by a processor, regions corresponding to facial images within the digital representation; locating, by the processor, red-eye regions within the detected regions; and automatically modifying, by the processor, the located red-eye regions to provide a modified image.
    Type: Application
    Filed: May 13, 2003
    Publication date: November 18, 2004
    Applicant: MICROSOFT CORPORATION
    Inventors: Lei Zhang, Yanfeng Sun, Mingjing Li, Hong-Jiang Zhang
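A minimal sketch of the "locate red-eye regions inside detected faces, then modify them" flow described above. The redness ratio, the threshold, and the channel-averaging correction are illustrative assumptions rather than the patented compensation method.

```python
import numpy as np

def desaturate_red_eye(rgb_image, face_boxes, redness_thresh=1.8):
    """Attenuate strongly red pixels inside detected face regions.

    rgb_image: (H, W, 3) uint8 array; face_boxes: iterable of
    (x, y, w, h) rectangles from a face detector.
    """
    out = rgb_image.astype(np.float32)
    for x, y, w, h in face_boxes:
        roi = out[y:y + h, x:x + w]
        r, g, b = roi[..., 0], roi[..., 1], roi[..., 2]
        redness = r / (0.5 * (g + b) + 1e-3)
        mask = redness > redness_thresh
        # Replace the red channel with the green/blue average -> darker pupil.
        roi[..., 0][mask] = 0.5 * (g[mask] + b[mask])
    return out.astype(np.uint8)
```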
  • Publication number: 20040220925
    Abstract: Systems and methods for a media agent are described. In one aspect, user access of a media content source is detected. Responsive to this detection, a piece of media content and associated text is collected from the media content source. Semantic text features are extracted from the associated text and the piece of media content. The semantic text features are indexed into a media database.
    Type: Application
    Filed: May 24, 2004
    Publication date: November 4, 2004
    Applicant: Microsoft Corporation
    Inventors: Wen-Yin Liu, Hong-Jiang Zhang, Zheng Chen
  • Publication number: 20040216585
    Abstract: Systems and methods for extracting a music snippet from a music stream are described. In one aspect, one or more music sentences are extracted from the music stream. The one or more sentences are extracted as a function of peaks and valleys of acoustic energy across sequential music stream portions. The music snippet is selected based on the one or more music sentences.
    Type: Application
    Filed: June 3, 2004
    Publication date: November 4, 2004
    Applicant: Microsoft Corporation
    Inventors: Lie Lu, Hong-Jiang Zhang, Po Yuan
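The peak/valley analysis mentioned in the entry above can be approximated by treating local minima of a smoothed short-time energy curve as candidate sentence boundaries. The frame length, smoothing window, and valley test are placeholder choices; the patented snippet selection built on top of these boundaries is not shown.

```python
import numpy as np

def sentence_boundaries(samples, sr, frame_ms=20, smooth=25):
    """Return candidate music-sentence boundary times (seconds) for a
    mono float waveform sampled at rate sr."""
    frame_len = int(sr * frame_ms / 1000)
    n_frames = len(samples) // frame_len
    frames = samples[:n_frames * frame_len].reshape(n_frames, frame_len)
    energy = (frames ** 2).mean(axis=1)            # short-time energy
    energy = np.convolve(energy, np.ones(smooth) / smooth, mode="same")

    valleys = [i for i in range(1, n_frames - 1)
               if energy[i] < energy[i - 1] and energy[i] <= energy[i + 1]]
    return [v * frame_len / sr for v in valleys]
```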
  • Publication number: 20040215663
    Abstract: Systems and methods for a media agent are described. In one aspect, it is determined whether a user wants to save or download a media object from a media source. Semantic information is then extracted from the media source. A filename is then suggested to the user for the media object based on the semantic information.
    Type: Application
    Filed: May 24, 2004
    Publication date: October 28, 2004
    Applicant: Microsoft Corporation
    Inventors: Wen-Yin Liu, Hong-Jiang Zhang, Zheng Chen
  • Publication number: 20040197071
    Abstract: An algorithm identifies a salient video frame from a video sequence for use as a video thumbnail. The identification of the video thumbnail is based on a frame goodness measure. The algorithm calculates a color histogram for each frame and then calculates the entropy and standard deviation of that histogram. The frame goodness measure is a weighted combination of the entropy and the standard deviation. The video frame having the highest frame goodness measure in a video sequence is selected as the video thumbnail for that sequence.
    Type: Application
    Filed: April 1, 2003
    Publication date: October 7, 2004
    Applicant: MICROSOFT CORPORATION
    Inventors: Dong Zhang, Yijin Wang, Hong-Jiang Zhang
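The frame goodness measure above is concrete enough to sketch directly: a per-frame color histogram, its entropy and standard deviation, and a weighted combination of the two. The bin count and the equal weights below are placeholder assumptions, not values taken from the patent.

```python
import numpy as np

def frame_goodness(rgb_frame, w_entropy=0.5, w_std=0.5):
    """Score an (H, W, 3) frame by the entropy and standard deviation
    of its (concatenated per-channel) color histogram."""
    hist = [np.histogram(rgb_frame[..., c], bins=16, range=(0, 256))[0]
            for c in range(3)]
    p = np.concatenate(hist).astype(np.float64)
    p /= p.sum() + 1e-12                           # normalize to a distribution
    entropy = -np.sum(p[p > 0] * np.log2(p[p > 0]))
    return w_entropy * entropy + w_std * p.std()

def pick_thumbnail(frames):
    """Return the frame with the highest goodness score."""
    return max(frames, key=frame_goodness)
```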
  • Publication number: 20040170392
    Abstract: A “music video parser” automatically detects and segments music videos in a combined audio-video media stream. Automatic detection and segmentation is achieved by integrating shot boundary detection, video text detection and audio analysis to automatically detect temporal boundaries of each music video in the media stream. In one embodiment, song identification information, such as, for example, a song name, artist name, album name, etc., is automatically extracted from the media stream using video optical character recognition (OCR). This information is then used in alternate embodiments for cataloging, indexing and selecting particular music videos, and in maintaining statistics such as the times particular music videos were played, and the number of times each music video was played.
    Type: Application
    Filed: February 19, 2003
    Publication date: September 2, 2004
    Inventors: Lie Lu, Yan-Feng Sun, Mingjing Li, Xian-Sheng Hua, Hong-Jiang Zhang
  • Patent number: 6784354
    Abstract: Systems and methods for extracting a music snippet from a music stream are described. In one aspect, the music stream is divided into multiple frames of fixed length. The most-salient frame of the multiple frames is then identified. One or more music sentences are then extracted from the music stream as a function of peaks and valleys of acoustic energy across sequential music stream portions. The music snippet is the sentence that includes the most-salient frame.
    Type: Grant
    Filed: March 13, 2003
    Date of Patent: August 31, 2004
    Assignee: Microsoft Corporation
    Inventors: Lie Lu, Hong-Jiang Zhang, Po Yuan
  • Publication number: 20040165784
    Abstract: Systems and methods for adapting images for substantially optimal presentation on client displays of heterogeneous sizes are described. In one aspect, an image is modeled with respect to multiple visual attentions to generate respective attention objects for each of the visual attentions. For each of one or more image adaptation schemes, an objective measure of information fidelity (IF) is determined for a region R of the image. The objective measures are determined as a function of a resource constraint of the display device and as a function of a weighted sum of the IF of each attention object in the region R. A substantially optimal adaptation scheme is then selected as a function of the calculated objective measures. The image is then adapted via the selected adaptation scheme to generate an adapted image as a function of at least the target area of the client display.
    Type: Application
    Filed: February 20, 2003
    Publication date: August 26, 2004
    Inventors: Xing Xie, Wei-Ying Ma, Hong-Jiang Zhang, Liqun Chen, Xin Fan
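As a simplified stand-in for the information-fidelity objective described above, the sketch below scores each candidate crop (already sized to the client display) by the total weight of the attention objects it fully contains and keeps the best one. The containment test and the simple additive score are assumptions; the patent's IF measure and adaptation schemes are richer than this.

```python
def contained(obj_box, region):
    """True if an attention object's (x, y, w, h) box lies inside the region."""
    ox, oy, ow, oh = obj_box
    rx, ry, rw, rh = region
    return ox >= rx and oy >= ry and ox + ow <= rx + rw and oy + oh <= ry + rh

def best_region(attention_objects, candidate_regions):
    """Pick the candidate crop with the largest total attention weight.

    attention_objects: list of (box, weight) pairs from the attention model.
    candidate_regions: (x, y, w, h) crops that already satisfy the
    display-size (resource) constraint.
    """
    def fidelity(region):
        return sum(w for box, w in attention_objects if contained(box, region))
    return max(candidate_regions, key=fidelity)
```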
  • Publication number: 20040161154
    Abstract: Systems and methods for learning-based automatic commercial content detection are described. In one aspect, program data is divided into multiple segments. The segments are analyzed to determine visual, audio, and context-based feature sets that differentiate commercial content from non-commercial content. The context-based features are a function of single-side left and/or right neighborhoods of segments of the multiple segments.
    Type: Application
    Filed: February 18, 2003
    Publication date: August 19, 2004
    Inventors: Xian-Sheng Hua, Lie Lu, Mingjing Li, Hong-Jiang Zhang
  • Publication number: 20040145602
    Abstract: A technique is provided for organizing and displaying digital photographs based on time. The technique includes inputting data representing a photograph and storing the data as a photograph image file. The technique then identifies the manner in which the photograph image file stores time information (such as date and time of day). For instance, the technique determines whether the time information is digitally encoded in the image file, or whether it is embedded within the image data itself. The technique next includes extracting the time information from the photograph image file using a technique appropriate to the identified manner in which the time information is stored, to produce extracted time information. The photographs are then inserted into a time sequence based on the extracted time information, and presented on a calendar display at a location representative of the chronological placement of the photograph within the time sequence.
    Type: Application
    Filed: January 24, 2003
    Publication date: July 29, 2004
    Applicant: Microsoft Corporation
    Inventors: Yan-Feng Sun, Lei Zhang, Mingjing Li, Hong-Jiang Zhang
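The "identify how the time is stored, then extract it accordingly" step described above can be sketched with Pillow: read the EXIF DateTime tag when the file carries one, otherwise fall back to the file's modification time. The tag choice (306, DateTime) and the fallback are simplifying assumptions; the patent also covers time stamps embedded in the image pixels themselves, which is not attempted here.

```python
from datetime import datetime
from pathlib import Path

from PIL import Image  # Pillow

def photo_timestamp(path):
    """Best-effort capture time for a photo file."""
    try:
        exif = Image.open(path).getexif()
        raw = exif.get(306)                  # EXIF DateTime, 'YYYY:MM:DD HH:MM:SS'
        if raw:
            return datetime.strptime(raw, "%Y:%m:%d %H:%M:%S")
    except (OSError, ValueError):
        pass
    return datetime.fromtimestamp(Path(path).stat().st_mtime)

def sort_into_timeline(paths):
    """Order photos chronologically, as a calendar view would need."""
    return sorted(paths, key=photo_timestamp)
```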
  • Patent number: 6748398
    Abstract: An implementation of a technology, described herein, for relevance-feedback, content-based image retrieval facilitates accurate and efficient retrieval by minimizing the number of iterations of user feedback regarding the semantic relevance of exemplary images while maximizing the resulting relevance of each iteration. One technique for accomplishing this is to use a Bayesian classifier to treat positive and negative feedback examples with different strategies. In addition, query refinement techniques are applied to pinpoint the users' intended queries with respect to their feedback. These techniques further enhance the accuracy and usability of relevance feedback. This abstract itself is not intended to limit the scope of this patent. The scope of the present invention is pointed out in the appended claims.
    Type: Grant
    Filed: March 30, 2001
    Date of Patent: June 8, 2004
    Assignee: Microsoft Corporation
    Inventors: Hong-Jiang Zhang, Zhong Su, Xingquan Zhu
  • Publication number: 20040107100
    Abstract: A method is provided for real-time speaker change detection and speaker tracking in a speech signal. The method is a "coarse-to-refine" process that consists of two stages: pre-segmentation and refinement. In the pre-segmentation stage, the covariance of the feature vectors of each speech segment is built first. A distance is determined based on the covariances of the current segment and a previous segment, and the distance is used to determine whether there is a potential speaker change between the two segments. If there is no speaker change, the model of the currently identified speaker is updated by incorporating data from the current segment. Otherwise, a refinement process is used to confirm the potential speaker-change point.
    Type: Application
    Filed: November 29, 2002
    Publication date: June 3, 2004
    Inventors: Lie Lu, Hong-Jiang Zhang
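One way to realize the covariance-based distance in the pre-segmentation stage above is a symmetric divergence between Gaussians fitted to the feature vectors of consecutive segments. The specific divergence, the regularization term, and the change threshold are illustrative assumptions; the refinement stage is not reproduced.

```python
import numpy as np

def covariance_distance(feats_a, feats_b):
    """Symmetric divergence between two speech segments, each given as
    a (frames, dims) matrix of acoustic features (e.g. MFCCs)."""
    dims = feats_a.shape[1]
    ca = np.cov(feats_a, rowvar=False) + 1e-6 * np.eye(dims)
    cb = np.cov(feats_b, rowvar=False) + 1e-6 * np.eye(dims)
    # Symmetric KL divergence between zero-mean Gaussians with these covariances.
    return 0.5 * (np.trace(np.linalg.solve(ca, cb))
                  + np.trace(np.linalg.solve(cb, ca))) - dims

def potential_change(prev_feats, cur_feats, threshold=5.0):
    """Flag a possible speaker change between consecutive segments;
    the threshold is an arbitrary placeholder."""
    return covariance_distance(prev_feats, cur_feats) > threshold
```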
  • Publication number: 20040103371
    Abstract: A large web page is analyzed and partitioned into smaller sub-pages so that a user can navigate the web page on a small form factor device. The user can browse the sub-pages to find and read information in the content of the large web page. The partitioning can be performed at a web server, an edge server, at the small form factor device, or can be distributed across one or more such devices. The analysis leverages design habits of a web page author to extract a representation structure of an authored web page. The extracted representation structure includes high level structure using several markup language tag selection rules and low level structure using visual boundary detection in which visual units of the low level structure are provided by clustering markup language tags. User viewing habits can be learned to display favorite parts of a web page.
    Type: Application
    Filed: November 27, 2002
    Publication date: May 27, 2004
    Inventors: Yu Chen, Wei-Ying Ma, Ming-Yu Wang, Hong Jiang Zhang
  • Patent number: 6738512
    Abstract: Shape suppression is used to identify areas of images that include particular shapes. According to one embodiment, a Vector Quantization (VQ)-based shape classifier is designed to identify the vertical edges of a set of shapes (e.g., English letters and numbers). A shape suppression filter is applied to the candidate areas, which are identified from a vertical edge map according to the edge density, to remove the vertical edges which are not classified as characteristic of shapes. Areas with high enough edge density after the filtering are identified as potential areas of the image that include one or more of the set of shapes.
    Type: Grant
    Filed: June 19, 2000
    Date of Patent: May 18, 2004
    Assignee: Microsoft Corporation
    Inventors: Xiangrong Chen, Hong-Jiang Zhang
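The edge-density step that produces candidate areas (entry above) can be sketched as follows; the edge threshold, window size, and density cutoff are placeholders, and the VQ-based shape classifier that then filters these candidates is not reproduced.

```python
import numpy as np

def candidate_text_areas(gray, win=32, stride=16, min_density=0.15):
    """Return (x, y, density) for windows of a grayscale image whose
    vertical-edge density is high enough to be candidate shape/text areas."""
    gx = np.abs(np.diff(gray.astype(np.float32), axis=1))  # horizontal gradient
    edges = gx > gx.mean() + 2.0 * gx.std()                # crude vertical-edge map

    candidates = []
    h, w = edges.shape
    for y in range(0, h - win + 1, stride):
        for x in range(0, w - win + 1, stride):
            density = edges[y:y + win, x:x + win].mean()
            if density > min_density:
                candidates.append((x, y, float(density)))
    return candidates
```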
  • Publication number: 20040086046
    Abstract: Systems and methods to generate a motion attention model of a video data sequence are described. In one aspect, a motion saliency map B is generated to precisely indicate motion attention areas for each frame in the video data sequence. The motion saliency maps are each based on intensity I, spatial coherence Cs, and temporal coherence Ct values. These values are extracted from each block or pixel in motion fields that are extracted from the video data sequence. Brightness values of detected motion attention areas in each frame are accumulated to generate, with respect to time, the motion attention model.
    Type: Application
    Filed: November 1, 2002
    Publication date: May 6, 2004
    Inventors: Yu-Fei Ma, Hong-Jiang Zhang
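Of the three quantities named in the entry above, only the intensity term I is simple enough to sketch without guessing at the patented definitions of Cs and Ct; the code below therefore accumulates normalized motion-vector magnitude per frame and leaves the coherence terms and the block-level saliency map aside.

```python
import numpy as np

def motion_intensity(flow):
    """Normalized magnitude of a dense (H, W, 2) motion field."""
    mag = np.hypot(flow[..., 0], flow[..., 1])
    return mag / (mag.max() + 1e-6)

def motion_attention_curve(flows):
    """One attention value per frame, from motion intensity alone.
    The spatial (Cs) and temporal (Ct) coherence terms of the patented
    model are intentionally omitted."""
    return [float(motion_intensity(f).mean()) for f in flows]
```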