Patents by Inventor Hans Peter Graf

Hans Peter Graf has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7260539
    Abstract: Methods and apparatus for rendering a talking head on a client device are disclosed. The client device has a client cache capable of storing audio/visual data associated with rendering the talking head. The method comprises storing, in the client cache, sentences that relate to bridging delays in a dialog, storing sentence templates to be used in dialogs, generating a talking head response to a user inquiry from the client device, and determining whether any of the stored sentences or templates in the client cache relate to the talking head response. If they do, the method comprises instructing the client device to use the appropriate stored sentence or template from the client cache to render at least a part of the talking head response, and transmitting any portion of the talking head response not stored in the client cache to the client device so that a complete talking head response can be rendered.
    Type: Grant
    Filed: April 25, 2003
    Date of Patent: August 21, 2007
    Assignee: AT&T Corp.
    Inventors: Eric Cosatto, Hans Peter Graf, Joern Ostermann
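
As a rough illustration of the cache-then-fetch strategy described in 7260539 above, the sketch below checks a client-side store of canned sentences and fill-in templates before asking the server for the remainder of a response. All names (ClientCache, compose_response, fetch_from_server, the "{slot}" template syntax) are hypothetical; the patent does not specify an API.

```python
# Hypothetical sketch of the client-cache lookup in 7260539.
# Stored sentences cover bridging phrases ("One moment, please...");
# templates have variable slots filled at render time.

class ClientCache:
    def __init__(self):
        self.sentences = {}   # sentence text -> prerendered audio/visual data
        self.templates = {}   # template text containing "{slot}" -> data

    def store_sentence(self, text, av_data):
        self.sentences[text] = av_data

    def store_template(self, template, av_data):
        self.templates[template] = av_data


def compose_response(cache, response_text, fetch_from_server):
    """Render from the cache where possible; fetch only the missing part."""
    if response_text in cache.sentences:
        return [cache.sentences[response_text]]          # fully cached
    for template, av in cache.templates.items():
        prefix = template.split("{slot}")[0]
        if response_text.startswith(prefix):
            # The cached template renders the fixed part; the server
            # supplies only the variable portion, reducing delay.
            slot_text = response_text[len(prefix):]
            return [av, fetch_from_server(slot_text)]
    return [fetch_from_server(response_text)]            # nothing cached
```
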
  • Patent number: 7231099
    Abstract: A method of improving the lighting conditions of a real scene or video sequence. Digitally generated light is added to a scene for video conferencing over telecommunication networks. A virtual illumination equation takes into account light attenuation as well as Lambertian and specular reflection. An image of an object is captured, and a virtual light source illuminates the object within the image. The object can be, for example, the head of the user. The position of the head of the user is dynamically tracked so that a three-dimensional model representative of the head of the user is generated. Synthetic light is applied to a position on the model to form an illuminated model.
    Type: Grant
    Filed: August 31, 2005
    Date of Patent: June 12, 2007
    Assignee: AT&T
    Inventors: Andrea Basso, Eric Cosatto, David Crawford Gibbon, Hans Peter Graf, Shan Liu
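
The abstract of 7231099 mentions a virtual illumination equation combining light attenuation with Lambertian and specular reflection. A minimal numeric sketch of such an equation follows; this is a standard Phong-style model, not necessarily the patent's exact formulation, and all coefficient names are illustrative.

```python
import numpy as np

def virtual_illumination(normal, light_pos, view_dir, point,
                         k_d=0.7, k_s=0.3, shininess=16.0,
                         a=1.0, b=0.1, c=0.01):
    """Phong-style intensity at a surface point: distance attenuation
    times (Lambertian diffuse + specular) terms. Illustrative only."""
    to_light = light_pos - point
    d = np.linalg.norm(to_light)
    L = to_light / d
    N = normal / np.linalg.norm(normal)
    V = view_dir / np.linalg.norm(view_dir)

    attenuation = 1.0 / (a + b * d + c * d * d)   # light falloff with distance
    diffuse = k_d * max(np.dot(N, L), 0.0)        # Lambertian term
    R = 2.0 * np.dot(N, L) * N - L                # mirror reflection of L about N
    specular = k_s * max(np.dot(R, V), 0.0) ** shininess
    return attenuation * (diffuse + specular)

# Example: a synthetic light slightly above and in front of a face point.
print(virtual_illumination(np.array([0.0, 0.0, 1.0]),
                           np.array([0.5, 1.0, 2.0]),
                           np.array([0.0, 0.0, 1.0]),
                           np.array([0.0, 0.0, 0.0])))
```
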
  • Patent number: 7209882
    Abstract: A system and method for generating a video sequence having mouth movements synchronized with speech sounds are disclosed. The system utilizes a database of n-phones as the smallest selectable unit, wherein n is larger than 1 and is preferably 3. The system calculates a target cost for each candidate n-phone for a target frame using a phonetic distance, a coarticulation parameter, and the speech rate. For each n-phone in a target sequence, the system searches for candidate n-phones that are visually similar according to the target cost. The system samples each candidate n-phone to obtain the same number of frames as in the target sequence and builds a video frame lattice of candidate video frames. The system assigns a joint cost to each pair of adjacent frames and searches the video frame lattice to construct the video sequence by finding the optimal path through the lattice according to the minimum of the sum of the target cost and the joint cost over the sequence.
    Type: Grant
    Filed: May 10, 2002
    Date of Patent: April 24, 2007
    Assignee: AT&T Corp.
    Inventors: Eric Cosatto, Hans Peter Graf, Fu Jie Huang
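
The search that the abstract of 7209882 describes, a lattice of candidate frames scored by a per-frame target cost plus a joint cost between adjacent frames and minimized over the sequence, is a classic dynamic-programming (Viterbi) problem. A generic sketch, with hypothetical cost functions standing in for the patent's phonetic-distance and smoothness measures:

```python
def best_path(lattice, target_cost, joint_cost):
    """Viterbi search over a frame lattice.
    lattice[t] is the list of candidate frames for target position t;
    target_cost and joint_cost are caller-supplied (hypothetical here).
    Returns the candidate sequence minimizing
    sum(target costs) + sum(joint costs between neighbors)."""
    # cost[t][i]: best cumulative cost ending in candidate i at position t
    cost = [[target_cost(0, c) for c in lattice[0]]]
    back = [[None] * len(lattice[0])]
    for t in range(1, len(lattice)):
        cost.append([])
        back.append([])
        for cand in lattice[t]:
            tc = target_cost(t, cand)
            best_j = min(range(len(lattice[t - 1])),
                         key=lambda j: cost[t - 1][j]
                         + joint_cost(lattice[t - 1][j], cand))
            cost[t].append(cost[t - 1][best_j]
                           + joint_cost(lattice[t - 1][best_j], cand) + tc)
            back[t].append(best_j)
    # Trace back from the cheapest final candidate.
    i = min(range(len(lattice[-1])), key=lambda k: cost[-1][k])
    path = [i]
    for t in range(len(lattice) - 1, 0, -1):
        i = back[t][i]
        path.append(i)
    path.reverse()
    return [lattice[t][i] for t, i in enumerate(path)]
```
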
  • Patent number: 7177811
    Abstract: A method is provided for customizing a multi-media message created by a sender for a recipient, in which the multi-media message includes an animated entity audibly presenting speech converted from text by the sender. At least one image is received from the sender, and each image is associated with a tag. The sender is presented with options to insert the tag associated with one of the images into the sender text.
    Type: Grant
    Filed: March 6, 2006
    Date of Patent: February 13, 2007
    Assignee: AT&T Corp.
    Inventors: Joern Ostermann, Barbara Buda, Mehmet Reha Civanlar, Eric Cosatto, Hans Peter Graf, Thomas M. Isaacson, Yann Andre LeCun
  • Patent number: 7136818
    Abstract: A system and method of controlling the movement of a virtual agent while the agent is speaking to a human user during a conversation is disclosed. The method comprises receiving speech data to be spoken by the virtual agent, performing a prosodic analysis of the speech data, selecting matching prosody patterns from a speaking database and controlling the virtual agent movement according to the selected prosody patterns.
    Type: Grant
    Filed: June 17, 2002
    Date of Patent: November 14, 2006
    Assignee: AT&T Corp.
    Inventors: Eric Cosatto, Hans Peter Graf, Volker Franz Strom
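
A toy version of the selection step in 7136818 (and its listening-side counterpart, 7076430 below): match the prosody of the speech, for example pitch and energy contours, against stored patterns, each linked to an agent movement. The feature set and database layout here are assumptions, not the patent's.

```python
import numpy as np

# Hypothetical speaking database: each entry pairs a prosody contour
# (pitch, energy per frame) with an agent movement to play back.
speaking_db = [
    (np.array([[220, 0.8], [260, 0.9], [300, 1.0]]), "raise_eyebrows"),
    (np.array([[200, 0.5], [190, 0.4], [180, 0.3]]), "nod_slowly"),
    (np.array([[250, 1.0], [250, 1.0], [250, 1.0]]), "hold_gaze"),
]

def select_movement(prosody, db):
    """Pick the stored pattern whose contour is closest (Euclidean
    distance) to the analyzed prosody of the speech."""
    dists = [np.linalg.norm(prosody - pattern) for pattern, _ in db]
    return db[int(np.argmin(dists))][1]

# Rising pitch and energy -> the 'raise_eyebrows' pattern matches best.
print(select_movement(np.array([[230, 0.7], [270, 0.9], [310, 1.0]]),
                      speaking_db))
```
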
  • Patent number: 7117155
    Abstract: A method for generating animated sequences of talking heads in text-to-speech applications wherein a processor samples a plurality of frames comprising image samples. Representative parameters are extracted from the image samples and stored in an animation library. The processor also samples a plurality of multiphones comprising images together with their associated sounds. The processor extracts parameters from these images comprising data characterizing mouth shapes, maps, rules, or equations, and stores the resulting parameters and sound information in a coarticulation library. The animated sequence begins with the processor considering an input phoneme sequence, recalling from the coarticulation library parameters associated with that sequence, and selecting appropriate image samples from the animation library based on that sequence. The image samples are concatenated together, and the corresponding sound is output, to form the animated synthesis.
    Type: Grant
    Filed: October 1, 2003
    Date of Patent: October 3, 2006
    Assignee: AT&T Corp.
    Inventors: Eric Cosatto, Hans Peter Graf, Juergen Schroeter
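
At a very high level, the synthesis loop outlined in 7117155 (and in the related 6662161 and 6112177 below) reduces to: look up coarticulation parameters for each context in the input phoneme sequence, select matching image samples from an animation library, and concatenate them with the audio. A sketch under that reading, with fabricated table contents:

```python
# Hypothetical libraries; a real system would index many mouth shapes
# per multiphone and store the associated sound alongside.
coarticulation_lib = {          # phoneme context -> mouth-shape parameters
    ("sil", "h", "e"): {"open": 0.2, "round": 0.0},
    ("h", "e", "l"):   {"open": 0.6, "round": 0.1},
    ("e", "l", "o"):   {"open": 0.4, "round": 0.7},
}
animation_lib = {               # quantized parameters -> image sample id
    (0.2, 0.0): "frame_closed.png",
    (0.6, 0.1): "frame_open.png",
    (0.4, 0.7): "frame_round.png",
}

def synthesize(phonemes):
    """Walk the phoneme string in overlapping triphone windows, recall
    parameters, and concatenate the matching image samples."""
    frames = []
    padded = ["sil"] + phonemes + ["sil"]
    for i in range(len(phonemes)):
        ctx = tuple(padded[i:i + 3])
        params = coarticulation_lib.get(ctx)
        if params:
            key = (params["open"], params["round"])
            frames.append(animation_lib[key])
    return frames

# -> three sample frames; ('l', 'o', 'sil') has no entry in this toy table.
print(synthesize(["h", "e", "l", "o"]))
```
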
  • Patent number: 7076430
    Abstract: A system and method of controlling the movement of a virtual agent while the agent is listening to a human user during a conversation is disclosed. The method comprises receiving speech data from the user, performing a prosodic analysis of the speech data and controlling the virtual agent movement according to the prosodic analysis.
    Type: Grant
    Filed: June 17, 2002
    Date of Patent: July 11, 2006
    Assignee: AT&T Corp.
    Inventors: Eric Cosatto, Hans Peter Graf, Thomas M. Isaacson, Volker Franz Strom
  • Patent number: 7035803
    Abstract: A system and method of providing sender customization of multi-media messages through the use of inserted images or video. The images or video may be sender-created or predefined and available to the sender via a web server. The method relates to customizing a multi-media message created by a sender for a recipient, the multi-media message having an animated entity audibly presenting speech converted from text created by the sender. The method comprises receiving at least one image from the sender, associating each image with a tag, presenting the sender with options to insert the tag associated with one of the images into the sender text, and, after the sender inserts such a tag, delivering the multi-media message with the image presented as background to the animated entity according to the position of the associated tag in the sender text.
    Type: Grant
    Filed: November 2, 2001
    Date of Patent: April 25, 2006
    Assignee: AT&T Corp.
    Inventors: Joern Ostermann, Barbara Buda, Mehmet Reha Civanlar, Eric Cosatto, Hans Peter Graf, Thomas M. Isaacson, Yann Andre LeCun
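
The tag mechanism in 7035803 (and in its continuation, 7177811 above) essentially splits the sender's text at tag positions and switches the background image while the animated entity speaks each segment. A hypothetical sketch; the `<image:NAME>` tag syntax is invented for illustration:

```python
import re

def split_by_image_tags(text):
    """Split sender text at <image:NAME> tags. Returns (background, segment)
    pairs: each segment is spoken over the background selected by the most
    recent tag (None before any tag). The tag syntax is invented here."""
    segments = []
    background = None
    pos = 0
    for m in re.finditer(r"<image:(\w+)>", text):
        chunk = text[pos:m.start()].strip()
        if chunk:
            segments.append((background, chunk))
        background = m.group(1)       # tag switches the background image
        pos = m.end()
    tail = text[pos:].strip()
    if tail:
        segments.append((background, tail))
    return segments

msg = "Hi Bob! <image:beach> Wish you were here. <image:sunset> See you soon."
print(split_by_image_tags(msg))
# [(None, 'Hi Bob!'), ('beach', 'Wish you were here.'),
#  ('sunset', 'See you soon.')]
```
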
  • Patent number: 6990452
    Abstract: A system and method of providing sender customization of multi-media messages through the use of emoticons is disclosed. The sender inserts the emoticons into a text message. As an animated face audibly delivers the text, each emoticon associated with the message is started a predetermined period of time, or number of words, before the emoticon's position in the message text and is completed a predetermined length of time, or number of words, after that position. The sender may insert emoticons through the use of emoticon buttons, which are icons available for choosing. Upon the sender's selection of an emoticon, an icon representing the emoticon is inserted into the text at the position of the cursor. Once an emoticon is chosen, the sender may also choose its amplitude, and the increased or decreased amplitude is displayed in the icon inserted into the message text.
    Type: Grant
    Filed: November 2, 2001
    Date of Patent: January 24, 2006
    Assignee: AT&T Corp.
    Inventors: Joern Ostermann, Mehmet Reha Civanlar, Eric Cosatto, Hans Peter Graf, Yann Andre LeCun
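
The timing rule in 6990452, starting the facial gesture a fixed number of words before the emoticon's position and finishing a fixed number of words after it, can be expressed as a small scheduling function. The word offsets and tag format below are illustrative, not the patent's values.

```python
def schedule_emoticons(words, emoticon_positions, lead=2, lag=2):
    """Map each emoticon to the word indices where its animation should
    start and end: `lead` words before and `lag` words after its position,
    clamped to the message boundaries. Offsets are illustrative."""
    schedule = []
    last = len(words) - 1
    for pos, emoticon in emoticon_positions:
        start = max(0, pos - lead)
        end = min(last, pos + lag)
        schedule.append((emoticon, start, end))
    return schedule

words = "I just heard the great news about your promotion".split()
# A smile emoticon inserted after word index 5 ("news").
print(schedule_emoticons(words, [(5, ":-)")]))
# [(':-)', 3, 7)] -> the smile spans words 3..7 around the insertion point
```
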
  • Patent number: 6980697
    Abstract: A method of improving the lighting conditions of a real scene or video sequence. Digitally generated light is added to a scene for video conferencing over telecommunication networks. A virtual illumination equation takes into account light attenuation as well as Lambertian and specular reflection. An image of an object is captured, and a virtual light source illuminates the object within the image. The object can be, for example, the head of the user. The position of the head of the user is dynamically tracked so that a three-dimensional model representative of the head of the user is generated. Synthetic light is applied to a position on the model to form an illuminated model.
    Type: Grant
    Filed: January 25, 2002
    Date of Patent: December 27, 2005
    Assignee: AT&T Corp.
    Inventors: Andrea Basso, Eric Cosatto, David Crawford Gibbon, Hans Peter Graf, Shan Liu
  • Patent number: 6963839
    Abstract: A method for customizing a voice in a multi-media message created by a sender for a recipient is disclosed. The multi-media message comprises a text message from the sender to be delivered by an animated entity. The method comprises presenting an option to the sender to insert voice emoticons into the text message associated with parameters of a voice used by the animated entity to deliver the text message. The message is then delivered wherein the voice of the animated entity is modified throughout the message according to the voice emoticons. The voice emoticons may relate to features such as voice stress, volume, pauses, emotion, yelling, or whispering. After the sender inserts various voice emoticons into the text of the message, the animated entity delivers the multi-media message giving effect to each voice emoticon in the text. A volume or intensity of the voice emoticons may be given effect by repeating the tags.
    Type: Grant
    Filed: November 2, 2001
    Date of Patent: November 8, 2005
    Assignee: AT&T Corp.
    Inventors: Joern Ostermann, Mehmet Reha Civanlar, Hans Peter Graf, Thomas M. Isaacson
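
A hedged sketch of the delivery step of 6963839: each voice emoticon adjusts the text-to-speech parameters for the following words, and repeating a tag scales its effect, as the abstract's last sentence suggests. The tag names and parameter deltas are invented.

```python
# Invented mapping from voice emoticons to prosody adjustments.
VOICE_TAGS = {
    "<loud>":    {"volume": +0.2},
    "<whisper>": {"volume": -0.4},
    "<stress>":  {"pitch": +0.15},
    "<pause>":   {"pause_ms": +300},
}

def apply_voice_emoticons(tokens):
    """Fold voice-emoticon tags into per-word prosody settings.
    Repeated tags accumulate, giving a stronger effect."""
    state = {"volume": 0.0, "pitch": 0.0, "pause_ms": 0}
    plan = []
    for tok in tokens:
        if tok in VOICE_TAGS:
            for key, delta in VOICE_TAGS[tok].items():
                state[key] = state.get(key, 0) + delta   # repetition intensifies
        else:
            plan.append((tok, dict(state)))
    return plan

msg = "please <whisper> <whisper> keep this quiet".split()
for word, prosody in apply_voice_emoticons(msg):
    print(word, prosody)
```
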
  • Publication number: 20040215460
    Abstract: Methods and apparatus for rendering a talking head on a client device are disclosed. The client device has a client cache capable of storing audio/visual data associated with rendering the talking head. The method comprises storing, in the client cache, sentences that relate to bridging delays in a dialog, storing sentence templates to be used in dialogs, generating a talking head response to a user inquiry from the client device, and determining whether any of the stored sentences or templates in the client cache relate to the talking head response. If they do, the method comprises instructing the client device to use the appropriate stored sentence or template from the client cache to render at least a part of the talking head response, and transmitting any portion of the talking head response not stored in the client cache to the client device so that a complete talking head response can be rendered.
    Type: Application
    Filed: April 25, 2003
    Publication date: October 28, 2004
    Inventors: Eric Cosatto, Hans Peter Graf, Joern Ostermann
  • Publication number: 20040064321
    Abstract: A method for generating animated sequences of talking heads in text-to-speech applications wherein a processor samples a plurality of frames comprising image samples. Representative parameters are extracted from the image samples and stored in an animation library. The processor also samples a plurality of multiphones comprising images together with their associated sounds. The processor extracts parameters from these images comprising data characterizing mouth shapes, maps, rules, or equations, and stores the resulting parameters and sound information in a coarticulation library. The animated sequence begins with the processor considering an input phoneme sequence, recalling from the coarticulation library parameters associated with that sequence, and selecting appropriate image samples from the animation library based on that sequence. The image samples are concatenated together, and the corresponding sound is output, to form the animated synthesis.
    Type: Application
    Filed: October 1, 2003
    Publication date: April 1, 2004
    Inventors: Eric Cosatto, Hans Peter Graf, Juergen Schroeter
  • Patent number: 6662161
    Abstract: A method for generating animated sequences of talking heads in text-to-speech applications wherein a processor samples a plurality of frames comprising image samples. Representative parameters are extracted from the image samples and stored in an animation library. The processor also samples a plurality of multiphones comprising images together with their associated sounds. The processor extracts parameters from these images comprising data characterizing mouth shapes, maps, rules, or equations, and stores the resulting parameters and sound information in a coarticulation library. The animated sequence begins with the processor considering an input phoneme sequence, recalling from the coarticulation library parameters associated with that sequence, and selecting appropriate image samples from the animation library based on that sequence. The image samples are concatenated together, and the corresponding sound is output, to form the animated synthesis.
    Type: Grant
    Filed: September 7, 1999
    Date of Patent: December 9, 2003
    Assignee: AT&T Corp.
    Inventors: Eric Cosatto, Hans Peter Graf, Juergen Schroeter
  • Patent number: 6654018
    Abstract: A system and method for generating photo-realistic talking-head animation from a text input utilizes an audio-visual unit selection process. The lip-synchronization is obtained by optimally selecting and concatenating variable-length video units of the mouth area. The unit selection process utilizes the acoustic data to determine the target costs for the candidate images and utilizes the visual data to determine the concatenation costs. The image database is prepared in a hierarchical fashion, including high-level features (such as a full 3D modeling of the head, geometric size and position of elements) and pixel-based, low-level features (such as a PCA-based metric for labeling the various feature bitmaps).
    Type: Grant
    Filed: March 29, 2001
    Date of Patent: November 25, 2003
    Assignee: AT&T Corp.
    Inventors: Eric Cosatto, Hans Peter Graf, Gerasimos Potamianos, Juergen Schroeter
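
The "PCA-based metric" that the abstract of 6654018 mentions for labeling feature bitmaps is, in the usual formulation, a Euclidean distance between images projected onto the leading principal components of the database. A generic sketch of that formulation, not the patent's exact procedure:

```python
import numpy as np

def fit_pca(bitmaps, n_components=8):
    """Learn a PCA basis from flattened mouth bitmaps (one image per row)."""
    X = bitmaps.reshape(len(bitmaps), -1).astype(float)
    mean = X.mean(axis=0)
    # SVD of the centered data gives the principal directions in Vt.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def pca_distance(img_a, img_b, mean, basis):
    """Distance between two bitmaps in the low-dimensional eigenspace."""
    a = basis @ (img_a.ravel().astype(float) - mean)
    b = basis @ (img_b.ravel().astype(float) - mean)
    return float(np.linalg.norm(a - b))

# Toy usage with random 16x16 stand-ins for mouth bitmaps.
rng = np.random.default_rng(0)
db = rng.random((50, 16, 16))
mean, basis = fit_pca(db)
print(pca_distance(db[0], db[1], mean, basis))
```
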
  • Patent number: 6504546
    Abstract: A method for modeling three-dimensional objects to create photo-realistic animations using a data-driven approach. The three-dimensional object is defined by a set of separate three-dimensional planes, each plane enclosing an area of the object that undergoes visual changes during animation. Recorded video is used to create bitmap data to populate a database for each three-dimensional plane. The video is analyzed in terms of both rigid movements (changes in pose) and plastic deformation (changes in expression) to create the bitmaps. The modeling is particularly well-suited for animations of a human face, where an audio track generated by a text-to-speech synthesizer can be added to the animation to create a photo-realistic “talking head”.
    Type: Grant
    Filed: February 8, 2000
    Date of Patent: January 7, 2003
    Assignee: AT&T Corp.
    Inventors: Eric Cosatto, Hans Peter Graf
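
A minimal illustration of the rigid side of 6504546: each facial region is a textured 3-D plane whose corners are rotated and translated with the head pose, then projected to the image. The plane geometry, camera, and pose values below are made up, and the plastic-deformation side (selecting a bitmap per expression) would index a database not shown here.

```python
import numpy as np

def rot_y(theta):
    """Rotation matrix for a head turn about the vertical axis (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0.0, s],
                     [0.0, 1.0, 0.0],
                     [-s, 0.0, c]])

def project_plane(corners, R, t, focal=500.0):
    """Apply the rigid pose (R, t) to a plane's 3-D corners and project
    them with a pinhole camera; returns the 2-D quad onto which the
    plane's bitmap would be warped. Values here are illustrative."""
    pts = corners @ R.T + t                  # rigid movement (pose change)
    return np.stack([focal * pts[:, 0] / pts[:, 2],
                     focal * pts[:, 1] / pts[:, 2]], axis=1)

# A mouth plane (4 corners in head-centered coordinates), head turned 15 deg.
mouth = np.array([[-1.0, -0.5, 0.0], [1.0, -0.5, 0.0],
                  [1.0, 0.5, 0.0], [-1.0, 0.5, 0.0]])
quad = project_plane(mouth, rot_y(np.radians(15)), t=np.array([0.0, 0.0, 10.0]))
print(quad)
```
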
  • Patent number: 6118887
    Abstract: A method for tracking heads and faces is disclosed wherein a variety of different representation models can be used to define individual heads and facial features in a multi-channel capable tracking algorithm. The representation models generated by the channels during a sequence of frames are ultimately combined into a representation comprising a highly robust and accurate tracked output. In a preferred embodiment, the method conducts an initial overview procedure to establish the optimal tracking strategy to be used in light of the particular characteristics of the tracking application.
    Type: Grant
    Filed: October 10, 1997
    Date of Patent: September 12, 2000
    Assignee: AT&T Corp.
    Inventors: Eric Cosatto, Hans Peter Graf, Gerasimos Potamianos
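
A schematic of the combination step in 6118887: several tracking channels (for example color, motion, and shape) each propose a head bounding box with a confidence, and the outputs are fused into one robust estimate. The confidence-weighted average below is one plausible fusion rule, not necessarily the patent's.

```python
def fuse_channels(estimates):
    """Fuse per-channel head boxes (x, y, w, h) by confidence weighting.
    estimates: list of (box, confidence) pairs from the tracking channels.
    A simple plausible rule; the patent's combination may differ."""
    total = sum(conf for _, conf in estimates)
    fused = [0.0, 0.0, 0.0, 0.0]
    for box, conf in estimates:
        for i in range(4):
            fused[i] += conf / total * box[i]
    return tuple(fused)

# Color, motion, and shape channels roughly agree; shape is least confident.
channels = [((100, 80, 40, 50), 0.9),   # skin-color channel
            ((104, 78, 42, 52), 0.7),   # frame-difference (motion) channel
            ((96, 84, 38, 48), 0.3)]    # shape/template channel
print(fuse_channels(channels))
```
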
  • Patent number: 6112177
    Abstract: A method for generating animated sequences of talking heads in text-to-speech applications wherein a processor samples a plurality of frames comprising image samples. Representative parameters are extracted from the image samples and stored in an animation library. The processor also samples a plurality of multiphones comprising images together with their associated sounds. The processor extracts parameters from these images comprising data characterizing mouth shapes, maps, rules, or equations, and stores the resulting parameters and sound information in a coarticulation library. The animated sequence begins with the processor considering an input phoneme sequence, recalling from the coarticulation library parameters associated with that sequence, and selecting appropriate image samples from the animation library based on that sequence. The image samples are concatenated together, and the corresponding sound is output, to form the animated synthesis.
    Type: Grant
    Filed: November 7, 1997
    Date of Patent: August 29, 2000
    Assignee: AT&T Corp.
    Inventors: Eric Cosatto, Hans Peter Graf, Juergen Schroeter
  • Patent number: 6028960
    Abstract: A face feature analysis which begins by generating multiple face feature candidates, e.g., eye and nose positions, using an isolated-frame face analysis. Then, a nostril tracking window is defined around a nose candidate, and tests are applied to the pixels therein, based on percentages of skin-color area pixels and nostril area pixels, to determine whether the nose candidate represents an actual nose. Once actual nostrils are identified, the size, separation, and contiguity of the nostrils are determined by projecting the nostril pixels within the nostril tracking window. A mouth window is defined around the mouth region, and mouth detail analysis is then applied to the pixels within the mouth window to identify inner-mouth and teeth pixels and therefrom generate an inner-mouth contour. The nostril position and inner-mouth contour are used to generate a synthetic model head.
    Type: Grant
    Filed: September 20, 1996
    Date of Patent: February 22, 2000
    Assignee: Lucent Technologies Inc.
    Inventors: Hans Peter Graf, Eric David Petajan
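
The nostril test in 6028960 is a pair of pixel-percentage thresholds inside the tracking window. The sketch below classifies pixels crudely by brightness (nostrils dark, skin mid-range); the thresholds and fractions are invented stand-ins for the patent's skin-color and nostril-area criteria.

```python
import numpy as np

def looks_like_nose(window, skin_min=80, skin_max=200, nostril_max=40,
                    min_skin_frac=0.5, min_nostril_frac=0.02):
    """Accept a nose candidate if the tracking window contains enough
    skin-tone pixels and a small but nonzero fraction of dark (nostril)
    pixels. Grayscale stand-in for the patent's color-based tests."""
    w = np.asarray(window, dtype=float)
    skin_frac = np.mean((w >= skin_min) & (w <= skin_max))
    nostril_frac = np.mean(w <= nostril_max)
    return skin_frac >= min_skin_frac and nostril_frac >= min_nostril_frac

# Toy window: mostly skin-toned with two dark nostril blobs.
win = np.full((20, 20), 150.0)
win[8:12, 5:8] = 20.0
win[8:12, 12:15] = 20.0
print(looks_like_nose(win))   # True
```
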
  • Patent number: 5995119
    Abstract: A method for generating photo-realistic characters wherein one or more pictures of an individual are decomposed into a plurality of parameterized facial parts. The facial parts are stored in memory. To create animated frames, the individual facial parts are recalled from memory in a defined manner and overlaid onto a base face to form a whole face, which, in turn, may be overlaid onto a background image to form an animated frame.
    Type: Grant
    Filed: June 6, 1997
    Date of Patent: November 30, 1999
    Assignee: AT&T Corp.
    Inventors: Eric Cosatto, Hans Peter Graf
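
Finally, the compositing step of 5995119, recalling parameterized facial parts and overlaying them onto a base face and background, is straightforward image overlay. A toy sketch with invented part names and positions:

```python
import numpy as np

def overlay(dst, src, mask, top_left):
    """Paste `src` onto `dst` where `mask` is nonzero (simple overlay)."""
    y, x = top_left
    h, w = src.shape[:2]
    region = dst[y:y + h, x:x + w]
    region[mask > 0] = src[mask > 0]
    return dst

# Hypothetical library: part name -> (bitmap, mask, position on base face).
h, w = 8, 12
part = (np.full((h, w), 200, np.uint8), np.ones((h, w), np.uint8), (30, 20))
parts_lib = {"mouth_open": part}

background = np.zeros((100, 100), np.uint8)
base_face = np.full((60, 60), 120, np.uint8)

# Base face onto the background, then the recalled part onto the face.
frame = overlay(background, base_face, np.ones_like(base_face), (20, 20))
bitmap, mask, pos = parts_lib["mouth_open"]
# Part positions are relative to where the base face sits on the background.
frame = overlay(frame, bitmap, mask, (20 + pos[0], 20 + pos[1]))
print(frame.shape)
```
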