Patents by Inventor Kim Silverman

Kim Silverman has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20100082349
    Abstract: Algorithms for synthesizing speech used to identify media assets are provided. Speech may be selectively synthesized from text strings associated with media assets. A text string may be normalized and its native language determined for obtaining a target phoneme for providing human-sounding speech in a language (e.g., dialect or accent) that is familiar to a user. The algorithms may be implemented on a system including several dedicated render engines. The system may be part of a back end coupled to a front end including storage for media assets and associated synthesized speech, and a request processor for receiving and processing requests that result in providing the synthesized speech. The front end may communicate media assets and associated synthesized speech content over a network to host devices coupled to portable electronic devices on which the media assets and synthesized speech are played back.
    Type: Application
    Filed: September 29, 2008
    Publication date: April 1, 2010
    Applicant: Apple Inc.
    Inventors: Jerome Bellegarda, Devang Naik, Kim Silverman
  • Publication number: 20100082329
    Abstract: Algorithms for synthesizing speech used to identify media assets are provided. Speech may be selectively synthesized from text strings associated with media assets. A text string may be normalized and its native language determined for obtaining a target phoneme for providing human-sounding speech in a language (e.g., dialect or accent) that is familiar to a user. The algorithms may be implemented on a system including several dedicated render engines. The system may be part of a back end coupled to a front end including storage for media assets and associated synthesized speech, and a request processor for receiving and processing requests that result in providing the synthesized speech. The front end may communicate media assets and associated synthesized speech content over a network to host devices coupled to portable electronic devices on which the media assets and synthesized speech are played back.
    Type: Application
    Filed: September 29, 2008
    Publication date: April 1, 2010
    Applicant: Apple Inc.
    Inventors: Kim Silverman, Devang Naik, Kevin Lenzo, Caroline Henton
  • Publication number: 20100082344
    Abstract: Algorithms for synthesizing speech used to identify media assets are provided. Speech may be selectively synthesized from text strings associated with media assets. A text string may be normalized and its native language determined for obtaining a target phoneme for providing human-sounding speech in a language (e.g., dialect or accent) that is familiar to a user. The algorithms may be implemented on a system including several dedicated render engines. The system may be part of a back end coupled to a front end including storage for media assets and associated synthesized speech, and a request processor for receiving and processing requests that result in providing the synthesized speech. The front end may communicate media assets and associated synthesized speech content over a network to host devices coupled to portable electronic devices on which the media assets and synthesized speech are played back.
    Type: Application
    Filed: September 29, 2008
    Publication date: April 1, 2010
    Applicant: Apple Inc.
    Inventors: Devang Naik, Kim Silverman, Jerome Bellegarda
  • Publication number: 20080155438
    Abstract: Methods and systems for providing graphical user interfaces are described. Overlaid, information-bearing windows whose contents remain unchanged for a predetermined period of time become translucent. The translucency can be graduated so that, over time, if the window's contents remain unchanged, the window becomes more translucent. In addition to visual translucency, windows also have a manipulative translucent quality. Upon reaching a certain level of visual translucency, user input in the region of the window is interpreted as an operation on the underlying objects rather than the contents of the overlaying window.
    Type: Application
    Filed: March 11, 2008
    Publication date: June 26, 2008
    Applicant: Apple Inc.
    Inventors: Thomas Bonura, Kim Silverman
  • Publication number: 20080091430
    Abstract: A method and apparatus is provided for generating speech that sounds more natural. In one embodiment, word prominence and latent semantic analysis are used to generate more natural sounding speech. A method for generating speech that sounds more natural may comprise generating synthesized speech having certain word prominence characteristics and applying a semantically-driven word prominence assignment model to specify word prominence consistent with the way humans assign word prominence.
    Type: Application
    Filed: December 4, 2007
    Publication date: April 17, 2008
    Inventors: Jerome Bellegarda, Kim Silverman
  • Patent number: 7343562
    Abstract: Methods and systems for providing graphical user interfaces are described. Overlaid, information-bearing windows whose contents remain unchanged for a predetermined period of time become translucent. The translucency can be graduated so that, over time, if the window's contents remain unchanged, the window becomes more translucent. In addition to visual translucency, windows according to the present invention also have a manipulative translucent quality. Upon reaching a certain level of visual translucency, user input in the region of the window is interpreted as an operation on the underlying objects rather than the contents of the overlaying window.
    Type: Grant
    Filed: November 5, 2003
    Date of Patent: March 11, 2008
    Assignee: Apple Inc.
    Inventors: Thomas Bonura, Kim Silverman
  • Publication number: 20070294083
    Abstract: A method and system for training a user authentication by voice signal are described. In one embodiment, a set of feature vectors are decomposed into speaker-specific recognition units. The speaker-specific recognition units are used to compute distribution values to train the voice signal. In addition, spectral feature vectors are decomposed into speaker-specific characteristic units which are compared to the speaker-specific distribution values. If the speaker-specific characteristic units are within a threshold limit of the speaker-specific distribution values, the speech signal is authenticated.
    Type: Application
    Filed: June 11, 2007
    Publication date: December 20, 2007
    Inventors: Jerome Bellegarda, Kim Silverman
  • Publication number: 20070106742
    Abstract: A method and apparatus for filtering messages, comprising: determining a first semantic anchor corresponding to a first group of messages (for example, legitimate messages) and a second semantic anchor corresponding to a second group of messages (for example, unsolicited messages); determining a vector corresponding to an incoming message; comparing the vector corresponding to the incoming message with at least one of the first semantic anchor and the second semantic anchor to obtain a first comparison value and a second comparison value; and filtering the incoming message based on the first comparison value and the second comparison value.
    Type: Application
    Filed: December 20, 2006
    Publication date: May 10, 2007
    Inventors: Jerome Bellegarda, Devang Naik, Kim Silverman
  • Publication number: 20060206574
    Abstract: A method and apparatus for filtering messages, comprising: determining a first semantic anchor corresponding to a first group of messages (for example, legitimate messages) and a second semantic anchor corresponding to a second group of messages (for example, unsolicited messages); determining a vector corresponding to an incoming message; comparing the vector corresponding to the incoming message with at least one of the first semantic anchor and the second semantic anchor to obtain a first comparison value and a second comparison value; and filtering the incoming message based on the first comparison value and the second comparison value.
    Type: Application
    Filed: May 9, 2006
    Publication date: September 14, 2006
    Inventors: Jerome Bellegarda, Devang Naik, Kim Silverman
  • Publication number: 20060168150
    Abstract: Improved techniques for providing supplementary media for media items are disclosed. The media items are typically fixed media items. The supplementary media is one or more of audio, video, image, or text that is provided by a user to supplement (e.g., personalize, customize, annotate, etc.) the fixed media items. In one embodiment, the supplementary media can be provided by user interaction with an on-line media store where media items can be browsed, searched, purchased and/or acquired via a computer network. In another embodiment, the supplementary media can be generated on a playback device.
    Type: Application
    Filed: March 6, 2006
    Publication date: July 27, 2006
    Inventors: Devang Naik, Kim Silverman, Guy Tribble
  • Publication number: 20050038650
    Abstract: A method and apparatus to use semantic inference with speech recognition systems includes recognizing at least one spoken word, processing the spoken word using a context-free grammar, deriving an output from the context-free grammar, and translating the output to a predetermined command.
    Type: Application
    Filed: September 21, 2004
    Publication date: February 17, 2005
    Inventors: Jerome Bellegarda, Kim Silverman
  • Patent number: 6785652
    Abstract: A method and an apparatus for improved duration modeling of phonemes in a speech synthesis system are provided. According to one aspect, text is received into a processor of a speech synthesis system. The received text is processed using a sum-of-products phoneme duration model that is used in either the formant method or the concatenative method of speech generation. The phoneme duration model, which is used along with a phoneme pitch model, is produced by developing a non-exponential functional transformation form for use with a generalized additive model. The non-exponential functional transformation form comprises a root sinusoidal transformation that is controlled in response to a minimum phoneme duration and a maximum phoneme duration. The minimum and maximum phoneme durations are observed in training data. The received text is processed by specifying at least one of a number of contextual factors for the generalized additive model.
    Type: Grant
    Filed: December 19, 2002
    Date of Patent: August 31, 2004
    Assignee: Apple Computer, Inc.
    Inventors: Jerome R. Bellegarda, Kim Silverman
  • Publication number: 20040090467
    Abstract: Methods and systems for providing graphical user interfaces are described. Overlaid, information-bearing windows whose contents remain unchanged for a predetermined period of time become translucent. The translucency can be graduated so that, over time, if the window's contents remain unchanged, the window becomes more translucent. In addition to visual translucency, windows according to the present invention also have a manipulative translucent quality. Upon reaching a certain level of visual translucency, user input in the region of the window is interpreted as an operation on the underlying objects rather than the contents of the overlaying window.
    Type: Application
    Filed: November 5, 2003
    Publication date: May 13, 2004
    Applicant: Apple Computer, Inc.
    Inventors: Thomas Bonura, Kim Silverman
  • Patent number: 6697779
    Abstract: A method and system for training a user authentication by voice signal are described. In one embodiment, during training, a set of all spectral feature vectors for a given speaker is globally decomposed into speaker-specific decomposition units and a speaker-specific recognition unit. During recognition, spectral feature vectors are locally decomposed into speaker-specific characteristic units. The speaker-specific recognition unit is used together with selected speaker-specific characteristic units to compute a speaker-specific comparison unit. If the speaker-specific comparison unit is within a threshold limit, then the voice signal is authenticated. In addition, a speaker-specific content unit is time-aligned with selected speaker-specific characteristic units. If the alignment is within a threshold limit, then the voice signal is authenticated. In one embodiment, if both thresholds are satisfied, then the user is authenticated.
    Type: Grant
    Filed: September 29, 2000
    Date of Patent: February 24, 2004
    Assignee: Apple Computer, Inc.
    Inventors: Jerome Bellegarda, Devang Naik, Matthias Neeracher, Kim Silverman
  • Patent number: 6670970
    Abstract: Methods and systems for providing graphical user interfaces are described. Overlaid, information-bearing windows whose contents remain unchanged for a predetermined period of time become translucent. The translucency can be graduated so that, over time, if the window's contents remain unchanged, the window becomes more translucent. In addition to visual translucency, windows according to the present invention also have a manipulative translucent quality. Upon reaching a certain level of visual translucency, user input in the region of the window is interpreted as an operation on the underlying objects rather than the contents of the overlaying window.
    Type: Grant
    Filed: December 20, 1999
    Date of Patent: December 30, 2003
    Assignee: Apple Computer, Inc.
    Inventors: Thomas Bonura, Kim Silverman
  • Publication number: 20030093277
    Abstract: A method and an apparatus for improved duration modeling of phonemes in a speech synthesis system are provided. According to one aspect, text is received into a processor of a speech synthesis system. The received text is processed using a sum-of-products phoneme duration model that is used in either the formant method or the concatenative method of speech generation. The phoneme duration model, which is used along with a phoneme pitch model, is produced by developing a non-exponential functional transformation form for use with a generalized additive model. The non-exponential functional transformation form comprises a root sinusoidal transformation that is controlled in response to a minimum phoneme duration and a maximum phoneme duration. The minimum and maximum phoneme durations are observed in training data. The received text is processed by specifying at least one of a number of contextual factors for the generalized additive model.
    Type: Application
    Filed: December 19, 2002
    Publication date: May 15, 2003
    Inventors: Jerome R. Bellegarda, Kim Silverman
  • Patent number: 6553344
    Abstract: A method and an apparatus for improved duration modeling of phonemes in a speech synthesis system are provided. According to one aspect, text is received into a processor of a speech synthesis system. The received text is processed using a sum-of-products phoneme duration model that is used in either the formant method or the concatenative method of speech generation. The phoneme duration model, which is used along with a phoneme pitch model, is produced by developing a non-exponential functional transformation form for use with a generalized additive model. The non-exponential functional transformation form comprises a root sinusoidal transformation that is controlled in response to a minimum phoneme duration and a maximum phoneme duration. The minimum and maximum phoneme durations are observed in training data. The received text is processed by specifying at least one of a number of contextual factors for the generalized additive model.
    Type: Grant
    Filed: February 22, 2002
    Date of Patent: April 22, 2003
    Assignee: Apple Computer, Inc.
    Inventors: Jerome R. Bellegarda, Kim Silverman
  • Publication number: 20020138270
    Abstract: A method and an apparatus for improved duration modeling of phonemes in a speech synthesis system are provided. According to one aspect, text is received into a processor of a speech synthesis system. The received text is processed using a sum-of-products phoneme duration model that is used in either the formant method or the concatenative method of speech generation. The phoneme duration model, which is used along with a phoneme pitch model, is produced by developing a non-exponential functional transformation form for use with a generalized additive model. The non-exponential functional transformation form comprises a root sinusoidal transformation that is controlled in response to a minimum phoneme duration and a maximum phoneme duration. The minimum and maximum phoneme durations are observed in training data. The received text is processed by specifying at least one of a number of contextual factors for the generalized additive model.
    Type: Application
    Filed: February 22, 2002
    Publication date: September 26, 2002
    Applicant: Apple Computer, Inc.
    Inventors: Jerome R. Bellegarda, Kim Silverman
  • Patent number: 6366884
    Abstract: A method and an apparatus for improved duration modeling of phonemes in a speech synthesis system are provided. According to one aspect, text is received into a processor of a speech synthesis system. The received text is processed using a sum-of-products phoneme duration model that is used in either the formant method or the concatenative method of speech generation. The phoneme duration model, which is used along with a phoneme pitch model, is produced by developing a non-exponential functional transformation form for use with a generalized additive model. The non-exponential functional transformation form comprises a root sinusoidal transformation that is controlled in response to a minimum phoneme duration and a maximum phoneme duration. The minimum and maximum phoneme durations are observed in training data. The received text is processed by specifying at least one of a number of contextual factors for the generalized additive model.
    Type: Grant
    Filed: November 8, 1999
    Date of Patent: April 2, 2002
    Assignee: Apple Computer, Inc.
    Inventors: Jerome R. Bellegarda, Kim Silverman
  • Patent number: 6064960
    Abstract: A method and an apparatus for improved duration modeling of phonemes in a speech synthesis system are provided. According to one aspect, text is received into a processor of a speech synthesis system. The received text is processed using a sum-of-products phoneme duration model that is used in either the formant method or the concatenative method of speech generation. The phoneme duration model, which is used along with a phoneme pitch model, is produced by developing a non-exponential functional transformation form for use with a generalized additive model. The non-exponential functional transformation form comprises a root sinusoidal transformation that is controlled in response to a minimum phoneme duration and a maximum phoneme duration. The minimum and maximum phoneme durations are observed in training data. The received text is processed by specifying at least one of a number of contextual factors for the generalized additive model.
    Type: Grant
    Filed: December 18, 1997
    Date of Patent: May 16, 2000
    Assignee: Apple Computer, Inc.
    Inventors: Jerome R. Bellegarda, Kim Silverman
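
Illustrative sketch: the message-filtering abstracts listed above (publication numbers 20070106742 and 20060206574) describe comparing an incoming message's vector with two semantic anchors and filtering on the two comparison values. The Python below is one possible reading of that flow, not the patented method. The abstracts do not say how the vectors or anchors are built, so the word-count vectors, the summed-count anchors, the cosine comparison, and every function name here are assumptions made for illustration only.

```python
import math
from collections import Counter


def vectorize(text):
    """Map a message to a sparse word-count vector (assumed representation)."""
    return Counter(text.lower().split())


def build_anchor(messages):
    """Stand-in for a semantic anchor: summed word counts over one group."""
    anchor = Counter()
    for message in messages:
        anchor.update(vectorize(message))
    return anchor


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(message, legitimate_anchor, unsolicited_anchor):
    """Compare the message vector with both anchors and pick the closer group."""
    vector = vectorize(message)
    first_value = cosine(vector, legitimate_anchor)    # first comparison value
    second_value = cosine(vector, unsolicited_anchor)  # second comparison value
    # Ties fall back to "legitimate" so that mail is not dropped on no evidence.
    return "unsolicited" if second_value > first_value else "legitimate"


legitimate_anchor = build_anchor(
    ["meeting agenda for monday", "project status report attached"]
)
unsolicited_anchor = build_anchor(
    ["win free money now", "free prize claim now"]
)

print(classify("claim your free prize", legitimate_anchor, unsolicited_anchor))
print(classify("status of the project meeting", legitimate_anchor, unsolicited_anchor))
```

With these toy training groups, the first message shares words only with the unsolicited group and the second only with the legitimate group, so each is assigned to the anchor it overlaps. A real system would need far larger training groups and, most likely, a richer vector space than raw word counts.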