Patents by Inventor Slava Shechtman

Slava Shechtman has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10418025
    Abstract: A method for producing speech comprises: accessing an expressive prosody model, wherein the model is generated by: receiving a plurality of non-neutral prosody vector sequences, each vector associated with one of a plurality of time-instances; receiving a plurality of expression labels, each having a time-instance selected from a plurality of non-neutral time-instances of the plurality of time-instances; producing a plurality of neutral prosody vector sequences equivalent to the plurality of non-neutral sequences by applying a linear combination of a plurality of statistical measures to a plurality of sub-sequences selected according to an identified proximity test applied to a plurality of neutral time-instances of the plurality of time-instances; and training at least one machine learning module using the plurality of non-neutral sequences and the plurality of neutral sequences to produce an expressive prosodic model; and using the model within a Text-To-Speech-System to produce an audio waveform from an in
    Type: Grant
    Filed: December 6, 2017
    Date of Patent: September 17, 2019
    Assignee: International Business Machines Corporation
    Inventors: Slava Shechtman, Zvi Kons
  • Publication number: 20190172443
    Abstract: A method for producing speech comprises: accessing an expressive prosody model, wherein the model is generated by: receiving a plurality of non-neutral prosody vector sequences, each vector associated with one of a plurality of time-instances; receiving a plurality of expression labels, each having a time-instance selected from a plurality of non-neutral time-instances of the plurality of time-instances; producing a plurality of neutral prosody vector sequences equivalent to the plurality of non-neutral sequences by applying a linear combination of a plurality of statistical measures to a plurality of sub-sequences selected according to an identified proximity test applied to a plurality of neutral time-instances of the plurality of time-instances; and training at least one machine learning module using the plurality of non-neutral sequences and the plurality of neutral sequences to produce an expressive prosodic model; and using the model within a Text-To-Speech-System to produce an audio waveform from an in
    Type: Application
    Filed: December 6, 2017
    Publication date: June 6, 2019
    Inventors: Slava Shechtman, Zvi Kons
  • Patent number: 10226702
    Abstract: A computer-implemented method, computerized apparatus and computer program product. The method comprises capturing one or more images of a scene in which a driver is driving a vehicle; analyzing the images to retrieve an event or detail; conveying to the driver the a question or a challenge related to the event or detail; receiving a response from the driver; analyzing the response; and determining a score related to the driver.
    Type: Grant
    Filed: May 25, 2015
    Date of Patent: March 12, 2019
    Assignee: International Business Machines Corporation
    Inventors: Ron Hoory, Mattias Marder, Slava Shechtman
  • Patent number: 9892335
    Abstract: Embodiments of the present invention may provide the capability to identify a specific object being interacted with that may be cheaply and easily included in mass-produced objects. In an embodiment, a computer-implemented method for object identification may comprise receiving a signal produced by a physical interaction with an object to be identified, the signal produced by an identification structure coupled to the object during physical interaction with the object, processing the signal to form digital data representing the signal, and accessing a database using the digital data to retrieve information identifying the object.
    Type: Grant
    Filed: June 5, 2016
    Date of Patent: February 13, 2018
    Assignee: International Business Machines Corporation
    Inventors: Ophir Azulai, Udi Barzelay, Mattias Marder, Dror Porat, Slava Shechtman
  • Publication number: 20170351930
    Abstract: Embodiments of the present invention may provide the capability to identify a specific object being interacted with that may be cheaply and easily included in mass-produced objects. In an embodiment, a computer-implemented method for object identification may comprise receiving a signal produced by a physical interaction with an object to be identified, the signal produced by an identification structure coupled to the object during physical interaction with the object, processing the signal to form digital data representing the signal, and accessing a database using the digital data to retrieve information identifying the object.
    Type: Application
    Filed: June 5, 2016
    Publication date: December 7, 2017
    Applicant: International Business Machines Corporation
    Inventors: Ophir Azulai, Udi Barzelay, Mattias Marder, Dror Porat, Slava Shechtman
  • Patent number: 9685170
    Abstract: According to some embodiments of the present invention, there is provided a computerized method for selecting and correcting pitch marks in speech processing and modification. The method comprises an action of receiving a continuous speech signal representing audible speech recorded by a microphone, where a sequence of pitch values and two or more pitch mark temporal values are computed from the continuous speech signal. The method comprises an action of computing for each of the pitch mark temporal values a lower limit temporal value and an upper limit temporal value by a cross-correlation function of the continuous speech signal around the pitch mark temporal values associated with pairs of elements in the sequence and replacing one or more of the pitch mark temporal values with one or more new temporal value between the lower limit temporal value and the upper limit temporal value.
    Type: Grant
    Filed: October 21, 2015
    Date of Patent: June 20, 2017
    Assignee: International Business Machines Corporation
    Inventor: Slava Shechtman
  • Publication number: 20170117001
    Abstract: According to some embodiments of the present invention, there is provided a computerized method for selecting and correcting pitch marks in speech processing and modification. The method comprises an action of receiving a continuous speech signal representing audible speech recorded by a microphone, where a sequence of pitch values and two or more pitch mark temporal values are computed from the continuous speech signal. The method comprises an action of computing for each of the pitch mark temporal values a lower limit temporal value and an upper limit temporal value by a cross-correlation function of the continuous speech signal around the pitch mark temporal values associated with pairs of elements in the sequence and replacing one or more of the pitch mark temporal values with one or more new temporal value between the lower limit temporal value and the upper limit temporal value.
    Type: Application
    Filed: October 21, 2015
    Publication date: April 27, 2017
    Inventor: Slava Shechtman
  • Patent number: 9564140
    Abstract: Some embodiments relate to techniques for encoding an audio signal represented by a plurality of frames including a first frame. The techniques include using at least one computer hardware processor to perform: obtaining an initial discrete spectral representation of the first frame; obtaining a primary discrete spectral representation of the initial discrete spectral representation at least in part by estimating a phase envelope of the initial discrete spectral representation and evaluating the estimated phase envelope at a discrete set of frequencies; calculating a residual discrete spectral representation of the initial discrete spectral representation based on the initial discrete spectral representation and the primary discrete spectral representation; and encoding the residual discrete spectral representation using a plurality of codewords.
    Type: Grant
    Filed: April 7, 2015
    Date of Patent: February 7, 2017
    Assignee: Nuance Communications, Inc.
    Inventors: Slava Shechtman, Alexander Sorin
  • Publication number: 20160346695
    Abstract: A computer-implemented method, computerized apparatus and computer program product. The method comprises capturing one or more images of a scene in which a driver is driving a vehicle; analyzing the images to retrieve an event or detail; conveying to the driver the a question or a challenge related to the event or detail; receiving a response from the driver; analyzing the response; and determining a score related to the driver.
    Type: Application
    Filed: May 25, 2015
    Publication date: December 1, 2016
    Inventors: Ron Hoory, Mattias Marder, Slava Shechtman
  • Patent number: 9484045
    Abstract: An embodiment according to the invention provides a capability of automatically predicting how favorable a given speech signal is for statistical modeling, which is advantageous in a variety of different contexts. In Multi-Form Segment (MFS) synthesis, for example, an embodiment according to the invention uses prediction capability to provide an automatic acoustic driven template versus model decision maker with an output quality that is high, stable and depends gradually on the system footprint. In speaker selection for a statistical Text-to-Speech synthesis (TTS) system build, as another example context, an embodiment according to the invention enables a fast selection of the most appropriate speaker among several available ones for the full voice dataset recording and preparation, based on a small amount of recorded speech material.
    Type: Grant
    Filed: September 7, 2012
    Date of Patent: November 1, 2016
    Assignee: Nuance Communications, Inc.
    Inventors: Alexander Sorin, Slava Shechtman, Vincent Pollet
  • Patent number: 9484036
    Abstract: Computer systems employing speaker verification as a security approach to prevent un-authorized access by intruders may be tricked by a synthetic speech with voice characteristics similar to those of an authorized user of the computer system. According to at least one example embodiment, a method and corresponding apparatus for detecting a synthetic speech signal include extracting a plurality of speech features from multiple segments of the speech signal; analyzing the plurality of speech features to determine whether the plurality of speech features exhibit periodic variation behavior; and determining whether the speech signal is a synthetic speech signal or a natural speech signal based on whether or not a periodic variation behavior of the plurality of speech features is detected. The embodiments of synthetic speech detection result in security enhancement of the computer system employing speaker verification.
    Type: Grant
    Filed: August 28, 2013
    Date of Patent: November 1, 2016
    Assignee: Nuance Communications, Inc.
    Inventors: Zvi Kons, Hagai Aronowitz, Slava Shechtman
  • Publication number: 20160300580
    Abstract: Some embodiments relate to techniques for encoding an audio signal represented by a plurality of frames including a first frame. The techniques include using at least one computer hardware processor to perform: obtaining an initial discrete spectral representation of the first frame; obtaining a primary discrete spectral representation of the initial discrete spectral representation at least in part by estimating a phase envelope of the initial discrete spectral representation and evaluating the estimated phase envelope at a discrete set of frequencies; calculating a residual discrete spectral representation of the initial discrete spectral representation based on the initial discrete spectral representation and the primary discrete spectral representation; and encoding the residual discrete spectral representation using a plurality of codewords.
    Type: Application
    Filed: April 7, 2015
    Publication date: October 13, 2016
    Applicant: Nuance Communications, Inc.
    Inventors: Slava Shechtman, Alexander Sorin
  • Patent number: 9224402
    Abstract: A method for speech parameterization and coding of a continuous speech signal. The method comprises dividing said speech signal into a plurality of speech frames, and for each one of the plurality of speech frames, modeling said speech frame by a first harmonic modeling to produce a plurality of harmonic model parameters, reconstructing an estimated frame signal from the plurality of harmonic model parameters, subtracting the estimated frame signal from the speech frame to produce a harmonic model residual, performing at least one second harmonic modeling analysis on the first harmonic model residual to determine at least one set of second harmonic model components, removing the at least one set of second harmonic model components from the first harmonic model residual to produce a harmonically-filtered residual signal, and processing the harmonically-filtered residual signal with analysis by synthesis techniques to produce vectors of codebook indices and corresponding gains.
    Type: Grant
    Filed: September 30, 2013
    Date of Patent: December 29, 2015
    Assignee: International Business Machines Corporation
    Inventor: Slava Shechtman
  • Patent number: 9159323
    Abstract: A method including: obtaining, via a plurality of communication devices, a plurality of speech signals respectively associated with human speakers, the speech signals including verbal components and non-verbal components; identifying a plurality of geographical locations, each geographic location associated with a respective one of the plurality of the communication devices; extracting the non-verbal components from the obtained speech signals; deducing physiological or psychological conditions of the human speakers by analyzing, over a specified period, the extracted non-verbal components, using predefined relations between characteristics of the non-verbal components and physiological or psychological conditions of the human speakers; and providing a geographical distribution of the deduced physiological or psychological conditions of the human speakers by associating the deduced physiological or psychological conditions of the human speakers with geographical locations thereof.
    Type: Grant
    Filed: July 29, 2013
    Date of Patent: October 13, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: Slava Shechtman, Raphael Steinberg
  • Publication number: 20150095035
    Abstract: A method for speech parameterization and coding of a continuous speech signal. The method comprises dividing said speech signal into a plurality of speech frames, and for each one of the plurality of speech frames, modeling said speech frame by a first harmonic modeling to produce a plurality of harmonic model parameters, reconstructing an estimated frame signal from the plurality of harmonic model parameters, subtracting the estimated frame signal from the speech frame to produce a harmonic model residual, performing at least one second harmonic modeling analysis on the first harmonic model residual to determine at least one set of second harmonic model components, removing the at least one set of second harmonic model components from the first harmonic model residual to produce a harmonically-filtered residual signal, and processing the harmonically-filtered residual signal with analysis by synthesis techniques to produce vectors of codebook indices and corresponding gains.
    Type: Application
    Filed: September 30, 2013
    Publication date: April 2, 2015
    Applicant: International Business Machines Corporation
    Inventor: Slava Shechtman
  • Publication number: 20150066512
    Abstract: Computer systems employing speaker verification as a security approach to prevent un-authorized access by intruders may be tricked by a synthetic speech with voice characteristics similar to those of an authorized user of the computer system. According to at least one example embodiment, a method and corresponding apparatus for detecting a synthetic speech signal include extracting a plurality of speech features from multiple segments of the speech signal; analyzing the plurality of speech features to determine whether the plurality of speech features exhibit periodic variation behavior; and determining whether the speech signal is a synthetic speech signal or a natural speech signal based on whether or not a periodic variation behavior of the plurality of speech features is detected. The embodiments of synthetic speech detection result in security enhancement of the computer system employing speaker verification.
    Type: Application
    Filed: August 28, 2013
    Publication date: March 5, 2015
    Applicant: NUANCE COMMUNICATIONS, INC.
    Inventors: Zvi Kons, Hagai Aronowitz, Slava Shechtman
  • Patent number: 8786659
    Abstract: A method for responding to media conference deficiencies, the method includes: monitoring, by at least one receiver, a quality of media conference signals being received by at least one receiver during the media conference; sending, in response to the monitoring, to at least an end user transmitter that transmitted the media conference signals, a quality indication representative of a quality of the received media conference signals; recording inadequately received media conference signals that were inadequately received by a certain end user receiver and participating in an activity related to a transmission, to the certain end user receiver, of the inadequately received media conference signals or of a representation of the inadequately received media conference signals.
    Type: Grant
    Filed: May 29, 2012
    Date of Patent: July 22, 2014
    Assignee: International Business Machines Corporation
    Inventors: Ron Hoory, Michael Rodeh, Slava Shechtman
  • Patent number: 8682670
    Abstract: A method, system and computer program product are provided for enhancement of speech synthesized by a statistical text-to-speech (TTS) system employing a parametric representation of speech in a space of acoustic feature vectors. The method includes: defining a parametric family of corrective transformations operating in the space of the acoustic feature vectors and dependent on a set of enhancing parameters; and defining a distortion indictor of a feature vector or a plurality of feature vectors.
    Type: Grant
    Filed: July 7, 2011
    Date of Patent: March 25, 2014
    Assignee: International Business Machines Corporation
    Inventors: Slava Shechtman, Alexander Sorin
  • Publication number: 20140074468
    Abstract: An embodiment according to the invention provides a capability of automatically predicting how favorable a given speech signal is for statistical modeling, which is advantageous in a variety of different contexts. In Multi-Form Segment (MFS) synthesis, for example, an embodiment according to the invention uses prediction capability to provide an automatic acoustic driven template versus model decision maker with an output quality that is high, stable and depends gradually on the system footprint. In speaker selection for a statistical Text-to-Speech synthesis (TTS) system build, as another example context, an embodiment according to the invention enables a fast selection of the most appropriate speaker among several available ones for the full voice dataset recording and preparation, based on a small amount of recorded speech material.
    Type: Application
    Filed: September 7, 2012
    Publication date: March 13, 2014
    Applicant: Nuance Communications, Inc.
    Inventors: Alexander Sorin, Slava Shechtman, Vincent Pollet
  • Publication number: 20130317825
    Abstract: A method including: obtaining, via a plurality of communication devices, a plurality of speech signals respectively associated with human speakers, the speech signals including verbal components and non-verbal components; identifying a plurality of geographical locations, each geographic location associated with a respective one of the plurality of the communication devices; extracting the non-verbal components from the obtained speech signals; deducing physiological or psychological conditions of the human speakers by analyzing, over a specified period, the extracted non-verbal components, using predefined relations between characteristics of the non-verbal components and physiological or psychological conditions of the human speakers; and providing a geographical distribution of the deduced physiological or psychological conditions of the human speakers by associating the deduced physiological or psychological conditions of the human speakers with geographical locations thereof.
    Type: Application
    Filed: July 29, 2013
    Publication date: November 28, 2013
    Applicant: Nuance Communications, Inc.
    Inventors: Slava Shechtman, Raphael Steinberg