Patents by Inventor Shay Ben-David

Shay Ben-David has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Adaptation of speech recognition

Patent number: 9911410

Abstract: A method, computer program product, and system for adapting speech recognition of a user's speech is provided. The method includes receiving a first utterance from a user having a duration below a predetermined threshold, identifying at least one further utterance from the user that provides additional information, generating a concatenated utterance by concatenating the first utterance with the at least one further utterance, transmitting the concatenated utterance to a speech recognition server, receiving a transcription of the concatenated utterance from the speech recognition server that includes a transcription of the first utterance, and extracting the transcription of the first utterance from the transcription of the concatenated utterance. The transcription of the first utterance is based on the additional information provided by the at least one further utterance.

Type: Grant

Filed: August 19, 2015

Date of Patent: March 6, 2018

Assignee: International Business Machines Corporation

Inventor: Shay Ben-David

Detection of target and non-target users using multi-session information

Patent number: 9837080

Abstract: Systems and methods for maintaining speaker recognition performance are provided. A method for maintaining speaker recognition performance, comprises training a plurality of models respectively corresponding to speaker recognition scores from a plurality of speakers over a plurality of sessions, and using the plurality of models to conclude whether a speaker seeking access to an environment is a non-ideal target speaker or a non-ideal non-target speaker. Using the plurality of models to conclude comprises calculating a first probability that the speaker seeking access is the non-ideal target speaker, calculating a second probability that the speaker seeking access is the non-ideal non-target speaker, and determining whether the first probability, the second probability or a sum of the first probability and the second probability is above a probability threshold.

Type: Grant

Filed: August 21, 2014

Date of Patent: December 5, 2017

Assignee: International Business Machines Corporation

Inventors: Hagai Aronowitz, Shay Ben-David, David Nahamoo, Jason W. Pelecanos, Orith Toledo-Ronen

ADAPTATION OF SPEECH RECOGNITION

Publication number: 20170053643

Abstract: A method, computer program product, and system for adapting speech recognition of a user's speech is provided. The method includes receiving a first utterance from a user having a duration below a predetermined threshold, identifying at least one further utterance from the user that provides additional information, generating a concatenated utterance by concatenating the first utterance with the at least one further utterance, transmitting the concatenated utterance to a speech recognition server, receiving a transcription of the concatenated utterance from the speech recognition server that includes a transcription of the first utterance, and extracting the transcription of the first utterance from the transcription of the concatenated utterance. The transcription of the first utterance is based on the additional information provided by the at least one further utterance.

Type: Application

Filed: August 19, 2015

Publication date: February 23, 2017

Inventor: SHAY BEN-DAVID

Synchronization of data streams with associated metadata streams using smallest sum of absolute differences between time indices of data events and metadata events

Patent number: 9535450

Abstract: Synchronizing a data stream with an associated metadata stream by receiving a data stream and a metadata stream having a plurality of metadata events associated with the data stream, identifying within the data stream a plurality of data events, matching each of the data events to one of the metadata events in accordance with a matching criterion, and synchronizing the data stream with the metadata stream by effecting a relative time shift between the metadata stream and the data stream in accordance with a time shift adjustment value that results in the smallest sum of absolute differences between time indices of each matched data event and metadata event.

Type: Grant

Filed: July 17, 2011

Date of Patent: January 3, 2017

Assignee: International Business Machines Corporation

Inventors: Shay Ben-David, Evgeny Hazanovich, Zak Mandel
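As a rough illustration of the synchronization step described in the abstract above (not the patented implementation), the sketch below assumes the data events and metadata events have already been matched into pairs. It then looks for the single time shift that gives the smallest sum of absolute differences between their time indices, relying on the standard result that the median of the pairwise offsets minimizes a sum of absolute deviations. The function names `best_time_shift` and `total_abs_diff` and the sample timestamps are hypothetical.

```python
from statistics import median


def best_time_shift(data_times, metadata_times):
    """Return the shift to add to metadata times that minimizes
    sum(|d - (m + shift)|) over already-matched (d, m) pairs."""
    if not data_times or len(data_times) != len(metadata_times):
        raise ValueError("need equally sized, non-empty lists of matched events")
    # Offset of each matched pair; the median of these offsets minimizes
    # the sum of absolute differences.
    offsets = [d - m for d, m in zip(data_times, metadata_times)]
    return median(offsets)


def total_abs_diff(data_times, metadata_times, shift):
    """Sum of absolute differences after shifting the metadata stream."""
    return sum(abs(d - (m + shift)) for d, m in zip(data_times, metadata_times))


# Hypothetical time indices (seconds) of matched events.
data_times = [10.0, 20.5, 31.0]
metadata_times = [8.0, 18.0, 29.0]

shift = best_time_shift(data_times, metadata_times)
print(shift)                                              # 2.0
print(total_abs_diff(data_times, metadata_times, shift))  # 0.5
print(total_abs_diff(data_times, metadata_times, 2.5))    # 1.0
```

Applying the returned shift to every metadata time index aligns the two streams under this criterion. With an even number of pairs, any value between the two middle offsets does equally well, and `statistics.median` returns their midpoint.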
SYSTEMS AND METHODS FOR DETECTION OF TARGET AND NON-TARGET USERS USING MULTI-SESSION INFORMATION

Publication number: 20160055844

Abstract: Systems and methods for maintaining speaker recognition performance are provided. A method for maintaining speaker recognition performance, comprises training a plurality of models respectively corresponding to speaker recognition scores from a plurality of speakers over a plurality of sessions, and using the plurality of models to conclude whether a speaker seeking access to an environment is a non-ideal target speaker or a non-ideal non-target speaker. Using the plurality of models to conclude comprises calculating a first probability that the speaker seeking access is the non-ideal target speaker, calculating a second probability that the speaker seeking access is the non-ideal non-target speaker, and determining whether the first probability, the second probability or a sum of the first probability and the second probability is above a probability threshold.

Type: Application

Filed: August 21, 2014

Publication date: February 25, 2016

Inventors: Hagai Aronowitz, Shay Ben-David, David Nahamoo, Jason W. Pelecanos, Orith Toledo-Ronen

Synchronizing distributed speech recognition

Patent number: 9208785

Abstract: Methods, apparatus, and computer program products are disclosed for synchronizing distributed speech recognition (‘DSR’) that include receiving in a DSR client notification from a voice server of readiness to conduct speech recognition and, responsive to the receiving, transmitting by the DSR client, from the DSR client to the voice server, speech for recognition.

Type: Grant

Filed: May 10, 2006

Date of Patent: December 8, 2015

Assignee: Nuance Communications, Inc.

Inventors: Shay Ben-David, Charles W. Cross, Jr.

Voice transformation with encoded information

Patent number: 8930182

Abstract: Method, system, and computer program product for voice transformation are provided. The method includes transforming a source speech using transformation parameters, and encoding information on the transformation parameters in an output speech using steganography, wherein the source speech can be reconstructed using the output speech and the information on the transformation parameters. A method for reconstructing voice transformation is also provided including: receiving an output speech of a voice transformation system wherein the output speech is transformed speech which has encoded information on the transformation parameters using steganography; extracting the information on the transformation parameters; and carrying out an inverse transformation of the output speech to obtain an approximation of an original source speech.

Type: Grant

Filed: March 17, 2011

Date of Patent: January 6, 2015

Assignee: International Business Machines Corporation

Inventors: Shay Ben-David, Ron Hoory, Zvi Kons, David Nahamoo

Recognition techniques to enhance automation in a computing environment

Patent number: 8856396

Abstract: Systems and methods for detecting end of a transaction in a computing environment are provided. The method comprises determining a target area in a graphical user environment displayed on a display screen, wherein a change is expected to occur when end of a transaction is reached; masking the target area at least partially to remove content included in the target area that is present before or after the transaction was initiated; monitoring the target area for change in content; and detecting the end of the transaction when the content of the target area has changed.

Type: Grant

Filed: October 20, 2013

Date of Patent: October 7, 2014

Assignee: International Business Machines Corporation

Inventors: Ella Barkan, Shay Ben-David, Amir Geva
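The end-of-transaction detection described in the abstract above can be pictured with a minimal sketch. It is only one possible reading of the masking step, not the patented method: the screen is modeled as a 2D list of pixel values, the target area is a rectangle, and the mask is a set of cell offsets inside that rectangle that are ignored when comparing frames. The names `visible_cells` and `transaction_ended`, and all sample frames, are hypothetical.

```python
def visible_cells(frame, target, masked):
    """Return the unmasked pixel values inside the target rectangle.

    target is (top, left, height, width); masked is a set of (row, col)
    offsets relative to the rectangle's top-left corner.
    """
    top, left, height, width = target
    return [
        frame[top + r][left + c]
        for r in range(height)
        for c in range(width)
        if (r, c) not in masked
    ]


def transaction_ended(baseline, current, target, masked):
    """Report a change only if unmasked content of the target area differs."""
    return visible_cells(baseline, target, masked) != visible_cells(current, target, masked)


# Hypothetical 3x3 screens; the target area is the lower-right 2x2 block.
baseline = [[0, 0, 0], [0, 1, 2], [0, 3, 4]]
target = (1, 1, 2, 2)
masked = {(0, 0)}  # ignore the cell that changes for unrelated reasons

masked_change = [[0, 0, 0], [0, 9, 2], [0, 3, 4]]  # only the masked cell differs
result_shown = [[0, 0, 0], [0, 1, 2], [0, 3, 7]]   # an unmasked cell differs

print(transaction_ended(baseline, masked_change, target, masked))  # False
print(transaction_ended(baseline, result_shown, target, masked))   # True
```

In a monitoring loop, `transaction_ended` would be called on each newly captured frame until it returns `True`.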
Recognition Techniques to Enhance Automation In a Computing Environment

Publication number: 20140101660

Abstract: Systems and methods for detecting end of a transaction in a computing environment are provided. The method comprises determining a target area in a graphical user environment displayed on a display screen, wherein a change is expected to occur when end of a transaction is reached; masking the target area at least partially to remove content included in the target area that is present before or after the transaction was initiated; monitoring the target area for change in content; and detecting the end of the transaction when the content of the target area has changed.

Type: Application

Filed: October 20, 2013

Publication date: April 10, 2014

Applicant: International Business Machines Corporation

Inventors: Ella Barkan, Shay Ben-David, Amir Geva, Mattias Marder

Recognition techniques to enhance automation in a computing environment

Patent number: 8601172

Abstract: Systems and methods for detecting end of a transaction in a computing environment are provided. The method comprises determining a target area in a graphical user environment displayed on a display screen, wherein a change is expected to occur when end of a transaction is reached; masking the target area at least partially to remove content included in the target area that is present before or after the transaction was initiated; monitoring the target area for change in content; and detecting the end of the transaction when the content of the target area has changed.

Type: Grant

Filed: May 19, 2011

Date of Patent: December 3, 2013

Assignee: International Business Machines Corporation

Inventors: Ella Barkan, Shay Ben-David, Amir Geva, Mattias Erwin Marder

Distributed off-line voice services

Patent number: 8451823

Abstract: A voice processing system includes a real-time voice server, which is arranged to process real-time voice processing tasks for clients of the system. A gateway processor is arranged to accept from a client a request to perform an off-line voice processing task, to convert the off-line voice processing task into an equivalent real-time voice processing task, to invoke the voice server to process the equivalent real-time voice processing task, and to output a result of the equivalent real-time voice processing task.

Type: Grant

Filed: December 13, 2005

Date of Patent: May 28, 2013

Assignee: Nuance Communications, Inc.

Inventors: Shay Ben-David, Ron Hoory, Alexey Roytman, Zohar Sivan, James Jude Sliwa

Synchronization of Data Streams with Associated Metadata Streams

Publication number: 20130019121

Abstract: Synchronizing a data stream with an associated metadata stream by receiving a data stream and a metadata stream having a plurality of metadata events associated with the data stream, identifying within the data stream a plurality of data events, matching each of the data events to one of the metadata events in accordance with a matching criterion, and synchronizing the data stream with the metadata stream by effecting a relative time shift between the metadata stream and the data stream in accordance with a time shift adjustment value that results in the smallest sum of absolute differences between time indices of each matched data event and metadata event.

Type: Application

Filed: July 17, 2011

Publication date: January 17, 2013

Applicant: International Business Machines Corporation

Inventors: Shay Ben-David, Evgeny Hazanovich, Zak Mandel

Recognition Techniques to Enhance Automation In a Computing Environment

Publication number: 20120296856

Abstract: Systems and methods for detecting end of a transaction in a computing environment are provided. The method comprises determining a target area in a graphical user environment displayed on a display screen, wherein a change is expected to occur when end of a transaction is reached; masking the target area at least partially to remove content included in the target area that is present before or after the transaction was initiated; monitoring the target area for change in content; and detecting the end of the transaction when the content of the target area has changed.

Type: Application

Filed: May 19, 2011

Publication date: November 22, 2012

Applicant: International Business Machines Corporation

Inventors: Ella Barkan, Shay Ben-David, Amir Geva, Mattias Marder

VOICE TRANSFORMATION WITH ENCODED INFORMATION

Publication number: 20120239387

Abstract: Method, system, and computer program product for voice transformation are provided. The method includes transforming a source speech using transformation parameters, and encoding information on the transformation parameters in an output speech using steganography, wherein the source speech can be reconstructed using the output speech and the information on the transformation parameters. A method for reconstructing voice transformation is also provided including: receiving an output speech of a voice transformation system wherein the output speech is transformed speech which has encoded information on the transformation parameters using steganography; extracting the information on the transformation parameters; and carrying out an inverse transformation of the output speech to obtain an approximation of an original source speech.

Type: Application

Filed: March 17, 2011

Publication date: September 20, 2012

Applicant: International Business Machines Corporation

Inventors: Shay Ben-David, Ron Hoory, Zvi Kons, David Nahamoo

SPEECH OUTPUT WITH CONFIDENCE INDICATION

Publication number: 20110313762

Abstract: A method, system, and computer program product are provided for speech output with confidence indication. The method includes receiving a confidence score for segments of speech or text to be synthesized to speech. The method includes modifying a speech segment by altering one or more parameters of the speech proportionally to the confidence score.

Type: Application

Filed: June 20, 2010

Publication date: December 22, 2011

Applicant: International Business Machines Corporation

Inventors: Shay Ben-David, Ron Hoory

Document session replay for multimodal applications

Patent number: 7801728

Abstract: Methods, apparatus, and computer program products are described for document session replay for multimodal applications, including identifying, by a multimodal browser in dependence upon a log produced by a Form Interpretation Algorithm (‘FIA’) during a previous document session with a user, a speech prompt provided by a multimodal application in the previous document session; identifying, by a multimodal browser in replay mode in dependence upon the log, a response to the prompt provided by a user of the multimodal application in the previous document session; retrieving, by the multimodal browser in dependence upon the log, an X+V page of the multimodal application associated with the speech prompt and the response; rendering, by the multimodal browser, the visual elements of the retrieved X+V page; replaying, by the multimodal browser, the speech prompt; and replaying, by a multimodal browser, the response.

Type: Grant

Filed: February 26, 2007

Date of Patent: September 21, 2010

Assignee: Nuance Communications, Inc.

Inventors: Shay Ben-David, Charles W. Cross, Jr., Marc T. White

Remote tracing and debugging of automatic speech recognition servers by speech reconstruction from cepstra and pitch information

Patent number: 7783488

Abstract: Methods and systems are provided for remote tuning and debugging of an automatic speech recognition system. Trace files are generated on-site from input speech by efficient, lossless compression of MFCC data, which is merged with compressed pitch and voicing information and stored as trace files. The trace files are transferred to a remote site where human-intelligible speech is reconstructed and analyzed. Based on the analysis, parameters of the automatic speech recognition system are remotely adjusted.

Type: Grant

Filed: December 19, 2005

Date of Patent: August 24, 2010

Assignee: Nuance Communications, Inc.

Inventors: Shay Ben-David, Baiju Dhirajlal Mandalia, Zohar Sivan, Alexander Sorin

Reaching a Communications Service Subscriber Who is Not Answering an Incoming Communications Request

Publication number: 20100048182

Abstract: A method for reaching a communications service subscriber who does not answer an incoming communications request to the subscriber's communications device, the method including detecting that an incoming communications request received at a first communications device is not answered, locating a second communications device within a predefined distance from the first communications device, and sending a message to the second communications device indicating the receipt of the incoming communications request at the first communications device.

Type: Application

Filed: August 25, 2008

Publication date: February 25, 2010

Inventors: Shay Ben-David, Itzhack Goldberg, Boaz Mizrachi

Automated processing of paper forms using remotely-stored form content

Patent number: 7532368

Abstract: A computer-implemented method for processing paper forms includes capturing at a computer system an image of a paper form in which information has been filled-in. A location identifier is extracted from the image. The location identifier indicates an address in a storage location external to the computer system, at which the filled-in information is electronically stored. The information is retrieved responsively to the location identifier by communication with the storage location via a wide area network (WAN), so as to convey the information electronically from the storage location to the computer system. The information is processed using a data processing application running on the computer system.

Type: Grant

Filed: October 18, 2006

Date of Patent: May 12, 2009

Assignee: International Business Machines Corporation

Inventors: Shay Ben-David, Amir Geva

Document Session Replay for Multimodal Applications

Publication number: 20080208587

Abstract: Methods, apparatus, and computer program products are described for document session replay for multimodal applications, including identifying, by a multimodal browser in dependence upon a log produced by a Form Interpretation Algorithm (‘FIA’) during a previous document session with a user, a speech prompt provided by a multimodal application in the previous document session; identifying, by a multimodal browser in replay mode in dependence upon the log, a response to the prompt provided by a user of the multimodal application in the previous document session; retrieving, by the multimodal browser in dependence upon the log, an X+V page of the multimodal application associated with the speech prompt and the response; rendering, by the multimodal browser, the visual elements of the retrieved X+V page; replaying, by the multimodal browser, the speech prompt; and replaying, by a multimodal browser, the response.

Type: Application

Filed: February 26, 2007

Publication date: August 28, 2008

Inventors: Shay Ben-David, Charles W. Cross, Marc T. White
