Patents by Inventor Shay Ben-David
Shay Ben-David has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 9911410
Abstract: A method, computer program product, and system for adapting speech recognition of a user's speech is provided. The method includes receiving a first utterance from a user having a duration below a predetermined threshold, identifying at least one further utterance from the user that provides additional information, generating a concatenated utterance by concatenating the first utterance with the at least one further utterance, transmitting the concatenated utterance to a speech recognition server, receiving a transcription of the concatenated utterance from the speech recognition server that includes a transcription of the first utterance, and extracting the transcription of the first utterance from the transcription of the concatenated utterance. The transcription of the first utterance is based on the additional information provided by the at least one further utterance.
Type: Grant
Filed: August 19, 2015
Date of Patent: March 6, 2018
Assignee: International Business Machines Corporation
Inventor: Shay Ben-David
-
Patent number: 9837080
Abstract: Systems and methods for maintaining speaker recognition performance are provided. A method for maintaining speaker recognition performance, comprises training a plurality of models respectively corresponding to speaker recognition scores from a plurality of speakers over a plurality of sessions, and using the plurality of models to conclude whether a speaker seeking access to an environment is a non-ideal target speaker or a non-ideal non-target speaker. Using the plurality of models to conclude comprises calculating a first probability that the speaker seeking access is the non-ideal target speaker, calculating a second probability that the speaker seeking access is the non-ideal non-target speaker, and determining whether the first probability, the second probability or a sum of the first probability and the second probability is above a probability threshold.
Type: Grant
Filed: August 21, 2014
Date of Patent: December 5, 2017
Assignee: International Business Machines Corporation
Inventors: Hagai Aronowitz, Shay Ben-David, David Nahamoo, Jason W. Pelecanos, Orith Toledo-Ronen
-
Publication number: 20170053643
Abstract: A method, computer program product, and system for adapting speech recognition of a user's speech is provided. The method includes receiving a first utterance from a user having a duration below a predetermined threshold, identifying at least one further utterance from the user that provides additional information, generating a concatenated utterance by concatenating the first utterance with the at least one further utterance, transmitting the concatenated utterance to a speech recognition server, receiving a transcription of the concatenated utterance from the speech recognition server that includes a transcription of the first utterance, and extracting the transcription of the first utterance from the transcription of the concatenated utterance. The transcription of the first utterance is based on the additional information provided by the at least one further utterance.
Type: Application
Filed: August 19, 2015
Publication date: February 23, 2017
Inventor: Shay Ben-David
-
Patent number: 9535450
Abstract: Synchronizing a data stream with an associated metadata stream by receiving a data stream and a metadata stream having a plurality of metadata events associated with the data stream, identifying within the data stream a plurality of data events, matching each of the data events to one of the metadata events in accordance with a matching criterion, and synchronizing the data stream with the metadata stream by effecting a relative time shift between the metadata stream and the data stream in accordance with a time shift adjustment value that results in the smallest sum of absolute differences between time indices of each matched data event and metadata event.
Type: Grant
Filed: July 17, 2011
Date of Patent: January 3, 2017
Assignee: International Business Machines Corporation
Inventors: Shay Ben-David, Evgeny Hazanovich, Zak Mandel
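The time-shift criterion in this abstract can be illustrated with a small sketch. It is not taken from the patent: the function names and the sample event times are hypothetical, and it assumes the events have already been matched into (data time, metadata time) pairs. It relies on the standard fact that the median of the per-pair offsets minimizes a sum of absolute differences.

```python
from statistics import median


def best_time_shift(matched_pairs):
    """Return the shift s to add to metadata time indices that minimizes
    sum(|data_t - (meta_t + s)|) over already-matched event pairs.

    matched_pairs is a hypothetical input: a list of (data_t, meta_t) tuples.
    """
    if not matched_pairs:
        raise ValueError("need at least one matched pair")
    offsets = [data_t - meta_t for data_t, meta_t in matched_pairs]
    # The median of the offsets minimizes the sum of absolute deviations.
    return median(offsets)


def sum_abs_diff(matched_pairs, shift):
    """Sum of absolute time differences after shifting the metadata stream."""
    return sum(abs(d - (m + shift)) for d, m in matched_pairs)


# Illustrative event times in seconds (made up for this example).
pairs = [(10.0, 8.0), (20.5, 18.0), (31.0, 29.0)]
shift = best_time_shift(pairs)
```

With these sample pairs, `sum_abs_diff(pairs, shift)` can be compared against other candidate shifts to check that the median does no worse.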
-
Publication number: 20160055844
Abstract: Systems and methods for maintaining speaker recognition performance are provided. A method for maintaining speaker recognition performance, comprises training a plurality of models respectively corresponding to speaker recognition scores from a plurality of speakers over a plurality of sessions, and using the plurality of models to conclude whether a speaker seeking access to an environment is a non-ideal target speaker or a non-ideal non-target speaker. Using the plurality of models to conclude comprises calculating a first probability that the speaker seeking access is the non-ideal target speaker, calculating a second probability that the speaker seeking access is the non-ideal non-target speaker, and determining whether the first probability, the second probability or a sum of the first probability and the second probability is above a probability threshold.
Type: Application
Filed: August 21, 2014
Publication date: February 25, 2016
Inventors: Hagai Aronowitz, Shay Ben-David, David Nahamoo, Jason W. Pelecanos, Orith Toledo-Ronen
-
Patent number: 9208785
Abstract: Methods, apparatus, and computer program products are disclosed for synchronizing distributed speech recognition (‘DSR’) that include receiving in a DSR client notification from a voice server of readiness to conduct speech recognition and, responsive to the receiving, transmitting by the DSR client, from the DSR client to the voice server, speech for recognition.
Type: Grant
Filed: May 10, 2006
Date of Patent: December 8, 2015
Assignee: Nuance Communications, Inc.
Inventors: Shay Ben-David, Charles W. Cross, Jr.
-
Patent number: 8930182
Abstract: Method, system, and computer program product for voice transformation are provided. The method includes transforming a source speech using transformation parameters, and encoding information on the transformation parameters in an output speech using steganography, wherein the source speech can be reconstructed using the output speech and the information on the transformation parameters. A method for reconstructing voice transformation is also provided including: receiving an output speech of a voice transformation system wherein the output speech is transformed speech which has encoded information on the transformation parameters using steganography; extracting the information on the transformation parameters; and carrying out an inverse transformation of the output speech to obtain an approximation of an original source speech.
Type: Grant
Filed: March 17, 2011
Date of Patent: January 6, 2015
Assignee: International Business Machines Corporation
Inventors: Shay Ben-David, Ron Hoory, Zvi Kons, David Nahamoo
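The abstract does not say which steganographic scheme is used. As a generic illustration only, the sketch below hides a small parameter payload in the least significant bits of integer PCM samples and reads it back. Least-significant-bit embedding is a stand-in chosen for simplicity, and the function names and sample values are hypothetical.

```python
def embed_bits(samples, payload):
    """Hide payload bytes in the least significant bit of integer PCM samples.

    Generic LSB steganography, used here only as an illustrative stand-in.
    """
    bits = [(byte >> i) & 1 for byte in payload for i in range(8)]
    if len(bits) > len(samples):
        raise ValueError("payload too large for the given audio")
    out = list(samples)
    for idx, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the payload bit.
        out[idx] = (out[idx] & ~1) | bit
    return out


def extract_bytes(samples, n_bytes):
    """Recover n_bytes previously hidden by embed_bits."""
    recovered = bytearray()
    for b in range(n_bytes):
        byte = 0
        for i in range(8):
            byte |= (samples[b * 8 + i] & 1) << i
        recovered.append(byte)
    return bytes(recovered)


# Made-up samples and a two-byte "transformation parameter" payload.
speech = list(range(-8, 8))
params = b"\x05\x80"
stego = embed_bits(speech, params)
```

Each sample changes by at most one quantization step, which is why this kind of embedding is hard to hear. A receiver that knows the payload length can call `extract_bytes(stego, 2)` to recover the parameters before attempting any inverse transformation.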
-
Patent number: 8856396
Abstract: Systems and methods for detecting end of a transaction in a computing environment are provided. The method comprises determining a target area in a graphical user environment displayed on a display screen, wherein a change is expected to occur when end of a transaction is reached; masking the target area at least partially to remove content included in the target area that is present before or after the transaction was initiated; monitoring the target area for change in content; and detecting the end of the transaction when the content of the target area has changed.
Type: Grant
Filed: October 20, 2013
Date of Patent: October 7, 2014
Assignee: International Business Machines Corporation
Inventors: Ella Barkan, Shay Ben-David, Amir Geva
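A minimal sketch of the mask-and-monitor idea follows. It assumes that screen frames are available as 2D lists of pixel values, that the target area is a rectangle, and that the mask is a set of pixel coordinates to ignore. All of these representations, and the function names, are hypothetical and not taken from the patent.

```python
def masked_region(frame, region, mask):
    """Return the target-area pixels, with masked cells replaced by None.

    frame:  2D list of pixel values (hypothetical representation)
    region: (top, left, bottom, right), bottom/right exclusive
    mask:   set of (row, col) positions to ignore inside the region
    """
    top, left, bottom, right = region
    return [
        [None if (r, c) in mask else frame[r][c] for c in range(left, right)]
        for r in range(top, bottom)
    ]


def transaction_ended(baseline, current, region, mask):
    """Report a change only when unmasked target-area content differs."""
    return masked_region(baseline, region, mask) != masked_region(current, region, mask)


# Made-up 3x3 frames: the target area is the top-left 2x2 block,
# and the pixel at (0, 0) is masked out (for example, a blinking cursor).
baseline = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
region = (0, 0, 2, 2)
mask = {(0, 0)}

masked_change = [[9, 0, 0], [0, 0, 0], [0, 0, 0]]
real_change = [[0, 0, 0], [0, 7, 0], [0, 0, 0]]
outside_change = [[0, 0, 0], [0, 0, 0], [0, 0, 5]]
```

Here only `real_change` differs from `baseline` in an unmasked pixel of the target area, so it is the only frame that would be treated as the end of the transaction.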
-
Publication number: 20140101660
Abstract: Systems and methods for detecting end of a transaction in a computing environment are provided. The method comprises determining a target area in a graphical user environment displayed on a display screen, wherein a change is expected to occur when end of a transaction is reached; masking the target area at least partially to remove content included in the target area that is present before or after the transaction was initiated; monitoring the target area for change in content; and detecting the end of the transaction when the content of the target area has changed.
Type: Application
Filed: October 20, 2013
Publication date: April 10, 2014
Applicant: International Business Machines Corporation
Inventors: Ella Barkan, Shay Ben-David, Amir Geva, Mattias Marder
-
Patent number: 8601172
Abstract: Systems and methods for detecting end of a transaction in a computing environment are provided. The method comprises determining a target area in a graphical user environment displayed on a display screen, wherein a change is expected to occur when end of a transaction is reached; masking the target area at least partially to remove content included in the target area that is present before or after the transaction was initiated; monitoring the target area for change in content; and detecting the end of the transaction when the content of the target area has changed.
Type: Grant
Filed: May 19, 2011
Date of Patent: December 3, 2013
Assignee: International Business Machines Corporation
Inventors: Ella Barkan, Shay Ben-David, Amir Geva, Mattias Erwin Marder
-
Patent number: 8451823
Abstract: A voice processing system includes a real-time voice server, which is arranged to process real-time voice processing tasks for clients of the system. A gateway processor is arranged to accept from a client a request to perform an off-line voice processing task, to convert the off-line voice processing task into an equivalent real-time voice processing task, to invoke the voice server to process the equivalent real-time voice processing task, and to output a result of the equivalent real-time voice processing task.
Type: Grant
Filed: December 13, 2005
Date of Patent: May 28, 2013
Assignee: Nuance Communications, Inc.
Inventors: Shay Ben-David, Ron Hoory, Alexey Roytman, Zohar Sivan, James Jude Sliwa
-
Publication number: 20130019121
Abstract: Synchronizing a data stream with an associated metadata stream by receiving a data stream and a metadata stream having a plurality of metadata events associated with the data stream, identifying within the data stream a plurality of data events, matching each of the data events to one of the metadata events in accordance with a matching criterion, and synchronizing the data stream with the metadata stream by effecting a relative time shift between the metadata stream and the data stream in accordance with a time shift adjustment value that results in the smallest sum of absolute differences between time indices of each matched data event and metadata event.
Type: Application
Filed: July 17, 2011
Publication date: January 17, 2013
Applicant: International Business Machines Corporation
Inventors: Shay Ben-David, Evgeny Hazanovich, Zak Mandel
-
Publication number: 20120296856
Abstract: Systems and methods for detecting end of a transaction in a computing environment are provided. The method comprises determining a target area in a graphical user environment displayed on a display screen, wherein a change is expected to occur when end of a transaction is reached; masking the target area at least partially to remove content included in the target area that is present before or after the transaction was initiated; monitoring the target area for change in content; and detecting the end of the transaction when the content of the target area has changed.
Type: Application
Filed: May 19, 2011
Publication date: November 22, 2012
Applicant: International Business Machines Corporation
Inventors: Ella Barkan, Shay Ben-David, Amir Geva, Mattias Marder
-
Publication number: 20120239387
Abstract: Method, system, and computer program product for voice transformation are provided. The method includes transforming a source speech using transformation parameters, and encoding information on the transformation parameters in an output speech using steganography, wherein the source speech can be reconstructed using the output speech and the information on the transformation parameters. A method for reconstructing voice transformation is also provided including: receiving an output speech of a voice transformation system wherein the output speech is transformed speech which has encoded information on the transformation parameters using steganography; extracting the information on the transformation parameters; and carrying out an inverse transformation of the output speech to obtain an approximation of an original source speech.
Type: Application
Filed: March 17, 2011
Publication date: September 20, 2012
Applicant: International Business Corporation
Inventors: Shay Ben-David, Ron Hoory, Zvi Kons, David Nahamoo
-
Publication number: 20110313762
Abstract: A method, system, and computer program product are provided for speech output with confidence indication. The method includes receiving a confidence score for segments of speech or text to be synthesized to speech. The method includes modifying a speech segment by altering one or more parameters of the speech proportionally to the confidence score.
Type: Application
Filed: June 20, 2010
Publication date: December 22, 2011
Applicant: International Business Machines Corporation
Inventors: Shay Ben-David, Ron Hoory
-
Patent number: 7801728
Abstract: Methods, apparatus, and computer program products are described for document session replay for multimodal applications, including identifying, by a multimodal browser in dependence upon a log produced by a Form Interpretation Algorithm (‘FIA’) during a previous document session with a user, a speech prompt provided by a multimodal application in the previous document session; identifying, by a multimodal browser in replay mode in dependence upon the log, a response to the prompt provided by a user of the multimodal application in the previous document session; retrieving, by the multimodal browser in dependence upon the log, an X+V page of the multimodal application associated with the speech prompt and the response; rendering, by the multimodal browser, the visual elements of the retrieved X+V page; replaying, by the multimodal browser, the speech prompt; and replaying, by a multimodal browser, the response.
Type: Grant
Filed: February 26, 2007
Date of Patent: September 21, 2010
Assignee: Nuance Communications, Inc.
Inventors: Shay Ben-David, Charles W. Cross, Jr., Marc T. White
-
Patent number: 7783488
Abstract: Methods and systems are provided for remote tuning and debugging of an automatic speech recognition system. Trace files are generated on-site from input speech by efficient, lossless compression of MFCC data, which is merged with compressed pitch and voicing information and stored as trace files. The trace files are transferred to a remote site where human-intelligible speech is reconstructed and analyzed. Based on the analysis, parameters of the automatic speech recognition system are remotely adjusted.
Type: Grant
Filed: December 19, 2005
Date of Patent: August 24, 2010
Assignee: Nuance Communications, Inc.
Inventors: Shay Ben-David, Baiju Dhirajlal Mandalia, Zohar Sivan, Alexander Sorin
-
Reaching a Communications Service Subscriber Who is Not Answering an Incoming Communications Request
Publication number: 20100048182
Abstract: A method for reaching a communications service subscriber who does not answer an incoming communications request to the subscriber's communications device, the method including detecting that an incoming communications request received at a first communications device is not answered, locating a second communications device within a predefined distance from the first communications device, and sending a message to the second communications device indicating the receipt of the incoming communications request at the first communications device.
Type: Application
Filed: August 25, 2008
Publication date: February 25, 2010
Inventors: Shay Ben-David, Itzhack Goldberg, Boaz Mizrachi
-
Patent number: 7532368
Abstract: A computer-implemented method for processing paper forms includes capturing at a computer system an image of a paper form in which information has been filled-in. A location identifier is extracted from the image. The location identifier indicates an address in a storage location external to the computer system, at which the filled-in information is electronically stored. The information is retrieved responsively to the location identifier by communication with the storage location via a wide area network (WAN), so as to convey the information electronically from the storage location to the computer system. The information is processed using a data processing application running on the computer system.
Type: Grant
Filed: October 18, 2006
Date of Patent: May 12, 2009
Assignee: International Business Machines Corporation
Inventors: Shay Ben-David, Amir Geva
-
Publication number: 20080208587
Abstract: Methods, apparatus, and computer program products are described for document session replay for multimodal applications, including identifying, by a multimodal browser in dependence upon a log produced by a Form Interpretation Algorithm (‘FIA’) during a previous document session with a user, a speech prompt provided by a multimodal application in the previous document session; identifying, by a multimodal browser in replay mode in dependence upon the log, a response to the prompt provided by a user of the multimodal application in the previous document session; retrieving, by the multimodal browser in dependence upon the log, an X+V page of the multimodal application associated with the speech prompt and the response; rendering, by the multimodal browser, the visual elements of the retrieved X+V page; replaying, by the multimodal browser, the speech prompt; and replaying, by a multimodal browser, the response.
Type: Application
Filed: February 26, 2007
Publication date: August 28, 2008
Inventors: Shay Ben-David, Charles W. Cross, Marc T. White