Patents Examined by Thierry L Pham

Digital video synthesis

Patent number: 10170136

Abstract: A method which includes: detecting phrases in a transcript of an audiovisual file; applying a speech recognition algorithm to the audiovisual file and to a list of words of the phrase, to output a temporal location of each of the words that are uttered in the audio channel; compiling a list of sub-phrases of each of the phrases; creating a temporal sub-phrase map that comprises a temporal location of each of the sub-phrases; extracting the uttered sub-phrases from the audiovisual file, to create multiple sub-phrase audiovisual files; and constructing a database of the multiple sub-phrase audiovisual files and of the sub-phrase uttered in each of the files. The method may also include: receiving a phrase; querying the database for audiovisual files which comprise uttered sub-phrases of the phrase; and splicing at least some of the audiovisual files to a compilation audiovisual file in which the phrase is uttered.

Type: Grant

Filed: May 6, 2015

Date of Patent: January 1, 2019

Assignee: AL LEVY TECHNOLOGIES LTD.

Inventor: Alon Levi

Selecting alternates in speech recognition

Patent number: 10140978

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting alternates in speech recognition. In some implementations, data is received that indicates multiple speech recognition hypotheses for an utterance. Based on the multiple speech recognition hypotheses, multiple alternates for a particular portion of a transcription of the utterance are identified. For each of the identified alternates, one or more features scores are determined, the features scores are input to a trained classifier, and an output is received from the classifier. A subset of the identified alternates is selected, based on the classifier outputs, to provide for display. Data indicating the selected subset of the alternates is provided for display.

Type: Grant

Filed: September 13, 2017

Date of Patent: November 27, 2018

Assignee: Google LLC

Inventors: Alexander H. Gruenstein, Dave Harwath, Ian C. McGraw

Determining audio event based on location information

Patent number: 10134422

Abstract: A method of determining, by an electronic device, an audio event is disclosed. The method may include receiving an input sound from a sound source by a plurality of sound sensors. The method may also include extracting, by a processor, at least one sound feature from the received input sound, determining, by the processor, location information of the sound source based on the input sound received by the sound sensors, determining, by the processor, the audio event indicative of the input sound based on the at least one sound feature and the location information, and transmitting, by a communication unit, a notification of the audio event to an external electronic device.

Type: Grant

Filed: December 1, 2015

Date of Patent: November 20, 2018

Assignee: QUALCOMM Incorporated

Inventors: Kyu Woong Hwang, Yongwoo Cho, Jun-Cheol Cho, Sunkuk Moon

OCR through voice recognition

Patent number: 10133920

Abstract: One embodiment provides a method, including: receiving, at an input and display device, handwriting input; receiving, using a processor, voice input; generating, using a processor, at least one first word based on the handwriting input; generating, using a processor, at least one second word based on the voice input; and determining, using a processor, a highest probability word based on the at least one first word and the at least one second word. Other aspects are described and claimed.

Type: Grant

Filed: February 27, 2015

Date of Patent: November 20, 2018

Assignee: Lenovo (Singapore) Pte. Ltd.

Inventors: Antoine Roland Raux, Grigori Zaitsev, Russell Speight VanBlon, Jianbang Zhang

Printing apparatus, printing apparatus control method, and program

Patent number: 10136029

Abstract: A printing apparatus includes a receiving unit which receives print data, an operating unit which receives a print instruction from a user, a display unit which displays a password entry screen for receiving a password entry from a user, and a printing unit which receives a print instruction from a user through the operating unit and prints print data without accepting a password through a password entry screen if a password added to the print data is matched with a fixed password and print data to be printed if a print instruction from a user is received through the operating unit, if the password added to the print data is matched with the fixed password, and if the password received through a password entry screen is matched with the password added to the print data.

Type: Grant

Filed: September 15, 2017

Date of Patent: November 20, 2018

Assignee: CANON KABUSHIKI KAISHA

Inventor: Naoya Kakutani

Detection and reconstruction of East Asian layout features in a fixed format document

Patent number: 10127221

Abstract: Detection of East Asian layout features and reconstruction of East Asian layout features is provided. Vertically written text in the fixed format document is detected and rotated for layout analysis. After layout analysis, the rotated text is rotated back and restructured in a flow format document. When a plurality of characters is written horizontally in a vertical line of text, vertically overlapping text runs are detected, designated as horizontal-in-vertical text, and are restructured as horizontal-in-vertical text in a flow format document. Lines of text are analyzed for attributes of a ruby line and are designated as ruby text, associated with corresponding text in a ruby base line, and restructured as ruby text in a flow format document. Text in a fixed format document is analyzed for detection of a particular East Asian language so that a font for the language is designated in a flow format document.

Type: Grant

Filed: May 2, 2016

Date of Patent: November 13, 2018

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Drazen Zaric, Milan Sesum, Milos Lazarevic, Milos Raskovic

Optimizing machine translations for user engagement

Patent number: 10114819

Abstract: Exemplary embodiments relate to techniques for improving a machine translation system. The machine translation system may include one or more models for generating a translation. The system may generate multiple candidate translations, and may present the candidate translations to different groups of users, such as users of a social network. User engagement with the different candidate translations may be measured, and the system may determine which of the candidate translations was most favored by the users. For example, in the context of a social network, the number of times that the translation is liked or shared, or the number of comments associated with the translation, may be used to determine user engagement with the translation. The models of the machine translation system may be modified to favor the most-favored candidate translation. The translation system may repeat this process to continue to tune the models in a feedback loop.

Type: Grant

Filed: June 24, 2016

Date of Patent: October 30, 2018

Assignee: FACEBOOK, INC.

Inventors: Ying Zhang, Fei Huang, Kay Rottmann, Necip Fazil Ayan

Blind diarization of recorded calls with arbitrary number of speakers

Patent number: 10109280

Abstract: In a method of diarization of audio data, audio data is segmented into a plurality of utterances. Each utterance is represented as an utterance model representative of a plurality of feature vectors. The utterance models are clustered. A plurality of speaker models are constructed from the clustered utterance models. A hidden Markov model is constructed of the plurality of speaker models. A sequence of identified speaker models is decoded.

Type: Grant

Filed: December 12, 2017

Date of Patent: October 23, 2018

Assignee: VERINT SYSTEMS LTD.

Inventors: Oana Sidi, Ron Wein

Multi-media context language processing

Patent number: 10089299

Abstract: Technology is disclosed that improves language processing engines by using multi-media (image, video, etc.) context data when training and applying language models. Multi-media context data can be obtained from one or more sources such as object/location/person identification in the multi-media, multi-media characteristics, labels or characteristics provided by an author of the multi-media, or information about the author of the multi-media. This context data can be used as additional input for a machine learning process that creates a model used in language processing. The resulting model can be used as part of various language processing engines such as a translation engine, correction engine, tagging engine, etc., by taking multi-media context/labeling for a content item as part of the input for computing results of the model.

Type: Grant

Filed: July 17, 2017

Date of Patent: October 2, 2018

Assignee: Facebook, Inc.

Inventors: Kay Rottmann, Mirjam Maess

System, method and computer program product for creating a summarization from recorded audio of meetings

Patent number: 10089290

Abstract: A meeting summarization method, system, and computer program product, include recording meeting audio of a meeting, capturing notes including a time stamp from each of a plurality of users associated with the meeting, synchronizing the recorded meeting audio of the meeting and each of the notes of each of the plurality of users based on a correlation between the time stamp, and analyzing the synchronized meeting audio and notes to determine highlights of the meeting based on a co-occurrence of notes between the plurality of users.

Type: Grant

Filed: October 17, 2017

Date of Patent: October 2, 2018

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Keith William Grueneberg, Jason Crawford, Jonathan Lenchner, Satya V. Nitta, Christian Makaya, Sharad C. Sundararajan

Social networking with assistive technology device

Patent number: 10083684

Abstract: An approach is provided that assists visually impaired users. The approach analyzes a document that is being utilized by the visually impaired user. The analysis derives a sensitivity of the document. A vocal characteristic corresponding to the derived sensitivity is retrieved. Text from the document is audibly read to the visually impaired user with a text to speech process that utilizes the retrieved vocal characteristic. The retrieved vocal characteristic conveys the derived sensitivity of the document to the visually impaired user.

Type: Grant

Filed: August 22, 2016

Date of Patent: September 25, 2018

Assignee: International Business Machines Corporation

Inventors: Maureen E. Kraft, Fang Lu, Azadeh Salehi, Weisong Wang

Voice command input device and voice command input method

Patent number: 10074367

Abstract: A voice command input device includes a first voice input unit, a second voice input unit, and a voice command identifier. The first voice input unit converts a voice into first voice command information, and outputs first identification information and the first voice command information. The second voice input unit converts a voice into second voice command information, and outputs second identification information and the second voice command information. The voice command identifier refers to the first identification information and the second identification information, and generates a control signal for controlling an operation target appliance based on the result of referring, the first voice command information, and the second voice command information.

Type: Grant

Filed: March 26, 2015

Date of Patent: September 11, 2018

Assignee: Panasonic Intellectual Property Management Co., Ltd.

Inventor: Keiichi Toiyama

Programmatic selection of service provider

Patent number: 10068267

Abstract: Systems, methods, and computer-readable media are disclosed for determining a recommendation for a service provider to perform a service on a creative work based on metadata associated with the creative work and attribute information associated with the service provider. Systems, methods, and computer-readable media are also disclosed for determining a recommendation for a creative work requiring a service to be performed based on metadata associated with the creative work and attribute information associated with the service provider.

Type: Grant

Filed: September 4, 2015

Date of Patent: September 4, 2018

Assignee: Amazon Technologies, Inc.

Inventors: Tyler Matthew Schloesser, Vamsi Moudgalya Koppunur, Chandra Shekar Neti, Branon Jeffrey Lyle, Yaodong Liu, Anita Anil Borkar

Customizing actions based on contextual data and voice-based inputs

Patent number: 10062383

Abstract: Methods and systems are provided for customizing an action. In some implementations, voice input is received from a user and a context is determined from the voice input. Potential contextual data is identified based on the context and the voice input. A level of confidence is determined for an association of the potential contextual data and the context. An action is performed based on the voice input, the potential contextual data, and the level of confidence. The potential contextual data is used to customize the action.

Type: Grant

Filed: November 20, 2017

Date of Patent: August 28, 2018

Assignee: Google LLC

Inventors: Zoltan Stekkelpak, Gyula Simonyi

Web conference system providing multi-language support

Patent number: 10042847

Abstract: A method, system and computer program product for enabling attendees of a web conference to view materials of the web conference in their native language. When the conference server determines that the preferred native language of the attendee differs from the preferred native language of the presenter of the web conference, the conference server creates a virtual environment that is a clone of a host environment of the presenter that runs a native language pack of the preferred native language of the attendee. Upon the presenter starting the web conference, the screen shot shared by the presenter to the attendees is captured from the host environment of the presenter and then translated into the preferred native language of the attendee using the native language pack of the attendee's virtual environment. The translated screen shot is then sent to the attendee in the attendee's preferred native language from the virtual environment.

Type: Grant

Filed: August 25, 2017

Date of Patent: August 7, 2018

Assignee: International Business Machines Corporation

Inventors: Qi En Jiang, Joey H. Y. Tseng, Di Wu, Xi Bo Zhu, Dong Jun Zong

Crowdsourcing translations

Patent number: 10025780

Abstract: In one embodiment, a method includes accessing, by one or more of the computing devices, one or more translations for each text string of a plurality of text strings; determining, by one or more of the computing devices, a priority value for each text string of the plurality of text strings, wherein the priority value for the text string is based on one or more reliability-values of the one or more translations for the text string; selecting, by one or more of the computing devices, a particular text string from the plurality of text strings based on its priority value; and sending, to a client system, instructions configured to present a translation prompt comprising the particular text string.

Type: Grant

Filed: October 5, 2017

Date of Patent: July 17, 2018

Assignee: Facebook, Inc.

Inventor: Luis Francisco Sarmenta

Acoustic channel-based data communications method

Patent number: 10020896

Abstract: It discloses an acoustic channel-based data communications method which performs channel coding on an original data signal using a CRC coding method and a BCH coding method to obtain a coded sequence; modulates the coded sequence using a preset audio sequence symbol set via a symbol mapping method to obtain a digital audio signal; selects a channel frequency band according to characteristics of a transmitting equipment and interference between frequency bands; and converts the digital audio signal into an analog audio signal through a digital-to-analog converter and transmits the signal to a channel for transmission according to the selected channel frequency band.

Type: Grant

Filed: January 15, 2018

Date of Patent: July 10, 2018

Assignee: SUZHOU REALPOWER ELECTRIC APPLIANCE CO., LTD

Inventor: Jinghong Chen

Orienting a microphone array to a user location

Patent number: 10019996

Abstract: For orienting a microphone array to a user location, a processor detects a user location with a presence sensor that detects a user using electromagnetic signals. In addition, the processor orients a microphone array to the user location.

Type: Grant

Filed: August 29, 2016

Date of Patent: July 10, 2018

Assignee: LENOVO (Singapore) PTE. LTD.

Inventors: Song Wang, John Weldon Nicholson, Ming Qian

Automated processing of transcripts, transcript designations, and/or video clip load files

Patent number: 10013407

Abstract: In an aspect, a computerized method for generating processed files of deposition testimony transcript designations may include accessing a file containing designations of contents of a textual transcript, quarantining errors in the designations, and generating a processed file containing processed designations of contents of the textual transcript having quarantined errors removed therefrom. In another aspect, a computerized method of generating designations for a deposition testimony transcript may include accessing designation information regarding designations made with respect to text of the deposition testimony transcript, accessing rules for generating designations based on the designation information, and generating the designations based on the rules.

Type: Grant

Filed: August 21, 2015

Date of Patent: July 3, 2018

Assignee: Designation Station, LLC

Inventor: Christopher John Grimm

Computer-implemented systems and methods for speaker recognition using a neural network

Patent number: 10008209

Abstract: Systems and methods are provided for providing voice authentication of a candidate speaker. Training data sets are accessed, where each training data set comprises data associated with a training speech sample of a speaker and a plurality of speaker metrics, where the plurality of speaker metrics include a native language of the speaker. The training data sets are used to train a neural network, where the data associated with each training speech sample is a training input to the neural network, and each of the plurality of speaker metrics is a training output to the neural network. Data associated with a speech sample is provided to the neural network to generate a vector that contains values for the plurality of speaker metrics, and the values contained in the vector are compared to values contained in a reference vector associated with a known person to determine whether the candidate speaker is the known person.

Type: Grant

Filed: September 23, 2016

Date of Patent: June 26, 2018

Assignee: Educational Testing Service

Inventors: Yao Qian, Jidong Tao, David Suendermann-Oeft, Keelan Evanini, Alexei V. Ivanov, Vikram Ramanarayanan
