Patents Examined by Jesse S Pullias
  • Patent number: 11076052
    Abstract: Various disclosed implementations involve processing and/or playback of a recording of a conference involving a plurality of conference participants. Some implementations disclosed herein involve receiving audio data corresponding to a recording of at least one conference involving a plurality of conference participants. In some examples, only a portion of the received audio data will be selected as playback audio data. The selection process may involve a topic selection process, a talkspurt filtering process and/or an acoustic feature selection process. Some examples involve receiving an indication of a target playback time duration. Selecting the portion of audio data may involve making a time duration of the playback audio data within a threshold time difference of the target playback time duration.
    Type: Grant
    Filed: February 3, 2016
    Date of Patent: July 27, 2021
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Richard J. Cartwright, Xuejing Sun
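The duration-targeting step described in this abstract can be sketched as a greedy selection over scored talkspurts. Everything below (the `Talkspurt` type, the salience score, the `select_playback` helper) is a hypothetical illustration of the idea, not the patented method itself.

```python
# Illustrative sketch: greedily keep the highest-salience talkspurts until the
# total playback duration is within `threshold` seconds of the target.
from dataclasses import dataclass

@dataclass
class Talkspurt:
    start: float      # seconds into the recording
    duration: float   # seconds
    salience: float   # e.g. a topic / acoustic-feature score (assumed input)

def select_playback(spurts, target_duration, threshold):
    """Pick talkspurts, best-scoring first, stopping once the total
    duration is within `threshold` seconds of `target_duration`."""
    selected, total = [], 0.0
    for sp in sorted(spurts, key=lambda s: s.salience, reverse=True):
        if total + sp.duration > target_duration + threshold:
            continue  # adding this spurt would overshoot the allowed window
        selected.append(sp)
        total += sp.duration
        if abs(total - target_duration) <= threshold:
            break
    # play back the selection in original temporal order
    return sorted(selected, key=lambda s: s.start), total
```

A real implementation would draw the salience scores from the topic, talkspurt-filtering, and acoustic-feature processes the abstract names; here they are given.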
  • Patent number: 11074908
    Abstract: A method, computer program product, and computer system for identifying, by a computing device, at least one language model component of a plurality of language model components in at least one application associated with automatic speech recognition (ASR) and natural language understanding (NLU) usage. A contribution bias may be received for the at least one language model component. The ASR and NLU may be aligned between the plurality of language model components based upon, at least in part, the contribution bias.
    Type: Grant
    Filed: June 14, 2019
    Date of Patent: July 27, 2021
    Assignee: Nuance Communications, Inc.
    Inventors: Nathan Bodenstab, Matt Hohensee, Dermot Connolly, Kenneth Smith, Vittorio Manzone
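One plausible reading of "contribution bias" across language model components is a normalized linear interpolation of per-component scores; the `combine_scores` helper and its inputs below are illustrative assumptions, not the claimed alignment procedure.

```python
# Sketch: weight each language model component's score by its normalized
# contribution bias, then sum (a simple linear interpolation).
def combine_scores(component_scores, biases):
    """component_scores and biases map component name -> number."""
    total_bias = sum(biases.values())
    weights = {name: b / total_bias for name, b in biases.items()}
    return sum(weights[name] * score for name, score in component_scores.items())
```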
  • Patent number: 11069334
    Abstract: Embodiments are provided to recognize features and activities from an audio signal. In one embodiment, a model is generated from sound effect data, which is augmented and projected into an audio domain to form a training dataset efficiently. Sound effect data is data that has been artificially created, or derived from enhanced sounds or sound processes, to provide a more accurate baseline of sound data than traditional training data. The sound effect data is augmented to create multiple variants to broaden the sound effect data. The augmented sound effects are projected into various audio domains, such as indoor, outdoor, or urban, by mixing in background sounds consistent with those domains. The model is installed on any computing device, such as a laptop, smartphone, or other device. Features and activities from an audio signal are then recognized by the computing device based on the model without the need for in-situ training.
    Type: Grant
    Filed: August 13, 2019
    Date of Patent: July 20, 2021
    Assignee: Carnegie Mellon University
    Inventors: Gierad Laput, Karan Ahuja, Mayank Goel, Christopher Harrison
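The "projection into an audio domain" this abstract describes amounts to mixing a clean sound effect into a background recording at a chosen signal-to-noise ratio. The sketch below uses plain Python lists in place of real audio buffers; the function name and SNR-based formulation are assumptions for illustration.

```python
# Sketch: scale a background signal so the effect/background power ratio
# matches a target SNR (in dB), then mix the two sample by sample.
import math

def mix_at_snr(effect, background, snr_db):
    p_eff = sum(x * x for x in effect) / len(effect)        # mean power
    p_bg = sum(x * x for x in background) / len(background)
    target_p_bg = p_eff / (10 ** (snr_db / 10))             # power for target SNR
    gain = math.sqrt(target_p_bg / p_bg)
    return [e + gain * b for e, b in zip(effect, background)]
```

Repeating this across many backgrounds and SNR levels yields the domain-consistent variants the abstract mentions.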
  • Patent number: 11037576
    Abstract: A system determines if a call participant of a call between the call participant and a voice response system is a human or a machine. Responsive to determining that the call participant is a human, an emotional state of the call participant is determined. Environmental information of an environment associated with the call participant is received. A receptiveness level of the call participant is determined based upon the emotional state and the environmental information. A message to the call participant is determined based upon the receptiveness level and one or more machine-learning models.
    Type: Grant
    Filed: November 15, 2018
    Date of Patent: June 15, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Aaron K. Baughman, Mauro Marzorati, Gary Francis Diamanti, Sarbajit K. Rakshit
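A toy version of combining emotional state and environment into a receptiveness level, then choosing a message style from it, might look like the following. The labels, weights, and thresholds are entirely invented; the patent uses machine-learning models rather than a fixed table.

```python
# Hypothetical scoring rule: emotion sets a base score, ambient noise
# subtracts from it, and the result picks a message style.
EMOTION_SCORE = {"calm": 1.0, "neutral": 0.6, "frustrated": 0.2}
NOISE_PENALTY = {"quiet": 0.0, "moderate": 0.2, "loud": 0.5}

def receptiveness(emotion, noise_level):
    return max(0.0, EMOTION_SCORE[emotion] - NOISE_PENALTY[noise_level])

def choose_message(emotion, noise_level):
    """Pick a message style from the receptiveness level."""
    r = receptiveness(emotion, noise_level)
    if r >= 0.7:
        return "detailed"
    return "brief" if r >= 0.3 else "defer"
```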
  • Patent number: 11030255
    Abstract: A method generates data visualizations based on user selected data sources and user input that specifies natural language commands requesting information about the data sources. The computer determines one or more keywords from the natural language command and determines, based on the one or more keywords, a user intent to generate a new data visualization. The computer then generates a visual specification that specifies a plurality of visual variables. Each visual variable of the plurality of visual variables is generated based on the user intent. The computer then generates and displays a data visualization based on the visual specification.
    Type: Grant
    Filed: September 18, 2019
    Date of Patent: June 8, 2021
    Assignee: Tableau Software, LLC
    Inventors: Melanie K. Tory, Vidya Raghavan Setlur, Alex Djalali
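The keyword-to-visual-specification step could be sketched as below. The keyword table, spec fields, and default chart type are assumptions for illustration; Tableau's actual intent inference is far richer.

```python
# Sketch: infer a chart type from keywords in the command and bind the
# data fields it mentions to the x and y visual variables.
CHART_KEYWORDS = {"trend": "line", "over time": "line",
                  "compare": "bar", "correlation": "scatter"}

def build_visual_spec(command, fields):
    cmd = command.lower()
    chart = next((c for kw, c in CHART_KEYWORDS.items() if kw in cmd), "bar")
    mentioned = [f for f in fields if f.lower() in cmd]
    spec = {"chart": chart}
    for var, field in zip(("x", "y"), mentioned):
        spec[var] = field
    return spec
```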
  • Patent number: 11030418
    Abstract: A translation device is configured to acquire an utterance spoken by a speaker in a first language and translate contents of the utterance into a second language for information presentation, and includes an input unit, a controller, a notification unit, and a storage. The input unit acquires the utterance in the first language and generates voice data from the utterance. The controller acquires a first evaluation value. The notification unit presents the speaker with information on an utterance reinput request. The notification unit presents first information on the utterance reinput request when the first evaluation value is less than or equal to a first predetermined value. The controller generates new voice recognition data with reference to the past voice recognition data and voice recognition data of the reinput utterance, when the voice recognition data of the reinput utterance has an evaluation value less than or equal to a predetermined value.
    Type: Grant
    Filed: February 18, 2019
    Date of Patent: June 8, 2021
    Assignee: PANASONIC INTELLECTUAL PROPERTY MANAGEMENT CO., LTD.
    Inventors: Taketoshi Nakao, Ryo Ishida, Takahiro Kamai, Tetsuji Mochida, Mikio Morioka
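The reinput control flow this abstract describes can be sketched as: ask the speaker to repeat when the recognition evaluation value is low, and fall back to merging the two attempts when the repeat is also low-confidence. The function names and the merge strategy below are illustrative assumptions.

```python
# Sketch of the reinput loop. `recognize` returns (text, evaluation_value);
# `prompt_reinput` shows the "please repeat" notice; `merge` combines two
# low-confidence recognition results into new recognition data.
def recognize_with_reinput(recognize, prompt_reinput, merge, threshold):
    text, score = recognize()
    if score > threshold:
        return text
    prompt_reinput()                  # present the reinput request
    text2, score2 = recognize()
    if score2 > threshold:
        return text2
    return merge(text, text2)         # both passes weak: combine them
```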
  • Patent number: 11024293
    Abstract: Systems and methods are described for personifying communications. According to at least one embodiment, the computer-implemented method for personifying a natural-language communication includes observing a linguistic pattern of a user. The method may also include analyzing the linguistic pattern of the user and adapting the natural-language communication based at least in part on the analyzed linguistic pattern of the user. In some embodiments, observing the linguistic pattern of the user may include receiving data indicative of the linguistic pattern of the user. The data may be one of verbal data or written data. Written data may include at least one of a text message, email, social media post, or computer-readable note. Verbal data may include at least one of a recorded telephone conversation, voice command, or voice message.
    Type: Grant
    Filed: August 12, 2018
    Date of Patent: June 1, 2021
    Assignee: Vivint, Inc.
    Inventors: Jefferson Lyman, Nic Brunson, Wade Shearer, Mike Warner, Stefan Walger
  • Patent number: 11017782
    Abstract: A controller and method of classifying a user into one of a plurality of user classes. One or more voice samples are received from the user, from which a frequency spectrum is generated. One or more values defining respective features of the frequency spectrum are extracted from the frequency spectrum. Each of the respective features is defined by values of frequency, amplitude, and/or position in the spectrum. One or more of the respective features are resonant frequencies in the voice of the user. A user profile of the user is generated and comprises the extracted one or more values. The user profile is supplied to a machine learning algorithm that is trained to classify users as belonging to one of the plurality of user classes based on the one or more values in their respective user profile.
    Type: Grant
    Filed: November 14, 2018
    Date of Patent: May 25, 2021
    Assignee: XMOS Ltd.
    Inventors: Kevin Michael Short, Kourosh Zarringhalam
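The feature-extraction step (locating resonant frequencies as frequency/amplitude pairs) might be sketched as simple peak picking on a magnitude spectrum. The local-maximum-above-a-floor rule below is an assumption, not the patented extractor.

```python
# Sketch: report every local maximum above `floor` in a magnitude spectrum
# as a (frequency, amplitude) pair, usable as user-profile values.
def resonant_peaks(freqs, mags, floor):
    peaks = []
    for i in range(1, len(mags) - 1):
        if mags[i] > floor and mags[i] > mags[i - 1] and mags[i] > mags[i + 1]:
            peaks.append((freqs[i], mags[i]))
    return peaks
```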
  • Patent number: 11003859
    Abstract: A method, computer program product, and a system where a processor(s) obtains a requirement comprising a structure defined by textual content. The processor(s) identifies content relevant to predefined label(s) in the structure, based on applying a natural language classification algorithm to the structure; each predefined label indicates an atomic function. The processor(s) generates via the natural language classification algorithm, an array of values for each label; each value corresponds to a level of confidence the natural language classification algorithm correctly identified the atomic function indicated by each predefined label, in the structure. The processor(s) ranks the values in the array of values by confidence level, pairing values with labels. The processor(s) evaluates the pairs, utilizing a linear regression, to identify a portion of the pairs relevant to the requirement which are above a relevance threshold.
    Type: Grant
    Filed: November 30, 2018
    Date of Patent: May 11, 2021
    Assignee: International Business Machines Corporation
    Inventors: Gerhardt J. Scriven, Nikit Shah
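The ranking-and-thresholding step can be sketched as pairing each label with its confidence value, sorting, and keeping the pairs above the relevance threshold; the final linear-regression evaluation the abstract mentions is omitted here, and the helper name is illustrative.

```python
# Sketch: pair labels with confidences, rank by confidence, and keep
# only the pairs at or above the relevance threshold.
def rank_labels(labels, confidences, threshold):
    pairs = sorted(zip(labels, confidences), key=lambda p: p[1], reverse=True)
    return [(label, conf) for label, conf in pairs if conf >= threshold]
```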
  • Patent number: 10991372
    Abstract: Embodiments of the present disclosure are directed to a speech interaction method executed at an electronic device, a speech interaction apparatus, and a computer readable storage medium. The method includes receiving an image sequence of a user from an image capturing apparatus coupled to the electronic device. The method also includes detecting a change in a head feature of the user from the image sequence. After that, the method includes determining whether the change in the head feature matches a predetermined change pattern. The method further includes causing the electronic device to enter an active state in response to determining that the change in the head feature matches the predetermined change pattern, the electronic device in the active state being capable of responding to a speech command of the user.
    Type: Grant
    Filed: December 17, 2018
    Date of Patent: April 27, 2021
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Liang Gao, Jiliang Xie
  • Patent number: 10990760
    Abstract: Systems and methods are described for determining customer sentiment using natural language processing in technical support communications. Communication content exchanged between a customer device and an agent device may be filtered to remove technical support syntax. Using natural language processing techniques, the processor may assign baseline values to features within the filtered communication content. To assign the baseline values, features from the filtered communication content may be identified, where the features pertain to expressed sentiments, and a trained first model may be applied to identify polarities and strengths related to the identified features. A score value may then be assigned to each identified feature, the score values being based on the polarities and strengths. A subset of the score values may then be weighted based on metadata and/or context, and the score values may be combined using a second model to determine an overall sentiment of the filtered communication content.
    Type: Grant
    Filed: March 13, 2019
    Date of Patent: April 27, 2021
    Assignee: SupportLogic, Inc.
    Inventors: Charles C. Monnett, Lawrence Spracklen, Krishna Raj Raja
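A minimal sketch of the scoring pipeline: each detected sentiment feature gets polarity times strength, a subset is re-weighted by context, and the scores are combined. The weighted mean below stands in for the trained second model; feature names and weights are invented.

```python
# Sketch: `features` maps feature name -> (polarity in {-1, +1}, strength).
# `context_weights` re-weights a subset of features before combining.
def overall_sentiment(features, context_weights=None):
    context_weights = context_weights or {}
    scores, weights = [], []
    for name, (polarity, strength) in features.items():
        w = context_weights.get(name, 1.0)
        scores.append(polarity * strength * w)
        weights.append(w)
    return sum(scores) / sum(weights)
```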
  • Patent number: 10984816
    Abstract: A voice enhancement method and apparatus for a smart device, and a smart device, are disclosed. The method comprises: monitoring and collecting a voice signal sent by a user in real time; determining a direction of the user according to the voice signal; collecting a depth image in the direction of the user; determining a sound source direction of the user according to the depth image; and adjusting a beamforming direction of a microphone array on the smart device according to the sound source direction of the user, and performing enhancement processing on the voice signal.
    Type: Grant
    Filed: July 5, 2018
    Date of Patent: April 20, 2021
    Assignee: GOERTEK INC.
    Inventors: Jian Zhu, Xiangdong Zhang, Zhenyu Yu, Zhiping Luo, Dong Yan
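Steering a beamformer toward the determined sound source direction boils down to computing per-microphone delays for that direction. The sketch below assumes a uniform linear array and delay-and-sum beamforming; the array geometry and helper name are illustrative, not taken from the patent.

```python
# Sketch: per-microphone sample delays that align a plane wave arriving
# from `angle_deg` (measured from broadside) across a uniform linear array.
import math

SPEED_OF_SOUND = 343.0  # m/s

def steering_delays(n_mics, spacing, angle_deg, sample_rate):
    delays = []
    for m in range(n_mics):
        # extra path length for mic m is m * spacing * sin(angle)
        tau = m * spacing * math.sin(math.radians(angle_deg)) / SPEED_OF_SOUND
        delays.append(round(tau * sample_rate))
    return delays
```

Delaying each channel by its value and summing yields the enhanced signal in the steered direction.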
  • Patent number: 10984818
    Abstract: The disclosure relates to an apparatus for determining a quality score (MOS) for an audio signal sample, the apparatus comprising: an extractor configured to extract a feature vector from the audio signal sample, wherein the feature vector comprises a plurality of feature values and wherein each feature value is associated with a different feature of the feature vector; a pre-processor configured to pre-process a feature value of the feature vector based on a cumulative distribution function associated with the feature represented by the feature value to obtain a pre-processed feature value; and a processor configured to implement a neural network and to determine the quality score (MOS) for the audio signal sample based on the pre-processed feature value and a set of neural network parameters for the neural network associated with the cumulative distribution function.
    Type: Grant
    Filed: February 7, 2019
    Date of Patent: April 20, 2021
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Wei Xiao, Mona Hakami, Willem Bastiaan Kleijn
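The pre-processing idea can be sketched by mapping each raw feature value through the cumulative distribution function of that feature, so the network sees values approximately uniform on [0, 1]. Using the empirical CDF of training samples is an assumption here; the patent leaves the CDF's construction open.

```python
# Sketch: pass each feature value through its feature's empirical CDF.
from bisect import bisect_right

def empirical_cdf(sorted_samples, value):
    """Fraction of training samples <= value."""
    return bisect_right(sorted_samples, value) / len(sorted_samples)

def preprocess(feature_vector, training_samples_per_feature):
    return [empirical_cdf(sorted(samples), v)
            for v, samples in zip(feature_vector, training_samples_per_feature)]
```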
  • Patent number: 10978082
    Abstract: An audio processor for processing an audio signal to acquire a subband representation thereof includes a cascaded lapped critically sampled transform stage and a time domain aliasing reduction stage, the former being configured to perform a cascaded lapped critically sampled transform on at least two partially overlapping blocks of samples of the audio signal, to acquire a set of subband samples on the basis of a first block of samples of the audio signal, and to acquire a corresponding set of subband samples on the basis of a second block of samples of the audio signal. The latter is configured to perform a weighted combination of two corresponding sets of subband samples, which are acquired on the basis of the first and second blocks of samples of the audio signal, respectively, to acquire an aliasing reduced subband representation of the audio signal.
    Type: Grant
    Filed: January 18, 2019
    Date of Patent: April 13, 2021
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Nils Werner, Bernd Edler
  • Patent number: 10971147
    Abstract: In an aspect of the present disclosure, a method for providing an alternate modality of input for filling a form field in response to a failure of voice recognition is disclosed including prompting the user for information corresponding to a field of a form, generating speech data by capturing a spoken response of the user to the prompt using at least one input device, attempting to convert the speech data to text, determining that the attempted conversion has failed, evaluating the failure using at least one speech rule, selecting, based on the evaluation, an alternate input modality to be used for receiving the information corresponding to the field of the form, receiving the information corresponding to the field of the form from the alternate input modality, and injecting the received information into the field of the form.
    Type: Grant
    Filed: March 7, 2019
    Date of Patent: April 6, 2021
    Assignee: International Business Machines Corporation
    Inventors: Robert H. Grant, Trudy L. Hewitt, Mitchell J. Mason, Robert J. Moore, Kenneth A. Winburn
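Selecting an alternate input modality from the evaluated failure can be sketched as a preference table keyed by failure cause, filtered by what the device actually offers. The failure causes, modality names, and speech rules below are invented for illustration.

```python
# Sketch: map the evaluated speech-to-text failure cause to the first
# available fallback modality, defaulting to the keyboard.
def select_modality(failure, available):
    preferences = {
        "noisy_environment": ["keyboard", "touch"],
        "unrecognized_term": ["keyboard"],
        "low_volume": ["touch", "keyboard"],
    }
    for modality in preferences.get(failure, ["keyboard"]):
        if modality in available:
            return modality
    return None
```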
  • Patent number: 10963498
    Abstract: Methods and systems are provided for generating automatic program recommendations based on user interactions. In some embodiments, control circuitry processes verbal data received during an interaction between a user of a user device and a person with whom the user is interacting. The control circuitry analyzes the verbal data to automatically identify a media asset referred to during the interaction by at least one of the user and the person with whom the user is interacting. The control circuitry adds the identified media asset to a list of media assets associated with the user of the user device. The list of media assets is transmitted to a second user device of the user.
    Type: Grant
    Filed: November 6, 2018
    Date of Patent: March 30, 2021
    Assignee: Rovi Guides, Inc.
    Inventors: Brian Fife, Jason Braness, Michael Papish, Thomas Steven Woods
  • Patent number: 10964337
    Abstract: A method, a device and a storage medium for evaluating speech quality include: receiving speech data to be evaluated; extracting evaluation features of the speech data to be evaluated; performing quality evaluation on the speech data to be evaluated according to the evaluation features of the speech data to be evaluated and a predetermined speech quality evaluation model, in which the speech quality evaluation model is an indication of a relationship between evaluation features of single-ended speech data and quality information of the single-ended speech data.
    Type: Grant
    Filed: February 20, 2019
    Date of Patent: March 30, 2021
    Assignee: Iflytek Co., Ltd.
    Inventors: Bing Yin, Si Wei, Guoping Hu, Su Cheng
  • Patent number: 10964316
    Abstract: One non-limiting embodiment provides a method, including: receiving, from a user, user input comprising a trigger event; identifying, using at least one processor, active media content; and performing, based upon the trigger event, an action with respect to the active media content. This embodiment is intended to be non-limiting and other embodiments are contemplated, disclosed, and discussed.
    Type: Grant
    Filed: August 9, 2017
    Date of Patent: March 30, 2021
    Assignee: Lenovo (Singapore) Pte. Ltd.
    Inventors: Roderick Echols, Ryan Charles Knudson, Timothy Winthrop Kingsbury, Jonathan Gaither Knox
  • Patent number: 10956656
    Abstract: A system and method for automatically generating a narrative story receives data and information pertaining to a domain event. The received data and information and/or one or more derived features are then used to identify a plurality of angles for the narrative story. The plurality of angles is then filtered, for example through use of parameters that specify a focus for the narrative story, length of the narrative story, etc. Points associated with the filtered plurality of angles are then assembled and the narrative story is rendered using the filtered plurality of angles and the assembled points.
    Type: Grant
    Filed: November 20, 2019
    Date of Patent: March 23, 2021
    Assignee: NARRATIVE SCIENCE INC.
    Inventors: Lawrence A. Birnbaum, Kristian J. Hammond, Nicholas D. Allen, John R. Templon
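The angle-filtering stage might be sketched as keeping the candidate angles that match the requested focus, then trimming the set to fit the story length. The angle representation and fields below are invented for illustration.

```python
# Sketch: each angle is (name, focus_tag, point_count). Keep angles matching
# the requested focus, largest first, while staying within `max_points`.
def filter_angles(angles, focus, max_points):
    kept = [a for a in angles if a[1] == focus]
    kept.sort(key=lambda a: a[2], reverse=True)
    selected, total = [], 0
    for angle in kept:
        if total + angle[2] <= max_points:
            selected.append(angle)
            total += angle[2]
    return selected
```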
  • Patent number: 10943582
    Abstract: A method and apparatus for training an acoustic feature extracting model, a device, and a computer storage medium. The method comprises: considering a first acoustic feature extracted respectively from speech data corresponding to user identifiers as training data; training an initial model based on a deep neural network based on a criterion of a minimum classification error, until a preset first stop condition is reached; using a triplet loss layer to replace a Softmax layer in the initial model to constitute an acoustic feature extracting model, and continuing to train the acoustic feature extracting model until a preset second stop condition is reached, the acoustic feature extracting model being used to output a second acoustic feature of the speech data; wherein the triplet loss layer is used to maximize similarity between the second acoustic features of the same user, and minimize similarity between the second acoustic features of different users.
    Type: Grant
    Filed: May 14, 2018
    Date of Patent: March 9, 2021
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventors: Bing Jiang, Xiaokong Ma, Chao Li, Xiangang Li
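The triplet objective that replaces the softmax layer pushes an anchor embedding toward a same-speaker positive and away from a different-speaker negative by at least a margin. Below is a minimal plain-Python sketch of the standard triplet loss; the patent's exact similarity measure and margin are not specified here.

```python
# Sketch of the standard triplet loss on embedding vectors:
# max(0, d(anchor, positive) - d(anchor, negative) + margin),
# using squared Euclidean distance.
def triplet_loss(anchor, positive, negative, margin=0.2):
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return max(0.0, sqdist(anchor, positive) - sqdist(anchor, negative) + margin)
```

A well-separated triplet (positive close, negative far) drives the loss to zero; an unseparated one incurs at least the margin.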