Neural Network Patents (Class 704/232)
  • Patent number: 12034555
    Abstract: Systems and methods for facilitating a watch party are provided. In one example, a method includes: initiating a watch party session for a host user; presenting content selected by the host user on a first user device during the watch party session; initiating a chat session concurrent with the watch party session; receiving a participation request by a guest user sent from a second user device for participating in the chat session; in response to the participation request, authenticating the guest user; presenting the content selected by the host user on the second user device; synchronizing the presentation of the content on the first user device with the presentation on the second user device; and facilitating communication between the host user and the guest user during the chat session.
    Type: Grant
    Filed: May 10, 2023
    Date of Patent: July 9, 2024
    Assignee: DISH Network Technologies India Private Limited
    Inventors: Melvin P. Perinchery, Preetham Kumar
  • Patent number: 12033649
    Abstract: Embodiments are disclosed for noise floor estimation and noise reduction. In an embodiment, a method comprises: obtaining an audio signal; dividing the audio signal into a plurality of buffers; determining time-frequency samples for each buffer of the audio signal; for each buffer and for each frequency, determining a median (or mean) and a measure of an amount of variation of energy based on the samples in the buffer and samples in neighboring buffers that together span a specified time range of the audio signal; combining the median (or mean) and the measure of the amount of variation of energy into a cost function; for each frequency: determining a signal energy of a particular buffer of the audio signal that corresponds to a minimum value of the cost function; selecting the signal energy as the estimated noise floor of the audio signal; and reducing, using the estimated noise floor, noise in the audio signal.
    Type: Grant
    Filed: January 18, 2021
    Date of Patent: July 9, 2024
    Assignee: DOLBY INTERNATIONAL AB
    Inventors: Giulio Cengarle, Antonio Mateos Sole, Davide Scaini
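As a rough illustration of the cost-function search this abstract describes (not Dolby's implementation; the neighborhood span and the weight on the variation term are arbitrary assumptions), a per-frequency noise-floor estimate might look like:

```python
import numpy as np

def estimate_noise_floor(energies, neighborhood=2, alpha=1.0):
    """Toy per-frequency noise-floor estimate: for each buffer, combine the
    median energy and a variation measure over neighboring buffers into a
    cost; the buffer with the minimum cost supplies the noise floor."""
    n_buffers = len(energies)
    costs = np.empty(n_buffers)
    for b in range(n_buffers):
        lo, hi = max(0, b - neighborhood), min(n_buffers, b + neighborhood + 1)
        window = energies[lo:hi]
        med = np.median(window)          # central tendency of energy
        spread = np.std(window)          # measure of amount of variation
        costs[b] = med + alpha * spread  # cost function combining both
    best = int(np.argmin(costs))
    return energies[best]                # selected signal energy = noise floor

# Example: per-buffer energies at one frequency bin; the low, steady
# region (not merely the single lowest sample) wins the search.
energies = np.array([5.0, 4.8, 1.0, 1.1, 0.9, 6.0, 5.5])
floor = estimate_noise_floor(energies)
```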
  • Patent number: 12019641
    Abstract: Systems and techniques are provided for processing one or more data samples. For example, a neural network classifier can be trained to perform few-shot open-set recognition (FSOSR) based on a task-agnostic open-set prototype. A process can include determining one or more prototype representations for each class included in a plurality of support samples. A task-agnostic open-set prototype representation can be determined, in a same learned metric space as the one or more prototype representations. One or more distance metrics can be determined for each query sample of one or more query samples, based on the one or more prototype representations and the task-agnostic open-set prototype representation. Based on the one or more distance metrics, each query sample can be classified into one of classes associated with the one or more prototype representations or an open-set class associated with the task-agnostic open-set prototype representation.
    Type: Grant
    Filed: January 12, 2023
    Date of Patent: June 25, 2024
    Assignee: QUALCOMM Incorporated
    Inventors: Byeonggeun Kim, Juntae Lee, Simyung Chang
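A toy version of the prototype-plus-open-set classification step (hypothetical: the patent learns its task-agnostic open-set prototype and metric space, while this sketch uses plain Euclidean distance and a fixed open-set vector):

```python
import numpy as np

def classify_queries(support, labels, open_prototype, queries):
    """Toy prototype classifier with an extra open-set prototype.
    support: (n, d) embeddings; labels: (n,) ints; queries: (m, d)."""
    classes = sorted(set(labels))
    # One prototype per class: the mean of that class's support embeddings.
    protos = np.stack([support[np.array(labels) == c].mean(axis=0)
                       for c in classes])
    protos = np.vstack([protos, open_prototype])  # append open-set prototype
    preds = []
    for q in queries:
        d = np.linalg.norm(protos - q, axis=1)    # distance to each prototype
        k = int(np.argmin(d))
        preds.append(classes[k] if k < len(classes) else "open")
    return preds
```

A query nearest to the open-set prototype is rejected as "open" rather than forced into a known class.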
  • Patent number: 12020135
    Abstract: A library of machine learning primitives is provided to optimize a machine learning model to improve the efficiency of inference operations. In one embodiment a trained convolutional neural network (CNN) model is processed into an optimized CNN model via pruning, convolution window optimization, and quantization.
    Type: Grant
    Filed: August 26, 2021
    Date of Patent: June 25, 2024
    Assignee: Intel Corporation
    Inventors: Liwei Ma, Elmoustapha Ould-Ahmed-Vall, Barath Lakshmanan, Ben J. Ashbaugh, Jingyi Jin, Jeremy Bottleson, Mike B. Macpherson, Kevin Nealis, Dhawal Srivastava, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Anbang Yao, Tatiana Shpeisman, Altug Koker, Abhishek R. Appu
  • Patent number: 12020697
    Abstract: An audio keyword searcher arranged to identify a voice segment of a received audio signal; identify, by an automatic speech recognition engine, one or more phonemes included in the voice segment; output, from the automatic speech recognition engine, the one or more phonemes to a keyword filter to detect whether the voice segment includes any of the one or more first keywords of the first keyword list and, if detected, output the one or more phonemes included in the voice segment to a decoder but, if not detected, not output the one or more phonemes included in the voice segment to the decoder. If the one or more phonemes are output to the decoder: generate a word lattice associated with the voice segment; search the word lattice for one or more second keywords, and determine whether the voice segment includes the one or more second keywords.
    Type: Grant
    Filed: July 15, 2020
    Date of Patent: June 25, 2024
    Assignee: Raytheon Applied Signal Technology, Inc.
    Inventor: Jonathan C. Wintrode
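The two-stage gate in this abstract, a cheap phoneme-level keyword filter that decides whether the expensive lattice decoder runs at all, can be sketched as follows (hypothetical helper names; `decode` stands in for word-lattice generation and search):

```python
def first_pass_filter(phonemes, first_keywords):
    """Return True if any first-list keyword (written as a phoneme string)
    occurs in the voice segment's phoneme sequence."""
    text = " ".join(phonemes)
    return any(kw in text for kw in first_keywords)

def search_segment(phonemes, first_keywords, decode, second_keywords):
    """Forward phonemes to the decoder only when the filter fires; then
    search the decoded words for second-list keywords."""
    if not first_pass_filter(phonemes, first_keywords):
        return []                 # decoder never runs on this segment
    words = decode(phonemes)      # stand-in for word-lattice search
    return [kw for kw in second_keywords if kw in words]
```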
  • Patent number: 12019639
    Abstract: This application relates to apparatus and methods for generating preference profiles that may be used to rank search results. In some examples, a computing device obtains browsing session data and determines items that were engaged, such as items that were viewed or clicked. The computing device obtains item property data, such as product descriptions, for the items, and applies a dependency parser to the item property data to identify portions that include certain words, such as nouns or adjectives, which are then identified as attributes. The computing device generates attribute data identifying portions of the item property data as item attributes. In some examples, the computing device applies one or more machine learning algorithms to the session data and/or search query to identify item attributes. The computing device may generate a profile that includes the item attributes, and may rank search results based on the attribute data, among other uses.
    Type: Grant
    Filed: January 25, 2023
    Date of Patent: June 25, 2024
    Assignee: Walmart Apollo, LLC
    Inventors: Rahul Iyer, Soumya Wadhwa, Stephen Dean Guo, Kannan Achan
  • Patent number: 12008459
    Abstract: This document relates to architectures and training procedures for multi-task machine learning models, such as neural networks. One example method involves providing a multi-task machine learning model having one or more shared layers and two or more task-specific layers. The method can also involve performing a pretraining stage on the one or more shared layers using one or more unsupervised prediction tasks. The method can also involve performing a tuning stage on the one or more shared layers and the two or more task-specific layers using respective task-specific objectives.
    Type: Grant
    Filed: June 17, 2019
    Date of Patent: June 11, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Weizhu Chen, Pengcheng He, Xiaodong Liu, Jianfeng Gao
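A minimal sketch of the shared-plus-task-specific layout (illustrative only; the patent's models are neural networks trained with unsupervised pretraining of the shared layers followed by task-specific tuning, which this toy forward pass does not attempt):

```python
import numpy as np

rng = np.random.default_rng(0)

class MultiTaskModel:
    """Toy multi-task net: one shared layer feeding two task-specific heads.
    Pretraining would fit the shared weights on an unsupervised objective;
    tuning then updates shared and task-specific weights per task."""
    def __init__(self, d_in, d_hidden, d_out_a, d_out_b):
        self.shared = rng.normal(size=(d_in, d_hidden))
        self.head_a = rng.normal(size=(d_hidden, d_out_a))
        self.head_b = rng.normal(size=(d_hidden, d_out_b))

    def forward(self, x, task):
        h = np.tanh(x @ self.shared)     # shared representation, reused by both tasks
        head = self.head_a if task == "a" else self.head_b
        return h @ head                  # task-specific output
```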
  • Patent number: 11996099
    Abstract: An embodiment dialogue system includes a speech recognizer configured to convert an utterance of a user into an utterance text, a natural language understanding module configured to identify an intention of the user based on the utterance text, and a controller configured to generate a first control signal for performing control corresponding to the intention of the user, identify whether an additional control item related to the control corresponding to the intention of the user exists, and in response to the additional control item existing, generate a second control signal for displaying information about the additional control item on a display.
    Type: Grant
    Filed: November 18, 2021
    Date of Patent: May 28, 2024
    Assignees: Hyundai Motor Company, Kia Corporation
    Inventors: Sungwang Kim, Donghyeon Lee, Minjae Park
  • Patent number: 11990151
    Abstract: The present technology relates to a particular-sound detector and method, and a program that make it possible to improve the performance of detecting particular sounds. The particular-sound detector includes a particular-sound detecting section that detects a particular sound on a basis of a plurality of audio signals obtained by collecting sounds by a plurality of microphones provided to a wearable device. In addition, the plurality of the microphones includes two microphones that are equidistant at least from a sound source of the particular sound, and one microphone arranged at a predetermined position. The present technology can be applied to headphones.
    Type: Grant
    Filed: December 12, 2019
    Date of Patent: May 21, 2024
    Assignee: Sony Group Corporation
    Inventors: Yuki Yamamoto, Yuji Tokozume, Toru Chinen
  • Patent number: 11989941
    Abstract: Embodiments describe a method of video-text pre-training to effectively learn cross-modal representations from sparse video frames and text. Specifically, an align-and-prompt framework provides a video-and-language pre-training framework that encodes the frames and text independently using a transformer-based video encoder and a text encoder. A multi-modal encoder is then employed to capture cross-modal interaction between a plurality of video frames and a plurality of texts. The pre-training includes prompting entity modeling, which enables the model to capture fine-grained region-entity alignment.
    Type: Grant
    Filed: December 30, 2021
    Date of Patent: May 21, 2024
    Assignee: Salesforce, Inc.
    Inventors: Dongxu Li, Junnan Li, Chu Hong Hoi
  • Patent number: 11970059
    Abstract: The invention relates to a system for interacting with an occupant of a motor vehicle comprising: a. a measuring device comprising at least one sensor arranged to acquire at least one parameter associated with the occupant of said vehicle; b. an on-board processing unit arranged to receive said parameter and to define a data item representing the emotional state of said occupant by means of said model, said representative data item being a comfort index score (CISn) of said occupant; c. the representative data item corresponding to a point in a two-dimensional space (anvn) for characterising the emotional state of the occupant; d. characterised in that an emotional comfort index is computed on the basis of the representative data item; e. and in that at least one actuator is configured to activate at least one multi-sensory stimulus for interacting with the occupant, said stimulus allowing the emotional state of said occupant to be changed.
    Type: Grant
    Filed: January 6, 2020
    Date of Patent: April 30, 2024
    Assignee: VALEO SYSTEMES THERMIQUES
    Inventors: Georges De Pelsemaeker, Antoine Boilevin, Hamid Bessaa
  • Patent number: 11967340
    Abstract: Disclosed is a method for detecting a voice from audio data, performed by a computing device according to an exemplary embodiment of the present disclosure. The method includes obtaining audio data; generating image data based on a spectrum of the obtained audio data; analyzing the generated image data by utilizing a pre-trained neural network model; and determining whether an automated response system (ARS) voice is included in the audio data, based on the analysis of the image data.
    Type: Grant
    Filed: June 23, 2023
    Date of Patent: April 23, 2024
    Assignee: ActionPower Corp.
    Inventors: Subong Choi, Dongchan Shin, Jihwa Lee
  • Patent number: 11961515
    Abstract: A method includes receiving a plurality of unlabeled audio samples corresponding to spoken utterances not paired with corresponding transcriptions. At a target branch of a contrastive Siamese network, the method also includes generating a sequence of encoder outputs for the plurality of unlabeled audio samples and modifying time characteristics of the encoder outputs to generate a sequence of target branch outputs. At an augmentation branch of a contrastive Siamese network, the method also includes performing augmentation on the unlabeled audio samples, generating a sequence of augmented encoder outputs for the augmented unlabeled audio samples, and generating predictions of the sequence of target branch outputs generated at the target branch. The method also includes determining an unsupervised loss term based on target branch outputs and predictions of the sequence of target branch outputs. The method also includes updating parameters of the audio encoder based on the unsupervised loss term.
    Type: Grant
    Filed: December 14, 2021
    Date of Patent: April 16, 2024
    Assignee: Google LLC
    Inventors: Jaeyoung Kim, Soheil Khorram, Hasim Sak, Anshuman Tripathi, Han Lu, Qian Zhang
  • Patent number: 11942094
    Abstract: A speaker verification method includes receiving audio data corresponding to an utterance, processing a first portion of the audio data that characterizes a predetermined hotword to generate a text-dependent evaluation vector, and generating one or more text-dependent confidence scores. When one of the text-dependent confidence scores satisfies a threshold, the operations include identifying a speaker of the utterance as a respective enrolled user associated with the text-dependent confidence score that satisfies the threshold and initiating performance of an action without performing speaker verification. When none of the text-dependent confidence scores satisfy the threshold, the operations include processing a second portion of the audio data that characterizes a query to generate a text-independent evaluation vector, generating one or more text-independent confidence scores, and determining whether the identity of the speaker of the utterance includes any of the enrolled users.
    Type: Grant
    Filed: March 24, 2021
    Date of Patent: March 26, 2024
    Assignee: Google LLC
    Inventors: Roza Chojnacka, Jason Pelecanos, Quan Wang, Ignacio Lopez Moreno
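The decision flow above — accept on the text-dependent (hotword) score alone when it clears the threshold, otherwise fall back to text-independent scoring — can be sketched like this (hypothetical; combining the two scores by `max` is an arbitrary stand-in for the patent's actual combination):

```python
def verify(td_scores, ti_scorer, threshold):
    """Toy two-stage speaker verification decision.
    td_scores: text-dependent confidence per enrolled user.
    ti_scorer: callable producing text-independent scores, invoked only
    when no text-dependent score clears the threshold."""
    best_user, best = max(td_scores.items(), key=lambda kv: kv[1])
    if best >= threshold:
        return best_user, "hotword-only"   # skip second-stage verification
    ti_scores = ti_scorer()                # process the query portion of audio
    combined = {u: max(td_scores[u], ti_scores[u]) for u in td_scores}
    user, score = max(combined.items(), key=lambda kv: kv[1])
    return (user, "full") if score >= threshold else (None, "rejected")
```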
  • Patent number: 11941787
    Abstract: Examples are provided relating to recovering depth data from noisy phase data of low-signal pixels. One example provides a computing system, comprising a logic machine, and a storage machine holding instructions executable by the logic machine to process depth data by obtaining depth image data and active brightness image data for a plurality of pixels, the depth image data comprising phase data for a plurality of frequencies, and identifying low-signal pixels based at least on the active brightness image data. The instructions are further executable to apply a denoising filter to phase data of the low-signal pixels to obtain denoised phase data and not applying the denoising filter to phase data of other pixels. The instructions are further executable to, after applying the denoising filter, perform phase unwrapping on the phase data for the plurality of frequencies to obtain a depth image, and output the depth image.
    Type: Grant
    Filed: August 23, 2021
    Date of Patent: March 26, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Sergio Ortiz Egea, Augustine Cha
  • Patent number: 11941356
    Abstract: Embodiments described herein propose a densely connected Transformer architecture in which each Transformer layer takes advantage of all previous layers. Specifically, the input for each Transformer layer comes from the outputs of all its preceding layers, and the output of each layer is incorporated into all its subsequent layers, so an L-layer Transformer network has L(L+1)/2 connections. This dense connectivity allows the linguistic information learned by the lower layers to be directly propagated to all upper layers and encourages feature reuse throughout the network. Each layer is thus directly optimized from the loss function in the fashion of implicit deep supervision.
    Type: Grant
    Filed: October 26, 2020
    Date of Patent: March 26, 2024
    Assignee: Salesforce, Inc.
    Inventors: Linqing Liu, Caiming Xiong
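The connectivity pattern can be illustrated with a toy stand-in (summing all preceding outputs here replaces whatever combination the patent uses, and plain callables replace Transformer layers):

```python
import numpy as np

def dense_stack(x, layers):
    """Each layer receives an aggregate of the original input and every
    previous layer's output (a stand-in for the dense connectivity); with
    L layers there are L*(L+1)/2 input-output connections in total."""
    outputs = [x]
    for layer in layers:
        layer_in = np.sum(outputs, axis=0)  # aggregate all preceding outputs
        outputs.append(layer(layer_in))
    return outputs[-1]

def n_connections(L):
    """Connection count for an L-layer densely connected stack."""
    return L * (L + 1) // 2
```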
  • Patent number: 11929063
    Abstract: A supervised discriminator for detecting bio-markers in an audio sample dataset is trained, and a denoising autoencoder is trained to learn a latent space that is used to reconstruct an output audio sample with the same fidelity as an input audio sample of the audio sample dataset. A conditional auxiliary generative adversarial network (GAN) is trained to generate the output audio sample with the same fidelity as the input audio sample, wherein the output audio sample is void of the bio-markers. The conditional auxiliary GAN, the corresponding supervised discriminator, and the corresponding denoising autoencoder are deployed in an audio processing system.
    Type: Grant
    Filed: November 23, 2021
    Date of Patent: March 12, 2024
    Assignee: International Business Machines Corporation
    Inventors: Victor Abayomi Akinwande, Celia Cintas, Komminist Weldemariam, Aisha Walcott
  • Patent number: 11922178
    Abstract: Methods, apparatus, systems, and articles of manufacture to load data into an accelerator are disclosed. An example apparatus includes data provider circuitry to load a first section and an additional amount of compressed machine learning parameter data into a processor engine. Processor engine circuitry executes a machine learning operation using the first section of compressed machine learning parameter data. Compressed local data re-user circuitry determines whether a second section is present in the additional amount of compressed machine learning parameter data. The processor engine circuitry executes a machine learning operation using the second section when the second section is present in the additional amount of compressed machine learning parameter data.
    Type: Grant
    Filed: June 25, 2021
    Date of Patent: March 5, 2024
    Assignee: Intel Corporation
    Inventors: Arnab Raha, Deepak Mathaikutty, Debabrata Mohapatra, Sang Kyun Kim, Gautham Chinya, Cormac Brick
  • Patent number: 11908447
    Abstract: According to an aspect, method for synthesizing multi-speaker speech using an artificial neural network comprises generating and storing a speech learning model for a plurality of users by subjecting a synthetic artificial neural network of a speech synthesis model to learning, based on speech data of the plurality of users, generating speaker vectors for a new user who has not been learned and the plurality of users who have already been learned by using a speaker recognition model, determining a speaker vector having the most similar relationship with the speaker vector of the new user according to preset criteria out of the speaker vectors of the plurality of users who have already been learned, and generating and learning a speaker embedding of the new user by subjecting the synthetic artificial neural network of the speech synthesis model to learning, by using a value of a speaker embedding of a user for the determined speaker vector as an initial value and based on speaker data of the new user.
    Type: Grant
    Filed: August 4, 2021
    Date of Patent: February 20, 2024
    Assignee: IUCF-HYU (INDUSTRY-UNIVERSITY COOPERATION FOUNDATION HANYANG UNIVERSITY)
    Inventors: Joon Hyuk Chang, Jae Uk Lee
  • Patent number: 11907855
    Abstract: A computer implemented method of storing and retrieving feature map data of a neural network, the method comprising receiving a first portion of feature map data from local storage, selecting a first set of subportions of the first portion of feature map data, compressing the subportions to produce a first plurality of sections of compressed feature map data and instructing the storage of the sections into external storage. The method also comprises receiving a second plurality of sections of compressed feature map data from the external storage, decompressing the sections to produce a second set of subportions of the second portion of feature map data and storing the second portion of feature map data in local storage. The first and second sets of subportions each correspond to a predetermined format of subdivision and the method comprises selecting the predetermined format of subdivision from a plurality of predetermined formats of subdivision.
    Type: Grant
    Filed: March 30, 2020
    Date of Patent: February 20, 2024
    Assignee: Arm Limited
    Inventors: Erik Persson, Stefan Johannes Frid, Elliot Maurice Simon Rosemarine
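The store/retrieve round trip might look like this minimal sketch (zlib and `np.array_split` stand in for the patent's compression scheme and its choice among predetermined subdivision formats):

```python
import zlib
import numpy as np

def store_feature_map(fmap, n_sections=4):
    """Split a feature map into subportions and compress each section
    independently, as if writing them to external storage."""
    return [zlib.compress(part.tobytes())
            for part in np.array_split(fmap, n_sections)]

def load_feature_map(sections, dtype, shape):
    """Decompress the sections read back from external storage and
    reassemble the feature map for local storage."""
    raw = b"".join(zlib.decompress(s) for s in sections)
    return np.frombuffer(raw, dtype=dtype).reshape(shape)
```

Compressing sections independently is what lets each one be fetched and decompressed on its own, which is the point of the subdivision.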
  • Patent number: 11886768
    Abstract: Embodiments are disclosed for real time generative audio for brush and canvas interaction in digital drawing. The method may include receiving a user input and a selection of a tool for generating audio for a digital drawing interaction. The method may further include generating intermediary audio data based on the user input and the tool selection, wherein the intermediary audio data includes a pitch and a frequency. The method may further include processing, by a trained audio transformation model and through a series of one or more layers of the trained audio transformation model, the intermediary audio data. The method may further include adjusting the series of one or more layers of the trained audio transformation model to include one or more additional layers to produce an adjusted audio transformation model. The method may further include generating, by the adjusted audio transformation model, an audio sample based on the intermediary audio data.
    Type: Grant
    Filed: April 29, 2022
    Date of Patent: January 30, 2024
    Assignee: Adobe Inc.
    Inventors: Pranay Kumar, Nipun Jindal
  • Patent number: 11868884
    Abstract: The present disclosure provides methods and systems for providing machine learning model service. The method may comprise: (a) generating, by a first computing system, a first output data using a first machine learning model, wherein the first machine learning model is trained on a first training dataset; (b) transmitting the first output data to a second computing system, wherein the first training dataset and the first machine learning model are inaccessible to the second computing system; (c) creating an input data by joining the first output data with a selected set of input features accessible to the second computing system; and (d) generating a second output data using a second machine learning model to process the input data.
    Type: Grant
    Filed: June 17, 2020
    Date of Patent: January 9, 2024
    Assignee: MOLOCO, INC.
    Inventors: Jian Gong Deng, Ikkjin Ahn, Daeseob Lim, Bokyung Choi, Sechan Oh, William Kanaan
  • Patent number: 11868736
    Abstract: Introduced here is a computer program that is representative of a software-implemented collaboration platform that is designed to facilitate conversations in virtual environments, document those conversations, and analyze those conversations, all in real time. The collaboration platform can include or integrate tools for turning ideas—expressed through voice—into templatized, metadata-rich data structures called “knowledge objects.” Discourse throughout a conversation can be converted into a transcription (or simply “transcript”), parsed to identify topical shifts, and then segmented based on the topical shifts. Separately documenting each topic in the form of its own “knowledge object” allows the collaboration platform to not only better catalogue what was discussed in a single ideation session, but also monitor discussion of the same topic over multiple ideation sessions.
    Type: Grant
    Filed: November 9, 2022
    Date of Patent: January 9, 2024
    Assignee: Moonbeam, Inc.
    Inventors: Nirav S. Desai, Trond Tamaio Nilsen, Philip Roger Lamb
  • Patent number: 11862146
    Abstract: Audio signals of speech may be processed using an acoustic model. An acoustic model may be implemented with multiple streams of processing where different streams perform processing using different dilation rates. For example, a first stream may process features of the audio signal with one or more convolutional neural network layers having a first dilation rate, and a second stream may process features of the audio signal with one or more convolutional neural network layers having a second dilation rate. Each stream may compute a stream vector, and the stream vectors may be combined to a vector of speech unit scores, where the vector of speech unit scores provides information about the acoustic content of the audio signal. The vector of speech unit scores may be used for any appropriate application of speech, such as automatic speech recognition.
    Type: Grant
    Filed: July 2, 2020
    Date of Patent: January 2, 2024
    Assignee: ASAPP, INC.
    Inventors: Kyu Jeong Han, Tao Ma, Daniel Povey
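A two-stream toy version of the dilation idea (hypothetical kernels and pooling; the patent's streams are full convolutional stacks whose stream vectors are combined into a vector of speech-unit scores):

```python
import numpy as np

def dilated_conv1d(x, kernel, dilation):
    """Minimal 1-D dilated convolution (valid padding): tap spacing
    between kernel elements grows with the dilation rate."""
    k = len(kernel)
    span = (k - 1) * dilation + 1
    return np.array([
        sum(kernel[j] * x[i + j * dilation] for j in range(k))
        for i in range(len(x) - span + 1)
    ])

def two_stream_scores(x, kernel_a, kernel_b):
    """Two streams at different dilation rates; each is pooled to a
    stream vector and the vectors are combined into one score vector."""
    s1 = dilated_conv1d(x, kernel_a, dilation=1).mean(keepdims=True)
    s2 = dilated_conv1d(x, kernel_b, dilation=2).mean(keepdims=True)
    return np.concatenate([s1, s2])
```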
  • Patent number: 11848748
    Abstract: An apparatus and method for enhancing broadcast radio includes a DNN trained on data sets of audio created from a synthesized broadcasting process and original source audio. Broadcast radio signals are received at a radio module and processed through the DNN to produce enhanced audio.
    Type: Grant
    Filed: December 14, 2020
    Date of Patent: December 19, 2023
    Assignee: GM GLOBAL TECHNOLOGY OPERATIONS LLC
    Inventors: Joseph Kampeas, Igal Kotzer
  • Patent number: 11842736
    Abstract: Provided is an in-ear device and associated computational support system that leverages machine learning to interpret sensor data descriptive of one or more in-ear phenomena during subvocalization by the user. An electronic device can receive sensor data generated by at least one sensor at least partially positioned within an ear of a user, wherein the sensor data was generated by the at least one sensor concurrently with the user subvocalizing a subvocalized utterance. The electronic device can then process the sensor data with a machine-learned subvocalization interpretation model to generate an interpretation of the subvocalized utterance as an output of the machine-learned subvocalization interpretation model.
    Type: Grant
    Filed: February 10, 2023
    Date of Patent: December 12, 2023
    Assignee: Google LLC
    Inventors: Yaroslav Volovich, Ant Oztaskent, Blaise Aguera-Arcas
  • Patent number: 11822657
    Abstract: Disclosed is a computer implemented method for malware detection that analyses a file on a per-packet basis. The method receives a packet of one or more packets associated with a file, converts the binary content associated with the packet into a digital representation, and tokenizes plain text content associated with the packet. The method extracts one or more n-gram features, an entropy feature, and a domain feature from the converted content of the packet and applies a trained machine learning model to the one or more features extracted from the packet. The output of the machine learning model is a probability of maliciousness associated with the received packet. If the probability of maliciousness is above a threshold value, the method determines that the file associated with the received packet is malicious.
    Type: Grant
    Filed: April 20, 2022
    Date of Patent: November 21, 2023
    Assignee: Zscaler, Inc.
    Inventors: Huihsin Tseng, Hao Xu, Jian L Zhen
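Two of the per-packet features the abstract names, byte entropy and character n-grams, are straightforward to compute; the model call and threshold test are shown schematically (the 0.5 default threshold is an arbitrary assumption):

```python
import math
from collections import Counter

def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the packet's bytes (the entropy feature)."""
    n = len(data)
    counts = Counter(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def char_ngrams(text: str, n: int = 3):
    """Character n-gram features from the packet's tokenized plain text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def is_malicious(features, model, threshold=0.5):
    """Flag the file when the model's per-packet probability of
    maliciousness crosses the threshold."""
    return model(features) > threshold
```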
  • Patent number: 11817081
    Abstract: A learning device calculates an image feature using a model (image encoder) that receives an image and outputs the image feature obtained by mapping the image into a latent space. The learning device calculates an audio feature using a model (audio encoder) that receives a speech in a predetermined language and outputs the audio feature obtained by mapping the speech into the latent space, and that includes a neural network provided with a self-attention mechanism. The learning device updates parameters of the models used by an image feature calculation unit and an audio feature calculation unit such that the image feature of a first image is similar to the audio feature of a speech corresponding to the first image.
    Type: Grant
    Filed: March 31, 2021
    Date of Patent: November 14, 2023
    Assignees: NIPPON TELEGRAPH AND TELEPHONE CORPORATION, MASSACHUSETTS INSTITUTE OF TECHNOLOGY
    Inventors: Yasunori Ohishi, Akisato Kimura, Takahito Kawanishi, Kunio Kashino, James R. Glass, David Harwath
  • Patent number: 11810552
    Abstract: The present disclosure provides an artificial intelligence (AI) system for sequence-to-sequence modeling with attention adapted for streaming applications. The AI system comprises at least one processor; and memory having instructions stored thereon that, when executed by the processor, cause the AI system to process each input frame in a sequence of input frames through layers of a deep neural network (DNN) to produce a sequence of outputs. At least some of the layers of the DNN include a dual self-attention module having a dual non-causal and causal architecture attending to non-causal frames and causal frames. Further, the AI system renders the sequence of outputs.
    Type: Grant
    Filed: July 2, 2021
    Date of Patent: November 7, 2023
    Assignee: Mitsubishi Electric Research Laboratories, Inc.
    Inventors: Niko Moritz, Takaaki Hori, Jonathan Le Roux
  • Patent number: 11810471
    Abstract: A computer system analyses audio data representing a user speaking words from a body of text and identifies occasions where the user mispronounces an expected phoneme. Mispronunciation of the expected phoneme is identified by comparison with a phonetic sequence corresponding to the text, based on a predetermined or user-selected language model. The system requires the user to read continuously for a period of time, so that the user cannot hide any tendency they have to pronounce the words of the text either incorrectly or differently to the expected phonemes from the language model. The system operates on the basis of comparing the similarity of the spoken sounds of the user with the expected phonemes for the body of text, and it is not necessary to convert the user's speech to text. As the computer system need only work with the similarity scores and the sequence of expected phonemes, it can be implemented in a computationally efficient manner.
    Type: Grant
    Filed: May 13, 2019
    Date of Patent: November 7, 2023
    Assignee: SPEECH ENGINEERING LIMITED
    Inventor: David Matthew Karas
  • Patent number: 11804216
    Abstract: Systems and methods for generating training data for a supervised topic modeling system from outputs of a topic discovery model are described herein. In an embodiment, a system receives a plurality of digitally stored call transcripts and, using a topic model, generates an output which identifies a plurality of topics represented in the plurality of digitally stored call transcripts. Using the output of the topic model, the system generates an input dataset for a supervised learning model by identifying a first subset of the plurality of digitally stored call transcripts that include a particular topic, storing a positive value for the first subset, identifying a second subset that do not include the particular topic, and storing a negative value for the second subset. The input training dataset is then used to train a supervised learning model.
    Type: Grant
    Filed: August 3, 2022
    Date of Patent: October 31, 2023
    Assignee: Invoca, Inc.
    Inventors: Michael McCourt, Anoop Praturu
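The dataset-construction step reduces to a simple labeling pass; here is a toy version (the `topic_of` callable stands in for the topic discovery model's output):

```python
def build_training_set(transcripts, topic_of, target_topic):
    """Label each transcript positive (1) or negative (0) according to
    whether the topic model assigns it the target topic; the pairs form
    the input dataset for the supervised learning model."""
    dataset = []
    for t in transcripts:
        label = 1 if target_topic in topic_of(t) else 0
        dataset.append((t, label))
    return dataset
```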
  • Patent number: 11803746
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for neural programming. One of the methods includes processing a current neural network input using a core recurrent neural network to generate a neural network output; determining, from the neural network output, whether or not to end a currently invoked program and to return to a calling program from the set of programs; determining, from the neural network output, a next program to be called; determining, from the neural network output, contents of arguments to the next program to be called; receiving a representation of a current state of the environment; and generating a next neural network input from an embedding for the next program to be called and the representation of the current state of the environment.
    Type: Grant
    Filed: April 27, 2020
    Date of Patent: October 31, 2023
    Assignee: DeepMind Technologies Limited
    Inventors: Scott Ellison Reed, Joao Ferdinando Gomes de Freitas
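The control flow around the core recurrent network can be sketched as a loop that repeatedly asks the core which program to call next and whether the current one should end. `core_step` here is a hypothetical stand-in for the trained core RNN, and the stack-based bookkeeping is an assumption about how calls and returns are tracked:

```python
def npi_run(core_step, program_embeddings, initial_program, get_env_state,
            max_steps=10):
    """Outer loop of a neural-programmer-style interpreter (sketch).

    core_step(program_embedding, env_state, hidden) is assumed to return
    (end_flag, next_program, next_args, hidden): end_flag says whether to
    end the currently invoked program and return to the caller.
    """
    call_stack = [(initial_program, None)]
    trace = []
    hidden = None
    while call_stack and len(trace) < max_steps:
        program, args = call_stack.pop()
        trace.append(program)
        end, next_program, next_args, hidden = core_step(
            program_embeddings[program], get_env_state(), hidden)
        if not end:
            call_stack.append((next_program, next_args))
    return trace
```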
  • Patent number: 11790921
    Abstract: Systems, methods, and non-transitory computer-readable media can obtain a stream of audio waveform data that represents speech involving a plurality of speakers. As the stream of audio waveform data is obtained, a plurality of audio chunks can be determined. An audio chunk can be associated with one or more identity embeddings. The stream of audio waveform data can be segmented into a plurality of segments based on the plurality of audio chunks and respective identity embeddings associated with the plurality of audio chunks. A segment can be associated with a speaker included in the plurality of speakers. Information describing the plurality of segments associated with the stream of audio waveform data can be provided.
    Type: Grant
    Filed: February 8, 2021
    Date of Patent: October 17, 2023
    Assignee: OTO Systems Inc.
    Inventors: Valentin Alain Jean Perret, Nándor Kedves, Nicolas Lucien Perony
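The segmentation step can be approximated by starting a new segment whenever consecutive chunks' identity embeddings diverge. The cosine-similarity threshold is an assumption for illustration; the patent does not specify the comparison rule:

```python
import math

def _cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

def segment_chunks(chunk_embeddings, threshold=0.8):
    """Group consecutive audio chunks into speaker segments.

    A new segment starts whenever a chunk's identity embedding falls
    below `threshold` cosine similarity to the previous chunk's.
    Returns (start_index, end_index_exclusive) pairs over the chunks.
    """
    if not chunk_embeddings:
        return []
    segments, start = [], 0
    for i in range(1, len(chunk_embeddings)):
        if _cosine(chunk_embeddings[i - 1], chunk_embeddings[i]) < threshold:
            segments.append((start, i))
            start = i
    segments.append((start, len(chunk_embeddings)))
    return segments
```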
  • Patent number: 11790894
    Abstract: A system uses conversation engines to process natural language requests and conduct automatic conversations with users. The system generates responses to users in an online conversation. The system ranks generated user responses for the online conversation. The system generates a context vector based on a sequence of utterances of the conversation and generates response vectors for generated user responses. The system ranks the user responses based on a comparison of the context vectors and user response vectors. The system uses a machine learning based model that uses a pretrained neural network that supports multiple languages. The system determines a context of an utterance based on utterances in the conversation. The system generates responses and ranks them based on the context. The ranked responses are used to respond to the user.
    Type: Grant
    Filed: March 15, 2021
    Date of Patent: October 17, 2023
    Assignee: Salesforce, Inc.
    Inventors: Yixin Mao, Zachary Alexander, Victor Winslow Yee, Joseph R. Zeimen, Na Cheng, Chien-Sheng Wu, Wenhao Liu, Caiming Xiong
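The ranking step (comparing a context vector against candidate response vectors) reduces to sorting by a similarity score. Cosine similarity is an assumed choice here; the abstract only says the vectors are compared:

```python
import math

def _cos(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

def rank_responses(context_vec, candidates):
    """candidates: list of (response_text, response_vec) pairs.

    Returns response texts sorted best-first by similarity between the
    conversation context vector and each candidate response vector.
    """
    return [text for text, vec in
            sorted(candidates, key=lambda c: _cos(context_vec, c[1]),
                   reverse=True)]
```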
  • Patent number: 11783811
    Abstract: A computer-implemented method is provided for model training. The method includes training a second end-to-end neural speech recognition model that has a bidirectional encoder to output same symbols from an output probability lattice of the second end-to-end neural speech recognition model as from an output probability lattice of a trained first end-to-end neural speech recognition model having a unidirectional encoder. The method also includes building a third end-to-end neural speech recognition model that has a unidirectional encoder by training the third end-to-end neural speech recognition model as a student by using the trained second end-to-end neural speech recognition model as a teacher in a knowledge distillation method.
    Type: Grant
    Filed: September 24, 2020
    Date of Patent: October 10, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Gakuto Kurata, George Andrei Saon
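The teacher-student step of such a knowledge-distillation setup is typically a frame-level KL divergence between the teacher's and student's output distributions. This is a generic sketch of that loss, not the patent's exact formulation:

```python
import math

def distillation_loss(teacher_probs, student_probs, eps=1e-12):
    """Average frame-level KL(teacher || student).

    teacher_probs / student_probs: lists of per-frame probability
    distributions over output symbols (each a list of floats summing
    to 1). Minimizing this pushes the student's output lattice toward
    the teacher's.
    """
    total = 0.0
    for t_frame, s_frame in zip(teacher_probs, student_probs):
        total += sum(t * math.log((t + eps) / (s + eps))
                     for t, s in zip(t_frame, s_frame) if t > 0)
    return total / len(teacher_probs)
```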
  • Patent number: 11783173
    Abstract: A processing unit can train a model as a joint multi-domain recurrent neural network (JRNN), such as a bi-directional recurrent neural network (bRNN) and/or a recurrent neural network with long-short term memory (RNN-LSTM) for spoken language understanding (SLU). The processing unit can use the trained model to, e.g., jointly model slot filling, intent determination, and domain classification. The joint multi-domain model described herein can estimate a complete semantic frame per query, and the joint multi-domain model enables multi-task deep learning leveraging the data from multiple domains. The joint multi-domain recurrent neural network (JRNN) can leverage semantic intents (such as, finding or identifying, e.g., a domain specific goal) and slots (such as, dates, times, locations, subjects, etc.) across multiple domains.
    Type: Grant
    Filed: August 4, 2016
    Date of Patent: October 10, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Dilek Z Hakkani-Tur, Asli Celikyilmaz, Yun-Nung Chen, Li Deng, Jianfeng Gao, Gokhan Tur, Ye-Yi Wang
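The joint modeling idea (one shared representation driving slot filling, intent determination, and domain classification) can be sketched with linear scoring heads over shared hidden states. The head structure and argmax decoding are assumptions for illustration:

```python
def _argmax_head(hidden, weight_rows):
    """weight_rows: dict label -> weight vector. Returns the label whose
    weight vector has the highest dot product with the hidden state."""
    return max(weight_rows,
               key=lambda lbl: sum(h * w
                                   for h, w in zip(hidden, weight_rows[lbl])))

def joint_predict(hidden_states, domain_w, intent_w, slot_w):
    """One shared encoding drives all three predictions: domain and
    intent from the final hidden state, one slot label per token."""
    final = hidden_states[-1]
    domain = _argmax_head(final, domain_w)
    intent = _argmax_head(final, intent_w)
    slots = [_argmax_head(h, slot_w) for h in hidden_states]
    return domain, intent, slots
```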
  • Patent number: 11769044
    Abstract: A neural network mapping method and a neural network mapping apparatus are provided. The method includes: mapping a calculation task for a preset feature map of each network layer in a plurality of network layers in a convolutional neural network to at least one processing element of a chip; acquiring the number of phases needed by a plurality of processing elements in the chip for completing the calculation tasks, and performing a first stage of balancing on the number of phases of the plurality of processing elements; and based on the number of the phases of the plurality of processing elements obtained after the first stage of balancing, mapping the calculation task for the preset feature map of each network layer in the plurality of network layers in the convolutional neural network to at least one processing element of the chip subjected to the first stage of balancing.
    Type: Grant
    Filed: October 27, 2020
    Date of Patent: September 26, 2023
    Assignee: LYNXI TECHNOLOGIES CO., LTD.
    Inventors: Weihao Zhang, Han Li, Chuan Hu, Yaolong Zhu
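The first stage of balancing can be sketched as a greedy assignment that always places the next layer's calculation task on the least-loaded processing element. The greedy rule is an illustrative assumption; the patent does not spell out the balancing algorithm:

```python
def balance_phases(layer_phase_costs, num_pes):
    """Greedy first-stage balancing of phases across processing elements.

    layer_phase_costs: phases required by each layer's calculation task.
    Returns (assignment, pe_phases): assignment[i] is the PE index for
    layer i, pe_phases[j] the accumulated phases on PE j.
    """
    pe_phases = [0] * num_pes
    assignment = []
    for cost in layer_phase_costs:
        pe = min(range(num_pes), key=lambda i: pe_phases[i])
        assignment.append(pe)
        pe_phases[pe] += cost
    return assignment, pe_phases
```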
  • Patent number: 11769035
    Abstract: Techniques are described for automatically determining runtime configurations used to execute recurrent neural networks (RNNs) for training or inference. One such configuration involves determining whether to execute an RNN in a looped, or “rolled,” execution pattern or in a non-looped, or “unrolled,” execution pattern. Execution of an RNN using a rolled execution pattern generally consumes less memory resources than execution using an unrolled execution pattern, whereas execution of an RNN using an unrolled execution pattern typically executes faster. The configuration choice thus involves a time-memory tradeoff that can significantly affect the performance of the RNN execution. This determination is made automatically by a machine learning (ML) runtime by analyzing various factors such as, for example, a type of RNN being executed, the network structure of the RNN, characteristics of the input data to the RNN, an amount of computing resources available, and so forth.
    Type: Grant
    Filed: December 13, 2018
    Date of Patent: September 26, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Lai Wei, Hagay Lupesko, Anirudh Acharya, Ankit Khedia, Sandeep Krishnamurthy, Cheng-Che Lee, Kalyanee Shriram Chendke, Vandana Kannan, Roshani Nagmote
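The time-memory tradeoff above can be reduced, in a toy form, to a memory-fit test. The real runtime weighs many more factors (RNN type, network structure, input characteristics); this sketch keeps only the activation-memory estimate, and the 2x safety margin is an assumption:

```python
def choose_execution_pattern(seq_len, hidden_size, batch_size,
                             free_memory_bytes, bytes_per_value=4):
    """Pick 'unrolled' (faster, more memory) when the activations of all
    unrolled time steps fit in free memory with a 2x margin, else
    'rolled' (slower, less memory)."""
    unrolled_bytes = 2 * seq_len * batch_size * hidden_size * bytes_per_value
    return "unrolled" if unrolled_bytes <= free_memory_bytes else "rolled"
```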
  • Patent number: 11765524
    Abstract: A hearing aid with a variable number of channels includes: a microphone that receives a sound signal; an AD converter that converts the sound signal input from the microphone into a digital signal and outputs the converted digital signal; a controller that determines a filter bank channel for processing the digital signal output from the AD converter; a buffer unit that delays the digital signal based on the determined filter bank channel; a signal processor that includes at least one filter bank channel, synthesizes the digital signal using the determined filter bank channel and outputs the synthesized digital signal; a DA converter that converts the digital signal into the sound signal and outputs the converted sound signal; and a speaker that outputs the sound signal output from the DA converter, in which the controller determines the filter bank channel for processing the digital signal based on a preset condition.
    Type: Grant
    Filed: May 11, 2023
    Date of Patent: September 19, 2023
    Assignee: Korea Photonics Technology Institute
    Inventors: Seon Man Kim, Kwang Hoon Lee
  • Patent number: 11756529
    Abstract: Proposed are a method and apparatus for speech recognition, and a storage medium. The specific solution includes: obtaining audio data to be recognized; decoding the audio data to obtain a first syllable of a to-be-converted word, in which the first syllable is a combination of at least one phoneme corresponding to the to-be-converted word; obtaining a sentence to which the to-be-converted word belongs and a converted word in the sentence, and obtaining a second syllable of the converted word; encoding the first syllable and the second syllable to generate first encoding information of the first syllable; and decoding the first encoding information to obtain a text corresponding to the to-be-converted word.
    Type: Grant
    Filed: December 16, 2020
    Date of Patent: September 12, 2023
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Liao Zhang, Xiaoyin Fu, Zhengxiang Jiang, Mingxin Liang, Junyao Shao, Qi Zhang, Zhijie Chen, Qiguang Zang
  • Patent number: 11749260
    Abstract: Disclosed is a method for speech recognition performed by one or more processors of a computing device. The method includes inputting voice information into an encoder to extract a first feature vector and calculating a first loss function. The method includes inputting the first feature vector extracted from the encoder to a first decoder to perform prediction on the voice information, calculating a second loss function, and extracting a second feature vector. The method includes inputting a second feature vector extracted from the first decoder to a second decoder to perform grapheme-based prediction, and calculating a third loss function. The method includes training at least one of the encoder, the first decoder, or the second decoder based on the first loss function, the second loss function, and the third loss function.
    Type: Grant
    Filed: September 23, 2022
    Date of Patent: September 5, 2023
    Assignee: ACTIONPOWER CORP.
    Inventors: Hwanbok Mun, Dongchan Shin, Gyujin Kim, Seongmin Park, Jihwa Lee
  • Patent number: 11748567
    Abstract: Described herein are embodiments of a framework named total correlation variational autoencoder (TC_VAE) to disentangle syntax and semantics by making use of total correlation penalties of KL divergences. One or more Kullback-Leibler (KL) divergence terms in a loss for a variational autoencoder are decomposed so that generated hidden variables may be separated. Embodiments of the TC_VAE framework were examined on semantic similarity tasks and syntactic similarity tasks. Experimental results show that better disentanglement between syntactic and semantic representations has been achieved compared with state-of-the-art (SOTA) results on the same data sets in similar settings.
    Type: Grant
    Filed: July 10, 2020
    Date of Patent: September 5, 2023
    Assignee: Baidu USA LLC
    Inventors: Dingcheng Li, Shaogang Ren, Ping Li
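For background, the "total correlation penalty" in such models usually refers to the middle term of a standard decomposition of the aggregate KL divergence (shown here as general context; this is not the patent's exact loss):

```latex
\mathbb{E}_{p(x)}\!\left[\mathrm{KL}\big(q(z \mid x)\,\|\,p(z)\big)\right]
  = I_q(x; z)
  + \underbrace{\mathrm{KL}\Big(q(z)\,\Big\|\,\textstyle\prod_j q(z_j)\Big)}_{\text{total correlation}}
  + \sum_j \mathrm{KL}\big(q(z_j)\,\|\,p(z_j)\big)
```

Penalizing the total correlation term encourages the latent dimensions $z_j$ to be statistically independent, which is what enables separating syntactic from semantic variables.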
  • Patent number: 11743210
    Abstract: The disclosed exemplary embodiments include computer-implemented apparatuses and processes that automatically populate deep-linked interfaces based on programmatically established chatbot sessions. For example, an apparatus may determine a candidate parameter value for a first parameter of an exchange of data based on received messaging information and on information characterizing prior exchanges of data between a device and the apparatus. The apparatus may also generate interface data that associates the candidate parameter value with a corresponding interface element of a first digital interface, and may store the interface data within a data repository. In some instances, the apparatus may transmit linking data associated with the stored interface data to the device, and an application program executed by the device may present a representation of the linking data within a second digital interface.
    Type: Grant
    Filed: April 3, 2020
    Date of Patent: August 29, 2023
    Assignee: The Toronto-Dominion Bank
    Inventors: Tae Gyun Moon, Robert Alexander McCarter, Kheiver Kayode Roberts
  • Patent number: 11727920
    Abstract: An RNN-T model includes a prediction network configured to, at each of a plurality of time steps subsequent to an initial time step, receive a sequence of non-blank symbols. For each non-blank symbol the prediction network is also configured to generate, using a shared embedding matrix, an embedding of the corresponding non-blank symbol, assign a respective position vector to the corresponding non-blank symbol, and weight the embedding proportional to a similarity between the embedding and the respective position vector. The prediction network is also configured to generate a single embedding vector at the corresponding time step. The RNN-T model also includes a joint network configured to, at each of the plurality of time steps subsequent to the initial time step, receive the single embedding vector generated as output from the prediction network at the corresponding time step and generate a probability distribution over possible speech recognition hypotheses.
    Type: Grant
    Filed: May 26, 2021
    Date of Patent: August 15, 2023
    Assignee: Google LLC
    Inventors: Rami Botros, Tara Sainath
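The prediction network's weighting step can be sketched as follows. Cosine similarity as the weighting and averaging as the combination are assumptions; the abstract says only that embeddings are weighted by similarity to their position vectors and combined into a single vector:

```python
import math

def _cos(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

def single_embedding_vector(symbol_embeddings, position_vectors):
    """Weight each non-blank symbol's embedding by its similarity to the
    position vector assigned to that history slot, then average the
    weighted embeddings into one vector for the joint network."""
    dim = len(symbol_embeddings[0])
    out = [0.0] * dim
    for emb, pos in zip(symbol_embeddings, position_vectors):
        w = _cos(emb, pos)
        for i in range(dim):
            out[i] += w * emb[i]
    return [x / len(symbol_embeddings) for x in out]
```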
  • Patent number: 11715461
    Abstract: Computer implemented method and system for automatic speech recognition. A first speech sequence is processed, using a time reduction operation of an encoder NN, into a second speech sequence comprising a second set of speech frame feature vectors that each concatenate information from a respective plurality of speech frame feature vectors included in the first set and includes fewer speech frame feature vectors than the first speech sequence. The second speech sequence is transformed, using a self-attention operation of the encoder NN, into a third speech sequence comprising a third set of speech frame feature vectors. The third speech sequence is processed using a probability operation of the encoder NN, to predict a sequence of first labels corresponding to the third set of speech frame feature vectors, and using a decoder NN to predict a sequence of second labels corresponding to the third set of speech frame feature vectors.
    Type: Grant
    Filed: October 21, 2020
    Date of Patent: August 1, 2023
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Md Akmal Haidar, Chao Xing
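The time reduction operation, concatenating groups of consecutive frame feature vectors into fewer, longer vectors, is easy to sketch. Dropping an incomplete trailing group is an assumption here (padding is another common choice):

```python
def time_reduce(frames, factor=2):
    """Concatenate every `factor` consecutive speech-frame feature
    vectors into one vector, shortening the sequence by that factor.
    Trailing frames that do not fill a full group are dropped."""
    reduced = []
    for i in range(0, len(frames) - factor + 1, factor):
        merged = []
        for frame in frames[i:i + factor]:
            merged.extend(frame)
        reduced.append(merged)
    return reduced
```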
  • Patent number: 11685395
    Abstract: An autonomous driving assistance device includes: a determination unit for determining whether or not a driver of a vehicle needs a rest on the basis of detection information of a state of the driver; and a control unit for causing an output device of the vehicle to output a pattern that prompts the driver to sleep through at least one of sight, hearing, or touch during a period in which the vehicle has started moving to a parking area and is parked at the parking area in a case where the determination unit has determined that the driver needs a rest.
    Type: Grant
    Filed: November 18, 2021
    Date of Patent: June 27, 2023
    Assignee: MITSUBISHI ELECTRIC CORPORATION
    Inventors: Misato Yuasa, Shinsaku Fukutaka, Munetaka Nishihira, Akiko Imaishi, Tsuyoshi Sempuku
  • Patent number: 11676008
    Abstract: The present disclosure provides systems and methods that enable parameter-efficient transfer learning, multi-task learning, and/or other forms of model re-purposing such as model personalization or domain adaptation. In particular, as one example, a computing system can obtain a machine-learned model that has been previously trained on a first training dataset to perform a first task. The machine-learned model can include a first set of learnable parameters. The computing system can modify the machine-learned model to include a model patch, where the model patch includes a second set of learnable parameters. The computing system can train the machine-learned model on a second training dataset to perform a second task that is different from the first task, which may include learning new values for the second set of learnable parameters included in the model patch while keeping at least some (e.g., all) of the first set of parameters fixed.
    Type: Grant
    Filed: September 20, 2019
    Date of Patent: June 13, 2023
    Assignee: GOOGLE LLC
    Inventors: Mark Sandler, Andrey Zhmoginov, Andrew Gerald Howard, Pramod Kaushik Mudrakarta
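The core of the model-patch idea, updating only the patch's parameters while the base parameters stay fixed, can be sketched as a gradient step restricted to the patch. The dict-of-floats parameter representation is a simplification for illustration:

```python
def patch_train_step(base_params, patch_params, grads, lr=0.1):
    """One fine-tuning update: gradients are applied only to the
    model-patch parameters; the first (base) parameter set is frozen.

    grads: dict mapping parameter names to gradient values.
    Returns (unchanged_base_params, updated_patch_params)."""
    new_patch = {name: value - lr * grads.get(name, 0.0)
                 for name, value in patch_params.items()}
    return dict(base_params), new_patch
```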
  • Patent number: 11676026
    Abstract: Computer-implemented, machine-learning systems and methods relate to a neural network having at least two subnetworks, i.e., a first subnetwork and a second subnetwork. The systems and methods estimate the partial derivative(s) of an objective with respect to (i) an output activation of a node in the first subnetwork, (ii) the input to the node, and/or (iii) the connection weights to the node. The estimated partial derivative(s) are stored in a data store and provided as input to the second subnetwork. Because the estimated partial derivative(s) are persisted in a data store, the second subnetwork has access to them even after the second subnetwork has gone through subsequent training iterations. Using this information, the second subnetwork can compute classifications and regression functions that can help, for example, in the training of the first subnetwork.
    Type: Grant
    Filed: June 4, 2019
    Date of Patent: June 13, 2023
    Assignee: D5AI LLC
    Inventor: James K. Baker
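One simple way to estimate the partial derivatives that get persisted for the second subnetwork is central finite differences. This is an illustrative estimator only; the patent does not commit to a specific estimation method:

```python
def estimate_partials(objective, inputs, eps=1e-6):
    """Central finite-difference estimate of d(objective)/d(inputs[i])
    for each i. The returned values can be stored and later fed to a
    second subnetwork as ordinary input features."""
    grads = []
    for i in range(len(inputs)):
        up = list(inputs)
        up[i] += eps
        dn = list(inputs)
        dn[i] -= eps
        grads.append((objective(up) - objective(dn)) / (2 * eps))
    return grads
```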
  • Patent number: 11676625
    Abstract: A method for training an endpointer model includes obtaining training data that includes short-form speech utterances and long-form speech utterances. The method also includes providing a short-form speech utterance as input to a shared neural network, the shared neural network configured to learn shared hidden representations suitable for both voice activity detection (VAD) and end-of-query (EOQ) detection. The method also includes generating, using a VAD classifier, a sequence of predicted VAD labels and determining a VAD loss by comparing the sequence of predicted VAD labels to a corresponding sequence of reference VAD labels. The method also includes generating, using an EOQ classifier, a sequence of predicted EOQ labels and determining an EOQ loss by comparing the sequence of predicted EOQ labels to a corresponding sequence of reference EOQ labels. The method also includes training, using a cross-entropy criterion, the endpointer model based on the VAD loss and the EOQ loss.
    Type: Grant
    Filed: January 20, 2021
    Date of Patent: June 13, 2023
    Assignee: Google LLC
    Inventors: Shuo-Yiin Chang, Bo Li, Gabor Simko, Maria Carolina Parada San Martin, Sean Matthew Shannon
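The two-task loss can be sketched as per-label binary cross-entropy for VAD and EOQ, summed. Equal weighting of the two losses is an assumption; the abstract does not say how they are combined:

```python
import math

def _binary_ce(pred, label, eps=1e-12):
    """Cross-entropy between a predicted probability and a 0/1 label."""
    return -(label * math.log(pred + eps) +
             (1 - label) * math.log(1 - pred + eps))

def endpointer_loss(vad_preds, vad_refs, eoq_preds, eoq_refs):
    """VAD loss plus EOQ loss, each the mean cross-entropy between a
    sequence of predicted labels and its reference label sequence."""
    vad_loss = sum(_binary_ce(p, r)
                   for p, r in zip(vad_preds, vad_refs)) / len(vad_refs)
    eoq_loss = sum(_binary_ce(p, r)
                   for p, r in zip(eoq_preds, eoq_refs)) / len(eoq_refs)
    return vad_loss + eoq_loss
```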
  • Patent number: 11672472
    Abstract: Provided herein is a method and system for the estimation of the apnea-hypopnea index (AHI), as an indicator of obstructive sleep apnea (OSA) severity, by combining speech descriptors from three separate and distinct speech signal domains. These domains include the acoustic short-term features (STF) of continuous speech, the long-term features (LTF) of continuous speech, and features of sustained vowels (SVF). Combining these speech descriptors may provide the ability to estimate the severity of OSA using statistical learning and speech analysis approaches.
    Type: Grant
    Filed: July 10, 2017
    Date of Patent: June 13, 2023
    Assignees: B.G. NEGEV TECHNOLOGIES AND APPLICATIONS LTD., AT BEN-GURION UNIVERSITY, MOR RESEARCH APPLICATIONS LTD.
    Inventors: Yaniv Zigel, Dvir Ben Or, Ariel Tarasiuk, Eliran Dafna