Patents by Inventor Zhao SONG
Zhao SONG has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12288237
Abstract: Embodiments provide systems, methods, and computer storage media for a Nonsymmetric Determinantal Point Process (NDPP) for compatible set recommendations in a setting where data representing entities (e.g., items) arrives in a stream. A stream representing compatible sets of entities is received and used to update a latent representation of the entities and a compatibility distribution indicating likelihood of compatibility of subsets of the entities. The probability distribution is accessed in a single sequential pass to predict a compatible complete set of entities that completes an incomplete set of entities. The predicted complete compatible set is provided as a recommendation for entities that complete the incomplete set of entities.
Type: Grant
Filed: May 12, 2022
Date of Patent: April 29, 2025
Assignee: Adobe Inc.
Inventors: Ryan A. Rossi, Aravind Reddy Talla, Zhao Song, Anup Rao, Tung Mai, Nedim Lipka, Gang Wu, Eunyee Koh
-
Patent number: 12219180
Abstract: Embodiments described herein provide methods and systems for facilitating actively-learned context modeling. In one embodiment, a subset of data is selected from a training dataset corresponding with an image to be compressed, the subset of data corresponding with a subset of data of pixels of the image. A context model is generated using the selected subset of data. The context model is generally in the form of a decision tree having a set of leaf nodes. Entropy values corresponding with each leaf node of the set of leaf nodes are determined. Each entropy value indicates an extent of diversity of context associated with the corresponding leaf node. Additional data from the training dataset is selected based on the entropy values corresponding with the leaf nodes. The updated subset of data is used to generate an updated context model for use in performing compression of the image.
Type: Grant
Filed: May 20, 2022
Date of Patent: February 4, 2025
Assignee: Adobe Inc.
Inventors: Gang Wu, Yang Li, Stefano Petrangeli, Viswanathan Swaminathan, Haoliang Wang, Ryan A. Rossi, Zhao Song
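The entropy step in the abstract above can be illustrated with a minimal sketch. It is not taken from the patent: it assumes Shannon entropy over the context values that land in each leaf, and the names `leaf_entropy`, `rank_leaves_by_entropy`, and the toy `leaves` data are hypothetical.

```python
import math
from collections import Counter

def leaf_entropy(contexts):
    """Shannon entropy (in bits) of the context values assigned to one leaf."""
    counts = Counter(contexts)
    total = len(contexts)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def rank_leaves_by_entropy(leaves):
    """Return leaf ids ordered from most diverse (highest entropy) to least."""
    return sorted(leaves, key=lambda leaf_id: leaf_entropy(leaves[leaf_id]), reverse=True)

# Hypothetical toy data: leaf id -> context values observed in that leaf.
leaves = {
    "leaf_a": [0, 0, 0, 0],  # one context only, no diversity
    "leaf_b": [0, 1, 0, 1],  # two equally likely contexts
    "leaf_c": [0, 1, 2, 3],  # four equally likely contexts
}
ranking = rank_leaves_by_entropy(leaves)
```

Under this assumption, additional training samples would be drawn first for the leaves at the front of `ranking`, since those are the ones whose contexts the current tree separates least well.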
-
Patent number: 12130788
Abstract: An anomalous period of operation of a database management system is detected by analyzing a time series of data points indicating the number of database queries pending processing by the system. Conditions associated with execution of the pending database queries are recorded and analyzed to identify conditions correlated with the anomalous period of operation. A recommendation for tuning the database is generated based on analysis of the conditions.
Type: Grant
Filed: September 30, 2021
Date of Patent: October 29, 2024
Assignee: Amazon Technologies, Inc.
Inventors: Vikramank Yogendra Singh, Zhao Song, Balakrishnan Narayanaswamy, Maxym Kharchenko, Jeremiah C Wilton, Vijay Gopal Joshi, Joshua Tobey Oberwetter, Kyle Henderson Hailey
-
Publication number: 20240273378
Abstract: Systems and methods for distributed machine learning are provided. According to one aspect, a method for distributed machine learning includes obtaining, by an edge device, a static machine learning model from a hub device, computing, by the edge device, an objective function for a dynamic machine learning model based on a relationship between the dynamic machine learning model and the static machine learning model, and updating, by the edge device, the dynamic machine learning model based on the objective function.
Type: Application
Filed: February 2, 2023
Publication date: August 15, 2024
Inventors: Saayan Mitra, Arash Givchi, Xiang Chen, Somdeb Sarkhel, Ryan A. Rossi, Zhao Song
-
Patent number: 12047273
Abstract: A control system facilitates active management of a streaming data system. Given historical data traffic for each data stream processed by a streaming data system, the control system uses a machine learning model to predict future data traffic for each data stream. The control system selects a matching between data streams and servers for a future time that minimizes a cost comprising a switching cost and a server imbalance cost based on the predicted data traffic for the future time. In some configurations, the matching is selected using a planning window comprising a number of future time steps dynamically selected based on uncertainty associated with the predicted data traffic. Given the selected matching, the control system may manage the streaming data system by causing data streams to be moved between servers based on the matching.
Type: Grant
Filed: February 14, 2022
Date of Patent: July 23, 2024
Assignee: ADOBE INC.
Inventors: Georgios Theocharous, Kai Wang, Zhao Song, Sridhar Mahadevan
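To make the cost in the abstract above concrete, here is a small sketch of choosing a stream-to-server matching for a single future time step. It is an illustration under stated assumptions, not the patented method: the switching cost is assumed to be a fixed penalty per moved stream, the imbalance cost is assumed to be the gap between the most and least loaded server, and the search is brute force. All function names and the example data are hypothetical.

```python
from itertools import product

def matching_cost(assignment, current, predicted_traffic, num_servers, switch_penalty=1.0):
    """Switching cost plus server imbalance cost for one candidate matching."""
    # Assumed switching cost: a fixed penalty for every stream that changes server.
    switching = switch_penalty * sum(1 for s in assignment if assignment[s] != current[s])
    # Assumed imbalance cost: spread between the heaviest and lightest server load.
    load = [0.0] * num_servers
    for stream, server in assignment.items():
        load[server] += predicted_traffic[stream]
    return switching + (max(load) - min(load))

def select_matching(current, predicted_traffic, num_servers, switch_penalty=1.0):
    """Exhaustively search all matchings; only practical for tiny examples."""
    streams = sorted(current)
    best, best_cost = None, float("inf")
    for servers in product(range(num_servers), repeat=len(streams)):
        candidate = dict(zip(streams, servers))
        cost = matching_cost(candidate, current, predicted_traffic, num_servers, switch_penalty)
        if cost < best_cost:
            best, best_cost = candidate, cost
    return best, best_cost

# Hypothetical example: two heavy streams share server 0, one light stream is on server 1.
current = {"a": 0, "b": 0, "c": 1}
predicted = {"a": 5.0, "b": 5.0, "c": 1.0}
best, best_cost = select_matching(current, predicted, num_servers=2)
```

In this toy case, moving one of the heavy streams to server 1 costs one switch but reduces the imbalance from 9 to 1, so it beats leaving the matching unchanged. The planning window and the traffic prediction model mentioned in the abstract are outside the scope of this sketch.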
-
Publication number: 20240152799
Abstract: Systems and methods for data augmentation are described. Embodiments of the present disclosure receive a dataset that includes a plurality of nodes and a plurality of edges, wherein each of the plurality of edges connects two of the plurality of nodes; compute a first nonnegative matrix representing a homophilous cluster affinity; compute a second nonnegative matrix representing a heterophilous cluster affinity; compute a probability of an additional edge based on the dataset using a machine learning model that represents a homophilous cluster and a heterophilous cluster based on the first nonnegative matrix and the second nonnegative matrix; and generate an augmented dataset including the plurality of nodes, the plurality of edges, and the additional edge.
Type: Application
Filed: October 31, 2022
Publication date: May 9, 2024
Inventors: Sudhanshu Chanpuriya, Ryan A. Rossi, Nedim Lipka, Anup Bandigadi Rao, Tung Mai, Zhao Song
-
Publication number: 20240144307
Abstract: One aspect of systems and methods for segment size estimation includes identifying a segment of users for a first time period based on time series data, wherein the time series data includes a series of interactions between users and a content channel and wherein the segment includes a portion of the users interacting with the content channel during the first time period; computing a segment return value for a second time period based on the time series data by computing a first subset and a second subset of the segment, wherein the first subset includes users that interact with the content channel greater than a threshold number of times during a range of the time series data and the second subset comprises a complement of the first subset with respect to the segment; and providing customized content to a user in the segment based on the segment return value.
Type: Application
Filed: October 18, 2022
Publication date: May 2, 2024
Inventors: Tung Mai, Ritwik Sinha, Trevor Hyrum Paulsen, Xiang Chen, William Brandon George, Nate Purser, Zhao Song
-
Patent number: 11875809
Abstract: Developed and presented herein are embodiments of a new end-to-end approach for audio denoising, from a synthesis perspective. Instead of explicitly modelling the noise component in the input signal, embodiments directly synthesize the denoised audio from a generative model (or vocoder), as in text-to-speech systems. In one or more embodiments, to generate the phonetic contents for the autoregressive generative model, it is learned via a variational autoencoder with discrete latent representations. Furthermore, in one or more embodiments, a new matching loss is presented for the denoising purpose, which is masked on when the corresponding latent codes differ. As compared against other methods on test datasets, embodiments achieve competitive performance and can be trained from scratch.
Type: Grant
Filed: October 1, 2020
Date of Patent: January 16, 2024
Assignee: Baidu USA LLC
Inventors: Zhao Song, Wei Ping
-
Publication number: 20230379507
Abstract: Embodiments described herein provide methods and systems for facilitating actively-learned context modeling. In one embodiment, a subset of data is selected from a training dataset corresponding with an image to be compressed, the subset of data corresponding with a subset of data of pixels of the image. A context model is generated using the selected subset of data. The context model is generally in the form of a decision tree having a set of leaf nodes. Entropy values corresponding with each leaf node of the set of leaf nodes are determined. Each entropy value indicates an extent of diversity of context associated with the corresponding leaf node. Additional data from the training dataset is selected based on the entropy values corresponding with the leaf nodes. The updated subset of data is used to generate an updated context model for use in performing compression of the image.
Type: Application
Filed: May 20, 2022
Publication date: November 23, 2023
Inventors: Gang Wu, Yang Li, Stefano Petrangeli, Viswanathan Swaminathan, Haoliang Wang, Ryan A. Rossi, Zhao Song
-
Publication number: 20230368265
Abstract: Embodiments provide systems, methods, and computer storage media for a Nonsymmetric Determinantal Point Process (NDPP) for compatible set recommendations in a setting where data representing entities (e.g., items) arrives in a stream. A stream representing compatible sets of entities is received and used to update a latent representation of the entities and a compatibility distribution indicating likelihood of compatibility of subsets of the entities. The probability distribution is accessed in a single sequential pass to predict a compatible complete set of entities that completes an incomplete set of entities. The predicted complete compatible set is provided as a recommendation for entities that complete the incomplete set of entities.
Type: Application
Filed: May 12, 2022
Publication date: November 16, 2023
Inventors: Ryan A. Rossi, Aravind Reddy Talla, Zhao Song, Anup Rao, Tung Mai, Nedim Lipka, Gang Wu, Anup Rao
-
Publication number: 20230298189
Abstract: The present application is applicable to the technical field of computer vision, and provides a method for reconstructing a three-dimensional object combining structured light and photometry and a terminal device, wherein the method comprises: acquiring N first images, wherein each first image is obtained by shooting after a coded pattern having a coding stripe sequence is projected to a three-dimensional object, and N is a positive integer; determining structured light depth information of the three-dimensional object based on the N first images; acquiring M second images, wherein the M second images are obtained by shooting after P light sources are respectively projected to the three-dimensional object from different directions, and M and P are positive integers; determining photometric information of the three-dimensional object based on the M second images; and reconstructing the three-dimensional object based on the structured light depth information and the photometric information.
Type: Application
Filed: November 17, 2020
Publication date: September 21, 2023
Applicant: SHENZHEN INSTITUTES OF ADVANCED TECHNOLOGY CHINESE ACADEMY OF SCIENCES
Inventors: Zhan SONG, Zhao SONG
-
Publication number: 20230289473
Abstract: According to various embodiments, a method for encrypting image data for a neural network is disclosed. The method includes mixing the image data with other datapoints to form mixed data; and applying a pixel-wise random mask to the mixed data to form encrypted data. According to various embodiments, a method for encrypting text data for a neural network for natural language processing is disclosed. The method includes encoding each text datapoint via a pretrained text encoder to form encoded datapoints; mixing the encoded datapoints with other encoded datapoints to form mixed data; applying a random mask to the mixed data to form encrypted data; and incorporating the encrypted data into training a classifier of the neural network and fine-tuning the text encoder.
Type: Application
Filed: June 17, 2021
Publication date: September 14, 2023
Applicant: The Trustees of Princeton University
Inventors: Sanjeev ARORA, Kai LI, Yangsibo HUANG, Zhao SONG, Danqi CHEN
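The two image-encryption steps named in the abstract above (mix, then mask) can be sketched as follows. This is a hedged illustration only: the abstract does not specify how the mixing weights or the mask are drawn, so the sketch assumes random weights that sum to 1 and a mask of random signs (+1 or -1) per pixel. Images are assumed to be flattened lists of floats, and `mix_and_mask` is a hypothetical name.

```python
import random

def mix_and_mask(private_image, other_images, rng):
    """Mix one image with others, then apply a pixel-wise random sign mask."""
    images = [private_image] + other_images
    # Assumption: mixing weights are random and normalized to sum to 1.
    raw = [rng.random() for _ in images]
    total = sum(raw)
    weights = [w / total for w in raw]
    # Weighted pixel-by-pixel combination of all images.
    mixed = [
        sum(w * img[i] for w, img in zip(weights, images))
        for i in range(len(private_image))
    ]
    # Assumption: the pixel-wise mask flips the sign of each pixel at random.
    mask = [rng.choice((-1, 1)) for _ in mixed]
    encrypted = [m * p for m, p in zip(mask, mixed)]
    return encrypted, mixed, mask

# Hypothetical 4-pixel images, flattened to lists.
rng = random.Random(0)
private = [0.2, 0.4, 0.6, 0.8]
others = [[0.9, 0.1, 0.5, 0.3], [0.0, 1.0, 0.7, 0.2]]
encrypted, mixed, mask = mix_and_mask(private, others, rng)
```

The function returns `mixed` and `mask` only so the example can be inspected; in a real pipeline those intermediate values would presumably be discarded and only `encrypted` would be shared for training.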
-
Publication number: 20230281680
Abstract: Systems and methods for resource allocation are described. The systems and methods include receiving utilization data for computing resources shared by a plurality of users, updating a pricing agent using a reinforcement learning model based on the utilization data, identifying resource pricing information using the pricing agent, and allocating the computing resources to the plurality of users based on the resource pricing information.
Type: Application
Filed: March 1, 2022
Publication date: September 7, 2023
Inventors: Michail Mamakos, Sridhar Mahadevan, Viswanathan Swaminathan, Mariette Philippe Souppe, Ritwik Sinha, Saayan Mitra, Zhao Song
-
Publication number: 20230261966
Abstract: A control system facilitates active management of a streaming data system. Given historical data traffic for each data stream processed by a streaming data system, the control system uses a machine learning model to predict future data traffic for each data stream. The control system selects a matching between data streams and servers for a future time that minimizes a cost comprising a switching cost and a server imbalance cost based on the predicted data traffic for the future time. In some configurations, the matching is selected using a planning window comprising a number of future time steps dynamically selected based on uncertainty associated with the predicted data traffic. Given the selected matching, the control system may manage the streaming data system by causing data streams to be moved between servers based on the matching.
Type: Application
Filed: February 14, 2022
Publication date: August 17, 2023
Inventors: Georgios Theocharous, Kai Wang, Zhao Song, Sridhar Mahadevan
-
Patent number: 11521592
Abstract: WaveFlow is a small-footprint generative flow for raw audio, which may be directly trained with maximum likelihood. WaveFlow handles the long-range structure of waveform with a dilated two-dimensional (2D) convolutional architecture, while modeling the local variations using expressive autoregressive functions. WaveFlow may provide a unified view of likelihood-based models for raw audio, including WaveNet and WaveGlow, which may be considered special cases. It generates high-fidelity speech, while synthesizing several orders of magnitude faster than existing systems since it uses only a few sequential steps to generate relatively long waveforms. WaveFlow significantly reduces the likelihood gap that has existed between autoregressive models and flow-based models for efficient synthesis. Its small footprint with 5.91M parameters makes it 15 times smaller than some existing models. WaveFlow can generate 22.05 kHz high-fidelity audio 42.
Type: Grant
Filed: August 5, 2020
Date of Patent: December 6, 2022
Assignee: Baidu USA LLC
Inventors: Wei Ping, Kainan Peng, Kexin Zhao, Zhao Song
-
Publication number: 20220108712
Abstract: Developed and presented herein are embodiments of a new end-to-end approach for audio denoising, from a synthesis perspective. Instead of explicitly modelling the noise component in the input signal, embodiments directly synthesize the denoised audio from a generative model (or vocoder), as in text-to-speech systems. In one or more embodiments, to generate the phonetic contents for the autoregressive generative model, it is learned via a variational autoencoder with discrete latent representations. Furthermore, in one or more embodiments, a new matching loss is presented for the denoising purpose, which is masked on when the corresponding latent codes differ. As compared against other methods on test datasets, embodiments achieve competitive performance and can be trained from scratch.
Type: Application
Filed: October 1, 2020
Publication date: April 7, 2022
Applicant: Baidu USA LLC
Inventors: Zhao SONG, Wei PING
-
Patent number: 11017761
Abstract: Presented herein are embodiments of a non-autoregressive sequence-to-sequence model that converts text to an audio representation. Embodiments are fully convolutional, and a tested embodiment obtained about 46.7 times speed-up over a prior model at synthesis while maintaining comparable speech quality using a WaveNet vocoder. Interestingly, a tested embodiment also has fewer attention errors than the autoregressive model on challenging test sentences. In one or more embodiments, the first fully parallel neural text-to-speech system was built by applying the inverse autoregressive flow (IAF) as the parallel neural vocoder. System embodiments can synthesize speech from text through a single feed-forward pass. Also disclosed herein are embodiments of a novel approach to train the IAF from scratch as a generative model for raw waveform, which avoids the need for distillation from a separately trained WaveNet.
Type: Grant
Filed: October 16, 2019
Date of Patent: May 25, 2021
Assignee: Baidu USA LLC
Inventors: Kainan Peng, Wei Ping, Zhao Song, Kexin Zhao
-
Publication number: 20210090547
Abstract: WaveFlow is a small-footprint generative flow for raw audio, which may be directly trained with maximum likelihood. WaveFlow handles the long-range structure of waveform with a dilated two-dimensional (2D) convolutional architecture, while modeling the local variations using expressive autoregressive functions. WaveFlow may provide a unified view of likelihood-based models for raw audio, including WaveNet and WaveGlow, which may be considered special cases. It generates high-fidelity speech, while synthesizing several orders of magnitude faster than existing systems since it uses only a few sequential steps to generate relatively long waveforms. WaveFlow significantly reduces the likelihood gap that has existed between autoregressive models and flow-based models for efficient synthesis. Its small footprint with 5.91M parameters makes it 15 times smaller than some existing models. WaveFlow can generate 22.05 kHz high-fidelity audio 42.
Type: Application
Filed: August 5, 2020
Publication date: March 25, 2021
Applicant: Baidu USA LLC
Inventors: Wei PING, Kainan PENG, Kexin ZHAO, Zhao SONG
-
Publication number: 20200066253
Abstract: Presented herein are embodiments of a non-autoregressive sequence-to-sequence model that converts text to an audio representation. Embodiments are fully convolutional, and a tested embodiment obtained about 46.7 times speed-up over a prior model at synthesis while maintaining comparable speech quality using a WaveNet vocoder. Interestingly, a tested embodiment also has fewer attention errors than the autoregressive model on challenging test sentences. In one or more embodiments, the first fully parallel neural text-to-speech system was built by applying the inverse autoregressive flow (IAF) as the parallel neural vocoder. System embodiments can synthesize speech from text through a single feed-forward pass. Also disclosed herein are embodiments of a novel approach to train the IAF from scratch as a generative model for raw waveform, which avoids the need for distillation from a separately trained WaveNet.
Type: Application
Filed: October 16, 2019
Publication date: February 27, 2020
Applicant: Baidu USA LLC
Inventors: Kainan PENG, Wei PING, Zhao SONG, Kexin ZHAO
-
Publication number: 20200042872
Abstract: A parameter estimation unit 81 estimates parameters of a neural network model that maximize the lower limit of a log marginal likelihood related to observation value data and hidden layer nodes. A variational probability estimation unit 82 estimates parameters of the variational probability of nodes that maximize the lower limit of the log marginal likelihood. A node deletion determination unit 83 determines nodes to be deleted on the basis of the variational probability of which the parameters have been estimated, and deletes nodes determined to correspond to the nodes to be deleted. A convergence determination unit 84 determines the convergence of the neural network model on the basis of the change in the variational probability.
Type: Application
Filed: August 16, 2017
Publication date: February 6, 2020
Applicant: NEC CORPORATION
Inventors: Yusuke MURAOKA, Ryohei FUJIMAKI, Zhao SONG