Patents by Inventor Tung Mai
Tung Mai has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11899693
Abstract: A cluster generation system identifies data elements, from a first binary record, that each have a particular value and correspond to respective binary traits. A candidate description function describing the binary traits is generated, the candidate description function including a model factor that describes the data elements. Responsive to determining that a second record has additional data elements having the particular value and corresponding to the respective binary traits, the candidate description function is modified to indicate that the model factor describes the additional elements. The candidate description function is also modified to include a correction factor describing an additional binary trait excluded from the respective binary traits. Based on the modified candidate description function, the cluster generation system generates a data summary cluster, which includes a compact representation of the binary traits of the data elements and additional data elements.
Type: Grant
Filed: February 22, 2022
Date of Patent: February 13, 2024
Assignee: Adobe Inc.
Inventors: Yeuk-yin Chan, Tung Mai, Ryan Rossi, Moumita Sinha, Matvey Kapilevich, Margarita Savova, Fan Du, Charles Menguy, Anup Rao
-
Publication number: 20230401246
Abstract: Embodiments provide systems, methods, and computer storage media for determining string similarity and pattern matching in strings that arrive in a stream. A stream representing a string of characters is received and used to compute mapping values that are compared to a mapping value of a query string to identify a match between strings in the stream of characters and the query string. The stream of characters is searched in a single sequential pass to detect a match or the longest matching substring with a query string. An identified match or absence of a match is provided.
Type: Application
Filed: June 8, 2022
Publication date: December 14, 2023
Inventors: Tung Mai, Ryan A. Rossi, Anup Rao
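The single-pass comparison of mapping values described in this abstract can be illustrated with a rolling polynomial hash. This is only a minimal sketch under assumptions: the abstract does not say which mapping function is used, so `poly_hash`, `BASE`, `MOD`, and `stream_matches` are hypothetical names, and the longest-matching-substring case is not covered.

```python
from collections import deque
from typing import Iterable, Iterator

BASE = 257            # assumed polynomial base (illustrative choice)
MOD = (1 << 61) - 1   # assumed large prime modulus (illustrative choice)


def poly_hash(s: str) -> int:
    """Mapping value of a whole string: a polynomial hash modulo MOD."""
    h = 0
    for ch in s:
        h = (h * BASE + ord(ch)) % MOD
    return h


def stream_matches(stream: Iterable[str], query: str) -> Iterator[int]:
    """Yield start offsets where `query` occurs, reading `stream` once."""
    m = len(query)
    if m == 0:
        return
    target = poly_hash(query)
    top = pow(BASE, m - 1, MOD)   # weight of the character leaving the window
    window = deque()              # last m characters seen
    h = 0                         # mapping value of the current window
    for i, ch in enumerate(stream):
        if len(window) == m:
            old = window.popleft()
            h = (h - ord(old) * top) % MOD
        window.append(ch)
        h = (h * BASE + ord(ch)) % MOD
        # Compare mapping values first; confirm on the characters to rule out collisions.
        if len(window) == m and h == target and "".join(window) == query:
            yield i - m + 1
```

Only the last `len(query)` characters are held in memory, so the stream can be arbitrarily long; an empty result list corresponds to reporting the absence of a match.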
-
Publication number: 20230368265
Abstract: Embodiments provide systems, methods, and computer storage media for Nonsymmetric Determinantal Point Processes (NDPPs) for compatible set recommendations in a setting where data representing entities (e.g., items) arrives in a stream. A stream representing compatible sets of entities is received and used to update a latent representation of the entities and a compatibility distribution indicating likelihood of compatibility of subsets of the entities. The probability distribution is accessed in a single sequential pass to predict a compatible complete set of entities that completes an incomplete set of entities. The predicted complete compatible set is provided as a recommendation for entities that complete the incomplete set of entities.
Type: Application
Filed: May 12, 2022
Publication date: November 16, 2023
Inventors: Ryan A. Rossi, Aravind Reddy Talla, Zhao Song, Anup Rao, Tung Mai, Nedim Lipka, Gang Wu, Anup Rao
-
Publication number: 20230267132
Abstract: A cluster generation system identifies data elements, from a first binary record, that each have a particular value and correspond to respective binary traits. A candidate description function describing the binary traits is generated, the candidate description function including a model factor that describes the data elements. Responsive to determining that a second record has additional data elements having the particular value and corresponding to the respective binary traits, the candidate description function is modified to indicate that the model factor describes the additional elements. The candidate description function is also modified to include a correction factor describing an additional binary trait excluded from the respective binary traits. Based on the modified candidate description function, the cluster generation system generates a data summary cluster, which includes a compact representation of the binary traits of the data elements and additional data elements.
Type: Application
Filed: February 22, 2022
Publication date: August 24, 2023
Inventors: Yeuk-yin Chan, Tung Mai, Ryan Rossi, Moumita Sinha, Matvey Kapilevich, Margarita Savova, Fan Du, Charles Menguy, Anup Rao
-
Patent number: 11720592
Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media that estimate the overlap between sets of data samples. In particular, in one or more embodiments, the disclosed systems utilize a sketch-based sampling routine and a flexible, accurate estimator to determine the overlap (e.g., the intersection) between sets of data samples. For example, in some implementations, the disclosed systems generate a sketch vector, such as a one permutation hashing vector, for each set of data samples. The disclosed systems further compare the sketch vectors to determine an equal bin similarity estimator, a lesser bin similarity estimator, and a greater bin similarity estimator. The disclosed systems utilize one or more of the determined similarity estimators in generating an overlap estimation for the sets of data samples.
Type: Grant
Filed: August 10, 2022
Date of Patent: August 8, 2023
Assignee: Adobe Inc.
Inventors: Anup Rao, Tung Mai, Matvey Kapilevich
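For orientation, the following sketch shows one permutation hashing with the standard equal-bin Jaccard estimate, converted to an overlap size. It is an assumption-laden illustration rather than the patented estimator: the abstract does not define its lesser-bin and greater-bin estimators, so they are not reproduced, and the names `oph_sketch`, `estimate_overlap`, and the bin count `k` are hypothetical.

```python
import hashlib
from typing import Iterable, List, Optional


def _hash64(item: str) -> int:
    """64-bit hash standing in for the single random permutation (assumption)."""
    return int.from_bytes(hashlib.blake2b(item.encode(), digest_size=8).digest(), "big")


def oph_sketch(items: Iterable[str], k: int = 256) -> List[Optional[int]]:
    """One permutation hashing: split the hash space into k bins, keep each bin's minimum."""
    sketch: List[Optional[int]] = [None] * k
    for item in items:
        h = _hash64(item)
        b, v = h % k, h // k          # bin index and value within the bin
        if sketch[b] is None or v < sketch[b]:
            sketch[b] = v
    return sketch


def estimate_overlap(sa: List[Optional[int]], sb: List[Optional[int]],
                     size_a: int, size_b: int) -> float:
    """Estimate |A ∩ B| from two sketches and the known set sizes."""
    k = len(sa)
    both_empty = sum(1 for x, y in zip(sa, sb) if x is None and y is None)
    equal = sum(1 for x, y in zip(sa, sb) if x is not None and x == y)
    if k == both_empty:
        return 0.0
    jaccard = equal / (k - both_empty)          # equal-bin Jaccard estimate
    # |A ∩ B| = J / (1 + J) * (|A| + |B|), since |A ∪ B| = |A| + |B| - |A ∩ B|.
    return jaccard / (1 + jaccard) * (size_a + size_b)
```

Each set is hashed once regardless of `k`, which is the main cost advantage over computing `k` independent minwise hashes.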
-
Publication number: 20230153338
Abstract: A search system facilitates efficient and fast near neighbor search given item vector representations of items, regardless of item type or corpus size. To index an item, the search system expands an item vector for the item to generate an expanded item vector and selects elements of the expanded item vector. The item is indexed by storing an identifier of the item in posting lists of an index corresponding to the position of each selected element in the expanded item vector. When a query is received, a query vector for the item is expanded to generate an expanded query vector, and elements of the expanded query vector are selected. Candidate items are identified based on posting lists corresponding to the position of each selected element in the expanded query vector. The candidate items may be ranked, and a result set is returned as a response to the query.
Type: Application
Filed: November 15, 2021
Publication date: May 18, 2023
Inventors: Tung Mai, Saayan Mitra, Ryan A. Rossi, Gaurav Gupta, Anup Rao, Xiang Chen
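The expand, select, and post steps in this abstract can be sketched as follows. The abstract does not specify how vectors are expanded or how elements are selected, so this example assumes a fixed random projection for expansion and top-valued positions for selection; `SparseExpansionIndex` and its parameters are hypothetical.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Set


class SparseExpansionIndex:
    """Toy near-neighbor index: expand a vector, select positions, post the item id."""

    def __init__(self, dim: int, expanded_dim: int = 64, top: int = 4, seed: int = 0):
        rng = random.Random(seed)
        # Assumed expansion: a fixed Gaussian random projection to a larger dimension.
        self.proj = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(expanded_dim)]
        self.top = top
        self.postings: Dict[int, Set[Hashable]] = defaultdict(set)
        self.vectors: Dict[Hashable, Sequence[float]] = {}

    def _selected_positions(self, vec: Sequence[float]) -> List[int]:
        expanded = [sum(w * x for w, x in zip(row, vec)) for row in self.proj]
        order = sorted(range(len(expanded)), key=expanded.__getitem__, reverse=True)
        return order[: self.top]      # assumed selection rule: largest elements

    def add(self, item_id: Hashable, vec: Sequence[float]) -> None:
        self.vectors[item_id] = vec
        for pos in self._selected_positions(vec):
            self.postings[pos].add(item_id)

    def query(self, vec: Sequence[float], limit: int = 5) -> List[Hashable]:
        candidates: Set[Hashable] = set()
        for pos in self._selected_positions(vec):
            candidates |= self.postings.get(pos, set())

        def score(item_id: Hashable) -> float:   # rank candidates by dot product
            return sum(a * b for a, b in zip(self.vectors[item_id], vec))

        return sorted(candidates, key=score, reverse=True)[:limit]
```

Because only the posting lists for the selected positions are read, a query touches a small part of the index instead of scanning every item vector.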
-
Publication number: 20230118785
Abstract: Systems and methods for training a neural network are described. One or more embodiments of the present disclosure include training a neural network based on a first combined gradient of a loss function at a plurality of sampled elements of a dataset; receiving an insertion request that indicates an insertion element to be added to the dataset, or a deletion request that indicates a deletion element to be removed from the dataset, wherein the deletion element is one of the plurality of sampled elements; computing a second combined gradient of the loss function by adding the insertion element to the dataset or by replacing the deletion element with a replacement element from the dataset; determining whether the first combined gradient and the second combined gradient satisfy a stochastic condition; and retraining the neural network to obtain a modified neural network based on the determination.
Type: Application
Filed: October 18, 2021
Publication date: April 20, 2023
Inventors: Enayat Ullah, Anup Bandigadi Rao, Tung Mai, Ryan A. Rossi
-
Patent number: 11620460
Abstract: A novel system and method for issuing and storing keys/keycards includes a network, a cloud server, a terminal, a cabinet control server, and at least one key/keycard issuing and storing cabinet. Through the terminal, the customer/user can sign up and receive authentication methods to verify user rights/ownership of a key/keycard with the cloud server. The issuing and storing cabinet is used for issuing and storing the customer's/user's key/keycard. The cabinet control server controls and operates the issuing and storing cabinet. The invention provides a method of issuing and storing a key/keycard that includes the steps of: i) the customer/user registers/verifies user rights/ownership of the key/keycard with the cloud server; ii) the customer/user takes/stores key/keycards at the issuing and storing cabinet; iii) data is synchronized between the cloud server and the cabinet control server.
Type: Grant
Filed: October 5, 2022
Date of Patent: April 4, 2023
Inventor: Tung Mai Le
-
Patent number: 11609915
Abstract: The present disclosure relates to a method for responding to a query requesting that an intersection be performed. The method includes receiving a query referencing a first set, a second set, and a desired quantile related to the first set from among a plurality of quantiles; generating a data structure including a bottom-k sketch of user identifiers (ids) of the first set and corresponding numerical values of the first data; partitioning the data structure into a plurality of sketches to correspond to the quantiles, respectively; determining an intersection of one of the sketches associated with the desired quantile and a sketch of the second set; and responding to the query based on the intersection.
Type: Grant
Filed: March 15, 2021
Date of Patent: March 21, 2023
Assignee: Adobe Inc.
Inventors: Tung Mai, Anup Rao, Yeshwanth Vijayakumar
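A rough sketch of the data flow in this abstract (bottom-k sketch with values, partition by quantile, intersect with a second sketch) might look like the following. All names (`unit_hash`, `bottom_k`, `split_by_quantile`, `quantile_intersection`) are hypothetical, the equal-count partitioning rule is an assumption, and scaling the sampled intersection up to an estimate for the full sets is deliberately left out. When both sketches are truncated, only ids whose hash falls below the smaller of the two sketch thresholds are comparable; that correction is also omitted here.

```python
import hashlib
from typing import Dict, List, Set, Tuple

Entry = Tuple[float, str, float]   # (hash in [0, 1), user id, numerical value)


def unit_hash(user_id: str) -> float:
    """Map an id to a pseudo-random point in [0, 1) (assumed hash choice)."""
    digest = hashlib.blake2b(user_id.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big") / 2**64


def bottom_k(values: Dict[str, float], k: int) -> List[Entry]:
    """Keep the k entries with the smallest hashes, carrying each id's value along."""
    return sorted((unit_hash(u), u, v) for u, v in values.items())[:k]


def split_by_quantile(sketch: List[Entry], n_quantiles: int) -> List[List[Entry]]:
    """Partition sketch entries into equal-count groups ordered by value."""
    by_value = sorted(sketch, key=lambda e: e[2])
    n = len(by_value)
    return [by_value[i * n // n_quantiles:(i + 1) * n // n_quantiles]
            for i in range(n_quantiles)]


def quantile_intersection(part: List[Entry], other_sketch: List[Entry]) -> Set[str]:
    """Sampled ids that appear both in one quantile's sketch and in the other sketch."""
    other_ids = {u for _, u, _ in other_sketch}
    return {u for _, u, _ in part if u in other_ids}
```

For example, with a first set of spend per user and a second set of users who opened an email, `split_by_quantile(bottom_k(spend, k), 4)[3]` would hold the sampled top-quartile spenders, and `quantile_intersection` would return those who also appear in the second sketch.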
-
Patent number: 11544281
Abstract: In some embodiments, a model training system trains a sample generation model configured to generate synthetic data entries for a dataset. The sample generation model includes a prior model for generating an estimated latent vector from a partially observed data entry, a proposal model for generating a latent vector from a data entry of the dataset and a mask corresponding to the partially observed data entry, and a generative model for generating the synthetic data entries from the latent vector and the partially observed data entry. The model training system trains the sample generation model to optimize an objective function that includes a first term determined using the synthetic data entries and a second term determined using the estimated latent vector and the latent vector. The trained sample generation model can be executed on a client computing device to service queries using the generated synthetic data entries.
Type: Grant
Filed: November 20, 2020
Date of Patent: January 3, 2023
Assignee: Adobe Inc.
Inventors: Subrata Mitra, Nikhil Sheoran, Anup Rao, Tung Mai, Sapthotharan Krishnan Nair, Shivakumar Vaithyanathan, Thomas Jacobs, Ghetia Siddharth, Jatin Varshney, Vikas Maddukuri, Laxmikant Mishra
-
Patent number: 11526907
Abstract: Embodiments of the present invention provide systems, methods, and computer storage media for determining an increased matching for large graphs in which an increased matching is generated for the graph by leveraging an initial matching for a small fraction of edges of the large graph. An initial matching for a random subset of edges of an input graph is leveraged to generate alternating paths based on the initially matched edges and the remaining edges, not included in the random subset. An increased matching for the entire graph includes the alternating paths without the initial matched edges, thus increasing the number of matched edges in the increased matching by at least one for every initially matched edge. Graph-based tasks may then be triggered based on the increased matching.
Type: Grant
Filed: November 19, 2019
Date of Patent: December 13, 2022
Assignee: Adobe Inc.
Inventors: Alireza Farhadi, Ryan A. Rossi, Tung Mai, Anup Rao
-
Publication number: 20220391407
Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media that estimate the overlap between sets of data samples. In particular, in one or more embodiments, the disclosed systems utilize a sketch-based sampling routine and a flexible, accurate estimator to determine the overlap (e.g., the intersection) between sets of data samples. For example, in some implementations, the disclosed systems generate a sketch vector, such as a one permutation hashing vector, for each set of data samples. The disclosed systems further compare the sketch vectors to determine an equal bin similarity estimator, a lesser bin similarity estimator, and a greater bin similarity estimator. The disclosed systems utilize one or more of the determined similarity estimators in generating an overlap estimation for the sets of data samples.
Type: Application
Filed: August 10, 2022
Publication date: December 8, 2022
Inventors: Anup Rao, Tung Mai, Matvey Kapilevich
-
Publication number: 20220309334
Abstract: Techniques are provided for training graph neural networks with heterophily datasets and generating predictions for such datasets with heterophily. A computing device receives a dataset including a graph data structure and processes the dataset using a graph neural network. The graph neural network defines prior belief vectors respectively corresponding to nodes of the graph data structure and executes a compatibility-guided propagation from the set of prior belief vectors using a compatibility matrix. The graph neural network predicts a class label for a node of the graph data structure based on the compatibility-guided propagations and a characteristic of at least one node within a neighborhood of the node. The computing device outputs the graph data structure where it is usable by a software tool for modifying an operation of a computing environment.
Type: Application
Filed: March 23, 2021
Publication date: September 29, 2022
Inventors: Ryan Rossi, Tung Mai, Nedim Lipka, Jiong Zhu, Anup Rao, Viswanathan Swaminathan
-
Patent number: 11449523
Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media that estimate the overlap between sets of data samples. In particular, in one or more embodiments, the disclosed systems utilize a sketch-based sampling routine and a flexible, accurate estimator to determine the overlap (e.g., the intersection) between sets of data samples. For example, in some implementations, the disclosed systems generate a sketch vector, such as a one permutation hashing vector, for each set of data samples. The disclosed systems further compare the sketch vectors to determine an equal bin similarity estimator, a lesser bin similarity estimator, and a greater bin similarity estimator. The disclosed systems utilize one or more of the determined similarity estimators in generating an overlap estimation for the sets of data samples.
Type: Grant
Filed: November 5, 2020
Date of Patent: September 20, 2022
Assignee: Adobe Inc.
Inventors: Anup Rao, Tung Mai, Matvey Kapilevich
-
Publication number: 20220292101
Abstract: The present disclosure relates to a method for responding to a query requesting that an intersection be performed. The method includes receiving a query referencing a first set, a second set, and a desired quantile related to the first set from among a plurality of quantiles; generating a data structure including a bottom-k sketch of user identifiers (ids) of the first set and corresponding numerical values of the first data; partitioning the data structure into a plurality of sketches to correspond to the quantiles, respectively; determining an intersection of one of the sketches associated with the desired quantile and a sketch of the second set; and responding to the query based on the intersection.
Type: Application
Filed: March 15, 2021
Publication date: September 15, 2022
Inventors: Tung Mai, Anup Rao, Yeshwanth Vijayakumar
-
Publication number: 20220164346
Abstract: In some embodiments, a model training system trains a sample generation model configured to generate synthetic data entries for a dataset. The sample generation model includes a prior model for generating an estimated latent vector from a partially observed data entry, a proposal model for generating a latent vector from a data entry of the dataset and a mask corresponding to the partially observed data entry, and a generative model for generating the synthetic data entries from the latent vector and the partially observed data entry. The model training system trains the sample generation model to optimize an objective function that includes a first term determined using the synthetic data entries and a second term determined using the estimated latent vector and the latent vector. The trained sample generation model can be executed on a client computing device to service queries using the generated synthetic data entries.
Type: Application
Filed: November 20, 2020
Publication date: May 26, 2022
Inventors: Subrata Mitra, Nikhil Sheoran, Anup Rao, Tung Mai, Sapthotharan Krishnan Nair, Shivakumar Vaithyanathan, Thomas Jacobs, Ghetia Siddharth, Jatin Varshney, Vikas Maddukuri, Laxmikant Mishra
-
Patent number: 11343325
Abstract: A system and method for fast, accurate, and scalable typed graphlet estimation. The system and method utilize typed edge sampling and typed path sampling to estimate typed graphlet counts in large graphs in a small fraction of the computing time of existing systems. The obtained unbiased estimates of typed graphlets are highly accurate, and have applications in the analysis, mining, and predictive modeling of massive real-world networks. During operation, the system obtains a dataset indicating nodes and edges of a graph. The system samples a portion of the graph and counts a number of graph features in the sampled portion of the graph. The system then computes an occurrence frequency of a typed graphlet pattern and a total number of typed graphlets associated with the typed graphlet pattern in the graph.
Type: Grant
Filed: August 31, 2020
Date of Patent: May 24, 2022
Assignee: Adobe Inc.
Inventors: Ryan Rossi, Tung Mai, Anup Rao
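To make the edge-sampling idea concrete, the sketch below estimates typed triangle counts, the simplest typed graphlet. It is a minimal illustration under assumptions, not the patented procedure: each edge is sampled independently with probability `p`, triangles through a sampled edge are counted in the full adjacency structure, and the function name and the representation of a type pattern as a sorted tuple are hypothetical. The input is assumed to be an undirected graph with each edge listed once.

```python
import random
from collections import Counter, defaultdict
from typing import Dict, Hashable, Iterable, Tuple

Edge = Tuple[Hashable, Hashable]


def estimate_typed_triangles(edges: Iterable[Edge],
                             node_type: Dict[Hashable, str],
                             p: float,
                             seed: int = 0) -> Dict[Tuple[str, ...], float]:
    """Estimate the number of triangles per node-type pattern via edge sampling."""
    edges = list(edges)
    rng = random.Random(seed)
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    counts: Counter = Counter()
    for u, v in edges:
        if rng.random() >= p:         # keep each edge with probability p
            continue
        for w in adj[u] & adj[v]:     # every common neighbor closes a triangle
            pattern = tuple(sorted((node_type[u], node_type[v], node_type[w])))
            counts[pattern] += 1

    # Each triangle is reachable from 3 edges, each kept with probability p,
    # so dividing by 3 * p gives an unbiased estimate of the count.
    return {pattern: c / (3 * p) for pattern, c in counts.items()}
```

With `p = 1.0` the result is the exact count; smaller values of `p` trade accuracy for fewer neighborhood intersections.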
-
Publication number: 20220148015
Abstract: Techniques are provided for analyzing user actions that have occurred over a time period. The user actions can be, for example, with respect to the user's navigation of content or interaction with an application. Such user data is provided in an action string, which is converted into a highly searchable format. As such, the presence and frequency of particular user actions and patterns of user actions within an action string of a particular user, as well as among multiple action strings of multiple users, are determinable. Subsequences of one or more action strings are identified and both the number of action strings that include a particular subsequence and the frequency that a particular subsequence is present in a given action string are determinable. The conversion involves breaking that string into a sorted list of locations for the actions within that string. Queries can be readily applied against the sorted list.
Type: Application
Filed: November 12, 2020
Publication date: May 12, 2022
Applicant: Adobe Inc.
Inventors: Tung Mai, Iftikhar Ahamath Burhanuddin, Georgios Theocharous, Anup Rao
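The conversion of an action string into sorted lists of locations, and a subsequence query against them, can be sketched as below. This is an illustrative reading of the abstract, not the claimed implementation; `build_positions` and `contains_subsequence` are hypothetical names, and actions are assumed to be given as a list of labels.

```python
from bisect import bisect_right
from collections import defaultdict
from typing import Dict, List, Sequence


def build_positions(actions: Sequence[str]) -> Dict[str, List[int]]:
    """Map each action to the sorted list of locations where it occurs."""
    positions: Dict[str, List[int]] = defaultdict(list)
    for i, action in enumerate(actions):
        positions[action].append(i)   # appended in increasing order, so already sorted
    return positions


def contains_subsequence(positions: Dict[str, List[int]], pattern: Sequence[str]) -> bool:
    """Check whether `pattern` occurs as a (not necessarily contiguous) subsequence."""
    cursor = -1
    for action in pattern:
        locs = positions.get(action, [])
        j = bisect_right(locs, cursor)   # first occurrence strictly after the cursor
        if j == len(locs):
            return False
        cursor = locs[j]
    return True
```

The number of occurrences of a single action is `len(positions[action])`, and the number of action strings containing a pattern is obtained by applying `contains_subsequence` to each string's position lists; each query step costs one binary search instead of a scan of the string.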
-
Publication number: 20220138218
Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media that estimate the overlap between sets of data samples. In particular, in one or more embodiments, the disclosed systems utilize a sketch-based sampling routine and a flexible, accurate estimator to determine the overlap (e.g., the intersection) between sets of data samples. For example, in some implementations, the disclosed systems generate a sketch vector, such as a one permutation hashing vector, for each set of data samples. The disclosed systems further compare the sketch vectors to determine an equal bin similarity estimator, a lesser bin similarity estimator, and a greater bin similarity estimator. The disclosed systems utilize one or more of the determined similarity estimators in generating an overlap estimation for the sets of data samples.
Type: Application
Filed: November 5, 2020
Publication date: May 5, 2022
Inventors: Anup Rao, Tung Mai, Matvey Kapilevich
-
Publication number: 20220070266
Abstract: A system and method for fast, accurate, and scalable typed graphlet estimation. The system and method utilize typed edge sampling and typed path sampling to estimate typed graphlet counts in large graphs in a small fraction of the computing time of existing systems. The obtained unbiased estimates of typed graphlets are highly accurate, and have applications in the analysis, mining, and predictive modeling of massive real-world networks. During operation, the system obtains a dataset indicating nodes and edges of a graph. The system samples a portion of the graph and counts a number of graph features in the sampled portion of the graph. The system then computes an occurrence frequency of a typed graphlet pattern and a total number of typed graphlets associated with the typed graphlet pattern in the graph.
Type: Application
Filed: August 31, 2020
Publication date: March 3, 2022
Inventors: Ryan Rossi, Tung Mai, Anup Rao