Patents Assigned to Trifacta, Inc.
-
Patent number: 10824606
Abstract: A system standardizes values that occur in large datasets before the dataset is analyzed. The system identifies values in a dataset that are similar to each other and associates those values with each other to form groups. The system determines a canonical value for each group of associated values. Within each group, the system replaces values that have been associated with each other with the canonical value for the group. As a result, the dataset is transformed into a dataset that has standardized values, and the standardized dataset is provided as input for analysis by a data analysis system. By standardizing the dataset in this manner, the data analysis system can process a larger portion of the dataset.
Type: Grant
Filed: November 15, 2016
Date of Patent: November 3, 2020
Assignee: Trifacta Inc.
Inventors: Sean Philip Kandel, Zain Asgar, Wei Zheng, Philip John Vander Broek
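
The grouping and replacement steps described in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the similarity measure (`difflib.SequenceMatcher` on a normalized key), the 0.85 threshold, the choice of the most frequent member as the canonical value, and the names `normalize` and `standardize` are all assumptions made for illustration.

```python
from collections import Counter
from difflib import SequenceMatcher


def normalize(value: str) -> str:
    """Comparison key: lowercase, punctuation replaced by spaces, whitespace collapsed."""
    kept = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in value.lower())
    return " ".join(kept.split())


def standardize(values, threshold=0.85):
    """Group similar values and replace each value with its group's canonical value.

    The similarity measure and threshold are illustrative assumptions.
    """
    counts = Counter(values)
    groups = []  # each group is a list of distinct raw values; group[0] is the seed
    # Visit the most frequent values first so that they seed the groups.
    for value, _ in counts.most_common():
        key = normalize(value)
        for group in groups:
            if SequenceMatcher(None, key, normalize(group[0])).ratio() >= threshold:
                group.append(value)
                break
        else:
            groups.append([value])
    # Assumed rule: the canonical value is the group's seed (its most frequent member).
    canonical = {member: group[0] for group in groups for member in group}
    return [canonical[v] for v in values]


raw = ["IBM Corp.", "IBM Corp", "ibm corp", "IBM Corp.", "Apple Inc", "apple inc."]
print(standardize(raw))
```

Comparing every value against every group seed is quadratic in the number of distinct values, so a real system working on large datasets would need some form of blocking or indexing that this sketch leaves out.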
-
Patent number: 10733198
Abstract: A data preprocessing system builds transformation scripts for preprocessing datasets for processing by a data analysis system. The data preprocessing system presents various representations of data of a dataset including visual representations, textual representations, or structural representations. The data preprocessing system receives selections of attributes or values based on these representations. The data preprocessing system generates recommendations of transformations based on the attributes or values selected. The data preprocessing system builds a transformation script based on the recommendations of the transformations. The transformation script can be used for preprocessing the dataset for analysis by a data analysis system.
Type: Grant
Filed: June 28, 2016
Date of Patent: August 4, 2020
Assignee: Trifacta Inc.
Inventors: Edward Eli Marschner, Sean Philip Kandel, Chris Beavers, Adam Silberstein, Alon Bartur
-
Patent number: 10650020
Abstract: A system analyzes transformations for processing datasets. The transformations may be used to build a transformation script for preprocessing data for analysis by big data analysis systems. The system receives a new transformation for analysis. The system determines a measure of an impact of the new transformation operation on a dataset. The system determines statistical information describing rows of the transformed dataset that are impacted by the new transformation. The system receives a request to add the new transformation to the transformation script responsive to presenting the statistical information.
Type: Grant
Filed: September 16, 2016
Date of Patent: May 12, 2020
Assignee: Trifacta Inc.
Inventors: Vihang Jitendra Mehta, Seshadri Sadasivan Mahalingam, Philip John Vander Broek
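
One simple way to picture a "measure of an impact" is to count the rows that a candidate transformation changes. The abstract does not say which measure the system uses, so the Python sketch below is only an assumed example; the function name `transformation_impact` and the reported statistics are hypothetical.

```python
def transformation_impact(rows, transform):
    """Apply `transform` to every row and summarize which rows it changed.

    `rows` is a list; `transform` must return a new row rather than
    mutate its argument, otherwise before/after comparison is meaningless.
    """
    transformed = [transform(row) for row in rows]
    changed = [
        i for i, (before, after) in enumerate(zip(rows, transformed))
        if before != after
    ]
    fraction = len(changed) / len(rows) if rows else 0.0
    return {
        "total_rows": len(rows),
        "changed_rows": len(changed),
        "changed_fraction": fraction,
        "changed_indices": changed,
    }


rows = [{"city": " Boston"}, {"city": "Austin"}, {"city": "Denver "}, {"city": "Miami"}]
strip_city = lambda row: {**row, "city": row["city"].strip()}
print(transformation_impact(rows, strip_city))
```

A summary like this could be shown to a user before the transformation is added to a script, which is the workflow the abstract describes.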
-
Patent number: 10545978
Abstract: A data preprocessing system builds transformation scripts for preprocessing datasets for processing by a data analysis system. The data preprocessing system presents various representations of data of a dataset including visual representations, textual representations, or structural representations. The data preprocessing system receives selections of attributes or values based on these representations. The data preprocessing system generates recommendations of transformations based on the attributes or values selected. The data preprocessing system builds a transformation script based on the recommendations of the transformations. The transformation script can be used for preprocessing the dataset for analysis by a data analysis system.
Type: Grant
Filed: June 28, 2016
Date of Patent: January 28, 2020
Assignee: Trifacta Inc.
Inventors: Edward Eli Marschner, Sean Philip Kandel, Chris Beavers, Adam Silberstein, Alon Bartur
-
Patent number: 10459942
Abstract: A system determines samples of datasets that are typically processed by big data analysis systems. The samples are for use in development and testing of transformations for preprocessing the datasets in preparation for analysis by big data systems. The system receives one or more transform operations and input datasets for the transform operations. The system determines samples associated with the transform operations. According to a sampling strategy, the system determines samples that return at least a threshold number of records in the result set obtained by applying a transformation. According to another sampling strategy, the system receives criteria describing the result of the transform operations and determines sample sets that generate result sets satisfying the criteria as a result of applying the transform operations.
Type: Grant
Filed: April 29, 2016
Date of Patent: October 29, 2019
Assignee: TRIFACTA INC.
Inventors: Adam Eli Silberstein, Edward Eli Marschner, Sean Philip Kandel, Philip John Vander Broek, Alon Shulim Bartur, Wei Zheng
-
Patent number: 10437847
Abstract: A system determines samples of datasets that are typically processed by big data analysis systems. The samples are for use in development and testing of transformations for preprocessing the datasets in preparation for analysis by big data systems. The system receives one or more transform operations and input datasets for the transform operations. The system determines samples associated with the transform operations. According to a sampling strategy, the system determines samples that return at least a threshold number of records in the result set obtained by applying a transformation. According to another sampling strategy, the system receives criteria describing the result of the transform operations and determines sample sets that generate result sets satisfying the criteria as a result of applying the transform operations.
Type: Grant
Filed: April 29, 2016
Date of Patent: October 8, 2019
Assignee: Trifacta Inc.
Inventors: Adam Eli Silberstein, Edward Eli Marschner, Sean Philip Kandel, Philip John Vander Broek, Alon Shulim Bartur, Wei Zheng
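
The first sampling strategy in this abstract (a sample whose transformed result has at least a threshold number of records) can be sketched as below. The grow-and-retry loop, the doubling schedule, and the names `sample_for_transform`, `min_results`, and `start_size` are assumptions for illustration; the abstract does not describe how the system actually arrives at such a sample.

```python
import random


def sample_for_transform(records, transform, min_results, start_size=100, seed=0):
    """Grow a random sample until `transform(sample)` yields at least
    `min_results` records, or until the whole dataset has been used.

    Returns (sample, result). The doubling schedule is an assumption.
    """
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    size = min(max(1, start_size), len(shuffled))
    while True:
        sample = shuffled[:size]
        result = transform(sample)
        if len(result) >= min_results or size == len(shuffled):
            return sample, result
        size = min(size * 2, len(shuffled))


# A selective filter: only 20 of 1000 records survive it.
keep_multiples_of_50 = lambda rows: [r for r in rows if r % 50 == 0]
sample, result = sample_for_transform(range(1000), keep_multiples_of_50, min_results=10)
print(len(sample), len(result))
```

A plain fixed-size random sample would often return too few rows for a selective filter like this one, which is the situation the threshold-based strategy is meant to address.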
-
Patent number: 10346421
Abstract: A system provides data profile information describing attributes of a dataset. The system determines relative frequency of occurrences of attribute values with respect to a set of bins from a histogram of another attribute. The system presents a user interface that presents statistical information describing attributes of a dataset based on the relative frequency of occurrences of attribute values. The system generates a transformation script based on the user interactions for transforming records of the dataset. The transformation script is configured to preprocess data of the dataset for further analysis.
Type: Grant
Filed: October 14, 2016
Date of Patent: July 9, 2019
Assignee: Trifacta Inc.
Inventors: Jeffrey Heer, Lars Grammel, Sean Philip Kandel, Philip John Vander Broek
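
As a rough illustration of "relative frequency of occurrences of attribute values with respect to a set of bins from a histogram of another attribute", the Python sketch below bins records by one numeric attribute and reports, per bin, the share of each value of a second attribute. The record layout, the bin edges, and the names `bin_index` and `relative_frequencies` are hypothetical.

```python
from collections import Counter, defaultdict


def bin_index(x, edges):
    """Index of the half-open bin [edges[i], edges[i+1]) containing x.

    The last bin also includes its right edge. Returns None if x is out of range.
    """
    for i in range(len(edges) - 1):
        is_last = i == len(edges) - 2
        if edges[i] <= x < edges[i + 1] or (is_last and x == edges[i + 1]):
            return i
    return None


def relative_frequencies(records, bin_attr, value_attr, edges):
    """For each histogram bin of `bin_attr`, the fraction of records
    in that bin holding each value of `value_attr`."""
    per_bin = defaultdict(Counter)
    for rec in records:
        i = bin_index(rec[bin_attr], edges)
        if i is not None:
            per_bin[i][rec[value_attr]] += 1
    return {
        i: {value: n / sum(counter.values()) for value, n in counter.items()}
        for i, counter in per_bin.items()
    }


records = [
    {"age": 25, "plan": "free"},
    {"age": 28, "plan": "paid"},
    {"age": 45, "plan": "paid"},
    {"age": 60, "plan": "paid"},
]
print(relative_frequencies(records, "age", "plan", edges=[20, 40, 60]))
```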
-
Patent number: 10331621
Abstract: A system and method identifies rows in a file as uniform rows or outlier rows based on statistics from the file, and displays a sampling of uniform and outlier rows.
Type: Grant
Filed: September 19, 2014
Date of Patent: June 25, 2019
Assignee: TRIFACTA INC.
Inventors: Aaron J. Elmore, Adam Eli Silberstein, Joseph M. Hellerstein, Sean Philip Kandel
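
The abstract does not say which file statistics separate uniform rows from outlier rows. A minimal Python sketch, assuming the statistic is the number of delimited fields per row and that the most common field count defines "uniform", might look like this; `classify_rows` and its parameters are hypothetical names.

```python
from collections import Counter


def classify_rows(lines, delimiter=",", sample_size=3):
    """Split rows into uniform and outlier rows by field count.

    Assumption: a row is "uniform" when its field count equals the most
    common field count in the file. Returns the first `sample_size` rows
    of each class.
    """
    field_counts = [line.count(delimiter) + 1 for line in lines]
    if not field_counts:
        return [], []
    modal_count = Counter(field_counts).most_common(1)[0][0]
    uniform = [l for l, n in zip(lines, field_counts) if n == modal_count]
    outliers = [l for l, n in zip(lines, field_counts) if n != modal_count]
    return uniform[:sample_size], outliers[:sample_size]


lines = ["a,b,c", "1,2,3", "4,5", "6,7,8", "oops"]
uniform, outliers = classify_rows(lines)
print("uniform:", uniform)
print("outliers:", outliers)
```

Counting raw delimiter occurrences ignores quoted fields, so a quoted value that contains a comma would be misclassified by this sketch.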
-
Patent number: 9842112
Abstract: A system and method parses one or more fields from a file by receiving example locations of the field in the file, fashioning rules that describe the field from the locations, and then scoring the rules against some or all of the file.
Type: Grant
Filed: October 27, 2014
Date of Patent: December 12, 2017
Assignee: Trifacta, Inc.
Inventors: Jeffrey Heer, Sean Philip Kandel
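
The three steps named here (example locations, rules, scoring) can be sketched in Python under strong simplifying assumptions: examples are `(line_index, start, end)` character spans, candidate rules come from a small fixed table of regular expressions rather than being synthesized, and a rule's score is the fraction of lines on which it matches. `CANDIDATE_RULES`, `rules_from_examples`, and `score_rules` are hypothetical names, not taken from the patent.

```python
import re

# A fixed, hypothetical pool of candidate rules. A real system would
# presumably derive rules from the examples instead of picking from a list.
CANDIDATE_RULES = {
    "digits": r"\d+",
    "word": r"[A-Za-z]+",
    "ip_like": r"\d+\.\d+\.\d+\.\d+",
    "date_like": r"\d{4}-\d{2}-\d{2}",
}


def rules_from_examples(lines, examples):
    """Keep the candidate rules that reproduce every example span exactly."""
    fitting = {}
    for name, pattern in CANDIDATE_RULES.items():
        regex = re.compile(pattern)
        if all(
            any(m.span() == (start, end) for m in regex.finditer(lines[i]))
            for i, start, end in examples
        ):
            fitting[name] = regex
    return fitting


def score_rules(lines, rules):
    """Score each rule by the fraction of lines on which it finds a match."""
    return {
        name: sum(1 for line in lines if regex.search(line)) / len(lines)
        for name, regex in rules.items()
    }


lines = ["2014-10-27 GET /a", "2014-10-28 POST /b", "no date here"]
examples = [(0, 0, 10)]  # the user highlighted "2014-10-27" on the first line
rules = rules_from_examples(lines, examples)
print(score_rules(lines, rules))
```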
-
Patent number: 9753928
Abstract: A system and method automatically identifies any or all of potential row, column and string delimiters in a file in which such delimiters are unknown to the program making such identification.
Type: Grant
Filed: September 19, 2014
Date of Patent: September 5, 2017
Assignee: Trifacta, Inc.
Inventors: Aaron J. Elmore, Adam E. Silberstein, Joseph M. Hellerstein, Sean Kandel
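
For the column-delimiter part of this problem, one common heuristic is to prefer the candidate character whose per-line count is most consistent across the file. The Python sketch below implements that heuristic only; it is not taken from the patent, it does not attempt row or string delimiters, and the candidate set and the name `guess_column_delimiter` are assumptions.

```python
from collections import Counter


def guess_column_delimiter(text, candidates=(",", "\t", ";", "|")):
    """Guess the column delimiter of `text`, or return None.

    Heuristic (an assumption): pick the candidate whose most common
    nonzero per-line count covers the largest share of lines, breaking
    ties in favor of the candidate that yields more columns.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    best = None
    best_key = (0.0, 0)
    for cand in candidates:
        counts = Counter(line.count(cand) for line in lines)
        if not counts:
            continue
        modal_count, freq = counts.most_common(1)[0]
        if modal_count == 0:
            continue  # the candidate is absent from most lines
        key = (freq / len(lines), modal_count)
        if key > best_key:
            best, best_key = cand, key
    return best


print(repr(guess_column_delimiter("a;b;c\n1;2;3\n4,5;6;7\n")))
```

Python's standard library offers `csv.Sniffer` for a similar purpose; the hand-written version above is kept only to make the consistency heuristic visible.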