Patents by Inventor Victor Tze-Yeuan Tso

Victor Tze-Yeuan Tso has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11461304
    Abstract: Signature-based cache optimization for data preparation includes: performing a first set of sequenced data preparation operations on one or more sets of data to generate a plurality of transformation results; caching one or more of the plurality of transformation results and one or more corresponding operation signatures, a cached operation signature being derived based at least in part on a subset of sequenced operations that generated a corresponding result; receiving a specification of a second set of sequenced operations; determining an operation signature associated with the second set of sequenced operations; identifying a cached result among the cached results based at least in part on the determined operation signature; and outputting the cached result.
    Type: Grant
    Filed: March 10, 2020
    Date of Patent: October 4, 2022
    Assignee: DataRobot, Inc.
    Inventors: Dave Brewster, Victor Tze-Yeuan Tso
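The abstract above describes caching transformation results under signatures derived from the operations that produced them. The following is a minimal sketch of that general idea, not the patented implementation. The names `operation_signature`, `SignatureCache`, and `apply_op`, the choice of SHA-256 over a JSON encoding, and caching every prefix of the sequence are all assumptions made for illustration.

```python
import hashlib
import json


def operation_signature(operations, source_refs):
    """Derive a signature from an ordered list of operations and the data they apply to."""
    payload = json.dumps({"ops": operations, "sources": source_refs}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class SignatureCache:
    """Hypothetical cache keyed by the signature of each prefix of an operation sequence."""

    def __init__(self):
        self._results = {}

    def run(self, rows, operations, source_refs, apply_op):
        """Return (result, number_of_operations_reused_from_cache)."""
        start = 0
        # Look for the longest prefix of the requested sequence that is already cached.
        for n in range(len(operations), 0, -1):
            sig = operation_signature(operations[:n], source_refs)
            if sig in self._results:
                rows = self._results[sig]
                start = n
                break
        # Apply the remaining operations, caching each intermediate result.
        # apply_op must return a new list rather than mutate its input.
        for i in range(start, len(operations)):
            rows = apply_op(rows, operations[i])
            self._results[operation_signature(operations[: i + 1], source_refs)] = rows
        return rows, start


def apply_op(rows, op):
    """Two toy operations, assumed here only to make the sketch runnable."""
    if op["op"] == "uppercase":
        return [{**r, op["column"]: r[op["column"]].upper()} for r in rows]
    if op["op"] == "filter_eq":
        return [r for r in rows if r[op["column"]] == op["value"]]
    raise ValueError(f"unknown operation: {op['op']}")
```

In this sketch, a second sequence that shares its first steps with an earlier one reuses the cached prefix and executes only the steps that differ.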
  • Publication number: 20220284183
    Abstract: A step editor for data preparation can instruct a user interface to present a first plurality of operations to be applied in a sequential order to one or more sets of data, receive user inputs including at least one indication to mute at least one operation of the first plurality of operations to prevent the processors from performing the at least one operation, generate a second plurality of operations, the second plurality of operations to be applied in a sequential order to the sets of data and comprising the first plurality of operations excluding the operation muted by the user inputs, obtain a cached data traversal program associated with the second plurality of operations and comprising a representation of a result of transforming the sets of data, and instruct the user interface to present output based at least in part on execution of the cached data traversal program.
    Type: Application
    Filed: March 25, 2022
    Publication date: September 8, 2022
    Applicant: DataRobot, Inc.
    Inventors: Nenshad Bardoliwalla, Michael Matthews, Ian Timourian, Jing Chen, Lilia Gutnik, Whitman Kwok, Dave Brewster, Victor Tze-Yeuan Tso
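Muting a step, as the abstract above describes, produces a second sequence that excludes the muted operation, and that sequence is then used to find a cached result. A rough sketch under stated assumptions follows: the `Step` class, the helper names, and the use of a tuple of step names as the cache key are hypothetical, and the cached value stands in for what the abstract calls a data traversal program.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Step:
    """One data preparation operation as shown in a step editor (hypothetical model)."""
    name: str
    muted: bool = False


def mute(steps, index):
    """Return a new step list with the step at `index` marked as muted."""
    return [replace(s, muted=True) if i == index else s for i, s in enumerate(steps)]


def active_key(steps):
    """Cache key built from the unmuted steps only, in their original order."""
    return tuple(s.name for s in steps if not s.muted)


def lookup_cached(cache, steps):
    """Return the cached entry for the active steps, or None if nothing is cached."""
    return cache.get(active_key(steps))
```

Because the key ignores muted steps, muting and later unmuting a step can each resolve to a previously cached entry without re-running the sequence.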
  • Patent number: 11288447
    Abstract: Using a step editor for data preparation includes: receiving an indication of a user input with respect to at least some of a set of sequenced data preparation operations on a set of data; generating, using one or more processors, a signature based at least in part on the set of sequenced data preparation operations, references to the set of data, and the user input; using the generated signature to determine whether there exists a cached result associated with the set of sequenced data preparation operations, the references to the set of data, and the user input; based at least in part on the determination, obtaining a data traversal program representing a result associated with the set of sequenced operations, the references to the set of data, and the user input; and providing output based at least in part on the result represented by the obtained data traversal program.
    Type: Grant
    Filed: March 10, 2020
    Date of Patent: March 29, 2022
    Assignee: DR HoldCo 2, Inc.
    Inventors: Nenshad Dinshaw Bardoliwalla, Michael Matthews, Ian Timourian, Jing Chen, Lilia Gutnik, Whitman Kwok, Dave Brewster, Victor Tze-Yeuan Tso
  • Publication number: 20220012231
    Abstract: Automatic append includes: identifying, based at least in part on contents of a first data set comprising a first plurality of columns and contents of a second data set comprising a second plurality of columns, a plurality of matching columns and a plurality of non-matching columns. The matching columns comprise one or more columns among the first plurality of columns; and corresponding one or more matching columns among the second plurality of columns. The non-matching columns comprise: one or more columns among the first plurality of columns that do not match with any columns among the second plurality of columns; and one or more columns among the second plurality of columns that do not match with any columns among the first plurality of columns.
    Type: Application
    Filed: May 21, 2021
    Publication date: January 13, 2022
    Inventors: Dave Brewster, Victor Tze-Yeuan Tso, Ashley Ping Jin, Quan Chuong Ta, Lakshman Roy Sankar, Whitman Kwok
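The abstract above identifies matching and non-matching columns from the contents of two data sets. The sketch below shows one possible way to do that, and it is an assumption rather than the method claimed: it scores column pairs by Jaccard similarity of their distinct values and pairs them greedily above a threshold. The function names and the default threshold of 0.5 are illustrative only.

```python
def jaccard(a, b):
    """Jaccard similarity of the distinct values in two columns."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_columns(first, second, threshold=0.5):
    """Pair columns by content similarity.

    `first` and `second` map column name -> list of values.
    Returns (matches, unmatched_in_first, unmatched_in_second).
    """
    scored = sorted(
        ((jaccard(v1, v2), c1, c2) for c1, v1 in first.items() for c2, v2 in second.items()),
        reverse=True,
    )
    matches = {}
    used = set()
    for score, c1, c2 in scored:
        if score < threshold:
            break  # remaining pairs are all below the threshold
        if c1 in matches or c2 in used:
            continue  # each column is matched at most once
        matches[c1] = c2
        used.add(c2)
    unmatched_first = [c for c in first if c not in matches]
    unmatched_second = [c for c in second if c not in used]
    return matches, unmatched_first, unmatched_second
```

Matching on contents rather than names lets a column called `state` line up with one called `st` when their values overlap enough.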
  • Patent number: 11169978
    Abstract: Distributed pipeline optimization for data preparation includes receiving a specification of a set of sequenced operations to be performed on a set of organized data. It further includes dividing the set of data into a plurality of work portions based on a cost function that is dependent on at least one dimension of the set of data. It further includes distributing the plurality of work portions to a plurality of processing nodes to be processed according to the specification of operations.
    Type: Grant
    Filed: October 14, 2015
    Date of Patent: November 9, 2021
    Assignee: DR HoldCo 2, Inc.
    Inventors: Dave Brewster, Victor Tze-Yeuan Tso
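As a rough illustration of dividing data into work portions with a cost function that depends on a dimension of the data, the sketch below models cost as rows times columns and caps the cost of each portion. The cost model, the function names, and the round-robin distribution are assumptions for illustration; the abstract does not specify them.

```python
def plan_work_portions(num_rows, num_columns, max_cost_per_portion):
    """Split rows into (start, stop) ranges whose modeled cost stays within the cap.

    Cost of a portion is modeled as rows * columns, an assumption of this sketch.
    """
    rows_per_portion = max(1, max_cost_per_portion // max(1, num_columns))
    return [
        (start, min(start + rows_per_portion, num_rows))
        for start in range(0, num_rows, rows_per_portion)
    ]


def assign_round_robin(portions, nodes):
    """Distribute portions across processing nodes in turn."""
    assignment = {node: [] for node in nodes}
    for i, portion in enumerate(portions):
        assignment[nodes[i % len(nodes)]].append(portion)
    return assignment
```

With this model, a wider table yields fewer rows per portion, so each node receives a comparable amount of modeled work.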
  • Patent number: 11030183
    Abstract: Automatic append includes: identifying, based at least in part on contents of a first data set comprising a first plurality of columns and contents of a second data set comprising a second plurality of columns, a plurality of matching columns and a plurality of non-matching columns. The matching columns comprise one or more columns among the first plurality of columns; and corresponding one or more matching columns among the second plurality of columns. The non-matching columns comprise: one or more columns among the first plurality of columns that do not match with any columns among the second plurality of columns; and one or more columns among the second plurality of columns that do not match with any columns among the first plurality of columns.
    Type: Grant
    Filed: September 13, 2016
    Date of Patent: June 8, 2021
    Assignee: DR HoldCo 2, Inc.
    Inventors: Dave Brewster, Victor Tze-Yeuan Tso, Ashley Ping Jin, Quan Chuong Ta, Lakshman Roy Sankar, Whitman Kwok
  • Publication number: 20210056090
    Abstract: Cache optimization for data preparation includes: generating a data traversal program that represents a result of a set of sequenced data preparation operations performed on one or more sets of data, wherein the data traversal program indicates how to assemble one or more affected columns in the one or more sets of data to derive the result; in response to receiving a specification of the set of sequenced operations to be performed on the one or more sets of data, accessing the data traversal program that represents the result or a stored copy of the data traversal program that represents the result; assembling the one or more affected columns in the one or more sets of data according to the data traversal program to re-generate the result; and outputting the result.
    Type: Application
    Filed: July 1, 2020
    Publication date: February 25, 2021
    Inventors: Dave Brewster, Victor Tze-Yeuan Tso
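One way to picture a data traversal program is as a small recipe that records which stored columns and which rows make up a result, so the result can be re-assembled without re-running the operations. The sketch below is a simplified reading of the abstract, not the patented structure: `TraversalProgram`, the column store layout, and keys such as `name@1` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TraversalProgram:
    """Hypothetical recipe for re-assembling a result from stored columns."""
    row_indices: list   # which rows survive, in output order
    column_refs: dict   # output column name -> key of a stored column


def assemble(program, store):
    """Re-generate the result rows by following the program over the column store."""
    return [
        {col: store[ref][i] for col, ref in program.column_refs.items()}
        for i in program.row_indices
    ]
```

In this sketch, an operation that rewrites one column only adds that column to the store and repoints its reference, while a filter only changes `row_indices`; unaffected columns are shared rather than copied.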
  • Patent number: 10740316
    Abstract: Cache optimization for data preparation includes generating a data traversal program that represents a result of a set of sequenced data preparation operations performed on one or more sets of data. The data traversal program indicates how to assemble one or more affected columns in the one or more sets of data to derive the result. It further includes, in response to receiving a specification of the set of sequenced operations to be performed on the one or more sets of data, accessing the data traversal program that represents the result or a stored copy of the data traversal program that represents the result. It further includes assembling the one or more affected columns in the one or more sets of data according to the data traversal program to re-generate the result. It further includes outputting the result.
    Type: Grant
    Filed: October 14, 2015
    Date of Patent: August 11, 2020
    Assignee: DR HoldCo 2, Inc.
    Inventors: Dave Brewster, Victor Tze-Yeuan Tso
  • Publication number: 20200210642
    Abstract: Using a step editor for data preparation includes: receiving an indication of a user input with respect to at least some of a set of sequenced data preparation operations on a set of data; generating, using one or more processors, a signature based at least in part on the set of sequenced data preparation operations, references to the set of data, and the user input; using the generated signature to determine whether there exists a cached result associated with the set of sequenced data preparation operations, the references to the set of data, and the user input; based at least in part on the determination, obtaining a data traversal program representing a result associated with the set of sequenced operations, the references to the set of data, and the user input; and providing output based at least in part on the result represented by the obtained data traversal program.
    Type: Application
    Filed: March 10, 2020
    Publication date: July 2, 2020
    Inventors: Nenshad Dinshaw Bardoliwalla, Michael Matthews, Ian Timourian, Jing Chen, Lilia Gutnik, Whitman Kwok, Dave Brewster, Victor Tze-Yeuan Tso
  • Publication number: 20200210399
    Abstract: Signature-based cache optimization for data preparation includes: performing a first set of sequenced data preparation operations on one or more sets of data to generate a plurality of transformation results; caching one or more of the plurality of transformation results and one or more corresponding operation signatures, a cached operation signature being derived based at least in part on a subset of sequenced operations that generated a corresponding result; receiving a specification of a second set of sequenced operations; determining an operation signature associated with the second set of sequenced operations; identifying a cached result among the cached results based at least in part on the determined operation signature; and outputting the cached result.
    Type: Application
    Filed: March 10, 2020
    Publication date: July 2, 2020
    Inventors: Dave Brewster, Victor Tze-Yeuan Tso
  • Patent number: 10642815
    Abstract: Using a step editor for data preparation includes receiving an indication of a user input with respect to at least some of a set of sequenced data preparation operations on a set of data. It further includes generating, using one or more processors, a signature based at least in part on the set of sequenced data preparation operations, references to the set of data, and the user input. It further includes using the generated signature to determine whether there exists a cached result associated with the set of sequenced data preparation operations, the references to the set of data, and the user input. It further includes, based at least in part on the determination, obtaining a data traversal program representing a result associated with the set of sequenced operations, the references to the set of data, and the user input. It further includes providing output based at least in part on the result represented by the obtained data traversal program.
    Type: Grant
    Filed: October 14, 2015
    Date of Patent: May 5, 2020
    Assignee: Paxata, Inc.
    Inventors: Nenshad Dinshaw Bardoliwalla, Michael Matthews, Ian Timourian, Jing Chen, Lilia Gutnik, Whitman Kwok, Dave Brewster, Victor Tze-Yeuan Tso
  • Patent number: 10642814
    Abstract: Signature-based cache optimization for data preparation includes performing a first set of sequenced data preparation operations on one or more sets of data to generate a plurality of transformation results. It further includes caching one or more of the plurality of transformation results and one or more corresponding operation signatures, a cached operation signature being derived based at least in part on a subset of sequenced operations that generated a corresponding result. It further includes receiving a specification of a second set of sequenced operations. It further includes determining an operation signature associated with the second set of sequenced operations. It further includes identifying a cached result among the cached results based at least in part on the determined operation signature; and outputting the cached result.
    Type: Grant
    Filed: October 14, 2015
    Date of Patent: May 5, 2020
    Assignee: Paxata, Inc.
    Inventors: Dave Brewster, Victor Tze-Yeuan Tso
  • Patent number: 10216792
    Abstract: Automated join detection includes: identifying a set of one or more candidate joins of a first table and a second table; evaluating a set of one or more quality measures corresponding to the set of one or more candidate joins; obtaining a set of one or more selected joins among the set of one or more candidate joins, the set of one or more selected joins being selected based at least in part on one or more corresponding quality measures; and generating a joined table, including by joining the first table and the second table according to a selected join.
    Type: Grant
    Filed: October 14, 2015
    Date of Patent: February 26, 2019
    Assignee: Paxata, Inc.
    Inventors: Dave Brewster, Victor Tze-Yeuan Tso, Ashley Jin, Quan Chuong Ta, Lakshman Roy Sankar, Nenshad Dinshaw Bardoliwalla
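The steps in the abstract above (identify candidate joins, evaluate quality measures, select a join, generate the joined table) can be sketched as follows. The quality measure used here, the fraction of left-table values that appear in the right-table column, is an assumption chosen for simplicity, and the function names are hypothetical.

```python
def candidate_joins(left, right):
    """Score every (left column, right column) pair, best first.

    `left` and `right` are lists of dict rows. The score is the fraction of
    left values found in the right column (an assumed quality measure).
    """
    left_cols = list(left[0].keys()) if left else []
    right_cols = list(right[0].keys()) if right else []
    candidates = []
    for lc in left_cols:
        left_values = [row[lc] for row in left]
        for rc in right_cols:
            right_values = {row[rc] for row in right}
            coverage = sum(v in right_values for v in left_values) / len(left_values)
            candidates.append((coverage, lc, rc))
    return sorted(candidates, reverse=True)


def join_tables(left, right, left_col, right_col):
    """Inner join of two row lists on the selected column pair."""
    index = {}
    for row in right:
        index.setdefault(row[right_col], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[left_col], [])]
```

A caller would typically take the top-scoring candidate, or present the best few to a user, and pass the chosen column pair to `join_tables`.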
  • Publication number: 20170262491
    Abstract: Automatic append includes: identifying, based at least in part on contents of a first data set comprising a first plurality of columns and contents of a second data set comprising a second plurality of columns, a plurality of matching columns and a plurality of non-matching columns. The matching columns comprise one or more columns among the first plurality of columns; and corresponding one or more matching columns among the second plurality of columns. The non-matching columns comprise: one or more columns among the first plurality of columns that do not match with any columns among the second plurality of columns; and one or more columns among the second plurality of columns that do not match with any columns among the first plurality of columns.
    Type: Application
    Filed: September 13, 2016
    Publication date: September 14, 2017
    Inventors: David Brewster, Victor Tze-Yeuan Tso, Ashley Ping Jin, Quan Chuong Ta, Lakshman Roy Sankar, Whitman Kwok
  • Publication number: 20170109402
    Abstract: Automated join detection includes: identifying a set of one or more candidate joins of a first table and a second table; evaluating a set of one or more quality measures corresponding to the set of one or more candidate joins; obtaining a set of one or more selected joins among the set of one or more candidate joins, the set of one or more selected joins being selected based at least in part on one or more corresponding quality measures; and generating a joined table, including by joining the first table and the second table according to a selected join.
    Type: Application
    Filed: October 14, 2015
    Publication date: April 20, 2017
    Inventors: Dave Brewster, Victor Tze-Yeuan Tso, Ashley Jin, Quan Chuong Ta, Lakshman Roy Sankar, Nenshad Dinshaw Bardoliwalla
  • Publication number: 20170109388
    Abstract: Signature-based cache optimization for data preparation includes: performing a first set of sequenced data preparation operations on one or more sets of data to generate a plurality of transformation results; caching one or more of the plurality of transformation results and one or more corresponding operation signatures, a cached operation signature being derived based at least in part on a subset of sequenced operations that generated a corresponding result; receiving a specification of a second set of sequenced operations; determining an operation signature associated with the second set of sequenced operations; identifying a cached result among the cached results based at least in part on the determined operation signature; and outputting the cached result.
    Type: Application
    Filed: October 14, 2015
    Publication date: April 20, 2017
    Inventors: Dave Brewster, Victor Tze-Yeuan Tso
  • Publication number: 20170109387
    Abstract: Cache optimization for data preparation includes: generating a data traversal program that represents a result of a set of sequenced data preparation operations performed on one or more sets of data, wherein the data traversal program indicates how to assemble one or more affected columns in the one or more sets of data to derive the result; in response to receiving a specification of the set of sequenced operations to be performed on the one or more sets of data, accessing the data traversal program that represents the result or a stored copy of the data traversal program that represents the result; assembling the one or more affected columns in the one or more sets of data according to the data traversal program to re-generate the result; and outputting the result.
    Type: Application
    Filed: October 14, 2015
    Publication date: April 20, 2017
    Inventors: Dave Brewster, Victor Tze-Yeuan Tso
  • Publication number: 20170109389
    Abstract: Using a step editor for data preparation includes: receiving an indication of a user input with respect to at least some of a set of sequenced data preparation operations on a set of data; generating, using one or more processors, a signature based at least in part on the set of sequenced data preparation operations, references to the set of data, and the user input; using the generated signature to determine whether there exists a cached result associated with the set of sequenced data preparation operations, the references to the set of data, and the user input; based at least in part on the determination, obtaining a data traversal program representing a result associated with the set of sequenced operations, the references to the set of data, and the user input; and providing output based at least in part on the result represented by the obtained data traversal program.
    Type: Application
    Filed: October 14, 2015
    Publication date: April 20, 2017
    Inventors: Nenshad Dinshaw Bardoliwalla, Michael Matthews, Ian Timourian, Jing Chen, Lilia Gutnik, Whitman Kwok, Dave Brewster, Victor Tze-Yeuan Tso
  • Publication number: 20170109378
    Abstract: Distributed pipeline optimization for data preparation includes: receiving a specification of a set of sequenced operations to be performed on a set of organized data; dividing the set of data into a plurality of work portions based on a cost function that is dependent on at least one dimension of the set of data; and distributing the plurality of work portions to a plurality of processing nodes to be processed according to the specification of operations.
    Type: Application
    Filed: October 14, 2015
    Publication date: April 20, 2017
    Inventors: Dave Brewster, Victor Tze-Yeuan Tso