Patents by Inventor Prasan Roy

Prasan Roy has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9785657
    Abstract: Generation of synthetic database data includes annotated query subplans for a multiple table query workload that includes a desired cardinality for nodes (v) in the subplans. The subplans may be merged and represented by a directed acyclic graph (DAG). The maximum entropy joint probability distribution for each attribute (x) for each node (v) is determined as: $p(x) = \exp\left(\sum_{v} w_v f_v(x)\right) / Z$ for each node v, where $w_v$ is a weight of node v, $f_v$ is a conjunct of predicates in a subplan rooted at node v, and Z is a normalization factor. This distribution is determined such that the desired cardinality, and selectivities for each node v determined from the desired cardinality, are satisfied. The data for a plurality of tables are generated by sampling the maximum entropy joint probability distribution for a domain of attributes (x) of a plurality of tables. Data may be efficiently generated for multiple table queries and for DAGs.
    Type: Grant
    Filed: September 13, 2014
    Date of Patent: October 10, 2017
    Assignee: International Business Machines Corporation
    Inventors: Atreyee Dey, Prasan Roy
  • Patent number: 9361323
    Abstract: A system for receiving a declarative specification including a plurality of stages. Each stage specifies an atomic operation, a data input to the atomic operation, and a data output from the atomic operation. The data input is characterized by a data type. Links between at least two of the stages are generated to create a data integration workflow. The data integration workflow is compiled to generate computer code for execution on a parallel processing platform. The computer code is configured to perform at least one of data preparation and data analysis.
    Type: Grant
    Filed: October 4, 2011
    Date of Patent: June 7, 2016
    Assignee: International Business Machines Corporation
    Inventors: Manoj K. Agarwal, Himanshu Gupta, Rajeev Gupta, Sanjeev K. Gupta, Mukesh K. Mohania, Sriram K. Padmanabhan, Prasan Roy
  • Patent number: 9317542
    Abstract: A method for receiving a declarative specification including a plurality of stages. Each stage specifies an atomic operation, a data input to the atomic operation, and a data output from the atomic operation. The data input is characterized by a data type. Links between at least two of the stages are generated to create a data integration workflow. The data integration workflow is compiled to generate computer code for execution on a parallel processing platform. The computer code is configured to perform at least one of data preparation and data analysis.
    Type: Grant
    Filed: April 29, 2013
    Date of Patent: April 19, 2016
    Assignee: International Business Machines Corporation
    Inventors: Manoj K. Agarwal, Himanshu Gupta, Rajeev Gupta, Sanjeev K. Gupta, Mukesh K. Mohania, Sriram K. Padmanabhan, Prasan Roy
  • Patent number: 9244950
    Abstract: Generation of synthetic database data includes annotated query subplans for a multiple table query workload that includes a desired cardinality for nodes (v) in the subplans. The subplans may be merged and represented by a directed acyclic graph (DAG). The maximum entropy joint probability distribution for each attribute (x) for each node (v) is determined as: $p(x) = \exp\left(\sum_{v} w_v f_v(x)\right) / Z$ for each node v, where $w_v$ is a weight of node v, $f_v$ is a conjunct of predicates in a subplan rooted at node v, and Z is a normalization factor. This distribution is determined such that the desired cardinality, and selectivities for each node v determined from the desired cardinality, are satisfied. The data for a plurality of tables are generated by sampling the maximum entropy joint probability distribution for a domain of attributes (x) of a plurality of tables. Data may be efficiently generated for multiple table queries and for DAGs.
    Type: Grant
    Filed: July 3, 2013
    Date of Patent: January 26, 2016
    Assignee: International Business Machines Corporation
    Inventors: Atreyee Dey, Prasan Roy
  • Patent number: 9135289
    Abstract: Identifying matching transactions. First and second log files contain operation records of transactions in a transaction workload, each file recording a respective execution of the transaction workload. A first record location in the first file and an associated window of a defined number of sequential second record locations in the second file are advanced one record location at a time. Whether each operation record of a complete transaction at a first record location has a matching operation record at one of the record locations in the associated window of second record locations is determined. If so, the complete transaction in the first file and the transaction that includes the matching operation records in the second file are identified as matching transactions.
    Type: Grant
    Filed: June 2, 2014
    Date of Patent: September 15, 2015
    Assignee: International Business Machines Corporation
    Inventors: Manoj K. Agarwal, Curt L. Cotner, Amitava Kundu, Prasan Roy, Rajesh Sambandhan
  • Patent number: 9063944
    Abstract: A predefined number of matches is identified between records in a first file and records in a second file. For the matches, the span of the actual range of record positions in the second file, relative to the positions of the operation records in the first file, within which all matches were found is determined. If the actual span is smaller than the span of a current defined range of record positions by at least a first threshold value, the span of the current defined range is decreased. If the actual span is within a second threshold value of the span of the current defined range, the span of the current defined range is increased. If an amount above a third threshold value of operation records in the first file are not matched to operation records in the second file, the span of the current defined range is increased.
    Type: Grant
    Filed: February 21, 2013
    Date of Patent: June 23, 2015
    Assignee: International Business Machines Corporation
    Inventors: Manoj K. Agarwal, Curt L. Cotner, Amitava Kundu, Prasan Roy, Rajesh Sambandhan
  • Patent number: 8984516
    Abstract: A method, computer program product, and computer system for shared execution of mixed data flows, performed by one or more computing devices, comprises identifying one or more resource sharing opportunities across a plurality of parallel tasks. The plurality of parallel tasks includes zero or more relational operations and at least one non-relational operation. The plurality of parallel tasks relative to the relational operations and the at least one non-relational operation are executed. In response to executing the plurality of parallel tasks, one or more resources of the identified resource sharing opportunities is shared across the relational operations and the at least one non-relational operation.
    Type: Grant
    Filed: May 10, 2013
    Date of Patent: March 17, 2015
    Assignee: International Business Machines Corporation
    Inventors: Rajeev Gupta, Padmashree Ravindra, Prasan Roy
  • Patent number: 8984515
    Abstract: A method, computer program product, and computer system for shared execution of mixed data flows, performed by one or more computing devices, comprises identifying one or more resource sharing opportunities across a plurality of parallel tasks. The plurality of parallel tasks includes zero or more relational operations and at least one non-relational operation. The plurality of parallel tasks relative to the relational operations and the at least one non-relational operation are executed. In response to executing the plurality of parallel tasks, one or more resources of the identified resource sharing opportunities is shared across the relational operations and the at least one non-relational operation.
    Type: Grant
    Filed: May 31, 2012
    Date of Patent: March 17, 2015
    Assignee: International Business Machines Corporation
    Inventors: Rajeev Gupta, Padmashree Ravindra, Prasan Roy
  • Patent number: 8959519
    Abstract: Methods and arrangements for processing hierarchical data in a map-reduce framework. Hierarchical data is accepted, and a map-reduce job is performed on the hierarchical data. This performing of a map-reduce job includes determining a cost of partitioning the data, determining a cost of redefining the job and thereupon selectively performing at least one step taken from the group consisting of: partitioning the data and redefining the job.
    Type: Grant
    Filed: August 29, 2012
    Date of Patent: February 17, 2015
    Assignee: International Business Machines Corporation
    Inventors: Manoj K. Agarwal, Himanshu Gupta, Rajeev Gupta, Sriram K. Padmanabhan, Prasan Roy
  • Publication number: 20150012522
    Abstract: Generation of synthetic database data includes annotated query subplans for a multiple table query workload that includes a desired cardinality for nodes (v) in the subplans. The subplans may be merged and represented by a directed acyclic graph (DAG). The maximum entropy joint probability distribution for each attribute (x) for each node (v) is determined as: $p(x) = \exp\left(\sum_{v} w_v f_v(x)\right) / Z$ for each node v, where $w_v$ is a weight of node v, $f_v$ is a conjunct of predicates in a subplan rooted at node v, and Z is a normalization factor. This distribution is determined such that the desired cardinality, and selectivities for each node v determined from the desired cardinality, are satisfied. The data for a plurality of tables are generated by sampling the maximum entropy joint probability distribution for a domain of attributes (x) of a plurality of tables. Data may be efficiently generated for multiple table queries and for DAGs.
    Type: Application
    Filed: July 3, 2013
    Publication date: January 8, 2015
    Inventors: Atreyee Dey, Prasan Roy
  • Publication number: 20150012523
    Abstract: Generation of synthetic database data includes annotated query subplans for a multiple table query workload that includes a desired cardinality for nodes (v) in the subplans. The subplans may be merged and represented by a directed acyclic graph (DAG). The maximum entropy joint probability distribution for each attribute (x) for each node (v) is determined as: $p(x) = \exp\left(\sum_{v} w_v f_v(x)\right) / Z$ for each node v, where $w_v$ is a weight of node v, $f_v$ is a conjunct of predicates in a subplan rooted at node v, and Z is a normalization factor. This distribution is determined such that the desired cardinality, and selectivities for each node v determined from the desired cardinality, are satisfied. The data for a plurality of tables are generated by sampling the maximum entropy joint probability distribution for a domain of attributes (x) of a plurality of tables. Data may be efficiently generated for multiple table queries and for DAGs.
    Type: Application
    Filed: September 13, 2014
    Publication date: January 8, 2015
    Inventors: Atreyee Dey, Prasan Roy
  • Publication number: 20140279945
    Abstract: Identifying matching transactions. First and second log files contain operation records of transactions in a transaction workload, each file recording a respective execution of the transaction workload. A first record location in the first file and an associated window of a defined number of sequential second record locations in the second file are advanced one record location at a time. Whether each operation record of a complete transaction at a first record location has a matching operation record at one of the record locations in the associated window of second record locations is determined. If so, the complete transaction in the first file and the transaction that includes the matching operation records in the second file are identified as matching transactions.
    Type: Application
    Filed: June 2, 2014
    Publication date: September 18, 2014
    Applicant: International Business Machines Corporation
    Inventors: Manoj K. Agarwal, Curt L. Cotner, Amitava Kundu, Prasan Roy, Rajesh Sambandhan
  • Publication number: 20140236976
    Abstract: A predefined number of matches is identified between records in a first file and records in a second file. For the matches, the span of the actual range of record positions in the second file, relative to the positions of the operation records in the first file, within which all matches were found is determined. If the actual span is smaller than the span of a current defined range of record positions by at least a first threshold value, the span of the current defined range is decreased. If the actual span is within a second threshold value of the span of the current defined range, the span of the current defined range is increased. If an amount above a third threshold value of operation records in the first file are not matched to operation records in the second file, the span of the current defined range is increased.
    Type: Application
    Filed: February 21, 2013
    Publication date: August 21, 2014
    Applicant: International Business Machines Corporation
    Inventors: Manoj K. Agarwal, Curt L. Cotner, Amitava Kundu, Prasan Roy, Rajesh Sambandhan
  • Patent number: 8788471
    Abstract: A method for identifying matching transactions between two log files where each transaction includes one or more statements. Each log file record records the execution of a statement and includes a transaction identifier. Each record in turn in one log file is compared to an advancing window of records in the other log file. A first table contains associations of statements to transactions and transactions to statements for records in the window. If a match is found between a record in the one file and a record in the window, information associating partial transactions in the one file to potential transactions of the records in the window is added to a second table. If an end-of-transaction record is read from the one file, a best match is found between the ended transaction and the potential transactions based on information in the first and second tables.
    Type: Grant
    Filed: May 30, 2012
    Date of Patent: July 22, 2014
    Assignee: International Business Machines Corporation
    Inventors: Manoj K. Agarwal, Curt L. Cotner, Amitava Kundu, Prasan Roy, Rajesh Sambandhan
  • Patent number: 8788473
    Abstract: A method for identifying matching transactions between two log files where each transaction includes one or more statements. Each log file record records the execution of a statement and includes a transaction identifier. Each record in turn in one log file is compared to an advancing window of records in the other log file. A first table contains associations of statements to transactions and transactions to statements for records in the window. If a match is found between a record in the one file and a record in the window, information associating partial transactions in the one file to potential transactions of the records in the window is added to a second table. If an end-of-transaction record is read from the one file, a best match is found between the ended transaction and the potential transactions based on information in the first and second tables.
    Type: Grant
    Filed: May 31, 2013
    Date of Patent: July 22, 2014
    Assignee: International Business Machines Corporation
    Inventors: Manoj K. Agarwal, Curt L. Cotner, Amitava Kundu, Prasan Roy, Rajesh Sambandhan
  • Patent number: 8677366
    Abstract: Methods and arrangements for processing hierarchical data in a map-reduce framework. Hierarchical data is accepted, and a map-reduce job is performed on the hierarchical data. This performing of a map-reduce job includes determining a cost of partitioning the data, determining a cost of redefining the job and thereupon selectively performing at least one step taken from the group consisting of: partitioning the data and redefining the job.
    Type: Grant
    Filed: May 31, 2011
    Date of Patent: March 18, 2014
    Assignee: International Business Machines Corporation
    Inventors: Manoj K. Agarwal, Himanshu Gupta, Rajeev Gupta, Sriram K. Padmanabhan, Prasan Roy
  • Publication number: 20130325826
    Abstract: A method for identifying matching transactions between two log files where each transaction includes one or more statements. Each log file record records the execution of a statement and includes a transaction identifier. Each record in turn in one log file is compared to an advancing window of records in the other log file. A first table contains associations of statements to transactions and transactions to statements for records in the window. If a match is found between a record in the one file and a record in the window, information associating partial transactions in the one file to potential transactions of the records in the window is added to a second table. If an end-of-transaction record is read from the one file, a best match is found between the ended transaction and the potential transactions based on information in the first and second tables.
    Type: Application
    Filed: May 30, 2012
    Publication date: December 5, 2013
    Applicant: International Business Machines Corporation
    Inventors: Manoj K. Agarwal, Curt L. Cotner, Amitava Kundu, Prasan Roy, Rajesh Sambandhan
  • Publication number: 20130326534
    Abstract: A method, computer program product, and computer system for shared execution of mixed data flows, performed by one or more computing devices, comprises identifying one or more resource sharing opportunities across a plurality of parallel tasks. The plurality of parallel tasks includes zero or more relational operations and at least one non-relational operation. The plurality of parallel tasks relative to the relational operations and the at least one non-relational operation are executed. In response to executing the plurality of parallel tasks, one or more resources of the identified resource sharing opportunities is shared across the relational operations and the at least one non-relational operation.
    Type: Application
    Filed: May 10, 2013
    Publication date: December 5, 2013
    Applicant: International Business Machines Corporation
    Inventors: Rajeev Gupta, Padmashree Ravindra, Prasan Roy
  • Publication number: 20130326538
    Abstract: A method, computer program product, and computer system for shared execution of mixed data flows, performed by one or more computing devices, comprises identifying one or more resource sharing opportunities across a plurality of parallel tasks. The plurality of parallel tasks includes zero or more relational operations and at least one non-relational operation. The plurality of parallel tasks relative to the relational operations and the at least one non-relational operation are executed. In response to executing the plurality of parallel tasks, one or more resources of the identified resource sharing opportunities is shared across the relational operations and the at least one non-relational operation.
    Type: Application
    Filed: May 31, 2012
    Publication date: December 5, 2013
    Applicant: International Business Machines Corporation
    Inventors: Rajeev Gupta, Padmashree Ravindra, Prasan Roy
  • Publication number: 20130325829
    Abstract: A method for identifying matching transactions between two log files where each transaction includes one or more statements. Each log file record records the execution of a statement and includes a transaction identifier. Each record in turn in one log file is compared to an advancing window of records in the other log file. A first table contains associations of statements to transactions and transactions to statements for records in the window. If a match is found between a record in the one file and a record in the window, information associating partial transactions in the one file to potential transactions of the records in the window is added to a second table. If an end-of-transaction record is read from the one file, a best match is found between the ended transaction and the potential transactions based on information in the first and second tables.
    Type: Application
    Filed: May 31, 2013
    Publication date: December 5, 2013
    Applicant: International Business Machines Corporation
    Inventors: Manoj K. Agarwal, Curt L. Cotner, Amitava Kundu, Prasan Roy, Rajesh Sambandhan
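
The advancing-window comparison described in the transaction-matching abstracts above can be illustrated with a minimal Python sketch. It is not the patented method. It assumes each log is a list of (transaction_id, statement) pairs, uses a hypothetical "COMMIT" statement as the end-of-transaction record, centers the window on the current record position, and collapses the two tables of the abstract into a single table of vote counts. The names match_transactions, half_window, and END are illustrative only.

```python
from collections import Counter, defaultdict

END = "COMMIT"  # hypothetical end-of-transaction marker (an assumption)


def match_transactions(first_log, second_log, half_window):
    """Pair each transaction in first_log with its most likely counterpart.

    Both logs are lists of (transaction_id, statement) tuples. Returns a dict
    mapping each ended first-log transaction id to the second-log transaction
    id that shared the most statements inside the window, or None.
    """
    # votes[t1][t2] counts statements of first-log transaction t1 that were
    # also seen, inside the window, as part of second-log transaction t2.
    votes = defaultdict(Counter)
    matches = {}

    for i, (txn, stmt) in enumerate(first_log):
        if stmt == END:
            # End of transaction: pick the candidate with the most votes.
            candidates = votes.pop(txn, Counter())
            matches[txn] = candidates.most_common(1)[0][0] if candidates else None
            continue

        # Window of second-log positions around the current position i.
        lo = max(0, i - half_window)
        hi = min(len(second_log), i + half_window + 1)

        # Each candidate transaction gets at most one vote per first-log record.
        seen = {t2 for t2, s2 in second_log[lo:hi] if s2 == stmt}
        for t2 in seen:
            votes[txn][t2] += 1

    return matches
```

A larger half_window tolerates more reordering between the two executions at the cost of more comparisons per record, which is the trade-off the range-adjustment abstracts address by growing or shrinking the span.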