Patents by Inventor Ashvin Agrawal

Ashvin Agrawal has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

TRACKING PROVENANCE IN DATA SCIENCE SCRIPTS

Publication number: 20230394369

Abstract: Embodiments described herein enable tracking machine learning (“ML”) model data provenance. In particular, a computing device is configured to accept ML model code that, when executed, instantiates and trains an ML model, to parse the ML model code into a workflow intermediate representation (WIR), to semantically annotate the WIR to provide an annotated WIR, and to identify, based on the annotated WIR and ML API corresponding to the ML model code, data from at least one data source that is relied upon by the ML model code when training the ML model. A WIR may be generated from an abstract syntax tree (AST) based on the ML model code, generating provenance relationships (PRs) based at least in part on relationships between nodes of the AST, wherein a PR comprises one or more input variables, an operation, a caller, and one or more output variables.

Type: Application

Filed: August 21, 2023

Publication date: December 7, 2023

Inventors: Avrilia FLORATOU, Ashvin AGRAWAL, MohammadHossein NAMAKI, Subramaniam Venkatraman KRISHNAN, Fotios PSALLIDAS, Yinghui WU
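The provenance relationship (PR) described in the abstract above, with its input variables, operation, caller, and output variables, can be illustrated with Python's standard `ast` module. This is a minimal sketch, not the patented method: the `ProvenanceRelationship` class and `extract_relationships` function are hypothetical names, only top-level `target = call(...)` statements are handled, and the semantic annotation step of the WIR is not modeled.

```python
import ast
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ProvenanceRelationship:
    """Hypothetical container for one PR: inputs, operation, caller, outputs."""
    inputs: List[str]
    operation: str
    caller: Optional[str]
    outputs: List[str]


def _names(node: ast.AST) -> List[str]:
    """Collect the variable names that appear anywhere inside an AST node."""
    return [n.id for n in ast.walk(node) if isinstance(n, ast.Name)]


def extract_relationships(source: str) -> List[ProvenanceRelationship]:
    """Derive one PR per top-level `target = call(...)` statement (assumption)."""
    relationships = []
    for stmt in ast.parse(source).body:
        if not (isinstance(stmt, ast.Assign) and isinstance(stmt.value, ast.Call)):
            continue
        call = stmt.value
        outputs = [name for target in stmt.targets for name in _names(target)]
        inputs = [name for arg in call.args for name in _names(arg)]
        inputs += [name for kw in call.keywords for name in _names(kw.value)]
        if isinstance(call.func, ast.Attribute):
            # e.g. model.fit(X, y): operation is "fit", caller is "model"
            operation = call.func.attr
            callers = _names(call.func.value)
            caller = callers[0] if callers else None
        elif isinstance(call.func, ast.Name):
            # e.g. LogisticRegression(): a plain function or constructor call
            operation = call.func.id
            caller = None
        else:
            continue
        relationships.append(
            ProvenanceRelationship(inputs, operation, caller, outputs)
        )
    return relationships


SCRIPT = """
df = pd.read_csv(path)
model = LogisticRegression()
fitted = model.fit(X, y)
"""

for pr in extract_relationships(SCRIPT):
    print(pr)
```

The script is only parsed, never executed, so names such as `pd` and `LogisticRegression` do not need to be importable.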
Mitigating slow instances in large-scale streaming pipelines

Patent number: 11822454

Abstract: A system is described herein for mitigating slow process instances in a streaming application. The system includes a slow process instance candidate identifier configured to identify, based on a relative watermark latency, a set of slow process instance candidates from among a plurality of process instances that comprise the streaming application. The system further includes a set of filters configured to remove false positives from the set of slow process instance candidates. The filters account for window operations performed by the process instances as well as stabilization time needed for downstream process instances to stabilize after a slow upstream process instance is mitigated by a mitigation implementer, which may also be included in the system.

Type: Grant

Filed: August 25, 2022

Date of Patent: November 21, 2023

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Ashvin Agrawal, Avrilia Floratou, Ke Wang, Daniel E. Musgrave
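The candidate identification and filtering steps in the abstract above can be sketched as follows. The abstract does not define relative watermark latency, so this sketch assumes it is an instance's watermark latency minus the median latency of all instances. The `InstanceStats` fields, the thresholds, and both filter rules are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from statistics import median
from typing import List, Optional


@dataclass
class InstanceStats:
    name: str
    watermark_latency_s: float            # current time minus watermark, in seconds
    window_length_s: float = 0.0          # 0 when no window operation is performed
    upstream_mitigated_at_s: Optional[float] = None  # last upstream mitigation time


def find_slow_instances(
    instances: List[InstanceStats],
    now_s: float,
    threshold_s: float = 30.0,
    stabilization_s: float = 120.0,
) -> List[str]:
    """Return the names of instances that remain slow after both filters."""
    baseline = median(i.watermark_latency_s for i in instances)
    slow = []
    for inst in instances:
        relative = inst.watermark_latency_s - baseline
        if relative <= threshold_s:
            continue  # not a candidate
        # Filter 1 (assumed rule): a window operation may legitimately hold
        # the watermark back by up to the window length.
        if relative <= threshold_s + inst.window_length_s:
            continue
        # Filter 2 (assumed rule): ignore instances still stabilizing after
        # an upstream instance was mitigated.
        if (
            inst.upstream_mitigated_at_s is not None
            and now_s - inst.upstream_mitigated_at_s < stabilization_s
        ):
            continue
        slow.append(inst.name)
    return slow


instances = [
    InstanceStats("a", 5), InstanceStats("b", 5), InstanceStats("c", 6),
    InstanceStats("d", 6), InstanceStats("e", 7),
    InstanceStats("slow", 100),
    InstanceStats("windowed", 60, window_length_s=60),
    InstanceStats("recovering", 90, upstream_mitigated_at_s=990),
]
print(find_slow_instances(instances, now_s=1000))
```

In this example the windowed and recovering instances are both candidates, but the filters remove them as false positives, leaving only the instance named `slow`.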
Tracking provenance in data science scripts

Patent number: 11775862

Abstract: A system enables tracking machine learning (“ML”) model data provenance. In particular, a computing device is configured to accept ML model code that, when executed, instantiates and trains an ML model, to parse the ML model code into a workflow intermediate representation (WIR), to semantically annotate the WIR to provide an annotated WIR, and to identify, based on the annotated WIR and ML API corresponding to the ML model code, data from at least one data source that is relied upon by the ML model code when training the ML model. A WIR may be generated from an abstract syntax tree (AST) based on the ML model code, generating provenance relationships (PRs) based at least in part on relationships between nodes of the AST, wherein a PR comprises one or more input variables, an operation, a caller, and one or more output variables.

Type: Grant

Filed: January 14, 2020

Date of Patent: October 3, 2023

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Avrilia Floratou, Ashvin Agrawal, MohammadHossein Namaki, Subramaniam Venkatraman Krishnan, Fotios Psallidas, Yinghui Wu
MITIGATING SLOW INSTANCES IN LARGE-SCALE STREAMING PIPELINES

Publication number: 20220405186

Abstract: A system is described herein for mitigating slow process instances in a streaming application. The system includes a slow process instance candidate identifier configured to identify, based on a relative watermark latency, a set of slow process instance candidates from among a plurality of process instances that comprise the streaming application. The system further includes a set of filters configured to remove false positives from the set of slow process instance candidates. The filters account for window operations performed by the process instances as well as stabilization time needed for downstream process instances to stabilize after a slow upstream process instance is mitigated by a mitigation implementer, which may also be included in the system.

Type: Application

Filed: August 25, 2022

Publication date: December 22, 2022

Inventors: Ashvin AGRAWAL, Avrilia FLORATOU, Ke WANG, Daniel E. MUSGRAVE
Cache and I/O management for analytics over disaggregated stores

Patent number: 11474945

Abstract: Methods, systems, apparatuses, and computer program products are provided for prefetching data. A workload analyzer may identify job characteristics for a plurality of previously executed jobs in a workload executing on a cluster of one or more compute resources. For each job, identified job characteristics may include identification of an input dataset and an input bandwidth characteristic for the input dataset. A future workload predictor may identify future jobs expected to execute on the cluster based at least on the identified job characteristics. A cache assignment determiner may determine a cache assignment that identifies a prefetch dataset for at least one of the future jobs. A network bandwidth allocator may determine a network bandwidth assignment for the prefetch dataset. A plan instructor may instruct a compute resource of the cluster to load data to a cache local to the cluster according to the cache assignment and the network bandwidth assignment.

Type: Grant

Filed: June 2, 2021

Date of Patent: October 18, 2022

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Virajith Jalaparti, Sriram S. Rao, Christopher W. Douglas, Ashvin Agrawal, Avrilia Floratou, Ishai Menache, Srikanth Kandula, Mainak Ghosh, Joseph Naor
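The planning pipeline in the abstract above (analyze past jobs, predict future jobs, assign cache, allocate bandwidth) can be sketched in a few lines. Everything specific here is an assumption made for illustration: the `Job` fields, the prediction rule (a dataset read more than once is expected to be read again), the greedy cache fill, and the proportional bandwidth split are not taken from the patent.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Job:
    dataset: str                  # input dataset identifier
    dataset_size_gb: float
    input_bandwidth_gbps: float   # input bandwidth characteristic of the job


def plan_prefetch(
    history: List[Job], cache_capacity_gb: float, network_gbps: float
) -> Dict[str, float]:
    """Map each dataset chosen for prefetching to its bandwidth share in Gbps."""
    # Workload analysis and prediction (assumed rule): datasets read more
    # than once in the history are expected to be read by future jobs.
    counts = Counter(job.dataset for job in history)
    info = {job.dataset: job for job in history}
    candidates = sorted(
        (d for d, c in counts.items() if c > 1), key=lambda d: -counts[d]
    )

    # Cache assignment (assumed rule): greedily take the most frequently
    # read datasets that still fit in the cache.
    chosen, used_gb = [], 0.0
    for dataset in candidates:
        size = info[dataset].dataset_size_gb
        if used_gb + size <= cache_capacity_gb:
            chosen.append(dataset)
            used_gb += size

    # Network bandwidth assignment (assumed rule): split the available
    # bandwidth in proportion to each dataset's input bandwidth demand.
    demand = sum(info[d].input_bandwidth_gbps for d in chosen)
    if demand == 0:
        return {}
    return {
        d: network_gbps * info[d].input_bandwidth_gbps / demand for d in chosen
    }


history = [
    Job("A", 100, 2), Job("A", 100, 2), Job("A", 100, 2),
    Job("B", 50, 1), Job("B", 50, 1),
    Job("C", 10, 1),
]
print(plan_prefetch(history, cache_capacity_gb=200, network_gbps=6))
```

A real planner would weigh deadlines and cache eviction as well; the sketch only shows how the four stages feed one another.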
Mitigating slow instances in large-scale streaming pipelines

Patent number: 11461213

Abstract: A system is described herein for mitigating slow process instances in a streaming application. The system includes a slow process instance candidate identifier configured to identify, based on a relative watermark latency, a set of slow process instance candidates from among a plurality of process instances that comprise the streaming application. The system further includes a set of filters configured to remove false positives from the set of slow process instance candidates. The filters account for window operations performed by the process instances as well as stabilization time needed for downstream process instances to stabilize after a slow upstream process instance is mitigated by a mitigation implementer, which may also be included in the system.

Type: Grant

Filed: October 31, 2019

Date of Patent: October 4, 2022

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Ashvin Agrawal, Avrilia Floratou, Ke Wang, Daniel E. Musgrave
CACHE AND I/O MANAGEMENT FOR ANALYTICS OVER DISAGGREGATED STORES

Publication number: 20210286728

Abstract: Methods, systems, apparatuses, and computer program products are provided for prefetching data. A workload analyzer may identify job characteristics for a plurality of previously executed jobs in a workload executing on a cluster of one or more compute resources. For each job, identified job characteristics may include identification of an input dataset and an input bandwidth characteristic for the input dataset. A future workload predictor may identify future jobs expected to execute on the cluster based at least on the identified job characteristics. A cache assignment determiner may determine a cache assignment that identifies a prefetch dataset for at least one of the future jobs. A network bandwidth allocator may determine a network bandwidth assignment for the prefetch dataset. A plan instructor may instruct a compute resource of the cluster to load data to a cache local to the cluster according to the cache assignment and the network bandwidth assignment.

Type: Application

Filed: June 2, 2021

Publication date: September 16, 2021

Inventors: Virajith Jalaparti, Sriram S. Rao, Christopher W. Douglas, Ashvin Agrawal, Avrilia Floratou, Ishai Menache, Srikanth Kandula, Mainak Ghosh, Joseph Naor
TRACKING PROVENANCE IN DATA SCIENCE SCRIPTS

Publication number: 20210216905

Abstract: Embodiments described herein enable tracking machine learning (“ML”) model data provenance. In particular, a computing device is configured to accept ML model code that, when executed, instantiates and trains an ML model, to parse the ML model code into a workflow intermediate representation (WIR), to semantically annotate the WIR to provide an annotated WIR, and to identify, based on the annotated WIR and ML API corresponding to the ML model code, data from at least one data source that is relied upon by the ML model code when training the ML model. A WIR may be generated from an abstract syntax tree (AST) based on the ML model code, generating provenance relationships (PRs) based at least in part on relationships between nodes of the AST, wherein a PR comprises one or more input variables, an operation, a caller, and one or more output variables.

Type: Application

Filed: January 14, 2020

Publication date: July 15, 2021

Inventors: Avrilia Floratou, Ashvin Agrawal, MohammadHossein Namaki, Subramaniam Venkatraman Krishnan, Fotios Psallidas, Yinghui Wu
Cache and I/O management for analytics over disaggregated stores

Patent number: 11055225

Abstract: Methods, systems, apparatuses, and computer program products are provided for prefetching data. A workload analyzer may identify job characteristics for a plurality of previously executed jobs in a workload executing on a cluster of one or more compute resources. For each job, identified job characteristics may include identification of an input dataset and an input bandwidth characteristic for the input dataset. A future workload predictor may identify future jobs expected to execute on the cluster based at least on the identified job characteristics. A cache assignment determiner may determine a cache assignment that identifies a prefetch dataset for at least one of the future jobs. A network bandwidth allocator may determine a network bandwidth assignment for the prefetch dataset. A plan instructor may instruct a compute resource of the cluster to load data to a cache local to the cluster according to the cache assignment and the network bandwidth assignment.

Type: Grant

Filed: October 22, 2019

Date of Patent: July 6, 2021

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Virajith Jalaparti, Sriram S. Rao, Christopher W. Douglas, Ashvin Agrawal, Avrilia Floratou, Ishai Menache, Srikanth Kandula, Mainak Ghosh, Joseph Naor
MITIGATING SLOW INSTANCES IN LARGE-SCALE STREAMING PIPELINES

Publication number: 20210133075

Abstract: A system is described herein for mitigating slow process instances in a streaming application. The system includes a slow process instance candidate identifier configured to identify, based on a relative watermark latency, a set of slow process instance candidates from among a plurality of process instances that comprise the streaming application. The system further includes a set of filters configured to remove false positives from the set of slow process instance candidates. The filters account for window operations performed by the process instances as well as stabilization time needed for downstream process instances to stabilize after a slow upstream process instance is mitigated by a mitigation implementer, which may also be included in the system.

Type: Application

Filed: October 31, 2019

Publication date: May 6, 2021

Inventors: Ashvin Agrawal, Avrilia Floratou, Ke Wang, Daniel E. Musgrave
CACHE AND I/O MANAGEMENT FOR ANALYTICS OVER DISAGGREGATED STORES

Publication number: 20210096996

Abstract: Methods, systems, apparatuses, and computer program products are provided for prefetching data. A workload analyzer may identify job characteristics for a plurality of previously executed jobs in a workload executing on a cluster of one or more compute resources. For each job, identified job characteristics may include identification of an input dataset and an input bandwidth characteristic for the input dataset. A future workload predictor may identify future jobs expected to execute on the cluster based at least on the identified job characteristics. A cache assignment determiner may determine a cache assignment that identifies a prefetch dataset for at least one of the future jobs. A network bandwidth allocator may determine a network bandwidth assignment for the prefetch dataset. A plan instructor may instruct a compute resource of the cluster to load data to a cache local to the cluster according to the cache assignment and the network bandwidth assignment.

Type: Application

Filed: October 22, 2019

Publication date: April 1, 2021

Inventors: Virajith Jalaparti, Sriram S. Rao, Christopher W. Douglas, Ashvin Agrawal, Avrilia Floratou, Ishai Menache, Srikanth Kandula, Mainak Ghosh, Joseph Naor
Table data persistence

Patent number: 10922285

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for a distributed parallel processing database system that persists table data in memory to a distributed file system. A distributed parallel processing database system persists table data in memory to a distributed file system. A parameter of creating a database table specifies that data records in the database table and history of changes to the data records can be stored in memory as well as in the distributed file system. When the database table is populated or otherwise modified, data records and the history in memory are evicted to the distributed file system as log files and removed from memory. The log files can be designated as write-only, where the data records, once written, cannot be read by structured query language (SQL) queries, or as read-write, where the data records, once written, can be read by SQL queries.

Type: Grant

Filed: May 1, 2017

Date of Patent: February 16, 2021

Assignee: Pivotal Software, Inc.

Inventors: Daniel Allen Smith, Anthony M. Baker, Sumedh Wale, Hemant Bhanawat, Jagannathan Ramnarayanan, Swapnil Prakash Bawaskar, Ashvin Agrawal, Neeraj Kumar
Compacting data history files

Patent number: 10417203

Abstract: Methods, systems, and apparatus for obtaining one or more metadata files, determining, by one or more computers and in accordance with a minor compaction setting, to perform a minor compaction of the one or more metadata files, creating one or more intermediate metadata files that each include at least compacted contents of one or more of the metadata files, according to the determination to perform minor compaction of the one or more metadata files, determining, in accordance with a major compaction setting, to perform a major compaction of one or more of the intermediate metadata files, and creating one or more snapshot metadata files that each include at least compacted contents of one or more of the intermediate metadata files, according to the determination to perform the major compaction of one or more of the intermediate metadata files.

Type: Grant

Filed: February 2, 2017

Date of Patent: September 17, 2019

Assignee: Pivotal Software, Inc.

Inventors: Jagannathan Ramnarayanan, Ashvin Agrawal, Anthony M. Baker, Daniel Allen Smith, Hemant Bhanawat, Swapnil Prakash Bawaskar
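The two-level scheme in the abstract above, where minor compaction produces intermediate files and major compaction produces snapshot files, can be sketched as follows. This is an illustration under stated assumptions: a metadata file is modeled as a dictionary from key to latest value, and the minor and major compaction settings are modeled as simple file-count thresholds. `HistoryStore` and `compact` are hypothetical names.

```python
from typing import Dict, List

MetadataFile = Dict[str, str]  # assumed model: key -> latest value in that file


def compact(files: List[MetadataFile]) -> MetadataFile:
    """Merge files oldest first so that newer entries overwrite older ones."""
    merged: MetadataFile = {}
    for f in files:
        merged.update(f)
    return merged


class HistoryStore:
    def __init__(self, minor_threshold: int = 3, major_threshold: int = 2):
        # Assumed settings: compact once this many files have accumulated.
        self.minor_threshold = minor_threshold
        self.major_threshold = major_threshold
        self.metadata: List[MetadataFile] = []
        self.intermediate: List[MetadataFile] = []
        self.snapshots: List[MetadataFile] = []

    def add(self, f: MetadataFile) -> None:
        self.metadata.append(f)
        # Minor compaction: metadata files -> one intermediate file.
        if len(self.metadata) >= self.minor_threshold:
            self.intermediate.append(compact(self.metadata))
            self.metadata = []
        # Major compaction: intermediate files -> one snapshot file.
        if len(self.intermediate) >= self.major_threshold:
            self.snapshots.append(compact(self.intermediate))
            self.intermediate = []


store = HistoryStore()
for f in [{"k1": "v1"}, {"k1": "v2"}, {"k2": "a"},
          {"k2": "b"}, {"k3": "x"}, {"k1": "v3"}]:
    store.add(f)
print(store.snapshots)
```

With the default thresholds, every three metadata files yield one intermediate file and every two intermediate files yield one snapshot.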
COMPACTING DATA HISTORY FILES

Publication number: 20170147616

Abstract: Methods, systems, and apparatus for obtaining one or more metadata files, determining, by one or more computers and in accordance with a minor compaction setting, to perform a minor compaction of the one or more metadata files, creating one or more intermediate metadata files that each include at least compacted contents of one or more of the metadata files, according to the determination to perform minor compaction of the one or more metadata files, determining, in accordance with a major compaction setting, to perform a major compaction of one or more of the intermediate metadata files, and creating one or more snapshot metadata files that each include at least compacted contents of one or more of the intermediate metadata files, according to the determination to perform the major compaction of one or more of the intermediate metadata files.

Type: Application

Filed: February 2, 2017

Publication date: May 25, 2017

Inventors: Jagannathan Ramnarayanan, Ashvin Agrawal, Anthony M. Baker, Daniel Allen Smith, Hemant Bhanawat, Swapnil Prakash Bawaskar
Table data persistence

Patent number: 9639544

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for a distributed parallel processing database system that persists table data in memory to a distributed file system. A distributed parallel processing database system persists table data in memory to a distributed file system. A parameter of creating a database table specifies that data records in the database table and history of changes to the data records can be stored in memory as well as in the distributed file system. When the database table is populated or otherwise modified, data records and the history in memory are evicted to the distributed file system as log files and removed from memory. The log files can be designated as write-only, where the data records, once written, cannot be read by structured query language (SQL) queries, or as read-write, where the data records, once written, can be read by SQL queries.

Type: Grant

Filed: October 28, 2014

Date of Patent: May 2, 2017

Assignee: Pivotal Software, Inc.

Inventors: Daniel Allen Smith, Anthony M. Baker, Sumedh Wale, Hemant Bhanawat, Jagannathan Ramnarayanan, Swapnil Prakash Bawaskar, Ashvin Agrawal, Neeraj Kumar
Compacting data file histories

Patent number: 9582527

Abstract: Methods, systems, and apparatus for obtaining one or more metadata files, determining, by one or more computers and in accordance with a minor compaction setting, to perform a minor compaction of the one or more metadata files, creating one or more intermediate metadata files that each include at least compacted contents of one or more of the metadata files, according to the determination to perform minor compaction of the one or more metadata files, determining, in accordance with a major compaction setting, to perform a major compaction of one or more of the intermediate metadata files, and creating one or more snapshot metadata files that each include at least compacted contents of one or more of the intermediate metadata files, according to the determination to perform the major compaction of one or more of the intermediate metadata files.

Type: Grant

Filed: October 28, 2014

Date of Patent: February 28, 2017

Assignee: Pivotal Software, Inc.

Inventors: Jagannathan Ramnarayanan, Ashvin Agrawal, Anthony M. Baker, Daniel Allen Smith, Hemant Bhanawat, Swapnil Prakash Bawaskar
Method and system for automatically identifying optimal meeting locations

Patent number: 9558457

Abstract: A method and system for automatically identifying optimal meeting locations. The method includes receiving a plurality of meeting parameters associated with one or more participants. The method also includes identifying a list of optimal meeting locations relevant to one or more of the plurality of meeting parameters. The method further includes ranking the list of optimal meeting locations. Further, the method includes enabling a user to select an optimal meeting location from the list of optimal meeting locations. The system includes one or more electronic devices and a user electronic device. The user electronic device includes a communication interface, a memory, and a processor.

Type: Grant

Filed: July 26, 2011

Date of Patent: January 31, 2017

Assignee: EXCALIBUR IP, LLC

Inventors: Deepak Kumar V, Subramaniam Venkatraman Krishnan, Ashvin Agrawal
Location-aware adaptive event reminder

Patent number: 9311628

Abstract: An appointment having an associated appointment location and a reminder time is received. The method also includes tracking a current location and a travel time, the travel time comprising an estimated amount of time for travel from the current location to the appointment location. Further, the method includes adjusting the reminder time to accommodate the travel time. Furthermore, the method includes activating an event reminder in accordance with the adjusted reminder time.

Type: Grant

Filed: December 22, 2010

Date of Patent: April 12, 2016

Assignee: Yahoo! Inc.

Inventors: Ashvin Agrawal, Subramaniam Venkatraman Krishnan
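The reminder adjustment described in the abstract above comes down to simple time arithmetic. The sketch below assumes the adjusted reminder should fire no later than the latest departure time (appointment time minus travel time minus a safety buffer) and never later than the originally requested reminder. The function name and the five-minute default buffer are hypothetical.

```python
from datetime import datetime, timedelta


def adjust_reminder(
    appointment_time: datetime,
    reminder_time: datetime,
    travel_time: timedelta,
    buffer: timedelta = timedelta(minutes=5),
) -> datetime:
    """Move the reminder earlier when travel time requires it (assumed rule)."""
    latest_departure = appointment_time - travel_time - buffer
    return min(reminder_time, latest_departure)


appointment = datetime(2024, 1, 15, 15, 0)
reminder = datetime(2024, 1, 15, 14, 45)

# A 40-minute trip moves the reminder from 14:45 to 14:15.
print(adjust_reminder(appointment, reminder, timedelta(minutes=40)))
# A 5-minute trip leaves the original 14:45 reminder in place.
print(adjust_reminder(appointment, reminder, timedelta(minutes=5)))
```

In a tracking loop, the function would be called again whenever the current location, and therefore the estimated travel time, changes.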
SELECTING FILES FOR COMPACTION

Publication number: 20150120684

Abstract: Methods, systems, and apparatus for identifying two or more files, each of which include multiple entries, determining a respective size of each of the files, each size being an estimate of how many distinct entries exist in the respective file that are not garbage entries, determining a combined size of the files, where the combined size of the files is an arithmetic sum of the respective sizes of the files, estimating a compacted size of the files, where the estimated compacted size of the files is an estimate of how many distinct entries exist in the files that are not garbage entries, selecting the two or more files for compaction, based at least on a comparison of the combined size of the files to the estimated compacted size of the files, and compacting the two or more selected files.

Type: Application

Filed: October 28, 2014

Publication date: April 30, 2015

Inventors: Swapnil Prakash Bawaskar, Ashvin Agrawal, Daniel Allen Smith, Anthony M. Baker, Jagannathan Ramnarayanan, Hemant Bhanawat
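The selection test in the abstract above compares the arithmetic sum of the per-file sizes with the estimated size after compaction. The sketch below makes several assumptions: a file is modeled as a dictionary, an entry whose value is `None` counts as a garbage entry, exact set sizes stand in for the estimates the abstract refers to, and the `max_ratio` threshold is a hypothetical selection rule.

```python
from typing import Dict, List, Optional, Set

DataFile = Dict[str, Optional[int]]  # assumed model: key -> value, None = garbage


def live_keys(f: DataFile) -> Set[str]:
    """Distinct entries in one file that are not garbage entries."""
    return {key for key, value in f.items() if value is not None}


def should_compact(files: List[DataFile], max_ratio: float = 0.75) -> bool:
    """Select the files when compaction is expected to shrink them enough."""
    # Combined size: arithmetic sum of the per-file sizes.
    combined = sum(len(live_keys(f)) for f in files)
    # Compacted size: distinct non-garbage entries across all files.
    compacted = len(set().union(*(live_keys(f) for f in files)))
    return combined > 0 and compacted / combined <= max_ratio


overlapping = [{"a": 1, "b": 2, "c": None}, {"a": 3, "b": 4}]
disjoint = [{"x": 1}, {"y": 2}]

print(should_compact(overlapping))  # heavy key overlap, worth compacting
print(should_compact(disjoint))     # no overlap, compaction saves nothing
```

Files that share many keys have a compacted size well below their combined size, which is what makes them good candidates.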
COMPACTING DATA FILE HISTORIES

Publication number: 20150120656

Abstract: Methods, systems, and apparatus for obtaining one or more metadata files, determining, by one or more computers and in accordance with a minor compaction setting, to perform a minor compaction of the one or more metadata files, creating one or more intermediate metadata files that each include at least compacted contents of one or more of the metadata files, according to the determination to perform minor compaction of the one or more metadata files, determining, in accordance with a major compaction setting, to perform a major compaction of one or more of the intermediate metadata files, and creating one or more snapshot metadata files that each include at least compacted contents of one or more of the intermediate metadata files, according to the determination to perform the major compaction of one or more of the intermediate metadata files.

Type: Application

Filed: October 28, 2014

Publication date: April 30, 2015

Inventors: Jagannathan Ramnarayanan, Ashvin Agrawal, Anthony M. Baker, Daniel Allen Smith, Hemant Bhanawat, Swapnil Prakash Bawaskar
