Patents by Inventor Vijayan Prabhakaran

Vijayan Prabhakaran has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 12216653
    Abstract: Methods for improving performance of tiered storage of a data processing service by detecting and reducing thrashing of data blocks between warm and cold storage tiers are disclosed. In order to understand the frequency of hits by incoming queries to data blocks that are not currently stored in the warm storage tier, the elapsed time between query hits to the respective data blocks may be tracked using timers. Times below a given amount of time may be used to indicate thrashing. For example, recently evicted data blocks that are subsequently hit by a query within a short amount of time since eviction may indicate thrashing. In scenarios in which thrashing may be occurring, a threshold corresponding to the number of times a given data block in the cold storage tier receives a query hit before being added to the warm storage tier may be turned on.
    Type: Grant
    Filed: March 31, 2022
    Date of Patent: February 4, 2025
    Assignee: Amazon Technologies, Inc.
    Inventors: Induja Sreekanthan, Sriram Subramanian, Athanasios Papathanasiou, Vijayan Prabhakaran
  • Patent number: 12204510
    Abstract: Disclosed is a configuration for managing the organization of data tables in cloud-based storage. The configuration receives metrics for data processing operations on the data table. Metrics include at least one of a size of the data table, a size of each file in the data table, and metadata describing the data table. The configuration automatically executes a cost-benefit analysis based on the one or more metrics for each candidate maintenance operation in a plurality of candidate maintenance operations. The configuration automatically selects a maintenance operation from the candidate maintenance operations to automate based on the cost-benefit analysis of the one or more candidate maintenance operations. The selected maintenance operation is automated and scheduled on the data table.
    Type: Grant
    Filed: May 8, 2023
    Date of Patent: January 21, 2025
    Assignee: Databricks, Inc.
    Inventors: Vijayan Prabhakaran, Himanshu Raja, Rahul Potharaju, Naga Raju Bhanoori, Lin Ma, Rajesh Parangi Sharabhalingappa, Jintian Liang, Zachary Vaughn Schuermann, Kam Cheung Ting
  • Publication number: 20250013606
    Abstract: A data processing service generates a data classifier tree for managing data files of a data table. The data classifier tree may be configured as a KD-classifier tree and includes a plurality of nodes and edges. A node of the data classifier tree may represent a splitting condition with respect to key-values for a respective key. A node of the data classifier tree may be associated with one or more data files assigned to the node. The data files assigned to the node each include a subset of records having key-values that satisfy the conditions represented by the node and parent nodes of the node. The data processing service may efficiently cluster the data in the data table while reducing the number of data files that are rewritten when data is modified or added to the data table.
    Type: Application
    Filed: July 5, 2023
    Publication date: January 9, 2025
    Inventors: Prakhar Jain, Frederick Ryan Johnson, Terry Kim, Vijayan Prabhakaran, Bart Samwel
  • Publication number: 20240378181
    Abstract: Disclosed is a configuration for managing the organization of data tables in cloud-based storage. The configuration receives metrics for data processing operations on the data table. Metrics include at least one of a size of the data table, a size of each file in the data table, and metadata describing the data table. The configuration automatically executes a cost-benefit analysis based on the one or more metrics for each candidate maintenance operation in a plurality of candidate maintenance operations. The configuration automatically selects a maintenance operation from the candidate maintenance operations to automate based on the cost-benefit analysis of the one or more candidate maintenance operations. The selected maintenance operation is automated and scheduled on the data table.
    Type: Application
    Filed: May 8, 2023
    Publication date: November 14, 2024
    Inventors: Vijayan Prabhakaran, Himanshu Raja, Rahul Potharaju, Naga Raju Bhanoori, Lin Ma, Rajesh Parangi Sharabhalingappa, Jintian Liang, Zach Schuermann, Kam Cheung Ting
  • Patent number: 11886422
    Abstract: A protocol for implementing ACID transactions that provides snapshot isolation in a distributed setting that does not require synchronized clocks is described. The protocol ensures at commit time that transactions touching common objects do not commit out of order. The protocol can be used in the context of a distributed data lake built on an object store in which clients can transactionally add or remove objects from logical tables.
    Type: Grant
    Filed: March 29, 2021
    Date of Patent: January 30, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Bohou Li, Vijayan Prabhakaran, Mehul A. Shah, Benjamin Sowell, Douglas Brian Terry
  • Patent number: 11842085
    Abstract: Methods for modeling performance of tiered storage of a data processing service given an increase in the storage capacity of a warm storage tier of the tiered storage are disclosed. Buffers in the warm storage tier are used to store data block identifiers corresponding to a set of data blocks that would be stored in the warm storage tier given the increase in storage capacity in addition to those already stored in the warm storage tier. When an incoming query targets a data block that has a corresponding data block identifier in one of the buffers, a hit counter is incremented in order to track the hit rate that would be made on the up-sized warm storage tier. In response to adding the data block targeted by the query to the warm storage tier, one or more evictions from the warm storage tier may additionally be triggered.
    Type: Grant
    Filed: March 31, 2022
    Date of Patent: December 12, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Induja Sreekanthan, Sriram Subramanian, Athanasios Papathanasiou, Vijayan Prabhakaran
  • Patent number: 11709809
    Abstract: Techniques for using tree data structures to maintain a transactionally consistent set with support for time-travel queries are described. When a transaction commits, a new version of the tree data structure is created using a copy-on-write based method such that the tree shares internal nodes with previous trees to save space. This approach may be used in the implementation of a transactional data catalog in which the files that make up a table are stored in a transactional set.
    Type: Grant
    Filed: March 29, 2021
    Date of Patent: July 25, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Bohou Li, Vijayan Prabhakaran, Mehul A. Shah, Benjamin Sowell, Douglas Brian Terry
  • Patent number: 11599514
    Abstract: Techniques for implementing systems using transactional version sets are described. Transactional version sets or t-sets include a collection of elements, each having a collection of metadata. A t-set is transactional in that a sequence of updates to one or more t-sets are made within an atomic transaction. A t-set is versioned since each committed transaction that updates it produces a new timestamped version that can be accessed via time-travel queries.
    Type: Grant
    Filed: March 29, 2021
    Date of Patent: March 7, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Vinay Agrawal, Bohou Li, Vikas Malik, Tushar Poddar, Vijayan Prabhakaran, Mukesh Punhani, Mehul A. Shah, Benjamin Sowell, Douglas Brian Terry
  • Patent number: 11341104
    Abstract: Techniques for resizing a distributed database are described. A request to resize a distributed database is received. The distributed database stores data organized into one or more rows of one or more tables. Each node of the first plurality of nodes is assigned a portion of the data. A portion of the data assigned to a first node in the first plurality of nodes is selected to be assigned to a second node in a second plurality of nodes. The number of nodes in the first and second plurality of nodes is different, and the first and second plurality of nodes include at least one common node. Metadata of the selected portion of the data is transferred from the first node to the second node. The metadata includes a location of the selected portion of the data within a provider network.
    Type: Grant
    Filed: March 21, 2019
    Date of Patent: May 24, 2022
    Assignee: Amazon Technologies, Inc.
    Inventors: Vijayan Prabhakaran, Rajesh Parangi Sharabhalingappa, Sanuj Basu, Gokul Soundararajan, Krishna Chaitanya Gudipati, Aditya Subrahmanyan
  • Patent number: 11307765
    Abstract: Data in a storage system is deduplicated after receiving from at least one writing entity requests for a plurality of write operations for a corresponding plurality of data blocks in a storage object. The received blocks are buffered and sorted in order and a sequence of clumps is created from the buffered blocks, where each clump comprises a grouping of at least one of the sorted, buffered blocks. A boundary is determined between at least one pair of clumps based at least in part on the content of at least one of the buffered blocks, and it is then determined whether at least one of the clumps is a duplicate of a previously stored clump.
    Type: Grant
    Filed: March 18, 2019
    Date of Patent: April 19, 2022
    Assignee: VMware, Inc.
    Inventors: R. Hugo Patterson, III, Sazzala Reddy, Vijayan Prabhakaran, Garrett Smith, Lakshmi Narayanan Bairavasundaram, Ganesh Venkitachalam
  • Publication number: 20190250818
    Abstract: Data in a storage system is deduplicated after receiving from at least one writing entity requests for a plurality of write operations for a corresponding plurality of data blocks in a storage object. The received blocks are buffered and sorted in order and a sequence of clumps is created from the buffered blocks, where each clump comprises a grouping of at least one of the sorted, buffered blocks. A boundary is determined between at least one pair of clumps based at least in part on the content of at least one of the buffered blocks, and it is then determined whether at least one of the clumps is a duplicate of a previously stored clump.
    Type: Application
    Filed: March 18, 2019
    Publication date: August 15, 2019
    Inventors: R. Hugo Patterson, III, Sazzala Reddy, Vijayan Prabhakaran, Garrett Smith, Lakshmi Narayanan Bairavasundaram, Ganesh Venkitachalam
  • Patent number: 10235044
    Abstract: Data in a storage system is deduplicated after receiving from at least one writing entity requests for a plurality of write operations for a corresponding plurality of data blocks in a storage object. The received blocks are buffered and sorted in order and a sequence of clumps is created from the buffered blocks, where each clump comprises a grouping of at least one of the sorted, buffered blocks. A boundary is determined between at least one pair of clumps based at least in part on the content of at least one of the buffered blocks, and it is then determined whether at least one of the clumps is a duplicate of a previously stored clump.
    Type: Grant
    Filed: June 9, 2016
    Date of Patent: March 19, 2019
    Assignee: Datrium, Inc.
    Inventors: R. Hugo Patterson, III, Sazzala Reddy, Vijayan Prabhakaran, Garrett Smith, Lakshmi Narayanan Bairavasundaram, Ganesh Venkitachalam
  • Patent number: 9836362
    Abstract: A machine-implemented method includes automatically determining that a host device is restarting from a disruptive stoppage of operations and that in-process write transactions by the host device to respective pages of non-volatile storage may have been interrupted. The method includes, in response to the determination, automatically scanning the non-volatile storage for all metadata-containing storage pages with respective identifications S(i) and having corresponding metadata relating each respective storage page S(i) to a corresponding data page P(j) and a corresponding version number V(k). The method includes automatically identifying scanned storage pages S(i) that have for their corresponding data page P(j) a most recent version number HV(k) and, in some cases, a second-most recent version number.
    Type: Grant
    Filed: December 20, 2016
    Date of Patent: December 5, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Vijayan Prabhakaran, Lidong Zhou, Thomas Lee Rodeheffer
  • Publication number: 20170103002
    Abstract: A cyclic commit protocol is used to store relationships between transactions and is used by the technology to determine whether a transaction is committed or not. The protocol allows creation of a cycle of transactions which can be used to recover the state of a storage device after a host failure by identifying the last committed version of intention records as committed or uncommitted based on the data stored in the physical pages.
    Type: Application
    Filed: December 20, 2016
    Publication date: April 13, 2017
    Inventors: Vijayan Prabhakaran, Lidong Zhou, Thomas Lee Rodeheffer
  • Publication number: 20170031994
    Abstract: Data in a storage system is deduplicated after receiving from at least one writing entity requests for a plurality of write operations for a corresponding plurality of data blocks in a storage object. The received blocks are buffered and sorted in order and a sequence of clumps is created from the buffered blocks, where each clump comprises a grouping of at least one of the sorted, buffered blocks. A boundary is determined between at least one pair of clumps based at least in part on the content of at least one of the buffered blocks, and it is then determined whether at least one of the clumps is a duplicate of a previously stored clump.
    Type: Application
    Filed: June 9, 2016
    Publication date: February 2, 2017
    Applicant: Datrium, Inc.
    Inventors: R. Hugo Patterson, III, Sazzala Reddy, Vijayan Prabhakaran, Garrett Smith, Lakshmi Narayanan Bairavasundaram, Ganesh Venkitachalam
  • Patent number: 9542431
    Abstract: A cyclic commit protocol is used to store relationships between transactions and is used by the technology to determine whether a transaction is committed or not. The protocol allows creation of a cycle of transactions which can be used to recover the state of a storage device after a host failure by identifying the last committed version of intention records as committed or uncommitted based on the data stored in the physical pages.
    Type: Grant
    Filed: October 24, 2008
    Date of Patent: January 10, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Vijayan Prabhakaran, Lidong Zhou, Thomas Lee Rodeheffer
  • Patent number: 9235396
    Abstract: Given a data-parallel program and a large input dataset, a data partitioning plan is automatically generated, without having to first run the program on the input dataset, that substantially optimizes performance of the distributed execution system, which explicitly measures and infers various properties of both data and computation to perform cost estimation and optimization. Estimation may comprise inferring the cost of a candidate data partitioning plan, and optimization may comprise generating an optimal partitioning plan based on the estimated costs of computation and input/output.
    Type: Grant
    Filed: December 13, 2011
    Date of Patent: January 12, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Qifa Ke, Vijayan Prabhakaran, Yinglian Xie, Yuan Yu, Jingyue Wu, Junfeng Yang
  • Patent number: 8972491
    Abstract: An application programming interface is provided that allows applications to assign multiple service-level agreements to their data transactions. The service-level agreements include latency bounds and consistency guarantees. The applications may assign utility values to each of the service-level agreements. A monitor component monitors the various replica nodes in a cloud storage system for latency and consistency, and when a transaction is received from an application, the monitor determines which of the replica nodes can likely fulfill the transaction in satisfaction of any of the service-level agreements. Where multiple service-level agreements can be satisfied, the replica node that can fulfill the transaction according to the service-level agreement with the greatest utility is selected. The application may be charged for the transaction based on the utility of the service-level agreement that was satisfied.
    Type: Grant
    Filed: October 5, 2012
    Date of Patent: March 3, 2015
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Hussam Abu-Libdeh, Marcos K. Aguilera, Mahesh Balakrishnan, Ramakrishna R. Kotla, Vijayan Prabhakaran, Douglas Brian Terry
  • Publication number: 20140101225
    Abstract: An application programming interface is provided that allows applications to assign multiple service-level agreements to their data transactions. The service-level agreements include latency bounds and consistency guarantees. The applications may assign utility values to each of the service-level agreements. A monitor component monitors the various replica nodes in a cloud storage system for latency and consistency, and when a transaction is received from an application, the monitor determines which of the replica nodes can likely fulfill the transaction in satisfaction of any of the service-level agreements. Where multiple service-level agreements can be satisfied, the replica node that can fulfill the transaction according to the service-level agreement with the greatest utility is selected. The application may be charged for the transaction based on the utility of the service-level agreement that was satisfied.
    Type: Application
    Filed: October 5, 2012
    Publication date: April 10, 2014
    Applicant: Microsoft Corporation
    Inventors: Hussam Abu-Libdeh, Marcos K. Aguilera, Mahesh Balakrishnan, Ramakrishna R. Kotla, Vijayan Prabhakaran, Douglas Brian Terry
  • Patent number: 8631272
    Abstract: A duplicate-aware disk array (DADA) leaves duplicated content on the disk array largely unmodified, instead of removing duplicated content, and then uses these duplicates to improve system performance, reliability, and availability of the disk array. Several implementations disclosed herein are directed to the selection of one duplicate from among a plurality of duplicates to act as the proxy for the other duplicates found in the disk array. Certain implementations disclosed herein are directed to scrubbing latent sector errors (LSEs) on duplicate-aware disk arrays. Other implementations are directed to disk reconstruction/recovery on duplicate-aware disk arrays. Yet other implementations are directed to load balancing on duplicate-aware disk arrays.
    Type: Grant
    Filed: March 4, 2011
    Date of Patent: January 14, 2014
    Assignee: Microsoft Corporation
    Inventors: Vijayan Prabhakaran, Yiying Zhang
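
The abstract of patent 12216653 above describes detecting thrashing between warm and cold storage tiers and then requiring several query hits before a cold block is promoted. The sketch below is one possible reading of that idea, not the patented method: the class name ThrashingGuard, the parameters rehit_window and hits_to_promote, and the choice to leave the threshold on once it is triggered are all assumptions made for illustration.

```python
class ThrashingGuard:
    """Hypothetical sketch: flag quick re-hits on evicted blocks as thrashing."""

    def __init__(self, rehit_window=60.0, hits_to_promote=3):
        self.rehit_window = rehit_window        # seconds; assumed tuning knob
        self.hits_to_promote = hits_to_promote  # assumed promotion threshold
        self.evicted_at = {}    # block id -> time it left the warm tier
        self.cold_hits = {}     # block id -> hits counted while cold
        self.threshold_on = False

    def record_eviction(self, block_id, now):
        """Remember when a block was evicted from the warm tier."""
        self.evicted_at[block_id] = now
        self.cold_hits.pop(block_id, None)

    def on_cold_hit(self, block_id, now):
        """Return True if the block should be added to the warm tier."""
        evicted = self.evicted_at.get(block_id)
        # A hit shortly after eviction is treated as a sign of thrashing.
        if evicted is not None and now - evicted < self.rehit_window:
            self.threshold_on = True
        if not self.threshold_on:
            self._forget(block_id)
            return True  # no thrashing seen: promote on the first hit
        count = self.cold_hits.get(block_id, 0) + 1
        if count >= self.hits_to_promote:
            self._forget(block_id)
            return True
        self.cold_hits[block_id] = count
        return False

    def _forget(self, block_id):
        """Drop bookkeeping for a block that is being promoted."""
        self.evicted_at.pop(block_id, None)
        self.cold_hits.pop(block_id, None)
```

In this sketch a caller would invoke record_eviction whenever the warm tier evicts a block and on_cold_hit whenever a query touches a block that is not in the warm tier. Timestamps are passed in explicitly so the behavior is easy to reason about; a real service would likely use its own clock and would need some policy for turning the threshold back off.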
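
The abstract of patent 11709809 above describes creating a new version of a tree on each commit with a copy-on-write method, so that versions share internal nodes and earlier versions stay available for time-travel queries. A minimal sketch of that general technique follows, using an unbalanced binary search tree with path copying. The names Node, insert, contains, and VersionedSet are hypothetical, removal is omitted, and the tree shape is an assumption; the abstract does not specify which tree structure is used.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    """Immutable tree node; existing nodes are never modified in place."""
    key: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(node, key):
    """Return a root that contains key, copying only the path to the key."""
    if node is None:
        return Node(key)
    if key < node.key:
        return Node(node.key, insert(node.left, key), node.right)
    if key > node.key:
        return Node(node.key, node.left, insert(node.right, key))
    return node  # key already present: reuse the existing subtree


def contains(node, key):
    """Standard binary-search lookup starting from the given root."""
    while node is not None:
        if key == node.key:
            return True
        node = node.left if key < node.key else node.right
    return False


class VersionedSet:
    """Hypothetical versioned set: one root per committed version."""

    def __init__(self):
        self.roots = [None]  # version 0 is the empty set

    def commit(self, keys_to_add):
        """Apply all additions as one new version and return its number."""
        root = self.roots[-1]
        for key in keys_to_add:
            root = insert(root, key)
        self.roots.append(root)
        return len(self.roots) - 1

    def contains_at(self, version, key):
        """Time-travel query: membership as of the given version."""
        return contains(self.roots[version], key)
```

For example, committing the keys "m", "c", and "t" and then committing "x" yields two versions whose roots differ while the untouched left subtree is the same object in both, which is the space saving the abstract refers to. Here version numbers stand in for the timestamps a real catalog would presumably record.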