Patents by Inventor Abhinav Duggal

Abhinav Duggal has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10509675
    Abstract: A set of tasks, associated with a replication job, are generated for replicating from a source to destination site. An initial number of worker nodes are allocated to process the tasks. Each task involves a first type of worker node at the source site, a second type of worker node at the destination site, and includes one or more of copying an object from the source site to the destination site, or deleting an object from the destination site. The replication job is monitored. Based on the monitoring, a number of worker nodes is adjusted from the initial number to a new number, different from the initial number.
    Type: Grant
    Filed: February 2, 2018
    Date of Patent: December 17, 2019
    Assignee: EMC IP Holding Company LLC
    Inventors: Atul Avinash Karmarkar, Philip Shilane, Kevin Xu, Abhinav Duggal
  • Patent number: 10452643
    Abstract: Systems and methods for verifying files in bulk in a file system. When files are represented by a segment tree, the levels of the segment trees are walked by level such that multiple files are verified at the same time in order to identify missing segments. Then, a bottom up scan is performed using the missing segments to identify the files corresponding to the missing segments. The missing files can then be handled by the file system.
    Type: Grant
    Filed: June 28, 2016
    Date of Patent: October 22, 2019
    Assignee: EMC IP Holding Company LLC
    Inventors: Abhinav Duggal, Tony Wong
  • Publication number: 20190243702
    Abstract: A controller at a source site generates a set of tasks associated with a replication job. Each task involves a source worker node from among a set of source worker nodes at the source site, a destination worker node from among a set of destination worker nodes at the destination site, and includes one or more of copying an object from the source to destination site, or deleting an object from the destination site. Status update messages concerning the tasks are received at a message queue connected between the controller and the set of source worker nodes. The status update messages are logged into a persistent key-value store. Upon a failure to complete the replication job, the key-value store is accessed to identify tasks that were and were not completed before the failure. The tasks that were not completed are resent to the source worker nodes.
    Type: Application
    Filed: February 2, 2018
    Publication date: August 8, 2019
    Inventors: Philip Shilane, Kevin Xu, Abhinav Duggal, Atul Avinash Karmarkar
  • Publication number: 20190243547
    Abstract: A source site includes a controller, a set of source worker nodes, and a message queue connected between the controller and source worker nodes. A destination site includes a set of destination worker nodes. The controller identifies differences between a first snapshot created at the source site at a first time and a second snapshot created at a second time, after the first time. Based on the differences, a set of tasks are generated. The tasks include one or more of copying an object from the source to destination or deleting an object from the destination. The controller places the tasks onto the message queue. A first source worker node retrieves the first task and coordinates with a first destination worker node to perform the first task. A second source worker node retrieves the second task and coordinates with a second destination worker node to perform the second task.
    Type: Application
    Filed: February 2, 2018
    Publication date: August 8, 2019
    Inventors: Abhinav Duggal, Atul Avinash Karmarkar, Philip Shilane, Kevin Xu
  • Publication number: 20190243688
    Abstract: A set of tasks, associated with a replication job, are generated for replicating from a source to destination site. An initial number of worker nodes are allocated to process the tasks. Each task involves a first type of worker node at the source site, a second type of worker node at the destination site, and includes one or more of copying an object from the source site to the destination site, or deleting an object from the destination site. The replication job is monitored. Based on the monitoring, a number of worker nodes is adjusted from the initial number to a new number, different from the initial number.
    Type: Application
    Filed: February 2, 2018
    Publication date: August 8, 2019
    Inventors: Atul Avinash Karmarkar, Philip Shilane, Kevin Xu, Abhinav Duggal
  • Publication number: 20190245918
    Abstract: A source worker node at a source site fetches a task from a message queue. The task specifies replicating a first object at the source site to a destination site. A request for a connection is issued from the source worker node to the destination site. The request is received by a load balancer at the destination site and assigned to a destination worker node. A connection is established between the source and destination worker nodes. A determination is made that the destination site does not include an object that is the same as the first object. Upon the determination, a deduplication is performed between the source and destination worker nodes of segments into which the first object has been divided. Deduplicated segments of the first object are transmitted from the source worker node to the destination worker node for storage at the destination site.
    Type: Application
    Filed: February 2, 2018
    Publication date: August 8, 2019
    Inventors: Kevin Xu, Abhinav Duggal, Atul Avinash Karmarkar, Philip Shilane
  • Patent number: 10353867
    Abstract: According to one embodiment, fingerprints of segment trees are scanned, each segment tree representing one of the files in a filesystem namespace. For each of the fingerprints representing a segment, a corresponding bit in a live reference vector (LRV) is set to indicate that the segment has been referenced by a file in the filesystem namespace. A file index mapping fingerprints to storage locations of segments is scanned, including, for each fingerprint found in the file index, setting a corresponding bit in a live index vector (LIV) to indicate that the fingerprint exists in the file index. The LR vector and the LI vector are compared to determine whether there is any mismatch. A garbage collection operation is performed in response to determining that the LR vector and the LI vector are matched.
    Type: Grant
    Filed: June 27, 2016
    Date of Patent: July 16, 2019
    Assignee: EMC IP Holding Company LLC
    Inventors: Tony Wong, Abhinav Duggal
  • Publication number: 20190213169
    Abstract: Systems and methods for incrementally repairing physical locality for live or active data are provided. Files that are enumerated to determine their locality are identified using dataless consistency points. The files are walked in order to measure their locality or at least the locality of their data segments. Locality repair is performed when the locality is greater than a threshold locality.
    Type: Application
    Filed: March 18, 2019
    Publication date: July 11, 2019
    Inventor: Abhinav Duggal
  • Patent number: 10318159
    Abstract: In general, in one aspect, the invention relates to a method for managing persistent storage in a storage system. The method includes determining, using a first plurality of containers in the storage system, a locality threshold, and performing, using the locality threshold, a locality repair on a first container of a second plurality of containers in the storage system, wherein the second plurality of containers comprises the first plurality of containers.
    Type: Grant
    Filed: June 14, 2017
    Date of Patent: June 11, 2019
    Assignee: EMC IP Holding Company LLC
    Inventors: Lan Bai, Atul Karmarkar, Abhinav Duggal
  • Patent number: 10235371
    Abstract: Systems and methods for incrementally repairing physical locality for live or active data are provided. Files that are enumerated to determine their locality are identified using dataless consistency points. The files are walked in order to measure their locality or at least the locality of their data segments. Locality repair is performed when the locality is greater than a threshold locality.
    Type: Grant
    Filed: June 28, 2016
    Date of Patent: March 19, 2019
    Assignee: EMC IP Holding Company LLC
    Inventor: Abhinav Duggal
  • Patent number: 10146697
    Abstract: Embodiments are directed to a perfect physical garbage collection (PPGC) process that uses a NUMA-aware perfect hash vector. The process splits a perfect hash vector (PHVEC) into a number of perfect hash vectors, wherein the number corresponds to a number of nodes having a processing core and associated local memory, directs each perfect hash vector to a respective local memory of a node so that each perfect hash vector accesses only a local memory, and assigns fingerprints in the perfect hash vector to a respective node using a mask function. The process also performs a simultaneous creation of perfect hash vectors in a multi-threaded manner by scanning the Index once.
    Type: Grant
    Filed: December 22, 2016
    Date of Patent: December 4, 2018
    Assignee: EMC IP Holding Company LLC
    Inventors: Abhinav Duggal, Tony Wong
  • Patent number: 10120875
    Abstract: Techniques for deduplicating data streams are described herein. According to one embodiment, a first data stream is received to be stored in a storage system, where the first data stream includes data blocks and each data block includes a header and a footer. A boundary detector is to detect boundaries of the data blocks by matching at least a portion of a header with a footer of each data block and a header of an adjacent data block. An anchoring unit is to anchor the first data stream based on the determined boundaries of the data blocks using a plurality of anchors. A deduplication engine is to deduplicate the first data stream into a plurality of deduplicated data segments based on the plurality of anchors. The deduplicated data segments are then stored in one or more persistent storage devices of the storage system.
    Type: Grant
    Filed: December 2, 2014
    Date of Patent: November 6, 2018
    Assignee: EMC IP Holding Company LLC
    Inventors: Abhinav Duggal, Anshita Agrawal
  • Patent number: 10108544
    Abstract: Embodiments are directed to a perfect physical garbage collection (PPGC) process that dynamically estimates duplicate containers using a Bloom filter-based dead vector by scanning an index containing a mapping of fingerprints to a container ID for a plurality of containers; returning, for each fingerprint, a fingerprint sequence associating each fingerprint with a respective unique container ID, wherein a last entry of the sequence is preserved and the remaining entries are considered duplicates; and maintaining a duplicate array of counts of the duplicates indexed by container IDs, and wherein the duplicate array comprises a duplicate counter that keeps track of a number of live duplicated segments for each container, and further wherein a live segment is a live duplicate segment if a segment with a same fingerprint exists in another container with a higher container ID.
    Type: Grant
    Filed: December 22, 2016
    Date of Patent: October 23, 2018
    Assignee: EMC IP Holding Company LLC
    Inventors: Abhinav Duggal, Tony Wong
  • Patent number: 10108543
    Abstract: Embodiments are directed to a perfect physical garbage collection (PPGC) process that uses a perfect hash vector instead of large Bloom filters of the regular physical garbage collection process for the live and live instance vectors and consolidates both into a single live vector using the perfect hash vector. A method of PPGC includes an analysis phase walking an index containing a mapping of fingerprints to a container ID for a plurality of containers and building a perfect hash function for a walk vector and a live vector, wherein the live vector uses a perfect hash vector, an enumeration phase inserting live segments in memory into the perfect hash vector, a select phase traversing the plurality of containers and selecting containers that meet a defined liveness threshold, and a copy phase copying live segments out of the selected containers.
    Type: Grant
    Filed: December 22, 2016
    Date of Patent: October 23, 2018
    Assignee: EMC IP Holding Company LLC
    Inventors: Abhinav Duggal, Tony Wong
  • Patent number: 9898369
    Abstract: In an embodiment of the present invention, a method can include converting a data-full snapshot having a plurality of user data and corresponding metadata to a dataless snapshot. The dataless snapshot stores the metadata corresponding to the user data. Converting the data-full snapshot to the dataless snapshot includes removing the user data from the data-full snapshot. The metadata can be at least one of a checksum or hash of the corresponding user data.
    Type: Grant
    Filed: August 5, 2014
    Date of Patent: February 20, 2018
    Assignee: EMC IP Holding Company LLC
    Inventors: Dheer Moghe, Abhinav Duggal
  • Patent number: 9767106
    Abstract: In an embodiment, a method can include loading a first snapshot of data stored on a storage device, the first snapshot being verified. The method can further include capturing a second snapshot of data stored on the storage device after waiting an interval of time from creation of the first snapshot. The method can further include generating a list of closed files between the two snapshots by differentiating the first snapshot and the second snapshot. The method can additionally include verifying the second snapshot by comparing the closed files in the list of closed files in the second snapshot to the closed files in the storage device, which is an active snapshot. The method can also include deleting the first snapshot.
    Type: Grant
    Filed: June 30, 2014
    Date of Patent: September 19, 2017
    Assignee: EMC IP Holding Company LLC
    Inventors: Abhinav Duggal, Dheer Moghe
  • Patent number: 9225691
    Abstract: Exemplary methods for deduplicating encrypted files are described herein. The exemplary methods include receiving a first encrypted data file from a remote source that is encrypted by a first security key. In one embodiment, the methods include transmitting to a remote security manager a first key identifier (ID) that is extracted from the first data file, the first key ID identifying the first security key. In one aspect of the invention, the methods include, in response to receiving the first security key from the remote security manager based on the first key ID, decrypting the first data file using the first security key provided by the remote security manager. In at least one embodiment, the methods include deduplicating the decrypted first data file.
    Type: Grant
    Filed: September 27, 2013
    Date of Patent: December 29, 2015
    Assignee: EMC Corporation
    Inventors: Shankar Balasubramanian, Abhinav Duggal, Bharath Krishnappa, Ravi Sharda
  • Patent number: 9183218
    Abstract: Techniques for deduplicating structured datasets using hybrid chunking and header removal. According to one embodiment, a request is received to deduplicate a file having a plurality of data blocks, each data block having a header and a data portion. The data blocks are anchored using first anchors to indicate block boundaries based on their headers. At least one second anchor is added within a data portion of at least one data block if the data portion of at least one data block satisfies a predetermined condition. The data blocks are then deduplicated based on the first and second anchors.
    Type: Grant
    Filed: June 29, 2012
    Date of Patent: November 10, 2015
    Assignee: EMC Corporation
    Inventors: Grant R. Wallace, Abhinav Duggal
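
The hybrid chunking described in the abstract of patent 9183218 can be illustrated with a minimal sketch. It is not the patented implementation. The block layout (a 4-byte `BLK1` marker followed by a 4-byte big-endian payload length), the "predetermined condition" (a data portion longer than `MAX_SEGMENT` bytes), and all function names are assumptions made only for illustration; the abstract does not specify them.

```python
import hashlib
import struct

MAGIC = b"BLK1"      # hypothetical header marker (assumption)
HEADER_LEN = 8       # 4-byte marker + 4-byte big-endian payload length (assumption)
MAX_SEGMENT = 16     # hypothetical "predetermined condition": payload longer than this


def make_block(payload: bytes) -> bytes:
    """Build one data block: header followed by its data portion."""
    return MAGIC + struct.pack(">I", len(payload)) + payload


def find_anchors(data: bytes) -> list[int]:
    """Return sorted anchor offsets: block boundaries plus in-payload anchors."""
    anchors = []
    pos = 0
    while pos < len(data):
        header = data[pos:pos + HEADER_LEN]
        if len(header) < HEADER_LEN or header[:4] != MAGIC:
            raise ValueError(f"no block header at offset {pos}")
        (length,) = struct.unpack(">I", header[4:])
        start = pos + HEADER_LEN
        end = start + length
        if end > len(data):
            raise ValueError(f"truncated block at offset {pos}")
        anchors.append(pos)  # first anchor: block boundary found from the header
        if length > MAX_SEGMENT:
            # second anchors: split a large data portion at fixed intervals
            anchors.extend(range(start + MAX_SEGMENT, end, MAX_SEGMENT))
        pos = end
    return anchors


def deduplicate(data: bytes, store: dict) -> list[str]:
    """Split data at the anchors, keep one copy of each distinct segment in
    store, and return the list of fingerprints needed to rebuild data."""
    bounds = find_anchors(data) + [len(data)]
    recipe = []
    for begin, end in zip(bounds, bounds[1:]):
        segment = data[begin:end]
        fingerprint = hashlib.sha256(segment).hexdigest()
        store.setdefault(fingerprint, segment)
        recipe.append(fingerprint)
    return recipe
```

In this sketch, a file that contains the same large block twice stores the segments of that block only once, and the original bytes can be rebuilt by concatenating `store[fp]` for each fingerprint in the returned list.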