Patents by Inventor Srinath Krishnamachari

Srinath Krishnamachari has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10049118
    Abstract: A cluster-wide consistency checker ensures that two file systems of a storage input/output (I/O) stack executing on each node of a cluster are self-consistent as well as consistent with respect to each other. The file systems include a deduplication file system and a host-facing file system that cooperate to provide a layered file system of the storage I/O stack. The deduplication file system is a log-structured file system managed by an extent store layer of the storage I/O stack, whereas the host-facing file system is managed by a volume layer of the stack. Illustratively, each log-structured file system implements a key-value store and cooperates with other nodes of the cluster to provide a cluster-wide (global) key-value store. The consistency checker verifies and/or fixes on-disk structures of the layered file system to ensure its consistency.
    Type: Grant
    Filed: June 1, 2015
    Date of Patent: August 14, 2018
    Assignee: NetApp, Inc.
    Inventors: Dhaval Patel, Chaitanya Patel, John Muth, Srinath Krishnamachari
  • Patent number: 9952765
    Abstract: A layout of a transaction log enables efficient logging of metadata into entries of the log, as well as efficient reclamation and recovery of the log entries by a volume layer of a storage input/output (I/O) stack executing on one or more nodes of a cluster. The transaction log is illustratively a two-stage, append-only logging structure, wherein the first stage is non-volatile random access memory (NVRAM) embodied as an NVLog and the second stage is disk, e.g., solid state drive (SSD). During crash recovery, the log entries are examined for consistency and scanned to identify those entries that have completed and those that are active, which require replay. The log entries are walked from oldest to newest (using sequence numbers) searching for the highest sequence number. Partially complete log entries (e.g., log entries in-progress when a crash occurs) may be discarded for failing a checksum (e.g., a CRC error).
    Type: Grant
    Filed: October 6, 2015
    Date of Patent: April 24, 2018
    Assignee: NetApp, Inc.
    Inventors: Srinath Krishnamachari, Anshul Pundir, Sriranjani Babu
  • Patent number: 9846539
    Abstract: A technique recovers from a low space condition associated with storage space reserved in an extent store to accommodate write requests received from a host and associated metadata managed by a layered file system of a storage input/output (I/O) stack executing on one or more nodes of a cluster. The write requests, including user data, are persistently recorded on non-volatile random access memory (NVRAM) prior to returning an acknowledgement to the host by a persistence layer of the storage I/O stack. Volume metadata managed by a volume layer of the layered file system is embodied as mappings from logical block addresses (LBAs) of a logical unit (LUN) accessible by the host to extent keys maintained by an extent store layer of the layered file system. Extent store metadata managed by the extent store layer is embodied as mappings from the extent keys to the storage locations of the extents on storage devices of storage arrays coupled to the nodes of the cluster.
    Type: Grant
    Filed: January 22, 2016
    Date of Patent: December 19, 2017
    Assignee: NetApp, Inc.
    Inventors: Sriranjani Babu, Mandar Naik, Srinath Krishnamachari, Dhaval Patel
  • Patent number: 9836355
    Abstract: Embodiments herein are directed to efficient crash recovery of persistent metadata managed by a volume layer of a storage input/output (I/O) stack executing on one or more nodes of a cluster. Volume metadata managed by the volume layer is organized as a multi-level dense tree, wherein each level of the dense tree includes volume metadata entries for storing the volume metadata. When a level of the dense tree is full, the volume metadata entries of the level are merged with the next lower level of the dense tree. During a merge operation, two sets of generation IDs may be used in accordance with a double buffer arrangement: a first generation ID for the append buffer that is full (i.e., a merge staging buffer) and a second, incremented generation ID for the append buffer that accepts new volume metadata entries. Upon completion of the merge operation, the lower level (e.g., level 1) to which the merge is directed is assigned the generation ID of the merge staging buffer.
    Type: Grant
    Filed: September 22, 2016
    Date of Patent: December 5, 2017
    Assignee: NetApp, Inc.
    Inventors: Anshul Pundir, Janice D'Sa, Srinath Krishnamachari, Ling Zheng
  • Publication number: 20170212919
    Abstract: A bottom-up technique repairs a data structure, e.g., a multi-level dense tree, used to organize volume metadata as metadata entries managed by a volume layer of a storage input/output (I/O) stack executing on one or more nodes of a cluster. The bottom-up repair technique implements a progressive repair algorithm that initially involves traversing each level of the dense tree to determine consistency of metadata entries by ensuring that the entries, e.g., (i) monotonically increase, (ii) do not overlap and (iii), if appropriate, reference (point to) existing entries of a lower level. The technique detects and corrects inconsistencies by, e.g., deleting out-of-order and overlapping entries, and adjusting the range of an index entry to reference the corresponding lower level entry. The technique then examines whether metadata entries at a lower level of the tree are referenced (pointed to) by corresponding index entries in an upper (parent) level.
    Type: Application
    Filed: January 25, 2016
    Publication date: July 27, 2017
    Inventors: Anthony J. Li, Srinath Krishnamachari, Ling Zheng
  • Publication number: 20170212690
    Abstract: A technique recovers from a low space condition associated with storage space reserved in an extent store to accommodate write requests received from a host and associated metadata managed by a layered file system of a storage input/output (I/O) stack executing on one or more nodes of a cluster. The write requests, including user data, are persistently recorded on non-volatile random access memory (NVRAM) prior to returning an acknowledgement to the host by a persistence layer of the storage I/O stack. Volume metadata managed by a volume layer of the layered file system is embodied as mappings from logical block addresses (LBAs) of a logical unit (LUN) accessible by the host to extent keys maintained by an extent store layer of the layered file system. Extent store metadata managed by the extent store layer is embodied as mappings from the extent keys to the storage locations of the extents on storage devices of storage arrays coupled to the nodes of the cluster.
    Type: Application
    Filed: January 22, 2016
    Publication date: July 27, 2017
    Inventors: Sriranjani Babu, Mandar Naik, Srinath Krishnamachari, Dhaval Patel
  • Publication number: 20170097873
    Abstract: A layout of a transaction log enables efficient logging of metadata into entries of the log, as well as efficient reclamation and recovery of the log entries by a volume layer of a storage input/output (I/O) stack executing on one or more nodes of a cluster. The transaction log is illustratively a two-stage, append-only logging structure, wherein the first stage is non-volatile random access memory (NVRAM) embodied as an NVLog and the second stage is disk, e.g., solid state drive (SSD). During crash recovery, the log entries are examined for consistency and scanned to identify those entries that have completed and those that are active, which require replay. The log entries are walked from oldest to newest (using sequence numbers) searching for the highest sequence number. Partially complete log entries (e.g., log entries in-progress when a crash occurs) may be discarded for failing a checksum (e.g., a CRC error).
    Type: Application
    Filed: October 6, 2015
    Publication date: April 6, 2017
    Inventors: Srinath Krishnamachari, Anshul Pundir, Sriranjani Babu
  • Publication number: 20170097771
    Abstract: A layout of a transaction log enables efficient logging of metadata into entries of the log, as well as efficient reclamation and recovery of the log entries by a volume layer of a storage input/output (I/O) stack executing on one or more nodes of a cluster. The transaction log is illustratively a two-stage, append-only logging structure, wherein the first stage is non-volatile random access memory (NVRAM) embodied as an NVLog and the second stage is disk, e.g., solid state drive (SSD). The layout of the logging structure facilitates steady-state logging of metadata managed by the volume layer and crash recovery. Steady-state logging of metadata into the log entries occurs while the storage I/O stack of a node actively processes I/O requests, while crash recovery of the log entries occurs after an unexpected shutdown of the node.
    Type: Application
    Filed: October 1, 2015
    Publication date: April 6, 2017
    Inventors: Srinath Krishnamachari, Anshul Pundir, Sriranjani Babu
  • Publication number: 20170010939
    Abstract: Embodiments herein are directed to efficient crash recovery of persistent metadata managed by a volume layer of a storage input/output (I/O) stack executing on one or more nodes of a cluster. Volume metadata managed by the volume layer is organized as a multi-level dense tree, wherein each level of the dense tree includes volume metadata entries for storing the volume metadata. When a level of the dense tree is full, the volume metadata entries of the level are merged with the next lower level of the dense tree. During a merge operation, two sets of generation IDs may be used in accordance with a double buffer arrangement: a first generation ID for the append buffer that is full (i.e., a merge staging buffer) and a second, incremented generation ID for the append buffer that accepts new volume metadata entries. Upon completion of the merge operation, the lower level (e.g., level 1) to which the merge is directed is assigned the generation ID of the merge staging buffer.
    Type: Application
    Filed: September 22, 2016
    Publication date: January 12, 2017
    Inventors: Anshul Pundir, Janice D'Sa, Srinath Krishnamachari, Ling Zheng
  • Publication number: 20160350358
    Abstract: A cluster-wide consistency checker ensures that two file systems of a storage input/output (I/O) stack executing on each node of a cluster are self-consistent as well as consistent with respect to each other. The file systems include a deduplication file system and a host-facing file system that cooperate to provide a layered file system of the storage I/O stack. The deduplication file system is a log-structured file system managed by an extent store layer of the storage I/O stack, whereas the host-facing file system is managed by a volume layer of the stack. Illustratively, each log-structured file system implements a key-value store and cooperates with other nodes of the cluster to provide a cluster-wide (global) key-value store. The consistency checker verifies and/or fixes on-disk structures of the layered file system to ensure its consistency.
    Type: Application
    Filed: June 1, 2015
    Publication date: December 1, 2016
    Inventors: Dhaval Patel, Chaitanya Patel, John Muth, Srinath Krishnamachari
  • Patent number: 9501359
    Abstract: Embodiments herein are directed to efficient crash recovery of persistent metadata managed by a volume layer of a storage input/output (I/O) stack executing on one or more nodes of a cluster. Volume metadata managed by the volume layer is organized as a multi-level dense tree, wherein each level of the dense tree includes volume metadata entries for storing the volume metadata. When a level of the dense tree is full, the volume metadata entries of the level are merged with the next lower level of the dense tree. During a merge operation, two sets of generation IDs may be used in accordance with a double buffer arrangement: a first generation ID for the append buffer that is full (i.e., a merge staging buffer) and a second, incremented generation ID for the append buffer that accepts new volume metadata entries. Upon completion of the merge operation, the lower level (e.g., level 1) to which the merge is directed is assigned the generation ID of the merge staging buffer.
    Type: Grant
    Filed: September 10, 2014
    Date of Patent: November 22, 2016
    Assignee: NetApp, Inc.
    Inventors: Anshul Pundir, Janice D'Sa, Srinath Krishnamachari, Ling Zheng
  • Publication number: 20160246522
    Abstract: An exactly once semantics (EOS) system of a storage input/output (I/O) stack implements a technique ensuring that non-idempotent operations occur exactly once in a storage system embodied as a node of a cluster. Illustratively, a first layer of the storage I/O stack may act as a client issuing a non-idempotent operation to a second layer of the stack, which may act as a server. According to the technique, the EOS system may wrap (i.e., encapsulate) the non-idempotent operation within a transaction embodied as an EOS transaction data structure having a transaction identifier that uniquely identifies the transaction. The server may complete the transaction and reply with a result to the client, which may acknowledge receipt of the reply. In response to a crash and subsequent recovery of the node, the EOS system may determine whether the transaction had completed prior to the crash. If so, the EOS system ensures that the transaction is not re-played (re-executed).
    Type: Application
    Filed: February 25, 2015
    Publication date: August 25, 2016
    Inventors: Srinath Krishnamachari, Kayuri H. Patel, Jeffrey S. Kimmel, Edward D. McClanahan
  • Publication number: 20160077744
    Abstract: A deferred refcount update technique efficiently frees storage space for metadata (associated with data) to be deleted during a merge operation managed by a volume layer of a node. The metadata is illustratively volume metadata embodied as mappings from logical block addresses (LBAs) of a logical unit (LUN) to extent keys maintained by an extent store layer of the node. One or more requests to delete (or overwrite) an LBA range within a LUN may be captured as page keys associated with metadata pages during the merge operation and the storage space associated with those metadata pages may be freed in an out-of-band fashion. The page keys of the metadata pages may be persistently recorded in a reference count (refcount) log to thereby allow the merge operation to complete without resolving deletion of the keys. A batch of page keys may be organized as one or more delete requests and, once the merge completes, the keys may be inserted into the refcount log.
    Type: Application
    Filed: September 11, 2014
    Publication date: March 17, 2016
    Inventors: Anshul Pundir, Ashwin Pednekar, Srinath Krishnamachari, Ling Zheng
  • Publication number: 20160070618
    Abstract: Embodiments herein are directed to efficient crash recovery of persistent metadata managed by a volume layer of a storage input/output (I/O) stack executing on one or more nodes of a cluster. Volume metadata managed by the volume layer is organized as a multi-level dense tree, wherein each level of the dense tree includes volume metadata entries for storing the volume metadata. When a level of the dense tree is full, the volume metadata entries of the level are merged with the next lower level of the dense tree. During a merge operation, two sets of generation IDs may be used in accordance with a double buffer arrangement: a first generation ID for the append buffer that is full (i.e., a merge staging buffer) and a second, incremented generation ID for the append buffer that accepts new volume metadata entries. Upon completion of the merge operation, the lower level (e.g., level 1) to which the merge is directed is assigned the generation ID of the merge staging buffer.
    Type: Application
    Filed: September 10, 2014
    Publication date: March 10, 2016
    Inventors: Anshul Pundir, Janice D'Sa, Srinath Krishnamachari, Ling Zheng
  • Publication number: 20160070714
    Abstract: A low-overhead merge technique enables restart of a merge operation with minimal logging of state information relating to progress of the merge operation by a volume layer of a storage input/output (I/O) stack executing on one or more nodes of a cluster. The technique enables restart of the merge operation by ensuring that metadata, i.e., metadata pages, generated during the merge operation is not subject to de-duplication by providing a unique value in each metadata page that distinguishes the page, i.e., renders the page distinct or “unique”, from other metadata pages in an extent store. In addition, the technique ensures that a reference count on each metadata page is a value denoting a lack of de-duplication. To that end, the extent store layer is configured to not increment the reference count for a metadata page if, during the merge operation, the page is identical (and thus subject to deduplication) to an existing metadata page in the extent store.
    Type: Application
    Filed: September 10, 2014
    Publication date: March 10, 2016
    Inventors: Janice D'Sa, Anshul Pundir, Srinath Krishnamachari, Ling Zheng
  • Publication number: 20160070644
    Abstract: An offset range striping technique increases concurrency of operation execution directed to metadata managed by a volume layer of a storage input/output (I/O) stack, while reducing contention among resources of one or more nodes of a cluster. A logical unit (LUN) may be apportioned into multiple volumes, each of which may be partitioned into multiple regions, wherein each region is represented by a dense tree. The technique increases concurrency of operation execution (e.g., modifications to the metadata at the offset ranges), while reducing contention among the resources (e.g., CPUs and NVLogs) by distributing the offset range operations among the regions and mapping the regions to services and NVLogs. Such increased concurrency and reduction of contention may be achieved by implementation of the technique to (i) apportion each region into disjoint chunks (i.e.
    Type: Application
    Filed: September 10, 2014
    Publication date: March 10, 2016
    Inventors: Janice D'Sa, Anshul Pundir, Srinath Krishnamachari, Ling Zheng, Jeffrey S. Kimmel
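
The crash-recovery scan described in the transaction log abstracts (Patent number 9952765 and Publication number 20170097873) can be illustrated with a minimal Python sketch. It is not the patented implementation: the `LogEntry` fields, the `make_entry` and `recover` helpers, and the use of CRC-32 from `zlib` are all assumptions made for illustration. The sketch walks entries from oldest to newest by sequence number, discards entries that fail their checksum, tracks the highest valid sequence number, and collects the still-active entries for replay.

```python
import zlib
from dataclasses import dataclass


@dataclass
class LogEntry:
    # All field names are hypothetical; the abstracts do not define a record layout.
    seq: int         # sequence number assigned when the entry was appended
    payload: bytes   # logged metadata
    crc: int         # checksum stored alongside the payload
    completed: bool  # True if the logged operation finished before the crash


def make_entry(seq: int, payload: bytes, completed: bool) -> LogEntry:
    """Build an entry whose stored checksum matches its payload."""
    return LogEntry(seq, payload, zlib.crc32(payload), completed)


def recover(entries):
    """Scan a log after a crash.

    Returns (highest_valid_seq, entries_to_replay). Entries whose stored
    checksum does not match the payload are treated as partially written
    and discarded.
    """
    highest = None
    to_replay = []
    # Walk from oldest to newest using sequence numbers.
    for entry in sorted(entries, key=lambda e: e.seq):
        if zlib.crc32(entry.payload) != entry.crc:
            continue  # checksum failure: discard the partial entry
        highest = entry.seq
        if not entry.completed:
            to_replay.append(entry)  # active entry: needs replay
    return highest, to_replay


# Usage: one completed entry, two active entries, one torn entry.
torn = LogEntry(4, b"torn", zlib.crc32(b"torn") ^ 1, False)
log = [
    make_entry(1, b"a", True),
    make_entry(3, b"c", False),
    make_entry(2, b"b", False),
    torn,
]
highest_seq, replay = recover(log)
```

In this sketch the torn entry is skipped, so the highest valid sequence number is 3 and entries 2 and 3 are queued for replay. A real log would presumably stop at the first gap or checksum failure rather than skipping past it; the abstracts do not say, so the sketch keeps the simpler behavior.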