Patents by Inventor Jaideep Dastidar

Jaideep Dastidar has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11586578
    Abstract: Examples herein describe a peripheral I/O device with a hybrid gateway that permits the device to have both I/O and coherent domains. As a result, the compute resources in the coherent domain of the peripheral I/O device can communicate with the host in much the same way as CPUs within the host communicate with one another. The dual domains in the peripheral I/O device can be leveraged for machine learning (ML) applications. While an I/O device can be used as an ML accelerator, such accelerators previously used only an I/O domain. In the embodiments herein, compute resources can be split between the two domains, with the ML engine in the I/O domain and the ML model in the coherent domain. An advantage of doing so is that the ML model can be coherently updated from a reference ML model stored in the host. (See the sketch after this entry.)
    Type: Grant
    Filed: October 26, 2020
    Date of Patent: February 21, 2023
    Assignee: XILINX, INC.
    Inventor: Jaideep Dastidar
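
Below is a minimal C sketch of the domain split described in the abstract above: model parameters sit in memory standing in for the coherent domain, updatable in place by the host, while the engine plays the role of the I/O-domain compute. All names (`coherent_model`, `ml_engine_infer`, `host_update_model`) and the version counter are hypothetical illustrations, not the patented design.

```c
/* Illustrative sketch only: splitting compute between an I/O domain
 * (ML engine) and a coherent domain (ML model). */
#include <stdio.h>

#define N_WEIGHTS 4

/* Model parameters live in the coherent domain: the host can update them
 * in place and the device sees the update without an explicit DMA copy. */
typedef struct {
    float weights[N_WEIGHTS];
    unsigned version; /* bumped by the host on each coherent update */
} coherent_model;

/* The engine sits in the I/O domain and only reads the model. */
static float ml_engine_infer(const coherent_model *m, const float *in) {
    float acc = 0.0f;
    for (int i = 0; i < N_WEIGHTS; i++)
        acc += m->weights[i] * in[i];
    return acc;
}

/* Host side: refresh the device's model from the reference copy. */
static void host_update_model(coherent_model *m, const float *ref) {
    for (int i = 0; i < N_WEIGHTS; i++)
        m->weights[i] = ref[i];
    m->version++;
}

int main(void) {
    coherent_model model = { {1, 2, 3, 4}, 0 };
    float input[N_WEIGHTS] = {0.5f, 0.5f, 0.5f, 0.5f};
    float reference[N_WEIGHTS] = {2, 4, 6, 8};

    printf("before update: %.1f\n", ml_engine_infer(&model, input));
    host_update_model(&model, reference);  /* coherent update from host */
    printf("after  update: %.1f (model v%u)\n",
           ml_engine_infer(&model, input), model.version);
    return 0;
}
```
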
  • Patent number: 11586369
    Abstract: Examples herein describe an accelerator device that shares the same coherent domain as hardware elements in a host computing device. The embodiments herein describe a mix of hardware and software coherency, which reduces the overhead of managing data when large chunks of data are moved from the host into the accelerator device. In one embodiment, an accelerator application executing on the host identifies a data set it wishes to transfer to the accelerator device to be processed. The accelerator application transfers ownership from a home agent in the host to the accelerator device. A slave agent can then take ownership of the data. As a result, any memory operation request received from a requesting agent in the accelerator device can be granted access to the data set in local memory by the slave agent, without the slave agent obtaining permission from the home agent in the host. (See the sketch after this entry.)
    Type: Grant
    Filed: May 29, 2019
    Date of Patent: February 21, 2023
    Assignee: XILINX, INC.
    Inventors: Millind Mittal, Jaideep Dastidar
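
A minimal sketch of the ownership hand-off just described, reduced to a two-state owner flag. The enum, struct, and function names are illustrative assumptions, not the patented implementation.

```c
/* Illustrative sketch only: once the host home agent (HA) cedes a data
 * set to the device's slave agent (SA), requests from the device's
 * requesting agent are served locally. */
#include <stdio.h>
#include <stdbool.h>

typedef enum { OWNER_HOST_HA, OWNER_DEVICE_SA } owner_t;

typedef struct {
    owner_t owner;
    int data[8]; /* stands in for the data set in device-local memory */
} data_set;

/* Software coherency step: the accelerator application hands ownership
 * to the device before kicking off the accelerator function. */
static void transfer_ownership_to_device(data_set *ds) {
    ds->owner = OWNER_DEVICE_SA;
}

/* A request from the device's requesting agent. Returns true when the SA
 * can grant access without asking the host HA for permission. */
static bool sa_handle_request(const data_set *ds) {
    return ds->owner == OWNER_DEVICE_SA; /* no host round-trip needed */
}

int main(void) {
    data_set ds = { OWNER_HOST_HA, {0} };
    printf("local access ok? %s\n", sa_handle_request(&ds) ? "yes" : "no");
    transfer_ownership_to_device(&ds);
    printf("local access ok? %s\n", sa_handle_request(&ds) ? "yes" : "no");
    return 0;
}
```
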
  • Patent number: 11563639
    Abstract: In an example, a system specifies a first configuration of the physical transport network that models a plurality of devices as a corresponding first plurality of nodes having a tree topology. Each node of the first plurality of nodes has at least one first device identifier and at least one first connection identifier to other nodes in the tree topology. The system specifies a second configuration of the logical transport network that models the plurality of devices as the first plurality of nodes having a non-tree topology. Each node of the first plurality of nodes has at least one second device identifier, at least one second connection identifier to other nodes in the non-tree topology, the at least one first device identifier, and the at least one first connection identifier of the tree topology. (See the sketch after this entry.)
    Type: Grant
    Filed: July 2, 2018
    Date of Patent: January 24, 2023
    Assignee: XILINX, INC.
    Inventors: Millind Mittal, Jaideep Dastidar
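
A small sketch of the dual node configuration from the abstract: each device keeps its tree-topology (physical transport) identifiers alongside a second set describing a non-tree logical topology. The struct layout and field names are illustrative assumptions.

```c
/* Illustrative sketch only: one node carrying both the tree (physical)
 * and non-tree (logical) configurations described in the abstract. */
#include <stdio.h>

#define MAX_LINKS 4

typedef struct {
    /* First configuration: physical transport, tree topology. */
    int tree_device_id;
    int tree_parent_link;         /* single connection toward the root */
    /* Second configuration: logical transport, non-tree topology, kept
     * in addition to (not instead of) the tree identifiers. */
    int logical_device_id;
    int logical_links[MAX_LINKS]; /* peer connections, may form cycles */
    int n_logical_links;
} node_config;

int main(void) {
    /* Node 2 has one tree uplink but three logical peers. */
    node_config n = { 2, 0, 12, {10, 11, 13}, 3 };
    printf("tree id %d via link %d; logical id %d with %d peer links\n",
           n.tree_device_id, n.tree_parent_link,
           n.logical_device_id, n.n_logical_links);
    return 0;
}
```
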
  • Patent number: 11556344
    Abstract: Embodiments herein describe transferring ownership of data (e.g., cachelines or blocks of data comprising multiple cachelines) from a host to hardware in an I/O device. In one embodiment, the host and I/O device (e.g., an accelerator) are part of a cache-coherent system where ownership of data can be transferred from a home agent (HA) in the host to a local HA in the I/O device—e.g., a computational slave agent (CSA). That way, a function on the I/O device (e.g., an accelerator function) can request data from the local HA without these requests having to be sent to the host HA. Further, the accelerator function can indicate whether the local HA tracks the data on a cacheline basis or by a data block (e.g., multiple cachelines). This provides flexibility that can reduce overhead from tracking the data, depending on the function's desired use of the data. (See the sketch after this entry.)
    Type: Grant
    Filed: September 28, 2020
    Date of Patent: January 17, 2023
    Assignee: XILINX, INC.
    Inventors: Millind Mittal, Jaideep Dastidar
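
A short sketch of the granularity trade-off in the abstract: tracking per cacheline is precise but needs many entries, while tracking per block shrinks the tracking state. The constants and names are illustrative assumptions, not values from the patent.

```c
/* Illustrative sketch only: tracking-entry cost of cacheline-granular
 * versus block-granular ownership tracking at the local HA. */
#include <stdio.h>

#define CACHELINE 64
#define LINES_PER_BLOCK 16

typedef enum { TRACK_CACHELINE, TRACK_BLOCK } track_mode;

/* Number of tracking entries the local HA needs for a region. */
static unsigned tracking_entries(unsigned bytes, track_mode mode) {
    unsigned lines = (bytes + CACHELINE - 1) / CACHELINE;
    if (mode == TRACK_CACHELINE)
        return lines;
    return (lines + LINES_PER_BLOCK - 1) / LINES_PER_BLOCK;
}

int main(void) {
    unsigned region = 1u << 20; /* 1 MiB handed to the local HA */
    printf("per-cacheline: %u entries\n",
           tracking_entries(region, TRACK_CACHELINE));
    printf("per-block:     %u entries\n",
           tracking_entries(region, TRACK_BLOCK));
    return 0;
}
```
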
  • Publication number: 20230004442
    Abstract: The embodiments herein describe a virtualization framework for cache coherent accelerators where the framework incorporates a layered approach for accelerators in their interactions between a cache coherent protocol layer and the functions performed by the accelerator. In one embodiment, the virtualization framework includes a first layer containing the different instances of accelerator functions (AFs), a second layer containing accelerator function engines (AFEs) in each of the AFs, and a third layer containing accelerator function threads (AFTs) in each of the AFEs. Partitioning the hardware circuitry using multiple layers in the virtualization framework allows the accelerator to be quickly re-provisioned in response to requests made by guest operating systems or virtual machines executing in a host. Further, using the layers to partition the hardware permits the host to re-provision sub-portions of the accelerator while the remaining portions of the accelerator continue to operate as normal. (See the sketch after this entry.)
    Type: Application
    Filed: September 6, 2022
    Publication date: January 5, 2023
    Inventors: Millind MITTAL, Jaideep DASTIDAR
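
A compact sketch of the three-layer partitioning from the abstract: AFs contain AFEs, which contain AFTs, and re-provisioning one AFE leaves its siblings running. The structures, sizes, and function names are illustrative assumptions.

```c
/* Illustrative sketch only: AF -> AFE -> AFT layering, with per-AFE
 * re-provisioning that does not disturb the rest of the AF. */
#include <stdio.h>

#define AFTS_PER_AFE 2
#define AFES_PER_AF  2

typedef struct { int busy; } aft;                     /* layer 3 */
typedef struct { int owner_vm; aft t[AFTS_PER_AFE]; } afe; /* layer 2 */
typedef struct { afe e[AFES_PER_AF]; } af;            /* layer 1 */

/* Reassign a single engine to a new guest VM; sibling AFEs keep
 * operating as normal. */
static void reprovision_afe(af *f, int engine, int new_vm) {
    f->e[engine].owner_vm = new_vm;
    for (int i = 0; i < AFTS_PER_AFE; i++)
        f->e[engine].t[i].busy = 0; /* quiesce only that AFE's threads */
}

int main(void) {
    af f = { { { 1, { {1}, {1} } }, { 2, { {1}, {1} } } } };
    reprovision_afe(&f, 0, 3); /* VM 3 takes over engine 0 */
    for (int i = 0; i < AFES_PER_AF; i++)
        printf("AFE %d -> VM %d (thread0 busy=%d)\n",
               i, f.e[i].owner_vm, f.e[i].t[0].busy);
    return 0;
}
```
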
  • Patent number: 11477049
    Abstract: A method and a system for transparently overlaying a logical transport network over an existing physical transport network are disclosed. The system designates a virtual channel located in a first transaction layer of a network conforming to a first network protocol. The system assembles a transaction layer packet in a second logical transaction layer of a second network protocol that is also recognizable by the first transaction layer. The system transfers the transaction layer packet from the second transaction layer to the virtual channel. The system transmits the transaction layer packet over the first transaction layer using the designated virtual channel over the network. (See the sketch after this entry.)
    Type: Grant
    Filed: August 2, 2018
    Date of Patent: October 18, 2022
    Assignee: XILINX, INC.
    Inventors: Millind Mittal, Kiran S. Puranik, Jaideep Dastidar
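
A minimal sketch of the overlay in the abstract: a packet assembled by the second (logical) transaction layer is tagged with the designated virtual channel and carried over the first (physical) transaction layer. The struct layout, channel number, and names are illustrative assumptions, not the patented wire format.

```c
/* Illustrative sketch only: second-layer packets ride a designated
 * virtual channel of the first transaction layer. */
#include <stdio.h>
#include <string.h>

#define OVERLAY_VC 1 /* virtual channel designated for the overlay */

typedef struct {
    int vc;                    /* VC in the first transaction layer */
    unsigned char payload[32]; /* second-layer packet contents */
    size_t len;
} tlp;

/* The second-layer protocol assembles its packet, then hands it to the
 * first layer tagged with the designated VC (len must be <= 32 here). */
static tlp assemble_overlay_tlp(const void *msg, size_t len) {
    tlp p = { OVERLAY_VC, {0}, len };
    memcpy(p.payload, msg, len);
    return p;
}

static void first_layer_transmit(const tlp *p) {
    printf("tx on VC%d: %zu bytes (%s traffic)\n", p->vc, p->len,
           p->vc == OVERLAY_VC ? "overlay" : "native");
}

int main(void) {
    tlp p = assemble_overlay_tlp("logical-protocol-msg", 20);
    first_layer_transmit(&p);
    return 0;
}
```
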
  • Patent number: 11474871
    Abstract: The embodiments herein describe a virtualization framework for cache coherent accelerators where the framework incorporates a layered approach for accelerators in their interactions between a cache coherent protocol layer and the functions performed by the accelerator. In one embodiment, the virtualization framework includes a first layer containing the different instances of accelerator functions (AFs), a second layer containing accelerator function engines (AFEs) in each of the AFs, and a third layer containing accelerator function threads (AFTs) in each of the AFEs. Partitioning the hardware circuitry using multiple layers in the virtualization framework allows the accelerator to be quickly re-provisioned in response to requests made by guest operating systems or virtual machines executing in a host. Further, using the layers to partition the hardware permits the host to re-provision sub-portions of the accelerator while the remaining portions of the accelerator continue to operate as normal.
    Type: Grant
    Filed: September 25, 2019
    Date of Patent: October 18, 2022
    Assignee: XILINX, INC.
    Inventors: Millind Mittal, Jaideep Dastidar
  • Publication number: 20220292024
    Abstract: The embodiments herein describe a multi-tenant cache that implements fine-grained allocation of the entries within the cache. Each entry in the cache can be allocated to a particular tenant—i.e., fine-grained allocation—rather than having to assign all the entries in a cache way to a particular tenant. If the tenant does not currently need those entries (which can be tracked using counters), the entries can be invalidated (i.e., deallocated) and assigned to another tenant. Thus, fine-grained allocation provides a flexible allocation of entries in a hardware cache that permits an administrator to reserve any number of entries for a particular tenant while also permitting other tenants to use those entries when the reserving tenant does not currently need them. (See the sketch after this entry.)
    Type: Application
    Filed: May 26, 2022
    Publication date: September 15, 2022
    Inventors: Millind MITTAL, Jaideep DASTIDAR
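
A reduced sketch of fine-grained multi-tenant allocation from the abstract: every entry carries a tenant tag, per-tenant counters track usage, and entries an over-borrowing tenant holds beyond its reservation can be invalidated and lent out. The tiny cache, reservation table, and steal policy below are illustrative assumptions.

```c
/* Illustrative sketch only: per-entry tenant tags plus counters that
 * let reserved-but-idle entries be reassigned. */
#include <stdio.h>

#define ENTRIES 8
#define TENANTS 2

static int owner[ENTRIES];            /* -1 = free, else tenant id */
static int in_use[TENANTS];           /* per-tenant usage counters */
static int reserved[TENANTS] = {6, 2};

/* Allocate one entry for tenant t: take a free entry, or invalidate an
 * entry held by a tenant that has borrowed beyond its reservation. */
static int alloc_entry(int t) {
    for (int i = 0; i < ENTRIES; i++)
        if (owner[i] < 0) { owner[i] = t; in_use[t]++; return i; }
    for (int i = 0; i < ENTRIES; i++) {
        int o = owner[i];
        if (o != t && in_use[o] > reserved[o]) { /* o borrowed extra */
            in_use[o]--; owner[i] = t; in_use[t]++;
            return i;                            /* invalidate + reuse */
        }
    }
    return -1; /* all entries pinned by their reservations */
}

int main(void) {
    for (int i = 0; i < ENTRIES; i++) owner[i] = -1;
    for (int i = 0; i < ENTRIES; i++) alloc_entry(1); /* t1 over-borrows */
    int e = alloc_entry(0);                           /* t0 reclaims */
    printf("tenant 0 got entry %d; usage: t0=%d t1=%d\n",
           e, in_use[0], in_use[1]);
    return 0;
}
```
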
  • Publication number: 20220269638
    Abstract: The embodiments herein describe a 3D SmartNIC that spatially distributes compute, storage, or network functions in three dimensions using a plurality of layers. That is, unlike current SmartNICs, which perform acceleration functions in two dimensions, a 3D SmartNIC can distribute these functions across multiple stacked layers, where each layer can communicate directly or indirectly with the other layers. (See the sketch after this entry.)
    Type: Application
    Filed: February 24, 2021
    Publication date: August 25, 2022
    Inventor: Jaideep DASTIDAR
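
A tiny sketch of the spatial distribution in the abstract: functions are assigned to stacked layers rather than one 2D die. The layer table and role names are illustrative assumptions.

```c
/* Illustrative sketch only: compute, storage, and network functions
 * placed on different layers of a 3D stack. */
#include <stdio.h>

typedef enum { FN_COMPUTE, FN_STORAGE, FN_NETWORK } fn_t;

typedef struct {
    int z;     /* position in the stack */
    fn_t role; /* function spatially assigned to this layer */
} nic_layer;

int main(void) {
    nic_layer stack[3] = {
        {0, FN_NETWORK}, {1, FN_COMPUTE}, {2, FN_STORAGE}
    };
    const char *names[] = {"compute", "storage", "network"};
    /* Any layer may talk to any other, directly or via the stack. */
    for (int i = 0; i < 3; i++)
        printf("layer z=%d hosts %s\n", stack[i].z, names[stack[i].role]);
    return 0;
}
```
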
  • Patent number: 11386031
    Abstract: Embodiments herein describe techniques for separating data transmitted between I/O functions in an integrated component and a host into separate data paths. In one embodiment, data packets are transmitted using a direct data path that bypasses a switch in the integrated component. In contrast, configuration packets (e.g., hot-swap, hot-add, hot-remove data, some types of descriptors, etc.) are transmitted to the switch which then forwards the configuration packets to their destination. The direct path for the data packets does not rely on switch connectivity (and its accompanying latency) to transport bandwidth sensitive traffic between the host and the I/O functions, and instead avoids (e.g., bypasses) the bandwidth, resource, store/forward, and latency properties of the switch. Meanwhile, the software compatibility attributes, such as hot plug attributes (which are not latency or bandwidth sensitive), continue to be supported by using the switch to provide a configuration data path. (See the sketch after this entry.)
    Type: Grant
    Filed: June 5, 2020
    Date of Patent: July 12, 2022
    Assignee: XILINX, INC.
    Inventors: Millind Mittal, Jaideep Dastidar
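
A minimal sketch of the path split in the abstract: data packets take a direct path that bypasses the switch, while configuration packets go through the switch to preserve hot-plug semantics. The packet classification and routing below are illustrative assumptions.

```c
/* Illustrative sketch only: routing data traffic around the switch and
 * configuration traffic through it. */
#include <stdio.h>

typedef enum { PKT_DATA, PKT_CONFIG } pkt_kind;

typedef struct { pkt_kind kind; int dest_fn; } packet;

static void route(const packet *p) {
    if (p->kind == PKT_DATA) {
        /* Direct path: no switch store/forward, no added latency. */
        printf("data -> I/O function %d via direct path\n", p->dest_fn);
    } else {
        /* Switch path: keeps hot-plug/compatibility semantics intact. */
        printf("config -> switch -> I/O function %d\n", p->dest_fn);
    }
}

int main(void) {
    packet d = {PKT_DATA, 2}, c = {PKT_CONFIG, 2};
    route(&d);
    route(&c);
    return 0;
}
```
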
  • Patent number: 11372769
    Abstract: The embodiments herein describe a multi-tenant cache that implements fine-grained allocation of the entries within the cache. Each entry in the cache can be allocated to a particular tenant—i.e., fine-grained allocation—rather than having to assign all the entries in a cache way to a particular tenant. If the tenant does not currently need those entries (which can be tracked using counters), the entries can be invalidated (i.e., deallocated) and assigned to another tenant. Thus, fine-grained allocation provides a flexible allocation of entries in a hardware cache that permits an administrator to reserve any number of entries for a particular tenant while also permitting other tenants to use those entries when the reserving tenant does not currently need them.
    Type: Grant
    Filed: August 29, 2019
    Date of Patent: June 28, 2022
    Assignee: XILINX, INC.
    Inventors: Millind Mittal, Jaideep Dastidar
  • Patent number: 11375050
    Abstract: Embodiments herein describe a layer converter that includes a proxy legacy interface that permits the layers for a legacy interconnect protocol to be recycled without any modifications, thus achieving legacy functionality alongside the new protocol's layer implementation. Put differently, the layer converter allows the legacy interconnect protocol's layers to be reused so that their data can be transmitted on a link shared with data transmitted using a new interconnect protocol. (See the sketch after this entry.)
    Type: Grant
    Filed: September 11, 2020
    Date of Patent: June 28, 2022
    Assignee: XILINX, INC.
    Inventors: Millind Mittal, Jaideep Dastidar, Kiran Puranik
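
A minimal sketch of the converter idea in the abstract: unmodified legacy layers produce their output as before, and a proxy interface only adds the framing the shared link needs so legacy and new-protocol traffic coexist. All names and the framing are illustrative assumptions.

```c
/* Illustrative sketch only: a proxy legacy interface multiplexing
 * recycled legacy-layer output onto a link shared with a new protocol. */
#include <stdio.h>

typedef enum { PROTO_LEGACY, PROTO_NEW } proto_t;

typedef struct { proto_t proto; const char *flit; } link_frame;

/* Proxy legacy interface: accepts legacy-layer output untouched and
 * only adds the tag the shared link needs. */
static link_frame proxy_legacy(const char *legacy_flit) {
    link_frame f = { PROTO_LEGACY, legacy_flit };
    return f;
}

static void shared_link_tx(link_frame f) {
    printf("[%s] %s\n",
           f.proto == PROTO_LEGACY ? "legacy" : "new", f.flit);
}

int main(void) {
    shared_link_tx(proxy_legacy("legacy-layer flit"));      /* recycled */
    shared_link_tx((link_frame){ PROTO_NEW, "new-proto flit" });
    return 0;
}
```
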
  • Publication number: 20220100523
    Abstract: Embodiments herein describe transferring ownership of data (e.g., cachelines or blocks of data comprising multiple cachelines) from a host to hardware in an I/O device. In one embodiment, the host and I/O device (e.g., an accelerator) are part of a cache-coherent system where ownership of data can be transferred from a home agent (HA) in the host to a local HA in the I/O device—e.g., a computational slave agent (CSA). That way, a function on the I/O device (e.g., an accelerator function) can request data from the local HA without these requests having to be sent to the host HA. Further, the accelerator function can indicate whether the local HA tracks the data on a cacheline basis or by a data block (e.g., multiple cachelines). This provides flexibility that can reduce overhead from tracking the data, depending on the function's desired use of the data.
    Type: Application
    Filed: September 28, 2020
    Publication date: March 31, 2022
    Inventors: Millind MITTAL, Jaideep DASTIDAR
  • Patent number: 11271860
    Abstract: An example cache-coherent packetized network system includes: a home agent; a snooped agent; and a request agent configured to send, to the home agent, a request message for a first address, the request message having a first transaction identifier of the request agent; where the home agent is configured to send, to the snooped agent, a snoop request message for the first address, the snoop request message having a second transaction identifier of the home agent; and where the snooped agent is configured to send a data message to the request agent, the data message including a first compressed tag generated using a function based on the first address. (See the sketch after this entry.)
    Type: Grant
    Filed: November 15, 2019
    Date of Patent: March 8, 2022
    Assignee: XILINX, INC.
    Inventors: Millind Mittal, Jaideep Dastidar
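
A short sketch of the compressed tag from the abstract: both agents derive a small tag from the address with the same function, so the data message can be matched to the outstanding request without carrying the full address. The specific hash is an illustrative stand-in, not the patented function.

```c
/* Illustrative sketch only: a compressed tag computed as a function of
 * the address, matched at the request agent. */
#include <stdio.h>
#include <stdint.h>

/* Both agents apply the same function to the address. */
static uint16_t compress_tag(uint64_t addr) {
    return (uint16_t)((addr ^ (addr >> 16) ^ (addr >> 32)) & 0xFFFF);
}

int main(void) {
    uint64_t addr = 0x0000123456789000ULL;
    uint16_t req_tag  = compress_tag(addr); /* request agent, at request */
    uint16_t data_tag = compress_tag(addr); /* snooped agent, with data  */
    printf("tags match: %s (0x%04x)\n",
           req_tag == data_tag ? "yes" : "no", data_tag);
    return 0;
}
```
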
  • Publication number: 20210382838
    Abstract: Embodiments herein describe techniques for separating data transmitted between I/O functions in an integrated component and a host into separate data paths. In one embodiment, data packets are transmitted using a direct data path that bypasses a switch in the integrated component. In contrast, configuration packets (e.g., hot-swap, hot-add, hot-remove data, some types of descriptors, etc.) are transmitted to the switch which then forwards the configuration packets to their destination. The direct path for the data packets does not rely on switch connectivity (and its accompanying latency) to transport bandwidth sensitive traffic between the host and the I/O functions, and instead avoids (e.g., bypasses) the bandwidth, resource, store/forward, and latency properties of the switch. Meanwhile, the software compatibility attributes, such as hot plug attributes (which are not latency or bandwidth sensitive), continue to be supported by using the switch to provide a configuration data path.
    Type: Application
    Filed: June 5, 2020
    Publication date: December 9, 2021
    Inventors: Millind MITTAL, Jaideep DASTIDAR
  • Publication number: 20210342282
    Abstract: A system and method for efficiently arbitrating traffic on a bus. A computing system includes a fabric for routing traffic among one or more agents and one or more endpoints. The fabric includes multiple arbiters in an arbitration hierarchy. Arbiters store traffic in buffers, with each buffer associated with a particular traffic type and a source of the traffic. Each arbiter maintains a respective urgency counter that tracks how long traffic of a particular type has been blocked by upstream arbiters. When the block is removed, the traffic of that type has priority for selection based on the urgency counter. When arbiters receive feedback from downstream arbiters or sources, the arbiters adjust selection priority accordingly. For example, changes in bandwidth requirements, latency tolerance, and active status cause adjustments in the selection priority of stored requests. (See the sketch after this entry.)
    Type: Application
    Filed: July 14, 2021
    Publication date: November 4, 2021
    Inventors: Nachiappan Chidambaram Nachiappan, Jaideep Dastidar, Yiu Chun Tse, Ripudaman Singh, Shawn Munetoshi Fukami, Benjamin K. Dodge, Vinodh R. Cuppu
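
A reduced sketch of the urgency-based arbitration in the abstract: each buffered traffic class accrues urgency while blocked upstream, and once unblocked, the class with the highest urgency wins. The two-class arbiter below is an illustrative simplification, not Apple's implementation.

```c
/* Illustrative sketch only: urgency counters grow while a traffic type
 * is blocked; arbitration favors the most-starved unblocked class. */
#include <stdio.h>
#include <stdbool.h>

typedef struct {
    const char *name;
    bool blocked; /* upstream arbiter currently refuses this type */
    bool pending; /* traffic waiting in this buffer */
    int  urgency; /* cycles spent blocked */
} traffic_class;

static void tick(traffic_class *c) {
    if (c->pending && c->blocked)
        c->urgency++; /* track how long this type has been starved */
}

static traffic_class *arbitrate(traffic_class *a, traffic_class *b) {
    if (a->blocked || !a->pending) return b;
    if (b->blocked || !b->pending) return a;
    return a->urgency >= b->urgency ? a : b;
}

int main(void) {
    traffic_class bulk = {"bulk", false, true, 0};
    traffic_class rt   = {"real-time", true, true, 0};
    for (int cycle = 0; cycle < 3; cycle++) { tick(&bulk); tick(&rt); }
    rt.blocked = false; /* upstream block removed */
    printf("winner: %s (urgency %d vs %d)\n",
           arbitrate(&bulk, &rt)->name, bulk.urgency, rt.urgency);
    return 0;
}
```
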
  • Patent number: 11113194
    Abstract: The embodiments herein create direct cache transfer (DCT) mechanisms that initiate a DCT at the time the updated data is being evicted from the producer cache. These DCT mechanisms are applied when the producer is replacing the updated contents in its cache because the producer has either moved on to working on a different data set (e.g., a different task) or moved on to working on a different function, or when the producer-consumer task manager (e.g., a management unit) enforces software coherency by sending Cache Maintenance Operations (CMOs). One advantage of the DCT mechanism is that, because the direct cache transfer takes place at the time the updated data is being evicted, by the time the consumer begins its task the updated contents have already been placed in its own cache or another cache within the cache hierarchy. (See the sketch after this entry.)
    Type: Grant
    Filed: September 4, 2019
    Date of Patent: September 7, 2021
    Assignee: XILINX, INC.
    Inventors: Jaideep Dastidar, Millind Mittal
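
A minimal sketch of the eviction-triggered DCT in the abstract: when the producer evicts updated data, it is pushed straight into the consumer's cache rather than written back to memory, so the consumer starts warm. The single-line "caches" and names are illustrative assumptions.

```c
/* Illustrative sketch only: eviction from the producer triggers a
 * direct cache-to-cache transfer to the consumer. */
#include <stdio.h>

typedef struct { int valid; int data; } cacheline;

typedef struct { cacheline line; const char *name; } cache;

/* Evict from the producer and transfer directly to the consumer's
 * cache, rather than down to memory. */
static void evict_with_dct(cache *producer, cache *consumer) {
    if (!producer->line.valid) return;
    consumer->line = producer->line;   /* direct cache-to-cache push */
    producer->line.valid = 0;
    printf("DCT: %s -> %s (data=%d)\n",
           producer->name, consumer->name, consumer->line.data);
}

int main(void) {
    cache prod = { {1, 42}, "producer" };
    cache cons = { {0, 0},  "consumer" };
    evict_with_dct(&prod, &cons); /* e.g., producer moves to a new task */
    printf("consumer hit? %s\n", cons.line.valid ? "yes" : "no");
    return 0;
}
```
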
  • Patent number: 11093394
    Abstract: An example Cache-Coherent Non-Uniform Memory Access (CC-NUMA) system includes: one or more fabric switches; a home agent coupled to the one or more fabric switches; first and second response agents coupled to the fabric switches; wherein the home agent is configured to send a delegated snoop message to the first response agent, the delegated snoop message instructing the first response agent to snoop the second response agent; wherein the first response agent is configured to snoop the second response agent in response to the delegated snoop message; and wherein the first and second response agents are configured to perform a cache-to-cache transfer during the snoop. (See the sketch after this entry.)
    Type: Grant
    Filed: September 4, 2019
    Date of Patent: August 17, 2021
    Assignee: XILINX, INC.
    Inventors: Millind Mittal, Jaideep Dastidar
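
A small sketch of the delegated snoop from the abstract: the home agent tells response agent RA1 to snoop RA2 itself, and the data moves RA2 to RA1 cache-to-cache without bouncing through the home agent. The message flow and names are illustrative assumptions.

```c
/* Illustrative sketch only: HA delegates the snoop to RA1, which pulls
 * the line straight from RA2's cache. */
#include <stdio.h>

typedef struct { int has_line; int data; const char *name; } agent;

/* RA1 receives the delegated snoop and performs it against RA2. */
static void delegated_snoop(agent *ra1, agent *ra2) {
    printf("HA -> %s: delegated snoop of %s\n", ra1->name, ra2->name);
    if (ra2->has_line) {
        ra1->data = ra2->data; /* cache-to-cache transfer */
        ra1->has_line = 1;
        ra2->has_line = 0;
        printf("%s -> %s: data %d transferred\n",
               ra2->name, ra1->name, ra1->data);
    }
}

int main(void) {
    agent ra1 = {0, 0, "RA1"}, ra2 = {1, 7, "RA2"};
    delegated_snoop(&ra1, &ra2);
    return 0;
}
```
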
  • Patent number: 11093425
    Abstract: A system and method for efficiently arbitrating traffic on a bus. A computing system includes a fabric for routing traffic among one or more agents and one or more endpoints. The fabric includes multiple arbiters in an arbitration hierarchy. Arbiters store traffic in buffers, with each buffer associated with a particular traffic type and a source of the traffic. Each arbiter maintains a respective urgency counter that tracks how long traffic of a particular type has been blocked by upstream arbiters. When the block is removed, the traffic of that type has priority for selection based on the urgency counter. When arbiters receive feedback from downstream arbiters or sources, the arbiters adjust selection priority accordingly. For example, changes in bandwidth requirements, latency tolerance, and active status cause adjustments in the selection priority of stored requests.
    Type: Grant
    Filed: August 20, 2018
    Date of Patent: August 17, 2021
    Assignee: Apple Inc.
    Inventors: Nachiappan Chidambaram Nachiappan, Jaideep Dastidar, Yiu Chun Tse, Ripudaman Singh, Shawn Munetoshi Fukami, Benjamin K. Dodge, Vinodh R. Cuppu
  • Patent number: 11074208
    Abstract: An adaptive memory expansion scheme is proposed, where one or more memory-expansion-capable Hosts or Accelerators can have their memory mapped to one or more memory expansion devices. The embodiments below describe discovery, configuration, and mapping schemes that allow independent SCM implementations and CPU-Host implementations to match their memory expansion capabilities. As a result, a memory expansion host (e.g., a memory controller in a CPU or an Accelerator) can declare multiple logical memory expansion pools, each with a unique capacity. These logical memory pools can be matched to physical memory in the SCM cards using windows in a global address map. These windows represent shared memory for the Home Agents (HAs) (e.g., the Host) and the Slave Agents (SAs) (e.g., the memory expansion device). (See the sketch after this entry.)
    Type: Grant
    Filed: August 29, 2019
    Date of Patent: July 27, 2021
    Assignee: XILINX, INC.
    Inventors: Jaideep Dastidar, Millind Mittal
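
A small sketch of the window matching from the abstract: the host declares logical memory-expansion pools with distinct capacities, and each pool is mapped to a window of the global address map backed by an SCM device. The sizes, base addresses, and table layout are illustrative assumptions.

```c
/* Illustrative sketch only: logical expansion pools matched to SCM
 * cards via windows in a global address map. */
#include <stdio.h>
#include <stdint.h>

typedef struct {
    uint64_t base;  /* window base in the global address map */
    uint64_t size;  /* capacity of this logical pool */
    int scm_device; /* memory expansion device backing the window */
} expansion_window;

int main(void) {
    /* Two logical pools, each with a unique capacity, matched to two
     * SCM cards during discovery/configuration. */
    expansion_window map[2] = {
        { 0x100000000ULL,  4ULL << 30, 0 }, /*  4 GiB on SCM card 0 */
        { 0x200000000ULL, 16ULL << 30, 1 }, /* 16 GiB on SCM card 1 */
    };
    for (int i = 0; i < 2; i++)
        printf("window %d: base=0x%llx size=%llu GiB -> SCM %d\n", i,
               (unsigned long long)map[i].base,
               (unsigned long long)(map[i].size >> 30),
               map[i].scm_device);
    return 0;
}
```
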