Patents by Inventor Jaideep Dastidar

Jaideep Dastidar has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Adaptive integrated programmable device platform

Patent number: 11063594

Abstract: An integrated circuit (IC) includes a first interface configured for operation with a plurality of tenants implemented concurrently in the integrated circuit, wherein the plurality of tenants communicate with a host data processing system using the first interface. The IC includes a second interface configured for operation with the plurality of tenants, wherein the plurality of tenants communicate with one or more network nodes via a network using the second interface. The IC can include a programmable logic circuitry configured for operation with the plurality of tenants, wherein the programmable logic circuitry implements one or more hardware accelerated functions for the plurality of tenants and routes data between the first interface and the second interface. The first interface, the second interface, and the programmable logic circuitry are configured to provide isolation among the plurality of tenants.

Type: Grant

Filed: May 11, 2020

Date of Patent: July 13, 2021

Assignee: Xilinx, Inc.

Inventors: Sagheer Ahmad, Jaideep Dastidar, Brian C. Gaide, Juan J. Noguera Serra, Ian A. Swarbrick

Domain aware data migration in coherent heterogenous systems

Patent number: 10970217

Abstract: Embodiments disclosed herein provide a domain aware data migration scheme between processing elements, memory, and various caches in a CC-NUMA system. The scheme creates domain awareness in data migration operations, such as Direct Cache Transfer (DCT) operation, stashing operation, and in the allocation of policies of snoop filters and private, shared, or inline caches. The scheme defines a hardware-software interface to communicate locality information (also referred to herein as affinity information or proximity information) and subsequent hardware behavior for optimal data migration, thus overcoming traditional CC-NUMA limitations.

Type: Grant

Filed: May 24, 2019

Date of Patent: April 6, 2021

Assignee: XILINX, INC.

Inventors: Jaideep Dastidar, Millind Mittal

Systems and methods for providing a back pressure free interconnect

Patent number: 10963172

Abstract: A system and method for efficiently allocating data storage to agents. A computing system includes an interconnect with intermediate buffers for storing transactions and corresponding payload data during transport between sources and destinations. A data storage limit is set on an amount of data storage corresponding to outstanding transactions for each of the multiple sources based on the initial buffer assignments. A number of outstanding transactions for each of the multiple sources is limited based on a corresponding data storage limit. If the rate of allocation of a given buffer assigned to a first source exceeds a threshold, then a second source is selected with available space exceeding a threshold in an assigned buffer. If it is determined the second source is not assigned to a buffer with a rate of allocation exceeding a threshold, then buffer storage is reassigned from the second source to the first source.
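
As a rough illustration of the reassignment rule described in this abstract, the following Python sketch models per-source buffer assignments and moves storage from an idle source to a busy one. It is not taken from the patent: the SourceBuffer class, the rebalance function, the thresholds, and the step size are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class SourceBuffer:
    """Hypothetical per-source view of an intermediate interconnect buffer."""
    name: str
    capacity: int             # entries currently assigned to this source
    in_use: int = 0           # entries held by outstanding transactions
    alloc_rate: float = 0.0   # assumed measure of recent allocation rate

    @property
    def free(self) -> int:
        return self.capacity - self.in_use

    def can_issue(self, size: int = 1) -> bool:
        # Outstanding transactions are limited by the data storage limit.
        return self.in_use + size <= self.capacity


def rebalance(sources, rate_threshold, free_threshold, step=1):
    """Reassign storage to each hot source from a donor that is not hot."""
    moves = []
    for hot in sources:
        if hot.alloc_rate <= rate_threshold:
            continue  # this source is not allocating fast enough to need more
        for donor in sources:
            if donor is hot:
                continue
            # The donor needs spare space and must not itself be hot.
            if donor.free > free_threshold and donor.alloc_rate <= rate_threshold:
                amount = min(step, donor.free)  # never take entries in use
                donor.capacity -= amount
                hot.capacity += amount
                moves.append((donor.name, hot.name, amount))
                break
    return moves


sources = [
    SourceBuffer("cpu", capacity=8, in_use=8, alloc_rate=0.9),
    SourceBuffer("gpu", capacity=8, in_use=2, alloc_rate=0.1),
    SourceBuffer("dma", capacity=8, in_use=7, alloc_rate=0.2),
]
moves = rebalance(sources, rate_threshold=0.5, free_threshold=4, step=2)
print(moves)
```

Because a source only issues a transaction when can_issue is true, it never sends more data than its assigned storage can hold, which is one way to read the "back pressure free" property named in the title.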

Type: Grant

Filed: August 9, 2018

Date of Patent: March 30, 2021

Assignee: Apple Inc.

Inventors: Nachiappan Chidambaram Nachiappan, David L. Trawick, Yiu Chun Tse, Deniz Balkan, Hengsheng Geng, Shawn Munetoshi Fukami, Jaideep Dastidar, Benjamin K. Dodge, Vinodh R. Cuppu

PRODUCER-TO-CONSUMER ACTIVE DIRECT CACHE TRANSFERS

Publication number: 20210064529

Abstract: The embodiments herein create DCT mechanisms that initiate a DCT at the time the updated data is being evicted from the producer cache. These DCT mechanisms are applied when the producer is replacing the updated contents in its cache because the producer has either moved on to working on a different data set (e.g., a different task) or moved on to working on a different function, or when the producer-consumer task manager (e.g., a management unit) enforces software coherency by sending Cache Maintenance Operations (CMO). One advantage of the DCT mechanism is that because the direct cache transfer takes place at the time the updated data is being evicted, by the time the consumer begins its task, the updated contents have already been placed in its own cache or another cache within the cache hierarchy.

Type: Application

Filed: September 4, 2019

Publication date: March 4, 2021

Inventors: Jaideep Dastidar, Millind Mittal

Scratchpad memory management in a computing system

Patent number: 10922226

Abstract: An example computing system includes a memory, a peripheral device configured to send a page request for accessing the memory, the page request indicating whether the page request is for regular memory or scratchpad memory, and a processor having a memory management unit (MMU). The MMU is configured to receive the page request and prevent memory pages from being marked dirty in response to the page request indicating scratchpad memory.
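
A minimal Python sketch of the behavior summarized in this abstract follows. It is a software model only, and the ScratchpadAwareMMU class, its method, and its arguments are hypothetical names introduced for illustration rather than anything defined in the patent.

```python
class ScratchpadAwareMMU:
    """Toy model of an MMU that skips dirty tracking for scratchpad pages."""

    def __init__(self):
        self.mapped = set()  # pages that have been made accessible
        self.dirty = set()   # pages that must be written back later

    def handle_page_request(self, page: int, write: bool, scratchpad: bool) -> None:
        self.mapped.add(page)
        # A write to regular memory marks the page dirty. A request flagged
        # as scratchpad is temporary data, so the dirty mark is suppressed.
        if write and not scratchpad:
            self.dirty.add(page)


mmu = ScratchpadAwareMMU()
mmu.handle_page_request(page=1, write=True, scratchpad=False)
mmu.handle_page_request(page=2, write=True, scratchpad=True)
print(sorted(mmu.mapped), sorted(mmu.dirty))
```

In this model, page 2 is mapped and written but never enters the dirty set, so it would not be written back.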

Type: Grant

Filed: December 3, 2018

Date of Patent: February 16, 2021

Assignee: XILINX, INC.

Inventors: Jaideep Dastidar, Chetan Loke

HYBRID HARDWARE-SOFTWARE COHERENT FRAMEWORK

Publication number: 20200379664

Abstract: Examples herein describe an accelerator device that shares the same coherent domain as hardware elements in a host computing device. The embodiments herein describe a mix of hardware and software coherency which reduces the overhead of managing data when large chunks of data are moved from the host into the accelerator device. In one embodiment, an accelerator application executing on the host identifies a data set it wishes to transfer to the accelerator device to be processed. The accelerator application transfers ownership from a home agent in the host to the accelerator device. A slave agent can then take ownership of the data. As a result, any memory operation requests received from a requesting agent in the accelerator device can gain access to the data set in local memory via the slave agent without the slave agent obtaining permission from the home agent in the host.

Type: Application

Filed: May 29, 2019

Publication date: December 3, 2020

Applicant: Xilinx, Inc.

Inventors: Millind Mittal, Jaideep Dastidar

MACHINE LEARNING MODEL UPDATES TO ML ACCELERATORS

Publication number: 20200341941

Abstract: Examples herein describe a peripheral I/O device with a hybrid gateway that permits the device to have both I/O and coherent domains. As a result, the compute resources in the coherent domain of the peripheral I/O device can communicate with the host in a similar manner as CPU-to-CPU communication in the host. The dual domains in the peripheral I/O device can be leveraged for machine learning (ML) applications. While an I/O device can be used as an ML accelerator, these accelerators previously only used an I/O domain. In the embodiments herein, compute resources can be split between the I/O domain and the coherent domain where an ML engine is in the I/O domain and an ML model is in the coherent domain. An advantage of doing so is that the ML model can be coherently updated using a reference ML model stored in the host.

Type: Application

Filed: April 26, 2019

Publication date: October 29, 2020

Applicant: Xilinx, Inc.

Inventors: Jaideep Dastidar, Millind Mittal

Peripheral I/O device with assignable I/O and coherent domains

Patent number: 10817455

Abstract: Examples herein describe a peripheral I/O device with a hybrid gateway that permits the device to have both I/O and coherent domains. That is, the I/O device can benefit from a traditional I/O model where the I/O device driver manages some of the compute resources in the I/O device as well as the benefits of adding other compute resources in the I/O device to the same coherent domain used by the hardware in the host computing system. As a result, the compute resources in the coherent domain of the peripheral I/O device can communicate with the host in a similar manner as, e.g., CPU-to-CPU communication in the host. At the same time, the compute resources in the I/O domain can benefit from the advantages of the traditional I/O device model which provides efficiencies when doing large memory transfers between the host and the I/O device (e.g., DMA).

Type: Grant

Filed: April 10, 2019

Date of Patent: October 27, 2020

Assignee: XILINX, INC.

Inventors: Jaideep Dastidar, Sagheer Ahmad, Ian A. Swarbrick

Machine learning model updates to ML accelerators

Patent number: 10817462

Abstract: Examples herein describe a peripheral I/O device with a hybrid gateway that permits the device to have both I/O and coherent domains. As a result, the compute resources in the coherent domain of the peripheral I/O device can communicate with the host in a similar manner as CPU-to-CPU communication in the host. The dual domains in the peripheral I/O device can be leveraged for machine learning (ML) applications. While an I/O device can be used as an ML accelerator, these accelerators previously only used an I/O domain. In the embodiments herein, compute resources can be split between the I/O domain and the coherent domain where an ML engine is in the I/O domain and an ML model is in the coherent domain. An advantage of doing so is that the ML model can be coherently updated using a reference ML model stored in the host.

Type: Grant

Filed: April 26, 2019

Date of Patent: October 27, 2020

Assignee: XILINX, INC.

Inventors: Jaideep Dastidar, Millind Mittal

PERIPHERAL I/O DEVICE WITH ASSIGNABLE I/O AND COHERENT DOMAINS

Publication number: 20200327089

Abstract: Examples herein describe a peripheral I/O device with a hybrid gateway that permits the device to have both I/O and coherent domains. That is, the I/O device can benefit from a traditional I/O model where the I/O device driver manages some of the compute resources in the I/O device as well as the benefits of adding other compute resources in the I/O device to the same coherent domain used by the hardware in the host computing system. As a result, the compute resources in the coherent domain of the peripheral I/O device can communicate with the host in a similar manner as, e.g., CPU-to-CPU communication in the host. At the same time, the compute resources in the I/O domain can benefit from the advantages of the traditional I/O device model which provides efficiencies when doing large memory transfers between the host and the I/O device (e.g., DMA).

Type: Application

Filed: April 10, 2019

Publication date: October 15, 2020

Applicant: Xilinx, Inc.

Inventors: Jaideep Dastidar, Sagheer Ahmad, Ian A. Swarbrick

Hybrid precise and imprecise cache snoop filtering

Patent number: 10761985

Abstract: Circuits and methods for combined precise and imprecise snoop filtering. A memory and a plurality of processors are coupled to the interconnect circuitry. A plurality of cache circuits are coupled to the plurality of processor circuits, respectively. A first snoop filter is coupled to the interconnect and is configured to filter snoop requests by individual cache lines of a first subset of addresses of the memory. A second snoop filter is coupled to the interconnect and is configured to filter snoop requests by groups of cache lines of a second subset of addresses of the memory. Each group encompasses a plurality of cache lines.
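
To make the distinction between the two filters concrete, here is a small Python model. It is an illustrative sketch rather than the patented circuit: the HybridSnoopFilter class, the 64-byte line size, the 16-line group size, and the use of a single address boundary to split the two subsets are all assumptions made for the example.

```python
LINE_BYTES = 64        # assumed cache line size
LINES_PER_GROUP = 16   # assumed number of lines tracked together


class HybridSnoopFilter:
    """Tracks sharers per line below `precise_limit`, per group above it."""

    def __init__(self, precise_limit: int):
        self.precise_limit = precise_limit
        self.by_line = {}    # line index  -> set of cache ids (precise)
        self.by_group = {}   # group index -> set of cache ids (imprecise)

    def _key(self, addr: int):
        if addr < self.precise_limit:
            return self.by_line, addr // LINE_BYTES
        return self.by_group, addr // (LINE_BYTES * LINES_PER_GROUP)

    def record(self, addr: int, cache_id: int) -> None:
        table, key = self._key(addr)
        table.setdefault(key, set()).add(cache_id)

    def targets(self, addr: int) -> set:
        """Caches that must be snooped for `addr` (may over-approximate)."""
        table, key = self._key(addr)
        return set(table.get(key, ()))


f = HybridSnoopFilter(precise_limit=0x10000)
f.record(0x0000, cache_id=0)    # tracked as one individual line
f.record(0x10000, cache_id=1)   # tracked as part of a group of lines
print(f.targets(0x0040), f.targets(0x10040))
```

In the precise region, a neighboring line produces no snoop. In the imprecise region, any line in the same group is snooped, which trades some unnecessary snoops for a smaller tracking structure.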

Type: Grant

Filed: August 2, 2018

Date of Patent: September 1, 2020

Assignee: Xilinx, Inc.

Inventors: Millind Mittal, Jaideep Dastidar

Scalable coherence management independent of transport protocol

Patent number: 10698824

Abstract: Disclosed systems and methods include, in each agent, an agent layer, a link layer, and a port layer. The agent layer looks up a port identifier in an address-to-port identifier map in response to a request directed to another agent and submits the request to the port layer. The link layer includes a plurality of links, and each link buffers communications from and to the agent layer. The port layer looks up, in response to the request from the agent layer, a link identifier and chip identifier and writes the request to one of the links identified by the link identifier and associated with the chip identifier. The port layer also reads requests from the links and submits communications to a transport layer circuit based on the requests read from the links and associated chip identifiers.

Type: Grant

Filed: September 25, 2018

Date of Patent: June 30, 2020

Assignee: Xilinx, Inc.

Inventors: Millind Mittal, Jaideep Dastidar

Domain assist processor-peer for coherent acceleration

Patent number: 10698842

Abstract: Examples herein describe a peripheral I/O device with a domain assist processor (DAP) and a domain specific accelerator (DSA) that are in the same coherent domain as CPUs and memory in a host computing system. Peripheral I/O devices were previously unable to participate in a cache-coherent shared-memory multiprocessor paradigm with hardware resources in the host computing system. As a result, domain assist processing for lightweight processor functions (e.g., open source functions such as gzip, open source crypto libraries, open source network switches, etc.) is performed either using CPU resources in the host or by provisioning a special processing system in the peripheral I/O device (e.g., using programmable logic in an FPGA). The embodiments herein use a DAP in the peripheral I/O device to perform the lightweight processor functions that would otherwise be performed by hardware resources in the host or by a special processing system in the peripheral I/O device.

Type: Grant

Filed: April 10, 2019

Date of Patent: June 30, 2020

Assignee: XILINX, INC.

Inventors: Jaideep Dastidar, Sagheer Ahmad

Adaptive integrated programmable device platform

Patent number: 10673439

Abstract: A device can include programmable logic circuitry, a processor system coupled to the programmable logic circuitry, and a network-on-chip. The network-on-chip is coupled to the programmable logic circuitry and the processor system. The network-on-chip is programmable to establish user specified data paths communicatively linking a circuit block implemented in the programmable logic circuitry and the processor system. The programmable logic circuitry, the network-on-chip, and the processor system are configured using a platform management controller.

Type: Grant

Filed: March 27, 2019

Date of Patent: June 2, 2020

Assignee: Xilinx, Inc.

Inventors: Sagheer Ahmad, Jaideep Dastidar, Brian C. Gaide, Juan J. Noguera Serra, Ian A. Swarbrick

Transparent port aggregation in multi-chip transport protocols

Patent number: 10664422

Abstract: Various implementations of a multi-chip system operable according to a predefined transport protocol are disclosed. In one embodiment, a system comprises a first IC comprising a processing element communicatively coupled with first physical ports. The system further comprises a second IC comprising second physical ports communicatively coupled with a first set of the first physical ports via first physical links, and one or more memory devices that are communicatively coupled with the second physical ports and accessible by the processing element via the first physical links. The first IC further comprises a data structure describing a first level of port aggregation to be applied across the first set. The second IC comprises a first distribution function configured to provide ordering to data communicated using the second physical ports. The first distribution function is based on the first level of port aggregation.

Type: Grant

Filed: July 31, 2019

Date of Patent: May 26, 2020

Assignee: XILINX, INC.

Inventors: Millind Mittal, Jaideep Dastidar

Systems and methods for scheduling different types of memory requests with varying data sizes

Patent number: 10649922

Abstract: A system and method for efficiently scheduling requests. In various embodiments, a processor sends commands such as read requests and write requests to an arbiter. The arbiter reduces latencies between commands being sent to a communication fabric and corresponding data being sent to the fabric. When the arbiter selects a given request, the arbiter identifies a first subset of stored requests affected by the given request being selected. The arbiter adjusts one or more attributes of the first subset of requests based on the selection of the given request. In one example, the arbiter replaces a weight attribute with a value, such as a zero value, indicating the first subset of requests should not be selected. Therefore, during the next selection by the arbiter, only the requests in a second subset different from the first subset are candidates for selection.
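
The weight-zeroing step in this abstract can be sketched in Python as below. This is a simplified model under stated assumptions, not the patented arbiter: the Request and Arbiter classes are hypothetical, "affected" is assumed to mean "targets the same bank as the selected request", and the release method that restores weights is an addition the abstract does not describe.

```python
from dataclasses import dataclass


@dataclass
class Request:
    rid: int
    kind: str     # "read" or "write"
    bank: int     # hypothetical shared resource used to define "affected"
    weight: int   # larger weight wins arbitration; 0 means not selectable


class Arbiter:
    def __init__(self, requests):
        self.pending = list(requests)
        # Remember original weights so they can be restored later (assumption).
        self.base = {r.rid: r.weight for r in self.pending}

    def select(self):
        candidates = [r for r in self.pending if r.weight > 0]
        if not candidates:
            return None
        winner = max(candidates, key=lambda r: r.weight)
        self.pending.remove(winner)
        # First subset: stored requests affected by this selection.
        for r in self.pending:
            if r.bank == winner.bank:
                r.weight = 0  # excluded from the next selection
        return winner

    def release(self, bank: int) -> None:
        """Restore the weights of requests that were blocked on `bank`."""
        for r in self.pending:
            if r.bank == bank:
                r.weight = self.base[r.rid]


arb = Arbiter([
    Request(0, "read", bank=0, weight=5),
    Request(1, "write", bank=0, weight=4),
    Request(2, "read", bank=1, weight=3),
])
order = [arb.select().rid, arb.select().rid]
third = arb.select()       # only request 1 remains, and its weight is 0
arb.release(bank=0)
fourth = arb.select().rid
print(order, third, fourth)
```

Request 1 has a higher weight than request 2, yet request 2 is selected second because request 1 was zeroed when request 0 won the same bank.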

Type: Grant

Filed: August 6, 2018

Date of Patent: May 12, 2020

Assignee: Apple Inc.

Inventors: Shawn Munetoshi Fukami, Jaideep Dastidar, Yiu Chun Tse

SYSTEMS AND METHODS FOR ARBITRATING TRAFFIC IN A BUS

Publication number: 20200057737

Abstract: A system and method for efficiently arbitrating traffic on a bus. A computing system includes a fabric for routing traffic among one or more agents and one or more endpoints. The fabric includes multiple arbiters in an arbitration hierarchy. Arbiters store traffic in buffers with each buffer associated with a particular traffic type and a source of the traffic. Arbiters maintain a respective urgency counter for keeping track of a period of time traffic of a particular type is blocked by upstream arbiters. When the block is removed, the traffic of the particular type has priority for selection based on the urgency counter. When arbiters receive feedback from downstream arbiters or sources, the arbiters adjust selection priority accordingly. For example, changes in bandwidth requirement, low latency tolerance and active status cause adjustments in selection priority of stored requests.

Type: Application

Filed: August 20, 2018

Publication date: February 20, 2020

Inventors: Nachiappan Chidambaram Nachiappan, Jaideep Dastidar, Yiu Chun Tse, Ripudaman Singh, Shawn Munetoshi Fukami, Benjamin K. Dodge, Vinodh R. Cuppu

SYSTEMS AND METHODS FOR PROVIDING A BACK PRESSURE FREE INTERCONNECT

Publication number: 20200050379

Abstract: A system and method for efficiently allocating data storage to agents. A computing system includes an interconnect with intermediate buffers for storing transactions and corresponding payload data during transport between sources and destinations. A data storage limit is set on an amount of data storage corresponding to outstanding transactions for each of the multiple sources based on the initial buffer assignments. A number of outstanding transactions for each of the multiple sources is limited based on a corresponding data storage limit. If the rate of allocation of a given buffer assigned to a first source exceeds a threshold, then a second source is selected with available space exceeding a threshold in an assigned buffer. If it is determined the second source is not assigned to a buffer with a rate of allocation exceeding a threshold, then buffer storage is reassigned from the second source to the first source.

Type: Application

Filed: August 9, 2018

Publication date: February 13, 2020

Inventors: Nachiappan Chidambaram Nachiappan, David L. Trawick, Yiu Chun Tse, Deniz Balkan, Hengsheng Geng, Shawn Munetoshi Fukami, Jaideep Dastidar, Benjamin K. Dodge, Vinodh R. Cuppu

HYBRID PRECISE AND IMPRECISE CACHE SNOOP FILTERING

Publication number: 20200042446

Abstract: Circuits and methods for combined precise and imprecise snoop filtering. A memory and a plurality of processors are coupled to the interconnect circuitry. A plurality of cache circuits are coupled to the plurality of processor circuits, respectively. A first snoop filter is coupled to the interconnect and is configured to filter snoop requests by individual cache lines of a first subset of addresses of the memory. A second snoop filter is coupled to the interconnect and is configured to filter snoop requests by groups of cache lines of a second subset of addresses of the memory. Each group encompasses a plurality of cache lines.

Type: Application

Filed: August 2, 2018

Publication date: February 6, 2020

Applicant: Xilinx, Inc.

Inventors: Millind Mittal, Jaideep Dastidar

SYSTEMS AND METHODS FOR OPTIMIZING SCHEDULING DIFFERENT TYPES OF MEMORY REQUESTS WITH VARYING DATA SIZES

Publication number: 20200042469

Abstract: A system and method for efficiently scheduling requests. In various embodiments, a processor sends commands such as read requests and write requests to an arbiter. The arbiter reduces latencies between commands being sent to a communication fabric and corresponding data being sent to the fabric. When the arbiter selects a given request, the arbiter identifies a first subset of stored requests affected by the given request being selected. The arbiter adjusts one or more attributes of the first subset of requests based on the selection of the given request. In one example, the arbiter replaces a weight attribute with a value, such as a zero value, indicating the first subset of requests should not be selected. Therefore, during the next selection by the arbiter, only the requests in a second subset different from the first subset are candidates for selection.

Type: Application

Filed: August 6, 2018

Publication date: February 6, 2020

Inventors: Shawn Munetoshi Fukami, Jaideep Dastidar, Yiu Chun Tse
