Patents by Inventor Jaideep Dastidar

Jaideep Dastidar has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11063594
    Abstract: An integrated circuit (IC) includes a first interface configured for operation with a plurality of tenants implemented concurrently in the integrated circuit, wherein the plurality of tenants communicate with a host data processing system using the first interface. The IC includes a second interface configured for operation with the plurality of tenants, wherein the plurality of tenants communicate with one or more network nodes via a network using the second interface. The IC can include a programmable logic circuitry configured for operation with the plurality of tenants, wherein the programmable logic circuitry implements one or more hardware accelerated functions for the plurality of tenants and routes data between the first interface and the second interface. The first interface, the second interface, and the programmable logic circuitry are configured to provide isolation among the plurality of tenants.
    Type: Grant
    Filed: May 11, 2020
    Date of Patent: July 13, 2021
    Assignee: Xilinx, Inc.
    Inventors: Sagheer Ahmad, Jaideep Dastidar, Brian C. Gaide, Juan J. Noguera Serra, Ian A. Swarbrick
  • Patent number: 10970217
    Abstract: Embodiments disclosed herein provide a domain aware data migration scheme between processing elements, memory, and various caches in a CC-NUMA system. The scheme creates domain awareness in data migration operations, such as Direct Cache Transfer (DCT) operation, stashing operation, and in the allocation of policies of snoop filters and private, shared, or inline caches. The scheme defines a hardware-software interface to communicate locality information (also referred herein as affinity information or proximity information) and subsequent hardware behavior for optimal data migration, thus overcoming traditional CC-NUMA limitations.
    Type: Grant
    Filed: May 24, 2019
    Date of Patent: April 6, 2021
    Assignee: XILINX, INC.
    Inventors: Jaideep Dastidar, Millind Mittal
  • Patent number: 10963172
    Abstract: A system and method for efficiently allocating data storage to agents. A computing system includes an interconnect with intermediate buffers for storing transactions and corresponding payload data during transport between sources and destinations. A data storage limit is set on an amount of data storage corresponding to outstanding transactions for each of the multiple sources based on the initial buffer assignments. A number of outstanding transactions for each of the multiple sources is limited based on a corresponding data storage limit. If the rate of allocation of a given buffer assigned to a first source exceeds a threshold, then a second source is selected with available space exceeding a threshold in an assigned buffer. If it is determined the second source is not assigned to a buffer with a rate of allocation exceeding a threshold, then buffer storage is reassigned from the second source to the first source.
    Type: Grant
    Filed: August 9, 2018
    Date of Patent: March 30, 2021
    Assignee: Apple Inc.
    Inventors: Nachiappan Chidambaram Nachiappan, David L. Trawick, Yiu Chun Tse, Deniz Balkan, Hengsheng Geng, Shawn Munetoshi Fukami, Jaideep Dastidar, Benjamin K. Dodge, Vinodh R. Cuppu
  • Publication number: 20210064529
    Abstract: The embodiments herein creates DCT mechanisms that initiate a DCT at the time the updated data is being evicted from the producer cache. These DCT mechanisms are applied when the producer is replacing the updated contents in its cache because the producer has either moved on to working on a different data set (e.g., a different task) or moved on to working on a different function, or when the producer-consumer task manager (e.g., a management unit) enforces software coherency by sending Cache Maintenance Operations (CMO). One advantage of the DCT mechanism is that because the direct cache transfer takes place at the time the updated data is being evicted, by the time the consumer begins its task, the updated contents have already been placed in its own cache or another cache within the cache hierarchy.
    Type: Application
    Filed: September 4, 2019
    Publication date: March 4, 2021
    Inventors: Jaideep DASTIDAR, Millind MITTAL
  • Patent number: 10922226
    Abstract: An example computing system includes a memory, a peripheral device configured to send a page request for accessing the memory, the page request indicating whether the page request is for regular memory or scratchpad memory, and a processor having a memory management unit (MMU). The MMU is configured to receive the page request and prevent memory pages from being marked dirty in response to the page request indicating scratchpad memory.
    Type: Grant
    Filed: December 3, 2018
    Date of Patent: February 16, 2021
    Assignee: XILINX, INC.
    Inventors: Jaideep Dastidar, Chetan Loke
  • Publication number: 20200379664
    Abstract: Examples herein describe an accelerator device that shares the same coherent domain as hardware elements in a host computing device. The embodiments herein describe a mix of hardware and software coherency which reduces the overhead of managing data when large chunks of data are moved from the host into the accelerator device. In one embodiment, an accelerator application executing on the host identifies a data set it wishes to transfer to the accelerator device to be processed. The accelerator application transfers ownership from a home agent in the host to the accelerator device. A slave agent can then take ownership of the data. As a result, any memory operation requests received from a requesting agent in the accelerator device can gain access to the data set in local memory via the slave agent without the slave agent obtaining permission from the home agent in the host.
    Type: Application
    Filed: May 29, 2019
    Publication date: December 3, 2020
    Applicant: Xilinx, Inc.
    Inventors: Millind Mittal, Jaideep Dastidar
  • Publication number: 20200341941
    Abstract: Examples herein describe a peripheral I/O device with a hybrid gateway that permits the device to have both I/O and coherent domains. As a result, the compute resources in the coherent domain of the peripheral I/O device can communicate with the host in a similar manner as CPU-to-CPU communication in the host. The dual domains in the peripheral I/O device can be leveraged for machine learning (ML) applications. While an I/O device can be used as an ML accelerator, these accelerators previously only used an I/O domain. In the embodiments herein, compute resources can be split between the I/O domain and the coherent domain where a ML engine is in the I/O domain and a ML model is in the coherent domain. An advantage of doing so is that the ML model can be coherently updated using a reference ML model stored in the host.
    Type: Application
    Filed: April 26, 2019
    Publication date: October 29, 2020
    Applicant: Xilinx, Inc.
    Inventors: Jaideep Dastidar, Millind Mittal
  • Patent number: 10817455
    Abstract: Examples herein describe a peripheral I/O device with a hybrid gateway that permits the device to have both I/O and coherent domains. That is, the I/O device can benefit from a traditional I/O model where the I/O device driver manages some of the compute resources in the I/O device as well as the benefits of adding other compute resources in the I/O device to the same coherent domain used by the hardware in the host computing system. As result, the compute resources in the coherent domain of the peripheral I/O device can communicate with the host in a similar manner as, e.g., CPU-to-CPU communication in the host. At the same time, the compute resources in the I/O domain can benefit from the advantages of the traditional I/O device model which provides efficiencies when doing large memory transfers between the host and the I/O device (e.g., DMA).
    Type: Grant
    Filed: April 10, 2019
    Date of Patent: October 27, 2020
    Assignee: XILINX, INC.
    Inventors: Jaideep Dastidar, Sagheer Ahmad, Ian A. Swarbrick
  • Patent number: 10817462
    Abstract: Examples herein describe a peripheral I/O device with a hybrid gateway that permits the device to have both I/O and coherent domains. As a result, the compute resources in the coherent domain of the peripheral I/O device can communicate with the host in a similar manner as CPU-to-CPU communication in the host. The dual domains in the peripheral I/O device can be leveraged for machine learning (ML) applications. While an I/O device can be used as an ML accelerator, these accelerators previously only used an I/O domain. In the embodiments herein, compute resources can be split between the I/O domain and the coherent domain where a ML engine is in the I/O domain and a ML model is in the coherent domain. An advantage of doing so is that the ML model can be coherently updated using a reference ML model stored in the host.
    Type: Grant
    Filed: April 26, 2019
    Date of Patent: October 27, 2020
    Assignee: XILINX, INC.
    Inventors: Jaideep Dastidar, Millind Mittal
  • Publication number: 20200327089
    Abstract: Examples herein describe a peripheral I/O device with a hybrid gateway that permits the device to have both I/O and coherent domains. That is, the I/O device can benefit from a traditional I/O model where the I/O device driver manages some of the compute resources in the I/O device as well as the benefits of adding other compute resources in the I/O device to the same coherent domain used by the hardware in the host computing system. As result, the compute resources in the coherent domain of the peripheral I/O device can communicate with the host in a similar manner as, e.g., CPU-to-CPU communication in the host. At the same time, the compute resources in the I/O domain can benefit from the advantages of the traditional I/O device model which provides efficiencies when doing large memory transfers between the host and the I/O device (e.g., DMA).
    Type: Application
    Filed: April 10, 2019
    Publication date: October 15, 2020
    Applicant: Xilinx, Inc.
    Inventors: Jaideep Dastidar, Sagheer Ahmad, Ian A. Swarbrick
  • Patent number: 10761985
    Abstract: Circuits and methods for combined precise and imprecise snoop filtering. A memory and a plurality of processors are coupled to the interconnect circuitry. A plurality of cache circuits are coupled to the plurality of processor circuits, respectively. A first snoop filter is coupled to the interconnect and is configured to filter snoop requests by individual cache lines of a first subset of addresses of the memory. A second snoop filter is coupled to the interconnect and is configured to filter snoop requests by groups of cache lines of a second subset of addresses of the memory. Each group encompasses a plurality of cache lines.
    Type: Grant
    Filed: August 2, 2018
    Date of Patent: September 1, 2020
    Assignee: Xilinx, Inc.
    Inventors: Millind Mittal, Jaideep Dastidar
  • Patent number: 10698824
    Abstract: Disclosed systems and methods include in each agent, an agent layer, a link layer, and a port layer. The agent layer looks-up a port identifier in an address-to-port identifier map in response to a request directed to another agent and submits the request to the port layer. The link layer includes a plurality of links, and each link buffers communications from and to the agent layer. The port layer looks-up, in response to the request from the agent layer, a link identifier and chip identifier and writes the request to one of the links identified by the link identifier and associated with the chip identifier. The port layer also reads requests from the links and submits communications to a transport layer circuit based on the requests read from the links and associated chip identifiers.
    Type: Grant
    Filed: September 25, 2018
    Date of Patent: June 30, 2020
    Assignee: Xilinx, Inc.
    Inventors: Millind Mittal, Jaideep Dastidar
  • Patent number: 10698842
    Abstract: Examples herein describe a peripheral I/O device with a domain assist processor (DAP) and a domain specific accelerator (DSA) that are in the same coherent domain as CPUs and memory in a host computing system. Peripheral I/O devices were previously unable to participate in a cache-coherent shared-memory multiprocessor paradigm with hardware resources in the host computing system. As a result, domain assist processing for lightweight processor functions (e.g., open source functions such as gzip, open source crypto libraries, open source network switches, etc.) either are performed using CPUs resources in the host or by provisioning a special processing system in the peripheral I/O device (e.g., using programmable logic in a FPGA). The embodiments herein use a DAP in the peripheral I/O device to perform the lightweight processor functions that would otherwise be performed by hardware resources in the host or by a special processing system in the peripheral I/O device.
    Type: Grant
    Filed: April 10, 2019
    Date of Patent: June 30, 2020
    Assignee: XILINX, INC.
    Inventors: Jaideep Dastidar, Sagheer Ahmad
  • Patent number: 10673439
    Abstract: A device can include programmable logic circuitry, a processor system coupled to the programmable logic circuitry, and a network-on-chip. The network-on-chip is coupled to the programmable logic circuitry and the processor system. The network-on-chip is programmable to establish user specified data paths communicatively linking a circuit block implemented in the programmable logic circuitry and the processor system. The programmable logic circuitry, the network-on-chip, and the processor system are configured using a platform management controller.
    Type: Grant
    Filed: March 27, 2019
    Date of Patent: June 2, 2020
    Assignee: Xilinx, Inc.
    Inventors: Sagheer Ahmad, Jaideep Dastidar, Brian C. Gaide, Juan J. Noguera Serra, Ian A. Swarbrick
  • Patent number: 10664422
    Abstract: Various implementations of a multi-chip system operable according to a predefined transport protocol are disclosed. In one embodiment, a system comprises a first IC comprising a processing element communicatively coupled with first physical ports. The system further comprises a second IC comprising second physical ports communicatively coupled with a first set of the first physical ports via first physical links, and one or more memory devices that are communicatively coupled with the second physical ports and accessible by the processing element via the first physical links. The first IC further comprises a data structure describing a first level of port aggregation to be applied across the first set. The second IC comprises a first distribution function configured to provide ordering to data communicated using the second physical ports. The first distribution function is based on the first level of port aggregation.
    Type: Grant
    Filed: July 31, 2019
    Date of Patent: May 26, 2020
    Assignee: XILINX, INC.
    Inventors: Millind Mittal, Jaideep Dastidar
  • Patent number: 10649922
    Abstract: A system and method for efficiently scheduling requests. In various embodiments, a processor sends commands such as read requests and write requests to an arbiter. The arbiter reduces latencies between commands being sent to a communication fabric and corresponding data being sent to the fabric. When the arbiter selects a given request, the arbiter identifies a first subset of stored requests affected by the given request being selected. The arbiter adjusts one or more attributes of the first subset of requests based on the selection of the given request. In one example, the arbiter replaces a weight attribute with a value, such as a zero value, indicating the first subset of requests should not be selected. Therefore, during the next selection by the arbiter, only the requests in a second subset different from the first subset are candidates for selection.
    Type: Grant
    Filed: August 6, 2018
    Date of Patent: May 12, 2020
    Assignee: Apple Inc.
    Inventors: Shawn Munetoshi Fukami, Jaideep Dastidar, Yiu Chun Tse
  • Publication number: 20200057737
    Abstract: A system and method for efficiently arbitrating traffic on a bus. A computing system includes a fabric for routing traffic among one or more agents and one or more endpoints. The fabric includes multiple arbiters in an arbitration hierarchy. Arbiters store traffic in buffers with each buffer associated with a particular traffic type and a source of the traffic. Arbiters maintain a respective urgency counter for keeping track of a period of time traffic of a particular type is blocked by upstream arbiters. When the block is removed, the traffic of the particular type has priority for selection based on the urgency counter. When arbiters receive feedback from downstream arbiters or sources, the arbiters adjust selection priority accordingly. For example, changes in bandwidth requirement, low latency tolerance and active status cause adjustments in selection priority of stored requests.
    Type: Application
    Filed: August 20, 2018
    Publication date: February 20, 2020
    Inventors: Nachiappan Chidambaram Nachiappan, Jaideep Dastidar, Yiu Chun Tse, Ripudaman Singh, Shawn Munetoshi Fukami, Benjamin K. Dodge, Vinodh R. Cuppu
  • Publication number: 20200050379
    Abstract: A system and method for efficiently allocating data storage to agents. A computing system includes an interconnect with intermediate buffers for storing transactions and corresponding payload data during transport between sources and destinations. A data storage limit is set on an amount of data storage corresponding to outstanding transactions for each of the multiple sources based on the initial buffer assignments. A number of outstanding transactions for each of the multiple sources is limited based on a corresponding data storage limit. If the rate of allocation of a given buffer assigned to a first source exceeds a threshold, then a second source is selected with available space exceeding a threshold in an assigned buffer. If it is determined the second source is not assigned to a buffer with a rate of allocation exceeding a threshold, then buffer storage is reassigned from the second source to the first source.
    Type: Application
    Filed: August 9, 2018
    Publication date: February 13, 2020
    Inventors: Nachiappan Chidambaram Nachiappan, David L. Trawick, Yiu Chun Tse, Deniz Balkan, Hengsheng Geng, Shawn Munetoshi Fukami, Jaideep Dastidar, Benjamin K. Dodge, Vinodh R. Cuppu
  • Publication number: 20200042446
    Abstract: Circuits and methods for combined precise and imprecise snoop filtering. A memory and a plurality of processors are coupled to the interconnect circuitry. A plurality of cache circuits are coupled to the plurality of processor circuits, respectively. A first snoop filter is coupled to the interconnect and is configured to filter snoop requests by individual cache lines of a first subset of addresses of the memory. A second snoop filter is coupled to the interconnect and is configured to filter snoop requests by groups of cache lines of a second subset of addresses of the memory. Each group encompasses a plurality of cache lines.
    Type: Application
    Filed: August 2, 2018
    Publication date: February 6, 2020
    Applicant: Xilinx, Inc.
    Inventors: Millind Mittal, Jaideep Dastidar
  • Publication number: 20200042469
    Abstract: A system and method for efficiently scheduling requests. In various embodiments, a processor sends commands such as read requests and write requests to an arbiter. The arbiter reduces latencies between commands being sent to a communication fabric and corresponding data being sent to the fabric. When the arbiter selects a given request, the arbiter identifies a first subset of stored requests affected by the given request being selected. The arbiter adjusts one or more attributes of the first subset of requests based on the selection of the given request. In one example, the arbiter replaces a weight attribute with a value, such as a zero value, indicating the first subset of requests should not be selected. Therefore, during the next selection by the arbiter, only the requests in a second subset different from the first subset are candidates for selection.
    Type: Application
    Filed: August 6, 2018
    Publication date: February 6, 2020
    Inventors: Shawn Munetoshi Fukami, Jaideep Dastidar, Yiu Chun Tse