Patents by Inventor Sujoy Sen

Sujoy Sen has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20220179575
    Abstract: Technologies for dividing work across one or more accelerator devices include a compute device. The compute device is to determine a configuration of each of multiple accelerator devices of the compute device, receive a job to be accelerated from a requester device remote from the compute device, and divide the job into multiple tasks for a parallelization of the multiple tasks among the one or more accelerator devices, as a function of a job analysis of the job and the configuration of each accelerator device. The compute engine is further to schedule the tasks to the one or more accelerator devices based on the job analysis and execute the tasks on the one or more accelerator devices for the parallelization of the multiple tasks to obtain an output of the job.
    Type: Application
    Filed: February 25, 2022
    Publication date: June 9, 2022
    Inventors: Susanne M. Balle, Francesc Guim Bernat, Slawomir Putyrski, Joe Grecco, Henry Mitchel, Evan Custodio, Rahul Khanna, Sujoy Sen
  • Patent number: 11336547
    Abstract: Technologies for dynamically managing resources in disaggregated accelerators include an accelerator. The accelerator includes acceleration circuitry with multiple logic portions, each capable of executing a different workload. Additionally, the accelerator includes communication circuitry to receive a workload to be executed by a logic portion of the accelerator and a dynamic resource allocation logic unit to identify a resource utilization threshold associated with one or more shared resources of the accelerator to be used by a logic portion in the execution of the workload, limit, as a function of the resource utilization threshold, the utilization of the one or more shared resources by the logic portion as the logic portion executes the workload, and subsequently adjust the resource utilization threshold as the workload is executed. Other embodiments are also described and claimed.
    Type: Grant
    Filed: April 20, 2021
    Date of Patent: May 17, 2022
    Assignee: Intel Corporation
    Inventors: Francesc Guim Bernat, Susanne M. Balle, Rahul Khanna, Sujoy Sen, Karthik Kumar
  • Publication number: 20220113913
    Abstract: Examples described herein relate to a network interface device that includes circuitry to receive a storage access command and determine a processing path in the network interface device for the storage access command, wherein the processing path is within the network interface device and wherein the processing path is selected from direct mapped or control plane processed based at least on command type and source of command. In some examples, the command type is read or write.
    Type: Application
    Filed: December 23, 2021
    Publication date: April 14, 2022
    Inventors: Jose NIELL, Yadong LI, Salma Mirza JOHNSON, Scott D. PETERSON, Sujoy SEN
  • Publication number: 20220114030
    Abstract: Examples described herein relate to a network interface device that includes circuitry to perform operations, offloaded from a host, to identify at least one locator of at least one target storage associated with a storage access command based on operations selected from among multiple available operations, wherein the available operations comprise two or more: entry lookup by the network interface device, hash-based calculation on the network interface device, or control plane processing on the network interface device.
    Type: Application
    Filed: December 23, 2021
    Publication date: April 14, 2022
    Inventors: Salma Mirza JOHNSON, Jose NIELL, Bradley A. BURRES, Yadong LI, Scott D. PETERSON, Tony HURSON, Sujoy SEN
  • Patent number: 11301407
    Abstract: Technologies for accessing pooled accelerator resources over a network fabric are disclosed. In disclosed embodiments, an application hosted by a computing platform accesses remote accelerator resources over a network fabric using protocol multipathing mechanisms. A communication session is established with the remote accelerator resources. The communication session comprises at least two connections. The at least two connections at least include a first connection having or utilizing a first transport layer and a second connection having or utilizing a second transport layer that is different than the first transport layer. Other embodiments may be disclosed and/or claimed.
    Type: Grant
    Filed: January 8, 2019
    Date of Patent: April 12, 2022
    Assignee: Intel Corporation
    Inventors: Sujoy Sen, Narayan Ranganathan
  • Publication number: 20220100580
    Abstract: An apparatus to facilitate disaggregated computing for a distributed confidential computing environment is disclosed. The apparatus includes one or more processors to: provide a remote GPU middleware layer to act as a proxy for an application stack on a client platform separate from the apparatus; communicate, by the remote GPU middleware layer, with a kernel mode driver of the one or more processors to cause the host memory to be allocated for command buffers and data structures received from the client platform for consumption by a command streamer of a remote GPU of the apparatus; and invoke, by the remote GPU middleware layer, the kernel mode driver to submit a workload generated by the application stack, the workload submitted for processing by the remote GPU using the command buffers and the data structures allocated in the host memory as directed by the command streamer.
    Type: Application
    Filed: November 15, 2021
    Publication date: March 31, 2022
    Applicant: Intel Corporation
    Inventors: Reshma Lal, Pradeep Pappachan, Luis Kida, Soham Jayesh Desai, Sujoy Sen, Selvakumar Panneer, Robert Sharp
  • Publication number: 20220100582
    Abstract: An apparatus to facilitate disaggregated computing for a distributed confidential computing environment is disclosed. The apparatus includes a processor executing a trusted execution environment (TEE) comprising a field-programmable gate array (FPGA) driver to interface with an FPGA device that is remote to the apparatus; and a remote memory-mapped input/output (MMIO) driver to expose the FPGA device as a legacy device to the FPGA driver, wherein the processor to utilize the remote MMIO driver to: enumerate the FPGA device using FPGA enumeration data provided by a remote management controller of the FPGA device, the FPGA enumeration data comprising a configuration space and device details; load function drivers for the FPGA device in the TEE; create corresponding device files in the TEE based on the FPGA enumeration data; and handle remote MMIO reads and writes to the FPGA device via a network transport protocol.
    Type: Application
    Filed: November 19, 2021
    Publication date: March 31, 2022
    Applicant: Intel Corporation
    Inventors: Reshma Lal, Pradeep Pappachan, Luis Kida, Soham Jayesh Desai, Sujoy Sen, Selvakumar Panneer, Robert Sharp
  • Publication number: 20220100583
    Abstract: An apparatus to facilitate disaggregated computing for a distributed confidential computing environment is disclosed. The apparatus includes a programmable integrated circuit (IC) comprising secure device manager (SDM) hardware circuitry to: receive a tenant bitstream of a tenant and a tenant use policy for utilization of the programmable IC via the tenant bitstream, wherein the tenant use policy is cryptographically bound to the tenant bitstream by a cloud service provider (CSP) authorizing entity and signed with a signature of the CSP authorizing entity; in response to successfully verifying the signature, extract the tenant use policy to provide to a policy manager of the programmable IC for verification; in response to the policy manager verifying the tenant bitstream based on the tenant use policy, configure a partial reconfiguration (PR) region of the programmable IC using the tenant bitstream; and associate a slot ID of the PR region with the tenant use policy.
    Type: Application
    Filed: November 22, 2021
    Publication date: March 31, 2022
    Applicant: Intel Corporation
    Inventors: Reshma Lal, Pradeep Pappachan, Luis Kida, Soham Jayesh Desai, Sujoy Sen, Selvakumar Panneer, Robert Sharp
  • Publication number: 20220100579
    Abstract: An apparatus to facilitate disaggregated computing for a distributed confidential computing environment is disclosed. The apparatus includes a source remote direct memory access (RDMA) network interface controller (RNIC); a queue to store a data entry corresponding to an RDMA request between the source RNIC and a sink RNIC; a data buffer to store data for an RDMA transfer corresponding to the RDMA request, the RDMA transfer between the source RNIC and the sink RNIC; and a trusted execution environment (TEE) comprising an authentication tag controller to: initialize a first authentication tag calculated using a first key known between a source consumer generating the RDMA request and the source RNIC; associate the first authentication tag with the data entry as integrity verification; initialize a second authentication tag calculated using a second key; and associate the second authentication tag with the data buffer as integrity verification for the data buffer.
    Type: Application
    Filed: November 12, 2021
    Publication date: March 31, 2022
    Applicant: Intel Corporation
    Inventors: Reshma Lal, Pradeep Pappachan, Luis Kida, Soham Jayesh Desai, Sujoy Sen, Selvakumar Panneer, Robert Sharp
  • Publication number: 20220100584
    Abstract: An apparatus to facilitate disaggregated computing for a distributed confidential computing environment is disclosed. The apparatus includes a programmable integrated circuit (IC) comprising system manager hardware circuitry to: interface, over a network, with a remote application of a client platform, the system manager hardware circuitry to interface with the remote application using a message-based interface; perform resource management of resources of the programmable IC; validate incoming messages to the programmable IC; verify whether a requester is allowed to perform requested actions of the incoming messages that are successfully validated; and manage transfer of data between the programmable IC and the remote application based on successfully verifying the requester.
    Type: Application
    Filed: November 22, 2021
    Publication date: March 31, 2022
    Applicant: Intel Corporation
    Inventors: Reshma Lal, Pradeep Pappachan, Luis Kida, Soham Jayesh Desai, Sujoy Sen, Selvakumar Panneer, Robert Sharp
  • Publication number: 20220100581
    Abstract: An apparatus to facilitate disaggregated computing for a distributed confidential computing environment is disclosed. The apparatus includes a graphics processing unit (GPU) to: provide a virtual GPU monitor (VGM) to interface over a network with a middleware layer of a client platform, the VGM to interface with the middleware layer using a message passing interface; configure and expose, by the VGM, virtual functions (VFs) of the GPU to the middleware layer of the client platform; intercept, by the VGM, request messages directed to the GPU from the middleware layer, the request messages corresponding to VFs of the GPU to be utilized by the client platform; and generate, by the VGM, a response to the request messages for the middleware client.
    Type: Application
    Filed: November 17, 2021
    Publication date: March 31, 2022
    Applicant: Intel Corporation
    Inventors: Reshma Lal, Pradeep Pappachan, Luis Kida, Soham Jayesh Desai, Sujoy Sen, Selvakumar Panneer, Robert Sharp
  • Patent number: 11290392
    Abstract: Technologies for pooling accelerators over fabric are disclosed. In the illustrative embodiment, an application may access an accelerator device over an application programming interface (API) and the API can access an accelerator device that is either local or a remote accelerator device that is located on a remote accelerator sled over a network fabric. The API may employ a send queue and a receive queue to send and receive command capsules to and from the accelerator sled.
    Type: Grant
    Filed: June 12, 2017
    Date of Patent: March 29, 2022
    Assignee: Intel Corporation
    Inventors: Sujoy Sen, Mohan J. Kumar, Donald L. Faw, Susanne M. Balle, Narayan Ranganathan
  • Patent number: 11269395
    Abstract: Technologies for providing adaptive power management in an accelerator sled include an accelerator sled having circuitry to determine, based on (i) a total power budget for the accelerator sled, (ii) service level agreement (SLA) data indicative of a target performance of a kernel, and (iii) profile data indicative of a performance of the kernel as a function of a power utilization of the kernel, a power utilization limit for the kernel to be executed by an accelerator device on the accelerator sled. Additionally, the circuitry is to allocate the determined power utilization limit to the kernel and execute the kernel under the allocated power utilization limit.
    Type: Grant
    Filed: April 25, 2019
    Date of Patent: March 8, 2022
    Assignee: Intel Corporation
    Inventors: Francesc Guim Bernat, Susanne M. Balle, Sujoy Sen, Evan Custodio, Paul H. Dormitzer
  • Publication number: 20220050722
    Abstract: Examples described herein relate to providing an interface to an operating system (OS) to create different memory pool classes to allocate to one or more processes and allocate a memory pool class with a process of the one or more processes. In some examples, a memory pool class of the different memory pool classes defines a mixture of memory devices in at least one memory pool available for access by the one or more processes. In some examples, memory devices are associated with multiple memory pool classes to provide multiple different categories of memory resource capabilities.
    Type: Application
    Filed: October 29, 2021
    Publication date: February 17, 2022
    Inventors: Francois DUGAST, Florent PIROU, Sujoy SEN, Lidia WARNES, Thomas E. WILLIS, Durgesh SRIVASTAVA
  • Patent number: 11228539
    Abstract: Technologies for network interface controllers (NICs) include a compute sled and an accelerator sled in communication over a network. The accelerator sled configures a virtual switch endpoint associated with a remote direct memory access (RDMA) server instance that is associated with a field-programmable gate array (FPGA) of the accelerator sled. The accelerator sled updates local software defined networking (SDN) tables with a virtual tunnel associated with the virtual switch endpoint and a remote compute sled. A virtual switch of the accelerator sled switches virtual tunnel traffic from the remote compute sled to the RDMA server instance, which transfers data to or from the FPGA. The compute sled also updates a local SDN table with the virtual tunnel, and a virtual switch of the compute sled switches virtual tunnel traffic to or from the accelerator sled. Other embodiments are described and claimed.
    Type: Grant
    Filed: August 14, 2019
    Date of Patent: January 18, 2022
    Assignee: Intel Corporation
    Inventors: Mrittika Ganguli, Sugesh Chandran, Parthasarathy Sarangam, Sujoy Sen, Susanne M. Balle, Rajesh Sankaran
  • Patent number: 11194522
    Abstract: Apparatuses for computing are disclosed herein. An apparatus may include a set of data reduction modules to perform data reduction operations on sets of (key, value) data pairs to reduce an amount of values associated with a shared key, wherein the (key, value) data pairs are stored in a plurality of queues located in a plurality of solid state drives remote from the apparatus. The apparatus may further include a memory access module, communicably coupled to the set of data reduction modules, to directly transfer individual ones of the sets of queued (key, value) data pairs from the plurality of remote solid state drives through remote random access of the solid state drives, via a network, without using intermediate staging storage. Other embodiments may be disclosed or claimed.
    Type: Grant
    Filed: August 16, 2017
    Date of Patent: December 7, 2021
    Assignee: Intel Corporation
    Inventors: Xiao Hu, Huan Zhou, Sujoy Sen, Anjaneya R. Chagam Reddy, Mohan J. Kumar, Chong Han
  • Publication number: 20210359955
    Abstract: Examples described herein relate to a network interface device comprising: a host interface, a direct memory access (DMA) engine, and circuitry to allocate a region in a cache to store a context of a connection. In some examples, the circuitry is to allocate a region in a cache to store a context of a connection based on connection reliability and wherein connection reliability comprises use of a reliable transport protocol or non-use of a reliable transport protocol. In some examples, the circuitry is to allocate a region in a cache to store a context of a connection based on expected length of runtime of the connection and the expected length of runtime of the connection is based on a historic average amount of time the context for the connection was stored in the cache. In some examples, the circuitry is to allocate a region in a cache to store a context of a connection based on content transmitted and the content transmitted comprises congestion messaging payload or acknowledgement.
    Type: Application
    Filed: July 23, 2021
    Publication date: November 18, 2021
    Inventors: Malek MUSLEH, Tony HURSON, Pedro YEBENES SEGURA, Allister ALEMANIA, Roberto PENARANDA CEBRIAN, Ayan BANERJEE, Robert SOUTHWORTH, Sujoy SEN, Curt E. BRUNS
  • Publication number: 20210326270
    Abstract: Examples described herein relate to a network interface device comprising circuitry to receive an access request with a target logical block address (LBA) and based on a target media of the access request storing at least one object, translate the target LBA to an address and access content in the target media based on the address. In some examples, translate the target LBA to an address includes access a translation entry that maps the LBA to one or more of: a physical address or a virtual address. In some examples, translate the target LBA to an address comprises: request a software defined storage (SDS) stack to provide a translation of the LBA to one or more of: a physical address or a virtual address and store the translation into a mapping table for access by the circuitry. In some examples, at least one entry that maps the LBA to one or more of: a physical address or a virtual address is received before receipt of an access request.
    Type: Application
    Filed: June 26, 2021
    Publication date: October 21, 2021
    Inventors: Yi ZOU, Arun RAGHUNATH, Scott D. PETERSON, Sujoy SEN, Yadong LI
  • Publication number: 20210318961
    Abstract: Methods and apparatus for mitigating pooled memory cache miss latency with cache miss faults and transaction aborts. A compute platform coupled to one or more tiers of memory, such as remote pooled memory in a disaggregated environment, executes memory transactions to access objects that are stored in the one or more tiers. A determination is made as to whether a copy of the object is in a local cache on the platform; if it is, the object is accessed from the local cache. If the object is not in the local cache, a transaction abort may be generated if enabled for the transactions. Optionally, a cache miss page fault is generated if the object is in a cacheable region of a memory tier, and the transaction abort is not enabled. Various mechanisms are provided to determine what to do in response to a cache miss page fault, such as determining addresses for cache lines to prefetch from a memory tier storing the object(s), determining how much data to prefetch, and determining whether to perform a bulk transfer.
    Type: Application
    Filed: June 23, 2021
    Publication date: October 14, 2021
    Inventors: Scott D. PETERSON, Sujoy SEN, Francesc GUIM BERNAT
  • Publication number: 20210318920
    Abstract: A method of offloading performance of a workload includes receiving, on a first computing system acting as an initiator, a first function call from a caller, the first function call to be executed by an accelerator on a second computing system acting as a target, the first computing system coupled to the second computing system by a network; determining a type of the first function call; and generating a list of parameter values of the first function call.
    Type: Application
    Filed: June 25, 2021
    Publication date: October 14, 2021
    Applicant: Intel Corporation
    Inventors: Pradeep Pappachan, Sujoy Sen, Joseph Grecco, Mukesh Gangadhar Bhavani Venkatesan, Reshma Lal
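
For readers unfamiliar with the offload pattern summarized in the last abstract above (publication 20210318920), the following is a minimal Python sketch of the initiator-side steps it names: receiving a function call, determining its type, and generating a list of its parameter values. It is an illustration only and does not reproduce the method claimed in the application. The names `CallType`, `OffloadRequest`, and `build_request`, the synchronous/asynchronous meaning given to "type", and the JSON encoding are all assumptions made for this example.

```python
import json
from dataclasses import dataclass
from enum import Enum


class CallType(Enum):
    # Hypothetical call types; the abstract does not say what "type" covers.
    SYNC = "sync"
    ASYNC = "async"


@dataclass
class OffloadRequest:
    """A function call packaged by the initiator for a remote accelerator."""
    function: str
    call_type: CallType
    params: list

    def to_wire(self) -> bytes:
        # JSON is an assumed encoding, chosen only to keep the sketch short.
        message = {
            "function": self.function,
            "type": self.call_type.value,
            "params": self.params,
        }
        return json.dumps(message).encode("utf-8")


def build_request(function, args, asynchronous=False):
    """Determine the call type and collect the parameter values as a list."""
    call_type = CallType.ASYNC if asynchronous else CallType.SYNC
    return OffloadRequest(function, call_type, list(args))
```

A caller on the initiator could build a request with `build_request("vector_add", (1, 2))` and pass the bytes from `to_wire()` to whatever network transport connects it to the target system. The function name `vector_add` is likewise hypothetical.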