Patents by Inventor Sujoy Sen
Sujoy Sen has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240143410Abstract: Technologies for dividing work across one or more accelerator devices include a compute device. The compute device is to determine a configuration of each of multiple accelerator devices of the compute device, receive a job to be accelerated from a requester device remote from the compute device, and divide the job into multiple tasks for a parallelization of the multiple tasks among the one or more accelerator devices, as a function of a job analysis of the job and the configuration of each accelerator device. The compute engine is further to schedule the tasks to the one or more accelerator devices based on the job analysis and execute the tasks on the one or more accelerator devices for the parallelization of the multiple tasks to obtain an output of the job.Type: ApplicationFiled: January 5, 2024Publication date: May 2, 2024Applicant: Intel CorporationInventors: Susanne M. Balle, Francesc Guim Bernat, Slawomir Putyrski, Joe Grecco, Henry Mitchel, Evan Custodio, Rahul Khanna, Sujoy Sen
-
Publication number: 20240113954Abstract: Technologies for dynamically managing resources in disaggregated accelerators include an accelerator. The accelerator includes acceleration circuitry with multiple logic portions, each capable of executing a different workload. Additionally, the accelerator includes communication circuitry to receive a workload to be executed by a logic portion of the accelerator and a dynamic resource allocation logic unit to identify a resource utilization threshold associated with one or more shared resources of the accelerator to be used by a logic portion in the execution of the workload, limit, as a function of the resource utilization threshold, the utilization of the one or more shared resources by the logic portion as the logic portion executes the workload, and subsequently adjust the resource utilization threshold as the workload is executed. Other embodiments are also described and claimed.Type: ApplicationFiled: November 9, 2023Publication date: April 4, 2024Inventors: Francesc GUIM BERNAT, Susanne M. BALLE, Rahul KHANNA, Sujoy SEN, Karthik KUMAR
-
Patent number: 11941457Abstract: An apparatus to facilitate disaggregated computing for a distributed confidential computing environment is disclosed. The apparatus includes a source remote direct memory access (RDMA) network interface controller (RNIC); a queue to store a data entry corresponding to an RDMA request between the source RNIC and a sink RNIC; a data buffer to store data for an RDMA transfer corresponding to the RDMA request, the RDMA transfer between the source RNIC and the sink RNIC; and a trusted execution environment (TEE) comprising an authentication tag controller to: initialize a first authentication tag calculated using a first key known between a source consumer generating the RDMA request and the source RNIC; associate the first authentication tag with the data entry as integrity verification; initialize a second authentication tag calculated using a second key; and associate the second authentication tag with the data buffer as integrity verification for the data buffer.Type: GrantFiled: November 12, 2021Date of Patent: March 26, 2024Assignee: INTEL CORPORATIONInventors: Reshma Lal, Pradeep Pappachan, Luis Kida, Soham Jayesh Desai, Sujoy Sen, Selvakumar Panneer, Robert Sharp
-
Publication number: 20240086258Abstract: An apparatus to facilitate disaggregated computing for a distributed confidential computing environment is disclosed. The apparatus includes one or more processors to facilitate receiving a manifest corresponding to graph nodes representing regions of memory of a remote client machine, the graph nodes corresponding to a command buffer and to associated data structures and kernels of the command buffer used to initialize a hardware accelerator and execute the kernels, and the manifest indicating a destination memory location of each of the graph nodes and dependencies of each of the graph nodes; identifying, based on the manifest, the command buffer and the associated data structures to copy to the host memory; identifying, based on the manifest, the kernels to copy to local memory of the hardware accelerator; and patching addresses in the command buffer copied to the host memory with updated addresses of corresponding locations in the host memory.Type: ApplicationFiled: November 16, 2023Publication date: March 14, 2024Applicant: Intel CorporationInventors: Reshma Lal, Pradeep Pappachan, Luis Kida, Soham Jayesh Desai, Sujoy Sen, Selvakumar Panneer, Robert Sharp
-
Patent number: 11907557Abstract: Technologies for dividing work across one or more accelerator devices include a compute device. The compute device is to determine a configuration of each of multiple accelerator devices of the compute device, receive a job to be accelerated from a requester device remote from the compute device, and divide the job into multiple tasks for a parallelization of the multiple tasks among the one or more accelerator devices, as a function of a job analysis of the job and the configuration of each accelerator device. The compute engine is further to schedule the tasks to the one or more accelerator devices based on the job analysis and execute the tasks on the one or more accelerator devices for the parallelization of the multiple tasks to obtain an output of the job.Type: GrantFiled: February 25, 2022Date of Patent: February 20, 2024Assignee: Intel CorporationInventors: Susanne M. Balle, Francesc Guim Bernat, Slawomir Putyrski, Joe Grecco, Henry Mitchel, Evan Custodio, Rahul Khanna, Sujoy Sen
-
Patent number: 11893425Abstract: An apparatus to facilitate disaggregated computing for a distributed confidential computing environment is disclosed. The apparatus includes a processor executing a trusted execution environment (TEE) comprising a field-programmable gate array (FPGA) driver to interface with an FPGA device that is remote to the apparatus; and a remote memory-mapped input/output (MMIO) driver to expose the FPGA device as a legacy device to the FPGA driver, wherein the processor to utilize the remote MMIO driver to: enumerate the FPGA device using FPGA enumeration data provided by a remote management controller of the FPGA device, the FPGA enumeration data comprising a configuration space and device details; load function drivers for the FPGA device in the TEE; create corresponding device files in the TEE based on the FPGA enumeration data; and handle remote MMIO reads and writes to the FPGA device via a network transport protocol.Type: GrantFiled: November 19, 2021Date of Patent: February 6, 2024Assignee: INTEL CORPORATIONInventors: Reshma Lal, Pradeep Pappachan, Luis Kida, Soham Jayesh Desai, Sujoy Sen, Selvakumar Panneer, Robert Sharp
-
Patent number: 11855766Abstract: Technologies for dynamically managing resources in disaggregated accelerators include an accelerator. The accelerator includes acceleration circuitry with multiple logic portions, each capable of executing a different workload. Additionally, the accelerator includes communication circuitry to receive a workload to be executed by a logic portion of the accelerator and a dynamic resource allocation logic unit to identify a resource utilization threshold associated with one or more shared resources of the accelerator to be used by a logic portion in the execution of the workload, limit, as a function of the resource utilization threshold, the utilization of the one or more shared resources by the logic portion as the logic portion executes the workload, and subsequently adjust the resource utilization threshold as the workload is executed. Other embodiments are also described and claimed.Type: GrantFiled: April 29, 2022Date of Patent: December 26, 2023Assignee: Intel CorporationInventors: Francesc Guim Bernat, Susanne M. Balle, Rahul Khanna, Sujoy Sen, Karthik Kumar
-
Publication number: 20230111490Abstract: Examples described herein relate to a network interface that includes an initiator device to determine a storage node associated with an access command based on an association between an address in the command and a storage node. The network interface can include a redirector to update the association based on messages from one or more remote storage nodes. The association can be based on a look-up table associating a namespace identifier with prefix string and object size. In some examples, the access command is compatible with NVMe over Fabrics. The initiator device can determine a remote direct memory access (RDMA) queue-pair (QP) lookup for use to perform the access command.Type: ApplicationFiled: October 21, 2022Publication date: April 13, 2023Inventors: Yadong LI, Scott D. PETERSON, Sujoy SEN, David B. MINTURN
-
Publication number: 20230092541Abstract: Methods and apparatus to minimize hot/cold page detection overhead on running workloads. A page meta data structure is populated with meta data associated with memory pages in one or more far memory tier. In conjunction with one or more processes accessing memory pages to perform workloads, the page meta data structure is updated to reflect accesses to the memory pages. The page meta data, which reflects the current state of memory, is used to determine which pages are “hot” pages and which pages are “cold” pages, wherein hot pages are memory pages with relatively higher access frequencies and cold pages are memory pages with relatively lower access frequencies. Variations on the approach including filtering meta data updates on pages in memory regions of interest and applying a filter(s) to trigger meta data updates based on (a) condition(s). A callback function may also be triggered to be executed synchronously with memory page accesses.Type: ApplicationFiled: September 23, 2021Publication date: March 23, 2023Inventors: Francois DUGAST, Durgesh SRIVASTAVA, Sujoy SEN, Lidia WARNES, Thomas E. WILLIS, Bassam N. COURY
-
Publication number: 20230048915Abstract: A method of offloading performance of a workload includes receiving, on a first computing system acting as an initiator, a first function call from a caller, the first function call to be executed by an accelerator on a second computing system acting as a target, the first computing system coupled to the second computing system by a network; determining a type of the first function call; and generating a list of parameter values of the first function call.Type: ApplicationFiled: October 18, 2022Publication date: February 16, 2023Inventors: Pradeep Pappachan, Sujoy Sen, Joseph Grecco, Mukesh Gangadhar Bhavani Venkatesan, Reshma Lal
-
Patent number: 11579788Abstract: Technologies for providing shared memory for accelerator sleds includes an accelerator sled to receive, with a memory controller, a memory access request from an accelerator device to access a region of memory. The request is to identify the region of memory with a logical address. Additionally, the accelerator sled is to determine from a map of logical addresses and associated physical address, the physical address associated with the region of memory. In addition, the accelerator sled is to route the memory access request to a memory device associated with the determined physical address.Type: GrantFiled: July 30, 2020Date of Patent: February 14, 2023Assignee: Intel CorporationInventors: Henry Mitchel, Joe Grecco, Sujoy Sen, Francesc Guim Bernat, Susanne M. Balle, Evan Custodio, Paul Dormitzer
-
Patent number: 11550617Abstract: A method is described. The method includes performing the following with a storage end transaction agent within a storage sled of a rack mounted computing system: receiving a request to perform storage operations with one or more storage devices of the storage sled, the request specifying an all-or-nothing semantic for the storage operations; recognizing that all of the storage operations have successfully completed; after all of the storage operations have successfully completed, reporting to a CPU side transaction agent that sent the request that all of the storage operations have successfully completed.Type: GrantFiled: June 22, 2020Date of Patent: January 10, 2023Assignee: Intel CorporationInventors: Arun Raghunath, Yi Zou, Tushar Sudhakar Gohad, Anjaneya R. Chagam Reddy, Sujoy Sen
-
Patent number: 11537457Abstract: A method of offloading performance of a workload includes receiving, on a first computing system acting as an initiator, a first function call from a caller, the first function call to be executed by an accelerator on a second computing system acting as a target, the first computing system coupled to the second computing system by a network; determining a type of the first function call; and generating a list of parameter values of the first function call.Type: GrantFiled: June 25, 2021Date of Patent: December 27, 2022Assignee: INTEL CORPORATIONInventors: Pradeep Pappachan, Sujoy Sen, Joseph Grecco, Mukesh Gangadhar Bhavani Venkatesan, Reshma Lal
-
Patent number: 11531635Abstract: Technologies for providing I/O channel abstraction for accelerator device kernels include an accelerator device comprising circuitry to obtain availability data indicative of an availability of one or more accelerator device kernels in a system, including one or more physical communication paths to each accelerator device kernel. The circuitry is also configured to determine whether to establish a logical communication path between a kernel of the present accelerator device and another accelerator device kernel and establish, in response to a determination to establish the logical communication path as a function of the obtained availability data, the logical communication path between the kernel of the present accelerator device and the other accelerator device kernel.Type: GrantFiled: November 3, 2020Date of Patent: December 20, 2022Assignee: Intel CorporationInventors: Susanne M. Balle, Evan Custodio, Francesc Guim Bernat, Sujoy Sen, Slawomir Putyrski, Paul Dormitzer, Joseph Grecco
-
Patent number: 11509606Abstract: Examples described herein relate to a network interface that includes an initiator device to determine a storage node associated with an access command based on an association between an address in the command and a storage node. The network interface can include a redirector to update the association based on messages from one or more remote storage nodes. The association can be based on a look-up table associating a namespace identifier with prefix string and object size. In some examples, the access command is compatible with NVMe over Fabrics. The initiator device can determine a remote direct memory access (RDMA) queue-pair (QP) lookup for use to perform the access command.Type: GrantFiled: December 27, 2019Date of Patent: November 22, 2022Assignee: Intel CorporationInventors: Yadong Li, Scott D. Peterson, Sujoy Sen, David B. Minturn
-
Patent number: 11467873Abstract: Technologies for remote direct memory access (RDMA) queue pair quality of service (QoS) management are disclosed. In the illustrative embodiment, several queue pairs associated with a virtual machine on a compute sled may be created in a network interface controller of the compute sled. A QoS parameter such as a class of service identifier or a weighting may be assigned to each queue pair such that each queue pair has a different available bandwidth. The compute sled may also predict future RDMA queue pair bandwidth usage and adjust RDMA queue pair bandwidth allocation based on the prediction.Type: GrantFiled: July 29, 2019Date of Patent: October 11, 2022Assignee: Intel CorporationInventors: Mrittika Ganguli, Neerav Parikh, Robert Sharp, Sujoy Sen
-
Publication number: 20220321438Abstract: Technologies for dynamically managing resources in disaggregated accelerators include an accelerator. The accelerator includes acceleration circuitry with multiple logic portions, each capable of executing a different workload. Additionally, the accelerator includes communication circuitry to receive a workload to be executed by a logic portion of the accelerator and a dynamic resource allocation logic unit to identify a resource utilization threshold associated with one or more shared resources of the accelerator to be used by a logic portion in the execution of the workload, limit, as a function of the resource utilization threshold, the utilization of the one or more shared resources by the logic portion as the logic portion executes the workload, and subsequently adjust the resource utilization threshold as the workload is executed. Other embodiments are also described and claimed.Type: ApplicationFiled: April 29, 2022Publication date: October 6, 2022Inventors: Francesc GUIM BERNAT, Susanne M. BALLE, Rahul KHANNA, Sujoy SEN, Karthik KUMAR
-
Patent number: 11429297Abstract: Technologies for dividing work across one or more accelerator devices include a compute device. The compute device is to determine a configuration of each of multiple accelerator devices of the compute device, receive a job to be accelerated from a requester device remote from the compute device, and divide the job into multiple tasks for a parallelization of the multiple tasks among the one or more accelerator devices, as a function of a job analysis of the job and the configuration of each accelerator device. The compute engine is further to schedule the tasks to the one or more accelerator devices based on the job analysis and execute the tasks on the one or more accelerator devices for the parallelization of the multiple tasks to obtain an output of the job.Type: GrantFiled: May 14, 2021Date of Patent: August 30, 2022Assignee: Intel CorporationInventors: Susanne M. Balle, Francesc Guim Bernat, Slawomir Putyrski, Joe Grecco, Henry Mitchel, Evan Custodio, Rahul Khanna, Sujoy Sen
-
Patent number: 11422867Abstract: Technologies for composing a managed node based on telemetry data include communication circuitry and a compute device. The compute device is to receive resource-level telemetry data for each resource of a plurality of resources and rack-level telemetry data from each rack of a plurality of racks and a managed node composition request, which identifies at least one metric to be achieved by a managed node. In response to a receipt of the managed node composition request, the compute device is further to determine a present utilization of each resource of the plurality of resources and a present performance level of each rack of the plurality of racks, and determine a set of resources from the plurality of resources that satisfies the managed node composition request based on the resource-level and rack-level telemetry data.Type: GrantFiled: December 30, 2017Date of Patent: August 23, 2022Assignee: Intel CorporationInventors: Sujoy Sen, Mohan J. Kumar
-
Patent number: 11397653Abstract: Technologies for fast distributed storage recovery include a distributed storage system that includes multiple controller nodes and multiple target nodes. Each controller node is coupled to a corresponding target node via a storage fabric. Each target node stores replica data. The system identifies a failed node and a corresponding node that was coupled to the failed node. If the failed node is a controller node, the corresponding node is a target node. If the failed node is a target node, the corresponding node is a controller node. The system instantiates a replacement node, adds the replacement node to the system, and couples the replacement node to the corresponding node. The system may direct a backup target node to copy replica data to the replacement target node via the storage fabric. Other embodiments are described and claimed.Type: GrantFiled: May 29, 2019Date of Patent: July 26, 2022Assignee: Intel CorporationInventors: Yi Zou, Arun Raghunath, Tushar Gohad, Anjaneya Reddy Chagam Reddy, Sujoy Sen