Patents Assigned to d-MATRIX CORPORATION
-
Patent number: 12353985
Abstract: A server system with AI accelerator apparatuses using in-memory compute chiplet devices. The system includes a plurality of multiprocessors, each having at least a first server central processing unit (CPU) and a second server CPU, both of which are coupled to a plurality of switch devices. Each switch device is coupled to a plurality of AI accelerator apparatuses. The apparatus includes one or more chiplets, each of which includes a plurality of tiles. Each tile includes a plurality of slices, a CPU, and a hardware dispatch device. Each slice can include a digital in-memory compute (DIMC) device configured to perform high-throughput computations. In particular, the DIMC device can be configured to accelerate the computations of attention functions for transformer-based models (a.k.a. transformers) applied to machine learning applications. A single input multiple data (SIMD) device is configured to further process the DIMC output and compute softmax functions for the attention functions.
Type: Grant
Filed: October 13, 2023
Date of Patent: July 8, 2025
Assignee: d-MATRIX CORPORATION
Inventors: Jayaprakash Balachandran, Akhil Arunkumar, Aayush Ankit, Nithesh Kurella, Sudeep Bhoja
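For context, the division of labor the abstract describes can be illustrated with scaled dot-product attention: the matrix multiplies correspond to the high-throughput work assigned to the DIMC device, and the softmax corresponds to the elementwise/reduction work assigned to the SIMD device. This is only an illustrative sketch of the underlying math, not the patented implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: the per-row normalization the
    # abstract assigns to the SIMD device.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # The two matrix multiplies are the high-throughput computations
    # the abstract assigns to the DIMC device.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    return softmax(scores) @ V
```

In an accelerator of this shape, the two matmuls and the softmax would run on different hardware units, with the dispatch device sequencing data movement between them.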
-
Patent number: 12299484
Abstract: A hardware and software co-designed dispatch engine (DE) apparatus. The DE apparatus can be configured to store a compute workload having groups of tasks in the form of a hierarchy of serial and/or concurrent queues in a task queue. Also, the DE can use various hardware modules to asynchronously delegate the tasks to various resources or destination devices and to track the completion of such tasks and task groups in an efficient manner. The DE can also include an interrupt/completion handler module, a resource monitor module, and a task dispatcher module configured with the task queue module to track and dispatch work units that are sent to various destination devices for processing. Using this approach, the DE apparatus can be configured with a processing unit to coordinate the processing of work units in a manner that efficiently uses the most critical resources with minimal added cost of silicon area.
Type: Grant
Filed: March 16, 2022
Date of Patent: May 13, 2025
Assignee: d-MATRIX CORPORATION
Inventors: Satyam Srivastava, Joseph P. Sprowes, Mayank Gupta
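The queueing and completion-tracking pattern the abstract describes can be sketched in software. The class and method names below are hypothetical, chosen only to illustrate the idea of grouped tasks dispatched to destination resources with per-group completion counters; the patented engine realizes this in dedicated hardware modules.

```python
from collections import deque

class DispatchEngine:
    """Illustrative sketch (hypothetical API): serial groups of tasks
    dispatched to a destination resource, with per-group completion
    tracking as results come back."""

    def __init__(self):
        self.queue = deque()   # task-group hierarchy, flattened to serial groups
        self.pending = {}      # group id -> count of outstanding tasks

    def submit_group(self, gid, tasks):
        # Enqueue a group of tasks and record how many must complete.
        self.queue.append((gid, tasks))
        self.pending[gid] = len(tasks)

    def dispatch(self, run):
        # Delegate each task to a destination resource (here, `run`),
        # decrementing the group's counter on each completion.
        results = []
        while self.queue:
            gid, tasks = self.queue.popleft()
            for t in tasks:
                results.append(run(t))
                self.pending[gid] -= 1
        return results

    def group_done(self, gid):
        # A group is complete once no tasks remain outstanding.
        return self.pending.get(gid, 0) == 0
```

Tracking completion per group, rather than per task, is what lets downstream work be released as soon as an entire group finishes.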
-
Patent number: 12271321
Abstract: An AI accelerator apparatus using in-memory compute chiplet devices. The apparatus includes one or more chiplets, each of which includes a plurality of tiles. Each tile includes a plurality of slices, a central processing unit (CPU), and a hardware dispatch device. Each slice can include a digital in-memory compute (DIMC) device configured to perform high-throughput computations. In particular, the DIMC device can be configured to accelerate the computations of attention functions for transformer-based models (a.k.a. transformers) applied to machine learning applications. A single input multiple data (SIMD) device is configured to further process the DIMC output and compute softmax functions for the attention functions. The chiplet can also include die-to-die (D2D) interconnects, a peripheral component interconnect express (PCIe) bus, a dynamic random access memory (DRAM) interface, and a global CPU interface to facilitate communication between the chiplets, memory, and a server or host system.
Type: Grant
Filed: October 24, 2023
Date of Patent: April 8, 2025
Assignee: d-MATRIX CORPORATION
Inventors: Sudeep Bhoja, Siddharth Sheth
-
Patent number: 12260223
Abstract: An AI accelerator apparatus using in-memory compute chiplet devices. The apparatus includes one or more chiplets, each of which includes a plurality of tiles. Each tile includes a plurality of slices, a central processing unit (CPU), and a hardware dispatch device. Each slice can include a digital in-memory compute (DIMC) device configured to perform high-throughput computations. In particular, the DIMC device can be configured to accelerate the computations of attention functions for transformer-based models (a.k.a. transformers) applied to machine learning applications, including generative AI. A single input multiple data (SIMD) device is configured to further process the DIMC output and compute softmax functions for the attention functions.
Type: Grant
Filed: November 23, 2022
Date of Patent: March 25, 2025
Assignee: d-MATRIX CORPORATION
Inventors: Sudeep Bhoja, Siddharth Sheth
-
Patent number: 12182028
Abstract: A transformer compute apparatus and method of operation therefor. The apparatus receives matrix inputs in a first format and generates projection tokens from these inputs. Among others, the apparatus includes a first cache device configured for processing first projection tokens and a second cache device configured for processing second projection tokens. The first cache device stores the first projection tokens in a first cache region and stores these tokens converted to a second format in a second cache region. The second cache device stores the second projection tokens converted to the second format in a first cache region and stores the converted second projection tokens after being transposed. Then, a compute device performs various matrix computations with the converted first projection tokens and transposed second projection tokens. Re-processing data and expensive padding and de-padding operations for transposed storage and byte alignment can be avoided using this caching process.
Type: Grant
Filed: September 28, 2023
Date of Patent: December 31, 2024
Assignee: d-MATRIX CORPORATION
Inventors: Akhil Arunkumar, Satyam Srivastava, Aayush Ankit
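The caching idea in this abstract — store projection tokens already converted to the second format and, for the second projections, also in transposed form, so later matrix computations need no re-processing — can be sketched as follows. The class, method names, and the choice of float16 as the "second format" are illustrative assumptions, not details from the patent.

```python
import numpy as np

class ProjectionCache:
    """Illustrative sketch (hypothetical names): a two-region cache that
    keeps format-converted projection tokens alongside their transpose,
    so compute-time matmuls need no conversion or re-transposition."""

    def __init__(self):
        self.converted = []   # tokens in the second format (assumed float16)
        self.transposed = []  # same tokens, pre-transposed for matmul

    def store(self, token_block):
        # Region 1: convert the incoming tokens to the second format.
        conv = token_block.astype(np.float16)
        self.converted.append(conv)
        # Region 2: cache the transpose at store time, not compute time.
        self.transposed.append(conv.T)

    def keys_T(self):
        # Ready-to-use transposed projections: the consumer multiplies
        # directly, with no padding/de-padding pass for transposed storage.
        return np.concatenate(self.transposed, axis=1)
```

Paying the conversion and transpose cost once at store time is the trade the abstract describes: slightly more cache traffic per write in exchange for avoiding repeated re-processing on every read.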
-
Patent number: 12147359
Abstract: An AI accelerator apparatus using in-memory compute chiplet devices. The apparatus includes one or more chiplets, each of which includes a plurality of tiles. Each tile includes a plurality of slices, a central processing unit (CPU), and a hardware dispatch device. Each slice can include a digital in-memory compute (DIMC) device configured to perform high-throughput computations. In particular, the DIMC device can be configured to accelerate the computations of attention functions for transformer-based models (a.k.a. transformers) applied to machine learning applications. A single input multiple data (SIMD) device is configured to further process the DIMC output and compute softmax functions for the attention functions. The chiplet can also include die-to-die (D2D) interconnects, a peripheral component interconnect express (PCIe) bus, a dynamic random access memory (DRAM) interface, and a global CPU interface to facilitate communication between the chiplets, memory, and a server or host system.
Type: Grant
Filed: January 25, 2024
Date of Patent: November 19, 2024
Assignee: d-MATRIX CORPORATION
Inventors: Sudeep Bhoja, Siddharth Sheth
-
Patent number: 11886359
Abstract: An AI accelerator apparatus using in-memory compute chiplet devices. The apparatus includes one or more chiplets, each of which includes a plurality of tiles. Each tile includes a plurality of slices, a central processing unit (CPU), and a hardware dispatch device. Each slice can include a digital in-memory compute (DIMC) device configured to perform high-throughput computations. In particular, the DIMC device can be configured to accelerate the computations of attention functions for transformer-based models (a.k.a. transformers) applied to machine learning applications. A single input multiple data (SIMD) device is configured to further process the DIMC output and compute softmax functions for the attention functions. The chiplet can also include die-to-die (D2D) interconnects, a peripheral component interconnect express (PCIe) bus, a dynamic random access memory (DRAM) interface, and a global CPU interface to facilitate communication between the chiplets, memory, and a server or host system.
Type: Grant
Filed: October 17, 2022
Date of Patent: January 30, 2024
Assignee: d-MATRIX CORPORATION
Inventors: Sudeep Bhoja, Siddharth Sheth
-
Patent number: 11847072
Abstract: An AI accelerator apparatus using in-memory compute chiplet devices. The apparatus includes one or more chiplets, each of which includes a plurality of tiles. Each tile includes a plurality of slices, a central processing unit (CPU), and a hardware dispatch device. Each slice can include a digital in-memory compute (DIMC) device configured to perform high-throughput computations. In particular, the DIMC device can be configured to accelerate the computations of attention functions for transformer-based models (a.k.a. transformers) applied to machine learning applications. A single input multiple data (SIMD) device is configured to further process the DIMC output and compute softmax functions for the attention functions. The chiplet can also include die-to-die (D2D) interconnects, a peripheral component interconnect express (PCIe) bus, a dynamic random access memory (DRAM) interface, and a global CPU interface to facilitate communication between the chiplets, memory, and a server or host system.
Type: Grant
Filed: November 30, 2021
Date of Patent: December 19, 2023
Assignee: d-MATRIX CORPORATION
Inventors: Sudeep Bhoja, Siddharth Sheth