Patents by Inventor Subhankar PAL

Subhankar PAL has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20250244980
    Abstract: Provided are a computer program product, system, and method for compiling an application having polynomial operations to produce directed acyclic graphs of commands to execute in a near memory processing device. An application including operations on a polynomial, whose coefficients are decomposed into a number of levels of coefficient elements, is compiled to generate hierarchical directed acyclic graphs (DAGs) whose nodes indicate commands for execution by a hierarchy of hardware components in a near memory processing (NMP) device. The hierarchy of hardware components includes a plurality of enclaves of tiles. Each tile includes memory and a processing element to perform operations on the decomposed coefficients stored in the tile's memory. Each hardware component includes a controller to process the commands in the DAG generated for that component. The DAGs are provided to a hierarchical DAG tracker to generate commands for the NMP device. (An illustrative sketch of such a hierarchical command DAG follows this entry.)
    Type: Application
    Filed: January 26, 2024
    Publication date: July 31, 2025
    Inventors: Yongmo Park, Subhankar Pal, Aporva Amarnath, Alper Buyuktosunoglu, Pradip Bose
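A minimal Python sketch of the idea in publication 20250244980: a toy compiler emits a hierarchical DAG in which a device-level polynomial operation fans out to limb-level enclave commands and element-level tile commands. The names (Command, DAGNode, compile_poly_mult) and the fan-out scheme are illustrative assumptions, not the patented design.

```python
from dataclasses import dataclass, field

@dataclass
class Command:
    op: str            # e.g. "POLY_MULT", "LIMB_MULT", "MULT"
    level: str         # hardware level: "device", "enclave", or "tile"
    operands: tuple = ()

@dataclass
class DAGNode:
    command: Command
    children: list = field(default_factory=list)   # dependent commands

def compile_poly_mult(num_limbs: int, tiles_per_enclave: int) -> DAGNode:
    """Emit a toy hierarchical DAG for an element-wise polynomial multiply:
    a device-level root fans out to one enclave-level command per limb, and
    each enclave fans out tile-level multiplies on coefficient elements."""
    root = DAGNode(Command("POLY_MULT", "device"))
    for limb in range(num_limbs):
        enclave = DAGNode(Command("LIMB_MULT", "enclave", (limb,)))
        for tile in range(tiles_per_enclave):
            enclave.children.append(DAGNode(Command("MULT", "tile", (limb, tile))))
        root.children.append(enclave)
    return root

def track(node: DAGNode, depth: int = 0):
    """Depth-first walk standing in for the hierarchical DAG tracker that
    issues commands to the NMP device in dependency order."""
    print("  " * depth + str(node.command))
    for child in node.children:
        track(child, depth + 1)

track(compile_poly_mult(num_limbs=2, tiles_per_enclave=4))
```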
  • Publication number: 20250245285
    Abstract: Provided are a device, system, and computer program product for a near memory processing device to process coefficient elements resulting from decomposition of polynomials. A near memory processing device includes a plurality of enclaves and a plurality of interconnected tiles on each enclave. Coefficients of a polynomial are decomposed into a number of levels of coefficient elements, where each level of coefficient elements comprises a limb. A device controller receives hierarchical commands, from an application, that map operations on limbs of coefficient elements to the enclaves and that map operations for the enclaves to the tiles in the enclaves. The device controller distributes the operations in the hierarchical commands to the enclaves, and each enclave in turn distributes the operations on the coefficient elements to its tiles. (An illustrative sketch of this distribution path follows this entry.)
    Type: Application
    Filed: January 26, 2024
    Publication date: July 31, 2025
    Inventors: Yongmo Park, Subhankar Pal, Aporva Amarnath, Alper Buyuktosunoglu, Pradip Bose
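A companion sketch for publication 20250245285, under the same caveat: a stand-in device controller forwards limb-level commands to enclaves, and each enclave spreads the element-level work across its tiles. The class names, command format, and round-robin element assignment are our assumptions.

```python
class Tile:
    def __init__(self, ident):
        self.ident = ident
        self.memory = {}   # coefficient elements resident in tile memory

    def execute(self, op, limb, elem):
        print(f"tile {self.ident}: {op} on limb {limb}, element {elem}")

class Enclave:
    def __init__(self, ident, num_tiles):
        self.tiles = [Tile((ident, t)) for t in range(num_tiles)]

    def dispatch(self, op, limb, elements):
        # Spread the limb's coefficient elements across this enclave's tiles.
        for i, elem in enumerate(elements):
            self.tiles[i % len(self.tiles)].execute(op, limb, elem)

class DeviceController:
    def __init__(self, num_enclaves, tiles_per_enclave):
        self.enclaves = [Enclave(e, tiles_per_enclave) for e in range(num_enclaves)]

    def run(self, hierarchical_commands):
        # Each hierarchical command maps one limb's operation onto one enclave.
        for cmd in hierarchical_commands:
            self.enclaves[cmd["enclave"]].dispatch(cmd["op"], cmd["limb"], cmd["elements"])

DeviceController(num_enclaves=2, tiles_per_enclave=2).run([
    {"enclave": 0, "op": "ADD", "limb": 0, "elements": [0, 1, 2, 3]},
    {"enclave": 1, "op": "ADD", "limb": 1, "elements": [0, 1, 2, 3]},
])
```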
  • Publication number: 20250139443
    Abstract: Provided are a computer program product, system, and method for mapping input nodes in a transform network to input nodes of a smaller transform network. A first transform network has N input nodes and successive columns of interlinked nodes at which input data is processed. A mapping is generated of the N input nodes to n input nodes of a second transform network implemented in processing tiles in a hardware unit, such that n is less than N and multiple ones of the N input nodes of the first transform network map to one of the n input nodes of the second transform network. The mapping is used to map the N input nodes of the first transform network to the n input nodes of the second transform network implemented in hardware. (An illustrative sketch of such a mapping follows this entry.)
    Type: Application
    Filed: October 27, 2023
    Publication date: May 1, 2025
    Inventors: Aporva Amarnath, Subhankar Pal, Yongmo Park, Alper Buyuktosunoglu
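A minimal sketch of the mapping step in publication 20250139443, assuming a simple modular folding rule (the abstract does not commit to one): several of the large network's N input nodes share each of the small network's n input nodes.

```python
def make_input_mapping(N: int, n: int) -> dict:
    """Return {physical_node: [logical_nodes]}: with N > n, several of the
    large network's inputs share one input node of the smaller network."""
    assert N > n and N % n == 0, "illustrative constraint: N a multiple of n"
    mapping = {p: [] for p in range(n)}
    for logical in range(N):
        mapping[logical % n].append(logical)
    return mapping

# Example: a 16-input transform network folded onto 4 hardware input nodes.
for phys, logicals in make_input_mapping(16, 4).items():
    print(f"physical input {phys} <- logical inputs {logicals}")
```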
  • Publication number: 20250139444
    Abstract: Provided are a computer program product, system, and method for mapping data for nodes in a first transform network to input nodes of near-memory processing units implementing a smaller transform network. A plurality of interconnected processing units receive input data for n input nodes of a second transform network to process at interlinked stages of nodes in the processing units. A mapping maps N input nodes of the first transform network to the n input nodes of the second transform network, where N is greater than n and a plurality of the N input nodes of the first transform network map to one of the n input nodes of the second transform network. A transform manager uses the mapping to map the N input nodes to the n input nodes and loads received input data for the n input nodes into the processing units to perform computations in the processing units. (An illustrative sketch of this loading step follows this entry.)
    Type: Application
    Filed: October 27, 2023
    Publication date: May 1, 2025
    Inventors: Aporva Amarnath, Subhankar Pal, Yongmo Park, Alper Buyuktosunoglu
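A companion sketch for publication 20250139444: a hypothetical transform manager applies such a mapping to stage N input values into the n-wide hardware over N/n passes. The pass-based loading scheme is an illustrative assumption.

```python
# N = 16 logical inputs staged into n = 4 hardware input nodes; the modular
# mapping from the previous sketch is inlined here to stay self-contained.
N, n = 16, 4
mapping = {p: [l for l in range(N) if l % n == p] for p in range(n)}

def load_inputs(data, mapping):
    """Yield one n-wide vector per pass, ready to feed the processing units."""
    passes = len(next(iter(mapping.values())))
    for p in range(passes):
        yield [data[logicals[p]] for logicals in mapping.values()]

for vec in load_inputs(list(range(N)), mapping):
    print("feed processing units:", vec)
```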
  • Publication number: 20250053803
    Abstract: Processing zero weights within a data structure when performing a multiply-and-accumulate operation in a deep learning network is wasted work, as the result is itself zero. Avoiding this step may save time and reduce power consumption in the training and operation of deep learning networks. An approach to zero-tile manipulation is presented herein, along with an approach to permute and pack weighted data structures into zero-tile data structures. The zero tiles may be configured in a structure optimized for the architecture of a parallel processing unit. The zero-tile data structures may comprise vectors that instruct the components in a processing element to operate in a manner that prevents the element from expending energy when processing the zero tiles. An apparatus configured to accept a zero-tile data structure is also presented. (An illustrative sketch of zero-tile skipping follows this entry.)
    Type: Application
    Filed: August 11, 2023
    Publication date: February 13, 2025
    Inventors: Monodeep Kar, Subhankar Pal, Alper Buyuktosunoglu, Sanchari Sen
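A rough sketch of the zero-tile idea in publication 20250053803, with NumPy standing in for the parallel processing unit: tiles of the weight matrix that are entirely zero are detected and their multiply-accumulate work is skipped, modeling the energy-gating behavior the abstract describes. The tiling function and tile size are our choices.

```python
import numpy as np

def zero_tile_matmul(W, x, T=4):
    """y = W @ x, but all-zero TxT tiles of W are never touched, modeling a
    processing element that gates off the work (and energy) for zero tiles."""
    rows, cols = W.shape
    y = np.zeros(rows)
    skipped = 0
    for r in range(0, rows, T):
        for c in range(0, cols, T):
            tile = W[r:r+T, c:c+T]
            if not tile.any():
                skipped += 1          # hardware would clock/power-gate here
                continue
            y[r:r+T] += tile @ x[c:c+T]
    return y, skipped

W = np.zeros((8, 8))
W[:4, :4] = np.arange(16).reshape(4, 4)   # 3 of the 4 tiles are all-zero
y, skipped = zero_tile_matmul(W, np.ones(8))
print(y, f"(skipped {skipped} zero tiles)")
```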
  • Publication number: 20240256850
    Abstract: A trained neural network is partitioned into a client-side portion and a server-side portion, the client-side portion comprising a first set of layers of the trained neural network, the server-side portion comprising a second set of layers of the trained neural network, the trained neural network having been trained using a first set of training data. From a homomorphically encrypted intermediate result input to the server-side portion, a homomorphically encrypted output of the trained neural network is computed, the homomorphically encrypted intermediate result comprising a homomorphically encrypted output computed by the client-side portion. (An illustrative sketch of this split-inference flow follows this entry.)
    Type: Application
    Filed: January 30, 2023
    Publication date: August 1, 2024
    Applicant: International Business Machines Corporation
    Inventors: Omri Soceanu, Nir Drucker, Subhankar Pal, Roman Vaculin, Kanthi Sarpatwar, Alper Buyuktosunoglu, Pradip Bose, Hayim Shaul, Ehud Aharoni, James Thomas Rayfield
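A schematic sketch of the split-inference flow in publication 20240256850. Real deployments would use an actual homomorphic encryption library (e.g., a CKKS implementation); here encrypt and decrypt are placeholders, so this shows only the protocol shape, not real cryptography.

```python
import numpy as np

def encrypt(x):  return ("ciphertext", x)   # placeholder for HE encryption
def decrypt(ct): return ct[1]               # placeholder for HE decryption

def client_side(x, W1):
    z = np.maximum(W1 @ x, 0.0)             # client-side layers, in the clear
    return encrypt(z)                        # homomorphically encrypted result

def server_side(ct, W2):
    # Linear layers map onto HE plaintext-ciphertext ops; we model that by
    # transforming the payload without ever "decrypting" on the server.
    tag, z = ct
    return (tag, W2 @ z)

rng = np.random.default_rng(0)
x = rng.normal(size=8)
W1, W2 = rng.normal(size=(6, 8)), rng.normal(size=(3, 6))
out_ct = server_side(client_side(x, W1), W2)
print(decrypt(out_ct))                       # only the client can decrypt
```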
  • Publication number: 20240013050
    Abstract: An example system includes a processor to prune a machine learning model based on an importance of neurons or weights. The processor is to further permute and pack the remaining neurons or weights of the pruned machine learning model to reduce the amount of ciphertext computation under a selected constraint. (An illustrative sketch of these two steps follows this entry.)
    Type: Application
    Filed: July 5, 2022
    Publication date: January 11, 2024
    Inventors: Subhankar PAL, Alper BUYUKTOSUNOGLU, Ehud AHARONI, Nir DRUCKER, Omri SOCEANU, Hayim SHAUL, Kanthi SARPATWAR, Roman VACULIN, Moran BARUCH, Pradip BOSE
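A rough sketch of the two steps named in publication 20240013050, with our own simple stand-ins: row magnitude as the importance measure for pruning, and a permutation that moves surviving neurons to the front so they pack into fewer ciphertext-slot-sized blocks.

```python
import numpy as np

def prune(W, keep_frac=0.5):
    """Zero out the lowest-importance rows (neurons), using row magnitude
    as a stand-in importance measure."""
    importance = np.abs(W).sum(axis=1)
    W = W.copy()
    W[importance < np.quantile(importance, 1 - keep_frac)] = 0.0
    return W

def permute_and_pack(W, slots=4):
    """Permute surviving (nonzero) rows to the front, then pack rows into
    slot-sized blocks; all-zero blocks need no ciphertext computation."""
    order = np.argsort(~W.any(axis=1), kind="stable")   # nonzero rows first
    packed = W[order]
    blocks = [packed[i:i+slots] for i in range(0, len(packed), slots)]
    live = [b for b in blocks if b.any()]
    return order, blocks, live

rng = np.random.default_rng(1)
order, blocks, live = permute_and_pack(prune(rng.normal(size=(8, 4))))
print(f"{len(live)} of {len(blocks)} packed blocks need ciphertext work")
```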
  • Patent number: 11550588
    Abstract: A branch predictor of a processor includes one or more prediction structures, each including a predicted branch address and predicted branch direction, that identify predicted branches. To reduce power consumption, the branch predictor selects one or more of the prediction structures that are not expected to provide useful branch prediction information and filters the selected structures such that the filtered structures are not used for branch prediction. The branch predictor thereby reduces the amount of power used for branch prediction without substantially reducing the accuracy of the predicted branches. (An illustrative sketch of this filtering follows this entry.)
    Type: Grant
    Filed: August 22, 2018
    Date of Patent: January 10, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: John Kalamatianos, Adithya Yalavarti, Varun Agrawal, Subhankar Pal, Vinesh Srinivasan
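A toy model of the filtering idea in patent 11550588: each prediction structure tracks how often it supplies a useful prediction, and structures below a usefulness threshold are filtered, i.e., no longer consulted (and, in hardware, no longer powered) for lookups. The structure names, usefulness metric, window, and threshold are all illustrative.

```python
import random

class PredictionStructure:
    def __init__(self, name, hit_rate):
        self.name, self.hit_rate = name, hit_rate
        self.lookups = self.useful = 0
        self.filtered = False

    def lookup(self):
        self.lookups += 1
        self.useful += random.random() < self.hit_rate   # useful prediction?

WINDOW, THRESHOLD = 1000, 0.05   # illustrative evaluation window and cutoff
structures = [PredictionStructure("BTB", 0.60),
              PredictionStructure("loop_predictor", 0.01),
              PredictionStructure("indirect_predictor", 0.30)]

random.seed(0)
for _ in range(WINDOW):
    for s in structures:
        if not s.filtered:               # filtered: no lookup, no lookup power
            s.lookup()

for s in structures:
    if s.useful / s.lookups < THRESHOLD:
        s.filtered = True                # stop consulting this structure
    print(f"{s.name}: usefulness={s.useful / s.lookups:.1%}, filtered={s.filtered}")
```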
  • Publication number: 20200065106
    Abstract: A branch predictor of a processor includes one or more prediction structures that identify predicted branches, including predicted branch addresses and predicted branch directions. To reduce power consumption, the branch predictor selects one or more of the prediction structures that are not expected to provide useful branch prediction information and filters the selected structures such that the filtered structures are not used for branch prediction. The branch predictor thereby reduces the amount of power used for branch prediction without substantially reducing the accuracy of the predicted branches.
    Type: Application
    Filed: August 22, 2018
    Publication date: February 27, 2020
    Inventors: John KALAMATIANOS, Adithya YALAVARTI, Varun AGRAWAL, Subhankar PAL, Vinesh SRINIVASAN