Patents by Inventor Subhankar PAL

Subhankar PAL has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20250244980
    Abstract: Provided are a computer program product, system, and method for compiling an application having polynomial operations to produce directed acyclic graphs of commands to execute in a near memory processing device. An application including operations on a polynomial, whose coefficients are decomposed into a number of levels of coefficient elements, is compiled to generate hierarchical directed acyclic graphs (DAGs) whose nodes indicate commands for execution by a hierarchy of hardware components in a near memory processing (NMP) device. The hierarchy of hardware components includes a plurality of enclaves of tiles. Each tile includes memory and a processing element to perform operations on the decomposed coefficients stored in the tile's memory. Each hardware component includes a controller to process the commands in the DAG generated for that component. The DAGs are provided to a hierarchical DAG tracker to generate commands for the NMP device. (An illustrative sketch of such a hierarchical command DAG follows this entry.)
    Type: Application
    Filed: January 26, 2024
    Publication date: July 31, 2025
    Inventors: Yongmo Park, Subhankar Pal, Aporva Amarnath, Alper Buyuktosunoglu, Pradip Bose
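A minimal Python sketch of the idea in publication 20250244980: a toy compiler emits a hierarchical DAG in which a device-level polynomial operation fans out to limb-level enclave commands and element-level tile commands. The names (Command, DAGNode, compile_poly_mult) and the fan-out scheme are illustrative assumptions, not the patented design.

```python
from dataclasses import dataclass, field

@dataclass
class Command:
    op: str            # e.g. "POLY_MULT", "LIMB_MULT", "MULT"
    level: str         # hardware level: "device", "enclave", or "tile"
    operands: tuple = ()

@dataclass
class DAGNode:
    command: Command
    children: list = field(default_factory=list)   # dependent commands

def compile_poly_mult(num_limbs: int, tiles_per_enclave: int) -> DAGNode:
    """Emit a toy hierarchical DAG for an element-wise polynomial multiply:
    a device-level root fans out to one enclave-level command per limb, and
    each enclave fans out tile-level multiplies on coefficient elements."""
    root = DAGNode(Command("POLY_MULT", "device"))
    for limb in range(num_limbs):
        enclave = DAGNode(Command("LIMB_MULT", "enclave", (limb,)))
        for tile in range(tiles_per_enclave):
            enclave.children.append(DAGNode(Command("MULT", "tile", (limb, tile))))
        root.children.append(enclave)
    return root

def track(node: DAGNode, depth: int = 0):
    """Depth-first walk standing in for the hierarchical DAG tracker that
    issues commands to the NMP device in dependency order."""
    print("  " * depth + str(node.command))
    for child in node.children:
        track(child, depth + 1)

track(compile_poly_mult(num_limbs=2, tiles_per_enclave=4))
```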
  • Publication number: 20250245285
    Abstract: Provided are a device, system, and computer program product for a near memory processing device to process coefficient elements resulting from decomposition of polynomials. A near memory processing device includes a plurality of enclaves and a plurality of interconnected tiles on each enclave. Coefficients of a polynomial are decomposed into a number of levels of coefficient elements, where each level of coefficient elements comprises a limb. A device controller receives hierarchical commands, from an application, that map operations on limbs of coefficient elements to the enclaves and that map operations for the enclaves to the tiles in the enclaves. The device controller distributes the operations in the hierarchical commands to the enclaves, and each enclave in turn distributes the operations on the coefficient elements to its tiles. (An illustrative sketch of this distribution path follows this entry.)
    Type: Application
    Filed: January 26, 2024
    Publication date: July 31, 2025
    Inventors: Yongmo Park, Subhankar Pal, Aporva Amarnath, Alper Buyuktosunoglu, Pradip Bose
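A companion sketch for publication 20250245285, under the same caveat: a stand-in device controller forwards limb-level commands to enclaves, and each enclave spreads the element-level work across its tiles. The class names, command format, and round-robin element assignment are our assumptions.

```python
class Tile:
    def __init__(self, ident):
        self.ident = ident
        self.memory = {}   # coefficient elements resident in tile memory

    def execute(self, op, limb, elem):
        print(f"tile {self.ident}: {op} on limb {limb}, element {elem}")

class Enclave:
    def __init__(self, ident, num_tiles):
        self.tiles = [Tile((ident, t)) for t in range(num_tiles)]

    def dispatch(self, op, limb, elements):
        # Spread the limb's coefficient elements across this enclave's tiles.
        for i, elem in enumerate(elements):
            self.tiles[i % len(self.tiles)].execute(op, limb, elem)

class DeviceController:
    def __init__(self, num_enclaves, tiles_per_enclave):
        self.enclaves = [Enclave(e, tiles_per_enclave) for e in range(num_enclaves)]

    def run(self, hierarchical_commands):
        # Each hierarchical command maps one limb's operation onto one enclave.
        for cmd in hierarchical_commands:
            self.enclaves[cmd["enclave"]].dispatch(cmd["op"], cmd["limb"], cmd["elements"])

DeviceController(num_enclaves=2, tiles_per_enclave=2).run([
    {"enclave": 0, "op": "ADD", "limb": 0, "elements": [0, 1, 2, 3]},
    {"enclave": 1, "op": "ADD", "limb": 1, "elements": [0, 1, 2, 3]},
])
```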
  • Publication number: 20250139443
    Abstract: Provided are a computer program product, system, and method for mapping input nodes in a transform network to input nodes of a smaller transform network. A first transform network has N input nodes and successive columns of interlinked nodes at which input data is processed. A mapping is generated of the N input nodes to n input nodes of a second transform network implemented in processing tiles in a hardware unit, such that n is less than N and multiple ones of the N input nodes of the first transform network map to one of the n input nodes of the second transform network. The mapping is used to map the N input nodes of the first transform network to the n input nodes of the second transform network implemented in hardware. (An illustrative sketch of such a mapping follows this entry.)
    Type: Application
    Filed: October 27, 2023
    Publication date: May 1, 2025
    Inventors: Aporva Amarnath, Subhankar Pal, Yongmo Park, Alper Buyuktosunoglu
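A minimal sketch of the mapping step in publication 20250139443, assuming a simple modular folding rule (the abstract does not commit to one): several of the large network's N input nodes share each of the small network's n input nodes.

```python
def make_input_mapping(N: int, n: int) -> dict:
    """Return {physical_node: [logical_nodes]}: with N > n, several of the
    large network's inputs share one input node of the smaller network."""
    assert N > n and N % n == 0, "illustrative constraint: N a multiple of n"
    mapping = {p: [] for p in range(n)}
    for logical in range(N):
        mapping[logical % n].append(logical)
    return mapping

# Example: a 16-input transform network folded onto 4 hardware input nodes.
for phys, logicals in make_input_mapping(16, 4).items():
    print(f"physical input {phys} <- logical inputs {logicals}")
```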
  • Publication number: 20250139444
    Abstract: Provided are a computer program product, system, and method for mapping data for nodes in a first transform network to input nodes of near-memory processing units implementing a smaller transform network. A plurality of interconnected processing units receive input data for n input nodes of a second transform network to process at interlinked stages of nodes in the processing units. A mapping maps N input nodes of the first transform network to the n input nodes of the second transform network, where N is greater than n and a plurality of the N input nodes of the first transform network map to one of the n input nodes of the second transform network. A transform manager uses the mapping to map the N input nodes to the n input nodes and loads received input data for the n input nodes into the processing units to perform computations in the processing units. (An illustrative sketch of this loading step follows this entry.)
    Type: Application
    Filed: October 27, 2023
    Publication date: May 1, 2025
    Inventors: Aporva Amarnath, Subhankar Pal, Yongmo Park, Alper Buyuktosunoglu
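A companion sketch for publication 20250139444: a hypothetical transform manager applies such a mapping to stage N input values into the n-wide hardware over N/n passes. The pass-based loading scheme is an illustrative assumption.

```python
# N = 16 logical inputs staged into n = 4 hardware input nodes; the modular
# mapping from the previous sketch is inlined here to stay self-contained.
N, n = 16, 4
mapping = {p: [l for l in range(N) if l % n == p] for p in range(n)}

def load_inputs(data, mapping):
    """Yield one n-wide vector per pass, ready to feed the processing units."""
    passes = len(next(iter(mapping.values())))
    for p in range(passes):
        yield [data[logicals[p]] for logicals in mapping.values()]

for vec in load_inputs(list(range(N)), mapping):
    print("feed processing units:", vec)
```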
  • Publication number: 20250053803
    Abstract: Processing zero weights within a data structure when performing a multiply-and-accumulate operation in a deep learning network is wasted work, as the result is itself zero. Avoiding this step may save time and reduce power consumption in the training and operation of deep learning networks. An approach to zero-tile manipulation is presented herein, along with an approach to permute and pack weighted data structures into zero-tile data structures. The zero tiles may be configured in a structure optimized for the architecture of a parallel processing unit. The zero-tile data structures may comprise vectors that instruct the components in a processing element to operate in a manner that prevents the element from expending energy when processing the zero tiles. An apparatus configured to accept a zero-tile data structure is also presented. (An illustrative sketch of zero-tile skipping follows this entry.)
    Type: Application
    Filed: August 11, 2023
    Publication date: February 13, 2025
    Inventors: Monodeep Kar, Subhankar Pal, Alper Buyuktosunoglu, Sanchari Sen
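A rough sketch of the zero-tile idea in publication 20250053803, with NumPy standing in for the parallel processing unit: tiles of the weight matrix that are entirely zero are detected and their multiply-accumulate work is skipped, modeling the energy-gating behavior the abstract describes. The tiling function and tile size are our choices.

```python
import numpy as np

def zero_tile_matmul(W, x, T=4):
    """y = W @ x, but all-zero TxT tiles of W are never touched, modeling a
    processing element that gates off the work (and energy) for zero tiles."""
    rows, cols = W.shape
    y = np.zeros(rows)
    skipped = 0
    for r in range(0, rows, T):
        for c in range(0, cols, T):
            tile = W[r:r+T, c:c+T]
            if not tile.any():
                skipped += 1          # hardware would clock/power-gate here
                continue
            y[r:r+T] += tile @ x[c:c+T]
    return y, skipped

W = np.zeros((8, 8))
W[:4, :4] = np.arange(16).reshape(4, 4)   # 3 of the 4 tiles are all-zero
y, skipped = zero_tile_matmul(W, np.ones(8))
print(y, f"(skipped {skipped} zero tiles)")
```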
  • Publication number: 20240256850
    Abstract: A trained neural network is partitioned into a client-side portion and a server-side portion, the client-side portion comprising a first set of layers of the trained neural network, the server-side portion comprising a second set of layers of the trained neural network, the trained neural network having been trained using a first set of training data. From a homomorphically encrypted intermediate result input to the server-side portion, a homomorphically encrypted output of the trained neural network is computed, the homomorphically encrypted intermediate result comprising a homomorphically encrypted output computed by the client-side portion. (An illustrative sketch of this split-inference flow follows this entry.)
    Type: Application
    Filed: January 30, 2023
    Publication date: August 1, 2024
    Applicant: International Business Machines Corporation
    Inventors: Omri Soceanu, Nir Drucker, Subhankar Pal, Roman Vaculin, Kanthi Sarpatwar, Alper Buyuktosunoglu, Pradip Bose, Hayim Shaul, Ehud Aharoni, James Thomas Rayfield
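A schematic sketch of the split-inference flow in publication 20240256850. Real deployments would use an actual homomorphic encryption library (e.g., a CKKS implementation); here encrypt and decrypt are placeholders, so this shows only the protocol shape, not real cryptography.

```python
import numpy as np

def encrypt(x):  return ("ciphertext", x)   # placeholder for HE encryption
def decrypt(ct): return ct[1]               # placeholder for HE decryption

def client_side(x, W1):
    z = np.maximum(W1 @ x, 0.0)             # client-side layers, in the clear
    return encrypt(z)                        # homomorphically encrypted result

def server_side(ct, W2):
    # Linear layers map onto HE plaintext-ciphertext ops; we model that by
    # transforming the payload without ever "decrypting" on the server.
    tag, z = ct
    return (tag, W2 @ z)

rng = np.random.default_rng(0)
x = rng.normal(size=8)
W1, W2 = rng.normal(size=(6, 8)), rng.normal(size=(3, 6))
out_ct = server_side(client_side(x, W1), W2)
print(decrypt(out_ct))                       # only the client can decrypt
```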
  • Publication number: 20240013050
    Abstract: An example system includes a processor to prune a machine learning model based on an importance of neurons or weights. The processor is to further permute and pack the remaining neurons or weights of the pruned machine learning model to reduce the amount of ciphertext computation under a selected constraint. (An illustrative sketch of these two steps follows this entry.)
    Type: Application
    Filed: July 5, 2022
    Publication date: January 11, 2024
    Inventors: Subhankar PAL, Alper BUYUKTOSUNOGLU, Ehud AHARONI, Nir DRUCKER, Omri SOCEANU, Hayim SHAUL, Kanthi SARPATWAR, Roman VACULIN, Moran BARUCH, Pradip BOSE
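A rough sketch of the two steps named in publication 20240013050, with our own simple stand-ins: row magnitude as the importance measure for pruning, and a permutation that moves surviving neurons to the front so they pack into fewer ciphertext-slot-sized blocks.

```python
import numpy as np

def prune(W, keep_frac=0.5):
    """Zero out the lowest-importance rows (neurons), using row magnitude
    as a stand-in importance measure."""
    importance = np.abs(W).sum(axis=1)
    W = W.copy()
    W[importance < np.quantile(importance, 1 - keep_frac)] = 0.0
    return W

def permute_and_pack(W, slots=4):
    """Permute surviving (nonzero) rows to the front, then pack rows into
    slot-sized blocks; all-zero blocks need no ciphertext computation."""
    order = np.argsort(~W.any(axis=1), kind="stable")   # nonzero rows first
    packed = W[order]
    blocks = [packed[i:i+slots] for i in range(0, len(packed), slots)]
    live = [b for b in blocks if b.any()]
    return order, blocks, live

rng = np.random.default_rng(1)
order, blocks, live = permute_and_pack(prune(rng.normal(size=(8, 4))))
print(f"{len(live)} of {len(blocks)} packed blocks need ciphertext work")
```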
  • Patent number: 11550588
    Abstract: A branch predictor of a processor includes one or more prediction structures, each including a predicted branch address and predicted branch direction, that identify predicted branches. To reduce power consumption, the branch predictor selects one or more of the prediction structures that are not expected to provide useful branch prediction information and filters the selected structures such that the filtered structures are not used for branch prediction. The branch predictor thereby reduces the amount of power used for branch prediction without substantially reducing the accuracy of the predicted branches. (An illustrative sketch of this filtering follows this entry.)
    Type: Grant
    Filed: August 22, 2018
    Date of Patent: January 10, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: John Kalamatianos, Adithya Yalavarti, Varun Agrawal, Subhankar Pal, Vinesh Srinivasan
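A toy model of the filtering idea in patent 11550588: each prediction structure tracks how often it supplies a useful prediction, and structures below a usefulness threshold are filtered, i.e., no longer consulted (and, in hardware, no longer powered) for lookups. The structure names, usefulness metric, window, and threshold are all illustrative.

```python
import random

class PredictionStructure:
    def __init__(self, name, hit_rate):
        self.name, self.hit_rate = name, hit_rate
        self.lookups = self.useful = 0
        self.filtered = False

    def lookup(self):
        self.lookups += 1
        self.useful += random.random() < self.hit_rate   # useful prediction?

WINDOW, THRESHOLD = 1000, 0.05   # illustrative evaluation window and cutoff
structures = [PredictionStructure("BTB", 0.60),
              PredictionStructure("loop_predictor", 0.01),
              PredictionStructure("indirect_predictor", 0.30)]

random.seed(0)
for _ in range(WINDOW):
    for s in structures:
        if not s.filtered:               # filtered: no lookup, no lookup power
            s.lookup()

for s in structures:
    if s.useful / s.lookups < THRESHOLD:
        s.filtered = True                # stop consulting this structure
    print(f"{s.name}: usefulness={s.useful / s.lookups:.1%}, filtered={s.filtered}")
```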
  • Publication number: 20200065106
    Abstract: A branch predictor of a processor includes one or more prediction structures that identify predicted branches, including predicted branch addresses and predicted branch directions. To reduce power consumption, the branch predictor selects one or more of the prediction structures that are not expected to provide useful branch prediction information and filters the selected structures such that the filtered structures are not used for branch prediction. The branch predictor thereby reduces the amount of power used for branch prediction without substantially reducing the accuracy of the predicted branches.
    Type: Application
    Filed: August 22, 2018
    Publication date: February 27, 2020
    Inventors: John KALAMATIANOS, Adithya YALAVARTI, Varun AGRAWAL, Subhankar PAL, Vinesh SRINIVASAN