Patents by Inventor Sreenivas Subramoney

Sreenivas Subramoney has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20250117329
    Abstract: Methods and apparatus relating to an instruction and/or micro-architecture support for decompression on core are described. In an embodiment, decode circuitry decodes a decompression instruction into a first micro operation and a second micro operation. The first micro operation causes one or more load operations to fetch data into one or more cachelines of a cache of a processor core. Decompression Engine (DE) circuitry decompresses the fetched data from the one or more cachelines of the cache of the processor core in response to the second micro operation. Other embodiments are also disclosed and claimed.
    Type: Application
    Filed: November 14, 2024
    Publication date: April 10, 2025
    Applicant: Intel Corporation
    Inventors: Jayesh Gaur, Adarsh Chauhan, Vinodh Gopal, Vedvyas Shanbhogue, Sreenivas Subramoney, Wajdi Feghali
  • Patent number: 12248696
    Abstract: Example compute-in-memory (CIM) or processor-in-memory (PIM) techniques use repurposed or dedicated static random access memory (SRAM) rows of an SRAM sub-array to store look-up-table (LUT) entries for use in multiply and accumulate (MAC) operations.
    Type: Grant
    Filed: June 7, 2021
    Date of Patent: March 11, 2025
    Assignee: Intel Corporation
    Inventors: Saurabh Jain, Srivatsa Rangachar Srinivasa, Akshay Krishna Ramanathan, Gurpreet Singh Kalsi, Kamlesh R. Pillai, Sreenivas Subramoney
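The LUT-based MAC idea above can be modeled in a few lines of Python. This is only an illustrative sketch under assumed parameters (a 4-bit input operand, one LUT per weight); the patent describes SRAM circuitry, and every name here is invented:

```python
# Hypothetical software model of the LUT-in-SRAM idea: dedicated SRAM rows
# hold precomputed multiples of a weight, so a "multiply" becomes a row
# lookup indexed by the input operand, and MAC reduces to lookup + add.

def build_weight_lut(weight, input_bits=4):
    # One LUT entry per possible input value (2**input_bits rows).
    return [weight * x for x in range(1 << input_bits)]

def lut_mac(inputs, weights, input_bits=4):
    acc = 0
    for x, w in zip(inputs, weights):
        lut = build_weight_lut(w, input_bits)  # stored once per weight in SRAM
        acc += lut[x]                          # multiply replaced by a row read
    return acc
```

Storing one precomputed-product row per possible input value turns each multiply into a single SRAM row read, which is the essence of the CIM/PIM trade-off the abstract describes.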
  • Patent number: 12242721
    Abstract: Methods, apparatus, and articles of manufacture to profile page tables for memory management are disclosed. An example apparatus includes a processor to execute computer-readable instructions to: profile a first page at a first level of a page table as not part of a target group; and, in response to profiling the first page as not part of the target group, label a data page at a second level that corresponds to the first page as not part of the target group, the second level being lower than the first level.
    Type: Grant
    Filed: March 26, 2021
    Date of Patent: March 4, 2025
    Assignee: Intel Corporation
    Inventors: Aravinda Prasad, Sandeep Kumar, Sreenivas Subramoney, Andy Rudoff
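The downward label propagation in this abstract can be sketched as follows. This is an invented two-level model (the names `upper_pages` and `is_target` are illustrative, not from the patent): profiling an upper-level page as outside the target group lets us label every data page it maps without profiling each one.

```python
# Illustrative model: a two-level page table where a negative profile at the
# upper level is propagated to all lower-level data pages it covers.

def profile_page_table(upper_pages, is_target):
    # upper_pages: dict of upper-level page -> list of data pages it maps.
    labels = {}
    for upper, data_pages in upper_pages.items():
        if not is_target(upper):
            # Propagate the negative label downward without touching each page.
            for dp in data_pages:
                labels[dp] = False
        else:
            # Only pages under a candidate upper-level page are profiled.
            for dp in data_pages:
                labels[dp] = is_target(dp)
    return labels
```

The saving is that data pages under a non-target upper-level page are never individually profiled.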
  • Patent number: 12216581
    Abstract: System and method for prefetching pointer-referenced data. A method embodiment includes: tracking a plurality of load instructions which includes a first load instruction to access a first data that identifies a first memory location; detecting a second load instruction which accesses a second memory location for a second data, the second memory location matching the first memory location identified by the first data; responsive to the detecting, updating a list of pointer load instructions to include information identifying the first load instruction as a pointer load instruction; prefetching a third data for a third load instruction prior to executing the third load instruction; identifying the third load instruction as a pointer load instruction based on information from the list of pointer load instructions and responsively prefetching a fourth data from a fourth memory location, wherein the fourth memory location is identified by the third data.
    Type: Grant
    Filed: May 19, 2023
    Date of Patent: February 4, 2025
    Assignee: Intel Corporation
    Inventors: Sreenivas Subramoney, Stanislav Shwartsman, Anant Nori, Shankar Balachandran, Elad Shtiegmann, Vineeth Mekkat, Manjunath Shevgoor, Sourabh Alurkar
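The detection-and-chaining flow in this abstract can be illustrated with a small software model. All class and field names here are invented; real hardware would track this per load PC in prefetcher tables:

```python
# Simplified model of pointer-load prefetching: if load B's address equals a
# value that load A previously fetched, A is a pointer load, and a later
# prefetch of A's data can chain-prefetch the address that data points to.

class PointerPrefetcher:
    def __init__(self):
        self.last_values = {}       # load PC -> last data value it loaded
        self.pointer_loads = set()  # PCs identified as pointer loads

    def on_load(self, pc, addr, memory):
        # Detection: does this load's address match data some load produced?
        for prod_pc, val in self.last_values.items():
            if val == addr:
                self.pointer_loads.add(prod_pc)
        data = memory[addr]
        self.last_values[pc] = data
        return data

    def prefetch(self, pc, addr, memory):
        data = memory[addr]          # prefetch the load's own data
        if pc in self.pointer_loads:
            return memory.get(data)  # chain: also prefetch the pointed-to data
        return None
```

For a linked-list traversal, this is what lets the fourth data (the next node) be fetched as soon as the third data (the pointer) arrives.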
  • Patent number: 12190114
    Abstract: In one embodiment, a processor includes a branch predictor to predict whether a branch instruction is to be taken and a branch target buffer (BTB) coupled to the branch predictor. The branch target buffer may be segmented into a first cache portion and a second cache portion, where, in response to an indication that the branch is to be taken, the BTB is to access an entry in one of the first cache portion and the second cache portion based at least in part on a type of the branch instruction, an occurrence frequency of the branch instruction, and spatial information regarding a distance between a target address of a target of the branch instruction and an address of the branch instruction. Other embodiments are described and claimed.
    Type: Grant
    Filed: December 22, 2020
    Date of Patent: January 7, 2025
    Assignee: Intel Corporation
    Inventors: Niranjan Kumar Soundararajan, Sreenivas Subramoney, S R Swamy Saranam Chongala
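A toy version of the portion-selection policy described above might look like this. The thresholds and the rule itself are invented for illustration; the patent only says the choice depends on branch type, occurrence frequency, and branch-to-target distance:

```python
# Hypothetical segmented-BTB placement policy: short-distance, frequently
# taken conditional branches go to the small first cache portion; everything
# else goes to the larger second portion.

def select_btb_portion(branch_type, taken_count, branch_addr, target_addr,
                       near_distance=4096, hot_threshold=16):
    distance = abs(target_addr - branch_addr)
    if (branch_type == "conditional" and taken_count >= hot_threshold
            and distance < near_distance):
        return "first"   # small, low-latency cache portion
    return "second"      # larger backing portion
```

Keeping hot, near-target branches in a small portion shortens the common-case BTB lookup while the second portion preserves capacity.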
  • Patent number: 12189559
    Abstract: Exemplary embodiments maintain the spatial locality of data processed by a sparse CNN by reordering the data. The reordering may be performed on individual data elements and on groups of co-located data elements, referred to herein as “chunks”. Thus, the data may be reordered into chunks, where each chunk contains data for spatially co-located data elements, and chunks may in turn be organized so that spatially co-located chunks are stored together. The use of chunks helps to reduce the need to re-fetch data during processing. Chunk sizes may be chosen based on the memory constraints of the processing logic (e.g., cache sizes).
    Type: Grant
    Filed: June 26, 2020
    Date of Patent: January 7, 2025
    Assignee: Intel Corporation
    Inventors: Anirud Thyagharajan, Prashant Laddha, Om Omer, Sreenivas Subramoney
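The two-level reordering (elements within chunks, then chunks among themselves) can be sketched for sparse 2D coordinates. The tiling scheme is an assumed simplification of the patent's chunking:

```python
# Minimal sketch of chunk-based reordering: group sparse 2D elements into
# fixed-size square tiles ("chunks"), then order by chunk first and by
# position within the chunk second, so co-located data is stored together.
# Chunk size would be chosen to fit the processing logic's cache.

def reorder_into_chunks(points, chunk=4):
    def tile(p):
        return (p[1] // chunk, p[0] // chunk)   # which chunk the point is in
    # Sort by chunk first, then by position within the chunk.
    return sorted(points, key=lambda p: (tile(p), p[1], p[0]))
```

After this reordering, a streaming pass touches each chunk's data exactly once instead of revisiting scattered cache lines.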
  • Publication number: 20250004772
    Abstract: Apparatus and method for a decompression hardware copy engine with efficient sequence overlapping copy. For example, one embodiment of an apparatus comprises: a plurality of processing cores, one or more of the plurality of processing cores to execute program code to produce a plurality of literals and sequences from a compressed data stream; and decompression acceleration circuitry to generate a decompressed data stream based on the plurality of literals and sequences, the decompression acceleration circuitry comprising: a sequence pre-processor circuit to process batches of sequences of the plurality of sequences and generate a plurality of copy instructions, the sequence pre-processor circuit to merge multiple copy operations corresponding to multiple sequences into a merged copy instruction; and a copy engine circuit to execute the copy instructions to produce the decompressed data stream.
    Type: Application
    Filed: June 30, 2023
    Publication date: January 2, 2025
    Inventors: Kamlesh Pillai, Vinodh Gopal, Gurpreet Singh Kalsi, Sreenivas Subramoney, Wajdi K. Feghali
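The sequence pre-processor's merge step can be illustrated with a small pass over copy operations. The `(src, dst, length)` copy format is invented for illustration; the point is that copies contiguous in both source and destination collapse into one larger copy instruction:

```python
# Illustrative merge pass: each decompression sequence yields a
# (src, dst, length) copy; a copy whose source and destination both continue
# the previous copy is folded into it, producing one merged copy instruction
# for the copy engine instead of several small ones.

def merge_copies(copies):
    merged = []
    for src, dst, length in copies:
        if merged:
            psrc, pdst, plen = merged[-1]
            if psrc + plen == src and pdst + plen == dst:
                merged[-1] = (psrc, pdst, plen + length)
                continue
        merged.append((src, dst, length))
    return merged
```

Fewer, larger copies keep the copy engine's datapath busy instead of paying per-sequence overhead.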
  • Publication number: 20250004766
    Abstract: Techniques for software-defined super core usage are described. In some examples, a first and a second processor core operate as a single virtual core, as configured by the operating system, to execute a first set of instruction segments of a single-threaded program and a second set of instruction segments of the program concurrently using a shared memory space, wherein the instruction segments include one or more of a store instruction to store live register data to be shared with another core and a load instruction to load live register data shared by another core.
    Type: Application
    Filed: June 28, 2024
    Publication date: January 2, 2025
    Inventors: Jayesh Gaur, Sumeet Bandishte, Anant Nori, Michael Chynoweth, Sreenivas Subramoney, Adi Yoaz, Anshuman Dhuliya
  • Patent number: 12182018
    Abstract: Methods and apparatus relating to an instruction and/or micro-architecture support for decompression on core are described. In an embodiment, decode circuitry decodes a decompression instruction into a first micro operation and a second micro operation. The first micro operation causes one or more load operations to fetch data into one or more cachelines of a cache of a processor core. Decompression Engine (DE) circuitry decompresses the fetched data from the one or more cachelines of the cache of the processor core in response to the second micro operation. Other embodiments are also disclosed and claimed.
    Type: Grant
    Filed: December 23, 2020
    Date of Patent: December 31, 2024
    Assignee: Intel Corporation
    Inventors: Jayesh Gaur, Adarsh Chauhan, Vinodh Gopal, Vedvyas Shanbhogue, Sreenivas Subramoney, Wajdi Feghali
  • Patent number: 12153925
    Abstract: An embodiment of an integrated circuit may comprise a core and a front end unit coupled to the core to decode one or more instructions, wherein the front end unit includes a first decode path, a second decode path, and circuitry to: predict a taken branch of a conditional branch instruction of the one or more instructions; decode a predicted path of the taken branch on the first decode path; determine if the conditional branch instruction corresponds to a hard-to-predict conditional branch instruction and if the second decode path is available; and, if so determined, decode an alternate path of a not-taken branch of the hard-to-predict conditional branch instruction on the second decode path. Other embodiments are disclosed and claimed.
    Type: Grant
    Filed: December 22, 2020
    Date of Patent: November 26, 2024
    Assignee: Intel Corporation
    Inventors: Niranjan Soundararajan, Sreenivas Subramoney
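The dual-path decode decision reduces to a small policy, modeled here in Python. The dictionary result stands in for decoded micro-ops on the two hardware decode paths; all names are invented:

```python
# Toy model of dual decode paths: the predicted (taken) path always decodes
# on path 1; for a hard-to-predict (H2P) branch, the not-taken alternate path
# is also decoded on path 2 when it is idle, so a misprediction finds its
# decoded instructions already available.

def decode_branch(taken_path, not_taken_path, is_h2p, path2_free):
    decoded = {"path1": list(taken_path)}        # predicted path
    if is_h2p and path2_free:
        decoded["path2"] = list(not_taken_path)  # alternate path, decoded early
    return decoded
```

The gating on "hard-to-predict" keeps the second path free for ordinary fetch bandwidth when the predictor is trustworthy.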
  • Patent number: 12140696
    Abstract: According to various embodiments, a radar device is described comprising a processor configured to generate a scene comprising an object based on a plurality of received wireless signals, generate a ground truth object parameter of the object, and generate a dataset representative of the scene; and a radar detector configured to determine an object parameter of the object using a machine learning algorithm and the dataset, determine an error value of the machine learning algorithm using a cost function, the object parameter, and the ground truth object parameter, and adjust the machine learning algorithm values to reduce the error value.
    Type: Grant
    Filed: July 14, 2021
    Date of Patent: November 12, 2024
    Assignee: Intel Corporation
    Inventors: Chulong Chen, Wenling Margaret Huang, Saiveena Kesaraju, Ivan Simões Gaspar, Pradyumna S. Singh, Biji George, Dipan Kumar Mandal, Om Ji Omer, Sreenivas Subramoney, Yuval Amizur, Leor Banin, Hao Chen, Nir Dvorecki, Shengbo Xu
  • Patent number: 12117908
    Abstract: Systems, apparatuses and methods may provide for technology that associates a unique identifier with an application, creates an entry in a metadata table, wherein the metadata table is at a fixed location in persistent system memory, populates the entry with the unique identifier, a user identifier, and a pointer to a root of a page table tree, and recovers in-use data pages after a system crash. In one example, the in-use data pages are recovered from the persistent system memory based on the metadata table and include one or more of application heap information or application stack information.
    Type: Grant
    Filed: December 4, 2020
    Date of Patent: October 15, 2024
    Assignee: Intel Corporation
    Inventors: Aravinda Prasad, Sreenivas Subramoney
  • Patent number: 12112171
    Abstract: Techniques for processing loops are described. An exemplary apparatus at least includes decoder circuitry to decode a single instruction, the single instruction to include a field for an opcode, the opcode to indicate execution circuitry is to perform an operation to configure execution of one or more loops, wherein the one or more loops are to include a plurality of configuration instructions and instructions that are to use metadata generated by ones of the plurality of configuration instructions; and execution circuitry to perform the operation as indicated by the opcode.
    Type: Grant
    Filed: December 26, 2020
    Date of Patent: October 8, 2024
    Assignee: Intel Corporation
    Inventors: Anant Nori, Shankar Balachandran, Sreenivas Subramoney, Joydeep Rakshit, Vedvyas Shanbhogue, Avishaii Abuhatzera, Belliappa Kuttanna
  • Patent number: 12066945
    Abstract: An embodiment of an integrated circuit may comprise a core, a first level core cache memory coupled to the core, a shared core cache memory coupled to the core, a first cache controller coupled to the core and communicatively coupled to the first level core cache memory, a second cache controller coupled to the core and communicatively coupled to the shared core cache memory, and circuitry coupled to the core and communicatively coupled to the first cache controller and the second cache controller to determine if a workload has a large code footprint, and, if so determined, partition N ways of the shared core cache memory into first and second chunks of ways with the first chunk of M ways reserved for code cache lines from the workload and the second chunk of N minus M ways reserved for data cache lines from the workload, where N and M are positive integer values and N minus M is greater than zero. Other embodiments are disclosed and claimed.
    Type: Grant
    Filed: December 22, 2020
    Date of Patent: August 20, 2024
    Assignee: Intel Corporation
    Inventors: Prathmesh Kallurkar, Anant Vithal Nori, Sreenivas Subramoney
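The partitioning decision in this abstract is simple enough to state directly in code. The footprint threshold and the choice of M are invented; the patent only requires 0 < M < N so that both chunks are non-empty:

```python
# Sketch of the way-partitioning decision: when the workload's code footprint
# is large, reserve M of the shared cache's N ways for code cache lines and
# the remaining N - M ways for data cache lines. Threshold and M are
# illustrative values, not from the patent.

def partition_ways(n_ways, code_footprint_kb, large_threshold_kb=512, m=3):
    assert 0 < m < n_ways                     # both chunks must be non-empty
    if code_footprint_kb >= large_threshold_kb:
        code_ways = list(range(m))            # first chunk: ways 0..M-1
        data_ways = list(range(m, n_ways))    # second chunk: ways M..N-1
        return code_ways, data_ways
    return None  # no partitioning for small code footprints
```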
  • Patent number: 12028094
    Abstract: Methods and apparatus relating to an Application Programming Interface (API) for fine-grained low-latency decompression within a processor core are described. In an embodiment, a decompression Application Programming Interface (API) receives an input handle to a data object. The data object includes compressed data and metadata. Decompression Engine (DE) circuitry decompresses the compressed data to generate uncompressed data. The DE circuitry decompresses the compressed data in response to invocation of a decompression instruction by the decompression API. The metadata comprises a first operand to indicate a location of the compressed data, a second operand to indicate a size of the compressed data, a third operand to indicate a location to which the data decompressed by the DE circuitry is to be stored, and a fourth operand to indicate a size of the decompressed data. Other embodiments are also disclosed and claimed.
    Type: Grant
    Filed: December 23, 2020
    Date of Patent: July 2, 2024
    Inventors: Jayesh Gaur, Adarsh Chauhan, Vinodh Gopal, Vedvyas Shanbhogue, Sreenivas Subramoney, Wajdi Feghali
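The four-operand metadata can be pictured as a descriptor handed to the engine. This sketch uses zlib as a software stand-in for the hardware Decompression Engine; the `Descriptor` layout and function names are invented for illustration:

```python
# Hypothetical software analogue of the four metadata operands described
# above, with zlib standing in for the DE circuitry.

import zlib
from dataclasses import dataclass

@dataclass
class Descriptor:
    src: bytearray     # operand 1: location of the compressed data
    src_size: int      # operand 2: size of the compressed data
    dst: bytearray     # operand 3: location for the decompressed output
    dst_size: int      # operand 4: expected size of the decompressed data

def de_decompress(desc):
    out = zlib.decompress(bytes(desc.src[:desc.src_size]))
    assert len(out) == desc.dst_size   # metadata must match the actual output
    desc.dst[:len(out)] = out
    return len(out)
```

Packaging location and size for both input and output is what lets the API invoke a single decompression instruction without further negotiation with the caller.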
  • Publication number: 20240211408
    Abstract: Apparatus and method for probabilistic cacheline replacement for accelerating address translation. For example, one embodiment of a processor comprises: a plurality of cores, each core to process instructions; a cache to be shared by a subset of the plurality of cores, the cache comprising an N-way set associative cache for storing page table entry (PTE) cachelines and non-PTE cachelines; and a cache manager to implement a PTE-aware eviction policy for evicting cachelines from the cache, the PTE-aware eviction policy to cause a reduction of evictions of PTE cachelines during non-PTE cacheline fills.
    Type: Application
    Filed: December 23, 2022
    Publication date: June 27, 2024
    Inventors: Joydeep Rakshit, Anant Vithal Nori, Sreenivas Subramoney, Hanna Alam, Joseph Nuzman
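One way to read "probabilistic cacheline replacement" here is a victim-selection policy that shields PTE lines with some probability. The shielding probability and set layout below are invented; only the bias against evicting PTE lines during non-PTE fills comes from the abstract:

```python
# Hypothetical PTE-aware eviction: on a fill for a non-PTE line, prefer a
# non-PTE victim with probability shield_prob, protecting page-table-entry
# cachelines; otherwise (and for PTE fills) fall back to plain LRU.

import random

def choose_victim(cache_set, fill_is_pte, shield_prob=0.9, rng=random):
    # cache_set: list of lines like {"tag": ..., "is_pte": bool}, LRU first.
    if not fill_is_pte and rng.random() < shield_prob:
        non_pte = [line for line in cache_set if not line["is_pte"]]
        if non_pte:
            return non_pte[0]        # LRU among the non-PTE lines
    return cache_set[0]              # plain LRU fallback
```

Making the shield probabilistic rather than absolute prevents PTE lines from monopolizing the set when the working set is mostly page-table data.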
  • Patent number: 12020033
    Abstract: Apparatus and method for memoizing repeat function calls are described herein. An apparatus embodiment includes: uop buffer circuitry to identify a function for memoization based on retiring micro-operations (uops) from a processing pipeline; memoization retirement circuitry to generate a signature of the function which includes input and output data of the function; a memoization data structure to store the signature; and predictor circuitry to detect an instance of the function to be executed by the processing pipeline and to responsively exclude a first subset of uops associated with the instance from execution when a confidence level associated with the function is above a threshold. One or more instructions that are data-dependent on execution of the instance are then provided with the output data of the function from the memoization data structure.
    Type: Grant
    Filed: December 24, 2020
    Date of Patent: June 25, 2024
    Assignee: Intel Corporation
    Inventors: Niranjan Kumar Soundararajan, Sreenivas Subramoney, Jayesh Gaur, S R Swamy Saranam Chongala
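A software analogue of this confidence-gated memoization is easy to sketch. The table structure, threshold, and update rule below are invented; the hardware works on uops and signatures rather than Python callables:

```python
# Simplified model: a signature maps (function, inputs) to an output plus a
# confidence counter; once confidence reaches the threshold, a detected
# instance skips execution and supplies the stored output to dependents.

class MemoTable:
    def __init__(self, threshold=2):
        self.table = {}       # (func name, inputs) -> [output, confidence]
        self.threshold = threshold

    def call(self, func, *args):
        key = (func.__name__, args)
        entry = self.table.get(key)
        if entry and entry[1] >= self.threshold:
            return entry[0]               # bypass execution entirely
        result = func(*args)              # execute and learn the signature
        if entry and entry[0] == result:
            entry[1] += 1                 # same output again: build confidence
        else:
            self.table[key] = [result, 1]
        return result
```

The confidence counter is what keeps a function with varying outputs from being wrongly short-circuited.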
  • Publication number: 20240202000
    Abstract: Techniques and mechanisms for efficiently saving and recovering state of a processor core. In an embodiment, a processor core fetches and decodes a first instruction to generate a first decoded instruction, wherein the first instruction comprises a first opcode which corresponds to one or more components of the processor core. Execution of the first instruction comprises saving microarchitectural state of the one or more components to a memory of the core. In another embodiment, a processor core fetches and decodes a second instruction to generate a second decoded instruction, wherein the second instruction comprises a second opcode which corresponds to the same one or more components. Execution of the second instruction comprises restoring the microarchitectural state from the memory to the one or more components.
    Type: Application
    Filed: December 19, 2022
    Publication date: June 20, 2024
    Applicant: Intel Corporation
    Inventors: Niranjan Soundararajan, Sreenivas Subramoney
  • Publication number: 20240201949
    Abstract: Systems, apparatuses and methods may provide for technology that includes a compute-in-memory (CiM) enabled memory array to conduct digital bit-serial multiply and accumulate (MAC) operations on multi-bit input data and weight data stored in the CiM enabled memory array, an adder tree coupled to the CiM enabled memory array, an accumulator coupled to the adder tree, and an input bit selection stage coupled to the CiM enabled memory array, wherein the input bit selection stage restricts serial bit selection on the multi-bit input data to non-zero values during the digital MAC operations.
    Type: Application
    Filed: February 28, 2024
    Publication date: June 20, 2024
    Inventors: Sagar Varma Sayyaparaju, Om Ji Omer, Sreenivas Subramoney
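The zero-skipping bit-serial MAC can be modeled directly: the input is consumed one bit at a time, but only set bits are selected, so zero bits cost no serial cycles. This is an illustrative software model, not the CiM array itself:

```python
# Sketch of the zero-skipping bit-serial MAC: each selected input bit adds
# the weight shifted by the bit position; the bit-selection stage emits only
# positions where the input bit is 1, so zero bits take no cycles.

def bit_serial_mac(inputs, weights):
    acc = 0
    cycles = 0
    for x, w in zip(inputs, weights):
        bit_pos = 0
        while x:
            if x & 1:                 # only non-zero bits are selected
                acc += w << bit_pos   # weight shifted by bit position
                cycles += 1           # one serial cycle per selected bit
            x >>= 1
            bit_pos += 1
    return acc, cycles
```

The returned cycle count makes the benefit visible: sparse or small inputs finish in proportion to their set bits, not their full bit width.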
  • Publication number: 20240143379
    Abstract: An apparatus is provided for enabling sequential prefetching inside a host, the apparatus comprising interface circuitry, machine-readable instructions, and processing circuitry to execute the machine-readable instructions. The machine-readable instructions comprise instructions to identify a first memory access pattern of an application in a guest virtual address space inside a virtual machine, where the application is running inside the virtual machine and the virtual machine is running on the host. The machine-readable instructions further comprise instructions to modify a layout of a guest physical address space, which corresponds to the guest virtual address space, to sequentialize a second memory access pattern in a host virtual address space. The second memory access pattern in the host virtual address space corresponds to the first memory access pattern of the application in the guest virtual address space.
    Type: Application
    Filed: September 13, 2023
    Publication date: May 2, 2024
    Inventors: Chandra Prakash, Aravinda Prasad, Sreenivas Subramoney
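The layout sequentialization can be reduced to a remapping sketch. All structures here are invented simplifications: given the order in which the guest touches its pages, place them on consecutive host-side pages so the host's sequential prefetcher can see the stream.

```python
# Illustrative model: rebuild the guest-physical layout so that pages are
# placed in the order the application first touches them, making the
# corresponding host-virtual accesses sequential.

def sequentialize_layout(access_pattern):
    # access_pattern: guest-virtual page numbers in observed access order.
    mapping = {}
    next_host_page = 0
    for gva_page in access_pattern:
        if gva_page not in mapping:
            mapping[gva_page] = next_host_page   # place pages in touch order
            next_host_page += 1
    return mapping
```

After remapping, the scattered guest pattern 7, 3, 9 becomes host pages 0, 1, 2, which a sequential prefetcher recognizes.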