Patents by Inventor Rabin Sugumar

Rabin Sugumar has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20250245085
    Abstract: Techniques for debugging errors in a processor are disclosed. One or more processors are accessed. Each processor within the one or more processors includes a set of assertion registers. A processor within the one or more processors executes one or more instructions. An assertion logic detects an error condition in the processor. The detecting occurs during the executing. The error condition is recorded. The recording is based on one or more bits in the set of assertion registers. A hardware interface reads the one or more bits in the set of assertion registers. The one or more bits indicate the error condition to the hardware interface. The executing includes a communication protocol between the processor and a slave device. The error condition comprises an incorrect value in a credit buffer. The credit buffer controls a number of transactions allowed between the processor and the slave device.
    Type: Application
    Filed: March 10, 2025
    Publication date: July 31, 2025
    Applicant: Akeana, Inc.
    Inventors: Ricardo Ramirez, Rabin Sugumar
  • Publication number: 20250238503
    Abstract: Disclosed embodiments provide techniques for malicious code detection in a processor core. A system-on-a-chip (SoC) is accessed. The SoC includes one or more processor cores. Each processor core is coupled to one or more external profiling agents (EPAs) on the SoC. An EPA configures a performance counter in a processor core within the SoC. The configuring is based on an offset value. The processor core updates the performance counter that was configured, based on a processor core event. A program state is saved to a performance counter storage area, based on a performance counter event. The program state that is saved corresponds to code being executed on the processor core. The program state is read from the performance counter storage area by the EPA. The EPA interprets the program state that was read, which identifies a malicious program running on the processor core.
    Type: Application
    Filed: March 6, 2025
    Publication date: July 24, 2025
    Applicant: Akeana, Inc.
    Inventor: Rabin Sugumar
  • Patent number: 12360769
    Abstract: Disclosed embodiments provide techniques for branch prediction. A processor core is accessed. The processor core is coupled to memory and includes branch prediction circuitry. The branch prediction circuitry includes a branch target buffer (BTB) and an indirect branch target buffer (BTBI). A hashed program counter within the processor core is read. The BTB and BTBI are searched. The searching the BTB is accomplished with the hashed program counter and the searching the BTBI is accomplished with the hashed program counter and branch history information. A predicted branch target address within the BTBI or the BTB is matched. The matching within the BTBI is based on an indirect branch instruction, and the matching within the BTB is based on other branch instruction types. The predicted branch target address that was matched is predicted taken. The processor core is directed to fetch a next instruction from the predicted branch target address.
    Type: Grant
    Filed: December 11, 2023
    Date of Patent: July 15, 2025
    Assignee: Akeana, Inc.
    Inventors: James Youngsae Cho, Chandramouli Banerjee, Rabin Sugumar
  • Publication number: 20250217151
    Abstract: Disclosed embodiments provide techniques for instruction execution with a processor pipeline for data transfer operations. A processor core is accessed. The processor core executes one or more instructions out of order. The processor core supports integer operations and floating-point operations. An instruction in the processor core is decoded. The instruction is a data transfer operation. The data transfer operation necessitates a floating-point operation and an integer operation. The floating-point operation and the integer operation are dispatched to one or more issue queues. The floating-point operation and the integer operation are interlocked. The interlocking is accomplished using at least one entry in the one or more issue queues. A first operation of the floating-point operation and the integer operation is executed. A second operation of the floating-point operation and the integer operation is executed. The execution of the second operation is based on the interlocking.
    Type: Application
    Filed: April 26, 2024
    Publication date: July 3, 2025
    Applicant: Akeana, Inc.
    Inventors: Ricardo Ramirez, Albert Anthony Martin, Abhijit Sil, Rabin Sugumar
  • Publication number: 20250021336
    Abstract: A processor core includes a local cache hierarchy, prefetch logic, and a prefetch table, where the processor core is coupled to an external memory system. A data stream is detected, where the data stream includes multiple load instructions, including a load instruction that causes a cache miss, resulting in prefetching. A prefetch table is initialized with information pertaining to load instructions, and includes a Positive or Negative value (PON), a stride, and a saturation count. Information in the prefetch table is updated as new load instructions are prefetched. An underlying stride of the data stream is discovered, based on the updating. Data is prefetched using an offset, where a polarity of the offset is based on the PON, enabling effective stride detection with dynamic directionality and out-of-order instructions.
    Type: Application
    Filed: July 10, 2024
    Publication date: January 16, 2025
    Applicant: Akeana, Inc.
    Inventor: Rabin Sugumar
  • Publication number: 20240419551
    Abstract: Disclosed embodiments provide techniques for enhancing security of a processor. Multiple consistency units are distributed within a processor core. Instructions are executed in an architecturally defined mode. The architecturally defined mode can be based on an instruction set architecture (ISA). In response to detecting an error in at least one consistency unit, disclosed embodiments reduce the functionality of the processor core. The reduced functionality includes halting the processor core, shutting down the processor core, switching the functionality of the processor core to a safe mode, and/or other suitable actions. The consistency unit can include a program counter comparison function. The consistency unit can include a completion signal check function. The consistency unit can include an address check function. The consistency unit can include a temporal proximity check function. Disclosed embodiments provide safeguards against various environmental attacks, such as voltage and/or clock alterations.
    Type: Application
    Filed: May 17, 2024
    Publication date: December 19, 2024
    Applicant: Akeana, Inc.
    Inventor: Rabin Sugumar
  • Publication number: 20240220267
    Abstract: Techniques for providing a return address stack with branch mispredict recovery are disclosed. A processor core is accessed. The processor core includes a return address stack (RAS), a local cache hierarchy, and branch prediction logic. RAS state information, including a write pointer, a read pointer, and a RAS count, is sent to a branch execution unit. One or more call instructions are detected in an instruction stream. The detecting generates a predicted return address for each of the one or more call instructions which are pushed on the RAS. The pushing is directed by the write pointer. One or more return instructions are recognized in the instruction stream. The write pointer and the read pointer for the RAS are updated, based on information from the branch execution unit. The predicted return address for each of the one or more return instructions is popped from the RAS.
    Type: Application
    Filed: December 29, 2023
    Publication date: July 4, 2024
    Applicant: Akeana, Inc.
    Inventors: James Youngsae Cho, Rabin Sugumar
  • Publication number: 20240211366
    Abstract: Techniques for performance profiling based on processor performance profiling using agents are disclosed. A processor core is accessed. The processor core includes a performance counter, a performance counter storage area, and a performance counter control register. The processor core includes a performance monitoring interface. The performance counter, performance counter storage area, and performance counter control register are assigned to an external profiling agent, which loads the performance counter and the performance counter control register. The loading is based on a particular event in the processor core. A program state is saved to the storage area, based on a counter event in the performance counter and an enable bit in the performance counter control register being set. The program state that is saved corresponds to code being executed on the processor core. The program state is read, from the storage area, by the external profiling agent.
    Type: Application
    Filed: December 20, 2023
    Publication date: June 27, 2024
    Applicant: Akeana, Inc.
    Inventor: Rabin Sugumar
  • Publication number: 20240211259
    Abstract: Disclosed embodiments provide techniques for data prefetching. A processor core is accessed. The processor core includes prefetch logic and a local cache hierarchy and is coupled to a memory system. A stride of a data stream is detected. The data stream comprises two or more load instructions that cause two or more misses in the local cache hierarchy. Information about the data stream is accumulated. The information includes a stride count. Prefetch operations to the memory system are generated, based on the information. The prefetch operations include prefetch addresses. A rate of the prefetch operations is limited, based on the stride count. Based on the stride count, the prefetcher can enter a saturation state. The saturation state keeps the cache supplied with prefetched data. A number of stride prefetch operations is based on the stride of the data stream. The number is stored in a software-updatable configuration register array.
    Type: Application
    Filed: December 27, 2023
    Publication date: June 27, 2024
    Applicant: Akeana, Inc.
    Inventors: James Youngsae Cho, Rabin Sugumar
  • Publication number: 20240192958
    Abstract: Disclosed embodiments provide techniques for branch prediction. A processor core is accessed. The processor core is coupled to memory and includes branch prediction circuitry. The branch prediction circuitry includes a branch target buffer (BTB) and an indirect branch target buffer (BTBI). A hashed program counter within the processor core is read. The BTB and BTBI are searched. The searching the BTB is accomplished with the hashed program counter and the searching the BTBI is accomplished with the hashed program counter and branch history information. A predicted branch target address within the BTBI or the BTB is matched. The matching within the BTBI is based on an indirect branch instruction, and the matching within the BTB is based on other branch instruction types. The predicted branch target address that was matched is predicted taken. The processor core is directed to fetch a next instruction from the predicted branch target address.
    Type: Application
    Filed: December 11, 2023
    Publication date: June 13, 2024
    Applicant: Akeana, Inc.
    Inventors: James Youngsae Cho, Chandramouli Banerjee, Rabin Sugumar
  • Publication number: 20240192961
    Abstract: Techniques for instruction execution based on processor instruction exception handling are disclosed. A processor core is accessed. The processor core executes at least one instruction thread. The processor core executes one or more instructions out of order. An ordered list of instructions is maintained. The ordered list is based on instructions that are presented to the processor core for execution. The ordered list is organized using one or more pointers. An execution exception is detected in the processor core. The execution exception corresponds to one of the instructions in the ordered list. The execution exception requires initiating an exception handling routine. An effective age of an instruction in the ordered list is determined. The effective age corresponds to the execution exception. The exception handling routine is initiated, based on matching the effective age of an instruction in the ordered list with one of the one or more pointers.
    Type: Application
    Filed: December 6, 2023
    Publication date: June 13, 2024
    Applicant: Akeana, Inc.
    Inventors: Ricardo Ramirez, Rabin Sugumar
  • Patent number: 11868193
    Abstract: A system includes a controller configured to receive a signal indicating whether a droop event has occurred. The system also includes a plurality of delay elements where each delay element of the plurality of delay elements responsive to a signal from the controller receives an input signal and outputs an output signal that is a delayed version of the input signal. At least one delay element of the plurality of delay elements receives a clocking signal as its input signal. The system also includes a selector configured to select rising edges and falling edges of output signals from the plurality of delay elements to form a modified clocking signal. The modified clocking signal is a modified version of the clocking signal.
    Type: Grant
    Filed: April 6, 2021
    Date of Patent: January 9, 2024
    Assignee: Marvell Asia Pte Ltd
    Inventors: Rabin Sugumar, Bharath Upputuri, Bruce Kauffmann, Novinder Waraich, Bivraj Koradia, Paul Sebata
  • Patent number: 11755483
    Abstract: In a multi-node system, each node includes tiles. Each tile includes a cache controller, a local cache, and a snoop filter cache (SFC). The cache controller responsive to a memory access request by the tile checks the local cache to determine whether the data associated with the request has been cached by the local cache of the tile. The cached data from the local cache is returned responsive to a cache-hit. The SFC is checked to determine whether any other tile of a remote node has cached the data associated with the memory access request. If it is determined that the data has been cached by another tile of a remote node and if there is a cache-miss by the local cache, then the memory access request is transmitted to the global coherency unit (GCU) and the snoop filter to fetch the cached data. Otherwise an interconnected memory is accessed.
    Type: Grant
    Filed: May 27, 2022
    Date of Patent: September 12, 2023
    Assignee: Marvell Asia Pte Ltd
    Inventors: Pranith Kumar Denthumdas, Rabin Sugumar, Isam Wadih Akkawi
  • Patent number: 11663130
    Abstract: Described herein are systems and methods for cache replacement mechanisms for speculative execution. For example, some systems include a buffer comprising entries that are each configured to store a cache line of data and a tag that includes an indication of a status of the cache line stored in the entry, in an integrated circuit that is configured to: responsive to a cache miss caused by a load instruction that is speculatively executed by a processor pipeline, load a cache line of data corresponding to the cache miss into a first entry of the buffer and update the tag of the first entry to indicate the status is speculative; responsive to the load instruction being retired by the processor pipeline, update the tag to indicate the status is validated; and, responsive to the load instruction being flushed from the processor pipeline, update the tag to indicate the status is cancelled.
    Type: Grant
    Filed: April 30, 2021
    Date of Patent: May 30, 2023
    Assignee: Marvell Asia Pte, Ltd.
    Inventor: Rabin Sugumar
  • Patent number: 11487695
    Abstract: A circuit provides for processing and routing peer-to-peer (P2P) traffic. A bus request queue stores a data request received from a first peer device. A decoder compares an address portion of the data request against an address map to determine whether the data request is directed to either a second peer device or a local memory. A bus interface unit, in response to the data request being directed to the second peer device, 1) generates a memory access request from the bus request and 2) transmits the memory access request toward the second peer device via a bus. A memory controller, in response to the data request being directed to a local memory, accesses the local memory to perform a memory access operation based on the data request.
    Type: Grant
    Filed: April 23, 2021
    Date of Patent: November 1, 2022
    Assignee: Marvell Asia Pte Ltd
    Inventors: Sivakumar Radhakrishnan, Rabin Sugumar, Ham U Prince
  • Patent number: 11467964
    Abstract: A system includes a first counter configured to increment or decrement in response to a triggering event. The first counter is sized to overflow. The system also includes a second counter configured to increment or decrement in response to a triggering event. The first counter and the second counter are merged to form a third counter in response to detecting an overflow triggering event for the first counter. A merge bit indicative of whether the first counter and the second counter are merged changes value in response to merging the first counter and the second counter.
    Type: Grant
    Filed: August 31, 2020
    Date of Patent: October 11, 2022
    Assignee: Marvell Asia Pte Ltd
    Inventors: Nagesh Bangalore Lakshminarayana, Pranith Kumar Denthumdas, Rabin Sugumar
  • Patent number: 11403101
    Abstract: Described herein are systems and methods for introducing noise in threaded execution to mitigate cross-thread monitoring. For example, some systems include an integrated circuit including a processor pipeline that is configured to execute instructions using an architectural state of a processor core; data storage circuitry configured to store a thread identifier; and a random parameter generator. The integrated circuit may be configured to: determine a time for insertion based on a random parameter generated using the random parameter generator; at the time for insertion, insert one or more instructions in the processor pipeline by participating in thread arbitration using the thread identifier; and execute the one or more instructions using one or more execution units of the processor pipeline.
    Type: Grant
    Filed: July 30, 2021
    Date of Patent: August 2, 2022
    Assignee: Marvell Asia Pte, Ltd.
    Inventor: Rabin Sugumar
  • Patent number: 11379370
    Abstract: In a multi-node system, each node includes tiles. Each tile includes a cache controller, a local cache, and a snoop filter cache (SFC). The cache controller responsive to a memory access request by the tile checks the local cache to determine whether the data associated with the request has been cached by the local cache of the tile. The cached data from the local cache is returned responsive to a cache-hit. The SFC is checked to determine whether any other tile of a remote node has cached the data associated with the memory access request. If it is determined that the data has been cached by another tile of a remote node and if there is a cache-miss by the local cache, then the memory access request is transmitted to the global coherency unit (GCU) and the snoop filter to fetch the cached data. Otherwise an interconnected memory is accessed.
    Type: Grant
    Filed: October 30, 2020
    Date of Patent: July 5, 2022
    Assignee: Marvell Asia Pte Ltd
    Inventors: Pranith Kumar Denthumdas, Rabin Sugumar, Isam Wadih Akkawi
  • Publication number: 20210255685
    Abstract: A system includes a controller configured to receive a signal indicating whether a droop event has occurred. The system also includes a plurality of delay elements where each delay element of the plurality of delay elements responsive to a signal from the controller receives an input signal and outputs an output signal that is a delayed version of the input signal. At least one delay element of the plurality of delay elements receives a clocking signal as its input signal. The system also includes a selector configured to select rising edges and falling edges of output signals from the plurality of delay elements to form a modified clocking signal. The modified clocking signal is a modified version of the clocking signal.
    Type: Application
    Filed: April 6, 2021
    Publication date: August 19, 2021
    Inventors: Rabin Sugumar, Bharath Upputuri, Bruce Kauffman, Novinder Waraich, Bivraj Koradia, Paul Sebata
  • Patent number: 10996738
    Abstract: A system includes a controller configured to receive a signal indicating whether a droop event has occurred. The system also includes a plurality of delay elements where each delay element of the plurality of delay elements responsive to a signal from the controller receives an input signal and outputs an output signal that is a delayed version of the input signal. At least one delay element of the plurality of delay elements receives a clocking signal as its input signal. The system also includes a selector configured to select rising edges and falling edges of output signals from the plurality of delay elements to form a modified clocking signal. The modified clocking signal is a modified version of the clocking signal.
    Type: Grant
    Filed: December 18, 2018
    Date of Patent: May 4, 2021
    Assignee: Marvell Asia Pte, Ltd.
    Inventors: Rabin Sugumar, Bharath Upputuri, Bruce Kauffman, Novinder Waraich, Bivraj Koradia, Paul Sebata
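
Illustrative note on patent 11663130 (cache replacement for speculative execution): the abstract listed above describes a buffer whose entries carry a status tag that is set to speculative on a speculatively executed load miss, validated when the load retires, and cancelled when the load is flushed. The sketch below models only that tag lifecycle in software. It is not the patented circuit; the class, method, and field names (SpeculativeBuffer, on_speculative_miss, load_id, and so on) are hypothetical, and keying entries by a load identifier is an assumption made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    """Tag values named in the abstract."""
    SPECULATIVE = "speculative"
    VALIDATED = "validated"
    CANCELLED = "cancelled"


@dataclass
class Entry:
    line_addr: int   # address of the cache line (hypothetical field)
    data: bytes      # cache line contents
    status: Status   # status tag for the line


class SpeculativeBuffer:
    """Toy model: entries are keyed by a load identifier (an assumption)."""

    def __init__(self):
        self.entries = {}

    def on_speculative_miss(self, load_id, line_addr, data):
        # A speculatively executed load missed: fill an entry, tag it speculative.
        self.entries[load_id] = Entry(line_addr, data, Status.SPECULATIVE)

    def on_retire(self, load_id):
        # The load retired: its line is now tagged validated.
        self.entries[load_id].status = Status.VALIDATED

    def on_flush(self, load_id):
        # The load was flushed from the pipeline: tag the line cancelled.
        self.entries[load_id].status = Status.CANCELLED


# Usage: one load retires, another is flushed.
buf = SpeculativeBuffer()
buf.on_speculative_miss(load_id=1, line_addr=0x1000, data=bytes(64))
buf.on_speculative_miss(load_id=2, line_addr=0x2000, data=bytes(64))
buf.on_retire(1)
buf.on_flush(2)
```

What happens to a cancelled entry afterwards (for example, whether it is preferred for replacement) is not stated in the abstract, so the sketch leaves it out.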