Patents Assigned to Next Silicon, Ltd.
-
Publication number: 20250138787
Abstract: There is provided a method, comprising simultaneously presenting in a GUI, a source code and an interactive graph of nodes connected by edges representing the source code mapped to physical configurable elements of computational cluster(s) of a processor each configurable to execute mathematical operations, each node represents operation(s) mapped to physical configurable elements, and edges represent dependencies between the operations, mapped to physical dependency links between the configurable elements, receiving, via the GUI, a user selection of a portion of the source code, determining node(s) and/or edge(s) of the interactive graph corresponding to the portion, and updating the GUI for visually distinguishing the node(s) and/or edge(s), wherein the visually distinguished node(s) represents a mapping to certain physical configurable elements and the visually distinguished edge(s) represents certain dependency links between the certain physical configurable elements of the processor configured to execute …
Type: Application
Filed: May 27, 2024
Publication date: May 1, 2025
Applicant: Next Silicon Ltd
Inventors: Oshri KDOSHIM, Elad RAZ
-
Publication number: 20250130802
Abstract: An apparatus for executing a software program, comprising processing units and a hardware processor adapted for: in an intermediate representation of the software program, where the intermediate representation comprises blocks, each associated with an execution block of the software program and comprising intermediate instructions, identifying a calling block and a target block, where the calling block comprises a control-flow intermediate instruction to execute a target intermediate instruction of the target block; generating target instructions using the target block; generating calling instructions using the calling block and a computer control instruction for invoking the target instructions, when the calling instructions are executed by a calling processing unit and the target instructions are executed by a target processing unit; configuring the calling processing unit for executing the calling instructions; and configuring the target processing unit for executing the target instructions.
Type: Application
Filed: December 24, 2024
Publication date: April 24, 2025
Applicant: Next Silicon Ltd
Inventors: Elad RAZ, Ilan TAYARI
-
Patent number: 12277051
Abstract: A method of generating automatically architecture-specific algorithms, comprising receiving an architecture independent algorithm and one or more algorithm parameters defining at least a target processing architecture and a format of an output of an architecture-specific algorithm implementing the received algorithm, determining automatically a functionality of the algorithm by analyzing the algorithm, selecting one or more architecture-specific computing blocks of the target processing architecture according to the functionality of the algorithm and the algorithm parameter(s) wherein each computing block is dynamically reconfigurable in runtime and associated with (1) simulation code simulating its functionality, and (2) execution code executing its functionality, testing an emulated architecture-specific algorithm constructed using the simulation code of the selected architecture-specific computing block(s) to verify compliance with the algorithm parameter(s), and, responsive to successful compliance verifi…
Type: Grant
Filed: February 5, 2024
Date of Patent: April 15, 2025
Assignee: Next Silicon Ltd
Inventor: Daniel Khankin
-
Publication number: 20250053449
Abstract: A hardware acceleration circuit, comprising a communication interface for connecting to one or more event-driven circuits, a memory, an event handling circuit, and a hardware acceleration engine. The event handling circuit is adapted to detect one or more events triggered by one or more of the event-driven circuits, update one or more pointers pointing to one or more event handling routines stored in the memory and to a context memory segment in the memory storing a plurality of context parameters relating to the one or more events, and transmit the pointer(s) to the hardware acceleration engine. The hardware acceleration engine is adapted to receive the pointer(s) from the event handling circuit, and execute the event handling routine(s) pointed to by the pointer(s) to process data relating to the event(s) according to at least some of the context parameters retrieved from the context memory segment using the pointer(s).
Type: Application
Filed: August 7, 2023
Publication date: February 13, 2025
Applicant: Next Silicon Ltd
Inventors: Alexander MARGOLIN, Menashe DASKAL, Oren NISHRY
-
Publication number: 20250053511
Abstract: A method for caching memory comprising caching two data values, each of one of two ranges of application memory addresses, each associated with one of a set of threads, by: organizing a plurality of sequences of consecutive address sub-ranges in an interleaved sequence of address sub-ranges by alternately selecting, for each thread in an identified order of threads, a next sub-range in the respective sequence of sub-ranges associated therewith; generating a mapping of the interleaved sequence of sub-ranges to a range of physical memory addresses in order of the interleaved sequence of sub-ranges; and when a thread accesses an application memory address of the respective range of application addresses associated therewith: computing a target address according to the mapping using the application address; and storing the two data values in one cache-line of a plurality of cache-lines of a cache by accessing the physical memory area using the target address.
Type: Application
Filed: October 28, 2024
Publication date: February 13, 2025
Applicant: Next Silicon Ltd
Inventors: Dan SHECHTER, Elad RAZ
-
Patent number: 12197919
Abstract: A system for executing a software program comprising processing units and a hardware processor configured to: for at least one set of blocks, each set comprising a calling block and a target block of an intermediate representation of the software program, generate control-transfer information describing at least one value of the software program at an exit of the calling block (out-value) and at least one other value of the software program at an entry to the target block (in-value); select a set of blocks according to at least one statistical value collected while executing the software program; generate a target set of instructions using the target block and the control-transfer information; generate a calling set of instructions using the calling block and the control-transfer information; configure a calling processing unit to execute the calling set of instructions; and configure a target processing unit to execute the target set of instructions.
Type: Grant
Filed: June 17, 2024
Date of Patent: January 14, 2025
Assignee: Next Silicon Ltd
Inventors: Elad Raz, Ilan Tayari, Itay Bookstein, Jonathan Lavi
-
Publication number: 20250013466
Abstract: A system for processing a plurality of concurrent threads comprising: a reconfigurable processing grid, comprising logical elements and a context storage for storing thread contexts, each thread context for one of a plurality of concurrent threads, each implementing a dataflow graph comprising an identified operation; and a hardware processor configured for configuring the reconfigurable processing grid for: executing a first thread of the plurality of concurrent threads; and while executing the first thread: storing a runtime context value of the first thread in the context storage; while waiting for completion of the identified operation by identified logical elements, executing the identified operation of a second thread by the identified logical elements; and when execution of the identified operation of the first thread completes: retrieving the runtime context value of the first thread from the context storage; and executing another operation of the first thread.
Type: Application
Filed: January 11, 2024
Publication date: January 9, 2025
Applicant: Next Silicon Ltd
Inventors: Elad RAZ, Ilan TAYARI
-
Patent number: 12189412
Abstract: An apparatus for executing a software program, comprising processing units and a hardware processor adapted for: in an intermediate representation of the software program, where the intermediate representation comprises blocks, each associated with an execution block of the software program and comprising intermediate instructions, identifying a calling block and a target block, where the calling block comprises a control-flow intermediate instruction to execute a target intermediate instruction of the target block; generating target instructions using the target block; generating calling instructions using the calling block and a computer control instruction for invoking the target instructions, when the calling instructions are executed by a calling processing unit and the target instructions are executed by a target processing unit; configuring the calling processing unit for executing the calling instructions; and configuring the target processing unit for executing the target instructions.
Type: Grant
Filed: March 29, 2023
Date of Patent: January 7, 2025
Assignee: Next Silicon Ltd
Inventors: Elad Raz, Ilan Tayari
-
Patent number: 12164793
Abstract: A method of processing incoming packets prior to complete reception, comprising receiving a pointer to one or more memory blocks allocated for storing one or more incoming packets to be written by one or more other controllers where each packet comprises one or more packet segments, determining all valid data values of fields contained in the packet segments, initializing one or more memory sections in the memory blocks which are mapped to the fields with one or more predefined data patterns which are different from any of the valid values of the fields, checking continuously content of the memory sections, determining packet segment(s) were written in the memory block(s) responsive to detecting that the content of one or more of the memory sections do not match the one or more predefined data patterns, and processing one or more of the packets according to at least part of the received packet segment(s).
Type: Grant
Filed: January 30, 2024
Date of Patent: December 10, 2024
Assignee: Next Silicon Ltd
Inventor: Alexander Margolin
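The sentinel-pattern detection the abstract describes can be illustrated with a short sketch: memory mapped to packet fields is pre-filled with a pattern no valid field value can take, so a poller can tell that a segment has been written before the whole packet arrives. The pattern value and the one-byte-per-field layout are simplifying assumptions for illustration; the patent does not specify them.

```python
SENTINEL = 0xFF  # assumed invalid for every field value (illustrative)

def init_buffer(num_fields: int) -> bytearray:
    """Pre-fill the memory sections mapped to packet fields with the sentinel."""
    return bytearray([SENTINEL] * num_fields)

def written_fields(buf: bytearray) -> list:
    """Return indices of fields whose content no longer matches the sentinel,
    i.e. fields another controller has already written."""
    return [i for i, b in enumerate(buf) if b != SENTINEL]

buf = init_buffer(4)
buf[1] = 0x2A              # a DMA writer fills in one segment early
# written_fields(buf) now reports field 1, so processing can begin on it
# before the remaining segments land.
```

The point of the scheme is that detection needs no extra signaling from the writer: the reader only compares memory contents against the pre-initialized pattern.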
-
Patent number: 12130736
Abstract: A method for caching memory comprising caching two data values, each of one of two ranges of application memory addresses, each associated with one of a set of threads, by: organizing a plurality of sequences of consecutive address sub-ranges in an interleaved sequence of address sub-ranges by alternately selecting, for each thread in an identified order of threads, a next sub-range in the respective sequence of sub-ranges associated therewith; generating a mapping of the interleaved sequence of sub-ranges to a range of physical memory addresses in order of the interleaved sequence of sub-ranges; and when a thread accesses an application memory address of the respective range of application addresses associated therewith: computing a target address according to the mapping using the application address; and storing the two data values in one cache-line of a plurality of cache-lines of a cache by accessing the physical memory area using the target address.
Type: Grant
Filed: August 7, 2023
Date of Patent: October 29, 2024
Assignee: Next Silicon Ltd
Inventors: Dan Shechter, Elad Raz
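The interleaving the abstract describes can be sketched roughly as follows: each thread's application address range is split into consecutive sub-ranges, the sub-ranges are alternated across threads in a fixed thread order, and the interleaved sequence is laid out in order in physical memory, so sub-ranges that different threads touch together land in the same cache line. The sub-range size, function names, and per-thread layout here are illustrative assumptions, not details from the patent.

```python
SUB_RANGE = 8  # bytes per sub-range (illustrative)

def target_address(thread_id: int, app_addr: int, num_threads: int,
                   phys_base: int = 0) -> int:
    """Map a per-thread application address to a physical address such that
    sub-ranges from different threads alternate in physical memory."""
    sub_index = app_addr // SUB_RANGE   # which sub-range within the thread's range
    offset = app_addr % SUB_RANGE       # offset inside that sub-range
    interleaved_index = sub_index * num_threads + thread_id
    return phys_base + interleaved_index * SUB_RANGE + offset
```

With two threads and 8-byte sub-ranges, thread 0's first sub-range maps to physical offset 0 and thread 1's first sub-range to offset 8, so the two values share one 16-byte region and can be stored in a single cache line.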
-
Publication number: 20240345881
Abstract: There is provided a method of allocation of memory, comprising: issuing an allocation operation for allocation of a region of a pool of a memory by a first process of a plurality of first processes executed in parallel on a first processor, sending a message to a second processor indicating the allocation of the region of the pool of the memory, issuing a free operation for release of the allocated region of the pool of the memory by a second process of a plurality of second processes executed in parallel on a second processor, and releasing, by the first processor, the allocated region of the pool of the memory as indicated in the free operation, wherein a same region of memory is allocated by the first process and released by the second process, wherein the first processes are concurrently attempting to issue the allocation operation and the second processes are concurrently attempting to issue the free operation.
Type: Application
Filed: June 24, 2024
Publication date: October 17, 2024
Applicant: Next Silicon Ltd
Inventors: Elad RAZ, Ilan TAYARI, Dan SHECHTER
-
Publication number: 20240289248
Abstract: An apparatus for computing functions using polynomial-based approximation, comprising one or more processing circuitries configured for computing a polynomial-based approximant approximating a function by executing one or more iterations. Each iteration comprising computing the polynomial-based approximant using scaled fixed-point unit(s) according to a constructed set of coefficients, minimizing an approximation error of the computed polynomial-based approximant compared to the function while complying with one or more constraints selected from a group comprising at least: an accuracy, a compute graph size, a computation complexity, and a hardware utilization of the processing circuitry(s), adjusting one or more of the coefficients in case the approximation error is incompliant with the constraint(s) and initiating another iteration.
Type: Application
Filed: May 2, 2024
Publication date: August 29, 2024
Applicant: Next Silicon Ltd
Inventor: Daniel KHANKIN
-
Patent number: 12056376
Abstract: A device for executing a software program by at least one computational device, comprising an interconnected computing grid, connected to the at least one computational device, comprising an interconnected memory grid comprising a plurality of memory units connected by a plurality of memory network nodes, each connected to at least one of the plurality of memory units; wherein configuring the interconnected memory grid comprises: identifying a bypassable memory unit; selecting a backup memory unit connected to a backup memory network node; configuring the respective memory network node connected to the bypassable memory unit to forward at least one memory access request, comprising an address in a first address range, to the backup memory network node; and configuring the backup memory network node to access the backup memory unit in response to the at least one memory access request, in addition to accessing the respective at least one memory unit connected thereto.
Type: Grant
Filed: May 8, 2023
Date of Patent: August 6, 2024
Assignee: Next Silicon Ltd
Inventors: Yoav Lossin, Ron Schneider, Elad Raz, Ilan Tayari, Eyal Nagar
-
Patent number: 12038843
Abstract: A joint scheduler adapted for dispatching prefetch and demand accesses of data relating to a plurality of instructions loaded in an execution pipeline of processing circuit(s). Each prefetch access comprises checking whether a respective data is cached in a cache entry and each demand access comprises accessing a respective data. The joint scheduler is adapted to, responsive to each hit prefetch access dispatched for a respective data relating to a respective instruction, associate the respective instruction with a valid indication and a pointer to a respective cache entry storing the respective data such that the demand access relating to the respective instruction uses the associated pointer to access the respective data in the cache, and responsive to each missed prefetch access dispatched for a respective data relating to a respective instruction, initiate a read cycle for loading the respective data from next level memory and cache it in the cache.
Type: Grant
Filed: December 13, 2023
Date of Patent: July 16, 2024
Assignee: Next Silicon Ltd
Inventors: Yiftach Gilad, Liron Zur
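The hit/miss bookkeeping the abstract describes can be modeled in a few lines: a prefetch that hits tags the instruction with a valid indication and a pointer to the cache entry, so the later demand access goes straight to that entry; a miss triggers a fill from the next memory level. The cache organization and every name below are assumptions made for illustration only.

```python
cache = {}        # address -> index of the cache entry holding its data (toy map)
entries = []      # cache entry storage

def next_level_read(addr):
    """Stand-in for a read cycle from next-level memory."""
    return f"data@{addr}"

def prefetch(instr):
    addr = instr["addr"]
    if addr in cache:                        # hit: attach valid bit + pointer
        instr["valid"], instr["ptr"] = True, cache[addr]
    else:                                    # miss: load and cache the data
        entries.append(next_level_read(addr))
        cache[addr] = len(entries) - 1

def demand(instr):
    if instr.get("valid"):                   # use the pointer set by the hit
        return entries[instr["ptr"]]
    return entries[cache[instr["addr"]]]     # entry filled by the miss path
```

The design point is that a hit prefetch does the cache lookup once and hands the demand access a direct pointer, so the demand path skips a second lookup.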
-
Patent number: 12020069
Abstract: There is provided a computer implemented method of allocation of memory, comprising: issuing an allocation operation for allocation of a region of a pool of a memory by a first process executed on a first processor, sending a message to a second processor indicating the allocation of the region of the pool of the memory, wherein the first processor and the second processor access the region of the pool of the memory, issuing a free operation for release of the allocated region of the pool of the memory by a second process executed on a second processor, and releasing, by the first processor, the allocated region of the pool of the memory as indicated in the free operation, wherein the region of the pool of the memory allocated by the first process and released by the second process is a same region of memory.
Type: Grant
Filed: August 11, 2022
Date of Patent: June 25, 2024
Assignee: Next Silicon Ltd
Inventors: Elad Raz, Ilan Tayari, Dan Shechter, Yuval Asher Deutsher
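The cross-processor flow in the abstract can be sketched with message queues standing in for the inter-processor messages: the first processor allocates from its pool and notifies the second, a process on the second processor later issues the free operation, and the first processor performs the actual release. The pool representation, queues, and all names are illustrative assumptions, not the patented mechanism.

```python
from queue import Queue

alloc_msgs = Queue()   # first processor -> second: "this region was allocated"
free_msgs = Queue()    # second processor -> first: "free this region"
pool = {"free": {0, 1, 2, 3}, "allocated": set()}  # toy pool of region ids

def first_alloc():
    """First process: allocate a region from the pool and notify the second
    processor of the allocation."""
    region = pool["free"].pop()
    pool["allocated"].add(region)
    alloc_msgs.put(region)
    return region

def second_free():
    """Second process: issue a free operation for the region it was told about."""
    region = alloc_msgs.get()
    free_msgs.put(region)

def first_release():
    """First processor: perform the actual release named in the free operation."""
    region = free_msgs.get()
    pool["allocated"].remove(region)
    pool["free"].add(region)
    return region
```

Keeping both the allocation and the eventual release on the first processor means the pool's bookkeeping is only ever mutated by its owner, even though the free is requested from the other side.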
-
Patent number: 12001311
Abstract: An apparatus for computing functions using polynomial-based approximation, comprising one or more processing circuitries configured for computing a polynomial-based approximant approximating a function by executing one or more iterations. Each iteration comprising computing the polynomial-based approximant using scaled fixed-point unit(s) according to a constructed set of coefficients, minimizing an approximation error of the computed polynomial-based approximant compared to the function while complying with one or more constraints selected from a group comprising at least: an accuracy, a compute graph size, a computation complexity, and a hardware utilization of the processing circuitry(s), adjusting one or more of the coefficients in case the approximation error is incompliant with the constraint(s) and initiating another iteration.
Type: Grant
Filed: January 6, 2022
Date of Patent: June 4, 2024
Assignee: Next Silicon Ltd
Inventor: Daniel Khankin
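The iterate-measure-adjust loop the abstract describes can be sketched in floating point. The abstract does not specify the coefficient-update rule or the fixed-point scaling, so a simple step that re-centers the error band by adjusting the constant coefficient stands in for them; everything below is an illustrative assumption.

```python
def refine(f, coeffs, lo, hi, tol, max_iters=50):
    """Iteratively evaluate the polynomial given by `coeffs` (lowest degree
    first) against f over [lo, hi], and adjust a coefficient until the
    worst-case error meets the accuracy constraint `tol` or the iteration
    budget runs out."""
    xs = [lo + (hi - lo) * i / 100 for i in range(101)]
    poly = lambda x: sum(c * x ** k for k, c in enumerate(coeffs))
    for _ in range(max_iters):
        errs = [poly(x) - f(x) for x in xs]
        if max(abs(e) for e in errs) <= tol:
            break                                   # constraint satisfied
        coeffs[0] -= (max(errs) + min(errs)) / 2    # re-center the error band
    return coeffs, max(abs(poly(x) - f(x)) for x in xs)
```

For example, approximating f(x) = x² on [0, 1] with the starting polynomial p(x) = x, one adjustment shifts the constant term to -0.125 and halves the worst-case error from 0.25 to 0.125.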
-
Patent number: 11995419
Abstract: There is provided a method, comprising simultaneously presenting in a GUI, a source code and an interactive graph of nodes connected by edges representing the source code mapped to physical configurable elements of computational cluster(s) of a processor each configurable to execute mathematical operations, each node represents operation(s) mapped to physical configurable elements, and edges represent dependencies between the operations, mapped to physical dependency links between the configurable elements, receiving, via the GUI, a user selection of a portion of the source code, determining node(s) and/or edge(s) of the interactive graph corresponding to the portion, and updating the GUI for visually distinguishing the node(s) and/or edge(s), wherein the visually distinguished node(s) represents a mapping to certain physical configurable elements and the visually distinguished edge(s) represents certain dependency links between the certain physical configurable elements of the processor configured to execute …
Type: Grant
Filed: October 25, 2023
Date of Patent: May 28, 2024
Assignee: Next Silicon Ltd
Inventors: Oshri Kdoshim, Elad Raz
-
Publication number: 20240168736
Abstract: A system for generating executable code of a software program that is matched with an intermediate representation (IR) of a source code of the software program. The system comprises a processor adapted for adding one or more annotation entries, each for a location in the IR, to program data in the IR. An internal annotation entry is generated for an internal location in the IR that is not referenced by an IR symbol in the global IR symbol table of the IR. The processor is further adapted for compiling the IR to produce a binary object comprising the annotation entries, and providing the binary object to a linker or to a dynamic loader to update in an executable object an executable internal annotation entry associated with an internal annotation entry to reference a run-time location in the executable object.
Type: Application
Filed: January 29, 2024
Publication date: May 23, 2024
Applicant: Next Silicon Ltd
Inventor: Itay BOOKSTEIN
-
Patent number: 11966619
Abstract: An apparatus for executing a software program, comprising at least one hardware processor configured for: identifying in a plurality of computer instructions at least one remote memory access instruction and a following instruction following the at least one remote memory access instruction; executing after the at least one remote memory access instruction a sequence of other instructions, where the sequence of other instructions comprises a return instruction to execute the following instruction; and executing the following instruction; wherein executing the sequence of other instructions comprises executing an updated plurality of computer instructions produced by at least one of: inserting into the plurality of computer instructions the sequence of other instructions or at least one flow-control instruction to execute the sequence of other instructions; and replacing the at least one remote memory access instruction with at least one non-blocking memory access instruction.
Type: Grant
Filed: September 17, 2021
Date of Patent: April 23, 2024
Assignee: Next Silicon Ltd
Inventors: Elad Raz, Yaron Dinkin
-
Publication number: 20240054015
Abstract: There is provided a computer implemented method of allocation of memory, comprising: issuing an allocation operation for allocation of a region of a pool of a memory by a first process executed on a first processor, sending a message to a second processor indicating the allocation of the region of the pool of the memory, wherein the first processor and the second processor access the region of the pool of the memory, issuing a free operation for release of the allocated region of the pool of the memory by a second process executed on a second processor, and releasing, by the first processor, the allocated region of the pool of the memory as indicated in the free operation, wherein the region of the pool of the memory allocated by the first process and released by the second process is a same region of memory.
Type: Application
Filed: August 11, 2022
Publication date: February 15, 2024
Applicant: Next Silicon Ltd
Inventors: Elad RAZ, Ilan TAYARI, Dan SHECHTER, Yuval Asher DEUTSHER