Patents by Inventor Ljubisa Bajic

Ljubisa Bajic has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

OVERLAY LAYER FOR NETWORK OF PROCESSOR CORES

Publication number: 20230041130

Abstract: Methods and systems related to the efficient execution of complex computations by a multicore processor and the movement of data among the various processing cores in the multicore processor are disclosed. A multicore processor includes a set of processing cores and associated sets of processing pipelines, core controllers, routers, and network interface units. The multicore processor also includes a computation layer, for conducting computations using the set of processing cores, with executable instructions for the set of processing pipelines which are executed by the set of core controllers. The multicore processor also includes a network-on-chip layer, for connecting the set of processing cores in the multicore processor, with executable instructions for the set of routers and the set of network interface units.

Type: Application

Filed: September 14, 2022

Publication date: February 9, 2023

Inventors: Davor Capalija, Ivan Matosevic, Jasmina Vasiljevic, Utku Aydonat, Andrew Lewycky, S. Alexander Chin, Ljubisa Bajic, Alex Cejkov, Milos Trajkovic
Data structure optimized dedicated memory caches

Patent number: 11520701

Abstract: Methods and systems associated with caches are disclosed. One disclosed system includes at least one memory storing at least two data structures. The at least two data structures include a first data structure and a second data structure. The system also includes at least two caches with a first cache which caches the first data structure and a second cache which caches the second data structure. The system also includes a controller communicatively coupled to the at least two caches. The controller separately configures the first cache based on the first data structure and the second cache based on the second data structure. The system also comprises at least one processor communicatively coupled to the at least two caches. The processor accesses each of the at least two data structures using the at least two caches and during the execution of a complex computation.

Type: Grant

Filed: April 2, 2021

Date of Patent: December 6, 2022

Assignee: Tenstorrent Inc.

Inventors: Ljubisa Bajic, Davor Capalija, Ivan Matosevic, Alex Cejkov
Overlay layer for network of processor cores

Patent number: 11467846

Abstract: Methods and systems related to the efficient execution of complex computations by a multicore processor and the movement of data among the various processing cores in the multicore processor are disclosed. A multicore processor stack for the multicore processor can include a computation layer, for conducting computations using the processing cores in the multicore processor, with executable instructions for processing pipelines in the processing cores. The multicore processor stack can also include a network-on-chip layer, for connecting the processing cores in the multicore processor, with executable instructions for routers and network interface units in the multicore processor. The computation layer and the network-on-chip layer can be logically isolated by a network-on-chip overlay layer.

Type: Grant

Filed: July 29, 2020

Date of Patent: October 11, 2022

Assignee: Tenstorrent Inc.

Inventors: Davor Capalija, Ivan Matosevic, Jasmina Vasiljevic, Utku Aydonat, Andrew Lewycky, S. Alexander Chin, Ljubisa Bajic
GRAPH EXECUTION USING ACCESS REQUEST RESPONSE DYNAMIC BATCH ASSEMBLY

Publication number: 20220318614

Abstract: Methods and systems for the accelerated execution of a directed graph are disclosed. The execution can involve the generation of an inference from a set of inputs provided to an artificial neural network. In a specific example, a method for executing a directed graph includes receiving at least two batches of indices. The batches of indices, when used to access a set of embeddings, provide at least two batches of embedding outputs and execute a layer of the directed graph. The method further includes accessing the set of embeddings using the at least two batches of indices. The method further includes rearranging, based on a set of latencies for the accessing step, the at least two batches of embedding outputs into at least two batches of rearranged embeddings. The method further includes providing the at least two batches of rearranged embeddings to a subsequent layer of the directed graph.

Type: Application

Filed: April 2, 2021

Publication date: October 6, 2022

Applicant: Tenstorrent Inc.

Inventors: Ljubisa Bajic, Davor Capalija, Ivan Matosevic, Alex Cejkov
DATA STRUCTURE OPTIMIZED DEDICATED MEMORY CACHES

Publication number: 20220318144

Abstract: Methods and systems associated with caches are disclosed. One disclosed system includes at least one memory storing at least two data structures. The at least two data structures include a first data structure and a second data structure. The system also includes at least two caches with a first cache which caches the first data structure and a second cache which caches the second data structure. The system also includes a controller communicatively coupled to the at least two caches. The controller separately configures the first cache based on the first data structure and the second cache based on the second data structure. The system also comprises at least one processor communicatively coupled to the at least two caches. The processor accesses each of the at least two data structures using the at least two caches and during the execution of a complex computation.

Type: Application

Filed: April 2, 2021

Publication date: October 6, 2022

Applicant: Tenstorrent Inc.

Inventors: Ljubisa Bajic, Davor Capalija, Ivan Matosevic, Alex Cejkov
APPLICATION DATA FLOW GRAPH EXECUTION USING NETWORK-ON-CHIP OVERLAY

Publication number: 20220245009

Abstract: Methods and systems for executing an application data flow graph using a network of computational nodes are disclosed. In specific examples, the network of computational nodes can be a network-on-chip for a multicore processor. One method includes transitioning first application data from a first source computational node to an intermediary computational node. The method can also include providing second application data, from a computation layer of the network of computational nodes, on the intermediary computational node. The method can also include multicasting the first application data in combination with the second application data from the intermediary computational node to at least two destination computational nodes. The first source computational node, the intermediary computational node, and the at least two destination computational nodes are all in the network of computational nodes.

Type: Application

Filed: January 29, 2021

Publication date: August 4, 2022

Applicant: Tenstorrent Inc.

Inventors: Jasmina Vasiljevic, Davor Capalija, Zahi Moudallal, Utku Aydonat, Joseph Chu, S. Alexander Chin, Ljubisa Bajic
PROCESSING CORE WITH OPERATION SUPPRESSION BASED ON CONTRIBUTION ESTIMATE

Publication number: 20220222086

Abstract: Processing cores with the ability to suppress operations based on a contribution estimate for those operations for purposes of increasing the overall performance of the core are disclosed. Associated methods that can be conducted by such processing cores are also disclosed. One such method includes generating a reference value for a composite computation. A complete execution of the composite computation generates a precise output and requires execution of a set of component computations. The method also includes generating a component computation approximation. The method also includes evaluating the component computation approximation with the reference value. The method also includes executing a partial execution of the composite computation using the component computation approximation to produce an estimated output.

Type: Application

Filed: March 28, 2022

Publication date: July 14, 2022

Applicant: Tenstorrent Inc.

Inventors: Ljubisa Bajic, Milos Trajkovic, Ivan Hamer, Syed Gilani
PROCESSOR CORES USING PACKET IDENTIFIERS FOR ROUTING AND COMPUTATION

Publication number: 20220188106

Abstract: Processor cores using packet identifiers for routing and computation are disclosed. One method includes executing a complex computation using a set of processing cores. The method includes routing a set of packets using a set of packet identifiers and executing a set of instructions. The set of instructions are defined using a set of operand identifiers. The operand identifiers represent packet identifiers in the set of packet identifiers. In specific implementations the set of the operand identifiers represent packet identifiers in the set of packet identifiers in that a set of memories on the set of processing cores stores data values in common association with both the set of packets, and a set of operands identified by the set of operand identifiers. In specific implementations the set of operand identifiers and packet identifiers are unambiguously mapped to an underlying set of application datums of the complex computation.

Type: Application

Filed: March 3, 2022

Publication date: June 16, 2022

Applicant: Tenstorrent Inc.

Inventors: Davor Capalija, Ljubisa Bajic, Jasmina Vasiljevic
Processing core with operation suppression based on contribution estimate

Patent number: 11301264

Abstract: Processing cores with the ability to suppress operations based on a contribution estimate for those operations for purposes of increasing the overall performance of the core are disclosed. Associated methods that can be conducted by such processing cores are also disclosed. One such method includes generating a reference value for a composite computation. A complete execution of the composite computation generates a precise output and requires execution of a set of component computations. The method also includes generating a component computation approximation. The method also includes evaluating the component computation approximation with the reference value. The method also includes executing a partial execution of the composite computation using the component computation approximation to produce an estimated output.

Type: Grant

Filed: February 11, 2020

Date of Patent: April 12, 2022

Assignee: Tenstorrent Inc.

Inventors: Ljubisa Bajic, Milos Trajkovic, Ivan Hamer, Syed Gilani
Overlay Layer Hardware Unit for Network of Processor Cores

Publication number: 20220100503

Abstract: Methods and systems for executing an application data flow graph on a set of computational nodes are disclosed. The computational nodes can each include a programmable controller from a set of programmable controllers, a memory from a set of memories, a network interface unit from a set of network interface units, and an endpoint from a set of endpoints. A disclosed method comprises configuring the programmable controllers with instructions. The method also comprises independently and asynchronously executing the instructions using the set of programmable controllers in response to a set of events exchanged between the programmable controllers themselves, between the programmable controllers and the network interface units, and between the programmable controllers and the set of endpoints. The method also comprises transitioning data in the set of memories on the computational nodes in accordance with the application data flow graph and in response to the execution of the instructions.

Type: Application

Filed: September 28, 2020

Publication date: March 31, 2022

Applicant: Tenstorrent Inc.

Inventors: Ivan Matosevic, Davor Capalija, Jasmina Vasiljevic, Utku Aydonat, S. Alexander Chin, Djordje Maksimovic, Ljubisa Bajic
Processor cores using packet identifiers for routing and computation

Patent number: 11269628

Abstract: Processor cores using packet identifiers for routing and computation are disclosed. One method includes executing a complex computation using a set of processing cores. The method includes routing a set of packets using a set of packet identifiers and executing a set of instructions. The set of instructions are defined using a set of operand identifiers. The operand identifiers represent packet identifiers in the set of packet identifiers. In specific implementations the set of the operand identifiers represent packet identifiers in the set of packet identifiers in that a set of memories on the set of processing cores stores data values in common association with both the set of packets, and a set of operands identified by the set of operand identifiers. In specific implementations the set of operand identifiers and packet identifiers are unambiguously mapped to an underlying set of application datums of the complex computation.

Type: Grant

Filed: June 15, 2020

Date of Patent: March 8, 2022

Assignee: Tenstorrent Inc.

Inventors: Davor Capalija, Ljubisa Bajic, Jasmina Vasiljevic
Speculative resource allocation for routing on interconnect fabrics

Patent number: 11245643

Abstract: Methods and systems related to speculative resource allocation for routing on an interconnect fabric are disclosed herein. One disclosed method includes speculatively allocating a collection of resources to support a set of paths through an interconnect fabric. The method also includes aggregating a set of responses from the set of paths at a branch node on the set of paths. If a resource contention is detected, the set of responses will include an indicator of a resource contention. The method will then further include transmitting, from the branch node and in response to the indicator of the resource contention, a deallocate message downstream and the indicator of the resource contention upstream, and reallocating resources for the multicast after a hold period.

Type: Grant

Filed: May 20, 2020

Date of Patent: February 8, 2022

Assignee: Tenstorrent Inc.

Inventors: Ivan Matosevic, Ljubisa Bajic
PROCESSING CORE WITH METADATA ACTUATED CONDITIONAL GRAPH EXECUTION

Publication number: 20210382716

Abstract: A processing core and associated methods for the efficient execution of a directed graph are disclosed. A disclosed processing core includes a memory and a first data tile stored in the memory. The first data tile includes a first set of data elements and metadata stored in association with the first set of data elements. The processing core also includes a second data tile stored in the memory. The second data tile includes a second set of data elements. The processing core also includes an arithmetic logic unit configured to conduct an arithmetic logic operation using data from the first set of data elements and the second set of data elements. The processing core also includes a control unit configured to evaluate the metadata and control the arithmetic logic unit to conditionally execute the arithmetic logic operation based on the evaluation of the metadata.

Type: Application

Filed: August 23, 2021

Publication date: December 9, 2021

Applicant: Tenstorrent Inc.

Inventors: Ljubisa Bajic, Milos Trajkovic, Ivan Hamer, Lejla Bajic, Aleksandar Cejkov
SPECULATIVE RESOURCE ALLOCATION FOR ROUTING ON INTERCONNECT FABRICS

Publication number: 20210367905

Abstract: Methods and systems related to speculative resource allocation for routing on an interconnect fabric are disclosed herein. One disclosed method includes speculatively allocating a collection of resources to support a set of paths through an interconnect fabric. The method also includes aggregating a set of responses from the set of paths at a branch node on the set of paths. If a resource contention is detected, the set of responses will include an indicator of a resource contention. The method will then further include transmitting, from the branch node and in response to the indicator of the resource contention, a deallocate message downstream and the indicator of the resource contention upstream, and reallocating resources for the multicast after a hold period.

Type: Application

Filed: May 20, 2020

Publication date: November 25, 2021

Applicant: Tenstorrent Inc.

Inventors: Ivan Matosevic, Ljubisa Bajic
Processing core with metadata actuated conditional graph execution

Patent number: 11113051

Abstract: A processing core and associated methods for the efficient execution of a directed graph are disclosed. A disclosed processing core comprises a memory and a first data tile stored in the memory. The first data tile includes a first set of data elements and metadata stored in association with the first set of data elements. The processing core also comprises a second data tile stored in the memory. The second data tile includes a second set of data elements. The processing core also comprises an arithmetic logic unit configured to conduct an arithmetic logic operation using data from the first set of data elements and the second set of data elements. The processing core also comprises a control unit configured to evaluate the metadata and control the arithmetic logic unit to conditionally execute the arithmetic logic operation based on the evaluation of the metadata.

Type: Grant

Filed: October 8, 2018

Date of Patent: September 7, 2021

Assignee: Tenstorrent Inc.

Inventors: Ljubisa Bajic, Milos Trajkovic, Ivan Hamer, Lejla Bajic, Aleksandar Cejkov
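
As an illustration of the abstract for patent 11113051, the following minimal Python sketch models a control step that evaluates tile metadata before running an arithmetic operation. It is not the patented design. The `Tile` class, the `all_zero` flag used as metadata, and the choice of a dot product as the arithmetic logic operation are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Tile:
    """A data tile. `all_zero` is hypothetical metadata stored with the data."""
    data: List[float]
    all_zero: bool = False


def make_tile(data: List[float]) -> Tile:
    # Compute the (assumed) metadata once, when the tile is written to memory.
    values = list(data)
    return Tile(values, all(x == 0 for x in values))


def conditional_dot(first: Tile, second: Tile, stats: Dict[str, int]) -> float:
    """Evaluate the first tile's metadata, then run or skip the operation."""
    if first.all_zero:
        # The result is known to be zero, so the arithmetic is skipped.
        stats["skipped"] += 1
        return 0.0
    stats["executed"] += 1
    return float(sum(x * y for x, y in zip(first.data, second.data)))


stats = {"skipped": 0, "executed": 0}
weights = make_tile([1.0, 2.0, 3.0])
print(conditional_dot(make_tile([0.0, 0.0, 0.0]), weights, stats))
print(conditional_dot(make_tile([1.0, 0.0, 2.0]), weights, stats))
print(stats)
```

In this sketch only the first tile carries metadata, mirroring the abstract; a real core could evaluate richer metadata than a single flag.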
PROCESSING CORE WITH DATA ASSOCIATIVE ADAPTIVE ROUNDING

Publication number: 20210271450

Abstract: Processing cores with data associative adaptive rounding and associated methods are disclosed herein. One disclosed processing core comprises an arithmetic logic unit cluster configured to generate a value for a unit of directed graph data using input directed graph data, a comparator coupled to a threshold register and a data register, a core controller configured to load a threshold value into the threshold register when the value for the unit of directed graph data is loaded into the data register, and a rounding circuit. The rounding circuit is configured to receive the value for the unit of directed graph data from the arithmetic logic unit cluster and conditionally round the value for the unit of directed graph data based on a comparator output from the comparator.

Type: Application

Filed: May 17, 2021

Publication date: September 2, 2021

Applicant: Tenstorrent Inc.

Inventors: Ljubisa Bajic, Alex Cejkov, Lejla Bajic
Processing core with data associative adaptive rounding

Patent number: 11010132

Abstract: Processing cores with data associative adaptive rounding and associated methods are disclosed herein. One disclosed processing core comprises an arithmetic logic unit cluster configured to generate a value for a unit of directed graph data using input directed graph data, a comparator coupled to a threshold register and a data register, a core controller configured to load a threshold value into the threshold register when the value for the unit of directed graph data is loaded into the data register, and a rounding circuit. The rounding circuit is configured to receive the value for the unit of directed graph data from the arithmetic logic unit cluster and conditionally round the value for the unit of directed graph data based on a comparator output from the comparator.

Type: Grant

Filed: September 17, 2019

Date of Patent: May 18, 2021

Assignee: Tenstorrent Inc.

Inventors: Ljubisa Bajic, Alex Cejkov, Lejla Bajic
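
The comparator-and-threshold flow in the abstract for patent 11010132 can be pictured with a small Python sketch. Everything specific here is an assumption for illustration: the threshold is derived as a fraction of the largest input magnitude, the comparison is on absolute value, and "rounding" means rounding to zero. The abstract itself does not fix any of these choices.

```python
from typing import Iterable, List


def adaptive_round(values: Iterable[float],
                   inputs: Iterable[float],
                   fraction: float = 0.1) -> List[float]:
    """Round each computed value to zero when it falls below a data-derived threshold."""
    # Hypothetical threshold rule: a fraction of the largest input magnitude.
    threshold = fraction * max((abs(x) for x in inputs), default=0.0)
    rounded = []
    for value in values:
        # Stand-in for the comparator output: is the value below the threshold?
        below_threshold = abs(value) < threshold
        # Stand-in for the rounding circuit: round only when the comparator says so.
        rounded.append(0.0 if below_threshold else value)
    return rounded


print(adaptive_round([0.05, -2.0, 0.3], inputs=[1.0, -4.0], fraction=0.1))
```

Because the threshold depends on the data being processed, the same computed value may be kept for one input set and rounded away for another.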
Processing core data compression and storage system

Patent number: 10938413

Abstract: Methods and systems regarding the rapid and efficient compression and decompression of sparse data are disclosed. One method for compressing a set of data from a sparse matrix includes evaluating a sequence of data entries from the set of data, extracting a sequence of sparse data values from the sequence, extracting a sequence of non-sparse data value run lengths from the sequence, formulating a set of row pointers from the sequence, storing the sequence of sparse data values in a first set of memory addresses, and storing the sequence of non-sparse data value run lengths in a second set of memory addresses. The set of row pointers identify a set of rows of the sparse matrix in both the first and second sets of memory addresses. Rapid decompression can be conducted using the row pointers.

Type: Grant

Filed: April 17, 2020

Date of Patent: March 2, 2021

Assignee: Tenstorrent Inc.

Inventors: Ljubisa Bajic, Alex Cejkov, Lejla Bajic
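
The compression steps listed in the abstract for patent 10938413 can be sketched roughly in Python as follows. This is one reading of the abstract, not the patented implementation. It assumes that "sparse data values" are the non-zero entries, that "non-sparse data value run lengths" count the zeros preceding each non-zero entry, and that one row pointer indexes both arrays. The function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def compress(matrix: Sequence[Sequence[int]]) -> Tuple[List[int], List[int], List[int]]:
    """Return (values, zero_runs, row_ptrs) for a row-major sparse matrix."""
    values: List[int] = []     # non-zero entries (first set of addresses)
    zero_runs: List[int] = []  # zeros preceding each entry (second set of addresses)
    row_ptrs: List[int] = []   # start index of each row in both arrays
    for row in matrix:
        row_ptrs.append(len(values))
        run = 0
        for entry in row:
            if entry == 0:
                run += 1
            else:
                values.append(entry)
                zero_runs.append(run)
                run = 0
        # Trailing zeros are not stored; they are implied by the row width.
    row_ptrs.append(len(values))  # end marker so each row has a start and an end
    return values, zero_runs, row_ptrs


def decompress_row(values: List[int], zero_runs: List[int],
                   row_ptrs: List[int], row: int, width: int) -> List[int]:
    """Rebuild one row directly, using only that row's pointer range."""
    out: List[int] = []
    for i in range(row_ptrs[row], row_ptrs[row + 1]):
        out.extend([0] * zero_runs[i])
        out.append(values[i])
    out.extend([0] * (width - len(out)))  # restore trailing zeros
    return out


matrix = [[0, 0, 5, 0], [7, 0, 0, 0], [0, 0, 0, 0], [0, 3, 0, 9]]
values, zero_runs, row_ptrs = compress(matrix)
print(values, zero_runs, row_ptrs)
print(decompress_row(values, zero_runs, row_ptrs, row=3, width=4))
```

The row pointers let any single row be decompressed without scanning the rows before it, which is one way to read the abstract's remark that rapid decompression can be conducted using the row pointers.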
Processing Core with Meta Data Actuated Conditional Graph Execution

Publication number: 20210042118

Abstract: A processing core for the efficient execution of a directed graph is disclosed. The processing core includes a memory and a first and a second data tile stored in the memory. The first and second data tiles include a first and a second set of data elements stored contiguously in the memory. The processing core also includes metadata relationally stored with the first data tile in the memory. The processing core also includes an execution engine, a control unit, and an instruction. Execution of the instruction uses the execution engine, a first data element in the first set of data elements, and a second data element in the second set of data elements. The control unit conditions execution of the instruction using the metadata. A standard execution of the instruction generates a standard output. A conditional execution of the instruction operation generates a conditionally executed output.

Type: Application

Filed: October 26, 2020

Publication date: February 11, 2021

Applicant: Tenstorrent Inc.

Inventors: Ljubisa Bajic, Milos Trajkovic, Ivan Hamer
OVERLAY LAYER FOR NETWORK OF PROCESSOR CORES

Publication number: 20210034373

Abstract: Methods and systems related to the efficient execution of complex computations by a multicore processor and the movement of data among the various processing cores in the multicore processor are disclosed. A multicore processor stack for the multicore processor can include a computation layer, for conducting computations using the processing cores in the multicore processor, with executable instructions for processing pipelines in the processing cores. The multicore processor stack can also include a network-on-chip layer, for connecting the processing cores in the multicore processor, with executable instructions for routers and network interface units in the multicore processor. The computation layer and the network-on-chip layer can be logically isolated by a network-on-chip overlay layer.

Type: Application

Filed: July 29, 2020

Publication date: February 4, 2021

Applicant: Tenstorrent Inc.

Inventors: Davor Capalija, Ivan Matosevic, Jasmina Vasiljevic, Utku Aydonat, Andrew Lewycky, S. Alexander Chin, Ljubisa Bajic
