Patents Assigned to Cornami, Inc.
-
Patent number: 11973692Abstract: A method and system for providing robust streaming of data from a multi-core die is disclosed. The techniques include using a high bandwidth memory (HBM) device as retransmit buffers for large amounts of data to ensure robust communication in relatively high round trip-transmission time (RTT) transmission. Another technique is supporting two or more Ethernet ports between components to both transmit the same data packets on the two ports to insure robustness. Another technique is to use sequence numbers and send data packets from the different ports in a round robin fashion and reorder the packets upon receipt of an external device. Another technique is dynamically adding and removing paths for data packets between devices with multiple ports based on the quality of the path.Type: GrantFiled: December 5, 2022Date of Patent: April 30, 2024Assignee: Cornami, Inc.Inventor: Krishnamurthy Subramanian
-
Patent number: 11907157Abstract: A representative reconfigurable processing circuit and a reconfigurable arithmetic circuit are disclosed, each of which may include input reordering queues; a multiplier shifter and combiner network coupled to the input reordering queues; an accumulator circuit; and a control logic circuit, along with a processor and various interconnection networks. A representative reconfigurable arithmetic circuit has a plurality of operating modes, such as floating point and integer arithmetic modes, logical manipulation modes, Boolean logic, shift, rotate, conditional operations, and format conversion, and is configurable for a wide variety of multiplication modes. Dedicated routing connecting multiplier adder trees allows multiple reconfigurable arithmetic circuits to be reconfigurably combined, in pair or quad configurations, for larger adders, complex multiplies and general sum of products use, for example.Type: GrantFiled: December 31, 2022Date of Patent: February 20, 2024Assignee: Cornami, Inc.Inventors: Paul L. Master, Steven K. Knapp, Raymond J. Andraka, Alexei Beliaev, Martin A. Franz, Rene Meessen, Frederick Curtis Furtek
-
Patent number: 11886377Abstract: A representative reconfigurable processing circuit and a reconfigurable arithmetic circuit are disclosed, each of which may include input reordering queues; a multiplier shifter and combiner network coupled to the input reordering queues; an accumulator circuit; and a control logic circuit, along with a processor and various interconnection networks. A representative reconfigurable arithmetic circuit has a plurality of operating modes, such as floating point and integer arithmetic modes, logical manipulation modes, Boolean logic, shift, rotate, conditional operations, and format conversion, and is configurable for a wide variety of multiplication modes. Dedicated routing connecting multiplier adder trees allows multiple reconfigurable arithmetic circuits to be reconfigurably combined, in pair or quad configurations, for larger adders, complex multiplies and general sum of products use, for example.Type: GrantFiled: September 9, 2020Date of Patent: January 30, 2024Assignee: Cornami, Inc.Inventor: Raymond J. Andraka
-
Patent number: 11853256Abstract: An apparatus, computer-readable medium, and computer-implemented method for parallelization of a computer program on a plurality of computing cores includes receiving a computer program comprising a plurality of commands, decomposing the plurality of commands into a plurality of node networks, each node network corresponding to a command in the plurality of commands and including one or more nodes corresponding to execution dependencies of the command, mapping the plurality of node networks to a plurality of systolic arrays, each systolic array comprising a plurality of cells and each non-data node in each node network being mapped to a cell in the plurality of cells, and mapping each cell in each systolic array to a computing core in the plurality of computing cores.Type: GrantFiled: June 11, 2018Date of Patent: December 26, 2023Assignee: CORNAMI, INC.Inventors: Solomon Harsha, Paul Master
-
Patent number: 11693662Abstract: Systems and methods for configuring a reduced instruction set computer processor architecture to execute fully homomorphic encryption (FHE) logic gates as a streaming topology. The method includes parsing sequential FHE logic gate code, transforming the FHE logic gate code into a set of code modules that each have in input and an output that is a function of the input and which do not pass control to other functions, creating a node wrapper around each code module, configuring at least one of the primary processing cores to implement the logic element equivalents of each element in a manner which operates in a streaming mode wherein data streams out of corresponding arithmetic logic units into the main memory and other ones of the plurality arithmetic logic units.Type: GrantFiled: January 15, 2020Date of Patent: July 4, 2023Assignee: CORNAMI INC.Inventors: Morris Jacob Creeger, Tianfang Liu, Frederick Furtek, Paul L. Master
-
Patent number: 11669526Abstract: Representative embodiments are disclosed for a rapid and highly parallel decompression of compressed executable and other files, such as executable files for operating systems and applications, having compressed blocks including run length encoded (“RLE”) data having data-dependent references. An exemplary embodiment includes a plurality of processors or processor cores to identify a start or end of each compressed block; to partially decompress, in parallel, a selected compressed block into independent data, dependent (RLE) data, and linked dependent (RLE) data; to sequence the independent data, dependent (RLE) data, and linked dependent (RLE) data from a plurality of partial decompressions of a plurality of compressed blocks, to obtain data specified by the dependent (RLE) data and linked dependent (RLE) data, and to insert the obtained data into a corresponding location in an uncompressed file.Type: GrantFiled: September 5, 2021Date of Patent: June 6, 2023Assignee: Cornami, Inc.Inventors: Paul L. Master, Frederick Curtis Furtek, Kim Knuttila, L. Brian McGann
-
Patent number: 11599367Abstract: A system and method to compress application control data, such as weights for a layer of a convolutional neural network, is disclosed. A multi-core system for executing at least one layer of the convolutional neural network includes a storage device storing a compressed weight matrix of a set of weights of the at least one layer of the convolutional network and a decompression matrix. The compressed weight matrix is formed by matrix factorization and quantization of a floating point value of each weight to a floating point format. A decompression module is operable to obtain an approximation of the weight values by decompressing the compressed weight matrix through the decompression matrix. A plurality of cores executes the at least one layer of the convolutional neural network with the approximation of weight values to produce an inference output.Type: GrantFiled: January 24, 2020Date of Patent: March 7, 2023Assignee: Cornami, Inc.Inventor: Tianfang Liu
-
Patent number: 11522804Abstract: A method and system for providing robust streaming of data from a multi-core die is disclosed. The techniques include using a high bandwidth memory (HBM) device as retransmit buffers for large amounts of data to ensure robust communication in relatively high round trip-transmission time (RTT) transmission. Another technique is supporting two or more Ethernet ports between components to both transmit the same data packets on the two ports to insure robustness. Another technique is to use sequence numbers and send data packets from the different ports in a round robin fashion and reorder the packets upon receipt of an external device. Another technique is dynamically adding and removing paths for data packets between devices with multiple ports based on the quality of the path.Type: GrantFiled: March 20, 2020Date of Patent: December 6, 2022Assignee: Cornami, Inc.Inventor: Krishnamurthy Subramanian
-
Patent number: 11494331Abstract: A representative reconfigurable processing circuit and a reconfigurable arithmetic circuit are disclosed, each of which may include input reordering queues; a multiplier shifter and combiner network coupled to the input reordering queues; an accumulator circuit; and a control logic circuit, along with a processor and various interconnection networks. A representative reconfigurable arithmetic circuit has a plurality of operating modes, such as floating point and integer arithmetic modes, logical manipulation modes, Boolean logic, shift, rotate, conditional operations, and format conversion, and is configurable for a wide variety of multiplication modes. Dedicated routing connecting multiplier adder trees allows multiple reconfigurable arithmetic circuits to be reconfigurably combined, in pair or quad configurations, for larger adders, complex multiplies and general sum of products use, for example.Type: GrantFiled: September 9, 2020Date of Patent: November 8, 2022Assignee: Cornami, Inc.Inventors: Paul L. Master, Steven K. Knapp, Raymond J. Andraka, Alexei Beliaev, Martin A. Franz, Rene Meessen, Frederick Curtis Furtek
-
Publication number: 20220179823Abstract: Systems and methods for reconfiguring a reduced instruction set computer processor architecture are disclosed. Exemplary implementations may: provide a primary processing core consisting of a RISC processor; provide a node wrapper associated with each of the plurality of secondary cores, the node wrapper comprising access memory associates with each secondary core, and a load/unload matrix associated with each secondary core; operate the architecture in a manner in which, for at least one core, data is read from and written to the at least cache memory in a control-centric mode; the secondary cores are selectively partitioned to operate in a streaming mode wherein data streams out of the corresponding secondary core into the main memory and other ones of the plurality of secondary cores.Type: ApplicationFiled: February 25, 2022Publication date: June 9, 2022Applicant: Cornami Inc.Inventors: Paul L. Master, Frederick Furtek, Martin Alan Franz II, Raymond J. Andraka PE
-
Patent number: 11294851Abstract: Systems and methods for reconfiguring a reduced instruction set computer processor architecture are disclosed. Exemplary implementations may: provide a primary processing core consisting of a RISC processor; provide a node wrapper associated with each of the plurality of secondary cores, the node wrapper comprising access memory associates with each secondary core, and a load/unload matrix associated with each secondary core; operate the architecture in a manner in which, for at least one core, data is read from and written to the at least cache memory in a control-centric mode; the secondary cores are selectively partitioned to operate in a streaming mode wherein data streams out of the corresponding secondary core into the main memory and other ones of the plurality of secondary cores.Type: GrantFiled: May 4, 2018Date of Patent: April 5, 2022Assignee: Cornami, Inc.Inventors: Paul L. Master, Frederick Furtek, Martin Alan Franz, II, Raymond J. Andraka PE
-
Patent number: 11055103Abstract: A method and system of efficient use and programming of a multi-processing core device. The system includes a programming construct that is based on stream-domain code. A programmable core based computing device is disclosed. The computing device includes a plurality of processing cores coupled to each other. A memory stores stream-domain code including a stream defining a stream destination module and a stream source module. The stream source module places data values in the stream and the stream conveys data values from the stream source module to the stream destination module. A runtime system detects when the data values are available to the stream destination module and schedules the stream destination module for execution on one of the plurality of processing cores.Type: GrantFiled: September 10, 2018Date of Patent: July 6, 2021Assignee: Cornami, Inc.Inventors: Frederick Furtek, Paul Master
-
Patent number: 10817184Abstract: A computing system with a plurality of nodes is disclosed. At least one of the plurality nodes includes an execution unit configured to execute an operation. An interconnection network is coupled to the plurality of nodes. The interconnection network is configured to provide interconnections among the plurality of nodes. A control node is coupled to the plurality of nodes via the network to manage the execution of the operation by the one or more of the plurality of nodes.Type: GrantFiled: January 22, 2019Date of Patent: October 27, 2020Assignee: Cornami, Inc.Inventors: W. James Scheuermann, Eugene B. Hogenauer
-
Patent number: 10685023Abstract: Representative embodiments are disclosed for a rapid and highly parallel decompression of compressed executable and other files, such as executable files for operating systems and applications, having compressed blocks including run length encoded (“RLE”) data having data-dependent references. An exemplary embodiment includes a plurality of processors or processor cores to identify a start or end of each compressed block; to partially decompress, in parallel, a selected compressed block into independent data, dependent (RLE) data, and linked dependent (RLE) data; to sequence the independent data, dependent (RLE) data, and linked dependent (RLE) data from a plurality of partial decompressions of a plurality of compressed blocks, to obtain data specified by the dependent (RLE) data and linked dependent (RLE) data, and to insert the obtained data into a corresponding location in an uncompressed file.Type: GrantFiled: August 22, 2018Date of Patent: June 16, 2020Assignee: Cornami, Inc.Inventors: Paul L. Master, Frederick Curtis Furtek, Kim Knuttila, L. Brian McGann
-
Publication number: 20190340152Abstract: Systems and methods for reconfiguring a reduced instruction set computer processor architecture are disclosed. Exemplary implementations may: provide a primary processing core consisting of a RISC processor; provide a node wrapper associated with each of the plurality of secondary cores, the node wrapper comprising access memory associates with each secondary core, and a load/unload matrix associated with each secondary core; operate the architecture in a manner in which, for at least one core, data is read from and written to the at least cache memory in a control-centric mode; the secondary cores are selectively partitioned to operate in a streaming mode wherein data streams out of the corresponding secondary core into the main memory and other ones of the plurality of secondary cores.Type: ApplicationFiled: May 4, 2018Publication date: November 7, 2019Applicant: Cornami Inc.Inventors: Paul L. Master, Frederick Furtek, Martin Alan Franz, II, Raymond J. Andraka PE
-
Patent number: 10318260Abstract: A method and system of compiling and linking source stream programs for efficient use of multi-node devices. The system includes a compiler, a linker, a loader and a runtime component. The process converts a source code stream program to a compiled object code that is used with a programmable node based computing device having a plurality of processing nodes coupled to each other. The programming modules include stream statements for input values and output values in the form of sources and destinations for at least one of the plurality of processing nodes and stream statements that determine the streaming flow of values for the at least one of the plurality of processing nodes. The compiler converts the source code stream based program to object modules, object module instances and executables. The linker matches the object module instances to at least one of the multiple cores.Type: GrantFiled: April 2, 2018Date of Patent: June 11, 2019Assignee: Cornami, Inc.Inventors: Frederick Furtek, Paul Master
-
Patent number: 10185502Abstract: A computing system with a plurality of nodes is disclosed. At least one of the plurality nodes includes an execution unit configured to execute an operation. An interconnection network is coupled to the plurality of nodes. The interconnection network is configured to provide interconnections among the plurality of nodes. A control node is coupled to the plurality of nodes via the network to manage the execution of the operation by the one or more of the plurality of nodes.Type: GrantFiled: May 25, 2017Date of Patent: January 22, 2019Assignee: Cornami, Inc.Inventors: W. James Scheuermann, Eugene B. Hogenauer
-
Publication number: 20180293206Abstract: An apparatus, computer-readable medium, and computer-implemented method for parallelization of a computer program on a plurality of computing cores includes receiving a computer program comprising a plurality of commands, decomposing the plurality of commands into a plurality of node networks, each node network corresponding to a command in the plurality of commands and including one or more nodes corresponding to execution dependencies of the command, mapping the plurality of node networks to a plurality of systolic arrays, each systolic array comprising a plurality of cells and each non-data node in each node network being mapped to a cell in the plurality of cells, and mapping each cell in each systolic array to a computing core in the plurality of computing cores.Type: ApplicationFiled: June 11, 2018Publication date: October 11, 2018Applicant: CORNAMI, INC.Inventors: Solomon Harsha, Paul Master
-
Patent number: 10083209Abstract: Representative embodiments are disclosed for a rapid and highly parallel decompression of compressed executable and other files, such as executable files for operating systems and applications, having compressed blocks including run length encoded (“RLE”) data having data-dependent references. An exemplary embodiment includes a plurality of processors or processor cores to identify a start or end of each compressed block; to partially decompress, in parallel, a selected compressed block into independent data, dependent (RLE) data, and linked dependent (RLE) data; to sequence the independent data, dependent (RLE) data, and linked dependent (RLE) data from a plurality of partial decompressions of a plurality of compressed blocks, to obtain data specified by the dependent (RLE) data and linked dependent (RLE) data, and to insert the obtained data into a corresponding location in an uncompressed file.Type: GrantFiled: April 21, 2017Date of Patent: September 25, 2018Assignee: Cornami, Inc.Inventors: Paul L. Master, Frederick Curtis Furtek, Kim Knuttila, L. Brian McGann
-
Patent number: 10073700Abstract: A method and system of efficient use and programming of a multi-processing core device. The system includes a programming construct that is based on stream-domain code. A programmable core based computing device is disclosed. The computing device includes a plurality of processing cores coupled to each other. A memory stores stream-domain code including a stream defining a stream destination module and a stream source module. The stream source module places data values in the stream and the stream conveys data values from the stream source module to the stream destination module. A runtime system detects when the data values are available to the stream destination module and schedules the stream destination module for execution on one of the plurality of processing cores.Type: GrantFiled: September 22, 2014Date of Patent: September 11, 2018Assignee: Cornami, Inc.Inventors: Frederick Furtek, Paul Master