Patents by Inventor Gerald George Pechanek

Gerald George Pechanek has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20200250131
    Abstract: An Execution Array Memory Array (XarMa©) processor is described for signal processing and internet of things (IoT) applications, (pronounced sharma, that means happiness in Sanskrit). The XarMa© processor uses a 1 to K+1 adjacency network in an array of execution units. The 1 to K+1 adjacency refers to connections separately made in rows and in columns of execution unit and local file nodes, where the number of Rows?K>1 and of Columns?K>1 and K is an odd integer. Instead of a large central multi-ported register file, a distributed set of storage files local to each execution unit is used. The instruction set architecture uses instructions that specify forwarding of execution results to execution units associated with destination instructions. This execution array is scalable to support cost effective and low power high-performance application specific processing focused on target product requirements.
    Type: Application
    Filed: February 4, 2020
    Publication date: August 6, 2020
    Inventor: Gerald George Pechanek
  • Patent number: 10503515
    Abstract: A system for pipelining signal flow graphs by a plurality of shared memory processors organized in a 3D physical arrangement with the memory overlaid on the processor nodes that reduces storage of temporary variables. A group function formed by two or more instructions to specify two or more parts of the group function. A first instruction specifies a first part and specifies control information for a second instruction adjacent to the first instruction or at a pre-specified location relative to the first instruction. The first instruction when executed transfers the control information to a pending register and produces a result which is transferred to an operand input associated with the second instruction. The second instruction specifies a second part of the group function and when executed transfers the control information from the pending register to a second execution unit to adjust the second execution unit's operation on the received operand.
    Type: Grant
    Filed: March 2, 2018
    Date of Patent: December 10, 2019
    Inventor: Gerald George Pechanek
  • Patent number: 10078517
    Abstract: A system for pipelining signal flow graphs by a plurality of shared memory processors organized in a 3D physical arrangement with the memory overlaid on the processor nodes that reduces storage of temporary variables. A group function formed by two or more instructions to specify two or more parts of the group function. A first instruction specifies a first part and specifies control information for a second instruction adjacent to the first instruction or at a pre-specified location relative to the first instruction. The first instruction when executed transfers the control information to a pending register and produces a result which is transferred to an operand input associated with the second instruction. The second instruction specifies a second part of the group function and when executed transfers the control information from the pending register to a second execution unit to adjust the second execution unit's operation on the received operand.
    Type: Grant
    Filed: August 16, 2016
    Date of Patent: September 18, 2018
    Inventor: Gerald George Pechanek
  • Publication number: 20180189068
    Abstract: A system for pipelining signal flow graphs by a plurality of shared memory processors organized in a 3D physical arrangement with the memory overlaid on the processor nodes that reduces storage of temporary variables. A group function formed by two or more instructions to specify two or more parts of the group function. A first instruction specifies a first part and specifies control information for a second instruction adjacent to the first instruction or at a pre-specified location relative to the first instruction. The first instruction when executed transfers the control information to a pending register and produces a result which is transferred to an operand input associated with the second instruction. The second instruction specifies a second part of the group function and when executed transfers the control information from the pending register to a second execution unit to adjust the second execution unit's operation on the received operand.
    Type: Application
    Filed: March 2, 2018
    Publication date: July 5, 2018
    Inventor: Gerald George Pechanek
  • Patent number: 9672033
    Abstract: Techniques are described for loading decoded instructions and super-set instructions in a memory for later access. For loading a decoded instruction, the decoded instruction is a transformed form of an original instruction that was stored in the program memory. The transformation is from an encoded assembly level format to a binary machine level format. In one technique, the transformation mechanism is invoked by a transform and load instruction that causes an instruction retrieved from program memory to be transformed into a new language format and then loaded into a transformed instruction memory. The format of the transformed instruction may be optimized to the implementation requirements, such as improving critical path timing. The transformation of instructions may extend to other needs beyond timing path improvement, for example, requiring super-set instructions for increased functionality and improvements to instruction level parallelism.
    Type: Grant
    Filed: January 8, 2009
    Date of Patent: June 6, 2017
    Assignee: Altera Corporation
    Inventors: Gerald George Pechanek, Larry D. Larsen
  • Publication number: 20160357569
    Abstract: A system for pipelining signal flow graphs by a plurality of shared memory processors organized in a 3D physical arrangement with the memory overlaid on the processor nodes that reduces storage of temporary variables. A group function formed by two or more instructions to specify two or more parts of the group function. A first instruction specifies a first part and specifies control information for a second instruction adjacent to the first instruction or at a pre-specified location relative to the first instruction. The first instruction when executed transfers the control information to a pending register and produces a result which is transferred to an operand input associated with the second instruction. The second instruction specifies a second part of the group function and when executed transfers the control information from the pending register to a second execution unit to adjust the second execution unit's operation on the received operand.
    Type: Application
    Filed: August 16, 2016
    Publication date: December 8, 2016
    Inventor: Gerald George Pechanek
  • Patent number: 9507603
    Abstract: A system for pipelining signal flow graphs by a plurality of shared memory processors organized in a 3D physical arrangement with the memory overlaid on the processor nodes that reduces storage of temporary variables. A group function formed by two or more instructions to specify two or more parts of the group function. A first instruction specifies a first part and specifies control information for a second instruction adjacent to the first instruction or at a pre-specified location relative to the first instruction. The first instruction when executed transfers the control information to a pending register and produces a result which is transferred to an operand input associated with the second instruction. The second instruction specifies a second part of the group function and when executed transfers the control information from the pending register to a second execution unit to adjust the second execution unit's operation on the received operand.
    Type: Grant
    Filed: August 2, 2014
    Date of Patent: November 29, 2016
    Inventor: Gerald George Pechanek
  • Patent number: 9460048
    Abstract: A method and apparatus for creating and executing a packet of chained instructions in a processor. A first instruction specifies a first operand is to be accessed from a memory and delivered through a first path in a first network to a first output. A second instruction specifies the first operand is to be received from the first output, to operate on the first operand, and to generate a result delivered to a second output. The second instruction does not identify a source device for the first operand and a destination device for the result. A third instruction specifies the first result is to be received from the second output and delivered through a first path in a second network for storage in the memory. The first, second, and third instructions are paired together as a packet of chained instructions for execution by a processor.
    Type: Grant
    Filed: March 9, 2013
    Date of Patent: October 4, 2016
    Inventor: Gerald George Pechanek
  • Patent number: 9390057
    Abstract: An array processor includes processing elements arranged in clusters to form a rectangular array. Inter-cluster communication paths are mutually exclusive. Due to the mutual exclusivity of the data paths, communications between the processing elements of each cluster may be combined in a single inter-cluster path, thus eliminating half the wiring required for the path. The length of the longest communication path is not directly determined by the overall dimension of the array, as in conventional torus arrays. Rather, the longest communications path is limited by the inter-cluster spacing. Transpose elements of an N×N torus may be combined in clusters and communicate with one another through intra-cluster communications paths. Transpose operation latency is eliminated in this approach. Each PE may have a single transmit port and a single receive port. Thus, the individual PEs are decoupled from the array topology.
    Type: Grant
    Filed: September 14, 2012
    Date of Patent: July 12, 2016
    Assignee: Altera Corporation
    Inventors: Gerald George Pechanek, Charles W. Kurak, Jr.
  • Patent number: 9329866
    Abstract: Processor pipeline controlling techniques are described which take advantage of the variation in critical path lengths of different instructions to achieve increased performance. By examining a processor's instruction set and execution unit implementation's critical timing paths, instructions are classified into speed classes. Based on these speed classes, one pipeline is presented where hold signals are used to dynamically control the pipeline based on the instruction class in execution. An alternative pipeline supporting multiple classes of instructions is presented where the pipeline clocking is dynamically changed as a result of decoded instruction class signals. A single pass synthesis methodology for multi-class execution stage logic is also described. For dynamic class variable pipeline processors, the mix of instructions can have a great effect on processor performance and power utilization since both can vary by the program mix of instruction classes.
    Type: Grant
    Filed: February 28, 2013
    Date of Patent: May 3, 2016
    Assignee: Altera Corporation
    Inventors: Edwin Franklin Barry, Gerald George Pechanek, Patrick R. Marchand
  • Patent number: 9300958
    Abstract: Various approaches for motion search refinement in a processing element are discussed. A k/2+L+k/2 register stores an expanded row of an L×L macro block. A k-tap filter horizontally interpolates over the expanded row generating horizontal interpolation results. A transpose storage unit stores the interpolated results generated by the k-tap filter for k/2+L+k/2 entries, wherein rows or columns of data may be read out of the transpose storage unit in pipelined register stages. A k-tap filter vertically interpolates over the pipelined register stages generating vertical interpolation results.
    Type: Grant
    Filed: February 24, 2014
    Date of Patent: March 29, 2016
    Assignee: Altera Corporation
    Inventors: Mihailo M. Stojancic, Gerald George Pechanek
  • Patent number: 9158547
    Abstract: Hardware and software techniques for interrupt detection and response in a scalable pipelined array processor environment are described. Utilizing these techniques, a sequential program execution model with interrupts can be maintained in a highly parallel scalable pipelined array processing containing multiple processing elements and distributed memories and register files. When an interrupt occurs, interface signals are provided to all PEs to support independent interrupt operations in each PE dependent upon the local PE instruction sequence prior to the interrupt. Processing/element exception interrupts are supported and low latency interrupt processing is also provided for embedded systems where real time signal processing is required. Further, a hierarchical interrupt structure is used allowing a generalized debug approach using debut interrupts and a dynamic debut monitor mechanism.
    Type: Grant
    Filed: April 29, 2014
    Date of Patent: October 13, 2015
    Assignee: Altera Corporation
    Inventors: Edwin Franklin Barry, Patrick R. Marchand, Gerald George Pechanek, Larry D. Larsen
  • Patent number: 9075651
    Abstract: Efficient computation of complex long multiplication results and an efficient calculation of a covariance matrix are described. A parallel array VLIW digital signal processor is employed along with specialized complex long multiplication instructions and communication operations between the processing elements which are overlapped with computation to provide very high performance operation. Successive iterations of a loop of tightly packed VLIWs may be used allowing the complex multiplication pipeline hardware to be efficiently used.
    Type: Grant
    Filed: September 13, 2012
    Date of Patent: July 7, 2015
    Assignee: Altera Corporation
    Inventors: Gerald George Pechanek, Ricardo Rodriguez, Matthew Plonski, David Strube, Kevin Coopman
  • Patent number: 9063722
    Abstract: A control processor is used for fetching and distributing single instruction multiple data (SIMD) instructions to a plurality of processing elements (PEs). One of the SIMD instructions is a thread start (Tstart) instruction, which causes the control processor to pause its instruction fetching. A local PE instruction memory (PE Imem) is associated with each PE and contains local PE instructions for execution on the local PE. Local PE Imem fetch, decode, and execute logic are associated with each PE. Instruction path selection logic in each PE is used to select between control processor distributed instructions and local PE instructions fetched from the local PE Imem. Each PE is also initialized to receive control processor distributed instructions. In addition, local hold generation logic is associated with each PE. A PE receiving a Tstart instruction causes the instruction path selection logic to switch to fetch local PE Imem instructions.
    Type: Grant
    Filed: December 21, 2011
    Date of Patent: June 23, 2015
    Assignee: Altera Corporation
    Inventors: Gerald George Pechanek, Edwin Franklin Barry, Mihailo Stojancic
  • Patent number: 9060169
    Abstract: Apparatus and methods for scalable block pixel filtering are described. A block filtering instruction is issued to a processing element (PE) to initiate block pixel filtering hardware by causing at least one command and at least one parameter be sent to a command and control function associated with the PE. A block of pixels is fetched from a PE local memory to be stored in a register file of a hardware assist module. A sub-block of pixels is processed to generate sub-block parameters and the block of pixels is filtered in a horizontal/vertical edge filtering computation pipeline using the sub-block parameters.
    Type: Grant
    Filed: August 28, 2013
    Date of Patent: June 16, 2015
    Assignee: Altera Corporation
    Inventors: Mihailo M. Stojancic, Gerald George Pechanek
  • Patent number: 9015354
    Abstract: Efficient computation of complex multiplication results and very efficient fast Fourier transforms (FFTs) are provided. A parallel array VLIW digital signal processor is employed along with specialized complex multiplication instructions and communication operations between the processing elements which are overlapped with computation to provide very high performance operation. Successive iterations of a loop of tightly packed VLIWs are used allowing the complex multiplication pipeline hardware to be efficiently used. In addition, efficient techniques for supporting combined multiply accumulate operations are described.
    Type: Grant
    Filed: August 11, 2014
    Date of Patent: April 21, 2015
    Assignee: Altera Corporation
    Inventors: Nikos P. Pitsianis, Gerald George Pechanek, Ricardo Rodriguez
  • Patent number: 9009365
    Abstract: Details of a highly cost effective and efficient implementation of a manifold array (ManArray) architecture and instruction syntax for use therewith are described herein. Various aspects of this approach include the regularity of the syntax, the relative ease with which the instruction set can be represented in database form, the ready ability with which tools can be created, the ready generation of self-checking codes and parameterized test cases. Parameterizations can be fairly easily mapped and system maintenance is significantly simplified.
    Type: Grant
    Filed: February 20, 2013
    Date of Patent: April 14, 2015
    Assignee: Altera Corporation
    Inventors: Gerald George Pechanek, David Strube, Edwin Franklin Barry, Charles W. Kurak, Jr., Carl Donald Busboom, Dale Edward Schneider, Nikos P. Pitsianis, Grayson Morris, Edward A. Wolff, Patrick R. Marchand, Ricardo Rodriguez, Marco Jacobs
  • Publication number: 20150039856
    Abstract: Efficient computation of complex multiplication results and very efficient fast Fourier transforms (FFTs) are provided. A parallel array VLIW digital signal processor is employed along with specialized complex multiplication instructions and communication operations between the processing elements which are overlapped with computation to provide very high performance operation. Successive iterations of a loop of tightly packed VLIWs are used allowing the complex multiplication pipeline hardware to be efficiently used. In addition, efficient techniques for supporting combined multiply accumulate operations are described.
    Type: Application
    Filed: August 11, 2014
    Publication date: February 5, 2015
    Applicant: Altera Corporation
    Inventors: Nikos P. Pitsianis, Gerald George Pechanek, Ricardo Rodriguez
  • Publication number: 20150039855
    Abstract: A system for pipelining signal flow graphs by a plurality of shared memory processors organized in a 3D physical arrangement with the memory overlaid on the processor nodes that reduces storage of temporary variables. A group function formed by two or more instructions to specify two or more parts of the group function. A first instruction specifies a first part and specifies control information for a second instruction adjacent to the first instruction or at a pre-specified location relative to the first instruction. The first instruction when executed transfers the control information to a pending register and produces a result which is transferred to an operand input associated with the second instruction. The second instruction specifies a second part of the group function and when executed transfers the control information from the pending register to a second execution unit to adjust the second execution unit's operation on the received operand.
    Type: Application
    Filed: August 2, 2014
    Publication date: February 5, 2015
    Inventor: Gerald George Pechanek
  • Patent number: 8904152
    Abstract: Efficient computation of complex multiplication results and very efficient fast Fourier transforms (FFTs) are provided. A parallel array VLIW digital signal processor is employed along with specialized complex multiplication instructions and communication operations between the processing elements which are overlapped with computation to provide very high performance operation. Successive iterations of a loop of tightly packed VLIWs are used allowing the complex multiplication pipeline hardware to be efficiently used. In addition, efficient techniques for supporting combined multiply accumulate operations are described.
    Type: Grant
    Filed: May 26, 2011
    Date of Patent: December 2, 2014
    Assignee: Altera Corporation
    Inventors: Nikos P. Pitsianis, Gerald George Pechanek, Ricardo Rodriguez