Patents by Inventor Terry Parks

Terry Parks has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20170102940
    Abstract: An array of N processing units (PU) each has: an accumulator; an arithmetic unit that performs an operation on first, second and third inputs to generate a result to store in the accumulator, where the first input receives the accumulator output and the second input receives a weight word; and a multiplexed register that has first and second data inputs, an output received by the third input to the arithmetic unit, and a control input that controls the data input selection. The multiplexed register output is also received by an adjacent PU's multiplexed register second data input. The N PUs' multiplexed registers collectively operate as an N-word rotater when the control input specifies the second data input. Respective first and second memories hold W and D rows of N weight and data words, respectively, and provide the N weight and data words to the corresponding weight inputs and multiplexed register first data inputs of the N PUs.
    Type: Application
    Filed: April 5, 2016
    Publication date: April 13, 2017
    Inventors: G. GLENN HENRY, TERRY PARKS
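    The rotate-and-accumulate scheme in this abstract can be modeled in software. The sketch below is an illustration only, not the patented hardware: `rotating_mac` and its argument names are hypothetical, and each loop iteration stands in for one clock of the N-PU array, with the data words rotating one position between weight rows.

```python
# Hypothetical software model of the N-word rotater described above:
# each of N processing units multiplies its current data word by a
# weight word and adds the product to its accumulator; between steps,
# the multiplexed registers pass each data word to the adjacent PU.

def rotating_mac(weight_rows, data_row):
    """weight_rows: W rows, each a list of N weight words.
    data_row: one row of N data words, loaded once and then rotated."""
    n = len(data_row)
    acc = [0] * n
    data = list(data_row)
    for row in weight_rows:
        for pu in range(n):
            acc[pu] += row[pu] * data[pu]   # multiply-accumulate
        data = data[1:] + data[:1]          # mux selects neighbor: rotate
    return acc
```

    One row of data thus feeds all N accumulators without being re-read from memory, which is the point of the rotater.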
  • Publication number: 20170103301
    Abstract: A processor has an instruction fetch unit that fetches ISA instructions from memory and execution units that perform operations on instruction operands to generate results according to the processor's ISA. A hardware neural network unit (NNU) execution unit performs computations associated with artificial neural networks (ANN). The NNU has an array of ALUs, a first memory that holds data words associated with ANN neuron outputs, and a second memory that holds weight words associated with connections between ANN neurons. Each ALU multiplies a portion of the data words by a portion of the weight words to generate products and accumulates the products in an accumulator as an accumulated value. Activation function units normalize the accumulated values to generate outputs associated with ANN neurons. The ISA includes at least one instruction that instructs the processor to write data words and the weight words to the respective first and second memories.
    Type: Application
    Filed: April 5, 2016
    Publication date: April 13, 2017
    Inventors: G. GLENN HENRY, TERRY PARKS
  • Publication number: 20170102941
    Abstract: An output buffer holds N words arranged as N/J mutually exclusive output buffer word groups (OBWG) of J words each. N processing units (PU) are arranged as N/J mutually exclusive PU groups each having an associated OBWG. Each PU has an accumulator, an arithmetic unit, and first and second multiplexed registers each having at least J+1 inputs and an output. A first input receives a memory operand and the other J inputs receive the J words of the associated OBWG. Each accumulator provides its output to a respective output buffer word. Each arithmetic unit performs an operation on the first and second multiplexed register outputs and the accumulator output to generate a result for accumulation into the accumulator. A mask input to the output buffer controls which words, if any, of the N words retain their current value or are updated with their respective accumulator output.
    Type: Application
    Filed: April 5, 2016
    Publication date: April 13, 2017
    Inventors: G. GLENN HENRY, TERRY PARKS, KYLE T. O'BRIEN
  • Publication number: 20170103302
    Abstract: A neural network unit. A register holds an indicator that specifies narrow and wide configurations. A first memory holds rows of 2N/N narrow/wide weight words in the narrow/wide configuration. A second memory holds rows of 2N/N narrow/wide data words in the narrow/wide configuration. An array of neural processing units (NPU) is configured as 2N/N narrow/wide NPUs and to receive the 2N/N narrow/wide weight words of rows from the first memory and to receive the 2N/N narrow/wide data words of rows from the second memory in the narrow/wide configuration. In the narrow configuration, the 2N NPUs perform narrow arithmetic operations on the 2N narrow weight words and the 2N narrow data words received from the first and second memories. In the wide configuration, the N NPUs perform wide arithmetic operations on the N wide weight words and the N wide data words received from the first and second memories.
    Type: Application
    Filed: April 5, 2016
    Publication date: April 13, 2017
    Inventors: G. GLENN HENRY, TERRY PARKS
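    The narrow/wide idea above can be sketched as reinterpreting the same memory row under a mode indicator. This is a hypothetical model only (byte widths, little-endian pairing, and the function names are assumptions, not the patent's encoding):

```python
# Hypothetical sketch of the narrow/wide configuration: one row of raw
# bytes is read either as 2N narrow (one-byte) operands or as N wide
# (two-byte) operands, chosen by the configuration indicator.

def unpack(row_bytes, mode):
    """row_bytes: bytes of one memory row. mode: 'narrow' or 'wide'."""
    if mode == "narrow":
        return list(row_bytes)                      # 2N one-byte words
    # wide: pair adjacent bytes (little-endian) into N two-byte words
    return [row_bytes[i] | (row_bytes[i + 1] << 8)
            for i in range(0, len(row_bytes), 2)]

def configured_mac(weights, data, mode):
    """Elementwise products of one weight row and one data row."""
    w, d = unpack(weights, mode), unpack(data, mode)
    return [wi * di for wi, di in zip(w, d)]
```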
  • Publication number: 20170103320
    Abstract: A neural network unit includes first and second memories that hold rows of respective N weight and data words and provides a row of them to N corresponding neural processing units (NPU), respectively. The N NPUs each have an accumulator and an arithmetic unit that performs a series of multiply operations on pairs of weight words and data words received from the first and second memories to generate a series of products. The arithmetic unit also performs a series of addition operations on the series of products to accumulate an accumulated value in the accumulator. Activation function units (AFU) are each shared by a corresponding plurality of the N NPUs. Each AFU, in a sequential fashion with respect to each NPU of the corresponding plurality of the N NPUs, receives the accumulated value from the NPU and performs an activation function on the accumulated value to generate a result.
    Type: Application
    Filed: April 5, 2016
    Publication date: April 13, 2017
    Inventors: G. GLENN HENRY, TERRY PARKS
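    The AFU-sharing arrangement can be modeled as each activation function unit walking its group of accumulators in turn. A minimal sketch, assuming N is a multiple of J and using tanh as a stand-in activation function (the patent does not specify one here):

```python
# Hypothetical model of AFU sharing: one activation function unit
# serves J NPUs sequentially, so N accumulated values need only N/J
# AFUs. Each inner-loop iteration stands in for one AFU time slot.
import math

def apply_shared_afus(accumulators, j, afu=math.tanh):
    """accumulators: N accumulated values; j: NPUs per shared AFU.
    Assumes len(accumulators) is a multiple of j."""
    results = [0.0] * len(accumulators)
    for group_start in range(0, len(accumulators), j):
        # one AFU handles this group, one accumulator per step
        for k in range(group_start, group_start + j):
            results[k] = afu(accumulators[k])
    return results
```

    The trade-off the abstract describes is area for latency: the results arrive over J steps per group instead of one.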
  • Publication number: 20170103041
    Abstract: Functional units of a processor fetch and decode architectural instructions of an architectural program. The architectural instructions are of an architectural instruction set of the processor. An execution unit includes first and second memories, a register and processing units. The first memory holds data in rows with addresses. The second memory holds non-architectural instructions of a non-architectural program. The architectural and non-architectural instruction sets are distinct. The processing units execute the non-architectural program instructions to read data from the first memory, perform operations on the data read from the first memory to generate results, and to write the results to the first memory. The register holds information that indicates progress made by the non-architectural program during execution. The first memory is also readable and writable by the architectural program. The architectural program uses the information to decide where in the first memory to read/write data.
    Type: Application
    Filed: April 5, 2016
    Publication date: April 13, 2017
    Inventors: G. GLENN HENRY, TERRY PARKS
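    The progress register described above can be modeled as a counter the non-architectural program advances and the architectural program polls. The class and method names below are hypothetical, for illustration only:

```python
# Hypothetical model of the progress register: the non-architectural
# program fills rows of the first memory and advances a progress
# counter; the architectural program reads that counter to decide
# which rows of the first memory are safe to read.

class ExecutionUnit:
    def __init__(self, num_rows):
        self.memory = [None] * num_rows   # first memory, rows of data
        self.progress = 0                 # register visible to both programs

    def run_nonarch_step(self, result):
        """One non-architectural instruction writes a result row."""
        self.memory[self.progress] = result
        self.progress += 1

    def arch_read_ready(self):
        """Architectural program reads every row completed so far."""
        return self.memory[:self.progress]
```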
  • Publication number: 20170103307
    Abstract: A processor includes a front-end portion that issues instructions to execution units that execute the issued instructions. A hardware neural network unit (NNU) execution unit includes a first memory that holds data words associated with artificial neural networks (ANN) neuron outputs, a second memory that holds weight words associated with connections between ANN neurons, and a third memory that holds a program comprising NNU instructions that are distinct, with respect to their instruction set, from the instructions issued to the NNU by the front-end portion of the processor. The program performs ANN-associated computations on the data and weight words. A first instruction instructs the NNU to transfer NNU instructions of the program from architectural general purpose registers to the third memory. A second instruction instructs the NNU to invoke the program stored in the third memory.
    Type: Application
    Filed: April 5, 2016
    Publication date: April 13, 2017
    Inventors: G. GLENN HENRY, TERRY PARKS
  • Publication number: 20170103319
    Abstract: A neural network unit includes a programmable indicator, a first memory that holds first operands, a second memory that holds second operands, neural processing units (NPU), and activation units. Each NPU has an accumulator and an arithmetic unit that performs a series of multiply operations on pairs of the first and second operands received from the first and second memories to generate a series of products, and a series of addition operations on the series of products to accumulate an accumulated value in the accumulator. The activation units perform activation functions on the accumulated values in the accumulators to generate results. When the indicator specifies the first action, the neural network unit writes to the first memory the results generated by the activation units. When the indicator specifies the second action, the neural network unit writes to the first memory the accumulated values in the accumulators.
    Type: Application
    Filed: April 5, 2016
    Publication date: April 13, 2017
    Inventors: G. GLENN HENRY, TERRY PARKS
  • Publication number: 20170102945
    Abstract: A processor includes an architectural register file loadable with micro-operations by architectural instructions of an architectural instruction set of the processor and an execution unit that executes instructions. The instructions are either architectural instructions or microinstructions into which architectural instructions are translated. The execution unit includes a decoder that decodes the instructions into micro-operations, a mode indicator that indicates one of first and second modes, a pipeline of stages to which are provided micro-operations that control circuits of the stages of the pipeline, and a multiplexer. The multiplexer selects for provision to the pipeline a micro-operation received from the decoder when the mode indicator indicates the first mode and selects for provision to the pipeline a micro-operation received from the architectural register file when the mode indicator indicates the second mode.
    Type: Application
    Filed: April 5, 2016
    Publication date: April 13, 2017
    Inventors: G. GLENN HENRY, TERRY PARKS
  • Publication number: 20170102921
    Abstract: An apparatus includes a plurality of arithmetic logic units each having an accumulator and an integer arithmetic unit that receives and performs integer arithmetic operations on integer inputs and accumulates integer results of a series of the integer arithmetic operations into the accumulator as an integer accumulated value. A register is programmable with an indication of a number of fractional bits of the integer accumulated values and an indication of a number of fractional bits of integer outputs. A first bit width of the accumulator is greater than twice a second bit width of the integer outputs. A plurality of adjustment units scale and saturate the first bit width integer accumulated values to generate the second bit width integer outputs based on the indications of the number of fractional bits of the integer accumulated values and outputs programmed into the register.
    Type: Application
    Filed: April 5, 2016
    Publication date: April 13, 2017
    Inventors: G. GLENN HENRY, TERRY PARKS
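    The scale-and-saturate step the adjustment units perform can be sketched in fixed-point terms. A minimal illustration, assuming round-to-nearest and that the accumulator has at least as many fractional bits as the output (the abstract does not fix the rounding mode, so that part is an assumption):

```python
# Hypothetical sketch of an adjustment unit: a wide integer accumulated
# value with acc_frac fractional bits is shifted down to out_frac
# fractional bits, rounded, and saturated to the narrower output width.

def adjust(acc, acc_frac, out_frac, out_bits):
    """Assumes acc_frac >= out_frac (no left shift needed)."""
    shift = acc_frac - out_frac          # fractional bits to discard
    if shift > 0:
        acc += 1 << (shift - 1)          # round to nearest
        acc >>= shift                    # arithmetic right shift
    lo, hi = -(1 << (out_bits - 1)), (1 << (out_bits - 1)) - 1
    return max(lo, min(hi, acc))         # saturate to signed output range
```

    The wide accumulator (more than twice the output width per the abstract) is what lets many products be summed before this single lossy step.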
  • Publication number: 20170103306
    Abstract: An array of N processing units (PU) each has: an accumulator; an arithmetic unit that performs an operation on first, second and third inputs to generate a result to store in the accumulator, where the first input receives the accumulator output and the second input receives a weight word; and a multiplexed register that has first and second data inputs, an output received by the third input to the arithmetic unit, and a control input that controls the data input selection. The multiplexed register output is also received by an adjacent PU's multiplexed register second data input. The N PUs' multiplexed registers collectively operate as an N-word rotater when the control input specifies the second data input. Respective first and second memories hold W and D rows of N weight and data words, respectively, and provide the N weight and data words to the corresponding weight inputs and multiplexed register first data inputs. A sequencer controls the multiplexed registers and memories.
    Type: Application
    Filed: April 5, 2016
    Publication date: April 13, 2017
    Inventors: G. GLENN HENRY, TERRY PARKS
  • Publication number: 20170103300
    Abstract: A neural network unit configurable to first/second/third configurations has N narrow and N wide accumulators, multipliers and adders. Each multiplier performs a narrow/wide multiply on first and second narrow/wide inputs to generate a narrow/wide product. A first adder input receives the corresponding narrow/wide accumulator's output, and a third input receives a widened version of the corresponding narrow multiplier's narrow product in the third configuration. In the first configuration, each narrow/wide adder performs a narrow/wide addition on the first and second inputs to generate a narrow/wide sum for storage into the corresponding narrow/wide accumulator. In the second configuration, each wide adder performs a wide addition on the first and a second input to generate a wide sum for storage into the corresponding wide accumulator. In the third configuration, each wide adder performs a wide addition on the first, second and third inputs to generate a wide sum for storage into the corresponding wide accumulator.
    Type: Application
    Filed: April 5, 2016
    Publication date: April 13, 2017
    Inventors: G. GLENN HENRY, TERRY PARKS
  • Publication number: 20170103305
    Abstract: An output buffer holds N words arranged as N/J mutually exclusive output buffer word groups (OBWG) of J words each. N processing units (PU) are arranged as N/J mutually exclusive PU groups. Each PU group has an associated OBWG. Each PU includes an accumulator and an arithmetic unit that performs operations on inputs, which include the accumulator output, to generate a first result for accumulation into the accumulator. Activation function units selectively perform an activation function on the accumulator outputs to generate results for provision to the N output buffer words. For each PU group, four of the J PUs and at least one of the activation function units compute an input gate, a forget gate, an output gate and a candidate state of a Long Short Term Memory (LSTM) cell, respectively, for writing to respective first, second, third and fourth words of the associated OBWG.
    Type: Application
    Filed: April 5, 2016
    Publication date: April 13, 2017
    Inventors: G. GLENN HENRY, TERRY PARKS, KYLE T. O'BRIEN
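    The four quantities a PU group computes map onto the standard LSTM cell equations. As an illustration only (the function names are hypothetical, and the sigmoid/tanh pairing is the conventional LSTM formulation rather than anything the abstract specifies):

```python
# Hypothetical model of one PU group producing the four words of its
# output buffer word group: input gate, forget gate, output gate, and
# candidate state, from the four PUs' accumulated values.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_obwg(acc_i, acc_f, acc_o, acc_c):
    """Accumulator values of the group's four PUs -> one OBWG."""
    return [sigmoid(acc_i),    # input gate
            sigmoid(acc_f),    # forget gate
            sigmoid(acc_o),    # output gate
            math.tanh(acc_c)]  # candidate state

def lstm_step(obwg, prev_cell):
    """Combine one OBWG with the previous cell state."""
    i, f, o, c = obwg
    cell = f * prev_cell + i * c         # new cell state
    return cell, o * math.tanh(cell)     # (cell state, hidden output)
```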
  • Publication number: 20170103304
    Abstract: A neural network unit includes a register programmable with a control value, a plurality of neural processing units (NPU), and a plurality of activation function units (AFU). Each NPU includes an arithmetic logic unit (ALU) that performs arithmetic and logical operations on a sequence of operands to generate a sequence of results and an accumulator into which the ALU accumulates the sequence of results as an accumulated value. Each AFU includes a first module that performs a first function on the accumulated value to generate a first output, a second module that performs a second function on the accumulated value to generate a second output, the first function is distinct from the second function, and a multiplexer that receives the first and second outputs and selects one of the two outputs based on the control value programmed into the register.
    Type: Application
    Filed: April 5, 2016
    Publication date: April 13, 2017
    Inventors: G. GLENN HENRY, TERRY PARKS
  • Publication number: 20170103310
    Abstract: A neural network unit (NNU) includes N neural processing units (NPU). Each NPU has an arithmetic unit and an accumulator. First and second multiplexed registers of the N NPUs collectively selectively operate as respective first and second N-word rotaters. First and second memories respectively hold rows of N weight/data words and provide the N weight/data words of a row to corresponding ones of the N NPUs. The NPUs selectively perform: multiply-accumulate operations on rows of N weight words and on a row of N data words, using the second N-word rotater; convolution operations on rows of N weight words, using the first N-word rotater, and on rows of N data words, the rows of weight words being a data matrix, and the rows of data words being elements of a convolution kernel; and pooling operations on rows of N weight words, using the first N-word rotater.
    Type: Application
    Filed: April 5, 2016
    Publication date: April 13, 2017
    Inventors: G. GLENN HENRY, TERRY PARKS
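    The pooling case above can also be expressed with the rotate scheme: each NPU keeps a running maximum while the row rotates past it. A hypothetical sketch (the function name and circular handling at the row's end are assumptions for illustration):

```python
# Hypothetical model of pooling via the first N-word rotater: each NPU
# computes the max of `window` consecutive words by rotating the row
# one position per step and keeping a running maximum.

def rotate_max_pool(row, window):
    acc = list(row)
    words = list(row)
    for _ in range(window - 1):
        words = words[1:] + words[:1]            # one-word rotate
        acc = [max(a, w) for a, w in zip(acc, words)]
    return acc  # acc[i] = max of row[i : i + window], wrapping circularly
```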
  • Publication number: 20170103312
    Abstract: An output buffer holds N words arranged as N/J mutually exclusive output buffer word groups (OBWG) of J words each. N processing units (PU) are arranged as N/J mutually exclusive PU groups each having an associated OBWG. Each PU has an accumulator, arithmetic unit, and first and second multiplexed registers each having at least J+1 inputs. A first input receives a memory operand and the other J inputs receive the J words of the associated OBWG. Each accumulator provides its output to a respective OBWG. Each arithmetic unit performs an operation on the first and second multiplexed register outputs and accumulator output to generate a result for accumulation into the accumulator. A mask input to the output buffer controls which words, if any, of the N words retain their current value or are updated with their respective accumulator output. Each PU group operates as a recurrent neural network LSTM cell.
    Type: Application
    Filed: April 5, 2016
    Publication date: April 13, 2017
    Inventors: G. GLENN HENRY, TERRY PARKS, KYLE T. O'BRIEN
  • Patent number: 9588572
    Abstract: A microprocessor includes a plurality of processing cores, each comprising a respective interrupt request input, and a control unit configured to receive a respective synchronization request from each of the plurality of processing cores. The control unit is configured to generate an interrupt request to all of the plurality of processing cores on their respective interrupt request inputs in response to detecting that the control unit has received the respective synchronization request from all of the plurality of processing cores.
    Type: Grant
    Filed: May 19, 2014
    Date of Patent: March 7, 2017
    Assignee: VIA TECHNOLOGIES, INC.
    Inventors: G. Glenn Henry, Terry Parks
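    This synchronization mechanism is essentially a hardware barrier that fires an interrupt on release. A minimal software model, with a hypothetical `ControlUnit` class standing in for the hardware:

```python
# Hypothetical model of the synchronization mechanism: the control unit
# collects one sync request per core and, once all requests have
# arrived, asserts the interrupt request input of every core at once.

class ControlUnit:
    def __init__(self, num_cores):
        self.pending = set(range(num_cores))   # cores yet to request
        self.interrupt = [False] * num_cores   # per-core interrupt lines

    def sync_request(self, core_id):
        self.pending.discard(core_id)
        if not self.pending:                   # all cores have requested
            self.interrupt = [True] * len(self.interrupt)
```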
  • Patent number: 9588845
    Abstract: A processor includes a storage configured to receive a snapshot of a state of the processor prior to performing a set of computations in an approximating manner. The processor also includes an indicator that indicates an amount of error accumulated while the set of computations is performed in the approximating manner. When the processor detects that the amount of error accumulated has exceeded an error bound, the processor is configured to restore the state of the processor to the snapshot from the storage.
    Type: Grant
    Filed: October 23, 2014
    Date of Patent: March 7, 2017
    Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.
    Inventors: G. Glenn Henry, Terry Parks, Rodney E. Hooker
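    The snapshot-and-rollback flow above can be sketched abstractly. This is an illustration only: the function shape, the list-of-steps representation, and the error estimates are hypothetical stand-ins for the processor state and its accumulated-error indicator:

```python
# Hypothetical model of error-bounded approximation: snapshot the state,
# run approximate steps while accumulating an error estimate, and roll
# back to the snapshot once the accumulated error exceeds the bound.

def run_approx(state, steps, error_bound):
    """steps: (result, error_estimate) pairs from approximate ops.
    Returns (final state, rolled_back flag)."""
    snapshot = list(state)               # snapshot before approximating
    err = 0.0
    for result, step_err in steps:
        state = state + [result]         # approximate computation result
        err += step_err                  # error accumulated so far
        if err > error_bound:
            return snapshot, True        # bound exceeded: restore snapshot
    return state, False
```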
  • Publication number: 20170003707
    Abstract: A microprocessor includes a plurality of cores, a shared cache memory, and a control unit that individually puts each core to sleep by stopping its clock signal. Each core executes a sleep instruction and responsively makes a respective request of the control unit to put the core to sleep, which the control unit responsively does, and detects when all the cores have made the respective request and responsively wakes up only the last requesting core. The last core then writes back and invalidates the shared cache memory, indicates that it has been invalidated, and makes a request to the control unit to put the last core back to sleep. The control unit puts the last core back to sleep and continuously keeps the other cores asleep while the last core writes back and invalidates the shared cache memory, indicates the shared cache memory was invalidated, and is put back to sleep.
    Type: Application
    Filed: September 14, 2016
    Publication date: January 5, 2017
    Inventors: G. GLENN HENRY, TERRY PARKS, BRENT BEAN, STEPHAN GASKINS
  • Publication number: 20160380649
    Abstract: A hardware data compressor that compresses an input block of characters by replacing strings of characters in the input block with back pointers to matching strings earlier in the input block. A hash table is used in searching for the matching strings in the input block. A plurality of hash index generators each employs a different hashing algorithm on an initial portion of the strings of characters to be replaced to generate a respective index. The hardware data compressor also includes an indication of a type of the input block of characters. A selector selects the index generated by one of the plurality of hash index generators to index into the hash table based on the type of the input block.
    Type: Application
    Filed: September 7, 2016
    Publication date: December 29, 2016
    Inventors: G. GLENN HENRY, TERRY PARKS
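    The type-selected hashing idea can be sketched in a few lines. Everything below is hypothetical: the two hash algorithms, the three-byte initial portion, the table size, and the "text"/"binary" block types are placeholders, not the patent's actual generators:

```python
# Hypothetical sketch of the multi-hash scheme: several hash index
# generators, each using a different algorithm, hash the initial bytes
# of a candidate string; the input block's type selects which generated
# index is used to probe the hash table of earlier string positions.

TABLE_BITS = 12  # assumed hash table size of 2**12 entries

def hash_text(b0, b1, b2):
    return ((b0 << 8) ^ (b1 << 4) ^ b2) & ((1 << TABLE_BITS) - 1)

def hash_binary(b0, b1, b2):
    return ((b0 * 0x9E3779B1) ^ (b1 << 7) ^ b2) & ((1 << TABLE_BITS) - 1)

GENERATORS = {"text": hash_text, "binary": hash_binary}

def hash_index(block_type, prefix):
    """Select the generator for this block type and hash the initial
    three-byte portion of the string being matched."""
    return GENERATORS[block_type](*prefix[:3])
```

    Tuning the hash to the block type matters because text and binary inputs distribute byte values very differently, and a poorly matched hash clusters candidate strings into few buckets.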