Patents by Inventor KIM C. HOUCK

KIM C. HOUCK has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11029949
    Abstract: A hardware processing unit is provided. The hardware processing unit includes: an accumulator; a multiplier-adder receives first and second factors and receives an addend, the multiplier-adder generates a sum of the addend and a product of the first and second factors and provides the sum; a first multiplexer receives a first operand, a positive one, and a negative one and selects one of them for provision as the first factor to the multiplier-adder; a second multiplexer receives a second operand, a positive one, and a negative one and selects one of them for provision as the second factor to the multiplier-adder; a third multiplexer, having an output, that receives the first operand and the second operand and selects one of them for provision on its output; and a fourth multiplexer receives the third multiplexer output and the sum and selects one of them for provision to the accumulator.
    Type: Grant
    Filed: April 10, 2018
    Date of Patent: June 8, 2021
    Assignee: SHANGHAI ZHAOXIN SEMICONDUCTOR CO., LTD.
    Inventors: G. Glenn Henry, Douglas R. Reed, Kim C. Houck, Parviz Palangpour
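The datapath in the abstract of patent 11029949 can be sketched in software as follows. This is a minimal illustration, not the patented circuit: the names `FactorSel` and `pu_step` are hypothetical, and the sketch assumes the addend is the current accumulator value, which the abstract does not state.

```python
from enum import Enum


class FactorSel(Enum):
    """Hypothetical select codes for the first and second multiplexers."""
    OPERAND = 0
    PLUS_ONE = 1
    MINUS_ONE = 2


def select_factor(sel, operand):
    """First/second multiplexer: pass the operand, +1, or -1 as a factor."""
    if sel is FactorSel.PLUS_ONE:
        return 1
    if sel is FactorSel.MINUS_ONE:
        return -1
    return operand


def pu_step(acc, op1, op2, sel1, sel2, pick_first, pick_sum):
    """Return the next accumulator value for one step.

    Assumption (not in the abstract): the addend is the accumulator itself.
    """
    factor1 = select_factor(sel1, op1)
    factor2 = select_factor(sel2, op2)
    total = acc + factor1 * factor2          # multiplier-adder output
    passthrough = op1 if pick_first else op2  # third multiplexer
    return total if pick_sum else passthrough  # fourth multiplexer
```

Under these assumptions, choosing +1 or -1 as one factor turns the multiply-accumulate into a plain add or subtract of the other operand, and the third and fourth multiplexers together let an operand be loaded into the accumulator directly.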
  • Patent number: 10586148
    Abstract: A memory holds D rows of N words and receives an address having log2 D bits and an extra bit. Each of N processing units (PU) of index J has first and second registers, an accumulator, an arithmetic unit that performs an operation thereon to accumulate a result, and multiplexing logic receiving memory word J, and for PUs 0 to (N/2)−1 also memory word J+(N/2). In a first mode, the multiplexing logic of PUs 0 to N−1 selects word J to output to the first register. In a second mode: when the extra bit is a zero, the multiplexing logic of PUs 0 to (N/2)−1 selects word J to output to the first register, and when the extra bit is a one, the multiplexing logic of PUs 0 through (N/2)−1 selects word J+(N/2) to output to the first register.
    Type: Grant
    Filed: December 31, 2016
    Date of Patent: March 10, 2020
    Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.
    Inventors: G. Glenn Henry, Kim C. Houck, Parviz Palangpour
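The two selection modes in the abstract of patent 10586148 can be modeled as below. This is a sketch under assumptions: the function name `words_to_first_registers` is hypothetical, and because the abstract does not say what the upper-half PUs load in the second mode, the sketch returns values for PUs 0 through (N/2)-1 only in that mode.

```python
def words_to_first_registers(row, second_mode, extra_bit=0):
    """Select which word of an N-word memory row each PU's first register gets.

    row         -- list of N words read from one memory row (N even)
    second_mode -- False: every PU J takes word J;
                   True: only PUs 0..(N/2)-1 are modeled
    extra_bit   -- in the second mode, 0 picks word J, 1 picks word J + N/2
    """
    n = len(row)
    if not second_mode:
        return list(row)
    half = n // 2
    offset = half if extra_bit else 0
    return [row[j + offset] for j in range(half)]
```

In effect, the extra address bit lets half as many PUs reach either half of a physical row, so the memory behaves as if it had twice as many rows of half the width.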
  • Patent number: 10565492
    Abstract: First/second memories hold rows of N weight/data words. Each of N processing units (PU) of index J have a register, an accumulator having an output, an arithmetic unit that performs an operation thereon to accumulate a result, the first input receives the output of the accumulator, the second input receives a respective first memory weight word, the third input receives a respective data word output by the register, and multiplexing logic receives a respective second memory data word and a data word output by the register of PU J−1 and outputs a selected data word to the register. PU J−1 for PU 0 is PU N−1. The multiplexing logic of PU N/4 also receives the data word output by the register of PU (3N/4)−1. The multiplexing logic of PU 3N/4 also receives the data word output by the register of PU (N/4)−1.
    Type: Grant
    Filed: December 31, 2016
    Date of Patent: February 18, 2020
    Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.
    Inventors: G. Glenn Henry, Kim C. Houck, Parviz Palangpour
  • Patent number: 10565494
    Abstract: First/second memories hold rows of N weight/data words. Each of N processing units (PU) of index J have a register, an accumulator having an output, an arithmetic unit that performs an operation thereon to accumulate a result, the first input receives the output of the accumulator, the second input receives a respective first memory weight word, the third input receives a respective data word output by the register, and multiplexing logic receives a respective second memory data word and a data word output by the register of PU J−1 and outputs a selected data word to the register. PU J−1 for PU 0 is PU N−1. The multiplexing logic of PU 0 also receives the data word output by the register of PU (N/2)−1. The multiplexing logic of PU N/2 also receives the data word output by the register of PU N−1.
    Type: Grant
    Filed: December 31, 2016
    Date of Patent: February 18, 2020
    Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.
    Inventors: G. Glenn Henry, Kim C. Houck, Parviz Palangpour
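The register connections in the abstract of patent 10565494 suggest that the N registers can rotate either as one ring of N words or as two rings of N/2 words. The sketch below models one rotation step under that reading; the name `rotate_once` and the `split` flag are hypothetical, and the two-ring interpretation is an inference from the extra inputs on PU 0 and PU N/2 rather than something the abstract states outright.

```python
def rotate_once(regs, split=False):
    """Advance every register by one position.

    split=False: PU J takes the word from PU J-1, and PU 0 wraps to PU N-1.
    split=True:  PU 0 takes the word from PU (N/2)-1 and PU N/2 takes the
                 word from PU N-1, forming two independent rings of N/2 PUs.
    """
    n = len(regs)
    if not split:
        return [regs[(j - 1) % n] for j in range(n)]
    half = n // 2
    out = []
    for j in range(n):
        if j == 0:
            src = half - 1   # extra input on PU 0
        elif j == half:
            src = n - 1      # extra input on PU N/2
        else:
            src = j - 1      # ordinary neighbor input
        out.append(regs[src])
    return out
```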
  • Patent number: 10515302
    Abstract: In a neural network unit, each neural processing unit (NPU) of an array of N NPUs receives respective first and second upper and lower bytes of 2N bytes received from first and second RAMs. In a first mode, each NPU sign-extends the first upper byte to form a first 16-bit word and performs an arithmetic operation on the first 16-bit word and a second 16-bit word formed by the second upper and lower bytes. In a second mode, each NPU sign-extends the first lower byte to form a third 16-bit word and performs the arithmetic operation on the third 16-bit word and the second 16-bit word formed by the second upper and lower bytes. In a third mode, each NPU performs the arithmetic operation on a fourth 16-bit word formed by the first upper and lower bytes and the second 16-bit word formed by the second upper and lower bytes.
    Type: Grant
    Filed: December 8, 2016
    Date of Patent: December 24, 2019
    Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.
    Inventors: G. Glenn Henry, Kim C. Houck
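The three operand modes in the abstract of patent 10515302 differ only in how the first 16-bit word is formed. A minimal sketch follows, assuming two's-complement signed words; the function names and the integer mode codes 1 to 3 are hypothetical.

```python
def sign_extend_byte(byte):
    """Interpret an 8-bit value as signed; the result fits a 16-bit word."""
    byte &= 0xFF
    return byte - 0x100 if byte & 0x80 else byte


def word_from_bytes(upper, lower):
    """Form a signed 16-bit word from an upper and a lower byte."""
    word = ((upper & 0xFF) << 8) | (lower & 0xFF)
    return word - 0x10000 if word & 0x8000 else word


def first_operand(mode, upper, lower):
    """Build the first operand from the first RAM's upper and lower bytes.

    mode 1: sign-extend the upper byte
    mode 2: sign-extend the lower byte
    mode 3: use both bytes as one 16-bit word
    """
    if mode == 1:
        return sign_extend_byte(upper)
    if mode == 2:
        return sign_extend_byte(lower)
    if mode == 3:
        return word_from_bytes(upper, lower)
    raise ValueError("mode must be 1, 2, or 3")
```

In all three modes the second operand is the full 16-bit word from the second RAM, which `word_from_bytes` also models.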
  • Patent number: 10438115
    Abstract: A neural network unit convolves an H×W×C input with F R×S×C filters to generate F Q×P outputs. N processing units (PU) each have a register receiving a respective word of an N-word row of a second memory and multiplexed-register selectively receiving a respective word of an N-word row of a first memory or word rotated from an adjacent PU multiplexed-register. H first memory rows hold input blocks of B words each of channels of respective 2-dimensional input row slices. R×S×C second memory rows hold filter blocks of B words each holding P copies of a filter weight. B is the smallest factor of N greater than W. The PU blocks multiply-accumulate input blocks and filter blocks in column-channel-row order; they read a row of input blocks and rotate it around the N PUs while performing multiply-accumulate operations so each PU block receives each input block before reading another row.
    Type: Grant
    Filed: December 1, 2016
    Date of Patent: October 8, 2019
    Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.
    Inventors: G. Glenn Henry, Kim C. Houck
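The abstract of patent 10438115 defines the block size B as the smallest factor of N greater than W. That rule is easy to state in code; the function name below is hypothetical, and raising an error when no such factor exists is a choice made for this sketch.

```python
def smallest_factor_greater_than(n, w):
    """Return the smallest divisor of n that is strictly greater than w."""
    for b in range(w + 1, n + 1):
        if n % b == 0:
            return b
    raise ValueError("no factor of n is greater than w")
```

For example, with N = 1024 processing units and an input width W = 12, this gives B = 16, so the N units divide evenly into 1024 // 16 = 64 blocks.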
  • Patent number: 10417560
    Abstract: A neural network unit convolves an H×W×C input with F R×S×C filters to generate F Q×P outputs. N processing units (PU) each have a register receiving a memory word and a multiplexed-register selectively receiving a memory word or word rotated from an adjacent PU multiplexed-register. The N PUs are logically partitioned as G blocks each of B PUs. The PUs convolve in a column-channel-row order. For each filter column: the N registers read a memory row, each PU multiplies the register and the multiplexed-register to generate a product to accumulate, and the multiplexed-registers are rotated by one; the multiplexed-registers are rotated to align the input blocks with the adjacent PU block. This is performed for each channel. For each filter row, N multiplexed-registers read a memory row for the multiply-accumulations, F column-channel-row-sums are generated and written to the memory, then all steps are performed for each output row.
    Type: Grant
    Filed: December 1, 2016
    Date of Patent: September 17, 2019
    Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.
    Inventors: G. Glenn Henry, Kim C. Houck
  • Patent number: 10395165
    Abstract: N processing units (PU) each have an arithmetic unit (AU) that performs an operation on first, second and third inputs to generate a result to store in an accumulator having an output provided to the first input. A weight input is received by the AU second input. A multiplexed register has first, second, third and fourth data inputs and an output received by the third AU input. A first memory provides N weight words to the N weight inputs. A second memory provides N data words to the multiplexed register first data inputs. The multiplexed register output is also received by the second, third, and fourth data input of the multiplexed register one, 2^J, and 2^K PUs away, respectively. The N multiplexed registers collectively operate as an N-word rotater that rotates by one, 2^J, or 2^K words when the control input specifies the second, third, or fourth data input, respectively.
    Type: Grant
    Filed: December 1, 2016
    Date of Patent: August 27, 2019
    Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.
    Inventors: G. Glenn Henry, Kim C. Houck
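The multi-distance rotater in the abstract of patent 10395165 can be sketched as a four-way select per register, with rotation distances of one and two powers of two (exponents J and K). The names `next_registers`, `select`, `j_exp`, and `k_exp` are hypothetical, and the rotation direction (toward higher PU indices) is an assumption made for this sketch.

```python
def next_registers(regs, memory_row, select, j_exp, k_exp):
    """Compute the next contents of the N multiplexed registers.

    select 1: load the N data words from memory
    select 2: rotate by one PU
    select 3: rotate by 2**j_exp PUs
    select 4: rotate by 2**k_exp PUs
    Assumption: a word moves toward higher PU indices and wraps at N.
    """
    if select == 1:
        return list(memory_row)
    distances = {2: 1, 3: 2 ** j_exp, 4: 2 ** k_exp}
    if select not in distances:
        raise ValueError("select must be 1, 2, 3, or 4")
    n = len(regs)
    d = distances[select]
    return [regs[(i - d) % n] for i in range(n)]
```

The longer distances let data move many positions in a single step instead of requiring that many single-position rotations.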
  • Patent number: 10140574
    Abstract: First/second memories hold rows of N weight/data words. The first memory address has log2 W bits and an extra bit. Each of N processing units (PU) of index J has first and second registers, an accumulator, an arithmetic unit performs an operation thereon to accumulate a result, first multiplexing logic for PUs 0 through (N/2)−1 receives first memory weight words J and J+(N/2) and for PUs N/2 through N−1 receives first memory weight words J and J−(N/2) and outputs a selected weight word to the first register, and second multiplexing logic receives second memory data word J and data word output by the second register of PU J−1 and outputs a selected data word to the second register. PU 0 second multiplexing logic also receives PU (N/2)−1 second register data word, and PU N/2 second multiplexing logic also receives PU N−1 second register data word.
    Type: Grant
    Filed: December 31, 2016
    Date of Patent: November 27, 2018
    Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.
    Inventors: G. Glenn Henry, Kim C. Houck, Parviz Palangpour
  • Publication number: 20180225116
    Abstract: A hardware processing unit is provided. The hardware processing unit includes: an accumulator; a multiplier-adder receives first and second factors and receives an addend, the multiplier-adder generates a sum of the addend and a product of the first and second factors and provides the sum; a first multiplexer receives a first operand, a positive one, and a negative one and selects one of them for provision as the first factor to the multiplier-adder; a second multiplexer receives a second operand, a positive one, and a negative one and selects one of them for provision as the second factor to the multiplier-adder; a third multiplexer, having an output, that receives the first operand and the second operand and selects one of them for provision on its output; and a fourth multiplexer receives the third multiplexer output and the sum and selects one of them for provision to the accumulator.
    Type: Application
    Filed: April 10, 2018
    Publication date: August 9, 2018
    Inventors: G. Glenn Henry, Douglas R. Reed, Kim C. Houck, Parviz Palangpour
  • Publication number: 20180189640
    Abstract: First/second memories hold rows of N weight/data words. Each of N processing units (PU) of index J have a register, an accumulator having an output, an arithmetic unit that performs an operation thereon to accumulate a result, the first input receives the output of the accumulator, the second input receives a respective first memory weight word, the third input receives a respective data word output by the register, and multiplexing logic receives a respective second memory data word and a data word output by the register of PU J−1 and outputs a selected data word to the register. PU J−1 for PU 0 is PU N−1. The multiplexing logic of PU 0 also receives the data word output by the register of PU (N/2)−1. The multiplexing logic of PU N/2 also receives the data word output by the register of PU N−1.
    Type: Application
    Filed: December 31, 2016
    Publication date: July 5, 2018
    Inventors: G. GLENN HENRY, KIM C. HOUCK, PARVIZ PALANGPOUR
  • Publication number: 20180189651
    Abstract: First/second memories hold rows of N weight/data words. The first memory address has log2 W bits and an extra bit. Each of N processing units (PU) of index J has first and second registers, an accumulator, an arithmetic unit performs an operation thereon to accumulate a result, first multiplexing logic for PUs 0 through (N/2)−1 receives first memory weight words J and J+(N/2) and for PUs N/2 through N−1 receives first memory weight words J and J−(N/2) and outputs a selected weight word to the first register, and second multiplexing logic receives second memory data word J and data word output by the second register of PU J−1 and outputs a selected data word to the second register. PU 0 second multiplexing logic also receives PU (N/2)−1 second register data word, and PU N/2 second multiplexing logic also receives PU N−1 second register data word.
    Type: Application
    Filed: December 31, 2016
    Publication date: July 5, 2018
    Inventors: G. GLENN HENRY, KIM C. HOUCK, PARVIZ PALANGPOUR
  • Publication number: 20180189633
    Abstract: First/second memories hold rows of N weight/data words. Each of N processing units (PU) of index J have a register, an accumulator having an output, an arithmetic unit that performs an operation thereon to accumulate a result, the first input receives the output of the accumulator, the second input receives a respective first memory weight word, the third input receives a respective data word output by the register, and multiplexing logic receives a respective second memory data word and a data word output by the register of PU J−1 and outputs a selected data word to the register. PU J−1 for PU 0 is PU N−1. The multiplexing logic of PU N/4 also receives the data word output by the register of PU (3N/4)−1. The multiplexing logic of PU 3N/4 also receives the data word output by the register of PU (N/4)−1.
    Type: Application
    Filed: December 31, 2016
    Publication date: July 5, 2018
    Inventors: G. GLENN HENRY, KIM C. HOUCK, PARVIZ PALANGPOUR
  • Publication number: 20180189639
    Abstract: A memory holds D rows of N words and receives an address having log2 D bits and an extra bit. Each of N processing units (PU) of index J has first and second registers, an accumulator, an arithmetic unit that performs an operation thereon to accumulate a result, and multiplexing logic receiving memory word J, and for PUs 0 to (N/2)−1 also memory word J+(N/2). In a first mode, the multiplexing logic of PUs 0 to N−1 selects word J to output to the first register. In a second mode: when the extra bit is a zero, the multiplexing logic of PUs 0 to (N/2)−1 selects word J to output to the first register, and when the extra bit is a one, the multiplexing logic of PUs 0 through (N/2)−1 selects word J+(N/2) to output to the first register.
    Type: Application
    Filed: December 31, 2016
    Publication date: July 5, 2018
    Inventors: G. GLENN HENRY, KIM C. HOUCK, PARVIZ PALANGPOUR
  • Publication number: 20180165575
    Abstract: In a neural network unit, each neural processing unit (NPU) of an array of N NPUs receives respective first and second upper and lower bytes of 2N bytes received from first and second RAMs. In a first mode, each NPU sign-extends the first upper byte to form a first 16-bit word and performs an arithmetic operation on the first 16-bit word and a second 16-bit word formed by the second upper and lower bytes. In a second mode, each NPU sign-extends the first lower byte to form a third 16-bit word and performs the arithmetic operation on the third 16-bit word and the second 16-bit word formed by the second upper and lower bytes. In a third mode, each NPU performs the arithmetic operation on a fourth 16-bit word formed by the first upper and lower bytes and the second 16-bit word formed by the second upper and lower bytes.
    Type: Application
    Filed: December 8, 2016
    Publication date: June 14, 2018
    Inventors: G. GLENN HENRY, KIM C. HOUCK
  • Publication number: 20180157961
    Abstract: N processing units (PU) each have an arithmetic unit (AU) that performs an operation on first, second and third inputs to generate a result to store in an accumulator having an output provided to the first input. A weight input is received by the AU second input. A multiplexed register has first, second, third and fourth data inputs and an output received by the third AU input. A first memory provides N weight words to the N weight inputs. A second memory provides N data words to the multiplexed register first data inputs. The multiplexed register output is also received by the second, third, and fourth data input of the multiplexed register one, 2^J, and 2^K PUs away, respectively. The N multiplexed registers collectively operate as an N-word rotater that rotates by one, 2^J, or 2^K words when the control input specifies the second, third, or fourth data input, respectively.
    Type: Application
    Filed: December 1, 2016
    Publication date: June 7, 2018
    Inventors: G. GLENN HENRY, KIM C. HOUCK
  • Publication number: 20180157966
    Abstract: A neural network unit convolves an H×W×C input with F R×S×C filters to generate F Q×P outputs. N processing units (PU) each have a register receiving a memory word and a multiplexed-register selectively receiving a memory word or word rotated from an adjacent PU multiplexed-register. The N PUs are logically partitioned as G blocks each of B PUs. The PUs convolve in a column-channel-row order. For each filter column: the N registers read a memory row, each PU multiplies the register and the multiplexed-register to generate a product to accumulate, and the multiplexed-registers are rotated by one; the multiplexed-registers are rotated to align the input blocks with the adjacent PU block. This is performed for each channel. For each filter row, N multiplexed-registers read a memory row for the multiply-accumulations, F column-channel-row-sums are generated and written to the memory, then all steps are performed for each output row.
    Type: Application
    Filed: December 1, 2016
    Publication date: June 7, 2018
    Inventors: G. GLENN HENRY, KIM C. HOUCK
  • Publication number: 20180157962
    Abstract: A neural network unit convolves an H×W×C input with F R×S×C filters to generate F Q×P outputs. N processing units (PU) each have a register receiving a respective word of an N-word row of a second memory and multiplexed-register selectively receiving a respective word of an N-word row of a first memory or word rotated from an adjacent PU multiplexed-register. H first memory rows hold input blocks of B words each of channels of respective 2-dimensional input row slices. R×S×C second memory rows hold filter blocks of B words each holding P copies of a filter weight. B is the smallest factor of N greater than W. The PU blocks multiply-accumulate input blocks and filter blocks in column-channel-row order; they read a row of input blocks and rotate it around the N PUs while performing multiply-accumulate operations so each PU block receives each input block before reading another row.
    Type: Application
    Filed: December 1, 2016
    Publication date: June 7, 2018
    Inventors: G. GLENN HENRY, KIM C. HOUCK