Patents by Inventor Peter Joseph Bannon

Peter Joseph Bannon has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20210048984
    Abstract: Various embodiments of the disclosure relate to an accelerated mathematical engine. In certain embodiments, the accelerated mathematical engine is applied to image processing such that convolution of an image is accelerated by using a two-dimensional matrix processor comprising sub-circuits that include an ALU, output register and shadow register. This architecture supports a clocked, two-dimensional architecture in which image data and weights are multiplied in a synchronized manner to allow a large number of mathematical operations to be performed in parallel.
    Type: Application
    Filed: May 29, 2020
    Publication date: February 18, 2021
    Inventors: Peter Joseph Bannon, Kevin Altair Hurd, Emil Talpes
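The two-dimensional matrix processor described in the abstract above can be modeled in software as a clocked grid of multiply-accumulate cells, each with an accumulator (the output register) and a shadow register that holds the finished result. The sketch below is a minimal illustrative model under those assumptions, not the patented circuit; the names `Cell` and `matmul_grid` are invented for this sketch.

```python
class Cell:
    """One sub-circuit: an ALU that multiplies and accumulates,
    an output register, and a shadow register for the finished result."""
    def __init__(self):
        self.acc = 0          # output register (accumulator)
        self.shadow = 0       # shadow register holding the latched result

    def step(self, data, weight):
        self.acc += data * weight         # ALU: multiply-accumulate

    def latch(self):
        self.shadow, self.acc = self.acc, 0   # move result aside, clear

def matmul_grid(a, b):
    """Compute a @ b with an n x p grid of Cells, one per output element.
    On clock step k, cell (i, j) receives a[i][k] and b[k][j], so all
    n * p multiplications for that step happen in parallel in hardware."""
    n, m, p = len(a), len(b), len(b[0])
    grid = [[Cell() for _ in range(p)] for _ in range(n)]
    for k in range(m):                    # one clock step per k
        for i in range(n):
            for j in range(p):
                grid[i][j].step(a[i][k], b[k][j])
    for row in grid:
        for cell in row:
            cell.latch()
    return [[cell.shadow for cell in row] for row in grid]
```

In hardware, every cell's `step` for a given k fires on the same clock edge; the nested Python loops only serialize what the grid does simultaneously.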
  • Patent number: 10747844
    Abstract: Presented are systems and methods that accelerate the convolution of an image and similar arithmetic operations by utilizing hardware-specific circuitry that enables a large number of operations to be performed in parallel across a large set of data. In various embodiments, arithmetic operations are further enhanced by reusing data and eliminating redundant steps of storing and fetching intermediate results from registers and memory when performing arithmetic operations.
    Type: Grant
    Filed: December 12, 2017
    Date of Patent: August 18, 2020
    Assignee: Tesla, Inc.
    Inventors: Peter Joseph Bannon, William A. McGee, Emil Talpes
  • Patent number: 10715175
    Abstract: Various embodiments of the invention provide systems, devices, and methods for decompressing encoded electronic data to increase decompression throughput using any number of decoding engines. In certain embodiments, this is accomplished by identifying and processing a next record in a pipeline operation before having to complete the decompression of a current record. Various embodiments take advantage of knowledge of how the records have been encoded, e.g., in a single long record, to greatly reduce delay time, compared with existing designs, when decompressing encoded electronic data.
    Type: Grant
    Filed: August 28, 2017
    Date of Patent: July 14, 2020
    Assignee: Tesla, Inc.
    Inventors: Peter Joseph Bannon, Kevin Altair Hurd
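The key idea in the abstract above, finding the start of the next record without first decompressing the current one, can be sketched with length-prefixed records: a scanner hops from header to header, so each record's span is known up front and could be handed to a separate decoding engine. This is an illustrative sketch of the general technique, not the patented format; the one-byte header and the function names are assumptions of the sketch.

```python
def split_records(buf):
    """Scan length-prefixed records: each record is one header byte
    giving the payload length, then the payload. Returns (offset, length)
    spans without decoding any payload."""
    spans, i = [], 0
    while i < len(buf):
        n = buf[i]
        spans.append((i + 1, n))
        i += 1 + n            # jump straight to the next header
    return spans

def decode_all(buf, decode):
    """Each span could go to its own decoding engine; here we just map
    the decoder over the spans sequentially."""
    return [decode(buf[off:off + n]) for off, n in split_records(buf)]
```

Because `split_records` never touches payload bytes, the spans are available immediately, which is what lets multiple engines work on different records at once.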
  • Patent number: 10671349
    Abstract: Various embodiments of the disclosure relate to an accelerated mathematical engine. In certain embodiments, the accelerated mathematical engine is applied to image processing such that convolution of an image is accelerated by using a two-dimensional matrix processor comprising sub-circuits that include an ALU, output register and shadow register. This architecture supports a clocked, two-dimensional architecture in which image data and weights are multiplied in a synchronized manner to allow a large number of mathematical operations to be performed in parallel.
    Type: Grant
    Filed: September 20, 2017
    Date of Patent: June 2, 2020
    Assignee: Tesla, Inc.
    Inventors: Peter Joseph Bannon, Kevin Altair Hurd, Emil Talpes
  • Patent number: 10416899
    Abstract: In various embodiments, the present invention teaches a sequencer that identifies an address pointer of a first data block within a memory and a length of data that comprises that data block and is related to an input of a matrix processor. The sequencer then calculates, based on the block length, the input length, and a memory map, a block count representative of a number of data blocks that are to be retrieved from the memory. Using the address pointer, the sequencer may retrieve a number of data blocks from the memory in a number of cycles that depends on whether the data blocks are contiguous. In embodiments, based on the length of data, a formatter then maps the data blocks to the input of the matrix processor.
    Type: Grant
    Filed: June 5, 2018
    Date of Patent: September 17, 2019
    Assignee: Tesla, Inc.
    Inventors: Peter Joseph Bannon, Kevin Altair Hurd, Emil Talpes
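The block-count arithmetic the abstract above describes, and the way contiguity affects retrieval cost, can be illustrated with two small helpers. This is a hypothetical model, not the patented sequencer; the cost model (one cycle per contiguous burst) and all names are assumptions of the sketch.

```python
def block_count(input_len, block_len):
    """Number of fixed-size blocks needed to cover input_len units of
    data destined for the matrix-processor input (ceiling division)."""
    return -(-input_len // block_len)

def fetch_cycles(addrs, block_len):
    """Cycles to retrieve the blocks at the given start addresses,
    assuming a run of contiguous blocks is fetched as one burst and
    every non-contiguous block starts a fresh burst."""
    cycles = 0
    prev_end = None
    for a in sorted(addrs):
        if a != prev_end:
            cycles += 1           # new burst
        prev_end = a + block_len
    return cycles
```

Under this model, three contiguous 32-byte blocks cost one cycle, while three scattered ones cost three, which matches the abstract's point that cycle count depends on contiguity.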
  • Publication number: 20190250830
    Abstract: Presented are systems and methods that allow for efficient data processing that reduce data latency and, thus, power consumption and data management cost. In various embodiments, this is accomplished by using a sequencer that identifies an address pointer of a first data block within a memory and a length of data that comprises that data block and is related to an input of a matrix processor. The sequencer then calculates, based on the block length, the input length, and a memory map, a block count representative of a number of data blocks that are to be retrieved from the memory. Using the address pointer, the sequencer may retrieve a number of data blocks from the memory in a number of cycles that depends on whether the data blocks are contiguous. In embodiments, based on the length of data, a formatter then maps the data blocks to the input of the matrix processor.
    Type: Application
    Filed: June 5, 2018
    Publication date: August 15, 2019
    Applicant: Tesla, Inc.
    Inventors: Peter Joseph Bannon, Kevin Altair Hurd, Emil Talpes
  • Publication number: 20190235866
    Abstract: A microprocessor system comprises a vector computational unit and a control unit. The vector computational unit includes a plurality of processing elements. The control unit is configured to provide at least a single processor instruction to the vector computational unit. The single processor instruction specifies a plurality of component instructions to be executed by the vector computational unit in response to the single processor instruction and each of the plurality of processing elements of the vector computational unit is configured to process different data elements in parallel with other processing elements in response to the single processor instruction.
    Type: Application
    Filed: March 13, 2018
    Publication date: August 1, 2019
    Inventors: Debjit Das Sarma, Emil Talpes, Peter Joseph Bannon
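The instruction model in the abstract above, one processor instruction carrying several component instructions that every processing element applies to its own data, can be sketched as a lane model. This is an illustrative sketch of the general SIMD-style idea, not the patented unit; the lane representation and the name `issue` are assumptions.

```python
def issue(component_ops, lanes):
    """Execute a single 'processor instruction' (a list of component
    ops) on a vector unit: each processing element (lane) runs the
    component ops, in order, on its own data element. Lanes are
    independent, so in hardware they all execute in parallel."""
    out = []
    for x in lanes:                  # each lane: one processing element
        for op in component_ops:     # component instructions, in order
            x = op(x)
        out.append(x)
    return out
```

The point of the single-instruction encoding is that the control unit issues once and every lane does the full component sequence, rather than the control unit issuing each component op separately.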
  • Publication number: 20190205738
    Abstract: Described herein are systems and methods that utilize a novel hardware-based pooling architecture to process the output of a convolution engine representing an output channel of a convolution layer in a convolutional neural network (CNN). The pooling system converts the output into a set of arrays and aligns them according to a pooling operation to generate a pooling result. In certain embodiments, this is accomplished by using an aligner that aligns, e.g., over a number of arithmetic cycles, an array of data in the output into rows and shifts the rows relative to each other. A pooler applies a pooling operation to a combination of a subset of data from each row to generate the pooling result.
    Type: Application
    Filed: January 4, 2018
    Publication date: July 4, 2019
    Applicant: Tesla, Inc.
    Inventors: Peter Joseph Bannon, Kevin Altair Hurd
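The aligner-and-pooler idea described above, shift copies of a row against each other, then reduce down the aligned columns, can be sketched for 1-D max pooling. This is a simplified software analogue, not the patented hardware datapath; window/stride handling and the name `shift_pool` are assumptions of the sketch.

```python
def shift_pool(row, window, stride):
    """Max pooling the 'aligner' way: make `window` copies of the row,
    each shifted one position further left, then take the max down each
    aligned column that starts a pooling window."""
    shifted = [row[k:] for k in range(window)]   # rows offset by 0..window-1
    length = len(row) - window + 1               # valid column positions
    return [max(col[i] for col in shifted)
            for i in range(0, length, stride)]
```

Column `i` of the shifted rows holds exactly `row[i:i + window]`, so a columnwise max reproduces windowed pooling without ever materializing the windows individually.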
  • Publication number: 20190179870
    Abstract: Presented are systems and methods that accelerate the convolution of an image and similar arithmetic operations by utilizing hardware-specific circuitry that enables a large number of operations to be performed in parallel across a large set of data. In various embodiments, arithmetic operations are further enhanced by reusing data and eliminating redundant steps of storing and fetching intermediate results from registers and memory when performing arithmetic operations.
    Type: Application
    Filed: December 12, 2017
    Publication date: June 13, 2019
    Applicant: Tesla, Inc.
    Inventors: Peter Joseph Bannon, William A. McGee, Emil Talpes
  • Publication number: 20190068217
    Abstract: Various embodiments of the invention provide systems, devices, and methods for decompressing encoded electronic data to increase decompression throughput using any number of decoding engines. In certain embodiments, this is accomplished by identifying and processing a next record in a pipeline operation before having to complete the decompression of a current record. Various embodiments take advantage of knowledge of how the records have been encoded, e.g., in a single long record, to greatly reduce delay time, compared with existing designs, when decompressing encoded electronic data.
    Type: Application
    Filed: August 28, 2017
    Publication date: February 28, 2019
    Applicant: Tesla, Inc.
    Inventors: Peter Joseph Bannon, Kevin Altair Hurd
  • Publication number: 20190026249
    Abstract: A microprocessor system comprises a computational array and a hardware data formatter. The computational array includes a plurality of computation units that each operates on a corresponding value addressed from memory. The values operated on by the computation units are synchronously provided together to the computational array as a group of values to be processed in parallel. The hardware data formatter is configured to gather the group of values, wherein the group of values includes a first subset of values located consecutively in memory and a second subset of values located consecutively in memory. The first subset of values is not required to be located consecutively in memory with the second subset of values.
    Type: Application
    Filed: March 13, 2018
    Publication date: January 24, 2019
    Inventors: Emil Talpes, William McGee, Peter Joseph Bannon
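The gather step in the abstract above, a group assembled from runs that are each internally consecutive but need not be adjacent to one another, can be sketched in a few lines. This is an illustrative model of the formatting idea, not the patented formatter; the `(start, count)` run representation is an assumption of the sketch.

```python
def gather(memory, runs):
    """Assemble one group of values for the computational array.
    Each run is a (start, count) pair naming a consecutive slice of
    memory; the runs themselves may be far apart, so each one maps to
    a single contiguous read."""
    group = []
    for start, count in runs:
        group.extend(memory[start:start + count])  # one contiguous read
    return group
```

Keeping each subset consecutive is what lets the formatter turn the whole group into a small number of wide memory accesses instead of one access per value.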
  • Publication number: 20190026250
    Abstract: A microprocessor system comprises a computational array and a vector computational unit. The computational array includes a plurality of computation units. The vector computational unit is in communication with the computational array and includes a plurality of processing elements. The processing elements are configured to receive output data elements from the computational array and process in parallel the received output data elements.
    Type: Application
    Filed: March 13, 2018
    Publication date: January 24, 2019
    Inventors: Debjit Das Sarma, Emil Talpes, Peter Joseph Bannon
  • Publication number: 20190026078
    Abstract: Various embodiments of the disclosure relate to an accelerated mathematical engine. In certain embodiments, the accelerated mathematical engine is applied to image processing such that convolution of an image is accelerated by using a two-dimensional matrix processor comprising sub-circuits that include an ALU, output register and shadow register. This architecture supports a clocked, two-dimensional architecture in which image data and weights are multiplied in a synchronized manner to allow a large number of mathematical operations to be performed in parallel.
    Type: Application
    Filed: September 20, 2017
    Publication date: January 24, 2019
    Applicant: Tesla, Inc.
    Inventors: Peter Joseph Bannon, Kevin Altair Hurd, Emil Talpes
  • Publication number: 20190026237
    Abstract: A microprocessor system comprises a computational array and a hardware arbiter. The computational array includes a plurality of computation units. Each of the plurality of computation units operates on a corresponding value addressed from memory. The hardware arbiter is configured to control issuing of at least one memory request for one or more of the corresponding values addressed from the memory for the computation units. The hardware arbiter is also configured to schedule a control signal to be issued based on the issuing of the memory requests.
    Type: Application
    Filed: March 13, 2018
    Publication date: January 24, 2019
    Inventors: Emil Talpes, Peter Joseph Bannon, Kevin Altair Hurd
  • Patent number: 5805872
    Abstract: A computer system including a cache which has a wave pipeline read controller is described. In addition to the cache memory, the system includes a processor coupled to the cache memory. The processor includes a register stack which stores values corresponding to a wave number and read speed which are loaded as part of a configuration of the processor. The processor determines a repetition rate for read data corresponding to the difference between the values of read speed and wave number. The processor includes a logic delay line comprised of a plurality of clock delay elements, each of said elements providing successively increasing discrete delays to a clock signal fed to the logic delay line. The delay line is used to provide inputs to first and second multiplexers, which are respectively controlled by a signal corresponding to a desired repetition rate for read cycles and a signal corresponding to the read speed of the cache memory.
    Type: Grant
    Filed: September 8, 1995
    Date of Patent: September 8, 1998
    Assignee: Digital Equipment Corporation
    Inventor: Peter Joseph Bannon
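The timing relationship in the abstract above, a repetition rate derived as the difference between read speed and wave number, can be illustrated with a toy schedule. This is a rough timing model under assumed semantics (read speed as read latency in cycles, wave number as reads concurrently in flight), not the actual circuit, and all names are invented for the sketch.

```python
def read_schedule(read_speed, wave_number, n_reads):
    """Cycle at which each read's data returns under wave pipelining:
    a new read launches every (read_speed - wave_number) cycles, so
    several reads travel through the array at once instead of waiting
    out the full read_speed latency between launches."""
    rate = read_speed - wave_number     # repetition rate from the abstract
    return [i * rate + read_speed for i in range(n_reads)]
```

With `read_speed=4` and `wave_number=2`, reads complete every 2 cycles rather than every 4, which is the throughput gain wave pipelining buys.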