Patents by Inventor Christopher L. Mills

Christopher L. Mills has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11580353
    Abstract: Embodiments relate to a neural engine circuit that includes an input buffer circuit, a kernel extract circuit, and a multiply-accumulator (MAC) circuit. The MAC circuit receives input data from the input buffer circuit and a kernel coefficient from the kernel extract circuit. The MAC circuit contains several multiply-add (MAD) circuits and accumulators used to perform neural network operations on the received input data and kernel coefficients. The MAD circuits are configured to support fixed-point precision (e.g., INT8) and floating-point precision (e.g., FP16) operands. In floating-point mode, each MAD circuit multiplies the mantissa bits of the input data and kernel coefficients and adds their exponent bits to determine a binary point for alignment. In fixed-point mode, the input data and kernel coefficients are multiplied directly. In both modes, the output data is stored in an accumulator and may be fed back as accumulated values for further multiply-add operations in subsequent processing cycles.
    Type: Grant
    Filed: May 4, 2018
    Date of Patent: February 14, 2023
    Assignee: Apple Inc.
    Inventor: Christopher L. Mills
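To make the two arithmetic modes concrete, here is a minimal Python sketch of a dual-mode multiply-add in the spirit of the abstract above. The field handling and accumulation interface are assumptions for illustration, not the patent's circuit design.

```python
def mad_fixed(acc: int, x: int, w: int) -> int:
    """Fixed-point mode: multiply two integer (e.g., INT8) operands and accumulate."""
    return acc + x * w

def mad_float(acc: float, x_mant: int, x_exp: int, w_mant: int, w_exp: int) -> float:
    """Floating-point mode: multiply mantissa bits, add exponent bits to
    locate the binary point, then align the product and accumulate."""
    mant = x_mant * w_mant            # integer multiply of the mantissas
    exp = x_exp + w_exp               # exponent add places the binary point
    return acc + mant * 2.0 ** exp    # align to the accumulator and add

# Example: (3 * 2**-1) * (5 * 2**-2) accumulated into 0.0 gives 1.875.
print(mad_fixed(0, 3, 5))             # -> 15
print(mad_float(0.0, 3, -1, 5, -2))   # -> 1.875
```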
  • Publication number: 20230018248
    Abstract: Embodiments of the present disclosure relate to a neural engine of a neural processor circuit having multiple multiply-add circuits and an accumulator circuit coupled to the multiply-add circuits. The multiply-add circuits perform multiply-add operations of a three-dimensional convolution on a work unit of input data using a kernel to generate at least a portion of output data in a processing cycle. The accumulator circuit includes multiple batches of accumulators. Each batch of accumulators receives and stores, after the processing cycle, the portion of the output data for each output depth plane of multiple output depth planes. A corresponding batch of accumulators stores, after the processing cycle, the portion of the output data for a subset of the output channels and for each output depth plane.
    Type: Application
    Filed: September 14, 2022
    Publication date: January 19, 2023
    Inventors: Christopher L. Mills, Sung Hee Park
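A minimal NumPy sketch of the accumulation pattern this abstract describes: one batch of accumulators per output depth plane, each holding partial 3-D convolution results for a subset of output channels. The shapes and the random stand-in for multiply-add outputs are illustrative assumptions.

```python
import numpy as np

D_OUT, C_SUB, H, W = 4, 8, 16, 16      # depth planes, channel subset, spatial dims
acc = np.zeros((D_OUT, C_SUB, H, W))   # one batch of accumulators per depth plane

def store_after_cycle(portion: np.ndarray, d: int) -> None:
    """Accumulate one processing cycle's portion of output data for plane d."""
    acc[d] += portion

for d in range(D_OUT):                 # each output depth plane
    for kd in range(3):                # kernel depth taps of the 3-D convolution
        portion = np.random.rand(C_SUB, H, W)  # stand-in for multiply-add results
        store_after_cycle(portion, d)
```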
  • Patent number: 11537864
    Abstract: Embodiments relate to a neural processor that includes one or more neural engine circuits and planar engine circuits. The neural engine circuits can perform convolution operations of input data with one or more kernels to generate outputs. The planar engine circuit is coupled to the plurality of neural engine circuits. A planar engine circuit can be configured to operate in multiple modes. In a reduction mode, the planar engine circuit may process values arranged in one or more dimensions of the input to generate a reduced value. The reduced values across multiple inputs may be accumulated. The planar engine circuit may program a filter circuit as a reduction tree to gradually reduce the data into a reduced value. The reduction operation reduces the size of one or more dimensions of a tensor.
    Type: Grant
    Filed: November 26, 2019
    Date of Patent: December 27, 2022
    Assignee: Apple Inc.
    Inventors: Christopher L. Mills, Kenneth W. Waters, Youchang Kim
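As a sketch of the reduction mode, the snippet below builds a pairwise reduction tree that gradually reduces a list of values to a single reduced value, then accumulates reduced values across multiple inputs. The choice of addition as the operator is an assumption.

```python
from operator import add
from typing import Callable, Sequence

def reduce_tree(values: Sequence[float],
                op: Callable[[float, float], float] = add) -> float:
    """Reduce values level by level, as a hardware reduction tree would."""
    level = list(values)
    while len(level) > 1:
        nxt = [op(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:             # odd element passes through to the next level
            nxt.append(level[-1])
        level = nxt
    return level[0]

total = 0.0
for tensor in ([1.0, 2.0, 3.0, 4.0], [5.0, 6.0]):
    total += reduce_tree(tensor)       # reduced values accumulate across inputs
print(total)                           # -> 21.0
```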
  • Patent number: 11537838
    Abstract: Embodiments relate to a neural processor circuit with a scalable architecture for instantiating one or more neural networks. The neural processor circuit includes a data buffer coupled to a memory external to the neural processor circuit, and a plurality of neural engine circuits. To execute tasks that instantiate the neural networks, each neural engine circuit generates output data using input data and kernel coefficients. A neural processor circuit may include multiple neural engine circuits that are selectively activated or deactivated according to configuration data of the tasks. Furthermore, an electronic device may include multiple neural processor circuits that are selectively activated or deactivated to execute the tasks.
    Type: Grant
    Filed: May 4, 2018
    Date of Patent: December 27, 2022
    Assignee: Apple Inc.
    Inventors: Erik K. Norden, Liran Fishel, Sung Hee Park, Jaewon Shin, Christopher L. Mills, Seungjin Lee, Fernando A. Mujica
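A minimal sketch of the selective activation the abstract mentions, with a hypothetical bit-mask configuration format and a toy per-engine convolution standing in for the real datapath.

```python
ENGINES = range(8)                       # hypothetical eight-engine processor

def convolve(x, k):                      # toy stand-in for an engine's MAC array
    return sum(xi * ki for xi, ki in zip(x, k))

def run_task(config: dict, input_data, kernels):
    """Run a task only on the engines its configuration data activates."""
    active = [e for e in ENGINES if config["engine_mask"] & (1 << e)]
    return {e: convolve(input_data, kernels[e]) for e in active}

# Engines 0 and 2 active; the rest stay deactivated for this task.
print(run_task({"engine_mask": 0b101}, [1, 2, 3], {0: [1, 1, 1], 2: [2, 0, 1]}))
# -> {0: 6, 2: 5}
```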
  • Publication number: 20220398440
    Abstract: Embodiments of the present disclosure relate to circular buffers in a neural processor circuit. The neural processor circuit includes multiple neural engine circuits and a data processor circuit coupled to at least one of the neural engine circuits. The at least one neural engine circuit performs at least convolution operations. The data processor circuit includes a circular buffer, and a flow control circuit coupled to the circular buffer. The flow control circuit generates at least one addressing parameter that defines wrapping of data in the circular buffer. The circular buffer controls data flow in the neural processor circuit by storing first data associated with the at least one neural engine circuit so that the first data is wrapped around in the circular buffer. An addressing layout of the first data wrapped around in the circular buffer is defined by the at least one addressing parameter.
    Type: Application
    Filed: June 10, 2021
    Publication date: December 15, 2022
    Inventor: Christopher L. Mills
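A minimal Python sketch of the wrapped addressing described above: the buffer size acts as the addressing parameter that defines where data wraps, and the full/empty checks stand in for the flow-control circuit.

```python
class CircularBuffer:
    def __init__(self, size: int):
        self.size = size                 # addressing parameter defining the wrap
        self.data = [None] * size
        self.head = 0                    # next slot to read
        self.tail = 0                    # next slot to write
        self.count = 0

    def push(self, value) -> None:
        assert self.count < self.size, "producer must stall: buffer full"
        self.data[self.tail] = value
        self.tail = (self.tail + 1) % self.size   # wrap around
        self.count += 1

    def pop(self):
        assert self.count > 0, "consumer must stall: buffer empty"
        value = self.data[self.head]
        self.head = (self.head + 1) % self.size   # wrap around
        self.count -= 1
        return value
```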
  • Patent number: 11513799
    Abstract: Embodiments of the present disclosure relate to chained buffers in a neural processor circuit. The neural processor circuit includes multiple neural engines, a planar engine, a buffer memory, and a flow control circuit. At least one neural engine operates as a first producer of first data or a first consumer of second data. The planar engine operates as a second consumer receiving the first data from the first producer or a second producer sending the second data to the first consumer. Data flow between the at least one neural engine and the planar engine is controlled using at least a subset of buffers in the buffer memory operating as at least one chained buffer that chains flow of the first data and the second data between the at least one neural engine and the planar engine.
    Type: Grant
    Filed: November 4, 2019
    Date of Patent: November 29, 2022
    Assignee: Apple Inc.
    Inventor: Christopher L. Mills
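The sketch below illustrates the chaining idea: a neural-engine stage produces first data into one buffer, and a planar-engine stage consumes it and produces second data into a chained buffer, with occupancy gating each stage. The stage bodies (a doubling and an increment) are arbitrary stand-ins.

```python
from collections import deque

first_data = deque(maxlen=4)    # neural engine (producer) -> planar engine (consumer)
second_data = deque(maxlen=4)   # planar engine (producer) -> downstream consumer

def neural_engine_step(tile):
    first_data.append(tile * 2)           # e.g., a convolution result

def planar_engine_step():
    if first_data:                        # flow control: run only when data is ready
        second_data.append(first_data.popleft() + 1)  # e.g., an elementwise op

for t in range(4):                        # producer and consumer proceed in lockstep
    neural_engine_step(t)
    planar_engine_step()
print(list(second_data))                  # -> [1, 3, 5, 7]
```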
  • Patent number: 11487846
    Abstract: Embodiments relate to a neural processor circuit including a plurality of neural engine circuits, a data buffer, and a kernel fetcher circuit. At least one of the neural engine circuits is configured to receive matrix elements of a matrix as at least a portion of the input data from the data buffer over multiple processing cycles. The at least one neural engine circuit further receives vector elements of a vector from the kernel fetcher circuit, each of the vector elements being extracted and provided as a corresponding kernel to the at least one neural engine circuit in each of the processing cycles. The at least one neural engine circuit performs multiplication between the matrix and the vector as a convolution operation to produce at least one output channel of the output data.
    Type: Grant
    Filed: May 4, 2018
    Date of Patent: November 1, 2022
    Assignee: Apple Inc.
    Inventors: Christopher L. Mills, Erik K. Norden, Sung Hee Park
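A minimal NumPy sketch of the mapping described above: one vector element arrives per processing cycle as a 1×1 kernel and is multiplied against a column of matrix elements, accumulating into the output channel. Shapes are illustrative.

```python
import numpy as np

M = np.arange(12).reshape(3, 4)      # matrix streamed in as input data
v = np.array([1.0, 2.0, 0.5, -1.0])  # vector elements, one kernel per cycle

acc = np.zeros(3)                    # one accumulator per output row
for cycle in range(M.shape[1]):      # one vector element per processing cycle
    acc += M[:, cycle] * v[cycle]    # multiply-add, like a 1x1 convolution

assert np.allclose(acc, M @ v)       # equals the ordinary matrix-vector product
```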
  • Patent number: 11475283
    Abstract: Embodiments of the present disclosure relate to a neural engine of a neural processor circuit having multiple multiply-add circuits and an accumulator circuit coupled to the multiply-add circuits. The multiply-add circuits perform multiply-add operations of a three-dimensional convolution on a work unit of input data using a kernel to generate at least a portion of output data in a processing cycle. The accumulator circuit includes multiple batches of accumulators. Each batch of accumulators receives and stores, after the processing cycle, the portion of the output data for each output depth plane of multiple output depth planes. A corresponding batch of accumulators stores, after the processing cycle, the portion of the output data for a subset of the output channels and for each output depth plane.
    Type: Grant
    Filed: October 24, 2019
    Date of Patent: October 18, 2022
    Assignee: Apple Inc.
    Inventors: Christopher L. Mills, Sung Hee Park
  • Publication number: 20220237439
    Abstract: A neural processor includes neural engines for performing convolution operations on input data corresponding to one or more tasks to generate output data. The neural processor circuit also includes a data processor circuit that is coupled to one or more neural engines. The data processor circuit receives the output data from the neural engine and generates a branching command from the output data. The neural processor circuit further includes a task manager that is coupled to the data processor circuit. The task manager receives the branching command from the data processor circuit. The task manager enqueues one of two or more segment branches according to the received branching command. The two or more segment branches are subsequent to a pre-branch task segment that includes the pre-branch task. The task manager transmits a task from the selected one of the segment branches to the data processor circuit to perform the task.
    Type: Application
    Filed: January 22, 2021
    Publication date: July 28, 2022
    Inventors: Kenneth W. Waters, Christopher L. Mills
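A minimal sketch of the branch-enqueue behavior described in the abstract; the queue layout, the branch encoding, and the way the branching command is derived from output data are assumptions for illustration.

```python
from collections import deque

task_queue = deque(["pre_branch_task"])           # pre-branch task segment
segment_branches = {0: ["task_a1", "task_a2"],    # two or more segment branches
                    1: ["task_b1"]}

def on_branching_command(branch_id: int) -> None:
    """Enqueue the selected segment branch after the pre-branch segment."""
    task_queue.extend(segment_branches[branch_id])

# The data processor derives the command from output data; here we fake it.
branching_command = 1 if sum([3, -5]) < 0 else 0
on_branching_command(branching_command)
print(list(task_queue))   # -> ['pre_branch_task', 'task_b1']
```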
  • Publication number: 20220237438
    Abstract: A neural processor includes neural engines for performing convolution operations on input data corresponding to one or more tasks to generate output data. The neural processor also includes a data processor circuit coupled to external system memory. The data processor circuit includes a buffer for storing the output data from the neural engines. The neural processor further includes a task manager coupled to the data processor circuit. The task manager receives a context-switch task. The context-switch task specifies a switch of the data processor circuit from handling an outgoing task to an incoming task. The task manager sends configuration data of the context-switch task to cause the data processor circuit to transmit the output data corresponding to the outgoing task from the buffer to the external system memory. The data processor circuit also fetches data corresponding to the incoming task from the external system memory to the buffer.
    Type: Application
    Filed: January 22, 2021
    Publication date: July 28, 2022
    Inventors: Christopher L. Mills, Kenneth W. Waters
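The sketch below illustrates the context switch the abstract describes: the outgoing task's buffered output data is spilled to external system memory, and the incoming task's data is fetched back into the buffer. All names are illustrative assumptions.

```python
system_memory = {}                     # stand-in for external DRAM
buffer = {"task": "outgoing", "data": [1, 2, 3]}

def context_switch(incoming: str) -> None:
    # Spill: save the outgoing task's buffered output data to system memory.
    system_memory[buffer["task"]] = buffer["data"]
    # Fetch: restore (or freshly load) the incoming task's data into the buffer.
    buffer["task"] = incoming
    buffer["data"] = system_memory.get(incoming, [])

context_switch("incoming")
print(system_memory)   # -> {'outgoing': [1, 2, 3]}
print(buffer)          # -> {'task': 'incoming', 'data': []}
```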
  • Publication number: 20220222510
    Abstract: Embodiments relate to a neural engine circuit of a neural network processor circuit that performs a convolution operation on input data in a first mode and a parallel sorting operation on input data in a second mode. The neural engine circuit includes a plurality of operation circuits and an accumulator circuit coupled to the plurality of operation circuits. The plurality of operation circuits receives input data. In the first mode, the plurality of operation circuits performs multiply-add operations of a convolution on the input data using a kernel. In the second mode, the plurality of operation circuits performs a portion of a parallel sorting operation on the input data. In the first mode, the accumulator circuit receives and stores first results of the multiply-add operations. In the second mode, the accumulator circuit receives and stores second results of the parallel sorting operation.
    Type: Application
    Filed: January 13, 2021
    Publication date: July 14, 2022
    Inventors: Christopher L. Mills, Sung Hee Park
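A minimal sketch of the two modes: the same bank of operation circuits either multiply-accumulates (convolution mode) or performs the compare-exchange steps of a sorting network (sort mode). Odd-even transposition sort stands in for the unspecified parallel sorting operation.

```python
def step(values, mode, kernel=None, phase=0):
    if mode == "conv":                       # multiply-add on the input data
        return sum(v * k for v, k in zip(values, kernel))
    out = list(values)                       # sort mode: parallel compare-exchange
    for i in range(phase % 2, len(out) - 1, 2):
        if out[i] > out[i + 1]:
            out[i], out[i + 1] = out[i + 1], out[i]
    return out

data = [4, 1, 3, 2]
for phase in range(len(data)):               # n phases fully sort n values
    data = step(data, "sort", phase=phase)
print(data)                                       # -> [1, 2, 3, 4]
print(step([1, 2, 3], "conv", kernel=[1, 0, 2]))  # -> 7
```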
  • Publication number: 20220222509
    Abstract: A neural processor includes one or more neural engine circuits for performing convolution operations on input data corresponding to one or more tasks to generate output data. The neural engine circuits process the input data having a power-of-two (P2) shape. The neural processor circuit also includes a data processor circuit. The data processor circuit fetches source data having a non-power-of-two (NP2) shape. The source data may correspond to data of a machine learning model. The data processor circuit also reshapes the source data to generate reshaped source data with the P2 shape. The data processor circuit further sends the reshaped source data to the one or more neural engine circuits as the input data for performing convolution operations. In some cases, the data processor circuit may also perform padding on the source data before the source data is reshaped to the P2 shape.
    Type: Application
    Filed: January 13, 2021
    Publication date: July 14, 2022
    Inventor: Christopher L. Mills
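A minimal NumPy sketch of one plausible reading of the reshaping above: the innermost non-power-of-two (NP2) dimension is zero-padded up to the next power of two (P2) before the data is sent to the engines. The axis choice and zero padding are assumptions.

```python
import numpy as np

def next_pow2(n: int) -> int:
    return 1 << (n - 1).bit_length()

def reshape_to_p2(source: np.ndarray) -> np.ndarray:
    np2 = source.shape[-1]                # NP2 extent of the innermost dimension
    p2 = next_pow2(np2)
    pad = [(0, 0)] * (source.ndim - 1) + [(0, p2 - np2)]
    return np.pad(source, pad)            # zero-pad the NP2 shape up to P2

src = np.ones((2, 5))                     # NP2 innermost dimension (5)
print(reshape_to_p2(src).shape)           # -> (2, 8)
```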
  • Publication number: 20220156575
    Abstract: Embodiments of the present disclosure relate to a tensor access operation circuit in a neural processor circuit. The neural processor circuit further includes a data processor circuit and at least one neural engine circuit. The tensor access operation circuit indirectly accesses at least a region of a source tensor having a rank in a system memory, and maps one or more source components of the source tensor into an input tensor having another rank. The data processor circuit stores an output version of the input tensor obtained from the tensor access operation circuit and sends the output version of the input tensor as multiple units of input data to the at least one neural engine circuit. The at least one neural engine circuit performs at least convolution operations on the units of input data and at least one kernel to generate output data.
    Type: Application
    Filed: November 19, 2020
    Publication date: May 19, 2022
    Inventor: Christopher L. Mills
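The snippet below sketches the rank-changing access in NumPy terms: a region of a rank-3 source tensor is mapped into a rank-2 input tensor. The particular mapping (flattening two axes of a sliced region) is an illustrative assumption.

```python
import numpy as np

source = np.arange(2 * 3 * 4).reshape(2, 3, 4)   # rank-3 source tensor in memory
region = source[:, 1:3, :]                        # indirectly accessed region

# Map source components into an input tensor of another rank (rank 2).
input_tensor = region.reshape(-1, region.shape[-1])
print(input_tensor.shape)                         # -> (4, 4)
```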
  • Publication number: 20220138553
    Abstract: Embodiments of the present disclosure relate to a texture unit circuit in a neural processor circuit. The neural processor circuit includes a tensor access operation circuit with the texture unit circuit, a data processor circuit, and at least one neural engine circuit. The texture unit circuit fetches a source tensor from a system memory by referencing an index tensor in the system memory representing indexing information into the source tensor. The data processor circuit stores an output version of the source tensor obtained from the tensor access operation circuit and sends the output version of the source tensor as multiple units of input data to the at least one neural engine circuit. The at least one neural engine circuit performs at least convolution operations on the units of input data and at least one kernel to generate output data.
    Type: Application
    Filed: October 30, 2020
    Publication date: May 5, 2022
    Inventor: Christopher L. Mills
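A minimal NumPy sketch of the texture-style fetch: an index tensor carries the indexing information into the source tensor, and a gather produces the data handed onward as input to the engines. Shapes are illustrative.

```python
import numpy as np

source = np.array([10.0, 20.0, 30.0, 40.0])   # source tensor in system memory
index = np.array([[3, 0], [1, 1]])            # index tensor referencing source

gathered = source[index]                      # fetch source via the index tensor
print(gathered)                               # -> [[40. 10.]
                                              #     [20. 20.]]
```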
  • Publication number: 20220019875
    Abstract: Embodiments relate to a neural processor circuit that includes a kernel access circuit and multiple neural engine circuits. The kernel access circuit reads compressed kernel data from memory external to the neural processor circuit. Each neural engine circuit receives compressed kernel data from the kernel access circuit. Each neural engine circuit includes a kernel extract circuit and a kernel multiply-add (MAD) circuit. The kernel extract circuit extracts uncompressed kernel data from the compressed kernel data. The kernel MAD circuit receives the uncompressed kernel data from the kernel extract circuit and performs neural network operations on a portion of input data using the uncompressed kernel data.
    Type: Application
    Filed: September 13, 2021
    Publication date: January 20, 2022
    Inventors: Liran Fishel, Sung Hee Park, Christopher L. Mills
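The abstract does not specify the compression format, so the sketch below uses zero run-length encoding (a common choice for sparse kernels) as a stand-in for the kernel extract step.

```python
def extract_kernel(compressed):
    """Expand (is_zero_run, value) pairs into uncompressed kernel coefficients."""
    kernel = []
    for is_zero_run, value in compressed:
        if is_zero_run:
            kernel.extend([0] * value)     # 'value' counts zeros in the run
        else:
            kernel.append(value)           # literal kernel coefficient
    return kernel

print(extract_kernel([(False, 7), (True, 3), (False, -2)]))
# -> [7, 0, 0, 0, -2]
```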
  • Patent number: 11200490
    Abstract: Embodiments relate to a neural processor circuit including neural engines, a buffer, and a kernel access circuit. The neural engines perform convolution operations on input data and kernel data to generate output data. The buffer is between the neural engines and a memory external to the neural processor circuit. The buffer stores input data for sending to the neural engines and output data received from the neural engines. The kernel access circuit receives one or more kernels from the memory external to the neural processor circuit. The neural processor circuit operates in one of multiple modes, at least one of which divides a convolution operation into multiple independent convolution operations for execution by the neural engines.
    Type: Grant
    Filed: May 4, 2018
    Date of Patent: December 14, 2021
    Assignee: Apple Inc.
    Inventors: Sung Hee Park, Seungjin Lee, Christopher L. Mills
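A minimal NumPy sketch of one such mode: a multi-channel convolution is divided into independent per-channel-group convolutions, one per engine, whose results are then combined. The grouping is an illustrative assumption.

```python
import numpy as np

x = np.random.rand(8, 32)                 # 8 input channels, 32 samples each
k = np.random.rand(8, 3)                  # one 3-tap kernel per channel

def engine_conv(xg, kg):                  # one engine's independent convolution
    return sum(np.convolve(xc, kc, mode="valid") for xc, kc in zip(xg, kg))

# Divide across two engines by channel group, then combine the results.
y = engine_conv(x[:4], k[:4]) + engine_conv(x[4:], k[4:])
assert np.allclose(y, engine_conv(x, k))  # matches the undivided convolution
```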
  • Publication number: 20210319290
    Abstract: A neural processor includes one or more neural engine circuits and a planar engine circuit. The neural engine circuits can perform convolution operations of first input data with one or more kernels to generate a first output. The planar engine circuit receives second input data that corresponds to a version of the first input data. The planar engine circuit also receives third input data that includes fourth input data and fifth input data stored together in a dimension of the third input data. The planar engine circuit performs a first elementwise operation between a version of the second input data and a version of the fourth input data to generate intermediate data. The planar engine circuit performs a second elementwise operation between the intermediate data and a version of the fifth input data to generate a second output.
    Type: Application
    Filed: April 9, 2020
    Publication date: October 14, 2021
    Inventors: Christopher L. Mills, Kenneth W. Waters, Youchang Kim
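A minimal NumPy sketch of the chained elementwise flow: the third input packs two operands along one dimension, the first elementwise operation yields the intermediate data, and the second yields the output. Multiply-then-add is an assumed pairing of operations.

```python
import numpy as np

second = np.array([1.0, 2.0, 3.0])            # version of the first input data
third = np.stack([[0.5, 0.5, 0.5],            # fourth input data } packed in
                  [10.0, 20.0, 30.0]])        # fifth input data  } one dimension
fourth, fifth = third[0], third[1]

intermediate = second * fourth                # first elementwise operation
output = intermediate + fifth                 # second elementwise operation
print(output)                                 # -> [10.5 21.  31.5]
```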
  • Patent number: 11120327
    Abstract: Embodiments relate to a neural processor circuit that includes a kernel access circuit and multiple neural engine circuits. The kernel access circuit reads compressed kernel data from memory external to the neural processor circuit. Each neural engine circuit receives compressed kernel data from the kernel access circuit. Each neural engine circuit includes a kernel extract circuit and a kernel multiply-add (MAD) circuit. The kernel extract circuit extracts uncompressed kernel data from the compressed kernel data. The kernel MAD circuit receives the uncompressed kernel data from the kernel extract circuit and performs neural network operations on a portion of input data using the uncompressed kernel data.
    Type: Grant
    Filed: May 4, 2018
    Date of Patent: September 14, 2021
    Assignee: Apple Inc.
    Inventors: Liran Fishel, Sung Hee Park, Christopher L. Mills
  • Publication number: 20210271958
    Abstract: Embodiments relate to a neural processor circuit including one or more planar engine circuits that perform non-convolution operations in parallel with convolution operations performed by one or more neural engine circuits. The neural engine circuits perform the convolution operations on neural input data corresponding to one or more neural engine tasks to generate neural output data. The planar engine circuits perform non-convolution operations on planar input data corresponding to one or more planar engine tasks to generate planar output data. A data processor circuit in the neural processor circuit addresses data dependency between the one or more neural engine tasks and the one or more planar engine tasks by controlling reading of the neural output data as the planar input data by the planar engine circuits or reading of the planar output data as the neural input data by the neural engine circuits.
    Type: Application
    Filed: March 2, 2020
    Publication date: September 2, 2021
    Inventors: Christopher L. Mills, Kenneth W. Waters
  • Publication number: 20210241079
    Abstract: Embodiments relate to a neural processor that includes one or more neural engine circuits and planar engine circuits. The neural engine circuits can perform convolution operations of input data with one or more kernels to generate outputs. The planar engine circuit is coupled to the plurality of neural engine circuits. A planar engine circuit can be configured to operate in multiple modes. In an elementwise mode, the planar engine circuit may combine two tensors by performing operations element by element. The planar engine circuit may support elementwise operations for two tensors of different sizes and ranks. The planar engine circuit may perform a broadcasting operation that duplicates one or more values across one or more channels so that the smaller tensor matches the size of the larger tensor.
    Type: Application
    Filed: February 4, 2020
    Publication date: August 5, 2021
    Inventors: Christopher L. Mills, Kenneth W. Waters, Youchang Kim
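A minimal NumPy sketch of the broadcast behavior described above: the smaller tensor's values are duplicated across a channel dimension so that two tensors of different sizes can be combined element by element.

```python
import numpy as np

large = np.arange(6.0).reshape(2, 3)    # e.g., 2 channels x 3 elements
small = np.array([[10.0, 20.0, 30.0]])  # 1 channel, to be broadcast

dup = np.broadcast_to(small, large.shape)  # duplicate values across channels
print(large + dup)                         # elementwise add after broadcasting
# -> [[10. 21. 32.]
#     [13. 24. 35.]]
```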