Patents by Inventor Christopher L. Mills

Christopher L. Mills has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11972348
    Abstract: Embodiments of the present disclosure relate to a texture unit circuit in a neural processor circuit. The neural processor circuit includes a tensor access operation circuit with the texture unit circuit, a data processor circuit, and at least one neural engine circuit. The texture unit circuit fetches a source tensor from a system memory by referencing an index tensor in the system memory representing indexing information into the source tensor. The data processor circuit stores an output version of the source tensor obtained from the tensor access operation circuit and sends the output version of the source tensor as multiple units of input data to the at least one neural engine circuit. The at least one neural engine circuit performs at least convolution operations on the units of input data and at least one kernel to generate output data.
    Type: Grant
    Filed: October 30, 2020
    Date of Patent: April 30, 2024
    Assignee: Apple Inc.
    Inventor: Christopher L. Mills
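The indexed fetch described in the abstract above is essentially a gather operation. Below is a minimal NumPy sketch of that pattern; the shapes and names are made up for illustration and do not reflect the texture unit circuit's actual datapath.

```python
import numpy as np

# Gather sketch: the index tensor names which rows of the source tensor
# to fetch. Shapes are illustrative assumptions, not from the patent.
source = np.arange(24, dtype=np.float32).reshape(6, 4)  # source tensor in "system memory"
index = np.array([[0, 2], [5, 1]])                      # index tensor into source rows

fetched = source[index]   # indexed fetch: shape (2, 2, 4)
print(fetched.shape)
```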
  • Patent number: 11934941
    Abstract: A neural processor circuit includes one or more planar engine circuits that perform non-convolution operations in parallel with convolution operations performed by one or more neural engine circuits. The neural engine circuits perform the convolution operations on neural input data corresponding to one or more neural engine tasks to generate neural output data. The planar engine circuits perform non-convolution operations on planar input data corresponding to one or more planar engine tasks to generate planar output data. A data processor circuit in the neural processor circuit addresses data dependency between the one or more neural engine tasks and the one or more planar engine tasks by controlling reading of the neural output data as the planar input data by the planar engine circuits or reading of the planar output data as the neural input data by the neural engine circuits.
    Type: Grant
    Filed: November 17, 2022
    Date of Patent: March 19, 2024
    Assignee: Apple Inc.
    Inventors: Christopher L. Mills, Kenneth W. Waters
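The dependency handling described in the abstract above can be pictured as a gatekeeper between producer and consumer tasks. The toy Python model below illustrates that gating under assumed names; the task bodies are stand-ins, not the real engine operations.

```python
# Toy model of the data processor's dependency control: a planar task may
# read a neural task's output only once it has been committed (and vice
# versa). All names here are hypothetical.
buffers = {}   # tensor name -> produced data
ready = set()  # tensor names whose producer task has finished

def run_neural_task(out_name, data):
    buffers[out_name] = [x * 2 for x in data]  # stand-in for convolution
    ready.add(out_name)                        # output now safe to read

def run_planar_task(in_name):
    if in_name not in ready:
        raise RuntimeError(f"{in_name} not produced yet; task must wait")
    return [x + 1 for x in buffers[in_name]]   # stand-in for a non-convolution op

run_neural_task("act0", [1, 2, 3])
print(run_planar_task("act0"))  # [3, 5, 7]
```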
  • Publication number: 20240037399
    Abstract: Embodiments of the present disclosure relate to a texture unit circuit in a neural processor circuit. The neural processor circuit includes a tensor access operation circuit with the texture unit circuit, a data processor circuit, and at least one neural engine circuit. The texture unit circuit fetches a source tensor from a system memory by referencing an index tensor in the system memory representing indexing information into the source tensor. The data processor circuit stores an output version of the source tensor obtained from the tensor access operation circuit and sends the output version of the source tensor as multiple units of input data to the at least one neural engine circuit. The at least one neural engine circuit performs at least convolution operations on the units of input data and at least one kernel to generate output data.
    Type: Application
    Filed: October 10, 2023
    Publication date: February 1, 2024
    Applicant: Apple Inc.
    Inventor: Christopher L. Mills
  • Publication number: 20240028894
    Abstract: Embodiments of the present disclosure relate to splitting input data into smaller units for loading into a data buffer and neural engines in a neural processor circuit for performing neural network operations. Input data of a large size is split into slices, and each slice is further split into tiles. Each tile is uploaded from an external source to a data buffer inside the neural processor circuit but outside the neural engines, and is then split into work units sized for storing in an input buffer circuit inside each neural engine. The input data stored in the data buffer and the input buffer circuit is reused by the neural engines to reduce re-fetching of input data. The splitting of the input data is performed at various components of the neural processor circuit under the management of rasterizers provided in these components.
    Type: Application
    Filed: July 27, 2023
    Publication date: January 25, 2024
    Applicant: Apple Inc.
    Inventor: Christopher L. Mills
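The slice/tile/work-unit hierarchy in the abstract above amounts to recursive chunking of a large tensor. The sketch below illustrates that hierarchy with invented sizes; the real sizes depend on the data buffer and engine input buffer capacities, which the abstract does not specify.

```python
import numpy as np

# Slice -> tile -> work-unit splitting with illustrative chunk sizes.
def split(arr, size, axis):
    """Split arr along axis into chunks of at most `size` elements."""
    return np.array_split(arr, range(size, arr.shape[axis], size), axis=axis)

input_data = np.zeros((64, 64), dtype=np.float32)  # large input in external memory
for slc in split(input_data, 32, axis=0):          # slices
    for tile in split(slc, 16, axis=1):            # tiles fit the data buffer
        work_units = split(tile, 8, axis=0)        # work units fit an engine's input buffer
print(len(work_units))  # 4 work units in the last tile
```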
  • Patent number: 11880757
    Abstract: Embodiments relate to a neural engine circuit that includes an input buffer circuit, a kernel extract circuit, and a multiply-accumulator (MAC) circuit. The MAC circuit receives input data from the input buffer circuit and a kernel coefficient from the kernel extract circuit. The MAC circuit contains several multiply-add (MAD) circuits and accumulators used to perform neural networking operations on the received input data and kernel coefficients. MAD circuits are configured to support fixed-point precision (e.g., INT8) and floating-point precision (FP16) of operands. In floating-point mode, each MAD circuit multiplies the integer bits of input data and kernel coefficients and adds their exponent bits to determine a binary point for alignment. In fixed-point mode, input data and kernel coefficients are multiplied. In both operation modes, the output data is stored in an accumulator, and may be sent back as accumulated values for further multiply-add operations in subsequent processing cycles.
    Type: Grant
    Filed: January 11, 2023
    Date of Patent: January 23, 2024
    Assignee: Apple Inc.
    Inventor: Christopher L. Mills
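A rough software model of the dual-mode multiply-add (MAD) stage described above follows. In hardware, the floating-point path multiplies mantissas and adds exponents to align the binary point; plain float16 arithmetic stands in for that step here, so this is an illustration of the two modes, not the circuit's datapath.

```python
import numpy as np

# Dual-mode MAD sketch: one stage serving INT8 fixed-point and FP16
# floating-point operands. The function and mode names are assumptions.
def mad(x, w, acc, mode="int8"):
    if mode == "int8":
        return acc + np.int32(x) * np.int32(w)  # widen to avoid overflow
    # FP16 path: float16 multiply-then-accumulate models the hardware's
    # mantissa multiply and exponent-based alignment.
    return np.float32(acc) + np.float32(np.float16(x)) * np.float32(np.float16(w))

print(mad(np.int8(3), np.int8(-5), np.int32(0)))                # -15
print(mad(np.float16(1.5), np.float16(2.0), 0.0, mode="fp16"))  # 3.0
```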
  • Patent number: 11853868
    Abstract: Embodiments of the present disclosure relate to a neural engine of a neural processor circuit having multiple multiply-add circuits and an accumulator circuit coupled to the multiply-add circuits. The multiply-add circuits perform multiply-add operations of a three dimensional convolution on a work unit of input data using a kernel to generate at least a portion of output data in a processing cycle. The accumulator circuit includes multiple batches of accumulators. Each batch of accumulators receives and stores, after the processing cycle, the portion of the output data for each output depth plane of multiple output depth planes. A corresponding batch of accumulators stores, after the processing cycle, the portion of the output data for a subset of the output channels and for each output depth plane.
    Type: Grant
    Filed: September 14, 2022
    Date of Patent: December 26, 2023
    Assignee: Apple Inc.
    Inventors: Christopher L. Mills, Sung Hee Park
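The accumulator organization in the abstract above can be modeled as a small grid of running sums, one batch per output depth plane. The counts in this sketch are invented for illustration.

```python
import numpy as np

# Toy model of batched accumulators: one batch per output depth plane, each
# holding partial sums for a subset of output channels. Sizes are made up.
DEPTH_PLANES, CHANNELS_PER_BATCH = 4, 8
accumulators = np.zeros((DEPTH_PLANES, CHANNELS_PER_BATCH), dtype=np.float32)

def commit_cycle(partial_output):
    """Add one processing cycle's partial 3-D convolution output."""
    accumulators[:] += partial_output  # each batch stores its plane's portion

commit_cycle(np.ones((DEPTH_PLANES, CHANNELS_PER_BATCH), dtype=np.float32))
print(accumulators.sum())  # 32.0 after one cycle
```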
  • Patent number: 11783174
    Abstract: Embodiments of the present disclosure relate to splitting input data into smaller units for loading into a data buffer and neural engines in a neural processor circuit for performing neural network operations. Input data of a large size is split into slices, and each slice is further split into tiles. Each tile is uploaded from an external source to a data buffer inside the neural processor circuit but outside the neural engines, and is then split into work units sized for storing in an input buffer circuit inside each neural engine. The input data stored in the data buffer and the input buffer circuit is reused by the neural engines to reduce re-fetching of input data. The splitting of the input data is performed at various components of the neural processor circuit under the management of rasterizers provided in these components.
    Type: Grant
    Filed: May 4, 2018
    Date of Patent: October 10, 2023
    Assignee: Apple Inc.
    Inventor: Christopher L. Mills
  • Publication number: 20230289291
    Abstract: A neural processor may include a system memory access circuit coupled to a system memory. The system memory access circuit is configured to fetch, from the system memory, first input data of a first task associated with a neural network. The neural processor may also include neural engines coupled to the system memory access circuit. The neural engines are configured to perform convolution operations on the first input data in a first set of operating cycles. The neural processor may further include a cache access circuit coupled to a cache. The cache access circuit is configured to instruct the cache to prefetch from the system memory, during the first set of operating cycles corresponding to the first task, second input data of a second task of the neural network. The second task is scheduled for processing in a second set of operating cycles after the first set of operating cycles.
    Type: Application
    Filed: March 10, 2022
    Publication date: September 14, 2023
    Inventors: Seungjin Lee, Jaewon Shin, Christopher L. Mills
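The prefetch scheme above overlaps the next task's memory traffic with the current task's compute. The sketch below illustrates that overlap with a background worker; fetch() and compute() are hypothetical stand-ins, not the neural processor's real interface.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Task-level prefetch sketch: while task N computes, a cache access path
# pulls task N+1's input so it is resident before it is needed.
def fetch(task):
    time.sleep(0.01)  # simulate a system-memory read into the cache
    return f"inputs-for-{task}"

def compute(task, inputs):
    return f"{task} computed with {inputs}"

with ThreadPoolExecutor(max_workers=1) as cache:
    current, inputs = "task0", fetch("task0")     # demand fetch for the first task
    for nxt in ("task1", "task2"):
        prefetch = cache.submit(fetch, nxt)       # overlaps the current compute
        print(compute(current, inputs))
        current, inputs = nxt, prefetch.result()  # prefetched data now resident
    print(compute(current, inputs))
```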
  • Publication number: 20230229902
    Abstract: Embodiments relate to a neural engine circuit of a neural network processor circuit that performs a parallel sorting operation on input data. The neural engine circuit includes operation circuits and an accumulator circuit coupled to the outputs of the operation circuits. Each of the operation circuits operates in parallel and is configured to compare a field of a first record of a first set of records and a corresponding field of a second record of a second set of records to generate a comparison result on values in the field and the corresponding field. The accumulator circuit includes a record store storing records that are involved in the parallel sorting operation and a sideband register that stores the comparison results generated by the operation circuits.
    Type: Application
    Filed: January 19, 2022
    Publication date: July 20, 2023
    Inventor: Christopher L. Mills
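The parallel comparison above maps naturally onto a compare-exchange stage of a sorting network. The vectorized sketch below shows one such stage over a key field; the toy values, the choice of key field, and the "sideband" array are illustrative assumptions.

```python
import numpy as np

# One parallel compare-exchange stage: every operation circuit compares one
# record pair's key field at once; the comparison bits form a "sideband"
# that drives the exchange.
keys_a = np.array([3, 9, 1, 7])          # key field of the first record set
keys_b = np.array([5, 2, 8, 4])          # key field of the second record set

sideband = keys_a > keys_b               # all comparisons happen in parallel
lo = np.where(sideband, keys_b, keys_a)  # per-pair minimum
hi = np.where(sideband, keys_a, keys_b)  # per-pair maximum
print(sideband, lo, hi)
```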
  • Publication number: 20230206051
    Abstract: Embodiments relate to a neural processor that includes one or more neural engine circuits and planar engine circuits. The neural engine circuits can perform convolution operations of input data with one or more kernels to generate outputs. The planar engine circuit is coupled to the plurality of neural engine circuits. A planar engine circuit can be configured to operate in multiple modes. In an elementwise mode, the planar engine circuit may combine two tensors by performing operations element by element. The planar engine circuit may support elementwise operations on two tensors of different sizes and ranks. The planar engine circuit may perform a broadcasting operation that duplicates one or more values across one or more channels so that the smaller tensor matches the size of the larger tensor.
    Type: Application
    Filed: March 10, 2023
    Publication date: June 29, 2023
    Inventors: Christopher L. Mills, Kenneth W. Waters, Youchang Kim
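The broadcasting behavior described above matches the familiar array-broadcasting idiom, illustrated below with invented shapes.

```python
import numpy as np

# Broadcasting sketch for the elementwise mode: a small per-channel tensor
# is duplicated across the spatial dimensions of a larger tensor so the two
# can be combined element by element. Shapes are illustrative.
feature_map = np.random.rand(8, 4, 4).astype(np.float32)       # (channels, h, w)
per_channel = np.arange(8, dtype=np.float32).reshape(8, 1, 1)  # smaller tensor

combined = feature_map + per_channel  # values duplicated across h and w
print(combined.shape)                 # (8, 4, 4)
```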
  • Publication number: 20230169308
    Abstract: Embodiments relate to a neural engine circuit that includes an input buffer circuit, a kernel extract circuit, and a multiply-accumulator (MAC) circuit. The MAC circuit receives input data from the input buffer circuit and a kernel coefficient from the kernel extract circuit. The MAC circuit contains several multiply-add (MAD) circuits and accumulators used to perform neural networking operations on the received input data and kernel coefficients. MAD circuits are configured to support fixed-point precision (e.g., INT8) and floating-point precision (FP16) of operands. In floating-point mode, each MAD circuit multiplies the integer bits of input data and kernel coefficients and adds their exponent bits to determine a binary point for alignment. In fixed-point mode, input data and kernel coefficients are multiplied. In both operation modes, the output data is stored in an accumulator, and may be sent back as accumulated values for further multiply-add operations in subsequent processing cycles.
    Type: Application
    Filed: January 11, 2023
    Publication date: June 1, 2023
    Inventor: Christopher L. Mills
  • Publication number: 20230169316
    Abstract: Embodiments of the present disclosure relate to indexing in a neural processor circuit. The neural processor circuit includes multiple neural engine circuits and a data processor circuit directly coupled to at least one of the neural engine circuits. The at least one neural engine circuit performs a convolution operation on input data to generate output data. The data processor circuit includes a buffer memory and an indexing circuit coupled to the buffer memory. The buffer memory stores an index tensor and the output data as a source tensor. The indexing circuit fetches a portion of the source tensor from the buffer memory by referencing the index tensor representing indexing information into the portion of the source tensor.
    Type: Application
    Filed: November 30, 2021
    Publication date: June 1, 2023
    Inventor: Christopher L. Mills
  • Publication number: 20230128047
    Abstract: Embodiments of the present disclosure relate to binary comparison operations (e.g., Boolean operations) and reduction operations in a neural processor circuit to enable implementation of conditional operations without software control. The neural processor circuit includes a neural engine circuit and a planar engine circuit coupled to the neural engine circuit. The neural engine circuit performs a convolution operation to generate output data. The planar engine circuit includes a binary comparator circuit and a filter circuit coupled to the binary comparator circuit. The binary comparator circuit performs a binary comparison operation on a tensor from the output data to generate a conditional tensor. The filter circuit performs a reduction operation for each patch of the conditional tensor to generate a respective reduced value of multiple reduced values associated with a corresponding channel of multiple channels of the conditional tensor.
    Type: Application
    Filed: October 25, 2021
    Publication date: April 27, 2023
    Inventor: Christopher L. Mills
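The compare-then-reduce pipeline above can be sketched in a few array operations. The threshold, patch size, and the "any" reduction below are illustrative choices, not values from the patent.

```python
import numpy as np

# Compare-then-reduce sketch: a binary comparison turns the convolution
# output into a conditional (boolean) tensor, then each non-overlapping
# 2x2 patch is reduced to one value per channel.
output = np.random.rand(2, 4, 4)              # (channels, h, w)
conditional = output > 0.5                    # binary comparison -> conditional tensor

patches = conditional.reshape(2, 2, 2, 2, 2)  # (c, patch_rows, 2, patch_cols, 2)
reduced = patches.any(axis=(2, 4))            # one reduced value per patch, per channel
print(reduced.shape)                          # (2, 2, 2)
```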
  • Publication number: 20230121448
    Abstract: Embodiments of the present disclosure relate to a reduction operation in a neural processor circuit where results of the reduction operation are retained for multiple post-processing operations. The neural processor circuit includes neural engine circuits and a planar engine circuit coupled to the neural engine circuits. At least one neural engine circuit performs a convolution operation to generate output data. The planar engine circuit includes a filter circuit and a line buffer coupled to the filter circuit. The filter circuit performs a reduction operation for each patch of a tensor from the output data to generate a respective reduced value associated with a corresponding channel of the tensor. The line buffer stores reduced values each being associated with a respective channel of the tensor. The line buffer retains the reduced values for a defined number of operating cycles as indicated by a refresh flag defining resetting of the line buffer.
    Type: Application
    Filed: October 19, 2021
    Publication date: April 20, 2023
    Inventors: Christopher L. Mills, Youchang Kim
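The retention behavior above can be modeled as a buffer that persists across cycles until a refresh flag clears it. The max reduction and cycle protocol in this sketch are illustrative assumptions.

```python
import numpy as np

# Retained-reduction sketch: per-channel reduced values persist in a line
# buffer across operating cycles until a refresh flag resets it.
line_buffer = None

def reduce_cycle(tensor, refresh):
    global line_buffer
    reduced = tensor.max(axis=(1, 2))  # one reduced value per channel
    if refresh or line_buffer is None:
        line_buffer = reduced          # refresh flag: reset the buffer
    else:
        line_buffer = np.maximum(line_buffer, reduced)  # retain across cycles
    return line_buffer

for cycle in range(3):
    reduce_cycle(np.random.rand(4, 8, 8), refresh=(cycle == 0))
print(line_buffer.shape)  # (4,) -- one retained value per channel
```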
  • Patent number: 11630991
    Abstract: Embodiments relate to a neural processor that includes one or more neural engine circuits and planar engine circuits. The neural engine circuits can perform convolution operations of input data with one or more kernels to generate outputs. The planar engine circuit is coupled to the plurality of neural engine circuits. A planar engine circuit can be configured to operate in multiple modes. In an elementwise mode, the planar engine circuit may combine two tensors by performing operations element by element. The planar engine circuit may support elementwise operations on two tensors of different sizes and ranks. The planar engine circuit may perform a broadcasting operation that duplicates one or more values across one or more channels so that the smaller tensor matches the size of the larger tensor.
    Type: Grant
    Filed: February 4, 2020
    Date of Patent: April 18, 2023
    Assignee: Apple Inc.
    Inventors: Christopher L. Mills, Kenneth W. Waters, Youchang Kim
  • Publication number: 20230099652
    Abstract: Embodiments relate to a neural processor circuit with scalable architecture for instantiating one or more neural networks. The neural processor circuit includes a data buffer coupled to a memory external to the neural processor circuit, and a plurality of neural engine circuits. To execute tasks that instantiate the neural networks, each neural engine circuit generates output data using input data and kernel coefficients. A neural processor circuit may include multiple neural engine circuits that are selectively activated or deactivated according to configuration data of the tasks. Furthermore, an electronic device may include multiple neural processor circuits that are selectively activated or deactivated to execute the tasks.
    Type: Application
    Filed: November 21, 2022
    Publication date: March 30, 2023
    Inventors: Erik Norden, Liran Fishel, Sung Hee Park, Jaewon Shin, Christopher L. Mills, Seungjin Lee, Fernando A. Mujica
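The selective activation described above amounts to config-driven gating of compute units. The toy model below uses a hypothetical bitmask task format and engine count to show the idea.

```python
# Config-driven scaling sketch: each task's configuration data selects
# which engine circuits participate; the rest stay deactivated. The mask
# format and engine count are hypothetical.
NUM_ENGINES = 8

def run_task(task):
    active = [e for e in range(NUM_ENGINES) if task["engine_mask"] & (1 << e)]
    # Each active engine would generate output data from input data and
    # kernel coefficients; inactive engines are skipped entirely.
    return active

print(run_task({"name": "conv1", "engine_mask": 0b00001111}))  # engines 0-3 only
```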
  • Publication number: 20230081023
    Abstract: Embodiments relate to a neural processor circuit including one or more planar engine circuits that perform non-convolution operations in parallel with convolution operations performed by one or more neural engine circuits. The neural engine circuits perform the convolution operations on neural input data corresponding to one or more neural engine tasks to generate neural output data. The planar engine circuits perform non-convolution operations on planar input data corresponding to one or more planar engine tasks to generate planar output data. A data processor circuit in the neural processor circuit addresses data dependency between the one or more neural engine tasks and the one or more planar engine tasks by controlling reading of the neural output data as the planar input data by the planar engine circuits or reading of the planar output data as the neural input data by the neural engine circuits.
    Type: Application
    Filed: November 17, 2022
    Publication date: March 16, 2023
    Inventors: Christopher L. Mills, Kenneth W. Waters
  • Patent number: 11604975
    Abstract: A neural processor includes one or more neural engine circuits and a planar engine circuit. The neural engine circuits can perform convolution operations of first input data with one or more kernels to generate a first output. The planar engine circuit receives second input data that corresponds to a version of the first input data. The planar engine circuit also receives third input data that includes fourth input data and fifth input data stored together in a dimension of third input data. The planar engine circuit performs a first elementwise operation between a version of the second input data and a version of the fourth input data to generate intermediate data. The planar engine circuit performs a second elementwise operation between the intermediate data and a version of the fifth input data to generate a second output.
    Type: Grant
    Filed: April 9, 2020
    Date of Patent: March 14, 2023
    Assignee: Apple Inc.
    Inventors: Christopher L. Mills, Kenneth W. Waters, Youchang Kim
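One common reading of the fused third input above is a scale and a bias packed along one dimension, enabling y = x * a + b as two back-to-back elementwise operations. The sketch below uses that reading as an illustrative assumption; the claim is not limited to it.

```python
import numpy as np

# Fused elementwise pair sketch: the "third" input packs two operands
# ("fourth" and "fifth") along one dimension; two chained elementwise
# operations consume them. Shapes and the scale/bias interpretation are
# illustrative assumptions.
second = np.random.rand(8, 4, 4).astype(np.float32)  # version of the first input
third = np.stack([np.full((8, 1, 1), 2.0),           # fourth input: scale
                  np.full((8, 1, 1), 0.5)])          # fifth input: bias

intermediate = second * third[0]  # first elementwise operation
result = intermediate + third[1]  # second elementwise operation
print(result.shape)               # (8, 4, 4)
```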
  • Patent number: 11599780
    Abstract: A neural processor circuit includes one or more planar engine circuits that perform non-convolution operations in parallel with convolution operations performed by one or more neural engine circuits. The neural engine circuits perform the convolution operations on neural input data corresponding to one or more neural engine tasks to generate neural output data. The planar engine circuits perform non-convolution operations on planar input data corresponding to one or more planar engine tasks to generate planar output data. A data processor circuit in the neural processor circuit addresses data dependency between the one or more neural engine tasks and the one or more planar engine tasks by controlling reading of the neural output data as the planar input data by the planar engine circuits or reading of the planar output data as the neural input data by the neural engine circuits.
    Type: Grant
    Filed: March 2, 2020
    Date of Patent: March 7, 2023
    Assignee: Apple Inc.
    Inventors: Christopher L. Mills, Kenneth W. Waters
  • Patent number: 11580353
    Abstract: Embodiments relate to a neural engine circuit that includes an input buffer circuit, a kernel extract circuit, and a multiply-accumulator (MAC) circuit. The MAC circuit receives input data from the input buffer circuit and a kernel coefficient from the kernel extract circuit. The MAC circuit contains several multiply-add (MAD) circuits and accumulators used to perform neural networking operations on the received input data and kernel coefficients. MAD circuits are configured to support fixed-point precision (e.g., INT8) and floating-point precision (FP16) of operands. In floating-point mode, each MAD circuit multiplies the integer bits of input data and kernel coefficients and adds their exponent bits to determine a binary point for alignment. In fixed-point mode, input data and kernel coefficients are multiplied. In both operation modes, the output data is stored in an accumulator, and may be sent back as accumulated values for further multiply-add operations in subsequent processing cycles.
    Type: Grant
    Filed: May 4, 2018
    Date of Patent: February 14, 2023
    Assignee: Apple Inc.
    Inventor: Christopher L. Mills