Patents by Inventor Christopher L. Mills

Christopher L. Mills has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20210158135
    Abstract: Embodiments relate to a neural processor that includes one or more neural engine circuits and planar engine circuits. The neural engine circuits can perform convolution operations of input data with one or more kernels to generate outputs. The planar engine circuit is coupled to the plurality of neural engine circuits. The planar engine circuit can be configured to operate in multiple modes. In a reduction mode, the planar engine circuit may process values arranged in one or more dimensions of input to generate a reduced value. The reduced values across multiple input data may be accumulated. The planar engine circuit may program a filter circuit as a reduction tree to gradually reduce the data into a reduced value. The reduction operation reduces the size of one or more dimensions of a tensor.
    Type: Application
    Filed: November 26, 2019
    Publication date: May 27, 2021
    Inventors: Christopher L. Mills, Kenneth W. Waters, Youchang Kim
  • Publication number: 20210132945
    Abstract: Embodiments of the present disclosure relate to chained buffers in a neural processor circuit. The neural processor circuit includes multiple neural engines, a planar engine, a buffer memory, and a flow control circuit. At least one neural engine operates as a first producer of first data or a first consumer of second data. The planar engine operates as a second consumer receiving the first data from the first producer or a second producer sending the second data to the first consumer. Data flow between the at least one neural engine and the planar engine is controlled using at least a subset of buffers in the buffer memory operating as at least one chained buffer that chains flow of the first data and the second data between the at least one neural engine and the planar engine.
    Type: Application
    Filed: November 4, 2019
    Publication date: May 6, 2021
    Inventor: Christopher L. Mills
  • Publication number: 20210125041
    Abstract: Embodiments of the present disclosure relate to a neural engine of a neural processor circuit having multiple multiply-add circuits and an accumulator circuit coupled to the multiply-add circuits. The multiply-add circuits perform multiply-add operations of a three dimensional convolution on a work unit of input data using a kernel to generate at least a portion of output data in a processing cycle. The accumulator circuit includes multiple batches of accumulators. Each batch of accumulators receives and stores, after the processing cycle, the portion of the output data for each output depth plane of multiple output depth planes. A corresponding batch of accumulators stores, after the processing cycle, the portion of the output data for a subset of the output channels and for each output depth plane.
    Type: Application
    Filed: October 24, 2019
    Publication date: April 29, 2021
    Inventors: Christopher L. Mills, Sung Hee Park
  • Publication number: 20210103803
    Abstract: Embodiments relate to a neural processor that includes a plurality of neural engine circuits and one or more planar engine circuits. The plurality of neural engine circuits can perform convolution operations of input data of the neural engine circuits with one or more kernels to generate outputs. The planar engine circuit is coupled to the plurality of neural engine circuits. The planar engine circuit generates an output from input data that corresponds to output of the neural engine circuits or a version of input data of the neural processor. The planar engine circuit can be configured to operate in multiple modes. In a pooling mode, the planar engine circuit reduces a spatial size of a version of the input data. In an elementwise mode, the planar engine circuit performs an elementwise operation on the input data. In a reduction mode, the planar engine circuit reduces the rank of a tensor.
    Type: Application
    Filed: October 8, 2019
    Publication date: April 8, 2021
    Inventors: Christopher L. Mills, Kenneth W. Waters, Youchang Kim
  • Patent number: 10540742
    Abstract: A device includes an integrated circuit that has a tiler circuit, a grid generator, and a warper circuit. The tiler circuit divides distorted input image data into a plurality of image tiles and stores the image tiles into a memory device. Each image tile is an M×N array of pixel samples where M and N are greater than 1. The grid generator produces a mesh grid that describes a mapping of first pixel locations of the distorted image data to second pixel locations of corrected image data. The warper circuit reads one or more of the image tiles from the memory device based on the mesh grid and interpolates a warped output image from the image tiles read from memory.
    Type: Grant
    Filed: April 27, 2017
    Date of Patent: January 21, 2020
    Assignee: Apple Inc.
    Inventor: Christopher L. Mills
  • Publication number: 20190340498
    Abstract: Embodiments relate to a neural processor circuit that includes multiple neural engine circuits, a data buffer, and a kernel fetcher circuit. At least one of the neural engine circuits receives multiple sub-channels of a portion of input data from the data buffer. The neural engine circuit further receives a kernel of one or more kernels from the kernel fetcher circuit, wherein the kernel was decomposed into a corresponding sub-kernel for each sub-channel of the portion of the input data. The neural engine circuit performs a convolution operation on each sub-channel of the portion of the input data and the corresponding sub-kernel. The neural engine circuit accumulates corresponding outputs of each sub-channel portion of the convolution operation to generate a single channel of the output data.
    Type: Application
    Filed: May 4, 2018
    Publication date: November 7, 2019
    Inventor: Christopher L. Mills
  • Publication number: 20190340491
    Abstract: Embodiments relate to a neural processor circuit with scalable architecture for instantiating one or more neural networks. The neural processor circuit includes a data buffer coupled to a memory external to the neural processor circuit, and a plurality of neural engine circuits. To execute tasks that instantiate the neural networks, each neural engine circuit generates output data using input data and kernel coefficients. A neural processor circuit may include multiple neural engine circuits that are selectively activated or deactivated according to configuration data of the tasks. Furthermore, an electronic device may include multiple neural processor circuits that are selectively activated or deactivated to execute the tasks.
    Type: Application
    Filed: May 4, 2018
    Publication date: November 7, 2019
    Inventors: Erik K. Norden, Liran Fishel, Sung Hee Park, Jaewon Shin, Christopher L. Mills, Seungjin Lee, Fernando A. Mujica
  • Publication number: 20190340486
    Abstract: Embodiments relate to a neural processor circuit including a plurality of neural engine circuits, a data buffer, and a kernel fetcher circuit. At least one of the neural engine circuits is configured to receive matrix elements of a matrix as at least the portion of the input data from the data buffer over multiple processing cycles. The at least one neural engine circuit further receives vector elements of a vector from the kernel fetcher circuit, wherein each of the vector elements is extracted as a corresponding kernel to the at least one neural engine circuit in each of the processing cycles. The at least one neural engine circuit performs multiplication between the matrix and the vector as a convolution operation to produce at least one output channel of the output data.
    Type: Application
    Filed: May 4, 2018
    Publication date: November 7, 2019
    Inventors: Christopher L. Mills, Erik K. Norden, Sung Hee Park
  • Publication number: 20190340502
    Abstract: Embodiments relate to a neural processor circuit including neural engines, a buffer, and a kernel access circuit. The neural engines perform convolution operations on input data and kernel data to generate output data. The buffer is between the neural engines and a memory external to the neural processor circuit. The buffer stores input data for sending to the neural engines and output data received from the neural engines. The kernel access circuit receives one or more kernels from the memory external to the neural processor circuit. The neural processor circuit operates in one of multiple modes, at least one of which divides a convolution operation into multiple independent convolution operations for execution by the neural engines.
    Type: Application
    Filed: May 4, 2018
    Publication date: November 7, 2019
    Inventors: Sung Hee Park, Seungjin Lee, Christopher L. Mills
  • Publication number: 20190340489
    Abstract: Embodiments relate to a neural engine circuit that includes an input buffer circuit, a kernel extract circuit, and a multiply-accumulator (MAC) circuit. The MAC circuit receives input data from the input buffer circuit and a kernel coefficient from the kernel extract circuit. The MAC circuit contains several multiply-add (MAD) circuits and accumulators used to perform neural networking operations on the received input data and kernel coefficients. MAD circuits are configured to support fixed-point precision (e.g., INT8) and floating-point precision (FP16) of operands. In floating-point mode, each MAD circuit multiplies the integer bits of input data and kernel coefficients and adds their exponent bits to determine a binary point for alignment. In fixed-point mode, input data and kernel coefficients are multiplied. In both operation modes, the output data is stored in an accumulator, and may be sent back as accumulated values for further multiply-add operations in subsequent processing cycles.
    Type: Application
    Filed: May 4, 2018
    Publication date: November 7, 2019
    Inventor: Christopher L. Mills
  • Publication number: 20190340501
    Abstract: Embodiments of the present disclosure relate to splitting input data into smaller units for loading into a data buffer and neural engines in a neural processor circuit for performing neural network operations. The input data of a large size is split into slices and each slice is again split into tiles. The tile is uploaded from an external source to a data buffer inside the neural processor circuit but outside the neural engines. Each tile is again split into work units sized for storing in an input buffer circuit inside each neural engine. The input data stored in the data buffer and the input buffer circuit is reused by the neural engines to reduce re-fetching of input data. Operations of splitting the input data are performed at various components of the neural processor circuit under the management of rasterizers provided in these components.
    Type: Application
    Filed: May 4, 2018
    Publication date: November 7, 2019
    Inventor: Christopher L. Mills
  • Publication number: 20190340488
    Abstract: Embodiments relate to a neural processor circuit that includes a kernel access circuit and multiple neural engine circuits. The kernel access circuit reads compressed kernel data from memory external to the neural processor circuit. Each neural engine circuit receives compressed kernel data from the kernel access circuit. Each neural engine circuit includes a kernel extract circuit and a kernel multiply-add (MAD) circuit. The kernel extract circuit extracts uncompressed kernel data from the compressed kernel data. The kernel MAD circuit receives the uncompressed kernel data from the kernel extract circuit and performs neural network operations on a portion of input data using the uncompressed kernel data.
    Type: Application
    Filed: May 4, 2018
    Publication date: November 7, 2019
    Inventors: Liran Fishel, Sung Hee Park, Christopher L. Mills
  • Publication number: 20180315170
    Abstract: A device includes an integrated circuit that has a tiler circuit, a grid generator, and a warper circuit. The tiler circuit divides distorted input image data into a plurality of image tiles and stores the image tiles into a memory device. Each image tile is an M×N array of pixel samples where M and N are greater than 1. The grid generator produces a mesh grid that describes a mapping of first pixel locations of the distorted image data to second pixel locations of corrected image data. The warper circuit reads one or more of the image tiles from the memory device based on the mesh grid and interpolates a warped output image from the image tiles read from memory.
    Type: Application
    Filed: April 27, 2017
    Publication date: November 1, 2018
    Inventor: Christopher L. Mills
  • Patent number: 9911174
    Abstract: An image processing pipeline may process image data at multiple rates. A stream of raw pixel data collected from an image sensor for an image frame may be processed through one or more pipeline stages of an image signal processor. The stream of raw pixel data may then be converted into a full-color domain and scaled to a data size that is less than an initial data size for the image frame. The converted pixel data may be processed through one or more other pipeline stages and output for storage, further processing, or display. In some embodiments, a back-end interface may be implemented as part of the image signal processor via which image data collected from sources other than the image sensor may be received and processed through various pipeline stages at the image signal processor.
    Type: Grant
    Filed: August 26, 2015
    Date of Patent: March 6, 2018
    Assignee: Apple Inc.
    Inventors: Suk Hwan Lim, Christopher L. Mills, D. Amnon Silverstein, David R. Pope, Sheng Lin
  • Patent number: 9756266
    Abstract: An input rescale module that performs cross-color correlated downscaling of sensor data in the horizontal and vertical dimensions. The module may perform a first-pass demosaic of sensor data, apply horizontal and vertical scalers to resample and downsize the data in the horizontal and vertical dimensions, and then remosaic the data to provide horizontally and vertically downscaled sensor data as output for additional image processing. The module may, for example, act as a front end scaler for an image signal processor (ISP). The demosaic performed by the module may be a relatively simple demosaic, for example a demosaic function that works on 3×3 blocks of pixels. The front end of the module may receive and process sensor data at two pixels per clock (ppc); the horizontal filter component reduces the sensor data down to one ppc for downstream components of the input rescale module and for the ISP pipeline.
    Type: Grant
    Filed: December 21, 2015
    Date of Patent: September 5, 2017
    Assignee: Apple Inc.
    Inventors: Christopher L. Mills, Sheng Lin, David R. Pope, D. Amnon Silverstein, Suk Hwan Lim
  • Publication number: 20170061567
    Abstract: An image processing pipeline may process image data at multiple rates. A stream of raw pixel data collected from an image sensor for an image frame may be processed through one or more pipeline stages of an image signal processor. The stream of raw pixel data may then be converted into a full-color domain and scaled to a data size that is less than an initial data size for the image frame. The converted pixel data may be processed through one or more other pipeline stages and output for storage, further processing, or display. In some embodiments, a back-end interface may be implemented as part of the image signal processor via which image data collected from sources other than the image sensor may be received and processed through various pipeline stages at the image signal processor.
    Type: Application
    Filed: August 26, 2015
    Publication date: March 2, 2017
    Applicant: Apple Inc.
    Inventors: Suk Hwan Lim, Christopher L. Mills, D. Amnon Silverstein, David R. Pope, Sheng Lin
  • Patent number: 9462189
    Abstract: An image signal processor of a device, apparatus, or computing system that includes a camera capable of capturing image data may apply piecewise perspective transformations to image data received from the camera's image sensor. A scaling unit of an Image Signal Processor (ISP) may perform piecewise perspective transformations on a captured image to correct for rolling shutter artifacts and to provide video image stabilization. Image data may be divided into a series of horizontal slices and perspective transformations may be applied to each slice. The transformations may be based on motion data determined in any of various manners, such as by using gyroscopic data and/or optical-flow calculations. The piecewise perspective transforms may be encoded as Digital Difference Analyzer (DDA) steppers and may be implemented using separable scalar operations. The image signal processor may not write the received image data to system memory until after the transformations have been performed.
    Type: Grant
    Filed: July 31, 2014
    Date of Patent: October 4, 2016
    Assignee: Apple Inc.
    Inventors: Christopher L. Mills, David R. Pope, D. Amnon Silverstein
  • Patent number: 9386234
    Abstract: An output rescale module may determine an estimated set of lines to hold in vertical support for use when performing image transformations. For example, an output rescale module may monitor input Y coordinates (in terms of input pixel lines) computed over previous lines and compute a set of lines to hold in a set of line buffers. As each output pixel line is generated, the output rescale module may compute the minimum and maximum values of Y generated by the transform across that line. The minimum and maximum input Y coordinates may then be averaged to determine the center value (the centermost input line) for that output line. The difference (in terms of input pixel lines) between centerlines for two adjacent output lines may be added to the centerline value for the current output line to estimate a center line for the next (not yet generated) output pixel line.
    Type: Grant
    Filed: July 31, 2014
    Date of Patent: July 5, 2016
    Assignee: Apple Inc.
    Inventors: Christopher L. Mills, Simon W. Butler
  • Publication number: 20160110843
    Abstract: An input rescale module that performs cross-color correlated downscaling of sensor data in the horizontal and vertical dimensions. The module may perform a first-pass demosaic of sensor data, apply horizontal and vertical scalers to resample and downsize the data in the horizontal and vertical dimensions, and then remosaic the data to provide horizontally and vertically downscaled sensor data as output for additional image processing. The module may, for example, act as a front end scaler for an image signal processor (ISP). The demosaic performed by the module may be a relatively simple demosaic, for example a demosaic function that works on 3×3 blocks of pixels. The front end of the module may receive and process sensor data at two pixels per clock (ppc); the horizontal filter component reduces the sensor data down to one ppc for downstream components of the input rescale module and for the ISP pipeline.
    Type: Application
    Filed: December 21, 2015
    Publication date: April 21, 2016
    Applicant: Apple Inc.
    Inventors: Christopher L. Mills, Sheng Lin, David R. Pope, D. Amnon Silverstein, Suk Hwan Lim
  • Publication number: 20160037085
    Abstract: An output rescale module may determine an estimated set of lines to hold in vertical support for use when performing image transformations. For example, an output rescale module may monitor input Y coordinates (in terms of input pixel lines) computed over previous lines and compute a set of lines to hold in a set of line buffers. As each output pixel line is generated, the output rescale module may compute the minimum and maximum values of Y generated by the transform across that line. The minimum and maximum input Y coordinates may then be averaged to determine the center value (the centermost input line) for that output line. The difference (in terms of input pixel lines) between centerlines for two adjacent output lines may be added to the centerline value for the current output line to estimate a center line for the next (not yet generated) output pixel line.
    Type: Application
    Filed: July 31, 2014
    Publication date: February 4, 2016
    Applicant: Apple Inc.
    Inventors: Christopher L. Mills, Simon W. Butler
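
Illustrative sketch (not taken from the patent filings): the centerline estimate described in the output rescale abstract above can be written out in a few lines of Python. The averaging and the "add the previous step" extrapolation follow the abstract's wording. The function names, the lines_to_hold helper, and its policy of holding a window of num_buffers lines around the estimate are hypothetical assumptions added only to make the example complete.

```python
def estimate_next_center(prev_center, min_y, max_y):
    """Return (current_center, estimated_next_center), in input pixel lines.

    prev_center: centerline of the previous output line.
    min_y, max_y: smallest and largest input Y produced by the transform
                  across the current output line.
    """
    # Average the extremes to get the centermost input line for this output line.
    current_center = (min_y + max_y) / 2.0
    # Difference between the centerlines of two adjacent output lines.
    step = current_center - prev_center
    # Add that difference to the current centerline to predict the next one.
    return current_center, current_center + step


def lines_to_hold(center, num_buffers):
    """Hypothetical policy (an assumption, not from the abstract): keep a
    window of num_buffers input lines roughly centered on the estimate."""
    first = max(0, int(round(center)) - num_buffers // 2)
    return list(range(first, first + num_buffers))


current, predicted = estimate_next_center(prev_center=10.0, min_y=11.0, max_y=13.0)
window = lines_to_hold(predicted, num_buffers=4)
print(current, predicted, window)
```

With the sample values, the current centerline is 12.0, the previous step was 2.0 lines, and the estimate for the next output line is therefore 14.0. A real line-buffer policy would depend on the filter's vertical support, which the abstract does not specify.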