Patents by Inventor Martin Power

Martin Power has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240134786
    Abstract: Methods, apparatus, systems and articles of manufacture are disclosed for sparse tensor storage for neural network accelerators. An example apparatus includes sparsity map generating circuitry to generate a sparsity map corresponding to a tensor, the sparsity map to indicate whether a data point of the tensor is zero, static storage controlling circuitry to divide the tensor into one or more storage elements, and a compressor to perform a first compression of the one or more storage elements to generate one or more compressed storage elements, the first compression to remove zero points of the one or more storage elements based on the sparsity map and perform a second compression of the one or more compressed storage elements, the second compression to store the one or more compressed storage elements contiguously in memory.
    Type: Application
    Filed: December 14, 2023
    Publication date: April 25, 2024
    Applicant: Intel Corporation
    Inventors: Martin-Thomas Grymel, David Bernard, Niall Hanrahan, Martin Power, Kevin Brady, Gary Baugh, Cormac Brick
  • Publication number: 20240118992
    Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed to debug a hardware accelerator such as a neural network accelerator for executing Artificial Intelligence computational workloads. An example apparatus includes a core with a core input and a core output to execute executable code based on a machine-learning model to generate a data output based on a data input, and debug circuitry coupled to the core. The debug circuitry is configured to detect a breakpoint associated with the machine-learning model, compile executable code based on at least one of the machine-learning model or the breakpoint. In response to the triggering of the breakpoint, the debug circuitry is to stop the execution of the executable code and output data such as the data input, data output and the breakpoint for debugging the hardware accelerator.
    Type: Application
    Filed: October 16, 2023
    Publication date: April 11, 2024
    Applicant: Intel Corporation
    Inventors: Martin-Thomas Grymel, David Bernard, Martin Power, Niall Hanrahan, Kevin Brady
  • Patent number: 11940907
    Abstract: Methods, apparatus, systems and articles of manufacture are disclosed for sparse tensor storage for neural network accelerators. An example apparatus includes sparsity map generating circuitry to generate a sparsity map corresponding to a tensor, the sparsity map to indicate whether a data point of the tensor is zero, static storage controlling circuitry to divide the tensor into one or more storage elements, and a compressor to perform a first compression of the one or more storage elements to generate one or more compressed storage elements, the first compression to remove zero points of the one or more storage elements based on the sparsity map and perform a second compression of the one or more compressed storage elements, the second compression to store the one or more compressed storage elements contiguously in memory.
    Type: Grant
    Filed: June 25, 2021
    Date of Patent: March 26, 2024
    Assignee: INTEL CORPORATION
    Inventors: Martin-Thomas Grymel, David Bernard, Niall Hanrahan, Martin Power, Kevin Brady, Gary Baugh, Cormac Brick
  • Publication number: 20240036763
    Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed that increase data reuse for multiply and accumulate (MAC) operations. An example apparatus includes a MAC circuit to process a first context of a set of a first type of contexts stored in a first buffer and a first context of a set of a second type of contexts stored in a second buffer. The example apparatus also includes control logic circuitry to, in response to determining that there is an additional context of the second type to be processed in the set of the second type of contexts, maintain the first context of the first type in the first buffer. The control logic circuitry is also to, in response to determining that there is an additional context of the first type to be processed in the set of the first type of contexts maintain the first context of the second type in the second buffer and iterate a pointer of the second buffer from a first position to a next position in the second buffer.
    Type: Application
    Filed: September 12, 2023
    Publication date: February 1, 2024
    Applicant: Intel Corporation
    Inventors: Niall Hanrahan, Martin Power, Kevin Brady, Martin-Thomas Grymel, David Bernard, Gary Baugh, Cormac Brick
  • Publication number: 20240028895
    Abstract: A load module in a deep neural network (DNN) accelerator may receive a configuration parameter indicating a selection between an activation sparsity mode and a weight sparsity mode. The load module may read a sparse activation tensor, an activation sparsity bitmap, a sparse weight tensor, and a weight sparsity bitmap from a memory. The load module may densify one of the compressed tensors based on the sparsity mode and leave the other compressed tensor as is. The load module may load the dense tensor and the sparse tensor to a sparse cell. The sparse cell includes a sparsity module that may select one or more elements of the dense tensor based on the sparsity bitmap of the sparse tensor. The sparse cell also includes multiply-accumulate (MAC) units that perform MAC operation on the selected elements and the sparse tensor. MAC operations on unselected elements of the dense tensor are skipped.
    Type: Application
    Filed: September 28, 2023
    Publication date: January 25, 2024
    Inventors: Arnab Raha, Deepak Abraham Mathaikutty, Dinakar Kondru, Umer Iftikhar Cheema, Martin Power, Niall Hanrahan
  • Publication number: 20230381013
    Abstract: A rollable apparatus includes a housing being annular in shape, a heating element mounted about the exterior of the housing, a vibration-inducing motor mounted in the interior of the housing and attached thereto, and a set of batteries mounted in the interior of the housing. The set of batteries is provided for powering the heating element to control the temperature of the housing and for powering the vibration motor to induce oscillatory motion in the housing. The apparatus also includes a set of user controls mounted so as to be accessible at the exterior of the housing, a front cover removably mounted so as to form a portion of the exterior of the housing at one end thereof, and a skin of pliable cushioning material covering the housing.
    Type: Application
    Filed: August 11, 2023
    Publication date: November 30, 2023
    Inventor: Thomas Martin Powers
  • Patent number: 11829279
    Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed to debug a hardware accelerator such as a neural network accelerator for executing Artificial Intelligence computational workloads. An example apparatus includes a core with a core input and a core output to execute executable code based on a machine-learning model to generate a data output based on a data input, and debug circuitry coupled to the core. The debug circuitry is configured to detect a breakpoint associated with the machine-learning model, compile executable code based on at least one of the machine-learning model or the breakpoint. In response to the triggering of the breakpoint, the debug circuitry is to stop the execution of the executable code and output data such as the data input, data output and the breakpoint for debugging the hardware accelerator.
    Type: Grant
    Filed: September 23, 2021
    Date of Patent: November 28, 2023
    Assignee: Intel Corporation
    Inventors: Martin-Thomas Grymel, David Bernard, Martin Power, Niall Hanrahan, Kevin Brady
  • Publication number: 20230376274
    Abstract: A fused dot-product multiply-accumulate (MAC) circuit may support variable precisions of floating-point data elements to perform computations (e.g., MAC operations) in deep learning operations. An operation mode of the circuit may be selected based on the precision of an input element. The operation mode may be a FP16 mode or a FP8 mode. In the FP8 mode, product exponents may be computed based on exponents of floating-point input elements. A maximum exponent may be selected from the one or more product exponents. A global maximum exponent may be selected from a plurality of maximum exponents. A product mantissa may be computed and aligned with another product mantissa based on a difference between the global maximum exponent and a corresponding maximum exponent. An adder tree may accumulate the aligned product mantissas and compute a partial sum mantissa. The partial sum mantissa may be normalized using the global maximum exponent.
    Type: Application
    Filed: July 31, 2023
    Publication date: November 23, 2023
    Applicant: Intel Corporation
    Inventors: Mark Anders, Arnab Raha, Amit Agarwal, Steven Hsu, Deepak Abraham Mathaikutty, Ram K. Krishnamurthy, Martin Power
  • Patent number: 11789646
    Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed that increase data reuse for multiply and accumulate (MAC) operations. An example apparatus includes a MAC circuit to process a first context of a set of a first type of contexts stored in a first buffer and a first context of a set of a second type of contexts stored in a second buffer. The example apparatus also includes control logic circuitry to, in response to determining that there is an additional context of the second type to be processed in the set of the second type of contexts, maintain the first context of the first type in the first buffer. The control logic circuitry is also to, in response to determining that there is an additional context of the first type to be processed in the set of the first type of contexts maintain the first context of the second type in the second buffer and iterate a pointer of the second buffer from a first position to a next position in the second buffer.
    Type: Grant
    Filed: September 24, 2021
    Date of Patent: October 17, 2023
    Assignee: INTEL CORPORATION
    Inventors: Niall Hanrahan, Martin Power, Kevin Brady, Martin-Thomas Grymel, David Bernard, Gary Baugh, Cormac Brick
  • Publication number: 20230325665
    Abstract: Gate switching in deep learning operations can be reduced based on sparsity in the input data. A first element of an activation operand and a first element of a weight operand may be stored in input storage units associated with a multiplier in a processing element. The multiplier computes a product of the two elements, which may be stored in an output storage unit of the multiplier. After detecting that a second element of the activation operand or a second element of the weight operand is zero valued, gate switching is reduced by avoiding at least one gate switching needed for the multiply-accumulation operation. For instance, the input storage units may not be updated. A zero-valued data element may be stored in the output storage unit of the multiplier and used as a product of the second element of the activation operand and the second element of the weight operand.
    Type: Application
    Filed: May 30, 2023
    Publication date: October 12, 2023
    Applicant: Intel Corporation
    Inventors: Martin Langhammer, Arnab Raha, Martin Power
  • Publication number: 20230103671
    Abstract: Methods for characterizing mechanical properties of cells at different stress levels. The disclosed inventions can determine the impact of shear stress on cells in bioproduction processes.
    Type: Application
    Filed: September 30, 2022
    Publication date: April 6, 2023
    Inventors: Ian O'Shea, John Crowley, Martin Power, Alan Ronan
  • Patent number: 11612026
    Abstract: A streetlight asset module (SAM) is provided including at least one memory block configured to store operating configuration settings of a streetlight associated with the SAM. The SAM is configured to communicate the operating configuration settings of the streetlight to a control and monitoring system (CMS) node coupled to the streetlight. The CMS node is configured to communicate the operating configuration settings to a remote location using a radio communications network. Related streetlights and systems are also provided.
    Type: Grant
    Filed: January 12, 2022
    Date of Patent: March 21, 2023
    Assignee: SELC Ireland Ltd
    Inventors: William John Kerrigan, Francis Joseph Magee, John Martin Power
  • Publication number: 20230059976
    Abstract: An DNN accelerator may include a PE array performing MAC operations. The PE array may include PEs capable of MAC operations on quantized values. A PE may include subtractors for subtracting zeropoints from quantized activations and quantized weights to generate intermediate activations and intermediate weights. The intermediate activations and intermediate weights may be stored in data storage units in the PE and maybe used by an MAC unit in the PE. The subtractors may be placed outside the MAC unit but inside the PE. The MAC unit may perform sequential cycles of MAC operations. The MAC unit may include a plurality of multipliers. The intermediate activations and intermediate weights stored in the data storage units may be reused by different multipliers in different cycles of MAC operations. An output of the MAC unit or of the PE may be multiplied with a quantization scale to produce a floating-point value.
    Type: Application
    Filed: October 18, 2022
    Publication date: February 23, 2023
    Applicant: Intel Corporation
    Inventors: Deepak Abraham Mathaikutty, Arnab Raha, Raymond Jit-Hung Sung, Martin Power, Umer Iftikhar Cheema, David Thomas Bernard
  • Publication number: 20230014656
    Abstract: A memory array of a compute tile may store activations or weights of a DNN. The memory array may include databanks for storing contexts, context MUXs, and byte MUXs. A databank may store a context with flip-flop arrays, each of which includes a sequence of flip-flops. A logic gate and an ICG unit may gate flip-flops and control whether states of the flip-flops can be changed. The data gating can prevent a context not selected for the databank from inadvertently toggling and wasting power A context MUX may read a context from different flip-flop arrays in a databank based on gray-coded addresses. A byte MUX can combine bits from different bytes in a context read by the context MUX. The memory array may be implemented with bit packing to reduce distance between the context MUX and byte MUX to reduce lengths of wires connecting the context MUXs and byte MUXs.
    Type: Application
    Filed: September 23, 2022
    Publication date: January 19, 2023
    Inventors: Raymond Jit-Hung Sung, Deepak Abraham Mathaikutty, Amit Agarwal, David Thomas Bernard, Steven Hsu, Martin Power, Conor Byme, Arnab Raha
  • Publication number: 20230018857
    Abstract: Sparsity processing within a compute block can be done on unpacked data. The compute block includes a sparsity decoder that generates a combined sparsity vector from an activation sparsity vector and a weight sparsity vector. The activation sparsity vector indicates positions of non-zero valued activations in an activation context. The weight sparsity vector indicates positions of non-zero valued weights in a weight context. The combined sparsity vector comprises one or more zero valued bits and one or more non-zero valued bits. The sparsity decoder may determine the position of a non-zero valued bit in the combined sparsity vector and determine an address for the non-zero valued activation and the non-zero valued weight based on the position of the non-zero valued bit. The non-zero valued activation and the non-zero valued weight may be provided to a PE for performing MAC operations.
    Type: Application
    Filed: September 19, 2022
    Publication date: January 19, 2023
    Inventors: Martin Power, Conor Byrne, Niall Hanrahan, Deepak Abraham Mathaikutty, Arnab Raha, Raymond Jit-Hung Sung, David Thomas Bernard, Kevin Brady, Martin-Thomas Grymel
  • Publication number: 20230020929
    Abstract: A compute tile includes a WCB that receives a workload of writing an output tensor of a convolution into a local memory of the compute tile. The local memory may be a SRAM. The WCB receives write transactions. A write transaction includes a data block, which is a part of the output tensor, and metadata describing one or more attributes of the data block. The WCB may store write transactions in its internal buffers. The WCB may determine whether to combine two write transactions, e.g., based on an operation mode or metadata in the write transactions. In embodiments where the WCB determines to combine the two write transactions, the WCB may combine the two write transactions into a new write transaction and write the new write transaction into the local memory or an internal memory of the WCB. The total number of write transactions for the workload can be reduced.
    Type: Application
    Filed: September 16, 2022
    Publication date: January 19, 2023
    Inventors: Martin-Thomas Grymel, David Thomas Bernard, Martin Power, Niall Hanrahan, Kevin Brady
  • Publication number: 20230008622
    Abstract: An DNN accelerator may perform 1×N kernel decomposition to decompose a convolutional kernel into kernel vectors, each of which includes multiple weights. Through the kernel decomposition, a weight operand may be generated from a filter. The DNN accelerator converts an input tensor into input operands. An input operand includes activations and has the same size as the weight operand. The DNN accelerator may read a first activation in the input operand from memory to an internal memory of a first PE and read a second activation in the input operand from the memory to an internal memory of a second PE. The first PE may receive the second activation from the second PE through activation broadcasting between the two PEs and perform MAC operations on the input operand and weight operand. The second PE may perform MAC operations on another input operand in the input tensor and the weight operand.
    Type: Application
    Filed: September 22, 2022
    Publication date: January 12, 2023
    Inventors: Richard Boyd, David Thomas Bernard, Deepak Abraham Mathaikutty, Martin Power, Niall Hanrahan
  • Patent number: 11546621
    Abstract: Methods, systems, apparatus and articles of manufacture to identify features within an image are disclosed herein. An example apparatus includes a horizontal cost (HCOST) engine to apply a first row of pixels of a macroblock to an input of a first HCOST unit, the first HCOST unit including a number of difference calculators; and a difference calculator engine to apply corresponding rows of pixels of a search window of a source image to corresponding ones of the number of difference calculators of the first HCOST unit, the corresponding ones of the number of difference calculators to calculate respective sums of absolute difference (SAD) values between (a) the first row of pixels of the macroblock and (b) the corresponding rows of pixels of the search window.
    Type: Grant
    Filed: February 26, 2019
    Date of Patent: January 3, 2023
    Assignee: Movidius Limited
    Inventors: Martin Power, Brendan Barry, Vasile Toma, II
  • Publication number: 20220362053
    Abstract: A rollable apparatus includes a housing being annular in shape, a heating element mounted about the exterior of the housing, a vibration-inducing motor mounted in the interior of the housing and attached thereto, and a set of batteries mounted in the interior of the housing. The set of batteries is provided for powering the heating element to control the temperature of the housing and for powering the vibration motor to induce oscillatory motion in the housing. The apparatus also includes a set of user controls mounted so as to be accessible at the exterior of the housing, a front cover removably mounted so as to form a portion of the exterior of the housing at one end thereof, and a skin of pliable cushioning material covering the housing.
    Type: Application
    Filed: August 2, 2022
    Publication date: November 17, 2022
    Inventor: Thomas Martin Powers
  • Publication number: 20220141929
    Abstract: A streetlight asset module (SAM) is provided including at least one memory block configured to store operating configuration settings of a streetlight associated with the SAM. The SAM is configured to communicate the operating configuration settings of the streetlight to a control and monitoring system (CMS) node coupled to the streetlight. The CMS node is configured to communicate the operating configuration settings to a remote location using a radio communications network. Related streetlights and systems are also provided.
    Type: Application
    Filed: January 12, 2022
    Publication date: May 5, 2022
    Inventors: William John Kerrigan, Francis Joseph Magee, John Martin Power