Patents by Inventor Nitin N. Garegrat

Nitin N. Garegrat has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230333855
    Abstract: In one embodiment, a matrix processor comprises a memory to store a matrix operand and a strided read sequence, wherein: the matrix operand is stored out of order in the memory; and the strided read sequence comprises a sequence of read operations to read the matrix operand in a correct order from the memory. The matrix processor further comprises circuitry to: receive a first instruction to be executed by the matrix processor, wherein the first instruction is to instruct the matrix processor to perform a first operation on the matrix operand; read the matrix operand from the memory based on the strided read sequence; and execute the first instruction by performing the first operation on the matrix operand.
    Type: Application
    Filed: May 19, 2023
    Publication date: October 19, 2023
    Applicant: Intel Corporation
    Inventors: Nitin N. Garegrat, Tony L. Werner, Jeff DelChiaro, Michael Rotzin, Robert T. Rhoades, Ujwal Basavaraj Sajjanar, Anne Q. Ye
  • Patent number: 11687341
    Abstract: In one embodiment, a matrix processor comprises a memory to store a matrix operand and a strided read sequence, wherein: the matrix operand is stored out of order in the memory; and the strided read sequence comprises a sequence of read operations to read the matrix operand in a correct order from the memory. The matrix processor further comprises circuitry to: receive a first instruction to be executed by the matrix processor, wherein the first instruction is to instruct the matrix processor to perform a first operation on the matrix operand; read the matrix operand from the memory based on the strided read sequence; and execute the first instruction by performing the first operation on the matrix operand.
    Type: Grant
    Filed: August 29, 2019
    Date of Patent: June 27, 2023
    Assignee: Intel Corporation
    Inventors: Nitin N. Garegrat, Tony L. Werner, Jeff DelChiaro, Michael Rotzin, Robert T. Rhoades, Ujwal Basavaraj Sajjanar, Anne Q. Ye
  • Patent number: 11567555
    Abstract: Embodiments include an apparatus comprising an execution unit coupled to a memory, a microcode controller, and a hardware controller. The microcode controller is to identify a global power and performance hint in an instruction stream that includes first and second instruction phases to be executed in parallel, identify a local hint based on synchronization dependence in the first instruction phase, and use the first local hint to balance power consumption between the execution unit and the memory during parallel executions of the first and second instruction phases. The hardware controller is to use the global hint to determine an appropriate voltage level of a compute voltage and a frequency of a compute clock signal for the execution unit during the parallel executions of the first and second instruction phases. The first local hint includes a processing rate for the first instruction phase or an indication of the processing rate.
    Type: Grant
    Filed: August 30, 2019
    Date of Patent: January 31, 2023
    Assignee: Intel Corporation
    Inventors: Jason Seung-Min Kim, Sundar Ramani, Yogesh Bansal, Nitin N. Garegrat, Olivia K. Wu, Mayank Kaushik, Mrinal Iyer, Tom Schebye, Andrew Yang
  • Patent number: 11520562
    Abstract: A method comprising storing a plurality of entries, each entry of the plurality of entries associated with a portion of a range of input values, each entry of the plurality of entries comprising a set of coefficients defining a power series approximation; selecting a first entry of the plurality of entries based on a determination that a floating point input value is within a portion of the range of input values that is associated with the first entry; and calculating an output value by evaluating the power series approximation defined by the set of coefficients of the first entry at the floating point input value.
    Type: Grant
    Filed: August 30, 2019
    Date of Patent: December 6, 2022
    Assignee: Intel Corporation
    Inventors: Brian J. Hickmann, Nitin N. Garegrat, Maciej Urbanski, Michael Rotzin
  • Patent number: 11204766
    Abstract: Embodiments include a method comprising identifying, by an instruction scheduler of a processor core, a first high power instruction in an instruction stream to be executed by an execution unit of the processor core. A pre-charge signal is asserted indicating that the first high power instruction is scheduled for execution. Subsequent to the pre-charge signal being asserted, a voltage boost signal is asserted to cause a supply voltage for the execution unit to be increased. A busy signal indicating that the first high power instruction is executing is received from the execution unit. Based at least in part on the busy signal being asserted, de-asserting the voltage boost signal. More specific embodiments include decreasing the supply voltage for the execution unit subsequent to the de-asserting the voltage boost signal. Further embodiments include delaying asserting the voltage boost signal based on a start delay time.
    Type: Grant
    Filed: August 30, 2019
    Date of Patent: December 21, 2021
    Assignee: Intel Corporation
    Inventors: Jason Seung-Min Kim, Nitin N. Garegrat, Anitha Loke, Nasima Parveen, David Y. Fang, Kursad Kiziloglu, Dmitry Sergeyevich Lukiyanchenko, Fabrice Paillet, Andrew Yang
  • Patent number: 11169776
    Abstract: Systems, apparatuses and methods may provide for technology that in response to an identification that one or more hardware units are to execute on a first type of data format, decomposes a first original floating point number to a plurality of first segmented floating point numbers that are to be equivalent to the first original floating point number. The technology may further in response to the identification, decompose a second original floating point number to a plurality of second segmented floating point numbers that are to be equivalent to the second original floating point number. The technology may further execute a multiplication operation on the first and second segmented floating point numbers to multiply the first segmented floating point numbers with the second segmented floating point numbers.
    Type: Grant
    Filed: June 28, 2019
    Date of Patent: November 9, 2021
    Assignee: Intel Corporation
    Inventors: Nitin N. Garegrat, Maciej Urbanski, Michael Rotzin, Brian J. Hickmann, Valentina Popescu
  • Publication number: 20190391811
    Abstract: In one embodiment, a matrix processor comprises a memory to store a matrix operand and a strided read sequence, wherein: the matrix operand is stored out of order in the memory; and the strided read sequence comprises a sequence of read operations to read the matrix operand in a correct order from the memory. The matrix processor further comprises circuitry to: receive a first instruction to be executed by the matrix processor, wherein the first instruction is to instruct the matrix processor to perform a first operation on the matrix operand; read the matrix operand from the memory based on the strided read sequence; and execute the first instruction by performing the first operation on the matrix operand.
    Type: Application
    Filed: August 29, 2019
    Publication date: December 26, 2019
    Applicant: Intel Corporation
    Inventors: Nitin N. Garegrat, Tony L. Werner, Jeff DelChiaro, Michael Rotzin, Robert T. Rhoades, Ujwal Basavaraj Sajjanar, Anne Q. Ye
  • Publication number: 20190384575
    Abstract: A method comprising storing a plurality of entries, each entry of the plurality of entries associated with a portion of a range of input values, each entry of the plurality of entries comprising a set of coefficients defining a power series approximation; selecting a first entry of the plurality of entries based on a determination that a floating point input value is within a portion of the range of input values that is associated with the first entry; and calculating an output value by evaluating the power series approximation defined by the set of coefficients of the first entry at the floating point input value.
    Type: Application
    Filed: August 30, 2019
    Publication date: December 19, 2019
    Applicant: Intel Corporation
    Inventors: Brian J. Hickmann, Nitin N. Garegrat, Maciej Urbanski, Michael Rotzin
  • Publication number: 20190384603
    Abstract: Embodiments include a method comprising identifying, by an instruction scheduler of a processor core, a first high power instruction in an instruction stream to be executed by an execution unit of the processor core. A pre-charge signal is asserted indicating that the first high power instruction is scheduled for execution. Subsequent to the pre-charge signal being asserted, a voltage boost signal is asserted to cause a supply voltage for the execution unit to be increased. A busy signal indicating that the first high power instruction is executing is received from the execution unit. Based at least in part on the busy signal being asserted, de-asserting the voltage boost signal. More specific embodiments include decreasing the supply voltage for the execution unit subsequent to the de-asserting the voltage boost signal. Further embodiments include delaying asserting the voltage boost signal based on a start delay time.
    Type: Application
    Filed: August 30, 2019
    Publication date: December 19, 2019
    Inventors: Jason Seung-Min Kim, Nitin N. Garegrat, Anitha Loke, Nasima Parveen, David Y. Fang, Kursad Kiziloglu, Dmitry Sergeyevich Lukiyanchenko, Fabrice Paillet, Andrew Yang
  • Publication number: 20190384370
    Abstract: Embodiments include an apparatus comprising an execution unit coupled to a memory, a microcode controller, and a hardware controller. The microcode controller is to identify a global power and performance hint in an instruction stream that includes first and second instruction phases to be executed in parallel, identify a local hint based on synchronization dependence in the first instruction phase, and use the first local hint to balance power consumption between the execution unit and the memory during parallel executions of the first and second instruction phases. The hardware controller is to use the global hint to determine an appropriate voltage level of a compute voltage and a frequency of a compute clock signal for the execution unit during the parallel executions of the first and second instruction phases. The first local hint includes a processing rate for the first instruction phase or an indication of the processing rate.
    Type: Application
    Filed: August 30, 2019
    Publication date: December 19, 2019
    Inventors: Jason Seung-Min Kim, Sundar Ramani, Yogesh Bansal, Nitin N. Garegrat, Olivia K. Wu, Mayank Kaushik, Mrinal Iyer, Tom Schebye, Andrew Yang
  • Publication number: 20190324723
    Abstract: Systems, apparatuses and methods may provide for technology that in response to an identification that one or more hardware units are to execute on a first type of data format, decomposes a first original floating point number to a plurality of first segmented floating point numbers that are to be equivalent to the first original floating point number. The technology may further in response to the identification, decompose a second original floating point number to a plurality of second segmented floating point numbers that are to be equivalent to the second original floating point number. The technology may further execute a multiplication operation on the first and second segmented floating point numbers to multiply the first segmented floating point numbers with the second segmented floating point numbers.
    Type: Application
    Filed: June 28, 2019
    Publication date: October 24, 2019
    Applicant: Intel Corporation
    Inventors: Nitin N. Garegrat, Maciej Urbanski, Michael Rotzin, Brian J. Hickmann, Valentina Popescu
  • Publication number: 20190007318
    Abstract: Technologies for inflight packet count limiting include a network device. The network device is to receive a packet from a producer application. The packet is configured to be enqueued into a packet queue as a queue element to be consumed by a consumer application. The network device is also to increment, in response to receipt of the packet, an inflight count variable, determine whether a value of the inflight count variable satisfies an inflight count limit, and enqueue, in response to a determination that the value of the inflight count variable satisfies the inflight count limit, the packet.
    Type: Application
    Filed: June 30, 2017
    Publication date: January 3, 2019
    Inventors: Niall D. McDonnell, William Burroughs, Nitin N. Garegrat, David P. Sonnier
  • Patent number: 9286125
    Abstract: A processing engine implementing job arbitration with ordering status is disclosed. A method of the disclosure includes receiving, by a job assigner communicably coupled to a plurality of processors, availability status from a plurality of job rings, availability status from the plurality of processors, and job entry completion status from an order manager, identifying, based on the received job entry completion status, a set of job rings from the plurality of job rings that do not exceed threshold conditions maintained by the job assigner, selecting, from the identified set of job rings, a job ring from which to pull a job entry for assignment, wherein the selecting is based on the received availability status of the plurality of job rings, and selecting, based on the received availability status of the plurality of processors, a processor to receive the assignment of the job entry for processing.
    Type: Grant
    Filed: March 14, 2013
    Date of Patent: March 15, 2016
    Assignee: Intel Corporation
    Inventors: David A. Smiley, Naveen Lakkakula, Weiqiang Ma, Justin B. Diether, Nitin N. Garegrat
  • Publication number: 20140282579
    Abstract: A processing engine implementing job arbitration with ordering status is disclosed. A method of the disclosure includes receiving, by a job assigner communicably coupled to a plurality of processors, availability status from a plurality of job rings, availability status from the plurality of processors, and job entry completion status from an order manager, identifying, based on the received job entry completion status, a set of job rings from the plurality of job rings that do not exceed threshold conditions maintained by the job assigner, selecting, from the identified set of job rings, a job ring from which to pull a job entry for assignment, wherein the selecting is based on the received availability status of the plurality of job rings, and selecting, based on the received availability status of the plurality of processors, a processor to receive the assignment of the job entry for processing.
    Type: Application
    Filed: March 14, 2013
    Publication date: September 18, 2014
    Inventors: David A. Smiley, Naveen Lakkakula, Weiqiang Ma, Justin B. Diether, Nitin N. Garegrat
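
Illustrative sketch: the abstract of patent 11520562 (also published as 20190384575) describes a table of entries, each covering a portion of an input range and holding the coefficients of a power series approximation. The Python below is a rough software illustration of that general idea only and is not the patented implementation. Every detail in it is an assumption made for illustration: the target function (exp), the equal-width segments, the Taylor coefficients taken about each segment's center, and the names build_table and approx_exp.

```python
import math


def build_table(lo, hi, segments, degree):
    """Build one entry per sub-range of [lo, hi): (center, coefficients).

    Assumption: coefficients are Taylor coefficients of exp about the
    segment center, i.e. exp(center) / k! for k = 0..degree.
    """
    width = (hi - lo) / segments
    table = []
    for i in range(segments):
        center = lo + (i + 0.5) * width
        coeffs = [math.exp(center) / math.factorial(k) for k in range(degree + 1)]
        table.append((center, coeffs))
    return table


def approx_exp(x, table, lo, hi):
    """Select the entry whose sub-range contains x, then evaluate its series."""
    if not (lo <= x < hi):
        raise ValueError("input outside the tabulated range")
    width = (hi - lo) / len(table)
    # Entry selection: which portion of the input range holds x.
    index = min(int((x - lo) / width), len(table) - 1)
    center, coeffs = table[index]
    # Horner evaluation of the power series in (x - center).
    t = x - center
    result = 0.0
    for c in reversed(coeffs):
        result = result * t + c
    return result


if __name__ == "__main__":
    table = build_table(0.0, 1.0, segments=8, degree=4)
    x = 0.3
    print(approx_exp(x, table, 0.0, 1.0), math.exp(x))
```

Using more, narrower segments lets each entry use a lower-degree series for the same accuracy, at the cost of a larger table. How a hardware design would balance the two is not stated in the abstract.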