Patents by Inventor Jorge E. Parra
Jorge E. Parra has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11937597Abstract: The present invention provides a cryopreservation device with an integrated tracking device chamber. The tracking device can be an RFID tag which can be placed in the cryopreservation device as part of the procedure for vitrification of a biological specimen. The cryopreservation device maintains its integrity and keeps the RFID tag securely in the chamber at both ambient conditions when the specimen is collected and under liquid nitrogen where the specimen is stored. The cryopreservation device with an integrated tracking device chamber allows for each individual biological sample to have its own tracking device for monitoring during cryogenic storage.Type: GrantFiled: April 19, 2023Date of Patent: March 26, 2024Assignee: Biotech, Inc.Inventor: Jorge E. Parra-Almanza
-
Patent number: 11900502Abstract: Examples described herein relate to a software and hardware optimization that manages scenarios where a write operation to a register is less than an entirety of the register. A compiler detects instructions that make partial writes to the same register, groups such instructions, and provides hints to hardware of the partial write. The execution unit combines the output data for grouped instructions and updates the destination register as single write instead of multiple separate partial writes.Type: GrantFiled: May 2, 2022Date of Patent: February 13, 2024Assignee: Intel CorporationInventors: Chandra S. Gurram, Gang Y. Chen, Subramaniam Maiyuran, Supratim Pal, Ashutosh Garg, Jorge E. Parra, Darin M. Starkey, Guei-Yuan Lueh, Wei-Yu Chen
-
Publication number: 20230297373Abstract: Embodiments described herein provided for an instruction and associated logic to enable GPGPU program code to access special purpose hardware logic to accelerate dot product operations. One embodiment provides for a graphics processing unit comprising a fetch unit to fetch a single instruction for execution, a decode unit to decode the single instruction into a decoded instruction, wherein the decoded instruction is to cause the graphics processing unit to perform a set of parallel dot product operations on elements of input matrices, and a systolic dot product unit to execute the decoded instruction across one or more parallel processor lanes using multiple systolic layers associated with multiple pipeline stages. The multiple pipeline stages include one or more sets of interconnected multipliers and adders to compute multiple concurrent dot products.Type: ApplicationFiled: April 26, 2023Publication date: September 21, 2023Applicant: Intel CorporationInventors: SUBRAMANIAM MAIYURAN, GUEI-YUAN LUEH, SUPRATIM PAL, ASHUTOSH GARG, CHANDRA S. GURRAM, JORGE E. PARRA, JUNJIE GU, KONRAD TRIFUNOVIC, HONG BIN LIAO, MIKE B. MACPHERSON, SHUBH B. SHAH, SHUBRA MARWAHA, STEPHEN JUNKINS, TIMOTHY R. BAUER, VARGHESE GEORGE, WEIYU CHEN
-
Patent number: 11669329Abstract: Embodiments described herein provide for an instruction and associated logic to enable a vector multiply add instructions with automatic zero skipping for sparse input. One embodiment provides for a general-purpose graphics processor comprising logic to perform operations comprising fetching a hardware macro instruction having a predicate mask, a repeat count, and a set of initial operands, where the initial operands include a destination operand and multiple source operands. The hardware macro instruction is configured to perform one or more multiply/add operations on input data associated with a set of matrices.Type: GrantFiled: April 18, 2022Date of Patent: June 6, 2023Assignee: Intel CorporationInventors: Supratim Pal, Sasikanth Avancha, Ishwar Bhati, Wei-Yu Chen, Dipankar Das, Ashutosh Garg, Chandra S. Gurram, Junjie Gu, Guei-Yuan Lueh, Subramaniam Maiyuran, Jorge E. Parra, Sudarshan Srinivasan, Varghese George
-
Patent number: 11640297Abstract: Embodiments described herein provided for an instruction and associated logic to enable GPGPU program code to access special purpose hardware logic to accelerate dot product operations. One embodiment provides for a graphics processing unit comprising a fetch unit to fetch an instruction for execution and a decode unit to decode the instruction into a decoded instruction. The decoded instruction is a matrix instruction to cause the graphics processing unit to perform a parallel dot product operation. The GPGPU also includes systolic dot product circuitry to execute the decoded instruction across one or more SIMD lanes using multiple systolic layers, wherein to execute the decoded instruction, a dot product computed at a first systolic layer is to be output to a second systolic layer, wherein each systolic layer includes one or more sets of interconnected multipliers and adders, each set of multipliers and adders to generate a dot product.Type: GrantFiled: June 15, 2021Date of Patent: May 2, 2023Assignee: Intel CorporationInventors: Subramaniam Maiyuran, Guei-Yuan Lueh, Supratim Pal, Ashutosh Garg, Chandra S. Gurram, Jorge E. Parra, Junjie Gu, Konrad Trifunovic, Hong Bin Liao, Mike B. MacPherson, Shubh B. Shah, Shubra Marwaha, Stephen Junkins, Timothy R. Bauer, Varghese George, Weiyu Chen
-
Publication number: 20230060900Abstract: Embodiments detailed herein relate to reduction operations on a plurality of data element values. In one embodiment, a process comprises decoding circuitry to decode an instruction and execution circuitry to execute the decoded instruction. The instruction specifies a first input register containing a plurality of data element values, a first index register containing a plurality of indices, and an output register, where each index of the plurality of indices maps to one unique data element position of the first input register. The execution includes to identify data element values that are associated with one another based on the indices, perform one or more reduction operations on the associated data element values based on the identification, and store results of the one or more reduction operations in the output register.Type: ApplicationFiled: October 4, 2022Publication date: March 2, 2023Inventors: Christopher J. HUGHES, Jonathan D. PEARCE, Guei-Yuan LUEH, ElMoustapha OULD-AHMED-VALL, Jorge E. PARRA, Prasoonkumar SURTI, Krishna N. VINOD, Ronen ZOHAR
-
Publication number: 20220326953Abstract: Embodiments described herein provide for an instruction and associated logic to enable a vector multiply add instructions with automatic zero skipping for sparse input. One embodiment provides for a general-purpose graphics processor comprising logic to perform operations comprising fetching a hardware macro instruction having a predicate mask, a repeat count, and a set of initial operands, where the initial operands include a destination operand and multiple source operands. The hardware macro instruction is configured to perform one or more multiply/add operations on input data associated with a set of matrices.Type: ApplicationFiled: April 18, 2022Publication date: October 13, 2022Applicant: Intel CorporationInventors: Supratim Pal, Sasikanth Avancha, Ishwar Bhati, Wei-Yu Chen, Dipankar Das, Ashutosh Garg, Chandra S. Gurram, Junjie Gu, Guei-Yuan Lueh, Subramaniam Maiyuran, Jorge E. Parra, Sudarshan Srinivasan, Varghese George
-
Publication number: 20220261949Abstract: Examples described herein relate to a software and hardware optimization that manages scenarios where a write operation to a register is less than an entirety of the register. A compiler detects instructions that make partial writes to the same register, groups such instructions, and provides hints to hardware of the partial write. The execution unit combines the output data for grouped instructions and updates the destination register as single write instead of multiple separate partial writes.Type: ApplicationFiled: May 2, 2022Publication date: August 18, 2022Inventors: Chandra S. GURRAM, Gang Y. CHEN, Subramaniam MAIYURAN, Supratim PAL, Ashutosh GARG, Jorge E. PARRA, Darin M. STARKEY, Guei-Yuan LUEH, Wei-Yu CHEN
-
Publication number: 20220229661Abstract: Embodiments detailed herein relate to reduction operations on a plurality of data element values. In one embodiment, a process comprises decoding circuitry to decode an instruction and execution circuitry to execute the decoded instruction. The instruction specifies a first input register containing a plurality of data element values, a first index register containing a plurality of indices, and an output register, where each index of the plurality of indices maps to one unique data element position of the first input register. The execution includes to identify data element values that are associated with one another based on the indices, perform one or more reduction operations on the associated data element values based on the identification, and store results of the one or more reduction operations in the output register.Type: ApplicationFiled: April 4, 2022Publication date: July 21, 2022Inventors: Christopher J. HUGHES, Jonathan D. PEARCE, Guei-Yuan LUEH, ElMoustapha OULD-AHMED-VALL, Jorge E. PARRA, Prasoonkumar SURTI, Krishna N. VINOD, Ronen ZOHAR
-
Patent number: 11321799Abstract: Examples described herein relate to a software and hardware optimization that manages scenarios where a write operation to a register is less than an entirety of the register. A compiler detects instructions that make partial writes to the same register, groups such instructions, and provides hints to hardware of the partial write. The execution unit combines the output data for grouped instructions and updates the destination register as single write instead of multiple separate partial writes.Type: GrantFiled: December 24, 2019Date of Patent: May 3, 2022Assignee: Intel CorporationInventors: Chandra S. Gurram, Gang Y. Chen, Subramaniam Maiyuran, Supratim Pal, Ashutosh Garg, Jorge E. Parra, Darin M. Starkey, Guei-Yuan Lueh, Wei-Yu Chen
-
Patent number: 11314515Abstract: Embodiments described herein provide for an instruction and associated logic to enable a vector multiply add instructions with automatic zero skipping for sparse input. One embodiment provides for a general-purpose graphics processor comprising logic to perform operations comprising fetching a hardware macro instruction having a predicate mask, a repeat count, and a set of initial operands, where the initial operands include a destination operand and multiple source operands. The hardware macro instruction is configured to perform one or more multiply/add operations on input data associated with a set of matrices.Type: GrantFiled: December 23, 2019Date of Patent: April 26, 2022Assignee: Intel CorporationInventors: Supratim Pal, Sasikanth Avancha, Ishwar Bhati, Wei-Yu Chen, Dipankar Das, Ashutosh Garg, Chandra S. Gurram, Junjie Gu, Guei-Yuan Lueh, Subramaniam Maiyuran, Jorge E. Parra, Sudarshan Srinivasan, Varghese George
-
Patent number: 11294670Abstract: Embodiments detailed herein relate to reduction operations on a plurality of data element values. In one embodiment, a process comprises decoding circuitry to decode an instruction and execution circuitry to execute the decoded instruction. The instruction specifies a first input register containing a plurality of data element values, a first index register containing a plurality of indices, and an output register, where each index of the plurality of indices maps to one unique data element position of the first input register. The execution includes to identify data element values that are associated with one another based on the indices, perform one or more reduction operations on the associated data element values based on the identification, and store results of the one or more reduction operations in the output register.Type: GrantFiled: March 27, 2019Date of Patent: April 5, 2022Assignee: INTEL CORPORATIONInventors: Christopher J. Hughes, Jonathan D. Pearce, Guei-Yuan Lueh, ElMoustapha Ould-Ahmed-Vall, Jorge E. Parra, Prasoonkumar Surti, Krishna N. Vinod, Ronen Zohar
-
Publication number: 20210303299Abstract: Embodiments described herein provided for an instruction and associated logic to enable GPGPU program code to access special purpose hardware logic to accelerate dot product operations. One embodiment provides for a graphics processing unit comprising a fetch unit to fetch an instruction for execution and a decode unit to decode the instruction into a decoded instruction. The decoded instruction is a matrix instruction to cause the graphics processing unit to perform a parallel dot product operation. The GPGPU also includes systolic dot product circuitry to execute the decoded instruction across one or more SIMD lanes using multiple systolic layers, wherein to execute the decoded instruction, a dot product computed at a first systolic layer is to be output to a second systolic layer, wherein each systolic layer includes one or more sets of interconnected multipliers and adders, each set of multipliers and adders to generate a dot product.Type: ApplicationFiled: June 15, 2021Publication date: September 30, 2021Applicant: Intel CorporationInventors: SUBRAMANIAM MAIYURAN, GUEI-YUAN LUEH, SUPRATIM PAL, ASHUTOSH GARG, CHANDRA S. GURRAM, JORGE E. PARRA, JUNJIE GU, KONRAD TRIFUNOVIC, HONG BIN LIAO, MIKE B. MACPHERSON, SHUBH B. SHAH, SHUBRA MARWAHA, STEPHEN JUNKINS, TIMOTHY R. BAUER, VARGHESE GEORGE, WEIYU CHEN
-
Publication number: 20210192673Abstract: Examples described herein relate to a software and hardware optimization that manages scenarios where a write operation to a register is less than an entirety of the register. A compiler detects instructions that make partial writes to the same register, groups such instructions, and provides hints to hardware of the partial write. The execution unit combines the output data for grouped instructions and updates the destination register as single write instead of multiple separate partial writes.Type: ApplicationFiled: December 24, 2019Publication date: June 24, 2021Inventors: Chandra S. GURRAM, Gang Y. CHEN, Subramaniam MAIYURAN, Supratim PAL, Ashutosh GARG, Jorge E. PARRA, Darin M. STARKEY, Guei-Yuan LUEH, Wei-Yu CHEN
-
Publication number: 20210191724Abstract: Embodiments described herein provide for an instruction and associated logic to enable a vector multiply add instructions with automatic zero skipping for sparse input. One embodiment provides for a general-purpose graphics processor comprising logic to perform operations comprising fetching a hardware macro instruction having a predicate mask, a repeat count, and a set of initial operands, where the initial operands include a destination operand and multiple source operands. The hardware macro instruction is configured to perform one or more multiply/add operations on input data associated with a set of matrices.Type: ApplicationFiled: December 23, 2019Publication date: June 24, 2021Applicant: Intel CorporationInventors: Supratim Pal, Sasikanth Avancha, Ishwar Bhati, Wei-Yu Chen, Dipankar Das, Ashutosh Garg, Chandra S. Gurram, Junjie Gu, Guei-Yuan Lueh, Subramaniam Maiyuran, Jorge E. Parra, Sudarshan Srinivasan, Varghese George
-
Patent number: 11042370Abstract: Embodiments described herein provided for an instruction and associated logic to enable GPGPU program code to access special purpose hardware logic to accelerate dot product operations. One embodiment provides for a graphics processing unit comprising a fetch unit to fetch an instruction for execution and a decode unit to decode the instruction into a decoded instruction. The decoded instruction is a matrix instruction to cause the graphics processing unit to perform a parallel dot product operation. The GPGPU also includes a systolic dot product unit to execute the decoded instruction across one or more SIMD lanes using multiple systolic layers, wherein to execute the decoded instruction, a dot product computed at a first systolic layer is to be output to a second systolic layer, wherein each systolic layer includes one or more sets of interconnected multipliers and adders, each set of multipliers and adders to generate a dot product.Type: GrantFiled: April 19, 2018Date of Patent: June 22, 2021Assignee: Intel CorporationInventors: Subramaniam Maiyuran, Guei-Yuan Lueh, Supratim Pal, Ashutosh Garg, Chandra S. Gurram, Jorge E. Parra, Junjie Gu, Konrad Trifunovic, Hong Bin Liao, Mike B. Macpherson, Shubh B. Shah, Shubra Marwaha, Stephen Junkins, Timothy R. Bauer, Varghese George, Weiyu Chen
-
Patent number: 10983794Abstract: An processor to facilitate register sharing is disclosed. The processor includes a plurality of execution units (EUs), each including a General Purpose Register File (GRF) having a plurality of registers; and register sharing hardware to divide the plurality of registers into a first set of registers dedicated for execution of a first set of threads and a second set of registers shared for execution of a second set of threads.Type: GrantFiled: June 17, 2019Date of Patent: April 20, 2021Assignee: Intel CorporationInventors: Guei-Yuan Lueh, Subramaniam Maiyuran, Weiyu Chen, Konrad Trifunovic, Supratim Pal, Chandra S. Gurram, Jorge E. Parra, Pratik J. Ashar, Tomasz Bujewski
-
Publication number: 20200394041Abstract: An processor to facilitate register sharing is disclosed. The processor includes a plurality of execution units (EUs), each including a General Purpose Register File (GRF) having a plurality of registers; and register sharing hardware to divide the plurality of registers into a first set of registers dedicated for execution of a first set of threads and a second set of registers shared for execution of a second set of threads.Type: ApplicationFiled: June 17, 2019Publication date: December 17, 2020Applicant: Intel CorporationInventors: Guei-Yuan Lueh, Subramaniam Maiyuran, Weiyu Chen, Konrad Trifunovic, Supratim Pal, Chandra S. Gurram, Jorge E. Parra, Pratik J. Ashar, Tomasz Bujewski
-
Patent number: 10839478Abstract: A processor is disclosed. The processor includes an execution unit having a register file having one or more banks of registers to store operand values, an accumulator comprising a pool of registers to store operand values determined to cause a conflict at register banks within the register file and cache circuitry to control storage of the operand values determined to cause a conflict at the register banks from the register file to the pool of registers.Type: GrantFiled: April 8, 2019Date of Patent: November 17, 2020Assignee: Intel CorporationInventors: Guei-Yuan Lueh, Subramaniam Maiyuran, Wei-Yu Chen, Konrad Trifunovic, Supratim Pal, Chandra S. Gurram, Jorge E. Parra, Pratik J. Ashar, Tomasz Bujewski
-
Patent number: 10817297Abstract: Methods and apparatus for vector-matrix comparison are disclosed. In one embodiment, a processor comprises decoding and execution circuitry. The decoding circuitry decodes an instruction, where operands of the instruction specifies an output location to store output results, a vector of data element values, and a matrix of data element values. The execution circuitry executes the decoded instruction. The execution includes to map each of the data element values of the vector to one of consecutive rows of the matrix; for each data element value of the vector, to compare that data element value of the vector with data element values in a respective row of the matrix and obtain data element match results. The execution further includes to store the output results based on the data element match results, where each output result maps to a respective data element column position and indicates a vector match result.Type: GrantFiled: March 30, 2019Date of Patent: October 27, 2020Assignee: Intel CorporationInventors: Christopher J. Hughes, ElMoustapha Ould-Ahmed-Vall, Jorge E. Parra, Prasoonkumar Surti, Krishna N. Vinod, Ronen Zohar