Patents by Inventor Varghese George

Varghese George has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11816500
    Abstract: Apparatuses to synchronize lanes that diverge or threads that drift are disclosed. In one embodiment, a graphics multiprocessor includes a queue having an initial state of groups with a first group having threads of first and second instruction types and a second group having threads of the first and second instruction types. A regroup engine (or regroup circuitry) regroups threads into a third group having threads of the first instruction type and a fourth group having threads of the second instruction type.
    Type: Grant
    Filed: March 15, 2019
    Date of Patent: November 14, 2023
    Assignee: Intel Corporation
    Inventors: Valentin Andrei, Subramaniam Maiyuran, SungYe Kim, Varghese George, Altug Koker, Aravindh Anantaraman
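The regrouping described in this abstract can be pictured with a small software sketch. It is illustrative only and is not the patented circuitry: threads are modeled as hypothetical `(thread_id, instruction_type)` pairs, and the function name `regroup` is an assumption made for this example.

```python
from collections import defaultdict

def regroup(groups):
    """Collect threads from mixed groups into one group per instruction type.

    Each input group is a list of (thread_id, instruction_type) pairs.
    The modeling is hypothetical; the abstract describes hardware.
    """
    by_type = defaultdict(list)
    for group in groups:
        for thread_id, instr_type in group:
            by_type[instr_type].append(thread_id)
    return dict(by_type)

# Initial state: two groups, each mixing the first ("A") and second ("B") types.
initial = [
    [(0, "A"), (1, "B")],
    [(2, "A"), (3, "B")],
]
regrouped = regroup(initial)
print(regrouped)
```

After regrouping, every group holds threads of a single instruction type, which mirrors the third and fourth groups mentioned in the abstract.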
  • Publication number: 20230360307
    Abstract: One embodiment provides a graphics processor comprising a block of execution resources, a cache memory, a cache memory prefetcher, and circuitry including a programmable neural network unit, the programmable neural network unit comprising a network hardware block including circuitry to perform neural network operations and activation operations for a layer of a neural network, the programmable neural network unit addressable by cores within the block of graphics cores and the neural network hardware block configured to perform operations associated with a neural network configured to determine a prefetch pattern for the cache memory prefetcher.
    Type: Application
    Filed: May 1, 2023
    Publication date: November 9, 2023
    Applicant: Intel Corporation
    Inventors: Hugues Labbe, Darrel Palke, Sherine Abdelhak, Jill Boyce, Varghese George, Scott Janus, Adam Lake, Zhijun Lei, Zhengmin Li, Mike Macpherson, Carl Marshall, Selvakumar Panneer, Prasoonkumar Surti, Karthik Veeramani, Deepak Vembar, Vallabhajosyula Srinivasa Somayazulu
  • Patent number: 11809905
    Abstract: One embodiment provides for a general-purpose graphics processing unit comprising a set of processing elements to execute one or more thread groups of a second kernel to be executed by the general-purpose graphics processor, an on-chip memory coupled to the set of processing elements, and a scheduler coupled with the set of processing elements, the scheduler to schedule the thread groups of the kernel to the set of processing elements, wherein the scheduler is to schedule a thread group of the second kernel to execute subsequent to a thread group of a first kernel, the thread group of the second kernel configured to access a region of the on-chip memory that contains data written by the thread group of the first kernel in response to a determination that the second kernel is dependent upon the first kernel.
    Type: Grant
    Filed: September 13, 2021
    Date of Patent: November 7, 2023
    Assignee: Intel Corporation
    Inventors: Valentin Andrei, Aravindh Anantaraman, Abhishek R. Appu, Nicolas C. Galoppo von Borries, Altug Koker, SungYe Kim, Elmoustapha Ould-Ahmed-Vall, Mike Macpherson, Subramaniam Maiyuran, Vasanth Ranganathan, Joydeep Ray, Varghese George
  • Publication number: 20230351543
    Abstract: Embodiments described herein include software, firmware, and hardware logic that provides techniques to perform arithmetic on sparse data via a systolic processing unit. Embodiments described herein provide techniques to detect zero value elements within a vector or a set of packed data elements output by a processing resource and generate metadata to indicate a location of the zero value elements within the plurality of data elements.
    Type: Application
    Filed: May 2, 2023
    Publication date: November 2, 2023
    Applicant: Intel Corporation
    Inventors: Joydeep Ray, Scott Janus, Varghese George, Subramaniam Maiyuran, Altug Koker, Abhishek Appu, Prasoonkumar Surti, Vasanth Ranganathan, Valentin Andrei, Ashutosh Garg, Yoav Harel, Arthur Hunter, Jr., SungYe Kim, Mike Macpherson, Elmoustapha Ould-Ahmed-Vall, William Sadler, Lakshminarayanan Striramassarma, Vikranth Vemulapalli
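As a rough illustration of the zero-detection idea in this abstract, the sketch below scans a vector and produces a bitmask marking where the zero elements sit. The bitmask encoding and the name `zero_metadata` are assumptions made for this example; the abstract does not specify the metadata format.

```python
def zero_metadata(elements):
    """Return (mask, nonzero_values) for a vector of data elements.

    Bit i of mask is set when elements[i] == 0. The bitmask layout is an
    assumed encoding, not one taken from the patent application.
    """
    mask = 0
    nonzero_values = []
    for i, value in enumerate(elements):
        if value == 0:
            mask |= 1 << i
        else:
            nonzero_values.append(value)
    return mask, nonzero_values

mask, packed = zero_metadata([3, 0, 0, 7, 0, 1, 0, 0])
print(bin(mask), packed)
```

A consumer holding the mask and the packed non-zero values could skip the multiplications that would involve a zero operand.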
  • Patent number: 11786158
    Abstract: A bladder fullness monitoring system includes a controller and an active optical sensor that is affixed to a patient's bladder. The sensor emits light onto the bladder and further detects light reflected from the bladder, in order to generate an output signal that indicates the amount of emitted light that was reflected back to the detector. The controller is coupled to the optical sensor to receive and interpret the output signals, e.g., to determine when the bladder is full. The controller may be operatively coupled to a urinary control apparatus which uses the output signals to trigger urination in patients who have lost the ability to voluntarily urinate. Embodiments are particularly useful for monitoring bladder fullness in patients who have lost bladder sensation and/or the ability to voluntarily urinate and rely on a urinary control apparatus in order to urinate.
    Type: Grant
    Filed: December 21, 2018
    Date of Patent: October 17, 2023
    Assignee: InCube Labs, LLC
    Inventors: Ralph Walter Peterson, Kyle Horlen, Stephen R. Kraus, Paul Spehr, Elmar Fischer, Varghese George, Mir A. Imran
  • Publication number: 20230297373
    Abstract: Embodiments described herein provide for an instruction and associated logic to enable GPGPU program code to access special purpose hardware logic to accelerate dot product operations. One embodiment provides for a graphics processing unit comprising a fetch unit to fetch a single instruction for execution, a decode unit to decode the single instruction into a decoded instruction, wherein the decoded instruction is to cause the graphics processing unit to perform a set of parallel dot product operations on elements of input matrices, and a systolic dot product unit to execute the decoded instruction across one or more parallel processor lanes using multiple systolic layers associated with multiple pipeline stages. The multiple pipeline stages include one or more sets of interconnected multipliers and adders to compute multiple concurrent dot products.
    Type: Application
    Filed: April 26, 2023
    Publication date: September 21, 2023
    Applicant: Intel Corporation
    Inventors: Subramaniam Maiyuran, Guei-Yuan Lueh, Supratim Pal, Ashutosh Garg, Chandra S. Gurram, Jorge E. Parra, Junjie Gu, Konrad Trifunovic, Hong Bin Liao, Mike B. MacPherson, Shubh B. Shah, Shubra Marwaha, Stephen Junkins, Timothy R. Bauer, Varghese George, Weiyu Chen
  • Patent number: 11763416
    Abstract: Embodiments described herein provide techniques to disaggregate an architecture of a system on a chip integrated circuit into multiple distinct chiplets that can be packaged onto a common chassis. In one embodiment, a graphics processing unit or parallel processor is composed from diverse silicon chiplets that are separately manufactured. A chiplet is an at least partially and distinctly packaged integrated circuit that includes distinct units of logic that can be assembled with other chiplets into a larger package. A diverse set of chiplets with different IP core logic can be assembled into a single device.
    Type: Grant
    Filed: October 13, 2021
    Date of Patent: September 19, 2023
    Assignee: Intel Corporation
    Inventors: Naveen Matam, Lance Cheney, Eric Finley, Varghese George, Sanjeev Jahagirdar, Altug Koker, Josh Mastronarde, Iqbal Rajwani, Lakshminarayanan Striramassarma, Melaku Teshome, Vikranth Vemulapalli, Binoj Xavier
  • Patent number: 11762804
    Abstract: Methods and apparatus relating to scalar core integration in a graphics processor. In an example, an apparatus comprises a processor to receive a set of workload instructions for a graphics workload from a host complex, determine a first subset of operations in the set of operations that is suitable for execution by a scalar processor complex of the graphics processing device and a second subset of operations in the set of operations that is suitable for execution by a vector processor complex of the graphics processing device, assign the first subset of operations to the scalar processor complex for execution to generate a first set of outputs, and assign the second subset of operations to the vector processor complex for execution to generate a second set of outputs. Other embodiments are also disclosed and claimed.
    Type: Grant
    Filed: July 19, 2022
    Date of Patent: September 19, 2023
    Assignee: Intel Corporation
    Inventors: Joydeep Ray, Aravindh Anantaraman, Abhishek R. Appu, Altug Koker, Elmoustapha Ould-Ahmed-Vall, Valentin Andrei, Subramaniam Maiyuran, Nicolas Galoppo Von Borries, Varghese George, Mike MacPherson, Ben Ashbaugh, Murali Ramadoss, Vikranth Vemulapalli, William Sadler, Jonathan Pearce, Sungye Kim
  • Patent number: 11756150
    Abstract: Embodiments described herein provide techniques to disaggregate an architecture of a system on a chip integrated circuit into multiple distinct chiplets that can be packaged onto a common chassis. In one embodiment, a graphics processing unit or parallel processor is composed from diverse silicon chiplets that are separately manufactured. A chiplet is an at least partially and distinctly packaged integrated circuit that includes distinct units of logic that can be assembled with other chiplets into a larger package. A diverse set of chiplets with different IP core logic can be assembled into a single device.
    Type: Grant
    Filed: February 17, 2022
    Date of Patent: September 12, 2023
    Assignee: Intel Corporation
    Inventors: Naveen Matam, Lance Cheney, Eric Finley, Varghese George, Sanjeev Jahagirdar, Altug Koker, Josh Mastronarde, Iqbal Rajwani, Lakshminarayanan Striramassarma, Melaku Teshome, Vikranth Vemulapalli, Binoj Xavier
  • Patent number: 11755501
    Abstract: An apparatus to facilitate efficient data sharing for graphics data processing operations is disclosed. The apparatus includes a processing resource to generate a stream of instructions, an L1 cache communicably coupled to the processing resource and comprising an on-page detector circuit to determine that a set of memory requests in the stream of instructions access a same memory page; and set a marker in a first request of the set of memory requests; and arbitration circuitry communicably coupled to the L1 cache, the arbitration circuitry to route the set of memory requests to memory comprising the memory page and to, in response to receiving the first request with the marker set, remain with the processing resource to process the set of memory requests.
    Type: Grant
    Filed: March 25, 2021
    Date of Patent: September 12, 2023
    Assignee: Intel Corporation
    Inventors: Joydeep Ray, Altug Koker, Elmoustapha Ould-Ahmed-Vall, Michael Macpherson, Aravindh V. Anantaraman, Vasanth Ranganathan, Lakshminarayanan Striramassarma, Varghese George, Abhishek Appu, Prasoonkumar Surti
  • Publication number: 20230281272
    Abstract: Described herein is a graphics processor including a plurality of processing clusters coupled with a host interface, each processing cluster comprising a plurality of multiprocessors, the plurality of multiprocessors interconnected via a data interconnect, and each multiprocessor comprising sparse matrix multiply acceleration hardware including a systolic processing array with feedback inputs.
    Type: Application
    Filed: April 17, 2023
    Publication date: September 7, 2023
    Applicant: Intel Corporation
    Inventors: Subramaniam Maiyuran, Jorge Parra, Supratim Pal, Ashutosh Garg, Shubra Marwaha, Chandra Gurram, Darin Starkey, Durgesh Borkar, Varghese George
  • Patent number: 11748130
    Abstract: Graphics processing systems and methods are described. A graphics processing apparatus may comprise one or more graphics processing engines, a memory, a memory management unit (MMU) including a GPU second level page table and GPU dirty bit tracking, and a provisioning agent to receive a request from a virtual machine monitor (VMM) to provision a subcluster of graphics processing apparatuses, the subcluster including a plurality of graphics processing engines from a plurality of graphics processing apparatuses connected using a scale-up fabric, provision the scale-up fabric to route data within the subcluster of graphics processing apparatuses, and provision a plurality of resources on the graphics processing apparatus for the subcluster based on the request from the VMM.
    Type: Grant
    Filed: June 28, 2019
    Date of Patent: September 5, 2023
    Assignee: Intel Corporation
    Inventors: Rajesh Sankaran, Bret Toll, William Rash, Subramaniam Maiyuran, Gang Chen, Varghese George
  • Patent number: 11709793
    Abstract: Described herein is a graphics processing unit (GPU) comprising a first processing cluster to perform parallel processing operations, the parallel processing operations including a ray tracing operation and a matrix multiply operation; and a second processing cluster coupled to the first processing cluster, wherein the first processing cluster includes a floating-point unit to perform floating point operations, the floating-point unit is configured to process an instruction using a bfloat16 (BF16) format with a multiplier to multiply second and third source operands while an accumulator adds a first source operand with output from the multiplier.
    Type: Grant
    Filed: May 27, 2022
    Date of Patent: July 25, 2023
    Assignee: Intel Corporation
    Inventors: Subramaniam Maiyuran, Shubra Marwaha, Ashutosh Garg, Supratim Pal, Jorge Parra, Chandra Gurram, Varghese George, Darin Starkey, Guei-Yuan Lueh
  • Publication number: 20230195685
    Abstract: Described herein is a graphics processing unit (GPU) configured to receive an instruction having multiple operands, where the instruction is a single instruction multiple data (SIMD) instruction configured to use a bfloat16 (BF16) number format and the BF16 number format is a sixteen-bit floating point format having an eight-bit exponent. The GPU can process the instruction using the multiple operands, where to process the instruction includes to perform a multiply operation, perform an addition to a result of the multiply operation, and apply a rectified linear unit function to a result of the addition.
    Type: Application
    Filed: February 17, 2023
    Publication date: June 22, 2023
    Applicant: Intel Corporation
    Inventors: Subramaniam Maiyuran, Shubra Marwaha, Ashutosh Garg, Supratim Pal, Jorge Parra, Chandra Gurram, Varghese George, Darin Starkey, Guei-Yuan Lueh
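The sequence named in this abstract (multiply, add, then a rectified linear unit) can be emulated in software as a minimal sketch. BF16 is modeled here as the upper 16 bits of an IEEE 754 float32, which gives the eight-bit exponent the abstract mentions. Truncation is used as the rounding mode purely for simplicity; that choice, and the names `to_bf16` and `mad_relu`, are assumptions made for this example and say nothing about how the hardware rounds.

```python
import struct

def to_bf16(x):
    """Reduce a float to bfloat16 precision by keeping the top 16 bits of its float32 form.

    Truncation (round toward zero) is an assumed rounding mode for this sketch.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]

def mad_relu(a, b, c):
    """Compute relu(a * b + c) with operands and result reduced to BF16 precision."""
    a, b, c = to_bf16(a), to_bf16(b), to_bf16(c)
    return max(0.0, to_bf16(a * b + c))

print(to_bf16(3.14159))          # 3.140625: only 7 explicit mantissa bits survive
print(mad_relu(2.0, 3.0, 1.0))   # 7.0
print(mad_relu(2.0, -3.0, 1.0))  # 0.0: the negative sum is clamped by the ReLU
```

Because BF16 keeps the float32 exponent width, conversion preserves range and only sheds mantissa precision, as the first printed value shows.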
  • Patent number: 11676322
    Abstract: One embodiment provides for a graphics processor comprising a block of graphics compute units, a graphics processor pipeline coupled to the block of graphics compute units, and a programmable neural network unit including one or more neural network hardware blocks. The programmable neural network unit is coupled with the block of graphics compute units and the graphics processor pipeline. The one or more neural network hardware blocks include hardware to perform neural network operations and activation operations for a layer of a neural network. The programmable neural network unit can configure settings of one or more hardware blocks within the graphics processor pipeline based on a machine learning model trained to optimize performance of a set of workloads.
    Type: Grant
    Filed: October 13, 2021
    Date of Patent: June 13, 2023
    Assignee: Intel Corporation
    Inventors: Hugues Labbe, Darrel Palke, Sherine Abdelhak, Jill Boyce, Varghese George, Scott Janus, Adam Lake, Zhijun Lei, Zhengmin Li, Mike Macpherson, Carl Marshall, Selvakumar Panneer, Prasoonkumar Surti, Karthik Veeramani, Deepak Vembar, Vallabhajosyula Srinivasa Somayazulu
  • Patent number: 11676239
    Abstract: Embodiments described herein include software, firmware, and hardware logic that provides techniques to perform arithmetic on sparse data via a systolic processing unit. Embodiments described herein provide techniques to skip computational operations for zero filled matrices and sub-matrices. Embodiments additionally provide techniques to maintain data compression through to a processing unit. Embodiments additionally provide an architecture for a sparse aware logic unit.
    Type: Grant
    Filed: June 3, 2021
    Date of Patent: June 13, 2023
    Assignee: Intel Corporation
    Inventors: Joydeep Ray, Scott Janus, Varghese George, Subramaniam Maiyuran, Altug Koker, Abhishek Appu, Prasoonkumar Surti, Vasanth Ranganathan, Andrei Valentin, Ashutosh Garg, Yoav Harel, Arthur Hunter, Jr., SungYe Kim, Mike Macpherson, Elmoustapha Ould-Ahmed-Vall, William Sadler, Lakshminarayanan Striramassarma, Vikranth Vemulapalli
  • Patent number: 11669329
    Abstract: Embodiments described herein provide for an instruction and associated logic to enable a vector multiply add instruction with automatic zero skipping for sparse input. One embodiment provides for a general-purpose graphics processor comprising logic to perform operations comprising fetching a hardware macro instruction having a predicate mask, a repeat count, and a set of initial operands, where the initial operands include a destination operand and multiple source operands. The hardware macro instruction is configured to perform one or more multiply/add operations on input data associated with a set of matrices.
    Type: Grant
    Filed: April 18, 2022
    Date of Patent: June 6, 2023
    Assignee: Intel Corporation
    Inventors: Supratim Pal, Sasikanth Avancha, Ishwar Bhati, Wei-Yu Chen, Dipankar Das, Ashutosh Garg, Chandra S. Gurram, Junjie Gu, Guei-Yuan Lueh, Subramaniam Maiyuran, Jorge E. Parra, Sudarshan Srinivasan, Varghese George
  • Patent number: 11640297
    Abstract: Embodiments described herein provide for an instruction and associated logic to enable GPGPU program code to access special purpose hardware logic to accelerate dot product operations. One embodiment provides for a graphics processing unit comprising a fetch unit to fetch an instruction for execution and a decode unit to decode the instruction into a decoded instruction. The decoded instruction is a matrix instruction to cause the graphics processing unit to perform a parallel dot product operation. The GPGPU also includes systolic dot product circuitry to execute the decoded instruction across one or more SIMD lanes using multiple systolic layers, wherein to execute the decoded instruction, a dot product computed at a first systolic layer is to be output to a second systolic layer, wherein each systolic layer includes one or more sets of interconnected multipliers and adders, each set of multipliers and adders to generate a dot product.
    Type: Grant
    Filed: June 15, 2021
    Date of Patent: May 2, 2023
    Assignee: Intel Corporation
    Inventors: Subramaniam Maiyuran, Guei-Yuan Lueh, Supratim Pal, Ashutosh Garg, Chandra S. Gurram, Jorge E. Parra, Junjie Gu, Konrad Trifunovic, Hong Bin Liao, Mike B. MacPherson, Shubh B. Shah, Shubra Marwaha, Stephen Junkins, Timothy R. Bauer, Varghese George, Weiyu Chen
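The layered accumulation in this abstract, where a dot product computed at one systolic layer is output to the next, can be loosely modeled in software. The sketch below is an analogy and not the claimed circuitry: the even split of the input across layers, the `depth` parameter, and the name `systolic_dot` are all assumptions made for this example.

```python
def systolic_dot(a, b, depth=4):
    """Compute dot(a, b) in `depth` stages, each passing its running sum onward.

    Every stage multiplies its own slice of the inputs, adds the products,
    and adds the result to the accumulator received from the previous stage.
    """
    if len(a) != len(b) or len(a) % depth != 0:
        raise ValueError("inputs must have equal length divisible by depth")
    chunk = len(a) // depth
    acc = 0
    for layer in range(depth):
        start = layer * chunk
        partial = sum(x * y for x, y in zip(a[start:start + chunk],
                                            b[start:start + chunk]))
        acc += partial  # output of this layer becomes input to the next
    return acc

result = systolic_dot([1, 2, 3, 4, 5, 6, 7, 8], [1, 1, 1, 1, 1, 1, 1, 1])
print(result)
```

In hardware the stages would operate concurrently on different data as a pipeline; the sequential loop here only shows how the partial sums flow from layer to layer.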
  • Publication number: 20230125173
    Abstract: Provided herein are multi-chain chimeric polypeptides and use thereof in the treatment of liver diseases.
    Type: Application
    Filed: August 11, 2022
    Publication date: April 27, 2023
    Applicant: HCW Biologics, Inc.
    Inventors: Hing C. Wong, Xiaoyun Zhu, Pallavi Chaturvedi, Varghese George, Niraj Shrestha, Michael Dee
  • Patent number: 11636174
    Abstract: Described herein is an accelerator device including a host interface, a fabric interconnect coupled with the host interface, and one or more hardware tiles coupled with the fabric interconnect, the one or more hardware tiles including sparse matrix multiply acceleration hardware including a systolic array with feedback inputs.
    Type: Grant
    Filed: November 16, 2021
    Date of Patent: April 25, 2023
    Assignee: Intel Corporation
    Inventors: Subramaniam Maiyuran, Jorge Parra, Supratim Pal, Ashutosh Garg, Shubra Marwaha, Chandra Gurram, Darin Starkey, Durgesh Borkar, Varghese George