Patents by Inventor Joydeep Ray

Joydeep Ray has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240070926
    Abstract: Embodiments are generally directed to compression in machine learning and deep learning processing. An embodiment of an apparatus for compression of untyped data includes a graphical processing unit (GPU) including a data compression pipeline, the data compression pipeline including a data port coupled with one or more shader cores, wherein the data port is to allow transfer of untyped data without format conversion, and a 3D compression/decompression unit to provide for compression of untyped data to be stored to a memory subsystem and decompression of untyped data from the memory subsystem.
    Type: Application
    Filed: September 13, 2023
    Publication date: February 29, 2024
    Applicant: Intel Corporation
    Inventors: Joydeep Ray, Ben Ashbaugh, Prasoonkumar Surti, Pradeep Ramani, Rama Harihara, Jerin C. Justin, Jing Huang, Xiaoming Cui, Timothy B. Costa, Ting Gong, Elmoustapha Ould-ahmed-vall, Kumar Balasubramanian, Anil Thomas, Oguz H. Elibol, Jayaram Bobba, Guozhong Zhuang, Bhavani Subramanian, Gokce Keskin, Chandrasekaran Sakthivel, Rajesh Poornachandran
  • Publication number: 20240069737
    Abstract: Embodiments described herein provide a technique to improve the performance of bit-wise atomic writes to the same double word address. One embodiment provides a graphics processor comprising a system interface, a graphics processor core coupled with the system interface, and circuitry to process memory access messages received from the graphics processor core. To process the memory access messages, the circuitry is configured to merge operands associated with one or more memory access messages to perform a bitwise atomic operation, the one or more memory access messages having addresses within a same 4-byte location in memory.
    Type: Application
    Filed: August 23, 2022
    Publication date: February 29, 2024
    Applicant: Intel Corporation
    Inventors: Joydeep Ray, Abhishek R. Appu, Prathamesh Raghunath Shinde
  • Patent number: 11915357
    Abstract: Apparatus and method for stack throttling.
    Type: Grant
    Filed: March 16, 2020
    Date of Patent: February 27, 2024
    Assignee: Intel Corporation
    Inventors: Karthik Vaidyanathan, Abhishek Appu, Vasanth Ranganathan, Joydeep Ray, Prasoonkumar Surti
  • Publication number: 20240053985
    Abstract: Embodiments described herein provide an apparatus comprising a plurality of processing resources including a first processing resource and a second processing resource, a shared local memory communicatively coupled to the first processing resource and the second processing resource, and a processor to receive an instruction to initiate a matrix multiplication operation, write a first set of matrix data into a first set of registers, and share the first set of matrix data between the first processing resource and the second processing resource for use in the matrix multiplication operation. Other embodiments may be described and claimed.
    Type: Application
    Filed: October 11, 2023
    Publication date: February 15, 2024
    Applicant: Intel Corporation
    Inventors: SUBRAMANIAM MAIYURAN, VARGHESE GEORGE, JOYDEEP RAY, ASHUTOSH GARG, JORGE PARRA, SHUBH SHAH, SHUBRA MARWAHA
  • Publication number: 20240054595
    Abstract: Embodiments described herein provide a system of concurrent compute queues that enable the scheduling of a large number of compute contexts simultaneously on graphics processor hardware. One embodiment provides an apparatus comprising a system interface and a general-purpose graphics processor coupled with the system interface. The general-purpose graphics processor comprises a plurality of graphics processor hardware resources configured to be partitioned into a plurality of isolated partitions, each of the plurality of isolated partitions including a first command streamer, a second command streamer, and circuitry configured to schedule general-purpose graphics compute workloads submitted to a first plurality of command queues associated with the first command streamer and a second plurality of command queues associated with the second command streamer.
    Type: Application
    Filed: August 10, 2022
    Publication date: February 15, 2024
    Applicant: Intel Corporation
    Inventors: Joydeep Ray, Vasanth Ranganathan, James Valerio, Jeffery S. Boles, Hema Chand Nalluri, Aditya Navale, Ben J. Ashbaugh, Michal Mrozek, Murali Ramadoss, Hong Jiang, Ankur Shah
  • Patent number: 11899614
    Abstract: Embodiments described herein provide techniques to facilitate instruction-based control of memory attributes. One embodiment provides a graphics processor comprising a processing resource, a memory device, a cache coupled with the processing resources and the memory, and circuitry to process a memory access message received from the processing resource. The memory access message enables access to data of the memory device. To process the memory access message, the circuitry is configured to determine one or more cache attributes that indicate whether the data should be read from or stored the cache. The cache attributes may be provided by the memory access message or stored in state data associated with the data to be accessed by the access message.
    Type: Grant
    Filed: June 24, 2022
    Date of Patent: February 13, 2024
    Assignee: Intel Corporation
    Inventors: Joydeep Ray, Altug Koker, Varghese George, Mike Macpherson, Aravindh Anantaraman, Abhishek R. Appu, Elmoustapha Ould-Ahmed-Vall, Nicolas Galoppo von Borries, Ben J. Ashbaugh
  • Patent number: 11900665
    Abstract: A graphics processor can include a processing cluster array including a plurality of processing clusters coupled with the plurality of memory controllers, each processing cluster of the plurality of processing clusters including a plurality of streaming multiprocessors, the processing cluster array configured for partitioning into a plurality of partitions. The plurality of partitions include a first partition including a first plurality of streaming multiprocessors configured to perform operations for a first neural network, The operations for the first neural network are isolated to the first partition. The plurality of partitions also include a second partition including a second plurality of streaming multiprocessors configured to perform operations for a second neural network. The operations for the second neural network are isolated to the second partition and protected from operations performed for the first neural network.
    Type: Grant
    Filed: July 25, 2023
    Date of Patent: February 13, 2024
    Assignee: Intel Corporation
    Inventors: Barnan Das, Mayuresh M. Varerkar, Narayan Biswal, Stanley J. Baran, Gokcen Cilingir, Nilesh V. Shah, Archie Sharma, Sherine Abdelhak, Praneetha Kotha, Neelay Pandit, John C. Weast, Mike B. Macpherson, Dukhwan Kim, Linda L. Hurd, Abhishek R. Appu, Altug Koker, Joydeep Ray
  • Publication number: 20240045830
    Abstract: Methods and apparatus relating to scalar core integration in a graphics processor. In an example, an apparatus comprises a processor to receive a set of workload instructions for a graphics workload from a host complex, determine a first subset of operations in the set of operations that is suitable for execution by a scalar processor complex of the graphics processing device and a second subset of operations in the set of operations that is suitable for execution by a vector processor complex of the graphics processing device, assign the first subset of operations to the scalar processor complex for execution to generate a first set of outputs, assign the second subset of operations to the vector processor complex for execution to generate a second set of outputs. Other embodiments are also disclosed and claimed.
    Type: Application
    Filed: August 16, 2023
    Publication date: February 8, 2024
    Applicant: Intel Corporation
    Inventors: Joydeep RAY, Aravindh ANANTARAMAN, Abhishek R. APPU, Altug KOKER, Elmoustapha OULD-AHMED-VALL, Valentin ANDREI, Subramaniam MAIYURAN, Nicolas GALOPPO VON BORRIES, Varghese GEORGE, Mike MACPHERSON, Ben ASHBAUGH, Murali RAMADOSS, Vikranth VEMULAPALLI, William SADLER, Jonathan PEARCE, Sungye KIM
  • Patent number: 11892950
    Abstract: Embodiments are generally directed to data prefetching for graphics data processing. An embodiment of an apparatus includes one or more processors including one or more graphics processing units (GPUs); and a plurality of caches to provide storage for the one or more GPUs, the plurality of caches including at least an L1 cache and an L3 cache, wherein the apparatus to provide intelligent prefetching of data by a prefetcher of a first GPU of the one or more GPUs including measuring a hit rate for the L1 cache; upon determining that the hit rate for the L1 cache is equal to or greater than a threshold value, limiting a prefetch of data to storage in the L3 cache, and upon determining that the hit rate for the L1 cache is less than a threshold value, allowing the prefetch of data to the L1 cache.
    Type: Grant
    Filed: July 15, 2022
    Date of Patent: February 6, 2024
    Assignee: INTEL CORPORATION
    Inventors: Vikranth Vemulapalli, Lakshminarayanan Striramassarma, Mike MacPherson, Aravindh Anantaraman, Ben Ashbaugh, Murali Ramadoss, William B. Sadler, Jonathan Pearce, Scott Janus, Brent Insko, Vasanth Ranganathan, Kamal Sinha, Arthur Hunter, Jr., Prasoonkumar Surti, Nicolas Galoppo von Borries, Joydeep Ray, Abhishek R. Appu, ElMoustapha Ould-Ahmed-Vall, Altug Koker, Sungye Kim, Subramaniam Maiyuran, Valentin Andrei
  • Publication number: 20240037691
    Abstract: Systems, apparatuses, and methods may provide for technology to process multi-resolution images by identifying pixels at a boundary between pixels of different resolutions, and selectively smoothing the identified pixels.
    Type: Application
    Filed: May 4, 2023
    Publication date: February 1, 2024
    Applicant: Intel Corporation
    Inventors: Travis T. Schluessler, Joydeep Ray, John H. Feit, Nikos Kaburlasos, Jacek Kwiatkowski
  • Publication number: 20240012767
    Abstract: An apparatus to facilitate efficient data sharing for graphics data processing operations is disclosed. The apparatus includes a processing resource to generate a stream of instructions, an L1 cache communicably coupled to the processing resource and comprising an on-page detector circuit to determine that a set of memory requests in the stream of instructions access a same memory page; and set a marker in a first request of the set of memory requests; and arbitration circuitry communicably coupled to the L1 cache, the arbitration circuitry to route the set of memory requests to memory comprising the memory page and to, in response to receiving the first request with the marker set, remain with the processing resource to process the set of memory requests.
    Type: Application
    Filed: July 25, 2023
    Publication date: January 11, 2024
    Applicant: Intel Corporation
    Inventors: Joydeep Ray, Altug Koker, Elmoustapha Ould-Ahmed-Vall, Michael Macpherson, Aravindh V. Anantaraman, Vasanth Ranganathan, Lakshminarayanan Striramassarma, Varghese George, Abhishek Appu, Prasoonkumar Surti
  • Publication number: 20240013337
    Abstract: A mechanism is described for detecting, at training time, information related to one or more tasks to be performed by the one or more processors according to a training dataset for a neural network, analyzing the information to determine one or more portions of hardware of a processor of the one or more processors that is configurable to support the one or more tasks, configuring the hardware to pre-select the one or more portions to perform the one or more tasks, while other portions of the hardware remain available for other tasks, and monitoring utilization of the hardware via a hardware unit of the graphics processor and, via a scheduler of the graphics processor, adjusting allocation of the one or more tasks to the one or more portions of the hardware based on the utilization.
    Type: Application
    Filed: July 13, 2023
    Publication date: January 11, 2024
    Applicant: Intel Corporation
    Inventors: Abhishek R. Appu, Altug Koker, John C. Weast, Mike B. Macpherson, Linda L. Hurd, Sara S. Baghsorkhi, Justin E. Gottschlich, Prasoonkumar Surti, Chandrasekaran Sakthivel, Liwei Ma, Elmoustapha Ould-Ahmed-Vall, Kamal Sinha, Joydeep Ray, Balaji Vembu, Sanjeev Jahagirdar, Vasanth Ranganathan, Dukhwan Kim
  • Patent number: 11868264
    Abstract: One embodiment provides circuitry coupled with cache memory and a memory interface, the circuitry to compress compute data at multiple cache line granularity, and a processing resource coupled with the memory interface and the cache memory. The processing resource is configured to perform a general-purpose compute operation on compute data associated with multiple cache lines of the cache memory. The circuitry is configured to compress the compute data before a write of the compute data via the memory interface to the memory bus, in association with a read of the compute data associated with the multiple cache lines via the memory interface, decompress the compute data, and provide the decompressed compute data to the processing resource.
    Type: Grant
    Filed: February 13, 2023
    Date of Patent: January 9, 2024
    Assignee: Intel Corporation
    Inventors: Abhishek R. Appu, Altug Koker, Joydeep Ray, David Puffer, Prasoonkumar Surti, Lakshminarayanan Striramassarma, Vasanth Ranganathan, Kiran C. Veernapu, Balaji Vembu, Pattabhiraman K
  • Patent number: 11871142
    Abstract: Systems, apparatuses and methods may provide for technology that determines a frame rate of video content, sets a blend amount parameter based on the frame rate, and temporally anti-aliases the video content based on the blend amount parameter. Additionally, the technology may detect a coarse pixel (CP) shading condition with respect to one or more frames in the video content and select, in response to the CP shading condition, a per frame jitter pattern that jitters across pixels, wherein the video content is temporally anti-aliased based on the per frame jitter pattern. The CP shading condition may also cause the technology to apply a gradient to a plurality of color planes on a per color plane basis and discard pixel level samples associated with a CP if all mip data corresponding to the CP is transparent or shadowed out.
    Type: Grant
    Filed: December 14, 2021
    Date of Patent: January 9, 2024
    Assignee: Intel Corporation
    Inventors: Travis T. Schluessler, Joydeep Ray, John H. Feit, Nikos Kaburlasos, Jacek Kwiatkowski, Karthik Vaidyanathan, Prasoonkumar Surti, Michael Apodaca, Murali Ramadoss, Abhishek Venkatesh
  • Publication number: 20240004713
    Abstract: In an example, an apparatus comprises a plurality of execution units comprising at least a first type of execution unit and a second type of execution unit and logic, at least partially including hardware logic, to analyze a workload and assign the workload to one of the first type of execution unit or the second type of execution unit. Other embodiments are also disclosed and claimed.
    Type: Application
    Filed: August 1, 2023
    Publication date: January 4, 2024
    Applicant: Intel Corporation
    Inventors: Abhishek R. APPU, Altug KOKER, Balaji VEMBU, Joydeep RAY, Kamal SINHA, Prasoonkumar SURTI, Kiran C. VEERNAPU, Subramaniam MAIYURAN, Sanjeev S. Jahagirdar, Eric J. Asperheim, Guei-Yuan Lueh, David Puffer, Wenyin Fu, Nikos Kaburlasos, Bhushan M. Borole, Josh B. Mastronarde, Linda L. Hurd, Travis T. Schluessler, Tomasz Janczak, Abhishek Venkatesh, Kai Xiao, Slawomir Grajewski
  • Publication number: 20240004829
    Abstract: An integrated circuit (IC) package apparatus is disclosed. The IC package includes one or more processing units and a bridge, mounted below the one or more processing unit, including one or more arithmetic logic units (ALUs) to perform atomic operations.
    Type: Application
    Filed: July 12, 2023
    Publication date: January 4, 2024
    Applicant: Intel Corporation
    Inventors: Altug Koker, Farshad Akhbari, Feng Chen, Dukhwan Kim, Narayan Srinivasa, Nadathur Rajagopalan Satish, Liwei Ma, Jeremy Bottleson, Eriko Nurvitadhi, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu
  • Publication number: 20240004833
    Abstract: In an example, an apparatus comprises a plurality of execution units, and a first memory communicatively couple to the plurality of execution units, wherein the first shared memory is shared by the plurality of execution units and a copy engine to copy context state data from at least a first of the plurality of execution units to the first shared memory. Other embodiments are also disclosed and claimed.
    Type: Application
    Filed: July 10, 2023
    Publication date: January 4, 2024
    Applicant: Intel Corporation
    Inventors: Altug Koker, Prasoonkumar Surti, David Puffer, Subramaniam Maiyuran, Guei-Yuan Lueh, Abhishek R. Appu, Joydeep Ray, Balaji Vembu, Tomer Bar-On, Andrew T. Lauritzen, Hugues Labbe, John G. Gierach, Gabor Liktor
  • Publication number: 20240005136
    Abstract: In an example, an apparatus comprises a compute engine comprising a high precision component and a low precision component; and logic, at least partially including hardware logic, to receive instructions in the compute engine; select at least one of the high precision component or the low precision component to execute the instructions; and apply a gate to at least one of the high precision component or the low precision component to execute the instructions. Other embodiments are also disclosed and claimed.
    Type: Application
    Filed: July 12, 2023
    Publication date: January 4, 2024
    Applicant: Intel Corporation
    Inventors: Kamal Sinha, Balaji Vembu, Eriko Nurvitadhi, Nicolas C. Galoppo Von Borries, Rajkishore Barik, Tsung-Han Lin, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Anbang Yao, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Farshad Akhbari, Narayan Srinivasa, Feng Chen, Dukhwan Kim, Nadathur Rajagopalan Satish, John C. Weast, Mike B. MacPherson, Linda L. Hurd, Vasanth Ranganathan, Sanjeev Jahagirdar
  • Patent number: 11861759
    Abstract: Embodiments are generally directed to memory prefetching in multiple GPU environment. An embodiment of an apparatus includes multiple processors including a host processor and multiple graphics processing units (GPUs) to process data, each of the GPUs including a prefetcher and a cache; and a memory for storage of data, the memory including a plurality of memory elements, wherein the prefetcher of each of the GPUs is to prefetch data from the memory to the cache of the GPU; and wherein the prefetcher of a GPU is prohibited from prefetching from a page that is not owned by the GPU or by the host processor.
    Type: Grant
    Filed: January 20, 2022
    Date of Patent: January 2, 2024
    Assignee: INTEL CORPORATION
    Inventors: Joydeep Ray, Aravindh Anantaraman, Valentin Andrei, Abhishek R. Appu, Nicolas Galoppo von Borries, Varghese George, Altug Koker, Elmoustapha Ould-Ahmed-Vall, Mike Macpherson, Subramaniam Maiyuran
  • Patent number: 11861761
    Abstract: Embodiments described herein are generally directed to improvements relating to power, latency, bandwidth and/or performance issues relating to GPU processing/caching. According to one embodiment, a system includes a producer intellectual property (IP) (e.g., a media IP), a compute core (e.g., a GPU or an AI-specific core of the GPU), a streaming buffer logically interposed between the producer IP and the compute core. The producer IP is operable to consume data from memory and output results to the streaming buffer. The compute core is operable to perform AI inference processing based on data consumed from the streaming buffer and output AI inference processing results to the memory.
    Type: Grant
    Filed: November 11, 2020
    Date of Patent: January 2, 2024
    Assignee: Intel Corporation
    Inventors: Subramaniam Maiyuran, Durgaprasad Bilagi, Joydeep Ray, Scott Janus, Sanjeev Jahagirdar, Brent Insko, Lidong Xu, Abhishek R. Appu, James Holland, Vasanth Ranganathan, Nikos Kaburlasos, Altug Koker, Xinmin Tian, Guei-Yuan Lueh, Changliang Wang