Patents by Inventor Joydeep Ray
Joydeep Ray has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20230123644Abstract: One embodiment provides a graphics processor comprising an interface to a system interconnect and a graphics processor coupled to the interface, the graphics processor comprising circuitry configured to compact sample data for multiple sample locations of a pixel, map the multiple sample locations to memory locations that store compacted sample data, the memory locations in a memory of the graphics processor, apply lossless compression to the compacted sample data, and update a compression control surface associated with the memory locations, the compression control surface to specify a compression status for the memory locationsType: ApplicationFiled: October 6, 2022Publication date: April 20, 2023Applicant: Intel CorporationInventors: Abhishek R. Appu, Prasoonkumar Surti, Joydeep Ray, Michael J. Norris
-
Patent number: 11631198Abstract: An apparatus to facilitate compute compression is disclosed. The apparatus includes a graphics processing unit including mapping logic to map a first block of integer pixel data to a compression block and compression logic to compress the compression block.Type: GrantFiled: June 23, 2021Date of Patent: April 18, 2023Assignee: Intel CorporationInventors: Abhishek Appu, Altug Koker, Joydeep Ray, Balaji Vembu, Prasoonkumar Surti, Kamal Sinha, Nadathur Rajagopalan Satish, Narayan Srinivasa, Feng Chen, Dukhwan Kim, Farshad Akhbari
-
Publication number: 20230114164Abstract: In a further embodiment, a system on a chip integrated circuit (SoC) is provided that includes an active base die including a first cache memory, a first die mounted on and coupled with the active base die, and a second die mounted on the active base die and coupled with the active base die and the first die. The first die includes an interconnect fabric, an input/output interface, and an atomic operation handler. The second die includes an array of graphics processing elements and an interface to the first cache memory of the active base die. At least one of the graphics processing elements are configured to perform, via the atomic operation handler, an atomic operation to a memory device.Type: ApplicationFiled: December 15, 2021Publication date: April 13, 2023Applicant: Intel CorporationInventors: Rahul Pal, Aravindh Anantaraman, Lakshminarayana Pappu, Dongsheng Bi, Guadalupe J. Garcia, Altug Koker, Joydeep Ray, Rahul Joshi, Shrikul Atulkumar Joshi, Mahak Gupta
-
Publication number: 20230109990Abstract: One embodiment provides a graphics processor including an active base die including a fabric interconnect and a chiplet including a switched fabric, wherein the chiplet couples with the active base die via an array of interconnect structures, the array of interconnect structures couple the fabric interconnect with the switched fabric, and the chiplet includes a first modular interconnect configured to couple a block of graphics processing resources to the switched fabric and a second modular interconnect configured to couple a memory subsystem with the switched fabric and the block of graphics processing resources, the memory interconnect including a set of memory controllers and a set of physical interfaces.Type: ApplicationFiled: October 7, 2021Publication date: April 13, 2023Applicant: Intel CorporationInventors: Lakshminarayana Pappu, Altug Koker, Aditya Navale, Prasoonkumar Surti, Ankur Shah, Joydeep Ray, Naveen Matam
-
Publication number: 20230104845Abstract: One embodiment provides a graphics processor including a processing resource including a register file, memory, a cache, and load/store/cache circuitry to process load, store, and prefetch messages from the processing resource. The circuitry will sort received memory access messages into address sorted lists of reads and writes. The circuitry schedules a first set of address sorted requests from a first request buffer for a first period of time, then schedules a second set of address sorted requests from a second request buffer for a second period of time.Type: ApplicationFiled: September 24, 2021Publication date: April 6, 2023Applicant: Intel CorporationInventors: Joydeep Ray, Abhishek R. Appu, Altug Koker, Aditya Navale, Varghese George, Vasanth Ranganathan, Fangwen Fu, Ben J. Ashbaugh, Vidhya Krishnan, Sabareesh Ganapathy, Prathamesh Raghunath Shinde
-
Patent number: 11620723Abstract: One embodiment provides a graphics processor including a plurality of processing clusters, each processing cluster including a plurality of multiprocessors and a data interconnect coupled to the plurality of multiprocessors. At least one multiprocessor of the plurality of multiprocessors is configured to share data with another multiprocessor over the data interconnect.Type: GrantFiled: July 21, 2022Date of Patent: April 4, 2023Assignee: Intel CorporationInventors: Balaji Vembu, Altug Koker, Joydeep Ray
-
Patent number: 11620256Abstract: Systems and methods for improving cache efficiency and utilization are disclosed. In one embodiment, a graphics processor includes processing resources to perform graphics operations and a cache controller of a cache coupled to the processing resources. The cache controller is configured to control cache priority by determining whether default settings or an instruction will control cache operations for the cache.Type: GrantFiled: April 28, 2022Date of Patent: April 4, 2023Assignee: Intel CorporationInventors: Altug Koker, Joydeep Ray, Ben Ashbaugh, Jonathan Pearce, Abhishek Appu, Vasanth Ranganathan, Lakshminarayanan Striramassarma, Elmoustapha Ould-Ahmed-Vall, Aravindh Anantaraman, Valentin Andrei, Nicolas Galoppo Von Borries, Varghese George, Yoav Harel, Arthur Hunter, Jr., Brent Insko, Scott Janus, Pattabhiraman K, Mike Macpherson, Subramaniam Maiyuran, Marian Alin Petre, Murali Ramadoss, Shailesh Shah, Kamal Sinha, Prasoonkumar Surti, Vikranth Vemulapalli
-
Publication number: 20230094002Abstract: Dynamic routing of texture-load in graphics processing is described. An example of an apparatus includes a graphics processor including a plurality of processing engines of a class of processing engines of the graphic processor; a set of queues for the plurality of processing engines; and a unified submit port for the plurality of processing engines, wherein the unified submit port is to notify a scheduler regarding availability of slots in the set of queues for receipt of workload contexts; and wherein, upon the unified submit port receiving a workload context for processing by the plurality of processing engines, the unified submit port is to detect an available processing engine of the plurality of processing engines and direct the received context to a slot of the set of queues for processing by the available processing engine.Type: ApplicationFiled: September 24, 2021Publication date: March 30, 2023Applicant: Intel CorporationInventors: Hema Chand Nalluri, Jeffery S. Boles, Joseph Koston, Ankur N. Shah, Vidhya Krishnan, Vasanth Ranganathan, Joydeep Ray, Aditya Navale, Murali Ramadoss, James Valerio
-
Publication number: 20230101654Abstract: Dynamic routing of texture-load in graphics processing is described. An example of a processor includes one or more processing resources, the one or more processing resources to load a message including a texture load; a texture sampler and a data port; and a message router to route the texture load to a destination, wherein the destination may be either the texture sampler or the data port; wherein the message router includes arbitration circuitry to select the destination for the texture load, the arbitration circuitry to base selection of the destination at least in part on support by the data port for a format of a memory surface for the texture load; and a utilization metric for the data port representing availability of the data port.Type: ApplicationFiled: September 24, 2021Publication date: March 30, 2023Applicant: Intel CorporationInventors: Carlos Nava Rodriguez, Yoav Harel, Joydeep Ray, Abhishek R. Appu, Vamsee Vardhan Chivukula, Benjamin R. Pletcher
-
Publication number: 20230102538Abstract: Embodiments are directed to systems and methods for supporting generic pointers in hardware of a GPU. According to one embodiment, a GPU includes multiple sub-cores each having a processing resource and a load/store pipeline. The processing resource is operable to receive a memory access message including a pointer and a memory type identifier indicative of the pointer representing a generic pointer. The processing resource is further operable to output a load or store operation to the load/store pipeline based on the memory access message, including computing an address for the load or store operation by adding a base address of a named memory type of a plurality of named memory types referenced by the generic pointer to an offset into a memory of the named memory type. The load/store pipeline is operable to, responsive to receipt of the load or store operation, access the memory at the address.Type: ApplicationFiled: September 24, 2021Publication date: March 30, 2023Applicant: Intel CorporationInventors: Joydeep Ray, Prathamesh Raghunath Shinde, Ben J. Ashbaugh, Wei-Yu Chen, Abhishek R. Appu, Vasanth Ranganathan, Dmitry Yurievich Babokin, Ankur N. Shah
-
Patent number: 11615584Abstract: Briefly, in accordance with one or more embodiments, a processor performs a coarse depth test on pixel data, and performs a final depth test on the pixel data. Coarse depth data is stored in a coarse depth cache, and per pixel depth data is stored in a per pixel depth cache. If a result of the coarse depth test is ambiguous, the processor is to read the per pixel depth data from the per pixel depth cache, and to update the coarse depth data with the per pixel depth data if the per pixel depth data has a smaller depth range than the coarse depth data.Type: GrantFiled: July 22, 2021Date of Patent: March 28, 2023Assignee: Intel CorporationInventors: Vasanth Ranganathan, Saikat Mandal, Saurabh Sharma, Vamsee Vardhan Chivukula, Karol A. Szerszen, Aleksander Olek Neyman, Altug Koker, Prasoonkumar Surti, Abhishek Appu, Joydeep Ray, Art Hunter, Luis F. Cruz Camacho, Akshay R. Chada
-
Publication number: 20230090973Abstract: One embodiment provides a graphics processor including a processing resource including a register file, memory, a cache memory, and load/store/cache circuitry to process load, store, and prefetch messages from the processing resource. The circuitry includes support for an immediate address offset that will be used to adjust the address supplied for a memory access to be requested by the circuitry. Including support for the immediate address offset removes the need to execute additional instructions to adjust the address to be accessed prior to execution of the memory access instruction.Type: ApplicationFiled: September 21, 2021Publication date: March 23, 2023Applicant: Intel CorporationInventors: Joydeep Ray, Abhishek R. Appu, Timothy R. Bauer, James Valerio, Weiyu Chen, Subramaniam Maiyuran, Prasoonkumar Surti, Karthik Vaidyanathan, Carsten Benthin, Sven Woop, Jiasheng Chen
-
Patent number: 11610564Abstract: A mechanism is described for facilitating consolidated compression/de-compression of graphics data streams of varying types at computing devices. A method of embodiments, as described herein, includes generating a common sector cache relating to a graphics processor. The method may further include performing a consolidated compression of multiple types of graphics data streams associated with the graphics processor using the common sector cache.Type: GrantFiled: July 23, 2021Date of Patent: March 21, 2023Assignee: Intel CorporationInventors: Abhishek R. Appu, Joydeep Ray, Prasoonkumar Surti, Altug Koker, Kiran C. Veernapu, Erik G. Liskay
-
Patent number: 11609856Abstract: In an example, an apparatus comprises a plurality of processing unit cores, a plurality of cache memory modules associated with the plurality of processing unit cores, and a machine learning model communicatively coupled to the plurality of processing unit cores, wherein the plurality of cache memory modules share cache coherency data with the machine learning model. Other embodiments are also disclosed and claimed.Type: GrantFiled: March 19, 2021Date of Patent: March 21, 2023Assignee: INTEL CORPORATIONInventors: Chandrasekaran Sakthivel, Prasoonkumar Surti, John C. Weast, Sara S. Baghsorkhi, Justin E. Gottschlich, Abhishek R. Appu, Nicolas C. Galoppo Von Borries, Joydeep Ray, Narayan Srinivasa, Feng Chen, Ben J. Ashbaugh, Rajkishore Barik, Tsung-Han Lin, Kamal Sinha, Eriko Nurvitadhi, Balaji Vembu, Altug Koker
-
Patent number: 11605197Abstract: An embodiment of a parallel processor apparatus may include a sample pattern selector to select a sample pattern for a pixel, and a sample pattern subset selector communicatively coupled to the sample pattern selector to select a first subset of the sample pattern for the pixel corresponding to a left eye display frame and to select a second subset of the sample pattern for the pixel corresponding to a right eye display frame, wherein the second subset is different from the first subset. Other embodiments are disclosed and claimed.Type: GrantFiled: February 19, 2021Date of Patent: March 14, 2023Assignee: Intel CorporationInventors: Nikos Kaburlasos, Joydeep Ray, John H. Feit, Travis T. Schluessler, Jacek Kwiatkowski, Philip R. Laws
-
Publication number: 20230061331Abstract: One embodiment provides a multi-chip module accelerator usable to execute tensor data processing operations a multi-chip module. The multi-chip module may include a memory stack including multiple memory dies and parallel processor circuitry communicatively coupled to the memory stack. The parallel processor circuitry may include multiprocessor cores to execute matrix multiplication and accumulate operations. The matrix multiplication and accumulate operations may include floating-point operations that are configurable to include two-dimensional matrix multiply and accumulate operations involving inputs that have differing floating-point precisions. The floating-point operations may include a first operation at a first precision and a second operation at a second precision. The first operation may include a multiply having at least one 16-bit floating-point input and the second operation may include an accumulate having a 32-bit floating-point input.Type: ApplicationFiled: October 5, 2022Publication date: March 2, 2023Applicant: Intel CorporationInventors: Elmoustapha Ould-Ahmed-Vall, Sara S. Baghsorkhi, Anbang Yao, Kevin Nealis, Xiaoming Chen, Altug Koker, Abhishek R. Appu, John C. Weast, Mike B. Macpherson, Dukhwan Kim, Linda L. Hurd, Ben J. Ashbaugh, Barath Lakshmanan, Liwei Ma, Joydeep Ray, Ping T. Tang, Michael S. Strickland
-
Publication number: 20230061670Abstract: One embodiment provides an apparatus comprising a memory stack including multiple memory dies and a parallel processor including a plurality of multiprocessors. Each multiprocessor has a single instruction, multiple thread (SIMT) architecture, the parallel processor coupled to the memory stack via one or more memory interfaces. At least one multiprocessor comprises a multiply-accumulate circuit to perform multiply-accumulate operations on matrix data in a stage of a neural network implementation to produce a result matrix comprising a plurality of matrix data elements at a first precision, precision tracking logic to evaluate metrics associated with the matrix data elements and indicate if an optimization is to be performed for representing data at a second stage of the neural network implementation, and a numerical transform unit to dynamically perform a numerical transform operation on the matrix data elements based on the indication to produce transformed matrix data elements at a second precision.Type: ApplicationFiled: November 1, 2022Publication date: March 2, 2023Applicant: Intel CorporationInventors: Elmoustapha Ould-Ahmed-Vall, Sara S. Baghsorkhi, Anbang Yao, Kevin Nealis, Xiaoming Chen, Altug Koker, Abhishek R. Appu, John C. Weast, Mike B. Macpherson, Dukhwan Kim, Linda L. Hurd, Ben J. Ashbaugh, Barath Lakshmanan, Liwei Ma, Joydeep Ray, Ping T. Tang, Michael S. Strickland
-
Patent number: 11592817Abstract: A mechanism is described for facilitating storage management for machine learning at autonomous machines. A method of embodiments, as described herein, includes detecting one or more components associated with machine learning, where the one or more components include memory and a processor coupled to the memory, and where the processor includes a graphics processor. The method may further include allocating a storage portion of the memory and a hardware portion of the processor to a machine learning training set, where the storage and hardware portions are precise for implementation and processing of the training set.Type: GrantFiled: April 28, 2017Date of Patent: February 28, 2023Assignee: INTEL CORPORATIONInventors: Abhishek R. Appu, John C. Weast, Sara S. Baghsorkhi, Justin E. Gottschlich, Prasoonkumar Surti, Chandrasekaran Sakthivel, Altug Koker, Farshad Akhbari, Feng Chen, Dukhwan Kim, Narayan Srinivasa, Nadathur Rajagopalan Satish, Kamal Sinha, Joydeep Ray, Balaji Vembu, Mike B. Macpherson, Linda L. Hurd, Sanjeev Jahagirdar, Vasanth Ranganathan
-
Patent number: 11593454Abstract: An apparatus to facilitate machine learning matrix processing is disclosed. The apparatus comprises a memory to store matrix data one or more processors to execute an instruction to examine a message descriptor included in the instruction to determine a type of matrix layout manipulation operation that is to be executed, examine a message header included in the instruction having a plurality of parameters that define a two-dimensional (2D) memory surface that is to be retrieved, retrieve one or more blocks of the matrix data from the memory based on the plurality of parameters and a register file including a plurality of registers, wherein the one or more blocks of the matrix data is stored within a first set of the plurality of registers.Type: GrantFiled: June 2, 2020Date of Patent: February 28, 2023Assignee: Intel CorporationInventors: Joydeep Ray, Fangwen Fu, Dhiraj D. Kalamkar, Sasikanth Avancha
-
Patent number: 11593910Abstract: Embodiments provide mechanisms to facilitate compute operations for deep neural networks. One embodiment comprises a graphics processing unit comprising one or more multiprocessors, at least one of the one or more multiprocessors including a register file to store a plurality of different types of operands and a plurality of processing cores. The plurality of processing cores includes a first set of processing cores of a first type and a second set of processing cores of a second type. The first set of processing cores are associated with a first memory channel and the second set of processing cores are associated with a second memory channel.Type: GrantFiled: May 11, 2022Date of Patent: February 28, 2023Assignee: Intel CorporationInventors: Prasoonkumar Surti, Narayan Srinivasa, Feng Chen, Joydeep Ray, Ben J. Ashbaugh, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Balaji Vembu, Tsung-Han Lin, Kamal Sinha, Rajkishore Barik, Sara S. Baghsorkhi, Justin E. Gottschlich, Altug Koker, Nadathur Rajagopalan Satish, Farshad Akhbari, Dukhwan Kim, Wenyin Fu, Travis T. Schluessler, Josh B. Mastronarde, Linda L. Hurd, John H. Feit, Jeffery S. Boles, Adam T. Lake, Karthik Vaidyanathan, Devan Burke, Subramaniam Maiyuran, Abhishek R. Appu