Patents by Inventor Joydeep Ray

Joydeep Ray has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20190206090
    Abstract: Embodiments are generally directed to compression in machine learning and deep learning processing. An embodiment of an apparatus for compression of untyped data includes a graphical processing unit (GPU) including a data compression pipeline, the data compression pipeline including a data port coupled with one or more shader cores, wherein the data port is to allow transfer of untyped data without format conversion, and a 3D compression/decompression unit to provide for compression of untyped data to be stored to a memory subsystem and decompression of untyped data from the memory subsystem.
    Type: Application
    Filed: December 30, 2017
    Publication date: July 4, 2019
    Applicant: Intel Corporation
    Inventors: Joydeep Ray, Ben Ashbaugh, Prasoonkumar Surti, Pradeep Ramani, Rama Harihara, Jerin C. Justin, Jing Huang, Xiaoming Cui, Timothy B. Costa, Ting Gong, Elmoustapha Ould-Ahmed-Vall, Kumar Balasubramanian, Anil Thomas, Oguz H. Elibol, Jayaram Bobba, Guozhong Zhuang, Bhavani Subramanian, Gokce Keskin, Chandrasekaran Sakthivel, Rajesh Poornachandran
  • Publication number: 20190206020
    Abstract: One embodiment provides an accelerator module comprising a memory stack including multiple memory dies; a graphics processing unit (GPU) coupled with the memory stack via one or more memory controllers, the GPU including a plurality of multiprocessors having a single instruction, multiple thread (SIMT) architecture, the multiprocessors to execute at least one single instruction. The at least one single instruction is to cause at least a portion of the GPU to perform a floating point operation on input having differing precisions. The floating point operation is a two-dimensional matrix multiply and accumulate operation.
    Type: Application
    Filed: November 21, 2018
    Publication date: July 4, 2019
    Applicant: Intel Corporation
    Inventors: Elmoustapha Ould-Ahmed-Vall, Sara S. Baghsorkhi, Anbang Yao, Kevin Nealis, Xiaoming Chen, Altug Koker, Abhishek R. Appu, John C. Weast, Mike B. Macpherson, Dukhwan Kim, Linda L. Hurd, Ben J. Ashbaugh, Barath Lakshmanan, Liwei Ma, Joydeep Ray, Ping T. Tang, Michael S. Strickland
  • Publication number: 20190205163
    Abstract: A mechanism is described to facilitate microcontroller-based flexible thread scheduling launching in computing environments. An apparatus of embodiments, as described herein, includes facilitating a graphics processor hosting a microcontroller having a thread scheduling unit, and detection and observation logic to detect a scheduling algorithm associated with an application at the apparatus. The apparatus may further include reading and dispatching logic to facilitate the microcontroller to prepare a flexible dispatch routine based on the scheduling algorithm. The apparatus may further include scheduling and launching logic to facilitate the thread scheduling unit to dynamically schedule and launch threads based on the flexible dispatch routine, where the threads are hosted by the graphics processor.
    Type: Application
    Filed: January 3, 2018
    Publication date: July 4, 2019
    Applicant: Intel Corporation
    Inventors: Kiran C. Veernapu, Kamlesh Pillai, James Valerio, Joydeep Ray, Abhishek Appu
  • Publication number: 20190197657
    Abstract: One embodiment provides for a parallel processor comprising a processing array within the parallel processor, the processing array including multiple compute blocks, each compute block including multiple processing clusters configured for parallel operation, wherein each of the multiple compute blocks is independently preemptable. In one embodiment a preemption hint can be generated for source code during compilation to enable a compute unit to determine an efficient point for preemption.
    Type: Application
    Filed: March 5, 2019
    Publication date: June 27, 2019
    Applicant: Intel Corporation
    Inventors: Altug Koker, Ingo Wald, David Puffer, Subramaniam M. Maiyuran, Prasoonkumar Surti, Balaji Vembu, Guei-Yuan Lueh, Murali Ramadoss, Abhishek R. Appu, Joydeep Ray
  • Patent number: 10332302
    Abstract: In an example, an apparatus comprises a plurality of execution units, and logic, at least partially including hardware logic, to create a scatter gather list in memory and collect a plurality of operating statistics for the plurality of execution units using the scatter gather list. Other embodiments are also disclosed and claimed.
    Type: Grant
    Filed: April 17, 2017
    Date of Patent: June 25, 2019
    Assignee: INTEL CORPORATION
    Inventors: Balaji Vembu, Murali Ramadoss, David I. Standring, Shruti A. Sethi, Jeffrey S. Frizzell, Alan M. Curtis, Abhishek R. Appu, Joydeep Ray, Altug Koker
  • Patent number: 10332320
    Abstract: One embodiment provides for a computing device within an autonomous vehicle, the compute device comprising a wireless network device to enable a wireless data connection with an autonomous vehicle network, a set of multiple processors including a general-purpose processor and a general-purpose graphics processor, the set of multiple processors to execute a compute manager to manage execution of compute workloads associated with the autonomous vehicle, the compute workload associated with autonomous operations of the autonomous vehicle, and offload logic configured to execute on the set of multiple processors, the offload logic to determine to offload one or more of the compute workloads to one or more autonomous vehicles within range of the wireless network device.
    Type: Grant
    Filed: April 17, 2017
    Date of Patent: June 25, 2019
    Assignee: Intel Corporation
    Inventors: Barath Lakshamanan, Linda L. Hurd, Ben J. Ashbaugh, Elmoustapha Ould-Ahmed-Vall, Liwei Ma, Jingyi Jin, Justin E. Gottschlich, Chandrasekaran Sakthivel, Michael S. Strickland, Brian T. Lewis, Lindsey Kuper, Altug Koker, Abhishek R. Appu, Prasoonkumar Surti, Joydeep Ray, Balaji Vembu, Javier S. Turek, Naila Farooqui
  • Publication number: 20190188554
    Abstract: Embodiments provide systems and methods which facilitate optimization of a convolutional neural network (CNN). One embodiment provides for a non-transitory machine-readable medium storing instructions that cause one or more processors to perform operations comprising processing a trained convolutional neural network (CNN) to generate a processed CNN, the trained CNN having weights in a floating-point format. Processing the trained CNN includes quantizing the weights in the floating-point format to generate weights in an integer format. Quantizing the weights includes generating a quantization table to enable non-uniform quantization of the weights and quantizing the weights from the floating-point format to the integer format using the quantization table. The operations additionally comprise performing an inference operation utilizing the processed CNN with the integer format weights.
    Type: Application
    Filed: February 22, 2019
    Publication date: June 20, 2019
    Applicant: Intel Corporation
    Inventors: Liwei Ma, Elmoustapha Ould-Ahmed-Vall, Barath Lakshmanan, Ben J. Ashbaugh, Jingyi Jin, Jeremy Bottleson, Mike B. Macpherson, Kevin Nealis, Dhawal Srivastava, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Anbang Yao, Tatiana Shpeisman, Altug Koker, Abhishek R. Appu
  • Patent number: 10324848
    Abstract: A mechanism is described for facilitating independent and separate entity-based graphics cache at computing devices. A method of embodiments, as described herein, includes facilitate hosting of a plurality of cache at a plurality of entities associated with a graphics processor, wherein each entity hosts at least one cache, and wherein an entity includes a dual sub-slice (DSS) or a streaming multiprocessor (SM).
    Type: Grant
    Filed: April 10, 2017
    Date of Patent: June 18, 2019
    Assignee: INTEL CORPORATION
    Inventors: Altug Koker, Joydeep Ray, James A. Valerio, Abhishek R. Appu, Vasanth Ranganathan
  • Patent number: 10325344
    Abstract: A mechanism is described for facilitating dynamic merging of atomic operations in computing devices. A method of embodiments, as described herein, includes facilitating detecting atomic messages and a plurality of slot addresses. The method further includes comparing one or more slot addresses of the plurality of slot addresses with other slot addresses of the plurality of slot addresses to seek one or more matched slot addresses, where the one or more matched slot addresses are merged into one or more merged groups. The method may further include generating one or more merged atomic operations based on and corresponding to the one or more merged groups.
    Type: Grant
    Filed: April 17, 2017
    Date of Patent: June 18, 2019
    Assignee: INTEL CORPORATION
    Inventors: Joydeep Ray, Altug Koker, Abhishek R. Appu, Balaji Vembu
  • Patent number: 10325341
    Abstract: One embodiment provides for a general-purpose graphics processing unit comprising multiple processing units and a pipeline manager to distribute a thread group to the multiple processing units, wherein the pipeline manager is to distribute the thread group as multiple thread sub-groups.
    Type: Grant
    Filed: April 21, 2017
    Date of Patent: June 18, 2019
    Assignee: Intel Corporation
    Inventors: Balaji Vembu, Altug Koker, Joydeep Ray
  • Publication number: 20190179531
    Abstract: A processor includes a first memory interface to be coupled to a plurality of memory module sockets located off-package, a second memory interface to be coupled to a non-volatile memory (NVM) socket located off-package, and a multi-level memory controller (MLMC). The MLMC is to: control the memory modules disposed in the plurality of memory module sockets as main memory in a one-level memory (1LM) configuration; detect a switch from a 1LM mode of operation to a two-level memory (2LM) mode of operation in response to a basic input/output system (BIOS) detection of a low-power memory module disposed in one of the memory module sockets and a NVM device disposed in the NVM socket in a 2LM configuration; and control the low-power memory module as cache in the 2LM configuration in response to detection of the switch from the 1LM mode of operation to the 2LM mode of operation.
    Type: Application
    Filed: February 13, 2019
    Publication date: June 13, 2019
    Inventors: Joydeep Ray, Varghese George, Inder M. Sodhi, Jeffrey R. Wilcox
  • Publication number: 20190180494
    Abstract: An embodiment of an electronic processing system may include an application processor, persistent storage media communicatively coupled to the application processor, and a graphics subsystem communicatively coupled to the application processor. The graphics subsystem may include a first graphics engine to process a graphics workload, and a second graphics engine to offload at least a portion of the graphics workload from the first graphics engine. The second graphics engine may include a low precision compute engine. The system may further include a wearable display housing the second graphics engine. Other embodiments are disclosed and claimed.
    Type: Application
    Filed: December 28, 2018
    Publication date: June 13, 2019
    Inventors: Atsuo Kuwahara, Deepak S. Vembar, Chandrasekaran Sakthivel, Radhakrishnan Venkataraman, Brent E. Insko, Anupreet S. Kalra, Hugues Labbe, Abhishek R. Appu, Ankur N. Shah, Joydeep Ray, Elmoustapha Ould-Ahmed-Vall, Prasoonkumar Surti, Murali Ramadoss
  • Patent number: 10319138
    Abstract: An embodiment of graphics apparatus may include a coarse depth tester to perform a coarse depth test on a block of pixels, and a stencil tester to perform a stencil test on the block of pixels. The stencil tester may be further configured to perform the stencil test in parallel with the coarse depth test. Other embodiments are disclosed and claimed.
    Type: Grant
    Filed: April 1, 2017
    Date of Patent: June 11, 2019
    Assignee: Intel Corporation
    Inventors: Vasanth Ranganathan, Saikat Mandal, Prasoonkumar Surti, Abhishek R. Appu, Andrew S. Downsworth, Vamsee Vardhan Chivukula, Akshay Chada, Karol A. Szerszen, Joydeep Ray, Bryon T. Rogers
  • Patent number: 10319070
    Abstract: In accordance with one embodiment each page table entry maps a variable page size (per entry), if multiple continuous virtual pages map to contiguous physical pages. This may drastically reduce the number of translation lookaside buffer (TLB) entries needed since each entry can potentially map a larger chunk of memory, in some embodiments.
    Type: Grant
    Filed: September 4, 2018
    Date of Patent: June 11, 2019
    Assignee: Intel Corporation
    Inventors: Abhishek R. Appu, Joydeep Ray, Altug Koker, Balaji Vembu, Prasoonkumar P. Surti, Kamal Sinha, Vasanth Ranganathan, Kiran C. Veernapu, Bhushan M. Borole, Wenyin Fu
  • Patent number: 10310895
    Abstract: A mechanism is described for facilitating memory-based software barriers to emulate hardware barriers at graphics processors in computing devices. A method of embodiments, as described herein, includes facilitating converting thread scheduling at a processor from hardware barriers to software barriers, where the software barriers emulate the hardware barriers.
    Type: Grant
    Filed: April 21, 2017
    Date of Patent: June 4, 2019
    Assignee: INTEL CORPORATION
    Inventors: Altug Koker, Joydeep Ray, Balaji Vembu, James A. Valerio, Abhishek R. Appu
  • Patent number: 10310861
    Abstract: An apparatus to facilitate thread scheduling is disclosed. The apparatus includes logic to store barrier usage data based on a magnitude of barrier messages in an application kernel and a scheduler to schedule execution of threads across a plurality of multiprocessors based on the barrier usage data.
    Type: Grant
    Filed: April 1, 2017
    Date of Patent: June 4, 2019
    Assignee: INTEL CORPORATION
    Inventors: Balaji Vembu, Abhishek R. Appu, Joydeep Ray, Altug Koker
  • Patent number: 10303594
    Abstract: An apparatus to facilitate guaranteed forward progress for graphics data is disclosed. The apparatus includes a plurality of ports to receive and transmit streams of graphics data, one or more buffers associated with each of the plurality of ports to store the graphics data and switching logic to virtually partition each of the one or more buffers to allocate a dedicated buffer to receive each of a plurality of independent streams of graphics data.
    Type: Grant
    Filed: April 17, 2017
    Date of Patent: May 28, 2019
    Assignee: INTEL CORPORATION
    Inventors: Altug Koker, Joydeep Ray, Niranjan L. Cooray, Abhishek R. Appu
  • Patent number: 10303953
    Abstract: A mechanism is described for facilitating person tracking and data security in machine learning at autonomous machines. A method of embodiments, as described herein, includes detecting, by a camera associated with one or more trackers, a person within a physical vicinity, where detecting includes capturing one or more images the person. The method may further include tracking, by the one or more trackers, the person based on the one or more images of the person, where tracking includes collect tracking data relating to the person. The method may further include selecting a tracker of the one or more trackers as a preferred tracker based on the tracking data.
    Type: Grant
    Filed: April 17, 2017
    Date of Patent: May 28, 2019
    Assignee: INTEL CORPORATION
    Inventors: Mayuresh M. Varerkar, Barnan Das, Narayan Biswal, Stanley J. Baran, Gokcen Cilingir, Nilesh V. Shah, Archie Sharma, Sherine Abdelhak, Sachin Godse, Farshad Akhbari, Narayan Srinivasa, Altug Koker, Nadathur Rajagopalan Satish, Dukhwan Kim, Feng Chen, Abhishek R. Appu, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Anbang Yao, Tatiana Shpeisman, Vasanth Ranganathan, Sanjeev Jahagirdar
  • Patent number: 10304154
    Abstract: A mechanism is described for facilitating inference coordination and processing utilization for machine learning at autonomous machines. A method of embodiments, as described herein, includes detecting, at training time, information relating to one or more tasks to be performed according to a training dataset relating to a processor including a graphics processor. The method may further include analyzing the information to determine one or more portions of hardware relating to the processor capable of supporting the one or more tasks, and configuring the hardware to pre-select the one or more portions to perform the one or more tasks, while other portions of the hardware remain available for other tasks.
    Type: Grant
    Filed: April 24, 2017
    Date of Patent: May 28, 2019
    Assignee: Intel Corporation
    Inventors: Abhishek R. Appu, Altug Koker, John C. Weast, Mike B. Macpherson, Linda L. Hurd, Sara S. Baghsorkhi, Justin E. Gottschlich, Prasoonkumar Surti, Chandrasekaran Sakthivel, Liwei Ma, Elmoustapha Ould-Ahmed-Vall, Kamal Sinha, Joydeep Ray, Balaji Vembu, Sanjeev Jahagirdar, Vasanth Ranganathan, Dukhwan Kim
  • Patent number: 10304421
    Abstract: An apparatus and method are described for efficiently rendering an transmitting to a remote display.
    Type: Grant
    Filed: April 7, 2017
    Date of Patent: May 28, 2019
    Assignee: Intel Corporation
    Inventors: Balaji Vembu, Jason Tanner, Joydeep Ray, Altug Koker, Abhishek R. Appu, Pattabhiraman K