Patents by Inventor Tatiana Shpeisman

Tatiana Shpeisman has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11360767
    Abstract: A processing apparatus is provided comprising a multiprocessor having a multithreaded architecture. The multiprocessor can execute at least one single instruction to perform parallel mixed precision matrix operations. In one embodiment the apparatus includes a memory interface and an array of multiprocessors coupled to the memory interface. At least one multiprocessor in the array of multiprocessors is configured to execute a fused multiply-add instruction in parallel across multiple threads.
    Type: Grant
    Filed: July 6, 2021
    Date of Patent: June 14, 2022
    Assignee: Intel Corporation
    Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
  • Patent number: 11360933
    Abstract: An integrated circuit (IC) package apparatus is disclosed. The IC package includes one or more processing units and a bridge, mounted below the one or more processing unit, including one or more arithmetic logic units (ALUs) to perform atomic operations.
    Type: Grant
    Filed: January 16, 2020
    Date of Patent: June 14, 2022
    Assignee: Intel Corporation
    Inventors: Altug Koker, Farshad Akhbari, Feng Chen, Dukhwan Kim, Narayan Srinivasa, Nadathur Rajagopalan Satish, Liwei Ma, Jeremy Bottleson, Eriko Nurvitadhi, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu
  • Publication number: 20220114430
    Abstract: One embodiment provides an apparatus comprising an instruction cache to store a plurality of instructions, a scheduler unit coupled to the instruction cache, the scheduler unit to schedule the plurality of instructions for execution, an instruction fetch and decode unit to decode the plurality of instructions to determine a set of operations to perform in response, one or more compute blocks to perform parallel multiply-accumulate operations based on the instruction fetch and decode unit decoding a first instruction of the plurality of instructions, and matrix multiplication logic to perform matrix multiplication operations based on the instruction fetch and decode unit decoding a second instruction of the plurality of instructions.
    Type: Application
    Filed: December 21, 2021
    Publication date: April 14, 2022
    Applicant: Intel Corporation
    Inventors: Rajkishore Barik, Elmoustapha Ould-Ahmed-Vall, Xiaoming Chen, Dhawal Srivastava, Anbang Yao, Kevin Nealis, Eriko Nurvitadhi, Sara S. Baghsorkhi, Balaji Vembu, Tatiana Shpeisman, Ping T. Tang
  • Patent number: 11270405
    Abstract: An apparatus to facilitate compute optimization is disclosed. The apparatus includes a mixed precision core to perform a mixed precision multi-dimensional matrix multiply and accumulate operation on 8-bit and/or 32 bit signed or unsigned integer elements.
    Type: Grant
    Filed: August 3, 2020
    Date of Patent: March 8, 2022
    Assignee: Intel Corporation
    Inventors: Abhishek R. Appu, Altug Koker, Linda L. Hurd, Dukhwan Kim, Mike B. Macpherson, John C. Weast, Feng Chen, Farshad Akhbari, Narayan Srinivasa, Nadathur Rajagopalan Satish, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Anbang Yao, Tatiana Shpeisman
  • Publication number: 20220027210
    Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed to steal work in heterogeneous computing systems. An apparatus includes load balancing circuitry to obtain tasks from a workload by encoding minimum and maximum index ranges of a data parallel operation, allocate a first task from the workload to a first work queue based on a first capability of first computation circuitry, the first computation circuitry to process the first task in the first work queue, and allocate a second task from the workload to a second work queue, second computation circuitry to process the second task in the second work queue. The apparatus further includes first work stealer logic to steal the second task from the second work queue using an atomic operation to access the second work queue.
    Type: Application
    Filed: October 4, 2021
    Publication date: January 27, 2022
    Inventors: Rajkishore Barik, Stephan A. Herhut, Jaswanth Sreeram, Tatiana Shpeisman, Richard L. Hudson
  • Publication number: 20220019431
    Abstract: A processing apparatus is provided comprising a multiprocessor having a multithreaded architecture. The multiprocessor can execute at least one single instruction to perform parallel mixed precision matrix operations. In one embodiment the apparatus includes a memory interface and an array of multiprocessors coupled to the memory interface. At least one multiprocessor in the array of multiprocessors is configured to execute a fused multiply-add instruction in parallel across multiple threads.
    Type: Application
    Filed: July 6, 2021
    Publication date: January 20, 2022
    Applicant: Intel Corporation
    Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
  • Publication number: 20210397925
    Abstract: A library of machine learning primitives is provided to optimize a machine learning model to improve the efficiency of inference operations. In one embodiment a trained convolutional neural network (CNN) model is processed into a trained CNN model via pruning, convolution window optimization, and quantization.
    Type: Application
    Filed: August 26, 2021
    Publication date: December 23, 2021
    Applicant: Intel Corporation
    Inventors: Liwei Ma, Elmoustapha Ould-Ahmed-Vall, Barath Lakshmanan, Ben J. Ashbaugh, Jingyi Jin, Jeremy Bottleson, Mike B. Macpherson, Kevin Nealis, Dhawal Srivastava, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Anbang Yao, Tatiana Shpeisman, Altug Koker, Abhishek R. Appu
  • Patent number: 11169799
    Abstract: One embodiment provides for a graphics processing unit to accelerate machine-learning operations, the graphics processing unit comprising a multiprocessor having a single instruction, multiple thread (SIMT) architecture, the multiprocessor to execute at least one single instruction; and a first compute unit included within the multiprocessor, the at least one single instruction to cause the first compute unit to perform a two-dimensional matrix multiply and accumulate operation, wherein to perform the two-dimensional matrix multiply and accumulate operation includes to compute a 32-bit intermediate product of 16-bit operands and to compute a 32-bit sum based on the 32-bit intermediate product.
    Type: Grant
    Filed: June 5, 2019
    Date of Patent: November 9, 2021
    Assignee: Intel Corporation
    Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
  • Publication number: 20210334637
    Abstract: In an example, an apparatus comprises a compute engine comprising a high precision component and a low precision component; and logic, at least partially including hardware logic, to receive instructions in the compute engine; select at least one of the high precision component or the low precision component to execute the instructions; and apply a gate to at least one of the high precision component or the low precision component to execute the instructions. Other embodiments are also disclosed and claimed.
    Type: Application
    Filed: May 11, 2021
    Publication date: October 28, 2021
    Applicant: INTEL CORPORATION
    Inventors: Kamal Sinha, Balaji Vembu, Eriko Nurvitadhi, Nicolas C. Galoppo Von Borries, Rajkishore Barik, Tsung-Han Lin, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Anbang Yao, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Farshad Akhbari, Narayan Srinivasa, Feng Chen, Dukhwan Kim, Nadathur Rajagopalan Satish, John C. Weast, Mike B. MacPherson, Linda L. Hurd, Vasanth Ranganathan, Sanjeev S. Jahagirdar
  • Patent number: 11138048
    Abstract: A work stealer apparatus includes a determination module. The determination module is to determine to steal work from a first hardware computation unit of a first type for a second hardware computation unit of a second type that is different than the first type. The work is to be queued in a first work queue, which is to correspond to the first hardware computation unit, and which is to be stored in a shared memory that is to be shared by the first and second hardware computation units. A synchronized work stealer module is to steal the work through a synchronized memory access to the first work queue. The synchronized memory access is to be synchronized relative to memory accesses to the first work queue from the first hardware computation unit.
    Type: Grant
    Filed: December 27, 2016
    Date of Patent: October 5, 2021
    Assignee: Intel Corporation
    Inventors: Rajkishore Barik, Stephan A. Herhut, Jaswanth Sreeram, Tatiana Shpeisman, Richard L. Hudson
  • Publication number: 20210264163
    Abstract: A mechanism is described for facilitating person tracking and data security in machine learning at autonomous machines. A method of embodiments, as described herein, includes detecting, by a camera associated with one or more trackers, a person within a physical vicinity, where detecting includes capturing one or more images the person. The method may further include tracking, by the one or more trackers, the person based on the one or more images of the person, where tracking includes collect tracking data relating to the person. The method may further include selecting a tracker of the one or more trackers as a preferred tracker based on the tracking data.
    Type: Application
    Filed: February 11, 2021
    Publication date: August 26, 2021
    Applicant: Intel Corporation
    Inventors: Mayuresh M. Varerkar, Barnan Das, Narayan Biswal, Stanley J. Baran, Gokcen Cilingir, Nilesh V. Shah, Archie Sharma, Sherine Abdelhak, Sachin Godse, Farshad Akhbari, Narayan Srinivasa, Altug Koker, Nadathur Rajagopalan Satish, Dukhwan Kim, Feng Chen, Abhishek R. Appu, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Anbang Yao, Tatiana Shpeisman, Vasanth Ranganathan, Sanjeev Jahagirdar
  • Patent number: 11080811
    Abstract: An apparatus to facilitate compute optimization is disclosed. The apparatus includes a mixed precision core to perform a mixed precision multi-dimensional matrix multiply and accumulate operation on 16-bit and/or 32 bit floating-point elements.
    Type: Grant
    Filed: June 19, 2019
    Date of Patent: August 3, 2021
    Assignee: Intel Corporation
    Inventors: Abhishek R. Appu, Altug Koker, Linda L. Hurd, Dukhwan Kim, Mike B. Macpherson, John C. Weast, Feng Chen, Farshad Akhbari, Narayan Srinivasa, Nadathur Rajagopalan Satish, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Anbang Yao, Tatiana Shpeisman
  • Patent number: 11080046
    Abstract: A processing apparatus is provided comprising a multiprocessor having a multithreaded architecture. The multiprocessor can execute at least one single instruction to perform parallel mixed precision matrix operations. In one embodiment the apparatus includes a memory interface and an array of multiprocessors coupled to the memory interface. At least one multiprocessor in the array of multiprocessors is configured to execute a fused multiply-add instruction in parallel across multiple threads.
    Type: Grant
    Filed: February 5, 2021
    Date of Patent: August 3, 2021
    Assignee: Intel Corporation
    Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
  • Patent number: 11080813
    Abstract: An apparatus to facilitate compute optimization is disclosed. The apparatus includes a mixed precision core to perform a mixed precision multi-dimensional matrix multiply and accumulate operation on 8-bit and/or 32 bit signed or unsigned integer elements.
    Type: Grant
    Filed: September 26, 2019
    Date of Patent: August 3, 2021
    Assignee: Intel Corporation
    Inventors: Abhishek R. Appu, Altug Koker, Linda L. Hurd, Dukhwan Kim, Mike B. Macpherson, John C. Weast, Feng Chen, Farshad Akhbari, Narayan Srinivasa, Nadathur Rajagopalan Satish, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Anbang Yao, Tatiana Shpeisman
  • Publication number: 20210182058
    Abstract: A processing apparatus is provided comprising a multiprocessor having a multithreaded architecture. The multiprocessor can execute at least one single instruction to perform parallel mixed precision matrix operations. In one embodiment the apparatus includes a memory interface and an array of multiprocessors coupled to the memory interface. At least one multiprocessor in the array of multiprocessors is configured to execute a fused multiply-add instruction in parallel across multiple threads.
    Type: Application
    Filed: February 5, 2021
    Publication date: June 17, 2021
    Applicant: Intel Corporation
    Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
  • Patent number: 11010659
    Abstract: In an example, an apparatus comprises a compute engine comprising a high precision component and a low precision component; and logic, at least partially including hardware logic, to receive instructions in the compute engine; select at least one of the high precision component or the low precision component to execute the instructions; and apply a gate to at least one of the high precision component or the low precision component to execute the instructions. Other embodiments are also disclosed and claimed.
    Type: Grant
    Filed: April 24, 2017
    Date of Patent: May 18, 2021
    Assignee: INTEL CORPORATION
    Inventors: Kamal Sinha, Balaji Vembu, Eriko Nurvitadhi, Nicolas C. Galoppo Von Borries, Rajkishore Barik, Tsung-Han Lin, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Anbang Yao, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Farshad Akhbari, Narayan Srinivasa, Feng Chen, Dukhwan Kim, Nadathur Rajagopalan Satish, John C. Weast, Mike B. MacPherson, Linda L. Hurd, Vasanth Ranganathan, Sanjeev S. Jahagirdar
  • Publication number: 20210124579
    Abstract: One embodiment provides for a graphics processing unit to accelerate machine-learning operations, the graphics processing unit comprising a multiprocessor having a single instruction, multiple thread (SIMT) architecture, the multiprocessor to execute at least one single instruction; and a first compute unit included within the multiprocessor, the at least one single instruction to cause the first compute unit to perform a two-dimensional matrix multiply and accumulate operation, wherein to perform the two-dimensional matrix multiply and accumulate operation includes to compute a 32-bit intermediate product of 16-bit operands and to compute a 32-bit sum based on the 32-bit intermediate product.
    Type: Application
    Filed: December 9, 2020
    Publication date: April 29, 2021
    Applicant: Intel Corporation
    Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
  • Publication number: 20210081774
    Abstract: One embodiment provides for a general-purpose graphics processing unit including a scheduler to schedule multiple matrix operations for execution by a general-purpose graphics processing unit. The multiple matrix operations are determined based on a single machine learning compute instruction. The single machine learning compute instruction is a convolution instruction and the multiple matrix operations are associated with a convolution operation.
    Type: Application
    Filed: October 28, 2020
    Publication date: March 18, 2021
    Applicant: Intel Corporation
    Inventors: Rajkishore Barik, Elmoustapha Ould-Ahmed-Vall, Xiaoming Chen, Dhawal Srivastava, Anbang Yao, Kevin Nealis, Eriko Nurvitadhi, Sara S. Baghsorkhi, Balaji Vembu, Tatiana Shpeisman, Ping T. Tang
  • Patent number: 10922553
    Abstract: A mechanism is described for facilitating person tracking and data security in machine learning at autonomous machines. A method of embodiments, as described herein, includes detecting, by a camera associated with one or more trackers, a person within a physical vicinity, where detecting includes capturing one or more images the person. The method may further include tracking, by the one or more trackers, the person based on the one or more images of the person, where tracking includes collect tracking data relating to the person. The method may further include selecting a tracker of the one or more trackers as a preferred tracker based on the tracking data.
    Type: Grant
    Filed: April 9, 2019
    Date of Patent: February 16, 2021
    Assignee: INTEL CORPORATION
    Inventors: Mayuresh M. Varerkar, Barnan Das, Narayan Biswal, Stanley J. Baran, Gokcen Cilingir, Nilesh V. Shah, Archie Sharma, Sherine Abdelhak, Sachin Godse, Farshad Akhbari, Narayan Srinivasa, Altug Koker, Nadathur Rajagopalan Satish, Dukhwan Kim, Feng Chen, Abhishek R. Appu, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Anbang Yao, Tatiana Shpeisman, Vasanth Ranganathan, Sanjeev Jahagirdar
  • Publication number: 20200364822
    Abstract: An apparatus to facilitate compute optimization is disclosed. The apparatus includes a mixed precision core to perform a mixed precision multi-dimensional matrix multiply and accumulate operation on 8-bit and/or 32 bit signed or unsigned integer elements.
    Type: Application
    Filed: August 3, 2020
    Publication date: November 19, 2020
    Applicant: Intel Corporation
    Inventors: Abhishek R. Appu, Altug Koker, Linda L. Hurd, Dukhwan Kim, Mike B. Macpherson, John C. Weast, Feng Chen, Farshad Akhbari, Narayan Srinivasa, Nadathur Rajagopalan Satish, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Anbang Yao, Tatiana Shpeisman