Patents by Inventor Mark A. Anders
Mark A. Anders has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20210182058
Abstract: A processing apparatus is provided comprising a multiprocessor having a multithreaded architecture. The multiprocessor can execute at least one single instruction to perform parallel mixed precision matrix operations. In one embodiment the apparatus includes a memory interface and an array of multiprocessors coupled to the memory interface. At least one multiprocessor in the array of multiprocessors is configured to execute a fused multiply-add instruction in parallel across multiple threads.
Type: Application
Filed: February 5, 2021
Publication date: June 17, 2021
Applicant: Intel Corporation
Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
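For orientation, a fused multiply-add combines one multiply and one add per thread, with the accumulation carried out at a wider precision than the inputs. The sketch below emulates the per-thread operation as a loop over lanes; the 16-bit-input/32-bit-accumulator pairing and all names are illustrative assumptions, not details taken from the patent.

```c
#include <stdint.h>

/* Hedged sketch: one mixed-precision fused multiply-add per "thread",
 * emulated as a loop over lanes.  Operand widths (16-bit inputs,
 * 32-bit accumulator) are an illustrative assumption. */
void fma_lanes(const int16_t *a, const int16_t *b, int32_t *acc, int n)
{
    for (int lane = 0; lane < n; ++lane) {
        /* widen before multiplying so the product keeps full precision */
        int32_t product = (int32_t)a[lane] * (int32_t)b[lane];
        acc[lane] += product;   /* accumulate at the wider precision */
    }
}
```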
-
Publication number: 20210124579
Abstract: One embodiment provides for a graphics processing unit to accelerate machine-learning operations, the graphics processing unit comprising a multiprocessor having a single instruction, multiple thread (SIMT) architecture, the multiprocessor to execute at least one single instruction; and a first compute unit included within the multiprocessor, the at least one single instruction to cause the first compute unit to perform a two-dimensional matrix multiply and accumulate operation, wherein to perform the two-dimensional matrix multiply and accumulate operation includes to compute a 32-bit intermediate product of 16-bit operands and to compute a 32-bit sum based on the 32-bit intermediate product.
Type: Application
Filed: December 9, 2020
Publication date: April 29, 2021
Applicant: Intel Corporation
Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
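As a rough illustration of the arithmetic this abstract describes (16-bit operands, a 32-bit intermediate product, a 32-bit sum), here is a plain-C tile multiply-accumulate. The tile shape and the signed-integer interpretation are assumptions made for the sketch, not details from the patent.

```c
#include <stdint.h>

#define M 4
#define N 4
#define K 4   /* tile shape chosen only for the example */

/* D = C + A * B, with 16-bit inputs and 32-bit products/sums. */
void mma_tile(const int16_t A[M][K], const int16_t B[K][N],
              const int32_t C[M][N], int32_t D[M][N])
{
    for (int i = 0; i < M; ++i) {
        for (int j = 0; j < N; ++j) {
            int32_t sum = C[i][j];              /* 32-bit accumulator */
            for (int k = 0; k < K; ++k) {
                /* 32-bit intermediate product of 16-bit operands */
                sum += (int32_t)A[i][k] * (int32_t)B[k][j];
            }
            D[i][j] = sum;
        }
    }
}
```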
-
Patent number: 10944402
Abstract: Some embodiments include apparatuses having a first circuit path including drive units coupled in series between a first node and a first additional node, a second circuit path including drive units coupled in series between a second node and a second additional node, each drive unit of the driver units of the first circuit path and the second circuit path including an inverter, and a transmission gate circuit including an input node and an output node coupled to an input node and an output node, respectively, of the inverter; and control circuitry to provide control information to the transmission gate circuit of each of the driver units of the first circuit path and the second circuit path.
Type: Grant
Filed: February 14, 2020
Date of Patent: March 9, 2021
Assignee: Intel Corporation
Inventors: SeongJong Kim, Mark A. Anders, Himanshu Kaul
-
Publication number: 20200334038
Abstract: Disclosed embodiments relate to a variable format, variable sparsity matrix multiplication (VFVSMM) instruction. In one example, a processor includes fetch and decode circuitry to fetch and decode a VFVSMM instruction specifying locations of A, B, and C matrices having (M×K), (K×N), and (M×N) elements, respectively, execution circuitry, responsive to the decoded VFVSMM instruction, to: route each row of the specified A matrix, staggering subsequent rows, into corresponding rows of a (M×N) processing array, and route each column of the specified B matrix, staggering subsequent columns, into corresponding columns of the processing array, wherein each of the processing units is to generate K products of A-matrix elements and matching B-matrix elements having a same row address as a column address of the A-matrix element, and to accumulate each generated product with a corresponding C-matrix element.
Type: Application
Filed: July 6, 2020
Publication date: October 22, 2020
Inventors: Mark A. Anders, Himanshu Kaul, Sanu Mathew
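The staggered row/column routing described above is a systolic dataflow; functionally, the processing array computes C += A × B, with the row and column addresses carried alongside each element used to match operands when the inputs are sparse. A hedged, functionally equivalent sketch is shown below: the address matching is modeled implicitly by dense indexing, and all names and types are illustrative rather than taken from the instruction definition.

```c
#include <stddef.h>

/* Functional sketch of what the VFVSMM processing array computes:
 * C[m][n] += sum_k A[m][k] * B[k][n].  In hardware, A rows and B columns
 * stream through an M x N array of processing units in a staggered
 * (systolic) order, and each unit multiplies an A element with the B
 * element whose row address matches the A element's column address. */
void vfvsmm_functional(const float *A, const float *B, float *C,
                       size_t M, size_t K, size_t N)
{
    for (size_t m = 0; m < M; ++m)
        for (size_t n = 0; n < N; ++n)
            for (size_t k = 0; k < K; ++k)
                C[m * N + n] += A[m * K + k] * B[k * N + n];
}
```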
-
Patent number: 10599429
Abstract: Disclosed embodiments relate to a variable format, variable sparsity matrix multiplication (VFVSMM) instruction. In one example, a processor includes fetch and decode circuitry to fetch and decode a VFVSMM instruction specifying locations of A, B, and C matrices having (M×K), (K×N), and (M×N) elements, respectively, execution circuitry, responsive to the decoded VFVSMM instruction, to: route each row of the specified A matrix, staggering subsequent rows, into corresponding rows of a (M×N) processing array, and route each column of the specified B matrix, staggering subsequent columns, into corresponding columns of the processing array, wherein each of the processing units is to generate K products of A-matrix elements and matching B-matrix elements having a same row address as a column address of the A-matrix element, and to accumulate each generated product with a corresponding C-matrix element.
Type: Grant
Filed: June 8, 2018
Date of Patent: March 24, 2020
Assignee: Intel Corporation
Inventors: Mark A. Anders, Himanshu Kaul, Sanu Mathew
-
Publication number: 20190369988
Abstract: One embodiment provides for a graphics processing unit to accelerate machine-learning operations, the graphics processing unit comprising a multiprocessor having a single instruction, multiple thread (SIMT) architecture, the multiprocessor to execute at least one single instruction; and a first compute unit included within the multiprocessor, the at least one single instruction to cause the first compute unit to perform a two-dimensional matrix multiply and accumulate operation, wherein to perform the two-dimensional matrix multiply and accumulate operation includes to compute a 32-bit intermediate product of 16-bit operands and to compute a 32-bit sum based on the 32-bit intermediate product.
Type: Application
Filed: June 5, 2019
Publication date: December 5, 2019
Applicant: Intel Corporation
Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
-
Patent number: 10474458
Abstract: One embodiment provides for a machine-learning hardware accelerator comprising a compute unit having an adder and a multiplier that are shared between integer data path and a floating-point datapath, the upper bits of input operands to the multiplier to be gated during floating-point operation.
Type: Grant
Filed: October 18, 2017
Date of Patent: November 12, 2019
Assignee: Intel Corporation
Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
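One way to picture the upper-bit gating mentioned in the abstract: when the shared multiplier executes a floating-point operation, only the mantissa-width low bits of each operand are significant, so the unused high bits can be forced to zero to suppress switching activity. The behavioral sketch below works under that assumption; the operand widths and mantissa size are illustrative, not taken from the patent.

```c
#include <stdint.h>

/* Behavioral sketch of a multiplier shared between an integer path and a
 * floating-point path.  For FP operation, only the low MANT_BITS of each
 * operand matter, so the upper bits are gated (masked) to reduce
 * switching activity.  Widths are assumptions for illustration. */
#define MANT_BITS 11u   /* e.g. an FP16-style mantissa plus hidden bit */

uint32_t shared_multiply(uint16_t a, uint16_t b, int is_fp_op)
{
    if (is_fp_op) {
        uint16_t mask = (uint16_t)((1u << MANT_BITS) - 1u);
        a &= mask;   /* gate the upper operand bits during FP operation */
        b &= mask;
    }
    return (uint32_t)a * (uint32_t)b;
}
```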
-
Patent number: 10440377
Abstract: In accordance with some embodiments, the complexity of motion estimation algorithms that use Haar, SAD and Hadamard transforms may be reduced. In some embodiments, the number of summations may be reduced compared to existing techniques and some of the existing summations may be replaced with compare operations. In some embodiments, additions are replaced with compares in order to balance delay and area or energy or power considerations.
Type: Grant
Filed: October 11, 2012
Date of Patent: October 8, 2019
Assignee: Intel Corporation
Inventors: Himanshu Kaul, Mark A. Anders, Ram K. Krishnamurthy
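For context, the sum of absolute differences (SAD) that motion-estimation hardware evaluates for each candidate block is shown below as a plain baseline; the patent's contribution is to restructure computations like this so that some additions become compares, which the sketch does not attempt to reproduce. Block size and stride handling are illustrative assumptions.

```c
#include <stdint.h>
#include <stdlib.h>

/* Baseline SAD between an 8x8 current block and a candidate reference
 * block; block size and the shared stride are illustrative assumptions. */
uint32_t sad_8x8(const uint8_t *cur, const uint8_t *ref, int stride)
{
    uint32_t sad = 0;
    for (int y = 0; y < 8; ++y)
        for (int x = 0; x < 8; ++x)
            sad += (uint32_t)abs((int)cur[y * stride + x] -
                                 (int)ref[y * stride + x]);
    return sad;
}
```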
-
Patent number: 10353706
Abstract: One embodiment provides for a graphics processing unit to accelerate machine-learning operations, the graphics processing unit comprising a multiprocessor having a single instruction, multiple thread (SIMT) architecture, the multiprocessor to execute at least one single instruction; and a first compute unit included within the multiprocessor, the at least one single instruction to cause the first compute unit to perform a two-dimensional matrix multiply and accumulate operation, wherein to perform the two-dimensional matrix multiply and accumulate operation includes to compute a 32-bit intermediate product of 16-bit operands and to compute a 32-bit sum based on the 32-bit intermediate product.
Type: Grant
Filed: November 21, 2017
Date of Patent: July 16, 2019
Assignee: Intel Corporation
Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
-
Patent number: 10303735
Abstract: Systems, apparatuses, and methods for k-nearest neighbor (KNN) searches are described. In particular, embodiments of a KNN accelerator and its uses are described. In some embodiments, the KNN accelerator includes a plurality of vector partial distance computation circuits each to calculate a partial sum, a minimum sort network to sort partial sums from the plurality of vector partial distance computation circuits to find k nearest neighbor matches and a global control circuit to control aspects of operations of the plurality of vector partial distance computation circuits.
Type: Grant
Filed: November 18, 2015
Date of Patent: May 28, 2019
Assignee: Intel Corporation
Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew
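A software analogue of the partial-distance idea: accumulate the distance dimension by dimension and abandon a candidate as soon as its partial sum already exceeds the current best. The sketch below does this for k = 1 (nearest neighbor) to keep it short; the full accelerator additionally sorts partial sums across many candidates in a hardware sort network. Function and parameter names are illustrative.

```c
#include <stddef.h>
#include <float.h>

/* Nearest-neighbor search (k = 1) with partial-distance early exit.
 * `points` holds n vectors of dimension d, stored row-major. */
size_t nn_partial_distance(const float *points, size_t n, size_t d,
                           const float *query)
{
    size_t best = 0;
    float best_dist = FLT_MAX;
    for (size_t i = 0; i < n; ++i) {
        float partial = 0.0f;
        for (size_t j = 0; j < d; ++j) {
            float diff = points[i * d + j] - query[j];
            partial += diff * diff;
            if (partial >= best_dist)   /* prune: cannot beat current best */
                break;
        }
        if (partial < best_dist) {
            best_dist = partial;
            best = i;
        }
    }
    return best;
}
```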
-
Publication number: 20190042250
Abstract: Disclosed embodiments relate to a variable format, variable sparsity matrix multiplication (VFVSMM) instruction. In one example, a processor includes fetch and decode circuitry to fetch and decode a VFVSMM instruction specifying locations of A, B, and C matrices having (M×K), (K×N), and (M×N) elements, respectively, execution circuitry, responsive to the decoded VFVSMM instruction, to: route each row of the specified A matrix, staggering subsequent rows, into corresponding rows of a (M×N) processing array, and route each column of the specified B matrix, staggering subsequent columns, into corresponding columns of the processing array, wherein each of the processing units is to generate K products of A-matrix elements and matching B-matrix elements having a same row address as a column address of the A-matrix element, and to accumulate each generated product with a corresponding C-matrix element.
Type: Application
Filed: June 8, 2018
Publication date: February 7, 2019
Inventors: Mark A. Anders, Himanshu Kaul, Sanu Mathew
-
Publication number: 20180315398
Abstract: One embodiment provides for a machine-learning hardware accelerator comprising a compute unit having an adder and a multiplier that are shared between integer data path and a floating-point datapath, the upper bits of input operands to the multiplier to be gated during floating-point operation.
Type: Application
Filed: October 18, 2017
Publication date: November 1, 2018
Applicant: Intel Corporation
Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
-
Publication number: 20180315399
Abstract: One embodiment provides for a graphics processing unit to accelerate machine-learning operations, the graphics processing unit comprising a multiprocessor having a single instruction, multiple thread (SIMT) architecture, the multiprocessor to execute at least one single instruction; and a first compute unit included within the multiprocessor, the at least one single instruction to cause the first compute unit to perform a two-dimensional matrix multiply and accumulate operation, wherein to perform the two-dimensional matrix multiply and accumulate operation includes to compute a 32-bit intermediate product of 16-bit operands and to compute a 32-bit sum based on the 32-bit intermediate product.
Type: Application
Filed: November 21, 2017
Publication date: November 1, 2018
Applicant: Intel Corporation
Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
-
Publication number: 20180167199
Abstract: An instruction and logic for a Simon-based hashing for validation are described. In one embodiment, a processor comprises: a memory, the memory to store a plurality of values; and a hash circuit comprising a Simon cipher circuit operable to receive the plurality of values from the memory, to apply a Simon cipher, and to generate an output for each of the plurality of values; and circuitry coupled to the Simon cipher circuit to combine outputs from the Simon cipher circuit for each value of the plurality of values into a hash digest that is indicative of whether the values in the memory are valid.
Type: Application
Filed: December 9, 2016
Publication date: June 14, 2018
Inventors: Himanshu Kaul, Sanu K. Mathew, Mark A. Anders, Jesse Walker, Jason G. Sandri
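For reference, the core of the Simon cipher referred to above is its round function, shown below for 32-bit words (the Simon64 variant). The key schedule is omitted and round keys are assumed to be supplied by the caller, and XOR-folding the per-value cipher outputs into a digest is just one simple combining choice, not necessarily the circuitry the patent describes.

```c
#include <stdint.h>
#include <stddef.h>

static inline uint32_t rotl32(uint32_t v, unsigned r)
{
    return (v << r) | (v >> (32u - r));
}

/* One Simon round on a 64-bit block split into two 32-bit words (x, y). */
static void simon64_round(uint32_t *x, uint32_t *y, uint32_t round_key)
{
    uint32_t tmp = *x;
    *x = *y ^ (rotl32(*x, 1) & rotl32(*x, 8)) ^ rotl32(*x, 2) ^ round_key;
    *y = tmp;
}

/* Hash sketch: encrypt each 64-bit value with Simon (round keys assumed
 * to be precomputed by the caller) and XOR the outputs into a digest. */
uint64_t simon_hash(const uint64_t *values, size_t count,
                    const uint32_t *round_keys, size_t rounds)
{
    uint64_t digest = 0;
    for (size_t i = 0; i < count; ++i) {
        uint32_t x = (uint32_t)(values[i] >> 32);
        uint32_t y = (uint32_t)values[i];
        for (size_t r = 0; r < rounds; ++r)
            simon64_round(&x, &y, round_keys[r]);
        digest ^= ((uint64_t)x << 32) | y;
    }
    return digest;
}
```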
-
Patent number: 9992042
Abstract: A packet-switched request from a first router of a network-on-chip is received. The packet-switched request is generated by source logic of the network-on-chip. Circuit-switched data associated with the packet switched request is also received. The circuit-switched data is stored by a storage element. The circuit-switched data is sent towards destination logic identified in the packet-switched request.
Type: Grant
Filed: December 17, 2014
Date of Patent: June 5, 2018
Assignee: Intel Corporation
Inventors: Mark A. Anders, Himanshu Kaul, Gregory K. Chen
-
Patent number: 9979668
Abstract: A first packet-switched reservation request is received. Data associated with the first packet-switched reservation request is communicated through a first circuit-switched channel according to a best effort communication scheme. A second packet-switched reservation request is received. Data associated with the second packet-switched reservation request is communicated through a second circuit-switched channel according to a guaranteed throughput communication scheme.
Type: Grant
Filed: December 22, 2014
Date of Patent: May 22, 2018
Assignee: Intel Corporation
Inventors: Gregory K. Chen, Mark A. Anders, Himanshu Kaul, Ram K. Krishnamurthy, Aaron T. Stillmaker
-
Patent number: 9961019
Abstract: A packet-switched reservation request to be associated with a first data stream is received. A communication mode is selected. The communication mode is to be either a circuit-switched mode or a packet-switched mode. At least a portion of the first data stream is communicated in accordance with the communication mode.
Type: Grant
Filed: December 22, 2014
Date of Patent: May 1, 2018
Assignee: Intel Corporation
Inventors: Gregory K. Chen, Mark A. Anders, Himanshu Kaul, Ram K. Krishnamurthy, Yejoong Kim
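A toy model of the mode decision described above, under the assumption that the selection is driven by something like stream length (short transfers stay packet-switched, long ones justify reserving a circuit). The threshold and the criterion itself are illustrative guesses, not the patent's actual policy.

```c
#include <stddef.h>

enum comm_mode { PACKET_SWITCHED, CIRCUIT_SWITCHED };

/* Illustrative policy: reserving a circuit has setup cost, so it only
 * pays off for streams longer than some threshold.  The real selection
 * criteria belong to the router implementation, not this sketch. */
enum comm_mode select_mode(size_t stream_bytes, size_t circuit_threshold)
{
    return (stream_bytes >= circuit_threshold) ? CIRCUIT_SWITCHED
                                               : PACKET_SWITCHED;
}
```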
-
Patent number: 9940236
Abstract: A first pointer dereferencer receives a location of a portion of a first node of a data structure. The first node is to be stored in a first storage element. A first pointer is obtained from the first node of the data structure. A location of a portion of a second node of the data structure is determined based on the first pointer. The second node is to be stored in a second storage element. The location of the portion of the second node of the data structure is sent to a second pointer dereferencer that is to access the portion of the second node from the second storage element.
Type: Grant
Filed: December 17, 2014
Date of Patent: April 10, 2018
Assignee: Intel Corporation
Inventors: Mark A. Anders, Himanshu Kaul, Gregory K. Chen
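In software terms, the mechanism above is pointer chasing in which each step is handled by a dereferencer co-located with the storage element that holds the next node. The sketch below models the handoff as a loop over an array of memories; the pointer encoding (high bits select the memory, low bits give the offset) and the node layout are assumptions made only for illustration.

```c
#include <stdint.h>
#include <stddef.h>

/* A linked-list node whose "next" pointer encodes both which storage
 * element holds the next node and the offset within it.  This encoding
 * (high byte = memory id, low bits = offset) is illustrative. */
struct node {
    uint32_t value;
    uint32_t next;          /* 0xFFFFFFFF terminates the list */
};

#define MEM_ID(ptr)   ((ptr) >> 24)
#define OFFSET(ptr)   ((ptr) & 0x00FFFFFFu)
#define NULL_PTR      0xFFFFFFFFu

/* Walk the list, each step "handed" to the memory that owns the node,
 * emulating one pointer dereferencer per storage element. */
uint32_t walk_list(struct node *const memories[], uint32_t head)
{
    uint32_t sum = 0;
    uint32_t ptr = head;
    while (ptr != NULL_PTR) {
        const struct node *n = &memories[MEM_ID(ptr)][OFFSET(ptr)];
        sum += n->value;
        ptr = n->next;      /* next dereference happens at n's memory */
    }
    return sum;
}
```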
-
Patent number: 9923730
Abstract: A multicast message that is to originate from a source is received. The multicast message comprises an identifier. A plurality of directions in which the multicast message is to fork at the router are stored. A plurality of messages from the directions in which the multicast message is to fork are received. The received messages are to comprise the identifier. The plurality of messages are aggregated into an aggregate message and sent towards the source.
Type: Grant
Filed: December 17, 2014
Date of Patent: March 20, 2018
Assignee: Intel Corporation
Inventors: Gregory K. Chen, Mark A. Anders, Himanshu Kaul
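Functionally, the router records where a multicast forked and later merges the replies returning on those branches into a single message toward the source. A minimal sketch of the aggregation step is below, assuming replies carry the multicast identifier and a small payload that can be OR-combined; the combining rule and data layout are assumptions for the example, not the patent's.

```c
#include <stdint.h>
#include <stddef.h>

struct reply {
    uint32_t multicast_id;   /* identifier carried by the original multicast */
    uint32_t payload;        /* e.g. per-branch status bits */
};

/* Merge the replies returning on the recorded fork directions into one
 * aggregate reply to send toward the source.  OR-combining payloads is an
 * illustrative choice; a real router might combine them differently. */
struct reply aggregate_replies(const struct reply *branch_replies,
                               size_t fork_count, uint32_t multicast_id)
{
    struct reply out = { multicast_id, 0 };
    for (size_t i = 0; i < fork_count; ++i) {
        if (branch_replies[i].multicast_id == multicast_id)
            out.payload |= branch_replies[i].payload;
    }
    return out;
}
```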
-
Patent number: 9866476
Abstract: A first packet and a first direction associated with the first packet are received. The first packet is forwarded to an output port of a plurality of output ports of the first router based on the first direction associated with the first packet. A second direction associated with the first packet is determined. The second direction is based at least on an address of the first packet. The first packet and the second direction are forwarded through the output port of the first router to a second router.
Type: Grant
Filed: December 17, 2014
Date of Patent: January 9, 2018
Assignee: Intel Corporation
Inventors: Mark A. Anders, Gregory K. Chen, Himanshu Kaul
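The scheme above is a form of lookahead routing: a packet arrives already carrying the direction the current router should use, and the router's added work is computing the direction for the next hop from the destination address. The sketch below assumes simple XY dimension-order routing on a mesh; that routing function and all names are assumptions for illustration, not details from the patent.

```c
#include <stdint.h>

enum dir { DIR_EAST, DIR_WEST, DIR_NORTH, DIR_SOUTH, DIR_LOCAL };

struct packet {
    uint8_t dest_x, dest_y;   /* destination address */
    enum dir next_dir;        /* precomputed by the previous router */
};

/* Direction the *next* router should use, assuming XY dimension-order
 * routing; cur_x/cur_y identify that next router's mesh position. */
static enum dir route_xy(uint8_t cur_x, uint8_t cur_y, const struct packet *p)
{
    if (p->dest_x > cur_x) return DIR_EAST;
    if (p->dest_x < cur_x) return DIR_WEST;
    if (p->dest_y > cur_y) return DIR_NORTH;
    if (p->dest_y < cur_y) return DIR_SOUTH;
    return DIR_LOCAL;
}

/* At each hop: the output port comes from the direction the packet already
 * carries; the packet is then updated with the direction the next router
 * will need, so that router can forward without a routing delay. */
enum dir forward_with_lookahead(struct packet *p, uint8_t next_x, uint8_t next_y)
{
    enum dir out_port = p->next_dir;            /* precomputed at prior hop */
    p->next_dir = route_xy(next_x, next_y, p);  /* lookahead for next hop */
    return out_port;
}
```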