Patents by Inventor Mark A. Anders

Mark A. Anders has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20190369988
    Abstract: One embodiment provides for a graphics processing unit to accelerate machine-learning operations, the graphics processing unit comprising a multiprocessor having a single instruction, multiple thread (SIMT) architecture, the multiprocessor to execute at least one single instruction; and a first compute unit included within the multiprocessor, the at least one single instruction to cause the first compute unit to perform a two-dimensional matrix multiply and accumulate operation, wherein to perform the two-dimensional matrix multiply and accumulate operation includes to compute a 32-bit intermediate product of 16-bit operands and to compute a 32-bit sum based on the 32-bit intermediate product.
    Type: Application
    Filed: June 5, 2019
    Publication date: December 5, 2019
    Applicant: Intel Corporation
    Inventors: HIMANSHU KAUL, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
  • Patent number: 10474458
    Abstract: One embodiment provides for a machine-learning hardware accelerator comprising a compute unit having an adder and a multiplier that are shared between integer data path and a floating-point datapath, the upper bits of input operands to the multiplier to be gated during floating-point operation.
    Type: Grant
    Filed: October 18, 2017
    Date of Patent: November 12, 2019
    Assignee: Intel Corporation
    Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
  • Patent number: 10440377
    Abstract: In accordance with some embodiments, the complexity of motion estimation algorithms that use Haar, SAD and Hadamard transforms may be reduced. In some embodiments, the number of summations may be reduced compared to existing techniques and some of the existing summations may be replaced with compare operations. In some embodiments, additions are replaced with compares in order to balance delay and area or energy or power considerations.
    Type: Grant
    Filed: October 11, 2012
    Date of Patent: October 8, 2019
    Assignee: Intel Corporation
    Inventors: Himanshu Kaul, Mark A. Anders, Ram K. Krishnamurthy
  • Publication number: 20190294415
    Abstract: Systems, apparatuses and methods may provide for technology that conduct a first alignment between a plurality of floating-point numbers based on a first subset of exponent bits. The technology may also conduct, at least partially in parallel with the first alignment, a second alignment between the plurality of floating-point numbers based on a second subset of exponent bits, where the first subset of exponent bits are LSBs and the second subset of exponent bits are MSBs. In one example, technology adds the aligned plurality of floating-point numbers to one another. With regard to the second alignment, the technology may also identify individual exponents of a plurality of floating-point numbers, identify a maximum exponent across the individual exponents, and conduct a subtraction of the individual exponents from the maximum exponent, where the subtraction is conducted from MSB to LSB.
    Type: Application
    Filed: June 7, 2019
    Publication date: September 26, 2019
    Inventors: Himanshu Kaul, Mark Anders
  • Patent number: 10374793
    Abstract: An instruction and logic for a Simon-based hashing for validation are described. In one embodiment, a processor comprises: a memory the memory to store a plurality of values; and a hash circuit comprising a Simon cipher circuit operable to receive the plurality of values from the memory, to apply a Simon cipher, and to generate an output for each of the plurality of values; and circuitry coupled to the Simon cipher circuit to combine outputs from the Simon cipher circuit for each value of the plurality of values into a hash digest that is indicative of whether the values in the memory are valid.
    Type: Grant
    Filed: December 9, 2016
    Date of Patent: August 6, 2019
    Assignee: INTEL CORPORATION
    Inventors: Himanshu Kaul, Sanu Mathew, Mark Anders, Jesse Walker, Jason Sandri
  • Patent number: 10353706
    Abstract: One embodiment provides for a graphics processing unit to accelerate machine-learning operations, the graphics processing unit comprising a multiprocessor having a single instruction, multiple thread (SIMT) architecture, the multiprocessor to execute at least one single instruction; and a first compute unit included within the multiprocessor, the at least one single instruction to cause the first compute unit to perform a two-dimensional matrix multiply and accumulate operation, wherein to perform the two-dimensional matrix multiply and accumulate operation includes to compute a 32-bit intermediate product of 16-bit operands and to compute a 32-bit sum based on the 32-bit intermediate product.
    Type: Grant
    Filed: November 21, 2017
    Date of Patent: July 16, 2019
    Assignee: Intel Corporation
    Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
  • Patent number: 10303735
    Abstract: Systems, apparatuses, and methods for k-nearest neighbor (KNN) searches are described. In particular, embodiments of a KNN accelerator and its uses are described. In some embodiments, the KNN accelerator includes a plurality of vector partial distance computation circuits each to calculate a partial sum, a minimum sort network to sort partial sums from the plurality of vector partial distance computation circuits to find k nearest neighbor matches and a global control circuit to control aspects of operations of the plurality of vector partial distance computation circuits.
    Type: Grant
    Filed: November 18, 2015
    Date of Patent: May 28, 2019
    Assignee: Intel Corporation
    Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew
  • Publication number: 20190042250
    Abstract: Disclosed embodiments relate to a variable format, variable sparsity matrix multiplication (VFVSMM) instruction. In one example, a processor includes fetch and decode circuitry to fetch and decode a VFVSMM instruction specifying locations of A, B, and C matrices having (M×K), (K×N), and (M×N) elements, respectively, execution circuitry, responsive to the decoded VFVSMM instruction, to: route each row of the specified A matrix, staggering subsequent rows, into corresponding rows of a (M×N) processing array, and route each column of the specified B matrix, staggering subsequent columns, into corresponding columns of the processing array, wherein each of the processing units is to generate K products of A-matrix elements and matching B-matrix elements having a same row address as a column address of the A-matrix element, and to accumulate each generated product with a corresponding C-matrix element.
    Type: Application
    Filed: June 8, 2018
    Publication date: February 7, 2019
    Inventors: Mark A. Anders, Himanshu Kaul, Sanu Mathew
  • Publication number: 20190042252
    Abstract: A configurable integrated circuit to compute vector dot products between a first N-bit vector and a second N-bit vector in a plurality of precision modes. An embodiment includes M slices, each of which calculates the vector dot products between a corresponding segment of the first and the second N-bit vectors. Each of the slices outputs intermediary multiplier results for the lower precision modes, but not for highest precision mode. A plurality of adder trees to sum up the plurality of intermediate multiplier results, with each adder tree producing a respective adder out result. An accumulator to merge the adder out result from a first adder tree with the adder out result from a second adder tree to produce the vector dot product of the first and the second N-bit vector in the highest precision mode.
    Type: Application
    Filed: September 29, 2018
    Publication date: February 7, 2019
    Inventors: Himanshu Kaul, Mark Anders, Seongjong Kim
  • Publication number: 20180344993
    Abstract: A system for clearing a blockage from a lumen of a patient, such as an impacted food bolus, can include a catheter that includes a distal end for insertion into the lumen of the patient. The catheter further includes a proximal end for coupling with a suction source at a position external to the patient. The catheter further includes a lumen extending through the proximal and distal ends. The system can include a cutting tip for cutting a morsel from the bolus as suction is applied to the bolus via the lumen of the catheter. The system can also include a positioning element that transitions from an undeployed state to a deployed state, which is expanded relative to the undeployed state. The positioning element can contact the esophagus to space the cutting tip from the esophagus when in the deployed state.
    Type: Application
    Filed: May 31, 2018
    Publication date: December 6, 2018
    Inventors: Robert A. Ganz, Mark Anders Rydell, Travis Sessions, Steven Berhow, Doug Wahnschaffe, Michael W. Augustine
  • Publication number: 20180315398
    Abstract: One embodiment provides for a machine-learning hardware accelerator comprising a compute unit having an adder and a multiplier that are shared between integer data path and a floating-point datapath, the upper bits of input operands to the multiplier to be gated during floating-point operation.
    Type: Application
    Filed: October 18, 2017
    Publication date: November 1, 2018
    Applicant: Intel Corporation
    Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
  • Publication number: 20180315399
    Abstract: One embodiment provides for a graphics processing unit to accelerate machine-learning operations, the graphics processing unit comprising a multiprocessor having a single instruction, multiple thread (SIMT) architecture, the multiprocessor to execute at least one single instruction; and a first compute unit included within the multiprocessor, the at least one single instruction to cause the first compute unit to perform a two-dimensional matrix multiply and accumulate operation, wherein to perform the two-dimensional matrix multiply and accumulate operation includes to compute a 32-bit intermediate product of 16-bit operands and to compute a 32-bit sum based on the 32-bit intermediate product.
    Type: Application
    Filed: November 21, 2017
    Publication date: November 1, 2018
    Applicant: Intel Corporation
    Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
  • Publication number: 20180280040
    Abstract: A system for clearing a blockage in a patient can include a tubular member that defines a channel, the tubular member being insertable into an esophagus of the patient. The system can further include a catheter assembly that includes a catheter tube, which defines a length greater than a length of the tubular member and is passable through the channel of the tubular member while the tubular member is positioned in the esophagus of the patient. The catheter tube can include a distal tip that defines a cutting element to core the blockage positioned in the esophagus of the patient. The catheter assembly can further include a proximal end coupled with the catheter tube, the proximal end being couplable with a vacuum line such that, when suction is provided via the vacuum line, advancement of the catheter tube into contact with the blockage cores from the blockage a piece of the blockage that is passed through the catheter.
    Type: Application
    Filed: May 31, 2018
    Publication date: October 4, 2018
    Inventors: Robert A. Ganz, Mark Anders Rydell, Travis Sessions, Steven Berhow, Doug Wahnschaffe, Michael W. Augustine
  • Publication number: 20180167199
    Abstract: An instruction and logic for a Simon-based hashing for validation are described. In one embodiment, a processor comprises: a memory the memory to store a plurality of values; and a hash circuit comprising a Simon cipher circuit operable to receive the plurality of values from the memory, to apply a Simon cipher, and to generate an output for each of the plurality of values; and circuitry coupled to the Simon cipher circuit to combine outputs from the Simon cipher circuit for each value of the plurality of values into a hash digest that is indicative of whether the values in the memory are valid.
    Type: Application
    Filed: December 9, 2016
    Publication date: June 14, 2018
    Inventors: Himanshu Kaul, Sanu K. Mathew, Mark A. Anders, Jesse Walker, Jason G. Sandri
  • Patent number: 9992042
    Abstract: A packet-switched request from a first router of a network-on-chip is received. The packet-switched request is generated by source logic of the network-on-chip. Circuit-switched data associated with the packet switched request is also received. The circuit-switched data is stored by a storage element. The circuit-switched data is sent towards destination logic identified in the packet-switched request.
    Type: Grant
    Filed: December 17, 2014
    Date of Patent: June 5, 2018
    Assignee: Intel Corporation
    Inventors: Mark A. Anders, Himanshu Kaul, Gregory K. Chen
  • Patent number: 9979668
    Abstract: A first packet-switched reservation request is received. Data associated with the first packet-switched reservation request is communicated through a first circuit-switched channel according to a best effort communication scheme. A second packet-switched reservation request is received. Data associated with the second packet-switched reservation request is communicated through a second circuit-switched channel according to a guaranteed throughput communication scheme.
    Type: Grant
    Filed: December 22, 2014
    Date of Patent: May 22, 2018
    Assignee: Intel Corporation
    Inventors: Gregory K. Chen, Mark A. Anders, Himanshu Kaul, Ram K. Krishnamurthy, Aaron T. Stillmaker
  • Patent number: 9961019
    Abstract: A packet-switched reservation request to be associated with a first data stream is received. A communication mode is selected. The communication mode is to be either a circuit-switched mode or a packet-switched mode. At least a portion of the first data stream is communicated in accordance with the communication mode.
    Type: Grant
    Filed: December 22, 2014
    Date of Patent: May 1, 2018
    Assignee: Intel Corporation
    Inventors: Gregory K. Chen, Mark A. Anders, Himanshu Kaul, Ram K. Krishnamurthy, Yejoong Kim
  • Patent number: 9940236
    Abstract: A first pointer dereferencer receives a location of a portion of a first node of a data structure. The first node is to be stored in a first storage element. A first pointer is obtained from the first node of the data structure. A location of a portion of a second node of the data structure is determined based on the first pointer. The second node is to be stored in a second storage element. The location of the portion of the second node of the data structure is sent to a second pointer dereferencer that is to access the portion of the second node from the second storage element.
    Type: Grant
    Filed: December 17, 2014
    Date of Patent: April 10, 2018
    Assignee: Intel Corporation
    Inventors: Mark A. Anders, Himanshu Kaul, Gregory K. Chen
  • Patent number: 9923730
    Abstract: A multicast message that is to originate from a source is received. The multicast message comprises an identifier. A plurality of directions in which the multicast message is to fork at the router are stored. A plurality of messages from the directions in which the multicast message is to fork are received. The received messages are to comprise the identifier. The plurality of messages are aggregated into an aggregate message and sent towards the source.
    Type: Grant
    Filed: December 17, 2014
    Date of Patent: March 20, 2018
    Assignee: Intel Corporation
    Inventors: Gregory K. Chen, Mark A. Anders, Himanshu Kaul
  • Patent number: 9866476
    Abstract: A first packet and a first direction associated with the first packet are received. The first packet is forwarded to an output port of a plurality of output ports of the first router based on the first direction associated with the first packet. A second direction associated with the first packet is determined. The second direction is based at least on an address of the first packet. The first packet and the second direction are forwarded through the output port of the first router to a second router.
    Type: Grant
    Filed: December 17, 2014
    Date of Patent: January 9, 2018
    Assignee: Intel Corporation
    Inventors: Mark A. Anders, Gregory K. Chen, Himanshu Kaul