Patents by Inventor Mark Anders

Mark Anders has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20210182058
    Abstract: A processing apparatus is provided comprising a multiprocessor having a multithreaded architecture. The multiprocessor can execute at least one single instruction to perform parallel mixed precision matrix operations. In one embodiment the apparatus includes a memory interface and an array of multiprocessors coupled to the memory interface. At least one multiprocessor in the array of multiprocessors is configured to execute a fused multiply-add instruction in parallel across multiple threads.
    Type: Application
    Filed: February 5, 2021
    Publication date: June 17, 2021
    Applicant: Intel Corporation
    Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
  • Publication number: 20210124579
    Abstract: One embodiment provides for a graphics processing unit to accelerate machine-learning operations, the graphics processing unit comprising a multiprocessor having a single instruction, multiple thread (SIMT) architecture, the multiprocessor to execute at least one single instruction; and a first compute unit included within the multiprocessor, the at least one single instruction to cause the first compute unit to perform a two-dimensional matrix multiply and accumulate operation, wherein to perform the two-dimensional matrix multiply and accumulate operation includes to compute a 32-bit intermediate product of 16-bit operands and to compute a 32-bit sum based on the 32-bit intermediate product.
    Type: Application
    Filed: December 9, 2020
    Publication date: April 29, 2021
    Applicant: Intel Corporation
    Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
  • Publication number: 20210102769
    Abstract: Projectile loading systems for toy launchers that discharge soft spherical, but tacky, projectiles, the loading systems including a projectile hopper for storing the projectiles, a chute at the bottom of the hopper, the chute having a central groove for lining the projectiles in a single file, an agitator in the hopper for disturbing the tacky projectiles in the hopper to separate them, and a projectile transfer structure for carrying a projectile, one at a time, from the chute to a breech or from a feed track during respective priming cycles.
    Type: Application
    Filed: October 6, 2020
    Publication date: April 8, 2021
    Applicant: Hasbro, Inc.
    Inventors: Robert J. DeRoche, Robert C. Maschin, Mark Anders
  • Patent number: 10944402
    Abstract: Some embodiments include apparatuses having a first circuit path including drive units coupled in series between a first node and a first additional node, a second circuit path including drive units coupled in series between a second node and a second additional node, each drive unit of the driver units of the first circuit path and the second circuit path including an inverter, and a transmission gate circuit including an input node and an output node coupled to an input node and an output node, respectively, of the inverter; and control circuitry to provide control information to the transmission gate circuit of each of the driver units of the first circuit path and the second circuit path.
    Type: Grant
    Filed: February 14, 2020
    Date of Patent: March 9, 2021
    Assignee: Intel Corporation
    Inventors: SeongJong Kim, Mark A. Anders, Himanshu Kaul
  • Publication number: 20210043500
    Abstract: Embodiments disclosed herein include interconnect layers that include non-uniform interconnect heights and methods of forming such devices. In an embodiment, an interconnect layer comprises an interlayer dielectric (ILD), a first interconnect disposed in the ILD, wherein the first interconnect has a first height, and a second interconnect disposed in the ILD, wherein the second interconnect has a second height that is different than the first height.
    Type: Application
    Filed: August 7, 2019
    Publication date: February 11, 2021
    Inventors: Kevin Lai LIN, Mauro KOBRINSKY, Mark ANDERS, Himanshu KAUL, Ram KRISHNAMURTHY
  • Publication number: 20210043567
    Abstract: Embodiments disclosed herein include a semiconductor device with interconnects with non-uniform heights. In an embodiment, the semiconductor device comprises a semiconductor substrate, and a back end of line (BEOL) stack over the semiconductor substrate. In an embodiment, the BEOL stack comprises first interconnects and second interconnects in an interconnect layer of the BEOL stack. In an embodiment, the first interconnects have a first height and the second interconnects have a second height that is different than the first height.
    Type: Application
    Filed: August 7, 2019
    Publication date: February 11, 2021
    Inventors: Mark ANDERS, Himanshu KAUL, Ram KRISHNAMURTHY, Kevin Lai LIN, Mauro KOBRINSKY
  • Publication number: 20200334038
    Abstract: Disclosed embodiments relate to a variable format, variable sparsity matrix multiplication (VFVSMM) instruction. In one example, a processor includes fetch and decode circuitry to fetch and decode a VFVSMM instruction specifying locations of A, B, and C matrices having (M×K), (K×N), and (M×N) elements, respectively, execution circuitry, responsive to the decoded VFVSMM instruction, to: route each row of the specified A matrix, staggering subsequent rows, into corresponding rows of a (M×N) processing array, and route each column of the specified B matrix, staggering subsequent columns, into corresponding columns of the processing array, wherein each of the processing units is to generate K products of A-matrix elements and matching B-matrix elements having a same row address as a column address of the A-matrix element, and to accumulate each generated product with a corresponding C-matrix element.
    Type: Application
    Filed: July 6, 2020
    Publication date: October 22, 2020
    Inventors: Mark A. ANDERS, Himanshu KAUL, Sanu MATHEW
  • Publication number: 20200297386
    Abstract: A method can include introducing a device into a lung of a patient to a site of a material lodged in the lung. The device can include a distal end configured to core the material, a proximal end, and a tube that includes a hollow interior. The method can further include coring from the material a piece that is sized to pass through the hollow interior of the tube using the distal end of the device. The method can further include applying suction to the proximal end of the device to pass the piece through the hollow interior of the tube and out of the device through the proximal end. Other and further methods are also disclosed.
    Type: Application
    Filed: June 12, 2020
    Publication date: September 24, 2020
    Applicant: Piranha Medical, LLC
    Inventors: Robert A. Ganz, Mark Anders Rydell
  • Patent number: 10722267
    Abstract: A device is configured to clear a bolus of food impacted within an esophagus, the device including a catheter tube having a hollow interior and a distal end configured to core the bolus of food and a proximal end configured to be coupled to a source of suction to clear the core.
    Type: Grant
    Filed: November 21, 2016
    Date of Patent: July 28, 2020
    Assignee: Piranha Medical, LLC
    Inventors: Robert A. Ganz, Mark Anders Rydell
  • Patent number: 10642614
    Abstract: A configurable integrated circuit to compute vector dot products between a first N-bit vector and a second N-bit vector in a plurality of precision modes. An embodiment includes M slices, each of which calculates the vector dot products between a corresponding segment of the first and the second N-bit vectors. Each of the slices outputs intermediary multiplier results for the lower precision modes, but not for highest precision mode. A plurality of adder trees to sum up the plurality of intermediate multiplier results, with each adder tree producing a respective adder out result. An accumulator to merge the adder out result from a first adder tree with the adder out result from a second adder tree to produce the vector dot product of the first and the second N-bit vector in the highest precision mode.
    Type: Grant
    Filed: September 29, 2018
    Date of Patent: May 5, 2020
    Assignee: Intel Corporation
    Inventors: Himanshu Kaul, Mark Anders, Seongjong Kim
  • Patent number: 10599429
    Abstract: Disclosed embodiments relate to a variable format, variable sparsity matrix multiplication (VFVSMM) instruction. In one example, a processor includes fetch and decode circuitry to fetch and decode a VFVSMM instruction specifying locations of A, B, and C matrices having (M×K), (K×N), and (M×N) elements, respectively, execution circuitry, responsive to the decoded VFVSMM instruction, to: route each row of the specified A matrix, staggering subsequent rows, into corresponding rows of a (M×N) processing array, and route each column of the specified B matrix, staggering subsequent columns, into corresponding columns of the processing array, wherein each of the processing units is to generate K products of A-matrix elements and matching B-matrix elements having a same row address as a column address of the A-matrix element, and to accumulate each generated product with a corresponding C-matrix element.
    Type: Grant
    Filed: June 8, 2018
    Date of Patent: March 24, 2020
    Assignee: Intel Corporation
    Inventors: Mark A. Anders, Himanshu Kaul, Sanu Mathew
  • Publication number: 20190369988
    Abstract: One embodiment provides for a graphics processing unit to accelerate machine-learning operations, the graphics processing unit comprising a multiprocessor having a single instruction, multiple thread (SIMT) architecture, the multiprocessor to execute at least one single instruction; and a first compute unit included within the multiprocessor, the at least one single instruction to cause the first compute unit to perform a two-dimensional matrix multiply and accumulate operation, wherein to perform the two-dimensional matrix multiply and accumulate operation includes to compute a 32-bit intermediate product of 16-bit operands and to compute a 32-bit sum based on the 32-bit intermediate product.
    Type: Application
    Filed: June 5, 2019
    Publication date: December 5, 2019
    Applicant: Intel Corporation
    Inventors: HIMANSHU KAUL, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
  • Patent number: 10474458
    Abstract: One embodiment provides for a machine-learning hardware accelerator comprising a compute unit having an adder and a multiplier that are shared between integer data path and a floating-point datapath, the upper bits of input operands to the multiplier to be gated during floating-point operation.
    Type: Grant
    Filed: October 18, 2017
    Date of Patent: November 12, 2019
    Assignee: Intel Corporation
    Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
  • Patent number: 10440377
    Abstract: In accordance with some embodiments, the complexity of motion estimation algorithms that use Haar, SAD and Hadamard transforms may be reduced. In some embodiments, the number of summations may be reduced compared to existing techniques and some of the existing summations may be replaced with compare operations. In some embodiments, additions are replaced with compares in order to balance delay and area or energy or power considerations.
    Type: Grant
    Filed: October 11, 2012
    Date of Patent: October 8, 2019
    Assignee: Intel Corporation
    Inventors: Himanshu Kaul, Mark A. Anders, Ram K. Krishnamurthy
  • Publication number: 20190294415
    Abstract: Systems, apparatuses and methods may provide for technology that conduct a first alignment between a plurality of floating-point numbers based on a first subset of exponent bits. The technology may also conduct, at least partially in parallel with the first alignment, a second alignment between the plurality of floating-point numbers based on a second subset of exponent bits, where the first subset of exponent bits are LSBs and the second subset of exponent bits are MSBs. In one example, technology adds the aligned plurality of floating-point numbers to one another. With regard to the second alignment, the technology may also identify individual exponents of a plurality of floating-point numbers, identify a maximum exponent across the individual exponents, and conduct a subtraction of the individual exponents from the maximum exponent, where the subtraction is conducted from MSB to LSB.
    Type: Application
    Filed: June 7, 2019
    Publication date: September 26, 2019
    Inventors: Himanshu Kaul, Mark Anders
  • Patent number: 10374793
    Abstract: An instruction and logic for a Simon-based hashing for validation are described. In one embodiment, a processor comprises: a memory the memory to store a plurality of values; and a hash circuit comprising a Simon cipher circuit operable to receive the plurality of values from the memory, to apply a Simon cipher, and to generate an output for each of the plurality of values; and circuitry coupled to the Simon cipher circuit to combine outputs from the Simon cipher circuit for each value of the plurality of values into a hash digest that is indicative of whether the values in the memory are valid.
    Type: Grant
    Filed: December 9, 2016
    Date of Patent: August 6, 2019
    Assignee: INTEL CORPORATION
    Inventors: Himanshu Kaul, Sanu Mathew, Mark Anders, Jesse Walker, Jason Sandri
  • Patent number: 10353706
    Abstract: One embodiment provides for a graphics processing unit to accelerate machine-learning operations, the graphics processing unit comprising a multiprocessor having a single instruction, multiple thread (SIMT) architecture, the multiprocessor to execute at least one single instruction; and a first compute unit included within the multiprocessor, the at least one single instruction to cause the first compute unit to perform a two-dimensional matrix multiply and accumulate operation, wherein to perform the two-dimensional matrix multiply and accumulate operation includes to compute a 32-bit intermediate product of 16-bit operands and to compute a 32-bit sum based on the 32-bit intermediate product.
    Type: Grant
    Filed: November 21, 2017
    Date of Patent: July 16, 2019
    Assignee: Intel Corporation
    Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
  • Patent number: 10303735
    Abstract: Systems, apparatuses, and methods for k-nearest neighbor (KNN) searches are described. In particular, embodiments of a KNN accelerator and its uses are described. In some embodiments, the KNN accelerator includes a plurality of vector partial distance computation circuits each to calculate a partial sum, a minimum sort network to sort partial sums from the plurality of vector partial distance computation circuits to find k nearest neighbor matches and a global control circuit to control aspects of operations of the plurality of vector partial distance computation circuits.
    Type: Grant
    Filed: November 18, 2015
    Date of Patent: May 28, 2019
    Assignee: Intel Corporation
    Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew
  • Publication number: 20190042252
    Abstract: A configurable integrated circuit to compute vector dot products between a first N-bit vector and a second N-bit vector in a plurality of precision modes. An embodiment includes M slices, each of which calculates the vector dot products between a corresponding segment of the first and the second N-bit vectors. Each of the slices outputs intermediary multiplier results for the lower precision modes, but not for highest precision mode. A plurality of adder trees to sum up the plurality of intermediate multiplier results, with each adder tree producing a respective adder out result. An accumulator to merge the adder out result from a first adder tree with the adder out result from a second adder tree to produce the vector dot product of the first and the second N-bit vector in the highest precision mode.
    Type: Application
    Filed: September 29, 2018
    Publication date: February 7, 2019
    Inventors: Himanshu Kaul, Mark Anders, Seongjong Kim
  • Publication number: 20190042250
    Abstract: Disclosed embodiments relate to a variable format, variable sparsity matrix multiplication (VFVSMM) instruction. In one example, a processor includes fetch and decode circuitry to fetch and decode a VFVSMM instruction specifying locations of A, B, and C matrices having (M×K), (K×N), and (M×N) elements, respectively, execution circuitry, responsive to the decoded VFVSMM instruction, to: route each row of the specified A matrix, staggering subsequent rows, into corresponding rows of a (M×N) processing array, and route each column of the specified B matrix, staggering subsequent columns, into corresponding columns of the processing array, wherein each of the processing units is to generate K products of A-matrix elements and matching B-matrix elements having a same row address as a column address of the A-matrix element, and to accumulate each generated product with a corresponding C-matrix element.
    Type: Application
    Filed: June 8, 2018
    Publication date: February 7, 2019
    Inventors: Mark A. Anders, Himanshu Kaul, Sanu Mathew