Patents by Inventor Thomas Elmer

Thomas Elmer has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Reducing dynamic power consumption in arrays

Patent number: 10817260

Abstract: Systems and methods are provided to skip multiplication operations with zeros in processing elements of the systolic array to reduce dynamic power consumption. A value of zero can be detected on an input data element entering each row of the array and respective zero indicators may be generated. These respective zero indicators may be passed to all the processing elements in the respective rows. The multiplication operation with the zero value can be skipped in each processing element based on the zero indicators, thus reducing dynamic power consumption.
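The zero-skipping idea in this abstract can be sketched in software. This is a minimal Python model, not the patented circuit: the function name `systolic_step`, the list-of-lists array layout, and the `skipped` counter are assumptions made for illustration only.

```python
def systolic_step(inputs, weights):
    """Model one pass of a rows x cols array of multiply-accumulate elements.

    inputs[r] is the data element entering row r; weights[r][c] is the
    weight held by the processing element at row r, column c.
    Returns the column sums and the number of multiplications skipped.
    """
    rows, cols = len(weights), len(weights[0])
    outputs = [0] * cols
    skipped = 0
    for r in range(rows):
        # Zero indicator: computed once where the element enters the row,
        # then shared by every processing element in that row.
        zero_indicator = (inputs[r] == 0)
        for c in range(cols):
            if zero_indicator:
                skipped += 1  # multiplier stays idle; the partial sum is unchanged
                continue
            outputs[c] += inputs[r] * weights[r][c]
    return outputs, skipped
```

For example, `systolic_step([2, 0, 3], [[1, 2], [3, 4], [5, 6]])` skips both multiplications in the middle row and still produces the same column sums as a full computation.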

Type: Grant

Filed: June 13, 2018

Date of Patent: October 27, 2020

Assignee: Amazon Technologies, Inc.

Inventors: Randy Huang, Ron Diamant, Thomas Elmer, Sundeep Amirineni, Thomas A. Volpe
ACCELERATED QUANTIZED MULTIPLY-AND-ADD OPERATIONS

Publication number: 20200293284

Abstract: Disclosed herein are techniques for accelerating convolution operations or other matrix multiplications in applications such as neural networks. In one example, an apparatus comprises a first circuit, a second circuit, and a third circuit. The first circuit is configured to: receive first values in a first format, the first values being generated from one or more asymmetric quantization operations of second values in a second format, and generate difference values based on subtracting a third value from each of the first values, the third value representing a zero value in the first format. The second circuit is configured to generate a sum of products in the first format using the difference values. The third circuit is configured to convert the sum of products from the first format to the second format based on scaling the sum of products with a scaling factor.

Type: Application

Filed: June 2, 2020

Publication date: September 17, 2020

Inventors: Dana Michelle Vantrease, Randy Huang, Ron Diamant, Thomas Elmer, Sundeep Amirineni
Adapter plate

Patent number: 10674882

Abstract: An adapter device for arranging a container (4, 6) on the upper face (5) of a cleaning device (9), said adapter device consisting of an adapter plate (1, 2) with a support surface (13, 15) and at least one locking element for locking the container (4, 6) in place on the adapter plate. The adapter plate (1, 2) is divided into at least two parts and consists of a front adapter plate (1) and a rear adapter plate (2) which can be separately mounted on the upper face (5) of the cleaning device (9).

Type: Grant

Filed: September 15, 2015

Date of Patent: June 9, 2020

Assignee: NILFISK A/S

Inventors: Henrik Mathiassen, Steen Klimt Johannesen, Thomas Elmer, Trine Baek Nielsen
Accelerated quantized multiply-and-add operations

Patent number: 10678508

Abstract: Disclosed herein are techniques for accelerating convolution operations or other matrix multiplications in applications such as neural networks. A computer-implemented method includes receiving low-precision inputs for a convolution operation from a storage device, and subtracting a low-precision value representing a high-precision zero value from the low-precision inputs to generate difference values, where the low-precision inputs are asymmetrically quantized from high-precision inputs. The method also includes performing multiplication and summation operations on the difference values to generate a sum of products, and generating a high-precision output by scaling the sum of products with a scaling factor.
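The steps named in this abstract (subtract the quantized zero value, multiply and sum, then scale) can be illustrated with a small Python sketch. The 8-bit unsigned range, the helper names `quantize` and `quantized_dot`, and the choice to quantize both inputs and weights with their own scale and zero point are assumptions for illustration; the abstract does not fix these details.

```python
def quantize(values, scale, zero_point):
    """Asymmetric quantization to an assumed unsigned 8-bit range.

    A real value v maps to round(v / scale) + zero_point, so the real
    value 0.0 maps exactly to zero_point.
    """
    return [max(0, min(255, round(v / scale) + zero_point)) for v in values]


def quantized_dot(q_inputs, q_weights, in_zero, w_zero, in_scale, w_scale):
    """Dot product computed on low-precision values.

    1. Subtract the low-precision value that represents real zero.
    2. Multiply and sum the differences using integer arithmetic.
    3. Scale the integer sum of products back to a real-valued result.
    """
    in_diffs = [q - in_zero for q in q_inputs]
    w_diffs = [q - w_zero for q in q_weights]
    sum_of_products = sum(a * b for a, b in zip(in_diffs, w_diffs))
    return sum_of_products * (in_scale * w_scale)
```

With inputs `[1.0, -2.0, 0.0]` quantized at scale 0.5 and zero point 10, and weights `[0.5, -0.25, 1.0]` quantized at scale 0.25 and zero point 128, `quantized_dot` reproduces the real-valued dot product because every value in this example is exactly representable.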

Type: Grant

Filed: March 23, 2018

Date of Patent: June 9, 2020

Assignee: Amazon Technologies, Inc.

Inventors: Dana Michelle Vantrease, Randy Huang, Ron Diamant, Thomas Elmer, Sundeep Amirineni
ACCELERATED QUANTIZED MULTIPLY-AND-ADD OPERATIONS

Publication number: 20190294413

Abstract: Disclosed herein are techniques for accelerating convolution operations or other matrix multiplications in applications such as neural networks. A computer-implemented method includes receiving low-precision inputs for a convolution operation from a storage device, and subtracting a low-precision value representing a high-precision zero value from the low-precision inputs to generate difference values, where the low-precision inputs are asymmetrically quantized from high-precision inputs. The method also includes performing multiplication and summation operations on the difference values to generate a sum of products, and generating a high-precision output by scaling the sum of products with a scaling factor.

Type: Application

Filed: March 23, 2018

Publication date: September 26, 2019

Inventors: Dana Michelle Vantrease, Randy Huang, Ron Diamant, Thomas Elmer, Sundeep Amirineni
Processing denormal numbers in FMA hardware

Patent number: 10078512

Abstract: A microprocessor includes FMA execution logic that determines whether to accumulate an accumulator operand C to the partial products of multiplier and multiplicand operands A and B in the partial product adder or in a second accumulation stage. The logic calculates an exponent delta of Aexp + Bexp − Cexp and determines the number of leading zeroes in C, if C is denormal. The microprocessor accumulates C with the partial products of A and B when the accumulation of C to the product of A and B could result in mass cancellation, when ExpDelta is greater than or equal to −K (where K is related to a width of a datapath in the partial product adder), and when C is denormal and its number of leading zeroes plus K exceeds −ExpDelta. The strategic use of resources in the partial product adder and second accumulation stage reduces latency.
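The path-selection rule in this abstract can be written out as a small predicate. This Python sketch rests on several assumptions: the three "when" clauses are read as alternatives (any one suffices), exponent bias is ignored, and the mass-cancellation test is passed in as a flag because the abstract does not say how it is detected. The function and parameter names are hypothetical.

```python
def use_partial_product_adder(a_exp, b_exp, c_exp, k, could_mass_cancel,
                              c_is_denormal=False, c_leading_zeros=0):
    """Return True if C should be accumulated with the partial products of
    A and B, or False if it should wait for the second accumulation stage.

    Exponents are plain integers (bias ignored, an assumption).
    k relates to the datapath width of the partial product adder.
    """
    # Exponent delta: Aexp + Bexp - Cexp.
    exp_delta = a_exp + b_exp - c_exp

    # Case 1: adding C to A*B could result in mass cancellation.
    if could_mass_cancel:
        return True
    # Case 2: C is not too large relative to the product.
    if exp_delta >= -k:
        return True
    # Case 3: C is denormal, and its leading zeroes pull its significant
    # bits back within reach of the partial product adder's datapath.
    if c_is_denormal and c_leading_zeros + k > -exp_delta:
        return True
    return False
```

For instance, with `k=4` and an exponent delta of -10, a denormal C with 7 leading zeroes qualifies (7 + 4 > 10), while one with 6 leading zeroes does not.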

Type: Grant

Filed: October 3, 2016

Date of Patent: September 18, 2018

Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.

Inventor: Thomas Elmer
Noble metal hydrogenation catalysts with low cracking activity

Patent number: 10023814

Abstract: Methods are provided for modifying hydrogenation catalysts having silica supports (or other non-alumina supports) with additional alumina, and using such catalysts to achieve unexpectedly superior hydrogenation of feedstocks. The modified hydrogenation catalysts can have a relatively low cracking activity while providing an increased activity for hydrogenation.

Type: Grant

Filed: May 18, 2015

Date of Patent: July 17, 2018

Assignee: EXXONMOBIL RESEARCH AND ENGINEERING COMPANY

Inventors: Michael P. Lanci, Stuart L. Soled, Javier Guzman, Sabato Miseo, Thomas Elmer Green, Joseph Ernest Baumgartner
Calculation control indicator cache

Patent number: 10019230

Abstract: An arithmetic operation is performed using a first instruction execution unit to generate an intermediate result vector and a plurality of calculation control indicators that indicate how subsequent calculations to generate a final result from the intermediate result vector should proceed. The intermediate result vector and the plurality of calculation control indicators are stored in memory external to the instruction execution unit, and later read by a second instruction execution unit to complete the arithmetic operation.

Type: Grant

Filed: June 24, 2015

Date of Patent: July 10, 2018

Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.

Inventor: Thomas Elmer
Calculation control indicator cache

Patent number: 10019229

Abstract: A microprocessor comprises an instruction execution unit operable to generate an intermediate result vector and a plurality of calculation control indicators and storage external to the instruction execution unit which stores the intermediate result vector and the plurality of calculation control indicators. The intermediate result vector is generated from an application of at least a first arithmetic operation of a compound arithmetic operation. The calculation control indicators indicate how subsequent calculations to generate a final result from the intermediate result vector should proceed. The subsequent calculations may involve one or more remaining arithmetic operations of the compound arithmetic operation.

Type: Grant

Filed: June 24, 2015

Date of Patent: July 10, 2018

Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.

Inventor: Thomas Elmer
PROCESSING DENORMAL NUMBERS IN FMA HARDWARE

Publication number: 20180095749

Abstract: A microprocessor includes FMA execution logic that determines whether to accumulate an accumulator operand C to the partial products of multiplier and multiplicand operands A and B in the partial product adder or in a second accumulation stage. The logic calculates an exponent delta of Aexp + Bexp − Cexp and determines the number of leading zeroes in C, if C is denormal. The microprocessor accumulates C with the partial products of A and B when the accumulation of C to the product of A and B could result in mass cancellation, when ExpDelta is greater than or equal to −K (where K is related to a width of a datapath in the partial product adder), and when C is denormal and its number of leading zeroes plus K exceeds −ExpDelta. The strategic use of resources in the partial product adder and second accumulation stage reduces latency.

Type: Application

Filed: October 3, 2016

Publication date: April 5, 2018

Inventor: THOMAS ELMER
Subdivision of a fused compound arithmetic operation

Patent number: 9891887

Abstract: A microprocessor prepares a fused multiply-accumulate operation of a form ±A*B±C for execution by issuing first and second multiply-accumulate microinstructions to one or more instruction execution units to complete the fused multiply-accumulate operation. The first multiply-accumulate microinstruction causes an unrounded nonredundant result vector to be generated from a first accumulation of a selected one of (a) the partial products of A and B or (b) C with the partial products of A and B. The second multiply-accumulate microinstruction causes performance of a second accumulation of C with the unrounded nonredundant result vector, if the first accumulation did not include C. The second multiply-accumulate microinstruction also causes a final rounded result to be generated from the unrounded nonredundant result vector, wherein the final rounded result is a complete result of the fused multiply-accumulate operation.

Type: Grant

Filed: June 24, 2015

Date of Patent: February 13, 2018

Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.

Inventor: Thomas Elmer
Split-path heuristic for performing a fused FMA operation

Patent number: 9891886

Abstract: A microprocessor performs a fused multiply-accumulate operation of a form ±A*B±C. An evaluation is made to detect whether values of A, B, and/or C meet a sufficient condition for performing a joint accumulation of C with partial products of A and B. If so, a joint accumulation of C is done with partial products of A and B and the result of the joint accumulation is rounded. If not, then a primary accumulation is done of the partial products of A and B. This generates an unrounded non-redundant result of the primary accumulation. The unrounded result is then truncated to generate an unrounded non-redundant intermediate result vector that excludes one or more least significant bits of the unrounded non-redundant result. A secondary accumulation is then performed, adding or subtracting C to the unrounded non-redundant intermediate result vector. Finally, the result of the secondary accumulation is rounded.

Type: Grant

Filed: June 24, 2015

Date of Patent: February 13, 2018

Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.

Inventor: Thomas Elmer
Standard format intermediate result

Patent number: 9798519

Abstract: A microprocessor comprises an instruction pipeline, a shared memory, and first and second arithmetic processing units in the instruction pipeline, each capable of reading or receiving operands from and writing or providing results to the shared memory. The first arithmetic processing unit performs a first portion of a mathematical operation to produce an intermediate result vector that is not a complete, final result of the mathematical operation. The first arithmetic processing unit generates a plurality of non-architectural calculation control indicators that indicate how subsequent calculations to generate a final result from the intermediate result vector should proceed. The second arithmetic processing unit performs a second portion of the mathematical operation, in accordance with the calculation control indicators, to produce a complete, final result of the mathematical operation.

Type: Grant

Filed: June 24, 2015

Date of Patent: October 24, 2017

Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.

Inventor: Thomas Elmer
Temporally split fused multiply-accumulate operation

Patent number: 9778908

Abstract: A microprocessor splits a fused multiply-accumulate operation of the form A*B+C into first and second multiply-accumulate sub-operations to be performed by a multiplier and an adder. The first sub-operation at least multiplies A and B, and conditionally also accumulates C to the partial products of A and B to generate an unrounded nonredundant sum. The unrounded nonredundant sum is stored in memory shared by the multiplier and adder for an indefinite time period, enabling the multiplier and adder to perform other operations unrelated to the multiply-accumulate operation. The second sub-operation conditionally accumulates C to the unrounded nonredundant sum if C is not already incorporated into the value, and then generates a final rounded result.
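The two sub-operations in this abstract can be modeled in Python with exact rational arithmetic standing in for the unrounded nonredundant sum. This is a sketch under stated assumptions, not the hardware design: the function names are hypothetical, the decision to accumulate C early is passed in as a flag because the abstract does not specify it, and `fractions.Fraction` replaces the stored intermediate format.

```python
from fractions import Fraction


def first_suboperation(a, b, c, accumulate_c_now):
    """Multiply A and B exactly; optionally fold in C. Nothing is rounded.

    Returns the unrounded sum plus a flag recording whether C is included,
    so the pair can be stored and picked up later.
    """
    unrounded = Fraction(a) * Fraction(b)
    if accumulate_c_now:
        unrounded += Fraction(c)
    return unrounded, accumulate_c_now


def second_suboperation(unrounded, c, c_already_included):
    """Add C if the first sub-operation did not, then round exactly once."""
    if not c_already_included:
        unrounded += Fraction(c)
    return float(unrounded)  # the single, final rounding to double precision
```

With `a = 1 + 2**-30`, `b = 1 - 2**-30`, and `c = -1.0`, both orderings return `-(2.0 ** -60)`, whereas the separately rounded expression `a * b + c` evaluates to `0.0`. The difference shows why the intermediate value must stay unrounded between the two sub-operations.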

Type: Grant

Filed: June 24, 2015

Date of Patent: October 3, 2017

Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.

Inventor: Thomas Elmer
Non-atomic split-path fused multiply-accumulate

Patent number: 9778907

Abstract: A microprocessor performs a fused multiply-accumulate operation of a form ±A*B±C using first and second execution units. An input operand analyzer circuit determines whether values of A, B and/or C meet a sufficient condition to perform a joint accumulation of C with partial products of A and B. The first instruction execution unit multiplies A and B and jointly accumulates C to partial products of A and B when the values of A, B and/or C meet a sufficient condition to perform a joint accumulation of C with the partial products of A and B. The second instruction execution unit separately accumulates C to the products of A and B when the values of A, B and/or C do not meet a sufficient condition to perform a joint accumulation of C with the partial products of A and B.

Type: Grant

Filed: June 24, 2015

Date of Patent: October 3, 2017

Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.

Inventor: Thomas Elmer
ADAPTER PLATE

Publication number: 20170273519

Abstract: An adapter device for arranging a container (4, 6) on the upper face (5) of a cleaning device (9), said adapter device consisting of an adapter plate (1, 2) with a support surface (13, 15) and at least one locking element for locking the container (4, 6) in place on the adapter plate. The adapter plate (1, 2) is divided into at least two parts and consists of a front adapter plate (1) and a rear adapter plate (2) which can be separately mounted on the upper face (5) of the cleaning device (9).

Type: Application

Filed: September 15, 2015

Publication date: September 28, 2017

Inventors: Henrik MATHIASSEN, Steen Klimt JOHANNESEN, Thomas ELMER, Trine BAEK NIELSEN
CHAINED SPLIT EXECUTION OF FUSED COMPOUND ARITHMETIC OPERATIONS

Publication number: 20170097824

Abstract: A microprocessor is configured for unchained and chained modes of split execution of a fused compound arithmetic operation. In both modes of split execution, a first execution unit executes only a first part of the fused compound arithmetic operation and produces an intermediate result thereof, and a second instruction execution unit receives the intermediate result and executes a second part of the fused compound arithmetic operation to produce a final result. In the unchained mode, execution is accomplished by dispatching separate split-execution microinstructions to the first and second instruction execution units. In the chained mode, execution is accomplished by dispatching a single split-execution microinstruction to the first instruction execution unit and sending a chaining control signal or signal group to the second execution unit, causing it to execute its part of the fused arithmetic operation without needing an instruction.

Type: Application

Filed: July 5, 2016

Publication date: April 6, 2017

Inventors: THOMAS ELMER, NIKHIL A. PATIL
SPLIT-PATH HEURISTIC FOR PERFORMING A FUSED FMA OPERATION

Publication number: 20160004507

Abstract: A microprocessor performs a fused multiply-accumulate operation of a form ±A*B±C. An evaluation is made to detect whether values of A, B, and/or C meet a sufficient condition for performing a joint accumulation of C with partial products of A and B. If so, a joint accumulation of C is done with partial products of A and B and the result of the joint accumulation is rounded. If not, then a primary accumulation is done of the partial products of A and B. This generates an unrounded non-redundant result of the primary accumulation. The unrounded result is then truncated to generate an unrounded non-redundant intermediate result vector that excludes one or more least significant bits of the unrounded non-redundant result. A secondary accumulation is then performed, adding or subtracting C to the unrounded non-redundant intermediate result vector. Finally, the result of the secondary accumulation is rounded.

Type: Application

Filed: June 24, 2015

Publication date: January 7, 2016

Inventor: THOMAS ELMER
CALCULATION CONTROL INDICATOR CACHE

Publication number: 20160004509

Abstract: An arithmetic operation is performed using a first instruction execution unit to generate an intermediate result vector and a plurality of calculation control indicators that indicate how subsequent calculations to generate a final result from the intermediate result vector should proceed. The intermediate result vector and the plurality of calculation control indicators are stored in memory external to the instruction execution unit, and later read by a second instruction execution unit to complete the arithmetic operation.

Type: Application

Filed: June 24, 2015

Publication date: January 7, 2016

Inventor: THOMAS ELMER
Vacuum cleaner housing

Patent number: D782759

Type: Grant

Filed: December 23, 2015

Date of Patent: March 28, 2017

Assignee: Nilfisk A/S

Inventors: Henrik Mathiassen, Steen Klimt Johannesen, Thomas Elmer
