Patents by Inventor Mark Ulrich

Mark Ulrich has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Hardware for floating-point arithmetic in multiple formats

Patent number: 11275560

Abstract: A floating-point number in a first format representation is received. Based on an identification of a floating-point format type of the floating-point number, different components of the first format representation are identified. The different components of the first format representation are placed in corresponding components of a second format representation of the floating-point number, wherein a total number of bits of the second format representation is larger than a total number of bits of the first format representation. At least one of the components of the second format representation is padded with one or more zero bits. The floating-point number in the second format representation is stored in a register. A multiplication using the second format representation of the floating-point number is performed.

Type: Grant

Filed: February 19, 2020

Date of Patent: March 15, 2022

Assignee: Meta Platforms, Inc.

Inventors: Thomas Mark Ulrich, Abdulkadir Utku Diril, Krishnakumar Narayanan Nair, Zhao Wang, Rakesh Komuravelli
Floating point multiply hardware using decomposed component numbers

Patent number: 11188303

Abstract: A processor system comprises one or more logic units configured to receive a processor instruction identifying a first floating point number to be multiplied with a second floating point number. The floating point numbers are each decomposed into a group of a plurality of component numbers, wherein a number of bits used to represent each floating point number is greater than a number of bits used to represent any component number in each group of the plurality of component numbers. The component numbers of the first group are multiplied with the component numbers of the second group to determine intermediate multiplication results that are summed together to determine an effective result that represents a result of multiplying the first floating point number with the second floating point number.

Type: Grant

Filed: October 2, 2019

Date of Patent: November 30, 2021

Assignee: Facebook, Inc.

Inventors: Krishnakumar Narayanan Nair, Anup Ramesh Kadkol, Ehsan Khish Ardestani Zadeh, Olivia Wu, Yuchen Hao, Thomas Mark Ulrich, Rakesh Komuravelli
USING A LOW-BIT-WIDTH DOT PRODUCT ENGINE TO SUM HIGH-BIT-WIDTH NUMBERS

Publication number: 20210349690

Abstract: A device (e.g., an integrated circuit chip) includes a dot product processing component, a data alignment component, and an accumulator. The dot product processing component is configured to calculate a dot product of a first group of elements stored in a first storage unit with a second group of elements, wherein: each element of the first group of elements is represented using a first number of bits, each value of a group of values stored in the first storage unit is represented using a second number of bits greater than the first number of bits, and each value of the group of values is stored as split segments across more than one element of the elements of the first group of elements. The data alignment component is configured to receive results of the dot product processing component and modify one or more of the results of the dot product processing component. The accumulator is configured to sum outputs of the data alignment component to at least in part determine a sum of the group of values.

Type: Application

Filed: May 7, 2020

Publication date: November 11, 2021

Inventors: Thomas Mark Ulrich, Krishnakumar Narayanan Nair, Ehsan Khish Ardestani Zadeh
DEVICE AND METHOD FOR FLEXIBLY SUMMING MATRIX VALUES

Publication number: 20210349965

Abstract: A device (e.g., an application-specific integrated circuit chip) includes a matrix transpose component, a matrix processing component, a data alignment component, and a data reduction component. The matrix transpose component is configured to transpose an input matrix of elements to output an output matrix of the elements that have been transposed, wherein: each element of the input matrix of elements is represented using a first number of bits, each value of a group of values stored in the input matrix is represented using a second number of bits greater than the first number of bits, and each value of the group of values is stored as split segments across more than one element of the elements of the input matrix.

Type: Application

Filed: May 7, 2020

Publication date: November 11, 2021

Inventors: Krishnakumar Narayanan Nair, Thomas Mark Ulrich, Ehsan Khish Ardestani Zadeh
BYPASSING ZERO-VALUE MULTIPLICATIONS IN A HARDWARE MULTIPLIER

Publication number: 20210349694

Abstract: A device (e.g., integrated circuit chip) includes a first operand register, a second operand register, a multiplication unit, and a hardware logic component. The first operand register is configured to store a first operand value. The second operand register is configured to store a second operand value. The multiplication unit is configured to at least multiply the first operand value with the second operand value. The hardware logic component is configured to detect whether a zero value is provided and in response to a detection that the zero value is being provided: cause an update of at least the first operand register to be disabled, and cause a result of a multiplication of the first operand value with the second operand value to be a zero-value result.

Type: Application

Filed: May 7, 2020

Publication date: November 11, 2021

Inventors: Thomas Mark Ulrich, Abdulkadir Utku Diril, Zhao Wang
MAPPING CONVOLUTION TO CONNECTED PROCESSING ELEMENTS USING DISTRIBUTED PIPELINED SEPARABLE CONVOLUTION OPERATIONS

Publication number: 20210334072

Abstract: A processor system comprises a plurality of dot product processor units and element-wise multiplication units. The dot product processor units perform a depthwise convolution of a data matrix with a separate depthwise convolution weight matrix for each data matrix channel. Each dot product processor unit performs at least a portion of the depthwise convolution for one or more data matrix channels. The element-wise multiplication units perform multiplication operations of a pointwise convolution. Each element-wise multiplication unit applies to each depthwise convolution partial result element received from one or more of the dot product processor units a corresponding data element from each of a plurality of pointwise convolution weight filters to determine element-wise multiplication unit results. The processor system sums together different groups of data elements from the element-wise multiplication unit results to at least in part calculate different data elements of a result of the pointwise convolution.

Type: Application

Filed: April 22, 2020

Publication date: October 28, 2021

Inventors: Rakesh Komuravelli, Krishnakumar Narayanan Nair, Abdulkadir Utku Diril, Ehsan Khish Ardestani Zadeh, Yuchen Hao, Martin Schatz, Thomas Mark Ulrich, Olivia Wu, Anup Ramesh Kadkol, Amin Firoozshahian
Matrix multiplication in hardware using modular math

Patent number: 11157594

Abstract: A first group of modulo result matrices corresponding to modulo of elements of a first matrix by each of a plurality of moduli is stored. A second group of modulo result matrices corresponding to modulo of elements of a second matrix by each of the plurality of moduli is stored. It is determined whether an element operation of a multiplication of the first matrix with the second matrix can be performed using a first hardware multiplication module rather than a second hardware multiplication module. In response to a determination that the element operation can be performed using the first hardware multiplication module, the element operation is performed using the first hardware multiplication module including by multiplying one or more corresponding elements from the first group of modulo result matrices with one or more corresponding elements from the second group of modulo result matrices.

Type: Grant

Filed: July 24, 2019

Date of Patent: October 26, 2021

Assignee: Facebook, Inc.

Inventor: Thomas Mark Ulrich
GROUPED CONVOLUTION USING POINT-TO-POINT CONNECTED CHANNEL CONVOLUTION ENGINES

Publication number: 20210319076

Abstract: A processor system comprises a plurality of processing elements. Each processing element includes a corresponding convolution processor unit configured to perform a portion of a groupwise convolution. The corresponding convolution processor unit determines multiplication results by multiplying each data element of a portion of data elements in a convolution data matrix with a corresponding data element in a corresponding groupwise convolution weight matrix. The portion of data elements in the convolution data matrix that are multiplied belong to different channels and different groups. For each specific channel of the different channels, the corresponding convolution processor unit sums together at least some of the multiplication results belonging to the same specific channel to determine a corresponding channel convolution result data element.

Type: Application

Filed: April 8, 2020

Publication date: October 14, 2021

Inventors: Rakesh Komuravelli, Krishnakumar Narayanan Nair, Abdulkadir Utku Diril, Ehsan Khish Ardestani Zadeh, Yuchen Hao, Martin Schatz, Thomas Mark Ulrich, Olivia Wu, Anup Ramesh Kadkol, Amin Firoozshahian
PIPELINED POINTWISE CONVOLUTION USING PER-CHANNEL CONVOLUTION OPERATIONS

Publication number: 20210294875

Abstract: A processor system comprises a hardware channel convolution processor unit and dot product processor unit. The channel convolution processor unit is configured to perform depthwise convolution, including by multiplying each data element of a first group of data elements of a convolution data matrix with a corresponding data element of a second group of data elements of a plurality of depthwise convolution weight matrices and summing together, for each specific channel, multiplication results corresponding to the specific channel to determine one corresponding result data element in a corresponding channel convolution result matrix to calculate a portion of depthwise convolution results.

Type: Application

Filed: March 23, 2020

Publication date: September 23, 2021

Inventors: Rakesh Komuravelli, Krishnakumar Narayanan Nair, Abdulkadir Utku Diril, Ehsan Khish Ardestani Zadeh, Yuchen Hao, Martin Schatz, Thomas Mark Ulrich, Olivia Wu, Anup Ramesh Kadkol, Amin Firoozshahian
MAPPING CONVOLUTION TO A PARTITION CHANNEL CONVOLUTION ENGINE

Publication number: 20210271451

Abstract: A processor system comprises two groups of registers and a hardware channel convolution processor unit. The first group of registers is configured to store data elements of channels of a portion of a convolution data matrix. Each register stores at least one data element from each channel. The second group of registers is configured to store data elements of convolution weight matrices including a separate matrix for each channel. Each register stores at least one data element from each matrix. The hardware channel convolution processor unit is configured to multiply each data element in a first and second portion of the first group of registers with a corresponding data element in the second group of registers to determine corresponding multiplication results and sum together the multiplication results for each specific channel to determine two corresponding channel convolution result data elements in a corresponding channel convolution result matrix.

Type: Application

Filed: February 28, 2020

Publication date: September 2, 2021

Inventors: Krishnakumar Narayanan Nair, Rakesh Komuravelli, Abdulkadir Utku Diril, Ehsan Khish Ardestani Zadeh, Yuchen Hao, Martin Schatz, Thomas Mark Ulrich, Olivia Wu, Anup Ramesh Kadkol, Amin Firoozshahian
MAPPING CONVOLUTION TO A CHANNEL CONVOLUTION ENGINE

Publication number: 20210256363

Abstract: A processor system comprises a first and second group of registers and a hardware channel convolution processor unit. The first group of registers is configured to store data elements of channels of a portion of a convolution data matrix. Each register stores at least one data element from each channel. The second group of registers is configured to store data elements of convolution weight matrices including a separate convolution weight matrix for each channel. Each register stores at least one data element from each convolution weight matrix. The hardware channel convolution processor unit is configured to multiply each data element in the first group of registers with a corresponding data element in the second group of registers and sum together the multiplication results for each specific channel to determine corresponding channel convolution result data elements in a corresponding channel convolution result matrix.

Type: Application

Filed: February 18, 2020

Publication date: August 19, 2021

Inventors: Krishnakumar Narayanan Nair, Rakesh Komuravelli, Abdulkadir Utku Diril, Ehsan Khish Ardestani Zadeh, Yuchen Hao, Martin Schatz, Thomas Mark Ulrich, Olivia Wu, Anup Ramesh Kadkol, Amin Firoozshahian
HARDWARE FOR FLOATING-POINT ARITHMETIC IN MULTIPLE FORMATS

Publication number: 20210255830

Abstract: A floating-point number in a first format representation is received. Based on an identification of a floating-point format type of the floating-point number, different components of the first format representation are identified. The different components of the first format representation are placed in corresponding components of a second format representation of the floating-point number, wherein a total number of bits of the second format representation is larger than a total number of bits of the first format representation. At least one of the components of the second format representation is padded with one or more zero bits. The floating-point number in the second format representation is stored in a register. A multiplication using the second format representation of the floating-point number is performed.

Type: Application

Filed: February 19, 2020

Publication date: August 19, 2021

Inventors: Thomas Mark Ulrich, Abdulkadir Utku Diril, Krishnakumar Narayanan Nair, Zhao Wang, Rakesh Komuravelli
HARDWARE ACCELERATED MATRIX MANIPULATION OPERATIONS USING PROCESSOR INSTRUCTIONS

Publication number: 20210173646

Abstract: A processor system comprises a shared memory and a processing element. The processing element includes a matrix processor unit and is in communication with the shared memory. The processing element is configured to receive a processor instruction specifying a data matrix and a matrix manipulation operation. A manipulation matrix based on the processor instruction is identified. The data matrix and the manipulation matrix are loaded into the matrix processor unit and a matrix operation is performed to determine a result matrix. The result matrix is outputted to a destination location.

Type: Application

Filed: December 9, 2019

Publication date: June 10, 2021

Inventors: Thomas Mark Ulrich, Krishnakumar Narayanan Nair, Yuchen Hao
SUPPORT FOR DIFFERENT MATRIX MULTIPLICATIONS BY SELECTING ADDER TREE INTERMEDIATE RESULTS

Publication number: 20210125044

Abstract: A first group of elements is element-wise multiplied with a second group of elements using a plurality of multipliers belonging to a matrix multiplication hardware unit. Results of the plurality of multipliers are added together using a hierarchical tree of adders belonging to the matrix multiplication hardware unit and a final result of the hierarchical tree of adders or any of a plurality of intermediate results of the hierarchical tree of adders is selectively provided for use in determining an output result matrix.

Type: Application

Filed: October 29, 2019

Publication date: April 29, 2021

Inventors: Yuchen Hao, Krishnakumar Narayanan Nair, Ehsan Khish Ardestani Zadeh, Rakesh Komuravelli, Abdulkadir Utku Diril, Thomas Mark Ulrich
HIGH THROUGHPUT MATRIX PROCESSOR WITH SUPPORT FOR CONCURRENTLY PROCESSING MULTIPLE MATRICES

Publication number: 20210124794

Abstract: A system comprises a data input vector unit, a weight input vector unit, and a plurality of calculation units of a matrix processor unit. The data input vector unit is configured to concurrently receive elements of different rows of a first and second data matrix. The weight input vector unit is configured to receive a combined weight vector and at least in part concurrently provide obtained weight elements of a first and second weight matrix to a corresponding first and second group of calculation units. Each calculation unit of the first and second group of calculation units is configured to multiply elements from the data input vector unit with elements of the corresponding weight matrix from the weight input vector unit and sum together multiplication results of the corresponding calculation unit to at least in part determine a corresponding element in a first or second convolution result matrix.

Type: Application

Filed: October 29, 2019

Publication date: April 29, 2021

Inventors: Krishnakumar Narayanan Nair, Olivia Wu, Ehsan Khish Ardestani Zadeh, Abdulkadir Utku Diril, Thomas Mark Ulrich, Yuchen Hao, Rakesh Komuravelli, Aravind Kalaiah
FLOATING POINT MULTIPLY HARDWARE USING DECOMPOSED COMPONENT NUMBERS

Publication number: 20210103429

Abstract: A processor system comprises one or more logic units configured to receive a processor instruction identifying a first floating point number to be multiplied with a second floating point number. The floating point numbers are each decomposed into a group of a plurality of component numbers, wherein a number of bits used to represent each floating point number is greater than a number of bits used to represent any component number in each group of the plurality of component numbers. The component numbers of the first group are multiplied with the component numbers of the second group to determine intermediate multiplication results that are summed together to determine an effective result that represents a result of multiplying the first floating point number with the second floating point number.

Type: Application

Filed: October 2, 2019

Publication date: April 8, 2021

Inventors: Krishnakumar Narayanan Nair, Anup Ramesh Kadkol, Ehsan Khish Ardestani Zadeh, Olivia Wu, Yuchen Hao, Thomas Mark Ulrich, Rakesh Komuravelli
NUMBER-THEORETIC TRANSFORM HARDWARE

Publication number: 20210073316

Abstract: A forward number-theoretic transform dedicated hardware unit is configured to calculate a number-theoretic transform of an input vector, wherein a root of unity of the number-theoretic transform performed by the forward number-theoretic transform dedicated hardware unit is a power of two. The forward number-theoretic transform dedicated hardware unit includes data routing paths, a plurality of hardware binary bit shifters, and a plurality of adders.

Type: Application

Filed: September 9, 2019

Publication date: March 11, 2021

Inventor: Thomas Mark Ulrich
MATRIX MULTIPLICATION IN HARDWARE USING MODULAR MATH

Publication number: 20210026916

Abstract: A first group of modulo result matrices corresponding to modulo of elements of a first matrix by each of a plurality of moduli is stored. A second group of modulo result matrices corresponding to modulo of elements of a second matrix by each of the plurality of moduli is stored. It is determined whether an element operation of a multiplication of the first matrix with the second matrix can be performed using a first hardware multiplication module rather than a second hardware multiplication module. In response to a determination that the element operation can be performed using the first hardware multiplication module, the element operation is performed using the first hardware multiplication module including by multiplying one or more corresponding elements from the first group of modulo result matrices with one or more corresponding elements from the second group of modulo result matrices.

Type: Application

Filed: July 24, 2019

Publication date: January 28, 2021

Inventor: Thomas Mark Ulrich
Power source for reducing electromagnetic interference and power consumption

Patent number: 10843287

Abstract: A welding power source configured to receive an input power includes a plurality of components and supervising circuitry configurable in a plurality of modes. The plurality of components include power conversion circuitry and background power supply. The supervising circuitry is configured to distribute the input power to the plurality of components based at least in part on a mode of the plurality of modes. The plurality of modes include a welding mode configured to distribute the input power to the power conversion circuitry and to the background power supply. The plurality of modes also include a monitoring mode configured to distribute the input power to the background power supply, and to not distribute the input power to the power conversion circuitry.

Type: Grant

Filed: March 20, 2018

Date of Patent: November 24, 2020

Assignee: Illinois Tool Works Inc.

Inventors: Indra B. Wiryadinata, Mark A. Ulrich, Charles L. Kaufman
Energy Conservation System, Method, and Apparatus for Use with Welding Equipment

Publication number: 20200338658

Abstract: A welding system comprising a user manipulatable device, a proximity sensor positioned on said user manipulatable device, an engine capable of operation in an energy conservation mode and an operating mode, and a generator operatively coupled to the engine. The generator may provide at least one of (1) a welding current output or (2) an auxiliary power output. The engine may operate (1) in the operating mode when the proximity sensor outputs the first output signal, and (2) in the energy conservation mode when the engine is not operating in the operating mode. The proximity sensor may be configured to output the first output signal when an operator of said welding system is in-range.

Type: Application

Filed: July 9, 2020

Publication date: October 29, 2020

Inventor: Mark Ulrich

prev 1 2 3 4 5 next