Patents by Inventor Ping Tak Peter Tang
Ping Tak Peter Tang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11868304
Abstract: In an embodiment, an example computer-implemented method for configuring a hardware accelerator to perform a non-linear function involves: determining a plurality of intervals that partition an input domain of the non-linear function; determining a plurality of subinterval configurations corresponding to different numbers of subintervals for partitioning that interval; generating an error set comprising an error for using a polynomial function to approximate the non-linear function within one or more corresponding subintervals specified by the subinterval configuration; using the error set and resource constraints, selecting one of the subinterval configurations for each of the intervals to generate a configuration set that minimizes a worst-case error across the intervals; selecting one of the subinterval configurations for each of the intervals to generate an improved configuration set that minimizes a cumulative error across the intervals without exceeding the worst-case error; and configuring the hardware
Type: Grant
Filed: September 20, 2021
Date of Patent: January 9, 2024
Assignee: Meta Platforms, Inc.
Inventors: Ping Tak Peter Tang, Nimit Singhania
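The two-stage selection described in this abstract can be illustrated with a small brute-force sketch. It is not the patented method: the error table `errors`, the per-configuration cost list `costs`, the single `budget` constraint, and the function name `select_configs` are all assumptions made for illustration, and exhaustive search is only practical for tiny inputs.

```python
from itertools import product

def select_configs(errors, costs, budget):
    """Pick one subinterval configuration per interval.

    errors[i][c] -- assumed approximation error of interval i under config c
    costs[c]     -- assumed resource cost (e.g. table entries) of config c
    budget       -- assumed total resource limit
    """
    n_intervals = len(errors)
    # Enumerate every assignment that fits the resource budget.
    feasible = [
        combo
        for combo in product(range(len(costs)), repeat=n_intervals)
        if sum(costs[c] for c in combo) <= budget
    ]

    def worst(combo):
        return max(errors[i][c] for i, c in enumerate(combo))

    def total(combo):
        return sum(errors[i][c] for i, c in enumerate(combo))

    # Stage 1: smallest achievable worst-case error.
    bound = min(worst(combo) for combo in feasible)
    # Stage 2: among assignments that respect that bound, minimize the sum.
    best = min((c for c in feasible if worst(c) <= bound), key=total)
    return best, bound

errors = [[0.4, 0.1], [0.1, 0.05], [0.1, 0.01]]
best, bound = select_configs(errors, costs=[1, 2], budget=5)
```

In this toy input, three assignments reach the same worst-case error, and the second stage picks the one with the lowest cumulative error.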
-
Patent number: 11264120
Abstract: A method for managing flow of particles into an array of pairwise-point-interaction-modules includes receiving a first set of particles into a first queue. The first set is a proper subset of a second set of particles that comprises all particles that are to be passed into an array of pairwise-point-interaction-modules during a current time period. Prior to having received all particles from the second set, particles from the first set are allowed to pass from the first queue into the array.
Type: Grant
Filed: September 10, 2019
Date of Patent: March 1, 2022
Assignee: D. E. Shaw Research, LLC
Inventors: Ping Tak Peter Tang, J. P. Grossman, Brannon Batson, Ron Dror
-
Patent number: 11139049
Abstract: A method comprising causing a simulation machine for molecular dynamic simulation to determine that a topological distance that separates two particles is less than a threshold. The simulation machine includes nodes connected by a network. The nodes collectively represent a volume, with each node corresponding to a portion of the simulation space. A topological relationship between the nodes corresponds to spatial relationship thereof in the simulation space. The simulation volume is occupied by particles that interact with each other. The two particles are among these particles. The simulation volume includes node boxes, each of which is handled by one of the nodes. Each of the nodes is implemented as an application specific integrated circuit that includes a combination of first and second hardware elements. The first hardware elements are especially designed to perform pairwise interactions. The second hardware elements operate to provide potentially interacting particles to the first hardware elements.
Type: Grant
Filed: November 16, 2015
Date of Patent: October 5, 2021
Assignee: D.E. Shaw Research, LLC
Inventors: Ping Tak Peter Tang, J. P. Grossman, Brannon Batson, Ron Dror
-
Publication number: 20200005904
Abstract: A method for managing flow of particles into an array of pairwise-point-interaction-modules includes receiving a first set of particles into a first queue. The first set is a proper subset of a second set of particles that comprises all particles that are to be passed into an array of pairwise-point-interaction-modules during a current time period. Prior to having received all particles from the second set, particles from the first set are allowed to pass from the first queue into the array.
Type: Application
Filed: September 10, 2019
Publication date: January 2, 2020
Inventors: Ping Tak Peter Tang, J.P. Grossman, Brannon Batson, Ron Dror
-
Patent number: 10445451
Abstract: Systems, methods, and apparatuses relating to a configurable spatial accelerator are described. In one embodiment, a processor includes a plurality of processing elements; and an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements is to perform an operation when an incoming operand set arrives at the plurality of processing elements. At least one of the plurality of processing elements includes a plurality of control inputs.
Type: Grant
Filed: July 1, 2017
Date of Patent: October 15, 2019
Assignee: Intel Corporation
Inventors: Kermin Fleming, Kent D. Glossop, Simon C. Steely, Jr., Ping Tak Peter Tang
-
Publication number: 20190087546
Abstract: A method comprising causing a computer to determine that a topological distance between two particles is less than a threshold.
Type: Application
Filed: November 16, 2015
Publication date: March 21, 2019
Inventors: Ping Tak Peter Tang, J.P. Grossman, Brannon Batson, Ron Dror
-
Publication number: 20190005161
Abstract: Systems, methods, and apparatuses relating to a configurable spatial accelerator are described. In one embodiment, a processor includes a plurality of processing elements; and an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements is to perform an operation when an incoming operand set arrives at the plurality of processing elements. At least one of the plurality of processing elements includes a plurality of control inputs.
Type: Application
Filed: July 1, 2017
Publication date: January 3, 2019
Inventors: Kermin Fleming, Kent D. Glossop, Simon C. Steely, Jr., Ping Tak Peter Tang
-
Patent number: 9292476
Abstract: Fourier transform computation for distributed processing environments is disclosed. Example methods disclosed herein to compute a Fourier transform of an input data sequence include performing first processing on the input data sequence using a plurality of processors, the first processing resulting in an output data sequence having more data elements than the input data sequence. Such example methods also include performing second processing on the output data sequence using the plurality of processors, the output data sequence being permutated among the plurality of processors, each of the processors performing the second processing on a respective permutated portion of the output data sequence to determine a respective, ordered segment of the Fourier transform of the input data sequence.
Type: Grant
Filed: October 10, 2012
Date of Patent: March 22, 2016
Assignee: Intel Corporation
Inventors: Ping Tak Peter Tang, Jong Soo Park, Vladimir Petrov
-
Patent number: 8838663
Abstract: A new function for calculating the reciprocal residual of a floating-point number X is defined as recip_residual(X)=1−X*recip(X), where recip(X) represents the reciprocal of X. The function may be implemented using a fused multiply-add unit in a processor. The reciprocal value of X, recip(X), may be obtained from a lookup table. The recip_residual function may help reduce the latency of many multiplicative functions that are based on products of multiple numbers and can be expressed in simple terms of functions on each individual number (e.g., log(U*V)=log(U)+log(V)).
Type: Grant
Filed: March 30, 2007
Date of Patent: September 16, 2014
Assignee: Intel Corporation
Inventors: Ping Tak Peter Tang, Robert Cavin
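As a rough software illustration of the definition recip_residual(X) = 1 − X*recip(X), the sketch below emulates the single rounding of a fused multiply-add with exact rational arithmetic. The helper `approx_recip` and its 8-bit precision are assumptions standing in for the hardware lookup table; neither is taken from the patent.

```python
import math
from fractions import Fraction

def approx_recip(x, bits=8):
    """Hypothetical stand-in for a reciprocal lookup table (about `bits` bits)."""
    m, e = math.frexp(x)                      # x = m * 2**e with 0.5 <= |m| < 1
    r = round((1.0 / m) * (1 << bits)) / (1 << bits)
    return math.ldexp(r, -e)

def recip_residual(x, recip=approx_recip):
    """Return 1 - x*recip(x), rounded once, as a fused multiply-add would."""
    r = recip(x)
    exact = 1 - Fraction(x) * Fraction(r)     # exact, no intermediate rounding
    return float(exact)                       # single final rounding

# Example use for a logarithm: since x * recip(x) = 1 - residual,
# log(x) = -log(recip(x)) + log1p(-residual).
x = 3.0
log_x = -math.log(approx_recip(x)) + math.log1p(-recip_residual(x))
```

The residual is small because the table value is close to 1/X, which is what makes a short polynomial in the residual sufficient for functions such as the logarithm.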
-
Publication number: 20140250161
Abstract: Embodiments of techniques and systems for approximating a function are described. In embodiments, a computing device may receive one or more statistical properties associated with application of an approximation function of a function over a target domain. The computing device may formulate one or more constraints on one or more parameters of a functional form of the approximation function, based at least in part on the one or more statistical properties. The computing device may then determine the one or more parameters subject to the constraints and output results of the determination. In embodiments, the one or more parameters may be determined through application of an optimization procedure. Other embodiments may be described and claimed.
Type: Application
Filed: March 28, 2012
Publication date: September 4, 2014
Inventor: Ping Tak Peter Tang
-
Publication number: 20140101219
Abstract: Fourier transform computation for distributed processing environments is disclosed. Example methods disclosed herein to compute a Fourier transform of an input data sequence include performing first processing on the input data sequence using a plurality of processors, the first processing resulting in an output data sequence having more data elements than the input data sequence. Such example methods also include performing second processing on the output data sequence using the plurality of processors, the output data sequence being permutated among the plurality of processors, each of the processors performing the second processing on a respective permutated portion of the output data sequence to determine a respective, ordered segment of the Fourier transform of the input data sequence.
Type: Application
Filed: October 10, 2012
Publication date: April 10, 2014
Inventors: Ping Tak Peter Tang, Jong Soo Park, Vladimir Petrov
-
Patent number: 7747669
Abstract: Methods and apparatus to provide rounding of a binary integer are described. In one embodiment, a value that indicates whether a divisor divides a binary integer is extracted from a product of the binary integer and a scaled approximate reciprocal of the divisor.
Type: Grant
Filed: March 31, 2006
Date of Patent: June 29, 2010
Assignee: Intel Corporation
Inventors: Ping Tak (Peter) Tang, John R. Harrison
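The idea of reading a divisibility indicator out of a product with a scaled approximate reciprocal can be sketched as follows. This is one well-known way to do it, not necessarily the construction claimed in the patent; the scaling exponent `k`, the multiplier `m`, and the function name `make_divisibility_test` are assumptions chosen so that the test is exact for all inputs below 2**n_bits.

```python
def make_divisibility_test(d, n_bits):
    """Build a test 'does d divide n?' for 0 <= n < 2**n_bits.

    Assumed scaling: k = n_bits + 2 * bit_length(d), which makes the scaled
    reciprocal m = ceil(2**k / d) accurate enough for the comparison below.
    """
    k = n_bits + 2 * d.bit_length()
    m = -(-(1 << k) // d)          # ceil(2**k / d), the scaled reciprocal
    mask = (1 << k) - 1

    def divides(n):
        # The low k bits of n*m act as a scaled remainder: they are
        # smaller than m exactly when n is a multiple of d.
        return ((n * m) & mask) < m

    return divides

is_multiple_of_10 = make_divisibility_test(10, n_bits=8)
```

The multiplication replaces a division, and the comparison on the low bits yields the indicator without computing the remainder explicitly.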
-
Publication number: 20080243985
Abstract: A new function for calculating the reciprocal residual of a floating-point number X is defined as recip_residual(X)=1−X*recip(X), where recip(X) represents the reciprocal of X. The function may be implemented using a fused multiply-add unit in a processor. The reciprocal value of X, recip(X), may be obtained from a lookup table. The recip_residual function may help reduce the latency of many multiplicative functions that are based on products of multiple numbers and can be expressed in simple terms of functions on each individual number (e.g., log(U*V)=log(U)+log(V)).
Type: Application
Filed: March 30, 2007
Publication date: October 2, 2008
Inventors: Ping Tak Peter Tang, Robert Cavin
-
Patent number: 7366748
Abstract: There is disclosed a method, software and apparatus for evaluating a function f in a computing device using a reduction, core approximation and final reconstruction stage. According to one embodiment of the invention, an argument reduction stage uses an approximate reciprocal table in the computing device. According to another embodiment, an approximate reciprocal instruction I is operative on the computing device to use the approximate reciprocal table such that the argument reduction stage provides that C:=I(X) and R:=X×C−1, the core approximation stage provides that p(R) is computed so that it approximates f(1+R), and the final reconstruction stage provides that T=f(1/C) is fetched and calculated if necessary, and f(X) is reconstructed based on f(X)=f([1/C]×[X×C])=g(f(1/C), f(1+R)).
Type: Grant
Filed: June 30, 2000
Date of Patent: April 29, 2008
Assignee: Intel Corporation
Inventors: Ping Tak Peter Tang, John Harrison, Theodore Kubaska
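The three stages in this abstract can be sketched for the particular case f = log, where g(a, b) = a + b. Everything concrete here is an assumption for illustration: `approx_recip` stands in for the approximate reciprocal instruction I, its 8-bit precision is arbitrary, the polynomial is a plain truncated Taylor series, and `math.log` stands in for the table fetch of T.

```python
import math

def approx_recip(x, bits=8):
    """Hypothetical stand-in for the approximate reciprocal instruction I."""
    m, e = math.frexp(x)                      # x = m * 2**e with 0.5 <= m < 1
    r = round((1.0 / m) * (1 << bits)) / (1 << bits)
    return math.ldexp(r, -e)

def log_via_reduction(x):
    # Argument reduction: C := I(X), R := X*C - 1, so |R| is small.
    c = approx_recip(x)
    r = x * c - 1.0
    # Core approximation: p(R) ~ log(1 + R), truncated Taylor series in
    # Horner form (R - R^2/2 + R^3/3 - R^4/4 + R^5/5 - R^6/6).
    p = r * (1.0 + r * (-1/2 + r * (1/3 + r * (-1/4 + r * (1/5 + r * (-1/6))))))
    # Final reconstruction: T = f(1/C) = -log(C); math.log replaces a table.
    t = -math.log(c)
    return t + p                              # g(T, p) = T + p for f = log
```

Because C has only a few significant bits, a real implementation could index a small table with it; the sketch computes T directly to stay short.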
-
Patent number: 7013320
Abstract: An apparatus and method for creating lookup tables of approximate floating-point quotients which exactly represent the underlying value, within the range of the specified precision. The lookup tables are created without any extraneous data beyond what is needed and also without sacrificing numerical accuracy, and may be created for any radix.
Type: Grant
Filed: January 25, 2002
Date of Patent: March 14, 2006
Assignee: Intel Corporation
Inventor: Ping Tak Peter Tang
-
Patent number: 6792443
Abstract: Apparatus and methods are provided for an improved on-the-fly rounding technique for digit-recurrence algorithms, such as division and square root calculations. According to one embodiment, only two forms of an intermediate result of an operation to be performed by a digit-recurrence algorithm are maintained. A first form is maintained in a first register and a second form is maintained in a second register. Responsive to receiving digits 1 to L−2 of the intermediate result from a digit recurrence unit, where L represents a number of digits that satisfies a predetermined precision for the operation, both forms of the intermediate result are updated by register swapping or concatenation under the control of load and shift control logic and on-the-fly conversion logic. Then, a rounded result is generated by determining digits d_{L−1} and d_L and appending a rounded last digit to the appropriate form of the intermediate result.
Type: Grant
Filed: June 29, 2001
Date of Patent: September 14, 2004
Assignee: Intel Corporation
Inventor: Ping Tak Peter Tang
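For context, the sketch below shows the classic two-register on-the-fly conversion step that this kind of technique builds on: signed digits are absorbed by keeping both the running value Q and Q − 1, so no carry propagation is needed. It illustrates only the conversion, not the rounding logic of the patent, and the function name, the radix default, and the integer representation are assumptions.

```python
def on_the_fly_convert(digits, radix=4):
    """Convert signed digits (most significant first) to a plain integer.

    Two forms are kept: q (the value so far) and qm (always q - 1).
    Each update appends one non-negative digit to one of the two forms,
    which in hardware is a register load plus a shift.
    """
    q, qm = 0, -1
    for d in digits:
        # Both new values are computed from the old pair simultaneously.
        q, qm = (
            q * radix + d if d >= 0 else qm * radix + (radix + d),
            q * radix + (d - 1) if d > 0 else qm * radix + (radix - 1 + d),
        )
    return q, qm

value, value_minus_one = on_the_fly_convert([1, -2, 3, 0, -1])
```

In every branch the appended digit lies between 0 and radix − 1, which is why the update can be done by concatenation rather than subtraction.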
-
Publication number: 20040015882
Abstract: Various embodiments of a computer-implemented branch-free methodology for approximating a function of an input argument are disclosed. The methodology includes selecting one of a number of breakpoints, such that a reduced argument for the function is less than a predetermined value. An approximate function of the reduced argument is evaluated, including accessing a look-up table based on the selected breakpoint to obtain the value of a term in the approximate function. The look-up table has at least one breakpoint for which the reduced argument can be computed without roundoff error when the input argument is close to a root of the function. The branch-free methodology may be applied to compute transcendental functions such as the exponential, logarithm, and trigonometric functions.
Type: Application
Filed: June 5, 2001
Publication date: January 22, 2004
Inventor: Ping Tak Peter Tang
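A generic table-driven, branch-free evaluation of the exponential gives a feel for the breakpoint-and-table structure described here. It is a sketch under assumptions, not the disclosed method: the 32 breakpoints, the table `TABLE`, the degree-5 Taylor polynomial, and the name `table_exp` are illustrative, and the abstract's property of exact reduction near a root of the function is not modeled (the exponential has no root).

```python
import math

N_BREAK = 32                                   # assumed number of breakpoints
TABLE = [2.0 ** (j / N_BREAK) for j in range(N_BREAK)]
STEP = math.log(2.0) / N_BREAK                 # spacing between breakpoints

def table_exp(x):
    """Approximate exp(x) with no data-dependent branches."""
    n = round(x / STEP)                        # index of nearest breakpoint
    r = x - n * STEP                           # reduced argument, |r| <= STEP/2
    k, j = divmod(n, N_BREAK)                  # n = 32*k + j with 0 <= j < 32
    # Polynomial for exp(r), degree-5 Taylor series in Horner form.
    p = 1.0 + r * (1.0 + r * (0.5 + r * (1/6 + r * (1/24 + r / 120))))
    # exp(x) = 2**k * 2**(j/32) * exp(r); the middle factor is a table lookup.
    return math.ldexp(TABLE[j] * p, k)
```

The same straight-line sequence runs for every input, so the code path does not depend on which breakpoint was selected.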
-
Publication number: 20030145029
Abstract: An apparatus and method for creating lookup tables of approximate floating-point quotients which exactly represent the underlying value, within the range of the specified precision. The lookup tables are created without any extraneous data beyond what is needed and also without sacrificing numerical accuracy, and may be created for any radix.
Type: Application
Filed: January 25, 2002
Publication date: July 31, 2003
Inventor: Ping Tak Peter Tang
-
Patent number: 6598063
Abstract: A method suitable for calculating an expression having the form (A/B)^K by a processor that features separate sets of floating point units which can operate in parallel for greater speed of execution. The processor issues instructions to determine an approximate reciprocal R0 of a first variable B. Further instructions are issued to raise a second variable to the power of a third variable K by a first set of arithmetic units of the processor, where the second variable is a function of the approximate reciprocal R0. Still further instructions are issued to calculate a polynomial q at a fourth variable delta by a second set of arithmetic units of the processor. The fourth variable delta is also a function of the approximate reciprocal R0. Finally, one or more instructions are issued to multiply the calculated polynomial by the second variable, having been raised to the power of the third variable, to yield (A/B)^K.
Type: Grant
Filed: August 14, 2000
Date of Patent: July 22, 2003
Assignee: Intel Corporation
Inventors: Ping Tak Peter Tang, Theodore E. Kubaska
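One plausible reading of this abstract can be sketched numerically. The abstract does not define the second variable or delta, so the choices below are assumptions: the second variable is taken as A×R0, delta as B×R0 − 1, and q as a truncated binomial series for (1 + delta)^(−K), which gives (A/B)^K = (A×R0)^K × q(delta). The helper `approx_recip`, its 8-bit precision, and the restriction to a positive integer K are also illustrative.

```python
import math

def approx_recip(b, bits=8):
    """Hypothetical stand-in for a hardware approximate reciprocal."""
    m, e = math.frexp(b)                      # b = m * 2**e with 0.5 <= |m| < 1
    r = round((1.0 / m) * (1 << bits)) / (1 << bits)
    return math.ldexp(r, -e)

def quotient_power(a, b, k, terms=8):
    """Approximate (a / b) ** k for a positive integer k without dividing a by b."""
    r0 = approx_recip(b)
    y = a * r0                                # assumed "second variable"
    delta = b * r0 - 1.0                      # assumed "fourth variable"
    # These two computations are independent, so separate sets of
    # floating-point units could run them in parallel.
    power = y ** k
    coeffs = [1.0]                            # binomial series of (1+delta)^(-k)
    for i in range(1, terms):
        coeffs.append(coeffs[-1] * -(k + i - 1) / i)
    q = 0.0
    for c in reversed(coeffs):                # Horner evaluation of q(delta)
        q = q * delta + c
    return power * q
```

Since delta is tiny when R0 is a good reciprocal estimate, a handful of series terms is enough for double precision in this sketch.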
-
Publication number: 20030009501
Abstract: Apparatus and methods are provided for an improved on-the-fly rounding technique for digit-recurrence algorithms, such as division and square root calculations. According to one embodiment, only two forms of an intermediate result of an operation to be performed by a digit-recurrence algorithm are maintained. A first form is maintained in a first register and a second form is maintained in a second register. Responsive to receiving digits 1 to L−2 of the intermediate result from a digit recurrence unit, where L represents a number of digits that satisfies a predetermined precision for the operation, both forms of the intermediate result are updated by register swapping or concatenation under the control of load and shift control logic and on-the-fly conversion logic. Then, a rounded result is generated by determining digits d_{L−1} and d_L and appending a rounded last digit to the appropriate form of the intermediate result.
Type: Application
Filed: June 29, 2001
Publication date: January 9, 2003
Inventor: Ping Tak Peter Tang