Abstract: Apparatus and method for a ring oscillator based random number generator with intentional startup delays timed for each ring to provide a uniform initial spreading of the ring oscillator transition edges. This invention adds a controlled incremental delay in the startup of each individual ring within the ring oscillator random number generator. Typically, the delay units used in the ring oscillators themselves can be used to get a coarse delay between the start times of each ring. A subset of the rings starts up with a particular coarse delay and different fine delays such that the transition edges of all the rings are spread out over the oscillation period. This spreading of the transition edges ensures that the output of the random number generator is not a predictable sequence of ones and zeros resulting from a simultaneous startup of all rings.
Type:
Grant
Filed:
January 9, 2014
Date of Patent:
June 7, 2016
Assignee:
The United States of America as represented by the Secretary of the Air Force.
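The startup scheme in the ring oscillator abstract above can be illustrated with a minimal sketch. It assumes each ring's start time is split into a whole number of coarse delay units plus a fine remainder, so that the start times are evenly spaced over one oscillation period. The function name, parameters, and the even-spacing rule are assumptions for illustration, not details taken from the patent.

```python
from fractions import Fraction

def startup_delays(num_rings, period, coarse_unit):
    """Return (coarse_steps, fine_delay) for each ring.

    Hypothetical model: ring k should start at k/num_rings of the period.
    The start time is split into whole coarse units plus a fine remainder.
    """
    delays = []
    for k in range(num_rings):
        target = Fraction(period) * k / num_rings   # desired start time of ring k
        coarse_steps = int(target // coarse_unit)   # whole coarse delay units
        fine = target - coarse_steps * coarse_unit  # leftover fine delay
        delays.append((coarse_steps, fine))
    return delays
```

With 8 rings, a period of 8 time units, and a coarse unit of 2, rings sharing a coarse step differ only in their fine delay, which matches the grouping the abstract describes.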
Abstract: A discrete cosine transform/inverse discrete cosine transform method and device are provided. The discrete cosine transform/inverse discrete cosine transform method includes: generating a table index only for input values other than 0 (zero) among the input values of coordinates in an input block; reading one or more partial values corresponding to the table index out of a plurality of table information pieces which are generated and stored in advance so as to include partial values corresponding to a multiplication of a weight value and an index; and adding the read partial values to calculate the resultant value of each coordinate in an output block. Accordingly, it is possible to perform a fast DCT/IDCT operation and to reduce the energy consumption for the transform.
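The zero-skipping idea in the DCT/IDCT abstract above can be sketched for a one-dimensional 8-point inverse DCT. This is a simplification under stated assumptions: the table here stores only the basis weights, and the product with each non-zero coefficient is computed directly rather than looked up as a stored partial value. The names `BASIS` and `sparse_idct` are hypothetical.

```python
import math

N = 8
# BASIS[u][x]: weight of coefficient u at output position x
# (orthonormal inverse DCT-II), computed once in advance.
BASIS = [
    [(math.sqrt(1 / N) if u == 0 else math.sqrt(2 / N))
     * math.cos((2 * x + 1) * u * math.pi / (2 * N))
     for x in range(N)]
    for u in range(N)
]

def sparse_idct(coeffs):
    """Inverse DCT that touches the table only for non-zero coefficients."""
    out = [0.0] * N
    for u, c in enumerate(coeffs):
        if c == 0:
            continue  # zero inputs contribute nothing, so skip the table access
        row = BASIS[u]
        for x in range(N):
            out[x] += c * row[x]
    return out
```

Quantized blocks in image and video codecs are mostly zeros, which is why skipping zero inputs can save a large share of the work.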
Abstract: Techniques and mechanisms for generating a random number. In an embodiment, a first signal is received from a first cell including a first source follower transistor. Circuit logic detects for a pulse of the first signal and, in response to the pulse, generates a signal indicating detection of a first random telegraph noise event in the first source follower transistor. In another embodiment, a first count update is performed in response to the indicated detection of the first random telegraph noise event. The first count update is one basis for generation of a number corresponding to a plurality of random telegraph noise events.
Abstract: Embodiments of the present invention set forth a technique for optimizing the performance and efficiency of complex, software-based computations, such as lighting computations. Data entering a graphics application programming interface (API) in a conventional arithmetic representation, such as floating-point or fixed-point, is converted to an internal logarithmic representation for greater computational efficiency. Lighting computations are then performed using logarithmic space arithmetic routines that, on average, execute more efficiently than similar routines performed in a native floating-point format. The lighting computation results, represented as logarithmic space numbers, are converted back to floating-point numbers before being transmitted to a graphics processing unit (GPU) for further processing. Because of efficiencies of logarithmic space arithmetic, performance improvements may be realized relative to prior art approaches to performing software-based floating-point operations.
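A minimal sketch of logarithmic-space arithmetic, as described in the abstract above: values are converted to base-2 logarithms, multiplications become additions, powers become multiplications, and results are converted back to floating point. The function names are hypothetical, and the sketch handles positive values only; a real log-space format would need separate handling for zero and sign.

```python
import math

def to_log(x):
    """Encode a positive float as its base-2 logarithm."""
    if x <= 0:
        raise ValueError("this sketch only handles positive values")
    return math.log2(x)

def from_log(lx):
    """Convert a log-space value back to an ordinary float."""
    return 2.0 ** lx

def log_mul(a, b):
    """Multiplication in linear space is addition in log space."""
    return a + b

def log_pow(a, p):
    """Raising to a power in linear space is multiplication in log space."""
    return a * p
```

A specular lighting term such as (n·h)^shininess times an intensity then costs one multiplication and one addition in log space: `log_mul(log_pow(to_log(n_dot_h), shininess), to_log(intensity))`.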
Abstract: A method in a signal processor for filtering samples in a digital signal is provided. An approximate filtered sample is generated as a function of fewer than four samples of the digital signal. A correction is generated as a function of the fewer than four samples, and a filtered sample is generated by modifying the approximate filtered sample with the correction.
Abstract: An exemplary embodiment of the present invention provides a lattice reduction method comprising obtaining a preliminary estimate of a transformation matrix, generating a covariance matrix based on the preliminary estimate of the transformation matrix, reducing diagonal elements of the covariance matrix to generate a unimodular transformation matrix, and using the unimodular transformation matrix to obtain an estimate of an input.
Abstract: A hardware circuit for returning single precision denormal results to double precision. A hardware circuit component configured to count leading zeros of an unrounded single precision denormal result. A hardware circuit component configured to pre-compute a first exponent and a second exponent for the unrounded single precision denormal result. A hardware circuit component configured to perform a second normalization of the rounded single precision denormal result back to architected format.
Type:
Grant
Filed:
November 26, 2013
Date of Patent:
March 15, 2016
Assignee:
International Business Machines Corporation
Inventors:
Maarten J. Boersma, Thomas Fuchs, Markus Kaltenbach, David Lang
Abstract: A hardware circuit for returning single precision denormal results to double precision. A hardware circuit component configured to count leading zeros of an unrounded single precision denormal result. A hardware circuit component configured to pre-compute a first exponent and a second exponent for the unrounded single precision denormal result. A hardware circuit component configured to perform a second normalization of the rounded single precision denormal result back to architected format.
Type:
Grant
Filed:
January 9, 2014
Date of Patent:
March 8, 2016
Assignee:
International Business Machines Corporation
Inventors:
Maarten J. Boersma, Thomas Fuchs, Markus Kaltenbach, David Lang
Abstract: An embodiment of a method and a related apparatus for digital computation of a floating point complex multiply-add is provided. The method includes receiving an input addend, a first product, and a second product. The input addend, the first product and the second product each respectively has a mantissa and an exponent. The method includes shifting the mantissas of the two with smaller exponents of the input addend, the first product, and the second product to align together with the mantissa of the one with largest exponent of the input addend, the first product and the second product, and adding the aligned input addend, the aligned first product and the aligned second product.
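The alignment step in the complex multiply-add abstract above can be sketched with integer mantissas, where each term has the value mantissa * 2**exponent. The two terms with smaller exponents are right-shifted to line up with the largest exponent before the three are added. This sketch truncates shifted-out bits; real hardware would typically keep guard bits, a detail the abstract does not cover. The function name is hypothetical.

```python
def aligned_sum(terms):
    """Add (mantissa, exponent) terms after aligning to the largest exponent.

    Each term represents mantissa * 2**exponent. Returns (mantissa, exponent)
    of the sum, with shifted-out low bits truncated.
    """
    max_exp = max(e for _, e in terms)
    total = 0
    for m, e in terms:
        total += m >> (max_exp - e)  # right-shift the smaller-exponent mantissas
    return total, max_exp
```

For an addend 8 * 2**0 and products 2 * 2**2 and 1 * 2**3, all three align to exponent 3 and sum to 3 * 2**3 = 24.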
Abstract: Techniques are disclosed relating to type conversion using a floating-point unit. In one embodiment, to convert a floating-point value to a normalized integer format, a floating-point unit is configured to perform an operation to generate a result having a significand portion and an exponent portion, where the operation includes multiplying the floating-point value by a constant. In one embodiment, the apparatus is further configured to add a value to the exponent portion of the result, and set a rounding mode to round to nearest. The constant may be a greatest value less than one that can be represented using the particular number of unsigned bits. The value added to the initial exponent may be equal to the number of unsigned bits of the normalized integer format. The apparatus may perform this conversion in response to a pack instruction.
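The conversion in the abstract above can be sketched in software. The input is multiplied by the greatest value below one representable in the given number of unsigned bits, that number of bits is added to the exponent of the result, and the result is rounded to nearest. The function name is hypothetical, and the sketch assumes the input already lies in [0, 1].

```python
import math

def float_to_unorm(x, bits):
    """Convert x in [0, 1] to an unsigned normalized integer of `bits` bits."""
    # Greatest value below one representable in `bits` unsigned bits.
    scale = (2**bits - 1) / 2**bits
    mantissa, exponent = math.frexp(x * scale)
    # Adding `bits` to the exponent multiplies the product by 2**bits.
    scaled = math.ldexp(mantissa, exponent + bits)
    # Python's round() rounds to nearest, with ties going to even.
    return round(scaled)
```

The two steps together multiply by 2**bits - 1, so 1.0 maps to 255 for an 8-bit format, without a separate multiply by 255.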
Abstract: A method of generating a signal having a converted sampling rate in a communication system is provided. The method includes selecting effective input samples among S number of input samples included in an input stream corresponding to an input sampling rate, generating a filter coefficient set including filter coefficients having a length of a second tap, the filter coefficients having the length of the second tap being generated by dividing a filter coefficient having a length of a first tap configuring a low-pass filter into the filter coefficients having the length of the second tap corresponding to a number of selected effective input samples, selecting filter coefficients corresponding to each of the effective input samples among the filter coefficients included in the filter coefficient set, and outputting output samples having an output sampling rate which is converted from the input sampling rate.
Type:
Grant
Filed:
February 8, 2012
Date of Patent:
January 19, 2016
Assignee:
Samsung Electronics Co., Ltd.
Inventors:
Joo-Hyun Lee, Sung-Kwon Jo, Ha-Young Yang
Abstract: A system includes a decimation module having an adjustable decimation rate and a filter module responsive to the decimation module. A digital phase lock loop is operable to control a decimation rate of the decimation module. The decimation module is a cascade integrator comb decimation module.
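A cascaded integrator-comb (CIC) decimator with an adjustable rate, as named in the abstract above, can be sketched as follows. The sketch assumes a differential delay of one sample in the comb sections, and the function name and parameters are hypothetical. In the described system, a digital phase lock loop would supply the `rate` value; here it is a plain argument.

```python
def cic_decimate(samples, rate, stages=1):
    """Decimate `samples` by `rate` using a CIC filter with `stages` sections."""
    data = list(samples)
    # Integrator sections run at the input sample rate.
    for _ in range(stages):
        acc = 0
        integrated = []
        for s in data:
            acc += s
            integrated.append(acc)
        data = integrated
    # Keep every `rate`-th sample.
    data = data[rate - 1::rate]
    # Comb sections run at the reduced output rate (differential delay of 1).
    for _ in range(stages):
        prev = 0
        combed = []
        for s in data:
            combed.append(s - prev)
            prev = s
        data = combed
    return data
```

For a constant input the steady-state output equals rate**stages, which is the DC gain of this filter structure.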
Abstract: Systems and methods for matching data based on numeric difference are described herein. Input data elements are parsed to identify a first number and a second number. A difference between the first number and the second number is calculated based on a predefined formula. Based on the difference, a matching score between the input data elements is evaluated. The matching score is proportional to a base matching score corresponding to a threshold difference, and a maximum score corresponding to a match between the first number and the second number. A similarity between the input data elements is reported based on the evaluated matching score.
Type:
Grant
Filed:
December 21, 2010
Date of Patent:
January 5, 2016
Assignee:
Business Objects Software Limited
Inventors:
Jeffrey Woody, Abhiram Gujjewar, Mark Spiess
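One way to read the scoring rule in the abstract above is as a linear interpolation between a maximum score at an exact match and a base score at the threshold difference. The sketch below assumes that reading, a relative-difference formula, a score of zero beyond the threshold, and illustrative default values; none of these specifics come from the patent, and the function name is hypothetical.

```python
def numeric_match_score(a, b, threshold=0.1, base_score=60.0, max_score=100.0):
    """Score the similarity of two numbers on an assumed linear scale."""
    if a == b:
        return max_score
    # Assumed difference formula: relative difference of the two numbers.
    diff = abs(a - b) / max(abs(a), abs(b))
    if diff > threshold:
        return 0.0  # assumed: no match beyond the threshold difference
    # Interpolate from max_score (diff = 0) down to base_score (diff = threshold).
    return max_score - (max_score - base_score) * (diff / threshold)
```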
Abstract: A multi-stage adaptive filter is disclosed, which exhibits a smaller mean square error than in prior art adaptive filters. The adaptive filter selectively manipulates the weights, in multiple stages, so as to achieve a global minimum of the error function, such that the filtered signal has as small an error as possible.
Abstract: A floating point execution unit is capable of selectively repurposing a subset of the significand bits in a floating point value for use as additional exponent bits to dynamically provide an extended range for floating point calculations. A significand field of a floating point operand may be considered to include first and second portions, with the first portion capable of being concatenated with the second portion to represent the significand for a floating point value, or, to provide an extended range, being concatenated with the exponent field of the floating point operand to represent the exponent for a floating point value.
Type:
Grant
Filed:
March 11, 2013
Date of Patent:
December 29, 2015
Assignee:
International Business Machines Corporation
Inventors:
Mark J. Hickey, Adam J. Muff, Matthew R. Tubbs, Charles D. Wait
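The repurposing of significand bits described in the abstract above can be illustrated with a toy 16-bit format: 1 sign bit, 5 exponent bits, and a 10-bit significand field split into a 2-bit first portion and an 8-bit second portion. The bit widths, the biases (15 and 63), and placing the first portion as the low bits of the extended exponent are all assumptions for illustration; the sketch also ignores subnormals, infinities, and NaNs.

```python
def decode(bits, extended=False):
    """Decode a toy 16-bit float in normal or extended-range mode."""
    sign = -1.0 if (bits >> 15) & 1 else 1.0
    exp_field = (bits >> 10) & 0x1F
    first = (bits >> 8) & 0x3   # first portion of the significand field
    second = bits & 0xFF        # second portion of the significand field
    if extended:
        # First portion is concatenated with the exponent field: 7-bit exponent.
        exponent = ((exp_field << 2) | first) - 63
        fraction = second / 2**8
    else:
        # First portion is concatenated with the second: 10-bit significand.
        exponent = exp_field - 15
        fraction = ((first << 8) | second) / 2**10
    return sign * (1.0 + fraction) * 2.0**exponent
```

The same bit pattern therefore trades two bits of precision for a much wider exponent range when the extended mode is selected.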
Abstract: A method and structure for an in-place transformation of matrix data. For a matrix A stored in one of a standard full format or a packed format and a transformation T having a compact representation, blocking parameters MB and NB are chosen, based on a cache size. A sub-matrix A1 of A, A1 having size M1=m*MB by N1=n*NB, is worked on, and any residual remainder of A is saved in a buffer B. Sub-matrix A1 is worked on by contiguously moving and contiguously transforming A1 in-place into a New Data Structure (NDS), applying the transformation T in units of MB*NB contiguous double words to the NDS format of A1, thereby replacing A1 with the contents of T(A1), and moving and transforming NDS T(A1) to standard data format T(A1) with holes for the remainder of A in buffer B. The contents of buffer B are contiguously copied into the holes of A2, thereby providing in-place transformed matrix T(A).
Type:
Grant
Filed:
September 1, 2007
Date of Patent:
December 15, 2015
Assignee:
International Business Machines Corporation
Inventors:
Fred Gehrung Gustavson, John A. Gunnels, James C. Sexton
Abstract: The disclosure provides a device capable of processing a Fast Fourier Transform (FFT) radix-2 butterfly operation and an operation method thereof. The device includes at least a latch, a complex multiplier, a complex adder-subtractor, a switch and a complex conjugate Arithmetic Logical Unit (ALU). The complex operation unit of the disclosure has a simple structure. A parallel processing array composed of such complex operation units is capable of efficiently processing vectors and the FFT operation.
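The radix-2 butterfly named in the abstract above takes two complex inputs and a twiddle factor and produces a sum and a difference, using one complex multiplication and one complex add-subtract. The sketch below shows the butterfly and a textbook recursive decimation-in-time FFT built from it. It illustrates the arithmetic only, not the device's latch, switch, or ALU arrangement, and the function names are hypothetical.

```python
import cmath

def butterfly(a, b, w):
    """Radix-2 butterfly: one complex multiply, one add, one subtract."""
    t = w * b
    return a + t, a - t

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(-2j * cmath.pi * k / n)  # twiddle factor
        out[k], out[k + n // 2] = butterfly(even[k], odd[k], w)
    return out
```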
Abstract: A true random number generator, a method of generating a true random number and a system incorporating the generator or the method. In one embodiment, the generator includes: (1) a ring oscillator including inverting gates having power inputs and (2) a time-varying power supply coupled to the power inputs to provide power thereto and including power perturbation circuitry operable to perturb the power provided to at least one of the power inputs.
Abstract: A mechanism is provided for a circuit for generation of a random output. A bistable circuit has two stable states as an output and a clock signal as an input. The bistable circuit includes a first logic circuit and a second logic circuit cross-coupled together, which transition into a metastable state before resolving to one of the two stable states. The second logic circuit resolves to a stable state at a resolution time. A digitization circuit is configured to generate random bits corresponding to a variance of the resolution time of the second logic circuit resolving from the metastable state to the stable state for cycles of the clock signal. The resolution time randomly varies according to noise. An actual value of the stable state is eliminated as a factor in generating the random bits.
Abstract: Systems and methods for using carry-less multiplication (CLMUL) to implement erasure code are provided. An embodiment method of using CLMUL to implement erasure code includes initiating, with a processor, a first CLMUL call to calculate a first product of a data bit word and a constant, partitioning, with the processor, the first product into a high portion and a low portion, and initiating, with the processor, a second CLMUL call to calculate a second product of the high portion and a hexadecimal number portion, a bit size of the second product less than a bit size of the first product. The second product, or a third product generated by a third CLMUL call, is used to calculate a parity bit. Because the second product or the third product has a number of bits equivalent to the number of bits used by the processor, the erasure codes are more efficiently implemented.
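The sequence of CLMUL calls in the abstract above resembles a folding reduction for finite-field multiplication. The sketch below assumes the field GF(2^8) with the polynomial 0x11D, which is common in Reed-Solomon style erasure codes but is not named in the abstract. A software `clmul` stands in for the hardware instruction, and both function names are hypothetical.

```python
def clmul(a, b):
    """Carry-less multiply: shift-and-XOR instead of shift-and-add."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result

def gf256_mul(data, constant):
    """Multiply in GF(2^8) mod 0x11D using three carry-less multiplies."""
    first = clmul(data, constant)       # up to 15 bits
    high, low = first >> 8, first & 0xFF
    # x^8 is congruent to 0x1D modulo 0x11D, so fold the high portion back in.
    second = clmul(high, 0x1D)          # up to 11 bits, smaller than `first`
    third = clmul(second >> 8, 0x1D)    # fits in 8 bits, no further folding
    return low ^ (second & 0xFF) ^ third
```

Each fold shrinks the product, so after the third call every term fits in one byte and a single XOR yields the reduced result.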