Patents by Inventor Gary S. Goldman

Gary S. Goldman has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240143988
    Abstract: Dynamic data quantization may be applied to minimize the power consumption of a system that implements a convolutional neural network (CNN). Under such a quantization scheme, a quantized representation of a 3×3 array of m-bit activation values may include 9 n-bit mantissa values and one exponent shared between the n-bit mantissa values (n<m); and a quantized representation of a 3×3 kernel with p-bit parameter values may include 9 q-bit mantissa values and one exponent shared between the q-bit mantissa values (q<p). Convolution of the kernel with the activation data may include computing a dot product of the 9 n-bit mantissa values with the 9 q-bit mantissa values, and summing the shared exponents. In a CNN with multiple kernels, multiple computing units (each corresponding to one of the kernels) may receive the quantized representation of the 3×3 array of m-bit activation values from the same quantization-alignment module.
    Type: Application
    Filed: January 11, 2024
    Publication date: May 2, 2024
    Inventors: Jian hui Huang, James Michael Bodwin, Pradeep R. Joginipally, Shabarivas Abhiram, Gary S. Goldman, Martin Stefan Patz, Eugene M. Feinberg, Berend Ozceri
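The shared-exponent scheme in the abstract above can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the choice of signed integer mantissas, and truncation by arithmetic right shift are all assumptions made for illustration.

```python
def quantize_block(values, n_bits):
    """Quantize integers to n-bit signed mantissas that share one exponent.

    Assumption: the shared exponent is the smallest right shift that makes
    the largest magnitude fit in an n-bit signed mantissa.
    """
    limit = (1 << (n_bits - 1)) - 1  # largest positive n-bit signed value
    max_mag = max(abs(v) for v in values)
    exponent = 0
    while (max_mag >> exponent) > limit:
        exponent += 1
    # Arithmetic right shift: low-order bits are dropped (rounds toward -inf).
    mantissas = [v >> exponent for v in values]
    return mantissas, exponent


def quantized_dot(activations, kernel):
    """Dot product of the mantissas, plus the sum of the shared exponents.

    The represented value is approximately dot * 2 ** exponent.
    """
    (a_mant, a_exp), (k_mant, k_exp) = activations, kernel
    dot = sum(a * k for a, k in zip(a_mant, k_mant))
    return dot, a_exp + k_exp


# Nine activation values (a flattened 3x3 array) and nine kernel weights.
acts = [1000, -2000, 300, 40, 5, 600, -700, 80, 9]
kern = [3, -1, 4, 1, -5, 9, 2, -6, 5]
q_acts = quantize_block(acts, n_bits=8)
q_kern = quantize_block(kern, n_bits=4)
dot, exp = quantized_dot(q_acts, q_kern)
approx = dot << exp  # approximation of the exact dot product of acts and kern
```

Because the exponent is shared, the nine multiplications operate only on narrow mantissas, and the exponents are combined with a single addition per dot product.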
  • Patent number: 11915126
    Abstract: Dynamic data quantization may be applied to minimize the power consumption of a system that implements a convolutional neural network (CNN). Under such a quantization scheme, a quantized representation of a 3×3 array of m-bit activation values may include 9 n-bit mantissa values and one exponent shared between the n-bit mantissa values (n<m); and a quantized representation of a 3×3 kernel with p-bit parameter values may include 9 q-bit mantissa values and one exponent shared between the q-bit mantissa values (q<p). Convolution of the kernel with the activation data may include computing a dot product of the 9 n-bit mantissa values with the 9 q-bit mantissa values, and summing the shared exponents. In a CNN with multiple kernels, multiple computing units (each corresponding to one of the kernels) may receive the quantized representation of the 3×3 array of m-bit activation values from the same quantization-alignment module.
    Type: Grant
    Filed: September 4, 2020
    Date of Patent: February 27, 2024
    Assignee: Recogni Inc.
    Inventors: Jian hui Huang, James Michael Bodwin, Pradeep R. Joginipally, Shabarivas Abhiram, Gary S. Goldman, Martin Stefan Patz, Eugene M. Feinberg, Berend Ozceri
  • Publication number: 20240053919
    Abstract: A memory system comprises a plurality of memory sub-systems, each with a memory bank and other circuit components. For each of the memory sub-systems, a first buffer receives and stores a read-modify-write request (with a read address, a write address and a first operand), a second operand is read from the memory bank at the location specified by the read address, a combiner circuit combines the first operand with the second operand, an activation circuit transforms the output of the combiner circuit, and the output of the activation circuit is stored in the memory bank at the location specified by the write address. The first operand and the write address may be stored in a second buffer while the second operand is read from the memory bank. Further, the output of the activation circuit may be first stored in the first buffer before being stored in the memory bank.
    Type: Application
    Filed: March 13, 2023
    Publication date: February 15, 2024
    Inventors: Gary S. Goldman, Ashwin Radhakrishnan
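As a rough software model of the read-modify-write flow in the abstract above, consider the following sketch. The buffers and timing are omitted, and the choice of addition as the combiner and ReLU as the activation are assumptions; the abstract does not name specific functions.

```python
def relu(x):
    """Example activation (an assumption, not named in the abstract)."""
    return x if x > 0 else 0


def read_modify_write(bank, read_addr, write_addr, first_operand,
                      combine=lambda a, b: a + b, activation=relu):
    """Model one read-modify-write request against a single memory bank."""
    second_operand = bank[read_addr]                    # read from the bank
    combined = combine(first_operand, second_operand)   # combiner circuit
    result = activation(combined)                       # activation circuit
    bank[write_addr] = result                           # write back
    return result


bank = [0, -5, 7, 0]
read_modify_write(bank, read_addr=1, write_addr=3, first_operand=2)
read_modify_write(bank, read_addr=2, write_addr=0, first_operand=3)
```

In a system with several memory sub-systems, each bank would handle its own requests with its own copy of this logic.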
  • Publication number: 20230401433
    Abstract: In a low power hardware architecture for handling accumulation overflows in a convolver unit, an accumulator of the convolver unit computes a running total by successively summing dot products from a dot product computation module during an accumulation cycle. In response to the running total overflowing the maximum or minimum value of a data storage element, the accumulator transmits an overflow indicator to a controller and sets its output equal to a positive or negative overflow value. In turn, the controller disables the dot product computation module by clock gating, clamping one of its inputs to zero and/or holding its inputs to constant values. At the end of the accumulation cycle, the output of the accumulator is sampled. In response to a clear signal being asserted, the dot product computation module is enabled, and the running total is set to zero for the start of the next accumulation cycle.
    Type: Application
    Filed: June 9, 2022
    Publication date: December 14, 2023
    Inventors: Shabarivas Abhiram, Gary S. Goldman, Jian hui Huang, Eugene M. Feinberg
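The overflow handling in the abstract above can be modeled in software as a saturating accumulator that stops consuming dot products once it overflows. This is a behavioral sketch only: the 16-bit width and the function name are assumptions, and clock gating is represented simply by leaving the loop early.

```python
def accumulate(dot_products, bits=16):
    """Sum dot products for one accumulation cycle with latched saturation.

    Returns (output, consumed), where consumed is the number of dot products
    that were used before the dot product stage was disabled.
    """
    hi = (1 << (bits - 1)) - 1   # positive overflow value
    lo = -(1 << (bits - 1))      # negative overflow value
    total = 0                    # running total starts at zero (clear signal)
    consumed = 0
    for dp in dot_products:
        consumed += 1
        total += dp
        if total > hi:
            total = hi
            break                # overflow: remaining dot products are skipped
        if total < lo:
            total = lo
            break
    return total, consumed


positive = accumulate([30000, 5000, -40000])
negative = accumulate([-30000, -5000, 100])
normal = accumulate([1, 2, 3])
```

Once the overflow value is latched, later terms cannot bring the output back into range, which is why the remaining dot products need not be computed until the next cycle.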
  • Patent number: 11762946
    Abstract: Convolution with a 5×5 kernel involves computing the dot product of a 5×5 data block with a 5×5 kernel. Instead of computing this dot product as a single sum of 25 products, the dot product is computed as a sum of four partial sums, where each partial sum is computed as a dot product of a 3×3 data block with a 3×3 kernel. The four partial sums may be computed by a single 3×3 convolver unit over four time periods. During each time period, at least some of the weights received by the 3×3 convolver unit may correspond to a quadrant of weights from the 5×5 kernel. A shifter circuit provides shifted columns (left or right shifted) of the input data to the 3×3 convolver unit, allowing the 3×3 convolver unit access to the 3×3 data block that spatially corresponds to a particular quadrant of weights from the 5×5 kernel.
    Type: Grant
    Filed: September 23, 2022
    Date of Patent: September 19, 2023
    Assignee: Recogni Inc.
    Inventors: Gary S. Goldman, Shabarivas Abhiram
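The decomposition in the abstract above can be checked numerically. The sketch below zero-pads both the 5×5 data block and the 5×5 kernel to 6×6 and sums four 3×3 dot products, one per quadrant. The zero-padding approach and all function names are assumptions for illustration; the shifter circuit and hardware timing are not modeled.

```python
def dot3x3(block, kernel):
    """Dot product of two 3x3 matrices (what one 3x3 convolver unit computes)."""
    return sum(block[r][c] * kernel[r][c] for r in range(3) for c in range(3))


def pad6(matrix):
    """Zero-pad a 5x5 matrix to 6x6 by adding one column and one row of zeros."""
    return [row + [0] for row in matrix] + [[0] * 6]


def quadrant(matrix, r0, c0):
    """Extract the 3x3 sub-matrix whose top-left corner is (r0, c0)."""
    return [row[c0:c0 + 3] for row in matrix[r0:r0 + 3]]


def dot5x5_via_3x3(data, kernel):
    """Compute a 5x5 dot product as the sum of four 3x3 partial sums."""
    d, k = pad6(data), pad6(kernel)
    # Each (r0, c0) pair stands for one of the four time periods.
    return sum(dot3x3(quadrant(d, r0, c0), quadrant(k, r0, c0))
               for r0 in (0, 3) for c0 in (0, 3))


data = [[r * 5 + c for c in range(5)] for r in range(5)]
kernel = [[r - 2 * c for c in range(5)] for r in range(5)]
result = dot5x5_via_3x3(data, kernel)
```

The padded zeros contribute nothing to the sum, so the four partial sums together cover exactly the 25 products of the original 5×5 dot product.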
  • Patent number: 11645355
    Abstract: A system for evaluating a piecewise linear function includes a first look-up table with N entries, and a second look-up table with M entries, with M being less than N. Each of the N entries contains parameters that define a corresponding linear segment of the piecewise linear function. The system further includes a controller configured to store a subset of the N entries from the first look-up table in the second look-up table. The system further includes a classifier for receiving an input value and classifying the input value in one of a plurality of segments of a number line. A total number of the segments is equal to M, and the segments are non-overlapping and contiguous. The system further includes a multiplexor for selecting one of the M entries of the second look-up table based on the classification of the input value into one of the plurality of segments.
    Type: Grant
    Filed: December 30, 2022
    Date of Patent: May 9, 2023
    Assignee: Recogni Inc.
    Inventors: Gilles J. C. A. Backhus, Gary S. Goldman
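A small software analogue of the two-table arrangement in the abstract above is sketched here. The table contents, the `(slope, intercept)` entry layout, and the use of `bisect` as the classifier are assumptions chosen for illustration.

```python
import bisect

# First look-up table with N = 4 entries; each entry is (slope, intercept).
FULL_TABLE = [(0, 0), (1, 0), (0, 6), (2, 1)]


def load_active(full_table, indices):
    """Controller: copy a subset of the N entries into the smaller table."""
    return [full_table[i] for i in indices]


def evaluate(x, boundaries, active_table):
    """Evaluate the piecewise linear function at x.

    boundaries holds M - 1 sorted breakpoints, which split the number line
    into M contiguous, non-overlapping segments.
    """
    segment = bisect.bisect_right(boundaries, x)   # classifier
    slope, intercept = active_table[segment]       # multiplexer selection
    return slope * x + intercept


# M = 3 active entries describing a clamp between 0 and 6.
active = load_active(FULL_TABLE, [0, 1, 2])
breakpoints = [0, 6]
outputs = [evaluate(x, breakpoints, active) for x in (-3, 0, 4, 6, 10)]
```

Keeping only M entries active means the per-input selection chooses among M candidates rather than N.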
  • Patent number: 11630605
    Abstract: A memory system comprises a plurality of memory sub-systems, each with a memory bank and other circuit components. For each of the memory sub-systems, a first buffer receives and stores a read-modify-write request (with a read address, a write address and a first operand), a second operand is read from the memory bank at the location specified by the read address, a combiner circuit combines the first operand with the second operand, an activation circuit transforms the output of the combiner circuit, and the output of the activation circuit is stored in the memory bank at the location specified by the write address. The first operand and the write address may be stored in a second buffer while the second operand is read from the memory bank. Further, the output of the activation circuit may be first stored in the first buffer before being stored in the memory bank.
    Type: Grant
    Filed: August 10, 2022
    Date of Patent: April 18, 2023
    Assignee: Recogni Inc.
    Inventors: Gary S. Goldman, Ashwin Radhakrishnan
  • Publication number: 20220076104
    Abstract: Dynamic data quantization may be applied to minimize the power consumption of a system that implements a convolutional neural network (CNN). Under such a quantization scheme, a quantized representation of a 3×3 array of m-bit activation values may include 9 n-bit mantissa values and one exponent shared between the n-bit mantissa values (n<m); and a quantized representation of a 3×3 kernel with p-bit parameter values may include 9 q-bit mantissa values and one exponent shared between the q-bit mantissa values (q<p). Convolution of the kernel with the activation data may include computing a dot product of the 9 n-bit mantissa values with the 9 q-bit mantissa values, and summing the shared exponents. In a CNN with multiple kernels, multiple computing units (each corresponding to one of the kernels) may receive the quantized representation of the 3×3 array of m-bit activation values from the same quantization-alignment module.
    Type: Application
    Filed: September 4, 2020
    Publication date: March 10, 2022
    Inventors: Jian hui Huang, James Michael Bodwin, Pradeep R. Joginipally, Shabarivas Abhiram, Gary S. Goldman, Martin Stefan Patz, Eugene M. Feinberg, Berend Ozceri
  • Patent number: 6829224
    Abstract: A method and apparatus for smoothing the rate of packet discards for random early detection (“RED”) in a communication device such as an ATM switch is described. The ATM switch includes a plurality of class of service queues. An accumulated discard probability is stored independently for each class of service queue. With the arrival of each packet (frame), an instantaneous discard probability is calculated. The sum of the instantaneous discard probability and the accumulated discard probability becomes the effective probability for discard. If the effective discard probability is greater than (or equal to) a random number, the cell is discarded, and the accumulated discard probability is cleared. Otherwise, the sum is stored back as the new value for the accumulated discard probability. The accumulated discard probability may optionally be cleared if a class of service queue's current cell count is zero.
    Type: Grant
    Filed: February 4, 1999
    Date of Patent: December 7, 2004
    Assignee: Cisco Technology, Inc.
    Inventors: Gary S. Goldman, Mohammed Nikuie
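The discard decision described in the abstract above translates almost directly into code. The sketch below keeps one accumulator object per class of service queue. How the instantaneous discard probability is computed is not stated in the abstract, so it is passed in as an argument; the class and parameter names are hypothetical.

```python
import random


class SmoothedDiscard:
    """Accumulated discard probability for one class of service queue."""

    def __init__(self, rng=random.random):
        self.accumulated = 0.0
        self.rng = rng  # injectable so the behavior can be tested

    def on_arrival(self, instantaneous_p, queue_empty=False):
        """Return True if the arriving packet should be discarded."""
        if queue_empty:
            self.accumulated = 0.0  # optional clear when the queue is empty
        effective = self.accumulated + instantaneous_p
        if effective >= self.rng():
            self.accumulated = 0.0  # discard and clear the accumulator
            return True
        self.accumulated = effective  # keep the sum for the next arrival
        return False


# Deterministic demonstration: the "random" number is fixed at 0.5.
queue = SmoothedDiscard(rng=lambda: 0.5)
decisions = [queue.on_arrival(0.2) for _ in range(4)]
```

Because the probability accumulates across arrivals that were not discarded, long runs without a discard become progressively less likely, which spreads discards more evenly over time.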
  • Patent number: 6385710
    Abstract: In accordance with the present invention, a cache memory subsystem includes a processor, a cache control unit and a SRAM serving as the cache memory. The SRAM is a synchronous SRAM. The cache control unit provides appropriately timed control signals to the SRAM when the processor is accessing the cache memory. The SRAM can be either a pipelined architecture SRAM (register output SRAM) or a flow-through access architecture SRAM (latch output SRAM). The cache control unit is selectably configured to operate in a pipelined mode (1-1-1) or a flow-through (2-2) mode. The cache control unit is configured in the 1-1-1 mode when the SRAM is a pipelined architecture SRAM having a clock rate equal to the processor. When the SRAM is a flow-through architecture SRAM that cannot be clocked at the same rate as the processor, the cache control unit is configured in the 2-2 mode and the SRAM is clocked at a clock rate half of the processor clock rate.
    Type: Grant
    Filed: February 23, 1996
    Date of Patent: May 7, 2002
    Assignee: Sun Microsystems, Inc.
    Inventors: Gary S. Goldman, Christopher Chen, Douglas W. Forehand
  • Patent number: 6212181
    Abstract: A system and method for assigning departure timeslots to arrival data in an ATM switch is described. The departure timeslots are assigned to arrival data when no departure data is pending or when arrival data has a higher priority than pending departure data.
    Type: Grant
    Filed: March 26, 1999
    Date of Patent: April 3, 2001
    Assignee: Cisco Technology, Inc.
    Inventors: Robert J. Divivier, Christopher B. Bergen, Gary S. Goldman
  • Patent number: 5715425
    Abstract: A central processing unit is connected to an external memory including system memory and an external cache. The central processing unit includes a First-In-First-Out (FIFO) load buffer configured to generate an access to the external memory in response to a data prefetch command. The access to external memory has an associated data load latency period as data is moved from the system memory into the external cache. Instead of requiring the access to external memory to be completed before another FIFO load buffer address is processed, as is typically required in a FIFO load buffer configuration, the FIFO load buffer responds to the data prefetch command by processing additional stored addresses during the data load latency period.
    Type: Grant
    Filed: February 22, 1996
    Date of Patent: February 3, 1998
    Assignee: Sun Microsystems, Inc.
    Inventors: Gary S. Goldman, Bruce E. Petrick, Marc Tremblay, Dale R. Greenley
  • Patent number: 5490250
    Abstract: The invention provides a method and apparatus for tagging a control error indication onto a data signal passing through a data router in a computer system.
    Type: Grant
    Filed: December 31, 1991
    Date of Patent: February 6, 1996
    Assignee: Amdahl Corporation
    Inventors: Klaus P. Reschke, Gary S. Goldman
  • Patent number: 5423025
    Abstract: An error handling and reporting mechanism is capable of taking advantage of sophisticated error analysis performed after clocks have been stopped in response to an error detected in a controller. The controller provides services in a data processing system in response to requests for controller services from a plurality of requestors. The controller includes a plurality of ports for storing requests for controller services. A plurality of servers is coupled to the plurality of ports, and perform separate services associated with the requests for controller services stored in the plurality of ports. An error reporting mechanism is included which is responsive to a detected error in a particular server associated with a request in a particular port, for posting error status in the particular port and causing clock stoppage within a clock stop latency period. An error analysis mechanism analyzes the detected errors during the clock stoppage.
    Type: Grant
    Filed: September 29, 1992
    Date of Patent: June 6, 1995
    Assignee: Amdahl Corporation
    Inventors: Gary S. Goldman, Kent W. Wendorf
  • Patent number: 5339407
    Abstract: Recovery of data from a store-to cache in a malfunctioning CPU is accomplished without exercising the hardware of the malfunctioning CPU. A data path which is independent of the normal operating paths of the computer, such as a scan facility, is used to move data out of the cache into the mainstore while the malfunctioning CPU's clocks are off. A system controller controls normal transfer of data between the cache memory of the processing unit and the mainstore. A service processor is coupled to the processing unit, the mainstore, and the system controller, and is responsive to the detection of errors in the processing unit, for stopping the processing unit and moving data out of the cache memory to the mainstore through the scan facility separate from the system controller. Logic in the system controller flushes the move out queue or other storage locations in the system controller.
    Type: Grant
    Filed: September 29, 1992
    Date of Patent: August 16, 1994
    Assignee: Amdahl Corporation
    Inventors: Gary S. Goldman, Silas P. Elash, Jeffrey L. Baker
  • Patent number: 4745605
    Abstract: In a data processing machine that generates a control word and that includes a plurality of registers connected to receive respective copies of the control word for execution in sections of the data processing machine, the present invention provides an apparatus for detecting an error condition in the execution of the control word. The apparatus detects an error in any of the respective copies of the control word. Further, a second means, responsive to the one copy of the control word in one register, is included for analyzing the one copy to identify a class of possible errors. Finally, responsive to the detection of an error in any of the respective copies and to the class of possible errors, a signal is generated indicating an error condition.
    Type: Grant
    Filed: August 19, 1986
    Date of Patent: May 17, 1988
    Assignee: Amdahl Corporation
    Inventors: Gary S. Goldman, Mark W. Semmelmeyer
  • Patent number: 4223255
    Abstract: An electric motor with microprogrammed controller and dual-functioning brushless commutation/rectification circuitry contained entirely within a wheel. The principal use of this electric motor is intended to be, but not limited to, powering a four-wheel drive electric vehicle through normal driving modes and serving as a power-recovery generator during braking. The magnetic and electronic configuration is optimized within the wheel to provide high torque and efficiency without the use of gear reductions, chain or belt drives, transmission, rotating axles, differentials, universal joints, or brushes. Power losses from mechanical drive system couplings are thus eliminated. Except for the wheel and bearings, there are no moving parts. Also, the wheel is virtually free of devices that are subject to mechanical failure.
    Type: Grant
    Filed: October 28, 1977
    Date of Patent: September 16, 1980
    Inventors: Gary S. Goldman, Allen W. Beishline