Patents by Inventor Steven Hsu

Steven Hsu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240007087
    Abstract: Techniques and mechanisms for an integrated clock gate (ICG) to selectively output a clock signal, and to provide frequency division functionality. In an embodiment, an ICG circuit comprises first circuitry which is coupled to receive a first clock signal, and second circuitry which is coupled to receive a control signal. The first circuitry provides a single edge triggered flip-flop functionality, and is coupled to communicate a feedback signal which the first circuitry is further coupled to receive. Based on the control signal and the feedback signal, the second circuitry performs an exclusive OR (XOR) operation to selectively enable the first circuitry to generate a second clock signal based on the first clock signal. In another embodiment, a frequency of the second clock signal is substantially equal to one half of a frequency of the first clock signal.
    Type: Application
    Filed: July 1, 2022
    Publication date: January 4, 2024
    Applicant: Intel Corporation
    Inventors: Steven Hsu, Amit Agarwal, Simeon Realov, Mark Anders, Ram Krishnamurthy
  • Publication number: 20230376274
    Abstract: A fused dot-product multiply-accumulate (MAC) circuit may support variable precisions of floating-point data elements to perform computations (e.g., MAC operations) in deep learning operations. An operation mode of the circuit may be selected based on the precision of an input element. The operation mode may be a FP16 mode or a FP8 mode. In the FP8 mode, product exponents may be computed based on exponents of floating-point input elements. A maximum exponent may be selected from the one or more product exponents. A global maximum exponent may be selected from a plurality of maximum exponents. A product mantissa may be computed and aligned with another product mantissa based on a difference between the global maximum exponent and a corresponding maximum exponent. An adder tree may accumulate the aligned product mantissas and compute a partial sum mantissa. The partial sum mantissa may be normalized using the global maximum exponent.
    Type: Application
    Filed: July 31, 2023
    Publication date: November 23, 2023
    Applicant: Intel Corporation
    Inventors: Mark Anders, Arnab Raha, Amit Agarwal, Steven Hsu, Deepak Abraham Mathaikutty, Ram K. Krishnamurthy, Martin Power
  • Patent number: 11791819
    Abstract: A parasitic-aware single-edge triggered flip-flop reduces clock power through layout optimization, enabled through process-circuit co-optimization. The static pass-gate master-slave flip-flop utilizes novel layout optimization enabling significant power reduction. The layout removes the clock poly over notches in the diffusion area. Poly lines implement clock nodes. The poly lines are aligned between n-type and p-type active regions.
    Type: Grant
    Filed: December 26, 2019
    Date of Patent: October 17, 2023
    Assignee: Intel Corporation
    Inventors: Steven Hsu, Amit Agarwal, Simeon Realov, Ram Krishnamurthy
  • Patent number: 11757434
    Abstract: A fast Mux-D scan flip-flop is provided, which bypasses a scan multiplexer to a master keeper side path, removing delay overhead of a traditional Mux-D scan topology. The design is compatible with simple scan methodology of Mux-D scan, while preserving smaller area and small number of inputs/outputs. Since scan Mux is not in the forward critical path, circuit topology has similar high performance as level-sensitive scan flip-flop and can be easily converted into bare pass-gate version. The new fast Mux-D scan flip-flop combines the advantages of the conventional LSSD and Mux-D scan flip-flop, without the disadvantages of each.
    Type: Grant
    Filed: April 1, 2022
    Date of Patent: September 12, 2023
    Assignee: Intel Corporation
    Inventors: Amit Agarwal, Steven Hsu, Simeon Realov, Mahesh Kumashikar, Ram Krishnamurthy
  • Publication number: 20230195388
    Abstract: Methods and apparatus relating to register file virtualization techniques are described. In an embodiment, a register file includes a plurality of register file cells. Each of the register file cells includes a register file entry and a shadow buffer. Logic circuitry causes storage of input data to the shadow buffer, while data stored in the register file entry is accessible to perform one or more operations. Other embodiments are also disclosed and claimed.
    Type: Application
    Filed: December 17, 2021
    Publication date: June 22, 2023
    Applicant: Intel Corporation
    Inventors: William Butera, David Webb, Mitchell Diamond, Steven Hsu, Amit Agarwal
  • Publication number: 20230014656
    Abstract: A memory array of a compute tile may store activations or weights of a DNN. The memory array may include databanks for storing contexts, context MUXs, and byte MUXs. A databank may store a context with flip-flop arrays, each of which includes a sequence of flip-flops. A logic gate and an ICG unit may gate flip-flops and control whether states of the flip-flops can be changed. The data gating can prevent a context not selected for the databank from inadvertently toggling and wasting power A context MUX may read a context from different flip-flop arrays in a databank based on gray-coded addresses. A byte MUX can combine bits from different bytes in a context read by the context MUX. The memory array may be implemented with bit packing to reduce distance between the context MUX and byte MUX to reduce lengths of wires connecting the context MUXs and byte MUXs.
    Type: Application
    Filed: September 23, 2022
    Publication date: January 19, 2023
    Inventors: Raymond Jit-Hung Sung, Deepak Abraham Mathaikutty, Amit Agarwal, David Thomas Bernard, Steven Hsu, Martin Power, Conor Byme, Arnab Raha
  • Patent number: 11442103
    Abstract: An apparatus is provided which comprises: a multi-bit quad latch with an internally coupled level sensitive scan circuitry; and a combinational logic coupled to an output of the multi-bit quad latch. Another apparatus is provided which comprises: a plurality of sequential logic circuitries; and a clocking circuitry comprising inverters, wherein the clocking circuitry is shared by the plurality of sequential logic circuitries.
    Type: Grant
    Filed: April 26, 2021
    Date of Patent: September 13, 2022
    Assignee: Intel Corporation
    Inventors: Amit Agarwal, Ram Krishnamurthy, Satish Damaraju, Steven Hsu, Simeon Realov
  • Patent number: 11398814
    Abstract: A new family of shared clock single-edge triggered flip-flops that reduces a number of internal clock devices from 8 to 6 devices to reduce clock power. The static pass-gate master-slave flip-flop has no performance penalty compared to the flip-flops with 8 clock devices thus enabling significant power reduction. The flip-flop intelligently maintains the same polarity between the master and slave stages which enables the sharing of the master tristate and slave state feedback clock devices without risk of charge sharing across all combinations of clock and data toggling. Because of this, the state of the flip-flop remains undisturbed, and is robust across charge sharing noise. A multi-bit time borrowing internal stitched flip-flop is also described, which enables internal stitching of scan in a high performance time-borrowing flip-flop without incurring increase in layout area.
    Type: Grant
    Filed: March 9, 2020
    Date of Patent: July 26, 2022
    Assignee: Intel Corporation
    Inventors: Steven Hsu, Amit Agarwal, Simeon Realov, Satish Damaraju, Ram Krishnamurthy
  • Publication number: 20220224316
    Abstract: A fast Mux-D scan flip-flop is provided, which bypasses a scan multiplexer to a master keeper side path, removing delay overhead of a traditional Mux-D scan topology. The design is compatible with simple scan methodology of Mux-D scan, while preserving smaller area and small number of inputs/outputs. Since scan Mux is not in the forward critical path, circuit topology has similar high performance as level-sensitive scan flip-flop and can be easily converted into bare pass-gate version. The new fast Mux-D scan flip-flop combines the advantages of the conventional LSSD and Mux-D scan flip-flop, without the disadvantages of each.
    Type: Application
    Filed: April 1, 2022
    Publication date: July 14, 2022
    Inventors: Amit Agarwal, Steven Hsu, Simeon Realov, Mahesh Kumashikar, Ram Krishnamurthy
  • Patent number: 11296681
    Abstract: A fast Mux-D scan flip-flop is provided, which bypasses a scan multiplexer to a master keeper side path, removing delay overhead of a traditional Mux-D scan topology. The design is compatible with simple scan methodology of Mux-D scan, while preserving smaller area and small number of inputs/outputs. Since scan Mux is not in the forward critical path, circuit topology has similar high performance as level-sensitive scan flip-flop and can be easily converted into bare pass-gate version. The new fast Mux-D scan flip-flop combines the advantages of the conventional LSSD and Mux-D scan flip-flop, without the disadvantages of each.
    Type: Grant
    Filed: December 23, 2019
    Date of Patent: April 5, 2022
    Assignee: Intel Corporation
    Inventors: Amit Agarwal, Steven Hsu, Simeon Realov, Mahesh Kumashikar, Ram Krishnamurthy
  • Publication number: 20210407039
    Abstract: A method comprising: dividing a 3D space into a voxel grid comprising a plurality of voxels; associating a plurality of distance values with the plurality of voxels, each distance value based on a distance to a boundary of an object; selecting an approximate interpolation mode for stepping a ray through a first one or more voxels of the 3D space responsive to the first one or more voxels having distance values greater than a threshold; and detecting the ray reaching a second one or more voxels having distance values less than the first threshold; and responsively selecting a precise interpolation mode for stepping the ray through the second one or more voxels.
    Type: Application
    Filed: June 30, 2020
    Publication date: December 30, 2021
    Inventors: Vivek De, Ram Krishnamurthy, Amit Agarwal, Steven Hsu, Monodeep Kar
  • Publication number: 20210407168
    Abstract: A method comprising: dividing a 3D space into a voxel grid comprising a plurality of voxels; associating a plurality of distance values with the plurality of voxels, each distance value based on a distance to a boundary of an object; selecting an approximate interpolation mode for stepping a ray through a first one or more voxels of the 3D space responsive to the first one or more voxels having distance values greater than a threshold; and detecting the ray reaching a second one or more voxels having distance values less than the first threshold; and responsively selecting a precise interpolation mode for stepping the ray through the second one or more voxels.
    Type: Application
    Filed: October 14, 2020
    Publication date: December 30, 2021
    Inventors: Vivek De, Ram Krishnamurthy, Amit Agarwal, Steven Hsu, Monodeep Kar
  • Publication number: 20210281250
    Abstract: A new family of shared clock single-edge triggered flip-flops that reduces a number of internal clock devices from 8 to 6 devices to reduce clock power. The static pass-gate master-slave flip-flop has no performance penalty compared to the flip-flops with 8 clock devices thus enabling significant power reduction. The flip-flop intelligently maintains the same polarity between the master and slave stages which enables the sharing of the master tristate and slave state feedback clock devices without risk of charge sharing across all combinations of clock and data toggling. Because of this, the state of the flip-flop remains undisturbed, and is robust across charge sharing noise. A multi-bit time borrowing internal stitched flip-flop is also described, which enables internal stitching of scan in a high performance time-borrowing flip-flop without incurring increase in layout area.
    Type: Application
    Filed: March 9, 2020
    Publication date: September 9, 2021
    Applicant: Intel Corporation
    Inventors: Steven Hsu, Amit Agarwal, Simeon Realov, Satish Damaraju, Ram Krishnamurthy
  • Publication number: 20210263100
    Abstract: An apparatus is provided which comprises: a multi-bit quad latch with an internally coupled level sensitive scan circuitry; and a combinational logic coupled to an output of the multi-bit quad latch. Another apparatus is provided which comprises: a plurality of sequential logic circuitries; and a clocking circuitry comprising inverters, wherein the clocking circuitry is shared by the plurality of sequential logic circuitries.
    Type: Application
    Filed: April 26, 2021
    Publication date: August 26, 2021
    Applicant: Intel Corporation
    Inventors: Amit Agarwal, Ram Krishnamurthy, Satish Damaraju, Steven Hsu, Simeon Realov
  • Patent number: 11054470
    Abstract: A family of novel, low power, min-drive strength, double-edge triggered (DET) input data multiplexer (Mux-D) scan flip-flop (FF) is provided. The flip-flop takes the advantage of no state node in the slave to remove data inverters in a traditional DET FF to save power, without affecting the flip-flop functionality under coupling/glitch scenarios.
    Type: Grant
    Filed: December 23, 2019
    Date of Patent: July 6, 2021
    Assignee: Intel Corporation
    Inventors: Amit Agarwal, Steven Hsu, Anupama Ambardar Thaploo, Simeon Realov, Ram Krishnamurthy
  • Publication number: 20210203323
    Abstract: A parasitic-aware single-edge triggered flip-flop reduces clock power through layout optimization, enabled through process-circuit co-optimization. The static pass-gate master-slave flip-flop utilizes novel layout optimization enabling significant power reduction. The layout removes the clock poly over notches in the diffusion area. Poly lines implement clock nodes. The poly lines are aligned between n-type and p-type active regions.
    Type: Application
    Filed: December 26, 2019
    Publication date: July 1, 2021
    Applicant: Intel Corporation
    Inventors: Steven Hsu, Amit Agarwal, Simeon Realov, Ram Krishnamurthy
  • Publication number: 20210194468
    Abstract: A family of novel, low power, min-drive strength, double-edge triggered (DET) input data multiplexer (Mux-D) scan flip-flop (FF) is provided. The flip-flop takes the advantage of no state node in the slave to remove data inverters in a traditional DET FF to save power, without affecting the flip-flop functionality under coupling/glitch scenarios.
    Type: Application
    Filed: December 23, 2019
    Publication date: June 24, 2021
    Applicant: Intel Corporation
    Inventors: Amit Agarwal, Steven Hsu, Anupama Ambardar Thaploo, Simeon Realov, Ram Krishnamurthy
  • Publication number: 20210194469
    Abstract: A fast Mux-D scan flip-flop is provided, which bypasses a scan multiplexer to a master keeper side path, removing delay overhead of a traditional Mux-D scan topology. The design is compatible with simple scan methodology of Mux-D scan, while preserving smaller area and small number of inputs/outputs. Since scan Mux is not in the forward critical path, circuit topology has similar high performance as level-sensitive scan flip-flop and can be easily converted into bare pass-gate version. The new fast Mux-D scan flip-flop combines the advantages of the conventional LSSD and Mux-D scan flip-flop, without the disadvantages of each.
    Type: Application
    Filed: December 23, 2019
    Publication date: June 24, 2021
    Applicant: Intel Corporation
    Inventors: Amit Agarwal, Steven Hsu, Simeon Realov, Mahesh Kumashikar, Ram Krishnamurthy
  • Patent number: 11009549
    Abstract: An apparatus is provided which comprises: a multi-bit quad latch with an internally coupled level sensitive scan circuitry; and a combinational logic coupled to an output of the multi-bit quad latch. Another apparatus is provided which comprises: a plurality of sequential logic circuitries; and a clocking circuitry comprising inverters, wherein the clocking circuitry is shared by the plurality of sequential logic circuitries.
    Type: Grant
    Filed: November 12, 2019
    Date of Patent: May 18, 2021
    Assignee: Intel Corporation
    Inventors: Amit Agarwal, Ram Krishnamurthy, Satish Damaraju, Steven Hsu, Simeon Realov
  • Publication number: 20210117197
    Abstract: Systems, apparatuses and methods identify a plurality of registers that are associated with a system-on-chip. The plurality of registers includes a first portion dedicated to write operations and a second portion dedicated to read operations. The technology writes data to the first portion of the plurality of registers, and transfers the data from the first portion to the second portion.
    Type: Application
    Filed: December 23, 2020
    Publication date: April 22, 2021
    Applicant: Intel Corporation
    Inventors: Steven Hsu, Amit Agarwal, Debabrata Mohapatra, Arnab Raha, Moongon Jung, Gautham Chinya, Ram Krishnamurthy