Patents by Inventor Huichu Liu
Huichu Liu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11907827Abstract: Methods and systems include a neural network system that includes a neural network accelerator. The neural network accelerator includes multiple processing engines coupled together to perform arithmetic operations in support of an inference performed using the deep neural network system. The neural network accelerator also includes a schedule-aware tensor data distribution circuitry or software that is configured to load tensor data into the multiple processing engines in a load phase, extract output data from the multiple processing engines in an extraction phase, reorganize the extracted output data, and store the reorganized extracted output data to memory.Type: GrantFiled: June 28, 2019Date of Patent: February 20, 2024Assignee: Intel CorporationInventors: Gautham Chinya, Huichu Liu, Arnab Raha, Debabrata Mohapatra, Cormac Brick, Lance Hacking
-
Publication number: 20240022259Abstract: Methods, systems, articles of manufacture, and apparatus are disclosed to decode zero-value-compression data vectors. An example apparatus includes: a buffer monitor to monitor a buffer for a header including a value indicative of compressed data; a data controller to, when the buffer includes compressed data, determine a first value of a sparse select signal based on (1) a select signal and (2) a first position in a sparsity bitmap, the first value of the sparse select signal corresponding to a processing element that is to process a portion of the compressed data; and a write controller to, when the buffer includes compressed data, determine a second value of a write enable signal based on (1) the select signal and (2) a second position in the sparsity bitmap, the second value of the write enable signal corresponding to the processing element that is to process the portion of the compressed data.Type: ApplicationFiled: September 12, 2023Publication date: January 18, 2024Applicant: Intel CorporationInventors: Gautham Chinya, Debabrata Mohapatra, Arnab Raha, Huichu Liu, Cormac Brick
-
Patent number: 11869617Abstract: In some embodiments, a system comprises a static random access memory (SRAM) device and a controller. The SRAM device comprises a bit cell array comprising a plurality of bit cells arranged in a plurality of rows and a plurality of columns, each column operatively coupled to a pair of bit lines, wherein the plurality of columns is arranged as a plurality of column groups each comprising a plurality of local columns. The SRAM device further comprises a plurality of column decoders, each associated with a column group of the plurality of column groups. In some embodiments, the controller may be configured to read the local columns included in the column group by, for a given local column, sensing a voltage difference on a corresponding pair of bit lines, in a rearranged sequential order that is different from a physical sequential order of the plurality of local columns.Type: GrantFiled: April 11, 2022Date of Patent: January 9, 2024Assignee: Meta Platforms Technologies, LLCInventors: Huichu Liu, Edith Dallard, Daniel Henry Morris
-
Patent number: 11804851Abstract: Methods, systems, articles of manufacture, and apparatus are disclosed to decode zero-value-compression data vectors. An example apparatus includes: a buffer monitor to monitor a buffer for a header including a value indicative of compressed data; a data controller to, when the buffer includes compressed data, determine a first value of a sparse select signal based on (1) a select signal and (2) a first position in a sparsity bitmap, the first value of the sparse select signal corresponding to a processing element that is to process a portion of the compressed data; and a write controller to, when the buffer includes compressed data, determine a second value of a write enable signal based on (1) the select signal and (2) a second position in the sparsity bitmap, the second value of the write enable signal corresponding to the processing element that is to process the portion of the compressed data.Type: GrantFiled: March 27, 2020Date of Patent: October 31, 2023Assignee: INTEL CORPORATIONInventors: Gautham Chinya, Debabrata Mohapatra, Arnab Raha, Huichu Liu, Cormac Brick
-
Patent number: 11734174Abstract: Described is an low overhead method and apparatus to reconfigure a pair of buffered interconnect links to operate in one of these three modes—first mode (e.g., bandwidth mode), second mode (e.g., latency mode), and third mode (e.g., energy mode). In bandwidth mode, each link in the pair buffered interconnect links carries a unique signal from source to destination. In latency mode, both links in the pair carry the same signal from source to destination, where one link in the pair is “primary” and other is called the “assist”. Temporal alignment of transitions in this pair of buffered interconnects reduces the effective capacitance of primary, thereby reducing delay or latency. In energy mode, one link in the pair, the primary, alone carries a signal, while the other link in the pair is idle. An idle neighbor on one side reduces energy consumption of the primary.Type: GrantFiled: September 19, 2019Date of Patent: August 22, 2023Assignee: Intel CorporationInventors: Huichu Liu, Tanay Karnik, Tejpal Singh, Yen-Cheng Liu, Lavanya Subramanian, Mahesh Kumashikar, Sri Harsha Choday, Sreenivas Subramoney, Kaushik Vaidyanathan, Daniel H. Morris, Uygar E. Avci, Ian A. Young
-
Publication number: 20230075959Abstract: An SRAM controller for performing sequential accesses using internal ports that operate concurrently on different rows. Each internal port includes a row address strobe (RAS) timer that generates clock signals controlling the timing of operations during a RAS phase in which word line decoding is performed once for a group of bit cells being accessed. The RAS phase can involve additional conditioning operations, such as precharging of local bits lines associated with the group of bit cells. The RAS phase is followed by an input/output (IO) phase in which individual bit cells are accessed in sequential address order using a column select signal generated by an IO timer. The RAS phase of a first internal port can be at least partially overlapped by the IO phase of a second internal port to hide the RAS latency of the first internal port. The IO timer can be shared among internal ports.Type: ApplicationFiled: May 5, 2022Publication date: March 9, 2023Inventors: Huichu LIU, Daniel Henry MORRIS, Edith DALLARD
-
Publication number: 20230065591Abstract: In some embodiments, a system comprises a static random access memory (SRAM) device and a controller. The SRAM device comprises a bit cell array comprising a plurality of bit cells arranged in a plurality of rows and a plurality of columns, each column operatively coupled to a pair of bit lines, wherein the plurality of columns is arranged as a plurality of column groups each comprising a plurality of local columns. The SRAM device further comprises a plurality of column decoders, each associated with a column group of the plurality of column groups. In some embodiments, the controller may be configured to read the local columns included in the column group by, for a given local column, sensing a voltage difference on a corresponding pair of bit lines, in a rearranged sequential order that is different from a physical sequential order of the plurality of local columns.Type: ApplicationFiled: April 11, 2022Publication date: March 2, 2023Inventors: Huichu LIU, Edith DALLARD, Daniel Henry MORRIS
-
Publication number: 20230065165Abstract: In some embodiments, an apparatus comprises: a static random access memory (SRAM) device. The SRAM device may have a bit cell array comprising a plurality of bit cells, the plurality of bit cells arranged in a plurality of rows and a plurality of columns, each column of the plurality of columns operatively coupled to a pair of bit lines. The apparatus may comprise a controller configured to: assert a word line associated with a row; perform a sequence of write operations while the word line remains asserted, each write operation corresponding to a bit cell associated with a different column of the plurality of columns and the row, wherein the word line has an elevated voltage relative to a non-elevated voltage during at least a portion of the sequence of write operations; and de-assert the word line after the sequence of write operations are performed.Type: ApplicationFiled: December 17, 2021Publication date: March 2, 2023Inventors: Edith DALLARD, Huichu LIU, Daniel Henry MORRIS, Doyun KIM
-
Patent number: 11522012Abstract: A DIMA semiconductor structure is disclosed. The DIMA semiconductor structure includes a frontend including a semiconductor substrate, a transistor switch of a memory cell coupled to the semiconductor substrate and a computation circuit on the periphery of the frontend coupled to the semiconductor substrate. Additionally, the DIMA includes a backend that includes an RRAM component of the memory cell that is coupled to the transistor switch.Type: GrantFiled: September 28, 2018Date of Patent: December 6, 2022Assignee: Intel CorporationInventors: Jack T. Kavalieros, Ian A. Young, Ram Krishnamurthy, Ravi Pillarisetty, Sasikanth Manipatruni, Gregory Chen, Hui Jae Yoo, Van H. Le, Abhishek Sharma, Raghavan Kumar, Huichu Liu, Phil Knag, Huseyin Sumbul
-
Publication number: 20220383081Abstract: A neural network accelerator includes a first memory device, a controller connected to the first memory device through a high-bandwidth (e.g., three-dimensional) interconnect, a configurable processing element (PE) array connected to the first memory device through a first data bus and including a two-dimensional (2D) array of PEs, a local memory connected to the controller and connected, through a second data bus, to the configurable PE array. The controller is configured to, during execution of a neural network (NN), dynamically configure the neural network accelerator for executing each NN layer of a plurality of NN layers of the neural network by selecting either weights of a weight tensor or input data of an input tensor of a tensor operation of the NN layer to store into the local memory, and configuring input and output connections of PEs in the 2D array of PEs for performing the tensor operation.Type: ApplicationFiled: December 16, 2021Publication date: December 1, 2022Inventors: Huichu LIU, Fan WU, Edith DALLARD, Linyan MEI, Huseyin Ekin SUMBUL
-
Patent number: 11411172Abstract: An apparatus is provided which comprises a full adder including magnetoelectric material and spin orbit material. In some embodiments, the adder includes: a 3-bit carry generation structure and a multi-bit sum generation structure coupled to the 3-bit carry generation structure. In some embodiments, the 3-bit carry generation structure includes at least three cells comprising magnetoelectric material and spin orbit material, wherein the 3-bit carry generation structure is to perform a minority logic operation on first, second, and third inputs to generate a carry output. In some embodiments, the multi-bit sum generation structure includes at least four cells comprising magnetoelectric material and spin orbit material, wherein the multi-bit sum generation structure is to perform a minority logic operation on the first, second, and third inputs and the carry output to generate a sum output.Type: GrantFiled: September 13, 2018Date of Patent: August 9, 2022Assignee: Intel CorporationInventors: Huichu Liu, Sasikanth Manipatruni, Daniel Morris, Kaushik Vaidyanathan, Tanay Karnik, Ian Young
-
Patent number: 11387404Abstract: An apparatus is provided which comprises one or more magnetoelectric spin orbit (MESO) minority gates with different peripheral complementary metal oxide semiconductor (CMOS) circuit techniques in the device layer including: (1) current mirroring, (2) complementary supply voltages, (3) asymmetrical transistor sizing, and (4) using transmission gates. These MESO minority gates use the multi-phase clock to prevent back propagation of current so that MESO gate can correctly process the input data.Type: GrantFiled: September 13, 2018Date of Patent: July 12, 2022Assignee: Intel CorporationInventors: Huichu Liu, Tanay Karnik, Sasikanth Manipatruni, Daniel Morris, Kaushik Vaidyanathan, Ian Young
-
Patent number: 11043256Abstract: Described are mechanisms and methods for amortizing the cost of address decode, row-decode and wordline firing across multiple read accesses (instead of just on one read access). Some or all memory locations that share a wordline (WL) may be read, by walking through column multiplexor addresses (instead of just reading out one column multiplexor address per WL fire or memory access). The mechanisms and methods disclosed herein may advantageously enable N distinct memory words to be read out if the array uses an N-to-1 column multiplexor. Since memories such as embedded DRAMs (eDRAMs) may undergo a destructive read, for a given WL fire, a design may be disposed to sense N distinct memory words and restore them in order.Type: GrantFiled: June 29, 2019Date of Patent: June 22, 2021Assignee: Intel CorporationInventors: Kaushik Vaidyanathan, Huichu Liu, Tanay Karnik, Sreenivas Subramoney, Jayesh Gaur, Sudhanshu Shukla
-
Patent number: 11037614Abstract: Described is an apparatus to reduce or eliminate imprint charge, wherein the apparatus which comprises: a source line; a bit-line; a memory bit-cell coupled to the source line and the bit-line; a first multiplexer coupled to the bit-line; a second multiplexer coupled to the source-line; a first driver coupled to the first multiplexer; a second driver coupled to the second multiplexer; and a current source coupled to the first and second drivers.Type: GrantFiled: July 23, 2018Date of Patent: June 15, 2021Assignee: Intel CorporationInventors: Huichu Liu, Sasikanth Manipatruni, Ian A. Young, Tanay Karnik, Daniel H. Morris, Kaushik Vaidyanathan
-
Publication number: 20210089448Abstract: Described is an low overhead method and apparatus to reconfigure a pair of buffered interconnect links to operate in one of these three modes—first mode (e.g., bandwidth mode), second mode (e.g., latency mode), and third mode (e.g., energy mode). In bandwidth mode, each link in the pair buffered interconnect links carries a unique signal from source to destination. In latency mode, both links in the pair carry the same signal from source to destination, where one link in the pair is “primary” and other is called the “assist”. Temporal alignment of transitions in this pair of buffered interconnects reduces the effective capacitance of primary, thereby reducing delay or latency. In energy mode, one link in the pair, the primary, alone carries a signal, while the other link in the pair is idle. An idle neighbor on one side reduces energy consumption of the primary.Type: ApplicationFiled: September 19, 2019Publication date: March 25, 2021Applicant: Intel CorporationInventors: Huichu Liu, Tanay Karnik, Tejpal Singh, Yen-Cheng Liu, Lavanya Subramanian, Mahesh Kumashikar, Sri Harsha Chodav, Sreenivas Subramoney, Kaushik Vaidyanathan, Daniel H. Morris, Uygar E. Avci, Ian A. Young
-
Patent number: 10901486Abstract: Described is an apparatus which comprises: a first electrical path comprising at least one driver and receiver; and a second electrical path comprising at least one driver and receiver, wherein the first and second electrical paths are to receive a same input signal, wherein the first electrical path and the second electrical path are parallel to one another and have substantially the same propagation delays, and wherein the second electrical path is enabled during a first operation mode and disabled during a second operation mode.Type: GrantFiled: April 15, 2019Date of Patent: January 26, 2021Assignee: Intel CorporationInventors: Kaushik Vaidyanathan, Daniel H. Morris, Uygar E. Avci, Ian A. Young, Tanay Karnik, Huichu Liu
-
Publication number: 20200411079Abstract: Described are mechanisms and methods for amortizing the cost of address decode, row-decode and wordline firing across multiple read accesses (instead of just on one read access). Some or all memory locations that share a wordline (WL) may be read, by walking through column multiplexor addresses (instead of just reading out one column multiplexor address per WL fire or memory access). The mechanisms and methods disclosed herein may advantageously enable N distinct memory words to be read out if the array uses an N-to-1 column multiplexor. Since memories such as embedded DRAMs (eDRAMs) may undergo a destructive read, for a given WL fire, a design may be disposed to sense N distinct memory words and restore them in order.Type: ApplicationFiled: June 29, 2019Publication date: December 31, 2020Applicant: Intel CorporationInventors: Kaushik Vaidyanathan, Huichu Liu, Tanay Karnik, Sreenivas Subramoney, Jayesh Gaur, Sudhanshu Shukla
-
Publication number: 20200410327Abstract: Methods and systems include a neural network system that includes a neural network accelerator comprising. The neural network accelerator includes multiple processing engines coupled together to perform arithmetic operations in support of an inference performed using the deep neural network system. The neural network accelerator also includes a schedule-aware tensor data distribution circuitry or software that is configured to load tensor data into the multiple processing engines in a load phase, extract output data from the multiple processing engines in an extraction phase, reorganize the extracted output data, and store the reorganized extracted output data to memory.Type: ApplicationFiled: June 28, 2019Publication date: December 31, 2020Inventors: Gautham Chinya, Huichu Liu, Arnab Raha, Debabrata Mohapatra, Cormac Brick, Lance Hacking
-
Patent number: 10777250Abstract: Embodiments include apparatuses, methods, and systems associated with save-restore circuitry including metal-ferroelectric-metal (MFM) devices. The save-restore circuitry may be coupled to a bit node and/or bit bar node of a pair of cross-coupled inverters to save the state of the bit node and/or bit bar node when an associated circuit block transitions to a sleep state, and restore the state of the bit node and/or bit bar node when the associated circuit block transitions from the sleep state to an active state. The save-restore circuitry may be used in a flip-flop circuit, a register file circuit, and/or another suitable type of circuit. The save-restore circuitry may include a transmission gate coupled between the bit node (or bit bar node) and an internal node, and an MFM device coupled between the internal node and a plate line. Other embodiments may be described and claimed.Type: GrantFiled: September 27, 2018Date of Patent: September 15, 2020Assignee: Intel CorporationInventors: Kaushik Vaidyanathan, Daniel H. Morris, Huichu Liu, Dileep J. Kurian, Uygar E. Avci, Tanay Karnik, Ian A. Young
-
Patent number: 10748602Abstract: One embodiment provides an apparatus. The apparatus includes a pair of nonvolatile resistive random access memory (RRAM) memory cells coupled to a volatile static RAM (SRAM) memory cell. The pair of nonvolatile RRAM memory cells includes a first RRAM memory cell and a second RRAM memory cell. The first RRAM memory cell includes a first resistive memory element coupled to a first bit line, and a first selector transistor coupled between the first resistive memory element and a first output node of the volatile SRAM memory cell. The second RRAM memory cell includes a second resistive memory element coupled to a second bit line, and a second selector transistor coupled between the second resistive memory element and a second output node of the volatile SRAM memory cell.Type: GrantFiled: March 23, 2016Date of Patent: August 18, 2020Assignee: Intel CorporationInventors: Huichu Liu, Sasikanth Manipatruni, Daniel H. Morris, Kaushik Vaidyanathan, Niloy Mukherjee, Dmitri E. Nikonov, Ian Young, Tanay Karnik