Patents by Inventor Chung Kuang Chin
Chung Kuang Chin has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20230176731
Abstract: Managed units (MUs) of data can be stored on a memory device according to a slice-based layout. A slice of the slice-based layout can include a plurality of stripes, each of the stripes including respective partitions and respective MUs of data. A subset of the stripes each include a quantity of partitions and a first quantity of MUs of data. Another subset of the stripes each include a lesser quantity of partitions and a lesser quantity of MUs of data.
Type: Application
Filed: December 6, 2021
Publication date: June 8, 2023
Inventors: Horia C. Simionescu, Chung Kuang Chin
-
Publication number: 20230176978
Abstract: Translated addresses of a memory device can be stored in a first LUT maintained by control circuitry. Untranslated addresses can be stored in a second LUT maintained by the control circuitry. In response to a translation request for a particular translated address of the memory device corresponding to a target untranslated address, an index of the second LUT associated with the target untranslated address can be determined, the index of the second LUT can be mapped to an index of the first LUT, and the particular translated address corresponding to the target untranslated address can be retrieved from the first LUT.
Type: Application
Filed: December 3, 2021
Publication date: June 8, 2023
Inventors: Chung Kuang Chin, Di Hsien Ngu, Horia C. Simionescu
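The three-step lookup described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the class name `TwoLutTranslator`, the `add_mapping` helper, and the dictionary used to map second-LUT indices to first-LUT indices are all hypothetical choices made for illustration.

```python
class TwoLutTranslator:
    """Toy model of address translation through two lookup tables."""

    def __init__(self):
        self.first_lut = []    # translated addresses
        self.second_lut = []   # untranslated addresses
        self.index_map = {}    # second-LUT index -> first-LUT index (assumed layout)

    def add_mapping(self, untranslated, translated):
        # Hypothetical helper: record one address pair in both tables.
        self.first_lut.append(translated)
        self.second_lut.append(untranslated)
        self.index_map[len(self.second_lut) - 1] = len(self.first_lut) - 1

    def translate(self, untranslated):
        # Step 1: find the second-LUT index of the target untranslated address.
        second_index = self.second_lut.index(untranslated)  # ValueError if absent
        # Step 2: map that index to an index of the first LUT.
        first_index = self.index_map[second_index]
        # Step 3: retrieve the translated address from the first LUT.
        return self.first_lut[first_index]


translator = TwoLutTranslator()
translator.add_mapping(0x1000, 0xA000)
translator.add_mapping(0x2000, 0xB000)
print(hex(translator.translate(0x2000)))
```

A linear search stands in for whatever matching the control circuitry would perform; the sketch only shows the order of the steps.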
-
Patent number: 11531622
Abstract: Systems and methods are disclosed including a processing device operatively coupled to a first and a second memory device. The processing device can receive a set of data access requests, from a host system, in a first order and execute the set of data access requests in a second order. The processing device can further identify a late data access request of the set of data access requests and determine whether a data structure in a local memory associated with the processing device includes a previous outstanding data access request corresponding to an address associated with the late data access request. Responsive to determining that the data structure includes an indication of a previous outstanding data access request corresponding to the address associated with the late data access request, the processing device can identify a type of data dependency associated with the previous outstanding data access request and perform one or more operations associated with the type of data dependency.
Type: Grant
Filed: August 26, 2020
Date of Patent: December 20, 2022
Assignee: MICRON TECHNOLOGY, INC.
Inventors: Horia C. Simionescu, Chung Kuang Chin, Paul Stonelake, Narasimhulu Dharanikumar Kotte
-
Patent number: 11494306
Abstract: Systems and methods are disclosed including a first memory component, a second memory component having a lower access latency than the first memory component and acting as a cache for the first memory component, and a processing device operatively coupled to the first and second memory components. The processing device can perform operations including receiving a data access operation and, responsive to determining that a data structure includes an indication of an outstanding data transfer of data associated with a physical address of the data access operation, determining whether an operation to copy the data, associated with the physical address, from the first memory component to the second memory component is scheduled to be executed. The processing device can further perform operations including determining to delay a scheduling of an execution of the data access operation until the operation to copy the data is executed.
Type: Grant
Filed: August 26, 2020
Date of Patent: November 8, 2022
Assignee: MICRON TECHNOLOGY, INC.
Inventors: Horia C. Simionescu, Chung Kuang Chin, Paul Stonelake, Narasimhulu Dharanikumar Kotte
-
Patent number: 11397683
Abstract: Systems and methods are disclosed including a first memory device, a second memory device coupled to the first memory device, where the second memory device has a lower access latency than the first memory device and acts as a cache for the first memory device. A processing device operatively coupled to the first and second memory devices can track access statistics of segments of data stored at the second memory device, the segments having a first granularity, and determine to update, based on the access statistics, a segment of data stored at the second memory device from the first granularity to a second granularity. The processing device can further retrieve additional data associated with the segment of data from the first memory device and store the additional data at the second memory device to form a new segment having the second granularity.
Type: Grant
Filed: August 26, 2020
Date of Patent: July 26, 2022
Assignee: MICRON TECHNOLOGY, INC.
Inventors: Horia C. Simionescu, Paul Stonelake, Chung Kuang Chin, Narasimhulu Dharanikumar Kotte, Robert M. Walker, Cagdas Dirik
-
Publication number: 20210141697
Abstract: Embodiments described herein provide a mission-critical artificial intelligence (AI) processor (MAIP), which includes multiple types of HEs (hardware elements) comprising one or more HEs configured to perform operations associated with multi-layer NN (neural network) processing, at least one spare HE, a data buffer to store correctly computed data in a previous layer of multi-layer NN processing, and fault tolerance (FT) control logic. The FT control logic is configured to: determine a fault in a current layer NN processing associated with the HE; cause the correctly computed data in the previous layer of multi-layer NN processing to be copied or moved to said at least one spare HE; and cause said at least one spare HE to perform the current layer NN processing using said at least one spare HE and the correctly computed data in the previous layer of multi-layer NN processing.
Type: Application
Filed: February 25, 2019
Publication date: May 13, 2021
Inventors: Chung Kuang CHIN, Yujie HU, Tong WU, Clifford GOLD, Yick Kei WONG, Xiaosong WANG, Steven SERTILLANGE, Zongwei ZHU
-
Publication number: 20210089450
Abstract: Systems and methods are disclosed including a processing device operatively coupled to a first and a second memory device. The processing device can receive a set of data access requests, from a host system, in a first order and execute the set of data access requests in a second order. The processing device can further identify a late data access request of the set of data access requests and determine whether a data structure in a local memory associated with the processing device includes a previous outstanding data access request corresponding to an address associated with the late data access request. Responsive to determining that the data structure includes an indication of a previous outstanding data access request corresponding to the address associated with the late data access request, the processing device can identify a type of data dependency associated with the previous outstanding data access request and perform one or more operations associated with the type of data dependency.
Type: Application
Filed: August 26, 2020
Publication date: March 25, 2021
Inventors: Horia C. Simionescu, Chung Kuang Chin, Paul Stonelake, Narasimhulu Dharanikumar Kotte
-
Publication number: 20210089449
Abstract: Systems and methods are disclosed including a first memory component, a second memory component having a lower access latency than the first memory component and acting as a cache for the first memory component, and a processing device operatively coupled to the first and second memory components. The processing device can perform operations including receiving a data access operation and, responsive to determining that a data structure includes an indication of an outstanding data transfer of data associated with a physical address of the data access operation, determining whether an operation to copy the data, associated with the physical address, from the first memory component to the second memory component is scheduled to be executed. The processing device can further perform operations including determining to delay a scheduling of an execution of the data access operation until the operation to copy the data is executed.
Type: Application
Filed: August 26, 2020
Publication date: March 25, 2021
Inventors: Horia C. Simionescu, Chung Kuang Chin, Paul Stonelake, Narasimhulu Dharanikumar Kotte
-
Publication number: 20210089454
Abstract: Systems and methods are disclosed including a first memory device, a second memory device coupled to the first memory device, where the second memory device has a lower access latency than the first memory device and acts as a cache for the first memory device. A processing device operatively coupled to the first and second memory devices can track access statistics of segments of data stored at the second memory device, the segments having a first granularity, and determine to update, based on the access statistics, a segment of data stored at the second memory device from the first granularity to a second granularity. The processing device can further retrieve additional data associated with the segment of data from the first memory device and store the additional data at the second memory device to form a new segment having the second granularity.
Type: Application
Filed: August 26, 2020
Publication date: March 25, 2021
Inventors: Horia C. Simionescu, Paul Stonelake, Chung Kuang Chin, Narasimhulu Dharanikumar Kotte, Robert M. Walker, Cagdas Dirik
-
Patent number: 10747631
Abstract: Embodiments described herein provide a mission-critical artificial intelligence (AI) processor (MAIP), which includes an instruction buffer, processing circuitry, a data buffer, command circuitry, and communication circuitry. During operation, the instruction buffer stores a first hardware instruction and a second hardware instruction. The processing circuitry executes the first hardware instruction, which computes an intermediate stage of an AI model. The data buffer stores data generated from executing the first hardware instruction. The command circuitry determines that the second hardware instruction is a hardware-initiated store instruction for transferring the data from the data buffer. Based on the hardware-initiated store instruction, the communication circuitry transfers the data from the data buffer to a memory device of a computing system, which includes the mission-critical processor, via a communication interface.
Type: Grant
Filed: June 5, 2018
Date of Patent: August 18, 2020
Assignee: DINOPLUSAI HOLDINGS LIMITED
Inventors: Yujie Hu, Tong Wu, Xiaosong Wang, Zongwei Zhu, Chung Kuang Chin, Clifford Gold, Steven Sertillange, Yick Kei Wong
-
Publication number: 20200074293
Abstract: A scalar element computing device for computing a selected activation function selected from two or more different activation functions is disclosed. The scalar element computing device comprises N processing elements, N command memories and an operator pool. The N processing elements are arranged into a pipeline so that the outputs of each non-last-stage processing element are coupled to the inputs of one next-stage processing element. The N command memories are coupled to the N processing elements individually. The operator pool is coupled to the N processing elements, where the operator pool comprises a set of operators for implementing any activation function in an activation function group. The N processing elements are configured according to command information stored in the N command memories to calculate a target activation function selected from the activation function group by using one or more operators in the set of operators.
Type: Application
Filed: August 29, 2018
Publication date: March 5, 2020
Inventors: Chung Kuang Chin, Tong Wu, Ahmed Saber, Steven Sertillange
-
Publication number: 20190279083
Abstract: A computing device for fast weighted sum calculation in neural networks is disclosed. The computing device comprises an array of processing elements configured to accept an input array. Each processing element comprises a plurality of multipliers and multiple levels of accumulators. A set of weights associated with the inputs and a target output are provided to a target processing element to compute the weighted sum for the target output. The device according to the present invention reduces the computation time from M clock cycles to log2(M) clock cycles, where M is the size of the input array.
Type: Application
Filed: April 19, 2018
Publication date: September 12, 2019
Inventors: Cliff Gold, Tong Wu, Yujie Hu, Chung Kuang Chin, Xiaosong Wang, Yick Kei Wong
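The reduction from M steps to about log2(M) steps can be sketched in software as a pairwise adder tree. This is an assumption about how the multiple levels of accumulators are arranged; the abstract does not spell out the structure, and the function name `weighted_sum_tree` is hypothetical.

```python
def weighted_sum_tree(inputs, weights):
    """Return (weighted sum, number of accumulator levels used)."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    # All products are formed at once, modelling one multiplier per input.
    level = [x * w for x, w in zip(inputs, weights)]
    depth = 0
    # Each pass models one accumulator level that adds neighbouring pairs.
    while len(level) > 1:
        if len(level) % 2:
            level.append(0)  # pad odd-sized levels with a neutral element
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
        depth += 1
    return (level[0] if level else 0), depth


total, levels = weighted_sum_tree([1, 2, 3, 4, 5, 6, 7, 8], [1] * 8)
print(total, levels)
```

With M = 8 inputs the tree needs three levels, whereas a single sequential accumulator would need one step per input.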
-
Publication number: 20190227887
Abstract: Embodiments described herein provide a mission-critical artificial intelligence (AI) processor (MAIP), which includes an instruction buffer, processing circuitry, a data buffer, command circuitry, and communication circuitry. During operation, the instruction buffer stores a first hardware instruction and a second hardware instruction. The processing circuitry executes the first hardware instruction, which computes an intermediate stage of an AI model. The data buffer stores data generated from executing the first hardware instruction. The command circuitry determines that the second hardware instruction is a hardware-initiated store instruction for transferring the data from the data buffer. Based on the hardware-initiated store instruction, the communication circuitry transfers the data from the data buffer to a memory device of a computing system, which includes the mission-critical processor, via a communication interface.
Type: Application
Filed: June 5, 2018
Publication date: July 25, 2019
Applicant: DinoplusAI Holdings Limited
Inventors: Yujie Hu, Tong Wu, Xiaosong Wang, Zongwei Zhu, Chung Kuang Chin, Clifford Gold, Steven Sertillange, Yick Kei Wong
-
Patent number: 8861515
Abstract: Generally, a method and apparatus are disclosed that store sequential data units of a data packet received at an input port in contiguous banks of a buffer in a shared memory, thereby obviating any need for storing linkage information between data units. Data packets can extend through multiple buffers (next-buffer linkage information is much more efficient than next-data-unit linkage information). According to another aspect of the invention, buffer memory utilization can be further enhanced by storing multiple packets in a single buffer. For each buffer, a buffer usage count is stored that indicates the sum (over all packets represented in the buffer) of the number of output ports toward which each of the packets is destined.
Type: Grant
Filed: April 21, 2004
Date of Patent: October 14, 2014
Assignee: Agere Systems LLC
Inventors: Chung Kuang Chin, Yaw Fann, Roy T. Myers, Jr.
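The buffer usage count can be illustrated with a short sketch. The abstract defines only the count itself; decrementing it on each transmission and releasing the buffer when it reaches zero is an assumption added here, and the names `SharedBuffer`, `store_packet`, and `packet_sent_to_port` are hypothetical.

```python
class SharedBuffer:
    """Toy buffer holding several packets and one shared usage count."""

    def __init__(self):
        self.usage_count = 0

    def store_packet(self, destination_port_count):
        # The count is the sum, over all packets in the buffer, of the
        # number of output ports each packet is destined for.
        self.usage_count += destination_port_count

    def packet_sent_to_port(self):
        # Assumed policy: one decrement per (packet, output port) transmission.
        if self.usage_count == 0:
            raise RuntimeError("no outstanding transmissions for this buffer")
        self.usage_count -= 1
        return self.usage_count == 0  # True once the buffer could be reused


buffer = SharedBuffer()
buffer.store_packet(2)  # a multicast packet bound for two output ports
buffer.store_packet(1)  # a unicast packet sharing the same buffer
print(buffer.usage_count)
```

Under that assumed policy, the third call to `packet_sent_to_port` would report that the buffer is free.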
-
Patent number: 8861300
Abstract: A multi-port memory may be formed from a plurality of “simpler” memories. In one implementation, the memory includes a write port and a number of memories provided in groups, such that the write port supplies each of a plurality of copies of the data unit to a subset of the memories, each of the subset of memories being provided in a corresponding one of the groups, a number of the copies of the data unit being greater than two. Multiplexers may be implemented, each of which is associated with a corresponding one of the groups of the memories. One of the plurality of multiplexers may be configured to selectively supply one of the copies of the data unit from one of the memories. A read port may receive the one of the copies of the data unit from the one of the multiplexers and output the one of the copies of the data unit.
Type: Grant
Filed: June 30, 2009
Date of Patent: October 14, 2014
Assignee: Infinera Corporation
Inventor: Chung Kuang Chin
-
Patent number: 8848720
Abstract: A propagation delay in the transmission of a frame from an initiator node to a peer node is determined by initially identifying a frame number and byte offset of a first incoming frame from the peer node at a time when the initiator node outputs a portion of a transmitted frame. The portion of the transmitted frame may be the first byte of a sub-frame within the transmitted frame. At the peer node, the frame number and byte offset of a second frame to be supplied to the initiator node is identified at a later time when the frame portion transmitted by the initiator node is received by the peer node, and such information is transmitted to the initiator node. Thus, since the frames output and received by the initiator node are typically of fixed duration, the frame number and byte offset of the incoming frame represent the time when the initiator node outputs the frame portion (a transmit time).
Type: Grant
Filed: March 25, 2010
Date of Patent: September 30, 2014
Assignee: Infinera Corporation
Inventors: Vinod Narippatta, Edward E. Sprague, Ting-Kuang Chiang, Chung Kuang Chin
-
Patent number: 8775744
Abstract: A switching frame buffer is described in which data units within a sequence of time slots, of a frame, may be simultaneously input and output at ports of the switching frame buffer. In one implementation, a write port may receive data units within a single cycle of the switch. A number of memories may be provided, where first selected ones of the memories constitute memory groups and second selected ones of the memories constitute memory subsets, each of the memory groups including a corresponding one of the memory subsets. The write port may supply each of a number of copies of the data units to a corresponding one of the memory subsets. Multiplexers may be associated with the groups of the memories and a read port may receive one of the copies of a number of the data units from different ones of the multiplexers.
Type: Grant
Filed: August 31, 2009
Date of Patent: July 8, 2014
Inventors: Chung Kuang Chin, Shankar Venkataraman, Swaroop Raghunatha
-
Patent number: 8370706
Abstract: An optical device transmits ECC codewords using an interleaved technique in which a single ECC codeword is transmitted over multiple optical links. In one particular implementation, the device may include an ECC circuit configured to supply ECC codewords in series, the codewords being generated by the ECC circuit based on input data and each of the codewords including error correction information and a portion of the data. The device may further include a serial-to-parallel circuit configured to receive each of the codewords in succession, and supply data units in parallel, each of the data units including information from a corresponding one of the codewords; an interleaver circuit to receive the data units in parallel and output second data units in parallel, each of the second data units including bits from different ones of the data units; and a number of output lines, each of which supplies a corresponding one of the second data units.
Type: Grant
Filed: October 2, 2009
Date of Patent: February 5, 2013
Assignee: Infinera Corporation
Inventors: Chung Kuang Chin, Edward E. Sprague, Swaroop Raghunatha
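The interleaving idea can be shown with a simple bit transpose, so that each output lane carries one bit from every input data unit. The transpose pattern is an assumption chosen for clarity; the abstract requires only that each second data unit include bits from different data units. The function names are hypothetical.

```python
def interleave(data_units):
    """Transpose equal-width data units so lane j holds bit j of every unit."""
    if not data_units:
        return []
    width = len(data_units[0])
    if any(len(unit) != width for unit in data_units):
        raise ValueError("all data units must have the same width")
    return [[unit[j] for unit in data_units] for j in range(width)]


def deinterleave(lanes):
    # Transposing the lanes again restores the original data units.
    return interleave(lanes)


units = [[1, 1, 1], [0, 0, 0]]  # two data units, one per codeword
lanes = interleave(units)       # three output lanes, two bits each
print(lanes)
print(deinterleave(lanes) == units)
```

With this arrangement, a fault confined to one lane touches only a single bit of each codeword, which is the usual motivation for spreading a codeword across links.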
-
Patent number: 8300479
Abstract: Consistent with the present disclosure, a plurality of FIFO buffers, for example, are provided in a switch, which also includes a switch fabric. Each of the plurality of FIFOs is pre-filled with data for a duration based on a skew or time difference between the time that a data unit group is supplied to its corresponding FIFO and a reference time. The reference time is the time, for example, after a delay period has lapsed following the leading edge of a synch signal, the timing of which is a known system parameter and is used to trigger switching in the switch fabric. Typically, the delay period may be equal to the latency (often, another known system parameter) or length of time required for the data unit to propagate from an input circuit, such as a line card of the switch or another switch, to the FIFO that receives the data unit. At the reference time, temporally aligned data unit groups may be read or output from each FIFO and supplied to the switch fabric.
Type: Grant
Filed: March 25, 2010
Date of Patent: October 30, 2012
Assignee: Infinera Corporation
Inventors: Chung Kuang Chin, Edward E. Sprague, Prasad Paranjape, Swaroop Raghunatha, Venkat Talapaneni
-
Patent number: 8188894
Abstract: Serial-to-parallel and parallel-to-serial conversion devices may provide for efficient conversion of serial bit streams into parallel data units (and vice versa). In one implementation, a device may include delay circuits, each of which is configured to receive a serial data stream. A rotator circuit may receive the delayed serial data streams and rearrange bits in the serial data streams. Register circuits may receive the output of the rotator circuit and collectively output, in parallel, a number of bits of one of the serial bit streams.
Type: Grant
Filed: June 30, 2009
Date of Patent: May 29, 2012
Assignee: Infinera Corporation
Inventors: Chung Kuang Chin, Prasad Paranjape