Patents by Inventor Yen-Cheng Liu

Yen-Cheng Liu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20160195913
    Abstract: A method and apparatus to monitor architecture events is disclosed. The architecture events are linked together via a push bus mechanism with each architectural event having a designated time slot. There is at least one branch of the push bus in each core. Each branch of the push bus may monitor one core with all the architectural events. All the data collected from the events by the push bus is then sent to a power control unit.
    Type: Application
    Filed: December 17, 2015
    Publication date: July 7, 2016
    Inventors: YEN-CHENG LIU, P. KEONG OR, KRISHNAKANTH V. SISTLA, GANAPATI SRINIVASA
  • Publication number: 20160188474
    Abstract: Methods and apparatus implementing Hardware/Software co-optimization to improve performance and energy for inter-VM communication for NFVs and other producer-consumer workloads. The apparatus include multi-core processors with multi-level cache hierarchies including and L1 and L2 cache for each core and a shared last-level cache (LLC). One or more machine-level instructions are provided for proactively demoting cachelines from lower cache levels to higher cache levels, including demoting cachelines from L1/L2 caches to an LLC. Techniques are also provided for implementing hardware/software co-optimization in multi-socket NUMA architecture system, wherein cachelines may be selectively demoted and pushed to an LLC in a remote socket. In addition, techniques are disclosure for implementing early snooping in multi-socket systems to reduce latency when accessing cachelines on remote sockets.
    Type: Application
    Filed: December 26, 2014
    Publication date: June 30, 2016
    Applicant: Intel Corporation
    Inventors: Ren Wang, Andrew J. Herdrich, Yen-cheng Liu, Herbert H. Hum, Jong Soo Park, Christopher J. Hughes, Namakkal N. Venkatesan, Adrian C. Moga, Aamer Jaleel, Zeshan A. Chishti, Mesut A. Ergin, Jr-shian Tsai, Alexander W. Min, Tsung-yuan C. Tai, Christian Maciocco, Rajesh Sankaran
  • Publication number: 20160179679
    Abstract: A speculative read request is received from a host device over a buffered memory access link for data associated with a particular address. A read request is sent for the data to a memory device. The data is received from the memory device in response to the read request and the received data is sent to the host device as a response to a demand read request received subsequent to the speculative read request.
    Type: Application
    Filed: December 23, 2014
    Publication date: June 23, 2016
    Inventors: Brian S. Morris, Bill Nale, Robert G. Blankenship, Yen-Cheng Liu
  • Patent number: 9367112
    Abstract: A method and apparatus to monitor architecture events is disclosed. The architecture events are linked together via a push bus mechanism with each architectural event having a designated time slot. There is at least one branch of the push bus in each core. Each branch of the push bus may monitor one core with all the architectural events. All the data collected from the events by the push bus is then sent to a power control unit.
    Type: Grant
    Filed: January 16, 2015
    Date of Patent: June 14, 2016
    Assignee: INTEL CORPORATION
    Inventors: Yen-Cheng Liu, P. Keong Or, Krishnakanth V. Sistla, Ganapati Srinivasa
  • Publication number: 20160092366
    Abstract: An apparatus and method are described for distributed snoop filtering. For example, one embodiment of a processor comprises: a plurality of cores to execute instructions and process data; first snoop logic to track a first plurality of cache lines stored in a mid-level cache (“MLC”) accessible by one or more of the cores, the first snoop logic to allocate entries for cache lines stored in the MLC and to deallocate entries for cache lines evicted from the MLC, wherein at least some of the cache lines evicted from the MLC are retained in a level 1 (L1) cache; and second snoop logic to track a second plurality of cache lines stored in a non-inclusive last level cache (NI LLC), the second snoop logic to allocate entries in the NI LLC for cache lines evicted from the MLC and to deallocate entries for cache lines stored in the MLC, wherein the second snoop logic is to store and maintain a first set of core valid bits to identify cores containing copies of the cache lines stored in the NI LLC.
    Type: Application
    Filed: September 26, 2014
    Publication date: March 31, 2016
    Inventors: Rahul PAL, Ishwar AGARWAL, Yen-Cheng LIU, Joseph NUZMAN, Ashok JAGANNATHAN, Bahaa FAHIM, Nithiyanandan BASHYAM
  • Publication number: 20160077970
    Abstract: In accordance with embodiments disclosed herein, there is provided systems and methods for providing a virtual shared cache mechanism. A processing device includes a plurality of clusters allocated into a virtual private shared cache. Each of the clusters includes a plurality of cores and a plurality of cache slices co-located within the plurality of cores. The processing device also includes a virtual shared cache including the plurality of clusters such that the cache data in the plurality of cache slices is shared among the plurality of clusters.
    Type: Application
    Filed: September 12, 2014
    Publication date: March 17, 2016
    Inventors: Yen-Cheng Liu, Aamer Jaleel, Bongjin Jung, Zeshan A. Chishti, Adrian C. Moga, Eric Delano, Ren Wang
  • Patent number: 9288260
    Abstract: Methods, apparatus and systems for facilitating one-way ordering of otherwise independent message classes. A one-way message ordering mechanism facilitates one-way ordering of messages of different message classes sent between interconnects employing independent pathways for the message classes. In one aspect, messages of a second message class may not pass messages of a first message class. Moreover, when messages of the first and second classes are received in sequence, the ordering mechanism ensures that messages of the first class are forwarded to, and received at, a next hop prior to forwarding messages of the second class.
    Type: Grant
    Filed: October 8, 2013
    Date of Patent: March 15, 2016
    Assignee: Intel Corporation
    Inventors: James R. Vash, Vida Vakilotojar, Bongjin Jung, Yen-Cheng Liu
  • Publication number: 20160012010
    Abstract: In an embodiment, an apparatus comprises: a first component to perform coherent operations; and a coherent fabric logic coupled to the first component via a first component interface. The coherent fabric logic may be configured to perform full coherent fabric functionality for coherent communications between the first component and a second component coupled to the coherent fabric logic. The first component may include a packetization logic to communicate packets with the coherent fabric logic, but not include coherent interconnect interface logic to perform coherent fabric functionality. Other embodiments are described and claimed.
    Type: Application
    Filed: May 8, 2015
    Publication date: January 14, 2016
    Inventors: Krishnakumar GANAPATHY, Yen-Cheng LIU, Antonio JUAN, Steven R. PAGE, Jeffrey D. CHAMBERLAIN, Pau CABRE, Bahaa FAHIM, Gunnar GAUBATZ
  • Patent number: 9189296
    Abstract: Disclosed herein is a caching agent for preventing deadlock in a processor. The caching agent includes a receiver configured to receive a request from a core of the processor. The caching agent includes ingress logic coupled to the receiver to determine that the request is potentially a cacheable request. The ingress logic is to determine that the request does not deplete an available coherence resource. The ingress logic is to allow the request to be processed in response to the determination that the request does not deplete the available coherence resource.
    Type: Grant
    Filed: December 27, 2013
    Date of Patent: November 17, 2015
    Assignee: Intel Corporation
    Inventors: Bahaa Fahim, Jeffrey Chamberlain, Yen-Cheng Liu
  • Publication number: 20150269104
    Abstract: Methods, systems, and apparatus for implementing low latency interconnect switches between CPU's and associated protocols. CPU's are configured to be installed on a main board including multiple CPU sockets linked in communication via CPU socket-to-socket interconnect links forming a CPU socket-to-socket ring interconnect. The CPU's are also configured to transfer data between one another by sending data via the CPU socket-to-socket interconnects. Data may be transferred using a packetized protocol, such as QPI, and the CPU's may also be configured to support coherent memory transactions across CPU's.
    Type: Application
    Filed: November 29, 2011
    Publication date: September 24, 2015
    Inventors: Robert G. Blankenship, Geeyarpuram N. Santhanakrishnan, Yen-Cheng Liu, Bahaa Fahim, Ganapati N. Srinivasa
  • Publication number: 20150186275
    Abstract: A processor is described that includes one or more processing cores. The processing core includes a memory controller to interface with a system memory having a near memory and a far memory. The processing core includes a plurality of caching levels above the memory controller. The processor includes logic circuitry to track state information of a cache line that is cached in one of the caching levels. The state information including a selected one of an inclusive state and a non inclusive state. The inclusive state indicates that a copy or version of the cache line exists in near memory. The non inclusive states indicates that a copy or version of the cache line does not exist in the near memory. The logic circuitry is to cause the memory controller to handle a write request that requests a direct write into the near memory without a read of the near memory beforehand if a system memory write request generated within the processor targets the cache line when the cache line is in the inclusive state.
    Type: Application
    Filed: December 27, 2013
    Publication date: July 2, 2015
    Inventors: Adrian C. Moga, Vedaraman Geetha, Bahaa Fahim, Robert G. Blankenship, Yen-Cheng Liu, Jeffrey D. Chamberlain, Stephen R. Van Doren
  • Publication number: 20150186191
    Abstract: Disclosed herein is a caching agent for preventing deadlock in a processor. The caching agent includes a receiver configured to receive a request from a core of the processor. The caching agent includes ingress logic coupled to the receiver to determine that the request is potentially a cacheable request. The ingress logic is to determine that the request does not deplete an available coherence resource. The ingress logic is to allow the request to be processed in response to the determination that the request does not deplete the available coherence resource.
    Type: Application
    Filed: December 27, 2013
    Publication date: July 2, 2015
    Applicant: INTEL CORPORATION
    Inventors: Bahaa Fahim, Jeffrey Chamberlain, Yen-Cheng Liu
  • Publication number: 20150178202
    Abstract: An apparatus and method are described for performing a cache line write back operation. For example, one embodiment of a method comprises: initiating a cache line write back operation directed to a particular linear address; determining if a dirty cache line identified by the linear address exists at any cache of a cache hierarchy comprised of a plurality of cache levels; writing back the dirty cache line to memory if the dirty cache line exists in one of the caches; and responsively maintaining or placing the dirty cache line in an exclusive state in at least a first cache of the hierarchy.
    Type: Application
    Filed: December 20, 2013
    Publication date: June 25, 2015
    Inventors: Rajesh M. Sankaran, Neil M. Schaper, Joseph Nuzman, Larisa Novakovsky, Yen-Cheng Liu, Gilbert Neiger, Raj K. Ramanujan
  • Publication number: 20150178206
    Abstract: An apparatus and method for reducing or eliminating writeback operations. For example, one embodiment of a method comprises: detecting a first operation associated with a cache line at a first requestor cache; detecting that the cache line exists in a first cache in a modified (M) state; forwarding the cache line from the first cache to the first requestor cache and storing the cache line in the first requestor cache in a second modified (M?) state; detecting a second operation associated with the cache line at a second requestor; responsively forwarding the cache line from the first requestor cache to the second requestor cache and storing the cache line in the second requestor cache in an owned (O) state if the cache line has not been modified in the first requestor cache; and setting the cache line to a shared (S) state in the first requestor cache.
    Type: Application
    Filed: December 20, 2013
    Publication date: June 25, 2015
    Inventors: Jeffrey D. Chamberlain, Vedaraman Geetha, Robert G. Blankenship, Yen-Cheng Liu, Adrian C. Moga, Herbert H. Hum, Sailesh Kottapalli
  • Publication number: 20150143051
    Abstract: In one embodiment, the present invention includes a multicore processor having a plurality of cores, a shared cache memory, an integrated input/output (IIO) module to interface between the multicore processor and at least one IO device coupled to the multicore processor, and a caching agent to perform cache coherency operations for the plurality of cores and the IIO module. Other embodiments are described and claimed.
    Type: Application
    Filed: January 30, 2015
    Publication date: May 21, 2015
    Inventors: Yen-Cheng Liu, Robert G. Blankenship, Geeyarpuram N. Santhanakrishnan, Ganapati N. Srinivasa, Kenneth C. Creta, Sridhar Muthrasanallur, Bahaa Fahim
  • Publication number: 20150131580
    Abstract: Efficient algorithms for estimating LSFCs with no aid of SSFCs by taking advantage of the channel hardening effect and large spatial samples available to a massive MIMO base station (BS) are proposed. The LSFC estimates are of low computational complexity and require relatively small training overhead. In the uplink direction, mobile stations (MSs) transmit orthogonal uplink pilots for the serving BS to estimate LSFCs. In the downlink direction, the BS transmits either pilot signal or data signal intended to the MSs that have already established time and frequency synchronization. The proposed uplink and downlink LSFC estimators are unbiased and asymptotically optimal as the number of BS antennas tends to infinity.
    Type: Application
    Filed: November 12, 2014
    Publication date: May 14, 2015
    Inventors: Yen-Cheng Liu, Ko-Feng Chen, York Ted Su
  • Publication number: 20150127962
    Abstract: A method and apparatus to monitor architecture events is disclosed. The architecture events are linked together via a push bus mechanism with each architectural event having a designated time slot. There is at least one branch of the push bus in each core. Each branch of the push bus may monitor one core with all the architectural events. All the data collected from the events by the push bus is then sent to a power control unit.
    Type: Application
    Filed: January 16, 2015
    Publication date: May 7, 2015
    Inventors: YEN-CHENG LIU, P. KEONG OR, KRISHNAKANTH V. SISTLA, GANAPATI SRINIVASA
  • Publication number: 20150127907
    Abstract: In an embodiment, a processor includes one or more cores, and a distributed caching home agent (including portions associated with each core). Each portion includes a cache controller to receive a read request for data and, responsive to the data not being present in a cache memory associated with the cache controller, to issue a memory request to a memory controller to request the data in parallel with communication of the memory request to a home agent, where the home agent is to receive the memory request from the cache controller and to reserve an entry for the memory request. Other embodiments are described and claimed.
    Type: Application
    Filed: November 4, 2013
    Publication date: May 7, 2015
    Inventors: Bahaa Fahim, Samuel D. Strom, Vedaraman Geetha, Robert G. Blankenship, Yen-Cheng Liu, Krishnakumar Ganapathy, Cesar Maldonado
  • Publication number: 20150128142
    Abstract: A starvation mode is entered and a particular dependency of a first request in a retry queue is identified. The particular dependency is determined to be acquired and the first request is retried based on acquisition of the particular dependency.
    Type: Application
    Filed: November 6, 2013
    Publication date: May 7, 2015
    Inventors: Bahaa Fahim, Yen-Cheng Liu, Jeffrey D. Chamberlain
  • Publication number: 20150095580
    Abstract: A processor includes a cache-side address monitor unit corresponding to a first cache portion of a distributed cache that has a total number of cache-side address monitor storage locations less than a total number of logical processors of the processor. Each cache-side address monitor storage location is to store an address to be monitored. A core-side address monitor unit corresponds to a first core and has a same number of core-side address monitor storage locations as a number of logical processors of the first core. Each core-side address monitor storage location is to store an address, and a monitor state for a different corresponding logical processor of the first core. A cache-side address monitor storage overflow unit corresponds to the first cache portion, and is to enforce an address monitor storage overflow policy when no unused cache-side address monitor storage location is available to store an address to be monitored.
    Type: Application
    Filed: September 27, 2013
    Publication date: April 2, 2015
    Inventors: Yen-Cheng Liu, Bahaa Fahim, Erik G. Hallnor, Jeffrey D. Chamberlain, Stephen R. Van Doren, Antonio Juan