Patents by Inventor Yipeng Wang

Yipeng Wang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20200081835
    Abstract: An apparatus and method for prioritizing transactional memory regions. For example, one embodiment of a processor comprises: a plurality of cores to execute threads comprising sequences of instructions, at least some of the instructions specifying a transactional memory region; a cache of each core to store a plurality of cache lines; transactional memory circuitry of each core to manage execution of the transactional memory (TM) regions based on priorities associated with each of the TM regions; and wherein the transactional memory circuitry, upon detecting a conflict between a first TM region having a first priority value and a second TM region having a second priority value, is to determine which of the first TM region or the second TM region is permitted to continue executing and which is to be aborted based, at least in part, on the first and second priority values.
    Type: Application
    Filed: September 10, 2018
    Publication date: March 12, 2020
    Inventors: Ren Wang, Raanan Sade, Yipeng Wang, Tsung-Yuan Tai, Sameh Gobriel
  • Publication number: 20200042479
    Abstract: Apparatus and methods implementing a hardware queue management device for reducing inter-core data transfer overhead by offloading request management and data coherency tasks from the CPU cores. The apparatus include multi-core processors, a shared L3 or last-level cache (“LLC”), and a hardware queue management device to receive, store, and process inter-core data transfer requests. The hardware queue management device further comprises a resource management system to control the rate in which the cores may submit requests to reduce core stalls and dropped requests. Additionally, software instructions are introduced to optimize communication between the cores and the queue management device.
    Type: Application
    Filed: October 14, 2019
    Publication date: February 6, 2020
    Applicant: Intel Corporation
    Inventors: Ren Wang, Yipeng Wang, Andrew Herdrich, Jr-Shian Tsai, Tsung-Yuan C. Tai, Niall D. McDonnell, Hugh Wilkinson, Bradley A. Burres, Bruce Richardson, Namakkal N. Venkatesan, Debra Bernstein, Edwin Verplanke, Stephen R. Van Doren, An Yan, Andrew Cunningham, David Sonnier, Gage Eads, James T. Clee, Jamison D. Whitesell, Jerry Pirog, Jonathan Kenny, Joseph R. Hasting, Narender Vangati, Stephen Miller, Te K. Ma, William Burroughs
  • Patent number: 10530375
    Abstract: A frequency divider circuit (200) includes a frequency sub-divider (201) to provide a frequency divided clock, a delay circuit (250) configured to delay the frequency divided clock by N+0.5 cycles of the input clock to generate a delayed clock, and an output circuit (202) configured to generate an output clock based on the frequency divided clock and the delayed clock, where the output clock has a frequency that is equal to 1/(N+0.5) times a frequency of the input clock, and N is an integer greater than one.
    Type: Grant
    Filed: September 5, 2018
    Date of Patent: January 7, 2020
    Assignee: Xilinx, Inc.
    Inventors: Yipeng Wang, Kee Hian Tan, Stanley Y. Chen, Yohan Frans
  • Patent number: 10445271
    Abstract: Apparatus and methods implementing a hardware queue management device for reducing inter-core data transfer overhead by offloading request management and data coherency tasks from the CPU cores. The apparatus include multi-core processors, a shared L3 or last-level cache (“LLC”), and a hardware queue management device to receive, store, and process inter-core data transfer requests. The hardware queue management device further comprises a resource management system to control the rate in which the cores may submit requests to reduce core stalls and dropped requests. Additionally, software instructions are introduced to optimize communication between the cores and the queue management device.
    Type: Grant
    Filed: January 4, 2016
    Date of Patent: October 15, 2019
    Assignee: Intel Corporation
    Inventors: Ren Wang, Namakkal N. Venkatesan, Debra Bernstein, Edwin Verplanke, Stephen R. Van Doren, An Yan, Andrew Cunningham, David Sonnier, Gage Eads, James T. Clee, Jamison D. Whitesell, Yipeng Wang, Jerry Pirog, Jonathan Kenny, Joseph R. Hasting, Narender Vangati, Stephen Miller, Te K. Ma, William Burroughs, Andrew J. Herdrich, Jr-Shian Tsai, Tsung-Yuan C. Tai, Niall D. McDonnell, Hugh Wilkinson, Bradley A. Burres, Bruce Richardson
  • Patent number: 10445118
    Abstract: Methods, apparatus, systems, and articles of manufacture to facilitate field-programmable gate array support during runtime execution of computer readable instructions are disclosed herein. An example apparatus includes a compiler to, prior to runtime, compile a block of code written as high level source code into a first hardware bitstream kernel and a second hardware bitstream kernel; a kernel selector to select the first hardware bitstream kernel based on an attribute to be dispatched during runtime; a dispatcher to dispatch the first hardware bitstream kernel to a field programmable gate array (FPGA) during runtime; and the kernel selector to, when an FPGA attribute does not satisfy a threshold during runtime, adjust the selection of the first hardware bitstream kernel to the second hardware bitstream kernel to be dispatched during runtime.
    Type: Grant
    Filed: September 22, 2017
    Date of Patent: October 15, 2019
    Assignee: Intel Corporation
    Inventors: Xiangyang Guo, Simonjit Dutta, Han Lee, Yipeng Wang
  • Publication number: 20190102346
    Abstract: A central processing unit can offload table lookup or tree traversal to an offload engine. The offload engine can provide hardware accelerated operations such as instruction queueing, bit masking, hashing functions, data comparisons, a results queue, and progress tracking. The offload engine can be associated with a last level cache. In the case of a hash table lookup, the offload engine can apply a hashing function to a key to generate a signature, apply a comparator to compare signatures against the generated signature, retrieve a key associated with the signature, and apply the comparator to compare the key against the retrieved key. Accordingly, a data pointer associated with the key can be provided in the result queue. Acceleration of operations in tree traversal and tuple search can also occur.
    Type: Application
    Filed: November 30, 2018
    Publication date: April 4, 2019
    Inventors: Ren Wang, Andrew J. Herdrich, Tsung-Yuan C. Tai, Yipeng Wang, Raghu Kondapalli, Alexander Bachmutsky, Yifan Yuan
  • Publication number: 20190095229
    Abstract: Methods, apparatus, systems, and articles of manufacture to facilitate field-programmable gate array support during runtime execution of computer readable instructions are disclosed herein. An example apparatus includes a compiler to, prior to runtime, compile a block of code written as high level source code into a first hardware bitstream kernel and a second hardware bitstream kernel; a kernel selector to select the first hardware bitstream kernel based on an attribute to be dispatched during runtime; a dispatcher to dispatch the first hardware bitstream kernel to a field programmable gate array (FPGA) during runtime; and the kernel selector to, when an FPGA attribute does not satisfy a threshold during runtime, adjust the selection of the first hardware bitstream kernel to the second hardware bitstream kernel to be dispatched during runtime.
    Type: Application
    Filed: September 22, 2017
    Publication date: March 28, 2019
    Inventors: Xiangyang Guo, Simonjit Dutta, Han Lee, Yipeng Wang
  • Patent number: 10216668
    Abstract: Technologies for a distributed hardware queue manager include a compute device having a processor. The processor includes two or more hardware queue managers as well as two or more processor cores. Each processor core can enqueue or dequeue data from the hardware queue manager. Each hardware queue manager can be configured to contain several queue data structures. In some embodiments, the queues are addressed by the processor cores using virtual queue addresses, which are translated into physical queue addresses for accessing the corresponding hardware queue manager. The virtual queues can be moved from one physical queue in one hardware queue manager to a different physical queue in a different physical queue manager without changing the virtual address of the virtual queue.
    Type: Grant
    Filed: March 31, 2016
    Date of Patent: February 26, 2019
    Assignee: Intel Corporation
    Inventors: Ren Wang, Yipeng Wang, Jr-Shian Tsai, Andrew Herdrich, Tsung-Yuan Tai, Niall McDonnell, Stephen Van Doren, David Sonnier, Debra Bernstein, Hugh Wilkinson, Narender Vangati, Stephen Miller, Gage Eads, Andrew Cunningham, Jonathan Kenny, Bruce Richardson, William Burroughs, Joseph Hasting, An Yan, James Clee, Te Ma, Jerry Pirog, Jamison Whitesell
  • Publication number: 20190052719
    Abstract: Technologies for flow rule aware exact match cache compression include multiple computing devices in communication over a network. A computing device reads a network packet from a network port and extracts one or more key fields from the packet to generate a lookup key. The key fields are identified by a key field specification of an exact match flow cache. The computing device may dynamically configure the key field specification based on an active flow rule set. The computing device may compress the key field specification to match a union of non-wildcard fields of the active flow rule set. The computing device may expand the key field specification in response to insertion of a new flow rule. The computing device looks up the lookup key in the exact match flow cache and, if a match is found, applies the corresponding action. Other embodiments are described and claimed.
    Type: Application
    Filed: January 4, 2018
    Publication date: February 14, 2019
    Inventors: Yipeng Wang, Ren Wang, Antonio Fischetti, Sameh Gobriel, Tsung-Yuan C. Tai
  • Publication number: 20190042602
    Abstract: Techniques and apparatus for dynamic data access mode processes are described. In one embodiment, for example, an apparatus may include a processor, at least one memory coupled to the processor, the at least one memory comprising an indication of a database and instructions, the instructions, when executed by the processor, to cause the processor to determine a database utilization value for a database, perform a comparison of the database utilization value to at least one utilization threshold, and set an active data access mode to one of a low-utilization data access mode or a high-utilization data access mode based on the comparison. Other embodiments are described.
    Type: Application
    Filed: August 20, 2018
    Publication date: February 7, 2019
    Inventors: Ren Wang, Bruce Richardson, Tsung-Yuan Tai, Yipeng Wang, Pablo De Lara Guarch
  • Publication number: 20190042471
    Abstract: Technologies for least recently used (LRU) cache replacement include a computing device with a processor with vector instruction support. The computing device retrieves a bucket of an associative cache from memory that includes multiple entries arranged from front to back. The bucket may be a 256-bit array including eight 32-bit entries. For lookups, a matching entry is located at a position in the bucket. The computing device executes a vector permutation processor instruction that moves the matching entry to the front of the bucket while preserving the order of other entries of the bucket. For insertion, an inserted entry is written at the back of the bucket. The computing device executes a vector permutation processor instruction that moves the inserted entry to the front of the bucket while preserving the order of other entries. The permuted bucket is stored to the memory. Other embodiments are described and claimed.
    Type: Application
    Filed: August 9, 2018
    Publication date: February 7, 2019
    Inventors: Ren Wang, Yipeng Wang, Tsung-Yuan Tai, Cristian Florin Dumitrescu, Xiangyang Guo
  • Publication number: 20190044869
    Abstract: Technologies for classifying network flows using adaptive virtual routing include a network appliance with one or more processors. The network appliance is configured to identify a set of candidate classification algorithms from a plurality of classification algorithm designs to perform a flow classification operation and deploy each of the candidate classification algorithms to a processor. Additionally, the network appliance is configured to monitor a performance level of each of the deployed candidate classification algorithms and identify a candidate classification algorithm of the deployed candidate classification algorithms with the highest performance level. The network appliance is further configured to deploy the identified candidate classification algorithm with the highest performance level on each of the one or more processors that are configured to perform the flow classification operation. Other embodiments are described herein.
    Type: Application
    Filed: August 17, 2018
    Publication date: February 7, 2019
    Inventors: Yipeng Wang, Ren Wang, Janet Tseng, Jr-Shian Tsai, Tsung-Yuan Tai
  • Publication number: 20190004709
    Abstract: Examples may include techniques to control an insertion ratio or rate for a cache. Examples include comparing cache miss ratios for different time intervals or windows for a cache to determine whether to adjust a cache insertion ratio that is based on a ratio of cache misses to cache insertions.
    Type: Application
    Filed: June 30, 2017
    Publication date: January 3, 2019
    Inventors: Yipeng Wang, Ren Wang, Sameh Gobriel, Tsung-Yuan Charlie Tai
  • Publication number: 20180205653
    Abstract: Apparatus, methods, and systems for tuple space search-based flow classification using cuckoo hash tables and unmasked packet headers are described herein. A device can communicate with one or more hardware switches. The device can include memory to store hash table entries of a hash table. The device can include processing circuitry to perform a hash lookup in the hash table. The lookup can be based on an unmasked key included in a packet header corresponding to a received data packet. The processing circuitry can retrieve an index pointing to a sub-table, the sub-table including a set of rules for handling the data packet. Other embodiments are also described.
    Type: Application
    Filed: June 29, 2017
    Publication date: July 19, 2018
    Inventors: Ren Wang, Tsung-Yuan C. Tai, Yipeng Wang, Sameh Gobriel
  • Patent number: 9846627
    Abstract: Systems and methods for modeling memory access behavior and memory traffic timing behavior are disclosed. According to an aspect, a method includes receiving data indicative of memory access behavior resulting from instructions executed on a processor. The method also includes determining a statistical profile of the memory access behavior, the profile including tuple statistics of memory access behavior. Further, the method includes generating a clone of the executed instructions based on the statistical profile for use in simulating the memory access behavior.
    Type: Grant
    Filed: February 15, 2016
    Date of Patent: December 19, 2017
    Assignee: North Carolina State University
    Inventors: Yan Solihin, Yipeng Wang, Amro Awad
  • Publication number: 20170286114
    Abstract: A processor of an aspect includes a decode unit to decode memory access instructions of a first type and to output corresponding memory access operations, and to decode memory access instructions of a second type and to output corresponding memory access operations. The processor also includes a load store queue coupled with the decode unit. The load store queue includes a load buffer that is to have a plurality of load buffer entries, and a store buffer that is to have a plurality of store buffer entries. The load store queue also includes a buffer entry allocation controller coupled with the load buffer and coupled with the store buffer. The buffer entry allocation controller is to allocate load and store buffer entries based at least in part on whether memory access operations correspond to memory access instructions of the first type or of the second type. Other processors, methods, and systems, are also disclosed.
    Type: Application
    Filed: April 2, 2016
    Publication date: October 5, 2017
    Applicant: Intel Corporation
    Inventors: Andrew J. Herdrich, Yipeng Wang, Ren Wang, Tsung-Yuan Charles Tai, Jr-Shian Tsai
  • Publication number: 20170286337
    Abstract: Technologies for a distributed hardware queue manager include a compute device having a processor. The processor includes two or more hardware queue managers as well as two or more processor cores. Each processor core can enqueue or dequeue data from the hardware queue manager. Each hardware queue manager can be configured to contain several queue data structures. In some embodiments, the queues are addressed by the processor cores using virtual queue addresses, which are translated into physical queue addresses for accessing the corresponding hardware queue manager. The virtual queues can be moved from one physical queue in one hardware queue manager to a different physical queue in a different physical queue manager without changing the virtual address of the virtual queue.
    Type: Application
    Filed: March 31, 2016
    Publication date: October 5, 2017
    Inventors: Ren Wang, Yipeng Wang, Jr-Shian Tsai, Andrew Herdrich, Tsung-Yuan Tai, Niall McDonnell, Stephen Van Doren, David Sonnier, Debra Bernstein, Hugh Wilkinson, Narender Vangati, Stephen Miller, Gage Eads, Andrew Cunningham, Jonathan Kenny, Bruce Richardson, William Burroughs, Joseph Hasting, An Yan, James Clee, Te Ma, Jerry Pirog, Jamison Whitesell
  • Publication number: 20170192921
    Abstract: Apparatus and methods implementing a hardware queue management device for reducing inter-core data transfer overhead by offloading request management and data coherency tasks from the CPU cores. The apparatus include multi-core processors, a shared L3 or last-level cache (“LLC”), and a hardware queue management device to receive, store, and process inter-core data transfer requests. The hardware queue management device further comprises a resource management system to control the rate in which the cores may submit requests to reduce core stalls and dropped requests. Additionally, software instructions are introduced to optimize communication between the cores and the queue management device.
    Type: Application
    Filed: January 4, 2016
    Publication date: July 6, 2017
    Inventors: Ren Wang, Yipeng Wang, Andrew J. Herdrich, Jr-Shian Tsai, Tsung-Yuan C. Tai, Niall D. McDonnell, Hugh Wilkinson, Bradley A. Burres, Bruce Richardson, Namakkal N. Venkatesan, Debra Bernstein, Edwin Verplanke, Stephen R. Van Doren, An Yan, Andrew Cunningham, David Sonnier, Gage Eads, James T. Clee, Jamison D. Whitesell, Jerry Pirog, Jonathan Kenny, Joseph R. Hasting, Narender Vangati, Stephen Miller, Te K. Ma, William Burroughs
  • Publication number: 20160239212
    Abstract: Systems and methods for modeling memory access behavior and memory traffic timing behavior are disclosed. According to an aspect, a method includes receiving data indicative of memory access behavior resulting from instructions executed on a processor. The method also includes determining a statistical profile of the memory access behavior, the profile including tuple statistics of memory access behavior. Further, the method includes generating a clone of the executed instructions based on the statistical profile for use in simulating the memory access behavior.
    Type: Application
    Filed: February 15, 2016
    Publication date: August 18, 2016
    Inventors: Yan Solihin, Yipeng Wang, Amro Awad
  • Patent number: 8709753
    Abstract: This invention is metabolically engineered bacterial strains that provide increased intracellular NADPH availability for the purpose of increasing the yield and productivity of NADPH-dependent compounds. In the invention, native NAD-dependent GAPDH is replaced with NADP-dependent GAPDH plus overexpressed NADK. Uses for the bacteria are also provided.
    Type: Grant
    Filed: November 19, 2012
    Date of Patent: April 29, 2014
    Assignee: William Marsh Rice University
    Inventors: Ka-Yiu San, George N. Bennett, Yipeng Wang
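
Illustrative note on publication 20190042471 (LRU cache replacement): the abstract describes a bucket of eight entries in which a matching or newly inserted entry is moved to the front by a single vector permutation while the other entries keep their order. The Python sketch below models only that reordering, using a plain list in place of a 256-bit vector register. The function names and the list-based permute are assumptions made for illustration; they are not taken from the patent application.

```python
# Illustrative model only: a Python list stands in for the 256-bit bucket,
# and permute() stands in for a vector permutation instruction.

BUCKET_SIZE = 8  # eight 32-bit entries, as stated in the abstract


def move_to_front_indices(pos, size=BUCKET_SIZE):
    """Return permutation indices that bring the entry at `pos` to the front.

    Output slot i takes its value from input slot indices[i]; every entry
    other than `pos` keeps its relative order.
    """
    if not 0 <= pos < size:
        raise ValueError("position out of range")
    return [pos] + [i for i in range(size) if i != pos]


def permute(bucket, indices):
    """Software stand-in for a vector permutation: out[i] = bucket[indices[i]]."""
    return [bucket[i] for i in indices]


def lookup(bucket, key):
    """On a hit, promote the matching entry to the front; return (hit, bucket)."""
    if key not in bucket:
        return False, bucket
    pos = bucket.index(key)
    return True, permute(bucket, move_to_front_indices(pos, len(bucket)))


def insert(bucket, key):
    """Write the new entry at the back (the LRU slot), then promote it to the front."""
    updated = bucket[:-1] + [key]
    back = len(updated) - 1
    return permute(updated, move_to_front_indices(back, len(updated)))
```

In this model the front of the list is the most recently used entry and the back is the eviction candidate, so one permutation per lookup or insertion keeps the bucket in recency order.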