Patents by Inventor Tsung-Yuan Tai
Tsung-Yuan Tai has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20230114263Abstract: Systems, apparatuses and methods may provide for technology that uses centralized hardware to detect a local allocation request associated with a local thread, detect a remote allocation request associated with a remote thread, wherein the remote allocation request bypasses a remote operating system, and process the local allocation request and the remote allocation request with respect to central heap, wherein the central heap is shared by the local thread and the remote thread. The local allocation request and the remote allocation request may include one or more of a first request to allocate a memory block of a specified size, a second request to allocate multiple memory blocks of a same size, a third request to resize a previously allocated memory block, or a fourth request to deallocate the previously allocated memory block.Type: ApplicationFiled: December 13, 2022Publication date: April 13, 2023Inventors: Ren Wang, Poonam Shidlyali, Tsung-Yuan Tai
-
Patent number: 11601531Abstract: One embodiment provides a network system. The network system includes an application layer to execute one or more networking applications to generate or receive data packets having flow identification (ID) information; and a packet processing layer having profiling circuitry to generate a sketch table indicative of packet flow count data; the sketch table having a plurality of buckets, each bucket includes a first section including a plurality of data fields, each data field of the first section to store flow ID and packet count data, each bucket also having a second section having a plurality of data fields, each data field of the second section to store packet count data.Type: GrantFiled: December 3, 2019Date of Patent: March 7, 2023Assignee: Intel CorporationInventors: Ren Wang, Yipeng Wang, Tsung-Yuan Tai
-
Patent number: 11500825Abstract: Techniques and apparatus for dynamic data access mode processes are described. In one embodiment, for example, an apparatus may a processor, at least one memory coupled to the processor, the at least one memory comprising an indication of a database and instructions, the instructions, when executed by the processor, to cause the processor to determine a database utilization value for a database, perform a comparison of the database utilization value to at least one utilization threshold, and set an active data access mode to one of a low-utilization data access mode or a high-utilization data access mode based on the comparison. Other embodiments are described.Type: GrantFiled: August 20, 2018Date of Patent: November 15, 2022Assignee: INTEL CORPORATIONInventors: Ren Wang, Bruce Richardson, Tsung-Yuan Tai, Yipeng Wang, Pablo De Lara Guarch
-
Publication number: 20220326999Abstract: Apparatuses, methods, and systems for dynamic resource allocation based on quality-of-service prediction are disclosed. In embodiments, an apparatus includes quality-of-service prediction circuitry and a resource controller. The quality-of-service prediction circuitry is to make quality-of-service predictions using a model based at least in part on at least one performance counter measurements and at least one quality-of-service measurement. The resource controller is to allocate one or more shared resources based on the quality-of-service predictions and architectural performance counter measurements.Type: ApplicationFiled: June 29, 2022Publication date: October 13, 2022Inventors: Drew Penney, Bin Li, Tsung-Yuan Tai, Anna Drewek-Ossowicka, Rameshkumar Illikkal, Andrew J. Herdrich, Jaroslaw Sydir
-
Publication number: 20210406147Abstract: An apparatus and method for closed loop dynamic resource allocation.Type: ApplicationFiled: June 27, 2020Publication date: December 30, 2021Inventors: BIN LI, REN WANG, KSHITIJ ARUN DOSHI, FRANCESC GUIM BERNAT, YIPENG WANG, RAVISHANKAR IYER, ANDREW HERDRICH, TSUNG-YUAN TAI, ZHU ZHOU, RASIKA SUBRAMANIAN
-
Publication number: 20210110269Abstract: Neural network dense layer sparsification and matrix compression is disclosed. An example of an apparatus includes one or more processors; a memory to store data for processing, including data for processing of a deep neural network (DNN) including one or more layers, each layer including a plurality of neurons, the one or more processors to perform one or both of sparsification of one or more layers of the DNN, including selecting a subset of the plurality of neurons of a first layer of the DNN for activation based at least in part on locality sensitive hashing of inputs to the first layer; or compression of a weight or activation matrix of one or more layers of the DNN, including detection of sparsity patterns in a matrix of the first layer of the DNN based at least in part on locality sensitive hashing of patterns in the matrix.Type: ApplicationFiled: December 21, 2020Publication date: April 15, 2021Applicant: Intel CorporationInventors: Sameh Gobriel, Jesmin Jahari Tithi, Tsung-Yuan Tai
-
Patent number: 10789176Abstract: Technologies for least recently used (LRU) cache replacement include a computing device with a processor with vector instruction support. The computing device retrieves a bucket of an associative cache from memory that includes multiple entries arranged from front to back. The bucket may be a 256-bit array including eight 32-bit entries. For lookups, a matching entry is located at a position in the bucket. The computing device executes a vector permutation processor instruction that moves the matching entry to the front of the bucket while preserving the order of other entries of the bucket. For insertion, an inserted entry is written at the back of the bucket. The computing device executes a vector permutation processor instruction that moves the inserted entry to the front of the bucket while preserving the order of other entries. The permuted bucket is stored to the memory. Other embodiments are described and claimed.Type: GrantFiled: August 9, 2018Date of Patent: September 29, 2020Assignee: Intel CorporationInventors: Ren Wang, Yipeng Wang, Tsung-Yuan Tai, Cristian Florin Dumitrescu, Xiangyang Guo
-
Patent number: 10719442Abstract: An apparatus and method for prioritizing transactional memory regions. For example, one embodiment of a processor comprises: a plurality of cores to execute threads comprising sequences of instructions, at least some of the instructions specifying a transactional memory region; a cache of each core to store a plurality of cache lines; transactional memory circuitry of each core to manage execution of the transactional memory (TM) regions based on priorities associated with each of the TM regions; and wherein the transactional memory circuitry, upon detecting a conflict between a first TM region having a first priority value and a second TM region having a second priority value, is to determine which of the first TM region or the second TM region is permitted to continue executing and which is to be aborted based, at least in part, on the first and second priority values.Type: GrantFiled: September 10, 2018Date of Patent: July 21, 2020Assignee: Intel CorporationInventors: Ren Wang, Raanan Sade, Yipeng Wang, Tsung-Yuan Tai, Sameh Gobriel
-
Patent number: 10623311Abstract: Technologies for distributed table lookup via a distributed router includes an ingress computing node, an intermediate computing node, and an egress computing node. Each computing node of the distributed router includes a forwarding table to store a different set of network routing entries obtained from a routing table of the distributed router. The ingress computing node generates a hash key based on the destination address included in a received network packet. The hash key identifies the intermediate computing node of the distributed router that stores the forwarding table that includes a network routing entry corresponding to the destination address. The ingress computing node forwards the received network packet to the intermediate computing node for routing. The intermediate computing node receives the forwarded network packet, determines a destination address of the network packet, and determines the egress computing node for transmission of the network packet from the distributed router.Type: GrantFiled: September 27, 2017Date of Patent: April 14, 2020Assignee: Intel CorporationInventors: Sameh Gobriel, Ren Wang, Christian Maciocco, Tsung-Yuan Tai
-
Publication number: 20200106867Abstract: One embodiment provides a network system. The network system includes an application layer to execute one or more networking applications to generate or receive data packets having flow identification (ID) information; and a packet processing layer having profiling circuitry to generate a sketch table indicative of packet flow count data; the sketch table having a plurality of buckets, each bucket includes a first section including a plurality of data fields, each data field of the first section to store flow ID and packet count data, each bucket also having a second section having a plurality of data fields, each data field of the second section to store packet count data.Type: ApplicationFiled: December 3, 2019Publication date: April 2, 2020Applicant: Intel CorporationInventors: Ren Wang, Yipeng Wang, Tsung-Yuan Tai
-
SYSTEM, METHOD, AND APPARATUS FOR SNAPSHOT PREFETCHING TO IMPROVE PERFORMANCE OF SNAPSHOT OPERATIONS
Publication number: 20200104259Abstract: A snapshot prefetcher to perform snapshot prefetching to improve performance of snapshot read operations. An apparatus embodiment includes a snapshot read tracking circuitry to track snapshot read requests made by a first processor core to read a plurality of cache lines, and to detect a snapshot read access stream based on the tracked snapshot read requests. A snapshot prefetch issuing circuitry of the apparatus to issue, based on the detected snapshot read access stream, one or more snapshot prefetch requests, including a first snapshot prefetch request to prefetch data from a first cache line stored in, and owned exclusively by, a first storage location outside the first processor core. The snapshot prefetch issuing circuitry further to store the prefetched data in a second storage location within the first processor core, wherein after the prefetch, exclusive ownership of the first cache line is to remain with the first storage location.Type: ApplicationFiled: September 28, 2018Publication date: April 2, 2020Inventors: Ren Wang, Lawrence C. Stewart, Binh Pham, Andrew Herdrich, Venkata Krishnan, Anil Vasudevan, Joseph Nuzman, Tsung-Yuan Tai -
Publication number: 20200081835Abstract: An apparatus and method for prioritizing transactional memory regions. For example, one embodiment of a processor comprises: a plurality of cores to execute threads comprising sequences of instructions, at least some of the instructions specifying a transactional memory region; a cache of each core to store a plurality of cache lines; transactional memory circuitry of each core to manage execution of the transactional memory (TM) regions based on priorities associated with each of the TM regions; and wherein the transactional memory circuitry, upon detecting a conflict between a first TM region having a first priority value and a second TM region having a second priority value, is to determine which of the first TM region or the second TM region is permitted to continue executing and which is to be aborted based, at least in part, on the first and second priority values.Type: ApplicationFiled: September 10, 2018Publication date: March 12, 2020Inventors: REN WANG, RAANAN SADE, YIPENG WANG, Tsung-Yuan TAI, SAMEH GOBRIEL
-
Patent number: 10567510Abstract: Various embodiments are generally directed to techniques for improving the efficiency of exchanging packets between pairs of VMs within a communications server. An apparatus may include a processor component; a network interface to couple the processor component to a network; a virtual switch to analyze contents of at least one packet of a set of packets to be exchanged between endpoint devices through the network and the communications server, and to route the set of packets through one or more virtual servers of multiple virtual servers based on the contents; and a transfer component of a first virtual server of the multiple virtual servers to determine whether to route the set of packets to the virtual switch or to transfer the set of packets to a second virtual server of the multiple virtual servers in a manner that bypasses the virtual switch based on a routing rule.Type: GrantFiled: October 2, 2017Date of Patent: February 18, 2020Assignee: INTEL CORPORATIONInventors: Mesut A. Ergin, Jr-Shian Tsai, Janet Tseng, Ren Wang, Jun Nakajima, Tsung-Yuan Tai
-
Patent number: 10356012Abstract: Various embodiments are generally directed to techniques for improving the efficiency of exchanging packets among multiple VMs within a communications server, and between the communications server and other devices in a communications system. An apparatus may include a virtual switch to analyze contents of at least one packet of a set of packets to be exchanged between endpoint devices through a network, and to correlate the contents to a pathway to extend through one or more of the VMs that are each configured as virtual servers of multiple virtual servers; and an interface control component to select at least one virtual network interface of each of the one or more virtual servers along the pathway to operate in a polling mode, and to select a virtual network interface of at least one virtual server of the multiple virtual servers not along the pathway to operate in a non-polling mode.Type: GrantFiled: August 20, 2015Date of Patent: July 16, 2019Assignee: INTEL CORPORATIONInventors: Alexander W. Min, Tsung-Yuan Tai, Ren Wang, Mesut A. Ergin, Jr-Shian Tsai
-
Patent number: 10216668Abstract: Technologies for a distributed hardware queue manager include a compute device having a processor. The processor includes two or more hardware queue managers as well as two or more processor cores. Each processor core can enqueue or dequeue data from the hardware queue manager. Each hardware queue manager can be configured to contain several queue data structures. In some embodiments, the queues are addressed by the processor cores using virtual queue addresses, which are translated into physical queue addresses for accessing the corresponding hardware queue manager. The virtual queues can be moved from one physical queue in one hardware queue manager to a different physical queue in a different physical queue manager without changing the virtual address of the virtual queue.Type: GrantFiled: March 31, 2016Date of Patent: February 26, 2019Assignee: Intel CorporationInventors: Ren Wang, Yipeng Wang, Jr-Shian Tsai, Andrew Herdrich, Tsung-Yuan Tai, Niall McDonnell, Stephen Van Doren, David Sonnier, Debra Bernstein, Hugh Wilkinson, Narender Vangati, Stephen Miller, Gage Eads, Andrew Cunningham, Jonathan Kenny, Bruce Richardson, William Burroughs, Joseph Hasting, An Yan, James Clee, Te Ma, Jerry Pirog, Jamison Whitesell
-
Publication number: 20190042304Abstract: Methods, apparatus, systems, and software for architectures and mechanisms to accelerate tuple-space search with integrated GPUs (Graphic Processor Units). One of the architectures employs GPU-side lookup table sorting, under which local and global hit count histograms are maintained for work groups, and sub-tables containing rules for tuple matching are re-sorted based on the relative hit rates of the different sub-tables. Under a second architecture, two levels of parallelism are implemented: packet-level parallelism and lookup table-parallelism. Under a third architecture, dynamic two-level parallel processing with pre-screen is implemented. Adaptive decision making mechanisms are also disclosed to select which architecture is optimal in view of multiple considerations, including application preferences, offered throughput, and available GPU resources.Type: ApplicationFiled: December 3, 2017Publication date: February 7, 2019Applicant: Intel CorporationInventors: Ren Wang, Janet Tseng, Jr-Shian Tsai, Tsung-Yuan Tai
-
Publication number: 20190042602Abstract: Techniques and apparatus for dynamic data access mode processes are described. In one embodiment, for example, an apparatus may a processor, at least one memory coupled to the processor, the at least one memory comprising an indication of a database and instructions, the instructions, when executed by the processor, to cause the processor to determine a database utilization value for a database, perform a comparison of the database utilization value to at least one utilization threshold, and set an active data access mode to one of a low-utilization data access mode or a high-utilization data access mode based on the comparison. Other embodiments are described.Type: ApplicationFiled: August 20, 2018Publication date: February 7, 2019Inventors: Ren Wang, Bruce Richardson, Tsung-Yuan Tai, Yipeng Wang, Pablo De Lara Guarch
-
Publication number: 20190044869Abstract: Technologies for classifying network flows using adaptive virtual routing include a network appliance with one or more processors. The network appliance is configured to identify a set of candidate classification algorithms from a plurality of classification algorithm designs to perform a flow classification operation and deploy each of the candidate classification algorithms to a processor. Additionally the network appliance is configured to monitor a performance level of each of the deployed candidate classification algorithms and identify a candidate classification algorithm of the deployed candidate classification algorithms with the highest performance level. The network appliance is further configured to deploy the identified candidate classification algorithm with the highest performance level on each of the one or more processors that are configured to perform the flow classification operation. Other embodiments are described herein.Type: ApplicationFiled: August 17, 2018Publication date: February 7, 2019Inventors: Yipeng Wang, Ren Wang, Janet Tseng, Jr-Shian Tsai, Tsung-Yuan Tai
-
Publication number: 20190042471Abstract: Technologies for least recently used (LRU) cache replacement include a computing device with a processor with vector instruction support. The computing device retrieves a bucket of an associative cache from memory that includes multiple entries arranged from front to back. The bucket may be a 256-bit array including eight 32-bit entries. For lookups, a matching entry is located at a position in the bucket. The computing device executes a vector permutation processor instruction that moves the matching entry to the front of the bucket while preserving the order of other entries of the bucket. For insertion, an inserted entry is written at the back of the bucket. The computing device executes a vector permutation processor instruction that moves the inserted entry to the front of the bucket while preserving the order of other entries. The permuted bucket is stored to the memory. Other embodiments are described and claimed.Type: ApplicationFiled: August 9, 2018Publication date: February 7, 2019Inventors: Ren Wang, Yipeng Wang, Tsung-Yuan Tai, Cristian Florin Dumitrescu, Xiangyang Guo
-
Publication number: 20180270309Abstract: Various embodiments are generally directed to techniques for improving the efficiency of exchanging packets between pairs of VMs within a communications server. An apparatus may include a processor component; a network interface to couple the processor component to a network; a virtual switch to analyze contents of at least one packet of a set of packets to be exchanged between endpoint devices through the network and the communications server, and to route the set of packets through one or more virtual servers of multiple virtual servers based on the contents; and a transfer component of a first virtual server of the multiple virtual servers to determine whether to route the set of packets to the virtual switch or to transfer the set of packets to a second virtual server of the multiple virtual servers in a manner that bypasses the virtual switch based on a routing rule.Type: ApplicationFiled: October 2, 2017Publication date: September 20, 2018Applicant: INTEL CORPORATIONInventors: MESUT A. ERGIN, JR-SHIAN TSAI, JANET TSENG, REN WANG, JUN NAKAJIMA, TSUNG-YUAN TAI