Patents by Inventor Tongping LIU
Tongping LIU has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240118818
Abstract: Methods and systems for managing memory usage in data learning operations. The method includes profiling one or more objects used for training in the data learning operation, in which the profiling includes determining an object size, a memory allocation timestamp, and a memory deallocation timestamp; and scheduling the memory usage. The scheduling includes grouping the one or more objects into the one or more groups based on the memory allocation and/or the memory deallocation timestamp of the one or more objects, and arranging the one or more objects in the one or more groups in descending order in a memory space. Two or more objects are provided in the descending order in the memory space, in which one of the objects having an earliest memory allocation timestamp is provided at a first value and the other object having a later memory allocation timestamp is provided at a second value.
Type: Application
Filed: November 17, 2023
Publication date: April 11, 2024
Inventors: Jin ZHOU, Tongping LIU, Yong FU, Ping ZHOU, Wei XU, Jianjun CHEN
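The abstract above leaves the exact layout rule open, so the following is only a minimal sketch of one possible reading, not the claimed method. It assumes that objects are grouped by identical deallocation timestamp and that "descending order" means assigning descending addresses from the top of an arena in allocation-timestamp order. The names `ProfiledObject`, `schedule`, and `arena_size` are hypothetical.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class ProfiledObject:
    """One profiling record: size plus allocation/deallocation timestamps."""
    name: str
    size: int
    alloc_ts: int
    dealloc_ts: int


def schedule(objects, arena_size):
    """Return {object name: start offset} for a hypothetical arena.

    Assumption: objects sharing a deallocation timestamp form one group,
    and offsets are handed out from the top of the arena downward, so an
    earlier allocation receives a higher offset than a later one.
    """
    ordered = sorted(objects, key=lambda o: (o.dealloc_ts, o.alloc_ts))
    placement = {}
    top = arena_size
    for _, group in groupby(ordered, key=lambda o: o.dealloc_ts):
        for obj in group:
            top -= obj.size
            if top < 0:
                raise MemoryError("arena too small for profiled objects")
            placement[obj.name] = top
    return placement


profile = [
    ProfiledObject("a", size=4, alloc_ts=0, dealloc_ts=10),
    ProfiledObject("b", size=2, alloc_ts=1, dealloc_ts=10),
    ProfiledObject("c", size=8, alloc_ts=2, dealloc_ts=20),
]
layout = schedule(profile, arena_size=32)
```

Here `a` and `b` share a group because they are freed together, and `a` (allocated first) lands at the higher offset.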
-
Publication number: 20230325239
Abstract: A system and method for memory allocation and management in non-uniform memory access (“NUMA”) architecture computing environments is disclosed. The system and method contemplates both hardware heterogeneity and allocation/deallocation attributes, with fine-grained memory management. NUMAlloc is centered on a binding-based memory management. On top of it, NUMAlloc proposes an “origin-aware memory management” to ensure the locality of memory allocations and deallocations, as well as a method called “incremental sharing” to balance the performance benefits and memory overhead of using transparent huge pages. It further introduced an interleaved heap to reduce the load imbalance among different nodes and an efficient mechanism for object movement. The system and method provides a scalable and increased performance alternative over other prior art memory allocators.
Type: Application
Filed: April 11, 2022
Publication date: October 12, 2023
Applicant: University of Massachusetts
Inventors: Tongping Liu, Hanmei Yang, Xin Zhao
-
Publication number: 20230289616
Abstract: System and method of training a machine learning model on a plurality of devices in parallel are provided. The method includes performing a model profiling execution before a model normal execution, allocating tensors of the model into a plurality of chunks based on profiling results from the model profiling execution, and performing the model normal execution on the plurality of devices in parallel to train or fine-tune the model.
Type: Application
Filed: May 18, 2023
Publication date: September 14, 2023
Inventors: Tongping Liu, Wei Xu, Jianjun Chen
-
Patent number: 11599445
Abstract: The techniques described herein may provide techniques for precise and fully-automatic on-site software failure diagnosis that overcomes issues of existing systems and general challenges of in-production software failure diagnosis. Embodiments of the present systems and methods may provide a tool capable of automatically pinpointing a fault propagation chain of program failures, with explicit symptoms. The combination of binary analysis, in-situ/identical replay, and debugging registers may be used together to simulate the debugging procedures of a programmer automatically. Overhead, privacy, transparency, convenience, and completeness challenges of in-production failure analysis are improved, making it suitable for deployment uses.
Type: Grant
Filed: June 14, 2019
Date of Patent: March 7, 2023
Assignee: Board of Regents, The University of Texas System
Inventors: Tongping Liu, Hongyu Liu, Sam Albert Silvestro
-
Patent number: 11593483
Abstract: Memory allocation techniques may provide improved security and performance. A method may comprise mapping a block of memory, dividing the block of memory into a plurality of heaps, dividing each heap into a plurality of sub-heaps, wherein each sub-heap is associated with one thread of software executing in the computer system, dividing each sub-heap into a plurality of bags, wherein each bag is associated with one size class of objects, creating an allocation buffer and a deallocation buffer for each bag, storing a plurality of objects in at least some of the bags, wherein each object is stored in a bag having size class corresponding to a size of the object, storing in the allocation buffer of each bag information relating to available objects stored in that bag, and storing in the deallocation buffer of each bag information relating to freed objects that were stored in that bag.
Type: Grant
Filed: October 18, 2019
Date of Patent: February 28, 2023
Assignee: The Board of Regents of The University of Texas System
Inventors: Tongping Liu, Sam Albert Silvestro, Hongyu Liu, Tianyi Liu
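The sub-heap, bag, and buffer structure in the abstract above can be modelled as a toy in Python. This is a minimal sketch under stated assumptions, not the patented allocator: it models a single per-thread sub-heap over integer "addresses", and the size classes, the `bag_bytes` parameter, and the policy of refilling the allocation buffer from the deallocation buffer are all hypothetical.

```python
from collections import deque

# Hypothetical size classes; the abstract does not specify any.
SIZE_CLASSES = (16, 32, 64, 128)


def size_class(nbytes):
    """Smallest size class that fits a request of nbytes."""
    for cls in SIZE_CLASSES:
        if nbytes <= cls:
            return cls
    raise ValueError("request exceeds largest size class")


class Bag:
    """Fixed-size slots for one size class, with two bookkeeping buffers."""

    def __init__(self, base, obj_size, capacity):
        self.obj_size = obj_size
        # Allocation buffer: addresses of slots currently available.
        self.alloc_buffer = deque(base + i * obj_size for i in range(capacity))
        # Deallocation buffer: addresses of slots that were freed.
        self.dealloc_buffer = deque()

    def allocate(self):
        if not self.alloc_buffer:
            # Assumed policy: recycle freed slots only when nothing else is left.
            self.alloc_buffer.extend(self.dealloc_buffer)
            self.dealloc_buffer.clear()
        if not self.alloc_buffer:
            raise MemoryError("bag exhausted")
        return self.alloc_buffer.popleft()

    def free(self, addr):
        self.dealloc_buffer.append(addr)


class SubHeap:
    """Per-thread sub-heap: one bag per size class, laid out back to back."""

    def __init__(self, base, bag_bytes):
        self.bags = {
            cls: Bag(base + i * bag_bytes, cls, bag_bytes // cls)
            for i, cls in enumerate(SIZE_CLASSES)
        }

    def malloc(self, nbytes):
        return self.bags[size_class(nbytes)].allocate()

    def free(self, addr, nbytes):
        self.bags[size_class(nbytes)].free(addr)


sub_heap = SubHeap(base=0, bag_bytes=256)
a = sub_heap.malloc(10)   # served from the 16-byte bag
b = sub_heap.malloc(20)   # served from the 32-byte bag
c = sub_heap.malloc(16)   # next slot of the 16-byte bag
sub_heap.free(a, 10)      # recorded in the 16-byte bag's deallocation buffer
```

Keeping freed slots in a separate buffer delays their reuse, which is one plausible way to get the security benefit the abstract mentions; the abstract itself does not state the reuse policy.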
-
Publication number: 20230004367
Abstract: The techniques described herein may provide techniques to detect, categorize, and diagnose synchronization issues that provide improved performance and issue resolution. For example, in an embodiment, a method may comprise detecting occurrence of synchronization performance problems in software code, when at least some detected synchronization performance problems occur when a contention rate for software locks is low, determining a cause of the synchronization performance problems, and modifying the software code to remedy the cause of the synchronization performance problems so as to improve synchronization performance of the software code.
Type: Application
Filed: February 24, 2022
Publication date: January 5, 2023
Inventors: Tongping Liu, Mohammad Mejbah ul Alam, Abdullah Al Muzahid
-
Patent number: 11294652
Abstract: The techniques described herein may provide techniques to detect, categorize, and diagnose synchronization issues that provide improved performance and issue resolution. For example, in an embodiment, a method may comprise detecting occurrence of synchronization performance problems in software code, when at least some detected synchronization performance problems occur when a contention rate for software locks is low, determining a cause of the synchronization performance problems, and modifying the software code to remedy the cause of the synchronization performance problems so as to improve synchronization performance of the software code.
Type: Grant
Filed: April 16, 2019
Date of Patent: April 5, 2022
Assignee: The Board of Regents of The University of Texas System
Inventors: Tongping Liu, Mohammad Mejbah ul Alam, Abdullah Al Muzahid
-
Patent number: 10915424
Abstract: The techniques described herein may provide deadlock detection and prevention with improved performance and reduced overhead over existing systems. For example, in an embodiment, a method for improving performance of software code by preventing deadlocks may comprise executing software code in a computer system comprising a processor, memory accessible by the processor, and program instructions and data for the software code stored in the memory, the program instructions executable by the processor to execute the software code, logging information relating to occurrence of deadlock conditions among threads in the executing software code, detecting occurrence of deadlock conditions in the software code based on the logged information, and modifying the software code or data used by the software code so as to prevent occurrence of at least one detected deadlock condition.
Type: Grant
Filed: October 12, 2018
Date of Patent: February 9, 2021
Assignee: The Board of Regents of The University of Texas System
Inventors: Tongping Liu, Jinpeng Zhou, Sam Silvestro, Hongyu Liu
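The abstract above does not say how deadlock conditions are detected from the logged information. As an illustration only, the sketch below swaps in a classic, well-known technique: build a lock-order graph from a log of acquire/release events and look for a cycle. The log format and the function name `has_potential_deadlock` are assumptions made for this example.

```python
from collections import defaultdict


def has_potential_deadlock(log):
    """Return True if the lock-order graph built from `log` contains a cycle.

    `log` is a list of (thread, op, lock) tuples with op in
    {"acquire", "release"}; this format is hypothetical.
    """
    held = defaultdict(list)    # thread -> locks currently held, in order
    edges = defaultdict(set)    # lock -> locks acquired while holding it
    for thread, op, lock in log:
        if op == "acquire":
            for outer in held[thread]:
                edges[outer].add(lock)
            held[thread].append(lock)
        else:
            held[thread].remove(lock)

    state = {}  # lock -> "active" (on the DFS stack) or "done"

    def visit(node):
        state[node] = "active"
        for nxt in edges.get(node, ()):
            if state.get(nxt) == "active":
                return True          # back edge: cyclic lock order
            if nxt not in state and visit(nxt):
                return True
        state[node] = "done"
        return False

    return any(node not in state and visit(node) for node in list(edges))


inverted = [
    (1, "acquire", "A"), (1, "acquire", "B"),
    (1, "release", "B"), (1, "release", "A"),
    (2, "acquire", "B"), (2, "acquire", "A"),
    (2, "release", "A"), (2, "release", "B"),
]
consistent = [
    (1, "acquire", "A"), (1, "acquire", "B"),
    (1, "release", "B"), (1, "release", "A"),
    (2, "acquire", "A"), (2, "acquire", "B"),
    (2, "release", "B"), (2, "release", "A"),
]
```

A cycle here only signals a potential deadlock: the two threads in `inverted` never actually block each other in the logged run, but their opposite acquisition orders could deadlock under a different interleaving.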
-
Patent number: 10901828
Abstract: The techniques described herein may include memory allocation techniques that provide improved security and performance. In embodiments, a method implemented in a computer system may include a processor and a memory, the method may comprise mapping a block of memory, dividing the block of memory into a plurality of heaps, dividing each heap into a plurality of sub-heaps, wherein each sub-heap is associated with one thread of software executing in the computer system, dividing each sub-heap into a plurality of bags, wherein each bag is associated with one size class of objects, and storing a plurality of objects in at least some of the bags, wherein each object is stored in a bag having size class corresponding to a size of the object.
Type: Grant
Filed: October 26, 2018
Date of Patent: January 26, 2021
Assignee: Board of Regents, The University of Texas System
Inventors: Tongping Liu, Sam Silvestro, Hongyu Liu
-
Publication number: 20200201997
Abstract: Memory allocation techniques may provide improved security and performance. A method may comprise mapping a block of memory, dividing the block of memory into a plurality of heaps, dividing each heap into a plurality of sub-heaps, wherein each sub-heap is associated with one thread of software executing in the computer system, dividing each sub-heap into a plurality of bags, wherein each bag is associated with one size class of objects, creating an allocation buffer and a deallocation buffer for each bag, storing a plurality of objects in at least some of the bags, wherein each object is stored in a bag having size class corresponding to a size of the object, storing in the allocation buffer of each bag information relating to available objects stored in that bag, and storing in the deallocation buffer of each bag information relating to freed objects that were stored in that bag.
Type: Application
Filed: October 18, 2019
Publication date: June 25, 2020
Applicant: The Board of Regents of The University of Texas System
Inventors: Tongping Liu, Sam Albert Silvestro, Hongyu Liu, Tianyi Liu
-
Publication number: 20190384692
Abstract: The techniques described herein may provide techniques for precise and fully-automatic on-site software failure diagnosis that overcomes issues of existing systems and general challenges of in-production software failure diagnosis. Embodiments of the present systems and methods may provide a tool capable of automatically pinpointing a fault propagation chain of program failures, with explicit symptoms. The combination of binary analysis, in-situ/identical replay, and debugging registers may be used together to simulate the debugging procedures of a programmer automatically. Overhead, privacy, transparency, convenience, and completeness challenges of in-production failure analysis are improved, making it suitable for deployment uses.
Type: Application
Filed: June 14, 2019
Publication date: December 19, 2019
Applicant: The Board of Regents of The University of Texas System
Inventors: Tongping Liu, Hongyu Liu, Sam Albert Silvestro
-
Patent number: 10474369
Abstract: In a virtualized computer system, guest memory pages are mapped to disk blocks that contain identical contents and the mapping is used to improve management processes performed on virtual machines, such as live migration and snapshots. These processes are performed with less data being transferred because the mapping data of those guest memory pages that have identical content stored on disk are transmitted instead of their contents. As a result, live migration and snapshots can be carried out more quickly. The mapping of the guest memory pages to disk blocks can also be used to optimize other tasks, such as page swaps and memory error corrections.
Type: Grant
Filed: February 6, 2013
Date of Patent: November 12, 2019
Assignee: VMware, Inc.
Inventors: Kiran Tati, Rajesh Venkatasubramanian, Carl A. Waldspurger, Alexander Thomas Garthwaite, Tongping Liu
-
Publication number: 20190317746
Abstract: The techniques described herein may provide techniques to detect, categorize, and diagnose synchronization issues that provide improved performance and issue resolution. For example, in an embodiment, a method may comprise detecting occurrence of synchronization performance problems in software code, when at least some detected synchronization performance problems occur when a contention rate for software locks is low, determining a cause of the synchronization performance problems, and modifying the software code to remedy the cause of the synchronization performance problems so as to improve synchronization performance of the software code.
Type: Application
Filed: April 16, 2019
Publication date: October 17, 2019
Inventors: Tongping Liu, Mohammad Mejbah ul Alam, Abdullah Al Muzahid
-
Patent number: 10402292
Abstract: In one embodiment, a method of false sharing detection includes performing, by a device, a plurality of optimization passes on source code, to produce optimized source code and receiving, by the device, selection criteria. The method also includes adding instrumentation to the optimized source code, by the device, after performing the plurality of optimization passes, to produce an instrumented code, where the instrumentation is configured to track memory access addresses and access types of global variables and heap variables in accordance with the selection criteria.
Type: Grant
Filed: May 2, 2017
Date of Patent: September 3, 2019
Assignee: Futurewei Technologies, Inc.
Inventors: Tongping Liu, Chen Tian, Ziang Hu
-
Patent number: 10394714
Abstract: In one embodiment, a method for predicting false sharing includes running code on a plurality of cores and determining whether there is potential false sharing between a first cache line and a second cache line, and where the first cache line is adjacent to the second cache line. The method also includes tracking the potential false sharing and reporting the potential false sharing.
Type: Grant
Filed: December 29, 2016
Date of Patent: August 27, 2019
Assignee: Futurewei Technologies, Inc.
Inventors: Chen Tian, Tongping Liu, Ziang Hu
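One way to picture the adjacent-line check in the abstract above is the following minimal sketch. It assumes a recorded trace of `(thread, address, is_write)` tuples and a 64-byte cache line, and it flags pairs of accesses from different threads that sit in neighbouring lines but less than one line apart, since such a pair would share a line under a different object alignment. The trace format, the line size, and the function name are assumptions, not details from the patent.

```python
CACHE_LINE = 64  # assumed line size in bytes


def potential_false_sharing(accesses, line_size=CACHE_LINE):
    """Return sorted (low_addr, high_addr) pairs that could falsely share.

    A pair qualifies when the two accesses come from different threads,
    at least one is a write, they fall in adjacent cache lines, and they
    are closer together than one cache line.
    """
    hits = set()
    for i, (t1, a1, w1) in enumerate(accesses):
        for t2, a2, w2 in accesses[i + 1:]:
            if t1 == t2 or not (w1 or w2):
                continue
            adjacent = abs(a1 // line_size - a2 // line_size) == 1
            if adjacent and abs(a1 - a2) < line_size:
                hits.add((min(a1, a2), max(a1, a2)))
    return sorted(hits)


trace = [
    (0, 60, True),    # thread 0 writes near the end of line 0
    (1, 68, True),    # thread 1 writes near the start of line 1
    (1, 200, False),  # thread 1 reads far away, in line 3
    (0, 8, True),     # thread 0 writes early in line 0
]
pairs = potential_false_sharing(trace)
```

The quadratic pairwise scan keeps the example short; a practical tool would need a cheaper per-line summary.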
-
Publication number: 20190129786
Abstract: The techniques described herein may include memory allocation techniques that provide improved security and performance. In embodiments, a method implemented in a computer system may include a processor and a memory, the method may comprise mapping a block of memory, dividing the block of memory into a plurality of heaps, dividing each heap into a plurality of sub-heaps, wherein each sub-heap is associated with one thread of software executing in the computer system, dividing each sub-heap into a plurality of bags, wherein each bag is associated with one size class of objects, and storing a plurality of objects in at least some of the bags, wherein each object is stored in a bag having size class corresponding to a size of the object.
Type: Application
Filed: October 26, 2018
Publication date: May 2, 2019
Applicant: The Board of Regents of The University of Texas System
Inventors: Tongping Liu, Sam Silvestro, Hongyu Liu
-
Publication number: 20190114248
Abstract: The techniques described herein may provide deadlock detection and prevention with improved performance and reduced overhead over existing systems. For example, in an embodiment, a method for improving performance of software code by preventing deadlocks may comprise executing software code in a computer system comprising a processor, memory accessible by the processor, and program instructions and data for the software code stored in the memory, the program instructions executable by the processor to execute the software code, logging information relating to occurrence of deadlock conditions among threads in the executing software code, detecting occurrence of deadlock conditions in the software code based on the logged information, and modifying the software code or data used by the software code so as to prevent occurrence of at least one detected deadlock condition.
Type: Application
Filed: October 12, 2018
Publication date: April 18, 2019
Applicant: The Board of Regents of The University of Texas System
Inventors: Tongping Liu, Jinpeng Zhou, Sam Silvestro, Hongyu Liu
-
Publication number: 20170242772
Abstract: In one embodiment, a method of false sharing detection includes performing, by a device, a plurality of optimization passes on source code, to produce optimized source code and receiving, by the device, selection criteria. The method also includes adding instrumentation to the optimized source code, by the device, after performing the plurality of optimization passes, to produce an instrumented code, where the instrumentation is configured to track memory access addresses and access types of global variables and heap variables in accordance with the selection criteria.
Type: Application
Filed: May 2, 2017
Publication date: August 24, 2017
Inventors: Tongping Liu, Chen Tian, Ziang Hu
-
Patent number: 9678883
Abstract: In one embodiment, a method for detecting false sharing includes running code on a plurality of cores, where the code includes instrumentation and tracking cache invalidations in the code while running the code to produce tracked invalidations in accordance with the instrumentation, where tracking the cache invalidations includes tracking cache accesses to a plurality of cache lines by a plurality of tasks. The method also includes reporting false sharing in accordance with the tracked invalidations to produce a false sharing report.
Type: Grant
Filed: July 18, 2014
Date of Patent: June 13, 2017
Assignee: Futurewei Technologies, Inc.
Inventors: Tongping Liu, Chen Tian, Ziang Hu
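Tracking invalidations per cache line, as the abstract above describes, can be approximated with a simplified software model. The sketch below is an illustration under assumptions rather than the patented method: it replays a trace of `(thread, address, is_write)` tuples, counts a write as an invalidation whenever another thread currently holds the line, and reports only lines where no single offset was touched by more than one thread (otherwise the sharing is true, not false). The trace format, the `threshold` parameter, and the 64-byte line size are hypothetical.

```python
from collections import defaultdict


def detect_false_sharing(accesses, line_size=64, threshold=2):
    """Return {cache line index: invalidation count} for suspected false sharing."""
    holders = defaultdict(set)                       # line -> threads holding a copy
    invalidations = defaultdict(int)                 # line -> invalidation count
    touched = defaultdict(lambda: defaultdict(set))  # line -> offset -> threads

    for thread, addr, is_write in accesses:
        line, offset = divmod(addr, line_size)
        touched[line][offset].add(thread)
        if is_write:
            if holders[line] - {thread}:
                invalidations[line] += 1   # other threads' copies are invalidated
            holders[line] = {thread}
        else:
            holders[line].add(thread)

    report = {}
    for line, count in invalidations.items():
        truly_shared = any(len(ts) > 1 for ts in touched[line].values())
        if count >= threshold and not truly_shared:
            report[line] = count
    return report


# Two threads ping-pong writes to different offsets of line 0 (false sharing),
# then both write the same offset of line 2 (true sharing).
trace = [(0, 0, True), (1, 8, True)] * 3
trace += [(0, 128, True), (1, 128, True), (0, 128, True)]
report = detect_false_sharing(trace)
```

Line 2 also accumulates invalidations, but it is excluded from the report because both threads write the same location.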
-
Publication number: 20170109288
Abstract: In one embodiment, a method for predicting false sharing includes running code on a plurality of cores and determining whether there is potential false sharing between a first cache line and a second cache line, and where the first cache line is adjacent to the second cache line. The method also includes tracking the potential false sharing and reporting the potential false sharing.
Type: Application
Filed: December 29, 2016
Publication date: April 20, 2017
Inventors: Chen Tian, Tongping Liu, Ziang Hu