Patents by Inventor Fengbo Ren
Fengbo Ren has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20230325464
Abstract: A high-performance computing (HPC) framework for accelerating sparse Cholesky factorization on field-programmable gate arrays (FPGAs) is provided. The proposed framework includes an FPGA kernel implementing a throughput-optimized hardware architecture for accelerating a supernodal multifrontal algorithm for sparse Cholesky factorization. The proposed framework further includes a host program implementing a novel scheduling algorithm for finding the optimal execution order of supernode computations for an elimination tree on the FPGA to eliminate the need for off-chip memory access for storing intermediate results. Moreover, the proposed scheduling algorithm minimizes on-chip memory requirements for buffering intermediate results by resolving the dependency of parent nodes in an elimination tree through temporal parallelism.
Type: Application
Filed: April 11, 2023
Publication date: October 12, 2023
Applicant: ARIZONA BOARD OF REGENTS ON BEHALF OF ARIZONA STATE UNIVERSITY
Inventors: Erfan Bank Tavakoli, Fengbo Ren, Michael Riera, Masudul Quraishi
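The dependency constraint mentioned in this abstract (a parent node of an elimination tree can only be processed after its children) can be illustrated with a generic post-order traversal. This is a minimal sketch and not the scheduling algorithm claimed in the application; the function name `postorder_schedule`, the `parent` array encoding, and the example tree are assumptions made for illustration.

```python
from collections import defaultdict


def postorder_schedule(parent):
    """Return a node order in which every child precedes its parent.

    parent[i] is the parent of node i in the tree, or -1 for a root.
    """
    children = defaultdict(list)
    roots = []
    for node, p in enumerate(parent):
        if p == -1:
            roots.append(node)
        else:
            children[p].append(node)

    order = []
    for root in roots:
        # Iterative depth-first traversal: a node is emitted only on its
        # second visit, after all of its children have been emitted.
        stack = [(root, False)]
        while stack:
            node, expanded = stack.pop()
            if expanded:
                order.append(node)
            else:
                stack.append((node, True))
                for child in reversed(children[node]):
                    stack.append((child, False))
    return order


# Hypothetical 5-node tree: nodes 0 and 1 feed node 2; nodes 2 and 3 feed node 4.
example_parent = [2, 2, 4, 4, -1]
schedule = postorder_schedule(example_parent)
```

Any order produced this way respects the tree dependencies; choosing among the valid orders to reduce buffering is the part the abstract attributes to its own scheduling algorithm.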
-
Patent number: 11777520
Abstract: A compression ratio (CR) adapter (CRA) for end-to-end data-driven compressive sensing (CS) reconstruction (EDCSR) frameworks is provided. EDCSR frameworks achieve state-of-the-art reconstruction performance in terms of reconstruction speed and accuracy for images and other signals. However, existing EDCSR frameworks cannot adapt to a variable CR. For applications that desire a variable CR, existing EDCSR frameworks must be trained from scratch at each CR, which is computationally costly and time-consuming. Embodiments described herein present a CRA framework that addresses the variable CR problem generally for existing and future EDCSR frameworks, without modifying the given reconstruction models or requiring extensive additional rounds of training. The CRA exploits an initial reconstruction network to generate an initial estimate of reconstruction results based on a small portion of acquired image measurements.
Type: Grant
Filed: March 31, 2021
Date of Patent: October 3, 2023
Assignee: ARIZONA BOARD OF REGENTS ON BEHALF OF ARIZONA STATE UNIVERSITY
Inventors: Zhikang Zhang, Fengbo Ren, Kai Xu
-
Patent number: 11763165
Abstract: A data-driven nonuniform subsampling approach for computation-free on-sensor data dimensionality reduction is provided, referred to herein as selective sensing. Designing an on-sensor data dimensionality reduction scheme for efficient signal sensing has long been a challenging task. Compressive sensing is a generic solution for sensing signals in a compressed format. Although compressive sensing can be directly implemented in the analog domain for specific types of signals, many application scenarios require implementation of data compression in the digital domain. However, the computational complexity involved in digital-domain compressive sensing limits its practical application, especially in resource-constrained sensor devices or high-data-rate sensor devices. Embodiments described herein provide a selective sensing framework that adopts a novel concept of data-driven nonuniform subsampling to reduce the dimensionality of acquired signals while retaining the information of interest in a computation-free fashion.
Type: Grant
Filed: May 11, 2021
Date of Patent: September 19, 2023
Assignee: ARIZONA BOARD OF REGENTS ON BEHALF OF ARIZONA STATE UNIVERSITY
Inventors: Zhikang Zhang, Fengbo Ren
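To make "computation-free" concrete: once a set of sample positions has been chosen, nonuniform subsampling reduces to reading out those positions, with no arithmetic on the sensor. The sketch below assumes the index set is already available; the name `selective_sense` and the index values are hypothetical, and how the indices are learned from data is not shown.

```python
def selective_sense(signal, selected_indices):
    """Keep only the samples at the pre-selected positions (pure indexing)."""
    return [signal[i] for i in selected_indices]


# Hypothetical index set; in a data-driven scheme it would be learned offline.
selected_indices = [0, 1, 2, 5, 9]
signal = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

# Ten samples are reduced to five without any multiply or add.
measurements = selective_sense(signal, selected_indices)
```

By contrast, digital-domain compressive sensing typically multiplies the signal by a sensing matrix, which is the computational cost the abstract refers to.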
-
Publication number: 20230080421
Abstract: Hardware-agnostic accelerator orchestration (HALO) provides a software framework for heterogeneous computing systems. Hardware-agnostic programming with high performance portability is envisioned to be a bedrock for realizing adoption of emerging accelerator technologies in heterogeneous computing systems, such as high-performance computing (HPC) systems, data center computing systems, and edge computing systems. The adoption of emerging accelerators is key to achieving greater scale and performance in heterogeneous computing systems. Accordingly, embodiments described herein provide a flexible hardware-agnostic environment that allows application developers to develop high-performance applications without knowledge of the underlying hardware.
Type: Application
Filed: March 1, 2021
Publication date: March 16, 2023
Applicant: ARIZONA BOARD OF REGENTS ON BEHALF OF ARIZONA STATE UNIVERSITY
Inventors: Michael F. RIERA, Fengbo REN, Masudul Hassan QURAISHI, Erfan BANK TAVAKOLI
-
Publication number: 20230081394
Abstract: A software-defined board support package (SW-BSP) for stand-alone reconfigurable accelerators is provided. The adoption of emerging accelerators is key to achieving greater scale and performance in heterogeneous computing systems. A stand-alone accelerator protocol (SAP) allows for a hardware accelerator to be plug-and-playable in a stand-alone fashion (without needing a local central processing unit (CPU) host) and interact with a remote computing system agent for application acceleration across any network infrastructure. The SAP further facilitates a hardware-agnostic accelerator orchestration (HALO) software framework for hardware-agnostic programming with high performance portability and scalability in heterogeneous computing systems. The SW-BSP provides an implementation of the SAP on reconfigurable accelerators.
Type: Application
Filed: March 1, 2021
Publication date: March 16, 2023
Applicant: ARIZONA BOARD OF REGENTS ON BEHALF OF ARIZONA STATE UNIVERSITY
Inventors: Michael F. RIERA, Fengbo REN, Masudul Hassan QURAISHI, Erfan BANK TAVAKOLI
-
Publication number: 20230075490
Abstract: This disclosure addresses the single-image compressive sensing (CS) and reconstruction problem. A scalable Laplacian pyramid reconstructive adversarial network (LAPRAN) facilitates high-fidelity, flexible and fast CS image reconstruction. LAPRAN progressively reconstructs an image following the concept of the Laplacian pyramid through multiple stages of reconstructive adversarial networks (RANs). At each pyramid level, CS measurements are fused with a contextual latent vector to generate a high-frequency image residual. Consequently, LAPRAN can produce hierarchies of reconstructed images, each with an incremental resolution and improved quality. The scalable pyramid structure of LAPRAN enables high-fidelity CS reconstruction with a flexible resolution that is adaptive to a wide range of compression ratios (CRs), which is infeasible with existing methods.
Type: Application
Filed: October 11, 2022
Publication date: March 9, 2023
Applicant: ARIZONA BOARD OF REGENTS ON BEHALF OF ARIZONA STATE UNIVERSITY
Inventors: Fengbo REN, Kai XU, Zhikang ZHANG
-
Publication number: 20230076476
Abstract: A stand-alone accelerator protocol (SAP) for heterogeneous computing systems is provided. The adoption of emerging accelerators is key to achieving greater scale and performance in heterogeneous computing systems. The SAP allows for a hardware accelerator to be plug-and-playable in a stand-alone fashion (without needing a local central processing unit (CPU) host) and interact with a remote computing system agent for application acceleration across any network infrastructure. The SAP further facilitates a hardware-agnostic accelerator orchestration (HALO) software framework for hardware-agnostic programming with high performance portability and scalability in heterogeneous computing systems, such as high-performance computing (HPC) systems, data center computing systems, and edge computing systems. Accordingly, embodiments described herein provide a flexible hardware-agnostic environment that allows application developers to develop high-performance applications without knowledge of the underlying hardware.
Type: Application
Filed: March 1, 2021
Publication date: March 9, 2023
Applicant: ARIZONA BOARD OF REGENTS ON BEHALF OF ARIZONA STATE UNIVERSITY
Inventors: Michael F. RIERA, Fengbo REN, Masudul Hassan QURAISHI, Erfan BANK TAVAKOLI
-
Publication number: 20230074426
Abstract: Compute-centric message passing interface (C2MPI) provides a hardware-agnostic message passing interface for heterogeneous computing systems. Hardware-agnostic programming with high performance portability is envisioned to be a bedrock for realizing adoption of emerging accelerator technologies in heterogeneous computing systems, such as high-performance computing (HPC) systems, data center computing systems, and edge computing systems. The adoption of emerging accelerators is key to achieving greater scale and performance in heterogeneous computing systems. Accordingly, embodiments described herein provide a flexible hardware-agnostic environment that allows application developers to develop high-performance applications without knowledge of the underlying hardware.
Type: Application
Filed: March 1, 2021
Publication date: March 9, 2023
Applicant: ARIZONA BOARD OF REGENTS ON BEHALF OF ARIZONA STATE UNIVERSITY
Inventors: Michael F. RIERA, Fengbo REN, Masudul Hassan QURAISHI, Erfan BANK TAVAKOLI
-
Patent number: 11468542
Abstract: This disclosure addresses the single-image compressive sensing (CS) and reconstruction problem. A scalable Laplacian pyramid reconstructive adversarial network (LAPRAN) facilitates high-fidelity, flexible and fast CS image reconstruction. LAPRAN progressively reconstructs an image following the concept of the Laplacian pyramid through multiple stages of reconstructive adversarial networks (RANs). At each pyramid level, CS measurements are fused with a contextual latent vector to generate a high-frequency image residual. Consequently, LAPRAN can produce hierarchies of reconstructed images, each with an incremental resolution and improved quality. The scalable pyramid structure of LAPRAN enables high-fidelity CS reconstruction with a flexible resolution that is adaptive to a wide range of compression ratios (CRs), which is infeasible with existing methods.
Type: Grant
Filed: January 17, 2020
Date of Patent: October 11, 2022
Assignee: ARIZONA BOARD OF REGENTS ON BEHALF OF ARIZONA STATE UNIVERSITY
Inventors: Fengbo Ren, Kai Xu, Zhikang Zhang
-
Publication number: 20210349945
Abstract: A data-driven nonuniform subsampling approach for computation-free on-sensor data dimensionality reduction is provided, referred to herein as selective sensing. Designing an on-sensor data dimensionality reduction scheme for efficient signal sensing has long been a challenging task. Compressive sensing is a generic solution for sensing signals in a compressed format. Although compressive sensing can be directly implemented in the analog domain for specific types of signals, many application scenarios require implementation of data compression in the digital domain. However, the computational complexity involved in digital-domain compressive sensing limits its practical application, especially in resource-constrained sensor devices or high-data-rate sensor devices. Embodiments described herein provide a selective sensing framework that adopts a novel concept of data-driven nonuniform subsampling to reduce the dimensionality of acquired signals while retaining the information of interest in a computation-free fashion.
Type: Application
Filed: May 11, 2021
Publication date: November 11, 2021
Applicant: Arizona Board of Regents on behalf of Arizona State University
Inventors: Zhikang Zhang, Fengbo Ren
-
Publication number: 20210350242
Abstract: Various embodiments of a system and method for pruning binary neural networks by analyzing weight flipping frequency and pruning the binary neural network based on the weight flipping frequency associated with each channel of the binary neural network are disclosed herein.
Type: Application
Filed: May 11, 2021
Publication date: November 11, 2021
Applicant: Arizona Board of Regents on Behalf of Arizona State University
Inventors: Yixing Li, Fengbo Ren
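One way to read "weight flipping frequency associated with each channel" is as the fraction of binary weights in a channel that change sign between consecutive training snapshots. The sketch below computes that quantity under this reading. The snapshot layout, the function names, the threshold, and the choice to flag high-frequency channels as pruning candidates are all assumptions; the abstract does not specify them.

```python
def flip_frequency(snapshots):
    """Per-channel fraction of weight sign flips between consecutive snapshots.

    snapshots[t][c] is the list of binary weights (+1 or -1) of channel c
    recorded at training step t.
    """
    num_channels = len(snapshots[0])
    flips = [0] * num_channels
    total = [0] * num_channels
    for prev, curr in zip(snapshots, snapshots[1:]):
        for c in range(num_channels):
            for w_prev, w_curr in zip(prev[c], curr[c]):
                flips[c] += int(w_prev != w_curr)
                total[c] += 1
    return [f / t if t else 0.0 for f, t in zip(flips, total)]


def channels_to_prune(frequencies, threshold):
    # Assumption: channels at or above the threshold are pruning candidates.
    return [c for c, f in enumerate(frequencies) if f >= threshold]


# Hypothetical history: 3 snapshots, 2 channels, 2 weights per channel.
snapshots = [
    [[1, -1], [1, 1]],
    [[1, -1], [-1, 1]],
    [[1, -1], [1, -1]],
]
freqs = flip_frequency(snapshots)
pruned = channels_to_prune(freqs, threshold=0.5)
```

In this toy history, channel 0 never flips and channel 1 flips in three of its four weight comparisons, so only channel 1 is flagged.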
-
Publication number: 20210334636
Abstract: An OpenCL-defined scalable runtime-flexible programmable accelerator architecture for accelerating convolutional neural network (CNN) inference in cloud/edge computing is provided, referred to herein as Systolic-CNN. Existing OpenCL-defined programmable accelerators (e.g., field-programmable gate array (FPGA)-based accelerators) for CNN inference are insufficient due to limited flexibility for supporting multiple CNN models at runtime and poor scalability resulting in underutilized accelerator resources and limited computational parallelism. Systolic-CNN adopts a highly pipelined and paralleled one-dimensional (1-D) systolic array architecture, which efficiently explores both spatial and temporal parallelism for accelerating CNN inference on programmable accelerators (e.g., FPGAs). Systolic-CNN is highly scalable and parameterized, and can be easily adapted by users to efficiently utilize the coarse-grained computation resources for a given programmable accelerator.
Type: Application
Filed: April 28, 2021
Publication date: October 28, 2021
Applicant: Arizona Board of Regents on behalf of Arizona State University
Inventors: Akshay Dua, Fengbo Ren
-
Publication number: 20210305999
Abstract: A compression ratio (CR) adapter (CRA) for end-to-end data-driven compressive sensing (CS) reconstruction (EDCSR) frameworks is provided. EDCSR frameworks achieve state-of-the-art reconstruction performance in terms of reconstruction speed and accuracy for images and other signals. However, existing EDCSR frameworks cannot adapt to a variable CR. For applications that desire a variable CR, existing EDCSR frameworks must be trained from scratch at each CR, which is computationally costly and time-consuming. Embodiments described herein present a CRA framework that addresses the variable CR problem generally for existing and future EDCSR frameworks, without modifying the given reconstruction models or requiring extensive additional rounds of training. The CRA exploits an initial reconstruction network to generate an initial estimate of reconstruction results based on a small portion of acquired image measurements.
Type: Application
Filed: March 31, 2021
Publication date: September 30, 2021
Applicant: Arizona Board of Regents on behalf of Arizona State University
Inventors: Zhikang Zhang, Fengbo Ren, Kai Xu
-
Patent number: 10924755
Abstract: A real-time end-to-end learning system for a high frame rate video compressive sensing network is described. The slow reconstruction speed of conventional compressive sensing approaches is overcome by directly modeling an inverse mapping from compressed domain to original domain in a single forward propagation. Through processing massive unlabeled video data, such a mapping is learned by a neural network using data-driven methods. Systems and methods according to this disclosure incorporate a multi-rate convolutional neural network (CNN) and a synthesizing recurrent neural network (RNN) to achieve real-time compression and reconstruction of video data.
Type: Grant
Filed: October 19, 2018
Date of Patent: February 16, 2021
Assignee: Arizona Board of Regents on behalf of Arizona State University
Inventors: Fengbo Ren, Kai Xu
-
Publication number: 20200234406
Abstract: This disclosure addresses the single-image compressive sensing (CS) and reconstruction problem. A scalable Laplacian pyramid reconstructive adversarial network (LAPRAN) facilitates high-fidelity, flexible and fast CS image reconstruction. LAPRAN progressively reconstructs an image following the concept of the Laplacian pyramid through multiple stages of reconstructive adversarial networks (RANs). At each pyramid level, CS measurements are fused with a contextual latent vector to generate a high-frequency image residual. Consequently, LAPRAN can produce hierarchies of reconstructed images, each with an incremental resolution and improved quality. The scalable pyramid structure of LAPRAN enables high-fidelity CS reconstruction with a flexible resolution that is adaptive to a wide range of compression ratios (CRs), which is infeasible with existing methods.
Type: Application
Filed: January 17, 2020
Publication date: July 23, 2020
Applicant: Arizona Board of Regents on behalf of Arizona State University
Inventors: Fengbo Ren, Kai Xu, Zhikang Zhang
-
Publication number: 20190124346
Abstract: A real-time end-to-end learning system for a high frame rate video compressive sensing network is described. The slow reconstruction speed of conventional compressive sensing approaches is overcome by directly modeling an inverse mapping from compressed domain to original domain in a single forward propagation. Through processing massive unlabeled video data, such a mapping is learned by a neural network using data-driven methods. Systems and methods according to this disclosure incorporate a multi-rate convolutional neural network (CNN) and a synthesizing recurrent neural network (RNN) to achieve real-time compression and reconstruction of video data.
Type: Application
Filed: October 19, 2018
Publication date: April 25, 2019
Applicant: Arizona Board of Regents on behalf of Arizona State University
Inventors: Fengbo Ren, Kai Xu
-
Patent number: 10073701
Abstract: Systems and methods for implementing a scalable very-large-scale integration (VLSI) architecture to perform compressive sensing (CS) hardware reconstruction for data signals in accordance with embodiments of the invention are disclosed. The VLSI architecture is optimized for CS signal reconstruction by implementing a reformulation of the orthogonal matching pursuit (OMP) process and utilizing architecture resource sharing techniques. Typically, the VLSI architecture is a CS reconstruction engine that includes vector and scalar computation cores, where the cores can be time-multiplexed (via dynamic configuration) to perform each task associated with OMP. The vector core includes configurable processing elements (PEs) connected in parallel. Further, the cores can be linked by data-path memories, where complex data flow of OMP can be customized utilizing local memory controllers synchronized by a top-level finite-state machine.
Type: Grant
Filed: July 29, 2014
Date of Patent: September 11, 2018
Assignee: The Regents of the University of California
Inventors: Dejan Markovic, Fengbo Ren
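For readers unfamiliar with orthogonal matching pursuit, the following is a minimal sketch of the textbook algorithm in plain Python. It is not the reformulated OMP process or the hardware architecture described in the patent. The helper names, the column-wise layout of the sensing matrix, and the example data are assumptions, and the least-squares step uses normal equations for brevity rather than numerical robustness.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def omp(columns, y, sparsity):
    """Textbook OMP; columns[j] is the j-th column (atom) of the sensing matrix."""
    residual = list(y)
    support, coeffs = [], []
    for _ in range(sparsity):
        # 1. Select the unused atom most correlated with the residual.
        scores = [abs(dot(col, residual)) for col in columns]
        for j in support:
            scores[j] = -1.0
        best = max(range(len(columns)), key=scores.__getitem__)
        support.append(best)
        # 2. Least-squares fit of y on the selected atoms (normal equations).
        gram = [[dot(columns[i], columns[j]) for j in support] for i in support]
        rhs = [dot(columns[i], y) for i in support]
        coeffs = solve_linear(gram, rhs)
        # 3. Recompute the residual with the updated coefficients.
        residual = [
            yi - sum(c * columns[j][k] for c, j in zip(coeffs, support))
            for k, yi in enumerate(y)
        ]
    x = [0.0] * len(columns)
    for j, c in zip(support, coeffs):
        x[j] = c
    return x


# Hypothetical example: 4 unit-norm atoms in R^3 and a 2-sparse signal.
columns = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0.6, 0.8, 0]]
y = [3, 0, -2]  # equals 3 * columns[0] - 2 * columns[2]
x_hat = omp(columns, y, sparsity=2)
```

Steps 1 and 3 are dominated by vector operations (inner products and vector updates), while step 2 is a small linear solve that involves mostly scalar arithmetic.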
-
Publication number: 20150032990
Abstract: Systems and methods for implementing a scalable very-large-scale integration (VLSI) architecture to perform compressive sensing (CS) hardware reconstruction for data signals in accordance with embodiments of the invention are disclosed. The VLSI architecture is optimized for CS signal reconstruction by implementing a reformulation of the orthogonal matching pursuit (OMP) process and utilizing architecture resource sharing techniques. Typically, the VLSI architecture is a CS reconstruction engine that includes vector and scalar computation cores, where the cores can be time-multiplexed (via dynamic configuration) to perform each task associated with OMP. The vector core includes configurable processing elements (PEs) connected in parallel. Further, the cores can be linked by data-path memories, where complex data flow of OMP can be customized utilizing local memory controllers synchronized by a top-level finite-state machine.
Type: Application
Filed: July 29, 2014
Publication date: January 29, 2015
Inventors: Dejan Markovic, Fengbo Ren
-
Patent number: 8917562
Abstract: As memory geometries continue to scale down, the current density of magnetic tunnel junctions (MTJs) makes the conventional low-current reading scheme problematic with regard to performance and reliability. A body-voltage sense circuit (BVSC) short pulse reading (SPR) circuit is described using body-connected load transistors and a novel sensing circuit with a second-stage amplifier, which allows for very short read pulses providing much higher read margins, less sensing time, and shorter sensing current pulses. Simulation results (using 65-nm CMOS model SPICE simulations) show that our technique can achieve 550 mV of read margin at 1 ns performance under a 1 V supply voltage, which is greater than reference designs achieve at 5 ns performance.
Type: Grant
Filed: November 25, 2013
Date of Patent: December 23, 2014
Assignee: The Regents of the University of California
Inventors: Kang-Lung Wang, Chih-Kong K. Yang, Dejan Markovic, Fengbo Ren
-
Publication number: 20140153325
Abstract: As memory geometries continue to scale down, the current density of magnetic tunnel junctions (MTJs) makes the conventional low-current reading scheme problematic with regard to performance and reliability. A body-voltage sense circuit (BVSC) short pulse reading (SPR) circuit is described using body-connected load transistors and a novel sensing circuit with a second-stage amplifier, which allows for very short read pulses providing much higher read margins, less sensing time, and shorter sensing current pulses. Simulation results (using 65-nm CMOS model SPICE simulations) show that our technique can achieve 550 mV of read margin at 1 ns performance under a 1 V supply voltage, which is greater than reference designs achieve at 5 ns performance.
Type: Application
Filed: November 25, 2013
Publication date: June 5, 2014
Applicant: THE REGENTS OF THE UNIVERSITY OF CALIFORNIA
Inventors: Kang-Lung Wang, Chih-Kong K. Yang, Dejan Markovic, Fengbo Ren