Patents by Inventor Fengbo Ren

Fengbo Ren has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

HPC FRAMEWORK FOR ACCELERATING SPARSE CHOLESKY FACTORIZATION ON FPGAS

Publication number: 20230325464

Abstract: A high-performance computing (HPC) framework for accelerating sparse Cholesky factorization on field-programmable gate arrays (FPGAs) is provided. The proposed framework includes an FPGA kernel implementing a throughput-optimized hardware architecture for accelerating a supernodal multifrontal algorithm for sparse Cholesky factorization. The proposed framework further includes a host program implementing a novel scheduling algorithm for finding the optimal execution order of supernode computations for an elimination tree on the FPGA to eliminate the need for off-chip memory access for storing intermediate results. Moreover, the proposed scheduling algorithm minimizes on-chip memory requirements for buffering intermediate results by resolving the dependency of parent nodes in an elimination tree through temporal parallelism.

Type: Application

Filed: April 11, 2023

Publication date: October 12, 2023

Applicant: ARIZONA BOARD OF REGENTS ON BEHALF OF ARIZONA STATE UNIVERSITY

Inventors: Erfan Bank Tavakoli, Fengbo Ren, Michael Riera, Masudul Quraishi
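
The abstract above refers to ordering supernode computations over an elimination tree so that parent dependencies are resolved. As a rough illustration only, the sketch below computes a plain post-order of a tree given as a parent array, which guarantees every node is scheduled after all of its children. It is a generic textbook traversal, not the scheduling algorithm claimed in the application; the function name `postorder_schedule` and the parent-array encoding (`-1` for a root) are assumptions made for this example.

```python
def postorder_schedule(parent):
    """Return an order in which every node appears after all of its children.

    parent[i] is the parent of node i, or -1 if node i is a root
    (an assumed encoding for this sketch).
    """
    n = len(parent)
    children = [[] for _ in range(n)]
    roots = []
    for node, p in enumerate(parent):
        if p == -1:
            roots.append(node)
        else:
            children[p].append(node)

    order = []
    for root in roots:
        # Iterative depth-first traversal; a node is emitted only after
        # its whole subtree has been emitted.
        stack = [(root, False)]
        while stack:
            node, expanded = stack.pop()
            if expanded:
                order.append(node)
            else:
                stack.append((node, True))
                for child in reversed(children[node]):
                    stack.append((child, False))
    return order


# Small hypothetical tree: nodes 0 and 1 feed node 2; nodes 2 and 3 feed root 4.
schedule = postorder_schedule([2, 2, 4, 4, -1])
```

In a post-order, a child's intermediate result is consumed soon after it is produced, which is one reason such orderings tend to keep buffering requirements low.
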
Generic compression ratio adapter for end-to-end data-driven compressive sensing reconstruction frameworks

Patent number: 11777520

Abstract: A compression ratio (CR) adapter (CRA) for end-to-end data-driven compressive sensing (CS) reconstruction (EDCSR) frameworks is provided. EDCSR frameworks achieve state-of-the-art reconstruction performance in terms of reconstruction speed and accuracy for images and other signals. However, existing EDCSR frameworks cannot adapt to a variable CR. For applications that need a variable CR, existing EDCSR frameworks must be trained from scratch at each CR, which is computationally costly and time-consuming. Embodiments described herein present a CRA framework that addresses the variable CR problem generally for existing and future EDCSR frameworks, without modifying the given reconstruction models or requiring extensive additional rounds of training. The CRA exploits an initial reconstruction network to generate an initial estimate of reconstruction results based on a small portion of acquired image measurements.

Type: Grant

Filed: March 31, 2021

Date of Patent: October 3, 2023

Assignee: ARIZONA BOARD OF REGENTS ON BEHALF OF ARIZONA STATE UNIVERSITY

Inventors: Zhikang Zhang, Fengbo Ren, Kai Xu
Selective sensing: a data-driven nonuniform subsampling approach for computation-free on-sensor data dimensionality reduction

Patent number: 11763165

Abstract: A data-driven nonuniform subsampling approach for computation-free on-sensor data dimensionality reduction is provided, referred to herein as selective sensing. Designing an on-sensor data dimensionality reduction scheme for efficient signal sensing has long been a challenging task. Compressive sensing is a generic solution for sensing signals in a compressed format. Although compressive sensing can be directly implemented in the analog domain for specific types of signals, many application scenarios require implementation of data compression in the digital domain. However, the computational complexity involved in digital-domain compressive sensing limits its practical application, especially in resource-constrained sensor devices or high-data-rate sensor devices. Embodiments described herein provide a selective sensing framework that adopts a novel concept of data-driven nonuniform subsampling to reduce the dimensionality of acquired signals while retaining the information of interest in a computation-free fashion.

Type: Grant

Filed: May 11, 2021

Date of Patent: September 19, 2023

Assignee: ARIZONA BOARD OF REGENTS ON BEHALF OF ARIZONA STATE UNIVERSITY

Inventors: Zhikang Zhang, Fengbo Ren
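
To make the contrast in the abstract above concrete, the sketch below compares a digital compressive sensing measurement (a matrix-vector product) with nonuniform subsampling (reading out a fixed set of positions). The index set `SELECTED` is a made-up placeholder; in a data-driven scheme it would be learned offline, and how that is done is not shown here. Function names are likewise assumptions for this example.

```python
def compressive_sense(signal, sensing_matrix):
    """Digital compressive sensing: one multiply-accumulate per matrix entry."""
    return [sum(w * x for w, x in zip(row, signal)) for row in sensing_matrix]


def selective_sense(signal, selected_indices):
    """Nonuniform subsampling: keep chosen samples, with no arithmetic at all."""
    return [signal[i] for i in selected_indices]


# Hypothetical index set standing in for a learned selection pattern.
SELECTED = [0, 3, 4, 9]

signal = [12, 7, 5, 30, 28, 6, 4, 3, 9, 41]
measurements = selective_sense(signal, SELECTED)

# The same measurements expressed as a sensing matrix with one-hot rows,
# which costs len(SELECTED) * len(signal) multiply-accumulates to evaluate.
one_hot = [[1 if j == i else 0 for j in range(len(signal))] for i in SELECTED]
same_measurements = compressive_sense(signal, one_hot)
```

Both calls return the same four values, but only the matrix form performs any computation on the sensor side.
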
HALO: A HARDWARE-AGNOSTIC ACCELERATOR ORCHESTRATION SOFTWARE FRAMEWORK FOR HETEROGENEOUS COMPUTING SYSTEMS

Publication number: 20230080421

Abstract: Hardware-agnostic accelerator orchestration (HALO) provides a software framework for heterogeneous computing systems. Hardware-agnostic programming with high performance portability is envisioned to be a bedrock for realizing adoption of emerging accelerator technologies in heterogeneous computing systems, such as high-performance computing (HPC) systems, data center computing systems, and edge computing systems. The adoption of emerging accelerators is key to achieving greater scale and performance in heterogeneous computing systems. Accordingly, embodiments described herein provide a flexible hardware-agnostic environment that allows application developers to develop high-performance applications without knowledge of the underlying hardware.

Type: Application

Filed: March 1, 2021

Publication date: March 16, 2023

Applicant: ARIZONA BOARD OF REGENTS ON BEHALF OF ARIZONA STATE UNIVERSITY

Inventors: Michael F. RIERA, Fengbo REN, Masudul Hassan QURAISHI, Erfan BANK TAVAKOLI
A SOFTWARE-DEFINED BOARD SUPPORT PACKAGE (SW-BSP) FOR STAND-ALONE RECONFIGURABLE ACCELERATORS

Publication number: 20230081394

Abstract: A software-defined board support package (SW-BSP) for stand-alone reconfigurable accelerators is provided. The adoption of emerging accelerators is key to achieving greater scale and performance in heterogeneous computing systems. A stand-alone accelerator protocol (SAP) allows for a hardware accelerator to be plug-and-playable in a stand-alone fashion (without needing a local central processing unit (CPU) host) and interact with a remote computing system agent for application acceleration across any network infrastructure. The SAP further facilitates a hardware-agnostic accelerator orchestration (HALO) software framework for hardware-agnostic programming with high performance portability and scalability in heterogeneous computing systems. The SW-BSP provides an implementation of the SAP on reconfigurable accelerators.

Type: Application

Filed: March 1, 2021

Publication date: March 16, 2023

Applicant: ARIZONA BOARD OF REGENTS ON BEHALF OF ARIZONA STATE UNIVERSITY

Inventors: Michael F. RIERA, Fengbo REN, Masudul Hassan QURAISHI, Erfan BANK TAVAKOLI
LAPRAN: A SCALABLE LAPLACIAN PYRAMID RECONSTRUCTIVE ADVERSARIAL NETWORK FOR FLEXIBLE COMPRESSIVE SENSING RECONSTRUCTION

Publication number: 20230075490

Abstract: This disclosure addresses the single-image compressive sensing (CS) and reconstruction problem. A scalable Laplacian pyramid reconstructive adversarial network (LAPRAN) facilitates high-fidelity, flexible and fast CS image reconstruction. LAPRAN progressively reconstructs an image following the concept of the Laplacian pyramid through multiple stages of reconstructive adversarial networks (RANs). At each pyramid level, CS measurements are fused with a contextual latent vector to generate a high-frequency image residual. Consequently, LAPRAN can produce a hierarchy of reconstructed images, each with an incremental resolution and improved quality. The scalable pyramid structure of LAPRAN enables high-fidelity CS reconstruction with a flexible resolution that is adaptive to a wide range of compression ratios (CRs), which is infeasible with existing methods.

Type: Application

Filed: October 11, 2022

Publication date: March 9, 2023

Applicant: ARIZONA BOARD OF REGENTS ON BEHALF OF ARIZONA STATE UNIVERSITY

Inventors: Fengbo REN, Kai XU, Zhikang ZHANG
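
The Laplacian pyramid idea mentioned in the abstract above can be illustrated on a 1-D signal: each level stores the high-frequency residual that is lost when the signal is downsampled, and reconstruction adds the residuals back level by level, producing progressively finer outputs. In this minimal sketch the residuals are computed directly from a known signal; in LAPRAN they are generated by networks from CS measurements, which is not modeled here. The pair-averaging downsampler and sample-repeating upsampler are assumptions chosen for simplicity.

```python
def downsample(x):
    """Halve the resolution by averaging adjacent pairs (assumed filter)."""
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]


def upsample(x):
    """Double the resolution by repeating each sample (assumed filter)."""
    return [v for v in x for _ in range(2)]


def build_pyramid(x, levels):
    """Return the coarsest signal and the per-level residuals (fine to coarse).

    len(x) must be divisible by 2 ** levels.
    """
    residuals = []
    current = list(x)
    for _ in range(levels):
        coarse = downsample(current)
        predicted = upsample(coarse)
        residuals.append([a - b for a, b in zip(current, predicted)])
        current = coarse
    return current, residuals


def reconstruct(coarse, residuals):
    """Rebuild progressively: upsample, then add that level's residual."""
    current = list(coarse)
    outputs = [current]
    for residual in reversed(residuals):
        current = [a + b for a, b in zip(upsample(current), residual)]
        outputs.append(current)
    return outputs


coarse, residuals = build_pyramid([1.0, 3.0, 2.0, 6.0], levels=2)
outputs = reconstruct(coarse, residuals)
```

Each element of `outputs` is a valid reconstruction at its own resolution, which mirrors the hierarchy of images described in the abstract.
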
A STAND-ALONE ACCELERATOR PROTOCOL (SAP) FOR HETEROGENEOUS COMPUTING SYSTEMS

Publication number: 20230076476

Abstract: A stand-alone accelerator protocol (SAP) for heterogeneous computing systems is provided. The adoption of emerging accelerators is key to achieving greater scale and performance in heterogeneous computing systems. The SAP allows for a hardware accelerator to be plug-and-playable in a stand-alone fashion (without needing a local central processing unit (CPU) host) and interact with a remote computing system agent for application acceleration across any network infrastructure. The SAP further facilitates a hardware-agnostic accelerator orchestration (HALO) software framework for hardware-agnostic programming with high performance portability and scalability in heterogeneous computing systems, such as high-performance computing (HPC) systems, data center computing systems, and edge computing systems. Accordingly, embodiments described herein provide a flexible hardware-agnostic environment that allows application developers to develop high-performance applications without knowledge of the underlying hardware.

Type: Application

Filed: March 1, 2021

Publication date: March 9, 2023

Applicant: ARIZONA BOARD OF REGENTS ON BEHALF OF ARIZONA STATE UNIVERSITY

Inventors: Michael F. RIERA, Fengbo REN, Masudul Hassan QURAISHI, Erfan BANK TAVAKOLI
C²MPI: A HARDWARE-AGNOSTIC MESSAGE PASSING INTERFACE FOR HETEROGENEOUS COMPUTING SYSTEMS

Publication number: 20230074426

Abstract: Compute-centric message passing interface (C2MPI) provides a hardware-agnostic message passing interface for heterogeneous computing systems. Hardware-agnostic programming with high performance portability is envisioned to be a bedrock for realizing adoption of emerging accelerator technologies in heterogeneous computing systems, such as high-performance computing (HPC) systems, data center computing systems, and edge computing systems. The adoption of emerging accelerators is key to achieving greater scale and performance in heterogeneous computing systems. Accordingly, embodiments described herein provide a flexible hardware-agnostic environment that allows application developers to develop high-performance applications without knowledge of the underlying hardware.

Type: Application

Filed: March 1, 2021

Publication date: March 9, 2023

Applicant: ARIZONA BOARD OF REGENTS ON BEHALF OF ARIZONA STATE UNIVERSITY

Inventors: Michael F. RIERA, Fengbo REN, Masudul Hassan QURAISHI, Erfan BANK TAVAKOLI
LAPRAN: a scalable Laplacian pyramid reconstructive adversarial network for flexible compressive sensing reconstruction

Patent number: 11468542

Abstract: This disclosure addresses the single-image compressive sensing (CS) and reconstruction problem. A scalable Laplacian pyramid reconstructive adversarial network (LAPRAN) facilitates high-fidelity, flexible and fast CS image reconstruction. LAPRAN progressively reconstructs an image following the concept of the Laplacian pyramid through multiple stages of reconstructive adversarial networks (RANs). At each pyramid level, CS measurements are fused with a contextual latent vector to generate a high-frequency image residual. Consequently, LAPRAN can produce a hierarchy of reconstructed images, each with an incremental resolution and improved quality. The scalable pyramid structure of LAPRAN enables high-fidelity CS reconstruction with a flexible resolution that is adaptive to a wide range of compression ratios (CRs), which is infeasible with existing methods.

Type: Grant

Filed: January 17, 2020

Date of Patent: October 11, 2022

Assignee: ARIZONA BOARD OF REGENTS ON BEHALF OF ARIZONA STATE UNIVERSITY

Inventors: Fengbo Ren, Kai Xu, Zhikang Zhang
SELECTIVE SENSING: A DATA-DRIVEN NONUNIFORM SUBSAMPLING APPROACH FOR COMPUTATION-FREE ON-SENSOR DATA DIMENSIONALITY REDUCTION

Publication number: 20210349945

Abstract: A data-driven nonuniform subsampling approach for computation-free on-sensor data dimensionality reduction is provided, referred to herein as selective sensing. Designing an on-sensor data dimensionality reduction scheme for efficient signal sensing has long been a challenging task. Compressive sensing is a generic solution for sensing signals in a compressed format. Although compressive sensing can be directly implemented in the analog domain for specific types of signals, many application scenarios require implementation of data compression in the digital domain. However, the computational complexity involved in digital-domain compressive sensing limits its practical application, especially in resource-constrained sensor devices or high-data-rate sensor devices. Embodiments described herein provide a selective sensing framework that adopts a novel concept of data-driven nonuniform subsampling to reduce the dimensionality of acquired signals while retaining the information of interest in a computation-free fashion.

Type: Application

Filed: May 11, 2021

Publication date: November 11, 2021

Applicant: Arizona Board of Regents on behalf of Arizona State University

Inventors: Zhikang Zhang, Fengbo Ren
SYSTEMS AND METHODS FOR PRUNING BINARY NEURAL NETWORKS GUIDED BY WEIGHT FLIPPING FREQUENCY

Publication number: 20210350242

Abstract: Various embodiments of a system and method for pruning binary neural networks by analyzing weight flipping frequency and pruning the binary neural network based on the weight flipping frequency associated with each channel of the binary neural network are disclosed herein.

Type: Application

Filed: May 11, 2021

Publication date: November 11, 2021

Applicant: Arizona Board of Regents on Behalf of Arizona State University

Inventors: Yixing Li, Fengbo Ren
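
The abstract above describes pruning guided by a per-channel weight flipping frequency. One plausible way to measure such a frequency is sketched below: given binary weights recorded at successive training steps, count how often each weight changes sign and normalize per channel. The snapshot layout, the normalization, and the function name are all assumptions for illustration. The abstract does not state which channels are pruned or at what threshold, so the sketch only ranks channels and leaves the pruning rule open.

```python
def channel_flip_frequency(snapshots):
    """Average fraction of weights in each channel that flip sign per step.

    snapshots[t][c] is the list of +1/-1 weights of channel c at step t
    (an assumed layout for this sketch).
    """
    num_steps = len(snapshots) - 1
    num_channels = len(snapshots[0])
    frequencies = []
    for c in range(num_channels):
        flips = 0
        for before, after in zip(snapshots, snapshots[1:]):
            flips += sum(1 for a, b in zip(before[c], after[c]) if a != b)
        weights_per_channel = len(snapshots[0][c])
        frequencies.append(flips / (weights_per_channel * num_steps))
    return frequencies


# Three hypothetical training snapshots of a layer with two 2-weight channels.
snapshots = [
    [[1, -1], [1, 1]],
    [[-1, -1], [1, 1]],
    [[1, 1], [1, 1]],
]
frequencies = channel_flip_frequency(snapshots)

# Channels ordered from most to least frequently flipping.
ranked = sorted(range(len(frequencies)), key=lambda c: frequencies[c], reverse=True)
```
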
SYSTOLIC-CNN: AN OPENCL-DEFINED SCALABLE RUNTIME-FLEXIBLE PROGRAMMABLE ACCELERATOR ARCHITECTURE FOR ACCELERATING CONVOLUTIONAL NEURAL NETWORK INFERENCE IN CLOUD/EDGE COMPUTING

Publication number: 20210334636

Abstract: An OpenCL-defined scalable runtime-flexible programmable accelerator architecture for accelerating convolutional neural network (CNN) inference in cloud/edge computing is provided, referred to herein as Systolic-CNN. Existing OpenCL-defined programmable accelerators (e.g., field-programmable gate array (FPGA)-based accelerators) for CNN inference are insufficient due to limited flexibility for supporting multiple CNN models at runtime and poor scalability, resulting in underutilized accelerator resources and limited computational parallelism. Systolic-CNN adopts a highly pipelined and parallelized one-dimensional (1-D) systolic array architecture, which efficiently exploits both spatial and temporal parallelism for accelerating CNN inference on programmable accelerators (e.g., FPGAs). Systolic-CNN is highly scalable and parameterized, and can be easily adapted by users to efficiently utilize the coarse-grained computation resources for a given programmable accelerator.

Type: Application

Filed: April 28, 2021

Publication date: October 28, 2021

Applicant: Arizona Board of Regents on behalf of Arizona State University

Inventors: Akshay Dua, Fengbo Ren
GENERIC COMPRESSION RATIO ADAPTER FOR END-TO-END DATA-DRIVEN COMPRESSIVE SENSING RECONSTRUCTION FRAMEWORKS

Publication number: 20210305999

Abstract: A compression ratio (CR) adapter (CRA) for end-to-end data-driven compressive sensing (CS) reconstruction (EDCSR) frameworks is provided. EDCSR frameworks achieve state-of-the-art reconstruction performance in terms of reconstruction speed and accuracy for images and other signals. However, existing EDCSR frameworks cannot adapt to a variable CR. For applications that need a variable CR, existing EDCSR frameworks must be trained from scratch at each CR, which is computationally costly and time-consuming. Embodiments described herein present a CRA framework that addresses the variable CR problem generally for existing and future EDCSR frameworks, without modifying the given reconstruction models or requiring extensive additional rounds of training. The CRA exploits an initial reconstruction network to generate an initial estimate of reconstruction results based on a small portion of acquired image measurements.

Type: Application

Filed: March 31, 2021

Publication date: September 30, 2021

Applicant: Arizona Board of Regents on behalf of Arizona State University

Inventors: Zhikang Zhang, Fengbo Ren, Kai Xu
Real time end-to-end learning system for a high frame rate video compressive sensing network

Patent number: 10924755

Abstract: A real time end-to-end learning system for a high frame rate video compressive sensing network is described. The slow reconstruction speed of conventional compressive sensing approaches is overcome by directly modeling an inverse mapping from the compressed domain to the original domain in a single forward propagation. Through processing massive unlabeled video data, such a mapping is learned by a neural network using data-driven methods. Systems and methods according to this disclosure incorporate a multi-rate convolutional neural network (CNN) and a synthesizing recurrent neural network (RNN) to achieve real time compression and reconstruction of video data.

Type: Grant

Filed: October 19, 2018

Date of Patent: February 16, 2021

Assignee: Arizona Board of Regents on behalf of Arizona State University

Inventors: Fengbo Ren, Kai Xu
LAPRAN: A SCALABLE LAPLACIAN PYRAMID RECONSTRUCTIVE ADVERSARIAL NETWORK FOR FLEXIBLE COMPRESSIVE SENSING RECONSTRUCTION

Publication number: 20200234406

Abstract: This disclosure addresses the single-image compressive sensing (CS) and reconstruction problem. A scalable Laplacian pyramid reconstructive adversarial network (LAPRAN) facilitates high-fidelity, flexible and fast CS image reconstruction. LAPRAN progressively reconstructs an image following the concept of the Laplacian pyramid through multiple stages of reconstructive adversarial networks (RANs). At each pyramid level, CS measurements are fused with a contextual latent vector to generate a high-frequency image residual. Consequently, LAPRAN can produce a hierarchy of reconstructed images, each with an incremental resolution and improved quality. The scalable pyramid structure of LAPRAN enables high-fidelity CS reconstruction with a flexible resolution that is adaptive to a wide range of compression ratios (CRs), which is infeasible with existing methods.

Type: Application

Filed: January 17, 2020

Publication date: July 23, 2020

Applicant: Arizona Board of Regents on behalf of Arizona State University

Inventors: Fengbo Ren, Kai Xu, Zhikang Zhang
REAL TIME END-TO-END LEARNING SYSTEM FOR A HIGH FRAME RATE VIDEO COMPRESSIVE SENSING NETWORK

Publication number: 20190124346

Abstract: A real time end-to-end learning system for a high frame rate video compressive sensing network is described. The slow reconstruction speed of conventional compressive sensing approaches is overcome by directly modeling an inverse mapping from the compressed domain to the original domain in a single forward propagation. Through processing massive unlabeled video data, such a mapping is learned by a neural network using data-driven methods. Systems and methods according to this disclosure incorporate a multi-rate convolutional neural network (CNN) and a synthesizing recurrent neural network (RNN) to achieve real time compression and reconstruction of video data.

Type: Application

Filed: October 19, 2018

Publication date: April 25, 2019

Applicant: Arizona Board of Regents on behalf of Arizona State University

Inventors: Fengbo Ren, Kai Xu
Scalable and parameterized VLSI architecture for compressive sensing sparse approximation

Patent number: 10073701

Abstract: Systems and methods for implementing a scalable very-large-scale integration (VLSI) architecture to perform compressive sensing (CS) hardware reconstruction for data signals in accordance with embodiments of the invention are disclosed. The VLSI architecture is optimized for CS signal reconstruction by implementing a reformulation of the orthogonal matching pursuit (OMP) process and utilizing architecture resource sharing techniques. Typically, the VLSI architecture is a CS reconstruction engine that includes vector and scalar computation cores, where the cores can be time-multiplexed (via dynamic configuration) to perform each task associated with OMP. The vector core includes configurable processing elements (PEs) connected in parallel. Further, the cores can be linked by data-path memories, where the complex data flow of OMP can be customized utilizing local memory controllers synchronized by a top-level finite-state machine.

Type: Grant

Filed: July 29, 2014

Date of Patent: September 11, 2018

Assignee: The Regents of the University of California

Inventors: Dejan Markovic, Fengbo Ren
Scalable and Parameterized VLSI Architecture for Compressive Sensing Sparse Approximation

Publication number: 20150032990

Abstract: Systems and methods for implementing a scalable very-large-scale integration (VLSI) architecture to perform compressive sensing (CS) hardware reconstruction for data signals in accordance with embodiments of the invention are disclosed. The VLSI architecture is optimized for CS signal reconstruction by implementing a reformulation of the orthogonal matching pursuit (OMP) process and utilizing architecture resource sharing techniques. Typically, the VLSI architecture is a CS reconstruction engine that includes vector and scalar computation cores, where the cores can be time-multiplexed (via dynamic configuration) to perform each task associated with OMP. The vector core includes configurable processing elements (PEs) connected in parallel. Further, the cores can be linked by data-path memories, where the complex data flow of OMP can be customized utilizing local memory controllers synchronized by a top-level finite-state machine.

Type: Application

Filed: July 29, 2014

Publication date: January 29, 2015

Inventors: Dejan Markovic, Fengbo Ren
Body voltage sensing based short pulse reading circuit

Patent number: 8917562

Abstract: As memory geometries continue to scale down, the current density of magnetic tunnel junctions (MTJs) makes the conventional low-current reading scheme problematic with regard to performance and reliability. A body-voltage sense circuit (BVSC) short pulse reading (SPR) circuit is described that uses body-connected load transistors and a novel sensing circuit with a second-stage amplifier, which allows for very short read pulses, providing much higher read margins, less sensing time, and shorter sensing current pulses. Simulation results (using 65-nm CMOS model SPICE simulations) show that our technique can achieve 550 mV of read margin at 1 ns performance under a 1 V supply voltage, which is greater than reference designs achieve at 5 ns performance.

Type: Grant

Filed: November 25, 2013

Date of Patent: December 23, 2014

Assignee: The Regents of the University of California

Inventors: Kang-Lung Wang, Chih-Kong K. Yang, Dejan Markovic, Fengbo Ren
BODY VOLTAGE SENSING BASED SHORT PULSE READING CIRCUIT

Publication number: 20140153325

Abstract: As memory geometries continue to scale down, the current density of magnetic tunnel junctions (MTJs) makes the conventional low-current reading scheme problematic with regard to performance and reliability. A body-voltage sense circuit (BVSC) short pulse reading (SPR) circuit is described that uses body-connected load transistors and a novel sensing circuit with a second-stage amplifier, which allows for very short read pulses, providing much higher read margins, less sensing time, and shorter sensing current pulses. Simulation results (using 65-nm CMOS model SPICE simulations) show that our technique can achieve 550 mV of read margin at 1 ns performance under a 1 V supply voltage, which is greater than reference designs achieve at 5 ns performance.

Type: Application

Filed: November 25, 2013

Publication date: June 5, 2014

Applicant: THE REGENTS OF THE UNIVERSITY OF CALIFORNIA

Inventors: Kang-Lung Wang, Chih-Kong K. Yang, Dejan Markovic, Fengbo Ren