Patents by Inventor Fengbo Ren

Fengbo Ren has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230325464
    Abstract: A high-performance computing (HPC) framework for accelerating sparse Cholesky factorization on field-programmable gate arrays (FPGAs) is provided. The proposed framework includes an FPGA kernel implementing a throughput-optimized hardware architecture for accelerating a supernodal multifrontal algorithm for sparse Cholesky factorization. The proposed framework further includes a host program implementing a novel scheduling algorithm for finding the optimal execution order of supernode computations for an elimination tree on the FPGA to eliminate the need for off-chip memory access for storing intermediate results. Moreover, the proposed scheduling algorithm minimizes on-chip memory requirements for buffering intermediate results by resolving the dependency of parent nodes in an elimination tree through temporal parallelism.
    Type: Application
    Filed: April 11, 2023
    Publication date: October 12, 2023
    Applicant: ARIZONA BOARD OF REGENTS ON BEHALF OF ARIZONA STATE UNIVERSITY
    Inventors: Erfan Bank Tavakoli, Fengbo Ren, Michael Riera, Masudul Quraishi
  • Patent number: 11777520
    Abstract: A compression ratio (CR) adapter (CRA) for end-to-end data-driven compressive sensing (CS) reconstruction (EDCSR) frameworks is provided. EDCSR frameworks achieve state-of-the-art reconstruction performance in terms of reconstruction speed and accuracy for images and other signals. However, existing EDCSR frameworks cannot adapt to a variable CR. For applications that desire a variable CR, existing EDCSR frameworks must be trained from scratch at each CR, which is computationally costly and time-consuming. Embodiments described herein present a CRA framework that addresses the variable CR problem generally for existing and future EDCSR frameworks without modifying given reconstruction models or requiring enormous additional rounds of training. The CRA exploits an initial reconstruction network to generate an initial estimate of reconstruction results based on a small portion of acquired image measurements.
    Type: Grant
    Filed: March 31, 2021
    Date of Patent: October 3, 2023
    Assignee: ARIZONA BOARD OF REGENTS ON BEHALF OF ARIZONA STATE UNIVERSITY
    Inventors: Zhikang Zhang, Fengbo Ren, Kai Xu
  • Patent number: 11763165
    Abstract: A data-driven nonuniform subsampling approach for computation-free on-sensor data dimensionality reduction is provided, referred to herein as selective sensing. Designing an on-sensor data dimensionality reduction scheme for efficient signal sensing has long been a challenging task. Compressive sensing is a generic solution for sensing signals in a compressed format. Although compressive sensing can be directly implemented in the analog domain for specific types of signals, many application scenarios require implementation of data compression in the digital domain. However, the computational complexity involved in digital-domain compressive sensing limits its practical application, especially in resource-constrained sensor devices or high-data-rate sensor devices. Embodiments described herein provide a selective sensing framework that adopts a novel concept of data-driven nonuniform subsampling to reduce the dimensionality of acquired signals while retaining the information of interest in a computation-free fashion.
    Type: Grant
    Filed: May 11, 2021
    Date of Patent: September 19, 2023
    Assignee: ARIZONA BOARD OF REGENTS ON BEHALF OF ARIZONA STATE UNIVERSITY
    Inventors: Zhikang Zhang, Fengbo Ren
  • Publication number: 20230080421
    Abstract: Hardware-agnostic accelerator orchestration (HALO) provides a software framework for heterogeneous computing systems. Hardware-agnostic programming with high performance portability is envisioned to be a bedrock for realizing adoption of emerging accelerator technologies in heterogeneous computing systems, such as high-performance computing (HPC) systems, data center computing systems, and edge computing systems. The adoption of emerging accelerators is key to achieving greater scale and performance in heterogeneous computing systems. Accordingly, embodiments described herein provide a flexible hardware-agnostic environment that allows application developers to develop high-performance applications without knowledge of the underlying hardware.
    Type: Application
    Filed: March 1, 2021
    Publication date: March 16, 2023
    Applicant: ARIZONA BOARD OF REGENTS ON BEHALF OF ARIZONA STATE UNIVERSITY
    Inventors: Michael F. Riera, Fengbo Ren, Masudul Hassan Quraishi, Erfan Bank Tavakoli
  • Publication number: 20230081394
    Abstract: A software-defined board support package (SW-BSP) for stand-alone reconfigurable accelerators is provided. The adoption of emerging accelerators is key to achieving greater scale and performance in heterogeneous computing systems. A stand-alone accelerator protocol (SAP) allows for a hardware accelerator to be plug-and-playable in a stand-alone fashion (without needing a local central processing unit (CPU) host) and interact with a remote computing system agent for application acceleration across any network infrastructure. The SAP further facilitates a hardware-agnostic accelerator orchestration (HALO) software framework for hardware-agnostic programming with high performance portability and scalability in heterogeneous computing systems. The SW-BSP provides an implementation of the SAP on reconfigurable accelerators.
    Type: Application
    Filed: March 1, 2021
    Publication date: March 16, 2023
    Applicant: ARIZONA BOARD OF REGENTS ON BEHALF OF ARIZONA STATE UNIVERSITY
    Inventors: Michael F. Riera, Fengbo Ren, Masudul Hassan Quraishi, Erfan Bank Tavakoli
  • Publication number: 20230075490
    Abstract: This disclosure addresses the single-image compressive sensing (CS) and reconstruction problem. A scalable Laplacian pyramid reconstructive adversarial network (LAPRAN) facilitates high-fidelity, flexible and fast CS image reconstruction. LAPRAN progressively reconstructs an image following the concept of the Laplacian pyramid through multiple stages of reconstructive adversarial networks (RANs). At each pyramid level, CS measurements are fused with a contextual latent vector to generate a high-frequency image residual. Consequently, LAPRAN can produce hierarchies of reconstructed images, each with an incremental resolution and improved quality. The scalable pyramid structure of LAPRAN enables high-fidelity CS reconstruction with a flexible resolution that is adaptive to a wide range of compression ratios (CRs), which is infeasible with existing methods.
    Type: Application
    Filed: October 11, 2022
    Publication date: March 9, 2023
    Applicant: ARIZONA BOARD OF REGENTS ON BEHALF OF ARIZONA STATE UNIVERSITY
    Inventors: Fengbo Ren, Kai Xu, Zhikang Zhang
  • Publication number: 20230076476
    Abstract: A stand-alone accelerator protocol (SAP) for heterogeneous computing systems is provided. The adoption of emerging accelerators is key to achieving greater scale and performance in heterogeneous computing systems. The SAP allows for a hardware accelerator to be plug-and-playable in a stand-alone fashion (without needing a local central processing unit (CPU) host) and interact with a remote computing system agent for application acceleration across any network infrastructure. The SAP further facilitates a hardware-agnostic accelerator orchestration (HALO) software framework for hardware-agnostic programming with high performance portability and scalability in heterogeneous computing systems, such as high-performance computing (HPC) systems, data center computing systems, and edge computing systems. Accordingly, embodiments described herein provide a flexible hardware-agnostic environment that allows application developers to develop high-performance applications without knowledge of the underlying hardware.
    Type: Application
    Filed: March 1, 2021
    Publication date: March 9, 2023
    Applicant: ARIZONA BOARD OF REGENTS ON BEHALF OF ARIZONA STATE UNIVERSITY
    Inventors: Michael F. Riera, Fengbo Ren, Masudul Hassan Quraishi, Erfan Bank Tavakoli
  • Publication number: 20230074426
    Abstract: Compute-centric message passing interface (C2MPI) provides a hardware-agnostic message passing interface for heterogeneous computing systems. Hardware-agnostic programming with high performance portability is envisioned to be a bedrock for realizing adoption of emerging accelerator technologies in heterogeneous computing systems, such as high-performance computing (HPC) systems, data center computing systems, and edge computing systems. The adoption of emerging accelerators is key to achieving greater scale and performance in heterogeneous computing systems. Accordingly, embodiments described herein provide a flexible hardware-agnostic environment that allows application developers to develop high-performance applications without knowledge of the underlying hardware.
    Type: Application
    Filed: March 1, 2021
    Publication date: March 9, 2023
    Applicant: ARIZONA BOARD OF REGENTS ON BEHALF OF ARIZONA STATE UNIVERSITY
    Inventors: Michael F. Riera, Fengbo Ren, Masudul Hassan Quraishi, Erfan Bank Tavakoli
  • Patent number: 11468542
    Abstract: This disclosure addresses the single-image compressive sensing (CS) and reconstruction problem. A scalable Laplacian pyramid reconstructive adversarial network (LAPRAN) facilitates high-fidelity, flexible and fast CS image reconstruction. LAPRAN progressively reconstructs an image following the concept of the Laplacian pyramid through multiple stages of reconstructive adversarial networks (RANs). At each pyramid level, CS measurements are fused with a contextual latent vector to generate a high-frequency image residual. Consequently, LAPRAN can produce hierarchies of reconstructed images, each with an incremental resolution and improved quality. The scalable pyramid structure of LAPRAN enables high-fidelity CS reconstruction with a flexible resolution that is adaptive to a wide range of compression ratios (CRs), which is infeasible with existing methods.
    Type: Grant
    Filed: January 17, 2020
    Date of Patent: October 11, 2022
    Assignee: ARIZONA BOARD OF REGENTS ON BEHALF OF ARIZONA STATE UNIVERSITY
    Inventors: Fengbo Ren, Kai Xu, Zhikang Zhang
  • Publication number: 20210349945
    Abstract: A data-driven nonuniform subsampling approach for computation-free on-sensor data dimensionality reduction is provided, referred to herein as selective sensing. Designing an on-sensor data dimensionality reduction scheme for efficient signal sensing has long been a challenging task. Compressive sensing is a generic solution for sensing signals in a compressed format. Although compressive sensing can be directly implemented in the analog domain for specific types of signals, many application scenarios require implementation of data compression in the digital domain. However, the computational complexity involved in digital-domain compressive sensing limits its practical application, especially in resource-constrained sensor devices or high-data-rate sensor devices. Embodiments described herein provide a selective sensing framework that adopts a novel concept of data-driven nonuniform subsampling to reduce the dimensionality of acquired signals while retaining the information of interest in a computation-free fashion.
    Type: Application
    Filed: May 11, 2021
    Publication date: November 11, 2021
    Applicant: Arizona Board of Regents on behalf of Arizona State University
    Inventors: Zhikang Zhang, Fengbo Ren
  • Publication number: 20210350242
    Abstract: Various embodiments of a system and method for pruning binary neural networks by analyzing weight flipping frequency and pruning the binary neural network based on the weight flipping frequency associated with each channel of the binary neural network are disclosed herein.
    Type: Application
    Filed: May 11, 2021
    Publication date: November 11, 2021
    Applicant: Arizona Board of Regents on Behalf of Arizona State University
    Inventors: Yixing Li, Fengbo Ren
  • Publication number: 20210334636
    Abstract: An OpenCL-defined scalable runtime-flexible programmable accelerator architecture for accelerating convolutional neural network (CNN) inference in cloud/edge computing is provided, referred to herein as Systolic-CNN. Existing OpenCL-defined programmable accelerators (e.g., field-programmable gate array (FPGA)-based accelerators) for CNN inference are insufficient due to limited flexibility for supporting multiple CNN models at runtime and poor scalability resulting in underutilized accelerator resources and limited computational parallelism. Systolic-CNN adopts a highly pipelined and paralleled one-dimensional (1-D) systolic array architecture, which efficiently explores both spatial and temporal parallelism for accelerating CNN inference on programmable accelerators (e.g., FPGAs). Systolic-CNN is highly scalable and parameterized, and can be easily adapted by users to efficiently utilize the coarse-grained computation resources for a given programmable accelerator.
    Type: Application
    Filed: April 28, 2021
    Publication date: October 28, 2021
    Applicant: Arizona Board of Regents on behalf of Arizona State University
    Inventors: Akshay Dua, Fengbo Ren
  • Publication number: 20210305999
    Abstract: A compression ratio (CR) adapter (CRA) for end-to-end data-driven compressive sensing (CS) reconstruction (EDCSR) frameworks is provided. EDCSR frameworks achieve state-of-the-art reconstruction performance in terms of reconstruction speed and accuracy for images and other signals. However, existing EDCSR frameworks cannot adapt to a variable CR. For applications that desire a variable CR, existing EDCSR frameworks must be trained from scratch at each CR, which is computationally costly and time-consuming. Embodiments described herein present a CRA framework that addresses the variable CR problem generally for existing and future EDCSR frameworks without modifying given reconstruction models or requiring enormous additional rounds of training. The CRA exploits an initial reconstruction network to generate an initial estimate of reconstruction results based on a small portion of acquired image measurements.
    Type: Application
    Filed: March 31, 2021
    Publication date: September 30, 2021
    Applicant: Arizona Board of Regents on behalf of Arizona State University
    Inventors: Zhikang Zhang, Fengbo Ren, Kai Xu
  • Patent number: 10924755
    Abstract: A real-time end-to-end learning system for a high-frame-rate video compressive sensing network is described. The slow reconstruction speed of conventional compressive sensing approaches is overcome by directly modeling an inverse mapping from the compressed domain to the original domain in a single forward propagation. Through processing massive unlabeled video data, such a mapping is learned by a neural network using data-driven methods. Systems and methods according to this disclosure incorporate a multi-rate convolutional neural network (CNN) and a synthesizing recurrent neural network (RNN) to achieve real-time compression and reconstruction of video data.
    Type: Grant
    Filed: October 19, 2018
    Date of Patent: February 16, 2021
    Assignee: Arizona Board of Regents on behalf of Arizona State University
    Inventors: Fengbo Ren, Kai Xu
  • Publication number: 20200234406
    Abstract: This disclosure addresses the single-image compressive sensing (CS) and reconstruction problem. A scalable Laplacian pyramid reconstructive adversarial network (LAPRAN) facilitates high-fidelity, flexible and fast CS image reconstruction. LAPRAN progressively reconstructs an image following the concept of the Laplacian pyramid through multiple stages of reconstructive adversarial networks (RANs). At each pyramid level, CS measurements are fused with a contextual latent vector to generate a high-frequency image residual. Consequently, LAPRAN can produce hierarchies of reconstructed images, each with an incremental resolution and improved quality. The scalable pyramid structure of LAPRAN enables high-fidelity CS reconstruction with a flexible resolution that is adaptive to a wide range of compression ratios (CRs), which is infeasible with existing methods.
    Type: Application
    Filed: January 17, 2020
    Publication date: July 23, 2020
    Applicant: Arizona Board of Regents on behalf of Arizona State University
    Inventors: Fengbo Ren, Kai Xu, Zhikang Zhang
  • Publication number: 20190124346
    Abstract: A real-time end-to-end learning system for a high-frame-rate video compressive sensing network is described. The slow reconstruction speed of conventional compressive sensing approaches is overcome by directly modeling an inverse mapping from the compressed domain to the original domain in a single forward propagation. Through processing massive unlabeled video data, such a mapping is learned by a neural network using data-driven methods. Systems and methods according to this disclosure incorporate a multi-rate convolutional neural network (CNN) and a synthesizing recurrent neural network (RNN) to achieve real-time compression and reconstruction of video data.
    Type: Application
    Filed: October 19, 2018
    Publication date: April 25, 2019
    Applicant: Arizona Board of Regents on behalf of Arizona State University
    Inventors: Fengbo Ren, Kai Xu
  • Patent number: 10073701
    Abstract: Systems and methods for implementing a scalable very-large-scale integration (VLSI) architecture to perform compressive sensing (CS) hardware reconstruction for data signals in accordance with embodiments of the invention are disclosed. The VLSI architecture is optimized for CS signal reconstruction by implementing a reformulation of the orthogonal matching pursuit (OMP) process and utilizing architecture resource sharing techniques. Typically, the VLSI architecture is a CS reconstruction engine that includes vector and scalar computation cores, where the cores can be time-multiplexed (via dynamic configuration) to perform each task associated with OMP. The vector core includes configurable processing elements (PEs) connected in parallel. Further, the cores can be linked by data-path memories, where the complex data flow of OMP can be customized utilizing local memory controllers synchronized by a top-level finite-state machine.
    Type: Grant
    Filed: July 29, 2014
    Date of Patent: September 11, 2018
    Assignee: The Regents of the University of California
    Inventors: Dejan Markovic, Fengbo Ren
  • Publication number: 20150032990
    Abstract: Systems and methods for implementing a scalable very-large-scale integration (VLSI) architecture to perform compressive sensing (CS) hardware reconstruction for data signals in accordance with embodiments of the invention are disclosed. The VLSI architecture is optimized for CS signal reconstruction by implementing a reformulation of the orthogonal matching pursuit (OMP) process and utilizing architecture resource sharing techniques. Typically, the VLSI architecture is a CS reconstruction engine that includes vector and scalar computation cores, where the cores can be time-multiplexed (via dynamic configuration) to perform each task associated with OMP. The vector core includes configurable processing elements (PEs) connected in parallel. Further, the cores can be linked by data-path memories, where the complex data flow of OMP can be customized utilizing local memory controllers synchronized by a top-level finite-state machine.
    Type: Application
    Filed: July 29, 2014
    Publication date: January 29, 2015
    Inventors: Dejan Markovic, Fengbo Ren
  • Patent number: 8917562
    Abstract: As memory geometries continue to scale down, the current density of magnetic tunnel junctions (MTJs) makes conventional low-current reading schemes problematic with regard to performance and reliability. A body-voltage sense circuit (BVSC) short pulse reading (SPR) circuit is described using body-connected load transistors and a novel sensing circuit with a second-stage amplifier, which allows for very short read pulses providing much higher read margins, less sensing time, and shorter sensing current pulses. Simulation results (using 65-nm CMOS model SPICE simulations) show that our technique can achieve 550 mV of read margin at 1 ns performance under a 1 V supply voltage, which is greater than reference designs achieve at 5 ns performance.
    Type: Grant
    Filed: November 25, 2013
    Date of Patent: December 23, 2014
    Assignee: The Regents of the University of California
    Inventors: Kang-Lung Wang, Chih-Kong K. Yang, Dejan Markovic, Fengbo Ren
  • Publication number: 20140153325
    Abstract: As memory geometries continue to scale down, the current density of magnetic tunnel junctions (MTJs) makes conventional low-current reading schemes problematic with regard to performance and reliability. A body-voltage sense circuit (BVSC) short pulse reading (SPR) circuit is described using body-connected load transistors and a novel sensing circuit with a second-stage amplifier, which allows for very short read pulses providing much higher read margins, less sensing time, and shorter sensing current pulses. Simulation results (using 65-nm CMOS model SPICE simulations) show that our technique can achieve 550 mV of read margin at 1 ns performance under a 1 V supply voltage, which is greater than reference designs achieve at 5 ns performance.
    Type: Application
    Filed: November 25, 2013
    Publication date: June 5, 2014
    Applicant: THE REGENTS OF THE UNIVERSITY OF CALIFORNIA
    Inventors: Kang-Lung Wang, Chih-Kong K. Yang, Dejan Markovic, Fengbo Ren
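
Illustrative sketch for publication number 20210350242 (pruning binary neural networks by weight flipping frequency). The abstract listed above states only that pruning is based on the weight flipping frequency associated with each channel. It does not give the bookkeeping, the threshold, or which channels are removed. The Python sketch below is therefore a hypothetical reading and not the method claimed in the application. It assumes that weight signs (+1 or -1) are recorded per channel at successive training steps, and that channels whose weights flip sign more often than a chosen threshold are the pruning candidates. The names flip_frequency, channels_to_prune, and threshold are invented for this illustration.

```python
from typing import List, Sequence


def flip_frequency(sign_history: Sequence[Sequence[Sequence[int]]]) -> List[float]:
    """Return, per channel, the fraction of weight signs that changed
    between consecutive snapshots.

    sign_history[t][c][k] is the sign (+1 or -1) of weight k in channel c
    at snapshot t. This snapshot layout is an assumption of the sketch.
    """
    steps = len(sign_history)
    if steps < 2:
        raise ValueError("need at least two snapshots to count flips")
    freqs = []
    for c in range(len(sign_history[0])):
        flips = 0
        total = 0
        for t in range(1, steps):
            prev, curr = sign_history[t - 1][c], sign_history[t][c]
            # A flip is a weight whose sign differs from the previous snapshot.
            flips += sum(1 for p, q in zip(prev, curr) if p != q)
            total += len(curr)
        freqs.append(flips / total)
    return freqs


def channels_to_prune(freqs: Sequence[float], threshold: float) -> List[int]:
    """Select channels whose flip frequency exceeds the threshold.

    Treating high-frequency channels as prunable is an assumption made
    here; the abstract does not specify the selection rule.
    """
    return [c for c, f in enumerate(freqs) if f > threshold]


# Toy data: three snapshots, two channels, four binary weights per channel.
history = [
    [[1, -1, 1, 1], [1, 1, -1, -1]],
    [[1, -1, 1, 1], [-1, 1, 1, -1]],
    [[1, -1, 1, -1], [1, -1, -1, 1]],
]
freqs = flip_frequency(history)
pruned = channels_to_prune(freqs, threshold=0.5)
print(freqs, pruned)
```

In this toy history, channel 0 changes one sign out of eight comparisons and channel 1 changes six out of eight, so only channel 1 exceeds the 0.5 threshold. A real training loop would accumulate the counts incrementally instead of storing every snapshot.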