Patents Assigned to Femtosense, Inc.
  • Publication number: 20240289416
    Abstract: Methods and apparatus for accelerating transforms via sparse matrix operations. Conventional processing architectures use bit-reversed addressing and a “butterfly” operation to perform digital signal processing techniques (such as the FFT, DFT, DCT, etc.). However, bit-reversed addressing may also be performed as a single sparse matrix permutation; similarly, butterfly operations may be represented as a sequence of matrix multiplications. Exemplary sparse matrix processors can perform these operations locally with great efficiency. Importantly, instead of sending data from a machine learning (ML) co-processor to a DSP to perform signal processing functions (and then back to the ML co-processor), the entire sequence may be performed on a sparse ML processor. This may greatly improve system power consumption and may entirely obviate the need for a separate DSP in certain (e.g., embedded) systems. (A sketch of this factorization follows this entry.)
    Type: Application
    Filed: February 26, 2024
    Publication date: August 29, 2024
    Applicant: Femtosense, Inc.
    Inventor: Scott Henry Reid
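    A minimal sketch of the factorization described in the abstract above, assuming NumPy and using dense matrices for readability (a sparse matrix processor would store only the non-null entries). The helper names bit_reversal_permutation and butterfly_stage are illustrative, not from the patent: the FFT is computed as one bit-reversal permutation matrix followed by log2(N) sparse "butterfly" matrix multiplications, and checked against np.fft.fft.

    ```python
    # Illustrative sketch: FFT as a permutation matrix plus butterfly matrices.
    import numpy as np

    def bit_reversal_permutation(n):
        """Permutation matrix P that reorders a vector by bit-reversed index."""
        bits = n.bit_length() - 1
        perm = [int(format(i, f"0{bits}b")[::-1], 2) for i in range(n)]
        P = np.zeros((n, n))
        P[np.arange(n), perm] = 1.0   # one non-null entry per row: sparse
        return P

    def butterfly_stage(n, s):
        """Stage-s butterfly as B_s = I_{n/2^s} (x) [[I, D], [I, -D]]."""
        half = 2 ** (s - 1)
        w = np.exp(-2j * np.pi * np.arange(half) / (2 * half))  # twiddles
        D, I = np.diag(w), np.eye(half)
        block = np.block([[I, D], [I, -D]])    # two non-nulls per row: sparse
        return np.kron(np.eye(n // (2 * half)), block)

    n = 8
    x = np.random.randn(n)
    y = bit_reversal_permutation(n) @ x        # single sparse permutation
    for s in range(1, n.bit_length()):         # log2(n) butterfly stages
        y = butterfly_stage(n, s) @ y
    assert np.allclose(y, np.fft.fft(x))       # matches the standard FFT
    ```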
  • Patent number: 11783169
    Abstract: Systems, apparatus, and methods for thread-based scheduling within a multicore processor. Neural networking uses a network of connected nodes (aka neurons) to loosely model the neuro-biological functionality found in the human brain. Various embodiments of the present disclosure use thread dependency graph analysis to decouple scheduling across many distributed cores. Rather than using thread dependency graphs to generate a sequential ordering for a centralized scheduler, the individual thread dependencies define a count value for each thread at compile-time. Threads and their thread dependency count are distributed to each core at run-time. Thereafter, each core can dynamically determine which threads to execute based on fulfilled thread dependencies without requiring a centralized scheduler. (A sketch of this scheme follows this entry.)
    Type: Grant
    Filed: January 2, 2023
    Date of Patent: October 10, 2023
    Assignee: Femtosense, Inc.
    Inventors: Sam Brian Fok, Alexander Smith Neckar
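    A minimal sketch of the decentralized scheduling described in the abstract above, in plain Python. The Core class, its methods, and the two-core example are illustrative assumptions: each thread is distributed with its compile-time dependency count, each core decrements counts as completion notifications arrive, and a thread runs once its count reaches zero, with no centralized scheduler involved.

    ```python
    # Illustrative sketch: per-core scheduling from compile-time dependency counts.
    from collections import defaultdict, deque

    class Core:
        def __init__(self, name):
            self.name = name
            self.pending = {}        # thread id -> remaining dependency count
            self.ready = deque()     # threads whose dependencies are all met
            self.successors = defaultdict(list)  # thread id -> (core, thread)

        def load(self, thread_id, dep_count):
            """Distribute a thread and its compile-time count to this core."""
            if dep_count == 0:
                self.ready.append(thread_id)
            else:
                self.pending[thread_id] = dep_count

        def notify(self, thread_id):
            """A dependency of thread_id finished somewhere; decrement its count."""
            self.pending[thread_id] -= 1
            if self.pending[thread_id] == 0:
                del self.pending[thread_id]
                self.ready.append(thread_id)

        def step(self, fabric):
            """Run one ready thread, then notify its downstream threads."""
            if not self.ready:
                return False
            tid = self.ready.popleft()
            print(f"{self.name} runs {tid}")
            for core, succ in self.successors[tid]:
                fabric[core].notify(succ)
            return True

    # Two cores; thread "c" on core1 waits on "a" (core0) and "b" (core1).
    fabric = {0: Core("core0"), 1: Core("core1")}
    fabric[0].load("a", 0); fabric[1].load("b", 0); fabric[1].load("c", 2)
    fabric[0].successors["a"].append((1, "c"))
    fabric[1].successors["b"].append((1, "c"))
    while any(core.step(fabric) for core in fabric.values()):
        pass   # runs a, then b, then c once both notifications arrive
    ```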
  • Patent number: 11775810
    Abstract: Systems, apparatus, and methods for thread-based scheduling within a multicore processor. Neural networking uses a network of connected nodes (aka neurons) to loosely model the neuro-biological functionality found in the human brain. Various embodiments of the present disclosure use thread dependency graph analysis to decouple scheduling across many distributed cores. Rather than using thread dependency graphs to generate a sequential ordering for a centralized scheduler, the individual thread dependencies define a count value for each thread at compile-time. Threads and their thread dependency count are distributed to each core at run-time. Thereafter, each core can dynamically determine which threads to execute based on fulfilled thread dependencies without requiring a centralized scheduler.
    Type: Grant
    Filed: January 2, 2023
    Date of Patent: October 3, 2023
    Assignee: Femtosense, Inc.
    Inventors: Sam Brian Fok, Alexander Smith Neckar
  • Publication number: 20230153595
    Abstract: Systems, apparatus, and methods for thread-based scheduling within a multicore processor. Neural networking uses a network of connected nodes (aka neurons) to loosely model the neuro-biological functionality found in the human brain. Various embodiments of the present disclosure use thread dependency graph analysis to decouple scheduling across many distributed cores. Rather than using thread dependency graphs to generate a sequential ordering for a centralized scheduler, the individual thread dependencies define a count value for each thread at compile-time. Threads and their thread dependency count are distributed to each core at run-time. Thereafter, each core can dynamically determine which threads to execute based on fulfilled thread dependencies without requiring a centralized scheduler.
    Type: Application
    Filed: January 2, 2023
    Publication date: May 18, 2023
    Applicant: Femtosense, Inc.
    Inventors: Sam Brian Fok, Alexander Smith Neckar
  • Publication number: 20230153596
    Abstract: Systems, apparatus, and methods for thread-based scheduling within a multicore processor. Neural networking uses a network of connected nodes (aka neurons) to loosely model the neuro-biological functionality found in the human brain. Various embodiments of the present disclosure use thread dependency graph analysis to decouple scheduling across many distributed cores. Rather than using thread dependency graphs to generate a sequential ordering for a centralized scheduler, the individual thread dependencies define a count value for each thread at compile-time. Threads and their thread dependency count are distributed to each core at run-time. Thereafter, each core can dynamically determine which threads to execute based on fulfilled thread dependencies without requiring a centralized scheduler.
    Type: Application
    Filed: January 2, 2023
    Publication date: May 18, 2023
    Applicant: Femtosense, Inc.
    Inventors: Sam Brian Fok, Alexander Smith Neckar
  • Publication number: 20230133088
    Abstract: Methods and apparatus for multi-purpose neural network core and memory. The asynchronous/parallel nature of neural network tasks may allow a neural network IP core to dynamically switch between: a system memory (in whole or part), a neural network processor (in whole or part), and/or a hybrid of system memory and neural network processor. In one specific implementation, the multi-purpose neural network IP core has partitioned its sub-cores into a first set of neural network sub-cores, and a second set of memory sub-cores that operate as addressable memory space. Partitioning may be statically assigned at “compile-time”, dynamically assigned at “run-time”, or semi-statically assigned at “program-time”. Any number of considerations may be used to partition the sub-cores; examples of such considerations may include, without limitation: thread priority, memory usage, historic usage, future usage, power consumption, performance, etc. (A sketch of this partitioning follows this entry.)
    Type: Application
    Filed: October 25, 2022
    Publication date: May 4, 2023
    Applicant: Femtosense, Inc.
    Inventors: Sam Brian Fok, Alexander Smith Neckar, Gabriel Vega
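    A minimal sketch of the sub-core partitioning described in the abstract above. The class, the ceiling-division policy, and the capacity figures are illustrative assumptions, not from the patent: sub-cores are repartitioned at run-time between neural-network compute and addressable memory based on memory demand.

    ```python
    # Illustrative sketch: sub-cores switched between compute and memory roles.
    from enum import Enum

    class Role(Enum):
        COMPUTE = "neural-network sub-core"
        MEMORY = "addressable-memory sub-core"

    class MultiPurposeCore:
        def __init__(self, num_subcores, sram_per_subcore_kib):
            self.roles = [Role.COMPUTE] * num_subcores
            self.sram_kib = sram_per_subcore_kib   # assumed capacity figure

        def repartition(self, memory_demand_kib):
            """At run-time, convert just enough sub-cores to addressable memory."""
            needed = -(-memory_demand_kib // self.sram_kib)  # ceiling division
            if needed > len(self.roles):
                raise ValueError("demand exceeds total on-core SRAM")
            self.roles = ([Role.MEMORY] * needed +
                          [Role.COMPUTE] * (len(self.roles) - needed))

    core = MultiPurposeCore(num_subcores=8, sram_per_subcore_kib=64)
    core.repartition(memory_demand_kib=200)   # 200 KiB -> 4 memory sub-cores
    print([r.value for r in core.roles])      # 4 memory + 4 compute sub-cores
    ```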
  • Patent number: 11625592
    Abstract: Systems, apparatus, and methods for thread-based scheduling within a multicore processor. Neural networking uses a network of connected nodes (aka neurons) to loosely model the neuro-biological functionality found in the human brain. Various embodiments of the present disclosure use thread dependency graph analysis to decouple scheduling across many distributed cores. Rather than using thread dependency graphs to generate a sequential ordering for a centralized scheduler, the individual thread dependencies define a count value for each thread at compile-time. Threads and their thread dependency count are distributed to each core at run-time. Thereafter, each core can dynamically determine which threads to execute based on fulfilled thread dependencies without requiring a centralized scheduler.
    Type: Grant
    Filed: July 5, 2021
    Date of Patent: April 11, 2023
    Assignee: Femtosense, Inc.
    Inventors: Sam Brian Fok, Alexander Smith Neckar
  • Publication number: 20220012575
    Abstract: Methods and apparatus for localized processing within multicore neural networks. Unlike existing solutions that rely on commodity software and hardware to perform “brute force” large-scale neural network processing, the various techniques described herein map and partition a neural network to fit the hardware limitations of a target platform. Specifically, the various implementations described herein synergistically leverage localization, sparsity, and distributed scheduling, to enable neural network processing within embedded hardware applications. As described herein, hardware-aware mapping/partitioning enhances neural network performance by e.g., avoiding pin-limited memory accesses, processing data in compressed formats/skipping unnecessary operations, and decoupling scheduling between cores. (A sketch of such partitioning follows this entry.)
    Type: Application
    Filed: July 5, 2021
    Publication date: January 13, 2022
    Applicant: Femtosense, Inc.
    Inventors: Sam Brian Fok, Alexander Smith Neckar, Scott Henry Reid
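    A minimal sketch of hardware-aware partitioning in the spirit of the abstract above, assuming NumPy. The per-core memory capacity and the row-blocking policy are illustrative assumptions: a layer's weights are split so that each block fits within a core's local memory, so no core needs pin-limited off-chip accesses during the matrix-vector work.

    ```python
    # Illustrative sketch: split a layer's weights into per-core row blocks.
    import numpy as np

    CORE_MEM_BYTES = 64 * 1024          # assumed per-core local memory

    def partition_rows(weights, bytes_per_entry):
        """Split a weight matrix into row blocks that each fit on one core."""
        rows, cols = weights.shape
        rows_per_core = max(1, CORE_MEM_BYTES // (cols * bytes_per_entry))
        return [weights[i:i + rows_per_core]
                for i in range(0, rows, rows_per_core)]

    W = np.zeros((1024, 256), dtype=np.int8)     # a 256 KiB layer
    blocks = partition_rows(W, bytes_per_entry=W.itemsize)
    print(len(blocks), "cores, each holding", blocks[0].nbytes // 1024, "KiB")
    # -> 4 cores, each holding 64 KiB; each core computes its slice of the output
    ```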
  • Publication number: 20220012060
    Abstract: Systems, apparatus, and methods for thread-based scheduling within a multicore processor. Neural networking uses a network of connected nodes (aka neurons) to loosely model the neuro-biological functionality found in the human brain. Various embodiments of the present disclosure use thread dependency graph analysis to decouple scheduling across many distributed cores. Rather than using thread dependency graphs to generate a sequential ordering for a centralized scheduler, the individual thread dependencies define a count value for each thread at compile-time. Threads and their thread dependency count are distributed to each core at run-time. Thereafter, each core can dynamically determine which threads to execute based on fulfilled thread dependencies without requiring a centralized scheduler.
    Type: Application
    Filed: July 5, 2021
    Publication date: January 13, 2022
    Applicant: Femtosense, Inc.
    Inventors: Sam Brian Fok, Alexander Smith Neckar
  • Publication number: 20220012598
    Abstract: Methods and apparatus for matrix and vector storage and operations are disclosed. Vectors and matrices may be represented differently to further enhance performance of operations. Exemplary embodiments compress sparse neural network data structures based on actual, non-null, connectivity (rather than all possible connections). This greatly reduces storage requirements as well as computational complexity. In some variants, the compression and reduction in complexity is sized to fit within the memory footprint and processing capabilities of a core. The exemplary compression schemes represent sparse matrices with links to compressed column data structures, where each compressed column data structure only stores non-null entries to optimize column-based lookups of non-null entries. Similarly, sparse vector addressing skips nulled entries to optimize for vector-specific non-null multiply-accumulate operations. (A sketch of this scheme follows this entry.)
    Type: Application
    Filed: July 5, 2021
    Publication date: January 13, 2022
    Applicant: Femtosense, Inc.
    Inventors: Sam Brian Fok, Alexander Smith Neckar, Manish Shrivastava
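    A minimal sketch of the compressed-column storage and null-skipping multiply-accumulate described in the abstract above, assuming NumPy. The structure and function names are illustrative: each column stores only its non-null (row, value) pairs, the sparse vector supplies only its non-null (index, value) pairs, and nulled entries are never touched.

    ```python
    # Illustrative sketch: compressed columns plus a null-skipping matvec.
    import numpy as np

    def compress_columns(dense):
        """Represent a sparse matrix as per-column lists of non-null entries."""
        cols = []
        for j in range(dense.shape[1]):
            rows = np.nonzero(dense[:, j])[0]
            cols.append((rows, dense[rows, j]))   # only non-null entries stored
        return cols

    def sparse_matvec(cols, x_idx, x_val, n_rows):
        """y = A @ x, with x given sparsely as (indices, values)."""
        y = np.zeros(n_rows)
        for j, xv in zip(x_idx, x_val):           # skip nulled vector entries
            rows, vals = cols[j]                  # column lookup hits non-nulls only
            y[rows] += vals * xv                  # non-null multiply-accumulate
        return y

    A = np.array([[0., 2., 0.],
                  [1., 0., 0.],
                  [0., 0., 3.]])
    cols = compress_columns(A)
    y = sparse_matvec(cols, x_idx=[0, 2], x_val=[5.0, 1.0], n_rows=3)
    assert np.allclose(y, A @ np.array([5.0, 0.0, 1.0]))
    ```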