Patents by Inventor David Bernard
David Bernard has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12541569
Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed for performing a machine learning operation using storage element pointers. An example computer readable medium comprises instructions that when executed, cause at least one processor to select, in response to a determination that a machine learning operation is to be performed, create first and second storage element pointers based on a type of machine learning operation to be performed, remap input tensor data of the input tensor based on the first storage element pointer without movement of the input tensor data in memory, cause execution of the machine learning operation with the remapped input tensor data to create intermediate tensor data, remap the intermediate tensor data based on the second storage element pointer without movement of the intermediate tensor data in memory, and provide the remapped intermediate tensor data as an output tensor.
Type: Grant
Filed: December 17, 2021
Date of Patent: February 3, 2026
Assignee: Intel Corporation
Inventors: Kevin Brady, Martin Power, Martin-Thomas Grymel, Alessandro Palla, David Bernard, Niall Hanrahan
-
Patent number: 12530292
Abstract: Caching techniques can include: receiving, from a host, a read I/O operation requesting to read current content of a logical address; determining whether a data cache includes a data cache entry corresponding to the logical address; responsive to determining the data cache includes the data cache entry corresponding to the logical address, performing data cache hit processing to service the read I/O operation using the data cache entry; responsive to determining the data cache does not include the data cache entry corresponding to the logical address, performing data cache miss processing including: determining whether a mapping cache includes a descriptor corresponding to the logical address; and responsive to determining the mapping cache includes the descriptor corresponding to the logical address, performing mapping cache hit processing to service the read I/O operation using the descriptor of the mapping cache.
Type: Grant
Filed: April 17, 2024
Date of Patent: January 20, 2026
Assignee: Dell Products L.P.
Inventors: Ashok Tamilarasan, Vamsi K Vankamamidi, David Bernard
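The read path in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the class name, the dictionary-based caches, the `backend` store, and the `mapping_table` fallback are all hypothetical, and the abstract does not say what happens on a mapping cache miss, so that branch is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class TwoLevelReadPath:
    # All structures below are hypothetical stand-ins, not names from the patent.
    data_cache: dict = field(default_factory=dict)     # logical address -> content
    mapping_cache: dict = field(default_factory=dict)  # logical address -> descriptor
    backend: dict = field(default_factory=dict)        # descriptor -> stored content
    mapping_table: dict = field(default_factory=dict)  # assumed slow-path lookup

    def read(self, logical_address):
        """Return (content, path_taken) for a read of logical_address."""
        # Data cache hit: serve the read from the cached content.
        if logical_address in self.data_cache:
            return self.data_cache[logical_address], "data_cache_hit"

        # Data cache miss: check the mapping cache for a descriptor.
        descriptor = self.mapping_cache.get(logical_address)
        if descriptor is not None:
            return self.backend[descriptor], "mapping_cache_hit"

        # Assumption: the abstract does not cover this case. Here we fall back
        # to a full mapping lookup and remember the descriptor for next time.
        descriptor = self.mapping_table[logical_address]
        self.mapping_cache[logical_address] = descriptor
        return self.backend[descriptor], "mapping_cache_miss"


path = TwoLevelReadPath(
    data_cache={0x10: b"hot"},
    mapping_cache={0x20: "pba-7"},
    backend={"pba-7": b"warm", "pba-9": b"cold"},
    mapping_table={0x20: "pba-7", 0x30: "pba-9"},
)
```

Calling `path.read(0x10)` takes the data cache branch, `path.read(0x20)` takes the mapping cache branch, and `path.read(0x30)` takes the assumed fallback once before becoming a mapping cache hit.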
-
Patent number: 12499013
Abstract: Techniques for tracking incoming writes and snap creation/deletion in memory to improve asynchronous replication and support lower RPOs. In the techniques, a storage system uses its data cache to receive data specified in write requests issued by storage clients, while dedicating an amount of the cache memory to track and record offsets/lengths of writes directed to source volumes. At the end of each replication interval, the storage system obtains a list of the recorded offsets/lengths for each source volume, identifies and reads areas of the source volume that were written to during the replication interval using the list, and replicates data from the identified areas to a destination volume. Because the list of recorded offsets/lengths of incoming writes for the source volume is compiled and available from volatile cache memory, it can be generated and accessed very quickly using reduced processing/memory resources, allowing for lower RPOs in asynchronous replication processes.
Type: Grant
Filed: January 2, 2024
Date of Patent: December 16, 2025
Assignee: Dell Products L.P.
Inventors: David Bernard, Mayank Ajmera, Vamsi K. Vankamamidi, Vikram Prabhakar
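A rough sketch of the offset/length tracking idea, under stated assumptions: volumes are modelled as in-memory byte arrays, `WriteTracker` and `replicate_interval` are hypothetical names, and merging overlapping extents before copying is an illustrative choice that the abstract does not mention. Snapshot creation and deletion are not modelled.

```python
from collections import defaultdict


class WriteTracker:
    """In-memory record of (offset, length) pairs per source volume."""

    def __init__(self):
        self._extents = defaultdict(list)

    def record_write(self, volume, offset, length):
        self._extents[volume].append((offset, length))

    def drain(self, volume):
        """Return sorted, merged extents for the interval and reset tracking."""
        extents = sorted(self._extents.pop(volume, []))
        merged = []
        for offset, length in extents:
            if merged and offset <= merged[-1][0] + merged[-1][1]:
                # Overlapping or touching extent: widen the previous one.
                prev_offset, prev_length = merged[-1]
                end = max(prev_offset + prev_length, offset + length)
                merged[-1] = (prev_offset, end - prev_offset)
            else:
                merged.append((offset, length))
        return merged


def replicate_interval(tracker, volume, source, destination):
    """Copy only the areas written during the interval; return bytes copied."""
    copied = 0
    for offset, length in tracker.drain(volume):
        destination[offset:offset + length] = source[offset:offset + length]
        copied += length
    return copied


source = bytearray(b"AAAABBBBCCCCDDDD")
destination = bytearray(16)
tracker = WriteTracker()
tracker.record_write("vol1", 4, 4)
tracker.record_write("vol1", 6, 4)   # overlaps the previous write
tracker.record_write("vol1", 12, 2)
copied = replicate_interval(tracker, "vol1", source, destination)
```

Only the tracked areas reach `destination`; untouched regions are never read from the source, which is the property the abstract credits for lower RPOs.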
-
Patent number: 12487908
Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed to debug a hardware accelerator such as a neural network accelerator for executing Artificial Intelligence computational workloads. An example apparatus includes a core with a core input and a core output to execute executable code based on a machine-learning model to generate a data output based on a data input, and debug circuitry coupled to the core. The debug circuitry is configured to detect a breakpoint associated with the machine-learning model, compile executable code based on at least one of the machine-learning model or the breakpoint. In response to the triggering of the breakpoint, the debug circuitry is to stop the execution of the executable code and output data such as the data input, data output and the breakpoint for debugging the hardware accelerator.
Type: Grant
Filed: October 16, 2023
Date of Patent: December 2, 2025
Assignee: Intel Corporation
Inventors: Martin-Thomas Grymel, David Bernard, Martin Power, Niall Hanrahan, Kevin Brady
-
Publication number: 20250328469
Abstract: Caching techniques can include: receiving, from a host, a read I/O operation requesting to read current content of a logical address; determining whether a data cache includes a data cache entry corresponding to the logical address; responsive to determining the data cache includes the data cache entry corresponding to the logical address, performing data cache hit processing to service the read I/O operation using the data cache entry; responsive to determining the data cache does not include the data cache entry corresponding to the logical address, performing data cache miss processing including: determining whether a mapping cache includes a descriptor corresponding to the logical address; and responsive to determining the mapping cache includes the descriptor corresponding to the logical address, performing mapping cache hit processing to service the read I/O operation using the descriptor of the mapping cache.
Type: Application
Filed: April 17, 2024
Publication date: October 23, 2025
Applicant: Dell Products L.P.
Inventors: Ashok Tamilarasan, Vamsi K. Vankamamidi, David Bernard
-
Publication number: 20250321862
Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed to debug a hardware accelerator such as a neural network accelerator for executing Artificial Intelligence computational workloads. An example apparatus includes a core with a core input and a core output to execute executable code based on a machine-learning model to generate a data output based on a data input, and debug circuitry coupled to the core. The debug circuitry is configured to detect a breakpoint associated with the machine-learning model, compile executable code based on at least one of the machine-learning model or the breakpoint. In response to the triggering of the breakpoint, the debug circuitry is to stop the execution of the executable code and output data such as the data input, data output and the breakpoint for debugging the hardware accelerator.
Type: Application
Filed: June 26, 2025
Publication date: October 16, 2025
Applicant: Intel Corporation
Inventors: Martin-Thomas Grymel, David Bernard, Martin Power, Niall Hanrahan, Kevin Brady
-
Patent number: 12430239
Abstract: Methods, apparatus, systems and articles of manufacture are disclosed for sparse tensor storage for neural network accelerators. An example apparatus includes sparsity map generating circuitry to generate a sparsity map corresponding to a tensor, the sparsity map to indicate whether a data point of the tensor is zero, static storage controlling circuitry to divide the tensor into one or more storage elements, and a compressor to perform a first compression of the one or more storage elements to generate one or more compressed storage elements, the first compression to remove zero points of the one or more storage elements based on the sparsity map and perform a second compression of the one or more compressed storage elements, the second compression to store the one or more compressed storage elements contiguously in memory.
Type: Grant
Filed: December 14, 2023
Date of Patent: September 30, 2025
Assignee: Intel Corporation
Inventors: Martin-Thomas Grymel, David Bernard, Niall Hanrahan, Martin Power, Kevin Brady, Gary Baugh, Cormac Brick
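As a software analogy for the two compression steps described here, the sketch below works on a flat Python list. It is only an illustration under assumptions: the patent describes circuitry, the function names are hypothetical, the map polarity (1 for non-zero) is a guess, and the per-element offset table is an added convenience that the abstract does not mention.

```python
def compress_tensor(values, element_size):
    """Return (sparsity_map, packed, offsets) for a flat list of numbers."""
    # Sparsity map: one flag per data point (assumed 1 = non-zero, 0 = zero).
    sparsity_map = [1 if v != 0 else 0 for v in values]

    # Divide the tensor into fixed-size storage elements, then apply the
    # first compression: drop the zero points flagged by the sparsity map.
    compressed = []
    for start in range(0, len(values), element_size):
        element = values[start:start + element_size]
        flags = sparsity_map[start:start + element_size]
        compressed.append([v for v, flag in zip(element, flags) if flag])

    # Second compression: store the compressed elements back to back,
    # recording where each one starts (the offset table is an assumption).
    packed, offsets = [], []
    for element in compressed:
        offsets.append(len(packed))
        packed.extend(element)
    return sparsity_map, packed, offsets


def decompress_tensor(sparsity_map, packed):
    """Rebuild the dense tensor from the sparsity map and packed values."""
    remaining = iter(packed)
    return [next(remaining) if flag else 0 for flag in sparsity_map]


tensor = [0, 3, 0, 0, 5, 0, 7, 8, 0, 0, 0, 0]
sparsity_map, packed, offsets = compress_tensor(tensor, element_size=4)
restored = decompress_tensor(sparsity_map, packed)
```

With three storage elements of four points each, only the four non-zero values are stored, and the sparsity map is enough to restore the dense layout.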
-
Publication number: 20250265464
Abstract: Methods, apparatus, systems and articles of manufacture are disclosed to perform machine-learning model operations on sparse accelerators. An example apparatus includes first circuitry, second circuitry to generate sparsity data based on an acceleration operation, and third circuitry to instruct one or more data buffers to provide at least one of activation data or weight data based on the sparsity data to the first circuitry, the first circuitry to execute the acceleration operation based on the at least one of the activation data or the weight data.
Type: Application
Filed: May 5, 2025
Publication date: August 21, 2025
Applicant: Intel Corporation
Inventors: Martin Power, Kevin Brady, Niall Hanrahan, Martin-Thomas Grymel, David Bernard, Gary Baugh
-
Publication number: 20250217234
Abstract: Techniques for tracking incoming writes and snap creation/deletion in memory to improve asynchronous replication and support lower RPOs. In the techniques, a storage system uses its data cache to receive data specified in write requests issued by storage clients, while dedicating an amount of the cache memory to track and record offsets/lengths of writes directed to source volumes. At the end of each replication interval, the storage system obtains a list of the recorded offsets/lengths for each source volume, identifies and reads areas of the source volume that were written to during the replication interval using the list, and replicates data from the identified areas to a destination volume. Because the list of recorded offsets/lengths of incoming writes for the source volume is compiled and available from volatile cache memory, it can be generated and accessed very quickly using reduced processing/memory resources, allowing for lower RPOs in asynchronous replication processes.
Type: Application
Filed: January 2, 2024
Publication date: July 3, 2025
Inventors: David Bernard, Mayank Ajmera, Vamsi K. Vankamamidi, Vikram Prabhakar
-
Publication number: 20250201424
Abstract: It is disclosed a computer-implemented method for determining a physiological age of a subject, comprising applying, on a set of values comprising at least values of biological variables relative to the subject, a trained model configured to predict the chronological age of a subject based on the set of values, to obtain a predicted age of the subject, wherein the physiological age corresponds to said predicted age.
Type: Application
Filed: March 23, 2023
Publication date: June 19, 2025
Inventors: Louis CASTEILLA, Isabelle ADER, Philippe KEMOUN, Julien ALIGON, Paul MONSARRAT, Sylvain CUSSAT-BLANC, David BERNARD, Emmanuel DOUMARD, Luc PENICAUD
-
Patent number: 12321857
Abstract: Methods, apparatus, systems and articles of manufacture are disclosed to perform machine-learning model operations on sparse accelerators. An example apparatus includes first circuitry, second circuitry to generate sparsity data based on an acceleration operation, and third circuitry to instruct one or more data buffers to provide at least one of activation data or weight data based on the sparsity data to the first circuitry, the first circuitry to execute the acceleration operation based on the at least one of the activation data or the weight data.
Type: Grant
Filed: June 24, 2021
Date of Patent: June 3, 2025
Assignee: Intel Corporation
Inventors: Martin Power, Kevin Brady, Niall Hanrahan, Martin-Thomas Grymel, David Bernard, Gary Baugh
-
Patent number: 12169643
Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed that increase data reuse for multiply and accumulate (MAC) operations. An example apparatus includes a MAC circuit to process a first context of a set of a first type of contexts stored in a first buffer and a first context of a set of a second type of contexts stored in a second buffer. The example apparatus also includes control logic circuitry to, in response to determining that there is an additional context of the second type to be processed in the set of the second type of contexts, maintain the first context of the first type in the first buffer. The control logic circuitry is also to, in response to determining that there is an additional context of the first type to be processed in the set of the first type of contexts maintain the first context of the second type in the second buffer and iterate a pointer of the second buffer from a first position to a next position in the second buffer.
Type: Grant
Filed: September 12, 2023
Date of Patent: December 17, 2024
Assignee: Intel Corporation
Inventors: Niall Hanrahan, Martin Power, Kevin Brady, Martin-Thomas Grymel, David Bernard, Gary Baugh, Cormac Brick
-
Patent number: 12117989
Abstract: Techniques can include: receiving from a component a bufferless read request to read data of a storage object; opening a first transaction; acquiring locks of data pages including the read data; locking cache pages; storing the read data in the cache pages; sending to the component a notification identifying references, pointers or addresses of the cache pages storing the read data; responsive to receiving the notification, the first component performing one or more operations including directly accessing the first data from the cache pages using the references, pointers or addresses; and responsive to successfully completing the one or more operations, performing second processing including: releasing or unlocking the set of one or more cache pages storing the first data; releasing the one or more locks of the one or more data pages including the first data; and closing the first transaction.
Type: Grant
Filed: September 11, 2023
Date of Patent: October 15, 2024
Assignee: Dell Products L.P.
Inventors: Alan L. Taylor, Nagapraveen Veeravenkata Seela, David Bernard
-
Patent number: 12007976
Abstract: A method, computer program product, and computer system for acquiring, by a first node, local locks of the first node associated with a metadata log transaction, wherein the first node acquires the local locks of the first node prior to sending a commit message to a second node. The second node may acquire local locks of the second node associated with the metadata log transaction, wherein the second node acquires the local locks of the second node based upon, at least in part, receiving the commit message from the first node.
Type: Grant
Filed: April 28, 2021
Date of Patent: June 11, 2024
Assignee: EMC IP Holding Company, LLC
Inventors: Vladimir Shveidel, Bar David, David Bernard, Jason E. Raff, Shari A. Vietry
-
Publication number: 20240134786
Abstract: Methods, apparatus, systems and articles of manufacture are disclosed for sparse tensor storage for neural network accelerators. An example apparatus includes sparsity map generating circuitry to generate a sparsity map corresponding to a tensor, the sparsity map to indicate whether a data point of the tensor is zero, static storage controlling circuitry to divide the tensor into one or more storage elements, and a compressor to perform a first compression of the one or more storage elements to generate one or more compressed storage elements, the first compression to remove zero points of the one or more storage elements based on the sparsity map and perform a second compression of the one or more compressed storage elements, the second compression to store the one or more compressed storage elements contiguously in memory.
Type: Application
Filed: December 14, 2023
Publication date: April 25, 2024
Applicant: Intel Corporation
Inventors: Martin-Thomas Grymel, David Bernard, Niall Hanrahan, Martin Power, Kevin Brady, Gary Baugh, Cormac Brick
-
Publication number: 20240118992
Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed to debug a hardware accelerator such as a neural network accelerator for executing Artificial Intelligence computational workloads. An example apparatus includes a core with a core input and a core output to execute executable code based on a machine-learning model to generate a data output based on a data input, and debug circuitry coupled to the core. The debug circuitry is configured to detect a breakpoint associated with the machine-learning model, compile executable code based on at least one of the machine-learning model or the breakpoint. In response to the triggering of the breakpoint, the debug circuitry is to stop the execution of the executable code and output data such as the data input, data output and the breakpoint for debugging the hardware accelerator.
Type: Application
Filed: October 16, 2023
Publication date: April 11, 2024
Applicant: Intel Corporation
Inventors: Martin-Thomas Grymel, David Bernard, Martin Power, Niall Hanrahan, Kevin Brady
-
Patent number: 11940907
Abstract: Methods, apparatus, systems and articles of manufacture are disclosed for sparse tensor storage for neural network accelerators. An example apparatus includes sparsity map generating circuitry to generate a sparsity map corresponding to a tensor, the sparsity map to indicate whether a data point of the tensor is zero, static storage controlling circuitry to divide the tensor into one or more storage elements, and a compressor to perform a first compression of the one or more storage elements to generate one or more compressed storage elements, the first compression to remove zero points of the one or more storage elements based on the sparsity map and perform a second compression of the one or more compressed storage elements, the second compression to store the one or more compressed storage elements contiguously in memory.
Type: Grant
Filed: June 25, 2021
Date of Patent: March 26, 2024
Assignee: INTEL CORPORATION
Inventors: Martin-Thomas Grymel, David Bernard, Niall Hanrahan, Martin Power, Kevin Brady, Gary Baugh, Cormac Brick
-
Patent number: 11893252
Abstract: Processing can be performed to persistently record, in a log, a write I/O that writes first data to a target logical address. The processing can include: allocating storage for a first page buffer (PB) located at offsets in a PB pool of non-volatile storage of the log; enqueuing a request to an aggregation queue to persistently store the first data to the first PB of the log, wherein the request identifies the offsets of the PB pool of non-volatile storage which correspond to the first PB; and integrating the request into the aggregation queue. Integrating can include: determining whether a contiguous segment of the offsets of the request is adjacent to a second contiguous segment of the aggregation queue; and responsive to determining the contiguous segment is adjacent to the second contiguous segment, merging the first and second contiguous segments and generating an aggregated continuous segment.
Type: Grant
Filed: July 15, 2022
Date of Patent: February 6, 2024
Assignee: Dell Products L.P.
Inventors: Svetlana Kronrod, Vladimir Shveidel, David Bernard, Vamsi K. Vankamamidi
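The integration step, where an incoming request is merged with adjacent segments already in the aggregation queue, might look roughly like this. It is a sketch only: the queue is modelled as a plain list of half-open `(start, end)` offset ranges, `integrate_request` is a hypothetical name, and segments are assumed never to overlap.

```python
def integrate_request(queue, offset, length):
    """Merge a new contiguous segment into a list of (start, end) segments.

    Segments are half-open [start, end) ranges of PB pool offsets and are
    assumed not to overlap (an assumption of this sketch).
    """
    new_start, new_end = offset, offset + length
    remaining = []
    for start, end in queue:
        if end == new_start:      # existing segment ends where the request begins
            new_start = start
        elif start == new_end:    # existing segment begins where the request ends
            new_end = end
        else:
            remaining.append((start, end))
    remaining.append((new_start, new_end))
    return sorted(remaining)


queue = []
queue = integrate_request(queue, 0, 4)    # first segment
queue = integrate_request(queue, 8, 4)    # not adjacent, stays separate
queue = integrate_request(queue, 20, 2)   # not adjacent to anything
queue = integrate_request(queue, 4, 4)    # bridges the first two segments
```

The last request is adjacent to segments on both sides, so the three ranges collapse into one aggregated segment that could be persisted with a single larger write.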
-
Publication number: 20240036763
Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed that increase data reuse for multiply and accumulate (MAC) operations. An example apparatus includes a MAC circuit to process a first context of a set of a first type of contexts stored in a first buffer and a first context of a set of a second type of contexts stored in a second buffer. The example apparatus also includes control logic circuitry to, in response to determining that there is an additional context of the second type to be processed in the set of the second type of contexts, maintain the first context of the first type in the first buffer. The control logic circuitry is also to, in response to determining that there is an additional context of the first type to be processed in the set of the first type of contexts maintain the first context of the second type in the second buffer and iterate a pointer of the second buffer from a first position to a next position in the second buffer.
Type: Application
Filed: September 12, 2023
Publication date: February 1, 2024
Applicant: Intel Corporation
Inventors: Niall Hanrahan, Martin Power, Kevin Brady, Martin-Thomas Grymel, David Bernard, Gary Baugh, Cormac Brick
-
Publication number: 20240020031
Abstract: Processing can be performed to persistently record, in a log, a write I/O that writes first data to a target logical address. The processing can include: allocating storage for a first page buffer (PB) located at offsets in a PB pool of non-volatile storage of the log; enqueuing a request to an aggregation queue to persistently store the first data to the first PB of the log, wherein the request identifies the offsets of the PB pool of non-volatile storage which correspond to the first PB; and integrating the request into the aggregation queue. Integrating can include: determining whether a contiguous segment of the offsets of the request is adjacent to a second contiguous segment of the aggregation queue; and responsive to determining the contiguous segment is adjacent to the second contiguous segment, merging the first and second contiguous segments and generating an aggregated continuous segment.
Type: Application
Filed: July 15, 2022
Publication date: January 18, 2024
Applicant: Dell Products L.P.
Inventors: Svetlana Kronrod, Vladimir Shveidel, David Bernard, Vamsi K. Vankamamidi