Patents by Inventor Hartmut Penner

Hartmut Penner has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

SELECTIVE MULTICAST DELIVERY ON A BUS-BASED INTERCONNECT

Publication number: 20200106717

Abstract: According to one embodiment, a computer program product for performing selective multicast delivery includes a computer readable storage medium having program instructions embodied therewith, wherein the computer readable storage medium is not a transitory signal per se, and where the program instructions are executable by a selector of an intelligent processing unit (IPU) to cause the selector to perform a method comprising identifying, by the selector, an address header appended to an instance of data, comparing, by the selector, address data in the address header to identifier data stored at the selector, and conditionally delivering, by the selector, the instance of data, based on the comparing.

Type: Application

Filed: September 28, 2018

Publication date: April 2, 2020

Inventors: Simon J. Hollis, Hartmut Penner, Andrew S. Cassidy, Jun Sawada, Pallab Datta
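
The selective delivery described in the abstract above can be illustrated with a minimal sketch. The `Packet` and `Selector` names, and the choice of a bitmask match for the "comparing" step, are assumptions made for illustration; the abstract does not specify how the comparison is performed.

```python
from dataclasses import dataclass, field


@dataclass
class Packet:
    address: int    # address data carried in the header
    payload: bytes  # the instance of data


@dataclass
class Selector:
    identifier: int  # identifier data stored at the selector
    delivered: list = field(default_factory=list)

    def offer(self, packet: Packet) -> bool:
        """Deliver the payload only if the header address matches."""
        # Assumption: a bitmask match; the abstract only says "comparing".
        if packet.address & self.identifier:
            self.delivered.append(packet.payload)
            return True
        return False


# Four selectors on a shared bus, each owning one bit of the address mask.
selectors = [Selector(identifier=1 << i) for i in range(4)]
packet = Packet(address=0b0101, payload=b"weights")
results = [s.offer(packet) for s in selectors]
```

Here every selector sees the same packet, and only those whose identifier bit is set in the header accept it.
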
SCHEDULER FOR MAPPING NEURAL NETWORKS ONTO AN ARRAY OF NEURAL CORES IN AN INFERENCE PROCESSING UNIT

Publication number: 20200042856

Abstract: Mapping of neural network layers to physical neural cores is provided. In various embodiments, a neural network description describing a plurality of neural network layers is read. Each of the plurality of neural network layers has an associated weight tensor, input tensor, and output tensor. A plurality of precedence relationships among the plurality of neural network layers is determined. The weight tensor, input tensor, and output tensor of each of the plurality of neural network layers are mapped onto an array of neural cores.

Type: Application

Filed: July 31, 2018

Publication date: February 6, 2020

Inventors: Pallab Datta, Andrew S. Cassidy, Myron D. Flickner, Hartmut Penner, Rathinakumar Appuswamy, Jun Sawada, John V. Arthur, Dharmendra S. Modha, Steven K. Esser, Brian Taba, Jennifer Klamo
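
One way to picture the precedence step in the abstract above is a topological ordering of layers. The sketch below covers only that step, not the mapping of tensors onto cores. The layer names and the `layer_inputs` description format are hypothetical, and using Python's `graphlib` is a stand-in technique rather than the method the application claims.

```python
from graphlib import TopologicalSorter

# Hypothetical network description: each layer lists the layers
# whose output tensors it consumes (its predecessors).
layer_inputs = {
    "conv1": [],
    "conv2": ["conv1"],
    "skip": ["conv1"],
    "add": ["conv2", "skip"],
}

# A valid schedule places every layer after all of its predecessors.
order = list(TopologicalSorter(layer_inputs).static_order())
```

The relative order of `conv2` and `skip` is not fixed by the precedence relationships, which is the freedom a scheduler could exploit when placing layers.
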
EXTENDING EXISTING STORAGE DEVICES IN VIRTUALIZED ENVIRONMENTS

Publication number: 20200019349

Abstract: A method, system and computer program product for providing a guest with access to a virtual storage on a physical storage using a peripheral component interface hub. In one embodiment, the method comprises the guest sending to the peripheral component interface hub a request to access the physical storage, the request including physical addresses of the physical storage, and the peripheral component interface hub sending specified information about the request to a hypervisor. This method further comprises the hypervisor determining whether to grant or to reject the request; and when the hypervisor grants the request, the hypervisor sending a configuration command to the peripheral component interface hub. This command includes a mapping of addresses from the physical storage to addresses from the virtual storage. In an embodiment, the peripheral component interface hub uses this mapping to replace the addresses in the request with translated virtual addresses.

Type: Application

Filed: September 9, 2019

Publication date: January 16, 2020

Inventors: Hubertus Franke, Davide Pasetto, Hartmut Penner
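
The request flow in the abstract above (guest to hub, hub to hypervisor, grant or reject, then address translation at the hub) can be sketched as follows. The `Hub` class, the `hypervisor` callback, the guest names, and the fixed offset mapping are all hypothetical and exist only to make the flow concrete.

```python
class Hub:
    """Hypothetical stand-in for the peripheral component interface hub."""

    def __init__(self, hypervisor):
        self.hypervisor = hypervisor
        self.mappings = {}  # guest -> address mapping configured by the hypervisor

    def access(self, guest, addresses):
        # Forward information about the request; the hypervisor grants or rejects.
        mapping = self.hypervisor(guest, addresses)
        if mapping is None:
            raise PermissionError(f"request from {guest} rejected")
        self.mappings[guest] = mapping  # plays the role of the configuration command
        # Replace each address in the request with its translated counterpart.
        return [mapping[a] for a in addresses]


def hypervisor(guest, addresses):
    """Grant only guest 'vm1', using a made-up fixed-offset mapping."""
    if guest != "vm1":
        return None
    return {a: a + 0x1000 for a in addresses}


hub = Hub(hypervisor)
translated = hub.access("vm1", [0x10, 0x20])
```

A request from any other guest raises `PermissionError` in this sketch, standing in for the rejected case.
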
HIERARCHICAL PARALLELISM IN A NETWORK OF DISTRIBUTED NEURAL NETWORK CORES

Publication number: 20200019836

Abstract: Networks of distributed neural cores are provided with hierarchical parallelism. In various embodiments, a plurality of neural cores is provided. Each of the plurality of neural cores comprises a plurality of vector compute units configured to operate in parallel. Each of the plurality of neural cores is configured to compute in parallel output activations by applying its plurality of vector compute units to input activations. Each of the plurality of neural cores is assigned a subset of output activations of a layer of a neural network for computation. Upon receipt of a subset of input activations of the layer of the neural network, each of the plurality of neural cores computes a partial sum for each of its assigned output activations, and computes its assigned output activations from at least the computed partial sums.

Type: Application

Filed: July 12, 2018

Publication date: January 16, 2020

Inventors: John V. Arthur, Andrew S. Cassidy, Myron D. Flickner, Pallab Datta, Hartmut Penner, Rathinakumar Appuswamy, Jun Sawada, Dharmendra S. Modha, Steven K. Esser, Brian Taba, Jennifer Klamo
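
The partial-sum scheme in the abstract above can be sketched in plain Python. The weights, the split of outputs across two cores, the input chunking, and the use of ReLU as the final function are illustrative assumptions; the vector compute units inside each core are not modelled, and the loops run sequentially where hardware would run in parallel.

```python
def partial_sums(weights, inputs, input_ids, output_ids):
    """Partial sum for each assigned output over one received subset of inputs."""
    return {
        o: sum(weights[o][i] * inputs[i] for i in input_ids)
        for o in output_ids
    }


def relu(x):
    return max(0, x)


# 4 output activations x 4 input activations (made-up values).
weights = [[1, 2, 3, 4], [0, -1, 0, -1], [2, 0, 2, 0], [1, 1, 1, 1]]
inputs = [1, 2, 3, 4]
core_outputs = {0: [0, 1], 1: [2, 3]}  # core -> assigned output activations
input_chunks = [[0, 1], [2, 3]]        # subsets of inputs, received in turn

activations = {}
for core, outs in core_outputs.items():
    totals = {o: 0 for o in outs}
    for chunk in input_chunks:
        # Each arriving subset of inputs contributes one partial sum per output.
        for o, p in partial_sums(weights, inputs, chunk, outs).items():
            totals[o] += p
    for o in outs:
        activations[o] = relu(totals[o])
```
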
INSTRUCTION DISTRIBUTION IN AN ARRAY OF NEURAL NETWORK CORES

Publication number: 20200012929

Abstract: Instruction distribution in an array of neural network cores is provided. In various embodiments, a neural inference chip is initialized with core microcode. The chip comprises a plurality of neural cores. The core microcode is executable by the neural cores to execute a tensor operation of a neural network. The core microcode is distributed to the plurality of neural cores via an on-chip network. The core microcode is executed synchronously by the plurality of neural cores to compute a neural network layer.

Type: Application

Filed: July 5, 2018

Publication date: January 9, 2020

Inventors: Hartmut Penner, Dharmendra S. Modha, John V. Arthur, Andrew S. Cassidy, Rathinakumar Appuswamy, Pallab Datta, Steven K. Esser, Myron D. Flickner, Jennifer Klamo, Jun Sawada, Brian Taba
PARALLEL COMPUTATIONAL ARCHITECTURE WITH RECONFIGURABLE CORE-LEVEL AND VECTOR-LEVEL PARALLELISM

Publication number: 20190385046

Abstract: Neural network processing hardware using parallel computational architectures with reconfigurable core-level and vector-level parallelism is provided. In various embodiments, a neural network model memory is adapted to store a neural network model comprising a plurality of layers. Each layer has at least one dimension and comprises a plurality of synaptic weights. A plurality of neural cores is provided. Each neural core includes a computation unit and an activation memory. The computation unit is adapted to apply a plurality of synaptic weights to a plurality of input activations to produce a plurality of output activations. The computation unit has a plurality of vector units. The activation memory is adapted to store the input activations and the output activations. The system is adapted to partition the plurality of cores into a plurality of partitions based on dimensions of the layer and the vector units.

Type: Application

Filed: June 14, 2018

Publication date: December 19, 2019

Inventors: Andrew S. Cassidy, Myron D. Flickner, Pallab Datta, Hartmut Penner, Rathinakumar Appuswamy, Jun Sawada, John V. Arthur, Dharmendra S. Modha, Steven K. Esser, Brian Taba, Jennifer Klamo
RUNTIME RECONFIGURABLE NEURAL NETWORK PROCESSOR CORE

Publication number: 20190385048

Abstract: Hardware neural network processors are provided. A neural core includes a weight memory, an activation memory, a vector-matrix multiplier, and a vector processor. The vector-matrix multiplier is adapted to receive a weight matrix from the weight memory, receive an activation vector from the activation memory, and compute a vector-matrix multiplication of the weight matrix and the activation vector. The vector processor is adapted to receive one or more input vectors from one or more vector sources and perform one or more vector functions on the one or more input vectors to yield an output vector. In some embodiments, a programmable controller is adapted to configure and operate the neural core.

Type: Application

Filed: June 19, 2018

Publication date: December 19, 2019

Inventors: Andrew S. Cassidy, Rathinakumar Appuswamy, John V. Arthur, Pallab Datta, Steven K. Esser, Myron D. Flickner, Jennifer Klamo, Dharmendra S. Modha, Hartmut Penner, Jun Sawada, Brian Taba
CENTRAL SCHEDULER AND INSTRUCTION DISPATCHER FOR A NEURAL INFERENCE PROCESSOR

Publication number: 20190332924

Abstract: Neural inference processors are provided. In various embodiments, a processor includes a plurality of cores. Each core includes a neural computation unit, an activation memory, and a local controller. The neural computation unit is adapted to apply a plurality of synaptic weights to a plurality of input activations to produce a plurality of output activations. The activation memory is adapted to store the input activations and the output activations. The local controller is adapted to load the input activations from the activation memory to the neural computation unit and to store the plurality of output activations from the neural computation unit to the activation memory. The processor includes a neural network model memory adapted to store network parameters, including the plurality of synaptic weights. The processor includes a global scheduler operatively coupled to the plurality of cores, adapted to provide the synaptic weights from the neural network model memory to each core.

Type: Application

Filed: April 27, 2018

Publication date: October 31, 2019

Inventors: Andrew S. Cassidy, Myron D. Flickner, Pallab Datta, Hartmut Penner, Rathinakumar Appuswamy, Jun Sawada, John V. Arthur, Dharmendra S. Modha, Steven K. Esser, Brian Taba, Jennifer Klamo
TIME, SPACE, AND ENERGY EFFICIENT NEURAL INFERENCE VIA PARALLELISM AND ON-CHIP MEMORY

Publication number: 20190325295

Abstract: Neural inference chips and cores adapted to provide time, space, and energy efficient neural inference via parallelism and on-chip memory are provided. In various embodiments, the neural inference chips comprise: a plurality of neural cores interconnected by an on-chip network; a first on-chip memory for storing a neural network model, the first on-chip memory being connected to each of the plurality of cores by the on-chip network; a second on-chip memory for storing input and output data, the second on-chip memory being connected to each of the plurality of cores by the on-chip network.

Type: Application

Filed: April 20, 2018

Publication date: October 24, 2019

Inventors: Dharmendra S. Modha, John V. Arthur, Jun Sawada, Steven K. Esser, Rathinakumar Appuswamy, Brian Taba, Andrew S. Cassidy, Pallab Datta, Myron D. Flickner, Hartmut Penner, Jennifer Klamo
MASSIVELY PARALLEL NEURAL INFERENCE COMPUTING ELEMENTS

Publication number: 20190303749

Abstract: Massively parallel neural inference computing elements are provided. A plurality of multipliers is arranged in a plurality of equal-sized groups. Each of the plurality of multipliers is adapted to, in parallel, apply a weight to an input activation to generate an output. A plurality of adders is operatively coupled to one of the groups of multipliers. Each of the plurality of adders is adapted to, in parallel, add the outputs of the multipliers within its associated group to generate a partial sum. A plurality of function blocks is operatively coupled to one of the plurality of adders. Each of the plurality of function blocks is adapted to, in parallel, apply a function to the partial sum of its associated adder to generate an output value.

Type: Application

Filed: March 30, 2018

Publication date: October 3, 2019

Inventors: Rathinakumar Appuswamy, John V. Arthur, Andrew S. Cassidy, Pallab Datta, Steven K. Esser, Myron D. Flickner, Jennifer Klamo, Dharmendra S. Modha, Hartmut Penner, Jun Sawada, Brian Taba
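
The multiplier, adder, and function-block arrangement in the abstract above can be sketched as three stages. The function name `compute_element`, the sample values, and the choice of ReLU are assumptions; list comprehensions here stand in for hardware units that operate in parallel.

```python
def compute_element(weights, activations, group_size, fn):
    """Multiply, sum within equal-sized groups, then apply fn to each partial sum."""
    if len(weights) != len(activations) or len(weights) % group_size:
        raise ValueError("inputs must pair up and divide into equal-sized groups")
    # Multipliers: one weight applied to one input activation each.
    products = [w * a for w, a in zip(weights, activations)]
    # Adders: one per group, summing that group's multiplier outputs.
    partial = [
        sum(products[i:i + group_size])
        for i in range(0, len(products), group_size)
    ]
    # Function blocks: one per adder, applied to its partial sum.
    return [fn(p) for p in partial]


outputs = compute_element(
    weights=[1, -1, 2, -2, 3, 3],
    activations=[1, 2, 3, 4, 5, 6],
    group_size=2,
    fn=lambda x: max(0, x),  # ReLU, chosen only for illustration
)
```
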
DEFECT RESISTANT DESIGNS FOR LOCATION-SENSITIVE NEURAL NETWORK PROCESSOR ARRAYS

Publication number: 20190303741

Abstract: Defect resistant designs for location-sensitive neural network processor arrays are provided. In various embodiments, a plurality of neural network processor cores are arrayed in a grid. The grid has a plurality of rows and a plurality of columns. A network interconnects at least those of the plurality of neural network processor cores that are adjacent within the grid. The network is adapted to bypass a defective core of the plurality of neural network processor cores by providing a connection between two non-adjacent rows or columns of the grid, and transparently routing messages between the two non-adjacent rows or columns, past the defective core.

Type: Application

Filed: March 30, 2018

Publication date: October 3, 2019

Inventors: Rathinakumar Appuswamy, John V. Arthur, Andrew S. Cassidy, Pallab Datta, Steven K. Esser, Myron D. Flickner, Jennifer Klamo, Dharmendra S. Modha, Hartmut Penner, Jun Sawada, Brian Taba
DISTRIBUTED, EVENT-BASED COMPUTATION USING NEUROMORPHIC CORES

Publication number: 20190266481

Abstract: Systems for distributed, event-based computation are provided. In various embodiments, the systems include a plurality of neurosynaptic processors and a network interconnecting the plurality of neurosynaptic processors. Each neurosynaptic processor includes a clock uncoupled from the clock of each other neurosynaptic processor. Each neurosynaptic processor is adapted to receive an input stream, the input stream comprising a plurality of inputs and a clock value associated with each of the plurality of inputs. Each neurosynaptic processor is adapted to compute, for each clock value, an output based on the inputs associated with that clock value. Each neurosynaptic processor is adapted to send to another of the plurality of neurosynaptic processors, via the network, the output and an associated clock value.

Type: Application

Filed: February 28, 2018

Publication date: August 29, 2019

Inventors: Arnon Amir, David Berg, Pallab Datta, Jeffrey A. Kusnitz, Hartmut Penner
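
The clock-tagged processing in the abstract above can be sketched by grouping inputs on their clock value. The `process` function, the `(clock, value)` tuple format, and the use of `sum` as the per-tick computation are assumptions; the network transfer between processors is not modelled.

```python
from collections import defaultdict


def process(stream, fn):
    """Compute one output per clock value from the inputs tagged with it."""
    by_tick = defaultdict(list)
    for tick, value in stream:
        by_tick[tick].append(value)
    # Each output keeps its clock value so a downstream processor,
    # running on its own uncoupled clock, can align it correctly.
    return [(tick, fn(values)) for tick, values in sorted(by_tick.items())]


# Inputs may arrive interleaved; the attached clock value restores order.
stream = [(0, 1), (1, 5), (0, 2), (1, 1)]
out = process(stream, sum)
```

Because each result carries its clock value, the output of one such stage can serve directly as the input stream of another.
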
PROVIDING REMOTE, RELIANT AND HIGH PERFORMANCE PCI EXPRESS DEVICE IN CLOUD COMPUTING ENVIRONMENTS

Publication number: 20190220437

Abstract: A system architecture, a method, and a computer program product are disclosed for attaching remote physical devices. In one embodiment, the system architecture comprises a compute server and a device server. The compute server includes a system memory, and one or more remote device drivers; and the device server includes a system memory and one or more physical devices, and each of the physical devices includes an associated device memory. The compute server and the device server are connected through an existing network fabric that provides remote direct memory access (RDMA) services. A system mapping function logically connects one or more of the physical devices on the device server to the compute server, including mapping between the system memories and the device memories and keeping the system memories and the device memories in synchronization using the RDMA.

Type: Application

Filed: March 20, 2019

Publication date: July 18, 2019

Inventors: Hubertus Franke, Davide Pasetto, Hartmut Penner
EXTENDING EXISTING STORAGE DEVICES IN VIRTUALIZED ENVIRONMENTS

Publication number: 20190196747

Abstract: A method, system and computer program product for providing a guest with access to a virtual storage on a physical storage using a peripheral component interface hub. In one embodiment, the method comprises the guest sending to the peripheral component interface hub a request to access the physical storage, the request including physical addresses of the physical storage, and the peripheral component interface hub sending specified information about the request to a hypervisor. This method further comprises the hypervisor determining whether to grant or to reject the request; and when the hypervisor grants the request, the hypervisor sending a configuration command to the peripheral component interface hub. This command includes a mapping of addresses from the physical storage to addresses from the virtual storage. In an embodiment, the peripheral component interface hub uses this mapping to replace the addresses in the request with translated virtual addresses.

Type: Application

Filed: March 1, 2019

Publication date: June 27, 2019

Inventors: Hubertus Franke, Davide Pasetto, Hartmut Penner
Providing remote, reliant and high performance PCI express device in cloud computing environments

Patent number: 10303645

Abstract: A system architecture, a method, and a computer program product are disclosed for attaching remote physical devices. In one embodiment, the system architecture comprises a compute server and a device server. The compute server includes a system memory, and one or more remote device drivers; and the device server includes a system memory and one or more physical devices, and each of the physical devices includes an associated device memory. The compute server and the device server are connected through an existing network fabric that provides remote direct memory access (RDMA) services. A system mapping function logically connects one or more of the physical devices on the device server to the compute server, including mapping between the system memories and the device memories and keeping the system memories and the device memories in synchronization using the RDMA.

Type: Grant

Filed: August 27, 2015

Date of Patent: May 28, 2019

Assignee: International Business Machines Corporation

Inventors: Hubertus Franke, Davide Pasetto, Hartmut Penner
Providing remote, reliant and high performance PCI express device in cloud computing environments

Patent number: 10303644

Abstract: A system architecture, a method, and a computer program product are disclosed for attaching remote physical devices. In one embodiment, the system architecture comprises a compute server and a device server. The compute server includes a system memory, and one or more remote device drivers; and the device server includes a system memory and one or more physical devices, and each of the physical devices includes an associated device memory. The compute server and the device server are connected through an existing network fabric that provides remote direct memory access (RDMA) services. A system mapping function logically connects one or more of the physical devices on the device server to the compute server, including mapping between the system memories and the device memories and keeping the system memories and the device memories in synchronization using the RDMA.

Type: Grant

Filed: January 16, 2015

Date of Patent: May 28, 2019

Assignee: International Business Machines Corporation

Inventors: Hubertus Franke, Davide Pasetto, Hartmut Penner
Extending existing storage devices in virtualized environments

Patent number: 10248360

Abstract: A method, system and computer program product for providing a guest with access to a virtual storage on a physical storage using a peripheral component interface hub. In one embodiment, the method comprises the guest sending to the peripheral component interface hub a request to access the physical storage, the request including physical addresses of the physical storage, and the peripheral component interface hub sending specified information about the request to a hypervisor. This method further comprises the hypervisor determining whether to grant or to reject the request; and when the hypervisor grants the request, the hypervisor sending a configuration command to the peripheral component interface hub. This command includes a mapping of addresses from the physical storage to addresses from the virtual storage. In an embodiment, the peripheral component interface hub uses this mapping to replace the addresses in the request with translated virtual addresses.

Type: Grant

Filed: April 2, 2018

Date of Patent: April 2, 2019

Assignee: International Business Machines Corporation

Inventors: Hubertus Franke, Davide Pasetto, Hartmut Penner
Efficient and secure direct storage device sharing in virtualized environments

Patent number: 10216628

Abstract: A method, system and computer program product are disclosed for direct storage device sharing in a virtualized environment. In an embodiment, the method comprises assigning each of a plurality of virtual functions an associated memory area of a physical memory, and executing the virtual functions in a single root-input/output virtualization environment to provide each of a plurality of guests with direct access to the physical memory. In one embodiment, each of the guests is associated with a respective one of the virtual functions; and the assigning each of the plurality of virtual functions an associated memory area includes maintaining a per-virtual function mapping table identifying a respective one mapping function for each of the virtual functions, and each of the mapping functions mapping one of the memory areas of the physical area to an associated virtual memory.

Type: Grant

Filed: December 27, 2017

Date of Patent: February 26, 2019

Assignee: International Business Machines Corporation

Inventors: Gheorghe Almasi, Hubertus Franke, Gokul B. Kandiraju, Davide Pasetto, Hartmut Penner
Efficient and secure direct storage device sharing in virtualized environments

Patent number: 10169231

Abstract: A method, system and computer program product are disclosed for direct storage device sharing in a virtualized environment. In an embodiment, the method comprises assigning each of a plurality of virtual functions an associated memory area of a physical memory, and executing the virtual functions in a single root-input/output virtualization environment to provide each of a plurality of guests with direct access to the physical memory. In one embodiment, each of the guests is associated with a respective one of the virtual functions; and the assigning each of the plurality of virtual functions an associated memory area includes maintaining a per-virtual function mapping table identifying a respective one mapping function for each of the virtual functions, and each of the mapping functions mapping one of the memory areas of the physical area to an associated virtual memory.

Type: Grant

Filed: December 5, 2017

Date of Patent: January 1, 2019

Assignee: International Business Machines Corporation

Inventors: Gheorghe Almasi, Hubertus Franke, Gokul B. Kandiraju, Davide Pasetto, Hartmut Penner
EXTENDING EXISTING STORAGE DEVICES IN VIRTUALIZED ENVIRONMENTS

Publication number: 20180225067

Abstract: A method, system and computer program product for providing a guest with access to a virtual storage on a physical storage using a peripheral component interface hub. In one embodiment, the method comprises the guest sending to the peripheral component interface hub a request to access the physical storage, the request including physical addresses of the physical storage, and the peripheral component interface hub sending specified information about the request to a hypervisor. This method further comprises the hypervisor determining whether to grant or to reject the request; and when the hypervisor grants the request, the hypervisor sending a configuration command to the peripheral component interface hub. This command includes a mapping of addresses from the physical storage to addresses from the virtual storage. In an embodiment, the peripheral component interface hub uses this mapping to replace the addresses in the request with translated virtual addresses.

Type: Application

Filed: April 2, 2018

Publication date: August 9, 2018

Inventors: Hubertus Franke, Davide Pasetto, Hartmut Penner
