Patents by Inventor Bruce M. Fleischer

Bruce M. Fleischer has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Executing a composite scalar-vector VLIW instruction having a repeat field

Patent number: 12182576

Abstract: A processor core includes a storage device which stores a composite very large instruction word (VLIW) instruction, an instruction unit which obtains the composite VLIW instruction from the storage device and decodes the composite VLIW instruction to determine an operation to perform, and a composite VLIW instruction execution unit which executes the composite VLIW instruction to perform the operation.

Type: Grant

Filed: December 31, 2019

Date of Patent: December 31, 2024

Assignee: International Business Machines Corporation

Inventors: Bruce M. Fleischer, Thomas Winters Fox, Arpith C. Jacob, Hans Mikael Jacobson, Ravi Nair, Kevin John Patrick O'Brien, Daniel Arthur Prener
Processor Core, Processor and Method for Executing a Composite Scalar-Vector Very Large Instruction Word (VLIW) Instruction

Publication number: 20200142704

Abstract: A processor core includes a storage device which stores a composite very large instruction word (VLIW) instruction, an instruction unit which obtains the composite VLIW instruction from the storage device and decodes the composite VLIW instruction to determine an operation to perform, and a composite VLIW instruction execution unit which executes the composite VLIW instruction to perform the operation.

Type: Application

Filed: December 31, 2019

Publication date: May 7, 2020

Inventors: Bruce M. Fleischer, Thomas Winters Fox, Arpith C. Jacob, Hans Mikael Jacobson, Ravi Nair, Kevin John Patrick O'Brien, Daniel Arthur Prener
Executing a composite VLIW instruction having a scalar atom that indicates an iteration of execution

Patent number: 10572263

Abstract: A processor core includes a storage device which stores a composite very large instruction word (VLIW) instruction, an instruction unit which obtains the composite VLIW instruction from the storage device and decodes the composite VLIW instruction to determine an operation to perform, and a composite VLIW instruction execution unit which executes the decoded composite VLIW instruction to perform the operation.

Type: Grant

Filed: March 31, 2016

Date of Patent: February 25, 2020

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Bruce M. Fleischer, Thomas Winters Fox, Arpith C. Jacob, Hans Mikael Jacobson, Ravi Nair, Kevin John Patrick O'Brien, Daniel Arthur Prener
Branch prediction in a computer processor

Patent number: 10275256

Abstract: Branch prediction in a computer processor, includes: fetching an instruction, the instruction comprising an address, the address comprising a first portion of a global history vector and a global history vector pointer; performing a first branch prediction in dependence upon the first portion of the global history vector; retrieving, in dependence upon the global history vector pointer, from a rolling global history vector buffer, a second portion of the global history vector; and performing a second branch prediction in dependence upon a combination of the first portion and second portion of the global history vector.

Type: Grant

Filed: February 22, 2016

Date of Patent: April 30, 2019

Assignee: International Business Machines Corporation

Inventors: Bruce M. Fleischer, Michael N. Goulet, David S. Levitan, Nicholas R. Orzol
Extendable conditional permute SIMD instructions

Patent number: 10162634

Abstract: A method, apparatus and non-transitory computer readable medium are provided for permuting data registers to a target register. Two or more data registers are concatenated to form a concatenated data register. Each data register comprises a plurality of elements. A permutation instruction which uses one of the data registers as a data input register is executed and conditionally selects an element of the data input register by comparing a portion of an element of a pattern register to an immediate match field value. The selected element of the data input register is copied to an element in a target register at a position corresponding to a position of the element of the pattern register when the portion of the element of the pattern register matches the immediate match field value. When the portion of the element of the pattern register does not match, the target register remains unchanged.

Type: Grant

Filed: May 20, 2016

Date of Patent: December 25, 2018

Assignee: International Business Machines Corporation

Inventors: Alexander E. Eichenberger, Bruce M. Fleischer
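
The conditional permute described in the abstract for patent 10162634 can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented encoding: the split of each pattern element into a high "match" portion and a low element index, the `index_bits` parameter, and both function names are hypothetical. Issuing one conditional permute per data register, each with a different match value, has the effect of permuting from the concatenated register.

```python
def conditional_permute(target, data_in, pattern, match, index_bits=2):
    """Model one conditional permute step and return the updated target.

    Assumed (hypothetical) encoding: the low `index_bits` bits of a pattern
    element select an element of data_in; the remaining high bits are
    compared with the immediate `match` value.
    """
    index_mask = (1 << index_bits) - 1
    result = list(target)
    for position, p in enumerate(pattern):
        if (p >> index_bits) == match:
            # Match: copy the selected input element to the same position.
            result[position] = data_in[p & index_mask]
        # No match: the target element at this position stays unchanged.
    return result


def permute_concatenated(registers, pattern, index_bits=2):
    """Permute from several registers by issuing one step per register."""
    target = [None] * len(pattern)
    for match, register in enumerate(registers):
        target = conditional_permute(target, register, pattern, match, index_bits)
    return target


if __name__ == "__main__":
    regs = [[10, 11, 12, 13], [20, 21, 22, 23]]
    print(permute_concatenated(regs, [5, 0, 7, 2]))
```

In this model, extending the permute to more registers only requires more steps with further match values, which is one way to read "extendable" in the title.
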
Active memory device gather, scatter, and filter

Patent number: 10049061

Abstract: Embodiments relate to loading and storing of data. An aspect includes a method for transferring data in an active memory device that includes memory and a processing element. An instruction is fetched and decoded for execution by the processing element. Based on determining that the instruction is a gather instruction, the processing element determines a plurality of source addresses in the memory from which to gather data elements and a destination address in the memory. One or more gathered data elements are transferred from the source addresses to contiguous locations in the memory starting at the destination address. Based on determining that the instruction is a scatter instruction, a source address in the memory from which to read data elements at contiguous locations and one or more destination addresses in the memory to store the data elements at non-contiguous locations are determined, and the data elements are transferred.

Type: Grant

Filed: November 12, 2012

Date of Patent: August 14, 2018

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Bruce M. Fleischer, Thomas W. Fox, Hans M. Jacobson, James A. Kahle, Jaime H. Moreno, Ravi Nair
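
The gather and scatter transfers described in the abstract for patent 10049061 can be modeled in a few lines. This is a minimal sketch, assuming memory is a flat Python list and addresses are list indices; the function names and the choice to read all sources before writing are illustrative assumptions, not details from the patent.

```python
def gather(memory, source_addresses, destination):
    """Copy elements from scattered sources to contiguous locations.

    All sources are read before any write so that an overlap between the
    source and destination ranges cannot corrupt the data (an assumption
    of this sketch).
    """
    values = [memory[address] for address in source_addresses]
    for offset, value in enumerate(values):
        memory[destination + offset] = value


def scatter(memory, source, destination_addresses):
    """Copy contiguous elements starting at `source` to scattered locations."""
    count = len(destination_addresses)
    values = memory[source:source + count]
    for address, value in zip(destination_addresses, values):
        memory[address] = value


if __name__ == "__main__":
    mem = list(range(100, 116))
    gather(mem, [3, 9, 12], 0)
    scatter(mem, 0, [15, 5, 7])
    print(mem)
```
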
Multi-petascale highly efficient parallel supercomputer

Patent number: 9971713

Abstract: A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaflop-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC). The ASIC nodes are interconnected by a five-dimensional torus network that maximizes the throughput of packet communications between nodes and minimizes latency. The network implements a collective network and a global asynchronous network that provide global barrier and notification functions. The node design integrates a list-based prefetcher. The memory system implements transactional memory, thread-level speculation, and a multiversioning cache that improves soft error rate at the same time, and supports DMA functionality allowing for parallel processing message-passing.

Type: Grant

Filed: April 30, 2015

Date of Patent: May 15, 2018

Assignee: GLOBALFOUNDRIES INC.

Inventors: Sameh Asaad, Ralph E. Bellofatto, Michael A. Blocksome, Matthias A. Blumrich, Peter Boyle, Jose R. Brunheroto, Dong Chen, Chen-Yong Cher, George L. Chiu, Norman Christ, Paul W. Coteus, Kristan D. Davis, Gabor J. Dozsa, Alexandre E. Eichenberger, Noel A. Eisley, Matthew R. Ellavsky, Kahn C. Evans, Bruce M. Fleischer, Thomas W. Fox, Alan Gara, Mark E. Giampapa, Thomas M. Gooding, Michael K. Gschwind, John A. Gunnels, Shawn A. Hall, Rudolf A. Haring, Philip Heidelberger, Todd A. Inglett, Brant L. Knudson, Gerard V. Kopcsay, Sameer Kumar, Amith R. Mamidala, James A. Marcella, Mark G. Megerian, Douglas R. Miller, Samuel J. Miller, Adam J. Muff, Michael B. Mundy, John K. O'Brien, Kathryn M. O'Brien, Martin Ohmacht, Jeffrey J. Parker, Ruth J. Poole, Joseph D. Ratterman, Valentina Salapura, David L. Satterfield, Robert M. Senger, Burkhard Steinmacher-Burow, William M. Stockdell, Craig B. Stunkel, Krishnan Sugavanam, Yutaka Sugawara, Todd E. Takken, Barry M. Trager, James L. Van Oosten, Charles D. Wait, Robert E. Walkup, Alfred T. Watson, Robert W. Wisniewski, Peng Wu
High bandwidth low latency data exchange between processing elements

Patent number: 9928190

Abstract: Direct communication of data between processing elements is provided. An aspect includes sending, by a first processing element, data over an inter-processing element chaining bus. The data is destined for another processing element via a data exchange component that is coupled between the first processing element and a second processing element via a communication line disposed between corresponding multiplexors of the first processing element and the second processing element. A further aspect includes determining, by the data exchange component, whether the data has been received at the data exchange element. If so, an indicator is set in a register of the data exchange component and the data is forwarded to the other processing element. Setting the indicator causes the first processing element to stall. If the data has not been received, the other processing element is stalled while the data exchange component awaits receipt of the data.

Type: Grant

Filed: June 15, 2015

Date of Patent: March 27, 2018

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Bruce M. Fleischer, Thomas W. Fox, Hans M. Jacobson, Ravi Nair
High bandwidth low latency data exchange between processing elements

Patent number: 9910802

Abstract: Direct communication of data between processing elements is provided. An aspect includes sending, by a first processing element, data over an inter-processing element chaining bus. The data is destined for another processing element via a data exchange component that is coupled between the first processing element and a second processing element via a communication line disposed between corresponding multiplexors of the first processing element and the second processing element. A further aspect includes determining, by the data exchange component, whether the data has been received at the data exchange element. If so, an indicator is set in a register of the data exchange component and the data is forwarded to the other processing element. Setting the indicator causes the first processing element to stall. If the data has not been received, the other processing element is stalled while the data exchange component awaits receipt of the data.

Type: Grant

Filed: November 23, 2015

Date of Patent: March 6, 2018

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Bruce M. Fleischer, Thomas W. Fox, Hans M. Jacobson, Ravi Nair
On-chip traffic prioritization in memory

Patent number: 9841926

Abstract: According to one embodiment, a method for traffic prioritization in a memory device includes sending a memory access request including a priority value from a processing element in the memory device to a crossbar interconnect in the memory device. The memory access request is routed through the crossbar interconnect to a memory controller in the memory device associated with the memory access request. The memory access request is received at the memory controller. The priority value of the memory access request is compared to priority values of a plurality of memory access requests stored in a queue of the memory controller to determine a highest priority memory access request. A next memory access request is performed by the memory controller based on the highest priority memory access request.

Type: Grant

Filed: June 30, 2016

Date of Patent: December 12, 2017

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Bruce M. Fleischer, Thomas W. Fox, Hans M. Jacobson, Ravi Nair
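
The queue selection step described in the abstract for patent 9841926 can be sketched as follows. This is an illustrative model only: the class and field names are hypothetical, the sketch assumes that a larger priority value is more urgent and that ties go to the oldest request, and it leaves out the crossbar routing entirely.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MemoryRequest:
    address: int
    priority: int  # assumption: a larger value means a more urgent request


@dataclass
class MemoryController:
    queue: List[MemoryRequest] = field(default_factory=list)

    def receive(self, request: MemoryRequest) -> None:
        """Store an incoming request in arrival order."""
        self.queue.append(request)

    def next_request(self) -> Optional[MemoryRequest]:
        """Remove and return the highest-priority queued request."""
        if not self.queue:
            return None
        # max() returns the first maximal index, so ties go to the oldest.
        best = max(range(len(self.queue)), key=lambda i: self.queue[i].priority)
        return self.queue.pop(best)


if __name__ == "__main__":
    controller = MemoryController()
    for address, priority in [(0x10, 1), (0x20, 3), (0x30, 3), (0x40, 2)]:
        controller.receive(MemoryRequest(address, priority))
    print([hex(controller.next_request().address) for _ in range(4)])
```
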
Test structure to measure delay variability mismatch of digital logic paths

Patent number: 9829535

Abstract: An integrated circuit includes a test block which in turn includes a plurality of identical paths; a counter selectively coupled to the plurality of identical paths to selectively obtain a count of at least one of correctly operating paths and incorrectly operating paths from each of the plurality of identical paths; and a plurality of count latches selectively coupled to the counter to store output of the counter. Each path in turn includes a first clocked latch; a clocked logic path beginning and ending at the first clocked latch; and a clocked detection circuit coupled to the first clocked latch and the counter, which determines whether the clocked logic path is operating properly in a given clock period.

Type: Grant

Filed: January 20, 2016

Date of Patent: November 28, 2017

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Karthik Balakrishnan, Bruce M. Fleischer, Keith A. Jenkins, Christos Vezyrtzis
Extendable conditional permute SIMD instructions

Publication number: 20170337059

Abstract: A method, apparatus and non-transitory computer readable medium are provided for permuting data registers to a target register. Two or more data registers are concatenated to form a concatenated data register. Each data register comprises a plurality of elements. A permutation instruction which uses one of the data registers as a data input register is executed and conditionally selects an element of the data input register by comparing a portion of an element of a pattern register to an immediate match field value. The selected element of the data input register is copied to an element in a target register at a position corresponding to a position of the element of the pattern register when the portion of the element of the pattern register matches the immediate match field value. When the portion of the element of the pattern register does not match, the target register remains unchanged.

Type: Application

Filed: May 20, 2016

Publication date: November 23, 2017

Inventors: Alexander E. Eichenberger, Bruce M. Fleischer
Processor Core, Processor and Method for Executing a Composite Scalar-Vector Very Large Instruction Word (VLIW) Instruction

Publication number: 20170286108

Abstract: A processor core includes a storage device which stores a composite very large instruction word (VLIW) instruction, an instruction unit which obtains the composite VLIW instruction from the storage device and decodes the composite VLIW instruction to determine an operation to perform, and a composite VLIW instruction execution unit which executes the decoded composite VLIW instruction to perform the operation.

Type: Application

Filed: March 31, 2016

Publication date: October 5, 2017

Inventors: Bruce M. Fleischer, Thomas Winters Fox, Arpith C. Jacob, Hans Mikael Jacobson, Ravi Nair, Kevin John Patrick O'Brien, Daniel Arthur Prener
Branch prediction in a computer processor

Publication number: 20170242701

Abstract: Branch prediction in a computer processor, includes: fetching an instruction, the instruction comprising an address, the address comprising a first portion of a global history vector and a global history vector pointer; performing a first branch prediction in dependence upon the first portion of the global history vector; retrieving, in dependence upon the global history vector pointer, from a rolling global history vector buffer, a second portion of the global history vector; and performing a second branch prediction in dependence upon a combination of the first portion and second portion of the global history vector.

Type: Application

Filed: February 22, 2016

Publication date: August 24, 2017

Inventors: Bruce M. Fleischer, Michael N. Goulet, David S. Levitan, Nicholas R. Orzol
All-to-all permutation of vector elements based on a permutation pattern encoded in mantissa and exponent bits in a floating-point SIMD architecture

Patent number: 9652231

Abstract: Mechanisms are provided for dynamic data driven alignment and data formatting in a floating point SIMD architecture. At least two operand inputs are input to a permute unit of a processor. Each operand input contains at least one floating point value upon which a permute operation is to be performed by the permute unit. A control vector input, having a plurality of floating point values that together constitute the control vector input, is input to the permute unit of the processor for controlling the permute operation of the permute unit. The permute unit performs a permute operation on the at least two operand inputs according to a permutation pattern specified by the plurality of floating point values that constitute the control vector input. Moreover, a result output of the permute operation is output from the permute unit to a result vector register of the processor.

Type: Grant

Filed: October 14, 2008

Date of Patent: May 16, 2017

Assignee: International Business Machines Corporation

Inventors: Alexandre E. Eichenberger, Bruce M. Fleischer, Michael K. Gschwind
Gather/scatter of multiple data elements with packed loading/storing into/from a register file entry

Patent number: 9632777

Abstract: Embodiments relate to packed loading and storing of data. An aspect includes a method for packed loading and storing of data distributed in a system that includes memory and a processing element. The method includes fetching and decoding an instruction for execution by the processing element. The processing element gathers a plurality of individually addressable data elements from non-contiguous locations in the memory which are narrower than a nominal width of register file elements in the processing element based on the instruction. The data elements are packed and loaded into register file elements of a register file entry by the processing element based on the instruction, such that at least two of the data elements gathered from the non-contiguous locations in the memory are packed and loaded into a single register file element of the register file entry.

Type: Grant

Filed: August 3, 2012

Date of Patent: April 25, 2017

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Bruce M. Fleischer, Thomas W. Fox, Hans M. Jacobson, Jaime H. Moreno, Ravi Nair, Daniel A. Prener
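
The packed gather described in the abstract for patent 9632777 can be illustrated with integer bit operations. This is a sketch under assumptions that are not in the abstract: the 16-bit element width, the 64-bit register element width, the function name, and the choice to place the first gathered element in the least significant bits are all hypothetical.

```python
def packed_gather(memory, addresses, element_bits=16, register_bits=64):
    """Gather narrow elements and pack several into each register element.

    `memory` maps an address to an integer value. The returned list models
    one register file entry, one integer per register file element.
    """
    per_register = register_bits // element_bits
    element_mask = (1 << element_bits) - 1
    entry = []
    for start in range(0, len(addresses), per_register):
        packed = 0
        group = addresses[start:start + per_register]
        for slot, address in enumerate(group):
            # Assumed layout: slot 0 occupies the least significant bits.
            packed |= (memory[address] & element_mask) << (slot * element_bits)
        entry.append(packed)
    return entry


if __name__ == "__main__":
    mem = {0x100: 0x1111, 0x230: 0x2222, 0x008: 0x3333, 0x7F0: 0x4444}
    print([hex(x) for x in packed_gather(mem, [0x100, 0x230, 0x008, 0x7F0])])
```
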
Gather/scatter of multiple data elements with packed loading/storing into/from a register file entry

Patent number: 9632778

Abstract: Embodiments relate to packed loading and storing of data. An aspect includes a system for packed loading and storing of distributed data. The system includes memory and a processing element configured to communicate with the memory. The processing element is configured to perform a method including fetching and decoding an instruction for execution by the processing element. A plurality of individually addressable data elements is gathered from non-contiguous locations in the memory which are narrower than a nominal width of register file elements in the processing element based on the instruction. The processing element packs and loads the data elements into register file elements of a register file entry based on the instruction, such that at least two of the data elements gathered from the non-contiguous locations in the memory are packed and loaded into a single register file element of the register file entry.

Type: Grant

Filed: August 8, 2012

Date of Patent: April 25, 2017

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Bruce M. Fleischer, Thomas W. Fox, Hans M. Jacobson, Jaime H. Moreno, Ravi Nair, Daniel A. Prener
Vector register file

Patent number: 9594724

Abstract: An aspect includes accessing a vector register in a vector register file. The vector register file includes a plurality of vector registers and each vector register includes a plurality of elements. A read command is received at a read port of the vector register file. The read command specifies a vector register address. The vector register address is decoded by an address decoder to determine a selected vector register of the vector register file. An element address is determined for one of the plurality of elements associated with the selected vector register based on a read element counter of the selected vector register. A word is selected in a memory array of the selected vector register as read data based on the element address. The read data is output from the selected vector register based on the decoding of the vector register address by the address decoder.

Type: Grant

Filed: August 9, 2012

Date of Patent: March 14, 2017

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Bruce M. Fleischer, Thomas W. Fox, Hans M. Jacobson, Ravi Nair
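
The read path described in the abstract for patent 9594724 can be modeled as a register file in which each vector register tracks its own read position. This is a hedged sketch: the class names are hypothetical, and the assumption that the read element counter advances by one after each read and wraps at the end of the register is not stated in the abstract.

```python
class VectorRegister:
    """One vector register: a small memory array plus a read element counter."""

    def __init__(self, elements):
        self.words = list(elements)
        self.read_counter = 0

    def read(self):
        # The element address comes from this register's own counter.
        word = self.words[self.read_counter]
        # Assumption: the counter advances and wraps after every read.
        self.read_counter = (self.read_counter + 1) % len(self.words)
        return word


class VectorRegisterFile:
    """Decodes a vector register address and reads the selected register."""

    def __init__(self, registers):
        self.registers = [VectorRegister(r) for r in registers]

    def read(self, register_address):
        return self.registers[register_address].read()


if __name__ == "__main__":
    vrf = VectorRegisterFile([[1, 2, 3], [7, 8, 9]])
    print([vrf.read(1), vrf.read(0), vrf.read(1), vrf.read(1), vrf.read(1)])
```

In this model the read command carries only the register address, and successive reads of the same register return successive elements.
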
Vector register file

Patent number: 9582466

Abstract: An aspect includes accessing a vector register in a vector register file. The vector register file includes a plurality of vector registers and each vector register includes a plurality of elements. A read command is received at a read port of the vector register file. The read command specifies a vector register address. The vector register address is decoded by an address decoder to determine a selected vector register of the vector register file. An element address is determined for one of the plurality of elements associated with the selected vector register based on a read element counter of the selected vector register. A word is selected in a memory array of the selected vector register as read data based on the element address. The read data is output from the selected vector register based on the decoding of the vector register address by the address decoder.

Type: Grant

Filed: August 13, 2012

Date of Patent: February 28, 2017

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Bruce M. Fleischer, Thomas W. Fox, Hans M. Jacobson, Ravi Nair
Predication in a vector processor

Patent number: 9575756

Abstract: Embodiments relate to vector processor predication in an active memory device. An aspect includes a system for vector processor predication in an active memory device. The system includes memory in the active memory device and a processing element in the active memory device. The processing element is configured to perform a method including decoding an instruction with a plurality of sub-instructions to execute in parallel. One or more mask bits are accessed from a vector mask register in the processing element. The one or more mask bits are applied by the processing element to predicate operation of a unit in the processing element associated with at least one of the sub-instructions.

Type: Grant

Filed: August 8, 2012

Date of Patent: February 21, 2017

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Bruce M. Fleischer, Thomas W. Fox, Hans M. Jacobson, Ravi Nair
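
The predication described in the abstract for patent 9575756 can be illustrated with a masked element-wise operation. This is a minimal sketch, assuming one mask bit per vector lane and assuming that a masked-off lane keeps its previous destination value; the function name and both assumptions are illustrative and are not taken from the abstract.

```python
from operator import add


def predicated_op(op, dest, a, b, mask_bits):
    """Apply op lane by lane, but only where the mask bit is set.

    Lanes with a mask bit of 0 keep the value already in `dest`
    (an assumption of this sketch).
    """
    return [
        op(x, y) if bit else old
        for old, x, y, bit in zip(dest, a, b, mask_bits)
    ]


if __name__ == "__main__":
    print(predicated_op(add, [0, 0, 0, 0], [1, 2, 3, 4], [10, 20, 30, 40], [1, 0, 1, 0]))
```
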
