Patents by Inventor Bruce M. Fleischer

Bruce M. Fleischer has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20200142704
    Abstract: A processor core includes a storage device which stores a composite very large instruction word (VLIW) instruction, an instruction unit which obtains the composite VLIW instruction from the storage device and decodes the composite VLIW instruction to determine an operation to perform, and a composite VLIW instruction execution unit which executes the composite VLIW instruction to perform the operation.
    Type: Application
    Filed: December 31, 2019
    Publication date: May 7, 2020
    Inventors: Bruce M. Fleischer, Thomas Winters FOX, Arpith C. JACOB, Hans Mikael JACOBSON, Ravi NAIR, Kevin John Patrick O'BRIEN, Daniel Arthur PRENER
  • Patent number: 10572263
    Abstract: A processor core includes a storage device which stores a composite very large instruction word (VLIW) instruction, an instruction unit which obtains the composite VLIW instruction from the storage device and decodes the composite VLIW instruction to determine an operation to perform, and a composite VLIW instruction execution unit which executes the decoded composite VLIW instruction to perform the operation.
    Type: Grant
    Filed: March 31, 2016
    Date of Patent: February 25, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Bruce M. Fleischer, Thomas Winters Fox, Arpith C. Jacob, Hans Mikael Jacobson, Ravi Nair, Kevin John Patrick O'Brien, Daniel Arthur Prener
  • Patent number: 10275256
    Abstract: Branch prediction in a computer processor, includes: fetching an instruction, the instruction comprising an address, the address comprising a first portion of a global history vector and a global history vector pointer; performing a first branch prediction in dependence upon the first portion of the global history vector; retrieving, in dependence upon the global history vector pointer, from a rolling global history vector buffer, a second portion of the global history vector; and performing a second branch prediction in dependence upon a combination of the first portion and second portion of the global history vector.
    Type: Grant
    Filed: February 22, 2016
    Date of Patent: April 30, 2019
    Assignee: International Business Machines Corporation
    Inventors: Bruce M. Fleischer, Michael N. Goulet, David S. Levitan, Nicholas R. Orzol
  • Patent number: 10162634
    Abstract: A method, apparatus and non-transitory computer readable medium are provided for permuting data registers to a target register. Two or more data registers are concatenated to form a concatenated data register. Each data register comprises a plurality of elements. A permutation instruction which uses one of the data registers as a data input register is executed and conditionally selects an element of the data input register by comparing a portion of an element of a pattern register to an immediate match field value. The selected element of the data input register is copied to an element in a target register at a position corresponding to a position of the element of the pattern register when the portion of the element of the pattern register matches the immediate match field value. When the portion of the element of the pattern register does not match, the target register remains unchanged.
    Type: Grant
    Filed: May 20, 2016
    Date of Patent: December 25, 2018
    Assignee: International Business Machines Corporation
    Inventors: Alexander E. Eichenberger, Bruce M. Fleischer
  • Patent number: 10049061
    Abstract: Embodiments relate to loading and storing of data. An aspect includes a method for transferring data in an active memory device that includes memory and a processing element. An instruction is fetched and decoded for execution by the processing element. Based on determining that the instruction is a gather instruction, the processing element determines a plurality of source addresses in the memory from which to gather data elements and a destination address in the memory. One or more gathered data elements are transferred from the source addresses to contiguous locations in the memory starting at the destination address. Based on determining that the instruction is a scatter instruction, a source address in the memory from which to read data elements at contiguous locations and one or more destination addresses in the memory to store the data elements at non-contiguous locations are determined, and the data elements are transferred.
    Type: Grant
    Filed: November 12, 2012
    Date of Patent: August 14, 2018
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Bruce M. Fleischer, Thomas W. Fox, Hans M. Jacobson, James A. Kahle, Jaime H. Moreno, Ravi Nair
  • Patent number: 9971713
    Abstract: A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaflop-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC). The ASIC nodes are interconnected by a five dimensional torus network that optimally maximize the throughput of packet communications between nodes and minimize latency. The network implements collective network and a global asynchronous network that provides global barrier and notification functions. Integrated in the node design include a list-based prefetcher. The memory system implements transaction memory, thread level speculation, and multiversioning cache that improves soft error rate at the same time and supports DMA functionality allowing for parallel processing message-passing.
    Type: Grant
    Filed: April 30, 2015
    Date of Patent: May 15, 2018
    Assignee: GLOBALFOUNDRIES INC.
    Inventors: Sameh Asaad, Ralph E. Bellofatto, Michael A. Blocksome, Matthias A. Blumrich, Peter Boyle, Jose R. Brunheroto, Dong Chen, Chen-Yong Cher, George L. Chiu, Norman Christ, Paul W. Coteus, Kristan D. Davis, Gabor J. Dozsa, Alexandre E. Eichenberger, Noel A. Eisley, Matthew R. Ellavsky, Kahn C. Evans, Bruce M. Fleischer, Thomas W. Fox, Alan Gara, Mark E. Giampapa, Thomas M. Gooding, Michael K. Gschwind, John A. Gunnels, Shawn A. Hall, Rudolf A. Haring, Philip Heidelberger, Todd A. Inglett, Brant L. Knudson, Gerard V. Kopcsay, Sameer Kumar, Amith R. Mamidala, James A. Marcella, Mark G. Megerian, Douglas R. Miller, Samuel J. Miller, Adam J. Muff, Michael B. Mundy, John K. O'Brien, Kathryn M. O'Brien, Martin Ohmacht, Jeffrey J. Parker, Ruth J. Poole, Joseph D. Ratterman, Valentina Salapura, David L. Satterfield, Robert M. Senger, Burkhard Steinmacher-Burow, William M. Stockdell, Craig B. Stunkel, Krishnan Sugavanam, Yutaka Sugawara, Todd E. Takken, Barry M. Trager, James L. Van Oosten, Charles D. Wait, Robert E. Walkup, Alfred T. Watson, Robert W. Wisniewski, Peng Wu
  • Patent number: 9928190
    Abstract: Direct communication of data between processing elements is provided. An aspect includes sending, by a first processing element, data over an inter-processing element chaining bus. The data is destined for another processing element via a data exchange component that is coupled between the first processing element and a second processing element via a communication line disposed between corresponding multiplexors of the first processing element and the second processing element. A further aspect includes determining, by the data exchange component, whether the data has been received at the data exchange element. If so, an indicator is set in a register of the data exchange component and the data is forwarded to the other processing element. Setting the indicator causes the first processing element to stall. If the data has not been received, the other processing element is stalled while the data exchange component awaits receipt of the data.
    Type: Grant
    Filed: June 15, 2015
    Date of Patent: March 27, 2018
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Bruce M. Fleischer, Thomas W. Fox, Hans M. Jacobson, Ravi Nair
  • Patent number: 9910802
    Abstract: Direct communication of data between processing elements is provided. An aspect includes sending, by a first processing element, data over an inter-processing element chaining bus. The data is destined for another processing element via a data exchange component that is coupled between the first processing element and a second processing element via a communication line disposed between corresponding multiplexors of the first processing element and the second processing element. A further aspect includes determining, by the data exchange component, whether the data has been received at the data exchange element. If so, an indicator is set in a register of the data exchange component and the data is forwarded to the other processing element. Setting the indicator causes the first processing element to stall. If the data has not been received, the other processing element is stalled while the data exchange component awaits receipt of the data.
    Type: Grant
    Filed: November 23, 2015
    Date of Patent: March 6, 2018
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Bruce M. Fleischer, Thomas W. Fox, Hans M. Jacobson, Ravi Nair
  • Patent number: 9841926
    Abstract: According to one embodiment, a method for traffic prioritization in a memory device includes sending a memory access request including a priority value from a processing element in the memory device to a crossbar interconnect in the memory device. The memory access request is routed through the crossbar interconnect to a memory controller in the memory device associated with the memory access request. The memory access request is received at the memory controller. The priority value of the memory access request is compared to priority values of a plurality of memory access requests stored in a queue of the memory controller to determine a highest priority memory access request. A next memory access request is performed by the memory controller based on the highest priority memory access request.
    Type: Grant
    Filed: June 30, 2016
    Date of Patent: December 12, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Bruce M. Fleischer, Thomas W. Fox, Hans M. Jacobson, Ravi Nair
  • Patent number: 9829535
    Abstract: An integrated circuit includes a test block which in turn includes a plurality of identical paths; a counter selectively coupled to the plurality of identical paths to selectively obtain a count of at least one of correctly operating paths and incorrectly operating paths from each of the plurality of identical paths; and a plurality of count latches selectively coupled to the counter to store output of the counter. Each path in turn includes a first clocked latch; a clocked logic path beginning and ending at the first clocked latch; and a clocked detection circuit coupled to the first clocked latch and the counter, which determines whether the clocked logic path is operating properly in a given clock period.
    Type: Grant
    Filed: January 20, 2016
    Date of Patent: November 28, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Karthik Balakrishnan, Bruce M. Fleischer, Keith A. Jenkins, Christos Vezyrtzis
  • Publication number: 20170337059
    Abstract: A method, apparatus and non-transitory computer readable medium are provided for permuting data registers to a target register. Two or more data registers are concatenated to form a concatenated data register. Each data register comprises a plurality of elements. A permutation instruction which uses one of the data registers as a data input register is executed and conditionally selects an element of the data input register by comparing a portion of an element of a pattern register to an immediate match field value. The selected element of the data input register is copied to an element in a target register at a position corresponding to a position of the element of the pattern register when the portion of the element of the pattern register matches the immediate match field value. When the portion of the element of the pattern register does not match, the target register remains unchanged.
    Type: Application
    Filed: May 20, 2016
    Publication date: November 23, 2017
    Inventors: Alexander E. EICHENBERGER, Bruce M. FLEISCHER
  • Publication number: 20170286108
    Abstract: A processor core includes a storage device which stores a composite very large instruction word (VLIW) instruction, an instruction unit which obtains the composite VLIW instruction from the storage device and decodes the composite VLIW instruction to determine an operation to perform, and a composite VLIW instruction execution unit which executes the decoded composite VLIW instruction to perform the operation.
    Type: Application
    Filed: March 31, 2016
    Publication date: October 5, 2017
    Inventors: Bruce M. Fleischer, Thomas Winters FOX, Arpith C. JACOB, Hans Mikael Jacobson, Ravi Nair, Kevin John Patrick O'Brien, Daniel Arthur Prener
  • Publication number: 20170242701
    Abstract: Branch prediction in a computer processor, includes: fetching an instruction, the instruction comprising an address, the address comprising a first portion of a global history vector and a global history vector pointer; performing a first branch prediction in dependence upon the first portion of the global history vector; retrieving, in dependence upon the global history vector pointer, from a rolling global history vector buffer, a second portion of the global history vector; and performing a second branch prediction in dependence upon a combination of the first portion and second portion of the global history vector.
    Type: Application
    Filed: February 22, 2016
    Publication date: August 24, 2017
    Inventors: BRUCE M. FLEISCHER, MICHAEL N. GOULET, DAVID S. LEVITAN, NICHOLAS R. ORZOL
  • Patent number: 9652231
    Abstract: Mechanisms are provided for dynamic data driven alignment and data formatting in a floating point SIMD architecture. At least two operand inputs are input to a permute unit of a processor. Each operand input contains at least one floating point value upon which a permute operation is to be performed by the permute unit. A control vector input, having a plurality of floating point values that together constitute the control vector input, is input to the permute unit of the processor for controlling the permute operation of the permute unit. The permute unit performs a permute operation on the at least two operand inputs according to a permutation pattern specified by the plurality of floating point values that constitute the control vector input. Moreover, a result output of the permute operation is output from the permute unit to a result vector register of the processor.
    Type: Grant
    Filed: October 14, 2008
    Date of Patent: May 16, 2017
    Assignee: International Business Machines Corporation
    Inventors: Alexandre E. Eichenberger, Bruce M. Fleischer, Michael K. Gschwind
  • Patent number: 9632778
    Abstract: Embodiments relate to packed loading and storing of data. An aspect includes a system for packed loading and storing of distributed data. The system includes memory and a processing element configured to communicate with the memory. The processing element is configured to perform a method including fetching and decoding an instruction for execution by the processing element. A plurality of individually addressable data elements is gathered from non-contiguous locations in the memory which are narrower than a nominal width of register file elements in the processing element based on the instruction. The processing element packs and loads the data elements into register file elements of a register file entry based on the instruction, such that at least two of the data elements gathered from the non-contiguous locations in the memory are packed and loaded into a single register file element of the register file entry.
    Type: Grant
    Filed: August 8, 2012
    Date of Patent: April 25, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Bruce M. Fleischer, Thomas W. Fox, Hans M. Jacobson, Jaime H. Moreno, Ravi Nair, Daniel A. Prener
  • Patent number: 9632777
    Abstract: Embodiments relate to packed loading and storing of data. An aspect includes a method for packed loading and storing of data distributed in a system that includes memory and a processing element. The method includes fetching and decoding an instruction for execution by the processing element. The processing element gathers a plurality of individually addressable data elements from non-contiguous locations in the memory which are narrower than a nominal width of register file elements in the processing element based on the instruction. The data elements are packed and loaded into register file elements of a register file entry by the processing element based on the instruction, such that at least two of the data elements gathered from the non-contiguous locations in the memory are packed and loaded into a single register file element of the register file entry.
    Type: Grant
    Filed: August 3, 2012
    Date of Patent: April 25, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Bruce M. Fleischer, Thomas W. Fox, Hans M. Jacobson, Jaime H. Moreno, Ravi Nair, Daniel A. Prener
  • Patent number: 9594724
    Abstract: An aspect includes accessing a vector register in a vector register file. The vector register file includes a plurality of vector registers and each vector register includes a plurality of elements. A read command is received at a read port of the vector register file. The read command specifies a vector register address. The vector register address is decoded by an address decoder to determine a selected vector register of the vector register file. An element address is determined for one of the plurality of elements associated with the selected vector register based on a read element counter of the selected vector register. A word is selected in a memory array of the selected vector register as read data based on the element address. The read data is output from the selected vector register based on the decoding of the vector register address by the address decoder.
    Type: Grant
    Filed: August 9, 2012
    Date of Patent: March 14, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Bruce M. Fleischer, Thomas W. Fox, Hans M. Jacobson, Ravi Nair
  • Patent number: 9582466
    Abstract: An aspect includes accessing a vector register in a vector register file. The vector register file includes a plurality of vector registers and each vector register includes a plurality of elements. A read command is received at a read port of the vector register file. The read command specifies a vector register address. The vector register address is decoded by an address decoder to determine a selected vector register of the vector register file. An element address is determined for one of the plurality of elements associated with the selected vector register based on a read element counter of the selected vector register. A word is selected in a memory array of the selected vector register as read data based on the element address. The read data is output from the selected vector register based on the decoding of the vector register address by the address decoder.
    Type: Grant
    Filed: August 13, 2012
    Date of Patent: February 28, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Bruce M. Fleischer, Thomas W. Fox, Hans M. Jacobson, Ravi Nair
  • Patent number: 9575755
    Abstract: Embodiments relate to vector processing in an active memory device. An aspect includes a method for vector processing in an active memory device that includes memory and a processing element. The method includes decoding, in the processing element, an instruction including a plurality of sub-instructions to execute in parallel. An iteration count to repeat execution of the sub-instructions in parallel is determined. Based on the iteration count, execution of the sub-instructions in parallel is repeated for multiple iterations by the processing element. Multiple locations in the memory are accessed in parallel based on the execution of the sub-instructions.
    Type: Grant
    Filed: August 3, 2012
    Date of Patent: February 21, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Bruce M. Fleischer, Thomas W. Fox, Hans M. Jacobson, Ravi Nair, Daniel A. Prener
  • Patent number: 9575753
    Abstract: Mechanisms, in a data processing system comprising a single instruction multiple data (SIMD) processor, for performing a data dependency check operation on vector element values of at least two input vector registers are provided. Two calls to a simd-check instruction are performed, one with input vector registers having a first order and one with the input vector registers having a different order. The simd-check instruction performs comparisons to determine if any data dependencies are present. Results of the two calls to the simd-check instruction are obtained and used to determine if any data dependencies are present in the at least two input vector registers. Based on the results, the SIMD processor may perform various operations.
    Type: Grant
    Filed: March 15, 2012
    Date of Patent: February 21, 2017
    Assignee: International Business Machines Corporation
    Inventors: Alexandre E. Eichenberger, Bruce M. Fleischer