Patents by Inventor Daniel A. Prener

Daniel A. Prener has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20170357445
    Abstract: Methods and systems for optimizing an application for a computing system having multiple distinct memory locations that are interconnected by one or more communication channels include determining one or more data handling properties for a data region in an application. One or more data handling policies for the data region are determined based on the one or more data handling properties. Data setup costs are determined for a scope in the application that uses the data region in different memory locations based on the one or more data handling properties. The application is optimized in accordance with the one or more data handling policies and the data setup costs for the different memory locations.
    Type: Application
    Filed: June 13, 2016
    Publication date: December 14, 2017
    Inventors: Tong Chen, John Kevin O'Brien, Daniel A. Prener, Zehra N. Sura
  • Patent number: 9632778
    Abstract: Embodiments relate to packed loading and storing of data. An aspect includes a system for packed loading and storing of distributed data. The system includes memory and a processing element configured to communicate with the memory. The processing element is configured to perform a method including fetching and decoding an instruction for execution by the processing element. A plurality of individually addressable data elements is gathered from non-contiguous locations in the memory which are narrower than a nominal width of register file elements in the processing element based on the instruction. The processing element packs and loads the data elements into register file elements of a register file entry based on the instruction, such that at least two of the data elements gathered from the non-contiguous locations in the memory are packed and loaded into a single register file element of the register file entry.
    Type: Grant
    Filed: August 8, 2012
    Date of Patent: April 25, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Bruce M. Fleischer, Thomas W. Fox, Hans M. Jacobson, Jaime H. Moreno, Ravi Nair, Daniel A. Prener
  • Patent number: 9632777
    Abstract: Embodiments relate to packed loading and storing of data. An aspect includes a method for packed loading and storing of data distributed in a system that includes memory and a processing element. The method includes fetching and decoding an instruction for execution by the processing element. The processing element gathers a plurality of individually addressable data elements from non-contiguous locations in the memory which are narrower than a nominal width of register file elements in the processing element based on the instruction. The data elements are packed and loaded into register file elements of a register file entry by the processing element based on the instruction, such that at least two of the data elements gathered from the non-contiguous locations in the memory are packed and loaded into a single register file element of the register file entry.
    Type: Grant
    Filed: August 3, 2012
    Date of Patent: April 25, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Bruce M. Fleischer, Thomas W. Fox, Hans M. Jacobson, Jaime H. Moreno, Ravi Nair, Daniel A. Prener
  • Patent number: 9575755
    Abstract: Embodiments relate to vector processing in an active memory device. An aspect includes a method for vector processing in an active memory device that includes memory and a processing element. The method includes decoding, in the processing element, an instruction including a plurality of sub-instructions to execute in parallel. An iteration count to repeat execution of the sub-instructions in parallel is determined. Based on the iteration count, execution of the sub-instructions in parallel is repeated for multiple iterations by the processing element. Multiple locations in the memory are accessed in parallel based on the execution of the sub-instructions.
    Type: Grant
    Filed: August 3, 2012
    Date of Patent: February 21, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Bruce M. Fleischer, Thomas W. Fox, Hans M. Jacobson, Ravi Nair, Daniel A. Prener
  • Patent number: 9535694
    Abstract: Embodiments relate to vector processing in an active memory device. An aspect includes a system for vector processing in an active memory device. The system includes memory in the active memory device and a processing element in the active memory device. The processing element is configured to perform a method including decoding an instruction with a plurality of sub-instructions to execute in parallel. An iteration count to repeat execution of the sub-instructions in parallel is determined. Execution of the sub-instructions is repeated in parallel for multiple iterations, by the processing element, based on the iteration count. Multiple locations in the memory are accessed in parallel based on the execution of the sub-instructions.
    Type: Grant
    Filed: August 8, 2012
    Date of Patent: January 3, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Bruce M. Fleischer, Thomas W. Fox, Hans M. Jacobson, Ravi Nair, Daniel A. Prener
  • Patent number: 9038040
    Abstract: Partitioning programs between a general purpose core and one or more accelerators is provided. A compiler front end is provided for converting a program source code in a corresponding high level programming language into an intermediate code representation. This intermediate code representation is provided to an interprocedural optimizer which determines which core processor or accelerator each portion of the program should execute on and partitions the program into sub-programs based on this set of decisions. The interprocedural optimizer may further add instructions to the partitions to coordinate and synchronize the sub-programs as required. Each sub-program is compiled on an appropriate compiler backend for the instruction set architecture of the particular core processor or accelerator selected to execute the sub-program. The compiled sub-programs and then linked to thereby generate an executable program.
    Type: Grant
    Filed: January 25, 2006
    Date of Patent: May 19, 2015
    Assignee: International Business Machines Corporation
    Inventors: John Kevin Patrick O'Brien, Kathryn M. O'Brien, Daniel A. Prener
  • Patent number: 8972782
    Abstract: An aspect includes providing rollback support in an exposed-pipeline processing element. A method for providing rollback support in an exposed-pipeline processing element includes detecting, by rollback support logic, an error associated with execution of an instruction in the exposed-pipeline processing element. The rollback support logic determines whether the exposed-pipeline processing element supports replay of the instruction for a predetermined number of cycles. Based on determining that the exposed-pipeline processing element supports replay of the instruction, a rollback action is performed in the exposed-pipeline processing element to attempt recovery from the error.
    Type: Grant
    Filed: November 9, 2012
    Date of Patent: March 3, 2015
    Assignee: International Business Machines Corporation
    Inventors: Bruce M. Fleischer, Thomas W. Fox, Hans M. Jacobson, Ravi Nair, Daniel A. Prener
  • Patent number: 8949101
    Abstract: Mechanisms are provided for predicting effects of soft errors on an integrated circuit device design. A data processing system is configured to implement a unified derating tool that includes a machine derating front-end engine used to generate machine derating information, and an application derating front-end engine used to generate application derating information, for the integrated circuit device design. The machine derating front-end engine executes a simulation of the integrated circuit device design to generate the machine derating information. The application derating front-end engine executes an application workload on existing hardware similar in architecture to the integrated circuit device design and injects a fault into the existing hardware during execution of the application workload to generate application derating information.
    Type: Grant
    Filed: October 12, 2011
    Date of Patent: February 3, 2015
    Assignee: International Business Machines Corporation
    Inventors: Pradip Bose, Meeta S. Gupta, Prabhakar N. Kudva, Daniel A. Prener
  • Publication number: 20140136894
    Abstract: An aspect includes providing rollback support in an exposed-pipeline processing element. A method for providing rollback support in an exposed-pipeline processing element includes detecting, by rollback support logic, an error associated with execution of an instruction in the exposed-pipeline processing element. The rollback support logic determines whether the exposed-pipeline processing element supports replay of the instruction for a predetermined number of cycles. Based on determining that the exposed-pipeline processing element supports replay of the instruction, a rollback action is performed in the exposed-pipeline processing element to attempt recovery from the error.
    Type: Application
    Filed: November 9, 2012
    Publication date: May 15, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Bruce M. Fleischer, Thomas W. Fox, Hans M. Jacobson, Ravi Nair, Daniel A. Prener
  • Publication number: 20140040596
    Abstract: Embodiments relate to packed loading and storing of data. An aspect includes a method for packed loading and storing of data distributed in a system that includes memory and a processing element. The method includes fetching and decoding an instruction for execution by the processing element. The processing element gathers a plurality of individually addressable data elements from non-contiguous locations in the memory which are narrower than a nominal width of register file elements in the processing element based on the instruction. The data elements are packed and loaded into register file elements of a register file entry by the processing element based on the instruction, such that at least two of the data elements gathered from the non-contiguous locations in the memory are packed and loaded into a single register file element of the register file entry.
    Type: Application
    Filed: August 3, 2012
    Publication date: February 6, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Bruce M. Fleischer, Thomas W. Fox, Hans M. Jacobson, Jaime H. Moreno, Ravi Nair, Daniel A. Prener
  • Publication number: 20140040598
    Abstract: Embodiments relate to vector processing in an active memory device. An aspect includes a system for vector processing in an active memory device. The system includes memory in the active memory device and a processing element in the active memory device. The processing element is configured to perform a method including decoding an instruction with a plurality of sub-instructions to execute in parallel. An iteration count to repeat execution of the sub-instructions in parallel is determined. Execution of the sub-instructions is repeated in parallel for multiple iterations, by the processing element, based on the iteration count. Multiple locations in the memory are accessed in parallel based on the execution of the sub-instructions.
    Type: Application
    Filed: August 8, 2012
    Publication date: February 6, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Bruce M. Fleischer, Thomas W. Fox, Hans M. Jacobson, Ravi Nair, Daniel A. Prener
  • Publication number: 20140040599
    Abstract: Embodiments relate to packed loading and storing of data. An aspect includes a system for packed loading and storing of distributed data. The system includes memory and a processing element configured to communicate with the memory. The processing element is configured to perform a method including fetching and decoding an instruction for execution by the processing element. A plurality of individually addressable data elements is gathered from non-contiguous locations in the memory which are narrower than a nominal width of register file elements in the processing element based on the instruction. The processing element packs and loads the data elements into register file elements of a register file entry based on the instruction, such that at least two of the data elements gathered from the non-contiguous locations in the memory are packed and loaded into a single register file element of the register file entry.
    Type: Application
    Filed: August 8, 2012
    Publication date: February 6, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Bruce M. Fleischer, Thomas W. Fox, Hans M. Jacobson, Jaime H. Moreno, Ravi Nair, Daniel A. Prener
  • Publication number: 20140040603
    Abstract: Embodiments relate to vector processing in an active memory device. An aspect includes a method for vector processing in an active memory device that includes memory and a processing element. The method includes decoding, in the processing element, an instruction including a plurality of sub-instructions to execute in parallel. An iteration count to repeat execution of the sub-instructions in parallel is determined. Based on the iteration count, execution of the sub-instructions in parallel is repeated for multiple iterations by the processing element. Multiple locations in the memory are accessed in parallel based on the execution of the sub-instructions.
    Type: Application
    Filed: August 3, 2012
    Publication date: February 6, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Bruce M. Fleischer, Thomas W. Fox, Hans M. Jacobson, Ravi Nair, Daniel A. Prener
  • Publication number: 20130096902
    Abstract: Mechanisms are provided for predicting effects of soft errors on an integrated circuit device design. A data processing system is configured to implement a unified derating tool that includes a machine derating front-end engine used to generate machine derating information, and an application derating front-end engine used to generate application derating information, for the integrated circuit device design. The machine derating front-end engine executes a simulation of the integrated circuit device design to generate the machine derating information. The application derating front-end engine executes an application workload on existing hardware similar in architecture to the integrated circuit device design and injects a fault into the existing hardware during execution of the application workload to generate application derating information.
    Type: Application
    Filed: October 12, 2011
    Publication date: April 18, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Pradip Bose, Meeta S. Gupta, Prabhakar N. Kudva, Daniel A. Prener
  • Patent number: 8375374
    Abstract: An mechanism is provided for partitioning programs between a general purpose core and one or more accelerators. With the apparatus and method, a compiler front end is provided for converting a program source code in a corresponding high level programming language into an intermediate code representation. This intermediate code representation is provided to an interprocedural optimizer which determines which core processor or accelerator each portion of the program should execute on and partitions the program into sub-programs based on this set of decisions. The interprocedural optimizer may further add instructions to the partitions to coordinate and synchronize the sub-programs as required. Each sub-program is compiled on an appropriate compiler backend for the instruction set architecture of the particular core processor or accelerator selected to execute the sub-program. The compiled sub-programs and then linked to thereby generate an executable program.
    Type: Grant
    Filed: May 27, 2008
    Date of Patent: February 12, 2013
    Assignee: International Business Machines Corporation
    Inventors: John Kevin Patrick O'Brien, Kathryn M. O'Brien, Daniel A. Prener
  • Patent number: 7865699
    Abstract: This invention pertains to apparatus, method and a computer program stored on a computer readable medium. The computer program includes instructions for use with an instruction unit having a code page, and has computer program code for partitioning the code page into at least two sections for storing in a first section thereof a plurality of instruction words and, in association with at least one instruction word, for storing in a second section thereof an extension to each instruction word in the first section. The computer program further includes computer program code for setting a state of at least one page table entry bit for indicating, on a code page by code page basis, whether the code page is partitioned into the first and second sections for storing instruction words and their extensions, or whether the code page is comprised instead of a single section storing only instruction words.
    Type: Grant
    Filed: October 31, 2007
    Date of Patent: January 4, 2011
    Assignee: International Business Machines Corporation
    Inventors: Erik R Altman, Michael Gschwind, David A. Luick, Daniel A. Prener, Jude A. Rivers, Sumedh W. Sathaye, John-David Wellman
  • Patent number: 7441110
    Abstract: A mechanism is described that predicts the usefulness of a prefetching instruction during the instruction's decode cycle. Prefetching instructions that are predicted as useful (prefetch useful data) are sent to an execution unit of the processor for execution, while instructions that are predicted as not useful are discarded. The prediction regarding the usefulness of a prefetching instructions is performed utilizing a branch prediction mask contained in the branch history mechanism. This mask is compared to information contained in the prefetching instruction that records the branch path between the prefetching instruction and actual use of the data. Both instructions and data can be prefetched using this mechanism.
    Type: Grant
    Filed: December 10, 1999
    Date of Patent: October 21, 2008
    Assignee: International Business Machines Corporation
    Inventors: Thomas R. Puzak, Allan M. Hartstein, Mark Charney, Daniel A. Prener, Peter H. Oden
  • Publication number: 20080256521
    Abstract: An apparatus and method for partitioning programs between a general purpose core and one or more accelerators are provided. With the apparatus and method, a compiler front end is provided for converting a program source code in a corresponding high level programming language into an intermediate code representation. This intermediate code representation is provided to an interprocedural optimizer which determines which core processor or accelerator each portion of the program should execute on and partitions the program into sub-programs based on this set of decisions. The interprocedural optimizer may further add instructions to the partitions to coordinate and synchronize the sub-programs as required. Each sub-program is compiled on an appropriate compiler backend for the instruction set architecture of the particular core processor or accelerator selected to execute the sub-program. The compiled sub-programs and then linked to thereby generate an executable program.
    Type: Application
    Filed: May 27, 2008
    Publication date: October 16, 2008
    Applicant: International Business Machines Corporation
    Inventors: John Kevin Patrick O'Brien, Kathryn M. O'Brien, Daniel A. Prener
  • Publication number: 20080065861
    Abstract: This invention pertains to apparatus, method and a computer program stored on a computer readable medium. The computer program includes instructions for use with an instruction unit having a code page, and has computer program code for partitioning the code page into at least two sections for storing in a first section thereof a plurality of instruction words and, in association with at least one instruction word, for storing in a second section thereof an extension to each instruction word in the first section. The computer program further includes computer program code for setting a state of at least one page table entry bit for indicating, on a code page by code page basis, whether the code page is partitioned into the first and second sections for storing instruction words and their extensions, or whether the code page is comprised instead of a single section storing only instruction words.
    Type: Application
    Filed: October 31, 2007
    Publication date: March 13, 2008
    Inventors: Erik Altman, Michael Gschwind, David Luick, Daniel Prener, Jude Rivers, Sumedh Sathaye, John-David Wellman
  • Patent number: 7340588
    Abstract: This invention pertains to apparatus, method and a computer program stored on a computer readable medium. The computer program includes instructions for use with an instruction unit having a code page, and has computer program code for partitioning the code page into at least two sections for storing in a first section thereof a plurality of instruction words and, in association with at least one instruction word, for storing in a second section thereof an extension to each instruction word in the first section. The computer program further includes computer program code for setting a state of at least one page table entry bit for indicating, on a code page by code page basis, whether the code page is partitioned into the first and second sections for storing instruction words and their extensions, or whether the code page is comprised instead of a single section storing only instruction words.
    Type: Grant
    Filed: November 24, 2003
    Date of Patent: March 4, 2008
    Assignee: International Business Machines Corporation
    Inventors: Erik R Altman, Michael Gschwind, David A. Luick, Daniel A. Prener, Jude A. Rivers, Sumedh W. Sathaye, John-David Wellman