Patents by Inventor David T. Harper
David T. Harper has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11726912
Abstract: Systems and methods are disclosed for performing wide memory operations for a wide data cache line. In some examples of the disclosed technology, a processor having two or more execution lanes includes a data cache coupled to memory, a wide memory load circuit that concurrently loads two or more words from a cache line of the data cache, and a writeback circuit situated to send a respective word of the concurrently-loaded words to a selected execution lane of the processor, either into an operand buffer or bypassing the operand buffer. In some examples, a sharding circuit is provided that allows bitwise, byte-wise, and/or word-wise manipulation of memory operation data. In some examples, wide cache loads allow for concurrent execution of plural execution lanes of the processor.
Type: Grant
Filed: March 29, 2021
Date of Patent: August 15, 2023
Assignee: Microsoft Technology Licensing, LLC
Inventors: Douglas C. Burger, Aaron L. Smith, Gagan Gupta, David T. Harper
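As a rough software illustration of the wide-load idea in the abstract above (one cache line read, several words delivered to different execution lanes), the following sketch may help. It is not the patented circuit: the function name `wide_load`, the `lane_for_word` mapping, and the use of a Python `bytes` object as a cache line are all assumptions made for illustration.

```python
def wide_load(cache_line, word_size, lane_for_word):
    """Split one cache line into words and route each word to a lane.

    cache_line    -- bytes object standing in for a cache line (assumption)
    word_size     -- number of bytes per word
    lane_for_word -- hypothetical mapping: word index -> execution lane id
    """
    # One pass over the line yields every word "concurrently" in this model.
    words = [cache_line[i:i + word_size]
             for i in range(0, len(cache_line), word_size)]
    delivered = {}
    for index, word in enumerate(words):
        lane = lane_for_word.get(index)
        if lane is not None:
            # Words with no target lane are simply not written back.
            delivered[lane] = word
    return delivered
```

For example, `wide_load(bytes(range(8)), 4, {0: 1, 1: 0})` sends the first word to lane 1 and the second word to lane 0.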
-
Publication number: 20210216454
Abstract: Systems and methods are disclosed for performing wide memory operations for a wide data cache line. In some examples of the disclosed technology, a processor having two or more execution lanes includes a data cache coupled to memory, a wide memory load circuit that concurrently loads two or more words from a cache line of the data cache, and a writeback circuit situated to send a respective word of the concurrently-loaded words to a selected execution lane of the processor, either into an operand buffer or bypassing the operand buffer. In some examples, a sharding circuit is provided that allows bitwise, byte-wise, and/or word-wise manipulation of memory operation data. In some examples, wide cache loads allow for concurrent execution of plural execution lanes of the processor.
Type: Application
Filed: March 29, 2021
Publication date: July 15, 2021
Applicant: Microsoft Technology Licensing, LLC
Inventors: Douglas C. Burger, Aaron L. Smith, Gagan Gupta, David T. Harper
-
Patent number: 11042381
Abstract: Techniques described herein are directed to ensuring register data consistency between different instruction blocks. For example, a block-based processor renames registers during block decode, but delays the update of a logical register-to-physical register mapping utilized by other instruction blocks until it is determined that a write instruction configured to write to a logical register commits. Alternatively, the processor renames registers during block decode and updates the mapping accordingly. However, the update is negated (e.g., rolled back) if the write instruction is not executed. Still further, the processor may analyze the instructions in the block to determine instructions configured to write to a logical register but that will not execute due to a mismatched predicate. Based on the determination, the block-based processor ensures data consistency by copying data from a previously-assigned register to a newly-assigned register.
Type: Grant
Filed: December 8, 2018
Date of Patent: June 22, 2021
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: David T. Harper, III, Gagan Gupta
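The first approach in the abstract above (rename at decode, but publish the logical-to-physical mapping only when the write commits) can be modeled in a few lines. This is a minimal sketch under stated assumptions, not the patented design: the class `RenameTable`, its method names, and the simple counter used to allocate physical registers are all hypothetical.

```python
class RenameTable:
    """Toy model of deferred register-mapping updates (illustrative only)."""

    def __init__(self, num_logical):
        # Mapping visible to other instruction blocks: logical -> physical.
        self.committed = {r: r for r in range(num_logical)}
        # Naive allocator: hand out fresh physical register numbers.
        self.next_phys = num_logical
        # Renames made at decode but not yet visible: block -> {logical: physical}.
        self.pending = {}

    def rename(self, block_id, logical):
        """Allocate a physical register at decode without publishing it."""
        phys = self.next_phys
        self.next_phys += 1
        self.pending.setdefault(block_id, {})[logical] = phys
        return phys

    def commit_write(self, block_id, logical):
        """The write committed: make the new mapping visible to other blocks."""
        self.committed[logical] = self.pending[block_id].pop(logical)

    def cancel_write(self, block_id, logical):
        """The write will not execute: drop the rename, keep the old mapping."""
        self.pending.get(block_id, {}).pop(logical, None)
```

In this model other blocks always read `committed`, so a write that never executes (for example, because of a mismatched predicate) cannot leave them pointing at a register that holds no valid data.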
-
Patent number: 10963379
Abstract: Systems and methods are disclosed for performing wide memory operations for a wide data cache line. In some examples of the disclosed technology, a processor having two or more execution lanes includes a data cache coupled to memory, a wide memory load circuit that concurrently loads two or more words from a cache line of the data cache, and a writeback circuit situated to send a respective word of the concurrently-loaded words to a selected execution lane of the processor, either into an operand buffer or bypassing the operand buffer. In some examples, a sharding circuit is provided that allows bitwise, byte-wise, and/or word-wise manipulation of memory operation data. In some examples, wide cache loads allow for concurrent execution of plural execution lanes of the processor.
Type: Grant
Filed: February 2, 2018
Date of Patent: March 30, 2021
Assignee: Microsoft Technology Licensing, LLC
Inventors: Douglas C. Burger, Aaron L. Smith, Gagan Gupta, David T. Harper
-
Patent number: 10824429
Abstract: Systems and methods are disclosed for executing instructions with a block-based processor. Instructions can be executed in any order as their dependencies arrive, but the individual instructions are committed in a serial fashion. Further, exception handling can be performed by storing transient state for an instruction block and resuming by restoring the transient state. This allows programmers to see intermediate state for the instruction block before the subject block has committed. In one example of the disclosed technology, a method of operating a processor executing a block-based instruction set architecture includes executing at least one instruction encoded for an instruction block, responsive to determining that an individual instruction of the instruction block can commit, advancing a commit frontier for the instruction block to include all instructions in the instruction block that can commit, and committing one or more instructions inside the advanced commit frontier.
Type: Grant
Filed: December 18, 2018
Date of Patent: November 3, 2020
Assignee: Microsoft Technology Licensing, LLC
Inventors: Gagan Gupta, David T. Harper
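One way to picture the commit frontier described in the abstract above is as an index into the block's instructions in program order: everything before the index has committed, and the index advances over each consecutive instruction that is ready to commit. The sketch below assumes that reading (a contiguous prefix); the function name and the boolean `can_commit` list are hypothetical and are not taken from the patent.

```python
def advance_commit_frontier(can_commit, frontier):
    """Return the new frontier after absorbing ready instructions.

    can_commit -- list of booleans in program order; True means the
                  instruction has executed and may commit (assumption)
    frontier   -- index of the first instruction not yet committed
    """
    # Instructions may finish out of order, but the frontier only moves
    # past an instruction once every earlier one has been absorbed.
    while frontier < len(can_commit) and can_commit[frontier]:
        frontier += 1
    return frontier
```

With `can_commit = [True, True, False, True]` the frontier stops at index 2: the fourth instruction has finished executing but cannot commit until the third one does, which matches the serial-commit behavior the abstract describes.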
-
Publication number: 20200183695
Abstract: Techniques described herein are directed to ensuring register data consistency between different instruction blocks. For example, a block-based processor renames registers during block decode, but delays the update of a logical register-to-physical register mapping utilized by other instruction blocks until it is determined that a write instruction configured to write to a logical register commits. Alternatively, the processor renames registers during block decode and updates the mapping accordingly. However, the update is negated (e.g., rolled back) if the write instruction is not executed. Still further, the processor may analyze the instructions in the block to determine instructions configured to write to a logical register but that will not execute due to a mismatched predicate. Based on the determination, the block-based processor ensures data consistency by copying data from a previously-assigned register to a newly-assigned register.
Type: Application
Filed: December 8, 2018
Publication date: June 11, 2020
Inventors: David T. Harper, III, Gagan Gupta
-
Publication number: 20200089503
Abstract: Systems and methods are disclosed for executing instructions with a block-based processor. Instructions can be executed in any order as their dependencies arrive, but the individual instructions are committed in a serial fashion. Further, exception handling can be performed by storing transient state for an instruction block and resuming by restoring the transient state. This allows programmers to see intermediate state for the instruction block before the subject block has committed. In one example of the disclosed technology, a method of operating a processor executing a block-based instruction set architecture includes executing at least one instruction encoded for an instruction block, responsive to determining that an individual instruction of the instruction block can commit, advancing a commit frontier for the instruction block to include all instructions in the instruction block that can commit, and committing one or more instructions inside the advanced commit frontier.
Type: Application
Filed: December 18, 2018
Publication date: March 19, 2020
Applicant: Microsoft Technology Licensing, LLC
Inventors: Gagan Gupta, David T. Harper
-
Publication number: 20190236009
Abstract: Systems and methods are disclosed for performing wide memory operations for a wide data cache line. In some examples of the disclosed technology, a processor having two or more execution lanes includes a data cache coupled to memory, a wide memory load circuit that concurrently loads two or more words from a cache line of the data cache, and a writeback circuit situated to send a respective word of the concurrently-loaded words to a selected execution lane of the processor, either into an operand buffer or bypassing the operand buffer. In some examples, a sharding circuit is provided that allows bitwise, byte-wise, and/or word-wise manipulation of memory operation data. In some examples, wide cache loads allow for concurrent execution of plural execution lanes of the processor.
Type: Application
Filed: February 2, 2018
Publication date: August 1, 2019
Applicant: Microsoft Technology Licensing, LLC
Inventors: Douglas C. Burger, Aaron L. Smith, Gagan Gupta, David T. Harper
-
Patent number: 10110504
Abstract: A data center includes a plurality of computing units that communicate with each other using wireless communication, such as high frequency RF wireless communication. The data center may organize the computing units into groups (e.g., racks). In one implementation, each group may form a three-dimensional structure, such as a column having a free-space region for accommodating intra-group communication among computing units. The data center can include a number of features to facilitate communication, including dual-use memory for handling computing and buffering tasks, failsafe routing mechanisms, provisions to address permanent interface and hidden terminal scenarios, etc.
Type: Grant
Filed: May 25, 2016
Date of Patent: October 23, 2018
Assignee: Microsoft Technology Licensing, LLC
Inventors: Ji Yong Shin, Darko Kirovski, David T. Harper
-
Patent number: 10091091
Abstract: A direct network is described in which each resource is connected to a switching fabric via a set of two or more routing nodes. The routing nodes are distributed so as to satisfy at least one inter-node separation criterion. In one case, the separation criterion specifies that, for each resource, a number of routing nodes that share a same coordinate value with another routing node in the set (in a same coordinate dimension) is to be minimized. In some network topologies, such as a torus network, this means a number of unique loops of the direct network to which each resource is connected is to be maximized. The routing provisions described herein offer various performance benefits, such as improved latency-related performance.
Type: Grant
Filed: December 21, 2015
Date of Patent: October 2, 2018
Assignee: Microsoft Technology Licensing, LLC
Inventors: David T. Harper, Eric C. Peterson, Mark A. Santaniello
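The separation criterion in the abstract above can be read as a quantity to minimize: for one resource, how many of its routing nodes share a coordinate value, in the same dimension, with another node of the same set. The sketch below computes that count for nodes given as coordinate tuples. The function name and the tuple representation are assumptions for illustration, and the sketch only scores a placement; it does not choose one.

```python
from itertools import combinations


def shared_coordinate_count(nodes):
    """Count routing nodes that share a coordinate with another node.

    nodes -- list of equal-length coordinate tuples, one per routing
             node attached to a single resource (assumed representation)
    """
    sharing = set()
    for (i, a), (j, b) in combinations(enumerate(nodes), 2):
        # zip pairs coordinates dimension by dimension, so a match means
        # "same value in the same coordinate dimension".
        if any(x == y for x, y in zip(a, b)):
            sharing.add(i)
            sharing.add(j)
    return len(sharing)
```

A placement such as `[(0, 0), (1, 1)]` scores 0, while `[(0, 0), (0, 1)]` scores 2 because both nodes share the first coordinate. Under the abstract's criterion the first placement would be preferred.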
-
Publication number: 20180081379
Abstract: Low cost storage for write once read rarely data is described. In an embodiment a storage device comprises a plurality of hard disk drives connected to a server via an interconnect fabric. The storage device comprises a cooling system which is only capable of cooling a first subset of the hard disk drives and a power supply system which is only capable of powering a second subset of the hard disk drives and in some examples, the interconnect fabric may be only capable of providing full bandwidth for a third subset of the hard disk drives. Each subset may comprise only a small fraction of hard disk drives. A control mechanism, which may be implemented in software, is provided which controls which hard disk drives are active at any time in order that the constraints set by the cooling and power supply systems and interconnect fabric are not violated.
Type: Application
Filed: November 28, 2017
Publication date: March 22, 2018
Inventors: Shobana M. BALAKRISHNAN, David T. HARPER, Stephen HEIL, Eric C. PETERSON, Adam B. GLASS, David Alex BUTLER, Austin Nicholas DONNELLY, Antony Ian Taylor ROWSTRON, Sergey LEGTCHENKO
-
Patent number: 9841774
Abstract: Low cost storage for write once read rarely data is described. In an embodiment a storage device comprises a plurality of hard disk drives connected to a server via an interconnect fabric. The storage device comprises a cooling system which is only capable of cooling a first subset of the hard disk drives and a power supply system which is only capable of powering a second subset of the hard disk drives and in some examples, the interconnect fabric may be only capable of providing full bandwidth for a third subset of the hard disk drives. Each subset may comprise only a small fraction of hard disk drives. A control mechanism, which may be implemented in software, is provided which controls which hard disk drives are active at any time in order that the constraints set by the cooling and power supply systems and interconnect fabric are not violated.
Type: Grant
Filed: October 17, 2016
Date of Patent: December 12, 2017
Assignee: Microsoft Technology Licensing, LLC
Inventors: Shobana M. Balakrishnan, David T. Harper, Stephen Heil, Eric C. Peterson, Adam B. Glass, David Alex Butler, Austin Nicholas Donnelly, Antony Ian Taylor Rowstron, Sergey Legtchenko
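The software control mechanism mentioned in the abstract above can be pictured as an admission check: a drive is spun up only if doing so keeps every shared resource within its limit. The sketch below assumes each drive belongs to one cooling domain and one power domain, each with a cap on concurrently active drives. The domain model, the function name `try_activate`, and the dictionary layout are illustrative assumptions, not details from the patent.

```python
def try_activate(active, disk, domains, limits):
    """Activate `disk` only if no resource limit would be exceeded.

    active  -- set of currently active disk ids (mutated on success)
    domains -- disk id -> {"cooling": domain_id, "power": domain_id}
    limits  -- resource kind -> max active disks per domain (assumption)
    """
    for kind, limit in limits.items():
        domain = domains[disk][kind]
        # Count active disks already drawing on the same domain.
        in_use = sum(1 for d in active if domains[d][kind] == domain)
        if in_use >= limit:
            return False  # would violate the cooling or power constraint
    active.add(disk)
    return True
```

An interconnect-bandwidth constraint could be modeled the same way by adding a third key to `domains` and `limits`.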
-
Patent number: 9647932
Abstract: The transmission of multiple copies of data to other computing devices is optimized by minimizing the number of copies of such data transmitted through an expensive portion of the network. A store-and-forward methodology is utilized to transmit only a single copy through the expensive portion and the data is subsequently forked into multiple copies directed to multiple destination computing devices. Computing devices that are not intended destinations can be conscripted as intermediate computing devices, if appropriate to minimize copies of the data transmitted through an expensive portion. Additionally, accommodation can be made for data that is intolerant of out-of-order delivery by utilizing adaptive protocols that avoid mechanisms that may result in out-of-order delivery for data intolerant of such and by utilizing packet sorting at data convergence points to reorder the data. Different protocol settings can be utilized to transmit data across different portions of the network.
Type: Grant
Filed: June 6, 2016
Date of Patent: May 9, 2017
Assignee: Microsoft Technology Licensing, LLC
Inventors: David A. Maltz, David T. Harper, III, Douglas Christopher Burger
-
Publication number: 20170031372
Abstract: Low cost storage for write once read rarely data is described. In an embodiment a storage device comprises a plurality of hard disk drives connected to a server via an interconnect fabric. The storage device comprises a cooling system which is only capable of cooling a first subset of the hard disk drives and a power supply system which is only capable of powering a second subset of the hard disk drives and in some examples, the interconnect fabric may be only capable of providing full bandwidth for a third subset of the hard disk drives. Each subset may comprise only a small fraction of hard disk drives. A control mechanism, which may be implemented in software, is provided which controls which hard disk drives are active at any time in order that the constraints set by the cooling and power supply systems and interconnect fabric are not violated.
Type: Application
Filed: October 17, 2016
Publication date: February 2, 2017
Inventors: Shobana M. Balakrishnan, David T. Harper, Stephen Heil, Eric C. Peterson, Adam B. Glass, David Alex Butler, Austin Nicholas Donnelly, Antony Ian Taylor Rowstron, Sergey Legtchenko
-
Patent number: 9471068
Abstract: Low cost storage for write once read rarely data is described. In an embodiment a storage device comprises a plurality of hard disk drives connected to a server via an interconnect fabric. The storage device comprises a cooling system which is only capable of cooling a first subset of the hard disk drives and a power supply system which is only capable of powering a second subset of the hard disk drives and in some examples, the interconnect fabric may be only capable of providing full bandwidth for a third subset of the hard disk drives. Each subset may comprise only a small fraction of hard disk drives. A control mechanism, which may be implemented in software, is provided which controls which hard disk drives are active at any time in order that the constraints set by the cooling and power supply systems and interconnect fabric are not violated.
Type: Grant
Filed: October 16, 2014
Date of Patent: October 18, 2016
Assignee: Microsoft Technology Licensing, LLC
Inventors: Shobana M. Balakrishnan, David T. Harper, Stephen Heil, Eric C. Peterson, Adam B. Glass, David Alex Butler, Austin Nicholas Donnelly, Antony Ian Taylor Rowstron, Sergey Legtchenko
-
Publication number: 20160294679
Abstract: The transmission of multiple copies of data to other computing devices is optimized by minimizing the number of copies of such data transmitted through an expensive portion of the network. A store-and-forward methodology is utilized to transmit only a single copy through the expensive portion and the data is subsequently forked into multiple copies directed to multiple destination computing devices. Computing devices that are not intended destinations can be conscripted as intermediate computing devices, if appropriate to minimize copies of the data transmitted through an expensive portion. Additionally, accommodation can be made for data that is intolerant of out-of-order delivery by utilizing adaptive protocols that avoid mechanisms that may result in out-of-order delivery for data intolerant of such and by utilizing packet sorting at data convergence points to reorder the data. Different protocol settings can be utilized to transmit data across different portions of the network.
Type: Application
Filed: June 6, 2016
Publication date: October 6, 2016
Inventors: David A. Maltz, David T. Harper, III, Douglas Christopher Burger
-
Publication number: 20160269309
Abstract: A data center includes a plurality of computing units that communicate with each other using wireless communication, such as high frequency RF wireless communication. The data center may organize the computing units into groups (e.g., racks). In one implementation, each group may form a three-dimensional structure, such as a column having a free-space region for accommodating intra-group communication among computing units. The data center can include a number of features to facilitate communication, including dual-use memory for handling computing and buffering tasks, failsafe routing mechanisms, provisions to address permanent interface and hidden terminal scenarios, etc.
Type: Application
Filed: May 25, 2016
Publication date: September 15, 2016
Applicant: Microsoft Technology Licensing, LLC
Inventors: Ji Yong SHIN, Darko KIROVSKI, David T. HARPER
-
Patent number: 9391716
Abstract: A data center includes a plurality of computing units that communicate with each other using wireless communication, such as high frequency RF wireless communication. The data center may organize the computing units into groups (e.g., racks). In one implementation, each group may form a three-dimensional structure, such as a column having a free-space region for accommodating intra-group communication among computing units. The data center can include a number of features to facilitate communication, including dual-use memory for handling computing and buffering tasks, failsafe routing mechanisms, provisions to address permanent interface and hidden terminal scenarios, etc.
Type: Grant
Filed: April 5, 2010
Date of Patent: July 12, 2016
Assignee: Microsoft Technology Licensing, LLC
Inventors: Ji Yong Shin, Darko Kirovski, David T. Harper, III
-
Patent number: 9363303
Abstract: The transmission of multiple copies of data to other computing devices is optimized by minimizing the number of copies of such data transmitted through an expensive portion of the network. A store-and-forward methodology is utilized to transmit only a single copy through the expensive portion and the data is subsequently forked into multiple copies directed to multiple destination computing devices. Computing devices that are not intended destinations can be conscripted as intermediate computing devices, if appropriate to minimize copies of the data transmitted through an expensive portion. Additionally, accommodation can be made for data that is intolerant of out-of-order delivery by utilizing adaptive protocols that avoid mechanisms that may result in out-of-order delivery for data intolerant of such and by utilizing packet sorting at data convergence points to reorder the data. Different protocol settings can be utilized to transmit data across different portions of the network.
Type: Grant
Filed: March 15, 2013
Date of Patent: June 7, 2016
Assignee: Microsoft Technology Licensing, LLC
Inventors: David A. Maltz, David T. Harper, III, Douglas Christopher Burger
-
Publication number: 20160112296
Abstract: A direct network is described in which each resource is connected to a switching fabric via a set of two or more routing nodes. The routing nodes are distributed so as to satisfy at least one inter-node separation criterion. In one case, the separation criterion specifies that, for each resource, a number of routing nodes that share a same coordinate value with another routing node in the set (in a same coordinate dimension) is to be minimized. In some network topologies, such as a torus network, this means a number of unique loops of the direct network to which each resource is connected is to be maximized. The routing provisions described herein offer various performance benefits, such as improved latency-related performance.
Type: Application
Filed: December 21, 2015
Publication date: April 21, 2016
Applicant: Microsoft Technology Licensing, LLC
Inventors: David T. HARPER, Eric C. PETERSON, Mark A. SANTANIELLO