Patents by Inventor Steven J. Wallach
Steven J. Wallach has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20210365381Abstract: The present invention is directed to a system and method which employ two memory access paths: 1) a cache-access path in which block data is fetched from main memory for loading to a cache, and 2) a direct-access path in which individually-addressed data is fetched from main memory. The system may comprise one or more processor cores that utilize the cache-access path for accessing data. The system may further comprise at least one heterogeneous functional unit that is operable to utilize the direct-access path for accessing data. In certain embodiments, the one or more processor cores, cache, and the at least one heterogeneous functional unit may be included on a common semiconductor die (e.g., as part of an integrated circuit). Embodiments of the present invention enable improved system performance by selectively employing the cache-access path for certain instructions while selectively employing the direct-access path for other instructions.Type: ApplicationFiled: August 8, 2021Publication date: November 25, 2021Inventors: Steven J. Wallach, Tony M. Brewer
-
Patent number: 11106592Abstract: The present invention is directed to a system and method which employ two memory access paths: 1) a cache-access path in which block data is fetched from main memory for loading to a cache, and 2) a direct-access path in which individually-addressed data is fetched from main memory. The system may comprise one or more processor cores that utilize the cache-access path for accessing data. The system may further comprise at least one heterogeneous functional unit that is operable to utilize the direct-access path for accessing data. In certain embodiments, the one or more processor cores, cache, and the at least one heterogeneous functional unit may be included on a common semiconductor die (e.g., as part of an integrated circuit). Embodiments of the present invention enable improved system performance by selectively employing the cache-access path for certain instructions while selectively employing the direct-access path for other instructions.Type: GrantFiled: May 16, 2017Date of Patent: August 31, 2021Assignee: Micron Technology, Inc.Inventors: Steven J. Wallach, Tony M. Brewer
-
Publication number: 20170249253Abstract: The present invention is directed to a system and method which employ two memory access paths: 1) a cache-access path in which block data is fetched from main memory for loading to a cache, and 2) a direct-access path in which individually-addressed data is fetched from main memory. The system may comprise one or more processor cores that utilize the cache-access path for accessing data. The system may further comprise at least one heterogeneous functional unit that is operable to utilize the direct-access path for accessing data. In certain embodiments, the one or more processor cores, cache, and the at least one heterogeneous functional unit may be included on a common semiconductor die (e.g., as part of an integrated circuit). Embodiments of the present invention enable improved system performance by selectively employing the cache-access path for certain instructions while selectively employing the direct-access path for other instructions.Type: ApplicationFiled: May 16, 2017Publication date: August 31, 2017Inventors: Steven J. Wallach, Tony M. Brewer
-
Patent number: 9710384Abstract: The present invention is directed to a system and method which employ two memory access paths: 1) a cache-access path in which block data is fetched from main memory for loading to a cache, and 2) a direct-access path in which individually-addressed data is fetched from main memory. The system may comprise one or more processor cores that utilize the cache-access path for accessing data. The system may further comprise at least one heterogeneous functional unit that is operable to utilize the direct-access path for accessing data. In certain embodiments, the one or more processor cores, cache, and the at least one heterogeneous functional unit may be included on a common semiconductor die (e.g., as part of an integrated circuit). Embodiments of the present invention enable improved system performance by selectively employing the cache-access path for certain instructions while selectively employing the direct-access path for other instructions.Type: GrantFiled: January 4, 2008Date of Patent: July 18, 2017Assignee: Micron Technology, Inc.Inventors: Steven J. Wallach, Tony Brewer
-
Patent number: 8561037Abstract: A software compiler is provided that is operable for generating an executable that comprises instructions for a plurality of different instruction sets as may be employed by different processors in a multi-processor system. The compiler may generate an executable that includes a first portion of instructions to be processed by a first instruction set (such as a first instruction set of a first processor in a multi-processor system) and a second portion of instructions to be processed by a second instruction set (such as a second instruction set of a second processor in a multi-processor system). Such executable may be generated for execution on a multi-processor system that comprises at least one host processor, which may comprise a fixed instruction set, such as the well-known x86 instruction set, and at least one co-processor, which comprises dynamically reconfigurable logic that enables the co-processor's instruction set to be dynamically reconfigured.Type: GrantFiled: August 29, 2007Date of Patent: October 15, 2013Assignee: Convey ComputerInventors: Steven J. Wallach, Tony Brewer
-
Patent number: 8279879Abstract: A chunk format for a large-scale, high data throughput router includes a preamble that allows each individual chunk to have clock and data recovery performed before the chunk data is retrieved. The format includes a chunk header that contains information specific to the entire chunk. A chunk according to the present format can contain multiple packet segments, with each segment having its own packet header for packet-specific information. The format provides for a scrambler seed which allows scrambling the data to achieve a favorable zero and one balance as well as minimal run lengths. There can be a random choice of available scrambler seeds for any particular chunk to avoid malicious forcing of zero and one patterns or run lengths of bit zeroes and ones. There are a chunk cyclical redundancy check (CRC) as well as forward error correction (FEC) bytes to detect and/or correct any errors and also to insure a high degree of data and control integrity.Type: GrantFiled: September 21, 2009Date of Patent: October 2, 2012Assignee: Foundry Networks, LLCInventors: Tony M. Brewer, Harry C. Blackmon, Chris Davies, Harold W. Dozier, Thomas C. McDermott, III, Steven J. Wallach, Dean E. Walker, Lou Yeh
-
Patent number: 8205066Abstract: A co-processor is provided that comprises one or more application engines that can be dynamically configured to a desired personality. For instance, the application engines may be dynamically configured to any of a plurality of different vector processing instruction sets, such as a single-precision vector processing instruction set and a double-precision vector processing instruction set. The co-processor further comprises a common infrastructure that is common across all of the different personalities, such as an instruction decode infrastructure, memory management infrastructure, system interface infrastructure, and/or scalar processing unit (that has a base set of instructions). Thus, the personality of the co-processor can be dynamically modified (by reconfiguring one or more application engines of the co-processor), while the common infrastructure of the co-processor remains consistent across the various personalities.Type: GrantFiled: October 31, 2008Date of Patent: June 19, 2012Assignee: Convey ComputerInventors: Tony Brewer, Steven J. Wallach
-
Patent number: 8156307Abstract: A multi-processor system comprises at least one host processor, which may comprise a fixed instruction set, such as the well-known x86 instruction set. The system further comprises at least one co-processor, which comprises dynamically reconfigurable logic that enables the co-processor's instruction set to be dynamically reconfigured. In this manner, the at least one host processor and the at least one dynamically reconfigurable co-processor are heterogeneous processors having different instruction sets. Further, cache coherency is maintained between the heterogeneous host and co-processors. And, a single executable file may contain instructions that are processed by the multi-processor system, wherein a portion of the instructions are processed by the host processor and a portion of the instructions are processed by the co-processor.Type: GrantFiled: August 20, 2007Date of Patent: April 10, 2012Assignee: Convey ComputerInventors: Steven J. Wallach, Tony Brewer
-
Patent number: 8122229Abstract: A dispatch mechanism is provided for dispatching instructions of an executable from a host processor to a heterogeneous co-processor. According to certain embodiments, cache coherency is maintained between the host processor and the heterogeneous co-processor, and such cache coherency is leveraged for dispatching instructions of an executable that are to be processed by the co-processor. For instance, in certain embodiments, a designated portion of memory (e.g., “UCB”) is utilized, wherein a host processor may place information in such UCB and the co-processor can retrieve information from the UCB (and vice-versa). The UCB may thus be used to dispatch instructions of an executable for processing by the co-processor. In certain embodiments, the co-processor may comprise dynamically reconfigurable logic which enables the co-processor's instruction set to be dynamically changed, and the dispatching operation may identify one of a plurality of predefined instruction sets to be loaded onto the co-processor.Type: GrantFiled: September 12, 2007Date of Patent: February 21, 2012Assignee: Convey ComputerInventors: Steven J. Wallach, Tony Brewer
-
Publication number: 20100115237Abstract: A co-processor is provided that comprises one or more application engines that can be dynamically configured to a desired personality. For instance, the application engines may be dynamically configured to any of a plurality of different vector processing instruction sets, such as a single-precision vector processing instruction set and a double-precision vector processing instruction set. The co-processor further comprises a common infrastructure that is common across all of the different personalities, such as an instruction decode infrastructure, memory management infrastructure, system interface infrastructure, and/or scalar processing unit (that has a base set of instructions). Thus, the personality of the co-processor can be dynamically modified (by reconfiguring one or more application engines of the co-processor), while the common infrastructure of the co-processor remains consistent across the various personalities.Type: ApplicationFiled: October 31, 2008Publication date: May 6, 2010Applicant: Convey ComputerInventors: Tony Brewer, Steven J. Wallach
-
Publication number: 20100115233Abstract: The present invention is directed generally to dynamically-selectable vector register partitioning, and more specifically to a processor infrastructure (e.g., co-processor infrastructure in a multi-processor system) that supports dynamic setting of vector register partitioning to any of a plurality of different vector partitioning modes. Thus, rather than being restricted to a fixed vector register partitioning mode, embodiments of the present invention enable a processor to be dynamically set to any of a plurality of different vector partitioning modes. Thus, for instance, different vector register partitioning modes may be employed for different applications being executed by the processor, and/or different vector register partitioning modes may even be employed for use in processing different vector oriented operations within a given applications being executed by the processor, in accordance with certain embodiments of the present invention.Type: ApplicationFiled: October 31, 2008Publication date: May 6, 2010Applicant: Convey ComputerInventors: Tony Brewer, Steven J. Wallach
-
Patent number: 7613183Abstract: A chunk format for a large-scale, high data throughput router includes a preamble that allows each individual chunk to have clock and data recovery performed before the chunk data is retrieved. The format includes a chunk header that contains information specific to the entire chunk. A chunk according to the present format can contain multiple packet segments, with each segment having its own packet header for packet-specific information. The format provides for a scrambler seed which allows scrambling the data to achieve a favorable zero and one balance as well as minimal run lengths. There are forward error correction (FEC) bytes as well as a chunk cyclical redundancy check (CRC) to detect and/or correct any errors and also to insure a high degree of data and control integrity. Advantageously, a framing symbol inserted into the chunk format itself allows the receiving circuitry to identify or locate a particular chunk format.Type: GrantFiled: October 31, 2000Date of Patent: November 3, 2009Assignee: Foundry Networks, Inc.Inventors: Tony M. Brewer, Harry C. Blackmon, Chris Davies, Harold W. Dozier, Thomas C. McDermott, III, Steven J. Wallach, Dean E. Walker, Lou Yeh
-
Publication number: 20090177843Abstract: The present invention is directed to a system and method which employ two memory access paths: 1) a cache-access path in which block data is fetched from main memory for loading to a cache, and 2) a direct-access path in which individually-addressed data is fetched from main memory. The system may comprise one or more processor cores that utilize the cache-access path for accessing data. The system may further comprise at least one heterogeneous functional unit that is operable to utilize the direct-access path for accessing data. In certain embodiments, the one or more processor cores, cache, and the at least one heterogeneous functional unit may be included on a common semiconductor die (e.g., as part of an integrated circuit). Embodiments of the present invention enable improved system performance by selectively employing the cache-access path for certain instructions while selectively employing the direct-access path for other instructions.Type: ApplicationFiled: January 4, 2008Publication date: July 9, 2009Applicant: Convey ComputerInventors: Steven J. Wallach, Tony Brewer
-
Publication number: 20090070553Abstract: A dispatch mechanism is provided for dispatching instructions of an executable from a host processor to a heterogeneous co-processor. According to certain embodiments, cache coherency is maintained between the host processor and the heterogeneous co-processor, and such cache coherency is leveraged for dispatching instructions of an executable that are to be processed by the co-processor. For instance, in certain embodiments, a designated portion of memory (e.g., “UCB”) is utilized, wherein a host processor may place information in such UCB and the co-processor can retrieve information from the UCB (and vice-versa). The UCB may thus be used to dispatch instructions of an executable for processing by the co-processor. In certain embodiments, the co-processor may comprise dynamically reconfigurable logic which enables the co-processor's instruction set to be dynamically changed, and the dispatching operation may identify one of a plurality of predefined instruction sets to be loaded onto the co-processor.Type: ApplicationFiled: September 12, 2007Publication date: March 12, 2009Applicant: Convey ComputerInventors: Steven J. Wallach, Tony Brewer
-
Publication number: 20090064095Abstract: A software compiler is provided that is operable for generating an executable that comprises instructions for a plurality of different instruction sets as may be employed by different processors in a multi-processor system. The compiler may generate an executable that includes a first portion of instructions to be processed by a first instruction set (such as a first instruction set of a first processor in a multi-processor system) and a second portion of instructions to be processed by a second instruction set (such as a second instruction set of a second processor in a multi-processor system). Such executable may be generated for execution on a multi-processor system that comprises at least one host processor, which may comprise a fixed instruction set, such as the well-known x86 instruction set, and at least one co-processor, which comprises dynamically reconfigurable logic that enables the co-processor's instruction set to be dynamically reconfigured.Type: ApplicationFiled: August 29, 2007Publication date: March 5, 2009Applicant: Convey ComputerInventors: Steven J. Wallach, Tony Brewer
-
Publication number: 20090055596Abstract: A multi-processor system comprises at least one host processor, which may comprise a fixed instruction set, such as the well-known x86 instruction set. The system further comprises at least one co-processor, which comprises dynamically reconfigurable logic that enables the co-processor's instruction set to be dynamically reconfigured. In this manner, the at least one host processor and the at least one dynamically reconfigurable co-processor are heterogeneous processors having different instruction sets. Further, cache coherency is maintained between the heterogeneous host and co-processors. And, a single executable file may contain instructions that are processed by the multi-processor system, wherein a portion of the instructions are processed by the host processor and a portion of the instructions are processed by the co-processor.Type: ApplicationFiled: August 20, 2007Publication date: February 26, 2009Applicant: Convey ComputerInventors: Steven J. Wallach, Tony Brewer
-
Patent number: 4942518Abstract: A physical cache unit (100) is used within a computer (20). The computer (20) further includes a main memory (99) a memory control unit (22), inputs/output processors (54, 68) and a central processor (156). The central processor includes an address translation unit (118), an instruction processing unit (126), an address scalar unit (142), a vector control unit (144) and vector processing units (148, 150). The physical cache unit (100) stores operands in a data cache (180), the operands for delivery to and receipt from the control processor (156). Addresses for requested operands are received from the central processor (156) and are examined concurrently during one clock cycle in tag stores (190 and 192). The tag stores (190 and 192) produce tags which are compared in comparators (198 and 200) to the tag of physical addresses received from the central processor (156).Type: GrantFiled: November 12, 1985Date of Patent: July 17, 1990Assignee: Convex Computer CorporationInventors: James R. Weatherford, Arthur T. Kimmel, Steven J. Wallach
-
Patent number: 4926317Abstract: A vector processing computer (20) includes a memory control unit (22), main memory (99), a central processor (156), a service processing unit (42) and a plurality of input/output processors (54, 68). The central processor (156) includes a physical cache unit (100), an address translation unit (118), an instruction processing unit (126), an address scalar unit (142), a vector control unit (144), an odd pipe vector processing unit (148) and an even pipe vector processing unit (150). Vector elements are transmitted from memory, either main memory (99), a physical cache unit (100) or a logical cache (326) through a source bus (114) where the elements are alternately loaded into the vector processing units (148, 150). The resulting vectors are transmitted through a destination bus (114) to either the physical cache unit (100), the main memory (99), the logical cache (326) or to an input/output processor (54).Type: GrantFiled: April 12, 1988Date of Patent: May 15, 1990Assignee: Convex Computer CorporationInventors: Steven J. Wallach, David M. Chastain, James R. Weatherford
-
Patent number: 4821184Abstract: A universal addressing system for use in a digital data processing system including a universal memory for storing data including instructions and at least one local system having access to the universal memory. The universal memory is organized into objects, and each item of data is associated with an object. Each object is specified by a unique identifier, and data is addressed by means of a logical address which specifies the UID of the object containing the data and the offset of the data in the object. A processor in the local system responds to instructions by providing memory operation specifiers to the universal memory. Each memory operation specifier specifies a memory operation and a logical address. The offset in the logical address may specify any bit in the object. The memory operation specifier also specifies a length in bits.Type: GrantFiled: September 4, 1984Date of Patent: April 11, 1989Assignee: Data General CorporationInventors: Gerald F. Clancy, Craig J. Mundie, Stephen I. Schleimer, Steven J. Wallach, Richard G. Bratt
-
Patent number: 4809171Abstract: An operand processing unit (10) carries out processing of operands in a computer. The unit (10) includes a plurality of operation circuits (12, 14, 16, 18, 20). A source bus (22) provides one operand per clock cycle to the operation circuits (12, 14, 16, 18, 20). A destination bus (24) receives one resultant per clock cycle from the operation circuits (12, 14, 16, 18, 20). Within each operation circuit there is provided an operand processing circuit (80) which performs a selected function with the received operands. These functions include, for example, multiplication, division, addition, subtraction, logical AND, and shift. Logical circuitry provides a priority assignment to the operation circuits (12, 14, 16, 18, 20) for sequencing the loading of operands into the highest priority operation circuit (12, 14, 16, 18, 20) which is not busy processing operands within its corresponding operand processing circuit (80).Type: GrantFiled: January 21, 1988Date of Patent: February 28, 1989Assignee: Convex Computer CorporationInventors: Harold W. Dozier, Thomas M. Jones, Steven J. Wallach, Jeffrey H. Gruger