Patents by Inventor Steven J. Wallach

Steven J. Wallach has filed for patents to protect the following inventions. This listing includes pending patent applications as well as patents already granted by the United States Patent and Trademark Office (USPTO). Illustrative code sketches for several of the mechanisms described in these abstracts follow the listing.

  • Publication number: 20210365381
    Abstract: The present invention is directed to a system and method which employ two memory access paths: 1) a cache-access path in which block data is fetched from main memory for loading to a cache, and 2) a direct-access path in which individually-addressed data is fetched from main memory. The system may comprise one or more processor cores that utilize the cache-access path for accessing data. The system may further comprise at least one heterogeneous functional unit that is operable to utilize the direct-access path for accessing data. In certain embodiments, the one or more processor cores, cache, and the at least one heterogeneous functional unit may be included on a common semiconductor die (e.g., as part of an integrated circuit). Embodiments of the present invention enable improved system performance by selectively employing the cache-access path for certain instructions while selectively employing the direct-access path for other instructions.
    Type: Application
    Filed: August 8, 2021
    Publication date: November 25, 2021
    Inventors: Steven J. Wallach, Tony M. Brewer
  • Patent number: 11106592
    Abstract: The present invention is directed to a system and method which employ two memory access paths: 1) a cache-access path in which block data is fetched from main memory for loading to a cache, and 2) a direct-access path in which individually-addressed data is fetched from main memory. The system may comprise one or more processor cores that utilize the cache-access path for accessing data. The system may further comprise at least one heterogeneous functional unit that is operable to utilize the direct-access path for accessing data. In certain embodiments, the one or more processor cores, cache, and the at least one heterogeneous functional unit may be included on a common semiconductor die (e.g., as part of an integrated circuit). Embodiments of the present invention enable improved system performance by selectively employing the cache-access path for certain instructions while selectively employing the direct-access path for other instructions.
    Type: Grant
    Filed: May 16, 2017
    Date of Patent: August 31, 2021
    Assignee: Micron Technology, Inc.
    Inventors: Steven J. Wallach, Tony M. Brewer
  • Publication number: 20170249253
    Abstract: The present invention is directed to a system and method which employ two memory access paths: 1) a cache-access path in which block data is fetched from main memory for loading to a cache, and 2) a direct-access path in which individually-addressed data is fetched from main memory. The system may comprise one or more processor cores that utilize the cache-access path for accessing data. The system may further comprise at least one heterogeneous functional unit that is operable to utilize the direct-access path for accessing data. In certain embodiments, the one or more processor cores, cache, and the at least one heterogeneous functional unit may be included on a common semiconductor die (e.g., as part of an integrated circuit). Embodiments of the present invention enable improved system performance by selectively employing the cache-access path for certain instructions while selectively employing the direct-access path for other instructions.
    Type: Application
    Filed: May 16, 2017
    Publication date: August 31, 2017
    Inventors: Steven J. Wallach, Tony M. Brewer
  • Patent number: 9710384
    Abstract: The present invention is directed to a system and method which employ two memory access paths: 1) a cache-access path in which block data is fetched from main memory for loading to a cache, and 2) a direct-access path in which individually-addressed data is fetched from main memory. The system may comprise one or more processor cores that utilize the cache-access path for accessing data. The system may further comprise at least one heterogeneous functional unit that is operable to utilize the direct-access path for accessing data. In certain embodiments, the one or more processor cores, cache, and the at least one heterogeneous functional unit may be included on a common semiconductor die (e.g., as part of an integrated circuit). Embodiments of the present invention enable improved system performance by selectively employing the cache-access path for certain instructions while selectively employing the direct-access path for other instructions.
    Type: Grant
    Filed: January 4, 2008
    Date of Patent: July 18, 2017
    Assignee: Micron Technology, Inc.
    Inventors: Steven J. Wallach, Tony Brewer
  • Patent number: 8561037
    Abstract: A software compiler is provided that is operable for generating an executable that comprises instructions for a plurality of different instruction sets as may be employed by different processors in a multi-processor system. The compiler may generate an executable that includes a first portion of instructions to be processed by a first instruction set (such as a first instruction set of a first processor in a multi-processor system) and a second portion of instructions to be processed by a second instruction set (such as a second instruction set of a second processor in a multi-processor system). Such executable may be generated for execution on a multi-processor system that comprises at least one host processor, which may comprise a fixed instruction set, such as the well-known x86 instruction set, and at least one co-processor, which comprises dynamically reconfigurable logic that enables the co-processor's instruction set to be dynamically reconfigured.
    Type: Grant
    Filed: August 29, 2007
    Date of Patent: October 15, 2013
    Assignee: Convey Computer
    Inventors: Steven J. Wallach, Tony Brewer
  • Patent number: 8279879
    Abstract: A chunk format for a large-scale, high data throughput router includes a preamble that allows each individual chunk to have clock and data recovery performed before the chunk data is retrieved. The format includes a chunk header that contains information specific to the entire chunk. A chunk according to the present format can contain multiple packet segments, with each segment having its own packet header for packet-specific information. The format provides for a scrambler seed which allows scrambling the data to achieve a favorable zero and one balance as well as minimal run lengths. The scrambler seed can be chosen at random from the available seeds for any particular chunk to prevent malicious forcing of zero-and-one patterns or long runs of zeroes and ones. A chunk cyclical redundancy check (CRC) and forward error correction (FEC) bytes are included to detect and/or correct any errors and to ensure a high degree of data and control integrity.
    Type: Grant
    Filed: September 21, 2009
    Date of Patent: October 2, 2012
    Assignee: Foundry Networks, LLC
    Inventors: Tony M. Brewer, Harry C. Blackmon, Chris Davies, Harold W. Dozier, Thomas C. McDermott, III, Steven J. Wallach, Dean E. Walker, Lou Yeh
  • Patent number: 8205066
    Abstract: A co-processor is provided that comprises one or more application engines that can be dynamically configured to a desired personality. For instance, the application engines may be dynamically configured to any of a plurality of different vector processing instruction sets, such as a single-precision vector processing instruction set and a double-precision vector processing instruction set. The co-processor further comprises a common infrastructure that is common across all of the different personalities, such as an instruction decode infrastructure, memory management infrastructure, system interface infrastructure, and/or scalar processing unit (that has a base set of instructions). Thus, the personality of the co-processor can be dynamically modified (by reconfiguring one or more application engines of the co-processor), while the common infrastructure of the co-processor remains consistent across the various personalities.
    Type: Grant
    Filed: October 31, 2008
    Date of Patent: June 19, 2012
    Assignee: Convey Computer
    Inventors: Tony Brewer, Steven J. Wallach
  • Patent number: 8156307
    Abstract: A multi-processor system comprises at least one host processor, which may comprise a fixed instruction set, such as the well-known x86 instruction set. The system further comprises at least one co-processor, which comprises dynamically reconfigurable logic that enables the co-processor's instruction set to be dynamically reconfigured. In this manner, the at least one host processor and the at least one dynamically reconfigurable co-processor are heterogeneous processors having different instruction sets. Further, cache coherency is maintained between the heterogeneous host and co-processors. And, a single executable file may contain instructions that are processed by the multi-processor system, wherein a portion of the instructions are processed by the host processor and a portion of the instructions are processed by the co-processor.
    Type: Grant
    Filed: August 20, 2007
    Date of Patent: April 10, 2012
    Assignee: Convey Computer
    Inventors: Steven J. Wallach, Tony Brewer
  • Patent number: 8122229
    Abstract: A dispatch mechanism is provided for dispatching instructions of an executable from a host processor to a heterogeneous co-processor. According to certain embodiments, cache coherency is maintained between the host processor and the heterogeneous co-processor, and such cache coherency is leveraged for dispatching instructions of an executable that are to be processed by the co-processor. For instance, in certain embodiments, a designated portion of memory (e.g., “UCB”) is utilized, wherein a host processor may place information in such UCB and the co-processor can retrieve information from the UCB (and vice-versa). The UCB may thus be used to dispatch instructions of an executable for processing by the co-processor. In certain embodiments, the co-processor may comprise dynamically reconfigurable logic which enables the co-processor's instruction set to be dynamically changed, and the dispatching operation may identify one of a plurality of predefined instruction sets to be loaded onto the co-processor.
    Type: Grant
    Filed: September 12, 2007
    Date of Patent: February 21, 2012
    Assignee: Convey Computer
    Inventors: Steven J. Wallach, Tony Brewer
  • Publication number: 20100115237
    Abstract: A co-processor is provided that comprises one or more application engines that can be dynamically configured to a desired personality. For instance, the application engines may be dynamically configured to any of a plurality of different vector processing instruction sets, such as a single-precision vector processing instruction set and a double-precision vector processing instruction set. The co-processor further comprises a common infrastructure that is common across all of the different personalities, such as an instruction decode infrastructure, memory management infrastructure, system interface infrastructure, and/or scalar processing unit (that has a base set of instructions). Thus, the personality of the co-processor can be dynamically modified (by reconfiguring one or more application engines of the co-processor), while the common infrastructure of the co-processor remains consistent across the various personalities.
    Type: Application
    Filed: October 31, 2008
    Publication date: May 6, 2010
    Applicant: Convey Computer
    Inventors: Tony Brewer, Steven J. Wallach
  • Publication number: 20100115233
    Abstract: The present invention is directed generally to dynamically-selectable vector register partitioning, and more specifically to a processor infrastructure (e.g., co-processor infrastructure in a multi-processor system) that supports dynamic setting of vector register partitioning to any of a plurality of different vector partitioning modes. Thus, rather than being restricted to a fixed vector register partitioning mode, embodiments of the present invention enable a processor to be dynamically set to any of a plurality of different vector partitioning modes. For instance, different vector register partitioning modes may be employed for different applications being executed by the processor, and/or different vector register partitioning modes may even be employed for processing different vector-oriented operations within a given application being executed by the processor, in accordance with certain embodiments of the present invention.
    Type: Application
    Filed: October 31, 2008
    Publication date: May 6, 2010
    Applicant: Convey Computer
    Inventors: Tony Brewer, Steven J. Wallach
  • Patent number: 7613183
    Abstract: A chunk format for a large-scale, high data throughput router includes a preamble that allows each individual chunk to have clock and data recovery performed before the chunk data is retrieved. The format includes a chunk header that contains information specific to the entire chunk. A chunk according to the present format can contain multiple packet segments, with each segment having its own packet header for packet-specific information. The format provides for a scrambler seed which allows scrambling the data to achieve a favorable zero and one balance as well as minimal run lengths. There are forward error correction (FEC) bytes as well as a chunk cyclical redundancy check (CRC) to detect and/or correct any errors and also to ensure a high degree of data and control integrity. Advantageously, a framing symbol inserted into the chunk format itself allows the receiving circuitry to identify or locate a particular chunk format.
    Type: Grant
    Filed: October 31, 2000
    Date of Patent: November 3, 2009
    Assignee: Foundry Networks, Inc.
    Inventors: Tony M. Brewer, Harry C. Blackmon, Chris Davies, Harold W. Dozier, Thomas C. McDermott, III, Steven J. Wallach, Dean E. Walker, Lou Yeh
  • Publication number: 20090177843
    Abstract: The present invention is directed to a system and method which employ two memory access paths: 1) a cache-access path in which block data is fetched from main memory for loading to a cache, and 2) a direct-access path in which individually-addressed data is fetched from main memory. The system may comprise one or more processor cores that utilize the cache-access path for accessing data. The system may further comprise at least one heterogeneous functional unit that is operable to utilize the direct-access path for accessing data. In certain embodiments, the one or more processor cores, cache, and the at least one heterogeneous functional unit may be included on a common semiconductor die (e.g., as part of an integrated circuit). Embodiments of the present invention enable improved system performance by selectively employing the cache-access path for certain instructions while selectively employing the direct-access path for other instructions.
    Type: Application
    Filed: January 4, 2008
    Publication date: July 9, 2009
    Applicant: Convey Computer
    Inventors: Steven J. Wallach, Tony Brewer
  • Publication number: 20090070553
    Abstract: A dispatch mechanism is provided for dispatching instructions of an executable from a host processor to a heterogeneous co-processor. According to certain embodiments, cache coherency is maintained between the host processor and the heterogeneous co-processor, and such cache coherency is leveraged for dispatching instructions of an executable that are to be processed by the co-processor. For instance, in certain embodiments, a designated portion of memory (e.g., “UCB”) is utilized, wherein a host processor may place information in such UCB and the co-processor can retrieve information from the UCB (and vice-versa). The UCB may thus be used to dispatch instructions of an executable for processing by the co-processor. In certain embodiments, the co-processor may comprise dynamically reconfigurable logic which enables the co-processor's instruction set to be dynamically changed, and the dispatching operation may identify one of a plurality of predefined instruction sets to be loaded onto the co-processor.
    Type: Application
    Filed: September 12, 2007
    Publication date: March 12, 2009
    Applicant: Convey Computer
    Inventors: Steven J. Wallach, Tony Brewer
  • Publication number: 20090064095
    Abstract: A software compiler is provided that is operable for generating an executable that comprises instructions for a plurality of different instruction sets as may be employed by different processors in a multi-processor system. The compiler may generate an executable that includes a first portion of instructions to be processed by a first instruction set (such as a first instruction set of a first processor in a multi-processor system) and a second portion of instructions to be processed by a second instruction set (such as a second instruction set of a second processor in a multi-processor system). Such executable may be generated for execution on a multi-processor system that comprises at least one host processor, which may comprise a fixed instruction set, such as the well-known x86 instruction set, and at least one co-processor, which comprises dynamically reconfigurable logic that enables the co-processor's instruction set to be dynamically reconfigured.
    Type: Application
    Filed: August 29, 2007
    Publication date: March 5, 2009
    Applicant: Convey Computer
    Inventors: Steven J. Wallach, Tony Brewer
  • Publication number: 20090055596
    Abstract: A multi-processor system comprises at least one host processor, which may comprise a fixed instruction set, such as the well-known x86 instruction set. The system further comprises at least one co-processor, which comprises dynamically reconfigurable logic that enables the co-processor's instruction set to be dynamically reconfigured. In this manner, the at least one host processor and the at least one dynamically reconfigurable co-processor are heterogeneous processors having different instruction sets. Further, cache coherency is maintained between the heterogeneous host and co-processors. And, a single executable file may contain instructions that are processed by the multi-processor system, wherein a portion of the instructions are processed by the host processor and a portion of the instructions are processed by the co-processor.
    Type: Application
    Filed: August 20, 2007
    Publication date: February 26, 2009
    Applicant: Convey Computer
    Inventors: Steven J. Wallach, Tony Brewer
  • Patent number: 4942518
    Abstract: A physical cache unit (100) is used within a computer (20). The computer (20) further includes a main memory (99), a memory control unit (22), input/output processors (54, 68) and a central processor (156). The central processor includes an address translation unit (118), an instruction processing unit (126), an address scalar unit (142), a vector control unit (144) and vector processing units (148, 150). The physical cache unit (100) stores operands in a data cache (180), the operands for delivery to and receipt from the central processor (156). Addresses for requested operands are received from the central processor (156) and are examined concurrently during one clock cycle in tag stores (190 and 192). The tag stores (190 and 192) produce tags which are compared in comparators (198 and 200) to the tags of physical addresses received from the central processor (156).
    Type: Grant
    Filed: November 12, 1985
    Date of Patent: July 17, 1990
    Assignee: Convex Computer Corporation
    Inventors: James R. Weatherford, Arthur T. Kimmel, Steven J. Wallach
  • Patent number: 4926317
    Abstract: A vector processing computer (20) includes a memory control unit (22), main memory (99), a central processor (156), a service processing unit (42) and a plurality of input/output processors (54, 68). The central processor (156) includes a physical cache unit (100), an address translation unit (118), an instruction processing unit (126), an address scalar unit (142), a vector control unit (144), an odd pipe vector processing unit (148) and an even pipe vector processing unit (150). Vector elements are transmitted from memory, either main memory (99), a physical cache unit (100) or a logical cache (326) through a source bus (114) where the elements are alternately loaded into the vector processing units (148, 150). The resulting vectors are transmitted through a destination bus (114) to either the physical cache unit (100), the main memory (99), the logical cache (326) or to an input/output processor (54).
    Type: Grant
    Filed: April 12, 1988
    Date of Patent: May 15, 1990
    Assignee: Convex Computer Corporation
    Inventors: Steven J. Wallach, David M. Chastain, James R. Weatherford
  • Patent number: 4821184
    Abstract: A universal addressing system for use in a digital data processing system including a universal memory for storing data including instructions and at least one local system having access to the universal memory. The universal memory is organized into objects, and each item of data is associated with an object. Each object is specified by a unique identifier, and data is addressed by means of a logical address which specifies the UID of the object containing the data and the offset of the data in the object. A processor in the local system responds to instructions by providing memory operation specifiers to the universal memory. Each memory operation specifier specifies a memory operation and a logical address. The offset in the logical address may specify any bit in the object. The memory operation specifier also specifies a length in bits.
    Type: Grant
    Filed: September 4, 1984
    Date of Patent: April 11, 1989
    Assignee: Data General Corporation
    Inventors: Gerald F. Clancy, Craig J. Mundie, Stephen I. Schleimer, Steven J. Wallach, Richard G. Bratt
  • Patent number: 4809171
    Abstract: An operand processing unit (10) carries out processing of operands in a computer. The unit (10) includes a plurality of operation circuits (12, 14, 16, 18, 20). A source bus (22) provides one operand per clock cycle to the operation circuits (12, 14, 16, 18, 20). A destination bus (24) receives one resultant per clock cycle from the operation circuits (12, 14, 16, 18, 20). Within each operation circuit there is provided an operand processing circuit (80) which performs a selected function with the received operands. These functions include, for example, multiplication, division, addition, subtraction, logical AND, and shift. Logical circuitry provides a priority assignment to the operation circuits (12, 14, 16, 18, 20) for sequencing the loading of operands into the highest priority operation circuit (12, 14, 16, 18, 20) which is not busy processing operands within its corresponding operand processing circuit (80).
    Type: Grant
    Filed: January 21, 1988
    Date of Patent: February 28, 1989
    Assignee: Convex Computer Corporation
    Inventors: Harold W. Dozier, Thomas M. Jones, Steven J. Wallach, Jeffrey H. Gruger
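
The sketches below model, in plain C, several of the mechanisms described in the abstracts above. Each is a minimal software illustration under stated assumptions, not the patented hardware or the assignees' actual implementations.

The first sketch corresponds to the dual-path memory architecture of patent 11106592 and its related filings: processor cores fetch whole blocks through a cache-access path, while a heterogeneous functional unit fetches individually addressed words through a direct-access path. The memory size, line size, and direct-mapped cache organization are assumptions chosen for brevity.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Toy model of the two access paths: a block-fetching cache path and a
 * word-addressed direct path.  Sizes are illustrative assumptions. */
#define MEM_WORDS   1024
#define LINE_WORDS  8          /* words fetched per cache-path access */
#define NUM_LINES   16         /* direct-mapped cache capacity        */

static uint64_t main_memory[MEM_WORDS];

typedef struct {
    int      valid;
    uint32_t tag;
    uint64_t data[LINE_WORDS];
} cache_line_t;

static cache_line_t cache[NUM_LINES];

/* Cache-access path: on a miss, an entire line is fetched from memory. */
static uint64_t cache_read(uint32_t addr)
{
    uint32_t line = (addr / LINE_WORDS) % NUM_LINES;
    uint32_t tag  = addr / (LINE_WORDS * NUM_LINES);
    uint32_t base = (addr / LINE_WORDS) * LINE_WORDS;

    if (!cache[line].valid || cache[line].tag != tag) {
        memcpy(cache[line].data, &main_memory[base],
               LINE_WORDS * sizeof(uint64_t));      /* block fetch */
        cache[line].valid = 1;
        cache[line].tag   = tag;
    }
    return cache[line].data[addr % LINE_WORDS];
}

/* Direct-access path: a single word is fetched, bypassing the cache,
 * as a heterogeneous functional unit might for sparse accesses. */
static uint64_t direct_read(uint32_t addr)
{
    return main_memory[addr];
}

int main(void)
{
    for (uint32_t i = 0; i < MEM_WORDS; i++)
        main_memory[i] = i;

    printf("core (cache path)  @17:  %llu\n", (unsigned long long)cache_read(17));
    printf("unit (direct path) @901: %llu\n", (unsigned long long)direct_read(901));
    return 0;
}
```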
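A sketch of the multi-instruction-set executable described in patent 8561037, as run on the heterogeneous system of patent 8156307: one executable carries sections for a fixed host instruction set and for a dynamically reconfigurable co-processor, and a loader routes each section to the matching processor. The section structure and names are hypothetical.

```c
#include <stdio.h>

/* Hypothetical container for an executable that carries code for two
 * different instruction sets.  The section layout and dispatch policy
 * are illustrative only. */
typedef enum { ISA_HOST_X86, ISA_COPROC_RECONFIG } isa_t;

typedef struct {
    isa_t       isa;       /* which processor's instruction set */
    const char *name;      /* section name (for the demo)       */
    unsigned    n_bytes;   /* size of the encoded instructions  */
} code_section_t;

static void run_on_host(const code_section_t *s)
{
    printf("host   executes %-12s (%u bytes, fixed x86 ISA)\n", s->name, s->n_bytes);
}

static void run_on_coprocessor(const code_section_t *s)
{
    printf("coproc executes %-12s (%u bytes, reconfigurable ISA)\n", s->name, s->n_bytes);
}

int main(void)
{
    /* One executable, two kinds of sections. */
    code_section_t exe[] = {
        { ISA_HOST_X86,        ".text.main",   512  },
        { ISA_COPROC_RECONFIG, ".text.vector", 2048 },
        { ISA_HOST_X86,        ".text.io",     128  },
    };

    for (unsigned i = 0; i < sizeof exe / sizeof exe[0]; i++) {
        if (exe[i].isa == ISA_HOST_X86)
            run_on_host(&exe[i]);
        else
            run_on_coprocessor(&exe[i]);
    }
    return 0;
}
```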
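A sketch of dispatch through a designated, cache-coherent memory region (the "UCB" of patent 8122229): the host writes a request identifying which predefined instruction set to load and where the work is, and the co-processor picks it up and posts a result. The field layout and polling handshake are assumptions; the patent relies on hardware cache coherency so that both processors see the same region.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdbool.h>

/* Stand-in for the shared, coherent UCB region. */
typedef struct {
    volatile uint32_t command_valid;   /* host sets, co-processor clears */
    uint32_t          instruction_set; /* which predefined ISA to load   */
    uint64_t          code_address;    /* where the dispatched work is   */
    volatile uint64_t result;          /* filled in by the co-processor  */
    volatile uint32_t result_valid;
} ucb_t;

static ucb_t ucb;

/* Host side: place a dispatch request in the UCB. */
static void host_dispatch(uint32_t isa_id, uint64_t code_addr)
{
    ucb.instruction_set = isa_id;
    ucb.code_address    = code_addr;
    ucb.command_valid   = 1;
}

/* Co-processor side: notice the request, "execute", post a result. */
static bool coproc_poll(void)
{
    if (!ucb.command_valid)
        return false;
    printf("coproc: loading instruction set %u, running code at 0x%llx\n",
           ucb.instruction_set, (unsigned long long)ucb.code_address);
    ucb.result        = 42;           /* pretend result of the dispatched work */
    ucb.result_valid  = 1;
    ucb.command_valid = 0;
    return true;
}

int main(void)
{
    host_dispatch(3, 0x1000);
    while (!coproc_poll())
        ;
    printf("host: result = %llu\n", (unsigned long long)ucb.result);
    return 0;
}
```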
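A sketch of the "personality" concept of patent 8205066 and publication 20100115237: the common infrastructure stays fixed while the application-engine behavior is swapped at run time, here between single-precision and double-precision vector operations. A function table stands in for reconfigurable logic; the personality names and operations are assumptions.

```c
#include <stdio.h>

typedef struct {
    const char *name;
    double (*vec_op)(double a, double b);   /* reconfigurable behaviour */
} personality_t;

static double add_sp(double a, double b) { return (float)a + (float)b; }
static double add_dp(double a, double b) { return a + b; }

static const personality_t SINGLE_PRECISION = { "single-precision", add_sp };
static const personality_t DOUBLE_PRECISION = { "double-precision", add_dp };

typedef struct {
    /* Common infrastructure: present under every personality. */
    unsigned long instructions_decoded;
    /* Reconfigurable part: the currently loaded personality.  */
    const personality_t *personality;
} coprocessor_t;

static void load_personality(coprocessor_t *cp, const personality_t *p)
{
    cp->personality = p;            /* dynamic reconfiguration */
}

static double execute(coprocessor_t *cp, double a, double b)
{
    cp->instructions_decoded++;     /* common decode path      */
    return cp->personality->vec_op(a, b);
}

int main(void)
{
    coprocessor_t cp = { 0, &SINGLE_PRECISION };
    printf("%s: %.10f\n", cp.personality->name, execute(&cp, 1.0e-8, 1.0));
    load_personality(&cp, &DOUBLE_PRECISION);
    printf("%s: %.10f\n", cp.personality->name, execute(&cp, 1.0e-8, 1.0));
    return 0;
}
```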
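A sketch of dynamically selectable vector register partitioning (publication 20100115233): a fixed pool of vector elements is viewed as either a few long registers or many short ones, with the mode changeable between applications or even between operations. The pool size and mode set are illustrative assumptions.

```c
#include <stdio.h>

#define POOL_ELEMENTS 64

typedef enum {
    PARTITION_1x64 = 1,   /* 1 register  x 64 elements */
    PARTITION_4x16 = 4,   /* 4 registers x 16 elements */
    PARTITION_8x8  = 8,   /* 8 registers x  8 elements */
} partition_mode_t;

typedef struct {
    double           pool[POOL_ELEMENTS];
    partition_mode_t mode;
} vector_regfile_t;

static unsigned reg_length(const vector_regfile_t *rf)
{
    return POOL_ELEMENTS / (unsigned)rf->mode;
}

/* Address element `elem` of logical register `reg` under the current mode. */
static double *element(vector_regfile_t *rf, unsigned reg, unsigned elem)
{
    return &rf->pool[reg * reg_length(rf) + elem];
}

int main(void)
{
    vector_regfile_t rf = { .mode = PARTITION_4x16 };
    *element(&rf, 2, 5) = 3.14;                 /* register 2, element 5 */
    printf("4x16 mode: reg length %u, v2[5] = %.2f\n",
           reg_length(&rf), *element(&rf, 2, 5));

    rf.mode = PARTITION_8x8;                    /* repartition for another use */
    printf("8x8  mode: reg length %u, same storage element = %.2f\n",
           reg_length(&rf), rf.pool[2 * 16 + 5]);
    return 0;
}
```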
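A sketch of the chunk format of patents 8279879 and 7613183: a preamble for per-chunk clock and data recovery, chunk-wide header information, multiple packet segments each with its own packet header, a per-chunk scrambler seed, a CRC, and FEC bytes. Field names and widths are assumptions, and the toy scrambler merely illustrates how a per-chunk seed randomizes the bit pattern.

```c
#include <stdint.h>
#include <stdio.h>

#define SEGS_PER_CHUNK 4
#define SEG_PAYLOAD    64

typedef struct {
    uint16_t packet_id;             /* packet-specific information */
    uint16_t seg_len;
    uint8_t  payload[SEG_PAYLOAD];
} packet_segment_t;

typedef struct {
    uint8_t          preamble[8];    /* allows per-chunk clock/data recovery */
    uint32_t         scrambler_seed; /* randomly chosen per chunk            */
    uint16_t         chunk_flags;    /* information specific to whole chunk  */
    packet_segment_t segs[SEGS_PER_CHUNK];
    uint32_t         crc;            /* chunk CRC                            */
    uint8_t          fec[16];        /* forward error correction bytes       */
} chunk_t;

/* Simple additive scrambler driven by the per-chunk seed (illustrative;
 * real designs use specified polynomials). */
static void scramble(uint8_t *buf, unsigned len, uint32_t seed)
{
    uint32_t lfsr = seed ? seed : 1;
    for (unsigned i = 0; i < len; i++) {
        lfsr = lfsr * 1103515245u + 12345u;   /* toy PRNG step           */
        buf[i] ^= (uint8_t)(lfsr >> 24);      /* balance zeroes and ones */
    }
}

int main(void)
{
    chunk_t c = { .scrambler_seed = 0xC0FFEE };
    c.segs[0].packet_id  = 7;
    c.segs[0].seg_len    = 3;
    c.segs[0].payload[0] = 0xAA;
    scramble(c.segs[0].payload, SEG_PAYLOAD, c.scrambler_seed);
    printf("scrambled first payload byte: 0x%02X\n", c.segs[0].payload[0]);
    return 0;
}
```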
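A sketch of the universal addressing scheme of patent 4821184: each operand is named by the unique identifier (UID) of its containing object plus a bit offset and a bit length, so any bit in any object can be addressed. The object table and bit-extraction routine below are assumptions.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    uint64_t uid;         /* unique identifier of the containing object */
    uint64_t bit_offset;  /* may designate any bit in the object        */
    uint32_t bit_length;  /* length of the operand in bits              */
} logical_address_t;

typedef struct {
    uint64_t uid;
    uint8_t  bytes[32];
} object_t;

/* A tiny "universal memory": objects looked up by UID. */
static object_t universal_memory[] = {
    { 0x1001, { 0 } },
    { 0x1002, { 0 } },
};

static uint64_t load_bits(const logical_address_t *la)
{
    for (unsigned i = 0; i < sizeof universal_memory / sizeof universal_memory[0]; i++) {
        if (universal_memory[i].uid != la->uid)
            continue;
        uint64_t word;
        memcpy(&word, &universal_memory[i].bytes[la->bit_offset / 8], sizeof word);
        word >>= la->bit_offset % 8;                    /* align to bit 0 */
        if (la->bit_length < 64)
            word &= (1ull << la->bit_length) - 1;       /* mask to length */
        return word;
    }
    return 0;   /* UID not found */
}

int main(void)
{
    universal_memory[1].bytes[2] = 0xA5;                /* plant a value  */
    logical_address_t la = { .uid = 0x1002, .bit_offset = 16, .bit_length = 8 };
    printf("object 0x%llx, bit offset %llu, %u bits -> 0x%llX\n",
           (unsigned long long)la.uid, (unsigned long long)la.bit_offset,
           la.bit_length, (unsigned long long)load_bits(&la));
    return 0;
}
```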
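Finally, a sketch of the odd/even vector pipe arrangement of patent 4926317: successive vector elements are loaded alternately into two processing units so both operate in parallel. The software model is sequential and the vector length and operation are illustrative; only the element interleaving is the point.

```c
#include <stdio.h>

#define VLEN 8

static void even_pipe(const double *a, const double *b, double *out)
{
    for (int i = 0; i < VLEN; i += 2)      /* elements 0, 2, 4, ... */
        out[i] = a[i] + b[i];
}

static void odd_pipe(const double *a, const double *b, double *out)
{
    for (int i = 1; i < VLEN; i += 2)      /* elements 1, 3, 5, ... */
        out[i] = a[i] + b[i];
}

int main(void)
{
    double a[VLEN], b[VLEN], out[VLEN];
    for (int i = 0; i < VLEN; i++) { a[i] = i; b[i] = 10 * i; }

    even_pipe(a, b, out);   /* in hardware the two pipes run concurrently */
    odd_pipe(a, b, out);

    for (int i = 0; i < VLEN; i++)
        printf("%g ", out[i]);
    printf("\n");
    return 0;
}
```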