Patents Assigned to Soft Machines, Inc.
  • Patent number: 8677105
    Abstract: A unified architecture for dynamic generation, execution, synchronization and parallelization of complex instructions formats includes a virtual register file, register cache and register file hierarchy. A self-generating and synchronizing dynamic and static threading architecture provides efficient context switching.
    Type: Grant
    Filed: November 14, 2007
    Date of Patent: March 18, 2014
    Assignee: Soft Machines, Inc.
    Inventor: Mohammad A. Abdallah
  • Publication number: 20140075168
    Abstract: A method for outputting reliably predictable instruction sequences. The method includes tracking repetitive hits to determine a set of frequently hit instruction sequences for a microprocessor, and out of that set, identifying a branch instruction having a series of subsequent frequently executed branch instructions that form a reliably predictable instruction sequence. The reliably predictable instruction sequence is stored into a buffer. On a subsequent hit to the branch instruction, the reliably predictable instruction sequence is output from the buffer.
    Type: Application
    Filed: October 12, 2011
    Publication date: March 13, 2014
    Applicant: SOFT MACHINES, INC.
    Inventor: Mohammad Abdallah
  • Publication number: 20140032844
    Abstract: Systems and methods for flushing a cache with modified data are disclosed. Responsive to a request to flush data from a cache with modified data to a next level cache that does not include the cache with modified data, the cache with modified data is accessed using an index and a way and an address associated with the index and the way is secured. Using the address, the cache with modified data is accessed a second time and an entry that is associated with the address is retrieved from the cache with modified data. The entry is placed into a location of the next level cache.
    Type: Application
    Filed: July 30, 2012
    Publication date: January 30, 2014
    Applicant: SOFT MACHINES, INC.
    Inventors: Karthikeyan Avudaiyappan, Mohammad Abdallah
  • Publication number: 20140032846
    Abstract: Systems and methods for supporting a plurality of load and store accesses of a cache are disclosed. Responsive to a request of a plurality of requests to access a block of a plurality of blocks of a load cache, the block of the load cache and a logically and physically paired block of a store coalescing cache are accessed in parallel. The data that is accessed from the block of the load cache is overwritten by the data that is accessed from the block of the store coalescing cache by merging on a per byte basis. Access is provided to the merged data.
    Type: Application
    Filed: July 30, 2012
    Publication date: January 30, 2014
    Applicant: Soft Machines, Inc.
    Inventors: Karthikeyan Avudaiyappan, Mohammad Abdallah
  • Publication number: 20140032856
    Abstract: A method for maintaining the coherency of a store coalescing cache and a load cache is disclosed. As a part of the method, responsive to a write-back of an entry from a level one store coalescing cache to a level two cache, the entry is written into the level two cache and into the level one load cache. The writing of the entry into the level two cache and into the level one load cache is executed at the speed of access of the level two cache.
    Type: Application
    Filed: July 30, 2012
    Publication date: January 30, 2014
    Applicant: SOFT MACHINES, INC.
    Inventors: Karthikeyan Avudaiyappan, Mohammad Abdallah
  • Publication number: 20140032845
    Abstract: A method for supporting a plurality of load accesses is disclosed. A plurality of requests to access a data cache is accessed, and in response, a tag memory is accessed that maintains a plurality of copies of tags for each entry in the data cache. Tags are identified that correspond to individual requests. The data cache is accessed based on the tags that correspond to the individual requests. A plurality of requests to access the same block of the plurality of blocks causes an access arbitration that is executed in the same clock cycle as is the access of the tag memory.
    Type: Application
    Filed: July 30, 2012
    Publication date: January 30, 2014
    Applicant: SOFT MACHINES, INC.
    Inventors: Karthikeyan Avudaiyappan, Mohammad Abdallah
  • Publication number: 20130311759
    Abstract: A method for outputting alternative instruction sequences. The method includes tracking repetitive hits to determine a set of frequently hit instruction sequences for a microprocessor. A frequently miss-predicted branch instruction is identified, wherein the predicted outcome of the branch instruction is frequently wrong. An alternative instruction sequence for the branch instruction target is stored into a buffer. On a subsequent hit to the branch instruction where the predicted outcome of the branch instruction was wrong, the alternative instruction sequence is output from the buffer.
    Type: Application
    Filed: October 12, 2011
    Publication date: November 21, 2013
    Applicant: SOFT MACHINES, INC.
    Inventor: Mohammad Abdallah
  • Publication number: 20130238874
    Abstract: Systems and methods for accessing a unified translation lookaside buffer (TLB) are disclosed. A method includes receiving an indicator of a level one translation lookaside buffer (L1TLB) miss corresponding to a request for a virtual address to physical address translation, searching a cache that includes virtual addresses and page sizes that correspond to translation table entries (TTEs) that have been evicted from the L1TLB, where a page size is identified, and searching a second level TLB and identifying a physical address that is contained in the second level TLB. Access is provided to the identified physical address.
    Type: Application
    Filed: March 7, 2012
    Publication date: September 12, 2013
    Applicant: SOFT MACHINES, INC.
    Inventors: Karthikeyan Avudaiyappan, Mohammad Abdallah
  • Publication number: 20130091340
    Abstract: A matrix of execution blocks form a set of rows and columns. The rows support parallel execution of instructions and the columns support execution of dependent instructions. The matrix of execution blocks process a single block of instructions specifying parallel and dependent instructions.
    Type: Application
    Filed: November 30, 2012
    Publication date: April 11, 2013
    Applicant: SOFT MACHINES, INC.
    Inventor: Soft Machines, Inc.
  • Publication number: 20130024661
    Abstract: A hardware based translation accelerator. The hardware includes a guest fetch logic component for accessing guest instructions; a guest fetch buffer coupled to the guest fetch logic component and a branch prediction component for assembling guest instructions into a guest instruction block; and conversion tables coupled to the guest fetch buffer for translating the guest instruction block into a corresponding native conversion block. The hardware further includes a native cache coupled to the conversion tables for storing the corresponding native conversion block, and a conversion look aside buffer coupled to the native cache for storing a mapping of the guest instruction block to corresponding native conversion block, wherein upon a subsequent request for a guest instruction, the conversion look aside buffer is indexed to determine whether a hit occurred, wherein the mapping indicates the guest instruction has a corresponding converted native instruction in the native cache.
    Type: Application
    Filed: January 27, 2012
    Publication date: January 24, 2013
    Applicant: SOFT MACHINES, INC.
    Inventor: Mohammad Abdallah
  • Publication number: 20130024619
    Abstract: A method for translating instructions for a processor. The method includes accessing a guest instruction and performing a first level translation of the guest instruction using a first level conversion table. The method further includes outputting a resulting native instruction when the first level translation proceeds to completion. A second level translation of the guest instruction is performed using a second level conversion table when the first level translation does not proceed to completion, wherein the second level translation further processes the guest instruction based upon a partial translation from the first level conversion table. The resulting native instruction is output when the second level translation proceeds to completion.
    Type: Application
    Filed: January 27, 2012
    Publication date: January 24, 2013
    Applicant: SOFT MACHINES, INC.
    Inventor: Mohammad Abdallah
  • Patent number: 8327115
    Abstract: A matrix of execution blocks form a set of rows and columns. The rows support parallel execution of instructions and the columns support execution of dependent instructions. The matrix of execution blocks process a single block of instructions specifying parallel and dependent instructions.
    Type: Grant
    Filed: April 12, 2007
    Date of Patent: December 4, 2012
    Assignee: Soft Machines, Inc.
    Inventor: Mohammad A. Abdallah
  • Publication number: 20120297170
    Abstract: A method for decentralized resource allocation in an integrated circuit. The method includes receiving a plurality of requests from a plurality of resource consumers of a plurality of partitionable engines to access a plurality resources, wherein the resources are spread across the plurality of engines and are accessed via a global interconnect structure. At each resource, a number of requests for access to said each resource are added. At said each resource, the number of requests are compared against a threshold limiter. At said each resource, a subsequent request that is received that exceeds the threshold limiter is canceled. Subsequently, requests that are not canceled within a current clock cycle are implemented.
    Type: Application
    Filed: May 18, 2012
    Publication date: November 22, 2012
    Applicant: SOFT MACHINES, INC.
    Inventor: Mohammad Abdallah
  • Publication number: 20120297396
    Abstract: A global interconnect system. The global interconnect system includes a plurality of resources having data for supporting the execution of multiple code sequences and a plurality of engines for implementing the execution of the multiple code sequences. A plurality of resource consumers are within each of the plurality of engines. A global interconnect structure is coupled to the plurality of resource consumers and coupled to the plurality of resources to enable data access and execution of the multiple code sequences, wherein the resource consumers access the resources through a per cycle utilization of the global interconnect structure.
    Type: Application
    Filed: May 18, 2012
    Publication date: November 22, 2012
    Applicant: SOFT MACHINES, INC.
    Inventor: Mohammad Abdallah
  • Publication number: 20120246450
    Abstract: A system for executing instructions using a plurality of register file segments for a processor. The system includes a global front end scheduler for receiving an incoming instruction sequence, wherein the global front end scheduler partitions the incoming instruction sequence into a plurality of code blocks of instructions and generates a plurality of inheritance vectors describing interdependencies between instructions of the code blocks. The system further includes a plurality of virtual cores of the processor coupled to receive code blocks allocated by the global front end scheduler, wherein each virtual core comprises a respective subset of resources of a plurality of partitionable engines, wherein the code blocks are executed by using the partitionable engines in accordance with a virtual core mode and in accordance with the respective inheritance vectors. A plurality register file segments are coupled to the partitionable engines for providing data storage.
    Type: Application
    Filed: March 23, 2012
    Publication date: September 27, 2012
    Applicant: SOFT MACHINES, INC.
    Inventor: Mohammad Abdallah
  • Publication number: 20120246448
    Abstract: A system for executing instructions using a plurality of memory fragments for a processor. The system includes a global front end scheduler for receiving an incoming instruction sequence, wherein the global front end scheduler partitions the incoming instruction sequence into a plurality of code blocks of instructions and generates a plurality of inheritance vectors describing interdependencies between instructions of the code blocks. The system further includes a plurality of virtual cores of the processor coupled to receive code blocks allocated by the global front end scheduler, wherein each virtual core comprises a respective subset of resources of a plurality of partitionable engines, wherein the code blocks are executed by using the partitionable engines in accordance with a virtual core mode and in accordance with the respective inheritance vectors. A plurality memory fragments are coupled to the partitionable engines for providing data storage.
    Type: Application
    Filed: March 23, 2012
    Publication date: September 27, 2012
    Applicant: SOFT MACHINES, INC.
    Inventor: Mohammad Abdallah
  • Publication number: 20120246657
    Abstract: A method for executing instructions using a plurality of virtual cores for a processor. The method includes receiving an incoming instruction sequence using a global front end scheduler, and partitioning the incoming instruction sequence into a plurality of code blocks of instructions. The method further includes generating a plurality of inheritance vectors describing interdependencies between instructions of the code blocks, and allocating the code blocks to a plurality of virtual cores of the processor, wherein each virtual core comprises a respective subset of resources of a plurality of partitionable engines. The code blocks are executed by using the partitionable engines in accordance with a virtual core mode and in accordance with the respective inheritance vectors.
    Type: Application
    Filed: March 23, 2012
    Publication date: September 27, 2012
    Applicant: SOFT MACHINES, INC.
    Inventor: Mohammad Abdallah
  • Publication number: 20120198209
    Abstract: A method for translating instructions for a processor. The method includes accessing a plurality of guest instructions that comprise multiple guest branch instructions comprising at least one guest far branch, and building an instruction sequence from the plurality of guest instructions by using branch prediction on the at least one guest far branch. The method further includes assembling a guest instruction block from the instruction sequence. The guest instruction block is translated to a corresponding native conversion block, wherein an at least one native far branch that corresponds to the at least one guest far branch and wherein the at least one native far branch includes an opposite guest address for an opposing branch path of the at least one guest far branch. Upon encountering a missprediction, a correct instruction sequence is obtained by accessing the opposite guest address.
    Type: Application
    Filed: January 27, 2012
    Publication date: August 2, 2012
    Applicant: SOFT MACHINES, INC.
    Inventor: Mohammad Abdallah
  • Publication number: 20120198122
    Abstract: A method for managing mappings of storage on a code cache for a processor. The method includes storing a plurality of guest address to native address mappings as entries in a conversion look aside buffer, wherein the entries indicate guest addresses that have corresponding converted native addresses stored within a code cache memory, and receiving a subsequent request for a guest address at the conversion look aside buffer. The conversion look aside buffer is indexed to determine whether there exists an entry that corresponds to the index, wherein the index comprises a tag and an offset that is used to identify the entry that corresponds to the index. Upon a hit on the tag, the corresponding entry is accessed to retrieve a pointer to the code cache memory corresponding block of converted native instructions. The corresponding block of converted native instructions are fetched from the code cache memory for execution.
    Type: Application
    Filed: January 27, 2012
    Publication date: August 2, 2012
    Applicant: SOFT MACHINES, INC.
    Inventor: Mohammad Abdallah
  • Publication number: 20120198168
    Abstract: A method for managing a variable caching structure for managing storage for a processor. The method includes using a multi-way tag array to store a plurality of pointers for a corresponding plurality of different size groups of physical storage of a storage stack, wherein the pointers indicate guest addresses that have corresponding converted native addresses stored within the storage stack, and allocating a group of storage blocks of the storage stack, wherein the size of the allocation is in accordance with a corresponding size of one of the plurality of different size groups. Upon a hit on the tag, a corresponding entry is accessed to retrieve a pointer that indicates where in the storage stack a corresponding group of storage blocks of converted native instructions reside. The converted native instructions are then fetched from the storage stack for execution.
    Type: Application
    Filed: January 27, 2012
    Publication date: August 2, 2012
    Applicant: SOFT MACHINES, INC.
    Inventor: Mohammad Abdallah