Patents by Inventor Eric Oliver Mejdrich
Eric Oliver Mejdrich has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 7890699
Abstract: A circuit arrangement and method bypass the storage of requested data in a higher level cache of a multi-level memory architecture during the return of the requested data to a requester, while caching the requested data in a lower level cache. For certain types of data, e.g., data that is only used once and/or that is rarely modified or written back to memory, bypassing storage in a higher level cache reduces the likelihood of the requested data casting out frequently used data from the higher level cache. By caching the data in a lower level cache, however, the lower level cache can still snoop data requests and return requested data in the event the data is already cached in the lower level cache.
Type: Grant
Filed: January 10, 2008
Date of Patent: February 15, 2011
Assignee: International Business Machines Corporation
Inventors: Miguel Comparan, Eric Oliver Mejdrich, Adam James Muff
-
Operand multiplexor control modifier instruction in a fine grain multithreaded vector microprocessor
Patent number: 7868894
Abstract: The present invention is generally related to the field of image processing, and more specifically to an instruction set for processing images. Vector processing may involve rearranging vector operands in one or more source registers prior to performing vector operations. Typically, rearranging of operands in source registers is done by issuing a plurality of permute instructions that require excessive usage of temporary registers. Furthermore, the permute instructions may cause dependencies between instructions executing in a pipeline, thereby adversely affecting performance. Embodiments of the invention provide a level of muxing between a register file and a vector unit that allows for rearrangement of vector operands in source registers prior to providing the operands to the vector unit, thereby obviating the need for permute instructions.
Type: Grant
Filed: November 28, 2006
Date of Patent: January 11, 2011
Assignee: International Business Machines Corporation
Inventors: Eric Oliver Mejdrich, Adam James Muff, Matthew Ray Tubbs
-
Publication number: 20100333099
Abstract: A method and circuit arrangement process a workload in a multithreaded processor that includes a plurality of hardware threads. Each thread receives at least one message carrying data to process the workload through a respective inbox from among a plurality of inboxes. A plurality of messages are received at a first inbox among the plurality of inboxes, wherein the first inbox is associated with a first thread among the plurality of hardware threads, and wherein each message is associated with a priority. From the plurality of received messages, a first message is selected to process in the first thread based on that first message being associated with the highest priority among the received messages. A second message is selected to process in the first thread based on that second message being associated with the earliest time stamp among the received messages and in response to processing the first message.
Type: Application
Filed: June 30, 2009
Publication date: December 30, 2010
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Mark Gary Kupferschmidt, Eric Oliver Mejdrich, Paul Emery Schardt, Frederick Jacob Ziegler
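The selection order described in this abstract can be illustrated with a minimal sketch. It is one possible reading of the abstract, not the claimed circuit: the `Message` and `Inbox` names, the convention that a larger `priority` value wins, and the integer `timestamp` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Message:
    payload: str
    priority: int   # assumption: a larger value means a higher priority
    timestamp: int  # assumption: a smaller value means an earlier arrival


class Inbox:
    """Hypothetical per-thread inbox holding received messages."""

    def __init__(self):
        self._messages = []
        self._first_selected = False

    def receive(self, msg):
        self._messages.append(msg)

    def next_message(self):
        """Pick by highest priority first, then by earliest time stamp."""
        if not self._messages:
            return None
        if not self._first_selected:
            # First selection: the message with the highest priority.
            chosen = max(self._messages, key=lambda m: m.priority)
            self._first_selected = True
        else:
            # Later selections: the message with the earliest time stamp.
            chosen = min(self._messages, key=lambda m: m.timestamp)
        self._messages.remove(chosen)
        return chosen
```

With messages of priority 1, 5 and 2 received at time stamps 1, 3 and 2, this sketch would hand out the priority-5 message first and then fall back to arrival order for the rest.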
-
Patent number: 7852336
Abstract: By mapping leaf nodes of a spatial index to processing elements, efficient distribution of workload in an image processing system may be achieved. In addition, processing elements may use a thread table to redistribute workload from processing elements which are experiencing an increased workload to processing elements which may be idle. Furthermore, the workload experienced by processing elements may be monitored in order to determine if workload is balanced. Periodically the leaf nodes for which processing elements are responsible may be remapped in response to a detected imbalance in workload. By monitoring the workload experienced by the processing elements and remapping leaf nodes to different processing elements in response to unbalanced workload, efficient distribution of workload may be maintained. Efficient distribution of workload may improve the performance of the image processing system.
Type: Grant
Filed: November 28, 2006
Date of Patent: December 14, 2010
Assignee: International Business Machines Corporation
Inventors: Jeffrey Douglas Brown, Russell Dean Hoover, Eric Oliver Mejdrich, Robert Allen Shearer
-
Patent number: 7836258
Abstract: According to embodiments of the invention, a distributed time base signal may be coupled to a memory directory which provides address translation for data located within a memory cache. The memory directory may have attribute bits which indicate whether or not the memory entries have been accessed by the distributed time base signal. Furthermore, the memory directory may have attribute bits which indicate whether or not a memory directory entry should be considered invalid after an access to the memory entry by the distributed time base signal. If the memory directory entry has been accessed by the distributed time base signal and the memory directory entry should be considered invalid after the access by the time base signal, any attempted address translation using the memory directory entry may cause a cache miss. The cache miss may initiate the retrieval of valid data from memory.
Type: Grant
Filed: November 13, 2006
Date of Patent: November 16, 2010
Assignee: International Business Machines Corporation
Inventors: Jeffrey Douglas Brown, Russell Dean Hoover, Eric Oliver Mejdrich
-
Patent number: 7818503
Abstract: One embodiment of the invention provides a method and apparatus for utilizing memory. The method includes reserving a first portion of a cache in a processor for an inbox. The inbox is associated with a first thread being executed by the processor. The method also includes receiving a packet from a second thread, wherein the packet includes an access request. The method further includes using inbox control circuitry for the inbox to process the received packet and determine whether to grant the access request included in the packet.
Type: Grant
Filed: December 7, 2006
Date of Patent: October 19, 2010
Assignee: International Business Machines Corporation
Inventors: Russell Dean Hoover, Jon K. Kriegel, Eric Oliver Mejdrich, Robert Allen Shearer
-
Patent number: 7809925
Abstract: A vectorizable execution unit is capable of being operated in a plurality of modes, with the processing lanes in the vectorizable execution unit grouped into different combinations of logical execution units in different modes. By doing so, processing lanes can be selectively grouped together to operate as different types of vector execution units and/or scalar execution units, and if desired, dynamically switched during runtime to process various types of instruction streams in a manner that is best suited for each type of instruction stream. As a consequence, a single vectorizable execution unit may be configurable, e.g., via software control, to operate either as a vector execution unit or a plurality of scalar execution units.
Type: Grant
Filed: December 7, 2007
Date of Patent: October 5, 2010
Assignee: International Business Machines Corporation
Inventors: Eric Oliver Mejdrich, Adam James Muff, Matthew Ray Tubbs
-
Publication number: 20100228781
Abstract: A circuit arrangement, program product and method are provided for resetting a dynamically grown Accelerated Data Structure (ADS) used in image processing in which an ADS is initialized by reusing the root node of a prior ADS and resetting at least one node in the prior ADS to break a link between the reset node and a linked-to node in the prior ADS. By doing so, the memory allocated to the prior ADS may be reused for the new ADS, without having to clear or wipe out all of the allocated memory. In addition, in some instances, given the similarity of many image frames, often some or all of the node structure of a prior ADS may be reused for a new ADS, requiring only the contents of nodes to be cleared, instead of having to clear out all of the nodes in the prior ADS. As a result, the processing overhead associated with initializing a new ADS can be significantly reduced.
Type: Application
Filed: February 24, 2009
Publication date: September 9, 2010
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: David Keith Fowler, Eric Oliver Mejdrich, Paul Emery Schardt, Robert Allen Shearer
-
Patent number: 7783860
Abstract: Embodiments of the invention provide logic within the store data path between a processor and a memory array. The logic may be configured to misalign vector data as it is stored to memory. By misaligning vector data as it is stored to memory, memory bandwidth may be maximized while processing bandwidth required to store vector data misaligned is minimized. Furthermore, embodiments of the invention provide logic within the load data path which allows vector data which is stored misaligned to be aligned as it is loaded into a vector register. By aligning misaligned vector data as it is loaded into a vector register, memory bandwidth may be maximized while processing bandwidth required to align misaligned vector data may be minimized.
Type: Grant
Filed: July 31, 2007
Date of Patent: August 24, 2010
Assignee: International Business Machines Corporation
Inventors: David Arnold Luick, Eric Oliver Mejdrich, Adam James Muff
-
Publication number: 20100188396
Abstract: A method, program product and system for conducting a ray tracing operation where the rendering compute requirement is reduced or otherwise adjusted in response to a changing vantage point. Aspects may update or reuse an acceleration data structure between frames in response to the changing vantage point. Tree and image construction quality may be adjusted in response to rapid changes in the camera perspective. Alternatively or additionally, tree building cycles may be skipped. All or some of the tree structure may be built in intervals, e.g., after a preset number of frames. More geometric image data may be added per leaf node in the tree in response to an increase in the rate of change. The quality of the rendering algorithm may additionally be reduced. A ray tracing algorithm may decrease the depth of recursion, and generate fewer cast and secondary rays. The ray tracer may further reduce the quality of soft shadows, resolution and global illumination samples, among other quality parameters.
Type: Application
Filed: January 28, 2009
Publication date: July 29, 2010
Applicant: International Business Machines Corporation
Inventors: Eric Oliver Mejdrich, Paul Emery Schardt, Robert Allen Shearer, Matthew Ray Tubbs
-
Publication number: 20100188403
Abstract: A method, program product and system for conducting a ray tracing operation where the rendering compute requirement is reduced by varying the size of bounding volumes into which image data is divided and/or by varying a number of primitives included within nodes of an acceleration data structure that correspond to the bounding volumes.
Type: Application
Filed: January 28, 2009
Publication date: July 29, 2010
Applicant: International Business Machines Corporation
Inventors: Eric Oliver Mejdrich, Paul Emery Schardt, Robert Allen Shearer, Matthew Ray Tubbs
-
Patent number: 7757032
Abstract: A bus bridge between a high speed computer processor bus and a high speed output bus. The preferred embodiment is a bus bridge between a GPUL bus for a GPUL PowerPC microprocessor from International Business Machines Corporation (IBM) and an output high speed interface (MPI). Another preferred embodiment is a bus bridge in a bus transceiver on a multi-chip module.
Type: Grant
Filed: August 20, 2008
Date of Patent: July 13, 2010
Assignee: International Business Machines Corporation
Inventors: Giora Biran, Robert Allen Drehmel, Robert Spencer Horton, Mark E. Kautzman, Jamie Randall Kuesel, Ming-i Mark Lin, Eric Oliver Mejdrich, Clarence Rosser Ogilvie, Charles S. Woodruff
-
Patent number: 7752413
Abstract: A method and apparatus for communicating between threads in a processor. The method includes reserving a first portion of a cache in a processor for an inbox. The inbox is associated with a first thread being executed by the processor. The method also includes receiving a packet from a second thread, wherein the packet includes an access request. The method further includes using inbox control circuitry for the inbox to process the received packet and determine whether to grant the access request included in the packet.
Type: Grant
Filed: December 7, 2006
Date of Patent: July 6, 2010
Assignee: International Business Machines Corporation
Inventors: Russell Dean Hoover, Jon K. Kriegel, Eric Oliver Mejdrich, Robert Allen Shearer
-
Patent number: 7737974
Abstract: Embodiments of the invention provide methods and apparatus for reallocating workload related to traversal of a ray through a spatial index. In a first operating state a workload manager may be experiencing a first or a normal workload. In the first operating state the workload manager may be responsible for traversing the entire spatial index and a vector throughput engine may be responsible for performing ray-primitive intersection tests. In an increased workload state the workload manager may experience an increased workload. In response to the increased workload the image processing system may partition the spatial index such that the workload manager may be responsible for traversing a first portion of the spatial index and the vector throughput engine may be responsible for traversing a second portion of the spatial index and for performing ray-primitive intersection tests.
Type: Grant
Filed: September 27, 2006
Date of Patent: June 15, 2010
Assignee: International Business Machines Corporation
Inventors: Eric Oliver Mejdrich, Adam James Muff, Robert Allen Shearer
-
Publication number: 20100115250
Abstract: A method, computer-readable medium, and apparatus for context switching between a first thread and a second thread. The method includes detecting an exception, wherein the exception is generated in response to receiving a packet of information directed to one of the first thread and the second thread, and in response to detecting the exception, invoking an exception handler. The exception handler is configured to execute one or more instructions removing access to at least a portion of a processor cache. The portion of the processor cache contains cached information for the first thread using a first address translation. Removing access to the portion of the processor cache prevents the second thread using a second address translation from accessing the cached information in the processor cache. The exception handler is also configured to branch to at least one of the first thread and the second thread.
Type: Application
Filed: January 11, 2010
Publication date: May 6, 2010
Applicant: International Business Machines Corporation
Inventors: Jon K. Kriegel, Eric Oliver Mejdrich
-
Publication number: 20100100712
Abstract: A processing unit includes multiple execution units and sequencer logic that is disposed downstream of instruction buffer logic, and that is responsive to a sequencer instruction present in an instruction stream. In response to such an instruction, the sequencer logic issues a plurality of instructions associated with a long latency operation to one execution unit, while blocking instructions from the instruction buffer logic from being issued to that execution unit. In addition, the blocking of instructions from being issued to the execution unit does not affect the issuance of instructions to any other execution unit, and as such, other instructions from the instruction buffer logic are still capable of being issued to and executed by other execution units even while the sequencer logic is issuing the plurality of instructions associated with the long latency operation.
Type: Application
Filed: October 16, 2008
Publication date: April 22, 2010
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Eric Oliver Mejdrich, Adam James Muff, Matthew Ray Tubbs
-
Patent number: 7681020
Abstract: A method, computer-readable medium, and apparatus for context switching between a first thread and a second thread. The method includes detecting an exception, wherein the exception is generated in response to receiving a packet of information directed to one of the first thread and the second thread, and in response to detecting the exception, invoking an exception handler. The exception handler is configured to execute one or more instructions removing access to at least a portion of a processor cache. The portion of the processor cache contains cached information for the first thread using a first address translation. Removing access to the portion of the processor cache prevents the second thread using a second address translation from accessing the cached information in the processor cache. The exception handler is also configured to branch to at least one of the first thread and the second thread.
Type: Grant
Filed: April 18, 2007
Date of Patent: March 16, 2010
Assignee: International Business Machines Corporation
Inventors: Jon K. Kriegel, Eric Oliver Mejdrich
-
Publication number: 20090315908
Abstract: A circuit arrangement and method utilize texture data prefetching to prefetch texture data used by an anisotropic filtering algorithm. In particular, stride-based prefetching may be used to prefetch texture data for use in anisotropic filtering, where the value of the stride, or difference between successive accesses, is based upon a distance in a memory address space between sample points taken along the line of anisotropy used in an anisotropic filtering algorithm.
Type: Application
Filed: April 25, 2008
Publication date: December 24, 2009
Inventors: Miguel Comparan, Eric Oliver Mejdrich, Adam James Muff, Matthew Ray Tubbs
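The stride relationship in this abstract can be made concrete with a small sketch. It assumes a row-major linear texture layout and integer texel steps along the line of anisotropy, neither of which is stated in the abstract (real hardware often uses tiled layouts). The function names and parameters are hypothetical.

```python
def texel_address(base, x, y, width, texel_size):
    """Byte address of texel (x, y), assuming a row-major linear layout."""
    return base + (y * width + x) * texel_size


def stride_between_samples(dx, dy, width, texel_size):
    """Address distance between successive sample points.

    (dx, dy) is the assumed integer texel step along the line of anisotropy.
    """
    return (dy * width + dx) * texel_size


def prefetch_addresses(base, x0, y0, dx, dy, width, texel_size, count):
    """Addresses of the next `count` samples after the one at (x0, y0)."""
    stride = stride_between_samples(dx, dy, width, texel_size)
    first = texel_address(base, x0, y0, width, texel_size)
    return [first + i * stride for i in range(1, count + 1)]
```

For a 256-texel-wide texture with 4-byte texels and a step of (2, 1), the stride under these assumptions is (1 * 256 + 2) * 4 = 1032 bytes, so each prefetch target lies 1032 bytes past the previous sample.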
-
Publication number: 20090256836
Abstract: A circuit arrangement and method provide a hybrid rendering architecture capable of interfacing a streaming geometry frontend with a physical rendering backend using a dynamic accelerated data structure (ADS) generator. The dynamic ADS generator effectively parallelizes the generation of the ADS, such that an ADS may be built using a plurality of parallel threads of execution. By doing so, both the frontend and backend rendering processes are amenable to parallelization, enabling, if so desired, real time rendering using physical rendering techniques such as ray tracing and photon mapping. Furthermore, conventional streaming geometry frontends such as OpenGL and DirectX compatible frontends can readily be adapted for use with physical rendering backends, thereby enabling developers to continue to develop with known APIs, yet still obtain the benefits of physical rendering techniques.
Type: Application
Filed: April 11, 2008
Publication date: October 15, 2009
Inventors: Dave Fowler, Eric Oliver Mejdrich, Paul Emery Schardt, Robert Allen Shearer
-
Publication number: 20090231349
Abstract: A multithreaded rendering software pipeline architecture utilizes a rolling context data structure to store multiple contexts that are associated with different image elements that are being processed in the software pipeline. Each context stores state data for a particular image element, and the association of each image element with a context is maintained as the image element is passed from stage to stage of the software pipeline, thus ensuring that the state used by the different stages of the software pipeline when processing the image element remains coherent irrespective of state changes made for other image elements being processed by the software pipeline. Multiple image elements may therefore be processed concurrently by the software pipeline, and often without regard for synchronization or serialization of state changes that affect only certain image elements.
Type: Application
Filed: March 12, 2008
Publication date: September 17, 2009
Inventors: Eric Oliver Mejdrich, Paul Emery Schardt, Robert Allen Shearer
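A rough sketch of how a rolling context structure might keep per-element state coherent is given below. The abstract does not describe the layout, so the ring of slots, the copy-on-change behavior, and the `RollingContexts` name and its methods are all assumptions for illustration. The sketch also omits the in-use tracking a real design would need before a slot is overwritten on wraparound.

```python
class RollingContexts:
    """Hypothetical ring of state contexts shared by pipeline stages."""

    def __init__(self, slots):
        self._slots = [dict() for _ in range(slots)]
        self._current = 0

    def change_state(self, **updates):
        # Assumption: a state change copies the current context into the
        # next slot and applies the update there, leaving earlier contexts
        # untouched for elements still in flight.
        nxt = (self._current + 1) % len(self._slots)
        self._slots[nxt] = {**self._slots[self._current], **updates}
        self._current = nxt

    def tag(self, element):
        """Associate an image element with the context current right now."""
        return (element, self._current)

    def state_for(self, tagged):
        """Look up the state a later pipeline stage should use."""
        _, index = tagged
        return self._slots[index]
```

In this sketch, an element tagged while the color state is red still sees red at a later stage, even if the state has since been changed to blue for a newer element.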