Parallel Processors (e.g., Identical Processors) Patents (Class 345/505)
  • Patent number: 8237720
    Abstract: Embodiments for shader-based finite state machine frame detection for implementing alternative graphical processing on an animation scenario are disclosed. In accordance with one embodiment, the embodiment includes assigning an identifier to each shader used to render animation scenarios. The embodiment also includes defining a finite state machine for a key frame in each of the animation scenarios, whereby each finite state machine represents a plurality of shaders that render the key frame in each animation scenario. The embodiment further includes deriving a shader ID sequence for each finite state machine based on the identifier assigned to each shader. The embodiment additionally includes comparing an input shader ID sequence of a new frame of a new animation scenario to each of the derived shader ID sequences. Finally, the embodiment includes executing alternative graphics processing on the new animation scenario when the input shader ID sequence matches one of the derived shader ID sequences.
    Type: Grant
    Filed: February 12, 2009
    Date of Patent: August 7, 2012
    Assignee: Microsoft Corporation
    Inventors: Jinyu Li, Chen Li, Xin Tong
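    Sketch: The matching step in the abstract above can be illustrated roughly as follows. This is a minimal sketch, not the patented implementation: the shader names, the scenario names, and the functions assign_shader_ids, derive_id_sequence, and match_scenario are all hypothetical, and exact sequence equality is assumed as the matching rule.

```python
from typing import Dict, Optional, Sequence, Tuple


def assign_shader_ids(shader_names: Sequence[str]) -> Dict[str, int]:
    """Give each distinct shader a small integer ID, in first-seen order."""
    ids: Dict[str, int] = {}
    for name in shader_names:
        if name not in ids:
            ids[name] = len(ids)
    return ids


def derive_id_sequence(key_frame_shaders: Sequence[str],
                       ids: Dict[str, int]) -> Tuple[int, ...]:
    """Turn the ordered shaders of one key frame into a shader ID sequence."""
    return tuple(ids[name] for name in key_frame_shaders)


def match_scenario(input_sequence: Sequence[int],
                   known: Dict[str, Tuple[int, ...]]) -> Optional[str]:
    """Return the scenario whose derived sequence equals the input, if any."""
    candidate = tuple(input_sequence)
    for scenario, sequence in known.items():
        if candidate == sequence:
            return scenario
    return None


# Hypothetical key frames: scenario name -> ordered shaders that render it.
key_frames = {
    "explosion": ["depth", "particles", "bloom"],
    "water": ["depth", "reflect", "refract"],
}
all_shaders = [s for shaders in key_frames.values() for s in shaders]
shader_ids = assign_shader_ids(all_shaders)
known_sequences = {
    name: derive_id_sequence(shaders, shader_ids)
    for name, shaders in key_frames.items()
}

# A new frame whose shader ID sequence matches would trigger the
# alternative processing path for that scenario.
matched = match_scenario([0, 3, 4], known_sequences)
```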
  • Publication number: 20120194528
    Abstract: Embodiments of the present invention provide a method of preempting a task. The method includes removing the task from the parallel processors via a scheduling mechanism. Responsive to the removing, the method also includes ceasing (i) retrieval of commands from a buffer associated with the task, (ii) dispatch of groups of work-items associated with the task, (iii) dispatch of wavefronts associated with the task, and (iv) execution of the wavefronts. State information related to the task is saved.
    Type: Application
    Filed: November 30, 2011
    Publication date: August 2, 2012
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Robert Scott Hartog, Ralph Clay Taylor, Michael Mantor, Sebastien Nussbaum, Rex McCrary, Mark Leather, Philip J. Rogers, Thomas R. Woller, Kevin McGrath, Nuwan Jayasena
  • Patent number: 8233185
    Abstract: What is provided are a system and method for print/copy job environments utilizing a page description language (PDL). In one embodiment, an input PDL stream describing embedded objects in a job is received and parsed. Reusable document components (RDCs) are identified. A determination is made as to how many placements are in the PDL for each identified RDC. If no RDCs are placed more than once, caching is disabled. If it is not efficient to split the PDL stream into smaller tasks, page parallel rip (PPR) is disabled. The embedded objects are analyzed to determine a number of PPRs for the job based on system resources. A raster image processing (RIP) time is projected for each path in the job based on the determined number of placements and the determined number of PPRs. A job processing path is prescribed for the job based on the most efficient projected RIP time.
    Type: Grant
    Filed: March 7, 2008
    Date of Patent: July 31, 2012
    Assignee: Xerox Corporation
    Inventors: Gerald S. Gordon, John H. Gustke, Scott Mayne
  • Patent number: 8223159
    Abstract: One embodiment of the present invention sets forth a system configured for transferring data between independent application programming interface (API) contexts on one or more graphics processing units (GPUs). Each API context may derive from an arbitrary API. Data is pushed from one API context to another API context using a peer-to-peer buffer “blit” operation executed between buffers allocated in the source and target API context memory spaces. The source and target API context memory spaces may be located within the frame buffers of the source and target GPUs, respectively, or located within the frame buffer of a single GPU. The data transfers between the API contexts are synchronized using semaphore operator pairs inserted in push buffer commands that are executed by the one or more GPUs.
    Type: Grant
    Filed: June 20, 2006
    Date of Patent: July 17, 2012
    Assignee: NVIDIA Corporation
    Inventors: Franck R. Diard, Barthold B. Lichtenbelt, Mark J. Harris, Simon G. Green
  • Patent number: 8217951
    Abstract: The present invention relates to an apparatus and method for processing graphic data. According to an embodiment, the graphic data processing apparatus includes a CPU having at least one core; a GPU configured to process graphic data; a usage level checking unit configured to check a usage level of the CPU and/or a usage level of the GPU; and a control unit configured to compare the checked usage level of the CPU with a usage level reference of the CPU and/or to compare the checked usage level of the GPU with a usage level reference of the GPU, to allow the graphic data to be processed in parallel by the CPU and the GPU or only by the GPU according to the comparison results.
    Type: Grant
    Filed: March 20, 2008
    Date of Patent: July 10, 2012
    Assignee: LG Electronics Inc.
    Inventor: Chang Kwon Jung
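    Sketch: The comparison against usage-level references described above could look roughly like this. The abstract does not say in which direction the comparisons go, so this sketch assumes that work is shared with the CPU only when the CPU has headroom and the GPU is at or above its reference; the function name and the percentage scale are hypothetical.

```python
def choose_processing_mode(cpu_usage: float, gpu_usage: float,
                           cpu_reference: float, gpu_reference: float) -> str:
    """Pick "cpu+gpu" (parallel) or "gpu-only" from usage levels in percent.

    Assumed policy: offload to the CPU only if the CPU is below its
    reference level and the GPU has reached its own reference level.
    """
    cpu_has_headroom = cpu_usage < cpu_reference
    gpu_is_busy = gpu_usage >= gpu_reference
    if cpu_has_headroom and gpu_is_busy:
        return "cpu+gpu"
    return "gpu-only"
```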
  • Patent number: 8217950
    Abstract: A processing unit, method, and graphics processing system are provided for processing a plurality of frames of graphics data. For instance, the processing unit can include a first plurality of graphics processing units (GPUs), a second plurality of GPUs, and a plurality of compositors. The first plurality of GPUs can be configured to process a first frame of graphics data. Likewise, the second plurality of GPUs can be configured to process a second frame of graphics data. Further, each compositor in the plurality of compositors can be coupled to a respective GPU from the first and second pluralities of GPUs, where the plurality of compositors is configured to sequentially pass the first and second frames of graphics data to a display module.
    Type: Grant
    Filed: September 2, 2009
    Date of Patent: July 10, 2012
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Rajabali M. Koduri, David Gotwalt, Andrew Pomianowski
  • Patent number: 8212825
    Abstract: One embodiment of the present invention sets forth a technique for more effectively utilizing graphics hardware by allowing the developer to exploit parallelism at the primitive-level. In this technique, an algorithm is analyzed to break the total work associated with processing one primitive into discrete portions of work. The results of this analysis are used to program a geometry shader group that includes multiple geometry shaders. Upon receiving a single input primitive, the geometry shader group launches multiple parallel threads, one thread in each geometry shader in the group corresponding to each discrete portion of work. As each thread completes, the output of the thread is stored in on-chip GPU memory for processing by the next stage in the graphics pipeline. Since the overall work associated with a given input primitive is distributed across multiple threads, the output of each thread is smaller and, thus, the total memory required to implement the algorithm is reduced.
    Type: Grant
    Filed: November 27, 2007
    Date of Patent: July 3, 2012
    Assignee: NVIDIA Corporation
    Inventors: Cass W. Everitt, Henry Packard Moreton
  • Patent number: 8212838
    Abstract: A system and method for improved antialiasing in video processing is described herein. Embodiments include multiple video processors (VPUs) in a system. Each VPU performs some combination of pixel sampling and pixel center sampling (also referred to as multisampling and supersampling). Each VPU performs sampling on the same pixels or pixel centers, but each VPU creates samples positioned differently from the other VPUs' corresponding samples. The VPUs each output frame data that has been multisampled and/or supersampled into a compositor that composites the frame data to produce an antialiased rendered frame. The antialiased rendered frame has an effectively doubled antialiasing factor.
    Type: Grant
    Filed: May 27, 2005
    Date of Patent: July 3, 2012
    Assignee: ATI Technologies, Inc.
    Inventors: Arcot J. Preetham, Andrew S. Pomianowski, Raja Koduri
  • Publication number: 20120162235
    Abstract: A method and system are provided for performing the computational execution of automation tasks with automation devices by combining one or more central processing units (CPU) and one or more Graphics Processing Units (GPU). The control tasks and/or control algorithms are executed by the single-core or multi-core control unit (CPU) and a multi-core-graphics processor (GPU) or both in parallel at the same time.
    Type: Application
    Filed: February 24, 2012
    Publication date: June 28, 2012
    Applicant: ABB Technology AG
    Inventor: Rainer DRATH
  • Patent number: 8207972
    Abstract: A three-dimensional (3D) graphics pipeline which processes pixels of sub-screens in the last stage (pixel rendering) in parallel and independently. The sub-screen tasks are stored in a list in a shared memory. The shared memory is accessed by a plurality of processing threads designated for pixel rendering. The processing threads seize and lock sub-screen tasks in an orderly manner and process the tasks to create the bit map for display on a screen. The tasks are created by dividing a display area having the vertex information superimposed thereon into M×N sub-screen tasks. Based on system profiling, M and N may be varied.
    Type: Grant
    Filed: December 22, 2006
    Date of Patent: June 26, 2012
    Assignee: QUALCOMM Incorporated
    Inventors: Jian Wei, Chehui Wu, James M Brown
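    Sketch: The M×N task list and the seize-and-lock behavior described above can be illustrated with ordinary threads. This is a minimal sketch under assumptions: M is taken as the number of columns and N as the number of rows, "rendering" is replaced by counting the pixels in each tile, and make_subscreen_tasks and render_all are hypothetical names.

```python
import threading
from typing import List, Tuple

Tile = Tuple[int, int, int, int]  # (x0, y0, x1, y1); x1 and y1 are exclusive


def make_subscreen_tasks(width: int, height: int, m: int, n: int) -> List[Tile]:
    """Divide a width x height display area into m columns by n rows."""
    tiles: List[Tile] = []
    for row in range(n):
        for col in range(m):
            x0, x1 = col * width // m, (col + 1) * width // m
            y0, y1 = row * height // n, (row + 1) * height // n
            tiles.append((x0, y0, x1, y1))
    return tiles


def render_all(tasks: List[Tile], num_threads: int) -> List[Tuple[Tile, int]]:
    """Let worker threads take tasks from the shared list, one at a time."""
    lock = threading.Lock()
    next_index = 0
    rendered: List[Tuple[Tile, int]] = []

    def worker() -> None:
        nonlocal next_index
        while True:
            with lock:  # seize the next task in order, under the lock
                if next_index >= len(tasks):
                    return
                tile = tasks[next_index]
                next_index += 1
            # Stand-in for pixel rendering: count the pixels in the tile.
            pixels = (tile[2] - tile[0]) * (tile[3] - tile[1])
            with lock:
                rendered.append((tile, pixels))

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return rendered
```

    Varying m and n changes the task granularity without touching the worker loop, which is the knob the abstract ties to system profiling.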
  • Publication number: 20120154373
    Abstract: Embodiments are disclosed herein that relate to generating a decision tree through graphical processing unit (GPU) based machine learning. For example, one embodiment provides a method including, for each level of the decision tree: performing, at each GPU of the parallel processing pipeline, a feature test for a feature in a feature set on every example in an example set. The method further includes accumulating results of the feature tests in local memory blocks. The method further includes writing the accumulated results from each local memory block to global memory to generate a histogram of features for every node in the level, and for each node in the level, assigning a feature having a lowest entropy in accordance with the histograms to the node.
    Type: Application
    Filed: December 15, 2010
    Publication date: June 21, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Mark Finocchio, Richard E. Moore, Ryan M. Geiss, Jamie Shotton
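    Sketch: The per-node selection of the lowest-entropy feature from accumulated histograms can be shown on the CPU. This is a sequential sketch, not GPU code; the histogram layout (feature, then test outcome, then per-class counts) and the function names are assumptions, and the weighted entropy of the split is assumed as the score.

```python
import math
from typing import Dict, List


def entropy(counts: List[int]) -> float:
    """Shannon entropy in bits of a list of class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def split_entropy(histogram: Dict[bool, List[int]]) -> float:
    """Entropy after a feature test, weighted by the size of each branch."""
    total = sum(sum(counts) for counts in histogram.values())
    if total == 0:
        return 0.0
    return sum((sum(counts) / total) * entropy(counts)
               for counts in histogram.values() if sum(counts) > 0)


def best_feature(histograms: Dict[str, Dict[bool, List[int]]]) -> str:
    """Pick the feature whose test leaves the lowest weighted entropy."""
    return min(histograms, key=lambda name: split_entropy(histograms[name]))


# Hypothetical histograms for one node: feature -> outcome -> class counts.
node_histograms = {
    "f0": {True: [5, 5], False: [5, 5]},    # test tells us nothing
    "f1": {True: [10, 0], False: [0, 10]},  # test separates the classes
}
chosen = best_feature(node_histograms)
```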
  • Patent number: 8203557
    Abstract: Embodiments of the invention provide assigning two different class identifiers to a device to allow loading to an operating system as different devices. The device may be a graphics device. The graphics device may be integrated in various configurations, including but not limited to a central processing unit, chipset and so forth. The processor or chipset may be associated with a first identifier associated with a graphics processor and a second device identifier that enables the processor or chipset as a co-processor.
    Type: Grant
    Filed: February 9, 2011
    Date of Patent: June 19, 2012
    Assignee: Intel Corporation
    Inventors: Katen Shah, Hong Jiang
  • Patent number: 8203568
    Abstract: A centralised server in a bank (50) of servers runs a program for use by a user at a remote terminal (52, 56, 58). In the server, a plurality of programs share a GPU and instructions are used to cause the GPU to store the frames representing graphics of different programs at different memory locations. The frames are compressed and transmitted to remote terminals. Optionally the invention also allows for GPU time slice allocation, such that the GPU completes rendering the frame of one program before it renders the frame of another program. Optionally the invention also allows delivering false information about the capabilities of the GPU to the programs.
    Type: Grant
    Filed: November 16, 2011
    Date of Patent: June 19, 2012
    Inventors: Graham Clemie, Dedrick Duckett
  • Publication number: 20120147016
    Abstract: Disclosed are an image processing device and an image processing method which achieve an increase in the speed of image processing by designating and operating a plurality of image processing units each corresponding to a specific function for the image processing in accordance with a program. A frame memory (21 . . . ) stores image data to be processed. Parallel memories (121 . . . ) each receive all or part of the image data stored in the frame memory (21 . . . ) and transmit the received image data to any of DMACs (111 . . . ) or processing units (13A . . . ) for the image processing. The processing units (13A . . . ) each have a function corresponding to a function for the image processing. The processing units (13A . . . ) each receive all or part of the image data from the parallel memory (121 . . . ) or the frame memory (21 . . . ) in accordance with a command from a CPU (3) and perform processing based on the function for the image processing on all or part of the image data.
    Type: Application
    Filed: August 13, 2010
    Publication date: June 14, 2012
    Applicant: THE UNIVERSITY OF TOKYO
    Inventors: Masatoshi Ishikawa, Takashi Komuro, Tomohira Tabata
  • Patent number: 8199151
    Abstract: A method of detecting an occurrence of an event of an event type during an animation, in which the animation comprises, for each of a plurality of object parts of an object, data defining the respective movement of that object part at each of a sequence of time-points for the animation, the method comprising: indicating the event type, wherein the event type specifies: one or more of the object parts; and a sequence of two or more event phases that occur during an event of that event type such that, for each event phase, the respective movements of the one or more specified object parts during that event phase are each constrained according to a constraint type associated with that event phase; and detecting an occurrence of an event of the event type by detecting a section of the animation during which the respective movements defined by the animation for the specified one or more object parts are constrained in accordance with the sequence of two or more event phases.
    Type: Grant
    Filed: February 13, 2009
    Date of Patent: June 12, 2012
    Assignee: Naturalmotion Ltd.
    Inventor: Nicholas MacDonald Spencer
  • Patent number: 8194083
    Abstract: A plurality of vertex or fragment processors on a graphics processor perform computations. Each vertex or fragment processor is capable of executing a separate program to compute a specific result. A combiner manages the combination of the results from the respective processors, and produces a final transformed vertex or pixel value. The vertex or fragment processors and the combiner can be programmable to modify their operations. As such, the vertex or fragment processors can operate in a parallel or serial configuration, or both. The combiner manages and resolves the operations of the serial and/or parallel configurations. A synchronization barrier enables the combiner to perform data-dependency analysis to determine the timing and ordering of the respective processors' execution. A transformation module can include one or more programmable vertex processors that transforms three-dimensional geometric data into fragments.
    Type: Grant
    Filed: November 8, 2010
    Date of Patent: June 5, 2012
    Assignee: Graphics Properties Holdings, Inc.
    Inventor: David Shreiner
  • Patent number: 8189001
    Abstract: A novel method and system for distributed database ray-tracing are presented, based on modular mapping of scene-data among processors. Its inherent properties include scattering data among processors for improved load balancing, and matching between geographical proximity in the scene with communication proximity between processors. High utilization is enabled by a unique mechanism of cache sharing. The resulting improved performance enables a deep level of ray tracing for real time applications.
    Type: Grant
    Filed: December 24, 2010
    Date of Patent: May 29, 2012
    Assignee: Adshir Ltd.
    Inventor: Reuven Bakalash
  • Patent number: 8189007
    Abstract: A graphics engine and related method of operation are disclosed in which a pixel distributor distributes pixel data across a plurality of pixel shaders using a first approach when the presence of one or more rendering features is indicated, else using a second approach different from the first approach.
    Type: Grant
    Filed: December 27, 2007
    Date of Patent: May 29, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Chun-Ho Kim
  • Patent number: 8189678
    Abstract: A video and graphics system processes video data including both analog video, e.g., NTSC/PAL/SECAM/S-video, and digital video, e.g., MPEG-2 video in SDTV or HDTV format. The video and graphics system includes a video decoder, which is capable of concurrently decoding multiple SLICEs of MPEG-2 video data. The video decoder includes multiple row decoding engines for decoding the MPEG-2 video data. Each row decoding engine concurrently decodes two or more rows of the MPEG-2 video data. The row decoding engines have a pipelined architecture for concurrently decoding multiple rows of MPEG-2 video data. The video decoder may be integrated on an integrated circuit chip with other video and graphics system components such as transport processors for receiving one or more compressed data streams and for extracting video data, and a video compositor for blending processed video data with graphics.
    Type: Grant
    Filed: December 6, 2010
    Date of Patent: May 29, 2012
    Assignee: Broadcom Corporation
    Inventors: Ramanujan Valmiki, Sandeep Bhatia
  • Publication number: 20120127182
    Abstract: One or more techniques and/or systems are disclosed for processing vector-based information for an image. From a set of pixels that comprises the image, a first subset of one or more pixels that are used in a raster representation of an element in the image, such as pixel values used to render the image, is identified. A first operation is performed in parallel for the respective one or more pixels in the first subset, such as by evaluating a batched first subset of pixels using stacked instruction for the first operation. The first operation comprises instructions for at least a first portion of a function for generating an image pixel value used to represent the element in the image.
    Type: Application
    Filed: November 23, 2010
    Publication date: May 24, 2012
    Applicant: Microsoft Corporation
    Inventors: Raman Narayanan, Radoslav Petrov Nickolov, Ming Liu, Rajendra Vishnumurthy
  • Patent number: 8180182
    Abstract: A processing device performs a geometry process as preprocessing for rendering a three-dimensional object on a display by modeling the three-dimensional object using a polygon mesh. The geometry process includes a vertex process that is performed for each of the vertices of the polygon mesh by a different one of a plurality of processors, and processed vertex data obtained by the vertex process is notified among the processors so that a polygon process can be performed in each of the processors. Because each processor can continuously perform the polygon process immediately after the vertex process, it is possible to suppress the occurrence of the unbalance of timing in performing the vertex process and the polygon process, thereby efficiently performing computation while minimizing the wasteful idle time of the processors.
    Type: Grant
    Filed: May 11, 2007
    Date of Patent: May 15, 2012
    Assignee: Panasonic Corporation
    Inventor: Yorihiko Wakayama
  • Patent number: 8179394
    Abstract: One embodiment of the present invention sets forth a technique to perform fine-grained rendering predication using an IGPU and a DGPU. A graphics driver divides a 3D object into batches of triangles. The IGPU processes each batch of triangles through a modified rendering pipeline to determine if the batch is culled. The IGPU writes bits into a bitstream corresponding to the visibility of the batches. The DGPU reads bits from the bitstream and performs full-blown rendering, including shading, but only on the batches of triangles whose bit indicates that the batch is visible. Advantageously, this approach to rendering predication provides fine-grained culling without adding unnecessary overhead, thereby optimizing both hardware resources and performance.
    Type: Grant
    Filed: December 13, 2007
    Date of Patent: May 15, 2012
    Assignee: NVIDIA Corporation
    Inventors: Cass W. Everitt, Franck R. Diard
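    Sketch: The batch-level visibility bitstream described above can be modeled in a few lines. This is a minimal sketch under assumptions: both GPUs are simulated by plain functions, the cull test (every vertex behind the camera, z < 0) is a hypothetical stand-in for the modified rendering pipeline, and all names are illustrative.

```python
from typing import Callable, List, Sequence, Tuple

Vertex = Tuple[float, float, float]
Triangle = Tuple[Vertex, Vertex, Vertex]
Batch = List[Triangle]


def split_into_batches(triangles: Sequence[Triangle], batch_size: int) -> List[Batch]:
    """Driver step: divide an object's triangles into fixed-size batches."""
    return [list(triangles[i:i + batch_size])
            for i in range(0, len(triangles), batch_size)]


def behind_camera(batch: Batch) -> bool:
    """Hypothetical cull test: every vertex in the batch has z < 0."""
    return all(vertex[2] < 0 for triangle in batch for vertex in triangle)


def write_visibility_bits(batches: List[Batch],
                          is_culled: Callable[[Batch], bool]) -> List[int]:
    """First pass: emit one bit per batch, 1 = visible, 0 = culled."""
    return [0 if is_culled(batch) else 1 for batch in batches]


def render_visible(batches: List[Batch], bits: List[int]) -> List[Batch]:
    """Second pass: fully process only the batches whose bit is set."""
    return [batch for batch, bit in zip(batches, bits) if bit == 1]
```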
  • Patent number: 8174531
    Abstract: A processing unit includes multiple execution pipelines, each of which is coupled to a first input section for receiving input data for pixel processing and a second input section for receiving input data for vertex processing and to a first output section for storing processed pixel data and a second output section for storing processed vertex data. The processed vertex data is rasterized and scan converted into pixel data that is used as the input data for pixel processing. The processed pixel data is output to a raster analyzer.
    Type: Grant
    Filed: December 29, 2009
    Date of Patent: May 8, 2012
    Assignee: NVIDIA Corporation
    Inventors: John Erik Lindholm, Brett W. Coon, Stuart F. Oberman, Ming Y. Siu, Matthew P. Gerlach
  • Patent number: 8174530
    Abstract: A data processing apparatus includes a plurality of processing elements arranged in a single instruction multiple data array for processing data relating to graphical primitives. Vertex data relating to graphical primitives is used as feedback data for the processing elements for additional processing.
    Type: Grant
    Filed: June 6, 2007
    Date of Patent: May 8, 2012
    Assignee: Rambus Inc.
    Inventors: Dave Stuttard, Dave Williams, Eamon O'Dea, Gordon Faulds, John Rhoades, Ken Cameron, Phil Atkin, Paul Winser, Russell David, Ray McConnell, Tim Day, Trey Greer
  • Patent number: 8169441
    Abstract: A method and system for minimizing an amount of data needed to test data against subarea boundaries in spatially composited digital video is provided. Graphics data for a frame is composed of geometry chunks. Each geometry chunk is defined by its own bounding region, where the bounding region defines the space the geometry chunk occupies on the compositing window. Only the parameters that define the bounding region are communicated to each graphics unit in conjunction with the determination of which graphics unit will render the geometry chunk defined by the bounding region. The actual graphics data that comprises the geometry chunk is communicated only to those geometry units that will actually render the geometry chunk. This reduces the amount of data needed to communicate graphics data information in spatially composited digital video.
    Type: Grant
    Filed: April 11, 2011
    Date of Patent: May 1, 2012
    Assignee: Graphics Properties Holdings, Inc.
    Inventors: David R. Blythe, Marc Schafer, Paul Jeffrey Ungar, David Yu
  • Patent number: 8169437
    Abstract: A system and method for dividing three-dimensional patches into tasks for processing receives control points defining a three dimensional patch and determines if a number of vertices of the three dimensional patch is greater than a maximum value. When the number of vertices is not greater than the maximum value, the three dimensional patch is output as a single task. When the number of vertices is greater than the maximum value, the three dimensional patch is divided into multiple tasks that each include a number of vertices that is not greater than the maximum value and the multiple tasks are output.
    Type: Grant
    Filed: July 9, 2008
    Date of Patent: May 1, 2012
    Assignee: NVIDIA Corporation
    Inventors: Justin S. Legakis, Subodh Kumar
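    Sketch: The size check and division described above reduce to a short function. The abstract does not say how a patch is divided, so contiguous chunking of a flat vertex list is assumed here purely for illustration; split_patch_into_tasks is a hypothetical name.

```python
from typing import List, Sequence


def split_patch_into_tasks(vertices: Sequence[int], max_vertices: int) -> List[List[int]]:
    """Return one task if the patch fits, otherwise several bounded tasks.

    Assumption: a patch is a flat sequence of vertex indices and is split
    into contiguous chunks of at most max_vertices each.
    """
    if max_vertices <= 0:
        raise ValueError("max_vertices must be positive")
    if len(vertices) <= max_vertices:
        return [list(vertices)]  # small patch: output as a single task
    return [list(vertices[i:i + max_vertices])
            for i in range(0, len(vertices), max_vertices)]
```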
  • Patent number: 8169440
    Abstract: A method of processing data relating to geometrical primitives is disclosed. Each of the primitives has a plurality of vertices. The method uses a plurality of processing elements in parallel with one another, and comprises assigning respective vertex data to the processing elements, on each processing element, and in parallel with one another, performing at least one processing step on vertex data to produce processed vertex data, and transferring processed vertex data between processing elements so as to assemble primitive data.
    Type: Grant
    Filed: May 29, 2007
    Date of Patent: May 1, 2012
    Assignee: Rambus Inc.
    Inventors: Dave Stuttard, Dave Williams, Eamon O'Dea, Gordon Faulds, John Rhoades, Ken Cameron, Phil Atkin, Paul Winser, Russell David, Ray McConnell, Tim Day, Trey Greer
  • Patent number: 8171198
    Abstract: An image forming apparatus and a control method thereof. The image forming apparatus includes a plurality of image processors which process an image to be formed on a printing medium corresponding to a plurality of colors, a processor which executes an interrupt routine with respect to the plurality of image processors, and a controller which generates an interrupt signal and transmits the interrupt signal to the processor if at least two of the plurality of image processors generate interrupt requests so that the processor executes the interrupt routine.
    Type: Grant
    Filed: May 26, 2011
    Date of Patent: May 1, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Jong-seung Lee, Yoon-tac Lee
  • Patent number: 8169439
    Abstract: Embodiments of the invention are generally related to image processing, and more specifically to vector units for supporting image processing. A combined vector/scalar unit is provided wherein one or more processing lanes of the vector unit are used for performing scalar operations. An integrated register file is also provided for storing vector and scalar data. Therefore, the transfer of data to memory to exchange data between independent vector and scalar units is obviated and a significant amount of chip area is saved.
    Type: Grant
    Filed: October 23, 2007
    Date of Patent: May 1, 2012
    Assignee: International Business Machines Corporation
    Inventors: David Arnold Luick, Eric Oliver Mejdrich, Adam James Muff
  • Publication number: 20120092351
    Abstract: The disclosed embodiments provide a system that configures a computer system to switch between two graphics-processing units (GPUs). During operation, the system receives a request to switch from using a first GPU to using a second GPU to drive the display. In response to this request, the system executes a user thread that copies pixel values from a first framebuffer for the first GPU to a second framebuffer for the second GPU. Next, the user thread initiates a switch from the first framebuffer to the second framebuffer as a signal source for driving the display. Finally, the user thread sends an asynchronous notification of the switch to one or more applications, wherein the asynchronous notification allows the applications to transition from rendering graphics using the first GPU to rendering graphics using the second GPU.
    Type: Application
    Filed: December 2, 2010
    Publication date: April 19, 2012
    Applicant: APPLE INC.
    Inventor: Andrew R. Barnes
  • Patent number: 8159496
    Abstract: Methods and apparatus for subdividing a shader program into regions or “phases” of instructions identifiable by phase identifiers (IDs) inserted into the shader program are provided. The phase IDs may be used to constrain execution of the shader program to prohibit texture fetches in later phases from being executed before a texture fetch in a current phase has completed. Other operations (e.g., math operations) within the current phase, however, may be allowed to execute while waiting for the current phase texture fetch to complete.
    Type: Grant
    Filed: June 1, 2009
    Date of Patent: April 17, 2012
    Assignee: NVIDIA Corporation
    Inventors: John Erik Lindholm, Brett W. Coon, Gary M Tarolli
  • Patent number: 8161209
    Abstract: A peer-to-peer special purpose processor architecture and method is described. Embodiments include a plurality of special purpose processors coupled to a central processing unit via a host bridge bus, a direct bus directly coupling each of the plurality of special purpose processors to at least one other of the plurality of special purpose processors and a memory controller coupled to the plurality of special purpose processors, wherein the at least one memory controller determines whether to transmit data via the host bus or the direct bus, and whether to receive data via the host bus or the direct bus.
    Type: Grant
    Filed: July 31, 2008
    Date of Patent: April 17, 2012
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Stephen Morein, Mark S. Grossman, Warren Fritz Kruger, Brian Etscheid
  • Patent number: 8154553
    Abstract: Exemplary embodiments include an interception mechanism for rendering commands generated by interactive applications, and a feed-forward control mechanism based on the processing of the commands on a rendering engine, on a pre-filtering module, and on a visual encoder. Also a feed-back control mechanism from the encoder is described. The mechanism is compression-quality optimized subject to some constraints on streaming bandwidth and system delay. The mechanisms allow controllable levels of detail for different rendered objects, controllable post filtering of rendered images, and controllable compression quality of each object in compressed images. A mechanism for processing and streaming of multiple interactive applications in a centralized streaming application server is also described.
    Type: Grant
    Filed: May 22, 2008
    Date of Patent: April 10, 2012
    Assignee: Playcast Media System, Ltd.
    Inventor: Natan Peterfreund
  • Patent number: 8156364
    Abstract: A method (which can be computer implemented) for processing a plurality of adjacent rows of data units, using a plurality of parallel processors, given (i) a predetermined processing order, and (ii) a specified inter-row dependency structure, includes the steps of determining starting times for each individual one of the processors, and maintaining synchronization across the processors, while ensuring that the dependency structure is not violated. Not all the starting times are the same, and a sum of absolute differences between (i) starting times of any given processor, and (ii) that one of the processors having an earliest starting time, is minimized.
    Type: Grant
    Filed: June 12, 2007
    Date of Patent: April 10, 2012
    Assignee: International Business Machines Corporation
    Inventors: Krishna Ratakonda, Deepak S. Turaga
  • Patent number: 8149247
    Abstract: One embodiment of the present invention sets forth a method, which includes the steps of generating a first rendered image associated with a first application, independently generating a second rendered image associated with a second application, applying a first set of blending weights to the first rendered image to establish a first weighted image, applying a second set of blending weights to the second rendered image to establish a second weighted image, and blending the first weighted image and the second weighted image before scanning out a blended result to a first display device.
    Type: Grant
    Filed: November 6, 2007
    Date of Patent: April 3, 2012
    Assignee: NVIDIA Corporation
    Inventor: Franck R. Diard
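    Sketch: The weighting and blending steps described above can be written out for small grayscale images. This is a minimal sketch under assumptions: images are nested lists of numbers, weights are per pixel, blending is taken to be the sum of the two weighted images, and apply_weights and blend are hypothetical names.

```python
from typing import List

Image = List[List[float]]


def apply_weights(image: Image, weights: Image) -> Image:
    """Multiply each pixel by its blending weight to get a weighted image."""
    return [[pixel * weight for pixel, weight in zip(image_row, weight_row)]
            for image_row, weight_row in zip(image, weights)]


def blend(first: Image, second: Image) -> Image:
    """Combine two weighted images by adding them pixel by pixel."""
    return [[a + b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(first, second)]


# Two independently rendered 2x2 images and complementary weight sets.
image_a = [[100.0, 100.0], [100.0, 100.0]]
image_b = [[200.0, 200.0], [200.0, 200.0]]
weights_a = [[1.0, 0.5], [0.5, 0.0]]
weights_b = [[0.0, 0.5], [0.5, 1.0]]

blended = blend(apply_weights(image_a, weights_a),
                apply_weights(image_b, weights_b))
```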
  • Patent number: 8139069
    Abstract: A method and system for improving data coherency in a parallel rendering system is disclosed. Specifically, one embodiment of the present invention sets forth a method for managing a plurality of independently processed texture streams in a parallel rendering system that includes the steps of maintaining a time stamp for a group of tiles of work that are associated with each of the plurality of the texture streams and are associated with a specified area in screen space, and utilizing the time stamps to counter divergences in the independent processing of the plurality of texture streams.
    Type: Grant
    Filed: November 3, 2006
    Date of Patent: March 20, 2012
    Assignee: NVIDIA Corporation
    Inventors: Steven E. Molnar, Cass W. Everitt, Roger L. Allen, Gary M. Tarolli, John M. Danskin
  • Patent number: 8134563
    Abstract: A parallel graphics rendering system is embodied within a host computing system and includes a plurality of graphic processing pipelines (GPPLs) and graphics processing modules. The parallel graphics rendering system supports one or more modes of parallel operation selected from the group consisting of object division, image division, and time division. The GPPLs support a parallel graphics rendering process that employs one or more of the object division, image division and/or time division modes of parallel operation in order to execute graphic commands and process graphics data, and render pixel-composited images containing graphics for display on a display device during the run-time of the graphics-based application. An automatic mode control module automatically controls the mode of parallel operation of the parallel graphics rendering system during the run-time of the graphics-based application.
    Type: Grant
    Filed: October 30, 2007
    Date of Patent: March 13, 2012
    Assignee: Lucid Information Technology, Ltd
    Inventors: Reuven Bakalash, Yaniv Leviathan
  • Publication number: 20120056892
    Abstract: An approach to automatically specifying, or assisting with the specification of, a parallel computation graph involves determining data processing characteristics of the linking elements that couple data processing elements of the graph. The characteristics of the linking elements are determined according to the characteristics of the upstream and/or downstream data processing elements associated with the linking element, for example, to enable computation by the parallel computation graph that is equivalent to computation of an associated serial graph.
    Type: Application
    Filed: November 14, 2011
    Publication date: March 8, 2012
    Inventor: Craig W. Stanfill
  • Patent number: 8130228
    Abstract: A system, method and article of manufacture are disclosed for processing Low Density Parity Check (LDPC) codes. The system comprises a multitude of processing units for processing the codes; and a processor chip including an on-chip, multi-port data cache for temporarily storing the LDPC codes. This data cache includes a plurality of input ports for receiving the LDPC codes from some of the processing units, and a plurality of output ports for sending the LDPC codes to others of the processing units. An off-chip, external memory stores the LDPC codes and transmits the LDPC codes to and receives the LDPC codes from at least some of the processing units. A sequence processor controls the transmission of the LDPC codes between the processor units and the on-chip data cache so that the LDPC codes are processed by the processing units according to a given sequence.
    Type: Grant
    Filed: June 13, 2008
    Date of Patent: March 6, 2012
    Assignee: International Business Machines Corporation
    Inventor: Thomas A. Horvath
  • Patent number: 8125487
    Abstract: A game console system capable of parallelizing the operation of multiple graphics processing units (GPUs) supported on game console board, using a graphics hub device, and a multi-mode parallel graphics rendering subsystem supporting multiple modes of parallel operation and having software and hardware implemented components. The game console system includes (i) CPU memory space for storing one or more graphics-based applications, (ii) one or more CPUs for executing the graphics-based applications, (iii) a plurality of graphic processing pipelines (GPPLs), implemented using the GPUs, and (iv) an automatic mode control module. During the run-time of the graphics-based application, the automatic mode control module automatically controls the mode of parallel operation of the multi-mode parallel graphics rendering subsystem so that the GPUs are driven in a parallelized manner.
    Type: Grant
    Filed: September 26, 2007
    Date of Patent: February 28, 2012
    Assignee: Lucid Information Technology, Ltd
    Inventors: Reuven Bakalash, Yaniv Leviathan
  • Patent number: 8108147
    Abstract: A method of identifying and imaging a high risk collision object relative to a host vehicle includes arranging a plurality of N sensors for imaging a three-hundred and sixty degree horizontal field of view (hFOV) around the host vehicle. The sensors are mounted to a vehicle in a circular arrangement so that the sensors are radially equiangular from each other. For each sensor, contrast differences in the hFOV are used to identify a unique source of motion (hot spot) that is indicative of a remote object in the sensor hFOV. A first hot spot in one sensor hFOV is correlated to a second hot spot in another hFOV of at least one other N sensor to yield range, azimuth and trajectory data for said object. The processor then assesses a collision risk with the object according to the object's trajectory data relative to the host vehicle.
    Type: Grant
    Filed: February 6, 2009
    Date of Patent: January 31, 2012
    Assignee: The United States of America as represented by the Secretary of the Navy
    Inventor: Michael Blackburn
  • Patent number: 8106912
    Abstract: To reduce the required amount of program codes when processing the whole image in a one-dimensional SIMD parallel image processing system having a smaller number of PEs than the number of pixels in the width direction of the image to be processed. A controller for controlling a PE array includes a command repetitive-execution part, which includes an operand converting part, a memory address converting part, and an operation code converting part. When a command fetching/decoding part reads and executes program codes stored in a program memory, the repetitive-execution part determines the program codes to cause the operand converting part, memory address converting part and operation code converting part to perform conversions in accordance with the command, thereby performing a repetitive execution of the one-command program description adaptive to a plurality of related pixels assigned to the PEs, whereby the program code amount can be reduced.
    Type: Grant
    Filed: December 5, 2006
    Date of Patent: January 31, 2012
    Assignee: NEC Corporation
    Inventor: Takuya Koga
  • Patent number: 8106913
    Abstract: Circuits, methods, and apparatus for graphically displaying performance metrics of processors such as graphics processing units in multiple processor systems. Embodiments of the present invention may provide metric information regarding operations in alternate-frame rendering, split-frame rendering, or other modes of operation. One embodiment of the present invention provides data in split-frame rendering mode including load balancing, graphics processing unit utilization, frame rate, and other types of system information in a graphical manner. Another exemplary embodiment of the present invention provides graphical information regarding graphics processing unit utilization, frame rate, and other system information while operating in the alternate-frame rendering mode.
    Type: Grant
    Filed: November 25, 2008
    Date of Patent: January 31, 2012
    Assignee: NVIDIA Corporation
    Inventor: Franck R. Diard
  • Publication number: 20120019541
    Abstract: Disclosed herein is a vertex core. The vertex core includes a grouper module configured to process two or more primitives during one clock period and two or more vertex translators configured to respectively receive the two or more processed primitives in parallel.
    Type: Application
    Filed: July 20, 2010
    Publication date: January 26, 2012
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Vineet Goel, Ralph C. Taylor, Todd E. Martin
  • Patent number: 8102393
    Abstract: One embodiment of the present invention sets forth a technique to perform fine-grained rendering predication using an IGPU and a DGPU. A graphics driver divides a 3D object into batches of triangles. The IGPU processes each batch of triangles through a modified rendering pipeline to determine if the batch is culled. The IGPU writes bits into a bitstream corresponding to the visibility of the batches. The DGPU reads bits from the bitstream and performs full-blown rendering, including shading, but only on the batches of triangles whose bit indicates that the batch is visible. Advantageously, this approach to rendering predication provides fine-grained culling without adding unnecessary overhead, thereby optimizing both hardware resources and performance.
    Type: Grant
    Filed: December 13, 2007
    Date of Patent: January 24, 2012
    Assignee: NVIDIA Corporation
    Inventors: Cass W. Everitt, Franck R. Diard
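    Illustration (not taken from the patent): the two-pass predication flow in this abstract can be sketched in serial code, with one function standing in for the IGPU culling pass and another for the DGPU shading pass. The visibility test, the shade callback, and all function names are hypothetical placeholders.

```python
def make_batches(triangles, batch_size):
    """Split the object's triangle list into fixed-size batches."""
    return [triangles[i:i + batch_size] for i in range(0, len(triangles), batch_size)]


def visibility_bitstream(batches, is_visible):
    """First pass: write one bit per batch, 1 if any triangle in it survives culling."""
    return [1 if any(is_visible(t) for t in batch) else 0 for batch in batches]


def render_visible(batches, bits, shade):
    """Second pass: shade only the batches whose bit marks them as visible."""
    shaded = []
    for batch, bit in zip(batches, bits):
        if bit:
            shaded.extend(shade(t) for t in batch)
    return shaded


# Stand-in data: each "triangle" is just a depth value; positive means visible.
batches = make_batches([-1, -2, 3, 4, -5, -6], batch_size=2)
bits = visibility_bitstream(batches, lambda z: z > 0)
print(bits, render_visible(batches, bits, lambda z: z * 10))
```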
  • Patent number: 8098256
    Abstract: Systems and techniques for processing sequences of video images involve receiving, on a computer, data corresponding to a sequence of video images detected by an image sensor. The received data is processed using a graphics processor to adjust one or more visual characteristics of the video images corresponding to the received data. The received data can include video data defining pixel values and ancillary data relating to settings on the image sensor. The video data can be processed in accordance with ancillary data to adjust the visual characteristics, which can include filtering the images, blending images, and/or other processing operations.
    Type: Grant
    Filed: September 29, 2005
    Date of Patent: January 17, 2012
    Assignee: Apple Inc.
    Inventors: Jay Zipnick, Brett Bilbrey, Alexei V. Ouzilevski, Fernando Urbina, Harry Guo
  • Patent number: 8098252
    Abstract: The video data is parallel processed allowing for extremely fast video processing or a greatly reduced clock requirement for the video processing circuit. In operation, each video channel reads from main memory. This allows each video channel to track the laser directly. The parallel video processor receives non-columnar pixel data, such as rows. The video processor may support printers of any width without significantly increasing the size of the system.
    Type: Grant
    Filed: July 28, 2010
    Date of Patent: January 17, 2012
    Assignee: Marvell International Technology Ltd.
    Inventor: Douglas G. Keithley
  • Patent number: 8094157
    Abstract: One embodiment of the present invention sets forth a technique for efficiently performing a radix sort operation on a graphics processing unit (GPU). The radix sort operation is conducted on an input list of data using one or more passes of a series of three processing phases. In each processing phase, thread groups are each associated with one segment of input data. In the first phase, occurrences of each radix symbol are counted and stored in a list of counters. In the second phase, the list of counters is processed by a parallel prefix sum operation to generate a list of offsets. In the third phase, the list of offsets is used to perform re-ordering on the list of data, according to the current radix symbol. To maintain sort stability, the one or more passes proceed from least significant data to most significant data in the sort key.
    Type: Grant
    Filed: August 9, 2007
    Date of Patent: January 10, 2012
    Assignee: NVIDIA Corporation
    Inventor: Scott M. Le Grand
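    Illustration (not taken from the patent): the three phases named in this abstract (count, prefix sum, re-order) can be sketched as a serial simulation in which each input segment plays the role of one thread group. The 4-bit radix, 32-bit non-negative keys, and segment count are assumptions chosen for the example.

```python
def radix_sort(keys, bits_per_pass=4, key_bits=32, num_segments=4):
    """Least-significant-digit radix sort of non-negative integers, one pass per digit."""
    radix = 1 << bits_per_pass
    mask = radix - 1
    data = list(keys)
    n = len(data)
    seg_len = max(1, -(-n // num_segments))  # ceiling division
    bounds = [(lo, min(lo + seg_len, n)) for lo in range(0, n, seg_len)]

    for shift in range(0, key_bits, bits_per_pass):
        # Phase 1: each segment counts the occurrences of every radix symbol.
        counts = [[0] * radix for _ in bounds]
        for seg, (lo, hi) in enumerate(bounds):
            for k in data[lo:hi]:
                counts[seg][(k >> shift) & mask] += 1

        # Phase 2: exclusive prefix sum over the counters in (symbol, segment)
        # order, giving each segment its output offset for each symbol.
        offsets = [[0] * radix for _ in bounds]
        running = 0
        for sym in range(radix):
            for seg in range(len(bounds)):
                offsets[seg][sym] = running
                running += counts[seg][sym]

        # Phase 3: scatter keys to their offsets; input order within a symbol
        # is preserved, which keeps every pass stable.
        out = [0] * n
        for seg, (lo, hi) in enumerate(bounds):
            for k in data[lo:hi]:
                sym = (k >> shift) & mask
                out[offsets[seg][sym]] = k
                offsets[seg][sym] += 1
        data = out
    return data


print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
```

    Because each pass is stable, processing digits from least to most significant leaves the keys fully sorted after the final pass.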
  • Patent number: 8089481
    Abstract: An image processing system may perform various tasks in an effort to evenly distribute workload amongst workload managers. According to one embodiment of the invention, the image processing system may divide a frame of pixels into different regions and assign responsibility for the regions to different workload managers in order to evenly distribute workload. The workload managers may be responsible for performing operations relating to determining or maintaining the color of the pixels within the region or regions for which they are responsible. According to another embodiment of the invention, the image processing system may re-divide the frame into new regions based on relative workloads experienced by the processing elements to evenly distribute workload. Furthermore, according to another embodiment of the invention, the image processing system may re-partition a spatial index based on relative workloads experienced by the processing elements to evenly distribute workload amongst workload managers.
    Type: Grant
    Filed: September 28, 2006
    Date of Patent: January 3, 2012
    Assignee: International Business Machines Corporation
    Inventor: Robert A. Shearer
  • Publication number: 20110316863
    Abstract: A memory section provides an input buffer capable of holding image data being a processing target of each processing by an image processing unit, and an output buffer capable of holding image data being a processing result. Through an input section, a user selects a plurality of kinds of processing to be executed by the image processing unit, and an execution sequence of the plurality of kinds of processing. A controller section reserves, based on information selected by a user through the input section, an input buffer and an output buffer for each processing in the memory section, sets an input-output connection relation between the buffers, and notifies, based on the set connection relation, the image processing unit of address information of the input buffer in the memory section and the output buffer for each processing sequentially executed by the image processing unit.
    Type: Application
    Filed: June 22, 2011
    Publication date: December 29, 2011
    Inventors: Hisashi Nishimaki, Masanori Kanemaru