Patents by Inventor Bogdan Mitu

Bogdan Mitu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9563433
    Abstract: The present invention is a data parallel system which is able to utilize a very high percentage of processing elements. In an embodiment, the data parallel system includes an array of processing elements and multiple instruction sequencers. Each instruction sequencer is coupled to the array of processing elements by a bus and is able to send an instruction to the array of processing elements. The processing elements are separated into classes and only execute instructions that are directed to their class, although all of the processing elements receive each instruction. In another embodiment, the data parallel system includes an array of processing elements and an instruction sequencer where the instruction sequencer is able to send multiple instructions. Again, the processing elements are separated into classes and execute instructions based on their class.
    Type: Grant
    Filed: December 18, 2012
    Date of Patent: February 7, 2017
    Inventors: Bogdan Mitu, Lazar Bivolarski, Gheorghe Stefan
  • Publication number: 20140237017
    Abstract: The system provides energy efficiency of computing nodes in a cluster such that application level compatibility is maintained with legacy programs. This enables clusters to grow in computer capability while optimizing and managing expenses in energy usage, cooling infrastructure and real estate costs. The present technology may leverage existing purpose-built parallel processing hardware, such as GPU hardware cards, with software to provide the functionality discussed herein. The present technology may create and add to an existing Hadoop cluster, or other distributed data processing framework, an augmented data node with enhanced compute per watt capability using off-the-shelf parallel processing hardware (e.g., GPU cards) while preserving the application level compatibility with the framework infrastructure.
    Type: Application
    Filed: February 13, 2014
    Publication date: August 21, 2014
    Applicant: mParallelo Inc.
    Inventors: Sanjay Adkar, Bogdan Mitu, Manish Singh
  • Publication number: 20100066748
    Abstract: An efficient method and device for the parallel processing of multimedia data. Blocks (or portions thereof) are transmitted to various parallel processors, in the order of their dependency data. Earlier blocks are sent to the parallel processors first, with later blocks sent later. The blocks are stored in the parallel processors in specific locations, and shifted around as necessary, so that every block, when it is processed, has its dependency data located in a specific set of earlier blocks with specified relative positions. In this manner, its dependency data can be retrieved with the same commands. That is, earlier blocks are shifted around so that later blocks can be processed with a single set of commands that instructs each processor to retrieve its dependency data from specific known relative locations that do not vary.
    Type: Application
    Filed: July 10, 2009
    Publication date: March 18, 2010
    Inventors: Lazar Bivolarski, Bogdan Mitu
  • Publication number: 20080307196
    Abstract: A computer processor having an integrated instruction sequencer, array of processing engines, and I/O controller. The instruction sequencer sequences instructions from a host, and transfers these instructions to the processing engines, thus directing their operation. The I/O controller controls the transfer of I/O data to and from the processing engines in parallel with the processing controlled by the instruction sequencer. The processing engines themselves are constructed with an integer arithmetic and logic unit (ALU), a 1-bit ALU, a decision unit, and registers. Instructions from the instruction sequencer direct the integer ALU to perform integer operations according to logic states stored in the 1-bit ALU and data stored in the decision unit. The 1-bit ALU and the decision unit can modify their stored information in the same clock cycle as the integer ALU carries out its operation. The processing engines also contain a local memory for storing instructions and data.
    Type: Application
    Filed: May 28, 2008
    Publication date: December 11, 2008
    Inventors: Bogdan Mitu, Gheorghe Stefan, Dan Tomescu
  • Patent number: 7451293
    Abstract: A computer processor having an integrated instruction sequencer, array of processing engines, and I/O controller. The instruction sequencer sequences instructions from a host, and transfers these instructions to the processing engines, thus directing their operation. The I/O controller controls the transfer of I/O data to and from the processing engines in parallel with the processing controlled by the instruction sequencer. The processing engines themselves are constructed with an integer arithmetic and logic unit (ALU), a 1-bit ALU, a decision unit, and registers. Instructions from the instruction sequencer direct the integer ALU to perform integer operations according to logic states stored in the 1-bit ALU and data stored in the decision unit. The 1-bit ALU and the decision unit can modify their stored information in the same clock cycle as the integer ALU carries out its operation. The processing engines also contain a local memory for storing instructions and data.
    Type: Grant
    Filed: October 19, 2006
    Date of Patent: November 11, 2008
    Assignee: Brightscale Inc.
    Inventors: Bogdan Mitu, Gheorghe Stefan, Dan Tomescu
  • Publication number: 20080244238
    Abstract: The present invention is a stream processing accelerator which includes multiple coupled processing elements that are interconnected through a shared file register and a set of global predicates. The stream processing accelerator has two modes: full-processor mode and circuit mode. In full-processor mode, a branch unit, an arithmetic logic unit and a memory unit work together as a regular processor. In circuit mode, these components act as functional units with configurable interconnections.
    Type: Application
    Filed: August 30, 2007
    Publication date: October 2, 2008
    Inventor: Bogdan Mitu
  • Publication number: 20080059762
    Abstract: The present invention is a data parallel system which is able to utilize a very high percentage of processing elements. In an embodiment, the data parallel system includes an array of processing elements and multiple instruction sequencers. Each instruction sequencer is coupled to the array of processing elements by a bus and is able to send an instruction to the array of processing elements. The processing elements are separated into classes and only execute instructions that are directed to their class, although all of the processing elements receive each instruction. In another embodiment, the data parallel system includes an array of processing elements and an instruction sequencer where the instruction sequencer is able to send multiple instructions. Again, the processing elements are separated into classes and execute instructions based on their class.
    Type: Application
    Filed: August 30, 2007
    Publication date: March 6, 2008
    Inventors: Bogdan Mitu, Gheorghe Stefan, Lazar Bivolarski
  • Publication number: 20070188505
    Abstract: An efficient method and device for the parallel processing of multimedia data. Blocks (or portions thereof) are transmitted to various parallel processors, in the order of their dependency data. Earlier blocks are sent to the parallel processors first, with later blocks sent later. The blocks are stored in the parallel processors in specific locations, and shifted around as necessary, so that every block, when it is processed, has its dependency data located in a specific set of earlier blocks with specified relative positions. In this manner, its dependency data can be retrieved with the same commands. That is, earlier blocks are shifted around so that later blocks can be processed with a single set of commands that instructs each processor to retrieve its dependency data from specific known relative locations that do not vary.
    Type: Application
    Filed: January 10, 2007
    Publication date: August 16, 2007
    Inventors: Lazar Bivolarski, Bogdan Mitu
  • Publication number: 20070189618
    Abstract: An efficient method and device for the parallel processing of sub-blocks of data. A parallel processing array has computing elements configured to process blocks of data of an image in a parallel manner. Blocks of image data are generated, wherein each of the blocks of image data is divided into sub-blocks, with a first data point of each sub-block flagging a beginning position of the sub-block. A block of type data is generated for each of the blocks of image data. Each of the blocks of type data contains the first data point for all of the sub-blocks in the block of image data, so that the numbers and locations of all sub-blocks in each block of image data can be determined without first having to process the block of image data.
    Type: Application
    Filed: January 10, 2007
    Publication date: August 16, 2007
    Inventors: Lazar Bivolarski, Bogdan Mitu
  • Publication number: 20070162722
    Abstract: An efficient method and device for the parallel processing of data variables. A parallel processing array has computing elements configured to process data variables in parallel. An algorithm for a plurality of computing elements of a parallel processor is loaded. The algorithm includes a plurality of processing steps. Each of the plurality of computing elements is configured to process a data variable associated with the computing element. Selection codes for the plurality of computing elements of the parallel processor are loaded, wherein the selection codes identify which of the algorithm steps are to be applied by the computing elements to the data variables. The algorithm processing steps are applied to the data variables by the computing elements, wherein for each computing element, only those processing steps identified by the selection codes are applied to the data variable.
    Type: Application
    Filed: January 10, 2007
    Publication date: July 12, 2007
    Inventors: Lazar Bivolarski, Bogdan Mitu
  • Publication number: 20070130444
    Abstract: A computer processor having an integrated instruction sequencer, array of processing engines, and I/O controller. The instruction sequencer sequences instructions from a host, and transfers these instructions to the processing engines, thus directing their operation. The I/O controller controls the transfer of I/O data to and from the processing engines in parallel with the processing controlled by the instruction sequencer. The processing engines themselves are constructed with an integer arithmetic and logic unit (ALU), a 1-bit ALU, a decision unit, and registers. Instructions from the instruction sequencer direct the integer ALU to perform integer operations according to logic states stored in the 1-bit ALU and data stored in the decision unit. The 1-bit ALU and the decision unit can modify their stored information in the same clock cycle as the integer ALU carries out its operation. The processing engines also contain a local memory for storing instructions and data.
    Type: Application
    Filed: October 19, 2006
    Publication date: June 7, 2007
    Inventors: Bogdan Mitu, Gheorghe Stefan, Dan Tomescu
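
Illustrative sketch (not taken from any of the filings): the abstract of publication 20070162722 above describes loading one algorithm for all computing elements, then using per-element selection codes to decide which steps each element applies to its own data variable. The Python below simulates that idea in software. The function name `run_with_selection_codes`, the encoding of a selection code as a bitmask (bit k enables step k), and the sample steps are all assumptions made for illustration; the abstract does not specify an encoding or an implementation.

```python
from typing import Callable, List, Sequence

# A processing step maps a data variable to a new value (hypothetical type alias).
Step = Callable[[int], int]


def run_with_selection_codes(
    steps: Sequence[Step],
    data: Sequence[int],
    selection_codes: Sequence[int],
) -> List[int]:
    """Broadcast every step to every element; each element applies a step
    only when its selection code enables it.

    Assumption: a selection code is a bitmask in which bit k enables step k.
    """
    if len(data) != len(selection_codes):
        raise ValueError("need exactly one selection code per data variable")

    values = list(data)
    # The outer loop plays the role of the shared instruction stream:
    # all elements see step k at the same time.
    for k, step in enumerate(steps):
        for i, code in enumerate(selection_codes):
            if (code >> k) & 1:  # element i is selected for step k
                values[i] = step(values[i])
    return values


# Hypothetical three-step algorithm and four computing elements.
steps = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
data = [10, 20, 30, 40]
codes = [0b001, 0b010, 0b101, 0b111]

result = run_with_selection_codes(steps, data, codes)
print(result)  # [11, 40, 28, 79]
```

In this sketch every element walks through the same step sequence, so the control flow is identical across elements and only the per-element mask differs, which is the property the abstract emphasizes.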