Patents by Inventor Shorin Kyo

Shorin Kyo has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20130024667
    Abstract: An attribute group storage unit acquires and holds attribute groups set to respective data blocks. A scenario determination unit determines respective transfer systems of the respective blocks between a memory of the lowest hierarchy and a memory of another hierarchy based on those attribute groups and a configuration of an arithmetic unit which is the parallel processor, and controls the transfer of the respective data blocks according to the determined transfer systems, and the parallel arithmetic operation corresponding to the transfer. Each of the attribute groups is necessary to determine the transfer systems, and includes one or more attributes not depending on the configuration of the parallel processor. The attribute groups of the write blocks are set assuming that each of the write blocks has already been located in the memory of another hierarchy, and is transferred to the memory of the lowest hierarchy.
    Type: Application
    Filed: June 21, 2012
    Publication date: January 24, 2013
    Applicant: Renesas Electronics Corporation
    Inventor: Shorin KYO
  • Patent number: 8190856
    Abstract: A processor of SIMD/MIMD dual mode architecture comprises common controlled first processing elements, self-controlled second processing elements and a pipelined (ring) network connecting the first PEs and the second PEs sequentially. An access controller has access control lines, each access control line being connected to each PE of the first and second PEs to control data access timing between each PE and the network. Each PE can be self-controlled or common controlled, such as dual mode SIMD/MIMD architectures, reducing the wiring area requirement.
    Type: Grant
    Filed: March 6, 2007
    Date of Patent: May 29, 2012
    Assignee: NEC Corporation
    Inventors: Hanno Lieske, Shorin Kyo
  • Patent number: 8131978
    Abstract: An original first instruction word (I1) to an original third instruction word (I3) include a bit field (L11) and a bit field (L12) to a bit field (L31) and a bit field (L32). An information word (IW) includes a set of some of bit fields belonging to a plurality of instruction words executed in the same cycle, which are the bit field (L12) of the original first instruction word (I1) to the bit field (L32) of the original third instruction word (I3). An instruction decoder (103) of a processor (100) decomposes the information word (IW) and restores the arrangements of the original first instruction word (I1) to the original third instruction word (I3) by combining the bit field (L11) to the bit field (L31) to the bit field (L12) to the bit field (L32). This can reduce the amount of memory consumption without degrading the instruction execution performance.
    Type: Grant
    Filed: June 15, 2007
    Date of Patent: March 6, 2012
    Assignee: NEC Corporation
    Inventor: Shorin Kyo
  • Publication number: 20120042129
    Abstract: For a program that is made up of functions in units, each function is divided into instruction code blocks having a size CS where CS is the instruction cache line size of a target processor and an instruction code block that is Xth counting from the top of each function F is expressed as (F, X). Flow information of nodes that take (F, X) as identification names is extracted from an executable file of the function program. For each identification name, as neighborhood weight of each identification name that differs from that identification name, information for which that the frequency of appearance of each identification name is taken into consideration that belongs to a function that differs from that function in the neighborhood of each appearing node in the flow information is found. Based on said neighborhood weight information, the functions are arranged in the memory space such that the number of conflicts of said instruction cache is reduced.
    Type: Application
    Filed: March 3, 2010
    Publication date: February 16, 2012
    Applicant: NEC CORPORATION
    Inventor: Shorin Kyo
  • Patent number: 8112613
    Abstract: Disclosed is a mixed mode parallel processor system in which N number of processing elements PEs, capable of performing SIMD operation, are grouped into M (=N÷S) processing units PUs performing MIMD operation. In MIMD operation, P out of S memories in each PU, which S memories inherently belong to the PEs, where P<S, operate as an instruction cache. The remaining memories operate as data memories or as data cache memories. One out of S sets of general-purpose registers, inherently belonging to the PEs, directly operates as a general register group for the PU. Out of the remaining S−1 sets, T set or a required number of sets, where T<S−1, are used as storage registers that store tags of the instruction cache.
    Type: Grant
    Filed: February 18, 2011
    Date of Patent: February 7, 2012
    Assignee: NEC Corporation
    Inventor: Shorin Kyo
  • Patent number: 8051273
    Abstract: Disclosed is a mixed mode parallel processor system in which N number of processing elements PEs, capable of performing SIMD operation, are grouped into M (=N÷S) processing units PUs performing MIMD operation. In MIMD operation, P out of S memories in each PU, which S memories inherently belong to the PEs, where P<S, operate as an instruction cache. The remaining memories operate as data memories or as data cache memories. One out of S sets of general-purpose registers, inherently belonging to the PEs, directly operates as a general register group for the PU. Out of the remaining S−1 sets, T set or a required number of sets, where T<S−1, are used as storage registers that store tags of the instruction cache.
    Type: Grant
    Filed: November 2, 2010
    Date of Patent: November 1, 2011
    Assignee: NEC Corporation
    Inventor: Shorin Kyo
  • Publication number: 20110138151
    Abstract: Disclosed is a mixed mode parallel processor system in which N number of processing elements PEs, capable of performing SIMD operation, are grouped into M (=N÷S) processing units PUs performing MIMD operation. In MIMD operation, P out of S memories in each PU, which S memories inherently belong to the PEs, where P<S, operate as an instruction cache. The remaining memories operate as data memories or as data cache memories. One out of S sets of general-purpose registers, inherently belonging to the PEs, directly operates as a general register group for the PU. Out of the remaining S−1 sets, T set or a required number of sets, where T<S−1, are used as storage registers that store tags of the instruction cache.
    Type: Application
    Filed: February 18, 2011
    Publication date: June 9, 2011
    Applicant: NEC CORPORATION
    Inventor: Shorin KYO
  • Publication number: 20110047348
    Abstract: Disclosed is a mixed mode parallel processor system in which N number of processing elements PEs, capable of performing SIMD operation, are grouped into M (=N÷S) processing units PUs performing MIMD operation. In MIMD operation, P out of S memories in each PU, which S memories inherently belong to the PEs, where P<S, operate as an instruction cache. The remaining memories operate as data memories or as data cache memories. One out of S sets of general-purpose registers, inherently belonging to the PEs, directly operates as a general register group for the PU. Out of the remaining S−1 sets, T set or a required number of sets, where T<S−1, are used as storage registers that store tags of the instruction cache.
    Type: Application
    Filed: November 2, 2010
    Publication date: February 24, 2011
    Applicant: NEC CORPORATION
    Inventor: Shorin KYO
  • Publication number: 20110040952
    Abstract: Uniforming of the processing load is efficiently realized. Each processing element configuring an SIMD parallel computer system includes a data storage module that stores data processed or transferred, a number-of-data-sets storage device that stores number of data sets, and a front data storage device that stores the front data. Each processing element further includes a control processor that compares the number of data sets stored in one processing element with the number of data sets stored in the own processing element, and issues a data distribution leveling instruction that designates an action for updating contents of the data storage module, the number-of-data-sets storage device, and the front data storage device according to a rule determined based on a comparison result of the own processing element and that of the other processing elements and an action for moving the data stored in the one processing element to the own processing element.
    Type: Application
    Filed: April 8, 2009
    Publication date: February 17, 2011
    Applicant: NEC CORPORATION
    Inventor: Shorin Kyo
  • Publication number: 20110010526
    Abstract: Nowadays, many architectures have processing units with different bandwidth requirements which are connected over a pipelined ring bus. The proposed invention can optimize the data transfer for the case where processing units with lower bandwidth requirements can be grouped and controlled together for a data transfer, so that the available bus bandwidth can be optimally utilized.
    Type: Application
    Filed: March 3, 2008
    Publication date: January 13, 2011
    Inventors: Hanno Lieske, Shorin Kyo
  • Publication number: 20110010524
    Abstract: There is provided an SIMD processor array system in which data can be efficiently transferred between processor elements located at different distances. The SIMD processor array system includes a control processor (CP) that is capable of issuing a plurality of instructions at the same time, and a PE array that includes a plurality of mutually-connected processing elements (PEs) to be controlled by the CP. The CP issues an inter-PE data shift instruction to each PE. According to the inter-PE data shift instruction, each PE performs a data sending operation of copying all the contents of a transfer data storing part of an adjoining PE to a transfer data storing part (MBF) of the own PE, and a data fetch operation of copying part or all of the contents of the MBF of the adjoining PE to a transfer data fetch and storing part (RBUF) of the own PE if part of the contents of the MBF of the adjoining PE coincide with the contents of an ID storing part (IDB) of the own PE.
    Type: Application
    Filed: March 4, 2009
    Publication date: January 13, 2011
    Applicant: NEC CORPORATION
    Inventor: Shorin Kyo
  • Patent number: 7853775
    Abstract: Disclosed is a mixed mode parallel processor system in which N number of processing elements PEs, capable of performing SIMD operation, are grouped into M (=N÷S) processing units PUs performing MIMD operation. In MIMD operation, P out of S memories in each PU, which S memories inherently belong to the PEs, where P<S, operate as an instruction cache. The remaining memories operate as data memories or as data cache memories. One out of S sets of general-purpose registers, inherently belonging to the PEs, directly operates as a general register group for the PU. Out of the remaining S−1 sets, T set or a required number of sets, where T<S−1, are used as storage registers that store tags of the instruction cache.
    Type: Grant
    Filed: August 9, 2007
    Date of Patent: December 14, 2010
    Assignee: NEC Corporation
    Inventor: Shorin Kyo
  • Patent number: 7783861
    Abstract: When an instruction code “MVLR” is sent from a control processor in a PE having a mask register MR in operation setting, when the direction register F is ON, if a counter and transfer result storing buffer T is ≥M, a value of T−M is stored in buffer T, and if T is less than M, content of a first transport register L of a PE whose PE number counted from the left inside a PE block is T, is selected by a first selector and stored in buffer T and the mask register is set to non-operation. When the direction register is OFF, if T is ≤−M, a value of T+M is stored in buffer T, and if T is greater than −M, content of R of a PE whose PE number is −T, counted from the right inside the PE block, is selected by a second selector and stored in buffer T, and MR is set to non-operation. Entire PEs transfer content of L and R to M-adjacent left and right PEs, and data transferred from M-adjacent right and M-adjacent left PEs are stored in L and R respectively.
    Type: Grant
    Filed: February 27, 2007
    Date of Patent: August 24, 2010
    Assignee: NEC Corporation
    Inventor: Shorin Kyo
  • Publication number: 20100161944
    Abstract: An original first instruction word (I1) to an original third instruction word (I3) include a bit field (L11) and a bit field (L12) to a bit field (L31) and a bit field (L32). An information word (IW) includes a set of some of bit fields belonging to a plurality of instruction words executed in the same cycle, which are the bit field (L12) of the original first instruction word (I1) to the bit field (L32) of the original third instruction word (I3). An instruction decoder (103) of a processor (100) decomposes the information word (IW) and restores the arrangements of the original first instruction word (I1) to the original third instruction word (I3) by combining the bit field (L11) to the bit field (L31) to the bit field (L12) to the bit field (L32). This can reduce the amount of memory consumption without degrading the instruction execution performance.
    Type: Application
    Filed: June 15, 2007
    Publication date: June 24, 2010
    Inventor: Shorin Kyo
  • Publication number: 20100088489
    Abstract: A processor of SIMD/MIMD dual mode architecture comprises common controlled first processing elements, self-controlled second processing elements and a pipelined (ring) network connecting the first PEs and the second PEs sequentially. An access controller has access control lines, each access control line being connected to each PE of the first and second PEs to control data access timing between each PE and the network. Each PE can be self-controlled or common controlled, such as dual mode SIMD/MIMD architectures, reducing the wiring area requirement.
    Type: Application
    Filed: March 6, 2007
    Publication date: April 8, 2010
    Inventors: Hanno Lieske, Shorin Kyo
  • Patent number: 7509634
    Abstract: A translator receives a source code that is described using a process designation (such as a line-by-line process designation, a line data extraction designation, and a broadcast designation) to be performed on line data of an image on a line by line basis, parses and optimizes the source code, and then generates an SIMD macro code that is an intermediate form taking into consideration the use of an SIMD instruction set. A simplifier generates, from the SIMD macro code, a simplified SIMD macro code, namely, a composite macro code into which a series of codes having the relationship between the definition and the reference of the same virtual SIMD register is organized. A machine code generator generates, from the simplified SIMD macro code, a machine code that efficiently uses an SIMD instruction.
    Type: Grant
    Filed: November 12, 2003
    Date of Patent: March 24, 2009
    Assignee: NEC Corporation
    Inventor: Shorin Kyo
  • Publication number: 20090049275
    Abstract: Disclosed is a mixed mode parallel processor system in which N number of processing elements PEs, capable of performing SIMD operation, are grouped into M (=N÷S) processing units PUs performing MIMD operation. In MIMD operation, P out of S memories in each PU, which S memories inherently belong to the PEs, where P<S, operate as an instruction cache. The remaining memories operate as data memories or as data cache memories. One out of S sets of general-purpose registers, inherently belonging to the PEs, directly operates as a general register group for the PU. Out of the remaining S−1 sets, T set or a required number of sets, where T<S−1, are used as storage registers that store tags of the instruction cache.
    Type: Application
    Filed: August 9, 2007
    Publication date: February 19, 2009
    Applicant: NEC CORPORATION
    Inventor: Shorin Kyo
  • Publication number: 20090043986
    Abstract: A processor array system which is able to perform load balancing among PEs at high speed is provided. When an instruction code 113, “MVLR”, is sent from a control processor 110, in a PE having a mask register MR being in operation setting, in case wherein the direction register F is ON, if a counter and transfer result storing buffer T is greater than or equal to M, a value of T−M is stored in T, and if T is less than M, content of a first transport register L of a PE whose PE number counted from the left inside a PE block is T, is selected by a first selector LS to be stored in the transfer result buffer T and the mask register is set to non-operation. On the other hand, in case wherein the direction register F is OFF, if T is less than or equal to −M, a value of T+M is stored in T, and if T is greater than −M, content of R of a PE whose PE number is −T, counted from the right inside the PE block, is selected by a second selector RS to be stored in T, and MR is set to non-operation.
    Type: Application
    Filed: February 27, 2007
    Publication date: February 12, 2009
    Inventor: Shorin Kyo
  • Patent number: 7400781
    Abstract: A symmetric type image filter processing apparatus having a symmetric type image filter composed of symmetric kernel coefficients, in which SIMD commands are utilized efficiently for making the filtering processes high speed, is provided. The symmetric type image filter processing apparatus provides a row-wise intermediate data generating section, a row-wise intermediate data utilizing section, and a memory. The row-wise intermediate data generating section multiplies each kernel coefficient of M pieces in each column of {(N+1)/2} columns at the right or left column by each pixel of M pieces in the column direction of image data having P pixels in one row, and cumulatively adds the multiplied results, by using SIMD commands that can process sequential data of Q pieces. This multiplication and addition operation is executed P/Q times, and intermediate data in one row of the image data are generated and stored in an intermediate data storing region in the memory.
    Type: Grant
    Filed: December 16, 2003
    Date of Patent: July 15, 2008
    Assignee: NEC Corporation
    Inventor: Shorin Kyo
  • Publication number: 20040126035
    Abstract: A symmetric type image filter processing apparatus having a symmetric type image filter composed of symmetric kernel coefficients, in which SIMD commands are utilized efficiently for making the filtering processes high speed, is provided. The symmetric type image filter processing apparatus provides a row-wise intermediate data generating section, a row-wise intermediate data utilizing section, and a memory. The row-wise intermediate data generating section multiplies each kernel coefficient of M pieces in each column of {(N+1)/2} columns at the right or left column by each pixel of M pieces in the column direction of image data having P pixels in one row, and cumulatively adds the multiplied results, by using SIMD commands that can process sequential data of Q pieces. This multiplication and addition operation is executed P/Q times, and intermediate data in one row of the image data are generated and stored in an intermediate data storing region in the memory.
    Type: Application
    Filed: December 16, 2003
    Publication date: July 1, 2004
    Applicant: NEC CORPORATION
    Inventor: Shorin Kyo