SEMICONDUCTOR DEVICE DESIGN METHOD
There is provided a semiconductor device design method capable of achieving optimal layout design. For example, from the entire semiconductor device, a plurality of seeds which are flip-flops are set uniformly. In the first trace, the effective range (node) of each seed is expanded in parallel so that the respective objective function values (including difficulty levels of timing convergence) of the nodes are equalized. Then, in the first merge, adjacent seeds are merged as appropriate so that the number of nodes decreases to a certain rate, and a total cost containing the difficulty level of each node and the difficulty level of circuits remaining in the entire semiconductor device is calculated. Until the total cost worsens, as in the first trace and merge, the second trace and merge, the third trace and merge, . . . are performed. Based on optimal division units thereby determined, floorplan, division layout, and the like are performed.
The disclosure of Japanese Patent Application No. 2009-239619 filed on Oct. 16, 2009 including the specification, drawings and abstract is incorporated herein by reference in its entirety.
BACKGROUND OF THE INVENTION
The present invention relates to a semiconductor device design method, and in particular, relates to a technique effective when applied as a division method for dividing the overall layout and performing automatic layout.
For example, Japanese Unexamined Patent Publication No. Hei 6 (1994)-348784 (Patent Document 1) describes a method for, in detailed wiring performed in parallel on wiring areas formed by dividing a wiring area after rough wiring, equalizing the respective detailed-wiring times of the divided wiring areas. Specifically, there is performed processing for calculating respective coarse-grid wiring loads, and with a plurality of seeds as origins, whose number is set to the number of processors, sequentially selecting from among adjacent seeds and merging coarse grids of smaller increments in detailed wiring loads merged. Further, each coarse-grid wiring load is determined based on the number of wires, the amount of wiring prohibition, and the distortion rate of grid shape contained in the respective coarse grid.
SUMMARY OF THE INVENTION
For example, a hierarchical layout method is known as a method for implementing a large-scale semiconductor chip.
In layout design using such input data, generally first through tentative layout (floorplan), rough layout of each block and rough wiring between blocks are performed based on the logical structure of
Thus, in the hierarchical layout method, data is divided generally based on the logical hierarchy, that is, the blocks divided with the functional units in the semiconductor chip, and with each divided data as a parallel processing unit, a computer system performs automatic layout. However, in this case, from the viewpoint of the entire semiconductor chip, there is a high possibility of unevenness in the logical size (the number of cells and the number of nets) of each block, leading to unevenness in the amount of each processing data, which may increase the overall layout processing time.
On the other hand, for example, assume that the circuit is divided into block units in the following way (A) or (B). (A) The circuit is divided into each block that has an equal number of gates (equal cell area) and equal block area. (B) The circuit is divided into each block that has an equal number of interface pins. Performing such division so that each block has an equal amount of processing data might be expected to equalize the respective layout processing times for the blocks and shorten the overall layout processing time. However, in this case as well, the blocks have different difficulty levels of timing convergence, which may increase the overall layout processing time. That is, based on timing constraints obtained by dividing (budgeting) the timing constraints (SDC) of the entire semiconductor chip into block units, the layout is determined so as to satisfy the constraints; however, for example, different operating frequencies of the blocks lead to different difficulty levels of timing constraints, which makes it difficult to estimate the overall layout processing time including the time required for optimization in the highest hierarchy.
The microcomputer is classified for example as in
Such unevenness in layout processing time for each block increases the overall layout processing time and the design time. For example, there is a method for dividing the entire semiconductor chip into blocks whose number is equal to the number of CPUs and setting the division boundary so as to equalize the respective numbers of wires for the division blocks, as in Patent Document 1. However, this method does not necessarily bring about optimal division because the processing time varies depending on the operating frequency of the line as well as the number of wires as described above. Further, this method is intended only to equalize layout processing times in detailed wiring; from the overall viewpoint of the layout design of the semiconductor device, equalizing processing times in detailed wiring alone cannot sufficiently optimize the layout design.
That is, in conventional layout methods such as Patent Document 1, after a predetermined rough layout is divided into, for example, blocks whose number is equal to the number of CPUs, detailed layout is performed, thereby shortening the overall layout processing time. However, in the first place, the rough layout itself is not necessarily optimal from the overall viewpoint of the design of the semiconductor device. Specifically, the method such as Patent Document 1 is intended to determine, on the condition that each block layout is determined as shown in
Further, in recent years, multilayer layout has sometimes been performed through the three-dimensional stack, as shown in
In the case of performing such multilayer layout, usually, with each circuit block BLK_A to BLK_D as a functional unit, the circuit blocks are allocated to the semiconductor chips as appropriate in such a way that similar functions are contained in one semiconductor chip.
From the overall viewpoint of the design of the semiconductor device, it is desirable to equalize the respective layout processing times for the semiconductor chips and equalize power consumption, noise, and the like. Particularly in the case of the multilayer layout, unevenness-associated trouble in an advanced stage of design causes a large loss with redesign; therefore, it is necessary to implement uniform layout design in an early stage. This unevenness problem applies, as a matter of course, not only to the multilayer layout but also to the layout of a single semiconductor chip, so that it is desirable to equalize the respective layout processing times for the circuit blocks in the single semiconductor chip and equalize power consumption, noise, and the like. However, in reality, a trade-off relationship exists, and a scheme for obtaining an optimal solution is required.
The present invention has been made in view of such a circumstance, and it is an object of the invention to provide a semiconductor device design method capable of achieving optimal layout design. The above and other objects and novel features of the present invention will become apparent from the description of this specification and the accompanying drawings.
A typical embodiment of the invention disclosed in the present application will be briefly described as follows.
In a semiconductor device design method according to this embodiment, an objective function which is a function of the length of layout processing time in consideration of timing convergence, the magnitude of power, the level of noise, etc. and represents the comprehensive complexity of layout is defined, and a computer system allocates the entire circuit of the highest hierarchy to N blocks so as to equalize the respective objective function values of the blocks, with a predetermined reference value as a target.
With this, it is possible to obtain a plurality of division blocks equalized comprehensively including layout processing time and quality. Therefore, by laying out each division block in parallel processing based on this result, it is possible to shorten the layout processing time. Further, by performing floorplan or allocation to a plurality of semiconductor chips based on this result, it is possible to perform optimization including the quality of the semiconductor device and the layout processing time. Thus, it is possible to optimize the layout design from the comprehensive viewpoint.
Further, in the semiconductor device design method according to this embodiment, a total cost is calculated by reflecting, in the reference value, the complexity (e.g., the number of timing paths) of circuits remaining in the highest hierarchy which are circuits other than the N blocks, so that while the reference value is increased and the N value is decreased in stages, the total cost for each N value is calculated, thereby obtaining the N value of the best total cost and the corresponding boundary of each block. That is, it is also possible to search for an optimal solution to the number of division blocks.
More specifically, in the semiconductor device design method, a netlist of the entire circuit, timing information, and, in some cases, floorplan information FP are input. First, from the entire circuit, a plurality of seeds which are flip-flop circuits are set. Then, in the first trace, the effective range of each seed is expanded in stages so that the respective objective function values are equalized among the effective ranges of the seeds. The expansion is performed by sequentially taking in preceding or subsequent flip-flops coupled to each seed. Then, a seed that meets a first condition in the process of expansion is converted into a subgraph, and the trace is continued until the number of remaining seeds which have not yet become a subgraph decreases to a first rate. Subsequently, in the first merge, subgraphs are merged as appropriate until the sum of the number of remaining seeds and the number of subgraphs decreases to a second rate. Then, a total cost in the case where division is performed with each of the remaining seeds and the subgraphs as a division unit is calculated in consideration of the number of timing paths etc. of circuits that do not belong to the remaining seeds or the subgraphs. As long as the total cost is better than the previous one, as in the first trace and merge, the second trace and merge, the third trace and merge, . . . are performed.
Thus, a plurality of seeds are set beforehand, the effective range of each seed is expanded in stages, and subgraphed seeds are merged as appropriate, thereby decreasing the overall division number in stages and checking whether the total cost is improved, so that an optimal division number can be obtained efficiently. The seed that meets the first condition in the above description refers to a seed that reaches the following state. All perimeters of the effective range of the seed come into contact with the effective ranges of other seeds and cannot expand any further. Alternatively, in the case where the netlist is managed with a hierarchy block, all perimeters of the effective range of the seed reach the boundary of a hierarchy block to which the seed belongs.
According to an effect of the typical embodiment of the invention disclosed in the present application, it is possible to optimize the layout design.
In the following embodiments, description will be made by dividing an embodiment into a plurality of sections or embodiments when necessary for the sake of convenience; however, except when a specific indication is given, they are not mutually unrelated, but there is a relationship that one section or embodiment is a modification, specification, or supplementary explanation of part or all of another section or embodiment. Further, in the case where the following embodiments deal with a numerical expression (including a number, a numerical value, amount, range) concerning elements, the numerical expression is not limited to the specific number but may be larger or smaller than the specific number except when a specific indication is given or when the expression is apparently limited to the specific number in principle.
Furthermore, in the following embodiments, the components (including element steps) are not always indispensable except when a specific indication is given or when they are apparently considered to be indispensable in principle. Similarly, in the case where the following embodiments deal with the shape, positional relationship, etc., of the components etc., those substantially approximate or similar to them in shape etc. are also included except when a specific indication is given or when they are apparently considered to be excluded in principle. This also applies to numerical values and ranges described above.
Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings. In all the drawings for illustrating the embodiments, the same components or members are basically denoted by the same reference numerals, and their description will not be repeated.
First Embodiment
In
Then, the computer system determines whether the number of nodes N after the merge is smaller than NI×J (S105). J is a constant (0<J<1) set beforehand by a user. If the condition of S105 is not satisfied, the computer system again performs a trace in S103. If the condition of S105 is satisfied, after setting the reference value NI=N (S106), the computer system calculates a total cost (S107). While the total cost value is improving, the computer system returns to S103 and repeats loop processing. If the total cost value has worsened, the computer system exits the loop and determines that the number of nodes N in the previous loop is an optimal division number (S108, S109).
The total cost is determined by adding the cost (the number of timing paths etc.) of circuits (i.e., circuits remaining in the highest hierarchy (TOP)) which do not belong to the nodes to the representative value (e.g., maximum value or average value) of the respective objective functions for the nodes, with each node laid out in parallel processing. Specifically, it is calculated, for example, by equation (1). In equation (1), α is an overhead coefficient depending on the number of nodes N and increases as the number of nodes N increases.
Total cost=max(respective objective function values of nodes)×α+top cost (1)
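As a minimal sketch of equation (1), the following Python fragment computes the total cost from the objective function values of the nodes and the cost of the circuits remaining in TOP. The linear form of the overhead coefficient α and the names overhead, base, and per_node are assumptions made only for illustration; the description states only that α increases as the number of nodes N increases.

```python
def overhead(n_nodes, base=1.0, per_node=0.01):
    """Hypothetical overhead coefficient alpha that grows with the node count N."""
    return base + per_node * n_nodes


def total_cost(node_objectives, top_cost, alpha):
    """Equation (1): largest node objective value times alpha, plus the TOP cost."""
    return max(node_objectives) * alpha + top_cost


# Three nodes laid out in parallel; all values are illustrative only.
g_values = [10.0, 12.0, 11.0]
cost = total_cost(g_values, top_cost=5.0, alpha=overhead(len(g_values)))
```

The maximum is used here as the representative value because, with the nodes laid out in parallel, the slowest node bounds the overall time; the average could be substituted where the description allows it.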
Thus, the semiconductor device design method according to the first embodiment equalizes the comprehensive complexity of layout and searches for a division condition (division number and the boundary of each division block) for shortening the overall layout processing time. In the method, by performing traces, the complexity of each division block is increased in stages while the uniformity thereof is maintained. Concurrently, by performing merges, the division number is decreased in stages. Further, by calculating the total cost at each stage, the overall layout processing time is verified.
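The search described above, in which trace and merge are repeated while the total cost improves and the previous division is adopted once it worsens, can be sketched as follows. This is only an outline under stated assumptions: step stands in for one trace-and-merge round, cost for the total-cost calculation, and toy_step and toy_cost are hypothetical stand-ins that reduce a division to its node count.

```python
def search_division(initial, step, cost):
    """Repeat step() while the total cost improves; return the last improving state."""
    best = step(initial)
    best_cost = cost(best)
    while True:
        candidate = step(best)
        candidate_cost = cost(candidate)
        # Stop when the cost worsens. A tie is also treated as a stop here
        # (a choice of this sketch) so that the loop always terminates.
        if candidate_cost >= best_cost:
            return best, best_cost
        best, best_cost = candidate, candidate_cost


def toy_step(n_nodes):
    """Hypothetical round: reduce the node count to about 0.7 times (J = 0.7)."""
    return max(1, n_nodes * 7 // 10)


def toy_cost(n_nodes):
    """Hypothetical total cost: per-node load times an overhead, plus a TOP cost."""
    return 120.0 / n_nodes * (1.0 + 0.05 * n_nodes) + 2.0 * n_nodes


best_n, best_total = search_division(32, toy_step, toy_cost)
```

In this toy setting the node count falls 22, 15, 10, 7, 4, and the search settles on the count reached just before the cost turns upward.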
Further, as shown in
Hereinafter, the flow of
That is, generally in the logical hierarchy of the netlist, the highest hierarchy TOP includes a lower hierarchy comprised of a plurality of blocks BLK0[0] to BLK0[n] which are large functional units, and each lower hierarchy includes a further lower hierarchy comprised of a plurality of blocks which are relatively large functional units, thus forming the structure of predetermined successive hierarchies. For example, in a lower layer of BLK0[1], blocks BLKi[0] to BLKi[m] exist. Further, each block (e.g., BLKi[1]) located in the lower layer includes a lower hierarchy comprised of a plurality of modules (e.g., MD0[0] to MD0[1]) which are small functional units, and each lower hierarchy includes a lower hierarchy comprised of a plurality of modules, thus forming the structure of predetermined successive hierarchies. For example, in a lower layer of MD0[1], modules MDj[0] to MDj[k] exist. Further, a module (e.g., MDj[1]) in the lowest layer includes a plurality of flip-flops (e.g., FF[0] to FF[x]).
Accordingly, for example, if the number of modules located in a same hierarchy is nearly equal to the number of seeds, the computer system selects one seed from each module. For excess or deficiency, the computer system, for example, does not select a seed from some modules or selects several seeds from one module of particularly large circuit size. This makes it possible to select seeds uniformly from the entire circuit.
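One possible way to distribute seeds over modules along these lines is sketched below. The round-robin rule in descending order of module size, and the names allocate_seeds and module_sizes, are assumptions chosen for illustration; the description does not fix a particular allocation rule.

```python
def allocate_seeds(module_sizes, n_seeds):
    """Return the number of seeds to pick from each module.

    module_sizes maps a module name to a size measure (e.g. cell count).
    Seeds are dealt out round-robin, largest module first, so that small
    modules are skipped when seeds are scarce and large modules receive
    extra seeds when seeds are plentiful.
    """
    if not module_sizes:
        raise ValueError("at least one module is required")
    by_size = sorted(module_sizes, key=lambda m: (-module_sizes[m], m))
    counts = {m: 0 for m in module_sizes}
    for i in range(n_seeds):
        counts[by_size[i % len(by_size)]] += 1
    return counts


sizes = {"MD_A": 100, "MD_B": 50, "MD_C": 10}  # hypothetical modules
fewer = allocate_seeds(sizes, 2)   # deficiency: the smallest module gets none
more = allocate_seeds(sizes, 4)    # excess: the largest module gets two
```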
Further, in the selection of a seed from each module, it is desirable to select a seed of a flip-flop estimated to be located in the center of each module to the extent possible. For this reason, as shown in
After thus selecting seeds, the computer system performs a trace with each seed as an origin. In the trace, based on the objective function defined beforehand, the computer system expands nodes in parallel so as to equalize the respective objective function values of the nodes which are the effective ranges of the seeds. As described above, the objective function G is a function of cost (RT) representing the length of layout processing time (in other words, the difficulty level of layout convergence), cost (PW) representing the magnitude of power, cost (NS) representing the level of noise, and cost (YE) representing manufacturability (yield) as variables, and is expressed, for example, by equation (2). In equation (2), β1 to β4 are weighting coefficients for the variables, and can be arbitrarily set by the user.
G=β1×RT+β2×PW+β3×NS+β4×YE (2)
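Equation (2) can be written directly as a small Python function. The following is a sketch only; the container NodeCosts and the sample numbers are hypothetical, and the weights β1 to β4 default to 1 merely as a placeholder for user-set values.

```python
from dataclasses import dataclass


@dataclass
class NodeCosts:
    """Cost terms of one node (names follow equation (2))."""
    rt: float  # layout processing time cost
    pw: float  # power cost
    ns: float  # noise cost
    ye: float  # manufacturability (yield) cost


def objective(costs, betas=(1.0, 1.0, 1.0, 1.0)):
    """Equation (2): weighted sum of the four cost terms."""
    b1, b2, b3, b4 = betas
    return b1 * costs.rt + b2 * costs.pw + b3 * costs.ns + b4 * costs.ye


# A user who cares mostly about timing convergence might weight RT heavily.
g = objective(NodeCosts(rt=2.0, pw=3.0, ns=4.0, ye=5.0), betas=(4.0, 1.0, 1.0, 1.0))
```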
Hereinafter, the objective function G will be detailed.
[A] Power Cost (PW) and Manufacturability Cost (YE)
The power cost (PW) is an index representing the possibility of a drop in supply voltage due to local power concentration; the larger this value, the more likely problems are assumed to be. The value of PW is determined, for example, by the sum of the power consumption (acquired from the cell information SL) of each cell contained in a node of interest and recognized from the netlist NL. If cell activation rate information exists, it is also taken into account. Further, the fan-out of each cell is recognized from the netlist NL, and the wiring capacity associated with the fan-out is added as a weight. Next, the value of the manufacturability cost (YE) is determined, for example, by the sum of the yield (acquired from the cell information SL) of each cell contained in a node of interest and recognized from the netlist NL. Here too, the larger the value, the more likely problems are assumed to be.
[B] Layout Processing Time Cost (RT)
The layout processing time cost (RT) is determined, for example, by a function of four variables comprised of [1] Pin/Net, [2] the sum of the speeds of clocks supplied to flip-flops (CKSUM), [3] the number of endpoints (EP), and [4] the sum of timing slacks (TPS).
First, [1] Pin/Net is obtained by detecting the number of pins and the number of nets (the number of wires) contained in a node of interest by referring to the netlist NL. In general, as this value increases, the complexity (difficulty level) of layout increases, and the layout processing time increases. For example,
Next, [2] the sum of the speeds of clocks supplied to flip-flops (CKSUM) is obtained by recognizing clock information of flip-flops contained in a node of interest by referring to the netlist NL and the timing information TM. As the sum of the clock speeds increases, the difficulty level of timing convergence increases, and the layout processing time increases.
To be more precise, the difficulty level associated with the sum of the clock speeds (CKSUM) changes depending on the number of logic stages of each combinational circuit LOG in the example of
CLK1=FF1(10), FF2(15), FF3(15)
CLK2=FF1(25), FF2(30), FF3(30), FF4(40), FF5(40)
CLK3=FF4(40), FF5(40)
The above numbers of logic stages are reflected in CKSUM. With a function f in which the value thereof becomes 1 when the number of logic stages is a reference number, the value increases from 1 as the number of logic stages increases from the reference number, and the value decreases from 1 as the number of logic stages decreases from the reference number, the sum of the clock speeds (CKSUM)′ is calculated as follows.
150 MHz×(f(10)+f(15)+f(15)=3.4)=510
100 MHz×(f(25)+f(30)+f(30)+f(40)+f(40)=5.3)=515
50 MHz×(f(40)+f(40)=0.8)=40
(CKSUM)′=510+515+40=1065
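The weighting of clock speeds by the number of logic stages can be sketched as below. The function f is not given in closed form in the description, so stage_factor uses a hypothetical linear form (the stage count divided by a reference of 20) that merely satisfies the stated properties; the pairing of 150 MHz, 100 MHz, and 50 MHz with CLK1 to CLK3 is likewise assumed from the order of the calculation. Because f is an assumption, the result differs from the figures quoted above.

```python
def stage_factor(n_stages, reference=20):
    """Hypothetical f: 1 at the reference stage count, above 1 for more stages, below 1 for fewer."""
    return n_stages / reference


def weighted_clock_sum(clock_groups):
    """Sum, over clocks, of frequency (MHz) times the summed stage factors of its flip-flops."""
    return sum(freq * sum(stage_factor(n) for n in stages)
               for freq, stages in clock_groups)


groups = [
    (150, [10, 15, 15]),          # CLK1: logic stages seen by FF1 to FF3
    (100, [25, 30, 30, 40, 40]),  # CLK2: logic stages seen by FF1 to FF5
    (50, [40, 40]),               # CLK3: logic stages seen by FF4 and FF5
]
cksum_weighted = weighted_clock_sum(groups)
```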
Next, [3] the number of endpoints (EP) is obtained by recognizing the number of endpoints for each flip-flop contained in a node of interest by referring to the netlist NL. As the number of endpoints (EP) increases, the difficulty level of layout increases, and the layout processing time increases.
Next, [4] the sum of timing slacks (TPS) is obtained by recognizing each timing path contained in a node of interest and the result of STA (static timing analysis) of each timing path by referring to the netlist NL and the timing information TM. The result of STA is obtained beforehand in a circuit design stage and stored as the timing information TM. As the sum of timing slacks (TPS) increases, the difficulty level of timing convergence increases, and the layout processing time increases.
Thus, the layout processing time cost (RT) is calculated by the function of four variables comprised of [1] Pin/Net, [2] the sum of clock speeds (CKSUM), [3] the number of endpoints (EP), and [4] the sum of timing slacks (TPS). Specifically, for example, as expressed by equation (3), the variables are weighted by γ1 to γ4 to calculate RT.
RT=γ1×(Pin/Net)+γ2×CKSUM+γ3×EP+γ4×TPS (3)
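A sketch of equation (3) starting from raw counts is given below. It assumes that Pin/Net denotes the ratio of the number of pins to the number of nets, which the description does not state explicitly; the function name and the sample values are hypothetical.

```python
def layout_time_cost(n_pins, n_nets, cksum, n_endpoints, slack_sum,
                     gammas=(1.0, 1.0, 1.0, 1.0)):
    """Equation (3): weighted sum of Pin/Net, CKSUM, EP, and TPS for one node."""
    g1, g2, g3, g4 = gammas
    pin_per_net = n_pins / n_nets  # assumption: Pin/Net is a ratio
    return g1 * pin_per_net + g2 * cksum + g3 * n_endpoints + g4 * slack_sum


rt = layout_time_cost(n_pins=300, n_nets=100, cksum=10.0,
                      n_endpoints=5, slack_sum=2.0,
                      gammas=(1.0, 2.0, 3.0, 4.0))
```

In practice the four variables have very different magnitudes, so the weights γ1 to γ4 would also have to absorb any normalization.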
[C] Noise Cost (NS)
The noise cost (NS) is an index representing the possibility of degradation of chip performance due to the occurrence of local simultaneous-switching noise; the larger this value, the more likely problems are assumed to be. The value of NS is calculated, for example, by detecting the number of flip-flops triggered by the same clock by referring to the netlist NL. In particular, it is calculated by detecting the number of flip-flops that are in the fan-out of the same clock gating cell.
With [A] to [C], the objective function G expressed by equation (2) is calculated. Here, assume that a semiconductor device to be designed has, for example, a plurality of timing constraints. That is, for example, the semiconductor device to be designed has a mode in which it operates at a certain frequency and a mode in which it operates at another frequency.
As shown in
With the thus calculated objective function G, the computer system expands nodes in parallel so as to equalize the respective objective function G values of the nodes.
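One way to picture this equalizing expansion is a greedy loop that always grows the node whose objective function value is currently smallest. The sketch below is an assumption-laden simplification: the flip-flop connectivity is given as a hypothetical adjacency dictionary, each flip-flop contributes a fixed additive amount ff_cost to G, and the nodes are grown one at a time rather than truly in parallel.

```python
import heapq


def trace(adjacency, ff_cost, seeds):
    """Grow one node per seed, always expanding the node with the smallest G.

    adjacency: flip-flop -> iterable of preceding/subsequent flip-flops.
    ff_cost:   flip-flop -> its (assumed additive) contribution to G.
    Returns (owner, value): the seed owning each flip-flop, and G per seed.
    """
    owner = {s: s for s in seeds}
    value = {s: ff_cost[s] for s in seeds}
    frontier = {s: set(adjacency[s]) for s in seeds}
    heap = [(value[s], s) for s in seeds]
    heapq.heapify(heap)
    while heap:
        _, seed = heapq.heappop(heap)
        free = sorted(ff for ff in frontier[seed] if ff not in owner)
        if not free:
            continue  # surrounded by other nodes: this node cannot expand further
        ff = free[0]
        owner[ff] = seed
        value[seed] += ff_cost[ff]
        frontier[seed].discard(ff)
        frontier[seed].update(adjacency[ff])
        heapq.heappush(heap, (value[seed], seed))
    return owner, value


# Six flip-flops in a chain, seeds at both ends, unit cost each (illustrative).
chain = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"],
         "d": ["c", "e"], "e": ["d", "f"], "f": ["e"]}
owner, value = trace(chain, {ff: 1.0 for ff in chain}, ["a", "f"])
```

Here the two nodes meet in the middle of the chain with equal objective function values; a node that runs out of free neighbors simply drops out of the queue.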
The trace shown in
The first case completely maintains the logical hierarchy. In this case, the flow of
For example,
A merge is performed preferentially at a location where the edge cost is low (i.e., the correlation between the corresponding nodes is high). In the example of
Thus, with the semiconductor device design method according to the first embodiment, it is possible to obtain a plurality of division blocks equalized comprehensively including processing time and quality and to search for an optimal solution to the range of each division block and the number of division blocks. Therefore, by laying out each division block in parallel processing based on this result, it is possible to shorten the layout processing time. Further, by performing floorplan or allocation to a plurality of semiconductor chips based on this result, it is possible to perform optimization including the quality of the semiconductor device and the layout processing time. Thus, it is possible to optimize the layout design from the comprehensive viewpoint.
Second Embodiment
In the second embodiment, description will be made as to the application of the design method according to the first embodiment to parallel automatic layout using a plurality of computer systems having different processing capabilities. In the first embodiment, division is performed so as to equalize the respective objective function values (including layout processing time) of the nodes. However, in the case where distributed processing hardware devices have different specs, the processing time may be shortened if the respective objective function values of the nodes have a predetermined ratio according to the different specs. Accordingly, in a semiconductor device design method according to the second embodiment, appropriate division is performed in consideration of the specs (CPU, memory) of distributed processing hardware devices, and each processing is assigned to the respective hardware device.
For example, the hardware specs of the computer systems for performing automatic layout are as follows.
CPU1: cpuf=100 MHz, Memory=4 GB
CPU2: cpuf=200 MHz, Memory=8 GB
CPU3: cpuf=300 MHz, Memory=16 GB
CPU4: cpuf=400 MHz, Memory=32 GB
In this case, in terms of the CPU specs, the ratio among the processing capabilities of the CPUs is, for example, as follows. CPU1:CPU2:CPU3:CPU4=1:2:3:4
In this case, for example, CPU4 has processing capability four times as high as that of CPU1 and can therefore process a node having an objective function value four times as high as that of CPU1 within the same layout processing time. Accordingly, in a first method for semiconductor device design according to the second embodiment, in the trace (S103) and the merge (S104) in the flow of
Alternatively, in a second method, the systems may perform control so as to equalize the respective objective function values of the nodes in the same way as in the first embodiment and change the number of nodes finally assigned to each CPU. For example, in the case of ten nodes obtained as the final solution, one, two, three, and four nodes are assigned to CPU1, CPU2, CPU3, and CPU4, respectively. This poses no problem if the resources are fixed in advance; however, when resources are shared through management software such as LSF (Load Sharing Facility), the usable resources change dynamically, so this is dealt with by spec equalization or by specifying the number of blocks.
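The second method, in which equalized nodes are handed out in proportion to processing capability, can be sketched as follows. The function name nodes_per_cpu, the use of integer capability values (here the clock frequency in MHz), and the largest-remainder rule for node counts that do not divide evenly are assumptions for illustration.

```python
def nodes_per_cpu(capability, n_nodes):
    """Split n_nodes among CPUs in proportion to their (integer) capability."""
    total = sum(capability.values())
    counts = {cpu: n_nodes * cap // total for cpu, cap in capability.items()}
    leftover = n_nodes - sum(counts.values())
    # Hand any remaining nodes to the CPUs with the largest fractional share.
    by_remainder = sorted(capability,
                          key=lambda cpu: (-(n_nodes * capability[cpu] % total), cpu))
    for cpu in by_remainder[:leftover]:
        counts[cpu] += 1
    return counts


specs = {"CPU1": 100, "CPU2": 200, "CPU3": 300, "CPU4": 400}
assignment = nodes_per_cpu(specs, 10)
```

With ten nodes and a 1:2:3:4 capability ratio this reproduces the one, two, three, and four nodes of the example.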
Thus, with the semiconductor device design method according to the second embodiment, in addition to the various effects described in the first embodiment, it is possible to shorten the layout processing time even if a plurality of computer systems having different hardware specs perform automatic layout.
Third Embodiment
In the third embodiment, the design method of
In the trace, the computer system repeats the loop processing of trace graph generation (S2004), objective-function calculation (S2005), and node expansion (S2006) until the number of remaining seeds X≦XI×K (S2007). K is an arbitrary value between 0 and 1 (0<K<1). That is, the computer system converts a node that meets a predetermined condition into a subgraph and continues to expand nodes until the number of remaining seeds which have not yet been converted into a subgraph decreases to a predetermined rate while expanding nodes so as to equalize the respective objective function values of the nodes in the same way as in the first embodiment. That is, as the trace proceeds, the number of subgraphs S increases, and the number of remaining seeds X decreases accordingly. The subgraph refers to a node that reaches the following state. All perimeters of the node come into contact with other nodes etc. in the process of node expansion and cannot expand any further. If the number of remaining seeds decreases to the predetermined rate, the computer system exits the loop and updates the reference value XI with the number of currently remaining seeds X (S2008).
Then, after setting a reference value NI=X+S (S2009), the computer system performs a merge. In the merge, the computer system repeats the loop processing of merge graph generation (S2010), edge cost calculation (S2011), and subgraph merge (S2012) until the number of nodes N≦NI×J (S2013). J is an arbitrary value between K and 1 (K<J<1). That is, the computer system merges adjacent subgraphs of a plurality of subgraphs generated by the trace. With the merge, the number of subgraphs decreases, and the number of nodes N (the sum of the number of subgraphs S and the number of remaining seeds X) also decreases accordingly. If the number of nodes N decreases to a predetermined rate, the computer system exits the loop.
After exiting the loop, the computer system calculates a total cost using equation (1) as in the first embodiment (S2014). If the total cost is higher than the previously calculated total cost (i.e., the total cost has worsened), the previously calculated total cost is an optimal solution, the previous number of nodes N is an optimal division number, and the boundary of each node is an optimal division boundary (S2016). On the other hand, if the total cost is lower than the previously calculated total cost (i.e., the total cost has improved), the computer system returns to S2004 and performs a trace again. In the trace, with the number of currently remaining seeds X as the reference value XI, conversion into the subgraph is performed in the remaining seeds, and the trace is continued until the number of remaining seeds decreases to the predetermined rate. Subsequently, in the same way, with the current number of nodes as the reference value NI, subgraphs are merged until the number of nodes decreases to the predetermined rate. Accordingly, as shown in S200 in
Then, the first merge is performed on the subgraphs SGH until the number of nodes N decreases from 16 (before the merge) to 11 (about 0.7 times 16). The number of nodes N is the sum of the number of remaining seeds X and the number of subgraphs S, and the number of remaining seeds X cannot be changed; therefore, the merge is performed until the number of subgraphs S decreases from 8 to 3. Subsequently, in the same way, the second trace is performed until the number of remaining seeds X decreases to 4 (about 0.5 times the number before the trace), and the second merge is performed until the number of nodes N decreases to 7 (about 0.7 times the number before the merge). The subsequent traces and merges are performed in the same way.
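The counts in this example can be reproduced with a short calculation. The sketch below assumes that the rates are applied with rounding down, that every seed leaving the trace becomes exactly one subgraph, and that the example starts from 16 seeds with K=0.5 and J=0.7; these assumptions are inferred from the numbers above rather than stated in the description.

```python
import math


def division_schedule(initial_seeds, k, j, rounds):
    """Return (remaining seeds X, N before merge, N after merge) for each round."""
    x, s = initial_seeds, 0
    history = []
    for _ in range(rounds):
        new_x = math.floor(x * k)   # trace until X <= XI * K
        s += x - new_x              # each finished seed becomes a subgraph
        x = new_x
        n_before = x + s
        # Merging only removes subgraphs, so N cannot drop below X.
        n_after = max(x, math.floor(n_before * j))
        s = n_after - x
        history.append((x, n_before, n_after))
    return history


flat_example = division_schedule(16, k=0.5, j=0.7, rounds=2)
```

The first round leaves 8 remaining seeds and reduces N from 16 to 11; the second leaves 4 remaining seeds and reduces N from 11 to 7.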
On the other hand, in the trace graph, as illustrated in
Subsequently, in the first trace, each node NDE with each seed as an origin expands in stages, and a node that has reached the limit of expansion becomes a subgraph SGH. In the case of maintaining the logical hierarchy, unlike the flat hierarchy shown in
Then, the first merge is performed on the subgraphs SGH until the number of nodes N decreases from 27 (before the merge) to 20 (about 0.75 times 27). The number of nodes N is the sum of the number of remaining seeds X and the number of subgraphs S, and the number of remaining seeds X cannot be changed; therefore, the merge is performed until the number of subgraphs S decreases from 14 to 7. In this example, e.g., three subgraphs SGH (modules MD) are merged into one subgraph SGH, and two subgraphs SGH (modules MD) are merged into one subgraph SGH.
Subsequently, the second trace is performed until the number of remaining seeds X decreases to 6 (about 0.5 times the number before the trace). At this time, for example, the subgraph SGH generated by merging the three modules MD is moved to a higher hierarchy and traced. Then, the second merge is performed until the number of nodes decreases to 15 (about 0.75 times the number before the merge). In this example, in addition to merges in the module hierarchy as in the first merge, merges in the subblock hierarchy are performed, for example, two subgraphs SGH (subblocks SBLK) are merged into one subgraph SGH. Subsequently, in the same way, as shown in
Thus, whether the hierarchy is flattened or the logical hierarchy is maintained, the total cost is calculated each time the number of nodes N is decreased. In the end, the number of nodes N that gives the best total cost is the optimal division number, and the boundary of each node at that point is an optimal division boundary. Therefore, by performing automatic layout in parallel based on these division units, it is possible to shorten the layout processing time. Further, by performing floorplanning, allocation to chips, or the like based on these division units, it is possible to optimize the layout design comprehensively, including both the layout processing time and the quality of the semiconductor device.
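The stopping rule described here (keep reducing the number of nodes, evaluate the total cost each time, and keep the division with the best cost) can be sketched as follows. This is a minimal sketch under stated assumptions: a lower total cost is treated as better, and the names `find_best_division`, `next_round` and `total_cost` are hypothetical. The toy cost used in the usage example is a placeholder for illustration only and is not the total cost defined in the embodiments.

```python
def find_best_division(initial, next_round, total_cost, max_rounds=100):
    """Repeat trace-and-merge rounds until the total cost worsens.

    initial    -- starting division (here simply a node count)
    next_round -- hypothetical callable performing one trace and merge
    total_cost -- hypothetical callable; lower is assumed to be better
    Returns the best division seen and its cost.
    """
    best = initial
    best_cost = total_cost(initial)
    for _ in range(max_rounds):  # guard against a cost that never worsens
        candidate = next_round(best)
        cost = total_cost(candidate)
        if cost > best_cost:
            # The cost has worsened, so the previous division is kept.
            break
        best, best_cost = candidate, cost
    return best, best_cost

if __name__ == "__main__":
    # Toy stand-ins: each round shrinks N to about 0.7 times its value,
    # and the placeholder cost is smallest at N = 7.
    shrink = lambda n: int(0.7 * n)
    toy_cost = lambda n: abs(n - 7)
    print(find_best_division(16, shrink, toy_cost))
```

With these toy stand-ins the node count goes 16, 11, 7, 4; the cost rises at 4, so the division with N = 7 is returned.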
While the invention made by the present inventors has been described above specifically based on the illustrated embodiments, the present invention is not limited thereto, and various changes and modifications can be made without departing from the spirit and scope of the invention.
The semiconductor device design method according to the above embodiments is particularly effective when applied to the layout design of a semiconductor device, such as a microcomputer, that contains a mix of circuit blocks having different functions; however, it is not limited thereto and can be widely used as a layout design method for various semiconductor devices.
Claims
1. A semiconductor device design method allowing a computer system to execute, in layout design of a semiconductor device including a plurality of flip-flop circuits and combinational circuits coupled as appropriate among the flip-flop circuits:
- a first step of allocating the flip-flop circuits and the combinational circuits to N blocks so as to equalize respective objective function values of the blocks, with a predetermined reference value as a target, by referring to a netlist of the semiconductor device,
- wherein an objective function for each block includes a first variable reflecting timing information of a circuit contained in a respective block.
2. The semiconductor device design method according to claim 1, wherein the timing information contains clock frequency information for the flip-flop circuits.
3. The semiconductor device design method according to claim 1, wherein the timing information contains information about a result of performing static timing analysis on a timing path through the combinational circuits among the flip-flop circuits.
4. The semiconductor device design method according to claim 1, wherein the objective function for each block further includes a second variable reflecting the number of flip-flop circuits triggered by a same clock and contained in the respective block.
5. The semiconductor device design method according to claim 4, wherein the objective function for each block further includes a third variable reflecting the magnitude of power consumption of each cell in the circuit contained in the respective block.
6. The semiconductor device design method according to claim 1, wherein the computer system further executes a second step of performing floorplan, with the N blocks generated in the first step as a unit.
7. The semiconductor device design method according to claim 1, wherein the computer system further executes a third step of performing automatic layout processing in parallel using a plurality of CPUs, with the N blocks generated in the first step as a parallel processing unit.
8. A semiconductor device design method allowing a computer system to execute, in layout design of a semiconductor device including a plurality of flip-flop circuits and combinational circuits coupled as appropriate among the flip-flop circuits:
- a first step of selecting M flip-flop circuits from among the flip-flop circuits by referring to a netlist of the semiconductor device and setting the M flip-flop circuits as seeds;
- a second step of expanding each seed in parallel so as to equalize respective objective function values while taking in, step by step, a flip-flop circuit located in a preceding or subsequent stage for each of the M seeds as an origin, converting a seed that satisfies a first condition in the process of expansion into a subgraph, and continuing to expand each seed until the number of remaining seeds which have not yet become a subgraph decreases to a first rate;
- a third step of merging subgraphs until the sum of the number of remaining seeds and the number of subgraphs decreases to a second rate;
- a fourth step of calculating a total cost based on the respective objective function values of the remaining seeds and the subgraphs and the number of timing paths of a circuit that does not belong to the remaining seeds or the subgraphs; and
- a fifth step of repeating the second to fourth steps until the total cost worsens,
- wherein each objective function includes a first variable reflecting timing information of a circuit contained in the expansion range of each seed.
9. The semiconductor device design method according to claim 8,
- wherein the second step is performed in a state where a logical hierarchy of the netlist is flat, and
- wherein the first condition holds in the case where the seed cannot expand any further due to contact with the expansion range of another seed.
10. The semiconductor device design method according to claim 8,
- wherein the second step is performed in a state where a logical hierarchy of the netlist is maintained, and
- wherein the first condition holds in the case where the seed cannot expand any further due to contact with the boundary of a logical hierarchy.
11. The semiconductor device design method according to claim 8, wherein the timing information contains clock frequency information for the flip-flop circuits.
12. The semiconductor device design method according to claim 8, wherein the timing information contains information about a result of performing static timing analysis on a timing path through the combinational circuits among the flip-flop circuits.
13. The semiconductor device design method according to claim 8, wherein the objective function further includes a second variable reflecting the number of flip-flop circuits triggered by a same clock and contained in the expansion range of each seed.
14. The semiconductor device design method according to claim 13, wherein the objective function further includes a third variable reflecting the magnitude of power consumption of each cell in the circuit contained in the expansion range of each seed.
15. The semiconductor device design method according to claim 8, wherein in the first step, the computer system searches a logical hierarchy of the netlist toward a lower layer, detects lower layer blocks that are about the same in number as the M seeds, and sets a seed from each of the detected lower layer blocks.
16. The semiconductor device design method according to claim 15, wherein at the time of setting the seed from each of the detected lower layer blocks, the computer system detects, from each of the detected lower layer blocks, flip-flop circuits for input or output with the outside of the lower layer block, and sets a flip-flop circuit coupled through the largest number of stages from the flip-flop circuits as the seed.
17. The semiconductor device design method according to claim 8, wherein the computer system further executes a sixth step of recognizing the remaining seeds and the subgraphs of the best total cost, using a result of the fifth step and performing floorplan, with each of the remaining seeds and the subgraphs of the best total cost as a block unit.
18. The semiconductor device design method according to claim 8, wherein the computer system further executes a seventh step of recognizing the remaining seeds and the subgraphs of the best total cost, using a result of the fifth step and performing automatic layout processing in parallel using a plurality of CPUs, with each of the remaining seeds and the subgraphs of the best total cost as a parallel processing unit.
Type: Application
Filed: Oct 17, 2010
Publication Date: Apr 21, 2011
Applicant:
Inventors: Koki TSURUSAKI (Kanagawa), Satoshi Shibatani (Kanagawa)
Application Number: 12/906,117
International Classification: G06F 17/50 (20060101);