SYSTEM AND METHOD FOR ELECTRONIC CIRCUIT SIMULATION
A system and method transform a model of an electronic circuit to improve simulation speed and/or reduce emulation area. The model may include storage elements; one or more of these storage elements may be represented by dense memory, and the storage elements may be represented by references thereto.
This application claims priority to and the benefit of, and incorporates by reference herein in their entirety, U.S. Provisional Patent Applications Ser. No. 63/235,287 and Ser. No. 63/235,283, both filed Aug. 20, 2021, in the name of Steven F. Hoover.
BACKGROUND
A design of an electronic circuit, such as a microprocessor, application-specific integrated circuit (ASIC), or any other such circuit, may be created by first describing it using a hardware-description language (HDL), such as the Very High Speed Integrated Circuit (VHSIC) Hardware Description Language (VHDL). The HDL description may be verified for correct operation by using it to create a corresponding simulation of the electronic circuit. The simulation may be tested by applying patterns of signals to inputs of all or part of the programmable computer hardware and by observing its corresponding output signals.
For a more complete understanding of the present disclosure, reference is now made to the following description taken in conjunction with the accompanying drawings.
The present disclosure relates to software-based simulation of a hardware-description language (HDL) model of an electronic circuit, such as a microprocessor, application-specific integrated circuit (ASIC), or any other such circuit. In particular, the present disclosure relates to using one or more of the techniques described herein to (a) increase the speed of the emulation of the HDL model, (b) reduce the number of circuit elements (i.e., the amount of digital-logic circuitry) required to model the behavior of the HDL model, and/or (c) reduce the area required to implement the HDL model using the programmable computer hardware. For example, use of one or more of the techniques described herein may determine a transformed HDL model required to simulate the electronic circuit, thereby increasing the speed and/or reducing the cost, complexity, size, required power, and/or build time of the emulation system. The resultant simulation system may be tested by applying patterns of signals to inputs of all or part of the model and by observing some or all corresponding output signals.
In various embodiments of the present disclosure, one or more of the circuit-element transformation techniques described herein (as described in greater detail below with reference to, for example,
Referring first to
A model-synthesis system 104 may process the transformed HDL description 122 to determine a synthesized HDL description 124. While the HDL description 120 may contain (for example) Boolean logic equations and similar statements, the synthesized HDL description 124 may contain corresponding representations of circuit elements (e.g., AND and OR gates that implement a Boolean logic equation). The model-synthesis system 104 may perform basic logic optimization (e.g., combining or merging circuit elements when possible) and/or simple retiming (e.g., reconfiguring circuit elements to solve a critical-path or race condition). Examples of model-synthesis systems include the STRATUS and GENUS systems provided by Cadence Design Systems of San Jose, Calif., USA. One of skill in the art will understand that any model-synthesis system is within the scope of the present disclosure.
A model-emulation system 106 may process the synthesized HDL description 124 to program one or more programmable hardware components (e.g., FPGAs) of a hardware-based emulator 108. The hardware-based emulator 108 may further contain software component(s) for providing inputs to, and processing outputs from, the one or more programmable hardware components. The hardware-based emulator 108 is described in more detail with reference to
The model-transformation system 102, the model-synthesis system 104, and the model-emulation system 106 may communicate with each other and/or a user device 110 via a network 100; this network 100 may be a wired, wireless, or any other type of network. A user 112 may provide input 132 to the user device 110 to initiate transformation of the HDL description 120 by the model-transformation system 102, to initiate synthesis of the transformed HDL description 122 by the model-synthesis system 104, and/or to initiate programming and/or execution of the synthesized HDL description 124 by the model-emulation system 106. The user 112 may further receive output 130 from the user device 110 indicating feedback from the systems 102, 104, 106. The user 112 may further interact with the model-emulation system 106 via the user-device input 132 and the user-device output 130 to initialize emulation, execute emulation, and/or view the results of an emulation.
Any number of user devices 110 and users 112 may so communicate with the systems 102, 104, 106. The systems 102, 104, 106 may be disposed on one or more remote systems 1100, as shown in
Waveform viewers of the user device 110 may be used to represent model output data, such as signal traces, as signal waveforms. Waveform viewers may be categorized as timeline views, where time is represented on one of the two axes of the two-dimensional display, and the state of the model is represented on the other axis. A state view may be used to represent machine state at a point in time. State viewers may provide interactive controls to allow a user to adjust which period of time is displayed. In synchronous logic, controlled by a clock, time may be expressed discretely as clock cycles. Timeline views and state views are not a mutually exclusive categorization. More generally, state views may represent a window of time in the neighborhood of a reference time. A timeline view with a reference time, such as a waveform view with a cursor (vertical line at the reference time), can also be considered a state view.
The outputs of the emulator 108 may further be used to represent emulation behavior as a state view by, for example, annotating displayed wires/arcs with values from the emulation. Unlike waveform viewers, these representations may require knowledge of the emulation, not just the signal trace.
In some embodiments, the model-transformation system 102 receives emulation-environment data 126 from the model-emulation system 106. This data 126 may include indications of the number and types of programmable hardware components available in the hardware-based emulator 108, in addition to latencies that exist therebetween. As explained in greater detail below, the emulation-environment data 126 may be used to determine partition boundaries in the transformed HDL description 122. By determining partition boundaries that correspond to the hardware-emulator latencies, the overall cycle time of the emulator may be reduced (that is, its clock frequency may be increased).
A model-partitioning component 142 may parse the HDL description 120 to determine two or more partitions, each with corresponding partition boundaries and associated groups of circuit elements. As described in greater detail below with respect to
A sequence-of-storage-elements determination component 144 may parse the HDL description 120 to identify groups of circuit elements that correspond to one or more circuit-element transformations (candidate transformations are explained in greater detail with respect to
A circuit-element transformation component 146 may apply one or more transformations as directed by the sequence-of-storage-elements determination component 144 and/or the storage-elements size-reduction component 148. Further details of the various transformations are described in greater detail below with respect to
The storage-element size-reduction component 148 may process the HDL description 120 to identify structures of storage elements, such as FIFOs and/or queues, and absorb (e.g., merge) stand-alone storage elements, such as flip-flops, into the FIFOs and/or queues. The storage-element size-reduction component 148 may further perform one or more transforms such that additional storage elements are electrically connected adjacent to the FIFOs and/or queues. Further details of the operation of the storage-element size-reduction component 148 are described in greater detail below with respect to
A storage-element absorption component 150 may perform the absorption of the stand-alone storage elements into the FIFOs and/or queues. Further details of the storage-element absorption component 150 are described in greater detail below with respect to
A signal-remapping component 152 may be used to map signals created by the model-transformation system 102 in the transformed HDL description 122 to signals in the original HDL description 120. The user 112 may, for example, provide input 132 to the user device 110 requesting a value of a signal in the HDL description 120 at a particular time. If that signal was transformed to use different circuit elements, the signal-remapping component 152 may recursively apply corresponding reverse transforms to derive the value of the signal of the HDL description 120 from the values of signals of the transformed HDL description 122. Further details of the signal-remapping component 152 are described in greater detail below with respect to
The transformed model 304 (created by processing the transformed HDL description 122), as described in greater detail herein, may be divided into two or more partitions 310 (by, for example, the model-partitioning component 142). The number of partitions may be M and, in some embodiments, equals the number N of processing units in the HDL model 302, with each partition corresponding to a processing unit (e.g., N=M). In other embodiments, as shown in
Referring to
A second data-altering element A 404 may include, for example, a first element 1 440 (e.g., an AND gate), a second element 2 442 (e.g., an inverter), and a third element 3 444 (e.g., an OR gate). The second data-altering element A 404 may receive inputs 420, 422 and fixed input 424 and determine an output 416 and a fixed output 428. The present disclosure is not limited to data-altering elements with only the illustrated circuit elements; data-altering elements containing any number or type of similar circuit elements, and having any number of inputs or outputs, are within its scope.
The data-altering elements B 404 include a fixed input 424 and a fixed output 428. As the term is used herein, a “fixed” input or output is a signal that may not be transformed, migrated, or otherwise retimed using the techniques described herein. Examples of fixed inputs or outputs include processor (or other discrete element) inputs or outputs, inputs or outputs received from or sent to non-programmable hardware or unchangeable software, or other such inputs or outputs. As explained below with reference to
Referring to
In
Although only two such transformations are shown in
Referring to
Referring to
As described above, the sequence-of-storage-elements determination component 144 may determine that the sequence of storage elements formed by the storage elements 502, 504, 506, 508 is to be disposed so as to correspond to a connection between programmable hardware components. The disposition of the sequence of storage elements may thus wholly or partially compensate for the limit to clock cycle time imposed by the connection. For example, if the latency between two FPGAs is 100 clock cycles, the sequence of storage elements may effectively “pipeline” that latency such that a higher clock frequency is possible.
Referring first to
The transformation includes removing the storage element 666b and directly connecting data-altering elements 674 to data-altering elements 676. To compensate for this removal, an additional storage element 672a is added to the input to the data-altering elements 674 and additional storage elements 672b, 672c are added to the feedback outputs of the data-altering elements 676 and the storage element 666c. If the circuit elements include additional groups of data-altering elements, corresponding additional storage elements may be added in the same locations.
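The effect of moving a storage element across data-altering elements can be illustrated with a minimal Python sketch. It is a simplification made under stated assumptions, not the disclosed transformation itself: it models a single feed-forward path with no feedback outputs, and the names simulate_original, simulate_retimed, logic_a, and logic_b are hypothetical. The sketch only shows that the two arrangements produce the same output sequence when the relocated storage element is given a consistent initial value.

```python
from typing import Callable, List

Logic = Callable[[int], int]

def simulate_original(inputs: List[int], logic_a: Logic, logic_b: Logic,
                      reg_init: int) -> List[int]:
    """Model: input -> logic_a -> storage element -> logic_b -> output."""
    reg = reg_init
    outputs = []
    for x in inputs:
        outputs.append(logic_b(reg))  # output depends on the stored value
        reg = logic_a(x)              # storage element captures at the clock edge
    return outputs

def simulate_retimed(inputs: List[int], logic_a: Logic, logic_b: Logic,
                     reg_init: int) -> List[int]:
    """Model: input -> storage element -> logic_a -> logic_b -> output."""
    reg = reg_init
    outputs = []
    for x in inputs:
        outputs.append(logic_b(logic_a(reg)))  # both logic blocks now follow the register
        reg = x                                # storage element captures the raw input
    return outputs

# Hypothetical data-altering elements chosen only for illustration.
logic_a = lambda v: v ^ 0b1010
logic_b = lambda v: v & 0b0110

stimulus = [3, 7, 12, 5, 9]
# The original register must start at logic_a(retimed initial value) for equivalence.
before = simulate_original(stimulus, logic_a, logic_b, reg_init=logic_a(0))
after = simulate_retimed(stimulus, logic_a, logic_b, reg_init=0)
```

In this sketch the initial-state condition is the only bookkeeping required; a circuit with feedback paths, as in the text, would need additional storage elements on those paths.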
The circuit-element transformation component 146 may thus transform the data-altering elements 680 into two (potentially non-overlapping) subsets: a first subset of data-altering elements 680a required to determine the fixed output 688 using the inputs 682 and fixed inputs 684 and a second subset of data-altering elements 680b that does not receive the fixed input 684 nor determine the fixed output 688 yet determines the output 686. The storage element 696b, data-altering elements 680b, and/or storage element 696c may thus be retimed using one or more of the techniques described herein.
As shown in
Referring to
Embodiments of the present invention thus store only a reference value in the flip-flops (or other such less-dense storage elements); this reference indicates an entry in dense data storage elements 714, which hold the actual data values. The dense data storage elements 714 may include dense storage such as a computer-memory structure (e.g., an array, FIFO, and/or queue implemented in computer memory), dedicated array hardware on the FPGA, or other such dense memory. “Dense” storage is defined as any storage that is capable of storing a number of data signals in such a way that the dense storage consumes less FPGA area and/or resources than less-dense storage (e.g., flip-flops) configured to store the same number of data signals.
In various embodiments, a data-reference determination component 716 may receive input data 722 and may generate reference data corresponding to an entry in the dense data-storage elements 714 holding the data 722. The size of the reference data (e.g., the number of bits required for the reference value) may correspond to the number of entries required in the dense data-storage elements 714. The number of entries may depend on the number of cycles of latency of the data-preserving elements 702, 706.
The dense data-storage elements 714 may store the input data in an entry corresponding to the determined reference value. The data-reference storage elements 712 may then process the determined reference value by, for example, storing it in a sequence of flip-flops corresponding to the original latency of the data-preserving elements 702, 706. The data-reference storage element 712 may thus have the same topology as the original data-preserving elements 702, 706. The topology of a circuit refers to the particular types of circuit elements therein and the particular connections therebetween. For example, if the data-preserving elements 702, 706 include one or more multiplexers, branches, joins, queues, FIFOs, etc. connected a particular way, the data-reference storage element 712 may have corresponding elements connected the same way. The data-reference storage element 712 may differ from the data-preserving elements 702, 706 only in the number of storage elements contained therein; the data-reference storage element 712 may have, for example, far fewer flip-flops than the data-preserving elements 702, 706 (at least because the data-reference storage element 712 stores only references to the data, while the data-preserving elements 702, 706 store the actual data).
When the determined reference data 718 is output by the data-reference storage elements 712, it may be used to indicate the corresponding data value as stored in the dense data-storage elements 714. The dense data-storage elements 714 may then output the input data 722 as delayed input data 724 (e.g., delayed in accordance with the original latency), which may then be processed by the data-altering elements 704, 708.
Continuing the above example, if the input data is 1024 bits wide and if the latency of the data-preserving elements 702, 706 is ten cycles, the reference data may be determined to be 10 bits in size to address at least 10 entries in the dense data-storage elements 714. The number of flip-flops required for the data-reference storage elements 712 may thus be 10×10=100 flip-flops (as compared to the 10,240 flip-flops required to implement the data-preserving elements 702, 706), resulting in a large savings in FPGA area/resources.
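The flip-flop counts in this example can be checked with a short Python sketch. The function names flip_flop_counts and min_reference_bits are hypothetical and are introduced only for illustration. The second function reflects a general fact rather than anything stated in the text: ten entries can be addressed with as few as four bits, so the 10-bit reference used in the example is more than sufficient.

```python
def flip_flop_counts(data_width_bits: int, latency_cycles: int,
                     reference_bits: int) -> tuple:
    """Return (flip-flops storing data directly, flip-flops storing references)."""
    direct = data_width_bits * latency_cycles
    by_reference = reference_bits * latency_cycles
    return direct, by_reference

def min_reference_bits(num_entries: int) -> int:
    """Smallest number of bits able to address num_entries distinct entries."""
    return max(1, (num_entries - 1).bit_length())

# Values taken from the example in the text: 1024-bit data, ten cycles, 10-bit reference.
direct, by_reference = flip_flop_counts(1024, 10, 10)
# Lower bound if the reference were sized to the minimum for ten entries.
_, minimal = flip_flop_counts(1024, 10, min_reference_bits(10))
```

With the values from the text, direct is 10,240 and by_reference is 100; sizing the reference to the four-bit minimum would give 40.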
Furthermore, while each entry in the dense data-storage elements 714 may be at least 1024 bits wide to store each item of input data 722, the dense data-storage elements 714 need not allocate additional entries to account for the latency of the data-preserving elements 702, 706; the dense data-storage elements 714 may store each item of input data 722 only once and then output corresponding delayed input data 724 when the reference data 718 so indicates. In other words, while the data-preserving elements 702, 706 may include a large number of flip-flops to store each of the 1024 bits of the input data 722 in each stage of a pipeline as it propagates through a pipeline, the dense data-storage elements may store the 1024 bits only once and model the delay of the pipeline by outputting the input data 722 only when indicated by the reference data 718.
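The behavior described above, in which each data item is stored once and only a reference moves through the pipeline stages, might be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the disclosed implementation: the class name ReferenceDelayLine, the round-robin allocation of entries, and the use of None for an empty stage are all hypothetical choices made for illustration.

```python
from collections import deque

class ReferenceDelayLine:
    """Delay line that keeps wide data in dense storage and shifts only references."""

    def __init__(self, latency: int):
        self.latency = latency
        # Dense storage: one entry per item in flight (assumed round-robin allocation).
        self.dense = [None] * latency
        # Stand-in for the chain of flip-flops; each stage holds a small reference.
        self.refs = deque([None] * latency)
        self.next_ref = 0

    def clock(self, data_in):
        """Advance one clock cycle; return the data delayed by `latency` cycles."""
        # The reference leaving the chain selects the dense entry to output.
        ref_out = self.refs.popleft()
        data_out = None if ref_out is None else self.dense[ref_out]
        # Store the new item once and push only its reference into the chain.
        ref_in = self.next_ref
        self.dense[ref_in] = data_in
        self.next_ref = (self.next_ref + 1) % self.latency
        self.refs.append(ref_in)
        return data_out

line = ReferenceDelayLine(latency=3)
delayed = [line.clock(value) for value in range(6)]
```

Because an entry is read before it is overwritten in the same cycle, a number of entries equal to the latency is enough in this sketch; the data itself is never copied from stage to stage.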
Although
As the terms are used herein, a FIFO is a data structure that stores a first value in accordance with a control input 808 and FIFO/queue control logic 814 (e.g., a last-in or most-recently stored value) and outputs a second value in accordance with the control input 808 (e.g., a first-in or least-recently stored value). A queue is a data structure that maintains a first pointer to a “head” of the queue and a second pointer to the “tail” of the queue; when directed by the control signal 808, the queue returns a value at the head of the queue (and updates the first pointer accordingly) and stores a new value to the tail of the queue (and updates the second pointer accordingly).
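The head-and-tail pointer behavior of a queue can be illustrated with a minimal Python sketch. The class name PointerQueue, the fixed capacity, and the wrap-around indexing are assumptions made for illustration; the text does not specify how the pointers are implemented.

```python
class PointerQueue:
    """Fixed-capacity queue that tracks a head pointer and a tail pointer."""

    def __init__(self, capacity: int):
        self.entries = [None] * capacity
        self.head = 0    # index of the least-recently stored value
        self.tail = 0    # index at which the next value will be stored
        self.count = 0   # number of valid entries

    def push(self, value) -> None:
        """Store a new value at the tail and advance the tail pointer."""
        if self.count == len(self.entries):
            raise OverflowError("queue is full")
        self.entries[self.tail] = value
        self.tail = (self.tail + 1) % len(self.entries)
        self.count += 1

    def pop(self):
        """Return the value at the head and advance the head pointer."""
        if self.count == 0:
            raise IndexError("queue is empty")
        value = self.entries[self.head]
        self.head = (self.head + 1) % len(self.entries)
        self.count -= 1
        return value

queue = PointerQueue(capacity=3)
for item in (1, 2, 3):
    queue.push(item)
first = queue.pop()          # least-recently stored value leaves first
queue.push(4)                # tail wraps around to the freed entry
rest = [queue.pop() for _ in range(3)]
```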
Referring to
In various embodiments, FIFO control logic 826 may process the control inputs 808 to implement the first-in-first-out behavior of the FIFO described above. Because the absorbed storage elements 812a, 812b are not part of the FIFO, separate storage-element control logic 828 may be used to replicate the behavior of the storage elements 812a, 812b (e.g., capture the input data 806 and provide the output data 810).
The absorbing of the storage elements 812a, 812b may not consume all of the unused entries 804 in the FIFO 820; other storage elements may be further absorbed into the FIFO 820. Further, as illustrated, the FIFO 820 absorbs two connected storage elements 812a, 812b but, as described herein, transformations may be applied to create larger sequences of storage elements, which may also be absorbed into the FIFO 820.
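One way to picture the absorption is a single memory whose lower entries serve the FIFO and whose otherwise unused entries hold the values of the absorbed storage elements, with separate control logic for each region. The Python sketch below is an illustration under stated assumptions only: the class name FifoWithAbsorbedRegisters, the entry layout, and the treatment of the two absorbed elements as a two-stage chain are hypothetical, and a hardware memory would also need enough ports to perform these accesses in one cycle.

```python
class FifoWithAbsorbedRegisters:
    """One memory array shared by a FIFO region and two absorbed storage elements."""

    def __init__(self, total_entries: int, fifo_depth: int):
        if total_entries - fifo_depth < 2:
            raise ValueError("need at least two unused entries to absorb two elements")
        self.mem = [0] * total_entries
        self.fifo_depth = fifo_depth
        self.head = 0
        self.tail = 0
        self.count = 0
        # Unused entries above the FIFO region hold the absorbed elements (assumed layout).
        self.slot_a = fifo_depth
        self.slot_b = fifo_depth + 1

    # FIFO control logic: operates only on entries [0, fifo_depth).
    def fifo_push(self, value: int) -> None:
        if self.count == self.fifo_depth:
            raise OverflowError("FIFO is full")
        self.mem[self.tail] = value
        self.tail = (self.tail + 1) % self.fifo_depth
        self.count += 1

    def fifo_pop(self) -> int:
        if self.count == 0:
            raise IndexError("FIFO is empty")
        value = self.mem[self.head]
        self.head = (self.head + 1) % self.fifo_depth
        self.count -= 1
        return value

    # Separate storage-element control logic: replicates two chained flip-flop stages.
    def clock_registers(self, data_in: int) -> int:
        data_out = self.mem[self.slot_b]           # value leaving the second stage
        self.mem[self.slot_b] = self.mem[self.slot_a]
        self.mem[self.slot_a] = data_in            # first stage captures the input
        return data_out

shared = FifoWithAbsorbedRegisters(total_entries=6, fifo_depth=4)
shared.fifo_push(11)
shared.fifo_push(22)
register_outputs = [shared.clock_registers(v) for v in (5, 6, 7)]
popped = shared.fifo_pop()
```

In this sketch the two regions never touch each other's entries, so the FIFO behaves as before while the absorbed elements keep their two-cycle delay.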
Some original signals 920 of the HDL description 120 may not be transformed; these signals thus remain in the transformed HDL description 122 (represented in the figure as the model that results after N steps of transformation) and remapping of these signals is not necessary.
In some embodiments, original signals 922 are transformed by a single transform 1 910 to produce transformed signals 1 928. In other embodiments, multiple transforms may affect the same original signals. For example, original signals 926 may be affected by a transform 2 912; some of the resultant transformed signals 932 may appear in the transformed HDL description 122, while others of the resultant transformed signals 932 may be further transformed by a transform 3 914. The transform 3 914 may further transform additional original signals 924 to produce transformed signals 2 932. Other sequences of transforms that may re-transform already transformed signals and/or transform further original signals are within the scope of the present disclosure.
In various embodiments, as the transformations are applied, data representing the transforms and the input and output signals affected thereby are stored in a computer memory or similar structure. For example, at the transformed model step 1 902, data may be stored corresponding to the transform 2 912, the original signals 922, and the transformed signals 932. At the transformed model step 2 904, data may be stored corresponding to the transform 1 910, the original signals 926, and the transformed signals 1 928. Similarly, at the transformed model step 3 906, data may be stored corresponding to the transform 3 914, the original signals 924, the subset of the transformed signals 932 processed by the transform 3 914, and the transformed signals 4 932.
Given this storage of the transforms 910, 912, 914 and the original signals processed thereby, if and when the user 112 inputs an indication to the user device 110 to display a value of an original signal affected by one or more transforms, the signal-remapping component 152 may use this stored transform data to determine the transformed signals corresponding to the original signal and process the transformed signals by recursively applying reverse transformations corresponding to the performed transformations to derive a value of the original signal. For example, to determine a value of the original signals 4 926, the signal-remapping component 152 may determine, using the stored data, that the corresponding transformed signals are the transformed signals 2 930 (as transformed by the transform 2 912 and the transform 3 914). The signal-remapping component 152 may then process the transformed signals 2 930 by first applying a first reverse transformation corresponding to the transform 3 914 and then applying a second reverse transformation corresponding to the transform 2 912.
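The recursive application of reverse transformations might be sketched in Python as follows. Everything named here is hypothetical and chosen for illustration: the TransformRecord structure, the recover function, the signal names, and the two toy transforms (a bitwise inversion and an increment on 8-bit values) stand in for whatever transform data an actual system stores.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

@dataclass
class TransformRecord:
    name: str                      # label of the recorded transform
    source: str                    # signal consumed by the transform
    result: str                    # signal produced by the transform
    reverse: Callable[[int], int]  # maps a result value back to a source value

def recover(signal: str, records: Dict[str, TransformRecord],
            final_values: Dict[str, int],
            applied: Optional[List[str]] = None) -> int:
    """Derive the value of `signal` by recursively undoing recorded transforms."""
    if signal in final_values:
        # The signal survives in the transformed model; no remapping is needed.
        return final_values[signal]
    record = records[signal]  # transform that consumed this signal
    value = recover(record.result, records, final_values, applied)
    if applied is not None:
        applied.append(record.name)  # note the order in which reverses are applied
    return record.reverse(value)

# Toy chain: original_4 --transform_2--> intermediate --transform_3--> transformed_2
records = {
    "original_4": TransformRecord("transform_2", "original_4", "intermediate",
                                  reverse=lambda v: v ^ 0xFF),
    "intermediate": TransformRecord("transform_3", "intermediate", "transformed_2",
                                    reverse=lambda v: (v - 1) & 0xFF),
}
# Value observed in the transformed model for an original value of 0x3C.
final_values = {"transformed_2": ((0x3C ^ 0xFF) + 1) & 0xFF}

order: List[str] = []
original_value = recover("original_4", records, final_values, order)
```

In this sketch the reverse of the later transform is applied first and the reverse of the earlier transform second, mirroring the order described in the text.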
The present disclosure is not limited to only these types or numbers of transforms and to the reverse-mapping of these signals; one of skill in the art will understand that any number of transformation steps and any combination of reverse transforms are within its scope.
Referring to
Multiple servers may be included in the remote system 1100, such as one or more servers for emulating operation of an electronic circuit. In operation, each of these servers (or groups of servers) may include computer-readable and computer-executable instructions that reside on the respective server, as will be discussed further below. Each of these devices/systems 110/1100 may include one or more I/O device interfaces 1002/1102 for enabling communication over the network 100. Each of these devices/systems 110/1100 may include one or more controllers/processors 1004/1104, which may each include a central processing unit (CPU) for processing data and computer-readable instructions, and a memory 1006/1106 for storing data and instructions of the respective device. The memories 1006/1106 may individually include volatile random access memory (RAM), non-volatile read only memory (ROM), non-volatile magneto-resistive memory (MRAM), or other types of memory. Each device/system 110/1100 may also include a data-storage component 1008/1108 for storing data and controller/processor-executable instructions. Each data-storage component 1008/1108 may individually include one or more non-volatile storage types such as magnetic storage, optical storage, solid-state storage, etc. Each device/system 110/1100 may also be connected to removable or external non-volatile memory or storage (such as a removable memory card, memory key drive, networked storage, etc.) through respective input/output device interfaces 1002/1102. The user device 110 may further include an antenna 1012, microphone 1014, loudspeaker 1016, and/or display 1018.
Computer instructions for operating each device/system 110/1100 and its various components may be executed by the respective device's/system's controller(s)/processor(s) 1004/1104, using the memory 1006/1106 as temporary “working” storage at runtime. The computer instructions may be stored in a non-transitory manner in non-volatile memory 1006/1106, storage 1008/1108, and/or an external device(s). Alternatively, some or all of the executable instructions may be embedded in hardware or firmware on the respective device in addition to or instead of software.
Each device/system 110/1100 includes input/output device interfaces 1002/1102. A variety of components may be connected through the input/output device interfaces 1002/1102, as will be discussed further below. Additionally, each device/system 110/1100 may include an address/data bus 1010/1110 for conveying data among components of the respective device/system. Each component within a device/system 110/1100 may also be directly connected to other components in addition to (or instead of) being connected to other components across the bus 1010/1110.
The device 110 may include input/output device interfaces 1002 that connect to a variety of components such as an audio output component, a wired headset, or a wireless headset, or other component capable of outputting audio. The device 110 may also include an audio capture component. The audio capture component may be, for example, the microphone 1014 or array of microphones, a wired headset, or a wireless headset, etc.
Via antenna(s) 1012, the input/output device interfaces 1002 may connect to one or more networks 100 via a wireless local area network (WLAN) (such as WiFi) radio, Bluetooth, or wireless network radio, such as a radio capable of communication with a wireless communication network such as a Long Term Evolution (LTE) network, WiMAX network, 3G network, 4G network, 5G network, etc. A wired connection such as Ethernet may also be supported. Through the network(s) 100, the system may be distributed across a networked environment. The I/O device interface 1002/1102 may also include communication components that allow data to be exchanged between devices such as different physical systems in a collection of systems or other components.
Referring to
The network 100 may further connect user devices 110 such as a laptop computer 110a, a desktop computer 110b, a tablet computer 110c, and/or a smart phone 110d through a wireless service provider, over a WiFi or cellular network connection, or the like. Other devices may be included as network-connected support devices, such as a remote system 1100. The support devices may connect to the network 100 through a wired connection or wireless connection. Networked devices 110 may capture audio using one-or-more built-in or connected microphones 1014 or audio-capture devices, with processing performed by components of the same device 110 or another device/system 110/1100 connected via network 100. The concepts disclosed herein may be applied within a number of different devices and computer systems.
The above aspects of the present disclosure are meant to be illustrative. They were chosen to explain the principles and application of the disclosure and are not intended to be exhaustive or to limit the disclosure. Many modifications and variations of the disclosed aspects may be apparent to those of skill in the art. Persons having ordinary skill in the field of computers will understand that components and process steps described herein may be interchangeable with other components or steps, or combinations of components or steps, and still achieve the benefits and advantages of the present disclosure. Moreover, it should be apparent to one skilled in the art, that the disclosure may be practiced without some or all of the specific details and steps disclosed herein.
Aspects of the disclosed system may be implemented as a computer method or as an article of manufacture such as a memory device or non-transitory computer readable storage medium. The computer readable storage medium may be readable by a computer and may comprise instructions for causing a computer or other device to perform processes described in the present disclosure. The computer readable storage media may be implemented by a volatile computer memory, non-volatile computer memory, hard drive, solid-state memory, flash drive, removable disk or other media. In addition, one or more of the components and engines may be implemented in firmware or hardware, such as the acoustic front end, which comprises, among other things, analog or digital filters (e.g., filters configured as firmware to a digital signal processor (DSP)).
Conditional language used herein, such as, among others, “can,” “could,” “might,” “may,” “e.g.,” and the like, unless specifically stated otherwise, or otherwise understood within the context as used, is generally intended to convey that certain embodiments include, while other embodiments do not include, certain features, elements or steps. Thus, such conditional language is not generally intended to imply that features, elements, or steps are in any way required for one or more embodiments or that one or more embodiments necessarily include logic for deciding, with or without other input or prompting, whether these features, elements, or steps are included or are to be performed in any particular embodiment. The terms “comprising,” “including,” “having,” and the like are synonymous and are used inclusively, in an open-ended fashion, and do not exclude additional elements, features, acts, operations, and so forth. Also, the term “or” is used in its inclusive sense (and not in its exclusive sense) so that when used, for example, to connect a list of elements, the term “or” means one, some, or all of the elements in the list.
Disjunctive language such as the phrase “at least one of X, Y, Z,” unless specifically stated otherwise, is understood with the context as used in general to present that an item, term, etc., may be either X, Y, or Z, or any combination thereof (e.g., X, Y, or Z). Thus, such disjunctive language is not generally intended to, and should not, imply that certain embodiments require at least one of X, at least one of Y, or at least one of Z to each be present.
As used in this disclosure, the term “a” or “one” may include one or more items unless specifically stated otherwise. Further, the phrase “based on” is intended to mean “based at least in part on” unless specifically stated otherwise.
Claims
1. A computer-implemented method for simulating operation of an electronic circuit, the method comprising:
- receiving first data representing a hardware-description language (HDL) description of an electronic circuit;
- identifying, using the first data, a first description of a first storage element configured to process, during a first time period, an input signal to store data associated with the input signal and to output, during a second time period, an output signal representing the data;
- determining a computer-memory structure configured to store the data;
- determining a second description of a second storage element configured to store a reference value indicating a location of the data in the computer-memory structure, a topology of the second storage element corresponding to a topology of the first storage element;
- determining, using the first data and the second description, a software-simulation model corresponding to the HDL description and the second storage element;
- storing, using a first portion of the software-simulation model corresponding to the second storage element, the reference value; and
- processing, by the computer-memory structure during the second time period, the reference value to output the data.
Type: Application
Filed: Aug 22, 2022
Publication Date: Feb 23, 2023
Inventor: Steven F. Hoover (Shrewsbury, MA)
Application Number: 17/893,168