MICROPROCESSOR PROTECTED AGAINST STACK OVERFLOW

Info

Publication number: 20130013965
Type: Application
Filed: Jul 6, 2012
Publication Date: Jan 10, 2013
Applicant: STMicroelectronics (Rousset) SAS (Rousset)
Inventors: Pierre Guillemin (Marseille), William Orlando (Peynier)
Application Number: 13/543,673

Abstract

A microprocessor includes a central processing unit, at least one call stack, a stack pointer, an address bus, and a data bus. The microprocessor further includes a hardware monitor configured to supply protection codes, insert the protection codes in the stack or let the central processing unit insert them, and then generate an error signal in response to an attempt to modify a protection code present in the stack.

Description

BACKGROUND

1. Technical Field

The present disclosure relates to the protection of a microprocessor against stack overflow. The call stack of a microprocessor is a volatile memory zone directed to the storage of data concerning functions executed by the microprocessor. In particular, the stack allows the storage of an address to which the microprocessor should return after having executed a function.

2. Description of the Related Art

When a first function or “call function” calls a second function or “called function”, the call function places its return address in the stack. When the called function has finished the task for which it was designed, it fetches the return address from the stack. When the functions call each other, return addresses accumulate in the stack and are fetched one after the other after the execution of a function. Besides return addresses, the stack receives data associated with call or called functions, such as local variables of the called function, parameters of the called function, a frame pointer of the call function, etc.

When too much information is stored in the stack, a phenomenon called stack overflow occurs, which may be an external overflow or an internal overflow. External overflow is an overflow beyond the amount of memory space attributed to the stack, whereas internal overflow occurs when the writing of a data zone overflows into another data zone, in particular into a zone containing a return address. Such an internal overflow may be caused intentionally by an attacker aiming to take control of the microprocessor, in which case the return address pointing to the call function may be replaced by a return address pointing to a malicious program. In the present application, the term “stack overflow” designates an internal overflow.

A known method of countering stack overflow attacks consists of inserting a protection code, also known as a “canary”, in the stack. Such codes are generally small integers of random value placed at strategic locations in the stack, preferably before each return address. To verify that the stack has not been subjected to a fraudulent overflow, the value of the canary is checked, before the called function uses the return address present in the stack, to determine that it has not been changed.

This technique considerably increases the difficulty of exploiting a stack overflow, since it forces the attacker to take control of the instruction pointer by more complex means.

Nevertheless, this technique is not infallible because it entirely depends on the provision of canary insertion and verification instructions in the program executed by the microprocessor. An attack on the program itself may therefore allow a subsequent attack on the stack by neutralizing the canary verification instructions.

Similarly, an attacker may provoke an undetected stack overflow by reading the canary and then re-writing the correct value of the canary at its original location while modifying the return address.
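The software-only scheme and the read-then-rewrite weakness described above can be illustrated with a minimal C sketch. The structure layout, the field names, and the function names are hypothetical; an explicit frame model is used instead of a real buffer overflow so that the example stays well defined.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of one stack frame; the field order is an assumption. */
struct frame {
    uint8_t  local_buf[8];  /* local variable zone, sensitive to overflow */
    uint32_t canary;        /* software canary placed before the return address */
    uint32_t return_addr;   /* return address to protect */
};

/* Software check executed before the return address is used. */
static int canary_intact(const struct frame *f, uint32_t expected)
{
    return f->canary == expected;
}

/* Naive overflow: overwrites the buffer, the canary, and the return address. */
static void naive_overflow(struct frame *f, uint32_t evil_addr)
{
    memset(f->local_buf, 'A', sizeof f->local_buf);
    f->canary = 0x41414141u;
    f->return_addr = evil_addr;
}

/* Read-then-rewrite: the attacker first reads the canary, then restores it
 * while replacing the return address, so the software check still passes. */
static void read_then_rewrite(struct frame *f, uint32_t evil_addr)
{
    uint32_t leaked = f->canary;
    memset(f->local_buf, 'A', sizeof f->local_buf);
    f->canary = leaked;
    f->return_addr = evil_addr;
}
```

In this model, canary_intact() reports the naive overflow but not the read-then-rewrite variant, which is the weakness the preceding paragraph points out.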

It may therefore be desired to reinforce the security offered by the insertion of canaries in the call stack of a microprocessor.

BRIEF SUMMARY

Embodiments of the invention relate to a microprocessor having a central processing unit, at least one call stack, a stack pointer, an address bus, a data bus, and a hardware monitor configured to: generate protection codes, insert protection codes in the stack or let the central processing unit insert them, store addresses of protection codes inserted in the stack, and generate an error signal in response to an attempt to modify a protection code present in the stack.

According to one embodiment, the monitor is also configured to generate the error signal in the case of an attempt to read a protection code in the stack.

According to one embodiment, the monitor is configured to generate random or pseudo-random protection codes.

According to one embodiment, the monitor is configured to generate deterministic and reproducible protection codes.

According to one embodiment, the monitor is configured to monitor the address bus, and generate the error signal if an address of a stored protection code appears on the address bus.

According to one embodiment, the monitor includes a first register write-accessible to the central processing unit, and is configured to generate a protection code and apply the value of the protection code on the data bus in response to the write of a data in the register by the central processing unit.

According to one embodiment, the monitor is also configured to insert the protection code in the stack, at an address present in the first register.

According to one embodiment, in response to a protection code erase request by the central processing unit or by the program executed by the central processing unit, the monitor is configured to read the protection code in the stack at an address specified by the erase request, compare the protection code with an expected protection code value, and generate the error signal if the read value is different from the expected value.

According to one embodiment, the monitor itself is configured to erase the protection code in the stack after having verified the protection code.

According to one embodiment, the monitor includes a second register write-accessible to the central processing unit, and is configured to interpret a write of the second register as a request to erase a protection code.

According to one embodiment, the monitor is configured to interpret a write of the second register as an erase request of a protection code at an address present in the second register.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS

Non-limiting and non-exhaustive embodiments are described with reference to the following drawings, wherein like labels refer to like parts throughout the various views unless otherwise specified. The sizes and relative positions of elements in the drawings are not necessarily drawn to scale. For example, the shapes of various elements and angles are not drawn to scale, and some of these elements are enlarged and positioned to improve drawing legibility. Further, the particular shapes of the elements as drawn are not intended to convey any information regarding the actual shape of the particular elements and have been solely selected for ease of recognition in the drawings. One or more embodiments are described hereinafter with reference to the accompanying drawings in which:

FIG. 1 shows a microprocessor comprising a stack monitor according to one embodiment;

FIG. 2 shows an example of contents of the stack; and

FIG. 3 shows an implementation example of the stack monitor.

DETAILED DESCRIPTION

FIG. 1 shows a microprocessor 100 according to one embodiment. The microprocessor 100 includes a central processing unit CPU, hereinafter “the CPU”, a program memory PMEM, a call stack STCK, and a register bank RBK. These different elements are linked by an address bus B1, a data bus B2, and an instruction bus B3.

Register bank RBK includes a stack pointer register SP and a frame pointer register FP. The stack pointer contains the address of the top of the stack STCK and the frame pointer contains the start address of a frame of a function being executed. Program memory PMEM contains a program executed by the CPU.

It will be noted that instruction bus B3 is an optional microprocessor element, its provision particular to the architecture used, here the Harvard architecture. A Von Neumann architecture only includes bus B2 to convey both the data and the instructions.

The microprocessor 100 also includes a stack monitor MT according to one embodiment. Monitor MT is configured to place protection codes or “canaries” in the stack STCK as it is being filled by the CPU. The monitor is also configured to detect attempts to write the stack at the canary addresses, i.e., attempts to alter the canaries, and preferably to also detect attempts to read the canaries.

In one embodiment, the monitor only monitors address bus B1 and launches an alert when the address of a canary appears on address bus B1. In this case, the monitor does not try to determine whether it is a read or write attempt, and does not monitor instruction bus B3 (or bus B2 in a Von Neumann architecture). The alert is unconditional and is for example emitted in the form of an error signal ER.

Signal ER is applied to an interrupt decoder that switches the CPU to a secure error processing sub-program (for example a secure interrupt of the platform termed “trust zone”, the processing of which is done in secure mode in protected memories). Alternatively, signal ER is used to provoke a CPU reset.

During the return to a call function, monitor MT also operates such that the canary surveillance is lifted before the canary is erased. Before lifting the surveillance, the monitor verifies that the canary value has not been altered. To this end, the monitor reads the canary and compares it with an initial value saved in its internal memory 102. In one embodiment, the monitor itself performs the erasure of the canary in the stack. In another embodiment, this erasure is assured by the CPU.

Thus, monitor MT is a hardware means to generate and to monitor the canaries, and cannot be corrupted by a fraudulent alteration of the program executed by the CPU. The monitor is a sort of impartial arbitrator, independent of the program itself, and confers a high level of security to the generation and the surveillance of canaries.

FIG. 2 shows an example of inserting canaries in stack STCK and an example of the stack contents. An arrow DIR1 indicates the increasing address direction and an arrow DIR2 indicates the stack filling direction. Here, the filling is done from the highest value address to the lowest value address. The “top” of the stack, corresponding to the current value of stack pointer SP, thus corresponds here to the lowest value of the stack. It will be noted that in other embodiments, the stack can have a reversed filling direction, corresponding to the increasing address direction.

In the example shown, the stack includes frame FFA of a function FA, frame FFB of a function FB, and frame FFC of a function FC being executed. It is assumed that function FC was called by function FB and that function FB was called by function FA. Each frame contains contextual data of the considered function and return data to the call function.

Thus, frame FFB of function FB includes a return address RAFA to function FA, a frame pointer value FPFA of function FA, and local variables LVFB of function FB. A canary C1 was inserted by monitor MT in frame FFB, for example between return address RAFA and frame pointer value FPFA.

Frame FFC of function FC includes a return address RAFB of function FB, a frame pointer value FPFB of function FB, and local variables LVFC of function FC. A canary C2 was inserted by monitor MT in frame FFC, for example between return address RAFB and frame pointer value FPFB.

The current value of stack pointer SP designates the top of the stack (the lowest address), and the current value of the frame pointer designates the position of the return address RAFB to function FB.

Canaries C2 and C1 are monitored in real time by monitor MT. Thus, any attempt to overflow the stack in order to smash return address RAFB and/or return address RAFA implies a stack write attempt at the locations of canaries C2 and C1. This attempt is detected by monitor MT and leads it to emit error signal ER. Similarly, any attempt to read the canaries is preferably detected by the monitor, which also emits the error signal.
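The layout of FIG. 2 can be sketched in C as a small word-addressed stack that fills toward decreasing addresses. This is a minimal model under stated assumptions: the stack size, the type names, and the helper functions are hypothetical, and array indices stand in for addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_WORDS 16  /* assumed size of the modeled stack */

struct model_stack {
    uint32_t mem[STACK_WORDS];
    size_t   sp;  /* index of the top of the stack (lowest used index) */
};

static void stack_init(struct model_stack *s) { s->sp = STACK_WORDS; }

/* Push one word; filling goes from the highest index to the lowest. */
static size_t push(struct model_stack *s, uint32_t v)
{
    assert(s->sp > 0);
    s->mem[--s->sp] = v;
    return s->sp;
}

/* Positions of the elements of one frame. */
struct frame_layout { size_t ret_addr, canary, saved_fp, locals; };

/* Push return address, canary, saved frame pointer, then the locals,
 * with the canary between the return address and the frame pointer. */
static struct frame_layout push_frame(struct model_stack *s, uint32_t ret,
                                      uint32_t canary, uint32_t fp,
                                      size_t n_locals)
{
    struct frame_layout l;
    l.ret_addr = push(s, ret);
    l.canary   = push(s, canary);
    l.saved_fp = push(s, fp);
    for (size_t i = 0; i < n_locals; i++)
        push(s, 0);
    l.locals = s->sp;  /* lowest index of the frame */
    return l;
}
```

With this layout, a write that starts in the local variables and runs toward increasing addresses reaches the canary position before it reaches the return address of the same frame.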

To use monitor MT, canary insertion and erasure instructions are provided in the program executed by the CPU (program stored in program memory PMEM).

The implementation of embodiments may be subject to diverse variations, which are discussed below before a detailed example implementation of monitor MT is provided in relation with FIG. 3.

Modification of the Program to Use the Monitor

The use of monitor MT presupposes that the program executed by the CPU contains instructions for the insertion and erasure of canaries.

By “canary erasure instruction” is meant an instruction that leads monitor MT to lift the surveillance of the canary targeted by the instruction, in preparation for its erasure by the CPU, or which leads the monitor to lift the surveillance and to erase the canary from the stack itself.

To this end, several options may be provided:

- an explicit modification of the program by the programmer (calling macros or dedicated functions),
- a modification of the program by the compiler so that this operation is transparent to the programmer,
- a post-compilation of the generated binary code, also to render the operation transparent to the programmer,
- the modification of the compiler so that the programmer can indicate, by means of a “preprocessing” command (for example a command of the “pragma” type as specified by the C language standard), how the compiler should operate to compile a program section or a function. This allows the compiler to add the canary insertion and erasure instructions, while leaving to the programmer the choice as to which program sections or functions should be protected.
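The first option, an explicit modification by the programmer through macros, might look like the following C sketch. It assumes that insertion and erasure are requested by writing a canary address to two monitor registers; the variables MT_R1 and MT_R2 are stand-ins for those registers, and the macro and function names are hypothetical. On real hardware the registers would be memory-mapped at addresses that the present description does not specify.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for two write-accessible monitor registers (assumption:
 * plain variables here, memory-mapped registers on real hardware). */
static volatile uint32_t MT_R1;  /* a write requests a canary insertion */
static volatile uint32_t MT_R2;  /* a write requests a canary erasure */

/* Hypothetical macros a programmer could call explicitly. */
#define CANARY_INSERT(addr) (MT_R1 = (uint32_t)(addr))
#define CANARY_ERASE(addr)  (MT_R2 = (uint32_t)(addr))

/* A protected function: the canary is requested on entry and released
 * before returning. canary_slot is the stack address reserved for it. */
static uint32_t protected_function(uint32_t canary_slot, uint32_t x)
{
    CANARY_INSERT(canary_slot);
    uint32_t result = x * 2u;  /* body of the function */
    CANARY_ERASE(canary_slot);
    return result;
}
```

A compiler-based option would emit the same two register writes automatically in the prologue and epilogue of each protected function.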

Generation of Canaries

The value of a canary may be:

- supplied by a random or pseudo-random generator, or
- deterministic and reproducible: the value of the canary is for example determined by the monitor as a function of known variables such as the address at which the canary is placed, and of a secret or random value known only to the monitor, or even a call function identifier supplied by the compiler.

The value of the canary preferably includes a zero byte to offer protection against faults due to character string manipulations, which stop at a zero byte.
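A deterministic and reproducible canary with one byte forced to zero could be derived as in the sketch below. The mixing function is only an illustrative assumption (a generic 32-bit integer scrambler, not a cryptographic primitive and not a function named in this description), as are the function names and the choice of which byte is zeroed.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative 32-bit mixing function; NOT cryptographically secure.
 * A real monitor would use a function suited to its security target. */
static uint32_t mix32(uint32_t x)
{
    x ^= x >> 16;
    x *= 0x7FEB352Du;
    x ^= x >> 15;
    x *= 0x846CA68Bu;
    x ^= x >> 16;
    return x;
}

/* Derive the canary from its address, a secret known only to the
 * monitor, and a function identifier, then clear the low byte. */
static uint32_t canary_for(uint32_t addr, uint32_t secret, uint32_t func_id)
{
    uint32_t v = mix32(addr ^ mix32(secret ^ mix32(func_id)));
    return v & 0xFFFFFF00u;  /* low byte forced to 0 */
}
```

Because the value can be recomputed from the same inputs, a monitor using such a scheme would not need to store the canary values.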

Insertion of Canaries in the Stack

The canaries are placed between the local variable zones and the return addresses in the embodiment. Indeed, it is preferable that they are situated between the zones sensitive to overflow and the zones to protect from an overflow. In one embodiment, additional canaries are placed at other locations in the stack to obtain the equivalent of a “mine field” offering application perspectives extending beyond the simple protection of return addresses.

The insertion of a canary in the stack may be launched by the monitor:

- automatically upon detection of a “call” type instruction causing a change of context in the stack. In this case, the monitor is linked to instruction bus B3, as shown in dotted lines in FIG. 1, and includes means of decoding such an instruction (in the Von Neumann architecture, this monitoring is done on bus B2);
- upon an explicit request by the program. An explicit request may consist of a specific canary insertion instruction that the monitor decodes, or more simply a write instruction of a monitor register, interpreted by the monitor as an insertion instruction.

Management of Canary Addresses

In order to monitor the different canaries placed in the stack, the monitor keeps track of where they are. To this end, the canary addresses are stored in an internal memory 102 of the monitor or an external memory 104 accessible by the intermediary of the data and address buses. Alternatively, a low-capacity internal memory 106 (memory zone within the CPU) may be provided to store the addresses of the canaries most recently introduced into the stack, and a complementary memory 104 may be provided to store the addresses of other canaries. A distinction is therefore made between “active” canaries, the addresses of which are under surveillance because they are loaded in the internal memory 102 of the monitor, and “inactive” canaries, the addresses of which are not monitored. During a change of context, the addresses of canaries present in the external memory 104 are transferred to the internal memory 102 to be monitored by the monitor. The canaries whose addresses were transferred thus become “active”.
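The distinction between “active” and “inactive” canaries can be sketched in C as follows. This is a simplified model under assumptions: all canary addresses are recorded in a table standing in for external memory 104, each entry is tagged with a task identifier, and a context change copies the entries of the new task into a small array standing in for internal memory 102. The capacities and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ACTIVE_SLOTS 4   /* assumed capacity of internal memory 102 */
#define EXT_SLOTS    16  /* assumed capacity of external memory 104 */

struct ext_entry {
    bool     used;
    uint32_t task_id;
    uint32_t addr;
};

struct canary_store {
    uint32_t         active[ACTIVE_SLOTS];  /* addresses under surveillance */
    size_t           n_active;
    struct ext_entry ext[EXT_SLOTS];        /* all recorded addresses */
};

/* Record a canary address for a task; returns false if the table is full. */
static bool store_add(struct canary_store *s, uint32_t task_id, uint32_t addr)
{
    assert(s != NULL);
    for (size_t i = 0; i < EXT_SLOTS; i++) {
        if (!s->ext[i].used) {
            s->ext[i] = (struct ext_entry){ true, task_id, addr };
            return true;
        }
    }
    return false;
}

/* On a context change, the canaries of the new task become active. */
static size_t context_change(struct canary_store *s, uint32_t new_task)
{
    assert(s != NULL);
    s->n_active = 0;
    for (size_t i = 0; i < EXT_SLOTS && s->n_active < ACTIVE_SLOTS; i++) {
        if (s->ext[i].used && s->ext[i].task_id == new_task)
            s->active[s->n_active++] = s->ext[i].addr;
    }
    return s->n_active;
}

/* True if the address belongs to an active (monitored) canary. */
static bool is_active(const struct canary_store *s, uint32_t addr)
{
    for (size_t i = 0; i < s->n_active; i++)
        if (s->active[i] == addr)
            return true;
    return false;
}
```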

The addresses stored in an external memory 104 are preferably protected in confidentiality and integrity, for example by encryption and association with an error correction code.

Similarly, the values of the canaries, if they are not deterministic and reproducible, may be stored in an internal memory 102 of the monitor or in the memory 106 of the CPU, in a form protected in confidentiality and integrity.

Monitoring of Canary Addresses

The monitor should first detect and prevent an attempt to write the stack at the canary addresses. The write attempt may be a modification attempt by the CPU or by a host device that can access the data bus and the address bus.

Preferably, the monitor also detects and prevents attempts to read the stack at a canary address. Indeed, certain attacks take advantage of a prior knowledge of the canary values.

To this end, the monitor compares the current address present on the address bus to the addresses of “active” canaries.

This monitoring may concern the addresses of all the canaries inserted in the stack (exhaustive monitoring) or only the addresses of the canaries associated with the current task, the function being executed. In the second case, the monitor is informed of a task change so that it can update the canary addresses to monitor.

The surveillance of the current address is done for example by comparators, each receiving the current address on one input and an address of a canary to monitor on another input. The detection time is preferably constant and short. The number of canaries simultaneously monitored may be limited to the number of comparators included in the monitor. A multiple-value comparator may also be provided to successively compare the current address with each canary address.
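The comparator arrangement described above can be modeled in C as a fixed bank in which every comparator is evaluated on each bus access and the results are ORed together. The bank size and the names are assumptions; the loop deliberately has no early exit, to mirror a constant detection time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_COMPARATORS 4  /* assumed number of comparators in the monitor */

struct comparator_bank {
    uint32_t addr[NUM_COMPARATORS];   /* canary address on one input */
    bool     armed[NUM_COMPARATORS];  /* comparator currently in use */
};

/* Load a canary address into a free comparator.
 * Returns false when all comparators are already in use. */
static bool bank_arm(struct comparator_bank *b, uint32_t canary_addr)
{
    assert(b != NULL);
    for (size_t i = 0; i < NUM_COMPARATORS; i++) {
        if (!b->armed[i]) {
            b->addr[i]  = canary_addr;
            b->armed[i] = true;
            return true;
        }
    }
    return false;
}

/* Compare the current bus address with every armed comparator and OR
 * the results; reads and writes are not distinguished. */
static bool bank_check(const struct comparator_bank *b, uint32_t bus_addr)
{
    bool er = false;
    for (size_t i = 0; i < NUM_COMPARATORS; i++)
        er |= b->armed[i] && b->addr[i] == bus_addr;
    return er;
}
```

In this model the return value of bank_check() plays the role of the error signal, and a failed bank_arm() reflects the limit on the number of simultaneously monitored canaries.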

If the surveillance only concerns the canaries of the current task, the monitor recharges its internal comparators during a context change by applying the canary addresses associated with the new task to the comparators.

The new addresses applied to the comparators are imported from an external memory after integrity verification and error correction, or are selected in an internal memory zone associated with the considered task.

Erasure of a Canary

The erasure of a canary includes a prior step in which the monitor verifies the value of the canary by comparing it with an expected value.

To this end, the monitor reads the canary in the stack and compares it with an expected value. If the value of the canary is not the same as the expected value, the monitor emits error signal ER.

The erasure of a canary may be launched upon detection of a “return” type instruction on instruction bus B3 (or on bus B2 in a Von Neumann architecture), by an explicit request present in the program and indicating the address of the canary to erase, or by an access to a monitor register.

FIG. 3 shows an implementation example of monitor MT according to the guidelines presented above.

Monitor MT includes a control circuit CCT linked to address bus B1 and to data bus B2 of the CPU, a volatile memory CAM, a volatile memory CVM, address comparators CA0, CA1, . . . CAi, and a canary comparator CDT. Control circuit CCT is a hard-wired circuit of the state machine type. It is equipped with a random or pseudo-random generator CGEN, and two registers R1, R2 write-accessible to the CPU. Memory CAM is provided to store canary addresses, whereas memory CVM is provided to store canary values. The outputs of comparators CA0-CAi and CDT are sent to an OR type gate G1, the output of which supplies error signal ER.

Memory CAM has a lookup table structure and includes N parallel outputs, with N = i+1, supplying, in read mode, i+1 canary addresses associated with an index applied at the input of the memory. Each canary address supplied by memory CAM is applied on an input of a comparator CA0-CAi, the other input of which is linked to address bus B1.

Memory CVM also has a look up table structure and includes an output supplying, in read mode, a canary value associated with an address value supplied at the input of the memory. This value is applied to an input of comparator CDT, the other input of which is linked to data bus B2.

Register R1 receives an insertion address of a canary supplied by the CPU, and optionally a call function identifier, a called function identifier, or both.

Control circuit CCT is configured to detect a write of register R1 and to interpret this write as a request to write a canary in stack STCK, at the address present in the register.

Thus, to write a canary, the programmer or the compiler simply inserts a write instruction of register R1 in the program. This embodiment can be adapted to any type of microprocessor without having to provide a specific canary insertion instruction.

In response to the write of register R1, control circuit CCT:

- generates a canary, here by means of generator CGEN,
- takes control of address bus B1 and of data bus B2 and writes the canary in stack STCK at the specified address,
- stores the address of the canary in memory CAM in relation with an associated index which can be the identifier of the call or called function,
- stores the value of the canary in memory CVM in relation with the address of the canary,
- sets memory CAM in read mode and applies the index associated with the canary address to the memory.

Thus, the addresses of all the canaries associated with this index (for example all the canaries associated with the identifier of the call or called function) are applied to comparators CA0-CAi and are under surveillance. If the address of one of the canaries under surveillance appears on address bus B1, the output of one of comparators CA0-CAi goes to 1 and signal ER at the output of gate G1 goes to 1, launching the protection process of the system described herein (reset of the microprocessor or secure error processing).
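The sequence triggered by a write of register R1 can be sketched as a small software model. Everything here is a simplification under assumptions: the stack is a word-addressed array, memories CAM and CVM are merged into one table, generator CGEN is replaced by a xorshift pseudo-random generator, and all names and sizes are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_WORDS  32  /* assumed size of the modeled stack STCK */
#define MAX_CANARIES 8   /* assumed capacity of the monitor memories */

/* One entry merges the CAM role (index -> address) and the CVM role
 * (address -> value); the circuit described uses two memories. */
struct entry { bool used; uint32_t index; uint32_t addr; uint32_t value; };

struct monitor {
    uint32_t     stack[STACK_WORDS];
    struct entry tab[MAX_CANARIES];
    uint32_t     cur_index;  /* index currently applied to memory CAM */
    uint32_t     rng_state;  /* stand-in for CGEN; must be nonzero */
};

/* xorshift32 step used as a placeholder for generator CGEN. */
static uint32_t cgen_next(struct monitor *m)
{
    uint32_t x = m->rng_state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    m->rng_state = x;
    return x;
}

/* Models a write of register R1 carrying an insertion address and a
 * function identifier used as index. Returns false if nothing is done. */
static bool write_r1(struct monitor *m, uint32_t addr, uint32_t index)
{
    assert(m != NULL);
    if (addr >= STACK_WORDS)
        return false;
    for (size_t i = 0; i < MAX_CANARIES; i++) {
        if (!m->tab[i].used) {
            uint32_t c = cgen_next(m);        /* generate the canary */
            m->stack[addr] = c;               /* write it in the stack */
            m->tab[i] = (struct entry){ true, index, addr, c };
            m->cur_index = index;             /* apply the index to CAM */
            return true;
        }
    }
    return false;
}

/* Models comparators CA0-CAi and gate G1 for one bus access: true when
 * the address matches a canary associated with the current index. */
static bool bus_access(const struct monitor *m, uint32_t addr)
{
    bool er = false;
    for (size_t i = 0; i < MAX_CANARIES; i++)
        er |= m->tab[i].used && m->tab[i].index == m->cur_index
              && m->tab[i].addr == addr;
    return er;
}
```

In this model, only the canaries associated with the most recently applied index are under surveillance, which corresponds to monitoring limited to the current function.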

In one variant, the write address of the canary is applied on the address bus by the CPU. Register R1 is used to communicate the identifier of the current process or any other information other than the address of the canary to control circuit CCT. Control circuit CCT reads this address on the address bus in order to store it in memory CVM, and supplies only the value of the canary on the data bus. The CPU itself then applies a write command to stack STCK.

In response to the write of an address in register R2, control circuit CCT:

- applies the address present in register R2 on address bus B1 and reads the value of the canary in the stack. This value is thus found on data bus B2 and on one input of comparator CDT,
- sets memory CVM in read mode and applies the address present in register R2 as a read index to the memory. The initial value of the canary stored in relation with this address thus finds itself on a second input of comparator CDT.

If the two values are different, the output of comparator CDT goes to 1 and signal ER at the output of gate G1 goes to 1, launching the system protection process.

In absence of the error signal, control circuit CCT itself can perform the erasure of the canary in the stack by accessing it in write mode, then erasing the address of the canary in its internal memory. Alternatively, circuit CCT can let the CPU do this erasure, after having erased the address of the canary from its internal memory in order that it is no longer under surveillance.

In the case where circuit CCT is not configured to perform this erasure by itself, the programmer or the compiler should provide a stack erase instruction to the attention of the CPU after a write instruction of register R2.
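The handling of a write of register R2 can be sketched in the same spirit, for the case where the control circuit performs the erasure itself. The model is an assumption-laden simplification: the stack is a word-addressed array, memory CVM is a small table, erasure writes zero to the stack, and a request for an address with no stored canary is simply rejected, a case this description does not specify.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_WORDS  32  /* assumed size of the modeled stack STCK */
#define MAX_CANARIES 8   /* assumed capacity of memory CVM */

struct cvm_entry { bool used; uint32_t addr; uint32_t value; };

struct monitor {
    uint32_t         stack[STACK_WORDS];
    struct cvm_entry cvm[MAX_CANARIES];
    bool             er;  /* models error signal ER */
};

/* Setup helper: place a canary in the stack and record its value. */
static bool track(struct monitor *m, uint32_t addr, uint32_t value)
{
    assert(m != NULL);
    if (addr >= STACK_WORDS)
        return false;
    for (size_t i = 0; i < MAX_CANARIES; i++) {
        if (!m->cvm[i].used) {
            m->stack[addr] = value;
            m->cvm[i] = (struct cvm_entry){ true, addr, value };
            return true;
        }
    }
    return false;
}

/* Models a write of an address in register R2. Returns true if the
 * canary was verified and erased; sets er if its value was altered. */
static bool write_r2(struct monitor *m, uint32_t addr)
{
    assert(m != NULL);
    if (addr >= STACK_WORDS)
        return false;  /* unspecified case: request rejected */
    for (size_t i = 0; i < MAX_CANARIES; i++) {
        if (m->cvm[i].used && m->cvm[i].addr == addr) {
            uint32_t read_back = m->stack[addr];  /* read the stack */
            if (read_back != m->cvm[i].value) {   /* comparator CDT */
                m->er = true;
                return false;
            }
            m->stack[addr] = 0;       /* erase the canary in the stack */
            m->cvm[i].used = false;   /* then lift the surveillance */
            return true;
        }
    }
    return false;  /* unspecified case: no canary stored at this address */
}
```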

In a variant mentioned above, control circuit CCT generates deterministic canary values and regenerates them in response to a write of register R2. In this case, memory CVM is not necessary.

Many other embodiments may be provided by the skilled person. In particular, in certain embodiments, the CPU is equipped with a cache memory, which may be formed for example as all or part of memory 106, which is arranged between the program memory and the CPU, and which receives instructions several clock cycles before they are to be executed. In this case, the monitor is preferably arranged between the CPU and the cache, and is configured to observe the transactions performed by the CPU.

The various embodiments described above can be combined to provide further embodiments. Aspects of the embodiments can be modified, if necessary, to employ concepts of the various patents, applications and publications to provide yet further embodiments. These and other changes can be made to the embodiments in light of the above-detailed description. In general, in the following claims, the terms used should not be construed to limit the claims to the specific embodiments disclosed in the specification and the claims, but should be construed to include all possible embodiments along with the full scope of equivalents to which such claims are entitled. Accordingly, the claims are not limited by the disclosure.

Claims

1. A microprocessor, comprising:

a central processing unit;

at least one call stack structure;

a stack pointer structure;

an address bus;

a data bus; and

a hardware monitor configured to: generate protection codes; insert protection codes in the stack structure or let the central processing unit insert protection codes in the stack structure; store addresses of protection codes inserted in the stack structure; and generate an error signal in response to an attempt to modify a protection code present in the stack structure.

2. A microprocessor according to claim 1 wherein the hardware monitor is configured to generate the error signal in response to an attempt to read a protection code in the stack structure.

3. A microprocessor according to claim 1 wherein the hardware monitor is configured to generate random or pseudo-random protection codes.

4. A microprocessor according to claim 1 wherein the hardware monitor is configured to generate deterministic and reproducible protection codes.

5. A microprocessor according to claim 1 wherein the hardware monitor is configured to:

monitor the address bus; and

generate the error signal if an address of a stored protection code appears on the address bus.

6. A microprocessor according to claim 1 wherein the hardware monitor comprises a register, the register being write-accessible to the central processing unit, and the hardware monitor is configured to generate a register protection code and apply the value of the register protection code on the data bus in response to the write of data in the register by the central processing unit.

7. A microprocessor according to claim 6 wherein the hardware monitor is configured to insert the register protection code in the stack structure at an address present in the register.

8. A microprocessor according to claim 1 wherein in response to a protection code erase request by the central processing unit or by a program executed by the central processing unit, the hardware monitor is configured to:

read a first protection code in the stack structure at an address specified by the erase request;

compare the first protection code with an expected protection code value; and

generate the error signal if the first protection code is different from the expected value.

9. A microprocessor according to claim 8 wherein the hardware monitor is configured to erase the first protection code in the stack after having verified the protection code.

10. A microprocessor according to claim 8 wherein the hardware monitor comprises a register, the register being write-accessible to the central processing unit, and the hardware monitor is configured to interpret a write of the register as a request to erase the first protection code.

11. A microprocessor according to claim 10 wherein the hardware monitor is configured to interpret a write of the register as an erase request of the first protection code at an address present in the register.

12. A method executed by a microprocessor, comprising:

initializing a stack pointer;

generating a plurality of protection codes;

inserting protection codes in a stack based on an address stored in the stack pointer;

storing addresses of inserted protection codes in a memory;

monitoring, with a hardware monitor, addresses on an address bus; and

generating an error signal in response to an attempt to modify a protection code present in the stack based on detecting a monitored address on the address bus.

13. The method according to claim 12 wherein the inserting is performed with the hardware monitor or with a central processing unit.

14. The method according to claim 12, comprising:

executing a secure error processing function in response to the error signal.

15. The method according to claim 12, comprising:

erasing protection codes when corresponding return functions are executed, wherein the erasing includes removing stored addresses of the inserted protection codes from the memory.

16. The method according to claim 12 wherein generating the plurality of protection codes includes drawing random or pseudo-random numbers from an associated random or pseudo-random number generator.

17. A stack overflow monitoring circuit, comprising:

a control circuit configured to generate protection codes and further configured to direct operations of the stack overflow monitoring circuit;

a bus interface configured to couple the stack overflow monitoring circuit to a stack structure, the stack structure operable to store protection codes;

at least one internal memory;

a plurality of comparators coupled to the at least one internal memory; and

gating logic coupled to the plurality of comparators and configured to output an error signal upon detection of an address on an address bus read via the bus interface being equal to an address stored in the at least one internal memory.

18. The stack overflow monitoring circuit according to claim 17, comprising:

an address memory configured within the at least one internal memory; and

a value memory configured within the at least one internal memory,

wherein the address memory is operable to store addresses of protection codes stored in the stack structure and the value memory is operable to store protection code values equal to the protection codes stored in the stack,

wherein the gating logic is configured to output an error signal upon detection of a protection code value on a data bus read via the bus interface being equal to a value stored in the value memory.

19. The stack overflow monitoring circuit according to claim 17, comprising:

a first register coupleable to a central processing unit, the first register operable to receive a function address, wherein the function address is usable by the control circuit to generate a protection code; and

a second register coupleable to the central processing unit, the second register operable to receive an address of a protection code.

20. The stack overflow monitoring circuit according to claim 17, comprising:

a random or pseudo-random number generator, the random or pseudo-random number generator operable during generation of the protection codes.