Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RISCV] Add Zicfiss support to the shadow call stack implementation. #68075

Merged
merged 9 commits into from
Feb 10, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
42 changes: 26 additions & 16 deletions clang/docs/ShadowCallStack.rst
Original file line number Diff line number Diff line change
Expand Up @@ -57,19 +57,25 @@ compiled application or the operating system. Integrating the runtime into
the operating system should be preferred since otherwise all thread creation
and destruction would need to be intercepted by the application.

The instrumentation makes use of the platform register ``x18`` on AArch64 and
``x3`` (``gp``) on RISC-V. For simplicity we will refer to this as the
``SCSReg``. On some platforms, ``SCSReg`` is reserved, and on others, it is
designated as a scratch register. This generally means that any code that may
run on the same thread as code compiled with ShadowCallStack must either target
one of the platforms whose ABI reserves ``SCSReg`` (currently Android, Darwin,
Fuchsia and Windows) or be compiled with a flag to reserve that register (e.g.,
``-ffixed-x18``). If absolutely necessary, code compiled without reserving the
register may be run on the same thread as code that uses ShadowCallStack by
saving the register value temporarily on the stack (`example in Android`_) but
this should be done with care since it risks leaking the shadow call stack
address.

The instrumentation makes use of the platform register ``x18`` on AArch64,
``x3`` (``gp``) on RISC-V with software shadow stack and ``ssp`` on RISC-V with
hardware shadow stack, which needs `Zicfiss`_ and ``-mno-forced-sw-shadow-stack``
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
hardware shadow stack, which needs `Zicfiss`_ and ``-mno-forced-sw-shadow-stack``
hardware shadow stack, which needs `Zicfiss`

I don't think that the default we want is use SW shadow stack, right? I think that using the SW based SCS should be something users opt into when Zicfiss is available, and shouldn't be something they have to opt out of...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should Android default to shadow stack?

My concern is that if a user passes an -mcpu that supports Zicfiss, but the OS doesn't, the compiler will emit shadow stack instructions that silently fall back to NOPs due to the missing OS support. This leaves your binary in a state where it gets no protection.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should Android default to shadow stack?

My concern is that if a user passes an -mcpu that supports Zicfiss, but the OS doesn't, the compiler will emit shadow stack instructions that silently fall back to NOPs due to the missing OS support. This leaves your binary in a state where it gets no protection.

I agree, changing -fsanitize=shadow-call-stack behavior based on -mcpu is problematic, especially when it can result in the program silently falling back to unprotected state. This might be a problem for other platforms too, not only Android.

Copy link
Collaborator

@topperc topperc Jan 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@enh-google @appujee @samitolvanen @ilovepi @kito-cheng What should the default behavior be. Changing based on -mcpu seems surprising to me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I'm in the minority here in thinking that the compiler should pick based on the HW capabilities. I think I'd be surprised if I was compiling w/ SCS on HW that supported it and got the software SCS... but at the end of the day, as long as we document the usage clearly its probably fine. I acknowledge the concern about falling back to a less secure method, but it still feels like the wrong tradeoff to me. That said, if there's a consensus that the other way makes more sense(which there seems to be), then I'm 100% fine with that.

Related: what are our thoughts about a frontend warning/diagnostic when the mcpu supports the feature, but hasn't been selected?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related: what are our thoughts about a frontend warning/diagnostic when the mcpu supports the feature, but hasn't been selected?

isn't the problem there that it's unlikely to be a decision that's in the developer's hands? (because the real question is "does the OS support it?", and i think that's the counter-suggestion? "decide based on the whole triple", in effect, since taking Android as an example it's likely to be something like "if android && riscv64 && api level >= where you can assume hardware scs works".)

(apologies if this isn't what's really being talked about here --- i came in very late to a long conversation :-) )

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't the problem there that it's unlikely to be a decision that's in the developer's hands? (because the real question is "does the OS support it?", and i think that's the counter-suggestion? "decide based on the whole triple", in effect, since taking Android as an example it's likely to be something like "if android && riscv64 && api level >= where you can assume hardware scs works".)

(apologies if this isn't what's really being talked about here --- i came in very late to a long conversation :-) )

No, I think your comment is on point.

w.r.t. the diagnostic, my thoughts were that it at least makes sense to point out the need for the developer to double check the flags. Maybe i'm being overly cautious, though.

I think using the whole triple is a good approach for platforms like Android and Fuchsia, where we can enable/disable certain features based on the target OS/SDK version. I don' think we can generalize that to other systems, but if some systems can automagically do the right thing its better than nothing.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's really (non-Android) Linux that's the odd one out due to not having a version number in the triple?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Though you still have the related problem that some older LLVM may not know that OS version X supports it, and still default it to off)

(default option). Note that with ``Zicfiss``_ the RISC-V backend will default to
the hardware based shadow call stack. Users can force the RISC-V backend to
generate the software shadow call stack with ``Zicfiss``_ by passing
``-mforced-sw-shadow-stack``.
For simplicity we will refer to this as the ``SCSReg``. On some platforms,
``SCSReg`` is reserved, and on others, it is designated as a scratch register.
This generally means that any code that may run on the same thread as code
compiled with ShadowCallStack must either target one of the platforms whose ABI
reserves ``SCSReg`` (currently Android, Darwin, Fuchsia and Windows) or be
compiled with a flag to reserve that register (e.g., ``-ffixed-x18``). If
absolutely necessary, code compiled without reserving the register may be run on
the same thread as code that uses ShadowCallStack by saving the register value
temporarily on the stack (`example in Android`_) but this should be done with
care since it risks leaking the shadow call stack address.

.. _`Zicfiss`: https://github.com/riscv/riscv-cfi/blob/main/cfi_backward.adoc
.. _`example in Android`: https://android-review.googlesource.com/c/platform/frameworks/base/+/803717

Because it requires a dedicated register, the ShadowCallStack feature is
Expand Down Expand Up @@ -151,9 +157,13 @@ Usage

To enable ShadowCallStack, just pass the ``-fsanitize=shadow-call-stack`` flag
to both compile and link command lines. On aarch64, you also need to pass
``-ffixed-x18`` unless your target already reserves ``x18``. On RISC-V, ``x3``
(``gp``) is always reserved. It is, however, important to disable GP relaxation
in the linker. This can be done with the ``--no-relax-gp`` flag in GNU ld.
``-ffixed-x18`` unless your target already reserves ``x18``. No additional flags
need to be passed on RISC-V because the software based shadow stack uses
``x3`` (``gp``), which is always reserved, and the hardware based shadow call
stack uses a dedicated register, ``ssp``.
However, it is important to disable GP relaxation in the linker when using the
software based shadow call stack on RISC-V. This can be done with the
``--no-relax-gp`` flag in GNU ld, and is off by default in LLD.

Low-level API
-------------
Expand Down
4 changes: 4 additions & 0 deletions clang/include/clang/Driver/Options.td
Original file line number Diff line number Diff line change
Expand Up @@ -4578,6 +4578,10 @@ def msave_restore : Flag<["-"], "msave-restore">, Group<m_riscv_Features_Group>,
HelpText<"Enable using library calls for save and restore">;
def mno_save_restore : Flag<["-"], "mno-save-restore">, Group<m_riscv_Features_Group>,
HelpText<"Disable using library calls for save and restore">;
def mforced_sw_shadow_stack : Flag<["-"], "mforced-sw-shadow-stack">, Group<m_riscv_Features_Group>,
HelpText<"Force using software shadow stack when shadow-stack enabled">;
def mno_forced_sw_shadow_stack : Flag<["-"], "mno-forced-sw-shadow-stack">, Group<m_riscv_Features_Group>,
HelpText<"Not force using software shadow stack when shadow-stack enabled">;
} // let Flags = [TargetSpecific]
let Flags = [TargetSpecific] in {
def menable_experimental_extensions : Flag<["-"], "menable-experimental-extensions">, Group<m_Group>,
Expand Down
6 changes: 6 additions & 0 deletions clang/test/Driver/riscv-features.c
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,12 @@
// DEFAULT-NOT: "-target-feature" "-save-restore"
// DEFAULT-NOT: "-target-feature" "+save-restore"

// RUN: %clang --target=riscv32-unknown-elf -### %s -mforced-sw-shadow-stack 2>&1 | FileCheck %s -check-prefix=FORCE-SW-SCS
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After commenting about defaults, we should probably test the default, too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

// RUN: %clang --target=riscv32-unknown-elf -### %s -mno-forced-sw-shadow-stack 2>&1 | FileCheck %s -check-prefix=NO-FORCE-SW-SCS
// FORCE-SW-SCS: "-target-feature" "+forced-sw-shadow-stack"
// NO-FORCE-SW-SCS: "-target-feature" "-forced-sw-shadow-stack"
// DEFAULT-NOT: "-target-feature" "+forced-sw-shadow-stack"

// RUN: %clang --target=riscv32-unknown-elf -### %s -munaligned-access 2>&1 | FileCheck %s -check-prefix=FAST-UNALIGNED-ACCESS
// RUN: %clang --target=riscv32-unknown-elf -### %s -mno-unaligned-access 2>&1 | FileCheck %s -check-prefix=NO-FAST-UNALIGNED-ACCESS
// RUN: %clang --target=riscv32-unknown-elf -### %s -mno-strict-align 2>&1 | FileCheck %s -check-prefix=FAST-UNALIGNED-ACCESS
Expand Down
5 changes: 5 additions & 0 deletions llvm/lib/Target/RISCV/RISCVFeatures.td
Original file line number Diff line number Diff line change
Expand Up @@ -1044,3 +1044,8 @@ def FeatureTaggedGlobals : SubtargetFeature<"tagged-globals",
"AllowTaggedGlobals",
"true", "Use an instruction sequence for taking the address of a global "
"that allows a memory tag in the upper address bits">;

def FeatureForcedSWShadowStack : SubtargetFeature<
"forced-sw-shadow-stack", "HasForcedSWShadowStack", "true",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the default be false?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This true means HasForcedSWShadowStack set to true when +forced-sw-shadow-stack. So actually the default is false.

"Implement shadow stack with software.">;
def HasForcedSWShadowStack : Predicate<"Subtarget->hasForcedSWShadowStack()">;
14 changes: 12 additions & 2 deletions llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -51,9 +51,14 @@ static void emitSCSPrologue(MachineFunction &MF, MachineBasicBlock &MBB,
CSI, [&](CalleeSavedInfo &CSR) { return CSR.getReg() == RAReg; }))
return;

const RISCVInstrInfo *TII = STI.getInstrInfo();
if (!STI.hasForcedSWShadowStack() && STI.hasStdExtZicfiss()) {
BuildMI(MBB, MI, DL, TII->get(RISCV::SSPUSH)).addReg(RAReg);
return;
}

Register SCSPReg = RISCVABI::getSCSPReg();

const RISCVInstrInfo *TII = STI.getInstrInfo();
bool IsRV64 = STI.hasFeature(RISCV::Feature64Bit);
int64_t SlotSize = STI.getXLen() / 8;
// Store return address to shadow call stack
Expand Down Expand Up @@ -106,9 +111,14 @@ static void emitSCSEpilogue(MachineFunction &MF, MachineBasicBlock &MBB,
CSI, [&](CalleeSavedInfo &CSR) { return CSR.getReg() == RAReg; }))
return;

const RISCVInstrInfo *TII = STI.getInstrInfo();
if (!STI.hasForcedSWShadowStack() && STI.hasStdExtZicfiss()) {
BuildMI(MBB, MI, DL, TII->get(RISCV::SSPOPCHK)).addReg(RAReg);
return;
}

Register SCSPReg = RISCVABI::getSCSPReg();

const RISCVInstrInfo *TII = STI.getInstrInfo();
bool IsRV64 = STI.hasFeature(RISCV::Feature64Bit);
int64_t SlotSize = STI.getXLen() / 8;
// Load return address from shadow call stack
Expand Down
134 changes: 134 additions & 0 deletions llvm/test/CodeGen/RISCV/shadowcallstack.ll
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,14 @@
; RUN: | FileCheck %s --check-prefix=RV32
; RUN: llc -mtriple=riscv64 -verify-machineinstrs < %s \
; RUN: | FileCheck %s --check-prefix=RV64
; RUN: llc -mtriple=riscv32 -mattr=+experimental-zicfiss < %s \
; RUN: -verify-machineinstrs | FileCheck %s --check-prefix=RV32-ZICFISS
; RUN: llc -mtriple=riscv64 -mattr=+experimental-zicfiss < %s \
; RUN: -verify-machineinstrs | FileCheck %s --check-prefix=RV64-ZICFISS
; RUN: llc -mtriple=riscv32 -mattr=+experimental-zicfiss,forced-sw-shadow-stack \
; RUN: -verify-machineinstrs < %s | FileCheck %s --check-prefix=RV32
; RUN: llc -mtriple=riscv64 -mattr=+experimental-zicfiss,forced-sw-shadow-stack \
; RUN: -verify-machineinstrs < %s | FileCheck %s --check-prefix=RV64

define void @f1() shadowcallstack {
; RV32-LABEL: f1:
Expand All @@ -12,6 +20,14 @@ define void @f1() shadowcallstack {
; RV64-LABEL: f1:
; RV64: # %bb.0:
; RV64-NEXT: ret
;
; RV32-ZICFISS-LABEL: f1:
; RV32-ZICFISS: # %bb.0:
; RV32-ZICFISS-NEXT: ret
;
; RV64-ZICFISS-LABEL: f1:
; RV64-ZICFISS: # %bb.0:
; RV64-ZICFISS-NEXT: ret
ret void
}

Expand All @@ -25,6 +41,14 @@ define void @f2() shadowcallstack {
; RV64-LABEL: f2:
; RV64: # %bb.0:
; RV64-NEXT: tail foo
;
; RV32-ZICFISS-LABEL: f2:
; RV32-ZICFISS: # %bb.0:
; RV32-ZICFISS-NEXT: tail foo
;
; RV64-ZICFISS-LABEL: f2:
; RV64-ZICFISS: # %bb.0:
; RV64-ZICFISS-NEXT: tail foo
tail call void @foo()
ret void
}
Expand Down Expand Up @@ -65,6 +89,32 @@ define i32 @f3() shadowcallstack {
; RV64-NEXT: addi gp, gp, -8
; RV64-NEXT: .cfi_restore gp
; RV64-NEXT: ret
;
; RV32-ZICFISS-LABEL: f3:
; RV32-ZICFISS: # %bb.0:
; RV32-ZICFISS-NEXT: sspush ra
; RV32-ZICFISS-NEXT: addi sp, sp, -16
; RV32-ZICFISS-NEXT: .cfi_def_cfa_offset 16
; RV32-ZICFISS-NEXT: sw ra, 12(sp) # 4-byte Folded Spill
; RV32-ZICFISS-NEXT: .cfi_offset ra, -4
; RV32-ZICFISS-NEXT: call bar
; RV32-ZICFISS-NEXT: lw ra, 12(sp) # 4-byte Folded Reload
; RV32-ZICFISS-NEXT: addi sp, sp, 16
; RV32-ZICFISS-NEXT: sspopchk ra
; RV32-ZICFISS-NEXT: ret
;
; RV64-ZICFISS-LABEL: f3:
; RV64-ZICFISS: # %bb.0:
; RV64-ZICFISS-NEXT: sspush ra
; RV64-ZICFISS-NEXT: addi sp, sp, -16
; RV64-ZICFISS-NEXT: .cfi_def_cfa_offset 16
; RV64-ZICFISS-NEXT: sd ra, 8(sp) # 8-byte Folded Spill
; RV64-ZICFISS-NEXT: .cfi_offset ra, -8
; RV64-ZICFISS-NEXT: call bar
; RV64-ZICFISS-NEXT: ld ra, 8(sp) # 8-byte Folded Reload
; RV64-ZICFISS-NEXT: addi sp, sp, 16
; RV64-ZICFISS-NEXT: sspopchk ra
; RV64-ZICFISS-NEXT: ret
%res = call i32 @bar()
%res1 = add i32 %res, 1
ret i32 %res
Expand Down Expand Up @@ -140,6 +190,68 @@ define i32 @f4() shadowcallstack {
; RV64-NEXT: addi gp, gp, -8
; RV64-NEXT: .cfi_restore gp
; RV64-NEXT: ret
;
; RV32-ZICFISS-LABEL: f4:
; RV32-ZICFISS: # %bb.0:
; RV32-ZICFISS-NEXT: sspush ra
; RV32-ZICFISS-NEXT: addi sp, sp, -16
; RV32-ZICFISS-NEXT: .cfi_def_cfa_offset 16
; RV32-ZICFISS-NEXT: sw ra, 12(sp) # 4-byte Folded Spill
; RV32-ZICFISS-NEXT: sw s0, 8(sp) # 4-byte Folded Spill
; RV32-ZICFISS-NEXT: sw s1, 4(sp) # 4-byte Folded Spill
; RV32-ZICFISS-NEXT: sw s2, 0(sp) # 4-byte Folded Spill
; RV32-ZICFISS-NEXT: .cfi_offset ra, -4
; RV32-ZICFISS-NEXT: .cfi_offset s0, -8
; RV32-ZICFISS-NEXT: .cfi_offset s1, -12
; RV32-ZICFISS-NEXT: .cfi_offset s2, -16
; RV32-ZICFISS-NEXT: call bar
; RV32-ZICFISS-NEXT: mv s0, a0
; RV32-ZICFISS-NEXT: call bar
; RV32-ZICFISS-NEXT: mv s1, a0
; RV32-ZICFISS-NEXT: call bar
; RV32-ZICFISS-NEXT: mv s2, a0
; RV32-ZICFISS-NEXT: call bar
; RV32-ZICFISS-NEXT: add s0, s0, s1
; RV32-ZICFISS-NEXT: add a0, s2, a0
; RV32-ZICFISS-NEXT: add a0, s0, a0
; RV32-ZICFISS-NEXT: lw ra, 12(sp) # 4-byte Folded Reload
; RV32-ZICFISS-NEXT: lw s0, 8(sp) # 4-byte Folded Reload
; RV32-ZICFISS-NEXT: lw s1, 4(sp) # 4-byte Folded Reload
; RV32-ZICFISS-NEXT: lw s2, 0(sp) # 4-byte Folded Reload
; RV32-ZICFISS-NEXT: addi sp, sp, 16
; RV32-ZICFISS-NEXT: sspopchk ra
; RV32-ZICFISS-NEXT: ret
;
; RV64-ZICFISS-LABEL: f4:
; RV64-ZICFISS: # %bb.0:
; RV64-ZICFISS-NEXT: sspush ra
; RV64-ZICFISS-NEXT: addi sp, sp, -32
; RV64-ZICFISS-NEXT: .cfi_def_cfa_offset 32
; RV64-ZICFISS-NEXT: sd ra, 24(sp) # 8-byte Folded Spill
; RV64-ZICFISS-NEXT: sd s0, 16(sp) # 8-byte Folded Spill
; RV64-ZICFISS-NEXT: sd s1, 8(sp) # 8-byte Folded Spill
; RV64-ZICFISS-NEXT: sd s2, 0(sp) # 8-byte Folded Spill
; RV64-ZICFISS-NEXT: .cfi_offset ra, -8
; RV64-ZICFISS-NEXT: .cfi_offset s0, -16
; RV64-ZICFISS-NEXT: .cfi_offset s1, -24
; RV64-ZICFISS-NEXT: .cfi_offset s2, -32
; RV64-ZICFISS-NEXT: call bar
; RV64-ZICFISS-NEXT: mv s0, a0
; RV64-ZICFISS-NEXT: call bar
; RV64-ZICFISS-NEXT: mv s1, a0
; RV64-ZICFISS-NEXT: call bar
; RV64-ZICFISS-NEXT: mv s2, a0
; RV64-ZICFISS-NEXT: call bar
; RV64-ZICFISS-NEXT: add s0, s0, s1
; RV64-ZICFISS-NEXT: add a0, s2, a0
; RV64-ZICFISS-NEXT: addw a0, s0, a0
; RV64-ZICFISS-NEXT: ld ra, 24(sp) # 8-byte Folded Reload
; RV64-ZICFISS-NEXT: ld s0, 16(sp) # 8-byte Folded Reload
; RV64-ZICFISS-NEXT: ld s1, 8(sp) # 8-byte Folded Reload
; RV64-ZICFISS-NEXT: ld s2, 0(sp) # 8-byte Folded Reload
; RV64-ZICFISS-NEXT: addi sp, sp, 32
; RV64-ZICFISS-NEXT: sspopchk ra
; RV64-ZICFISS-NEXT: ret
%res1 = call i32 @bar()
%res2 = call i32 @bar()
%res3 = call i32 @bar()
Expand Down Expand Up @@ -176,6 +288,28 @@ define i32 @f5() shadowcallstack nounwind {
; RV64-NEXT: ld ra, -8(gp)
; RV64-NEXT: addi gp, gp, -8
; RV64-NEXT: ret
;
; RV32-ZICFISS-LABEL: f5:
; RV32-ZICFISS: # %bb.0:
; RV32-ZICFISS-NEXT: sspush ra
; RV32-ZICFISS-NEXT: addi sp, sp, -16
; RV32-ZICFISS-NEXT: sw ra, 12(sp) # 4-byte Folded Spill
; RV32-ZICFISS-NEXT: call bar
; RV32-ZICFISS-NEXT: lw ra, 12(sp) # 4-byte Folded Reload
; RV32-ZICFISS-NEXT: addi sp, sp, 16
; RV32-ZICFISS-NEXT: sspopchk ra
; RV32-ZICFISS-NEXT: ret
;
; RV64-ZICFISS-LABEL: f5:
; RV64-ZICFISS: # %bb.0:
; RV64-ZICFISS-NEXT: sspush ra
; RV64-ZICFISS-NEXT: addi sp, sp, -16
; RV64-ZICFISS-NEXT: sd ra, 8(sp) # 8-byte Folded Spill
; RV64-ZICFISS-NEXT: call bar
; RV64-ZICFISS-NEXT: ld ra, 8(sp) # 8-byte Folded Reload
; RV64-ZICFISS-NEXT: addi sp, sp, 16
; RV64-ZICFISS-NEXT: sspopchk ra
; RV64-ZICFISS-NEXT: ret
%res = call i32 @bar()
%res1 = add i32 %res, 1
ret i32 %res
Expand Down
Loading