From 069057bf0f8f776f7981c94b4e3cbc821342e593 Mon Sep 17 00:00:00 2001 From: Andrew Burgess Date: Tue, 23 Jun 2020 23:22:43 +0100 Subject: [PATCH] bfd/riscv: tighten matching rules in riscv_scan The following GDB behaviour was observed: (gdb) x/1i 0x0001014a 0x1014a : jal 0x10132 (gdb) show architecture The target architecture is set automatically (currently riscv:rv32) (gdb) set architecture riscv:rv32 The target architecture is assumed to be riscv:rv32 (gdb) x/1i 0x0001014a 0x1014a : 0x37e5 (gdb) Notice that initially we can disassemble the instruction (it's a compressed jal instruction), but after setting the architecture we can no longer disassemble the instruction. This is particularly puzzling as GDB initially thought the architecture was 'riscv:rv32', but when we force the architecture to be that, the disassembly stops working. This issue was introduced with this commit: commit c35d018b1a5ec604e49a807402c4205530b25ca8 Date: Mon Jan 27 15:19:30 2020 -0800 RISC-V: Fix gdbserver problem with handling arch strings. In this commit we try to make riscv_scan handle cases where we see architecture strings like 'riscv:rv32imc' (for example). Normally this wouldn't match as bfd_default_scan requires an exact match, so we extended riscv_scan to ignore trailing characters. Unfortunately the default riscv arch is called 'riscv', is 64-bit, and has its mach type set to 0, which I think is intended to pair with code is riscv-dis.c:riscv_disassemble_insn that tries to guess if we are 32 or 64 bit. What happens then is that 'riscv:rv32' is first tested against 'riscv' using bfd_default_scan, this doesn't match, we then compare this to 'riscv', but allowing trailing characters to be ignored, this matches, and our 'riscv:rv32' matches against the default (64-bit) architecture. The solution I propose is to prevent the default architecture from taking part in this "ignore trailing characters" extra match case, only the more specific 'riscv:rv32' and 'riscv:rv64' get this extra matching. bfd/ChangeLog: * cpu-riscv.c (riscv_scan): Don't allow shorter matches using the default architecture. --- bfd/ChangeLog | 5 +++++ bfd/cpu-riscv.c | 18 ++++++++++++++---- 2 files changed, 19 insertions(+), 4 deletions(-) diff --git a/bfd/ChangeLog b/bfd/ChangeLog index 943139ebf4..de2e917335 100644 --- a/bfd/ChangeLog +++ b/bfd/ChangeLog @@ -1,3 +1,8 @@ +2020-06-24 Andrew Burgess + + * cpu-riscv.c (riscv_scan): Don't allow shorter matches using the + default architecture. + 2020-06-24 H.J. Lu PR ld/26083 diff --git a/bfd/cpu-riscv.c b/bfd/cpu-riscv.c index b5c972ff4d..22067ab29b 100644 --- a/bfd/cpu-riscv.c +++ b/bfd/cpu-riscv.c @@ -47,10 +47,20 @@ riscv_scan (const struct bfd_arch_info *info, const char *string) if (bfd_default_scan (info, string)) return TRUE; - /* The string might have extra characters for supported subsets. So allow - a match that ignores trailing characters in string. */ - if (strncasecmp (string, info->printable_name, - strlen (info->printable_name)) == 0) + /* The incoming STRING might take the form of riscv:rvXXzzz, where XX is + 32 or 64, and zzz are one or more extension characters. As we + currently only have 3 architectures defined, 'riscv', 'riscv:rv32', + and 'riscv:rv64', we would like to ignore the zzz for the purpose of + matching here. + + However, we don't want the default 'riscv' to match over a more + specific 'riscv:rv32' or 'riscv:rv64', so in the case of the default + architecture (with the shorter 'riscv' name) we don't allow any + special matching, but for the 'riscv:rvXX' cases, we allow a match + with any additional trailing characters being ignored. */ + if (!info->the_default + && strncasecmp (string, info->printable_name, + strlen (info->printable_name)) == 0) return TRUE; return FALSE;