During armv7b testing gdb.base/store.exp test was failling with
'GDB internal error' with the following message:
Temporary breakpoint 1, wack_double (u=
../../binutils-gdb/gdb/regcache.c:177: internal-error: register_size: Assertion `regnum >= 0 && regnum < (gdbarch_num_regs (gdbarch) + gdbarch_num_pseudo_regs (gdbarch))' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
It turns out that compiler generated DWARF with non-existent
register numbers. The compiler issue is present in both little endian
(armv7) and big endian (armv7b) (it is separate issue). Here is
example for one of formal parameters of wack_double function:
<2><792>: Abbrev Number: 10 (DW_TAG_formal_parameter)
<793> DW_AT_name : u
<795> DW_AT_decl_file : 1
<796> DW_AT_decl_line : 115
<797> DW_AT_type : <0x57c>
<79b> DW_AT_location : 6 byte block: 6d 93 4 6c 93 4 (DW_OP_reg29 (r29); DW_OP_piece: 4; DW_OP_reg28 (r28); DW_OP_piece: 4)
In both big and little endian cases gdbarch_dwarf2_reg_to_regnum
returns -1 which is stored into gdb_regnum. But it causes severe
problem only in big endian case because in read_pieced_value and
write_pieced_value functions BFD_ENDIAN_BIG related processing
happen regardless of gdb_regnum value, for example register_size
function is called and in case of gdb_regnum=-1, it cause
'GDB internal error' and crash.
Solution is to move BFD_ENDIAN_BIG related processing under
(gdb_regnum != -1) branch of processing.
gdb/ChangeLog:
2014-11-02 Victor Kamensky <victor.kamensky@linaro.org>
* dwarf2loc.c (read_pieced_value): Do big endian
processing only if gdb_regnum is not -1.
(write_pieced_value): Ditto.
tdep->arm_breakpoint, tdep->thumb_breakpoint, tdep->thumb2_breakpoint
should be set le_ variants in case of arm BE8 code. Those instruciton
sequences are writen to target with simple write_memory, without
regarding gdbarch_byte_order_for_code. But in BE8 case even data
memory is in big endian form, instructions are still in little endian
form.
Because of this issue there are many issues while running gdb test
case in armv7b mode. For example gdb.arch/arm-disp-step.exp test fails
because it gets SIGILL when displaced instrucion sequence reaches
break instruction, which is in wrong byte order.
Solution is to set tdep->xxx_breakpoint sequences in BE8 case (i.e
when gdbarch_byte_order_for_code is BFD_ENDIAN_BIG.
gdb/ChangeLog:
2014-11-02 Victor Kamensky <victor.kamensky@linaro.org>
* arm-linux-tdep.c (arm_linux_init_abi): Use
info.byte_order_for_code to choose endianity of breakpoint
instructions snippets.
extract_arm_insn function needs to read instructions in
gdbarch_byte_order_for_code byte order, because in case armv7b,
even data is big endian, instructions are still little endian.
Currently function uses gdbarch_byte_order which would be
big endian in armv7b case.
Because of this issue pretty much all gdb.reverse/ tests are
failing with 'Process record does not support instruction' message.
Fix is to change gdbarch_byte_order to gdbarch_byte_order_for_code,
when passed to extract_unsigned_integer that reads instruction.
gdb/ChangeLog:
2014-11-02 Victor Kamensky <victor.kamensky@linaro.org>
* arm-tdep.c (extract_arm_insn): Use
gdbarch_byte_order_for_code to read arm instruction.
The test in gdb.python/python.exp tests "extended-prompt" and expects
working directory is printed. However, working directory on remote
host doesn't have "gdb/testsuite", so the test fails on remote host
like:
set extended-prompt \w ^M
^M
/home/yao FAIL: gdb.python/python.exp: set extended prompt working directory (timeout)
This patch is to get the working directory first, and use it to match
the output of "set extended-prompt \\w ". It works for remote host
and non remote host.
gdb/testsuite:
2014-11-02 Yao Qi <yao@codesourcery.com>
* gdb.python/python.exp: Get working directory and match the
output of "set extended-prompt \\w " with it.
binutils:
2014-10-31 Andrew Pinski <apinski@cavium.com>
Naveen H.S <Naveen.Hurugalawadi@caviumnetworks.com>
* readelf.c (print_mips_isa_ext): Print the value of Octeon3.
gas:
2014-10-31 Andrew Pinski <apinski@cavium.com>
Naveen H.S <Naveen.Hurugalawadi@caviumnetworks.com>
* config/tc-mips.c (CPU_IS_OCTEON): Handle CPU_OCTEON3.
(mips_cpu_info_table): Octeon3 enables virt ase.
* doc/c-mips.texi: Document octeon3 as an acceptable value for
-march=.
gas/testsuite:
2014-10-31 Andrew Pinski <apinski@cavium.com>
Naveen H.S <Naveen.Hurugalawadi@caviumnetworks.com>
* gas/mips/mips.exp: Add support for Octeon3 architecture.
Also add in support for running Octeon3 tests.
* gas/mips/octeon3.d: New test.
* gas/mips/octeon3.s: New test source.
opcodes:
2014-10-31 Andrew Pinski <apinski@cavium.com>
Naveen H.S <Naveen.Hurugalawadi@caviumnetworks.com>
* mips-dis.c (mips_arch_choices): Add octeon3.
* mips-opc.c (IOCT): Include INSN_OCTEON3.
(IOCT2): Likewise.
(IOCT3): New define.
(IVIRT): New define.
(mips_builtin_opcodes): Add dmfgc0, dmtgc0, hypcall, mfgc0, mtgc0,
tlbinv, tlbinvf, tlbgr, tlbgwi, tlbginv, tlbginvf, tlbgwr, tlbgp, tlti
IVIRT instructions.
Extend mtm0, mtm1, mtm2, mtp0, mtp1, mtp2 instructions to take another
operand for IOCT3.
bfd:
2014-10-31 Andrew Pinski <apinski@cavium.com>
Naveen H.S <Naveen.Hurugalawadi@caviumnetworks.com>
* archures.c: Add octeon3 for mips target.
* bfd-in2.h: Regenerate.
* bfd/cpu-mips.c: Define I_mipsocteon3.
nfo_struct): Add octeon3 support.
* bfd/elfxx-mips.c: (_bfd_elf_mips_mach): Add support for
octeon3.
(mips_set_isa_flags): Add support for octeon3.
(bfd_mips_isa_ext): Add bfd_mach_mips_octeon3.
(mips_mach_extensions): Make bfd_mach_mips_octeon3 an
extension of bfd_mach_mips_octeon2.
(print_mips_isa_ext): Print the value of Octeon3.
2014-10-21 Andrew Pinski <apinski@cavium.com>
* config/tc-aarch64.c (aarch64_cpus):
Add thunderx.
* doc/c-aarch64.texi: Document that thunderx
is a valid processor name.
PR binutils/17512
* coffgen.c (_bfd_coff_get_external_symbols): Do not try to load a
symbol table bigger than the file.
* elf.c (bfd_elf_get_str_section): Do not try to load a string
table bigger than the file.
* readelf.c (process_program_headers): Avoid memory exhaustion due
to corrupt values in a dynamis segment header.
(get_32bit_elf_symbols): Do not attempt to read an over-large
section.
(get_64bit_elf_symbols): Likewise.
--all option which displays text from anywhere in the input file(s). The
default used to be --data, which only displays text from loadable data sections,
but this requires the use of the BFD library. Since the BFD library almost
certainly still contains buffer overrun and/or memory corruption bugs, and
since the strings program is often used to examine malicious code, it was
decided that the --data option option represents a possible security risk.
* strings.c: Add new command line option --data to only scan the
initialized, loadable data secions of binaries. Choose the
default behaviour of --all or --data based upon a configure
option.
* doc/binutils.texi (strings): Update documentation. Include
description of why the --data option might be unsafe.
* configure.ac: Add new option --disable-default-strings-all which
restores the old behaviour of strings using --data by default. If
the option is not used make strings use --all by default.
* NEWS: Mention the new behaviour of strings.
* configure: Regenerate.
* config.in: Regenerate.
gdb/ChangeLog:
* NEWS: Mention ability add attributes to gdb.Objfile and
gdb.Progspace objects.
* python/py-objfile.c (objfile_object): New member dict.
(objfpy_dealloc): Py_XDECREF dict.
(objfpy_initialize): Initialize dict.
(objfile_getset): Add __dict__.
(objfile_object_type): Set tp_dictoffset member.
* python/py-progspace.c (progspace_object): New member dict.
(pspy_dealloc): Py_XDECREF dict.
(pspy_initialize): Initialize dict.
(pspace_getset): Add __dict__.
(pspace_object_type): Set tp_dictoffset member.
gdb/doc/ChangeLog:
* python.texi (Progspaces In Python): Document ability to add
random attributes to gdb.Progspace objects.
(Objfiles In Python): Document ability to add random attributes to
gdb.objfile objects.
gdb/testsuite/ChangeLog:
* gdb.python/py-objfile.exp: Add tests for setting random attributes
in objfiles.
* gdb.python/py-progspace.exp: Add tests for setting random attributes
in progspaces.
Several GDB tests change directory before compiling the test program
in order to test source file names that include directories. This
doesn't work on a remote host because default_target_compile in
DejaGnu's target.exp copies each source file with
"[remote_download host $x]" which uses "[file tail $file] to strip
off the directory of each file. If the source directory is remote
mounted on the host, this also leaves copied files in the source
directory.
A similar skip is already used in gdb.test/fullname.exp:
# We rely on being able to copy things around.
if { [is_remote host] } {
untested "setting breakpoints by full path"
return -1
}
This patch causes three GDB tests that use "cd" to be skipped for a
remote host. For gdb.base/fullpath-expand.exp this eliminates two
failures and prevents the test from leaving files fullpath-expand.c
and fullpath-expand-func.c in gdb/testsuite. For
gdb.base/realname-expand.exp it eliminates two failures. For
gdb.linespec/macro-relative.exp it prevents file macro-relative.c
from being left in gdb/testsuite/gdb.linespec/base/two.
gdb/testsuite/
* gdb.base/fullpath-expand.exp: Skip for a remote host.
* gdb.base/realname-expand.exp: Likewise.
* gdb.linespec/macro-relative.exp: Likewise.
The @ character is a comment character on ARM, so use % instead. Also
use a wider glob for matching ARM targets to make sure the test gets
run.
ld/testsuite/ChangeLog:
2014-10-30 Will Newton <will.newton@linaro.org>
* ld-unique/unique.exp: Use a wider glob for matching ARM
targets.
* ld-unique/unique.s: Use % instead of @ in .type directive.
* ld-unique/unique_shared.s: Likewise.
fixed part of a fragment for output generation only, which required
MAX_MEM_FOR_RS_ALIGN_CODE to be large enough to hold the maximum pad.
* config/tc-aarch64.h (MAX_MEM_FOR_RS_ALIGN_CODE): Define to 7.
* config/tc-aarch64.c (aarch64_handle_align): Rewrite to handle
large alignments with a constant fragment size of
MAX_MEM_FOR_RS_ALIGN_CODE.
In gdb/command/prompt.py:before_prompt_hook, the '\' in the new prompt
is replaced with '\\', shown as below,
> def before_prompt_hook(self, current):
> if self.value is not '':
> newprompt = gdb.prompt.substitute_prompt(self.value)
> return newprompt.replace('\\', '\\\\')
> else:
> return None
I don't see any explanations on this in comments nor email. As doc
said, "set extended-prompt \w" substitute the current working
directory, but it prints something different from what pwd or
os.getcwdu() prints on mingw32 host.
(gdb) python print os.getcwdu()^M
\\build2-lucid-cs\yqi\yqi\arm-none-eabi
(gdb) pwd^M
Working directory \\build2-lucid-cs\yqi\yqi\arm-none-eabi
(gdb) set extended-prompt \w
\\\\build2-lucid-cs\\yqi\\yqi\\arm-none-eabi
This makes me think whether the substitution in before_prompt_hook is
necessary or not. This patch is to remove this substitution.
Run gdb.python on x86_64-linux and arm-none-eabi on mingw32 host. No
regressions.
gdb:
2014-10-30 Yao Qi <yao@codesourcery.com>
* python/lib/gdb/command/prompt.py (before_prompt_hook): Don't
replace '\\' with '\\\\'.
The patch does the following things:
-- Add support for ifunc.
-- Enable safe icf
-- Add support for TLSLD relocations
R_AARCH64_TLSLD_ADR_PAGE21,
R_AARCH64_TLSLD_ADD_LO12_NC,
R_AARCH64_TLSLD_MOVW_DTPREL_G1,
R_AARCH64_TLSLD_MOVW_DTPREL_G0_NC.
(R_AARCH64_TLSLD_MOVW_* are used by LLVM.)
-- Add support for TLSLD->TLSLE relaxation.
-- Add support for R_AARCH64_LD_PREL_LO19, R_AARCH64_ADR_PREL_LO21.
-- Fix 2 encoding bugs in AArch64_relocate_functions::update_movnz.
-- Correct TLS relocation properties in gold/aarch64-reloc.def.
-- Update testsuite/icf_safe_so_test.cc, testsuite/icf_safe_test.sh.
gold/
2014-10-29 Han Shen <shenhan@google.com>
Jing Yu <jingyu@google.com>
* aarch64-reloc.def: Add LD_PREL_LO12, ADR_PREL_LO21,
TLSLD_ADR_PAGE21, TLSLD_ADD_LO12_NC, TLSLD_MOVW_DTPREL_G1,
TLSLD_MOVW_DTPREL_G0_NC. Change property of TLS relocations to
Symbol::TLS_REF.
* aarch64.cc (Target_aarch64::do_can_check_for_function_pointers): New
method.
(Target_aarch64::reloc_needs_plt_for_ifunc): New method.
(Target_aarch64::tls_ld_to_le): New method.
(Target_aarch64::aarch64_info): Enable can_icf_inline_merge_sections
for 64bit targets.
(Output_data_plt_aarch64::irelative_rel_): New data member.
(Output_data_plt_aarch64::add_entry): Add irelative entries to plt.
(Output_data_plt_aarch64::add_local_ifunc_entry): New method.
(Output_data_plt_aarch64::add_relocation): New method.
(Output_data_plt_aarch64::do_write): Add gold_assert on got_irelative
offset. Add got_irelative size to got size.
(AArch64_relocate_functions): Typedef AArch64_valtype. Replace long
type string with the new typename.
(AArch64_relocate_functions::update_adr): Replace parameter x with
immed.
(AArch64_relocate_functions::update_movnz): Correct wrong val mask.
(AArch64_relocate_functions::reloc_common): New method.
(AArch64_relocate_funcsions::rela_general): Extract common part out
into reloc_common method.
(AArch64_relocate_functions::rela_general): Likewise.
(AArch64_relocate_functions::pcrela_general): Likewise.
(AArch64_relocate_functions::adr): New method.
(AArch64_relocate_functions::adrp): Calculate immed before calling
update_adr.
(AArch64_relocate_functions::adrp): Likewise.
(AArch64_relocate_functions::movnz): Cast x to SignedW type when
comparing x to 0. Calculate immed from ~x when x < 0.
(Target_aarch64::optimize_tls_reloc): Add new cases for
TLSLD_ADR_PAGE21, TLSLD_ADD_LO12_NC, TLSLD_MOVW_DTPREL_G1,
TLSLD_MOVW_DTPREL_G0_NC.
(Target_aarch64::possible_function_pointer_reloc): Implement this
method.
(Target_aarch64::Scan::local_reloc_may_be_function_pointer): Update
comment.
(Target_aarch64::Scan::local): Add codes to handle STT_GNU_IFUNC
symbol. Add cases for TLSLD_ADR_PAGE21, TLSLD_ADD_LO12_NC,
TLSLD_MOVW_DTPREL_G1, TLSLD_MOVW_DTPREL_G0_NC.
(Target_aarch64::Scan::global): Add codes to handle STT_GNU_IFUNC
symbol. Add cases for TLSLD_ADR_PAGE21, TLSLD_ADD_LO12_NC,
TLSLD_MOVW_DTPREL_G1, TLSLD_MOVW_DTPREL_G0_NC.
(Target_aarch64::make_plt_entry): Call add_entry with two more
parameters.
(Target_aarch64::make_local_ifunc_plt_entry): New method.
(Target_aarch64::Relocate::relocate): Add cases for LD_PREL_LO19,
ADR_PREL_LO21, TLSLD_ADR_PAGE21, TLSLD_ADD_LO12_NC,
TLSLD_MOVW_DTPREL_G1, TLSLD_MOVW_DTPREL_G0_NC.
(Target_aarch64::Relocate::relocate_tls): Add cases for
TLSLD_ADR_PAGE21, TLSLD_ADD_LO12_NC, TLSLD_MOVW_DTPREL_G1,
TLSLD_MOVW_DTPREL_G0_NC.
* testsuite/icf_safe_so_test.cc: Correct test comment.
* testsuite/icf_safe_test.sh: Add AArch64 arch.
infrun.c:
5392 /* Did we find the stepping thread? */
5393 if (tp->control.step_range_end)
5394 {
5395 /* Yep. There should only one though. */
5396 gdb_assert (stepping_thread == NULL);
5397
5398 /* The event thread is handled at the top, before we
5399 enter this loop. */
5400 gdb_assert (tp != ecs->event_thread);
5401
5402 /* If some thread other than the event thread is
5403 stepping, then scheduler locking can't be in effect,
5404 otherwise we wouldn't have resumed the current event
5405 thread in the first place. */
5406 gdb_assert (!schedlock_applies (currently_stepping (tp)));
5407
5408 stepping_thread = tp;
5409 }
Like:
gdb/infrun.c:5406: internal-error: switch_back_to_stepped_thread: Assertion `!schedlock_applies (1)' failed.
The way the assertion is written is assuming that with schedlock=step
we'll always leave threads other than the one with the stepping range
locked, while that's not true with the "next" command. With schedlock
"step", other threads still run unlocked when "next" detects a
function call and steps over it. Whether that makes sense or not,
still, it's documented that way in the manual. If another thread hits
an event that doesn't cause a stop while the nexting thread steps over
a function call, we'll get here and fail the assertion.
The fix is just to adjust the assertion. Even though we found the
stepping thread, we'll still step-over the breakpoint that just
triggered correctly.
Surprisingly, gdb.threads/schedlock.exp doesn't have any test that
steps over a function call. This commits fixes that. This ensures
that "next" doesn't switch focus to another thread, and checks whether
other threads run locked or not, depending on scheduler locking mode
and command. There's a lot of duplication in that file that this ends
cleaning up. There's more that could be cleaned up, but that would
end up an unrelated change, best done separately.
This new coverage in schedlock.exp happens to trigger the internal
error in question, like so:
FAIL: gdb.threads/schedlock.exp: schedlock=step: cmd=next: call_function=1: next to increment (1) (GDB internal error)
FAIL: gdb.threads/schedlock.exp: schedlock=step: cmd=next: call_function=1: next to increment (3) (GDB internal error)
FAIL: gdb.threads/schedlock.exp: schedlock=step: cmd=next: call_function=1: next to increment (5) (GDB internal error)
FAIL: gdb.threads/schedlock.exp: schedlock=step: cmd=next: call_function=1: next to increment (7) (GDB internal error)
FAIL: gdb.threads/schedlock.exp: schedlock=step: cmd=next: call_function=1: next to increment (9) (GDB internal error)
FAIL: gdb.threads/schedlock.exp: schedlock=step: cmd=next: call_function=1: next does not change thread (switched to thread 0)
FAIL: gdb.threads/schedlock.exp: schedlock=step: cmd=next: call_function=1: current thread advanced - unlocked (wrong amount)
That's because we have more than one thread running the same loop, and
while one thread is stepping over a function call, the other thread
hits the step-resume breakpoint of the first, which needs to be
stepped over, and we end up in switch_back_to_stepped_thread exactly
in the problem case.
I think a simpler and more directed test is also useful, to not rely
on internal breakpoint magics. So this commit also adds a test that
has a thread trip on a conditional breakpoint that doesn't cause a
user-visible stop while another thread is stepping over a call. That
currently fails like this:
FAIL: gdb.threads/next-bp-other-thread.exp: schedlock=step: next over function call (GDB internal error)
Tested on x86_64 Fedora 20.
gdb/
2014-10-29 Pedro Alves <palves@redhat.com>
PR gdb/17408
* infrun.c (switch_back_to_stepped_thread): Use currently_stepping
instead of assuming a thread with a stepping range is always
stepping.
gdb/testsuite/
2014-10-29 Pedro Alves <palves@redhat.com>
PR gdb/17408
* gdb.threads/schedlock.c (some_function): New function.
(call_function): New global.
(MAYBE_CALL_SOME_FUNCTION): New macro.
(thread_function): Call it.
* gdb.threads/schedlock.exp (get_args): Add description parameter,
and use it instead of a global counter. Adjust all callers.
(get_current_thread): Use "find current thread" for test message
here rather than having all callers pass down the same string.
(goto_loop): New procedure, factored out from ...
(my_continue): ... this.
(step_ten_loops): Change parameter from test message to command to
use. Adjust.
(list_count): Delete global.
(check_result): New procedure, factored out from duplicate top
level code.
(continue tests): Wrap in with_test_prefix.
(test_step): New procedure, factored out from duplicate top level
code.
(top level): Test "step" in combination with all scheduler-locking
modes. Test "next" in combination with all scheduler-locking
modes, and in combination with stepping over a function call or
not.
* gdb.threads/next-bp-other-thread.c: New file.
* gdb.threads/next-bp-other-thread.exp: New file.
This PR shows that GDB can easily trigger an assertion here, in
infrun.c:
5392 /* Did we find the stepping thread? */
5393 if (tp->control.step_range_end)
5394 {
5395 /* Yep. There should only one though. */
5396 gdb_assert (stepping_thread == NULL);
5397
5398 /* The event thread is handled at the top, before we
5399 enter this loop. */
5400 gdb_assert (tp != ecs->event_thread);
5401
5402 /* If some thread other than the event thread is
5403 stepping, then scheduler locking can't be in effect,
5404 otherwise we wouldn't have resumed the current event
5405 thread in the first place. */
5406 gdb_assert (!schedlock_applies (currently_stepping (tp)));
5407
5408 stepping_thread = tp;
5409 }
Like:
gdb/infrun.c:5406: internal-error: switch_back_to_stepped_thread: Assertion `!schedlock_applies (1)' failed.
The way the assertion is written is assuming that with schedlock=step
we'll always leave threads other than the one with the stepping range
locked, while that's not true with the "next" command. With schedlock
"step", other threads still run unlocked when "next" detects a
function call and steps over it. Whether that makes sense or not,
still, it's documented that way in the manual. If another thread hits
an event that doesn't cause a stop while the nexting thread steps over
a function call, we'll get here and fail the assertion.
The fix is just to adjust the assertion. Even though we found the
stepping thread, we'll still step-over the breakpoint that just
triggered correctly.
Surprisingly, gdb.threads/schedlock.exp doesn't have any test that
steps over a function call. This commits fixes that. This ensures
that "next" doesn't switch focus to another thread, and checks whether
other threads run locked or not, depending on scheduler locking mode
and command. There's a lot of duplication in that file that this ends
cleaning up. There's more that could be cleaned up, but that would
end up an unrelated change, best done separately.
This new coverage in schedlock.exp happens to trigger the internal
error in question, like so:
FAIL: gdb.threads/schedlock.exp: schedlock=step: cmd=next: call_function=1: next to increment (1) (GDB internal error)
FAIL: gdb.threads/schedlock.exp: schedlock=step: cmd=next: call_function=1: next to increment (3) (GDB internal error)
FAIL: gdb.threads/schedlock.exp: schedlock=step: cmd=next: call_function=1: next to increment (5) (GDB internal error)
FAIL: gdb.threads/schedlock.exp: schedlock=step: cmd=next: call_function=1: next to increment (7) (GDB internal error)
FAIL: gdb.threads/schedlock.exp: schedlock=step: cmd=next: call_function=1: next to increment (9) (GDB internal error)
FAIL: gdb.threads/schedlock.exp: schedlock=step: cmd=next: call_function=1: next does not change thread (switched to thread 0)
FAIL: gdb.threads/schedlock.exp: schedlock=step: cmd=next: call_function=1: current thread advanced - unlocked (wrong amount)
That's because we have more than one thread running the same loop, and
while one thread is stepping over a function call, the other thread
hits the step-resume breakpoint of the first, which needs to be
stepped over, and we end up in switch_back_to_stepped_thread exactly
in the problem case.
I think a simpler and more directed test is also useful, to not rely
on internal breakpoint magics. So this commit also adds a test that
has a thread trip on a conditional breakpoint that doesn't cause a
user-visible stop while another thread is stepping over a call. That
currently fails like this:
FAIL: gdb.threads/next-bp-other-thread.exp: schedlock=step: next over function call (GDB internal error)
Tested on x86_64 Fedora 20.
gdb/
2014-10-29 Pedro Alves <palves@redhat.com>
PR gdb/17408
* infrun.c (switch_back_to_stepped_thread): Use currently_stepping
instead of assuming a thread with a stepping range is always
stepping.
gdb/testsuite/
2014-10-29 Pedro Alves <palves@redhat.com>
PR gdb/17408
* gdb.threads/schedlock.c (some_function): New function.
(call_function): New global.
(MAYBE_CALL_SOME_FUNCTION): New macro.
(thread_function): Call it.
* gdb.threads/schedlock.exp (get_args): Add description parameter,
and use it instead of a global counter. Adjust all callers.
(get_current_thread): Use "find current thread" for test message
here rather than having all callers pass down the same string.
(goto_loop): New procedure, factored out from ...
(my_continue): ... this.
(step_ten_loops): Change parameter from test message to command to
use. Adjust.
(list_count): Delete global.
(check_result): New procedure, factored out from duplicate top
level code.
(continue tests): Wrap in with_test_prefix.
(test_step): New procedure, factored out from duplicate top level
code.
(top level): Test "step" in combination with all scheduler-locking
modes. Test "next" in combination with all scheduler-locking
modes, and in combination with stepping over a function call or
not.
* gdb.threads/next-bp-other-thread.c: New file.
* gdb.threads/next-bp-other-thread.exp: New file.
This is more of a readline/terminal issue than a Python one.
PR17372 is a regression in 7.8 caused by the fix for PR17072:
commit 0017922d02
Author: Pedro Alves <palves@redhat.com>
Date: Mon Jul 14 19:55:32 2014 +0100
Background execution + pagination aborts readline/gdb
gdb_readline_wrapper_line removes the handler after a line is
processed. Usually, we'll end up re-displaying the prompt, and that
reinstalls the handler. But if the output is coming out of handling
a stop event, we don't re-display the prompt, and nothing restores the
handler. So the next input wakes up the event loop and calls into
readline, which aborts.
...
gdb/
2014-07-14 Pedro Alves <palves@redhat.com>
PR gdb/17072
* top.c (gdb_readline_wrapper_line): Tweak comment.
(gdb_readline_wrapper_cleanup): If readline is enabled, reinstall
the input handler callback.
The problem is that installing the input handler callback also preps
the terminal, putting it in raw mode and with echo disabled, which is
bad if we're going to call a command that assumes cooked/canonical
mode, and echo enabled, like in the case of the PR, Python's
interactive shell. Another example I came up with that doesn't depend
on Python is starting a subshell with "(gdb) shell /bin/sh" from a
multi-line command. Tests covering both these examples are added.
The fix is to revert the original fix for PR gdb/17072, and instead
restore the callback handler after processing an asynchronous target
event.
Furthermore, calling rl_callback_handler_install when we already have
some input in readline's line buffer discards that input, which is
obviously a bad thing to do while the user is typing. No specific
test is added for that, because I first tried calling it even if the
callback handler was still installed and that resulted in hundreds of
failures in the testsuite.
gdb/
2014-10-29 Pedro Alves <palves@redhat.com>
PR python/17372
* event-top.c (change_line_handler): Call
gdb_rl_callback_handler_remove instead of
rl_callback_handler_remove.
(callback_handler_installed): New global.
(gdb_rl_callback_handler_remove, gdb_rl_callback_handler_install)
(gdb_rl_callback_handler_reinstall): New functions.
(display_gdb_prompt): Call gdb_rl_callback_handler_remove and
gdb_rl_callback_handler_install instead of
rl_callback_handler_remove and rl_callback_handler_install.
(gdb_disable_readline): Call gdb_rl_callback_handler_remove
instead of rl_callback_handler_remove.
* event-top.h (gdb_rl_callback_handler_remove)
(gdb_rl_callback_handler_install)
(gdb_rl_callback_handler_reinstall): New declarations.
* infrun.c (reinstall_readline_callback_handler_cleanup): New
cleanup function.
(fetch_inferior_event): Install it.
* top.c (gdb_readline_wrapper_line) Call
gdb_rl_callback_handler_remove instead of
rl_callback_handler_remove.
(gdb_readline_wrapper_cleanup): Don't call
rl_callback_handler_install.
gdb/testsuite/
2014-10-29 Pedro Alves <palves@redhat.com>
PR python/17372
* gdb.python/python.exp: Test a multi-line command that spawns
interactive Python.
* gdb.base/multi-line-starts-subshell.exp: New file.
While running GDB under Valgrind, I noticed that if the very first
command entered is just <RET>, GDB accesses an uninitialized value:
$ valgrind ./gdb -q -nx
==26790== Memcheck, a memory error detector
==26790== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==26790== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright info
==26790== Command: ./gdb -q -nx
==26790==
(gdb)
==26790== Conditional jump or move depends on uninitialised value(s)
==26790== at 0x619DFC: command_line_handler (event-top.c:588)
==26790== by 0x7813D5: rl_callback_read_char (callback.c:220)
==26790== by 0x6194B4: rl_callback_read_char_wrapper (event-top.c:166)
==26790== by 0x61988A: stdin_event_handler (event-top.c:372)
==26790== by 0x61847D: handle_file_event (event-loop.c:762)
==26790== by 0x617964: process_event (event-loop.c:339)
==26790== by 0x617A2B: gdb_do_one_event (event-loop.c:403)
==26790== by 0x617A7B: start_event_loop (event-loop.c:428)
==26790== by 0x6194E6: cli_command_loop (event-top.c:181)
==26790== by 0x60F86B: current_interp_command_loop (interps.c:317)
==26790== by 0x610A34: captured_command_loop (main.c:321)
==26790== by 0x60C728: catch_errors (exceptions.c:237)
==26790==
(gdb)
It's this check here:
/* If we just got an empty line, and that is supposed to repeat the
previous command, return the value in the global buffer. */
if (repeat && p == linebuffer && *p != '\\')
{
The problem is that linebuffer's contents were never initialized at
this point.
gdb/
2014-10-29 Pedro Alves <palves@redhat.com>
* event-top.c (command_line_handler): Clear the first byte of
linebuffer, when it is first allocated.
exiting instead of throwing an error. E.g.:
$ TERM=foo gdb
(gdb) layout asm
Error opening terminal: foo.
$
The problem is that we're calling initscr to initialize the screen.
As mentioned in
http://pubs.opengroup.org/onlinepubs/7908799/xcurses/initscr.html:
If errors occur, initscr() writes an appropriate error message to
standard error and exits.
^^^^^
Instead, we should use newterm:
"A program that needs an indication of error conditions, so it can
continue to run in a line-oriented mode if the terminal cannot support
a screen-oriented program, would also use this function."
After the patch:
$ TERM=foo gdb -q -nx
(gdb) layout asm
Cannot enable the TUI: error opening terminal [TERM=foo]
(gdb)
And then PR tui/17519 is about GDB not validating whether the terminal
has the necessary capabilities when enabling the TUI. If one tries to
enable the TUI with TERM=dumb (and e.g., from a shell within emacs),
GDB ends up with a clear screen, the cursor is placed at the
bottom/right corner of the screen, there's no prompt, typing shows no
echo, and there's no indication of what's going on. c-x,a gets you
out of the TUI, but it's completely non-obvious.
After the patch, we get:
$ TERM=dumb gdb -q -nx
(gdb) layout asm
Cannot enable the TUI: terminal doesn't support cursor addressing [TERM=dumb]
(gdb)
While at it, I've moved all the tui_allowed_p validation to
tui_enable, and expanded the error messages. Previously we'd get:
$ gdb -q -nx -i=mi
(gdb)
layout asm
&"layout asm\n"
&"TUI mode not allowed\n"
^error,msg="TUI mode not allowed"
and:
$ gdb -q -nx -ex "layout asm" > foo
TUI mode not allowed
While now we get:
$ gdb -q -nx -i=mi
(gdb)
layout asm
&"layout asm\n"
&"Cannot enable the TUI when the interpreter is 'mi'\n"
^error,msg="Cannot enable the TUI when the interpreter is 'mi'"
(gdb)
and:
$ gdb -q -nx -ex "layout asm" > foo
Cannot enable the TUI when output is not a terminal
Tested on x86_64 Fedora 20.
gdb/
2014-10-29 Pedro Alves <palves@redhat.com>
PR tui/16138
PR tui/17519
* tui/tui-interp.c (tui_is_toplevel): Delete global.
(tui_allowed_p): Delete function.
* tui/tui.c: Include "interps.h".
(tui_enable): Don't use tui_allowed_p. Error out here with
detailed error messages if the TUI is the top level interpreter,
or if output is not a terminal. Use newterm instead of initscr,
and error out if initializing the terminal fails. Also error out if
the terminal doesn't support cursor addressing.
* tui/tui.h (tui_allowed_p): Delete declaration.
I noticed that with:
$ TERM=dumb ./gdb -q -nx
<c-x,a>
Cannot enable the TUI: terminal doesn't support cursor addressing [TERM=dumb]
(gdb)
The next key the user types is silently eaten.
The problem is that we're throwing an exception while in a readline
callback that isn't prepared for that:
(top-gdb) bt
#0 tui_enable () at /home/pedro/gdb/mygit/build/../src/gdb/tui/tui.c:388
#1 0x000000000051f47b in tui_rl_switch_mode (notused1=1, notused2=1) at /home/pedro/gdb/mygit/build/../src/gdb/tui/tui.c:101
#2 0x0000000000768d6f in _rl_dispatch_subseq (key=1, map=0xd069c0 <emacs_ctlx_keymap>, got_subseq=0) at /home/pedro/gdb/mygit/build/../src/readline/readline.c:774
#3 0x0000000000768acb in _rl_dispatch_callback (cxt=0x1ce6190) at /home/pedro/gdb/mygit/build/../src/readline/readline.c:686
#4 0x000000000078120b in rl_callback_read_char () at /home/pedro/gdb/mygit/build/../src/readline/callback.c:170
#5 0x0000000000619445 in rl_callback_read_char_wrapper (client_data=0x0) at /home/pedro/gdb/mygit/build/../src/gdb/event-top.c:166
#6 0x000000000061981b in stdin_event_handler (error=0, client_data=0x0) at /home/pedro/gdb/mygit/build/../src/gdb/event-top.c:372
#7 0x000000000061840e in handle_file_event (data=...) at /home/pedro/gdb/mygit/build/../src/gdb/event-loop.c:762
#8 0x00000000006178f5 in process_event () at /home/pedro/gdb/mygit/build/../src/gdb/event-loop.c:339
#9 0x00000000006179bc in gdb_do_one_event () at /home/pedro/gdb/mygit/build/../src/gdb/event-loop.c:403
#10 0x0000000000617a0c in start_event_loop () at /home/pedro/gdb/mygit/build/../src/gdb/event-loop.c:428
Here, in _rl_dispatch_subseq:
769
770 rl_executing_keymap = map;
771
772 rl_dispatching = 1;
773 RL_SETSTATE(RL_STATE_DISPATCHING);
774 (*map[key].function)(rl_numeric_arg * rl_arg_sign, key);
775 RL_UNSETSTATE(RL_STATE_DISPATCHING);
776 rl_dispatching = 0;
777
778 /* If we have input pending, then the last command was a prefix
779 command. Don't change the state of rl_last_func. Otherwise,
GDB is called from line 774, but longjmp'ing at that point leaves
rl_dispatching and RL_STATE_DISPATCHING set.
Fix this by wrapping tui_rl_switch_mode in a TRY_CATCH.
gdb/
2014-10-29 Pedro Alves <palves@redhat.com>
* tui/tui.c (tui_rl_switch_mode): Wrap tui_enable/tui_disable in
TRY_CATCH.
PR tui/16138 is about failure to initialize curses resulting in GDB
exiting instead of throwing an error. E.g.:
$ TERM=foo gdb
(gdb) layout asm
Error opening terminal: foo.
$
The problem is that we're calling initscr to initialize the screen.
As mentioned in
http://pubs.opengroup.org/onlinepubs/7908799/xcurses/initscr.html:
If errors occur, initscr() writes an appropriate error message to
standard error and exits.
^^^^^
Instead, we should use newterm:
"A program that needs an indication of error conditions, so it can
continue to run in a line-oriented mode if the terminal cannot support
a screen-oriented program, would also use this function."
After the patch:
$ TERM=foo gdb -q -nx
(gdb) layout asm
Cannot enable the TUI: error opening terminal [TERM=foo]
(gdb)
And then PR tui/17519 is about GDB not validating whether the terminal
has the necessary capabilities when enabling the TUI. If one tries to
enable the TUI with TERM=dumb (and e.g., from a shell within emacs),
GDB ends up with a clear screen, the cursor is placed at the
bottom/right corner of the screen, there's no prompt, typing shows no
echo, and there's no indication of what's going on. c-x,a gets you
out of the TUI, but it's completely non-obvious.
After the patch, we get:
$ TERM=dumb gdb -q -nx
(gdb) layout asm
Cannot enable the TUI: terminal doesn't support cursor addressing [TERM=dumb]
(gdb)
While at it, I've moved all the tui_allowed_p validation to
tui_enable, and expanded the error messages. Previously we'd get:
$ gdb -q -nx -i=mi
(gdb)
layout asm
&"layout asm\n"
&"TUI mode not allowed\n"
^error,msg="TUI mode not allowed"
and:
$ gdb -q -nx -ex "layout asm" > foo
TUI mode not allowed
While now we get:
$ gdb -q -nx -i=mi
(gdb)
layout asm
&"layout asm\n"
&"Cannot enable the TUI when the interpreter is 'mi'\n"
^error,msg="Cannot enable the TUI when the interpreter is 'mi'"
(gdb)
and:
$ gdb -q -nx -ex "layout asm" > foo
Cannot enable the TUI when output is not a terminal
Tested on x86_64 Fedora 20.
gdb/
2014-10-29 Pedro Alves <palves@redhat.com>
PR tui/16138
PR tui/17519
* tui/tui-interp.c (tui_is_toplevel): Delete global.
(tui_allowed_p): Delete function.
* tui/tui.c: Include "interps.h".
(tui_enable): Don't use tui_allowed_p. Error out here with
detailed error messages if the TUI is the top level interpreter,
or if output is not a terminal. Use newterm instead of initscr,
and error out if initializing the terminal fails. Also error out if
the terminal doesn't support cursor addressing.
* tui/tui.h (tui_allowed_p): Delete declaration.
In gdb.base/fileio.c, some functions may depend on others. For
example, test_rename renames a file to one directory which is created
in test_system. That is means, if test_system fails, test_rename
fails too, which is not a good practise, IMO.
In test_system, system ("mkdir -p XX") is used to create directories
needed for test_rename. In this patch, we use dejagnu remote_exec
proc to create these directories on host.
In my gdb testing, mingw32 host and arm-none-eabi target, system
("mkdir -p XX") doesn't work properly (this issue can be addressed
separately), and this patch fixes the following fails.
FAIL: gdb.base/fileio.exp: Renaming a directory to a non-empty directory returns ENOTEMPTY or EEXIST
FAIL: gdb.base/fileio.exp: Unlink a file
FAIL: gdb.base/fileio.exp: Unlinking a file in a directory w/o write access returns EACCES
gdb/testsuite:
2014-10-29 Yao Qi <yao@codesourcery.com>
* gdb.base/fileio.exp: Make directories on host.
I see the following fail in fileio.exp on mingw32 host gdb,
rename 1: ret = -1, errno = 13^M
^M
Breakpoint 2, stop () at fileio.c:76^M
76 static void stop () {}^M
(gdb) FAIL: gdb.base/fileio.exp: Rename a file
the test fails to rename a file which is not expected. The previous
test test_write doesn't close the file, so the rename fails as a
result on Windows. This patch fixes it by closing file in test_write,
and the fail goes away.
rename 1: ret = 0, errno = 0 OK^M
^M
Breakpoint 2, stop () at fileio.c:76^M
76 static void stop () {}^M
(gdb) PASS: gdb.base/fileio.exp: Rename a file
gdb/testsuite:
2014-10-29 Yao Qi <yao@codesourcery.com>
* gdb.base/fileio.c (test_write): Close the file.
We are trying to insert a breakpoint on line 4 for the following
Ada code.
3 procedure STR is
4 XX : String (1 .. Blocks.Sz) := (others => 'X'); -- STOP
5 K : Integer;
6 begin
7 K := 13;
The code generated on ARM (-march=armv7-m) starts like this:
(gdb) disass str'address
Dump of assembler code for function _ada_str:
--# Line str.adb:3
0x08000014 <+0>: push {r4, r7, lr}
0x08000016 <+2>: sub sp, #28
0x08000018 <+4>: add r7, sp, #0
0x0800001a <+6>: mov r3, sp
0x0800001c <+8>: mov r4, r3
--# Line str.adb:4
0x0800001e <+10>: ldr r3, [pc, #84] ; (0x8000074 <_ada_str+96>)
0x08000020 <+12>: ldr r3, [r3, #0]
0x08000022 <+14>: str r3, [r7, #20]
0x08000024 <+16>: ldr r3, [r7, #20]
[...]
When computing the address related to str.adb:4, GDB correctly
resolves it to 0x0800001e first, but then considers the next
3 instructions as being part of the prologue because it thinks
they are part of stack-protector code. As a result, instead
of inserting the breakpoint at line 4, it skips those instruction
and consequently the rest of the instructions until the start
of the next line, which his line 7.
The stack-protector code is expected to start like this...
ldr Rn, .Label
....
.Lable:
.word __stack_chk_guard
... but the implementation actually accepts a sequence where
the ldr location points to an address for which there is no symbol.
It only aborts if the address points to a symbol which is not
__stack_chk_guard.
Since the __stack_chk_guard symbol is always expected to exist
when used (it lives in .dynsym), this patch fixes the issue by
requiring that the ldr gets the address of the __stack_chk_guard
symbol. If the address could not be resolved, then it rejects
the sequence as being stack-protector code.
gdb/ChangeLog:
* arm-tdep.c (arm_skip_stack_protector): Return early if
address loaded by first "ldr" instruction does not have
a corresponding minimal symbol. Update comment.
Tested on arm-eabi using AdaCore's testsuite.
Tested on arm-linux-gnueabi by Yao as well.
This patch fixes the bug in my patch skipping stack protector
https://www.sourceware.org/ml/gdb-patches/2010-12/msg00110.html
In my skipping stack protector patch, I misunderstood the constant vs.
immediate on instruction encodings, and treated immediate as constant
by mistake. The instruction 'ldr Rd, [PC, #immed]' loads the
address of __stack_chk_guard to Rd, and #immed is an offset from PC.
We should get the __stack_chk_guard from *(pc + #immed).
As a result of this mistake, arm_analyze_load_stack_chk_guard returns
the wrong address of __stack_chk_guard, and the symbol
__stack_chk_guard can't be found. However, we continue to match the
following instructions when symbol isn't found, so the code still
works. In other words, the code just matches the instruction pattern
without checking __stack_chk_guard symbol correctly.
Joel's patch <https://sourceware.org/ml/gdb-patches/2014-10/msg00605.html>
makes the heuristics stricter that we stop matching instructions if
symbol __stack_chk_guard isn't found. Then the bug is exposed. This
patch is to correct the load address computation for ldr instruction,
and it fixes some fails in gdb.mi/gdb792.exp on armv4t both arm and
thumb mode.
Regression tested on arm-linux-gnueabi target with
{armv4t, armv7-a} x {marm, mthumb} x {-fstack-protector,-fno-stack-protector}
gdb:
2014-10-29 Yao Qi <yao@codesourcery.com>
* arm-tdep.c (arm_analyze_load_stack_chk_guard): Compute the
loaded address correctly of ldr instruction.
TL;DR - if we step an instruction that is as long as
decr_pc_after_break (1-byte on x86) right after removing the
breakpoint at PC, in non-stop mode, adjust_pc_after_break adjusts the
PC, but it shouldn't.
In non-stop mode, when a breakpoint is removed, it is moved to the
"moribund locations" list. This is because other threads that are
running may have tripped on that breakpoint as well, and we haven't
heard about it. When a trap is reported, we check if perhaps it was
such a deleted breakpoint that caused the trap. If so, we also need
to adjust the PC (decr_pc_after_break).
Now, say that, on x86:
- a breakpoint was placed at an address where we have an instruction
of the same length as decr_pc_after_break on this arch (1 on x86).
- the breakpoint is removed, and thus put on the moribund locations
list.
- the thread is single-stepped.
As there's no breakpoint inserted at PC anymore, the single-step
actually executes the 1-byte instruction normally. GDB should _not_
adjust the PC for the resulting SIGTRAP. But, adjust_pc_after_break
confuses the step SIGTRAP reported for this single-step as being a
SIGTRAP for the moribund location of the breakpoint that used to be at
the previous PC, and so infrun applies the decr_pc_after_break
adjustment incorrectly.
The confusion comes from the special case mentioned in the comment:
static void
adjust_pc_after_break (struct execution_control_state *ecs)
{
...
As a special case, we could have hardware single-stepped a
software breakpoint. In this case (prev_pc == breakpoint_pc),
we also need to back up to the breakpoint address. */
if (thread_has_single_step_breakpoints_set (ecs->event_thread)
|| !ptid_equal (ecs->ptid, inferior_ptid)
|| !currently_stepping (ecs->event_thread)
|| (ecs->event_thread->stepped_breakpoint
&& ecs->event_thread->prev_pc == breakpoint_pc))
regcache_write_pc (regcache, breakpoint_pc);
The condition that incorrectly triggers is the
"ecs->event_thread->prev_pc == breakpoint_pc" one.
Afterwards, the next resume resume re-executes an instruction that had
already executed, which if you're lucky, results in the inferior
crashing. If you're unlucky, you'll get silent bad behavior...
The fix is to remember that we stepped a breakpoint. Turns out the
only case we step a breakpoint instruction today isn't covered by the
testsuite. It's the case of a 'handle nostop" signal arriving while a
step is in progress _and_ we have a software watchpoint, which forces
always single-stepping. This commit extends sigstep.exp to cover
that, and adds a new test for the adjust_pc_after_break issue.
Tested on x86_64 Fedora 20, native and gdbserver.
gdb/
2014-10-28 Pedro Alves <palves@redhat.com>
PR gdb/12623
* gdbthread.h (struct thread_info) <stepped_breakpoint>: New
field.
* infrun.c (resume) <stepping breakpoint instruction>: Set the
thread's stepped_breakpoint field. Skip if reverse debugging.
Add comment.
(init_thread_stepping_state, handle_signal_stop): Clear the
thread's stepped_breakpoint field.
gdb/testsuite/
2014-10-28 Pedro Alves <palves@redhat.com>
PR gdb/12623
* gdb.base/sigstep.c (no_handler): New global.
(main): If 'no_handler is true, set the signal handlers to
SIG_IGN.
* gdb.base/sigstep.exp (breakpoint_over_handler): Add
with_sw_watch and no_handler parameters. Handle them.
(top level) <stepping over handler when stopped at a breakpoint
test>: Add a test axis for testing with a software watchpoint, and
another for testing with the signal handler set to SIG_IGN.
* gdb.base/step-sw-breakpoint-adjust-pc.c: New file.
* gdb.base/step-sw-breakpoint-adjust-pc.exp: New file.
I noticed that when I single-step into a signal handler with a
pending/queued signal, the following single-steps while the program is
in the signal handler leave $eflags.TF set. That means subsequent
continues will trap after one instruction, resulting in a spurious
SIGTRAP being reported to the user.
This is a kernel bug; I've reported it to kernel devs (turned out to
be a known bug). I'm seeing it on x86_64 Fedora 20 (Linux
3.16.4-200.fc20.x86_64), and I was told it's still not fixed upstream.
This commit extends gdb.base/sigstep.exp to cover this use case,
xfailed.
Here's what the bug looks like:
(gdb) start
Temporary breakpoint 1, main () at si-handler.c:48
48 setup ();
(gdb) next
50 global = 0; /* set break here */
Let's queue a signal, so we can step into the handler:
(gdb) handle SIGUSR1
Signal Stop Print Pass to program Description
SIGUSR1 Yes Yes Yes User defined signal 1
(gdb) queue-signal SIGUSR1
TF is not set:
(gdb) display $eflags
1: $eflags = [ PF ZF IF ]
Now step into the handler -- "si" does PTRACE_SINGLESTEP+SIGUSR1:
(gdb) si
sigusr1_handler (sig=0) at si-handler.c:31
31 {
1: $eflags = [ PF ZF IF ]
No TF yet. But another single-step...
(gdb) si
0x0000000000400621 31 {
1: $eflags = [ PF ZF TF IF ]
... ends up with TF left set. This results in PTRACE_CONTINUE
trapping after each instruction is executed:
(gdb) c
Continuing.
Program received signal SIGTRAP, Trace/breakpoint trap.
0x0000000000400624 in sigusr1_handler (sig=0) at si-handler.c:31
31 {
1: $eflags = [ PF ZF TF IF ]
(gdb) c
Continuing.
Program received signal SIGTRAP, Trace/breakpoint trap.
sigusr1_handler (sig=10) at si-handler.c:32
32 global = 0;
1: $eflags = [ PF ZF TF IF ]
(gdb)
Note that even another PTRACE_SINGLESTEP does not fix it:
(gdb) si
33 }
1: $eflags = [ PF ZF TF IF ]
(gdb)
Eventually, it gets "fixed" by the rt_sigreturn syscall, when
returning out of the handler:
(gdb) bt
#0 sigusr1_handler (sig=10) at si-handler.c:33
#1 <signal handler called>
#2 main () at si-handler.c:50
(gdb) set disassemble-next-line on
(gdb) si
0x0000000000400632 33 }
0x0000000000400631 <sigusr1_handler+17>: 5d pop %rbp
=> 0x0000000000400632 <sigusr1_handler+18>: c3 retq
1: $eflags = [ PF ZF TF IF ]
(gdb)
<signal handler called>
=> 0x0000003b36a358f0 <__restore_rt+0>: 48 c7 c0 0f 00 00 00 mov $0xf,%rax
1: $eflags = [ PF ZF TF IF ]
(gdb) si
<signal handler called>
=> 0x0000003b36a358f7 <__restore_rt+7>: 0f 05 syscall
1: $eflags = [ PF ZF TF IF ]
(gdb)
main () at si-handler.c:50
50 global = 0; /* set break here */
=> 0x000000000040066b <main+9>: c7 05 cb 09 20 00 00 00 00 00 movl $0x0,0x2009cb(%rip) # 0x601040 <global>
1: $eflags = [ PF ZF IF ]
(gdb)
The bug doesn't happen if we instead PTRACE_CONTINUE into the signal
handler -- e.g., set a breakpoint in the handler, queue a signal, and
"continue".
gdb/testsuite/
2014-10-28 Pedro Alves <palves@redhat.com>
PR gdb/17511
* gdb.base/sigstep.c (handler): Add a few more writes to 'done'.
* gdb.base/sigstep.exp (other_handler_location): New global.
(advance): Support stepping into the signal handler, and running
commands while in the handler.
(in_handler_map): New global.
(top level): In the advance test, add combinations for getting
into the handler with stepping commands, and for running commands
in the handler. Add comment descripting the advancei tests.
PR binutils/17512
* elf.c (bfd_section_from_shdr): Allocate and free the recursion
detection table on a per-bfd basis.
* peXXigen.c (pe_print_edata): Handle binaries with a truncated
export table.
Hacking on sigstep.exp, I found it harder to understand and extend
than ideal.
- GDB is currently not restarted between the different
tests/combinations in the file, and some parts of the tests' setup
are done on the top level, and shared between tests. It's not
trivial to understand which breakpoints each test procedure expects
to be set or not set. And it's not trivial to disable parts of the
test if you want quickly try out just a subset of the tests
(running the whole file takes a bit).
- Because GDB is currently not restarted between tests, if some test
triggers a ptrace/kernel bug, the following tests may end up with
cascading fails. That makes it hard to add a test to cover a
kernel bug that isn't fixed yet, with a xfail/kfail. E.g,. note
how with kernels with bug gdb/8744 (stepi over sigreturn syscall
exits program) the test program exits, and nothing restarts it
afterwards...
- The manual test message prefix management gets a bit in the way.
Nowadays, we have with_test_prefix which makes it simpler.
- 'i' is used as parameter name in the various procedures, meaning
'the command the test', which isn't as obvious as it could.
This commit addresses all that.
gdb/testsuite/
2014-10-28 Pedro Alves <palves@redhat.com>
* gdb.base/sigstep.exp: Use build_executable instead of
prepare_for_testing.
(top level): Move code that starts GDB, runs to main and creates a
display to ...
(restart): ... this new procedure.
(top level): Move backtrace from signal handler test to ...
(validate_backtrace): ... this new procedure.
(advance, advancei): Rename parameter from 'i' to 'cmd'. Use
with_test_prefix. Always restart GDB.
(skip_to_handler): Rename parameter from 'i' to 'cmd'. Use
with_test_prefix. Always restart GDB. No need to delete
breakpoints after the test.
(test_skip_handler): Remove prefix parameter.
(skip_over_handler, breakpoint_to_handler)
(breakpoint_to_handler_entry, breakpoint_over_handler): Rename
parameter from 'i' to 'cmd'. Use with_test_prefix. Always
restart GDB. No need to delete breakpoints after the test.
(top level): Use foreach to call the test procedures with
different commands.
This makes it easier to find the bugs in Bugzilla.
gdb/testsuite/
2014-10-28 Pedro Alves <palves@redhat.com>
* gdb.base/sigaltstack.exp: Update to use Bugzilla bug numbers
instead of GNATS numbers.
* gdb.base/sigbpt.exp: Likewise.
* gdb.base/siginfo.exp: Likewise.
* gdb.base/sigstep.exp: Likewise.
In https://sourceware.org/ml/gdb-patches/2014-10/msg00652.html, Sandra
shows a target that was broken by the recent update_thread_list
optimization:
(gdb) target remote qa8-centos32-cs:10514
...
(gdb) continue
Continuing.
Cannot execute this command without a live selected thread.
(gdb)
The error means that the current thread is in "exited" state when the
continue command is processed. The root of the problem was found
here:
> Sending packet: $Hg0#df...Packet received:
...
> Sending packet: $?#3f...Packet received: S00
> Sending packet: $qfThreadInfo#bb...Packet received: l
> Sending packet: $Hc-1#09...Packet received:
> Sending packet: $qC#b4...Packet received: unset
This target doesn't really support threads (no thread indication in
stop reply packets; no support for qC), but then supports
qfThreadInfo, and returns an empty thread list to GDB.
See https://sourceware.org/ml/gdb-patches/2014-10/msg00665.html for
why the target does that.
As remote_update_thread_list deletes threads from GDB's list that are
not found in the thread list that the target reports, the result is
that GDB deletes the "fake" main thread that GDB added itself. (As
that thread is currently selected, it is marked "exited" instead of
being deleted straight away.)
This commit avoids deleting the main thread in this scenario.
gdb/
2014-10-27 Pedro Alves <palves@redhat.com>
* remote.c (remote_thread_alive): New, factored out from ...
(remote_thread_alive): ... this.
(remote_update_thread_list): Bail out before deleting threads if
the target returned an empty list, and, the current thread has a
magic/fake ptid.