Fix crashes due to python GIL released too early
When running GDB tests under Valgrind, various tests are failing due to invalid memory access. Here is the stack trace reported by Valgrind, for gdb.base/freebpcmd.exp : ==18658== Invalid read of size 8 ==18658== at 0x7F9107: is_main (signalmodule.c:195) ==18658== by 0x7F9107: PyOS_InterruptOccurred (signalmodule.c:1730) ==18658== by 0x3696E2: check_quit_flag() (extension.c:829) ==18658== by 0x36980B: restore_active_ext_lang(active_ext_lang_state*) (extension.c:782) ==18658== by 0x48F617: gdbpy_enter::~gdbpy_enter() (python.c:235) ==18658== by 0x47BB71: add_thread_object(thread_info*) (object.h:470) ==18658== by 0x53A84D: operator() (std_function.h:687) ==18658== by 0x53A84D: notify (observable.h:106) ==18658== by 0x53A84D: add_thread_silent(ptid_t) (thread.c:311) ==18658== by 0x3CD954: inf_ptrace_target::create_inferior(char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const& , char**, int) (inf-ptrace.c:139) ==18658== by 0x3FE644: linux_nat_target::create_inferior(char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, char**, int) (linux-nat.c:1094) ==18658== by 0x3D5727: run_command_1(char const*, int, run_how) (infcmd.c:633) ==18658== by 0x2C05D1: cmd_func(cmd_list_element*, char const*, int) (cli-decode.c:1948) ==18658== by 0x53F29F: execute_command(char const*, int) (top.c:639) ==18658== by 0x3638EB: command_handler(char const*) (event-top.c:586) ==18658== by 0x36468C: command_line_handler(std::unique_ptr<char, gdb::xfree_deleter<char> >&&) (event-top.c:771) ==18658== by 0x36407C: gdb_rl_callback_handler(char*) (event-top.c:217) ==18658== by 0x5B2A1F: rl_callback_read_char (callback.c:281) ==18658== by 0x36346D: gdb_rl_callback_read_char_wrapper_noexcept() (event-top.c:175) ==18658== by 0x363F70: gdb_rl_callback_read_char_wrapper(void*) (event-top.c:192) ==18658== by 0x3633AF: stdin_event_handler(int, void*) (event-top.c:514) ==18658== by 0x362504: gdb_wait_for_event (event-loop.c:857) ==18658== by 0x362504: gdb_wait_for_event(int) (event-loop.c:744) ==18658== by 0x362676: gdb_do_one_event() [clone .part.11] (event-loop.c:321) ==18658== by 0x3627AD: gdb_do_one_event (event-loop.c:303) ==18658== by 0x3627AD: start_event_loop() (event-loop.c:370) ==18658== by 0x41D35A: captured_command_loop() (main.c:381) ==18658== by 0x41F2A4: captured_main (main.c:1224) ==18658== by 0x41F2A4: gdb_main(captured_main_args*) (main.c:1239) ==18658== by 0x227D0A: main (gdb.c:32) ==18658== Address 0x10 is not stack'd, malloc'd or (recently) free'd The problem seems to be created by gdbpy_enter::~gdbpy_enter () releasing the GIL lock too early: ~gdbpy_enter () does: ... PyGILState_Release (m_state); python_gdbarch = m_gdbarch; python_language = m_language; restore_active_ext_lang (m_previous_active); } So, it releases the GIL lock, does 2 assignments and then leads to the following call sequence: restore_active_ext_lang => check_quit_flag => python.c gdbpy_check_quit_flag => PyOS_InterruptOccurred => is_main. is_main code is: static int is_main(_PyRuntimeState *runtime) { unsigned long thread = PyThread_get_thread_ident(); PyInterpreterState *interp = _PyRuntimeState_GetThreadState(runtime)->interp; return (thread == runtime->main_thread && interp == runtime->interpreters.main); } The macros and functions to access the thread state are documented as: /* Variable and macro for in-line access to current thread and interpreter state */ #define _PyRuntimeState_GetThreadState(runtime) \ ((PyThreadState*)_Py_atomic_load_relaxed(&(runtime)->gilstate.tstate_current)) /* Get the current Python thread state. Efficient macro reading directly the 'gilstate.tstate_current' atomic variable. The macro is unsafe: it does not check for error and it can return NULL. The caller must hold the GIL. See also PyThreadState_Get() and PyThreadState_GET(). */ #define _PyThreadState_GET() _PyRuntimeState_GetThreadState(&_PyRuntime) So, we see that GDB releases the GIL and then potentially calls _PyRuntimeState_GetThreadState that needs the GIL. It is not very clear why the problem is only observed when running under Valgrind. Probably caused by the slowdown due to Valgrind and/or to the 'single thread' scheduling by Valgrind. This patch fixes the crashes by releasing the GIT lock later. 2019-11-26 Philippe Waroquiers <philippe.waroquiers@skynet.be> * python/python.c (gdbpy_enter::~gdbpy_enter): Release GIL after restore_active_ext_lang, as GIL is needed for (indirectly) called PyOS_InterruptOccurred.
This commit is contained in:
parent
cadc9cb888
commit
aa36950904
@ -1,3 +1,9 @@
|
||||
2019-11-26 Philippe Waroquiers <philippe.waroquiers@skynet.be>
|
||||
|
||||
* python/python.c (gdbpy_enter::~gdbpy_enter): Release GIL after
|
||||
restore_active_ext_lang, as GIL is needed for (indirectly)
|
||||
called PyOS_InterruptOccurred.
|
||||
|
||||
2019-11-26 Simon Marchi <simon.marchi@efficios.com>
|
||||
|
||||
* sparc-nat.c (sparc_xfer_wcookie): Sync declaration with
|
||||
|
@ -228,11 +228,11 @@ gdbpy_enter::~gdbpy_enter ()
|
||||
|
||||
m_error->restore ();
|
||||
|
||||
PyGILState_Release (m_state);
|
||||
python_gdbarch = m_gdbarch;
|
||||
python_language = m_language;
|
||||
|
||||
restore_active_ext_lang (m_previous_active);
|
||||
PyGILState_Release (m_state);
|
||||
}
|
||||
|
||||
/* Set the quit flag. */
|
||||
|
Loading…
x
Reference in New Issue
Block a user