UNIX specifies that signal dispositions and masks get inherited to child processes, but in general, programs are not very robust to being started with non-default signal dispositions or to signals being blocked. For example, libstd sets `SIGPIPE` to be ignored, on the grounds that Rust code using libstd will get the `EPIPE` errno and handle it correctly. But shell pipelines are built around the assumption that `SIGPIPE` will have its default behavior of killing the process, so that things like `head` work:
```
geofft@titan:/tmp$ for i in `seq 1 20`; do echo "$i"; done | head -1
1
geofft@titan:/tmp$ cat bash.rs
fn main() {
std::process::Command::new("bash").status();
}
geofft@titan:/tmp$ ./bash
geofft@titan:/tmp$ for i in `seq 1 20`; do echo "$i"; done | head -1
1
bash: echo: write error: Broken pipe
bash: echo: write error: Broken pipe
bash: echo: write error: Broken pipe
bash: echo: write error: Broken pipe
bash: echo: write error: Broken pipe
[...]
```
Here, `head` is supposed to terminate the input process quietly, but the bash subshell has inherited the ignored disposition of `SIGPIPE` from its Rust grandparent process. So it gets a bunch of `EPIPE`s that it doesn't know what to do with, and treats it as a generic, transient error. You can see similar behavior with `find / | head`, `yes | head`, etc.
This PR resets Rust's `SIGPIPE` handler, as well as any signal mask that may have been set, before spawning a child. Setting a signal mask, and then using a dedicated thread or something like `signalfd` to dequeue signals, is one of two reasonable ways for a library to process signals. See carllerche/mio#16 for more discussion about this approach to signal handling and why it needs a change to `std::process`. The other approach is for the library to set a signal-handling function (`signal()` / `sigaction()`): in that case, dispositions are reset to the default behavior on exec (since the function pointer isn't valid across exec), so we don't have to care about that here.
As part of this PR, I noticed that we had two somewhat-overlapping sets of bindings to signal functionality in `libstd`. One dated to old-IO and probably the old runtime, and was mostly unused. The other is currently used by `stack_overflow.rs`. I consolidated the two bindings into one set, and double-checked them by hand against all supported platforms' headers. This probably means it's safe to enable `stack_overflow.rs` on more targets, but I'm not including such a change in this PR.
r? @alexcrichton
cc @Zoxc for changes to `stack_overflow.rs`
Since the "Book" already avoids jQuery in its inline script tags and playpen.js is tiny, I figured I would convert it to plain old JS as well.
Side note: This is a separate issue, but another thing I noticed in my testing is that the "⇱" character doesn't display correctly in Chrome on Windows 7. (Firefox and IE work fine; other browsers not tested)
r? @steveklabnik
Edit: Github didn't like the "script" tag above
Edit 2: Actually, now IE seems to render "⇱" fine for me. Odd.
signal(), sigemptyset(), and sigaddset() are only available as inline
functions until Android API 21. liblibc already handles signal()
appropriately, so drop it from c.rs; translate sigemptyset() and
sigaddset() (which is only used in a test) by hand from the C inlines.
We probably want to revert this commit when we bump Android API level.
Make sure that child processes don't get affected by libstd's desire to
ignore SIGPIPE, nor a third-party library's signal mask (which is needed
to use either a signal-handling thread correctly or to use signalfd /
kqueue correctly).
Both c.rs and stack_overflow.rs had bindings of libc's signal-handling
routines. It looks like the split dated from #16388, when (what is now)
c.rs was in libnative but not libgreen. Nobody is currently using the
c.rs bindings, but they're a bit more accurate in some places.
Move everything to c.rs (since I'll need signal handling in process.rs,
and we should avoid duplication), clean up the bindings, and manually
double-check everything against the relevant system headers (fixing a
few things in the process).
It looks like a lot of this dated to previous incarnations of the io
module, etc., and went unused in the reworking leading up to 1.0. Remove
everything we're not actively using (except for signal handling, which
will be reworked in the next commit).
This makes them compliant with the new version of RFC 401 (i.e.
RFC 1052).
Fixes#26391. I *hope* the tests I have are enough.
This is a [breaking-change]
r? @nrc
Previously it also tried to find out the best way to translate the
expression, which could ICE during type-checking.
Fixes#23173Fixes#24322Fixes#25757
r? @eddyb
The `execv` family of functions and `getopt` are prototyped somewhat strangely in C: they take a `char *const argv[]` (and `envp`, for `execve`), an immutable array of mutable C strings -- in other words, a `char *const *argv` or `argv: *const *mut c_char`. The current Rust binding uses `*mut *const c_char`, which is backwards (a mutable array of constant C strings).
That said, these functions do not actually modify their arguments. Once upon a time, C didn't have `const`, and to this day, string literals in C have type `char *` (`*mut c_char`). So an array of string literals has type `char * []`, equivalent to `char **` in a function parameter (Rust `*mut *mut c_char`). C allows an implicit cast from `T **` to `T * const *` (`*const *mut T`) but not to `const T * const *` (`*const *const T`). Therefore, prototyping `execv` as taking `const char * const argv[]` would have broken existing code (by requiring an explicit cast), despite being more correct. So, even though these functions don't need mutable data, they're prototyped as if they do.
While it's theoretically possible that an implementation could choose to use its freedom to modify the mutable data, such an implementation would break the innumerable users of `execv`-family functions that call them with string literals. Such an implementation would also break `std::process`, which currently works around this with an unsafe `as *mut _` cast, and assumes that `execvp` secretly does not modify its argument. Furthermore, there's nothing useful to be gained by being able to write to the strings in `argv` themselves but not being able to write to the array containing those strings (you can't reorder arguments, add arguments, increase the length of any parameter, etc.).
So, despite the C prototype with its legacy C problems, it's simpler for everyone for Rust to consider these functions as taking `*const *const c_char`, and it's also safe to do so. Rust does not need to expose the mistakes of C here. This patch makes that change, and drops the unsafe cast in `std::process` since it's now unnecessary.
The documentation of `std::net::Shutdown` mentions says it can be passed to the `shutdown` method of `UdpSocket`, which isn't true because `UdpSocket` has no such method. This commit removes that mention.
Fixes https://github.com/rust-lang/rust/issues/26408
Example output when using `Result::unwrap`:
```
thread '<main>' panicked at 'called `Result::unwrap()` on an `Err` value: Error { repr: Os { code: 2, message: "The system cannot find the file specified.\r\n" } }', src/libcore\result.rs:731
```
This could technically be considered a breaking change for any code that depends on the format of the `Debug` output for `io::Error` but I sincerely hope nobody wrote code like that.
In #26252 support was added to have prettier paths printed out on failure by not
passing the full path to the source file to the compiler, but instead just a
small relative path. To preserve this relative path across configurations, the
`SREL` variable was used for reconfiguring, but if `SREL` is empty then it will
attempt to run the command `configure` which is distinct from running
`./configure` (e.g. doesn't run the local script).
This commit modifies the `SREL` value to re-run the configure script by setting
it to `./` in the case where `SREL` is empty.
When overflow checking on `<<` and `>>` was added for integers, the `<<` and `>>` operations broke for SIMD types (`u32x4`, `i16x8`, etc.). This PR implements checked shifts on SIMD types.
Fixes#24258.
This has a number of advantages compared to creating a copy in memory
and passing a pointer. The obvious one is that we don't have to put the
data into memory but can keep it in registers. Since we're currently
passing a pointer anyway (instead of using e.g. a known offset on the
stack, which is what the `byval` attribute would achieve), we only use a
single additional register for each fat pointer, but save at least two
pointers worth of stack in exchange (sometimes more because more than
one copy gets eliminated). On archs that pass arguments on the stack, we
save a pointer worth of stack even without considering the omitted
copies.
Additionally, LLVM can optimize the code a lot better, to a large degree
due to the fact that lots of copies are gone or can be optimized away.
Additionally, we can now emit attributes like nonnull on the data and/or
vtable pointers contained in the fat pointer, potentially allowing for
even more optimizations.
This results in LLVM passes being about 3-7% faster (depending on the
crate), and the resulting code is also a few percent smaller, for
example:
|text|data|filename|
|----|----|--------|
|5671479|3941461|before/librustc-d8ace771.so|
|5447663|3905745|after/librustc-d8ace771.so|
| | | |
|1944425|2394024|before/libstd-d8ace771.so|
|1896769|2387610|after/libstd-d8ace771.so|
I had to remove a call in the backtrace-debuginfo test, because LLVM can
now merge the tails of some blocks when optimizations are turned on,
which can't correctly preserve line info.
Fixes#22924
Cc #22891 (at least for fat pointers the code is good now)
This has a number of advantages compared to creating a copy in memory
and passing a pointer. The obvious one is that we don't have to put the
data into memory but can keep it in registers. Since we're currently
passing a pointer anyway (instead of using e.g. a known offset on the
stack, which is what the `byval` attribute would achieve), we only use a
single additional register for each fat pointer, but save at least two
pointers worth of stack in exchange (sometimes more because more than
one copy gets eliminated). On archs that pass arguments on the stack, we
save a pointer worth of stack even without considering the omitted
copies.
Additionally, LLVM can optimize the code a lot better, to a large degree
due to the fact that lots of copies are gone or can be optimized away.
Additionally, we can now emit attributes like nonnull on the data and/or
vtable pointers contained in the fat pointer, potentially allowing for
even more optimizations.
This results in LLVM passes being about 3-7% faster (depending on the
crate), and the resulting code is also a few percent smaller, for
example:
text data filename
5671479 3941461 before/librustc-d8ace771.so
5447663 3905745 after/librustc-d8ace771.so
1944425 2394024 before/libstd-d8ace771.so
1896769 2387610 after/libstd-d8ace771.so
I had to remove a call in the backtrace-debuginfo test, because LLVM can
now merge the tails of some blocks when optimizations are turned on,
which can't correctly preserve line info.
Fixes#22924
Cc #22891 (at least for fat pointers the code is good now)
As per #26009 this PR implements a new collation system for extended-error metadata. I've tried to keep it as simple as possible for now, so there's no uniqueness checking and minimal modularity.
Although using a lint was discussed in #26009 I decided against this because it would require converting the AST output from the plugin back into an internal data-structure. Emitting the metadata from within the plugin prevents this double-handling. We also didn't identify this as the source of the failures last time, although something untoward was definitely happening... With that in mind I would like as much feedback as possible on this before it's merged, I don't want to break the bots again!
I've successfully built for my host architecture and I'm building an ARM cross-compiler now to test my assumptions about the various `CFG` variables. Despite the confusing name of `CFG_COMPILER_HOST_TRIPLE` it is actually the compile time target triple, as explained in `mk/target.mk`.
```
# This is the compile-time target-triple for the compiler. For the compiler at
# runtime, this should be considered the host-triple. More explanation for why
# this exists can be found on issue #2400
export CFG_COMPILER_HOST_TRIPLE
```
CC @pnkfelix @brson @nrc @alexcrichton
Closes#25705, closes#26009.
It now says '#[feature] may not be used on the stable release channel'.
I had to convert this error from a lint to a normal compiler error.
I left the lint previously-used for this in place since removing it is
a breaking change. It will just go unused until the end of time.
Fixes#24125
Environment variables are global state so this can lead to surprising results if
the driver is called in a multithreaded environment (e.g. doctests). There
shouldn't be any memory corruption that's possible, but a lot of the bots have
been failing because they can't find `cc` or `gcc` in the path during doctests,
and I highly suspect that it is due to the compiler modifying `PATH` in a
multithreaded fashion.
This commit moves the logic for appending to `PATH` to only affect the child
process instead of also affecting the parent, at least for the linking stage.
When loading dynamic libraries the compiler still modifies `PATH` on Windows,
but this may be more difficult to fix than spawning off a new process.
The execv family of functions do not modify their arguments, so they do
not need mutable pointers. The C prototypes take a constant array of
mutable C-strings, but that's a legacy quirk from before C had const
(since C string literals have type `char *`). The Rust prototypes had
`*mut` in the wrong place, anyway: to match the C prototypes, it should
have been `*const *mut c_char`. But it is safe to pass constant strings
(like string literals) to these functions.
getopt is a special case, since GNU getopt modifies its arguments
despite the `const` claim in the prototype. It is apparently only
well-defined to call getopt on the actual argc and argv parameters
passed to main, anyway. Change it to take `*mut *mut c_char` for an
attempt at safety, but probably nobody should be using it from Rust,
since there's no great way to get at the parameters as passed to main.
Also fix the one caller of execvp in libstd, which now no longer needs
an unsafe cast.
Fixes#16290.