Simplify HashMap layout calculation by using Layout
`RawTable` uses a single allocation to hold both the array of hashes and the array of key/value pairs. This PR changes `RawTable` to use `Layout` when calculating the amount of memory to allocate instead of performing the calculation manually.
r? @SimonSapin
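As a rough illustration of the idea (not the actual `RawTable` code), the combined layout of two back-to-back arrays can be derived with `Layout::array` and `Layout::extend` instead of hand-written size and alignment arithmetic. The helper name `table_layout` and the use of `u64` for the hash type are assumptions made for this sketch.

```rust
use std::alloc::{Layout, LayoutError};

/// Hypothetical helper: computes the layout of one allocation holding
/// `capacity` hashes followed by `capacity` key/value pairs.
/// Returns the combined layout and the byte offset of the pairs array.
fn table_layout<K, V>(capacity: usize) -> Result<(Layout, usize), LayoutError> {
    // Assumption: hashes are stored as `u64` in this sketch.
    let hashes = Layout::array::<u64>(capacity)?;
    let pairs = Layout::array::<(K, V)>(capacity)?;
    // `extend` adds any padding needed to align `pairs` after `hashes`
    // and reports overflow as an error instead of wrapping.
    hashes.extend(pairs)
}

fn main() {
    let (layout, pairs_offset) = table_layout::<u32, u32>(4).unwrap();
    println!(
        "size = {}, align = {}, pairs start at byte {}",
        layout.size(),
        layout.align(),
        pairs_offset
    );
}
```

The returned offset is what a table would use to locate the pairs array inside the single allocation.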
optimize joining for slices
This improves the speed of string joining by up to 3x.
It removes the boolean flag check on every iteration, eliminates repeated bounds checks, and adds fast paths for small separators up to a length of 4 bytes.
These optimizations gave me ~10%, ~50% and ~80% improvements respectively over the previous speed. Those are multiplicative.
The 3x improvement occurs in the optimal case of joining many small strings together in my microbenchmarks. Improvements flatten out for larger strings, of course, as more time is spent copying bits around. I've run a few benchmarks [with this code](https://github.com/Emerentius/join_bench). They are pretty noisy despite high iteration counts, but overall one can see the trends.
```
len_separator len_string n_strings speedup
4 10 10 2.38
4 10 100 3.41
4 10 1000 3.43
4 10 10000 3.25
4 100 10 2.23
4 100 100 2.73
4 100 1000 1.33
4 100 10000 1.14
4 1000 10 1.33
4 1000 100 1.15
4 1000 1000 1.08
4 1000 10000 1.04
10 10 10 1.61
10 10 100 1.74
10 10 1000 1.77
10 10 10000 1.75
10 100 10 1.58
10 100 100 1.65
10 100 1000 1.24
10 100 10000 1.12
10 1000 10 1.23
10 1000 100 1.11
10 1000 1000 1.05
10 1000 10000 0.997
100 10 10 1.66
100 10 100 1.78
100 10 1000 1.28
100 10 10000 1.16
100 100 10 1.37
100 100 100 1.26
100 100 1000 1.09
100 100 10000 1.0
100 1000 10 1.19
100 1000 100 1.12
100 1000 1000 1.05
100 1000 10000 1.12
```
The string joining with small or empty separators is now ~50% faster than the old concatenation (small strings). The same approach can also improve the performance of joining into vectors.
If this approach is acceptable, I can apply it for concatenation and for vectors as well. Alternatively, concat could just call `.join("")`.
old tests cover the new fast path of str joining already
this adds tests for joining into Strings with long separators (>4 byte) and
for joining into Vec<T>, T: Clone + !Copy. Vec<T: Copy> will be
specialised when specialisation type inference bugs are fixed.
for both Vec<T> and String
- eliminates the boolean first flag in fn join()
for String only
- eliminates repeated bounds checks in join(), concat()
- adds fast paths for small string separators up to a len of 4 bytes
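A minimal safe sketch of the first point, removing the "first element" flag, might look like the following. It is not the code from this change, which also avoids bounds checks and special-cases short separators; the function name `join_strs` is hypothetical.

```rust
/// Hypothetical sketch of a flag-free join: the first element is handled
/// before the loop, so the loop body never checks "is this the first item?".
fn join_strs(parts: &[&str], sep: &str) -> String {
    let mut iter = parts.iter();
    let first = match iter.next() {
        Some(first) => first,
        None => return String::new(),
    };
    // Reserve the exact final length up front so the buffer never reallocates.
    let total = parts.iter().map(|s| s.len()).sum::<usize>()
        + sep.len() * (parts.len() - 1);
    let mut out = String::with_capacity(total);
    out.push_str(first);
    // Every remaining element is preceded by exactly one separator.
    for part in iter {
        out.push_str(sep);
        out.push_str(part);
    }
    out
}

fn main() {
    println!("{}", join_strs(&["a", "b", "c"], ", "));
}
```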
Make the OOM hook return `()` rather than `!`
Per discussion in https://github.com/rust-lang/rust/issues/51245#issuecomment-393651083
This allows more flexibility in what can be done with the API. This also
splits `rtabort!` into `dumb_print` happening in the default hook and
`abort_internal`, happening in the actual oom handler after calling the
hook. Registering an empty function thus makes the oom handler not print
anything but still abort.
Cc: @alexcrichton
Make const decoding thread-safe.
This is an alternative to https://github.com/rust-lang/rust/pull/50957. It's a proof of concept (e.g. it doesn't adapt metadata decoding, just the incr. comp. cache) but I think it turned out nice. It's rather simple and does not require passing around a bunch of weird closures, like we currently do.
If you (@Zoxc & @oli-obk) think this approach is good then I'm happy to finish and clean this up.
Note: The current version just spins when it encounters an in-progress decoding. I don't have a strong preference for this approach. Decoding concurrently is equally fine by me (or maybe even better because it doesn't require poisoning).
r? @Zoxc
Make some std::intrinsics `const fn`s
This makes some rustc intrinsics (`ctpop`, `cttz`, `ctlz` and `bswap`) `const fn`s.
This is a preliminary step towards making `swap_bytes`, `to_be` and `from_be` constant functions, which would be ergonomic and useful in itself. Even better, it would allow `Ipv4Addr::new` etc. to become `const fn`s as well. That might be really useful, since I find it quite common to want to define addresses as constants.
r? @oli-obk
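To illustrate the end goal, the following sketch assumes a toolchain on which `swap_bytes`, `to_be` and `Ipv4Addr::new` are already `const fn`, which was not yet the case when this PR was opened. The constant names are made up for the example.

```rust
use std::net::Ipv4Addr;

// Each initializer is evaluated at compile time.
const SWAPPED: u32 = 0x1234_5678u32.swap_bytes();
const ONE_BE: u32 = 1u32.to_be();
const LOCALHOST: Ipv4Addr = Ipv4Addr::new(127, 0, 0, 1);

fn main() {
    println!("{:#010x}", SWAPPED);
    println!("{}", u32::from_be(ONE_BE));
    println!("{}", LOCALHOST);
}
```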
Arc downcast
Implement `downcast` for `Arc<Any + Send + Sync>` as part of #44608, and gated by the same `rc_downcast` feature.
This PR is mostly lightly-edited cut'n'paste.
This has two additional changes:
- The `downcast` implementation needs `Any + Send + Sync` implementations for `is` and `Debug`, and I added `downcast_ref` and `downcast_mut` for completeness/consistency. (Can these be insta-stabilized?)
- At @SimonSapin's suggestion, I converted `Arc` and `Rc` to use `NonNull::cast` to avoid an `unsafe` block in each which tidied things up nicely.
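A short usage sketch of the resulting API, written with today's `dyn` syntax and assuming a toolchain where `Arc::downcast` is available without the feature gate. The function `describe` is hypothetical.

```rust
use std::any::Any;
use std::sync::Arc;

/// Hypothetical helper: tries a few concrete types in turn.
/// On failure, `downcast` hands the original `Arc` back in `Err`.
fn describe(value: Arc<dyn Any + Send + Sync>) -> String {
    match value.downcast::<String>() {
        Ok(s) => format!("string: {}", s),
        Err(other) => match other.downcast::<i32>() {
            Ok(n) => format!("i32: {}", n),
            Err(_) => "unknown".to_string(),
        },
    }
}

fn main() {
    println!("{}", describe(Arc::new(String::from("hi"))));
    println!("{}", describe(Arc::new(7i32)));
}
```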
Register outlives predicates from queries the right way around.
Closes #49354
The region constraints from queries need to be reversed from sub to outlives.
Note: wf checking reports these errors before NLL, so I'm not sure if there's any case when these predicates need to be created at all.
cc @nikomatsakis
Rollup of 7 pull requests
Successful merges:
- #49546 (Stabilize short error format)
- #51123 (Update build instructions)
- #51146 (typeck: Do not pass the field check on field error)
- #51193 (Fixes some style issues in rustdoc "implementations on Foreign types")
- #51213 (fs: copy: Use File::set_permissions instead of fs::set_permissions)
- #51227 (mod.rs isn't beautiful)
- #51240 (Two minor parsing tweaks)
Failed merges:
We only need to implement it for `Any + Send + Sync` because in practice
that's the only useful combination for `Arc` and `Any`.
Implementation for #44608 under the `rc_downcast` feature.
fs: copy: Use File::set_permissions instead of fs::set_permissions
We already got the open file descriptor at this point.
Don't make the kernel resolve the path again.
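The difference can be sketched with a simplified copy routine. This is not the real `std::fs::copy` body; `copy_with_perms` and the temporary file names are assumptions for illustration only.

```rust
use std::fs::File;
use std::io;
use std::path::Path;

/// Hypothetical simplified copy that applies permissions through the
/// already-open destination handle rather than through its path.
fn copy_with_perms(from: &Path, to: &Path) -> io::Result<u64> {
    let mut reader = File::open(from)?;
    let mut writer = File::create(to)?;
    let perm = reader.metadata()?.permissions();
    let written = io::copy(&mut reader, &mut writer)?;
    // `File::set_permissions` works on the open handle, so the kernel
    // does not have to resolve the destination path a second time.
    writer.set_permissions(perm)?;
    Ok(written)
}

fn main() -> io::Result<()> {
    let dir = std::env::temp_dir();
    let src = dir.join("copy_sketch_src.txt");
    let dst = dir.join("copy_sketch_dst.txt");
    std::fs::write(&src, b"hello")?;
    let n = copy_with_perms(&src, &dst)?;
    println!("copied {} bytes", n);
    Ok(())
}
```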
Update build instructions
It gets stuck at the cloning step.
`./x.py build`
Updating only changed submodules
Updating submodule src/llvm
Submodule 'src/llvm' (https://github.com/rust-lang/llvm.git) registered for path 'src/llvm'
Cloning into '/home/username/rust/src/llvm'...
std::fs::DirEntry.metadata(): use fstatat instead of lstat when possible
When reading a directory with `read_dir`, querying metadata for a resulting `DirEntry` is done by building the whole path and then `lstat`ing it, which requires the kernel to resolve the whole path. Instead, one can use the file descriptor of the enumerated directory and call `fstatat`. This makes the path resolution step unnecessary.
This PR implements using `fstatat` on linux, android and emscripten.
## Compatibility across targets
`fstatat` is POSIX.
* Linux >= 2.6.19 according to https://linux.die.net/man/2/fstatat
* android according to https://android.googlesource.com/platform/bionic/+/master/libc/libc.map.txt#392
* emscripten according to 7f89560101/system/include/libc/sys/stat.h (L76)
The man page says "A similar system call exists on Solaris." but I haven't found it.
## Compatibility with old platforms
`fstatat` was introduced with glibc 2.4 according to the man page. The only information I could find about the minimal version of glibc that Rust must support is this discussion: https://internals.rust-lang.org/t/bumping-glibc-requirements-for-the-rust-toolchain/5111/10
The conclusion, if I understand correctly, is that Rust currently supports glibc >= 2.3.4 but the "real" requirement is CentOS 5 with glibc 2.5. This PR would raise the minimal version to 2.4, so this should be fine.
## Benefit
I did the following silly benchmark:
```rust
use std::fs;
use std::io;
use std::os::linux::fs::MetadataExt;
use std::time::Instant;

fn main() -> Result<(), io::Error> {
    let mut n = 0;
    let mut size = 0;
    let start = Instant::now();
    // Query metadata for every entry and sum up the file sizes.
    for entry in fs::read_dir("/nix/store/.links")? {
        let entry = entry?;
        let stat = entry.metadata()?;
        size += stat.st_size();
        n += 1;
    }
    println!("{} files, size {}, time {:?}", n, size, Instant::now().duration_since(start));
    Ok(())
}
```
On warm cache, with current rust nightly:
```
1014099 files, size 76895290022, time Duration { secs: 2, nanos: 65832118 }
```
(between 2.1 and 2.9 seconds usually)
With this PR:
```
1014099 files, size 76895290022, time Duration { secs: 1, nanos: 581662953 }
```
(1.5 to 1.6 seconds usually).
approximately 40% faster :)
On cold cache there is not much to gain because path lookup (which we spare) would have been a cache hit:
Before
```
1014099 files, size 76895290022, time Duration { secs: 391, nanos: 739874992 }
```
After
```
1014099 files, size 76895290022, time Duration { secs: 388, nanos: 431567396 }
```
## Testing
The tests were run on linux `x86_64`
```
python x.py test src/tools/tidy
./x.py test src/libstd
```
and the above benchmark.
I did not test any other target.
remove notion of Implicit derefs from mem-cat
`PointerKind` is included in `LoanPath` and hence forms part of the equality check; this led to having two unequal paths that both represent `*x`, depending on whether the `*` was inserted automatically or explicitly. Bad mojo.
Fixes #51117
r? @eddyb
`PointerKind` is included in `LoanPath` and hence forms part of the
equality check; this led to having two unequal paths that both
represent `*x`, depending on whether the `*` was inserted
automatically or explicitly. Bad mojo. The `note` field, in contrast,
is intended more-or-less primarily for this purpose of adding extra
data.
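The problem can be shown with a toy model. None of the types below are the compiler's real definitions; `DerefOrigin`, `PathBefore`, `PathAfter` and `Categorized` are hypothetical stand-ins that only mimic how a field inside a path takes part in derived equality while a side note does not.

```rust
/// Toy model: was the `*` written by the user or inserted by the compiler?
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DerefOrigin {
    Explicit,
    Implicit,
}

/// Before: the origin is part of the path, so derived equality sees it.
#[derive(Debug, PartialEq, Eq)]
struct PathBefore {
    var: &'static str,
    origin: DerefOrigin,
}

/// After: the path only records which variable is dereferenced.
#[derive(Debug, PartialEq, Eq)]
struct PathAfter {
    var: &'static str,
}

/// The origin lives in a side note that path equality never inspects.
#[derive(Debug)]
struct Categorized {
    path: PathAfter,
    note: DerefOrigin,
}

fn main() {
    let a = PathBefore { var: "x", origin: DerefOrigin::Explicit };
    let b = PathBefore { var: "x", origin: DerefOrigin::Implicit };
    // Two paths for the same `*x` compare as different.
    println!("before: equal = {}", a == b);

    let c = Categorized { path: PathAfter { var: "x" }, note: DerefOrigin::Explicit };
    let d = Categorized { path: PathAfter { var: "x" }, note: DerefOrigin::Implicit };
    // The paths now agree; the distinction survives only in the note.
    println!("after: equal = {}, notes differ = {}", c.path == d.path, c.note != d.note);
}
```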