Record when a local pointer variable is set to a value such that
indirecting through the pointer does not require a write barrier. Use
that to eliminate write barriers when indirecting through that local
pointer variable. Only keep this information per-block, so it's not
all that applicable.
This reduces the number of write barriers generated when compiling the
runtime package from 553 to 524.
The point of this is to eliminate a bad write barrier in the bytes
function in runtime/print.go. Mark that function nowritebarrier so
that the problem does not recur.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/191581
From-SVN: r274890
With CL 190599, along with what we do in greyobject, we ensure
that we only mark allocated heap objects. As a result we can be
more strict in GC:
- Enable "sweep increased allocation count" check, which checks
that the number of mark bits set are no more than the number of
allocation bits.
- Enable invalid pointer check on heap scan. We only trace
allocated heap objects, which should not contain invalid
pointer.
This also makes the libgo runtime more convergent with the gc
runtime.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/190797
From-SVN: r274678
When a defer is executed at most once in a function body,
we can allocate the defer record for it on the stack instead
of on the heap.
This should make defers like this (which are very common) faster.
This is a port of CL 171758 from the gc repo.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/190410
From-SVN: r274613
In gccgo, we insert the write barriers in the frontend, and so we
cannot completely prevent write barriers on stack writes. So it
is possible for a bad pointer appearing in the write barrier
buffer. When flushing the write barrier, treat it the same as
sacnning the stack. In particular, don't mark a pointer if it
does not point to an allocated object. We already have similar
logic in greyobject. With this, hopefully, we can prevent an
unallocated object from being marked completely.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/190599
From-SVN: r274598
Currently, getg is implemented in C, which loads the thread-local
g variable. The g variable is declared static in C.
This CL exposes the g variable, so it can be accessed from the Go
side. This allows the Go compiler to inline getg calls to direct
access of g.
Currently, the actual inlining is only implemented in the gollvm
compiler. The g variable is thread-local and the compiler backend
may choose to cache the TLS address in a register or on stack. If
a thread switch happens the cache may become invalid. I don't
know how to disable the TLS address cache in gccgo, therefore
the inlining of getg is not implemented. In the future gccgo may
gain this if we know how to do it safely.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/186238
From-SVN: r273499
For a select statement with zero-, one-, or two-case with a
default case, we can generate simpler code instead of calling the
generic selectgo. A zero-case select is just blocking the
execution. A one-case select is mostly just executing the case. A
two-case select with a default case is a non-blocking send or
receive. We add these special cases for lowering a select
statement.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/184998
From-SVN: r273034
On AIX, a function has two symbols, a text symbol (with a leading dot)
and a data one (without it).
As the tests must be run only once, only the data symbol can be used to
retrieve the final go symbol. Therefore, all symbols beginning with a dot
are ignored by symtogo.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/177837
From-SVN: r272666
The first call of ar must not show its output in order to avoid useless
error messages about D flag.
The corresponding Go toolchain patch is CL 182077.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/183817
From-SVN: r272661
Instead of going through a C function __go_memcmp, we can just
use __builtin_memcmp directly. This allows more optimizations in
the compiler backend.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/183537
From-SVN: r272620
Currently a string slice expression is implemented with a runtime
call __go_string_slice. Change it to open code it, which is more
efficient, and allows the backend to further optimize it.
Also omit the write barrier for length-only update (i.e.
s = s[:n]).
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/182540
From-SVN: r272549
runtime.concatstring{2,3,4,5} are just wrappers of concatstrings.
These wrappers don't provide any benefit, at least in the C
calling convention we use, where passing arrays by value isn't an
efficient thing. Change it to always use concatstrings.
Also, the cap field of the slice passed to concatstrings is not
necessary. So change it to pass a pointer and a length directly,
which is more efficient than passing a slice header by value.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/182539
From-SVN: r272476
Due to inlining, we can now see unexported functions and variables,
and functions and variables imported from different packages.
Ignore them rather than reporting them from this package.
Handle $hash and $equal functions consistently, so that we discard the
inline body if there is one.
Ignore names created for result parameters for inlining purposes.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/180758
From-SVN: r272023
In the runtime there are specialized fast map routines for
certain kep types. This CL lets the compiler make use of these
functions, instead of always using the generic ones.
As we now generate multiple versions of map delete calls, to make
things easier we delay the expansion of the built-in delete
function to flatten phase.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/180858
From-SVN: r271983
Scan inlinable methods for references to global variables and
functions (forgot to do that earlier).
Track all packages mentioned by exports (that should have been done
earlier too).
Record assembler name in export data, so that we can inline calls to
non-Go functions. Modify gccgoimporter code to skip assembler name.
This increases the number of inlinable functions in the standard
library from 215 to 439.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/180677
From-SVN: r271976
Currently, the compiler already generates common symbols for type
descriptors, so the type descriptors are unique. However, when a
type is created through reflection, it is not deduplicated with
compiler-generated types. As a consequence, we cannot assume type
descriptors are unique, and cannot use pointer equality to
compare them. Also, when constructing a reflect.Type, it has to
go through a canonicalization map, which introduces overhead to
reflect.TypeOf, and lock contentions in concurrent programs.
In order for the reflect package to deduplicate types with
compiler-created types, we register all the compiler-created type
descriptors at startup time. The reflect package, when it needs
to create a type, looks up the registry of compiler-created types
before creates a new one. There is no lock contention since the
registry is read-only after initialization.
This lets us get rid of the canonicalization map, and also makes
it possible to compare type descriptors with pointer equality.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/179598
From-SVN: r271894
When the runtime collects a stack trace to associate it with some
profiling event (mem alloc, mutex, etc) there is a skip count passed
to runtime.Callers (or equivalent) to skip some known count of frames
in order to get to the "interesting" frame corresponding to the
profile event. Now that the profiling mechanism uses lazy fixup (when
removing compiler artifacts like thunks, morestack calls etc), we also
need to move the frame skipping logic after the fixup, so as to insure
that the skip count isn't thrown off by these artifacts.
Fixesgolang/go#32290.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/179740
From-SVN: r271892
The gc compiler recognizes append(s, make([]T, n)...), and
generates code to directly zero the tail instead of allocating a
new slice and copying. This CL lets the Go frontend do basically
the same.
The difficulty is that at the point we handle append, there may
already be temporaries introduced (e.g. in order_evaluations),
which makes it hard to find the append-of-make pattern. The
compiler could "see through" the value of a temporary, but it is
only safe to do if the temporary is not assigned multiple times.
For this, we add tracking of assignments and uses for temporaries.
This also helps in optimizing non-escape slice make. We already
optimize non-escape slice make with constant len/cap to stack
allocation. But it failed to handle things like f(make([]T, n))
(where the slice doesn't escape and n is constant), because of
the temporary. With tracking of temporary assignments and uses,
it can handle this now as well.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/179597
From-SVN: r271822
Currently, goroutine switches are implemented with libc
getcontext/setcontext functions, which saves/restores the machine
register states and also the signal context. This does more than
what we need, and performs an expensive syscall.
This CL implements a simplified version of getcontext/setcontext,
in assembly, that only saves/restores the necessary part, i.e.
the callee-save registers, and the PC, SP. A simplified version
of makecontext, written in C, is also added. Currently this is
only implemented on Linux/AMD64.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/178298
From-SVN: r271818
Revise the gccgo version of memory/block/mutex profiling to reduce
runtime overhead. The main change is to collect raw stack traces while
the profile is on line, then post-process the stacks just prior to the
point where we are ready to use the final product. Memory profiling
(at a very low sampling rate) is enabled by default, and the overhead
of the symbolization / DWARF-reading from backtrace_full was slowing
things down relative to the main Go runtime.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/171497
From-SVN: r271172
runtime.throw needs a g to work properly. Set up g early, to
ensure that if something goes wrong in the runtime startup (e.g.
runtime.check fails), the program terminates in a reasonable way.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/176657
From-SVN: r271088
A direct interface is an interface whose data word contains the
actual data value, instead of a pointer to it. The gc toolchain
creates a direct interface if the value is pointer shaped, that
includes pointers (including unsafe.Pointer), functions, channels,
maps, and structs and arrays containing a single pointer-shaped
field. In gccgo, we only do this for pointers. This CL unifies
direct interface types with gc. This reduces allocations when
converting such types to interfaces.
Our method functions used to always take pointer receivers, to
make interface calls easy. Now for direct interface types, their
value methods will take value receivers. For a pointer to those
types, when converted to interface, the interface data contains
the pointer. For that interface to call a value method, it will
need a wrapper method that dereference the pointer and invokes
the value method. The wrapper method, instead of the actual one,
is put into the itable of the pointer type.
In the runtime, adjust funcPC for the new layout of interfaces of
functions.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/168409
From-SVN: r270779
Previously, each time we do an interface conversion for which the
method table is not known at compile time, we allocate a new
method table.
This CL ports the mechanism of itab caching from the gc runtime,
adapted to our itab representation and method finding mechanism.
With the cache, we reuse the same itab for the same (interface,
concrete) type pair. This reduces allocations in interface
conversions.
Unlike the gc runtime, we don't prepopulate the cache with
statically allocated itabs, as currently we don't have a way to
find them. This means we don't deduplicate run-time allocated
itabs with compile-time allocated ones. But that is not too bad
-- it is just a cache anyway.
As now itabs are never freed, it is also possible to drop the
write barrier for writing the first word of an interface header.
I'll leave this optimization for the future.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/171617
From-SVN: r270778
AIX doesn't allow to mmap an address range which is already mmap.
Therefore, once the region has been allocated, it must munmap before
being able to play with it.
The corresponding Go Toolchain patch is CL 174059.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/174138
From-SVN: r270615
In the C calling convention, on AMD64, and probably a number of
other architectures, a 3-word struct argument is passed on stack.
This is less efficient than passing in three registers. Further,
this may affect the code generation in other part of the program,
even if the function is not actually called.
Slices are common in Go and append is a common slice operation,
which calls growslice in the growing path. To improve the code
generation, pass the slice header's three fields as separate
values, instead of a struct, to growslice.
The drawback is that this makes the runtime implementation
slightly diverges from the gc runtime.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/168277
From-SVN: r269811
Since aix/ppc64 has been added to GC toolchain, a mix between new and
old files were created in gcc toolchain.
This commit corrects this merge for aix/ppc64 and aix/ppc.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/167658
From-SVN: r269797
PR go/89447
syscall, internal/syscall: adjust use of largefile functions
Consistently call __go_openat for openat. Use fstatat64, creat64,
sendfile64, and getdents64 where needed.
Based on patch by Rainer Orth.
Fixes https://gcc.gnu.org/PR89447
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/166420
From-SVN: r269521
In the runtime there are bad pointer checks that currently don't
work with the concervative collector. With stack maps, the GC is
precise and the checks should work. Enable them.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/153871
From-SVN: r269406