Currently, the compiler already generates common symbols for type
descriptors, so the type descriptors are unique. However, when a
type is created through reflection, it is not deduplicated with
compiler-generated types. As a consequence, we cannot assume type
descriptors are unique, and cannot use pointer equality to
compare them. Also, when constructing a reflect.Type, it has to
go through a canonicalization map, which introduces overhead to
reflect.TypeOf, and lock contentions in concurrent programs.
In order for the reflect package to deduplicate types with
compiler-created types, we register all the compiler-created type
descriptors at startup time. The reflect package, when it needs
to create a type, looks up the registry of compiler-created types
before creates a new one. There is no lock contention since the
registry is read-only after initialization.
This lets us get rid of the canonicalization map, and also makes
it possible to compare type descriptors with pointer equality.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/179598
From-SVN: r271894
When the runtime collects a stack trace to associate it with some
profiling event (mem alloc, mutex, etc) there is a skip count passed
to runtime.Callers (or equivalent) to skip some known count of frames
in order to get to the "interesting" frame corresponding to the
profile event. Now that the profiling mechanism uses lazy fixup (when
removing compiler artifacts like thunks, morestack calls etc), we also
need to move the frame skipping logic after the fixup, so as to insure
that the skip count isn't thrown off by these artifacts.
Fixesgolang/go#32290.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/179740
From-SVN: r271892
This requires tracking all references to unexported variables, so that
we can make them global symbols in the object file, and can export
them so that other compilations can see the right definition for their
own inline bodies.
This introduces a syntax for referencing names defined in other
packages: a <pNN> prefix, where NN is the package index. This will
need to be added to gccgoimporter, but I didn't do it yet since it
isn't yet possible to create an object for which gccgoimporter will
see a <pNN> prefix.
This increases the number of inlinable functions in the standard
library from 181 to 215, adding functions like context.Background.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/177920
From-SVN: r271891
The "wg" constraint is used for the floating point side on mfpgpr
instructions. Those instructions do not exist on any relevant
hardware. This patch deletes the constraint and the insns using it.
* config/rs6000/constraints.md (define_register_constraint "wg"):
Delete.
* config/rs6000/rs6000.h (enum r6000_reg_class_enum): Delete
RS6000_CONSTRAINT_wg.
* config/rs6000/rs6000.c (rs6000_debug_reg_global): Adjust.
(rs6000_init_hard_regno_mode_ok): Adjust.
* config/rs6000/rs6000.md (*mov<mode>_softfloat32, *movdi_internal64):
Delete "wg" alternatives.
* doc/md.texi (Machine Constraints): Adjust.
From-SVN: r271888
Improve the fix for PR64242. Various optimizations can change a memory
reference into a frame access. Given there are multiple virtual frame pointers
which may be replaced by multiple hard frame pointers, there are no checks for
writes to the various frame pointers. So updates to a frame pointer tends to
generate incorrect code. Improve the previous fix to also add clobbers of
several frame pointers and add a scheduling barrier. This should work in most
cases until GCC supports a generic "don't optimize across this instruction"
feature.
Bootstrap OK. Testcase passes on AArch64 and x86-64. Inspected x86, Arm,
Thumb-1 and Thumb-2 assembler which looks correct.
gcc/
PR middle-end/64242
* builtins.c (expand_builtin_longjmp): Add frame clobbers and schedule
block.
(expand_builtin_nonlocal_goto): Likewise.
testsuite/
PR middle-end/64242
* gcc.c-torture/execute/pr64242.c: Update test.
From-SVN: r271870
A dynamic linker with lazy binding support may need to handle vector PCS
function symbols specially, so an ELF symbol table marking was
introduced for such symbols.
Function symbol references and definitions that follow the vector PCS
are marked in the generated assembly with .variant_pcs and then the
STO_AARCH64_VARIANT_PCS st_other flag is set on the symbol in the object
file. The marking is propagated to the dynamic symbol table by the
static linker so a dynamic linker can handle such symbols specially.
For this to work, the assembler, the static linker and the dynamic
linker has to be updated on a system. Old assembler does not support
the new .variant_pcs directive, so a toolchain with old binutils won't
be able to compile code that references vector PCS symbols.
gcc/ChangeLog:
* config/aarch64/aarch64-protos.h (aarch64_asm_output_alias): Declare.
(aarch64_asm_output_external): Declare.
* config/aarch64/aarch64.c (aarch64_asm_output_variant_pcs): New.
(aarch64_declare_function_name): Call aarch64_asm_output_variant_pcs.
(aarch64_asm_output_alias): New.
(aarch64_asm_output_external): New.
* config/aarch64/aarch64.h (ASM_OUTPUT_DEF_FROM_DECLS): Define.
(ASM_OUTPUT_EXTERNAL): Define.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/pcs_attribute-2.c: New test.
* gcc.target/aarch64/torture/simd-abi-4.c: Check .variant_pcs support.
* lib/target-supports.exp (check_effective_target_aarch64_variant_pcs):
New.
From-SVN: r271869
In previous standards it is undefined for a container and its allocator
to have a different value_type. Libstdc++ has traditionally allowed it
as an extension, automatically rebinding the allocator to the
container's value_type. Since GCC 8.1 that extension has been disabled
for C++11 and later when __STRICT_ANSI__ is defined (i.e. for
-std=c++11, -std=c++14, -std=c++17 and -std=c++2a).
Since the acceptance of P1463R1 into the C++2a draft an incorrect
allocator::value_type now requires a diagnostic. This patch implements
that by enabling the static_assert for -std=gnu++2a as well.
* doc/xml/manual/status_cxx2020.xml: Document P1463R1 status.
* include/bits/forward_list.h [__cplusplus > 201703]: Enable
allocator::value_type assertion for C++2a.
* include/bits/hashtable.h: Likewise.
* include/bits/stl_deque.h: Likewise.
* include/bits/stl_list.h: Likewise.
* include/bits/stl_map.h: Likewise.
* include/bits/stl_multimap.h: Likewise.
* include/bits/stl_multiset.h: Likewise.
* include/bits/stl_set.h: Likewise.
* include/bits/stl_vector.h: Likewise.
* testsuite/23_containers/deque/48101-3_neg.cc: New test.
* testsuite/23_containers/forward_list/48101-3_neg.cc: New test.
* testsuite/23_containers/list/48101-3_neg.cc: New test.
* testsuite/23_containers/map/48101-3_neg.cc: New test.
* testsuite/23_containers/multimap/48101-3_neg.cc: New test.
* testsuite/23_containers/multiset/48101-3_neg.cc: New test.
* testsuite/23_containers/set/48101-3_neg.cc: New test.
* testsuite/23_containers/unordered_map/48101-3_neg.cc: New test.
* testsuite/23_containers/unordered_multimap/48101-3_neg.cc: New test.
* testsuite/23_containers/unordered_multiset/48101-3_neg.cc: New test.
* testsuite/23_containers/unordered_set/48101-3_neg.cc: New test.
* testsuite/23_containers/vector/48101-3_neg.cc: New test.
From-SVN: r271866
Fix the alignment option parser to always allow up to 4 alignments.
Now -falign-functions=16:8:8:8 no longer reports an error.
gcc/
PR driver/90684
* opts.c (parse_and_check_align_values): Allow 4 alignment values.
M gcc/ChangeLog
M gcc/opts.c
From-SVN: r271864
Wilco pointed out that when the Dot Product instructions are available we can use them
to generate an even more efficient expansion for the [us]sadv16qi optab.
Instead of the current:
uabdl2 v0.8h, v1.16b, v2.16b
uabal v0.8h, v1.8b, v2.8b
uadalp v3.4s, v0.8h
we can generate:
(1) mov v4.16b, 1
(2) uabd v0.16b, v1.16b, v2.16b
(3) udot v3.4s, v0.16b, v4.16b
Instruction (1) can be CSEd across multiple such expansions and even hoisted outside of loops,
so when this sequence appears frequently back-to-back (like in x264_r) we essentially only have 2 instructions
per sum. Also, the UDOT instruction does the byte-to-word accumulation in one step, which allows us to use
the much simpler UABD instruction before it.
This makes it a shorter and lower-latency sequence overall for targets that support it.
* config/aarch64/iterators.md (MAX_OPP): New code attr.
* config/aarch64/aarch64-simd.md (*aarch64_<su>abd<mode>_3): Rename to...
(aarch64_<su>abd<mode>_3): ... This.
(<sur>sadv16qi): Add TARGET_DOTPROD expansion.
* gcc.target/aarch64/ssadv16qi.c: Add +nodotprod to pragma.
* gcc.target/aarch64/usadv16qi.c: Likewise.
* gcc.target/aarch64/ssadv16qi-dotprod.c: New test.
* gcc.target/aarch64/usadv16qi-dotprod.c: Likewise.
From-SVN: r271863
2019-06-03 Richard Biener <rguenther@suse.de>
* tree-ssa-sccvn.c (ao_ref_init_from_vn_reference): Get original
full reference tree and record in ref->ref.
(vn_reference_lookup_3): Pass in original ref to
ao_ref_init_from_vn_reference.
(vn_reference_lookup): Likewise.
* tree-ssa-sccvn.h (ao_ref_init_from_vn_reference): Adjust prototype.
* tree-ssa-alias.c (nonoverlapping_component_refs_of_decl_p):
Handle non-decl bases in the original reference.
* gcc.dg/tree-ssa/alias-access-path-1.c: Scan fre1.
From-SVN: r271860
2019-06-03 Martin Liska <mliska@suse.cz>
* fold-const.c (operand_equal_p): Fix typo as compare_tree_int
returns 0 when operands are equal.
From-SVN: r271859
2019-06-03 Richard Biener <rguenther@suse.de>
PR tree-optimization/90716
* tree-loop-distribution.c (destroy_loop): Process blocks in
correct order.
* gcc.dg/guality/pr90716.c: New testcase.
From-SVN: r271858
This patch fixes bug 90681. It was caused by trying to SLP vectorize a non
groupped load. We've fixed it by tweaking a bit the implementation: mark
masked loads as not vectorizable, but support them as an special case. Then
the detect them in the test for normal non-groupped loads that was already
there.
From-SVN: r271856
2019-06-02 Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/90539
* trans-expr.c (gfc_conv_subref_array_arg): If the size of the
expression can be determined to be one, treat it as contiguous.
Set likelyhood of presence of an actual argument according to
PRED_FORTRAN_ABSENT_DUMMY and likelyhood of being contiguous
according to PRED_FORTRAN_CONTIGUOUS.
2019-06-02 Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/90539
* predict.def (PRED_FORTRAN_CONTIGUOUS): New predictor.
2019-06-02 Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/90539
* gfortran.dg/internal_pack_24.f90: New test.
From-SVN: r271844
We don't have support for -mcmodel={medium, large, kernel} so don't
expect tests for those things to work.
For now mark them as xfail where possible and skip where that isn't.
These changes will be logged onto the PR and therefore can be backed
out when the facility is implemented.
gcc/testsuite/ChangeLog:
2019-06-01 Iain Sandoe <iain@sandoe.co.uk>
PR target/90698
* gcc.target/i386/pr49866.c: XFAIL for Darwin.
* gcc.target/i386/pr63538.c: Likewise.
* gcc.target/i386/pr61599-1.c: Skip for Darwin.
From-SVN: r271839
* alias.c: Include ipa-utils.h.
(get_alias_set): Try to complete ODR type via ODR type hash lookup.
* ipa-devirt.c (prevailing_odr_type): New.
* ipa-utils.h (previaling_odr_type): Declare.
* g++.dg/lto/alias-1_0.C: New testcase.
* g++.dg/lto/alias-1_1.C: New testcase.
From-SVN: r271837
NOTE_INSN_DELETED_LABEL is used to mark what used to be a 'code_label',
but was not used for other purposes than taking its address which cannot
be used as target for indirect jumps.
Tested on Linux/x86-64 with -fcf-protection.
For x86-64 libc.so on glibc master branch (commit f43b8dd55588c3),
Before: 2961 endbr64
After: 2943 endbr64
gcc/
PR target/89355
* config/i386/i386-features.c (rest_of_insert_endbranch): Remove
NOTE_INSN_DELETED_LABEL check.
gcc/testsuite/
PR target/89355
* gcc.target/i386/cet-label-3.c: New test.
* gcc.target/i386/cet-label-4.c: Likewise.
* gcc.target/i386/cet-label-5.c: Likewise.
Co-Authored-By: Hongtao Liu <hongtao.liu@intel.com>
From-SVN: r271828
* config/mips/mips.c (mips_expand_builtin_insn): Swap the 1st
and 3rd operands of the fmadd/fmsub/maddv builtin.
* gcc.target/mips/msa-fmadd.c: New.
From-SVN: r271826
* tree.h (OMP_CLAUSE__CONDTEMP__ITER): Define.
* gimplify.c (gimplify_scan_omp_clauses): Allow lastprivate conditional
on OMP_SIMD if not nested inside of worksharing loop that also has
lastprivate conditional clause for the same decl.
(gimplify_omp_for): Add _condtemp_ clauses to OMP_SIMD if needed.
* omp-low.c (scan_sharing_clauses): Handle OMP_CLAUSE__CONDTEMP_ also
on simd.
(lower_rec_input_clauses): Likewise. Handle lastprivate conditional
on simd construct.
(lower_lastprivate_conditional_clauses): Handle lastprivate conditional
on simd construct.
(lower_lastprivate_clauses): Likewise.
(lower_omp_sections): Call lower_lastprivate_conditional_clauses before
calling lower_rec_input_clauses.
(lower_omp_for): Likewise.
(lower_omp_1): Use first rather than second OMP_CLAUSE__CONDTEMP_
clause on simd construct.
* omp-expand.c (expand_omp_simd): Initialize cond_var if
OMP_CLAUSE__CONDTEMP_ clause is present.
* c-c++-common/gomp/lastprivate-conditional-2.c (foo): Don't expect
a sorry on lastprivate conditional on simd construct.
* gcc.dg/vect/vect-simd-6.c: New test.
* gcc.dg/vect/vect-simd-7.c: New test.
From-SVN: r271825
The gc compiler recognizes append(s, make([]T, n)...), and
generates code to directly zero the tail instead of allocating a
new slice and copying. This CL lets the Go frontend do basically
the same.
The difficulty is that at the point we handle append, there may
already be temporaries introduced (e.g. in order_evaluations),
which makes it hard to find the append-of-make pattern. The
compiler could "see through" the value of a temporary, but it is
only safe to do if the temporary is not assigned multiple times.
For this, we add tracking of assignments and uses for temporaries.
This also helps in optimizing non-escape slice make. We already
optimize non-escape slice make with constant len/cap to stack
allocation. But it failed to handle things like f(make([]T, n))
(where the slice doesn't escape and n is constant), because of
the temporary. With tracking of temporary assignments and uses,
it can handle this now as well.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/179597
From-SVN: r271822
Currently, Type_conversion_expression::do_is_constant thinks the
int-to-string conversion is constant if the integer operand is
constant, but Type_conversion_expression::do_get_backend actually
generates a call to runtime.intstring if the integer does not fit
in a "ushort", which makes it not suitable in constant context,
such as static initializer.
This CL makes it handle all constant integer input as constant,
generating constant string.
Fixesgolang/go#32347.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/179777
From-SVN: r271821