A left shift of 1 can always be done using an add, so slightly adjust rtx
cost for DImode left shift by 1 so that adddi3 is preferred in all cases,
and the arm_ashldi3_1bit is redundant.
DImode right shifts of 1 are rarely used (6 in total in the GCC binary),
so there is little benefit of the arm_ashrdi3_1bit and arm_lshrdi3_1bit
patterns. The generated code is better and faster without these shifts
as it allows early expansion, optimization and better register allocation.
gcc/
* config/arm/arm.md (ashldi3): Remove shift by 1 expansion.
(arm_ashldi3_1bit): Remove pattern.
(ashrdi3): Remove shift by 1 expansion.
(arm_ashrdi3_1bit): Remove pattern.
(lshrdi3): Remove shift by 1 expansion.
(arm_lshrdi3_1bit): Remove pattern.
* config/arm/arm.c (arm_rtx_costs_internal): Slightly increase
cost of ashldi3 by 1.
* config/arm/neon.md (ashldi3_neon): Remove shift by 1 expansion.
(<shift>di3_neon): Likewise.
From-SVN: r254237
Fix the type attributes of the integer stores in aarch64_simd_mov.
gcc/
* config/aarch64/aarch64-simd.md (*aarch64_simd_mov): Rename
both identically named patterns to (*aarch64_simd_mov<VD:mode>)
and (*aarch64_simd_mov<VQ:mode>).
(*aarch64_simd_mov<VD:mode>): Change type attribute to match
pattern alternative.
(*aarch64_simd_mov<VQ:mode>): Re-order and change type
attributes to match pattern alternative.
From-SVN: r254236
This patch includes testsuite/gcc.target tests for the intrinsics
in emmintrin.h. For these tests I added -Wno-psabi to dg-options
to suppress warnings associated with the vector ABI change in GCC5.
From-SVN: r254235
Part 1/2 for contributing PPC64LE support for X86 SSE2
instrisics. This patch includes the new (for PPC) emmintrin.h,
changes x86intrin.h to include xmmintrin.h, and associated
config.gcc changes.
From-SVN: r254234
Merge the movdi_vfp_cortexa8 pattern into movdi_vfp and remove it to avoid
unnecessary duplication and repeating bugs like PR78439 due to changes being
applied only to one of the duplicates.
gcc/
* config/arm/vfp.md (movdi_vfp): Merge changes from movdi_vfp_cortexa8.
* (movdi_vfp_cortexa8): Remove pattern.
From-SVN: r254233
* include/std/fstream (basic_ifstream, basic_ofstream, basic_fstream):
Remove outdated comments about calling c_str() to create a file stream
from a std::string.
(basic_ofstream::basic_ofstream, basic_ofstream::open): Remove
redundant ios_mode::trunc bits from default arguments and comments.
From-SVN: r254226
2017-10-30 Martin Jambor <mjambor@suse.cz>
* omp-grid.c (grid_attempt_target_gridification): Also insert a
condition whether loop should be executed at all.
From-SVN: r254225
* include/bits/hashtable_policy.h: Include <tuple>.
* include/std/unordered_map: Only include <bits/stl_pair.h> instead
of <utility> and <tuple>.
* include/std/unordered_set: Likewise.
From-SVN: r254223
[gcc]
2017-10-30 Will Schmidt <will_schmidt@vnet.ibm.com>
* config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Add support for
gimple folding of vec_madd() intrinsics.
* config/rs6000/altivec.md (mulv8hi3): Rename altivec_vmladduhm to
fmav8hi4. (altivec_vmladduhm): Rename to fmav8hi4.
* config/rs6000/rs6000-builtin.def: Rename vmladduhm to fmav8hi4
From-SVN: r254221
2017-10-30 Will Schmidt <will_schmidt@vnet.ibm.com>
* gcc.target/powerpc/fold-vec-perm-longlong.c: Update to use long long
types for testcase arguments.
From-SVN: r254220
2017-10-30 Richard Biener <rguenther@suse.de>
PR lto/82757
* simple-object-elf.c (simple_object_elf_copy_lto_debug_sections):
Strip two leading _s from the __gnu_lto_* symbols.
From-SVN: r254219
C17, a bug-fix version of the C11 standard with DR resolutions
integrated, will soon go to ballot. This patch adds corresponding
options -std=c17, -std=gnu17 (new default version, replacing
-std=gnu11 as the default), -std=iso9899:2017. As a bug-fix version
of the standard, there is no need for flag_isoc17 or any options for
compatibility warnings; however, there is a new __STDC_VERSION__
value, so new cpplib languages CLK_GNUC17 and CLK_STDC17 are added to
support using that new value with the new options. (If the standard
ends up being published in 2018 and being known as C18, option aliases
can be added. Note however that -std=iso9899:199409 corresponds to a
__STDC_VERSION__ value rather than a publication date.)
(There are a couple of DR resolutions needing implementing in GCC, but
that's independent of the new options.)
(I'd propose to add -std=c2x / -std=gnu2x / -Wc11-c2x-compat for the
next major C standard revision once there are actually C2x drafts
being issued with new features included.)
Bootstrapped with no regressions for x86_64-pc-linux-gnu.
gcc:
* doc/invoke.texi (C Dialect Options): Document -std=c17,
-std=iso9899:2017 and -std=gnu17.
* doc/standards.texi (C Language): Document C17 support.
* doc/cpp.texi (Overview): Mention -std=c17.
(Standard Predefined Macros): Document C11 and C17 values of
__STDC_VERSION__. Do not refer to C99 support as incomplete.
* doc/extend.texi (Inline): Do not list individual options for
standards newer than C99.
* dwarf2out.c (highest_c_language, gen_compile_unit_die): Handle
"GNU C17".
* config/rl78/rl78.c (rl78_option_override): Handle "GNU C17"
language name.
gcc/c-family:
* c.opt (std=c17, std=gnu17, std=iso9899:2017): New options.
* c-opts.c (set_std_c17): New function.
(c_common_init_options): Use gnu17 as default C version.
(c_common_handle_option): Handle -std=c17 and -std=gnu17.
gcc/testsuite:
* gcc.dg/c17-version-1.c, gcc.dg/c17-version-2.c: New tests.
libcpp:
* include/cpplib.h (enum c_lang): Add CLK_GNUC17 and CLK_STDC17.
* init.c (lang_defaults): Add GNUC17 and STDC17 data.
(cpp_init_builtins): Handle C17 value of __STDC_VERSION__.
From-SVN: r254216
PR middle-end/22141
* gimple-ssa-store-merging.c: Include rtl.h and expr.h.
(struct store_immediate_info): Add bitregion_start and bitregion_end
fields.
(store_immediate_info::store_immediate_info): Add brs and bre
arguments and initialize bitregion_{start,end} from those.
(struct merged_store_group): Add bitregion_start, bitregion_end,
align_base and mask fields. Drop unnecessary struct keyword from
struct store_immediate_info. Add do_merge method.
(clear_bit_region_be): Use memset instead of loop storing zeros.
(merged_store_group::do_merge): New method.
(merged_store_group::merge_into): Use do_merge. Allow gaps in between
stores as long as the surrounding bitregions have no gaps.
(merged_store_group::merge_overlapping): Use do_merge.
(merged_store_group::apply_stores): Test that bitregion_{start,end}
is byte aligned, rather than requiring that start and width are
byte aligned. Drop unnecessary struct keyword from
struct store_immediate_info. Allocate and populate also mask array.
Make start of the arrays relative to bitregion_start rather than
start and size them according to bitregion_{end,start} difference.
(struct imm_store_chain_info): Drop unnecessary struct keyword from
struct store_immediate_info.
(pass_store_merging::gate): Punt if BITS_PER_UNIT or CHAR_BIT is not 8.
(pass_store_merging::terminate_all_aliasing_chains): Drop unnecessary
struct keyword from struct store_immediate_info.
(imm_store_chain_info::coalesce_immediate_stores): Allow gaps in
between stores as long as the surrounding bitregions have no gaps.
Formatting fixes.
(struct split_store): Add orig non-static data member.
(split_store::split_store): Initialize orig to false.
(find_constituent_stmts): Return store_immediate_info *, non-NULL
if there is exactly a single original stmt. Change stmts argument
to pointer from reference, if NULL, don't push anything to it. Add
first argument, use it to optimize skipping over orig stmts that
are known to be before bitpos already. Simplify.
(split_group): Return unsigned int count how many stores are or
would be needed rather than a bool. Add allow_unaligned argument.
Change split_stores argument from reference to pointer, if NULL,
only do a dry run computing how many stores would be produced.
Rewritten algorithm to use both alignment and misalign if
!allow_unaligned and handle bitfield stores with gaps.
(imm_store_chain_info::output_merged_store): Set start_byte_pos
from bitregion_start instead of start. Compute allow_unaligned
here, if true, do 2 split_group dry runs to compute which one
produces fewer stores and prefer aligned if equal. Punt if
new count is bigger or equal than original before emitting any
statements, rather than during that. Remove no longer needed
new_ssa_names tracking. Replace num_stmts with
split_stores.length (). Use 32-bit stack allocated entries
in split_stores auto_vec. Try to reuse original store lhs/rhs1
if possible. Handle bitfields with gaps.
(pass_store_merging::execute): Ignore bitsize == 0 stores.
Compute bitregion_{start,end} for the stores and construct
store_immediate_info with that. Formatting fixes.
* gcc.dg/store_merging_10.c: New test.
* gcc.dg/store_merging_11.c: New test.
* gcc.dg/store_merging_12.c: New test.
* g++.dg/pr71694.C: Add -fno-store-merging to dg-options.
From-SVN: r254213
2017-10-28 Sandra Loosemore <sandra@codesourcery.com>
gcc/
* config/nios2/nios2.h (FRAME_GROWS_DOWNWARD): Define to 1.
* config/nios2/nios2.c (nios2_initial_elimination_offset): Make
FRAME_POINTER_REGNUM point at high end of local var area.
From-SVN: r254204
2017-10-28 Paul Thomas <pault@gcc.gnu.org>
PR fortran/81758
* trans-expr.c (trans_class_vptr_len_assignment): 'vptr_expr'
must only be set if the right hand side expression is of type
class.
2017-10-28 Paul Thomas <pault@gcc.gnu.org>
PR fortran/81758
* gfortran.dg/class_63.f90: New test.
From-SVN: r254195
* target.c (struct gomp_coalesce_buf): New type.
(MAX_COALESCE_BUF_SIZE, MAX_COALESCE_BUF_GAP): Define.
(gomp_coalesce_buf_add, gomp_to_device_kind_p): New functions.
(gomp_copy_host2dev): Add CBUF argument, if copying into
the cached ranges, memcpy into buffer instead of copying
into device.
(gomp_map_vars_existing, gomp_map_pointer, gomp_map_fields_existing):
Add CBUF argument, pass it through to other calls.
(gomp_map_vars): Aggregate copies from host to device if small enough
and with small enough gaps in between into memcpy into a buffer and
fewer host to device copies from the buffer.
(gomp_update): Adjust gomp_copy_host2dev caller.
From-SVN: r254194
2017-10-27 Steven G. Kargl <kargl@gcc.gnu.org>
PR fortran/82620
* match.c (gfc_match_allocate): Exit early on syntax error.
2017-10-27 Steven G. Kargl <kargl@gcc.gnu.org>
PR fortran/82620
* gfortran.dg/allocate_error_7.f90: new test.
From-SVN: r254193
* bb-reorder.c (find_traces_1_round): Fix off-by-one index.
Move comment around. Do not reset best_edge for a copiable
destination if the copy would cause a partition change.
(better_edge_p): Remove redundant check.
From-SVN: r254188
[gcc]
2017-10-27 Michael Meissner <meissner@linux.vnet.ibm.com>
* builtins.c (CASE_MATHFN_FLOATN): New helper macro to add cases
for math functions that have _Float<N> and _Float<N>X variants.
(mathfn_built_in_2): Add support for math functions that have
_Float<N> and _Float<N>X variants.
(DEF_INTERNAL_FLT_FLOATN_FN): New helper macro.
(expand_builtin_mathfn_ternary): Add support for fma with
_Float<N> and _Float<N>X variants.
(expand_builtin): Likewise.
(fold_builtin_3): Likewise.
* builtins.def (DEF_EXT_LIB_FLOATN_NX_BUILTINS): New macro to
create math function _Float<N> and _Float<N>X variants as external
library builtins.
(BUILT_IN_COPYSIGN _Float<N> and _Float<N>X variants) Use
DEF_EXT_LIB_FLOATN_NX_BUILTINS to make built-in functions using
the __builtin_ prefix and if not strict ansi, without the prefix.
(BUILT_IN_FABS _Float<N> and _Float<N>X variants): Likewise.
(BUILT_IN_FMA _Float<N> and _Float<N>X variants): Likewise.
(BUILT_IN_FMAX _Float<N> and _Float<N>X variants): Likewise.
(BUILT_IN_FMIN _Float<N> and _Float<N>X variants): Likewise.
(BUILT_IN_NAN _Float<N> and _Float<N>X variants): Likewise.
(BUILT_IN_SQRT _Float<N> and _Float<N>X variants): Likewise.
* builtin-types.def (BT_FN_FLOAT16_FLOAT16_FLOAT16_FLOAT16): New
function signatures for fma _Float<N> and _Float<N>X variants.
(BT_FN_FLOAT32_FLOAT32_FLOAT32_FLOAT32): Likewise.
(BT_FN_FLOAT64_FLOAT64_FLOAT64_FLOAT64): Likewise.
(BT_FN_FLOAT128_FLOAT128_FLOAT128_FLOAT128): Likewise.
(BT_FN_FLOAT32X_FLOAT32X_FLOAT32X_FLOAT32X): Likewise.
(BT_FN_FLOAT64X_FLOAT64X_FLOAT64X_FLOAT64X): Likewise.
(BT_FN_FLOAT128X_FLOAT128X_FLOAT128X_FLOAT128X): Likewise.
* gencfn-macros.c (print_case_cfn): Add support for math functions
that have _Float<N> and _Float<N>X variants.
(print_define_operator_list): Likewise.
(fltfn_suffixes): Likewise.
(main): Likewise.
* internal-fn.def (DEF_INTERNAL_FLT_FLOATN_FN): New helper macro
for math functions that have _Float<N> and _Float<N>X variants.
(SQRT): Add support for sqrt, copysign, fmin and fmax _Float<N>
and _Float<N>X variants.
(COPYSIGN): Likewise.
(FMIN): Likewise.
(FMAX): Likewise.
* fold-const.c (tree_call_nonnegative_warnv_p): Add support for
copysign, fma, fmax, fmin, and sqrt _Float<N> and _Float<N>X
variants.
(integer_valued_read_call_p): Likewise.
* fold-const-call.c (fold_const_call_ss): Likewise.
(fold_const_call_sss): Add support for copysign, fmin, and fmax
_Float<N> and _Float<N>X variants.
(fold_const_call_ssss): Add support for fma _Float<N> and
_Float<N>X variants.
* gimple-ssa-backprop.c (backprop::process_builtin_call_use): Add
support for copysign and fma _Float<N> and _Float<N>X variants.
(backprop::process_builtin_call_use): Likewise.
* tree-call-cdce.c (can_test_argument_range); Add support for
sqrt _Float<N> and _Float<N>X variants.
(edom_only_function): Likewise.
(get_no_error_domain): Likewise.
* tree-ssa-math-opts.c (internal_fn_reciprocal): Likewise.
* tree-ssa-reassoc.c (attempt_builtin_copysign): Add support for
copysign _Float<N> and _Float<N>X variants.
* config/rs6000/rs6000-builtin.def (SQRTF128): Delete, this is now
handled by machine independent code.
(FMAF128): Likewise.
* doc/cpp.texi (Common Predefined Macros): Document defining
__FP_FAST_FMAF<N> and __FP_FAST_FMAF<N>X if the backend supports
fma _Float<N> and _Float<N>X variants.
[gcc/c]
2017-10-27 Michael Meissner <meissner@linux.vnet.ibm.com>
* c-decl.c (header_for_builtin_fn): Add support for copysign, fma,
fmax, fmin, and sqrt _Float<N> and _Float<N>X variants.
[gcc/c-family]
2017-10-27 Michael Meissner <meissner@linux.vnet.ibm.com>
* c-cppbuiltin.c (mode_has_fma): Add support for PowerPC KFmode.
(c_cpp_builtins): If a machine has a fast fma _Float<N> and
_Float<N>X variant, define __FP_FAST_FMA<N> and/or
__FP_FAST_FMA<N>X.
[gcc/testsuite]
2017-10-27 Michael Meissner <meissner@linux.vnet.ibm.com>
* gcc.target/powerpc/float128-hw.c: Add support for all 4 FMA
variants. Check various conversions to/from float128. Check
negation. Use {\m...\M} in the tests.
* gcc.target/powerpc/float128-hw2.c: New test for implicit
_Float128 math functions.
* gcc.target/powerpc/float128-hw3.c: New test for strict ansi mode
not implicitly adding the _Float128 math functions.
* gcc.target/powerpc/float128-fma2.c: Delete, test is no longer
valid.
* gcc.target/powerpc/float128-sqrt2.c: Likewise.
From-SVN: r254168
2017-10-27 Jerry DeLisle <jvdelisle@gcc.gnu.org>
Rimvydas (RJ)
PR libgfortran/81938
io/format.c (free_format_data): Don't try to free vlist
descriptors past the end of the fnode array.
From-SVN: r254163