Here is a patch to fix up the ppc64be vs. ppc64le byteswapping
of IBM extended real(kind=16) and complex(kind=16).
Similarly to the BT_COMPLEX case it halves size and doubles nelems
for the bswap_array calls. Of course for r16_ibm and r16_ieee conversions
one needs to make sure it is only done when the on file data is in that
format and not in IEEE quad.
2022-01-11 Jakub Jelinek <jakub@redhat.com>
* io/transfer.c (unformatted_read, unformatted_write): When
byteswapping IBM extended real(kind=16), handle it as byteswapping
two real(kind=8) values.
This patch handles the environment variables for the REAL(KIND=16)
variables like for the little/big-endian routines, so users without
who have no access to the source or are unwilling to recompile
can use this.
Syntax is, for example
GFORTRAN_CONVERT_UNIT="r16_ieee:10;little_endian:10" ./a.out
libgfortran/ChangeLog:
* runtime/environ.c (R16_IEEE): New macro.
(R16_IBM): New macro.
(next_token): Handle IBM R16 conversion cases.
(push_token): Likewise.
(mark_single): Likewise.
(do_parse): Likewise, initialize endian.
This patch, based on Jakub's work, implements the CONVERT
specifier for the power-ieee128 brach. It allows specifying
the conversion as r16_ieee,big_endian and the other way around,
based on a table. Setting the conversion via environment
variable and via program option does not yet work.
gcc/ChangeLog:
* flag-types.h (enum gfc_convert): Add flags for
conversion.
gcc/fortran/ChangeLog:
* libgfortran.h (unit_convert): Add flags.
libgfortran/ChangeLog:
* Makefile.in: Regenerate.
* io/file_pos.c (unformatted_backspace): Mask off
R16 parts for convert.
* io/inquire.c (inquire_via_unit): Add cases for
R16 parts.
* io/open.c (st_open): Add cases for R16 conversion.
* io/transfer.c (unformatted_read): Adjust for R16 conversions.
(unformatted_write): Likewise.
(us_read): Mask of R16 bits.
(data_transfer_init): Likewiese.
(write_us_marker): Likewise.
I've just tried to build libgfortran on an old glibc system
(gcc112.fsffrance.org) and unfortunately we still have work to do:
[jakub@gcc2-power8 obj38]$ LD_PRELOAD=/home/jakub/gcc/obj38/powerpc64le-unknown-linux-gnu/libgfortran/.libs/libgfortran.so.5.0.0 /bin/true
[jakub@gcc2-power8 obj38]$ LD_BIND_NOW=1 LD_PRELOAD=/home/jakub/gcc/obj38/powerpc64le-unknown-linux-gnu/libgfortran/.libs/libgfortran.so.5.0.0 /bin/true
/bin/true: symbol lookup error: /home/jakub/gcc/obj38/powerpc64le-unknown-linux-gnu/libgfortran/.libs/libgfortran.so.5.0.0: undefined symbol: __atan2ieee128
While we do use some libquadmath APIs:
readelf -Wr /home/jakub/gcc/obj38/powerpc64le-unknown-linux-gnu/libgfortran/.libs/libgfortran.so.5.0.0 | grep QUADMATH
0000000000251268 000005e400000026 R_PPC64_ADDR64 0000000000000000 quadmath_snprintf@QUADMATH_1.0 + 0
0000000000251270 0000030600000026 R_PPC64_ADDR64 0000000000000000 strtoflt128@QUADMATH_1.0 + 0
00000000002502e0 0000011600000015 R_PPC64_JMP_SLOT 0000000000000000 ynq@QUADMATH_1.0 + 0
0000000000250390 0000016000000015 R_PPC64_JMP_SLOT 0000000000000000 sqrtq@QUADMATH_1.0 + 0
0000000000250508 000001fa00000015 R_PPC64_JMP_SLOT 0000000000000000 fmaq@QUADMATH_1.0 + 0
0000000000250530 0000021200000015 R_PPC64_JMP_SLOT 0000000000000000 fabsq@QUADMATH_1.0 + 0
0000000000250760 0000030600000015 R_PPC64_JMP_SLOT 0000000000000000 strtoflt128@QUADMATH_1.0 + 0
0000000000250990 000003df00000015 R_PPC64_JMP_SLOT 0000000000000000 cosq@QUADMATH_1.0 + 0
00000000002509f0 0000040a00000015 R_PPC64_JMP_SLOT 0000000000000000 expq@QUADMATH_1.0 + 0
0000000000250a88 0000045100000015 R_PPC64_JMP_SLOT 0000000000000000 erfcq@QUADMATH_1.0 + 0
0000000000250a98 0000045e00000015 R_PPC64_JMP_SLOT 0000000000000000 jnq@QUADMATH_1.0 + 0
0000000000250ac8 0000047e00000015 R_PPC64_JMP_SLOT 0000000000000000 sinq@QUADMATH_1.0 + 0
0000000000250e38 000005db00000015 R_PPC64_JMP_SLOT 0000000000000000 fmodq@QUADMATH_1.0 + 0
0000000000250e48 000005e000000015 R_PPC64_JMP_SLOT 0000000000000000 tanq@QUADMATH_1.0 + 0
0000000000250e58 000005e400000015 R_PPC64_JMP_SLOT 0000000000000000 quadmath_snprintf@QUADMATH_1.0 + 0
0000000000250f20 0000062900000015 R_PPC64_JMP_SLOT 0000000000000000 copysignq@QUADMATH_1.0 + 0
we don't do it consistently:
readelf -Wr /home/jakub/gcc/obj38/powerpc64le-unknown-linux-gnu/libgfortran/.libs/libgfortran.so.5.0.0 | grep ieee128
0000000000250310 0000012800000015 R_PPC64_JMP_SLOT 0000000000000000 __atan2ieee128 + 0
0000000000250340 0000014200000015 R_PPC64_JMP_SLOT 0000000000000000 __clogieee128 + 0
0000000000250438 000001a300000015 R_PPC64_JMP_SLOT 0000000000000000 __acoshieee128 + 0
00000000002504b8 000001cc00000015 R_PPC64_JMP_SLOT 0000000000000000 __csinieee128 + 0
0000000000250500 000001f300000015 R_PPC64_JMP_SLOT 0000000000000000 __sinhieee128 + 0
0000000000250570 0000022a00000015 R_PPC64_JMP_SLOT 0000000000000000 __asinieee128 + 0
0000000000250580 0000022d00000015 R_PPC64_JMP_SLOT 0000000000000000 __roundieee128 + 0
00000000002505a0 0000023e00000015 R_PPC64_JMP_SLOT 0000000000000000 __logieee128 + 0
00000000002505c8 0000024900000015 R_PPC64_JMP_SLOT 0000000000000000 __tanieee128 + 0
0000000000250630 0000027500000015 R_PPC64_JMP_SLOT 0000000000000000 __ccosieee128 + 0
0000000000250670 0000028a00000015 R_PPC64_JMP_SLOT 0000000000000000 __log10ieee128 + 0
00000000002506c8 000002bd00000015 R_PPC64_JMP_SLOT 0000000000000000 __cexpieee128 + 0
00000000002506d8 000002c800000015 R_PPC64_JMP_SLOT 0000000000000000 __coshieee128 + 0
00000000002509b0 000003ef00000015 R_PPC64_JMP_SLOT 0000000000000000 __truncieee128 + 0
0000000000250af8 000004a600000015 R_PPC64_JMP_SLOT 0000000000000000 __expieee128 + 0
0000000000250b50 000004c600000015 R_PPC64_JMP_SLOT 0000000000000000 __fmodieee128 + 0
0000000000250bb0 000004e700000015 R_PPC64_JMP_SLOT 0000000000000000 __tanhieee128 + 0
0000000000250c38 0000051300000015 R_PPC64_JMP_SLOT 0000000000000000 __acosieee128 + 0
0000000000250ce0 0000055400000015 R_PPC64_JMP_SLOT 0000000000000000 __sinieee128 + 0
0000000000250d60 0000057e00000015 R_PPC64_JMP_SLOT 0000000000000000 __atanieee128 + 0
0000000000250dd8 000005b100000015 R_PPC64_JMP_SLOT 0000000000000000 __sqrtieee128 + 0
0000000000250e98 0000060200000015 R_PPC64_JMP_SLOT 0000000000000000 __cosieee128 + 0
0000000000250eb0 0000060a00000015 R_PPC64_JMP_SLOT 0000000000000000 __atanhieee128 + 0
0000000000250ef0 0000062000000015 R_PPC64_JMP_SLOT 0000000000000000 __asinhieee128 + 0
0000000000250fd8 0000067f00000015 R_PPC64_JMP_SLOT 0000000000000000 __csqrtieee128 + 0
0000000000251038 000006ad00000015 R_PPC64_JMP_SLOT 0000000000000000 __cabsieee128 + 0
All these should for POWER_IEEE128 use atan2q@QUADMATH_1.0 etc.
It seems all these come from f951 compiled sources.
For user code, I think the agreement was if you want to use successfully
-mabi=ieeelongdouble, you need glibc 2.32 or later, which is why the Fortran
FE doesn't conditionalize on whether glibc 2.32 is available or not and just
emits __WHATEVERieee128 entrypoints.
But for Fortran compiled sources in libgfortran, we need to use
__WHATEVERieee128 only if glibc 2.32 or later and WHATEVERq (from
libquadmath) otherwise.
The following patch implements that, adds -fbuilding-libgfortran option
similar to e.g. -fbuilding-libgcc used when building libgcc and if
that option is set and the TARGET_GLIBC_{MAJOR,MINOR} macros indicate
no glibc or glibc older than 2.32, it will use the libquadmath APIs
rather than glibc 2.32 APIs.
2022-01-07 Jakub Jelinek <jakub@redhat.com>
gcc/fortran/
* trans-types.c (gfc_init_kinds): When setting abi_kind to 17, if not
targetting glibc 2.32 or later and -fbuilding-libgfortran, set
gfc_real16_is_float128 and c_float128 in gfc_real_kinds.
(gfc_build_real_type): Don't set c_long_double if c_float128 is
already set.
* trans-intrinsic.c (builtin_decl_for_precision): Don't use
long_double_built_in if gfc_real16_is_float128 and
long_double_type_node == gfc_float128_type_node.
* lang.opt (fbuilding-libgfortran): New undocumented option.
libgfortran/
* Makefile.am (AM_FCFLAGS): Add -fbuilding-libgfortran after
-fallow-leading-underscore.
* Makefile.in: Regenerated.
With Fortran adding support for changing the long double format, this
patch removes the code that only allowed C/C++ to change the long double
format for GLIBC 2.32 and later without a warning.
gcc/
2022-01-05 Michael Meissner <meissner@the-meissners.org>
* config/rs6000/rs6000.c (rs6000_option_override_internal): Remove
checks for only C/C++ front ends before allowing the long double
format to change without a warning.
This test FAILs because
f951: Error: '-mabi=ieeelongdouble' requires full ISA 2.06 support
compiler exited with status 1
FAIL: gfortran.dg/pr47614.f -O0 (test for excess errors)
As powerpc64le* only supports -mcpu=power8 and newer, I think we shouldn't
be testing with that option.
2022-01-04 Jakub Jelinek <jakub@redhat.com>
* gfortran.dg/pr47614.f: Don't use -mcpu=power4 for
powerpc64le*-*-linux*.
Testing found that we also need libquadmath to be built with
-mno-gnu-attribute, otherwise -mabi=ieeelongdouble programs don't link.
2022-01-03 Jakub Jelinek <jakub@redhat.com>
* configure.ac: Set XCFLAGS to -mno-gnu-attribute on
powerpc64le*-linux*.
* configure: Regenerated.
This brings the library to compile with all specific functions.
It also corrects the patsubst patterns so the right files
get the flags.
It was necessary to manually add -D__powerpc64__ because apparently
this is not set for Fortran.
libgfortran/ChangeLog:
* Makefile.am: Correct files for compilation flags. Add
-D__powerpc64__ for Fortran sources. Get kinds.inc from
grep of kinds.h and kinds-override.h.
* Makefile.in: Regenerate.
* config.h.in: Regenerate.
* configure: Regenerate.
* configure.ac: Add -mno-gnu-attribute to compile flags.
* generated/_abs_c17.F90: Regenerate.
* generated/_abs_r17.F90: Regenerate.
* generated/_acos_r17.F90: Regenerate.
* generated/_acosh_r17.F90: Regenerate.
* generated/_aimag_c17.F90: Regenerate.
* generated/_aint_r17.F90: Regenerate.
* generated/_anint_r17.F90: Regenerate.
* generated/_asin_r17.F90: Regenerate.
* generated/_asinh_r17.F90: Regenerate.
* generated/_atan2_r17.F90: Regenerate.
* generated/_atan_r17.F90: Regenerate.
* generated/_atanh_r17.F90: Regenerate.
* generated/_conjg_c17.F90: Regenerate.
* generated/_cos_c17.F90: Regenerate.
* generated/_cos_r17.F90: Regenerate.
* generated/_cosh_r17.F90: Regenerate.
* generated/_dim_r17.F90: Regenerate.
* generated/_exp_c17.F90: Regenerate.
* generated/_exp_r17.F90: Regenerate.
* generated/_log10_r17.F90: Regenerate.
* generated/_log_c17.F90: Regenerate.
* generated/_log_r17.F90: Regenerate.
* generated/_mod_r17.F90: Regenerate.
* generated/_sign_r17.F90: Regenerate.
* generated/_sin_c17.F90: Regenerate.
* generated/_sin_r17.F90: Regenerate.
* generated/_sinh_r17.F90: Regenerate.
* generated/_sqrt_c17.F90: Regenerate.
* generated/_sqrt_r17.F90: Regenerate.
* generated/_tan_r17.F90: Regenerate.
* generated/_tanh_r17.F90: Regenerate.
* kinds-override.h: Adjust to trunk.
Change condition to single line so it can be grepped.
* m4/specific.m4: Make sure that real=kind16 is used
for _r17.F90 and _c17.F90 files.
* m4/specific2.m4: Likewise.
The following patch detects the powerpc64le-linux kind == 16 cases
and for the -mabi=ieeelongdouble case (no matter whether it is the
configured in default or just option used on the command line) uses
_r17 or _c17 instead of _r16 or _c17 in the library API names.
From what I can see, e.g. calls to sin on real(kind = 16) works fine
with or without this patch (we call __builtin_sinl and the backend
uses rs6000_mangle_decl_assembler_name which ensures __sinieee128
is called).
What is clearly still broken is IO, where for
real(kind=16) a
a = 1.0
print *, a
end
we call
_gfortran_transfer_real_write (&dt_parm.0, &a, 16);
for both -mabi=ibmlongdouble and -mabi=ieeelongdouble
I don't remember what was the agreement, do we want
_gfortran_transfer_real_write (&dt_parm.0, &a, 17);
for the ieeelongdouble case, or some new entrypoint for
the abi_kind == 17 real/complex IO?
Also, what about kind stored in array descriptors? Shall we use
there the abi_kind or kind?
I guess at least before the IO case is solved there is no point
in checking the testsuite, too many things will be majorly broken...
2021-12-31 Jakub Jelinek <jakub@redhat.com>
* gfortran.h (gfc_real_info): Add abi_kind member.
(gfc_type_abi_kind): Declare.
* trans-types.c (gfc_init_kinds): Initialize abi_kind.
* intrinsic.c (gfc_type_abi_kind): New function.
(conv_name): Use it.
* iresolve.c (resolve_transformational, gfc_resolve_abs,
gfc_resolve_char_achar, gfc_resolve_acos, gfc_resolve_acosh,
gfc_resolve_aimag, gfc_resolve_and, gfc_resolve_aint, gfc_resolve_all,
gfc_resolve_anint, gfc_resolve_any, gfc_resolve_asin,
gfc_resolve_asinh, gfc_resolve_atan, gfc_resolve_atanh,
gfc_resolve_atan2, gfc_resolve_bessel_n2, gfc_resolve_ceiling,
gfc_resolve_cmplx, gfc_resolve_complex, gfc_resolve_cos,
gfc_resolve_cosh, gfc_resolve_count, gfc_resolve_dble,
gfc_resolve_dim, gfc_resolve_dot_product, gfc_resolve_dprod,
gfc_resolve_exp, gfc_resolve_floor, gfc_resolve_hypot,
gfc_resolve_int, gfc_resolve_int2, gfc_resolve_int8, gfc_resolve_long,
gfc_resolve_log, gfc_resolve_log10, gfc_resolve_logical,
gfc_resolve_matmul, gfc_resolve_minmax, gfc_resolve_maxloc,
gfc_resolve_findloc, gfc_resolve_maxval, gfc_resolve_merge,
gfc_resolve_minloc, gfc_resolve_minval, gfc_resolve_mod,
gfc_resolve_modulo, gfc_resolve_nearest, gfc_resolve_or,
gfc_resolve_real, gfc_resolve_realpart, gfc_resolve_reshape,
gfc_resolve_sign, gfc_resolve_sin, gfc_resolve_sinh, gfc_resolve_sqrt,
gfc_resolve_tan, gfc_resolve_tanh, gfc_resolve_transpose,
gfc_resolve_trigd, gfc_resolve_xor, gfc_resolve_random_number):
Likewise.
* trans-decl.c (gfc_build_intrinsic_function_decls): Likewise.
The following patch quiets
../../../libgfortran/generated/in_pack_r17.c:35:1: warning: no previous prototype for ‘internal_pack_r17’ [-Wmissing-prototypes]
../../../libgfortran/generated/in_pack_c17.c:35:1: warning: no previous prototype for ‘internal_pack_c17’ [-Wmissing-prototypes]
../../../libgfortran/generated/in_unpack_r17.c:33:1: warning: no previous prototype for ‘internal_unpack_r17’ [-Wmissing-prototypes]
../../../libgfortran/generated/in_unpack_c17.c:33:1: warning: no previous prototype for ‘internal_unpack_c17’ [-Wmissing-prototypes]
../../../libgfortran/generated/pack_r17.c:73:1: warning: no previous prototype for ‘pack_r17’ [-Wmissing-prototypes]
../../../libgfortran/generated/pack_c17.c:73:1: warning: no previous prototype for ‘pack_c17’ [-Wmissing-prototypes]
../../../libgfortran/generated/unpack_r17.c:34:1: warning: no previous prototype for ‘unpack0_r17’ [-Wmissing-prototypes]
../../../libgfortran/generated/unpack_r17.c:178:1: warning: no previous prototype for ‘unpack1_r17’ [-Wmissing-prototypes]
../../../libgfortran/generated/unpack_c17.c:34:1: warning: no previous prototype for ‘unpack0_c17’ [-Wmissing-prototypes]
../../../libgfortran/generated/unpack_c17.c:178:1: warning: no previous prototype for ‘unpack1_c17’ [-Wmissing-prototypes]
../../../libgfortran/generated/spread_r17.c:34:1: warning: no previous prototype for ‘spread_r17’ [-Wmissing-prototypes]
../../../libgfortran/generated/spread_r17.c:230:1: warning: no previous prototype for ‘spread_scalar_r17’ [-Wmissing-prototypes]
../../../libgfortran/generated/spread_c17.c:34:1: warning: no previous prototype for ‘spread_c17’ [-Wmissing-prototypes]
../../../libgfortran/generated/spread_c17.c:230:1: warning: no previous prototype for ‘spread_scalar_c17’ [-Wmissing-prototypes]
../../../libgfortran/generated/cshift0_r17.c:33:1: warning: no previous prototype for ‘cshift0_r17’ [-Wmissing-prototypes]
../../../libgfortran/generated/cshift0_c17.c:33:1: warning: no previous prototype for ‘cshift0_c17’ [-Wmissing-prototypes]
../../../libgfortran/generated/cshift1_4_r17.c:32:1: warning: no previous prototype for ‘cshift1_4_r17’ [-Wmissing-prototypes]
../../../libgfortran/generated/cshift1_4_c17.c:32:1: warning: no previous prototype for ‘cshift1_4_c17’ [-Wmissing-prototypes]
../../../libgfortran/generated/cshift1_8_r17.c:32:1: warning: no previous prototype for ‘cshift1_8_r17’ [-Wmissing-prototypes]
../../../libgfortran/generated/cshift1_8_c17.c:32:1: warning: no previous prototype for ‘cshift1_8_c17’ [-Wmissing-prototypes]
../../../libgfortran/generated/cshift1_16_r17.c:32:1: warning: no previous prototype for ‘cshift1_16_r17’ [-Wmissing-prototypes]
../../../libgfortran/generated/cshift1_16_c17.c:32:1: warning: no previous prototype for ‘cshift1_16_c17’ [-Wmissing-prototypes]
warnings during libgfortran build and exports the new entrypoints.
Note, not all of them, clearly e.g. there are fewer *_r17* entrypoints than
*_r16* entrypoints, so more work is needed.
2021-12-31 Jakub Jelinek <jakub@redhat.com>
* libgfortran.h (internal_pack_r17, internal_pack_c17,
internal_unpack_r17, internal_unpack_c17, pack_r17, pack_c17,
unpack0_r17, unpack0_c17, unpack1_r17, unpack1_c17, spread_r17,
spread_c17, spread_scalar_r17, spread_scalar_c17, cshift0_r17,
cshift0_c17, cshift1_4_r17, cshift1_8_r17, cshift1_16_r17,
cshift1_4_c17, cshift1_8_c17, cshift1_16_c17): Declare.
* gfortran.map (GFORTRAN_12): Export *_r17 and *_c17.
This prepares the library side for REAL(KIND=17). It is
not yet tested, but at least compiles cleanly on POWER 9
and x86_64.
2021-10-19 Thomas Koenig <tkoenig@gcc.gnu.org>
* Makefile.am: Add _r17 and _c17 files. Build them
with -mabi=ieeelongdouble on POWER.
* Makefile.in: Regenerate.
* configure: Regenerate.
* configure.ac: New flag HAVE_REAL_17.
* kinds-override.h: (HAVE_GFC_REAL_17): New macro.
(HAVE_GFC_COMPLEX_17): New macro.
(GFC_REAL_17_HUGE): New macro.
(GFC_REAL_17_LITERAL_SUFFIX): New macro.
(GFC_REAL_17_LITERAL): New macro.
(GFC_REAL_17_DIGITS): New macro.
(GFC_REAL_17_RADIX): New macro.
* libgfortran.h (POWER_IEEE128): New macro.
(gfc_array_r17): Typedef.
(GFC_DTYPE_REAL_17): New macro.
(GFC_DTYPE_COMPLEX_17): New macro.
(__acoshieee128): Prototype.
(__acosieee128): Prototype.
(__asinhieee128): Prototype.
(__asinieee128): Prototype.
(__atan2ieee128): Prototype.
(__atanhieee128): Prototype.
(__atanieee128): Prototype.
(__coshieee128): Prototype.
(__cosieee128): Prototype.
(__erfieee128): Prototype.
(__expieee128): Prototype.
(__fabsieee128): Prototype.
(__jnieee128): Prototype.
(__log10ieee128): Prototype.
(__logieee128): Prototype.
(__powieee128): Prototype.
(__sinhieee128): Prototype.
(__sinieee128): Prototype.
(__sqrtieee128): Prototype.
(__tanhieee128): Prototype.
(__tanieee128): Prototype.
(__ynieee128): Prototype.
* m4/mtype.m4: Make a bit more readable. Add KIND=17.
* generated/_abs_c17.F90: New file.
* generated/_abs_r17.F90: New file.
* generated/_acos_r17.F90: New file.
* generated/_acosh_r17.F90: New file.
* generated/_aimag_c17.F90: New file.
* generated/_aint_r17.F90: New file.
* generated/_anint_r17.F90: New file.
* generated/_asin_r17.F90: New file.
* generated/_asinh_r17.F90: New file.
* generated/_atan2_r17.F90: New file.
* generated/_atan_r17.F90: New file.
* generated/_atanh_r17.F90: New file.
* generated/_conjg_c17.F90: New file.
* generated/_cos_c17.F90: New file.
* generated/_cos_r17.F90: New file.
* generated/_cosh_r17.F90: New file.
* generated/_dim_r17.F90: New file.
* generated/_exp_c17.F90: New file.
* generated/_exp_r17.F90: New file.
* generated/_log10_r17.F90: New file.
* generated/_log_c17.F90: New file.
* generated/_log_r17.F90: New file.
* generated/_mod_r17.F90: New file.
* generated/_sign_r17.F90: New file.
* generated/_sin_c17.F90: New file.
* generated/_sin_r17.F90: New file.
* generated/_sinh_r17.F90: New file.
* generated/_sqrt_c17.F90: New file.
* generated/_sqrt_r17.F90: New file.
* generated/_tan_r17.F90: New file.
* generated/_tanh_r17.F90: New file.
* generated/bessel_r17.c: New file.
* generated/cshift0_c17.c: New file.
* generated/cshift0_r17.c: New file.
* generated/cshift1_16_c17.c: New file.
* generated/cshift1_16_r17.c: New file.
* generated/cshift1_4_c17.c: New file.
* generated/cshift1_4_r17.c: New file.
* generated/cshift1_8_c17.c: New file.
* generated/cshift1_8_r17.c: New file.
* generated/findloc0_c17.c: New file.
* generated/findloc0_r17.c: New file.
* generated/findloc1_c17.c: New file.
* generated/findloc1_r17.c: New file.
* generated/in_pack_c17.c: New file.
* generated/in_pack_r17.c: New file.
* generated/in_unpack_c17.c: New file.
* generated/in_unpack_r17.c: New file.
* generated/matmul_c17.c: New file.
* generated/matmul_r17.c: New file.
* generated/matmulavx128_c17.c: New file.
* generated/matmulavx128_r17.c: New file.
* generated/maxloc0_16_r17.c: New file.
* generated/maxloc0_4_r17.c: New file.
* generated/maxloc0_8_r17.c: New file.
* generated/maxloc1_16_r17.c: New file.
* generated/maxloc1_4_r17.c: New file.
* generated/maxloc1_8_r17.c: New file.
* generated/maxval_r17.c: New file.
* generated/minloc0_16_r17.c: New file.
* generated/minloc0_4_r17.c: New file.
* generated/minloc0_8_r17.c: New file.
* generated/minloc1_16_r17.c: New file.
* generated/minloc1_4_r17.c: New file.
* generated/minloc1_8_r17.c: New file.
* generated/minval_r17.c: New file.
* generated/norm2_r17.c: New file.
* generated/pack_c17.c: New file.
* generated/pack_r17.c: New file.
* generated/pow_c17_i16.c: New file.
* generated/pow_c17_i4.c: New file.
* generated/pow_c17_i8.c: New file.
* generated/pow_r17_i16.c: New file.
* generated/pow_r17_i4.c: New file.
* generated/pow_r17_i8.c: New file.
* generated/product_c17.c: New file.
* generated/product_r17.c: New file.
* generated/reshape_c17.c: New file.
* generated/reshape_r17.c: New file.
* generated/spread_c17.c: New file.
* generated/spread_r17.c: New file.
* generated/sum_c17.c: New file.
* generated/sum_r17.c: New file.
* generated/unpack_c17.c: New file.
* generated/unpack_r17.c: New file.
The new IRA heuristics would need more work on old-reload targets,
since flattening needs to be able to undo the cost propagation.
It's doable, but hardly seems worth it.
This patch therefore makes all the new calls to
ira_subloop_allocnos_can_differ_p return false if !ira_use_lra_p.
The color_pass code that predated the new function (and that was
the source of ira_subloop_allocnos_can_differ_p) continues to
behave as before.
It's a hack, but at least it has the advantage that the new parameter
would become obviously unused if reload and (!)ira_use_lra_p were
removed. The hack should therefore disappear alongside reload.
gcc/
PR rtl-optimization/103974
* ira-int.h (ira_subloop_allocnos_can_differ_p): Take an
extra argument, default true, that says whether old-reload
targets should be excluded.
* ira-color.c (color_pass): Pass false.
This C++20 header is also supposed to be present for freestanding.
libstdc++-v3/ChangeLog:
PR libstdc++/103726
* include/Makefile.am: Install <source_location> for
freestanding.
* include/Makefile.in: Regenerate.
* include/std/version (__cpp_lib_source_location): Define for
freestanding.
This patch also moves V2HI and V4QImode vector conditional moves
to SSE4.1 targets. Vector cmoves are implemented with SSE logic functions
without -msse4.1, and they are hardly worthwile for narrow vector modes.
More important, we would like to keep vector logic functions for GPR
registers, and the current RTX description of 32-bit vector modes logic
insns does not include the necessary CC reg clobber. Solve these issues by
restricting vector cmove insns for these modes to -msse4.1, where logic
instructions are avoided, and pblend insn is used instead.
A follow-up patch will add clobbers and necessary splits to 32-bit
vector mode logic insns, and in a future patch, ix86_sse_movcc will be
improved to use expand_simple_{unop,binop} to emit logic insns, allowing
us to re-enable 16-bit and 32-bit narrow vector cmoves for -msse2.
2022-01-11 Uroš Bizjak <ubizjak@gmail.com>
gcc/ChangeLog:
PR target/103861
* config/i386/mmx.md (vcond<mode><mode>):
Use VI_16_32 mode iterator. Enable for TARGET_SSE4_1.
(vcondu<mode><mode>): Ditto.
(vcond_mask_<mode><mode>): Ditto.
(mmx_pblendvb_v8qi): Rename from mmx_pblendvb64.
(mmx_pblendvb_<mode>): Rename from mmx_pblendvb32.
Use VI_16_32 mode iterator.
* config/i386/i386-expand.c (ix86_expand_sse_movcc):
Update for rename. Handle V2QImode.
(expand_vec_perm_blend): Update for rename.
gcc/testsuite/ChangeLog:
PR target/103861
* g++.target/i386/pr100637-1b.C (dg-options):
Use -msse4 instead of -msse2.
* g++.target/i386/pr100637-1w.C (dg-options): Ditto.
* g++.target/i386/pr103861-1.C: New test.
* gcc.target/i386/pr100637-4b.c (dg-options):
Use -msse4 instead of -msse2.
* gcc.target/i386/pr103861-4.c: New test.
The following testcase ICEs, because middle-end uses the C++ FE pretty
printing code through langhooks in the diagnostics.
The FE expects OBJ_TYPE_REF_OBJECT's type to be useful (pointer to the
class type it is called on), but in the middle-end conversions between
pointer types are useless, so the actual type can be some random
unrelated pointer type (in the testcase void * pointer). The pretty
printing code then ICEs on it.
The following patch fixes that by sticking the original
OBJ_TYPE_REF_OBJECT's also as type of OBJ_TYPE_REF_TOKEN operand.
That one must be an INTEGER_CST, all the current uses of
OBJ_TYPE_REF_TOKEN just use tree_to_uhwi or tree_to_shwi on it,
and because it is constant, there is no risk of the middle-end propagating
into it some other pointer type. So, approach similar to how MEM_REF
treats its second operand or a couple of internal functions (e.g.
IFN_VA_ARG) some of its parameters.
2022-01-11 Jakub Jelinek <jakub@redhat.com>
PR c++/101597
gcc/
* tree.def (OBJ_TYPE_REF): Document type of OBJ_TYPE_REF_TOKEN.
gcc/cp/
* class.c (build_vfn_ref): Build OBJ_TYPE_REF with INTEGER_CST
OBJ_TYPE_REF_TOKEN with type equal to OBJ_TYPE_REF_OBJECT type.
* error.c (resolve_virtual_fun_from_obj_type_ref): Use type of
OBJ_TYPE_REF_TOKEN rather than type of OBJ_TYPE_REF_OBJECT as
obj_type.
gcc/objc/
* objc-act.c (objc_rewrite_function_call): Build OBJ_TYPE_REF
with INTEGER_CST OBJ_TYPE_REF_TOKEN with type equal to
OBJ_TYPE_REF_OBJECT type.
* objc-next-runtime-abi-01.c (build_objc_method_call): Likewise.
* objc-gnu-runtime-abi-01.c (build_objc_method_call): Likewise.
* objc-next-runtime-abi-02.c (build_v2_objc_method_fixup_call,
build_v2_build_objc_method_call): Likewise.
gcc/testsuite/
* g++.dg/opt/pr101597.C: New test.
The following testcases emit a bogus -Wconversion warning. This is because
conversion_warning function doesn't handle BIT_*_EXPR (only unsafe_conversion_p
that is called during the default: case, and that one doesn't handle
SAVE_EXPRs added because the unsigned char & or | operands promoted to int
have side-effects and =| or =& is used.
The patch handles BIT_IOR_EXPR/BIT_XOR_EXPR like the last 2 operands of
COND_EXPR by recursing on the two operands, if either of them doesn't fit
into the narrower type, complain. BIT_AND_EXPR too, but first it needs to
handle some special cases that unsafe_conversion_p does, namely when one
of the two operands is a constant.
This fixes completely the pr101537.c test and for C also pr103881.c
and doesn't regress anything in the testsuite, for C++ pr103881.c still
emits the bogus warnings.
This is because while the C FE emits in that case a SAVE_EXPR that
conversion_warning can handle already, C++ FE emits
TARGET_EXPR <D.whatever, ...>, something | D.whatever
etc. and conversion_warning handles COMPOUND_EXPR by "recursing" on the
rhs. To handle that case, we'd need for TARGET_EXPR on the lhs remember
in some hash map the mapping from D.whatever to the TARGET_EXPR and when
we see D.whatever, use corresponding TARGET_EXPR initializer instead.
2022-01-11 Jakub Jelinek <jakub@redhat.com>
PR c/101537
PR c/103881
gcc/c-family/
* c-warn.c (conversion_warning): Handle BIT_AND_EXPR, BIT_IOR_EXPR
and BIT_XOR_EXPR.
gcc/testsuite/
* c-c++-common/pr101537.c: New test.
* c-c++-common/pr103881.c: New test.
Here during satisfaction of B's constraints we're failing to reject the
object-less call to the non-static member function A::size ultimately
because satisfaction is performed in the (access) context of the class
template B, which has a dependent base, and so the any_dependent_bases_p
check within build_new_method_call causes us to not reject the call.
(Subsequent constexpr evaluation of the call succeeds since the function
is effectively static.)
This patch fixes this by refining the any_dependent_bases_p check within
build_new_method_call: if we're in a context where 'this' is unavailable,
then we cannot resolve the implicit object regardless of the presence of
a dependent base. So let's also check current_class_ptr alongside a_d_b_p.
PR c++/103831
gcc/cp/ChangeLog:
* call.c (build_new_method_call): Consider dependent bases only
if 'this' is available.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/concepts-class3.C: New test.
* g++.dg/template/non-dependent18.C: New test.
This was approved at the October 2021 plenary. We already have noexcept
in the other places the issue adds it in the spec.
libstdc++-v3/ChangeLog:
* include/std/ranges (ranges::lazy_split_view::_InnerIter::end()):
Add neoxcept (LWG 3593).
This LWG issue was approved at the October 2021 plenary and can be
implemented now that std::optional is fully constexpr.
libstdc++-v3/ChangeLog:
* include/std/ranges (ranges::__detail::__box): Add constexpr to
assignment operators (LWG 3572).
* testsuite/std/ranges/adaptors/filter.cc: Check assignment of a
view that uses copyable-box.
Allow returning dynamic expressions from ADDR_EXPR for
__builtin_dynamic_object_size and also allow offsets to be dynamic.
gcc/ChangeLog:
PR middle-end/70090
* tree-object-size.c (size_valid_p): New function.
(size_for_offset): Remove OFFSET constness assertion.
(addr_object_size): Build dynamic expressions for object
sizes and use size_valid_p to decide if it is valid for the
given OBJECT_SIZE_TYPE.
(compute_builtin_object_size): Allow dynamic offsets when
computing size at O0.
(call_object_size): Call size_valid_p.
(plus_stmt_object_size): Allow non-constant offset and use
size_valid_p to decide if it is valid for the given
OBJECT_SIZE_TYPE.
gcc/testsuite/ChangeLog:
PR middle-end/70090
* gcc.dg/builtin-dynamic-object-size-0.c: Add new tests.
* gcc.dg/builtin-object-size-1.c (test1)
[__builtin_object_size]: Adjust expected output for dynamic
object sizes.
* gcc.dg/builtin-object-size-2.c (test1)
[__builtin_object_size]: Likewise.
* gcc.dg/builtin-object-size-3.c (test1)
[__builtin_object_size]: Likewise.
* gcc.dg/builtin-object-size-4.c (test1)
[__builtin_object_size]: Likewise.
Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>
Handle GIMPLE_PHI and conditionals specially for dynamic objects,
returning PHI/conditional expressions instead of just a MIN/MAX
estimate.
This makes the returned object size variable for loops and conditionals,
so tests need to be adjusted to look for precise size in some cases.
builtin-dynamic-object-size-5.c had to be modified to only look for
success in maximum object size case and skip over the minimum object
size tests because the result is no longer a compile time constant.
I also added some simple tests to exercise conditionals with dynamic
object sizes.
gcc/ChangeLog:
PR middle-end/70090
* builtins.c (fold_builtin_object_size): Adjust for dynamic size
expressions.
* tree-object-size.c: Include gimplify-me.h.
(struct object_size_info): New member UNKNOWNS.
(size_initval_p, size_usable_p, object_sizes_get_raw): New
functions.
(object_sizes_get): Return suitable gimple variable for
object size.
(bundle_sizes): New function.
(object_sizes_set): Use it and handle dynamic object size
expressions.
(object_sizes_set_temp): New function.
(size_for_offset): Adjust for dynamic size expressions.
(emit_phi_nodes, propagate_unknowns, gimplify_size_expressions):
New functions.
(compute_builtin_object_size): Call gimplify_size_expressions
for OST_DYNAMIC.
(dynamic_object_size): New function.
(cond_expr_object_size): Use it.
(phi_dynamic_object_size): New function.
(collect_object_sizes_for): Call it for OST_DYNAMIC. Adjust to
accommodate dynamic object sizes.
gcc/testsuite/ChangeLog:
PR middle-end/70090
* gcc.dg/builtin-dynamic-object-size-0.c: New tests.
* gcc.dg/builtin-dynamic-object-size-10.c: Add comment.
* gcc.dg/builtin-dynamic-object-size-5-main.c: New file.
* gcc.dg/builtin-dynamic-object-size-5.c: Use it and change test
to dg-do run.
* gcc.dg/builtin-object-size-5.c [!N]: Define N.
(test1, test2, test3, test4) [__builtin_object_size]: Expect
exact result for __builtin_dynamic_object_size.
* gcc.dg/builtin-object-size-1.c [__builtin_object_size]: Expect
exact size expressions for __builtin_dynamic_object_size.
* gcc.dg/builtin-object-size-2.c [__builtin_object_size]:
Likewise.
* gcc.dg/builtin-object-size-3.c [__builtin_object_size]:
Likewise.
* gcc.dg/builtin-object-size-4.c [__builtin_object_size]:
Likewise.
Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>
Never try to compute size for offset when the object size is -1, which
is either unknown maximum or uninitialized minimum irrespective of the
osi->pass number.
gcc/ChangeLog:
PR tree-optimization/103961
* tree-object-size.c (plus_stmt_object_size): Always avoid
computing offset for -1 size.
gcc/testsuite/ChangeLog:
PR tree-optimization/103961
* gcc.dg/pr103961.c: New test case.
Co-authored-by: Jakub Jelinek <jakub@redhat.com>
Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>
Produce a summary result for any operation involving too many subranges.
PR tree-optimization/103821
* range-op.cc (range_operator::fold_range): Only do precise ranges
when there are not too many subranges.
PR analyzer/102692 reports a false positive at -O2 from
-Wanalyzer-null-dereference on:
if (!p || q || !p->next)
At the gimple level, -O2 has converted the first || into bitwise or
controlling a jump:
_4 = _2 | _3;
if (_4 != 0)
and a recursive call has been converted to iteration. The analyzer hits
the symbolic value complexity limit whilst analyzing the iteration and
hits a case where _2 is (_Bool)1 (i.e. true) and _3 (i.e. q) is
UNKNOWN(_Bool).
There are two issues leading to the false positive; fixing either issue
fixes the false positive; this patch fixes both for good measure:
(a) The analyzer erronously treats bitwise ops on UNKNOWN(_Bool) as UNKNOWN,
even for case like (1 | UNKNOWN) where we know the result, leading to
bogus edges in the exploded graph. The patch fixes these cases.
(b) The state-handling code creates "UNKNOWN" symbolic values, as a way
to give up when symbolic expressions get too complicated, and in various
other cases. This makes sense when building the exploded graph, since
we want the analysis to terminate, but doesn't make sense when checking
the feasibility along a specific path, where we want precision. The patch
prevents all use of "unknown" symbolic values when performing feasibility
checking of a path (before it merely stopped doing complexity-checking),
by creating a unique placeholder_svalue instead.
This fixes the -Wanalyzer-null-dereference false positive.
Unfortunately, with GCC 12 there's also a false positive from
-Wanalyzer-use-of-uninitialized-value on this code, which is a separate
issue that the patch does not fix.
gcc/analyzer/ChangeLog:
PR analyzer/102692
* diagnostic-manager.cc
(class auto_disable_complexity_checks): Rename to...
(class auto_checking_feasibility): ...this, updating
the calls accordingly.
(epath_finder::explore_feasible_paths): Update for renaming.
* region-model-manager.cc
(region_model_manager::region_model_manager): Update for change from
m_check_complexity to m_checking_feasibility.
(region_model_manager::reject_if_too_complex): Likewise.
(region_model_manager::get_or_create_unknown_svalue): Handle
m_checking_feasibility.
(region_model_manager::create_unique_svalue): New.
(region_model_manager::maybe_fold_binop): Handle BIT_AND_EXPR and
BIT_IOR_EXPRs on booleans where we know the result.
* region-model.cc (test_binop_svalue_folding): Add test coverage
for the above.
* region-model.h (region_model_manager::create_unique_svalue): New
decl.
(region_model_manager::enable_complexity_check): Replace with...
(region_model_manager::begin_checking_feasibility): ...this.
(region_model_manager::disable_complexity_check): Replace with...
(region_model_manager::end_checking_feasibility): ...this.
(region_model_manager::m_check_complexity): Replace with...
(region_model_manager::m_checking_feasibility): ...this.
(region_model_manager::m_managed_dynamic_svalues): New field.
gcc/testsuite/ChangeLog:
PR analyzer/102692
* gcc.dg/analyzer/pr102692.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
The standard says that <coroutine> should be present for freestanding.
That was intentionally left out of the initial implementation, but can
be done without much trouble. The header should be moved to libsupc++ at
some point in stage 1.
The standard also says that <coroutine> defines a std::hash
specialization, which was missing from our implementation. That's a
problem for freestanding (see LWG 3653) so only do that for hosted.
We can use concepts to constrain the __coroutine_traits_impl base class
when compiled with concepts enabled. In a pure C++20 implementation we
would not need that base class at all and could just use a constrained
partial specialization of coroutine_traits. But the absence of the
__coroutine_traits_impl<R, void> base would create an ABI difference
between the non-standard C++14/C++17 support for coroutines and the same
code compiled as C++20. If we drop support for <coroutine> pre-C++20 we
should revisit this.
libstdc++-v3/ChangeLog:
PR libstdc++/103726
* include/Makefile.am: Install <coroutine> for freestanding.
* include/Makefile.in: Regenerate.
* include/std/coroutine: Adjust headers and preprocessor
conditions.
(__coroutine_traits_impl): Use concepts when available.
[_GLIBCXX_HOSTED] (hash<coroutine_handle>): Define.
On the libsdc++ mailing list Lewis Hyatt pointed out the performance
overhead of using sputn in stream inserters, rather than writing
directly to the streambuf's put area:
https://gcc.gnu.org/pipermail/libstdc++/2021-July/052877.html
As Lewis noted, the standard explicitly requires a call to sputn for
inserting a std::basic_string_view or std::basic_string. But for
inserting single characters or null-terminated strings it is more vague,
and so we can improve performance by not using the __ostream_insert
function.
This is a minimal change that avoids __ostream_insert for single
characters. We can use the unformatted basic_ostream::put(charT)
function when we don't need the additional effects of a formatted output
function (i.e. padding and resetting the width). The put function will
insert into the buffer if possible, and only make a virtual call (to
overflow) if the buffer is full.
We could also avoid sputn when inserting null-terminated character
strings, but that would require using a new function for inserting
null-terminated strings, so the existing code using sputn is still used
for basic_string and basic_string_view. My preference is to leave that
for now, and try to improve the standard. We could either remove the
requirement to call sputn, or allow sputn to write directly to the
buffer instead of calling xsputn.
libstdc++-v3/ChangeLog:
* include/std/ostream (operator<<(basic_ostream&, charT)):
Use unformatted input if no padding is needed.
(operator<<(basic_ostream<char>&, char)): Likewise.
gcc/ada/
* libgnat/s-aridou.adb (Double_Divide): Adjust proof of lemma
Prove_Signs, call lemma for commutation of Big and
multiplication.
(Multiply_With_Ovflo_Check): Adjust postcondition of
Prove_Pos_Int.
(Scaled_Divide): Explicit commutation in the proof of lemma
Prove_Multiplication, add new lemma Prove_Shift_Progress for
congruence property that is not proved in a larger context, add
assertions at the end of the loop to state loop invariant
properties.