From e55e40561955a4e732e8b503e37ca148fe162909 Mon Sep 17 00:00:00 2001 From: Georg-Johann Lay Date: Fri, 24 Aug 2012 12:42:48 +0000 Subject: [PATCH] re PR target/54222 ([avr] Implement fixed-point support) libgcc/ PR target/54222 * config/avr/lib1funcs-fixed.S: New file. * config/avr/lib1funcs.S: Include it. Undefine some divmodsi after they are used. (neg2, neg4): New macros. (__mulqihi3,__umulqihi3,__mulhi3): Rewrite non-MUL variants. (__mulhisi3,__umulhisi3,__mulsi3): Rewrite non-MUL variants. (__umulhisi3): Speed up MUL variant if there is enough flash. * config/avr/avr-lib.h (TA, UTA): Adjust according to gcc's avr-modes.def. * config/avr/t-avr (LIB1ASMFUNCS): Add: _fractqqsf, _fractuqqsf, _fracthqsf, _fractuhqsf, _fracthasf, _fractuhasf, _fractsasf, _fractusasf, _fractsfqq, _fractsfuqq, _fractsfhq, _fractsfuhq, _fractsfha, _fractsfsa, _mulqq3, _muluqq3, _mulhq3, _muluhq3, _mulha3, _muluha3, _mulsa3, _mulusa3, _divqq3, _udivuqq3, _divhq3, _udivuhq3, _divha3, _udivuha3, _divsa3, _udivusa3. (LIB2FUNCS_EXCLUDE): Add supported functions. gcc/ PR target/54222 * avr-modes.def (HA, SA, DA, TA, UTA): Adjust modes. * avr/avr-fixed.md: New file. * avr/avr.md: Include it. (cc): Add: minus. (adjust_len): Add: minus, minus64, ufract, sfract. (ALL1, ALL2, ALL4, ORDERED234): New mode iterators. (MOVMODE): Add: QQ, UQQ, HQ, UHQ, HA, UHA, SQ, USQ, SA, USA. (MPUSH): Add: HQ, UHQ, HA, UHA, SQ, USQ, SA, USA. (pushqi1, xload8_A, xload_8, movqi_insn, *reload_inqi, addqi3, subqi3, ashlqi3, *ashlqi3, ashrqi3, lshrqi3, *lshrqi3, *cmpqi, cbranchqi4, *cpse.eq): Generalize to handle all 8-bit modes in ALL1. (*movhi, reload_inhi, addhi3, *addhi3, addhi3_clobber, subhi3, ashlhi3, *ashlhi3_const, ashrhi3, *ashirhi3_const, lshrhi3, *lshrhi3_const, *cmphi, cbranchhi4): Generalize to handle all 16-bit modes in ALL2. (subhi3, casesi, strlenhi): Add clobber when expanding minus:HI. (*movsi, *reload_insi, addsi3, subsi3, ashlsi3, *ashlsi3_const, ashrsi3, *ashrhi3_const, *ashrsi3_const, lshrsi3, *lshrsi3_const, *reversed_tstsi, *cmpsi, cbranchsi4): Generalize to handle all 32-bit modes in ALL4. * avr-dimode.md (ALL8): New mode iterator. (adddi3, adddi3_insn, adddi3_const_insn, subdi3, subdi3_insn, subdi3_const_insn, cbranchdi4, compare_di2, compare_const_di2, ashrdi3, lshrdi3, rotldi3, ashldi3_insn, ashrdi3_insn, lshrdi3_insn, rotldi3_insn): Generalize to handle all 64-bit modes in ALL8. * config/avr/avr-protos.h (avr_to_int_mode): New prototype. (avr_out_fract, avr_out_minus, avr_out_minus64): New prototypes. * config/avr/avr.c (TARGET_FIXED_POINT_SUPPORTED_P): Define to... (avr_fixed_point_supported_p): ...this new static function. (TARGET_BUILD_BUILTIN_VA_LIST): Define to... (avr_build_builtin_va_list): ...this new static function. (avr_adjust_type_node): New static function. (avr_scalar_mode_supported_p): Allow if ALL_FIXED_POINT_MODE_P. (avr_builtin_setjmp_frame_value): Use gen_subhi3 and return new pseudo instead of gen_rtx_MINUS. (avr_print_operand, avr_operand_rtx_cost): Handle: CONST_FIXED. (notice_update_cc): Handle: CC_MINUS. (output_movqi): Generalize to handle respective fixed-point modes. (output_movhi, output_movsisf, avr_2word_insn_p): Ditto. (avr_out_compare, avr_out_plus_1): Also handle fixed-point modes. (avr_assemble_integer): Ditto. (output_reload_in_const, output_reload_insisf): Ditto. (avr_compare_pattern): Skip all modes > 4 bytes. (avr_2word_insn_p): Skip movuqq_insn, movqq_insn. (avr_out_fract, avr_out_minus, avr_out_minus64): New functions. (avr_to_int_mode): New function. 
(adjust_insn_length): Handle: ADJUST_LEN_SFRACT, ADJUST_LEN_UFRACT, ADJUST_LEN_MINUS, ADJUST_LEN_MINUS64. * config/avr/predicates.md (const0_operand): Allow const_fixed. (const_operand, const_or_immediate_operand): New. (nonmemory_or_const_operand): New. * config/avr/constraints.md (Ynn, Y00, Y01, Y02, Ym1, Ym2, YIJ): New constraints. * config/avr/avr.h (LONG_LONG_ACCUM_TYPE_SIZE): Define. From-SVN: r190644 --- gcc/ChangeLog | 59 ++ gcc/config/avr/avr-dimode.md | 189 +++-- gcc/config/avr/avr-fixed.md | 287 ++++++++ gcc/config/avr/avr-modes.def | 27 + gcc/config/avr/avr-protos.h | 5 + gcc/config/avr/avr.c | 592 ++++++++++++++- gcc/config/avr/avr.h | 1 + gcc/config/avr/avr.md | 1043 +++++++++++++++------------ gcc/config/avr/constraints.md | 44 ++ gcc/config/avr/predicates.md | 20 +- libgcc/ChangeLog | 20 + libgcc/config/avr/avr-lib.h | 76 ++ libgcc/config/avr/lib1funcs-fixed.S | 874 ++++++++++++++++++++++ libgcc/config/avr/lib1funcs.S | 421 +++++++---- libgcc/config/avr/t-avr | 65 ++ 15 files changed, 3046 insertions(+), 677 deletions(-) create mode 100644 gcc/config/avr/avr-fixed.md create mode 100644 libgcc/config/avr/lib1funcs-fixed.S diff --git a/gcc/ChangeLog b/gcc/ChangeLog index a39bc8bfbe9..04d08a4e521 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,62 @@ +2012-08-24 Georg-Johann Lay + + PR target/54222 + * avr-modes.def (HA, SA, DA, TA, UTA): Adjust modes. + * avr/avr-fixed.md: New file. + * avr/avr.md: Include it. + (cc): Add: minus. + (adjust_len): Add: minus, minus64, ufract, sfract. + (ALL1, ALL2, ALL4, ORDERED234): New mode iterators. + (MOVMODE): Add: QQ, UQQ, HQ, UHQ, HA, UHA, SQ, USQ, SA, USA. + (MPUSH): Add: HQ, UHQ, HA, UHA, SQ, USQ, SA, USA. + (pushqi1, xload8_A, xload_8, movqi_insn, *reload_inqi, addqi3, + subqi3, ashlqi3, *ashlqi3, ashrqi3, lshrqi3, *lshrqi3, *cmpqi, + cbranchqi4, *cpse.eq): Generalize to handle all 8-bit modes in ALL1. + (*movhi, reload_inhi, addhi3, *addhi3, addhi3_clobber, subhi3, + ashlhi3, *ashlhi3_const, ashrhi3, *ashirhi3_const, lshrhi3, + *lshrhi3_const, *cmphi, cbranchhi4): Generalize to handle all + 16-bit modes in ALL2. + (subhi3, casesi, strlenhi): Add clobber when expanding minus:HI. + (*movsi, *reload_insi, addsi3, subsi3, ashlsi3, *ashlsi3_const, + ashrsi3, *ashrhi3_const, *ashrsi3_const, lshrsi3, *lshrsi3_const, + *reversed_tstsi, *cmpsi, cbranchsi4): Generalize to handle all + 32-bit modes in ALL4. + * avr-dimode.md (ALL8): New mode iterator. + (adddi3, adddi3_insn, adddi3_const_insn, subdi3, subdi3_insn, + subdi3_const_insn, cbranchdi4, compare_di2, + compare_const_di2, ashrdi3, lshrdi3, rotldi3, ashldi3_insn, + ashrdi3_insn, lshrdi3_insn, rotldi3_insn): Generalize to handle + all 64-bit modes in ALL8. + * config/avr/avr-protos.h (avr_to_int_mode): New prototype. + (avr_out_fract, avr_out_minus, avr_out_minus64): New prototypes. + * config/avr/avr.c (TARGET_FIXED_POINT_SUPPORTED_P): Define to... + (avr_fixed_point_supported_p): ...this new static function. + (TARGET_BUILD_BUILTIN_VA_LIST): Define to... + (avr_build_builtin_va_list): ...this new static function. + (avr_adjust_type_node): New static function. + (avr_scalar_mode_supported_p): Allow if ALL_FIXED_POINT_MODE_P. + (avr_builtin_setjmp_frame_value): Use gen_subhi3 and return new + pseudo instead of gen_rtx_MINUS. + (avr_print_operand, avr_operand_rtx_cost): Handle: CONST_FIXED. + (notice_update_cc): Handle: CC_MINUS. + (output_movqi): Generalize to handle respective fixed-point modes. + (output_movhi, output_movsisf, avr_2word_insn_p): Ditto. 
+ (avr_out_compare, avr_out_plus_1): Also handle fixed-point modes. + (avr_assemble_integer): Ditto. + (output_reload_in_const, output_reload_insisf): Ditto. + (avr_compare_pattern): Skip all modes > 4 bytes. + (avr_2word_insn_p): Skip movuqq_insn, movqq_insn. + (avr_out_fract, avr_out_minus, avr_out_minus64): New functions. + (avr_to_int_mode): New function. + (adjust_insn_length): Handle: ADJUST_LEN_SFRACT, + ADJUST_LEN_UFRACT, ADJUST_LEN_MINUS, ADJUST_LEN_MINUS64. + * config/avr/predicates.md (const0_operand): Allow const_fixed. + (const_operand, const_or_immediate_operand): New. + (nonmemory_or_const_operand): New. + * config/avr/constraints.md (Ynn, Y00, Y01, Y02, Ym1, Ym2, YIJ): + New constraints. + * config/avr/avr.h (LONG_LONG_ACCUM_TYPE_SIZE): Define. + 2012-08-23 Kenneth Zadeck <zadeck@naturalbridge.com> * alias.c (rtx_equal_for_memref_p): Convert constant cases. diff --git a/gcc/config/avr/avr-dimode.md b/gcc/config/avr/avr-dimode.md index 3db069b3ad7..ed5752319eb 100644 --- a/gcc/config/avr/avr-dimode.md +++ b/gcc/config/avr/avr-dimode.md @@ -47,44 +47,58 @@ [(ACC_A 18) (ACC_B 10)]) +;; Supported modes that are 8 bytes wide +(define_mode_iterator ALL8 [(DI "") + (DQ "") (UDQ "") + (DA "") (UDA "") + (TA "") (UTA "")]) + ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;; Addition ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; -(define_expand "adddi3" - [(parallel [(match_operand:DI 0 "general_operand" "") - (match_operand:DI 1 "general_operand" "") - (match_operand:DI 2 "general_operand" "")])] +;; "adddi3" +;; "adddq3" "addudq3" +;; "addda3" "adduda3" +;; "addta3" "adduta3" +(define_expand "add<mode>3" + [(parallel [(match_operand:ALL8 0 "general_operand" "") + (match_operand:ALL8 1 "general_operand" "") + (match_operand:ALL8 2 "general_operand" "")])] "avr_have_dimode" { - rtx acc_a = gen_rtx_REG (DImode, ACC_A); + rtx acc_a = gen_rtx_REG (<MODE>mode, ACC_A); emit_move_insn (acc_a, operands[1]); - if (s8_operand (operands[2], VOIDmode)) + if (DImode == <MODE>mode + && s8_operand (operands[2], VOIDmode)) { emit_move_insn (gen_rtx_REG (QImode, REG_X), operands[2]); emit_insn (gen_adddi3_const8_insn ()); } - else if (CONST_INT_P (operands[2]) - || CONST_DOUBLE_P (operands[2])) + else if (const_operand (operands[2], GET_MODE (operands[2]))) { - emit_insn (gen_adddi3_const_insn (operands[2])); + emit_insn (gen_add<mode>3_const_insn (operands[2])); } else { - emit_move_insn (gen_rtx_REG (DImode, ACC_B), operands[2]); - emit_insn (gen_adddi3_insn ()); + emit_move_insn (gen_rtx_REG (<MODE>mode, ACC_B), operands[2]); + emit_insn (gen_add<mode>3_insn ()); } emit_move_insn (operands[0], acc_a); DONE; }) -(define_insn "adddi3_insn" - [(set (reg:DI ACC_A) - (plus:DI (reg:DI ACC_A) - (reg:DI ACC_B)))] +;; "adddi3_insn" +;; "adddq3_insn" "addudq3_insn" +;; "addda3_insn" "adduda3_insn" +;; "addta3_insn" "adduta3_insn" +(define_insn "add<mode>3_insn" + [(set (reg:ALL8 ACC_A) + (plus:ALL8 (reg:ALL8 ACC_A) + (reg:ALL8 ACC_B)))] "avr_have_dimode" "%~call __adddi3" [(set_attr "adjust_len" "call") @@ -99,10 +113,14 @@ [(set_attr "adjust_len" "call") (set_attr "cc" "clobber")]) -(define_insn "adddi3_const_insn" - [(set (reg:DI ACC_A) - (plus:DI (reg:DI ACC_A) - (match_operand:DI 0 "const_double_operand" "n")))] +;; "adddi3_const_insn" +;; "adddq3_const_insn" "addudq3_const_insn" +;; "addda3_const_insn" "adduda3_const_insn" +;; "addta3_const_insn" "adduta3_const_insn" +(define_insn "add<mode>3_const_insn" + [(set (reg:ALL8 ACC_A) + (plus:ALL8 (reg:ALL8 ACC_A) + (match_operand:ALL8 0 "const_operand" "n Ynn")))] "avr_have_dimode
&& !s8_operand (operands[0], VOIDmode)" { @@ -116,30 +134,62 @@ ;; Subtraction ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; -(define_expand "subdi3" - [(parallel [(match_operand:DI 0 "general_operand" "") - (match_operand:DI 1 "general_operand" "") - (match_operand:DI 2 "general_operand" "")])] +;; "subdi3" +;; "subdq3" "subudq3" +;; "subda3" "subuda3" +;; "subta3" "subuta3" +(define_expand "sub<mode>3" + [(parallel [(match_operand:ALL8 0 "general_operand" "") + (match_operand:ALL8 1 "general_operand" "") + (match_operand:ALL8 2 "general_operand" "")])] "avr_have_dimode" { - rtx acc_a = gen_rtx_REG (DImode, ACC_A); + rtx acc_a = gen_rtx_REG (<MODE>mode, ACC_A); emit_move_insn (acc_a, operands[1]); - emit_move_insn (gen_rtx_REG (DImode, ACC_B), operands[2]); - emit_insn (gen_subdi3_insn ()); + + if (const_operand (operands[2], GET_MODE (operands[2]))) + { + emit_insn (gen_sub<mode>3_const_insn (operands[2])); + } + else + { + emit_move_insn (gen_rtx_REG (<MODE>mode, ACC_B), operands[2]); + emit_insn (gen_sub<mode>3_insn ()); + } + emit_move_insn (operands[0], acc_a); DONE; }) -(define_insn "subdi3_insn" - [(set (reg:DI ACC_A) - (minus:DI (reg:DI ACC_A) - (reg:DI ACC_B)))] +;; "subdi3_insn" +;; "subdq3_insn" "subudq3_insn" +;; "subda3_insn" "subuda3_insn" +;; "subta3_insn" "subuta3_insn" +(define_insn "sub<mode>3_insn" + [(set (reg:ALL8 ACC_A) + (minus:ALL8 (reg:ALL8 ACC_A) + (reg:ALL8 ACC_B)))] "avr_have_dimode" "%~call __subdi3" [(set_attr "adjust_len" "call") (set_attr "cc" "set_czn")]) +;; "subdi3_const_insn" +;; "subdq3_const_insn" "subudq3_const_insn" +;; "subda3_const_insn" "subuda3_const_insn" +;; "subta3_const_insn" "subuta3_const_insn" +(define_insn "sub<mode>3_const_insn" + [(set (reg:ALL8 ACC_A) + (minus:ALL8 (reg:ALL8 ACC_A) + (match_operand:ALL8 0 "const_operand" "n Ynn")))] + "avr_have_dimode" + { + return avr_out_minus64 (operands[0], NULL); + } + [(set_attr "adjust_len" "minus64") + (set_attr "cc" "clobber")]) + ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;; Negation @@ -180,15 +230,19 @@ (pc)))] "avr_have_dimode") -(define_expand "cbranchdi4" - [(parallel [(match_operand:DI 1 "register_operand" "") - (match_operand:DI 2 "nonmemory_operand" "") +;; "cbranchdi4" +;; "cbranchdq4" "cbranchudq4" +;; "cbranchda4" "cbranchuda4" +;; "cbranchta4" "cbranchuta4" +(define_expand "cbranch<mode>4" + [(parallel [(match_operand:ALL8 1 "register_operand" "") + (match_operand:ALL8 2 "nonmemory_operand" "") (match_operator 0 "ordered_comparison_operator" [(cc0) (const_int 0)]) (label_ref (match_operand 3 "" ""))])] "avr_have_dimode" { - rtx acc_a = gen_rtx_REG (DImode, ACC_A); + rtx acc_a = gen_rtx_REG (<MODE>mode, ACC_A); emit_move_insn (acc_a, operands[1]); @@ -197,25 +251,28 @@ emit_move_insn (gen_rtx_REG (QImode, REG_X), operands[2]); emit_insn (gen_compare_const8_di2 ()); } - else if (CONST_INT_P (operands[2]) - || CONST_DOUBLE_P (operands[2])) + else if (const_operand (operands[2], GET_MODE (operands[2]))) { - emit_insn (gen_compare_const_di2 (operands[2])); + emit_insn (gen_compare_const_<mode>2 (operands[2])); } else { - emit_move_insn (gen_rtx_REG (DImode, ACC_B), operands[2]); - emit_insn (gen_compare_di2 ()); + emit_move_insn (gen_rtx_REG (<MODE>mode, ACC_B), operands[2]); + emit_insn (gen_compare_<mode>2 ()); } emit_jump_insn (gen_conditional_jump (operands[0], operands[3])); DONE; }) -(define_insn "compare_di2" +;; "compare_di2" +;; "compare_dq2" "compare_udq2" +;; "compare_da2" "compare_uda2" +;; "compare_ta2" "compare_uta2" +(define_insn "compare_<mode>2" [(set (cc0) - (compare (reg:DI ACC_A) - (reg:DI 
ACC_B)))] + (compare (reg:ALL8 ACC_A) + (reg:ALL8 ACC_B)))] "avr_have_dimode" "%~call __cmpdi2" [(set_attr "adjust_len" "call") @@ -230,10 +287,14 @@ [(set_attr "adjust_len" "call") (set_attr "cc" "compare")]) -(define_insn "compare_const_di2" +;; "compare_const_di2" +;; "compare_const_dq2" "compare_const_udq2" +;; "compare_const_da2" "compare_const_uda2" +;; "compare_const_ta2" "compare_const_uta2" +(define_insn "compare_const_<mode>2" [(set (cc0) - (compare (reg:DI ACC_A) - (match_operand:DI 0 "const_double_operand" "n"))) + (compare (reg:ALL8 ACC_A) + (match_operand:ALL8 0 "const_operand" "n Ynn"))) (clobber (match_scratch:QI 1 "=&d"))] "avr_have_dimode && !s8_operand (operands[0], VOIDmode)" @@ -254,29 +315,39 @@ ;; Shift functions from libgcc are called without defining these insns, ;; but with them we can describe their reduced register footprint. -;; "ashldi3" -;; "ashrdi3" -;; "lshrdi3" -;; "rotldi3" -(define_expand "<code_stdname>di3" - [(parallel [(match_operand:DI 0 "general_operand" "") - (di_shifts:DI (match_operand:DI 1 "general_operand" "") - (match_operand:QI 2 "general_operand" ""))])] +;; "ashldi3" "ashrdi3" "lshrdi3" "rotldi3" +;; "ashldq3" "ashrdq3" "lshrdq3" "rotldq3" +;; "ashlda3" "ashrda3" "lshrda3" "rotlda3" +;; "ashlta3" "ashrta3" "lshrta3" "rotlta3" +;; "ashludq3" "ashrudq3" "lshrudq3" "rotludq3" +;; "ashluda3" "ashruda3" "lshruda3" "rotluda3" +;; "ashluta3" "ashruta3" "lshruta3" "rotluta3" +(define_expand "<code_stdname><mode>3" + [(parallel [(match_operand:ALL8 0 "general_operand" "") + (di_shifts:ALL8 (match_operand:ALL8 1 "general_operand" "") + (match_operand:QI 2 "general_operand" ""))])] "avr_have_dimode" { - rtx acc_a = gen_rtx_REG (DImode, ACC_A); + rtx acc_a = gen_rtx_REG (<MODE>mode, ACC_A); emit_move_insn (acc_a, operands[1]); emit_move_insn (gen_rtx_REG (QImode, 16), operands[2]); - emit_insn (gen_<code_stdname>di3_insn ()); + emit_insn (gen_<code_stdname><mode>3_insn ()); emit_move_insn (operands[0], acc_a); DONE; }) -(define_insn "<code_stdname>di3_insn" - [(set (reg:DI ACC_A) - (di_shifts:DI (reg:DI ACC_A) - (reg:QI 16)))] +;; "ashldi3_insn" "ashrdi3_insn" "lshrdi3_insn" "rotldi3_insn" +;; "ashldq3_insn" "ashrdq3_insn" "lshrdq3_insn" "rotldq3_insn" +;; "ashlda3_insn" "ashrda3_insn" "lshrda3_insn" "rotlda3_insn" +;; "ashlta3_insn" "ashrta3_insn" "lshrta3_insn" "rotlta3_insn" +;; "ashludq3_insn" "ashrudq3_insn" "lshrudq3_insn" "rotludq3_insn" +;; "ashluda3_insn" "ashruda3_insn" "lshruda3_insn" "rotluda3_insn" +;; "ashluta3_insn" "ashruta3_insn" "lshruta3_insn" "rotluta3_insn" +(define_insn "<code_stdname><mode>3_insn" + [(set (reg:ALL8 ACC_A) + (di_shifts:ALL8 (reg:ALL8 ACC_A) + (reg:QI 16)))] "avr_have_dimode" "%~call __<code_stdname>di3" [(set_attr "adjust_len" "call") diff --git a/gcc/config/avr/avr-fixed.md b/gcc/config/avr/avr-fixed.md new file mode 100644 index 00000000000..bfbdaecf215 --- /dev/null +++ b/gcc/config/avr/avr-fixed.md @@ -0,0 +1,287 @@ +;; This file contains instructions that support fixed-point operations +;; for Atmel AVR micro controllers. +;; Copyright (C) 2012 +;; Free Software Foundation, Inc. +;; +;; Contributed by Sean D'Epagnier (sean@depagnier.com) +;; Georg-Johann Lay (avr@gjlay.de) +;; This file is part of GCC. +;; +;; GCC is free software; you can redistribute it and/or modify +;; it under the terms of the GNU General Public License as published by +;; the Free Software Foundation; either version 3, or (at your option) +;; any later version. +;; +;; GCC is distributed in the hope that it will be useful, +;; but WITHOUT ANY WARRANTY; without even the implied warranty of +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See the +;; GNU General Public License for more details. +;; +;; You should have received a copy of the GNU General Public License +;; along with GCC; see the file COPYING3. If not see +;; <http://www.gnu.org/licenses/>. + +(define_mode_iterator ALL1Q [(QQ "") (UQQ "")]) +(define_mode_iterator ALL2Q [(HQ "") (UHQ "")]) +(define_mode_iterator ALL2A [(HA "") (UHA "")]) +(define_mode_iterator ALL2QA [(HQ "") (UHQ "") + (HA "") (UHA "")]) +(define_mode_iterator ALL4A [(SA "") (USA "")]) + +;;; Conversions + +(define_mode_iterator FIXED_A + [(QQ "") (UQQ "") + (HQ "") (UHQ "") (HA "") (UHA "") + (SQ "") (USQ "") (SA "") (USA "") + (DQ "") (UDQ "") (DA "") (UDA "") + (TA "") (UTA "") + (QI "") (HI "") (SI "") (DI "")]) + +;; Same, so that we can build cross products + +(define_mode_iterator FIXED_B + [(QQ "") (UQQ "") + (HQ "") (UHQ "") (HA "") (UHA "") + (SQ "") (USQ "") (SA "") (USA "") + (DQ "") (UDQ "") (DA "") (UDA "") + (TA "") (UTA "") + (QI "") (HI "") (SI "") (DI "")]) + +(define_insn "fract<FIXED_B:mode><FIXED_A:mode>2" + [(set (match_operand:FIXED_A 0 "register_operand" "=r") + (fract_convert:FIXED_A + (match_operand:FIXED_B 1 "register_operand" "r")))] + "<FIXED_B:MODE>mode != <FIXED_A:MODE>mode" + { + return avr_out_fract (insn, operands, true, NULL); + } + [(set_attr "cc" "clobber") + (set_attr "adjust_len" "sfract")]) + +(define_insn "fractuns<FIXED_B:mode><FIXED_A:mode>2" + [(set (match_operand:FIXED_A 0 "register_operand" "=r") + (unsigned_fract_convert:FIXED_A + (match_operand:FIXED_B 1 "register_operand" "r")))] + "<FIXED_B:MODE>mode != <FIXED_A:MODE>mode" + { + return avr_out_fract (insn, operands, false, NULL); + } + [(set_attr "cc" "clobber") + (set_attr "adjust_len" "ufract")]) + +;****************************************************************************** +; mul + +;; "mulqq3" "muluqq3" +(define_expand "mul<mode>3" + [(parallel [(match_operand:ALL1Q 0 "register_operand" "") + (match_operand:ALL1Q 1 "register_operand" "") + (match_operand:ALL1Q 2 "register_operand" "")])] + "" + { + emit_insn (AVR_HAVE_MUL + ? 
gen_mul<mode>3_enh (operands[0], operands[1], operands[2]) + : gen_mul<mode>3_nomul (operands[0], operands[1], operands[2])); + DONE; + }) + +(define_insn "mulqq3_enh" + [(set (match_operand:QQ 0 "register_operand" "=r") + (mult:QQ (match_operand:QQ 1 "register_operand" "a") + (match_operand:QQ 2 "register_operand" "a")))] + "AVR_HAVE_MUL" + "fmuls %1,%2\;dec r1\;brvs 0f\;inc r1\;0:\;mov %0,r1\;clr __zero_reg__" + [(set_attr "length" "6") + (set_attr "cc" "clobber")]) + +(define_insn "muluqq3_enh" + [(set (match_operand:UQQ 0 "register_operand" "=r") + (mult:UQQ (match_operand:UQQ 1 "register_operand" "r") + (match_operand:UQQ 2 "register_operand" "r")))] + "AVR_HAVE_MUL" + "mul %1,%2\;mov %0,r1\;clr __zero_reg__" + [(set_attr "length" "3") + (set_attr "cc" "clobber")]) + +(define_expand "mulqq3_nomul" + [(set (reg:QQ 24) + (match_operand:QQ 1 "register_operand" "")) + (set (reg:QQ 25) + (match_operand:QQ 2 "register_operand" "")) + ;; "*mulqq3.call" + (parallel [(set (reg:QQ 23) + (mult:QQ (reg:QQ 24) + (reg:QQ 25))) + (clobber (reg:QI 22)) + (clobber (reg:HI 24))]) + (set (match_operand:QQ 0 "register_operand" "") + (reg:QQ 23))] + "!AVR_HAVE_MUL") + +(define_expand "muluqq3_nomul" + [(set (reg:UQQ 22) + (match_operand:UQQ 1 "register_operand" "")) + (set (reg:UQQ 24) + (match_operand:UQQ 2 "register_operand" "")) + ;; "*umulqihi3.call" + (parallel [(set (reg:HI 24) + (mult:HI (zero_extend:HI (reg:QI 22)) + (zero_extend:HI (reg:QI 24)))) + (clobber (reg:QI 21)) + (clobber (reg:HI 22))]) + (set (match_operand:UQQ 0 "register_operand" "") + (reg:UQQ 25))] + "!AVR_HAVE_MUL") + +(define_insn "*mulqq3.call" + [(set (reg:QQ 23) + (mult:QQ (reg:QQ 24) + (reg:QQ 25))) + (clobber (reg:QI 22)) + (clobber (reg:HI 24))] + "!AVR_HAVE_MUL" + "%~call __mulqq3" + [(set_attr "type" "xcall") + (set_attr "cc" "clobber")]) + + +;; "mulhq3" "muluhq3" +;; "mulha3" "muluha3" +(define_expand "mul<mode>3" + [(set (reg:ALL2QA 18) + (match_operand:ALL2QA 1 "register_operand" "")) + (set (reg:ALL2QA 26) + (match_operand:ALL2QA 2 "register_operand" "")) + ;; "*mulhq3.call.enh" + (parallel [(set (reg:ALL2QA 24) + (mult:ALL2QA (reg:ALL2QA 18) + (reg:ALL2QA 26))) + (clobber (reg:HI 22))]) + (set (match_operand:ALL2QA 0 "register_operand" "") + (reg:ALL2QA 24))] + "AVR_HAVE_MUL") + +;; "*mulhq3.call" "*muluhq3.call" +;; "*mulha3.call" "*muluha3.call" +(define_insn "*mul<mode>3.call" + [(set (reg:ALL2QA 24) + (mult:ALL2QA (reg:ALL2QA 18) + (reg:ALL2QA 26))) + (clobber (reg:HI 22))] + "AVR_HAVE_MUL" + "%~call __mul<mode>3" + [(set_attr "type" "xcall") + (set_attr "cc" "clobber")]) + + +;; On the enhanced core, don't clobber either input and use a separate output + +;; "mulsa3" "mulusa3" +(define_expand "mul<mode>3" + [(set (reg:ALL4A 16) + (match_operand:ALL4A 1 "register_operand" "")) + (set (reg:ALL4A 20) + (match_operand:ALL4A 2 "register_operand" "")) + (set (reg:ALL4A 24) + (mult:ALL4A (reg:ALL4A 16) + (reg:ALL4A 20))) + (set (match_operand:ALL4A 0 "register_operand" "") + (reg:ALL4A 24))] + "AVR_HAVE_MUL") + +;; "*mulsa3.call" "*mulusa3.call" +(define_insn "*mul<mode>3.call" + [(set (reg:ALL4A 24) + (mult:ALL4A (reg:ALL4A 16) + (reg:ALL4A 20)))] + "AVR_HAVE_MUL" + "%~call __mul<mode>3" + [(set_attr "type" "xcall") + (set_attr "cc" "clobber")]) + +; / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / +; div + +(define_code_iterator usdiv [udiv div]) + +;; "divqq3" "udivuqq3" +(define_expand "<code><mode>3" + [(set (reg:ALL1Q 25) + (match_operand:ALL1Q 1 "register_operand" "")) + (set (reg:ALL1Q 22) + (match_operand:ALL1Q 2 "register_operand" "")) + 
(parallel [(set (reg:ALL1Q 24) + (usdiv:ALL1Q (reg:ALL1Q 25) + (reg:ALL1Q 22))) + (clobber (reg:QI 25))]) + (set (match_operand:ALL1Q 0 "register_operand" "") + (reg:ALL1Q 24))]) + +;; "*divqq3.call" "*udivuqq3.call" +(define_insn "*<code><mode>3.call" + [(set (reg:ALL1Q 24) + (usdiv:ALL1Q (reg:ALL1Q 25) + (reg:ALL1Q 22))) + (clobber (reg:QI 25))] + "" + "%~call __<code><mode>3" + [(set_attr "type" "xcall") + (set_attr "cc" "clobber")]) + +;; "divhq3" "udivuhq3" +;; "divha3" "udivuha3" +(define_expand "<code><mode>3" + [(set (reg:ALL2QA 26) + (match_operand:ALL2QA 1 "register_operand" "")) + (set (reg:ALL2QA 22) + (match_operand:ALL2QA 2 "register_operand" "")) + (parallel [(set (reg:ALL2QA 24) + (usdiv:ALL2QA (reg:ALL2QA 26) + (reg:ALL2QA 22))) + (clobber (reg:HI 26)) + (clobber (reg:QI 21))]) + (set (match_operand:ALL2QA 0 "register_operand" "") + (reg:ALL2QA 24))]) + +;; "*divhq3.call" "*udivuhq3.call" +;; "*divha3.call" "*udivuha3.call" +(define_insn "*<code><mode>3.call" + [(set (reg:ALL2QA 24) + (usdiv:ALL2QA (reg:ALL2QA 26) + (reg:ALL2QA 22))) + (clobber (reg:HI 26)) + (clobber (reg:QI 21))] + "" + "%~call __<code><mode>3" + [(set_attr "type" "xcall") + (set_attr "cc" "clobber")]) + +;; Note the first parameter gets passed in already offset by 2 bytes + +;; "divsa3" "udivusa3" +(define_expand "<code><mode>3" + [(set (reg:ALL4A 24) + (match_operand:ALL4A 1 "register_operand" "")) + (set (reg:ALL4A 18) + (match_operand:ALL4A 2 "register_operand" "")) + (parallel [(set (reg:ALL4A 22) + (usdiv:ALL4A (reg:ALL4A 24) + (reg:ALL4A 18))) + (clobber (reg:HI 26)) + (clobber (reg:HI 30))]) + (set (match_operand:ALL4A 0 "register_operand" "") + (reg:ALL4A 22))]) + +;; "*divsa3.call" "*udivusa3.call" +(define_insn "*<code><mode>3.call" + [(set (reg:ALL4A 22) + (usdiv:ALL4A (reg:ALL4A 24) + (reg:ALL4A 18))) + (clobber (reg:HI 26)) + (clobber (reg:HI 30))] + "" + "%~call __<code><mode>3" + [(set_attr "type" "xcall") + (set_attr "cc" "clobber")]) diff --git a/gcc/config/avr/avr-modes.def b/gcc/config/avr/avr-modes.def index 4a16f888ddf..09e6b4983f0 100644 --- a/gcc/config/avr/avr-modes.def +++ b/gcc/config/avr/avr-modes.def @@ -1 +1,28 @@ FRACTIONAL_INT_MODE (PSI, 24, 3); + +/* On 8-bit machines it requires fewer instructions for fixed-point + routines if the decimal place is on a byte boundary, which is not + the default for signed accum types. */ + +ADJUST_IBIT (HA, 7); +ADJUST_FBIT (HA, 8); + +ADJUST_IBIT (SA, 15); +ADJUST_FBIT (SA, 16); + +ADJUST_IBIT (DA, 31); +ADJUST_FBIT (DA, 32); + +/* Make TA and UTA 64 bits wide. + 128-bit wide modes would be insane on an 8-bit machine. + This needs special treatment in avr.c and avr-lib.h.
*/ + +ADJUST_BYTESIZE (TA, 8); +ADJUST_ALIGNMENT (TA, 1); +ADJUST_IBIT (TA, 15); +ADJUST_FBIT (TA, 48); + +ADJUST_BYTESIZE (UTA, 8); +ADJUST_ALIGNMENT (UTA, 1); +ADJUST_IBIT (UTA, 16); +ADJUST_FBIT (UTA, 48); diff --git a/gcc/config/avr/avr-protos.h b/gcc/config/avr/avr-protos.h index 7b9b05effa6..5d6fabb6b6d 100644 --- a/gcc/config/avr/avr-protos.h +++ b/gcc/config/avr/avr-protos.h @@ -79,6 +79,9 @@ extern const char* avr_load_lpm (rtx, rtx*, int*); extern bool avr_rotate_bytes (rtx operands[]); +extern const char* avr_out_fract (rtx, rtx[], bool, int*); +extern rtx avr_to_int_mode (rtx); + extern void expand_prologue (void); extern void expand_epilogue (bool); extern bool avr_emit_movmemhi (rtx*); @@ -92,6 +95,8 @@ extern const char* avr_out_plus (rtx*, int*, int*); extern const char* avr_out_plus_noclobber (rtx*, int*, int*); extern const char* avr_out_plus64 (rtx, int*); extern const char* avr_out_addto_sp (rtx*, int*); +extern const char* avr_out_minus (rtx*, int*, int*); +extern const char* avr_out_minus64 (rtx, int*); extern const char* avr_out_xload (rtx, rtx*, int*); extern const char* avr_out_movmem (rtx, rtx*, int*); extern const char* avr_out_insert_bits (rtx*, int*); diff --git a/gcc/config/avr/avr.c b/gcc/config/avr/avr.c index e3b85d69b55..c17533000c7 100644 --- a/gcc/config/avr/avr.c +++ b/gcc/config/avr/avr.c @@ -49,6 +49,10 @@ #include "params.h" #include "df.h" +#ifndef CONST_FIXED_P +#define CONST_FIXED_P(X) (CONST_FIXED == GET_CODE (X)) +#endif + /* Maximal allowed offset for an address in the LD command */ #define MAX_LD_OFFSET(MODE) (64 - (signed)GET_MODE_SIZE (MODE)) @@ -264,6 +268,23 @@ avr_popcount_each_byte (rtx xval, int n_bytes, int pop_mask) return true; } + +/* Access some RTX as INT_MODE. If X is a CONST_FIXED we can get + the bit representation of X by "casting" it to CONST_INT. */ + +rtx +avr_to_int_mode (rtx x) +{ + enum machine_mode mode = GET_MODE (x); + + return VOIDmode == mode + ? x + : simplify_gen_subreg (int_mode_for_mode (mode), x, mode, 0); +} + + +/* Implement `TARGET_OPTION_OVERRIDE'. */ + static void avr_option_override (void) { @@ -389,9 +410,14 @@ avr_regno_reg_class (int r) } +/* Implement `TARGET_SCALAR_MODE_SUPPORTED_P'. */ + static bool avr_scalar_mode_supported_p (enum machine_mode mode) { + if (ALL_FIXED_POINT_MODE_P (mode)) + return true; + if (PSImode == mode) return true; @@ -715,6 +741,58 @@ avr_initial_elimination_offset (int from, int to) } } + +/* Helper for the function below. */ + +static void +avr_adjust_type_node (tree *node, enum machine_mode mode, int sat_p) +{ + *node = make_node (FIXED_POINT_TYPE); + TYPE_SATURATING (*node) = sat_p; + TYPE_UNSIGNED (*node) = UNSIGNED_FIXED_POINT_MODE_P (mode); + TYPE_IBIT (*node) = GET_MODE_IBIT (mode); + TYPE_FBIT (*node) = GET_MODE_FBIT (mode); + TYPE_PRECISION (*node) = GET_MODE_BITSIZE (mode); + TYPE_ALIGN (*node) = 8; + SET_TYPE_MODE (*node, mode); + + layout_type (*node); +} + + +/* Implement `TARGET_BUILD_BUILTIN_VA_LIST'. */ + +static tree +avr_build_builtin_va_list (void) +{ + /* avr-modes.def adjusts [U]TA to be 64-bit modes with 48 fractional bits. + This is more appropriate for the 8-bit machine AVR than 128-bit modes. + The ADJUST_IBIT/FBIT are handled in toplev:init_adjust_machine_modes() + which is auto-generated by genmodes, but the compiler assigns [U]DAmode + to the long long accum modes instead of the desired [U]TAmode. + + Fix this now, right after node setup in tree.c:build_common_tree_nodes(). 
+ This must run before c-cppbuiltin.c:builtin_define_fixed_point_constants() + which built-in defines macros like __ULLACCUM_FBIT__ that are used by + libgcc to detect IBIT and FBIT. */ + + avr_adjust_type_node (&ta_type_node, TAmode, 0); + avr_adjust_type_node (&uta_type_node, UTAmode, 0); + avr_adjust_type_node (&sat_ta_type_node, TAmode, 1); + avr_adjust_type_node (&sat_uta_type_node, UTAmode, 1); + + unsigned_long_long_accum_type_node = uta_type_node; + long_long_accum_type_node = ta_type_node; + sat_unsigned_long_long_accum_type_node = sat_uta_type_node; + sat_long_long_accum_type_node = sat_ta_type_node; + + /* Dispatch to the default handler. */ + + return std_build_builtin_va_list (); +} + + +/* Implement `TARGET_BUILTIN_SETJMP_FRAME_VALUE'. */ /* Actual start of frame is virtual_stack_vars_rtx; this is offset from frame pointer by +STARTING_FRAME_OFFSET. Using saved frame = virtual_stack_vars_rtx - STARTING_FRAME_OFFSET @@ -723,10 +801,13 @@ static rtx avr_builtin_setjmp_frame_value (void) { - return gen_rtx_MINUS (Pmode, virtual_stack_vars_rtx, - gen_int_mode (STARTING_FRAME_OFFSET, Pmode)); + rtx xval = gen_reg_rtx (Pmode); + emit_insn (gen_subhi3 (xval, virtual_stack_vars_rtx, + gen_int_mode (STARTING_FRAME_OFFSET, Pmode))); + return xval; } + /* Return contents of MEM at frame pointer + stack size + 1 (+2 if 3 byte PC). This is return address of function. */ rtx @@ -1580,7 +1661,7 @@ avr_legitimate_address_p (enum machine_mode mode, rtx x, bool strict) MEM, strict); if (strict - && DImode == mode + && GET_MODE_SIZE (mode) > 4 && REG_X == REGNO (x)) { ok = false; @@ -2081,6 +2162,14 @@ avr_print_operand (FILE *file, rtx x, int code) /* Use normal symbol for direct address no linker trampoline needed */ output_addr_const (file, x); } + else if (GET_CODE (x) == CONST_FIXED) + { + HOST_WIDE_INT ival = INTVAL (avr_to_int_mode (x)); + if (code != 0) + output_operand_lossage ("Unsupported code '%c' for fixed-point:", + code); + fprintf (file, HOST_WIDE_INT_PRINT_DEC, ival); + } else if (GET_CODE (x) == CONST_DOUBLE) { long val; @@ -2116,6 +2205,7 @@ notice_update_cc (rtx body ATTRIBUTE_UNUSED, rtx insn) case CC_OUT_PLUS: case CC_OUT_PLUS_NOCLOBBER: + case CC_MINUS: case CC_LDI: { rtx *op = recog_data.operand; @@ -2139,6 +2229,11 @@ cc = (enum attr_cc) icc; break; + case CC_MINUS: + avr_out_minus (op, &len_dummy, &icc); + cc = (enum attr_cc) icc; + break; + case CC_LDI: cc = (op[1] == CONST0_RTX (GET_MODE (op[0])) @@ -2779,9 +2874,11 @@ output_movqi (rtx insn, rtx operands[], int *real_l) if (real_l) *real_l = 1; - if (register_operand (dest, QImode)) + gcc_assert (1 == GET_MODE_SIZE (GET_MODE (dest))); + + if (REG_P (dest)) { - if (register_operand (src, QImode)) /* mov r,r */ + if (REG_P (src)) /* mov r,r */ { if (test_hard_reg_class (STACK_REG, dest)) return "out %0,%1"; @@ -2803,7 +2900,7 @@ rtx xop[2]; xop[0] = dest; - xop[1] = src == const0_rtx ? zero_reg_rtx : src; + xop[1] = src == CONST0_RTX (GET_MODE (dest)) ? zero_reg_rtx : src; return out_movqi_mr_r (insn, xop, real_l); } @@ -2825,6 +2922,8 @@ output_movhi (rtx insn, rtx xop[], int *plen) return avr_out_lpm (insn, xop, plen); } + gcc_assert (2 == GET_MODE_SIZE (GET_MODE (dest))); + if (REG_P (dest)) { if (REG_P (src)) /* mov r,r */ { @@ -2843,7 +2942,6 @@ return TARGET_NO_INTERRUPTS ? 
avr_asm_len ("out __SP_H__,%B1" CR_TAB "out __SP_L__,%A1", xop, plen, -2) - : avr_asm_len ("in __tmp_reg__,__SREG__" CR_TAB "cli" CR_TAB "out __SP_H__,%B1" CR_TAB @@ -2880,7 +2978,7 @@ rtx xop[2]; xop[0] = dest; - xop[1] = src == const0_rtx ? zero_reg_rtx : src; + xop[1] = src == CONST0_RTX (GET_MODE (dest)) ? zero_reg_rtx : src; return out_movhi_mr_r (insn, xop, plen); } @@ -3403,9 +3501,10 @@ output_movsisf (rtx insn, rtx operands[], int *l) if (!l) l = &dummy; - if (register_operand (dest, VOIDmode)) + gcc_assert (4 == GET_MODE_SIZE (GET_MODE (dest))); + if (REG_P (dest)) { - if (register_operand (src, VOIDmode)) /* mov r,r */ + if (REG_P (src)) /* mov r,r */ { if (true_regnum (dest) > true_regnum (src)) { @@ -3440,10 +3539,10 @@ { return output_reload_insisf (operands, NULL_RTX, real_l); } - else if (GET_CODE (src) == MEM) + else if (MEM_P (src)) return out_movsi_r_mr (insn, operands, real_l); /* mov r,m */ } - else if (GET_CODE (dest) == MEM) + else if (MEM_P (dest)) { const char *templ; @@ -4126,14 +4225,25 @@ avr_out_compare (rtx insn, rtx *xop, int *plen) rtx xval = xop[1]; /* MODE of the comparison. */ - enum machine_mode mode = GET_MODE (xreg); + enum machine_mode mode; /* Number of bytes to operate on. */ - int i, n_bytes = GET_MODE_SIZE (mode); + int i, n_bytes = GET_MODE_SIZE (GET_MODE (xreg)); /* Value (0..0xff) held in clobber register xop[2] or -1 if unknown. */ int clobber_val = -1; + /* Map fixed mode operands to integer operands with the same binary + representation. They are easier to handle in the remainder. */ + + if (CONST_FIXED == GET_CODE (xval)) + { + xreg = avr_to_int_mode (xop[0]); + xval = avr_to_int_mode (xop[1]); + } + + mode = GET_MODE (xreg); + gcc_assert (REG_P (xreg)); gcc_assert ((CONST_INT_P (xval) && n_bytes <= 4) || (const_double_operand (xval, VOIDmode) && n_bytes == 8)); @@ -4143,7 +4253,7 @@ /* Comparisons == +/-1 and != +/-1 can be done similar to comparing against 0 by ORing the bytes. This is one instruction shorter. - Notice that DImode comparisons are always against reg:DI 18 + Notice that 64-bit comparisons are always against reg:ALL8 18 (ACC_A) and therefore don't use this. */ if (!test_hard_reg_class (LD_REGS, xreg) @@ -5884,6 +5994,9 @@ avr_out_plus_1 (rtx *xop, int *plen, enum rtx_code code, int *pcc) /* MODE of the operation. */ enum machine_mode mode = GET_MODE (xop[0]); + /* INT_MODE of the same size. */ + enum machine_mode imode = int_mode_for_mode (mode); + /* Number of bytes to operate on. */ int i, n_bytes = GET_MODE_SIZE (mode); @@ -5908,8 +6021,11 @@ *pcc = (MINUS == code) ? CC_SET_CZN : CC_CLOBBER; + if (CONST_FIXED_P (xval)) + xval = avr_to_int_mode (xval); + if (MINUS == code) - xval = simplify_unary_operation (NEG, mode, xval, mode); + xval = simplify_unary_operation (NEG, imode, xval, imode); op[2] = xop[3]; @@ -5920,7 +6036,7 @@ { /* We operate byte-wise on the destination. */ rtx reg8 = simplify_gen_subreg (QImode, xop[0], mode, i); - rtx xval8 = simplify_gen_subreg (QImode, xval, mode, i); + rtx xval8 = simplify_gen_subreg (QImode, xval, imode, i); /* 8-bit value to operate with this byte. 
*/ unsigned int val8 = UINTVAL (xval8) & GET_MODE_MASK (QImode); @@ -5941,7 +6057,7 @@ avr_out_plus_1 (rtx *xop, int *plen, enum rtx_code code, int *pcc) && i + 2 <= n_bytes && test_hard_reg_class (ADDW_REGS, reg8)) { - rtx xval16 = simplify_gen_subreg (HImode, xval, mode, i); + rtx xval16 = simplify_gen_subreg (HImode, xval, imode, i); unsigned int val16 = UINTVAL (xval16) & GET_MODE_MASK (HImode); /* Registers R24, X, Y, Z can use ADIW/SBIW with constants < 64 @@ -6085,6 +6201,41 @@ avr_out_plus_noclobber (rtx *xop, int *plen, int *pcc) } +/* Output subtraction of register XOP[0] and compile time constant XOP[2]: + + XOP[0] = XOP[0] - XOP[2] + + This is basically the same as `avr_out_plus' except that we subtract. + It's needed because (minus x const) is not mapped to (plus x -const) + for the fixed point modes. */ + +const char* +avr_out_minus (rtx *xop, int *plen, int *pcc) +{ + rtx op[4]; + + if (pcc) + *pcc = (int) CC_SET_CZN; + + if (REG_P (xop[2])) + return avr_asm_len ("sub %A0,%A2" CR_TAB + "sbc %B0,%B2", xop, plen, -2); + + if (!CONST_INT_P (xop[2]) + && !CONST_FIXED_P (xop[2])) + return avr_asm_len ("subi %A0,lo8(%2)" CR_TAB + "sbci %B0,hi8(%2)", xop, plen, -2); + + op[0] = avr_to_int_mode (xop[0]); + op[1] = avr_to_int_mode (xop[1]); + op[2] = gen_int_mode (-INTVAL (avr_to_int_mode (xop[2])), + GET_MODE (op[0])); + op[3] = xop[3]; + + return avr_out_plus (op, plen, pcc); +} + + /* Prepare operands of adddi3_const_insn to be used with avr_out_plus_1. */ const char* @@ -6103,6 +6254,19 @@ avr_out_plus64 (rtx addend, int *plen) return ""; } + +/* Prepare operands of subdi3_const_insn to be used with avr_out_plus64. */ + +const char* +avr_out_minus64 (rtx subtrahend, int *plen) +{ + rtx xneg = avr_to_int_mode (subtrahend); + xneg = simplify_unary_operation (NEG, DImode, xneg, DImode); + + return avr_out_plus64 (xneg, plen); +} + + /* Output bit operation (IOR, AND, XOR) with register XOP[0] and compile time constant XOP[2]: @@ -6442,6 +6606,349 @@ avr_rotate_bytes (rtx operands[]) return true; } + +/* Outputs instructions needed for fixed point type conversion. + This includes converting between any fixed point type, as well + as converting to any integer type. Conversion between integer + types is not supported. + + The number of instructions generated depends on the types + being converted and the registers assigned to them. + + The number of instructions required to complete the conversion + is least if the registers for source and destination are overlapping + and are aligned at the decimal place as actual movement of data is + completely avoided. In some cases, the conversion may already be + complete without any instructions needed. + + When converting to signed types from signed types, sign extension + is implemented. + + Converting signed fractional types requires a bit shift if converting + to or from any unsigned fractional type because the decimal place is + shifted by 1 bit. When the destination is a signed fractional, the sign + is stored in either the carry or T bit. 
*/ + +const char* +avr_out_fract (rtx insn, rtx operands[], bool intsigned, int *plen) +{ + int i; + bool sbit[2]; + /* ilen: Length of integral part (in bytes) + flen: Length of fractional part (in bytes) + tlen: Length of operand (in bytes) + blen: Length of operand (in bits) */ + int ilen[2], flen[2], tlen[2], blen[2]; + int rdest, rsource, offset; + int start, end, dir; + bool sign_in_T = false, sign_in_Carry = false, sign_done = false; + bool widening_sign_extend = false; + int clrword = -1, lastclr = 0, clr = 0; + rtx xop[6]; + + const int dest = 0; + const int src = 1; + + xop[dest] = operands[dest]; + xop[src] = operands[src]; + + if (plen) + *plen = 0; + + /* Determine format (integer and fractional parts) + of types needing conversion. */ + + for (i = 0; i < 2; i++) + { + enum machine_mode mode = GET_MODE (xop[i]); + + tlen[i] = GET_MODE_SIZE (mode); + blen[i] = GET_MODE_BITSIZE (mode); + + if (SCALAR_INT_MODE_P (mode)) + { + sbit[i] = intsigned; + ilen[i] = GET_MODE_SIZE (mode); + flen[i] = 0; + } + else if (ALL_SCALAR_FIXED_POINT_MODE_P (mode)) + { + sbit[i] = SIGNED_SCALAR_FIXED_POINT_MODE_P (mode); + ilen[i] = (GET_MODE_IBIT (mode) + 1) / 8; + flen[i] = (GET_MODE_FBIT (mode) + 1) / 8; + } + else + fatal_insn ("unsupported fixed-point conversion", insn); + } + + /* Perform sign extension if source and dest are both signed, + and there are more integer parts in dest than in source. */ + + widening_sign_extend = sbit[dest] && sbit[src] && ilen[dest] > ilen[src]; + + rdest = REGNO (xop[dest]); + rsource = REGNO (xop[src]); + offset = flen[src] - flen[dest]; + + /* Position of MSB resp. sign bit. */ + + xop[2] = GEN_INT (blen[dest] - 1); + xop[3] = GEN_INT (blen[src] - 1); + + /* Store the sign bit if the destination is a signed fract and the source + has a sign in the integer part. */ + + if (sbit[dest] && ilen[dest] == 0 && sbit[src] && ilen[src] > 0) + { + /* To avoid using BST and BLD if the source and destination registers + overlap or the source is unused after, we can use LSL to store the + sign bit in carry since we don't need the integral part of the source. + Restoring the sign from carry saves one BLD instruction below. */ + + if (reg_unused_after (insn, xop[src]) + || (rdest < rsource + tlen[src] + && rdest + tlen[dest] > rsource)) + { + avr_asm_len ("lsl %T1%t3", xop, plen, 1); + sign_in_Carry = true; + } + else + { + avr_asm_len ("bst %T1%T3", xop, plen, 1); + sign_in_T = true; + } + } + + /* Pick the correct direction to shift bytes. */ + + if (rdest < rsource + offset) + { + dir = 1; + start = 0; + end = tlen[dest]; + } + else + { + dir = -1; + start = tlen[dest] - 1; + end = -1; + } + + /* Perform conversion by moving registers into place, clearing + destination registers that do not overlap with any source. */ + + for (i = start; i != end; i += dir) + { + int destloc = rdest + i; + int sourceloc = rsource + i + offset; + + /* Source register location is outside range of source register, + so clear this byte in the dest. */ + + if (sourceloc < rsource + || sourceloc >= rsource + tlen[src]) + { + if (AVR_HAVE_MOVW + && i + dir != end + && (sourceloc + dir < rsource + || sourceloc + dir >= rsource + tlen[src]) + && ((dir == 1 && !(destloc % 2) && !(sourceloc % 2)) + || (dir == -1 && (destloc % 2) && (sourceloc % 2))) + && clrword != -1) + { + /* Use already cleared word to clear two bytes at a time. 
*/ + + int even_i = i & ~1; + int even_clrword = clrword & ~1; + + xop[4] = GEN_INT (8 * even_i); + xop[5] = GEN_INT (8 * even_clrword); + avr_asm_len ("movw %T0%t4,%T0%t5", xop, plen, 1); + i += dir; + } + else + { + if (i == tlen[dest] - 1 + && widening_sign_extend + && blen[src] - 1 - 8 * offset < 0) + { + /* The SBRC below that sign-extends would come + up with a negative bit number because the sign + bit is out of reach. Also avoid some early-clobber + situations because of premature CLR. */ + + if (reg_unused_after (insn, xop[src])) + avr_asm_len ("lsl %T1%t3" CR_TAB + "sbc %T0%t2,%T0%t2", xop, plen, 2); + else + avr_asm_len ("mov __tmp_reg__,%T1%t3" CR_TAB + "lsl __tmp_reg__" CR_TAB + "sbc %T0%t2,%T0%t2", xop, plen, 3); + sign_done = true; + + continue; + } + + /* Do not clear the register if it is going to get + sign-extended with a MOV later. */ + + if (sbit[dest] && sbit[src] + && i != tlen[dest] - 1 + && i >= flen[dest]) + { + continue; + } + + xop[4] = GEN_INT (8 * i); + avr_asm_len ("clr %T0%t4", xop, plen, 1); + + /* If the last byte was cleared too, we have a cleared + word we can MOVW to clear two bytes at a time. */ + + if (lastclr) + clrword = i; + + clr = 1; + } + } + else if (destloc == sourceloc) + { + /* Source byte is already in destination: Nothing needed. */ + + continue; + } + else + { + /* Registers do not line up and source register location + is within range: Perform move, shifting with MOV or MOVW. */ + + if (AVR_HAVE_MOVW + && i + dir != end + && sourceloc + dir >= rsource + && sourceloc + dir < rsource + tlen[src] + && ((dir == 1 && !(destloc % 2) && !(sourceloc % 2)) + || (dir == -1 && (destloc % 2) && (sourceloc % 2)))) + { + int even_i = i & ~1; + int even_i_plus_offset = (i + offset) & ~1; + + xop[4] = GEN_INT (8 * even_i); + xop[5] = GEN_INT (8 * even_i_plus_offset); + avr_asm_len ("movw %T0%t4,%T1%t5", xop, plen, 1); + i += dir; + } + else + { + xop[4] = GEN_INT (8 * i); + xop[5] = GEN_INT (8 * (i + offset)); + avr_asm_len ("mov %T0%t4,%T1%t5", xop, plen, 1); + } + } + + lastclr = clr; + clr = 0; + } + + /* Perform sign extension if source and dest are both signed, + and there are more integer parts in dest than in source. */ + + if (widening_sign_extend) + { + if (!sign_done) + { + xop[4] = GEN_INT (blen[src] - 1 - 8 * offset); + + /* Register was cleared above, so can become 0xff and extended. + Note: Instead of the CLR/SBRC/COM the sign extension could + be performed after the LSL below by means of a SBC if only + one byte has to be shifted left. */ + + avr_asm_len ("sbrc %T0%T4" CR_TAB + "com %T0%t2", xop, plen, 2); + } + + /* Sign extend additional bytes by MOV and MOVW. */ + + start = tlen[dest] - 2; + end = flen[dest] + ilen[src] - 1; + + for (i = start; i != end; i--) + { + if (AVR_HAVE_MOVW && i != start && i-1 != end) + { + i--; + xop[4] = GEN_INT (8 * i); + xop[5] = GEN_INT (8 * (tlen[dest] - 2)); + avr_asm_len ("movw %T0%t4,%T0%t5", xop, plen, 1); + } + else + { + xop[4] = GEN_INT (8 * i); + xop[5] = GEN_INT (8 * (tlen[dest] - 1)); + avr_asm_len ("mov %T0%t4,%T0%t5", xop, plen, 1); + } + } + } + + /* If destination is a signed fract, and the source was not, a shift + by 1 bit is needed. Also restore sign from carry or T. */ + + if (sbit[dest] && !ilen[dest] && (!sbit[src] || ilen[src])) + { + /* We have flen[src] non-zero fractional bytes to shift. + Because of the right shift, handle one byte more so that the + LSB won't be lost. 
*/ + + int nonzero = flen[src] + 1; + + /* If the LSB is in the T flag and there are no fractional + bits, the high byte is zero and no shift needed. */ + + if (flen[src] == 0 && sign_in_T) + nonzero = 0; + + start = flen[dest] - 1; + end = start - nonzero; + + for (i = start; i > end && i >= 0; i--) + { + xop[4] = GEN_INT (8 * i); + if (i == start && !sign_in_Carry) + avr_asm_len ("lsr %T0%t4", xop, plen, 1); + else + avr_asm_len ("ror %T0%t4", xop, plen, 1); + } + + if (sign_in_T) + { + avr_asm_len ("bld %T0%T2", xop, plen, 1); + } + } + else if (sbit[src] && !ilen[src] && (!sbit[dest] || ilen[dest])) + { + /* If source was a signed fract and dest was not, shift 1 bit + other way. */ + + start = flen[dest] - flen[src]; + + if (start < 0) + start = 0; + + for (i = start; i < flen[dest]; i++) + { + xop[4] = GEN_INT (8 * i); + + if (i == start) + avr_asm_len ("lsl %T0%t4", xop, plen, 1); + else + avr_asm_len ("rol %T0%t4", xop, plen, 1); + } + } + + return ""; +} + + /* Modifies the length assigned to instruction INSN LEN is the initially computed length of the insn. */ @@ -6489,6 +6996,8 @@ adjust_insn_length (rtx insn, int len) case ADJUST_LEN_OUT_PLUS: avr_out_plus (op, &len, NULL); break; case ADJUST_LEN_PLUS64: avr_out_plus64 (op[0], &len); break; + case ADJUST_LEN_MINUS: avr_out_minus (op, &len, NULL); break; + case ADJUST_LEN_MINUS64: avr_out_minus64 (op[0], &len); break; case ADJUST_LEN_OUT_PLUS_NOCLOBBER: avr_out_plus_noclobber (op, &len, NULL); break; @@ -6502,6 +7011,9 @@ adjust_insn_length (rtx insn, int len) case ADJUST_LEN_XLOAD: avr_out_xload (insn, op, &len); break; case ADJUST_LEN_LOAD_LPM: avr_load_lpm (insn, op, &len); break; + case ADJUST_LEN_SFRACT: avr_out_fract (insn, op, true, &len); break; + case ADJUST_LEN_UFRACT: avr_out_fract (insn, op, false, &len); break; + case ADJUST_LEN_TSTHI: avr_out_tsthi (insn, op, &len); break; case ADJUST_LEN_TSTPSI: avr_out_tstpsi (insn, op, &len); break; case ADJUST_LEN_TSTSI: avr_out_tstsi (insn, op, &len); break; @@ -6683,6 +7195,20 @@ avr_assemble_integer (rtx x, unsigned int size, int aligned_p) return true; } + else if (CONST_FIXED_P (x)) + { + unsigned n; + + /* varasm fails to handle big fixed modes that don't fit in hwi. */ + + for (n = 0; n < size; n++) + { + rtx xn = simplify_gen_subreg (QImode, x, GET_MODE (x), n); + default_assemble_integer (xn, 1, aligned_p); + } + + return true; + } return default_assemble_integer (x, size, aligned_p); } @@ -7489,6 +8015,7 @@ avr_operand_rtx_cost (rtx x, enum machine_mode mode, enum rtx_code outer, return 0; case CONST_INT: + case CONST_FIXED: case CONST_DOUBLE: return COSTS_N_INSNS (GET_MODE_SIZE (mode)); @@ -7518,6 +8045,7 @@ avr_rtx_costs_1 (rtx x, int codearg, int outer_code ATTRIBUTE_UNUSED, switch (code) { case CONST_INT: + case CONST_FIXED: case CONST_DOUBLE: case SYMBOL_REF: case CONST: @@ -8446,11 +8974,17 @@ avr_compare_pattern (rtx insn) if (pattern && NONJUMP_INSN_P (insn) && SET_DEST (pattern) == cc0_rtx - && GET_CODE (SET_SRC (pattern)) == COMPARE - && DImode != GET_MODE (XEXP (SET_SRC (pattern), 0)) - && DImode != GET_MODE (XEXP (SET_SRC (pattern), 1))) + && GET_CODE (SET_SRC (pattern)) == COMPARE) { - return pattern; + enum machine_mode mode0 = GET_MODE (XEXP (SET_SRC (pattern), 0)); + enum machine_mode mode1 = GET_MODE (XEXP (SET_SRC (pattern), 1)); + + /* The 64-bit comparisons have fixed operands ACC_A and ACC_B. + They must not be swapped, thus skip them. 
*/ + + if ((mode0 == VOIDmode || GET_MODE_SIZE (mode0) <= 4) + && (mode1 == VOIDmode || GET_MODE_SIZE (mode1) <= 4)) + return pattern; } return NULL_RTX; @@ -8788,6 +9322,8 @@ avr_2word_insn_p (rtx insn) return false; case CODE_FOR_movqi_insn: + case CODE_FOR_movuqq_insn: + case CODE_FOR_movqq_insn: { rtx set = single_set (insn); rtx src = SET_SRC (set); @@ -8796,7 +9332,7 @@ avr_2word_insn_p (rtx insn) /* Factor out LDS and STS from movqi_insn. */ if (MEM_P (dest) - && (REG_P (src) || src == const0_rtx)) + && (REG_P (src) || src == CONST0_RTX (GET_MODE (dest)))) { return CONSTANT_ADDRESS_P (XEXP (dest, 0)); } @@ -9021,7 +9557,7 @@ output_reload_in_const (rtx *op, rtx clobber_reg, int *len, bool clear_p) if (NULL_RTX == clobber_reg && !test_hard_reg_class (LD_REGS, dest) - && (! (CONST_INT_P (src) || CONST_DOUBLE_P (src)) + && (! (CONST_INT_P (src) || CONST_FIXED_P (src) || CONST_DOUBLE_P (src)) || !avr_popcount_each_byte (src, n_bytes, (1 << 0) | (1 << 1) | (1 << 8)))) { @@ -9048,6 +9584,7 @@ output_reload_in_const (rtx *op, rtx clobber_reg, int *len, bool clear_p) ldreg_p = test_hard_reg_class (LD_REGS, xdest[n]); if (!CONST_INT_P (src) + && !CONST_FIXED_P (src) && !CONST_DOUBLE_P (src)) { static const char* const asm_code[][2] = @@ -9239,6 +9776,7 @@ output_reload_insisf (rtx *op, rtx clobber_reg, int *len) if (AVR_HAVE_MOVW && !test_hard_reg_class (LD_REGS, op[0]) && (CONST_INT_P (op[1]) + || CONST_FIXED_P (op[1]) || CONST_DOUBLE_P (op[1]))) { int len_clr, len_noclr; @@ -10834,6 +11372,12 @@ avr_fold_builtin (tree fndecl, int n_args ATTRIBUTE_UNUSED, tree *arg, #undef TARGET_SCALAR_MODE_SUPPORTED_P #define TARGET_SCALAR_MODE_SUPPORTED_P avr_scalar_mode_supported_p +#undef TARGET_BUILD_BUILTIN_VA_LIST +#define TARGET_BUILD_BUILTIN_VA_LIST avr_build_builtin_va_list + +#undef TARGET_FIXED_POINT_SUPPORTED_P +#define TARGET_FIXED_POINT_SUPPORTED_P hook_bool_void_true + #undef TARGET_ADDR_SPACE_SUBSET_P #define TARGET_ADDR_SPACE_SUBSET_P avr_addr_space_subset_p diff --git a/gcc/config/avr/avr.h b/gcc/config/avr/avr.h index 0ce0af4ca58..f8686685b2f 100644 --- a/gcc/config/avr/avr.h +++ b/gcc/config/avr/avr.h @@ -261,6 +261,7 @@ enum #define FLOAT_TYPE_SIZE 32 #define DOUBLE_TYPE_SIZE 32 #define LONG_DOUBLE_TYPE_SIZE 32 +#define LONG_LONG_ACCUM_TYPE_SIZE 64 #define DEFAULT_SIGNED_CHAR 1 diff --git a/gcc/config/avr/avr.md b/gcc/config/avr/avr.md index 2b1a83c607a..6146fe6a013 100644 --- a/gcc/config/avr/avr.md +++ b/gcc/config/avr/avr.md @@ -88,10 +88,10 @@ (include "predicates.md") (include "constraints.md") - + ;; Condition code settings. 
(define_attr "cc" "none,set_czn,set_zn,set_n,compare,clobber, - out_plus, out_plus_noclobber,ldi" + out_plus, out_plus_noclobber,ldi,minus" (const_string "none")) (define_attr "type" "branch,branch1,arith,xcall" @@ -139,8 +139,10 @@ (define_attr "adjust_len" "out_bitop, out_plus, out_plus_noclobber, plus64, addto_sp, + minus, minus64, tsthi, tstpsi, tstsi, compare, compare64, call, mov8, mov16, mov24, mov32, reload_in16, reload_in24, reload_in32, + ufract, sfract, xload, movmem, load_lpm, ashlqi, ashrqi, lshrqi, ashlhi, ashrhi, lshrhi, @@ -225,8 +227,20 @@ (define_mode_iterator QIDI [(QI "") (HI "") (PSI "") (SI "") (DI "")]) (define_mode_iterator HISI [(HI "") (PSI "") (SI "")]) +(define_mode_iterator ALL1 [(QI "") (QQ "") (UQQ "")]) +(define_mode_iterator ALL2 [(HI "") (HQ "") (UHQ "") (HA "") (UHA "")]) +(define_mode_iterator ALL4 [(SI "") (SQ "") (USQ "") (SA "") (USA "")]) + ;; All supported move-modes -(define_mode_iterator MOVMODE [(QI "") (HI "") (SI "") (SF "") (PSI "")]) +(define_mode_iterator MOVMODE [(QI "") (HI "") (SI "") (SF "") (PSI "") + (QQ "") (UQQ "") + (HQ "") (UHQ "") (HA "") (UHA "") + (SQ "") (USQ "") (SA "") (USA "")]) + +;; Supported ordered modes that are 2, 3, 4 bytes wide +(define_mode_iterator ORDERED234 [(HI "") (SI "") (PSI "") + (HQ "") (UHQ "") (HA "") (UHA "") + (SQ "") (USQ "") (SA "") (USA "")]) ;; Define code iterators ;; Define two incarnations so that we can build the cross product. @@ -317,9 +331,11 @@ DONE; }) -(define_insn "pushqi1" - [(set (mem:QI (post_dec:HI (reg:HI REG_SP))) - (match_operand:QI 0 "reg_or_0_operand" "r,L"))] +;; "pushqi1" +;; "pushqq1" "pushuqq1" +(define_insn "push1" + [(set (mem:ALL1 (post_dec:HI (reg:HI REG_SP))) + (match_operand:ALL1 0 "reg_or_0_operand" "r,Y00"))] "" "@ push %0 @@ -334,7 +350,11 @@ (PSI "") (SI "") (CSI "") (DI "") (CDI "") - (SF "") (SC "")]) + (SF "") (SC "") + (HA "") (UHA "") (HQ "") (UHQ "") + (SA "") (USA "") (SQ "") (USQ "") + (DA "") (UDA "") (DQ "") (UDQ "") + (TA "") (UTA "")]) (define_expand "push1" [(match_operand:MPUSH 0 "" "")] @@ -422,12 +442,14 @@ (set_attr "cc" "clobber")]) -(define_insn_and_split "xload8_A" - [(set (match_operand:QI 0 "register_operand" "=r") - (match_operand:QI 1 "memory_operand" "m")) +;; "xload8qi_A" +;; "xload8qq_A" "xload8uqq_A" +(define_insn_and_split "xload8_A" + [(set (match_operand:ALL1 0 "register_operand" "=r") + (match_operand:ALL1 1 "memory_operand" "m")) (clobber (reg:HI REG_Z))] "can_create_pseudo_p() - && !avr_xload_libgcc_p (QImode) + && !avr_xload_libgcc_p (mode) && avr_mem_memx_p (operands[1]) && REG_P (XEXP (operands[1], 0))" { gcc_unreachable(); } @@ -441,16 +463,16 @@ emit_move_insn (reg_z, simplify_gen_subreg (HImode, addr, PSImode, 0)); emit_move_insn (hi8, simplify_gen_subreg (QImode, addr, PSImode, 2)); - insn = emit_insn (gen_xload_8 (operands[0], hi8)); + insn = emit_insn (gen_xload_8 (operands[0], hi8)); set_mem_addr_space (SET_SRC (single_set (insn)), MEM_ADDR_SPACE (operands[1])); DONE; }) -;; "xloadqi_A" -;; "xloadhi_A" +;; "xloadqi_A" "xloadqq_A" "xloaduqq_A" +;; "xloadhi_A" "xloadhq_A" "xloaduhq_A" "xloadha_A" "xloaduha_A" +;; "xloadsi_A" "xloadsq_A" "xloadusq_A" "xloadsa_A" "xloadusa_A" ;; "xloadpsi_A" -;; "xloadsi_A" ;; "xloadsf_A" (define_insn_and_split "xload_A" [(set (match_operand:MOVMODE 0 "register_operand" "=r") @@ -488,11 +510,13 @@ ;; Move value from address space memx to a register ;; These insns must be prior to respective generic move insn. 
-(define_insn "xload_8" - [(set (match_operand:QI 0 "register_operand" "=&r,r") - (mem:QI (lo_sum:PSI (match_operand:QI 1 "register_operand" "r,r") - (reg:HI REG_Z))))] - "!avr_xload_libgcc_p (QImode)" +;; "xloadqi_8" +;; "xloadqq_8" "xloaduqq_8" +(define_insn "xload_8" + [(set (match_operand:ALL1 0 "register_operand" "=&r,r") + (mem:ALL1 (lo_sum:PSI (match_operand:QI 1 "register_operand" "r,r") + (reg:HI REG_Z))))] + "!avr_xload_libgcc_p (mode)" { return avr_out_xload (insn, operands, NULL); } @@ -504,11 +528,11 @@ ;; R21:Z : 24-bit source address ;; R22 : 1-4 byte output -;; "xload_qi_libgcc" -;; "xload_hi_libgcc" -;; "xload_psi_libgcc" -;; "xload_si_libgcc" +;; "xload_qi_libgcc" "xload_qq_libgcc" "xload_uqq_libgcc" +;; "xload_hi_libgcc" "xload_hq_libgcc" "xload_uhq_libgcc" "xload_ha_libgcc" "xload_uha_libgcc" +;; "xload_si_libgcc" "xload_sq_libgcc" "xload_usq_libgcc" "xload_sa_libgcc" "xload_usa_libgcc" ;; "xload_sf_libgcc" +;; "xload_psi_libgcc" (define_insn "xload__libgcc" [(set (reg:MOVMODE 22) (mem:MOVMODE (lo_sum:PSI (reg:QI 21) @@ -528,9 +552,9 @@ ;; General move expanders -;; "movqi" -;; "movhi" -;; "movsi" +;; "movqi" "movqq" "movuqq" +;; "movhi" "movhq" "movuhq" "movha" "movuha" +;; "movsi" "movsq" "movusq" "movsa" "movusa" ;; "movsf" ;; "movpsi" (define_expand "mov" @@ -546,8 +570,7 @@ /* One of the operands has to be in a register. */ if (!register_operand (dest, mode) - && !(register_operand (src, mode) - || src == CONST0_RTX (mode))) + && !reg_or_0_operand (src, mode)) { operands[1] = src = copy_to_mode_reg (mode, src); } @@ -560,7 +583,9 @@ src = replace_equiv_address (src, copy_to_mode_reg (PSImode, addr)); if (!avr_xload_libgcc_p (mode)) - emit_insn (gen_xload8_A (dest, src)); + /* ; No here because gen_xload8_A only iterates over ALL1. + ; insn-emit does not depend on the mode, it' all about operands. */ + emit_insn (gen_xload8qi_A (dest, src)); else emit_insn (gen_xload_A (dest, src)); @@ -627,12 +652,13 @@ ;; are call-saved registers, and most of LD_REGS are call-used registers, ;; so this may still be a win for registers live across function calls. -(define_insn "movqi_insn" - [(set (match_operand:QI 0 "nonimmediate_operand" "=r ,d,Qm,r ,q,r,*r") - (match_operand:QI 1 "nox_general_operand" "rL,i,rL,Qm,r,q,i"))] - "register_operand (operands[0], QImode) - || register_operand (operands[1], QImode) - || const0_rtx == operands[1]" +;; "movqi_insn" +;; "movqq_insn" "movuqq_insn" +(define_insn "mov_insn" + [(set (match_operand:ALL1 0 "nonimmediate_operand" "=r ,d ,Qm ,r ,q,r,*r") + (match_operand:ALL1 1 "nox_general_operand" "r Y00,n Ynn,r Y00,Qm,r,q,i"))] + "register_operand (operands[0], mode) + || reg_or_0_operand (operands[1], mode)" { return output_movqi (insn, operands, NULL); } @@ -643,9 +669,11 @@ ;; This is used in peephole2 to optimize loading immediate constants ;; if a scratch register from LD_REGS happens to be available. 
-(define_insn "*reload_inqi" - [(set (match_operand:QI 0 "register_operand" "=l") - (match_operand:QI 1 "immediate_operand" "i")) +;; "*reload_inqi" +;; "*reload_inqq" "*reload_inuqq" +(define_insn "*reload_in" + [(set (match_operand:ALL1 0 "register_operand" "=l") + (match_operand:ALL1 1 "const_operand" "i")) (clobber (match_operand:QI 2 "register_operand" "=&d"))] "reload_completed" "ldi %2,lo8(%1) @@ -655,14 +683,15 @@ (define_peephole2 [(match_scratch:QI 2 "d") - (set (match_operand:QI 0 "l_register_operand" "") - (match_operand:QI 1 "immediate_operand" ""))] - "(operands[1] != const0_rtx - && operands[1] != const1_rtx - && operands[1] != constm1_rtx)" - [(parallel [(set (match_dup 0) (match_dup 1)) - (clobber (match_dup 2))])] - "") + (set (match_operand:ALL1 0 "l_register_operand" "") + (match_operand:ALL1 1 "const_operand" ""))] + ; No need for a clobber reg for 0x0, 0x01 or 0xff + "!satisfies_constraint_Y00 (operands[1]) + && !satisfies_constraint_Y01 (operands[1]) + && !satisfies_constraint_Ym1 (operands[1])" + [(parallel [(set (match_dup 0) + (match_dup 1)) + (clobber (match_dup 2))])]) ;;============================================================================ ;; move word (16 bit) @@ -693,18 +722,20 @@ (define_peephole2 [(match_scratch:QI 2 "d") - (set (match_operand:HI 0 "l_register_operand" "") - (match_operand:HI 1 "immediate_operand" ""))] - "(operands[1] != const0_rtx - && operands[1] != constm1_rtx)" - [(parallel [(set (match_dup 0) (match_dup 1)) - (clobber (match_dup 2))])] - "") + (set (match_operand:ALL2 0 "l_register_operand" "") + (match_operand:ALL2 1 "const_or_immediate_operand" ""))] + "operands[1] != CONST0_RTX (mode)" + [(parallel [(set (match_dup 0) + (match_dup 1)) + (clobber (match_dup 2))])]) ;; '*' because it is not used in rtl generation, only in above peephole -(define_insn "*reload_inhi" - [(set (match_operand:HI 0 "register_operand" "=r") - (match_operand:HI 1 "immediate_operand" "i")) +;; "*reload_inhi" +;; "*reload_inhq" "*reload_inuhq" +;; "*reload_inha" "*reload_inuha" +(define_insn "*reload_in" + [(set (match_operand:ALL2 0 "l_register_operand" "=l") + (match_operand:ALL2 1 "immediate_operand" "i")) (clobber (match_operand:QI 2 "register_operand" "=&d"))] "reload_completed" { @@ -712,14 +743,16 @@ } [(set_attr "length" "4") (set_attr "adjust_len" "reload_in16") - (set_attr "cc" "none")]) + (set_attr "cc" "clobber")]) -(define_insn "*movhi" - [(set (match_operand:HI 0 "nonimmediate_operand" "=r,r,r,m ,d,*r,q,r") - (match_operand:HI 1 "nox_general_operand" "r,L,m,rL,i,i ,r,q"))] - "register_operand (operands[0], HImode) - || register_operand (operands[1], HImode) - || const0_rtx == operands[1]" +;; "*movhi" +;; "*movhq" "*movuhq" +;; "*movha" "*movuha" +(define_insn "*mov" + [(set (match_operand:ALL2 0 "nonimmediate_operand" "=r,r ,r,m ,d,*r,q,r") + (match_operand:ALL2 1 "nox_general_operand" "r,Y00,m,r Y00,i,i ,r,q"))] + "register_operand (operands[0], mode) + || reg_or_0_operand (operands[1], mode)" { return output_movhi (insn, operands, NULL); } @@ -728,28 +761,30 @@ (set_attr "cc" "none,none,clobber,clobber,none,clobber,none,none")]) (define_peephole2 ; movw - [(set (match_operand:QI 0 "even_register_operand" "") - (match_operand:QI 1 "even_register_operand" "")) - (set (match_operand:QI 2 "odd_register_operand" "") - (match_operand:QI 3 "odd_register_operand" ""))] + [(set (match_operand:ALL1 0 "even_register_operand" "") + (match_operand:ALL1 1 "even_register_operand" "")) + (set (match_operand:ALL1 2 "odd_register_operand" "") + 
(match_operand:ALL1 3 "odd_register_operand" ""))] "(AVR_HAVE_MOVW && REGNO (operands[0]) == REGNO (operands[2]) - 1 && REGNO (operands[1]) == REGNO (operands[3]) - 1)" - [(set (match_dup 4) (match_dup 5))] + [(set (match_dup 4) + (match_dup 5))] { operands[4] = gen_rtx_REG (HImode, REGNO (operands[0])); operands[5] = gen_rtx_REG (HImode, REGNO (operands[1])); }) (define_peephole2 ; movw_r - [(set (match_operand:QI 0 "odd_register_operand" "") - (match_operand:QI 1 "odd_register_operand" "")) - (set (match_operand:QI 2 "even_register_operand" "") - (match_operand:QI 3 "even_register_operand" ""))] + [(set (match_operand:ALL1 0 "odd_register_operand" "") + (match_operand:ALL1 1 "odd_register_operand" "")) + (set (match_operand:ALL1 2 "even_register_operand" "") + (match_operand:ALL1 3 "even_register_operand" ""))] "(AVR_HAVE_MOVW && REGNO (operands[2]) == REGNO (operands[0]) - 1 && REGNO (operands[3]) == REGNO (operands[1]) - 1)" - [(set (match_dup 4) (match_dup 5))] + [(set (match_dup 4) + (match_dup 5))] { operands[4] = gen_rtx_REG (HImode, REGNO (operands[2])); operands[5] = gen_rtx_REG (HImode, REGNO (operands[3])); @@ -801,19 +836,21 @@ (define_peephole2 ; *reload_insi [(match_scratch:QI 2 "d") - (set (match_operand:SI 0 "l_register_operand" "") - (match_operand:SI 1 "const_int_operand" "")) + (set (match_operand:ALL4 0 "l_register_operand" "") + (match_operand:ALL4 1 "immediate_operand" "")) (match_dup 2)] - "(operands[1] != const0_rtx - && operands[1] != constm1_rtx)" - [(parallel [(set (match_dup 0) (match_dup 1)) - (clobber (match_dup 2))])] - "") + "operands[1] != CONST0_RTX (mode)" + [(parallel [(set (match_dup 0) + (match_dup 1)) + (clobber (match_dup 2))])]) ;; '*' because it is not used in rtl generation. +;; "*reload_insi" +;; "*reload_insq" "*reload_inusq" +;; "*reload_insa" "*reload_inusa" (define_insn "*reload_insi" - [(set (match_operand:SI 0 "register_operand" "=r") - (match_operand:SI 1 "const_int_operand" "n")) + [(set (match_operand:ALL4 0 "register_operand" "=r") + (match_operand:ALL4 1 "immediate_operand" "n Ynn")) (clobber (match_operand:QI 2 "register_operand" "=&d"))] "reload_completed" { @@ -824,12 +861,14 @@ (set_attr "cc" "clobber")]) -(define_insn "*movsi" - [(set (match_operand:SI 0 "nonimmediate_operand" "=r,r,r ,Qm,!d,r") - (match_operand:SI 1 "nox_general_operand" "r,L,Qm,rL,i ,i"))] - "register_operand (operands[0], SImode) - || register_operand (operands[1], SImode) - || const0_rtx == operands[1]" +;; "*movsi" +;; "*movsq" "*movusq" +;; "*movsa" "*movusa" +(define_insn "*mov" + [(set (match_operand:ALL4 0 "nonimmediate_operand" "=r,r ,r ,Qm ,!d,r") + (match_operand:ALL4 1 "nox_general_operand" "r,Y00,Qm,r Y00,i ,i"))] + "register_operand (operands[0], mode) + || reg_or_0_operand (operands[1], mode)" { return output_movsisf (insn, operands, NULL); } @@ -844,8 +883,7 @@ [(set (match_operand:SF 0 "nonimmediate_operand" "=r,r,r ,Qm,!d,r") (match_operand:SF 1 "nox_general_operand" "r,G,Qm,rG,F ,F"))] "register_operand (operands[0], SFmode) - || register_operand (operands[1], SFmode) - || operands[1] == CONST0_RTX (SFmode)" + || reg_or_0_operand (operands[1], SFmode)" { return output_movsisf (insn, operands, NULL); } @@ -861,8 +899,7 @@ "operands[1] != CONST0_RTX (SFmode)" [(parallel [(set (match_dup 0) (match_dup 1)) - (clobber (match_dup 2))])] - "") + (clobber (match_dup 2))])]) ;; '*' because it is not used in rtl generation. 
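Generalizing the two movw peepholes from QI to ALL1 lets byte copies in any of the 1-byte modes combine into a single MOVW. The fusing condition, restated as a host-side predicate (register numbers stand in for REGNO; the helper name is made up):

    #include <stdbool.h>

    /* Two byte moves d0 <- s0 (even registers) and d1 <- s1 (odd
       registers) can fuse into one MOVW iff each destination/source
       pair forms an adjacent even/odd register pair.  */
    static bool can_fuse_movw (bool have_movw,
                               unsigned d0, unsigned s0,
                               unsigned d1, unsigned s1)
    {
      return have_movw
             && d0 % 2 == 0 && s0 % 2 == 0   /* even halves      */
             && d1 == d0 + 1                 /* adjacent dests   */
             && s1 == s0 + 1;                /* adjacent sources */
    }

The second peephole above handles the mirrored order, where the odd byte is copied first.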
(define_insn "*reload_insf" @@ -1015,9 +1052,10 @@ (set (match_dup 4) (plus:HI (match_dup 4) (const_int -1))) - (set (match_operand:HI 0 "register_operand" "") - (minus:HI (match_dup 4) - (match_dup 5)))] + (parallel [(set (match_operand:HI 0 "register_operand" "") + (minus:HI (match_dup 4) + (match_dup 5))) + (clobber (scratch:QI))])] "" { rtx addr; @@ -1043,10 +1081,12 @@ ;+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ ; add bytes -(define_insn "addqi3" - [(set (match_operand:QI 0 "register_operand" "=r,d,r,r,r,r") - (plus:QI (match_operand:QI 1 "register_operand" "%0,0,0,0,0,0") - (match_operand:QI 2 "nonmemory_operand" "r,i,P,N,K,Cm2")))] +;; "addqi3" +;; "addqq3" "adduqq3" +(define_insn "add3" + [(set (match_operand:ALL1 0 "register_operand" "=r,d ,r ,r ,r ,r") + (plus:ALL1 (match_operand:ALL1 1 "register_operand" "%0,0 ,0 ,0 ,0 ,0") + (match_operand:ALL1 2 "nonmemory_operand" "r,n Ynn,Y01,Ym1,Y02,Ym2")))] "" "@ add %0,%2 @@ -1058,11 +1098,13 @@ [(set_attr "length" "1,1,1,1,2,2") (set_attr "cc" "set_czn,set_czn,set_zn,set_zn,set_zn,set_zn")]) - -(define_expand "addhi3" - [(set (match_operand:HI 0 "register_operand" "") - (plus:HI (match_operand:HI 1 "register_operand" "") - (match_operand:HI 2 "nonmemory_operand" "")))] +;; "addhi3" +;; "addhq3" "adduhq3" +;; "addha3" "adduha3" +(define_expand "add3" + [(set (match_operand:ALL2 0 "register_operand" "") + (plus:ALL2 (match_operand:ALL2 1 "register_operand" "") + (match_operand:ALL2 2 "nonmemory_or_const_operand" "")))] "" { if (CONST_INT_P (operands[2])) @@ -1079,6 +1121,12 @@ DONE; } } + + if (CONST_FIXED == GET_CODE (operands[2])) + { + emit_insn (gen_add3_clobber (operands[0], operands[1], operands[2])); + DONE; + } }) @@ -1124,24 +1172,22 @@ [(set_attr "length" "6") (set_attr "adjust_len" "addto_sp")]) -(define_insn "*addhi3" - [(set (match_operand:HI 0 "register_operand" "=r,d,!w,d") - (plus:HI (match_operand:HI 1 "register_operand" "%0,0,0 ,0") - (match_operand:HI 2 "nonmemory_operand" "r,s,IJ,n")))] +;; "*addhi3" +;; "*addhq3" "*adduhq3" +;; "*addha3" "*adduha3" +(define_insn "*add3" + [(set (match_operand:ALL2 0 "register_operand" "=r,d,!w ,d") + (plus:ALL2 (match_operand:ALL2 1 "register_operand" "%0,0,0 ,0") + (match_operand:ALL2 2 "nonmemory_or_const_operand" "r,s,IJ YIJ,n Ynn")))] "" { - static const char * const asm_code[] = - { - "add %A0,%A2\;adc %B0,%B2", - "subi %A0,lo8(-(%2))\;sbci %B0,hi8(-(%2))", - "", - "" - }; - - if (*asm_code[which_alternative]) - return asm_code[which_alternative]; - - return avr_out_plus_noclobber (operands, NULL, NULL); + if (REG_P (operands[2])) + return "add %A0,%A2\;adc %B0,%B2"; + else if (CONST_INT_P (operands[2]) + || CONST_FIXED == GET_CODE (operands[2])) + return avr_out_plus_noclobber (operands, NULL, NULL); + else + return "subi %A0,lo8(-(%2))\;sbci %B0,hi8(-(%2))"; } [(set_attr "length" "2,2,2,2") (set_attr "adjust_len" "*,*,out_plus_noclobber,out_plus_noclobber") @@ -1152,41 +1198,44 @@ ;; itself because that insn is special to reload. 
(define_peephole2 ; addhi3_clobber - [(set (match_operand:HI 0 "d_register_operand" "") - (match_operand:HI 1 "const_int_operand" "")) - (set (match_operand:HI 2 "l_register_operand" "") - (plus:HI (match_dup 2) - (match_dup 0)))] + [(set (match_operand:ALL2 0 "d_register_operand" "") + (match_operand:ALL2 1 "const_operand" "")) + (set (match_operand:ALL2 2 "l_register_operand" "") + (plus:ALL2 (match_dup 2) + (match_dup 0)))] "peep2_reg_dead_p (2, operands[0])" [(parallel [(set (match_dup 2) - (plus:HI (match_dup 2) - (match_dup 1))) + (plus:ALL2 (match_dup 2) + (match_dup 1))) (clobber (match_dup 3))])] { - operands[3] = simplify_gen_subreg (QImode, operands[0], HImode, 0); + operands[3] = simplify_gen_subreg (QImode, operands[0], mode, 0); }) ;; Same, but with reload to NO_LD_REGS ;; Combine *reload_inhi with *addhi3 (define_peephole2 ; addhi3_clobber - [(parallel [(set (match_operand:HI 0 "l_register_operand" "") - (match_operand:HI 1 "const_int_operand" "")) + [(parallel [(set (match_operand:ALL2 0 "l_register_operand" "") + (match_operand:ALL2 1 "const_operand" "")) (clobber (match_operand:QI 2 "d_register_operand" ""))]) - (set (match_operand:HI 3 "l_register_operand" "") - (plus:HI (match_dup 3) - (match_dup 0)))] + (set (match_operand:ALL2 3 "l_register_operand" "") + (plus:ALL2 (match_dup 3) + (match_dup 0)))] "peep2_reg_dead_p (2, operands[0])" [(parallel [(set (match_dup 3) - (plus:HI (match_dup 3) - (match_dup 1))) + (plus:ALL2 (match_dup 3) + (match_dup 1))) (clobber (match_dup 2))])]) -(define_insn "addhi3_clobber" - [(set (match_operand:HI 0 "register_operand" "=!w,d,r") - (plus:HI (match_operand:HI 1 "register_operand" "%0,0,0") - (match_operand:HI 2 "const_int_operand" "IJ,n,n"))) - (clobber (match_scratch:QI 3 "=X,X,&d"))] +;; "addhi3_clobber" +;; "addhq3_clobber" "adduhq3_clobber" +;; "addha3_clobber" "adduha3_clobber" +(define_insn "add3_clobber" + [(set (match_operand:ALL2 0 "register_operand" "=!w ,d ,r") + (plus:ALL2 (match_operand:ALL2 1 "register_operand" "%0 ,0 ,0") + (match_operand:ALL2 2 "const_operand" "IJ YIJ,n Ynn,n Ynn"))) + (clobber (match_scratch:QI 3 "=X ,X ,&d"))] "" { gcc_assert (REGNO (operands[0]) == REGNO (operands[1])); @@ -1198,29 +1247,24 @@ (set_attr "cc" "out_plus")]) -(define_insn "addsi3" - [(set (match_operand:SI 0 "register_operand" "=r,d ,d,r") - (plus:SI (match_operand:SI 1 "register_operand" "%0,0 ,0,0") - (match_operand:SI 2 "nonmemory_operand" "r,s ,n,n"))) - (clobber (match_scratch:QI 3 "=X,X ,X,&d"))] +;; "addsi3" +;; "addsq3" "addusq3" +;; "addsa3" "addusa3" +(define_insn "add3" + [(set (match_operand:ALL4 0 "register_operand" "=r,d ,r") + (plus:ALL4 (match_operand:ALL4 1 "register_operand" "%0,0 ,0") + (match_operand:ALL4 2 "nonmemory_operand" "r,i ,n Ynn"))) + (clobber (match_scratch:QI 3 "=X,X ,&d"))] "" { - static const char * const asm_code[] = - { - "add %A0,%A2\;adc %B0,%B2\;adc %C0,%C2\;adc %D0,%D2", - "subi %0,lo8(-(%2))\;sbci %B0,hi8(-(%2))\;sbci %C0,hlo8(-(%2))\;sbci %D0,hhi8(-(%2))", - "", - "" - }; - - if (*asm_code[which_alternative]) - return asm_code[which_alternative]; + if (REG_P (operands[2])) + return "add %A0,%A2\;adc %B0,%B2\;adc %C0,%C2\;adc %D0,%D2"; return avr_out_plus (operands, NULL, NULL); } - [(set_attr "length" "4,4,4,8") - (set_attr "adjust_len" "*,*,out_plus,out_plus") - (set_attr "cc" "set_n,set_czn,out_plus,out_plus")]) + [(set_attr "length" "4,4,8") + (set_attr "adjust_len" "*,out_plus,out_plus") + (set_attr "cc" "set_n,out_plus,out_plus")]) (define_insn "*addpsi3_zero_extend.qi" [(set 
(match_operand:PSI 0 "register_operand" "=r") @@ -1329,27 +1373,38 @@ ;----------------------------------------------------------------------------- ; sub bytes -(define_insn "subqi3" - [(set (match_operand:QI 0 "register_operand" "=r,d") - (minus:QI (match_operand:QI 1 "register_operand" "0,0") - (match_operand:QI 2 "nonmemory_operand" "r,i")))] + +;; "subqi3" +;; "subqq3" "subuqq3" +(define_insn "sub3" + [(set (match_operand:ALL1 0 "register_operand" "=r,d ,r ,r ,r ,r") + (minus:ALL1 (match_operand:ALL1 1 "register_operand" "0,0 ,0 ,0 ,0 ,0") + (match_operand:ALL1 2 "nonmemory_or_const_operand" "r,n Ynn,Y01,Ym1,Y02,Ym2")))] "" "@ sub %0,%2 - subi %0,lo8(%2)" - [(set_attr "length" "1,1") - (set_attr "cc" "set_czn,set_czn")]) + subi %0,lo8(%2) + dec %0 + inc %0 + dec %0\;dec %0 + inc %0\;inc %0" + [(set_attr "length" "1,1,1,1,2,2") + (set_attr "cc" "set_czn,set_czn,set_zn,set_zn,set_zn,set_zn")]) -(define_insn "subhi3" - [(set (match_operand:HI 0 "register_operand" "=r,d") - (minus:HI (match_operand:HI 1 "register_operand" "0,0") - (match_operand:HI 2 "nonmemory_operand" "r,i")))] +;; "subhi3" +;; "subhq3" "subuhq3" +;; "subha3" "subuha3" +(define_insn "sub3" + [(set (match_operand:ALL2 0 "register_operand" "=r,d ,*r") + (minus:ALL2 (match_operand:ALL2 1 "register_operand" "0,0 ,0") + (match_operand:ALL2 2 "nonmemory_or_const_operand" "r,i Ynn,Ynn"))) + (clobber (match_scratch:QI 3 "=X,X ,&d"))] "" - "@ - sub %A0,%A2\;sbc %B0,%B2 - subi %A0,lo8(%2)\;sbci %B0,hi8(%2)" - [(set_attr "length" "2,2") - (set_attr "cc" "set_czn,set_czn")]) + { + return avr_out_minus (operands, NULL, NULL); + } + [(set_attr "adjust_len" "minus") + (set_attr "cc" "minus")]) (define_insn "*subhi3_zero_extend1" [(set (match_operand:HI 0 "register_operand" "=r") @@ -1373,13 +1428,23 @@ [(set_attr "length" "5") (set_attr "cc" "clobber")]) -(define_insn "subsi3" - [(set (match_operand:SI 0 "register_operand" "=r") - (minus:SI (match_operand:SI 1 "register_operand" "0") - (match_operand:SI 2 "register_operand" "r")))] +;; "subsi3" +;; "subsq3" "subusq3" +;; "subsa3" "subusa3" +(define_insn "sub3" + [(set (match_operand:ALL4 0 "register_operand" "=r,d ,r") + (minus:ALL4 (match_operand:ALL4 1 "register_operand" "0,0 ,0") + (match_operand:ALL4 2 "nonmemory_or_const_operand" "r,n Ynn,Ynn"))) + (clobber (match_scratch:QI 3 "=X,X ,&d"))] "" - "sub %0,%2\;sbc %B0,%B2\;sbc %C0,%C2\;sbc %D0,%D2" + { + if (REG_P (operands[2])) + return "sub %0,%2\;sbc %B0,%B2\;sbc %C0,%C2\;sbc %D0,%D2"; + + return avr_out_minus (operands, NULL, NULL); + } [(set_attr "length" "4") + (set_attr "adjust_len" "*,minus,minus") (set_attr "cc" "set_czn")]) (define_insn "*subsi3_zero_extend" @@ -1515,8 +1580,18 @@ adc %A0,__zero_reg__\;adc %B0,__zero_reg__\;adc %C0,__zero_reg__\;adc %D0,__zero_reg__" [(set_attr "length" "6") (set_attr "cc" "clobber")]) - +(define_insn "*umulqihi3.call" + [(set (reg:HI 24) + (mult:HI (zero_extend:HI (reg:QI 22)) + (zero_extend:HI (reg:QI 24)))) + (clobber (reg:QI 21)) + (clobber (reg:HI 22))] + "!AVR_HAVE_MUL" + "%~call __umulqihi3" + [(set_attr "type" "xcall") + (set_attr "cc" "clobber")]) + ;; "umulqihi3" ;; "mulqihi3" (define_insn "mulqihi3" @@ -3303,44 +3378,58 @@ ;;<< << << << << << << << << << << << << << << << << << << << << << << << << << ;; arithmetic shift left -(define_expand "ashlqi3" - [(set (match_operand:QI 0 "register_operand" "") - (ashift:QI (match_operand:QI 1 "register_operand" "") - (match_operand:QI 2 "nop_general_operand" "")))]) +;; "ashlqi3" +;; "ashlqq3" "ashluqq3" +(define_expand "ashl3" + [(set 
(match_operand:ALL1 0 "register_operand" "") + (ashift:ALL1 (match_operand:ALL1 1 "register_operand" "") + (match_operand:QI 2 "nop_general_operand" "")))]) (define_split ; ashlqi3_const4 - [(set (match_operand:QI 0 "d_register_operand" "") - (ashift:QI (match_dup 0) - (const_int 4)))] + [(set (match_operand:ALL1 0 "d_register_operand" "") + (ashift:ALL1 (match_dup 0) + (const_int 4)))] "" - [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4))) - (set (match_dup 0) (and:QI (match_dup 0) (const_int -16)))] - "") + [(set (match_dup 1) + (rotate:QI (match_dup 1) + (const_int 4))) + (set (match_dup 1) + (and:QI (match_dup 1) + (const_int -16)))] + { + operands[1] = avr_to_int_mode (operands[0]); + }) (define_split ; ashlqi3_const5 - [(set (match_operand:QI 0 "d_register_operand" "") - (ashift:QI (match_dup 0) - (const_int 5)))] + [(set (match_operand:ALL1 0 "d_register_operand" "") + (ashift:ALL1 (match_dup 0) + (const_int 5)))] "" - [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4))) - (set (match_dup 0) (ashift:QI (match_dup 0) (const_int 1))) - (set (match_dup 0) (and:QI (match_dup 0) (const_int -32)))] - "") + [(set (match_dup 1) (rotate:QI (match_dup 1) (const_int 4))) + (set (match_dup 1) (ashift:QI (match_dup 1) (const_int 1))) + (set (match_dup 1) (and:QI (match_dup 1) (const_int -32)))] + { + operands[1] = avr_to_int_mode (operands[0]); + }) (define_split ; ashlqi3_const6 - [(set (match_operand:QI 0 "d_register_operand" "") - (ashift:QI (match_dup 0) - (const_int 6)))] + [(set (match_operand:ALL1 0 "d_register_operand" "") + (ashift:ALL1 (match_dup 0) + (const_int 6)))] "" - [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4))) - (set (match_dup 0) (ashift:QI (match_dup 0) (const_int 2))) - (set (match_dup 0) (and:QI (match_dup 0) (const_int -64)))] - "") + [(set (match_dup 1) (rotate:QI (match_dup 1) (const_int 4))) + (set (match_dup 1) (ashift:QI (match_dup 1) (const_int 2))) + (set (match_dup 1) (and:QI (match_dup 1) (const_int -64)))] + { + operands[1] = avr_to_int_mode (operands[0]); + }) -(define_insn "*ashlqi3" - [(set (match_operand:QI 0 "register_operand" "=r,r,r,r,!d,r,r") - (ashift:QI (match_operand:QI 1 "register_operand" "0,0,0,0,0 ,0,0") - (match_operand:QI 2 "nop_general_operand" "r,L,P,K,n ,n,Qm")))] +;; "*ashlqi3" +;; "*ashlqq3" "*ashluqq3" +(define_insn "*ashl3" + [(set (match_operand:ALL1 0 "register_operand" "=r,r,r,r,!d,r,r") + (ashift:ALL1 (match_operand:ALL1 1 "register_operand" "0,0,0,0,0 ,0,0") + (match_operand:QI 2 "nop_general_operand" "r,L,P,K,n ,n,Qm")))] "" { return ashlqi3_out (insn, operands, NULL); @@ -3349,10 +3438,10 @@ (set_attr "adjust_len" "ashlqi") (set_attr "cc" "clobber,none,set_czn,set_czn,set_czn,set_czn,clobber")]) -(define_insn "ashlhi3" - [(set (match_operand:HI 0 "register_operand" "=r,r,r,r,r,r,r") - (ashift:HI (match_operand:HI 1 "register_operand" "0,0,0,r,0,0,0") - (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))] +(define_insn "ashl3" + [(set (match_operand:ALL2 0 "register_operand" "=r,r,r,r,r,r,r") + (ashift:ALL2 (match_operand:ALL2 1 "register_operand" "0,0,0,r,0,0,0") + (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))] "" { return ashlhi3_out (insn, operands, NULL); @@ -3377,8 +3466,7 @@ "" [(set (match_dup 0) (ashift:QI (match_dup 1) - (match_dup 2)))] - "") + (match_dup 2)))]) ;; ??? Combiner does not recognize that it could split the following insn; ;; presumably because he has no register handy? 
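The const4/5/6 splits replace a 4-6 iteration shift by a nibble SWAP plus a mask; the new avr_to_int_mode call is what lets them serve ALL1, rewriting a QQ/UQQ register as its QImode twin (same register, same bits) because the replacement RTL uses the integer-only ROTATE and AND codes. The shift-by-4 case as C:

    #include <stdint.h>
    #include <assert.h>

    /* ashlqi3_const4: SWAP (rotate by 4) then ANDI 0xf0 equals a
       left shift by 4.  ANDI needs an upper (d) register, hence the
       d_register_operand condition and the scratch-based peephole2
       variants further down.  */
    static uint8_t shl4 (uint8_t x)
    {
      x = (uint8_t) ((x << 4) | (x >> 4));   /* swap */
      return (uint8_t) (x & 0xf0);           /* andi */
    }

    int main (void)
    {
      for (unsigned i = 0; i < 256; i++)
        assert (shl4 ((uint8_t) i) == (uint8_t) (i << 4));
      return 0;
    }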
@@ -3443,10 +3531,13 @@ }) -(define_insn "ashlsi3" - [(set (match_operand:SI 0 "register_operand" "=r,r,r,r,r,r,r") - (ashift:SI (match_operand:SI 1 "register_operand" "0,0,0,r,0,0,0") - (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))] +;; "ashlsi3" +;; "ashlsq3" "ashlusq3" +;; "ashlsa3" "ashlusa3" +(define_insn "ashl3" + [(set (match_operand:ALL4 0 "register_operand" "=r,r,r,r,r,r,r") + (ashift:ALL4 (match_operand:ALL4 1 "register_operand" "0,0,0,r,0,0,0") + (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))] "" { return ashlsi3_out (insn, operands, NULL); @@ -3458,55 +3549,65 @@ ;; Optimize if a scratch register from LD_REGS happens to be available. (define_peephole2 ; ashlqi3_l_const4 - [(set (match_operand:QI 0 "l_register_operand" "") - (ashift:QI (match_dup 0) - (const_int 4))) + [(set (match_operand:ALL1 0 "l_register_operand" "") + (ashift:ALL1 (match_dup 0) + (const_int 4))) (match_scratch:QI 1 "d")] "" - [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4))) + [(set (match_dup 2) (rotate:QI (match_dup 2) (const_int 4))) (set (match_dup 1) (const_int -16)) - (set (match_dup 0) (and:QI (match_dup 0) (match_dup 1)))] - "") + (set (match_dup 2) (and:QI (match_dup 2) (match_dup 1)))] + { + operands[2] = avr_to_int_mode (operands[0]); + }) (define_peephole2 ; ashlqi3_l_const5 - [(set (match_operand:QI 0 "l_register_operand" "") - (ashift:QI (match_dup 0) - (const_int 5))) + [(set (match_operand:ALL1 0 "l_register_operand" "") + (ashift:ALL1 (match_dup 0) + (const_int 5))) (match_scratch:QI 1 "d")] "" - [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4))) - (set (match_dup 0) (ashift:QI (match_dup 0) (const_int 1))) + [(set (match_dup 2) (rotate:QI (match_dup 2) (const_int 4))) + (set (match_dup 2) (ashift:QI (match_dup 2) (const_int 1))) (set (match_dup 1) (const_int -32)) - (set (match_dup 0) (and:QI (match_dup 0) (match_dup 1)))] - "") + (set (match_dup 2) (and:QI (match_dup 2) (match_dup 1)))] + { + operands[2] = avr_to_int_mode (operands[0]); + }) (define_peephole2 ; ashlqi3_l_const6 - [(set (match_operand:QI 0 "l_register_operand" "") - (ashift:QI (match_dup 0) - (const_int 6))) + [(set (match_operand:ALL1 0 "l_register_operand" "") + (ashift:ALL1 (match_dup 0) + (const_int 6))) (match_scratch:QI 1 "d")] "" - [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4))) - (set (match_dup 0) (ashift:QI (match_dup 0) (const_int 2))) + [(set (match_dup 2) (rotate:QI (match_dup 2) (const_int 4))) + (set (match_dup 2) (ashift:QI (match_dup 2) (const_int 2))) (set (match_dup 1) (const_int -64)) - (set (match_dup 0) (and:QI (match_dup 0) (match_dup 1)))] - "") + (set (match_dup 2) (and:QI (match_dup 2) (match_dup 1)))] + { + operands[2] = avr_to_int_mode (operands[0]); + }) (define_peephole2 [(match_scratch:QI 3 "d") - (set (match_operand:HI 0 "register_operand" "") - (ashift:HI (match_operand:HI 1 "register_operand" "") - (match_operand:QI 2 "const_int_operand" "")))] + (set (match_operand:ALL2 0 "register_operand" "") + (ashift:ALL2 (match_operand:ALL2 1 "register_operand" "") + (match_operand:QI 2 "const_int_operand" "")))] "" - [(parallel [(set (match_dup 0) (ashift:HI (match_dup 1) (match_dup 2))) - (clobber (match_dup 3))])] - "") + [(parallel [(set (match_dup 0) + (ashift:ALL2 (match_dup 1) + (match_dup 2))) + (clobber (match_dup 3))])]) -(define_insn "*ashlhi3_const" - [(set (match_operand:HI 0 "register_operand" "=r,r,r,r,r") - (ashift:HI (match_operand:HI 1 "register_operand" "0,0,r,0,0") - (match_operand:QI 2 "const_int_operand" 
"L,P,O,K,n"))) - (clobber (match_scratch:QI 3 "=X,X,X,X,&d"))] +;; "*ashlhi3_const" +;; "*ashlhq3_const" "*ashluhq3_const" +;; "*ashlha3_const" "*ashluha3_const" +(define_insn "*ashl3_const" + [(set (match_operand:ALL2 0 "register_operand" "=r,r,r,r,r") + (ashift:ALL2 (match_operand:ALL2 1 "register_operand" "0,0,r,0,0") + (match_operand:QI 2 "const_int_operand" "L,P,O,K,n"))) + (clobber (match_scratch:QI 3 "=X,X,X,X,&d"))] "reload_completed" { return ashlhi3_out (insn, operands, NULL); @@ -3517,19 +3618,24 @@ (define_peephole2 [(match_scratch:QI 3 "d") - (set (match_operand:SI 0 "register_operand" "") - (ashift:SI (match_operand:SI 1 "register_operand" "") - (match_operand:QI 2 "const_int_operand" "")))] + (set (match_operand:ALL4 0 "register_operand" "") + (ashift:ALL4 (match_operand:ALL4 1 "register_operand" "") + (match_operand:QI 2 "const_int_operand" "")))] "" - [(parallel [(set (match_dup 0) (ashift:SI (match_dup 1) (match_dup 2))) + [(parallel [(set (match_dup 0) + (ashift:ALL4 (match_dup 1) + (match_dup 2))) (clobber (match_dup 3))])] "") -(define_insn "*ashlsi3_const" - [(set (match_operand:SI 0 "register_operand" "=r,r,r,r") - (ashift:SI (match_operand:SI 1 "register_operand" "0,0,r,0") - (match_operand:QI 2 "const_int_operand" "L,P,O,n"))) - (clobber (match_scratch:QI 3 "=X,X,X,&d"))] +;; "*ashlsi3_const" +;; "*ashlsq3_const" "*ashlusq3_const" +;; "*ashlsa3_const" "*ashlusa3_const" +(define_insn "*ashl3_const" + [(set (match_operand:ALL4 0 "register_operand" "=r,r,r,r") + (ashift:ALL4 (match_operand:ALL4 1 "register_operand" "0,0,r,0") + (match_operand:QI 2 "const_int_operand" "L,P,O,n"))) + (clobber (match_scratch:QI 3 "=X,X,X,&d"))] "reload_completed" { return ashlsi3_out (insn, operands, NULL); @@ -3580,10 +3686,12 @@ ;; >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> ;; arithmetic shift right -(define_insn "ashrqi3" - [(set (match_operand:QI 0 "register_operand" "=r,r,r,r,r ,r ,r") - (ashiftrt:QI (match_operand:QI 1 "register_operand" "0,0,0,0,0 ,0 ,0") - (match_operand:QI 2 "nop_general_operand" "r,L,P,K,C03 C04 C05,C06 C07,Qm")))] +;; "ashrqi3" +;; "ashrqq3" "ashruqq3" +(define_insn "ashr3" + [(set (match_operand:ALL1 0 "register_operand" "=r,r,r,r,r ,r ,r") + (ashiftrt:ALL1 (match_operand:ALL1 1 "register_operand" "0,0,0,0,0 ,0 ,0") + (match_operand:QI 2 "nop_general_operand" "r,L,P,K,C03 C04 C05,C06 C07,Qm")))] "" { return ashrqi3_out (insn, operands, NULL); @@ -3592,10 +3700,13 @@ (set_attr "adjust_len" "ashrqi") (set_attr "cc" "clobber,none,set_czn,set_czn,set_czn,clobber,clobber")]) -(define_insn "ashrhi3" - [(set (match_operand:HI 0 "register_operand" "=r,r,r,r,r,r,r") - (ashiftrt:HI (match_operand:HI 1 "register_operand" "0,0,0,r,0,0,0") - (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))] +;; "ashrhi3" +;; "ashrhq3" "ashruhq3" +;; "ashrha3" "ashruha3" +(define_insn "ashr3" + [(set (match_operand:ALL2 0 "register_operand" "=r,r,r,r,r,r,r") + (ashiftrt:ALL2 (match_operand:ALL2 1 "register_operand" "0,0,0,r,0,0,0") + (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))] "" { return ashrhi3_out (insn, operands, NULL); @@ -3616,10 +3727,13 @@ [(set_attr "adjust_len" "ashrpsi") (set_attr "cc" "clobber")]) -(define_insn "ashrsi3" - [(set (match_operand:SI 0 "register_operand" "=r,r,r,r,r,r,r") - (ashiftrt:SI (match_operand:SI 1 "register_operand" "0,0,0,r,0,0,0") - (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))] +;; "ashrsi3" +;; "ashrsq3" "ashrusq3" +;; "ashrsa3" "ashrusa3" +(define_insn "ashr3" + [(set 
(match_operand:ALL4 0 "register_operand" "=r,r,r,r,r,r,r") + (ashiftrt:ALL4 (match_operand:ALL4 1 "register_operand" "0,0,0,r,0,0,0") + (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))] "" { return ashrsi3_out (insn, operands, NULL); @@ -3632,19 +3746,23 @@ (define_peephole2 [(match_scratch:QI 3 "d") - (set (match_operand:HI 0 "register_operand" "") - (ashiftrt:HI (match_operand:HI 1 "register_operand" "") - (match_operand:QI 2 "const_int_operand" "")))] + (set (match_operand:ALL2 0 "register_operand" "") + (ashiftrt:ALL2 (match_operand:ALL2 1 "register_operand" "") + (match_operand:QI 2 "const_int_operand" "")))] "" - [(parallel [(set (match_dup 0) (ashiftrt:HI (match_dup 1) (match_dup 2))) - (clobber (match_dup 3))])] - "") + [(parallel [(set (match_dup 0) + (ashiftrt:ALL2 (match_dup 1) + (match_dup 2))) + (clobber (match_dup 3))])]) -(define_insn "*ashrhi3_const" - [(set (match_operand:HI 0 "register_operand" "=r,r,r,r,r") - (ashiftrt:HI (match_operand:HI 1 "register_operand" "0,0,r,0,0") - (match_operand:QI 2 "const_int_operand" "L,P,O,K,n"))) - (clobber (match_scratch:QI 3 "=X,X,X,X,&d"))] +;; "*ashrhi3_const" +;; "*ashrhq3_const" "*ashruhq3_const" +;; "*ashrha3_const" "*ashruha3_const" +(define_insn "*ashr3_const" + [(set (match_operand:ALL2 0 "register_operand" "=r,r,r,r,r") + (ashiftrt:ALL2 (match_operand:ALL2 1 "register_operand" "0,0,r,0,0") + (match_operand:QI 2 "const_int_operand" "L,P,O,K,n"))) + (clobber (match_scratch:QI 3 "=X,X,X,X,&d"))] "reload_completed" { return ashrhi3_out (insn, operands, NULL); @@ -3655,19 +3773,23 @@ (define_peephole2 [(match_scratch:QI 3 "d") - (set (match_operand:SI 0 "register_operand" "") - (ashiftrt:SI (match_operand:SI 1 "register_operand" "") - (match_operand:QI 2 "const_int_operand" "")))] + (set (match_operand:ALL4 0 "register_operand" "") + (ashiftrt:ALL4 (match_operand:ALL4 1 "register_operand" "") + (match_operand:QI 2 "const_int_operand" "")))] "" - [(parallel [(set (match_dup 0) (ashiftrt:SI (match_dup 1) (match_dup 2))) - (clobber (match_dup 3))])] - "") + [(parallel [(set (match_dup 0) + (ashiftrt:ALL4 (match_dup 1) + (match_dup 2))) + (clobber (match_dup 3))])]) -(define_insn "*ashrsi3_const" - [(set (match_operand:SI 0 "register_operand" "=r,r,r,r") - (ashiftrt:SI (match_operand:SI 1 "register_operand" "0,0,r,0") - (match_operand:QI 2 "const_int_operand" "L,P,O,n"))) - (clobber (match_scratch:QI 3 "=X,X,X,&d"))] +;; "*ashrsi3_const" +;; "*ashrsq3_const" "*ashrusq3_const" +;; "*ashrsa3_const" "*ashrusa3_const" +(define_insn "*ashr3_const" + [(set (match_operand:ALL4 0 "register_operand" "=r,r,r,r") + (ashiftrt:ALL4 (match_operand:ALL4 1 "register_operand" "0,0,r,0") + (match_operand:QI 2 "const_int_operand" "L,P,O,n"))) + (clobber (match_scratch:QI 3 "=X,X,X,&d"))] "reload_completed" { return ashrsi3_out (insn, operands, NULL); @@ -3679,44 +3801,59 @@ ;; >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> ;; logical shift right -(define_expand "lshrqi3" - [(set (match_operand:QI 0 "register_operand" "") - (lshiftrt:QI (match_operand:QI 1 "register_operand" "") - (match_operand:QI 2 "nop_general_operand" "")))]) +;; "lshrqi3" +;; "lshrqq3 "lshruqq3" +(define_expand "lshr3" + [(set (match_operand:ALL1 0 "register_operand" "") + (lshiftrt:ALL1 (match_operand:ALL1 1 "register_operand" "") + (match_operand:QI 2 "nop_general_operand" "")))]) (define_split ; lshrqi3_const4 - [(set (match_operand:QI 0 "d_register_operand" "") - (lshiftrt:QI (match_dup 0) - (const_int 4)))] + [(set (match_operand:ALL1 0 
"d_register_operand" "") + (lshiftrt:ALL1 (match_dup 0) + (const_int 4)))] "" - [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4))) - (set (match_dup 0) (and:QI (match_dup 0) (const_int 15)))] - "") + [(set (match_dup 1) + (rotate:QI (match_dup 1) + (const_int 4))) + (set (match_dup 1) + (and:QI (match_dup 1) + (const_int 15)))] + { + operands[1] = avr_to_int_mode (operands[0]); + }) (define_split ; lshrqi3_const5 - [(set (match_operand:QI 0 "d_register_operand" "") - (lshiftrt:QI (match_dup 0) - (const_int 5)))] + [(set (match_operand:ALL1 0 "d_register_operand" "") + (lshiftrt:ALL1 (match_dup 0) + (const_int 5)))] "" - [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4))) - (set (match_dup 0) (lshiftrt:QI (match_dup 0) (const_int 1))) - (set (match_dup 0) (and:QI (match_dup 0) (const_int 7)))] - "") + [(set (match_dup 1) (rotate:QI (match_dup 1) (const_int 4))) + (set (match_dup 1) (lshiftrt:QI (match_dup 1) (const_int 1))) + (set (match_dup 1) (and:QI (match_dup 1) (const_int 7)))] + { + operands[1] = avr_to_int_mode (operands[0]); + }) (define_split ; lshrqi3_const6 [(set (match_operand:QI 0 "d_register_operand" "") (lshiftrt:QI (match_dup 0) (const_int 6)))] "" - [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4))) - (set (match_dup 0) (lshiftrt:QI (match_dup 0) (const_int 2))) - (set (match_dup 0) (and:QI (match_dup 0) (const_int 3)))] - "") + [(set (match_dup 1) (rotate:QI (match_dup 1) (const_int 4))) + (set (match_dup 1) (lshiftrt:QI (match_dup 1) (const_int 2))) + (set (match_dup 1) (and:QI (match_dup 1) (const_int 3)))] + { + operands[1] = avr_to_int_mode (operands[0]); + }) -(define_insn "*lshrqi3" - [(set (match_operand:QI 0 "register_operand" "=r,r,r,r,!d,r,r") - (lshiftrt:QI (match_operand:QI 1 "register_operand" "0,0,0,0,0 ,0,0") - (match_operand:QI 2 "nop_general_operand" "r,L,P,K,n ,n,Qm")))] +;; "*lshrqi3" +;; "*lshrqq3" +;; "*lshruqq3" +(define_insn "*lshr3" + [(set (match_operand:ALL1 0 "register_operand" "=r,r,r,r,!d,r,r") + (lshiftrt:ALL1 (match_operand:ALL1 1 "register_operand" "0,0,0,0,0 ,0,0") + (match_operand:QI 2 "nop_general_operand" "r,L,P,K,n ,n,Qm")))] "" { return lshrqi3_out (insn, operands, NULL); @@ -3725,10 +3862,13 @@ (set_attr "adjust_len" "lshrqi") (set_attr "cc" "clobber,none,set_czn,set_czn,set_czn,set_czn,clobber")]) -(define_insn "lshrhi3" - [(set (match_operand:HI 0 "register_operand" "=r,r,r,r,r,r,r") - (lshiftrt:HI (match_operand:HI 1 "register_operand" "0,0,0,r,0,0,0") - (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))] +;; "lshrhi3" +;; "lshrhq3" "lshruhq3" +;; "lshrha3" "lshruha3" +(define_insn "lshr3" + [(set (match_operand:ALL2 0 "register_operand" "=r,r,r,r,r,r,r") + (lshiftrt:ALL2 (match_operand:ALL2 1 "register_operand" "0,0,0,r,0,0,0") + (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))] "" { return lshrhi3_out (insn, operands, NULL); @@ -3749,10 +3889,13 @@ [(set_attr "adjust_len" "lshrpsi") (set_attr "cc" "clobber")]) -(define_insn "lshrsi3" - [(set (match_operand:SI 0 "register_operand" "=r,r,r,r,r,r,r") - (lshiftrt:SI (match_operand:SI 1 "register_operand" "0,0,0,r,0,0,0") - (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))] +;; "lshrsi3" +;; "lshrsq3" "lshrusq3" +;; "lshrsa3" "lshrusa3" +(define_insn "lshr3" + [(set (match_operand:ALL4 0 "register_operand" "=r,r,r,r,r,r,r") + (lshiftrt:ALL4 (match_operand:ALL4 1 "register_operand" "0,0,0,r,0,0,0") + (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))] "" { return lshrsi3_out (insn, operands, NULL); @@ -3764,55 
+3907,65 @@ ;; Optimize if a scratch register from LD_REGS happens to be available. (define_peephole2 ; lshrqi3_l_const4 - [(set (match_operand:QI 0 "l_register_operand" "") - (lshiftrt:QI (match_dup 0) - (const_int 4))) + [(set (match_operand:ALL1 0 "l_register_operand" "") + (lshiftrt:ALL1 (match_dup 0) + (const_int 4))) (match_scratch:QI 1 "d")] "" - [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4))) + [(set (match_dup 2) (rotate:QI (match_dup 2) (const_int 4))) (set (match_dup 1) (const_int 15)) - (set (match_dup 0) (and:QI (match_dup 0) (match_dup 1)))] - "") + (set (match_dup 2) (and:QI (match_dup 2) (match_dup 1)))] + { + operands[2] = avr_to_int_mode (operands[0]); + }) (define_peephole2 ; lshrqi3_l_const5 - [(set (match_operand:QI 0 "l_register_operand" "") - (lshiftrt:QI (match_dup 0) - (const_int 5))) + [(set (match_operand:ALL1 0 "l_register_operand" "") + (lshiftrt:ALL1 (match_dup 0) + (const_int 5))) (match_scratch:QI 1 "d")] "" - [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4))) - (set (match_dup 0) (lshiftrt:QI (match_dup 0) (const_int 1))) + [(set (match_dup 2) (rotate:QI (match_dup 2) (const_int 4))) + (set (match_dup 2) (lshiftrt:QI (match_dup 2) (const_int 1))) (set (match_dup 1) (const_int 7)) - (set (match_dup 0) (and:QI (match_dup 0) (match_dup 1)))] - "") + (set (match_dup 2) (and:QI (match_dup 2) (match_dup 1)))] + { + operands[2] = avr_to_int_mode (operands[0]); + }) (define_peephole2 ; lshrqi3_l_const6 - [(set (match_operand:QI 0 "l_register_operand" "") - (lshiftrt:QI (match_dup 0) - (const_int 6))) + [(set (match_operand:ALL1 0 "l_register_operand" "") + (lshiftrt:ALL1 (match_dup 0) + (const_int 6))) (match_scratch:QI 1 "d")] "" - [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4))) - (set (match_dup 0) (lshiftrt:QI (match_dup 0) (const_int 2))) + [(set (match_dup 2) (rotate:QI (match_dup 2) (const_int 4))) + (set (match_dup 2) (lshiftrt:QI (match_dup 2) (const_int 2))) (set (match_dup 1) (const_int 3)) - (set (match_dup 0) (and:QI (match_dup 0) (match_dup 1)))] - "") + (set (match_dup 2) (and:QI (match_dup 2) (match_dup 1)))] + { + operands[2] = avr_to_int_mode (operands[0]); + }) (define_peephole2 [(match_scratch:QI 3 "d") - (set (match_operand:HI 0 "register_operand" "") - (lshiftrt:HI (match_operand:HI 1 "register_operand" "") - (match_operand:QI 2 "const_int_operand" "")))] + (set (match_operand:ALL2 0 "register_operand" "") + (lshiftrt:ALL2 (match_operand:ALL2 1 "register_operand" "") + (match_operand:QI 2 "const_int_operand" "")))] "" - [(parallel [(set (match_dup 0) (lshiftrt:HI (match_dup 1) (match_dup 2))) - (clobber (match_dup 3))])] - "") + [(parallel [(set (match_dup 0) + (lshiftrt:ALL2 (match_dup 1) + (match_dup 2))) + (clobber (match_dup 3))])]) -(define_insn "*lshrhi3_const" - [(set (match_operand:HI 0 "register_operand" "=r,r,r,r,r") - (lshiftrt:HI (match_operand:HI 1 "register_operand" "0,0,r,0,0") - (match_operand:QI 2 "const_int_operand" "L,P,O,K,n"))) - (clobber (match_scratch:QI 3 "=X,X,X,X,&d"))] +;; "*lshrhi3_const" +;; "*lshrhq3_const" "*lshruhq3_const" +;; "*lshrha3_const" "*lshruha3_const" +(define_insn "*lshr3_const" + [(set (match_operand:ALL2 0 "register_operand" "=r,r,r,r,r") + (lshiftrt:ALL2 (match_operand:ALL2 1 "register_operand" "0,0,r,0,0") + (match_operand:QI 2 "const_int_operand" "L,P,O,K,n"))) + (clobber (match_scratch:QI 3 "=X,X,X,X,&d"))] "reload_completed" { return lshrhi3_out (insn, operands, NULL); @@ -3823,19 +3976,23 @@ (define_peephole2 [(match_scratch:QI 3 "d") - (set (match_operand:SI 
0 "register_operand" "") - (lshiftrt:SI (match_operand:SI 1 "register_operand" "") - (match_operand:QI 2 "const_int_operand" "")))] + (set (match_operand:ALL4 0 "register_operand" "") + (lshiftrt:ALL4 (match_operand:ALL4 1 "register_operand" "") + (match_operand:QI 2 "const_int_operand" "")))] "" - [(parallel [(set (match_dup 0) (lshiftrt:SI (match_dup 1) (match_dup 2))) - (clobber (match_dup 3))])] - "") + [(parallel [(set (match_dup 0) + (lshiftrt:ALL4 (match_dup 1) + (match_dup 2))) + (clobber (match_dup 3))])]) -(define_insn "*lshrsi3_const" - [(set (match_operand:SI 0 "register_operand" "=r,r,r,r") - (lshiftrt:SI (match_operand:SI 1 "register_operand" "0,0,r,0") - (match_operand:QI 2 "const_int_operand" "L,P,O,n"))) - (clobber (match_scratch:QI 3 "=X,X,X,&d"))] +;; "*lshrsi3_const" +;; "*lshrsq3_const" "*lshrusq3_const" +;; "*lshrsa3_const" "*lshrusa3_const" +(define_insn "*lshr3_const" + [(set (match_operand:ALL4 0 "register_operand" "=r,r,r,r") + (lshiftrt:ALL4 (match_operand:ALL4 1 "register_operand" "0,0,r,0") + (match_operand:QI 2 "const_int_operand" "L,P,O,n"))) + (clobber (match_scratch:QI 3 "=X,X,X,&d"))] "reload_completed" { return lshrsi3_out (insn, operands, NULL); @@ -4278,24 +4435,29 @@ [(set_attr "cc" "compare") (set_attr "length" "4")]) -(define_insn "*reversed_tstsi" +;; "*reversed_tstsi" +;; "*reversed_tstsq" "*reversed_tstusq" +;; "*reversed_tstsa" "*reversed_tstusa" +(define_insn "*reversed_tst" [(set (cc0) - (compare (const_int 0) - (match_operand:SI 0 "register_operand" "r"))) - (clobber (match_scratch:QI 1 "=X"))] + (compare (match_operand:ALL4 0 "const0_operand" "Y00") + (match_operand:ALL4 1 "register_operand" "r"))) + (clobber (match_scratch:QI 2 "=X"))] "" - "cp __zero_reg__,%A0 - cpc __zero_reg__,%B0 - cpc __zero_reg__,%C0 - cpc __zero_reg__,%D0" + "cp __zero_reg__,%A1 + cpc __zero_reg__,%B1 + cpc __zero_reg__,%C1 + cpc __zero_reg__,%D1" [(set_attr "cc" "compare") (set_attr "length" "4")]) -(define_insn "*cmpqi" +;; "*cmpqi" +;; "*cmpqq" "*cmpuqq" +(define_insn "*cmp" [(set (cc0) - (compare (match_operand:QI 0 "register_operand" "r,r,d") - (match_operand:QI 1 "nonmemory_operand" "L,r,i")))] + (compare (match_operand:ALL1 0 "register_operand" "r ,r,d") + (match_operand:ALL1 1 "nonmemory_operand" "Y00,r,i")))] "" "@ tst %0 @@ -4313,11 +4475,14 @@ [(set_attr "cc" "compare") (set_attr "length" "1")]) -(define_insn "*cmphi" +;; "*cmphi" +;; "*cmphq" "*cmpuhq" +;; "*cmpha" "*cmpuha" +(define_insn "*cmp" [(set (cc0) - (compare (match_operand:HI 0 "register_operand" "!w,r,r,d ,r ,d,r") - (match_operand:HI 1 "nonmemory_operand" "L ,L,r,s ,s ,M,n"))) - (clobber (match_scratch:QI 2 "=X ,X,X,&d,&d ,X,&d"))] + (compare (match_operand:ALL2 0 "register_operand" "!w ,r ,r,d ,r ,d,r") + (match_operand:ALL2 1 "nonmemory_operand" "Y00,Y00,r,s ,s ,M,n Ynn"))) + (clobber (match_scratch:QI 2 "=X ,X ,X,&d,&d ,X,&d"))] "" { switch (which_alternative) @@ -4330,11 +4495,15 @@ return "cp %A0,%A1\;cpc %B0,%B1"; case 3: + if (mode != HImode) + break; return reg_unused_after (insn, operands[0]) ? 
"subi %A0,lo8(%1)\;sbci %B0,hi8(%1)" : "ldi %2,hi8(%1)\;cpi %A0,lo8(%1)\;cpc %B0,%2"; case 4: + if (mode != HImode) + break; return "ldi %2,lo8(%1)\;cp %A0,%2\;ldi %2,hi8(%1)\;cpc %B0,%2"; } @@ -4374,11 +4543,14 @@ (set_attr "length" "3,3,5,6,3,7") (set_attr "adjust_len" "tstpsi,*,*,*,compare,compare")]) -(define_insn "*cmpsi" +;; "*cmpsi" +;; "*cmpsq" "*cmpusq" +;; "*cmpsa" "*cmpusa" +(define_insn "*cmp" [(set (cc0) - (compare (match_operand:SI 0 "register_operand" "r,r ,d,r ,r") - (match_operand:SI 1 "nonmemory_operand" "L,r ,M,M ,n"))) - (clobber (match_scratch:QI 2 "=X,X ,X,&d,&d"))] + (compare (match_operand:ALL4 0 "register_operand" "r ,r ,d,r ,r") + (match_operand:ALL4 1 "nonmemory_operand" "Y00,r ,M,M ,n Ynn"))) + (clobber (match_scratch:QI 2 "=X ,X ,X,&d,&d"))] "" { if (0 == which_alternative) @@ -4398,55 +4570,33 @@ ;; ---------------------------------------------------------------------- ;; Conditional jump instructions -(define_expand "cbranchsi4" - [(parallel [(set (cc0) - (compare (match_operand:SI 1 "register_operand" "") - (match_operand:SI 2 "nonmemory_operand" ""))) - (clobber (match_scratch:QI 4 ""))]) - (set (pc) - (if_then_else - (match_operator 0 "ordered_comparison_operator" [(cc0) - (const_int 0)]) - (label_ref (match_operand 3 "" "")) - (pc)))] - "") - -(define_expand "cbranchpsi4" - [(parallel [(set (cc0) - (compare (match_operand:PSI 1 "register_operand" "") - (match_operand:PSI 2 "nonmemory_operand" ""))) - (clobber (match_scratch:QI 4 ""))]) - (set (pc) - (if_then_else (match_operator 0 "ordered_comparison_operator" [(cc0) - (const_int 0)]) - (label_ref (match_operand 3 "" "")) - (pc)))] - "") - -(define_expand "cbranchhi4" - [(parallel [(set (cc0) - (compare (match_operand:HI 1 "register_operand" "") - (match_operand:HI 2 "nonmemory_operand" ""))) - (clobber (match_scratch:QI 4 ""))]) - (set (pc) - (if_then_else - (match_operator 0 "ordered_comparison_operator" [(cc0) - (const_int 0)]) - (label_ref (match_operand 3 "" "")) - (pc)))] - "") - -(define_expand "cbranchqi4" +;; "cbranchqi4" +;; "cbranchqq4" "cbranchuqq4" +(define_expand "cbranch4" [(set (cc0) - (compare (match_operand:QI 1 "register_operand" "") - (match_operand:QI 2 "nonmemory_operand" ""))) + (compare (match_operand:ALL1 1 "register_operand" "") + (match_operand:ALL1 2 "nonmemory_operand" ""))) (set (pc) (if_then_else - (match_operator 0 "ordered_comparison_operator" [(cc0) - (const_int 0)]) - (label_ref (match_operand 3 "" "")) - (pc)))] - "") + (match_operator 0 "ordered_comparison_operator" [(cc0) + (const_int 0)]) + (label_ref (match_operand 3 "" "")) + (pc)))]) + +;; "cbranchhi4" "cbranchhq4" "cbranchuhq4" "cbranchha4" "cbranchuha4" +;; "cbranchsi4" "cbranchsq4" "cbranchusq4" "cbranchsa4" "cbranchusa4" +;; "cbranchpsi4" +(define_expand "cbranch4" + [(parallel [(set (cc0) + (compare (match_operand:ORDERED234 1 "register_operand" "") + (match_operand:ORDERED234 2 "nonmemory_operand" ""))) + (clobber (match_scratch:QI 4 ""))]) + (set (pc) + (if_then_else + (match_operator 0 "ordered_comparison_operator" [(cc0) + (const_int 0)]) + (label_ref (match_operand 3 "" "")) + (pc)))]) ;; Test a single bit in a QI/HI/SImode register. @@ -4477,7 +4627,7 @@ (const_int 4)))) (set_attr "cc" "clobber")]) -;; Same test based on Bitwise AND RTL. Keep this incase gcc changes patterns. +;; Same test based on bitwise AND. Keep this in case gcc changes patterns. ;; or for old peepholes. 
;; Fixme - bitwise Mask will not work for DImode @@ -4492,12 +4642,12 @@ (label_ref (match_operand 3 "" "")) (pc)))] "" -{ + { HOST_WIDE_INT bitnumber; bitnumber = exact_log2 (GET_MODE_MASK (mode) & INTVAL (operands[2])); operands[2] = GEN_INT (bitnumber); return avr_out_sbxx_branch (insn, operands); -} + } [(set (attr "length") (if_then_else (and (ge (minus (pc) (match_dup 3)) (const_int -2046)) (le (minus (pc) (match_dup 3)) (const_int 2046))) @@ -4837,9 +4987,10 @@ (define_expand "casesi" - [(set (match_dup 6) - (minus:HI (subreg:HI (match_operand:SI 0 "register_operand" "") 0) - (match_operand:HI 1 "register_operand" ""))) + [(parallel [(set (match_dup 6) + (minus:HI (subreg:HI (match_operand:SI 0 "register_operand" "") 0) + (match_operand:HI 1 "register_operand" ""))) + (clobber (scratch:QI))]) (parallel [(set (cc0) (compare (match_dup 6) (match_operand:HI 2 "register_operand" ""))) @@ -5201,8 +5352,8 @@ (define_peephole ; "*cpse.eq" [(set (cc0) - (compare (match_operand:QI 1 "register_operand" "r,r") - (match_operand:QI 2 "reg_or_0_operand" "r,L"))) + (compare (match_operand:ALL1 1 "register_operand" "r,r") + (match_operand:ALL1 2 "reg_or_0_operand" "r,Y00"))) (set (pc) (if_then_else (eq (cc0) (const_int 0)) @@ -5236,8 +5387,8 @@ (define_peephole ; "*cpse.ne" [(set (cc0) - (compare (match_operand:QI 1 "register_operand" "") - (match_operand:QI 2 "reg_or_0_operand" ""))) + (compare (match_operand:ALL1 1 "register_operand" "") + (match_operand:ALL1 2 "reg_or_0_operand" ""))) (set (pc) (if_then_else (ne (cc0) (const_int 0)) @@ -5246,7 +5397,7 @@ "!AVR_HAVE_JMP_CALL || !avr_current_device->errata_skip" { - if (operands[2] == const0_rtx) + if (operands[2] == CONST0_RTX (mode)) operands[2] = zero_reg_rtx; return 3 == avr_jump_mode (operands[0], insn) @@ -6265,4 +6416,8 @@ }) +;; Fixed-point instructions +(include "avr-fixed.md") + +;; Operations on 64-bit registers (include "avr-dimode.md") diff --git a/gcc/config/avr/constraints.md b/gcc/config/avr/constraints.md index 57e259db6d1..5a1c9f1aef1 100644 --- a/gcc/config/avr/constraints.md +++ b/gcc/config/avr/constraints.md @@ -192,3 +192,47 @@ "32-bit integer constant where no nibble equals 0xf." (and (match_code "const_int") (match_test "!avr_has_nibble_0xf (op)"))) + +;; CONST_FIXED is no element of 'n' so cook our own. +;; "i" or "s" would match but because the insn uses iterators that cover +;; INT_MODE, "i" or "s" is not always possible. + +(define_constraint "Ynn" + "Fixed-point constant known at compile time." 
+ (match_code "const_fixed")) + +(define_constraint "Y00" + "Fixed-point or integer constant with bit representation 0x0" + (and (match_code "const_fixed,const_int") + (match_test "op == CONST0_RTX (GET_MODE (op))"))) + +(define_constraint "Y01" + "Fixed-point or integer constant with bit representation 0x1" + (ior (and (match_code "const_fixed") + (match_test "1 == INTVAL (avr_to_int_mode (op))")) + (match_test "satisfies_constraint_P (op)"))) + +(define_constraint "Ym1" + "Fixed-point or integer constant with bit representation -0x1" + (ior (and (match_code "const_fixed") + (match_test "-1 == INTVAL (avr_to_int_mode (op))")) + (match_test "satisfies_constraint_N (op)"))) + +(define_constraint "Y02" + "Fixed-point or integer constant with bit representation 0x2" + (ior (and (match_code "const_fixed") + (match_test "2 == INTVAL (avr_to_int_mode (op))")) + (match_test "satisfies_constraint_K (op)"))) + +(define_constraint "Ym2" + "Fixed-point or integer constant with bit representation -0x2" + (ior (and (match_code "const_fixed") + (match_test "-2 == INTVAL (avr_to_int_mode (op))")) + (match_test "satisfies_constraint_Cm2 (op)"))) + +;; Similar to "IJ" used with ADIW/SBIW, but for CONST_FIXED. + +(define_constraint "YIJ" + "Fixed-point constant from @minus{}0x003f to 0x003f." + (and (match_code "const_fixed") + (match_test "IN_RANGE (INTVAL (avr_to_int_mode (op)), -63, 63)"))) diff --git a/gcc/config/avr/predicates.md b/gcc/config/avr/predicates.md index f6563c6751d..04587ae491f 100644 --- a/gcc/config/avr/predicates.md +++ b/gcc/config/avr/predicates.md @@ -74,7 +74,7 @@ ;; Return 1 if OP is the zero constant for MODE. (define_predicate "const0_operand" - (and (match_code "const_int,const_double") + (and (match_code "const_int,const_fixed,const_double") (match_test "op == CONST0_RTX (mode)"))) ;; Return 1 if OP is the one constant integer for MODE. @@ -248,3 +248,21 @@ (define_predicate "o16_operand" (and (match_code "const_int") (match_test "IN_RANGE (INTVAL (op), -(1<<16), -1)"))) + +;; Const int, fixed, or double operand +(define_predicate "const_operand" + (ior (match_code "const_fixed") + (match_code "const_double") + (match_operand 0 "const_int_operand"))) + +;; Const int, const fixed, or const double operand +(define_predicate "nonmemory_or_const_operand" + (ior (match_code "const_fixed") + (match_code "const_double") + (match_operand 0 "nonmemory_operand"))) + +;; Immediate, const fixed, or const double operand +(define_predicate "const_or_immediate_operand" + (ior (match_code "const_fixed") + (match_code "const_double") + (match_operand 0 "immediate_operand"))) diff --git a/libgcc/ChangeLog b/libgcc/ChangeLog index 33ad5185479..60f19491d76 100644 --- a/libgcc/ChangeLog +++ b/libgcc/ChangeLog @@ -1,3 +1,23 @@ +2012-08-24 Georg-Johann Lay + + PR target/54222 + * config/avr/lib1funcs-fixed.S: New file. + * config/avr/lib1funcs.S: Include it. Undefine some divmodsi + after they are used. + (neg2, neg4): New macros. + (__mulqihi3,__umulqihi3,__mulhi3): Rewrite non-MUL variants. + (__mulhisi3,__umulhisi3,__mulsi3): Rewrite non-MUL variants. + (__umulhisi3): Speed up MUL variant if there is enough flash. + * config/avr/avr-lib.h (TA, UTA): Adjust according to gcc's + avr-modes.def. 
+ * config/avr/t-avr (LIB1ASMFUNCS): Add: _fractqqsf, _fractuqqsf, + _fracthqsf, _fractuhqsf, _fracthasf, _fractuhasf, _fractsasf, + _fractusasf, _fractsfqq, _fractsfuqq, _fractsfhq, _fractsfuhq, + _fractsfha, _fractsfsa, _mulqq3, _muluqq3, _mulhq3, _muluhq3, + _mulha3, _muluha3, _mulsa3, _mulusa3, _divqq3, _udivuqq3, _divhq3, + _udivuhq3, _divha3, _udivuha3, _divsa3, _udivusa3. + (LIB2FUNCS_EXCLUDE): Add supported functions. + 2012-08-22 Georg-Johann Lay * Makefile.in (fixed-funcs,fixed-conv-funcs): filter-out diff --git a/libgcc/config/avr/avr-lib.h b/libgcc/config/avr/avr-lib.h index daca4d81f9a..66082eb8a48 100644 --- a/libgcc/config/avr/avr-lib.h +++ b/libgcc/config/avr/avr-lib.h @@ -4,3 +4,79 @@ #define DI SI typedef int QItype __attribute__ ((mode (QI))); #endif + +/* fixed-bit.h does not define functions for TA and UTA because + that part is wrapped in #if MIN_UNITS_PER_WORD > 4. + This would lead to empty functions for TA and UTA. + Thus, supply appropriate defines as if HAVE_[U]TA == 1. + #define HAVE_[U]TA 1 won't work because avr-modes.def + uses ADJUST_BYTESIZE(TA,8) and fixed-bit.h is not generic enough + to arrange for such changes of the mode size. */ + +typedef unsigned _Fract UTAtype __attribute__ ((mode (UTA))); + +#if defined (UTA_MODE) +#define FIXED_SIZE 8 /* in bytes */ +#define INT_C_TYPE UDItype +#define UINT_C_TYPE UDItype +#define HINT_C_TYPE USItype +#define HUINT_C_TYPE USItype +#define MODE_NAME UTA +#define MODE_NAME_S uta +#define MODE_UNSIGNED 1 +#endif + +#if defined (FROM_UTA) +#define FROM_TYPE 4 /* ID for fixed-point */ +#define FROM_MODE_NAME UTA +#define FROM_MODE_NAME_S uta +#define FROM_INT_C_TYPE UDItype +#define FROM_SINT_C_TYPE DItype +#define FROM_UINT_C_TYPE UDItype +#define FROM_MODE_UNSIGNED 1 +#define FROM_FIXED_SIZE 8 /* in bytes */ +#elif defined (TO_UTA) +#define TO_TYPE 4 /* ID for fixed-point */ +#define TO_MODE_NAME UTA +#define TO_MODE_NAME_S uta +#define TO_INT_C_TYPE UDItype +#define TO_SINT_C_TYPE DItype +#define TO_UINT_C_TYPE UDItype +#define TO_MODE_UNSIGNED 1 +#define TO_FIXED_SIZE 8 /* in bytes */ +#endif + +/* Same for TAmode */ + +typedef _Fract TAtype __attribute__ ((mode (TA))); + +#if defined (TA_MODE) +#define FIXED_SIZE 8 /* in bytes */ +#define INT_C_TYPE DItype +#define UINT_C_TYPE UDItype +#define HINT_C_TYPE SItype +#define HUINT_C_TYPE USItype +#define MODE_NAME TA +#define MODE_NAME_S ta +#define MODE_UNSIGNED 0 +#endif + +#if defined (FROM_TA) +#define FROM_TYPE 4 /* ID for fixed-point */ +#define FROM_MODE_NAME TA +#define FROM_MODE_NAME_S ta +#define FROM_INT_C_TYPE DItype +#define FROM_SINT_C_TYPE DItype +#define FROM_UINT_C_TYPE UDItype +#define FROM_MODE_UNSIGNED 0 +#define FROM_FIXED_SIZE 8 /* in bytes */ +#elif defined (TO_TA) +#define TO_TYPE 4 /* ID for fixed-point */ +#define TO_MODE_NAME TA +#define TO_MODE_NAME_S ta +#define TO_INT_C_TYPE DItype +#define TO_SINT_C_TYPE DItype +#define TO_UINT_C_TYPE UDItype +#define TO_MODE_UNSIGNED 0 +#define TO_FIXED_SIZE 8 /* in bytes */ +#endif diff --git a/libgcc/config/avr/lib1funcs-fixed.S b/libgcc/config/avr/lib1funcs-fixed.S new file mode 100644 index 00000000000..c1aff53d5fd --- /dev/null +++ b/libgcc/config/avr/lib1funcs-fixed.S @@ -0,0 +1,874 @@ +/* -*- Mode: Asm -*- */ +;; Copyright (C) 2012 +;; Free Software Foundation, Inc. 
+;; Contributed by Sean D'Epagnier (sean@depagnier.com) +;; Georg-Johann Lay (avr@gjlay.de) + +;; This file is free software; you can redistribute it and/or modify it +;; under the terms of the GNU General Public License as published by the +;; Free Software Foundation; either version 3, or (at your option) any +;; later version. + +;; In addition to the permissions in the GNU General Public License, the +;; Free Software Foundation gives you unlimited permission to link the +;; compiled version of this file into combinations with other programs, +;; and to distribute those combinations without any restriction coming +;; from the use of this file. (The General Public License restrictions +;; do apply in other respects; for example, they cover modification of +;; the file, and distribution when not linked into a combine +;; executable.) + +;; This file is distributed in the hope that it will be useful, but +;; WITHOUT ANY WARRANTY; without even the implied warranty of +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +;; General Public License for more details. + +;; You should have received a copy of the GNU General Public License +;; along with this program; see the file COPYING. If not, write to +;; the Free Software Foundation, 51 Franklin Street, Fifth Floor, +;; Boston, MA 02110-1301, USA. + +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; +;; Fixed point library routines for AVR +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; + +.section .text.libgcc.fixed, "ax", @progbits + +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; +;; Conversions to float +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; + +#if defined (L_fractqqsf) +DEFUN __fractqqsf + ;; Move in place for SA -> SF conversion + clr r22 + mov r23, r24 + lsl r23 + ;; Sign-extend + sbc r24, r24 + mov r25, r24 + XJMP __fractsasf +ENDF __fractqqsf +#endif /* L_fractqqsf */ + +#if defined (L_fractuqqsf) +DEFUN __fractuqqsf + ;; Move in place for USA -> SF conversion + clr r22 + mov r23, r24 + ;; Zero-extend + clr r24 + clr r25 + XJMP __fractusasf +ENDF __fractuqqsf +#endif /* L_fractuqqsf */ + +#if defined (L_fracthqsf) +DEFUN __fracthqsf + ;; Move in place for SA -> SF conversion + wmov 22, 24 + lsl r22 + rol r23 + ;; Sign-extend + sbc r24, r24 + mov r25, r24 + XJMP __fractsasf +ENDF __fracthqsf +#endif /* L_fracthqsf */ + +#if defined (L_fractuhqsf) +DEFUN __fractuhqsf + ;; Move in place for USA -> SF conversion + wmov 22, 24 + ;; Zero-extend + clr r24 + clr r25 + XJMP __fractusasf +ENDF __fractuhqsf +#endif /* L_fractuhqsf */ + +#if defined (L_fracthasf) +DEFUN __fracthasf + ;; Move in place for SA -> SF conversion + clr r22 + mov r23, r24 + mov r24, r25 + ;; Sign-extend + lsl r25 + sbc r25, r25 + XJMP __fractsasf +ENDF __fracthasf +#endif /* L_fracthasf */ + +#if defined (L_fractuhasf) +DEFUN __fractuhasf + ;; Move in place for USA -> SF conversion + clr r22 + mov r23, r24 + mov r24, r25 + ;; Zero-extend + clr r25 + XJMP __fractusasf +ENDF __fractuhasf +#endif /* L_fractuhasf */ + + +#if defined (L_fractsqsf) +DEFUN __fractsqsf + XCALL __floatsisf + ;; Divide non-zero results by 2^31 to move the + ;; decimal point into place + tst r25 + breq 0f + subi r24, exp_lo (31) + sbci r25, exp_hi (31) +0: ret +ENDF __fractsqsf +#endif /* L_fractsqsf */ + +#if defined (L_fractusqsf) +DEFUN __fractusqsf + XCALL __floatunsisf + ;; Divide non-zero results by 2^32 to move the + ;; decimal point into place + cpse r25, __zero_reg__ + subi r25, exp_hi (32) + ret +ENDF __fractusqsf +#endif /* L_fractusqsf */ + +#if defined 
(L_fractsasf) +DEFUN __fractsasf + XCALL __floatsisf + ;; Divide non-zero results by 2^16 to move the + ;; decimal point into place + cpse r25, __zero_reg__ + subi r25, exp_hi (16) + ret +ENDF __fractsasf +#endif /* L_fractsasf */ + +#if defined (L_fractusasf) +DEFUN __fractusasf + XCALL __floatunsisf + ;; Divide non-zero results by 2^16 to move the + ;; decimal point into place + cpse r25, __zero_reg__ + subi r25, exp_hi (16) + ret +ENDF __fractusasf +#endif /* L_fractusasf */ + +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; +;; Conversions from float +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; + +#if defined (L_fractsfqq) +DEFUN __fractsfqq + ;; Multiply with 2^{24+7} to get a QQ result in r25 + subi r24, exp_lo (-31) + sbci r25, exp_hi (-31) + XCALL __fixsfsi + mov r24, r25 + ret +ENDF __fractsfqq +#endif /* L_fractsfqq */ + +#if defined (L_fractsfuqq) +DEFUN __fractsfuqq + ;; Multiply with 2^{24+8} to get a UQQ result in r25 + subi r25, exp_hi (-32) + XCALL __fixunssfsi + mov r24, r25 + ret +ENDF __fractsfuqq +#endif /* L_fractsfuqq */ + +#if defined (L_fractsfha) +DEFUN __fractsfha + ;; Multiply with 2^24 to get a HA result in r25:r24 + subi r25, exp_hi (-24) + XJMP __fixsfsi +ENDF __fractsfha +#endif /* L_fractsfha */ + +#if defined (L_fractsfuha) +DEFUN __fractsfuha + ;; Multiply with 2^24 to get a UHA result in r25:r24 + subi r25, exp_hi (-24) + XJMP __fixunssfsi +ENDF __fractsfuha +#endif /* L_fractsfuha */ + +#if defined (L_fractsfhq) +DEFUN __fractsfsq +ENDF __fractsfsq + +DEFUN __fractsfhq + ;; Multiply with 2^{16+15} to get a HQ result in r25:r24 + ;; resp. with 2^31 to get a SQ result in r25:r22 + subi r24, exp_lo (-31) + sbci r25, exp_hi (-31) + XJMP __fixsfsi +ENDF __fractsfhq +#endif /* L_fractsfhq */ + +#if defined (L_fractsfuhq) +DEFUN __fractsfusq +ENDF __fractsfusq + +DEFUN __fractsfuhq + ;; Multiply with 2^{16+16} to get a UHQ result in r25:r24 + ;; resp. with 2^32 to get a USQ result in r25:r22 + subi r25, exp_hi (-32) + XJMP __fixunssfsi +ENDF __fractsfuhq +#endif /* L_fractsfuhq */ + +#if defined (L_fractsfsa) +DEFUN __fractsfsa + ;; Multiply with 2^16 to get a SA result in r25:r22 + subi r25, exp_hi (-16) + XJMP __fixsfsi +ENDF __fractsfsa +#endif /* L_fractsfsa */ + +#if defined (L_fractsfusa) +DEFUN __fractsfusa + ;; Multiply with 2^16 to get a USA result in r25:r22 + subi r25, exp_hi (-16) + XJMP __fixunssfsi +ENDF __fractsfusa +#endif /* L_fractsfusa */ + + +;; For multiplication the functions here are called directly from +;; avr-fixed.md instead of using the standard libcall mechanisms. +;; This can make better code because GCC knows exactly which +;; of the call-used registers (not all of them) are clobbered. */ + +/******************************************************* + Fractional Multiplication 8 x 8 without MUL +*******************************************************/ + +#if defined (L_mulqq3) && !defined (__AVR_HAVE_MUL__) +;;; R23 = R24 * R25 +;;; Clobbers: __tmp_reg__, R22, R24, R25 +;;; Rounding: ??? +DEFUN __mulqq3 + XCALL __fmuls + ;; TR 18037 requires that (-1) * (-1) does not overflow + ;; The only input that can produce -1 is (-1)^2. + dec r23 + brvs 0f + inc r23 +0: ret +ENDF __mulqq3 +#endif /* L_mulqq3 && ! 
HAVE_MUL */ + +/******************************************************* + Fractional Multiply .16 x .16 with and without MUL +*******************************************************/ + +#if defined (L_mulhq3) +;;; Same code with and without MUL, but the interfaces differ: +;;; no MUL: (R25:R24) = (R22:R23) * (R24:R25) +;;; Clobbers: ABI, called by optabs +;;; MUL: (R25:R24) = (R19:R18) * (R27:R26) +;;; Clobbers: __tmp_reg__, R22, R23 +;;; Rounding: -0.5 LSB <= error <= 0.5 LSB +DEFUN __mulhq3 + XCALL __mulhisi3 + ;; Shift result into place + lsl r23 + rol r24 + rol r25 + brvs 1f + ;; Round + sbrc r23, 7 + adiw r24, 1 + ret +1: ;; Overflow. TR 18037 requires (-1)^2 not to overflow + ldi r24, lo8 (0x7fff) + ldi r25, hi8 (0x7fff) + ret +ENDF __mulhq3 +#endif /* defined (L_mulhq3) */ + +#if defined (L_muluhq3) +;;; Same code with and without MUL, but the interfaces differ: +;;; no MUL: (R25:R24) *= (R23:R22) +;;; Clobbers: ABI, called by optabs +;;; MUL: (R25:R24) = (R19:R18) * (R27:R26) +;;; Clobbers: __tmp_reg__, R22, R23 +;;; Rounding: -0.5 LSB < error <= 0.5 LSB +DEFUN __muluhq3 + XCALL __umulhisi3 + ;; Round + sbrc r23, 7 + adiw r24, 1 + ret +ENDF __muluhq3 +#endif /* L_muluhq3 */ + + +/******************************************************* + Fixed Multiply 8.8 x 8.8 with and without MUL +*******************************************************/ + +#if defined (L_mulha3) +;;; Same code with and without MUL, but the interfaces differ: +;;; no MUL: (R25:R24) = (R22:R23) * (R24:R25) +;;; Clobbers: ABI, called by optabs +;;; MUL: (R25:R24) = (R19:R18) * (R27:R26) +;;; Clobbers: __tmp_reg__, R22, R23 +;;; Rounding: -0.5 LSB <= error <= 0.5 LSB +DEFUN __mulha3 + XCALL __mulhisi3 + XJMP __muluha3_round +ENDF __mulha3 +#endif /* L_mulha3 */ + +#if defined (L_muluha3) +;;; Same code with and without MUL, but the interfaces differ: +;;; no MUL: (R25:R24) *= (R23:R22) +;;; Clobbers: ABI, called by optabs +;;; MUL: (R25:R24) = (R19:R18) * (R27:R26) +;;; Clobbers: __tmp_reg__, R22, R23 +;;; Rounding: -0.5 LSB < error <= 0.5 LSB +DEFUN __muluha3 + XCALL __umulhisi3 + XJMP __muluha3_round +ENDF __muluha3 +#endif /* L_muluha3 */ + +#if defined (L_muluha3_round) +DEFUN __muluha3_round + ;; Shift result into place + mov r25, r24 + mov r24, r23 + ;; Round + sbrc r22, 7 + adiw r24, 1 + ret +ENDF __muluha3_round +#endif /* L_muluha3_round */ + + +/******************************************************* + Fixed Multiplication 16.16 x 16.16 +*******************************************************/ + +#if defined (__AVR_HAVE_MUL__) + +;; Multiplier +#define A0 16 +#define A1 A0+1 +#define A2 A1+1 +#define A3 A2+1 + +;; Multiplicand +#define B0 20 +#define B1 B0+1 +#define B2 B1+1 +#define B3 B2+1 + +;; Result +#define C0 24 +#define C1 C0+1 +#define C2 C1+1 +#define C3 C2+1 + +#if defined (L_mulusa3) +;;; (C3:C0) = (A3:A0) * (B3:B0) +;;; Clobbers: __tmp_reg__ +;;; Rounding: -0.5 LSB < error <= 0.5 LSB +DEFUN __mulusa3 + ;; Some of the MUL instructions have LSBs outside the result. + ;; Don't ignore these LSBs in order to tame rounding error. + ;; Use C2/C3 for these LSBs. + + clr C0 + clr C1 + mul A0, B0 $ movw C2, r0 + + mul A1, B0 $ add C3, r0 $ adc C0, r1 + mul A0, B1 $ add C3, r0 $ adc C0, r1 $ rol C1 + + ;; Round + sbrc C3, 7 + adiw C0, 1 + + ;; The following MULs don't have LSBs outside the result. + ;; C2/C3 is the high part. 
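+ ;; A note on the carry bookkeeping in the rows below: "sbc CN, CN"
+ ;; leaves -carry in CN, each following "sbci CN, 0" subtracts one
+ ;; more carry, and "neg CN" turns the accumulated -(number of
+ ;; carries) into the positive count that belongs one byte further
+ ;; up. No separate carry register is needed.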
+
+ mul A0, B2 $ add C0, r0 $ adc C1, r1 $ sbc C2, C2
+ mul A1, B1 $ add C0, r0 $ adc C1, r1 $ sbci C2, 0
+ mul A2, B0 $ add C0, r0 $ adc C1, r1 $ sbci C2, 0
+ neg C2
+
+ mul A0, B3 $ add C1, r0 $ adc C2, r1 $ sbc C3, C3
+ mul A1, B2 $ add C1, r0 $ adc C2, r1 $ sbci C3, 0
+ mul A2, B1 $ add C1, r0 $ adc C2, r1 $ sbci C3, 0
+ mul A3, B0 $ add C1, r0 $ adc C2, r1 $ sbci C3, 0
+ neg C3
+
+ mul A1, B3 $ add C2, r0 $ adc C3, r1
+ mul A2, B2 $ add C2, r0 $ adc C3, r1
+ mul A3, B1 $ add C2, r0 $ adc C3, r1
+
+ mul A2, B3 $ add C3, r0
+ mul A3, B2 $ add C3, r0
+
+ clr __zero_reg__
+ ret
+ENDF __mulusa3
+#endif /* L_mulusa3 */
+
+#if defined (L_mulsa3)
+;;; (C3:C0) = (A3:A0) * (B3:B0)
+;;; Clobbers: __tmp_reg__
+;;; Rounding: -0.5 LSB <= error <= 0.5 LSB
+DEFUN __mulsa3
+ XCALL __mulusa3
+ tst B3
+ brpl 1f
+ sub C2, A0
+ sbc C3, A1
+1: sbrs A3, 7
+ ret
+ sub C2, B0
+ sbc C3, B1
+ ret
+ENDF __mulsa3
+#endif /* L_mulsa3 */
+
+#undef A0
+#undef A1
+#undef A2
+#undef A3
+#undef B0
+#undef B1
+#undef B2
+#undef B3
+#undef C0
+#undef C1
+#undef C2
+#undef C3
+
+#else /* __AVR_HAVE_MUL__ */
+
+#define A0 18
+#define A1 A0+1
+#define A2 A0+2
+#define A3 A0+3
+
+#define B0 22
+#define B1 B0+1
+#define B2 B0+2
+#define B3 B0+3
+
+#define C0 22
+#define C1 C0+1
+#define C2 C0+2
+#define C3 C0+3
+
+;; __tmp_reg__
+#define CC0 0
+;; __zero_reg__
+#define CC1 1
+#define CC2 16
+#define CC3 17
+
+#define AA0 26
+#define AA1 AA0+1
+#define AA2 30
+#define AA3 AA2+1
+
+#if defined (L_mulsa3)
+;;; (R25:R22) *= (R21:R18)
+;;; Clobbers: ABI, called by optabs
+;;; Rounding: -1 LSB <= error <= 1 LSB
+DEFUN __mulsa3
+ push B0
+ push B1
+ bst B3, 7
+ XCALL __mulusa3
+ ;; A survived in 31:30:27:26
+ rcall 1f
+ pop AA1
+ pop AA0
+ bst AA3, 7
+1: brtc 9f
+ ;; 1-extend A/B
+ sub C2, AA0
+ sbc C3, AA1
+9: ret
+ENDF __mulsa3
+#endif /* L_mulsa3 */
+
+#if defined (L_mulusa3)
+;;; (R25:R22) *= (R21:R18)
+;;; Clobbers: ABI, called by optabs and __mulsa3
+;;; Rounding: -1 LSB <= error <= 1 LSB
+;;; Does not clobber T and A[] survives in 26, 27, 30, 31
+DEFUN __mulusa3
+ push CC2
+ push CC3
+ ; clear result
+ clr __tmp_reg__
+ wmov CC2, CC0
+ ; save multiplicand
+ wmov AA0, A0
+ wmov AA2, A2
+ rjmp 3f
+
+ ;; Loop the integral part
+
+1: ;; CC += A * 2^n; n >= 0
+ add CC0,A0 $ adc CC1,A1 $ adc CC2,A2 $ adc CC3,A3
+
+2: ;; A <<= 1
+ lsl A0 $ rol A1 $ rol A2 $ rol A3
+
+3: ;; IBIT(B) >>= 1
+ ;; Carry = n-th bit of B; n >= 0
+ lsr B3
+ ror B2
+ brcs 1b
+ sbci B3, 0
+ brne 2b
+
+ ;; Loop the fractional part
+ ;; B2/B3 is 0 now, use as guard bits for rounding
+ ;; Restore multiplicand
+ wmov A0, AA0
+ wmov A2, AA2
+ rjmp 5f
+
+4: ;; CC += A:Guard * 2^n; n < 0
+ add B3,B2 $ adc CC0,A0 $ adc CC1,A1 $ adc CC2,A2 $ adc CC3,A3
+5:
+ ;; A:Guard >>= 1
+ lsr A3 $ ror A2 $ ror A1 $ ror A0 $ ror B2
+
+ ;; FBIT(B) <<= 1
+ ;; Carry = n-th bit of B; n < 0
+ lsl B0
+ rol B1
+ brcs 4b
+ sbci B0, 0
+ brne 5b
+
+ ;; Move result into place and round
+ lsl B3
+ wmov C2, CC2
+ wmov C0, CC0
+ clr __zero_reg__
+ adc C0, __zero_reg__
+ adc C1, __zero_reg__
+ adc C2, __zero_reg__
+ adc C3, __zero_reg__
+
+ ;; Epilogue
+ pop CC3
+ pop CC2
+ ret
+ENDF __mulusa3
+#endif /* L_mulusa3 */
+
+#undef A0
+#undef A1
+#undef A2
+#undef A3
+#undef B0
+#undef B1
+#undef B2
+#undef B3
+#undef C0
+#undef C1
+#undef C2
+#undef C3
+#undef AA0
+#undef AA1
+#undef AA2
+#undef AA3
+#undef CC0
+#undef CC1
+#undef CC2
+#undef CC3
+
+#endif /* __AVR_HAVE_MUL__ */
+
+/*******************************************************
+ Fractional Division 8 / 8
+*******************************************************/
+
+#define r_divd r25 /* dividend */
+#define r_quo r24 /* quotient */
+#define r_div r22 /* divisor */
+
+#if defined (L_divqq3)
+DEFUN __divqq3
+ mov r0, r_divd
+ eor r0, r_div
+ sbrc r_div, 7
+ neg r_div
+ sbrc r_divd, 7
+ neg r_divd
+ cp r_divd, r_div
+ breq __divqq3_minus1 ; if equal return -1
+ XCALL __udivuqq3
+ lsr r_quo
+ sbrc r0, 7 ; negate result if needed
+ neg r_quo
+ ret
+__divqq3_minus1:
+ ldi r_quo, 0x80
+ ret
+ENDF __divqq3
+#endif /* defined (L_divqq3) */
+
+#if defined (L_udivuqq3)
+DEFUN __udivuqq3
+ clr r_quo ; clear quotient
+ inc __zero_reg__ ; init loop counter, used per shift
+__udivuqq3_loop:
+ lsl r_divd ; shift dividend
+ brcs 0f ; dividend overflow
+ cp r_divd,r_div ; compare dividend & divisor
+ brcc 0f ; dividend >= divisor
+ rol r_quo ; shift quotient (with CARRY)
+ rjmp __udivuqq3_cont
+0:
+ sub r_divd,r_div ; restore dividend
+ lsl r_quo ; shift quotient (without CARRY)
+__udivuqq3_cont:
+ lsl __zero_reg__ ; shift loop-counter bit
+ brne __udivuqq3_loop
+ com r_quo ; complement result
+ ; because C flag was complemented in loop
+ ret
+ENDF __udivuqq3
+#endif /* defined (L_udivuqq3) */
+
+#undef r_divd
+#undef r_quo
+#undef r_div
+
+
+/*******************************************************
+ Fractional Division 16 / 16
+*******************************************************/
+#define r_divdL 26 /* dividend Low */
+#define r_divdH 27 /* dividend High */
+#define r_quoL 24 /* quotient Low */
+#define r_quoH 25 /* quotient High */
+#define r_divL 22 /* divisor */
+#define r_divH 23 /* divisor */
+#define r_cnt 21
+
+#if defined (L_divhq3)
+DEFUN __divhq3
+ mov r0, r_divdH
+ eor r0, r_divH
+ sbrs r_divH, 7
+ rjmp 1f
+ NEG2 r_divL
+1:
+ sbrs r_divdH, 7
+ rjmp 2f
+ NEG2 r_divdL
+2:
+ cp r_divdL, r_divL
+ cpc r_divdH, r_divH
+ breq __divhq3_minus1 ; if equal return -1
+ XCALL __udivuhq3
+ lsr r_quoH
+ ror r_quoL
+ brpl 9f
+ ;; negate result if needed
+ NEG2 r_quoL
+9:
+ ret
+__divhq3_minus1:
+ ldi r_quoH, 0x80
+ clr r_quoL
+ ret
+ENDF __divhq3
+#endif /* defined (L_divhq3) */
+
+#if defined (L_udivuhq3)
+DEFUN __udivuhq3
+ sub r_quoH,r_quoH ; clear quotient and carry
+ ;; FALLTHRU
+ENDF __udivuhq3
+
+DEFUN __udivuha3_common
+ clr r_quoL ; clear quotient
+ ldi r_cnt,16 ; init loop counter
+__udivuhq3_loop:
+ rol r_divdL ; shift dividend (with CARRY)
+ rol r_divdH
+ brcs __udivuhq3_ep ; dividend overflow
+ cp r_divdL,r_divL ; compare dividend & divisor
+ cpc r_divdH,r_divH
+ brcc __udivuhq3_ep ; dividend >= divisor
+ rol r_quoL ; shift quotient (with CARRY)
+ rjmp __udivuhq3_cont
+__udivuhq3_ep:
+ sub r_divdL,r_divL ; restore dividend
+ sbc r_divdH,r_divH
+ lsl r_quoL ; shift quotient (without CARRY)
+__udivuhq3_cont:
+ rol r_quoH ; shift quotient
+ dec r_cnt ; decrement loop counter
+ brne __udivuhq3_loop
+ com r_quoL ; complement result
+ com r_quoH ; because C flag was complemented in loop
+ ret
+ENDF __udivuha3_common
+#endif /* defined (L_udivuhq3) */
+
+/*******************************************************
+ Fixed Division 8.8 / 8.8
+*******************************************************/
+#if defined (L_divha3)
+DEFUN __divha3
+ mov r0, r_divdH
+ eor r0, r_divH
+ sbrs r_divH, 7
+ rjmp 1f
+ NEG2 r_divL
+1:
+ sbrs r_divdH, 7
+ rjmp 2f
+ NEG2 r_divdL
+2:
+ XCALL __udivuha3
+ sbrs r0, 7 ; negate result if needed
+ ret
+ NEG2 r_quoL
+ ret
+ENDF __divha3
+#endif /* defined (L_divha3) */
+
+#if defined (L_udivuha3)
+DEFUN __udivuha3
+ mov r_quoH, r_divdL
+ mov r_divdL, r_divdH
+ clr r_divdH
+ lsl r_quoH
; shift quotient into carry + XJMP __udivuha3_common ; same as fractional after rearrange +ENDF __udivuha3 +#endif /* defined (L_udivuha3) */ + +#undef r_divdL +#undef r_divdH +#undef r_quoL +#undef r_quoH +#undef r_divL +#undef r_divH +#undef r_cnt + +/******************************************************* + Fixed Division 16.16 / 16.16 +*******************************************************/ + +#define r_arg1L 24 /* arg1 gets passed already in place */ +#define r_arg1H 25 +#define r_arg1HL 26 +#define r_arg1HH 27 +#define r_divdL 26 /* dividend Low */ +#define r_divdH 27 +#define r_divdHL 30 +#define r_divdHH 31 /* dividend High */ +#define r_quoL 22 /* quotient Low */ +#define r_quoH 23 +#define r_quoHL 24 +#define r_quoHH 25 /* quotient High */ +#define r_divL 18 /* divisor Low */ +#define r_divH 19 +#define r_divHL 20 +#define r_divHH 21 /* divisor High */ +#define r_cnt __zero_reg__ /* loop count (0 after the loop!) */ + +#if defined (L_divsa3) +DEFUN __divsa3 + mov r0, r_arg1HH + eor r0, r_divHH + sbrs r_divHH, 7 + rjmp 1f + NEG4 r_divL +1: + sbrs r_arg1HH, 7 + rjmp 2f + NEG4 r_arg1L +2: + XCALL __udivusa3 + sbrs r0, 7 ; negate result if needed + ret + NEG4 r_quoL + ret +ENDF __divsa3 +#endif /* defined (L_divsa3) */ + +#if defined (L_udivusa3) +DEFUN __udivusa3 + ldi r_divdHL, 32 ; init loop counter + mov r_cnt, r_divdHL + clr r_divdHL + clr r_divdHH + wmov r_quoL, r_divdHL + lsl r_quoHL ; shift quotient into carry + rol r_quoHH +__udivusa3_loop: + rol r_divdL ; shift dividend (with CARRY) + rol r_divdH + rol r_divdHL + rol r_divdHH + brcs __udivusa3_ep ; dividend overflow + cp r_divdL,r_divL ; compare dividend & divisor + cpc r_divdH,r_divH + cpc r_divdHL,r_divHL + cpc r_divdHH,r_divHH + brcc __udivusa3_ep ; dividend >= divisor + rol r_quoL ; shift quotient (with CARRY) + rjmp __udivusa3_cont +__udivusa3_ep: + sub r_divdL,r_divL ; restore dividend + sbc r_divdH,r_divH + sbc r_divdHL,r_divHL + sbc r_divdHH,r_divHH + lsl r_quoL ; shift quotient (without CARRY) +__udivusa3_cont: + rol r_quoH ; shift quotient + rol r_quoHL + rol r_quoHH + dec r_cnt ; decrement loop counter + brne __udivusa3_loop + com r_quoL ; complement result + com r_quoH ; because C flag was complemented in loop + com r_quoHL + com r_quoHH + ret +ENDF __udivusa3 +#endif /* defined (L_udivusa3) */ + +#undef r_arg1L +#undef r_arg1H +#undef r_arg1HL +#undef r_arg1HH +#undef r_divdL +#undef r_divdH +#undef r_divdHL +#undef r_divdHH +#undef r_quoL +#undef r_quoH +#undef r_quoHL +#undef r_quoHH +#undef r_divL +#undef r_divH +#undef r_divHL +#undef r_divHH +#undef r_cnt diff --git a/libgcc/config/avr/lib1funcs.S b/libgcc/config/avr/lib1funcs.S index 95a7d3d4eeb..6b9879ee7d7 100644 --- a/libgcc/config/avr/lib1funcs.S +++ b/libgcc/config/avr/lib1funcs.S @@ -91,6 +91,35 @@ see the files COPYING3 and COPYING.RUNTIME respectively. 
If not, see .endfunc .endm +;; Negate a 2-byte value held in consecutive registers +.macro NEG2 reg + com \reg+1 + neg \reg + sbci \reg+1, -1 +.endm + +;; Negate a 4-byte value held in consecutive registers +.macro NEG4 reg + com \reg+3 + com \reg+2 + com \reg+1 +.if \reg >= 16 + neg \reg + sbci \reg+1, -1 + sbci \reg+2, -1 + sbci \reg+3, -1 +.else + com \reg + adc \reg, __zero_reg__ + adc \reg+1, __zero_reg__ + adc \reg+2, __zero_reg__ + adc \reg+3, __zero_reg__ +.endif +.endm + +#define exp_lo(N) hlo8 ((N) << 23) +#define exp_hi(N) hhi8 ((N) << 23) + .section .text.libgcc.mul, "ax", @progbits @@ -126,175 +155,246 @@ ENDF __mulqi3 #endif /* defined (L_mulqi3) */ -#if defined (L_mulqihi3) -DEFUN __mulqihi3 - clr r25 - sbrc r24, 7 - dec r25 - clr r23 - sbrc r22, 7 - dec r22 - XJMP __mulhi3 -ENDF __mulqihi3: -#endif /* defined (L_mulqihi3) */ - -#if defined (L_umulqihi3) -DEFUN __umulqihi3 - clr r25 - clr r23 - XJMP __mulhi3 -ENDF __umulqihi3 -#endif /* defined (L_umulqihi3) */ /******************************************************* + Widening Multiplication 16 = 8 x 8 without MUL Multiplication 16 x 16 without MUL *******************************************************/ + +#define A0 r22 +#define A1 r23 +#define B0 r24 +#define BB0 r20 +#define B1 r25 +;; Output overlaps input, thus expand result in CC0/1 +#define C0 r24 +#define C1 r25 +#define CC0 __tmp_reg__ +#define CC1 R21 + +#if defined (L_umulqihi3) +;;; R25:R24 = (unsigned int) R22 * (unsigned int) R24 +;;; (C1:C0) = (unsigned int) A0 * (unsigned int) B0 +;;; Clobbers: __tmp_reg__, R21..R23 +DEFUN __umulqihi3 + clr A1 + clr B1 + XJMP __mulhi3 +ENDF __umulqihi3 +#endif /* L_umulqihi3 */ + +#if defined (L_mulqihi3) +;;; R25:R24 = (signed int) R22 * (signed int) R24 +;;; (C1:C0) = (signed int) A0 * (signed int) B0 +;;; Clobbers: __tmp_reg__, R20..R23 +DEFUN __mulqihi3 + ;; Sign-extend B0 + clr B1 + sbrc B0, 7 + com B1 + ;; The multiplication runs twice as fast if A1 is zero, thus: + ;; Zero-extend A0 + clr A1 +#ifdef __AVR_HAVE_JMP_CALL__ + ;; Store B0 * sign of A + clr BB0 + sbrc A0, 7 + mov BB0, B0 + call __mulhi3 +#else /* have no CALL */ + ;; Skip sign-extension of A if A >= 0 + ;; Same size as with the first alternative but avoids errata skip + ;; and is faster if A >= 0 + sbrs A0, 7 + rjmp __mulhi3 + ;; If A < 0 store B + mov BB0, B0 + rcall __mulhi3 +#endif /* HAVE_JMP_CALL */ + ;; 1-extend A after the multiplication + sub C1, BB0 + ret +ENDF __mulqihi3 +#endif /* L_mulqihi3 */ + #if defined (L_mulhi3) -#define r_arg1L r24 /* multiplier Low */ -#define r_arg1H r25 /* multiplier High */ -#define r_arg2L r22 /* multiplicand Low */ -#define r_arg2H r23 /* multiplicand High */ -#define r_resL __tmp_reg__ /* result Low */ -#define r_resH r21 /* result High */ - +;;; R25:R24 = R23:R22 * R25:R24 +;;; (C1:C0) = (A1:A0) * (B1:B0) +;;; Clobbers: __tmp_reg__, R21..R23 DEFUN __mulhi3 - clr r_resH ; clear result - clr r_resL ; clear result -__mulhi3_loop: - sbrs r_arg1L,0 - rjmp __mulhi3_skip1 - add r_resL,r_arg2L ; result + multiplicand - adc r_resH,r_arg2H -__mulhi3_skip1: - add r_arg2L,r_arg2L ; shift multiplicand - adc r_arg2H,r_arg2H - cp r_arg2L,__zero_reg__ - cpc r_arg2H,__zero_reg__ - breq __mulhi3_exit ; while multiplicand != 0 + ;; Clear result + clr CC0 + clr CC1 + rjmp 3f +1: + ;; Bit n of A is 1 --> C += B << n + add CC0, B0 + adc CC1, B1 +2: + lsl B0 + rol B1 +3: + ;; If B == 0 we are ready + sbiw B0, 0 + breq 9f - lsr r_arg1H ; gets LSB of multiplier - ror r_arg1L - sbiw r_arg1L,0 - brne __mulhi3_loop ; exit if multiplier = 0 
-__mulhi3_exit: - mov r_arg1H,r_resH ; result to return register - mov r_arg1L,r_resL - ret -ENDF __mulhi3 + ;; Carry = n-th bit of A + lsr A1 + ror A0 + ;; If bit n of A is set, then go add B * 2^n to C + brcs 1b -#undef r_arg1L -#undef r_arg1H -#undef r_arg2L -#undef r_arg2H -#undef r_resL -#undef r_resH + ;; Carry = 0 --> The ROR above acts like CP A0, 0 + ;; Thus, it is sufficient to CPC the high part to test A against 0 + cpc A1, __zero_reg__ + ;; Only proceed if A != 0 + brne 2b +9: + ;; Move Result into place + mov C0, CC0 + mov C1, CC1 + ret +ENDF __mulhi3 +#endif /* L_mulhi3 */ -#endif /* defined (L_mulhi3) */ +#undef A0 +#undef A1 +#undef B0 +#undef BB0 +#undef B1 +#undef C0 +#undef C1 +#undef CC0 +#undef CC1 + + +#define A0 22 +#define A1 A0+1 +#define A2 A0+2 +#define A3 A0+3 + +#define B0 18 +#define B1 B0+1 +#define B2 B0+2 +#define B3 B0+3 + +#define CC0 26 +#define CC1 CC0+1 +#define CC2 30 +#define CC3 CC2+1 + +#define C0 22 +#define C1 C0+1 +#define C2 C0+2 +#define C3 C0+3 /******************************************************* Widening Multiplication 32 = 16 x 16 without MUL *******************************************************/ -#if defined (L_mulhisi3) -DEFUN __mulhisi3 -;;; FIXME: This is dead code (noone calls it) - mov_l r18, r24 - mov_h r19, r25 - clr r24 - sbrc r23, 7 - dec r24 - mov r25, r24 - clr r20 - sbrc r19, 7 - dec r20 - mov r21, r20 - XJMP __mulsi3 -ENDF __mulhisi3 -#endif /* defined (L_mulhisi3) */ - #if defined (L_umulhisi3) DEFUN __umulhisi3 -;;; FIXME: This is dead code (noone calls it) - mov_l r18, r24 - mov_h r19, r25 - clr r24 - clr r25 - mov_l r20, r24 - mov_h r21, r25 + wmov B0, 24 + ;; Zero-extend B + clr B2 + clr B3 + ;; Zero-extend A + wmov A2, B2 XJMP __mulsi3 ENDF __umulhisi3 -#endif /* defined (L_umulhisi3) */ +#endif /* L_umulhisi3 */ + +#if defined (L_mulhisi3) +DEFUN __mulhisi3 + wmov B0, 24 + ;; Sign-extend B + lsl r25 + sbc B2, B2 + mov B3, B2 +#ifdef __AVR_ERRATA_SKIP_JMP_CALL__ + ;; Sign-extend A + clr A2 + sbrc A1, 7 + com A2 + mov A3, A2 + XJMP __mulsi3 +#else /* no __AVR_ERRATA_SKIP_JMP_CALL__ */ + ;; Zero-extend A and __mulsi3 will run at least twice as fast + ;; compared to a sign-extended A. + clr A2 + clr A3 + sbrs A1, 7 + XJMP __mulsi3 + ;; If A < 0 then perform the B * 0xffff.... before the + ;; very multiplication by initializing the high part of the + ;; result CC with -B. 
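+ ;; To see why this works: read as unsigned, a negative 16-bit A
+ ;; equals A + 0x10000, hence B * A = B * (unsigned A) - (B << 16).
+ ;; The instructions below start the result's high word CC at -B so
+ ;; that the unsigned product accumulated by __mulsi3_helper comes
+ ;; out properly sign-corrected.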
+ wmov CC2, A2
+ sub CC2, B0
+ sbc CC3, B1
+ XJMP __mulsi3_helper
+#endif /* __AVR_ERRATA_SKIP_JMP_CALL__ */
+ENDF __mulhisi3
+#endif /* L_mulhisi3 */
+
 
-#if defined (L_mulsi3)
 /*******************************************************
 Multiplication 32 x 32 without MUL
 *******************************************************/
-#define r_arg1L r22 /* multiplier Low */
-#define r_arg1H r23
-#define r_arg1HL r24
-#define r_arg1HH r25 /* multiplier High */
-
-#define r_arg2L r18 /* multiplicand Low */
-#define r_arg2H r19
-#define r_arg2HL r20
-#define r_arg2HH r21 /* multiplicand High */
-
-#define r_resL r26 /* result Low */
-#define r_resH r27
-#define r_resHL r30
-#define r_resHH r31 /* result High */
+#if defined (L_mulsi3)
 DEFUN __mulsi3
- clr r_resHH ; clear result
- clr r_resHL ; clear result
- clr r_resH ; clear result
- clr r_resL ; clear result
-__mulsi3_loop:
- sbrs r_arg1L,0
- rjmp __mulsi3_skip1
- add r_resL,r_arg2L ; result + multiplicand
- adc r_resH,r_arg2H
- adc r_resHL,r_arg2HL
- adc r_resHH,r_arg2HH
-__mulsi3_skip1:
- add r_arg2L,r_arg2L ; shift multiplicand
- adc r_arg2H,r_arg2H
- adc r_arg2HL,r_arg2HL
- adc r_arg2HH,r_arg2HH
-
- lsr r_arg1HH ; gets LSB of multiplier
- ror r_arg1HL
- ror r_arg1H
- ror r_arg1L
- brne __mulsi3_loop
- sbiw r_arg1HL,0
- cpc r_arg1H,r_arg1L
- brne __mulsi3_loop ; exit if multiplier = 0
-__mulsi3_exit:
- mov_h r_arg1HH,r_resHH ; result to return register
- mov_l r_arg1HL,r_resHL
- mov_h r_arg1H,r_resH
- mov_l r_arg1L,r_resL
- ret
-ENDF __mulsi3
+ ;; Clear result
+ clr CC2
+ clr CC3
+ ;; FALLTHRU
+ENDF __mulsi3
 
-#undef r_arg1L
-#undef r_arg1H
-#undef r_arg1HL
-#undef r_arg1HH
-
-#undef r_arg2L
-#undef r_arg2H
-#undef r_arg2HL
-#undef r_arg2HH
-
-#undef r_resL
-#undef r_resH
-#undef r_resHL
-#undef r_resHH
+DEFUN __mulsi3_helper
+ clr CC0
+ clr CC1
+ rjmp 3f
-#endif /* defined (L_mulsi3) */
+1: ;; If bit n of A is set, then add B * 2^n to the result in CC
+ ;; CC += B
+ add CC0,B0 $ adc CC1,B1 $ adc CC2,B2 $ adc CC3,B3
+
+2: ;; B <<= 1
+ lsl B0 $ rol B1 $ rol B2 $ rol B3
+
+3: ;; A >>= 1: Carry = n-th bit of A
+ lsr A3 $ ror A2 $ ror A1 $ ror A0
+
+ brcs 1b
+ ;; Only continue if A != 0
+ sbci A1, 0
+ brne 2b
+ sbiw A2, 0
+ brne 2b
+
+ ;; All bits of A are consumed: Copy result to return register C
+ wmov C0, CC0
+ wmov C2, CC2
+ ret
+ENDF __mulsi3_helper
+#endif /* L_mulsi3 */
+
+#undef A0
+#undef A1
+#undef A2
+#undef A3
+#undef B0
+#undef B1
+#undef B2
+#undef B3
+#undef C0
+#undef C1
+#undef C2
+#undef C3
+#undef CC0
+#undef CC1
+#undef CC2
+#undef CC3
 
 #endif /* !defined (__AVR_HAVE_MUL__) */
 
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
@@ -316,7 +416,7 @@ ENDF __mulsi3
 #define C3 C0+3
 
 /*******************************************************
- Widening Multiplication 32 = 16 x 16
+ Widening Multiplication 32 = 16 x 16 with MUL
 *******************************************************/
 
 #if defined (L_mulhisi3)
@@ -364,7 +464,17 @@ DEFUN __umulhisi3
 mul A1, B1
 movw C2, r0
 mul A0, B1
+#ifdef __AVR_HAVE_JMP_CALL__
+ ;; This function is used by many other routines, often multiple times.
+ ;; Therefore, if the flash size is not too limited, avoid the RCALL
+ ;; and invest 6 bytes to speed things up.
+ add C1, r0
+ adc C2, r1
+ clr __zero_reg__
+ adc C3, __zero_reg__
+#else
 rcall 1f
+#endif
 mul A1, B0
1: add C1, r0
 adc C2, r1
@@ -375,7 +485,7 @@ ENDF __umulhisi3
 #endif /* L_umulhisi3 */
 
 /*******************************************************
- Widening Multiplication 32 = 16 x 32
+ Widening Multiplication 32 = 16 x 32 with MUL
 *******************************************************/
 
 #if defined (L_mulshisi3)
@@ -425,7 +535,7 @@ ENDF __muluhisi3
 #endif /* L_muluhisi3 */
 
 /*******************************************************
- Multiplication 32 x 32
+ Multiplication 32 x 32 with MUL
 *******************************************************/
 
 #if defined (L_mulsi3)
@@ -468,7 +578,7 @@ ENDF __mulsi3
 #endif /* __AVR_HAVE_MUL__ */
 
 /*******************************************************
- Multiplication 24 x 24
+ Multiplication 24 x 24 with MUL
 *******************************************************/
 
 #if defined (L_mulpsi3)
@@ -1247,6 +1357,19 @@ __divmodsi4_exit:
 ENDF __divmodsi4
 #endif /* defined (L_divmodsi4) */
 
+#undef r_remHH
+#undef r_remHL
+#undef r_remH
+#undef r_remL
+#undef r_arg1HH
+#undef r_arg1HL
+#undef r_arg1H
+#undef r_arg1L
+#undef r_arg2HH
+#undef r_arg2HL
+#undef r_arg2H
+#undef r_arg2L
+#undef r_cnt
 
 /*******************************************************
 Division 64 / 64
@@ -2757,9 +2880,7 @@ DEFUN __fmulsu_exit
 XJMP __fmul
1: XCALL __fmul
 ;; C = -C iff A0.7 = 1
- com C1
- neg C0
- sbci C1, -1
+ NEG2 C0
 ret
 ENDF __fmulsu_exit
 #endif /* L_fmulsu */
@@ -2794,3 +2915,5 @@ ENDF __fmul
 #undef B1
 #undef C0
 #undef C1
+
+#include "lib1funcs-fixed.S"
diff --git a/libgcc/config/avr/t-avr b/libgcc/config/avr/t-avr
index 43caa94ca2a..6f783cd9d52 100644
--- a/libgcc/config/avr/t-avr
+++ b/libgcc/config/avr/t-avr
@@ -2,6 +2,7 @@ LIB1ASMSRC = avr/lib1funcs.S
 LIB1ASMFUNCS = \
 _mulqi3 \
 _mulhi3 \
+ _mulqihi3 _umulqihi3 \
 _mulpsi3 _mulsqipsi3 \
 _mulhisi3 \
 _umulhisi3 \
@@ -55,6 +56,24 @@ LIB1ASMFUNCS = \
 _cmpdi2 _cmpdi2_s8 \
 _fmul _fmuls _fmulsu
 
+# Fixed point routines in avr/lib1funcs-fixed.S
+LIB1ASMFUNCS += \
+ _fractqqsf _fractuqqsf \
+ _fracthqsf _fractuhqsf _fracthasf _fractuhasf \
+ _fractsasf _fractusasf _fractsqsf _fractusqsf \
+ \
+ _fractsfqq _fractsfuqq \
+ _fractsfhq _fractsfuhq _fractsfha _fractsfuha \
+ _fractsfsa _fractsfusa \
+ _mulqq3 \
+ _mulhq3 _muluhq3 \
+ _mulha3 _muluha3 _muluha3_round \
+ _mulsa3 _mulusa3 \
+ _divqq3 _udivuqq3 \
+ _divhq3 _udivuhq3 \
+ _divha3 _udivuha3 \
+ _divsa3 _udivusa3
+
 LIB2FUNCS_EXCLUDE = \
 _moddi3 _umoddi3 \
 _clz
@@ -81,3 +100,49 @@ libgcc-objects += $(patsubst %,%$(objext),$(hiintfuncs16))
 ifeq ($(enable_shared),yes)
 libgcc-s-objects += $(patsubst %,%_s$(objext),$(hiintfuncs16))
 endif
+
+
+# Filter out supported conversions from fixed-bit.c
+
+conv_XY=$(conv)$(mode1)$(mode2)
+conv_X=$(conv)$(mode)
+
+# Conversions supported by the compiler
+
+convf_modes = QI UQI QQ UQQ \
+ HI UHI HQ UHQ HA UHA \
+ SI USI SQ USQ SA USA \
+ DI UDI DQ UDQ DA UDA \
+ TI UTI TQ UTQ TA UTA
+
+LIB2FUNCS_EXCLUDE += \
+ $(foreach conv,_fract _fractuns,\
+ $(foreach mode1,$(convf_modes),\
+ $(foreach mode2,$(convf_modes),$(conv_XY))))
+
+# Conversions supported by lib1funcs-fixed.S
+
+conv_to_sf_modes = QQ UQQ HQ UHQ HA UHA SQ USQ SA USA
+conv_from_sf_modes = QQ UQQ HQ UHQ HA UHA SA USA
+
+LIB2FUNCS_EXCLUDE += \
+ $(foreach conv,_fract, \
+ $(foreach mode1,$(conv_to_sf_modes), \
+ $(foreach mode2,SF,$(conv_XY))))
+
+LIB2FUNCS_EXCLUDE += \
+ $(foreach conv,_fract,\
+ $(foreach mode1,SF,\
+ $(foreach mode2,$(conv_from_sf_modes),$(conv_XY))))
+
+# Arithmetic
supported by the compiler + +allfix_modes = QQ UQQ HQ UHQ HA UHA SQ USQ SA USA DA UDA DQ UDQ TQ UTQ TA UTA + +LIB2FUNCS_EXCLUDE += \ + $(foreach conv,_add _sub,\ + $(foreach mode,$(allfix_modes),$(conv_X)3)) + +LIB2FUNCS_EXCLUDE += \ + $(foreach conv,_lshr _ashl _ashr _cmp,\ + $(foreach mode,$(allfix_modes),$(conv_X)))
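
As a usage illustration (not part of the patch): with an avr-gcc that carries this change, the ISO/IEC TR 18037 fixed-point types lower to the new libgcc entry points. This is a minimal sketch assuming the default AVR type layouts, where short _Fract maps to QQ, _Fract to HQ, short _Accum to HA and _Accum to SA; the file name and -mmcu flag are only examples.

/* demo.c -- e.g.  avr-gcc -mmcu=atmega168 -Os -S demo.c  */

_Accum
scale (_Accum x, _Accum gain)
{
    /* SA * SA: expands to a call of __mulsa3 (which in turn
       uses __mulusa3).  */
    return x * gain;
}

float
to_float (short _Fract q)
{
    /* QQ -> SF: __fractqqsf, which widens to the SA layout and
       tail-calls __fractsasf.  */
    return (float) q;
}

short _Fract
ratio (short _Fract a, short _Fract b)
{
    /* QQ / QQ: __divqq3 on top of __udivuqq3.  */
    return a / b;
}

Inspecting the generated assembly should show the corresponding XCALL/XJMP-reachable symbols instead of the generic fixed-bit.c libcalls excluded in t-avr above.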