binutils-gdb/gas/doc/c-aarch64.texi
Matthew Malcomson 8382113fdb [binutils][aarch64] Matrix Multiply extension enablement [8/X]
Hi,

This patch is part of a series that adds support for Armv8.6-A
(Matrix Multiply and BFloat16 extensions) to binutils.

This patch introduces the Matrix Multiply (Int8, F32, F64) extensions
to the aarch64 backend.

The following instructions are added: {s/u}mmla, usmmla, {us/su}dot,
fmmla, ld1rob, ld1roh, d1row, ld1rod, uzip{1/2}, trn{1/2}.

Committed on behalf of Mihail Ionescu.

gas/ChangeLog:

2019-11-07  Mihail Ionescu  <mihail.ionescu@arm.com>

	* config/tc-aarch64.c: Add new arch fetures to suppport the mm extension.
	(parse_operands): Add new operand.
	* testsuite/gas/aarch64/i8mm.s: New test.
	* testsuite/gas/aarch64/i8mm.d: New test.
	* testsuite/gas/aarch64/f32mm.s: New test.
	* testsuite/gas/aarch64/f32mm.d: New test.
	* testsuite/gas/aarch64/f64mm.s: New test.
	* testsuite/gas/aarch64/f64mm.d: New test.
	* testsuite/gas/aarch64/sve-movprfx-mm.s: New test.
	* testsuite/gas/aarch64/sve-movprfx-mm.d: New test.

include/ChangeLog:

2019-11-07  Mihail Ionescu  <mihail.ionescu@arm.com>

	* opcode/aarch64.h (AARCH64_FEATURE_I8MM): New.
	(AARCH64_FEATURE_F32MM): New.
	(AARCH64_FEATURE_F64MM): New.
	(AARCH64_OPND_SVE_ADDR_RI_S4x32): New.
	(enum aarch64_insn_class): Add new instruction class "aarch64_misc" for
	instructions that do not require special handling.

opcodes/ChangeLog:

2019-11-07  Mihail Ionescu  <mihail.ionescu@arm.com>

	* aarch64-tbl.h (aarch64_feature_i8mm_sve, aarch64_feature_f32mm_sve,
	aarch64_feature_f64mm_sve, aarch64_feature_i8mm, aarch64_feature_f32mm,
	aarch64_feature_f64mm): New feature sets.
	(INT8MATMUL_INSN, F64MATMUL_SVE_INSN, F64MATMUL_INSN,
	F32MATMUL_SVE_INSN, F32MATMUL_INSN): New macros to define matrix multiply
	instructions.
	(I8MM_SVE, F32MM_SVE, F64MM_SVE, I8MM, F32MM, F64MM): New feature set
	macros.
	(QL_MMLA64, OP_SVE_SBB): New qualifiers.
	(OP_SVE_QQQ): New qualifier.
	(INT8MATMUL_SVE_INSNC, F64MATMUL_SVE_INSNC,
	F32MATMUL_SVE_INSNC): New feature set for bfloat16 instructions to support
	the movprfx constraint.
	(aarch64_opcode_table): Support for SVE_ADDR_RI_S4x32.
	(aarch64_opcode_table): Define new instructions smmla,
	ummla, usmmla, usdot, sudot, fmmla, ld1rob, ld1roh, ld1row, ld1rod
	uzip{1/2}, trn{1/2}.
	* aarch64-opc.c (operand_general_constraint_met_p): Handle
	AARCH64_OPND_SVE_ADDR_RI_S4x32.
	(aarch64_print_operand): Handle AARCH64_OPND_SVE_ADDR_RI_S4x32.
	* aarch64-dis-2.c (aarch64_opcode_lookup_1, aarch64_find_next_opcode):
	Account for new instructions.
	* opcodes/aarch64-asm-2.c (aarch64_insert_operand): Support the new
	S4x32 operand.
	* aarch64-opc-2.c (aarch64_operands): Support the new S4x32 operand.

Regression tested on arm-none-eabi.

Is it ok for trunk?

Regards,
Mihail
2019-11-07 17:11:52 +00:00

535 lines
18 KiB
Plaintext

@c Copyright (C) 2009-2019 Free Software Foundation, Inc.
@c Contributed by ARM Ltd.
@c This is part of the GAS manual.
@c For copying conditions, see the file as.texinfo.
@c man end
@ifset GENERIC
@page
@node AArch64-Dependent
@chapter AArch64 Dependent Features
@end ifset
@ifclear GENERIC
@node Machine Dependencies
@chapter AArch64 Dependent Features
@end ifclear
@cindex AArch64 support
@menu
* AArch64 Options:: Options
* AArch64 Extensions:: Extensions
* AArch64 Syntax:: Syntax
* AArch64 Floating Point:: Floating Point
* AArch64 Directives:: AArch64 Machine Directives
* AArch64 Opcodes:: Opcodes
* AArch64 Mapping Symbols:: Mapping Symbols
@end menu
@node AArch64 Options
@section Options
@cindex AArch64 options (none)
@cindex options for AArch64 (none)
@c man begin OPTIONS
@table @gcctabopt
@cindex @option{-EB} command-line option, AArch64
@item -EB
This option specifies that the output generated by the assembler should
be marked as being encoded for a big-endian processor.
@cindex @option{-EL} command-line option, AArch64
@item -EL
This option specifies that the output generated by the assembler should
be marked as being encoded for a little-endian processor.
@cindex @option{-mabi=} command-line option, AArch64
@item -mabi=@var{abi}
Specify which ABI the source code uses. The recognized arguments
are: @code{ilp32} and @code{lp64}, which decides the generated object
file in ELF32 and ELF64 format respectively. The default is @code{lp64}.
@cindex @option{-mcpu=} command-line option, AArch64
@item -mcpu=@var{processor}[+@var{extension}@dots{}]
This option specifies the target processor. The assembler will issue an error
message if an attempt is made to assemble an instruction which will not execute
on the target processor. The following processor names are recognized:
@code{cortex-a34},
@code{cortex-a35},
@code{cortex-a53},
@code{cortex-a55},
@code{cortex-a57},
@code{cortex-a65},
@code{cortex-a65ae},
@code{cortex-a72},
@code{cortex-a73},
@code{cortex-a75},
@code{cortex-a76},
@code{cortex-a76ae},
@code{cortex-a77},
@code{ares},
@code{exynos-m1},
@code{falkor},
@code{neoverse-n1},
@code{neoverse-e1},
@code{qdf24xx},
@code{saphira},
@code{thunderx},
@code{vulcan},
@code{xgene1}
and
@code{xgene2}.
The special name @code{all} may be used to allow the assembler to accept
instructions valid for any supported processor, including all optional
extensions.
In addition to the basic instruction set, the assembler can be told to
accept, or restrict, various extension mnemonics that extend the
processor. @xref{AArch64 Extensions}.
If some implementations of a particular processor can have an
extension, then then those extensions are automatically enabled.
Consequently, you will not normally have to specify any additional
extensions.
@cindex @option{-march=} command-line option, AArch64
@item -march=@var{architecture}[+@var{extension}@dots{}]
This option specifies the target architecture. The assembler will
issue an error message if an attempt is made to assemble an
instruction which will not execute on the target architecture. The
following architecture names are recognized: @code{armv8-a},
@code{armv8.1-a}, @code{armv8.2-a}, @code{armv8.3-a}, @code{armv8.4-a}
@code{armv8.5-a}, and @code{armv8.6-a}.
If both @option{-mcpu} and @option{-march} are specified, the
assembler will use the setting for @option{-mcpu}. If neither are
specified, the assembler will default to @option{-mcpu=all}.
The architecture option can be extended with the same instruction set
extension options as the @option{-mcpu} option. Unlike
@option{-mcpu}, extensions are not always enabled by default,
@xref{AArch64 Extensions}.
@cindex @code{-mverbose-error} command-line option, AArch64
@item -mverbose-error
This option enables verbose error messages for AArch64 gas. This option
is enabled by default.
@cindex @code{-mno-verbose-error} command-line option, AArch64
@item -mno-verbose-error
This option disables verbose error messages in AArch64 gas.
@end table
@c man end
@node AArch64 Extensions
@section Architecture Extensions
The table below lists the permitted architecture extensions that are
supported by the assembler and the conditions under which they are
automatically enabled.
Multiple extensions may be specified, separated by a @code{+}.
Extension mnemonics may also be removed from those the assembler
accepts. This is done by prepending @code{no} to the option that adds
the extension. Extensions that are removed must be listed after all
extensions that have been added.
Enabling an extension that requires other extensions will
automatically cause those extensions to be enabled. Similarly,
disabling an extension that is required by other extensions will
automatically cause those extensions to be disabled.
@multitable @columnfractions .12 .17 .17 .54
@headitem Extension @tab Minimum Architecture @tab Enabled by default
@tab Description
@item @code{i8mm} @tab ARMv8.2-A @tab ARMv8.6-A or later
@tab Enable Int8 Matrix Multiply extension.
@item @code{f32mm} @tab ARMv8.2-A @tab No
@tab Enable F32 Matrix Multiply extension.
@item @code{f64mm} @tab ARMv8.2-A @tab No
@tab Enable F64 Matrix Multiply extension.
@item @code{bf16} @tab ARMv8.2-A @tab ARMv8.6-A or later
@tab Enable BFloat16 extension.
@item @code{compnum} @tab ARMv8.2-A @tab ARMv8.3-A or later
@tab Enable the complex number SIMD extensions. This implies
@code{fp16} and @code{simd}.
@item @code{crc} @tab ARMv8-A @tab ARMv8.1-A or later
@tab Enable CRC instructions.
@item @code{crypto} @tab ARMv8-A @tab No
@tab Enable cryptographic extensions. This implies @code{fp}, @code{simd}, @code{aes} and @code{sha2}.
@item @code{aes} @tab ARMv8-A @tab No
@tab Enable the AES cryptographic extensions. This implies @code{fp} and @code{simd}.
@item @code{sha2} @tab ARMv8-A @tab No
@tab Enable the SHA2 cryptographic extensions. This implies @code{fp} and @code{simd}.
@item @code{sha3} @tab ARMv8.2-A @tab No
@tab Enable the ARMv8.2-A SHA2 and SHA3 cryptographic extensions. This implies @code{fp}, @code{simd} and @code{sha2}.
@item @code{sm4} @tab ARMv8.2-A @tab No
@tab Enable the ARMv8.2-A SM3 and SM4 cryptographic extensions. This implies @code{fp} and @code{simd}.
@item @code{fp} @tab ARMv8-A @tab ARMv8-A or later
@tab Enable floating-point extensions.
@item @code{fp16} @tab ARMv8.2-A @tab ARMv8.2-A or later
@tab Enable ARMv8.2 16-bit floating-point support. This implies
@code{fp}.
@item @code{lor} @tab ARMv8-A @tab ARMv8.1-A or later
@tab Enable Limited Ordering Regions extensions.
@item @code{lse} @tab ARMv8-A @tab ARMv8.1-A or later
@tab Enable Large System extensions.
@item @code{pan} @tab ARMv8-A @tab ARMv8.1-A or later
@tab Enable Privileged Access Never support.
@item @code{profile} @tab ARMv8.2-A @tab No
@tab Enable statistical profiling extensions.
@item @code{ras} @tab ARMv8-A @tab ARMv8.2-A or later
@tab Enable the Reliability, Availability and Serviceability
extension.
@item @code{rcpc} @tab ARMv8.2-A @tab ARMv8.3-A or later
@tab Enable the weak release consistency extension.
@item @code{rdma} @tab ARMv8-A @tab ARMv8.1-A or later
@tab Enable ARMv8.1 Advanced SIMD extensions. This implies @code{simd}.
@item @code{simd} @tab ARMv8-A @tab ARMv8-A or later
@tab Enable Advanced SIMD extensions. This implies @code{fp}.
@item @code{sve} @tab ARMv8.2-A @tab No
@tab Enable the Scalable Vector Extensions. This implies @code{fp16},
@code{simd} and @code{compnum}.
@item @code{dotprod} @tab ARMv8.2-A @tab ARMv8.4-A or later
@tab Enable the Dot Product extension. This implies @code{simd}.
@item @code{fp16fml} @tab ARMv8.2-A @tab ARMv8.4-A or later
@tab Enable ARMv8.2 16-bit floating-point multiplication variant support.
This implies @code{fp16}.
@item @code{sb} @tab ARMv8-A @tab ARMv8.5-A or later
@tab Enable the speculation barrier instruction sb.
@item @code{predres} @tab ARMv8-A @tab ARMv8.5-A or later
@tab Enable the Execution and Data and Prediction instructions.
@item @code{rng} @tab ARMv8.5-A @tab No
@tab Enable ARMv8.5-A random number instructions.
@item @code{ssbs} @tab ARMv8-A @tab ARMv8.5-A or later
@tab Enable Speculative Store Bypassing Safe state read and write.
@item @code{memtag} @tab ARMv8.5-A @tab No
@tab Enable ARMv8.5-A Memory Tagging Extensions.
@item @code{tme} @tab ARMv8-A @tab No
@tab Enable Transactional Memory Extensions.
@item @code{sve2} @tab ARMv8-A @tab No
@tab Enable the SVE2 Extension.
@item @code{sve2-bitperm} @tab ARMv8-A @tab No
@tab Enable SVE2 BITPERM Extension.
@item @code{sve2-sm4} @tab ARMv8-A @tab No
@tab Enable SVE2 SM4 Extension.
@item @code{sve2-aes} @tab ARMv8-A @tab No
@tab Enable SVE2 AES Extension. This also enables the .Q->.B form of the
@code{pmullt} and @code{pmullb} instructions.
@item @code{sve2-sha3} @tab ARMv8-A @tab No
@tab Enable SVE2 SHA3 Extension.
@end multitable
@node AArch64 Syntax
@section Syntax
@menu
* AArch64-Chars:: Special Characters
* AArch64-Regs:: Register Names
* AArch64-Relocations:: Relocations
@end menu
@node AArch64-Chars
@subsection Special Characters
@cindex line comment character, AArch64
@cindex AArch64 line comment character
The presence of a @samp{//} on a line indicates the start of a comment
that extends to the end of the current line. If a @samp{#} appears as
the first character of a line, the whole line is treated as a comment.
@cindex line separator, AArch64
@cindex statement separator, AArch64
@cindex AArch64 line separator
The @samp{;} character can be used instead of a newline to separate
statements.
@cindex immediate character, AArch64
@cindex AArch64 immediate character
The @samp{#} can be optionally used to indicate immediate operands.
@node AArch64-Regs
@subsection Register Names
@cindex AArch64 register names
@cindex register names, AArch64
Please refer to the section @samp{4.4 Register Names} of
@samp{ARMv8 Instruction Set Overview}, which is available at
@uref{http://infocenter.arm.com}.
@node AArch64-Relocations
@subsection Relocations
@cindex relocations, AArch64
@cindex AArch64 relocations
@cindex MOVN, MOVZ and MOVK group relocations, AArch64
Relocations for @samp{MOVZ} and @samp{MOVK} instructions can be generated
by prefixing the label with @samp{#:abs_g2:} etc.
For example to load the 48-bit absolute address of @var{foo} into x0:
@smallexample
movz x0, #:abs_g2:foo // bits 32-47, overflow check
movk x0, #:abs_g1_nc:foo // bits 16-31, no overflow check
movk x0, #:abs_g0_nc:foo // bits 0-15, no overflow check
@end smallexample
@cindex ADRP, ADD, LDR/STR group relocations, AArch64
Relocations for @samp{ADRP}, and @samp{ADD}, @samp{LDR} or @samp{STR}
instructions can be generated by prefixing the label with
@samp{:pg_hi21:} and @samp{#:lo12:} respectively.
For example to use 33-bit (+/-4GB) pc-relative addressing to
load the address of @var{foo} into x0:
@smallexample
adrp x0, :pg_hi21:foo
add x0, x0, #:lo12:foo
@end smallexample
Or to load the value of @var{foo} into x0:
@smallexample
adrp x0, :pg_hi21:foo
ldr x0, [x0, #:lo12:foo]
@end smallexample
Note that @samp{:pg_hi21:} is optional.
@smallexample
adrp x0, foo
@end smallexample
is equivalent to
@smallexample
adrp x0, :pg_hi21:foo
@end smallexample
@node AArch64 Floating Point
@section Floating Point
@cindex floating point, AArch64 (@sc{ieee})
@cindex AArch64 floating point (@sc{ieee})
The AArch64 architecture uses @sc{ieee} floating-point numbers.
@node AArch64 Directives
@section AArch64 Machine Directives
@cindex machine directives, AArch64
@cindex AArch64 machine directives
@table @code
@c AAAAAAAAAAAAAAAAAAAAAAAAA
@cindex @code{.arch} directive, AArch64
@item .arch @var{name}
Select the target architecture. Valid values for @var{name} are the same as
for the @option{-march} command-line option.
Specifying @code{.arch} clears any previously selected architecture
extensions.
@cindex @code{.arch_extension} directive, AArch64
@item .arch_extension @var{name}
Add or remove an architecture extension to the target architecture. Valid
values for @var{name} are the same as those accepted as architectural
extensions by the @option{-mcpu} command-line option.
@code{.arch_extension} may be used multiple times to add or remove extensions
incrementally to the architecture being compiled for.
@c BBBBBBBBBBBBBBBBBBBBBBBBBB
@cindex @code{.bss} directive, AArch64
@item .bss
This directive switches to the @code{.bss} section.
@c CCCCCCCCCCCCCCCCCCCCCCCCCC
@cindex @code{.cpu} directive, AArch64
@item .cpu @var{name}
Set the target processor. Valid values for @var{name} are the same as
those accepted by the @option{-mcpu=} command-line option.
@c DDDDDDDDDDDDDDDDDDDDDDDDDD
@cindex @code{.dword} directive, AArch64
@item .dword @var{expressions}
The @code{.dword} directive produces 64 bit values.
@c EEEEEEEEEEEEEEEEEEEEEEEEEE
@cindex @code{.even} directive, AArch64
@item .even
The @code{.even} directive aligns the output on the next even byte
boundary.
@c FFFFFFFFFFFFFFFFFFFFFFFFFF
@cindex @code{.float16} directive, AArch64
@item .float16 @var{value [,...,value_n]}
Place the half precision floating point representation of one or more
floating-point values into the current section.
The format used to encode the floating point values is always the
IEEE 754-2008 half precision floating point format.
@c GGGGGGGGGGGGGGGGGGGGGGGGGG
@c HHHHHHHHHHHHHHHHHHHHHHHHHH
@c IIIIIIIIIIIIIIIIIIIIIIIIII
@cindex @code{.inst} directive, AArch64
@item .inst @var{expressions}
Inserts the expressions into the output as if they were instructions,
rather than data.
@c JJJJJJJJJJJJJJJJJJJJJJJJJJ
@c KKKKKKKKKKKKKKKKKKKKKKKKKK
@c LLLLLLLLLLLLLLLLLLLLLLLLLL
@cindex @code{.ltorg} directive, AArch64
@item .ltorg
This directive causes the current contents of the literal pool to be
dumped into the current section (which is assumed to be the .text
section) at the current location (aligned to a word boundary).
GAS maintains a separate literal pool for each section and each
sub-section. The @code{.ltorg} directive will only affect the literal
pool of the current section and sub-section. At the end of assembly
all remaining, un-empty literal pools will automatically be dumped.
Note - older versions of GAS would dump the current literal
pool any time a section change occurred. This is no longer done, since
it prevents accurate control of the placement of literal pools.
@c MMMMMMMMMMMMMMMMMMMMMMMMMM
@c NNNNNNNNNNNNNNNNNNNNNNNNNN
@c OOOOOOOOOOOOOOOOOOOOOOOOOO
@c PPPPPPPPPPPPPPPPPPPPPPPPPP
@cindex @code{.pool} directive, AArch64
@item .pool
This is a synonym for .ltorg.
@c QQQQQQQQQQQQQQQQQQQQQQQQQQ
@c RRRRRRRRRRRRRRRRRRRRRRRRRR
@cindex @code{.req} directive, AArch64
@item @var{name} .req @var{register name}
This creates an alias for @var{register name} called @var{name}. For
example:
@smallexample
foo .req w0
@end smallexample
ip0, ip1, lr and fp are automatically defined to
alias to X16, X17, X30 and X29 respectively.
@c SSSSSSSSSSSSSSSSSSSSSSSSSS
@c TTTTTTTTTTTTTTTTTTTTTTTTTT
@cindex @code{.tlsdescadd} directive, AArch64
@item @code{.tlsdescadd}
Emits a TLSDESC_ADD reloc on the next instruction.
@cindex @code{.tlsdesccall} directive, AArch64
@item @code{.tlsdesccall}
Emits a TLSDESC_CALL reloc on the next instruction.
@cindex @code{.tlsdescldr} directive, AArch64
@item @code{.tlsdescldr}
Emits a TLSDESC_LDR reloc on the next instruction.
@c UUUUUUUUUUUUUUUUUUUUUUUUUU
@cindex @code{.unreq} directive, AArch64
@item .unreq @var{alias-name}
This undefines a register alias which was previously defined using the
@code{req} directive. For example:
@smallexample
foo .req w0
.unreq foo
@end smallexample
An error occurs if the name is undefined. Note - this pseudo op can
be used to delete builtin in register name aliases (eg 'w0'). This
should only be done if it is really necessary.
@c VVVVVVVVVVVVVVVVVVVVVVVVVV
@cindex @code{.variant_pcs} directive, AArch64
@item .variant_pcs @var{symbol}
This directive marks @var{symbol} referencing a function that may
follow a variant procedure call standard with different register
usage convention from the base procedure call standard.
@c WWWWWWWWWWWWWWWWWWWWWWWWWW
@c XXXXXXXXXXXXXXXXXXXXXXXXXX
@cindex @code{.xword} directive, AArch64
@item .xword @var{expressions}
The @code{.xword} directive produces 64 bit values. This is the same
as the @code{.dword} directive.
@c YYYYYYYYYYYYYYYYYYYYYYYYYY
@c ZZZZZZZZZZZZZZZZZZZZZZZZZZ
@cindex @code{.cfi_b_key_frame} directive, AArch64
@item @code{.cfi_b_key_frame}
The @code{.cfi_b_key_frame} directive inserts a 'B' character into the CIE
corresponding to the current frame's FDE, meaning that its return address has
been signed with the B-key. If two frames are signed with differing keys then
they will not share the same CIE. This information is intended to be used by
the stack unwinder in order to properly authenticate return addresses.
@end table
@node AArch64 Opcodes
@section Opcodes
@cindex AArch64 opcodes
@cindex opcodes for AArch64
GAS implements all the standard AArch64 opcodes. It also
implements several pseudo opcodes, including several synthetic load
instructions.
@table @code
@cindex @code{LDR reg,=<expr>} pseudo op, AArch64
@item LDR =
@smallexample
ldr <register> , =<expression>
@end smallexample
The constant expression will be placed into the nearest literal pool (if it not
already there) and a PC-relative LDR instruction will be generated.
@end table
For more information on the AArch64 instruction set and assembly language
notation, see @samp{ARMv8 Instruction Set Overview} available at
@uref{http://infocenter.arm.com}.
@node AArch64 Mapping Symbols
@section Mapping Symbols
The AArch64 ELF specification requires that special symbols be inserted
into object files to mark certain features:
@table @code
@cindex @code{$x}
@item $x
At the start of a region of code containing AArch64 instructions.
@cindex @code{$d}
@item $d
At the start of a region of data.
@end table