linux/arch/mips/lib
Chen Jie 615eb603f4 MIPS: csum_partial: Improve instruction parallelism.
Accumulating into sum introduces a true data dependency between
consecutive additions. This patch removes some of these dependencies,
and hence increases instruction-level parallelism.

This patch improves csum performance by up to 50% on Loongson 3A.

One example of how this patch works is in CSUM_BIGCHUNK1:
// ** original **    vs    ** patch applied **
    ADDC(sum, t0)           ADDC(t0, t1)
    ADDC(sum, t1)           ADDC(t2, t3)
    ADDC(sum, t2)           ADDC(sum, t0)
    ADDC(sum, t3)           ADDC(sum, t2)

In the original implementation, each ADDC(sum, ...) uses as a source
operand the sum value updated by the previous ADDC.

With this patch applied, the first two ADDC operations are independent
of each other and of sum, so they can execute simultaneously when the
CPU is able to issue more than one instruction per cycle.
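
The reordering can be modeled in C as a rough sketch. Here addc() is a
hypothetical stand-in for the ADDC macro (an add with end-around carry,
assumed to be 32 bits wide for illustration), and csum_serial() and
csum_paired() are illustrative names, not functions from csum_partial.S:

```c
#include <stdint.h>

/* Hypothetical C model of ADDC: add with end-around carry (32-bit here). */
static uint32_t addc(uint32_t a, uint32_t b)
{
	a += b;
	return a + (a < b);	/* fold the carry-out back into the result */
}

/* Original order: every ADDC has to wait for the previous value of sum. */
static uint32_t csum_serial(uint32_t sum, const uint32_t t[4])
{
	sum = addc(sum, t[0]);
	sum = addc(sum, t[1]);
	sum = addc(sum, t[2]);
	sum = addc(sum, t[3]);
	return sum;
}

/* Patched order: the first two ADDCs touch neither sum nor each other. */
static uint32_t csum_paired(uint32_t sum, const uint32_t t[4])
{
	uint32_t t0 = addc(t[0], t[1]);
	uint32_t t2 = addc(t[2], t[3]);

	sum = addc(sum, t0);
	sum = addc(sum, t2);
	return sum;
}
```

Both orders return the same value because end-around-carry addition is
associative and commutative; only the length of the dependency chain
through sum changes (four ADDCs versus two).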

Another example is in the "copy and sum calculating chunk":
// ** original **    vs    ** patch applied **
    STORE(t0, UNIT(0) ...   STORE(t0, UNIT(0) ...
    ADDC(sum, t0)           ADDC(t0, t1)
    STORE(t1, UNIT(1) ...   STORE(t1, UNIT(1) ...
    ADDC(sum, t1)           ADDC(sum, t0)
    STORE(t2, UNIT(2) ...   STORE(t2, UNIT(2) ...
    ADDC(sum, t2)           ADDC(t2, t3)
    STORE(t3, UNIT(3) ...   STORE(t3, UNIT(3) ...
    ADDC(sum, t3)           ADDC(sum, t2)

With this patch applied, an ADDC and the ADDC two positions after it,
e.g. ADDC(t0, t1) and ADDC(t2, t3), are independent.
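
A minimal C sketch of the patched copy-and-sum ordering follows. The
names addc() and copy_csum_chunk() are hypothetical, the 32-bit word
size is an assumption, and plain array stores stand in for the STORE
macro:

```c
#include <stdint.h>

/* Hypothetical C model of ADDC: add with end-around carry (32-bit here). */
static uint32_t addc(uint32_t a, uint32_t b)
{
	a += b;
	return a + (a < b);	/* fold the carry-out back into the result */
}

/* Copy four words from src to dst and return the updated checksum. */
static uint32_t copy_csum_chunk(uint32_t *dst, const uint32_t *src,
				uint32_t sum)
{
	uint32_t t0 = src[0], t1 = src[1], t2 = src[2], t3 = src[3];

	dst[0] = t0;		/* store t0 before it is overwritten */
	t0 = addc(t0, t1);	/* does not read or write sum */
	dst[1] = t1;
	sum = addc(sum, t0);
	dst[2] = t2;		/* store t2 before it is overwritten */
	t2 = addc(t2, t3);	/* independent of both ADDCs above */
	dst[3] = t3;
	sum = addc(sum, t2);
	return sum;
}
```

Each store is placed before the ADDC that reuses its register as an
accumulator, so the copied data is unaffected by the reordering.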

Signed-off-by: chenj <chenj@lemote.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/9608/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-04-01 17:22:11 +02:00
Makefile MIPS: Use generic checksum functions for MIPS R6 2015-02-17 15:37:19 +00:00
ashldi3.c
ashrdi3.c
bitops.c MIPS: Remove unneeded volatile from arch/mips/lib/bitops.c 2013-05-08 01:19:06 +02:00
cmpdi2.c
csum_partial.S MIPS: csum_partial: Improve instruction parallelism. 2015-04-01 17:22:11 +02:00
delay.c MIPS: __delay ABI-dependent subtraction simplification 2014-05-30 21:01:08 +02:00
dump_tlb.c Revert "MIPS: Allow ASID size to be determined at boot time." 2013-05-16 20:35:42 +02:00
iomap-pci.c mips: use the PCI controller's io_map_base 2012-01-31 23:20:30 +02:00
iomap.c MIPS: iomap: Use __mem_{read,write}{b,w,l} for MMIO 2014-11-24 07:45:42 +01:00
libgcc.h
lshrdi3.c
memcpy.S MIPS: lib: memcpy: Add MIPS R6 support 2015-02-17 15:37:29 +00:00
memset.S MIPS: lib: memset: Add MIPS R6 support 2015-02-17 15:37:30 +00:00
mips-atomic.c MIPS: asm: irqflags: Add MIPS R6 related definitions 2015-02-17 15:37:20 +00:00
r3k_dump_tlb.c MIPS: R3000: Remove redundant parentheses 2014-11-24 07:45:01 +01:00
strlen_user.S MIPS: Remove __strlen_user(). 2014-11-24 07:45:00 +01:00
strncpy_user.S MIPS: __strncpy_from_user_asm CPU_DADDI_WORKAROUNDS bug fix 2014-05-13 00:29:38 +02:00
strnlen_user.S MIPS: Fix strnlen_user() return value in case of overlong strings. 2014-11-04 12:46:33 +01:00
ucmpdi2.c
uncached.c mips: delete non-required instances of include <linux/init.h> 2014-01-24 22:39:56 +01:00