linux/include
Vineet Gupta 3ca65c19dd mm: optimize PageHighMem() check
This came up when implementing HIHGMEM/PAE40 for ARC.  The kmap() /
kmap_atomic() generated code seemed needlessly bloated due to the way
PageHighMem() macro is implemented.  It derives the exact zone for page
and then does pointer subtraction with first zone to infer the zone_type.
The pointer arithmatic in turn generates the code bloat.

PageHighMem(page)
  is_highmem(page_zone(page))
     zone_off = (char *)zone - (char *)zone->zone_pgdat->node_zones

Instead use is_highmem_idx() to work on zone_type available in page flags

   ----- Before -----
80756348:	mov_s      r13,r0
8075634a:	ld_s       r2,[r13,0]
8075634c:	lsr_s      r2,r2,30
8075634e:	mpy        r2,r2,0x2a4
80756352:	add_s      r2,r2,0x80aef880
80756358:	ld_s       r3,[r2,28]
8075635a:	sub_s      r2,r2,r3
8075635c:	breq       r2,0x2a4,80756378 <kmap+0x48>
80756364:	breq       r2,0x548,80756378 <kmap+0x48>

   ----- After  -----
80756330:	mov_s      r13,r0
80756332:	ld_s       r2,[r13,0]
80756334:	lsr_s      r2,r2,30
80756336:	sub_s      r2,r2,1
80756338:	brlo       r2,2,80756348 <kmap+0x30>

For x86 defconfig build (32 bit only) it saves around 900 bytes.
For ARC defconfig with HIGHMEM, it saved around 2K bytes.

   ---->8-------
./scripts/bloat-o-meter x86/vmlinux-defconfig-pre x86/vmlinux-defconfig-post
add/remove: 0/0 grow/shrink: 0/36 up/down: 0/-934 (-934)
function                                     old     new   delta
saveable_page                                162     154      -8
saveable_highmem_page                        154     146      -8
skb_gro_reset_offset                         147     131     -16
...
...
__change_page_attr_set_clr                  1715    1678     -37
setup_data_read                              434     394     -40
mon_bin_event                               1967    1927     -40
swsusp_save                                 1148    1105     -43
_set_pages_array                             549     493     -56
   ---->8-------

e.g. For ARC kmap()

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Jennifer Herbert <jennifer.herbert@citrix.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-05 19:34:48 -08:00
..
acpi Merge branch 'acpi-processor' 2015-11-02 00:50:37 +01:00
asm-generic Power management and ACPI updates for v4.4-rc1 2015-11-04 18:10:13 -08:00
clocksource
crypto crypto: ahash - Add crypto_ahash_blocksize 2015-10-20 22:10:48 +08:00
drm drm/dp/mst: make mst i2c transfer code more robust. 2015-10-15 09:06:20 +10:00
dt-bindings char/misc drivers for 4.4-rc1 2015-11-04 22:15:15 -08:00
keys
kvm arm/arm64: KVM: Only allow 64bit hosts to build VGICv3 2015-10-09 23:11:57 +01:00
linux mm: optimize PageHighMem() check 2015-11-05 19:34:48 -08:00
math-emu
media
memory
misc
net Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-11-03 13:41:45 -05:00
pcmcia
ras
rdma
rxrpc
scsi
soc
sound Merge remote-tracking branches 'asoc/fix/rt298', 'asoc/fix/sx', 'asoc/fix/wm8904' and 'asoc/fix/wm8962' into asoc-linus 2015-10-23 08:44:14 +09:00
target
trace mm, compaction: distinguish contended status in tracepoints 2015-11-05 19:34:48 -08:00
uapi char/misc drivers for 4.4-rc1 2015-11-04 22:15:15 -08:00
video
xen xen/grant-table: Add an helper to iterate over a specific number of grants 2015-10-23 14:20:46 +01:00
Kbuild