Avoid a call to find_first_bit if the bitmap size is know at
compile time and small enough to fit in a single long integer.
Modeled after an optimization in the original x86_64-specific
code.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Switch x86_64 to generic find_first_bit. The x86_64-specific
implementation is not removed.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Generic versions of __find_first_bit and __find_first_zero_bit
are introduced as simplified versions of __find_next_bit and
__find_next_zero_bit. Their compilation and use are guarded by
a new config variable GENERIC_FIND_FIRST_BIT.
The generic versions of find_first_bit and find_first_zero_bit
are implemented in terms of the newly introduced __find_first_bit
and __find_first_zero_bit.
This patch does not remove the i386-specific implementation,
but it does switch i386 to use the generic functions by setting
GENERIC_FIND_FIRST_BIT=y for X86_32.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Use __fls for fls64 on 64-bit archs. The implementation for
64-bit archs is moved from x86_64 to asm-generic.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Implement __fls on all 64-bit archs:
alpha has an implementation of fls64.
Added __fls(x) = fls64(x) - 1.
ia64 has fls, but not __fls.
Added __fls based on code of fls.
mips and powerpc have __ilog2, which is the same as __fls.
Added __fls = __ilog2.
parisc, s390, sh and sparc64:
Include generic __fls.
x86_64 already has __fls.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add a generic __fls implementation in the same spirit as
the generic __ffs one. It finds the last (most significant)
set bit in the given long value.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Some of those can be written in such a way that the same
inline assembly can be used to generate both 32 bit and
64 bit code.
For ffs and fls, x86_64 unconditionally used the cmov
instruction and i386 unconditionally used a conditional
branch over a mov instruction. In the current patch I
chose to select the version based on the availability
of the cmov instruction instead. A small detail here is
that x86_64 did not previously set CONFIG_X86_CMOV=y.
Improved comments for ffs, ffz, fls and variations.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This moves an optimization for searching constant-sized small
bitmaps form x86_64-specific to generic code.
On an i386 defconfig (the x86#testing one), the size of vmlinux hardly
changes with this applied. I have observed only four places where this
optimization avoids a call into find_next_bit:
In the functions return_unused_surplus_pages, alloc_fresh_huge_page,
and adjust_pool_surplus, this patch avoids a call for a 1-bit bitmap.
In __next_cpu a call is avoided for a 32-bit bitmap. That's it.
On x86_64, 52 locations are optimized with a minimal increase in
code size:
Current #testing defconfig:
146 x bsf, 27 x find_next_*bit
text data bss dec hex filename
5392637 846592 724424 6963653 6a41c5 vmlinux
After removing the x86_64 specific optimization for find_next_*bit:
94 x bsf, 79 x find_next_*bit
text data bss dec hex filename
5392358 846592 724424 6963374 6a40ae vmlinux
After this patch (making the optimization generic):
146 x bsf, 27 x find_next_*bit
text data bss dec hex filename
5392396 846592 724424 6963412 6a40d4 vmlinux
[ tglx@linutronix.de: build fixes ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The versions with inline assembly are in fact slower on the machines I
tested them on (in userspace) (Athlon XP 2800+, p4-like Xeon 2.8GHz, AMD
Opteron 270). The i386-version needed a fix similar to 06024f21 to avoid
crashing the benchmark.
Benchmark using: gcc -fomit-frame-pointer -Os. For each bitmap size
1...512, for each possible bitmap with one bit set, for each possible
offset: find the position of the first bit starting at offset. If you
follow ;). Times include setup of the bitmap and checking of the
results.
Athlon Xeon Opteron 32/64bit
x86-specific: 0m3.692s 0m2.820s 0m3.196s / 0m2.480s
generic: 0m2.622s 0m1.662s 0m2.100s / 0m1.572s
If the bitmap size is not a multiple of BITS_PER_LONG, and no set
(cleared) bit is found, find_next_bit (find_next_zero_bit) returns a
value outside of the range [0, size]. The generic version always returns
exactly size. The generic version also uses unsigned long everywhere,
while the x86 versions use a mishmash of int, unsigned (int), long and
unsigned long.
Using the generic version does give a slightly bigger kernel, though.
defconfig: text data bss dec hex filename
x86-specific: 4738555 481232 626688 5846475 5935cb vmlinux (32 bit)
generic: 4738621 481232 626688 5846541 59360d vmlinux (32 bit)
x86-specific: 5392395 846568 724424 6963387 6a40bb vmlinux (64 bit)
generic: 5392458 846568 724424 6963450 6a40fa vmlinux (64 bit)
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
pointed out by Linus: arch/um/Kconfig.x86_64 should
include arch/x86/Kconfig.cpu instead of defining those
symbols itself.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
fix:
arch/um/os-Linux/helper.c: In function 'run_helper':
arch/um/os-Linux/helper.c:73: error: 'PATH_MAX' undeclared (first use in this function)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86-fixes:
x86 PAT: decouple from nonpromisc devmem
x86 PAT: tone down debugging messages
* git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/v4l-dvb:
V4L/DVB (7751): ir-kbd-i2c: Save a temporary memory allocation in ir_probe
V4L/DVB (7750): au0828/ cleanups and fixes
V4L/DVB (7748): tuner-core: some adjustments at tuner logs, if debug enabled
V4L/DVB (7746): pvrusb2: make signed one-bit bitfields unsigned
V4L/DVB (7744): pvrusb2-dvb: add atsc/qam support for Hauppauge pvrusb2 model 751xx
V4L/DVB (7742): cx88: Add support for the DViCO FusionHDTV_7_GOLD digital modes
V4L/DVB (7741): s5h1411: Adding support for this ATSC/QAM demodulator
V4L/DVB (7740): tuner-xc2028.c dubious !x & y
V4L/DVB (7739): mt312.h: dubious one-bit signed bitfield
V4L/DVB (7735): Fix compilation for au0828
V4L/DVB (7734): em28xx: copy and paste error in em28xx_init_isoc
V4L/DVB (7733): blackbird_find_mailbox negative return ignored in blackbird_initialize_codec()
V4L/DVB (7732): vivi: fix a warning
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (61 commits)
ide: sanitize handling of IDE_HFLAG_NO_SET_MODE host flag
sis5513: fail early for unsupported chipsets
it821x: fix kzalloc() failure handling
qd65xx: use IDE_HFLAG_SINGLE host flag
qd65xx: always use ->selectproc method
ide-cd: put proc-related functions together under single ifdef
ide-cd: Replace __FUNCTION__ with __func__
IDE: Coding Style fixes to drivers/ide/ide-cd.c
IDE: Coding Style fixes to drivers/ide/pci/cy82c693.c
IDE: Coding Style fixes to drivers/ide/pci/it8213.c
IDE: Coding Style fixes to drivers/ide/ide-floppy.c
IDE: Coding Style fixes to drivers/ide/legacy/ali14xx.c
IDE: Coding Style fixes to drivers/ide/legacy/hd.c
IDE: Coding Style fixes to drivers/ide/pci/cmd640.c
IDE: Coding Style fixes to drivers/ide/pci/opti621.c
IDE: Coding Style fixes to drivers/ide/ide-pnp.c
IDE: Coding Style fixes to drivers/ide/ide-proc.c
IDE: Coding Style fixes to drivers/ide/legacy/ide-4drives.c
IDE: Coding Style fixes to drivers/ide/legacy/umc8672.c
IDE: Coding Style fixes to drivers/ide/pci/generic.c
...
Linus pointed it out that PAT should not depend on NONPROMISC_DEVMEM.
Also make PAT non-default.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'agp-patches' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/agp-2.6:
agp: convert drivers/char/agp/frontend.c to use unlocked_ioctl
agp: fix shadowed variable warning in amd-k7-agp.c
* 'drm-patches' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm: _end is shadowing real _end, just rename it.
drm/vbl rework: rework how the drm deals with vblank.
drm: reorganise minor number handling using backported modesetting code.
drm/i915: Handle tiled buffers in vblank tasklet
drm/i965: On I965, use correct 3DSTATE_DRAWING_RECTANGLE command in vblank
drm: Remove unneeded dma sync in ATI pcigart alloc
drm: Fix mismerge of non-coherent DMA patch
Arrgghhh...
Sorry about that, I'd been sure I'd folded that one, but it actually got
lost. Please apply - that breaks execve().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Stephen Rothwell reported that linux-next did not build on powerpc64.
make optimized inlining dependent on architecture opt-in.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
add CONFIG_OPTIMIZE_INLINING=y.
allow gcc to optimize the kernel image's size by uninlining
functions that have been marked 'inline'. Previously gcc was
forced by Linux to always-inline these functions via a gcc
attribute:
#define inline inline __attribute__((always_inline))
Especially when the user has already selected
CONFIG_OPTIMIZE_FOR_SIZE=y this can make a huge difference in
kernel image size (using a standard Fedora .config):
text data bss dec hex filename
5613924 562708 3854336 10030968 990f78 vmlinux.before
5486689 562708 3854336 9903733 971e75 vmlinux.after
that's a 2.3% text size reduction (!).
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* Check for IDE_HFLAG_NO_SET_MODE host flag in ide_set_pio(),
ide_set_[pio,dma]_mode(), ide_set_xfer_rate() and set_pio_mode().
* Remove no longer needed IDE_HFLAG_NO_SET_MODE host flag checking
from ide_tune_dma().
* Remove superfluous ->set_pio_mode checking from do_special().
This is a part of preparations for adding 'struct ide_port_ops'.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Factor out chipset family detection from init_chipset_sis5513()
to sis_find_family().
* Use sis_find_family() in sis5513_init_one() to fail early if the
chipset is unsupported.
* Keep a local copy sis5513_chipset in sis5513_init_one()
and set .udma_mask according to chipset family.
* Remove no longer need ->ultra_mask setting from init_hwif_sis5513().
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Allocate 'struct it821x_dev' objects for both ports in it821x_init_one().
Fixes potential OOPS in it821x_quirkproc() (uses 'itdev' unconditionally)
and other problems ('itdev' is needed for correct operation of the driver).
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Set IDE_HFLAG_SINGLE host flag in qd_probe() for QD6500 and QD6580
with the second port disabled.
* Check for IDE_HFLAG_SINGLE in qd6580_port_init_devs() instead of
using cached value of QD6580 Control register.
* Don't cache QD6580 Control register value in hwif->config_data
(bits 8-15) and remove no longer needed QD_CONTROL() macro.
* Cache QD65xx base address in hwif->config_data (bits 8-15)
instead of hwif->select_data.
* Set hwif->config_data in qd_probe() and remove qd_setup() helper.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
qd_select() checks itself whether timings should be reprogrammed so
remove superfluous qd_timing_ok() and always use ->selectproc method
(rename qd_select() to qd65xx_select() while at it).
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
File is now error free, only a few
WARNING: line over 80 characters
are left.
Compile tested.
[bart: md5sum checked]
Signed-off-by: Paolo Ciarrocchi <paolo.ciarrocchi@gmail.com>
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
File is now error free.
Compile tested.
[bart: minor fixes, md5sum checked]
Signed-off-by: Paolo Ciarrocchi <paolo.ciarrocchi@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Fix a lot of errors and warnings.
Compile tested.
[bart: some fixes, md5sum checked]
Signed-off-by: Paolo Ciarrocchi <paolo.ciarrocchi@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Fix all the errors and a few warnings.
Compile tested.
[bart: some fixes, md5sum checked]
Signed-off-by: Paolo Ciarrocchi <paolo.ciarrocchi@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Lot of errors and warnings removed.
Compile tested.
[bart: minor fixes, md5sum checked]
Signed-off-by: Paolo Ciarrocchi <paolo.ciarrocchi@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
File is now error and warning free.
Compile tested.
[bart: md5sum checked]
Signed-off-by: Paolo Ciarrocchi <paolo.ciarrocchi@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
File is now error free.
Compile tested.
[bart: minor fixes, md5sum checked]
Signed-off-by: Paolo Ciarrocchi <paolo.ciarrocchi@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
About 300 errors and warnings fixed.
File is now error free.
Compile tested.
[bart: minor fixes, md5sum checked]
Signed-off-by: Paolo Ciarrocchi <paolo.ciarrocchi@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
File is now error and warning free.
Compile tested.
[bart: md5sum checked]
Signed-off-by: Paolo Ciarrocchi <paolo.ciarrocchi@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
File is now error free.
Compile tested.
[bart: minor fixes, md5sum checked]
Signed-off-by: Paolo Ciarrocchi <paolo.ciarrocchi@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
All host drivers now either set hwif->mmio or reserve continuous
I/O resources so remove no longer needed hwif->straight8 flag
and never reached code for 'hwif->straight8 == 0' case.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Sergei suggested that it shouldn't be necessary + it had no effect
anyway since ide_id_dma_bug() is called earlier in ide_tune_dma().
Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>