Now that network timestamps use ktime_t infrastructure, we can add a new
SOL_SOCKET sockopt SO_TIMESTAMPNS.
This command is similar to SO_TIMESTAMP, but permits transmission of
a 'timespec struct' instead of a 'timeval struct' control message.
(nanosecond resolution instead of microsecond)
Control message is labelled SCM_TIMESTAMPNS instead of SCM_TIMESTAMP
A socket cannot mix SO_TIMESTAMP and SO_TIMESTAMPNS : the two modes are
mutually exclusive.
sock_recv_timestamp() became too big to be fully inlined so I added a
__sock_recv_timestamp() helper function.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
CC: linux-arch@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Now network timestamps use ktime_t infrastructure, we can add a new
ioctl() SIOCGSTAMPNS command to get timestamps in 'struct timespec'.
User programs can thus access to nanosecond resolution.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
CC: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add unsigned to unused bit field in a.out.h to make sparse happy.
[ I took care of the sparc64 side as well -DaveM ]
Signed-off-by: Robert Reif <reif@earthlink.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Compiling 2.6.21-rc5 with gcc-4.2.0 20070317 (prerelease)
for sparc64 fails as follows:
gcc -Wp,-MD,arch/sparc64/kernel/.time.o.d -nostdinc -isystem /home/mikpe/pkgs/linux-sparc64/gcc-4.2.0/lib/gcc/sparc64-unknown-linux-gnu/4.2.0/include -D__KERNEL__ -Iinclude -include include/linux/autoconf.h -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Os -m64 -pipe -mno-fpu -mcpu=ultrasparc -mcmodel=medlow -ffixed-g4 -ffixed-g5 -fcall-used-g7 -Wno-sign-compare -Wa,--undeclared-regs -fomit-frame-pointer -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -Werror -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(time)" -D"KBUILD_MODNAME=KBUILD_STR(time)" -c -o arch/sparc64/kernel/time.o arch/sparc64/kernel/time.c
cc1: warnings being treated as errors
arch/sparc64/kernel/time.c: In function 'kick_start_clock':
arch/sparc64/kernel/time.c:559: warning: overflow in implicit constant conversion
make[1]: *** [arch/sparc64/kernel/time.o] Error 1
make: *** [arch/sparc64/kernel] Error 2
gcc gets unhappy when the MSTK_SET macro's u8 __val variable
is updated with &= ~0xff (MSTK_YEAR_MASK). Making the constant
unsigned fixes the problem.
[ I fixed up the sparc32 side as well -DaveM ]
Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
sys_mbind
sys_get_mempolicy
sys_set_mempolicy
sys_kexec_load
sys_move_pages
sys_getcpu
sys_epoll_pwait
This work is largely a result of David Woodhouse's most
excellent missing syscalls patch.
Signed-off-by: David S. Miller <davem@davemloft.net>
The line discipline numbers N_* are currently defined for each architecture
individually, but (except for a seeming mistake) identically, in
asm/termios.h. There is no obvious reason why these numbers should be
architecture specific, nor any apparent relationship with the termios
structure. The total number of these, NR_LDISCS, is defined in linux/tty.h
anyway. So I propose the following patch which moves the definitions of
the individual line disciplines to linux/tty.h too.
Three of these numbers (N_MASC, N_PROFIBUS_FDL, and N_SMSBLOCK) are unused
in the current kernel, but the patch still keeps the complete set in case
there are plans to use them yet.
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In some cases such as:
iph->check = 0;
iph->check = ip_fast_csum((unsigned char *)iph, iph->ihl);
GCC may optimize out the previous store.
Observed as a failure of NFS over udp (bad checksums on ip fragments)
when compiled with GCC 3.4.2.
Signed-off-by: Bob Breuer <breuerr@mc.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need to pass in the resource otherwise we cannot
release the region properly. We must know whether it is
an I/O or MEM resource.
Spotted by Eric Brower.
Signed-off-by: David S. Miller <davem@davemloft.net>
Recent workqueue changes basically make this a formal requirement.
Also, move atomic32.o from lib-y to obj-y since it exports symbols
to modules.
Signed-off-by: David S. Miller <davem@davemloft.net>
Virtually index, physically tagged cache architectures can get away
without cache flushing when forking. This patch adds a new cache
flushing function flush_cache_dup_mm(struct mm_struct *) which for the
moment I've implemented to do the same thing on all architectures
except on MIPS where it's a no-op.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
In order to sort out our struct termios and add proper speed control we need
to separate the kernel and user termios structures. Glibc is fine but the
other libraries rely on the kernel exported struct termios and we need to
extend this without breaking the ABI/API
To do so we add a struct ktermios which is the kernel view of a termios
structure and overlaps the struct termios with extra fields on the end for
now. (That limitation will go away in later patches). Some platforms (eg
alpha) planned ahead and thus use the same struct for both, others did not.
This just adds the structures but does not use them, it seems a sensible
splitting point for bisect if there are compile failures (not that I expect
them)
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The last thing we agreed on was to remove the macros entirely for 2.6.19,
on all architectures. Unfortunately, I think nobody actually _did_ that,
so they are still there.
[akpm@osdl.org: x86_64 fix]
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Greg Schafer <gschafer@zip.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* sanitize prototypes, annotate
* kill bogus access_ok() in csum_partial_copy_from_user (the only caller checks)
* kill useless shift
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add arch specific dev_archdata to struct device
Adds an arch specific struct dev_arch to struct device. This enables
architecture to add specific fields to every device in the system, like
DMA operation pointers, NUMA node ID, firmware specific data, etc...
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Andi Kleen <ak@suse.de>
Acked-By: David Howells <dhowells@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When I added the entries for the robust futex syscall entries, I
forgot to bump NR_SYSCALLS. The current situation is error-prone
because NR_SYSCALLS lives in entry.S where the system call limit
checks are enforced. Move the definition to asm/unistd.h in order to
make this mistake much more difficult to make.
And wire up sys_migrate_pages since the powerpc folks implemented the
compat wrapper for us.
Signed-off-by: David S. Miller <davem@davemloft.net>
We don't need to export sparc_elf_hwcap() to userspace, and it doesn't
build there. Remove it by moving it inside #ifdef __KERNEL__, along with
some other things which don't need to be exported.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the new typedef for interrupt handler function pointers rather than
actually spelling out the full thing each time. This was scripted with the
following small shell script:
#!/bin/sh
egrep -nHrl -e 'irqreturn_t[ ]*[(][*]' $* |
while read i
do
echo $i
perl -pi -e 's/irqreturn_t\s*[(]\s*[*]\s*([_a-zA-Z0-9]*)\s*[)]\s*[(]\s*int\s*,\s*void\s*[*]\s*[)]/irq_handler_t \1/g' $i || exit $?
done
Signed-Off-By: David Howells <dhowells@redhat.com>
read_trylock() is broken on sparc32 (doesn't build and didn't work
right, actually). Proposed fix:
- make "writer holds lock" distinguishable from "reader tries to grab
lock"
- have __raw_read_trylock() try to acquire the mutex (in LSB of lock),
terminating spin if we see that there's writer holding it. Then do
the rest as we do in read_lock().
Thanks to Ingo for discussion...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Many files include the filename at the beginning, serveral used a wrong one.
Signed-off-by: Uwe Zeisberger <Uwe_Zeisberger@digi.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
The last in-kernel user of errno is gone, so we should remove the definition
and everything referring to it. This also removes the now-unused lib/execve.c
file that was introduced earlier.
Also remove every trace of __KERNEL_SYSCALLS__ that still remained in the
kernel.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Andi Kleen <ak@muc.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ian Molton <spyro@f2s.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Hirokazu Takata <takata.hirokazu@renesas.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
Cc: Richard Curnow <rc@rc0.org.uk>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
Cc: Chris Zankel <chris@zankel.net>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
On systems running with virtual cpus there is optimization potential in
regard to spinlocks and rw-locks. If the virtual cpu that has taken a lock
is known to a cpu that wants to acquire the same lock it is beneficial to
yield the timeslice of the virtual cpu in favour of the cpu that has the
lock (directed yield).
With CONFIG_PREEMPT="n" this can be implemented by the architecture without
common code changes. Powerpc already does this.
With CONFIG_PREEMPT="y" the lock loops are coded with _raw_spin_trylock,
_raw_read_trylock and _raw_write_trylock in kernel/spinlock.c. If the lock
could not be taken cpu_relax is called. A directed yield is not possible
because cpu_relax doesn't know anything about the lock. To be able to
yield the lock in favour of the current lock holder variants of cpu_relax
for spinlocks and rw-locks are needed. The new _raw_spin_relax,
_raw_read_relax and _raw_write_relax primitives differ from cpu_relax
insofar that they have an argument: a pointer to the lock structure.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
One of the changes necessary for shared page tables is to standardize the
pxx_page macros. pte_page and pmd_page have always returned the struct
page associated with their entry, while pte_page_kernel and pmd_page_kernel
have returned the kernel virtual address. pud_page and pgd_page, on the
other hand, return the kernel virtual address.
Shared page tables needs pud_page and pgd_page to return the actual page
structures. There are very few actual users of these functions, so it is
simple to standardize their usage.
Since this is basic cleanup, I am submitting these changes as a standalone
patch. Per Hugh Dickins' comments about it, I am also changing the
pxx_page_kernel macros to pxx_page_vaddr to clarify their meaning.
Signed-off-by: Dave McCracken <dmccr@us.ibm.com>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
They all contain the same thing. Instead, have a single generic one in
include/asm-generic, and permit an arch to override as needed.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Mostly removing files which have no business being used in userspace.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This prevents cross-region mappings on IA64 and SPARC which could lead
to system crash. They were correctly trapped for normal mmap() calls,
but not for the kernel internal calls generated by executable loading.
This code just moves the architecture-specific cross-region checks into
an arch-specific "arch_mmap_check()" macro, and defines that for the
architectures that needed it (ia64, sparc and sparc64).
Architectures that don't have any special requirements can just ignore
the new cross-region check, since the mmap() code will just notice on
its own when the macro isn't defined.
Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Signed-off-by: Kirill Korotaev <dev@openvz.org>
Acked-by: David Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
[ Cleaned up to not affect architectures that don't need it ]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Kill host_set->next
Fix simplex support
Allow per platform setting of IDE legacy bases
Some of this can be tidied further later on, in particular all the
legacy port gunge belongs as a PCI quirk/PCI header decode to understand
the special legacy IDE rules in the PCI spec.
Longer term Jeff also wants to move the request_irq/free_irq out of core
which will make this even cleaner.
tj: folded in three followup patches - ata_piix-fix, broken-arch-fix
and fix-new-legacy-handling, and separated per-dev xfermask into
separate patch preceding this one. Folded in fixes are...
* ata_piix-fix: fix build failure due to host_set->next removal
* broken-arch-fix: add missing include/asm-*/libata-portmap.h
* fix-new-legacy-handling:
* In ata_pci_init_legacy_port(), probe_num was incorrectly
incremented during initialization of the secondary port and
probe_ent->n_ports was incorrectly fixed to 1.
* Both legacy ports ended up having the same hard_port_no.
* When printing port information, both legacy ports printed
the first irq.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Tejun Heo <htejun@gmail.com>
set_wmb should not be used in the kernel because it just confuses the
code more and has no benefit. Since it is not currently used in the
kernel this patch removes it so that new code does not include it.
All archs define set_wmb(var, value) to do { var = value; wmb(); }
while(0) except ia64 and sparc which use a mb() instead. But this is
still moot since it is not used anyway.
Hasn't been tested on any archs but x86 and x86_64 (and only compiled
tested)
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Use the new IRQF_ constants and remove the SA_INTERRUPT define
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (30 commits)
[TIPC]: Initial activation message now includes TIPC version number
[TIPC]: Improve response to requests for node/link information
[TIPC]: Fixed skb_under_panic caused by tipc_link_bundle_buf
[IrDA]: Fix the AU1000 FIR dependencies
[IrDA]: Fix RCU lock pairing on error path
[XFRM]: unexport xfrm_state_mtu
[NET]: make skb_release_data() static
[NETFILTE] ipv4: Fix typo (Bugzilla #6753)
[IrDA]: MCS7780 usb_driver struct should be static
[BNX2]: Turn off link during shutdown
[BNX2]: Use dev_kfree_skb() instead of the _irq version
[ATM]: basic sysfs support for ATM devices
[ATM]: [suni] change suni_init to __devinit
[ATM]: [iphase] should be __devinit not __init
[ATM]: [idt77105] should be __devinit not __init
[BNX2]: Add NETIF_F_TSO_ECN
[NET]: Add ECN support for TSO
[AF_UNIX]: Datagram getpeersec
[NET]: Fix logical error in skb_gso_ok
[PKT_SCHED]: PSCHED_TADD() and PSCHED_TADD2() can result,tv_usec >= 1000000
...
This patch implements an API whereby an application can determine the
label of its peer's Unix datagram sockets via the auxiliary data mechanism of
recvmsg.
Patch purpose:
This patch enables a security-aware application to retrieve the
security context of the peer of a Unix datagram socket. The application
can then use this security context to determine the security context for
processing on behalf of the peer who sent the packet.
Patch design and implementation:
The design and implementation is very similar to the UDP case for INET
sockets. Basically we build upon the existing Unix domain socket API for
retrieving user credentials. Linux offers the API for obtaining user
credentials via ancillary messages (i.e., out of band/control messages
that are bundled together with a normal message). To retrieve the security
context, the application first indicates to the kernel such desire by
setting the SO_PASSSEC option via getsockopt. Then the application
retrieves the security context using the auxiliary data mechanism.
An example server application for Unix datagram socket should look like this:
toggle = 1;
toggle_len = sizeof(toggle);
setsockopt(sockfd, SOL_SOCKET, SO_PASSSEC, &toggle, &toggle_len);
recvmsg(sockfd, &msg_hdr, 0);
if (msg_hdr.msg_controllen > sizeof(struct cmsghdr)) {
cmsg_hdr = CMSG_FIRSTHDR(&msg_hdr);
if (cmsg_hdr->cmsg_len <= CMSG_LEN(sizeof(scontext)) &&
cmsg_hdr->cmsg_level == SOL_SOCKET &&
cmsg_hdr->cmsg_type == SCM_SECURITY) {
memcpy(&scontext, CMSG_DATA(cmsg_hdr), sizeof(scontext));
}
}
sock_setsockopt is enhanced with a new socket option SOCK_PASSSEC to allow
a server socket to receive security context of the peer.
Testing:
We have tested the patch by setting up Unix datagram client and server
applications. We verified that the server can retrieve the security context
using the auxiliary data mechanism of recvmsg.
Signed-off-by: Catherine Zhang <cxzhang@watson.ibm.com>
Acked-by: Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Happily, life is much simpler on 32-bit sparc systems.
The "intr" property, preferred over the "interrupts"
property is used-as. Some minor translations of this
value happen on sun4d systems.
The stage is now set to rewrite the sparc serial driver
probing to use the of_driver framework, and then to convert
all SBUS, EBUS, and ISA drivers in-kind so that we can nuke
all those special bus frameworks.
Signed-off-by: David S. Miller <davem@davemloft.net>
The idea is to fully construct the device register and
interrupt values into these of_device objects, and convert
all of SBUS, EBUS, ISA drivers to use this new stuff.
Much ideas and code taken from Ben H.'s powerpc work.
Signed-off-by: David S. Miller <davem@davemloft.net>