Commit Graph

984 Commits

Author SHA1 Message Date
David S. Miller 8a2950cce6 [SPARC64]: Handle LDC resets properly in domain-services driver.
Reset the handshake and per-capability state so that when the
link comes back up we'll renegotiate the DS version and then
reregister all of the services.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-18 01:20:09 -07:00
David S. Miller 6160f63518 [SPARC64]: Massively simplify VIO device layer and support hot add/remove.
Create and destroy VIO devices in response to MD update events.  These
run synchronously inside of the MD update mutex so the VIO layer
doesn't need to do internal locking of any sort.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-18 01:20:04 -07:00
David S. Miller 920c3ed741 [SPARC64]: Add basic infrastructure for MD add/remove notification.
And add dummy handlers for the VIO device layer.  These will be filled
in with real code after the vdc, vnet, and ds drivers are reworked to
have simpler dependencies on the VIO device tree.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-18 01:19:51 -07:00
Oleg Nesterov 62715ec832 [SPARC64]: Kill bogus set_fs(KERNEL_DS) in do_rt_sigreturn().
From: Oleg Nesterov <oleg@tv-sign.ru>

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-17 14:37:54 -07:00
David S. Miller c1e49e3a1b [SPARC64]: Update defconfig.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-17 12:18:16 -07:00
David S. Miller 41120551fa [SPARC64]: Kill explicit %gl register reference.
Older binutils can't handle it.  Use SET_GL() instead,
which is explicitly for this purpose.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-17 12:18:15 -07:00
Pavel Emelianov bcdcd8e725 Report that kernel is tainted if there was an OOPS
If the kernel OOPSed or BUGed then it probably should be considered as
tainted.  Thus, all subsequent OOPSes and SysRq dumps will report the
tainted kernel.  This saves a lot of time explaining oddities in the
calltraces.

Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[ Added parisc patch from Matthew Wilson  -Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17 10:23:02 -07:00
David S. Miller 778feeb475 [SPARC64]: Fix race between MD update and dr-cpu add.
We need to make sure the MD update occurs before we try to
process dr-cpu configure requests.  MD update and dr-cpu
were being processed by seperate threads so that did not
happen occaisionally.

Fix this by executing all domain services data packets from
a single thread, in order.

This will help simplify some other things as well.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 17:11:59 -07:00
Fabio Massimo Di Nitto 3ac66e33ea [SPARC64]: SMP build fix.
The UP build fix had some unintended consequences.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 17:11:58 -07:00
David S. Miller d54bc2793e [SPARC64]: Fix UP build.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:41:51 -07:00
David S. Miller e0204409df [SPARC64]: dr-cpu unconfigure support.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:05:32 -07:00
David S. Miller 9918cc2e32 [SPARC64]: Give more accurate errors in dr_cpu_configure().
When cpu_up() fails, we can discern the most likely cause.

If cpu_present() is false, this means the cpu did not appear
in the MD.  If -ENODEV is the error return value, then
the processor did not boot properly into the kernel.

Pass this information back in the dr-cpu response packet.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:05:24 -07:00
David S. Miller 39dd992aee [SPARC64]: Clear cpu_{core,sibling}_map[] in smp_fill_in_sib_core_maps()
When we hot-plug in new cpus, the core_id and proc_id of existing
cpus can change.  So in order to set the cpu groups correctly we
need to clear the maps out completely first.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:05:19 -07:00
David S. Miller b37d40d175 [SPARC64]: Fix leak when DR added cpu does not bootup.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:05:15 -07:00
David S. Miller b53bcb6799 [SPARC64]: Add ->set_affinity IRQ handlers.
dr-cpu unconfigure requests will walk throught he enabled
IRQs and trigger ->set_affinity so that the going-down
cpu no longer has INOs targetted to it.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:05:11 -07:00
David S. Miller bd0e11ff22 [SPARC64]: Process dr-cpu events in a kthread instead of workqueue.
This will be necessary to handle unconfigure requests
properly.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:05:07 -07:00
David S. Miller 8b99cfb8cc [SPARC64]: More sensible udelay implementation.
Take a page from the powerpc folks and just calculate the
delay factor directly.

Since frequency scaling chips use a system-tick register,
the value is going to be the same system-wide.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:05:02 -07:00
David S. Miller 27a2ef382c [SPARC64]: SMP build fixes.
With the move of ldom_startcpu_cpuid() into smp.c some other
things need to follow along:

1) smp.c is not a driver so we can't use "PFX" macro in the
   printk calls.

2) smp.c now needs asm/io.h and asm/hvtramp.h, ds.c no longer
   does

3) kimage_addr_to_ra() also needs to move into smp.c

While we're here, update copyright info and my email address
in smp.c

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:58 -07:00
David S. Miller 8f3fff2050 [SPARC64]: mdesc.c needs linux/mm.h
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:53 -07:00
David S. Miller b14f5c100c [SPARC64]: Fix build regressions added by dr-cpu changes.
Do not select HOTPLUG_CPU from SUN_LDOMS, that causes
HOTPLUG_CPU to be selected even on non-SMP which is
illegal.

Only build hvtramp.o when SMP, just like trampoline.o

Protect dr-cpu code in ds.c with HOTPLUG_CPU.

Likewise move ldom_startcpu_cpuid() to smp.c and protect
it and the call site with SUN_LDOMS && HOTPLUG_CPU.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:49 -07:00
David S. Miller f8be339c02 [SPARC64]: Unconditionally register vio_bus_type.
The VIO drivers register themselves unconditionally just
like those of any other bus type, so to avoid crashes
on non-VIO systems we need to always register vio_bus_type.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:45 -07:00
David S. Miller 4f0234f4f9 [SPARC64]: Initial LDOM cpu hotplug support.
Only adding cpus is supports at the moment, removal
will come next.

When new cpus are configured, the machine description is
updated.  When we get the configure request we pass in a
cpu mask of to-be-added cpus to the mdesc CPU node parser
so it only fetches information for those cpus.  That code
also proceeds to update the SMT/multi-core scheduling bitmaps.

cpu_up() does all the work and we return the status back
over the DS channel.

CPUs via dr-cpu need to be booted straight out of the
hypervisor, and this requires:

1) A new trampoline mechanism.  CPUs are booted straight
   out of the hypervisor with MMU disabled and running in
   physical addresses with no mappings installed in the TLB.

   The new hvtramp.S code sets up the critical cpu state,
   installs the locked TLB mappings for the kernel, and
   turns the MMU on.  It then proceeds to follow the logic
   of the existing trampoline.S SMP cpu bringup code.

2) All calls into OBP have to be disallowed when domaining
   is enabled.  Since cpus boot straight into the kernel from
   the hypervisor, OBP has no state about that cpu and therefore
   cannot handle being invoked on that cpu.

   Luckily it's only a handful of interfaces which can be called
   after the OBP device tree is obtained.  For example, rebooting,
   halting, powering-off, and setting options node variables.

CPU removal support will require some infrastructure changes
here.  Namely we'll have to process the requests via a true
kernel thread instead of in a workqueue.  workqueues run on
a per-cpu thread, but when unconfiguring we might need to
force the thread to execute on another cpu if the current cpu
is the one being removed.  Removal of a cpu also causes the kernel
to destroy that cpu's workqueue running thread.

Another issue on removal is that we may have interrupts still
pointing to the cpu-to-be-removed.  So new code will be needed
to walk the active INO list and retarget those cpus as-needed.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:40 -07:00
David S. Miller b3e13fbeb9 [SPARC64]: Fix setting of variables in LDOM guest.
There is a special domain services capability for setting
variables in the OBP options node.  Guests don't have permanent
store for the OBP variables like a normal system, so they are
instead maintained in the LDOM control node or in the SC.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:36 -07:00
David S. Miller 83292e0a9c [SPARC64]: Fix MD property lifetime bugs.
Property values cannot be referenced outside of
mdesc_grab()/mdesc_release() pairs.  The only major
offender was the VIO bus layer, easily fixed.

Add some commentary to mdesc.h describing these rules.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:33 -07:00
David S. Miller 43fdf27470 [SPARC64]: Abstract out mdesc accesses for better MD update handling.
Since we have to be able to handle MD updates, having an in-tree
set of data structures representing the MD objects actually makes
things more painful.

The MD itself is easy to parse, and we can implement the existing
interfaces using direct parsing of the MD binary image.

The MD is now reference counted, so accesses have to now take the
form:

	handle = mdesc_grab();

	... operations on MD ...

	mdesc_release(handle);

The only remaining issue are cases where code holds on to references
to MD property values.  mdesc_get_property() returns a direct pointer
to the property value, most cases just pull in the information they
need and discard the pointer, but there are few that use the pointer
directly over a long lifetime.  Those will be fixed up in a subsequent
changeset.

A preliminary handler for MD update events from domain services is
there, it is rudimentry but it works and handles all of the reference
counting.  It does not check the generation number of the MDs,
and it does not generate a "add/delete" list for notification to
interesting parties about MD changes but that will be forthcoming.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:28 -07:00
David S. Miller 133f09a169 [SPARC64]: Use more mearningful names for IRQ registry.
All of the interrupts say "LDX RX" and "LDX TX" currently
which is next to useless.  Put a device specific prefix
before "RX" and "TX" instead which makes it much more
useful.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:24 -07:00
David S. Miller e450992d13 [SPARC64]: Initial domain-services driver.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:20 -07:00
David S. Miller 13077d8028 [SPARC64]: Export powerd facilities for external entities.
Besides the existing usage for power-button interrupts, we'll
want to make use of this code for domain-services where the
LDOM manager can send reboot requests to the guest node.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:16 -07:00
David S. Miller 2c4f4ecb7a [SPARC64]: Add domain-services nodes to VIO device tree.
They sit under the root of the MD tree unlike the rest of
the LDC channel based virtual devices.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:13 -07:00
David S. Miller cb48123584 [SPARC64]: Assorted LDC bug cures.
1) LDC_MODE_RELIABLE is deprecated an unused by anything, plus
   it and LDC_MODE_STREAM were mis-numbered.

2) read_stream() should try to read as much as possible into
   the per-LDC stream buffer area, so do not trim the read_nonraw()
   length by the caller's size parameter.

3) Send data ACKs when necessary in read_nonraw().

4) In read_nonraw() when we get a pure ACK, advance the RX head
   unconditionally past it.

5) Provide the ACKID field in the ldcdgb() packet dump in read_nonraw().
   This helps debugging stream mode LDC channel problems.

6) Decrease verbosity of rx_data_wait() so that it is more useful.
   A debugging message each loop iteration is too much.

7) In process_data_ack() stop the loop checking when we hit lp->tx_tail
   not lp->tx_head.

8) Set the seqid field properly in send_data_nack().

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:09 -07:00
David S. Miller 5a606b72a4 [SPARC64]: Do not ACK an INO if it is disabled or inprogress.
This is also a partial workaround for a bug in the LDOM firmware which
double-transmits RX inos during high load.  Without this, such an
event causes the kernel to loop forever in the interrupt call chain
ACK'ing but never actually running the IRQ handler (and thus clearing
the interrupt condition in the device).

There is still a bad potential effect when double INOs occur,
not covered by this changeset.  Namely, if the INO is already on
the per-cpu INO vector list, we still blindly re-insert it and
thus we can end up losing interrupts already linked in after
it.

We could deal with that by traversing the list before insertion,
but that's too expensive for this edge case.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:05 -07:00
David S. Miller e53e97ce3c [SPARC64]: Add LDOM virtual channel driver and VIO device layer.
Virtual devices on Sun Logical Domains are built on top
of a virtual channel framework.  This, with help of hypervisor
interfaces, provides a link layer protocol with basic
handshaking over which virtual device clients and servers
communicate.

Built on top of this is a VIO device protocol which has it's
own handshaking and message types.  At this layer attributes
are exchanged (disk size, network device addresses, etc.)
descriptor rings are registered, and data transfers are
triggers and replied to.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:03:18 -07:00
Matthew Wilcox 36e235901f PCI: Only build PCI syscalls on architectures that want them
The PCI syscalls are built on every architecture except X86, but only
a few have ever hooked them up.  Use a new Kconfig symbol to save a
couple of kB on the architectures that have never used the syscalls.
Tested on x86 and ia64 only.

Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:02:13 -07:00
Auke Kok b8a3a5214d PCI: read revision ID by default
Currently there are 97 occurrences where drivers need the pci
revision ID. We can do this once for all devices. Even the pci
subsystem needs the revision several times for quirks. The extra
u8 member pads out nicely in the pci_dev struct.

Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:02:09 -07:00
Ingo Molnar 0437e109e1 sched: zap the migration init / cache-hot balancing code
the SMP load-balancer uses the boot-time migration-cost estimation
code to attempt to improve the quality of balancing. The reason for
this code is that the discrete priority queues do not preserve
the order of scheduling accurately, so the load-balancer skips
tasks that were running on a CPU 'recently'.

this code is fundamental fragile: the boot-time migration cost detector
doesnt really work on systems that had large L3 caches, it caused boot
delays on large systems and the whole cache-hot concept made the
balancing code pretty undeterministic as well.

(and hey, i wrote most of it, so i can say it out loud that it sucks ;-)

under CFS the same purpose of cache affinity can be achieved without
any special cache-hot special-case: tasks are sorted in the 'timeline'
tree and the SMP balancer picks tasks from the left side of the
tree, thus the most cache-cold task is balanced automatically.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-07-09 18:51:57 +02:00
David S. Miller a357b8f42e [SPARC64]: Need to set state to IDLE during sun4v IRQ enable.
This fixes hypervisor console interrupts on LDOM guests.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-26 00:13:31 -07:00
David S. Miller 1245088400 [SPARC64]: Fix VIRQ enabling.
We were doing the wrong call to turn them on, and also
when enabling we need to forcefully set the state to IDLE.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-26 00:13:09 -07:00
David S. Miller fc395f8d58 [SPARC64]: Fix args to sun4v_ldc_revoke().
First argument is LDC channel ID, then mapping cookie,
then the MTE revoke cookie.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-13 00:01:27 -07:00
David S. Miller 56f5c0bd50 [SPARC64]: Fix IO/MEM space sizing for PCI.
In pci_determine_mem_io_space(), do not hard code the region sizes.
Instead, use the values given to us in the ranges property.

Thanks goes to Mikael Petterson for the original Xorg failure
bug repoert, and strace dumps from Mikael and Dmitry Artamonow.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-13 00:01:19 -07:00
David S. Miller 4a907dec98 [SPARC64]: Wire up cookie based sun4v interrupt registry.
This will be used for logical domain channel interrupts.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-13 00:01:04 -07:00
David S. Miller 8c2786cfa6 [SPARC64]: Handle PCI bridges without 'ranges' property.
This fixes the IDE controller not showing up on Netra-T1
systems.

Just like Simba bridges, some PCI bridges can lack the
'ranges' OBP property.  So we handle this similarly to
the existing Simba code:

1) In of_device register address resolving, we push the
   translation to the parent.

2) In PCI device scanning, we interrogate the PCI config
   space registers of the PCI bus device in order to resolve
   the resources, just like the generic Linux PCI probing
   code does.

With much help and testing from Fabio, who also reported
the initial problem.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Fabio Massimo Di Nitto <fabbione@ubuntu.com>
2007-06-07 21:59:44 -07:00
Robert P. J. Day ea1ff19ce0 [SPARC64]: Include <linux/rwsem.h> instead of <asm/rwsem.h>.
To be consistent with other architectures, include the generic version
of rwsem.h.

Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-07 20:24:50 -07:00
David S. Miller ec4d18f219 [SPARC64]: Fix SBUS IRQ regression caused by PCI-E driver.
We used to access the 64-bit IRQ IMAP and ICLR registers of bus
controllers 4-bytes in and as a 32-bit register word, since only the
low 32-bits were relevant.  This seemed like a good idea at the time.

But the PCI-E controller requires full 8-byte 64-bit access to
these registers, so we switched over to accessing them fully.

SBUS was not adjusted properly, which broke interrupts completely.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-07 16:59:51 -07:00
David S. Miller 321566c250 [SPARC64]: Fix 2 bugs in PCI Sabre bus scanning.
If we are on hummingbird, bus runs at 66MHZ.

pbm->pci_bus should be setup with the result of pci_scan_one_pbm()
or else we deref NULL pointers in the error interrupt handlers.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-07 16:59:46 -07:00
David S. Miller a2f9f6bbb3 [SPARC64]: Fix {mc,smt}_capable().
It's not just sun4v hypervisor platforms that should return true
for this, sun4u with UltraSPARC-IV should return true too.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-04 21:50:05 -07:00
David S. Miller 5cd342df96 [SPARC64]: Make core and sibling groups equal on UltraSPARC-IV.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-04 21:50:02 -07:00
David S. Miller f78eae2e6f [SPARC64]: Proper multi-core scheduling support.
The scheduling domain hierarchy is:

   all cpus -->
      cpus that share an instruction cache -->
          cpus that share an integer execution unit

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-04 21:50:00 -07:00
David Miller d887ab3a9b [SPARC64]: Provide mmu statistics via sysfs.
If the system supports hypervisor based statistics, allow them to
be fetched, enabled, and disabled via sysfs.

Enable and disable via the boolean:

/sys/devices/systems/cpu/cpuN/mmustat_enable

Statistic values are provided under:

/sys/devices/systems/cpu/cpuN/mmu_status/

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-04 21:49:57 -07:00
David Miller 48b6735640 [SPARC64]: Fix service channel hypervisor function names.
sed 's/scv/svc/'

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-04 21:49:54 -07:00
David S. Miller d1f253e60a [SPARC64]: Export basic cpu properties via sysfs.
Cache sizes, udelay_val, and clock_tick.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-04 21:49:51 -07:00
David S. Miller eff3414b72 [SPARC64]: Move topology init code into new file, sysfs.c
Also, use per-cpu data for struct cpu.  Calling kmalloc for
each cpu in topology_init() is just plain clumsy.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-04 21:49:50 -07:00
Linus Torvalds 54ca412336 Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-fix
* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-fix:
  sparc64: fix alignment bug in linker definition script
2007-05-31 12:33:16 -07:00
David S. Miller dbbe3cb8cf [SPARC64]: Add missing NCS and SVC hypervisor interfaces.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-31 01:52:48 -07:00
Sam Ravnborg 4096b46f01 sparc64: fix alignment bug in linker definition script
The RO_DATA section were hardcoded to a specific
alignment in include/asm-generic/vmlinux.h.
But for sparc64 this did not match the PAGE_SIZE.

Introduce a new section definition named:
RO_DATA that takes actual alignment as parameter.
RODATA are provided for backward compatibility.

On top of this avoid hardcoding alignment for
sparc64 in reset of the script
Fix is build-tested on sparc64 + x86_64.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2007-05-29 21:29:00 +02:00
David S. Miller 7db35f31cb [SPARC64]: Fill holes in hypervisor APIs and fix KTSB registry.
Several interfaces were missing and others misnumbered or
improperly documented.

Also, make sure to check the return value when registering
the kernel TSBs with the hypervisor.  This helped to find
the 4MB kernel TSB alignment bug fixed in a previous changeset.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:52:15 -07:00
David S. Miller 2d9e2763c2 [SPARC64]: Fix two bugs wrt. kernel 4MB TSB.
1) The TSB lookup was not using the correct hash mask.

2) It was not aligned on a boundary equal to it's size,
   which is required by the sun4v Hypervisor.

wasn't having it's return value checked, and that bug will be fixed up
as well in a subsequent changeset.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:51:38 -07:00
David S. Miller 679292993c [SPARC64]: Fix _PAGE_EXEC_4U check in sun4u I-TLB miss handler.
It was using an immediate _PAGE_EXEC_4U value in an 'and'
instruction to perform the test.  This doesn't work because
the immediate field is signed 13-bit, this the mask being
tested against the PTE was 0x1000 sign-extended to 32-bits
instead of just plain 0x1000.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:50:15 -07:00
Horst H. von Brand 7189859f28 [SPARC64]: arch/sparc64/time.c doesn't compile on Ultra 1 (no PCI)
This is bug 8540 on bugzilla.kernel.org

arch/sparc64/time.c contains references to assorted bq4802 stuff if
CONFIG_PCI is not set, and compile fails. I #ifdef'ed out everything
that looks PCI-ish in that file.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:50:02 -07:00
David S. Miller 22adb358e8 [SPARC64]: Eliminate NR_CPUS limitations.
Cheetah systems can have cpuids as large as 1023, although physical
systems don't have that many cpus.

Only three limitations existed in the kernel preventing arbitrary
NR_CPUS values:

1) dcache dirty cpu state stored in page->flags on
   D-cache aliasing platforms.  With some build time
   calculations and some build-time BUG checks on
   page->flags layout, this one was easily solved.

2) The cheetah XCALL delivery code could only handle
   a cpumask with up to 32 cpus set.  Some simple looping
   logic clears that up too.

3) thread_info->cpu was a u8, easily changed to a u16.

There are a few spots in the kernel that still put NR_CPUS
sized arrays on the kernel stack, but that's not a sparc64
specific problem.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:49:49 -07:00
David S. Miller 5cbc307373 [SPARC64]: Use machine description and OBP properly for cpu probing.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:49:41 -07:00
David S. Miller e01c0d6d8c [SPARC64]: Negotiate hypervisor API for PCI services.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:49:34 -07:00
David S. Miller 22d6a1cba3 [SPARC64]: Report proper system soft state to the hypervisor.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:49:29 -07:00
David S. Miller 36b48973b8 [SPARC64]: Fix typo in sun4v_hvapi_register error handling.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:49:21 -07:00
David S. Miller 5840fc66bb [SPARC64]: PCI device scan is way too verbose by default.
These messages were very useful when bringing up the
OBP based PCI device scan code, but it's just a lot
of noise every bootup now especially on big machines.

The messages can be re-enabled via 'ofpci_debug=1' on
the kernel command line.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:49:18 -07:00
David S. Miller 59db8102bd [SPARC64]: Don't be picky about virtual-dma values on sun4v.
Handle arbitrary base and length values as long as they
are multiples of IO_PAGE_SIZE.

Bug found by Arun Kumar Rao.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:49:15 -07:00
Sam Ravnborg ca967258b6 all-archs: consolidate .data section definition in asm-generic
With this consolidation we can now modify the .data
section definition in one spot for all archs.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2007-05-19 09:11:57 +02:00
Sam Ravnborg 7664709b44 all-archs: consolidate .text section definition in asm-generic
Move definition of .text section to asm-generic.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2007-05-19 09:11:57 +02:00
David S. Miller 03983ab858 [SPARC64]: Fix sched_clock() et al.
SPARC64_NSEC_PER_CYC_SHIFT was set too high.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-17 22:55:26 -07:00
David S. Miller c7754d465b [SPARC64]: Add hypervisor API negotiation and fix console bugs.
Hypervisor interfaces need to be negotiated in order to use
some API calls reliably.  So add a small set of interfaces
to request API versions and query current settings.

This allows us to fix some bugs in the hypervisor console:

1) If we can negotiate API group CORE of at least major 1
   minor 1 we can use con_read and con_write which can improve
   console performance quite a bit.

2) When we do a console write request, we should hold the
   spinlock around the whole request, not a byte at a time.
   What would happen is that it's easy for output from
   different cpus to get mixed with each other.

3) Use consistent udelay() based polling, udelay(1) each
   loop with a limit of 1000 polls to handle stuck hypervisor
   console.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-15 20:23:02 -07:00
David S. Miller be35cf01a9 [SPARC64]: Update defconfig.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-14 04:19:01 -07:00
David S. Miller 17f34f0ec9 [SPARC64]: Add missing cpus_empty() check in hypervisor xcall handling.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-14 02:01:52 -07:00
David S. Miller 49d23cfcec [SPARC64]: Be more resiliant with PCI I/O space regs.
If we miss on the ranges, just toss the translation up to the parent
instead of failing.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-13 22:01:18 -07:00
David S. Miller 8354c5b726 [SPARC]: Wire up signalfd/timerfd/eventfd syscalls.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-11 22:06:51 -07:00
David S. Miller d037e0532e [SPARC64]: Add support for bq4802 TOD chip, as found on ultra45.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-11 21:39:27 -07:00
David S. Miller 95d71e663e [SPARC64]: Correct FIRE_IOMMU_FLUSHINV register offset.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-11 21:39:26 -07:00
David S. Miller d77311f942 [SPARC64]: Update defconfig.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-11 21:39:24 -07:00
David S. Miller f16537bac7 [SPARC64]: pci_resource_adjust() cannot be __init.
Noticed by Meelis Roos.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-11 21:39:22 -07:00
Simon Arlott e5dd42e4fb [SPARC64]: Spelling fixes.
Spelling fixes in arch/sparc64/.

Signed-off-by: Simon Arlott <simon@fire.lp0.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-11 21:39:21 -07:00
David S. Miller e9429eacd7 [SPARC64]: Kill LARGE_ALLOCS and update defconfig.
Let's use SLUB, since it works now, in order to get it
tested a bit.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-11 21:39:19 -07:00
Amy Griffis e54dc2431d [PATCH] audit signal recipients
When auditing syscalls that send signals, log the pid and security
context for each target process. Optimize the data collection by
adding a counter for signal-related rules, and avoiding allocating an
aux struct unless we have more than one target process. For process
groups, collect pid/context data in blocks of 16. Move the
audit_signal_info() hook up in check_kill_permission() so we audit
attempts where permission is denied.

Signed-off-by: Amy Griffis <amy.griffis@hp.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2007-05-11 05:38:25 -04:00
Amy Griffis 7f13da40e3 [PATCH] add SIGNAL syscall class (v3)
Add a syscall class for sending signals.

Signed-off-by: Amy Griffis <amy.griffis@hp.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2007-05-11 05:38:25 -04:00
Linus Torvalds 0ab598099c Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
  [SPARC64]: Use alloc_pci_dev() in PCI bus probes.
  [SPARC64]: Bump PROMINTR_MAX to 32.
  [SPARC64]: Fix recursion in PROM tree building.
  [SERIAL] sunzilog: Interrupt enable before ISR handler installed
  [SPARC64] PCI: Consolidate PCI access code into pci_common.c
2007-05-10 13:32:05 -07:00
David S. Miller 26e6385f14 [SPARC64]: Use alloc_pci_dev() in PCI bus probes.
Otherwise MSI explodes because pci_msi_init_pci_dev() does not
get invoked.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-10 02:16:27 -07:00
David S. Miller aa5242e78f [SPARC64]: Fix recursion in PROM tree building.
Use iteration for scanning of PROM node siblings.

Based upon a patch by Greg Onufer, who found this bug.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-10 00:53:29 -07:00
Fernando Luis Vazquez Cao 2f4dfe206a Remove hardcoding of hard_smp_processor_id on UP systems
With the advent of kdump, the assumption that the boot CPU when booting an UP
kernel is always the CPU with a particular hardware ID (often 0) (usually
referred to as BSP on some architectures) is not valid anymore.  The reason
being that the dump capture kernel boots on the crashed CPU (the CPU that
invoked crash_kexec), which may be or may not be that particular CPU.

Move definition of hard_smp_processor_id for the UP case to
architecture-specific code ("asm/smp.h") where it belongs, so that each
architecture can provide its own implementation.

Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Cc: "Luck, Tony" <tony.luck@intel.com>
Acked-by: Andi Kleen <ak@suse.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-09 12:30:48 -07:00
David S. Miller ca3dd88e41 [SPARC64] PCI: Consolidate PCI access code into pci_common.c
All the sun4u controllers do the same thing to compute the physical
I/O address to poke, and we can move the sun4v code into this common
location too.

This one needs a bit of testing, in particular the Sabre code had some
funny stuff that would break up u16 and/or u32 accesses into pieces
and I didn't think that was needed any more.  If it is we need to find
out why and add back code to do it again.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-09 02:35:27 -07:00
David S. Miller 127cda1e8c [SPARC64]: Optimize fault kprobe handling just like powerpc.
And eliminate DIE_GPF while we're at it.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-08 18:25:14 -07:00
David S. Miller 6c1142602c [SPARC]: Wire up utimensat syscall.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-08 17:50:14 -07:00
David S. Miller af80318eb7 [SPARC64]: Fix request_irq() ignored result warnings in PCI controller code.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-08 17:23:31 -07:00
David S. Miller c57c2ffb15 [SPARC64]: Kill asm-sparc64/pbm.h
Everything it contains can be hidden in pci_impl.h

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-08 16:43:08 -07:00
David S. Miller 28113a9941 [SPARC64]: Removal of trivial pci_controller_info uses.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-08 16:41:44 -07:00
David S. Miller 6c108f1299 [SPARC64]: Move index info pci_pbm_info.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-08 16:41:40 -07:00
David S. Miller e9870c4c0a [SPARC64]: Move {setup,teardown}_msi_irq into pci_pbm_info.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-08 16:41:36 -07:00
David S. Miller f1cd8de2c9 [SPARC64]: Move pci_ops into pci_pbm_info.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-08 16:41:32 -07:00
David S. Miller 96a496fd49 [SPARC64] SBUS: Error interrupt registry cleanups.
Do not use IRQF_SHARED, these interrupt numbers should all
be unique.

Also use name strings without spaces in them just like
PCI controller drivers do, for consistency.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-08 16:41:28 -07:00
David S. Miller 34768bc832 [SPARC64] PCI: Use root list of pbm's instead of pci_controller_info's
The idea is to move more and more things into the pbm,
with the eventual goal of eliminating the pci_controller_info
entirely as there really isn't any need for it.

This stage of the transformations requires some reworking of
the PCI error interrupt handling.

It might be tricky to get rid of the pci_controller_info parenting for
a few reasons:

1) When we get an uncorrectable or correctable error we want
   to interrogate the IOMMU and streaming cache of both
   PBMs for error status.  These errors come from the UPA
   front-end which is shared between the two PBM PCI bus
   segments.

   Historically speaking this is why I choose the datastructure
   hierarchy of pci_controller_info-->pci_pbm_info

2) The probing does a portid/devhandle match to look for the
   'other' pbm, but this is entirely an artifact and can be
   eliminated trivially.

What we could do to solve #1 is to have a "buddy" pointer from one pbm
to another.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-08 16:41:24 -07:00
David S. Miller cfa0652c4e [SPARC64] PCI: Use common routine to fetch PBM properties.
Namely bus-range and ino-bitmap.

This allows us also to eliminate pci_controller_info's
pci_{first,last}_busno fields as only the pbm ones are
used now.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-08 16:41:12 -07:00
Ulrich Drepper 1c710c896e utimensat implementation
Implement utimensat(2) which is an extension to futimesat(2) in that it

a) supports nano-second resolution for the timestamps
b) allows to selectively ignore the atime/mtime value
c) allows to selectively use the current time for either atime or mtime
d) supports changing the atime/mtime of a symlink itself along the lines
   of the BSD lutimes(3) functions

For this change the internally used do_utimes() functions was changed to
accept a timespec time value and an additional flags parameter.

Additionally the sys_utime function was changed to match compat_sys_utime
which already use do_utimes instead of duplicating the work.

Also, the completely missing futimensat() functionality is added.  We have
such a function in glibc but we have to resort to using /proc/self/fd/* which
not everybody likes (chroot etc).

Test application (the syscall number will need per-arch editing):

#include <errno.h>
#include <fcntl.h>
#include <time.h>
#include <sys/time.h>
#include <stddef.h>
#include <syscall.h>

#define __NR_utimensat 280

#define UTIME_NOW       ((1l << 30) - 1l)
#define UTIME_OMIT      ((1l << 30) - 2l)

int
main(void)
{
  int status = 0;

  int fd = open("ttt", O_RDWR|O_CREAT|O_EXCL, 0666);
  if (fd == -1)
    error (1, errno, "failed to create test file \"ttt\"");

  struct stat64 st1;
  if (fstat64 (fd, &st1) != 0)
    error (1, errno, "fstat failed");

  struct timespec t[2];
  t[0].tv_sec = 0;
  t[0].tv_nsec = 0;
  t[1].tv_sec = 0;
  t[1].tv_nsec = 0;
  if (syscall(__NR_utimensat, AT_FDCWD, "ttt", t, 0) != 0)
    error (1, errno, "utimensat failed");

  struct stat64 st2;
  if (fstat64 (fd, &st2) != 0)
    error (1, errno, "fstat failed");

  if (st2.st_atim.tv_sec != 0 || st2.st_atim.tv_nsec != 0)
    {
      puts ("atim not reset to zero");
      status = 1;
    }
  if (st2.st_mtim.tv_sec != 0 || st2.st_mtim.tv_nsec != 0)
    {
      puts ("mtim not reset to zero");
      status = 1;
    }
  if (status != 0)
    goto out;

  t[0] = st1.st_atim;
  t[1].tv_sec = 0;
  t[1].tv_nsec = UTIME_OMIT;
  if (syscall(__NR_utimensat, AT_FDCWD, "ttt", t, 0) != 0)
    error (1, errno, "utimensat failed");

  if (fstat64 (fd, &st2) != 0)
    error (1, errno, "fstat failed");

  if (st2.st_atim.tv_sec != st1.st_atim.tv_sec
      || st2.st_atim.tv_nsec != st1.st_atim.tv_nsec)
    {
      puts ("atim not set");
      status = 1;
    }
  if (st2.st_mtim.tv_sec != 0 || st2.st_mtim.tv_nsec != 0)
    {
      puts ("mtim changed from zero");
      status = 1;
    }
  if (status != 0)
    goto out;

  t[0].tv_sec = 0;
  t[0].tv_nsec = UTIME_OMIT;
  t[1] = st1.st_mtim;
  if (syscall(__NR_utimensat, AT_FDCWD, "ttt", t, 0) != 0)
    error (1, errno, "utimensat failed");

  if (fstat64 (fd, &st2) != 0)
    error (1, errno, "fstat failed");

  if (st2.st_atim.tv_sec != st1.st_atim.tv_sec
      || st2.st_atim.tv_nsec != st1.st_atim.tv_nsec)
    {
      puts ("mtim changed from original time");
      status = 1;
    }
  if (st2.st_mtim.tv_sec != st1.st_mtim.tv_sec
      || st2.st_mtim.tv_nsec != st1.st_mtim.tv_nsec)
    {
      puts ("mtim not set");
      status = 1;
    }
  if (status != 0)
    goto out;

  sleep (2);

  t[0].tv_sec = 0;
  t[0].tv_nsec = UTIME_NOW;
  t[1].tv_sec = 0;
  t[1].tv_nsec = UTIME_NOW;
  if (syscall(__NR_utimensat, AT_FDCWD, "ttt", t, 0) != 0)
    error (1, errno, "utimensat failed");

  if (fstat64 (fd, &st2) != 0)
    error (1, errno, "fstat failed");

  struct timeval tv;
  gettimeofday(&tv,NULL);

  if (st2.st_atim.tv_sec <= st1.st_atim.tv_sec
      || st2.st_atim.tv_sec > tv.tv_sec)
    {
      puts ("atim not set to NOW");
      status = 1;
    }
  if (st2.st_mtim.tv_sec <= st1.st_mtim.tv_sec
      || st2.st_mtim.tv_sec > tv.tv_sec)
    {
      puts ("mtim not set to NOW");
      status = 1;
    }

  if (symlink ("ttt", "tttsym") != 0)
    error (1, errno, "cannot create symlink");

  t[0].tv_sec = 0;
  t[0].tv_nsec = 0;
  t[1].tv_sec = 0;
  t[1].tv_nsec = 0;
  if (syscall(__NR_utimensat, AT_FDCWD, "tttsym", t, AT_SYMLINK_NOFOLLOW) != 0)
    error (1, errno, "utimensat failed");

  if (lstat64 ("tttsym", &st2) != 0)
    error (1, errno, "lstat failed");

  if (st2.st_atim.tv_sec != 0 || st2.st_atim.tv_nsec != 0)
    {
      puts ("symlink atim not reset to zero");
      status = 1;
    }
  if (st2.st_mtim.tv_sec != 0 || st2.st_mtim.tv_nsec != 0)
    {
      puts ("symlink mtim not reset to zero");
      status = 1;
    }
  if (status != 0)
    goto out;

  t[0].tv_sec = 1;
  t[0].tv_nsec = 0;
  t[1].tv_sec = 1;
  t[1].tv_nsec = 0;
  if (syscall(__NR_utimensat, fd, NULL, t, 0) != 0)
    error (1, errno, "utimensat failed");

  if (fstat64 (fd, &st2) != 0)
    error (1, errno, "fstat failed");

  if (st2.st_atim.tv_sec != 1 || st2.st_atim.tv_nsec != 0)
    {
      puts ("atim not reset to one");
      status = 1;
    }
  if (st2.st_mtim.tv_sec != 1 || st2.st_mtim.tv_nsec != 0)
    {
      puts ("mtim not reset to one");
      status = 1;
    }

  if (status == 0)
     puts ("all OK");

 out:
  close (fd);
  unlink ("ttt");
  unlink ("tttsym");

  return status;
}

[akpm@linux-foundation.org: add missing i386 syscall table entry]
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Cc: Alexey Dobriyan <adobriyan@openvz.org>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:18 -07:00
Randy Dunlap e63340ae6b header cleaning: don't include smp_lock.h when not used
Remove includes of <linux/smp_lock.h> where it is not used/needed.
Suggested by Al Viro.

Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc,
sparc64, and arm (all 59 defconfigs).

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:07 -07:00
Christoph Hellwig 1eeb66a1bb move die notifier handling to common code
This patch moves the die notifier handling to common code.  Previous
various architectures had exactly the same code for it.  Note that the new
code is compiled unconditionally, this should be understood as an appel to
the other architecture maintainer to implement support for it aswell (aka
sprinkling a notify_die or two in the proper place)

arm had a notifiy_die that did something totally different, I renamed it to
arm_notify_die as part of the patch and made it static to the file it's
declared and used at.  avr32 used to pass slightly less information through
this interface and I brought it into line with the other architectures.

[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: fix vmalloc_sync_all bustage]
[bryan.wu@analog.com: fix vmalloc_sync_all in nommu]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: <linux-arch@vger.kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:04 -07:00
Christoph Hellwig ab1b6f03a1 simplify the stacktrace code
Simplify the stacktrace code:

 - remove the unused task argument to save_stack_trace, it's always
   current
 - remove the all_contexts flag, it's alwasy 0

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Andi Kleen <ak@suse.de>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:14:58 -07:00
Linus Torvalds ef93127e4c Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
  [SERIAL] sunsu: Fix section mismatch warnings.
  [SPARC64]: pgtable_cache_init() should be __init.
  [SPARC64]: Fix section mismatch warnings in arch/sparc64/kernel/prom.c
  [SPARC64]: Fix section mismatch warnings in arch/sparc64/kernel/pci.c
  [SPARC64]: Fix section mismatch warnings in arch/sparc64/kernel/console.c
  [MM]: sparse_init() should be __init.
  [SPARC64]: Update defconfig.
  [VIDEO]: Add Sun XVR-2500 framebuffer driver.
  [VIDEO]: Add Sun XVR-500 framebuffer driver.
  [SPARC64]: SUN4U PCI-E controller support.
  [SPARC]: Fix comment typo in smp4m_blackbox_current().
  [SCSI] SUNESP: sun_esp.c needs linux/delay.h

Fix up conflict in arch/sparc64/mm/init.c manually due to removal of
pgtable_cache_init() through the -mm patches (even though that patch was
also by David ;)

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:22:48 -07:00
Benjamin Herrenschmidt ac35ee484d get_unmapped_area handles MAP_FIXED on sparc64
Handle MAP_FIXED in hugetlb_get_unmapped_area on sparc64 by just using
prepare_hugepage_range()

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: William Irwin <bill.irwin@oracle.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:12:56 -07:00
Christoph Lameter f0f3980b21 slab allocators: remove multiple alignment specifications
It is not necessary to tell the slab allocators to align to a cacheline
if an explicit alignment was already specified. It is rather confusing
to specify multiple alignments.

Make sure that the call sites only use one form of alignment.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:12:55 -07:00
Christoph Lameter 5af6083990 slab allocators: Remove obsolete SLAB_MUST_HWCACHE_ALIGN
This patch was recently posted to lkml and acked by Pekka.

The flag SLAB_MUST_HWCACHE_ALIGN is

1. Never checked by SLAB at all.

2. A duplicate of SLAB_HWCACHE_ALIGN for SLUB

3. Fulfills the role of SLAB_HWCACHE_ALIGN for SLOB.

The only remaining use is in sparc64 and ppc64 and their use there
reflects some earlier role that the slab flag once may have had. If
its specified then SLAB_HWCACHE_ALIGN is also specified.

The flag is confusing, inconsistent and has no purpose.

Remove it.

Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:12:55 -07:00
David Miller 3a2cba993b Quicklist support for sparc64
I ported this to sparc64 as per the patch below, tested on UP SunBlade1500 and
24 cpu Niagara T1000.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Andi Kleen <ak@suse.de>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: William Lee Irwin III <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:12:54 -07:00
David S. Miller e7f11aeed0 [SPARC64]: pgtable_cache_init() should be __init.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-07 00:02:46 -07:00
David S. Miller c35a376d60 [SPARC64]: Fix section mismatch warnings in arch/sparc64/kernel/prom.c
The IRQ translation init routines should all be __init.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-07 00:02:24 -07:00
David S. Miller a6009dda97 [SPARC64]: Fix section mismatch warnings in arch/sparc64/kernel/pci.c
apb_calc_first_last(), apb_fake_ranges(), pci_of_scan_bus(),
of_scan_pci_bridge(), pci_of_scan_bus(), and pci_scan_one_pbm()
should all be __devinit.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-07 00:01:38 -07:00
David S. Miller 23abc9ec6a [SPARC64]: Fix section mismatch warnings in arch/sparc64/kernel/console.c
probe_other_fhcs() and central_probe() should be __init

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-07 00:00:37 -07:00
David S. Miller 7db00552d9 [SPARC64]: Update defconfig.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-06 22:47:14 -07:00
David S. Miller 861fe90656 [SPARC64]: SUN4U PCI-E controller support.
Some minor refactoring in the generic code was necessary for
this:

1) This controller requires 8-byte access to the interrupt map
   and clear register.  They are 64-bits on all the other
   SBUS and PCI controllers anyways, so this was easy to cure.

2) The IMAP register has a different layout and some bits that we
   need to preserve, so use a read/modify/write when making
   changes to the IMAP register in generic code.

3) Flushing the entire IOMMU TLB is best done with a single write
   to a register on this PCI controller, add a iommu->iommu_flushinv
   for this.

Still lacks MSI support, that will come later.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-06 22:44:06 -07:00
Linus Torvalds ea62ccd00f Merge branch 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6
* 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6: (231 commits)
  [PATCH] i386: Don't delete cpu_devs data to identify different x86 types in late_initcall
  [PATCH] i386: type may be unused
  [PATCH] i386: Some additional chipset register values validation.
  [PATCH] i386: Add missing !X86_PAE dependincy to the 2G/2G split.
  [PATCH] x86-64: Don't exclude asm-offsets.c in Documentation/dontdiff
  [PATCH] i386: avoid redundant preempt_disable in __unlazy_fpu
  [PATCH] i386: white space fixes in i387.h
  [PATCH] i386: Drop noisy e820 debugging printks
  [PATCH] x86-64: Fix allnoconfig error in genapic_flat.c
  [PATCH] x86-64: Shut up warnings for vfat compat ioctls on other file systems
  [PATCH] x86-64: Share identical video.S between i386 and x86-64
  [PATCH] x86-64: Remove CONFIG_REORDER
  [PATCH] x86-64: Print type and size correctly for unknown compat ioctls
  [PATCH] i386: Remove copy_*_user BUG_ONs for (size < 0)
  [PATCH] i386: Little cleanups in smpboot.c
  [PATCH] x86-64: Don't enable NUMA for a single node in K8 NUMA scanning
  [PATCH] x86: Use RDTSCP for synchronous get_cycles if possible
  [PATCH] i386: Add X86_FEATURE_RDTSCP
  [PATCH] i386: Implement X86_FEATURE_SYNC_RDTSC on i386
  [PATCH] i386: Implement alternative_io for i386
  ...

Fix up trivial conflict in include/linux/highmem.h manually.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-05 14:55:20 -07:00
Linus Torvalds 7e20ef030d Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (49 commits)
  [SCTP]: Set assoc_id correctly during INIT collision.
  [SCTP]: Re-order SCTP initializations to avoid race with sctp_rcv()
  [SCTP]: Fix the SO_REUSEADDR handling to be similar to TCP.
  [SCTP]: Verify all destination ports in sctp_connectx.
  [XFRM] SPD info TLV aggregation
  [XFRM] SAD info TLV aggregationx
  [AF_RXRPC]: Sort out MTU handling.
  [AF_IUCV/IUCV] : Add missing section annotations
  [AF_IUCV]: Implementation of a skb backlog queue
  [NETLINK]: Remove bogus BUG_ON
  [IPV6]: Some cleanups in include/net/ipv6.h
  [TCP]: zero out rx_opt in tcp_disconnect()
  [BNX2]: Fix TSO problem with small MSS.
  [NET]: Rework dev_base via list_head (v3)
  [TCP] Highspeed: Limited slow-start is nowadays in tcp_slow_start
  [BNX2]: Update version and reldate.
  [BNX2]: Print bus information for PCIE devices.
  [BNX2]: Add 1-shot MSI handler for 5709.
  [BNX2]: Restructure PHY event handling.
  [BNX2]: Add indirect spinlock.
  ...
2007-05-04 19:36:58 -07:00
Pavel Emelianov 7562f876cd [NET]: Rework dev_base via list_head (v3)
Cleanup of dev_base list use, with the aim to simplify making device
list per-namespace. In almost every occasion, use of dev_base variable
and dev->next pointer could be easily replaced by for_each_netdev
loop. A few most complicated places were converted to using
first_netdev()/next_netdev().

Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Acked-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03 15:13:45 -07:00
Michael Ellerman 7fe3730de7 MSI: arch must connect the irq and the msi_desc
set_irq_msi() currently connects an irq_desc to an msi_desc. The archs call
it at some point in their setup routine, and then the generic code sets up the
reverse mapping from the msi_desc back to the irq.

set_irq_msi() should do both connections, making it the one and only call
required to connect an irq with it's MSI desc and vice versa.

The arch code MUST call set_irq_msi(), and it must do so only once it's sure
it's not going to fail the irq allocation.

Given that there's no need for the arch to return the irq anymore, the return
value from the arch setup routine just becomes 0 for success and anything else
for failure.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 19:02:38 -07:00
Dan Williams f282b97021 msi: introduce ARCH_SUPPORTS_MSI Kconfig option (rev2)
Allows architectures to advertise that they support MSI rather than listing
each architecture as a PCI_MSI dependency.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 19:02:37 -07:00
Jeremy Fitzhardinge b6e3590f81 [PATCH] x86: Allow percpu variables to be page-aligned
Let's allow page-alignment in general for per-cpu data (wanted by Xen, and
Ingo suggested KVM as well).

Because larger alignments can use more room, we increase the max per-cpu
memory to 64k rather than 32k: it's getting a little tight.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2007-05-02 19:27:12 +02:00
David S. Miller 16ce82d846 [SPARC64]: Convert PCI over to generic struct iommu/strbuf.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 21:08:21 -07:00
David S. Miller 3e4d26508a [SPARC64]: Convert SBUS over to generic iommu/strbuf structs.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:44 -07:00
David S. Miller 9b3627f389 [SPARC64]: Consolidate {sbus,pci}_iommu_arena.
Move to asm-sparc64/iommu.h and rename to plain "iommu_arena".

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:42 -07:00
Stephen Rothwell 3dfe10ee7c [SPARC64]: constify some paramaters of OF routines
This starts bringing the PowerPC and Sparc64 implemetations back closer
together.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:40 -07:00
David S. Miller a165b4205e [SPARC64]: Fix PCI rework to adhere to of_get_property() const return.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:37 -07:00
David S. Miller f1cfdb55f1 [SPARC64]: Document and fix calculation of pages_avail.
It should be set to the total number of pages that the
system will really have available after things like
initmem, the bootmem map, and initrd are freed up.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:36 -07:00
David S. Miller 0f3e25049e [SPARC64]: Make sure pbm->prom_node is setup easly enough in psycho.c
It needs to be ready before we invoke pci_determine_mem_io_space().

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:35 -07:00
David S. Miller 3996465392 [SPARC64]: Use bootmem_bootmap_pages() in choose_bootmap_pfn().
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:34 -07:00
David S. Miller b93f262023 [SPARC64]: Add proper header file extern for cmdline_memory_size.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:33 -07:00
David S. Miller 9753f0d650 [SPARC64]: Kill sparc_ultra_dump_{i,d}tlb()
While useful in odd circumstances to debug something, they are
normally totally unused and anyone can fetch this code out of the
history if they really need it.

And in any event, the person who needs this kind of code is usually me
:-)

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:32 -07:00
David S. Miller 85f1e1f660 [SPARC64]: Use DECLARE_BITMAP and BITS_TO_LONGS in mm/init.c
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:31 -07:00
David S. Miller 5be4a96367 [SPARC64]: Give move verbose show_mem() output just like i386.
We now report everything i386 does except for highmem which
doesn't apply.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:30 -07:00
David S. Miller 28256ca2e0 [SPARC64]: Mark show_mem() printk's with KERN_INFO.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:28 -07:00
David S. Miller a94aa25306 [SPARC64]: Kill kvaddr_to_phys() and friends.
Just inline it into flush_icache_range() which is the only
user.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:27 -07:00
David S. Miller 4be5c34dc4 [SPARC64]: Privatize sun4u_get_pte() and fix name.
__get_phys is only called from init.c as is prom_virt_to_phys(),
__get_iospace() is not called at all, and sun4u_get_pte() is largely
misnamed.

Privatize the implementation and helper functions of
sun4u_get_phys() to mm/init.c, and rename to
kvaddr_to_paddr().

The only used of this thing is flush_icache_range(), and thus
things can be considerably further simplified.  For example,
we should only see module or PAGE_OFFSET kernel addresses here,
so we don't need the OBP firmware range handling at all.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:26 -07:00
David S. Miller a0963bdfb9 [SPARC64]: Kill _start[]/_end[] declarations in mm/init.c
We already get those from asm/sections.h

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:25 -07:00
David S. Miller 0015d3d68c [SPARC64]: Simplify read_obp_memory().
Kick out empty entries as soon as we spot them, and use memmove()
instead of a silly loop to make the operation more clear.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:23 -07:00
David S. Miller d78d0891d3 [SPARC64]: Use SPARSEMEM_STATIC
Decrease the SECTION_SIZE_BITS --> MAX_PHYSADDR_BITS
range a little bit.

The cost of going to SPARSEMEM_STATIC becomes 8K of BSS space, and in
return we save a pointer dereferences on every page struct lookup.
Even better we hit the main kernel image for the base address which is
in a hugepage locked TLB entry.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:22 -07:00
David S. Miller 28f57e774d [SPARC64]: Force dummy host controller onto bus zero.
This helps deal with the invisible bridge that sits between
the host controller and the top-most visisble PCI devices
on hypervisor systems.

For example, on T1000 the bus-range property says 2 --> 4
and so there is a PCI express bridge at bus 2, devfn 0, etc.

So if we don't force the dummy host controller to bus zero,
we'll try to create two devices with the same domain/bus/devfn
triplet.

Also, add some more log diagnostics to make debugging stuff like this
easyer.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:20 -07:00
David S. Miller 97b3cf050b [SPARC64]: Add dummy host controller to root of all PCI domains.
We fake up a dummy one in all cases because that is the simplest
thing to do and it happens to be necessary for hypervisor systems.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:19 -07:00
David S. Miller c6e87566ea [SPARC64]: Const'ify pci_iommu_ops.
Based upon a similar patch for x86_64 written by
Stephen Hemminger.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:18 -07:00
David S. Miller 0bba2dd823 [SPARC64]: Kill pbm->pci_first_slot.
Set but never used.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:17 -07:00
David S. Miller 3875c5c02d [SPARC64]: Kill pci_controller->pbms_same_domain
We don't do the "Simba APB is a PBM" bogosity for Sabre
controllers any longer, so this pbms_same_domain thing
is no longer necessary.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:16 -07:00
David S. Miller 8d3aee9375 [SPARC64]: Kill pci_controller->base_address_update().
Implemented but never actually used.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:15 -07:00
David S. Miller 0bae5f81b6 [SPARC64]: Kill pci_controller->resource_adjust()
All the implementations can be identical and generic, so
no need for controller specific methods.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:14 -07:00
David S. Miller 3487a1f9e7 [SPARC64]: Kill PBM ranges software state.
It is only used in one spot and we can just fetch the
OF property right there.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:13 -07:00
David S. Miller 229177c7f3 [SPARC64]: Kill PBM intmap software state.
Set but never used.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:12 -07:00
David S. Miller 9fd8b64761 [SPARC64]: Consolidate PCI mem/io resource determination.
It can be done for every PCI configuration using OF properties.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:11 -07:00
David S. Miller 01f94c4a6c [SPARC64]: Fix sabre pci controllers with new probing scheme.
The SIMBA APB bridge is strange, it is a PCI bridge but it lacks
some standard OF properties, in particular it lacks a 'ranges'
property.

What you have to do is read the IO and MEM range registers in
the APB bridge to determine the ranges handled by each bridge.
So fill in the bus resources by doing that.

Since we now handle this quirk in the generic PCI and OF device
probing layers, we can flat out eliminate all of that code from
the sabre pci controller driver.

In fact we can thus eliminate completely another quirk of the sabre
driver.  It tried to make the two APB bridges look like PBMs but that
makes zero sense now (and it's questionable whether it ever made sense).
So now just use pbm_A and probe the whole PCI hierarchy using that as
the root.

This simplification allows many future cleanups to occur.

Also, I've found yet another quirk that needs to be worked around
while testing this.  You can't use the 'class-code' OF firmware
property, especially for IDE controllers.  We have to read the value
out of PCI config space or else we'll see the value the device was
showing before it was programmed into native mode.

I'm starting to think it might be wise to just read all of the values
out of PCI config space instead of using the OF properties. :-/

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:10 -07:00
David S. Miller a378fd0ee8 [SPARC64]: Fix obppath pci device sysfs creation.
Need to traverse recursively down child busses else we only
get the file created under devices at the top-level.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:09 -07:00
David S. Miller bc606f3c91 [SPARC64]: Minor cleanups to schizo pci controller driver.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:08 -07:00
David S. Miller 1e8a8cc52d [SPARC64]: Internalize pci_memspace_mask.
The only user was bus_dvma_to_mem() which is no longer used
by any driver, so kill that, and the export of pci_memspace_mask.

The only user now is the PCI mmap support code.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:07 -07:00
David S. Miller a2fb23af1c [SPARC64]: Probe PCI bus using OF device tree.
Almost entirely taken from the 64-bit PowerPC PCI code.

This allowed to eliminate a ton of cruft from the sparc64
PCI layer.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:06 -07:00
David S. Miller deb66c4521 [SPARC64] isa: Convert to use pci_device_to_OF_node().
Also, do not try to compute resources by hand, instead use
the pre-computed ones in the of_device.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:05 -07:00
David S. Miller 1327e9b62f [SPARC64] ebus: Convert to use pci_device_to_OF_node().
Also, we don't need to store or use the PBM so kill that
from the linux_ebus.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:04 -07:00
David S. Miller a8b8814bdf [SPARC]: Use strcasecmp for OFW property name comparisons.
This allows us to simplify sharing code with powerpc which
has properties that have various forms of capitalization
when on the sparc64 side the property is all lower-case.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:54:41 -07:00
Stephen Rothwell 64b94701c0 [SPARC/64]: constify of_get_property return
Finally, we actually change the functions themselves.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:54:35 -07:00
Stephen Rothwell 6a23acf390 [SPARC64]: constify of_get_property return: arch/sparc64
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:54:24 -07:00
Tony Breeds 644923d4a5 [SPARC64]: Small cleanups time.c
- Removes days_in_mo[], as it's almost identical to month_days[]
- Use the leapyear() macro
- Line length wrapping.

Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:54:20 -07:00
David S. Miller d62c6f093a [SPARC64]: Fix sparc64_next_event() error return.
It should return an error code not a boolean.

Based upon an hpet timer fix by Thomas Gleixner.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:54:18 -07:00
David S. Miller 112f48716d [SPARC64]: Add clocksource/clockevents support.
I'd like to thank John Stul and others for helping
me along the way.

A lot of cleanups fell out of this.  For example, the get_compare()
tick_op was totally unused, so was deleted.  And the most often used
tick_op members were grouped together for cache-friendlyness.

The sparc64 TSC is given to the kernel as a one-shot timer.

tick_ops->init_timer() simply turns off the privileged bit in
the tick register (when possible), and disables the interrupt
by setting bit 63 in the compare register.  The ->disable_irq()
op also sets this bit.

tick_ops->add_compare() is changed to:

1) Add the given delta to "tick" not to "compare"
2) Return a boolean which, if true, means that the tick
   value read after writing the compare value was found
   to have incremented past the initial tick value.  This
   mirrors logic used in the HPET driver's ->next_event()
   method.

Each tick_ops implementation also now provides a name string.
And we feed this into the clocksource and clockevents layers.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:54:15 -07:00
David S. Miller 038cb01ea6 [SPARC64]: Add tick_nohz_{stop,restart}_sched_tick() calls to cpu_idle().
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:54:13 -07:00
David S. Miller 777a447529 [SPARC64]: Unify timer interrupt handler.
Things were scattered all over the place, split between
SMP and non-SMP.

Unify it all so that dyntick support is easier to add.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:54:11 -07:00
David S. Miller a58c9f3c1e [SPARC64]: Synchronize RTC clock via timer just like x86.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:54:08 -07:00
Tom "spot" Callaway 24fc6f00b6 [SPARC64]: Fix inline directive in pci_iommu.c
While building a test kernel for the new esp driver (against
git-current), I hit this bug. Trivial fix, put the inline declaration
in the right place. :)

Signed-off-by: Tom "spot" Callaway <tcallawa@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-13 13:35:35 -07:00
David S. Miller 5c7aa6ffae [SPARC64]: Fix arg passing to compat_sys_ipc().
Do not sign extend args using the sys32_ipc stub, that is
buggy and unnecessary.

Based upon an excellent report by Mikael Pettersson.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-13 13:27:08 -07:00
Robert Reif f6b45da129 [SPARC]: Fix section mismatch warnings in pci.c and pcic.c
Fix section mismatch in arch/sparc/kernel/pcic.c and 
arch/sparc64/kernel/pci.c.

Signed-off-by: Robert Reif <reif@earthlink.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-12 13:47:37 -07:00
Roland McGrath 1d51c69fb6 [SPARC]: avoid CHILD_MAX and OPEN_MAX constants
I don't figure anyone really cares about SunOS syscall emulation, and I
certainly don't.  But I'm getting rid of uses of the OPEN_MAX and CHILD_MAX
compile-time constant, and these are almost the only ones.  OPEN_MAX is a
bogus constant with no meaning about anything.  The RLIMIT_NOFILE resource
limit is what sysconf (_SC_OPEN_MAX) actually wants to return.

The CHILD_MAX cases weren't actually using anything I want to get rid of,
but I noticed that they are there and are wrong too.  The CHILD_MAX value
is not really unlimited as a -1 return from sysconf indicates.  The
RLIMIT_NPROC resource limit is what sysconf (_SC_CHILD_MAX) wants to return.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-12 13:13:42 -07:00
David S. Miller 2f3a2efd85 [SPARC64]: Fix SBUS IOMMU allocation code.
There are several IOMMU allocator bugs.  Instead of trying to fix this
overly complicated code, just mirror the PCI IOMMU arena allocator
which is very stable and well stress tested.

I tried to make the code as identical as possible so we can switch
sun4u PCI and SBUS over to a common piece of IOMMU code.  All that
will be need are two callbacks, one to do a full IOMMU flush and one
to do a streaming buffer flush.

This patch gets rid of a lot of hangs and mysterious crashes on SBUS
sparc64 systems, at least for me.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-11 23:56:10 -07:00
David S. Miller 24d559cac4 [SPARC64]: store-init needs trailing membar.
The manual says that it is required and we actually have crash reports
where loads see stale data due to not having membars here.

In one case the networking does:

	memset(skb, 0, offsetof(struct sk_buff, truesize));

and then some code later checks skb->nohdr for zero, but it's still
the value that was there before the memset().

Note that arch/sparc64/lib/xor.S already got this right.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-03-19 13:27:33 -07:00
David S. Miller db2f9f6d91 [SPARC64]: Use Kconfig.preempt
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-03-17 15:23:22 -07:00
David S. Miller d1acb4210a [SPARC64]: Get DEBUG_PAGEALLOC working again.
We have to make sure to use base-pagesize TLB entries even during the
early transition period where we need TLB miss handling but don't have
the kernel page tables setup yet for the linear region.

Also, it is necessary therefore to not use the 4MB TSB for these
translations, and instead use the normal kernel TSB.  This allows us
to also get rid of the 4MB tsb for debug builds which shrinks the
kernel a little bit.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-03-16 17:20:28 -07:00
David S. Miller bb8236f2b9 [SPARC64]: Add missing HPAGE_MASK masks on address parameters.
These pte loops all assume the passed in address is HPAGE
aligned, make sure that is actually true.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-03-12 22:55:39 -07:00
David S. Miller 50d266a3a1 [SPARC]: Hook up missing syscalls.
sys_mbind
sys_get_mempolicy
sys_set_mempolicy
sys_kexec_load
sys_move_pages
sys_getcpu
sys_epoll_pwait

This work is largely a result of David Woodhouse's most
excellent missing syscalls patch.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-03-12 19:58:18 -07:00
Mathieu Desnoyers c0a79b229a [SPARC64]: Fix atomicity of TIF update in flush_thread()
Fix atomicity of TIF update in flush_thread() for sparc64

Fixes correctly the race by using *_ti_thread_flag.

Race :

parent process executing :
sys_ptrace()
 (lock_kernel())
 (ptrace_get_task_struct(pid))
 arch_ptrace()
   ptrace_detach()
     ptrace_disable(child);
       clear_singlestep(child);
         clear_tsk_thread_flag(child, TIF_SINGLESTEP);
         (which clears the TIF_SINGLESTEP flag atomically from a different
          process)
 (put_task_struct(child))
 (unlock_kernel())

And at the same time, in the child process :
sys_execve()
 do_execve()
   search_binary_handler()
     load_elf_binary()
       flush_old_exec()
         flush_thread()
           doing a non-atomic thread flag update

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-03-10 00:19:49 -08:00
David S. Miller f6d0f9ea55 [SPARC]: Provide pci_device_to_OF_node() just like powerpc.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-03-02 15:22:51 -08:00
David S. Miller 45bcca67ed [SPARC]: Handle unresolvable resources better in of_device.c
Just leave them as zero if we couldn't calculate it.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-03-02 15:22:50 -08:00
David S. Miller b85cdd490a [SPARC]: Fix bus handling in build_device_resources().
We mistakedly modify 'bus' in the innermost loop.  What
should happen is that at each register index iteration,
we start with the same 'bus'.

So preserve it's value at the top level, and use a loop
local variable 'dbus' for iteration.

This bug causes registers other than the first to be
decoded improperly.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-03-02 15:22:49 -08:00
David S. Miller 5f1ef5108a [SPARC64]: Update defconfig.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-28 09:51:15 -08:00
David S. Miller bb4c18cbba [SPARC64]: Fix PCI interrupts on E450 et al.
When the PCI controller OBP node lacks an interrupt-map
and interrupt-map-mask property, we need to form the
INO by hand.  The PCI swizzle logic was not doing that
properly.

This was a regression added by the of_device code.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-27 09:46:52 -08:00
David S. Miller c5b002c1bf [SPARC64]: Update defconfig.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-26 11:35:50 -08:00
David S. Miller abfd336cd7 [SPARC64]: Fix arch_teardown_msi_irq().
Need to use get_irq_msi() not get_irq_data().

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-26 11:35:48 -08:00
David S. Miller 5746c99dfa [SPARC64]: virt_irq_free only needed when CONFIG_PCI_MSI
Noticed by Meelis Roos.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-26 11:35:46 -08:00
David S. Miller 34cc560e6a [SPARC]: Re-export saved_command_line to modules.
This reverts some bogosity from the dynamic command-line
changes made on sparc32 and sparc64.

Drivers such as drivers/sbus/char/openprom.c reference
saved_command_line, and can be modular.

The boot_command_line is __initdata, yet the dynamic command-line
changes add modular exports of that symbol, obviously wrong.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-12 15:15:48 -08:00
David S. Miller 1b51d3a08b [SPARC64]: We do not need ZONE_DMA.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-12 15:15:46 -08:00
Arjan van de Ven 5dfe4c964a [PATCH] mark struct file_operations const 2
Many struct file_operations in the kernel can be "const".  Marking them const
moves these to the .rodata section, which avoids false sharing with potential
dirty data.  In addition it'll catch accidental writes at compile time to
these shared resources.

[akpm@osdl.org: sparc64 fix]
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:44 -08:00
Alon Bar-Lev 383464c0fb [PATCH] Dynamic kernel command-line: sparc64
Rename saved_command_line into boot_command_line.

Signed-off-by: Alon Bar-Lev <alon.barlev@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:39 -08:00
Eric W. Biederman 0e25338bc1 [PATCH] signal: use kill_pgrp not kill_pg in the sunos compatibility code
I am slowly moving to a model where all process killing is struct pid based
instead of pid_t based.  The sunos compatibility code is one of the last users
of the old pid_t based kill_pg in the kernel.  By being complete I allow for
the future removal of kill_pg from the kernel, which will ensure I don't miss
something.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:31 -08:00
Linus Torvalds c827ba4cb4 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
  [SPARC64]: Update defconfig.
  [SPARC64]: Add PCI MSI support on Niagara.
  [SPARC64] IRQ: Use irq_desc->chip_data instead of irq_desc->handler_data
  [SPARC64]: Add obppath sysfs attribute for SBUS and PCI devices.
  [PARTITION]: Add whole_disk attribute.
2007-02-11 11:37:45 -08:00
Kyle McMartin d4d23add3a [PATCH] Common compat_sys_sysinfo
I noticed that almost all architectures implemented exactly the same
sys32_sysinfo...  except parisc, where a bug was to be found in handling of
the uptime.  So let's remove a whole whack of code for fun and profit.
Cribbed compat_sys_sysinfo from x86_64's implementation, since I figured it
would be the best tested.

This patch incorporates Arnd's suggestion of not using set_fs/get_fs, but
instead extracting out the common code from sys_sysinfo.

Cc: Christoph Hellwig <hch@infradead.org>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:32 -08:00
Tilman Schmidt 4564f9e5fd [PATCH] consolidate line discipline number definitions
The line discipline numbers N_* are currently defined for each architecture
individually, but (except for a seeming mistake) identically, in
asm/termios.h.  There is no obvious reason why these numbers should be
architecture specific, nor any apparent relationship with the termios
structure.  The total number of these, NR_LDISCS, is defined in linux/tty.h
anyway.  So I propose the following patch which moves the definitions of
the individual line disciplines to linux/tty.h too.

Three of these numbers (N_MASC, N_PROFIBUS_FDL, and N_SMSBLOCK) are unused
in the current kernel, but the patch still keeps the complete set in case
there are plans to use them yet.

Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:26 -08:00
Jean-Paul Saman 67d38229df [PATCH] disable init/initramfs.c: architectures
Update all arch/*/kernel/vmlinux.lds.S to not include space for initramfs
when CONFIG_BLK_DEV_INITRAMFS is not selected.  This saves another 4 kbytes
on most platfoms (some reserve PAGE_SIZE for initramfs).

Signed-off-by: Jean-Paul Saman <jean-paul.saman@nxp.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:25 -08:00
Christoph Lameter 5ac6da669e [PATCH] Set CONFIG_ZONE_DMA for arches with GENERIC_ISA_DMA
As Andi pointed out: CONFIG_GENERIC_ISA_DMA only disables the ISA DMA
channel management.  Other functionality may still expect GFP_DMA to
provide memory below 16M.  So we need to make sure that CONFIG_ZONE_DMA is
set independent of CONFIG_GENERIC_ISA_DMA.  Undo the modifications to
mm/Kconfig where we made ZONE_DMA dependent on GENERIC_ISA_DMA and set
theses explicitly in each arches Kconfig.

Reviews must occur for each arch in order to determine if ZONE_DMA can be
switched off.  It can only be switched off if we know that all devices
supported by a platform are capable of performing DMA transfers to all of
memory (Some arches already support this: uml, avr32, sh sh64, parisc and
IA64/Altix).

In order to switch ZONE_DMA off conditionally, one would have to establish
a scheme by which one can assure that no drivers are enabled that are only
capable of doing I/O to a part of memory, or one needs to provide an
alternate means of performing an allocation from a specific range of memory
(like provided by alloc_pages_range()) and insure that all drivers use that
call.  In that case the arches alloc_dma_coherent() may need to be modified
to call alloc_pages_range() instead of relying on GFP_DMA.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:19 -08:00
Christoph Lameter 9617729941 [PATCH] Drop free_pages()
nr_free_pages is now a simple access to a global variable.  Make it a macro
instead of a function.

The nr_free_pages now requires vmstat.h to be included.  There is one
occurrence in power management where we need to add the include.  Directly
refrer to global_page_state() there to clarify why the #include was added.

[akpm@osdl.org: arm build fix]
[akpm@osdl.org: sparc64 build fix]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:18 -08:00
David S. Miller 784020fb95 [SPARC64]: Update defconfig.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-10 23:50:38 -08:00
David S. Miller 35a17eb6a8 [SPARC64]: Add PCI MSI support on Niagara.
This is kind of hokey, we could use the hardware provided facilities
much better.

MSIs are assosciated with MSI Queues.  MSI Queues generate interrupts
when any MSI assosciated with it is signalled.  This suggests a
two-tiered IRQ dispatch scheme:

	MSI Queue interrupt --> queue interrupt handler
		MSI dispatch --> driver interrupt handler

But we just get one-level under Linux currently.  What I'd like to do
is possibly stick the IRQ actions into a per-MSI-Queue data structure,
and dispatch them form there, but the generic IRQ layer doesn't
provide a way to do that right now.

So, the current kludge is to "ACK" the interrupt by processing the
MSI Queue data structures and ACK'ing them, then we run the actual
handler like normal.

We are wasting a lot of useful information, for example the MSI data
and address are provided with ever MSI, as well as a system tick if
available.  If we could pass this into the IRQ handler it could help
with certain things, in particular for PCI-Express error messages.

The MSI entries on sparc64 also tell you exactly which bus/device/fn
sent the MSI, which would be great for error handling when no
registered IRQ handler can service the interrupt.

We override the disable/enable IRQ chip methods in sun4v_msi, so we
have to call {mask,unmask}_msi_irq() directly from there.  This is
another ugly wart.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-10 23:50:37 -08:00
David S. Miller 68c9218694 [SPARC64] IRQ: Use irq_desc->chip_data instead of irq_desc->handler_data
Otherwise we can't use the generic MSI code.

Furthermore, properly use the {get,set}_irq_foo() abstracted
interfaces instead of direct accesses to irq_desc[]->foo.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-10 23:50:36 -08:00
Fabio Massimo Di Nitto cf69eab231 [SPARC64]: Add obppath sysfs attribute for SBUS and PCI devices.
Signed-off-by: Fabio Massimo Di Nitto <fabbione@ubuntu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-10 23:50:35 -08:00
David S. Miller 86d43258bc [SPARC64]: Set g4/g5 properly in sun4v dtlb-prot handling.
Mirror the logic in the sun4u handler, we have to update
both registers even when we branch out to window fault
fixup handling.

The way it works is that if we are in etrap processing a
fault already, g4/g5 holds the original fault information.
If we take a window spill fault while doing etrap, then
we put the window spill fault info into g4/g5 and this is
what the top-level fault handler ends up processing first.

Then we retry the originally faulting instruction, and
process the original fault at that time.

This is all necessary because of how constrained the trap
registers are in these code paths.  These cases trigger
very rarely, so even if there is some performance implication
it's doesn't happen very often.  In fact the rarity is why
it took so long to trigger and find this particular bug.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-01-26 18:56:01 -08:00
Gautham R Shenoy b282b6f8a8 [PATCH] Change cpu_up and co from __devinit to __cpuinit
Compiling the kernel with CONFIG_HOTPLUG = y and CONFIG_HOTPLUG_CPU = n
with CONFIG_RELOCATABLE = y generates the following modpost warnings

WARNING: vmlinux - Section mismatch: reference to .init.data: from
.text between '_cpu_up' (at offset 0xc0141b7d) and 'cpu_up'
WARNING: vmlinux - Section mismatch: reference to .init.data: from
.text between '_cpu_up' (at offset 0xc0141b9c) and 'cpu_up'
WARNING: vmlinux - Section mismatch: reference to .init.text:__cpu_up
from .text between '_cpu_up' (at offset 0xc0141bd8) and 'cpu_up'
WARNING: vmlinux - Section mismatch: reference to .init.data: from
.text between '_cpu_up' (at offset 0xc0141c05) and 'cpu_up'
WARNING: vmlinux - Section mismatch: reference to .init.data: from
.text between '_cpu_up' (at offset 0xc0141c26) and 'cpu_up'
WARNING: vmlinux - Section mismatch: reference to .init.data: from
.text between '_cpu_up' (at offset 0xc0141c37) and 'cpu_up'

This is because cpu_up, _cpu_up and __cpu_up (in some architectures) are
defined as __devinit
AND
__cpu_up calls some __cpuinit functions.

Since __cpuinit would map to __init with this kind of a configuration,
we get a .text refering .init.data warning.

This patch solves the problem by converting all of __cpu_up, _cpu_up
and cpu_up from __devinit to __cpuinit. The approach is justified since
the callers of cpu_up are either dependent on CONFIG_HOTPLUG_CPU or
are of __init type.

Thus when CONFIG_HOTPLUG_CPU=y, all these cpu up functions would land up
in .text section, and when CONFIG_HOTPLUG_CPU=n, all these functions would
land up in .init section.

Tested on a i386 SMP machine running linux-2.6.20-rc3-mm1.

Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2007-01-11 18:18:20 -08:00
David S. Miller f4060c0dbb [SPARC64]: Handle ISA devices with no 'regs' property.
And this points out that the return value from
isa_dev_get_resource() and the 'pregs' arg to
isa_dev_get_irq() are totally unused.

Based upon a patch from Richard Mortimer <richm@oldelvet.org.uk>

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-31 14:06:07 -08:00
David S. Miller 55d0bef587 [SPARC64]: Update defconfig.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-31 14:06:06 -08:00