Commit Graph

46395 Commits

Author SHA1 Message Date
Sage Weil b61c27636f libceph: don't complain on msgpool alloc failures
The pool allocation failures are masked by the pool; there is no need to
spam the console about them.  (That's the whole point of having the pool
in the first place.)

Mark msg allocations whose failure is safely handled as such.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-10-25 16:10:15 -07:00
Sage Weil 6ab00d465a libceph: create messenger with client
This simplifies the init/shutdown paths, and makes client->msgr available
during the rest of the setup process.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-10-25 16:10:15 -07:00
Jeremy Fitzhardinge 97ce2c88f9 jump-label: initialize jump-label subsystem much earlier
Initialize jump_labels much, much earlier, so they're available for use
during system setup.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
2011-10-25 11:55:15 -07:00
Jeremy Fitzhardinge 20284aa77c jump_label: add arch_jump_label_transform_static() to optimise non-live code updates
When updating a newly loaded module, the code is definitely not yet
executing on any processor, so it can be updated with no need for any
heavyweight synchronization.

This patch adds arch_jump_label_static() which is implemented as
arch_jump_label_transform() by default, but architectures can override
it if it avoids, say, a call to stop_machine().

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: Jason Baron <jbaron@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
2011-10-25 11:54:31 -07:00
Jeremy Fitzhardinge 37348804e0 jump_label: if a key has already been initialized, don't nop it out
If a key has been enabled before jump_label_init() is called, don't
nop it out.

This removes arch_jump_label_text_poke_early() (which can only nop
out a site) and uses arch_jump_label_transform() instead.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: Jason Baron <jbaron@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
2011-10-25 11:54:15 -07:00
Jeremy Fitzhardinge d5d9a3b12a jump_label: use proper atomic_t initializer
ATOMIC_INIT() is the proper thing to use.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: Jason Baron <jbaron@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
2011-10-25 11:53:17 -07:00
Linus Torvalds ef78cc75f1 Merge branch 'nfs-for-3.2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
* 'nfs-for-3.2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (26 commits)
  Check validity of cl_rpcclient in nfs_server_list_show
  NFS: Get rid of the nfs_rdata_mempool
  NFS: Don't rely on PageError in nfs_readpage_release_partial
  NFS: Get rid of unnecessary calls to ClearPageError() in read code
  NFS: Get rid of nfs_restart_rpc()
  NFS: Get rid of the unused nfs_write_data->flags field
  NFS: Get rid of the unused nfs_read_data->flags field
  NFSv4: Translate NFS4ERR_BADNAME into ENOENT when applied to a lookup
  NFS: Remove the unused "lookupfh()" version of nfs4_proc_lookup()
  NFS: Use the inode->i_version to cache NFSv4 change attribute information
  SUNRPC: Remove unnecessary export of rpc_sockaddr2uaddr
  SUNRPC: Fix rpc_sockaddr2uaddr
  nfs/super.c: local functions should be static
  pnfsblock: fix writeback deadlock
  pnfsblock: fix NULL pointer dereference
  pnfs: recoalesce when ld read pagelist fails
  pnfs: recoalesce when ld write pagelist fails
  pnfs: make _set_lo_fail generic
  pnfsblock: add missing rpc_put_mount and path_put
  SUNRPC/NFS: make rpc pipe upcall generic
  ...
2011-10-25 15:44:06 +02:00
Linus Torvalds 1442d1678c Merge branch 'for-3.2' of git://linux-nfs.org/~bfields/linux
* 'for-3.2' of git://linux-nfs.org/~bfields/linux: (103 commits)
  nfs41: implement DESTROY_CLIENTID operation
  nfsd4: typo logical vs bitwise negate for want_mask
  nfsd4: allow NFS4_SHARE_SIGNAL_DELEG_WHEN_RESRC_AVAIL | NFS4_SHARE_PUSH_DELEG_WHEN_UNCONTENDED
  nfsd4: seq->status_flags may be used unitialized
  nfsd41: use SEQ4_STATUS_BACKCHANNEL_FAULT when cb_sequence is invalid
  nfsd4: implement new 4.1 open reclaim types
  nfsd4: remove unneeded CLAIM_DELEGATE_CUR workaround
  nfsd4: warn on open failure after create
  nfsd4: preallocate open stateid in process_open1()
  nfsd4: do idr preallocation with stateid allocation
  nfsd4: preallocate nfs4_file in process_open1()
  nfsd4: clean up open owners on OPEN failure
  nfsd4: simplify process_open1 logic
  nfsd4: make is_open_owner boolean
  nfsd4: centralize renew_client() calls
  nfsd4: typo logical vs bitwise negate
  nfs: fix bug about IPv6 address scope checking
  nfsd4: more robust ignoring of WANT bits in OPEN
  nfsd4: move name-length checks to xdr
  nfsd4: move access/deny validity checks to xdr code
  ...
2011-10-25 15:42:01 +02:00
Linus Torvalds 7e0bb71e75 Merge branch 'pm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
* 'pm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (63 commits)
  PM / Clocks: Remove redundant NULL checks before kfree()
  PM / Documentation: Update docs about suspend and CPU hotplug
  ACPI / PM: Add Sony VGN-FW21E to nonvs blacklist.
  ARM: mach-shmobile: sh7372 A4R support (v4)
  ARM: mach-shmobile: sh7372 A3SP support (v4)
  PM / Sleep: Mark devices involved in wakeup signaling during suspend
  PM / Hibernate: Improve performance of LZO/plain hibernation, checksum image
  PM / Hibernate: Do not initialize static and extern variables to 0
  PM / Freezer: Make fake_signal_wake_up() wake TASK_KILLABLE tasks too
  PM / Hibernate: Add resumedelay kernel param in addition to resumewait
  MAINTAINERS: Update linux-pm list address
  PM / ACPI: Blacklist Vaio VGN-FW520F machine known to require acpi_sleep=nonvs
  PM / ACPI: Blacklist Sony Vaio known to require acpi_sleep=nonvs
  PM / Hibernate: Add resumewait param to support MMC-like devices as resume file
  PM / Hibernate: Fix typo in a kerneldoc comment
  PM / Hibernate: Freeze kernel threads after preallocating memory
  PM: Update the policy on default wakeup settings
  PM / VT: Cleanup #if defined uglyness and fix compile error
  PM / Suspend: Off by one in pm_suspend()
  PM / Hibernate: Include storage keys in hibernation image on s390
  ...
2011-10-25 15:18:39 +02:00
Linus Torvalds c9d6329c35 Merge branch 'for-next' of git://git.linaro.org/people/triad/linux-pinctrl
* 'for-next' of git://git.linaro.org/people/triad/linux-pinctrl:
  pinctrl/sirf: fix sirfsoc_get_group_pins prototype
  pinctrl: Don't copy function name when requesting a pin
  pinctrl: Don't copy pin names when registering them
  pinctrl: Remove unsafe __refdata
  pinctrl: get_group_pins() const fixes
  pinctrl: add a driver for the CSR SiRFprimaII pinmux
  pinctrl: add a driver for the U300 pinmux
  drivers: create a pin control subsystem
2011-10-25 14:04:01 +02:00
Linus Torvalds 4e7e2a2008 Merge branch 'for-linus' of git://opensource.wolfsonmicro.com/regmap
* 'for-linus' of git://opensource.wolfsonmicro.com/regmap: (62 commits)
  mfd: Enable rbtree cache for wm831x devices
  regmap: Support some block operations on cached devices
  regmap: Allow caches for devices with no defaults
  regmap: Ensure rbtree syncs registers set to zero properly
  regmap: Allow rbtree to cache zero default values
  regmap: Warn on raw I/O as well as bulk reads that bypass cache
  regmap: Return a sensible error code if we fail to read the cache
  regmap: Use bsearch() to search the register defaults
  regmap: Fix doc comment
  regmap: Optimize the lookup path to use binary search
  regmap: Ensure we scream if we enable cache bypass/only at the same time
  regmap: Implement regcache_cache_bypass helper function
  regmap: Save/restore the bypass state upon syncing
  regmap: Lock the sync path, ensure we use the lockless _regmap_write()
  regmap: Fix apostrophe usage
  regmap: Make _regmap_write() global
  regmap: Fix lock used for regcache_cache_only()
  regmap: Grab the lock in regcache_cache_only()
  regmap: Modify map->cache_bypass directly
  regmap: Fix regcache_sync generic implementation
  ...
2011-10-25 13:57:45 +02:00
Linus Torvalds 8a9ea3237e Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1745 commits)
  dp83640: free packet queues on remove
  dp83640: use proper function to free transmit time stamping packets
  ipv6: Do not use routes from locally generated RAs
  |PATCH net-next] tg3: add tx_dropped counter
  be2net: don't create multiple RX/TX rings in multi channel mode
  be2net: don't create multiple TXQs in BE2
  be2net: refactor VF setup/teardown code into be_vf_setup/clear()
  be2net: add vlan/rx-mode/flow-control config to be_setup()
  net_sched: cls_flow: use skb_header_pointer()
  ipv4: avoid useless call of the function check_peer_pmtu
  TCP: remove TCP_DEBUG
  net: Fix driver name for mdio-gpio.c
  ipv4: tcp: fix TOS value in ACK messages sent from TIME_WAIT
  rtnetlink: Add missing manual netlink notification in dev_change_net_namespaces
  ipv4: fix ipsec forward performance regression
  jme: fix irq storm after suspend/resume
  route: fix ICMP redirect validation
  net: hold sock reference while processing tx timestamps
  tcp: md5: add more const attributes
  Add ethtool -g support to virtio_net
  ...

Fix up conflicts in:
 - drivers/net/Kconfig:
	The split-up generated a trivial conflict with removal of a
	stale reference to Documentation/networking/net-modules.txt.
	Remove it from the new location instead.
 - fs/sysfs/dir.c:
	Fairly nasty conflicts with the sysfs rb-tree usage, conflicting
	with Eric Biederman's changes for tagged directories.
2011-10-25 13:25:22 +02:00
Stanislav Kinsbursky 16d0587090 NFSd: call svc rpcbind cleanup explicitly
We have to call svc_rpcb_cleanup() explicitly from nfsd_last_thread() since
this function is registered as service shutdown callback and thus nobody else
will done it for us.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-10-25 13:19:40 +02:00
Stanislav Kinsbursky d99085605c SUNRPC: introduce svc helpers for prepairing rpcbind infrastructure
This helpers will be used only for those services, that will send portmapper
registration calls.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-10-25 13:18:05 +02:00
Linus Torvalds 1be025d3cb Merge branch 'usb-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
* 'usb-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (260 commits)
  usb: renesas_usbhs: fixup inconsistent return from usbhs_pkt_push()
  usb/isp1760: Allow to optionally trigger low-level chip reset via GPIOLIB.
  USB: gadget: midi: memory leak in f_midi_bind_config()
  USB: gadget: midi: fix range check in f_midi_out_open()
  QE/FHCI: fixed the CONTROL bug
  usb: renesas_usbhs: tidyup for smatch warnings
  USB: Fix USB Kconfig dependency problem on 85xx/QoirQ platforms
  EHCI: workaround for MosChip controller bug
  usb: gadget: file_storage: fix race on unloading
  USB: ftdi_sio.c: Use ftdi async_icount structure for TIOCMIWAIT, as in other drivers
  USB: ftdi_sio.c:Fill MSR fields of the ftdi async_icount structure
  USB: ftdi_sio.c: Fill LSR fields of the ftdi async_icount structure
  USB: ftdi_sio.c:Fill TX field of the ftdi async_icount structure
  USB: ftdi_sio.c: Fill the RX field of the ftdi async_icount structure
  USB: ftdi_sio.c: Basic icount infrastructure for ftdi_sio
  usb/isp1760: Let OF bindings depend on general CONFIG_OF instead of PPC_OF .
  USB: ftdi_sio: Support TI/Luminary Micro Stellaris BD-ICDI Board
  USB: Fix runtime wakeup on OHCI
  xHCI/USB: Make xHCI driver have a BOS descriptor.
  usb: gadget: add new usb gadget for ACM and mass storage
  ...
2011-10-25 12:23:15 +02:00
Linus Torvalds 2d03423b23 Merge branch 'driver-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
* 'driver-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (38 commits)
  mm: memory hotplug: Check if pages are correctly reserved on a per-section basis
  Revert "memory hotplug: Correct page reservation checking"
  Update email address for stable patch submission
  dynamic_debug: fix undefined reference to `__netdev_printk'
  dynamic_debug: use a single printk() to emit messages
  dynamic_debug: remove num_enabled accounting
  dynamic_debug: consolidate repetitive struct _ddebug descriptor definitions
  uio: Support physical addresses >32 bits on 32-bit systems
  sysfs: add unsigned long cast to prevent compile warning
  drivers: base: print rejected matches with DEBUG_DRIVER
  memory hotplug: Correct page reservation checking
  memory hotplug: Refuse to add unaligned memory regions
  remove the messy code file Documentation/zh_CN/SubmitChecklist
  ARM: mxc: convert device creation to use platform_device_register_full
  new helper to create platform devices with dma mask
  docs/driver-model: Update device class docs
  docs/driver-model: Document device.groups
  kobj_uevent: Ignore if some listeners cannot handle message
  dynamic_debug: make netif_dbg() call __netdev_printk()
  dynamic_debug: make netdev_dbg() call __netdev_printk()
  ...
2011-10-25 12:13:59 +02:00
Linus Torvalds 59e5253417 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (59 commits)
  MAINTAINERS: linux-m32r is moderated for non-subscribers
  linux@lists.openrisc.net is moderated for non-subscribers
  Drop default from "DM365 codec select" choice
  parisc: Kconfig: cleanup Kernel page size default
  Kconfig: remove redundant CONFIG_ prefix on two symbols
  cris: remove arch/cris/arch-v32/lib/nand_init.S
  microblaze: add missing CONFIG_ prefixes
  h8300: drop puzzling Kconfig dependencies
  MAINTAINERS: microblaze-uclinux@itee.uq.edu.au is moderated for non-subscribers
  tty: drop superfluous dependency in Kconfig
  ARM: mxc: fix Kconfig typo 'i.MX51'
  Fix file references in Kconfig files
  aic7xxx: fix Kconfig references to READMEs
  Fix file references in drivers/ide/
  thinkpad_acpi: Fix printk typo 'bluestooth'
  bcmring: drop commented out line in Kconfig
  btmrvl_sdio: fix typo 'btmrvl_sdio_sd6888'
  doc: raw1394: Trivial typo fix
  CIFS: Don't free volume_info->UNC until we are entirely done with it.
  treewide: Correct spelling of successfully in comments
  ...
2011-10-25 12:11:02 +02:00
Linus Torvalds 31dced41c6 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (61 commits)
  HID: hid-magicmouse: Magic Trackpad has 1 button, not 2
  HID: Add device IDs for more SJOY adapters
  HID: primax: remove spurious dependency
  HID: support primax keyboards violating USB HID spec
  HID: usbhid: cancel timer for retry synchronously
  HID: wacom: Set input bits before registration
  HID: consolidate MacbookAir 4,1 mappings
  HID: MacbookAir4,1 and MacbookAir4,2 need entry in hid_mouse_ignore_list[]
  HID: Add support MacbookAir 4,1 keyboard
  HID: hidraw: open count should not increase if error
  HID: hiddev: potential info leak in hiddev_ioctl()
  HID: multitouch: decide if hid-multitouch needs to handle mt devices
  HID: add autodetection of multitouch devices
  HID: "hid-logitech" driver with Logitech Driving Force GT
  HID: hid-logitech-dj: fix off by one
  HID: hidraw: protect hidraw_disconnect() better
  HID: hid-multitouch: add support for the IDEACOM 6650 chip
  HID: Add full support for Logitech Unifying receivers
  HID: hidraw: free list for all error in hidraw_open
  HID: roccat: Kone now reports external profile changes via roccat device
  ...
2011-10-25 12:03:13 +02:00
Linus Torvalds 7c1953ddb6 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (62 commits)
  target: Fix compile warning w/ missing module.h include
  target: Remove legacy se_task->task_timer and associated logic
  target: Fix incorrect transport_sent usage
  target: re-use the command S/G list for single-task commands
  target: Fix BIDI t_task_cdb handling in transport_generic_new_cmd
  target: remove transport_allocate_tasks
  target: merge transport_new_cmd_obj into transport_generic_new_cmd
  target: remove the task_sg_bidi field se_task and pSCSI BIDI support
  target: transport_subsystem_check_init cleanups
  target: use a workqueue for I/O completions
  target: remove unused TRANSPORT_ states
  target: remove TRANSPORT_DEFERRED_CMD state
  target: remove the TRANSPORT_REMOVE state
  target: move depth_left manipulation out of transport_generic_request_failure
  target: stop task timers earlier
  target: remove TF_TIMER_STOP
  target: factor some duplicate code for stopping a task
  target: fix list walking in transport_free_dev_tasks
  target: use transport_cmd_check_stop_to_fabric consistently
  target: do not pass the queue object to transport_remove_cmd_from_queue
  ...
2011-10-25 11:17:39 +02:00
Jiri Kosina b3aec7b686 Merge branch 'upstream' into for-linus
Conflicts:
	drivers/hid/hid-core.c
	drivers/hid/hid-ids.h
2011-10-25 09:59:04 +02:00
Jiri Kosina b0eae38ceb Merge branches 'acrux', 'logitech', 'multitouch', 'roccat' and 'wiimote' into for-linus 2011-10-25 09:54:16 +02:00
Linus Torvalds 36b8d186e6 Merge branch 'next' of git://selinuxproject.org/~jmorris/linux-security
* 'next' of git://selinuxproject.org/~jmorris/linux-security: (95 commits)
  TOMOYO: Fix incomplete read after seek.
  Smack: allow to access /smack/access as normal user
  TOMOYO: Fix unused kernel config option.
  Smack: fix: invalid length set for the result of /smack/access
  Smack: compilation fix
  Smack: fix for /smack/access output, use string instead of byte
  Smack: domain transition protections (v3)
  Smack: Provide information for UDS getsockopt(SO_PEERCRED)
  Smack: Clean up comments
  Smack: Repair processing of fcntl
  Smack: Rule list lookup performance
  Smack: check permissions from user space (v2)
  TOMOYO: Fix quota and garbage collector.
  TOMOYO: Remove redundant tasklist_lock.
  TOMOYO: Fix domain transition failure warning.
  TOMOYO: Remove tomoyo_policy_memory_lock spinlock.
  TOMOYO: Simplify garbage collector.
  TOMOYO: Fix make namespacecheck warnings.
  target: check hex2bin result
  encrypted-keys: check hex2bin result
  ...
2011-10-25 09:45:31 +02:00
Linus Torvalds cd85b55741 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
  m68k: Finally remove leftover markers sections
  m68k/mac: Fix mac_irq_pending() for PSC MACE and SCC
  m68k/mac: Fix compiler warning in via_read_time()
  zorro: Fix four checkpatch warnings
2011-10-25 09:34:10 +02:00
Linus Torvalds 04a8752485 Merge branches 'stable/drivers-3.2', 'stable/drivers.bugfixes-3.2' and 'stable/pci.fixes-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
* 'stable/drivers-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xenbus: don't rely on xen_initial_domain to detect local xenstore
  xenbus: Fix loopback event channel assuming domain 0
  xen/pv-on-hvm:kexec: Fix implicit declaration of function 'xen_hvm_domain'
  xen/pv-on-hvm kexec: add xs_reset_watches to shutdown watches from old kernel
  xen/pv-on-hvm kexec: update xs_wire.h:xsd_sockmsg_type from xen-unstable
  xen/pv-on-hvm kexec+kdump: reset PV devices in kexec or crash kernel
  xen/pv-on-hvm kexec: rebind virqs to existing eventchannel ports
  xen/pv-on-hvm kexec: prevent crash in xenwatch_thread() when stale watch events arrive

* 'stable/drivers.bugfixes-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen/pciback: Check if the device is found instead of blindly assuming so.
  xen/pciback: Do not dereference psdev during printk when it is NULL.
  xen: remove XEN_PLATFORM_PCI config option
  xen: XEN_PVHVM depends on PCI
  xen/pciback: double lock typo
  xen/pciback: use mutex rather than spinlock in vpci backend
  xen/pciback: Use mutexes when working with Xenbus state transitions.
  xen/pciback: miscellaneous adjustments
  xen/pciback: use mutex rather than spinlock in passthrough backend
  xen/pciback: use resource_size()

* 'stable/pci.fixes-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen/pci: support multi-segment systems
  xen-swiotlb: When doing coherent alloc/dealloc check before swizzling the MFNs.
  xen/pci: make bus notifier handler return sane values
  xen-swiotlb: fix printk and panic args
  xen-swiotlb: Fix wrong panic.
  xen-swiotlb: Retry up three times to allocate Xen-SWIOTLB
  xen-pcifront: Update warning comment to use 'e820_host' option.
2011-10-25 09:19:36 +02:00
Greg Kroah-Hartman 43a3beb6da Merge branch 'staging-next' into Linux 3.1
This was done to resolve a conflict in the
drivers/staging/comedi/drivers/ni_labpc.c file that resolved a build
bugfix in Linus's tree with a "better" bugfix that was in the
staging-next tree that resolved the issue in a more complete manner.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-25 09:18:11 +02:00
Linus Torvalds 31018acd4c Merge branches 'stable/bug.fixes-3.2' and 'stable/mmu.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
* 'stable/bug.fixes-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen/p2m/debugfs: Make type_name more obvious.
  xen/p2m/debugfs: Fix potential pointer exception.
  xen/enlighten: Fix compile warnings and set cx to known value.
  xen/xenbus: Remove the unnecessary check.
  xen/irq: If we fail during msi_capability_init return proper error code.
  xen/events: Don't check the info for NULL as it is already done.
  xen/events: BUG() when we can't allocate our event->irq array.

* 'stable/mmu.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen: Fix selfballooning and ensure it doesn't go too far
  xen/gntdev: Fix sleep-inside-spinlock
  xen: modify kernel mappings corresponding to granted pages
  xen: add an "highmem" parameter to alloc_xenballooned_pages
  xen/p2m: Use SetPagePrivate and its friends for M2P overrides.
  xen/p2m: Make debug/xen/mmu/p2m visible again.
  Revert "xen/debug: WARN_ON when identity PFN has no _PAGE_IOMAP flag set."
2011-10-25 09:17:47 +02:00
Linus Torvalds 5eef150c1d Merge branch 'stable/e820-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
* 'stable/e820-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen: release all pages within 1-1 p2m mappings
  xen: allow extra memory to be in multiple regions
  xen: allow balloon driver to use more than one memory region
  xen/balloon: simplify test for the end of usable RAM
  xen/balloon: account for pages released during memory setup
2011-10-25 09:17:07 +02:00
Boaz Harrosh 769ba8d920 ore: RAID5 Write
This is finally the RAID5 Write support.

The bigger part of this patch is not the XOR engine itself, But the
read4write logic, which is a complete mini prepare_for_striping
reading engine that can read scattered pages of a stripe into cache
so it can be used for XOR calculation. That is, if the write was not
stripe aligned.

The main algorithm behind the XOR engine is the 2 dimensional array:
	struct __stripe_pages_2d.
A drawing might save 1000 words
---

__stripe_pages_2d
       |
 n = pages_in_stripe_unit;
 w = group_width - parity;
       |                            pages array presented to the XOR lib
       |                                                |
       V                                                |
 __1_page_stripe[0].pages --> [c0][c1]..[cw][c_par] <---|
       |                                                |
 __1_page_stripe[1].pages --> [c0][c1]..[cw][c_par] <---
       |
...    |                         ...
       |
 __1_page_stripe[n].pages --> [c0][c1]..[cw][c_par]
                               ^
                               |
           data added columns first then row

---
The pages are put on this array columns first. .i.e:
	p0-of-c0, p1-of-c0, ... pn-of-c0, p0-of-c1, ...
So we are doing a corner turn of the pages.

Note that pages will zigzag down and left. but are put sequentially
in growing order. So when the time comes to XOR the stripe, only the
beginning and end of the array need be checked. We scan the array
and any NULL spot will be field by pages-to-be-read.

The FS that wants to support RAID5 needs to supply an
operations-vector that searches a given page in cache, and specifies
if the page is uptodate or need reading. All these pages to be read
are put on a slave ore_io_state and synchronously read. All the pages
of a stripe are read in one IO, using the scatter gather mechanism.

In write we constrain our IO to only be incomplete on a single
stripe. Meaning either the complete IO is within a single stripe so
we might have pages to read from both beginning  or end of the
strip. Or we have some reading to do at beginning but end at strip
boundary. The left over pages are pushed to the next IO by the API
already established by previous work, where an IO offset/length
combination presented to the ORE might get the length truncated and
the user must re-submit the leftover pages. (Both exofs and NFS
support this)

But any ORE user should make it's best effort to align it's IO
before hand and avoid complications. A cached ore_layout->stripe_size
member can be used for that calculation. (NOTE: that ORE demands
that stripe_size may not be bigger then 32bit)

What else? Well read it and tell me.

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
2011-10-24 17:15:33 -07:00
Boaz Harrosh a1fec1dbbc ore: RAID5 read
This patch introduces the first stage of RAID5 support
mainly the skip-over-raid-units when reading. For
writes it inserts BLANK units, into where XOR blocks
should be calculated and written to.

It introduces the new "general raid maths", and the main
additional parameters and components needed for raid5.

Since at this stage it could corrupt future version that
actually do support raid5. The enablement of raid5
mounting and setting of parity-count > 0 is disabled. So
the raid5 code will never be used. Mounting of raid5 is
only enabled later once the basic XOR write is also in.
But if the patch "enable RAID5" is applied this code has
been tested to be able to properly read raid5 volumes
and is according to standard.

Also it has been tested that the new maths still properly
supports RAID0 and grouping code just as before.
(BTW: I have found more bugs in the pnfs-obj RAID math
 fixed here)

The ore.c file is getting too big, so new ore_raid.[hc]
files are added that will include the special raid stuff
that are not used in striping and mirrors. In future write
support these will get bigger.
When adding the ore_raid.c to Kbuild file I was forced to
rename ore.ko to libore.ko. Is it possible to keep source
file, say ore.c and module file ore.ko the same even if there
are multiple files inside ore.ko?

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
2011-10-24 16:55:36 -07:00
Boaz Harrosh 611d7a5dc6 ore: Make ore_calc_stripe_info EXPORT_SYMBOL
ore_calc_stripe_info is needed by exofs::export.c
for the layout calculations. Make it exportable

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
2011-10-24 16:30:08 -07:00
Grant Likely 940ab88962 drivercore: Add helper macro for platform_driver boilerplate
For simple modules that contain a single platform_driver without any
additional setup code then ends up being a block of duplicated
boilerplate.  This patch adds a new macro, module_platform_driver(),
which replaces the module_init()/module_exit() registrations with
template functions.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Reviewed-by: Magnus Damm <magnus.damm@gmail.com>
Reviewed-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
2011-10-25 00:35:47 +02:00
David S. Miller 1805b2f048 Merge branch 'master' of ra.kernel.org:/pub/scm/linux/kernel/git/davem/net 2011-10-24 18:18:09 -04:00
Flavio Leitner 78d81d15b7 TCP: remove TCP_DEBUG
It was enabled by default and the messages guarded
by the define are useful.

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-24 17:36:08 -04:00
Rob Herring 3a82543642 Merge remote-tracking branch 'rmk/devel-stable' into HEAD 2011-10-24 14:02:37 -05:00
Kirill Tkhai bc74ee9769 m68k: Finally remove leftover markers sections
Markers have removed already twice:

1: fc5377668c
2: eb878b3bc0

But a little bit is still here.

Signed-off-by: Tkhai Kirill <tkhai@yandex.ru>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2011-10-24 21:00:34 +02:00
Donggeun Kim 9d97e5c81e hwmon: Add driver for EXYNOS4 TMU
This patch allows to read temperature
from TMU(Thermal Management Unit) of SAMSUNG EXYNOS4 series of SoC.

Signed-off-by: Donggeun Kim <dg77.kim@samsung.com>
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
2011-10-24 11:09:35 -07:00
Aneesh Kumar K.V 348b59012e net/9p: Convert net/9p protocol dumps to tracepoints
This helps in more control over debugging.
root@qemu-img-64:~# ls /pass/123
ls: cannot access /pass/123: No such file or directory
root@qemu-img-64:~# cat /sys/kernel/debug/tracing/trace
# tracer: nop
#
#           TASK-PID    CPU#    TIMESTAMP  FUNCTION
#              | |       |          |         |
              ls-1536  [001]    70.928584: 9p_protocol_dump: clnt 18446612132784021504 P9_TWALK(tag = 1)
000: 16 00 00 00 6e 01 00 01 00 00 00 02 00 00 00 01
010: 00 03 00 31 32 33 00 00 00 ff ff ff ff 00 00 00

              ls-1536  [001]    70.928587: <stack trace>
 => trace_9p_protocol_dump
 => p9pdu_finalize
 => p9_client_rpc
 => p9_client_walk
 => v9fs_vfs_lookup
 => d_alloc_and_lookup
 => walk_component
 => path_lookupat
              ls-1536  [000]    70.929696: 9p_protocol_dump: clnt 18446612132784021504 P9_RLERROR(tag = 1)
000: 0b 00 00 00 07 01 00 02 00 00 00 4e 03 00 02 00
010: 00 00 00 00 03 00 02 00 00 00 00 00 ff 43 00 00

              ls-1536  [000]    70.929697: <stack trace>
 => trace_9p_protocol_dump
 => p9_client_rpc
 => p9_client_walk
 => v9fs_vfs_lookup
 => d_alloc_and_lookup
 => walk_component
 => path_lookupat
 => do_path_lookup

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-10-24 11:13:12 -05:00
Dan Carpenter ef6b0807e2 fs/9p: change an int to unsigned int
Without this msize=4294967295 will result in a crash

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-10-24 11:13:12 -05:00
Aneesh Kumar K.V abfa034e4b fs/9p: Update zero-copy implementation in 9p
* remove lot of update to different data structure
* add a seperate callback for zero copy request.
* above makes non zero copy code path simpler
* remove conditionalizing TREAD/TREADDIR/TWRITE in the zero copy path
* Fix the dotu p9_check_errors with zero copy. Add sufficient doc around
* Add support for both in and output buffers in zero copy callback
* pin and unpin pages in the same context
* use helpers instead of defining page offset and rest of page ourself
* Fix mem leak in p9_check_errors
* Remove 'E' and 'F' in p9pdu_vwritef

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-10-24 11:13:11 -05:00
Grant Likely 2bf6f675fa Merge commit 'v3.1' into devicetree/next 2011-10-24 17:03:35 +02:00
Nicolas Ferre 5762c20593 dt: Add empty of_match_node() macro
Add an empty macro for of_match_node() that will save
some '#ifdef CONFIG_OF' for non-dt builds.

I have chosen to use a macro instead of a function to
be able to avoid defining the first parameter.
In fact, this "struct of_device_id *" first parameter
is usualy not defined as well on non-dt builds.

Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
2011-10-24 16:59:39 +02:00
Jens Axboe 83157223de Merge branch 'for-linus' into for-3.2/core 2011-10-24 16:24:38 +02:00
Tao Ma 9562ad9ab3 block: Remove the control of complete cpu from bio.
bio originally has the functionality to set the complete cpu, but
it is broken.

Chirstoph said that "This code is unused, and from the all the
discussions lately pretty obviously broken.  The only thing keeping
it serves is creating more confusion and possibly more bugs."

And Jens replied with "We can kill bio_set_completion_cpu(). I'm fine
with leaving cpu control to the request based drivers, they are the
only ones that can toggle the setting anyway".

So this patch tries to remove all the work of controling complete cpu
from a bio.

Cc: Shaohua Li <shaohua.li@intel.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2011-10-24 16:11:30 +02:00
Mark Brown feb8369924 gpiolib: Ensure struct gpio is always defined
Currently struct gpio is only defined when using gpiolib which makes the
stub gpio_request_array() much less useful in drivers than is ideal as
they can't work with struct gpio.  Since there are no other definitions
in kernel instead make the define always available no matter if gpiolib
is selectable or selected, ensuring that drivers can always use the
type.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-10-24 16:04:06 +02:00
Mattias Nilsson 73180f85f4 mfd: Move to the new db500 PRCMU API
Now that we have a shared API between the DB8500 and DB5500
PRCMU's, switch to using this neutral API instead. We delete the
parts of db8500-prcmu.h that is now PRCMU-neutral, and calls will
be diverted to respective driver. Common registers are in
dbx500-prcmu-regs.h and common accessors and defines in
<linux/mfd/dbx500-prcmu.h> This way we get a a lot more
abstraction and code reuse.

Signed-off-by: Mattias Nilsson <mattias.i.nilsson@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-10-24 14:09:18 +02:00
Mattias Nilsson fea799e3d3 mfd: Create a common interface for dbx500 PRCMU drivers
This adds a header file that contains the set of functions and
definitions that will be shared between the DB8500 and DB5500
PRCMU drivers.

Signed-off-by: Mattias Nilsson <mattias.i.nilsson@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-10-24 14:09:18 +02:00
Linus Walleij 8959e74399 mfd: Delete ab3550 driver
The AB3550 never passed the prototype stage. Instead it was used
as a precursor to AB5500 for testing basic building blocks used
in that chip, since they had large similarities. Since AB3550 will
not see the light of day in product form and since the prototypes
are no longer used, let's delete the driver and any references to
it.

Cc: Mattias Wallin <mattias.wallin@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-10-24 14:09:16 +02:00
Mattias Wallin 3d5e2cabf1 mfd: ab5500 chip register access
The analog baseband chip ab5500 is a multi functional chip
containing regulators, charging, gpio, USB and accessory detect.
It also contain various multimedia functionalities like digital
encoder and audio codec.
The core driver added with this patch provides register access via
i2c via PRCMU. Event handling implemented as irq_chip will come in
future patches since it depends on PRCMU functionality not yet
implemented.

Signed-off-by: Mattias Wallin <mattias.wallin@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-10-24 14:09:16 +02:00
Mika Westerberg 1f5a371c07 mfd: Add Intel MSIC driver
Add support for Intel MSIC chip found on Intel Medfield platforms. This
chip embeds several subdevices: audio, ADC, GPIO, power button, etc. The
driver creates platform device for each subdevice.

We also provide an MSIC register access API which should replace the more
generic SCU IPC interface currently used. Existing drivers can choose
whether they convert to this new API or stick with the SCU IPC interface.

Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-10-24 14:09:15 +02:00
Mark Brown 7583a213ec mfd: Simulate active high IRQs with wm831x
In order to ease system integration provide a simulation of active high
IRQs on the GPIOs by polling the GPIO status when an IRQ is generated.

This isn't ideal on several fronts and will miss initially active IRQs in
the current implementation but it should work well for most cases.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-10-24 14:09:14 +02:00
Philippe Rétornaz 30fc7ac3f6 input: Add power button support for mc13783
This adds support for the power-on buttons of MC13783 PMIC.

Signed-off-by: Philippe Rétornaz <philippe.retornaz@epfl.ch>
Acked-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-10-24 14:09:14 +02:00
Philippe Rétornaz 5ab9059d7f mfd: Remove unused mc13xxx defines
Signed-off-by: Philippe Rétornaz <philippe.retornaz@epfl.ch>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-10-24 14:09:14 +02:00
Mark Brown 5da721c87a mfd: Support software initiated shutdown of WM831x PMICs
In systems where there is no hardware signal from the processor to the
PMIC to initiate the final power off sequence we must initiate the
shutdown with a register write to the PMIC. Support such systems in the
driver. Since this may prevent a full shutdown of the system platform
data is used to enable the feature.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-10-24 14:09:13 +02:00
Uwe Kleine-König b46880e57b mfd: Remove mc13783 API functions and symbols
Now that all in-tree users are fixed to use the more general mc13xxx API
the obsolete stuff can go away.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-10-24 14:09:12 +02:00
Uwe Kleine-König fec316d632 mfd: Provide a generic version of mc13xxx adc_do_conversion
This is needed to convert the touch driver away from using struct mc13783.

Note this patch drops MC13783_ADC0_ADREFMODE. This is unused and doesn't
exist on mc13892.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-10-24 14:09:12 +02:00
MyungJoo Ham 01fdaab8ff mfd: Wake-up from Suspend MAX8997 support
- Support wake-up from suspend-to-ram.
- Handle pending interrupt after a resume.
- If pdata->wakeup is enabled, by default, the device is assumed to be
capable of wakeup (the interrupt pin is connected to a wakeup-source GPIO)
and may wakeup the system (MAX8997 has a power button input pin).

Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-10-24 14:09:11 +02:00
Kyle Manna 3d6271f92e mfd: Turn on the twl4030-madc MADC clock
Without turning the MADC clock on, no MADC conversions occur.

$ cat /sys/class/hwmon/hwmon0/device/in8_input
[   53.428436] twl4030_madc twl4030_madc: conversion timeout!
cat: read error: Resource temporarily unavailable

Signed-off-by: Kyle Manna <kyle@kylemanna.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-10-24 14:09:10 +02:00
Mark Brown 881de67046 mfd: Allow WM8994 LDO enable pulls to be disabled
In systems where the LDO enables are always driven (for example, being
connected to an always on supply rail or a GPIO which is driven by the
CPU even in suspend) then we can disable the pull downs on the LDO for
a small power savings.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-10-24 14:09:10 +02:00
Karl Komierowski bd4a40b57b mfd: Refactor ab8500 GPADC API, add raw access
Refactor the GPADC interface to avoid bugs in calling code:

- ab8500_gpadc_[convert|read_raw|ad_to_voltage] clarifies
  each functions use case, *convert wraps *read_raw, and we
  can access raw ADC values properly.
- Renamed gpadc function arguments from "input" to "channel" to
  clarify use, so we don't get confused again.

Signed-off-by: Kalle Komierowski <kalle.komierowski@stericsson.com>
Reviewed-by: Mattias Wallin <mattias.wallin@stericsson.com>
Reviewed-by: John Beckett <john.beckett@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-10-24 14:09:10 +02:00
Mark Brown 6e3ad11804 mfd: Convert pcf50633 to use new register map API
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Tested-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-10-24 14:09:08 +02:00
Barry Song 1e11bec9b0 Merge branch 'l2x0' of rmk tree into prima2-l2x0 2011-10-24 02:45:43 -07:00
Eric Dumazet 66b13d99d9 ipv4: tcp: fix TOS value in ACK messages sent from TIME_WAIT
There is a long standing bug in linux tcp stack, about ACK messages sent
on behalf of TIME_WAIT sockets.

In the IP header of the ACK message, we choose to reflect TOS field of
incoming message, and this might break some setups.

Example of things that were broken :
  - Routing using TOS as a selector
  - Firewalls
  - Trafic classification / shaping

We now remember in timewait structure the inet tos field and use it in
ACK generation, and route lookup.

Notes :
 - We still reflect incoming TOS in RST messages.
 - We could extend MuraliRaja Muniraju patch to report TOS value in
netlink messages for TIME_WAIT sockets.
 - A patch is needed for IPv6

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-24 03:06:21 -04:00
Richard Cochran da92b194cc net: hold sock reference while processing tx timestamps
The pair of functions,

 * skb_clone_tx_timestamp()
 * skb_complete_tx_timestamp()

were designed to allow timestamping in PHY devices. The first
function, called during the MAC driver's hard_xmit method, identifies
PTP protocol packets, clones them, and gives them to the PHY device
driver. The PHY driver may hold onto the packet and deliver it at a
later time using the second function, which adds the packet to the
socket's error queue.

As pointed out by Johannes, nothing prevents the socket from
disappearing while the cloned packet is sitting in the PHY driver
awaiting a timestamp. This patch fixes the issue by taking a reference
on the socket for each such packet. In addition, the comments
regarding the usage of these function are expanded to highlight the
rule that PHY drivers must use skb_complete_tx_timestamp() to release
the packet, in order to release the socket reference, too.

These functions first appeared in v2.6.36.

Reported-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Cc: <stable@vger.kernel.org>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-24 02:54:50 -04:00
Eric Dumazet 318cf7aaa0 tcp: md5: add more const attributes
Now tcp_md5_hash_header() has a const tcphdr argument, we can add more
const attributes to callers.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-24 02:46:04 -04:00
Rick Jones 8f9f4668b3 Add ethtool -g support to virtio_net
Add support for reporting ring sizes via ethtool -g to the virtio_net
driver.

Signed-off-by: Rick Jones <rick.jones2@hp.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-24 02:07:21 -04:00
Eric Dumazet ca35a0ef85 tcp: md5: dont write skb head in tcp_md5_hash_header()
tcp_md5_hash_header() writes into skb header a temporary zero value,
this might confuse other users of this area.

Since tcphdr is small (20 bytes), copy it in a temporary variable and
make the change in the copy.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-24 01:52:35 -04:00
Dave Airlie 9b553f7286 Merge branch 'drm-intel-next' of git://people.freedesktop.org/~keithp/linux into drm-core-next
* 'drm-intel-next' of git://people.freedesktop.org/~keithp/linux: (72 commits)
  drm/i915/dp: Fix eDP on PCH DP on CPT/PPT
  drm/i915/dp: Introduce is_cpu_edp()
  drm/i915: use correct SPD type value
  drm/i915: fix ILK+ infoframe support
  drm/i915: add DP test request handling
  drm/i915: read full receiver capability field during DP hot plug
  drm/i915/dp: Remove eDP special cases from bandwidth checks
  drm/i915/dp: Fix the math in intel_dp_link_required
  drm/i915/panel: Always record the backlight level again (but cleverly)
  i915: Move i915_read/write out of line
  drm/i915: remove transcoder PLL mashing from mode_set per specs
  drm/i915: if transcoder disable fails, say which
  drm/i915: set watermarks for third pipe on IVB
  drm/i915: export a CPT mode set verification function
  drm/i915: fix transcoder PLL select masking
  drm/i915: fix IVB cursor support
  drm/i915: fix debug output for 3 pipe configs
  drm/i915: add PLL sharing support to handle 3 pipes
  drm/i915: fix PCH PLL assertion check for 3 pipes
  drm/i915: use transcoder select bits on VGA and HDMI on CPT
  ...
2011-10-24 05:48:39 +01:00
Nicholas Bellinger 2e982ab92d target: Remove legacy se_task->task_timer and associated logic
This patch removes the legacy usage of se_task->task_timer and associated
infrastructure that originally was used as a way to help manage buggy backend
SCSI LLDs that in certain cases would never return back an outstanding task.

This includes the removal of target_complete_timeout_work(), timeout logic
from transport_complete_task(), transport_task_timeout_handler(),
transport_start_task_timer(), the per device task_timeout configfs attribute,
and all task_timeout associated structure members and defines in
target_core_base.h

This is being removed in preparation to make transport_complete_task() run
in lock-less mode.

Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2011-10-24 03:22:08 +00:00
Nicholas Bellinger 415a090ade target: Fix incorrect transport_sent usage
This patch converts target-core to use se_cmd->t_transport_sent instead of
a duplicated se_cmd->transport_sent member in a handful of locations.
It also updates iscsi_target to properly use ->t_transport_sent instead of
it's own iscsi_cmd_t->transport_sent value that was not being assigned.

Reported-by: Christoph Hellwig <hch@lst.de>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2011-10-24 03:22:06 +00:00
Christoph Hellwig 7c1c6af37a target: remove the task_sg_bidi field se_task and pSCSI BIDI support
This field is never used given that BIDI handling happens at the
command and not the task level.  Remove it and the dead code in
pscsi that tries to work on it.

It also prevents pSCSI passthrough for the two currently enabled BIDI
commands now that task->task_sg_bidi support has been removed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2011-10-24 03:21:56 +00:00
Nicholas Bellinger dbc5623eb2 target: transport_subsystem_check_init cleanups
Remove the now unnecessary extra call to transport_subsystem_check_init() in
target_core_register_fabric(), and also merge transport_subsystem_reqmods()
directly into transport_subsystem_check_init().

Reported-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2011-10-24 03:21:54 +00:00
Christoph Hellwig 35e0e75753 target: use a workqueue for I/O completions
Instead of abusing the target processing thread for offloading I/O
completion in the backends to user context add a new workqueue.  This means
completions can be processed as fast as available CPU time allows it,
including in parallel with other completions and more importantly I/O
submission or QUEUE FULL retries.  This should give much better performance
especially on loaded systems.

As a fallout we can merge all the completed states into a single
one.

On the downside this change complicates lun reset handling a bit by
requiring us to cancel a work item only for those states that have it
initialized.  The alternative would be to either always initialize the work
item to a dummy handler, or always use the same handler and do a switch on
the state. The long term solution will be a flag that says that the command
has an initialized work item, but that's only going to be useful once we
have more users.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2011-10-24 03:21:52 +00:00
Christoph Hellwig 59aaad1ec4 target: remove unused TRANSPORT_ states
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2011-10-24 03:21:51 +00:00
Christoph Hellwig f2da9dbdb5 target: remove TRANSPORT_DEFERRED_CMD state
We never check for this state, and it makes testing for a completed
state much harder given that it overrides the existing state.

Also remove the unused deferred_t_state which is related to it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2011-10-24 03:21:49 +00:00
Christoph Hellwig bfaf40ada2 target: remove the TRANSPORT_REMOVE state
We never queue an command with this state, and only set it in a completely
bogus place in tcm_fc.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2011-10-24 03:21:47 +00:00
Christoph Hellwig cc5d0f0f61 target: stop task timers earlier
Currently we stop the timers for all tasks in a command fairly late during
I/O completion, which is fairly pointless and requires all kinds of safety
checks.

Instead delete pending timers early on in transport_complete_task, thus
ensuring no new timers firest after that.  We take t_state_lock a bit later
in that function thus making sure currenly running timers are out of the
criticial section.  To be completely sure the timer has finished we also
add another del_timer_sync call when freeing the task.

This also allows removing TF_TIMER_RUNNING as it would be equivalent
to TF_ACTIVE now.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2011-10-24 03:21:44 +00:00
Christoph Hellwig e99d48a62b target: remove TF_TIMER_STOP
TF_TIMER_STOP is useless as it only helps to mitigate a tiny race during
deleting the timer.  But given that we have cleared TF_ACTIVE at this point
we already have another mitigation a few lines down the function.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2011-10-24 03:21:42 +00:00
Christoph Hellwig cdbb70bb4c target: factor some duplicate code for stopping a task
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2011-10-24 03:21:41 +00:00
Christoph Hellwig f7a5cc0b31 target: remove SCF_EMULATE_QUEUE_FULL
Add a new boolean at_head parameter to transport_add_cmd_to_queue and thus
obsolete the SCF_EMULATE_QUEUE_FULL flag.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2011-10-24 03:21:34 +00:00
Christoph Hellwig e057f53308 target: remove the transport_qf_callback se_cmd callback
Remove the need for the transport_qf_callback callback by making
sure we have specific states with specific handlers for the two
queue full cases.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2011-10-24 03:21:32 +00:00
Christoph Hellwig f55918fa32 target: clean up the backend interface to caching parameters
Remove the dpo_emulated, fua_write_emulated, fua_read_emulated and
write_cache_emulated methods, and replace them with a simple bitfields in
se_subsystem_api in those cases where they ever returned one.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2011-10-24 03:21:31 +00:00
Christoph Hellwig b937d27052 target: remove the ->transport_split_cdb callback in se_cmd
Add a switch statement implementing the CDB LBA/len update directly
in target_get_task_cdb and remove the old ->transport_split_cdb
callback and all its implementations.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2011-10-24 03:21:15 +00:00
Christoph Hellwig 485fd0d1e3 target: replace ->get_cdb with a target_get_task_cdb helper
Instead of calling out to the backends from the core to get a per-task
CDB and then modify it for the LBA/len pair used for this CDB provide
a helper that writes the adjusted CDB into a provided buffer and call
this method from ->do_task in pscsi.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2011-10-24 03:21:13 +00:00
Christoph Hellwig 3189b067ee target: pack struct se_task more tightly
Rearrange the fields in se_task to avoid holes.  Also increase the
flags field to 16 bits as we have the space for it, and this makes
adding new flags safer.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2011-10-24 03:21:10 +00:00
Christoph Hellwig 04629b7bde target: Remove unnecessary se_task members
This is a squashed version of the following unnecessary se_task structure
member removal patches:

target: remove the task_execute_queue field in se_task

    Instead of using a separate flag we can simply do list_emptry checks
    on t_execute_list if we make sure to always use list_del_init to remove
    a task from the list.  Also factor some duplicate code into a new
    __transport_remove_task_from_execute_queue helper.

target: remove the read-only task_no field in se_task

    The task_no field never was initialized and only used in debug printks,
    so kill it.

target: remove the task_padded_sg field in se_task

    This field is only check in one place and not actually needed there.

    Rationale:
    - transport_do_task_sg_chain asserts that we have task_sg_chaining
      set early on
    - we only make use of the sg_prev_nents field we calculate based on it
      if there is another sg list that gets chained onto this one, which
      never happens for the last (or only) task.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2011-10-24 03:21:08 +00:00
Christoph Hellwig 6c76bf951c target: make more use of the task_flags field in se_task
Replace various atomic_t variables that were mostly under t_state_lock
with new flags in task_flags.  Note that the execution error path
didn't take t_state_lock before, so add it there.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2011-10-24 03:21:06 +00:00
Christoph Hellwig 42bf829eee target: Cleanup unused se_task bits
This is a squashed version of the following se_task cleanup patches:

    target: remove the unused task_state_flags field in se_task
    target: remove the unused se_obj_ptr field in se_task
    target: remove the se_dev field in se_task

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2011-10-24 03:21:05 +00:00
Christoph Hellwig c0427f1556 target: Cleanup unused target_core_base.h bits
This is a squashed version of the following target_core_base.h
cleanup patches:

    target: remove the unused SHUTDOWN_SIGS defintion
    target: remove unused se_mem leftovers
    target: remove the unused map_func_t typedef
    target: move TRANSPORT_IOV_DATA_BUFFER to the iscsi-specific code

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2011-10-24 03:21:03 +00:00
Nicholas Bellinger 8dc52b5420 target: Merge transport_cmd_finish_abort_tmr into transport_cmd_finish_abort
This patch merges transport_cmd_finish_abort_tmr() logic into a single
transport_cmd_finish_abort() function by adding a cmd->se_tmr_req check
around transport_lun_remove_cmd(), and updates the single caller within
core_tmr_drain_tmr_list().

Reported-by: Christoph Hellwig <hch@lst.de>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2011-10-24 03:20:58 +00:00
Nicholas Bellinger d14921d6ad target: Convert ->transport_wait_for_tasks usage to transport_generic_free_cmd
This patch converts se_cmd->transport_wait_for_tasks(se_cmd, 1) usage to use
transport_generic_free_cmd() directly in target-core and iscsi-target fabric
usage.  The includes:

*) Removal of the optional transport_generic_free_cmd() call from within
   transport_generic_wait_for_tasks()
*) Usage of existing SCF_SUPPORTED_SAM_OPCODE to determine when
   transport_generic_wait_for_tasks() processing may occur instead of
   checking se_cmd->transport_wait_for_tasks()
*) Move transport_generic_wait_for_tasks() call ahead of core_dec_lacl_count()
   and transport_lun_remove_cmd() in transport_generic_free_cmd() to follow
   existing logic for iscsi-target w/ se_cmd->transport_wait_for_tasks(se_cmd, 1)
*) Removal of se_cmd->transport_wait_for_tasks() function pointer
*) Rename transport_generic_wait_for_tasks() -> transport_wait_for_tasks(), and
   add docbook comment.
*) Add EXPORT_SYMBOL for transport_wait_for_tasks()

For the case in iscsi_target_erl2.c:iscsit_prepare_cmds_for_realligance()
where se_cmd->transport_wait_for_tasks(se_cmd, 0) is called, this patch
adds a direct call to transport_wait_for_tasks().

(hch: Fix transport_generic_free_cmd() usage in iscsit_release_commands_from_conn)
(nab: Add patch: Ensure that TMRs hit wait_for_tasks logic during release)

Reported-by: Christoph Hellwig <hch@lst.de>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2011-10-24 03:20:54 +00:00
Roland Dreier dd503a5fcc target: Have core_tmr_alloc_req() take an explicit GFP_xxx flag
Testing in_interrupt() to know when sleeping is allowed is not really
reliable (since eg it won't be true if the caller is holding a spinlock).
Instead have the caller tell core_tmr_alloc_req() what GFP_xxx to use;
every caller except tcm_qla2xxx can use GFP_KERNEL.

Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2011-10-24 03:20:53 +00:00
Christoph Hellwig a3eedc227b target: remove unused se_subsystem_api methods
The cdb_none, map_data_SG and map_control_SG methods have no callers left
and can be removed now.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2011-10-24 03:20:46 +00:00
Nicholas Bellinger 39c05f321a target: Remove session_reinstatement parameter from ->transport_wait_for_tasks
This patch removes the unnecessary session_reinstatement parameter from
se_cmd->transport_wait_for_tasks(), logic in transport_generic_wait_for_tasks,
and usage within iscsi-target code.

This also includes the removal of the 'bool' return from transport_put_cmd() +
transport_generic_free_cmd() that is no longer necessary.

Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2011-10-24 03:20:39 +00:00
Christoph Hellwig 82f1c8a4e7 target: push session reinstatement out of transport_generic_free_cmd
Push session reinstatement out of transport_generic_free_cmd into the only
caller that actually needs it.  Clean up transport_generic_free_cmd a bit,
and remove the useless comment.  I'd love to add a more useful kerneldoc
comment for it, but as this point I'm still a bit confused in where it
stands in the command release stack.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2011-10-24 03:20:38 +00:00
Christoph Hellwig 2dbc43d256 target: remove transport_free_se_cmd
It is only called by transport_release_cmd, so inline it there.  Also add
a kerneldoc comment for transport_release_cmd while we are at it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2011-10-24 03:20:31 +00:00
Christoph Hellwig 680b73c5f2 target: remove transport_generic_handle_cdb
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2011-10-24 03:20:28 +00:00
Russell King 34471a9168 Merge branch 'ppi-irq-core-for-rmk' of git://github.com/mzyngier/arm-platforms into devel-stable 2011-10-23 14:42:30 +01:00
Eric Dumazet b5d9c9c281 inet: add rfc 3168 extract in front of INET_ECN_encapsulate()
INET_ECN_encapsulate() is better understood if we can read the official
statement.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-22 01:25:23 -04:00
Rafael J. Wysocki d033e07856 Merge branch 'pm-domains' into pm-for-linus
* pm-domains:
  ARM: mach-shmobile: sh7372 A4R support (v4)
  ARM: mach-shmobile: sh7372 A3SP support (v4)
  PM / Sleep: Mark devices involved in wakeup signaling during suspend
2011-10-22 00:21:52 +02:00
Rafael J. Wysocki 4ca46ff3e0 PM / Sleep: Mark devices involved in wakeup signaling during suspend
The generic PM domains code in drivers/base/power/domain.c has
to avoid powering off domains that provide power to wakeup devices
during system suspend.  Currently, however, this only works for
wakeup devices directly belonging to the given domain and not for
their children (or the children of their children and so on).
Thus, if there's a wakeup device whose parent belongs to a power
domain handled by the generic PM domains code, the domain will be
powered off during system suspend preventing the device from
signaling wakeup.

To address this problem introduce a device flag, power.wakeup_path,
that will be set during system suspend for all wakeup devices,
their parents, the parents of their parents and so on.  This way,
all wakeup paths in the device hierarchy will be marked and the
generic PM domains code will only need to avoid powering off
domains containing devices whose power.wakeup_path is set.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-10-22 00:19:29 +02:00
Joerg Roedel 1abb4ba596 Merge branches 'amd/fixes', 'debug/dma-api', 'arm/omap', 'arm/msm', 'core', 'iommu/fault-reporting' and 'api/iommu-ops-per-bus' into next
Conflicts:
	drivers/iommu/amd_iommu.c
	drivers/iommu/iommu.c
2011-10-21 14:38:55 +02:00
Joerg Roedel 94441c3bd9 iommu/core: Remove global iommu_ops and register_iommu
With all IOMMU drivers being converted to bus_set_iommu the
global iommu_ops are no longer required. The same is true
for the deprecated register_iommu function.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-10-21 14:37:23 +02:00
Joerg Roedel a1b60c1cd9 iommu/core: Convert iommu_found to iommu_present
With per-bus iommu_ops the iommu_found function needs to
work on a bus_type too. This patch adds a bus_type parameter
to that function and converts all call-places.
The function is also renamed to iommu_present because the
function now checks if an iommu is present for a given bus
and does not check for a global iommu anymore.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-10-21 14:37:20 +02:00
Joerg Roedel 905d66c1e5 iommu/core: Add bus_type parameter to iommu_domain_alloc
This is necessary to store a pointer to the bus-specific
iommu_ops in the iommu-domain structure. It will be used
later to call into bus-specific iommu-ops.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-10-21 14:37:19 +02:00
Joerg Roedel ff21776d12 Driver core: Add iommu_ops to bus_type
This is the starting point to make the iommu_ops used for
the iommu-api a per-bus-type structure. It is required to
easily implement bus-specific setup in the iommu-layer.
The first user will be the iommu-group attribute in sysfs.

Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-10-21 14:37:19 +02:00
Joerg Roedel 39d4ebb959 iommu/core: Define iommu_ops and register_iommu only with CONFIG_IOMMU_API
This makes it impossible to compile an iommu driver into the
kernel without selecting CONFIG_IOMMU_API.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-10-21 14:37:18 +02:00
Steffen Klassert 07a5fa4abd crypto: Add userspace report for cipher type algorithms
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21 14:24:07 +02:00
Steffen Klassert 792608e9c2 crypto: Add userspace report for rng type algorithms
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21 14:24:06 +02:00
Steffen Klassert a55465dca7 crypto: Add userspace report for pcompress type algorithms
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21 14:24:06 +02:00
Steffen Klassert 6ad414fe71 crypto: Add userspace report for aead type algorithms
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21 14:24:06 +02:00
Steffen Klassert 50496a1fab crypto: Add userspace report for blkcipher type algorithms
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21 14:24:05 +02:00
Steffen Klassert f4d663ce63 crypto: Add userspace report for shash type algorithms
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21 14:24:04 +02:00
Steffen Klassert 6c5a86f529 crypto: Add userspace report for larval type algorithms
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21 14:24:04 +02:00
Steffen Klassert b6aa63c09b crypto: Add a report function pointer to crypto_type
We add a report function pointer to struct crypto_type. This function
pointer is used from the crypto userspace configuration API to report
crypto algorithms to userspace.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21 14:24:03 +02:00
Steffen Klassert a38f7907b9 crypto: Add userspace configuration API
This patch adds a basic userspace configuration API for the crypto layer.
With this it is possible to instantiate, remove and to show crypto
algorithms from userspace.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21 14:24:03 +02:00
Steffen Klassert 64a947b133 crypto: Add a flag to identify crypto instances
The upcomming crypto user configuration api needs to identify
crypto instances. This patch adds a flag that is set if the
algorithm is an instance that is build from templates.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21 14:24:01 +02:00
Eric Dumazet cf533ea53e tcp: add const qualifiers where possible
Adding const qualifiers to pointers can ease code review, and spot some
bugs. It might allow compiler to optimize code further.

For example, is it legal to temporary write a null cksum into tcphdr
in tcp_md5_hash_header() ? I am afraid a sniffer could catch the
temporary null value...

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-21 05:22:42 -04:00
Eric W. Biederman e09eff7fc1 macvtap: Fix the minor device number allocation
On systems that create and delete lots of dynamic devices the
31bit linux ifindex fails to fit in the 16bit macvtap minor,
resulting in unusable macvtap devices.  I have systems running
automated tests that that hit this condition in just a few days.

Use a linux idr allocator to track which mavtap minor numbers
are available and and to track the association between macvtap
minor numbers and macvtap network devices.

Remove the unnecessary unneccessary check to see if the network
device we have found is indeed a macvtap device.  With macvtap
specific data structures it is impossible to find any other
kind of networking device.

Increase the macvtap minor range from 65536 to the full 20 bits
that is supported by linux device numbers.  It doesn't solve the
original problem but there is no penalty for a larger minor
device range.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-21 02:53:07 -04:00
Ian Campbell a8605c6063 net: add opaque struct around skb frag page
I've split this bit out of the skb frag destructor patch since it helps enforce
the use of the fragment API.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-21 02:52:53 -04:00
Jesse Barnes a60f0e38d7 drm/i915: add DP test request handling
DPCD 1.1+ adds some automated test infrastructure support.  Add support
for reading the IRQ source and jumping to a test handling routine if
needed.  Subsequent patches will handle particular tests; this patch
just ACKs any requested tests by default.

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Keith Packard <keithp@keithp.com>
2011-10-20 23:22:01 -07:00
Ben Widawsky b73fe58caf drm: Add Panel Self Refresh DP addresses
Add the addresses and definitions I care about for Panel Self Refresh, as
documented in the eDP spec.

I'm sending these out before some other patches because this should be a fairly
simple one to get upstream and not require too much fuss (where the others may
have some fuss).

This file is a mess with white spacing. I tried to stay consistent with the
surrounding code.

v2: had some silly mistakes in v1 which Keith caught

Cc: Dave Airlie <airlied@redhat.com>
Cc: Keith Packard <keithp@keithp.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Keith Packard <keithp@keithp.com>
2011-10-20 15:26:39 -07:00
Ben Widawsky 5c0422878f drm/i915: ILK + VT-d workaround
Idle the GPU before doing any unmaps. We know if VT-d is in use through
an exported variable from iommu code.

This should avoid a known HW issue.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Keith Packard <keithp@keithp.com>
2011-10-20 15:26:39 -07:00
Daniel Vetter 24dd85ff72 io-mapping: ensure io_mapping_map_atomic _is_ atomic
For the !HAVE_ATOMIC_IOMAP case the stub functions did not call
pagefault_disable/_enable. The i915 driver relies on the map
actually being atomic, otherwise it can deadlock with it's own
pagefault handler in the gtt pwrite fastpath.

This is exercised by gem_mmap_gtt from the intel-gpu-toosl gem
testsuite.

v2: Chris Wilson noted the lack of an include.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=38115
Cc: stable@kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Keith Packard <keithp@keithp.com>
2011-10-20 15:26:17 -07:00
Maciej Żenczykowski 6cc7a765c2 net: allow CAP_NET_RAW to set socket options IP{,V6}_TRANSPARENT
Up till now the IP{,V6}_TRANSPARENT socket options (which actually set
the same bit in the socket struct) have required CAP_NET_ADMIN
privileges to set or clear the option.

- we make clearing the bit not require any privileges.
- we allow CAP_NET_ADMIN to set the bit (as before this change)
- we allow CAP_NET_RAW to set this bit, because raw
  sockets already pretty much effectively allow you
  to emulate socket transparency.

Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-20 18:21:36 -04:00
Eric Dumazet 05bdd2f143 net: constify skbuff and Qdisc elements
Preliminary patch before tcp constification

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-20 17:45:43 -04:00
Mike Christie 2d63673b4d [SCSI] iscsi class: fix vlan configuration
Userspace was sending the priority/id part of the vlan tag
and sysfs was displaying the id in the vlan file. This
renames the vlan sysfs file to vlan_id to reflect that it
was showing the id and to match the vlan_priority file.
This also adds a ISCSI_NET_PARAM_VLAN_TAG iscsi nl command
to relfect that we are sending down the vlan/priority
part of the tag.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-20 10:13:55 -05:00
Mike Christie 00c31889f7 [SCSI] qla4xxx: fix data alignment and use nl helpers
This has the driver use helpers for a common operation and fixes
a issue where if multiple iscsi params are sent they could be
sent at offsets that cause unaligned accesses. The nla helpers
account for the padding needed to align properly for the driver.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-20 10:12:44 -05:00
Mike Christie 8d4a690cd4 [SCSI] iscsi class: Replace iscsi_get_next_target_id with IDA
Replaced the iscsi_get_next_target_id with IDA to make
 target-id allocation efficient for iscsi offload drivers

 This patch should be applied after Jonathen Cameron Patch
 "ida : simplified functions for id allocation"

Signed-off-by: John Soni Jose <jose0here@gmail.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-20 10:10:07 -05:00
Jens Axboe b8d8bdfe31 Merge branch 'stable/for-jens-3.2' of git://oss.oracle.com/git/kwilk/xen into for-3.2/drivers 2011-10-20 15:10:59 +02:00
Stephen Warren a5818a8bd0 pinctrl: get_group_pins() const fixes
get_group_pins() "returns" a pointer to an array of const objects, through
a pointer parameter. Fix the prototype so what's pointed at by the returned
pointer is const, rather than the function parameter being const.

This also allows the removal of a cast in each of the two current pinmux
drivers.

Signed-off-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2011-10-20 11:41:49 +02:00
Ian Campbell 30d3c128ea mm: add a "struct page_frag" type containing a page, offset and length
A few network drivers currently use skb_frag_struct for this purpose but I have
patches which add additional fields and semantics there which these other uses
do not want.

A structure for reference sub-page regions seems like a generally useful thing
so do so instead of adding a network subsystem specific structure.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Jens Axboe <jaxboe@fusionio.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-20 04:58:32 -04:00
Steve French b957ae9c53 [CIFS] Fixup trivial checkpatch warning
Signed-off-by: Steve French <smfrench@gmail.com>
2011-10-19 21:27:11 -05:00
Ian Campbell a0bec1cd8f net: do not take an additional reference in skb_frag_set_page
I audited all of the callers in the tree and only one of them (pktgen) expects
it to do so. Taking this reference is pretty obviously confusing and error
prone.

In particular I looked at the following commits which switched callers of
(__)skb_frag_set_page to the skb paged fragment api:

6a930b9f16 cxgb3: convert to SKB paged frag API.
5dc3e196ea myri10ge: convert to SKB paged frag API.
0e0634d20d vmxnet3: convert to SKB paged frag API.
86ee8130a4 virtionet: convert to SKB paged frag API.
4a22c4c919 sfc: convert to SKB paged frag API.
18324d690d cassini: convert to SKB paged frag API.
b061b39e3a benet: convert to SKB paged frag API.
b7b6a688d2 bnx2: convert to SKB paged frag API.
804cf14ea5 net: xfrm: convert to SKB frag APIs
ea2ab69379 net: convert core to skb paged frag APIs

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-19 19:40:39 -04:00
Dan Carpenter 4f25af2782 filter: use unsigned int to silence static checker warning
This is just a cleanup.

My testing version of Smatch warns about this:
net/core/filter.c +380 check_load_and_stores(6)
	warn: check 'flen' for negative values

flen comes from the user.  We try to clamp the values here between 1
and BPF_MAXINSNS but the clamp doesn't work because it could be
negative.  This is a bug, but it's not exploitable.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-19 19:35:51 -04:00
Eric W. Biederman 672d82c18d class: Implement support for class attrs in tagged sysfs directories.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-19 19:24:15 -04:00
Eric W. Biederman 487505c257 sysfs: Implement support for tagged files in sysfs.
Looking up files in sysfs is hard to understand and analyize because we
currently allow placing untagged files in tagged directories.  In the
implementation of that we have two subtly different meanings of NULL.
NULL meaning there is no tag on a directory entry and NULL meaning
we don't care which namespace the lookup is performed for.  This
multiple uses of NULL have resulted in subtle bugs (since fixed)
in the code.

Currently it is only the bonding driver that needs to have an untagged
file in a tagged directory.

To untagle this mess I am adding support for tagged files to sysfs.
Modifying the bonding driver to implement bonding_masters as a tagged
file.  Registering bonding_masters once for each network namespace.
Then I am removing support for untagged entries in tagged sysfs
directories.

Resulting in code that is much easier to reason about.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-19 19:24:14 -04:00
Richard Cochran 4dc360c5e7 net: validate HWTSTAMP ioctl parameters
This patch adds a sanity check on the values provided by user space for
the hardware time stamping configuration. If the values lie outside of
the absolute limits, then the ioctl request will be denied.

Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-19 17:00:35 -04:00
Trond Myklebust fba730050d NFS: Don't rely on PageError in nfs_readpage_release_partial
Don't rely on the PageError flag to tell us if one of the partial reads of
the page failed. Instead, replace that with a dedicated flag in the
struct nfs_page.

Then clean out redundant uses of the PageError flag: the VM no longer
checks it for reads.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-10-19 13:58:38 -07:00
Trond Myklebust b8ef70639b NFS: Get rid of the unused nfs_write_data->flags field
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-10-19 13:37:34 -07:00
Trond Myklebust a1940805d0 NFS: Get rid of the unused nfs_read_data->flags field
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-10-19 13:37:34 -07:00
Andy Fleming 3d153a7c8b net: Allow skb_recycle_check to be done in stages
skb_recycle_check resets the skb if it's eligible for recycling.
However, there are times when a driver might want to optionally
manipulate the skb data with the skb before resetting the skb,
but after it has determined eligibility.  We do this by splitting the
eligibility check from the skb reset, creating two inline functions to
accomplish that task.

Signed-off-by: Andy Fleming <afleming@freescale.com>
Acked-by: David Daney <david.daney@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-19 15:59:45 -04:00
Jeff Layton f06ac72e92 cifs, freezer: add wait_event_freezekillable and have cifs use it
CIFS currently uses wait_event_killable to put tasks to sleep while
they await replies from the server. That function though does not
allow the freezer to run. In many cases, the network interface may
be going down anyway, in which case the reply will never come. The
client then ends up blocking the computer from suspending.

Fix this by adding a new wait_event_freezable variant --
wait_event_freezekillable. The idea is to combine the behavior of
wait_event_killable and wait_event_freezable -- put the task to
sleep and only allow it to be awoken by fatal signals, but also
allow the freezer to do its job.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
2011-10-19 15:30:40 -04:00
J. Bruce Fields 8b289b2c23 nfsd4: implement new 4.1 open reclaim types
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-10-19 11:52:12 -04:00
Tejun Heo bd87b5898a block: drop @tsk from attempt_plug_merge() and explain sync rules
attempt_plug_merge() accesses elevator without holding queue_lock and
may call into ->elevator_bio_merge_fn().  The elvator is guaranteed to
be valid because it's accessed iff the plugged list has requests and
elevator is never exited with live requests, so as long as the
elevator method can deal with unlocked access, this is safe.

Explain the sync rules around attempt_plug_merge() and drop the
unnecessary @tsk parameter.

This patch doesn't introduce any functional change.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2011-10-19 14:33:08 +02:00
Tejun Heo bc9fcbf9cb block: move blk_throtl prototypes to block/blk.h
blk_throtl interface is block internal and there's no reason to have
them in linux/blkdev.h.  Move them to block/blk.h.

This patch doesn't introduce any functional change.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2011-10-19 14:31:18 +02:00
Jens Axboe 5c04b426f2 Merge branch 'v3.1-rc10' into for-3.2/core
Conflicts:
	block/blk-core.c
	include/linux/blkdev.h

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2011-10-19 14:30:42 +02:00
Yevgeny Petrilin f3a9d1f25d mlx4_en: Controlling FCS header removal
Canceling FCS removal where FW allows for better alignment
of incoming data.

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-19 03:42:26 -04:00
Daniel Martensson 5ea2ef5f8b caif-hsi: Added recovery check of CA wake status.
Added recovery check of CA wake status in case of wake up timeout.
Added check of CA wake status in case of wake down timeout.

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-19 03:25:43 -04:00
Daniel Martensson 5bbed92d3d caif-hsi: Added sanity check for length of CAIF frames
Added sanity check for length of CAIF frames, and tear down of
CAIF link-layer device upon protocol error.

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-19 03:25:42 -04:00
Dmitry Tarnyagin 28bd204942 caif-hsi: Make inactivity timeout configurable.
CAIF HSI uses a timer for inactivity. Upon timeout HSI-wake signaling
is initiated to allow power-down of the HSI block.

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-19 03:25:42 -04:00
Daniel Martensson 687b13e98a caif-hsi: Making read and writes asynchronous.
Some platforms do not allow to put HSI block into low-power
mode when FIFO is not empty. The patch flushes (by reading)
FIFO at wake down sequence. Asynchronous read and write is
implemented for that. As a side effect this will also greatly
improve performance.

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-19 03:25:41 -04:00
Eric Dumazet 9e903e0852 net: add skb frag size accessors
To ease skb->truesize sanitization, its better to be able to localize
all references to skb frags size.

Define accessors : skb_frag_size() to fetch frag size, and
skb_frag_size_{set|add|sub}() to manipulate it.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-19 03:10:46 -04:00
Michael Hennerich 3f48e73543 Input: adp5589-keys - add support for the ADP5585 derivatives
The ADP5585 family keypad decoder and IO expander is similar to the ADP5589,
however it features less IO pins, and lacks hardware assisted key-lock
functionality. Unfortunately the register addresses are different, as well as
the event codes and bit organization within the port related registers.

Move ADP5589 Register defines from the header file into the main source file.
Add new defines while making sure we don't break existing platform_data.
Add register address translation, and turn device specific defines into variables.
Introduce some helper functions and disable functions that doesn't
exist on the added devices.

Signed-off-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2011-10-18 21:26:55 -07:00
Eric Dumazet bc416d9768 macvlan: handle fragmented multicast frames
Fragmented multicast frames are delivered to a single macvlan port,
because ip defrag logic considers other samples are redundant.

Implement a defrag step before trying to send the multicast frame.

Reported-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-18 23:22:07 -04:00
Trond Myklebust 0c2e53f11a NFS: Remove the unused "lookupfh()" version of nfs4_proc_lookup()
...and also remove the associated nfs_v4_clientops entry.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-10-18 16:13:51 -07:00
Jiri Slaby fa90e1c935 TTY: make tty_add_file non-failing
If tty_add_file fails at the point it is now, we have to revert all
the changes we did to the tty. It means either decrease all refcounts
if this was a tty reopen or delete the tty if it was newly allocated.

There was a try to fix this in v3.0-rc2 using tty_release in 0259894c7
(TTY: fix fail path in tty_open). But instead it introduced a NULL
dereference. It's because tty_release dereferences
filp->private_data, but that one is set even in our tty_add_file. And
when tty_add_file fails, it's still NULL/garbage. Hence tty_release
cannot be called there.

To circumvent the original leak (and the current NULL deref) we split
tty_add_file into two functions, making the latter non-failing. In
that case we may do the former early in open, where handling failures
is easy. The latter stays as it is now. So there is no change in
functionality.

The original bug (leak) was introduced by f573bd176 (tty: Remove
__GFP_NOFAIL from tty_add_file()). Thanks Dan for reporting this.

Later, we may split tty_release into more functions and call only some
of them in this fail path instead. (If at all possible.)

Introduced-in: v2.6.37-rc2
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: stable <stable@vger.kernel.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-18 14:22:37 -07:00
Thomas Meyer 8193c42906 tty: Support compat_ioctl get/set termios_locked
When running a Fedora 15 (x86) on an x86_64 kernel, in the boot process
plymouthd complains about those two missing ioctls:
[    2.581783] ioctl32(plymouthd:186): Unknown cmd fd(10) cmd(00005457){t:'T';sz:0} arg(ffb6a5d0) on /dev/tty1
[    2.581803] ioctl32(plymouthd:186): Unknown cmd fd(10) cmd(00005456){t:'T';sz:0} arg(ffb6a680) on /dev/tty1

both ioctl functions work on the 'struct termios' resp. 'struct termios2',
which has the same size (36 bytes resp. 44 bytes) on x86 and x86_64,
so it's just a matter of converting the pointer from userland.

Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-18 14:17:11 -07:00
Jason Baron 07613b0b5e dynamic_debug: consolidate repetitive struct _ddebug descriptor definitions
Replace the repetitive struct _ddebug descriptor definitions with a new
DECLARE_DYNAMIC_DEBUG_META_DATA(name, fmt) macro.

[akpm@linux-foundation.org: s/DECLARE/DEFINE/]
Signed-off-by: Jason Baron <jbaron@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-18 11:22:00 -07:00
Kai Jiang 27a90700a4 uio: Support physical addresses >32 bits on 32-bit systems
To support >32-bit physical addresses for UIO_MEM_PHYS type we need to
extend the width of 'addr' in struct uio_mem.  Numerous platforms like
embedded PPC, ARM, and X86 have support for systems with larger physical
address than logical.

Since 'addr' may contain a physical, logical, or virtual address the
easiest solution is to just change the type to 'phys_addr_t' which
should always be greater than or equal to the sizeof(void *) such that
it can properly hold any of the address types.

For physical address we can support up to a 44-bit physical address on a
typical 32-bit system as we utilize remap_pfn_range() for the mapping of
the memory region and pfn's are represnted by shifting the address by
the page size (typically 4k).

Signed-off-by: Kai Jiang <Kai.Jiang@freescale.com>
Signed-off-by: Minghuan Lian <Minghuan.Lian@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Hans J. Koch <hjk@hansjkoch.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-18 11:18:57 -07:00
Trond Myklebust a9a4a87a59 NFS: Use the inode->i_version to cache NFSv4 change attribute information
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-10-18 09:14:34 -07:00
Trond Myklebust d77385f238 SUNRPC: Fix rpc_sockaddr2uaddr
rpc_sockaddr2uaddr is only used by net/sunrpc/rpcb_clnt.c, where
it is used in a non-blockable context in at least one case.

Add non-blocking capability by adding a gfp_t argument

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-10-18 09:13:32 -07:00
Peng Tao c1225158a8 SUNRPC/NFS: make rpc pipe upcall generic
The same function is used by idmap, gss and blocklayout code. Make it
generic.

Signed-off-by: Peng Tao <peng_tao@emc.com>
Signed-off-by: Jim Rees <rees@umich.edu>
Cc: stable@kernel.org [3.0]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-10-18 09:08:12 -07:00
Dave Airlie 017ed8012e Merge tag 'v3.1-rc10' into drm-core-next
There are a number of fixes in mainline required for code in -next,
also there was a few conflicts I'd rather resolve myself.

Signed-off-by: Dave Airlie <airlied@redhat.com>

Conflicts:
	drivers/gpu/drm/radeon/evergreen.c
	drivers/gpu/drm/radeon/r600.c
	drivers/gpu/drm/radeon/radeon_asic.h
2011-10-18 10:54:30 +01:00
David S. Miller ae2a458315 Merge branch 'nf' of git://1984.lsi.us.es/net 2011-10-17 19:38:03 -04:00
Marc Kleine-Budde f861c2b80c can: remove references to berlios mailinglist
The BerliOS project, which currently hosts our mailinglist, will
close with the end of the year. Now take the chance and remove all
occurrences of the mailinglist address from the source files.

Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-17 19:22:46 -04:00
Gerrit Renker f36c23bb9f udplite: fast-path computation of checksum coverage
Commit 903ab86d19 of 1 March this year ("udp: Add
lockless transmit path") introduced a new fast TX path that broke the checksum
coverage computation of UDP-lite, which so far depended on up->len (only set
if the socket is locked and 0 in the fast path).

Fixed by providing both fast- and slow-path computation of checksum coverage.
The latter can be removed when UDP(-lite)v6 also uses a lockless transmit path.
 
Reported-by: Thomas Volkert <thomas@homer-conferencing.com>
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-17 19:07:30 -04:00
John W. Linville 41ebe9cde7 Merge branch 'master' of git://git.infradead.org/users/linville/wireless-next into for-davem 2011-10-17 15:05:26 -04:00
Christoph Hellwig 456be1484f loop: remove the incorrect write_begin/write_end shortcut
Currently the loop device tries to call directly into write_begin/write_end
instead of going through ->write if it can.  This is a fairly nasty shortcut
as write_begin and write_end are only callbacks for the generic write code
and expect to be called with filesystem specific locks held.

This code currently causes various issues for clustered filesystems as it
doesn't take the required cluster locks, and it also causes issues for XFS
as it doesn't properly lock against the swapext ioctl as called by the
defragmentation tools.  This in case causes data corruption if
defragmentation hits a busy loop device in the wrong time window, as
reported by RH QA.

The reason why we have this shortcut is that it saves a data copy when
doing a transformation on the loop device, which is the technical term
for using cryptoloop (or an XOR transformation).  Given that cryptoloop
has been deprecated in favour of dm-crypt my opinion is that we should
simply drop this shortcut instead of finding complicated ways to to
introduce a formal interface for this shortcut.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2011-10-17 12:57:20 +02:00
Ian Campbell 9bab0b7fba genirq: Add IRQF_RESUME_EARLY and resume such IRQs earlier
This adds a mechanism to resume selected IRQs during syscore_resume
instead of dpm_resume_noirq.

Under Xen we need to resume IRQs associated with IPIs early enough
that the resched IPI is unmasked and we can therefore schedule
ourselves out of the stop_machine where the suspend/resume takes
place.

This issue was introduced by 676dc3cf5b "xen: Use IRQF_FORCE_RESUME".

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Jeremy Fitzhardinge <Jeremy.Fitzhardinge@citrix.com>
Cc: xen-devel <xen-devel@lists.xensource.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Link: http://lkml.kernel.org/r/1318713254.11016.52.camel@dagon.hellion.org.uk
Cc: stable@kernel.org (at least to 2.6.32.y)
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-10-17 11:42:49 +02:00
Rafael J. Wysocki 2aede851dd PM / Hibernate: Freeze kernel threads after preallocating memory
There is a problem with the current ordering of hibernate code which
leads to deadlocks in some filesystems' memory shrinkers.  Namely,
some filesystems use freezable kernel threads that are inactive when
the hibernate memory preallocation is carried out.  Those same
filesystems use memory shrinkers that may be triggered by the
hibernate memory preallocation.  If those memory shrinkers wait for
the frozen kernel threads, the hibernate process deadlocks (this
happens with XFS, for one example).

Apparently, it is not technically viable to redesign the filesystems
in question to avoid the situation described above, so the only
possible solution of this issue is to defer the freezing of kernel
threads until the hibernate memory preallocation is done, which is
implemented by this change.

Unfortunately, this requires the memory preallocation to be done
before the "prepare" stage of device freeze, so after this change the
only way drivers can allocate additional memory for their freeze
routines in a clean way is to use PM notifiers.

Reported-by: Christoph <cr2005@u-club.de>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-10-16 23:28:52 +02:00
H Hartley Sweeten 37cce26b32 PM / VT: Cleanup #if defined uglyness and fix compile error
Introduce the config option CONFIG_VT_CONSOLE_SLEEP in order to cleanup
the #if defined ugliness for the vt suspend support functions. Note that
CONFIG_VT_CONSOLE is already dependant on CONFIG_VT.

The function pm_set_vt_switch is actually dependant on CONFIG_VT and not
CONFIG_PM_SLEEP. This fixes a compile error when CONFIG_PM_SLEEP is
not set:

drivers/tty/vt/vt_ioctl.c:1794: error: redefinition of 'pm_set_vt_switch'
include/linux/suspend.h:17: error: previous definition of 'pm_set_vt_switch' was here

Also, remove the incorrect path from the comment in console.c.

[rjw: Replaced #if defined() with #ifdef in suspend.h.]

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-10-16 23:28:51 +02:00
Martin Schwidefsky 85055dd805 PM / Hibernate: Include storage keys in hibernation image on s390
For s390 there is one additional byte associated with each page,
the storage key. This byte contains the referenced and changed
bits and needs to be included into the hibernation image.
If the storage keys are not restored to their previous state all
original pages would appear to be dirty. This can cause
inconsistencies e.g. with read-only filesystems.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-10-16 23:27:46 +02:00
ShuoX Liu 2a77c46de1 PM / Suspend: Add statistics debugfs file for suspend to RAM
Record S3 failure time about each reason and the latest two failed
devices' names in S3 progress.
We can check it through 'suspend_stats' entry in debugfs.

The motivation of the patch:

We are enabling power features on Medfield. Comparing with PC/notebook,
a mobile enters/exits suspend-2-ram (we call it s3 on Medfield) far
more frequently. If it can't enter suspend-2-ram in time, the power
might be used up soon.

We often find sometimes, a device suspend fails. Then, system retries
s3 over and over again. As display is off, testers and developers
don't know what happens.

Some testers and developers complain they don't know if system
tries suspend-2-ram, and what device fails to suspend. They need
such info for a quick check. The patch adds suspend_stats under
debugfs for users to check suspend to RAM statistics quickly.

If not using this patch, we have other methods to get info about
what device fails. One is to turn on  CONFIG_PM_DEBUG, but users
would get too much info and testers need recompile the system.

In addition, dynamic debug is another good tool to dump debug info.
But it still doesn't match our utilization scenario closely.
1) user need write a user space parser to process the syslog output;
2) Our testing scenario is we leave the mobile for at least hours.
   Then, check its status. No serial console available during the
   testing. One is because console would be suspended, and the other
   is serial console connecting with spi or HSU devices would consume
   power. These devices are powered off at suspend-2-ram.

Signed-off-by: ShuoX Liu <shuox.liu@intel.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-10-16 23:27:45 +02:00
Greg Rose 5f8444a3fa if_link: Add additional parameter to IFLA_VF_INFO for spoof checking
Add configuration setting for drivers to turn spoof checking on or off
for discrete VFs.

v2 - Fix indentation problem, wrap the ifla_vf_info structure in
     #ifdef __KERNEL__ to prevent user space from accessing and
     change function paramater for the spoof check setting netdev
     op from u8 to bool.
v3 - Preset spoof check setting to -1 so that user space tools such
     as ip can detect that the driver didn't report a spoofcheck
     setting.  Prevents incorrect display of spoof check settings
     for drivers that don't report it.

Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2011-10-16 13:15:38 -07:00
Dan Williams 1a34c06401 [SCSI] libsas: fix port->dev_list locking
port->dev_list maintains a list of devices attached to a given sas root port.
It needs to be mutated under a lock as contexts outside of the
single-threaded-libsas-workqueue access the list via sas_find_dev_by_rphy().
Fixup locations where the list was being mutated without a lock.

This is a follow-up to commit 5911e963 "[SCSI] libsas: remove expander
from dev list on error", where Luben noted [1]:

    > 2/ We have unlocked list manipulations in sas_ex_discover_end_dev(),
    > sas_unregister_common_dev(), and sas_ex_discover_end_dev()

    Yes, I can see that and that is very unfortunate.

[1]: http://marc.info/?l=linux-scsi&m=131480962006471&w=2

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-16 10:54:02 -05:00
Bhanu Prakash Gollapudi 814740d5f6 [SCSI] fcoe,libfcoe: Move common code for fcoe_get_lesb to fcoe_transport
Except for obtaining the netdev from lport, fcoe_get_lesb is the common code
for the LLDs.

Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Acked-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-16 10:38:01 -05:00
Florian Tobias Schandinat ef26b7943c Merge branch 'for-florian' of git://gitorious.org/linux-omap-dss2/linux into fbdev-next 2011-10-15 00:19:52 +00:00
Florian Tobias Schandinat 07aaae44f5 Merge commit 'v3.1-rc9' into fbdev-next 2011-10-15 00:14:01 +00:00
Mark Brown 0151546fb3 regulator: Constify constraints name
There's no need for the API to modify it and having it const makes it
easier to use with random strings the board code has.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-10-14 20:59:30 +01:00
Peter Ujfalusi 1d69c5c5de ASoC: core: Add flag to ignore pmdown_time at pcm_close
With this flag codec drivers can indicate that it is desired
to ignore the pmdown_time for DAPM shutdown sequence when
playback stream is stopped.
The DAPM sequence will be executed without delay in this case.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-10-14 20:42:21 +01:00
Helmut Schaa bb6e753e95 nl80211: Add sta_flags to the station info
Reuse the already existing struct nl80211_sta_flag_update to specify
both, a flag mask and the flag set itself. This means
nl80211_sta_flag_update is now used for setting station flags and also
for getting station flags.

Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-10-14 14:48:23 -04:00
Helmut Schaa 7f2a5e214d mac80211: Populate radiotap header with MCS info for TX frames
mac80211 already filled in the MCS rate info for rx'ed frames but tx'ed
frames that are sent to a monitor interface during the status callback
lack this information.

Add the radiotap fields for MCS info to ieee80211_tx_status_rtap_hdr
and populate them when sending tx'ed frames to the monitors.

The needed headroom is only extended by one byte since we don't include
legacy rate information in the rtap header for HT frames.

Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-10-14 14:48:14 -04:00
Szymon Janc 88149db494 Bluetooth: rfcomm: Fix sleep in invalid context in rfcomm_security_cfm
This was triggered by turning off encryption on ACL link when rfcomm
was using high security. rfcomm_security_cfm (which is called from rx
task) was closing DLC and this involves sending disconnect message
(and locking socket).

Move closing DLC to rfcomm_process_dlcs and only flag DLC for closure
in rfcomm_security_cfm.

BUG: sleeping function called from invalid context at net/core/sock.c:2032
in_atomic(): 1, irqs_disabled(): 0, pid: 1788, name: kworker/0:3
[<c0068a08>] (unwind_backtrace+0x0/0x108) from [<c05e25dc>] (dump_stack+0x20/0x24)
[<c05e25dc>] (dump_stack+0x20/0x24) from [<c0087ba8>] (__might_sleep+0x110/0x12c)
[<c0087ba8>] (__might_sleep+0x110/0x12c) from [<c04801d8>] (lock_sock_nested+0x2c/0x64)
[<c04801d8>] (lock_sock_nested+0x2c/0x64) from [<c05670c8>] (l2cap_sock_sendmsg+0x58/0xcc)
[<c05670c8>] (l2cap_sock_sendmsg+0x58/0xcc) from [<c047cf6c>] (sock_sendmsg+0xb0/0xd0)
[<c047cf6c>] (sock_sendmsg+0xb0/0xd0) from [<c047cfc8>] (kernel_sendmsg+0x3c/0x44)
[<c047cfc8>] (kernel_sendmsg+0x3c/0x44) from [<c056b0e8>] (rfcomm_send_frame+0x50/0x58)
[<c056b0e8>] (rfcomm_send_frame+0x50/0x58) from [<c056b168>] (rfcomm_send_disc+0x78/0x80)
[<c056b168>] (rfcomm_send_disc+0x78/0x80) from [<c056b9f4>] (__rfcomm_dlc_close+0x2d0/0x2fc)
[<c056b9f4>] (__rfcomm_dlc_close+0x2d0/0x2fc) from [<c056bbac>] (rfcomm_security_cfm+0x140/0x1e0)
[<c056bbac>] (rfcomm_security_cfm+0x140/0x1e0) from [<c0555ec0>] (hci_event_packet+0x1ce8/0x4d84)
[<c0555ec0>] (hci_event_packet+0x1ce8/0x4d84) from [<c0550380>] (hci_rx_task+0x1d0/0x2d0)
[<c0550380>] (hci_rx_task+0x1d0/0x2d0) from [<c009ee04>] (tasklet_action+0x138/0x1e4)
[<c009ee04>] (tasklet_action+0x138/0x1e4) from [<c009f21c>] (__do_softirq+0xcc/0x274)
[<c009f21c>] (__do_softirq+0xcc/0x274) from [<c009f6c0>] (do_softirq+0x60/0x6c)
[<c009f6c0>] (do_softirq+0x60/0x6c) from [<c009f794>] (local_bh_enable_ip+0xc8/0xd4)
[<c009f794>] (local_bh_enable_ip+0xc8/0xd4) from [<c05e5804>] (_raw_spin_unlock_bh+0x48/0x4c)
[<c05e5804>] (_raw_spin_unlock_bh+0x48/0x4c) from [<c040d470>] (data_from_chip+0xf4/0xaec)
[<c040d470>] (data_from_chip+0xf4/0xaec) from [<c04136c0>] (send_skb_to_core+0x40/0x178)
[<c04136c0>] (send_skb_to_core+0x40/0x178) from [<c04139f4>] (cg2900_hu_receive+0x15c/0x2d0)
[<c04139f4>] (cg2900_hu_receive+0x15c/0x2d0) from [<c0414cb8>] (hci_uart_tty_receive+0x74/0xa0)
[<c0414cb8>] (hci_uart_tty_receive+0x74/0xa0) from [<c02cbd9c>] (flush_to_ldisc+0x188/0x198)
[<c02cbd9c>] (flush_to_ldisc+0x188/0x198) from [<c00b2774>] (process_one_work+0x144/0x4b8)
[<c00b2774>] (process_one_work+0x144/0x4b8) from [<c00b2e8c>] (worker_thread+0x198/0x468)
[<c00b2e8c>] (worker_thread+0x198/0x468) from [<c00b9bc8>] (kthread+0x98/0xa0)
[<c00b9bc8>] (kthread+0x98/0xa0) from [<c0061744>] (kernel_thread_exit+0x0/0x8)

Signed-off-by: Szymon Janc <szymon.janc@tieto.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-10-14 15:04:54 -03:00
Boaz Harrosh 4b46c9f5cf ore/exofs: Change ore_check_io API
Current ore_check_io API receives a residual
pointer, to report partial IO. But it is actually
not used, because in a multiple devices IO there
is never a linearity in the IO failure.

On the other hand if every failing device is reported
through a received callback measures can be taken to
handle only failed devices. One at a time.

This will also be needed by the objects-layout-driver
for it's error reporting facility.

Exofs is not currently using the new information and
keeps the old behaviour of failing the complete IO in
case of an error. (No partial completion)

TODO: Use an ore_check_io callback to set_page_error only
the failing pages. And re-dirty write pages.

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
2011-10-14 18:54:42 +02:00
Boaz Harrosh 5a51c0c7e9 ore/exofs: Define new ore_verify_layout
All users of the ore will need to check if current code
supports the given layout. For example RAID5/6 is not
currently supported.

So move all the checks from exofs/super.c to a new
ore_verify_layout() to be used by ore users.

Note that any new layout should be passed through the
ore_verify_layout() because the ore engine will prepare
and verify some internal members of ore_layout, and
assumes it's called.

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
2011-10-14 18:54:41 +02:00
Boaz Harrosh 3bd9856857 ore: Support for partial component table
Users like the objlayout-driver would like to only pass
a partial device table that covers the IO in question.
For example exofs divides the file into raid-group-sized
chunks and only serves group_width number of devices at
a time.

The partiality is communicated by setting
ore_componets->first_dev and the array covers all logical
devices from oc->first_dev upto (oc->first_dev + oc->numdevs)

The ore_comp_dev() API receives a logical device index
and returns the actual present device in the table.
An out-of-range dev_index will BUG.

Logical device index is the theoretical device index as if
all the devices of a file are present. .i.e:
	total_devs = group_width * mirror_p1 * group_count
	0 <= dev_index < total_devs

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
2011-10-14 18:54:41 +02:00
Boaz Harrosh 9826075404 ore: cleanup: Embed an ore_striping_info inside ore_io_state
Now that each ore_io_state covers only a single raid group.
A single striping_info math is needed. Embed one inside
ore_io_state to cache the calculation results and eliminate
an extra call.

Also the outer _prepare_for_striping is removed since it does nothing.

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
2011-10-14 18:53:54 +02:00
Joerg Roedel 086ac11f64 PCI: Add support for PASID capability
Devices supporting Process Address Space Identifiers
(PASIDs) can use an IOMMU to access multiple IO address
spaces at the same time. A PCIe device indicates support for
this feature by implementing the PASID capability. This
patch adds support for the capability to the Linux kernel.

Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-14 09:05:35 -07:00
Joerg Roedel c320b976d7 PCI: Add implementation for PRI capability
Implement the necessary functions to handle PRI capabilities
on PCIe devices. With PRI devices behind an IOMMU can signal
page fault conditions to software and recover from such
faults.

Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-14 09:05:34 -07:00
Joerg Roedel db3c33c6d3 PCI: Move ATS implementation into own file
ATS does not depend on IOV support, so move the code into
its own file. This file will also include support for the
PRI and PASID capabilities later.
Also give ATS its own Kconfig variable to allow selecting it
without IOV support.

Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-14 09:05:33 -07:00
Prarit Bhargava 6af8bef14d PCI hotplug: acpiphp: Prevent deadlock on PCI-to-PCI bridge remove
I originally submitted a patch to workaround this by pushing all Ejection
Requests and Device Checks onto the kacpi_hotplug queue.

http://marc.info/?l=linux-acpi&m=131678270930105&w=2

The patch is still insufficient in that Bus Checks also need to be added.

Rather than add all events, including non-PCI-hotplug events, to the
hotplug queue, mjg suggested that a better approach would be to modify
the acpiphp driver so only acpiphp events would be added to the
kacpi_hotplug queue.

It's a longer patch, but at least we maintain the benefit of having separate
queues in ACPI.  This, of course, is still only a workaround the problem.
As Bjorn and mjg pointed out, we have to refactor a lot of this code to do
the right thing but at this point it is a better to have this code working.

The acpi core places all events on the kacpi_notify queue.  When the acpiphp
driver is loaded and a PCI card with a PCI-to-PCI bridge is removed the
following call sequence occurs:

cleanup_p2p_bridge()
	    -> cleanup_bridge()
		    -> acpi_remove_notify_handler()
			    -> acpi_os_wait_events_complete()
				    -> flush_workqueue(kacpi_notify_wq)

which is the queue we are currently executing on and the process will hang.

Move all hotplug acpiphp events onto the kacpi_hotplug workqueue.  In
handle_hotplug_event_bridge() and handle_hotplug_event_func() we can simply
push the rest of the work onto the kacpi_hotplug queue and then avoid the
deadlock.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: mjg@redhat.com
Cc: bhelgaas@google.com
Cc: linux-acpi@vger.kernel.org
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-14 09:05:31 -07:00
Rafael J. Wysocki 379021d5c0 PCI / PM: Extend PME polling to all PCI devices
The land of PCI power management is a land of sorrow and ugliness,
especially in the area of signaling events by devices.  There are
devices that set their PME Status bits, but don't really bother
to send a PME message or assert PME#.  There are hardware vendors
who don't connect PME# lines to the system core logic (they know
who they are).  There are PCI Express Root Ports that don't bother
to trigger interrupts when they receive PME messages from the devices
below.  There are ACPI BIOSes that forget to provide _PRW methods for
devices capable of signaling wakeup.  Finally, there are BIOSes that
do provide _PRW methods for such devices, but then don't bother to
call Notify() for those devices from the corresponding _Lxx/_Exx
GPE-handling methods.  In all of these cases the kernel doesn't have
a chance to receive a proper notification that it should wake up a
device, so devices stay in low-power states forever.  Worse yet, in
some cases they continuously send PME Messages that are silently
ignored, because the kernel simply doesn't know that it should clear
the device's PME Status bit.

This problem was first observed for "parallel" (non-Express) PCI
devices on add-on cards and Matthew Garrett addressed it by adding
code that polls PME Status bits of such devices, if they are enabled
to signal PME, to the kernel.  Recently, however, it has turned out
that PCI Express devices are also affected by this issue and that it
is not limited to add-on devices, so it seems necessary to extend
the PME polling to all PCI devices, including PCI Express and planar
ones.  Still, it would be wasteful to poll the PME Status bits of
devices that are known to receive proper PME notifications, so make
the kernel (1) poll the PME Status bits of all PCI and PCIe devices
enabled to signal PME and (2) disable the PME Status polling for
devices for which correct PME notifications are received.

Tested-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-14 09:05:31 -07:00
Benjamin Herrenschmidt e24442733e PCI: Make pci_setup_bridge() non-static for use by arch code
The "powernv" platform of the powerpc architecture needs to assign PCI
resources using a specific algorithm to fit some HW constraints of
the IBM "IODA" architecture (related to the ability to create error
handling domains that encompass specific segments of MMIO space).

For doing so, it wants to call pci_setup_bridge() from architecture
specific resource management in order to configure bridges after all
resources have been assigned. So make it non-static.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-14 09:05:29 -07:00
Ben Hutchings 937383a58e PCI: Add Solarflare vendor ID and SFC4000 device IDs
These will be shared between the sfc driver and a PCI quirk.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-14 09:05:27 -07:00
Eric Dumazet 87fb4b7b53 net: more accurate skb truesize
skb truesize currently accounts for sk_buff struct and part of skb head.
kmalloc() roundings are also ignored.

Considering that skb_shared_info is larger than sk_buff, its time to
take it into account for better memory accounting.

This patch introduces SKB_TRUESIZE(X) macro to centralize various
assumptions into a single place.

At skb alloc phase, we put skb_shared_info struct at the exact end of
skb head, to allow a better use of memory (lowering number of
reallocations), since kmalloc() gives us power-of-two memory blocks.

Unless SLUB/SLUB debug is active, both skb->head and skb_shared_info are
aligned to cache lines, as before.

Note: This patch might trigger performance regressions because of
misconfigured protocol stacks, hitting per socket or global memory
limits that were previously not reached. But its a necessary step for a
more accurate memory accounting.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Andi Kleen <ak@linux.intel.com>
CC: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-13 16:05:07 -04:00
Rajendra Nayak 36a0904ea0 dt: add empty dt helpers for non-dt build
Add empty of_device_is_compatible() and of_parse_phandle() for non-dt
builds to work.

Signed-off-by: Rajendra Nayak <rnayak@ti.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-10-13 12:33:38 -06:00
Neil Zhang dde34cc501 usb: gadget: mv_udc: refine the driver structure
This patch do the following things:

1. Add header and Copyright for marvell usb driver.
2. Add mv_usb.h in include/linux/platform_data, make the driver
   fits all the marvell platform using the same ChipIdea usb ip.
3. Some SOC may has mutiple clock sources, so let me define it
   in mv_usb_platform_data and give two helper functions named
   udc_clock_enable/udc_clock_disable to deal with the clocks.
4. Different SOCs will have some difference in PHY initialization,
   so we will remove file mv_udc_phy.c and add two funtions in
   mv_usb_platform_data, let the platform relative driver to realize it.
5. Rewrite probe function according to the modification list above. Find
   it will kernel panic when probe failed. The root cause is as follows:
	When probe failed, the error handle may call device_unregister()
	which in return will call gadget_release.In current code,
	gadget_release have two issues:
		1: the_controller is a NULL pointer.
		2: if we free udc here, then the following code in probe
		   will access NULL pointer.

Signed-off-by: Neil Zhang <zhangwm@marvell.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
2011-10-13 20:41:56 +03:00
Kuninori Morimoto f427eb64f4 usb: gadget: renesas_usbhs: support otg pin control
some renesas_usbhs device is supporting OTG external device interface.
In that device, it is necessary to control PWEN/EXTLP on DVSTCTR.
This patch support it.
But renesas_usbhs driver doesn't have OTG support for now.

Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
2011-10-13 20:41:47 +03:00
Kuninori Morimoto 258485d990 usb: gadget: renesas_usbhs: add bus control functions
this patch add DVSTCTR control function for HOST support

Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
2011-10-13 20:41:38 +03:00
Kuninori Morimoto 11935de557 usb: gadget: renesas_usbhs: change usbhsc_bus_ctrl() to usbsc_set_buswait()
renesas_usbhs will have register DVSTCTR control function for HOST support.
This patch changes usbhsc_bus_ctrl() to usbsc_set_buswait(),
to remove DVSTCTR access from it,

Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
2011-10-13 20:41:37 +03:00
Felipe Balbi 089b837a39 usb: gadget: fix typo for default U1/U2 exit latencies
s/DEFULT/DEFAULT/, no functional changes.

Signed-off-by: Felipe Balbi <balbi@ti.com>
2011-10-13 20:39:59 +03:00
Yoshihiro Shimoda b8a56e17e1 usb: gadget: r8a66597-udc: add support for SUDMAC
SH7757 has a USB function with internal DMA controller (SUDMAC).
This patch supports the SUDMAC. The SUDMAC is incompatible with
general-purpose DMAC. So, it doesn't use dmaengine.

Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
2011-10-13 20:38:39 +03:00
Sean Hefty 42849b2697 RDMA/uverbs: Export ib_open_qp() capability to user space
Allow processes that share the same XRC domain to open an existing
shareable QP.  This permits those processes to receive events on the
shared QP and transfer ownership, so that any process may modify the
QP.  The latter allows the creating process to exit, while a remaining
process can still transition it for path migration purposes.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-13 09:50:56 -07:00
Sean Hefty 0e0ec7e063 RDMA/core: Export ib_open_qp() to share XRC TGT QPs
XRC TGT QPs are shared resources among multiple processes.  Since the
creating process may exit, allow other processes which share the same
XRC domain to open an existing QP.  This allows us to transfer
ownership of an XRC TGT QP to another process.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-13 09:49:51 -07:00
Sean Hefty 0a1405da99 IB/mlx4: Add support for XRC QPs
Support the creation of XRC INI and TGT QPs.  To handle the case where
a CQ or PD is not provided, we allocate them internally with the xrcd.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-13 09:44:18 -07:00
Sean Hefty 18abd5ea57 IB/mlx4: Add support for XRC SRQs
Allow the user to create XRC SRQs.  This patch is based on a patch
from Jack Morgenstrein <jackm@dev.mellanox.co.il>.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-13 09:43:46 -07:00
Sean Hefty 012a8ff577 IB/mlx4: Add support for XRC domains
Support creating and destroying XRC domains.  Any sharing of the XRCD
is managed above the low-level driver.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-13 09:43:03 -07:00
Sean Hefty 638ef7a6c6 RDMA/ucm: Allow user to specify QP type when creating id
Allow the user to indicate the QP type separately from the port space
when allocating an rdma_cm_id.  With RDMA_PS_IB, there is no longer a
1:1 relationship between the QP type and port space, so we need to
switch on the QP type to select between UD and connected QPs.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-13 09:40:36 -07:00
Sean Hefty 2d2e941529 RDMA/cm: Define new RDMA port space specific to IB
Add RDMA_PS_IB.  XRC QP types will use the IB port space when operating
over the RDMA CM.  For the 'IP protocol' field value, we select 0x3F,
which is listed as being for 'any local network'.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-13 09:39:52 -07:00
Sean Hefty 8541f8de05 RDMA/uverbs: Export XRC SRQs to user space
We require additional information to create XRC SRQs than we can
exchange using the existing create SRQ ABI.  Provide an enhanced create
ABI for extended SRQ types.

Based on patches by Jack Morgenstein <jackm@dev.mellanox.co.il>
and Roland Dreier <roland@purestorage.com>

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-13 09:29:18 -07:00
Sean Hefty 53d0bd1e7f RDMA/uverbs: Export XRC domains to user space
Allow user space to create XRC domains.  Because XRCDs are expected to
be shared among multiple processes, we use inodes to identify an XRCD.

Based on patches by Jack Morgenstein <jackm@dev.mellanox.co.il>

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-13 09:21:24 -07:00
Sean Hefty d3d72d909e RDMA/verbs: Cleanup XRC TGT QPs when destroying XRCD
XRC TGT QPs are intended to be shared among multiple users and
processes.  Allow the destruction of an XRC TGT QP to be done explicitly
through ib_destroy_qp() or when the XRCD is destroyed.

To support destroying an XRC TGT QP, we need to track TGT QPs with the
XRCD.  When the XRCD is destroyed, all tracked XRC TGT QPs are also
cleaned up.

To avoid stale reference issues, if a user is holding a reference on a
TGT QP, we increment a reference count on the QP.  The user releases the
reference by calling ib_release_qp.  This releases any access to the QP
from a user above verbs, but allows the QP to continue to exist until
destroyed by the XRCD.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-13 09:20:27 -07:00
Sean Hefty b42b63cf0d RDMA/core: Add XRC QPs
XRC ("eXtended reliable connected") is an IB transport that provides
better scalability by allowing senders to specify which shared receive
queue (SRQ) should be used to receive a message, which essentially
allows one transport context (QP connection) to serve multiple
destinations (as long as they share an adapter, of course).

XRC communication is between an initiator (INI) QP and a target (TGT)
QP.  Target QPs are associated with SRQs through an XRCD.  An XRC TGT QP
behaves like a receive-only RD QP.  XRC INI QPs behave similarly to RC
QPs, except that work requests posted to an XRC INI QP must specify the
remote SRQ that is the target of the work request.

We define two new QP types for XRC, to distinguish between INI and TGT
QPs, and update the core layer to support XRC QPs.

This patch is derived from work by Jack Morgenstein
<jackm@dev.mellanox.co.il>

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-13 09:16:19 -07:00
Sean Hefty 418d51307d RDMA/core: Add XRC SRQ type
XRC ("eXtended reliable connected") is an IB transport that provides
better scalability by allowing senders to specify which shared receive
queue (SRQ) should be used to receive a message, which essentially
allows one transport context (QP connection) to serve multiple
destinations (as long as they share an adapter, of course).

XRC defines SRQs that are specifically used by XRC connections.  Expand
the SRQ code to support XRC SRQs.  An XRC SRQ is currently restricted to
only XRC use according to the IB XRC Annex.

Portions of this patch were derived from work by
Jack Morgenstein <jackm@dev.mellanox.co.il>.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-13 09:14:31 -07:00
Sean Hefty 96104eda01 RDMA/core: Add SRQ type field
Currently, there is only a single ("basic") type of SRQ, but with XRC
support we will add a second.  Prepare for this by defining an SRQ type
and setting all current users to IB_SRQT_BASIC.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-13 09:13:26 -07:00
Li Dongyang 32a8d26cc9 xen-blkfront: add BLKIF_OP_DISCARD and discard request struct
Now we use BLKIF_OP_DISCARD and add blkif_request_discard to blkif_request union,
the patch is taken from Owen Smith and Konrad, Thanks

Signed-off-by: Owen Smith <owen.smith@citrix.com>
Signed-off-by: Li Dongyang <lidongyang@novell.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-10-13 09:48:29 -04:00
Padmavathi Venna 196a57c274 ARM: 7131/1: clkdev: Add Common Macro for clk_lookup
Added a standardized macro CLKDEV_INIT which can used across all
the platforms to support clkdev

Suggested-by: Russell King <rmk+kernel@arm.linux.org.uk>

Acked-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Acked-by: Kukjin Kim <kgene.kim@samsung.com>
Signed-off-by: Padmavathi Venna <padma.v@samsung.com>
Signed-off-by: Rajeshwari Shinde <rajeshwari.s@samsung.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-10-13 14:36:58 +01:00
Linus Walleij 2744e8afb3 drivers: create a pin control subsystem
This creates a subsystem for handling of pin control devices.
These are devices that control different aspects of package
pins.

Currently it handles pinmuxing, i.e. assigning electronic
functions to groups of pins on primarily PGA and BGA type of
chip packages which are common in embedded systems.

The plan is to also handle other I/O pin control aspects
such as biasing, driving, input properties such as
schmitt-triggering, load capacitance etc within this
subsystem, to remove a lot of ARM arch code as well as
feature-creepy GPIO drivers which are implementing the same
thing over and over again.

This is being done to depopulate the arch/arm/* directory
of such custom drivers and try to abstract the infrastructure
they all need. See the Documentation/pinctrl.txt file that is
part of this patch for more details.

ChangeLog v1->v2:

- Various minor fixes from Joe's and Stephens review comments
- Added a pinmux_config() that can invoke custom configuration
  with arbitrary data passed in or out to/from the pinmux driver

ChangeLog v2->v3:

- Renamed subsystem folder to "pinctrl" since we will likely
  want to keep other pin control such as biasing in this
  subsystem too, so let us keep to something generic even though
  we're mainly doing pinmux now.
- As a consequence, register pins as an abstract entity separate
  from the pinmux. The muxing functions will claim pins out of the
  pin pool and make sure they do not collide. Pins can now be
  named by the pinctrl core.
- Converted the pin lookup from a static array into a radix tree,
  I agreed with Grant Likely to try to avoid any static allocation
  (which is crap for device tree stuff) so I just rewrote this
  to be dynamic, just like irq number descriptors. The
  platform-wide definition of number of pins goes away - this is
  now just the sum total of the pins registered to the subsystem.
- Make sure mappings with only a function name and no device
  works properly.

ChangeLog v3->v4:

- Define a number space per controller instead of globally,
  Stephen and Grant requested the same thing so now maps need to
  define target controller, and the radix tree of pin descriptors
  is a property on each pin controller device.
- Add a compulsory pinctrl device entry to the pinctrl mapping
  table. This must match the pinctrl device, like "pinctrl.0"
- Split the file core.c in two: core.c and pinmux.c where the
  latter carry all pinmux stuff, the core is for generic pin
  control, and use local headers to access functionality between
  files. It is now possible to implement a "blank" pin controller
  without pinmux capabilities. This split will make new additions
  like pindrive.c, pinbias.c etc possible for combined drivers
  and chunks of functionality which is a GoodThing(TM).
- Rewrite the interaction with the GPIO subsystem - the pin
  controller descriptor now handles this by defining an offset
  into the GPIO numberspace for its handled pin range. This is
  used to look up the apropriate pin controller for a GPIO pin.
  Then that specific GPIO range is matched 1-1 for the target
  controller instance.
- Fixed a number of review comments from Joe Perches.
- Broke out a header file pinctrl.h for the core pin handling
  stuff that will be reused by other stuff than pinmux.
- Fixed some erroneous EXPORT() stuff.
- Remove mispatched U300 Kconfig and Makefile entries
- Fixed a number of review comments from Stephen Warren, not all
  of them - still WIP. But I think the new mapping that will
  specify which function goes to which pin mux controller address
  50% of your concerns (else beat me up).

ChangeLog v4->v5:

- Defined a "position" for each function, so the pin controller now
  tracks a function in a certain position, and the pinmux maps define
  what position you want the function in. (Feedback from Stephen
  Warren and Sascha Hauer).
- Since we now need to request a combined function+position from
  the machine mapping table that connect mux settings to drivers,
  it was extended with a position field and a name field. The
  name field is now used if you e.g. need to switch between two
  mux map settings at runtime.
- Switched from a class device to using struct bus_type for this
  subsystem. Verified sysfs functionality: seems to work fine.
  (Feedback from Arnd Bergmann and Greg Kroah-Hartman)
- Define a per pincontroller list of GPIO ranges from the GPIO
  pin space that can be handled by the pin controller. These can
  be added one by one at runtime. (Feedback from Barry Song)
- Expanded documentation of regulator_[get|enable|disable|put]
  semantics.
- Fixed a number of review comments from Barry Song. (Thanks!)

ChangeLog v5->v6:

- Create an abstract pin group concept that can sort pins into
  named and enumerated groups no matter what the use of these
  groups may be, one possible usecase is a group of pins being
  muxed in or so. The intention is however to also use these
  groups for other pin control activities.
- Make it compulsory for pinmux functions to associate with
  at least one group, so the abstract pin group concept is used
  to define the groups of pins affected by a pinmux function.
  The pinmux driver interface has been altered so as to enforce
  a function to list applicable groups per function.
- Provide an optional .group entry in the pinmux machine map
  so the map can select beteween different available groups
  to be used with a certain function.
- Consequent changes all over the place so that e.g. debugfs
  present reasonable information about the world.
- Drop the per-pin mux (*config) function in the pinmux_ops
  struct - I was afraid that this would start to be used for
  things totally unrelated to muxing, we can introduce that to
  the generic struct pinctrl_ops if needed. I want to keep
  muxing orthogonal to other pin control subjects and not mix
  these things up.

ChangeLog v6->v7:

- Make it possible to have several map entries matching the
  same device, pin controller and function, but using
  a different group, and alter the semantics so that
  pinmux_get() will pick all matching map entries, and
  store the associated groups in a list. The list will
  then be iterated over at pinmux_enable()/pinmux_disable()
  and corresponding driver functions called for each
  defined group. Notice that you're only allowed to map
  multiple *groups* to the same
  { device, pin controller, function } triplet, attempts
  to map the same device to multiple pin controllers will
  for example fail. This is hopefully the crucial feature
  requested by Stephen Warren.
- Add a pinmux hogging field to the pinmux mapping entries,
  and enable the pinmux core to hog pinmux map entries.
  This currently only works for pinmuxes without assigned
  devices as it looks now, but with device trees we can
  look up the corresponding struct device * entries when
  we register the pinmux driver, and have it hog each
  pinmux map in turn, for a simple approach to
  non-dynamic pin muxing. This addresses an issue from
  Grant Likely that the machine should take care of as
  much of the pinmux setup as possible, not the devices.
  By supplying a list of hogs, it can now instruct the
  core to take care of any static mappings.
- Switch pinmux group retrieveal function to grab an
  array of strings representing the groups rather than an
  array of unsigned and rewrite accordingly.
- Alter debugfs to show the grouplist handled by each
  pinmux. Also add a list of hogs.
- Dynamically allocate a struct pinmux at pinmux_get() and
  free it at pinmux_put(), then add these to the global
  list of pinmuxes active as we go along.
- Go over the list of pinmux maps at pinmux_get() time
  and repeatedly apply matches.
- Retrieve applicable groups per function from the driver
  as a string array rather than a unsigned array, then
  lookup the enumerators.
- Make the device to pinmux map a singleton - only allow the
  mapping table to be registered once and even tag the
  registration function with __init so it surely won't be
  abused.
- Create a separate debugfs file to view the pinmux map at
  runtime.
- Introduce a spin lock to the pin descriptor struct, lock it
  when modifying pin status entries. Reported by Stijn Devriendt.
- Fix up the documentation after review from Stephen Warren.
- Let the GPIO ranges give names as const char * instead of some
  fixed-length string.
- add a function to unregister GPIO ranges to mirror the
  registration function.
- Privatized the struct pinctrl_device and removed it from the
  <linux/pinctrl/pinctrl.h> API, the drivers do not need to know
  the members of this struct. It is now in the local header
  "core.h".
- Rename the concept of "anonymous" mux maps to "system" muxes
  and add convenience macros and documentation.

ChangeLog v7->v8:

- Delete the leftover pinmux_config() function from the
 <linux/pinctrl/pinmux.h> header.
- Fix a race condition found by Stijn Devriendt in pin_request()

ChangeLog v8->v9:

- Drop the bus_type and the sysfs attributes and all, we're not on
  the clear about how this should be used for e.g. userspace
  interfaces so let us save this for the future.
- Use the right name in MAINTAINERS, PIN CONTROL rather than
  PINMUX
- Don't kfree() the device state holder, let the .remove() callback
  handle this.
- Fix up numerous kerneldoc headers to have one line for the function
  description and more verbose documentation below the parameters

ChangeLog v9->v10:
- pinctrl: EXPORT_SYMBOL needs export.h, folded in a patch
  from Steven Rothwell
- fix pinctrl_register error handling, folded in a patch from
  Axel Lin
- Various fixes to documentation text so that it's consistent.
- Removed pointless comment from drivers/Kconfig
- Removed dependency on SYSFS since we removed the bus in
  v9.
- Renamed hopelessly abbreviated pctldev_* functions to the
  more verbose pinctrl_dev_*
- Drop mutex properly when looking up GPIO ranges
- Return NULL instead of ERR_PTR() errors on registration of
  pin controllers, using cast pointers is fragile. We can
  live without the detailed error codes for sure.

Cc: Stijn Devriendt <highguy@gmail.com>
Cc: Joe Perches <joe@perches.com>
Cc: Russell King <linux@arm.linux.org.uk>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Stephen Warren <swarren@nvidia.com>
Tested-by: Barry Song <21cnbao@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2011-10-13 12:49:17 +02:00
Dan Carpenter 05be8b81aa Input: force feedback - potential integer wrap in input_ff_create()
The problem here is that max_effects can wrap on 32 bits systems.
We'd allocate a smaller amount of data than sizeof(struct ff_device).
The call to kcalloc() on the next line would fail but it would write
the NULL return outside of the memory we just allocated causing data
corruption.

The call path is that uinput_setup_device() get ->ff_effects_max from
the user and sets the value in the ->private_data struct.  From there
it is:
-> uinput_ioctl_handler()
   -> uinput_create_device()
      -> input_ff_create(dev, udev->ff_effects_max);

I've also changed ff_effects_max so it's an unsigned int instead of
a signed int as a cleanup.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2011-10-12 21:13:11 -07:00
Murali Raja 3ceca74966 net-netlink: Add a new attribute to expose TOS values via netlink
This patch exposes the tos value for the TCP sockets when the TOS flag
is requested in the ext_flags for the inet_diag request. This would mainly be
used to expose TOS values for both for TCP and UDP sockets. Currently it is
supported for TCP. When netlink support for UDP would be added the support
to expose the TOS values would alse be done. For IPV4 tos value is exposed
and for IPV6 tclass value is exposed.

Signed-off-by: Murali Raja <muralira@google.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-12 19:09:18 -04:00
Sean Hefty 59991f94eb RDMA/core: Add XRC domain support
XRC ("eXtended reliable connected") is an IB transport that provides
better scalability by allowing senders to specify which shared receive
queue (SRQ) should be used to receive a message, which essentially
allows one transport context (QP connection) to serve multiple
destinations (as long as they share an adapter, of course).

A few new concepts are introduced to support this.  This patch adds:

 - A new device capability flag, IB_DEVICE_XRC, which low-level
   drivers set to indicate that a device supports XRC.
 - A new object type, XRC domains (struct ib_xrcd), and new verbs
   ib_alloc_xrcd()/ib_dealloc_xrcd().  XRCDs are used to limit which
   XRC SRQs an incoming message can target.

This patch is derived from work by Jack Morgenstein <jackm@dev.mellanox.co.il>.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-12 10:32:26 -07:00
Hans Schillstrom ae1d48b23d IPVS netns shutdown/startup dead-lock
ip_vs_mutext is used by both netns shutdown code and startup
and both implicit uses sk_lock-AF_INET mutex.

cleanup CPU-1         startup CPU-2
ip_vs_dst_event()     ip_vs_genl_set_cmd()
 sk_lock-AF_INET     __ip_vs_mutex
                     sk_lock-AF_INET
__ip_vs_mutex
* DEAD LOCK *

A new mutex placed in ip_vs netns struct called sync_mutex is added.

Comments from Julian and Simon added.
This patch has been running for more than 3 month now and it seems to work.

Ver. 3
    IP_VS_SO_GET_DAEMON in do_ip_vs_get_ctl protected by sync_mutex
    instead of __ip_vs_mutex as sugested by Julian.

Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-10-12 18:32:15 +02:00
Chen Gong b238b8fa93 pstore: make pstore write function return normal success/fail value
Currently pstore write interface employs record id as return
value, but it is not enough because it can't tell caller if
the write operation is successful. Pass the record id back via
an argument pointer and return zero for success, non-zero for
failure.

Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2011-10-12 09:17:24 -07:00
Ingo Molnar 910e94dd0c Merge branch 'tip/perf/core' of git://github.com/rostedt/linux into perf/core 2011-10-12 17:14:47 +02:00
Peter Ujfalusi 33b6816ca3 ASoC: twl6040: Workaround for headset DC offset caused pop noise
Both Headset DAC need to be turned on/off at the same time before
any of the output drivers are enabled (HS Left/Right, Earpiece).
Move the HS DAC enable code to sequenced DAPM_SUPPLY, and attach
it to the DACs.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-10-12 13:11:54 +01:00
Peter Ujfalusi 70601ec10a MFD: twl6040: function to query the vibra status for clients
If the client only interested, if any of the vibra channels enabled, or
if any of the channels are set to receive audio data via PDM.

This function targets mainly the vibra driver, so it can check if it is
allowed to execute effects ot not.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Acked-by: Samuel Ortiz <samuel.ortiz@intel.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-10-12 11:48:49 +01:00
Peter Ujfalusi 31b402e3c9 MFD: twl6040: Cache the vibra control registers
The vibra control register will be used from the ASoC codec driver as well.
In order to avoid latency issues caused by I2C read access, cache the two
control register within the core driver, so we do not need to reach out
to the chip to read it back.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Acked-by: Samuel Ortiz <samuel.ortiz@intel.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-10-12 11:48:46 +01:00
Peter Ujfalusi 1e036f6532 Input: twl6040: Simplify vibra regsiter definitions
The bits within the two control registers (for left and right channel)
are identical.
Use common names for the bits acros the two register.
Also add the missing definition for the path selection bit.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Acked-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-10-12 11:48:35 +01:00
Philip Rakity 341deefe8f Input: tsc2007 - make sure that X plate resistance is specified
Abort driver initialization if X plate resistance was not specified in
platform data as it will cause pressure to be always calculated as 0,
and making userspace ignore touch coordinates.

Signed-off-by: Philip Rakity <prakity@marvell.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2011-10-11 20:56:41 -07:00
Johannes Berg 73b9f03a81 mac80211: parse radiotap header earlier
We can now move the radiotap header parsing into
ieee80211_monitor_start_xmit(). This moves it out of
the hotpath, and also helps the code since now the
radiotap header will no longer be present in
ieee80211_xmit() etc. which is easier to understand.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-10-11 16:41:19 -04:00
Johannes Berg a26eb27ab4 mac80211: move fragment flag to info flag as dont-fragment
The purpose of this is two-fold:
 1) by moving it out of tx_data.flags, we can in
    another patch move the radiotap parsing so it
    no longer is in the hotpath
 2) if a device implements fragmentation but can
    optionally skip it, the radiotap request for
    not doing fragmentation may be honoured

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-10-11 16:41:19 -04:00
John W. Linville 094daf7db7 Merge branch 'master' of git://git.infradead.org/users/linville/wireless-next into for-davem
Conflicts:
	Documentation/feature-removal-schedule.txt
2011-10-11 15:35:42 -04:00
Marcel Apfelbaum 71eeba161d IB: Add new InfiniBand link speeds
Introduce support for the following extended speeds:

FDR-10: a Mellanox proprietary link speed which is 10.3125 Gbps with
        64b/66b encoding rather than 8b/10b encoding.
FDR:    IBA extended speed 14.0625 Gbps.
EDR:    IBA extended speed 25.78125 Gbps.

Signed-off-by: Marcel Apfelbaum <marcela@dev.mellanox.co.il>
Reviewed-by: Hal Rosenstock <hal@mellanox.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-11 11:53:47 -07:00
Greg Kroah-Hartman 15b80d6417 hv: remove struct hv_device_info from hyperv.h
This is only used/needed by the vmbus core code, so move it out of the
hyperv.h file and into the .c file that uses it.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-11 09:51:22 -06:00
Greg Kroah-Hartman 9f3e28e375 hv: remove free_channel() from hyperv.h
This function is only used in the file it is declared in
(channel_mgmt.c) so make it static and remove it from the hyperv.h file.

Cc: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-11 09:51:22 -06:00
Greg Kroah-Hartman 2726f95e0b hv: hyperv.h: remove unneeded forward declarations of structures
This file was created by mushing different .h files together and it
shows.  This change removes some unneeded forward declarations.

Cc: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-11 09:51:22 -06:00
Greg Kroah-Hartman 7a4ba88cc1 hv: hyperv.h: remove unused module macros
I have no idea what these were ever for, but they aren't used, so delete
them.

Cc: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-11 09:51:22 -06:00
Greg Kroah-Hartman 5557e8a605 hv: remove unused LOWORD and HIWORD macros from hyperv.h
They aren't used anywhere anymore now that the debugging macros are
gone, so remove it from hyperv.h as well.

Cc: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-11 09:51:22 -06:00
Greg Kroah-Hartman 815166b95d Staging: hv: remove vmbus_loglevel as it is not used at all anymore
As there is no user of this variable, it's time to delete it.  For
dynamic debugging of the hyperv code, use the standard dynamic debug
kernel interface.

Cc: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-11 09:51:21 -06:00
Greg Kroah-Hartman 1a2643012f Staging: hv: remove last user of DPRINT() macro
This also removed the unused function hv_dump_ring_info().

Cc: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-11 09:51:21 -06:00
Greg Kroah-Hartman d181daa06d Staging: hv: storvsc: remove last usage of DPRINT_WARN
Used the correct dev_warn() call instead.

Cc: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-11 09:51:21 -06:00
Greg Kroah-Hartman a832a1eba9 hv: remove a bunch of unused debug macros from hyperv.h
These aren't used by anyone anymore, so remove them before someone tries
to use them again.

Cc: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-11 08:49:19 -06:00
Greg Kroah-Hartman da0e96315c hv: rename prep_negotiate_resp() to vmbus_prep_negotiate_resp()
It's a global symbol, so properly prefix it and use the proper EXPORT
value as well.

Cc: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-11 08:49:19 -06:00
Greg Kroah-Hartman 407dd16443 Staging: hv: remove unneeded asm include file in hyperv.h
No one outside of the hyperv core needs to include the asm/hyperv.h
file, so don't put it in the "global" include/linux/hyperv.h file.

Cc: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-11 08:49:19 -06:00
Stephen Rothwell 540f41edc1 llist: Add back llist_add_batch() and llist_del_first() prototypes
Commit 1230db8e15 ("llist: Make some llist functions inline")
has deleted the definitions, causing problems for (not upstream yet)
code that tries to make use of them.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Huang Ying <ying.huang@intel.com>
Cc: David Miller <davem@davemloft.net>
Link: http://lkml.kernel.org/r/20111005172528.0d0a8afc65acef7ace22a24e@canb.auug.org.au
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-10-11 12:51:22 +02:00
Greg Kroah-Hartman 46a9719136 Staging: hv: move hyperv code out of staging directory
After many years wandering the desert, it is finally time for the
Microsoft HyperV code to move out of the staging directory.  Or at least
the core hyperv bus code, and the utility driver, the rest still have
some review to get through by the various subsystem maintainers.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
2011-10-10 22:52:55 -06:00
Harro Haan 276532ba96 USB: fix ehci alignment error
The Kirkwood gave an unaligned memory access error on
line 742 of drivers/usb/host/echi-hcd.c:
"ehci->last_periodic_enable = ktime_get_real();"

Signed-off-by: Harro Haan <hrhaan@gmail.com>
Cc: stable <stable@kernel.org>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-10 16:43:53 -07:00
Dave Airlie 62addcb8c1 Merge branch 'drm-intel-next' of git://people.freedesktop.org/~keithp/linux into drm-core-next
* 'drm-intel-next' of git://people.freedesktop.org/~keithp/linux:
  drm/i915: Dumb down the semaphore logic
  drm/i915: pass ELD to HDMI/DP audio driver
  drm: support routines for HDMI/DP ELD
  drm/i915: Enable dither whenever display bpc < frame buffer bpc
  drm/i915: Enable dither whenever display bpc < frame buffer bpc
2011-10-10 20:05:21 +01:00
Tao Ma 40bfa16dac ext3: Remove the obsolete broken EXT3_IOC32_WAIT_FOR_READONLY.
There are no user of EXT3_IOC32_WAIT_FOR_READONLY and also it is
broken. No one set the set_ro_timer, no one wake up us and our
state is set to TASK_INTERRUPTIBLE not RUNNING. So remove it.

Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2011-10-10 18:25:59 +02:00
Thomas Hellstrom 57c5ee79ac vmwgfx: Add fence events
Add a way to send DRM events down the gpu fifo by attaching them to
fence objects. This may be useful for Xserver swapbuffer throttling and
page-flip done notifications.

Bump version to 2.2 to signal the availability of the FENCE_EVENT ioctl.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-10-10 15:46:55 +01:00
Mark Brown 25c77c5fae ASoC: Fix DAPM sync for TLV320AIC3x custom DAPM widget
We really should be doing this in the core, not in a driver...

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Tested-by: Jarkko Nikula <jarkko.nikula@bitmer.com>
2011-10-10 10:28:26 +01:00
Heiko Stübner 3f0292ae8b regulator: Add driver for gpio-controlled regulators
This patch adds support for regulators that can be controlled via gpios.

Examples for such regulators are the TI-tps65024x voltage regulators
with 4 fixed and 1 runtime-switchable voltage regulators
or the TI-bq240XX charger regulators.

The number of controlling gpios is not limited, the mapping between
voltage/current and target gpio state is done via the states map
and the driver can be used for either voltage or current regulators.

A mapping for a regulator with two GPIOs could look like:

gpios = {
	{ .gpio = GPIO1, .flags = GPIOF_OUT_INIT_HIGH, .label = "gpio name 1" },
	{ .gpio = GPIO2, .flags = GPIOF_OUT_INIT_LOW,  .label = "gpio name 2" },
}

The flags element of the gpios array determines the initial state of
the gpio, set during probe. The initial state of the regulator is also
calculated from these values

states = {
	{ .value = volt_or_cur1, .gpios = (0 << 1) | (0 << 0) },
	{ .value = volt_or_cur2, .gpios = (0 << 1) | (1 << 0) },
	{ .value = volt_or_cur3, .gpios = (1 << 1) | (0 << 0) },
	{ .value = volt_or_cur4, .gpios = (1 << 1) | (1 << 0) },
}

The target-state for the n-th gpio is determined by the n-th bit
in the bitfield of the target-value.

Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-10-09 12:36:21 +01:00
Mark Brown 024dc07855 ASoC: Cache connected input and output recursions
The number of connected input and output endpoints for a given widgets
can't change during a DAPM run so there is no need to redo the recursion
through branches of the tree we've already visited. Doing this on one of
my test systems gives an improvement of:

         Power    Path   Neighbour
Before:  63       607    731
After:   63       141    181

which scales up well as more widgets are involved in paths.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-10-09 12:07:48 +01:00
Clemens Ladisch 8d448162bd ALSA: control: add support for ENUMERATED user space controls
Handling of user control elements was implemented for all types except
ENUMERATED.  This type will be needed for the device-specific mixers of
upcoming FireWire drivers.

Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2011-10-09 09:09:11 +02:00
Mauro Carvalho Chehab 8d6c0b216f [media] videodev2: Reorganize standard macros and add a few more macros
Reorganize the standards macro and add a few more, that will be
used on msp3400 in order to allow it to detect the audio standard.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2011-10-08 08:01:06 -03:00
Rafael J. Wysocki 7811ac276b Merge branch 'pm-devfreq' into pm-for-linus
* pm-devfreq:
  PM / devfreq: Add basic governors
  PM / devfreq: Add common sysfs interfaces
  PM: Introduce devfreq: generic DVFS framework with device-specific OPPs
  PM / OPP: Add OPP availability change notifier.
2011-10-07 23:17:18 +02:00
Rafael J. Wysocki 9696cc9007 Merge branch 'pm-qos' into pm-for-linus
* pm-qos:
  PM / QoS: Update Documentation for the pm_qos and dev_pm_qos frameworks
  PM / QoS: Add function dev_pm_qos_read_value() (v3)
  PM QoS: Add global notification mechanism for device constraints
  PM QoS: Implement per-device PM QoS constraints
  PM QoS: Generalize and export constraints management code
  PM QoS: Reorganize data structs
  PM QoS: Code reorganization
  PM QoS: Minor clean-ups
  PM QoS: Move and rename the implementation files
2011-10-07 23:17:07 +02:00
Rafael J. Wysocki c28b56b1d4 Merge branch 'pm-domains' into pm-for-linus
* pm-domains:
  PM / Domains: Split device PM domain data into base and need_restore
  ARM: mach-shmobile: sh7372 sleep warning fixes
  ARM: mach-shmobile: sh7372 A3SM support
  ARM: mach-shmobile: sh7372 generic suspend/resume support
  PM / Domains: Preliminary support for devices with power.irq_safe set
  PM: Move clock-related definitions and headers to separate file
  PM / Domains: Use power.sybsys_data to reduce overhead
  PM: Reference counting of power.subsys_data
  PM: Introduce struct pm_subsys_data
  ARM / shmobile: Make A3RV be a subdomain of A4LC on SH7372
  PM / Domains: Rename argument of pm_genpd_add_subdomain()
  PM / Domains: Rename GPD_STATE_WAIT_PARENT to GPD_STATE_WAIT_MASTER
  PM / Domains: Allow generic PM domains to have multiple masters
  PM / Domains: Add "wait for parent" status for generic PM domains
  PM / Domains: Make pm_genpd_poweron() always survive parent removal
  PM / Domains: Do not take parent locks to modify subdomain counters
  PM / Domains: Implement subdomain counters as atomic fields
2011-10-07 23:17:02 +02:00
Rafael J. Wysocki d727b60659 Merge branch 'pm-runtime' into pm-for-linus
* pm-runtime:
  PM / Tracing: build rpm-traces.c only if CONFIG_PM_RUNTIME is set
  PM / Runtime: Replace dev_dbg() with trace_rpm_*()
  PM / Runtime: Introduce trace points for tracing rpm_* functions
  PM / Runtime: Don't run callbacks under lock for power.irq_safe set
  USB: Add wakeup info to debugging messages
  PM / Runtime: pm_runtime_idle() can be called in atomic context
  PM / Runtime: Add macro to test for runtime PM events
  PM / Runtime: Add might_sleep() to runtime PM functions
2011-10-07 23:16:55 +02:00
David S. Miller 88c5100c28 Merge branch 'master' of github.com:davem330/net
Conflicts:
	net/batman-adv/soft-interface.c
2011-10-07 13:38:43 -04:00
John Fastabend 27737aa3a9 dcb: Add stub routines for !CONFIG_DCB
To avoid ifdefs in the other code that supports DCB notifiers
add stub routines. This method seems popular in other net code
for example 8021Q.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-06 15:49:51 -04:00
John Fastabend 6bd0e1cb10 dcb: add DCBX mode to event notifier attributes
Add DCBX mode to event notifiers so listeners can learn
currently enabled mode.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-06 15:49:51 -04:00
Mark Rustad e290ed8130 dcb: Use ifindex instead of ifname
Use ifindex instead of ifname in the DCB app ring. This makes for a smaller
data structure and faster comparisons. It also avoids possible issues when
a net device is renamed.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-06 15:49:51 -04:00
Peter Ujfalusi a92f1394a1 ASoC: fix codec breakage caused by the volsw/volsw_2r merger
By accident few places still uses the _2r calls from
the core.
This is a quick fix, the drivers using the old callbacks
going to be changed.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-10-06 20:02:55 +01:00
Kumar Sanghvi 3ebeebc38b RDMA/iwcm: Propagate ird/ord values upwards
Update struct iw_cm_event to support propagating the ird/ord values
upwards to the application.

Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-06 09:37:52 -07:00
Linus Torvalds 6367f1775e Merge branch 'for-linus' of http://people.redhat.com/agk/git/linux-dm
* 'for-linus' of http://people.redhat.com/agk/git/linux-dm:
  dm crypt: always disable discard_zeroes_data
  dm: raid fix write_mostly arg validation
  dm table: avoid crash if integrity profile changes
  dm: flakey fix corrupt_bio_byte error path
2011-10-06 08:31:47 -07:00
Joerg Roedel a240f76165 perf, core: Introduce attrs to count in either host or guest mode
The two new attributes exclude_guest and exclude_host can
bes used by user-space to tell the kernel to setup
performance counter to either only count while the CPU is in
guest or in host mode.

An additional check is also introduced to make sure
user-space does not try to exclude guest and host mode from
counting.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1317816084-18026-2-git-send-email-gleb@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-10-06 13:00:28 +02:00
Ingo Molnar 9d01402023 Merge commit 'v3.1-rc9' into perf/core
Merge reason: pick up latest fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-10-06 12:49:21 +02:00
Ingo Molnar 9243a169ac Merge commit 'v3.1-rc9' into sched/core
Merge reason: pick up latest fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-10-06 12:43:35 +02:00
Grant Likely 9514a56753 Merge branch 'for-grant' of git://git.jdl.com/software/linux-3.0 into devicetree/next 2011-10-05 10:52:27 -06:00
Peter Ujfalusi 1576a5ff49 ASoC: core: Remove snd_soc_put_volsw_2r definition
We do not have users for snd_soc_put_volsw_2r anymore.
It can be removed.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-10-05 17:10:10 +01:00
Peter Ujfalusi 974815ba4f ASoC: core: Combine snd_soc_put_volsw/put_volsw_2r functions
Handle the put_volsw/put_volsw_2r in one function.

To avoid build breakage in twl6040 keep the
snd_soc_put_volsw_2r as define, and map it snd_soc_put_volsw.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-10-05 17:10:10 +01:00
Peter Ujfalusi f7915d9975 ASoC: core: Combine snd_soc_get_volsw/get_volsw_2r functions
Handle the get_volsw/get_volsw_2r in one function.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-10-05 17:10:09 +01:00
Peter Ujfalusi e8f5a10307 ASoC: core: Combine snd_soc_info_volsw/info_volsw_2r functions
Handle the info_volsw/info_volsw_2r in one function.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-10-05 17:10:09 +01:00
Peter Ujfalusi 30d86ba47f ASoC: core: Change SOC_SINGLE/DOUBLE_VALUE representation
SOC_SINGLE/DOUBLE_VALUE is used for mixer controls, where the
bits are within one register.

Assign .rreg to be the same as .reg for these types.

With this change we can tell if the mixer in question:
is mono:
mc->reg == mc->rreg && mc->shift == mc->rshift

is stereo, within single register:
mc->reg == mc->rreg && mc->shift != mc->rshift

is stereo, in two registers:
mc->reg != mc->rreg

The patch provide a small inline function to query, if the mixer
is stereo, or mono.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-10-05 17:10:09 +01:00
David Henningsson 7c2f8e4009 ALSA: jack - Add "Line In" input jack constants
Similar to Line Out, these constants form the base for future
patches enabling input jack reporting for Line in jacks.

Signed-off-by: David Henningsson <david.henningsson@canonical.com>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2011-10-05 17:22:04 +02:00
Mark Brown 9b8a83b205 ASoC: Only run power_check() on a widget once per run
Some widgets will get power_check() run on them more than once during a
DAPM run, most commonly due to supply widgets checking to see if their
consumers are powered up. It's wasteful to do this so cache the result
of power_check() during a run. For one system I tested this on I got an
improvement of:

           Power    Path   Neighbour
Before:    106      970    1186
After:     69       727    905

from this.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-10-05 11:22:22 +01:00
Inki Dae 1c248b7d29 DRM: add DRM Driver for Samsung SoC EXYNOS4210.
This patch is a DRM Driver for Samsung SoC Exynos4210 and now enables
only FIMD yet but we will add HDMI support also in the future.

this patch is based on git repository below:
git://people.freedesktop.org/~airlied/linux.git
branch name: drm-next
commit-id: 88ef4e3f4f

you can refer to our working repository below:
http://git.infradead.org/users/kmpark/linux-2.6-samsung
branch name: samsung-drm

We tried to re-use lowlevel codes of the FIMD driver(s3c-fb.c
based on Linux framebuffer) but couldn't so because lowlevel codes
of s3c-fb.c are included internally and so FIMD module of this driver has
its own lowlevel codes.

We used GEM framework for buffer management and DMA APIs(dma_alloc_*)
for buffer allocation so we can allocate physically continuous memory
for DMA through it and also we could use CMA later if CMA is applied to
mainline.

Refer to this link for CMA(Continuous Memory Allocator):
http://lkml.org/lkml/2011/7/20/45

this driver supports only physically continuous memory(non-iommu).

Links to previous versions of the patchset:
v1: < https://lwn.net/Articles/454380/ >
v2: < http://www.spinics.net/lists/kernel/msg1224275.html >
v3: < http://www.spinics.net/lists/dri-devel/msg13755.html >
v4: < http://permalink.gmane.org/gmane.comp.video.dri.devel/60439 >
v5: < http://comments.gmane.org/gmane.comp.video.dri.devel/60802 >

Changelog v2:
DRM: add DRM_IOCTL_SAMSUNG_GEM_MMAP ioctl command.

    this feature maps user address space to physical memory region
    once user application requests DRM_IOCTL_SAMSUNG_GEM_MMAP ioctl.

DRM: code clean and add exception codes.

Changelog v3:
DRM: Support multiple irq.

    FIMD and HDMI have their own irq handler but DRM Framework can regiter
    only one irq handler this patch supports mutiple irq for Samsung SoC.

DRM: Consider modularization.

    each DRM, FIMD could be built as a module.

DRM: Have indenpendent crtc object.

    crtc isn't specific to SoC Platform so this patch gets a crtc
    to be used as common object.
    created crtc could be attached to any encoder object.

DRM: code clean and add exception codes.

Changelog v4:
DRM: remove is_defult from samsung_fb.

    is_default isn't used for default framebuffer.

DRM: code refactoring to fimd module.
    this patch is be considered with multiple display objects and
    would use its own request_irq() to register a irq handler instead of
    drm framework's one.

DRM: remove find_samsung_drm_gem_object()

DRM: move kernel private data structures and definitions to driver folder.

    samsung_drm.h would contain only public information for userspace
    ioctl interface.

DRM: code refactoring to gem modules.
    buffer module isn't dependent of gem module anymore.

DRM: fixed security issue.

DRM: remove encoder porinter from specific connector.

    samsung connector doesn't need to have generic encoder.

DRM: code clean and add exception codes.

Changelog v5:
DRM: updated fimd(display controller) driver.
    added various pixel formats, color key and pixel blending features.

DRM: removed end_buf_off from samsung_drm_overlay structure.
    this variable isn't used and end buffer address would be
    calculated by each sub driver.

DRM: use generic function for mmap_offset.
    replaced samsung_drm_gem_create_mmap_offset() and
    samsung_drm_free_mmap_offset() with generic ones applied
    to mainline recentrly.

DRM: removed unnecessary codes and added exception codes.

DRM: added comments and code clean.

Changelog v6:
DRM: added default config options.

DRM: added padding for 64-bit align.

DRM: changed prefix 'samsung' to 'exynos'

Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Joonyoung Shim <jy0922.shim@samsung.com>
Signed-off-by: Seung-Woo Kim <sw0312.kim@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-10-05 10:27:31 +01:00
Jakob Bornecrantz 2fcd5a73bf vmwgfx: Add present and readback ioctls
Signed-off-by: Jakob Bornecrantz <jakob@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-10-05 10:17:17 +01:00
Timur Tabi c4e5a02327 drivers/video: fsl-diu-fb: only DIU modes 0 and 1 are supported
The Freescale DIU video controller supports five video "modes", but only
the first two are used by the driver.  The other three are special modes
that don't make sense for a framebuffer driver.  Therefore, there's no
point in keeping a global variable that indicates which mode we're
supposed to use.

Signed-off-by: Timur Tabi <timur@freescale.com>
Signed-off-by: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
2011-10-05 01:10:12 +00:00
Timur Tabi b715f9f04c drivers/video: fsl-diu-fb: move some definitions out of the header file
Move several macros and structures from the Freescale DIU driver's header
file into the source file, because they're only used by that file.  Also
delete a few unused macros.

The diu and diu_ad structures cannot be moved because they're being used
by the MPC5121 platform file.  A future patch eliminate the need for
the platform file to access these structs, so they'll be moved also.

Signed-off-by: Timur Tabi <timur@freescale.com>
Signed-off-by: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
2011-10-05 01:10:12 +00:00
Timur Tabi 36b0b1d415 drivers/video: fsl-diu-fb: fix some ioctls
Use the _IOx macros to define the ioctl commands, instead of hard-coded
numbers.  Unfortunately, the original definitions of MFB_SET_PIXFMT and
MFB_GET_PIXFMT used the wrong value for the size, so these macros have
new values now.  To avoid breaking binary compatibility with older
applications, we retain support for the original values, but the driver
displays a warning message if they're used.

Also remove the FBIOGET_GWINFO and FBIOPUT_GWINFO ioctls.  FBIOPUT_GWINFO
was never implemented, and FBIOGET_GWINFO was never used by any
application.

Signed-off-by: Timur Tabi <timur@freescale.com>
Signed-off-by: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
2011-10-05 01:06:55 +00:00
Jamie Iles 4cd7f7a311 dt: add helper to read 64-bit integers
Add a helper similar to of_property_read_u32() that handles 64-bit
integers.

v2/v3: constify device node and property name parameters.

Cc: Grant Likely <grant.likely@secretlab.ca>
Reviewed-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: Jamie Iles <jamie@jamieiles.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-10-04 16:59:53 -06:00
Jamie Iles a1330228f9 dw_apb_timer: constify clocksource name
The clocksource name should be const for correctness.

Cc: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Jamie Iles <jamie@jamieiles.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2011-10-04 13:08:18 -07:00
Rafael J. Wysocki 1a9a91525d PM / QoS: Add function dev_pm_qos_read_value() (v3)
To read the current PM QoS value for a given device we need to
make sure that the device's power.constraints object won't be
removed while we're doing that.  For this reason, put the
operation under dev->power.lock and acquire the lock
around the initialization and removal of power.constraints.

Moreover, since we're using the value of power.constraints to
determine whether or not the object is present, the
power.constraints_state field isn't necessary any more and may be
removed.  However, dev_pm_qos_add_request() needs to check if the
device is being removed from the system before allocating a new
PM QoS constraints object for it, so make it use the
power.power_state field of struct device for this purpose.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-10-04 21:54:26 +02:00
John W. Linville d6222fb0d6 Merge branch 'master' of git://github.com/padovan/bluetooth-next 2011-10-04 14:06:47 -04:00
Linus Torvalds 8a04b45367 Merge git://github.com/davem330/net
* git://github.com/davem330/net:
  pch_gbe: Fixed the issue on which a network freezes
  pch_gbe: Fixed the issue on which PC was frozen when link was downed.
  make PACKET_STATISTICS getsockopt report consistently between ring and non-ring
  net: xen-netback: correctly restart Tx after a VM restore/migrate
  bonding: properly stop queuing work when requested
  can bcm: fix incomplete tx_setup fix
  RDSRDMA: Fix cleanup of rds_iw_mr_pool
  net: Documentation: Fix type of variables
  ibmveth: Fix oops on request_irq failure
  ipv6: nullify ipv6_ac_list and ipv6_fl_list when creating new socket
  cxgb4: Fix EEH on IBM P7IOC
  can bcm: fix tx_setup off-by-one errors
  MAINTAINERS: tehuti: Alexander Indenbaum's address bounces
  dp83640: reduce driver noise
  ptp: fix L2 event message recognition
2011-10-04 10:37:06 -07:00
Jon Mason 5f39e6705f PCI: Disable MPS configuration by default
Add the ability to disable PCI-E MPS turning and using the BIOS
configured MPS defaults.  Due to the number of issues recently
discovered on some x86 chipsets, make this the default behavior.

Also, add the option for peer to peer DMA MPS configuration.  Peer to
peer DMA is outside the scope of this patch, but MPS configuration could
prevent it from working by having the MPS on one root port different
than the MPS on another.  To work around this, simply make the system
wide MPS the smallest possible value (128B).

Signed-off-by: Jon Mason <mason@myri.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-10-04 09:52:28 -07:00
Benoit Cousson 4fcd15a032 of: Add helpers to get one string in multiple strings property
Add of_property_read_string_index and of_property_count_strings
to retrieve one string inside a property that will contains
severals strings.

Signed-off-by: Benoit Cousson <b-cousson@ti.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Kevin Hilman <khilman@ti.com>
2011-10-04 09:52:23 -07:00
Mark Brown db432b414e ASoC: Do DAPM power checks only for widgets changed since last run
In order to reduce the number of DAPM power checks we run keep a list of
widgets which have been changed since the last DAPM run and iterate over
that rather than the full widget list. Whenever we change the power state
for a widget we add all the source and sink widgets it has to the dirty
list, ensuring that all widgets in the path are checked.

This covers more widgets than we need to as some of the neighbour widgets
won't be connected but it's simpler as a first step. On one system I tried
this gave:

           Power    Path   Neighbour
Before:    207      1939   2461
After:     114      1066   1327

which seems useful.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-10-04 16:50:20 +01:00
Peter Ujfalusi cdffa775e7 ASoC: core: Introduce SOC_DOUBLE_R_VALUE macro
With the new macro we can remove duplicated code
for the SOC_DOUBLE_R type of controls.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-10-04 16:27:05 +01:00
Peter Ujfalusi 460acbec1e ASoC: core: Introduce SOC_DOUBLE_VALUE macro
With the new macro we can remove duplicated code
for the SOC_DOUBLE type of controls.
We can also remap the SOC_SINGLE_VALUE macro to
SOC_DOUBLE_VALUE

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-10-04 16:27:05 +01:00
Mark Brown 81204c84ca ASoC: Add WM1811 support
The WM1811 is mostly register compatible with the WM8994 and WM8958,
providing a high performance audio hub CODEC in a small form factor
suitable for ultra compact system designs.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-10-04 11:59:46 +01:00
Mark Brown b1f43bf3a5 mfd: Add WM1811 support
The WM1811 is mostly register compatible with the WM8994 and WM8958,
providing a high performance audio hub CODEC in a small form factor
suitable for ultra compact system designs.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Acked-by: Samuel Ortiz <sameo@linux.intel.com>
2011-10-04 11:59:02 +01:00
Peter Zijlstra f0f1d32f93 llist: Remove cpu_relax() usage in cmpxchg loops
Initial benchmarks show they're a net loss:

 $ for i in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor ; do echo performance > $i; done
 $ echo 4096 32000 64 128 > /proc/sys/kernel/sem
 $ ./sembench -t 2048 -w 1900 -o 0

Pre:

 run time 30 seconds 778936 worker burns per second
 run time 30 seconds 912190 worker burns per second
 run time 30 seconds 817506 worker burns per second
 run time 30 seconds 830870 worker burns per second
 run time 30 seconds 845056 worker burns per second

Post:

 run time 30 seconds 905920 worker burns per second
 run time 30 seconds 849046 worker burns per second
 run time 30 seconds 886286 worker burns per second
 run time 30 seconds 822320 worker burns per second
 run time 30 seconds 900283 worker burns per second

So about 4% faster. (!)

cpu_relax() stalls the pipeline, therefore, when used in a tight loop
it has the following benefits:

 - allows SMT siblings to have a go;
 - reduces pressure on the CPU interconnect.

However, cmpxchg loops are unfair and thus have unbounded completion
time, therefore we should avoid getting in such heavily contended
situations where the above benefits make any difference.

A typical cmpxchg loop should not go round more than a handfull of
times at worst, therefore adding extra delays just slows things down.

Since the llist primitives are new, there aren't any bad users yet,
and we should avoid growing them. Heavily contended sites should
generally be better off using the ticket locks for serialization since
they provide bounded completion times (fifo-fair over the cpus).

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1315836358.26517.43.camel@twins
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-10-04 12:44:03 +02:00
Peter Zijlstra fa14ff4acc sched: Convert to struct llist
Use the generic llist primitives.

We had a private lockless list implementation in the scheduler in the wake-list
code, now that we have a generic llist implementation that provides all required
operations, switch to it.

This patch is not expected to change any behavior.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1315836353.26517.42.camel@twins
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-10-04 12:43:58 +02:00
Peter Zijlstra 924f8f5af3 llist: Add llist_next()
So we don't have to expose the struct list_node member.

Cc: Huang Ying <ying.huang@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1315836348.26517.41.camel@twins
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-10-04 12:43:53 +02:00
Huang Ying 38aaf8090d irq_work: Use llist in the struct irq_work logic
Use llist in irq_work instead of the lock-less linked list
implementation in irq_work to avoid the code duplication.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1315461646-1379-6-git-send-email-ying.huang@intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-10-04 12:43:49 +02:00
Huang Ying 781f7fd916 llist: Return whether list is empty before adding in llist_add()
Extend the llist_add*() functions to return a success indicator, this
allows us in the scheduler code to send an IPI if the queue was empty.

( There's no effect on existing users, because the list_add_xxx() functions
  are inline, thus this will be optimized out by the compiler if not used
  by callers. )

Signed-off-by: Huang Ying <ying.huang@intel.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1315461646-1379-5-git-send-email-ying.huang@intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-10-04 12:43:44 +02:00
Huang Ying a3127336b7 llist: Move cpu_relax() to after the cmpxchg()
If in llist_add()/etc. functions the first cmpxchg() call succeeds, it is
not necessary to use cpu_relax() before the cmpxchg(). So cpu_relax() in
a busy loop involving cmpxchg() should go after cmpxchg() instead of before
that.

This patch fixes this for all involved llist functions.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1315461646-1379-4-git-send-email-ying.huang@intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-10-04 12:43:39 +02:00
Ingo Molnar 2c30245c65 llist: Remove the platform-dependent NMI checks
Remove the nmi() checks spread around the code. in_nmi() is not available
on every architecture and it's a pretty obscure and ugly check in any case.

Cc: Huang Ying <ying.huang@intel.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1315461646-1379-3-git-send-email-ying.huang@intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-10-04 12:43:11 +02:00
Boaz Harrosh d866d875f6 ore/exofs: Change the type of the devices array (API change)
In the pNFS obj-LD the device table at the layout level needs
to point to a device_cache node, where it is possible and likely
that many layouts will point to the same device-nodes.

In Exofs we have a more orderly structure where we have a single
array of devices that repeats twice for a round-robin view of the
device table

This patch moves to a model that can be used by the pNFS obj-LD
where struct ore_components holds an array of ore_dev-pointers.
(ore_dev is newly defined and contains a struct osd_dev *od
 member)

Each pointer in the array of pointers will point to a bigger
user-defined dev_struct. That can be accessed by use of the
container_of macro.

In Exofs an __alloc_dev_table() function allocates the
ore_dev-pointers array as well as an exofs_dev array, in one
allocation and does the addresses dance to set everything pointing
correctly. It still keeps the double allocation trick for the
inodes round-robin view of the table.

The device table is always allocated dynamically, also for the
single device case. So it is unconditionally freed at umount.

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
2011-10-04 12:13:59 +02:00
Huang Ying 1230db8e15 llist: Make some llist functions inline
Because llist code will be used in performance critical scheduler
code path, make llist_add() and llist_del_all() inline to avoid
function calling overhead and related 'glue' overhead.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1315461646-1379-2-git-send-email-ying.huang@intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-10-04 11:30:53 +02:00
Ingo Molnar 22f92bacbe Merge branch 'linus' into sched/core
Merge reason: pick up the latest fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-10-04 11:09:08 +02:00
Andrew Vagin 92e51938f5 perf: Fix counter of ftrace events
Each event adds some points to its counters. By default it adds 1,
and a number of points may be transmited in event's parameters.

E.g. sched:sched_stat_runtime adds how long process has been running.

But this functionality was broken by v2.6.31-rc5-392-gf413cdb
and now the event's parameters doesn't affect on a number of points.

TP_perf_assign isn't defined, so __perf_count(c) isn't executed and
__count is always equal to 1.

Signed-off-by: Andrew Vagin <avagin@openvz.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1317052535-1765247-2-git-send-email-avagin@openvz.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-10-04 11:07:54 +02:00
Eliad Peller 8a3a3c85e4 mac80211: pass vif param to conf_tx() callback
tx params should be configured per interface.
add ieee80211_vif param to the conf_tx callback,
and change all the drivers that use this callback.

The following spatch was used:
@rule1@
struct ieee80211_ops ops;
identifier conf_tx_op;
@@
	ops.conf_tx = conf_tx_op;

@rule2@
identifier rule1.conf_tx_op;
identifier hw, queue, params;
@@
	conf_tx_op (
-		struct ieee80211_hw *hw,
+		struct ieee80211_hw *hw, struct ieee80211_vif *vif,
		u16 queue,
		const struct ieee80211_tx_queue_params *params) {...}

Signed-off-by: Eliad Peller <eliad@wizery.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-10-03 15:22:41 -04:00
Rajkumar Manoharan b6f35301ef mac80211: Send nullfunc frames at lower rate during connection monitor
Recently mac80211 was changed to use nullfunc instead of probe
request for connection monitoring for tx ack status reporting
hardwares. Sometimes in congested network, STA got disconnected
quickly after the association. It was observered that the rate
control was not adopted to environment due to minimal transmission.

As the nullfunc are used for monitoring purpose, these frames should
not be sacrificed for rate control updation. So it is better to send
the monitoring null func frames at minimum rate that could help to
retain the connection.

Signed-off-by: Rajkumar Manoharan <rmanohar@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-10-03 15:22:32 -04:00
Sangwook Lee e209c5a7ed net:rfkill: add a gpio setup function into GPIO rfkill
Add a gpio setup function which gives a chance to set up
platform specific configuration such as pin multiplexing,
input/output direction at the runtime or booting time.

Signed-off-by: Sangwook Lee <sangwook.lee@linaro.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-10-03 15:19:19 -04:00
Helmut Schaa 893d73f4a1 mac80211: Allow noack flag overwrite for injected frames
Allow injected unicast frames to be sent without having to wait
for an ACK.

Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-10-03 15:19:17 -04:00
Vasily Averin 349d2895cc ipv4: NET_IPV4_ROUTE_GC_INTERVAL removal
removing obsoleted sysctl,
ip_rt_gc_interval variable no longer used since 2.6.38

Signed-off-by: Vasily Averin <vvs@sw.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-03 14:13:01 -04:00
Jiří Župka 96c131842a Repair wrong named definition aligned_u64
This repairs problem with compile library in userspace (libnl).

Signed-off-by: Jiří Župka <jzupka@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-03 14:03:48 -04:00
Eric Dumazet b5c5693bb7 tcp: report ECN_SEEN in tcp_info
Allows ss command (iproute2) to display "ecnseen" if at least one packet
with ECT(0) or ECT(1) or ECN was received by this socket.

"ecn" means ECN was negotiated at session establishment (TCP level)

"ecnseen" means we received at least one packet with ECT fields set (IP
level)

ss -i
...
ESTAB      0      0   192.168.20.110:22  192.168.20.144:38016
ino:5950 sk:f178e400
	 mem:(r0,w0,f0,t0) ts sack ecn ecnseen bic wscale:7,8 rto:210
rtt:12.5/7.5 cwnd:10 send 9.3Mbps rcv_space:14480

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-03 14:01:21 -04:00
Boaz Harrosh eb507bc189 ore: Make ore_striping_info and ore_calc_stripe_info public
The struct ore_striping_info will be used later in other
structures. And ore_calc_stripe_info as well. Rename them
make struct ore_striping_info public. ore_calc_stripe_info
is still static, will be made public on first use.

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
2011-10-03 17:07:51 +02:00
Boaz Harrosh 8d2d83a835 exofs: Remove unused data_map member from exofs_sb_info
The struct pnfs_osd_data_map data_map member of exofs_sb_info was
never used after mount. In fact all it's members were duplicated
by the ore_layout structure. So just remove the duplicated information.

Also removed some stupid, but perfectly supported, restrictions on
layout parameters. The case where num_devices is not divisible by
mirror_count+1 is perfectly fine since the rotating device view
will eventually use all the devices it can get.

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Benny Halevy <bhalevy@tonian.com>
2011-10-03 17:07:51 +02:00
Boaz Harrosh 5bf696dad4 exofs: Rename struct ore_components comps => oc
ore_components already has a comps member so this leads
to things like comps->comps which is annoying. the name oc
was already used in new code. So rename all old usage of
ore_components comps => ore_components oc.

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
2011-10-03 17:07:50 +02:00
Archit Taneja 54128701ec OMAPDSS: DISPC: zorder support for DSS overlays
Add zorder support on OMAP4, this feature allows deciding the visibility order
of the overlays based on the zorder value provided as an overlay info parameter
or a sysfs attribute of the overlay object.

Use the overlay cap OMAP_DSS_OVL_CAP_ZORDER to determine whether zorder is
supported for the overlay or not. Use dss feature FEAT_ALPHA_FREE_ZORDER
if the caps are not available.

Ensure that all overlays that are enabled and connected to the same manager
have different zorders. Swapping zorders of 2 enabled overlays currently
requires disabling one of the overlays.

Signed-off-by: Archit Taneja <archit@ti.com>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2011-10-03 16:51:55 +03:00
Archit Taneja b8c095b4d6 OMAPDSS: DISPC: VIDEO3 pipeline support
Add support for VIDEO3 pipeline on OMAP4:
- Add VIDEO3 pipeline information in dss_features and omapdss.h
- Add VIDEO3 pipeline register coefficients in dispc.h
- Create a new overlay structure corresponding to VIDEO3.
- Make changes in dispc.c for VIDEO3

Signed-off-by: Archit Taneja <archit@ti.com>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2011-10-03 16:51:54 +03:00
Archit Taneja 11354dd58d OMAPDSS/OMAP_VOUT: Fix incorrect OMAP3-alpha compatibility setting
On OMAP3, in order to enable alpha blending for LCD and TV managers, we needed
to set LCDALPHABLENDERENABLE/TVALPHABLENDERENABLE bits in DISPC_CONFIG. On
OMAP4, alpha blending is always enabled by default, if the above bits are set,
we switch to an OMAP3 compatibility mode where the zorder values in the pipeline
attribute registers are ignored and a fixed priority is configured.

Rename the manager_info member "alpha_enabled" to "partial_alpha_enabled" for
more clarity. Introduce two dss_features FEAT_ALPHA_FIXED_ZORDER and
FEAT_ALPHA_FREE_ZORDER which represent OMAP3-alpha compatibility mode and OMAP4
alpha mode respectively. Introduce an overlay cap for ZORDER. The DSS2 user is
expected to check for the ZORDER cap, if an overlay doesn't have this cap, the
user is expected to set the parameter partial_alpha_enabled. If the overlay has
ZORDER cap, the DSS2 user can assume that alpha blending is already enabled.

Don't support OMAP3 compatibility mode for now. Trying to read/write to
alpha_blending_enabled sysfs attribute issues a warning for OMAP4 and does not
set the LCDALPHABLENDERENABLE/TVALPHABLENDERENABLE bits.

Change alpha_enabled to partial_alpha_enabled in the omap_vout driver. Use
overlay cap "OMAP_DSS_OVL_CAP_GLOBAL_ALPHA" to check if overlay supports alpha
blending or not. Replace this with checks for VIDEO1 pipeline.

Cc: linux-media@vger.kernel.org
Cc: Lajos Molnar <molnar@ti.com>
Signed-off-by: Archit Taneja <archit@ti.com>
Acked-by: Vaibhav Hiremath <hvaibhav@ti.com>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2011-10-03 16:51:54 +03:00
Marc Zyngier 1e7c5fd294 genirq: percpu: allow interrupt type to be set at enable time
As request_percpu_irq() doesn't allow for a percpu interrupt to have
its type configured (it is generally impossible to configure it on all
CPUs at once), add a 'type' argument to enable_percpu_irq().

This allows some low-level, board specific init code to be switched to
a generic API.

[ tglx: Added WARN_ON argument ]

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Abhijeet Dharmapurikar <adharmap@codeaurora.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-10-03 15:35:27 +02:00
Marc Zyngier 31d9d9b6d8 genirq: Add support for per-cpu dev_id interrupts
The ARM GIC interrupt controller offers per CPU interrupts (PPIs),
which are usually used to connect local timers to each core. Each CPU
has its own private interface to the GIC, and only sees the PPIs that
are directly connect to it.

While these timers are separate devices and have a separate interrupt
line to a core, they all use the same IRQ number.

For these devices, request_irq() is not the right API as it assumes
that an IRQ number is visible by a number of CPUs (through the
affinity setting), but makes it very awkward to express that an IRQ
number can be handled by all CPUs, and yet be a different interrupt
line on each CPU, requiring a different dev_id cookie to be passed
back to the handler.

The *_percpu_irq() functions is designed to overcome these
limitations, by providing a per-cpu dev_id vector:

int request_percpu_irq(unsigned int irq, irq_handler_t handler,
		   const char *devname, void __percpu *percpu_dev_id);
void free_percpu_irq(unsigned int, void __percpu *);
int setup_percpu_irq(unsigned int irq, struct irqaction *new);
void remove_percpu_irq(unsigned int irq, struct irqaction *act);
void enable_percpu_irq(unsigned int irq);
void disable_percpu_irq(unsigned int irq);

The API has a number of limitations:
- no interrupt sharing
- no threading
- common handler across all the CPUs

Once the interrupt is requested using setup_percpu_irq() or
request_percpu_irq(), it must be enabled by each core that wishes its
local interrupt to be delivered.

Based on an initial patch by Thomas Gleixner.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1316793788-14500-2-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-10-03 15:35:26 +02:00
Wu Fengguang 143dfe8611 writeback: IO-less balance_dirty_pages()
As proposed by Chris, Dave and Jan, don't start foreground writeback IO
inside balance_dirty_pages(). Instead, simply let it idle sleep for some
time to throttle the dirtying task. In the mean while, kick off the
per-bdi flusher thread to do background writeback IO.

RATIONALS
=========

- disk seeks on concurrent writeback of multiple inodes (Dave Chinner)

  If every thread doing writes and being throttled start foreground
  writeback, it leads to N IO submitters from at least N different
  inodes at the same time, end up with N different sets of IO being
  issued with potentially zero locality to each other, resulting in
  much lower elevator sort/merge efficiency and hence we seek the disk
  all over the place to service the different sets of IO.
  OTOH, if there is only one submission thread, it doesn't jump between
  inodes in the same way when congestion clears - it keeps writing to
  the same inode, resulting in large related chunks of sequential IOs
  being issued to the disk. This is more efficient than the above
  foreground writeback because the elevator works better and the disk
  seeks less.

- lock contention and cache bouncing on concurrent IO submitters (Dave Chinner)

  With this patchset, the fs_mark benchmark on a 12-drive software RAID0 goes
  from CPU bound to IO bound, freeing "3-4 CPUs worth of spinlock contention".

  * "CPU usage has dropped by ~55%", "it certainly appears that most of
    the CPU time saving comes from the removal of contention on the
    inode_wb_list_lock" (IMHO at least 10% comes from the reduction of
    cacheline bouncing, because the new code is able to call much less
    frequently into balance_dirty_pages() and hence access the global
    page states)

  * the user space "App overhead" is reduced by 20%, by avoiding the
    cacheline pollution by the complex writeback code path

  * "for a ~5% throughput reduction", "the number of write IOs have
    dropped by ~25%", and the elapsed time reduced from 41:42.17 to
    40:53.23.

  * On a simple test of 100 dd, it reduces the CPU %system time from 30% to 3%,
    and improves IO throughput from 38MB/s to 42MB/s.

- IO size too small for fast arrays and too large for slow USB sticks

  The write_chunk used by current balance_dirty_pages() cannot be
  directly set to some large value (eg. 128MB) for better IO efficiency.
  Because it could lead to more than 1 second user perceivable stalls.
  Even the current 4MB write size may be too large for slow USB sticks.
  The fact that balance_dirty_pages() starts IO on itself couples the
  IO size to wait time, which makes it hard to do suitable IO size while
  keeping the wait time under control.

  Now it's possible to increase writeback chunk size proportional to the
  disk bandwidth. In a simple test of 50 dd's on XFS, 1-HDD, 3GB ram,
  the larger writeback size dramatically reduces the seek count to 1/10
  (far beyond my expectation) and improves the write throughput by 24%.

- long block time in balance_dirty_pages() hurts desktop responsiveness

  Many of us may have the experience: it often takes a couple of seconds
  or even long time to stop a heavy writing dd/cp/tar command with
  Ctrl-C or "kill -9".

- IO pipeline broken by bumpy write() progress

  There are a broad class of "loop {read(buf); write(buf);}" applications
  whose read() pipeline will be under-utilized or even come to a stop if
  the write()s have long latencies _or_ don't progress in a constant rate.
  The current threshold based throttling inherently transfers the large
  low level IO completion fluctuations to bumpy application write()s,
  and further deteriorates with increasing number of dirtiers and/or bdi's.

  For example, when doing 50 dd's + 1 remote rsync to an XFS partition,
  the rsync progresses very bumpy in legacy kernel, and throughput is
  improved by 67% by this patchset. (plus the larger write chunk size,
  it will be 93% speedup).

  The new rate based throttling can support 1000+ dd's with excellent
  smoothness, low latency and low overheads.

For the above reasons, it's much better to do IO-less and low latency
pauses in balance_dirty_pages().

Jan Kara, Dave Chinner and me explored the scheme to let
balance_dirty_pages() wait for enough writeback IO completions to
safeguard the dirty limit. However it's found to have two problems:

- in large NUMA systems, the per-cpu counters may have big accounting
  errors, leading to big throttle wait time and jitters.

- NFS may kill large amount of unstable pages with one single COMMIT.
  Because NFS server serves COMMIT with expensive fsync() IOs, it is
  desirable to delay and reduce the number of COMMITs. So it's not
  likely to optimize away such kind of bursty IO completions, and the
  resulted large (and tiny) stall times in IO completion based throttling.

So here is a pause time oriented approach, which tries to control the
pause time in each balance_dirty_pages() invocations, by controlling
the number of pages dirtied before calling balance_dirty_pages(), for
smooth and efficient dirty throttling:

- avoid useless (eg. zero pause time) balance_dirty_pages() calls
- avoid too small pause time (less than   4ms, which burns CPU power)
- avoid too large pause time (more than 200ms, which hurts responsiveness)
- avoid big fluctuations of pause times

It can control pause times at will. The default policy (in a followup
patch) will be to do ~10ms pauses in 1-dd case, and increase to ~100ms
in 1000-dd case.

BEHAVIOR CHANGE
===============

(1) dirty threshold

Users will notice that the applications will get throttled once crossing
the global (background + dirty)/2=15% threshold, and then balanced around
17.5%. Before patch, the behavior is to just throttle it at 20% dirtyable
memory in 1-dd case.

Since the task will be soft throttled earlier than before, it may be
perceived by end users as performance "slow down" if his application
happens to dirty more than 15% dirtyable memory.

(2) smoothness/responsiveness

Users will notice a more responsive system during heavy writeback.
"killall dd" will take effect instantly.

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
2011-10-03 21:08:57 +08:00
Wu Fengguang 9d823e8f6b writeback: per task dirty rate limit
Add two fields to task_struct.

1) account dirtied pages in the individual tasks, for accuracy
2) per-task balance_dirty_pages() call intervals, for flexibility

The balance_dirty_pages() call interval (ie. nr_dirtied_pause) will
scale near-sqrt to the safety gap between dirty pages and threshold.

The main problem of per-task nr_dirtied is, if 1k+ tasks start dirtying
pages at exactly the same time, each task will be assigned a large
initial nr_dirtied_pause, so that the dirty threshold will be exceeded
long before each task reached its nr_dirtied_pause and hence call
balance_dirty_pages().

The solution is to watch for the number of pages dirtied on each CPU in
between the calls into balance_dirty_pages(). If it exceeds ratelimit_pages
(3% dirty threshold), force call balance_dirty_pages() for a chance to
set bdi->dirty_exceeded. In normal situations, this safeguarding
condition is not expected to trigger at all.

On the sqrt in dirty_poll_interval():

It will serve as an initial guess when dirty pages are still in the
freerun area.

When dirty pages are floating inside the dirty control scope [freerun,
limit], a followup patch will use some refined dirty poll interval to
get the desired pause time.

   thresh-dirty (MB)    sqrt
		   1      16
		   2      22
		   4      32
		   8      45
		  16      64
		  32      90
		  64     128
		 128     181
		 256     256
		 512     362
		1024     512

The above table means, given 1MB (or 1GB) gap and the dd tasks polling
balance_dirty_pages() on every 16 (or 512) pages, the dirty limit won't
be exceeded as long as there are less than 16 (or 512) concurrent dd's.

So sqrt naturally leads to less overheads and more safe concurrent tasks
for large memory servers, which have large (thresh-freerun) gaps.

peter: keep the per-CPU ratelimit for safeguarding the 1k+ tasks case

CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Andrea Righi <andrea@betterlinux.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
2011-10-03 21:08:57 +08:00
Wu Fengguang 7381131cbc writeback: stabilize bdi->dirty_ratelimit
There are some imperfections in balanced_dirty_ratelimit.

1) large fluctuations

The dirty_rate used for computing balanced_dirty_ratelimit is merely
averaged in the past 200ms (very small comparing to the 3s estimation
period for write_bw), which makes rather dispersed distribution of
balanced_dirty_ratelimit.

It's pretty hard to average out the singular points by increasing the
estimation period. Considering that the averaging technique will
introduce very undesirable time lags, I give it up totally. (btw, the 3s
write_bw averaging time lag is much more acceptable because its impact
is one-way and therefore won't lead to oscillations.)

The more practical way is filtering -- most singular
balanced_dirty_ratelimit points can be filtered out by remembering some
prev_balanced_rate and prev_prev_balanced_rate. However the more
reliable way is to guard balanced_dirty_ratelimit with task_ratelimit.

2) due to truncates and fs redirties, the (write_bw <=> dirty_rate)
match could become unbalanced, which may lead to large systematical
errors in balanced_dirty_ratelimit. The truncates, due to its possibly
bumpy nature, can hardly be compensated smoothly. So let's face it. When
some over-estimated balanced_dirty_ratelimit brings dirty_ratelimit
high, dirty pages will go higher than the setpoint. task_ratelimit will
in turn become lower than dirty_ratelimit.  So if we consider both
balanced_dirty_ratelimit and task_ratelimit and update dirty_ratelimit
only when they are on the same side of dirty_ratelimit, the systematical
errors in balanced_dirty_ratelimit won't be able to bring
dirty_ratelimit far away.

The balanced_dirty_ratelimit estimation may also be inaccurate near
@limit or @freerun, however is less an issue.

3) since we ultimately want to

- keep the fluctuations of task ratelimit as small as possible
- keep the dirty pages around the setpoint as long time as possible

the update policy used for (2) also serves the above goals nicely:
if for some reason the dirty pages are high (task_ratelimit < dirty_ratelimit),
and dirty_ratelimit is low (dirty_ratelimit < balanced_dirty_ratelimit),
there is no point to bring up dirty_ratelimit in a hurry only to hurt
both the above two goals.

So, we make use of task_ratelimit to limit the update of dirty_ratelimit
in two ways:

1) avoid changing dirty rate when it's against the position control target
   (the adjusted rate will slow down the progress of dirty pages going
   back to setpoint).

2) limit the step size. task_ratelimit is changing values step by step,
   leaving a consistent trace comparing to the randomly jumping
   balanced_dirty_ratelimit. task_ratelimit also has the nice smaller
   errors in stable state and typically larger errors when there are big
   errors in rate.  So it's a pretty good limiting factor for the step
   size of dirty_ratelimit.

Note that bdi->dirty_ratelimit is always tracking balanced_dirty_ratelimit.
task_ratelimit is merely used as a limiting factor.

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
2011-10-03 21:08:57 +08:00
Wu Fengguang be3ffa2764 writeback: dirty rate control
It's all about bdi->dirty_ratelimit, which aims to be (write_bw / N)
when there are N dd tasks.

On write() syscall, use bdi->dirty_ratelimit
============================================

    balance_dirty_pages(pages_dirtied)
    {
        task_ratelimit = bdi->dirty_ratelimit * bdi_position_ratio();
        pause = pages_dirtied / task_ratelimit;
        sleep(pause);
    }

On every 200ms, update bdi->dirty_ratelimit
===========================================

    bdi_update_dirty_ratelimit()
    {
        task_ratelimit = bdi->dirty_ratelimit * bdi_position_ratio();
        balanced_dirty_ratelimit = task_ratelimit * write_bw / dirty_rate;
        bdi->dirty_ratelimit = balanced_dirty_ratelimit
    }

Estimation of balanced bdi->dirty_ratelimit
===========================================

balanced task_ratelimit
-----------------------

balance_dirty_pages() needs to throttle tasks dirtying pages such that
the total amount of dirty pages stays below the specified dirty limit in
order to avoid memory deadlocks. Furthermore we desire fairness in that
tasks get throttled proportionally to the amount of pages they dirty.

IOW we want to throttle tasks such that we match the dirty rate to the
writeout bandwidth, this yields a stable amount of dirty pages:

        dirty_rate == write_bw                                          (1)

The fairness requirement gives us:

        task_ratelimit = balanced_dirty_ratelimit
                       == write_bw / N                                  (2)

where N is the number of dd tasks.  We don't know N beforehand, but
still can estimate balanced_dirty_ratelimit within 200ms.

Start by throttling each dd task at rate

        task_ratelimit = task_ratelimit_0                               (3)
                         (any non-zero initial value is OK)

After 200ms, we measured

        dirty_rate = # of pages dirtied by all dd's / 200ms
        write_bw   = # of pages written to the disk / 200ms

For the aggressive dd dirtiers, the equality holds

        dirty_rate == N * task_rate
                   == N * task_ratelimit_0                              (4)
Or
        task_ratelimit_0 == dirty_rate / N                              (5)

Now we conclude that the balanced task ratelimit can be estimated by

                                                      write_bw
        balanced_dirty_ratelimit = task_ratelimit_0 * ----------        (6)
                                                      dirty_rate

Because with (4) and (5) we can get the desired equality (1):

                                                       write_bw
        balanced_dirty_ratelimit == (dirty_rate / N) * ----------
                                                       dirty_rate
                                 == write_bw / N

Then using the balanced task ratelimit we can compute task pause times like:

        task_pause = task->nr_dirtied / task_ratelimit

task_ratelimit with position control
------------------------------------

However, while the above gives us means of matching the dirty rate to
the writeout bandwidth, it at best provides us with a stable dirty page
count (assuming a static system). In order to control the dirty page
count such that it is high enough to provide performance, but does not
exceed the specified limit we need another control.

The dirty position control works by extending (2) to

        task_ratelimit = balanced_dirty_ratelimit * pos_ratio           (7)

where pos_ratio is a negative feedback function that subjects to

1) f(setpoint) = 1.0
2) df/dx < 0

That is, if the dirty pages are ABOVE the setpoint, we throttle each
task a bit more HEAVY than balanced_dirty_ratelimit, so that the dirty
pages are created less fast than they are cleaned, thus DROP to the
setpoints (and the reverse).

Based on (7) and the assumption that both dirty_ratelimit and pos_ratio
remains CONSTANT for the past 200ms, we get

        task_ratelimit_0 = balanced_dirty_ratelimit * pos_ratio         (8)

Putting (8) into (6), we get the formula used in
bdi_update_dirty_ratelimit():

                                                write_bw
        balanced_dirty_ratelimit *= pos_ratio * ----------              (9)
                                                dirty_rate

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
2011-10-03 21:08:56 +08:00
Wu Fengguang af6a311384 writeback: add bg_threshold parameter to __bdi_update_bandwidth()
No behavior change.

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
2011-10-03 21:08:56 +08:00
Wu Fengguang c8e28ce049 writeback: account per-bdi accumulated dirtied pages
Introduce the BDI_DIRTIED counter. It will be used for estimating the
bdi's dirty bandwidth.

CC: Jan Kara <jack@suse.cz>
CC: Michael Rubin <mrubin@google.com>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
2011-10-03 21:08:56 +08:00
Linus Walleij b1e3be0647 clocksource: fixup ux500 build problems
Based on a patch from Arnd Bergmann this fixes up the build
problem of assigning a non-existing global when the ux500 PRCMU
timer is not linked in by passing its base address to the init
function. We also add a missing <linux/errno.h> inclusion and
staticize the dummy function.

Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2011-10-03 09:34:16 +02:00
Dan Williams ac013ed1cb [SCSI] isci: export phy events via ->lldd_control_phy()
Allow the sas-transport-class to update events for local phys via a new
PHY_FUNC_GET_EVENTS command to ->lldd_control_phy().  Fixup drivers that
are not prepared for new enum phy_func values, and unify
->lldd_control_phy() error codes.

These are the SAS defined phy events that are reported in a
smp-report-phy-error-log command:
 * /sys/class/sas_phy/<phyX>/invalid_dword_count
 * /sys/class/sas_phy/<phyX>/running_disparity_error_count
 * /sys/class/sas_phy/<phyX>/loss_of_dword_sync_count
 * /sys/class/sas_phy/<phyX>/phy_reset_problem_count

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-02 13:24:26 -05:00
Dan Williams b50102d3e9 [SCSI] isci: atapi support
Based on original implementation from Jiangbi Liu and Maciej Trela.

ATAPI transfers happen in two-to-three stages.  The two stage atapi
commands are those that include a dma data transfer.  The data transfer
portion of these operations is handled by the hardware packet-dma
acceleration.  The three-stage commands do not have a data transfer and
are handled without hardware assistance in raw frame mode.

stage1: transmit host-to-device fis to notify the device of an incoming
atapi cdb.  Upon reception of the pio-setup-fis repost the task_context
to perform the dma transfer of the cdb+data (go to stage3), or repost
the task_context to transmit the cdb as a raw frame (go to stage 2).

stage2: wait for hardware notification of the cdb transmission and then
go to stage 3.

stage3: wait for the arrival of the terminating device-to-host fis and
terminate the command.

To keep the implementation simple we only support ATAPI packet-dma
protocol (for commands with data) to avoid needing to handle the data
transfer manually (like we do for SATA-PIO).  This may affect
compatibility for a small number of devices (see
ATA_HORKAGE_ATAPI_MOD16_DMA).

If the data-transfer underruns, or encounters an error the
device-to-host fis is expected to arrive in the unsolicited frame queue
to pass to libata for disposition.  However, in the DONE_UNEXP_FIS (data
underrun) case it appears we need to craft a response.  In the
DONE_REG_ERR case we do receive the UF and propagate it to libsas.

Signed-off-by: Maciej Trela <maciej.trela@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-02 13:20:03 -05:00
Vasu Dev 49a198898e [SCSI] libfc: cache align struct fc_exch fields
cache aligned xid and ex_lock beside
removing holes.

Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-02 12:56:26 -05:00
Vasu Dev ed26cfece6 [SCSI] libfc: cache align struct fc_fcp_pkt fields
Re-arrange its fields to avoid padding and have better
cacheline alignments.

Removed not used start_time, end_time and last_pkt_time
fields.

This all reduced this struct size to 448 from 480 and
that also reduced one cacheline on x86_64 beside
eliminating 8 pads. However kept logical fields together.

Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-02 12:55:07 -05:00
Dan Williams 05a2a17317 [SCSI] libsas: fix warnings when checking sata/stp protocol
Several sas drivers legitimately check the protocol against the union of
SAS_PROTOCOL_SATA and SAS_PROTOCOL_STP.  Provide a SAS_PROTOCOL_STP_ALL
to silence warnings like:

drivers/scsi/pm8001/pm8001_sas.c:438:3: warning: case value ‘5’ not in enumerated type ‘enum sas_protocol’ [-Wswitch]
drivers/scsi/mvsas/mv_sas.c:798:2: warning: case value ‘5’ not in enumerated type ‘enum sas_protocol’ [-Wswitch]
drivers/scsi/mvsas/mv_sas.c:1783:2: warning: case value ‘5’ not in enumerated type ‘enum sas_protocol’ [-Wswitch]
drivers/scsi/mvsas/mv_sas.c:1886:2: warning: case value ‘5’ not in enumerated type ‘enum sas_protocol’ [-Wswitch]
drivers/scsi/isci/request.c:3565:2: warning: case value ‘5’ not in enumerated type ‘enum sas_protocol’ [-Wswitch]

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-02 12:51:53 -05:00
Dan Williams d962480e9a [SCSI] libsas: fix try_test_sas_gpio_gp_bit() build error
If the user has disabled CONFIG_SCSI_SAS_HOST_SMP then libsas drivers
will not be receiving smp-gpio frames and do not need this lookup code.

Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Tested-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-02 12:47:40 -05:00
Dan Williams f6e67035a9 [SCSI] libsas,libata: fix ->change_queue_{depth|type} for sata devices
Pass queue_depth change requests to libata, and prevent queue_type
changes for ATA devices.

Otherwise:
1/ we do not honor the libata specific restrictions on the queue depth
2/ libsas drivers that do not set sdev->tagged_supported are unable to
   change the queue_depth of ata devices via sysfs

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-02 12:30:30 -05:00
Luben Tuikov ffaac8f45b [SCSI] libsas: Allow expander T-T attachments
Allow expander table-to-table attachments for
expanders that support it.

Signed-off-by: Luben Tuikov <ltuikov@yahoo.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-02 12:23:11 -05:00
MyungJoo Ham ce26c5bb95 PM / devfreq: Add basic governors
Four cpufreq-like governors are provided as examples.

powersave: use the lowest frequency possible. The user (device) should
set the polling_ms as 0 because polling is useless for this governor.

performance: use the highest freqeuncy possible. The user (device)
should set the polling_ms as 0 because polling is useless for this
governor.

userspace: use the user specified frequency stored at
devfreq.user_set_freq. With sysfs support in the following patch, a user
may set the value with the sysfs interface.

simple_ondemand: simplified version of cpufreq's ondemand governor.

When a user updates OPP entries (enable/disable/add), OPP framework
automatically notifies devfreq to update operating frequency
accordingly. Thus, devfreq users (device drivers) do not need to update
devfreq manually with OPP entry updates or set polling_ms for powersave
, performance, userspace, or any other "static" governors.

Note that these are given only as basic examples for governors and any
devices with devfreq may implement their own governors with the drivers
and use them.

Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Reviewed-by: Mike Turquette <mturquette@ti.com>
Acked-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-10-02 00:19:34 +02:00
MyungJoo Ham a3c98b8b2e PM: Introduce devfreq: generic DVFS framework with device-specific OPPs
With OPPs, a device may have multiple operable frequency and voltage
sets. However, there can be multiple possible operable sets and a system
will need to choose one from them. In order to reduce the power
consumption (by reducing frequency and voltage) without affecting the
performance too much, a Dynamic Voltage and Frequency Scaling (DVFS)
scheme may be used.

This patch introduces the DVFS capability to non-CPU devices with OPPs.
DVFS is a techique whereby the frequency and supplied voltage of a
device is adjusted on-the-fly. DVFS usually sets the frequency as low
as possible with given conditions (such as QoS assurance) and adjusts
voltage according to the chosen frequency in order to reduce power
consumption and heat dissipation.

The generic DVFS for devices, devfreq, may appear quite similar with
/drivers/cpufreq.  However, cpufreq does not allow to have multiple
devices registered and is not suitable to have multiple heterogenous
devices with different (but simple) governors.

Normally, DVFS mechanism controls frequency based on the demand for
the device, and then, chooses voltage based on the chosen frequency.
devfreq also controls the frequency based on the governor's frequency
recommendation and let OPP pick up the pair of frequency and voltage
based on the recommended frequency. Then, the chosen OPP is passed to
device driver's "target" callback.

When PM QoS is going to be used with the devfreq device, the device
driver should enable OPPs that are appropriate with the current PM QoS
requests. In order to do so, the device driver may call opp_enable and
opp_disable at the notifier callback of PM QoS so that PM QoS's
update_target() call enables the appropriate OPPs. Note that at least
one of OPPs should be enabled at any time; be careful when there is a
transition.

Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Reviewed-by: Mike Turquette <mturquette@ti.com>
Acked-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-10-02 00:19:15 +02:00
Linus Torvalds f72a209a3e Merge branches 'irq-urgent-for-linus', 'x86-urgent-for-linus' and 'sched-urgent-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip
* 'irq-urgent-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip:
  irq: Fix check for already initialized irq_domain in irq_domain_add
  irq: Add declaration of irq_domain_simple_ops to irqdomain.h

* 'x86-urgent-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip:
  x86/rtc: Don't recursively acquire rtc_lock

* 'sched-urgent-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip:
  posix-cpu-timers: Cure SMP wobbles
  sched: Fix up wchan borkage
  sched/rt: Migrate equal priority tasks to available CPUs
2011-10-01 08:37:25 -07:00
Ingo Molnar 048b718029 Merge branch 'rcu/next' of git://github.com/paulmckrcu/linux into core/rcu 2011-10-01 14:21:36 +02:00
MyungJoo Ham 03ca370fbf PM / OPP: Add OPP availability change notifier.
The patch enables to register notifier_block for an OPP-device in order
to get notified for any changes in the availability of OPPs of the
device. For example, if a new OPP is inserted or enable/disable status
of an OPP is changed, the notifier is executed.

This enables the usage of opp_add, opp_enable, and opp_disable to
directly take effect with any connected entities such as cpufreq or
devfreq.

Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Reviewed-by: Mike Turquette <mturquette@ti.com>
Reviewed-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-09-30 22:35:12 +02:00
Johannes Berg 4b801bc969 mac80211: document client powersave
With the addition of uAPSD and driver buffering
the powersave handling has gotten quite complex.
Add a section to the documentation to explain it
for anyone wanting to implement it.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-09-30 15:57:24 -04:00
Johannes Berg 37fbd90800 mac80211: allow out-of-band EOSP notification
iwlwifi has a separate EOSP notification from
the device, and to make use of that properly
it needs to be passed to mac80211. To be able
to mix with tx_status_irqsafe and rx_irqsafe
it also needs to be an "_irqsafe" version in
the sense that it goes through the tasklet,
the actual flag clearing would be IRQ-safe
but doing it directly would cause reordering
issues.

This is needed in the case of a P2P GO going
into an absence period without transmitting
any frames that should be driver-released as
in this case there's no other way to inform
mac80211 that the service period ended. Note
that for drivers that don't use the _irqsafe
functions another version of this function
will be required.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-09-30 15:57:23 -04:00
Johannes Berg 40b9640883 mac80211: explicitly notify drivers of frame release
iwlwifi needs to know the number of frames that are
going to be sent to a station while it is asleep so
it can properly handle the uCode blocking of that
station.

Before uAPSD, we got by by telling the device that
a single frame was going to be released whenever we
encountered IEEE80211_TX_CTL_POLL_RESPONSE. With
uAPSD, however, that is no longer possible since
there could be more than a single frame.

To support this model, add a new callback to notify
drivers when frames are going to be released.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-09-30 15:57:21 -04:00
Johannes Berg deeaee197b mac80211: reply only once to each PS-poll
If a PS-poll frame is retried (but was received)
there is no way to detect that since it has no
sequence number. As a consequence, the standard
asks us to not react to PS-poll frames until the
response to one made it out (was ACKed or lost).

Implement this by using the WLAN_STA_SP flags to
also indicate a PS-Poll "service period" and the
IEEE80211_TX_STATUS_EOSP flag for the response
packet to indicate the end of the "SP" as usual.

We could use separate flags, but that will most
likely completely confuse drivers, and while the
standard doesn't exclude simultaneously polling
using uAPSD and PS-Poll, doing that seems quite
problematic.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-09-30 15:57:18 -04:00
Johannes Berg 47086fc51a mac80211: implement uAPSD
Add uAPSD support to mac80211. This is probably not
possible with all devices, so advertising it with
the cfg80211 flag will be left up to drivers that
want it.

Due to my previous patches it is now a fairly
straight-forward extension. Drivers need to have
accurate TX status reporting for the EOSP frame.
For drivers that buffer themselves, the provided
APIs allow releasing the right number of frames,
but then drivers need to set EOSP and more-data
themselves. This is documented in more detail in
the new code itself.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-09-30 15:57:15 -04:00
Johannes Berg 4049e09acd mac80211: allow releasing driver-buffered frames
If there are frames for a station buffered in
the driver, mac80211 announces those in the TIM
IE but there's no way to release them. Add new
API to release such frames and use it when the
station polls for a frame.

Since the API will soon also be used for uAPSD
it is easily extensible.

Note that before this change drivers announcing
driver-buffered frames in the TIM bit actually
will respond to a PS-Poll with a potentially
lower priority frame (if there are any frames
buffered in mac80211), after this patch a driver
that hasn't been changed will no longer respond
at all. This only affects ath9k, which will need
to be fixed to implement the new API.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-09-30 15:57:15 -04:00
Johannes Berg 948d887dec mac80211: split PS buffers into ACs
For uAPSD support we'll need to have per-AC PS
buffers. As this is a major undertaking, split
the buffers before really adding support for
uAPSD. This already makes some reference to the
uapsd_queues variable, but for now that will
never be non-zero.

Since book-keeping is complicated, also change
the logic for keeping a maximum of frames only
and allow 64 frames per AC (up from 128 for a
station).

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-09-30 15:57:12 -04:00
Johannes Berg 042ec45337 mac80211: let drivers inform it about per TID buffered frames
For uAPSD implementation, it is necessary to know on
which ACs frames are buffered. mac80211 obviously
knows about the frames it has buffered itself, but
with aggregation many drivers buffer frames. Thus,
mac80211 needs to be informed about this.

For now, since we don't have APSD in any form, this
will unconditionally set the TIM bit for the station
but later with uAPSD only some ACs might cause the
TIM bit to be set.

ath9k is the only driver using this API and I only
modify it in the most basic way, it won't be able
to implement uAPSD with this yet. But it can't do
that anyway since there's no way to selectively
release frames to the peer yet.

Since drivers will buffer frames per TID, let them
inform mac80211 on a per TID basis, mac80211 will
then sort out the AC mapping itself.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-09-30 15:57:10 -04:00
Arik Nemtsov 07ba55d7f1 nl80211/mac80211: allow adding TDLS peers as stations
When adding a TDLS peer STA, mark it with a new flag in both nl80211 and
mac80211. Before adding a peer, make sure the wiphy supports TDLS and
our operating mode is appropriate (managed).

In addition, make sure all peers are removed on disassociation.

A TDLS peer is first added just before link setup is initiated. In later
setup stages we have more info about peer supported rates, capabilities,
etc. This info is reported via nl80211_set_station().

Signed-off-by: Arik Nemtsov <arik@wizery.com>
Cc: Kalyan C Gaddam <chakkal@iit.edu>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-09-30 15:57:08 -04:00
Arik Nemtsov dfe018bf99 mac80211: handle TDLS high-level commands and frames
Register and implement the TDLS cfg80211 callback functions.

Internally prepare and send TDLS management frames. We incorporate
local STA capabilities and supported rates with extra IEs given by
usermode. The resulting packet is either encapsulated in a data frame,
or assembled as an action frame. It is transmitted either directly or
through the AP, as mandated by the TDLS specification.

Declare support for the TDLS external setup wiphy capability. This
tells usermode to handle link setup and discovery on its own, and use the
kernel driver for sending TDLS mgmt packets.

Signed-off-by: Arik Nemtsov <arik@wizery.com>
Cc: Kalyan C Gaddam <chakkal@iit.edu>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-09-30 15:57:07 -04:00