8 years ago Linux
Greg Kroah-Hartman [Thu, 26 Aug 2010 23:40:25 +0000 (16:40 -0700)]

8 years ago USB: io_ti: check firmware version before updating
Greg Kroah-Hartman [Tue, 17 Aug 2010 22:15:37 +0000 (15:15 -0700)]
USB: io_ti: check firmware version before updating

commit 0827a9ff2bbcbb03c33f1a6eb283fe051059482c upstream.

If we can't read the firmware for a device from the disk, and yet the
device already has a valid firmware image in it, we don't want to
replace the firmware with something invalid.  So check the version
number to be less than the current one to verify this is the correct
thing to do.

Reported-by: Chris Beauchamp <>
Tested-by: Chris Beauchamp <>
Cc: Alan Stern <>
Signed-off-by: Greg Kroah-Hartman <>
8 years ago USB: add device IDs for igotu to navman
Ross Burton [Fri, 6 Aug 2010 15:36:39 +0000 (16:36 +0100)]
USB: add device IDs for igotu to navman

commit 0eee6a2b2a52e17066a572d30ad2805d3ebc7508 upstream.

I recently bought an i-gotU USB GPS, and whilst hunting around for Linux
support discovered this post by you back in 2009:

>Try the navman driver instead.  You can either add the device id to the
> driver and rebuild it, or do this before you plug the device in:
>  modprobe navman
>  echo -n "0x0df7 0x0900" > /sys/bus/usb-serial/drivers/navman/new_id
> and then plug your device in and see if that works.

I can confirm that the navman driver works with the right device IDs on
my i-gotU GT-600, which has the same device IDs.  Attached is a patch
adding the IDs.

From: Ross Burton <>
Signed-off-by: Greg Kroah-Hartman <>
8 years ago drm: stop information leak of old kernel stack.
Dave Airlie [Tue, 17 Aug 2010 04:46:00 +0000 (14:46 +1000)]
drm: stop information leak of old kernel stack.

commit b9f0aee83335db1f3915f4e42a5e21b351740afd upstream.

non-critical issue, CVE-2010-2803

Userspace controls the amount of memory to be allocated, so it can
get the ioctl to allocate more memory than the kernel uses, and get
access to kernel stack. This can only be done for processes authenticated
to the X server for DRI access, and if the user has DRI access.

Fix is to just memset the data to 0 if the user doesn't copy into
it in the first place.
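
As a rough illustration of the fix described above (a userspace model, not the DRM code itself), the idea is that a buffer which the caller does not fill must be zeroed before anything can be copied back out of it. The function names and the single `size` parameter are assumptions made for this sketch.

```c
#include <stddef.h>
#include <string.h>

/*
 * Userspace model only; names and signature are hypothetical.
 * buf holds `size` bytes. If the caller supplies input, it fills every
 * byte; otherwise the buffer is zeroed so no stale bytes can leak out.
 */
void prepare_ioctl_buf(unsigned char *buf, size_t size,
                       const unsigned char *user_in)
{
    if (user_in != NULL)
        memcpy(buf, user_in, size);
    else
        memset(buf, 0, size);
}

/* Helper for checking the result: returns 1 if every byte is zero. */
int all_zero(const unsigned char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        if (buf[i] != 0)
            return 0;
    return 1;
}
```

Calling `prepare_ioctl_buf(buf, size, NULL)` on a buffer that still holds old data leaves it all zero, which is the property the patch is after.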

Reported-by: Kees Cook <>
Signed-off-by: Dave Airlie <>
Signed-off-by: Greg Kroah-Hartman <>
8 years ago fixes for using make 3.82
Jan Beulich [Mon, 16 Aug 2010 10:58:58 +0000 (11:58 +0100)]
fixes for using make 3.82

commit 3c955b407a084810f57260d61548cc92c14bc627 upstream.

It doesn't like pattern and explicit rules to be on the same line,
and it seems to be more picky when matching file (or really directory)
names with different numbers of trailing slashes.

Signed-off-by: Jan Beulich <>
Acked-by: Sam Ravnborg <>
Andrew Benton <>
Signed-off-by: Michal Marek <>
Signed-off-by: Greg Kroah-Hartman <>
8 years ago can: add limit for nframes and clean up signed/unsigned variables
Oliver Hartkopp [Wed, 11 Aug 2010 23:12:35 +0000 (16:12 -0700)]
can: add limit for nframes and clean up signed/unsigned variables

commit 5b75c4973ce779520b9d1e392483207d6f842cde upstream.

This patch adds a limit for nframes as the number of frames in TX_SETUP and
RX_SETUP are derived from a single byte multiplex value by default.
Use cases that need to send/filter more than 256 CAN frames should
be implemented in userspace for complexity reasons anyway.

Additionally the assignments of unsigned values from userspace to signed
values in kernelspace and vice versa are fixed by using unsigned values in
kernelspace consistently.
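
A minimal sketch of the kind of check this describes, assuming a hypothetical helper name and an `-EINVAL` return (neither is taken from the patch). Keeping the count unsigned means a huge value from userspace cannot turn negative and slip under the limit.

```c
#include <errno.h>
#include <stdint.h>

#define SKETCH_MAX_NFRAMES 256u  /* the limit named in the text above */

/* nframes comes from userspace; it stays unsigned from end to end. */
int check_nframes(uint32_t nframes)
{
    if (nframes > SKETCH_MAX_NFRAMES)
        return -EINVAL;
    return 0;
}
```

With a signed counter, a value such as 0x80000000 could compare as negative and pass a "greater than 256" test; with an unsigned one it is simply rejected.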

Signed-off-by: Oliver Hartkopp <>
Reported-by: Ben Hawkes <>
Acked-by: Urs Thuermann <>
Signed-off-by: David S. Miller <>
Signed-off-by: Greg Kroah-Hartman <>
8 years ago selinux: use default proc sid on symlinks
Stephen Smalley [Mon, 22 Sep 2008 19:41:19 +0000 (15:41 -0400)]
selinux: use default proc sid on symlinks

commit ea6b184f7d521a503ecab71feca6e4057562252b upstream.

As we are not concerned with fine-grained control over reading of
symlinks in proc, always use the default proc SID for all proc symlinks.
This should help avoid permission issues upon changes to the proc tree
as in the /proc/net -> /proc/self/net example.
This does not alter labeling of symlinks within /proc/pid directories.
ls -Zd /proc/net output before and after the patch should show the difference.

Signed-off-by: Stephen D. Smalley <>
Signed-off-by: James Morris <>
Cc: Florian Mickler <>
Signed-off-by: Greg Kroah-Hartman <>
8 years ago kbuild: fix make incompatibility
Sam Ravnborg [Sat, 13 Dec 2008 22:00:45 +0000 (23:00 +0100)]
kbuild: fix make incompatibility

commit 31110ebbec8688c6e9597b641101afc94e1c762a upstream.

"Paul Smith" <> reported that we would fail
to build with a new check that may be enabled in an
upcoming version of make.

The error was:

      Makefile:442: *** mixed implicit and normal rules.  Stop.

The problem is that we did stuff like this:

config %config: ...

The solution was simple - the above was split into two with identical
prerequisites and commands.
With only three lines, it was not worth trying to avoid the duplication.

Cc: "Paul Smith" <>
Signed-off-by: Sam Ravnborg <>
Cc: Thomas Backlund <>
Signed-off-by: Greg Kroah-Hartman <>
8 years ago ARM: Tighten check for allowable CPSR values
Russell King [Fri, 13 Aug 2010 22:33:46 +0000 (23:33 +0100)]
ARM: Tighten check for allowable CPSR values

commit 41e2e8fd34fff909a0e40129f6ac4233ecfa67a9 upstream.

Reviewed-by: Arve Hjønnevåg <>
Acked-by: Dima Zavin <>
Signed-off-by: Russell King <>
Signed-off-by: Greg Kroah-Hartman <>
8 years ago Linux
Greg Kroah-Hartman [Fri, 20 Aug 2010 18:25:26 +0000 (11:25 -0700)]

8 years ago mm: fix up some user-visible effects of the stack guard page
Linus Torvalds [Sun, 15 Aug 2010 18:35:52 +0000 (11:35 -0700)]
mm: fix up some user-visible effects of the stack guard page

commit d7824370e26325c881b665350ce64fb0a4fde24a upstream.

This commit makes the stack guard page somewhat less visible to user
space. It does this by:

 - not showing the guard page in /proc/<pid>/maps

   It looks like lvm-tools will actually read /proc/self/maps to figure
   out where all its mappings are, and effectively do a specialized
   "mlockall()" in user space.  By not showing the guard page as part of
   the mapping (by just adding PAGE_SIZE to the start for grows-down
   pages), lvm-tools ends up not being aware of it.

 - by also teaching the _real_ mlock() functionality not to try to lock
   the guard page.

   That would just expand the mapping down to create a new guard page,
   so there really is no point in trying to lock it in place.
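
The /proc/<pid>/maps part of the list above can be pictured with a small userspace sketch. The page size and the flag value are stand-ins chosen for illustration, and `shown_start` is a hypothetical name, not a kernel function.

```c
#define SKETCH_PAGE_SIZE 4096ul  /* assumed page size */
#define SKETCH_GROWSDOWN 0x1u    /* stand-in for VM_GROWSDOWN */

/*
 * Start address to print for a mapping: a grow-down stack mapping is
 * shown one page higher, so the guard page stays out of the listing.
 */
unsigned long shown_start(unsigned long vm_start, unsigned int flags)
{
    if (flags & SKETCH_GROWSDOWN)
        return vm_start + SKETCH_PAGE_SIZE;
    return vm_start;
}
```

A tool that parses the listing and locks each range it finds therefore never sees, and never tries to lock, the guard page.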

It would perhaps be nice to show the guard page specially in
/proc/<pid>/maps (or at least mark grow-down segments some way), but
let's not open ourselves up to more breakage by user space from programs
that depend on the exact details of the 'maps' file.

Special thanks to Henrique de Moraes Holschuh for diving into lvm-tools
source code to see what was going on with the whole new warning.

[Note, for .27, only the /proc change is done, mlock is not modified
here. - gregkh]

Reported-and-tested-by: François Valenduc <>
Reported-by: Henrique de Moraes Holschuh <>
Signed-off-by: Linus Torvalds <>
Signed-off-by: Greg Kroah-Hartman <>
8 years ago mm: fix page table unmap for stack guard page properly
Linus Torvalds [Sat, 14 Aug 2010 18:44:56 +0000 (11:44 -0700)]
mm: fix page table unmap for stack guard page properly

commit 11ac552477e32835cb6970bf0a70c210807f5673 upstream.

We do in fact need to unmap the page table _before_ doing the whole
stack guard page logic, because if it is needed (mainly 32-bit x86 with
PAE and CONFIG_HIGHPTE, but other architectures may use it too) then it
will do a kmap_atomic/kunmap_atomic.

And those kmaps will create an atomic region that we cannot do
allocations in.  However, the whole stack expand code will need to do
anon_vma_prepare() and vma_lock_anon_vma() and they cannot do that in an
atomic region.

Now, a better model might actually be to do the anon_vma_prepare() when
_creating_ a VM_GROWSDOWN segment, and not have to worry about any of
this at page fault time.  But in the meantime, this is the
straightforward fix for the issue.

See for details.

Reported-by: Wylda <>
Reported-by: Sedat Dilek <>
Reported-by: Mike Pagano <>
Reported-by: François Valenduc <>
Tested-by: Ed Tomlinson <>
Cc: Pekka Enberg <>
Cc: Greg KH <>
Signed-off-by: Linus Torvalds <>
Signed-off-by: Greg Kroah-Hartman <>
8 years ago mm: pass correct mm when growing stack
Hugh Dickins [Thu, 16 Apr 2009 20:58:12 +0000 (21:58 +0100)]
mm: pass correct mm when growing stack

commit 05fa199d45c54a9bda7aa3ae6537253d6f097aa9 upstream.

Tetsuo Handa reports seeing the WARN_ON(current->mm == NULL) in
security_vm_enough_memory(), when do_execve() is touching the
target mm's stack, to set up its args and environment.

Yes, a UMH_NO_WAIT or UMH_WAIT_PROC call_usermodehelper() spawns
an mm-less kernel thread to do the exec.  And in any case, that
vm_enough_memory check when growing stack ought to be done on the
target mm, not on the execer's mm (though apart from the warning,
it only makes a slight tweak to OVERCOMMIT_NEVER behaviour).

Reported-by: Tetsuo Handa <>
Signed-off-by: Hugh Dickins <>
Signed-off-by: Linus Torvalds <>
Signed-off-by: Greg Kroah-Hartman <>
8 years ago x86: don't send SIGBUS for kernel page faults
Greg Kroah-Hartman [Fri, 13 Aug 2010 20:46:26 +0000 (13:46 -0700)]
x86: don't send SIGBUS for kernel page faults

Based on commit 96054569190bdec375fe824e48ca1f4e3b53dd36 upstream,
authored by Linus Torvalds.

This is my backport to the .27 kernel tree, hopefully preserving
the same functionality.

Original commit message:
It's wrong for several reasons, but the most direct one is that the
fault may be for the stack accesses to set up a previous SIGBUS.  When
we have a kernel exception, the kernel exception handler does all the
fixups, not some user-level signal handler.

Even apart from the nested SIGBUS issue, it's also wrong to give out
kernel fault addresses in the signal handler info block, or to send a
SIGBUS when a system call already returns EFAULT.

Cc: Linus Torvalds <>
Signed-off-by: Greg Kroah-Hartman <>
8 years ago mm: fix missing page table unmap for stack guard page failure case
Linus Torvalds [Fri, 13 Aug 2010 16:24:04 +0000 (09:24 -0700)]
mm: fix missing page table unmap for stack guard page failure case

commit 5528f9132cf65d4d892bcbc5684c61e7822b21e9 upstream.

.. which didn't show up in my tests because it's a no-op on x86-64 and
most other architectures.  But we enter the function with the last-level
page table mapped, and should unmap it at exit.

Signed-off-by: Linus Torvalds <>
Signed-off-by: Greg Kroah-Hartman <>
8 years ago mm: keep a guard page below a grow-down stack segment
Linus Torvalds [Fri, 13 Aug 2010 00:54:33 +0000 (17:54 -0700)]
mm: keep a guard page below a grow-down stack segment

commit 320b2b8de12698082609ebbc1a17165727f4c893 upstream.

This is a rather minimally invasive patch to solve the problem of the
user stack growing into a memory mapped area below it.  Whenever we fill
the first page of the stack segment, expand the segment down by one page.

Now, admittedly some odd application might _want_ the stack to grow down
into the preceding memory mapping, and so we may at some point need to
make this a process tunable (some people might also want to have more
than a single page of guarding), but let's try the minimal approach

Tested with trivial application that maps a single page just below the
stack, and then starts recursing.  Without this, we will get a SIGSEGV
_after_ the stack has smashed the mapping.  With this patch, we'll get a
nice SIGBUS just as the stack touches the page just above the mapping.

Requested-by: Keith Packard <>
Signed-off-by: Linus Torvalds <>
Signed-off-by: Greg Kroah-Hartman <>
8 years ago Linux
Greg Kroah-Hartman [Fri, 13 Aug 2010 21:02:40 +0000 (14:02 -0700)]

8 years ago mm/backing-dev.c: remove recently-added WARN_ON()
Andrew Morton [Tue, 9 Dec 2008 21:14:06 +0000 (13:14 -0800)]
mm/backing-dev.c: remove recently-added WARN_ON()

commit 69fc208be5b7eb18d22d1eca185b201400fd5ffc upstream.

On second thoughts, this is just going to disturb people while telling us
things which we already knew.

Cc: Peter Korsgaard <>
Cc: Peter Zijlstra <>
Cc: Kay Sievers <>
Cc: David Woodhouse <>
Signed-off-by: Andrew Morton <>
Signed-off-by: Linus Torvalds <>
Cc: Ben Hutchings <>
Signed-off-by: Greg Kroah-Hartman <>
8 years ago bdi: register sysfs bdi device only once per queue
Kay Sievers [Tue, 2 Dec 2008 18:31:50 +0000 (10:31 -0800)]
bdi: register sysfs bdi device only once per queue

commit f1d0b063d993527754f062c589b73f125024d216 upstream.

Devices which share the same queue, like floppies and mtd devices, get
registered multiple times in the bdi interface, but bdi accounts only the
last registered device of the devices sharing one queue.

On remove, all earlier registered devices leak, stay around in sysfs, and
cause "duplicate filename" errors if the devices are re-created.

This prevents the creation of multiple bdi interfaces per queue, and the
bdi device will carry the dev_t name of the block device which is the
first one registered, of the pool of devices using the same queue.

[ add a WARN_ON so we know which drivers are misbehaving]
Tested-by: Peter Korsgaard <>
Acked-by: Peter Zijlstra <>
Signed-off-by: Kay Sievers <>
Cc: David Woodhouse <>
Signed-off-by: Andrew Morton <>
Signed-off-by: Linus Torvalds <>
Cc: Ben Hutchings <>
Signed-off-by: Greg Kroah-Hartman <>
8 years ago xen: drop xen_sched_clock in favour of using plain wallclock time
Jeremy Fitzhardinge [Mon, 12 Jul 2010 18:49:59 +0000 (11:49 -0700)]
xen: drop xen_sched_clock in favour of using plain wallclock time

commit 8a22b9996b001c88f2bfb54c6de6a05fc39e177a upstream.

xen_sched_clock only counts unstolen time.  In principle this should
be useful to the Linux scheduler so that it knows how much time a process
actually consumed.  But in practice this doesn't work very well as the
scheduler expects the sched_clock time to be synchronized between
cpus.  It also uses sched_clock to measure the time a task spends
sleeping, in which case "unstolen time" isn't meaningful.

So just use plain xen_clocksource_read to return wallclock nanoseconds
for sched_clock.

Signed-off-by: Jeremy Fitzhardinge <>
Signed-off-by: Greg Kroah-Hartman <>
8 years ago jfs: don't allow os2 xattr namespace overlap with others
Dave Kleikamp [Mon, 9 Aug 2010 20:57:38 +0000 (15:57 -0500)]
jfs: don't allow os2 xattr namespace overlap with others

commit aca0fa34bdaba39bfddddba8ca70dba4782e8fe6 upstream.

It's currently possible to bypass xattr namespace access rules by
prefixing valid xattr names with "os2.", since the os2 namespace stores
extended attributes in a legacy format with no prefix.

This patch adds checking to deny access to any valid namespace prefix
following "os2.".
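
The check described above might look roughly like the following userspace sketch. The list of prefixes and the function name are assumptions for illustration; the real patch may cover a different set.

```c
#include <stddef.h>
#include <string.h>

/* Namespace prefixes assumed for this sketch. */
static const char *const sketch_prefixes[] = {
    "system.", "trusted.", "security.", "user.",
};

/* Returns 1 if name is "os2." followed by another known prefix. */
int os2_name_overlaps(const char *name)
{
    static const char os2[] = "os2.";
    const size_t os2_len = sizeof(os2) - 1;
    size_t i;

    if (strncmp(name, os2, os2_len) != 0)
        return 0;
    name += os2_len;
    for (i = 0; i < sizeof(sketch_prefixes) / sizeof(sketch_prefixes[0]); i++) {
        const char *p = sketch_prefixes[i];

        if (strncmp(name, p, strlen(p)) == 0)
            return 1;
    }
    return 0;
}
```

A name such as "os2.security.selinux" would be refused, while a plain "os2.comment" still goes through.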

Signed-off-by: Dave Kleikamp <>
Reported-by: Sergey Vlasov <>
Signed-off-by: Linus Torvalds <>
Signed-off-by: Greg Kroah-Hartman <>
8 years ago signalfd: fill in ssi_int for posix timers and message queues
Nathan Lynch [Wed, 11 Aug 2010 01:03:08 +0000 (18:03 -0700)]
signalfd: fill in ssi_int for posix timers and message queues

commit a2a20c412c86e0bb46a9ab0dd31bcfe6d201b913 upstream.

If signalfd is used to consume a signal generated by a POSIX interval
timer or POSIX message queue, the ssi_int field does not reflect the data
(sigevent->sigev_value) supplied to timer_create(2) or mq_notify(3).  (The
ssi_ptr field, however, is filled in.)

This behavior differs from signalfd's treatment of sigqueue-generated
signals -- see the default case in signalfd_copyinfo.  It also gives
results that differ from the case when a signal is handled conventionally
via a sigaction-registered handler.

So, set signalfd_siginfo->ssi_int in the remaining cases (__SI_TIMER,
__SI_MESGQ) where ssi_ptr is set.

akpm: a non-back-compatible change.  Merge into -stable to minimise the
number of kernels which are in the field and which miss this feature.

Signed-off-by: Nathan Lynch <>
Acked-by: Davide Libenzi <>
Signed-off-by: Andrew Morton <>
Signed-off-by: Linus Torvalds <>
Signed-off-by: Greg Kroah-Hartman <>
8 years ago fs/ecryptfs/file.c: introduce missing free
Julia Lawall [Fri, 6 Aug 2010 20:58:49 +0000 (22:58 +0200)]
fs/ecryptfs/file.c: introduce missing free

commit ceeab92971e8af05c1e81a4ff2c271124b55bb9b upstream.

The comments in the code indicate that file_info should be released if the
function fails.  This releasing is done at the label out_free, not out.

The semantic match that finds this problem is as follows:

// <smpl>
@r exists@
local idexpression x;
statement S;
expression E;
identifier f,f1,l;
position p1,p2;
expression *ptr != NULL;

x@p1 = kmem_cache_zalloc(...);
if (x == NULL) S
<... when != x
     when != if (...) { <+...x...+> }
x->f1 = E
 (x->f1 == NULL || ...)
 return <+...x...+>;
 return@p2 ...;

p1 << r.p1;
p2 << r.p2;

print "* file: %s kmem_cache_zalloc %s" % (p1[0].file,p1[0].line)
// </smpl>

Signed-off-by: Julia Lawall <>
Signed-off-by: Tyler Hicks <>
Signed-off-by: Greg Kroah-Hartman <>
8 years ago eCryptfs: Handle ioctl calls with unlocked and compat functions
Tyler Hicks [Tue, 3 Nov 2009 17:45:11 +0000 (11:45 -0600)]
eCryptfs: Handle ioctl calls with unlocked and compat functions

commit c43f7b8fb03be8bcc579bfc4e6ab70eac887ab55 upstream.

Lower filesystems that only implemented unlocked_ioctl weren't being
passed ioctl calls because eCryptfs only checked for
lower_file->f_op->ioctl and returned -ENOTTY if it was NULL.

eCryptfs shouldn't implement ioctl(), since it doesn't require the BKL.
This patch introduces ecryptfs_unlocked_ioctl() and
ecryptfs_compat_ioctl(), which pass the calls on to the lower file
system.
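
The forwarding idea can be sketched in plain C with function pointers. The structures below are stripped-down, hypothetical stand-ins for the kernel's file and file_operations, and the handler signature is simplified for the example.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for kernel structures. */
struct sketch_fops {
    long (*unlocked_ioctl)(unsigned int cmd, unsigned long arg);
};

struct sketch_file {
    const struct sketch_fops *f_op;
};

/* Pass the call to the lower file if it has a handler, else -ENOTTY. */
long forward_unlocked_ioctl(const struct sketch_file *lower,
                            unsigned int cmd, unsigned long arg)
{
    if (lower == NULL || lower->f_op == NULL ||
        lower->f_op->unlocked_ioctl == NULL)
        return -ENOTTY;
    return lower->f_op->unlocked_ioctl(cmd, arg);
}

/* Example lower handler for trying the sketch: echoes the command. */
long echo_ioctl(unsigned int cmd, unsigned long arg)
{
    (void)arg;
    return (long)cmd;
}
```

The point of the sketch is only the dispatch: the wrapper looks at the handler the lower file actually provides instead of assuming the old locked entry point.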

Reported-by: James Dupin <>
Signed-off-by: Tyler Hicks <>
Signed-off-by: Greg Kroah-Hartman <>
8 years ago md/raid10: fix deadlock with unaligned read during resync
NeilBrown [Sat, 7 Aug 2010 11:17:00 +0000 (21:17 +1000)]
md/raid10: fix deadlock with unaligned read during resync

commit 51e9ac77035a3dfcb6fc0a88a0d80b6f99b5edb1 upstream.

If the 'bio_split' path in raid10-read is used while
resync/recovery is happening it is possible to deadlock.
Fix this by elevating ->nr_waiting for the duration of both
parts of the split request.

This fixes a bug that has been present since 2.6.22
but has only started manifesting recently for unknown reasons.
It is suitable for and -stable since then.

Reported-by: Justin Bronder <>
Tested-by: Justin Bronder <>
Signed-off-by: NeilBrown <>
Signed-off-by: Greg Kroah-Hartman <>
8 years ago PCI: disable MSI on VIA K8M800
Tejun Heo [Sun, 23 May 2010 08:22:55 +0000 (10:22 +0200)]
PCI: disable MSI on VIA K8M800

commit 549e15611b4ac1de51ef0e0a79c2704f50a638a2 upstream.

MSI delivery from on-board ahci controller doesn't work on K8M800.  At
this point, it's unclear whether the culprit is with the ahci
controller or the host bridge.  Given the track record and considering
the rather minimal impact of MSI, disabling it seems reasonable.

Signed-off-by: Tejun Heo <>
Reported-by: Rainer Hurtado Navarro <>
Signed-off-by: Jesse Barnes <>
Signed-off-by: Greg Kroah-Hartman <>
8 years ago splice: fix misuse of SPLICE_F_NONBLOCK
Miklos Szeredi [Tue, 3 Aug 2010 10:48:50 +0000 (12:48 +0200)]
splice: fix misuse of SPLICE_F_NONBLOCK

commit 6965031d331a642e31278fa1b5bd47f372ffdd5d upstream.

SPLICE_F_NONBLOCK is clearly documented to only affect blocking on the
pipe.  In __generic_file_splice_read(), however, it causes an EAGAIN
if the page is currently being read.

This makes it impossible to write an application that only wants
failure if the pipe is full.  For example if the same process is
handling both ends of a pipe and isn't otherwise able to determine
whether a splice to the pipe will fill it or not.

We could make the read non-blocking on O_NONBLOCK or some other splice
flag, but for now this is the simplest fix.

Signed-off-by: Miklos Szeredi <>
Signed-off-by: Jens Axboe <>
Signed-off-by: Greg Kroah-Hartman <>
8 years ago nvram: Fix write beyond end condition; prove to gcc copy is safe
H. Peter Anvin [Fri, 11 Dec 2009 23:48:23 +0000 (15:48 -0800)]
nvram: Fix write beyond end condition; prove to gcc copy is safe

commit a01c7800420d2c294ca403988488a635d4087a6d upstream.

In nvram_write, first of all, correctly handle the case where the file
pointer is already beyond the end; we should return EOF in that case.

Second, make the logic a bit more explicit so that gcc can statically
prove that the copy_from_user() is safe.  Once the condition of the
beyond-end filepointer is eliminated, the copy is safe but gcc can't
prove it, causing build failures for i386 allyesconfig.

Third, eliminate the entirely superfluous variable "len", and just use
the passed-in variable "count" instead.
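
The three points above can be modelled in userspace roughly as follows. The buffer size, the function name and the use of memcpy in place of copy_from_user() are assumptions of this sketch, not the driver code.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_NVRAM_BYTES 114u  /* assumed size, for illustration only */

/*
 * Returns the number of bytes written, or 0 when *pos is already at or
 * beyond the end. count is clamped in place, so no separate "len"
 * variable is needed and the copy length is visibly bounded.
 */
size_t nvram_write_sketch(unsigned char *nvram, const unsigned char *src,
                          size_t count, size_t *pos)
{
    size_t i = *pos;

    if (i >= SKETCH_NVRAM_BYTES)
        return 0;                        /* nothing left to write */
    if (count > SKETCH_NVRAM_BYTES - i)
        count = SKETCH_NVRAM_BYTES - i;  /* clamp without overflow */

    memcpy(nvram + i, src, count);
    *pos = i + count;
    return count;
}
```

Checking the position first is what makes the subtraction in the clamp safe: once `i` is known to be below the size, `SKETCH_NVRAM_BYTES - i` cannot wrap.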

Signed-off-by: H. Peter Anvin <>
Cc: Arjan van de Ven <>
Cc: Andrew Morton <>
Cc: Wim Van Sebroeck <>
Cc: Frederic Weisbecker <>
LKML-Reference: <tip-*>
Cc: Stephen Hemminger <>
Signed-off-by: Greg Kroah-Hartman <>
8 years ago Linux
Greg Kroah-Hartman [Tue, 10 Aug 2010 16:57:13 +0000 (09:57 -0700)]

8 years ago GFS2: rename causes kernel Oops
Bob Peterson [Wed, 14 Jul 2010 22:12:26 +0000 (18:12 -0400)]
GFS2: rename causes kernel Oops

commit 728a756b8fcd22d80e2dbba8117a8a3aafd3f203 upstream.

This patch fixes a kernel Oops in the GFS2 rename code.

The problem was in the way the gfs2 directory code was trying
to re-use sentinel directory entries.

In the failing case, gfs2's rename function was renaming a
file to another name that had the same non-trivial length.
The file being renamed happened to be the first directory
entry on the leaf block.

First, the rename code (gfs2_rename in ops_inode.c) found the
original directory entry and decided it could do its job by
simply replacing the directory entry with another.  Therefore
it determined correctly that no block allocations were needed.

Next, the rename code deleted the old directory entry prior to
replacing it with the new name.  Therefore, the soon-to-be
replaced directory entry was temporarily made into a directory
entry "sentinel" or a place holder at the start of a leaf block.

Lastly, it went to re-add the replacement directory entry in
that leaf block.  However, when gfs2_dirent_find_space was
looking for space in the leaf block, it used the wrong value
for the sentinel.  That threw off its calculations so later
it decides it can't really re-use the sentinel and therefore
must allocate a new leaf block.  But because it previously decided
to re-use the directory entry, it didn't waste the time to
grab a new block allocation for the inode.  Therefore, the
inode's i_alloc pointer was still NULL and it crashes trying to
reference it.

In the case of sentinel directory entries, the entire dirent is
reused, not just the "free space" portion of it, and therefore
the function gfs2_dirent_find_space should use the value 0
rather than GFS2_DIRENT_SIZE(0) for the actual dirent size.
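
To make the arithmetic concrete, here is a small sketch of the space check. The 40-byte header size, the 8-byte rounding and all names are assumptions made for the example; only the "use 0 for a sentinel" rule comes from the text above.

```c
#define SKETCH_DIRENT_HDR 40u  /* assumed on-disk dirent header size */
#define SKETCH_DIRENT_SIZE(name_len) \
    ((SKETCH_DIRENT_HDR + (name_len) + 7u) & ~7u)

/*
 * Returns 1 if a new name of new_name_len bytes fits in the record.
 * For a sentinel the whole record is reusable, so nothing is "in use".
 */
int dirent_has_space(unsigned int rec_len, unsigned int cur_name_len,
                     int is_sentinel, unsigned int new_name_len)
{
    unsigned int required = SKETCH_DIRENT_SIZE(new_name_len);
    unsigned int actual = is_sentinel ? 0u : SKETCH_DIRENT_SIZE(cur_name_len);

    if (rec_len < actual)
        return 0;  /* guard against unsigned wrap-around */
    return rec_len - actual >= required;
}
```

Under these assumptions a 10-byte name needs 56 bytes. A 56-byte sentinel record fits it exactly when 0 is subtracted, but would appear 40 bytes too small if the size of an empty dirent were subtracted instead, which matches the failure described in the rename case.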

Fixing this calculation enables the reproducer programs to work

Signed-off-by: Bob Peterson <>
Signed-off-by: Steven Whitehouse <>
Signed-off-by: Greg Kroah-Hartman <>
8 years ago SCSI: enclosure: fix error path - actually return ERR_PTR() on error
James Bottomley [Fri, 12 Mar 2010 22:14:42 +0000 (16:14 -0600)]
SCSI: enclosure: fix error path - actually return ERR_PTR() on error

commit a91c1be21704113b023919826c6d531da46656ef upstream.

we also need to clean up and free the cdev.

Reported-by: Jani Nikula <>
Signed-off-by: James Bottomley <>
Signed-off-by: Greg Kroah-Hartman <>
8 years ago xfs: prevent swapext from operating on write-only files
Dan Rosenberg [Thu, 24 Jun 2010 02:07:47 +0000 (12:07 +1000)]
xfs: prevent swapext from operating on write-only files

commit 1817176a86352f65210139d4c794ad2d19fc6b63 upstream.

This patch prevents user "foo" from using the SWAPEXT ioctl to swap
a write-only file owned by user "bar" into a file owned by "foo" and
subsequently reading it.  It does so by checking that the file
descriptors passed to the ioctl are also opened for reading.

Signed-off-by: Dan Rosenberg <>
Reviewed-by: Christoph Hellwig <>
Signed-off-by: Greg Kroah-Hartman <>
8 years ago PARISC: led.c - fix potential stack overflow in led_proc_write()
Helge Deller [Mon, 2 Aug 2010 20:46:41 +0000 (22:46 +0200)]
PARISC: led.c - fix potential stack overflow in led_proc_write()

commit 4b4fd27c0b5ec638a1f06ced9226fd95229dbbf0 upstream.

avoid potential stack overflow by correctly checking count parameter
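
A generic sketch of this kind of count check, with a made-up buffer size and function name (the real led_proc_write() differs in detail):

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_BUF_LEN 32u  /* assumed size of the on-stack buffer */

/*
 * Copies count bytes into dst (SKETCH_BUF_LEN bytes long) and
 * NUL-terminates. Anything that would not leave room for the
 * terminator is rejected before a single byte is copied.
 */
int copy_checked(char *dst, const char *src, size_t count)
{
    if (count >= SKETCH_BUF_LEN)
        return -EINVAL;
    memcpy(dst, src, count);
    dst[count] = '\0';
    return 0;
}
```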

Reported-by: Ilja <>
Signed-off-by: Helge Deller <>
Acked-by: Kyle McMartin <>
Cc: James E.J. Bottomley <>
Signed-off-by: Linus Torvalds <>
Signed-off-by: Greg Kroah-Hartman <>
8 years ago .gitignore updates
Alexey Dobriyan [Wed, 29 Oct 2008 21:00:50 +0000 (14:00 -0700)]
.gitignore updates

commit c17dad6905fc82d8f523399e5c3f014e81d61df6 upstream.

Signed-off-by: Alexey Dobriyan <>
Signed-off-by: Andrew Morton <>
Signed-off-by: Linus Torvalds <>
Signed-off-by: Greg Kroah-Hartman <>
8 years ago Linux
Greg Kroah-Hartman [Mon, 2 Aug 2010 17:19:11 +0000 (10:19 -0700)]

8 years ago ecryptfs: Bugfix for error related to ecryptfs_hash_buckets
Andre Osterhues [Tue, 13 Jul 2010 20:59:17 +0000 (15:59 -0500)]
ecryptfs: Bugfix for error related to ecryptfs_hash_buckets

commit a6f80fb7b5986fda663d94079d3bba0937a6b6ff upstream.

The function ecryptfs_uid_hash wrongly assumes that the
second parameter to hash_long() is the number of hash
buckets instead of the number of hash bits.
This patch fixes that and renames the variable
ecryptfs_hash_buckets to ecryptfs_hash_bits to make it

Fixes: CVE-2010-2492

Signed-off-by: Andre Osterhues <>
Signed-off-by: Tyler Hicks <>
Signed-off-by: Linus Torvalds <>
Signed-off-by: Greg Kroah-Hartman <>
8 years ago kbuild: Fix modpost segfault
Krzysztof Halasa [Thu, 10 Jun 2010 23:08:20 +0000 (01:08 +0200)]
kbuild: Fix modpost segfault

commit 1c938663d58b5b2965976a6f54cc51b5d6f691aa upstream.

Alan <> writes:

> program: /home/alan/GitTrees/linux-2.6-mid-ref/scripts/mod/modpost -o
> Module.symvers -S vmlinux.o
> Program received signal SIGSEGV, Segmentation fault.

It just hit me.
It's the offset calculation in reloc_location() which overflows:
        return (void *)elf->hdr + sechdrs[section].sh_offset +
               (r->r_offset - sechdrs[section].sh_addr);

E.g. for the first rodata r entry:
r->r_offset < sechdrs[section].sh_addr
and the expression in the parentheses produces 0xFFFFFFE0 or something
equally wise.
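
The wrap-around is plain unsigned arithmetic and can be reproduced outside modpost. The helper below is hypothetical and assumes 32-bit ELF fields; the input values are chosen only to reproduce the figure quoted above.

```c
#include <stdint.h>

/* 32-bit unsigned subtraction, as with 32-bit ELF offset fields. */
uint32_t reloc_delta32(uint32_t r_offset, uint32_t sh_addr)
{
    /* Wraps modulo 2^32 when r_offset is smaller than sh_addr. */
    return r_offset - sh_addr;
}
```

`reloc_delta32(0x100, 0x120)` is 0xFFFFFFE0 rather than -32. Added to a 64-bit pointer, that moves it almost 4 GiB forward instead of 32 bytes back, which is consistent with the segfault reported here.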

Reported-by: Alan <>
Signed-off-by: Krzysztof Hałasa <>
Tested-by: Alan <>
Signed-off-by: Michal Marek <>
Signed-off-by: Greg Kroah-Hartman <>
8 years ago bonding: select current active slave when enslaving device for mode tlb and alb
Jiri Pirko [Thu, 26 Mar 2009 00:23:38 +0000 (17:23 -0700)]
bonding: select current active slave when enslaving device for mode tlb and alb

commit 5a29f7893fbe681f1334285be7e41e56f0de666c upstream.

I hit an issue on my system while using RealTek RTL8139D cards in a
bonding interface in balance-alb mode. When I enslave a card, the current
active slave (bond->curr_active_slave) is not set and the link is therefore
not functional.

# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.5.0 (November 4, 2008)

Bonding Mode: adaptive load balancing
Primary Slave: None
Currently Active Slave: None
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth1
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:1f:1f:01:2f:22

Unplugging the cable and plugging it back into the NIC fixes it: the current
active slave is then set to eth1 and the link works just fine. Here is the
dmesg log with bonding DEBUG messages turned on:
ADDRCONF(NETDEV_UP): bond0: link is not ready
event_dev: bond0, event: 1
event_dev: bond0, event: 8
bond_ioctl: master=bond0, cmd=35216
event_dev: eth1, event: 8
eth1: link up, 100Mbps, full-duplex, lpa 0xC5E1
event_dev: eth1, event: 1
event_dev: eth1, event: 8
Initial state of slave_dev is BOND_LINK_UP
bonding: bond0: enslaving eth1 as an active interface with an up link.
ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
event_dev: bond0, event: 4
bond0: no IPv6 routers present

<<<<cable unplug>>>>

eth1: link down
event_dev: eth1, event: 4
bonding: bond0: link status definitely down for interface eth1, disabling it
event_dev: bond0, event: 4

<<<<cable plug>>>>

eth1: link up, 100Mbps, full-duplex, lpa 0xC5E1
event_dev: eth1, event: 4
bonding: bond0: link status definitely up for interface eth1.
bonding: bond0: making interface eth1 the new active one.
event_dev: eth1, event: 8
event_dev: eth1, event: 8
bonding: bond0: first active interface up!
event_dev: bond0, event: 4

The current active slave is set by calling bond_select_active_slave() function
from bond_miimon_commit() function when the slave (eth1) link goes to state up.

I also tested this on another machine with a Broadcom NetXtreme II BCM5708
1000Base-T NIC and there all works fine. The thing is that this adapter is down
and goes up a few seconds after it is enslaved.

This patch calls bond_select_active_slave() in bond_enslave() function for modes
alb and tlb and makes sure that the current active slave is set up properly even
when the slave state is already up. Tested on both systems, works fine.

Notice: The same problem may also occur in mode 802.3ad but I'm unable to
test that.

Signed-off-by: Jiri Pirko <>
Signed-off-by: David S. Miller <>
Cc: Jean Delvare <>
Signed-off-by: Greg Kroah-Hartman <>
8 years ago IPoIB: Fix world-writable child interface control sysfs attributes
Or Gerlitz [Sun, 6 Jun 2010 04:59:16 +0000 (04:59 +0000)]
IPoIB: Fix world-writable child interface control sysfs attributes

commit 7a52b34b07122ff5f45258d47f260f8a525518f0 upstream.

Sumeet Lahorani <> reported that the IPoIB
child entries are world-writable; however we don't want ordinary users
to be able to create and destroy child interfaces, so fix them to be
writable only by root.

Signed-off-by: Or Gerlitz <>
Signed-off-by: Roland Dreier <>
Signed-off-by: Greg Kroah-Hartman <>
8 years ago x86, Calgary: Limit the max PHB number to 256
Darrick J. Wong [Thu, 1 Jul 2010 00:45:19 +0000 (17:45 -0700)]
x86, Calgary: Limit the max PHB number to 256

commit d596043d71ff0d7b3d0bead19b1d68c55f003093 upstream.

The x3950 family can have as many as 256 PCI buses in a single system, so
change the limits to the maximum.  Since there can only be 256 PCI buses in one
domain, we no longer need the BUG_ON check.

Signed-off-by: Darrick J. Wong <>
LKML-Reference: <>
Signed-off-by: H. Peter Anvin <>
Signed-off-by: Greg Kroah-Hartman <>
8 years ago x86, Calgary: Increase max PHB number
Darrick J. Wong [Thu, 24 Jun 2010 21:26:47 +0000 (14:26 -0700)]
x86, Calgary: Increase max PHB number

commit 499a00e92dd9a75395081f595e681629eb1eebad upstream.

Newer systems (x3950M2) can have 48 PHBs per chassis and 8
chassis, so bump the limits up and provide an explanation
of the requirements for each class.

Signed-off-by: Darrick J. Wong <>
Acked-by: Muli Ben-Yehuda <>
Cc: Corinna Schultz <>
LKML-Reference: <>
[ v2: Fixed build bug, added back PHBS_PER_CALGARY == 4 ]
Signed-off-by: Ingo Molnar <>
Signed-off-by: Greg Kroah-Hartman <>
8 years ago amd64-agp: Probe unknown AGP devices the right way
Ben Hutchings [Wed, 24 Mar 2010 03:36:31 +0000 (03:36 +0000)]
amd64-agp: Probe unknown AGP devices the right way

commit 6fd024893911dcb51b4a0aa71971db5ba38f7071 upstream.

The current initialisation code probes 'unsupported' AGP devices
simply by calling its own probe function.  It does not lock these
devices or even check whether another driver is already bound to them.

We must use the device core to manage this.  So if the specific
device id table didn't match anything and agp_try_unsupported=1,
switch the device id table and call driver_attach() again.

Signed-off-by: Ben Hutchings <>
Signed-off-by: Dave Airlie <>
Signed-off-by: Greg Kroah-Hartman <>
8 years ago SCSI: aacraid: Eliminate use after free
Julia Lawall [Sat, 15 May 2010 09:46:12 +0000 (11:46 +0200)]
SCSI: aacraid: Eliminate use after free

commit 8a52da632ceb9d8b776494563df579e87b7b586b upstream.

The debugging code using the freed structure is moved before the kfree.

A simplified version of the semantic match that finds this problem is as
follows: (

// <smpl>
expression E;
position p;

expression free.E, subE<=free.E, E1;
position free.p;

  subE = E1
* E
// </smpl>

Signed-off-by: Julia Lawall <>
Signed-off-by: James Bottomley <>
8 years agonetfilter: ip6t_REJECT: fix a dst leak in ipv6 REJECT
Eric Dumazet [Fri, 2 Jul 2010 08:05:01 +0000 (10:05 +0200)]
netfilter: ip6t_REJECT: fix a dst leak in ipv6 REJECT

commit 499031ac8a3df6738f6186ded9da853e8ea18253 upstream.

We should release dst if dst->error is set.

Bug introduced in 2.6.14 by commit e104411b82f5c
([XFRM]: Always release dst_entry on error in xfrm_lookup)

Signed-off-by: Eric Dumazet <>
Signed-off-by: Patrick McHardy <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agohostap: Protect against initialization interrupt
Tim Gardner [Tue, 8 Jun 2010 17:33:02 +0000 (11:33 -0600)]
hostap: Protect against initialization interrupt

commit d6a574ff6bfb842bdb98065da053881ff527be46 upstream.

Use an irq spinlock to hold off the IRQ handler until
enough early card init is complete such that the handler
can run without faulting.

Signed-off-by: Tim Gardner <>
Signed-off-by: John W. Linville <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agomath-emu: correct test for downshifting fraction in _FP_FROM_INT()
Mikael Pettersson [Wed, 21 Jul 2010 01:45:14 +0000 (18:45 -0700)]
math-emu: correct test for downshifting fraction in _FP_FROM_INT()

commit f8324e20f8289dffc646d64366332e05eaacab25 upstream.

The kernel's math-emu code contains a macro _FP_FROM_INT() which is
used to convert an integer to a raw normalized floating-point value.
It does this basically in three steps:

1. Compute the exponent from the number of leading zero bits.
2. Downshift large fractions to put the MSB in the right position
   for normalized fractions.
3. Upshift small fractions to put the MSB in the right position.

There is a boundary error in step 2, causing a fraction with its
MSB exactly one bit above the normalized MSB position to not be
downshifted.  This results in a non-normalized raw float, which when
packed becomes a massively inaccurate representation for that input.

The impact of this depends on a number of arch-specific factors,
but it is known to have broken emulation of FXTOD instructions
on UltraSPARC III, which was originally reported as GCC bug 44631

Any arch which uses math-emu to emulate conversions from integers to
same-size floats may be affected.

The fix is simple: the exponent comparison used to determine if the
fraction should be downshifted must be "<=" not "<".
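
As a rough illustration of the boundary case (a standalone sketch, not the
math-emu macro itself; normalize_u32 and FRAC_MSB are hypothetical names, and
lost bits are truncated rather than rounded), the downshift has to fire for
every MSB position above the normalized one, including the position exactly
one bit above it:

```c
#include <stdint.h>

/* Assumed layout: 24-bit significand with the leading 1 at bit 23,
 * as in IEEE 754 single precision. */
#define FRAC_MSB 23

/*
 * Hypothetical helper: place the MSB of a nonzero 32-bit integer at
 * bit FRAC_MSB and report the unbiased exponent in *exp.
 * Low-order bits lost by the downshift are truncated, not rounded.
 */
uint32_t normalize_u32(uint32_t v, int *exp)
{
    int msb = 31;

    while (((v >> msb) & 1u) == 0)   /* caller guarantees v != 0 */
        msb--;
    *exp = msb;

    /* Step 2: downshift. A test written as "msb > FRAC_MSB + 1"
     * would wrongly skip the case msb == FRAC_MSB + 1. */
    if (msb > FRAC_MSB)
        return v >> (msb - FRAC_MSB);

    /* Step 3: upshift (a shift by zero when msb == FRAC_MSB). */
    return v << (FRAC_MSB - msb);
}
```

With this sketch, an input of 1 << 24 comes back as 1 << 23 with an exponent
of 24, which is the case the off-by-one comparison would leave unnormalized.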

I'm sending a kernel module to test this as a reply to this message.
There are also SPARC user-space test cases in the GCC bug entry.

Signed-off-by: Mikael Pettersson <>
Signed-off-by: David S. Miller <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agosky2: enable rx/tx in sky2_phy_reinit()
Brandon Philips [Wed, 16 Jun 2010 16:21:58 +0000 (16:21 +0000)]
sky2: enable rx/tx in sky2_phy_reinit()

commit 38000a94a902e94ca8b5498f7871c6316de8957a upstream.

sky2_phy_reinit is called by the ethtool helpers sky2_set_settings,
sky2_nway_reset and sky2_set_pauseparam when netif_running.

However, at the end of sky2_phy_init GM_GP_CTRL has GM_GPCR_RX_ENA and
GM_GPCR_TX_ENA cleared. So, doing these commands causes the device to
stop working:

$ ethtool -r eth0
$ ethtool -A eth0 autoneg off

Fix this issue by enabling Rx/Tx after running sky2_phy_init in
sky2_phy_reinit.

Signed-off-by: Brandon Philips <>
Tested-by: Brandon Philips <>
Tested-by: Mike McCormack <>
Signed-off-by: David S. Miller <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agocpmac: do not leak struct net_device on phy_connect errors
Florian Fainelli [Sun, 20 Jun 2010 22:07:48 +0000 (22:07 +0000)]
cpmac: do not leak struct net_device on phy_connect errors

commit ed770f01360b392564650bf1553ce723fa46afec upstream.

If the call to phy_connect fails, we will return directly instead of freeing
the previously allocated struct net_device.
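
The general shape of such a fix can be sketched in userspace C. This is only
an illustration under assumptions: struct fake_netdev, probe_device() and the
two connect callbacks are hypothetical stand-ins, not the cpmac driver's code.
The point is that every failure after the allocation goes through one cleanup
label instead of returning directly:

```c
#include <stdlib.h>

/* Hypothetical stand-in for struct net_device. */
struct fake_netdev {
    int id;
};

/* Hypothetical connect step: returns 0 on success, -1 on failure. */
typedef int (*connect_fn)(struct fake_netdev *dev);

int connect_ok(struct fake_netdev *dev)   { (void)dev; return 0; }
int connect_fail(struct fake_netdev *dev) { (void)dev; return -1; }

/*
 * Returns the device on success and NULL on failure.  On failure the
 * allocation is released before returning, so nothing is leaked.
 */
struct fake_netdev *probe_device(connect_fn connect)
{
    struct fake_netdev *dev = malloc(sizeof(*dev));

    if (!dev)
        return NULL;
    dev->id = 0;

    if (connect(dev) != 0)
        goto fail;          /* do not return directly here */

    return dev;

fail:
    free(dev);
    return NULL;
}
```

A caller that gets a non-NULL result owns the device and frees it later; a
NULL result means there is nothing left to clean up.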

Signed-off-by: Florian Fainelli <>
Signed-off-by: David S. Miller <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agocifs: Fix a kernel BUG with remote OS/2 server (try #3)
Suresh Jayaraman [Wed, 31 Mar 2010 06:30:03 +0000 (12:00 +0530)]
cifs: Fix a kernel BUG with remote OS/2 server (try #3)

commit 6513a81e9325d712f1bfb9a1d7b750134e49ff18 upstream.

While chasing a bug report involving an OS/2 server, I noticed the server sets
pSMBr->CountHigh to an incorrect value even in case of normal writes. This
results in 'nbytes' being computed wrongly and triggers a kernel BUG at

void iov_iter_advance(struct iov_iter *i, size_t bytes)
        BUG_ON(i->count < bytes);    <--- BUG here

Why the server sets 'CountHigh' is not clear, but it only does so after
writing 64k bytes. Though this looks like a server bug, the client-side
crash may not be acceptable.

The workaround is to mask off high 16 bits if the number of bytes written as
returned by the server is greater than the bytes requested by the client as
suggested by Jeff Layton.
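
A minimal sketch of that workaround, assuming the byte count arrives as two
16-bit halves (sanitize_written and its parameter names are hypothetical, not
the cifs code):

```c
#include <stdint.h>

/*
 * Combine the 16-bit low and high counts reported by the server.
 * If the result exceeds what the client asked to write, assume the
 * high half is bogus and keep only the low 16 bits.
 */
uint32_t sanitize_written(uint16_t count_low, uint16_t count_high,
                          uint32_t requested)
{
    uint32_t nbytes = ((uint32_t)count_high << 16) | count_low;

    if (nbytes > requested)
        nbytes &= 0xFFFF;

    return nbytes;
}
```

A legitimate large write (for example 0x10000 bytes out of 0x20000 requested)
passes through unchanged, because the mask is only applied when the reported
count is impossible.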

Reviewed-by: Jeff Layton <>
Signed-off-by: Suresh Jayaraman <>
Signed-off-by: Steve French <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agocifs: remove bogus first_time check in NTLMv2 session setup code
Jeff Layton [Wed, 16 Jun 2010 17:40:18 +0000 (13:40 -0400)]
cifs: remove bogus first_time check in NTLMv2 session setup code

commit 8a224d489454b7457105848610cfebebdec5638d upstream.

This bug appears to be the result of a cut-and-paste mistake from the
NTLMv1 code. The function to generate the MAC key was commented out, but
not the conditional above it. The conditional then ended up causing the
session setup key not to be copied to the buffer unless this was the
first session on the socket, and that made all but the first NTLMv2
session setup fail.

Fix this by removing the conditional and all of the commented clutter
that made it difficult to see.

Reported-by: Gunther Deschner <>
Signed-off-by: Jeff Layton <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agohwmon: (coretemp) Skip duplicate CPU entries
Jean Delvare [Fri, 9 Jul 2010 14:22:49 +0000 (16:22 +0200)]
hwmon: (coretemp) Skip duplicate CPU entries

commit d883b9f0977269d519469da72faec6a7f72cb489 upstream.

On hyper-threaded CPUs, each core appears twice in the CPU list. Skip
the second entry to avoid duplicate sensors.
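
One way to picture the duplicate check is the sketch below. It assumes each
CPU entry carries a package ID and a core ID; struct cpu_entry and both
functions are hypothetical and do not reproduce the coretemp driver:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of one logical CPU. */
struct cpu_entry {
    int phys_id;   /* package (socket) number */
    int core_id;   /* core number within the package */
};

/* True if entries[idx] names a core already seen at a lower index. */
bool is_duplicate_core(const struct cpu_entry *entries, size_t idx)
{
    for (size_t i = 0; i < idx; i++) {
        if (entries[i].phys_id == entries[idx].phys_id &&
            entries[i].core_id == entries[idx].core_id)
            return true;
    }
    return false;
}

/* Number of sensors to register: one per distinct (package, core). */
size_t count_sensors(const struct cpu_entry *entries, size_t n)
{
    size_t sensors = 0;

    for (size_t i = 0; i < n; i++) {
        if (!is_duplicate_core(entries, i))
            sensors++;
    }
    return sensors;
}
```

For a two-core hyper-threaded package listed as four logical CPUs, this
counts two sensors rather than four.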

Signed-off-by: Jean Delvare <>
Acked-by: Huaxu Wan <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agohwmon: (coretemp) Properly label the sensors
Jean Delvare [Fri, 9 Jul 2010 14:22:51 +0000 (16:22 +0200)]
hwmon: (coretemp) Properly label the sensors

commit 3f4f09b4be35d38d6e2bf22c989443e65e70fc4c upstream.

Don't assume that CPU entry number and core ID always match. It
worked in the simple cases (single CPU, no HT) but fails on
multi-CPU systems.

Signed-off-by: Jean Delvare <>
Acked-by: Huaxu Wan <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agoLinux
Greg Kroah-Hartman [Mon, 5 Jul 2010 18:09:05 +0000 (11:09 -0700)]

8 years agosctp: fix append error cause to ERROR chunk correctly
Wei Yongjun [Tue, 18 May 2010 05:51:58 +0000 (22:51 -0700)]
sctp: fix append error cause to ERROR chunk correctly

commit 2e3219b5c8a2e44e0b83ae6e04f52f20a82ac0f2 upstream.

commit 5fa782c2f5ef6c2e4f04d3e228412c9b4a4c8809
  sctp: Fix skb_over_panic resulting from multiple invalid \
    parameter errors (CVE-2010-1173) (v4)

causes the 'error cause' never to be added to the ERROR chunk, due to
a typo in the valid-length check in sctp_init_cause_fixed().

Signed-off-by: Wei Yongjun <>
Reviewed-by: Neil Horman <>
Acked-by: Vlad Yasevich <>
Signed-off-by: David S. Miller <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agoKEYS: find_keyring_by_name() can gain access to a freed keyring
Toshiyuki Okajima [Fri, 30 Apr 2010 13:32:13 +0000 (14:32 +0100)]
KEYS: find_keyring_by_name() can gain access to a freed keyring

commit cea7daa3589d6b550546a8c8963599f7c1a3ae5c upstream.

find_keyring_by_name() can gain access to a keyring that has had its reference
count reduced to zero, and is thus ready to be freed.  This then allows the
dead keyring to be brought back into use whilst it is being destroyed.

The following timeline illustrates the process:

|(cleaner)                           (user)
| free_user(user)                    sys_keyctl()
|  |                                  |
|  key_put(user->session_keyring)     keyctl_get_keyring_ID()
|  || //=> keyring->usage = 0        |
|  |schedule_work(&key_cleanup_task)   lookup_user_key()
|  ||                                   |
|  kmem_cache_free(,user)               |
|  .                                    |[KEY_SPEC_USER_KEYRING]
|  .                                    install_user_keyrings()
|  .                                    ||
| key_cleanup() [<= worker_thread()]    ||
|  |                                    ||
|  [spin_lock(&key_serial_lock)]        |[mutex_lock(&key_user_keyr..mutex)]
|  |                                    ||
|  atomic_read() == 0                   ||
|  |{ rb_erase(&key->serial_node,) }    ||
|  |                                    ||
|  [spin_unlock(&key_serial_lock)]      |find_keyring_by_name()
|  |                                    |||
|  keyring_destroy(keyring)             ||[read_lock(&keyring_name_lock)]
|  ||                                   |||
|  |[write_lock(&keyring_name_lock)]    ||atomic_inc(&keyring->usage)
|  |.                                   ||| *** GET freeing keyring ***
|  |.                                   ||[read_unlock(&keyring_name_lock)]
|  ||                                   ||
|  |list_del()                          |[mutex_unlock(&key_user_k..mutex)]
|  ||                                   |
|  |[write_unlock(&keyring_name_lock)]  ** INVALID keyring is returned **
|  |                                    .
|  kmem_cache_free(,keyring)            .
|                                       .
|                                       atomic_dec(&keyring->usage)
v                                         *** DESTROYED ***

If CONFIG_SLUB_DEBUG=y then we may see the following message generated:

BUG key_jar: Poison overwritten

INFO: 0xffff880197a7e200-0xffff880197a7e200. First byte 0x6a instead of 0x6b
INFO: Allocated in key_alloc+0x10b/0x35f age=25 cpu=1 pid=5086
INFO: Freed in key_cleanup+0xd0/0xd5 age=12 cpu=1 pid=10
INFO: Slab 0xffffea000592cb90 objects=16 used=2 fp=0xffff880197a7e200 flags=0x200000000000c3
INFO: Object 0xffff880197a7e200 @offset=512 fp=0xffff880197a7e300

Bytes b4 0xffff880197a7e1f0:  5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
  Object 0xffff880197a7e200:  6a 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b jkkkkkkkkkkkkkkk

Alternatively, we may see a system panic happen, such as:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000001
IP: [<ffffffff810e61a3>] kmem_cache_alloc+0x5b/0xe9
PGD 6b2b4067 PUD 6a80d067 PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/kernel/kexec_crash_loaded
Pid: 31245, comm: su Not tainted 2.6.34-rc5-nofixed-nodebug #2 D2089/PRIMERGY
RIP: 0010:[<ffffffff810e61a3>]  [<ffffffff810e61a3>] kmem_cache_alloc+0x5b/0xe9
RSP: 0018:ffff88006af3bd98  EFLAGS: 00010002
RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff88007d19900b
RDX: 0000000100000000 RSI: 00000000000080d0 RDI: ffffffff81828430
RBP: ffffffff81828430 R08: ffff88000a293750 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000100000 R12: 00000000000080d0
R13: 00000000000080d0 R14: 0000000000000296 R15: ffffffff810f20ce
FS:  00007f97116bc700(0000) GS:ffff88000a280000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000001 CR3: 000000006a91c000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process su (pid: 31245, threadinfo ffff88006af3a000, task ffff8800374414c0)
 0000000512e0958e 0000000000008000 ffff880037f8d180 0000000000000001
 0000000000000000 0000000000008001 ffff88007d199000 ffffffff810f20ce
 0000000000008000 ffff88006af3be48 0000000000000024 ffffffff810face3
Call Trace:
 [<ffffffff810f20ce>] ? get_empty_filp+0x70/0x12f
 [<ffffffff810face3>] ? do_filp_open+0x145/0x590
 [<ffffffff810ce208>] ? tlb_finish_mmu+0x2a/0x33
 [<ffffffff810ce43c>] ? unmap_region+0xd3/0xe2
 [<ffffffff810e4393>] ? virt_to_head_page+0x9/0x2d
 [<ffffffff81103916>] ? alloc_fd+0x69/0x10e
 [<ffffffff810ef4ed>] ? do_sys_open+0x56/0xfc
 [<ffffffff81008a02>] ? system_call_fastpath+0x16/0x1b
Code: 0f 1f 44 00 00 49 89 c6 fa 66 0f 1f 44 00 00 65 4c 8b 04 25 60 e8 00 00 48 8b 45 00 49 01 c0 49 8b 18 48 85 db 74 0d 48 63 45 18 <48> 8b 04 03 49 89 00 eb 14 4c 89 f9 83 ca ff 44 89 e6 48 89 ef
RIP  [<ffffffff810e61a3>] kmem_cache_alloc+0x5b/0xe9

The problem is that find_keyring_by_name() does not confirm that the keyring is
valid before accepting it.

Skipping keyrings that have been reduced to a zero count seems the way to go.
To this end, use atomic_inc_not_zero() to increment the usage count and skip
the candidate keyring if that returns false.
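
The semantics of an increment-unless-zero operation can be sketched with C11
atomics. This is a userspace illustration only; usage_inc_not_zero is a
hypothetical name and this is not the kernel's atomic_inc_not_zero()
implementation:

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Take a reference only if the object is still alive.
 * Returns true if the count was nonzero and has been incremented,
 * false if the count was already zero (the object is being destroyed
 * and the caller must skip it).
 */
bool usage_inc_not_zero(atomic_int *usage)
{
    int old = atomic_load(usage);

    while (old != 0) {
        /* On failure, old is reloaded with the current value. */
        if (atomic_compare_exchange_weak(usage, &old, old + 1))
            return true;
    }
    return false;
}
```

A plain increment, by contrast, would turn a count of zero back into one and
revive an object whose destruction is already under way.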

The following script _may_ cause the bug to happen, but there's no guarantee
as the window of opportunity is small:

/bin/su -c "exit;" $USER || { /usr/sbin/adduser -m $USER; add=1; }
for ((i=0; i<LOOP; i++))
do
	/bin/su -c "echo '$i' > /dev/null" $USER
done
(( add == 1 )) && /usr/sbin/userdel -r $USER

Note that the nominated user must not be in use.

An alternative way of testing this may be:

for ((i=0; i<100000; i++))
do
	keyctl session foo /bin/true || break
done >&/dev/null

as that uses a keyring named "foo" rather than relying on the user and
user-session named keyrings.

Reported-by: Toshiyuki Okajima <>
Signed-off-by: David Howells <>
Tested-by: Toshiyuki Okajima <>
Acked-by: Serge Hallyn <>
Signed-off-by: James Morris <>
Cc: Ben Hutchings <>
Cc: Chuck Ebbert <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agoKEYS: Return more accurate error codes
Dan Carpenter [Mon, 17 May 2010 13:42:35 +0000 (14:42 +0100)]
KEYS: Return more accurate error codes

commit 4d09ec0f705cf88a12add029c058b53f288cfaa2 upstream.

We were using the wrong variable here so the error codes weren't being returned
properly.  The original code returns -ENOKEY.

Signed-off-by: Dan Carpenter <>
Signed-off-by: David Howells <>
Signed-off-by: James Morris <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agoparisc: clear floating point exception flag on SIGFPE signal
Helge Deller [Mon, 3 May 2010 20:44:21 +0000 (20:44 +0000)]
parisc: clear floating point exception flag on SIGFPE signal

commit 550f0d922286556c7ea43974bb7921effb5a5278 upstream.

Clear the floating point exception flag before returning to
user space. This is needed, else the libc trampoline handler
may hit the same SIGFPE again while building up a trampoline
to a signal handler.

Fixes debian bug #559406.

Signed-off-by: Helge Deller <>
Signed-off-by: Kyle McMartin <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agotipc: Fix oops on send prior to entering networked mode (v3)
Neil Horman [Wed, 3 Mar 2010 08:31:23 +0000 (08:31 +0000)]
tipc: Fix oops on send prior to entering networked mode (v3)

commit d0021b252eaf65ca07ed14f0d66425dd9ccab9a6 upstream.

Fix TIPC to disallow sending to remote addresses prior to entering NET_MODE

user programs can oops the kernel by sending datagrams via AF_TIPC prior to
entering networked mode.  The following backtrace has been observed:

ID: 13459  TASK: ffff810014640040  CPU: 0   COMMAND: "tipc-client"
[exception RIP: tipc_node_select_next_hop+90]
RIP: ffffffff8869d3c3  RSP: ffff81002d9a5ab8  RFLAGS: 00010202
RAX: 0000000000000001  RBX: 0000000000000001  RCX: 0000000000000001
RDX: 0000000000000000  RSI: 0000000000000001  RDI: 0000000001001001
RBP: 0000000001001001   R8: 0074736575716552   R9: 0000000000000000
R10: ffff81003fbd0680  R11: 00000000000000c8  R12: 0000000000000008
R13: 0000000000000001  R14: 0000000000000001  R15: ffff810015c6ca00
ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
RIP: 0000003cbd8d49a3  RSP: 00007fffc84e0be8  RFLAGS: 00010206
RAX: 000000000000002c  RBX: ffffffff8005d116  RCX: 0000000000000000
RDX: 0000000000000008  RSI: 00007fffc84e0c00  RDI: 0000000000000003
RBP: 0000000000000000   R8: 00007fffc84e0c10   R9: 0000000000000010
R10: 0000000000000000  R11: 0000000000000246  R12: 0000000000000000
R13: 00007fffc84e0d10  R14: 0000000000000000  R15: 00007fffc84e0c30
ORIG_RAX: 000000000000002c  CS: 0033  SS: 002b

What happens is that, when the tipc module is inserted it enters a standalone
node mode in which communication to its own address is allowed <0.0.0> but not
to other addresses, since the appropriate data structures have not been
allocated yet (specifically the tipc_net pointer).  There is nothing stopping a
client from trying to send such a message however, and if that happens, we
attempt to dereference tipc_net.zones while the pointer is still NULL, and
explode.  The fix is pretty straightforward.  Since these oopses all arise from
the dereference of global pointers prior to their assignment to allocated
values, and since these allocations are small (about 2k total), let's convert
these pointers to static arrays of the appropriate size.  All the accesses to
these bits consider 0/NULL to be a non match when searching, so all the lookups
still work properly, and there is no longer a chance of a bad dereference
anywhere.  As a bonus, this lets us eliminate the setup/teardown routines for
those pointers, and eliminates the need to perform any locking around them to
prevent access while they're being allocated/freed.

I've updated the tipc_net structure to behave this way to fix the exact reported
problem, and also fixed up the tipc_bearers and media_list arrays to fix an
obvious similar problem that arises from issuing tipc-config commands to
manipulate bearers/links prior to entering networked mode.

I've tested this for a few hours by running the sanity tests and stress test
with the tipcutils suite, and nothing has fallen over.  There have been a few
lockdep warnings, but those were there before, and can be addressed later, as
they didn't actually result in any deadlock.

Signed-off-by: Neil Horman <>
CC: Allan Stephens <>
CC: David S. Miller <>
Signed-off-by: David S. Miller <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agovfs: add NOFOLLOW flag to umount(2)
Miklos Szeredi [Wed, 10 Feb 2010 11:15:53 +0000 (12:15 +0100)]
vfs: add NOFOLLOW flag to umount(2)

commit db1f05bb85d7966b9176e293f3ceead1cb8b5d79 upstream.

Add a new UMOUNT_NOFOLLOW flag to umount(2).  This is needed to prevent
symlink attacks in unprivileged unmounts (fuse, samba, ncpfs).

Additionally, return -EINVAL if an unknown flag is used (and specify
an explicitly unused flag: UMOUNT_UNUSED).  This makes it possible for
the caller to determine if a flag is supported or not.
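
The flag check described above amounts to rejecting any bit outside the known
set. The sketch below is illustrative only: the SK_ constants and
check_umount_flags() are hypothetical, and the numeric values are assumptions
chosen for the example, not taken from this commit message:

```c
#include <errno.h>

/* Assumed flag values, for illustration only. */
#define SK_MNT_FORCE        0x00000001u
#define SK_MNT_DETACH       0x00000002u
#define SK_MNT_EXPIRE       0x00000004u
#define SK_UMOUNT_NOFOLLOW  0x00000008u
#define SK_UMOUNT_UNUSED    0x80000000u  /* explicitly never valid */

#define SK_UMOUNT_VALID (SK_MNT_FORCE | SK_MNT_DETACH | \
                         SK_MNT_EXPIRE | SK_UMOUNT_NOFOLLOW)

/* Returns 0 if every bit in flags is known, -EINVAL otherwise. */
int check_umount_flags(unsigned int flags)
{
    if (flags & ~SK_UMOUNT_VALID)
        return -EINVAL;
    return 0;
}
```

Because the explicitly unused flag always fails, a caller can probe with it:
an -EINVAL result tells the caller that unknown flags are being rejected, so
a later success with another flag means that flag is genuinely supported.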

CC: Eugene Teo <>
CC: Michael Kerrisk <>
Signed-off-by: Miklos Szeredi <>
Signed-off-by: Al Viro <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agosctp: Fix skb_over_panic resulting from multiple invalid parameter errors (CVE-2010...
Neil Horman [Wed, 28 Apr 2010 10:30:59 +0000 (10:30 +0000)]
sctp: Fix skb_over_panic resulting from multiple invalid parameter errors (CVE-2010-1173) (v4)

commit 5fa782c2f5ef6c2e4f04d3e228412c9b4a4c8809 upstream.

Ok, version 4

Change Notes:
1) Minor cleanups, from Vlad's notes


Recently, it was reported to me that the kernel could oops in the
following way:

<5> kernel BUG at net/core/skbuff.c:91!
<5> invalid operand: 0000 [#1]
<5> Modules linked in: sctp netconsole nls_utf8 autofs4 sunrpc iptable_filter
ip_tables cpufreq_powersave parport_pc lp parport vmblock(U) vsock(U) vmci(U)
vmxnet(U) vmmemctl(U) vmhgfs(U) acpiphp dm_mirror dm_mod button battery ac md5
ipv6 uhci_hcd ehci_hcd snd_ens1371 snd_rawmidi snd_seq_device snd_pcm_oss
snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_ac97_codec snd soundcore
pcnet32 mii floppy ext3 jbd ata_piix libata mptscsih mptsas mptspi mptscsi
mptbase sd_mod scsi_mod
<5> CPU:    0
<5> EIP:    0060:[<c02bff27>]    Not tainted VLI
<5> EFLAGS: 00010216   (2.6.9-89.0.25.EL)
<5> EIP is at skb_over_panic+0x1f/0x2d
<5> eax: 0000002c   ebx: c033f461   ecx: c0357d96   edx: c040fd44
<5> esi: c033f461   edi: df653280   ebp: 00000000   esp: c040fd40
<5> ds: 007b   es: 007b   ss: 0068
<5> Process swapper (pid: 0, threadinfo=c040f000 task=c0370be0)
<5> Stack: c0357d96 e0c29478 00000084 00000004 c033f461 df653280 d7883180
<5>        00000000 00000080 df653490 00000004 de4f1ac0 de4f1ac0 00000004
<5>        00000001 e0c2877a 08000800 de4f1ac0 df653490 00000000 e0c29d2e
<5> Call Trace:
<5>  [<e0c29478>] sctp_addto_chunk+0xb0/0x128 [sctp]
<5>  [<e0c2947d>] sctp_addto_chunk+0xb5/0x128 [sctp]
<5>  [<e0c2877a>] sctp_init_cause+0x3f/0x47 [sctp]
<5>  [<e0c29d2e>] sctp_process_unk_param+0xac/0xb8 [sctp]
<5>  [<e0c29e90>] sctp_verify_init+0xcc/0x134 [sctp]
<5>  [<e0c20322>] sctp_sf_do_5_1B_init+0x83/0x28e [sctp]
<5>  [<e0c25333>] sctp_do_sm+0x41/0x77 [sctp]
<5>  [<c01555a4>] cache_grow+0x140/0x233
<5>  [<e0c26ba1>] sctp_endpoint_bh_rcv+0xc5/0x108 [sctp]
<5>  [<e0c2b863>] sctp_inq_push+0xe/0x10 [sctp]
<5>  [<e0c34600>] sctp_rcv+0x454/0x509 [sctp]
<5>  [<e084e017>] ipt_hook+0x17/0x1c [iptable_filter]
<5>  [<c02d005e>] nf_iterate+0x40/0x81
<5>  [<c02e0bb9>] ip_local_deliver_finish+0x0/0x151
<5>  [<c02e0c7f>] ip_local_deliver_finish+0xc6/0x151
<5>  [<c02d0362>] nf_hook_slow+0x83/0xb5
<5>  [<c02e0bb2>] ip_local_deliver+0x1a2/0x1a9
<5>  [<c02e0bb9>] ip_local_deliver_finish+0x0/0x151
<5>  [<c02e103e>] ip_rcv+0x334/0x3b4
<5>  [<c02c66fd>] netif_receive_skb+0x320/0x35b
<5>  [<e0a0928b>] init_stall_timer+0x67/0x6a [uhci_hcd]
<5>  [<c02c67a4>] process_backlog+0x6c/0xd9
<5>  [<c02c690f>] net_rx_action+0xfe/0x1f8
<5>  [<c012a7b1>] __do_softirq+0x35/0x79
<5>  [<c0107efb>] handle_IRQ_event+0x0/0x4f
<5>  [<c01094de>] do_softirq+0x46/0x4d

It's an skb_over_panic BUG halt that results from processing an init chunk in
which too many of its variable length parameters are in some way malformed.

The problem is in sctp_process_unk_param:
if (NULL == *errp)
*errp = sctp_make_op_error_space(asoc, chunk,

if (*errp) {
sctp_init_cause(*errp, SCTP_ERROR_UNKNOWN_PARAM,

When we allocate an error chunk, we assume that the worst case scenario requires
that we have chunk_hdr->length data allocated, which would be correct nominally,
given that we call sctp_addto_chunk for the violating parameter.  Unfortunately,
we also, in sctp_init_cause insert a sctp_errhdr_t structure into the error
chunk, so the worst case situation in which all parameters are in violation
requires chunk_hdr->length+(sizeof(sctp_errhdr_t)*param_count) bytes of data.

The result of this error is that a deliberately malformed packet sent to a
listening host can cause a remote DOS, described in CVE-2010-1173:

I've tested the below fix and confirmed that it fixes the issue.  We move to a
strategy whereby we allocate a fixed size error chunk and ignore errors we don't
have space to report.  Tested by me successfully
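
The fixed-size strategy can be sketched as follows. This is a userspace
illustration under assumptions: struct err_chunk, ERR_CHUNK_CAP and
err_chunk_add() are hypothetical and do not correspond to the SCTP code. Each
error needs room for its header plus its payload, and an error that does not
fit is dropped instead of overrunning the buffer:

```c
#include <stddef.h>
#include <string.h>

#define ERR_CHUNK_CAP 64   /* hypothetical fixed capacity in bytes */

struct err_chunk {
    unsigned char data[ERR_CHUNK_CAP];
    size_t used;
};

/*
 * Append one error cause (header followed by payload).
 * Returns 0 on success.  Returns -1 and leaves the chunk untouched
 * when header plus payload would not fit in the remaining space.
 */
int err_chunk_add(struct err_chunk *c,
                  const void *hdr, size_t hdr_len,
                  const void *payload, size_t payload_len)
{
    size_t room = ERR_CHUNK_CAP - c->used;

    if (hdr_len > room || payload_len > room - hdr_len)
        return -1;   /* no space: this error goes unreported */

    memcpy(c->data + c->used, hdr, hdr_len);
    c->used += hdr_len;
    memcpy(c->data + c->used, payload, payload_len);
    c->used += payload_len;
    return 0;
}
```

The size test counts the per-error header as well as the payload, which is
the term the original worst-case estimate left out.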

Signed-off-by: Neil Horman <>
Acked-by: Vlad Yasevich <>
Signed-off-by: David S. Miller <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agoext4: Implement range_cyclic in ext4_da_writepages instead of write_cache_pages
Aneesh Kumar K.V [Fri, 28 May 2010 19:27:23 +0000 (14:27 -0500)]
ext4: Implement range_cyclic in ext4_da_writepages instead of write_cache_pages

commit 2acf2c261b823d9d9ed954f348b97620297a36b5 upstream.

With delayed allocation we lock the page in write_cache_pages() and
try to build an in memory extent of contiguous blocks.  This is needed
so that we can issue large contiguous block requests.  If range_cyclic
mode is enabled, write_cache_pages() will loop back to the 0 index if
no I/O has been done yet, and try to start writing from the beginning
of the range.  That causes an attempt to take the page lock of lower
index page while holding the page lock of higher index page, which can
cause a deadlock with another writeback thread.

The solution is to implement the range_cyclic behavior in
ext4_da_writepages() instead.
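
The idea of handling the wrap-around in the caller can be sketched as two
separate passes, each of which only moves forward through the page indexes.
cyclic_order() is a hypothetical helper that just records the visiting order;
it is not ext4 code:

```c
#include <stddef.h>

/*
 * Record the order in which page indexes 0..npages-1 are visited when
 * writeback starts at 'start' and wraps around once.  'out' must have
 * room for npages entries.  Returns the number of indexes written.
 */
size_t cyclic_order(size_t start, size_t npages, size_t *out)
{
    size_t n = 0;

    /* First pass: from the saved start index to the end of the range. */
    for (size_t i = start; i < npages; i++)
        out[n++] = i;

    /* Second pass: begun only after the first has finished, so no
     * higher-index page is still held when a lower one is taken. */
    for (size_t i = 0; i < start && i < npages; i++)
        out[n++] = i;

    return n;
}
```

Within each pass the indexes only increase, which is the ordering property
the lock-ordering argument above depends on.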

Signed-off-by: Aneesh Kumar K.V <>
Signed-off-by: "Theodore Ts'o" <>
Signed-off-by: Jayson R. King <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agoext4: Fix file fragmentation during large file write.
Aneesh Kumar K.V [Fri, 28 May 2010 19:26:57 +0000 (14:26 -0500)]
ext4: Fix file fragmentation during large file write.

commit 22208dedbd7626e5fc4339c417f8d24cc21f79d7 upstream.

The range_cyclic writeback mode uses the address_space writeback_index
as the start index for writeback.  With delayed allocation we were
updating writeback_index wrongly, resulting in a highly fragmented file.
This patch reduces the number of extents from 4000 to 27 for a
3GB file.

Signed-off-by: Aneesh Kumar K.V <>
Signed-off-by: Theodore Ts'o <>
[ Some changed lines from the original version of this patch were dropped, since they were rolled up with another cherry-picked patch applied to 2.6.27.y earlier.]
[ Use of wbc->no_nrwrite_index_update was dropped, since write_cache_pages_da() implies it.]
Signed-off-by: Jayson R. King <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agoext4: Use our own write_cache_pages()
Theodore Ts'o [Fri, 28 May 2010 19:26:25 +0000 (14:26 -0500)]
ext4: Use our own write_cache_pages()

commit 8e48dcfbd7c0892b4cfd064d682cc4c95a29df32 upstream.

Make a copy of write_cache_pages() for the benefit of
ext4_da_writepages().  This allows us to simplify the code some, and
will allow us to further customize the code in future patches.

There are some nasty hacks in write_cache_pages(), which Linus has
(correctly) characterized as vile.  I've just copied it into
write_cache_pages_da(), without trying to clean those bits up lest I
break something in the ext4's delalloc implementation, which is a bit
fragile right now.  This will allow Dave Chinner to clean up
write_cache_pages() in mm/page-writeback.c, without worrying about
breaking ext4.  Eventually write_cache_pages_da() will go away when I
rewrite ext4's delayed allocation and create a general
ext4_writepages() which is used for all of ext4's writeback.  Until
then, this is the lowest-risk way to clean up the core
write_cache_pages() function.

Signed-off-by: "Theodore Ts'o" <>
Cc: Dave Chinner <>
[ Dropped the hunks which reverted the use of no_nrwrite_index_update, since those lines weren't ever created on 2.6.27.y]
[ Copied from 2.6.27.y's version of write_cache_pages(), plus the changes to it from patch "vfs: Add no_nrwrite_index_update writeback control flag"]
Signed-off-by: Jayson R. King <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agoext4: check s_log_groups_per_flex in online resize code
Eric Sandeen [Sun, 16 May 2010 05:00:00 +0000 (01:00 -0400)]
ext4: check s_log_groups_per_flex in online resize code

commit 42007efd569f1cf3bfb9a61da60ef6c2179508ca upstream.

If groups_per_flex < 2, sbi->s_flex_groups[] doesn't get filled out,
and every other access to this first tests s_log_groups_per_flex;
same thing needs to happen in resize or we'll wander off into
a null pointer when doing an online resize of the file system.
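
The guard can be sketched like this. The structures and add_free_blocks() are
hypothetical simplifications, not the ext4 resize code; the sketch assumes
groups_per_flex equals 1 << log_groups_per_flex, so "fewer than 2 groups per
flex" corresponds to a log value of 0:

```c
#include <stddef.h>

struct flex_group {
    long free_blocks;
};

/* Hypothetical subset of the superblock info. */
struct sb_sketch {
    unsigned int log_groups_per_flex;  /* 0 means flex groups unused */
    struct flex_group *flex_groups;    /* NULL when unused */
};

/*
 * Credit 'blocks' to the flex group containing block group 'group'.
 * Returns 1 if a counter was updated, 0 if flex groups are not in use
 * (in which case flex_groups is never dereferenced).
 */
int add_free_blocks(struct sb_sketch *sb, unsigned int group, long blocks)
{
    if (sb->log_groups_per_flex == 0)
        return 0;

    sb->flex_groups[group >> sb->log_groups_per_flex].free_blocks += blocks;
    return 1;
}
```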

Thanks to Christoph Biedl, who came up with the trivial testcase:

# truncate --size 128M fsfile
# mkfs.ext3 -F fsfile
# tune2fs -O extents,uninit_bg,dir_index,flex_bg,huge_file,dir_nlink,extra_isize fsfile
# e2fsck -yDf -C0 fsfile
# truncate --size 132M fsfile
# losetup /dev/loop0 fsfile
# mount /dev/loop0 mnt
# resize2fs -p /dev/loop0

Reported-by: Alessandro Polverini <>
Test-case-by: Christoph Biedl <>
Signed-off-by: Eric Sandeen <>
Signed-off-by: "Theodore Ts'o" <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agogconfig: fix build failure on fedora 13
Richard Kennedy [Thu, 27 May 2010 09:22:28 +0000 (10:22 +0100)]
gconfig: fix build failure on fedora 13

commit cbab05f041a4cff6ca15856bdd35238b282b64eb upstream.

Making gconfig fails on fedora 13 as the linker cannot resolve dlsym.

Adding libdl to the link command fixes this.

make shows this error :-
    /usr/bin/ld: scripts/kconfig/kconfig_load.o: undefined reference to symbol 'dlsym@@GLIBC_2.2.5'
    /usr/bin/ld: note: 'dlsym@@GLIBC_2.2.5' is defined in DSO /lib64/ so try adding it to the linker command line
    /lib64/ could not read symbols: Invalid operation

tested on x86_64 fedora 13.

Signed-off-by: Richard Kennedy <>
Reviewed-by: WANG Cong <>
Signed-off-by: Andrew Morton <>
Signed-off-by: Michal Marek <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agoipmi: handle run_to_completion properly in deliver_recv_msg()
Jiri Kosina [Wed, 26 May 2010 21:43:53 +0000 (14:43 -0700)]
ipmi: handle run_to_completion properly in deliver_recv_msg()

commit a747c5abc329611220f16df0bb4cf0ca4a7fdf0c upstream.

If run_to_completion flag is set, it means that we are running in a
single-threaded mode, and thus no locks are held.

This fixes a deadlock when IPMI notifier is being called during panic.

Signed-off-by: Jiri Kosina <>
Acked-by: Corey Minyard <>
Signed-off-by: Andrew Morton <>
Signed-off-by: Linus Torvalds <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agodo_generic_file_read: clear page errors when issuing a fresh read of the page
Jeff Moyer [Wed, 26 May 2010 15:49:40 +0000 (11:49 -0400)]
do_generic_file_read: clear page errors when issuing a fresh read of the page

commit 91803b499cca2fe558abad709ce83dc896b80950 upstream.

I/O errors can happen due to temporary failures, like multipath
errors or losing network contact with the iSCSI server. Because
of that, the VM will retry readpage on the page.

However, do_generic_file_read does not clear PG_error.  This
causes the system to be unable to actually use the data in the
page cache page, even if the subsequent readpage completes
successfully.

The function filemap_fault has had a ClearPageError before
readpage forever.  This patch simply adds the same to
do_generic_file_read.

Signed-off-by: Jeff Moyer <>
Signed-off-by: Rik van Riel <>
Acked-by: Larry Woodman <>
Signed-off-by: Linus Torvalds <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agomd: set mddev readonly flag on blkdev BLKROSET ioctl
Dan Williams [Tue, 11 May 2010 22:25:37 +0000 (08:25 +1000)]
md: set mddev readonly flag on blkdev BLKROSET ioctl

commit e2218350465e7e0931676b4849b594c978437bce upstream.

When the user sets the block device to readwrite then the mddev should
follow suit.  Otherwise, the BUG_ON in md_write_start() will be set to
trigger.

The reverse direction, setting mddev->ro to match a set readonly
request, can be ignored because the blkdev level readonly flag precludes
the need to have mddev->ro set correctly.  Never mind the fact that
setting mddev->ro to 1 may fail if the array is in use.

Signed-off-by: Dan Williams <>
Signed-off-by: NeilBrown <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agomd: Fix read balancing in RAID1 and RAID10 on drives > 2TB
NeilBrown [Fri, 7 May 2010 22:20:17 +0000 (08:20 +1000)]
md: Fix read balancing in RAID1 and RAID10 on drives > 2TB

commit af3a2cd6b8a479345786e7fe5e199ad2f6240e56 upstream.

read_balance uses an "unsigned long" for a sector number, which
will get truncated beyond 2TB.
This will cause read-balancing to be non-optimal, and can cause
data to be read from the 'wrong' branch during a resync.  This has a
very small chance of returning wrong data.
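
The 2TB figure follows from simple arithmetic, sketched below under two
assumptions: sectors are 512 bytes, and "unsigned long" is 32 bits wide (as
on 32-bit architectures; it is modeled here with uint32_t). The function
names are hypothetical:

```c
#include <stdint.h>

#define SECTOR_SIZE 512u   /* assumed sector size in bytes */

/* Number of whole sectors in a device of the given size. */
uint64_t bytes_to_sectors(uint64_t bytes)
{
    return bytes / SECTOR_SIZE;
}

/* What a 64-bit sector number becomes when stored in 32 bits. */
uint32_t sector_as_32bit(uint64_t sector)
{
    return (uint32_t)sector;   /* keeps only the low 32 bits */
}
```

2 TiB is 2^41 bytes, or 2^32 sectors, which is one more than the largest
value 32 bits can hold. Sector 2^32 + 5 therefore reads back as sector 5, so
a position comparison made on the truncated value can pick the wrong device.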

Reported-by: Jordan Russell <>
Signed-off-by: NeilBrown <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agomd/raid1: fix counting of write targets.
NeilBrown [Tue, 18 May 2010 05:27:13 +0000 (15:27 +1000)]
md/raid1: fix counting of write targets.

commit 964147d5c86d63be79b442c30f3783d49860c078 upstream.

There is a very small race window when writing to a
RAID1 such that if a device is marked faulty at exactly the wrong
time, the write-in-progress will not be sent to the device,
but the bitmap (if present) will be updated to say that
the write was sent.

Then if the device turned out to still be usable and was re-added
to the array, the bitmap-based-resync would skip resyncing that
block, possibly leading to corruption.  This would only be a problem
if no further writes were issued to that area of the device (i.e.
that bitmap chunk).

Suitable for any pending -stable kernel.

Signed-off-by: NeilBrown <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agopowerpc/oprofile: fix potential buffer overrun in op_model_cell.c
Denis Kirjanov [Tue, 1 Jun 2010 19:43:34 +0000 (15:43 -0400)]
powerpc/oprofile: fix potential buffer overrun in op_model_cell.c

commit 238c1a78c957f3dc7cb848b161dcf4805793ed56 upstream.

Fix potential initial_lfsr buffer overrun.
Writing past the end of the buffer could happen when index == ENTRIES
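
The off-by-one can be illustrated with a small sketch.  The table,
its size and table_store() are hypothetical and do not reproduce the
code in op_model_cell.c; the point is only that the last valid index
is ENTRIES - 1, so the check has to reject index == ENTRIES:

```c
#include <stddef.h>

#define ENTRIES 4	/* illustrative size only */

static unsigned int table[ENTRIES];

/* Returns 0 on success, -1 if index is outside the table. */
static int table_store(size_t index, unsigned int value)
{
	/* Using ">" here would let index == ENTRIES write past the end. */
	if (index >= ENTRIES)
		return -1;
	table[index] = value;
	return 0;
}
```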

Signed-off-by: Denis Kirjanov <>
Signed-off-by: Robert Richter <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agopowerpc/pseries: Make query_cpu_stopped callable outside hotplug cpu
Michael Neuling [Wed, 28 Apr 2010 13:39:41 +0000 (13:39 +0000)]
powerpc/pseries: Make query_cpu_stopped callable outside hotplug cpu

commit f8b67691828321f5c85bb853283aa101ae673130 upstream.

This moves query_cpu_stopped() out of the hotplug cpu code and into
smp.c so it can be called in other places and renames it to

It also cleans up the return values by adding some #defines

Signed-off-by: Michael Neuling <>
Signed-off-by: Benjamin Herrenschmidt <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agopowerpc/pseries: Only call start-cpu when a CPU is stopped
Michael Neuling [Wed, 28 Apr 2010 13:39:41 +0000 (13:39 +0000)]
powerpc/pseries: Only call start-cpu when a CPU is stopped

commit aef40e87d866355ffd279ab21021de733242d0d5 upstream.

Currently we always call start-cpu irrespective of if the CPU is
stopped or not. Unfortunately on POWER7, firmware seems to not like
start-cpu being called when a cpu has already been started.  This was not
the case on POWER6 and earlier.

This patch checks to see if the CPU is stopped or not via an
query-cpu-stopped-state call, and only calls start-cpu on CPUs which
are stopped.

This fixes a bug with kexec on POWER7 on PHYP where only the primary
thread would make it to the second kernel.

Reported-by: Ankita Garg <>
Signed-off-by: Michael Neuling <>
Signed-off-by: Benjamin Herrenschmidt <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agopowerpc: Fix handling of strncmp with zero len
Jeff Mahoney [Wed, 17 Mar 2010 10:55:51 +0000 (10:55 +0000)]
powerpc: Fix handling of strncmp with zero len

commit 637a99022fb119b90fb281715d13172f0394fc12 upstream.

Commit 0119536c, which added the assembly version of strncmp to
powerpc, mentions that it adds two instructions to the version from
boot/string.S to allow it to handle len=0. Unfortunately, it doesn't
always return 0 when that is the case. The length is passed in r5, but
the return value is passed back in r3. In certain cases, this will
happen to work. Otherwise it will pass back the address of the first
string as the return value.
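
For reference, the expected len=0 behaviour can be written in plain C.
This is only a sketch of the semantics, not the powerpc assembly;
sketch_strncmp() is a hypothetical name:

```c
#include <stddef.h>

static int sketch_strncmp(const char *a, const char *b, size_t n)
{
	/* Nothing to compare: the result must be 0, whatever a points to. */
	if (n == 0)
		return 0;

	while (n-- > 0) {
		unsigned char ca = (unsigned char)*a++;
		unsigned char cb = (unsigned char)*b++;

		if (ca != cb)
			return ca < cb ? -1 : 1;
		if (ca == '\0')
			break;	/* both strings ended at the same place */
	}
	return 0;
}
```

In C the early return is written explicitly; in assembly the same
effect requires loading 0 into the return register before branching
out, otherwise whatever was in that register is returned.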

This patch lifts the len <= 0 handling code from memcpy to handle that

Reported by:
Signed-off-by: Jeff Mahoney <>
Signed-off-by: Benjamin Herrenschmidt <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agoARCNET: Limit com20020 PCI ID matches for SOHARD cards
Andreas Bombe [Tue, 18 May 2010 06:12:46 +0000 (23:12 -0700)]
ARCNET: Limit com20020 PCI ID matches for SOHARD cards

commit e7971c80a8e0299f91272ad8e8ac4167623e1862 upstream.

The SH SOHARD ARCNET cards are implemented using generic PLX Technology
PCI<->IOBus bridges. Subvendor and subdevice IDs were not specified,
causing the driver to attach to any such bridge and likely crash the
system by attempting to initialize an unrelated device.

Fix by specifying subvendor and subdevice according to the values found
in the PCI-ID Repository at .

Signed-off-by: Andreas Bombe <>
Signed-off-by: David S. Miller <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agoNFSD: don't report compiled-out versions as present
Pavel Emelyanov [Fri, 14 May 2010 11:33:36 +0000 (15:33 +0400)]
NFSD: don't report compiled-out versions as present

commit 15ddb4aec54422ead137b03ea4e9b3f5db3f7cc2 upstream.

The /proc/fs/nfsd/versions file calls nfsd_vers() to check whether
the particular nfsd version is present/available. The problem is
that once I turn off e.g. NFSD-V4 this call returns -1 which is
true from the caller's POV, which is wrong.

The proposal is to report false in that case.
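
The pitfall is generic C: any non-zero int, including -1, is true in
a boolean context.  A minimal sketch, using a made-up version_state()
helper rather than the real nfsd_vers() implementation:

```c
#include <stdbool.h>

/* Hypothetical states: 1 = enabled, 0 = disabled, -1 = compiled out. */
static int version_state(int vers)
{
	switch (vers) {
	case 2:
		return 0;
	case 3:
		return 1;
	default:
		return -1;
	}
}

/* Buggy pattern: -1 converts to true, so a missing version looks present. */
static bool present_buggy(int vers)
{
	return version_state(vers);
}

/* Safer pattern: only a strictly positive state counts as present. */
static bool present_fixed(int vers)
{
	return version_state(vers) > 0;
}
```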

The bug has existed since 6658d3a7bbfd1768 "[PATCH] knfsd: remove
nfsd_versbits as intermediate storage for desired versions".

Signed-off-by: Pavel Emelyanov <>
Acked-by: NeilBrown <>
Signed-off-by: J. Bruce Fields <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agolibata: disable ATAPI AN by default
Tejun Heo [Wed, 19 May 2010 13:38:58 +0000 (15:38 +0200)]
libata: disable ATAPI AN by default

commit e7ecd435692ca9bde9d124be30b3a26e672ea6c2 upstream.

There are ATAPI devices which raise AN when hit by commands issued by
open().  This leads to an infinite loop of AN -> MEDIA_CHANGE uevent ->
udev open() to check media -> AN.

Both ACS and SerialATA standards don't define in which case ATAPI
devices are supposed to raise or not raise AN.  They both list media
insertion event as a possible use case for ATAPI ANs but there is no
clear description of what constitutes such events.  As such, it seems
a bit too naive to export ANs directly to userland as MEDIA_CHANGE
events without further verification (which should behave similarly to
windows as it apparently is the only thing that some hardware vendors
are testing against).

This patch adds libata.atapi_an module parameter and disables ATAPI AN
by default for now.

Signed-off-by: Tejun Heo <>
Cc: Kay Sievers <>
Cc: Nick Bowler <>
Cc: David Zeuthen <>
Signed-off-by: Jeff Garzik <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agoLinux
Greg Kroah-Hartman [Wed, 26 May 2010 21:27:20 +0000 (14:27 -0700)]

8 years agonfsd: fix vm overcommit crash fix #2
Junjiro R. Okajima [Tue, 2 Dec 2008 18:31:46 +0000 (10:31 -0800)]
nfsd: fix vm overcommit crash fix #2

commit 1b79cd04fab80be61dcd2732e2423aafde9a4c1c upstream.

The previous patch from Alan Cox ("nfsd: fix vm overcommit crash",
commit 731572d39fcd3498702eda4600db4c43d51e0b26) fixed the problem where
knfsd crashes on exported shmemfs objects and strict overcommit is set.

But the patch forgot supporting the case when CONFIG_SECURITY is

This patch copies a part of his fix which is mainly for detecting a bug

Acked-by: James Morris <>
Signed-off-by: Alan Cox <>
Signed-off-by: Junjiro R. Okajima <>
Signed-off-by: Andrew Morton <>
Signed-off-by: Linus Torvalds <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agonfsd: fix vm overcommit crash
Alan Cox [Wed, 29 Oct 2008 21:01:20 +0000 (14:01 -0700)]
nfsd: fix vm overcommit crash

commit 731572d39fcd3498702eda4600db4c43d51e0b26 upstream.

Junjiro R.  Okajima reported a problem where knfsd crashes if you are
using it to export shmemfs objects and run strict overcommit.  In this
situation the current->mm based modifier to the overcommit goes through a
NULL pointer.

We could simply check for NULL and skip the modifier but we've caught
other real bugs in the past from mm being NULL here - cases where we did
need a valid mm set up (eg the exec bug about a year ago).

To preserve the checks and get the logic we want shuffle the checking
around and add a new helper to the vm_ security wrappers

Also fix a current->mm reference in nommu that should use the passed mm

[ coding-style fixes]
[ fix build]
Reported-by: Junjiro R. Okajima <>
Acked-by: James Morris <>
Signed-off-by: Alan Cox <>
Signed-off-by: Andrew Morton <>
Signed-off-by: Linus Torvalds <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agoi2c-tiny-usb: Fix on big-endian systems
Jean Delvare [Fri, 5 Feb 2010 16:48:13 +0000 (17:48 +0100)]
i2c-tiny-usb: Fix on big-endian systems

commit 1c010ff8912cbc08d80e865aab9c32b6b00c527d upstream.

The functionality bit vector is always returned as a little-endian
32-bit number by the device, so it must be byte-swapped to the host

On the other hand, the delay value is handled by the USB stack, so no
byte swapping is needed on our side.
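
In the kernel this kind of conversion is done with le32_to_cpu().  A
portable userspace sketch of the same idea, assuming the four bytes
arrive in wire order (le32_bytes_to_host() is an illustrative name):

```c
#include <stdint.h>

/*
 * Assemble a host-order 32-bit value from little-endian bytes.
 * Works on both little-endian and big-endian hosts because it never
 * reinterprets the buffer as an integer.
 */
static uint32_t le32_bytes_to_host(const uint8_t b[4])
{
	return (uint32_t)b[0] |
	       ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) |
	       ((uint32_t)b[3] << 24);
}
```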

This fixes bug #15105:

Reported-by: Jens Richter <>
Signed-off-by: Jean Delvare <>
Tested-by: Jens Richter <>
Cc: Till Harbaum <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agoi2c-i801: Don't use the block buffer for I2C block writes
Jean Delvare [Sat, 13 Mar 2010 19:56:53 +0000 (20:56 +0100)]
i2c-i801: Don't use the block buffer for I2C block writes

commit c074c39d62306efa5ba7c69c1a1531bc7333d252 upstream.

Experience has shown that the block buffer can only be used for SMBus
(not I2C) block transactions, even though the datasheet doesn't
mention this limitation.

Reported-by: Felix Rubinstein <>
Signed-off-by: Jean Delvare <>
Cc: Oleg Ryjkov <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agohwmon: (w83781d) Request I/O ports individually for probing
Jean Delvare [Fri, 5 Feb 2010 18:58:36 +0000 (19:58 +0100)]
hwmon: (w83781d) Request I/O ports individually for probing

commit b0bcdd3cd0adb85a7686b396ba50493871b1135c upstream.

Different motherboards have different PNP declarations for
W83781D/W83782D chips. Some declare the whole range of I/O ports (8
ports), some declare only the useful ports (2 ports at offset 5) and
some declare fancy ranges, for example 4 ports at offset 4. To
properly handle all cases, request all ports individually for probing.
After we have determined that we really have a W83781D or W83782D
chip, the useful port range will be requested again, as a single

I did not see a board which needs this yet, but I know of one for lm78
driver and I'd like to keep the logic of these two drivers in sync.

Signed-off-by: Jean Delvare <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agosvc: Clean up deferred requests on transport destruction
Tom Tucker [Mon, 5 Jan 2009 21:21:19 +0000 (15:21 -0600)]
svc: Clean up deferred requests on transport destruction

commit 22945e4a1c7454c97f5d8aee1ef526c83fef3223 upstream.

A race between svc_revisit and svc_delete_xprt can result in
deferred requests holding references on a transport that can never be
recovered because dead transports are not enqueued for subsequent

Check for XPT_DEAD in revisit to clean up completing deferrals on a dead
transport and sweep a transport's deferred queue to do the same for queued
but unprocessed deferrals.

Signed-off-by: Tom Tucker <>
Signed-off-by: J. Bruce Fields <>
Cc: roma1390 <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agolibata: retry FS IOs even if it has failed with AC_ERR_INVALID
Tejun Heo [Thu, 14 Jan 2010 07:18:09 +0000 (16:18 +0900)]
libata: retry FS IOs even if it has failed with AC_ERR_INVALID

commit 534ead709235b967b659947c55d9130873a432c4 upstream.

libata currently doesn't retry if a command fails with AC_ERR_INVALID,
assuming that retrying won't get it any further.
However, a failure may be classified as invalid through a hardware
glitch (incorrect reading of the error register or firmware bug) and
there isn't a whole lot to gain by not retrying, as actually invalid
commands will be failed immediately.  Also, commands serving FS IOs
are extremely unlikely to be invalid.  Retry FS IOs even if they're
marked invalid.

A transient and incorrect invalid failure was seen while debugging a
firmware-related issue on Samsung n130 on bko#14314.

Signed-off-by: Tejun Heo <>
Reported-by: Johannes Stezenbach <>
Signed-off-by: Jeff Garzik <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agolibata: ensure NCQ error result taskfile is fully initialized before returning it...
Jeff Garzik [Fri, 23 Apr 2010 01:59:13 +0000 (21:59 -0400)]
libata: ensure NCQ error result taskfile is fully initialized before returning it via qc->result_tf.

commit a09bf4cd53b8ab000197ef81f15d50f29ecf973c upstream.

Signed-off-by: Jeff Garzik <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agoi2c: Fix probing of FSC hardware monitoring chips
Jean Delvare [Tue, 4 May 2010 09:09:28 +0000 (11:09 +0200)]
i2c: Fix probing of FSC hardware monitoring chips

commit b1d4b390ea4bb480e65974ce522a04022608a8df upstream.

Some FSC hardware monitoring chips (Syleus at least) don't like
quick writes we typically use to probe for I2C chips. Use a regular
byte read instead for the address they live at (0x73). These are the
only known chips living at this address on PC systems.

For clarity, this fix should not be needed for kernels 2.6.30 and
later, as we started instantiating the hwmon devices explicitly based
on DMI data. Still, this fix is valuable in the following two cases:
* Support for recent FSC chips on older kernels. The DMI-based device
  instantiation is more difficult to backport than the device support
* Case where the DMI-based device instantiation fails, whatever the
  reason. We fall back to probing in that case, so it should work.

This fixes kernel bug #15634:

Signed-off-by: Jean Delvare <>
Acked-by: Hans de Goede <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agoNFS: rsize and wsize settings ignored on v4 mounts
Chuck Lever [Thu, 22 Apr 2010 19:35:56 +0000 (15:35 -0400)]
NFS: rsize and wsize settings ignored on v4 mounts

commit 356e76b855bdbfd8d1c5e75bcf0c6bf0dfe83496 upstream.

NFSv4 mounts ignore the rsize and wsize mount options, and always use
the default transfer size for both.  This seems to be because all
NFSv4 mounts are now cloned, and the cloning logic doesn't copy the
rsize and wsize settings from the parent nfs_server.

I tested Fedora's and it seems to have this problem as
well, so I'm guessing that .33, .32, and perhaps older kernels have
this issue as well.

Signed-off-by: Chuck Lever <>
Signed-off-by: Trond Myklebust <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agonfs d_revalidate() is too trigger-happy with d_drop()
Al Viro [Thu, 29 Apr 2010 02:10:43 +0000 (03:10 +0100)]
nfs d_revalidate() is too trigger-happy with d_drop()

commit d9e80b7de91db05c1c4d2e5ebbfd70b3b3ba0e0f upstream.

If dentry found stale happens to be a root of disconnected tree, we
can't d_drop() it; its d_hash is actually part of s_anon and d_drop()
would simply hide it from shrink_dcache_for_umount(), leading to
all sorts of fun, including busy inodes on umount and oopsen after

Bug had been there since at least 2006 (commit c636eb already has it),
so it's definitely -stable fodder.

Signed-off-by: Al Viro <>
Signed-off-by: Linus Torvalds <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agoUSB: fix testing the wrong variable in fs_create_by_name()
Dan Carpenter [Thu, 22 Apr 2010 10:00:52 +0000 (12:00 +0200)]
USB: fix testing the wrong variable in fs_create_by_name()

commit fa7fe7af146a7b613e36a311eefbbfb5555325d1 upstream.

There is a typo here.  We should be testing "*dentry" which was just
assigned instead of "dentry".  This could result in dereferencing an
ERR_PTR inside either usbfs_mkdir() or usbfs_create().
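
A userspace sketch of the mistake, with simplified stand-ins for the
kernel's ERR_PTR()/IS_ERR() helpers.  err_ptr(), is_err(), struct
entry and create_entry() are all hypothetical; only the pattern
(check the pointer that was just stored, not the out-parameter) is
the point:

```c
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True if ptr falls in the top MAX_ERRNO addresses, i.e. encodes an error. */
static int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct entry {
	int id;
};

/* Stores either a valid object or an encoded error in *out. */
static void create_entry(int fail, struct entry **out)
{
	static struct entry e = { 1 };

	*out = fail ? err_ptr(-12) : &e;
}

static int create_checked(int fail, struct entry **out)
{
	create_entry(fail, out);
	/* Testing "out" would always pass: it is the caller's valid address. */
	if (is_err(*out))
		return -1;
	return 0;
}
```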

Signed-off-by: Dan Carpenter <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agonfsd4: bug in read_buf
Neil Brown [Tue, 20 Apr 2010 02:16:52 +0000 (12:16 +1000)]
nfsd4: bug in read_buf

commit 2bc3c1179c781b359d4f2f3439cb3df72afc17fc upstream.

When read_buf is called to move over to the next page in the pagelist
of an NFSv4 request, it sets argp->end to essentially a random
number, certainly not an address within the page which argp->p now
points to.  So subsequent calls to READ_BUF will think there is much
more than a page of spare space (the cast to u32 ensures an unsigned
comparison) so we can expect to fall off the end of the second

We never encountered this in testing because typically the only
operations which use more than two pages are write-like operations,
which have their own decoding logic.  Something like a getattr after a
write may cross a page boundary, but it would be very unusual for it to
cross another boundary after that.

Signed-off-by: J. Bruce Fields <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agoclockevent: Prevent dead lock on clockevents_lock
Suresh Siddha [Thu, 22 Apr 2010 09:47:51 +0000 (11:47 +0200)]
clockevent: Prevent dead lock on clockevents_lock

This is a merge of two mainline commits, intended
for submission for 2.6.27 kernel.

commit f833bab87fca5c3ce13778421b1365845843b976

commit 918aae42aa9b611a3663b16ae849fdedc67c2292
Changelog of both:

    Currently clockevents_notify() is called with interrupts enabled at
    some places and interrupts disabled at some other places.

    This results in a deadlock in this scenario.

    cpu A holds clockevents_lock in clockevents_notify() with irqs enabled
    cpu B waits for clockevents_lock in clockevents_notify() with irqs disabled
    cpu C doing set_mtrr() which will try to rendezvous all the cpus.

    This will result in C and A coming to the rendezvous point and waiting
    for B. B is stuck forever waiting for the spinlock and thus not
    reaching the rendezvous point.

    Fix the clockevents code so that clockevents_lock is taken with
    interrupts disabled and thus avoid the above deadlock.

    Also call lapic_timer_propagate_broadcast() on the destination cpu so
    that we avoid calling smp_call_function() in the clockevents notifier

    This issue left us wondering if we need to change the MTRR rendezvous
    logic to use stop machine logic (instead of smp_call_function) or add
    a check in spinlock debug code to see if there are other spinlocks
    which gets taken under both interrupts enabled/disabled conditions.

Signed-off-by: Suresh Siddha <>
Cc: "Brown Len" <>
    LKML-Reference: <>
Signed-off-by: Thomas Gleixner <>
    I got following warning on ia64 box:
      In function 'acpi_processor_power_verify':
      642: warning: passing argument 2 of 'smp_call_function_single' from
      incompatible pointer type

    This smp_call_function_single() was introduced by a commit

    The problem is that the lapic_timer_propagate_broadcast() has 2 versions:
    One is real code that was modified in the above commit, and the other is NOP
    code that is used when !ARCH_APICTIMER_STOPS_ON_C3:

      static void lapic_timer_propagate_broadcast(struct acpi_processor *pr) { }

    So I got warning because of !ARCH_APICTIMER_STOPS_ON_C3.

    We really want to do nothing here on !ARCH_APICTIMER_STOPS_ON_C3, so
    modify lapic_timer_propagate_broadcast() of real version to use
    smp_call_function_single() in it.

Signed-off-by: Hidetoshi Seto <>
Acked-by: Suresh Siddha <>
Signed-off-by: Len Brown <>
Signed-off-by: Thomas Renninger <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agotrace: Fix inappropriate substraction on tracing_pages_allocated in trace_free_page()
Wang Sheng-Hui [Tue, 13 Apr 2010 13:04:10 +0000 (21:04 +0800)]
trace: Fix inappropriate substraction on tracing_pages_allocated in trace_free_page()

[No matching upstream git commit id as it was fixed differently due to a
rewrite of the tracing code there.]

For the normal case, the code in trace_free_page() does one more subtraction
on tracing_pages_allocated, but for CONFIG_TRACER_MAX_TRACE it doesn't
take the freed page into account. That's not consistent with
trace_alloc_page().  Since no message is printed about this, we cannot
observe the incorrect state when the kernel doesn't define
"CONFIG_TRACER_MAX_TRACE". If you add some pr_info() calls, as in
trace_alloc_page(), you may notice it.

Cc: Steven Rostedt <>
Cc: Frederic Weisbecker <>
Cc: Ingo Molnar <>
Cc: Li Zefan <>
Signed-off-by: Wang Sheng-Hui <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agomegaraid_sas: fix for 32bit apps
Tomas Henzl [Thu, 11 Feb 2010 17:01:50 +0000 (18:01 +0100)]
megaraid_sas: fix for 32bit apps

commit b3dc1a212e5167984616445990c76056034f8eeb upstream.

It looks like this patch -

commit 7b2519afa1abd1b9f63aa1e90879307842422dae
Author: Yang, Bo <>
Date:   Tue Oct 6 14:52:20 2009 -0600

    [SCSI] megaraid_sas: fix 64 bit sense pointer truncation

has caused a problem for 32bit programs with 64bit os -

fix by converting the user space 32bit pointer to a 64 bit one when

[jejb: fix up some 64 bit warnings]
Signed-off-by: Tomas Henzl <>
Cc: Bo Yang <>
Signed-off-by: James Bottomley <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agotty: release_one_tty() forgets to put pids
Oleg Nesterov [Fri, 2 Apr 2010 16:05:12 +0000 (18:05 +0200)]
tty: release_one_tty() forgets to put pids

commit 6da8d866d0d39e9509ff826660f6a86a6757c966 upstream.

release_one_tty(tty) can be called when tty still has a reference
to pgrp/session. In this case we leak the pid.

Signed-off-by: Oleg Nesterov <>
Reported-by: Catalin Marinas <>
Reported-and-tested-by: Tetsuo Handa <>
Acked-by: Linus Torvalds <>
Acked-by: Eric W. Biederman <>
Signed-off-by: Linus Torvalds <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agovfs: Remove the range_cont writeback mode.
Aneesh Kumar K.V [Tue, 16 Mar 2010 00:26:02 +0000 (20:26 -0400)]
vfs: Remove the range_cont writeback mode.

commit 74baaaaec8b4f22e1ae279f5ecca4ff705b28912 upstream.

Ext4 was the only user of range_cont writeback mode and ext4 switched
to a different method. So remove the range_cont mode which is not used
in the kernel.

Signed-off-by: Aneesh Kumar K.V <>
Signed-off-by: "Theodore Ts'o" <>
Signed-off-by: Jayson R. King <>
Signed-off-by: Theodore Ts'o <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agoext4: Use tag dirty lookup during mpage_da_submit_io
Aneesh Kumar K.V [Tue, 16 Mar 2010 00:26:01 +0000 (20:26 -0400)]
ext4: Use tag dirty lookup during mpage_da_submit_io

commit af6f029d3836eb7264cd3fbb13a6baf0e5fdb5ea upstream.

This enables us to drop the range_cont writeback mode
use from ext4_da_writepages.

Signed-off-by: Aneesh Kumar K.V <>
Signed-off-by: Jayson R. King <>
Signed-off-by: Theodore Ts'o <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agoext4: Retry block allocation if we have free blocks left
Aneesh Kumar K.V [Tue, 16 Mar 2010 00:26:00 +0000 (20:26 -0400)]
ext4: Retry block allocation if we have free blocks left

commit df22291ff0fde0d350cf15dac3e5cc33ac528875 upstream.

When we truncate files, the meta-data blocks released are not reused
until we commit the truncate transaction.  That means a delayed get_block
request will return ENOSPC even if we have free blocks left.  Force a
journal commit and retry block allocation if we get ENOSPC with free
blocks left.
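
The retry logic can be modelled roughly as follows.  This is a toy
model, not ext4 code: struct fs, commit(), alloc_block() and
alloc_with_retry() are hypothetical, and "pending_free" stands for
blocks released by a truncate whose transaction is not yet committed:

```c
#include <errno.h>
#include <stddef.h>

struct fs {
	size_t free_blocks;	/* usable right now */
	size_t pending_free;	/* released, but only usable after a commit */
};

/* Committing makes the released blocks available again. */
static void commit(struct fs *fs)
{
	fs->free_blocks += fs->pending_free;
	fs->pending_free = 0;
}

static int alloc_block(struct fs *fs)
{
	if (fs->free_blocks == 0)
		return -ENOSPC;
	fs->free_blocks--;
	return 0;
}

/* On ENOSPC, force a commit and retry once if blocks are pending. */
static int alloc_with_retry(struct fs *fs)
{
	int ret = alloc_block(fs);

	if (ret == -ENOSPC && fs->pending_free > 0) {
		commit(fs);
		ret = alloc_block(fs);
	}
	return ret;
}
```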

Signed-off-by: Aneesh Kumar K.V <>
Signed-off-by: Mingming Cao <>
Signed-off-by: "Theodore Ts'o" <>
Signed-off-by: Jayson R. King <>
Signed-off-by: Theodore Ts'o <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agoext4: Retry block reservation
Aneesh Kumar K.V [Tue, 16 Mar 2010 00:25:59 +0000 (20:25 -0400)]
ext4: Retry block reservation

commit 030ba6bc67b4f2bc5cd174f57785a1745c929abe upstream.

During block reservation if we don't have enough blocks left, retry
block reservation with smaller block counts.  This makes sure we try
fallocate and DIO with smaller request size and don't fail early.  The
delayed allocation reservation cannot try with smaller block count. So
retry block reservation to handle temporary disk full conditions.  Also
print free blocks details if we fail block allocation during writepages.
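
A rough sketch of retrying with smaller block counts.  The halving
step is an assumption made for the example (the text above only says
"smaller block counts"), and struct pool, try_reserve() and
reserve_with_retry() are hypothetical names:

```c
#include <stddef.h>

struct pool {
	size_t free_blocks;
};

/* All-or-nothing reservation of n blocks. */
static int try_reserve(struct pool *p, size_t n)
{
	if (n > p->free_blocks)
		return -1;
	p->free_blocks -= n;
	return 0;
}

/* Returns the number of blocks actually reserved, 0 if none. */
static size_t reserve_with_retry(struct pool *p, size_t want)
{
	while (want > 0) {
		if (try_reserve(p, want) == 0)
			return want;
		want /= 2;	/* shrink the request and try again */
	}
	return 0;
}
```

The caller has to cope with getting fewer blocks than requested, which
is why this only suits paths such as fallocate and DIO that can work
with a smaller request size.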

Signed-off-by: Aneesh Kumar K.V <>
Signed-off-by: Mingming Cao <>
Signed-off-by: "Theodore Ts'o" <>
Signed-off-by: Jayson R. King <>
Signed-off-by: Theodore Ts'o <>
Signed-off-by: Greg Kroah-Hartman <>
8 years agoext4: Add percpu dirty block accounting.
Aneesh Kumar K.V [Tue, 16 Mar 2010 00:25:58 +0000 (20:25 -0400)]
ext4: Add percpu dirty block accounting.

commit 6bc6e63fcd7dac9e633ea29f1fddd9580ab28f3f upstream.

This patch adds dirty block accounting using percpu_counters.  Delayed
allocation block reservation is now done by updating dirty block
counter.  In a later patch we switch to non delalloc mode if the
filesystem free blocks is greater than 150% of total filesystem dirty

Signed-off-by: Aneesh Kumar K.V <>
Signed-off-by: Mingming Cao <>
Signed-off-by: "Theodore Ts'o" <>
Signed-off-by: Jayson R. King <>
Signed-off-by: Theodore Ts'o <>
Signed-off-by: Greg Kroah-Hartman <>