This patch change the kernel VSID range so that we limit VSID_BITS to 37.
This enables us to support 64TB with 65 bit VA (37+28). Without this patch
we have boot hangs on platforms that only support 65 bit VA.
With this patch we now have proto vsid generated as below:
We first generate a 37-bit "proto-VSID". Proto-VSIDs are generated
from mmu context id and effective segment id of the address.
For user processes max context id is limited to ((1ul << 19) - 5)
for kernel space, we use the top 4 context ids to map address as below
0x7fffc - [ 0xc000000000000000 - 0xc0003fffffffffff ]
0x7fffd - [ 0xd000000000000000 - 0xd0003fffffffffff ]
0x7fffe - [ 0xe000000000000000 - 0xe0003fffffffffff ]
0x7ffff - [ 0xf000000000000000 - 0xf0003fffffffffff ]
Michael Neuling [Mon, 11 Mar 2013 16:42:49 +0000 (16:42 +0000)]
powerpc/ptrace: Fix brk.len used uninitialised
With some CONFIGS it's possible that in ppc_set_hwdebug, brk.len is
uninitialised before being used. It has been reported that GCC 4.2 will
produce the following error in this case:
arch/powerpc/kernel/ptrace.c:1479: warning: 'brk.len' is used uninitialized in this function
arch/powerpc/kernel/ptrace.c:1381: note: 'brk.len' was declared here
Liu Bo [Fri, 15 Mar 2013 14:46:39 +0000 (08:46 -0600)]
Btrfs: fix warning of free_extent_map
Users report that an extent map's list is still linked when it's actually
going to be freed from cache.
The story is that
a) when we're going to drop an extent map and may split this large one into
smaller ems, and if this large one is flagged as EXTENT_FLAG_LOGGING which means
that it's on the list to be logged, then the smaller ems split from it will also
be flagged as EXTENT_FLAG_LOGGING, and this is _not_ expected.
b) we'll keep ems from unlinking the list and freeing when they are flagged with
EXTENT_FLAG_LOGGING, because the log code holds one reference.
The end result is the warning, but the truth is that we set the flag
EXTENT_FLAG_LOGGING only during fsync.
So clear flag EXTENT_FLAG_LOGGING for extent maps split from a large one.
Linus Torvalds [Sat, 16 Mar 2013 01:06:55 +0000 (18:06 -0700)]
Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
Pull kbuild fix from Michal Marek:
"One fix for for make headers_install/headers_check to not require make
3.81. The requirement has been accidentally introduced in 3.7."
* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
kbuild: fix make headers_check with make 3.80
Linus Torvalds [Sat, 16 Mar 2013 01:05:37 +0000 (18:05 -0700)]
Merge tag 'for-3.9-rc3' of git://openrisc.net/jonas/linux
Pull OpenRISC bug fixes from Jonas Bonn:
- The GPIO descriptor work has exposed how broken the non-GPIOLIB bits
for OpenRISC were. We now require GPIOLIB as this is the preferred
way forward.
- The system.h split introduced a bug in llist.h for arches using
asm-generic/cmpxchg.h directly, which is currently only OpenRISC.
The patch here moves two defines from asm-generic/atomic.h to
asm-generic/cmpxchg.h to make things work as they should.
- The VIRT_TO_BUS selector was added for OpenRISC, but OpenRISC does
not have the virt_to_bus methods, so there's a patch to remove it
again.
* tag 'for-3.9-rc3' of git://openrisc.net/jonas/linux:
openrisc: remove HAVE_VIRT_TO_BUS
asm-generic: move cmpxchg*_local defs to cmpxchg.h
openrisc: require gpiolib
Linus Torvalds [Sat, 16 Mar 2013 01:04:38 +0000 (18:04 -0700)]
Merge tag 'char-misc-3.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc fixes from Greg Kroah-Hartman:
"Here are some tiny fixes for the w1 drivers and the final removal
patch for getting rid of CONFIG_EXPERIMENTAL (all users of it are now
gone from your tree, this just drops the Kconfig item itself.)
All have been in the linux-next tree for a while"
* tag 'char-misc-3.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
final removal of CONFIG_EXPERIMENTAL
w1: fix oops when w1_search is called from netlink connector
w1-gpio: fix unused variable warning
w1-gpio: remove erroneous __exit and __exit_p()
ARM: w1-gpio: fix erroneous gpio requests
Linus Torvalds [Sat, 16 Mar 2013 00:35:49 +0000 (17:35 -0700)]
Merge tag 'sound-3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A collection of small fixes, as expected for the middle rc:
- A couple of fixes for potential NULL dereferences and out-of-range
array accesses revealed by static code parsers
- A fix for the wrong error handling detected by trinity
- A regression fix for missing audio on some MacBooks
- CA0132 DSP loader fixes
- Fix for EAPD control of IDT codecs on machines w/o speaker
- Fix a regression in the HD-audio widget list parser code
- Workaround for the NuForce UDH-100 USB audio"
* tag 'sound-3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Fix missing EAPD/GPIO setup for Cirrus codecs
sound: sequencer: cap array index in seq_chn_common_event()
ALSA: hda/ca0132 - Remove extra setting of dsp_state.
ALSA: hda/ca0132 - Check download state of DSP.
ALSA: hda/ca0132 - Check if dspload_image succeeded.
ALSA: hda - Disable IDT eapd_switch if there are no internal speakers
ALSA: hda - Fix snd_hda_get_num_raw_conns() to return a correct value
ALSA: usb-audio: add a workaround for the NuForce UDH-100
ALSA: asihpi - fix potential NULL pointer dereference
ALSA: seq: Fix missing error handling in snd_seq_timer_open()
Linus Torvalds [Sat, 16 Mar 2013 00:35:03 +0000 (17:35 -0700)]
Merge branch 'fixes-for-3.9' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping
Pull DMA-mapping fix from Marek Szyprowski:
"An important fix for all ARM architectures which use ZONE_DMA.
Without it dma_alloc_* calls with GFP_ATOMIC flag might have allocated
buffers outsize DMA zone."
* 'fixes-for-3.9' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
ARM: DMA-mapping: add missing GFP_DMA flag for atomic buffer allocation
Linus Torvalds [Sat, 16 Mar 2013 00:34:01 +0000 (17:34 -0700)]
Merge tag 'mfd-fixes-3.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-fixes
Pull MFD fixes from Samuel Ortiz:
"This is the first batch of MFD fixes for 3.9.
With this one we have:
- An ab8500 build failure fix.
- An ab8500 device tree parsing fix.
- A fix for twl4030_madc remove routine to work properly (when
built-in).
- A fix for properly registering palmas interrupt handler.
- A fix for omap-usb init routine to actually write into the
hostconfig register.
- A couple of warning fixes for ab8500-gpadc and tps65912"
* tag 'mfd-fixes-3.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-fixes:
mfd: twl4030-madc: Remove __exit_p annotation
mfd: ab8500: Kill "reg" property from binding
mfd: ab8500-gpadc: Complain if we fail to enable vtvout LDO
mfd: wm831x: Don't forward declare enum wm831x_auxadc
mfd: twl4030-audio: Fix argument type for twl4030_audio_disable_resource()
mfd: tps65912: Declare and use tps65912_irq_exit()
mfd: palmas: Provide irq flags through DT/platform data
mfd: Make AB8500_CORE select POWER_SUPPLY to fix build error
mfd: omap-usb-host: Actually update hostconfig
Linus Torvalds [Sat, 16 Mar 2013 00:33:13 +0000 (17:33 -0700)]
Merge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
"Bug fixes for pmbus, ltc2978, and lineage-pem drivers
Added specific maintainer for some hwmon drivers"
* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (pmbus/ltc2978) Fix temperature reporting
hwmon: (pmbus) Fix krealloc() misuse in pmbus_add_attribute()
hwmon: (lineage-pem) Add missing terminating entry for pem_[input|fan]_attributes
MAINTAINERS: Add maintainer for MAX6697, INA209, and INA2XX drivers
Make this depend on CONFIG_PM. This protection is necessary to not
cause any build errors with any combination of PM features especially
when supporting a new SoC where each PM features are being enabled
one-by-one during its depelopment.
ARM: 7674/1: smp: Avoid dummy clockevent being preferred over real hardware clock-event
With recent arm broadcast time clean-up from Mark Rutland, the dummy
broadcast device is always registered with timer subsystem. And since
the rating of the dummy clock event is very high, it may be preferred
over a real clock event.
This is a change in behavior from past and not an intended one. So
reduce the rating of the dummy clock-event so that real clock-event
device is selected when available.
Stephane Eranian [Fri, 15 Mar 2013 13:26:07 +0000 (14:26 +0100)]
perf,x86: fix kernel crash with PEBS/BTS after suspend/resume
This patch fixes a kernel crash when using precise sampling (PEBS)
after a suspend/resume. Turns out the CPU notifier code is not invoked
on CPU0 (BP). Therefore, the DS_AREA (used by PEBS) is not restored properly
by the kernel and keeps it power-on/resume value of 0 causing any PEBS
measurement to crash when running on CPU0.
The workaround is to add a hook in the actual resume code to restore
the DS Area MSR value. It is invoked for all CPUS. So for all but CPU0,
the DS_AREA will be restored twice but this is harmless.
Sascha Hauer [Tue, 26 Feb 2013 09:55:18 +0000 (10:55 +0100)]
ARM: i.MX35: enable MAX clock
The i.MX35 has two bits per clock gate which are decoded as follows:
0b00 -> clock off
0b01 -> clock is on in run mode, off in wait/doze
0b10 -> clock is on in run/wait mode, off in doze
0b11 -> clock is always on
The reset value for the MAX clock is 0b10.
The MAX clock is needed by the SoC, yet unused in the Kernel, so the
common clock framework will disable it during late init time. It will
only disable clocks though which it detects as being turned on. This
detection is made depending on the lower bit of the gate. If the reset
value has been altered by the bootloader to 0b11 the clock framework
will detect the clock as turned on, yet unused, hence it will turn it
off and the system locks up.
This patch turns the MAX clock on unconditionally making the Kernel
independent of the bootloader.
Takashi Iwai [Fri, 15 Mar 2013 13:23:32 +0000 (14:23 +0100)]
ALSA: hda - Fix missing EAPD/GPIO setup for Cirrus codecs
During the transition to the generic parser, the hook to the codec
specific automute function was forgotten. This resulted in the silent
output on some MacBooks.
Zhouyi Zhou [Thu, 14 Mar 2013 17:21:50 +0000 (17:21 +0000)]
Fix dst_neigh_lookup/dst_neigh_lookup_skb return value handling bug
When neighbour table is full, dst_neigh_lookup/dst_neigh_lookup_skb will return
-ENOBUFS which is absolutely non zero, while all the code in kernel which use
above functions assume failure only on zero return which will cause panic. (for
example: : https://bugzilla.kernel.org/show_bug.cgi?id=54731).
This patch corrects above error with smallest changes to kernel source code and
also correct two return value check missing bugs in drivers/infiniband/hw/cxgb4/cm.c
David S. Miller [Fri, 15 Mar 2013 13:00:39 +0000 (09:00 -0400)]
Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch
Jesse Gross says:
====================
A few different bug fixes, including several for issues with userspace
communication that have gone unnoticed up until now. These are intended
for net/3.9.
====================
Georg Hofmann [Thu, 14 Mar 2013 06:54:09 +0000 (06:54 +0000)]
net: fec: fix missing napi_disable call
Commit dc975382d2ef36be7e78fac3717927de1a5abcd8 introduces napi support
but never calls napi_disable. This will generate a kernel oops
(kernel BUG at include/linux/netdevice.h:473!) every time, when
ndo_stop is called followed by ndo_start.
Add the missing napi_diable call.
Lucas Stach [Thu, 14 Mar 2013 05:12:01 +0000 (05:12 +0000)]
net: fec: restart the FEC when PHY speed changes
Proviously we would only restart the FEC when PHY link or duplex state
changed. PHY does not always bring down the link for speed changes, in
which case we would not detect any change and keep FEC running.
Switching link speed without restarting the FEC results in the FEC being
stuck in an indefinite state, generating error conditions for every
packet.
Haojian Zhuang [Fri, 15 Mar 2013 08:27:53 +0000 (16:27 +0800)]
ARM: mmp: add platform_device head file in gplugd
arch/arm/mach-mmp/gplugd.c: In function ‘gplugd_init’:
arch/arm/mach-mmp/gplugd.c:188:2: error: implicit declaration of
function ‘platform_device_register’
[-Werror=implicit-function-declaration]
cc1: some warnings being treated as errors
make[1]: *** [arch/arm/mach-mmp/gplugd.o] Error 1
make: *** [arch/arm/mach-mmp] Error 2
So append platform_device.h to resolve build issue.
Dan Carpenter [Fri, 15 Mar 2013 06:14:22 +0000 (09:14 +0300)]
sound: sequencer: cap array index in seq_chn_common_event()
"chn" here is a number between 0 and 255, but ->chn_info[] only has
16 elements so there is a potential write beyond the end of the
array.
If the seq_mode isn't SEQ_2 then we let the individual drivers
(either opl3.c or midi_synth.c) handle it. Those functions all
do a bounds check on "chn" so I haven't changed anything here.
The opl3.c driver has up to 18 channels and not 16.
Arnd Bergmann [Thu, 14 Mar 2013 21:56:38 +0000 (22:56 +0100)]
mfd: twl4030-madc: Remove __exit_p annotation
4740f73fe5 "mfd: remove use of __devexit" removed the __devexit annotation
on the twl4030_madc_remove function, but left an __exit_p() present on the
pointer to this function. Using __exit_p was as wrong with the devexit in
place as it is now, but now we get a gcc warning about an unused function.
In order for the twl4030_madc_remove to work correctly in built-in code, we
have to remove the __exit_p.
Dylan Reid [Fri, 15 Mar 2013 00:27:45 +0000 (17:27 -0700)]
ALSA: hda/ca0132 - Check download state of DSP.
Instead of using the dspload_is_loaded() function, check the dsp_state
that is kept in the spec. The dspload_is_loaded() function returns
true if the DSP transfer was never started. This false-positive leads
to multiple second delays when ca0132_setup_efaults() times out on
each write.
Dylan Reid [Fri, 15 Mar 2013 00:27:44 +0000 (17:27 -0700)]
ALSA: hda/ca0132 - Check if dspload_image succeeded.
If dspload_image() fails, it was ignored and dspload_wait_loaded() was
still called. dsp_loaded should never be set to true in this case,
skip it. The check in dspload_wait_loaded() return true if the DSP is
loaded or if it never started.
The vm_flags introduced in 6d7825b10dbe ("mm/fremap.c: fix oops on error
path") is supposed to avoid a compiler warning about unitialized
vm_flags without changing the generated code.
However I am concerned that this is going to be very brittle, and fail
with some compiler versions. The failure could be either of:
- compiler could actually load vma->vm_flags before checking for the
!vma condition, thus reintroducing the oops
- compiler could optimize out the !vma check, since the pointer just got
dereferenced shortly before (so the compiler knows it can't be NULL!)
I propose reversing this part of the change and initializing vm_flags to 0
just to avoid the bogus uninitialized use warning.
Arnd Bergmann [Thu, 14 Mar 2013 23:00:50 +0000 (00:00 +0100)]
Merge at91 lcdfb bug fixes into fixes
These are part of a longer series that has been submitted some time
ago for the frame buffer tree, but it was never accepted there.
The first two of the five patches are bug fixes, so let's merge
this through arm-soc to get a working 3.9 kernel for at91.
Linus Torvalds [Thu, 14 Mar 2013 21:53:07 +0000 (14:53 -0700)]
Merge branch 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull fix for hlist_entry_safe() regression from Paul McKenney:
"This contains a single commit that fixes a regression in
hlist_entry_safe(). This macro references its argument twice, which
can cause NULL-pointer errors. This commit applies a gcc statement
expression, creating a temporary variable to avoid the double
reference. This has been posted to LKML at
https://lkml.org/lkml/2013/3/9/75.
Kudos to CAI Qian, whose testing uncovered this, to Eric Dumazet, who
spotted root cause, and to Li Zefan, who tested this commit."
* 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
list: Fix double fetch of pointer in hlist_entry_safe()
Arnd Bergmann [Wed, 13 Feb 2013 16:11:09 +0000 (17:11 +0100)]
input/joystick: use get_cycles on ARM
ARM normally has an accurate clock source, so
we can theoretically use analog joysticks more
accurately and at the same time avoid the
build warning
#warning Precise timer not defined for this architecture.
from the joystick driver.
Now, why anybody would use that driver no ARM I have no
idea, but Ben Dooks enabled it in the s3c2410_defconfig
along with a bunch of other drivers, even though that
platform has neither ISA nor PCI support. It still
seems to be the right thing to fix this quirk.
Arnd Bergmann [Thu, 14 Feb 2013 22:11:33 +0000 (23:11 +0100)]
[media] s5p-fimc: fix s5pv210 build
56bc911 "[media] s5p-fimc: Redefine platform data structure for fimc-is"
changed the bus_type member of struct fimc_source_info treewide, but
got one instance wrong in mach-s5pv210, which was evidently not
even build tested.
This adds the missing change to get s5pv210_defconfig to build again.
Applies on the Mauro's media tree.
list: Fix double fetch of pointer in hlist_entry_safe()
The current version of hlist_entry_safe() fetches the pointer twice,
once to test for NULL and the other to compute the offset back to the
enclosing structure. This is OK for normal lock-based use because in
that case, the pointer cannot change. However, when the pointer is
protected by RCU (as in "rcu_dereference(p)"), then the pointer can
change at any time. This use case can result in the following sequence
of events:
1. CPU 0 invokes hlist_entry_safe(), fetches the RCU-protected
pointer as sees that it is non-NULL.
2. CPU 1 invokes hlist_del_rcu(), deleting the entry that CPU 0
just fetched a pointer to. Because this is the last entry
in the list, the pointer fetched by CPU 0 is now NULL.
3. CPU 0 refetches the pointer, obtains NULL, and then gets a
NULL-pointer crash.
This commit therefore applies gcc's "({ })" statement expression to
create a temporary variable so that the specified pointer is fetched
only once, avoiding the above sequence of events. Please note that
it is the caller's responsibility to use rcu_dereference() as needed.
This allows RCU-protected uses to work correctly without imposing
any additional overhead on the non-RCU case.
Many thanks to Eric Dumazet for spotting root cause!
Linus Torvalds [Thu, 14 Mar 2013 19:11:28 +0000 (12:11 -0700)]
Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull ext2, ext3, reiserfs, quota fixes from Jan Kara:
"A fix for regression in ext2, and a format string issue in ext3. The
rest isn't too serious."
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
ext2: Fix BUG_ON in evict() on inode deletion
reiserfs: Use kstrdup instead of kmalloc/strcpy
ext3: Fix format string issues
quota: add missing use of dq_data_lock in __dquot_initialize
Liu Bo [Wed, 13 Mar 2013 13:43:03 +0000 (07:43 -0600)]
Btrfs: fix warning when creating snapshots
Creating snapshot passes extent_root to commit its transaction,
but it can lead to the warning of checking root for quota in
the __btrfs_end_transaction() when someone else is committing
the current transaction. Since we've recorded the needed root
in trans_handle, just use it to get rid of the warning.
Josef Bacik [Fri, 8 Mar 2013 20:41:02 +0000 (15:41 -0500)]
Btrfs: return EIO if we have extent tree corruption
The callers of lookup_inline_extent_info all handle getting an error back
properly, so return an error if we have corruption instead of being a jerk and
panicing. Still WARN_ON() since this is kind of crucial and I've been seeing it
a bit too much recently for my taste, I think we're doing something wrong
somewhere. Thanks,
Eric Sandeen [Sat, 9 Mar 2013 15:18:39 +0000 (15:18 +0000)]
btrfs: use rcu_barrier() to wait for bdev puts at unmount
Doing this would reliably fail with -EBUSY for me:
# mount /dev/sdb2 /mnt/scratch; umount /mnt/scratch; mkfs.btrfs -f /dev/sdb2
...
unable to open /dev/sdb2: Device or resource busy
because mkfs.btrfs tries to open the device O_EXCL, and somebody still has it.
Using systemtap to track bdev gets & puts shows a kworker thread doing a
blkdev put after mkfs attempts a get; this is left over from the unmount
path:
so unmount might complete before __free_device fires & does its blkdev_put.
Adding an rcu_barrier() to btrfs_close_devices() causes unmount to wait
until all blkdev_put()s are done, and the device is truly free once
unmount completes.
Guenter Roeck [Thu, 21 Feb 2013 18:27:54 +0000 (10:27 -0800)]
hwmon: (pmbus/ltc2978) Fix temperature reporting
On LTC2978, only READ_TEMPERATURE is supported. It reports
the internal junction temperature. This register is unpaged.
On LTC3880, READ_TEMPERATURE and READ_TEMPERATURE2 are supported.
READ_TEMPERATURE is paged and reports external temperatures.
READ_TEMPERATURE2 is unpaged and reports the internal junction
temperature.
Pavel Emelyanov [Thu, 14 Mar 2013 03:29:40 +0000 (03:29 +0000)]
skb: Propagate pfmemalloc on skb from head page only
Hi.
I'm trying to send big chunks of memory from application address space via
TCP socket using vmsplice + splice like this
mem = mmap(128Mb);
vmsplice(pipe[1], mem); /* splice memory into pipe */
splice(pipe[0], tcp_socket); /* send it into network */
When I'm lucky and a huge page splices into the pipe and then into the socket
_and_ client and server ends of the TCP connection are on the same host,
communicating via lo, the whole connection gets stuck! The sending queue
becomes full and app stops writing/splicing more into it, but the receiving
queue remains empty, and that's why.
The __skb_fill_page_desc observes a tail page of a huge page and erroneously
propagates its page->pfmemalloc value onto socket (the pfmemalloc on tail pages
contain garbage). Then this skb->pfmemalloc leaks through lo and due to the
no packets reach the socket. Even TCP re-transmits are dropped by this, as skb
cloning clones the pfmemalloc flag as well.
That said, here's the proper page->pfmemalloc propagation onto socket: we
must check the huge-page's head page only, other pages' pfmemalloc and mapping
values do not contain what is expected in this place. However, I'm not sure
whether this fix is _complete_, since pfmemalloc propagation via lo also
oesn't look great.
Both, bit propagation from page to skb and this check in sk_filter, were
introduced by c48a11c7 (netvm: propagate page->pfmemalloc to skb), in v3.5 so
Mel and stable@ are in Cc.
commit a21d45726acac (tcp: avoid order-1 allocations on wifi and tx
path) did a poor choice adding an 'avail_size' field to skb, while
what we really needed was a 'reserved_tailroom' one.
It would have avoided commit 22b4a4f22da (tcp: fix retransmit of
partially acked frames) and this commit.
Crash occurs because skb_split() is not aware of the 'avail_size'
management (and should not be aware)
The original documentation for NAND_NO_READRDY included "True for all
large page devices, as they do not support autoincrement." I was
conflating "not support autoincrement" with the NAND_NO_AUTOINCR option,
which was in fact doing nothing. So, when I dropped NAND_NO_AUTOINCR, I
concluded that I then could harmlessly drop NAND_NO_READRDY. But of
course the fact the NAND_NO_AUTOINCR was doing nothing didn't mean
NAND_NO_READRDY was doing nothing...
So, NAND_NO_READRDY is re-introduced as NAND_NEED_READRDY and applied
only to those few remaining small-page NAND which needed it in the first
place.
Marek Szyprowski [Tue, 26 Feb 2013 06:46:24 +0000 (07:46 +0100)]
ARM: DMA-mapping: add missing GFP_DMA flag for atomic buffer allocation
Atomic pool should always be allocated from DMA zone if such zone is
available in the system to avoid issues caused by limited dma mask of
any of the devices used for making an atomic allocation.
Takuya Yoshikawa [Tue, 12 Mar 2013 08:45:30 +0000 (17:45 +0900)]
KVM: x86: Optimize mmio spte zapping when creating/moving memslot
When we create or move a memory slot, we need to zap mmio sptes.
Currently, zap_all() is used for this and this is causing two problems:
- extra page faults after zapping mmu pages
- long mmu_lock hold time during zapping mmu pages
For the latter, Marcelo reported a disastrous mmu_lock hold time during
hot-plug, which made the guest unresponsive for a long time.
This patch takes a simple way to fix these problems: do not zap mmu
pages unless they are marked mmio cached. On our test box, this took
only 50us for the 4GB guest and we did not see ms of mmu_lock hold time
any more.
Note that we still need to do zap_all() for other cases. So another
work is also needed: Xiao's work may be the one.
Jan Kiszka [Wed, 13 Mar 2013 10:31:24 +0000 (11:31 +0100)]
KVM: nVMX: Add preemption timer support
Provided the host has this feature, it's straightforward to offer it to
the guest as well. We just need to load to timer value on L2 entry if
the feature was enabled by L1 and watch out for the corresponding exit
reason.
Jan Kiszka [Wed, 13 Mar 2013 15:06:41 +0000 (16:06 +0100)]
KVM: nVMX: Provide EFER.LMA saving support
We will need EFER.LMA saving to provide unrestricted guest mode. All
what is missing for this is picking up EFER.LMA from VM_ENTRY_CONTROLS
on L2->L1 switches. If the host does not support EFER.LMA saving,
no change is performed, otherwise we properly emulate for L1 what the
hardware does for L0. Advertise the support, depending on the host
feature.
Linus Torvalds [Wed, 13 Mar 2013 22:47:50 +0000 (15:47 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull namespace bugfixes from Eric Biederman:
"This tree includes a partial revert for "fs: Limit sys_mount to only
request filesystem modules." When I added the new style module aliases
to the filesystems I deleted the old ones. A bad move. It turns out
that distributions like Arch linux use module aliases when
constructing ramdisks. Which meant ultimately that an ext3 filesystem
mounted with ext4 would not result in the ext4 module being put into
the ramdisk.
The other change in this tree adds a handful of filesystem module
alias I simply failed to add the first time. Which inconvinienced a
few folks using cifs.
I don't want to inconvinience folks any longer than I have to so here
are these trivial fixes."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
fs: Readd the fs module aliases.
fs: Limit sys_mount to only request filesystem modules. (Part 3)
Tejun Heo [Wed, 13 Mar 2013 21:59:49 +0000 (14:59 -0700)]
idr: idr_alloc() shouldn't trigger lowmem warning when preloaded
GFP_NOIO is often used for idr_alloc() inside preloaded section as the
allocation mask doesn't really matter. If the idr tree needs to be
expanded, idr_alloc() first tries to allocate using the specified
allocation mask and if it fails falls back to the preloaded buffer. This
order prevent non-preloading idr_alloc() users from taking advantage of
preloading ones by using preload buffer without filling it shifting the
burden of allocation to the preload users.
Unfortunately, this allowed/expected-to-fail kmem_cache allocation ends up
generating spurious slab lowmem warning before succeeding the request from
the preload buffer.
This patch makes idr_layer_alloc() add __GFP_NOWARN to the first
kmem_cache attempt and try kmem_cache again w/o __GFP_NOWARN after
allocation from preload_buffer fails so that lowmem warning is generated
if not suppressed by the original @gfp_mask.
David Howells [Wed, 13 Mar 2013 21:59:48 +0000 (14:59 -0700)]
UAPI: fix endianness conditionals in M32R's asm/stat.h
In the UAPI header files, __BIG_ENDIAN and __LITTLE_ENDIAN must be
compared against __BYTE_ORDER in preprocessor conditionals where these are
exposed to userspace (that is they're not inside __KERNEL__ conditionals).
However, in the main kernel the norm is to check for
"defined(__XXX_ENDIAN)" rather than comparing against __BYTE_ORDER and
this has incorrectly leaked into the userspace headers.
The definition of struct stat64 in M32R's asm/stat.h is wrong in this way.
Note that userspace will likely interpret the field order incorrectly as
the big-endian variant on little-endian machines - depending on header
inclusion order.
[!!!] NOTE [!!!] This patch may adversely change the userspace API. It might
be better to fix the ordering of st_blocks and __pad4 in struct stat64.
David Howells [Wed, 13 Mar 2013 21:59:47 +0000 (14:59 -0700)]
UAPI: fix endianness conditionals in linux/raid/md_p.h
In the UAPI header files, __BIG_ENDIAN and __LITTLE_ENDIAN must be
compared against __BYTE_ORDER in preprocessor conditionals where these are
exposed to userspace (that is they're not inside __KERNEL__ conditionals).
However, in the main kernel the norm is to check for
"defined(__XXX_ENDIAN)" rather than comparing against __BYTE_ORDER and
this has incorrectly leaked into the userspace headers.
The definition of struct mdp_superblock_s in linux/raid/md_p.h is wrong in
this way. Note that userspace will likely interpret the ordering of the
fields incorrectly as the big-endian variant on a little-endian machines -
depending on header inclusion order.
[!!!] NOTE [!!!] This patch may adversely change the userspace API. It might
be better to fix the ordering of events_hi, events_lo, cp_events_hi and
cp_events_lo in struct mdp_superblock_s / typedef mdp_super_t.
David Howells [Wed, 13 Mar 2013 21:59:46 +0000 (14:59 -0700)]
UAPI: fix endianness conditionals in linux/acct.h
In the UAPI header files, __BIG_ENDIAN and __LITTLE_ENDIAN must be
compared against __BYTE_ORDER in preprocessor conditionals where these are
exposed to userspace (that is they're not inside __KERNEL__ conditionals).
However, in the main kernel the norm is to check for
"defined(__XXX_ENDIAN)" rather than comparing against __BYTE_ORDER and
this has incorrectly leaked into the userspace headers.
The definition of ACCT_BYTEORDER in linux/acct.h is wrong in this way.
Note that userspace will likely interpret this incorrectly as the
big-endian variant on little-endian machines - depending on header
inclusion order.
[!!!] NOTE [!!!] This patch may adversely change the userspace API. It might
be better to fix the value of ACCT_BYTEORDER.
David Howells [Wed, 13 Mar 2013 21:59:45 +0000 (14:59 -0700)]
UAPI: fix endianness conditionals in linux/aio_abi.h
In the UAPI header files, __BIG_ENDIAN and __LITTLE_ENDIAN must be
compared against __BYTE_ORDER in preprocessor conditionals where these are
exposed to userspace (that is they're not inside __KERNEL__ conditionals).
However, in the main kernel the norm is to check for
"defined(__XXX_ENDIAN)" rather than comparing against __BYTE_ORDER and
this has incorrectly leaked into the userspace headers.
The definition of PADDED() in linux/aio_abi.h is wrong in this way. Note
that userspace will likely interpret this and thus the order of fields in
struct iocb incorrectly as the little-endian variant on big-endian
machines - depending on header inclusion order.
[!!!] NOTE [!!!] This patch may adversely change the userspace API. It might
be better to fix the ordering of aio_key and aio_reserved1 in struct iocb.
Tejun Heo [Wed, 13 Mar 2013 21:59:42 +0000 (14:59 -0700)]
idr: deprecate idr_pre_get() and idr_get_new[_above]()
Now that all in-kernel users are converted to ues the new alloc
interface, mark the old interface deprecated. We should be able to
remove these in a few releases.
Tejun Heo [Wed, 13 Mar 2013 21:59:41 +0000 (14:59 -0700)]
tidspbridge: convert to idr_alloc()
idr_get_new*() and friends are about to be deprecated. Convert to the
new idr_alloc() interface.
There are some peculiarities and possible bugs in the converted
functions. This patch preserves those.
* drv_insert_node_res_element() returns -ENOMEM on alloc failure,
-EFAULT if id space is exhausted. -EFAULT is at best misleading.
* drv_proc_insert_strm_res_element() is even weirder. It returns
-EFAULT if kzalloc() fails, -ENOMEM if idr preloading fails and
-EPERM if id space is exhausted. What's going on here?
* drv_proc_insert_strm_res_element() doesn't free *pstrm_res after
failure.
Tejun Heo [Wed, 13 Mar 2013 21:59:39 +0000 (14:59 -0700)]
mlx4: remove leftover idr_pre_get() call
Commit 6a9200603d76 ("IB/mlx4: convert to idr_alloc()") forgot to remove
idr_pre_get() call in mlx4_ib_cm_paravirt_init(). It's unnecessary and
idr_pre_get() will soon be deprecated. Remove it.
Kees Cook [Wed, 13 Mar 2013 21:59:33 +0000 (14:59 -0700)]
signal: always clear sa_restorer on execve
When the new signal handlers are set up, the location of sa_restorer is
not cleared, leaking a parent process's address space location to
children. This allows for a potential bypass of the parent's ASLR by
examining the sa_restorer value returned when calling sigaction().
Based on what should be considered "secret" about addresses, it only
matters across the exec not the fork (since the VMAs haven't changed
until the exec). But since exec sets SIG_DFL and keeps sa_restorer,
this is where it should be fixed.
Given the few uses of sa_restorer, a "set" function was not written
since this would be the only use. Instead, we use
__ARCH_HAS_SA_RESTORER, as already done in other places.
Toshi Kani [Wed, 13 Mar 2013 21:59:31 +0000 (14:59 -0700)]
mm: remove_memory(): fix end_pfn setting
remove_memory() calls walk_memory_range() with [start_pfn, end_pfn), where
end_pfn is exclusive in this range. Therefore, end_pfn needs to be set to
the next page of the end address.
Andrew Morton [Wed, 13 Mar 2013 21:59:30 +0000 (14:59 -0700)]
include/linux/res_counter.h needs errno.h
alpha allmodconfig:
In file included from mm/memcontrol.c:28:
include/linux/res_counter.h: In function 'res_counter_set_limit':
include/linux/res_counter.h:203: error: 'EBUSY' undeclared (first use in this function)
include/linux/res_counter.h:203: error: (Each undeclared identifier is reported only once
include/linux/res_counter.h:203: error: for each function it appears in.)
Linus Torvalds [Wed, 13 Mar 2013 22:03:48 +0000 (15:03 -0700)]
Merge tag 'usb-3.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg Kroah-Hartman:
"Here are a number of tiny USB fixes and new USB device ids for your
3.9 tree.
The "largest" one here is a revert of a usb-storage patch that turned
out to be incorrect, breaking existing users, which is never a good
thing. Everything else is pretty simple and small"
* tag 'usb-3.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (43 commits)
USB: quatech2: only write to the tty if the port is open.
qcserial: bind to DM/DIAG port on Gobi 1K devices
USB: cdc-wdm: fix buffer overflow
usb: serial: Add Rigblaster Advantage to device table
qcaux: add Franklin U600
usb: musb: core: fix possible build error with randconfig
usb: cp210x new Vendor/Device IDs
usb: gadget: pxa25x: fix disconnect reporting
usb: dwc3: ep0: fix sparc64 build
usb: c67x00 RetryCnt value in c67x00 TD should be 3
usb: Correction to c67x00 TD data length mask
usb: Makefile: fix drivers/usb/phy/ Makefile entry
USB: added support for Cinterion's products AH6 and PLS8
usb: gadget: fix omap_udc build errors
USB: storage: fix Huawei mode switching regression
USB: storage: in-kernel modeswitching is deprecated
tools: usb: ffs-test: Fix build failure
USB: option: add Huawei E5331
usb: musb: omap2430: fix sparse warning
usb: musb: omap2430: fix omap_musb_mailbox glue check again
...
Linus Torvalds [Wed, 13 Mar 2013 22:02:02 +0000 (15:02 -0700)]
Merge tag 'tty-3.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial fixes from Greg Kroah-Hartman:
"Here are some tty/serial driver fixes for 3.9
We finally mute the annoying WARN_ON that lots of people are hitting
and it turns out isn't needed anymore. Also add a few new device ids
and a some other minor fixes."
* tag 'tty-3.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
tty: serial: fix typo "SERIAL_S3C2412"
serial: 8250: Keep 8250.<xxxx> module options functional after driver rename
tty: serial: fix typo "ARCH_S5P6450"
tty/8250_pnp: serial port detection regression since v3.7
serial: bcm63xx_uart: fix compilation after "TTY: switch tty_insert_flip_char"
serial: 8250_pci: add support for another kind of NetMos Technology PCI 9835 Multi-I/O Controller
Fix 4 port and add support for 8 port 'Unknown' PCI serial port cards
tty/serial: Add support for Altera serial port
tty: serial: vt8500: Unneccessary duplicated clock code removed
tty: serial: mpc5xxx: fix PSC clock name bug
TTY: disable debugging warning
Linus Torvalds [Wed, 13 Mar 2013 22:01:08 +0000 (15:01 -0700)]
Merge tag 'staging-3.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging tree fixes from Greg Kroah-Hartman:
"Here are some drivers/staging and drivers/iio fixes for 3.9 (the two
are still pretty intertwined, hence them coming both from my tree
still.) Nothing major, just a few things that have been reported by
users, all of these have been in linux-next for a while."
* tag 'staging-3.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: comedi: dt9812: use CR_CHAN() for channel number
staging/vt6656: Fix too large integer constant warning on 32-bit
staging: comedi: drivers: usbduxsigma.c: fix DMA buffers on stack
staging: imx/drm: request irq only after adding the crtc
staging: comedi: drivers: usbduxfast.c: fix for DMA buffers on stack
staging: comedi: drivers: usbdux.c: fix DMA buffers on stack
staging: vt6656: Fix oops on resume from suspend.
iio:common:st_sensors fixed all warning messages about uninitialized variables
iio: Fix build error seen if IIO_TRIGGER is defined but IIO_BUFFER is not
iio/imu: inv_mpu6050 depends on IIO_BUFFER
iio:ad5064: Initialize register cache correctly
iio:ad5064: Fix off by one in DAC value range check
iio:ad5064: Fix address of the second channel for ad5065/ad5045/ad5025
Don't allowing sharing the root directory with processes in a
different user namespace. There doesn't seem to be any point, and to
allow it would require the overhead of putting a user namespace
reference in fs_struct (for permission checks) and incrementing that
reference count on practically every call to fork.
So just perform the inexpensive test of forbidding sharing fs_struct
acrosss processes in different user namespaces. We already disallow
other forms of threading when unsharing a user namespace so this
should be no real burden in practice.
This updates setns, clone, and unshare to disallow multiple user
namespaces sharing an fs_struct.
Bill Pemberton [Wed, 13 Mar 2013 13:50:15 +0000 (09:50 -0400)]
USB: quatech2: only write to the tty if the port is open.
The commit 2e124b4a390ca85325fae75764bef92f0547fa25 removed the checks
that prevented qt2_process_read_urb() from trying to put chars into
ttys that weren't actually opened. This resulted in 'tty is NULL'
warnings from flush_to_ldisc() when the device was used.
The devices use just one read urb for all ports. As a result
qt2_process_read_urb() may be called with the current port set to a
port number that has not been opened. Add a check if the port is open
before calling tty_flip_buffer_push().
Larry Finger [Wed, 13 Mar 2013 15:28:13 +0000 (10:28 -0500)]
rtlwifi: rtl8192cu: Fix problem that prevents reassociation
The driver was failing to clear the BSSID when a disconnect happened. That
prevented a reconnection. This problem is reported at
https://bugzilla.redhat.com/show_bug.cgi?id=789605,
https://bugzilla.redhat.com/show_bug.cgi?id=866786,
https://bugzilla.redhat.com/show_bug.cgi?id=906734, and
https://bugzilla.kernel.org/show_bug.cgi?id=46171.
Thanks to Jussi Kivilinna for making the critical observation
that led to the solution.
John Crispin [Wed, 13 Mar 2013 12:20:15 +0000 (13:20 +0100)]
rt2x00: fix rt2x00 to work with the new ralink SoC config symbols
Since v3.9-rc1 the kernel has basic support for Ralink WiSoC. The config symbols
are named slightly different than before. Fix the rt2x00 to match the new
symbols.
Bjørn Mork [Wed, 13 Mar 2013 02:25:17 +0000 (02:25 +0000)]
net: qmi_wwan: set correct altsetting for Gobi 1K devices
commit bd877e4 ("net: qmi_wwan: use a single bind function for
all device types") made Gobi 1K devices fail probing.
Using the number of endpoints in the default altsetting to decide
whether the function use one or two interfaces is wrong. Other
altsettings may provide more endpoints.
With Gobi 1K devices, USB interface #3's altsetting is 0 by default, but
altsetting 0 only provides one interrupt endpoint and is not sufficent
for QMI. Altsetting 1 provides all 3 endpoints required for qmi_wwan
and works with QMI. Gobi 1K layout for intf#3 is:
Interface Descriptor: 255/255/255
bInterfaceNumber 3
bAlternateSetting 0
Endpoint Descriptor: Interrupt IN
Interface Descriptor: 255/255/255
bInterfaceNumber 3
bAlternateSetting 1
Endpoint Descriptor: Interrupt IN
Endpoint Descriptor: Bulk IN
Endpoint Descriptor: Bulk OUT
Prior to commit bd877e4, we would call usbnet_get_endpoints
before giving up finding enough endpoints. Removing the early
endpoint number test and the strict functional descriptor
requirement allow qmi_wwan_bind to continue until
usbnet_get_endpoints has made the final attempt to collect
endpoints. This restores the behaviour from before commit bd877e4 without losing the added benefit of using a single bind
function.
The driver has always required a CDC Union functional descriptor
for two-interface functions. Using the existence of this
descriptor to detect two-interface functions is the logically
correct method.
the defenition of FIB_HASH_TABLE size has obtained wrong dependency:
it should depend upon CONFIG_IP_MULTIPLE_TABLES (as was in the original
code) but it was depended from CONFIG_IP_ROUTE_MULTIPATH
This patch returns the situation to the original state.
Jan Kara [Wed, 13 Mar 2013 11:57:08 +0000 (12:57 +0100)]
ext2: Fix BUG_ON in evict() on inode deletion
Commit 8e3dffc6 introduced a regression where deleting inode with
large extended attributes leads to triggering
BUG_ON(inode->i_state != (I_FREEING | I_CLEAR))
in fs/inode.c:evict(). That happens because freeing of xattr block
dirtied the inode and it happened after clear_inode() has been called.
Fix the issue by moving removal of xattr block into ext2_evict_inode()
before clear_inode() call close to a place where data blocks are
truncated. That is also more logical place and removes surprising
requirement that ext2_free_blocks() mustn't dirty the inode.
Jan Kiszka [Wed, 13 Mar 2013 10:30:50 +0000 (11:30 +0100)]
KVM: nVMX: Clean up and fix pin-based execution controls
Only interrupt and NMI exiting are mandatory for KVM to work, thus can
be exposed to the guest unconditionally, virtual NMI exiting is
optional. So we must not advertise it unless the host supports it.
Introduce the symbolic constant PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR at
this chance.
Xufeng Zhang [Thu, 7 Mar 2013 21:39:37 +0000 (21:39 +0000)]
sctp: don't break the loop while meeting the active_path so as to find the matched transport
sctp_assoc_lookup_tsn() function searchs which transport a certain TSN
was sent on, if not found in the active_path transport, then go search
all the other transports in the peer's transport_addr_list, however, we
should continue to the next entry rather than break the loop when meet
the active_path transport.
Jan Kiszka [Wed, 13 Mar 2013 11:42:34 +0000 (12:42 +0100)]
KVM: x86: Rework INIT and SIPI handling
A VCPU sending INIT or SIPI to some other VCPU races for setting the
remote VCPU's mp_state. When we were unlucky, KVM_MP_STATE_INIT_RECEIVED
was overwritten by kvm_emulate_halt and, thus, got lost.
This introduces APIC events for those two signals, keeping them in
kvm_apic until kvm_apic_accept_events is run over the target vcpu
context. kvm_apic_has_events reports to kvm_arch_vcpu_runnable if there
are pending events, thus if vcpu blocking should end.
The patch comes with the side effect of effectively obsoleting
KVM_MP_STATE_SIPI_RECEIVED. We still accept it from user space, but
immediately translate it to KVM_MP_STATE_INIT_RECEIVED + KVM_APIC_SIPI.
The vcpu itself will no longer enter the KVM_MP_STATE_SIPI_RECEIVED
state. That also means we no longer exit to user space after receiving a
SIPI event.
Furthermore, we already reset the VCPU on INIT, only fixing up the code
segment later on when SIPI arrives. Moreover, we fix INIT handling for
the BSP: it never enter wait-for-SIPI but directly starts over on INIT.