Paul Burton [Tue, 19 Nov 2013 17:30:36 +0000 (17:30 +0000)]
MIPS: Simplify ptrace_getfpregs FPU IR retrieval
All architecturally defined bits in the FPU implementation register
are read only & unchanging. It contains some implementation-defined
bits but the architecture manual states "This bits are explicitly not
intended to be used for mode control functions" which seems to provide
justification for viewing the register as a whole as unchanging. This
being the case we can simply re-use the value we read at boot rather
than having to re-read it later, and avoid the complexity which that
read entails.
Paul Burton [Tue, 19 Nov 2013 17:30:35 +0000 (17:30 +0000)]
MIPS: Simplify PTRACE_PEEKUSR for FPC_EIR
All architecturally defined bits in the FPU implementation register
are read only & unchanging. It contains some implementation-defined
bits but the architecture manual states "This bits are explicitly not
intended to be used for mode control functions" which seems to provide
justification for viewing the register as a whole as unchanging. This
being the case we can simply re-use the value we read at boot rather
than having to re-read it later, and avoid the complexity which that
read entails.
Paul Gortmaker [Wed, 22 Jan 2014 20:21:30 +0000 (15:21 -0500)]
MIPS: SEAD3: Don't use module_init in non-modular sead3-mtd.c code
The sead3-mtd.o is built for obj-y -- and hence this code is always
present. It will never be modular, so using module_init as an alias
for __initcall can be somewhat misleading.
Fix this up now, so that we can relocate module_init from
init.h into module.h in the future. If we don't do this, we'd
have to add module.h to obviously non-modular code, and that
would be a worse thing.
Note that direct use of __initcall is discouraged, vs. one
of the priority categorized subgroups. As __initcall gets
mapped onto device_initcall, our use of device_initcall
directly in this change means that the runtime impact is
zero -- it will remain at level 6 in initcall ordering.
We also fix a missing semicolon, which this change uncovers.
Huacai Chen [Sun, 16 Feb 2014 08:01:18 +0000 (16:01 +0800)]
MIPS: Loongson: Rename PRID_IMP_LOONGSON1 and PRID_IMP_LOONGSON2
Loongson-1 is a 32-bit MIPS CPU and Loongson-2/3 are 64-bit MIPS CPUs,
and both Loongson-2/3 has the same PRID IMP filed (0x6300). As a
result, renaming PRID_IMP_LOONGSON1 and PRID_IMP_LOONGSON2 to
PRID_IMP_LOONGSON_32 and PRID_IMP_LOONGSON_64 will make more sense.
Paul Bolle [Sat, 8 Feb 2014 21:46:58 +0000 (22:46 +0100)]
MIPS: No need to select ARCH_SUPPORTS_MSI
Commit c24a8a7a9988 ("MIPS: Netlogic: Add MSI support for XLP") added
"select ARCH_SUPPORTS_MSI". But the Kconfig symbol ARCH_SUPPORTS_MSI was
already removed in v3.12, so that select is a nop. Drop it.
Jean Delvare [Fri, 14 Mar 2014 19:25:29 +0000 (20:25 +0100)]
watchdog: Fix Elan SC520 dependencies
Anyone using a system based on an AMD Elan SC520 processor would be
building a dedicated kernel for it, so we can make the sc520_wdt
driver depend on MELAN. SC520_CPUFREQ already depends on MELAN so it
makes things more consistent. It also makes kernel configuration for
every other x86 user easier.
Jean Delvare [Fri, 14 Mar 2014 12:18:47 +0000 (13:18 +0100)]
watchdog: ib700wdt: Use platform_driver_probe
Using platform_driver_probe instead of platform_driver_register has
two benefits:
* The driver will fail to load if device probing fails.
* The probe function can be marked __init.
Jean Delvare [Fri, 14 Mar 2014 12:16:47 +0000 (13:16 +0100)]
watchdog: geodewdt: Use platform_driver_probe
Using platform_driver_probe instead of platform_driver_register has
two benefits:
* The driver will fail to load if device probing fails.
* The probe function can be marked __init.
Jean Delvare [Fri, 14 Mar 2014 12:07:40 +0000 (13:07 +0100)]
watchdog: advantechwdt: Use platform_driver_probe
Using platform_driver_probe instead of platform_driver_register has
two benefits:
* The driver will fail to load if device probing fails.
* The probe function can be marked __init.
Jean Delvare [Fri, 14 Mar 2014 12:04:37 +0000 (13:04 +0100)]
watchdog: acquirewdt: Use platform_driver_probe
Using platform_driver_probe instead of platform_driver_register has
two benefits:
* The driver will fail to load if device probing fails.
* The probe function can be marked __init.
Jean Delvare [Mon, 10 Mar 2014 20:28:17 +0000 (21:28 +0100)]
watchdog: iTCO_wdt: Fix the parent device
The watchdog's parent is iTCO_wdt (the platform device) not lpc_ich
(the PCI device.) Setting the parent right makes it much easier for
the user to figure out which driver/module is handling the watchdog
device node.
watchdog: it87_wdt: Work around non-working CIR interrupts
On some hardware platforms, the it87_wdt watchdog resets the machine
despite the watchdog daemon running and writing to /dev/watchdog.
This is due to Consumer IR buffer underrun interrupts being used as
triggers to reset the timer. On some buggy hardware implementations
such as the iEi AFL-12A-N270 single-board computer, this method does
not work.
However, resetting the timer by writing its original timeout value in
its configuration register over and over again suppresses the unwanted
reboots.
Add a module option (nocir), 0 by default in order not to break existing
setups. Setting it to 1 enables the workaround.
Fixes bug #42801 <https://bugzilla.kernel.org/show_bug.cgi?id=42801>.
Tested primarily on Linux 3.5.7, applies cleanly on Linux 3.13.5.
Initializing clk to NULL as a reset/error condition does not
help as NULL is not an invalid condition w.r.t clk. Remove this
initialization altogether as there is no state retention.
Maxime Ripard [Fri, 7 Feb 2014 21:29:24 +0000 (22:29 +0100)]
watchdog: sunxi: Change compatibles
The Allwinner A10 and A31 compatibles were following a slightly different
compatible patterns than the rest of the SoCs for historical reasons. Change
the compatibles to match the other pattern in the watchdog controller driver
for consistency.
watchdog: orion: prepare new Dove DT Kconfig variable
DT-enabled Dove will move over from ARCH_DOVE in mach-dove to MACH_DOVE in
mach-mvebu. As non-DT ARCH_DOVE will stay to rot for a while, add a new
DT-only MACH_DOVE Kconfig.
Jingoo Han [Thu, 27 Feb 2014 05:41:42 +0000 (14:41 +0900)]
watchdog: fix checkpatch warnings and error
Fix the following checkpatch warnings and error:
WARNING: quoted string split across lines
WARNING: braces {} are not necessary for single statement blocks
WARNING: __initdata should be placed after ibmasr_id_table[]
WARNING: please, no space before tabs
ERROR: do not initialise statics to 0 or NULL
Andrew Chew [Fri, 14 Feb 2014 20:03:05 +0000 (12:03 -0800)]
watchdog: Add tegra watchdog
Add a driver for the hardware watchdogs in NVIDIA Tegra SoCs (Tegra30 and
later). This driver will configure one watchdog timer that will reset the
system in the case of a watchdog timeout.
This driver binds to the nvidia,tegra30-timer device node and gets its
register base from there.
Jingoo Han [Tue, 11 Feb 2014 12:44:12 +0000 (21:44 +0900)]
watchdog: omap_wdt: Use devm_ioremap_resource()
Use devm_ioremap_resource() in order to make the code simpler,
and remove 'struct resource *mem' from 'struct omap_wdt_dev'
and omap_wdt_probe(), resplectively. because the 'mem' variables
are not used anymore. Also the redundant return value check of
platform_get_resource() is removed, because the value is checked
by devm_ioremap_resource().
Jingoo Han [Tue, 11 Feb 2014 12:41:54 +0000 (21:41 +0900)]
watchdog: ep93xx_wdt: Use devm_ioremap_resource()
Use devm_ioremap_resource() in order to make the code simpler,
and remove redundant return value check of platform_get_resource()
because the value is checked by devm_ioremap_resource().
Jingoo Han [Tue, 11 Feb 2014 06:46:43 +0000 (15:46 +0900)]
watchdog: Remove unnecessary OOM messages
The site-specific OOM messages are unnecessary, because they
duplicate the MM subsystem generic OOM message. For example,
k.alloc and v.alloc failures use dump_stack().
Paul Gortmaker [Tue, 21 Jan 2014 21:22:43 +0000 (16:22 -0500)]
watchdog: delete non-required instances of include <linux/init.h>
None of these files are actually using any __init type directives
and hence don't need to include <linux/init.h>. Most are just a
left over from __devinit and __cpuinit removal, or simply due to
code getting copied from one driver to the next.
Hans de Goede [Thu, 27 Mar 2014 17:02:39 +0000 (18:02 +0100)]
irqchip: sun7i/sun6i: Disable NMI before registering the handler
It is advisable to disable the NMI before registering the IRQ handler as
registering the IRQ handler unmasks the IRQ on the GIC, so if U-Boot has
left the NMI enabled and the NMI pin is active we will immediately get
an interrupt before any driver has claimed the downstream interrupt of
the NMI.
Hans de Goede [Thu, 27 Mar 2014 17:02:38 +0000 (18:02 +0100)]
ARM: sun7i/sun6i: dts: Fix IRQ number for sun6i NMI controller
The IRQ line used in sun6i-a31.dtsi for the NMI controller is wrong.
This causes a IRQ storm since the NMI controller is repeatedly fired.
This patch fixes this problem assigning the correct IRQ number to the
NMI controller.
Dan Carpenter [Mon, 31 Mar 2014 07:49:15 +0000 (10:49 +0300)]
ALSA: asihpi: fix some indenting in snd_card_asihpi_pcm_new()
This used to be a part of a condition until f3d145aac913 ('ALSA: asihpi:
MMAP for non-busmaster cards') but now it's not and we can remove an
indent level.
Huacai Chen [Sat, 22 Mar 2014 09:21:44 +0000 (17:21 +0800)]
MIPS: Hibernate: Flush TLB entries in swsusp_arch_resume()
The original MIPS hibernate code flushes cache and TLB entries in
swsusp_arch_resume(). But they are removed in Commit 44eeab67416711
(MIPS: Hibernation: Remove SMP TLB and cacheflushing code.). A cross-
CPU flush is surely unnecessary because all but the local CPU have
already been disabled. But a local flush (at least the TLB flush) is
needed. When we do hibernation on Loongson-3 with an E1000E NIC, it is
very easy to produce a kernel panic (kernel page fault, or unaligned
access). The root cause is E1000E driver use vzalloc_node() to allocate
pages, the stale TLB entries of the booting kernel will be misused by
the resumed target kernel.
The UART register names are identical to the ones in uapi/linux/serial_reg.h,
which causes build failures in various drivers when they indirectly pull in
the au1000.h header, for example via gpio.h:
In file included from arch/mips/include/asm/mach-au1x00/gpio.h:13:0,
from arch/mips/include/asm/gpio.h:4,
from include/linux/gpio.h:48,
from include/linux/ssb/ssb.h:9,
from drivers/ssb/driver_mipscore.c:11:
arch/mips/include/asm/mach-au1x00/au1000.h:1171:0: note: this is the location of the previous definition
#define UART_LSR 0x1C /* Line Status Register */
Get rid of the altogether, nothing in the core Alchemy code depends
on them any more.
Ralf Baechle [Wed, 26 Mar 2014 20:40:25 +0000 (21:40 +0100)]
MIPS: Fix build error due to multiple prom_putchar() definitions.
This can happen if both the generic 8250 and another early console
driver are enable. Fixed by using an auxilliary kconfig symbol to
restrict that choice.
Wolfram Sang [Sat, 22 Feb 2014 08:28:27 +0000 (09:28 +0100)]
avr32: remove cpu_data macro to fix compiles
Having cpu_data as a parameterless macro can easily cause build failures
because it can be a variable name like in linux/pm_domain.h [1]. So,
remove the macro and convert its only user. Because this architecture
cannot do SMP, remove the whole SMP block, too. Only compile tested due
to no hardware.
David S. Miller [Mon, 31 Mar 2014 04:45:49 +0000 (00:45 -0400)]
Merge branch 'filter-next'
Daniel Borkmann says:
====================
BPF updates
We sat down and have heavily reworked the whole previous patchset
from v10 [1] to address all comments/concerns. This patchset therefore
*replaces* the internal BPF interpreter with the new layout as
discussed in [1], and migrates some exotic callers to properly use the
BPF API for a transparent upgrade. All other callers that already use
the BPF API in a way it should be used, need no further changes to run
the new internals. We also removed the sysctl knob entirely, and do not
expose any structure to userland, so that implementation details only
reside in kernel space. Since we are replacing the interpreter we had
to migrate seccomp in one patch along with the interpreter to not break
anything. When attaching a new filter, the flow can be described as
following: i) test if jit compiler is enabled and can compile the user
BPF, ii) if so, then go for it, iii) if not, then transparently migrate
the filter into the new representation, and run it in the interpreter.
Also, we have scratched the jit flag from the len attribute and made it
as initial patch in this series as Pablo has suggested in the last
feedback, thanks. For details, please refer to the patches themselves.
We did extensive testing of BPF and seccomp on the new interpreter
itself and also on the user ABIs and could not find any issues; new
performance numbers as posted in patch 8 are also still the same.
Please find more details in the patches themselves.
For all the previous history from v1 to v10, see [1]. We have decided
to drop the v11 as we have pedantically reworked the set, but of course,
included all previous feedback.
v3 -> v4:
- Applied feedback from Dave regarding swap insns
- Rebased on net-next
v2 -> v3:
- Rebased to latest net-next (i.e. w/ rxhash->hash rename)
- Fixed patch 8/9 commit message/doc as suggested by Dave
- Rest is unchanged
v1 -> v2:
- Rebased to latest net-next
- Added static to ptp_filter as suggested by Dave
- Fixed a typo in patch 8's commit message
- Rest unchanged
net: filter: rework/optimize internal BPF interpreter's instruction set
This patch replaces/reworks the kernel-internal BPF interpreter with
an optimized BPF instruction set format that is modelled closer to
mimic native instruction sets and is designed to be JITed with one to
one mapping. Thus, the new interpreter is noticeably faster than the
current implementation of sk_run_filter(); mainly for two reasons:
1. Fall-through jumps:
BPF jump instructions are forced to go either 'true' or 'false'
branch which causes branch-miss penalty. The new BPF jump
instructions have only one branch and fall-through otherwise,
which fits the CPU branch predictor logic better. `perf stat`
shows drastic difference for branch-misses between the old and
new code.
2. Jump-threaded implementation of interpreter vs switch
statement:
Instead of single table-jump at the top of 'switch' statement,
gcc will now generate multiple table-jump instructions, which
helps CPU branch predictor logic.
Note that the verification of filters is still being done through
sk_chk_filter() in classical BPF format, so filters from user- or
kernel space are verified in the same way as we do now, and same
restrictions/constraints hold as well.
We reuse current BPF JIT compilers in a way that this upgrade would
even be fine as is, but nevertheless allows for a successive upgrade
of BPF JIT compilers to the new format.
The internal instruction set migration is being done after the
probing for JIT compilation, so in case JIT compilers are able to
create a native opcode image, we're going to use that, and in all
other cases we're doing a follow-up migration of the BPF program's
instruction set, so that it can be transparently run in the new
interpreter.
In short, the *internal* format extends BPF in the following way (more
details can be taken from the appended documentation):
- Number of registers increase from 2 to 10
- Register width increases from 32-bit to 64-bit
- Conditional jt/jf targets replaced with jt/fall-through
- Adds signed > and >= insns
- 16 4-byte stack slots for register spill-fill replaced
with up to 512 bytes of multi-use stack space
- Introduction of bpf_call insn and register passing convention
for zero overhead calls from/to other kernel functions
- Adds arithmetic right shift and endianness conversion insns
- Adds atomic_add insn
- Old tax/txa insns are replaced with 'mov dst,src' insn
Performance of two BPF filters generated by libpcap resp. bpf_asm
was measured on x86_64, i386 and arm32 (other libpcap programs
have similar performance differences):
fprog #1 is taken from Documentation/networking/filter.txt:
tcpdump -i eth0 port 22 -dd
fprog #2 is taken from 'man tcpdump':
tcpdump -i eth0 'tcp port 22 and (((ip[2:2] - ((ip[0]&0xf)<<2)) -
((tcp[12]&0xf0)>>2)) != 0)' -dd
Raw performance data from BPF micro-benchmark: SK_RUN_FILTER on the
same SKB (cache-hit) or 10k SKBs (cache-miss); time in ns per call,
smaller is better:
--x86_64--
fprog #1 fprog #1 fprog #2 fprog #2
cache-hit cache-miss cache-hit cache-miss
old BPF 90 101 192 202
new BPF 31 71 47 97
old BPF jit 12 34 17 44
new BPF jit TBD
--arm32--
fprog #1 fprog #1 fprog #2 fprog #2
cache-hit cache-miss cache-hit cache-miss
old BPF 202 300 475 540
new BPF 180 270 330 470
old BPF jit 26 182 37 202
new BPF jit TBD
Thus, without changing any userland BPF filters, applications on
top of AF_PACKET (or other families) such as libpcap/tcpdump, cls_bpf
classifier, netfilter's xt_bpf, team driver's load-balancing mode,
and many more will have better interpreter filtering performance.
While we are replacing the internal BPF interpreter, we also need
to convert seccomp BPF in the same step to make use of the new
internal structure since it makes use of lower-level API details
without being further decoupled through higher-level calls like
sk_unattached_filter_{create,destroy}(), for example.
Just as for normal socket filtering, also seccomp BPF experiences
a time-to-verdict speedup:
05-sim-long_jumps.c of libseccomp was used as micro-benchmark:
BPF filters generated by seccomp are very branchy, so the new
internal BPF performance is better than the old one. Performance
gains will be even higher when BPF JIT is committed for the
new structure, which is planned in future work (as successive
JIT migrations).
BPF has also been stress-tested with trinity's BPF fuzzer.
Daniel Borkmann [Fri, 28 Mar 2014 17:58:24 +0000 (18:58 +0100)]
net: isdn: use sk_unattached_filter api
Similarly as in ppp, we need to migrate the ISDN/PPP code to make use
of the sk_unattached_filter api in order to decouple having direct
filter structure access. By using sk_unattached_filter_{create,destroy},
we can allow for the possibility to jit compile filters for faster
filter verdicts as well.
Daniel Borkmann [Fri, 28 Mar 2014 17:58:23 +0000 (18:58 +0100)]
net: ppp: use sk_unattached_filter api
For the ppp driver, there are currently two open-coded BPF filters in use,
that is, pass_filter and active_filter. Migrate both to make proper use
of sk_unattached_filter_{create,destroy} API so that the actual BPF code
is decoupled from direct access, and filters can be jited as a side-effect
by the internal filter compiler.
Daniel Borkmann [Fri, 28 Mar 2014 17:58:22 +0000 (18:58 +0100)]
net: ptp: do not reimplement PTP/BPF classifier
There are currently pch_gbe, cpts, and ixp4xx_eth drivers that open-code
and reimplement a BPF classifier for the PTP protocol. Since all of them
effectively do the very same thing and load the very same PTP/BPF filter,
we can just consolidate that code by introducing ptp_classify_raw() in
the time-stamping core framework which can be used in drivers.
As drivers get initialized after bootstrapping the core networking
subsystem, they can make use of ptp_insns wrapped through
ptp_classify_raw(), which allows to simplify and remove PTP classifier
setup code in drivers.
Daniel Borkmann [Fri, 28 Mar 2014 17:58:21 +0000 (18:58 +0100)]
net: ptp: use sk_unattached_filter_create() for BPF
This patch migrates an open-coded sk_run_filter() implementation with
proper use of the BPF API, that is, sk_unattached_filter_create(). This
migration is needed, as we will be internally transforming the filter
to a different representation, and therefore needs to be decoupled.
It is okay to do so as skb_timestamping_init() is called during
initialization of the network stack in core initcall via sock_init().
This would effectively also allow for PTP filters to be jit compiled if
bpf_jit_enable is set.
For better readability, there are also some newlines introduced, also
ptp_classify.h is only in kernel space.
Daniel Borkmann [Fri, 28 Mar 2014 17:58:20 +0000 (18:58 +0100)]
net: filter: move filter accounting to filter core
This patch basically does two things, i) removes the extern keyword
from the include/linux/filter.h file to be more consistent with the
rest of Joe's changes, and ii) moves filter accounting into the filter
core framework.
Filter accounting mainly done through sk_filter_{un,}charge() take
care of the case when sockets are being cloned through sk_clone_lock()
so that removal of the filter on one socket won't result in eviction
as it's still referenced by the other.
These functions actually belong to net/core/filter.c and not
include/net/sock.h as we want to keep all that in a central place.
It's also not in fast-path so uninlining them is fine and even allows
us to get rd of sk_filter_release_rcu()'s EXPORT_SYMBOL and a forward
declaration.
Daniel Borkmann [Fri, 28 Mar 2014 17:58:19 +0000 (18:58 +0100)]
net: filter: keep original BPF program around
In order to open up the possibility to internally transform a BPF program
into an alternative and possibly non-trivial reversible representation, we
need to keep the original BPF program around, so that it can be passed back
to user space w/o the need of a complex decoder.
The reason for that use case resides in commit a8fc92778080 ("sk-filter:
Add ability to get socket filter program (v2)"), that is, the ability
to retrieve the currently attached BPF filter from a given socket used
mainly by the checkpoint-restore project, for example.
Therefore, we add two helpers sk_{store,release}_orig_filter for taking
care of that. In the sk_unattached_filter_create() case, there's no such
possibility/requirement to retrieve a loaded BPF program. Therefore, we
can spare us the work in that case.
This approach will simplify and slightly speed up both, sk_get_filter()
and sock_diag_put_filterinfo() handlers as we won't need to successively
decode filters anymore through sk_decode_filter(). As we still need
sk_decode_filter() later on, we're keeping it around.
Daniel Borkmann [Fri, 28 Mar 2014 17:58:18 +0000 (18:58 +0100)]
net: filter: add jited flag to indicate jit compiled filters
This patch adds a jited flag into sk_filter struct in order to indicate
whether a filter is currently jited or not. The size of sk_filter is
not being expanded as the 32 bit 'len' member allows upper bits to be
reused since a filter can currently only grow as large as BPF_MAXINSNS.
Therefore, there's enough room also for other in future needed flags to
reuse 'len' field if necessary. The jited flag also allows for having
alternative interpreter functions running as currently, we can only
detect jit compiled filters by testing fp->bpf_func to not equal the
address of sk_run_filter().
Linus Torvalds [Mon, 31 Mar 2014 00:26:08 +0000 (17:26 -0700)]
Merge branch 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:
"Switch mnt_hash to hlist, turning the races between __lookup_mnt() and
hash modifications into false negatives from __lookup_mnt() (instead
of hangs)"
On the false negatives from __lookup_mnt():
"The *only* thing we care about is not getting stuck in __lookup_mnt().
If it misses an entry because something in front of it just got moved
around, etc, we are fine. We'll notice that mount_lock mismatch and
that'll be it"
* 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
switch mnt_hash to hlist
don't bother with propagate_mnt() unless the target is shared
keep shadowed vfsmounts together
resizable namespace.c hashes
Linus Torvalds [Mon, 31 Mar 2014 00:20:40 +0000 (17:20 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input updates from Dmitry Torokhov:
"Some more updates for the input subsystem.
You will get a fix for race in mousedev that has been causing quite a
few oopses lately and a small fixup for force feedback support in
evdev"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: mousedev - fix race when creating mixed device
Input: don't modify the id of ioctl-provided ff effect on upload failure
Eric Paris [Sun, 30 Mar 2014 23:07:54 +0000 (19:07 -0400)]
AUDIT: Allow login in non-init namespaces
It its possible to configure your PAM stack to refuse login if audit
messages (about the login) were unable to be sent. This is common in
many distros and thus normal configuration of many containers. The PAM
modules determine if audit is enabled/disabled in the kernel based on
the return value from sending an audit message on the netlink socket.
If userspace gets back ECONNREFUSED it believes audit is disabled in the
kernel. If it gets any other error else it refuses to let the login
proceed.
Just about ever since the introduction of namespaces the kernel audit
subsystem has returned EPERM if the task sending a message was not in
the init user or pid namespace. So many forms of containers have never
worked if audit was enabled in the kernel.
BUT if the container was not in net_init then the kernel network code
would send ECONNREFUSED (instead of the audit code sending EPERM). Thus
by pure accident/dumb luck/bug if an admin configured the PAM stack to
reject all logins that didn't talk to audit, but then ran the login
untility in the non-init_net namespace, it would work!! Clearly this was
a bug, but it is a bug some people expected.
With the introduction of network namespace support in 3.14-rc1 the two
bugs stopped cancelling each other out. Now, containers in the
non-init_net namespace refused to let users log in (just like PAM was
configfured!) Obviously some people were not happy that what used to let
users log in, now didn't!
This fix is kinda hacky. We return ECONNREFUSED for all non-init
relevant namespaces. That means that not only will the old broken
non-init_net setups continue to work, now the broken non-init_pid or
non-init_user setups will 'work'. They don't really work, since audit
isn't logging things. But it's what most users want.
In 3.15 we should have patches to support not only the non-init_net
(3.14) namespace but also the non-init_pid and non-init_user namespace.
So all will be right in the world. This just opens the doors wide open
on 3.14 and hopefully makes users happy, if not the audit system...
Theodore Ts'o [Sun, 30 Mar 2014 14:20:01 +0000 (10:20 -0400)]
ext4: atomically set inode->i_flags in ext4_set_inode_flags()
Use cmpxchg() to atomically set i_flags instead of clearing out the
S_IMMUTABLE, S_APPEND, etc. flags and then setting them from the
EXT4_IMMUTABLE_FL, EXT4_APPEND_FL flags, since this opens up a race
where an immutable file has the immutable flag cleared for a brief
window of time.
Al Viro [Fri, 21 Mar 2014 01:10:51 +0000 (21:10 -0400)]
switch mnt_hash to hlist
fixes RCU bug - walking through hlist is safe in face of element moves,
since it's self-terminating. Cyclic lists are not - if we end up jumping
to another hash chain, we'll loop infinitely without ever hitting the
original list head.
Al Viro [Fri, 21 Mar 2014 14:14:08 +0000 (10:14 -0400)]
don't bother with propagate_mnt() unless the target is shared
If the dest_mnt is not shared, propagate_mnt() does nothing -
there's no mounts to propagate to and thus no copies to create.
Might as well don't bother calling it in that case.
Al Viro [Fri, 28 Feb 2014 18:46:44 +0000 (13:46 -0500)]
resizable namespace.c hashes
* switch allocation to alloc_large_system_hash()
* make sizes overridable by boot parameters (mhash_entries=, mphash_entries=)
* switch mountpoint_hashtable from list_head to hlist_head
Andy Lutomirski [Sat, 29 Mar 2014 20:15:35 +0000 (13:15 -0700)]
x86, vdso: Fix the symbol versions on the 32-bit vDSO
The new symbols provide the same API as the 64-bit variants, so they
should have the same symbol version name. This can't break
userspace, since these symbols are new for 32-bit Linux.
Mark Brown [Sat, 29 Mar 2014 23:48:07 +0000 (23:48 +0000)]
spi: Fix handling of cs_change in core implementation
The core implementation of cs_change didn't follow the documentation
which says that cs_change in the middle of the transfer means to briefly
deassert chip select, instead it followed buggy drivers which change the
polarity of chip select. Use a delay of 10us between deassert and
reassert simply from pulling numbers out of a hat.
Sander Eikelenboom reported an issue with ring overflow in netback in
3.14-rc3. This turns outo be be because of a bug in the ring slot estimation
code. This patch series fixes the slot estimation, fixes the BUG_ON() that
was supposed to catch the issue that Sander ran into and also makes a small
fix to start_new_rx_buffer().
v3:
- Added a cap of MAX_SKB_FRAGS to estimate in patch #2
v2:
- Added BUG_ON() to patch #1
- Added more explanation to patch #3
====================