aboutsummaryrefslogtreecommitdiffstats
path: root/kernel
Commit message (Collapse)AuthorAgeFilesLines
* softirq: Use __this_cpu_write() in takeover_tasklets()irq/coreMuchun Song2019-06-231-1/+1
| | | | | | | | | | | | | | | | | | The code is executed with interrupts disabled, so it's safe to use __this_cpu_write(). [ tglx: Massaged changelog ] Signed-off-by: Muchun Song <smuchun@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: joel@joelfernandes.org Cc: rostedt@goodmis.org Cc: frederic@kernel.org Cc: paulmck@linux.vnet.ibm.com Cc: alexander.levin@verizon.com Cc: peterz@infradead.org Link: https://lkml.kernel.org/r/20190618143305.2038-1-smuchun@gmail.com
* genirq/affinity: Remove unused argument from [__]irq_build_affinity_masks()Minwoo Im2019-06-121-7/+5
| | | | | | | | | | | | | The *affd argument is neither used in irq_build_affinity_masks() nor __irq_build_affinity_masks(). Remove it. Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Cc: Minwoo Im <minwoo.im@samsung.com> Cc: linux-block@vger.kernel.org Link: https://lkml.kernel.org/r/20190602112117.31839-1-minwoo.im.dev@gmail.com
* genirq/timings: Add selftest for next event computationDaniel Lezcano2019-06-121-0/+66
| | | | | | | | | | | | | | | The circular buffers are now validated with selftests. The next interrupt index algorithm which is the hardest part to validate needs extra coverage. Add a selftest which uses the intervals stored in the arrays and insert all the values except the last one. The next event computation must return the same value as the last element which was not inserted. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: andriy.shevchenko@linux.intel.com Link: https://lkml.kernel.org/r/20190527205521.12091-9-daniel.lezcano@linaro.org
* genirq/timings: Add selftest for irqs circular bufferDaniel Lezcano2019-06-121-0/+139
| | | | | | | | | | | | | After testing the per cpu interrupt circular event, make sure the per interrupt circular buffer usage is correct. Add tests to validate the interrupt circular buffer. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: andriy.shevchenko@linux.intel.com Link: https://lkml.kernel.org/r/20190527205521.12091-8-daniel.lezcano@linaro.org
* genirq/timings: Add selftest for circular arrayDaniel Lezcano2019-06-122-0/+122
| | | | | | | | | | | | | | | | | Due to the complexity of the code and the difficulty to debug it, add some selftests to the framework in order to spot issues or regression at boot time when the runtime testing is enabled for this subsystem. This tests the circular buffer at the limits and validates: - the encoding / decoding of the values - the macro to browse the irq timings circular buffer - the function to push data in the circular buffer Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: andriy.shevchenko@linux.intel.com Link: https://lkml.kernel.org/r/20190527205521.12091-7-daniel.lezcano@linaro.org
* genirq/timings: Encapsulate storing functionDaniel Lezcano2019-06-121-19/+34
| | | | | | | | | | | | | | | For the next patches providing the selftest, it is required to insert interval values directly in the buffer in order to check the correctness of the code. Encapsulate the code doing that in a always inline function in order to reuse it in the test code. No functional changes. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: andriy.shevchenko@linux.intel.com Link: https://lkml.kernel.org/r/20190527205521.12091-6-daniel.lezcano@linaro.org
* genirq/timings: Encapsulate timings pushDaniel Lezcano2019-06-121-9/+12
| | | | | | | | | | | | | | | For the next patches providing the selftest, it is required to artificially insert timings value in the circular buffer in order to check the correctness of the code. Encapsulate the common code between the future test code and the current code with an always-inline tag. No functional change. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: andriy.shevchenko@linux.intel.com Link: https://lkml.kernel.org/r/20190527205521.12091-5-daniel.lezcano@linaro.org
* genirq/timings: Optimize the period detection speedDaniel Lezcano2019-06-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | With a minimal period and if there is a period which is a multiple of it but lesser than the max period then it will be detected before and the minimal period will be never reached. 1 2 1 2 1 2 1 2 1 2 1 2 <-----> <-----> <-----> <-> <-> <-> <-> <-> <-> In that case, the minimum period is 2 and the maximum period is 5. That means all repeating pattern of 2 will be detected as repeating pattern of 4, it is pointless to go up to 2 when searching for the period as it will always fail. Remove one loop iteration by increasing the minimal period to 3. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: andriy.shevchenko@linux.intel.com Link: https://lkml.kernel.org/r/20190527205521.12091-4-daniel.lezcano@linaro.org
* genirq/timings: Fix timings buffer inspectionDaniel Lezcano2019-06-121-5/+18
| | | | | | | | | | | | | | | | | | | | | | | | It appears the index beginning computation is not correct, the current code does: i = (irqts->count & IRQ_TIMINGS_MASK) - 1 If irqts->count is equal to zero, we end up with an index equal to -1, but that does not happen because the function checks against zero before and returns in such case. However, if irqts->count is a multiple of IRQ_TIMINGS_SIZE, the resulting & bit op will be zero and leads also to a -1 index. Re-introduce the iteration loop belonging to the previous variance code which was correct. Fixes: bbba0e7c5cda "genirq/timings: Add array suffix computation code" Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: andriy.shevchenko@linux.intel.com Link: https://lkml.kernel.org/r/20190527205521.12091-3-daniel.lezcano@linaro.org
* genirq/timings: Fix next event index functionDaniel Lezcano2019-06-121-9/+42
| | | | | | | | | | | | | | | | The current code is luckily working with most of the interval samples testing but actually it fails to correctly detect pattern repetition breaking at the end of the buffer. Narrowing down the bug has been a real pain because of the pointers, so the routine is rewrittne by using indexes instead. Fixes: bbba0e7c5cda "genirq/timings: Add array suffix computation code" Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: andriy.shevchenko@linux.intel.com Link: https://lkml.kernel.org/r/20190527205521.12091-2-daniel.lezcano@linaro.org
* genirq/irqdomain: Remove WARN_ON() on out-of-memory conditionGeert Uytterhoeven2019-05-281-1/+1
| | | | | | | | | | | There is no need to print a backtrace when memory allocation fails, as the memory allocation core already takes care of that. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Marc Zyngier <marc.zyngier@arm.com> Link: https://lkml.kernel.org/r/20190527115742.2693-1-geert+renesas@glider.be
* Merge tag 'trace-v5.2-rc1-2' of ↵Linus Torvalds2019-05-263-10/+20
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing warning fix from Steven Rostedt: "Make the GCC 9 warning for sub struct memset go away. GCC 9 now warns about calling memset() on partial structures when it goes across multiple fields. This adds a helper for the place in tracing that does this type of clearing of a structure" * tag 'trace-v5.2-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Silence GCC 9 array bounds warning
| * tracing: Silence GCC 9 array bounds warningMiguel Ojeda2019-05-253-10/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Starting with GCC 9, -Warray-bounds detects cases when memset is called starting on a member of a struct but the size to be cleared ends up writing over further members. Such a call happens in the trace code to clear, at once, all members after and including `seq` on struct trace_iterator: In function 'memset', inlined from 'ftrace_dump' at kernel/trace/trace.c:8914:3: ./include/linux/string.h:344:9: warning: '__builtin_memset' offset [8505, 8560] from the object at 'iter' is out of the bounds of referenced subobject 'seq' with type 'struct trace_seq' at offset 4368 [-Warray-bounds] 344 | return __builtin_memset(p, c, size); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ In order to avoid GCC complaining about it, we compute the address ourselves by adding the offsetof distance instead of referring directly to the member. Since there are two places doing this clear (trace.c and trace_kdb.c), take the chance to move the workaround into a single place in the internal header. Link: http://lkml.kernel.org/r/20190523124535.GA12931@gmail.com Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> [ Removed unnecessary parenthesis around "iter" ] Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
* | Merge tag 'trace-v5.2-rc1' of ↵Linus Torvalds2019-05-252-3/+11
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "Tom Zanussi sent me some small fixes and cleanups to the histogram code and I forgot to incorporate them. I also added a small clean up patch that was sent to me a while ago and I just noticed it" * tag 'trace-v5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: kernel/trace/trace.h: Remove duplicate header of trace_seq.h tracing: Add a check_val() check before updating cond_snapshot() track_val tracing: Check keys for variable references in expressions too tracing: Prevent hist_field_var_ref() from accessing NULL tracing_map_elts
| * kernel/trace/trace.h: Remove duplicate header of trace_seq.hJagadeesh Pagadala2019-05-221-1/+0
| | | | | | | | | | | | | | | | | | | | Remove duplicate header which is included twice. Link: http://lkml.kernel.org/r/1553725186-41442-1-git-send-email-jagdsh.linux@gmail.com Reviewed-by: Mukesh Ojha <mojha@codeaurora.org> Signed-off-by: Jagadeesh Pagadala <jagdsh.linux@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * tracing: Add a check_val() check before updating cond_snapshot() track_valTom Zanussi2019-05-211-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Without this check a snapshot is taken whenever a bucket's max is hit, rather than only when the global max is hit, as it should be. Before: In this example, we do a first run of the workload (cyclictest), examine the output, note the max ('triggering value') (347), then do a second run and note the max again. In this case, the max in the second run (39) is below the max in the first run, but since we haven't cleared the histogram, the first max is still in the histogram and is higher than any other max, so it should still be the max for the snapshot. It isn't however - the value should still be 347 after the second run. # echo 'hist:keys=pid:ts0=common_timestamp.usecs if comm=="cyclictest"' >> /sys/kernel/debug/tracing/events/sched/sched_waking/trigger # echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0:onmax($wakeup_lat).save(next_prio,next_comm,prev_pid,prev_prio,prev_comm):onmax($wakeup_lat).snapshot() if next_comm=="cyclictest"' >> /sys/kernel/debug/tracing/events/sched/sched_switch/trigger # cyclictest -p 80 -n -s -t 2 -D 2 # cat /sys/kernel/debug/tracing/events/sched/sched_switch/hist { next_pid: 2143 } hitcount: 199 max: 44 next_prio: 120 next_comm: cyclictest prev_pid: 0 prev_prio: 120 prev_comm: swapper/4 { next_pid: 2145 } hitcount: 1325 max: 38 next_prio: 19 next_comm: cyclictest prev_pid: 0 prev_prio: 120 prev_comm: swapper/2 { next_pid: 2144 } hitcount: 1982 max: 347 next_prio: 19 next_comm: cyclictest prev_pid: 0 prev_prio: 120 prev_comm: swapper/6 Snapshot taken (see tracing/snapshot). Details: triggering value { onmax($wakeup_lat) }: 347 triggered by event with key: { next_pid: 2144 } # cyclictest -p 80 -n -s -t 2 -D 2 # cat /sys/kernel/debug/tracing/events/sched/sched_switch/hist { next_pid: 2143 } hitcount: 199 max: 44 next_prio: 120 next_comm: cyclictest prev_pid: 0 prev_prio: 120 prev_comm: swapper/4 { next_pid: 2148 } hitcount: 199 max: 16 next_prio: 120 next_comm: cyclictest prev_pid: 0 prev_prio: 120 prev_comm: swapper/1 { next_pid: 2145 } hitcount: 1325 max: 38 next_prio: 19 next_comm: cyclictest prev_pid: 0 prev_prio: 120 prev_comm: swapper/2 { next_pid: 2150 } hitcount: 1326 max: 39 next_prio: 19 next_comm: cyclictest prev_pid: 0 prev_prio: 120 prev_comm: swapper/4 { next_pid: 2144 } hitcount: 1982 max: 347 next_prio: 19 next_comm: cyclictest prev_pid: 0 prev_prio: 120 prev_comm: swapper/6 { next_pid: 2149 } hitcount: 1983 max: 130 next_prio: 19 next_comm: cyclictest prev_pid: 0 prev_prio: 120 prev_comm: swapper/0 Snapshot taken (see tracing/snapshot). Details: triggering value { onmax($wakeup_lat) }: 39 triggered by event with key: { next_pid: 2150 } After: In this example, we do a first run of the workload (cyclictest), examine the output, note the max ('triggering value') (375), then do a second run and note the max again. In this case, the max in the second run is still 375, the highest in any bucket, as it should be. # cyclictest -p 80 -n -s -t 2 -D 2 # cat /sys/kernel/debug/tracing/events/sched/sched_switch/hist { next_pid: 2072 } hitcount: 200 max: 28 next_prio: 120 next_comm: cyclictest prev_pid: 0 prev_prio: 120 prev_comm: swapper/5 { next_pid: 2074 } hitcount: 1323 max: 375 next_prio: 19 next_comm: cyclictest prev_pid: 0 prev_prio: 120 prev_comm: swapper/2 { next_pid: 2073 } hitcount: 1980 max: 153 next_prio: 19 next_comm: cyclictest prev_pid: 0 prev_prio: 120 prev_comm: swapper/6 Snapshot taken (see tracing/snapshot). Details: triggering value { onmax($wakeup_lat) }: 375 triggered by event with key: { next_pid: 2074 } # cyclictest -p 80 -n -s -t 2 -D 2 # cat /sys/kernel/debug/tracing/events/sched/sched_switch/hist { next_pid: 2101 } hitcount: 199 max: 49 next_prio: 120 next_comm: cyclictest prev_pid: 0 prev_prio: 120 prev_comm: swapper/6 { next_pid: 2072 } hitcount: 200 max: 28 next_prio: 120 next_comm: cyclictest prev_pid: 0 prev_prio: 120 prev_comm: swapper/5 { next_pid: 2074 } hitcount: 1323 max: 375 next_prio: 19 next_comm: cyclictest prev_pid: 0 prev_prio: 120 prev_comm: swapper/2 { next_pid: 2103 } hitcount: 1325 max: 74 next_prio: 19 next_comm: cyclictest prev_pid: 0 prev_prio: 120 prev_comm: swapper/4 { next_pid: 2073 } hitcount: 1980 max: 153 next_prio: 19 next_comm: cyclictest prev_pid: 0 prev_prio: 120 prev_comm: swapper/6 { next_pid: 2102 } hitcount: 1981 max: 84 next_prio: 19 next_comm: cyclictest prev_pid: 12 prev_prio: 120 prev_comm: kworker/0:1 Snapshot taken (see tracing/snapshot). Details: triggering value { onmax($wakeup_lat) }: 375 triggered by event with key: { next_pid: 2074 } Link: http://lkml.kernel.org/r/95958351329f129c07504b4d1769c47a97b70d65.1555597045.git.tom.zanussi@linux.intel.com Cc: stable@vger.kernel.org Fixes: a3785b7eca8fd ("tracing: Add hist trigger snapshot() action") Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * tracing: Check keys for variable references in expressions tooTom Zanussi2019-05-211-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There's an existing check for variable references in keys, but it doesn't go far enough. It checks whether a key field is a variable reference but doesn't check whether it's an expression containing variable references, which can cause the same problems for callers. Use the existing field_has_hist_vars() function rather than a direct top-level flag check to catch all possible variable references. Link: http://lkml.kernel.org/r/e8c3d3d53db5ca90ceea5a46e5413103a6902fc7.1555597045.git.tom.zanussi@linux.intel.com Cc: stable@vger.kernel.org Fixes: 067fe038e70f6 ("tracing: Add variable reference handling to hist triggers") Reported-by: Vincent Bernat <vincent@bernat.ch> Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * tracing: Prevent hist_field_var_ref() from accessing NULL tracing_map_eltsTom Zanussi2019-05-211-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | hist_field_var_ref() is an implementation of hist_field_fn_t(), which can be called with a null tracing_map_elt elt param when assembling a key in event_hist_trigger(). In the case of hist_field_var_ref() this doesn't make sense, because a variable can only be resolved by looking it up using an already assembled key i.e. a variable can't be used to assemble a key since the key is required in order to access the variable. Upper layers should prevent the user from constructing a key using a variable in the first place, but in case one slips through, it shouldn't cause a NULL pointer dereference. Also if one does slip through, we want to know about it, so emit a one-time warning in that case. Link: http://lkml.kernel.org/r/64ec8dc15c14d305295b64cdfcc6b2b9dd14753f.1555597045.git.tom.zanussi@linux.intel.com Reported-by: Vincent Bernat <vincent@bernat.ch> Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
* | Merge tag 'spdx-5.2-rc2-2' of ↵Linus Torvalds2019-05-244-17/+4
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pule more SPDX updates from Greg KH: "Here is another set of reviewed patches that adds SPDX tags to different kernel files, based on a set of rules that are being used to parse the comments to try to determine that the license of the file is "GPL-2.0-or-later". Only the "obvious" versions of these matches are included here, a number of "non-obvious" variants of text have been found but those have been postponed for later review and analysis. These patches have been out for review on the linux-spdx@vger mailing list, and while they were created by automatic tools, they were hand-verified by a bunch of different people, all whom names are on the patches are reviewers" * tag 'spdx-5.2-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (85 commits) treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 125 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 123 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 122 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 121 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 120 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 119 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 118 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 116 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 114 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 113 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 112 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 111 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 110 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 106 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 105 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 104 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 103 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 102 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 101 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 98 ...
| * | treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 38Thomas Gleixner2019-05-241-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Based on 1 normalized pattern(s): this file is released under the gplv2 and any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 1 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190520170857.732920462@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 36Thomas Gleixner2019-05-243-15/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public licence as published by the free software foundation either version 2 of the licence or at your option any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 114 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190520170857.552531963@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | | locking/lock_events: Use this_cpu_add() when necessaryWaiman Long2019-05-241-2/+40
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The kernel test robot has reported that the use of __this_cpu_add() causes bug messages like: BUG: using __this_cpu_add() in preemptible [00000000] code: ... Given the imprecise nature of the count and the possibility of resetting the count and doing the measurement again, this is not really a big problem to use the unprotected __this_cpu_*() functions. To make the preemption checking code happy, the this_cpu_*() functions will be used if CONFIG_DEBUG_PREEMPT is defined. The imprecise nature of the locking counts are also documented with the suggestion that we should run the measurement a few times with the counts reset in between to get a better picture of what is going on under the hood. Fixes: a8654596f0371 ("locking/rwsem: Enable lock event counting") Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge tag 'spdx-5.2-rc2' of ↵Linus Torvalds2019-05-2170-112/+70
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull SPDX update from Greg KH: "Here is a series of patches that add SPDX tags to different kernel files, based on two different things: - SPDX entries are added to a bunch of files that we missed a year ago that do not have any license information at all. These were either missed because the tool saw the MODULE_LICENSE() tag, or some EXPORT_SYMBOL tags, and got confused and thought the file had a real license, or the files have been added since the last big sweep, or they were Makefile/Kconfig files, which we didn't touch last time. - Add GPL-2.0-only or GPL-2.0-or-later tags to files where our scan tools can determine the license text in the file itself. Where this happens, the license text is removed, in order to cut down on the 700+ different ways we have in the kernel today, in a quest to get rid of all of these. These patches have been out for review on the linux-spdx@vger mailing list, and while they were created by automatic tools, they were hand-verified by a bunch of different people, all whom names are on the patches are reviewers. The reason for these "large" patches is if we were to continue to progress at the current rate of change in the kernel, adding license tags to individual files in different subsystems, we would be finished in about 10 years at the earliest. There will be more series of these types of patches coming over the next few weeks as the tools and reviewers crunch through the more "odd" variants of how to say "GPLv2" that developers have come up with over the years, combined with other fun oddities (GPL + a BSD disclaimer?) that are being unearthed, with the goal for the whole kernel to be cleaned up. These diffstats are not small, 3840 files are touched, over 10k lines removed in just 24 patches" * tag 'spdx-5.2-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (24 commits) treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 25 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 24 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 23 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 22 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 21 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 20 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 19 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 18 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 17 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 15 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 14 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 13 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 12 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 11 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 10 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 9 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 7 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 5 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 4 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 3 ...
| * | treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 25Thomas Gleixner2019-05-212-20/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version this program is distributed in the hope that it would be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 6 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Steve Winslow <swinslow@gmail.com> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Jilayne Lovejoy <opensource@jilayne.com> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190519154043.007767574@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 13Thomas Gleixner2019-05-216-78/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Based on 2 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program if not see http www gnu org licenses this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details [based] [from] [clk] [highbank] [c] you should have received a copy of the gnu general public license along with this program if not see http www gnu org licenses extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 355 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Jilayne Lovejoy <opensource@jilayne.com> Reviewed-by: Steve Winslow <swinslow@gmail.com> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190519154041.837383322@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 9Thomas Gleixner2019-05-211-14/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program if not you can access it online at http www gnu org licenses gpl 2 0 html extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 1 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Jilayne Lovejoy <opensource@jilayne.com> Reviewed-by: Steve Winslow <swinslow@gmail.com> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190519154041.430943677@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | treewide: Add SPDX license identifier - Makefile/KconfigThomas Gleixner2019-05-2116-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add SPDX license identifiers to all Make/Kconfig files which: - Have no license information of any form These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is: GPL-2.0-only Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | treewide: Add SPDX license identifier for missed filesThomas Gleixner2019-05-2145-0/+45
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add SPDX license identifiers to all files which: - Have no license information of any form - Have EXPORT_.*_SYMBOL_GPL inside which was used in the initial scan/conversion to ignore the file These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is: GPL-2.0-only Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds2019-05-205-9/+29
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking fixes from David Miller:1) Use after free in __dev_map_entry_free(), from Eric Dumazet. 1) Use after free in __dev_map_entry_free(), from Eric Dumazet. 2) Fix TCP retransmission timestamps on passive Fast Open, from Yuchung Cheng. 3) Orphan NFC, we'll take the patches directly into my tree. From Johannes Berg. 4) We can't recycle cloned TCP skbs, from Eric Dumazet. 5) Some flow dissector bpf test fixes, from Stanislav Fomichev. 6) Fix RCU marking and warnings in rhashtable, from Herbert Xu. 7) Fix some potential fib6 leaks, from Eric Dumazet. 8) Fix a _decode_session4 uninitialized memory read bug fix that got lost in a merge. From Florian Westphal. 9) Fix ipv6 source address routing wrt. exception route entries, from Wei Wang. 10) The netdev_xmit_more() conversion was not done %100 properly in mlx5 driver, fix from Tariq Toukan. 11) Clean up botched merge on netfilter kselftest, from Florian Westphal. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (74 commits) of_net: fix of_get_mac_address retval if compiled without CONFIG_OF net: fix kernel-doc warnings for socket.c net: Treat sock->sk_drops as an unsigned int when printing kselftests: netfilter: fix leftover net/net-next merge conflict mlxsw: core: Prevent reading unsupported slave address from SFP EEPROM mlxsw: core: Prevent QSFP module initialization for old hardware vsock/virtio: Initialize core virtio vsock before registering the driver net/mlx5e: Fix possible modify header actions memory leak net/mlx5e: Fix no rewrite fields with the same match net/mlx5e: Additional check for flow destination comparison net/mlx5e: Add missing ethtool driver info for representors net/mlx5e: Fix number of vports for ingress ACL configuration net/mlx5e: Fix ethtool rxfh commands when CONFIG_MLX5_EN_RXNFC is disabled net/mlx5e: Fix wrong xmit_more application net/mlx5: Fix peer pf disable hca command net/mlx5: E-Switch, Correct type to u16 for vport_num and int for vport_index net/mlx5: Add meaningful return codes to status_to_err function net/mlx5: Imply MLXFW in mlx5_core Revert "tipc: fix modprobe tipc failed after switch order of device registration" vsock/virtio: free packets during the socket release ...
| * bpf: relax inode permission check for retrieving bpf programChenbo Feng2019-05-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | For iptable module to load a bpf program from a pinned location, it only retrieve a loaded program and cannot change the program content so requiring a write permission for it might not be necessary. Also when adding or removing an unrelated iptable rule, it might need to flush and reload the xt_bpf related rules as well and triggers the inode permission check. It might be better to remove the write premission check for the inode so we won't need to grant write access to all the processes that flush and restore iptables rules. Signed-off-by: Chenbo Feng <fengc@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| * Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller2019-05-154-8/+28
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Daniel Borkmann says: ==================== pull-request: bpf 2019-05-16 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Fix a use after free in __dev_map_entry_free(), from Eric. 2) Several sockmap related bug fixes: a splat in strparser if it was never initialized, remove duplicate ingress msg list purging which can race, fix msg->sg.size accounting upon skb to msg conversion, and last but not least fix a timeout bug in tcp_bpf_wait_data(), from John. 3) Fix LRU map to avoid messing with eviction heuristics upon syscall lookup, e.g. map walks from user space side will then lead to eviction of just recently created entries on updates as it would mark all map entries, from Daniel. 4) Don't bail out when libbpf feature probing fails. Also various smaller fixes to flow_dissector test, from Stanislav. 5) Fix missing brackets for BTF_INT_OFFSET() in UAPI, from Gary. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * bpf, lru: avoid messing with eviction heuristics upon syscall lookupDaniel Borkmann2019-05-141-5/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | One of the biggest issues we face right now with picking LRU map over regular hash table is that a map walk out of user space, for example, to just dump the existing entries or to remove certain ones, will completely mess up LRU eviction heuristics and wrong entries such as just created ones will get evicted instead. The reason for this is that we mark an entry as "in use" via bpf_lru_node_set_ref() from system call lookup side as well. Thus upon walk, all entries are being marked, so information of actual least recently used ones are "lost". In case of Cilium where it can be used (besides others) as a BPF based connection tracker, this current behavior causes disruption upon control plane changes that need to walk the map from user space to evict certain entries. Discussion result from bpfconf [0] was that we should simply just remove marking from system call side as no good use case could be found where it's actually needed there. Therefore this patch removes marking for regular LRU and per-CPU flavor. If there ever should be a need in future, the behavior could be selected via map creation flag, but due to mentioned reason we avoid this here. [0] http://vger.kernel.org/bpfconf.html Fixes: 29ba732acbee ("bpf: Add BPF_MAP_TYPE_LRU_HASH") Fixes: 8f8449384ec3 ("bpf: Add BPF_MAP_TYPE_LRU_PERCPU_HASH") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| | * bpf: add map_lookup_elem_sys_only for lookups from syscall sideDaniel Borkmann2019-05-141-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a callback map_lookup_elem_sys_only() that map implementations could use over map_lookup_elem() from system call side in case the map implementation needs to handle the latter differently than from the BPF data path. If map_lookup_elem_sys_only() is set, this will be preferred pick for map lookups out of user space. This hook is used in a follow-up fix for LRU map, but once development window opens, we can convert other map types from map_lookup_elem() (here, the one called upon BPF_MAP_LOOKUP_ELEM cmd is meant) over to use the callback to simplify and clean up the latter. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| | * bpf: mark bpf_event_notify and bpf_event_init as staticStanislav Fomichev2019-05-141-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | Both of them are not declared in the headers and not used outside of bpf_trace.c file. Fixes: a38d1107f937c ("bpf: support raw tracepoints in modules") Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| | * bpf: devmap: fix use-after-free Read in __dev_map_entry_freeEric Dumazet2019-05-141-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | synchronize_rcu() is fine when the rcu callbacks only need to free memory (kfree_rcu() or direct kfree() call rcu call backs) __dev_map_entry_free() is a bit more complex, so we need to make sure that call queued __dev_map_entry_free() callbacks have completed. sysbot report: BUG: KASAN: use-after-free in dev_map_flush_old kernel/bpf/devmap.c:365 [inline] BUG: KASAN: use-after-free in __dev_map_entry_free+0x2a8/0x300 kernel/bpf/devmap.c:379 Read of size 8 at addr ffff8801b8da38c8 by task ksoftirqd/1/18 CPU: 1 PID: 18 Comm: ksoftirqd/1 Not tainted 4.17.0+ #39 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1b9/0x294 lib/dump_stack.c:113 print_address_description+0x6c/0x20b mm/kasan/report.c:256 kasan_report_error mm/kasan/report.c:354 [inline] kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433 dev_map_flush_old kernel/bpf/devmap.c:365 [inline] __dev_map_entry_free+0x2a8/0x300 kernel/bpf/devmap.c:379 __rcu_reclaim kernel/rcu/rcu.h:178 [inline] rcu_do_batch kernel/rcu/tree.c:2558 [inline] invoke_rcu_callbacks kernel/rcu/tree.c:2818 [inline] __rcu_process_callbacks kernel/rcu/tree.c:2785 [inline] rcu_process_callbacks+0xe9d/0x1760 kernel/rcu/tree.c:2802 __do_softirq+0x2e0/0xaf5 kernel/softirq.c:284 run_ksoftirqd+0x86/0x100 kernel/softirq.c:645 smpboot_thread_fn+0x417/0x870 kernel/smpboot.c:164 kthread+0x345/0x410 kernel/kthread.c:240 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412 Allocated by task 6675: save_stack+0x43/0xd0 mm/kasan/kasan.c:448 set_track mm/kasan/kasan.c:460 [inline] kasan_kmalloc+0xc4/0xe0 mm/kasan/kasan.c:553 kmem_cache_alloc_trace+0x152/0x780 mm/slab.c:3620 kmalloc include/linux/slab.h:513 [inline] kzalloc include/linux/slab.h:706 [inline] dev_map_alloc+0x208/0x7f0 kernel/bpf/devmap.c:102 find_and_alloc_map kernel/bpf/syscall.c:129 [inline] map_create+0x393/0x1010 kernel/bpf/syscall.c:453 __do_sys_bpf kernel/bpf/syscall.c:2351 [inline] __se_sys_bpf kernel/bpf/syscall.c:2328 [inline] __x64_sys_bpf+0x303/0x510 kernel/bpf/syscall.c:2328 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe Freed by task 26: save_stack+0x43/0xd0 mm/kasan/kasan.c:448 set_track mm/kasan/kasan.c:460 [inline] __kasan_slab_free+0x11a/0x170 mm/kasan/kasan.c:521 kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528 __cache_free mm/slab.c:3498 [inline] kfree+0xd9/0x260 mm/slab.c:3813 dev_map_free+0x4fa/0x670 kernel/bpf/devmap.c:191 bpf_map_free_deferred+0xba/0xf0 kernel/bpf/syscall.c:262 process_one_work+0xc64/0x1b70 kernel/workqueue.c:2153 worker_thread+0x181/0x13a0 kernel/workqueue.c:2296 kthread+0x345/0x410 kernel/kthread.c:240 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412 The buggy address belongs to the object at ffff8801b8da37c0 which belongs to the cache kmalloc-512 of size 512 The buggy address is located 264 bytes inside of 512-byte region [ffff8801b8da37c0, ffff8801b8da39c0) The buggy address belongs to the page: page:ffffea0006e368c0 count:1 mapcount:0 mapping:ffff8801da800940 index:0xffff8801b8da3540 flags: 0x2fffc0000000100(slab) raw: 02fffc0000000100 ffffea0007217b88 ffffea0006e30cc8 ffff8801da800940 raw: ffff8801b8da3540 ffff8801b8da3040 0000000100000004 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff8801b8da3780: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb ffff8801b8da3800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > ffff8801b8da3880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff8801b8da3900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8801b8da3980: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc Fixes: 546ac1ffb70d ("bpf: add devmap, a map for storing net device references") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot+457d3e2ffbcf31aee5c0@syzkaller.appspotmail.com Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* | | Merge branch 'akpm' (patches from Andrew)Linus Torvalds2019-05-192-2/+16
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge yet more updates from Andrew Morton: "A few final bits: - large changes to vmalloc, yielding large performance benefits - tweak the console-flush-on-panic code - a few fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: panic: add an option to replay all the printk message in buffer initramfs: don't free a non-existent initrd fs/writeback.c: use rcu_barrier() to wait for inflight wb switches going into workqueue when umount mm/compaction.c: correct zone boundary handling when isolating pages from a pageblock mm/vmap: add DEBUG_AUGMENT_LOWEST_MATCH_CHECK macro mm/vmap: add DEBUG_AUGMENT_PROPAGATE_CHECK macro mm/vmalloc.c: keep track of free blocks for vmap allocation
| * | | panic: add an option to replay all the printk message in bufferFeng Tang2019-05-182-2/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently on panic, kernel will lower the loglevel and print out pending printk msg only with console_flush_on_panic(). Add an option for users to configure the "panic_print" to replay all dmesg in buffer, some of which they may have never seen due to the loglevel setting, which will help panic debugging . [feng.tang@intel.com: keep the original console_flush_on_panic() inside panic()] Link: http://lkml.kernel.org/r/1556199137-14163-1-git-send-email-feng.tang@intel.com [feng.tang@intel.com: use logbuf lock to protect the console log index] Link: http://lkml.kernel.org/r/1556269868-22654-1-git-send-email-feng.tang@intel.com Link: http://lkml.kernel.org/r/1556095872-36838-1-git-send-email-feng.tang@intel.com Signed-off-by: Feng Tang <feng.tang@intel.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Cc: Aaro Koskinen <aaro.koskinen@nokia.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: Kees Cook <keescook@chromium.org> Cc: Borislav Petkov <bp@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds2019-05-193-1/+31
|\ \ \ \ | |/ / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull IRQ chip updates from Ingo Molnar: "A late irqchips update: - New TI INTR/INTA set of drivers - Rewrite of the stm32mp1-exti driver as a platform driver - Update the IOMMU MSI mapping API to be RT friendly - A number of cleanups and other low impact fixes" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits) iommu/dma-iommu: Remove iommu_dma_map_msi_msg() irqchip/gic-v3-mbi: Don't map the MSI page in mbi_compose_m{b, s}i_msg() irqchip/ls-scfg-msi: Don't map the MSI page in ls_scfg_msi_compose_msg() irqchip/gic-v3-its: Don't map the MSI page in its_irq_compose_msi_msg() irqchip/gicv2m: Don't map the MSI page in gicv2m_compose_msi_msg() iommu/dma-iommu: Split iommu_dma_map_msi_msg() in two parts genirq/msi: Add a new field in msi_desc to store an IOMMU cookie arm64: arch_k3: Enable interrupt controller drivers irqchip/ti-sci-inta: Add msi domain support soc: ti: Add MSI domain bus support for Interrupt Aggregator irqchip/ti-sci-inta: Add support for Interrupt Aggregator driver dt-bindings: irqchip: Introduce TISCI Interrupt Aggregator bindings irqchip/ti-sci-intr: Add support for Interrupt Router driver dt-bindings: irqchip: Introduce TISCI Interrupt router bindings gpio: thunderx: Use the default parent apis for {request,release}_resources genirq: Introduce irq_chip_{request,release}_resource_parent() apis firmware: ti_sci: Add helper apis to manage resources firmware: ti_sci: Add RM mapping table for am654 firmware: ti_sci: Add support for IRQ management firmware: ti_sci: Add support for RM core ops ...
| * | | Merge tag 'irqchip-5.2' of ↵Thomas Gleixner2019-05-0625-152/+295
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core Pull irqchip updates from Marc Zyngier - The huge (and terrifying) TI INTR/INTA set of drivers - Rewrite of the stm32mp1-exti driver as a platform driver - Update the IOMMU MSI mapping API to be RT friendly - A number of cleanups and other low impact fixes
| | * | | genirq/msi: Add a new field in msi_desc to store an IOMMU cookieJulien Grall2019-05-031-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When an MSI doorbell is located downstream of an IOMMU, it is required to swizzle the physical address with an appropriately-mapped IOVA for any device attached to one of our DMA ops domain. At the moment, the allocation of the mapping may be done when composing the message. However, the composing may be done in non-preemtible context while the allocation requires to be called from preemptible context. A follow-up change will split the current logic in two functions requiring to keep an IOMMU cookie per MSI. A new field is introduced in msi_desc to store an IOMMU cookie. As the cookie may not be required in some configuration, the field is protected under a new config CONFIG_IRQ_MSI_IOMMU. A pair of helpers has also been introduced to access the field. Signed-off-by: Julien Grall <julien.grall@arm.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| | * | | genirq: Introduce irq_chip_{request,release}_resource_parent() apisLokesh Vutla2019-05-011-0/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce irq_chip_{request,release}_resource_parent() apis so that these can be used in hierarchical irqchips. Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| | * | | irq/irqdomain: Fix typo in the comment on top of __irq_domain_alloc_irqs()Julien Grall2019-04-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The word 'number' has been misspelt in the comment on top of _irq_domain_alloc_irqs(). Signed-off-by: Julien Grall <julien.grall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
* | | | | Merge branch 'for-5.2-fixes' of ↵Linus Torvalds2019-05-161-0/+1
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fix from Tejun Heo: "The cgroup2 freezer pulled in this cycle broke strace. This pull request includes a workaround for the problem. It's not a complete fix in that it may cause spurious frozen state flip-flops which is fairly minor. Will push a full fix once it's ready" * 'for-5.2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: signal: unconditionally leave the frozen state in ptrace_stop()
| * | | | | signal: unconditionally leave the frozen state in ptrace_stop()Roman Gushchin2019-05-161-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Alex Xu reported a regression in strace, caused by the introduction of the cgroup v2 freezer. The regression can be reproduced by stracing the following simple program: #include <unistd.h> int main() { write(1, "a", 1); return 0; } An attempt to run strace ./a.out leads to the infinite loop: [ pre-main omitted ] write(1, "a", 1) = ? ERESTARTSYS (To be restarted if SA_RESTART is set) write(1, "a", 1) = ? ERESTARTSYS (To be restarted if SA_RESTART is set) write(1, "a", 1) = ? ERESTARTSYS (To be restarted if SA_RESTART is set) write(1, "a", 1) = ? ERESTARTSYS (To be restarted if SA_RESTART is set) write(1, "a", 1) = ? ERESTARTSYS (To be restarted if SA_RESTART is set) write(1, "a", 1) = ? ERESTARTSYS (To be restarted if SA_RESTART is set) [ repeats forever ] The problem occurs because the traced task leaves ptrace_stop() (and the signal handling loop) with the frozen bit set. So let's call cgroup_leave_frozen(true) unconditionally after sleeping in ptrace_stop(). With this patch applied, strace works as expected: [ pre-main omitted ] write(1, "a", 1) = 1 exit_group(0) = ? +++ exited with 0 +++ Reported-by: Alex Xu <alex_y_xu@yahoo.ca> Fixes: 76f969e8948d ("cgroup: cgroup v2 freezer") Signed-off-by: Roman Gushchin <guro@fb.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Tejun Heo <tj@kernel.org>
* | | | | | Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds2019-05-161-1/+1
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull time fixes from Ingo Molnar: "A TIA adjtimex interface extension, and a POSIX compliance ABI fix for timespec64 users" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: ntp: Allow TAI-UTC offset to be set to zero y2038: Make CONFIG_64BIT_TIME unconditional
| * | | | | | ntp: Allow TAI-UTC offset to be set to zeroMiroslav Lichvar2019-05-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ADJ_TAI adjtimex mode sets the TAI-UTC offset of the system clock. It is typically set by NTP/PTP implementations and it is automatically updated by the kernel on leap seconds. The initial value is zero (which applications may interpret as unknown), but this value cannot be set by adjtimex. This limitation seems to go back to the original "nanokernel" implementation by David Mills. Change the ADJ_TAI check to accept zero as a valid TAI-UTC offset in order to allow setting it back to the initial value. Fixes: 153b5d054ac2 ("ntp: support for TAI") Suggested-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: John Stultz <john.stultz@linaro.org> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Link: https://lkml.kernel.org/r/20190417084833.7401-1-mlichvar@redhat.com
* | | | | | | Merge branch 'locking-urgent-for-linus' of ↵Linus Torvalds2019-05-161-15/+31
|\ \ \ \ \ \ \ | |_|/ / / / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fix from Ingo Molnar: "A single rwsem fix" * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/rwsem: Prevent decrement of reader count before increment
| * | | | | | locking/rwsem: Prevent decrement of reader count before incrementlocking/urgentlocking-urgent-for-linusWaiman Long2019-05-071-15/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During my rwsem testing, it was found that after a down_read(), the reader count may occasionally become 0 or even negative. Consequently, a writer may steal the lock at that time and execute with the reader in parallel thus breaking the mutual exclusion guarantee of the write lock. In other words, both readers and writer can become rwsem owners simultaneously. The current reader wakeup code does it in one pass to clear waiter->task and put them into wake_q before fully incrementing the reader count. Once waiter->task is cleared, the corresponding reader may see it, finish the critical section and do unlock to decrement the count before the count is incremented. This is not a problem if there is only one reader to wake up as the count has been pre-incremented by 1. It is a problem if there are more than one readers to be woken up and writer can steal the lock. The wakeup was actually done in 2 passes before the following v4.9 commit: 70800c3c0cc5 ("locking/rwsem: Scan the wait_list for readers only once") To fix this problem, the wakeup is now done in two passes again. In the first pass, we collect the readers and count them. The reader count is then fully incremented. In the second pass, the waiter->task is then cleared and they are put into wake_q to be woken up later. Signed-off-by: Waiman Long <longman@redhat.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Will Deacon <will.deacon@arm.com> Cc: huang ying <huang.ying.caritas@gmail.com> Fixes: 70800c3c0cc5 ("locking/rwsem: Scan the wait_list for readers only once") Link: http://lkml.kernel.org/r/20190428212557.13482-2-longman@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | | | | | | Merge tag 'trace-v5.2' of ↵Linus Torvalds2019-05-1519-400/+998
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "The major changes in this tracing update includes: - Removal of non-DYNAMIC_FTRACE from 32bit x86 - Removal of mcount support from x86 - Emulating a call from int3 on x86_64, fixes live kernel patching - Consolidated Tracing Error logs file Minor updates: - Removal of klp_check_compiler_support() - kdb ftrace dumping output changes - Accessing and creating ftrace instances from inside the kernel - Clean up of #define if macro - Introduction of TRACE_EVENT_NOP() to disable trace events based on config options And other minor fixes and clean ups" * tag 'trace-v5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (44 commits) x86: Hide the int3_emulate_call/jmp functions from UML livepatch: Remove klp_check_compiler_support() ftrace/x86: Remove mcount support ftrace/x86_32: Remove support for non DYNAMIC_FTRACE tracing: Simplify "if" macro code tracing: Fix documentation about disabling options using trace_options tracing: Replace kzalloc with kcalloc tracing: Fix partial reading of trace event's id file tracing: Allow RCU to run between postponed startup tests tracing: Fix white space issues in parse_pred() function tracing: Eliminate const char[] auto variables ring-buffer: Fix mispelling of Calculate tracing: probeevent: Fix to make the type of $comm string tracing: probeevent: Do not accumulate on ret variable tracing: uprobes: Re-enable $comm support for uprobe events ftrace/x86_64: Emulate call function while updating in breakpoint handler x86_64: Allow breakpoints to emulate call instructions x86_64: Add gap to int3 to allow for call emulation tracing: kdb: Allow ftdump to skip all but the last few entries tracing: Add trace_total_entries() / trace_total_entries_cpu() ...
| * | | | | | | livepatch: Remove klp_check_compiler_support()Jiri Kosina2019-05-101-8/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The only purpose of klp_check_compiler_support() is to make sure that we are not using ftrace on x86 via mcount (because that's executed only after prologue has already happened, and that's too late for livepatching purposes). Now that mcount is not supported by ftrace any more, there is no need for klp_check_compiler_support() either. Link: http://lkml.kernel.org/r/nycvar.YFH.7.76.1905102346100.17054@cbobk.fhfr.pm Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>