From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from lists.gentoo.org (pigeon.gentoo.org [208.92.234.80])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	key-exchange X25519 server-signature RSA-PSS (2048 bits))
	(No client certificate requested)
	by finch.gentoo.org (Postfix) with ESMTPS id 7FDE015800A
	for ; Wed, 30 Aug 2023 14:45:25 +0000 (UTC)
Received: from pigeon.gentoo.org (localhost [127.0.0.1])
	by pigeon.gentoo.org (Postfix) with SMTP id A06182BC01E;
	Wed, 30 Aug 2023 14:45:24 +0000 (UTC)
Received: from smtp.gentoo.org (smtp.gentoo.org [IPv6:2001:470:ea4a:1:5054:ff:fec7:86e4])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	key-exchange X25519 server-signature RSA-PSS (4096 bits))
	(No client certificate requested)
	by pigeon.gentoo.org (Postfix) with ESMTPS id 5E2432BC01E
	for ; Wed, 30 Aug 2023 14:45:24 +0000 (UTC)
Received: from oystercatcher.gentoo.org (oystercatcher.gentoo.org [148.251.78.52])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	key-exchange X25519 server-signature RSA-PSS (4096 bits))
	(No client certificate requested)
	by smtp.gentoo.org (Postfix) with ESMTPS id 20C25335D44
	for ; Wed, 30 Aug 2023 14:45:23 +0000 (UTC)
Received: from localhost.localdomain (localhost [IPv6:::1])
	by oystercatcher.gentoo.org (Postfix) with ESMTP id BA849BDC
	for ; Wed, 30 Aug 2023 14:45:21 +0000 (UTC)
From: "Mike Pagano"
To: gentoo-commits@lists.gentoo.org
Content-Transfer-Encoding: 8bit
Content-type: text/plain; charset=UTF-8
Reply-To: gentoo-dev@lists.gentoo.org, "Mike Pagano"
Message-ID: <1693406710.d715bcd9f1450fbfe6cac0fd619d7aa6f91e2cbc.mpagano@gentoo>
Subject: [gentoo-commits] proj/linux-patches:5.15 commit in: /
X-VCS-Repository: proj/linux-patches
X-VCS-Files: 0000_README 1128_linux-5.15.129.patch
X-VCS-Directories: /
X-VCS-Committer: mpagano
X-VCS-Committer-Name: Mike Pagano
X-VCS-Revision: d715bcd9f1450fbfe6cac0fd619d7aa6f91e2cbc
X-VCS-Branch: 5.15
Date: Wed, 30 Aug 2023 14:45:21 +0000 (UTC)
Precedence: bulk
List-Post:
List-Help:
List-Unsubscribe:
List-Subscribe:
List-Id: Gentoo Linux mail
X-BeenThere: gentoo-commits@lists.gentoo.org
X-Auto-Response-Suppress: DR, RN, NRN, OOF, AutoReply
X-Archives-Salt: aa077029-fc71-4f72-acd5-12e6819b3ca9
X-Archives-Hash: a4f4a03b9e29603f9fed0ba1b572065e

commit:     d715bcd9f1450fbfe6cac0fd619d7aa6f91e2cbc
Author:     Mike Pagano  gentoo org>
AuthorDate: Wed Aug 30 14:45:10 2023 +0000
Commit:     Mike Pagano  gentoo org>
CommitDate: Wed Aug 30 14:45:10 2023 +0000
URL:        https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=d715bcd9

Linux patch 5.15.129

Signed-off-by: Mike Pagano  gentoo.org>

 0000_README               |    4 +
 1128_linux-5.15.129.patch | 4490 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 4494 insertions(+)

diff --git a/0000_README b/0000_README
index 6ffafb6e..b875c3b0 100644
--- a/0000_README
+++ b/0000_README
@@ -555,6 +555,10 @@ Patch: 1127_linux-5.15.128.patch
 From: https://www.kernel.org
 Desc: Linux 5.15.128
 
+Patch: 1128_linux-5.15.129.patch
+From: https://www.kernel.org
+Desc: Linux 5.15.129
+
 Patch: 1500_XATTR_USER_PREFIX.patch
 From: https://bugs.gentoo.org/show_bug.cgi?id=470644
 Desc: Support for namespace user.pax.* on tmpfs.
diff --git a/1128_linux-5.15.129.patch b/1128_linux-5.15.129.patch new file mode 100644 index 00000000..a88aa282 --- /dev/null +++ b/1128_linux-5.15.129.patch @@ -0,0 +1,4490 @@ +diff --git a/Makefile b/Makefile +index 2b5160227e820..96c3f014094b2 100644 +--- a/Makefile ++++ b/Makefile +@@ -1,7 +1,7 @@ + # SPDX-License-Identifier: GPL-2.0 + VERSION = 5 + PATCHLEVEL = 15 +-SUBLEVEL = 128 ++SUBLEVEL = 129 + EXTRAVERSION = + NAME = Trick or Treat + +diff --git a/arch/mips/include/asm/cpu-features.h b/arch/mips/include/asm/cpu-features.h +index 3d71081afc55f..e69833213e792 100644 +--- a/arch/mips/include/asm/cpu-features.h ++++ b/arch/mips/include/asm/cpu-features.h +@@ -124,7 +124,24 @@ + #define cpu_has_tx39_cache __opt(MIPS_CPU_TX39_CACHE) + #endif + #ifndef cpu_has_octeon_cache +-#define cpu_has_octeon_cache 0 ++#define cpu_has_octeon_cache \ ++({ \ ++ int __res; \ ++ \ ++ switch (boot_cpu_type()) { \ ++ case CPU_CAVIUM_OCTEON: \ ++ case CPU_CAVIUM_OCTEON_PLUS: \ ++ case CPU_CAVIUM_OCTEON2: \ ++ case CPU_CAVIUM_OCTEON3: \ ++ __res = 1; \ ++ break; \ ++ \ ++ default: \ ++ __res = 0; \ ++ } \ ++ \ ++ __res; \ ++}) + #endif + /* Don't override `cpu_has_fpu' to 1 or the "nofpu" option won't work. */ + #ifndef cpu_has_fpu +@@ -351,7 +368,7 @@ + ({ \ + int __res; \ + \ +- switch (current_cpu_type()) { \ ++ switch (boot_cpu_type()) { \ + case CPU_M14KC: \ + case CPU_74K: \ + case CPU_1074K: \ +diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h +index d4201fb2c46d3..6a6f741edda36 100644 +--- a/arch/x86/include/asm/fpu/internal.h ++++ b/arch/x86/include/asm/fpu/internal.h +@@ -416,8 +416,7 @@ DECLARE_PER_CPU(struct fpu *, fpu_fpregs_owner_ctx); + * FPU state for a task MUST let the rest of the kernel know that the + * FPU registers are no longer valid for this task. + * +- * Either one of these invalidation functions is enough. Invalidate +- * a resource you control: CPU if using the CPU for something else ++ * Invalidate a resource you control: CPU if using the CPU for something else + * (with preemption disabled), FPU for the current task, or a task that + * is prevented from running by the current task. + */ +diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c +index 759e1cef5e695..3ad1bf5de7373 100644 +--- a/arch/x86/kernel/fpu/core.c ++++ b/arch/x86/kernel/fpu/core.c +@@ -330,7 +330,7 @@ static void fpu_reset_fpstate(void) + struct fpu *fpu = ¤t->thread.fpu; + + fpregs_lock(); +- fpu__drop(fpu); ++ __fpu_invalidate_fpregs_state(fpu); + /* + * This does not change the actual hardware registers. It just + * resets the memory image and sets TIF_NEED_FPU_LOAD so a +diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c +index 8bbf37c0bebe2..81891f0fff6f6 100644 +--- a/arch/x86/kernel/fpu/xstate.c ++++ b/arch/x86/kernel/fpu/xstate.c +@@ -809,6 +809,13 @@ void __init fpu__init_system_xstate(void) + goto out_disable; + } + ++ /* ++ * CPU capabilities initialization runs before FPU init. So ++ * X86_FEATURE_OSXSAVE is not set. Now that XSAVE is completely ++ * functional, set the feature bit so depending code works. 
++ */ ++ setup_force_cpu_cap(X86_FEATURE_OSXSAVE); ++ + print_xstate_offset_size(); + pr_info("x86/fpu: Enabled xstate features 0x%llx, context size is %d bytes, using '%s' format.\n", + xfeatures_mask_all, +diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c +index 6c2bb60ccd88b..7a64fb2380448 100644 +--- a/arch/x86/kvm/mmu/tdp_mmu.c ++++ b/arch/x86/kvm/mmu/tdp_mmu.c +@@ -10,7 +10,7 @@ + #include + #include + +-static bool __read_mostly tdp_mmu_enabled = true; ++static bool __read_mostly tdp_mmu_enabled = false; + module_param_named(tdp_mmu, tdp_mmu_enabled, bool, 0644); + + /* Initializes the TDP MMU for the VM, if enabled. */ +diff --git a/drivers/clk/clk-devres.c b/drivers/clk/clk-devres.c +index 4fb4fd4b06bda..737aa70e2cb3d 100644 +--- a/drivers/clk/clk-devres.c ++++ b/drivers/clk/clk-devres.c +@@ -205,18 +205,19 @@ EXPORT_SYMBOL(devm_clk_put); + struct clk *devm_get_clk_from_child(struct device *dev, + struct device_node *np, const char *con_id) + { +- struct clk **ptr, *clk; ++ struct devm_clk_state *state; ++ struct clk *clk; + +- ptr = devres_alloc(devm_clk_release, sizeof(*ptr), GFP_KERNEL); +- if (!ptr) ++ state = devres_alloc(devm_clk_release, sizeof(*state), GFP_KERNEL); ++ if (!state) + return ERR_PTR(-ENOMEM); + + clk = of_clk_get_by_name(np, con_id); + if (!IS_ERR(clk)) { +- *ptr = clk; +- devres_add(dev, ptr); ++ state->clk = clk; ++ devres_add(dev, state); + } else { +- devres_free(ptr); ++ devres_free(state); + } + + return clk; +diff --git a/drivers/dma-buf/sw_sync.c b/drivers/dma-buf/sw_sync.c +index 348b3a9170fa4..7f5ed1aa7a9f8 100644 +--- a/drivers/dma-buf/sw_sync.c ++++ b/drivers/dma-buf/sw_sync.c +@@ -191,6 +191,7 @@ static const struct dma_fence_ops timeline_fence_ops = { + */ + static void sync_timeline_signal(struct sync_timeline *obj, unsigned int inc) + { ++ LIST_HEAD(signalled); + struct sync_pt *pt, *next; + + trace_sync_timeline(obj); +@@ -203,21 +204,20 @@ static void sync_timeline_signal(struct sync_timeline *obj, unsigned int inc) + if (!timeline_fence_signaled(&pt->base)) + break; + +- list_del_init(&pt->link); ++ dma_fence_get(&pt->base); ++ ++ list_move_tail(&pt->link, &signalled); + rb_erase(&pt->node, &obj->pt_tree); + +- /* +- * A signal callback may release the last reference to this +- * fence, causing it to be freed. That operation has to be +- * last to avoid a use after free inside this loop, and must +- * be after we remove the fence from the timeline in order to +- * prevent deadlocking on timeline->lock inside +- * timeline_fence_release(). 
+- */ + dma_fence_signal_locked(&pt->base); + } + + spin_unlock_irq(&obj->lock); ++ ++ list_for_each_entry_safe(pt, next, &signalled, link) { ++ list_del_init(&pt->link); ++ dma_fence_put(&pt->base); ++ } + } + + /** +diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c +index 73457c32f3e7f..aa5a1fa68da05 100644 +--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c ++++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c +@@ -3142,7 +3142,9 @@ void dcn10_wait_for_mpcc_disconnect( + if (pipe_ctx->stream_res.opp->mpcc_disconnect_pending[mpcc_inst]) { + struct hubp *hubp = get_hubp_by_inst(res_pool, mpcc_inst); + +- res_pool->mpc->funcs->wait_for_idle(res_pool->mpc, mpcc_inst); ++ if (pipe_ctx->stream_res.tg && ++ pipe_ctx->stream_res.tg->funcs->is_tg_enabled(pipe_ctx->stream_res.tg)) ++ res_pool->mpc->funcs->wait_for_idle(res_pool->mpc, mpcc_inst); + pipe_ctx->stream_res.opp->mpcc_disconnect_pending[mpcc_inst] = false; + hubp->funcs->set_blank(hubp, true); + } +diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c +index 2f640b9fdf4af..3034ce392ac1f 100644 +--- a/drivers/gpu/drm/i915/i915_active.c ++++ b/drivers/gpu/drm/i915/i915_active.c +@@ -447,8 +447,11 @@ int i915_active_ref(struct i915_active *ref, u64 idx, struct dma_fence *fence) + } + } while (unlikely(is_barrier(active))); + +- if (!__i915_active_fence_set(active, fence)) ++ fence = __i915_active_fence_set(active, fence); ++ if (!fence) + __i915_active_acquire(ref); ++ else ++ dma_fence_put(fence); + + out: + i915_active_release(ref); +@@ -467,13 +470,9 @@ __i915_active_set_fence(struct i915_active *ref, + return NULL; + } + +- rcu_read_lock(); + prev = __i915_active_fence_set(active, fence); +- if (prev) +- prev = dma_fence_get_rcu(prev); +- else ++ if (!prev) + __i915_active_acquire(ref); +- rcu_read_unlock(); + + return prev; + } +@@ -1040,10 +1039,11 @@ void i915_request_add_active_barriers(struct i915_request *rq) + * + * Records the new @fence as the last active fence along its timeline in + * this active tracker, moving the tracking callbacks from the previous +- * fence onto this one. Returns the previous fence (if not already completed), +- * which the caller must ensure is executed before the new fence. To ensure +- * that the order of fences within the timeline of the i915_active_fence is +- * understood, it should be locked by the caller. ++ * fence onto this one. Gets and returns a reference to the previous fence ++ * (if not already completed), which the caller must put after making sure ++ * that it is executed before the new fence. To ensure that the order of ++ * fences within the timeline of the i915_active_fence is understood, it ++ * should be locked by the caller. + */ + struct dma_fence * + __i915_active_fence_set(struct i915_active_fence *active, +@@ -1052,7 +1052,23 @@ __i915_active_fence_set(struct i915_active_fence *active, + struct dma_fence *prev; + unsigned long flags; + +- if (fence == rcu_access_pointer(active->fence)) ++ /* ++ * In case of fences embedded in i915_requests, their memory is ++ * SLAB_FAILSAFE_BY_RCU, then it can be reused right after release ++ * by new requests. Then, there is a risk of passing back a pointer ++ * to a new, completely unrelated fence that reuses the same memory ++ * while tracked under a different active tracker. 
Combined with i915 ++ * perf open/close operations that build await dependencies between ++ * engine kernel context requests and user requests from different ++ * timelines, this can lead to dependency loops and infinite waits. ++ * ++ * As a countermeasure, we try to get a reference to the active->fence ++ * first, so if we succeed and pass it back to our user then it is not ++ * released and potentially reused by an unrelated request before the ++ * user has a chance to set up an await dependency on it. ++ */ ++ prev = i915_active_fence_get(active); ++ if (fence == prev) + return fence; + + GEM_BUG_ON(test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)); +@@ -1061,27 +1077,56 @@ __i915_active_fence_set(struct i915_active_fence *active, + * Consider that we have two threads arriving (A and B), with + * C already resident as the active->fence. + * +- * A does the xchg first, and so it sees C or NULL depending +- * on the timing of the interrupt handler. If it is NULL, the +- * previous fence must have been signaled and we know that +- * we are first on the timeline. If it is still present, +- * we acquire the lock on that fence and serialise with the interrupt +- * handler, in the process removing it from any future interrupt +- * callback. A will then wait on C before executing (if present). +- * +- * As B is second, it sees A as the previous fence and so waits for +- * it to complete its transition and takes over the occupancy for +- * itself -- remembering that it needs to wait on A before executing. ++ * Both A and B have got a reference to C or NULL, depending on the ++ * timing of the interrupt handler. Let's assume that if A has got C ++ * then it has locked C first (before B). + * + * Note the strong ordering of the timeline also provides consistent + * nesting rules for the fence->lock; the inner lock is always the + * older lock. + */ + spin_lock_irqsave(fence->lock, flags); +- prev = xchg(__active_fence_slot(active), fence); +- if (prev) { +- GEM_BUG_ON(prev == fence); ++ if (prev) + spin_lock_nested(prev->lock, SINGLE_DEPTH_NESTING); ++ ++ /* ++ * A does the cmpxchg first, and so it sees C or NULL, as before, or ++ * something else, depending on the timing of other threads and/or ++ * interrupt handler. If not the same as before then A unlocks C if ++ * applicable and retries, starting from an attempt to get a new ++ * active->fence. Meanwhile, B follows the same path as A. ++ * Once A succeeds with cmpxch, B fails again, retires, gets A from ++ * active->fence, locks it as soon as A completes, and possibly ++ * succeeds with cmpxchg. ++ */ ++ while (cmpxchg(__active_fence_slot(active), prev, fence) != prev) { ++ if (prev) { ++ spin_unlock(prev->lock); ++ dma_fence_put(prev); ++ } ++ spin_unlock_irqrestore(fence->lock, flags); ++ ++ prev = i915_active_fence_get(active); ++ GEM_BUG_ON(prev == fence); ++ ++ spin_lock_irqsave(fence->lock, flags); ++ if (prev) ++ spin_lock_nested(prev->lock, SINGLE_DEPTH_NESTING); ++ } ++ ++ /* ++ * If prev is NULL then the previous fence must have been signaled ++ * and we know that we are first on the timeline. If it is still ++ * present then, having the lock on that fence already acquired, we ++ * serialise with the interrupt handler, in the process of removing it ++ * from any future interrupt callback. A will then wait on C before ++ * executing (if present). 
++ * ++ * As B is second, it sees A as the previous fence and so waits for ++ * it to complete its transition and takes over the occupancy for ++ * itself -- remembering that it needs to wait on A before executing. ++ */ ++ if (prev) { + __list_del_entry(&active->cb.node); + spin_unlock(prev->lock); /* serialise with prev->cb_list */ + } +@@ -1098,11 +1143,7 @@ int i915_active_fence_set(struct i915_active_fence *active, + int err = 0; + + /* Must maintain timeline ordering wrt previous active requests */ +- rcu_read_lock(); + fence = __i915_active_fence_set(active, &rq->fence); +- if (fence) /* but the previous fence may not belong to that timeline! */ +- fence = dma_fence_get_rcu(fence); +- rcu_read_unlock(); + if (fence) { + err = i915_request_await_dma_fence(rq, fence); + dma_fence_put(fence); +diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c +index 79da5eca60af5..bd85113ad150c 100644 +--- a/drivers/gpu/drm/i915/i915_request.c ++++ b/drivers/gpu/drm/i915/i915_request.c +@@ -1596,6 +1596,8 @@ __i915_request_add_to_timeline(struct i915_request *rq) + &rq->dep, + 0); + } ++ if (prev) ++ i915_request_put(prev); + + /* + * Make sure that no request gazumped us - if it was allocated after +diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h +index 288e883177bea..7bb7a69321d30 100644 +--- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h ++++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h +@@ -1685,4 +1685,16 @@ static inline bool vmw_has_fences(struct vmw_private *vmw) + return (vmw_fifo_caps(vmw) & SVGA_FIFO_CAP_FENCE) != 0; + } + ++static inline bool vmw_shadertype_is_valid(enum vmw_sm_type shader_model, ++ u32 shader_type) ++{ ++ SVGA3dShaderType max_allowed = SVGA3D_SHADERTYPE_PREDX_MAX; ++ ++ if (shader_model >= VMW_SM_5) ++ max_allowed = SVGA3D_SHADERTYPE_MAX; ++ else if (shader_model >= VMW_SM_4) ++ max_allowed = SVGA3D_SHADERTYPE_DX10_MAX; ++ return shader_type >= SVGA3D_SHADERTYPE_MIN && shader_type < max_allowed; ++} ++ + #endif +diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c +index 9144e8f88c812..ed75622bf7082 100644 +--- a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c ++++ b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c +@@ -2003,7 +2003,7 @@ static int vmw_cmd_set_shader(struct vmw_private *dev_priv, + + cmd = container_of(header, typeof(*cmd), header); + +- if (cmd->body.type >= SVGA3D_SHADERTYPE_PREDX_MAX) { ++ if (!vmw_shadertype_is_valid(VMW_SM_LEGACY, cmd->body.type)) { + VMW_DEBUG_USER("Illegal shader type %u.\n", + (unsigned int) cmd->body.type); + return -EINVAL; +@@ -2125,8 +2125,6 @@ vmw_cmd_dx_set_single_constant_buffer(struct vmw_private *dev_priv, + SVGA3dCmdHeader *header) + { + VMW_DECLARE_CMD_VAR(*cmd, SVGA3dCmdDXSetSingleConstantBuffer); +- SVGA3dShaderType max_shader_num = has_sm5_context(dev_priv) ? 
+- SVGA3D_NUM_SHADERTYPE : SVGA3D_NUM_SHADERTYPE_DX10; + + struct vmw_resource *res = NULL; + struct vmw_ctx_validation_info *ctx_node = VMW_GET_CTX_NODE(sw_context); +@@ -2143,6 +2141,14 @@ vmw_cmd_dx_set_single_constant_buffer(struct vmw_private *dev_priv, + if (unlikely(ret != 0)) + return ret; + ++ if (!vmw_shadertype_is_valid(dev_priv->sm_type, cmd->body.type) || ++ cmd->body.slot >= SVGA3D_DX_MAX_CONSTBUFFERS) { ++ VMW_DEBUG_USER("Illegal const buffer shader %u slot %u.\n", ++ (unsigned int) cmd->body.type, ++ (unsigned int) cmd->body.slot); ++ return -EINVAL; ++ } ++ + binding.bi.ctx = ctx_node->ctx; + binding.bi.res = res; + binding.bi.bt = vmw_ctx_binding_cb; +@@ -2151,14 +2157,6 @@ vmw_cmd_dx_set_single_constant_buffer(struct vmw_private *dev_priv, + binding.size = cmd->body.sizeInBytes; + binding.slot = cmd->body.slot; + +- if (binding.shader_slot >= max_shader_num || +- binding.slot >= SVGA3D_DX_MAX_CONSTBUFFERS) { +- VMW_DEBUG_USER("Illegal const buffer shader %u slot %u.\n", +- (unsigned int) cmd->body.type, +- (unsigned int) binding.slot); +- return -EINVAL; +- } +- + vmw_binding_add(ctx_node->staged, &binding.bi, binding.shader_slot, + binding.slot); + +@@ -2179,15 +2177,13 @@ static int vmw_cmd_dx_set_shader_res(struct vmw_private *dev_priv, + { + VMW_DECLARE_CMD_VAR(*cmd, SVGA3dCmdDXSetShaderResources) = + container_of(header, typeof(*cmd), header); +- SVGA3dShaderType max_allowed = has_sm5_context(dev_priv) ? +- SVGA3D_SHADERTYPE_MAX : SVGA3D_SHADERTYPE_DX10_MAX; + + u32 num_sr_view = (cmd->header.size - sizeof(cmd->body)) / + sizeof(SVGA3dShaderResourceViewId); + + if ((u64) cmd->body.startView + (u64) num_sr_view > + (u64) SVGA3D_DX_MAX_SRVIEWS || +- cmd->body.type >= max_allowed) { ++ !vmw_shadertype_is_valid(dev_priv->sm_type, cmd->body.type)) { + VMW_DEBUG_USER("Invalid shader binding.\n"); + return -EINVAL; + } +@@ -2211,8 +2207,6 @@ static int vmw_cmd_dx_set_shader(struct vmw_private *dev_priv, + SVGA3dCmdHeader *header) + { + VMW_DECLARE_CMD_VAR(*cmd, SVGA3dCmdDXSetShader); +- SVGA3dShaderType max_allowed = has_sm5_context(dev_priv) ? 
+- SVGA3D_SHADERTYPE_MAX : SVGA3D_SHADERTYPE_DX10_MAX; + struct vmw_resource *res = NULL; + struct vmw_ctx_validation_info *ctx_node = VMW_GET_CTX_NODE(sw_context); + struct vmw_ctx_bindinfo_shader binding; +@@ -2223,8 +2217,7 @@ static int vmw_cmd_dx_set_shader(struct vmw_private *dev_priv, + + cmd = container_of(header, typeof(*cmd), header); + +- if (cmd->body.type >= max_allowed || +- cmd->body.type < SVGA3D_SHADERTYPE_MIN) { ++ if (!vmw_shadertype_is_valid(dev_priv->sm_type, cmd->body.type)) { + VMW_DEBUG_USER("Illegal shader type %u.\n", + (unsigned int) cmd->body.type); + return -EINVAL; +diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc.c +index d97a6765693f1..389ac3d1f3446 100644 +--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc.c ++++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc.c +@@ -733,6 +733,8 @@ static int vb2ops_venc_queue_setup(struct vb2_queue *vq, + return -EINVAL; + + if (*nplanes) { ++ if (*nplanes != q_data->fmt->num_planes) ++ return -EINVAL; + for (i = 0; i < *nplanes; i++) + if (sizes[i] < q_data->sizeimage[i]) + return -EINVAL; +diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c +index a6a70b872ac4a..b29393831a302 100644 +--- a/drivers/net/bonding/bond_alb.c ++++ b/drivers/net/bonding/bond_alb.c +@@ -657,10 +657,10 @@ static struct slave *rlb_arp_xmit(struct sk_buff *skb, struct bonding *bond) + return NULL; + arp = (struct arp_pkt *)skb_network_header(skb); + +- /* Don't modify or load balance ARPs that do not originate locally +- * (e.g.,arrive via a bridge). ++ /* Don't modify or load balance ARPs that do not originate ++ * from the bond itself or a VLAN directly above the bond. + */ +- if (!bond_slave_has_mac_rx(bond, arp->mac_src)) ++ if (!bond_slave_has_mac_rcu(bond, arp->mac_src)) + return NULL; + + if (arp->op_code == htons(ARPOP_REPLY)) { +diff --git a/drivers/net/can/vxcan.c b/drivers/net/can/vxcan.c +index be5566168d0f3..afd9060c5421c 100644 +--- a/drivers/net/can/vxcan.c ++++ b/drivers/net/can/vxcan.c +@@ -179,12 +179,7 @@ static int vxcan_newlink(struct net *net, struct net_device *dev, + + nla_peer = data[VXCAN_INFO_PEER]; + ifmp = nla_data(nla_peer); +- err = rtnl_nla_parse_ifla(peer_tb, +- nla_data(nla_peer) + +- sizeof(struct ifinfomsg), +- nla_len(nla_peer) - +- sizeof(struct ifinfomsg), +- NULL); ++ err = rtnl_nla_parse_ifinfomsg(peer_tb, nla_peer, extack); + if (err < 0) + return err; + +diff --git a/drivers/net/ethernet/broadcom/bgmac.c b/drivers/net/ethernet/broadcom/bgmac.c +index a9c99ac81730a..c691635cf4ebe 100644 +--- a/drivers/net/ethernet/broadcom/bgmac.c ++++ b/drivers/net/ethernet/broadcom/bgmac.c +@@ -1448,7 +1448,7 @@ int bgmac_phy_connect_direct(struct bgmac *bgmac) + int err; + + phy_dev = fixed_phy_register(PHY_POLL, &fphy_status, NULL); +- if (!phy_dev || IS_ERR(phy_dev)) { ++ if (IS_ERR(phy_dev)) { + dev_err(bgmac->dev, "Failed to register fixed PHY device\n"); + return -ENODEV; + } +diff --git a/drivers/net/ethernet/broadcom/genet/bcmmii.c b/drivers/net/ethernet/broadcom/genet/bcmmii.c +index 8c800d9c11b78..bfe90cacbd073 100644 +--- a/drivers/net/ethernet/broadcom/genet/bcmmii.c ++++ b/drivers/net/ethernet/broadcom/genet/bcmmii.c +@@ -571,7 +571,7 @@ static int bcmgenet_mii_pd_init(struct bcmgenet_priv *priv) + }; + + phydev = fixed_phy_register(PHY_POLL, &fphy_status, NULL); +- if (!phydev || IS_ERR(phydev)) { ++ if (IS_ERR(phydev)) { + dev_err(kdev, "failed to register fixed PHY device\n"); + return -ENODEV; + } +diff 
--git a/drivers/net/ethernet/ibm/ibmveth.c b/drivers/net/ethernet/ibm/ibmveth.c +index 77d8db9b8a1d8..05759f690e1fd 100644 +--- a/drivers/net/ethernet/ibm/ibmveth.c ++++ b/drivers/net/ethernet/ibm/ibmveth.c +@@ -196,7 +196,7 @@ static inline void ibmveth_flush_buffer(void *addr, unsigned long length) + unsigned long offset; + + for (offset = 0; offset < length; offset += SMP_CACHE_BYTES) +- asm("dcbfl %0,%1" :: "b" (addr), "r" (offset)); ++ asm("dcbf %0,%1,1" :: "b" (addr), "r" (offset)); + } + + /* replenish the buffers for a pool. note that we don't need to +diff --git a/drivers/net/ethernet/intel/ice/ice_base.c b/drivers/net/ethernet/intel/ice/ice_base.c +index 533a953f15acb..09525dbeccfec 100644 +--- a/drivers/net/ethernet/intel/ice/ice_base.c ++++ b/drivers/net/ethernet/intel/ice/ice_base.c +@@ -359,7 +359,8 @@ static int ice_setup_rx_ctx(struct ice_ring *ring) + /* Receive Packet Data Buffer Size. + * The Packet Data Buffer Size is defined in 128 byte units. + */ +- rlan_ctx.dbuf = ring->rx_buf_len >> ICE_RLAN_CTX_DBUF_S; ++ rlan_ctx.dbuf = DIV_ROUND_UP(ring->rx_buf_len, ++ BIT_ULL(ICE_RLAN_CTX_DBUF_S)); + + /* use 32 byte descriptors */ + rlan_ctx.dsize = 1; +diff --git a/drivers/net/ethernet/intel/igb/igb_ptp.c b/drivers/net/ethernet/intel/igb/igb_ptp.c +index 0011b15e678c3..9cdb7a856ab6c 100644 +--- a/drivers/net/ethernet/intel/igb/igb_ptp.c ++++ b/drivers/net/ethernet/intel/igb/igb_ptp.c +@@ -1260,18 +1260,6 @@ void igb_ptp_init(struct igb_adapter *adapter) + return; + } + +- spin_lock_init(&adapter->tmreg_lock); +- INIT_WORK(&adapter->ptp_tx_work, igb_ptp_tx_work); +- +- if (adapter->ptp_flags & IGB_PTP_OVERFLOW_CHECK) +- INIT_DELAYED_WORK(&adapter->ptp_overflow_work, +- igb_ptp_overflow_check); +- +- adapter->tstamp_config.rx_filter = HWTSTAMP_FILTER_NONE; +- adapter->tstamp_config.tx_type = HWTSTAMP_TX_OFF; +- +- igb_ptp_reset(adapter); +- + adapter->ptp_clock = ptp_clock_register(&adapter->ptp_caps, + &adapter->pdev->dev); + if (IS_ERR(adapter->ptp_clock)) { +@@ -1281,6 +1269,18 @@ void igb_ptp_init(struct igb_adapter *adapter) + dev_info(&adapter->pdev->dev, "added PHC on %s\n", + adapter->netdev->name); + adapter->ptp_flags |= IGB_PTP_ENABLED; ++ ++ spin_lock_init(&adapter->tmreg_lock); ++ INIT_WORK(&adapter->ptp_tx_work, igb_ptp_tx_work); ++ ++ if (adapter->ptp_flags & IGB_PTP_OVERFLOW_CHECK) ++ INIT_DELAYED_WORK(&adapter->ptp_overflow_work, ++ igb_ptp_overflow_check); ++ ++ adapter->tstamp_config.rx_filter = HWTSTAMP_FILTER_NONE; ++ adapter->tstamp_config.tx_type = HWTSTAMP_TX_OFF; ++ ++ igb_ptp_reset(adapter); + } + } + +diff --git a/drivers/net/ethernet/intel/igc/igc_defines.h b/drivers/net/ethernet/intel/igc/igc_defines.h +index 60d0ca69ceca9..703b62c5f79b5 100644 +--- a/drivers/net/ethernet/intel/igc/igc_defines.h ++++ b/drivers/net/ethernet/intel/igc/igc_defines.h +@@ -539,7 +539,7 @@ + #define IGC_PTM_CTRL_START_NOW BIT(29) /* Start PTM Now */ + #define IGC_PTM_CTRL_EN BIT(30) /* Enable PTM */ + #define IGC_PTM_CTRL_TRIG BIT(31) /* PTM Cycle trigger */ +-#define IGC_PTM_CTRL_SHRT_CYC(usec) (((usec) & 0x2f) << 2) ++#define IGC_PTM_CTRL_SHRT_CYC(usec) (((usec) & 0x3f) << 2) + #define IGC_PTM_CTRL_PTM_TO(usec) (((usec) & 0xff) << 8) + + #define IGC_PTM_SHORT_CYC_DEFAULT 10 /* Default Short/interrupted cycle interval */ +diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c +index dee2f2086bb5d..f5922d63e33e4 100644 +--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c ++++ 
b/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c +@@ -4013,9 +4013,10 @@ rx_frscfg: + if (link < 0) + return NIX_AF_ERR_RX_LINK_INVALID; + +- nix_find_link_frs(rvu, req, pcifunc); + + linkcfg: ++ nix_find_link_frs(rvu, req, pcifunc); ++ + cfg = rvu_read64(rvu, blkaddr, NIX_AF_RX_LINKX_CFG(link)); + cfg = (cfg & ~(0xFFFFULL << 16)) | ((u64)req->maxlen << 16); + if (req->update_minlen) +diff --git a/drivers/net/ipvlan/ipvlan_main.c b/drivers/net/ipvlan/ipvlan_main.c +index 3f43c253adaca..c199f0b465cd0 100644 +--- a/drivers/net/ipvlan/ipvlan_main.c ++++ b/drivers/net/ipvlan/ipvlan_main.c +@@ -748,7 +748,8 @@ static int ipvlan_device_event(struct notifier_block *unused, + + write_pnet(&port->pnet, newnet); + +- ipvlan_migrate_l3s_hook(oldnet, newnet); ++ if (port->mode == IPVLAN_MODE_L3S) ++ ipvlan_migrate_l3s_hook(oldnet, newnet); + break; + } + case NETDEV_UNREGISTER: +diff --git a/drivers/net/veth.c b/drivers/net/veth.c +index 41cb9179e8b79..45ee44f66e77d 100644 +--- a/drivers/net/veth.c ++++ b/drivers/net/veth.c +@@ -1654,10 +1654,7 @@ static int veth_newlink(struct net *src_net, struct net_device *dev, + + nla_peer = data[VETH_INFO_PEER]; + ifmp = nla_data(nla_peer); +- err = rtnl_nla_parse_ifla(peer_tb, +- nla_data(nla_peer) + sizeof(struct ifinfomsg), +- nla_len(nla_peer) - sizeof(struct ifinfomsg), +- NULL); ++ err = rtnl_nla_parse_ifinfomsg(peer_tb, nla_peer, extack); + if (err < 0) + return err; + +diff --git a/drivers/of/dynamic.c b/drivers/of/dynamic.c +index cd3821a6444f0..4e436f2d13aeb 100644 +--- a/drivers/of/dynamic.c ++++ b/drivers/of/dynamic.c +@@ -63,15 +63,14 @@ int of_reconfig_notifier_unregister(struct notifier_block *nb) + } + EXPORT_SYMBOL_GPL(of_reconfig_notifier_unregister); + +-#ifdef DEBUG +-const char *action_names[] = { ++static const char *action_names[] = { ++ [0] = "INVALID", + [OF_RECONFIG_ATTACH_NODE] = "ATTACH_NODE", + [OF_RECONFIG_DETACH_NODE] = "DETACH_NODE", + [OF_RECONFIG_ADD_PROPERTY] = "ADD_PROPERTY", + [OF_RECONFIG_REMOVE_PROPERTY] = "REMOVE_PROPERTY", + [OF_RECONFIG_UPDATE_PROPERTY] = "UPDATE_PROPERTY", + }; +-#endif + + int of_reconfig_notify(unsigned long action, struct of_reconfig_data *p) + { +@@ -594,21 +593,9 @@ static int __of_changeset_entry_apply(struct of_changeset_entry *ce) + } + + ret = __of_add_property(ce->np, ce->prop); +- if (ret) { +- pr_err("changeset: add_property failed @%pOF/%s\n", +- ce->np, +- ce->prop->name); +- break; +- } + break; + case OF_RECONFIG_REMOVE_PROPERTY: + ret = __of_remove_property(ce->np, ce->prop); +- if (ret) { +- pr_err("changeset: remove_property failed @%pOF/%s\n", +- ce->np, +- ce->prop->name); +- break; +- } + break; + + case OF_RECONFIG_UPDATE_PROPERTY: +@@ -622,20 +609,17 @@ static int __of_changeset_entry_apply(struct of_changeset_entry *ce) + } + + ret = __of_update_property(ce->np, ce->prop, &old_prop); +- if (ret) { +- pr_err("changeset: update_property failed @%pOF/%s\n", +- ce->np, +- ce->prop->name); +- break; +- } + break; + default: + ret = -EINVAL; + } + raw_spin_unlock_irqrestore(&devtree_lock, flags); + +- if (ret) ++ if (ret) { ++ pr_err("changeset: apply failed: %-15s %pOF:%s\n", ++ action_names[ce->action], ce->np, ce->prop->name); + return ret; ++ } + + switch (ce->action) { + case OF_RECONFIG_ATTACH_NODE: +@@ -921,6 +905,9 @@ int of_changeset_action(struct of_changeset *ocs, unsigned long action, + if (!ce) + return -ENOMEM; + ++ if (WARN_ON(action >= ARRAY_SIZE(action_names))) ++ return -EINVAL; ++ + /* get a reference to the node */ + ce->action = action; + ce->np = 
of_node_get(np); +diff --git a/drivers/of/kexec.c b/drivers/of/kexec.c +index 52bb68fb22169..3a07cc58e7d7a 100644 +--- a/drivers/of/kexec.c ++++ b/drivers/of/kexec.c +@@ -187,8 +187,8 @@ int ima_free_kexec_buffer(void) + if (ret) + return ret; + +- return memblock_free(addr, size); +- ++ memblock_free_late(addr, size); ++ return 0; + } + + /** +diff --git a/drivers/of/unittest.c b/drivers/of/unittest.c +index 2bee1d992408f..a9871e2f0a0bb 100644 +--- a/drivers/of/unittest.c ++++ b/drivers/of/unittest.c +@@ -657,12 +657,12 @@ static void __init of_unittest_parse_phandle_with_args_map(void) + memset(&args, 0, sizeof(args)); + + EXPECT_BEGIN(KERN_INFO, +- "OF: /testcase-data/phandle-tests/consumer-b: could not find phandle"); ++ "OF: /testcase-data/phandle-tests/consumer-b: could not find phandle 12345678"); + + rc = of_parse_phandle_with_args_map(np, "phandle-list-bad-phandle", + "phandle", 0, &args); + EXPECT_END(KERN_INFO, +- "OF: /testcase-data/phandle-tests/consumer-b: could not find phandle"); ++ "OF: /testcase-data/phandle-tests/consumer-b: could not find phandle 12345678"); + + unittest(rc == -EINVAL, "expected:%i got:%i\n", -EINVAL, rc); + +diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c +index f031302ad4019..0a37967b0a939 100644 +--- a/drivers/pci/hotplug/acpiphp_glue.c ++++ b/drivers/pci/hotplug/acpiphp_glue.c +@@ -503,12 +503,15 @@ static void enable_slot(struct acpiphp_slot *slot, bool bridge) + if (pass && dev->subordinate) { + check_hotplug_bridge(slot, dev); + pcibios_resource_survey_bus(dev->subordinate); +- __pci_bus_size_bridges(dev->subordinate, +- &add_list); ++ if (pci_is_root_bus(bus)) ++ __pci_bus_size_bridges(dev->subordinate, &add_list); + } + } + } +- __pci_bus_assign_resources(bus, &add_list, NULL); ++ if (pci_is_root_bus(bus)) ++ __pci_bus_assign_resources(bus, &add_list, NULL); ++ else ++ pci_assign_unassigned_bridge_resources(bus->self); + } + + acpiphp_sanitize_bus(bus); +diff --git a/drivers/pinctrl/renesas/pinctrl-rza2.c b/drivers/pinctrl/renesas/pinctrl-rza2.c +index 32829eb9656c9..ddd8ee6b604ef 100644 +--- a/drivers/pinctrl/renesas/pinctrl-rza2.c ++++ b/drivers/pinctrl/renesas/pinctrl-rza2.c +@@ -14,6 +14,7 @@ + #include + #include + #include ++#include + #include + #include + +@@ -46,6 +47,7 @@ struct rza2_pinctrl_priv { + struct pinctrl_dev *pctl; + struct pinctrl_gpio_range gpio_range; + int npins; ++ struct mutex mutex; /* serialize adding groups and functions */ + }; + + #define RZA2_PDR(port) (0x0000 + (port) * 2) /* Direction 16-bit */ +@@ -359,10 +361,14 @@ static int rza2_dt_node_to_map(struct pinctrl_dev *pctldev, + psel_val[i] = MUX_FUNC(value); + } + ++ mutex_lock(&priv->mutex); ++ + /* Register a single pin group listing all the pins we read from DT */ + gsel = pinctrl_generic_add_group(pctldev, np->name, pins, npins, NULL); +- if (gsel < 0) +- return gsel; ++ if (gsel < 0) { ++ ret = gsel; ++ goto unlock; ++ } + + /* + * Register a single group function where the 'data' is an array PSEL +@@ -391,6 +397,8 @@ static int rza2_dt_node_to_map(struct pinctrl_dev *pctldev, + (*map)->data.mux.function = np->name; + *num_maps = 1; + ++ mutex_unlock(&priv->mutex); ++ + return 0; + + remove_function: +@@ -399,6 +407,9 @@ remove_function: + remove_group: + pinctrl_generic_remove_group(pctldev, gsel); + ++unlock: ++ mutex_unlock(&priv->mutex); ++ + dev_err(priv->dev, "Unable to parse DT node %s\n", np->name); + + return ret; +@@ -474,6 +485,8 @@ static int rza2_pinctrl_probe(struct platform_device *pdev) + if 
(IS_ERR(priv->base)) + return PTR_ERR(priv->base); + ++ mutex_init(&priv->mutex); ++ + platform_set_drvdata(pdev, priv); + + priv->npins = (int)(uintptr_t)of_device_get_match_data(&pdev->dev) * +diff --git a/drivers/scsi/raid_class.c b/drivers/scsi/raid_class.c +index 711252e52d8e1..95a86e0dfd77a 100644 +--- a/drivers/scsi/raid_class.c ++++ b/drivers/scsi/raid_class.c +@@ -209,54 +209,6 @@ raid_attr_ro_state(level); + raid_attr_ro_fn(resync); + raid_attr_ro_state_fn(state); + +-static void raid_component_release(struct device *dev) +-{ +- struct raid_component *rc = +- container_of(dev, struct raid_component, dev); +- dev_printk(KERN_ERR, rc->dev.parent, "COMPONENT RELEASE\n"); +- put_device(rc->dev.parent); +- kfree(rc); +-} +- +-int raid_component_add(struct raid_template *r,struct device *raid_dev, +- struct device *component_dev) +-{ +- struct device *cdev = +- attribute_container_find_class_device(&r->raid_attrs.ac, +- raid_dev); +- struct raid_component *rc; +- struct raid_data *rd = dev_get_drvdata(cdev); +- int err; +- +- rc = kzalloc(sizeof(*rc), GFP_KERNEL); +- if (!rc) +- return -ENOMEM; +- +- INIT_LIST_HEAD(&rc->node); +- device_initialize(&rc->dev); +- rc->dev.release = raid_component_release; +- rc->dev.parent = get_device(component_dev); +- rc->num = rd->component_count++; +- +- dev_set_name(&rc->dev, "component-%d", rc->num); +- list_add_tail(&rc->node, &rd->component_list); +- rc->dev.class = &raid_class.class; +- err = device_add(&rc->dev); +- if (err) +- goto err_out; +- +- return 0; +- +-err_out: +- put_device(&rc->dev); +- list_del(&rc->node); +- rd->component_count--; +- put_device(component_dev); +- kfree(rc); +- return err; +-} +-EXPORT_SYMBOL(raid_component_add); +- + struct raid_template * + raid_class_attach(struct raid_function_template *ft) + { +diff --git a/drivers/scsi/snic/snic_disc.c b/drivers/scsi/snic/snic_disc.c +index c445853c623e2..e362453e8d262 100644 +--- a/drivers/scsi/snic/snic_disc.c ++++ b/drivers/scsi/snic/snic_disc.c +@@ -317,12 +317,11 @@ snic_tgt_create(struct snic *snic, struct snic_tgt_id *tgtid) + "Snic Tgt: device_add, with err = %d\n", + ret); + +- put_device(&tgt->dev); + put_device(&snic->shost->shost_gendev); + spin_lock_irqsave(snic->shost->host_lock, flags); + list_del(&tgt->list); + spin_unlock_irqrestore(snic->shost->host_lock, flags); +- kfree(tgt); ++ put_device(&tgt->dev); + tgt = NULL; + + return tgt; +diff --git a/drivers/video/fbdev/core/sysimgblt.c b/drivers/video/fbdev/core/sysimgblt.c +index a4d05b1b17d7d..665ef7a0a2495 100644 +--- a/drivers/video/fbdev/core/sysimgblt.c ++++ b/drivers/video/fbdev/core/sysimgblt.c +@@ -188,23 +188,29 @@ static void fast_imageblit(const struct fb_image *image, struct fb_info *p, + { + u32 fgx = fgcolor, bgx = bgcolor, bpp = p->var.bits_per_pixel; + u32 ppw = 32/bpp, spitch = (image->width + 7)/8; +- u32 bit_mask, end_mask, eorx, shift; +- const char *s = image->data, *src; ++ u32 bit_mask, eorx, shift; ++ const u8 *s = image->data, *src; + u32 *dst; +- const u32 *tab = NULL; ++ const u32 *tab; ++ size_t tablen; ++ u32 colortab[16]; + int i, j, k; + + switch (bpp) { + case 8: + tab = fb_be_math(p) ? cfb_tab8_be : cfb_tab8_le; ++ tablen = 16; + break; + case 16: + tab = fb_be_math(p) ? 
cfb_tab16_be : cfb_tab16_le; ++ tablen = 4; + break; + case 32: +- default: + tab = cfb_tab32; ++ tablen = 2; + break; ++ default: ++ return; + } + + for (i = ppw-1; i--; ) { +@@ -218,20 +224,62 @@ static void fast_imageblit(const struct fb_image *image, struct fb_info *p, + eorx = fgx ^ bgx; + k = image->width/ppw; + ++ for (i = 0; i < tablen; ++i) ++ colortab[i] = (tab[i] & eorx) ^ bgx; ++ + for (i = image->height; i--; ) { + dst = dst1; + shift = 8; + src = s; + +- for (j = k; j--; ) { ++ /* ++ * Manually unroll the per-line copying loop for better ++ * performance. This works until we processed the last ++ * completely filled source byte (inclusive). ++ */ ++ switch (ppw) { ++ case 4: /* 8 bpp */ ++ for (j = k; j >= 2; j -= 2, ++src) { ++ *dst++ = colortab[(*src >> 4) & bit_mask]; ++ *dst++ = colortab[(*src >> 0) & bit_mask]; ++ } ++ break; ++ case 2: /* 16 bpp */ ++ for (j = k; j >= 4; j -= 4, ++src) { ++ *dst++ = colortab[(*src >> 6) & bit_mask]; ++ *dst++ = colortab[(*src >> 4) & bit_mask]; ++ *dst++ = colortab[(*src >> 2) & bit_mask]; ++ *dst++ = colortab[(*src >> 0) & bit_mask]; ++ } ++ break; ++ case 1: /* 32 bpp */ ++ for (j = k; j >= 8; j -= 8, ++src) { ++ *dst++ = colortab[(*src >> 7) & bit_mask]; ++ *dst++ = colortab[(*src >> 6) & bit_mask]; ++ *dst++ = colortab[(*src >> 5) & bit_mask]; ++ *dst++ = colortab[(*src >> 4) & bit_mask]; ++ *dst++ = colortab[(*src >> 3) & bit_mask]; ++ *dst++ = colortab[(*src >> 2) & bit_mask]; ++ *dst++ = colortab[(*src >> 1) & bit_mask]; ++ *dst++ = colortab[(*src >> 0) & bit_mask]; ++ } ++ break; ++ } ++ ++ /* ++ * For image widths that are not a multiple of 8, there ++ * are trailing pixels left on the current line. Print ++ * them as well. ++ */ ++ for (; j--; ) { + shift -= ppw; +- end_mask = tab[(*src >> shift) & bit_mask]; +- *dst++ = (end_mask & eorx) ^ bgx; ++ *dst++ = colortab[(*src >> shift) & bit_mask]; + if (!shift) { + shift = 8; +- src++; ++ ++src; + } + } ++ + dst1 += p->fix.line_length; + s += spitch; + } +diff --git a/fs/attr.c b/fs/attr.c +index 0ca14cbd4b8bb..28e953e86960f 100644 +--- a/fs/attr.c ++++ b/fs/attr.c +@@ -47,6 +47,7 @@ int setattr_should_drop_sgid(struct user_namespace *mnt_userns, + return ATTR_KILL_SGID; + return 0; + } ++EXPORT_SYMBOL(setattr_should_drop_sgid); + + /** + * setattr_should_drop_suidgid - determine whether the set{g,u}id bit needs to +diff --git a/fs/dlm/lock.c b/fs/dlm/lock.c +index 862cb7a353c1c..b9829b873bf2e 100644 +--- a/fs/dlm/lock.c ++++ b/fs/dlm/lock.c +@@ -1856,7 +1856,7 @@ static void del_timeout(struct dlm_lkb *lkb) + void dlm_scan_timeout(struct dlm_ls *ls) + { + struct dlm_rsb *r; +- struct dlm_lkb *lkb; ++ struct dlm_lkb *lkb = NULL, *iter; + int do_cancel, do_warn; + s64 wait_us; + +@@ -1867,27 +1867,28 @@ void dlm_scan_timeout(struct dlm_ls *ls) + do_cancel = 0; + do_warn = 0; + mutex_lock(&ls->ls_timeout_mutex); +- list_for_each_entry(lkb, &ls->ls_timeout, lkb_time_list) { ++ list_for_each_entry(iter, &ls->ls_timeout, lkb_time_list) { + + wait_us = ktime_to_us(ktime_sub(ktime_get(), +- lkb->lkb_timestamp)); ++ iter->lkb_timestamp)); + +- if ((lkb->lkb_exflags & DLM_LKF_TIMEOUT) && +- wait_us >= (lkb->lkb_timeout_cs * 10000)) ++ if ((iter->lkb_exflags & DLM_LKF_TIMEOUT) && ++ wait_us >= (iter->lkb_timeout_cs * 10000)) + do_cancel = 1; + +- if ((lkb->lkb_flags & DLM_IFL_WATCH_TIMEWARN) && ++ if ((iter->lkb_flags & DLM_IFL_WATCH_TIMEWARN) && + wait_us >= dlm_config.ci_timewarn_cs * 10000) + do_warn = 1; + + if (!do_cancel && !do_warn) + continue; +- hold_lkb(lkb); ++ hold_lkb(iter); 
++ lkb = iter; + break; + } + mutex_unlock(&ls->ls_timeout_mutex); + +- if (!do_cancel && !do_warn) ++ if (!lkb) + break; + + r = lkb->lkb_resource; +@@ -5239,21 +5240,18 @@ void dlm_recover_waiters_pre(struct dlm_ls *ls) + + static struct dlm_lkb *find_resend_waiter(struct dlm_ls *ls) + { +- struct dlm_lkb *lkb; +- int found = 0; ++ struct dlm_lkb *lkb = NULL, *iter; + + mutex_lock(&ls->ls_waiters_mutex); +- list_for_each_entry(lkb, &ls->ls_waiters, lkb_wait_reply) { +- if (lkb->lkb_flags & DLM_IFL_RESEND) { +- hold_lkb(lkb); +- found = 1; ++ list_for_each_entry(iter, &ls->ls_waiters, lkb_wait_reply) { ++ if (iter->lkb_flags & DLM_IFL_RESEND) { ++ hold_lkb(iter); ++ lkb = iter; + break; + } + } + mutex_unlock(&ls->ls_waiters_mutex); + +- if (!found) +- lkb = NULL; + return lkb; + } + +@@ -5912,37 +5910,36 @@ int dlm_user_adopt_orphan(struct dlm_ls *ls, struct dlm_user_args *ua_tmp, + int mode, uint32_t flags, void *name, unsigned int namelen, + unsigned long timeout_cs, uint32_t *lkid) + { +- struct dlm_lkb *lkb; ++ struct dlm_lkb *lkb = NULL, *iter; + struct dlm_user_args *ua; + int found_other_mode = 0; +- int found = 0; + int rv = 0; + + mutex_lock(&ls->ls_orphans_mutex); +- list_for_each_entry(lkb, &ls->ls_orphans, lkb_ownqueue) { +- if (lkb->lkb_resource->res_length != namelen) ++ list_for_each_entry(iter, &ls->ls_orphans, lkb_ownqueue) { ++ if (iter->lkb_resource->res_length != namelen) + continue; +- if (memcmp(lkb->lkb_resource->res_name, name, namelen)) ++ if (memcmp(iter->lkb_resource->res_name, name, namelen)) + continue; +- if (lkb->lkb_grmode != mode) { ++ if (iter->lkb_grmode != mode) { + found_other_mode = 1; + continue; + } + +- found = 1; +- list_del_init(&lkb->lkb_ownqueue); +- lkb->lkb_flags &= ~DLM_IFL_ORPHAN; +- *lkid = lkb->lkb_id; ++ lkb = iter; ++ list_del_init(&iter->lkb_ownqueue); ++ iter->lkb_flags &= ~DLM_IFL_ORPHAN; ++ *lkid = iter->lkb_id; + break; + } + mutex_unlock(&ls->ls_orphans_mutex); + +- if (!found && found_other_mode) { ++ if (!lkb && found_other_mode) { + rv = -EAGAIN; + goto out; + } + +- if (!found) { ++ if (!lkb) { + rv = -ENOENT; + goto out; + } +diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c +index f3482e936cc25..28735e8c5e206 100644 +--- a/fs/dlm/plock.c ++++ b/fs/dlm/plock.c +@@ -80,8 +80,7 @@ static void send_op(struct plock_op *op) + abandoned waiter. So, we have to insert the unlock-close when the + lock call is interrupted. 
*/ + +-static void do_unlock_close(struct dlm_ls *ls, u64 number, +- struct file *file, struct file_lock *fl) ++static void do_unlock_close(const struct dlm_plock_info *info) + { + struct plock_op *op; + +@@ -90,15 +89,12 @@ static void do_unlock_close(struct dlm_ls *ls, u64 number, + return; + + op->info.optype = DLM_PLOCK_OP_UNLOCK; +- op->info.pid = fl->fl_pid; +- op->info.fsid = ls->ls_global_id; +- op->info.number = number; ++ op->info.pid = info->pid; ++ op->info.fsid = info->fsid; ++ op->info.number = info->number; + op->info.start = 0; + op->info.end = OFFSET_MAX; +- if (fl->fl_lmops && fl->fl_lmops->lm_grant) +- op->info.owner = (__u64) fl->fl_pid; +- else +- op->info.owner = (__u64)(long) fl->fl_owner; ++ op->info.owner = info->owner; + + op->info.flags |= DLM_PLOCK_FL_CLOSE; + send_op(op); +@@ -161,13 +157,14 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file, + + rv = wait_event_killable(recv_wq, (op->done != 0)); + if (rv == -ERESTARTSYS) { +- log_debug(ls, "%s: wait killed %llx", __func__, +- (unsigned long long)number); + spin_lock(&ops_lock); + list_del(&op->list); + spin_unlock(&ops_lock); ++ log_debug(ls, "%s: wait interrupted %x %llx pid %d", ++ __func__, ls->ls_global_id, ++ (unsigned long long)number, op->info.pid); + dlm_release_plock_op(op); +- do_unlock_close(ls, number, file, fl); ++ do_unlock_close(&op->info); + goto out; + } + +@@ -408,7 +405,7 @@ static ssize_t dev_read(struct file *file, char __user *u, size_t count, + if (op->info.flags & DLM_PLOCK_FL_CLOSE) + list_del(&op->list); + else +- list_move(&op->list, &recv_list); ++ list_move_tail(&op->list, &recv_list); + memcpy(&info, &op->info, sizeof(info)); + } + spin_unlock(&ops_lock); +@@ -433,9 +430,9 @@ static ssize_t dev_read(struct file *file, char __user *u, size_t count, + static ssize_t dev_write(struct file *file, const char __user *u, size_t count, + loff_t *ppos) + { ++ struct plock_op *op = NULL, *iter; + struct dlm_plock_info info; +- struct plock_op *op; +- int found = 0, do_callback = 0; ++ int do_callback = 0; + + if (count != sizeof(info)) + return -EINVAL; +@@ -446,31 +443,63 @@ static ssize_t dev_write(struct file *file, const char __user *u, size_t count, + if (check_version(&info)) + return -EINVAL; + ++ /* ++ * The results for waiting ops (SETLKW) can be returned in any ++ * order, so match all fields to find the op. The results for ++ * non-waiting ops are returned in the order that they were sent ++ * to userspace, so match the result with the first non-waiting op. ++ */ + spin_lock(&ops_lock); +- list_for_each_entry(op, &recv_list, list) { +- if (op->info.fsid == info.fsid && +- op->info.number == info.number && +- op->info.owner == info.owner) { +- list_del_init(&op->list); +- memcpy(&op->info, &info, sizeof(info)); +- if (op->data) +- do_callback = 1; +- else +- op->done = 1; +- found = 1; +- break; ++ if (info.wait) { ++ list_for_each_entry(iter, &recv_list, list) { ++ if (iter->info.fsid == info.fsid && ++ iter->info.number == info.number && ++ iter->info.owner == info.owner && ++ iter->info.pid == info.pid && ++ iter->info.start == info.start && ++ iter->info.end == info.end && ++ iter->info.ex == info.ex && ++ iter->info.wait) { ++ op = iter; ++ break; ++ } ++ } ++ } else { ++ list_for_each_entry(iter, &recv_list, list) { ++ if (!iter->info.wait) { ++ op = iter; ++ break; ++ } + } + } ++ ++ if (op) { ++ /* Sanity check that op and info match. 
*/ ++ if (info.wait) ++ WARN_ON(op->info.optype != DLM_PLOCK_OP_LOCK); ++ else ++ WARN_ON(op->info.fsid != info.fsid || ++ op->info.number != info.number || ++ op->info.owner != info.owner || ++ op->info.optype != info.optype); ++ ++ list_del_init(&op->list); ++ memcpy(&op->info, &info, sizeof(info)); ++ if (op->data) ++ do_callback = 1; ++ else ++ op->done = 1; ++ } + spin_unlock(&ops_lock); + +- if (found) { ++ if (op) { + if (do_callback) + dlm_plock_callback(op); + else + wake_up(&recv_wq); + } else +- log_print("dev_write no op %x %llx", info.fsid, +- (unsigned long long)info.number); ++ log_print("%s: no op %x %llx", __func__, ++ info.fsid, (unsigned long long)info.number); + return count; + } + +diff --git a/fs/dlm/recover.c b/fs/dlm/recover.c +index 8928e99dfd47d..df18f38a02734 100644 +--- a/fs/dlm/recover.c ++++ b/fs/dlm/recover.c +@@ -732,10 +732,9 @@ void dlm_recovered_lock(struct dlm_rsb *r) + + static void recover_lvb(struct dlm_rsb *r) + { +- struct dlm_lkb *lkb, *high_lkb = NULL; ++ struct dlm_lkb *big_lkb = NULL, *iter, *high_lkb = NULL; + uint32_t high_seq = 0; + int lock_lvb_exists = 0; +- int big_lock_exists = 0; + int lvblen = r->res_ls->ls_lvblen; + + if (!rsb_flag(r, RSB_NEW_MASTER2) && +@@ -751,37 +750,37 @@ static void recover_lvb(struct dlm_rsb *r) + /* we are the new master, so figure out if VALNOTVALID should + be set, and set the rsb lvb from the best lkb available. */ + +- list_for_each_entry(lkb, &r->res_grantqueue, lkb_statequeue) { +- if (!(lkb->lkb_exflags & DLM_LKF_VALBLK)) ++ list_for_each_entry(iter, &r->res_grantqueue, lkb_statequeue) { ++ if (!(iter->lkb_exflags & DLM_LKF_VALBLK)) + continue; + + lock_lvb_exists = 1; + +- if (lkb->lkb_grmode > DLM_LOCK_CR) { +- big_lock_exists = 1; ++ if (iter->lkb_grmode > DLM_LOCK_CR) { ++ big_lkb = iter; + goto setflag; + } + +- if (((int)lkb->lkb_lvbseq - (int)high_seq) >= 0) { +- high_lkb = lkb; +- high_seq = lkb->lkb_lvbseq; ++ if (((int)iter->lkb_lvbseq - (int)high_seq) >= 0) { ++ high_lkb = iter; ++ high_seq = iter->lkb_lvbseq; + } + } + +- list_for_each_entry(lkb, &r->res_convertqueue, lkb_statequeue) { +- if (!(lkb->lkb_exflags & DLM_LKF_VALBLK)) ++ list_for_each_entry(iter, &r->res_convertqueue, lkb_statequeue) { ++ if (!(iter->lkb_exflags & DLM_LKF_VALBLK)) + continue; + + lock_lvb_exists = 1; + +- if (lkb->lkb_grmode > DLM_LOCK_CR) { +- big_lock_exists = 1; ++ if (iter->lkb_grmode > DLM_LOCK_CR) { ++ big_lkb = iter; + goto setflag; + } + +- if (((int)lkb->lkb_lvbseq - (int)high_seq) >= 0) { +- high_lkb = lkb; +- high_seq = lkb->lkb_lvbseq; ++ if (((int)iter->lkb_lvbseq - (int)high_seq) >= 0) { ++ high_lkb = iter; ++ high_seq = iter->lkb_lvbseq; + } + } + +@@ -790,7 +789,7 @@ static void recover_lvb(struct dlm_rsb *r) + goto out; + + /* lvb is invalidated if only NL/CR locks remain */ +- if (!big_lock_exists) ++ if (!big_lkb) + rsb_set_flag(r, RSB_VALNOTVALID); + + if (!r->res_lvbptr) { +@@ -799,9 +798,9 @@ static void recover_lvb(struct dlm_rsb *r) + goto out; + } + +- if (big_lock_exists) { +- r->res_lvbseq = lkb->lkb_lvbseq; +- memcpy(r->res_lvbptr, lkb->lkb_lvbptr, lvblen); ++ if (big_lkb) { ++ r->res_lvbseq = big_lkb->lkb_lvbseq; ++ memcpy(r->res_lvbptr, big_lkb->lkb_lvbptr, lvblen); + } else if (high_lkb) { + r->res_lvbseq = high_lkb->lkb_lvbseq; + memcpy(r->res_lvbptr, high_lkb->lkb_lvbptr, lvblen); +diff --git a/fs/internal.h b/fs/internal.h +index 1ff8cfc94467b..d241eaa0c58b2 100644 +--- a/fs/internal.h ++++ b/fs/internal.h +@@ -235,5 +235,3 @@ int do_setxattr(struct user_namespace *mnt_userns, 
struct dentry *dentry, + /* + * fs/attr.c + */ +-int setattr_should_drop_sgid(struct user_namespace *mnt_userns, +- const struct inode *inode); +diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c +index d2aba55833f92..fc6989e7a8c51 100644 +--- a/fs/jbd2/checkpoint.c ++++ b/fs/jbd2/checkpoint.c +@@ -27,7 +27,7 @@ + * + * Called with j_list_lock held. + */ +-static inline void __buffer_unlink_first(struct journal_head *jh) ++static inline void __buffer_unlink(struct journal_head *jh) + { + transaction_t *transaction = jh->b_cp_transaction; + +@@ -40,23 +40,6 @@ static inline void __buffer_unlink_first(struct journal_head *jh) + } + } + +-/* +- * Unlink a buffer from a transaction checkpoint(io) list. +- * +- * Called with j_list_lock held. +- */ +-static inline void __buffer_unlink(struct journal_head *jh) +-{ +- transaction_t *transaction = jh->b_cp_transaction; +- +- __buffer_unlink_first(jh); +- if (transaction->t_checkpoint_io_list == jh) { +- transaction->t_checkpoint_io_list = jh->b_cpnext; +- if (transaction->t_checkpoint_io_list == jh) +- transaction->t_checkpoint_io_list = NULL; +- } +-} +- + /* + * Check a checkpoint buffer could be release or not. + * +@@ -366,50 +349,10 @@ int jbd2_cleanup_journal_tail(journal_t *journal) + + /* Checkpoint list management */ + +-/* +- * journal_clean_one_cp_list +- * +- * Find all the written-back checkpoint buffers in the given list and +- * release them. If 'destroy' is set, clean all buffers unconditionally. +- * +- * Called with j_list_lock held. +- * Returns 1 if we freed the transaction, 0 otherwise. +- */ +-static int journal_clean_one_cp_list(struct journal_head *jh, bool destroy) +-{ +- struct journal_head *last_jh; +- struct journal_head *next_jh = jh; +- +- if (!jh) +- return 0; +- +- last_jh = jh->b_cpprev; +- do { +- jh = next_jh; +- next_jh = jh->b_cpnext; +- +- if (!destroy && __cp_buffer_busy(jh)) +- return 0; +- +- if (__jbd2_journal_remove_checkpoint(jh)) +- return 1; +- /* +- * This function only frees up some memory +- * if possible so we dont have an obligation +- * to finish processing. Bail out if preemption +- * requested: +- */ +- if (need_resched()) +- return 0; +- } while (jh != last_jh); +- +- return 0; +-} +- + /* + * journal_shrink_one_cp_list + * +- * Find 'nr_to_scan' written-back checkpoint buffers in the given list ++ * Find all the written-back checkpoint buffers in the given list + * and try to release them. If the whole transaction is released, set + * the 'released' parameter. Return the number of released checkpointed + * buffers. +@@ -417,15 +360,15 @@ static int journal_clean_one_cp_list(struct journal_head *jh, bool destroy) + * Called with j_list_lock held. 
+ */ + static unsigned long journal_shrink_one_cp_list(struct journal_head *jh, +- unsigned long *nr_to_scan, +- bool *released) ++ bool destroy, bool *released) + { + struct journal_head *last_jh; + struct journal_head *next_jh = jh; + unsigned long nr_freed = 0; + int ret; + +- if (!jh || *nr_to_scan == 0) ++ *released = false; ++ if (!jh) + return 0; + + last_jh = jh->b_cpprev; +@@ -433,12 +376,15 @@ static unsigned long journal_shrink_one_cp_list(struct journal_head *jh, + jh = next_jh; + next_jh = jh->b_cpnext; + +- (*nr_to_scan)--; +- if (__cp_buffer_busy(jh)) +- continue; ++ if (destroy) { ++ ret = __jbd2_journal_remove_checkpoint(jh); ++ } else { ++ ret = jbd2_journal_try_remove_checkpoint(jh); ++ if (ret < 0) ++ continue; ++ } + + nr_freed++; +- ret = __jbd2_journal_remove_checkpoint(jh); + if (ret) { + *released = true; + break; +@@ -446,7 +392,7 @@ static unsigned long journal_shrink_one_cp_list(struct journal_head *jh, + + if (need_resched()) + break; +- } while (jh != last_jh && *nr_to_scan); ++ } while (jh != last_jh); + + return nr_freed; + } +@@ -464,11 +410,11 @@ unsigned long jbd2_journal_shrink_checkpoint_list(journal_t *journal, + unsigned long *nr_to_scan) + { + transaction_t *transaction, *last_transaction, *next_transaction; +- bool released; ++ bool __maybe_unused released; + tid_t first_tid = 0, last_tid = 0, next_tid = 0; + tid_t tid = 0; + unsigned long nr_freed = 0; +- unsigned long nr_scanned = *nr_to_scan; ++ unsigned long freed; + + again: + spin_lock(&journal->j_list_lock); +@@ -497,19 +443,11 @@ again: + transaction = next_transaction; + next_transaction = transaction->t_cpnext; + tid = transaction->t_tid; +- released = false; + +- nr_freed += journal_shrink_one_cp_list(transaction->t_checkpoint_list, +- nr_to_scan, &released); +- if (*nr_to_scan == 0) +- break; +- if (need_resched() || spin_needbreak(&journal->j_list_lock)) +- break; +- if (released) +- continue; +- +- nr_freed += journal_shrink_one_cp_list(transaction->t_checkpoint_io_list, +- nr_to_scan, &released); ++ freed = journal_shrink_one_cp_list(transaction->t_checkpoint_list, ++ false, &released); ++ nr_freed += freed; ++ (*nr_to_scan) -= min(*nr_to_scan, freed); + if (*nr_to_scan == 0) + break; + if (need_resched() || spin_needbreak(&journal->j_list_lock)) +@@ -530,9 +468,8 @@ again: + if (*nr_to_scan && next_tid) + goto again; + out: +- nr_scanned -= *nr_to_scan; + trace_jbd2_shrink_checkpoint_list(journal, first_tid, tid, last_tid, +- nr_freed, nr_scanned, next_tid); ++ nr_freed, next_tid); + + return nr_freed; + } +@@ -548,7 +485,7 @@ out: + void __jbd2_journal_clean_checkpoint_list(journal_t *journal, bool destroy) + { + transaction_t *transaction, *last_transaction, *next_transaction; +- int ret; ++ bool released; + + transaction = journal->j_checkpoint_transactions; + if (!transaction) +@@ -559,8 +496,8 @@ void __jbd2_journal_clean_checkpoint_list(journal_t *journal, bool destroy) + do { + transaction = next_transaction; + next_transaction = transaction->t_cpnext; +- ret = journal_clean_one_cp_list(transaction->t_checkpoint_list, +- destroy); ++ journal_shrink_one_cp_list(transaction->t_checkpoint_list, ++ destroy, &released); + /* + * This function only frees up some memory if possible so we + * dont have an obligation to finish processing. 
Bail out if +@@ -568,23 +505,12 @@ void __jbd2_journal_clean_checkpoint_list(journal_t *journal, bool destroy) + */ + if (need_resched()) + return; +- if (ret) +- continue; +- /* +- * It is essential that we are as careful as in the case of +- * t_checkpoint_list with removing the buffer from the list as +- * we can possibly see not yet submitted buffers on io_list +- */ +- ret = journal_clean_one_cp_list(transaction-> +- t_checkpoint_io_list, destroy); +- if (need_resched()) +- return; + /* + * Stop scanning if we couldn't free the transaction. This + * avoids pointless scanning of transactions which still + * weren't checkpointed. + */ +- if (!ret) ++ if (!released) + return; + } while (transaction != last_transaction); + } +@@ -663,7 +589,7 @@ int __jbd2_journal_remove_checkpoint(struct journal_head *jh) + jbd2_journal_put_journal_head(jh); + + /* Is this transaction empty? */ +- if (transaction->t_checkpoint_list || transaction->t_checkpoint_io_list) ++ if (transaction->t_checkpoint_list) + return 0; + + /* +@@ -694,6 +620,34 @@ int __jbd2_journal_remove_checkpoint(struct journal_head *jh) + return 1; + } + ++/* ++ * Check the checkpoint buffer and try to remove it from the checkpoint ++ * list if it's clean. Returns -EBUSY if it is not clean, returns 1 if ++ * it frees the transaction, 0 otherwise. ++ * ++ * This function is called with j_list_lock held. ++ */ ++int jbd2_journal_try_remove_checkpoint(struct journal_head *jh) ++{ ++ struct buffer_head *bh = jh2bh(jh); ++ ++ if (!trylock_buffer(bh)) ++ return -EBUSY; ++ if (buffer_dirty(bh)) { ++ unlock_buffer(bh); ++ return -EBUSY; ++ } ++ unlock_buffer(bh); ++ ++ /* ++ * Buffer is clean and the IO has finished (we held the buffer ++ * lock) so the checkpoint is done. We can safely remove the ++ * buffer from this transaction. ++ */ ++ JBUFFER_TRACE(jh, "remove from checkpoint list"); ++ return __jbd2_journal_remove_checkpoint(jh); ++} ++ + /* + * journal_insert_checkpoint: put a committed buffer onto a checkpoint + * list so that we know when it is safe to clean the transaction out of +@@ -755,7 +709,6 @@ void __jbd2_journal_drop_transaction(journal_t *journal, transaction_t *transact + J_ASSERT(transaction->t_forget == NULL); + J_ASSERT(transaction->t_shadow_list == NULL); + J_ASSERT(transaction->t_checkpoint_list == NULL); +- J_ASSERT(transaction->t_checkpoint_io_list == NULL); + J_ASSERT(atomic_read(&transaction->t_updates) == 0); + J_ASSERT(journal->j_committing_transaction != transaction); + J_ASSERT(journal->j_running_transaction != transaction); +diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c +index ac328e3321242..20294c1bbeab7 100644 +--- a/fs/jbd2/commit.c ++++ b/fs/jbd2/commit.c +@@ -1184,8 +1184,7 @@ restart_loop: + spin_lock(&journal->j_list_lock); + commit_transaction->t_state = T_FINISHED; + /* Check if the transaction can be dropped now that we are finished */ +- if (commit_transaction->t_checkpoint_list == NULL && +- commit_transaction->t_checkpoint_io_list == NULL) { ++ if (commit_transaction->t_checkpoint_list == NULL) { + __jbd2_journal_drop_transaction(journal, commit_transaction); + jbd2_journal_free_transaction(commit_transaction); + } +diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c +index ce4a5ccadeff4..62e68c5b8ec3d 100644 +--- a/fs/jbd2/transaction.c ++++ b/fs/jbd2/transaction.c +@@ -1775,8 +1775,7 @@ int jbd2_journal_forget(handle_t *handle, struct buffer_head *bh) + * Otherwise, if the buffer has been written to disk, + * it is safe to remove the checkpoint and drop it. 
+ */ +- if (!buffer_dirty(bh)) { +- __jbd2_journal_remove_checkpoint(jh); ++ if (jbd2_journal_try_remove_checkpoint(jh) >= 0) { + spin_unlock(&journal->j_list_lock); + goto drop; + } +@@ -2103,20 +2102,14 @@ __journal_try_to_free_buffer(journal_t *journal, struct buffer_head *bh) + + jh = bh2jh(bh); + +- if (buffer_locked(bh) || buffer_dirty(bh)) +- goto out; +- + if (jh->b_next_transaction != NULL || jh->b_transaction != NULL) +- goto out; ++ return; + + spin_lock(&journal->j_list_lock); +- if (jh->b_cp_transaction != NULL) { +- /* written-back checkpointed metadata buffer */ +- JBUFFER_TRACE(jh, "remove from checkpoint list"); +- __jbd2_journal_remove_checkpoint(jh); +- } ++ /* Remove written-back checkpointed metadata buffer */ ++ if (jh->b_cp_transaction != NULL) ++ jbd2_journal_try_remove_checkpoint(jh); + spin_unlock(&journal->j_list_lock); +-out: + return; + } + +diff --git a/fs/nfs/direct.c b/fs/nfs/direct.c +index c220810c61d14..fbc7304bed56b 100644 +--- a/fs/nfs/direct.c ++++ b/fs/nfs/direct.c +@@ -509,20 +509,26 @@ out: + return result; + } + +-static void +-nfs_direct_join_group(struct list_head *list, struct inode *inode) ++static void nfs_direct_join_group(struct list_head *list, struct inode *inode) + { +- struct nfs_page *req, *next; ++ struct nfs_page *req, *subreq; + + list_for_each_entry(req, list, wb_list) { +- if (req->wb_head != req || req->wb_this_page == req) ++ if (req->wb_head != req) + continue; +- for (next = req->wb_this_page; +- next != req->wb_head; +- next = next->wb_this_page) { +- nfs_list_remove_request(next); +- nfs_release_request(next); +- } ++ subreq = req->wb_this_page; ++ if (subreq == req) ++ continue; ++ do { ++ /* ++ * Remove subrequests from this list before freeing ++ * them in the call to nfs_join_page_group(). 
++ */ ++ if (!list_empty(&subreq->wb_list)) { ++ nfs_list_remove_request(subreq); ++ nfs_release_request(subreq); ++ } ++ } while ((subreq = subreq->wb_this_page) != req); + nfs_join_page_group(req, inode); + } + } +diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c +index e4524635a129a..d8f01d222c499 100644 +--- a/fs/nfs/inode.c ++++ b/fs/nfs/inode.c +@@ -731,9 +731,7 @@ void nfs_setattr_update_inode(struct inode *inode, struct iattr *attr, + if ((attr->ia_valid & ATTR_KILL_SUID) != 0 && + inode->i_mode & S_ISUID) + inode->i_mode &= ~S_ISUID; +- if ((attr->ia_valid & ATTR_KILL_SGID) != 0 && +- (inode->i_mode & (S_ISGID | S_IXGRP)) == +- (S_ISGID | S_IXGRP)) ++ if (setattr_should_drop_sgid(&init_user_ns, inode)) + inode->i_mode &= ~S_ISGID; + if ((attr->ia_valid & ATTR_MODE) != 0) { + int mode = attr->ia_mode & S_IALLUGO; +diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c +index da94bf2afd070..bc07012741cb4 100644 +--- a/fs/nfs/nfs42proc.c ++++ b/fs/nfs/nfs42proc.c +@@ -1339,7 +1339,6 @@ ssize_t nfs42_proc_getxattr(struct inode *inode, const char *name, + for (i = 0; i < np; i++) { + pages[i] = alloc_page(GFP_KERNEL); + if (!pages[i]) { +- np = i + 1; + err = -ENOMEM; + goto out; + } +@@ -1363,8 +1362,8 @@ ssize_t nfs42_proc_getxattr(struct inode *inode, const char *name, + } while (exception.retry); + + out: +- while (--np >= 0) +- __free_page(pages[np]); ++ while (--i >= 0) ++ __free_page(pages[i]); + kfree(pages); + + return err; +diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c +index b1ec9b5d06e58..a21e25cbd4515 100644 +--- a/fs/nfs/nfs4proc.c ++++ b/fs/nfs/nfs4proc.c +@@ -5964,9 +5964,8 @@ static ssize_t __nfs4_get_acl_uncached(struct inode *inode, void *buf, size_t bu + out_ok: + ret = res.acl_len; + out_free: +- for (i = 0; i < npages; i++) +- if (pages[i]) +- __free_page(pages[i]); ++ while (--i >= 0) ++ __free_page(pages[i]); + if (res.acl_scratch) + __free_page(res.acl_scratch); + kfree(pages); +@@ -7153,8 +7152,15 @@ static void nfs4_lock_done(struct rpc_task *task, void *calldata) + } else if (!nfs4_update_lock_stateid(lsp, &data->res.stateid)) + goto out_restart; + break; +- case -NFS4ERR_BAD_STATEID: + case -NFS4ERR_OLD_STATEID: ++ if (data->arg.new_lock_owner != 0 && ++ nfs4_refresh_open_old_stateid(&data->arg.open_stateid, ++ lsp->ls_state)) ++ goto out_restart; ++ if (nfs4_refresh_lock_old_stateid(&data->arg.lock_stateid, lsp)) ++ goto out_restart; ++ fallthrough; ++ case -NFS4ERR_BAD_STATEID: + case -NFS4ERR_STALE_STATEID: + case -NFS4ERR_EXPIRED: + if (data->arg.new_lock_owner != 0) { +diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c +index c8729493df5ca..f54ef526f25d0 100644 +--- a/fs/nfsd/nfs4state.c ++++ b/fs/nfsd/nfs4state.c +@@ -1263,9 +1263,9 @@ static void revoke_delegation(struct nfs4_delegation *dp) + WARN_ON(!list_empty(&dp->dl_recall_lru)); + + if (clp->cl_minorversion) { ++ spin_lock(&clp->cl_lock); + dp->dl_stid.sc_type = NFS4_REVOKED_DELEG_STID; + refcount_inc(&dp->dl_stid.sc_count); +- spin_lock(&clp->cl_lock); + list_add(&dp->dl_recall_lru, &clp->cl_revoked); + spin_unlock(&clp->cl_lock); + } +diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c +index d4adc599737d8..15a86876e3d90 100644 +--- a/fs/nfsd/vfs.c ++++ b/fs/nfsd/vfs.c +@@ -322,7 +322,9 @@ nfsd_sanitize_attrs(struct inode *inode, struct iattr *iap) + iap->ia_mode &= ~S_ISGID; + } else { + /* set ATTR_KILL_* bits and let VFS handle it */ +- iap->ia_valid |= (ATTR_KILL_SUID | ATTR_KILL_SGID); ++ iap->ia_valid |= ATTR_KILL_SUID; ++ iap->ia_valid |= ++ setattr_should_drop_sgid(&init_user_ns, inode); + } 
+ } + } +diff --git a/include/drm/drm_dp_helper.h b/include/drm/drm_dp_helper.h +index dfb46915015b1..9c7949ebc159e 100644 +--- a/include/drm/drm_dp_helper.h ++++ b/include/drm/drm_dp_helper.h +@@ -1495,7 +1495,7 @@ u8 drm_dp_get_adjust_request_post_cursor(const u8 link_status[DP_LINK_STATUS_SIZ + + #define DP_BRANCH_OUI_HEADER_SIZE 0xc + #define DP_RECEIVER_CAP_SIZE 0xf +-#define DP_DSC_RECEIVER_CAP_SIZE 0xf ++#define DP_DSC_RECEIVER_CAP_SIZE 0x10 /* DSC Capabilities 0x60 through 0x6F */ + #define EDP_PSR_RECEIVER_CAP_SIZE 2 + #define EDP_DISPLAY_CTL_CAP_SIZE 3 + #define DP_LTTPR_COMMON_CAP_SIZE 8 +diff --git a/include/linux/clk.h b/include/linux/clk.h +index e280e0acb55c6..05ab315aa84bc 100644 +--- a/include/linux/clk.h ++++ b/include/linux/clk.h +@@ -183,6 +183,39 @@ int clk_get_scaled_duty_cycle(struct clk *clk, unsigned int scale); + */ + bool clk_is_match(const struct clk *p, const struct clk *q); + ++/** ++ * clk_rate_exclusive_get - get exclusivity over the rate control of a ++ * producer ++ * @clk: clock source ++ * ++ * This function allows drivers to get exclusive control over the rate of a ++ * provider. It prevents any other consumer to execute, even indirectly, ++ * opereation which could alter the rate of the provider or cause glitches ++ * ++ * If exlusivity is claimed more than once on clock, even by the same driver, ++ * the rate effectively gets locked as exclusivity can't be preempted. ++ * ++ * Must not be called from within atomic context. ++ * ++ * Returns success (0) or negative errno. ++ */ ++int clk_rate_exclusive_get(struct clk *clk); ++ ++/** ++ * clk_rate_exclusive_put - release exclusivity over the rate control of a ++ * producer ++ * @clk: clock source ++ * ++ * This function allows drivers to release the exclusivity it previously got ++ * from clk_rate_exclusive_get() ++ * ++ * The caller must balance the number of clk_rate_exclusive_get() and ++ * clk_rate_exclusive_put() calls. ++ * ++ * Must not be called from within atomic context. ++ */ ++void clk_rate_exclusive_put(struct clk *clk); ++ + #else + + static inline int clk_notifier_register(struct clk *clk, +@@ -236,6 +269,13 @@ static inline bool clk_is_match(const struct clk *p, const struct clk *q) + return p == q; + } + ++static inline int clk_rate_exclusive_get(struct clk *clk) ++{ ++ return 0; ++} ++ ++static inline void clk_rate_exclusive_put(struct clk *clk) {} ++ + #endif + + #ifdef CONFIG_HAVE_CLK_PREPARE +@@ -570,38 +610,6 @@ struct clk *devm_clk_get_optional_enabled(struct device *dev, const char *id); + */ + struct clk *devm_get_clk_from_child(struct device *dev, + struct device_node *np, const char *con_id); +-/** +- * clk_rate_exclusive_get - get exclusivity over the rate control of a +- * producer +- * @clk: clock source +- * +- * This function allows drivers to get exclusive control over the rate of a +- * provider. It prevents any other consumer to execute, even indirectly, +- * opereation which could alter the rate of the provider or cause glitches +- * +- * If exlusivity is claimed more than once on clock, even by the same driver, +- * the rate effectively gets locked as exclusivity can't be preempted. +- * +- * Must not be called from within atomic context. +- * +- * Returns success (0) or negative errno. 
+- */ +-int clk_rate_exclusive_get(struct clk *clk); +- +-/** +- * clk_rate_exclusive_put - release exclusivity over the rate control of a +- * producer +- * @clk: clock source +- * +- * This function allows drivers to release the exclusivity it previously got +- * from clk_rate_exclusive_get() +- * +- * The caller must balance the number of clk_rate_exclusive_get() and +- * clk_rate_exclusive_put() calls. +- * +- * Must not be called from within atomic context. +- */ +-void clk_rate_exclusive_put(struct clk *clk); + + /** + * clk_enable - inform the system when the clock source should be running. +@@ -961,14 +969,6 @@ static inline void clk_bulk_put_all(int num_clks, struct clk_bulk_data *clks) {} + + static inline void devm_clk_put(struct device *dev, struct clk *clk) {} + +- +-static inline int clk_rate_exclusive_get(struct clk *clk) +-{ +- return 0; +-} +- +-static inline void clk_rate_exclusive_put(struct clk *clk) {} +- + static inline int clk_enable(struct clk *clk) + { + return 0; +diff --git a/include/linux/cpuset.h b/include/linux/cpuset.h +index d2b9c41c8edf5..82fb7e24d1cbe 100644 +--- a/include/linux/cpuset.h ++++ b/include/linux/cpuset.h +@@ -56,8 +56,10 @@ extern void cpuset_init_smp(void); + extern void cpuset_force_rebuild(void); + extern void cpuset_update_active_cpus(void); + extern void cpuset_wait_for_hotplug(void); +-extern void cpuset_read_lock(void); +-extern void cpuset_read_unlock(void); ++extern void inc_dl_tasks_cs(struct task_struct *task); ++extern void dec_dl_tasks_cs(struct task_struct *task); ++extern void cpuset_lock(void); ++extern void cpuset_unlock(void); + extern void cpuset_cpus_allowed(struct task_struct *p, struct cpumask *mask); + extern bool cpuset_cpus_allowed_fallback(struct task_struct *p); + extern nodemask_t cpuset_mems_allowed(struct task_struct *p); +@@ -179,8 +181,10 @@ static inline void cpuset_update_active_cpus(void) + + static inline void cpuset_wait_for_hotplug(void) { } + +-static inline void cpuset_read_lock(void) { } +-static inline void cpuset_read_unlock(void) { } ++static inline void inc_dl_tasks_cs(struct task_struct *task) { } ++static inline void dec_dl_tasks_cs(struct task_struct *task) { } ++static inline void cpuset_lock(void) { } ++static inline void cpuset_unlock(void) { } + + static inline void cpuset_cpus_allowed(struct task_struct *p, + struct cpumask *mask) +diff --git a/include/linux/fs.h b/include/linux/fs.h +index c0dd2794e1ba0..6bba7a58c95c6 100644 +--- a/include/linux/fs.h ++++ b/include/linux/fs.h +@@ -3135,6 +3135,8 @@ extern struct inode *new_inode(struct super_block *sb); + extern void free_inode_nonrcu(struct inode *inode); + extern int setattr_should_drop_suidgid(struct user_namespace *, struct inode *); + extern int file_remove_privs(struct file *); ++int setattr_should_drop_sgid(struct user_namespace *mnt_userns, ++ const struct inode *inode); + + extern void __insert_inode_hash(struct inode *, unsigned long hashval); + static inline void insert_inode_hash(struct inode *inode) +diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h +index d63b8106796e2..ade8a6d7acff9 100644 +--- a/include/linux/jbd2.h ++++ b/include/linux/jbd2.h +@@ -626,12 +626,6 @@ struct transaction_s + */ + struct journal_head *t_checkpoint_list; + +- /* +- * Doubly-linked circular list of all buffers submitted for IO while +- * checkpointing. [j_list_lock] +- */ +- struct journal_head *t_checkpoint_io_list; +- + /* + * Doubly-linked circular list of metadata buffers being + * shadowed by log IO. 
The IO buffers on the iobuf list and +@@ -1447,6 +1441,7 @@ extern void jbd2_journal_commit_transaction(journal_t *); + void __jbd2_journal_clean_checkpoint_list(journal_t *journal, bool destroy); + unsigned long jbd2_journal_shrink_checkpoint_list(journal_t *journal, unsigned long *nr_to_scan); + int __jbd2_journal_remove_checkpoint(struct journal_head *); ++int jbd2_journal_try_remove_checkpoint(struct journal_head *jh); + void jbd2_journal_destroy_checkpoint(journal_t *journal); + void __jbd2_journal_insert_checkpoint(struct journal_head *, transaction_t *); + +diff --git a/include/linux/raid_class.h b/include/linux/raid_class.h +index 5cdfcb873a8f0..772d45b2a60a0 100644 +--- a/include/linux/raid_class.h ++++ b/include/linux/raid_class.h +@@ -77,7 +77,3 @@ DEFINE_RAID_ATTRIBUTE(enum raid_state, state) + + struct raid_template *raid_class_attach(struct raid_function_template *); + void raid_class_release(struct raid_template *); +- +-int __must_check raid_component_add(struct raid_template *, struct device *, +- struct device *); +- +diff --git a/include/linux/sched.h b/include/linux/sched.h +index 7c17742d359cd..7bfc2b45cd99b 100644 +--- a/include/linux/sched.h ++++ b/include/linux/sched.h +@@ -1797,7 +1797,9 @@ current_restore_flags(unsigned long orig_flags, unsigned long flags) + } + + extern int cpuset_cpumask_can_shrink(const struct cpumask *cur, const struct cpumask *trial); +-extern int task_can_attach(struct task_struct *p, const struct cpumask *cs_effective_cpus); ++extern int task_can_attach(struct task_struct *p); ++extern int dl_bw_alloc(int cpu, u64 dl_bw); ++extern void dl_bw_free(int cpu, u64 dl_bw); + #ifdef CONFIG_SMP + extern void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask); + extern int set_cpus_allowed_ptr(struct task_struct *p, const struct cpumask *new_mask); +diff --git a/include/net/bonding.h b/include/net/bonding.h +index e4453cf4f0171..08d222752cc88 100644 +--- a/include/net/bonding.h ++++ b/include/net/bonding.h +@@ -700,37 +700,14 @@ static inline struct slave *bond_slave_has_mac(struct bonding *bond, + } + + /* Caller must hold rcu_read_lock() for read */ +-static inline struct slave *bond_slave_has_mac_rcu(struct bonding *bond, +- const u8 *mac) ++static inline bool bond_slave_has_mac_rcu(struct bonding *bond, const u8 *mac) + { + struct list_head *iter; + struct slave *tmp; + +- bond_for_each_slave_rcu(bond, tmp, iter) +- if (ether_addr_equal_64bits(mac, tmp->dev->dev_addr)) +- return tmp; +- +- return NULL; +-} +- +-/* Caller must hold rcu_read_lock() for read */ +-static inline bool bond_slave_has_mac_rx(struct bonding *bond, const u8 *mac) +-{ +- struct list_head *iter; +- struct slave *tmp; +- struct netdev_hw_addr *ha; +- + bond_for_each_slave_rcu(bond, tmp, iter) + if (ether_addr_equal_64bits(mac, tmp->dev->dev_addr)) + return true; +- +- if (netdev_uc_empty(bond->dev)) +- return false; +- +- netdev_for_each_uc_addr(ha, bond->dev) +- if (ether_addr_equal_64bits(mac, ha->addr)) +- return true; +- + return false; + } + +diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h +index 9f48733bfd21c..a2a74e0e5c494 100644 +--- a/include/net/rtnetlink.h ++++ b/include/net/rtnetlink.h +@@ -175,8 +175,8 @@ struct net_device *rtnl_create_link(struct net *net, const char *ifname, + int rtnl_delete_link(struct net_device *dev); + int rtnl_configure_link(struct net_device *dev, const struct ifinfomsg *ifm); + +-int rtnl_nla_parse_ifla(struct nlattr **tb, const struct nlattr *head, int len, +- struct netlink_ext_ack *exterr); 
++int rtnl_nla_parse_ifinfomsg(struct nlattr **tb, const struct nlattr *nla_peer, ++ struct netlink_ext_ack *exterr); + struct net *rtnl_get_net_ns_capable(struct sock *sk, int netnsid); + + #define MODULE_ALIAS_RTNL_LINK(kind) MODULE_ALIAS("rtnl-link-" kind) +diff --git a/include/net/sock.h b/include/net/sock.h +index 6b12b62417e08..640bd7a367779 100644 +--- a/include/net/sock.h ++++ b/include/net/sock.h +@@ -1259,6 +1259,7 @@ struct proto { + /* + * Pressure flag: try to collapse. + * Technical note: it is used by multiple contexts non atomically. ++ * Make sure to use READ_ONCE()/WRITE_ONCE() for all reads/writes. + * All the __sk_mem_schedule() is of this nature: accounting + * is strict, actions are advisory and have some latency. + */ +@@ -1384,7 +1385,7 @@ static inline bool sk_has_memory_pressure(const struct sock *sk) + static inline bool sk_under_global_memory_pressure(const struct sock *sk) + { + return sk->sk_prot->memory_pressure && +- !!*sk->sk_prot->memory_pressure; ++ !!READ_ONCE(*sk->sk_prot->memory_pressure); + } + + static inline bool sk_under_memory_pressure(const struct sock *sk) +@@ -1396,7 +1397,7 @@ static inline bool sk_under_memory_pressure(const struct sock *sk) + mem_cgroup_under_socket_pressure(sk->sk_memcg)) + return true; + +- return !!*sk->sk_prot->memory_pressure; ++ return !!READ_ONCE(*sk->sk_prot->memory_pressure); + } + + static inline long +@@ -1454,7 +1455,7 @@ proto_memory_pressure(struct proto *prot) + { + if (!prot->memory_pressure) + return false; +- return !!*prot->memory_pressure; ++ return !!READ_ONCE(*prot->memory_pressure); + } + + +diff --git a/include/trace/events/jbd2.h b/include/trace/events/jbd2.h +index 29414288ea3e0..34ce197bd76e0 100644 +--- a/include/trace/events/jbd2.h ++++ b/include/trace/events/jbd2.h +@@ -462,11 +462,9 @@ TRACE_EVENT(jbd2_shrink_scan_exit, + TRACE_EVENT(jbd2_shrink_checkpoint_list, + + TP_PROTO(journal_t *journal, tid_t first_tid, tid_t tid, tid_t last_tid, +- unsigned long nr_freed, unsigned long nr_scanned, +- tid_t next_tid), ++ unsigned long nr_freed, tid_t next_tid), + +- TP_ARGS(journal, first_tid, tid, last_tid, nr_freed, +- nr_scanned, next_tid), ++ TP_ARGS(journal, first_tid, tid, last_tid, nr_freed, next_tid), + + TP_STRUCT__entry( + __field(dev_t, dev) +@@ -474,7 +472,6 @@ TRACE_EVENT(jbd2_shrink_checkpoint_list, + __field(tid_t, tid) + __field(tid_t, last_tid) + __field(unsigned long, nr_freed) +- __field(unsigned long, nr_scanned) + __field(tid_t, next_tid) + ), + +@@ -484,15 +481,14 @@ TRACE_EVENT(jbd2_shrink_checkpoint_list, + __entry->tid = tid; + __entry->last_tid = last_tid; + __entry->nr_freed = nr_freed; +- __entry->nr_scanned = nr_scanned; + __entry->next_tid = next_tid; + ), + + TP_printk("dev %d,%d shrink transaction %u-%u(%u) freed %lu " +- "scanned %lu next transaction %u", ++ "next transaction %u", + MAJOR(__entry->dev), MINOR(__entry->dev), + __entry->first_tid, __entry->tid, __entry->last_tid, +- __entry->nr_freed, __entry->nr_scanned, __entry->next_tid) ++ __entry->nr_freed, __entry->next_tid) + ); + + #endif /* _TRACE_JBD2_H */ +diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c +index 6ccdbce173994..be467aea457e7 100644 +--- a/kernel/cgroup/cgroup.c ++++ b/kernel/cgroup/cgroup.c +@@ -56,6 +56,7 @@ + #include + #include + #include ++#include + #include + #include + +@@ -6467,6 +6468,9 @@ void cgroup_exit(struct task_struct *tsk) + list_add_tail(&tsk->cg_list, &cset->dying_tasks); + cset->nr_tasks--; + ++ if (dl_task(tsk)) ++ dec_dl_tasks_cs(tsk); ++ + 
WARN_ON_ONCE(cgroup_task_frozen(tsk)); + if (unlikely(!(tsk->flags & PF_KTHREAD) && + test_bit(CGRP_FREEZE, &task_dfl_cgroup(tsk)->flags))) +diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c +index fb895eaf3a7c3..6905079c15c25 100644 +--- a/kernel/cgroup/cpuset.c ++++ b/kernel/cgroup/cpuset.c +@@ -162,6 +162,14 @@ struct cpuset { + int use_parent_ecpus; + int child_ecpus_count; + ++ /* ++ * number of SCHED_DEADLINE tasks attached to this cpuset, so that we ++ * know when to rebuild associated root domain bandwidth information. ++ */ ++ int nr_deadline_tasks; ++ int nr_migrate_dl_tasks; ++ u64 sum_migrate_dl_bw; ++ + /* Handle for cpuset.cpus.partition */ + struct cgroup_file partition_file; + }; +@@ -209,6 +217,20 @@ static inline struct cpuset *parent_cs(struct cpuset *cs) + return css_cs(cs->css.parent); + } + ++void inc_dl_tasks_cs(struct task_struct *p) ++{ ++ struct cpuset *cs = task_cs(p); ++ ++ cs->nr_deadline_tasks++; ++} ++ ++void dec_dl_tasks_cs(struct task_struct *p) ++{ ++ struct cpuset *cs = task_cs(p); ++ ++ cs->nr_deadline_tasks--; ++} ++ + /* bits in struct cpuset flags field */ + typedef enum { + CS_ONLINE, +@@ -312,22 +334,23 @@ static struct cpuset top_cpuset = { + if (is_cpuset_online(((des_cs) = css_cs((pos_css))))) + + /* +- * There are two global locks guarding cpuset structures - cpuset_rwsem and ++ * There are two global locks guarding cpuset structures - cpuset_mutex and + * callback_lock. We also require taking task_lock() when dereferencing a + * task's cpuset pointer. See "The task_lock() exception", at the end of this +- * comment. The cpuset code uses only cpuset_rwsem write lock. Other +- * kernel subsystems can use cpuset_read_lock()/cpuset_read_unlock() to +- * prevent change to cpuset structures. ++ * comment. The cpuset code uses only cpuset_mutex. Other kernel subsystems ++ * can use cpuset_lock()/cpuset_unlock() to prevent change to cpuset ++ * structures. Note that cpuset_mutex needs to be a mutex as it is used in ++ * paths that rely on priority inheritance (e.g. scheduler - on RT) for ++ * correctness. + * + * A task must hold both locks to modify cpusets. If a task holds +- * cpuset_rwsem, it blocks others wanting that rwsem, ensuring that it +- * is the only task able to also acquire callback_lock and be able to +- * modify cpusets. It can perform various checks on the cpuset structure +- * first, knowing nothing will change. It can also allocate memory while +- * just holding cpuset_rwsem. While it is performing these checks, various +- * callback routines can briefly acquire callback_lock to query cpusets. +- * Once it is ready to make the changes, it takes callback_lock, blocking +- * everyone else. ++ * cpuset_mutex, it blocks others, ensuring that it is the only task able to ++ * also acquire callback_lock and be able to modify cpusets. It can perform ++ * various checks on the cpuset structure first, knowing nothing will change. ++ * It can also allocate memory while just holding cpuset_mutex. While it is ++ * performing these checks, various callback routines can briefly acquire ++ * callback_lock to query cpusets. Once it is ready to make the changes, it ++ * takes callback_lock, blocking everyone else. 
+ * + * Calls to the kernel memory allocator can not be made while holding + * callback_lock, as that would risk double tripping on callback_lock +@@ -349,16 +372,16 @@ static struct cpuset top_cpuset = { + * guidelines for accessing subsystem state in kernel/cgroup.c + */ + +-DEFINE_STATIC_PERCPU_RWSEM(cpuset_rwsem); ++static DEFINE_MUTEX(cpuset_mutex); + +-void cpuset_read_lock(void) ++void cpuset_lock(void) + { +- percpu_down_read(&cpuset_rwsem); ++ mutex_lock(&cpuset_mutex); + } + +-void cpuset_read_unlock(void) ++void cpuset_unlock(void) + { +- percpu_up_read(&cpuset_rwsem); ++ mutex_unlock(&cpuset_mutex); + } + + static DEFINE_SPINLOCK(callback_lock); +@@ -396,7 +419,7 @@ static inline bool is_in_v2_mode(void) + * One way or another, we guarantee to return some non-empty subset + * of cpu_online_mask. + * +- * Call with callback_lock or cpuset_rwsem held. ++ * Call with callback_lock or cpuset_mutex held. + */ + static void guarantee_online_cpus(struct task_struct *tsk, + struct cpumask *pmask) +@@ -438,7 +461,7 @@ out_unlock: + * One way or another, we guarantee to return some non-empty subset + * of node_states[N_MEMORY]. + * +- * Call with callback_lock or cpuset_rwsem held. ++ * Call with callback_lock or cpuset_mutex held. + */ + static void guarantee_online_mems(struct cpuset *cs, nodemask_t *pmask) + { +@@ -450,7 +473,7 @@ static void guarantee_online_mems(struct cpuset *cs, nodemask_t *pmask) + /* + * update task's spread flag if cpuset's page/slab spread flag is set + * +- * Call with callback_lock or cpuset_rwsem held. ++ * Call with callback_lock or cpuset_mutex held. + */ + static void cpuset_update_task_spread_flag(struct cpuset *cs, + struct task_struct *tsk) +@@ -471,7 +494,7 @@ static void cpuset_update_task_spread_flag(struct cpuset *cs, + * + * One cpuset is a subset of another if all its allowed CPUs and + * Memory Nodes are a subset of the other, and its exclusive flags +- * are only set if the other's are set. Call holding cpuset_rwsem. ++ * are only set if the other's are set. Call holding cpuset_mutex. + */ + + static int is_cpuset_subset(const struct cpuset *p, const struct cpuset *q) +@@ -580,7 +603,7 @@ static inline void free_cpuset(struct cpuset *cs) + * If we replaced the flag and mask values of the current cpuset + * (cur) with those values in the trial cpuset (trial), would + * our various subset and exclusive rules still be valid? Presumes +- * cpuset_rwsem held. ++ * cpuset_mutex held. + * + * 'cur' is the address of an actual, in-use cpuset. Operations + * such as list traversal that depend on the actual address of the +@@ -703,7 +726,7 @@ static void update_domain_attr_tree(struct sched_domain_attr *dattr, + rcu_read_unlock(); + } + +-/* Must be called with cpuset_rwsem held. */ ++/* Must be called with cpuset_mutex held. */ + static inline int nr_cpusets(void) + { + /* jump label reference count + the top-level cpuset */ +@@ -729,7 +752,7 @@ static inline int nr_cpusets(void) + * domains when operating in the severe memory shortage situations + * that could cause allocation failures below. + * +- * Must be called with cpuset_rwsem held. ++ * Must be called with cpuset_mutex held. 
+ * + * The three key local variables below are: + * cp - cpuset pointer, used (together with pos_css) to perform a +@@ -940,11 +963,14 @@ done: + return ndoms; + } + +-static void update_tasks_root_domain(struct cpuset *cs) ++static void dl_update_tasks_root_domain(struct cpuset *cs) + { + struct css_task_iter it; + struct task_struct *task; + ++ if (cs->nr_deadline_tasks == 0) ++ return; ++ + css_task_iter_start(&cs->css, 0, &it); + + while ((task = css_task_iter_next(&it))) +@@ -953,12 +979,12 @@ static void update_tasks_root_domain(struct cpuset *cs) + css_task_iter_end(&it); + } + +-static void rebuild_root_domains(void) ++static void dl_rebuild_rd_accounting(void) + { + struct cpuset *cs = NULL; + struct cgroup_subsys_state *pos_css; + +- percpu_rwsem_assert_held(&cpuset_rwsem); ++ lockdep_assert_held(&cpuset_mutex); + lockdep_assert_cpus_held(); + lockdep_assert_held(&sched_domains_mutex); + +@@ -981,7 +1007,7 @@ static void rebuild_root_domains(void) + + rcu_read_unlock(); + +- update_tasks_root_domain(cs); ++ dl_update_tasks_root_domain(cs); + + rcu_read_lock(); + css_put(&cs->css); +@@ -995,7 +1021,7 @@ partition_and_rebuild_sched_domains(int ndoms_new, cpumask_var_t doms_new[], + { + mutex_lock(&sched_domains_mutex); + partition_sched_domains_locked(ndoms_new, doms_new, dattr_new); +- rebuild_root_domains(); ++ dl_rebuild_rd_accounting(); + mutex_unlock(&sched_domains_mutex); + } + +@@ -1008,7 +1034,7 @@ partition_and_rebuild_sched_domains(int ndoms_new, cpumask_var_t doms_new[], + * 'cpus' is removed, then call this routine to rebuild the + * scheduler's dynamic sched domains. + * +- * Call with cpuset_rwsem held. Takes cpus_read_lock(). ++ * Call with cpuset_mutex held. Takes cpus_read_lock(). + */ + static void rebuild_sched_domains_locked(void) + { +@@ -1019,7 +1045,7 @@ static void rebuild_sched_domains_locked(void) + int ndoms; + + lockdep_assert_cpus_held(); +- percpu_rwsem_assert_held(&cpuset_rwsem); ++ lockdep_assert_held(&cpuset_mutex); + + /* + * If we have raced with CPU hotplug, return early to avoid +@@ -1070,9 +1096,9 @@ static void rebuild_sched_domains_locked(void) + void rebuild_sched_domains(void) + { + cpus_read_lock(); +- percpu_down_write(&cpuset_rwsem); ++ mutex_lock(&cpuset_mutex); + rebuild_sched_domains_locked(); +- percpu_up_write(&cpuset_rwsem); ++ mutex_unlock(&cpuset_mutex); + cpus_read_unlock(); + } + +@@ -1081,7 +1107,7 @@ void rebuild_sched_domains(void) + * @cs: the cpuset in which each task's cpus_allowed mask needs to be changed + * + * Iterate through each task of @cs updating its cpus_allowed to the +- * effective cpuset's. As this function is called with cpuset_rwsem held, ++ * effective cpuset's. As this function is called with cpuset_mutex held, + * cpuset membership stays stable. + */ + static void update_tasks_cpumask(struct cpuset *cs) +@@ -1188,7 +1214,7 @@ static int update_parent_subparts_cpumask(struct cpuset *cpuset, int cmd, + int old_prs, new_prs; + bool part_error = false; /* Partition error? */ + +- percpu_rwsem_assert_held(&cpuset_rwsem); ++ lockdep_assert_held(&cpuset_mutex); + + /* + * The parent must be a partition root. +@@ -1358,7 +1384,7 @@ static int update_parent_subparts_cpumask(struct cpuset *cpuset, int cmd, + * + * On legacy hierarchy, effective_cpus will be the same with cpu_allowed. 
+ * +- * Called with cpuset_rwsem held ++ * Called with cpuset_mutex held + */ + static void update_cpumasks_hier(struct cpuset *cs, struct tmpmasks *tmp) + { +@@ -1521,7 +1547,7 @@ static void update_sibling_cpumasks(struct cpuset *parent, struct cpuset *cs, + struct cpuset *sibling; + struct cgroup_subsys_state *pos_css; + +- percpu_rwsem_assert_held(&cpuset_rwsem); ++ lockdep_assert_held(&cpuset_mutex); + + /* + * Check all its siblings and call update_cpumasks_hier() +@@ -1724,12 +1750,12 @@ static void *cpuset_being_rebound; + * @cs: the cpuset in which each task's mems_allowed mask needs to be changed + * + * Iterate through each task of @cs updating its mems_allowed to the +- * effective cpuset's. As this function is called with cpuset_rwsem held, ++ * effective cpuset's. As this function is called with cpuset_mutex held, + * cpuset membership stays stable. + */ + static void update_tasks_nodemask(struct cpuset *cs) + { +- static nodemask_t newmems; /* protected by cpuset_rwsem */ ++ static nodemask_t newmems; /* protected by cpuset_mutex */ + struct css_task_iter it; + struct task_struct *task; + +@@ -1742,7 +1768,7 @@ static void update_tasks_nodemask(struct cpuset *cs) + * take while holding tasklist_lock. Forks can happen - the + * mpol_dup() cpuset_being_rebound check will catch such forks, + * and rebind their vma mempolicies too. Because we still hold +- * the global cpuset_rwsem, we know that no other rebind effort ++ * the global cpuset_mutex, we know that no other rebind effort + * will be contending for the global variable cpuset_being_rebound. + * It's ok if we rebind the same mm twice; mpol_rebind_mm() + * is idempotent. Also migrate pages in each mm to new nodes. +@@ -1788,7 +1814,7 @@ static void update_tasks_nodemask(struct cpuset *cs) + * + * On legacy hierarchy, effective_mems will be the same with mems_allowed. + * +- * Called with cpuset_rwsem held ++ * Called with cpuset_mutex held + */ + static void update_nodemasks_hier(struct cpuset *cs, nodemask_t *new_mems) + { +@@ -1841,7 +1867,7 @@ static void update_nodemasks_hier(struct cpuset *cs, nodemask_t *new_mems) + * mempolicies and if the cpuset is marked 'memory_migrate', + * migrate the tasks pages to the new memory. + * +- * Call with cpuset_rwsem held. May take callback_lock during call. ++ * Call with cpuset_mutex held. May take callback_lock during call. + * Will take tasklist_lock, scan tasklist for tasks in cpuset cs, + * lock each such tasks mm->mmap_lock, scan its vma's and rebind + * their mempolicies to the cpusets new mems_allowed. +@@ -1931,7 +1957,7 @@ static int update_relax_domain_level(struct cpuset *cs, s64 val) + * @cs: the cpuset in which each task's spread flags needs to be changed + * + * Iterate through each task of @cs updating its spread flags. As this +- * function is called with cpuset_rwsem held, cpuset membership stays ++ * function is called with cpuset_mutex held, cpuset membership stays + * stable. + */ + static void update_tasks_flags(struct cpuset *cs) +@@ -1951,7 +1977,7 @@ static void update_tasks_flags(struct cpuset *cs) + * cs: the cpuset to update + * turning_on: whether the flag is being set or cleared + * +- * Call with cpuset_rwsem held. ++ * Call with cpuset_mutex held. + */ + + static int update_flag(cpuset_flagbits_t bit, struct cpuset *cs, +@@ -2000,7 +2026,7 @@ out: + * cs: the cpuset to update + * new_prs: new partition root state + * +- * Call with cpuset_rwsem held. ++ * Call with cpuset_mutex held. 
+ */ + static int update_prstate(struct cpuset *cs, int new_prs) + { +@@ -2182,19 +2208,26 @@ static int fmeter_getrate(struct fmeter *fmp) + + static struct cpuset *cpuset_attach_old_cs; + +-/* Called by cgroups to determine if a cpuset is usable; cpuset_rwsem held */ ++static void reset_migrate_dl_data(struct cpuset *cs) ++{ ++ cs->nr_migrate_dl_tasks = 0; ++ cs->sum_migrate_dl_bw = 0; ++} ++ ++/* Called by cgroups to determine if a cpuset is usable; cpuset_mutex held */ + static int cpuset_can_attach(struct cgroup_taskset *tset) + { + struct cgroup_subsys_state *css; +- struct cpuset *cs; ++ struct cpuset *cs, *oldcs; + struct task_struct *task; + int ret; + + /* used later by cpuset_attach() */ + cpuset_attach_old_cs = task_cs(cgroup_taskset_first(tset, &css)); ++ oldcs = cpuset_attach_old_cs; + cs = css_cs(css); + +- percpu_down_write(&cpuset_rwsem); ++ mutex_lock(&cpuset_mutex); + + /* allow moving tasks into an empty cpuset if on default hierarchy */ + ret = -ENOSPC; +@@ -2203,14 +2236,39 @@ static int cpuset_can_attach(struct cgroup_taskset *tset) + goto out_unlock; + + cgroup_taskset_for_each(task, css, tset) { +- ret = task_can_attach(task, cs->effective_cpus); ++ ret = task_can_attach(task); + if (ret) + goto out_unlock; + ret = security_task_setscheduler(task); + if (ret) + goto out_unlock; ++ ++ if (dl_task(task)) { ++ cs->nr_migrate_dl_tasks++; ++ cs->sum_migrate_dl_bw += task->dl.dl_bw; ++ } + } + ++ if (!cs->nr_migrate_dl_tasks) ++ goto out_success; ++ ++ if (!cpumask_intersects(oldcs->effective_cpus, cs->effective_cpus)) { ++ int cpu = cpumask_any_and(cpu_active_mask, cs->effective_cpus); ++ ++ if (unlikely(cpu >= nr_cpu_ids)) { ++ reset_migrate_dl_data(cs); ++ ret = -EINVAL; ++ goto out_unlock; ++ } ++ ++ ret = dl_bw_alloc(cpu, cs->sum_migrate_dl_bw); ++ if (ret) { ++ reset_migrate_dl_data(cs); ++ goto out_unlock; ++ } ++ } ++ ++out_success: + /* + * Mark attach is in progress. This makes validate_change() fail + * changes which zero cpus/mems_allowed. +@@ -2218,7 +2276,7 @@ static int cpuset_can_attach(struct cgroup_taskset *tset) + cs->attach_in_progress++; + ret = 0; + out_unlock: +- percpu_up_write(&cpuset_rwsem); ++ mutex_unlock(&cpuset_mutex); + return ret; + } + +@@ -2230,15 +2288,23 @@ static void cpuset_cancel_attach(struct cgroup_taskset *tset) + cgroup_taskset_first(tset, &css); + cs = css_cs(css); + +- percpu_down_write(&cpuset_rwsem); ++ mutex_lock(&cpuset_mutex); + cs->attach_in_progress--; + if (!cs->attach_in_progress) + wake_up(&cpuset_attach_wq); +- percpu_up_write(&cpuset_rwsem); ++ ++ if (cs->nr_migrate_dl_tasks) { ++ int cpu = cpumask_any(cs->effective_cpus); ++ ++ dl_bw_free(cpu, cs->sum_migrate_dl_bw); ++ reset_migrate_dl_data(cs); ++ } ++ ++ mutex_unlock(&cpuset_mutex); + } + + /* +- * Protected by cpuset_rwsem. cpus_attach is used only by cpuset_attach() ++ * Protected by cpuset_mutex. cpus_attach is used only by cpuset_attach() + * but we can't allocate it dynamically there. Define it global and + * allocate from cpuset_init(). 
+ */ +@@ -2246,7 +2312,7 @@ static cpumask_var_t cpus_attach; + + static void cpuset_attach(struct cgroup_taskset *tset) + { +- /* static buf protected by cpuset_rwsem */ ++ /* static buf protected by cpuset_mutex */ + static nodemask_t cpuset_attach_nodemask_to; + struct task_struct *task; + struct task_struct *leader; +@@ -2258,7 +2324,7 @@ static void cpuset_attach(struct cgroup_taskset *tset) + cs = css_cs(css); + + lockdep_assert_cpus_held(); /* see cgroup_attach_lock() */ +- percpu_down_write(&cpuset_rwsem); ++ mutex_lock(&cpuset_mutex); + + guarantee_online_mems(cs, &cpuset_attach_nodemask_to); + +@@ -2306,11 +2372,17 @@ static void cpuset_attach(struct cgroup_taskset *tset) + + cs->old_mems_allowed = cpuset_attach_nodemask_to; + ++ if (cs->nr_migrate_dl_tasks) { ++ cs->nr_deadline_tasks += cs->nr_migrate_dl_tasks; ++ oldcs->nr_deadline_tasks -= cs->nr_migrate_dl_tasks; ++ reset_migrate_dl_data(cs); ++ } ++ + cs->attach_in_progress--; + if (!cs->attach_in_progress) + wake_up(&cpuset_attach_wq); + +- percpu_up_write(&cpuset_rwsem); ++ mutex_unlock(&cpuset_mutex); + } + + /* The various types of files and directories in a cpuset file system */ +@@ -2342,7 +2414,7 @@ static int cpuset_write_u64(struct cgroup_subsys_state *css, struct cftype *cft, + int retval = 0; + + cpus_read_lock(); +- percpu_down_write(&cpuset_rwsem); ++ mutex_lock(&cpuset_mutex); + if (!is_cpuset_online(cs)) { + retval = -ENODEV; + goto out_unlock; +@@ -2378,7 +2450,7 @@ static int cpuset_write_u64(struct cgroup_subsys_state *css, struct cftype *cft, + break; + } + out_unlock: +- percpu_up_write(&cpuset_rwsem); ++ mutex_unlock(&cpuset_mutex); + cpus_read_unlock(); + return retval; + } +@@ -2391,7 +2463,7 @@ static int cpuset_write_s64(struct cgroup_subsys_state *css, struct cftype *cft, + int retval = -ENODEV; + + cpus_read_lock(); +- percpu_down_write(&cpuset_rwsem); ++ mutex_lock(&cpuset_mutex); + if (!is_cpuset_online(cs)) + goto out_unlock; + +@@ -2404,7 +2476,7 @@ static int cpuset_write_s64(struct cgroup_subsys_state *css, struct cftype *cft, + break; + } + out_unlock: +- percpu_up_write(&cpuset_rwsem); ++ mutex_unlock(&cpuset_mutex); + cpus_read_unlock(); + return retval; + } +@@ -2437,7 +2509,7 @@ static ssize_t cpuset_write_resmask(struct kernfs_open_file *of, + * operation like this one can lead to a deadlock through kernfs + * active_ref protection. Let's break the protection. Losing the + * protection is okay as we check whether @cs is online after +- * grabbing cpuset_rwsem anyway. This only happens on the legacy ++ * grabbing cpuset_mutex anyway. This only happens on the legacy + * hierarchies. 
+ */ + css_get(&cs->css); +@@ -2445,7 +2517,7 @@ static ssize_t cpuset_write_resmask(struct kernfs_open_file *of, + flush_work(&cpuset_hotplug_work); + + cpus_read_lock(); +- percpu_down_write(&cpuset_rwsem); ++ mutex_lock(&cpuset_mutex); + if (!is_cpuset_online(cs)) + goto out_unlock; + +@@ -2469,7 +2541,7 @@ static ssize_t cpuset_write_resmask(struct kernfs_open_file *of, + + free_cpuset(trialcs); + out_unlock: +- percpu_up_write(&cpuset_rwsem); ++ mutex_unlock(&cpuset_mutex); + cpus_read_unlock(); + kernfs_unbreak_active_protection(of->kn); + css_put(&cs->css); +@@ -2602,13 +2674,13 @@ static ssize_t sched_partition_write(struct kernfs_open_file *of, char *buf, + + css_get(&cs->css); + cpus_read_lock(); +- percpu_down_write(&cpuset_rwsem); ++ mutex_lock(&cpuset_mutex); + if (!is_cpuset_online(cs)) + goto out_unlock; + + retval = update_prstate(cs, val); + out_unlock: +- percpu_up_write(&cpuset_rwsem); ++ mutex_unlock(&cpuset_mutex); + cpus_read_unlock(); + css_put(&cs->css); + return retval ?: nbytes; +@@ -2821,7 +2893,7 @@ static int cpuset_css_online(struct cgroup_subsys_state *css) + return 0; + + cpus_read_lock(); +- percpu_down_write(&cpuset_rwsem); ++ mutex_lock(&cpuset_mutex); + + set_bit(CS_ONLINE, &cs->flags); + if (is_spread_page(parent)) +@@ -2872,7 +2944,7 @@ static int cpuset_css_online(struct cgroup_subsys_state *css) + cpumask_copy(cs->effective_cpus, parent->cpus_allowed); + spin_unlock_irq(&callback_lock); + out_unlock: +- percpu_up_write(&cpuset_rwsem); ++ mutex_unlock(&cpuset_mutex); + cpus_read_unlock(); + return 0; + } +@@ -2893,7 +2965,7 @@ static void cpuset_css_offline(struct cgroup_subsys_state *css) + struct cpuset *cs = css_cs(css); + + cpus_read_lock(); +- percpu_down_write(&cpuset_rwsem); ++ mutex_lock(&cpuset_mutex); + + if (is_partition_root(cs)) + update_prstate(cs, 0); +@@ -2912,7 +2984,7 @@ static void cpuset_css_offline(struct cgroup_subsys_state *css) + cpuset_dec(); + clear_bit(CS_ONLINE, &cs->flags); + +- percpu_up_write(&cpuset_rwsem); ++ mutex_unlock(&cpuset_mutex); + cpus_read_unlock(); + } + +@@ -2925,7 +2997,7 @@ static void cpuset_css_free(struct cgroup_subsys_state *css) + + static void cpuset_bind(struct cgroup_subsys_state *root_css) + { +- percpu_down_write(&cpuset_rwsem); ++ mutex_lock(&cpuset_mutex); + spin_lock_irq(&callback_lock); + + if (is_in_v2_mode()) { +@@ -2938,7 +3010,7 @@ static void cpuset_bind(struct cgroup_subsys_state *root_css) + } + + spin_unlock_irq(&callback_lock); +- percpu_up_write(&cpuset_rwsem); ++ mutex_unlock(&cpuset_mutex); + } + + /* +@@ -2980,8 +3052,6 @@ struct cgroup_subsys cpuset_cgrp_subsys = { + + int __init cpuset_init(void) + { +- BUG_ON(percpu_init_rwsem(&cpuset_rwsem)); +- + BUG_ON(!alloc_cpumask_var(&top_cpuset.cpus_allowed, GFP_KERNEL)); + BUG_ON(!alloc_cpumask_var(&top_cpuset.effective_cpus, GFP_KERNEL)); + BUG_ON(!zalloc_cpumask_var(&top_cpuset.subparts_cpus, GFP_KERNEL)); +@@ -3053,7 +3123,7 @@ hotplug_update_tasks_legacy(struct cpuset *cs, + is_empty = cpumask_empty(cs->cpus_allowed) || + nodes_empty(cs->mems_allowed); + +- percpu_up_write(&cpuset_rwsem); ++ mutex_unlock(&cpuset_mutex); + + /* + * Move tasks to the nearest ancestor with execution resources, +@@ -3063,7 +3133,7 @@ hotplug_update_tasks_legacy(struct cpuset *cs, + if (is_empty) + remove_tasks_in_empty_cpuset(cs); + +- percpu_down_write(&cpuset_rwsem); ++ mutex_lock(&cpuset_mutex); + } + + static void +@@ -3113,14 +3183,14 @@ static void cpuset_hotplug_update_tasks(struct cpuset *cs, struct tmpmasks *tmp) + retry: + 
wait_event(cpuset_attach_wq, cs->attach_in_progress == 0); + +- percpu_down_write(&cpuset_rwsem); ++ mutex_lock(&cpuset_mutex); + + /* + * We have raced with task attaching. We wait until attaching + * is finished, so we won't attach a task to an empty cpuset. + */ + if (cs->attach_in_progress) { +- percpu_up_write(&cpuset_rwsem); ++ mutex_unlock(&cpuset_mutex); + goto retry; + } + +@@ -3198,7 +3268,7 @@ update_tasks: + hotplug_update_tasks_legacy(cs, &new_cpus, &new_mems, + cpus_updated, mems_updated); + +- percpu_up_write(&cpuset_rwsem); ++ mutex_unlock(&cpuset_mutex); + } + + /** +@@ -3228,7 +3298,7 @@ static void cpuset_hotplug_workfn(struct work_struct *work) + if (on_dfl && !alloc_cpumasks(NULL, &tmp)) + ptmp = &tmp; + +- percpu_down_write(&cpuset_rwsem); ++ mutex_lock(&cpuset_mutex); + + /* fetch the available cpus/mems and find out which changed how */ + cpumask_copy(&new_cpus, cpu_active_mask); +@@ -3285,7 +3355,7 @@ static void cpuset_hotplug_workfn(struct work_struct *work) + update_tasks_nodemask(&top_cpuset); + } + +- percpu_up_write(&cpuset_rwsem); ++ mutex_unlock(&cpuset_mutex); + + /* if cpus or mems changed, we need to propagate to descendants */ + if (cpus_updated || mems_updated) { +@@ -3695,7 +3765,7 @@ void __cpuset_memory_pressure_bump(void) + * - Used for /proc//cpuset. + * - No need to task_lock(tsk) on this tsk->cpuset reference, as it + * doesn't really matter if tsk->cpuset changes after we read it, +- * and we take cpuset_rwsem, keeping cpuset_attach() from changing it ++ * and we take cpuset_mutex, keeping cpuset_attach() from changing it + * anyway. + */ + int proc_cpuset_show(struct seq_file *m, struct pid_namespace *ns, +diff --git a/kernel/sched/core.c b/kernel/sched/core.c +index d34a56f16d13b..2324b7055260a 100644 +--- a/kernel/sched/core.c ++++ b/kernel/sched/core.c +@@ -7309,6 +7309,7 @@ static int __sched_setscheduler(struct task_struct *p, + int reset_on_fork; + int queue_flags = DEQUEUE_SAVE | DEQUEUE_MOVE | DEQUEUE_NOCLOCK; + struct rq *rq; ++ bool cpuset_locked = false; + + /* The pi code expects interrupts enabled */ + BUG_ON(pi && in_interrupt()); +@@ -7405,8 +7406,14 @@ recheck: + return retval; + } + +- if (pi) +- cpuset_read_lock(); ++ /* ++ * SCHED_DEADLINE bandwidth accounting relies on stable cpusets ++ * information. ++ */ ++ if (dl_policy(policy) || dl_policy(p->policy)) { ++ cpuset_locked = true; ++ cpuset_lock(); ++ } + + /* + * Make sure no PI-waiters arrive (or leave) while we are +@@ -7482,8 +7489,8 @@ change: + if (unlikely(oldpolicy != -1 && oldpolicy != p->policy)) { + policy = oldpolicy = -1; + task_rq_unlock(rq, p, &rf); +- if (pi) +- cpuset_read_unlock(); ++ if (cpuset_locked) ++ cpuset_unlock(); + goto recheck; + } + +@@ -7550,7 +7557,8 @@ change: + task_rq_unlock(rq, p, &rf); + + if (pi) { +- cpuset_read_unlock(); ++ if (cpuset_locked) ++ cpuset_unlock(); + rt_mutex_adjust_pi(p); + } + +@@ -7562,8 +7570,8 @@ change: + + unlock: + task_rq_unlock(rq, p, &rf); +- if (pi) +- cpuset_read_unlock(); ++ if (cpuset_locked) ++ cpuset_unlock(); + return retval; + } + +@@ -8781,8 +8789,7 @@ int cpuset_cpumask_can_shrink(const struct cpumask *cur, + return ret; + } + +-int task_can_attach(struct task_struct *p, +- const struct cpumask *cs_effective_cpus) ++int task_can_attach(struct task_struct *p) + { + int ret = 0; + +@@ -8795,21 +8802,9 @@ int task_can_attach(struct task_struct *p, + * success of set_cpus_allowed_ptr() on all attached tasks + * before cpus_mask may be changed. 
+ */ +- if (p->flags & PF_NO_SETAFFINITY) { ++ if (p->flags & PF_NO_SETAFFINITY) + ret = -EINVAL; +- goto out; +- } + +- if (dl_task(p) && !cpumask_intersects(task_rq(p)->rd->span, +- cs_effective_cpus)) { +- int cpu = cpumask_any_and(cpu_active_mask, cs_effective_cpus); +- +- if (unlikely(cpu >= nr_cpu_ids)) +- return -EINVAL; +- ret = dl_cpu_busy(cpu, p); +- } +- +-out: + return ret; + } + +@@ -9091,7 +9086,7 @@ static void cpuset_cpu_active(void) + static int cpuset_cpu_inactive(unsigned int cpu) + { + if (!cpuhp_tasks_frozen) { +- int ret = dl_cpu_busy(cpu, NULL); ++ int ret = dl_bw_check_overflow(cpu); + + if (ret) + return ret; +diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c +index b3e2064983952..de45e4d2c61fa 100644 +--- a/kernel/sched/deadline.c ++++ b/kernel/sched/deadline.c +@@ -17,6 +17,7 @@ + */ + #include "sched.h" + #include "pelt.h" ++#include + + struct dl_bandwidth def_dl_bandwidth; + +@@ -2446,6 +2447,12 @@ static void switched_from_dl(struct rq *rq, struct task_struct *p) + if (task_on_rq_queued(p) && p->dl.dl_runtime) + task_non_contending(p); + ++ /* ++ * In case a task is setscheduled out from SCHED_DEADLINE we need to ++ * keep track of that on its cpuset (for correct bandwidth tracking). ++ */ ++ dec_dl_tasks_cs(p); ++ + if (!task_on_rq_queued(p)) { + /* + * Inactive timer is armed. However, p is leaving DEADLINE and +@@ -2486,6 +2493,12 @@ static void switched_to_dl(struct rq *rq, struct task_struct *p) + if (hrtimer_try_to_cancel(&p->dl.inactive_timer) == 1) + put_task_struct(p); + ++ /* ++ * In case a task is setscheduled to SCHED_DEADLINE we need to keep ++ * track of that on its cpuset (for correct bandwidth tracking). ++ */ ++ inc_dl_tasks_cs(p); ++ + /* If p is not queued we will update its parameters at next wakeup. */ + if (!task_on_rq_queued(p)) { + add_rq_bw(&p->dl, &rq->dl); +@@ -2885,26 +2898,38 @@ int dl_cpuset_cpumask_can_shrink(const struct cpumask *cur, + return ret; + } + +-int dl_cpu_busy(int cpu, struct task_struct *p) ++enum dl_bw_request { ++ dl_bw_req_check_overflow = 0, ++ dl_bw_req_alloc, ++ dl_bw_req_free ++}; ++ ++static int dl_bw_manage(enum dl_bw_request req, int cpu, u64 dl_bw) + { +- unsigned long flags, cap; ++ unsigned long flags; + struct dl_bw *dl_b; +- bool overflow; ++ bool overflow = 0; + + rcu_read_lock_sched(); + dl_b = dl_bw_of(cpu); + raw_spin_lock_irqsave(&dl_b->lock, flags); +- cap = dl_bw_capacity(cpu); +- overflow = __dl_overflow(dl_b, cap, 0, p ? p->dl.dl_bw : 0); + +- if (!overflow && p) { +- /* +- * We reserve space for this task in the destination +- * root_domain, as we can't fail after this point. +- * We will free resources in the source root_domain +- * later on (see set_cpus_allowed_dl()). +- */ +- __dl_add(dl_b, p->dl.dl_bw, dl_bw_cpus(cpu)); ++ if (req == dl_bw_req_free) { ++ __dl_sub(dl_b, dl_bw, dl_bw_cpus(cpu)); ++ } else { ++ unsigned long cap = dl_bw_capacity(cpu); ++ ++ overflow = __dl_overflow(dl_b, cap, 0, dl_bw); ++ ++ if (req == dl_bw_req_alloc && !overflow) { ++ /* ++ * We reserve space in the destination ++ * root_domain, as we can't fail after this point. ++ * We will free resources in the source root_domain ++ * later on (see set_cpus_allowed_dl()). ++ */ ++ __dl_add(dl_b, dl_bw, dl_bw_cpus(cpu)); ++ } + } + + raw_spin_unlock_irqrestore(&dl_b->lock, flags); +@@ -2912,6 +2937,21 @@ int dl_cpu_busy(int cpu, struct task_struct *p) + + return overflow ? 
-EBUSY : 0; + } ++ ++int dl_bw_check_overflow(int cpu) ++{ ++ return dl_bw_manage(dl_bw_req_check_overflow, cpu, 0); ++} ++ ++int dl_bw_alloc(int cpu, u64 dl_bw) ++{ ++ return dl_bw_manage(dl_bw_req_alloc, cpu, dl_bw); ++} ++ ++void dl_bw_free(int cpu, u64 dl_bw) ++{ ++ dl_bw_manage(dl_bw_req_free, cpu, dl_bw); ++} + #endif + + #ifdef CONFIG_SCHED_DEBUG +diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h +index 6312f19048250..5061093d9baae 100644 +--- a/kernel/sched/sched.h ++++ b/kernel/sched/sched.h +@@ -349,7 +349,7 @@ extern void __getparam_dl(struct task_struct *p, struct sched_attr *attr); + extern bool __checkparam_dl(const struct sched_attr *attr); + extern bool dl_param_changed(struct task_struct *p, const struct sched_attr *attr); + extern int dl_cpuset_cpumask_can_shrink(const struct cpumask *cur, const struct cpumask *trial); +-extern int dl_cpu_busy(int cpu, struct task_struct *p); ++extern int dl_bw_check_overflow(int cpu); + + #ifdef CONFIG_CGROUP_SCHED + +diff --git a/kernel/torture.c b/kernel/torture.c +index bb8f411c974b8..7233b847737fd 100644 +--- a/kernel/torture.c ++++ b/kernel/torture.c +@@ -915,7 +915,7 @@ void torture_kthread_stopping(char *title) + VERBOSE_TOROUT_STRING(buf); + while (!kthread_should_stop()) { + torture_shutdown_absorb(title); +- schedule_timeout_uninterruptible(1); ++ schedule_timeout_uninterruptible(HZ / 20); + } + } + EXPORT_SYMBOL_GPL(torture_kthread_stopping); +diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c +index d4c381f06b7b2..8769cd18f622f 100644 +--- a/kernel/trace/trace.c ++++ b/kernel/trace/trace.c +@@ -4108,8 +4108,15 @@ static void *s_start(struct seq_file *m, loff_t *pos) + * will point to the same string as current_trace->name. + */ + mutex_lock(&trace_types_lock); +- if (unlikely(tr->current_trace && iter->trace->name != tr->current_trace->name)) ++ if (unlikely(tr->current_trace && iter->trace->name != tr->current_trace->name)) { ++ /* Close iter->trace before switching to the new current tracer */ ++ if (iter->trace->close) ++ iter->trace->close(iter); + *iter->trace = *tr->current_trace; ++ /* Reopen the new current tracer */ ++ if (iter->trace->open) ++ iter->trace->open(iter); ++ } + mutex_unlock(&trace_types_lock); + + #ifdef CONFIG_TRACER_MAX_TRACE +@@ -5171,11 +5178,17 @@ int tracing_set_cpumask(struct trace_array *tr, + !cpumask_test_cpu(cpu, tracing_cpumask_new)) { + atomic_inc(&per_cpu_ptr(tr->array_buffer.data, cpu)->disabled); + ring_buffer_record_disable_cpu(tr->array_buffer.buffer, cpu); ++#ifdef CONFIG_TRACER_MAX_TRACE ++ ring_buffer_record_disable_cpu(tr->max_buffer.buffer, cpu); ++#endif + } + if (!cpumask_test_cpu(cpu, tr->tracing_cpumask) && + cpumask_test_cpu(cpu, tracing_cpumask_new)) { + atomic_dec(&per_cpu_ptr(tr->array_buffer.data, cpu)->disabled); + ring_buffer_record_enable_cpu(tr->array_buffer.buffer, cpu); ++#ifdef CONFIG_TRACER_MAX_TRACE ++ ring_buffer_record_enable_cpu(tr->max_buffer.buffer, cpu); ++#endif + } + } + arch_spin_unlock(&tr->max_lock); +diff --git a/kernel/trace/trace_irqsoff.c b/kernel/trace/trace_irqsoff.c +index 590b3d51afae9..ba37f768e2f27 100644 +--- a/kernel/trace/trace_irqsoff.c ++++ b/kernel/trace/trace_irqsoff.c +@@ -231,7 +231,8 @@ static void irqsoff_trace_open(struct trace_iterator *iter) + { + if (is_graph(iter->tr)) + graph_trace_open(iter); +- ++ else ++ iter->private = NULL; + } + + static void irqsoff_trace_close(struct trace_iterator *iter) +diff --git a/kernel/trace/trace_sched_wakeup.c b/kernel/trace/trace_sched_wakeup.c +index 
2402de520eca7..b239bfaa51ae8 100644 +--- a/kernel/trace/trace_sched_wakeup.c ++++ b/kernel/trace/trace_sched_wakeup.c +@@ -168,6 +168,8 @@ static void wakeup_trace_open(struct trace_iterator *iter) + { + if (is_graph(iter->tr)) + graph_trace_open(iter); ++ else ++ iter->private = NULL; + } + + static void wakeup_trace_close(struct trace_iterator *iter) +diff --git a/lib/clz_ctz.c b/lib/clz_ctz.c +index 0d3a686b5ba29..fb8c0c5c2bd27 100644 +--- a/lib/clz_ctz.c ++++ b/lib/clz_ctz.c +@@ -28,36 +28,16 @@ int __weak __clzsi2(int val) + } + EXPORT_SYMBOL(__clzsi2); + +-int __weak __clzdi2(long val); +-int __weak __ctzdi2(long val); +-#if BITS_PER_LONG == 32 +- +-int __weak __clzdi2(long val) ++int __weak __clzdi2(u64 val); ++int __weak __clzdi2(u64 val) + { +- return 32 - fls((int)val); ++ return 64 - fls64(val); + } + EXPORT_SYMBOL(__clzdi2); + +-int __weak __ctzdi2(long val) ++int __weak __ctzdi2(u64 val); ++int __weak __ctzdi2(u64 val) + { +- return __ffs((u32)val); ++ return __ffs64(val); + } + EXPORT_SYMBOL(__ctzdi2); +- +-#elif BITS_PER_LONG == 64 +- +-int __weak __clzdi2(long val) +-{ +- return 64 - fls64((u64)val); +-} +-EXPORT_SYMBOL(__clzdi2); +- +-int __weak __ctzdi2(long val) +-{ +- return __ffs64((u64)val); +-} +-EXPORT_SYMBOL(__ctzdi2); +- +-#else +-#error BITS_PER_LONG not 32 or 64 +-#endif +diff --git a/lib/radix-tree.c b/lib/radix-tree.c +index b3afafe46fffb..c7918b7b8a23c 100644 +--- a/lib/radix-tree.c ++++ b/lib/radix-tree.c +@@ -1134,7 +1134,6 @@ static void set_iter_tags(struct radix_tree_iter *iter, + void __rcu **radix_tree_iter_resume(void __rcu **slot, + struct radix_tree_iter *iter) + { +- slot++; + iter->index = __radix_tree_iter_add(iter, 1); + iter->next_index = iter->index; + iter->tags = 0; +diff --git a/mm/memory-failure.c b/mm/memory-failure.c +index 9f9dd968fbe3c..bcd71d8736be5 100644 +--- a/mm/memory-failure.c ++++ b/mm/memory-failure.c +@@ -2219,16 +2219,6 @@ static int soft_offline_in_use_page(struct page *page) + return __soft_offline_page(page); + } + +-static int soft_offline_free_page(struct page *page) +-{ +- int rc = 0; +- +- if (!page_handle_poison(page, true, false)) +- rc = -EBUSY; +- +- return rc; +-} +- + static void put_ref_page(struct page *page) + { + if (page) +@@ -2294,10 +2284,13 @@ retry: + if (ret > 0) { + ret = soft_offline_in_use_page(page); + } else if (ret == 0) { +- if (soft_offline_free_page(page) && try_again) { +- try_again = false; +- flags &= ~MF_COUNT_INCREASED; +- goto retry; ++ if (!page_handle_poison(page, true, false)) { ++ if (try_again) { ++ try_again = false; ++ flags &= ~MF_COUNT_INCREASED; ++ goto retry; ++ } ++ ret = -EBUSY; + } + } + +diff --git a/mm/vmalloc.c b/mm/vmalloc.c +index 3e482209a1c4c..b36d94ef337d7 100644 +--- a/mm/vmalloc.c ++++ b/mm/vmalloc.c +@@ -2806,6 +2806,10 @@ void *vmap_pfn(unsigned long *pfns, unsigned int count, pgprot_t prot) + free_vm_area(area); + return NULL; + } ++ ++ flush_cache_vmap((unsigned long)area->addr, ++ (unsigned long)area->addr + count * PAGE_SIZE); ++ + return area->addr; + } + EXPORT_SYMBOL_GPL(vmap_pfn); +diff --git a/net/batman-adv/bat_v_elp.c b/net/batman-adv/bat_v_elp.c +index 71999e13f7293..5c5ddacd81cb7 100644 +--- a/net/batman-adv/bat_v_elp.c ++++ b/net/batman-adv/bat_v_elp.c +@@ -507,7 +507,7 @@ int batadv_v_elp_packet_recv(struct sk_buff *skb, + struct batadv_priv *bat_priv = netdev_priv(if_incoming->soft_iface); + struct batadv_elp_packet *elp_packet; + struct batadv_hard_iface *primary_if; +- struct ethhdr *ethhdr = (struct ethhdr *)skb_mac_header(skb); ++ struct 
ethhdr *ethhdr; + bool res; + int ret = NET_RX_DROP; + +@@ -515,6 +515,7 @@ int batadv_v_elp_packet_recv(struct sk_buff *skb, + if (!res) + goto free_skb; + ++ ethhdr = eth_hdr(skb); + if (batadv_is_my_mac(bat_priv, ethhdr->h_source)) + goto free_skb; + +diff --git a/net/batman-adv/bat_v_ogm.c b/net/batman-adv/bat_v_ogm.c +index 1d750f3cb2e4a..4fe6df68dfcb7 100644 +--- a/net/batman-adv/bat_v_ogm.c ++++ b/net/batman-adv/bat_v_ogm.c +@@ -124,8 +124,10 @@ static void batadv_v_ogm_send_to_if(struct sk_buff *skb, + { + struct batadv_priv *bat_priv = netdev_priv(hard_iface->soft_iface); + +- if (hard_iface->if_status != BATADV_IF_ACTIVE) ++ if (hard_iface->if_status != BATADV_IF_ACTIVE) { ++ kfree_skb(skb); + return; ++ } + + batadv_inc_counter(bat_priv, BATADV_CNT_MGMT_TX); + batadv_add_counter(bat_priv, BATADV_CNT_MGMT_TX_BYTES, +@@ -986,7 +988,7 @@ int batadv_v_ogm_packet_recv(struct sk_buff *skb, + { + struct batadv_priv *bat_priv = netdev_priv(if_incoming->soft_iface); + struct batadv_ogm2_packet *ogm_packet; +- struct ethhdr *ethhdr = eth_hdr(skb); ++ struct ethhdr *ethhdr; + int ogm_offset; + u8 *packet_pos; + int ret = NET_RX_DROP; +@@ -1000,6 +1002,7 @@ int batadv_v_ogm_packet_recv(struct sk_buff *skb, + if (!batadv_check_management_packet(skb, if_incoming, BATADV_OGM2_HLEN)) + goto free_skb; + ++ ethhdr = eth_hdr(skb); + if (batadv_is_my_mac(bat_priv, ethhdr->h_source)) + goto free_skb; + +diff --git a/net/batman-adv/hard-interface.c b/net/batman-adv/hard-interface.c +index 35fadb9248498..44cf612c0831e 100644 +--- a/net/batman-adv/hard-interface.c ++++ b/net/batman-adv/hard-interface.c +@@ -627,7 +627,19 @@ out: + */ + void batadv_update_min_mtu(struct net_device *soft_iface) + { +- soft_iface->mtu = batadv_hardif_min_mtu(soft_iface); ++ struct batadv_priv *bat_priv = netdev_priv(soft_iface); ++ int limit_mtu; ++ int mtu; ++ ++ mtu = batadv_hardif_min_mtu(soft_iface); ++ ++ if (bat_priv->mtu_set_by_user) ++ limit_mtu = bat_priv->mtu_set_by_user; ++ else ++ limit_mtu = ETH_DATA_LEN; ++ ++ mtu = min(mtu, limit_mtu); ++ dev_set_mtu(soft_iface, mtu); + + /* Check if the local translate table should be cleaned up to match a + * new (and smaller) MTU. 
+diff --git a/net/batman-adv/netlink.c b/net/batman-adv/netlink.c +index 00875e1d8c44c..bbd6ecf1678c9 100644 +--- a/net/batman-adv/netlink.c ++++ b/net/batman-adv/netlink.c +@@ -495,7 +495,10 @@ static int batadv_netlink_set_mesh(struct sk_buff *skb, struct genl_info *info) + attr = info->attrs[BATADV_ATTR_FRAGMENTATION_ENABLED]; + + atomic_set(&bat_priv->fragmentation, !!nla_get_u8(attr)); ++ ++ rtnl_lock(); + batadv_update_min_mtu(bat_priv->soft_iface); ++ rtnl_unlock(); + } + + if (info->attrs[BATADV_ATTR_GW_BANDWIDTH_DOWN]) { +diff --git a/net/batman-adv/soft-interface.c b/net/batman-adv/soft-interface.c +index 6ab28b509d4bb..99cd8aef07354 100644 +--- a/net/batman-adv/soft-interface.c ++++ b/net/batman-adv/soft-interface.c +@@ -154,11 +154,14 @@ static int batadv_interface_set_mac_addr(struct net_device *dev, void *p) + + static int batadv_interface_change_mtu(struct net_device *dev, int new_mtu) + { ++ struct batadv_priv *bat_priv = netdev_priv(dev); ++ + /* check ranges */ + if (new_mtu < 68 || new_mtu > batadv_hardif_min_mtu(dev)) + return -EINVAL; + + dev->mtu = new_mtu; ++ bat_priv->mtu_set_by_user = new_mtu; + + return 0; + } +diff --git a/net/batman-adv/translation-table.c b/net/batman-adv/translation-table.c +index 4b7ad6684bc40..1e1cf0e8a1427 100644 +--- a/net/batman-adv/translation-table.c ++++ b/net/batman-adv/translation-table.c +@@ -774,7 +774,6 @@ check_roaming: + if (roamed_back) { + batadv_tt_global_free(bat_priv, tt_global, + "Roaming canceled"); +- tt_global = NULL; + } else { + /* The global entry has to be marked as ROAMING and + * has to be kept for consistency purpose +diff --git a/net/batman-adv/types.h b/net/batman-adv/types.h +index 2be5d4a712c5c..2635763bbd67a 100644 +--- a/net/batman-adv/types.h ++++ b/net/batman-adv/types.h +@@ -1546,6 +1546,12 @@ struct batadv_priv { + /** @soft_iface: net device which holds this struct as private data */ + struct net_device *soft_iface; + ++ /** ++ * @mtu_set_by_user: MTU was set once by user ++ * protected by rtnl_lock ++ */ ++ int mtu_set_by_user; ++ + /** + * @bat_counters: mesh internal traffic statistic counters (see + * batadv_counters) +diff --git a/net/can/raw.c b/net/can/raw.c +index 7105fa4824e4b..8877d22da67ee 100644 +--- a/net/can/raw.c ++++ b/net/can/raw.c +@@ -83,6 +83,7 @@ struct raw_sock { + struct sock sk; + int bound; + int ifindex; ++ struct net_device *dev; + struct list_head notifier; + int loopback; + int recv_own_msgs; +@@ -275,21 +276,24 @@ static void raw_notify(struct raw_sock *ro, unsigned long msg, + if (!net_eq(dev_net(dev), sock_net(sk))) + return; + +- if (ro->ifindex != dev->ifindex) ++ if (ro->dev != dev) + return; + + switch (msg) { + case NETDEV_UNREGISTER: + lock_sock(sk); + /* remove current filters & unregister */ +- if (ro->bound) ++ if (ro->bound) { + raw_disable_allfilters(dev_net(dev), dev, sk); ++ dev_put(dev); ++ } + + if (ro->count > 1) + kfree(ro->filter); + + ro->ifindex = 0; + ro->bound = 0; ++ ro->dev = NULL; + ro->count = 0; + release_sock(sk); + +@@ -335,6 +339,7 @@ static int raw_init(struct sock *sk) + + ro->bound = 0; + ro->ifindex = 0; ++ ro->dev = NULL; + + /* set default filter to single entry dfilter */ + ro->dfilter.can_id = 0; +@@ -380,18 +385,14 @@ static int raw_release(struct socket *sock) + list_del(&ro->notifier); + spin_unlock(&raw_notifier_lock); + ++ rtnl_lock(); + lock_sock(sk); + + /* remove current filters & unregister */ + if (ro->bound) { +- if (ro->ifindex) { +- struct net_device *dev; +- +- dev = dev_get_by_index(sock_net(sk), ro->ifindex); +- if 
(dev) { +- raw_disable_allfilters(dev_net(dev), dev, sk); +- dev_put(dev); +- } ++ if (ro->dev) { ++ raw_disable_allfilters(dev_net(ro->dev), ro->dev, sk); ++ dev_put(ro->dev); + } else { + raw_disable_allfilters(sock_net(sk), NULL, sk); + } +@@ -402,6 +403,7 @@ static int raw_release(struct socket *sock) + + ro->ifindex = 0; + ro->bound = 0; ++ ro->dev = NULL; + ro->count = 0; + free_percpu(ro->uniq); + +@@ -409,6 +411,8 @@ static int raw_release(struct socket *sock) + sock->sk = NULL; + + release_sock(sk); ++ rtnl_unlock(); ++ + sock_put(sk); + + return 0; +@@ -419,6 +423,7 @@ static int raw_bind(struct socket *sock, struct sockaddr *uaddr, int len) + struct sockaddr_can *addr = (struct sockaddr_can *)uaddr; + struct sock *sk = sock->sk; + struct raw_sock *ro = raw_sk(sk); ++ struct net_device *dev = NULL; + int ifindex; + int err = 0; + int notify_enetdown = 0; +@@ -428,24 +433,23 @@ static int raw_bind(struct socket *sock, struct sockaddr *uaddr, int len) + if (addr->can_family != AF_CAN) + return -EINVAL; + ++ rtnl_lock(); + lock_sock(sk); + + if (ro->bound && addr->can_ifindex == ro->ifindex) + goto out; + + if (addr->can_ifindex) { +- struct net_device *dev; +- + dev = dev_get_by_index(sock_net(sk), addr->can_ifindex); + if (!dev) { + err = -ENODEV; + goto out; + } + if (dev->type != ARPHRD_CAN) { +- dev_put(dev); + err = -ENODEV; +- goto out; ++ goto out_put_dev; + } ++ + if (!(dev->flags & IFF_UP)) + notify_enetdown = 1; + +@@ -453,7 +457,9 @@ static int raw_bind(struct socket *sock, struct sockaddr *uaddr, int len) + + /* filters set by default/setsockopt */ + err = raw_enable_allfilters(sock_net(sk), dev, sk); +- dev_put(dev); ++ if (err) ++ goto out_put_dev; ++ + } else { + ifindex = 0; + +@@ -464,26 +470,30 @@ static int raw_bind(struct socket *sock, struct sockaddr *uaddr, int len) + if (!err) { + if (ro->bound) { + /* unregister old filters */ +- if (ro->ifindex) { +- struct net_device *dev; +- +- dev = dev_get_by_index(sock_net(sk), +- ro->ifindex); +- if (dev) { +- raw_disable_allfilters(dev_net(dev), +- dev, sk); +- dev_put(dev); +- } ++ if (ro->dev) { ++ raw_disable_allfilters(dev_net(ro->dev), ++ ro->dev, sk); ++ /* drop reference to old ro->dev */ ++ dev_put(ro->dev); + } else { + raw_disable_allfilters(sock_net(sk), NULL, sk); + } + } + ro->ifindex = ifindex; + ro->bound = 1; ++ /* bind() ok -> hold a reference for new ro->dev */ ++ ro->dev = dev; ++ if (ro->dev) ++ dev_hold(ro->dev); + } + +- out: ++out_put_dev: ++ /* remove potential reference from dev_get_by_index() */ ++ if (dev) ++ dev_put(dev); ++out: + release_sock(sk); ++ rtnl_unlock(); + + if (notify_enetdown) { + sk->sk_err = ENETDOWN; +@@ -549,9 +559,9 @@ static int raw_setsockopt(struct socket *sock, int level, int optname, + rtnl_lock(); + lock_sock(sk); + +- if (ro->bound && ro->ifindex) { +- dev = dev_get_by_index(sock_net(sk), ro->ifindex); +- if (!dev) { ++ dev = ro->dev; ++ if (ro->bound && dev) { ++ if (dev->reg_state != NETREG_REGISTERED) { + if (count > 1) + kfree(filter); + err = -ENODEV; +@@ -592,7 +602,6 @@ static int raw_setsockopt(struct socket *sock, int level, int optname, + ro->count = count; + + out_fil: +- dev_put(dev); + release_sock(sk); + rtnl_unlock(); + +@@ -610,9 +619,9 @@ static int raw_setsockopt(struct socket *sock, int level, int optname, + rtnl_lock(); + lock_sock(sk); + +- if (ro->bound && ro->ifindex) { +- dev = dev_get_by_index(sock_net(sk), ro->ifindex); +- if (!dev) { ++ dev = ro->dev; ++ if (ro->bound && dev) { ++ if (dev->reg_state != NETREG_REGISTERED) { + err = 
-ENODEV; + goto out_err; + } +@@ -636,7 +645,6 @@ static int raw_setsockopt(struct socket *sock, int level, int optname, + ro->err_mask = err_mask; + + out_err: +- dev_put(dev); + release_sock(sk); + rtnl_unlock(); + +diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c +index b055e196f5306..1b71e5c582bbc 100644 +--- a/net/core/rtnetlink.c ++++ b/net/core/rtnetlink.c +@@ -2173,13 +2173,27 @@ out_err: + return err; + } + +-int rtnl_nla_parse_ifla(struct nlattr **tb, const struct nlattr *head, int len, +- struct netlink_ext_ack *exterr) ++int rtnl_nla_parse_ifinfomsg(struct nlattr **tb, const struct nlattr *nla_peer, ++ struct netlink_ext_ack *exterr) + { +- return nla_parse_deprecated(tb, IFLA_MAX, head, len, ifla_policy, ++ const struct ifinfomsg *ifmp; ++ const struct nlattr *attrs; ++ size_t len; ++ ++ ifmp = nla_data(nla_peer); ++ attrs = nla_data(nla_peer) + sizeof(struct ifinfomsg); ++ len = nla_len(nla_peer) - sizeof(struct ifinfomsg); ++ ++ if (ifmp->ifi_index < 0) { ++ NL_SET_ERR_MSG_ATTR(exterr, nla_peer, ++ "ifindex can't be negative"); ++ return -EINVAL; ++ } ++ ++ return nla_parse_deprecated(tb, IFLA_MAX, attrs, len, ifla_policy, + exterr); + } +-EXPORT_SYMBOL(rtnl_nla_parse_ifla); ++EXPORT_SYMBOL(rtnl_nla_parse_ifinfomsg); + + struct net *rtnl_link_get_net(struct net *src_net, struct nlattr *tb[]) + { +@@ -3280,6 +3294,7 @@ static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh, + struct ifinfomsg *ifm; + char ifname[IFNAMSIZ]; + struct nlattr **data; ++ bool link_specified; + int err; + + #ifdef CONFIG_MODULES +@@ -3300,12 +3315,19 @@ replay: + ifname[0] = '\0'; + + ifm = nlmsg_data(nlh); +- if (ifm->ifi_index > 0) ++ if (ifm->ifi_index > 0) { ++ link_specified = true; + dev = __dev_get_by_index(net, ifm->ifi_index); +- else if (tb[IFLA_IFNAME] || tb[IFLA_ALT_IFNAME]) ++ } else if (ifm->ifi_index < 0) { ++ NL_SET_ERR_MSG(extack, "ifindex can't be negative"); ++ return -EINVAL; ++ } else if (tb[IFLA_IFNAME] || tb[IFLA_ALT_IFNAME]) { ++ link_specified = true; + dev = rtnl_dev_get(net, NULL, tb[IFLA_ALT_IFNAME], ifname); +- else ++ } else { ++ link_specified = false; + dev = NULL; ++ } + + master_dev = NULL; + m_ops = NULL; +@@ -3408,7 +3430,12 @@ replay: + } + + if (!(nlh->nlmsg_flags & NLM_F_CREATE)) { +- if (ifm->ifi_index == 0 && tb[IFLA_GROUP]) ++ /* No dev found and NLM_F_CREATE not set. Requested dev does not exist, ++ * or it's for a group ++ */ ++ if (link_specified) ++ return -ENODEV; ++ if (tb[IFLA_GROUP]) + return rtnl_group_changelink(skb, net, + nla_get_u32(tb[IFLA_GROUP]), + ifm, extack, tb); +diff --git a/net/dccp/proto.c b/net/dccp/proto.c +index 5422d64af246e..0b0567a692a8f 100644 +--- a/net/dccp/proto.c ++++ b/net/dccp/proto.c +@@ -324,11 +324,15 @@ EXPORT_SYMBOL_GPL(dccp_disconnect); + __poll_t dccp_poll(struct file *file, struct socket *sock, + poll_table *wait) + { +- __poll_t mask; + struct sock *sk = sock->sk; ++ __poll_t mask; ++ u8 shutdown; ++ int state; + + sock_poll_wait(file, sock, wait); +- if (sk->sk_state == DCCP_LISTEN) ++ ++ state = inet_sk_state_load(sk); ++ if (state == DCCP_LISTEN) + return inet_csk_listen_poll(sk); + + /* Socket is not locked. 
We are protected from async events +@@ -337,20 +341,21 @@ __poll_t dccp_poll(struct file *file, struct socket *sock, + */ + + mask = 0; +- if (sk->sk_err) ++ if (READ_ONCE(sk->sk_err)) + mask = EPOLLERR; ++ shutdown = READ_ONCE(sk->sk_shutdown); + +- if (sk->sk_shutdown == SHUTDOWN_MASK || sk->sk_state == DCCP_CLOSED) ++ if (shutdown == SHUTDOWN_MASK || state == DCCP_CLOSED) + mask |= EPOLLHUP; +- if (sk->sk_shutdown & RCV_SHUTDOWN) ++ if (shutdown & RCV_SHUTDOWN) + mask |= EPOLLIN | EPOLLRDNORM | EPOLLRDHUP; + + /* Connected? */ +- if ((1 << sk->sk_state) & ~(DCCPF_REQUESTING | DCCPF_RESPOND)) { ++ if ((1 << state) & ~(DCCPF_REQUESTING | DCCPF_RESPOND)) { + if (atomic_read(&sk->sk_rmem_alloc) > 0) + mask |= EPOLLIN | EPOLLRDNORM; + +- if (!(sk->sk_shutdown & SEND_SHUTDOWN)) { ++ if (!(shutdown & SEND_SHUTDOWN)) { + if (sk_stream_is_writeable(sk)) { + mask |= EPOLLOUT | EPOLLWRNORM; + } else { /* send SIGIO later */ +@@ -368,7 +373,6 @@ __poll_t dccp_poll(struct file *file, struct socket *sock, + } + return mask; + } +- + EXPORT_SYMBOL_GPL(dccp_poll); + + int dccp_ioctl(struct sock *sk, int cmd, unsigned long arg) +diff --git a/net/ncsi/ncsi-rsp.c b/net/ncsi/ncsi-rsp.c +index 6447a09932f55..069c2659074bc 100644 +--- a/net/ncsi/ncsi-rsp.c ++++ b/net/ncsi/ncsi-rsp.c +@@ -611,14 +611,14 @@ static int ncsi_rsp_handler_snfc(struct ncsi_request *nr) + return 0; + } + +-/* Response handler for Mellanox command Get Mac Address */ +-static int ncsi_rsp_handler_oem_mlx_gma(struct ncsi_request *nr) ++/* Response handler for Get Mac Address command */ ++static int ncsi_rsp_handler_oem_gma(struct ncsi_request *nr, int mfr_id) + { + struct ncsi_dev_priv *ndp = nr->ndp; + struct net_device *ndev = ndp->ndev.dev; +- const struct net_device_ops *ops = ndev->netdev_ops; + struct ncsi_rsp_oem_pkt *rsp; + struct sockaddr saddr; ++ u32 mac_addr_off = 0; + int ret = 0; + + /* Get the response header */ +@@ -626,11 +626,25 @@ static int ncsi_rsp_handler_oem_mlx_gma(struct ncsi_request *nr) + + saddr.sa_family = ndev->type; + ndev->priv_flags |= IFF_LIVE_ADDR_CHANGE; +- memcpy(saddr.sa_data, &rsp->data[MLX_MAC_ADDR_OFFSET], ETH_ALEN); ++ if (mfr_id == NCSI_OEM_MFR_BCM_ID) ++ mac_addr_off = BCM_MAC_ADDR_OFFSET; ++ else if (mfr_id == NCSI_OEM_MFR_MLX_ID) ++ mac_addr_off = MLX_MAC_ADDR_OFFSET; ++ else if (mfr_id == NCSI_OEM_MFR_INTEL_ID) ++ mac_addr_off = INTEL_MAC_ADDR_OFFSET; ++ ++ memcpy(saddr.sa_data, &rsp->data[mac_addr_off], ETH_ALEN); ++ if (mfr_id == NCSI_OEM_MFR_BCM_ID || mfr_id == NCSI_OEM_MFR_INTEL_ID) ++ eth_addr_inc((u8 *)saddr.sa_data); ++ if (!is_valid_ether_addr((const u8 *)saddr.sa_data)) ++ return -ENXIO; ++ + /* Set the flag for GMA command which should only be called once */ + ndp->gma_flag = 1; + +- ret = ops->ndo_set_mac_address(ndev, &saddr); ++ rtnl_lock(); ++ ret = dev_set_mac_address(ndev, &saddr, NULL); ++ rtnl_unlock(); + if (ret < 0) + netdev_warn(ndev, "NCSI: 'Writing mac address to device failed\n"); + +@@ -649,41 +663,10 @@ static int ncsi_rsp_handler_oem_mlx(struct ncsi_request *nr) + + if (mlx->cmd == NCSI_OEM_MLX_CMD_GMA && + mlx->param == NCSI_OEM_MLX_CMD_GMA_PARAM) +- return ncsi_rsp_handler_oem_mlx_gma(nr); ++ return ncsi_rsp_handler_oem_gma(nr, NCSI_OEM_MFR_MLX_ID); + return 0; + } + +-/* Response handler for Broadcom command Get Mac Address */ +-static int ncsi_rsp_handler_oem_bcm_gma(struct ncsi_request *nr) +-{ +- struct ncsi_dev_priv *ndp = nr->ndp; +- struct net_device *ndev = ndp->ndev.dev; +- const struct net_device_ops *ops = ndev->netdev_ops; +- struct ncsi_rsp_oem_pkt 
*rsp; +- struct sockaddr saddr; +- int ret = 0; +- +- /* Get the response header */ +- rsp = (struct ncsi_rsp_oem_pkt *)skb_network_header(nr->rsp); +- +- saddr.sa_family = ndev->type; +- ndev->priv_flags |= IFF_LIVE_ADDR_CHANGE; +- memcpy(saddr.sa_data, &rsp->data[BCM_MAC_ADDR_OFFSET], ETH_ALEN); +- /* Increase mac address by 1 for BMC's address */ +- eth_addr_inc((u8 *)saddr.sa_data); +- if (!is_valid_ether_addr((const u8 *)saddr.sa_data)) +- return -ENXIO; +- +- /* Set the flag for GMA command which should only be called once */ +- ndp->gma_flag = 1; +- +- ret = ops->ndo_set_mac_address(ndev, &saddr); +- if (ret < 0) +- netdev_warn(ndev, "NCSI: 'Writing mac address to device failed\n"); +- +- return ret; +-} +- + /* Response handler for Broadcom card */ + static int ncsi_rsp_handler_oem_bcm(struct ncsi_request *nr) + { +@@ -695,42 +678,10 @@ static int ncsi_rsp_handler_oem_bcm(struct ncsi_request *nr) + bcm = (struct ncsi_rsp_oem_bcm_pkt *)(rsp->data); + + if (bcm->type == NCSI_OEM_BCM_CMD_GMA) +- return ncsi_rsp_handler_oem_bcm_gma(nr); ++ return ncsi_rsp_handler_oem_gma(nr, NCSI_OEM_MFR_BCM_ID); + return 0; + } + +-/* Response handler for Intel command Get Mac Address */ +-static int ncsi_rsp_handler_oem_intel_gma(struct ncsi_request *nr) +-{ +- struct ncsi_dev_priv *ndp = nr->ndp; +- struct net_device *ndev = ndp->ndev.dev; +- const struct net_device_ops *ops = ndev->netdev_ops; +- struct ncsi_rsp_oem_pkt *rsp; +- struct sockaddr saddr; +- int ret = 0; +- +- /* Get the response header */ +- rsp = (struct ncsi_rsp_oem_pkt *)skb_network_header(nr->rsp); +- +- saddr.sa_family = ndev->type; +- ndev->priv_flags |= IFF_LIVE_ADDR_CHANGE; +- memcpy(saddr.sa_data, &rsp->data[INTEL_MAC_ADDR_OFFSET], ETH_ALEN); +- /* Increase mac address by 1 for BMC's address */ +- eth_addr_inc((u8 *)saddr.sa_data); +- if (!is_valid_ether_addr((const u8 *)saddr.sa_data)) +- return -ENXIO; +- +- /* Set the flag for GMA command which should only be called once */ +- ndp->gma_flag = 1; +- +- ret = ops->ndo_set_mac_address(ndev, &saddr); +- if (ret < 0) +- netdev_warn(ndev, +- "NCSI: 'Writing mac address to device failed\n"); +- +- return ret; +-} +- + /* Response handler for Intel card */ + static int ncsi_rsp_handler_oem_intel(struct ncsi_request *nr) + { +@@ -742,7 +693,7 @@ static int ncsi_rsp_handler_oem_intel(struct ncsi_request *nr) + intel = (struct ncsi_rsp_oem_intel_pkt *)(rsp->data); + + if (intel->cmd == NCSI_OEM_INTEL_CMD_GMA) +- return ncsi_rsp_handler_oem_intel_gma(nr); ++ return ncsi_rsp_handler_oem_gma(nr, NCSI_OEM_MFR_INTEL_ID); + + return 0; + } +diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c +index 1e2d1e4bdb74d..d84da11aaee5c 100644 +--- a/net/netfilter/nf_tables_api.c ++++ b/net/netfilter/nf_tables_api.c +@@ -10303,7 +10303,7 @@ static int nft_rcv_nl_event(struct notifier_block *this, unsigned long event, + deleted = 0; + mutex_lock(&nft_net->commit_mutex); + if (!list_empty(&nf_tables_destroy_list)) +- rcu_barrier(); ++ nf_tables_trans_destroy_flush_work(); + again: + list_for_each_entry(table, &nft_net->tables, list) { + if (nft_table_has_owner(table) && +diff --git a/net/netfilter/nft_set_pipapo.c b/net/netfilter/nft_set_pipapo.c +index 32cfd0a84b0e2..8c16681884b7e 100644 +--- a/net/netfilter/nft_set_pipapo.c ++++ b/net/netfilter/nft_set_pipapo.c +@@ -901,12 +901,14 @@ static void pipapo_lt_bits_adjust(struct nft_pipapo_field *f) + static int pipapo_insert(struct nft_pipapo_field *f, const uint8_t *k, + int mask_bits) + { +- int rule = f->rules++, group, ret, 
bit_offset = 0; ++ int rule = f->rules, group, ret, bit_offset = 0; + +- ret = pipapo_resize(f, f->rules - 1, f->rules); ++ ret = pipapo_resize(f, f->rules, f->rules + 1); + if (ret) + return ret; + ++ f->rules++; ++ + for (group = 0; group < f->groups; group++) { + int i, v; + u8 mask; +@@ -1051,7 +1053,9 @@ static int pipapo_expand(struct nft_pipapo_field *f, + step++; + if (step >= len) { + if (!masks) { +- pipapo_insert(f, base, 0); ++ err = pipapo_insert(f, base, 0); ++ if (err < 0) ++ return err; + masks = 1; + } + goto out; +@@ -1234,6 +1238,9 @@ static int nft_pipapo_insert(const struct net *net, const struct nft_set *set, + else + ret = pipapo_expand(f, start, end, f->groups * f->bb); + ++ if (ret < 0) ++ return ret; ++ + if (f->bsize > bsize_max) + bsize_max = f->bsize; + +diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c +index 328db5e1b0eaf..fa79dbd3601fa 100644 +--- a/net/sched/sch_api.c ++++ b/net/sched/sch_api.c +@@ -1513,10 +1513,28 @@ static int tc_get_qdisc(struct sk_buff *skb, struct nlmsghdr *n, + return 0; + } + ++static bool req_create_or_replace(struct nlmsghdr *n) ++{ ++ return (n->nlmsg_flags & NLM_F_CREATE && ++ n->nlmsg_flags & NLM_F_REPLACE); ++} ++ ++static bool req_create_exclusive(struct nlmsghdr *n) ++{ ++ return (n->nlmsg_flags & NLM_F_CREATE && ++ n->nlmsg_flags & NLM_F_EXCL); ++} ++ ++static bool req_change(struct nlmsghdr *n) ++{ ++ return (!(n->nlmsg_flags & NLM_F_CREATE) && ++ !(n->nlmsg_flags & NLM_F_REPLACE) && ++ !(n->nlmsg_flags & NLM_F_EXCL)); ++} ++ + /* + * Create/change qdisc. + */ +- + static int tc_modify_qdisc(struct sk_buff *skb, struct nlmsghdr *n, + struct netlink_ext_ack *extack) + { +@@ -1613,27 +1631,35 @@ replay: + * + * We know, that some child q is already + * attached to this parent and have choice: +- * either to change it or to create/graft new one. ++ * 1) change it or 2) create/graft new one. ++ * If the requested qdisc kind is different ++ * than the existing one, then we choose graft. ++ * If they are the same then this is "change" ++ * operation - just let it fallthrough.. + * + * 1. We are allowed to create/graft only +- * if CREATE and REPLACE flags are set. ++ * if the request is explicitly stating ++ * "please create if it doesn't exist". + * +- * 2. If EXCL is set, requestor wanted to say, +- * that qdisc tcm_handle is not expected ++ * 2. If the request is to exclusive create ++ * then the qdisc tcm_handle is not expected + * to exist, so that we choose create/graft too. + * + * 3. The last case is when no flags are set. ++ * This will happen when for example tc ++ * utility issues a "change" command. + * Alas, it is sort of hole in API, we + * cannot decide what to do unambiguously. +- * For now we select create/graft, if +- * user gave KIND, which does not match existing. ++ * For now we select create/graft. + */ +- if ((n->nlmsg_flags & NLM_F_CREATE) && +- (n->nlmsg_flags & NLM_F_REPLACE) && +- ((n->nlmsg_flags & NLM_F_EXCL) || +- (tca[TCA_KIND] && +- nla_strcmp(tca[TCA_KIND], q->ops->id)))) +- goto create_n_graft; ++ if (tca[TCA_KIND] && ++ nla_strcmp(tca[TCA_KIND], q->ops->id)) { ++ if (req_create_or_replace(n) || ++ req_create_exclusive(n)) ++ goto create_n_graft; ++ else if (req_change(n)) ++ goto create_n_graft2; ++ } + } + } + } else { +@@ -1667,6 +1693,7 @@ create_n_graft: + NL_SET_ERR_MSG(extack, "Qdisc not found. 
To create specify NLM_F_CREATE flag"); + return -ENOENT; + } ++create_n_graft2: + if (clid == TC_H_INGRESS) { + if (dev_ingress_queue(dev)) { + q = qdisc_create(dev, dev_ingress_queue(dev), p, +diff --git a/net/sctp/socket.c b/net/sctp/socket.c +index f10ad80fd6948..717e2f60370b3 100644 +--- a/net/sctp/socket.c ++++ b/net/sctp/socket.c +@@ -97,7 +97,7 @@ struct percpu_counter sctp_sockets_allocated; + + static void sctp_enter_memory_pressure(struct sock *sk) + { +- sctp_memory_pressure = 1; ++ WRITE_ONCE(sctp_memory_pressure, 1); + } + + +diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c +index 507ba8b799920..41095a278f798 100644 +--- a/net/sunrpc/xprtrdma/verbs.c ++++ b/net/sunrpc/xprtrdma/verbs.c +@@ -962,9 +962,6 @@ struct rpcrdma_rep *rpcrdma_rep_create(struct rpcrdma_xprt *r_xprt, + if (!rep->rr_rdmabuf) + goto out_free; + +- if (!rpcrdma_regbuf_dma_map(r_xprt, rep->rr_rdmabuf)) +- goto out_free_regbuf; +- + rep->rr_cid.ci_completion_id = + atomic_inc_return(&r_xprt->rx_ep->re_completion_ids); + +@@ -983,8 +980,6 @@ struct rpcrdma_rep *rpcrdma_rep_create(struct rpcrdma_xprt *r_xprt, + spin_unlock(&buf->rb_lock); + return rep; + +-out_free_regbuf: +- rpcrdma_regbuf_free(rep->rr_rdmabuf); + out_free: + kfree(rep); + out: +@@ -1391,6 +1386,10 @@ void rpcrdma_post_recvs(struct rpcrdma_xprt *r_xprt, int needed, bool temp) + rep = rpcrdma_rep_create(r_xprt, temp); + if (!rep) + break; ++ if (!rpcrdma_regbuf_dma_map(r_xprt, rep->rr_rdmabuf)) { ++ rpcrdma_rep_put(buf, rep); ++ break; ++ } + + rep->rr_cid.ci_queue_id = ep->re_attr.recv_cq->res.id; + trace_xprtrdma_post_recv(rep); +diff --git a/security/selinux/ss/policydb.c b/security/selinux/ss/policydb.c +index 0ae1b718194a3..7f7858593bdb4 100644 +--- a/security/selinux/ss/policydb.c ++++ b/security/selinux/ss/policydb.c +@@ -2011,6 +2011,7 @@ static int filename_trans_read_helper(struct policydb *p, void *fp) + if (!datum) + goto out; + ++ datum->next = NULL; + *dst = datum; + + /* ebitmap_read() will at least init the bitmap */ +@@ -2023,7 +2024,6 @@ static int filename_trans_read_helper(struct policydb *p, void *fp) + goto out; + + datum->otype = le32_to_cpu(buf[0]); +- datum->next = NULL; + + dst = &datum->next; + } +diff --git a/sound/core/pcm_memory.c b/sound/core/pcm_memory.c +index f1470590239e5..711e71016a7c3 100644 +--- a/sound/core/pcm_memory.c ++++ b/sound/core/pcm_memory.c +@@ -31,20 +31,51 @@ static unsigned long max_alloc_per_card = 32UL * 1024UL * 1024UL; + module_param(max_alloc_per_card, ulong, 0644); + MODULE_PARM_DESC(max_alloc_per_card, "Max total allocation bytes per card."); + ++static void __update_allocated_size(struct snd_card *card, ssize_t bytes) ++{ ++ card->total_pcm_alloc_bytes += bytes; ++} ++ ++static void update_allocated_size(struct snd_card *card, ssize_t bytes) ++{ ++ mutex_lock(&card->memory_mutex); ++ __update_allocated_size(card, bytes); ++ mutex_unlock(&card->memory_mutex); ++} ++ ++static void decrease_allocated_size(struct snd_card *card, size_t bytes) ++{ ++ mutex_lock(&card->memory_mutex); ++ WARN_ON(card->total_pcm_alloc_bytes < bytes); ++ __update_allocated_size(card, -(ssize_t)bytes); ++ mutex_unlock(&card->memory_mutex); ++} ++ + static int do_alloc_pages(struct snd_card *card, int type, struct device *dev, + size_t size, struct snd_dma_buffer *dmab) + { + int err; + ++ /* check and reserve the requested size */ ++ mutex_lock(&card->memory_mutex); + if (max_alloc_per_card && +- card->total_pcm_alloc_bytes + size > max_alloc_per_card) ++ card->total_pcm_alloc_bytes + size > 
max_alloc_per_card) { ++ mutex_unlock(&card->memory_mutex); + return -ENOMEM; ++ } ++ __update_allocated_size(card, size); ++ mutex_unlock(&card->memory_mutex); + + err = snd_dma_alloc_pages(type, dev, size, dmab); + if (!err) { +- mutex_lock(&card->memory_mutex); +- card->total_pcm_alloc_bytes += dmab->bytes; +- mutex_unlock(&card->memory_mutex); ++ /* the actual allocation size might be bigger than requested, ++ * and we need to correct the account ++ */ ++ if (dmab->bytes != size) ++ update_allocated_size(card, dmab->bytes - size); ++ } else { ++ /* take back on allocation failure */ ++ decrease_allocated_size(card, size); + } + return err; + } +@@ -53,10 +84,7 @@ static void do_free_pages(struct snd_card *card, struct snd_dma_buffer *dmab) + { + if (!dmab->area) + return; +- mutex_lock(&card->memory_mutex); +- WARN_ON(card->total_pcm_alloc_bytes < dmab->bytes); +- card->total_pcm_alloc_bytes -= dmab->bytes; +- mutex_unlock(&card->memory_mutex); ++ decrease_allocated_size(card, dmab->bytes); + snd_dma_free_pages(dmab); + dmab->area = NULL; + } +diff --git a/sound/pci/ymfpci/ymfpci.c b/sound/pci/ymfpci/ymfpci.c +index 82d4e0fda91be..d62a0e2ddf609 100644 +--- a/sound/pci/ymfpci/ymfpci.c ++++ b/sound/pci/ymfpci/ymfpci.c +@@ -150,8 +150,8 @@ static inline int snd_ymfpci_create_gameport(struct snd_ymfpci *chip, int dev, i + void snd_ymfpci_free_gameport(struct snd_ymfpci *chip) { } + #endif /* SUPPORT_JOYSTICK */ + +-static int snd_card_ymfpci_probe(struct pci_dev *pci, +- const struct pci_device_id *pci_id) ++static int __snd_card_ymfpci_probe(struct pci_dev *pci, ++ const struct pci_device_id *pci_id) + { + static int dev; + struct snd_card *card; +@@ -333,6 +333,12 @@ static int snd_card_ymfpci_probe(struct pci_dev *pci, + return 0; + } + ++static int snd_card_ymfpci_probe(struct pci_dev *pci, ++ const struct pci_device_id *pci_id) ++{ ++ return snd_card_free_on_error(&pci->dev, __snd_card_ymfpci_probe(pci, pci_id)); ++} ++ + static struct pci_driver ymfpci_driver = { + .name = KBUILD_MODNAME, + .id_table = snd_ymfpci_ids, +diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c +index a9c5d54f1688e..76acd39478eea 100644 +--- a/tools/objtool/arch/x86/decode.c ++++ b/tools/objtool/arch/x86/decode.c +@@ -725,8 +725,11 @@ bool arch_is_retpoline(struct symbol *sym) + + bool arch_is_rethunk(struct symbol *sym) + { +- return !strcmp(sym->name, "__x86_return_thunk") || +- !strcmp(sym->name, "srso_untrain_ret") || +- !strcmp(sym->name, "srso_safe_ret") || +- !strcmp(sym->name, "retbleed_return_thunk"); ++ return !strcmp(sym->name, "__x86_return_thunk"); ++} ++ ++bool arch_is_embedded_insn(struct symbol *sym) ++{ ++ return !strcmp(sym->name, "retbleed_return_thunk") || ++ !strcmp(sym->name, "srso_safe_ret"); + } +diff --git a/tools/objtool/check.c b/tools/objtool/check.c +index f9ff878d6ba18..36ad0b6b94a91 100644 +--- a/tools/objtool/check.c ++++ b/tools/objtool/check.c +@@ -990,16 +990,33 @@ static int add_ignore_alternatives(struct objtool_file *file) + return 0; + } + ++/* ++ * Symbols that replace INSN_CALL_DYNAMIC, every (tail) call to such a symbol ++ * will be added to the .retpoline_sites section. ++ */ + __weak bool arch_is_retpoline(struct symbol *sym) + { + return false; + } + ++/* ++ * Symbols that replace INSN_RETURN, every (tail) call to such a symbol ++ * will be added to the .return_sites section. 
++ */ + __weak bool arch_is_rethunk(struct symbol *sym) + { + return false; + } + ++/* ++ * Symbols that are embedded inside other instructions, because sometimes crazy ++ * code exists. These are mostly ignored for validation purposes. ++ */ ++__weak bool arch_is_embedded_insn(struct symbol *sym) ++{ ++ return false; ++} ++ + #define NEGATIVE_RELOC ((void *)-1L) + + static struct reloc *insn_reloc(struct objtool_file *file, struct instruction *insn) +@@ -1235,7 +1252,7 @@ static int add_jump_destinations(struct objtool_file *file) + * middle of another instruction. Objtool only + * knows about the outer instruction. + */ +- if (sym && sym->return_thunk) { ++ if (sym && sym->embedded_insn) { + add_return_call(file, insn, false); + continue; + } +@@ -2066,6 +2083,9 @@ static int classify_symbols(struct objtool_file *file) + if (arch_is_rethunk(func)) + func->return_thunk = true; + ++ if (arch_is_embedded_insn(func)) ++ func->embedded_insn = true; ++ + if (!strcmp(func->name, "__fentry__")) + func->fentry = true; + +diff --git a/tools/objtool/include/objtool/arch.h b/tools/objtool/include/objtool/arch.h +index 8d57e3d1f763d..d81d9a8e4808a 100644 +--- a/tools/objtool/include/objtool/arch.h ++++ b/tools/objtool/include/objtool/arch.h +@@ -89,6 +89,7 @@ int arch_decode_hint_reg(u8 sp_reg, int *base); + + bool arch_is_retpoline(struct symbol *sym); + bool arch_is_rethunk(struct symbol *sym); ++bool arch_is_embedded_insn(struct symbol *sym); + + int arch_rewrite_retpolines(struct objtool_file *file); + +diff --git a/tools/objtool/include/objtool/elf.h b/tools/objtool/include/objtool/elf.h +index 6cdfa401b0009..f06398ea57510 100644 +--- a/tools/objtool/include/objtool/elf.h ++++ b/tools/objtool/include/objtool/elf.h +@@ -60,6 +60,7 @@ struct symbol { + u8 return_thunk : 1; + u8 fentry : 1; + u8 kcov : 1; ++ u8 embedded_insn : 1; + }; + + struct reloc {